What is it about?
Learning methods struggle when labeled data are scarce, and the problem is compounded when the available data follow different distributions in different domains. Deep unsupervised domain adaptation techniques have recently become widely used to deal with such situations. This study surveys the domain adaptation methods that have been applied to classification tasks in computer vision. It covers very recent papers that were not included in previous surveys, and it introduces a taxonomy that groups published unsupervised domain adaptation methods into five categories: discrepancy-, adversarial-, reconstruction-, representation-, and attention-based methods.
Why is it important?
By exploiting massive amounts of labeled data, deep neural networks (NNs) have shown improved performance in many applications, such as image classification, object detection, semantic segmentation, text recognition, and person re-identification, to name a few. The performance of these systems depends heavily on the quality of the labeled training data. The major assumption is that the training and testing data are independent and identically distributed. This assumption is, however, easily violated by differences in illumination, pose, image quality, background, etc. between the domains. If the labeled training data are insufficient, domain adaptation techniques can transfer the knowledge a model has gained on a domain with enough labeled data to a domain with limited labeled data, even when the source and target domains have different distributions. Labeling, however, is a time- and resource-consuming process. This survey therefore focuses on deep unsupervised domain adaptation (UDA) methods that have been used for classification in computer vision.
Perspectives
The main contributions of this survey in comparison to other related surveys are as follows:
• Many recent papers on deep visual UDA approaches that are not mentioned in any of the previous surveys are included in our paper.
• This survey provides comprehensive coverage of deep methods for domain adaptation, whereas previous surveys mostly focused on non-deep methods and mentioned deep methods only briefly.
• This survey presents a new taxonomy for deep visual UDA for classification tasks. The taxonomy is useful because it covers almost all existing techniques for solving the UDA problem, categorised into five main groups based on the technology adopted for domain adaptation.
The first group is discrepancy-based. It consists of techniques that decrease the difference between the domains and make their data distributions more similar by utilising statistical techniques (i.e. maximum mean discrepancy, correlation alignment, entropy minimisation, batch normalisation, moment matching, and Wasserstein discrepancy).
The second group is adversarial-based. It consists of techniques that minimise the distribution difference across domains by using an adversarial objective with a domain discriminator, divided according to whether or not the source labels are assumed to be equivalent to the target labels (i.e. partial adversarial networks, and non-partial adversarial networks with three subgroups: discriminative adversarial networks, generative adversarial networks, and feature matching adversarial networks).
The third group is reconstruction-based. It consists of techniques that decrease the difference between the domains by mapping the source samples, the target samples, or both into a shared representation domain (i.e. encoder–decoder models, dictionary and sparse coding models, and graph-based models).
The fourth group is representation-based. It consists of techniques that decrease the difference between the domains by feeding the intermediate representations of a trained network into a new network (i.e. domain confusion representation, domain invariant representation, and representation disentangling).
The fifth group is attention-based. It consists of techniques that decrease the difference between the domains by focusing on transferable attention regions or images from the source data and relating them to the target data (i.e. adversarial attention alignment, transferable local attention, and transferable global attention).
• We investigate and analyze some important methods in each category of our taxonomy, based on the results reported for these methods on well-known public databases. To the best of our knowledge, this is the first survey paper on deep visual UDA for classification tasks that quantitatively compares the performance of different deep UDA methods. It can help provide insight for designing accurate and robust deep UDA methods.
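To make the discrepancy-based group more concrete, here is a minimal sketch of one of the statistics it names, the maximum mean discrepancy (MMD), estimated with a Gaussian kernel. It is an illustration only and is not taken from the survey: the function names, the kernel bandwidth, and the toy two-dimensional "features" are all assumptions, and the estimator shown is the simple biased one.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    return (
        mean_kernel(source, source, bandwidth)
        + mean_kernel(target, target, bandwidth)
        - 2.0 * mean_kernel(source, target, bandwidth)
    )


def sample_domain(rng, n, shift):
    """Hypothetical toy data: 2-D unit Gaussian noise centred at (shift, shift)."""
    return [[rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)] for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
source = sample_domain(rng, 100, shift=0.0)
target_near = sample_domain(rng, 100, shift=0.0)  # same distribution as source
target_far = sample_domain(rng, 100, shift=2.0)   # shifted "target domain"

mmd_near = mmd_squared(source, target_near)
mmd_far = mmd_squared(source, target_far)
print(f"same distribution:    {mmd_near:.4f}")
print(f"shifted distribution: {mmd_far:.4f}")
```

The estimate stays close to zero when both sample sets come from the same distribution and grows when the target is shifted. In deep discrepancy-based methods, a quantity of this kind is typically computed on network features rather than raw inputs and minimised alongside the classification loss.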
Yeganeh Madadi
University of Tehran
Read the Original
This page is a summary of: Deep visual unsupervised domain adaptation for classification tasks: a survey, IET Image Processing, August 2020, the Institution of Engineering and Technology (the IET),
DOI: 10.1049/iet-ipr.2020.0087.