What is it about?

Deep Learning models’ performance strongly correlates with the availability of annotated data; however, massive data labelling is laborious, expensive, and error-prone when performed by human experts. Active Learning (AL) effectively handles this challenge by selecting uncertain samples from an unlabeled data collection, but existing AL approaches involve repetitive human feedback for labelling the uncertain samples, rendering these techniques infeasible for deployment in real-world industrial applications. The proposed Proxy Model based Active Learning technique (PMAL) addresses this issue by replacing the human oracle with a deep learning model, so that human expertise is needed only to label two small subsets of data: one for training the proxy model and one for initializing the AL loop. In the PMAL technique, the proxy model is first trained on a small subset of labeled data, after which it acts as an oracle for annotating uncertain samples. Second, the active model's training, the extraction of uncertain samples via uncertainty sampling, and their annotation by the proxy model are carried out for a predefined number of iterations to achieve higher accuracy and more labelled data. Finally, the active model is evaluated on testing data to verify the effectiveness of our technique for practical applications. Correct annotation by the proxy model is ensured by exploiting the potential of explainable artificial intelligence. Similarly, the emerging vision transformer is used as the active model to achieve maximum accuracy. Experimental results reveal that the proposed method outperforms the state-of-the-art in terms of minimum labelled data usage and improves accuracy by 2.2%, 2.6%, and 1.35% on the Caltech-101, Caltech-256, and CIFAR-10 datasets, respectively. Since the proposed technique offers a highly reasonable solution for exploiting huge multimedia data, it can be widely used in various evolving industrial domains.
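The loop described above (train the active model, pick the samples it is least certain about, let the pre-trained proxy model annotate them instead of a human) can be sketched in a few lines. This is a minimal illustration only: the paper's models are a Vision Transformer and a deep proxy network, whereas here a toy nearest-centroid classifier stands in for both, and the names `pmal_loop`, `CentroidClassifier`, and the batch/iteration parameters are illustrative assumptions, not from the paper.

```python
import numpy as np

def entropy(probs):
    # Shannon entropy of predicted class probabilities: higher = more uncertain.
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

class CentroidClassifier:
    """Toy stand-in for the proxy / active models (the paper uses deep networks)."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict_proba(self, X):
        # Softmax over negative distances to each class centroid.
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        e = np.exp(-d + d.min(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[self.predict_proba(X).argmax(axis=1)]

def pmal_loop(proxy, active, X_lab, y_lab, X_pool, iterations=3, batch=10):
    # `proxy` is assumed already trained on its own small human-labeled subset;
    # `X_lab`/`y_lab` is the small human-labeled set that initializes the loop.
    for _ in range(iterations):
        active.fit(X_lab, y_lab)
        if len(X_pool) == 0:
            break
        # Uncertainty sampling: pick the pool samples the active model is least sure about.
        scores = entropy(active.predict_proba(X_pool))
        idx = np.argsort(scores)[-batch:]
        # The proxy model replaces the human oracle and annotates the uncertain samples.
        pseudo_labels = proxy.predict(X_pool[idx])
        X_lab = np.concatenate([X_lab, X_pool[idx]])
        y_lab = np.concatenate([y_lab, pseudo_labels])
        X_pool = np.delete(X_pool, idx, axis=0)
    return active.fit(X_lab, y_lab)
```

In this sketch the proxy never sees the pool labels; the active model's labeled set grows each iteration purely from proxy annotations, which mirrors how PMAL removes the human from the loop after the two initial labeling efforts.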

Featured Image

Why is it important?

The main contributions of PMAL, which replaces the human in the AL loop, are summarized in the following bullets:
• Data annotation is tedious when carried out by humans; therefore, we propose a novel AL technique, focused on multimedia-related industrial applications, that exploits the potential of the emerging Vision Transformer (ViT) and Explainable Artificial Intelligence to replace human involvement in the AL loop with a proxy model.
• The proposed technique, attested by empirical evidence in the form of extensive experiments and an ablation study, offers a low-cost alternative to the exorbitant cost of data labelling for industrial applications. Moreover, compared to other state-of-the-art AL techniques, the proposed method achieves higher accuracy with minimal labeled data.
• To demonstrate the generalization of our research, generic computer vision benchmark datasets have been used in the experiments; hence the proposed technique could be tuned to any kind of multimedia-related industrial application.

Perspectives

Our PMAL works best with resource-hungry DL models such as the Vision Transformer, whereas some industrial applications are resource-constrained and require fast computation with lightweight deep learning models. Therefore, more work is needed to optimize the proposed PMAL technique for lightweight deep learning models so that it can also be easily deployed in resource-constrained industrial environments.

Khan Muhammad
Sungkyunkwan University

The idea presented in the article bears the potential to alleviate the prevalent challenge of labeled data acquisition for training deep learning models. Other active learning methods, although they achieve reasonable accuracy in several domains, exhibit an acute reliance on humans during the training process. The proposed idea solves this issue by efficiently using a deep learning model instead of a human.

Abbas Khan
Sejong University

Read the Original

This page is a summary of: PMAL: A Proxy Model Active Learning Approach for Vision Based Industrial Applications, ACM Transactions on Multimedia Computing Communications and Applications, June 2022, ACM (Association for Computing Machinery).
DOI: 10.1145/3534932.
You can read the full text:

Read

Contributors

The following have contributed to this page