What is it about?

A job seeker's resume contains several sections, including educational qualifications, which capture the knowledge and skills relevant to the job. Our paper identifies institute and degree names in the education section of a resume using Named Entity Recognition (NER). We propose a semi-supervised NER model with a correction module, designed to overcome the lack of annotated data that a deep learning model needs to perform well. The model achieves an accuracy of 92.06% on the NER task.

Why is it important?

Any supervised deep learning model needs a significant amount of annotated data to perform well, but such data is not always available, as is the case with resumes. We propose a semi-supervised model that is first trained on the small annotated dataset we have and then makes predictions on the unlabeled data. Our list-based correction module corrects these predictions, and the corrected predictions are fed back to the model as additional training data so that retraining improves its performance. The technique thus overcomes the scarcity of labeled data by delivering performance similar to that of a fully supervised model, achieving high overall accuracy without the need for extensive annotation.
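The train, predict, correct, retrain loop described above can be illustrated with a minimal sketch. It is not the paper's implementation: the deep learning NER model is replaced here by a toy per-token majority-vote tagger so that the example runs on its own, and the lists DEGREES and INSTITUTES, the tag names, the function names, and the sample sentences are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical lists for the correction step; the paper's real lists are not given here.
DEGREES = {("b.tech",), ("m.tech",)}
INSTITUTES = {("manipal", "institute", "of", "technology")}


def train(examples):
    """Toy stand-in for the neural NER model: most frequent tag per token."""
    counts = defaultdict(Counter)
    for tokens, tags in examples:
        for tok, tag in zip(tokens, tags):
            counts[tok.lower()][tag] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def predict(model, tokens):
    """Tag each token; tokens never seen in training get the outside tag "O"."""
    return [model.get(tok.lower(), "O") for tok in tokens]


def correct(tokens, tags):
    """List-based correction: overwrite tags wherever a list entry matches."""
    lowered = [tok.lower() for tok in tokens]
    fixed = list(tags)
    for entries, label in ((DEGREES, "DEGREE"), (INSTITUTES, "INSTITUTE")):
        for entry in entries:
            n = len(entry)
            for i in range(len(lowered) - n + 1):
                if tuple(lowered[i:i + n]) == entry:
                    fixed[i:i + n] = [label] * n
    return fixed


def self_train(labeled, unlabeled, rounds=2):
    """Train on labeled data, pseudo-label and correct unlabeled data, retrain."""
    model = train(labeled)
    for _ in range(rounds):
        pseudo = [(toks, correct(toks, predict(model, toks))) for toks in unlabeled]
        model = train(labeled + pseudo)
    return model


# Invented sample data, for illustration only.
labeled = [(["B.Tech", "from", "Anna", "University"],
            ["DEGREE", "O", "INSTITUTE", "INSTITUTE"])]
unlabeled = [["M.Tech", "at", "Manipal", "Institute", "of", "Technology"]]

model = self_train(labeled, unlabeled)
print(predict(model, ["M.Tech", "from", "Manipal"]))
```

In this sketch the initial model has never seen "M.Tech" or "Manipal", so it tags them as "O". The correction step relabels them from the lists, and the retrained model then recognizes both, which is the effect the retraining step is meant to achieve.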

Perspectives

Our intention in writing this paper was to address the burden of manually annotating datasets for training. Resumes, being close to every job seeker's heart, were a natural area of interest. Hopefully we have inspired people to become lazier, in a good way, so that manual annotation for neural networks and machine learning can be reduced much further.

Dr. Sanjay Singh
Manipal Institute of Technology, Manipal

Read the Original

This page is a summary of: Semi-supervised deep learning based named entity recognition model to parse education section of resumes, Neural Computing and Applications, September 2020, Springer Science + Business Media,
DOI: 10.1007/s00521-020-05351-2.
