A thorough review of models, evaluation metrics, and datasets on image captioning

Gaifang Luo; Lijun Cheng; Chao Jing; Can Zhao; Guozhu Song

doi:10.1049/ipr2.12367

What is it about?

Image captioning means generating descriptive sentences from an image. We illustrate representative methods and discuss their advantages and limitations. The ultimate goal of this work is to serve as a tool for understanding the existing literature and to highlight future directions in image captioning that the Computer Vision and Natural Language Processing communities may benefit from.

Why is it important?

To give a testament to the journey that captioning has taken so far and to encourage novel ideas, this paper provides a holistic overview of the models developed in recent years. Another contribution of this study is a quantitative comparison of the main image captioning methods on standard metrics, together with a discussion of the strengths and weaknesses of the various techniques, which clarifies the performance, differences, and characteristics of the most critical models. Finally, we outline recent research trends in image captioning and discuss some open challenges and future directions.

Perspectives

I hope this article makes what people might think of as a boring, slightly abstract area, artificial intelligence and machine learning, interesting and maybe even exciting. Artificial intelligence is quietly changing our way of life and making our lives better. More than anything else, I hope you find this article thought-provoking.
Gaifang Luo
Shanxi Agricultural University

This page is a summary of: A thorough review of models, evaluation metrics, and datasets on image captioning, IET Image Processing, November 2021, the Institution of Engineering and Technology (the IET),
DOI: 10.1049/ipr2.12367.

Contributors

The following have contributed to this page

Gaifang Luo
Shanxi Agricultural University
