Automatic extraction of specimens from multi specimen herbaria

Kenzo Milleville; Krishna Kumar Thirukokaranam Chandrasekar; Steven Verstockt

doi:10.1145/3575862

What is it about?

Herbarium collections are invaluable resources for studying plant diversity and evolution, with specimens often serving as a record of past and present plant distributions. To fully leverage these collections, it is necessary to develop tools that can automatically process and enrich digitized specimens, making it easier for researchers to access and analyze the data. While models that work well with single specimens are available, there is a need to develop models that can accurately extract multiple specimens from the same image. This paper addresses this challenge by experimenting with different deep learning models to identify the best approach to localize plant specimens in more complex herbarium sheets. We found that segmentation models outperformed detection models and achieved promising results for multi-specimen extraction. The main bottleneck in this research was the lack of labeled data, which is essential for training and evaluating deep learning models. To address this issue, methods were developed to semi-automatically generate specimen annotations based on color segmentation. These annotations were then combined via a copy-paste augmentation method, which improved the model's accuracy.
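The two data-generation ideas mentioned above, color-based annotation and copy-paste augmentation, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a roughly uniform light sheet background, and the function names `color_mask` and `copy_paste`, the `tol` threshold, and the toy images are all hypothetical choices made for illustration.

```python
import numpy as np


def color_mask(image, background_rgb=(255, 255, 255), tol=40.0):
    """Return a boolean mask of pixels that differ from the background color.

    Assumption: the sheet background is roughly uniform; `tol` is a
    hypothetical threshold on Euclidean RGB distance.
    """
    diff = image.astype(np.float64) - np.asarray(background_rgb, dtype=np.float64)
    return np.linalg.norm(diff, axis=-1) > tol


def copy_paste(target, source, mask, top, left):
    """Paste the masked pixels of `source` onto a copy of `target`.

    Returns the augmented image and the pasted specimen's mask in
    target coordinates, which can serve as its annotation.
    The pasted region must fit inside `target`.
    """
    h, w = mask.shape
    out = target.copy()
    region = out[top:top + h, left:left + w]  # view into `out`
    region[mask] = source[mask]
    pasted = np.zeros(target.shape[:2], dtype=bool)
    pasted[top:top + h, left:left + w] = mask
    return out, pasted


# Toy example: a 4x4 white "sheet" with a 2x2 green "specimen".
specimen_sheet = np.full((4, 4, 3), 255, dtype=np.uint8)
specimen_sheet[1:3, 1:3] = (30, 120, 40)
mask = color_mask(specimen_sheet)

# Paste the specimen onto an empty 8x8 white sheet at offset (3, 2).
blank_sheet = np.full((8, 8, 3), 255, dtype=np.uint8)
augmented, pasted_mask = copy_paste(blank_sheet, specimen_sheet, mask, top=3, left=2)
```

Calling `copy_paste` repeatedly with different specimens and offsets would build a synthetic multi-specimen sheet together with one mask per specimen. Real herbarium sheets also contain labels, stamps, and shadows, so a plain distance threshold like this one would likely need manual review, which matches the semi-automatic framing above.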


Why is it important?

This research is an essential step towards making herbarium collections more accessible for research and analysis. Automated localization and extraction of plant specimens from herbarium sheets lets researchers analyze and compare specimens on a larger scale, which can help advance our understanding of plant diversity. The work expands upon previous single-specimen approaches and applies the developed techniques to complex herbarium sheets featuring multiple specimens. Additionally, methods were developed to semi-automatically generate annotations for herbarium images, significantly reducing the manual annotation effort needed to train the deep learning models.

Perspectives

It was a great pleasure to write this article with my co-authors, as it demonstrates our ongoing research on applying computer vision methods to herbarium collections. Many problems remain in processing and analyzing natural science and archive collections at scale. Automated methods make these collections more accessible and reduce manual annotation efforts, opening up new research opportunities.
Kenzo Milleville
Universiteit Gent

This page is a summary of: Automatic extraction of specimens from multi specimen herbaria, Journal on Computing and Cultural Heritage, March 2023, ACM (Association for Computing Machinery), DOI: 10.1145/3575862.