What is it about?

Technology-Assisted Review (TAR) aims to speed up document review (e.g. of medical articles or legal documents) by iteratively combining machine learning algorithms with human feedback on document relevance. Continuous Active Learning (CAL) algorithms have demonstrated superior performance compared to other methods in efficiently identifying relevant documents. We address one of the key challenges for CAL algorithms: deciding when to stop showing documents to reviewers.

Why is it important?

In this paper, we address the problem of deciding when to stop a TAR process under the continuous active learning framework. We jointly train a ranking model to rank documents and conduct a “greedy” sampling to estimate the total number of relevant documents in the collection. We prove that the proposed estimators are unbiased under a with-replacement sampling design. Experimental results demonstrate that the proposed approach, like CAL, retrieves relevant documents effectively, and that it also provides a transparent, accurate, and effective stopping point.
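
To make the idea of unbiased estimation under with-replacement sampling concrete, here is a minimal sketch using the textbook Hansen-Hurwitz estimator. It is an illustration only, not necessarily the estimator defined in the paper. The function names, the toy collection, and the sampling probabilities (which here simply favour some documents, as a ranking model might) are all assumptions made for this example.

```python
import random


def hansen_hurwitz_total(sampled_labels, sampled_probs):
    """Estimate the total number of relevant documents.

    sampled_labels: relevance labels (1 or 0) of the drawn documents.
    sampled_probs:  per-draw selection probability of each drawn document.
    Draws are assumed independent and with replacement.
    """
    n = len(sampled_labels)
    if n == 0:
        raise ValueError("at least one draw is required")
    # Each draw contributes label / probability; the mean is unbiased
    # for the population total under with-replacement sampling.
    return sum(y / p for y, p in zip(sampled_labels, sampled_probs)) / n


def simulate_estimate(relevance, probs, n_draws, rng):
    """Draw documents with replacement and return one estimate (toy setup)."""
    drawn = rng.choices(range(len(relevance)), weights=probs, k=n_draws)
    labels = [relevance[i] for i in drawn]
    draw_probs = [probs[i] for i in drawn]
    return hansen_hurwitz_total(labels, draw_probs)


if __name__ == "__main__":
    # Hypothetical collection: 10 documents, 2 of them relevant.
    relevance = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
    # Hypothetical sampling distribution that favours documents 0 and 3.
    probs = [0.3, 0.05, 0.05, 0.3, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]
    rng = random.Random(0)
    estimates = [simulate_estimate(relevance, probs, 50, rng) for _ in range(2000)]
    print("mean estimate:", sum(estimates) / len(estimates))
    print("true total:   ", sum(relevance))
```

Averaged over many repetitions, the estimate should sit close to the true number of relevant documents even though the sampling deliberately over-selects some documents. A stopping rule could then compare the number of relevant documents found so far against such an estimate, though the specific rule used in the paper is not reproduced here.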

Perspectives

This paper addresses an important but under-studied problem in information retrieval. It proposes a new method to automatically determine when to stop reviewing or assessing documents. It also makes a number of popular baselines publicly available, which is important for reproducibility. The writing is clear, and reading it is an enjoyable experience.

Dan Li
University of Amsterdam

Read the Original

This page is a summary of: When to Stop Reviewing in Technology-Assisted Reviews, ACM Transactions on Information Systems, October 2020, ACM (Association for Computing Machinery),
DOI: 10.1145/3411755.