Technology Assisted Review Systems: Current and Future Directions

Di Nunzio G. M.
2024

Abstract

Technology-Assisted Review (TAR) systems are becoming indispensable in domains demanding extensive document screening with high precision, notably in eDiscovery and systematic biomedical reviews. Recent advancements in machine learning, particularly the emergence of Large Language Models (LLMs), have expanded the capabilities of TAR systems, enabling them to handle voluminous text data more efficiently and accurately. Despite these strides, significant challenges remain, including the development of effective stopping criteria, availability of high-quality domain-specific datasets, and robust evaluation metrics to ensure reproducibility and defensibility in high-stakes applications. This paper surveys recent trends and emerging methodologies in TAR, with an emphasis on approaches aimed at improving document relevance screening, query generation, and validation protocols across active learning (AL) and reinforcement learning (RL) frameworks. We examine the utilization of LLMs for Boolean query refinement and abstract screening, particularly in enhancing systematic review workflows. Additionally, we discuss the role of specialized datasets and data-driven approaches in addressing the unique requirements of TAR systems in fields like biomedical research and eDiscovery.
CEUR Workshop Proceedings
3rd Workshop on Augmented Intelligence for Technology-Assisted Reviews Systems, ALTARS 2024
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3542120