Holt JR, Chew RF, Bunnage E, Lea I, Borghoff SJ, Sayer R, et al. Categorizing articles for environmental health systematic evidence mapping: A zero-shot machine learning approach using large language models. American Society of Cellular and Computational Toxicology Annual Meeting, Research Triangle Park, NC, October 2024.
Abstract
Systematic evidence mapping (SEM) is an approach for collecting and organizing sets of literature to assess trends and identify knowledge gaps. SEM requires that two or more human reviewers search for documents and agree on the domain-specific labels to apply to a large number of them, based on expert judgment. These labels are then analyzed to understand the scope and structure of the available literature.
Applying supervised machine learning (ML) approaches in SEM can greatly reduce time and labor, and such approaches have become increasingly useful in tackling the mounting complexity of modern environmental health SEMs. However, the effectiveness of supervised ML approaches depends on high-quality training data, which can be burdensome to create for each bespoke SEM.
This work tests the utility of general-purpose large language models (LLMs) for highly specialized environmental health SEMs, using full-text journal articles in PDF format from a systematic review focused on thyroid hormone. To test this unsupervised, zero-shot approach, we inventoried (i.e., categorized) 636 full-text journal articles using GPT-4 across 37 criteria, including indicators for species, reference type, and mechanism. Model predictions were compared to human labels to assess the quality of the predictions.
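The comparison of model predictions to human labels can be illustrated with a minimal sketch. It assumes each criterion is a binary indicator and scores it with accuracy, precision, recall, and F1; the abstract does not state which metrics were used, so these are illustrative choices. The function name `per_criterion_scores`, the criterion names, and the toy labels are hypothetical and are not data from the study.

```python
from collections import defaultdict


def per_criterion_scores(human, model):
    """Score binary model labels against human labels, one result per criterion.

    human, model: dict mapping article_id -> {criterion_name: bool}.
    The metric choice here is an assumption, not the study's stated method.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for article_id, human_labels in human.items():
        model_labels = model.get(article_id, {})
        for criterion, truth in human_labels.items():
            if criterion not in model_labels:
                continue  # skip criteria the model did not answer
            pred = model_labels[criterion]
            if pred and truth:
                key = "tp"
            elif pred and not truth:
                key = "fp"
            elif truth:
                key = "fn"
            else:
                key = "tn"
            counts[criterion][key] += 1

    scores = {}
    for criterion, c in counts.items():
        total = sum(c.values())
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        precision = c["tp"] / predicted_pos if predicted_pos else 0.0
        recall = c["tp"] / actual_pos if actual_pos else 0.0
        denom = precision + recall
        scores[criterion] = {
            "accuracy": (c["tp"] + c["tn"]) / total,
            "precision": precision,
            "recall": recall,
            "f1": 2 * precision * recall / denom if denom else 0.0,
        }
    return scores


# Hypothetical toy labels for four articles and two criteria.
human = {
    "a1": {"species_rat": True, "is_review": False},
    "a2": {"species_rat": True, "is_review": True},
    "a3": {"species_rat": False, "is_review": False},
    "a4": {"species_rat": False, "is_review": False},
}
model = {
    "a1": {"species_rat": True, "is_review": False},
    "a2": {"species_rat": False, "is_review": True},
    "a3": {"species_rat": False, "is_review": False},
    "a4": {"species_rat": True, "is_review": False},
}

for name, result in per_criterion_scores(human, model).items():
    print(name, result)
```

Reporting one score set per criterion, rather than a single pooled number, keeps rare categories from being masked by common ones.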
Results show that the tested LLMs have near-human proficiency across many evidence mapping categories relevant to environmental health. Furthermore, the model returned similar results across different random states. These findings point to the potential for using LLMs to aid in the categorization and triage of large literature sets to facilitate systematic evidence review.
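One way to quantify the stability across random states is the fraction of article-criterion labels on which all runs agree. The sketch below assumes that "random states" correspond to repeated labeling runs with different seeds and that each run yields one label per article and criterion; the function `run_consistency` and the toy runs are hypothetical, and the abstract does not specify how similarity was measured.

```python
_MISSING = object()  # sentinel so an absent label never matches a real one


def run_consistency(runs):
    """Return the fraction of (article, criterion) labels identical in every run.

    runs: list of dicts, each mapping article_id -> {criterion_name: label}.
    The first run defines which (article, criterion) pairs are checked.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to measure consistency")
    first, others = runs[0], runs[1:]
    total = 0
    unanimous = 0
    for article_id, labels in first.items():
        for criterion, label in labels.items():
            total += 1
            if all(
                run.get(article_id, {}).get(criterion, _MISSING) == label
                for run in others
            ):
                unanimous += 1
    return unanimous / total if total else 0.0


# Hypothetical toy output from three runs with different seeds.
runs = [
    {"a1": {"species_rat": True, "is_review": False}},
    {"a1": {"species_rat": True, "is_review": False}},
    {"a1": {"species_rat": True, "is_review": True}},
]
print(run_consistency(runs))
```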
This abstract does not necessarily reflect U.S. EPA policy.