Shobair M, Allen D, Ballone I, Brull M, Burbank M, Choksi N, Fitch S, Irizar A, et al. A machine-learning approach for the development of a knowledgebase to identify systemic toxicity mechanistic targets and evaluate biological space coverage. Abstract 5161, Society of Toxicology 65th Annual Meeting, San Diego, CA, March 2026.

Abstract

OPEN ACCESS

Background and Purpose: With increased reliance on New Approach Methods (NAMs) to support chemical risk assessment, mechanistic data modeling frameworks and knowledgebases are needed to facilitate hypothesis-based hazard evaluation. Methods: A data discovery and enrichment approach was developed using natural language processing (NLP) and large language models (LLMs) to extract key mechanistic relationships from unstructured research articles and build a knowledgebase of adverse outcome pathways (AOPs) related to systemic toxicity. The workflow employs contextual learning and hierarchical data extraction to connect molecular targets and biological processes with downstream events relevant to systemic toxicity. Results: Over 1000 mechanistic pathways for more than 9000 chemicals were identified using an integrated machine learning scheme that incorporates human input for model refinement, data confidence quantification, and ontology-based standardization. Expert curation, validation, and interpretation, combined with calibrated statistical confidence scoring, achieved over 90% precision for chemical-target relationships and up to 100% precision for data cited in multiple studies. Conclusions: The knowledgebase was validated against AOPWiki annotations and expert-curated pharmacological datasets, providing insights into systemic toxicity mechanistic pathways. It can inform the assessment of biological space coverage for systemic toxicity and the mapping needed to identify downstream mechanistic events, furthering understanding of the bioactivity-to-adversity continuum.
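The precision figures in the Results can be read as the fraction of extracted chemical-target relationships that expert curation confirms, optionally restricted to relationships cited in several studies. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `precision_by_support`, the tuple layout, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def precision_by_support(extracted, curated, min_studies=1):
    """Precision of extracted (chemical, target) pairs against a curated set.

    extracted   -- iterable of (chemical, target, study_id) tuples
                   (hypothetical layout, assumed for this sketch)
    curated     -- set of (chemical, target) pairs confirmed by experts
    min_studies -- keep only pairs cited in at least this many distinct studies
    Returns None when no pair meets the support threshold.
    """
    # Collect the distinct studies that support each extracted pair.
    studies = defaultdict(set)
    for chemical, target, study_id in extracted:
        studies[(chemical, target)].add(study_id)

    # Apply the multi-study support filter.
    kept = [pair for pair, ids in studies.items() if len(ids) >= min_studies]
    if not kept:
        return None

    # Precision = confirmed pairs / all pairs that passed the filter.
    confirmed = sum(1 for pair in kept if pair in curated)
    return confirmed / len(kept)


# Toy records for illustration only; these are not data from the study.
extracted = [
    ("chem_A", "target_1", "study_1"),
    ("chem_A", "target_1", "study_2"),
    ("chem_B", "target_2", "study_1"),
    ("chem_C", "target_3", "study_3"),
]
curated = {("chem_A", "target_1"), ("chem_B", "target_2")}

overall = precision_by_support(extracted, curated)
multi_study = precision_by_support(extracted, curated, min_studies=2)
```

In this toy set, two of the three distinct pairs are confirmed, while the only pair supported by two studies is confirmed. This mirrors the general pattern reported in the abstract, where precision rises for relationships cited in multiple studies.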