Publications : 2024

Nelms M, Antonijevic T, Ring C, Harris D, Bever RJ,… Borghoff S, et al. Chemistry domain of applicability evaluation of estrogen receptor high-throughput assay-based activity models. Abstract 4380, Society of Toxicology Annual Meeting, Salt Lake City, UT, March 2024. 

Abstract

Background and Purpose: The U.S. Environmental Protection Agency’s (EPA’s) Endocrine Disruptor Screening Program (EDSP) Tier 1 assays screen for potential endocrine system perturbations. A model integrating data from 16 high-throughput assays to predict estrogen receptor (ER) agonism has been proposed as an alternative to some low-throughput Tier 1 assays. Later work demonstrated that as few as 4 assays could replicate the ER agonism predictions from the full model with 98% sensitivity and 92% specificity. The current study used chemical clustering to illustrate the coverage of the EDSP Universe of Chemicals (UoC) by the chemicals tested in the ER pathway model and to investigate the utility of chemical clustering for evaluating the screening approach, using a 4-assay model as a test case. While the full original assay battery is no longer available, the demonstrated contribution of chemical clustering is broadly applicable to assay sets, chemical inventories, and models, and the data analysis used can also be applied to future evaluation of minimal assay models for consideration in screening.
Methods: Chemical structures were collected via the CompTox Chemicals Dashboard for 6,947 substances from the UoC of more than 10,000 substances and grouped based on structural similarity, generating 826 chemical clusters. Of the 1,812 substances run in the original ER model, 1,730 had a single, clearly defined structure. The ER model chemicals with a clearly defined structure that were not present in the EDSP UoC were assigned to chemical clusters using a k-nearest neighbors (k-NN) approach, resulting in 557 EDSP UoC clusters containing at least one ER model chemical.
Results: Performance of a 4-assay model relative to the full ER agonist model was analyzed by chemical cluster. This was a case study, and a similar analysis can be performed with any subset model for which the same chemicals (or a subset of them) are screened. Of the 365 clusters containing >1 ER model chemical, 321 did not have any chemicals predicted to be agonists by the full ER agonist model. The best 4-assay subset ER agonist model disagreed with the full ER agonist model by predicting agonist activity for 122 chemicals from 91 of the 321 clusters. There were 44 clusters with at least 2 chemicals and at least one agonist based upon the full ER agonist model, which allowed accuracy to be calculated on a per-cluster basis. The accuracy of the best 4-assay subset ER agonist model ranged from 50% to 100% across these 44 clusters, with 32 clusters having an accuracy ≥90%. Overall, the best 4-assay subset ER agonist model resulted in 122 false positive and only 2 false negative predictions compared to the full ER agonist model. Most false positives (89) were active in only 2 of the 4 assays, whereas all but 11 true positive chemicals were active in at least 3 assays. False positive chemicals also tended to have lower area under the curve (AUC) values: 110 of the 122 false positives had an AUC value below 0.214, a value lower than that of 75% of the chemicals predicted positive by the full ER agonist model. The median AUC value for the 122 false positives from the best 4-assay subset ER agonist model was 0.138, while the threshold for an active prediction is 0.1.
Conclusions: Our results show that the 4-assay model performs well across a range of structurally diverse chemicals. While this is a descriptive analysis of previous results, several concepts can be applied to any screening model used in the future. First, the clustering of the chemicals provides a means of ensuring that future screening evaluations consider the broad chemical space represented by the EDSP UoC. The clusters can also assist in prioritizing future chemicals for screening in specific clusters based on the activity of known chemicals in those clusters. The clustering approach can provide a framework for evaluating which portions of the EDSP UoC chemical space are reliably covered by in silico and in vitro approaches and where predictions from either method alone or both methods combined are most reliable. The lessons learned from this case study can be readily applied to future evaluations of model applicability and to the screening of future datasets.
The views expressed in this abstract are those of the authors and do not necessarily reflect the views or policies of the U.S. EPA.
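As an illustration only, the per-cluster comparison described in the Results could be sketched as follows. This is not the authors' code: the function name, the record layout, and the toy data are hypothetical, and accuracy is assumed here to mean the fraction of chemicals in a cluster for which the subset model and the full model make the same call. As in the abstract, accuracy is reported only for clusters with at least 2 chemicals and at least one full-model agonist.

```python
from collections import defaultdict


def per_cluster_summary(records):
    """Summarize subset-model vs. full-model agreement per cluster.

    records: iterable of (cluster_id, full_active, subset_active) tuples,
    where the two flags are booleans (hypothetical layout, for illustration).
    """
    counts = defaultdict(
        lambda: {"n": 0, "agree": 0, "fp": 0, "fn": 0, "full_pos": 0}
    )
    for cluster_id, full_active, subset_active in records:
        c = counts[cluster_id]
        c["n"] += 1
        c["agree"] += int(full_active == subset_active)
        # False positive: subset model calls an agonist, full model does not.
        c["fp"] += int(subset_active and not full_active)
        # False negative: full model calls an agonist, subset model does not.
        c["fn"] += int(full_active and not subset_active)
        c["full_pos"] += int(full_active)

    summary = {}
    for cluster_id, c in counts.items():
        # Accuracy is only reported for clusters with >= 2 chemicals
        # and >= 1 agonist according to the full model (assumed criterion).
        evaluable = c["n"] >= 2 and c["full_pos"] >= 1
        summary[cluster_id] = {
            "n": c["n"],
            "false_positives": c["fp"],
            "false_negatives": c["fn"],
            "accuracy": c["agree"] / c["n"] if evaluable else None,
        }
    return summary


# Toy data, invented for illustration; not taken from the study.
records = [
    ("A", True, True),
    ("A", False, False),
    ("A", False, True),   # false positive in an evaluable cluster
    ("B", False, True),   # false positive in a cluster with no full-model agonist
    ("B", False, False),
    ("C", True, False),   # false negative in a single-chemical cluster
]

summary = per_cluster_summary(records)
total_fp = sum(s["false_positives"] for s in summary.values())
total_fn = sum(s["false_negatives"] for s in summary.values())
```

Clusters that fail the evaluability criterion still contribute to the overall false positive and false negative totals, which mirrors how the abstract reports disagreements in clusters without any full-model agonist separately from per-cluster accuracy.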