Leveraging high-throughput screening data, deep neural networks, and conditional generative adversarial networks to advance predictive toxicology

doi:10.1371/journal.pcbi.1009135

Fig 1.

Regression generator diagram.

Schematic representation of Go-ZT architecture showing chemical structural input represented as weights (w_i) and views (v_i) matrices passed through two fully connected neural networks to produce a predicted toxicity matrix. Darker matrix shading indicates higher toxicity values.

Fig 2.

Conditional GAN diagram.

Schematic representation of GAN-ZT architecture showing chemical structural input represented as weights (w_i) and views (v_i) matrices passed through two fully connected neural networks to produce a predicted toxicity matrix. Chemical features, along with predicted or empirical toxicity matrices, are then passed to a discriminator comprising a fully connected neural network. Darker matrix shading indicates higher toxicity values.

Fig 3.

Data subdivision.

Principal component analysis comparing the physicochemical properties of the training and test sets, displayed against the background of over 800,000 chemicals in the Integrated Chemical Environment database.

Fig 4.

Experimental design.

Schematic representation of the experimental approach for screening developmental and neurotoxicity of chemicals in larval zebrafish.

Fig 5.

Diagram showing the vectorization of methyl isothiocyanate.

Atom information from the PDB file (shown in grey) is converted into the views and weights matrices. In the views space (v_i), columns one and two identify the chemical species: they give each atom’s period and group on the periodic table, respectively. The last three columns give the relative position of each atom. The weight space (w_i) values correspond to each of the views space matrices. In the first view, C1 is set at the center, while in the second view, C2 is set at the center. This molecule has nine views, which can be reduced to three views if preference is given only to carbon.
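
The five-column view layout described in this caption can be illustrated with a minimal sketch. It assumes one view per chosen center atom, with positions expressed relative to that center; the exact rule the authors use to enumerate views (and hence the nine and three view counts quoted above) is not given in the caption and is not reproduced here. The function name `build_views`, the `center_elements` parameter, and the coordinates in the usage note are hypothetical, and the weights matrices (w_i) are not modeled.

```python
import numpy as np

# (period, group) positions on the periodic table for the elements used here.
PERIOD_GROUP = {"H": (1, 1), "C": (2, 14), "N": (2, 15), "S": (3, 16)}


def build_views(atoms, center_elements=None):
    """Build one (n_atoms, 5) view matrix per center atom.

    atoms: list of (element, x, y, z) tuples, e.g. parsed from a PDB file.
    center_elements: optional set of element symbols allowed as centers
        (an assumed way of "giving preference" to one element).
    Columns of each view: period, group, dx, dy, dz relative to the center.
    """
    coords = np.array([a[1:] for a in atoms], dtype=float)
    species = np.array([PERIOD_GROUP[a[0]] for a in atoms], dtype=float)
    views = []
    for i, (element, *_) in enumerate(atoms):
        if center_elements is not None and element not in center_elements:
            continue
        relative = coords - coords[i]  # center atom maps to (0, 0, 0)
        views.append(np.hstack([species, relative]))
    return views
```

For example, with placeholder coordinates (illustrative values, not a real geometry), `build_views([("C", 0, 0, 0), ("N", 1.5, 0, 0), ("C", 2.7, 0, 0), ("S", 4.3, 0, 0)], center_elements={"C"})` returns two 4 x 5 matrices, one centered on each carbon.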

Table 1.

Summary of training and testing data used in this study.

Fig 6.

Go-ZT and GAN-ZT loss functions during training.

Changes in the loss functions during training of (A) Go-ZT and (B) GAN-ZT.

Table 2.

Performance of different methods in activity classification with 10-fold cross-validation.

Table 3.

Performance of different methods in activity prediction of test set chemicals.

Fig 7.

Test dataset confusion matrices.

Evaluation of the classification of chemicals in the test data set as either active or inactive using real versus generated toxicity matrices by (A) Go-ZT or (B) GAN-ZT. Color scale represents percent of total chemicals.
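
A confusion matrix scaled to percent of total chemicals, as in the color scale described here, can be computed along the following lines. This is a minimal sketch that assumes each chemical has already been labeled active or inactive; how activity calls are derived from the real and generated toxicity matrices is not stated in the caption, and the function name `confusion_percent` is hypothetical.

```python
import numpy as np


def confusion_percent(y_true, y_pred):
    """Return a 2x2 confusion matrix as percent of all chemicals.

    Rows are the true class (inactive, active); columns are the
    predicted class (inactive, active). Inputs are boolean-like labels.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    counts = np.array(
        [
            [np.sum(~y_true & ~y_pred), np.sum(~y_true & y_pred)],
            [np.sum(y_true & ~y_pred), np.sum(y_true & y_pred)],
        ],
        dtype=float,
    )
    return 100.0 * counts / counts.sum()
```

Because the matrix is normalized by the total count rather than by row, its four cells always sum to 100.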

Fig 8.

Model consensus on chemical activity.

(A) Venn diagram showing the overlap between true active chemicals and chemicals predicted to be active by either Go-ZT or GAN-ZT. (B) Confusion matrix showing the performance of the combined Go-ZT and GAN-ZT models on the test dataset.

Table 4.

Model performance using shuffled data.
