Deriving time-concordant event cascades from gene expression data: A case study for Drug-Induced Liver Injury (DILI)

Adverse event pathogenesis is often a complex process which compromises multiple events ranging from the molecular to the phenotypic level. In toxicology, Adverse Outcome Pathways (AOPs) aim to formalize this as temporal sequences of events, in which event relationships should be supported by causal evidence according to the tailored Bradford-Hill criteria. One of the criteria is whether events are consistently observed in a certain temporal order and, in this work, we study this time concordance using the concept of “first activation” as data-driven means to generate hypotheses on potentially causal mechanisms. As a case study, we analysed liver data from repeat-dose studies in rats from the TG-GATEs database which comprises measurements across eight timepoints, ranging from 3 hours to 4 weeks post-treatment. We identified time-concordant gene expression-derived events preceding adverse histopathology, which serves as surrogate readout for Drug-Induced Liver Injury (DILI). We find known mechanisms in DILI to be time-concordant, and show further that significance, frequency and log fold change (logFC) of differential expression are metrics which can additionally prioritize events although not necessary to be mechanistically relevant. Moreover, we used the temporal order of transcription factor (TF) expression and regulon activity to identify transcriptionally regulated TFs and subsequently combined this with prior knowledge on functional interactions to derive detailed gene-regulatory mechanisms, such as reduced Hnf4a activity leading to decreased expression and activity of Cebpa. At the same time, also potentially novel events are identified such as Sox13 which is highly significantly time-concordant and shows sustained activation over time. Overall, we demonstrate how time-resolved transcriptomics can derive and support mechanistic hypotheses by quantifying time concordance and how this can be combined with prior causal knowledge, with the aim of both understanding mechanisms of toxicity, as well as potential applications to the AOP framework. We make our results available in the form of a Shiny app (https://anikaliu.shinyapps.io/dili_cascades), which allows users to query events of interest in more detail.

ranging from the molecular to the phenotypic level. In toxicology, Adverse Outcome Pathways 23 (AOPs) aim to formalize this as temporal sequences of events, in which event relationships should 24 be supported by causal evidence according to the tailored Bradford-Hill criteria. One of the criteria is 25 whether events are consistently observed in a certain temporal order and, in this work, we study this 26 time concordance using the concept of "first activation" as data-driven means to generate 27 hypotheses on potentially causal mechanisms. As a case study, we analysed liver data from repeat-28 dose studies in rats from the TG-GATEs database which comprises measurements across eight 29 timepoints, ranging from 3 hours to 4 weeks post-treatment. We identified time concordant gene 30 expression-derived events preceding adverse histopathology, which serves as surrogate readout for 31 Drug-Induced Liver Injury (DILI). We find known mechanisms in DILI to be time-concordant, and 32 show further that significance, frequency and log fold change (logFC) of differential expression are 33 metrics which can additionally prioritize events although not necessary to be mechanistically 34 relevant. Moreover, we used the temporal order of transcription factor (TF) expression and regulon 35 activity to identify transcriptionally regulated TFs and subsequently combined this with prior 36 knowledge on functional interactions to derive detailed gene-regulatory mechanisms, such as 37 reduced Hnf4a activity leading to decreased expression and activity of Cebpa. At the same time, 38 also potentially novel events are identified such as Sox13 which is highly significantly time-39 concordant and shows sustained activation over time. Overall, we demonstrate how time-resolved 40 transcriptomics can derive and support mechanistic hypotheses by quantifying time concordance 41 and how this can be combined with prior causal knowledge, with the aim of both understanding 42 mechanisms of toxicity, as well as potential applications to the AOP framework. We make our 43 results available in the form of a Shiny app (https://github.com/anikaliu/DILICascades_App), which 44 allows users to query events of interest in more detail. 45

61
Adverse drug reactions are a major reason for compound failure in the clinical trials [1,2] and the 62 significant cause for post-marketing withdrawals. To counter exposing patients to these risks, it is 63 desired to identify adverse events earlier in the individual patient but also in the drug development 64 process. Mechanistic understanding of how adverse event pathogenesis is crucial in this regard, i.e. 65 to derive early safety biomarkers or in vitro assays. However, current understanding of toxicity is 66 largely incomplete, in particular for complex phenotypes such as organ injury which can usually be 67 which readouts describing molecular or cellular effects are indicative for systems-level adverse 93 effects in the first place. 94 Biological readouts such as transcriptomics are particularly suited to study such intermediate key 95 events as they provide broad insights into cellular changes, e.g. in contrast to target profiling, which 96 can then lead to the identification of predictive signatures and mechanistically relevant insights. This 97 is for example true in the context of DILI [14][15][16][17][18], which is a major cause for attrition in drug 98 development and accounts for around half of the cases of acute liver failure in the US and European 99 countries [19,20]. In this regard, in particular time-series data is interesting as it is able to trace the 100 dynamic effects throughout pathogenesis. Previous studies focussed on the time (and dose) 101 dependence of gene expression-derived events in the context of adverse findings [21,22], so the 102 changes of individual events across changes in time (and dose), and also aimed to predict later 103 adverse findings from fixed early timepoints [14,15]. From a mechanistic perspective, however, 104 neither activation at a certain timepoint nor a certain progression over time is mandatory, but only 105 time concordance, so activation of the key event before the downstream adverse effects. 106 In this study, we hence quantify time concordance across gene expression-derived cellular events 107 and adverse events based on histopathology across wide range of compounds. To do so, we 108 introduce the concept of "first activation" for mechanistic analysis, which focusses only on the 109 earliest timepoint an event can be reliably detected and then orders events within a time-series by 110 their timepoint of first activation ( Fig 1A). In contrast to previous time concordance analyses in 111 AOPs which addressed a defined set of KER and known KE [23][24][25], this analysis derives statistical 112 evidence for temporal concordance across time-series and can do so for any combination of events 113 based on gene expression or histopathology. Although the confidence of these temporal orders per 114 time series is limited by the noisiness of gene expression data and the low time resolution, statistical 115 significance can be evaluated across time series and we only consider relations time-concordant if 116 the preceding event is significantly more frequently found before the later event than in time series 117 without ( Fig 1B). Furthermore, this also allows us to separate out events which depict general 118 perturbation response but are unspecific, as well as rare events, which are predictive but only 119 observed for a small subset of compounds. concordance can both prioritize novel links and provide further evidence on potential 131 mechanistic links between events. Events are indicated as nodes and mechanistic links 132 between them as edges. 133 We demonstrate the utility of this concept in this work using liver gene expression and 134 histopathology data from repeat-dose studies in rats provided by the TG-GATEs database (S1 Fig)  135 This allows us to take advantage of previous data curation and work on the dataset itself, in 136 ral nly he rst es en on e a e ial ks nd ).
in particular by Sutherland et al. [15] who provide an adverse classification of each compound-dose 137 combination and toxscores summarising histopathological findings in each condition. Furthermore, 138 Drug-Induced Liver Injury (DILI) is well understood in comparison to other organ-level toxicities and 139 we hence know which processes are expected to precede injury, including cell death, inflammation 140 and other adaptive stress responses [19]. The concept, however, is generally applicable beyond the 141 toxicity area and transcriptomics data and can be used to derive mechanistic event cascades from 142 time-series data of any kind as long as the first activation of events within the time-series can be 143 defined. 144 We first describe the time concordance for known processes, similar to mechanistic qAOPs, and 145 then prioritize predictive, time-concordant KE providing a strong data-driven, automatable starting 146 point for AOP development, aligning with the objective of cpAOPs (more detailed comparison in S1 147 Table). We then combine data-driven time concordance and prior knowledge on event relations 148 between transcription factors (TFs) and gene expression to generate hypotheses for causal gene-149 regulatory mechanisms in DILI pathogenesis and to generally show how time concordance can 150 stratify and support other streams of causal evidence. Overall, we show that time-resolved gene 151 expression and histopathology data can be used to quantify time concordance across a large set of 152 compounds and events, which allows us to characterize known mechanistic links and to prioritize 153 new ones ( Fig 1C). 154

155
In order to derive the time concordance between cellular events and later adverse histopathology, 156 we use the workflow outlined in Fig 2 with each step being also introduced in the subsequent 157 sections and details on their respective implementation in Methods. We first derived TF and 158 pathway activity across expression profiles from the same experiment and subsequently defined the 159 first up-or downregulation TFs or pathways as events. Furthermore, we obtained binary 160 histopathology labels describing the occurrence of each histopathological finding at different levels 161 of severity and frequency from the Toxscores provided by Sutherland et al. [15]. Subsequently, we 162 derive the earliest timepoint of each event, e.g. pathways or adverse histopathology, within each 163 time-series. As last step of the time concordance analysis, we then evaluate which gene 164 expression-derived events are significantly enriched before or at the time where adverse 165 histopathology is found, as well as additional time concordance metrics outlined in Table 1

176
To define the earliest timepoint of adverse histopathology within each time-series, we used the 177 annotations of time series as adverse or non-adverse by Sutherland et al. [15], as well as the 178 toxscores, which summarise the severity and frequency for each histological finding and each 179 compound-dose-time combination as mean severity score and range from 0 (normal) to 4 (severe) 180 els we ch ne se ed on he he as en se he he ch e).
These toxscores were used to define three levels for each histological finding: "null" (toxscore > 0), 181 "low" (toxscore > 0.67) and "high" (toxscore >1.34). For example in case of a toxscore of 1, both 182 "null" and "low" are considered to be present. We then evaluated which histology groups were 183 frequently found in the adverse compound-dose combinations (observed in >10% of adverse time 184 series corresponding to at least 5 out of 40 cases) with at least 50% of findings being in adverse 185 time series (Fig 3A). All of the included histology groups are significantly enriched in adverse 186 conditions, however, these criteria were implemented to identify findings with a certain specificity 187 and frequency instead of allowing a trade-off between both. The histology groups which passed the 188 filtering are regarded as adverse histopathological findings and include hepatocellular single cell 189 necrosis and biliary hyperplasia at all toxscore thresholds. In contrast, only some of the three 190 toxscore thresholds were selected with the above criteria for all other findings, e.g. the two higher 191 toxscore cut-offs for hepatocellular necrosis and inflammation and only the "high" cut-off for 192 increased hepatocellular mitosis. In all cases, the lower toxscore level was also frequently observed 193 in non-adverse conditions and hence considered too unspecific. In contrast, only the two milder 194 levels of fibrosis were included in the selection, as severe fibrosis was observed rarely. While we 195 focus on the described definition of adverse histopathological findings in this study, the difficulty in 196 summarising a complex phenotype such as DILI into a binary classification, adverse or not adverse, 197 is well established [27,28] and is also demonstrated by the discrepancies between DILI 198 classifications from DILIst, DILIrank and those derived by Sutherland et al based on the TG-GATEs 199 data (S1). We are aware that also broader or more targeted phenotypes might be of interest, and 200 we hence provide a Shiny app where results for alternative definitions of adverse and non-adverse 201 histopathology groups can be explored. 202 For the adverse histopathology labels, the distribution of toxscores and first activation over time (Fig 203 3B) shows that some findings are predominantly found late, like fibrosis, while others are 204 predominantly found early, e.g. hepatocellular single cell necrosis. Next, out of all 360 time-series 205 with at least 6 measured timepoints, the 61 time-series in which any of the adverse histopathology 206 1 0 labels is found were identified, which covered 38 compounds (S2 Table). In those, the earliest 207 evidence of an adverse phenotype is used to approximate the timepoint of the primary adverse 208 phenotype. Across all timeseries with adverse histopathology, we find that hepatocellular single cell 209 necrosis is most frequently the primary adverse phenotype, while biliary hyperplasia at any severity 210 is in most cases a secondary effect (Fig 3C, S3 Fig). Known pathways in DILI preceding adverse histopathology 225 To identify cellular mechanisms in the early pathogenesis of DILI, we next studied time-concordant 226 cellular changes preceding later adverse histopathology (see Methods). This identified 911 227 pathway-level events (37.3%), and 108 TF-level events (33.6%) with significant enrichment (p-228 value<0.05) before or at adverse histopathology. We next evaluated time concordance for a set of 229 ten known events in DILI (Fig 4 and S3 Table). Recycling of bile acids and salts was the most 230 significantly enriched geneset overall and hence also among the ones linked to known events. Also 231 down-regulation of the other bile acid gene sets was significantly enriched (p-value < 0.05) pointing 232 to an overall down-regulation of bile acid metabolism. While cell death was also only found to be up-233 regulated, dysregulation in both directions was found to precede injury for all other key events (Fig  234   4). However, only for peroxisomal processes, namely peroxisomal protein import and beta-oxidation 235 of very long fatty acids, both directions were significantly enriched indicating that dysregulation in 236 either direction might be linked to injury. Overall, significantly enriched gene sets are found for all 237 ten represented known events in DILI (p-value < 0.05) indicating that our analysis is able to recover 238 known cellular events. histopathology is shown for gene sets mapping to known key events in DILI, for which first 243 activation was defined as first timepoint of differential GSVA-derived gene set activity. 244 Furthermore, also enrichment of individual genes within these genesets is shown and was 245 derived based on the first timepoints of differential expression. Aligning with the expected 246 direction, a significant down-regulation of Liver X Receptor (LXR) signalling and bile acid 247 related pathways is observed, while all other gene sets were found to be more significantly 248 up-regulated. Only for peroxisomal pathways, both directions were significantly enriched 249 indicating that dysregulation in direction might be linked to adverse histopathology. 250 To gain insights on a more fine-grained level, we next analysed the enrichment of significantly and 251 strongly (absolute log fold change > 1) dysregulated individual genes from the above gene set (S2 252 File). Among the ten most significantly enriched gene-level events, three are involved in known 253 processes, namely the up-regulation of Acyl-CoA thioesterase 3 (Acot3) and Carnitine O- For JNK signalling, we did not find any significantly enriched genes indicating that while the overall 262 process is changing none of the individual genes shows strong and frequent expression changes. In 263 contrast, the opposite was found for oxidative stress with the Jun Proto-Oncogene (Jun) being one 264 of the most significantly enriched gene-level events but lacking significant changes on the gene-set 265 level. This shows that both gene-and gene-set level analysis can provide complementary insights 266 into cellular changes preceding DILI, and that in some cases effects can be attributed in individual 267 genes which might give more detailed information about the cellular changes. 268 While significant enrichment before or at adverse histopathology can be regarded as a necessary 269 criterion for time concordance, the temporal event relationship can be further characterised based 270 on the observed behaviour across experimental conditions which may be useful to further prioritize 271 mechanistically relevant pathways in a hypothesis-free manner. Following the Bradford-Hill 272 considerations, we hypothesize that this might be the case for observed effect size, frequency and 273 specificity of event occurrence before adverse histopathology. Firstly, we investigated how strongly 274 pathways were dysregulated comparing the maximal absolute log fold changes (|logFCs|) before or 275 at adverse histopathology in each adverse time-series for significantly time-concordant events (Fig 276 5). High median maximal |logFCs| were overall found for mitochondrial and peroxisomal pathways 277 and the highest median maximal |logFC| among all significant events was found for mitochondrial 278 fatty acid oxidation of unsaturated fatty acids. At the same time, however, the high variance for 279 pathways with high median maximal |logFC| as well as the only moderately high |logFC|s observed 280 for other known pathways in DILI, such as programmed cell death. This indicates that a high 281 magnitude of |logFC| is not necessary to contribute to an adverse event, but at the same time can 282 be a useful property to further prioritize important pathways. which correspond to significantly enriched events before adverse histopathology, the max 286 |logFC| before adverse histopathology is shown. In comparison to other known pathways 287 and the overall background distribution, a high logFC is found for mitochondrial beta 288 oxidation followed by peroxisomal beta oxidation and mitophagy. 289 We next analysed to what extent dysregulation in a pathway is predictive for a particular type of 290 histopathology. To this end, we calculated across how many adverse time-series each pathway is 291 observed, summarised by the true positive rate (TPR), and the positive predictive value (PPV) 292 indicating whether presence of the key event is a confident indicator for the later adverse event (Fig  293   6). We focus on significantly enriched events only (p-value < 0.05) and find a trade-off with respect 294 to the highest TPR and PPV (Fig 6; for distribution of all events see S4 Fig). This generally shows 295 that either highly frequent events with lower specificity can be identified, e.g. increased mitophagy 296 (TPR: 0.41, PPV: 0.72), or more specific events at the expense of lower relative frequency, e.g. bile 297 acid recycling (TPR= 0.30, PPV:1). 298 Surprisingly, lower relative frequencies are particularly observed for stress response and signaling 299 pathways with only Liver X Receptor (LXR)-dependent gene expression linked to lipogenesis 300 reaching a TPR over 20%. One explanation for the lower observed frequencies is that these 301 pathways are predominantly and initially mediated through post-transcriptional alterations instead of 302 gene expression changes [35,36], making the expression of pathway members a weak proxy for 303 pathway activation in early pathogenesis and explaining the overall low frequencies. In fact, one 304 reason LXR-dependent changes might have achieved higher frequencies as they explicitly include 305 the downstream regulated genes unlike the other signalling and stress response pathways [37]. 306 Due to the previously discussed complementarity of gene-and gene set-level analysis, we also 307 show TPR and PPV for individual genes with a focus on those which are involved in gene sets 308 mapping to known key events. The most significant genes, already highlighted in Fig 4,  To gain insight into signalling and expression regulation preceding adverse histopathology, we next 323 analysed transcription factors (TFs), which are involved in early perturbation response preceding 324 downstream gene expression changes and also are likely to show strong signal in transcriptomics 325 data given their direct link to gene expression. As known TFs in DILI, we thereby included TFs 326 mediating the stress response and signalling pathways already introduced above, as well as nuclear 327 receptors which take in important roles in liver physiology and malfunctions and can be, both, MIEs 328 or KEs (mapping shown in Fig 7A). Consistent with the pathway-level results, an enriched up-329 regulation was found for Nuclear factor erythroid 2-related factor 2 (Nfe2l2) which is a key mediator 330 of oxidative stress [18,39] as well as the Nf-κB subunits Rela and Nfkb1 indicating inflammation 331 [40], while the Oxysterols Receptors LXRα (Nr1h3) and LXRβ (Nr1h2) which control lipid 332 metabolism showed enriched down-regulation [41]. 333 For ER stress, we included three TFs mapping to the three branches of unfolded protein response 334 [42]: Activating transcription factor 4 (Atf4), Activating transcription factor 6 (Atf6) and X-box binding 335 protein 1 (Xbp1). Atf4 up-regulation was found to be most significantly enriched, most frequent and 336 also showing the largest logFC (Fig 7). This highlights its overall importance in mediating ER stress 337 and is consistent with the known role for ATF4 in DILI [43]. While Atf4 is a member of the pro-338 apoptotic unfolded protein response branch, the ATF6 and XPB1-mediated branches tend to be 339 cytoprotective [44]. In agreement with this, Atf6 was not significantly enriched, however, Xbp1 340 showed rare but significantly enriched down-regulation. 341 Transcription Factor AP-1 (Jun) which is one of downstream target TFs of c-Jun N-terminal kinase 342 (JNK) signaling was not significantly enriched in either direction due its rare activation among 343 adverse time series although JNK signaling up-regulation itself was significantly enriched with Jun 344 up-regulation being one of the most significantly enriched gene-level events. However, JNK 345 signaling is particularly known in acetaminophen-induced liver injury and in this context leads to 346 hepatocyte death through interactions with Sab on the mitochondrial outer membrane and not 347 through transcriptional regulation mediated by 46]. As increased Jun activity is hence 348 known to be a consequence of JNK signaling but not a cause of injury, it would be plausible to see 349 enriched pathway activity but not in TF activity before adverse histopathology. Overall, we were 350 hence able to show significant enrichment of some of the known TFs in DILI before adverse 351 histopathology and can also biologically reason the absence of significance for others. 352 While none of the included TF-level events ranked as most significant or most strongly changing 353 before adverse histopathology as in the analysis of pathway-level events, the down-regulation of 354 Nr1h3, which is involved in lipid metabolism, was identified as most frequent event (Fig 7B)  355 indicating that the linked physiological changes are commonly but not specifically found before 356 injury. Similarly, the up-regulation of stress response, indicated by Nfe2l2 and Atf4, was found to be 357 frequent aligning with their role in adaptive stress response [47]. Overall, frequency might hence be 358 a useful metric to identify pre-adverse cellular events which precede injury but are not highly 359 specific. Data-driven prioritization of cellular events taking place before adverse 368 histopathology 369 As many events were found to be significantly enriched before adverse histopathology, we next 370 aimed at identifying and characterizing events most supported by time concordance, and hence to 371 move closer to the eventual aim of constructing AOPs from data. In our analysis, some known 372 events in DILI ranked highest by enrichment p-value while others rank highest by max. |logFC| 373 before adverse histopathology. In contrast, known TFs in DILI were found as most frequent ones in 374 the dataset. We hence next looked into the top 10 TF-and pathway-level events identified using 375 max. |logFC|, the enrichment p-value, and the TPR before or at adverse histopathology. These are 376 shown in Fig 8 and S4 Table,  Significance of enrichment in adverse conditions for matched TF activity and expression-434 based events. Events only found on the expression or TF level are not included in the figure  435 due to the inability to perform a statistical test for those. B) For significantly enriched TF 436 activity-based events, the True Positive Rate (TPR) of observing TF activity before or at the 437 time of adverse histopathology is shown, as well as the TPR for observing TF expression 438 changes before TF activity in the time series where it precedes adverse histopathology. 439 To derive stronger mechanistic evidence for induction we next evaluated how frequently expression 440 changes precede TF activity in the same adverse time-series and compare this against the overall 441 frequency of TF event occurrences preceding adverse histopathology (Fig 9B). Among the events 442 with significant enrichment of TF expression and activity, the most frequent evidence for induction 443 was found for the down-regulation of CCAAT/enhancer-binding protein alpha (Cebpa). In humans, 444 decreased expression of the homologous CEBPA is not only known across liver diseases, 445 exogenously increased CEBPA expression has also been shown to reverse liver injury and is 446 explored as therapeutic target in hepatocellular carcinoma [55]. The event with the 2nd highest 447 relative frequency of expression preceding TF activity as well as the highest frequency of TF activity 448 preceding injury is Atf4, for which expression of the homologous gene in humans is known to be 449 induced as part of the ER stress response contributing to adverse liver phenotypes [56,57]. In 450 contrast, it was found that for the Aryl Hydrogen Receptor (Ahr) and the Peroxisome Proliferator-451 activated Receptor Alpha (Ppara) changes in expression never preceded those in TF activity which 452 aligns with their roles as nuclear receptors which are generally post-translationally activated via 453 ligand binding [58,59]. As this provides counterevidence for transcriptional induction, these were not 454 included as induced TF in the subsequent analysis. 455 After investigating the mode of regulation for individual TFs above, we next considered how these 456 TFs are interlinked. To this end, we identified protein-protein interactions and, for induced TFs, TF-457 target gene interactions between significantly enriched TFs, which showed significant enrichment 458 before adverse histopathology for both expression and regulon activity, as well as evidence of 459 expression preceding TF activity within the same adverse time series. Results of this analysis are 460 shown in Fig 10, and details on the observed absolute and relative frequencies, as well as the 461 source of the interaction are shown in S5 Table. One of the two identified interactions by highest 462 absolute frequency is Nr1h3 down-regulation resulting in reduced Srebf1 activity. Furthermore, 463 Srebf1 is also linked to upstream regulation by Nr1h2 which interacts with Peroxisome Proliferator-464 Activated Receptor (Ppara) in both directions, and this cross-talk between Ppara and LXR 465 regulating Srebf1 expression has been explicitly studied in the context of fatty acid metabolism 466 regulation [60-62]. The 2 nd most frequently observed interaction is the down-regulation of 467 Transcription Factor 12 (Tcf12) inducing reduced activity of TEA Domain Transcription Factor 1 468 (Tead1). While Tead1 is indeed known to be involved in liver diseases and injury [63,64], the 469 interaction itself has not been reported before in the context of liver injury and the same applies also 470 for the other upstream Tead1 regulators identified. It should also be noted that for these interactions 471 first activation is only found at the same time but not in the time-concordant order providing weaker 472 evidence than, for example, the interaction between Nr1h3 and Srebf1. As additional larger TF 473 cluster, decreased activity of the Hepatocyte Nuclear Factor 1 (Hnf1a), Retinoic Acid Receptor 474 alpha (Rara) and Pancreatic And Duodenal Homeobox 1 (Pdx1) was found to lead to decreased 475 expression and activity of Hepatocyte Nuclear Factor 14 (Hnf4a) which is linked to reduced 476 expression and activity of CCAAT/enhancer-binding protein (Cebpa) through edges in both 477 directions. This cluster stands out due to the high confidence score of all interactions except the 478 edge between Pdx1 and Hnf4a indicating that there is strong support based on prior knowledge for 479 the involved interactions. Furthermore, it was previously found that artificially increased expression 480 of Hnf4a is able to reverse hepatic liver failure in rats, while also restoring expression of a highly 481 interconnected TF network including Hnf1a and Cebpa which supports the identified interactions 482 [65,66]. Two of the yet unknown TFs in DILI are Meis Homeobox 1 (Meis1) and Meis Homeobox 2 483 (Meis2) which are generally known in a developmental context [67,68]. However, their down-484 regulation in early pathogenesis is supported by enriched TF activity, differential expression before 485 adverse histopathology as well as upstream regulators which are also enriched before adverse 486 histopathology. 487 High confidence, C: Medium confidence). 495

496
While events do not have to be activated continuously to be causally involved in pathogenesis, 497 events with consistent or increasing activation over time are particularly interesting as biomarkers as 498 they can be experimentally measured without the chance of missing the timepoint of activation, and 499 can potentially reflect disease progression beyond early pathogenesis. We therefore studied which 500 TFs and pathways show time-dependent activation by testing for significant Spearman correlation 501 between activation logFC and time in adverse time-series, and whether this overlaps with the 502 previously derived time concordance (Fig 11). Overall, 118 pathways and 19 TFs were supported by 503 both, significant time concordance and dependence, which represents 86.1% or 70.4% of the time 504 concordant events, and 59.9% or 48.7% of the time-dependent events, respectively. 505 On the pathway level, multiple genesets pointed to a reduced level plasma lipoprotein particle 506 assembly and remodelling which indicates changes in lipid distribution. This aligns with the known 507 dyslipidaemia in chronic liver diseases, including decreasing serum values of LDL, HDL, total 508 cholesterol, and triglycerides with increasing severity of disease, based on which previous studies 509 suggested that routine monitoring of lipid profiles can improve the outcome for CLD patients [70] 510 Furthermore, a down-regulation of response to metal ions was found which could be related to 511 to metallothioneins which protect against oxidative stress and are able to chelate heavy metals [71]. 512 Both directions of dysregulation were previously observed in liver diseases: While a negative 513 correlation with disease progression was found in hepatocellular carcinoma [72], a positive 514 correlation was found in most other liver diseases including acetaminophen-induced liver injury [73]. 515 This indicates that opposite directionality is more plausible based on current literature knowledge, 516 but cannot be fully clarified. The most time-concordant and -dependent TF event was down-517 regulation of SRY-Box Transcription Factor 13 (Sox13) which is generally involved in cell fate [74] 518 and embryonal development [75]. As Sox13 does not yet have well understood functions on a more 519 detailed level, experimental validation of a potential role in DILI would be interesting. In contrast, the 520 next most significant time dependence is found for the hepatocyte nuclear factors Hnf1a and Hnf4a, 521 as well as Cebpa, which are known to negatively correlate with liver cirrhosis in rats [76,77]. Overall, 522 this shows that a mechanistic role for time-concordant and -dependent events is strongly supported 523 by the understanding of adverse liver phenotypes. While in general events with highly significant time dependence also showed highly significant time 533 concordance, some exceptions were found in which only one of both was highly significant. For 534 instance, the pathway with the 2 nd most significant time-dependence (p-value < 10 44 ) is signaling via 535 advanced glycosylation end product receptor (RAGE) which contributes to inflammation and 536 oxidative stress generation and did not pass the significance threshold for time concordance (p-537 value = 0.058). RAGE expression and activity, which are both induced by binding of RAGE ligands,538 are thereby known to be up-regulated in various hepatic disorders resulting in a positive feedback 539 loop explaining increasing or sustained RAGE activation [78]. This indicates that, while RAGE 540 signaling is correlated with progression, there is no clear evidence for a role in early pathogenesis 541 preceding adverse histopathological changes. In contrast, SUMOylation of TFs, is time-concordant 542 (p-value = 0.002) but not -dependent (p-value = 0.48) indicating a mechanistic role in early 543 pathogenesis which is not sustained over time. This aligns with the finely regulated and pleiotropic 544 roles of SUMOylation in post-transcriptional regulation which have also been found to be involved in 545 the context of liver diseases [79]. 546

547
In this study, we introduced a time-concordance based approach to derive mechanistic insight from 548 gene expression and histopathology data. We were able to recover known mechanisms in DILI as 549 well as able to propose novel and detailed mechanistic hypotheses. However, the present analysis 550 is based on a limited number time-series as well as only few timepoints within each time-series. This 551 does not only mean that rare events might be missed as they occur between measured timepoints 552 and that small effects might not be identified as significant, but also that there is potentially a bias 553 based on the tested compounds towards the represented modes of toxicity. 554 Furthermore, the analysis is limited by how confidently biological processes are inferred from the 555 data. This was for instance demonstrated by the differences between pathway and TF activation for 556 signalling and stress response pathways highlighting the discrepancy between protein activation 557 and gene expression. As only pathways induced through changes in gene expression or their 558 downstream expression footprints [69] can be confidently detected, this means that good estimates 559 of time concordance can predominantly be derived for intermediate or later key events while 560 preceding key events or molecular initiating events which are not mediated by transcriptional 561 regulation cannot be estimated based on the data. That being said, this is a limitation of gene 562 expression data in general and the time concordance approach would also be able to integrate 563 other data types describing events not covered yet. 564 Moreover, multiple choices were made to align our analysis to the AOP concept prioritizing 565 mechanisms supported by prior knowledge over purely data-driven hypothesis. First, detailed 566 insights might be lost by summarising results to the pathway level. While generally measurements 567 for individual genes can be noisy, this can be summarised in different ways e.g. based on similarity 568 in expression profiles [15]. In this study, however, we used curated gene sets due to their 569 interpretability and to derive modular events as defined in the AOP framework. Additionally, prior 570 knowledge was taken as ground truth, both in the gene set and interaction analysis, meaning that 571 only generally known pathways and interactions could be discovered. Like all methods based on 572 curated gene set and interactions, it was hence informed and biased by the current understanding 573 of biology. However, this prior biological knowledge contributes to the biological plausibility of the 574 derived events and relationships contributing to the weight of evidence of our findings in the context 575 of AOPs. 576 Lastly, it should be highlighted that time concordance is necessary for casual relations but not 577 sufficient to prove it. For instance, two events may be time-concordant because they are causally 578 linked to a shared preceding cause. To distinguish these effects, the additional Bradford-Hill 579 considerations can be helpful, but only prior knowledge has been considered in parts of this study. 580 In particular essentiality would provide strong evidence for causality, however, requires targeted 581 experiments and hence is unsuitable for hypothesis generation. In contrast, dose and incidence 582 concordance are generally feasible from a data-driven standpoint but were not pursued in this case 583 study due to the low number of doses and replicates. 584

585
In this study, we introduce "first activation" as concept to quantify the strength of temporal 586 concordance between events across time series with the assumption that each activated event may 587 have downstream effects irrespective of whether it is continuously or only transiently activated. With 588 this approach, we study gene expression-based TF and pathway-level events found before adverse 589 histopathology indicating liver injury in repeat-dose studies in rats from TG-GATEs as a case study. 590 We find some known processes in DILI to be highly confident, e.g. bile acid recycling, while others 591 are highly frequent but less specific including adaptive response pathways such as the eIF2α/ATF4 592 pathway [51]. 593 Beyond quantifying time concordance for known and potentially novel events in DILI, we additionally 594 show how time concordance can be combined with prior biological knowledge to generate 595 hypothesis on potentially causal gene-regulatory cascades in DILI. Amongst others, this identifies 596 LXRα down-regulation leading to decreased Srebf1 expression, an interaction known to regulate 597 fatty acid synthesis in the liver [41], but also characterizes yet unknown TFs based on their time 598 concordance, their mode of regulation (either transcriptional or post-transcriptional) and potential 599 upstream regulators and downstream effectors. Two of the identified induced TFs are Meis1 and 600 Meis 2 which is supported by significantly enriched decrease in expression and activity before 601 adverse histopathology, as well as upstream regulators which also show significant enrichment of 602 regulon activity and are found within the same time series. On top of time concordance, we also 603 derive each event's time dependence and show that events mechanistically involved in early 604 pathogenesis do not necessarily reflect disease progression and vice versa. However, for some 605 events, e.g. Sox13, both properties are found and these may be useful biomarkers which reflect 606 injury progression and already change preceding histopathological manifestation. 607 We believe that the described analysis can provide supporting evidence for mechanistic links 608 between events in line with the evolved Bradford-Hill considerations on time concordance and 609 biological plausibility and can hence e.g. support AOP development. Furthermore, the approach is 610 not limited to a particular adverse event and can instead quantify the interaction between any two 611 events represented in time series in a data-driven and automatable fashion. Consequentially, this 612 type of analysis could also be of interest to study the mechanism of action particular compound 613 classes or patterns of disease progression. We make the results of our analysis on the TG-GATEs 614 in vivo liver data publicly available in a Shiny app through which users can query the most time-615 concordant events for more specific types of histopathology and study in detail in which time series 616 time concordance was observed or not observed (https://anikaliu.shinyapps.io/dili_cascades). 617

618
Open TG-GATES data processing

633
To characterize the extent of histological findings, we used the toxscores by Sutherland et al. [15] in 634 order to consider both severity and frequency of events in a single numerical output measure. 635 These are based on the lesion severity per animal which was first converted to a numerical scale 636 (normal = 0, minimal = 1, slight = 2, moderate = 3, marked or severe = 4) and then averaged across 637 all biological replicates as an aggregate measure for lesion frequency and severity. One 638 characteristic of this measure is that the overall distributions varied between different findings, e.g. 639 inflammation was more frequently annotated with low than with high toxscores while a more 640 balanced distribution of scores was observed for Hepatocellular single cell necrosis (S2 Fig). 641 To study which histological findings were enriched in adverse conditions, we first defined binary 642 histopathology labels describing the presence of histological findings with different extents in each 643 time-series. Based on the toxscore ranges used by Sutherland et al., three toxscore cut-offs are 644 implemented to describe each histopathological finding "Null" (toxscore > 0), "low" (toxscore > 0.67) 645 and "high" (toxscore > 1.34). We then studied which labels were over-represented in adverse time-646 series. These were defined using the annotation of Sutherland et al., where pathologists classified 647 compound-dose combinations in the TG-GATEs database as adverse or non-adverse after 4 and 648 29 days of treatment. We used the 29 days classification to define 40 adverse time series and only 649 regarded time-series as non-adverse for compounds which were not classified as adverse at any 650 dose in the negative control, in order to account for the fact that some of the cellular changes of 651 interest might already take place at lower doses, although the resulting phenotype is not considered 652 adverse yet. 653 We defined findings as adverse histopathology if they are observed in at least 5 out of 40 adverse 654 time-series to remove rare histopathological findings, and additionally require that at least 50% of 655 findings are in adverse conditions to remove findings which are unspecific. All labels which were 656 identified with these criteria are significantly enriched among time-series labelled as adverse by 657 Sutherland et al. in Table). This overlap with compound-level DILI annotations by the 665 FDA shows that the compounds in this study partially represent known mechanisms of DILI in 666 humans, while also highlighting the fact that a clear classification is not possible. 667

668
The activity of pathways and TFs across all doses and timepoints of a treatment including vehicle 669 controls was derived based on the expression of its gene sets members using GSVA [26], which 670 computes a gene set enrichment by sample matrix from the gene expression by sample matrix. This 671 was performed using a Gaussian kernel requiring at least 5 genes per gene set, and overall 672 provides the basis for the subsequent pathway-and TF-centric steps. interactions and are assigned a confidence level based on the strength of evidence of these 677 interactions. Thereby, only the 207 TFs with a high to medium confidence level of A-C were 678 included and the few TF-gene interactions with a negative mode of regulation were removed to 679 better infer TF directionality. To evaluate which pathway or TF is dysregulated, we computed the 680 differential activity in comparison to the vehicle control group, which was treated for the same 681 amount of time and as part of the same experiment, using the moderated t-statistic in limma [90]. 682 We identify significantly dysregulated gene sets with a False Discovery Rate (FDR) < 0.05. 683 Temporal concordance of events 684 In this study, the order of events was derived based on each event's timepoint of first activation 685 within each time-series ( Fig 1A). For pathways and TFs, we defined first activation as the earliest 686 time of measurement at which significant differential regulation was observed (FDR < 0.05) in each 687 direction, while an additional logFC cut-off has been implemented for individual genes. As first 688 evidence of adverse morphological changes in the liver, we used the first timepoint any of the 689 adverse histopathology label derived before were found. 690 We were then generally interested in potential preceding events PE which are first activated before 691 or at the same time as a potential later event or outcome LE and used multiple metrics to quantify 692 the degree of time concordance which can be related to the original work by Bradford Hill (Table 1). 693 Thereby, the key later event in this study was adverse histopathology but we used a more general 694 notation LE, as some of the following criteria to quantify time concordance are also applied in the 695 TF analysis, where gene expression-derived events are used as later event. First, we used the true 696 positive rate (TPR) which describes how frequently PE is observed before LE among all time-series 697 with LE and hence its consistency across compounds. Secondly, we use the maximal effect size of 698 PE observed before LE, summarised across time-series by median, to characterise the strength of 699 association. To evaluate the significance of the findings, we additionally defined a set of background 700 time-series unrelated to LE (Fig 1B). For adverse histopathology, these unrelated background time-701 series were the 133 time-series without any observed histological changes. We then computed the 702 enrichment of PE before or at LE using the fisher.test function of the stats R package [83], first 703 estimating the odds ratio using the conditional maximum likelihood estimate and subsequently 704 testing the null hypothesis whether the odds ratio derived from a confusion matrix as described in 705 Across all metrics, we only consider time-series in the statistics for which any event of the same 709 type as PE, e.g. TF or pathway, was observed at the included timepoints, so before or at LE or at 710 any timepoint in the background time-series. We do this to account for the fact that in some cases 711 no changes are found which may be a consequence of the fact that there isn't a measured timepoint 712 before LE or that at the available timepoints expression changes cannot be detected. We argue that 713 in these cases this should not be treated as evidence of absence of the given event, but rather as 714 absence of evidence. 715 Combining time concordance on TF-TF interactions 719 We used 3 sources of causal prior knowledge to derive mechanistic hypotheses linking TFs: 720 Protein-protein interaction between TFs derived from Omnipath through OmnipathR [91,92], TF-721 target gene interactions from DoRothEA [88] and the link between gene expression and protein 722 levels following the central dogma of molecular biology. Using these interactions as backbone, we 723 then derived those additionally supported by time concordance. Thereby, the dysregulation of the 724 nodes was required to match the reported mode of regulation (edge sign) and the source node or 725 upstream event was required to be observed in at least 20% of cases before or at the same time as 726 the target node or downstream event. For induced TFs, significant enrichment of gene expression 727 (|logFC|>0.5) and TF activity before adverse histopathology was required, as well as evidence for 728 changes in expression preceding changes in the same direction in regulon activity within the same 729 time series. 730

731
In each adverse time-series, we tested for Spearman correlation between timepoint and event 732 activation logFC using the correlation R package [93], and include a logFC of 0 at timepoint 0 h 733 assuming that there are no differences in comparison to the control group before treatment. We 734 then identify pathways and TFs which only show significant Spearman correlation in one direction, 735 positive or negative. For those events, we apply the Fisher's combined probability test using the 736 metap R package [94] across all adverse time-series to evaluate whether overall significant 737 correlation between event activation and time is found. 738

Supporting information
739 S1 "before" or "before or at" downstream TF activity. Additionally, the source of the 758 interactions provided in Omnipath are shown for protein-protein interactions and the 759 DoRothEA confidence level for TF-target gene interactions. 760