ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis

Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons include dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of the community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/.


Introduction
Systems biology, transcription factor annotation, and expression data analysis are major applications of bioinformatics. Research in those fields has yielded specialized software and methods that promote research in life sciences and have led to many biological discoveries [1][2][3]. However, many bioinformatics tools require advanced technical knowledge. For instance, the installation of the software itself can be difficult because it depends on a specific operating system or on third-party libraries. Knowledge about internals of file formats may also be needed when converting files or in order to combine different tools. Many tools do not provide a graphical user interface but only a command-line interface, or lack sufficient documentation for simple usage. In recent years, the advent of large-scale data has introduced new opportunities, but also challenges, in many fields of biology. A number of computational frameworks have been proposed for the handling of large-scale data, one of the most prominent being the MapReduce framework, which has been successfully used in bioinformatics applications [4]. However, these frameworks require tailor-made software and infrastructure as well as advanced technical knowledge for installation and maintenance.
To facilitate access to tools for life science researchers, online platforms have been established that provide predefined interfaces through which tools can be remotely used. The advantage of these web-platforms is that they are installed remotely on a server. Thus, users do not need to install software on their local machine, but can access the tools through a convenient interface in the web browser. This allows the usage of tools that require lots of processing power or memory regardless of the user's hardware configuration, because the computation is performed on the server. For this reason, the user's device and operating system are almost irrelevant. Even mobile devices with very limited resources could be used to submit jobs through the online platform and view results in standard formats.
One example of these web-based services is Galaxy, an open, web-based platform for computational biology [5]. Galaxy was originally developed for sequence analysis, but has also been used in other fields, such as proteomics and systems biology [6,7]. Galaxy provides an established, user-friendly interface to command-line tools. It includes user management, stores results in accordance with scientific requirements, and allows the inclusion of custom command-line tools through XML files.
Here, we present the ZBIT Bioinformatics Toolbox, a customized Galaxy instance for systems biology, transcription factor annotation, and expression data analysis. Our online platform offers a user-friendly interface to a collection of command-line tools that have been developed at the chair of Cognitive Systems. These tools can be categorized into systems biology (BioPAX2SBML, SBMLsqueezer, SBML2LaTeX, ModelPolisher), transcription factor analysis (TFpredict, SABINE), and expression data analysis (RPPApipe, ToxDBScan).

Available tools
Systems biology. The Systems Biology Markup Language (SBML) [8] and the Biological Pathway Exchange (BioPAX) format [9] belong to the most widely used community standards in systems biology [10]. For a long time, both formats were incompatible: while SBML has mainly been designed for quantitative analysis, BioPAX is optimized for exchange of qualitative pathways between databases [11]. To facilitate the exchange of models between researchers and databases, many tools have been developed to convert model formats and to add information from external databases.
BioPAX2SBML converts BioPAX models into SBML. This program was the first converter that properly translated qualitative relations [11]. BioPAX2SBML was used in the Path2Models project to create mathematical models in the SBML format from biochemical pathway maps retrieved from multiple data sources [12].
SBMLsqueezer generates kinetic equations needed for dynamic simulation from the stoichiometry, the participating species, and regulatory relations stored in a SBML model [13]. The program is also capable of retrieving experimentally determined rate laws from the SABIO-RK database [14]. SBMLsqueezer has been used in the Path2Models project to add kinetic equations to the translated SBML models [12]. Other uses of SBMLsqueezer include modeling of the MAPK machinery activation in plants [15] or simulation of drug effects using systems biology approaches [16].
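To make the idea of deriving a rate equation from stoichiometry concrete, the following minimal sketch builds an irreversible mass-action rate law as a formula string. It is an illustration only and not SBMLsqueezer's implementation: the function name, the parameter naming scheme `k_<reaction id>`, and the example reaction are assumptions made for this sketch, and SBMLsqueezer itself chooses among many rate law types and takes regulatory relations into account.

```python
def mass_action_rate_law(reaction_id, reactants):
    """Return an irreversible mass-action rate law as a formula string.

    `reactants` maps species identifiers to stoichiometric coefficients.
    The rate constant name `k_<reaction_id>` is a hypothetical naming scheme.
    """
    factors = [f"k_{reaction_id}"]
    # Sort species so that the generated formula is deterministic.
    for species, stoichiometry in sorted(reactants.items()):
        if stoichiometry == 1:
            factors.append(species)
        else:
            factors.append(f"{species}^{stoichiometry}")
    return " * ".join(factors)


# Hypothetical reaction R1: A + 2 B -> C
print(mass_action_rate_law("R1", {"A": 1, "B": 2}))
```

A reaction without reactants yields only the rate constant, which corresponds to a zero-order rate law.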
SBML2LaTeX generates human-readable model reports from SBML files [17]. For example, SBML2LaTeX has been used in the BioModels Database to generate human-readable PDF reports for each model within this database [18,19]. The three tools can be combined into pipelines that create SBML models and reports (see Fig 1a).
ModelPolisher takes as input SBML models that make use of conventions from the constraint-based modeling community and complements all of its components with annotations from the BiGG Models knowledgebase [20]. The application matches the identifiers of all model components against the specification of BiGG IDs (see [21]). Whenever a component has a corresponding entry in the BiGG database, ModelPolisher pulls all available metadata about that component. ModelPolisher uses BiGG IDs to recognize specific reaction and metabolite types and uses corresponding terms from the Systems Biology Ontology [22] to clearly annotate those components. It also performs basic checks in order to ensure the structural correctness of the model and displays warnings in cases such as mass balance deficiencies. The output of ModelPolisher is an updated SBML file that can be used as input for subsequent tools within the toolbox or external tools that support the SBML format.
Transcription factor annotation. The binding of transcription factors (TFs) at defined DNA domains is essential for the regulation of genes. TFpredict identifies TFs, predicts their structural superclass given a protein sequence, and uses InterProScan to detect their DNA-binding domains (DBDs) [23,24]. TFpredict applies a sequence-based machine learning approach to predict functional characteristics; it was trained on data from the TRANSFAC and MatBase databases [25,26]. Eichner et al. showed that TFpredict performs better than previously published methods [23].
Fig 1. (a) SBML model processing with BioPAX2SBML, SBMLsqueezer, and SBML2LaTeX. BioPAX2SBML converts models from the BioPAX format to SBML and conserves qualitative models. SBMLsqueezer generates kinetic rate laws for each reaction contained in an SBML file. SBML2LaTeX creates human-readable reports from SBML files. (b) Transcription factor analysis using TFpredict and SABINE. TFpredict is used to identify transcription factors and predict their superclass and DNA-binding domains. SABINE uses this information to infer the position frequency matrix, which represents their DNA binding profile. (c) Reverse phase protein array analysis with RPPApipe. RPPApipe implements a customizable pipeline for RPPA data analysis. This includes normalization and annotation of raw data, statistical methods for the detection of deregulated and differentially modified proteins and their association with alterations on the pathway level, and visualization of the results.
The Stand-alone binding specificity estimator (SABINE) infers the DNA motif of a TF as a position frequency matrix (PFM), based on the amino acid sequence, detected DBDs, superclass, and species [23]. SABINE uses Support Vector Regression to predict the PFM based on the similarity to other TFs with well-defined PFMs. The similarity to other TFs is established based on evolutionary, structural, and chemical similarities.
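For readers unfamiliar with the data structure, the following sketch shows what a position frequency matrix is by counting nucleotides per position in a set of aligned binding sites. It does not reproduce SABINE, which predicts PFMs by Support Vector Regression; the function name and the example sites are hypothetical.

```python
NUCLEOTIDES = "ACGT"


def position_frequency_matrix(sites):
    """Count how often each nucleotide occurs at each position.

    `sites` is a list of aligned DNA binding sites of equal length.
    Returns a dict that maps each nucleotide to a list of counts.
    """
    if not sites:
        raise ValueError("at least one binding site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")
    pfm = {nucleotide: [0] * length for nucleotide in NUCLEOTIDES}
    for site in sites:
        for position, nucleotide in enumerate(site.upper()):
            if nucleotide not in pfm:
                raise ValueError(f"unexpected character: {nucleotide}")
            pfm[nucleotide][position] += 1
    return pfm


# Hypothetical aligned binding sites, for illustration only.
example_pfm = position_frequency_matrix(["GGAA", "GGAC", "GGTA"])
for nucleotide in NUCLEOTIDES:
    print(nucleotide, example_pfm[nucleotide])
```

Each column of the matrix sums to the number of sites, so dividing by that number would give per-position nucleotide frequencies.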
In combination, TFpredict and SABINE may be used for structural and functional annotation of TFs (see Fig 1b). For example, NR2C2 and PPARA were predicted as TFs for CYP3A4 in human hepatocytes and have been confirmed in wet-lab experiments [27].
Expression data analysis. High-throughput gene expression analysis has become an important part of biological research [2]. Reverse phase protein arrays (RPPAs) have been used, e.g., in individualized medicine and cancer biology [28,29]. RPPApipe offers customizable workflows for RPPA experiments, including preprocessing, annotation, statistical analysis, clustering, pathway analysis, and visualization of results (see Fig 1c, [30]). RPPApipe supports several experimental designs: standard paired condition and control designs as well as more specialized designs with multiple conditions or replicated time-series. In particular, RPPApipe supports a number of RPPA-specific analyses, such as evaluation of differential modification or pathway profiles that account for the lower number of analytes compared to transcriptomics studies. RPPApipe is fully compatible with InCroMAP, which allows users to integrate RPPA data with other omics layers, e.g., mRNA and microRNA expression or epigenetic modifications [31].
Transcriptomics studies have recently been investigated for integration into the preclinical drug development process [32]. Two major databases for gene expression changes after short-term exposure of rodents to carcinogenic chemicals have been released for public access: Open TG-GATEs [33] and DrugMatrix [34]. ToxDBScan performs large-scale similarity screening of these two databases to elucidate the carcinogenic potential and mode of action of new chemicals based on well-characterized chemicals that induce similar gene expression patterns [35]. ToxDBScan provides a similarity score to assess the relevance of carcinogenicity results for related chemicals. In addition, pathway enrichment analysis is performed to inform on possible modes of action. The similarity scoring approach has been successfully validated with external data not included in DrugMatrix and TG-GATEs [35].

Workflow creation
The Galaxy framework allows users to combine multiple tools into complex workflows. These workflows enable users to build sophisticated pipelines by connecting analysis tools through their inputs and outputs. This is in accordance with the UNIX philosophy of writing programs that do one task and do it well to ensure modularity and reusability of tools and code. As a consequence, complex tasks can be solved by the combination of simple tools.
We created workflows for common use cases that we anticipate will be useful to users of our web platform. All workflows can be saved and shared with other users. An overview of the example workflows along with descriptions is given in Table 1.

Methods
The ZBIT Bioinformatics Toolbox has been implemented as a web platform that is hosted on a GNU/Linux operating system. We use the open source platform Galaxy [5] to provide a common front end to the individual tools. All currently included tools are implemented in either Java™ or the R language for statistical computing. A schematic overview of the system is shown in Fig 2.
Web server setup. All requirements for the individual tools and the Galaxy platform have been installed on a machine running Ubuntu 14.04. Apache2 and Python 2.7.3 were installed from the main Ubuntu repository. Galaxy was downloaded and installed without root privileges to secure the system. We use the Oracle Grid Engine to distribute and manage the analysis on a computing cluster dedicated to the ZBIT Bioinformatics Toolbox. The computing cluster consists of three nodes running Ubuntu 14.04. On all nodes, the Java™ Runtime Environment (JRE, version 1.7.0) was installed from the main Ubuntu repository. R (version 3.2.2) was installed by adding the appropriate repository provided by the R developers [36]. The command-line tools have been integrated with Galaxy through XML files and shell scripts.
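As an illustration of how a command-line tool can be described to Galaxy through an XML file, the following sketch generates a minimal tool wrapper with Python's standard library. The element names (`tool`, `command`, `inputs`, `param`, `outputs`, `data`, `help`) follow the general Galaxy tool XML layout, but the tool identifier, the JAR file name, and its command-line options are hypothetical and do not correspond to the wrappers actually used in the toolbox.

```python
import xml.etree.ElementTree as ET


def build_tool_wrapper():
    """Build a minimal Galaxy-style tool wrapper as an XML element tree."""
    tool = ET.Element("tool", id="example_converter",
                      name="Example converter", version="0.1.0")
    ET.SubElement(tool, "description").text = "converts a model file (illustrative)"

    # Galaxy substitutes $input_file and $output_file with dataset paths.
    # The JAR name and its options are assumptions made for this sketch.
    command = ET.SubElement(tool, "command")
    command.text = ("java -jar example-converter.jar "
                    "--input '$input_file' --output '$output_file'")

    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(inputs, "param", name="input_file", type="data",
                  format="xml", label="Input model")

    outputs = ET.SubElement(tool, "outputs")
    ET.SubElement(outputs, "data", name="output_file", format="xml")

    ET.SubElement(tool, "help").text = "Illustrative wrapper; not a real tool."
    return tool


print(ET.tostring(build_tool_wrapper(), encoding="unicode"))
```

In such a wrapper, the `inputs` section determines the form that Galaxy renders in the browser, and the `command` template is what is eventually executed on a cluster node.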

Results and Discussion
The ZBIT Bioinformatics Toolbox contains eight tools from the areas of systems biology, transcription factor annotation, and expression data analysis. Most of the tools have been developed as stand-alone, command-line tools for specific problems in the respective area. However, tools can be combined to create complex analyses, which previously required manual execution of each tool in succession. The Galaxy front end provides a user-friendly interface to these command-line tools, without requiring the installation of the software or its dependencies. In addition, predefined and user-created workflows can be used to automate complex analyses that would otherwise have required multiple command-line tools. Where possible, the tool output is generated in standardized and established formats (e.g., SBML, PDF, CSV). For each tool, we provide extensive documentation, tutorials, example data, and predefined workflows to maximize usability for life science researchers. Furthermore, most tools are also available as stand-alone programs for download and offline use.
Fig 2. General architecture of the ZBIT Bioinformatics Toolbox. This schematic represents the ZBIT Bioinformatics Toolbox and its subsystems. First, researchers (also called clients) access the site through the internet. All incoming and outgoing traffic is handled by Apache. On the server host, the Galaxy framework is used to provide the front end, i.e., the interface with which users interact to select tools, upload data, and set parameters for analysis. Galaxy also handles user management, workflows, and persistent data storage. Requested analyses are submitted to a computing cluster through the Oracle Grid Engine (OGE). OGE manages the distribution of jobs, i.e., individual analyses with a specific tool, to available nodes of the cluster and the queue of running, waiting, and finished jobs. On each cluster node, the Java™ Runtime Environment or R is used to execute the actual analysis with the selected tool. After the execution finishes, results are passed back along this command chain to Galaxy. Galaxy then stores the result and displays it to the user in an appropriate format. doi:10.1371/journal.pone.0149263.g002

Use cases
To demonstrate the usage of the ZBIT Bioinformatics Toolbox, we briefly describe a use case for each of the three categories. We used the predefined workflows to analyze real data obtained from public repositories. All data files have also been deposited in the ZBIT Bioinformatics Toolbox for reproduction of the described use cases.
Creation of full kinetic models from pathway maps. Ceramides are sphingolipids that are found in the cell membrane. Ceramide signaling has been linked to apoptosis and programmed cell death [37,38]. We downloaded the ceramide signaling pathway from the Pathway Interaction Database (PID) in BioPAX format (see S1 Text, [39]). A graphical representation of the pathway is shown in S1 Fig. We used the BioPAX2SBMLandSqueeze2LaTeX workflow to generate a full kinetic model stored in the community standard SBML format and a human-readable PDF report (see also Table 1 and Fig 3(a)). The BioPAX2SBMLandSqueeze2LaTeX workflow consists of three steps. First, BioPAX2SBML was used to convert the BioPAX file to SBML without loss of information (see Fig 3(b)). This is achieved by using the SBML extension package for qualitative models [40]. Second, SBMLsqueezer was used to generate and add kinetic equations for all reactions in the model (see S2 Text and Fig 3(c)). Third, SBML2LaTeX was used to generate the human-readable report as PDF (see S3 Text). We used the workflow with the preset default options. In total, the created SBML model contains 50 reactions with 93 involved molecules and 263 kinetic parameters. To further customize the model or the report, SBMLsqueezer and SBML2LaTeX offer a number of user settings that influence the program's behavior and choices. The SBML model can be used with any modeling software that supports the SBML standard and the SBML Level 3 qual package.
Identification of a transcription factor and its DNA binding domain. TFs are DNA-binding proteins that are involved in many biological processes in the cell nucleus, e.g., initiation of transcription at the promoter site of genes or regulation of nucleases and helicases [42]. NF-κB is a human TF that is present in most cell types and is involved in many signaling events in the cell nucleus [43]. We downloaded the sequence of NF-κB from UniProt as a FASTA file (UniProt ID P19838, see S4 Text) and used the TFpredict & SABINE workflow to assess whether NF-κB is correctly predicted as a TF (see Fig 4a). The TFpredict & SABINE workflow consists of two steps. First, TFpredict is used to predict whether the input protein sequence is a TF, assign it to a superclass, and detect possible DBDs through InterProScan [24]. For NF-κB, TFpredict correctly predicted that it is a TF of the beta scaffold class and identified four potential DNA binding domains (see Fig 4b). Second, SABINE is used to predict the PFM of the DNA sequence that is recognized by the TF based on the identified superclass and DBDs. SABINE was able to identify the PFM with medium confidence (see Fig 4c). The predicted PFM 5'-GGRAANYCCC-3' is in good concordance with the DNA sequence recognized by NF-κB: 5'-GGGRNYYYCC-3', where R represents a purine, Y a pyrimidine, and N any nucleotide [44].
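One simple way to examine the concordance of the two consensus sequences is to check, position by position, whether their IUPAC codes share at least one nucleotide. The sketch below does this for the two sequences quoted above. This compatibility criterion is an assumption chosen for illustration and is deliberately loose; it is not the comparison method used in the study, and the function name is hypothetical.

```python
# Nucleotide sets for the IUPAC codes that occur in the two sequences.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"},            # purine
    "Y": {"C", "T"},            # pyrimidine
    "N": {"A", "C", "G", "T"},  # any nucleotide
}


def compatible_positions(first, second):
    """Return one boolean per position: True if the codes share a nucleotide."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    return [bool(IUPAC[a] & IUPAC[b]) for a, b in zip(first, second)]


predicted = "GGRAANYCCC"  # consensus of the PFM predicted by SABINE
reference = "GGGRNYYYCC"  # consensus sequence recognized by NF-kB [44]

matches = compatible_positions(predicted, reference)
print(sum(matches), "of", len(matches), "positions are compatible")
```

Under this criterion all ten positions are compatible, which agrees with the qualitative statement of good concordance; a stricter comparison would work on the matrix entries rather than on consensus codes.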
Effects of drugs on protein expression. The administration of drugs may have undesired side effects. For this reason, the drug development process includes several phases for testing the side effects of candidate drugs. During the preclinical phase, in vitro and in vivo experiments are performed with animals before testing proceeds to human patients. This preclinical phase includes carcinogenicity tests, e.g., the Ames test to assess DNA damaging effects [45]. However, some chemicals are known to cause cancer through mechanisms not related to DNA damage, so-called non-genotoxic carcinogens (NGCs) [46]. Currently, methods are being investigated to reliably detect NGCs early in the preclinical phase [32]. To assess the effects of NGCs, we analyzed a data set that measured the protein expression in the liver of Wistar rats in vivo after 14 days of chronic treatment with 11 NGCs, 2 genotoxic carcinogens (GCs), and 2 non-hepatocarcinogens (NCs) (available from GEO under GSE53084 [47,48]). We used the RPPApipe two-class workflow (see Fig 5a and Table 1). The RPPApipe two-class workflow consists of 12 steps and requires two input files: the measured protein expression and a class definition file (see S5 Text and S6 Text, respectively). The workflow is divided into three major phases. First, the samples were assigned to treatment groups defined in the class file and preprocessed. In this case, preprocessing was done without scaling and log-transformation, as the data was already normalized. During preprocessing, additional information, e.g., gene descriptions or alternative identifiers, was fetched from public resources. Second, fold changes and p-values for differential expression were computed using default settings, i.e., limma [49] was used to identify differentially regulated proteins and p-values were corrected for multiple testing using the Benjamini-Hochberg approach [50]. Third, various plots were generated, e.g., volcano plots for differential regulation and modification, and KEGG [51] pathway profiling was performed (see Fig 5b). The clustering shows a clear distinction between the two NCs, Nif and CFX, and the carcinogens, and to a lesser degree between the NGCs and the two GCs, CIDB and DMN, as can be seen in Fig 5c. This supports recent findings that suggest that integration of multiple omics levels can improve early assessment of carcinogenic effects [48].
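The multiple-testing correction mentioned in the second phase can be sketched in a few lines. The following function computes Benjamini-Hochberg adjusted p-values in plain Python; it is a stand-alone illustration and not the code executed by RPPApipe, which is implemented in R. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down to enforce monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four proteins.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.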

Related web platforms
In recent years, bioinformatics tools have gained ever more importance for biological and biomedical research. The main reason is the development of new technologies, such as next-generation sequencing, in silico modeling of cellular processes, sequence-based protein characterization, and many more. As stated before, bioinformatics software often requires advanced technical knowledge. For this reason, web platforms have been established as an easy, user-friendly interface to many bioinformatics tools. In the following, we describe and compare some web platforms that are related to our ZBIT Bioinformatics Toolbox.
Systems biology. For systems biology researchers, we provide the tools BioPAX2SBML, SBMLsqueezer, and SBML2LaTeX. To find related web platforms we used the SBML Software Guide, which is actively maintained and available from the official SBML website (sbml.org). At the time of writing, the Software Guide includes a number of web platforms for SBML editing (semanticSBML, [52]), visualization (PATIKAweb, [53]), and annotation (MetaNetX, [54]). However, none of the listed web platforms provides the functionality offered by the tools hosted on our web platform. For the conversion of BioPAX files to SBML format, the SBML Software Guide lists only BioPAX2SBML and SyBiL. In contrast to BioPAX2SBML, SyBiL does not provide a web interface and requires the installation of additional dependencies. The Software Guide lists no alternatives for the conversion to human-readable PDF reports that is offered by SBML2LaTeX, the inference of kinetic equations with SBMLsqueezer, or the model enrichment with information from the BiGG database with ModelPolisher.
Fig 4. (a) The input FASTA sequence file contains the protein sequence. TFpredict uses the sequence to predict if the protein is a transcription factor, infer its superclass, and detect DNA binding domains. SABINE uses the output of TFpredict to identify the DNA sequence that is bound by the transcription factor. (b) TFpredict output for NF-κB protein sequence. NF-κB was correctly identified as a transcription factor. The superclass was predicted to be beta scaffold. TFpredict detected four DNA binding domains. (c) DNA binding profile predicted by SABINE. SABINE predicted a position frequency matrix with medium confidence. The predicted DNA binding profile shows good concordance with the consensus DNA binding profile established by Wan and Lenardo [44]. In the consensus sequence, R represents a purine, Y a pyrimidine, and N any nucleotide.
Fig 5. (a) Predefined Galaxy workflow for RPPA data analysis. Two input files are required: a CSV file containing the RPPA expression values and a class file, which defines the relations between samples. First, the data is normalized and annotated. Second, differential expression of proteins is determined. Third, various plots are generated and pathway enrichment and clustering are performed. (b) Volcano plot and pathway profile for RPPA data. These example plots were generated using a data set for effects of drug exposure on the protein expression in rat liver. Several proteins are differentially modified after treatment with Wy-14643, a non-genotoxic carcinogen (left). Differentially regulated proteins have been mapped to KEGG pathways to identify potential deregulation on pathway level (right). (c) Clustering of drugs by effects in rat liver. All 15 drugs in the data set were clustered by their protein expression profiles. The two non-hepatocarcinogens (Nif, CFX) formed a separate cluster from the carcinogens. The two genotoxic carcinogens (CIDB, DMN) formed a cluster within the carcinogens. The non-genotoxic carcinogens formed several clusters. doi:10.1371/journal.pone.0149263.g005
Transcription factor annotation. Traditional experimental methods for identifying and characterizing TFs are time-consuming and expensive. For this reason, a number of computational in silico methods have been proposed in recent years, among them TFpredict and SABINE, which are available from our web platform. A number of other web platforms for predicting DNA-binding proteins have been published, most notably iDNA-Prot|dis by Liu et al. [55] and nDNA-Prot by Song et al. [42]. Both iDNA-Prot|dis and nDNA-Prot provide a simple web interface for pasting the protein sequence and predicting whether the protein is DNA binding. Unlike TFpredict, neither attempts to identify the superclass if a protein is found to be a transcription factor. They also do not provide the additional information that TFpredict reports, such as DNA binding domains, but only report the prediction result. In a comparison with similar tools, iDNA-Prot|dis and nDNA-Prot have been shown to outperform their competitors. However, the tools have been compared neither to each other nor to TFpredict. To our knowledge, there are no web platforms that offer the prediction of PFMs performed by SABINE.
Expression data analysis. For expression data analysis, we provide RPPApipe for analyzing protein data and ToxDBScan for gene expression data. In a recent review, Wachter et al. [56] compiled an overview of tools for RPPA data analysis. Among the referenced tools only two web platforms are present: RPPApipe and Miracle [57]. The other tools that are discussed by Wachter et al. are available as R packages or Excel macros. In addition, some of these are missing documentation or require registration prior to use. While Miracle does provide a web platform, no officially hosted web server running Miracle is available. Rather, users are expected to set up a local instance of Miracle for their analyses, which requires the necessary infrastructure and advanced informatics skills. To our knowledge, RPPApipe is currently the only RPPA analysis pipeline that is available as a web platform that requires no setup or registration. If the user is only interested in finding patterns in the RPPA expression data, without RPPA-specific analyses, there are alternative web platforms that address specific problems. For example, PaGeFinder [58] provides pattern analysis for user-submitted expression data and PaGenBase provides a database of pattern genes in a number of model organisms [59]. These web platforms can be used to identify genes that are specifically expressed under certain conditions, e.g., to identify spatiotemporal patterns in sequential gene expression experiments. While RPPApipe does not offer specific tools for detecting spatiotemporal patterns, it provides visualizations and RPPA-specific analyses that allow the identification of genes that respond to specific conditions or in specific tissues.
ToxDBScan performs a similarity search for gene expression patterns in TG-GATEs and DrugMatrix. These two databases are the largest resources on the effects of non-genotoxic, carcinogenic substances on gene expression in rats. Currently, there are two other web portals that provide similar functionality: Toxygates [60] and LTMap [61]. Toxygates is a data portal that provides exploration tools for the TG-GATEs data, allows compound ranking by gene expression, and links expression data with pathology reports [60]. However, Toxygates does not provide similarity search based on differentially regulated genes provided by the user. LTMap performs similarity ranking based on user-submitted probe lists in TG-GATEs data, but does not offer any additional analyses [61]. In contrast, ToxDBScan not only offers the similarity scoring of user-submitted gene expression profiles, but also performs pathway enrichment analysis and creates visualizations that aid the interpretation of the similarity profiles. Another advantage of ToxDBScan over the other two tools is the integration of DrugMatrix data, which almost doubles the number of compounds available for similarity search.

Conclusion
The ZBIT Bioinformatics Toolbox is an easily usable collection of online tools for systems biology, transcription factor annotation, and expression data analysis that requires neither installation of software nor advanced technical knowledge and allows the combination of tools to build custom analysis workflows. All tools and workflows are accessible from any device with a modern web browser and internet access. The tools have been applied by researchers to gain new knowledge in systems biology [15,16,27] and have been adopted in established databases [18]. We used the Galaxy framework, which was designed with particular consideration of the requirements for scientific software, such as storage of analysis results, scalability, and reproducibility. Tutorials and example data are available for all tools. We have created predefined workflows that demonstrate the capabilities and use cases of each tool and are available through the toolbox. New tools can easily be integrated in the system with the flexible Galaxy framework, and we are looking into extending the functionality by incorporating external tools that might benefit the users of our platform, e.g., new tools for feature extraction for protein characterization, such as Pse-in-One [62]. We will continue to maintain and extend the ZBIT Bioinformatics Toolbox as new tools are developed.

Supporting Information
S1 Fig. This network represents the full SBML model of the ceramide signaling pathway generated by BioPAX2SBML and SBMLsqueezer. Gray squares represent reactions, light gray circles reactants, black arrows participation in a reaction, blue lines indicate enzymatic behavior. The network was created with CySBML. (PDF)
S1 Text. NCI curated ceramide signaling pathway. This is the NCI curated ceramide signaling pathway from the Pathway Interaction Database in BioPAX format. The pathway is available for download in the custom PID XML format and the BioPAX community standard. (OWL)
S2 Text. Full SBML model of the ceramide signaling pathway. This full SBML model of the ceramide signaling pathway was generated by BioPAX2SBML and SBMLsqueezer. This represents a draft model, which should be checked and possibly curated before using it for simulation. (XML)
S3 Text. Human-readable report for full SBML model of the ceramide signaling pathway. This is a human-readable report that was generated with SBML2LaTeX from the full SBML model of the ceramide signaling pathway generated by BioPAX2SBML and SBMLsqueezer. (PDF)
S4 Text. FASTA sequence of human NF-κB. This FASTA file contains the protein sequence of the human transcription factor NF-κB obtained from UniProt (ID P19838). (FASTA)
S5 Text. Protein expression in Wistar rats after treatment with several chemicals. This CSV file contains RPPA expression data collected from the liver of Wistar rats that have been treated with one of 11 non-genotoxic carcinogens, 2 genotoxic carcinogens, and 2 non-hepatocarcinogens. Samples were measured in triplicate, with matched controls. (CSV)
S6 Text. Class definition file for RPPApipe. This is a plain text file that describes the relation between treated and control samples in the RPPA expression data (S5 Text). (TXT)