ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis

doi:10.1371/journal.pone.0149263

Fig 1.

Analysis workflows available in the ZBIT Bioinformatics Toolbox.

The workflows presented in this figure represent fundamental use-case scenarios that combine tools from all three basic tool classes of the ZBIT Toolbox. (a) SBML model processing with BioPAX2SBML, SBMLsqueezer, and SBML2LaTeX. BioPAX2SBML converts models from the BioPAX format to SBML and conserves qualitative models. SBMLsqueezer generates kinetic rate laws for each reaction contained in an SBML file. SBML2LaTeX creates human-readable reports from SBML files. (b) Transcription factor analysis using TFpredict and SABINE. TFpredict is used to identify transcription factors and predict their superclass and DNA binding domains. SABINE uses this information to infer the position frequency matrix, which represents their DNA binding profile. (c) Reverse phase protein array analysis with RPPApipe. RPPApipe implements a customizable pipeline for RPPA data analysis. This includes normalization and annotation of raw data, statistical methods for the detection of deregulated and differentially modified proteins, association of these proteins with alterations on the pathway level, and visualization of the results.


Table 1.

Predefined workflows in the ZBIT Bioinformatics Toolbox.


Fig 2.

General architecture of the ZBIT Bioinformatics Toolbox.

This schematic represents the ZBIT Bioinformatics Toolbox and its subsystems. First, researchers (also called clients) access the site through the internet. All in- and outgoing traffic is handled by Apache. On the server host, the Galaxy framework is used to provide the front end, i.e., the interface with which users interact to select tools, upload data, and set parameters for analysis. Galaxy also handles user management, workflows, and persistent data storage. Requested analyses are submitted to a computing cluster through the Oracle Grid Engine (OGE). OGE manages the distribution of jobs, i.e., individual analyses with a specific tool, to available nodes of the cluster and the queue of running, waiting, and finished jobs. On each cluster node, the Java™ Runtime Environment or R is used to execute the actual analysis with the selected tool. After the execution finishes, results are passed back along this command chain to Galaxy. Galaxy then stores the result and displays it to the user in an appropriate format.
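
The job handling described in this caption (waiting, running, and finished jobs distributed over cluster nodes) can be illustrated with a toy first-in, first-out scheduler. This is only a minimal sketch: it does not use or imitate the Oracle Grid Engine or Galaxy interfaces, and the class name `ToyScheduler`, its methods, and the node and job names are hypothetical.

```python
from collections import deque


class ToyScheduler:
    """Toy FIFO scheduler; a hypothetical illustration, not the OGE API."""

    def __init__(self, nodes):
        self.free_nodes = deque(nodes)  # nodes currently without a job
        self.waiting = deque()          # submitted jobs not yet started
        self.running = {}               # job id -> node executing it
        self.finished = []              # completed job ids, in completion order

    def submit(self, job_id):
        """Queue a job and start it immediately if a node is free."""
        self.waiting.append(job_id)
        self._dispatch()

    def finish(self, job_id):
        """Mark a running job as finished and release its node."""
        node = self.running.pop(job_id)
        self.finished.append(job_id)
        self.free_nodes.append(node)
        self._dispatch()

    def _dispatch(self):
        """Assign waiting jobs to free nodes in submission order."""
        while self.waiting and self.free_nodes:
            job_id = self.waiting.popleft()
            self.running[job_id] = self.free_nodes.popleft()
```

With a single node, a second submitted job stays in `waiting` until `finish` is called for the first one, which mirrors the queueing behavior the caption attributes to the cluster back end.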


Fig 3.

Creation of a full kinetic model for the ceramide signaling pathway.

(a) Predefined Galaxy workflow for creation of kinetic models from BioPAX files. BioPAX files are used by many pathway databases to describe pathways and qualitative relations of molecules. BioPAX2SBML is used to convert the BioPAX encoded pathway to a draft SBML model. SBMLsqueezer infers reaction equations and kinetic rate laws for the relations defined in the resulting SBML model. SBML2LaTeX creates a human-readable report for model inspection to facilitate interpretation and curation. (b) Subnetwork of the SBML model of the ceramide signaling pathway. This network represents a small part of the full ceramide signaling pathway that is involved in creation and degradation of ceramide. This network contains four reversible reactions (dark gray squares) and 13 reactants. The black arrows indicate the participation of reactants in reactions. Blue lines indicate enzymatic behavior of reactants. The reaction highlighted in red degrades ceramide to sphingosine and fatty acid and is catalyzed by Platelet-derived growth factor subunit A (PDGFA). This network was created with CySBML [41] from the draft SBML model generated by SBMLsqueezer. For the full model see S2 Fig. (c) Reaction equation for ceramide degradation. SBML2LaTeX creates reaction equations for all reactions in the PDF report. This reaction degrades ceramide to sphingosine and fatty acid and is catalyzed by PDGFA. The reaction is also part of the subnetwork shown in (b) and is highlighted in red.
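
To make the notion of a kinetic rate law for a reversible reaction concrete, the following sketch evaluates a plain reversible mass-action law, v = k_f · Π[reactants] − k_r · Π[products]. Mass action is chosen here purely as an illustrative assumption; the caption does not state which rate law SBMLsqueezer assigns to the ceramide degradation reaction, and the function name, rate constants, and concentrations below are hypothetical.

```python
def reversible_mass_action(k_forward, k_reverse, reactants, products):
    """Return the net rate of a reversible mass-action reaction.

    reactants and products are lists of concentrations; all values
    used with this function here are hypothetical illustrations.
    """
    forward = k_forward
    for concentration in reactants:
        forward *= concentration
    reverse = k_reverse
    for concentration in products:
        reverse *= concentration
    return forward - reverse


# Hypothetical example: ceramide <-> sphingosine + fatty acid
rate = reversible_mass_action(
    k_forward=2.0,
    k_reverse=0.5,
    reactants=[3.0],        # [ceramide]
    products=[1.0, 2.0],    # [sphingosine], [fatty acid]
)
```

A positive value indicates net flux in the degradation direction under these assumed parameters. An enzyme-catalyzed reaction such as the one shown in (c) would often be modeled with a saturating rate law instead.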


Fig 4.

Transcription factor prediction for human NF-κB with TFpredict and SABINE workflow.

(a) Predefined Galaxy workflow for transcription factor annotation. The input FASTA sequence file contains the protein sequence. TFpredict uses the sequence to predict if the protein is a transcription factor, infer its superclass, and detect DNA binding domains. SABINE uses the output of TFpredict to identify the DNA sequence that is bound by the transcription factor. (b) TFpredict output for NF-κB protein sequence. NF-κB was correctly identified as a transcription factor. The superclass was predicted to be beta scaffold. TFpredict detected four DNA binding domains. (c) DNA binding profile predicted by SABINE. SABINE predicted a position frequency matrix with medium confidence. The predicted DNA binding profile shows good concordance with the consensus DNA binding profile established by Wan and Lenardo [44]. In the consensus sequence, R represents a purine, Y a pyrimidine, and N any nucleotide.
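
The degenerate symbols mentioned for the consensus sequence (R, Y, and N) can be expanded into a regular expression to scan a DNA sequence for candidate sites. The sketch below is an illustration only and is unrelated to how SABINE works internally. The consensus string `GGGRNYYYCC` is an assumption based on the commonly cited κB motif, since the caption does not spell out the consensus; the function names and the test sequence are hypothetical.

```python
import re

# IUPAC symbols named in the caption: R = purine, Y = pyrimidine, N = any nucleotide.
IUPAC_TO_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def consensus_to_pattern(consensus):
    """Translate a consensus string with IUPAC symbols into a regex string."""
    return "".join(IUPAC_TO_REGEX[symbol] for symbol in consensus.upper())


def find_sites(consensus, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead is used so that overlapping sites are reported as well.
    lookahead = re.compile("(?=" + consensus_to_pattern(consensus) + ")")
    return [match.start() for match in lookahead.finditer(sequence.upper())]


# Assumed consensus and a hypothetical sequence containing one matching site.
sites = find_sites("GGGRNYYYCC", "TTGGGACTTTCCAA")
```

Exact consensus matching is stricter than a position frequency matrix, which scores each position by observed nucleotide frequencies rather than accepting or rejecting it outright.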


Fig 5.

Analysis of the effects of drugs on protein expression with RPPApipe.

(a) Predefined Galaxy workflow for RPPA data analysis. Two input files are required: a CSV file containing the RPPA expression values and a class file, which defines the relations between samples. First, the data is normalized and annotated. Second, differential expression of proteins is determined. Third, various plots are generated and pathway enrichment and clustering are performed. (b) Volcano plot and pathway profile for RPPA data. These example plots were generated using a data set for effects of drug exposure on the protein expression in rat liver. Several proteins are differentially modified after treatment with Wy-14643, a non-genotoxic carcinogen (left). Differentially regulated proteins have been mapped to KEGG pathways to identify potential deregulation on pathway level (right). (c) Clustering of drugs by effects in rat liver. All 15 drugs in the data set were clustered by their protein expression profiles. The two non-hepatocarcinogens (Nif, CFX) formed a separate cluster from the carcinogens. The two genotoxic carcinogens (CIDB, DMN) formed a cluster within the carcinogens. The non-genotoxic carcinogens formed several clusters.
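
A volcano plot such as the one in (b) places each protein at its log2 fold change on the x-axis and the negative log10 of its p-value on the y-axis. The sketch below computes these coordinates and flags proteins that pass both cutoffs. It is a minimal illustration and not RPPApipe code: the function name, the cutoff values (absolute log2 fold change of at least 1, p-value of at most 0.05), and the example values are assumptions and are not taken from the data set described in the caption.

```python
import math


def volcano_points(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Compute volcano plot coordinates for (name, log2_fold_change, p_value) tuples.

    Returns a list of (name, x, y, is_significant) tuples, where
    x is the log2 fold change and y is -log10(p_value).
    The default cutoffs are illustrative assumptions.
    """
    points = []
    for name, log2_fold_change, p_value in results:
        is_significant = abs(log2_fold_change) >= lfc_cutoff and p_value <= p_cutoff
        points.append((name, log2_fold_change, -math.log10(p_value), is_significant))
    return points


# Hypothetical input values for two proteins.
example = volcano_points([("protein_1", 2.0, 0.001), ("protein_2", 0.2, 0.5)])
```

Proteins flagged as significant appear in the upper left and upper right regions of the plot, which is where the differentially modified proteins mentioned in (b) would be found.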
