Genomic Analysis of Stress Response against Arsenic in Caenorhabditis elegans

Arsenic, a known human carcinogen, is widely distributed around the world and found in particularly high concentrations in certain regions including the Southwestern US, Eastern Europe, India, China, Taiwan and Mexico. Chronic arsenic poisoning affects millions of people worldwide and is associated with increased risk of many diseases including atherosclerosis, diabetes and cancer. In this study, we explored genome-level global responses to high and low levels of arsenic exposure in Caenorhabditis elegans using Affymetrix expression microarrays. This experimental design allows microarray analysis of dose-response relationships of global gene expression patterns. High dose (0.03%) exposure caused stronger global gene expression changes than low dose (0.003%) exposure, suggesting a positive dose-response correlation. Biological processes such as oxidative stress and iron metabolism, which were previously reported to be involved in arsenic toxicity in studies using cultured cells, experimental animals, and humans, were found to be affected in C. elegans. We performed genome-wide gene expression comparisons between our microarray data and publicly available C. elegans microarray datasets for cadmium exposure and for exposure to sediment samples from the German rivers Rhine and Elbe. Bioinformatics analysis of arsenic-responsive regulatory networks was done using the FastMEDUSA program. FastMEDUSA analysis identified cancer-related genes, particularly genes associated with leukemia, such as dnj-11, which encodes a protein orthologous to the mammalian ZRF1/MIDA1/MPP11/DNAJC2 family of ribosome-associated molecular chaperones. We analyzed the protective functions of several of the identified genes using RNAi. Our study indicates that C. elegans could be a substitute model to study the mechanism of metal toxicity using high-throughput expression data and bioinformatics tools such as FastMEDUSA.


Introduction
Arsenic is a metalloid that is distributed throughout the Earth's crust in diverse complex forms with pyrites. Depending on the physicochemical conditions of the environment, arsenic can readily dissociate from the complex, enter ground water [1] and be taken up by microorganisms, resulting in high levels of bio-availability [1,2]. In Asia, including India, Bangladesh, Vietnam, Thailand and China, millions of people are exposed to arsenic. Two different oxidation states of arsenic, (III) and (V), are available in organic and inorganic forms that correlate with their cytotoxic potentials. Of these two states, compounds in the (+3) oxidation state are more toxic to target cells and tissues due to several mechanisms, including high affinity for protein thiols or vicinal sulfhydryl groups [3][4][5][6][7][8].
Chronic and/or acute high dose arsenic exposure can cause a wide range of health problems including cancer, severe gastrointestinal toxicity, diabetes, cardiovascular disease and even death [5,8,9]. Arsenic is classified as a Group 1 carcinogen, the category for agents or mixtures that are definitely carcinogenic to humans [10]. Since carcinogenic metals, including arsenic, tend to be weak mutagens and do not directly interact with DNA, several recent studies have suggested that epigenetic regulation may play a role in metal-induced carcinogenesis [11].
Although the metabolism of inorganic arsenic is quite well known, the precise mechanism of arsenic toxicity is not clearly understood. In mammals, a methylation pathway has been proposed for the metabolic processing of inorganic arsenicals. In this pathway, arsenite (iAsIII) is sequentially converted to monomethylarsonic acid (MMAV) and dimethylarsinic acid (DMAV) in both humans and laboratory animals including mice and rats. The intermediate arsenicals, MMAIII and DMAIII, also produced in this pathway, are highly toxic and suspected to be responsible for arsenic toxicity [12]. While some steps in this pathway are strictly chemical reactions, others are enzymatically catalyzed. However, work to date has identified one methyltransferase that is clearly a participant in this pathway. Arsenic (+3 oxidation state) methyltransferase (AS3MT) catalyzes conversion of iAs to methylated products. AS3MT homologs have not been identified in the C. elegans genome [13]. Other aspects of arsenic metabolism in C. elegans remain to be determined. Arsenic causes oxidative stress, apoptosis and mutagenesis [14][15][16]. Oxidative stress through generation of reactive oxygen species due to arsenic exposure [17][18][19][20] has been reported in tumor cell lines [21] as well as in normal human cells [22,23].
While arsenic is mostly documented as an inducing factor in cancers and several other diseases, there is extensive evidence that one form of arsenic, As2O3, has a potential antitumor effect in vitro and in vivo [24][25][26]. The United States Food and Drug Administration (US-FDA) approved As2O3 for the treatment of Acute Promyelocytic Leukemia (APL). It is well established that As2O3 can completely cure ~80-90% of newly diagnosed APL patients [24][25][26].
C. elegans, a model organism that is less complex than the mammalian system while still sharing high genomic homology, provides an excellent model to elucidate the mechanisms of heavy metal toxicity [27]. This soil nematode has been used in toxicology studies, revealing molecular mechanisms of heavy metal toxicity [28], [29], [30]. Therefore, the C. elegans model system is valuable for the investigation of metal toxicity and may be particularly useful for examining gene-environment interactions. Several toxicity endpoints are well documented in the nematode, including growth rate, lifespan, reproduction, and feeding [31,32]. Acute toxicity can also be assessed in the nematode using altered gene expression levels, as well as behavioral endpoints, such as locomotion, and head thrashing [33][34][35][36][37]. Several cellular stress response systems such as the glutathione (GSH), metallothioneins (MTs), heat shock proteins (HSPs), as well as a variety of pumps and transporters are found to work to detoxify and excrete metals in C. elegans [27]. Previously, whole genome C. elegans DNA microarray and RNAi analysis were used to explore global changes in this nematode to understand mechanisms involved in resistance to cadmium toxicity [38].
In this study we used C. elegans whole genome expression microarrays to examine global changes in the nematode transcription profile upon arsenic exposure. Bioinformatics analysis of regulatory networks was done using FastMEDUSA. We analyzed the protective functions of several of the identified genes using RNAi. Molecular players previously associated with arsenic exposure in higher organisms were identified at a global level, confirming the effectiveness of the study. Moreover, we identified evolutionarily conserved genes that were not previously associated with arsenic exposure but are associated with carcinogenesis.

Arsenic treatment for microarray experiments
Synchronized L1 stage animals were collected by spinning at 800 rpm for five minutes, transferred to CeHM media containing sodium arsenite (0.03% and 0.003% w/v), and incubated at 22°C for 6 hours.

RNA Isolation
After arsenic treatment, animals were collected and washed in M9 buffer, and RNA was extracted using TRIzol reagent (Invitrogen). Residual genomic DNA was removed by DNase treatment (Ambion, Austin, TX). Three independent RNA isolations were performed for each condition for microarray analysis.

Microarray Analysis
For each experimental condition, RNA was isolated from three biological replicate samples. cRNA was synthesized from 10 µg of total RNA, and samples were hybridized to the C. elegans GeneChip (Affymetrix, Santa Clara, CA) by the US Food and Drug Administration/CFSAN/DMB Microarray Facility following the manufacturer's instructions. The chip represents 22,500 transcripts of the expressed C. elegans genome based on the December 2005 genome sequence. The data were processed using Partek Genomics Suite, version 6.5 (Copyright © 2010 Partek Inc., St. Louis, MO, USA). The robust multichip averaging (RMA) algorithm was used to normalize and summarize the probe data into probeset expression values. The RMA algorithm performs background correction, normalization, and summarization using PM-only probes (Figure S1). A gene usually maps to several probesets. To convert probeset-level expression data to gene-level data, we picked the highest-intensity probeset of each gene. We used ANOVA to identify differentially expressed genes between experimental treatment groups and control, using the log-transformed normalized intensity values generated by the RMA algorithm. We used the false discovery rate (FDR) for multiple comparison correction. Genes were considered differentially expressed if they had a p-value ≤ 0.05 after correction. The microarray data have been deposited in the GEO repository under accession number GSE39012.
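The probeset-to-gene step and the multiple-testing correction described above can be sketched as follows. This is a minimal illustration, not the Partek workflow used in the study. Two assumptions are made here that the text does not state: "highest-intensity" is taken to mean the highest mean intensity across samples, and the FDR correction is taken to be the Benjamini-Hochberg procedure. The function names and the toy identifiers are hypothetical.

```python
def collapse_to_genes(probesets):
    """Keep one probeset per gene: the one with the highest mean intensity.

    probesets: iterable of (probeset_id, gene, [intensity per sample]).
    Assumption: "highest-intensity" means highest mean across samples.
    Returns {gene: (probeset_id, intensities)}.
    """
    best = {}
    for probeset_id, gene, values in probesets:
        mean = sum(values) / len(values)
        if gene not in best or mean > best[gene][0]:
            best[gene] = (mean, probeset_id, values)
    return {gene: (pid, vals) for gene, (_, pid, vals) in best.items()}


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the paper only says "FDR"; BH is a common choice.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy data with made-up probeset identifiers.
toy = [
    ("ps_1", "gene-a", [7.0, 7.2, 6.8]),
    ("ps_2", "gene-a", [9.1, 9.0, 9.2]),
    ("ps_3", "gene-b", [5.0, 5.5, 5.2]),
]
genes = collapse_to_genes(toy)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p <= 0.05 for p in adjusted]
```

In this sketch a gene would be called differentially expressed when its adjusted p-value is at most 0.05, mirroring the threshold stated in the text.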

Functional Enrichment Analysis
Genes showing a significant change in expression by microarray analysis (FDR < 0.05) were analyzed using the 'stats' package of R (R Development Core Team [2012]: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org). Genes were compared against a database of 21,249 C. elegans genes to identify over-represented Gene Ontology terms. Statistical analysis was performed using the chi-square test with Yates' continuity correction. Significant functional terms were defined as those with p < 0.05.
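As an illustration of the test named above, the following sketch computes a chi-square statistic with Yates' continuity correction for a single Gene Ontology term. It is written in Python rather than the R 'stats' package used in the study, and the 2x2 table layout (differentially expressed or not, annotated with the term or not) is an assumption about how the comparison was set up. The function names and example counts are hypothetical.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square test with Yates' correction for the table [[a, b], [c, d]].

    Returns (chi2, p) for 1 degree of freedom.
    """
    n = a + b + c + d
    # Yates' correction subtracts n/2 from |ad - bc|, floored at zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


def term_enrichment(hits_in_term, hits_total, term_total, genome_total=21249):
    """Build the 2x2 table for one GO term and test it.

    hits_*: differentially expressed genes; term_total: genes annotated
    with the term among all genome_total background genes.
    """
    a = hits_in_term
    b = hits_total - hits_in_term
    c = term_total - hits_in_term
    d = genome_total - hits_total - c
    return yates_chi_square(a, b, c, d)


# Made-up counts: 20 of 100 changed genes carry a term held by 200 genes.
chi2, p = term_enrichment(20, 100, 200)
```

The chi-square test is two-sided, so whether a significant term is over- or under-represented has to be read from the counts themselves, for example by comparing the observed hits with the number expected from the background frequency.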

qRT-PCR
cDNA was synthesized from 5 µg of total RNA using random hexamers and SuperScript II reverse transcriptase (Invitrogen). Real-time PCR was performed using SYBR Advantage quantitative PCR premix (Clontech) and gene-specific oligonucleotide primers on the LightCycler (Roche). Primers for qRT-PCR are listed in Table S1. Relative fold-changes for transcripts were calculated using the comparative CT (2^-ΔΔCT) method [41]. Cycle thresholds of amplification were determined by LightCycler software (Roche). All samples were run in triplicate and normalized to GAPDH.
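The comparative CT calculation can be written out in a few lines. The sketch below assumes that triplicate CT values are averaged before the subtraction and that amplification efficiency is 100% (a doubling per cycle), which is the usual premise of the 2^-ΔΔCT method. The function name and the CT values are made up for illustration.

```python
from statistics import mean


def fold_change_ddct(target_treated, ref_treated, target_control, ref_control):
    """Relative expression by the comparative CT (2^-ddCT) method.

    Each argument is a list of replicate CT values. The reference gene
    (GAPDH in the text) normalizes each condition before comparison.
    """
    delta_treated = mean(target_treated) - mean(ref_treated)
    delta_control = mean(target_control) - mean(ref_control)
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Hypothetical triplicate CT values.
fc = fold_change_ddct(
    target_treated=[22.0, 22.0, 22.0],
    ref_treated=[18.0, 18.0, 18.0],
    target_control=[25.0, 25.0, 25.0],
    ref_control=[18.0, 18.0, 18.0],
)
```

Here the target gene crosses threshold three cycles earlier in treated samples after normalization, which corresponds to an eight-fold induction.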

RNA Interference
E. coli DH5α bacterial strains expressing double-stranded C. elegans RNA [42] were grown in LB broth containing ampicillin (100 µg/ml) at 37°C and plated onto NGM containing 100 µg/ml ampicillin and 3 mM isopropyl 1-thio-β-D-galactopyranoside (IPTG). RNAi-expressing bacteria were allowed to grow overnight at 37°C. Synchronized L1 stage NL2099 (rrf-3) animals were used in RNAi experiments for the functional validation of the differentially expressed genes identified through microarray analysis. NL2099 (rrf-3) worms were exposed to a fresh lawn of RNAi-expressing bacteria on NGM agar plates for 48 hours, then washed with M9, plated on sodium arsenite-containing NGM plates with an E. coli OP50 bacterial lawn, and incubated at 22°C (see 'C. elegans survival assays for arsenic exposure following RNAi' section below). L4440 RNAi, which contains the RNAi plasmid only, was included as a control in all experiments.
C. elegans survival assays for arsenic exposure following RNAi
Nematode growth media (NGM) containing sodium arsenite (0.03%) was prepared in 6-cm Petri plates for survival assays. The plates contained a lawn of OP50 bacteria as a food source. Plates were incubated overnight at room temperature before animals were added. Worms (L1 stage), treated with RNAi bacteria for 48 hours, were transferred to sodium arsenite-containing NGM plates with OP50 bacterial lawns and incubated at 22°C. Around 20-30 L4 stage worms were added to each plate. A total of 75 to 100 animals were scored for survival for each condition every 24 h and transferred to fresh bacterial lawns every day to avoid overgrowth by progeny. The assay was continued for up to ten days. Animal survival was plotted using Kaplan-Meier survival curves and analyzed by log rank test using GraphPad Prism (GraphPad Software, Inc., La Jolla, CA). Survival curves with p values < 0.05 relative to control were considered significantly different.
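For readers unfamiliar with the Kaplan-Meier estimator mentioned above, the following is a minimal sketch of how a survival curve is built from daily scoring data. It is not the GraphPad Prism procedure used in the study and does not include the log rank test. The input format, in which each animal is recorded as a day and a flag saying whether it died or was censored (for example lost or bagged), is an assumption, and the data are invented.

```python
def kaplan_meier(observations):
    """Kaplan-Meier survival estimate.

    observations: list of (day, died) pairs, one per animal, where died is
    True for an observed death and False for a censored animal.
    Returns [(day, survival_probability)] at each day with a death.
    """
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for day in sorted({t for t, _ in observations}):
        deaths = sum(1 for t, died in observations if t == day and died)
        leaving = sum(1 for t, _ in observations if t == day)
        if deaths:
            # Multiply by the fraction of at-risk animals surviving this day.
            survival *= 1 - deaths / at_risk
            curve.append((day, survival))
        # Both dead and censored animals leave the risk set.
        at_risk -= leaving
    return curve


# Invented example: four animals, one censored on day 2.
curve = kaplan_meier([(1, True), (2, True), (2, False), (3, True)])
```

Censored animals reduce the number at risk on later days without counting as deaths, which is why the estimate differs from a simple fraction of animals alive.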

FastMEDUSA analysis
We used FastMEDUSA [43] to elucidate transcription factors (TFs) that putatively regulate the genome-level responses to high and low levels of arsenic exposure in C. elegans. FastMEDUSA applies a machine learning algorithm called boosting to train a predictive model from the expression and promoter sequences of genes in a number of experimental conditions. FastMEDUSA uses a list of candidate TFs, the promoter sequences of all the genes and a matrix of discrete expression data as input. To discretize gene expression data, we computed the fold change of the expression signal of a gene in a sample relative to the gene's median expression across reference samples. A gene in a sample was called upregulated if the fold change was ≥ 1.5 and downregulated if the fold change was ≤ -1.5. Genes having inconsistent expression calls across technical replicates were filtered out. We obtained the list of candidate TFs in C. elegans from EDGEdb [44], and obtained the 1,000 bp promoter sequences of genes from BioMart [45].
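The discretization step can be illustrated with a short sketch. It assumes linear-scale expression values and the common signed fold-change convention in which a ratio below 1 is reported as the negative reciprocal (so a halving is -2); the text does not spell out either point. The function names are hypothetical and this is not FastMEDUSA code.

```python
from statistics import median


def signed_fold_change(value, reference):
    """Signed fold change: ratio if >= 1, otherwise the negative reciprocal.

    Assumes linear-scale (not log-transformed) intensities.
    """
    ratio = value / reference
    return ratio if ratio >= 1 else -1 / ratio


def discretize(sample_value, reference_values, threshold=1.5):
    """Return +1 (upregulated), -1 (downregulated) or 0 (no call)."""
    fc = signed_fold_change(sample_value, median(reference_values))
    if fc >= threshold:
        return 1
    if fc <= -threshold:
        return -1
    return 0


def consistent_call(calls):
    """Return the shared call if all replicates agree, otherwise None."""
    return calls[0] if len(set(calls)) == 1 else None


reference = [100.0, 200.0, 300.0]   # median is 200
calls = [discretize(v, reference) for v in (300.0, 100.0, 220.0)]
```

A gene whose replicates yield mixed calls would return None from `consistent_call` and be dropped from the input matrix, matching the filtering described in the text.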
FastMEDUSA potentially builds a different model at each run because it contains some stochastic steps. Thus, we ran FastMEDUSA five times, using a different random seed value at each run, on the Biowulf cluster at the National Institutes of Health. For each FastMEDUSA run, we computed the significance score of each TF as follows. First, we computed the prediction score for the upregulated genes in the experimental condition based on the original FastMEDUSA model. Then, we removed the TF from the FastMEDUSA model and recomputed the prediction score for the same gene set. The difference between the prediction scores gives the significance score of the TF (details in [46]). We selected the top 20 TFs with the highest significance scores in each run. We then selected the top ten consensus significant TFs that were selected as significant in at least four out of five runs. To find significant TF-gene associations, we computed the significance score for each TF-gene pair. We selected TF-gene associations that had a significance score ≥ 1 in at least four out of five runs and generated a network of these associations using Cytoscape [47].
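The consensus selection across runs can be sketched as below. The sketch takes per-run significance scores as given (computing them requires the FastMEDUSA models themselves) and returns every TF that reaches the per-run top list in enough runs. How the final list was cut to ten is not described in the text, so that step is not modeled. The function name and the toy scores are hypothetical.

```python
from collections import Counter


def consensus_tfs(runs, top_n=20, min_runs=4):
    """TFs ranked in the top_n by significance score in at least min_runs runs.

    runs: list of dicts mapping TF name -> significance score, one per run.
    Returns a sorted list of TF names.
    """
    counts = Counter()
    for scores in runs:
        top = sorted(scores, key=scores.get, reverse=True)[:top_n]
        counts.update(top)
    return sorted(tf for tf, n in counts.items() if n >= min_runs)


# Toy example with three made-up TFs and a top list of two per run.
toy_runs = [
    {"tf-a": 3.0, "tf-b": 2.0, "tf-c": 1.0},
    {"tf-a": 3.0, "tf-c": 2.0, "tf-b": 1.0},
    {"tf-a": 3.0, "tf-b": 2.0, "tf-c": 1.0},
    {"tf-b": 3.0, "tf-a": 2.0, "tf-c": 1.0},
    {"tf-a": 1.0, "tf-b": 2.0, "tf-c": 3.0},
]
consensus = consensus_tfs(toy_runs, top_n=2, min_runs=4)
```

Requiring agreement across runs with different random seeds filters out TFs that appear only because of the stochastic steps in model building.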

Results and Discussion
Arsenic exposure induced genome-wide gene expression changes in C. elegans
Arsenic-induced global gene expression has been poorly explored. To study the global gene expression pattern after acute arsenic exposure, we performed a microarray study in which wild type L4 stage C. elegans (N2) were exposed to sodium arsenite at two different concentrations (0.03% and 0.003% w/v) in CeHM media for 6 hours. Differentially expressed genes were identified (fold change of at least ±1.2, FDR = 0.05 and P < 0.05). C. elegans gave a strong global gene expression response to sodium arsenite: about one fifth of the genome (4731 genes) was differentially expressed upon high dose (0.03% w/v) exposure. Low dose (0.003% w/v) sodium arsenite led to differential expression of 218 genes, 179 of which were common between the two exposures (Fig. 1). Microarray data were confirmed using qRT-PCR to measure the expression levels of a set of selected genes (Fig. S2).
Comparison of gene expression changes between high and low levels of sodium arsenite exposure
We exposed worms to two different concentrations of sodium arsenite to evaluate the genomic responses to different levels of arsenic exposure. This experimental design allows microarray analysis of dose-response relationships of global gene expression patterns. High dose (0.03%) exposure caused stronger global gene expression changes than low dose (0.003%) exposure (Fig. 1, Table S2). Two hundred and four genes were upregulated four-fold or more upon high dose exposure, and forty nine genes were upregulated four-fold or more upon low dose exposure. Forty six genes were common between these lists (Table 1). Forty three of the forty six commonly upregulated genes show a dose-response relationship in which higher levels of sodium arsenite led to higher gene expression levels in C. elegans (Table 1). After eight hours of exposure we did not observe anatomical changes in tissue structure or lethality (data not shown).
Protective function of the subset of the genes upregulated against arsenic treatment was evaluated using RNAi
We wanted to test whether knocking down the upregulated genes would affect sodium arsenite-induced lethality in C. elegans.
Four of the seven genes tested caused a statistically significant increase in lethality upon sodium arsenite exposure when knocked down via RNAi, suggesting that these genes may have a stress response function against arsenic (Fig. 2). Among these genes, aip-1 encodes an AN-1-like zinc finger-containing protein homologous to arsenite-inducible RNA-associated protein (AIRAP), which is conserved among C. elegans, Drosophila, and mammals. AIP-1 is a predicted RNA binding protein that may function in ubiquitin-mediated proteolysis following arsenite treatment. AIP-1 is expressed at high levels in hypodermal and intestinal cells of C. elegans following arsenic exposure, and was previously shown to protect C. elegans and mammalian cells from arsenite toxicity [48]. Our aip-1 RNAi results agree with the previously published data (Fig. 2A). gcs-1 encodes the C. elegans ortholog of gamma-glutamylcysteine synthetase heavy chain (GCS(h)), which is predicted to function as a phase II detoxification enzyme that catalyzes the rate-limiting first step in glutathione biosynthesis, in a conserved oxidative stress response pathway [49]. Inoue et al. [50] showed that the Caenorhabditis elegans PMK-1 p38 MAPK pathway regulates the oxidative stress response via the CNC transcription factor SKN-1, leading to accumulation of phosphorylated SKN-1 in intestinal nuclei, where SKN-1 activates transcription of gcs-1. SKN-1 also regulates expression of AIP-1 [51]. We found that most of the C. elegans glutathione S-transferases (GSTs), which are important detoxifying enzymes, responded to arsenic exposure (Table 2). Among these genes, gst-37 was previously defined as an acrylamide-responsive gene in C. elegans using expression microarrays [52]. gst-37 RNAi experiments resulted in increased lethality under arsenic exposure conditions (Figure 2B). In our microarray data several hsp (heat-shock protein) genes were found to be responsive to sodium arsenite (Table 2).
HSP-70 is a member of the hsp70 family of molecular chaperones, which is involved in the general stress response, including the response to heat and cadmium exposure, in C. elegans [30,53]. We found that hsp-70 RNAi leads to increased lethality under arsenic exposure conditions (Figure 2D). Arsenic toxicity induces HSP70 in other systems, including Xenopus laevis embryos [54] and broiler chickens [55]. Oxidative stress, the central component of the heat shock response, is also induced by arsenic [56].
Oxidative stress-response genes are induced due to sodium arsenite exposure
Our microarray data revealed that the genomic response of C. elegans to sodium arsenite exhibits characteristics of a global oxidative stress response (Fig. 3). Oxidative stress from arsenic exposure might result from production of Reactive Oxygen Species (ROS), such as superoxide, hydrogen peroxide, or hydroxyl radical, by arsenicals, or from release of iron from ferritin, or through induction of heme oxygenase. Increased biosynthesis of defensive enzymes responsive to oxidative stress has been described in both prokaryotes and eukaryotes. We compared our arsenic-response microarray data with previously published C. elegans global stress-response results. The response of C. elegans to both paraquat-induced stress [57] (Fig. 3A, Table S4) and hyperbaric oxygen-induced stress [58] (Fig. 3B, Table S5) showed significant overlap with our arsenic-response microarray data. We performed Gene Ontology (GO) term enrichment analysis on our high dose arsenic response data (Figure 1C) along with the paraquat and hyperbaric oxygen stress data (Figure 3C and 3D). General stress-related GO categories, such as 'unfolded protein binding', 'protein folding', 'protein transport', and 'proteolysis', were found to be enriched under high dose arsenic exposure conditions (Fig. 1C). Some of the protein folding and transport related GO term enrichments were also present in the paraquat stress data but not in the hyperbaric oxygen stress data (Fig. 3C, D), suggesting that arsenic and paraquat result in similar functional responses in C. elegans. Interestingly, expression of zinc ion binding gene classes was depleted in all of these stresses (Fig. 1C, 3C-D). The essential trace element zinc is broadly required in cellular functions, and disturbances in zinc homeostasis cause a range of health problems that include growth retardation, immunodeficiency, and neuronal and sensory dysfunctions [59].
Glutathione S-transferases (GSTs) are essential detoxifying enzymes that constitute up to 10% of cytosolic protein in some mammalian organs, and catalyze the conjugation of reduced glutathione to a wide variety of substrates [60,61]. This activity detoxifies endogenous compounds such as peroxidised lipids [62]. GSTs may also bind toxins and function as transport proteins [63]. The C. elegans genome possesses a large number of GST genes. We found that sixty seven percent (thirty three of forty nine) of the C. elegans gst genes are differentially expressed upon arsenic exposure (Table 2). Other genes encoding antioxidant enzymes such as catalase, superoxide dismutase, and glutathione peroxidase are also differentially expressed in arsenic-exposed C. elegans (Table 2). Lynn et al. [64] reported that arsenite activates NADH oxidase to produce superoxide, which then causes oxidative DNA damage. We found that the putative NADH oxidase encoding gene F56D5.3 is upregulated 43-fold in arsenic-exposed C. elegans (Table 2). Recent studies revealed an association between sonic hedgehog signaling and oxidative stress in several different tissues including rat brain and mouse bone marrow [65,66]. Twenty of the fifty eight hedgehog-related genes of C. elegans were found to be differentially expressed upon arsenic exposure (Table 2). The functions of sonic hedgehog signaling genes in arsenic toxicity and protection remain to be determined.

Arsenic-induced perturbations in iron metabolism may lead to oxidative stress
Almost all cells utilize iron as a cofactor for essential biochemical activities, such as oxygen transport, energy metabolism and DNA synthesis. However, iron catalyses the propagation of ROS and the generation of highly reactive radicals through Fenton chemistry; hence, free iron is potentially toxic to cells [67,68]. Much of the excess intracellular iron is stored in the cytosol, bound to ferritin. Very little is known about the interaction of arsenic species with free iron at the cellular level. Release of iron from ferritin is an under-investigated possible mechanism of arsenic-induced oxidative stress. It has been shown that arsenic species can cause release of iron from horse spleen ferritin in vitro [69]. Iron administration into HeLa cells leads to increased ferritin mRNA levels [70]. We found that the ferritin encoding genes of C. elegans, ftn-1 and ftn-2, are upregulated upon sodium arsenite exposure (Table 2). There is strong experimental support suggesting a protective role for ferritin against oxidative stress. Both transcriptional and posttranscriptional mechanisms have been implicated in ferritin induction by oxidants such as ROS and nitric oxide [71,72]. The C. elegans homologs of the iron transporter ferroportin, fpn-1.1 and fpn-1.2, are also differentially expressed in response to sodium arsenite (Table 2). Sideroflexins are recently discovered mitochondrial multiple transmembrane proteins with unknown function, which are associated with iron accumulation in mitochondria [73]. We found that the C. elegans sideroflexin genes sfxn-2 and sfxn-5 are downregulated upon arsenic exposure. Altogether, our data suggest that arsenic may induce perturbations in proteins involved in iron metabolism.

Genomic response to arsenic versus cadmium in C. elegans
Heavy metals such as copper, zinc and cadmium, and metalloids such as arsenic, are major environmental toxicants that are associated with a variety of human diseases. In spite of extensive research on the pathogenesis of human diseases linked to environmental heavy metal and metalloid exposure, the fraction of the molecular mechanisms of pathogenesis that is shared among these agents is not known at the genomic level. We compared the response of C. elegans to cadmium and arsenic using a previously published microarray dataset [38]. Using the same threshold for both the cadmium and arsenic response datasets (1.5-fold, p < 0.0001), we found a significant overlap between affected genes (Fig. 1A, B, Fig. S3, Table S3). We performed Gene Ontology (GO) term enrichment analysis on the cadmium microarray expression data. Some of the protein folding and transport related GO term enrichments were present in the cadmium response data (Fig. S3A, B). Expression of zinc ion binding gene classes was depleted in the cadmium data, similar to our findings for the arsenic, paraquat stress, and hyperbaric oxygen stress data. The GO class 'nematode larval development' was found to be enriched in the cadmium response data but not in the arsenic response data, suggesting that arsenic and cadmium may have different developmental consequences in C. elegans. Robinson et al. reported arsenic- and cadmium-induced toxicogenomic responses in mouse embryos undergoing neurulation. They examined the dose-dependent effects of arsenic and cadmium on gene expression in association with increased embryotoxicity in C57BL/6J mouse embryos, and identified overlapping and non-overlapping metal-induced gene expression alterations [74]. They found that 1960 and 775 genes were significantly altered by arsenic and cadmium, respectively (F-test, p < 0.0001), with 116 of these genes overlapping between the two populations.
Understanding genomic level responses to different heavy metals will help to resolve shared mechanisms of heavy metal-induced diseases.
Genomic responses of C. elegans to environmental contamination can be used as an ecotoxicogenomics tool
Arsenic is ubiquitous throughout the Earth's crust in different complex forms with pyrites [75], can easily dissociate from the complex and enter ground water [1], and can be taken up by microorganisms, resulting in high levels of bio-availability [1,2]. Because of these properties, arsenic is considered an important environmental toxin. Menzel et al. used C. elegans as a bio-monitor to characterize the sediment toxicity of the German rivers Rhine and Elbe [76]. In that study, C. elegans were exposed to sediments of three German rivers, the Danube, Rhine and Elbe, with the Danube being the cleanest and the Elbe the most contaminated of the three, based on chemical properties of the sediments, including arsenic levels. Using the expression profile of C. elegans exposed to Danube sediment as a reference, they found that 748 and 694 transcripts were significantly altered in Elbe and Rhine exposed animals, respectively. We wanted to address the question of how an expression profile identified against a particular pollutant, in our case arsenic, would correlate with an expression profile identified against contaminated river sediments. We found a bigger overlap between the global arsenic response and the response to the Elbe, which is the most contaminated river in this study, with higher arsenic levels (Fig. 4, Table S6, S7). These results indicate that C. elegans may be used as an environmental bio-monitor, and that meta-analysis of publicly available C. elegans expression microarray data can provide a platform to gain insights into complex environmental issues.
Discovering transcription factors of genomic response to arsenic exposure in C. elegans using FastMEDUSA
We utilized FastMEDUSA to compute TFs involved in the transcriptional response against high (0.03%) and low (0.003%) concentrations of arsenic in C. elegans (see Materials and Methods). We predicted ten consensus-significant TFs associated with the high concentration of arsenic (Table 3). As the transcriptional response of C. elegans to the low concentration of arsenic was minimal, FastMEDUSA did not find any significant TFs associated with this condition. We also predicted significant TF-gene associations based on the FastMEDUSA models and plotted them in a network (Figure S4, Table S8).
Two of the predicted significant TFs, dnj-11 and dac-1, were tested for their contribution to the arsenic stress response. We found that loss of function of these genes using RNAi resulted in increased lethality, suggesting that these genes induce the stress response in C. elegans (Fig. 5). dnj-11 encodes a protein containing DnaJ and Myb domains that is orthologous to the mammalian ZRF1/MIDA1/MPP11/DNAJC2 family of ribosome-associated molecular chaperones (WormBase). MPP11 was identified as a leukemia-associated antigen, and expression of this gene is up-regulated in leukemic blasts in patients [77]. In rats, the MPP11 homolog MIDA1 was found to induce humoral immune responses in glioma, and immunization with a MIDA1-containing plasmid resulted in a significant suppression of tumor growth in immunized animals [78,79].
Chromosomal defects involving MPP11 are associated with primary head and neck squamous cell tumors [80]. Interestingly, a recent study revealed that As2O3 had anti-cancer effects on both cultured oral squamous cell carcinoma (OSCC) cells and OSCC xenografts by inhibiting cell growth, suppressing angiogenesis and inducing apoptosis [81]. There is extensive evidence that arsenic trioxide (As2O3) has a potential antitumor effect in vitro and in vivo [24][25][26]. The US Food and Drug Administration approved As2O3 for the treatment of Acute Promyelocytic Leukemia (APL). It is well established that As2O3 can cure ~80-90% of newly diagnosed APL patients [24][25][26]. The precise molecular mechanism of the therapeutic effect of As2O3 is not known. Our results suggest a molecular mechanism for the therapeutic effect of As2O3, such that As2O3 may regulate MPP11 expression, which may stimulate immune responses that lead to killing of leukemic blast cells and squamous cell carcinoma cells. dac-1 encodes the C. elegans ortholog of Dachshund, a transcriptional regulator of the SKI/SNO/DAC family of proteins first described in Drosophila. The altered expression of DACH1, a Drosophila Dachshund homolog, has been associated with tumor progression and metastasis in human breast, prostate, ovarian and endometrial cancers [82][83][84][85][86]. Another arsenic response gene identified via FastMEDUSA is zip-4, a putative C/EBP protein and divergent orthologue of the human CEBPA gene, which is mutated in acute myeloid leukemia [87]. We also identified several C. elegans nuclear receptors (NRs) as arsenic responsive genes using FastMEDUSA. Nuclear receptors encompass a family of transcription factors often regulated by small lipophilic molecules, such as steroids, retinoids, and bile and fatty acids, that mediate endocrine control [88]. C. elegans has a large family of NRs, containing 284 of these receptors in its genome (WormBook). A large percentage of human cancers, particularly breast, prostate, and endometrial cancers, rely on steroid production for initial growth [89,90]. Our data suggest that induction of NRs via arsenic may contribute to the increased incidence of cancers in arsenic-exposed human populations.

Table 3. Transcription factors that are predicted by FastMEDUSA to be involved in the response to arsenic exposure.

Figure S4 Significant transcription factor-gene interactions in high arsenic condition. The color of the nodes represents the overall expression of the gene (green: downregulated, red: upregulated). The size of vertices is proportional to their degree (i.e., number of edges incident on them). Each node is labeled with the corresponding gene or TF's name. Rounded squares represent transcription factors, and circles represent putative target genes of these transcription factors. The layout of the network was generated manually in Cytoscape. (TIF)