Gene Expression and Pathway Analysis of Effects of the CMAH Deactivation on Mouse Lung, Kidney and Heart

Background N-glycolylneuraminic acid (Neu5Gc) is generated by hydroxylation of CMP-Neu5Ac to CMP-Neu5Gc, catalyzed by CMP-Neu5Ac hydroxylase (CMAH). However, humans lack this common mammalian cell surface molecule, Neu5Gc, due to inactivation of the CMAH gene during evolution. CMAH is one of several human-specific genes whose function has been lost by disruption or deletion of the coding frame. It has been suggested that CMAH inactivation has produced biochemical or physiological characteristics that have resulted in human-specific diseases.
Methodology/Principal Findings To identify differential gene expression profiles associated with the loss of Neu5Gc expression, we performed microarray analysis with the Illumina MouseRef-8 v2 Expression BeadChip on the main tissues (lung, kidney, and heart) from control mice and CMP-Neu5Ac hydroxylase (Cmah) gene knock-out mice. Out of a total of 25,697 genes, 204, 162, and 147 genes were found to be significantly modulated in the lung, kidney, and heart tissues of the Cmah null mouse, respectively. In this study, we examined the gene expression profiles using three commercial pathway analysis software packages: Ingenuity Pathways Analysis, Kyoto Encyclopedia of Genes and Genomes analysis, and Pathway Studio. The gene ontology analysis revealed that the top 6 biological processes of these genes included protein metabolism and modification; signal transduction; lipid, fatty acid, and steroid metabolism; nucleoside, nucleotide, and nucleic acid metabolism; immunity and defense; and carbohydrate metabolism. Gene interaction network analysis showed a network that was common to the different tissues of the Cmah null mouse. However, the expression of most sialyltransferase mRNAs of Hanganutziu-Deicher antigen, sialyl-Tn antigen, Forssman antigen, and Tn antigen was significantly down-regulated in the liver tissue of Cmah null mice.
Conclusions/Significance Mice bearing a human-like deletion of the Cmah gene serve as an important model for the study of abnormal pathogenesis and/or metabolism caused by the evolutionary loss of Neu5Gc synthesis in humans.


Introduction
Xenotransplantation using pig organs has the potential to solve the increasing shortage of donor organs available for allotransplantation [1]. Over the last two decades, there have been considerable advances in our understanding of the immunological and physiological hurdles to xenotransplantation. Many of the obstacles to xenotransplantation result from mismatches in receptor-ligand or enzyme-substrate interactions between the pig tissue and the recipient's blood and immune system [2]. The initial cause of failure for pig cardiac and renal xenografts is thought to be antibody-mediated injury to the endothelium, leading to the development of microvascular thrombosis. Factors contributing to the development of thrombotic microangiopathy include anti-non-Gal antibodies, natural killer cell or macrophage activity, and inherent coagulation dysregulation between pigs and primates [3].
To address these problems, several researchers have produced gene knockouts or transgenic animals to correct these mismatches. However, the combination of these modifications into an "idealized" transgenic animal has yet to be reported.
Neu5Gc is produced from Neu5Ac through enzymatic hydroxylation of the N-acetyl residue of free Neu5Ac, CMP-Neu5Ac, or glycoconjugate-linked Neu5Ac [4,5]. Neu5Gc, also called Hanganutziu-Deicher (H-D) antigen, is expressed on the endothelial cell surface of all mammals, with the exception of humans, and is a target for non-Galα1,3Gal antibodies [6]. As a consequence of inactivation of the CMP-Neu5Ac hydroxylase (Cmah) gene during evolution, humans have lost the ubiquitous mammalian cell surface molecule Neu5Gc [7,8,9]. Conversely, Neu5Gc produced by CMAH activity is one of the non-Gal xenoantigens, of secondary importance to the product of α-1,3-galactosyltransferase (GGTA1), for pig-to-human xenotransplantation [10]. Similar to galactose-α1,3-galactose (α-Gal), Neu5Gc, whose expression depends on CMAH, is immunogenic in humans and is a key non-Gal antigen [11]. Very recently, we have developed a biallelic CMAH knock-out in pigs [12]. However, these data raise the possibility of an alternate pathway in metabolism and the immune system, which might contribute to acute immune rejection of xenografts.
The silencing of CMAH gene expression resulted in a number of genetic and biochemical changes to the biosynthesis of sialic acids, which may have contributed to several unique aspects of human biology in health and disease [13,14]. Cmah null mice show Neu5Ac accumulation, a characteristic present in humans. The Cmah null mice also exhibit many of the problems common in humans, including diminished acoustic sensitivity and an altered startle response threshold (indicative of hearing loss) and delayed skin healing [15,16]. Recently, microarray analysis has been shown to be a powerful tool for the analysis of gene expression and is particularly suited for the identification of transcription factor target genes. In addition, network-assisted analysis of DNA chip data is an emerging area in which network-related approaches are developed and utilized in order to study human diseases or traits. To identify the transcriptional alterations caused by the human-specific loss of Neu5Gc, we examined expression profiles of the main organs (lung, kidney, and heart) harvested from wild type (WT) and Cmah null mice. Here, we report the use of network-assisted transcript profiling to relate these alterations to various diseases and discuss the options relating to practical applications.

Animal ethics
All animal experiments were approved and performed under the guidelines of the Konkuk University Animal Care and Experimentation Community [IACUC approval number: KU12045]. Cmah tm1Ykoz knockout mice were kindly provided by RIKEN (Japan). All lines were maintained on a congenic C57BL/6J background. The mice were allowed to eat and drink ad libitum and were fed with standard mouse chow (Cargill Agri Purina, Inc., Seongnam-Si, Korea). Twelve-week-old wild type and Cmah null male mice were used in this study.

Immunohistochemistry (IHC)
For IHC, the tissues were fixed in 10% neutral buffered formalin and then embedded and mounted on slides. Endogenous peroxidase activity was blocked using 3% hydrogen peroxide. The samples were then pretreated with Borg Decloaker and blocked in Background Sniper solution. After washing, the samples were incubated with specific primary antibodies for CMAH (Santa Cruz; Texas, USA; 1:100) and Neu5Gc (Sialix; San Diego, CA, USA; 1:200) at 4°C overnight. After the incubation, the samples were washed and incubated with horseradish peroxidase-conjugated secondary antibody. Samples were then stained with ImmPACT DAB peroxidase substrate (Vector Laboratories; CA, USA) to visualize the signal. Samples were also counterstained with Hematoxylin QS to provide background information for reference. The samples were mounted using VECTASHIELD HardSet mounting medium (Vector Laboratories; CA, USA) and observed using fluorescence microscopy (Olympus; Japan).

Microarray analysis
WT and Cmah null mice of the same age (12 weeks) and genetic background (C57BL/6J) were used (n = 3 per group) for microarray analysis. Total RNA was extracted and purified from the lung, kidney, and heart of WT and Cmah null mice using RNeasy columns (Qiagen; Valencia, CA, USA) according to the manufacturer's protocol. The RNA quality was verified using an

Raw data preparation and statistical analysis
The quality of hybridization and overall chip performance were monitored by visual inspection of both internal quality control checks and the raw scanned data. The raw data were extracted using the software provided by the manufacturer (Illumina GenomeStudio v2009.2 [Gene Expression Module v1.5.4]). Array data were filtered by a detection p-value < 0.05 in at least 50% of the samples. Selected gene signal values were logarithm-transformed and normalized using the quantile method [17]. Comparative analysis between the wild-type group and the Cmah null group was carried out based on fold-change in expression levels.
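The filtering, normalization, and fold-change steps described above can be illustrated with a minimal Python sketch. It is not the GenomeStudio or R implementation used in the study: the function names are hypothetical, ties are broken by input order rather than averaged, and the logarithm base (10 here) is an assumption, since this section does not state it.

```python
def passes_detection(det_pvalues, alpha=0.05, min_fraction=0.5):
    """True if a probe has detection p < alpha in at least min_fraction of samples."""
    detected = sum(1 for p in det_pvalues if p < alpha)
    return detected >= min_fraction * len(det_pvalues)


def quantile_normalize(matrix):
    """Quantile-normalize a probes x samples matrix (list of rows).

    Each sample (column) is forced onto the same distribution: the mean of
    the sorted columns. Ties are broken by input order (an assumption).
    """
    n_probes, n_samples = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    sorted_cols = [sorted(col) for col in columns]
    # Mean value at each rank across all samples.
    rank_means = [
        sum(col[i] for col in sorted_cols) / n_samples for i in range(n_probes)
    ]
    result = [[0.0] * n_samples for _ in range(n_probes)]
    for j, col in enumerate(columns):
        order = sorted(range(n_probes), key=lambda i: col[i])  # stable sort
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result


def fold_change(log_row, control_idx, ko_idx, base=10.0):
    """Fold change (knockout / control) from log-scale values of one probe."""
    def mean(idx):
        return sum(log_row[i] for i in idx) / len(idx)
    return base ** (mean(ko_idx) - mean(control_idx))
```

After normalization every sample column contains the same set of values, so a difference between group means reflects a shift in a probe's rank rather than a chip-wide intensity offset.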

Gene ontology (GO) analysis
GO analysis of the significant probe list was performed using PANTHER (http://www.pantherdb.org/), using text files containing the Gene ID list and accession numbers of the Illumina probe ID. All data analysis and visualization of differentially expressed genes were conducted using R 2.4.1 (www.r-project.org). In addition, the DAVID Functional Annotation Bioinformatics Microarray Analysis tools (http://david.abcc.ncifcrf.gov/) were used to study the biological function of the regulated genes [18].

Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis
KEGG is a collection of online databases dealing with genomes, enzymatic pathways, and biological chemicals [19]. The PATHWAY database records networks of molecular interactions in the cell, including organism-specific network maps (http://www.genome.jp/kegg/). A total of 9 pathways, involving 49 genes, were collected from KEGG.
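The text does not state which statistical test produced the pathway P values. One common way to score over-representation of a pathway in a gene list is a one-sided hypergeometric test, sketched below in Python as an assumption rather than as the method used here. The function name and its parameters are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

For example, the background could be the 25,697 probes on the array and the selected set the differentially expressed genes of one tissue; the pathway size and the overlap would come from the KEGG annotation.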

Interactome/network analysis of differentially expressed genes
Significantly affected or differentially expressed genes were subjected to a comprehensive search to identify their biological functions. Gene interaction networks, bio functions, and pathway analysis were generated by Ingenuity Pathway Analysis (IPA) (Ingenuity Systems; Mountain View, CA, USA), which assists with microarray data interpretation via grouping of differentially expressed genes into known functions, pathways, and networks primarily based on human and rodent studies. The identified genes were mapped to genetic networks available from the Ingenuity database and were then ranked by score. The significance was set at a p-value of 0.05.

Pathway Studio analysis
To identify molecular pathways, we arranged the data by relation type, using Pathway Studio 9.0 software (Ariadne Genomics; Rockville, MD, USA). This program integrates relevant information among the imported genes, consequently allowing identification of biological pathways, gene regulation networks, and protein interaction maps.

Quantitative real-time polymerase chain reaction (RT-qPCR)
The total RNA obtained from each tissue (lung, kidney, heart, and liver) used for microarray analysis was reverse-transcribed with the QuantiTect Reverse Transcription Kit (Qiagen; Valencia, CA, USA), according to the manufacturer's recommendations. To assess gene expression, quantitative real-time polymerase chain reaction (RT-qPCR) was performed on an ABI ViiA 7 system (Applied Biosystems; Foster City, CA, USA) with SYBR Green as the fluorescence detection method (Bio-Rad; Hercules, CA, USA). Specific primer sets for RT-qPCR are listed in Tables S1 and S2. The mouse glyceraldehyde-3-phosphate dehydrogenase (Gapdh) gene was used as an internal control to normalize the RT-qPCR efficiency and for quantification of gene expression. After normalization to Gapdh expression, we compared the relative expression of each mRNA sample in the Cmah null mice with that of the controls. The RT-qPCR was performed in triplicate for each sample.
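The normalization to a reference gene and comparison with controls can be sketched as follows. The quantification formula is not given in the text; this sketch assumes the widely used 2^(-ΔΔCt) method with roughly 100% amplification efficiency, and the function name and Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_ko, ct_ref_ko, ct_target_wt, ct_ref_wt):
    """Relative expression (knockout vs. control) by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values; the reference gene
    (here, Gapdh) is assumed to be unaffected by the knockout.
    """
    delta_ko = mean(ct_target_ko) - mean(ct_ref_ko)   # dCt in knockout
    delta_wt = mean(ct_target_wt) - mean(ct_ref_wt)   # dCt in control
    return 2.0 ** -(delta_ko - delta_wt)


# Hypothetical triplicate Ct values, for illustration only.
ratio = relative_expression(
    ct_target_ko=[25.0, 25.0, 25.0],
    ct_ref_ko=[18.0, 18.0, 18.0],
    ct_target_wt=[24.0, 24.0, 24.0],
    ct_ref_wt=[18.0, 18.0, 18.0],
)
```

With this convention the control group is 1 by construction, which matches the way control values are reported as 1 ± error in the Results.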

Disruption of the Cmah gene completely abolishes Neu5Gc production
The production of Cmah null mice has been described previously [15,16]. The Cmah null genotype was confirmed by PCR, and the lack of CMAH protein expression was confirmed by western blotting analysis (data not shown). When Cmah null mice from Cmah +/- × Cmah +/- crosses were examined and weighed, homozygous Cmah null mice were indistinguishable from their wild type or heterozygous littermates in all respects.
IHC was used to determine expression patterns for CMAH and Neu5Gc in the lung, kidney, and heart tissues of control and Cmah null mice at 12 weeks of age, using specific antibodies for CMAH and Neu5Gc. As shown in Figure 1, both the CMAH protein and the Neu5Gc epitope were highly expressed in the control mouse tissues, whereas the Cmah null mice were completely deficient for both CMAH and Neu5Gc in the lung, kidney, and heart. These observations suggested that the lung, kidney, and heart tissues of the Cmah null mouse may contain possible molecular targets associated with xenoantigens relevant to animal-to-human xenotransplantation or human species-specific metabolic syndrome.

Cmah null mice show reduced sialyltransferase-related gene expression
Neu5Gc, also called the H-D antigen, is produced by the CMAH enzyme and is one of the non-Gal xenoantigens for pig-to-human xenotransplantation [10]. Therefore, we examined the sialyltransferase (ST3Gal1-4 and ST6Gal1) expression pattern in the livers of control and Cmah null mice. Gapdh gene expression was simultaneously assayed as an endogenous invariant control for data normalization. Compared with the H-D antigen gene expression in the control mouse (1±0.079; 1±0.024; 1±0.056; 1±0.061; 1±0.061; 1±0.005), RT-qPCR analysis showed that Cmah null mice had low ST6Gal1 (0.53±0.012), ST3Gal1 (0.20±0.015), ST3Gal2 (0.39±0.004), ST3Gal3 (0.21±0.012), ST3Gal4 (0.70±0.0016), and ST3Gal6 (0.54±0.079) expression levels, whereas the ST3Gal5 expression level was slightly elevated (1.13±0.079) (Figure 2). Interestingly, the H-D antigen mRNA expression in Cmah null mice was not completely lost, suggesting that the H-D antigen mRNA expression may be compensated when one gene does not function for some reason. Since ST3GAL1, 3, and 4 catalyze the addition of Neu5Ac to the nonreducing terminal of the galactose (Gal) residue of glycans, our observations indicate that down-regulation of ST3GAL1, 3, and 4 and ST6GAL1 expression in Cmah null mice may reduce cytotoxic T cell numbers and therefore attenuate resistance to pathogens and tumorigenic lesions.
Expression levels of the sialyl-Tn antigen-related genes ST6GalNac2 (0.018±0.003), ST6GalNac3 (0.018±0.021), ST6GalNac4 (0.017±0.012), and ST6GalNac6 (0.28±0.023) were significantly lower in the Cmah null mice than in the control mice (1±0.024, 1±0.030, 1±0.052, and 1±0.179, respectively; Figure 2A). Expression levels of GBGT1 (0.56±0.015) for the Forssman antigen, and of GalNT2 (0.83±0.063), GalNT4 (0.68±0.065), and GalNT6 (0.67±0.031) for the Tn antigen in Cmah null mice were also significantly lower in comparison to those of the control mice (1±0.058, 1±0.066, and 1±0.202, respectively). However, GalNT7 expression for the Tn antigen, unlike that of other family members, was significantly increased. Considering that the GalNT7 enzyme shows exclusive specificity for partially GalNAc-glycosylated acceptor substrates and shows no activity with non-glycosylated peptides, increased GalNT7 expression in the Cmah null mouse may promote the initiation of the O-glycosylation step.
We next analyzed the binding of natural xenoreactive antibodies in human serum using control (WT), Cmah heterozygote (mKO), and Cmah homozygote (Cmah dKO or Cmah null) mice. As shown in Figure S1, most of the serum samples from healthy volunteers with different blood groups (A, B, O, and AB) showed the presence of naturally occurring IgM and IgG antibodies that bound to thymocytes from WT, Cmah-mKO, and Cmah-dKO mice. Of note, IgG binding to Cmah dKO mouse-derived thymocytes was significantly reduced in all tested blood groups compared to WT- or Cmah mKO-derived thymocytes. In contrast, there was no significant IgM antibody in the A, O, and AB blood types.

Microarray analysis of gene expression profiles in the lung, kidney, and heart of Cmah null mice
Following microarray analysis using the Illumina MouseRef-8 v2 Expression BeadChip (composed of 25,697 filtered probe sets) on the main tissues (lung, kidney, and heart) from the control and Cmah null mouse groups (n = 3 each), all hybridization spots on the image were quantified. The fluorescence intensity data were converted into log10 values. Genes with significantly different expression levels (P < 0.05) in the lung, kidney, and heart from the Cmah null mouse, relative to the control mouse, were extracted for further analysis. The results showed that 204, 162, and 147 genes were differentially expressed, with a more than 1.5-fold change, in the lung, kidney, and heart of Cmah null mice, respectively, when compared with age- and sex-matched control mice (Figure 3A). In the lung, 101 genes were up-regulated and 103 genes were down-regulated (File S1). In the kidney, 108 genes were up-regulated and 54 genes were down-regulated (File S2). In the heart, 78 genes were up-regulated and 69 genes were down-regulated (File S3). Even though 2 up-regulated genes (Entpd4 and Dex3y) and 4 down-regulated genes (Nr1d1, Sybl1, LOC10047427, and Cfd) overlapped in all tissues (Table S3), overall, the overlap between 2 different organs (lung vs. heart, kidney vs. heart, and lung vs. kidney) and among 3 different organs (lung vs. kidney vs. heart) was minimal.
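The selection rule described above (P < 0.05 and a more than 1.5-fold change) and the search for genes shared across tissues can be sketched in Python. This is an illustrative sketch only: the function names, the symmetric treatment of down-regulation (fold change below 1/1.5), and the toy gene names and values are assumptions, not data from the study.

```python
def classify(fold_change, p_value, fc_cutoff=1.5, alpha=0.05):
    """Return 'up', 'down', or None for one gene (knockout relative to control)."""
    if p_value >= alpha:
        return None
    if fold_change > fc_cutoff:
        return "up"
    if fold_change < 1.0 / fc_cutoff:
        return "down"
    return None


def shared_across_tissues(calls_by_tissue, direction):
    """Genes called in the given direction in every tissue."""
    gene_sets = [
        {gene for gene, call in calls.items() if call == direction}
        for calls in calls_by_tissue.values()
    ]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy (fold change, P value) pairs with made-up gene names.
toy = {
    "lung": {"GeneA": (2.1, 0.01), "GeneB": (0.5, 0.02), "GeneC": (1.2, 0.001)},
    "kidney": {"GeneA": (1.8, 0.03), "GeneB": (0.9, 0.01), "GeneC": (3.0, 0.2)},
    "heart": {"GeneA": (1.6, 0.04), "GeneB": (0.4, 0.01), "GeneC": (0.6, 0.03)},
}
calls = {
    tissue: {gene: classify(fc, p) for gene, (fc, p) in genes.items()}
    for tissue, genes in toy.items()
}
shared_up = shared_across_tissues(calls, "up")
shared_down = shared_across_tissues(calls, "down")
```

Intersecting per-tissue sets in this way is what a three-way Venn diagram summarizes: a gene counts as shared only if it passes both thresholds in the same direction in all three tissues.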
To understand their biological roles, the genes with significant changes in expression detected in Cmah null mice by microarray analysis were assigned to GO classification categories using the PANTHER and DAVID tools. There are 3 ontologies in GO: cellular component, molecular function, and biological process [20]. The top 6 enriched biological processes from the GO analysis were categorized according to their functional role, as shown in Table 1 and Files S1, S2, and S3. These biological groups covered 5 subcategories (protein biosynthesis, protein complex assembly, protein folding, protein modification, and proteolysis) for protein metabolism and modification; 3 subcategories (cell communication, cell surface receptor-mediated signal transduction, and intracellular signaling cascade) for signal transduction; 4 subcategories (fatty acid metabolism, lipid and fatty acid transport, lipid metabolism, and steroid metabolism) for lipid, fatty acid, and steroid metabolism; 10 subcategories (antioxidation and free radical removal, B-cell- and antibody-mediated immunity, blood clotting, complement-mediated immunity, cytokine/chemokine-mediated immunity, detoxification, interferon-mediated immunity, macrophage-mediated immunity, stress response, and T-cell-mediated immunity) for immunity and defense; 6 subcategories (carbohydrate transport, gluconeogenesis, glycogen metabolism, glycolysis, pentose-phosphate shunt, and tricarboxylic acid pathway) for carbohydrate metabolism; and 3 subcategories (mRNA transcription, purine metabolism, and pyrimidine metabolism) for nucleoside, nucleotide, and nucleic acid metabolism. Although expression profiles were completely different for each tissue of the Cmah null mice, some genes were shared between tissues in a biological group (Fkbp5, Cfd, Dbp, and Entpd4).
The results generated using the PANTHER classification system are shown in Figure 3B and Table 1. Each gene was assigned to one or more biological groups, according to the function of its protein. The classification was based on 31 biological groups; the 6 groups that contained the majority of the differentially expressed genes were protein metabolism and modification (33, 24, and 16 genes in the lung, kidney, and heart, respectively), signal transduction (26, 13, and 14 genes), lipid, fatty acid, and steroid metabolism (16, 24, and 8 genes), nucleoside, nucleotide, and nucleic acid metabolism (16, 15, and 13 genes), immunity and defense (15, 14, and 9 genes), and carbohydrate metabolism (12, 11, and 8 genes) (Figure 3B). In addition, GO classification by molecular function showed that differentially expressed genes were assigned 29 different functions (viral protein, cell junction protein, isomerase, chaperone, lyase, ion channel, synthase/synthetase, cell adhesion molecule, extracellular matrix, select calcium binding protein, defense/immunity protein, phosphatase, transfer/carrier protein, membrane traffic protein, protease, ligase, cytoskeletal protein, signaling molecule, transporter, hydrolase, oxidoreductase, miscellaneous function, kinase, transferase, receptor, select regulatory molecule, transcription factor, nucleic acid binding, and molecular function unclassified), while 42 (24 down-regulated genes and 18 up-regulated genes), 23 (5 down-regulated genes and 18 up-regulated genes), and 28 (15 down-regulated genes and 13 up-regulated genes) genes had multiple functions in the lung, kidney, and heart, respectively (Files S1, S2, and S3).
The GO classification by cellular component using the DAVID tool showed that 34 differentially expressed genes were mapped to cell fraction, membrane fraction, plasma membrane, etc.; 32 genes mapped to extracellular region, insoluble fraction, organelle membrane, etc.; and 36 genes mapped to extracellular matrix, cytoplasmic vesicle, organelle lumen, etc. in the lung, kidney, and heart, respectively (data not shown).
Figure 1. Immunohistochemistry in Cmah-dKO mouse-derived tissues (lung, kidney, and heart) for detection of CMAH and Neu5Gc. WT mice expressed both CMAH and Neu5Gc epitopes. Cmah-dKO mice were completely deficient for both CMAH and Neu5Gc epitopes in the lung, kidney, and heart tissues. Bar: 100 μm. LU, lung; KI, kidney; HE, heart. doi:10.1371/journal.pone.0107559.g001
The molecular pathways associated with the differentially expressed genes from the lung, kidney, and heart of Cmah null mice were identified using KEGG pathway analysis. KEGG pathways are manually drawn maps representing well-known molecular interaction and reaction networks. Differentially expressed genes from the lung, kidney, and heart of Cmah null mice were associated with the following pathways (P < 0.05): for the lung, retinol metabolism and glycerolipid metabolism; for the kidney, drug metabolism, arachidonic acid metabolism, complement and coagulation cascades, retinol metabolism, fatty acid metabolism, and circadian rhythm; for the heart, focal adhesion (Table 2). These data indicate that deletion of the Cmah gene leads to both up- and down-regulation of gene expression in the lung, kidney, and heart tissues, which may regulate the metabolism and signaling pathways within these tissues.
We then performed IPA, which identified 3 gene interaction networks in the Cmah null tissues from the uploaded gene lists, based on the literature contained in the IPA knowledge base (P < 0.05) (Table S4). The analysis revealed one network associated with lipid metabolism and small-molecule biochemistry that was common to all tissue types (Figure 4). The network was merged and clustered around several central genes, including the up- and down-regulated genes identified from the microarray data (40.63% of the total nodes in the lung, 51.43% of the total nodes in the kidney, and 41.18% of the total nodes in the heart). This network contained 5 up-regulated genes (Cap1, Gpr182, Myl2, Myl3, and Prg2) and 8 down-regulated genes (Cel, Cela1, Clps, Cpa1, Ctrl, Plip, Pnliprp1, and Zg16) in the lung, whereas all the genes (Abcc3, Ahsg, Akr1c3, Apoa1, Apoc1, Azgp1, Cfd, Cyp27b1, Cyp4a14, Cyp4a22, Hdc, Hp, Hpx, Gtp, Ly6a, Mat1a, Prlr, and Serpina1) were up-regulated in the kidney. Similarly, the network in the heart was composed of 9 up-regulated genes (Adh1c, Dbp, Fkbp5, Gck, Herpud1, Per2, Ptgds, Scgb1a1, and Tef) and 5 down-regulated genes (Alas2, Chad, Egr1, Scd, and Snca).
IPA pathway analysis also identified putative diseases and disorders in the lung, kidney, and heart of the Cmah null mouse: inflammatory response, cardiovascular disease, developmental disorder, and skeletal and muscular diseases for the lung; gastrointestinal disease, nutrition disease, and cancer for the kidney; and neurological disease, cancer, cardiovascular disease, and skeletal and muscular diseases for the heart (Table 3). Taken together, the predicted interacting molecular networks in the Cmah null tissues from the IPA suggest that loss of Neu5Gc production has affected the signaling pathways that are involved in human diseases and disorders. Therefore, this result provides further information for more detailed evaluation of the potential effects of the loss of function of CMAH.

RT-qPCR validation
The microarray expression data were validated by RT-qPCR analysis using genes selected from the KEGG pathway analysis in the lung, kidney, and heart of Cmah null mice. As shown in Figure 5A, the expression levels of Pnlip (pancreatic lipase), Cel (carboxyl ester lipase), Pnliprp1 (pancreatic lipase related protein 1), and Dgat2 (diacylglycerol O-acyltransferase 2) for glycerolipid metabolism in the lung; Cyp4a12b (cytochrome P450, family 4, subfamily a, polypeptide 12B), Adh1 (alcohol dehydrogenase 1), and Cyp4a14 (cytochrome P450, family 4, subfamily a, polypeptide 14) for fatty acid metabolism in the kidney; and Actb (actin, beta), Myl7 (myosin, light chain 7), Itgb6 (integrin beta 6), and Pik3r1 (phosphatidylinositol 3-kinase, regulatory subunit, polypeptide 1) for regulation of actin cytoskeleton in the heart corresponded to the microarray results. This shows that the data sets obtained from the microarray analysis accurately reflect the differential gene expression in the lung, kidney, and heart between the control and Cmah null mice (Figure 5A).
Figure 3. Gene expression profile in the lung, kidney, and heart of Cmah null mice. A. Venn diagram showing differential expression of genes in the lung, kidney, and heart of Cmah null mice. Numbers in the red and blue Venn diagrams represent up- and down-regulated genes, respectively. LU, lung; KI, kidney; HE, heart. B. The differentially up- or down-regulated genes were classified according to biological process in the lung, kidney, and heart of Cmah-dKO mice. LU, lung; KI, kidney; HE, heart. The X axis of the bar graph indicates the gene numbers. doi:10.1371/journal.pone.0107559.g003

Putative Cmah interaction network analysis
Heterogeneous high-throughput biological data have become readily available for a number of diseases. However, the number of data points generated by such experiments does not facilitate manual integration of the information, which is required to design the optimal therapy for a disease. In this study, we examined a novel computational workflow for designing a therapy strategy using the Pathway Studio software. Using subnetwork enrichment analysis (Figure 5B), we identified eight CMAH downstream-localized proteins (ERBB2, CMAS, CYB5R3, CTNNB1, IL4, IL17A, IL6, and IFNG), six small molecules (Fe2+, NSC2921, sialate, CMP-N-glycolylneuraminate, N-glycolylneuraminate, and CMP-N-acetylneuraminate), three cell processes (immune reactivity, B-cell activation, and cell growth), and two diseases (metabolic disease and Duchenne muscular dystrophy). In addition, we identified four CMAH upstream-localized diseases (neoplasm, malaria, Mdx dystrophy, and nematode infections), three small molecules (NaCl, NADH, and ferrosulfate), eight functional classes (serotonin receptor, ncRNA, DNA-directed RNA polymerase, nuclease, PKC, electron carrier, oncogene, and mucin), and one protein (CYB5A). As shown in Figure 5B, the data set obtained from a representative potential signaling pathway accurately predicted previously reported metabolic disease models, such as human type II diabetes [21] and human Duchenne muscular dystrophy syndrome [16]. Taken together, this analysis suggests that regulation analysis of both Cmah and other proteins provides a valuable tool to evaluate human-specific diseases caused by loss of CMAH function during human evolution.

Table 3. Diseases and disorders predicted by Ingenuity Pathway Analysis in the lung, kidney, and heart of Cmah null mice.
Lung (molecules) | Kidney (molecules) | Heart (molecules)
Inflammatory response (10) | Gastrointestinal disease (13) | Neurological disease (11)
Cardiovascular disease (3) | Nutrition disease (7) | Cancer (9)
Developmental disorder (5) | Cancer (21) | Cardiovascular disease (3)
Skeletal and muscular diseases (6) | | Skeletal and muscular diseases (12)
doi:10.1371/journal.pone.0107559.t003

Discussion
Microarray analysis was selected to uncover the underlying molecular mechanisms necessary for animal-to-human xenotransplantation, as well as the metabolic conditions arising from the evolutionary loss of CMAH. The microarray expression data were validated by RT-qPCR analysis of selected genes from the lung, kidney, and heart of Cmah null mice. Overall, the decrease in expression of glycosyltransferases suggests a corresponding reduction in the cell surface sialylation and carbohydrate modification in the Cmah null mouse tissues. From the regulatory network identified from the Cmah null mice, we were able to show that many of the transcription factors and target genes associated with CMAH activity were linked in our network.
Recently, several studies have linked CMAH deactivation to hearing loss [15], skin healing delay [15], a human-like muscular dystrophy phenotype following combined mutation of the Dmd gene [16], abnormal B cell proliferation and antibody production, and type 2 diabetes-like syndromes [22]. To elucidate the role of CMAH in an animal model, a conventional knockout mouse has been constructed and made available to the scientific community [22]. We have also recently reported the first successful CMAH knockout in pigs [12]. First, we compared the expression profiles of Cmah +/+ and Cmah -/- littermates and of the Cmah null mouse and pig models. In this study, GalNT7 expression for the Tn antigen in the Cmah null mouse, unlike that of other family members, was significantly increased, whereas the expression of this gene in CMAH null pigs was significantly decreased. Moreover, ST6GalNac2 expression for the sialyl-Tn antigen and GalNT2, GalNT3, and GalNT4 expression for the Tn antigen in CMAH null pigs were not changed compared to the control [12], whereas the expression of these genes in Cmah null mice was significantly decreased (Figure 2). Although these differences in sialyltransferase expression between the Cmah null mouse and pig models were observed, their sialyltransferase expression profiles were strikingly similar overall. With respect to expression of H-D antigens, ST6GalNac3, ST6GalNac4, and ST6GalNac6 expression for the sialyl-Tn antigen and GalNT6 expression for the Tn antigen were all significantly down-regulated in both the Cmah null mouse and the CMAH null pig, suggesting a common expression signature for both models. This observation also suggests that Cmah null mice have the potential to serve as a model for human-specific disease, although they have limitations because of their differences in gene expression and physiology compared to humans.
GO is an international standardized classification system for determining gene function, which supplies a set of controlled vocabulary to comprehensively describe the property of genes and gene products [23]. There are 3 ontologies in GO: cellular component, molecular function, and biological process [20]. In this study, analysis of molecular function showed that 204, 162, and 147 genes were assigned to 29 different functions based on gene background, while 42, 23, and 28 genes were involved in multiple functions in the lung, kidney, and heart, respectively (Files S1, S2, and S3). GO enrichment for gene background, based on the cellular component, showed that 34, 32, and 36 altered genes were mapped in the lung, kidney, and heart to GO terms in the database. As for the KEGG pathway analysis, it is worth noting that the following pathways were significantly (P value, 0.0121528-0.0000001) enriched in the Cmah null mouse ( Table 2): for the heart, focal adhesion; for the lung, retinol metabolism and glycerolipid metabolism; for the kidney, drug metabolism, arachidonic acid metabolism, complement and coagulation cascades, retinol metabolism, fatty acid metabolism, and circadian rhythm. The IPA further identified interacting modules involved in lipid metabolism and apoptosis signaling networks, which include fatty acids and their surrogate genes Pnlip (pancreatic lipase), Pnliprp1 (pancreatic lipase-related protein 1), and Pnliprp2 (pancreatic lipase-related protein 2). These significantly enriched pathways may imply that alterations in glycerolipid metabolism and fatty acid metabolism are involved in the pathogenesis of potentially fatal conditions, such as obesity, type II diabetes, and cancer. The results of the GO annotation and KEGG pathway analyses can therefore provide direction for future research. 
This study identified a common network in the three main organs (lung, kidney, and heart) that is associated with lipid metabolism and small-molecule biochemistry.
The microarray analysis revealed distinct expression profiles for the lung, kidney, and heart from Cmah +/+ and Cmah -/- littermates. The microarray data were analyzed using the Ingenuity System Database software that includes the Ingenuity Knowledge Base (IKB) and the Global Molecular Network (GMN). These databases integrate published findings on biologically meaningful genetic or molecular gene/gene product interactions and identify functionally related gene networks [24]. The differential gene expression values (Cmah +/+ versus Cmah -/- littermates) were entered into the IPA to determine the most highly regulated networks of gene interactions and to highlight the biological processes that are relevant to each of the treatments. Only networks with a score of 30 or higher were selected for further analysis. Networks describing the relationships between a subset of genes, their neighboring genes, and symbols representing the functional categories of the molecules are presented in Figure 4. Focus genes are denoted by red symbols (up-regulated genes) and green symbols (down-regulated genes). Gray and open symbols are intermediate genes, placed in the network by the Ingenuity Software and shown in the literature to interact with genes in this dataset. Genes in gray have been shown to interact with the colored gene products that appear in this scheme.
The lung network (left of Figure 4) showed a score of 30 and contains 13 differentially regulated genes. Among them, 10 and 13 genes were involved in lipid metabolism (P value: 3.21E-07-4.83E-02) and small-molecule biochemistry (3.21E-07-4.83E-02), respectively. Five genes (Cel, Lpl, Pnlip, Pnliprp1, and Pnliprp2) in this network have a central role in glycerolipid metabolism. Three of these genes (Pnlip, Pnliprp1, and Pnliprp2) encode key lipolytic enzymes and are known to alter lipid metabolism. Pnliprp1 and Pnliprp2 encode two novel pancreatic lipase-related proteins, referred to as Pnlip-related proteins 1 and 2. Both of these proteins have an amino acid sequence identity of 68% to Pnlip [25]. Pnliprp2 shows lipolytic activity that is marginally dependent on the presence of colipase, whereas the function of Pnliprp1 remains unclear [26]. Overall, these functionally related lipases play key roles in directing the regulation of fatty acid turnover and signaling. The most common symptoms associated with lipase deficiency are muscle spasms, acne, arthritis, gallbladder stress, formation of gallstones, bladder problems, and cystitis [27]. Therefore, our study has also provided the first evidence that a lower expression of Pnlip in Cmah null mice may be closely associated with the above-mentioned disease symptoms. The kidney network (middle of Figure 4), with the highest score (48), includes 18 genes. Among them, 10 and 13 genes were involved in lipid metabolism (P value: 3.94E-08-4.82E-02) and small-molecule biochemistry (3.94E-08-4.82E-02), respectively. The heart network (right of Figure 4) received a score of 34 and contains 14 differentially regulated genes. Among them, 9 and 10 genes were involved in lipid metabolism (P value: 1.14E-05-4.56E-02) and small-molecule biochemistry (1.14E-05-4.99E-02), respectively.
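The score-based selection of networks can be sketched in a few lines of Python. The scores (30, 48, and 34) and the cutoff of 30 are those reported in this study; the function and variable names are hypothetical. The conversion of a score to a probability assumes that the IPA network score is the negative base-10 logarithm of a right-tailed Fisher's exact test P value, which is our understanding of the IPA documentation and is not stated in this paper.

```python
import math

# Network scores reported in this study for each tissue.
NETWORK_SCORES = {"lung": 30, "kidney": 48, "heart": 34}
SCORE_CUTOFF = 30  # only networks scoring 30 or higher were analyzed further


def select_networks(scores, cutoff=SCORE_CUTOFF):
    """Keep the networks whose score meets or exceeds the cutoff."""
    return {tissue: score for tissue, score in scores.items() if score >= cutoff}


def score_to_p_value(score):
    """Convert a score to a P value, ASSUMING score = -log10(P)."""
    return 10.0 ** (-score)


selected = select_networks(NETWORK_SCORES)
for tissue, score in selected.items():
    print(f"{tissue}: score {score}, implied P = {score_to_p_value(score):.0e}")
```

Under this assumption, all three tissue networks pass the cutoff, and a higher score (as for the kidney network) corresponds to a lower probability that the focus genes were grouped in the network by chance.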
Taken together, these data suggest that the gene interaction network predicted in the Cmah null tissues could be critical in understanding the role of evolutionary loss of CMAH function in human metabolic disorders.
PathwayAssist is a software application developed for navigation and analysis of biological pathways, gene regulation networks, and protein interaction maps [28]. Using Pathway Studio, we identified a number of protein interaction pathways with important roles following the evolutionary loss of CMAH function (Figure 5B). By comparison with previously reported diseases caused by loss of CMAH function, we were able to accurately predict metabolic diseases, including human type II diabetes and human Duchenne muscular dystrophy syndrome, caused by CMAH inactivation. Therefore, our observations suggest that Pathway Studio analysis improves the accuracy and efficiency of interpreting CMAH function.
Herein, we identified 8 diseases and disorders in the Cmah null tissues, based on the literature contained in the IPA knowledge base (P < 0.05) (Table 3). Of these, specific disorders in Cmah null mice were closely associated with cardiovascular disease (Ptgds, Scd, and Egr1)-, skeletal and muscular disease (Adh1c, Dbp, Dkk3, Egr1, Herpud1, Scd, Snca, Ptgds, Timp4, Myl4, Alas2, and Fkbp5)-, and cancer (Cfd, Egr1, Scd, Timp4, Per2, Scgb1a1, Spon2, Slc46a3, and Gck)-associated genes in the heart (Table 3). In conclusion, these findings suggest that CMAH inactivation could affect a variety of signaling pathways accompanying orchestrated gene expression changes. Therefore, our results argue that evolutionary loss of Neu5Gc affects the complex regulation of cellular signaling pathways that are involved in human diseases.
Figure S1 Binding of natural xenoreactive antibodies in human sera to thymocytes from WT, Cmah-mKO (+/-; hetero) and Cmah-dKO (-/-; homo) mice. Indirect immunofluorescence staining of thymocytes from WT, Cmah-mKO, and Cmah-dKO mice was used to detect xenoreactive antibody levels in healthy human serum samples (blood group: A, B, O, and AB). Detection of IgM or IgG binding was achieved by further incubating the cells with DyLight 649-conjugated Monkey anti-Human IgM or DyLight 488 Monkey anti-Human IgG Abs. Histogram profiles show differences in binding of natural antibodies present in human sera to the different thymocytes. Green lines indicate binding to thymocytes from WT; pink lines, binding to thymocytes from Cmah-mKO; sky blue lines, binding to thymocytes from Cmah-dKO; and grey lines indicate negative control WT thymocytes stained with secondary antibody alone. (TIF)
File S1 Up- or down-regulated genes in lung of Cmah null mice.