Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio)

The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) is still lacking. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species, including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some specialization of gene function. This study provides essential genomic resources for future studies in common carp.


Introduction
The ATP-binding cassette (ABC) transporters are integral membrane proteins and are one of the largest superfamilies ubiquitously present in all phyla [1]. The majority of ABC proteins 11 ABCGs. The transcripts, coding sequences, and locations of these ABCs are summarized in Table 1. All sequences are available in S1 Table.
ABCA subfamily. The eleven ABCA genes identified in the common carp genome were annotated as ABCA1a-1, ABCA1a-2, ABCA1b-1, ABCA1b-2, ABCA2, ABCA3b, ABCA4a, ABCA4b, ABCA5-1, ABCA5-2, and ABCA12 (Table 1). The phylogenetic analysis showed that each ABCA subfamily member clustered with the orthologs from the other species (Figs 1 and 2). We identified four copies of ABCA1 (ABCA1a-1, ABCA1a-2, ABCA1b-1, and ABCA1b-2) in common carp, while only two copies have been identified in zebrafish and medaka. We also identified two copies each of ABCA4 (ABCA4a and ABCA4b) and ABCA5 (ABCA5-1 and ABCA5-2), and only a single copy of ABCA2, ABCA3b, and ABCA12. In the ABCA5 clade, the common carp ABCA5s were grouped with the other ABCA5s, and this group then clustered with ABCA6/8/9/10 sequences from the tetrapods, even though orthologous sequences were not detected in the common carp genome and have not been reported in other teleosts such as zebrafish and medaka. This finding implies that ABCA5-related gene divergence may have occurred in tetrapods [23]. We found a similar scenario in the ABCA3 clade, where ABCA14/15/16/17 were expanded in mouse. Unexpectedly, no ABCA7 gene was detected in the common carp genome, even though one ABCA7 copy is retained in both the zebrafish and medaka genomes. Further investigations are necessary to verify whether ABCA7 was lost during evolution or was missed because of inaccurate gene prediction or annotation.
ABCB subfamily. We detected only six ABCB genes in the common carp genome compared with twelve in zebrafish and nine in medaka. The common carp ABCB genes were annotated as ABCB4, ABCB5-1, ABCB5-2, ABCB9, ABCB11a, and ABCB11b (Table 1). Full-length coding sequences were obtained for the six ABCB transporters (Table 1). The phylogenetic analysis showed that all the common carp ABCB transporters clustered with the orthologs from the other species (Figs 1 and 3). The ABCBs clustered into two distinct clades, with ABCB1, ABCB4, ABCB5, and ABCB11 in one clade and ABCB8/10/7/6/9/tap1/tap2 in the other. The phylogenetic tree showed that ABCB1, ABCB4, and ABCB5 were closely related, implying that they may have shared a common ancestor during chordate evolution [24]. Until now, no teleost ABCB1 gene has been detected, suggesting that ABCB1 is absent in teleost fishes. Two copies of ABCB11 (ABCB11a and ABCB11b) were detected in common carp, as well as in zebrafish and medaka. The teleost ABCB11b sequences formed a group with the ABCB11 sequences of chicken and frog, and then clustered with the teleost ABCB11a sequences. This topology suggests that ABCB11a and ABCB11b may be derived from the second round (2R) of WGD instead of from the teleost-specific 3R WGD. The chicken and frog genomes lost the orthologous copies of ABCB11a and have retained only a single copy of ABCB11b.
ABCC subfamily. We detected 19 ABCC genes in the common carp genome, which is more than the sixteen in zebrafish and eleven in medaka. The common carp ABCC genes were annotated as ABCC1, ABCC2-1, ABCC2-2, ABCC4-1, ABCC4-2, ABCC5-1, ABCC5-2, ABCC6a, ABCC6-2, ABCC6-3, ABCC7, ABCC8, ABCC8-like, ABCC9-1, ABCC9-2, ABCC10, ABCC12-1, ABCC12-2, and ABCC13 (Table 1). Full-length coding sequences as well as partial sequences were obtained for the 19 ABCC transporters (Table 1). The phylogenetic analysis showed that all the common carp ABCC transporters clustered with the orthologs from the other species (Figs 1 and 4). Two copies each of ABCC2 (ABCC2-1 and ABCC2-2), ABCC4 (ABCC4-1 and ABCC4-2), ABCC5 (ABCC5-1 and ABCC5-2), ABCC9 (ABCC9-1 and ABCC9-2), and ABCC12 (ABCC12-1 and ABCC12-2) were detected in common carp, probably derived from either the teleost-specific 3R WGD or the latest 4R WGD. We identified three copies of ABCC6 in both the common carp and zebrafish genomes, which were likely derived from the 2R WGD, as well as from segmental duplication. Additional comparative genomic studies will be necessary to unveil the potential mechanisms. We also identified two copies of ABCC8 (ABCC8 and ABCC8-like) in common carp compared with three ABCC8 copies in zebrafish. The DreABCC8-1 and DreABCC8-2 sequences, which were closely related to the common carp ABCC8-like sequence, shared the highest similarity, suggesting they are newly duplicated genes that arose after common carp and zebrafish diverged. In addition, we identified an ABCC13 gene in the common carp genome; until now, ABCC13 was considered to be zebrafish specific [25]. This result suggests that ABCC13 could be a cyprinid-specific rather than a zebrafish-specific gene. The ABCC subfamily includes multidrug resistance-associated proteins that transport diverse substrates including drugs, endogenous compounds, and xenobiotics. In addition to these multidrug resistance proteins, the ABCC subfamily includes the chloride channel ABCC7 (also known as CFTR) and the sulfonylurea receptors ABCC8 and ABCC9 (also known as SUR1 and SUR2, respectively) [26]. The gene expansion of the ABCC subfamily in common carp may facilitate its ability to survive in diverse aquatic environments.
ABCD subfamily. The eight ABCD genes identified in the common carp genome were annotated as ABCD1, ABCD2, ABCD3a-1, ABCD3a-2, ABCD3b, ABCD4-1, ABCD4-2, and ABCD4-like (Table 1), which is considerably more than the four and five ABCD genes that have been identified in the medaka and zebrafish genomes, respectively. Full-length coding sequences as well as partial sequences were obtained for the eight ABCD transporters (Table 1). The phylogenetic analysis suggested that the ABCD3b sequences from zebrafish and common carp may be more ancient than the ABCD3a sequences, which may be derived from the 2R WGD in vertebrates (Figs 1 and 5). Although the other studied vertebrates, from teleost fishes to mammals, retained only ABCD3a in their genomes, the cyprinids retained the more ancient ABCD3b. The two ABCD3a copies (ABCD3a-1 and ABCD3a-2) are probably young homologs derived from the latest 4R WGD.
ABCE/F subfamily. All the ABCE and ABCF proteins contained two nucleotide-binding domains (NBDs) but no transmembrane domains (TMDs), making them non-functional as transporters [27]. Four members of the ABCE/F subfamily have been identified in vertebrates. We identified six ABCE/F genes in the common carp genome, with gene duplications only in ABCE1 and ABCF2 (Table 1, Figs 1 and 6).
ABCG/H subfamily. Five members of the ABCG subfamily have been identified in vertebrates. We identified 11 ABCG genes in the common carp genome, which were annotated as ABCG1, ABCG2, ABCG2b, ABCG2c, ABCG2d, ABCG4, ABCG4b, ABCG5, ABCG8-1, ABCG8-2, and ABCG2-like (Table 1). Full-length coding sequences were obtained for the 11 ABCG/H transporters (Table 1). The phylogenetic analysis showed that extensive ABCG2 gene expansion had taken place in the three teleost fishes (Figs 1 and 7). Five ABCG2 copies (ABCG2, ABCG2b, ABCG2c, ABCG2d, and ABCG2-like) were found in common carp compared with four ABCG2 copies in zebrafish and three in medaka. The ABCG2 transporter has been reported to be a high-capacity urate exporter [28]. The ABCG2 gene expansion may be an important mechanism for living in aquatic environments. In addition, two copies each of ABCG4 (ABCG4 and ABCG4b) and ABCG8 (ABCG8-1 and ABCG8-2) were identified in the common carp genome. Furthermore, we identified an orphan copy of ABCH (ABCH1) in the zebrafish genome. The presence of the ABCH subfamily in teleost fish is still controversial. Until now, the ABCH gene has been annotated only in zebrafish, and a putative form was identified in green spotted pufferfish, which was confirmed in a review of ABC drug transporters [29]. Our analyses did not detect any ABCH homologs in any of the other surveyed vertebrate genomes, which is consistent with a previous report [14].
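The subfamily counts reported above can be tallied as a quick consistency check. The following is a minimal sketch in Python that uses only numbers stated in the text; the variable and function names are illustrative and are not part of the original analysis. Zebrafish and medaka counts are included only for the subfamilies where the text gives them explicitly.

```python
# Common carp ABC gene counts per subfamily, as reported in the text.
carp_counts = {
    "ABCA": 11, "ABCB": 6, "ABCC": 19, "ABCD": 8,
    "ABCE": 2, "ABCF": 4, "ABCG": 11,
}

# Counts for the two diploid teleosts, limited to the subfamilies for which
# the text states a number (other subfamilies are omitted, not guessed).
teleost_counts = {
    "ABCB": {"zebrafish": 12, "medaka": 9},
    "ABCC": {"zebrafish": 16, "medaka": 11},
    "ABCD": {"zebrafish": 5, "medaka": 4},
}


def total_genes(counts):
    """Sum the per-subfamily counts."""
    return sum(counts.values())


def ratio_to(species, subfamily):
    """Carp gene count divided by the count in the named species."""
    return carp_counts[subfamily] / teleost_counts[subfamily][species]


if __name__ == "__main__":
    print("total carp ABC genes:", total_genes(carp_counts))
    for subfamily in teleost_counts:
        print(subfamily, "carp/zebrafish ratio:",
              round(ratio_to("zebrafish", subfamily), 2))
```

The ratios make the contrast in the text explicit: ABCC and ABCD are expanded in common carp relative to zebrafish, whereas ABCB is reduced.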

Gene duplications and losses of ABC transporter genes in common carp
WGD is one of the main driving forces in the evolution of many vertebrates because it produces an enormous number of new genes with the potential for new functions. The high diversification of teleost fish is thought to correlate with the teleost-specific 3R WGD that took place in the common ancestor of all extant teleosts. As a result, teleost fish have two paralogous copies of many genes, while only one ortholog is present in tetrapods [30]. Common carp retained 100 chromosomes. It has been suggested that the 4R WGD may have occurred around 8.2 million years ago [31]. Analysis of microsatellite loci [32] and comparative analysis of the common carp and zebrafish linkage maps [33] have provided critical evidence in support of the 4R WGD in common carp.
In this study, we examined the copy numbers of ABC genes in several vertebrate genomes and inferred that duplicate copies of ABCA1, ABCA4, ABCB11, ABCC6, ABCC8, ABCD3, ABCF2, ABCG2, and ABCG4 derived from the 3R WGD were retained in the diploid teleosts (medaka and/or zebrafish) (Table 2). Lineage-specific duplication events generally result from segmental duplication, especially tandem duplication. Tandem duplication usually generates tandem genes or gene clusters on one chromosome or scaffold. Syntenic analysis of ABCG8 paralogs across all the vertebrates revealed that the genes were located on the same scaffold of the common carp genome, implying they were derived from tandem duplication rather than WGD (Fig 8). Moreover, we also found that ABCG5 was closely related to ABCG8 in all the surveyed species.
We found that the common carp ABCA1, ABCA5, ABCB5, ABCC2, ABCC4, ABCC5, ABCC9, ABCC12, ABCD4, ABCE1, and ABCG8 genes had undergone significant numbers of duplications compared with their orthologs in the closely related zebrafish genome, suggesting that these genes may be derived from the 4R WGD (Table 2). To test this, we constructed syntenic blocks of ABCD3a and found that the two copies of ABCD3a were located on two different scaffolds, which clearly demonstrated their WGD origin in common carp (Fig 9).
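The two lines of reasoning used here (copy number relative to zebrafish, and whether paralogs share a scaffold) can be expressed as a small sketch. This is a simplified heuristic written for illustration, not the procedure used in the study: the function names are hypothetical, the scaffold identifiers are placeholders, and the copy numbers are only those stated in the text.

```python
def location_pattern(copy_locations):
    """Label a set of paralogs by where they sit in the assembly.

    copy_locations maps a gene copy name to its scaffold or chromosome ID.
    """
    if len(copy_locations) < 2:
        return "single copy"
    if len(set(copy_locations.values())) == 1:
        return "same scaffold (consistent with tandem duplication)"
    return "different scaffolds (consistent with WGD)"


def extra_copies_vs_zebrafish(carp, zebrafish):
    """Genes with more copies in common carp than in zebrafish.

    Such genes are only candidates for 4R-derived duplicates; synteny is
    still needed to separate WGD from segmental duplication.
    """
    return sorted(g for g in carp if carp[g] > zebrafish.get(g, 0))


# Copy numbers stated in the text for common carp and zebrafish.
CARP = {"ABCA1": 4, "ABCB11": 2, "ABCC6": 3, "ABCC8": 2}
ZEBRAFISH = {"ABCA1": 2, "ABCB11": 2, "ABCC6": 3, "ABCC8": 3}

# Scaffold IDs below are placeholders, not real assembly identifiers.
ABCG8_COPIES = {"ABCG8-1": "scaffold_A", "ABCG8-2": "scaffold_A"}
ABCD3A_COPIES = {"ABCD3a-1": "scaffold_B", "ABCD3a-2": "scaffold_C"}

if __name__ == "__main__":
    print(extra_copies_vs_zebrafish(CARP, ZEBRAFISH))
    print("ABCG8:", location_pattern(ABCG8_COPIES))
    print("ABCD3a:", location_pattern(ABCD3A_COPIES))
```

A copy-number excess alone cannot distinguish WGD from lineage-specific duplication, which is why the location check is applied as a second step.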
Lineage-specific gene losses were also observed in common carp; indeed, ABCA6, ABCA8/9/10, ABCA13/14/15/16/17, ABCB1/2, TAP2, ABCC11, and ABCG3 genes were not found in any of the fish genomes (Table 2). These non-fish ABC genes must have appeared as a result of duplications that occurred after the split of tetrapods from teleost fishes. Among them, ABCA14/15/16/17 and ABCG3 were found only in mouse, ABCC11 was found only in human, and ABCA6 was found only in the mammalian species, while ABCA8 and TAP2 were found in all the surveyed species except the teleosts [34]. Interestingly, ABCC13 was found in the zebrafish and common carp genomes, although it was previously believed to be present only in the dog and macaque genomes [34].

Expression of ABC transporter genes in common carp
The unexpected expansion of the ABC transporter gene family that we detected in common carp raises the question of how many of these genes are actually expressed. The expression patterns of these genes, together with information about their orthologs in model species, should allow functional inferences to be made. We conducted RT-PCRs using gene-specific primers to examine the expression patterns of each of the ABC transporter genes in six common carp tissues. In general, most of the ABC transporter genes were widely expressed, but their expression levels differed among tissues.
ABCA subfamily. Nine of the ABCA transporter genes were ubiquitously expressed in the six common carp tissues tested, while ABCA2 and ABCA12 were expressed in only two tissues (Fig 10). The four ABCA1 genes were expressed in all the tissues; in human, ABCA1 has been reported to play a significant role in cholesterol efflux [35]. ABCA3 was highly expressed in gill; in human, its expression was found to be developmentally regulated, peaking prior to birth under the influence of steroids and transcription factors [36]. The two copies of ABCA5 had different expression levels in each of the tissues, especially in spleen and intestine, and ABCA5 has been reported to regulate amyloid-beta peptide production in human and mouse brain [37]. In addition, a mutation in the ABCA12 gene was shown to cause Harlequin ichthyosis [38]. ABCA4 was highly expressed in all of the common carp tissues except brain. In human, ABCA4 acts as a transporter of N-retinylidene-phosphatidylethanolamine (NrPE), and a lack of ABCA4 was reported to lead to the formation of lipid deposits in the macular region of the retina [39].
ABCB subfamily. The six common carp ABCB transporter genes were found to be widely expressed, although ABCB9 expression was very low in all six tissues tested (Fig 10). The two ABCB5 copies had different expression levels in each tissue, especially in gill. In a cell-based study, ABCB5 was shown to identify immunoregulatory dermal cells that suppressed T cell proliferation [40]. ABCB4 was relatively highly expressed in brain, spleen, and intestine. The two ABCB11 copies were expressed at average levels in the six tissues tested. A novel ABCB11 mutation was reported to be associated with benign recurrent intrahepatic cholestasis in human [41]. Moreover, some members of the plant ABCB subfamily have been found to display very high substrate specificity compared with mammalian ABCBs, which are often associated with multidrug resistance [1].
ABCC subfamily. The 19 common carp ABCC transporter genes were widely expressed, although the expression of ABCC2-1 and ABCC12-2 was very low (Fig 10). ABCC1 was highly expressed in all six tissues. In human, overexpression of ABCC1 was found to lead to multidrug resistance, especially during cancer and leukemia chemotherapy treatments [42]. The two ABCC2 copies were expressed differently in each of the tissues, especially in spleen and intestine. In a human cell-based study, ABCC2 was found to be responsible for the transport of conjugated bilirubin through the plasma membrane [43]. ABCC4 and ABCC5 had low expression levels while ABCC10 had high expression levels in the six tissues tested. The three ABCC6 copies were highly expressed in all six tissues. ABCC8 and ABCC9 are sulfonylurea receptors, which, in human, were found to be the molecular targets of the sulfonylurea class of anti-diabetic drugs [44]. In addition, highly prevalent point mutations in the chloride ion channel ABCC7 gene have been shown to cause cystic fibrosis [9]. We found that ABCC13 was highly expressed in common carp heart. In a previous study, an unusual truncated ABC transporter was reported to be highly expressed in fetal human liver [45].
ABCD subfamily. The eight common carp ABCD transporter genes were expressed in the six tissues tested, although their expression levels were relatively low (Fig 10). Members of the ABCD subfamily are thought to localize in the peroxisomal membrane, endoplasmic reticulum, or lysosomes [46]. In mammalian cells, peroxisomes are involved in a number of important metabolic pathways, including the α- and β-oxidation of fatty acids and the biosynthesis of phospholipids and bile acids. Substrates for β-oxidation enter peroxisomes via ABCD transporters and are activated by specific acyl-CoA synthetases for further metabolism [47]. Defects in mammalian ABCDs are also thought to be responsible for adrenoleukodystrophy, which is an X chromosome-linked disease [10].
ABCE/F subfamily. The six common carp ABCE/F transporter genes were widely expressed in all six tissues, although the expression of ABCF1-like and ABCF2a was slightly different, as they were expressed only in specific tissues (Fig 10). Because the ABCE and ABCF proteins contain a pair of NBDs but no TMDs, they are probably non-functional as transporters.
ABCG/H subfamily. Ten of the 11 common carp ABCG transporter genes had very weak expression in all six tissues; ABCG5, which was highly expressed in all tissues, was the exception (Fig 10). Similar to the multidrug resistance proteins ABCB1 and ABCC1, the multidrug resistance ABCG2 transporters have been shown to transport diverse therapeutic drugs [48]. A recent study revealed that ABCG2 showed enhanced expression in side population cells that exist around cancer stem cells [49]. The five ABCG2 copies had different expression levels in each of the common carp tissues tested, especially in heart and spleen, which implies a tissue-specific pattern of gene expression. The two ABCG4 copies were highly expressed only in brain. In plants, ABCG5 and ABCG8 transport sterols including cholesterol [11].

Ethics Statement
This study was approved by the Animal Care and Use Committee of the Centre for Applied Aquatic Genomics at the Chinese Academy of Fishery Sciences. The methods were carried out in accordance with approved guidelines. Adult common carp were collected from the Breeding Station of Henan Academy of Fishery Research, Zhengzhou, Henan province, China. Fish were euthanized by immersion in MS-222 solution. Tissue samples of brain, heart, gill, intestine, kidney, and spleen were collected from 10 individuals, immediately placed in 2 ml RNAlater (Qiagen, Hilden, Germany), and kept at -20°C until RNA extraction.

Identification of ABC transporter genes and homologs
All available ABC transporter gene sequences of zebrafish (Danio rerio) were downloaded from the Ensembl Zebrafish database (http://asia.ensembl.org/Danio_rerio/info/index). BLAST searches (with an E-value cutoff of 1e−5) were conducted against the whole genome sequences, annotated genes, and transcriptome contigs of common carp to obtain candidate ABC genes. Reciprocal BLAST searches were conducted to verify the candidate genes. Coding sequences were confirmed by BLAST searches against the NCBI non-redundant protein sequence database. Full-length translated amino acid sequences as well as the partial sequences coding for the conserved domains were used in the phylogenetic analyses. Full-length ABC protein sequences from other vertebrate species were retrieved from the Ensembl genome database (Release 82) for the phylogenetic analyses.
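One common way to carry out the reciprocal verification step is a reciprocal best hit filter over BLAST tabular output. The text does not specify how the reciprocal searches were evaluated, so the sketch below is an assumption made for illustration: it reads the standard 12-column tabular format (E-value in column 11, bit score in column 12), applies the 1e−5 cutoff, and keeps pairs that are each other's best hit. All sequence identifiers are made up.

```python
def best_hits(tabular_lines, max_evalue=1e-5):
    """Map each query to its highest-scoring subject among hits that pass
    the E-value cutoff. Input lines follow the 12-column tabular layout."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward, reverse):
    """Pairs (query, subject) that are each other's best hit."""
    return sorted((q, s) for q, s in forward.items() if reverse.get(s) == q)


def row(query, subject, evalue, bitscore):
    """Build a toy tabular line; the eight middle columns are placeholders."""
    return "\t".join([query, subject] + ["0"] * 8 + [evalue, bitscore])


# Toy data with hypothetical identifiers: zebrafish queries against carp,
# then carp queries back against zebrafish.
FORWARD = [
    row("dre_abcb4", "cca_gene1", "1e-150", "520"),
    row("dre_abcb4", "cca_gene2", "1e-20", "90"),
    row("dre_other", "cca_gene3", "0.5", "25"),   # fails the cutoff
]
REVERSE = [
    row("cca_gene1", "dre_abcb4", "1e-150", "520"),
    row("cca_gene2", "dre_abcb5", "1e-60", "210"),
]

if __name__ == "__main__":
    pairs = reciprocal_best_hits(best_hits(FORWARD), best_hits(REVERSE))
    print(pairs)
```

A strict best-hit rule keeps only one carp gene per zebrafish query, so in a tetraploid genome additional duplicated copies would still need to be recovered from the lower-ranked hits.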

Phylogenetic analysis of ABC transporter genes
Phylogenetic analysis can be used to support gene annotations, especially for non-model species. Reference ABC transporter genes from representative vertebrate model species were used for the phylogenetic analyses, including Homo sapiens (human), Mus musculus (mouse), Gallus gallus (chicken), Xenopus tropicalis (frog), Oryzias latipes (medaka), and Danio rerio (zebrafish). The amino acid sequences were aligned using ClustalW (http://www.clustal.org/clustal2) with the default parameters. Several neighbor-joining phylogenetic trees were constructed using Clustal Omega (http://www.ebi.ac.uk/Tools/msa/clustalo/) with default settings of Pairwise Alignment and Multiple Sequence Alignment. Each common carp ABC protein was assigned to a family based on the phylogenetic analysis with the human and zebrafish ABC protein sequences. Separate phylogenetic trees were constructed for each ABC subfamily using the same methodology with other representative vertebrate species.
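To illustrate the neighbor-joining criterion underlying these trees, the sketch below computes a simple p-distance between two aligned sequences and then selects the first pair of taxa to join from a distance matrix. It is a didactic sketch, not the Clustal implementation; the function names are hypothetical and the five-taxon matrix is a toy example rather than data from this study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def nj_pick_pair(names, dist):
    """Return the pair minimizing the neighbor-joining Q criterion:
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k).

    dist is a dict keyed by frozenset({i, j}) holding pairwise distances.
    """
    n = len(names)
    totals = {
        i: sum(dist[frozenset((i, k))] for k in names if k != i)
        for i in names
    }
    best_pair, best_q = None, None
    for i, j in combinations(names, 2):
        q = (n - 2) * dist[frozenset((i, j))] - totals[i] - totals[j]
        if best_q is None or q < best_q:
            best_pair, best_q = (i, j), q
    return best_pair, best_q


# Toy five-taxon distance matrix (illustrative values only).
NAMES = ["a", "b", "c", "d", "e"]
TOY = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
DIST = {frozenset(pair): d for pair, d in TOY.items()}

if __name__ == "__main__":
    print(p_distance("ACDE-", "ACEEF"))
    print(nj_pick_pair(NAMES, DIST))
```

Note that the pair chosen is not simply the closest one: d and e have the smallest raw distance, but the Q criterion corrects for each taxon's average distance to all others and joins a and b first.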

Nomenclature of the ABC transporter genes in common carp
The common carp ABC genes were named based on the topologies of the phylogenetic trees: the most closely related zebrafish ABC gene was assigned to each common carp ABC ortholog, and the carp gene was named after it (for instance, ABCB11a and ABCB11b). In addition, we annotated ABCs that had more than two copies in the common carp genome with the postscript "-1" or "-2" following the name of the zebrafish ortholog to reflect the fourth round (4R) of WGD [50] (for example, ABCA1a-1 and ABCA1a-2). Some common carp ABCs were also assigned the postscript "-like".

Syntenic analysis of ABC transporter genes
Syntenic analysis was performed on selected ABC genes in human, mouse, chicken, frog, medaka, zebrafish, and common carp chromosomes/scaffolds by identifying the positions of neighboring ABC genes. Syntenic maps were drawn based on the organization of the genes on the chromosomes of the model species from the Ensembl databases, and the gene organization of common carp was taken from the draft common carp genome assembly.

Expression of ABC transporter genes
Total RNA was extracted from six adult common carp tissues (brain, heart, spleen, kidney, intestine, and gill) using Trizol reagent (Life Technologies, NY, USA), and cDNA was synthesized by reverse transcription using a SuperScript III Synthesis System (Life Technologies). The β-actin gene was used as an internal positive control, with forward primer (5′-TGCAAAGCCGGATTCGCTGG-3′) and reverse primer (5′-AGTTGGTGACAATACCGTGC-3′). The PCR amplification comprised an initial denaturation step for 5 min at 95°C followed by 30 cycles of denaturation (30 sec at 94°C), annealing (30 sec at 60°C), and extension (20 sec at 72°C), and a final elongation step of 5 min at 72°C. The primers are listed in S2 Table. The PCR products were separated by gel electrophoresis (1% agarose gel at 150 V) in the presence of ethidium bromide and visualized under ultraviolet light.
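As a small worked check of the protocol described above, the following sketch computes the length and GC fraction of the two β-actin control primers and the programmed hold time of the PCR. It assumes the space inside the forward primer sequence is an extraction artifact and reads the primer as a continuous 20-mer; the function names are illustrative, and ramp times between temperatures are not included.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def program_seconds(initial, cycles, cycle_steps, final):
    """Total programmed hold time of a PCR run, in seconds."""
    return initial + cycles * sum(cycle_steps) + final


# Beta-actin control primers from the text (forward primer joined across
# the stray space, which is an assumption).
ACTIN_FWD = "TGCAAAGCCGGATTCGCTGG"
ACTIN_REV = "AGTTGGTGACAATACCGTGC"

# Cycling conditions from the text: 5 min, then 30 x (30 s, 30 s, 20 s),
# then 5 min.
TOTAL_SECONDS = program_seconds(
    initial=5 * 60, cycles=30, cycle_steps=(30, 30, 20), final=5 * 60
)

if __name__ == "__main__":
    for name, seq in (("forward", ACTIN_FWD), ("reverse", ACTIN_REV)):
        print(name, len(seq), "nt, GC fraction", gc_fraction(seq))
    print("programmed time:", TOTAL_SECONDS / 60, "min")
```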

Conclusions
In this study, we identified a total of 61 ABC transporter genes in the tetraploid common carp genome. Phylogenetic analysis and comparative genomic study provided a comprehensive understanding of the ABC gene family and its distribution in the genome. Both the 3R and 4R WGDs were inferred from the analysis. While the great majority of ABC transporters are well conserved through evolution, identification and phylogenetic analysis of the ABC transporters in common carp produced some interesting results. 1) Ten ABC transporter genes were duplicated in the teleost genomes, namely ABCA1, ABCA4, ABCB11, ABCC6, ABCC8, ABCD3, ABCF2, ABCG2, ABCG4, and ABCG8, indicating that a teleost-specific 3R WGD occurred in these fishes. 2) Twelve ABC transporter genes were duplicated in the common carp genome compared with the zebrafish genome, namely ABCA1, ABCA5, ABCB5, ABCC2/4/5/9/12, ABCD3/4, ABCE1, and ABCG8, which implies that a 4R WGD event occurred in common carp. 3) Fourteen ABC transporters have not yet been identified in any of the fish genomes studied, namely ABCA6/8/9/10/13/14/15/16/17, ABCB1, two copies of TAP, ABCC11, and ABCG3, suggesting that lineage-specific gene losses occurred in the teleost genomes. 4) ABCC13 was detected in the common carp and zebrafish genomes, although it has not yet been detected in other teleost fishes.
We also performed RT-PCRs to determine the expression patterns of the common carp ABC transporter genes. Most of the ABC genes were ubiquitously expressed in six tissues from common carp. The exceptions were the ABCG genes, most of which were weakly expressed. Different gene copies were differently expressed in some tissues, indicating tissue-specific gene functions may be present to some extent. However, the detailed functions of each of the genes need further study. This study provides essential genomic resources for future biochemical, toxicological, physiological and evolutionary studies in common carp.
Supporting Information S1