Testing the utility of DNA barcodes and a preliminary phylogenetic framework for Chinese freshwater mussels (Bivalvia: Unionidae) from the middle and lower Yangtze River

The middle and lower portions of the Yangtze River basin are the most species-rich region for freshwater mussels in Asia. The management and conservation of the taxa in this region have been greatly hampered by the lack of a well-developed phylogeny and species-level taxonomic framework. In this study, we tested the utility of two mitochondrial genes commonly used as DNA barcodes, the first subunit of the cytochrome c oxidase gene (COI) and the first subunit of the NADH dehydrogenase gene (ND1), for 34 putative species representing 15 genera, and also generated phylogenetic hypotheses for Chinese unionids based on the combined dataset of the two mitochondrial genes. The results showed that both loci performed well as barcodes for species identification, but the ND1 sequences provided better resolution than COI. Based on the two-locus dataset, Bayesian Inference (BI) and Maximum Likelihood (ML) phylogenetic analyses indicated that 3 of the 15 genera of Chinese freshwater mussels examined were polyphyletic. Additionally, the analyses placed 13 of the 15 genera into 3 subfamilies: Unioninae (Aculamprotula, Cuneopsis, Nodularia and Schistodesmus), Gonideninae (Lamprotula, Solenaia and Ptychorhychus) and Anodontinae (Cristaria, Arconaia, Acuticosta, Lanceolaria, Anemina and Sinoanodonta). Our results contradict the previous taxonomic classification that placed the genera Arconaia, Acuticosta and Lanceolaria in the Unioninae. This study represents one of the first attempts to develop a molecular phylogenetic framework for the Chinese members of the Unionidae and will provide a basis for future research on the evolution, ecology, and conservation of Chinese freshwater mussels.

Introduction
Freshwater mussels of the order Unionida comprise a significant proportion of the benthic biomass, and can have a significant influence on the community structure of the benthos [1]. As filter feeders, the Unionida provide important ecological functions and ecosystem services [2]. Current estimates indicate there are 840 species of freshwater mussels in the world, belonging to 6 families and 180 genera. Among them, the Unionidae is by far the most species-rich family with 620 species in 142 genera, and is widely distributed in North America, Eurasia, Central America, Africa and Southeast Asia [3][4].
Freshwater mussels are considered to be some of the most threatened freshwater taxa in the world [5][6]. The middle and lower portions of the Yangtze River basin in China are the most species-rich region for freshwater mussels in Asia [3,7], and include a number of endemic species [8][9]. According to surveys of species diversity conducted over the last ten years, up to 15 genera of Unionidae are represented in the middle and lower reaches of the Yangtze River [9][10]. However, in the past two decades, anthropogenic stressors, including habitat destruction and degradation, commercial exploitation, and water pollution, have negatively affected the survival and reproduction of many mussel species and have led to the imperilment of a number of endemic populations in the region [11][12].
Understanding phylogenetic diversity is crucial for conservation prioritization of freshwater mussels, but until recently, taxonomic and phylogenetic work in China has lagged relative to North America and Europe. The first and most important classification of the Chinese unionid fauna was attempted by Heude beginning in 1875 [13][14][15][16][17][18][19][20][21][22]. His collected works on Chinese taxa resulted in the classification of 140 species based on shell morphology. Over the past century, there has been substantial disagreement on the validity of species and the taxonomy of this group [8,23,24]. Since 1949, a number of faunal investigations were conducted that improved the accuracy of species ranges [25][26][27][28][29][30], but the classifications were still based on shell morphology alone. The inclusion of anatomical characters, such as the morphology of the marsupium and larval type [31][32][33][34][35], did improve the classification of Chinese taxa; however, these advancements were largely restricted to the genus and species-level classification of Chinese Unionidae.
Accurate identification of biological diversity is an important component in the conservation of species. One of the greatest barriers to the conservation of endangered species is our lack of knowledge of their existence. Application of molecular genetic tools has resulted in a dramatic increase in the amount of biodiversity recognized to date [36]. Molecular genetic markers have the potential to provide more objective and accurate characters for improving our understanding of the systematics and taxonomy, evolutionary history and genetic diversity of Chinese freshwater mussels [37]. The number of studies examining the phylogenetic relationships of Chinese freshwater mussels based on molecular data has increased recently [38][39][40][41][42][43][44][45][46][47][48][49]; however, most of these studies included only a limited number of genera, species, and specimens. Other studies have attempted to construct a phylogenetic framework for Chinese mussels [50][51], but were constrained by limited taxon sampling; thus, an integrated phylogenetic framework for Chinese taxa is still lacking.
DNA barcoding technology has proven to be a reliable tool in species identification and phylogenetic analysis [52]. The first subunit of the cytochrome c oxidase gene (COI) and the first subunit of the NADH dehydrogenase gene (ND1) are widely used in phylogenetic analysis, taxonomic identification and identification of cryptic species [51][52][53][54][55][56][57][58], but information on the usefulness of DNA barcodes for Chinese mussels is largely lacking. At this time, GenBank holds COI and/or ND1 sequences from only a dozen Chinese mussel taxa, which severely hinders our understanding of phylogenetic diversity and the monitoring of the fauna in China using environmental DNA (eDNA).
The middle and lower reaches of the Yangtze River in China represent one of the most species-rich regions for freshwater mussels on earth [7]. By sampling the tributary lakes and rivers in the middle and lower reaches of the Yangtze River, we were able to collect 34 putative species representing 15 genera of freshwater mussels. Our goals were to: (1) evaluate the efficacy of the mitochondrial COI and ND1 loci for DNA barcoding of Chinese freshwater mussels; and (2) begin to develop a modern phylogenetic framework for the Unionidae in China, thereby placing more Chinese species into a global taxonomic classification.

Ethics statement
All necessary permits were obtained for the described field studies from the Yangtze River Fishery Administration of China. The handling of mussels was conducted in accordance with the guidelines on the care and use of animals for scientific purposes set by the Institutional Animal Care and Use Committee (IACUC) of Nanchang University, Jiangxi, China.

Taxon sampling
Unionids were collected between 2014 and 2017 from a selection of lakes and tributaries in the middle and lower reaches of the Yangtze River, including Dongting Lake (Hunan Province, N: 28 16.00˚), Tai Lake (Jiangsu Province, N: 31.24˚; E: 120.23˚) and Hongze Lake (Jiangsu Province, N: 33.22˚; E: 118.68˚). Field-collected specimens were taken to the laboratory and identified to species based on the shell morphology. At present, there are three authoritative publications on the classification of Chinese freshwater bivalves [8,29,59]. However, the taxonomy of the Asian unionid fauna is continuing to evolve as more studies are published [7,41,49]. For this study, we based our identification of freshwater mussels on these publications. In addition, we made use of the MusselP website [60], which provided a global taxonomic framework and pictures of type specimens, which greatly facilitated identifications. Gender for each individual was determined by gonad smear [61]. All Chinese species collected and examined are listed in Table 1. A small sample of foot tissue was removed from each specimen and preserved in 96% ethanol for later DNA extraction. Voucher specimens representing the species included in this study were deposited in the Nanchang University Museum of Biology.

DNA extraction, amplification and sequencing
Whole genomic DNA was extracted from preserved foot tissue using the TIANGEN TIANamp Marine Animals DNA Kit (Tiangen Biotech, Beijing, China) according to the manufacturer's instructions. Polymerase chain reaction (PCR) primers for the two gene regions were COI (LCO1490/HCO2198) [62], and ND1 (Leu-uurF/LoGlyR) [63]. Thermal cycling conditions for both sets of primers were 98˚C for 10 s, followed by 35 cycles of 94˚C for 1 min, 50˚C for 1 min, 72˚C for 1-2 min, and a final extension of 72˚C for 7 min, following the TaKaRa Ex manufacturer's protocol. Amplified PCR products were purified and sequenced by Sangon Biotech (Shanghai). PCR product sizes for the COI and ND1 amplicons were 680 bp and 900 bp, respectively. The sequences obtained in this study have been uploaded to GenBank (Accession Numbers: MG933687-MG933805).

DNA barcoding dataset construction
The COI and ND1 sequences of all available Chinese freshwater mussel species were downloaded from NCBI GenBank and combined with the new sequences from this study. We used DNA Collapser (http://users-birc.au.dk/biopv/php/fabox/dnacollapser.php#) to identify unique haplotypes of COI and ND1 sequences for each species, and excluded any identical sequences (Table 1). As some of the specimens used in this study were obtained from GenBank, COI or ND1 sequences are missing for some specimens. In addition, the dataset analyzed also includes DNA sequences obtained from GenBank for representatives of major clades of the Unionidae, as determined in recent studies [49,64]. Margaritifera falcata and M. dahurica from the putative unionid sister family Margaritiferidae were selected as out-groups. All 57 species used in the phylogenetic analyses, including 34 Chinese species in this study, are listed in Table 1.
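The haplotype-collapsing step can be illustrated with a minimal Python sketch. It is not the DNA Collapser implementation; it simply groups exactly identical sequences within each species, and it assumes that sequences are compared after upper-casing and removing gap characters. The function name `collapse_haplotypes` and the toy records are hypothetical and are not data from this study.

```python
from collections import defaultdict


def collapse_haplotypes(records):
    """Group identical sequences within each species.

    records: iterable of (species, specimen_id, sequence) tuples.
    Returns {species: {sequence: [specimen_id, ...]}}.
    """
    haplotypes = defaultdict(lambda: defaultdict(list))
    for species, specimen_id, seq in records:
        # Assumption: case and alignment gaps are not informative, so they
        # are normalised before sequences are compared for identity.
        key = seq.upper().replace("-", "")
        haplotypes[species][key].append(specimen_id)
    return {sp: dict(haps) for sp, haps in haplotypes.items()}


# Hypothetical toy records (not data from this study).
records = [
    ("Species A", "A1", "ACGTAC"),
    ("Species A", "A2", "acgtac"),  # same haplotype as A1
    ("Species A", "A3", "ACGTTC"),  # a second haplotype
    ("Species B", "B1", "ACGTAC"),  # identical to A1, but a different species
]
unique = collapse_haplotypes(records)
for species, haps in unique.items():
    print(species, len(haps), "unique haplotype(s)")
```

Keeping the grouping per species means that a haplotype shared by two species is retained once for each of them, which matches the per-species wording of the text above.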

Phylogenetic analyses
The COI and ND1 nucleotide sequences were translated to amino acid sequences using MEGA 5.0 [65], and aligned based on the amino acid sequences using the program MUSCLE [66] with default settings. We calculated inter- and intraspecific distances for each data set with MEGA 5.0 using the Kimura-2-parameter model [67]. Standard error was assessed using 1000 bootstrap replicates. In addition, we constructed Maximum Likelihood (ML) trees based on codon position using the GTR+I+G model in MEGA 5.0 for the COI and ND1 datasets separately. The results of the ML analyses are shown in S1 and S2 Figs. Using SequenceMatrix [68], the two (COI and ND1) data sets were concatenated (1011 bp) for construction of phylogenetic trees. Prior to phylogenetic analysis of the combined dataset, a partition homogeneity test was carried out in PAUP* version 4.0b10 [69] to determine if significantly different signals were being generated by the COI and ND1 fragments. The partition homogeneity test indicated there was no significant difference in signals (P > 0.05), and the concatenated two-locus dataset was therefore considered suitable for phylogenetic reconstruction. For the combined dataset, a single scheme with 6 partitions was applied based on genes and codon positions. The best-fit models of nucleotide substitution under the corrected Akaike Information Criterion were selected using PartitionFinder v1.1.1 [70] for each partition in the subsequent analyses. A Bayesian topology was inferred for each dataset using MrBayes Version 2.01 [71]. The GTR+I+G model was used for the first and second COI and ND1 codon positions, while the GTR+I model was applied to third codon positions for both genes. Four chains were run simultaneously for 1 million generations and trees were sampled every 1000 generations, with a burn-in of 25%. Stationarity was considered to be reached when the average standard deviation of split frequencies was less than 0.01.
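For readers unfamiliar with the Kimura-2-parameter (K2P) model, the pairwise distance is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The following Python sketch illustrates that formula only; it is not the MEGA implementation. It assumes the two sequences are already aligned, and it skips sites with gaps or ambiguity codes, a pairwise-deletion choice made here for illustration. The function name `k2p_distance` and the example sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura-2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # (saturated) for the correction to be defined.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Hypothetical example: one transition (C->T) across 10 aligned sites.
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(round(d, 4))
```

With a single transition in 10 sites, P = 0.1 and Q = 0, so the distance reduces to -(1/2) ln(0.8), slightly larger than the uncorrected proportion of 0.1.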
The gene and codon site-based partitioned ML analyses were performed in RAxML implemented in raxmlGUI v.1.3 [72], using the GTRGAMMAI model of nucleotide substitution with the search strategy set to rapid bootstrapping.
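The six-partition scheme (two genes by three codon positions) can be written out as RAxML-style partition definitions. The Python sketch below is illustrative and rests on several assumptions not stated in the text: that COI precedes ND1 in the concatenated alignment, that each gene block begins at a first codon position, and that the block lengths are the aligned lengths reported in the Results (522 bp and 489 bp, summing to the 1011 bp concatenated length). The function name `codon_partitions` and the partition labels are hypothetical.

```python
def codon_partitions(genes):
    """Build RAxML-style partition lines, one per gene and codon position.

    genes: list of (name, aligned_length) in concatenation order.
    """
    lines = []
    start = 1
    for name, length in genes:
        end = start + length - 1
        for pos in range(3):
            # "start-end\3" selects every third site beginning at `start`.
            lines.append(f"DNA, {name}_pos{pos + 1} = {start + pos}-{end}\\3")
        start = end + 1
    return lines


# Assumed gene order and lengths (see the lead-in above).
partitions = codon_partitions([("COI", 522), ("ND1", 489)])
print("\n".join(partitions))
```

If the real alignment used a different gene order or reading-frame offset, only the input list and the starting offsets would need to change.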

Efficacy of both loci for DNA barcoding
Initial analysis resulted in 98 COI sequences representing 32 species and 85 ND1 sequences representing 34 species in this study (Table 2). The aligned COI and ND1 sequences had a total length of 522 bp and 489 bp, respectively. The final alignment of ND1 sequences was trimmed to the length of the shortest sequence in the final data set. For the COI locus, average intraspecific distances calculated by the Kimura-2-parameter model ranged from 0.002 to 0.027 (mean = 0.007; Table 2 and Fig 1A). Interspecific genetic distances ranged from 0.05 to 0.21, except for Anemina arcaeformis and A. globosula, which had a lower interspecific genetic distance of 0.005 (see S1 Table and Fig 1A). The average interspecific distances were 10 times larger than the average intraspecific distances (Fig 1A). COI therefore has excellent potential for species-level identification of unionids. For the ND1 locus, average intraspecific distances ranged from 0.002 to 0.024 (mean = 0.007; Table 2 and Fig 1B). The average interspecific distance of ND1 was more than 10 times the average intraspecific distance (Fig 1B). ND1 also showed excellent potential for species-level identification, as its sequences exhibited a larger difference between average intra- and interspecific distances than those of the COI gene.
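The comparison of intra- and interspecific distances can be sketched in Python as follows. This is a simplified illustration rather than the MEGA workflow used here: it pools all within-species pairs and all between-species pairs from a pairwise distance matrix, then reports the two means and their ratio. The function name `barcode_gap`, the labels, and the distance values are hypothetical and are not data from this study.

```python
from itertools import combinations
from statistics import mean


def barcode_gap(labels, dist):
    """Return (mean intraspecific, mean interspecific, ratio).

    labels: species label for each sequence.
    dist:   symmetric matrix of pairwise distances (list of lists).
    """
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        (intra if labels[i] == labels[j] else inter).append(dist[i][j])
    if not intra or not inter:
        raise ValueError("need both within- and between-species pairs")
    intra_mean, inter_mean = mean(intra), mean(inter)
    if intra_mean == 0:
        raise ValueError("mean intraspecific distance is zero")
    return intra_mean, inter_mean, inter_mean / intra_mean


# Hypothetical toy matrix for two species with two sequences each.
labels = ["A", "A", "B", "B"]
dist = [
    [0.00, 0.01, 0.10, 0.12],
    [0.01, 0.00, 0.11, 0.13],
    [0.10, 0.11, 0.00, 0.01],
    [0.12, 0.13, 0.01, 0.00],
]
intra_mean, inter_mean, ratio = barcode_gap(labels, dist)
print(f"intra={intra_mean:.3f} inter={inter_mean:.3f} ratio={ratio:.1f}")
```

A ratio of 10 or more corresponds to the interspecific mean being at least ten times the intraspecific mean, which is the comparison reported for both loci above.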

Phylogenetic analyses
The phylogenetic trees produced by Bayesian Inference (BI) and Maximum Likelihood (ML) converged on a completely consistent topology; therefore, only the BI tree is shown here (Fig 2). Phylogenetic analyses supported the monophyly of the Unioninae, Anodontinae and Gonideninae, and the sister relationship of the Unioninae and the Anodontinae; however, the Ambleminae was recovered as a polytomy. Several Chinese genera included were supported as monophyletic groups: Aculamprotula, Cuneopsis, Schistodesmus and Lamprotula, whereas others (Solenaia, Lanceolaria and Anemina) were not. Based on our results, Chinese genera included in the Unioninae are: Aculamprotula, Cuneopsis, Nodularia and Schistodesmus. The Anodontinae includes the following six Chinese genera: Cristaria, Arconaia, Acuticosta, Lanceolaria, Anemina and Sinoanodonta. The Gonideninae include Lamprotula, Solenaia and Ptychorhychus (Fig 2). Sinohyriopsis cumingii was recovered as the sister taxon to the monophyletic group (Anodontinae + Unioninae) with low posterior probability, whereas Lepidodesma languilati was recovered as paraphyletic to the Unioninae and the Anodontinae with high posterior probability.

Discussion
Traditionally, the taxonomic classification and identification of freshwater mussels has been mainly based on the comparative morphology of shells, especially in field surveys of freshwater mussel diversity. However, under different habitat conditions and environmental pressures the morphological characteristics of the shell, such as shell size, shape, and sculpture (e.g. ridges, bumps, and knobs) can vary significantly among populations within a species [73][74].
Using only morphological characters for unionid species identification has led to a risk of misidentification, synonymy, and a failure to describe some cryptic species [75][76]. With species as the key unit for measuring biodiversity, a failure to recognize species not only undermines biodiversity research, protection efforts and sustainable harvests and uses, but also affects biodiversity assessment efforts and the ability of resource managers to compare faunas among regions [77]. This study evaluated the utility of two mitochondrial genes widely used as DNA barcodes for 34 species of freshwater mussels in China, which not only lays a foundation for understanding the phylogenetic diversity of unionids, but also provides an important reference for assessing and comparing regional biodiversity. Furthermore, this dataset will facilitate future field surveys using eDNA and metabarcoding analyses [52] for assemblages of Chinese Unionidae. Hebert et al. [78] proposed that a 10× difference between intra- and interspecific distances is a desirable characteristic for a barcoding locus. Both loci used in this study satisfy this criterion and have excellent potential for species-level identification of Chinese unionids. The larger gap between the ranges of intra- and interspecific distances exhibited by the ND1 gene indicates that it is more sensitive and more accurate than COI, the proposed standard barcoding locus [79], and therefore has advantages over COI for species-level identification of unionids. A similar result was found in a study on North American unionids [80]. We also found that ND1 sequence data were generally easier to obtain than COI sequence data: the ND1 primers had a higher rate of successful amplification, and the resulting sequence data were generally of higher quality.
After 1949, Chinese malacologists conducted a number of faunal investigations, revised classifications, and in some cases described new species [25][26][27][28][29][30]. Based on shell sculpture and hinge tooth morphology, Liu et al. [29] divided Chinese unionids into two subfamilies, the Unioninae and the Anodontinae. However, shell characteristics are not as stable as the anatomical characters (e.g. arrangement of the marsupial demibranchs in which larvae are brooded and larval morphology) and are not recommended for diagnosing higher-level taxa among freshwater mussels [81][82][83][84]. Wu et al. [32] found that larval morphology differed between members of the Unioninae and Anodontinae. For most Unioninae species, the overall shape of the glochidia was described as widely triangular. The surface of the valves of glochidia from members of the Unioninae was imperforate and included some small depressions or fovea. In contrast, for all species of Anodontinae, the overall shape of the glochidia was elongately triangular, and the surface of the larval valves included numerous perforations. On the basis of these characteristics, the genera Arconaia, Lanceolaria and Acuticosta were placed into the Unioninae [29,32], which has been supported by later molecular studies [48,50,51].
Recently, however, Lopes-Lima et al. [49] placed the genera Arconaia and Lanceolaria into the Anodontinae based on phylogenetic analysis of COI and 28S rRNA sequences. Our study supports not only changing the subfamilial affinities of Arconaia and Lanceolaria to the Anodontinae, but also indicates with high confidence that the genus Acuticosta also belongs in the Anodontinae. Renewed examination of the anatomy of these genera is warranted to determine if characteristics that support the molecular classification presented here and in other publications can be identified.
The taxonomy of Chinese anodontines has been controversial for a long time. Chinese freshwater malacologists consistently used the generic name Anodonta in taxonomy, ecology and molecular biology, and considered all Chinese anodontines to be monophyletic. However, other research [9,85] indicated that the genus Anodonta sensu stricto was restricted to Western North America and Western Europe ranging as far east as Lake Baikal, and that Chinese anodontines were instead members of the genera Anemina and Sinoanodonta. Our study provides additional molecular evidence indicating that Chinese Anodonta sensu lato are polyphyletic, and supports dividing Chinese Anodonta into the genera Anemina and Sinoanodonta (see Figs 2, S1 and S2). Anemina angula is a Chinese endemic species, and it was placed in Anemina by Prozorova et al. [86] and Graf and Cummings [3] on the basis of morphological characters. Our results instead indicate it has closer affinities to Sinoanodonta.
The combined COI and ND1 dataset indicates that the Ambleminae is the sister group to the rest of the Unionidae. The phylogeny of Lopes-Lima et al. [49] indicated that the 4 subfamilies were divided into 2 branches: (Anodontinae + Unioninae) and (Ambleminae + Gonideninae). Bolotov et al. [64] considered the placement of the Pseudodontinae as a tribe within Gonideinae to be incorrect and proposed that they actually represent a separate subfamily. In general, however, their phylogenetic analyses supported the Unionidae phylogenetic framework established by Lopes-Lima et al. [49]. The subfamily-level phylogenetic relationships in our study differ from the above-mentioned results. In addition, our analyses were unable to place the genera Sinohyriopsis and Lepidodesma in the phylogeny with confidence. Some researchers have placed these genera into the Gonideninae and the Unioninae, respectively, based on the analysis of complete mitochondrial genomes [45,47].
For the purpose of recognizing and delimiting species in this study we are employing the monophyly version of the phylogenetic species concept [87][88][89]. In our DNA barcoding dataset for 34 freshwater mussel species, the interspecific genetic distance between Anemina arcaeformis and A. globosula (COI: d = 0.006; ND1: d = 0.006) was much smaller than those between other species (see S1 and S2 Tables). At the same time, they clustered with each other in an evolutionary lineage and formed a monophyletic group (S1 and S2 Tables and Fig 2). According to the phylogenetic species concept, our results do not support recognizing A. arcaeformis and A. globosula as distinct species. Data from nuclear gene sequences and morphological data are needed to corroborate the findings based on this mtDNA dataset.
DNA barcoding and modern molecular phylogenetic analyses of the Unionidae have had significant impacts on our understanding of the biogeography of this important family of freshwater bivalves. The pattern being revealed is one that includes both highly endemic subfamilies (e.g. Ambleminae in North and Central America) and some subfamilies that are found on several continents (e.g. Anodontinae and Gonideinae) [3,4,49]. China, and in particular the Yangtze Basin, is recognized as a region of high unionid diversity which includes a number of endemic taxa [7]. The majority of unionid taxa in China and the surrounding region have never been included in a molecular phylogenetic analysis [49]. In view of the generally imperiled conservation status of freshwater mussels in China [7,11,12], it is of crucial importance to develop a phylogenetic framework for Chinese taxa to assist with species delineation and with determining priorities for species conservation. This study provides an improved foundation for the systematics and taxonomy of Unionidae in China and serves as a reference for future studies of Chinese freshwater mussel diversity.
Supporting information
S1 Table. Interspecific distances of 32 putative species using COI loci. The lower left is the interspecific genetic distance; the upper right is the standard error. (DOCX)
S2 Table. Interspecific distances of 34 putative species using ND1 loci. The lower left is the interspecific genetic distance; the upper right is the standard error.