aTBP: A versatile tool for fish genotyping

Animal Tubulin-Based Polymorphism (aTBP), an intron length polymorphism method recently developed for vertebrate genotyping, has been successfully applied to the identification of several fish species. Here, we report data demonstrating that the aTBP method assigns a specific profile to each fish species, characterized by commonly shared amplicons together with additional intraspecific polymorphisms. Within each aTBP profile, some fragments can also be attributed to taxonomic ranks higher than species, e.g. genus and family. Its applicability across different taxonomic ranks, combined with a significant number of DNA polymorphisms, makes the aTBP method an additional and useful tool for fish genotyping, suitable for purposes such as species authentication, parental recognition and detection of allele variations in response to environmental changes.


Introduction
With approximately 35,000 described species, fishes account for about 50% of all vertebrates. Fish exhibit a great level of diversity, reflecting processes of adaptation to very different aquatic environments. High species number, significant morphological and genetic diversity, and environmental fitness underlie several important scientific issues. These include taxonomy and correct species identification, evolutionary biology and the assessment of variation and changes in allele frequencies, resilience and adaptability to extremely variable climate conditions, diversification and parental recognition, and the traceability of seafood. All these issues find in cellular DNA a common and effective target for investigation. In fact, cellular DNA can potentially be retrieved from any species and any kind of organic substrate, such as muscle, fin or blood, and DNA-based analyses can be applied to any of the issues just mentioned. Species identification is nowadays largely based on DNA barcoding, through the amplification and sequencing of some mitochondrial genes in which sufficient interspecies variation can be detected [1][2][3]. The fish section of the consortium for barcoding of life (http://www.boldsystems.org/ or https://ibol.org/) includes about 8,000 fish species and relies on the sequence of a 650 bp region of the mitochondrial gene cytochrome c oxidase I (COI). It represents an effective and comprehensive resource for the analysis of fishes and fish products [4,5]. More recently, for the purpose of tracing species in food matrices that contain low-quality DNA because of harsh food processing, minibarcodes (shorter fragments of the full-length DNA barcode, approximately 200 bp long) have been applied with some success.
Several minibarcode regions have been identified that allow for differentiation of a range of species, and these regions have also been tested in silico to differentiate commercially important salmon and trout species [6][7][8]. However, the classical DNA barcoding approach may encounter limits in the analysis of mixtures composed of multiple species, in the recognition of undeclared substitutions, especially with local varieties, in the availability of specific, known target sequences, and in the need for sequencing, with its related costs for data elaboration and instrument maintenance. Genomic DNA data are also very important for the conservation management of genetic resources and for the assessment of variations occurring in natural populations. These data provide a novel opportunity to investigate how populations have responded to changes, to identify the mechanisms underlying these changes, and to evaluate the adaptive potential and vulnerability of populations in the future. A recent and worrisome example concerns a 60% decline in the salmon populations of North America and Europe, clearly associated with warmer winter temperatures. Using single nucleotide polymorphisms (SNPs) as molecular tools, declining and near-to-decline populations have been identified [9]. These declining fish numbers are not only problematic for biodiversity; their loss also hampers our scientific understanding of key adaptation strategies that reveal molecular responses to life in cold conditions. In line with our present contribution, this is reminiscent of a well-known, early reported adaptation process in which microtubule polymerization at cold temperatures was explained by specific amino acid substitutions found in the α- and β-tubulin moieties [10,11].
In more general terms, the availability of a key functional marker is important for monitoring the effect of climate changes on population fitness. In this way, genomic screening can effectively assess population vulnerability. This has been successfully applied to salmon in a Canadian alpine environment, where the maintenance of an almost balanced population of red and white Chinook salmon (Oncorhynchus tshawytscha) has been associated with increased carotenoid synthesis and increased heterozygosity at the major histocompatibility complex loci [12,13], respectively. In addition, the reproduction system can obviously affect variation in natural populations, and thus the use of suitable molecular markers such as polymorphic microsatellite loci and COI can help in assigning parentage, in identifying hybridization events and in recording the breeding system [14,15].
As outlined above, different molecular markers may be utilized for different purposes. We therefore want to draw attention to a relatively new molecular marker, animal Tubulin Based Polymorphism (aTBP; [16]), which is sufficiently versatile to serve these many different purposes. Based on the natural occurrence of polymorphisms in the intron length and nucleotide composition of the β-tubulin genes, the approach may offer an attractive and workable alternative for the genetic identification of fish species, as well as subpopulations and local varieties, with no need for sequencing. Here, we present experimental evidence in favour of the use of aTBP for fish genotyping and discuss its possible applications.

Experimental samples
Total DNA extracts from the following fish species were provided by the Spallanzani Institute (Rivolta d'Adda, Italy): Sparus aurata, Dicentrarchus labrax, Oncorhynchus mykiss, Acipenser naccarii, Thunnus thynnus, Salmo carpio and Salmo trutta f. fario. These were obtained from different research projects in which the Institute has been involved: Competus-CRAFT-017633; Cobice-LIFE-04NAT/IT/000126; FP7-SME-2010-1-262523; FP4-FAIR989211; Salvacarpio-Regional project n. 1220; MIIPAF, Three-year plans for fishing and aquaculture-VI 2000-02. The DNA extracts were originally produced from fin-clipped samples using the semi-automatic BioSprint 96 DNA system (Qiagen) and the BioSprint 96 DNA Blood Kit (Qiagen), following the manufacturer's protocols. Fish species identification of these samples was performed with a panel of Simple Sequence Repeat (SSR) markers, as reported [17][18][19][20][21], with the exception of T. thynnus and O. mykiss. For each species, 15-20 samples were randomly chosen and used for the aTBP molecular analyses. The DNA samples identified by the prefix FT were instead provided by the Life Sciences Department of the University of Siena. These included 34 fish specimens, purchased frozen from local Tuscan markets, consisting of 6 specimens of Sparus aurata, 2 of Acipenser transmontanus, 6 of Thunnus albacares, 4 of Pangasius hypophthalmus, 8 of Salmo salar, 4 of Oncorhynchus mykiss, and 4 of Dicentrarchus labrax. Total DNA extractions were performed using the Wizard® SV Genomic DNA Purification System (Promega), following the manufacturer's instructions for animal tissues.
Fish species identification of the FT samples was obtained by DNA sequencing of the fragments amplified with the following universal primers: 5'-TCAACYAATCAYAAAGATATYGGCAC-3' for the forward and 5'-ACTTCYGGGTGRCCRAARAATCA-3' for the reverse, known to target a conserved portion of the COI gene [1][2][3]. DNA sequencing was performed on both strands and the sequences were matched to each other. Unaligned and aligned COI sequences are provided in S1 Data.
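The degenerate positions in the two COI primers follow the IUPAC nucleotide code (Y = C or T, R = A or G). As a minimal illustrative sketch, not part of the workflow used in this study, the following Python snippet expands a degenerate primer into a regular expression and counts the distinct oligonucleotides it encodes. The function names and the template fragment are hypothetical, and the primer sequences are written without internal spaces.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_to_regex(primer):
    """Turn a degenerate primer into a compiled regular expression."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))

def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by the primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

COI_FORWARD = "TCAACYAATCAYAAAGATATYGGCAC"
COI_REVERSE = "ACTTCYGGGTGRCCRAARAATCA"

# Hypothetical template fragment carrying one concrete forward-primer variant.
template = "GG" + "TCAACCAATCATAAAGATATTGGCAC" + "CCTT"
match = primer_to_regex(COI_FORWARD).search(template)
print(degeneracy(COI_FORWARD), degeneracy(COI_REVERSE), match.start())
```

A search of this kind only locates exact matches to one of the encoded variants; it does not model mismatches tolerated during real PCR annealing.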

aTBP amplification and capillary electrophoresis
For each total DNA sample, previously characterized either by SSRs or COI, 30 ng were used as template for aTBP PCR amplification. PCR conditions and primer sequences for amplification of intron III (aFex3.2 and aRex3.2) have been recently reported [16]. The forward primer was labeled at the 5' position as described in [22]. Two negative controls (no template) were always included in each PCR run, and all PCR amplifications were repeated at least twice to check the consistency of the amplification profile. As a preliminary check, 4 μL of each PCR reaction was loaded on a 2% agarose gel, stained with Atlas Clear Sight DNA Stain (1 μg mL⁻¹) (Bioatlas) and compared to the GeneRuler™ 1 kb Plus ladder as reference, to verify the intensity of the amplification signal and to determine the appropriate dilutions for amplicon resolution by capillary electrophoresis (CE). Then, 2 μL of each diluted sample was mixed with 0.2 μL of 1200 LIZ Size Standard and 17.8 μL of Hi-Di formamide to a final volume of 20 μL. Samples were denatured at 95˚C for 5 min and, after cooling to -20˚C, were loaded onto the ABI 3500 Genetic Analyzer (Thermo Fisher Scientific) for CE separation following the running protocol described by [23].

Data analysis
The amplicon resolution data were collected using the Data Collection Software v. 3.1 (Thermo Fisher Scientific) and then analyzed with the GeneMapper Software v. 5.0 (Thermo Fisher Scientific). The numerical output of the ABI 3500 analyzer was converted into an Excel spreadsheet, which allows each specific amplicon profile to be associated with each fish species. At least two different electrophoretic runs were performed for each amplified product in order to confirm the aTBP profile. The PCA was carried out with the Past3 software [24] on a presence-absence matrix obtained from the scoring of the aTBP markers.
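To make the notion of a presence-absence matrix concrete, the following Python sketch shows one way such a matrix could be assembled from scored profiles. It is an assumption-laden illustration: the function name, the sample labels and the fragment sizes are hypothetical and do not reproduce the spreadsheet actually used or the data in S2 Table.

```python
def presence_absence_matrix(profiles):
    """Build a 0/1 matrix (rows: samples, columns: fragment sizes in bp)."""
    markers = sorted({size for sizes in profiles.values() for size in sizes})
    samples = sorted(profiles)
    matrix = [[1 if m in profiles[s] else 0 for m in markers] for s in samples]
    return samples, markers, matrix

# Hypothetical scored profiles: sample label -> set of fragment sizes (bp).
profiles = {
    "sample_1": {220, 228, 259},
    "sample_2": {220, 228, 312},
    "sample_3": {220, 259},
}

samples, markers, matrix = presence_absence_matrix(profiles)
print("markers:", markers)
for name, row in zip(samples, matrix):
    print(name, row)
```

A table of this shape, with one row per sample and one column per scored fragment, is the kind of input from which a PCA can then be computed in a dedicated package.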

Results
As shown in Fig 1, the ability of the aTBP method to discriminate among animal species is based on variation in the length of intron III, which is commonly found in the members of the animal β-tubulin gene family, whose number may itself differ. Therefore, the same pair of primers, conveniently located at the boundaries of the third intron, amplifies in a typical PCR reaction a group of fragments that can vary in number and length in each analyzed species. Once resolved in a capillary electrophoresis system, they define a species-specific DNA code. The separation resolution is such that peaks/fragments differing by just 1-2 bp can be recognized. Each peak of the electropherogram is defined by a size, expressed in bp, and by a height, expressed in Relative Fluorescence Units (RFU).
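Because each peak is described by a size (bp) and a height (RFU), and each profile is confirmed in at least two electrophoretic runs, peak scoring can be sketched as a threshold step followed by a replicate comparison. The Python snippet below is only an illustration: the 100 RFU threshold, the 1 bp tolerance, the function names and all peak values are assumptions, not parameters reported in this work.

```python
def call_peaks(peaks, min_rfu=100.0):
    """Keep peaks at or above an RFU threshold; sizes are rounded to whole bp."""
    return sorted({round(size) for size, height in peaks if height >= min_rfu})

def confirmed_peaks(run_a, run_b, tolerance_bp=1):
    """Return the peaks of run_a that have a partner in run_b within the tolerance."""
    return [a for a in run_a if any(abs(a - b) <= tolerance_bp for b in run_b)]

# Hypothetical electropherogram output: (size in bp, height in RFU).
run1 = [(219.8, 1500.0), (228.1, 900.0), (259.3, 40.0), (330.2, 2100.0)]
run2 = [(220.2, 1400.0), (228.4, 850.0), (301.0, 300.0), (329.9, 2000.0)]

profile = confirmed_peaks(call_peaks(run1), call_peaks(run2))
print(profile)
```

In this toy case the low-intensity peak near 259 bp and the peak seen in only one run near 301 bp are both excluded from the confirmed profile.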
A paradigmatic example of the level of information that is retrievable by aTBP, when applied to individuals of the same species, is provided in Table 1 with reference to gilthead seabream (S. aurata). Sixteen different individuals coming from aquaculture, already characterized for their morphological traits and with a panel of SSR primers, were analyzed with aTBP together with 6 individuals of the same species purchased in the market and classified by COI barcoding. The data reveal the presence of five amplified fragments that are shared among all of the analyzed samples (grey columns in Table 1). In addition, a widespread intraspecific variation was identified, characterized in both subgroups by either missing or supplementary amplicons that likely correspond to allelic variations. It is worth noting that, with the exception of the 312 bp long fragment amplified from the DNA extracted from three individuals of the Spallanzani group of specimens, the DNA polymorphisms detected at the intraspecific level are present in both of the analyzed groups, likely reflecting ongoing variation in the general gilthead seabream population.
A similar situation was found when analysing 16 samples of the Adriatic sturgeon (A. naccarii). Once more, commonly shared amplicons were found together with intraspecific polymorphisms, as shown in the upper panel of Table 2. Quite remarkably, one of these samples (A10) showed a very different pattern of aTBP amplification, perfectly matching that found in the two available samples of white sturgeon (A. transmontanus). This reassignment is fully consistent with data previously obtained on the same experimental group with a panel of SSR markers [19].
Data reported in Table 3 underscore the suitability of the aTBP method for discriminating between two important and widely commercialized tuna species: red tuna (Thunnus thynnus) and yellowfin tuna (Thunnus albacares). The two tuna species show commonly shared aTBP fragments, referable to their genus, together with species-specific fragments: two, of 255 bp and 778 bp, in yellowfin tuna and one, of 282 bp, in red tuna. Once more, both groups are further characterized by additional intraspecific polymorphic fragments that may or may not be shared between the two species.
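The distinction between commonly shared amplicons and intraspecific polymorphisms can be expressed as a simple count over individuals. The Python sketch below is illustrative only: the function name and the individual profiles are hypothetical and are not the values of Table 1 (only the 312 bp size is borrowed from the text as a label).

```python
from collections import Counter

def split_core_and_polymorphic(profiles):
    """Separate fragments found in every individual from those found in only some."""
    counts = Counter(size for sizes in profiles.values() for size in set(sizes))
    n = len(profiles)
    core = sorted(size for size, c in counts.items() if c == n)
    # For polymorphic fragments, report the fraction of individuals carrying them.
    polymorphic = {size: c / n for size, c in sorted(counts.items()) if c < n}
    return core, polymorphic

# Hypothetical profiles of four individuals of one species (fragment sizes in bp).
individuals = {
    "ind_1": [200, 240, 312],
    "ind_2": [200, 240],
    "ind_3": [200, 240, 275],
    "ind_4": [200, 240, 275],
}

core, polymorphic = split_core_and_polymorphic(individuals)
print("shared by all:", core)
print("polymorphic (fraction of individuals):", polymorphic)
```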
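A reassignment such as that of sample A10 amounts to asking which reference profile a sample resembles most. One possible way to formalize this, offered here as an assumption rather than as the procedure followed in this study, is the Jaccard similarity between sets of fragment sizes. All sizes below are hypothetical placeholders and not the values of Table 2.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of fragment sizes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_match(sample, references):
    """Return the reference name with the highest similarity, plus all scores."""
    scores = {name: jaccard(sample, ref) for name, ref in references.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical reference profiles (fragment sizes in bp).
references = {
    "A. naccarii": {210, 245, 300, 350},
    "A. transmontanus": {215, 260, 300, 390},
}
# Hypothetical profile of a sample of uncertain origin.
sample = {215, 260, 300, 390, 402}

best, scores = best_match(sample, references)
print(best, scores)
```

A score of this kind treats all fragments equally; giving more weight to species-specific peaks than to intraspecific polymorphisms would be a natural refinement.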
Table 2. aTBP profile of sturgeon (Acipenser spp.).

The ability of aTBP to easily discriminate among different fish species, revealed by the data just presented, motivated us to verify whether the method could be used as a simple way to detect fraud and substitutions, which are frequently reported, and on a vast scale, in the fisheries market [25].
For this purpose, we analyzed and compared the aTBP profiles of two fish species, pangasius (P. hypophthalmus) and European seabass (D. labrax), because the latter is often replaced by the former when commercialized as fillets or canned food. Fig 2 readily shows that the aTBP profiles of the two species are completely different from each other: not a single species-specific amplified fragment is shared between them. As shown, in case of a suspected substitution, this difference can be conveniently revealed by a simple electrophoresis run of the amplified fragments in an agarose gel.
The Salmonidae are a particularly relevant fish family, often studied with reference to multiple important issues such as variations in response to climate changes, reproductive habits and parentage recognition, species-specific metabolic features and, of course, market traceability. Table 4 shows the data obtained by applying the aTBP method to individuals of four different salmonid species: carpione trout (Salmo carpio), an endemic species of Lake Garda in Italy, brown trout fario (Salmo trutta f. fario), Atlantic salmon (Salmo salar) and rainbow trout (Oncorhynchus mykiss), which belong to two genera of the family and live in natural environments as different as ocean and fresh water. With the premise that intraspecific polymorphisms, also present in these groups, have been reduced to a minimum in order to produce a table of immediately appreciable results, Table 4 delivers several useful pieces of information. First, the amplified fragments can be individually assigned to different taxonomic ranks, starting with the 220 bp long amplicon, which is attributable to the Salmonidae family since it is present in all the samples we have analyzed. The three species belonging to the Salmo genus also share five common aTBP amplified fragments (219, 228, 259, 289 and 330 bp), while each single species is characterized by a small yet variable number of clearly specific amplification products, shown in the dark grey columns of Table 4. Additional similarities, such as those between carpione trout and brown trout fario, are notable (boxed columns).
The similarity between these two species, indicating their more recent separation, was further confirmed by the Principal Component Analysis (PCA) of Fig 3, in which the four salmonid species are distributed along three major directions, with the first two principal components together explaining 76% of the total variance. The complete data set used for PCA is provided in S2 Table. Overall, the data shown indicate that the aTBP method can be easily and conveniently used to monitor variations occurring at different taxonomic ranks, providing a useful and very versatile tool for different kinds of investigations.
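The attribution of fragments to family, genus and species ranks can be sketched as a series of set operations. In the Python example below, the 220 bp family-level fragment and the five Salmo-level fragments are taken from the text, whereas the species-specific sizes are hypothetical placeholders and not the values of Table 4; the function name is likewise an assumption.

```python
def fragments_by_rank(species_profiles, genus_of):
    """Assign fragments to family, genus or species level by set operations."""
    all_species = list(species_profiles)
    # Family level: present in every species analyzed.
    family = set.intersection(*(set(p) for p in species_profiles.values()))
    genus_level = {}
    for genus in sorted(set(genus_of.values())):
        members = [s for s in all_species if genus_of[s] == genus]
        if len(members) < 2:
            continue  # one species cannot separate genus level from species level
        shared = set.intersection(*(set(species_profiles[s]) for s in members))
        genus_level[genus] = sorted(shared - family)
    species_level = {}
    for s in all_species:
        others = set().union(*(set(species_profiles[o]) for o in all_species if o != s))
        species_level[s] = sorted(set(species_profiles[s]) - others)
    return sorted(family), genus_level, species_level

salmo_shared = [219, 228, 259, 289, 330]   # from the text
profiles = {                                # last values are hypothetical
    "Salmo carpio": [220] + salmo_shared + [301],
    "Salmo trutta": [220] + salmo_shared + [344],
    "Salmo salar": [220] + salmo_shared + [268],
    "Oncorhynchus mykiss": [220, 275, 410],
}
genus_of = {s: s.split()[0] for s in profiles}

family, genus_level, species_level = fragments_by_rank(profiles, genus_of)
print(family, genus_level, species_level)
```

Fragments shared by only some species of a genus, such as those common to carpione trout and brown trout fario, fall into none of the three categories in this simplified scheme and would need to be reported separately.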

Discussion
This paper presents evidence in favour of the use of the aTBP method for the genetic characterization of fish at different taxonomic levels and for different purposes. We have demonstrated that, using a single PCR-based reaction with the same pair of primers, the aTBP method can amplify from the genome of any fish sample a number of fragments that delineate a specific DNA profile, or barcode. The aTBP amplification products of a single barcode can then be sequentially attributed to the family, genus, species and subspecies categories. In essence, aTBP adds to the two fundamental features of an ideal DNA barcode: high taxonomic coverage and high interspecific resolution. Thus, with aTBP, the recognition of subspecies polymorphisms becomes simpler and more efficient, providing immediate data with no need for sequencing or prior knowledge of the target sequences. The discrimination power of the aTBP genomic profiling method is also shown to be unaffected by ploidy, since sturgeon and salmonid species, which are known polyploids [26,27], can be easily distinguished. In fact, the two sturgeon species we have analysed, A. naccarii and A. transmontanus, are natural octaploids with 240-264 chromosomes. Owing to the high level of fragment resolution granted by CE (1-2 bp), aTBP is expected to perform well also in the presence of higher ploidy and chromosome numbers. Problems may arise in the reading of the electropherogram output, which can become complex when numerous peaks are present. Software that can help in the fast recognition of the output is presently under development. Finally, aTBP is a functional and nuclear-based molecular marker. All these features may offer new opportunities to studies performed in diverse fields of investigation.
The exception is molecular taxonomy, where a long-standing, well-established, rapidly adopted and internationally supported method based on the sequencing of the mitochondrial COI gene has led to the deposition of more than 80,000 barcoding sequences corresponding to approximately 8,000 different fish species. Nevertheless, as also shown in this paper, since aTBP substantially confirms COI data, it may be useful when a COI-based species assignment is uncertain because it relies on minimal SNP differences.
That said, the use of aTBP for the identification, authentication and detection of fish species in food samples is quite appropriate, and particularly suitable for all those laboratories that are not equipped with demanding sequencing facilities. Like classical DNA barcoding, aTBP can be applied to a high number of species, characterized by a large spectrum of variation. Unlike classical DNA barcoding, the aTBP primers are effective independently of the taxonomic rank, while COI primers must often be optimized for successful use at ranks higher than species. In addition, aTBP can be used for detecting subspecies populations and local varieties. In any case, both aTBP and classical DNA barcoding are particularly suitable for seafood traceability, especially when transformation processes make morphological inspection impossible, as for fillets, frozen and canned foods, thereby fostering frauds and substitutions. These irregularities could be easily uncovered by the detection of the aTBP species-specific diagnostic peaks as well as by the visualization, even in a very simple agarose gel, of very diverse patterns of amplification, as shown here for pangasius versus seabass (Fig 2). aTBP can also be of help for assessing variation in a natural population, a major goal in the field of evolutionary biology. In this regard, it is of interest to highlight the finding of a hierarchical distribution that assigns specific aTBP amplification fragments to different taxonomic ranks, as observed in Thunnus, Acipenser and Salmonidae. It appears that evolution has left molecular traces of its action in the introns of tubulin, from family down to species, and the presence of intraspecific subpopulations, characterized by the sharing of a few polymorphisms, promises to be a renewed handle for monitoring future evolution.
Since these intraspecific changes in allele frequency can be easily scored, they provide useful information on the overall structure of populations with respect to vulnerability, or resilience, in response to environmental changes and natural selection constraints. Unique responses are often associated with mutations in genomic regions related to metabolic, developmental, immunogenic and physiological processes. aTBP genomic profiling is based on a functional marker, tubulin, which has long been related to cold response because of the identification of cold-inducible promoters and amino acid changes exclusively present in the α- and β-tubulin moieties of Antarctic fishes. Thus, it is reasonable to consider aTBP genomic profiling a useful tool that can further our understanding of changes in fish genotypes and variations in population fitness.
Another field of possible and useful application of the aTBP method is its potential contribution to our understanding of the role that natural or anthropogenic hybridization and sexual competition play in genetic diversity, including breeding between native and introduced species. For example, aTBP could be used for identifying preferential occupation of spawning grounds by a given species, as well as for recognition of the breeding system and parental assignment. Since aTBP is a nuclear-based codominant marker, its use may favor the recognition of hybrids already in the F1 generation, rather than in the F2 populations as is commonly the case with the mitochondrial, maternally inherited COI gene. In summary, understanding the processes underlying diversification can aid in formulating appropriate conservation management plans that will help to maintain the evolutionary potential of taxa, particularly under human-induced activities and climate changes.
In practical terms, aTBP is a simple and quick technique, based on a single PCR reaction and the resolution of the amplified fragments by electrophoresis, which may take a few hours for an easy recognition on an agarose gel. Several samples can be analyzed concomitantly, 24 a day in our experience, providing consistent and reproducible genomic profiles that assist in the characterization of the genetic variation of the investigated species. A possible further improvement could be obtained by combining aTBP amplification with High Resolution Melting, as recently done for a combination of different plant DNA barcodes [28]. Also, efforts are in place to establish a practical aTBP database with the help of institutions and fishery companies. In conclusion, aTBP should be considered a valuable new tool for genetic investigation in fish because of its simplicity of use, good cost-effectiveness, usefulness in different fields of application and wide taxonomic coverage.
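The reasoning behind F1 hybrid recognition with a codominant nuclear marker is that a first-generation hybrid is expected to carry diagnostic fragments from both parental species. The Python sketch below illustrates this logic under explicit assumptions: diagnostic fragments are taken to be fixed within each parental species, the function name and all fragment sizes are hypothetical, and no hybrid samples were analyzed in this study.

```python
def classify(sample, diagnostic_a, diagnostic_b):
    """Classify a profile by the presence of species-diagnostic fragments."""
    sample = set(sample)
    has_a = bool(sample & set(diagnostic_a))
    has_b = bool(sample & set(diagnostic_b))
    if has_a and has_b:
        return "putative hybrid"
    if has_a:
        return "species A"
    if has_b:
        return "species B"
    return "undetermined"

# Hypothetical diagnostic fragments (bp) for two parental species.
DIAGNOSTIC_A = {301, 344}
DIAGNOSTIC_B = {268}

print(classify([220, 301, 344], DIAGNOSTIC_A, DIAGNOSTIC_B))
print(classify([220, 268], DIAGNOSTIC_A, DIAGNOSTIC_B))
print(classify([220, 301, 268], DIAGNOSTIC_A, DIAGNOSTIC_B))
```

A maternally inherited mitochondrial marker would assign the third profile to the maternal species only, which is why a nuclear codominant marker may be more informative at the F1 stage.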
Supporting information S1