Genome-Wide Survey and Analysis of Microsatellite Sequences in Bovid Species

Microsatellites, or simple sequence repeats (SSRs), are ubiquitously distributed in many eukaryotic and prokaryotic genomes and have become the most popular source of genetic markers. This is the first study examining and comparing SSRs in completely sequenced genomes of the Bovidae. We analyzed and compared the number of SSRs, relative abundance, relative density, guanine-cytosine (GC) content and proportion of SSRs in six taxonomically different bovid species: Bos taurus, Bubalus bubalis, Bos mutus, Ovis aries, Capra hircus, and Pantholops hodgsonii. Based on our search criteria, the total number of perfect SSRs ranged from 663,079 to 806,907 and covered from 0.44% to 0.48% of the bovid genomes. Relative abundance and density of SSRs in these bovid genomes were not significantly correlated with genome size (Pearson, r < 0.420, p > 0.05). Perfect mononucleotide SSRs were the most abundant, followed by the pattern: perfect di- > tri- > penta- > tetra- > hexanucleotide SSRs. Generally, the number of SSRs, relative abundance, and relative density of SSRs decreased as the motif repeat length increased in each species of Bovidae. GC-content was highest in trinucleotide SSRs and lowest in mononucleotide SSRs in the six bovid genomes. The GC-contents of tri- and pentanucleotide SSRs showed a great deal of similarity among different chromosomes of B. taurus, O. aries, and C. hircus. The SSR number of all chromosomes in B. taurus, O. aries, and C. hircus was closely positively correlated with chromosome sequence size (Pearson, r > 0.980, p < 0.01) and significantly negatively correlated with GC-content (Pearson, r < -0.638, p < 0.01). Relative abundance and density of SSRs in all chromosomes of the three species were significantly negatively correlated with GC-content (Pearson, r < -0.333, p < 0.05) but not significantly correlated with chromosome sequence size (Pearson, r < -0.185, p > 0.05). Relative abundances of the same nucleotide SSR type showed great similarity among different chromosomes of B. taurus, O. aries, and C. hircus.


Introduction
Information). So we selected these six genome sequences as samples to analyze the SSR distributions at the genomic level. All genome sequences were downloaded in FASTA format from GenBank (http://www.ncbi.nlm.nih.gov). The species, genome size, GC-content, and other details are summarized in Table 1.

SSRs identification and investigation
SSRs were identified and localized using the software MSDB (Microsatellite Search and Building Database), downloaded from https://code.google.com/p/msdb/ [18]. MSDB is a Perl program that provides a user-friendly interface for identifying SSRs in complete genome sequences and building databases of them. SSRs can be grouped into six categories: (1) pure or perfect (P) SSRs, (2) interrupted perfect (IP) SSRs, (3) compound (CD) SSRs, (4) interrupted compound (ICD) SSRs, (5) complex (CX) SSRs, and (6) interrupted complex (ICX) SSRs [19][20]. MSDB has two search modes: a 'perfect search mode', which searches for perfect (pure) SSRs, and an 'imperfect search mode', which searches for all six categories of SSRs mentioned above [18]. When searching a sequence for perfect SSRs, the minimum repeat number is an important criterion. Since bovid species have very large genomes, relatively systematic search criteria were adopted in this study: the minimum repeat numbers were set to 12, 7, 5, 4, 4, 4 for mono-, di-, tri-, tetra-, penta-, and hexanucleotide SSRs, respectively [18]. The maximum distance allowed between any two SSRs (dMAX) was 10 bp; other parameters were left at their defaults. In this study, repeats whose unit patterns are circular permutations and/or reverse complements of each other were grouped together as one type for statistical analysis [21][22]. For example, ACT denotes ACT, CTA, and TAC in different reading frames, and AGT, GTA, and TAG on the complementary strand (read 5' to 3'). For tetranucleotide and hexanucleotide repeats, combinations representing perfect di- and trinucleotide repeats were filtered from the final counts; for example, (ACAC)9 was counted as the dinucleotide repeat (AC)18 and not as a tetranucleotide repeat. Surveying these categories of SSRs together gives a better understanding of the total occurrence of SSRs, and their genomic locations will be very useful for selecting SSRs that represent similar repeat classes from different genomic locations as potential markers.
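The grouping and filtering rules described above can be illustrated with a minimal Python sketch. It is not MSDB (which is a Perl program) and makes no claim to reproduce MSDB's behaviour; the function names are hypothetical, the reverse complement is read 5' to 3', and only the minimum repeat numbers (12, 7, 5, 4, 4, 4) are taken from the text.

```python
import re

# Minimum repeat numbers quoted in the text for mono- to hexanucleotide SSRs.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Reverse complement of a DNA motif, read 5' to 3'."""
    return motif.translate(_COMPLEMENT)[::-1]


def canonical_motif(motif):
    """Smallest string among all rotations of the motif and of its reverse complement."""
    candidates = []
    for m in (motif, reverse_complement(motif)):
        candidates.extend(m[i:] + m[:i] for i in range(len(m)))
    return min(candidates)


def is_reducible(motif):
    """True if the motif is itself a repeat of a shorter unit, e.g. ACAC = (AC)2."""
    n = len(motif)
    return any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_perfect_ssrs(seq):
    """Yield (start, motif, repeat_count) for perfect SSRs in a DNA string."""
    seq = seq.upper()
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_reducible(motif):
                continue  # counted under the shorter unit instead
            yield match.start(), motif, len(match.group(0)) // unit_len


if __name__ == "__main__":
    print(canonical_motif("TAG"))                # same class as ACT
    print(list(find_perfect_ssrs("AC" * 18)))    # reported once, as a dinucleotide
```

In this sketch, a tract such as (ACAC)9 matches the tetranucleotide pattern but is skipped because its unit reduces to AC, so it is reported only as (AC)18.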
To facilitate comparison among different repeat categories or motifs, we used relative abundance, defined as the number of SSRs per Mb of the sequence analyzed, and relative density, defined as the total length (in bp) of SSRs per Mb of the sequence analyzed [18,23]. Normalizing the total numbers in this way allows comparison among genome sequences of different sizes. The relative abundance and density on each chromosome were calculated by dividing the number and total length, respectively, of SSRs of each nucleotide repeat type by the chromosome sequence length in Mb. Primer pairs for the identified SSR loci were designed using the Primer 3 software implemented in MSDB with default parameters.
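The two normalized measures defined above amount to simple ratios. The following Python sketch shows one way to compute them; the function names and the input numbers in the example are hypothetical and are not taken from the study's tables.

```python
def relative_abundance(ssr_count, sequence_length_bp):
    """Number of SSRs per Mb of analysed sequence."""
    return ssr_count / (sequence_length_bp / 1_000_000)


def relative_density(total_ssr_length_bp, sequence_length_bp):
    """Total SSR length (bp) per Mb of analysed sequence."""
    return total_ssr_length_bp / (sequence_length_bp / 1_000_000)


def summarise_by_unit_length(ssrs, sequence_length_bp):
    """Per-repeat-type summary.

    ssrs: iterable of (unit_length, repeat_count) pairs.
    Returns {unit_length: (relative_abundance, relative_density)}.
    """
    counts, lengths = {}, {}
    for unit_len, repeats in ssrs:
        counts[unit_len] = counts.get(unit_len, 0) + 1
        lengths[unit_len] = lengths.get(unit_len, 0) + unit_len * repeats
    mb = sequence_length_bp / 1_000_000
    return {k: (counts[k] / mb, lengths[k] / mb) for k in sorted(counts)}


if __name__ == "__main__":
    # Hypothetical chromosome: 2 Mb long, 500 SSRs covering 9,000 bp in total.
    print(relative_abundance(500, 2_000_000))   # SSRs per Mb
    print(relative_density(9_000, 2_000_000))   # bp of SSR per Mb
```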

Statistical analysis
All data analyses were performed using SPSS version 18.0 and followed standard procedures. The Pearson test was used to reveal the correlation between two variables, including relative abundance, relative density, genome size, GC-content, and chromosome sequence size. Student's t-test was used to compare means of two groups.
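For readers without SPSS, the Pearson correlation coefficient used here can be computed directly from its definition. The sketch below is a plain-Python illustration rather than the SPSS procedure itself; the function names are hypothetical. It also returns the t statistic (with n - 2 degrees of freedom) that is conventionally used to test whether r differs from zero, and leaves the p-value lookup to a statistics package or a t table.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing H0: r = 0; requires |r| < 1."""
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Hypothetical values: chromosome sizes (Mb) and SSR counts.
    sizes = [158.0, 137.0, 121.0, 120.0, 119.0]
    counts = [43000, 37500, 33000, 32500, 32800]
    r = pearson_r(sizes, counts)
    print(r, t_statistic(r, len(sizes)))
```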

The number, relative abundance and density of SSRs in bovid genomes
All six categories of SSRs were found in each of these bovid genomic sequences in a genome-wide scan with MSDB (Table 2). P-SSRs were the most abundant type in these bovid species, followed by the pattern: CD-SSRs > ICD-SSRs > IP-SSRs > ICX-SSRs > CX-SSRs (Table 2). The relative abundances of the same SSR types showed great similarity within the Bovinae species and also within the Caprinae species. The number, relative abundance and density of perfect mono- to hexanucleotide repeat types across these genomes are presented in Table 3. The number, relative abundance, and density of the same repeat type of perfect SSRs (mono- to hexanucleotides) showed great similarity among the six bovid species. Perfect mononucleotide SSRs were the most abundant category, followed by the pattern: perfect di- > tri- > penta- > tetra- > hexanucleotide SSRs (Table 3). The proportion of mono- to hexanucleotide SSRs was very similar in the six bovid genomes (Fig 1). Mononucleotide SSRs made up the largest proportion, accounting for 43.02%~45.33% of all SSRs, followed by dinucleotide SSRs, with trinucleotide SSRs the third most frequent. The proportion of pentanucleotide SSRs exceeded that of tetranucleotide SSRs, and hexanucleotide SSRs made up the smallest proportion. These parameters did not differ significantly between Bovinae and Caprinae genomes (t-test, p > 0.05).
Notably, the number of SSRs was closely positively correlated with genome size (Pearson, r = 0.898, p < 0.05) but not significantly correlated with GC-content (Pearson, r < 0.185, p > 0.05) in these bovid genomes. Neither relative abundance nor relative density of SSRs in these bovid genomes was significantly correlated with genome size (Pearson, r < 0.420, p > 0.05) or GC-content (Pearson, r < -0.121, p > 0.05). For example, B. taurus (v4.6.1) has the longest genome sequence (2,983.31 Mb) among all surveyed species, yet it does not have the highest SSR abundance or density (270.48 /Mb and 4,783.37 bp/Mb, respectively). Similarly, O. aries (v3.1) has the shortest genome sequence (2,587.51 Mb), yet it has the highest SSR density. The number, relative abundance and density of pentanucleotide SSRs were greater than those of tetranucleotide repeat types in these genomes. B. taurus and Bu. bubalis showed the largest numbers of pentanucleotides, with 52,793 and 52,900 loci, respectively. Bu. bubalis and C. hircus share the highest relative abundance of pentanucleotides, at 29.65 /Mb (Table 3), whereas B. taurus has the lowest (25.54 /Mb).

Diversity of SSRs in the bovid genomes
The most frequent motifs of each length varied among the bovid species at the whole-genome level (Table 4) and the chromosome level (S1 Table). Among mononucleotide repeats, the (A)n motif was predominant (over 93.27%), while (C)n repeats were rare (less than 6.73%) in these bovid genomes, with no obvious relation to the AT-richness of the genomes (Pearson, r < 0.160, p > 0.05). (AC)n, (AT)n and (AG)n were the three most frequent dinucleotide motifs, together accounting for over 99% of all dinucleotide SSR motifs in each genome and each chromosome. Among these, the (AC)n motif was particularly dominant, the (AT)n and (AG)n motifs were less abundant, and (CG)n was the least frequent motif in all six genomes and in each chromosome of B. taurus, O. aries, and C. hircus. Among trinucleotide repeats, (ACG)n and (AGC)n were the most frequent motifs, followed by the (AAC)n, (AAT)n and (ACC)n motifs, in these bovid genomes and in each chromosome of B. taurus, O. aries, and C. hircus (except for the Y chromosome). The (CCG)n motif was the least frequent in the B. mutus, O. aries, C. hircus, and P. hodgsonii genomes, while the (AGT)n motif was the least frequent in the B. taurus and Bu. bubalis genomes. The most frequent tetranucleotide motif was (AAAT)n, followed by (AAAC)n and (AAAG)n, and the (CCGG)n motif was the least frequent in the six Bovidae genomes and in each chromosome of B. taurus, O. aries, and C. hircus. Tetranucleotide repeats were less abundant than mono- to trinucleotide repeat motifs in these genomes, except for the (AAAT)n, (AAAC)n and (AAAG)n motifs. The most frequent mono- to tetranucleotide motifs were largely invariant, with the list of most frequent motifs being identical for each bovid species, whereas the most frequent penta- and hexanucleotide motifs were more variable among these species, and each genome displayed its own characteristics. Penta- and hexanucleotide SSRs had a great many motifs in all six genomes. The (AACTG)n and (AGTTC)n motifs were the two most frequent pentanucleotide repeat units in these species and in each chromosome of B. taurus, O. aries, and C. hircus, but no single most frequent hexanucleotide motif was shared by all six bovid species. The (AAACAA)n motif was most frequent in the B. mutus, Bu. bubalis, O. aries, and C. hircus genomes, whereas the (AAAGTG)n motif was most frequent in the B. taurus and P. hodgsonii genomes. The telomeric-like hexanucleotide (AACCCT)n motif was also observed in all six genomes. The most frequent tetra- to hexanucleotide motifs appeared to be more variable between Bovinae and Caprinae species.

The GC-content of all perfect SSRs in the bovid genomes
The adenine-thymine (AT) content and GC-content of perfect SSRs in the bovid genomes were calculated, and the results are shown in Table 5. Except for trinucleotide SSRs, the AT-content of every repeat type exceeded its GC-content. Mononucleotide SSRs had the highest AT-content (over 92.06%), followed by the pattern: tetra- > di- > penta- > hexanucleotide SSRs, and trinucleotide SSRs had the lowest (ranging from 40.11% to 42.68%) in the six bovid genomes. Correspondingly, GC-content was highest in trinucleotide SSRs, ranging from 57.32% (C. hircus) to 59.89% (B. taurus), and lowest in mononucleotide SSRs, ranging from 1.97% to 7.94% in these genomes. The GC-content of mononucleotide SSRs was significantly lower than that of the entire genome, and the GC-content of di- and tetranucleotide SSRs was also lower than that of the entire genome in the analyzed genomes, whereas the GC-content of the remaining SSRs was higher than that of the entire genome. At the whole-genome level, the total AT-content ranged from 71.44% to 73.78% and was significantly higher than the GC-content. Therefore, the AT-content of SSRs is very high in the bovid species.
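As an illustration of how the GC-content of a set of SSRs can be obtained, the sketch below pools the bases of all tracts of one repeat type and reports the percentage that are G or C; the AT-content is then 100 minus that value. The function names and the example tracts are hypothetical, and the sketch assumes the tracts contain only A, C, G and T.

```python
def gc_content(tracts):
    """Percentage of G and C bases pooled across a collection of SSR tract sequences."""
    total = sum(len(t) for t in tracts)
    if total == 0:
        raise ValueError("no bases to analyse")
    gc = sum(t.upper().count("G") + t.upper().count("C") for t in tracts)
    return 100.0 * gc / total


def at_content(tracts):
    """AT-content is the complement of GC-content (assumes only A, C, G, T)."""
    return 100.0 - gc_content(tracts)


if __name__ == "__main__":
    # Hypothetical trinucleotide tracts: two GC-rich motifs and one AT-only motif.
    trinucleotide_tracts = ["ACG" * 5, "AGC" * 6, "AAT" * 5]
    print(round(gc_content(trinucleotide_tracts), 2))
    print(round(at_content(trinucleotide_tracts), 2))
```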
The GC-content of perfect SSRs was analyzed in all chromosomes of B. taurus, O. aries, and C. hircus, and the results are shown in Fig 2. Except for chromosome 18 and the Y chromosome of B. taurus, trinucleotide SSRs had the highest GC-content (over 54.43%) and mononucleotide SSRs the lowest in every chromosome of the three genomes. The SSR number of all chromosomes in B. taurus, O. aries, and C. hircus was closely positively correlated with chromosome sequence size (Pearson, r > 0.980, p < 0.01) and significantly negatively correlated with GC-content (Pearson, r < -0.638, p < 0.01). Relative abundance and density in all chromosomes of B. taurus, O. aries, and C. hircus were significantly negatively correlated with GC-content (Pearson, r < -0.333, p < 0.05) but not significantly correlated with chromosome sequence size (Pearson, r < -0.185, p > 0.05). The GC-content of tri- and pentanucleotide SSRs fluctuated little, tracing a nearly horizontal line across all chromosomes of the three bovid species, as did that of the mononucleotide SSRs of B. taurus. The GC-contents of di-, tetra- and hexanucleotide SSRs differed somewhat among chromosomes of the three bovid species, as did those of mononucleotide SSRs among O. aries and C. hircus chromosomes. The GC-contents of di-, penta- and hexanucleotide SSRs overlapped and interwove across all chromosomes of the three species. Because GC-content and AT-content sum to 100%, Fig 2 also conveys how the AT-contents of mono- to hexanucleotide SSRs were distributed across all chromosomes of the three bovid species.

The distribution of perfect SSRs in the chromosomes of B. taurus, O. aries, and C. hircus
The relative abundances of the same nucleotide SSR type were highly similar across all chromosomes of B. taurus, O. aries, and C. hircus (Fig 3). On every chromosome of these three bovid species, mononucleotide SSRs were the most abundant, followed by the pattern: perfect di- > tri- > penta- > tetra- > hexanucleotide SSRs. The overall relative abundances of mono- to tetranucleotide SSRs were higher in the B. taurus Y chromosome than in its autosomes and X chromosome. The relative abundance of pentanucleotide SSRs was higher in the Y chromosome of B. taurus than in its X chromosome and autosomes, except for chromosomes 1, 2, 4, 6, 9 and 12. The abundance of each SSR type was roughly equal among the autosomes of B. taurus. Dinucleotide SSR abundance was higher in the C. hircus X chromosome than in its autosomes, and likewise higher in the O. aries Y chromosome than in its autosomes. The abundances of tri-, tetra- and hexanucleotide SSRs were almost equal among the autosomes of C. hircus and O. aries. Our analysis revealed that relative abundance fluctuated within a narrow range across all chromosomes of the three bovid species.

Diversity of microsatellite distribution in the bovid genomes
In this study, we used MSDB to scan the recently assembled B. taurus, B. mutus, Bu. bubalis, O. aries, C. hircus, and P. hodgsonii genomes for microsatellites of 1-6 bp. To make the results comparable, we analyzed all of these bovid genomes using the same bioinformatics tool and search parameters. These data provide evidence of similar patterns of SSR distribution among bovid genomes, suggesting that the contribution of SSRs to the genomes of these six bovids may be the rule for other bovid species. Mononucleotide SSRs were the most abundant repeat type, accounting for 43.01%-45.33% of all SSRs, followed by the pattern: di- > tri- > penta- > tetra- > hexanucleotide SSRs. Eukaryotic genomes are characterized by the prevalence of mononucleotide repeats over other nucleotide repeat classes [24]. Mononucleotide repeats are the most abundant class of SSRs in all the human chromosomes [25], Volvariella volvacea and Agaricus bisporus [26]. However, dinucleotide repeats are the most abundant SSRs in rodents [5] and the majority of dicot species [27]. Trinucleotide repeats are the most abundant SSRs in the Neurospora crassa [28], Cyanidioschyzon merolae, Thalassiosira pseudonana [24], Coprinus cinereus, Schizophyllum commune, Pleurotus ostreatus [26] and Eremothecium gossypii [24] genomes, which could indicate their structural similarity with prokaryotes. Previous research has shown that hexanucleotide repeats are the most abundant SSRs in the coding regions of eukaryotes [25]. Here, hexanucleotide SSRs appeared significantly underrepresented, making up as little as 0.15%-0.29% of the total number of SSRs in the bovid species. In addition, tetranucleotide SSRs were less abundant than pentanucleotide SSRs in this study. It might be due to positive selection of even-number motif repeats relative to odd-number motif repeats. Alternatively, there could be a more passive reason, namely that even-number motif repeats might be favored to accumulate and/or to be maintained [25]. Further studies will be required to test these possibilities. The smaller motifs were predominant in each genome; as motif length increased, occurrence decreased. This trend has been observed for a range of organisms [23]. Among mononucleotide repeats, the (A/T)n motif was predominant, while (C/G)n repeats were rare in these bovid genomes. Likewise, (A/T)n was the most frequent mononucleotide repeat in A. bisporus, V. volvacea, C. cinereus, P. ostreatus [26], Caenorhabditis elegans, Brugia malayi, Meloidogyne hapla [14], and Carlavirus [29], whereas the (C/G)n motifs were most frequent in the S. commune [29], Meloidogyne incognita and Pristionchus pacificus [14] genomes. Among the dinucleotide SSRs of these bovids, the (AC)n motif was predominant compared with other motifs, while (CG)n repeats were extremely rare but present in all of these Bovidae species. The (AC)n motif was also predominant in human beings [25] and Carlavirus [29], whereas (AG)n motifs are the most abundant in Magnaporthe grisea, Ustilago maydis [23,28], Camellia sinensis L. [30], nematodes [14], insects [31] and other invertebrates [32], in which (CG)n repeats were extremely rare. This is especially interesting because (CG)n motifs were also rare in human beings, Drosophila melanogaster, C. elegans, Arabidopsis thaliana [32], Brassica rapa [33], yeast [32], and fungi [23,28].
Our study showed that the (AC)n motif was on average about 261.03 times as abundant as the (CG)n motif in the bovid genomes (Table 2). The lower frequency of (CG)n motifs can be explained by A/T richness and by the relative difficulty of strand separation for CG compared with AT and other tracts [6]. In contrast, trinucleotide SSRs were dominated by CG-rich motifs, with (AGC)n and (ACG)n always among the most common motifs, although (CCG)n was always among the least frequent motifs in the bovid genomes investigated. Previous studies revealed that the (AAG)n motif predominated in Potyvirus [6], Aspergillus nidulans, Cryptococcus neoformans, Encephalitozoon cuniculi, Saccharomyces cerevisiae [23], C. elegans [14], and Serpula lacrymans [34]; the (AAT)n motif in M. hapla, P. pacificus, B. malayi [14] and Schizosaccharomyces pombe [23]; and the (ACG)n motif in Ganoderma lucidum, Coprinopsis cinerea, Laccaria bicolor, Postia placenta [34] and U. maydis [23]; whereas the (CCG)n motif was the most frequent in Phanerochaete chrysosporium [34] and M. grisea [23], and the (AAC)n motif was the most frequent in M. incognita [14] and N. crassa [23]. The (AACTG)n and (AGTTC)n motifs were the two most frequent pentanucleotide repeat units in these species, but no single most frequent hexanucleotide motif was shared by all six bovid species. The (AAACAA)n motif was most frequent in B. mutus, Bu. bubalis, O. aries, and C. hircus, whereas the (AAAGTG)n motif was most frequent in B. taurus and P. hodgsonii. Overall, the SSR motifs showed a similar distribution pattern in each of the six bovid species, suggesting that these species are closely related phylogenetically. Notably, none of the most frequent di- to hexanucleotide motifs consists exclusively of Cs or Gs. The relative abundances of the same SSR motifs showed great similarity within the Bovinae species and also within the Caprinae species. Indeed, such consistency may be considered a strong indication of the robustness of the global analysis.

The GC-content in all analyzed SSRs
It has been reported that the level of GC-content may play important roles across the entire genome. Indeed, (G)n mutants in the thymidine kinase (TK) gene (tk) were reported to be related to the reactivation of herpes simplex virus [35]. Repeats with high GC-content have also been reported to be related to some human diseases and to the pathogenesis of some microorganisms. For example, fragile X mental retardation-1 (FMR-1) alleles with (CGG)n repeats were associated with neurodegeneration [36] and ovarian insufficiency [37]. FRA12A mental retardation resulted from the expansion of a large (CGG)n tract in the 5′ UTR of the DIP2B gene [38]. The (G)n repeats in the membrane protein gene pmp10 of Chlamydophila (Chlamydia) pneumoniae were involved in the virulence and pathogenesis of Chlamydia [39], and the (C)n repeats in outer membrane proteins were involved in the pathogenesis of C. pneumoniae [40]. Long SSRs with 5-11 bp motifs (SSR5-11) were more common in GC-rich genomes, and large genomes tend to be GC-rich, so the weak correlation between long SSR5-11 counts and GC-content may arise as an artifact of the correlation of both with genome size [4]. Interestingly, GC-rich SSRs were generally more difficult to expand in these PCR experiments, seemingly agreeing with our observation. There was a negative correlation between the GC-content of the flanking regions of SSRs and their polymorphism [41], which might be valuable in choosing SSR markers. This may be due to the preponderance of motif repeats with low GC-content and to the fact that SSRs frequently constitute genomic regions of low melting temperature (Tm).
Data-mining of 26 completed genomes showed that SSRs with low GC-content were predominant in most eukaryotic genomes [24]. This trend also emerged from our survey, with the majority of the most frequent SSR motifs from bovids being AT-rich. The (A/T)n motifs were significantly more prevalent than the (G/C)n motifs in each complete bovid genome, a difference that could be explained by the AT-content being notably higher than the GC-content in each of the analyzed sequences. Trinucleotide SSRs had high GC-content in monocot genomes [42], which is consistent with our study. The GC-content of SSRs differed among coding regions. For example, the GC-content of the reverse repeat regions (RS and RL) was significantly higher than that of the unique long and unique short regions (UL and US) in herpes simplex virus type 1 (HSV-1) [43]. Likewise, the GC-content of SSRs in RS- and RL-coding regions was significantly higher than that in UL- and US-regions. This could be due to different mutational pressures in different coding regions [44]. GC-content has been shown to covary with genomic properties such as regulated replication or expression timing [45][46], DNA bendability [47] and the ability to undergo the B-Z transition [48]. The (CCG)n repeats can form hairpin-like secondary structures that escape DNA repair in yeast [49]. The (CCG)n repeats, which are abundant in the HSV-1 genome, exhibited considerable hairpin-forming and quadruplex-forming potential [10]. Therefore, high GC-content in the genome, and especially in SSRs, may affect genome structure.

Distributional difference of SSR abundance, density and GC-content on different chromosomes
Sex chromosomes have been found to differ from autosomes in SSR density in many eukaryotes. Human, rat [50], and mouse [51] X chromosomes were found to have a lower abundance of SSRs than autosomes, whereas the reverse was the case for dinucleotide SSRs in the Drosophila X chromosome [52]. The Z chromosome of Bombyx mori, equivalent to the X chromosome of mammals and Drosophila, had a higher trinucleotide SSR density than its autosomes [53]. Among the mono- and dinucleotide SSRs of B. taurus, C. hircus, and O. aries, the (A)n and (AC)n motifs had the highest abundance and density on all chromosomes. Among trinucleotide SSRs, the (ACG)n and (AGC)n motifs had the highest abundance and density on the autosomes and X chromosome of these three bovid species, whereas the (AAC)n motif had the highest abundance and density on the B. taurus Y chromosome. Similarly, among tetranucleotide SSRs, the (AAAT)n motif had the highest abundance and density on the autosomes and X chromosome of these species, whereas the (AAAC)n and (AAAG)n motifs had the highest abundance and density on the Y chromosome of B. taurus. The B. mori Z chromosome had a higher density of (ATT)n repeats than the autosomes, and its sequences contained very few tetra- and hexanucleotide repeats and were devoid of pentanucleotides [53]. In these three Bovidae species, all chromosome sequences also contained very few hexanucleotide SSRs.
The GC-contents of tri- and pentanucleotide SSRs were almost equal among the autosomes of B. taurus, O. aries, and C. hircus, whereas those of mono-, di-, tetra- and hexanucleotide SSRs varied among their autosomes and sex chromosomes (Fig 2). The GC-contents of tri- to hexanucleotide SSRs were lower in the X and Y chromosomes of B. taurus than in its autosomes. Trinucleotide SSRs had the highest GC-content, except in chromosome 18 and the Y chromosome of B. taurus, and mononucleotide SSRs had the lowest in the chromosomes of the three bovid genomes. The GC-contents of tri- and hexanucleotide SSRs were lower in the X chromosome of O. aries than in its autosomes, and the GC-contents of di- to pentanucleotide SSRs were lower in the X chromosome of C. hircus than in its autosomes (Fig 2).

Conclusions
Mononucleotide SSRs were the most abundant, followed by the pattern: di- > tri- > penta- > tetra- > hexanucleotide SSRs. Generally, the number of SSRs, relative abundance, and relative density of SSRs decreased as the motif repeat length increased in each species of the Bovidae. GC-content was highest in trinucleotide SSRs and lowest in mononucleotide SSRs in the six bovid genomes. The GC-contents of tri- and pentanucleotide SSRs displayed a great deal of similarity among different chromosomes of B. taurus, O. aries, and C. hircus. The SSR number of all chromosomes in B. taurus, O. aries, and C. hircus was closely positively correlated with chromosome sequence size and significantly negatively correlated with GC-content. Relative abundance and density of SSRs in all chromosomes of the three species were significantly negatively correlated with GC-content but not significantly correlated with chromosome sequence size. These data provide evidence of similar patterns of SSR distribution in the six bovid species, suggesting that the contribution of their SSRs may be the rule for other bovids.
Supporting Information S1