Discovery of Genome-Wide Microsatellite Markers in Scombridae: A Pilot Study on Albacore Tuna

Recent developments in sequencing technologies and bioinformatics analysis provide a greater amount of DNA sequencing reads at a low cost. Microsatellites are the markers of choice for a variety of population genetic studies, and high quality markers can be discovered in non-model organisms, such as tuna, with these recent developments. Here, we use a high-throughput method to isolate microsatellite markers in albacore tuna, Thunnus alalunga, based on coupling multiplex enrichment and next-generation sequencing on 454 GS-FLX Titanium pyrosequencing. The crucial minimum number of polymorphic markers needed to infer evolutionary and ecological processes for this species is described for the first time. We provide 1670 designed microsatellite primer pairs, from which technical and molecular genetic selection yielded 43 polymorphic microsatellite markers. Within this panel, we characterized 34 random and selectively neutral markers («neutral») and 9 «non-neutral» markers. The variability of «neutral» markers was screened with 136 individuals of albacore tuna from the southwest Indian Ocean (42), northwest Indian Ocean (31), South Africa (31), and southeast Atlantic Ocean (32). Power analysis demonstrated that the panel of genetic markers can be applied in diversity and population genetics studies. Global genetic diversity for albacore was high, with a mean number of alleles of 16.94, observed heterozygosity of 66%, and expected heterozygosity of 77%. The number of individuals was insufficient to provide accurate results on differentiation. Of the 9 «non-neutral» markers, 3 were linked to a sequence of known function. One is located in a sequence with an immunity function (ThuAla-Tcell-01) and another in a sequence with an energy allocation function (ThuAla-Hki-01). These two markers were genotyped on the 136 individuals and presented different diversity levels. ThuAla-Tcell-01 has a high number of alleles (20), high heterozygosity (87–90%), and a high assignment index. ThuAla-Hki-01 has a lower number of alleles (9), low heterozygosity (24–27%), a low assignment index, and significant inbreeding. Finally, the 34 «neutral» and 3 «non-neutral» microsatellite markers were tested on four economically important Scombridae species: Thunnus albacares, Thunnus thynnus, Thunnus obesus, and Acanthocybium solandri.


Introduction
Albacore tuna (Thunnus alalunga) is a highly migratory tuna species found in both subtropical and temperate waters of the three oceans and in the Mediterranean [1]. With a high commercial value [2], this species is mainly targeted by pelagic fisheries in all ocean basins, and current catches are estimated to represent 5% of the global tuna catch [2,3]. As such, it is the responsibility of regional fisheries management organizations, such as the Indian Ocean Tuna Commission (IOTC), to oversee the management and sustainable harvesting of this species. Several stocks of albacore are currently considered fully exploited or overexploited, although considerable uncertainty remains in the results of stock assessments because of uncertainties in fisheries statistics and species biology (e.g., for the Indian Ocean [4]). Therefore, a precautionary approach to the management of albacore should be applied, and it remains a priority to improve stock assessments of this species through the development of alternative methods of population assessment [5,6,7].
Scientific results are the baseline for improving the management of a species, and investigation of population structure provides key information to improve stock assessments [8]. The stock structure assumed during an assessment process has important consequences for management and must be as close as possible to the actual population structure of the resource [9]. Population genetics has much to offer in refining the stock structure used for fisheries management. For example, although all tuna species are highly migratory, genetic differentiation has been detected at various scales: within an ocean basin for bluefin tuna Thunnus thynnus [10], and both within and among oceans for yellowfin tuna Thunnus albacares [11] and bigeye tuna Thunnus obesus [12,13]. Information on the population structure of albacore and its habitats is unfortunately scarce (see the review of albacore stock structure in [14]). For instance, the Indian Ocean is the oceanic region for which the least knowledge of albacore is available and, in light of the results of recent albacore stock assessments, the IOTC Scientific Committee has encouraged studies on the population structure within the Indian Ocean and adjacent waters [4,15]. Over the past several years, genome-wide microsatellite screening and marker development, mainly using 454 pyrosequencing, have been performed in many non-model species, such as fish, for genetic and molecular ecology studies [16,17,18,19]. Next-generation sequencing technology (454) combined with reduced representation library (RRL) construction isolates microsatellites from the genomes of non-model teleosts rapidly, easily, and at low cost [19]. In this study, we applied high-throughput 454 technology to a microsatellite-enriched library from albacore tuna to isolate microsatellites across the whole genome rapidly, easily, and flexibly.
Genetic markers are widely used to investigate genetic diversity within populations, connectivity between populations, and to identify stocks and mixed stocks in a fishery [20,21]. Molecular genetics has led to considerable progress, but to unravel population structures, studies are dependent on the use of polymorphic neutral markers. Neutral markers usually indicate a DNA region that is not under the influence of selection, and the vast majority of genetic diversity estimates are based on neutral markers [22]. Neutral markers that are capable of inferring genetic diversity are most commonly microsatellites [22]. In this study, the hypothetically random and selectively neutral markers are referred to as «neutral» markers. Microsatellite markers have much to offer in fisheries management (see the reviews in [23,24]). These genetic markers are used in a variety of population genetic studies on marine species because of their high locus variability, which allows high statistical power to detect genetic structure within and among populations, as well as to infer evolutionary history [25,26,27,28,29]. Because of their cosmopolitan distribution, large population size, high fecundity, production of numerous pelagic larvae, long larval periods allowing widespread dispersal in currents, and the ability of adults to easily migrate inter-ocean distances [30], marine pelagic fish species have commonly been thought to lack genetic spatial structure [31,32]. In the last decade, genetic studies using microsatellites in pelagic fish investigations have increased [33,34,35,36,37,38,39,40,41,42,43,44,45]. Microsatellites have been characterized from Thunnus thynnus, Thunnus orientalis, Thunnus obesus, and Thunnus albacares, yet none have been specifically designed for albacore tuna. Some of the markers developed on bluefin ([46], [47] (4 markers), [48] (24 markers)) were tested on albacore to study the population structure of albacore in the Atlantic ([42] (12 markers), [44] (13 markers)). These studies revealed contrasting results and have fuelled the need for an increase in the number of microsatellite markers in order to detect spatial structure in such pelagic species. In this short communication, we describe the development of new appropriate microsatellite markers for extensive population genetic analysis on albacore using shotgun pyrosequencing of a microsatellite-enriched library [49], together with the associated power analysis. Additionally, these new microsatellite markers have been tested on four other Scombridae species (Thunnus albacares, Thunnus thynnus, Thunnus obesus, and Acanthocybium solandri).

Ethics statement
The field studies did not involve endangered or protected species. Albacore tuna is a commercial species caught all over the world and does not fall under any official ethical rules (IUCN Red List, etc.). No specific permissions were required for the sampling locations (Fig 1). All fish were randomly sampled from French, Seychelles, and South African fishing vessels, either at sea within an observer program in the authorized marine waters or at landing sites. The fishing areas depend on the fishing method (mainly longline and purse seine) and range from one to several kilometers.

Test, procedure, and analysis
Our study includes 136 samples of albacore tuna collected from four different geographic areas: A) southwest Indian Ocean (42), B) northwest Indian Ocean (31), C) South Africa (31), and D) southeast Atlantic Ocean (32) (Fig 1). Fig 1 was produced using ArcGIS software (www.arcgis.com). We thus followed the rule of thumb of > 30 individuals per area for the estimation of differentiation [50].
The number of individuals used to develop high quality microsatellite markers in this study varied from 8 to 136 depending on the molecular process. Genomic DNA was isolated from a muscle tissue sample (25 ng) of a single fish using Qiagen DNeasy spin columns. 1 μg of an equimolar pool of 13 DNA samples was used for the development of a microsatellite library through 454 GS-FLX Titanium pyrosequencing of enriched DNA libraries, as described in [49]. In order to increase the percentage of final sequences with microsatellites, total DNA was enriched for AG, AC, AAC, AAG, AGG, ACG, ACAT, and ATCT repeat motifs and subsequently amplified. Polymerase Chain Reaction (PCR) products were purified and quantified, and GS-FLX libraries were then prepared following the manufacturer's protocols (Roche Diagnostics) and sequenced by 454 GS-FLX Titanium pyrosequencing.
A summary of the different selection steps to obtain a final microsatellite panel of markers is presented in Table 1.
From the 62 682 sequences obtained, the bioinformatics program QDD [51] was used to retain the sequences for which primers could be designed successfully. This software allowed high-throughput isolation of 4 285 sequences containing SSR motifs, including motifs longer than five repeats (Fig 2).
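The motif-detection step can be illustrated with a minimal sketch. It is not the QDD implementation; it only shows how perfect tandem repeats can be located in a read with a regular expression. The function name `find_ssrs`, the motif lengths scanned, and the threshold of five repeats are assumptions made for illustration.

```python
import re

def find_ssrs(seq, motif_lengths=(2, 3, 4, 5), min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper: thresholds are illustrative, not QDD defaults.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-base motif followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy read containing an (AC)8 and an (AAG)5 repeat.
read = "TTG" + "AC" * 8 + "GGT" + "AAG" * 5 + "C"
print(find_ssrs(read))
```

On this toy read the function reports one dinucleotide and one trinucleotide microsatellite; real pipelines additionally handle compound and imperfect repeats and check flanking regions for primer design.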
A total of 1 670 primer pairs were designed (see S2 Table for detailed information). Among the 225 microsatellites designed, we retained and tested 95 based on the sequence pattern that would maximize the number of polymorphic markers (see S2 Table for detailed information). These consisted of 75 di-nucleotide, 7 tri-nucleotide, 12 tetra-nucleotide, and 1 penta-nucleotide microsatellite primer pairs. All primers were tested under a single PCR condition so that multiplexed reactions could be applied. Among the 95 candidate loci tested, 25 failed to amplify. Of the 60 loci tested for polymorphism, 16 gave inconsistent electrophoretic patterns and 1 showed no or low polymorphism. The remaining 43 microsatellite markers were interpretable, clear, and repeatable, and their polymorphic patterns were validated. Multiplexed loci were built with the same optimal primer pair annealing temperature of 55°C and can be used for future genetic studies on albacore (see the example in S1 Fig and Table 2).
PCRs were performed in 25 μl reactions containing 5 ng of template DNA, 1X reaction buffer, 1.5 mM MgCl2, 0.24 mM dNTP, 0.1 μM of each primer, and 1 U Taq polymerase. The PCR cycling consisted of an initial denaturation at 95°C for 10 min, followed by 40 cycles of denaturation at 95°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 1 min, and a final extension at 72°C for 10 min. Out of the 95 markers, 70 were validated by agarose gel electrophoresis and 60 were selected for a polymorphism study (minimum of 3 alleles) on 15 albacore DNA samples (5 from area A, 5 from area B, and 5 from area C; Fig 1). PCR was performed under the same conditions as above but with fluorescently labelled forward primers (6-FAM, PET, VIC, or NED fluorescent dye; Applied Biosystems). Each PCR amplicon was diluted with pure water (1:20), mixed with Hi-Di Formamide and GeneScan 500 LIZ dye size standard (Applied Biosystems), and run on an Applied Biosystems 3730 XL DNA Analyzer. Alleles were scored using GeneMapper v 5.0 (Applied Biosystems). Of the 60 markers, we retained 43 based on technical criteria (PCR feasibility and genotype reading) and molecular criteria (optimal primer length of 20 bp (range 19-27 bp); optimal 50% GC content (range 25-60%); number of repeats greater than 8; mostly dinucleotide motif repeats; polymorphic (minimum of 4 alleles for each marker, observed on 15 individuals genotyped)). Sequence similarities were sought for the 43 markers with BLASTn (scanning nucleotide collection databases with Megablast to search for highly similar sequences [52]). Sequences from GenBank NR and BOLD systems were downloaded for a local deployment (version 2014, GenBank; http://www.ncbi.nlm.nih.gov). We retained the aligned sequences meeting the expected value (E-value) significance cut-off of 10⁻³. The degree of similarity was assessed using highly similar sequences (Megablast) and the ratio of similar bases (nucleotides) as a function of the microsatellite length, retaining alignments >75% (Table 2 and S2 Table). Sequence alignments were performed using the ClustalW program with default gap parameters, followed by manual corrections with BioEdit software (http://www.ebi.ac.uk/Tools/msa/clustalo/).
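The primer length and GC content criteria stated above (19-27 bp, 25-60% GC) can be expressed as a simple filter. The sketch below is illustrative only; the function names and the example primer sequences are hypothetical and are not taken from S2 Table.

```python
def gc_content(primer):
    """Percentage of G and C bases in a primer sequence."""
    primer = primer.upper()
    return 100.0 * sum(primer.count(base) for base in "GC") / len(primer)

def passes_primer_criteria(primer, min_len=19, max_len=27, min_gc=25.0, max_gc=60.0):
    """True if the primer falls within the length and GC ranges given in the text."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_content(primer) <= max_gc)

# Hypothetical primers, for illustration only.
candidates = ["ACGTACGTACGTACGTACGT", "ATATATATATATATATATAT", "ACGTACGTACGT"]
for p in candidates:
    print(p, len(p), round(gc_content(p), 1), passes_primer_criteria(p))
```

Only the first hypothetical primer (20 bp, 50% GC) satisfies both ranges; the second fails on GC content and the third on length.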
Population diversity and structure analyses require random «neutral» microsatellite markers. 9 markers were detected as potentially located in coding sequences and 34 as potentially «neutral» (Table 2). ThuAla-mt-30 has a high alignment and correspondence with a microsatellite sequence in Cottus gobio. The variability of the 34 «neutral» microsatellite markers was screened using 136 individuals from the four areas. The level of diversity by locus (allelic richness (Na); expected (He), unbiased expected following [53] (Hnb), and observed (Ho) heterozygosity) was analyzed using GENETIX 4.05 [54]. Estimates of homozygote and heterozygote excess that differed significantly from zero (P<0.05) were calculated from the standard error in Pedant [55]. Probability of identity (PI) by locus was estimated using GenAlEx v6 [56]. PI is a frequency-based statistic, also referred to as population match probability, that provides an estimate of the average probability that two unrelated individuals will have the same multilocus genotype. It indicates the statistical power of marker loci. Deviations from Hardy-Weinberg equilibrium (HWE) were detected by exact tests and permutations (1 000 000 chains and 100 000 steps) and linkage disequilibrium by chi-square test and permutations (10 000) with ARLEQUIN version 3.1 [57]. The inbreeding coefficient (Fis) and its significance were estimated by the exact test and Markov Chain method (10 000 dememorization, 1000 batches, 10 000 iterations per batch) using GENEPOP [58]; the test was based on heterozygote excess to avoid the disadvantages of common tests such as chi-square. Polymorphism Information Content (PIC) was generated in Cervus [59]. Null allele frequency (Fnull) was estimated with INEst [60] using the individual inbreeding model (estimates significantly different from zero, P<0.05), followed by MICRO-CHECKER [61] to interpret the cause of null alleles. Probability of parentage exclusion (PE1, single parent [62]; PE2, a second parent given a first parent assigned [63]; PE3, a pair of parents [62]) was estimated per locus using INEst. The assignment index gives the probability of assigning individuals to their likely population of origin. Genotyping error rate per allele (E1, the allelic dropout rate, and E2, the false allele rate) and its 95% confidence interval (CI) were evaluated using the number of repeated genotypes (Nrep, and its percentage (%) of the total number of individuals genotyped for each locus) and based on He, computed in Pedant.
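For readers unfamiliar with the per-locus statistics named above, the following minimal sketch computes them from diploid genotypes using the standard textbook formulas. It does not reproduce the GENETIX, GenAlEx, Cervus, or GENEPOP implementations; in particular, Fis is given here as the simple 1 − Ho/He approximation rather than the estimator used by GENEPOP. The function name and the toy genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def locus_stats(genotypes):
    """Per-locus diversity statistics from a list of (allele_a, allele_b) tuples."""
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    sum_p2 = sum(p * p for p in freqs)
    he = 1.0 - sum_p2                      # expected heterozygosity
    hnb = he * 2 * n / (2 * n - 1)         # unbiased estimate with 2n/(2n-1) correction
    ho = sum(1 for a, b in genotypes if a != b) / n   # observed heterozygosity
    pairs = list(combinations(freqs, 2))
    pic = he - sum(2 * p * p * q * q for p, q in pairs)   # polymorphism information content
    pi = sum(p ** 4 for p in freqs) + sum((2 * p * q) ** 2 for p, q in pairs)  # probability of identity
    fis = 1.0 - ho / he if he > 0 else float("nan")   # simple approximation (assumption)
    return {"Na": len(counts), "He": he, "Hnb": hnb, "Ho": ho,
            "PIC": pic, "PI": pi, "Fis": fis}

# Toy data: four individuals, two alleles (sizes in bp are made up).
toy = [(100, 100), (100, 102), (102, 102), (100, 102)]
print(locus_stats(toy))
```

A low PI and a high PIC at a locus both indicate that the locus is informative for distinguishing individuals, which is how these quantities are interpreted in Tables 3 and 4.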
POWSIM software [64] was used to estimate the statistical power to detect levels of differentiation with a minimum of 30 individuals per area. Burn-in consisted of 1000 steps followed by 100 batches of 1000 steps. Chi-square and Fisher's probabilities were used to test the significance of a Wright's F-statistics (FST) value for each replicate run. The number of significant FST values in 1000 replicate simulations provided an estimate of statistical power for a given level of divergence, which was controlled by allowing frequencies to drift for a given number of generations.
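The link between the number of generations of drift and the level of divergence can be sketched under the standard expectation for isolated populations drifting from a common base, FST = 1 − (1 − 1/(2Ne))^t. The effective size and generation numbers below are hypothetical values chosen only because they give divergences close to the FST levels of 0.002 and 0.005 examined in this study; they are not the settings used in the simulations reported here.

```python
import math

def expected_fst(ne, generations):
    """Expected FST after `generations` of pure drift at effective size `ne`."""
    return 1.0 - (1.0 - 1.0 / (2.0 * ne)) ** generations

def generations_for_fst(ne, target_fst):
    """Number of generations of drift needed to reach `target_fst` (may be fractional)."""
    return math.log(1.0 - target_fst) / math.log(1.0 - 1.0 / (2.0 * ne))

# Hypothetical parameter combinations (assumptions, not values from this study).
print(round(expected_fst(1000, 4), 4))
print(round(expected_fst(1000, 10), 4))
print(round(generations_for_fst(1000, 0.005), 1))
```

Because many combinations of Ne and t give the same expected FST, power estimates should be read as conditional on the divergence level rather than on any particular demographic scenario.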
Differentiation between the four areas (Fig 1) was visualized by Factorial Component Analysis (FCA) in GENETIX with different numbers of markers. Global FST considering the 4 areas and the panel of potentially «neutral» microsatellite markers was estimated using GENETIX with 1000 bootstraps. Analysis of Molecular Variance (AMOVA) and Phi-statistics (analogous to F-statistics) were computed between the 4 areas using the adegenet [65] and poppr [66] R packages with 1 000 permutations.
SPOTG [67] was used to estimate the power of assignment of 4 populations, using 1000 runs. FST was equal to 0.005 and normal allele frequencies were used with the mean number of alleles equal to 17. The number of genetic markers to consider varied between 20 and 150 with 30 individuals. The number of individuals to sample varied between 30 and 500 with 34 markers. This software uses inputs from ARLEQUIN [68] and SIMCOAL [69].
The above analyses of genetic diversity and structure were also applied to two «non-neutral» microsatellite markers whose functions were well defined from the GenBank NR and BOLD sequence alignments (ThuAla-Tcell-01, ThuAla-Hki-01; Table 1).

Development of microsatellite panel on albacore tuna
A total of 62 628 sequences with 4 285 (7%, Fig 2) unique and consensus sequences containing microsatellite markers were identified (motifs-type of repeat unit-range length of 248-288 bp) from 454 pyrosequencing. A total of 1 670 primer pairs have been designed and their characteristics described (S2 Table). Out of these sequences, 250 were high quality candidate microsatellite markers (Fig 3) and 225 were successfully designed. As expected, the most commonly found motifs were those used for library enrichment, in particular dinucleotide types AG and AC (37 and 139 microsatellites, respectively), followed by trinucleotides AAG, AAC, and AGG (7, 11, and 5 microsatellites, respectively) (Fig 3). However, although AT was not used as a motif for enrichment, 3 AT microsatellites were identified. Focusing on AG and AC motifs, the average number of repeated motifs was 8 for AG and 11 for AC, with a maximum of 21 and 29, respectively (Fig 3). Allelic size range was 106 bp to 302 bp for the 43 microsatellite markers (Table 2).
Among the 43 microsatellite markers, the BLASTn search revealed 9 markers localized in a coding sequence. These 9 markers, called «non-neutral», meet the E-value cut-off of 10⁻³, except for marker ThuAla-Und-05 (Table 2 and S2 Table). Of the 9 «non-neutral» markers (>75% alignment with a sequence, mainly from marine species; S2 Table), the 6 of undetermined function (ThuAla-Und-01, ThuAla-Und-02, ThuAla-Und-03, ThuAla-Und-04, ThuAla-Und-05, ThuAla-Und-06) presented some difficulties in the PCR process, in particular ThuAla-Und-01 and ThuAla-Und-03. These markers have not been included in the final panel. Concerning the remaining 3 «non-neutral» markers, ThuAla-Tcell-01 has a high alignment ratio (85%, S2 Table) with FERM and PDZ domain-containing protein 1-like. This protein is often involved in localizing proteins to the plasma membrane and is dispensable for T cell receptor signal transduction [70], and it could provide information on the immune system. ThuAla-Hki-01 has a high alignment ratio (80-83%, S2 Table) with hexokinase type I, one of the four hexokinases that participate in glycolysis, which plays a significant role in a wide range of cellular processes, particularly in providing energy in muscle cells. ThuAla-Tyr-01 has a high alignment ratio (89%, S2 Table) with the receptor-type tyrosine-protein phosphatase-like N-like (PTPRN). It is an enzyme that regulates a variety of cellular processes (cell growth, differentiation, mitotic cycle, and oncogenic transformation), but its role in fish is unknown and it may have a general role in neuroendocrine functions, as in humans. In this study, we analyzed the ThuAla-Hki-01 and ThuAla-Tcell-01 markers on all albacore collected (136), as they are located in sequences with a role in important biological traits (immunity and energy).
Genotyping was successfully performed on 136 albacore tunas collected from 4 different geographic areas (Tables 3 and 4) with the 34 supposed «neutral» and 2 «non-neutral» markers, ThuAla-Hki-01 and ThuAla-Tcell-01.
"Encoding" markers analysis on albacore tuna
Number of alleles, heterozygosity, and PIC were high for ThuAla-Tcell-01 and low for ThuAla-Hki-01 (Table 4). Both markers could be under balancing selection judging by their allele frequency distributions, particularly ThuAla-Tcell-01 (S2 Fig), though these results are not sufficient to support this hypothesis. ThuAla-Tcell-01 presented a low PI and a high probability of parentage exclusion, meaning a high potential to assign individuals (Table 4). ThuAla-Hki-01 showed an Fis estimate significantly greater than zero, a high PI, and a low probability of parentage exclusion (Table 4), and deviated from HWE. Concerning the linkage disequilibrium analysis, there is random association of alleles at all loci. These loci have a low genotyping error rate, giving exactly repeatable genotypes with an observed error rate of 0.00 and a low 95% CI.
"Neutral" markers analysis on albacore tuna
Most of these markers had a large number of alleles per locus (A), ranging from 3 to 33 alleles (Table 3). 26 markers had at least 12 alleles and 10 markers had 16 or more alleles. Mean He and Ho varied from 21% to 95% and from 15% to 94%, respectively. The PIC value averaged 0.75. Of all the markers, two presented a low number of alleles, low heterozygosity, and low PIC (ThuAla-mt-11, ThuAla-mt-13) (Table 3). 16 markers showed an Fis estimate significantly greater than zero (Table 3) and deviated from HWE. Null alleles may be present at 9 markers (ThuAla-mt-01, ThuAla-mt-03, ThuAla-mt-04, ThuAla-mt-17, ThuAla-mt-19, ThuAla-mt-20, ThuAla-mt-21, ThuAla-mt-22, and ThuAla-mt-27) (Table 3), as is also suggested by the significant excess of homozygotes (heterozygosity deficit). In these loci there was no evidence for scoring error due to stuttering and no evidence for large allele dropout. However, the significant null allele frequency in ThuAla-mt-03, ThuAla-mt-21, and ThuAla-mt-22 (Table 3) may be due to stuttering, resulting in possible scoring errors, as indicated by the highly significant shortage of heterozygote genotypes with alleles of one repeat unit difference. Concerning the linkage disequilibrium analysis, there is random association of alleles at all loci. Loci have a low genotyping error rate, giving exactly repeatable genotypes with an observed error rate of 0.00 and a low 95% CI, except ThuAla-mt-03, ThuAla-mt-17, ThuAla-mt-21, and ThuAla-mt-22 (Table 3). These results confirm the stuttering for ThuAla-mt-03, ThuAla-mt-21, and ThuAla-mt-22. For ThuAla-mt-17, the error rate may be due to null alleles.

Comparison of selected panel with and without «non-neutral» markers
POWSIM simulations indicated that the 34 independent «neutral» markers (Table 3) together with the 2 «non-neutral» markers (Table 4) were able to detect significant differences among samples with FST = 0.002 in around 90-95% of the tests and with FST = 0.005 in 100% of the tests (Table 5). The 34 high quality independent «neutral» markers alone were able to detect the same significant differences among samples with FST ≥ 0.002 in about 90-95% of the tests (Table 5). Finally, differentiation between the four areas was visualized by FCA with different numbers of markers (with and without "encoding markers" and potential Fnull markers (34-9 = 25 markers)) (Fig 4 and S3 Fig). The results obtained by FST power analysis and FCA provided evidence of the suitability of the 34 «neutral» microsatellite markers to determine the genetic relatedness among different populations and to evaluate their genetic variability. The addition of the two «non-neutral» markers neither improves nor degrades the analysis (Table 5, Fig 4 and S3 Fig). Jackknifing over loci gave similar FST values, around 0.0045 (standard deviation 0.00114). Global FST considering the 4 areas and the panel of potentially «neutral» microsatellite markers was low (0.005), with a 95% CI of 0.003-0.007. FCA plots differentiated area C from D, whereas A and B were more similar (Fig 4). However, the AMOVA does not support this result: the Phi-statistic between C and D was low (0.003) and not significant. The degree of differentiation between area pairs was low and not significant, except for weak significance for A-D and B-D and high significance between B and C (S4 Table). With regard to the 15 markers with high PI (Table 3), the number of individuals in each area may be insufficient, yet this probability may improve by increasing the number of markers providing high assignment discrimination.
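The jackknife over loci mentioned above can be sketched generically for a ratio-type FST estimator, in which the multilocus estimate is the sum of per-locus numerators (among-population variance components) divided by the sum of per-locus denominators (total variance components). This is a minimal illustration of the resampling scheme, not the GENETIX routine; the function name and the per-locus values are hypothetical.

```python
import math

def jackknife_over_loci(numerators, denominators):
    """Leave-one-locus-out jackknife for a ratio estimator sum(num) / sum(den).

    Returns (mean of leave-one-out estimates, jackknife standard error).
    """
    n = len(numerators)
    total_num, total_den = sum(numerators), sum(denominators)
    # Recompute the multilocus ratio with each locus removed in turn.
    loo = [(total_num - a) / (total_den - d)
           for a, d in zip(numerators, denominators)]
    mean = sum(loo) / n
    se = math.sqrt((n - 1) / n * sum((x - mean) ** 2 for x in loo))
    return mean, se

# Hypothetical per-locus variance components (not data from this study).
num = [0.0020, 0.0035, 0.0041, 0.0018, 0.0052]
den = [0.62, 0.81, 0.77, 0.45, 0.90]
print(jackknife_over_loci(num, den))
```

A small jackknife standard error relative to the estimate indicates that no single locus drives the multilocus FST, which is the property the per-marker comparison in the text is checking.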
SPOTG estimated that with 30 individuals per sampling area, 40 microsatellite markers are the minimum number required to detect evolutionary and ecological processes with a power > 50% (S3 Table). SPOTG estimated that with 34 microsatellites, a minimum of 35 individuals from each sampling area is necessary to obtain a power > 50%, and that with 300 individuals the power increases to > 80% (S3 Table). SPOTG will not run simulations on more than 500 individuals.

Discussion
The 454 GS FLX Titanium technology allowed fast development of polymorphic markers in albacore tuna, a non-model organism for which little genomic information was available. This technology is attractive in terms of cost and time and is effective in discovering high quality microsatellite markers for albacore tuna. This study provides the design of 1 670 microsatellite markers, with all their characteristics, which could be used for different genetics projects on tuna (such as those carried out by IOTC and ICCAT). Here, we chose a set of microsatellite markers from the available designed markers to investigate albacore population genetics. Hence, the set of microsatellite markers developed in this study provides an additional tool to scientists who are investigating the genetic stock structure of this species and its implications for conservation and management measures. The same annealing temperature for optimal primer pairs allows easy multiplexing and faster manipulation at lower cost. Moreover, these markers display perfect microsatellite motifs, making them easily usable in demographic inference, as in coalescent theory [71,29], which is a key question for albacore tuna (e.g., the implications of population structure inferences for tuna species in [72]). Finally, most of the 36 novel markers can also be used on other Scombridae species such as Thunnus albacares, Thunnus thynnus, Thunnus obesus, and Acanthocybium solandri.
The suitability of selected loci for population genetics analyses was assessed by computing several diversity and information content parameters and estimating the 95% CI for genotyping error rate using repeated blind genotyping of the test panel. Analyses on the 136 individuals from all 4 areas resulted in a significant deviation from HWE. The 36 novel markers discovered constitute a useful tool for obtaining detailed information on the genetic diversity and structure of this species and investigating its evolutionary history. Their high polymorphism, with the exception of 3 markers, demonstrates their value in the characterization and evaluation of genetic diversity within and between populations.
Of the 9 «non-neutral» microsatellite markers discovered, two markers (ThuAla-Hki-01 and ThuAla-Tcell-01) were also characterized because of their link to sequences with a potential role in a main biological trait (immunity and energy). Assessing statistical power by POWSIM confirmed that the panel of 25 «neutral» markers and that of 34 «neutral» markers (Tables 2 and 3) could detect high levels of differentiation. However, many markers have a high PI (15 of the 34 «neutral» markers). In this study, FCA plots differentiated area C from D, whereas A and B were more similar. FCA gave the best discrimination with all the markers. However, AMOVA did not support this discrimination, particularly between C and D, and the Phi-statistics were low. Individuals are harder to assign to populations separated by low genetic differentiation, as is the case for albacore with its very low FST. The SPOTG simulations were based on the mean number of alleles from this study. A higher number of individuals will increase the number of alleles and hence decrease the number of markers necessary to obtain an assignment power > 95%. The analysis by SPOTG revealed the necessity to increase the number of individuals and/or markers to detect evolutionary and ecological processes. Hence, we cannot draw conclusions from the population genetics analysis because of the low number of individuals per sampling area. An increase in the number of individuals is required to assess the connectivity of albacore between geographic areas (Indian and Atlantic oceans).
Tests with increasing and decreasing numbers of individuals are encouraged to identify the sample size that ensures the best individual assignment. Population structure and migration of albacore tuna is a challenging scientific question, but it is also a key question that needs to be addressed for the management of this species at the ocean-wide scale. There are at least six genetically distinct stocks of albacore, located in the North and South Pacific Ocean, the North and South Atlantic Ocean, the Indian Ocean, and the Mediterranean Sea [9,42,47,73,74]. Doubt remains about the heterogeneity of stocks between the South Atlantic and Indian Oceans [14]. Small numbers of albacore may undertake inter-oceanic migrations between the South Atlantic Ocean and the Indian Ocean [75]. Nevertheless, results are contrasting between the South Atlantic and Indian Oceans, with genetic homogeneity reported on one side [44] and heterogeneity on the other [7,76,77]. The genetic studies that did not detect any differentiation between populations may not have had enough resolution in the markers (type, polymorphism, and number), and/or the number of individuals sampled may have been too low.
A small number of «neutral» markers may not reflect inbreeding depression because they are unlikely to represent genome-wide changes in homozygosity ([78] cited by [22]). Fine-scale genetic population structure often needs a large number of polymorphic microsatellite markers, and the final panel of microsatellite markers in this study corresponds to the generally recommended number [27,79,80,81], under the condition of a minimum number of albacore individuals sampled. This panel could be expanded with existing markers (total 18) from the literature on albacore population genetics studies [7,42,44,47,82].

Supporting information
S2 Table. Information on the alignment analysis corresponding to microsatellite (micro.) markers developed for albacore. 9 «non-neutral» markers and 1 marker (ThuAla-mt-30) that aligns to a Cottus gobio microsatellite sequence. (XLSX)
S3 Table. Assignment power with SPOTG simulations using different numbers of markers and individuals sampled.
S4 Table. Matrix of pairwise Phi-statistics from AMOVA analysis. The lower matrix shows the Phi-statistics values for the four geographic locations of albacore (A, B, C, and D). Significance was estimated using Monte Carlo tests and 1 000 permutations; * P-value < 0.05, ** P-value < 0.01, *** P-value < 0.001. (XLSX)