Evaluation of genetic diversity among Russet potato clones and varieties from breeding programs across the United States

DNA fingerprinting is a powerful tool for plant diversity studies, cultivar identification, and germplasm conservation and management. In breeding programs, fingerprinting and diversity analysis provide insight into the extent of genetic variability available in the breeding material, which in turn helps breeders maintain a pool of highly diverse genotypes by avoiding the selection of closely related parents. Oblong-long tubers with russeting skin characterize Russet potato, a primary potato market class in the United States, especially in the western production regions. The aim of this study was to estimate the level of genetic diversity within this market class, utilizing clones and varieties from various breeding programs across the United States. A collection of 264 Russet and non-Russet breeding clones and varieties was fingerprinted using 23 highly polymorphic genome-wide simple sequence repeat (SSR) markers, resulting in 142 polymorphic alleles. The number of alleles produced per SSR varied from 2 to 10, with an average of 6.2 alleles per marker. The polymorphic information content and expected heterozygosity of SSRs ranged from 0.37 to 0.89 and 0.50 to 0.89 with an average of 0.77 and 0.81, respectively. Out of these 23 markers, we propose nine SSR markers best suited for fingerprinting Russet potatoes based on polymorphic information content, heterozygosity and ease of scoring. Diversity analysis of these clones suggests that there is significant diversity across the breeding material and that the diversity is evenly distributed among all the regional breeding programs.


Introduction
Potato (Solanum tuberosum L.) is the third most important carbohydrate source in the human diet after rice and wheat [1]. It is an autotetraploid, vegetatively propagated crop grown in temperate, subtropical and tropical regions [2,3]. Solanum is one of the most species-rich genera of flowering plants [4], and potato possesses tremendous intraspecies genetic variability and morphological plasticity. The ploidy level varies from diploid to hexaploid [5]; however, the cultivated potato is generally tetraploid (2n = 4x = 48). In addition to being consumed fresh and in various processed forms, potato has important industrial applications that include manufacturing starch, processed foods and alcoholic beverages. Potato tubers contain valuable nutrients including carbohydrates, vitamins, proteins, fiber, antioxidants, calcium, potassium, phosphorus and iron [6][7][8].
Among various potato types grown in the United States, Russet potato is the most popular market class. Russet potatoes are oval-oblong to long with russeting skin that varies in color from tan to darker brown and is characterized by netting on the skin. The flesh is usually white and very firm. Russet potato tubers usually range from 3-8 inches in length and 1.5-3 inches in width (76.2-203.2 mm length and 38.1-76.2 mm width). Russet potatoes in the United States are consumed mostly as French fries, baked, mashed, roasted or dehydrated. The higher dry matter content of this market class is desirable for low oil uptake during frying, thus making Russet potatoes one of the best choices for French fries [9].
The evaluation and quantification of genetic variation in plants is an important aspect of breeding and crop improvement programs. Molecular markers are important tools in plant breeding programs that can be utilized to improve yield, quality, disease resistance and stress tolerance [10]. Genetic markers based on simple sequence repeats (SSRs) are highly polymorphic, co-dominant, well conserved across related species and follow Mendelian inheritance patterns, which makes them ideal markers for studying genetic diversity [11][12][13]. SSRs have been extensively used in diversity analysis and identification of cultivated potato clones [14][15][16][17][18][19][20][21][22][23][24]. Although many SSR studies have been done in potato in general, SSR analysis within the Russet group is very limited. The only study done on Russet varieties to date is by Karaagac et al. [25], who used 25 SSR markers to genotype 54 potato clones including some released Russet varieties and found that all the Russet clones fell within the same cluster. With increasing interest in processed potato products worldwide, and with Russet varieties being the primary market class for this end use, it is important to understand the diversity within the Russet potato gene pool utilized among the breeding programs throughout the United States. The major objective of the present study is to quantify the genetic variability among the Russet potato clones and study the genetic relationships based on their pedigrees. This study will also determine whether the available markers can be utilized to distinguish among Russet breeding clones currently being evaluated by breeding programs for release as improved potato varieties.

Plant materials
Tubers from 264 potato clones were collected from seven major potato-breeding programs across the United States [Pacific Northwest (NWPVD), Colorado (CO), Maine (ME), Minnesota (MN), Wisconsin (WI), North Dakota (ND) and Maryland (MD)]. This collection includes 198 Russet breeding selections, 50 released Russet varieties and 16 non-Russet potatoes (chip, specialty and germplasm). The details of the Russet selections, including pedigree information and the associated breeding programs, are presented in the S1 Table. The details of released Russet varieties and non-Russet potato clones used in the study are presented in Table 1 and Table 2, respectively. Seed tuber pieces were planted in the greenhouse and leaf material was collected from the emerged shoots for DNA isolation.

Genomic DNA extraction
DNA was isolated from young tender leaves using the Mag-Bind® Plant DNA Plus 96 Kit (Omega Bio-tek, Norcross, Georgia, USA) according to the instruction manual. DNA quality and quantity were determined by agarose gel electrophoresis and spectrophotometer (NanoDrop™, Thermo Scientific, Waltham, Massachusetts, USA), respectively. All the samples were diluted to a concentration of 20 ng/μl with nuclease-free water.

SSR fingerprinting
Primer testing. Thirty-two SSR markers identified previously as being highly informative in potato [17,21,24,26,27,28] [Scottish Crop Research Institute (SCRI, unpublished)] were selected and used on 24 Russet potato clones to identify primer pairs with scorable polymorphisms. Twenty-three primer pairs were shortlisted and used to fingerprint 264 potato clones for further analysis (Table 1, Table 2, and S1 Table). The majority of these SSR markers (60%) are composed of trinucleotide repeat motifs, followed by 30% dinucleotide and 10% tetranucleotide repeat motifs. The forward primer of each SSR marker was fluorescently labeled with either 6-FAM, 5-HEX or NED. PCR products of 8-9 primer pairs, each with one of three different fluorescent labels and compatible amplicon sizes, were multiplexed before capillary electrophoresis.
Polymerase chain reaction. Polymerase chain reactions (PCR) were performed in 10 μl volumes using 1X AmpliTaq Gold® 360 master mix (Life Technologies, Carlsbad, California, USA), 0.2 μM of each primer (forward and reverse) and 20 ng DNA. The amplification cycle was performed on a 96-well thermal cycler (Applied Biosystems, Foster City, California, USA) as follows: one cycle at 95˚C for 5 min followed by 40 cycles at 95˚C for 40 s, annealing at 54-60˚C for 50 s, 72˚C for 40 s, ending with one cycle at 72˚C for 10 min. PCR amplicons were separated on a 2% agarose gel with a 100 bp DNA ladder (Promega, Madison, Wisconsin, USA) as the size standard. Gels were stained with ethidium bromide (0.5 μg/ml) for 20 min and destained with distilled water for 20 min on an orbital shaker. DNA bands were visualized and recorded on a GelDoc™ XR+ (Bio-Rad, Hercules, California, USA).
Capillary electrophoresis. Two microliters of labeled PCR product from each primer pair were pooled to prepare the respective multiplex set and diluted with sterile deionized water up to a final volume of 180 μl. Subsequently, a 1.2 μl aliquot of the diluted sample was denatured and size fractionated using capillary electrophoresis on an ABI 3730 DNA analyzer (Applied Biosystems, Life Technologies, Carlsbad, California, USA) with an internal-lane size standard (GeneScan™ 500 ROX™) at the core facility of the Center for Genomic Research and Biocomputing (CGRB), Oregon State University, Corvallis, Oregon, USA.

Data analysis
SSR genotyping, Neighbor-Joining and STRUCTURE analysis. Capillary electrophoresis data were scored using GeneMapper® Software v4.1 (Applied Biosystems, Foster City, California, USA). Peak sizes were recorded and the number of alleles, polymorphic information content (PIC) and expected heterozygosity (He) were calculated using an online PIC calculator (https://www.liverpool.ac.uk/~kempsj/pic.html). SSR fingerprints of the 264 potato clones are presented in S1 File. Binary data (0/1) were used to calculate a "dissimilarity index" using the Jaccard coefficient. Factorial analysis was performed using the dissimilarity index, and a genetic diversity tree (dendrogram) was constructed using the weighted Neighbor-Joining (NJ) method in Darwin 6.0.1.2 [29]. Genetic structure analysis was performed using the Bayesian model-based software Structure 2.3.4 [30]. Based on the results of the NJ analysis, each hypothetical number of sub-populations (K = 1 to 10) was run in three independent replicates with a burn-in period of 100,000 and 200,000 Markov Chain Monte Carlo (MCMC) iterations. The value of ΔK was calculated using Evanno's method in Structure Harvester [31,32]. In order to determine the population stratification, the 264 clones were run at K = 3 with a burn-in period of 100,000 and 200,000 MCMC iterations using the admixture model. Allele frequency divergence among the clusters, fixation index (Fst) and the average distance among individuals in the same cluster were calculated using Structure 2.3.4.
SNP genotyping. In order to measure the differentiation power of SSR markers, SSR data of a subset of 21 Russet clones were compared with SNP data generated in our previous study [33]. Briefly, SNP genotyping was performed using the Infinium SolCAP 12K array (12,808 SNPs) and intensity data were analyzed using GenomeStudio (Illumina, San Diego, California, USA). SNP markers with a 10% "no call" rate were dropped from the study. NJ trees were constructed using SSR and SNP marker data separately in Darwin 6.0.1.2 as described above. A tanglegram [34] comparing the NJ trees was constructed using Dendroscope 3.5.9 [35].
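As an illustration of the two marker statistics named under SSR genotyping, the sketch below computes expected heterozygosity and PIC from a list of allele frequencies using the standard formulas He = 1 - Σ p_i² and PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². It is a minimal sketch and not the code behind the online calculator used in this study; the function names are hypothetical, and how allele frequencies were derived from tetraploid peak data is not specified here, so frequencies are simply taken as input.

```python
from itertools import combinations


def normalize(counts):
    """Convert raw allele counts into frequencies that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive number")
    return [c / total for c in counts]


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over pairs i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term
```

For a marker with two equally frequent alleles these functions give He = 0.50 and PIC = 0.375, the largest values a two-allele marker can reach under these formulas. PIC never exceeds He because the cross term is non-negative.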

SSR fingerprinting
Of 32 SSR markers tested on a set of 24 Russet clones, 25 markers showed scorable polymorphism. Two of the markers (STM1106 and STI0003) were dropped from the final analysis, as they produced inconsistent allelic patterns. In total, 23 markers that produced clear, consistent polymorphic alleles (Table 3) were used for genotyping 264 potato clones collected from various potato breeding programs (Table 1, Table 2 and S1 Table). For multiplexing and for ease and accuracy of scoring alleles, primers were fluorescently labeled and separated by capillary electrophoresis (Table 3). The twenty-three SSR markers used in the study spanned all 12 chromosomes of the potato genome with an average of two markers per chromosome. Chromosome VIII had a maximum of five markers, whereas Chromosomes V and X had one marker each. The observed product size and the expected product size showed minor variations in a few of the markers, but the majority of them were within the expected range (Table 3). The total number of alleles ranged from 2-10 with an average of 6.2 alleles per marker. SSR marker STG0001 produced the maximum of 10 alleles, while STM1053 amplified only two alleles. Eleven of the markers produced rare alleles (alleles amplified in at most two clones) (Table 3). Marker  [33]. For future fingerprinting and genetic analyses, all the SSR markers in this study were rated on a scale of 1-3 for ease of scoring (1: easy to score, 2: moderate and 3: difficult). The majority of the primers (12) were easy to score, followed by difficult to score (6) and moderate to score (5). We propose a set of the nine best markers in terms of PIC, He scores and ease of scoring for Russet class potatoes. These markers include STI0033, STM1016, STM5114, STG0016, STM5140, STM1052, STG0001, STI0012 and STM5127. Four of these markers, STI0033, STM1016, STG0016 and STI0012, were also reported in our previous study [33] as the best markers for Russet potato varieties released by Northwest Potato Variety Development (NWPVD) based on their PIC, He values and ease of scoring.

Genetic diversity analysis
Analysis of genetic diversity plays a major role in correct utilization of germplasm for crop improvement programs. Germplasm with high genetic variation is a valuable resource for breeding programs. The panel of 264 clones used in this study represents 198 Russet selections (NWPVD: 114, ME: 49, CO: 16, WI: 10 and MN: 09) (S1 Table), 50 released Russet varieties (NWPVD: 25, CO: 11, ND: 05, ME: 03, MD: 02, WI: 02, BV de ZPC, Netherlands: 01 and  Table 2) from various potato breeding programs. Neighbor-Joining clustering analysis of these clones revealed three clusters (two major and one minor) (Fig 1). No tight clustering was observed based on the origin or geographical location of the breeding programs where the Russet clones were bred, indicating adequate gene flow across the breeding programs. This is likely because of the continuous exchange of early-generation seedling tubers among the potato breeding programs across the United States. However, groupings based on the lineage/pedigree of the clones were observed. As a result, Russet selections with one or both parents in common tend to cluster together along with their parental clones in the same group (Group 1A, Group 1B, Group 2A, and Group 2B) (S2, S3 and S4 Files). Cluster 1 comprises 101 clones and is further divided into three groups (1A, 1B and 1C) (Fig 2 and S2 File). Group 1A is the largest group in this cluster and is composed of a mix of Russet and non-Russet clones: 29 from NWPVD, 12 from ME, 10 from CO, three from WI, two from MN and one each from LA, MD, ND and the Commonwealth Potato Collection, Scotland, United Kingdom. Nine out of 16 non-Russet clones, 'Red La Soda', 'Pallida CPC', P2-4, 'Snowden', 'Chipeta', 'TerraRossa', 'Yukon Nugget', 'Harvest Moon' and 'Masquerade', are placed in this group along with promising Russet varieties Coastal Russet, Fortress Russet, Ute Russet, Centennial Russet, Wallowa Russet and Dakota Russet. 'Russet Burbank', the oldest Russet potato, is also placed in this group and is closely clustered with breeding selection A06968-4. One of the prominent clones, A06021-1T (to be released as 'La Belle Russet'), is also placed in this group. S. etuberosum introgressed clones ETB6-21-3 and A00ETB12-2 are placed together in this group and are closely clustered with 'Pallida CPC', P2-4, 'Snowden' and 'Chipeta'. In addition to S. etuberosum, this group includes prominent selections with potato virus Y and Globodera pallida resistance. Important clones in the pedigree include 'GemStar Russet', 'Gem Russet', 'Katahdin', 'Lenape' and 'Wischip'.
Group 1B comprises 35 clones: 13 from NWPVD, nine from ME, five from CO, four from WI, two from ME and one each from NY and Agriculture Canada (Fig 2 and S2 File). Russet varieties clustered in this group include Allagash Russet, Classic Russet, Clearwater Russet, Echo Russet, GemStar Russet, Keystone Russet, Mesa Russet, Owyhee Russet, Silverton Russet and Teton Russet. 'Mesa Russet', a progeny of 'Silverton Russet', and 'Teton Russet', a progeny of 'Classic Russet', are placed in this group. In addition, two non-Russet clones, 'Shepody' and Q115-6 (a germplasm clone with tuber worm resistance), are also placed in this group. This group is also characterized by clones (Classic Russet, Owyhee Russet and Teton Russet) having tubers with typy appearance, an important trait for the fresh market. Group 1C is the smallest in this cluster with only five clones, four from NWPVD and one (AF4882-3) from ME (Fig 2 and S2 File). The major cultivar in this group is Ranger Russet, which is characterized by long tubers with good processing quality. AO96365-3, a selected progeny of 'Ranger Russet' with good processing traits, is also placed in this group.
Cluster 2 is the largest cluster with 142 clones. It is further divided into three groups, two major (2A and 2B) and one minor (2C) (Fig 3 and S3 File). Group 2A is the largest group with 63 clones: 41 from NWPVD, 13 from ME, five from CO, two from MN and one clone each from ND and WI. Similar to Group 1B, this group mainly consists of clones with fresh market potential, namely 'Russet Norkotah', 'Reeves Kingpin', 'Rio Grande Russet' and their offspring. Out of 50 released Russet varieties, 14 clustered in this group: Castle Russet, Caribou Russet, Century Russet, Defender, Gem Russet, Klamath Russet, Payette Russet, Pioneer Russet, Premier Russet, Reeves Kingpin, Rio Grande Russet, Russet Norkotah, Russet Nugget, and Targhee Russet. 'Russet Norkotah' variant selections S3 and S8 are also placed in this group and are tightly clustered with 'Russet Norkotah'. 'Caribou Russet' and its female parent 'Reeves Kingpin', as well as 'Targhee Russet' and its offspring A07061-6, were placed next to each other in this group.
Group 2B is composed of 59 clones: 35 from NWPVD, nine each from ME and CO, three from WI, two from MN and one clone from MD (Fig 3 and S3 File). Released Russet varieties in this group are Alpine Russet, Belrus, Blazer Russet, Canela Russet, Crestone Russet, Freedom Russet, Mercury Russet, Millennium Russet, Pallisade Russet, Summit Russet, Umatilla Russet and Western Russet. Surprisingly, the S. etuberosum germplasm clone A00ETB12-3 is also clustered in this group. Most of the released varieties (Blazer Russet, Mercury Russet, Pallisade Russet and Umatilla Russet) are clustered in this group with their offspring.
Group 2C is composed of 21 clones: nine from NWPVD, eight from ME and one each from ND, CO, BV de ZPC, Netherlands and CIP (Fig 3 and S3 File). The prominent Russet varieties in this group include Dakota Trailblazer, Highland Russet, Innovator, and Sage Russet. 'Tacna', a non-Russet variety released by CIP, is also placed in this group.
Cluster 3 is composed of 20 clones and is divided into two groups (3A and 3B) (Fig 4 and S4 File). Group 3A has 12 clones: six from NWPVD, two from CO and one clone each from MD, WI, MN and Agriculture Canada. The chip processing cultivar Atlantic and the fresh market table stock specialty cultivar Yukon Gold are placed in this group along with hybrids of 'Russet Norkotah'. Group 3B consists of only eight clones, six from NWPVD and one each from CIP and WI. This is the smallest group and contains clones with late blight resistance from LBR-8 (S4 File).
Analysis of the grouping of clones in the clusters suggests that there is a thorough mixing of germplasm among the breeding programs. NWPVD clones, which include clones from the USDA-ARS potato-breeding program at Aberdeen, ID and the potato-breeding and variety development program at Oregon State University, OR, are evenly distributed in all the clusters. The USDA-ARS at Aberdeen, ID is a primary contributor to the NWPVD program, with reciprocal exchange of material with other United States breeding programs. The results of the present study also support the fact that there is a continuous reciprocal exchange of breeding material among various potato-breeding programs, which has resulted in the clustering of mixed genotypes in the diversity analysis.
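The Neighbor-Joining clusters discussed above start from the Jaccard-based dissimilarity index computed on binary (0/1) allele scores. A minimal sketch of that first step is given below. It is illustrative only and not the Darwin implementation; the function names and the clone labels in the usage note are hypothetical, and treating two all-zero profiles as having zero dissimilarity is an assumption made here to avoid division by zero.

```python
def jaccard_dissimilarity(a, b):
    """1 - (alleles present in both) / (alleles present in at least one)."""
    if len(a) != len(b):
        raise ValueError("allele profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        # Assumption: two profiles with no alleles scored are treated as identical.
        return 0.0
    return 1.0 - shared / either


def dissimilarity_matrix(profiles):
    """Pairwise dissimilarities for a dict of {clone name: 0/1 allele list}."""
    names = list(profiles)
    return {
        (m, n): jaccard_dissimilarity(profiles[m], profiles[n])
        for m in names
        for n in names
    }
```

For example, `dissimilarity_matrix({"clone_A": [1, 1, 0, 0], "clone_B": [1, 0, 1, 0]})` gives a dissimilarity of 2/3 between the two placeholder clones, because they share one allele out of the three present in at least one of them. Shared absences (0/0) do not contribute, which is the property that distinguishes the Jaccard coefficient from simple matching.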

Genetic structure analysis
Structure analysis was performed to determine the amount and distribution of genetic variation in Russet potato clones. Structure is an efficient software package for examining the genetic structure of different populations and inferring the origins of individuals in an admixed population. Population structure of all 264 clones used in this study was analyzed using a Bayesian-based approach under the admixture model. The evaluation of ΔK using Evanno's method showed a peak at K = 3 (S1 Fig), which indicated that the entire panel of 264 clones can be grouped into three clusters based upon the differences in their genetic makeup.
Structure analysis revealed significant admixture in the breeding material, and no fixed clusters representing any particular group of clones based upon location of the breeding program were observed. There is one major cluster (cluster 3) and two minor clusters (clusters 1 and 2). All the genetic groups display a significant level of admixture within each cluster (Fig 5 and S2 Fig). The detailed composition of the three clusters with their pedigree and breeding programs is presented in the S5 File. Structure clusters show significant congruence with the clusters from the Neighbor-Joining analysis; cluster 1, cluster 2 and cluster 3 roughly correspond to group 2, Russet, Highland Russet, Klamath Russet, Keystone Russet, Mesa Russet, Millennium Russet, Owyhee Russet, Pallisade Russet, Reeves Kingpin, Sage Russet, Summit Russet, Umatilla Russet and Wallowa Russet. Non-Russet varieties that grouped in this cluster include Belrus, Innovator, Masquerade, Shepody and Yukon Gold.
In addition, breeding selection A06021-1T (to be released as 'La Belle Russet') was also placed in this cluster. The detailed summary of all three clusters with their pedigree and breeding program information is presented in S5 File.
Allele frequency divergence among the three clusters ranged from 0.06 to 0.11 with a mean value of 0.08, which signifies a moderate amount of gene flow between the sub-populations or sub-clusters (Table 4). This further supports our presumption of continuous exchange of breeding material between various potato breeding programs. The average distance among the clones in the same cluster ranges from 0.39 to 0.44 with an average value of 0.41. Cluster 2 showed the highest heterozygosity among the individual clones, indicating that it is highly diverse, whereas cluster 1 showed the lowest heterozygosity (Table 4). The fixation index (Fst) measures the substructure and genetic diversity present in a set of individuals or subpopulations. Across the three clusters, Fst ranges from 0.25 to 0.29 with an average of 0.27, indicating significant differentiation among the 264 clones used in the present study.
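The ΔK statistic used above to choose the number of clusters can be sketched as follows. This is a minimal illustration of Evanno's formula, ΔK = |L(K+1) - 2L(K) + L(K-1)| / sd(L(K)), where L(K) is the mean log probability of the data across replicate runs at a given K. It is not the Structure Harvester code: the function name is hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} from {K: [ln P(D) for each replicate run]}.

    Delta K is undefined for the smallest and largest K tested (no neighbor
    on one side) and for any K whose replicates have zero spread; those
    values of K are omitted from the result.
    """
    ks = sorted(lnp_by_k)
    avg = {k: mean(lnp_by_k[k]) for k in ks}
    delta = {}
    for k in ks:
        if (k - 1) not in avg or (k + 1) not in avg:
            continue
        sd = stdev(lnp_by_k[k])  # assumption: sample standard deviation
        if sd == 0:
            continue
        delta[k] = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1]) / sd
    return delta
```

With K = 1 to 10 and three replicates per K, as in this study, the function would return values for K = 2 to 9, and the K with the largest ΔK is taken as the most likely number of clusters. At least two replicates per K are needed for the standard deviation to be defined.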

SSR markers as powerful fingerprinting tool
SSR markers have the potential to detect a high level of variation, which increases the resolution of genetic diversity studies and thus reduces the number of markers required to distinguish between distinct genotypes. A subset of 21 clones (18 Russet and 3 specialty clones) was used to compare the power of SSR markers with SNP markers. The clustering analysis of these clones was performed using SNP data previously reported by Bali et al. [33] and SSR data generated in the present study. Comparison of clusters using a tanglegram revealed that the majority of the groupings show congruence between the two NJ trees (Fig 6). The SNP-based tree resulted in two clusters (one major and one minor) with 'Russet Burbank' as an outlier, whereas the SSR marker-based NJ tree divided the 21 clones into three clusters (one major and two minor) (Fig 6). SSR data grouped 'Russet Burbank' along with 'Ranger Russet' and 'Defender', whereas SNP data presented it as an outlier. Defender, Highland Russet, Owyhee Russet, Premier Russet, Sage Russet, Umatilla Russet and Yukon Gold reshuffled between the two trees. SSR markers separate out Highland Russet, Owyhee Russet, Sage Russet and Yukon Gold into a different cluster. Overall, the grouping of clones is similar in both analyses. Therefore, we propose that 23 informative SSRs can be as powerful as thousands of SNP markers for genetic diversity analysis in potato.

Conclusion
In the present study, we report the fingerprinting and diversity analysis of a large set of Russet breeding clones collected from various breeding programs across the United States. Our analysis could not separate Russet selections according to the breeding programs they originated from, which is indicative of a free flow of germplasm among the potato breeding programs across the United States. Further, the SSR markers used in the study allowed differentiation among Russet clones and varieties, and characterization of genetic relationships through the clustering of more closely related material. Characterization of the genetic diversity of these clones can aid breeders in choosing desirable parents to exploit hybrid vigor and minimize inbreeding depression. Thus, these 23 SSR markers, separately or in tandem with SNP markers, can aid in variety identification, including the detection of misclassifications and duplications among varieties.
Supporting information S1