Genetic diversity analysis and marker-trait associations in Amaranthus species

  • Norain Jamalluddin,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Writing – original draft

    Affiliation Future Food Beacon, School of Biosciences, University of Nottingham Malaysia, Jalan Broga, Semenyih, Selangor, Malaysia

  • Festo J. Massawe,

    Roles Conceptualization, Funding acquisition, Investigation, Project administration, Supervision, Writing – review & editing

    Affiliation Future Food Beacon, School of Biosciences, University of Nottingham Malaysia, Jalan Broga, Semenyih, Selangor, Malaysia

  • Sean Mayes,

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliations Future Food Beacon, School of Biosciences, University of Nottingham Malaysia, Jalan Broga, Semenyih, Selangor, Malaysia, Plant and Crop Sciences, Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough, Leicestershire, United Kingdom, Crops for the Future (UK) CIC, NIAB, Cambridge, United Kingdom

  • Wai Kuan Ho,

    Roles Formal analysis, Investigation, Methodology, Writing – review & editing

    Affiliations Future Food Beacon, School of Biosciences, University of Nottingham Malaysia, Jalan Broga, Semenyih, Selangor, Malaysia, Crops for the Future (UK) CIC, NIAB, Cambridge, United Kingdom

  • Rachael C. Symonds

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Project administration, Supervision, Validation, Writing – review & editing

    R.C.Symonds@ljmu.ac.uk

    Affiliation Liverpool John Moores University, School of Biological and Environmental Sciences, Liverpool, Merseyside, United Kingdom

Abstract

Amaranth (Amaranthus spp.) is a highly nutritious, underutilized vegetable and pseudo-cereal crop. It possesses diverse abiotic stress tolerance traits, is genetically diverse and highly phenotypically plastic, making it an ideal crop to thrive in a rapidly changing climate. Despite considerable genetic diversity there is a lack of detailed characterization of germplasm or population structures. The present study utilized the DArTSeq platform to determine the genetic relationships and population structure among 188 amaranth accessions from 18 agronomically important vegetable, grain, and weedy species. A total of 74,303 SNP alleles were generated, of which 63,821 were physically mapped to the genome of the grain species A. hypochondriacus. Population structure was inferred in two steps: first for all 188 amaranth accessions comprising 18 species, and second for only the 120 A. tricolor accessions. After SNP filtering, a total of 8,688 SNPs were retained for 181 amaranth accessions of 16 species and 9,789 SNPs for 118 A. tricolor accessions. Both SNP datasets produced three major sub-populations (K = 3) and generated a consistent taxonomic classification of the amaranth sub-genera (Amaranthus Amaranthus, Amaranthus Acnida and Amaranthus Albersia), although the accessions were poorly demarcated by geographical origin and morphological traits. A. tricolor accessions were well discriminated from other amaranth species. A genome-wide association study (GWAS) of 10 qualitative traits revealed associations between specific phenotypes and genetic variants within the genome and identified 22 marker-trait associations (MTAs) and 100 MTAs (P≤0.01, P≤0.001) in the 16 amaranth species and 118 A. tricolor datasets, respectively. The release of SNP markers from this panel has produced invaluable preliminary genetic information for phenotyping and cultivar improvement in amaranth species.

Introduction

Climate predictions indicate that the agriculture sector in many parts of the world will be subjected to increasingly detrimental weather conditions such as droughts and elevated temperatures, directly impacting global food supply chains. A strategy to mitigate climate-related agricultural losses is to diversify the food basket with a wide range of underutilized crop species with increased abiotic stress tolerance traits [1]. Amaranth (Amaranthus spp.), an ancient, nutrient-dense and climate-smart crop, has a high degree of genetic variation, environmental adaptability and phenotypic plasticity [2, 3]. Amaranth belongs to the Amaranthaceae family and is a C4 dicotyledonous plant [4]. It consists of approximately 60–70 species grouped into three sub-genera: Amaranthus Albersia (vegetable amaranth), Amaranthus Amaranthus (cultivated grain amaranth) and Amaranthus Acnida (weedy amaranth) [5].

Amaranthus tricolor is a leafy vegetable amaranth species, widely cultivated in South Asia and Africa [6, 7], and is an excellent source of vitamins, protein, carotenoids, minerals and antioxidants, at levels greater than in other leafy vegetables such as lettuce and spinach [2, 8]. A. tricolor has the capacity to alter its physiological characteristics in response to environmental changes, for instance by increasing transpiration efficiency [9] and accumulating compatible solutes such as proline in response to drought stress [10]. It also has high genetic and phenotypic diversity, which may provide an excellent opportunity for varietal development with increased drought tolerance characteristics [11–13].

Correct genotypic identification and preservation of genetic variation in amaranth is important to maintain ecotypes with desired traits useful for breeding programmes. The assembly of a very high-quality genome sequence of the grain amaranth Amaranthus hypochondriacus (“Plainsman” cultivar) by [14] allows anchoring of genotyping-by-sequencing (GBS) markers for all the SNP loci and allele sequences discovered, and GBS has proven to be the most efficient method to evaluate genetic diversity of grain amaranth as well as to validate the phylogeny of the genus [15–17]. This genome assembly was used as a reference genome for an annotation framework and gene discovery of MYB-like transcription factor genes that regulate the betalain red pigment pathway, which gives rise to stem and seed colour variations [18], through traditional bi-parental mapping [14] and now through genome-wide association studies (GWAS) [19]. More recently, this Plainsman reference genome, together with low-coverage PacBio reads and the contigs of an amaranth draft genome [20], was used to assemble A. hypochondriacus (A.hyp_K_white), a landrace cultivated in India [21]. This assembly offers a better reference genome for the improvement of grain and vegetable amaranth crops in South Asia as it is genetically closer to most landraces and accessions originating from India and South Asia.

Nevertheless, vegetable species of amaranth have been less studied by molecular means than pseudo-cereal grain amaranths and weed species, especially as the latter two are phylogenetically related and the domestication events separating them have been revealed [15, 19, 22, 23]. Limited knowledge of the genetic diversity in these leafy vegetable amaranth species and the lack of suitable molecular markers hamper breeding efforts. Cultivar development and improvement rely on access to a well characterised, genetically diverse pool of material, and so a comprehensive knowledge of these genetic relationships is essential. To date, there is only one molecular study that exploits a large number of A. tricolor accessions, using simple sequence repeat (SSR) markers and the matK protein-coding chloroplast gene, which concluded that the genetic diversity in Vietnamese amaranths was established by dispersal events mainly from East Asia and adaptation to local environments [24]. While the amaranth marker studies have been useful for evolutionary and phylogenetic studies, further germplasm characterization and marker validation is needed.

GBS offers a number of potential advantages over SSR markers; it is more practical and inexpensive, and has enabled genotyping to be applied to non-model organisms [25, 26]. DArTSeq™ technology, based on GBS methods, is a platform developed by Diversity Arrays Technology Pty Ltd. (Canberra, Australia) for high-throughput genotyping via an intelligent selection of the genome fraction, targeting active genes and low-copy DNA areas [27, 28]. The present study is the first to utilize the DArTSeq platform in amaranth to determine the genetic relationships and population structure among 188 amaranth accessions from 18 agronomically important vegetable, grain, and weedy species. This study also aimed to investigate the genetic relationships among a numerically larger group of A. tricolor accessions. The development of SNP markers from this panel has allowed a GWAS analysis on morphological traits such as shape, size, and colour of the leaf, stem, and inflorescence. These traits are fast and easy to assess for direct use by farmers and are of great help to plant breeders when selecting potential parental lines [12, 29]. This will facilitate understanding of the genetic bases and dissection of complex genes controlling economic traits such as drought tolerance, and provides useful information on the degree of genetic variation and its correlations with agronomic traits.

Materials and methods

Plant materials, growing conditions and morphological assessment

A total of 188 amaranth accessions, comprising 18 species originating worldwide, were used for genetic diversity analysis. Of the 188 accessions, 131 were obtained from the World Vegetable Center Genebank, Taiwan (AVRDC) and 52 from the United States Department of Agriculture Genebank (USDA), and five commercial varieties were included as checks: three African varieties from East-West Seed, Thailand, and two local varieties from Serbajadi Gardening, Malaysia (Table 1).

Table 1. List of 188 amaranth accessions and their morphological traits observed under shade-house conditions.

https://doi.org/10.1371/journal.pone.0267752.t001

A single plant of each accession was grown under shade-house conditions at University of Nottingham Malaysia (latitude 2.940°N, longitude 101.8740°E), with an average of 36°C daytime temperature, 28°C night temperature and 66% relative humidity. Plants were grown in a 16 x 12.5 x 14.5 cm plastic pot containing 2 kg black compost (Holland peat, Malaysia), irrigated daily to field capacity and at 3 weeks old, 3 g of 15N: 15P: 15K fertilizer was applied once to individual pots.

Ten qualitative traits were recorded using AVRDC descriptors (https://avrdc.org/seed/) (Table 1). Leaf, petiole and stem pigmentations, growth habit, branching index, and leaf shape and margin were recorded at 7 weeks post-emergence, and terminal inflorescence color, shape and attitude were recorded when all accessions had fully set (at 11 weeks post-emergence). Young leaf material was collected, snap frozen in liquid N2 and stored at -80°C for DNA analysis.

DNA extraction and DArTSeq genotyping

Total genomic DNA of 188 amaranth accessions was isolated from young leaves using a Qiagen DNeasy plant DNA extraction kit (Qiagen, USA), and DNA quality and quantity were evaluated using a Nanodrop spectrophotometer (Thermo Scientific, USA). The DNA concentration was adjusted to within the range of 50–100 ng/μl. For each sample, 2 μg of good quality, high molecular weight DNA was sent to Diversity Arrays Technology Pty Ltd, Canberra, Australia for DArTSeq analysis.

In brief, DArTSeq technology relies on the combination of a complexity reduction method to enrich genomic representations, followed by next-generation sequencing by HiSeq2000 (Illumina, USA), as described by Kilian et al. [27]. In this study, a combination of a rare-cutting methylation-sensitive restriction enzyme (RE), PstI, with a secondary frequently cutting RE, MseI, was selected to optimize the locus coverage, reproducibility and polymorphisms. The PstI-compatible adapter consists of the Illumina flow cell attachment sequence, a sequencing primer and a ‘staggered’ barcode region of varying length. The reverse adapter consists of the Illumina flow cell attachment region and the MseI overhang sequence. The ligated fragments with both a PstI and an MseI adapter were amplified via polymerase chain reaction (PCR) with a programme set to an initial denaturation step of 94°C for 1 min, followed by 30 cycles of denaturation at 94°C for 20 s, annealing at 58°C for 30 s and extension at 72°C for 45 s, before a final extension at 72°C for 7 min. Equimolar amounts of PCR products from each sample were combined, followed by single-end sequencing of 77 cycles on an Illumina HiSeq2500. Twenty-four DNA samples were also genotyped in two technical replications to assess the reproducibility of the marker data. The full SNP dataset is shown in S1 Table.

Data analysis

SNP filtering.

The SNP data generated from DArTSeq technology were first physically mapped to the Amaranthus hypochondriacus genome v2.1 [14] using CLC Genomics Workbench v8 (Qiagen), based on matches of aligned sequence tags against the reference genome, with an 80% length fraction and similarity fraction [29]. To investigate species-specific SNPs among 12 amaranth species (not including species with one representative), the mapped SNP markers were manually examined for SNPs unique to each species. The six species with the highest numbers of species-specific SNPs were subjected to a Venn diagram to visualize the SNP loci shared among the species. The Venn diagram of overlapping SNP loci was generated using the online program of the Van de Peer Lab (http://bioinformatics.psb.ugent.be/). Genetic diversity and population structure analysis was carried out in two steps. First, all 188 amaranth accessions consisting of 18 species were analyzed together and second, a subset of 120 A. tricolor accessions was analyzed separately, aiming to explore the genetic distances and population structure among the A. tricolor populations, which were of primary interest. In each dataset, the mapped SNP markers were trimmed by removing SNPs with <97% reproducibility, <70% call rate and <0.05 polymorphic information content (PIC), and SNPs located on minor contigs that had not been annotated. Individual accessions with >30% missing data and SNP loci with >30% missing data were removed. The most informative SNPs, with minor allele frequency (MAF) >0.05, imputed using TASSEL v5.2.52 software [30], were selected for further analysis.
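The filtering thresholds above can be illustrated with a minimal Python sketch. It is not the TASSEL or DArT pipeline used in the study: the `Snp` record, the 0/1/2 genotype encoding and all function names are assumptions made for illustration, PIC and reproducibility are taken as already-reported values, and the imputation step is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snp:
    snp_id: str                 # hypothetical marker name
    reproducibility: float      # concordance between technical replicates (0-1)
    pic: float                  # polymorphic information content, as reported
    calls: List[Optional[int]]  # per accession: 0, 1 or 2 copies of one allele; None = missing


def call_rate(snp: Snp) -> float:
    """Fraction of accessions with a non-missing genotype call."""
    return sum(c is not None for c in snp.calls) / len(snp.calls)


def minor_allele_frequency(snp: Snp) -> float:
    """Frequency of the rarer allele among non-missing calls."""
    called = [c for c in snp.calls if c is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def accessions_to_drop(snps: List[Snp], max_missing: float = 0.30) -> List[int]:
    """Indices of accessions whose share of missing calls exceeds max_missing."""
    n_accessions = len(snps[0].calls)
    return [
        i for i in range(n_accessions)
        if sum(s.calls[i] is None for s in snps) / len(snps) > max_missing
    ]


def filter_snps(snps: List[Snp], min_repro: float = 0.97, min_call_rate: float = 0.70,
                min_pic: float = 0.05, min_maf: float = 0.05) -> List[Snp]:
    """Keep SNPs that pass the reproducibility, call rate, PIC and MAF thresholds."""
    return [
        s for s in snps
        if s.reproducibility >= min_repro
        and call_rate(s) >= min_call_rate
        and s.pic >= min_pic
        and minor_allele_frequency(s) > min_maf
    ]
```

Note that a call rate of at least 70% is equivalent to at most 30% missing data per SNP locus, so a single threshold covers both criteria in this sketch.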

Population structure was inferred with STRUCTURE-like population genetic analyses in the R package LEA [31–33]. The number of populations was determined using the cross-entropy criterion, based on the prediction of a fraction of masked genotypes (matrix completion) and on the cross-validation approach, with runs of eight values of K (K = 1:8). A distance matrix was generated using TASSEL v5.2.52 software and was used to conduct principal coordinate analysis (PCoA) and to build a phylogenetic tree based on UPGMA distance.
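To make the tree-building step concrete, the following Python sketch computes a pairwise distance matrix and clusters it by UPGMA. It is an illustration only: the distance is a simple allele-sharing distance on 0/1/2 genotype codes, which is an assumption and may differ from the distance TASSEL computes, and the function names are hypothetical.

```python
def allele_sharing_distance(g1, g2):
    """Mean allele difference over loci called in both accessions (0 = identical, 1 = opposite)."""
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no shared non-missing loci")
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def distance_matrix(genotypes):
    """Square matrix of pairwise distances between accessions."""
    return [[allele_sharing_distance(a, b) for b in genotypes] for a in genotypes]


def upgma(labels, dist):
    """Cluster by UPGMA (average linkage) and return the tree as nested tuples."""
    clusters = {i: (labels[i], 1) for i in range(len(labels))}  # id -> (subtree, size)
    d = {(i, j): dist[i][j]
         for i in range(len(labels)) for j in range(i + 1, len(labels))}
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the two closest clusters.
        (a, b), _ = min(d.items(), key=lambda item: item[1])
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        # The distance to every other cluster is the size-weighted average.
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        del d[(a, b)]
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Toy genotypes for three hypothetical accessions.
toy = {"acc_A": [0, 0, 2, 2], "acc_B": [0, 0, 2, 1], "acc_C": [2, 2, 0, 0]}
tree = upgma(list(toy), distance_matrix(list(toy.values())))
```

In this toy example, `acc_A` and `acc_B` differ at a single locus and are joined first, with `acc_C` attached afterwards.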

Genome-wide association study of morphological traits.

GWAS was conducted on the ten observed morphological traits using the same SNP datasets used for genetic diversity analysis. A mixed linear model (MLM) was fitted in TASSEL v5.2.52 to determine the associations, using the Q-matrix from population structure analysis (R package LEA) and kinship (K) from the centered IBS method, and marker-trait associations (MTAs) were determined at P≤0.01 and P≤0.001. The Manhattan plots of –log(p-values) and the quantile-quantile (Q-Q) plots of expected vs observed p-values for SNP-based genotype-phenotype associations were generated using TASSEL v5.2.52. The flanking sequences of the most significant SNPs associated with the traits (P≤0.001) were queried against the JBrowse Phytozome v13 database to obtain the putative biological functions.
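The two significance thresholds map directly onto the –log scale used in Manhattan plots, which the short Python sketch below makes explicit. It assumes a base-10 logarithm, and the function names and tier labels are hypothetical; it does not reproduce the MLM itself.

```python
import math


def minus_log10(p_value: float) -> float:
    """Convert a p-value to the -log10 scale used on the Manhattan plot y-axis."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p_value)


def mta_tier(p_value: float):
    """Return the strictest threshold a marker passes, or None if it passes neither."""
    if p_value <= 0.001:
        return "P<=0.001"
    if p_value <= 0.01:
        return "P<=0.01"
    return None
```

Under this convention, P≤0.01 corresponds to a –log value of at least 2 and P≤0.001 to a value of at least 3.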

Results

SNP marker discovery

DArTSeq generated 74,306 polymorphic SNP reads from the 188 amaranth accessions of 18 species (S1 Table). Of these, 63,821 SNPs could be physically mapped to the Amaranthus hypochondriacus genome, with an average of 100% reproducibility (max = 100%, min = 93%, median = 100%), 74% call rate (max = 100%, min = 19%, median = 74%) and 0.14 PIC (max = 0.50, min = 0, median = 0.09). The majority of the SNPs were A/G or C/T transition mutations (62%), while the other 38% were A/C, A/T, C/G, and G/T transversion mutations. A Venn diagram of the six largest sets of amaranth species showed species-specific SNP loci, with A. thunbergii showing the highest number of unique SNPs (26,629), followed by A. spinosus (1,008), A. graecizans (1,067), A. tricolor (820), A. hypochondriacus (437) and A. hybridus (296) (Fig 1). Only 1,394 polymorphic SNPs were shared by all six species groups.
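The split into transitions and transversions follows from the base pair involved: purine-to-purine (A/G) and pyrimidine-to-pyrimidine (C/T) changes are transitions, and all other substitutions are transversions. A minimal Python sketch of this classification is given below; the `ref/alt` string format and the function names are assumptions for illustration.

```python
TRANSITIONS = {frozenset("AG"), frozenset("CT")}
VALID_BASES = set("ACGT")


def mutation_class(variant: str) -> str:
    """Classify a SNP written as 'A/G' as a transition or a transversion."""
    ref, alt = (base.strip().upper() for base in variant.split("/"))
    if ref not in VALID_BASES or alt not in VALID_BASES or ref == alt:
        raise ValueError(f"not a valid biallelic SNP: {variant!r}")
    return "transition" if frozenset((ref, alt)) in TRANSITIONS else "transversion"


def transition_fraction(variants) -> float:
    """Share of SNPs in the list that are transitions."""
    n_transitions = sum(mutation_class(v) == "transition" for v in variants)
    return n_transitions / len(variants)
```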

Fig 1. Venn diagram showing the presence, average and overlap of SNPs in the six largest amaranth species sets.

https://doi.org/10.1371/journal.pone.0267752.g001

Genetic diversity and population structure of two amaranth sets

First, for all 18 amaranth species, individual genotypes with >30% missing SNP data, including A. atropurpureus (AV-ATR), A. blitoides (AV-BLITO), A. spinosus (AV-SPI 1, AV-SPI 5 and AV-SPI 6), A. retroflexus (US-RET 1) and A. hybridus (AV-HYB 3), were removed and a total of 8,668 SNPs remained for 16 amaranth species, comprising 181 accessions, with an average of 100% reproducibility, 91% call rate, 0.28 PIC, 0.16 MAF, 6.97% average missing values in SNP loci and 5% average missing values at the individual level. Second, for the 120 A. tricolor accessions, two individual accessions (AV-TRI 20 and AV-TRI 28), which contributed 30% of the missing values, were removed and a total of 9,789 SNPs remained for 118 A. tricolor accessions, with an average of 100% reproducibility, 78% call rate, 0.20 PIC, 0.07 MAF, and 2% average missing values in SNP loci and at the individual level. Both SNP datasets (from 16 amaranth species [181 accessions] and 118 A. tricolor accessions) shared 1,346 identical SNP markers.

Population structure analysis demonstrated that the optimal number of populations for both the 16 amaranth species dataset and the A. tricolor subset was K = 3, based on minimal cross-entropy (S1 Fig), and the Q-matrix is displayed as a bar plot (Fig 2A and 2B). Each vertical bar represents a single accession, the length of each bar segment represents the proportion contributed by each sub-population (admixture), and the grouping of the populations is illustrated in the UPGMA phylogenetic tree (Fig 3A and 3B). The PCoA demonstrates that the genetic divergence in both marker datasets was consistent with the output of the population structure analysis (Fig 4A and 4B).
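A Q-matrix of this kind can be read as one row of ancestry coefficients per accession, summing to one. The Python sketch below shows one simple way to turn such rows into group memberships by taking the largest coefficient; the accession names and coefficients are invented toy values, and this assignment rule is an assumption rather than the procedure reported in the study.

```python
from collections import Counter


def assign_populations(q_matrix):
    """Map each accession to the 1-based index of its largest ancestry coefficient."""
    assignments = {}
    for accession, coefficients in q_matrix.items():
        if abs(sum(coefficients) - 1.0) > 1e-6:
            raise ValueError(f"coefficients for {accession} do not sum to 1")
        best = max(range(len(coefficients)), key=lambda k: coefficients[k])
        assignments[accession] = best + 1
    return assignments


# Toy Q-matrix for K = 3 (hypothetical accessions and values).
toy_q = {
    "acc_1": [0.90, 0.05, 0.05],
    "acc_2": [0.10, 0.70, 0.20],
    "acc_3": [0.30, 0.30, 0.40],
    "acc_4": [0.85, 0.10, 0.05],
}
membership = assign_populations(toy_q)
group_sizes = Counter(membership.values())
```

An accession such as `acc_3`, with no dominant coefficient, would appear in the bar plot as a strongly admixed bar even though it is still assigned to a single group here.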

Fig 2. Population structure of (A) 16 amaranth species and (B) 118 A. tricolor accessions, both at K = 3. Each vertical bar represents a single accession and the length of each bar segment represents the proportion contributed by each sub-population. The group membership for each population structure is similar to that in the UPGMA dendrogram.

https://doi.org/10.1371/journal.pone.0267752.g002

Fig 3. UPGMA phylogenetic tree of (A) 16 amaranth species and (B) 118 A. tricolor accessions. Yellow-dotted accessions were out-grouped A. tricolor and red-colored accessions were amaranth species that are closely related to most A. tricolor. Purple-colored and yellow-colored accessions were positioned in different clades of the second population structure as in (B).

https://doi.org/10.1371/journal.pone.0267752.g003

Fig 4. 3D plot of the principal coordinate analysis of (A) 16 amaranth species and (B) 118 A. tricolor accessions.

https://doi.org/10.1371/journal.pone.0267752.g004

The 16 amaranth species were grouped into three populations. The majority of A. tricolor accessions belonged to Pop 1, with the exception of six A. tricolor accessions which originated from Bangladesh and belonged to Pop 2 (brown-colored accessions), while the two out-grouped A. tricolor accessions (AV-TRI 20 and AV-TRI 28) were separated into Pop 3 (yellow-dotted accessions). The two grain-type amaranth species (A. hypochondriacus and A. cruentus) belonged to Pop 3 together with their putative progenitor (A. hybridus), with the exception of one A. cruentus accession (AV-CRU 5), which belonged to Pop 1. Other cultivated vegetable-type species such as A. blitum, A. graecizans, A. sp and A. thunbergii were closely related to A. tricolor in Pop 1 (red-colored accessions), although several accessions belonged to Pop 2. The weed-type species such as A. retroflexus and A. viridis were spread across the three populations. The PCoA demonstrated that Pop 1 clustered tightly together, indicating that little diversity may exist within the population, and lay closer to Pop 2, which may explain the inter-specific admixtures. Meanwhile, Pop 2 and Pop 3 showed some dispersal and diversity within the populations.

The A. tricolor subset was divided into three sub-populations. Sub-pop 1 was made up of 105 accessions from 12 countries of origin, Sub-pop 2 comprised seven accessions, of which three were from Papua New Guinea and four from the USA, and Sub-pop 3 consisted of six Bangladeshi accessions with distinct morphological traits (branches along the stem, purple-pink stem color, purple leaf and petiole color, red-green inflorescence color and erect terminal inflorescence attitude) (Table 1). In comparison with the 16 amaranth species population structure, A. tricolor accessions that belonged to Sub-pop 2 grouped together with the rest of the A. tricolor accessions in Pop 1 (brown-colored accessions). Meanwhile, the six distinct Bangladeshi A. tricolor accessions of Sub-pop 3 remained separated from the rest of the A. tricolor accessions, as in Pop 2. The PCoA displayed a clear division between the sub-populations, and the overall population statistic calculated using a Monte-Carlo test revealed an overall significant difference between the sub-populations (P = 0.002).

SNP associations for morphological traits

GWAS identified 22 significant and “suggestive” MTAs on 16 chromosomes in the 16 amaranth species dataset that underlie four morphological traits: branching index, inflorescence color, leaf shape (P≤0.01, P≤0.001) and leaf pigmentation (P≤0.01) (Table 2; S2 Table). At P≤0.001, four SNP markers were associated with branching index, six SNP markers with inflorescence color and two SNP markers with leaf shape. Meanwhile, 100 significant MTAs were generated from the 118 A. tricolor accessions, distributed among 16 chromosomes, that underlie four morphological traits: inflorescence color, and leaf, petiole and stem pigmentations (P≤0.01, P≤0.001) (Table 3; S2 Table). At P≤0.001, forty-five SNP markers were associated with leaf pigmentation, eight SNP markers with petiole pigmentation, four SNP markers with inflorescence color and two SNP markers with stem pigmentation.

Table 2. 12 MTAs (P≤0.001) of three morphological traits; branching index, inflorescence color and leaf shape in 16 amaranth species.

https://doi.org/10.1371/journal.pone.0267752.t002

Table 3. 58 MTAs (P≤0.001) of four morphological traits; inflorescence color, and leaf, petiole and stem pigmentations in 118 A. tricolor accessions.

https://doi.org/10.1371/journal.pone.0267752.t003

Furthermore, the mapping of this amaranth panel to the reference genome of A. hypochondriacus [14] identified twelve putative candidate genes with functional proteins. These markers explained low phenotypic variation (<20%) for all respective traits. The Manhattan plots of –log(p) > 3 and the Q-Q plots of these traits are presented in Figs 5 and 6.

Fig 5. Manhattan plot and QQ plot for branching index (BI), inflorescence color (IC), leaf pigmentation (LP) and leaf shape (LS) of 16 amaranth species.

https://doi.org/10.1371/journal.pone.0267752.g005

Fig 6. Manhattan plot and QQ plot for inflorescence color (IC), leaf pigmentation (LP), petiole pigmentation (PP) and stem pigmentation (SP) of 118 A. tricolor accessions.

https://doi.org/10.1371/journal.pone.0267752.g006

Discussion

The evaluation of molecular markers and morphological traits was carried out on single plants to retain homogeneity of germplasm, as morphological variations were observed among amaranth plants within one collection. The evaluation of single plants is necessary as amaranth has high phenotypic plasticity which appears to be heterogamous in field plantings and thus adapts easily to the environmental changes, even though selection within cultivar/landrace has the possibility to be infertile [34]. The capacity of amaranth to have wide genetic variability provides new prospects in the development of new crop varieties. Therefore, the construction of population structure in amaranth through a combination of morphological and molecular data is needed in order to develop a framework for future breeding programmes.

GBS data can have a high proportion of missing values [16] and the number of SNPs retained for the analysis depends on the quality control method [35]. In this study, a large number of SNP markers (74,306 SNPs) were generated through the DArTSeq method, a non-reference based (de novo) approach using the PstI and MseI endonucleases in the library preparation step. After aligning the sequence tags against the very high quality and full length macromolecules of the A. hypochondriacus reference genome for SNP locations [14], a relatively large number of DArTSeq SNP markers could be mapped to the A. hypochondriacus genome (63,821 SNPs), which suggests that DArTSeq as a technique should provide full genome coverage. The number of SNP loci discovered in this study compared favorably with previous GBS studies in amaranth species that used ApeKI single-enzyme cutting combined with deep reference-based assembly methods [17], as well as studies that used two library preparations via reference-based and non-reference based assembly methods [15, 16]. After filtration, the number of polymorphic SNP markers used in this study was comparable with other findings, such as 3,974 DArTSeq SNPs successfully used for population structure of 67 wild Galapagos tomato accessions (Solanum cheesmaniae and S. galapagense) [36] and 3,956 DArTSeq SNPs used in 80 macadamia accessions (Macadamia integrifolia, M. tetraphylla and hybrids) [37].

Population structure analysis on 16 amaranth species generated a taxonomic classification of amaranth sub-genera consistent with that previously defined using seed, inflorescence and floral characteristics [7, 38]. Three amaranth sub-genera, Amaranthus Amaranthus, Amaranthus Acnida and Amaranthus Albersia, were well defined in this study, consistent with other GBS findings by [15]. Subgenus Amaranthus, comprising grain amaranth (A. hypochondriacus and A. cruentus) and its weed progenitor (A. hybridus), was distinguished in Pop 3. Subgenus Albersia, which comprises vegetable amaranth including A. tricolor, was distinguished in Pop 1 and Pop 2, together with six out of seven A. blitum accessions, three A. graecizans accessions and four of six A. viridis accessions. Meanwhile, species belonging to subgenus Acnida, which comprises the weedy amaranths A. spinosus and A. palmeri, were spread across the three sub-populations. Another important finding was that A. hybridus, which belongs to sub-genus Amaranthus, was split into sub-genus Albersia. A. hybridus is the direct ancestor of cultivated grain amaranth species [39, 40], and the split in accession identity could be due to inter-varietal hybridization. The weedy amaranth A. spinosus is cross-pollinated, and subsequent gene flow between populations may occur more rapidly than in the primarily self-pollinated amaranth species [40]. Lee et al. [47] have also stated that varying amounts of outcrossing and frequent interspecific and inter-varietal hybridization have occurred in amaranth accessions even though it is self-pollinated. Therefore, this could explain the admixture between amaranth species. Besides, this study found that weedy amaranth possessed more unique SNPs per accession than grain amaranth, perhaps suggesting that weedy species have had far less selection pressure than the cultivated grain species, which is useful from a breeding perspective.

There is also genetic differentiation between grain and vegetable amaranth in this study, which has also been observed in many molecular marker studies, including AFLP [41], SSR [23, 42, 43] and GBS [15], although those studies incorporated far fewer A. tricolor accessions. This genetic analysis has not only revealed duplicates and genetically closely related individuals, but also allowed categorization of accessions into the correct species. In this study, two A. tricolor accessions (AV-TRI 20 and AV-TRI 28) from Asia deviated from the A. tricolor clade and were grouped together with sub-genus Amaranthus, which mainly consists of grain and weed amaranths. There are two possible explanations for this finding: either the two amaranths were incorrectly identified as A. tricolor [17], or they were originally a landrace grown in a region where grain amaranth was traditionally cultivated over a long time through seed exchange [44–46]. In a previous study, GBS showed that accession PI 490752, characterized as A. hypochondriacus by 11 SSR markers [39], should be assigned to the A. caudatus group [17]. Therefore, re-analysis should be carried out for these two A. tricolor accessions, with the addition of a larger morphological dataset, which could correct the possible misclassification. The occurrence of admixed/hybrid genotypes may indicate frequent hybridization or introgression events. An experiment based on SSR markers by [23] revealed that A. tricolor accessions did not correlate between groups, which may imply that A. tricolor has larger genetic variation. There was also uncertainty in the phylogenetic positioning of A. tricolor accessions among amaranth species, although A. tricolor accessions were grouped together in a clade [15]. A. tricolor had by far the largest estimated genome size (782.7 Mbp) among 35 amaranth species, and this suggests that polyploidization likely influenced the genome size of this species [15].

In this study, the species groupings were independent of the accessions’ geographical origin, contradicting previous GBS findings [15–17]. In previous studies, geographical patterns demonstrated that comprehensive origin sampling can assist in understanding the evolution of the species, as shown by a strong geographic split in A. hybridus between accessions from Central and South America, which supports the hypothesis that two different lineages were the ancestors of the grain amaranth [15]. In this study, the genetic differentiation between species and geographical origin was weak, although a strong geographical split was observed in A. hybridus, where accessions from America and Africa were divided into two clusters, which may explain the genetic differentiation of the hybridus complex [23]. The weak differentiation is probably due to the cosmopolitan nature of the genus, or the result of human activities such as breeding and resource exchange [47]. While the current study used a different frequent-cutter restriction endonuclease for construction of the genomic representations sequenced, the unbalanced number of accessions per species could contribute to the lack of discrimination by geographical origin and at species level. This was also observed with 3,431 DArTSeq SNPs used to assess genetic diversity in 89 safflower accessions (Carthamus tinctorius L.), in which the SNPs showed a weak correlation between safflower diversity pattern and origins when compared with a larger SNP dataset [48]. However, for the large set of 118 A. tricolor accessions, genetic differentiation of the Bangladeshi accessions was clear, as they clustered together and had distinct morphological characters.

The genome of the closely related A. hypochondriacus was used as the reference for association mapping, as no A. tricolor genome is available to date. The utility of this reference-quality genome has been demonstrated in two ways, namely the study of chromosomal evolution and the mapping of the genetic locus responsible for stem color, and it therefore provides ample support for clarifying the scientific understanding of useful agricultural traits in amaranth. The highly significant MTAs found for morphological traits in this study illustrate how these DArTSeq data can provide high-resolution genome coverage for mapping. However, the most significant associations detected with the MLM model had a low threshold (−log(p-value) < 4), and although the mixed model was superior, it could still lead to at least one false negative and one false positive [49]. This could be due to the use of a different amaranth species (A. hypochondriacus) as the reference genome instead of A. tricolor. The difficulty of working with plant genomes is that they are highly repetitive and feature extensive structural variation between members of the same species, mostly attributed to their active transposons [50] and chromosomal rearrangements. For example, in the well-studied species Arabidopsis thaliana, natural accessions are missing 15% of the reference genome, indicating that a similar fraction would be absent from the reference but present in other accessions [51]. Moreover, although A. thaliana has a small (140 Mb) and not very repetitive genome compared with many other plants, SNPs may be assigned to incorrect positions because of sequence similarity shared between unlinked loci [52]. Therefore, more extensive structural variation would be expected in the larger A. tricolor genome, which contains a higher proportion of repeats and has undergone ancient and recent rounds of polyploidization [15].
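To put the −log(p-value) scale into context, the sketch below converts the nominal thresholds used in this study (P ≤ 0.01 and P ≤ 0.001) to the −log10 scale and compares them with a Bonferroni-corrected threshold for the 9,789 filtered SNPs of the A. tricolor dataset. The base-10 logarithm, the family-wise error rate of 0.05 and the choice of a Bonferroni correction are assumptions made for illustration; they are not taken from the analysis reported here.

```python
import math


def neg_log10(p):
    """Return -log10(p) for a p-value in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -math.log10(p)


def bonferroni_threshold(alpha, n_tests):
    """Return the Bonferroni-corrected threshold on the -log10 scale."""
    return neg_log10(alpha / n_tests)


N_SNPS = 9789  # filtered SNPs in the A. tricolor dataset (from the abstract)
ALPHA = 0.05   # assumed family-wise error rate, not a value from the study

for p in (0.01, 0.001):
    print(f"P = {p}: -log10(P) = {neg_log10(p):.2f}")
print(f"Bonferroni threshold: {bonferroni_threshold(ALPHA, N_SNPS):.2f}")
```

Under these assumptions the corrected threshold is about 5.3, so associations with −log10(P) below 4 would not survive a strict multiple-testing correction. This is consistent with treating the reported MTAs as preliminary.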

Conclusions

The findings of this study demonstrate that the DArTSeq SNP data generated from 181 amaranth accessions comprising 16 species were capable of differentiating the vegetable amaranth A. tricolor from grain and wild amaranth species. The species groupings were independent of the accessions’ geographical origin. This is likely a result either of germplasm origin being registered as the place from which the seeds were donated, which may not be the actual origin of the accession, or of the movement of germplasm in recent historical time. For the larger A. tricolor dataset, it is likely that good differentiation of A. tricolor could be achieved through a combined analysis of molecular markers, geographical origin and morphological traits. The pilot GWAS of 10 morphological traits demonstrates the potential effectiveness of the amaranth diversity panel for trait dissection. The high degree of morphological variation observed in amaranth may be beneficial for its adaptation to different climatic conditions.

Supporting information

S1 Table. DArTSeq SNP reads from the 188 amaranth accessions for 18 species.

https://doi.org/10.1371/journal.pone.0267752.s001

(XLSX)

S2 Table. 10 morphological traits of A. tricolor subset observed under shade-house conditions.

https://doi.org/10.1371/journal.pone.0267752.s002

(XLSX)

S1 Fig.

Cross-entropy plot for (a) first population structure: 181 amaranth accessions of 16 species and (b) second population structure: 118 A. tricolor accessions. A range of K = 1:8 was tested and K = 3 was chosen as the cross-entropy curve exhibits a plateau in both datasets.

https://doi.org/10.1371/journal.pone.0267752.s003

(TIF)

Acknowledgments

We thank the School of Biosciences, University of Nottingham Malaysia and Crops for the Future for technical support. We are grateful to Dr David Brenner and the USDA for the donation of 179 accessions from the USDA amaranth collection.

References

  1. Mayes S, Massawe FJ, Alderson PG, Roberts JA, Azam-Ali SN, Hermann M. The potential for underutilized crops to improve security of food production. J Exp Bot. 2011; 63:1075–1079. pmid:22131158
  2. Rastogi A, Shukla S. Amaranth: A new millennium crop of nutraceutical values. Crit Rev Food Sci Nutr. 2013; 53:109–125. pmid:23072528
  3. Jamalluddin N, Symonds RC, Mayes S, Ho WK, Massawe F. Chapter 6: Diversifying crops for food and nutrition security: a case of vegetable amaranth, an ancient climate-smart crop. In: Galanakis CM. Food Security and Nutrition. Elsevier; 2021. pp. 125–146.
  4. Kauffman CS, Weber LE. Grain amaranth. In: Janick J, Simon JE (ed). Advances in new crops. Timber Press, Portland; 1990. pp. 127–139.
  5. Mosyakin SL, Robertson KR. New infrageneric taxa and combinations in Amaranthus (Amaranthaceae). Ann Bot Fenn. 1996; 33:275–282.
  6. Grubben GJH, van Sloten DH. Genetic resources of Amaranths: A global plan of action. ACP:IBPGR/80/2. International Board for Plant Genetic Resources, Food and Agriculture Organization of the United Nations; 1981. Rome, Italy. pp. 57.
  7. Achigan-Dako EG, Sogbohossou OED, Maundu P. Current knowledge on Amaranthus spp.: Research avenues for improved nutritional value and yield in leafy amaranths in sub-Saharan Africa. Euphytica. 2014; 197:303–317.
  8. Jiménez-Aguilar DM, Grusak MA. Minerals, vitamin C, phenolics, flavonoids and antioxidant activity of Amaranthus leafy vegetables. J Food Compos Anal. 2017; 58:33–39.
  9. Jamalluddin N, Massawe F, Symonds RC. Transpiration efficiency of amaranth (Amaranthus sp.) in response to drought stress. J Hortic Sci Biotechnol. 2018; 94:448–459.
  10. Sarker U, Islam MT, Oba S. Salinity stress accelerates nutrients, dietary fiber, minerals, phytochemicals and antioxidant activity in Amaranthus tricolor leaves. PLoS ONE. 2018; 13(11):e0206388. pmid:30383779
  11. Alemayehu FR, Bendevis MA, Jacobsen SE. The potential for utilizing the seed crop Amaranth (Amaranthus spp.) in East Africa as an alternative crop to support food security and climate change mitigation. J Agron Crop Sci. 2014; 201(5):321–329.
  12. Sarker U, Islam MT, Rabbani MG, Oba S. Genotypic variability for nutrient, antioxidant, yield and yield contributing traits in vegetable amaranth. J Food Agric Environ. 2014; 12(3&4):168–174.
  13. Sogbohossou EOD, Achigan-Dako EG. Phenetic differentiation and use-type delimitation in Amaranthus spp. from worldwide origins. Sci Hortic. 2014; 178:31–42.
  14. Lightfoot DJ, Jarvis DE, Ramaraj T, Lee R, Jellen EN, Maughan PJ. Single-molecule sequencing and Hi-C-based proximity-guided assembly of amaranth (Amaranthus hypochondriacus) chromosomes provide insights into genome evolution. BMC Biol. 2017; 15:74. pmid:28854926
  15. Stetter MG, Schmid K. Analysis of phylogenetic relationships and genome size evolution of the Amaranthus genus using GBS indicates the ancestors of an ancient crop. Mol Phylogenet Evol. 2017; 109:80–92. pmid:28057554
  16. Stetter MG, Müller T, Schmid KJ. Genomic and phenotypic evidence for an incomplete domestication of South American grain amaranth (Amaranthus caudatus). Mol Ecol. 2017; 26(3):871–886. pmid:28019043
  17. Wu X, Blair MW. Diversity in grain amaranths and relatives distinguished by genotyping by sequencing (GBS). Front Plant Sci. 2017; 8:1960. pmid:29204149
  18. Gates DJ, Strickler SR, Mueller LA, Olson BJ, Smith SD. Diversification of R2R3-MYB transcription factors in the tomato family Solanaceae. J Mol Evol. 2016; 83(1–2):26–37. pmid:27364496
  19. Stetter MG, Vidal-Villarejo M, Schmid KJ. Parallel seed color adaptation during multiple domestication attempts of an ancient new world grain. Mol Biol Evol. 2020; 37:1407–1419. pmid:31860092
  20. Sunil M, Hariharan AK, Nayak S, Gupta S, Nambisan SR, Gupta RP, et al. The draft genome and transcriptome of Amaranthus hypochondriacus: A C4 dicot producing high-lysine edible pseudo-cereal. DNA Res. 2014; 21:585–602. pmid:25071079
  21. Deb S, Jayaprasad S, Ravi S, Rao KR, Whadgar S, Hariharan N, et al. Classification of grain amaranths using chromosome-level genome assembly of Ramdana, A. hypochondriacus. Front Plant Sci. 2020; 579529. pmid:33262776
  22. Mallory MA, Hall RV, McNabb AP, Pratt DB, Jellen EN, Maughan PJ. Development and characterization of microsatellite markers for the grain amaranths. Crop Sci. 2008; 48:1098–1106.
  23. Khaing AA, Moe KT, Chung JW, Baek HJ, Park YJ. Genetic diversity and population structure of the selected core set in Amaranthus using SSR markers. Plant Breed. 2013; 132(2):165–173.
  24. Nguyen DC, Tran DS, Tran TTH, Ohsawa R, Yoshioka Y. Genetic diversity of leafy amaranth (Amaranthus tricolor L.) resources in Vietnam. Breed Sci. 2019; 69(4):640–650. pmid:31988628
  25. Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE. 2011; 6(5):e19379. pmid:21573248
  26. Andrews KR, Good JM, Miller MR, Luikart G, Hohenlohe PA. Harnessing the power of RADseq for ecological and evolutionary genomics. Nat Rev Genet. 2016; 17:81–92. pmid:26729255
  27. Kilian A, Wenzl P, Huttner E, Carling J, Xia L, Blois H, et al. Diversity arrays technology: a generic genome profiling technology on open platforms. Methods Mol Biol. 2012; 888:67–89. pmid:22665276
  28. Li HS, Bhavani J, Vikram S, Sehgal P, Huerta-Espino D, Kilian A, et al. A high density GBS map of bread wheat and its application for dissecting complex disease resistance traits. BMC Genomics. 2015; 16:216. pmid:25887001
  29. Ho WK, Chai HH, Kendabie P, Ahmad NS, Jani J, Massawe F, et al. Integrating genetic maps in bambara groundnut [Vigna subterranea (L) Verdc.] and their syntenic relationships among closely related legumes. BMC Genomics. 2017; 18:192. pmid:28219341
  30. Krichen L, Audergon JM, Trifi-Farah N. Relative efficiency of morphological characters and molecular markers in the establishment of an apricot core collection. Hereditas. 2012; 149(5):163–172. pmid:23121327
  31. Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics. 2007; 23(19):2633–2635. pmid:17586829
  32. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: Dominant markers and null alleles. Mol Ecol Notes. 2007; 7(4):574–578. pmid:18784791
  33. François O. Running structure-like population genetic analyses with R, R tutorials in population genetics. U. Grenoble-Alpes. 2016; 1–9.
  34. Guillen-Portal FR, Baltensperger DD, Nelson LA. Plant population influence on yield and agronomic traits in ‘Plainsman’ grain amaranth. In: Janick J (ed.) Perspectives on new crops and new uses. ASHS Press, Alexandria, VA; 1999. pp. 190–193.
  35. Marees AT, de Kluiver H, Stringer S, Vorspan F, Curis E, Marie-Claire C, et al. A tutorial on conducting genome-wide association studies: Quality control and statistical analysis. Int J Methods Psychiatr Res. 2018; 27(2):e1608. pmid:29484742
  36. Pailles Y, Ho S, Pires IS, Tester M, Negrão S, Schmöckel SM. Genetic diversity and population structure of two tomato species from the Galapagos Islands. Front Plant Sci. 2017; 8:138. pmid:28261227
  37. Alam M, Neal J, O’Connor K, Kilian A, Topp B. Ultra-high-throughput DArTseq-based silicoDArT and SNP markers for genomic studies in macadamia. PLoS ONE. 2018; 13(8):e0203465. pmid:30169500
  38. Das S. Systematics and taxonomic delimitation of vegetable, grain and weed amaranths: A morphological and biochemical approach. Genet Resour Crop Evol. 2012; 59:289–303.
  39. Kietlinski KD, Jimenez F, Jellen EN, Maughan PJ, Smith SM, Pratt DB. Relationships between the weedy (Amaranthaceae) and the grain amaranths. Crop Sci. 2014; 54(1):220–228.
  40. Stetter MG, Zeitler L, Steinhaus A, Kroener K, Biljecki M, Schmid KJ. Crossing methods and cultivation conditions for rapid production of segregating populations in three grain amaranth species. Front Plant Sci. 2016; 7:816. pmid:27375666
  41. Costea M, Weaver SE, Tardif SJ. The biology of Canadian weeds. 130. Amaranthus retroflexus L., A. powellii S. Watson and A. hybridus L. Can J Plant Sci. 2004; 84(2):631–668.
  42. Oo WH, Park YJ. Analysis of the genetic diversity and population structure of amaranth accessions from South America using 14 SSR markers. Korean J Crop Sci. 2013; 58(4):336–346.
  43. Suresh S, Chung JW, Cho GT, Sung JS, Park JH, Gwag JG, et al. Analysis of molecular genetic diversity and population structure in Amaranthus germplasm using SSR markers. Plant Biosyst. 2014; 148(4):635–644.
  44. Brenner D, Baltensperger D, Kulakow P, Lehmann J, Myers R, Slabbert M, et al. Genetic resources and breeding of Amaranthus. Plant Breed Rev. 2010; 19:227–285.
  45. Jimenez FR, Maughan PJ, Alvarez A, Kietlinski KD, Smith SM, Pratt DB, et al. Assessment of genetic diversity in Peruvian Amaranth (Amaranthus caudatus and A. hybridus) germplasm using single nucleotide polymorphism markers. Crop Sci. 2013; 53(2):532–541.
  46. Das S. Infrageneric classification of amaranths. In: Das S. (Eds.) Amaranthus: A promising crop of future. Springer, Singapore; 2016. pp. 49–56.
  47. Lee JR, Hong GY, Dixit A, Chung JW, Ma KH, et al. Characterization of microsatellite loci developed for Amaranthus hypochondriacus and their cross-amplifications in wild species. Conserv Genet. 2008; 9:243–246.
  48. Hassani SMR, Talebi R, Pourdad SS, Naji AM, Fayaz F. In-depth genome diversity, population structure and linkage disequilibrium analysis of worldwide diverse safflower (Carthamus tinctorius L.) accessions using NGS data generated by DArTseq technology. Mol Biol Rep. 2020; 47(3):2123–2135. pmid:32062796
  49. Voichek Y, Weigel D. Identifying genetic variants underlying phenotypic variation in plants without complete genomes. Nat Genet. 2020; 52:534–540. pmid:32284578
  50. Bennetzen JL. Transposable element contributions to plant gene and genome evolution. Plant Mol Biol. 2000; 42:251–269. pmid:10688140
  51. 1001 Genomes Consortium. 1,135 genomes reveal the global pattern of polymorphism in Arabidopsis thaliana. Cell. 2016; 166(2):481–491. pmid:27293186
  52. Long Q, Rabanal FA, Meng D, Huber CD, Farlow A, Platzer A, et al. Massive genomic variation and strong selection in Arabidopsis thaliana lines from Sweden. Nat Genet. 2013; 45(8):884–890. pmid:23793030