Identification and expression analysis of S-alk(en)yl-L-cysteine sulfoxide lyase isoform genes and determination of allicin contents in Allium species

Alliinase is the key enzyme in the allicin biosynthesis pathway. In the current study, the identification and sequencing of alliinase genes, along with the determination of allicin contents, are reported for Allium species, including a novel report for Iranian endemic species. The presence of different isoforms in Allium was discovered for the first time. In bulb tissue, the highest allicin concentrations were in Allium sativum, A. umbilicatum, and A. fistolosum (1.185%, 0.367%, and 0.34%, respectively), followed by A. stipitatum (0.072%), A. lenkoranicum (0.055%), A. atroviolaceum (0.36%), A. rubellum (0.041%), and A. stamineum (0.007%). The highest allicin contents in the leaves and roots were in A. sativum (0.13%) and A. stamineum (0.195%), respectively. The ORF length ranged from 1416 bp in A. sativum (iso-alliinase2; ISA2) to 1523 bp in A. sativum (alliinase); identity with A. sativum (alliinase) ranged from 95% for A. ampeloprasum to 68% for A. sativum (iso-alliinase1; ISA1). These data suggested that both ISA1 and ISA2 were highly expressed in the roots and bulbs of all species compared to A. sativum as the control. ISA1 and ISA2 were not expressed in the leaves. The results showed that the expression patterns of the isoforms varied among different tissues in Allium species. The presence of various isoforms is a possible explanation for the differences between the species in the obtained results, especially the amount of allicin.


Introduction
Allium L. is one of the largest genera in the family of the Amaryllidaceae, encompassing over 900 species [1,2]. The main center of its diversity is central Asia, including the territory of Iran and the Mediterranean, while the second distribution center for garlic and many Allium is western North America [3][4][5][6]. Allium species are primarily found in temperate, semi-arid, and arid regions of the northern hemisphere [7]. The results of recent classifications propose 15 subgenera and 56 sections for Allium [5], from which more than 30 species, including several

Identification and characterization of alliinase isoforms
A fragment of approximately 1500 bp of the alliinase gene was successfully amplified with the new primers, using cDNA as template. The sequences of fragments were deposited in the NCBI GenBank Database (Table 1; Fig 1).
In this study, the alliinase gene was examined and two iso-alliinase genes (ISA1 and ISA2) were investigated. The ORF length of the gene sequences ranged from 1416 bp in A. sativum (ISA2) to 1523 bp in A. sativum (Alliinase). The G+C content of the analyzed alliinase genes ranged from 39.3% (in A. sativum ISA2) to 43.5% (in A. rubellum). The nucleotide sequences from all Allium species were aligned and compared to the alliinase amino acid sequences of A. sativum (control). Identity with A. sativum (Alliinase) was 95% for A. ampeloprasum, 92% for A. umbilicatum and A. lenkoranicum, 91% for A. ascalonicum, A. chinensis, A. fistolosum, and A. rubellum, 85% for A. tuberosum, 70% for A. macrostemon, and 69% and 68% for ISA2 and ISA1 (in A. sativum), respectively (Table 1). There is an EGF-like domain in the N-terminal part of the alliinase structure; these domains are small, conserved, disulfide-rich structures (polypeptides about 50 amino acid residues long) [29][30][31]. EGF-like domains frequently constitute modules for binding to other proteins; they are unusual in plant proteins and are often found in secreted proteins [30][31][32]. Among plant enzymes, alliinase is an example of a catalytic domain fused to an EGF-like domain. The alignment of complete alliinase sequences from different species shows a strictly conserved cysteine pattern (C-x18-19-C-x-C-x2-C-x5-C-x6-C; Fig 2).
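The G+C content and the conserved cysteine spacing described above can be checked programmatically. The following is a minimal sketch, not the procedure used in this study: the function names and the toy sequence are hypothetical, and "x" is read in the usual motif sense of "any residue".

```python
import re

# Cysteine spacing quoted in the text: C-x18-19-C-x-C-x2-C-x5-C-x6-C,
# where x is taken to mean any amino acid residue.
EGF_LIKE_PATTERN = re.compile(r"C.{18,19}C.C.{2}C.{5}C.{6}C")


def gc_content(nucleotides: str) -> float:
    """Return the G+C percentage of a nucleotide sequence (case-insensitive)."""
    seq = nucleotides.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def find_egf_like_motif(protein: str):
    """Return the (start, end) span of the first motif match, or None.

    Positions are 0-based and the end index is exclusive.
    """
    match = EGF_LIKE_PATTERN.search(protein.upper())
    return match.span() if match else None


# Synthetic toy sequence built only to exercise the pattern.
# It is NOT a real alliinase sequence.
toy_protein = (
    "MA" + "C" + "A" * 18 + "C" + "G" + "C" + "AA"
    + "C" + "A" * 5 + "C" + "A" * 6 + "C" + "KK"
)

print(gc_content("ATGC"))
print(find_egf_like_motif(toy_protein))
```

Applied to translated ORFs, a `None` result would flag a sequence in which the cysteine spacing is not conserved.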
This pattern is known in different species and has been reported by several researchers in Allium species [30][31][32][33][34]. The functional role of the EGF-like domain in alliinases is unclear. It may be associated with the vacuolar localization of alliinase, where it could act as a binding site for other proteins or for a hypothetical alliinase receptor [34]. A phylogenetic tree of alliinases from different plants was constructed using MEGA7.0 based on ClustalW2 alignments. The results revealed that the alliinases from A. tuberosum, A. chinensis, A. fistolosum, A. ascalonicum, A. cepa, A. umbilicatum, A. ampeloprasum, A. sativum (alliinase), A. rubellum, and A. lenkoranicum were grouped into one cluster, while A. macrostemon, A. sativum (ISA1), and A. sativum (ISA2) were classified into another cluster (Fig 3).
The presence of the allicin precursor and its derivative products in green garlic extracts has also been reported [28][36][37][38][39]. In one study quantifying the total thiosulfinates of different Allium spp. by HPLC analysis, A. sativum showed higher amounts of total thiosulfinates than the other species [40]. Allicin contents of 1 to 4 mg g⁻¹ dry weight of garlic have been reported by many researchers [39][41][42]. Wang et al. (2014) reported that the amounts of allicin ranged from 0.81 to 3.01% [42]. According to the British Pharmacopoeia, the minimum allicin content needed to ensure the pharmaceutical and economic viability of garlic powder is 4.5 mg g⁻¹ [28,43]. It has been shown that the allicin content in Allium extracts varies considerably across different regions [28,44,45]. The allicin content is thus known to be variable, and based on the amounts of allicin determined for Allium in this study (Fig 4), some Allium species may provide pharmacological effects.
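Because the literature values above are given in mg g⁻¹ while the contents in this study are given as percentages, a small conversion helps when comparing them (1% w/w corresponds to 10 mg g⁻¹). The sketch below is illustrative only: the function names are hypothetical, the three bulb values are those quoted in the abstract, and it assumes that the percentages and the pharmacopoeial limit refer to the same weight basis, which this section does not state.

```python
# Minimum allicin content for garlic powder cited in the text (mg per g).
PHARMACOPOEIA_MIN_MG_PER_G = 4.5


def percent_to_mg_per_g(percent_w_w: float) -> float:
    """Convert a weight percentage to mg per g (1% w/w = 10 mg/g)."""
    return percent_w_w * 10.0


def mg_per_g_to_percent(mg_per_g: float) -> float:
    """Convert mg per g to a weight percentage."""
    return mg_per_g / 10.0


def meets_minimum(percent_w_w: float,
                  minimum_mg_per_g: float = PHARMACOPOEIA_MIN_MG_PER_G) -> bool:
    """True if the content reaches the cited minimum.

    Assumes both values are expressed on the same weight basis.
    """
    return percent_to_mg_per_g(percent_w_w) >= minimum_mg_per_g


# Bulb allicin contents (%) quoted in the abstract for three species.
bulb_allicin_percent = {
    "A. sativum": 1.185,
    "A. umbilicatum": 0.367,
    "A. stamineum": 0.007,
}

for species, percent in bulb_allicin_percent.items():
    print(species, round(percent_to_mg_per_g(percent), 2), meets_minimum(percent))
```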

Relative expression analysis of alliinase genes
The qPCR technique was applied to determine the relationship between the allicin content and the expression pattern of the alliinase isoform genes that influence it. The expression of alliinase genes in bulbs, leaves, and roots was examined in eight species: Allium lenkoranicum, A. atroviolaceum, A. fistolosum, A. stipitatum, A. sativum (control), A. rubellum, A. stamineum, and A. umbilicatum. The qPCR was carried out with three biological replicates for each sample and three technical replicates for each biological sample. ANOVA showed significant differences between species and among tissues (bulb, root, and leaf; p < 0.01) for the gene expression levels (Table 3). The maximum expression levels of the alliinase gene in bulbs were detected in A. umbilicatum and A. fistolosum (~1.6- and 1.5-fold, respectively; Fig 5). However, A. sativum, with a high allicin content (1.185%), had a low gene expression level compared to A. umbilicatum and A. fistolosum. The same occurred in the leaves and roots, where A. umbilicatum, A. fistolosum, and A. lenkoranicum, with low allicin contents (0.047, 0.035, and 0.088% for leaves; 0.076, 0.028, and 0.077% for roots, respectively), had higher Alliinase gene expression (~6.7-, 7-, and 4.6-fold for leaves; 11.3-, 5.2-, and 2.3-fold for roots, respectively) compared to A. sativum (alliinase) (Fig 5). The relative expression of the alliinase gene in the leaf of A. rubellum was low (0.15-fold), while no allicin content was detected. Alliinase expression also varies among the bulbs, leaves, and roots of garlic. It has been suggested that garlic root tissue expresses a distinct alliinase isozyme with very low homology to the bulb enzyme [21].
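Fold changes such as those reported above are commonly derived from Ct values with the 2^-ΔΔCt approach. This section does not state which quantification model or reference gene was used, so the sketch below is an assumption for illustration only: the function names and all Ct values are hypothetical.

```python
from statistics import mean


def mean_ct(technical_replicates):
    """Average the Ct values of the technical replicates of one sample."""
    return mean(technical_replicates)


def fold_change(ct_target_sample, ct_reference_sample,
                ct_target_control, ct_reference_control):
    """Relative expression by the 2^-ddCt approach (assumed, not confirmed).

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(control, e.g. A. sativum)
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: three technical replicates per measurement.
sample_target = mean_ct([24.1, 23.9, 24.0])
sample_reference = mean_ct([18.0, 18.1, 17.9])
control_target = mean_ct([25.0, 25.1, 24.9])
control_reference = mean_ct([18.0, 18.0, 18.0])

print(fold_change(sample_target, sample_reference,
                  control_target, control_reference))
```

A value above 1 indicates higher expression than in the control, and a value below 1 (such as the 0.15-fold reported for the A. rubellum leaf) indicates lower expression.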

Table 3. Analysis of variance for gene expression levels (columns: S.O.V, Df, Ms).
In the current study, the presence of iso-alliinase genes was identified. Accordingly, three isoforms of the enzyme were identified: Alliinase, ISA1, and ISA2. Primary data analysis for roots indicated that the gene expression levels were not consistent with the allicin amounts across all species. Other findings revealed that the previously designed primers for the Alliinase gene were not suitable for amplifying it. To solve this problem, primers were designed specifically for each gene according to the sequence differences (Table 4). Expression studies using qPCR indicated that the highest expression levels of the ISA1 gene in bulbs were detected in A. lenkoranicum and A. fistolosum (~1229.9- and 1040.9-fold, respectively), followed by A. rubellum (271.79-fold), A. umbilicatum (55.7-fold), A. atroviolaceum (11.14-fold), A. stipitatum (3.08-fold), and A. stamineum (1.73-fold). The corresponding expression levels for ISA2 were 4.37-, 3.81-, 12.35-, 5.15-, 8.85-, 7.4-, and 8.19-fold, respectively (Fig 5). In the roots for ISA1 and ISA2 in A. umbilicatum, A. fistolosum, A. stipitatum, A. lenkoranicum, A. stamineum, A. ... 5). These data suggested that both ISA1 and ISA2 were highly expressed in the roots and bulbs compared to A. sativum as the control. This was the case in all species except A. rubellum, with expression levels being relatively lower in bulbs than in roots. ISA1 and ISA2 were not expressed in the leaves (Ct values > 35 or not detectable; S1 Fig and S2 Fig).
Rabinkov et al. (1994) described that the alliinase cloned from A. sativum was expressed in the bulbs and leaves, with a high alliinase activity, but not in the roots [21]. This demonstrated the presence of a nonhomologous alliinase gene in A. sativum roots. The alliinase isoform proteins from A. sativum (ISA1 and ISA2) were encoded by cDNAs with low sequence identity to other Allium alliinases. Regarding the roots of A. sativum, Rabinkov et al. (1994) reported a protein with low sequence homology to A. sativum alliinase cDNA, but with alliinase activity [21]. Alliinase in A. cepa is active as a trimer and tetramer [45][46][47]. Van Damme et al. (1997) reported that in some Allium spp., alliinase aggregates with low-molecular-mass lectins into stable active complexes [48], while Lancaster et al. (2000) found that multimeric forms did not aggregate in the onion root [47]. In some Allium species, the root alliinase has been reported to have a wider C-S lyase activity. Expression studies using RNA and northern analysis showed that A. cepa root alliinase cDNA was expressed to a far greater extent in roots than in leaves and bulbs, confirming that the alliinase cDNA of leaves was not expressed in roots [47]. Expression is also inconsistent across the growth stages of the plant. Low alliinase activity in the seeds of A. cepa (cv. Rijnsburger) was reported by Freeman (1975); it was less than 2% of that in bulbs and increased rapidly during seedling development to reach a stable maximum 15 to 20 days post-germination [48].

Conclusion and discussion
This study aimed to investigate Allium species, especially some Iranian endemic Allium species, in terms of allicin contents, and to identify and analyze the expression of alliinase isoform genes. The results showed that allicin contents and alliinase gene expression levels in the Allium species were highly variable (Figs 4 and 5). Numerous factors affect the production of secondary metabolites in plants, such as genotype, plant physiology, and environmental and ecological conditions [49][50][51][52][53][54][55]. To eliminate environmental effects, the plants were propagated under the same conditions, an approach that has also been used in other studies [42,52,56]. In the current study, the presence of three different isoforms in Allium was discovered for the first time. Compared with simple pathways, complex metabolic pathways are affected by more regulatory elements. In other words, fewer variable factors, such as a lower number of genes, affect the end product in a simple network. The complexity of the pathway affects the metabolite production rate. In this regard, alliinase hydrolyzes alliin to allicin [13]. The biosynthetic pathway to alliin is still not clear [57]. Alliinase is the key enzyme in the allicin biosynthesis pathway. In addition to the complexity of the biosynthetic route, gene expression levels and enzyme activities also affect the allicin contents. Most studies have shown a close relationship between the metabolome and gene expression levels [51,52,55,57,58,59,60]. The results showed that the expression patterns of the isoforms varied among different tissues in Allium species (Fig 5). The presence of various isoforms is a possible explanation for the differences between the species in the obtained results, especially the amount of allicin.
It has been suggested that two different alliinase isoforms are present in A. sativum, one of which is specific for 1-PECSO (trans-(+)-S-(1-propenyl)-L-cysteine sulphoxide) and alliin, while the other is specific for MCSO ((+)-S-methyl-L-cysteine sulphoxide) [41]. It has been reported that the enzyme isoform was inactive in some Allium species during the in situ test for alliinase activity in the IEF gel [47]. Lancaster et al. (2000) reported that two isoforms of alliinase existed in the onion root, where isoform 1 had pI = 9.3, while isoform 2 had pI = 7.6, 7.9, 8.1, and 8.3 [47]. They stated that both alliinase isoforms (I and II) showed similar enzymatic activity across the range of substrates. In contrast, only C-S lyases with Cys sulfoxide lyase activity have been reported for Allium species [61]. Lancaster et al. (2000) reported that isoform I could not be sequenced. On the other hand, the identity of Alliinase, ISA1, and ISA2 with isoform II of onion root alliinase was 54.30%, 81.1%, and 64.85%, respectively. Compared with other Allium alliinases, A. cepa root alliinase has a wider C-S lyase activity. A. cepa root alliinase may have a function in sulfur assimilation and remobilization in roots [47].
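Identity values such as those quoted above depend on how identity is counted. The following is a minimal sketch of one common convention (matches divided by the number of aligned, gap-free columns); the convention, the function name, and the toy sequences are assumptions for illustration, since this section does not describe how the reported percentages were computed.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must come from the same pairwise or multiple alignment,
    so they must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not alliinase sequences).
print(percent_identity("ACGT-A", "ACCTTA"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for gapped alignments.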
Further studies, such as enzyme assays to measure enzymatic activity across the different species, X-ray crystallography, and molecular dynamics of the proteins, could provide more details. Although this study was intended to enhance our knowledge of Allium species, further work is still required to clarify the details. Additional studies would contribute to a better understanding of the gene expression pattern, e.g. by examining different growth stages (i.e., initial stage, vegetative growth, reproductive phase, and maturation stage) or different growth conditions (i.e., changes in temperature, soil composition, and light). According to our research, the amino acid sequences of the alliinases described here displayed substantial similarity with other known alliinases. These findings are in accordance with results from other papers suggesting that the alliinase gene displays high variability among different species. Thus, it cannot be used as a phylogenetic marker; however, it can easily discriminate between closely related species [62].

Materials and methods
Allium lenkoranicum, A. atroviolaceum, A. fistolosum, A. stipitatum, A. sativum, A. rubellum, A. stamineum, and A. umbilicatum were collected from natural habitats across various geographic locations of Iran (Table 5). To eliminate environmental effects, the collected bulbs were maintained and propagated under the same conditions at the Iranian Biological Resource Center (IBRC), Tehran, Iran. The sampled plants at the immature stage were frozen in liquid nitrogen and then stored at -80˚C until further analysis.

Determining the amount of allicin
HPLC system and chromatographic conditions. The HPLC system consisted of an Agilent 1100 HPLC Series system (Agilent, Santa Clara, CA, USA). Analyses were performed on a C18 column (250 mm × 4.6 mm), and allicin was detected at 254 nm by a UV-visible detector. Methanol and water (50:50, v/v) were used as the mobile phase with a flow rate of 0.7 ml min⁻¹ at ambient temperature. The final injection volume was 20 μl. The allicin content was evaluated according to the method described in the British Pharmacopoeia with some modifications [28,43].
Preparation of the Allium extracts and internal standard. Butyl parahydroxybenzoate was used as the internal standard (IS) for the quantification of allicin and was prepared in the mobile phase (20 mg in 100 ml) [28,43]. Half of the material collected from five plants was directly frozen in liquid nitrogen and powdered using a mortar and pestle. Then, 0.8 g of powder was homogenized with 20 ml of distilled water and sonicated continuously for 5 min at 100% amplitude using an ultrasonicator (Elmasonic s30, Germany) in an ice container, and then incubated for 30 min at 25˚C. The obtained mash was poured through a five-layer cheesecloth and allowed to drain, then transferred into a 50 ml falcon tube. The extract and cell debris were separated by centrifugation (6000 g) for 20 min at 4˚C. The supernatant was transferred into a new sterile 50 ml falcon tube. Next, 10 ml of supernatant was diluted to 25 ml by adding mixture A (anhydrous formic acid solution (1%; v/v):methanol (HPLC grade), 40:60) and centrifuged at 6000 g for 5 min at 4˚C. Then, 0.5 ml of IS was diluted to 10 ml with the supernatant of the second centrifugation in a volumetric flask. Because allicin is unstable at high temperatures, the assay must be carried out as quickly as possible; the sample solutions were therefore stored at -70˚C before injection [14].
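The dilution steps above can be followed as simple arithmetic. The sketch below tracks the tissue equivalent and the internal standard concentration through each step; it is an illustration only, with hypothetical variable names, and it assumes additive volumes and no losses during filtration or centrifugation.

```python
# Quantities taken from the extraction protocol described in the text.
POWDER_G = 0.8            # tissue powder homogenized
WATER_ML = 20.0           # distilled water added
ALIQUOT_ML = 10.0         # supernatant taken for dilution
DILUTED_TO_ML = 25.0      # volume after adding mixture A
IS_STOCK_MG = 20.0        # internal standard weighed
IS_STOCK_ML = 100.0       # volume of the internal standard solution
IS_ADDED_ML = 0.5         # internal standard solution per sample
FINAL_ML = 10.0           # final volume in the volumetric flask

# Step 1: tissue equivalent in the aqueous extract (mg tissue per ml).
extract_mg_per_ml = POWDER_G * 1000.0 / WATER_ML

# Step 2: dilution of 10 ml supernatant to 25 ml with mixture A.
diluted_mg_per_ml = extract_mg_per_ml * ALIQUOT_ML / DILUTED_TO_ML

# Step 3: 0.5 ml IS made up to 10 ml with the diluted supernatant,
# so the supernatant contributes 9.5 ml of the final 10 ml.
final_tissue_mg_per_ml = diluted_mg_per_ml * (FINAL_ML - IS_ADDED_ML) / FINAL_ML

# Internal standard concentration in the injected solution (mg per ml).
is_stock_mg_per_ml = IS_STOCK_MG / IS_STOCK_ML
final_is_mg_per_ml = is_stock_mg_per_ml * IS_ADDED_ML / FINAL_ML

print(extract_mg_per_ml, diluted_mg_per_ml)
print(round(final_tissue_mg_per_ml, 3), round(final_is_mg_per_ml, 4))
```

Under these assumptions, each millilitre of injected solution corresponds to about 15.2 mg of tissue and contains about 0.01 mg of internal standard.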
Determination of allicin. The following equation was used to calculate the amount of allicin in the samples [28,43]. Allicin

Assembly of reads and identification of alliinase gene
The plants were separated into bulbs, leaves, and roots. Half of each tissue (bulb, leaf, and root) was cut and mixed for RNA extraction and expression analyses, while the other half was

Nested PCR detection and phylogenetic construction
The nested primers for the amplification of alliinase genes were designed at different positions on the consensus sequences to amplify overlapping fragments of the untranslated region (UTR) and coding region of the alliinase gene, in order to confirm the characterized ORFs (Table 4 and Fig 1). A fragment of the expected length was amplified by RT-PCR from first-strand cDNA from leaves, using Hyperscript™ RT PCR master mix (GeneAll Biotechnology Co. Ltd; Fig 1). PCR was carried out with the newly designed specific primers for the alliinase gene in three repeats. The PCR program was as follows: 95˚C for 3 min, followed by 34 cycles at 95˚C for 5 min, 55˚C for 30 s, and 72˚C for 1.30 min, with a final elongation of 10 min at 72˚C. PCR products were separated in a 1% (w/v) agarose TAE gel. The amplified PCR products of long fragments were cleaned up with a gel recovery kit (Top Gel Recovery Kit, TOPAZ GENE RESEARCH., Cat. No.: TGK1006, Iran) and subjected to direct sequencing with an automatic sequencer and dye-termination sequencing system (Macrogen Co., Seoul, South Korea). The sequences were edited and assembled using SeqMan (DNAstar) [64]. The identification of open reading frames (ORFs) and conserved domains, as well as of the translated protein sequences, was done using BLASTN, BLASTP, and ORF finder, available at http://www.ncbi.nlm.nih.gov/, and Pfam, available at http://pfam.xfam.org/. After a BLASTP search on the NCBI database, alliinase protein sequences were selected from different species with more than 50% identity with the coding region of the consensus sequences. To determine the relationship between the identified alliinases and the proteins downloaded from the BLASTP search, multiple alignments were run using the web-based Clustal Omega program (https://www.ebi.ac.uk/Tools/msa/clustalo/). The maximum likelihood method in MEGA7 was used for phylogenetic tree construction, and 1000 iterations were applied for calculating the bootstrap values [65].
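The idea behind ORF identification can be illustrated with a short scan for the longest ATG-to-stop stretch. This is a simplified sketch and not a reimplementation of the NCBI ORF finder used in the study: it searches only the three forward reading frames, uses only ATG as the start codon and the standard stop codons, and the function name and toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orf(dna: str) -> str:
    """Return the longest ORF (ATG through stop codon) in the forward frames.

    Returns an empty string if no complete ORF is found.
    """
    seq = dna.upper()
    best = ""
    for frame in range(3):
        start = None
        # Walk the frame codon by codon.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # look for the next ORF in this frame
    return best


# Toy sequence (hypothetical) with one short ORF in the third frame.
toy_dna = "CCATGAAATTTTAGGG"
orf = longest_forward_orf(toy_dna)
print(orf, len(orf))
```

For real cDNA sequences, the reverse strand and the overlap between nested fragments would also need to be considered.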