Detection and Molecular Characterization of Two FAD3 Genes Controlling Linolenic Acid Content and Development of Allele-Specific Markers in Yellow Mustard (Sinapis alba)

Development of yellow mustard (Sinapis alba L.) with superior quality traits (low erucic and linolenic acid contents, and low glucosinolate content) can make this species a potential oilseed crop. We have recently isolated three inbred lines, Y1127, Y514 and Y1035, with low (3.8%), medium (12.3%) and high (20.8%) linolenic acid (C18:3) content, respectively, in this species. Inheritance studies detected two fatty acid desaturase 3 (FAD3) gene loci controlling the variation of C18:3 content. QTL mapping revealed that the two FAD3 gene loci were responsible for 73.0% and 23.4% of the total variation and were located on the linkage groups Sal02 and Sal10, respectively. The FAD3 gene on Sal02 was referred to as SalFAD3.LA1 and that on Sal10 as SalFAD3.LA2. The dominant and recessive alleles were designated as LA1 and la1 for SalFAD3.LA1, and LA2 and la2 for SalFAD3.LA2. Cloning and alignment of the coding and genomic DNA sequences revealed that the SalFAD3.LA1 and SalFAD3.LA2 genes each contained 8 exons and 7 introns. LA1 had a coding DNA sequence (CDS) of 1143 bp encoding a polypeptide of 380 amino acids, whereas la1 was a loss-of-function allele due to an insertion of 584 bp in exon 3. Both LA2 and la2 had a CDS of 1152 bp encoding a polypeptide of 383 amino acids. Allele-specific markers for LA1, la1, LA2 and la2 co-segregated with the C18:3 content in the F2 populations and will be useful for improving fatty acid composition through marker assisted selection in yellow mustard breeding.


Introduction
Yellow mustard (Sinapis alba L., 2n = 24) is cultivated as an important condiment crop. It has many desirable agronomic traits such as resistance to cabbage aphids [1], flea beetles [2,3] and blackleg diseases [4]. In addition, it is drought tolerant and resistant to pod shattering. Yellow mustard germplasm with canola quality (low erucic acid and low glucosinolate contents) was developed at Agriculture and Agri-Food Canada-Saskatoon Research Centre (AAFC-SRC) [5], which gives yellow mustard the potential to become an alternative oilseed crop to canola B. napus, especially in semi-arid areas.
The oil quality of canola B. napus is determined by the proportion of the three major unsaturated fatty acids: oleic acid (C18:1), linoleic acid (C18:2) and linolenic acid (C18:3). Traditional B. napus cultivars contain 9% C18:3 of the total fatty acids [6]. The high level of linolenic acid in canola oil is undesirable since it shortens the shelf life and causes off-type flavour of the oil due to the three easily oxidized double bonds. A low linolenic acid mutant, containing 3-5% C18:3, was produced by ethyl methanesulfonate (EMS) treatment of a high C18:3 B. napus cv. Oro seed [7]. Current low C18:3 canola cultivars have been developed using this low linolenic gene source.
Linolenic acid content is determined mainly by the embryonic genotype with some influence from temperature, maternal genotype and cytoplasm in B. napus [8][9][10]. QTL mapping identified two major QTLs, accounting for 25.2-28.8% and 52.4-62.7% of the C18:3 variation, located on the linkage groups A4 and C4, respectively, in B. napus [11,12]. It was reported that the low C18:3 variant resulted from mutations of FAD3 genes in B. napus [11][12][13]. The FAD3 gene on A4 harboured a C to T substitution in exon 7, which when translated causes the wild type amino acid arginine to be replaced by cysteine. The FAD3 gene on C4 contained a G to A substitution in the 5′ splice site of intron 6 in the low C18:3 B. napus line. FAD3 allele-specific markers based on the sequence variation were developed and proved to be useful for identification of different C18:3 genotypes in canola B. napus [11,12]. Yellow mustard accessions contain 6.9-12.4% linolenic acid of total fatty acids in the seed [14,15]. Recently, inbred lines with high (18.5%), medium (13.8%) and low (3.8%) linolenic acid content, respectively, have been obtained through inbreeding of heterozygous open-pollinated plants in yellow mustard [16].
The low linolenic acid variant (3.8%) is a valuable gene source for breeding canola-quality yellow mustard with high-stability oil (high oleic and low linolenic acids) like that of canola B. napus. Knowledge of the genetic and molecular bases of the variation in C18:3 content and the development of FAD3 allele-specific markers will greatly facilitate the development of low linolenic canola-quality yellow mustard. The objectives of this study were: 1) to determine the inheritance and perform QTL mapping of the C18:3 content; and 2) to clone the FAD3 genes and further develop allele-specific markers for marker assisted selection.

Plant Materials
Linolenic acid contents of the three parental lines Y1127, Y514 and Y1035 are shown in Table 1. Y1127 is an S4 inbred line produced by selfing of the low linolenic S2 line Y158 for two generations and has a low C18:3 content (average: 3.8%). Y514 is the doubled haploid line SaMD3 [17] and has a medium C18:3 content (average: 12.3%). Y1035 is an S4 inbred line and has a high C18:3 content (average: 20.8%).
The F1 seeds of the three crosses Y1127 (low)×Y1035 (high), Y1127 (low)×Y514 (medium) and Y514 (medium)×Y1035 (high) were produced. To produce the BC1 seeds, the F1 plants of the three crosses were crossed as the female with the parental line with a lower C18:3 content. All plants were raised under the same conditions in the greenhouse at AAFC-SRC.

Regional Linkage Mapping
Regional linkage mapping of the linolenic acid content was performed using intron length polymorphism (ILP) markers and bulked segregant analysis (BSA) [18]. A total of 1478 ILP primer pairs, 380 from Arabidopsis thaliana [19] and 1098 from B. napus [20], were used to screen the three parental lines for polymorphic markers. For each of the three crosses, the high bulk was made by mixing equal amounts of DNA from the 10 F2 plants with the highest C18:3 content, while the low bulk was formed from the 10 F2 plants with the lowest C18:3 content. The primers detecting polymorphic markers between the two bulks were subsequently used to genotype individual plants of the three F2 populations. Genomic DNA was extracted from young leaves of the parental lines Y1127, Y514 and Y1035 and of the F1 and F2 plants using a modified sodium dodecyl sulfate method [21]. Each PCR (20 µL) contained 1× standard PCR buffer (NEB), 1 U of Taq polymerase (NEB), 0.25 µM forward primer, 0.25 µM reverse primer, 100 µM each dNTP and 50 ng of genomic DNA. The PCR amplification consisted of an initial denaturation at 94°C for 5 min, followed by 35 cycles of 94°C (45 sec), 55°C (45 sec) and 72°C (1 min), and a final extension at 72°C for 7 min. All PCR products were analyzed by electrophoresis in 2% agarose gels in 1× Tris-acetate-ethylenediaminetetraacetic acid buffer. Gels were visualized by staining in ethidium bromide and photographed on a digital gel documentation system.
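The bulk construction described above amounts to ranking the F2 plants of a cross by C18:3 content and pooling the two extremes. A minimal Python sketch of that selection step follows; the function name, the plant identifiers and the phenotype values are hypothetical and are used only for illustration, not as a record of the actual workflow.

```python
def select_bulks(c18_3_by_plant, bulk_size=10):
    """Return (low_bulk, high_bulk) lists of plant IDs ranked by C18:3 content (%)."""
    if len(c18_3_by_plant) < 2 * bulk_size:
        raise ValueError("not enough plants for two non-overlapping bulks")
    # Sort plant IDs from lowest to highest C18:3 content.
    ranked = sorted(c18_3_by_plant, key=c18_3_by_plant.get)
    return ranked[:bulk_size], ranked[-bulk_size:]


# Hypothetical phenotypes for 160 F2 plants (evenly spaced values, illustration only).
phenotypes = {f"F2_{i:03d}": 3.0 + 0.1 * i for i in range(160)}
low_bulk, high_bulk = select_bulks(phenotypes)
```

Equal amounts of DNA from the plants in each list would then be pooled before screening the ILP primers.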
The regional linkage map of C18:3 content was constructed using JoinMap 4.0 [22] with a minimum LOD threshold of 4.0. QTL analysis of C18:3 content was performed using the interval mapping method of MapQTL 6.0 [23]. A Chi-square test was used for evaluating the genetic model of C18:3 content in the BC1 and F2 populations, and the ILP markers in the F2 populations.

Cloning of the Coding Region of the FAD3 Gene
Primer pair No 1 (Table S1) was designed based on the conserved coding regions of the FAD3 genes in B. napus and A. thaliana. It was used to clone the coding DNA sequence (CDS) of the FAD3 gene in yellow mustard. Immature seeds at 22 days after pollination were collected from two individual plants from each of the parental lines. Total RNA was extracted from the immature seeds using the RNeasy Plant Mini Kit (Qiagen) as per the manufacturer's instructions. 750 ng of RNA from each of the parental lines was used to prepare the cDNA using Qiagen's Omniscript RT Kit as per the manufacturer's instructions. Each PCR (20 µL) contained 1× standard PCR buffer (NEB), 100 µM of each dNTP, 0.25 µM of each forward and reverse primer, 1 U of Taq polymerase (NEB) and 50 ng of cDNA. Polymerase chain reaction was performed with an initial denaturation at 94°C for 3 min followed by 35 cycles of 45 s at 94°C, 30 s at 55°C and 1 min at 72°C with a final extension cycle of 72°C for 10 min.

Cloning of the 5′ and 3′ Flanking Sequences and the Genomic DNA Sequences of the FAD3 Genes
Primer pairs No 2 and 3 (Table S1) were designed based on the 5′ coding sequences of the cloned SalFAD3.LA1 and SalFAD3.LA2 genes, respectively. They were used to clone the 5′ upstream sequences by PCR walking according to the protocol of Siebert et al. [24]. Primer No 4 (Table S1) was designed based on the 3′ coding sequences of the cloned SalFAD3.LA1 and SalFAD3.LA2 genes, and was used to clone the 3′ flanking sequence by PCR walking. Primer pairs No 5 and 6 (Table S1) were designed based

DNA Sequencing
The expected PCR bands were cloned using the pGEM-T Vector System I (Promega) following the provided instructions. The plasmids were extracted using the QiaSpin Kit (Qiagen) following the manufacturer's instructions and sequenced using the primer pairs No 7-11 (Table S1) at the Plant Biotechnology Institute, National Research Council, Canada.

Phylogenetic Tree
The multiple alignments were performed using ClustalW (http://www.ebi.ac.uk/clustalw/). MEGA software (version 4.0) (http://www.megasoftware.net/index.html) [25] was used to construct a phylogenetic tree with the aligned protein sequences. The neighbor-joining method was used with the pairwise deletion option, the Poisson correction model, and a bootstrap test with 1000 replicates.
The SalFAD3.LA1 and SalFAD3.LA2 allele-specific markers were generated using primer pair No 12 (Table S1), which was designed based on the conserved flanking sequences of intron 3. The PCR reaction was performed with LongAmp Taq 2× Master Mix (NEB) following the manufacturer's instructions with a 60°C annealing temperature.
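For the tree-building step described in this section, the Poisson correction converts the proportion p of differing amino acid sites between two aligned sequences into a distance d = -ln(1 - p), and pairwise deletion drops alignment columns that contain a gap in either of the two sequences being compared. The Python sketch below illustrates that calculation only; it is not MEGA's implementation, and the function name and the toy sequences are hypothetical.

```python
import math


def poisson_distance(seq_a, seq_b, gap="-"):
    """Poisson-corrected distance between two aligned protein sequences.

    Columns with a gap in either sequence are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable sites")
    # Proportion of gap-free columns at which the residues differ.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 1.0:
        raise ValueError("distance is undefined when all sites differ")
    return -math.log(1.0 - p)


# Toy alignment (hypothetical residues): 1 difference over 9 gap-free columns.
d = poisson_distance("MVVAMD-RKS", "MVVAMDQRKT")
```

A matrix of such pairwise distances is the input that a neighbor-joining algorithm would cluster into a tree.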

Fatty Acid Analysis
Seed fatty acid composition was analyzed according to [26] with the following modification: the gas chromatography of the methyl esters was performed with a HP-INNOWax fused silica capillary column (0.25 mm by 0.5 m and 7.5 mm) (Agilent Technologies) at 250°C using hydrogen as the carrier gas. A minimum of 10 seeds from each of the parental lines and F1 hybrids as well as 160 F2 seeds of each of the three crosses were half-seed analyzed according to [27]. Ninety-six seeds from each of the BC1 populations were analyzed using the single seed method.

Linolenic Acid Content is Controlled by Two Gene Loci in Yellow Mustard
The C18:3 content of the F1 seeds was significantly higher than the mid-parent value in the crosses of Y1127 (low)×Y1035 (high) (t = 3.84, p<0.01) and Y1127 (low)×Y514 (medium) (t = 5.62, p<0.01) (Table 1, Figures 1 and 2), suggesting a partial dominance of the high/medium over low C18:3 content. However, in the cross of Y514 (medium)×Y1035 (high) the F1 seeds had significantly lower C18:3 content (15.3%) than the mid-parent value of 16.5% (t = 6.98, p<0.01) (Table 1, Figure 3), indicating a partial dominance of the medium over high C18:3 content.
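The mid-parent comparison can be reproduced from the parental averages quoted in the text. The short Python sketch below does this for the Y514×Y1035 cross and also computes a conventional dominance ratio, (F1 - mid-parent) divided by half the parental difference. The ratio is an illustrative addition that is not reported in the text, and the function names are hypothetical.

```python
def mid_parent(p1, p2):
    """Mean of the two parental values."""
    return (p1 + p2) / 2.0


def dominance_ratio(f1, p_low, p_high):
    """(F1 - mid-parent) / half the parental difference.

    0 indicates additivity; -1 or +1 indicates complete dominance of the
    low or high parent, respectively.
    """
    return (f1 - mid_parent(p_low, p_high)) / ((p_high - p_low) / 2.0)


# Parental averages quoted in the text (% C18:3).
Y1127, Y514, Y1035 = 3.8, 12.3, 20.8

mp = mid_parent(Y514, Y1035)                # 16.55 from the rounded averages
ratio = dominance_ratio(15.3, Y514, Y1035)  # negative: F1 leans toward the medium parent
```

The rounded averages give 16.55 rather than the reported 16.5%, a difference that presumably reflects rounding of the parental means.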
The BC1 seeds of (Y1127×Y514)×Y1127 showed a segregation ratio of 1:1 (seeds with 2.7-5.2% versus seeds with 6.4-9.7% C18:3 content) (Figure 2) (χ² = 3.38, p = 0.07), suggesting that the C18:3 content was controlled by one gene locus in this cross. The F2 seeds of Y1127×Y514 showed a continuous distribution ranging from 3.0% to 16.5% in the C18:3 content (Figure 2). The BC1 seeds of (Y514×Y1035)×Y514 and the F2 seeds of Y514×Y1035 exhibited a continuous frequency distribution in the C18:3 content (Figure 3). Therefore, it was not possible to classify the seeds into discrete groups.
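The reported p-value for the 1:1 backcross ratio can be checked with a goodness-of-fit calculation on one degree of freedom. In the Python sketch below, the class counts of 57 and 39 are hypothetical: the text does not give them, and they are simply one split of 96 seeds that yields a statistic close to the reported 3.38. The function names are likewise assumptions made for illustration.

```python
import math


def chi_square_1to1(n_a, n_b):
    """Chi-square statistic for a 1:1 expected ratio between two classes."""
    expected = (n_a + n_b) / 2.0
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


p_reported = p_value_df1(3.38)          # p-value implied by the reported statistic
chi2_example = chi_square_1to1(57, 39)  # hypothetical counts, 96 seeds in total
```

Since the resulting p-value is above 0.05, the 1:1 hypothesis is not rejected, in line with the one-locus interpretation for this cross.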

Two QTLs Accounting for the Variation of C18:3 Content are Mapped to Linkage Groups Sal02 and Sal10, Respectively
In the F2 population of Y1127 (low)×Y1035 (high), eighteen ILP primer pairs were polymorphic between the high (16.6-20.4%) and low (2.9-4.0%) C18:3 bulks and generated 18 markers (Table 2). The 18 markers were mapped to two linkage groups, each of which carried one QTL for the C18:3 content (Figure 4). Based on the common ILP markers, the two linkage groups were revealed to be Sal02 and Sal10 of the constructed S. alba map [28]. One QTL (LOD = 45.43) accounting for 73.0% of the total variation of C18:3 content was localized between BnapPIP685 and BnapPIP881 in Sal02 (Figure 4). The other QTL (LOD = 9.28) responsible for 23.4% of the total variation was located between BnapPIP1012 and BnapPIP363 in Sal10 (Figure 4). Together, the two QTLs explained 96.4% of the total variation for C18:3 content in the F2 population.
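The LOD scores and the percentages of variation explained are linked, in single-QTL regression-type analyses, by the commonly used relation R² = 1 - 10^(-2·LOD/n), where n is the population size. The Python sketch below applies it as a consistency check. It assumes n = 160, taken from the number of F2 seeds analyzed per cross; the text does not state how many plants were genotyped or which formula the software applied, so both are assumptions.

```python
def variance_explained(lod, n):
    """Fraction of phenotypic variance explained by a QTL with the given LOD score."""
    return 1.0 - 10.0 ** (-2.0 * lod / n)


N_F2 = 160  # assumption: F2 population size taken from the fatty acid analysis

pve_sal02 = 100.0 * variance_explained(45.43, N_F2)  # QTL on Sal02, in percent
pve_sal10 = 100.0 * variance_explained(9.28, N_F2)   # QTL on Sal10, in percent
```

Under these assumptions both values come out close to the reported 73.0% and 23.4%.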

The SalFAD3.LA1 and SalFAD3.LA2 Genes are Cloned and Exhibit Differences in the Exon and Intron
The coding regions of the dominant alleles LA1 and LA2 were cloned from Y1035, while those of the recessive alleles la1 and la2 were cloned from Y1127 using primer pair No 1 (Table S1). LA1 had a coding [...] (Figure S1). A stop codon at the beginning of the 64 bp insertion might have resulted in the termination of protein translation after the 137th amino acid residue. Therefore, la1 is a loss-of-function allele. The 5′ flanking sequences from the translation start site were cloned for LA1 and la1 using the primer pair No 2 (Table S1). The 5′ fragment of LA1 was 1250 bp, while that of la1 was 621 bp. A 435 bp 3′ flanking sequence from the translation stop codon was cloned for LA1 and la1 using the primer pair No 4 (Table S1). The two alleles did not exhibit any differences in the cloned 3′ flanking sequences. The genomic DNA sequences of LA1 and la1 were amplified using the primer pair No 5 (Table S1), which was designed based on the 5′ flanking sequence and the conserved 3′ [...] (Figure 5). The inserted fragment contained a 5 bp direct repeat (5′-AGAAC-3′) at each end, which is a typical LTR retroelement insertion site (Figure S4). In addition to differences in the CDS, LA1 and la1 exhibited variation in the length of the introns (Figure 5).
Both LA2 and la2 had a CDS of 1152 bp encoding a polypeptide of 383 amino acids (Figures S1 and S5). Six point mutations at positions 567, 579, 666, 699, 777 and 1059 were observed in the CDS of la2 when compared with that of LA2, but did not lead to any amino acid changes. The 5′ flanking sequences from the translation start site were cloned for LA2 and la2 using the primer pair No 3 (Table S1). The 5′ flanking fragments of the two alleles were 444 nucleotides in length and were similar in sequence. A 435 bp 3′ flanking sequence from the translation stop codon was cloned for LA2 and la2 using the primer pair No 4 (Table S1). The two alleles did not show any differences in the cloned 3′ flanking sequences. The genomic DNA sequences of LA2 and la2 were cloned using primer pair No 6 (Table S1), which was designed based on the 5′ flanking sequence and the conserved 3′ flanking sequence specific to the candidate SalFAD3.LA2 gene (Figure S3). Comparison of the coding and genomic DNA sequences indicated that the candidate SalFAD3.LA2 gene also contained 8 exons and 7 introns (Figure 5). Variation in the length of the introns was observed between LA2 and la2 (Figure 5). For instance, the third intron of LA2 was 530 bp, while that of la2 was 1165 bp.
Sequence alignment of LA1 and LA2 indicated that LA1 harboured a 9 bp deletion at position 46 (Figure 5 and Figure S1), which resulted in the loss of the three amino acids glycine-arginine-lysine at position 16. In addition, 77 point mutations were observed between LA1 and LA2 (Figure S1), of which 19 mutations led to amino acid changes (Figure S5). The candidate SalFAD3.LA1 and SalFAD3.LA2 genes exhibited differences in the cloned 5′ flanking sequences (Figure S2), but had the same 3′ flanking sequences. Variation in the length of the introns was observed among the four alleles LA1, la1, LA2 and la2 (Figure 5).
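The CDS lengths, polypeptide lengths and the position of the deletion reported above are mutually consistent, as simple codon arithmetic shows. The Python sketch below assumes 1-based nucleotide positions counted from the A of the start codon and a CDS length that includes the stop codon; the function names are hypothetical.

```python
def protein_length(cds_bp):
    """Number of amino acids encoded by a CDS whose length includes the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    return cds_bp // 3 - 1  # the final codon is the stop codon


def codon_number(nt_position):
    """1-based codon index containing a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1


la2_aa = protein_length(1152)        # LA2 and la2
la1_aa = protein_length(1143)        # LA1
first_lost_codon = codon_number(46)  # codon where the 9 bp deletion starts
```

The 9 bp difference between the two CDS lengths corresponds to exactly three codons, matching the three-residue difference between the 383 and 380 amino acid polypeptides and the loss starting at residue 16.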

Discussion
The present paper reported on the inheritance and QTL mapping of C18:3 content as well as molecular characterization of the FAD3 genes in yellow mustard. Linolenic acid content was controlled by the nuclear genotype of the embryo in yellow mustard as reported in B. napus [8]. Two nuclear gene loci were detected and functioned independently and additively to determine the total C18:3 content in the seeds. However, maternal effects on the C18:3 content could not be ruled out since appropriate progeny tests were not performed in the present study. QTL analysis further revealed that the two gene loci SalFAD3.LA1 and SalFAD3.LA2 had a different magnitude of effect and together explained 96.4% of the total variation for C18:3 content. The residual 3.6% variation of C18:3 content beyond the two QTLs could have resulted from maternal and environmental effects. It has been reported that temperature, maternal genotype and cytoplasm have effects on C18:3 content in B. napus [8][9][10]. The duplication of the FAD3 gene provides additional evidence that yellow mustard is a secondary polyploid species as revealed by molecular studies [29,30]. The two linkage groups Sal02 and Sal10, containing the SalFAD3.LA1 and SalFAD3.LA2 genes, did not share any common ILP markers, suggesting the occurrence of extensive genomic changes during the speciation of yellow mustard.
Molecular cloning and sequencing indicated that the SalFAD3.LA1 and SalFAD3.LA2 genes contained 8 exons and 7 introns in yellow mustard, which is in agreement with that in B. napus [12] and A. thaliana (Locus: AT2G29980, TAIR) [31]. However, the molecular mechanism underlying the naturally occurring C18:3 variant in yellow mustard was different from that of the EMS-induced C18:3 variant in B. napus and B. oleracea. The FAD3 gene with reduced C18:3 content resulted from SNP mutations in B. napus [11,12] and B. oleracea [32]. However, the recessive allele la1 of the SalFAD3.LA1 gene was a loss-of-function mutant due to an insertion of 584 bp in exon 3. The inserted fragment contained a typical LTR retroelement insertion site (5′-AGAAC-3′) at each end, suggesting that the inserted fragment might be a remnant of a transposable element which had undergone a deletion following the insertion event. The recessive allele la2 of the SalFAD3.LA2 gene was functional and had a CDS encoding the same polypeptide sequence when compared with the dominant allele LA2. However, la2 was different in intron sequence. It remains to be investigated why LA2 and la2 controlled a different C18:3 content. The SalFAD3.LA1 and SalFAD3.LA2 allele-specific markers proved to be useful for identification of different C18:3 genotypes in the present study.
The phylogenetic analysis based on the polypeptide sequences indicated that the LA1 and LA2 genes in yellow mustard were clustered with the FAD3 genes in Brassica species and A. thaliana. Interestingly, LA1 and LA2 were clustered into different groups. LA1 was grouped together with the FAD3 genes of B. oleracea and the C genome in B. napus, whereas LA2 was in the same cluster with the FAD3 gene of B. rapa and the A genome in B. napus. In our study, the LA1 gene controlled a higher C18:3 content than the LA2 gene. It was reported that the FAD3 gene of the C genome in B. napus also contributed more to the total C18:3 content than that of the A genome [11,12]. This suggested that the molecular divergence of the LA1 and LA2 genes occurred before the speciation of yellow mustard and Brassica species.
In conclusion, our study revealed the existence of two FAD3 gene loci contributing to the genetic variation of linolenic acid content in yellow mustard. The SalFAD3.LA1 gene was located in the linkage group Sal02, while the SalFAD3.LA2 gene was located in Sal10. We have cloned the SalFAD3.LA1 and SalFAD3.LA2 genes and developed allele-specific markers for the detection of desirable genotypes, which will be valuable for marker assisted breeding in yellow mustard.