Endosphere microbiome comparison between symptomatic and asymptomatic roots of Brassica napus infected with Plasmodiophora brassicae

Clubroot caused by Plasmodiophora brassicae, is a severe disease of cruciferous crops that causes large hypertrophic galls in the roots. The plant microbiome is important for growth promotion and disease suppression. In this study, using 16S rRNA and internal transcribed spacer (ITS) sequencing techniques, we compared the endosphere microbiome of symptomatic and asymptomatic B. napus roots infected with P. brassicae collected from the same natural clubroot field. The results showed that the microbial population and its relative abundance in the asymptomatic roots was far higher than that in the symptomatic roots, and that many microorganisms in asymptomatic roots have biological control and plant growth promotion functions that may be related to clubroot symptoms. These results suggest the importance of the endosphere microbiome in clubroot disease and provide potential bio-control resources for its prevention.


Introduction
Although studies have been conducted for many years, the plant microbiome has gained public attention as a new concept only recently [1,2]. Turner et al. considered the plant microbiome to mainly include phyllosphere microorganisms, rhizosphere microorganisms, and endogenous microorganisms [3]. Edwards et al. clarified that spatial resolution distinguished three root-associated compartments: the endosphere (root interior), the rhizoplane (root surface), and the rhizosphere (soil close to the root surface). Compared to the rhizosphere and the rhizoplane, the endosphere is highly specific in terms of the microbial communities that it contains [4].
The plant microbiome has been identified as an important determinant of plant health [5]. In its capacity to do harm, the microbiome contains many plant pathogens such as PLOS

PCR and quantitative (q)-PCR detection
All roots were carefully washed with tap water before being peeled with a sterilized razor. The genomic DNA of all samples was extracted using the CTAB (hexadecyl trimethyl ammonium bromide) method [30]. DNA concentration and purity were monitored on 1% agarose gel. DNA was diluted to 1 ng/μL using sterile distilled water. The PCR protocol described by Wallenhammar and Arwidsson (2001) was used for detection of P. brassicae in RS1 and RS2 samples. PCR amplification was performed using the primers PbITS6: 5 0 -CAACGAGTCAGCTTG AATGC-3 0 and PbITS7: 5 0 -TGTTTCGGCTAGGATGGTTC-3 0 [31]. The conditions for PCR amplification included a denaturation step at 95˚C for 5 min and 32 cycles of 94˚C for 30 s, 59˚C for 30 s, and 72˚C for 1 min. The products of PCR amplification were sequence analyzed by BGI (Beijing Genomics Institute, Beijing). For quantitative detection of P. brassicae in RS1 and RS2 samples, q-PCR analysis was performed using the primers Pb actin-F: 5 0 -CACCGACTACCTGATGAA-3 0 and Pb actin-R: 5 0 -CAGCTTCTCCTTGATGTC-3 0 , according to the method described by Chen et al. [21]. The DNA samples were diluted to 500 ng/μL with sterile distilled water and 10-μL reactions were analyzed in triplicate using the CFX96 Real-Time PCR Detection System (Bio-Rad, USA). Each reaction mixture contained 5 μL of 2× SYBR Green Super mix (Bio-Rad, USA), 1 μL of sample DNA, 0.15 μL of forward primer, and 0.15 μL of reverse primer (10 μmol/L). Sterile distilled water was added to make up the final volume. The program was as follows: denaturation at 95˚C for 3 min followed by 49 amplification cycles of 95˚C for 15 s, 58.5˚C for 20 s, and 72˚C for 15 s. A melt curve was generated to verify the specificity of amplification from 65˚C to 95˚C with an increment of 0.5˚C per cycle, with each cycle held for 5 s. The primer sequences are provided in S1 Table. The B. napus actin gene (primers: Bn actin-F: 5 0 -TGAAG ATCAAGGTGGTCGCA-3 0 and Bn actin-R: 5 0 -GAAGGCAGAAACACTTAGAAG-3 0 ) was used as the internal control for normalization.

16S rRNA and ITS sequencing
Using the genomic DNA as a template, the 16S rRNA gene and ITS fragments were amplified using specific primers (515F: 5 0 -GTGCCAGCMGCCGCGGTAA-3 0 and 806R: 5 0 -GGACTA CHVGGGTWTCTAAT-3 0 for 16S rRNA; ITS1-1F-F: 5 0 -CTTGGTCATTTAGAGGAAGTAA-3 0 and ITS1-1F-R: 5 0 -GCTGCGTTCTTCATCGATGC-3 0 for ITS) tagged with abarcode. PCR was conducted in a total reaction volume of 30 μL containing 15 μL of Phusion1 High-Fidelity PCR Master Mix (New England Biolabs), 0.2 μM forward and reverse primers, and approximately 10 ng of template DNA. The thermal cycling consisted of initial denaturation at 98˚C for 1 min, followed by 30 denaturation cycles at 98˚C for 10 s, annealing at 50˚C for 30 s, elongation at 72˚C for 60 s, and finally 72˚C for 5 min. An equal volume of 1X loading buffer (containing SYBR green) was mixed with the PCR products and electrophoresed on 2% agarose gel for detection. Samples with bright main bands between 400 and 450 bp were chosen for further experiments. The PCR products were mixed in equidensity ratios and purified using the Gene JET Gel Extraction Kit (Thermo Scientific). Sequencing libraries were generated using the NEB Next1 Ultra™ DNA Library Prep Kit for Illumina 1 (NEB, USA) according to the manufacturer's recommendations, and index codes were added. The quality of the library was assessed using the Qubit 1 2.0 Fluorometer (Thermo Scientific) and the Agilent 2100 Bioanalyzer system. The library was sequenced on an Illumina HiSeq 2500 platform by Beijing Novogene Bioinformatics Technology Co. Ltd. The raw sequencing data for 16S rRNA and ITS sequencing results have been deposited at the Sequence Read Archive (SRA, http://www.ncbi. nlm.nih.gov/sra/) under accession numbers PRJNA388067 (https://www.ncbi.nlm.nih.gov/ bioproject/PRJNA388067) and PRJNA388081 (https://www.ncbi.nlm.nih.gov/bioproject/ PRJNA388081), respectively.

Processing of the sequencing data
Since the original raw data from the Illumina sequencing platform included some low-quality points, pre-processing was necessary for further analysis. The specific processing steps were as follows: 1. Data resolution: according to barcode sequences, each sample sequence was resolved from the raw data and the barcode sequences and primers were truncated; 2. Raw read splicing: the split data of each sample read were spliced using FLASH (V1.2.7, http://ccb.jhu.edu/software/FLASH/) for Mosaic [32], as it was the original tag data (raw tags); 3. Tag filtering: to obtain clean tags, the raw tags required stringent filter processing [33] according to the methods described by Caporaso et al. [34]. The raw tags were truncated at the first low quality base site when the continuous length of low-quality value (quality value 3) bases was up to three. The truncated tags in which continuous high quality base length was less than 75% of the tag lengths were excluded. 4. Chimera removal: the clean tags were contrasted with the Gold database (http://drive5. com/uchime/uchime_download.html) using the UCHIME Algorithm (http://www.drive5. com/usearch/manual/uchime_algo.html) to test the chimeric sequences. Final valid data (effective tags) were obtained after removing the chimeric sequences [35,36]. Finally, the effective tags were used for subsequent analysis.

Operational taxonomic units statistics and classification
To study species diversity, the effective tags were clustered using the UPARSE software [37]. Sequences with !97% similarity were assigned to the same operational taxonomic units (OTUs). Since the number of taxon tags was different for each of the six sequenced samples, the OTUs clustering results were standardized for subsequent analysis according to the minimum value of the sequence. The OTUs statistical information of the RS1 and RS2 samples was rendered using the Wayne figure. Sequences in the same OTUs are considered as a corresponding taxon. In the process of assigning OTUs, the highest frequency sequence in the same OTUs was selected as the representative sequence. The representative sequences were used for classification. The 16S rRNA sequences were assigned with the Greengenes database using the RDP Classifier (threshold value is 0.8-1), and the ITS sequences were assigned with the UNITE/INSDC database using QIIME (threshold value is 0.8-1) [38][39][40].

Microorganism community statistics and cluster analysis
After assigning the corresponding database, the OTUs were classified into different classification levels (i.e., Kingdom, Phylum, Class, Order, Family, Genus, and Species). The level into which more than 80% of the OTUs could be classified was chosen for microorganism community analysis. The relative microorganism abundance in each sample was determined according to the number of OTU reads.
To determine the difference in the microorganisms between the two groups, microorganism community cluster analysis was performed based on the Z value using the software MeV 4.9. The Z value of one sample in a classification was equal to the value calculated as the difference between the relative abundance in the sample and the average relative abundance in all samples, divided by the standard deviation of all samples.

Symptoms of clubroot in the field
The symptoms of the clubroot samples are shown in Fig 1. The roots of RS1 samples were not obviously swollen and the lateral roots remained healthy, whereas the roots of RS2 samples were obviously swollen, and few lateral roots could be observed (Fig 1A). PCR amplification results showed that P. brassicae could be successfully detected in both RS1 and RS2 samples (Fig 1B).
To evaluate the relative P. brassicae content in RS1 and RS2 samples, expression of the P. brassicae actin gene was measured using q-PCR. In the symptomatic (RS2) samples, the P. brassicae numbers were thousand-fold higher than those in the asymptomatic (RS1) samples ( Fig 1C).

Sequencing quality assessment
Raw data were generated by Illumina sequencing. Impurities such as joint and primer sequences, and low-quality reads were first removed to obtain effective tags for subsequent analyses. Sequencing quality was determined using the number of effective tags, their proportion in the raw data, and the base percentage of the sequencing error rate, which is less than 0.1% (Q30). In the case of 16S rRNA sequencing (S1 Table), 45471, 27897, 82715, 57064, 25939, and 20842 tags were generated as raw data for the six samples (RS1.1, RS1.2, RS1.3, RS2.1, RS2.2, and RS2.3, respectively). For each sample, the percentage of effective tags in the raw data was up to 92%, and the Q30 values were above 96%. In the case of ITS sequencing (S2 Table), 66344, 66782, 55796, 44365, 44576, and 66344 tags were generated as raw data for the six samples (RS1.1, RS1.2, RS1.3, RS2.1, RS2.2, and RS2.3, respectively). The percentage of effective tags in the raw data was up to 96% and the Q30 values were above 99% in each sample. This demonstrated that the data were suitable for subsequent analyses.

OTU statistics and analysis
All effective tags from the six samples were clustered into OTUs according to sequence similarity (!97%) using the UPARSE software (Edgar & Robert, 2013). According to the OTU statistical results, most effective tags (Taxon Tags) could be clustered to OTUs, with the exception of only a few data (Unclassified Tags and Unique Tags) (S1 Since the number of taxon tags in each sequencing sample was different, the OTU clusters formed above could not be directly used tocompare the two samples. The OTU clustering results were standardized for subsequent analysis on the basis of the minimal sequence read number. The standardized OTU information for RS1 and RS2 samples was rendered using the Wayne figure (Fig 2). In both RS1 and RS2 groups, three samples were used for sequencing. Shared and unique OTUs were found among the three samples. To ensure the accuracy and completeness of the data, OTUs that existed in at least two of the three samples were chosen for further study. In the RS1 group, 96 OTUs (49 shared and 47 in two samples) from 16S rRNA and 110 OTUs (61 shared and 49 in two samples) from ITS were used for subsequent studies (Fig 2A and 2D). In the RS2 group, 58 OTUs (31 shared and 27 in two samples) from 16S rRNA and 54 OTUs (31 shared and 23 in two samples) from ITS were used for further studies (Fig 2C and 2F). Comparison of the OTUs in at least two samples from each group showed that, in 16S rRNA 48 OTUs were common to both groups, 48 OTUs were only in the group RS1, and 10 OTUs were only in the group RS2 (Fig 2B). In ITS, 48 OTUs were common to both groups, 62 OTUs existed only in the group RS1, and 6 OTUs were only in the group RS2 (Fig 2D). Therefore, 106 OTUs from 16S rRNA and 116 OTUs from ITS were used for further analysis. The number of OTUs in the group RS1 was obviously greater than that in the group RS2, both from 16S rRNA and ITS sequencing results.

OTU classification and microorganism community analysis
After the OTU cluster analysis, the same OTU sequences were assumed to derive from a certain taxon, and a representative sequence for each OTU was classified into different classification levels (Kingdom, Phylum, Class, Order, Family, Genus, and Species). [38,39] To determine the differences in endosphere microorganism communities between the asymptomatic (RS1) and symptomatic (RS2) B. napus samples, the relative abundance of bacteria and fungi in each sample was calculated based on the number of OTU reads and used for analysis (S3 and S4 Tables).
Based on the classification results, 39 and 28 bacterial families were found in the asymptomatic RS1 and symptomatic RS2 groups, respectively (S3 Table). The two most abundant families (i.e., Streptophyta and Mitochondria) were excluded from the analysis. Except these two families, the relative abundances of the top 15 bacterial families in each sample is shown in Fig  3A. The most abundant family was Oxalobacteraceae, followed by Pseudomonadaceae, Comamonadaceae, Xanthomonadaceae, Methylophilaceae, Rhizobiaceae, Flavobacteriaceae. The relative abundance of these bacteria in the asymptomatic samples (RS1.1, RS1.1 and RS1.3) was much greater than that in the symptomatic samples (RS2.1, RS2.1 and RS2.3) (Fig 3A, S3  Table). In symptomatic samples, the relative abundance of other bacteria was the highest (0.92, 0.78, and 0.82), whereas in asymptomatic samples, the relative abundance of other bacteria was low (0.13, 0.16, and 0.13). Endosphere microbiome of canola clubbed roots The ITS sequencing results indicated the endosphere fungal communities and P. brassicae. The most abundant species in all six samples was undoubtedly P. brassicae, with its content in the symptomatic samples (0.986, 0.991, and 0.989) was slightly higher than that in the asymptomatic samples (0.952, 0.975, and 0.908). Based on the classification, 65 and 39 fungal genera were found in RS1 and RS2 groups, respectively (S4 Table). The most abundant fungal genera were Un-s-Tremellomycetes sp., Tetracladium, Un-s-Nectriaceae sp., Verticillium, Fusarium, Gibberella, Alternaria, Un-s-Sordariomycetes sp., and Mortierella. The relative abundance of these fungi in the asymptomatic samples was also much greater than that in the symptomatic samples (Fig 3B). In the asymptomatic samples, other fungi (0.013, 0.007, and 0.026) were present in greater amounts than those in the symptomatic samples (0.003, 0.003, and 0.003).

Microorganism community clustering and specific microorganism analysis
For further analysis of the differences in the endosphere communities between the asymptomatic and symptomatic samples, the relative abundance of all bacterial families and fungal genera was clustered. For the bacterial endosphere community, 16 families existed only in the asymptomatic samples, five families only in the symptomatic samples, and 21 families common to both samples (Fig 4). Among the 16 families that were present only in the asymptomatic samples, Streptomycetaceae, Pseudonocardiaceae, Rhodospirillaceae, Sphingomonadaceae, and Verrucomicrobiaceae were present in all three groups (Fig 4AI), and the remaining families (i.e., Intrasporangiaceae, Nocardioidaceae, Mycobacteriaceae, Cryomorphaceae, Burkholderiaceae, Turicibacteraceae, Alteromonadaceae, Bacillaceae, Chitinophagaceae, Nannocystaceae, and Promicromonosporaceae) were present in two of the three samples. The families present only in the symptomatic samples included Peptostreptococcaceae, Moraxellaceae, Acetobacteraceae, Deinococcaceae, and an unclassified bacterium (auto67_4W) (Fig 4AII).
Of the 21 shared families, the relative abundance of Nocardiaceae, Phyllobacteriaceae, and Enterobacteriaceae in the symptomatic samples was greater than that in the asymptomatic samples (Fig 4BI and 4BII). The abundance of Methylophilaceae, Oxalobacteraceae, and Microbacteriaceae was similar in all the samples (Fig 4BIII). The remaining 15 families were more abundant in the asymptomatic samples than in the symptomatic samples, especially Xanthomonadaceae, Comamonadaceae, Flavobacteriaceae, Sphingobacteriaceae, Rhizobiaceae, Caulobacteraceae, and Bradyrhizobiaceae (Fig 4BIII, 4BIV and 4BV).
Overall, the bacterial and fungal endosphere communities in the asymptomatic samples were more abundant than those in the symptomatic samples. Among the unique and highly  Endosphere microbiome of canola clubbed roots plant growth activity, and four bacterial families and seven fungal genera causing plant disease. Interestingly, Rhizobiaceae and Bradyrhizobiaceae were detected in both the asymptomatic and symptomatic samples, and their relative abundance in the asymptomatic samples was much greater than that in the symptomatic samples. Furthermore, Glomus was detected only in the asymptomatic samples, whereas Aspergillus was detected only in the symptomatic samples (Tables 1 and 2).

Discussion
Recently, research on microbiome has attracted increasing attention. Previously, 16S rRNA and ITS sequencing have been widely used in plant microbiome studies [35][36][37][38][39][40][41][42]. In this study, we analyzed the root endosphere microbiome of B. napus infested with P. brassicae using 16S rRNA and ITS sequencing, and tailored the sequencing quality to the requirement of the analysis. The 16S rRNA sequencing results revealed 106 OTUs (OTUs that existed in at least two of the three samples), and 116 OTUs were detected from the ITS sequencing results. The majority of OTUs (except OTU3) from the 16S rRNA sequencing results could be classified to the family level and most OTUs from the ITS sequencing results could be classified to the genus level (S2 Fig).
Although the results showed that 16S rRNA and ITS sequencing work well in microbiome research, some difficulties persist. For instance, in the 16S rRNA sequencing results, 11, 15, and 10 OTUs in the asymptomatic samples (RS1.1, RS1.2, and RS1.3) and 9, 6, and 7 OTUs in the symptomatic samples (RS2.1, RS2.2, and RS2.3) could not be classified into any known family. In the symptomatic samples, OTU 3 was classified as an uncultured bacterial clone and its relative content was 70-90% of all the OTUs. The ITS sequencing results also showed 18, 23, and 25 OTUs in the asymptomatic samples (RS1.1, RS1.2 and RS1.3) and 14, 8, and 8 OTUs in the symptomatic samples (RS2.1, RS2.2 and RS2.3) that could not be classified into any known genera (S2 Fig). This might suggest that new microorganisms were being hosted by P. brassicae-infected B. napus roots including unknown bacteria (i.e., OTU 3). There may be two possible reasons for this. First, our knowledge of microorganisms is still very limited especially for those that cannot be purified and cultivated. Secondly, only 500 bp sequences were created and analyzed in the 16S rRNA sequencing platform, which is a little short for bacterial identification. Moreover, it is difficult to overcome the chloroplast and mitochondrial sequence interference in 16S rRNA sequencing [43]. Moreover, since fungal ITS sequences in the public database are relatively poor compared to the bacterial 16S rRNA gene sequences, it is necessary to include other genes such as EF-1α, β-tubulin, and RNA polymerases to complete the identification [44]. Therefore, achieving a comprehensive understanding of the microbiome and its functions will take a long time.
The main purpose of this study was to clarify the difference between the endosphere microbiomes of symptomatic and asymptomatic B. napus roots from the same natural clubroot field, using 16S rRNA and ITS sequencing. Based on the results, we first found that P. brassicae was common in both the symptomatic and asymptomatic samples. Second, many bacterial families and fungal genera were unique to the asymptomatic samples. Moreover, most of the shared bacteria and fungi present across both groups were more abundant in the asymptomatic Endosphere microbiome of canola clubbed roots samples. These results were similar to those of previous studies. Pseudomonas syringae pv. actinidiae was detected in both the symptomatic and asymptomatic tissues of kiwifruit [45]. The microbial population balance (mainly Methylobacterium spp., C. flaccumfaciens, and X. fastidiosa) was also related to the CVC asymptomatic or symptomatic presentation in citrus plants [20]. Besides, in previous studies, some beneficial microorganisms were found to be much more salient in asymptomatic plants. For example, numerous plant-beneficial bacterial isolates were found in the asymptomatic plants of Huanglongbing diseased citrus compared to the symptomatic plants [21]; C. flaccumfaciens abundance in asymptomatic plants was higher than

Biological Control
Rhodotorula [55] 6.38E−04 9.33E−05 Sporobolomyces [55] 1.17E−04 2.33E−05 Sporidiobolus [56] 8.55E−05 0 Leucosporidium [57] 7.00E−05 0 Un-s-Sporidiobolales sp. [ that in symptomatic plants, and may contribute to CVC resistance [19]. In our study, many of the unique and highly abundant bacteria and fungi in the asymptomatic samples have been attributed with biological control or plant growth promotion functions. Bacillus sp. from the family Bacillaceae, was reported as a biocontrol agent for P. brassicae [44,46,47]; Streptomyces spp. [48,49] and Streptomyces griseoruber A316 [50] from the family Streptomycetaceae have been reported to reduce the severity of clubroot disease; Flavobacteriumhercynim EPB-C313, an endophytic bacteria of the family Flavobacteriaceae, has been used to control Chinese cabbage clubroot [51]. In this study, we identified 1, 1, and 7 OTUs in the asymptomatic samples belonging to Bacillaceae, Streptomycetaceae, and Flavobacterium families, respectively. Strains of Sphingomonadaceae can inhibit Pythium spinosum, whereas strains of Pseudomonadaceae efficiently control fusarium wilt disease [14,15,52]. Species from the Chitinophagaceae and Burkholderiaceae families can promote plant growth [53,54]. Yeasts from Sporobolomyces, Leucosporidium, and Sporidiobolus can be used to control gray mold [55][56][57]. In asymptomatic samples, 9 OTUs were detected as belonging to the six above-mentioned families. They may have similar effects on oilseed rape and clubroot disease. Above all, we assume that the differences in the endosphere microbiome of the symptomatic and asymptomatic samples in the endosphere microbiome may influence the development of clubroot symptoms. Another purpose of this study was to exploit the endosphere microbiome analysis technique to mine microorganism resources for resistance to clubroot disease. Little is known about Rhodocyclaceae, Verrucomicrobiaceae, Pseudonocardiaceae, and "Un-s-" fungi, all of which were only present in the asymptomatic samples, along with the unknown bacteria (OUT 3) in the symptomatic samples. These microorganisms may be potential bio-control resources for preventing clubroot disease. Moreover, in the asymptomatic B. napus root samples, a variety of pathogenic plant microorganisms from other crops were detected including Xanthomonadaceae, Fusarium, Gibberella, Alternaria, Sclerotinia, Phoma, Leptosphaeria, and Cylindrocarpon. The impact of these pathogens on non-host plants remains a topic for further study.
A few anaerobic or facultative anaerobic bacteria like Acetobacteraceae [58] were unique to symptomatic B. napus root samples, suggesting that P. brassicae might consume oxygen or need environmental hypoxia when growing in B. napus roots [59]. Interestingly, some fungi from the genus Aspergillus are typical saprophytes or opportunistic pathogens [60], and Aspergillus could only be detected in the symptomatic samples. Therefore, we postulate that P. brassicae needs to decay and disintegrate the host roots with the help of saprophytic microorganisms, in order to release its resting spores into the soil at the later stages of clubroot development. Furthermore, we also found an interesting phenomenon that bacteria from the family Rhizobiaceae and Bradyrhizobiaceae and fungi from the genus Glomus were detected in B. napus infested with P. brassicae, although B. napus is not a plant that hosts rhizobia and arbuscular mycorrhizal fungi. Further analysis on this phenomenon is underway.
In summary, this study was mainly conducted to compare the differences in the endosphere microbiomes of symptomatic and asymptomatic B. napus roots collected from the same natural clubroot field using 16S rRNA and ITS sequencing. The microbe population and their relative abundance in asymptomatic roots are far greater than that in the symptomatic B. napus roots. Many microorganisms detected in the asymptomatic roots have biological control and plant growth promotion functions. Furthermore, we also found many pathogenic plant microorganisms to which B. napus is not a host, and the related mechanism underlying this observation needs further research. These results provide a new basis for studying the endosphere microbiomes of roots and some potential bio-control resources for clubroot prevention.  results (b). Total Tags, the filtered splicing sequence numbers; Taxon Tags, the number of tags used to build the OTUs and gain classification information; Unclassified Tags, the number of tags used to build OTUs but which did not provide any classification information; Unique Tags, the number of tags for which the frequency is one and cannot be clustered into OTUs; OTUs, the final OTU numbers.