On the core bacterial flora of Ixodes persulcatus (Taiga tick)

Ixodes persulcatus is a predominant hard tick species that transmits a wide range of human and animal pathogens. Because the bacterial flora of ticks living in the wild varies with their hosts and environment, it is highly desirable to determine species-associated microbiomes by next-generation sequencing and comparative metagenomics. Here, we examine metagenomic changes in I. persulcatus, starting with ticks collected from the wild and following animals reared under pathogen-free laboratory conditions over multiple generations. Based on high-coverage genomic sequences from three experimental groups (wild; reared for a single generation, or R1; and reared for eight generations, or R8), we identify the core bacterial flora of I. persulcatus, which contains 70 species that belong to 69 genera of 8 phyla. This core is derived from the R8 group and is reduced from the 4625 species belonging to 1153 genera of 29 phyla found in the wild group. Our study provides a novel example of a tick core bacterial flora defined by wild-to-reared comparison, which paves the way for future research on tick metagenomics and tick-borne disease pandemics.


Introduction
The tick Ixodes persulcatus is a predominant hard tick species found in Europe, central and northern Asia, China, and Japan. Ticks are the second most widely recognized transmission vectors of human diseases worldwide, after mosquitoes [1,2]. Ticks carry a large number of pathogens, including viruses, bacteria, fungi, and protozoa, which are transmitted among animals and humans; among them, Borrelia burgdorferi and Anaplasma phagocytophilum cause Lyme disease [3,4] and granulocytic anaplasmosis [5], respectively. In addition, I. persulcatus is a vector for the tick-borne encephalitis virus [2], for Babesia [6,7], and for Rickettsiae, which cause spotted fever [8,9].
With the progress made through various microbial sequencing programs, such as the Human Microbiome Project [10], we have learned that symbiotic microorganisms play important roles in host growth, development, metabolism, and the immune system [11]. Therefore, understanding the microbial communities of a host and their functional components has become a key objective of species-based genomics. High-throughput sequencing has given rise to the concept and technology of metagenomics, and, based on whole-genome and 16S rRNA sequencing, ample metagenomic data have been reported for many symbiotic bacterial species [12,13,14], including those of the tick [15] and their variability over the life cycle [16]; some studies have focused on metagenomic changes among different tick species [17], between the two sexes [18,19], across geographies [20,21], and before and after a blood meal [22]. However, it remains unknown how tick-borne metagenomes vary among hosts and under pathogen-free rearing conditions, especially with respect to the definition of a species-associated core microbial/bacterial flora [11,13,14,23,24].
To explore the core symbiotic bacterial flora of I. persulcatus, we start from wild I. persulcatus, rear the ticks on specific pathogen-free (SPF) mice in a sterile environment, and examine their flora over multiple generations. We then use next-generation sequencing to acquire high-coverage metagenomic data and compare the metagenomes among the samples. Here, we report our results on defining the core bacterial flora of I. persulcatus.

Tick collection, breeding, and storage
The I. persulcatus strain Sfh has been reared in our laboratory; it originates from female ticks collected by the cloth-dragging method from a forest region of Suifenhe City (E = 131.17, N = 44.38), Heilongjiang Province, northeastern China. The natural enzootic cycle is completed by growing larvae and nymphs on the shaved skin of SPF BALB/c mice. We used female adults for all experiments. Larvae were obtained by placing 1.0 g of hatched eggs onto feeding patches glued to the shaved skin of SPF BALB/c mice. After feeding, all larvae were removed before they molted to the nymphal stage. Nymphs molted to the adult stage (first generation), and some adult ticks were maintained at room temperature for 1-3 weeks prior to hatching for propagation to the eighth generation. Other adult ticks were frozen at −80 °C for DNA extraction.

DNA preparation and sequencing
Before DNA extraction, 30 to 40 adult female ticks were sterilized in 70% ethanol for 20 min and washed three times in distilled water. Total DNA was extracted using the Qiagen DNA extraction kit (No. 69506, China) according to the manufacturer's instructions. Paired-end libraries were constructed according to the manufacturer's instructions (Illumina) and sequenced on the Illumina HiSeq 2500 platform.

Data analysis
Paired-end sequencing reads were quality controlled with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and trimmed on both ends using the FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit), leaving high-quality nucleotides. The high-quality reads were aligned to the NCBI NR database using DIAMOND [25] with default parameters. The alignment files were then imported into MEGAN6 [26], which automatically calculated a taxonomic classification of the reads. The results were interactively viewed and inspected. Multiple datasets were opened simultaneously in a single comparative document that provided comparative views of the different classifications.
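The taxonomic binning step can be illustrated with a toy lowest-common-ancestor (LCA) assignment, which is the general idea behind MEGAN-style read classification. This is a minimal sketch under stated assumptions, not MEGAN's implementation: the lineage table, the hit lists, and the function name `lca_assign` are hypothetical and serve only to show how a read with several database hits is placed at the deepest taxon that all hits share.

```python
from typing import Dict, List, Tuple

# Hypothetical lineages (root -> leaf), for illustration only.
LINEAGES: Dict[str, Tuple[str, ...]] = {
    "Rickettsia sp. A": ("Bacteria", "Proteobacteria", "Rickettsia", "Rickettsia sp. A"),
    "Rickettsia sp. B": ("Bacteria", "Proteobacteria", "Rickettsia", "Rickettsia sp. B"),
    "Streptomyces sp. C": ("Bacteria", "Actinobacteria", "Streptomyces", "Streptomyces sp. C"),
}


def lca_assign(hits: List[str]) -> str:
    """Return the deepest taxon shared by the lineages of all hits of one read."""
    if not hits:
        return "unassigned"
    paths = [LINEAGES[hit] for hit in hits]
    shared = "root"
    # Walk down the ranks in parallel and stop at the first disagreement.
    for ranks in zip(*paths):
        if len(set(ranks)) != 1:
            break
        shared = ranks[0]
    return shared


# A read hitting two Rickettsia species is placed at the genus level.
genus_level = lca_assign(["Rickettsia sp. A", "Rickettsia sp. B"])
# A read hitting different phyla can only be placed at the kingdom level.
kingdom_level = lca_assign(["Rickettsia sp. A", "Streptomyces sp. C"])
```

Under this scheme, reads with ambiguous hits are counted at higher ranks, which is one reason why counts at the species level are less certain than counts at the genus or phylum level.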

Ethics statement
Ethics approval for this investigation was obtained from the Research Ethics Committee, Beijing Institute of Genomics, Chinese Academy of Sciences. No specific permission was required for tick collection in the forest region of Suifenhe City (E = 131.17, N = 44.38), Heilongjiang Province, northeastern China. Our study did not involve any endangered or protected species.

Experimental design and data acquisition
We divide the samples into three groups: wild, collected directly from the wild; R1, reared in the laboratory as the first generation; and R8, reared as the eighth generation. We use 40 and 30 adult females for DNA extraction for the wild and reared groups, respectively, since the entire animals are ground together. The sequences are from paired-end libraries, with one lane for the wild group and another lane for the R1 and R8 groups. Sequence coverage and read length are both within the expected standards; we note that the matched reads (against the protein sequences collected for Blastp) are an order of magnitude fewer in R8 than in the other two groups, wild and R1 (Table 1).
Our raw data from the three groups range from 23 Gb to 47 Gb in total nucleotides and from 111 million to 231 million in sequencing reads. Annotation based on Blastp narrows the useful reads to 59 million for wild, 58 million for R1, and 7.1 million for R8. Rarefaction curve analysis (Fig 1) indicates data saturation at the genus level: the curves for the wild, R1, and R8 ticks plateau at 45, 25, and 1 million reads, respectively. An overall reduction of the bacterial flora in the reared ticks is obvious. Next, we discuss the data at different taxonomic levels.
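The idea of a rarefaction curve reaching a plateau can be sketched as follows. This is a simplified illustration and not the analysis behind Fig 1: it assumes each read has already been assigned a genus label, uses a single random permutation of the reads rather than averaging over repeated subsamples, and the function name `rarefaction_curve` and the toy read list are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def rarefaction_curve(
    read_genera: Sequence[str], depths: Sequence[int], seed: int = 0
) -> List[Tuple[int, int]]:
    """Count distinct genera observed in the first `depth` reads of one shuffle."""
    rng = random.Random(seed)
    shuffled = list(read_genera)
    rng.shuffle(shuffled)
    curve = []
    for depth in sorted(depths):
        depth = min(depth, len(shuffled))  # cannot sample more reads than exist
        curve.append((depth, len(set(shuffled[:depth]))))
    return curve


# Toy data: 100 genus-labelled reads from three genera (hypothetical counts).
reads = ["Rickettsia"] * 60 + ["Pseudomonas"] * 30 + ["Streptomyces"] * 10
curve = rarefaction_curve(reads, depths=[10, 50, 100])
```

When the number of distinct genera stops increasing as the sampling depth grows, the curve has plateaued, and additional sequencing is unlikely to reveal many new genera.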
The kingdom classification. At the kingdom level, we annotate 80.39%, 89.96%, and 26.31% of all assigned reads as bacterial in the wild, R1, and R8 groups, respectively (Fig 2). The annotated reads of eukaryotes show different ratios among the groups: 6.2% for wild, 3.65% for R1, and 69.57% for R8. It appears that the rearing process tends to increase the fraction of eukaryotes. The archaeal data exhibit a trend similar to that of the bacteria, though with its own characteristics: 424 for wild, 2143 for R1, and none for R8. Similarly, the minor component, viruses, displays its own trend: 30632 for wild, 3497 for R1, and 2036 for R8.
The phylum classification. At the phylum level, the annotated reads are distributed among 29 phyla for wild, 31 for R1, and 8 for R8; all 8 phyla in R8 are shared by the other two groups and represent candidates for the core bacterial flora. Since most phyla account for only a small fraction of the total reads, we define the major bacterial phyla as those representing more than 1% of the reads (Figs 3 and 4). The major phyla in the wild group are Proteobacteria and Actinobacteria. After one generation of rearing, the major R1 phyla additionally include Firmicutes and Bacteroidetes. The most abundant early stage of rearing. The unique genera identified among the groups indicate the same result: 128 for wild, 168 for R1, and 69 for R8.
The bacterial core flora of I. persulcatus. Continuous rearing in a sterile environment has led to only 69 genera in the R8 ticks; these bacteria belong to 8 different phyla that are shared among all three groups. Therefore, we tentatively designate the 69 genera as the core bacterial flora of the tick. After normalization, the sequence reads of all three groups (Figs 5 and 6) are mostly from Proteobacteria (43 genera), Actinobacteria (18 genera), and Bacteroidetes (4 genera); the remaining 4 phyla each contain a single genus, and Firmicutes has no genus assigned.
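The 1% cutoff used to define major phyla amounts to a simple relative-abundance filter, sketched below. The read counts are invented for illustration and do not come from this study; the function name `major_taxa` is likewise hypothetical.

```python
from typing import Dict


def major_taxa(read_counts: Dict[str, int], threshold: float = 0.01) -> Dict[str, float]:
    """Keep taxa whose share of all assigned reads is strictly above `threshold`."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    fractions = {taxon: count / total for taxon, count in read_counts.items()}
    return {taxon: frac for taxon, frac in fractions.items() if frac > threshold}


# Hypothetical per-phylum read counts for one sample (not data from this study).
toy_counts = {
    "Proteobacteria": 700,
    "Actinobacteria": 250,
    "Firmicutes": 45,
    "Chlamydiae": 5,
}
major = major_taxa(toy_counts)  # Chlamydiae (0.5%) falls below the 1% cutoff
```

Because the filter works on fractions rather than raw counts, it can be applied to groups with very different sequencing depths, such as R8 and the wild group.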
At the species level, the 69 genera of the core tick bacterial flora comprise 1219 species for wild, 1307 species for R1, and 74 species for R8, of which 70 species are shared with both wild and R1. For example, the genus Rickettsia has 38 species in wild, 22 in R1, and 7 in R8; the 7 species in R8 are all shared with the other two groups.
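The notion of a core shared by all three groups corresponds to a set intersection, as the following sketch shows. The species labels are placeholders rather than taxa identified in this study, and the function name `core_taxa` is hypothetical.

```python
from typing import Dict, Set


def core_taxa(groups: Dict[str, Set[str]]) -> Set[str]:
    """Return the taxa present in every group."""
    sets = list(groups.values())
    if not sets:
        return set()
    return set.intersection(*sets)


# Placeholder species labels per group (illustrative only).
groups = {
    "wild": {"sp1", "sp2", "sp3", "sp4"},
    "R1": {"sp1", "sp2", "sp3"},
    "R8": {"sp1", "sp2", "sp5"},
}
core = core_taxa(groups)                # species found in wild, R1, and R8
r8_only = groups["R8"] - core           # R8 species not shared with both other groups
```

In this toy example one R8 species is absent from the other groups and is therefore excluded from the core, in the same way that the 74 species annotated in R8 reduce to the 70 shared with both wild and R1.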

Discussion
As a predominant hard tick species in Europe, central and northern Asia, China, and Japan, I. persulcatus has been studied by metagenomics [22,28]. Up to 2015, a total of 16 tick metagenomics reports had been published [15], which investigated several important parameters, including species, sex, organ, and life cycle, with reference to tick-borne diseases [15,19]. However, such studies have not interrogated tick metagenomes under controlled rearing conditions, which would further our knowledge of the tick flora and its variation under different environmental settings. Here, tick flora refers to the total symbionts residing mainly in the internal organs and on the skin, since separating these anatomic portions is not desirable at the early stage of flora definition. We also attempt to define the bacterial flora first and narrow the scope of the study to the core bacterial flora because of the poor annotation of the eukaryotic components (such as unknown fungi and molds). In this study, we focus on the bacterial communities and their diversity shift between wild ticks and those reared in an SPF environment over multiple generations. Concerning methodology, we took one of the two popular approaches, genome-wide sequencing rather than transcriptome sequencing, and hope to provide an initial definition of the tick core bacterial flora. Previous reports on tick metagenomics have not used a genome-wide sequencing approach [15]; they have mostly used 16S rRNA-based methodology [22], sampling the 16S rRNA of wild ticks and blood-meal ticks. Taking a genome-wide sequencing approach, however, we are limited in the number of samples that can be handled for an initial analysis, and we therefore examined only three data points at this stage. More detailed sampling is certainly necessary, but a careful design of such an experiment is essential. For instance, it appears that the core bacterial flora becomes stabilized at the eighth generation.
We also need to know how stable such floras are for ticks from a single origin and how variable they can be for ticks from different origins, ultimately even at the level of individual ticks. There are, of course, more questions that can be asked in greater detail, such as how the floras change gradually from wild conditions through each generation of rearing, especially under different controlled rearing conditions, such as pathogen-free vs natural hosts. Ample categorization procedures are important for more systematic studies.
Nevertheless, this study is already informative for further definition of the core floras of the tick under various conditions. First, we notice that at the kingdom level R8 has far fewer annotated reads than the other two samples. Judging by the increase in eukaryotic reads, it is reasonable to attribute the reduced annotation to unknown eukaryotes. Another loss in R8 is Archaea, similar to the case of eukaryotes. Second, the dramatic reduction of bacterial phyla and genera after rearing suggests that most of the tick-dwelling microbes from the wild are both temporary and environment-associated. An effort to define a species-associated core bacterial flora is therefore both necessary and important for I. persulcatus. Third, we have identified a tentative species-associated core bacterial flora that is composed of 69 genera and 70 species. We notice that the number of species in the tick core bacterial flora is almost equal to that of the genera; such a phenomenon is rather common owing to the relatively lower certainty of annotation at the species level. Fourth, the largest number of species common to all 3 experimental groups are from the genus Rickettsia, which are endosymbionts of I. scapularis. This species plays an important role in sex determination in ticks [29].
High-throughput sequencing, coupled with bioinformatic analysis, provides a powerful tool for defining the microbiome or flora of ticks, and such floras and their variations provide diagnostic markers or clues for disease transmission and infection mechanisms. Our results represent initial steps toward uncovering the species-associated flora and its variability under the influence of environmental factors, including hosts, geography, and climate. Our results justify further metagenomic studies on the tick, including further anatomic stratification.

Author Contributions
Conceptualization: SS YY JY.