The Complete Genome Sequence of Mycoplasma bovis Strain Hubei-1

Infection by Mycoplasma bovis (M. bovis) can induce diseases, such as pneumonia and otitis media in young calves and mastitis and arthritis in older animals. Here, we report the finished and annotated genome sequence of M. bovis strain Hubei-1, a strain isolated in 2008 that caused calf pneumonia on a Chinese farm. The genome of M. bovis strain Hubei-1 contains a single circular chromosome of 953,114 bp with a 29.37% GC content. We identified 803 open reading frames (ORFs) that occupy 89.5% of the genome. While 34 ORFs were Hubei-1 specific, 662 ORFs had orthologs in the M. bovis type strain PG45 genome. Genome analysis validated lateral gene transfer between M. bovis and the Mycoplasma mycoides subspecies mycoides, while phylogenetic analysis found that the closest M. bovis neighbor is Mycoplasma agalactiae. Glycerol may be the main carbon and energy source of M. bovis, and most of the biosynthesis pathways were incomplete. We report that 47 lipoproteins, 12 extracellular proteins and 18 transmembrane proteins are phase-variable and may help M. bovis escape the immune response. Besides lipoproteins and phase-variable proteins, genomic analysis found two possible pathogenicity islands, which consist of four genes and 11 genes each, and several other virulence factors including hemolysin, lipoate protein ligase, dihydrolipoamide dehydrogenase, extracellular cysteine protease and 5′-nucleotidase.


Introduction
Mycoplasma bovis (M. bovis), first isolated in 1961 from a severe case of mastitis in cattle, can cause diseases, such as pneumonia, otitis media in young calves and mastitis and arthritis in older animals [1][2][3]. These diseases have been designated M. bovisassociated diseases (MbAD) [4]. M. bovis has spread widely to all parts of the world via animal movement [5]. In recent years, M. bovis has become an important pathogen in young calves in Europe and North America [6], and its infection results in calf mortality, weight loss in surviving calves and a drop in milk production [7]. Clinical diseases caused by M. bovis tend to be chronic, debilitating and unresponsive to antimicrobial therapy. For animal welfare, the veterinarian can provide only limited relief for calves subject to MbAD [8][9].
In addition to MbAD, contagious bovine pleuropneumonia (CBPP) is another cattle respiratory system infectious disease that is caused by Mycoplasma mycoides subspecies mycoides with a small colony biotype (MmmSC), and it exhibits similar clinical symptoms and pathological anatomy as MbAD. However, the damage caused by CBPP is more severe than by MbAD, and CBPP has been listed as a contagious disease that must be reported to World Organization for Animal Health (OIE). It is difficult to distinguish CBPP from MbAD because there is serological crossreactivity between the two diseases. M. bovis has now become the most important Mycoplasma species in cattle in countries without CBPP.
M. bovis is a member of the Mollicutes class and belongs to the genus Mycoplasma. Although more than 20 Mycoplasma genome sequences have been published, the M. bovis genome was not published until 2010. In recent years, some genes have been discovered to exist in all M. bovis strains including the p48 gene [10], the p81 gene [11] and the p40 gene [12], while the p68 gene [13] is missing in some Mycoplasma strains. The variable membrane surface lipoproteins(Vsps) were major antigen of M. bovis and underwent noncoordinate phase variation [14], which has been a recent focus in M. bovis membrane protein research.
A severe cattle respiratory contagious disease occurred in the Hubei Province of China in 2008 that spread to over 11 Chinese provinces. The main symptoms of this disease were coughing and high fever, which led to the death of 572 out of 2,476 infected cattle. In one farm, all 62 cattle were infected and 24 died, a death ratio of 38.7%. Though these diseases show similar clinical symptoms and pathological changes with CBPP, CBPP antibody identification using the complement fixation test (CFT) reagent of National Veterinary Laboratory (LNIV) and specific PCR reactions did not produce positive results. The organisms were isolated from calf lungs with a method developed by Poumarat [15], and the isolated Mycoplasma strain was named Hubei-1. PCR assays targeting Hubei-1 16S rRNA demonstrated 99.5% homology with M. bovis type strain PG45. The M. bovis isolates were further confirmed by PCR and restriction endonuclease analysis (REA), which had been previously validated for the identification of M bovis isolates [16]. When the serum samples were tested for antibodies against M. bovis using a commercial kit (Biovet), positive results were obtained for almost all of the samples.
In 2004, the genomic sequence of MmmSC PG1, the pathogen responsible for CBPP, was released. In 2010, the genome of M. bovis type strain PG45 was reported; this was the first sequenced M. bovis genome, and it has provided valuable information to understand MbAD. In this paper, we report the complete genomic sequence of M. bovis strain Hubei-1 and compare the sequence with the genome of the M. bovis type strain PG45. Understanding the genetics and pathogenic mechanisms of M. bovis would be valuable to comprehend the difference between the genes, proteins, and nosogenesis of the two mycoplasms that cause CBPP and MbAD, and it would also be helpful in the diagnosis of the two diseases in animals.

Genome features
The M. bovis strain Hubei-1 genome contains a single circular chromosome of 953,114 bp with a GC content of 29.37% ( Figure 1). We identified 803 ORFs in the genome with an average length of 1,058 bp and a mean GC content of 29.76% that occupied 89.5% of the genome. Among these ORFs, 523 (65.1%) genes could be classified into Clusters of Orthologous Groups (COG) families comprising 18 functional categories ( Table 1). Biological functions were assigned to 490 (60.9%) genes, and 30 genes encoded hypothetical proteins. The genome encodes two rRNA operon and 34 tRNAs that represent all 20 amino acids. The genome sequence data were deposited in Genbank with the accession number CP002513.
As a low GC content genome, 80% of the codons have an A or a T at the wobble position in M. bovis resulting in a TAA stop codon in 76.3% of the genes. The three codons that encode for Arginine (i.e., CGG, CGA and CGC) ranked the lowest in the genome, occupying only 0.2% of all codons (Table S1), which indicates a low usage efficiency of tRNA arg [ACG].
M. bovis strain Hubei-1 encodes 73 lipoproteins, 36 secreted proteins and 134 transmembrane proteins, indicating that 30.2% of M. bovis encoded proteins are associated with the extracellular environment. Accordingly, a lipoprotein signal peptidase gene (lsp, MMB_0238) and a prolipoprotein diacylglyceryl transferase gene (lgt, MMB_0093) were predicted in the genome; however, no signal peptidase I gene was found.

IS elements
We found 26 complete insertion sequence (IS) elements in the genome including the known elements ISMbov1, ISMbov2, and ISMbov3 and three newly identified IS elements, which were designated as ISMbov4, ISMbov5 and ISMbov6 (Table S2). ISMbov4 is 1,340 bp long and encodes for a 414 amino acid transposase that shows 50% identity with the transposase IS1138 [17] and could be classified in the IS3 family. There are eight full length and eight truncated copies of ISMbov4 in the M. bovis genome that exhibit the same copy number as ISMbov1.
ISMbov5 is 1,468 bp and encodes for a 477 amino acid transposase that belongs to the IS4 family. There are only two complete copies of ISMbov5 in the genome. ISMbov6 is 1,415 bp long, encodes a 338 amino acid transposase and belongs to the IS30 family. There are three complete and four truncated copies of ISMbov6 in the genome. Of note, two ISMbov2 elements were disrupted by ISMbov4 insertion, and two ISMbov4 elements were disrupted by the insertion of ISMbov3 reflecting an insertion bias of IS elements.
All of the IS elements are evenly distributed across the genome, occupying 5.4% of the M. bovis genome.

Tandem repeats and phase variation
It has been reported that tandem repeats upstream or inside the Mycoplasma coding sequence (CDS) would cause phase-variable lipoprotein gene expression [18][19][20] because polymerase slippage would occur in tandem repeat regions during replication.
We identified 371 tandem repeat loci in the genome (Table  S3). Here, tandem repeat refers to either a repeat unit $10 bp or one that is repeated at least five times. Among these repeats, 61 tandem repeats are associated with 47 M. bovis lipoproteins, and 16 repeats are located in promoter regions. Over half of the lipoproteins, including the major surface lipoprotein P48, and six variable surface lipoproteins contain tandem repeats in the coding sequence or the promoter region and may cause protein sequence changes or expression changes in different growth phases. Two obvious examples are MMB_0316 and MMB_0434. MMB_0316 encodes the variable surface lipoprotein Y and has a 39-bp repeat unit with 9.7 copies found 98 bp downstream of its start codon. MMB_0434 encodes a putative variable prolipoprotein, and it has four different tandem repeats with repeat units ranging from 10 to 69 bp in its coding sequence and 44 base repeats located in the region from 267 to 222 upstream of its start codon. The tandem repeats in the coding region cause a frameshift during replication, thus helping the antigen escape an immune response during the infection process. The tandem repeats in the promoter region can cause a change in the level of gene expression and affect antigen production. Both the promoter and the coding sequence are The scale is shown by the outer black circle. Moving inside, the 1 st and 2 nd circles illustrate predicted coding sequences on the plus and minus strands respectively, colored according to different functional categories. The 3 rd circle represent tRNAs (blue) and ribosomal RNA genes (red). The 4 th circle display IS elements. The 5 th and 6 th (innermost) circle represent mean centered G+C content of the genome (red-above mean, blue-below mean) and GC skew (G2C)/(G+C), respectively, calculating using a 1-kb window in steps of 500 bp. doi:10.1371/journal.pone.0020999.g001 regulated by tandem repeats, indicating that MMB_0434 is an important M. bovis antigen.
In addition to the lipoproteins, 12 extracellular and 18 transmembrane proteins are also affected by tandem repeats. For example, there are two tandem repeats at the 39 end of MMB_0258, which encodes for a hemolysin-related protein.
It is noteworthy that several tandem repeats are located adjacent to restriction-modification (RM) systems, which perhaps cause RM gene degeneration or phase-specific expression. We also observed a 60-bp long tandem repeat at the 59-end of the 50S ribosomal protein L4 gene, which is disrupted in M. bovis and may be due to the tandem repeats.

Replication origin and DNA replication
The oriC region is frequently located within the rnpA-rmpH-dnaA-dnaN-recF-gyrB gene cluster and is usually next to the dnaA gene [21]. However, in M. bovis, recF is not found, and the other five genes are located in three different loci. rnpA-rmpH is located ,70 kb downstream of dnaA-dnaN, and gyrB is located ,27 kb upstream of dnaA. The oriC region is also characterized by DnaA box motifs of which the consensus sequence is 59-TTATCCACA-39 [22]. We identified two perfect DnaA boxes in the area surrounding the dnaA gene, one is 60 bp upstream of dnaA, and the other is located 12,924 bp downstream of dnaA. By examining the intergenic region between dnaA and dnaN, we also found a noncanonical DnaA box (consensus sequence: 59-TTTTAAAAA-39) that has been reported in Borrelia burgdorferi [23]. The prescence of DnaA boxes near dnaA indicates that this region is a putative replication origin; thus, we designated the first dnaA base as the first base of the M. bovis genome. Although we found no features of a terminus of replication, a GC skew inversion was found at position 563,113 that could be considered the terminus of replication.
We found 20 DNA replication proteins in the M. bovis genome ( Table S4). The central enzyme, the DNA polymerase III holoenzyme, comprises six genes that separately encode the subunits alpha (DnaE), beta (DnaN), delta (HolA), delta9 (HolB), gamma/tau (DnaX) and a Gram-positive type subunit alpha (PolC), which has been known to endow the strain with 39 to 59 exonuclease activity [24]. In addition to DNA polymerase III, four genes are also involved in DNA elongation including two RNaseH genes, one DNA ligase gene (ligA) and the truncated DNA polymerase I gene (polA). With the exception of dnaA, four additional genes were found to participate in DNA replication initiation. However, only one gene was identified as a DNA replication termination factor.

Transcription and translation
Seventeen genes are predicted to be involved in transcription including four genes that encode the DNA-dependent RNA polymerase subunits (i.e., alpha, beta, beta9 and sigma) ( Table  S5). Elongation and transcription termination are regulated by three Nus factors, NusA, NusB and NusG, and one Gre factor, GreA. GreA could prevent transcription arrest, and NusA could induce transcription pausing or stimulate anti-termination together with NusB and NusG [25]. Only two transcription factors were found in the M. bovis genome, with one heat-inducible transcription repressor (MMB_0602, HrcA) likely providing negative regulation [26].
A total of 102 M. bovis genes are involved in translation including 47 ribosomal proteins, 20 tRNA synthetase genes and 10 translation factors (Table S6). Among these genes, the 50S ribosomal protein L4 is disrupted and phenylalanyl-tRNA synthetase is composed of two subunits, while glutaminyl-tRNA synthetase is absent. The translation factors include one ribosomebinding factor, three initiation factors, four elongation factors and two peptide chain release factors.

Transporter and secretion systems
The M. bovis transporter system consists of 54 genes, which mainly constitute the ATP-binding cassette (ABC) transporter system and the phosphotransferase system (PTS) ( Table S7). Two genes, ptsI (MMB_0676) and ptsH (MMB_0504), encode the PTS Enzyme I (EI) and phosphocarrier protein HPr and were identified together with an operon (MMB_0684 to MMB_0682) encoding ascorbate-specific IIA, IIB and IIC proteins, which comprise a PTS system for ascorbate absorbance. The other 49 genes encode components of the ABC transporter system including 22 ATPbinding proteins, 20 permease proteins and seven substratebinding proteins. Thirty-one of the 49 genes encode seven complete ABC type transporter systems including two oligopeptide transport systems (MMB_0037 to MMB_0033; MMB_0105 to MMB_0109), two glycerol transport systems (MMB_0249 to MMB_0253 and MMB_0707 to MMB_0704), one sugar transport system (MMB_0017 to MMB_0021), one permidine/putrescine transport system and one alkylphosphonate transport system. Though complete ABC type transport systems were not found for the remaining 18 genes, several lipoproteins are adjacent to these genes, and these may work as substrate-binding proteins.
M. bovis encodes 13 proteins that are involved in protein secretion (Table S8), including SecA, SecD, SecE and SecY. Although secretion via the Sec pathway requires the presence of an N-terminal signal peptide on the secreted protein, signal peptidase (SPaseI) was absent in M. bovis. The channel-forming proteins SecF and SecG were also absent in the genome; consequently, the proteins are presumed to be exported through signal-recognition particle (SRP), which is composed of Ffh and FtsY. We also identified a protein (MMB_0050) belonging to the Type II secretion system.

Metabolism
The M. bovis biosynthetic capacity is severely limited. The 6phosphofructokinase gene of glycolysis (EMP cycle) is absent, and the genes involved in the TCA cycle and the oxidative phase of the pentose phosphate pathway are also missing. Most amino acids, purines, pyrimidines and cofactors cannot be de novo synthesized. [27]. During its intracellular lifestyle M. bovis obtains most of its nutrition from the host.
An extracellular cysteine protease (MMB_0708) was identified that may degrade host proteins into oligopeptides. Thus, two oligopeptide transport systems may import oligopeptides for further usage. Thirteen cytoplasmic peptidases were found in the genome ( Table S9) that could degrade oligopeptides into amino acids to satisfy the M. bovis nitrogen requirement.
Genome analysis revealed that M. bovis uses glycerol instead of sugar as a carbon source. M. bovis assimilates glycerol through two glycerol ABC transport systems and one glycerol uptake facilitator protein (MMB_0301). Glycerol is phosphorylated by glycerol kinase (MMB_0302) to become glycerol 3-phosphate. Glycerol-3phosphate dehydrogenase (MMB_0051) transforms glycerol 3phosphate into glycerone phosphate, which is transformed into glyceraldehyde 3-phosphate by triosephosphate isomerase (MMB_0553). Glyceraldehyde 3-phosphate is involved in the EMP pathway and, finally, transforms into pyruvate. In this process, one molecule of ATP is consumed, and two molecules of ATP are produced. Thus, substrate-level phosphorylation is the major energy generating pathway.
Glycerol may be derived from degraded host lipids, such as serum triglycerides. It is well known that Mycoplasma has lipase activity [28]. In the M. bovis genome, six esterase/lipase genes were found, and these may encode proteins that can guarantee a supply of glycerol in the intracellular lifestyle of the organism.
Pyruvate is transformed into acetyl-CoA by the pyruvate dehydrogenase complex (MMB_0096 to MMB_0100) and into lactate by D-lactate dehydrogenase (MMB_01520) and L-lactate dehydrogenase (MMB_0526). Acetyl-CoA is transformed into acetyl phosphate by phosphate acetyltransferase (MMB_0142), and acetate kinase (MMB_0143) transforms acetyl phosphate into acetate, which results in the production of another ATP molecule. Because M. bovis has lost most of its biosynthesis capacity, energy production through acetyl-CoA may be another important energy generating pathway.

Oxygen Stress
We did not identify superoxide dismutases (SODs), catalase, glutathione peroxidase or glutathione reductases, which could scavenge superoxide radicals (O 2 22 ) and hydrogen peroxide (H 2 O 2 ) damage in M. bovis. The main mechanisms to defend against oxygen stress in M. bovis, include a thioredoxin system, which includes two thioredoxins and a thioredoxin reductase, a thiol peroxidase (MMB_0208) and a peptide methionine sulfoxide reductase (MMB_0449) ( Table S10).
As previously reported [29], glycerol 3-phosphate oxidation may produce H 2 O 2 and cause host cell damage. Although glycerophosphate oxidase (GlpO) was not found in M. bovis, a transmembrane glycerol-3-phosphate dehydrogenase (MMB_0051) may be substituted in the presence of O 2 . Thus, the released H 2 O 2 could be considered a M. bovis virulence factor.
The RM system M. bovis contains two compete type I RM systems (Table S11). One comprises MMB_0610 to MMB_0612 and exhibits homology with the type I RM system of M. agalactiae; however, the ortholog of M. agalactiae HsdR (MMB_0610) is fragmented into pseudogenes. Another type I RM system (MMB_0227, MMB_0228 and MMB_0231) has no homologous proteins in M. agalactiae. Thus, M. bovis exhibits a stronger barrier for gene transfer than M. agalactiae. In addition, gene fragments of the Type III RM system methylase gene and the Type II RM system endonuclease subunit were found in the genome indicating that Type II and Type III RM systems once existed in the genome but are now lost.

Virulence factors
Two islands that have a lower GC content were found in the genome (733,550 to 737,937 and 831,641 to 838,944) ( Table  S12), and they have a bias for binucleotides and an IS element at the end. The 4.4 kb island encoded one variable prolipoprotein with an ISMbov2 element at the 39 end. The 7.3 kb island encodes two lipoproteins with and ISMbov4 element at the 59 end. Homology search of the island sequences revealed ISMbov2 in M. mycoides (see below), indicating the possibility that IS elements are derived from genomic islands transfer. Although the detailed function of these genes remain elusive, the pathogenicity of the islands is essential for virulence in animal pathogens [30].
In addition, we identified eight variable surface lipoproteins displaying antigenic variation. Among them, MMB_0316, MMB_0419 and MMB_0431 were specific for M. bovis. Another four ORFs encoding lipoproteins P30, P40, P48 and P80 were also found in the genome ( Table 2). In all, 45 lipoproteins were found in the genome. These lipoproteins may be associated with M. bovis virulence.
There is one hemolysin-related protein (MMB_0258) in the genome. It has been shown that hemolysins are toxic proteins that cause the lysis of erythrocytes by forming pores in their membranes [31]. Hence, the M. bovis hemolysin may be considered a M. bovis virulence factor.
Three lipoate protein ligase (LplA) genes (MMB_0010, MMB_0011 and MMB_0062) were identified in the M. bovis genome. LplA ligates lipoic acid from host cells to the E2 subunit of the pyruvate dehydrogenase enzyme (PDH) complex to produce E2-lipoamide, which plays a pivotal role in aerobic metabolism. Defective LplA specifically damages the growth of L. monocytogenes in the host cytosol and reduces virulence in animals 300-fold [32]. In addition, a mutant of dihydrolipoamide dehydrogenase, an E3 PDH component, has demonstrated significant attenuation of M. gallisepticum in vivo [33]. Thus, LplA and dihydrolipoamide dehydrogenase (MMB_0100) are considered M. bovis virulence factors.
The extracellular cysteine protease (MMB_0708) is also an important virulence factor that could directly degrade the extracellular matrix (ECM) proteins fibronectin and vitronectin or activate host matrix metalloproteases to degrade the extracellular matrix [34], which would cause tissue damage in the host.
M. bovis secretes a 59-nucleotidase (MMB_0636) that could utilize various nucleotides, such as ATP or ADP, from the host as a substrate and enhance macrophage death [35].
ClpC is an ATPase that promotes the early escape from macrophage phagosomes [36] and is also required for cell adhesion and invasion [37]. Although the ClpC gene is not found in the genome, M. bovis encodes for ClpB (MMB_0664), which shows 50% identity with Listeria monocytogenes ClpC, which may be considered a virulence factor.
MMB_0798 encodes a cation-transporting P-ATPase (MgtA), which is responsible for magnesium uptake and is required for intracellular survival.
In the M. bovis genome, three clusters may encode ABC type transporters that are involved in iron uptake (MMB_494, MMB_495 and MMB_496; MMB_543, MMB_544 and MMB_545; MMB_0799 and MMB_0780). The iron uptake system has been considered a virulence factor and is necessary for the utilization of iron bound to transferrin or iron chelates [38].
In our study, we also found that MMB_0127, MMB_0129, MMB_0345, MMB_0718 and MMB_0756 had immunogenicity. MMB_0129 encodes a transmembrane protein that functions as a spermidine/putrescine ABC transporter permease, and MMB_0345 encodes an extracellular protein. While MMB_0127, MMB_0718 and MMB_0756 are all cytoplasmic proteins, MMB_0127 encodes an oligoendopeptidase, and MMB_0756 encodes a hypothetical protein specific for M. bovis.
Because the pathology of M. bovis-infected calves is similar to M. mycoides-induced CBPP, we compared the two genomes and found 20 horizontal transfer genes. Although some of these genes have no clear biological function (i.e., conserved hypothetical proteins), they may be related to virulence ( Table 2). These genes will be further discussed in the 'Horizontal Gene Transfer' section.     (Table S13).

Pseudogenes, Paralogs and DNA repair systems
The M. bovis genome has 61 pseudogenes accounting for 7.6% of the total protein-coding genes (Table S14), yet the total size of these pseudogenes only occupy 3.4% of the genome. We identified 68 paralogous gene families in the M. bovis genome that comprise 218 genes (Table S15) and account for 27.1% of the M. bovis genes. The largest family consists of 13 ABC transporter ATPbinding proteins, and there are five families that comprise 35 transposases. Ten pseudogenes are included in the paralogous gene families, and their function may be complemented by their paralogs.
In the M. bovis genome, the pseudogenes mainly belong to the RM system (10 genes), which include the type I RM system DNA specificity subunit, the type II RM system endonuclease subunit and the type III RM system methylase subunit. It is noteworthy that an ISMbov4 insertion split a type II RM system endonuclease gene into two pseudogenes (MMB_0666 and MMB_0668).
Two aminopeptidase genes were split into four pseudogenes in M. bovis; however, there are five additional complete aminopeptidase genes in the genome, which could complement the disruption of the two aminopeptidase genes. The same phenomenon is observed for esterase/lipase genes, where one esterase/ lipase gene was split into two pseudogenes, yet six complete esterase/lipase genes guarantee efficient lipid metabolism.
With the exception of the DNA replication proteins, 22 genes are involved in DNA repair and recombination in M. bovis (Table  S16); however, there are no mismatch-repair system (MutHLS) genes encoded in the genome. M. bovis DNA repair is mainly executed by recombinational repair, the SOS repair system, the nucleotide excision repair system and the base excision repair system. The recombination repair and

Horizontal Gene Transfer
We found that 670 out of 803 M. bovis CDS have significant homology with M. agalactiae genes, which is the closest phylogenetically related M. bovis species. In addition, another 46 CDS showed homology to other Mycoplasma genomes located in the same group as M. bovis, while 54 CDS showed homology to the genomes outside of the group, reflecting possible horizontal gene transfer (Table S17). Of these horizontal transfer genes, 29 came from M. mycoides, a pathogen that causes CBPP. These 29 genes include four ISMbov2 transposases and three ISMbov3 transposases. While a homology search showed that ISMbov2 and ISMbov3 have no homologs in the M. agalactiae genome, there are homologs in M. mycoides that exhibit a 97% identity with the M. mycoides genome on the nucleotide level. This is direct evidence that horizontal transfer occurred between M. bovis and M. mycoides.
Because the pathological changes in the lungs of M. bovisinfected calves are sometimes similar to CBPP, we hypothesize that these 29 horizontal transfer genes are related to virulence. Among these 29 genes, two genes (MMB_0693 and MMB_0794) encode prolipoprotein Q, and four genes (MMB_0358, MMB_0434, MMB_0485 and MMB_0614) encode variable prolipoprotein. MMB_0434 is 58% identical to MSC_0364, and MMB_0358 (64% identity) and MMB_0485 (53% identity), two known prolipoprotein of M. myvoides associated with antigenic variation, both show homology to MSC_1005 [39].
Besides the 29 genes, MMB_0328, MMB_0427 and MMB_0429 encode lipoproteins and may also be considered potential virulence factors. Moreover, two CDS show homology to genomes outside of Mollicutes. The MMB_0488 gene encodes a protein that is similar to a Janibacter sp.HTCC2649 hypothetical protein, and MMB_0566 encodes a protein that is similar to a Clostridium spiroforme hypothetical protein.

Comparison with M. bovis PG45
During the preparation of this manuscript, the M. bovis PG45 genome sequence was reported. The PG45 genome size is 1,003,404 bp, 50,290 bp longer than the Hubei-1 genome. Comparison of the two genomes revealed a 540-kb genome inversion ( Figure S1). In PG45, the inversion (248,834 to 788,372) is located between two ISMbov3 elements, and thus, we could speculate that the inversion is caused by the combination of these IS elements. However, in M. bovis Hubei-1, ISMbov1 is upstream of this inversion (200,618 to 757,702), and no IS element was identified on the other side. A 41-kb region (207,849 to 248,833) upstream of the PG45 inversion is absent in M. bovis Hubei-1, which may contribute to the genomic size difference between the two strains. This region contains an ICEB-2 element (Integrative Conjugal Element) that is 37,415 bp long, and several IS elements residing in ICEB-2 may cause its instability. This perhaps explains the absence of ICEB-2 in Hubei-1. Except for this inversion, the genomic content between the two genomes is conserved, and their sequence homology is 98.5%, with 62 identified insertions and 84 deletions (in M. bovis Hubei-1 ( Table S18).
The largest insertion in Hubei-1 is 18.8 kb and encodes 13 proteins (MMB_0316-MMB_0328), including the variable surface lipoprotein Y and an ISMbov1 transposase. The existence of this fragment in Hubei-1 may be due to the transfer of ISMbov1. Another large insertion is 10.7 kb, which encodes nine proteins (MMB_0227 to MMB_235), including a complete type I RM system (Table S11). Moreover, 21 insertions were IS elements. For deletions, with the exception of the ICEB-2 element, two PG45 fragments of 17.8 kb and 10.2 kb were also lost in the Hubei-1 genome. The 17.8-kb fragment encodes 13 variable surface lipoproteins and one transposase (MBOVPG45_0806 to MBOVPG45_0821), implicating an IS element role in this fragment transfer. The 10.2-kb fragment encodes five proteins (MBOVPG45_0616 to MBOVPG45_0621), including a transposase, a site-specific recombinase and three type I RM system enzymes. Although both the 10.7-kb insertion and the 10.2-kb deletion encode type I RM systems and occur in the same genome loci (Table S18), they have no homology at the nucleotide level. In other words, Hubei-1 and PG45 both possess a complete type I RM system, but they encode proteins with low similarity, which may endow the strains with different abilities to resist gene transfer. In addition, 32 deletions are in IS elements.
Homology searching between the Hubei-1 and PG45 genomes found 662 orthologs, which may be considered the 'core genome' of M. bovis (Table S19). As we can see from COG classification ( Table 1), most functional category genes were shared by the two genomes, meaning that the two strains fundamentally possess the same genetic background for metabolism and growth. The only exception is the 'DNA replication, recombination and repair' category because 25 transposases in the Hubei-1 genome and 34 transposases in the PG45 genome are classified into this category, and only a few of these transposase genes have orthologs. Moreover, 51 Hubei-1 genes and 46 PG45 genes are unique for each genome and may represent a 'specific genome' of Hubei-1 and PG45, respectively ( Table S19). The specific genes of the PG45 genome are mainly composed of hypothetical proteins and variable surface lipoproteins, which are located in the ICEB-2 element. Because the ICEB-2 element is absent in Hubei-1, the Hubei-1 specific genes include other lipoproteins, such as P40. It is noteworthy that two type II restriction endonucleases were among the PG45 specific genes, and several truncated type III RM system methylases were found among the Hubei-1 specific genes. The difference in the specific RM system genes may hint that the two strains have different capacities for undergoing lateral gene transfer.

Discussion
While the M. bovis strain Hubei-1 genome encodes 73 lipoproteins and 36 secreted proteins, the SpaseI gene is not found in the genome; however, a lipoprotein signal peptidase gene (lsp) is predicted instead. In some Mycoplasma species, such as M. conjunctivae, M. hyopneumoniae, M. pulmonis and M. synoviae, both the signal peptidase I gene and the lipoprotein signal peptidase genes exist. Thus, extracellular protein secretion in M. bovis may have a different mechanism.
IS elements, which are usually associated with lateral gene transfer and genomic evolution, are abundant in the M. bovis genome (5.4%) and may cause genomic variation in different strains. We found two kinds of IS elements, ISMbov2 and ISMbov3, in both the M. bovis and M. mycoids genomes. Because there is no genome homology between M. bovis and M. mycoids, the existence of these IS elements in both genomes is evidence of horizontal transfer between the genome.
M. agalactiae is the closest phylogenetically related M. bovis neighbor, and the two genomes exhibit high synteny in their genome structure, with only 15% of M. bovis genes lacking orthologs in M. agalactiae.
Over half of the lipoproteins and several extracellular and transmembrane proteins contain tandem repeats in their coding sequence or promoter regions. These phase-variable proteins exhibit protein sequence variation or expression level variation in different growth phases, and thus, they may change the antigen and M. bovis surface structure to help the organism evade the host immune response.
The M. bovis transporter system is mainly composed of ABC transport systems and PTS, which are in charge of oligopeptide, glycerol, sugar, alkylphosphonate and iron uptake.
As an intracellular pathogen, most of the biosynthesis pathways are incomplete in M. bovis. Glycerol, instead of sugar, is the M. bovis carbon source, which is transformed into glyceraldehyde 3phosphate and participates in the EMP pathway to process pyruvate, with one molecule of ATP produced. Pyruvate is then transformed into acetate through two steps resulting in the production of another ATP molecule. Thus, the process of transforming glycerol into acetate produces two molecules of ATP, which may be an important energy generating pathway.
No mismatch-repair system (MutHLS) genes are encoded in the genome; thus, mutations may accumulate with strain replication and cause some genetic degeneration. The M. bovis pseudogenes mainly belong to the RM system, and the reduction in RM systems increase the possibility of horizontal gene transfer. The function of other pseudogenes may be complemented by their paralogs.
Because pathological changes in the lungs of M. bovis-infected calves are similar to CBPP, 29 M. bovis genes with homology to M. myvoides are also considered virulence factors. Moreover, 73 M. bovis lipoproteins could also be considered potential virulence factors, including eight variable surface lipoproteins, and the known virulence factors P30, P40, P48 and P80. Furthermore, our study validated that five proteins, including three cytoplasmic proteins, have immunogenicity. Other virulence factors include two pathogenicity islands, phase-variable proteins, a hemolysinrelated protein, lipoate protein ligase (LplA), dihydrolipoamide dehydrogenase (the E3 component of PDH), extracellular cysteine protease, 59-nucleotidase, ClpB, cation-transporting P-ATPase (MgtA) and ABC type transporters involved in iron uptake. Because glycerol 3-phosphate oxidation may produce H 2 O 2 and cause host cell damage, the released H 2 O 2 could also be considered a M. bovis virulence factor. These potential virulence factors could be used as candidates for drug therapy and vaccine design.

Bacterial growth and DNA extraction
For Mycoplasma growth, we used a modified ATCC 1699 broth (0.8 g glucose, 20% pig serum, 100 units of penicillin, and 0.05% acetic acid thallium). A commercial tissue genomic DNA extraction kit (Axygen, Inc., USA) was used to purify DNA.
High-density pyrosequencing and sequence assembly of the genome The complete genomic sequencing was conducted using a Roche GS FLX system [40]. A total of 85,820 reads totaling 34,770,094 bases (average read length: 405 bp), was obtained resulting in 38-fold genome coverage. Assembly was performed using the GS de novo Assembler software (http://www.454.com/) and produced 86 contigs ranging from 500 bp to 71,153 bp (the N50 contig size is 27,842 bp). The relationship of the contigs was determined by multiplex PCR [41]. Gaps were then filled in by sequencing the PCR products using ABI 3730xl capillary sequencers. Phred, Phrap and Consed software packages (http://www.genome.washington.edu) were used for final assembly and editing, and low quality regions of the genome were resequenced. The final sequencing accuracy was 99.99%.

Genome annotation
Putative CDS were identified by Glimmer 3.02 [42] and ZCURVE 1.02 [43], and peptides shorter than 30 amino acids were eliminated. Insert sequences were first detected using the IS Finder database (http://www-is.biotoul.fr/is.html) using the default parameters and manual selection. Transfer RNA genes were predicted by tRNAScan-SE [44]. Functional annotation of CDS was performed by searching against an in-house developed Mollicutes protein database using BLASTP [45] and the CDD databases [46] by RPS-BLAST.
The metabolic pathways were constructed using the KEGG database [47]. Subcellular localization of the proteins was predicted by PSORTb (v2.0.1) [48], and lipoproteins were identified with LipoP 1.0 [49]. Pathogenicity islands were detected by IslandViewer [50]. Orthologs and paralogs were defined as proteins with greater than 30% similarity. Genome comparisons were performed using BLAST and displayed by the Artemis Comparison Tool (ACT) [51]. The genome atlas was drawn using GenomeViz1.1 [52].

Phylogenetic tree construction
Orthologs of known Mycoplasma genomes were obtained from the MBGD database [53]. The M. bovis strain Hubei-1 phylogenetic position within the Mollicutes was determined based on 150 ortholog proteins. Concatenated protein sequences of 150 orthologous mycoplasma species proteins were first aligned using ClustalW [54], the conserved alignment blocks were then extracted by the Gblocks program [55]. A maximum likelihood tree was built with PHYML [56] with the following parameters: 100 replications for bootstrap analysis, ''JTT'' for the substitution model, ''estimated'' for the proportion of invariable sites, ''estimated'' for the gamma distribution parameters, ''4'' for the number of substitution categories, ''yes'' to optimize tree topology, and ''BIONJ'' for starting tree(s).

Nucleotide sequence accession numbers
The complete genomic sequences of M. bovis strain Hubei-1 have been deposited in GenBank under accession number CP002513. Figure S1 Comparison of genomic structure between Mycoplasma bovis Strain Hubei-1 and Mycoplasma bovis PG45. Red lines represented homologous sequences with same direction between the 2 genomes; blue lines represented homologous sequences with reversed direction between the 2 genomes. (TIF)

Author Contributions
Conceived and designed the experiments: JX Y. Li. Performed the experiments: Y. Liu WC ZS. Analyzed the data: HZ JX. Contributed reagents/materials/analysis tools: JX. Wrote the paper: Y. Li JX HZ.