Polymorphisms and features of cytomegalovirus UL144 and UL146 in congenitally infected neonates with hepatic involvement

Human cytomegalovirus is a significant cause of hepatic involvement in neonates. In this study, we investigated the polymorphisms and features of the viral genes UL144 and UL146 as well as their significance to congenital hepatic involvement. In 79 neonates with congenital cytomegalovirus infection and hepatic involvement, full-length UL144 and UL146 were successfully amplified in 73.42% and 60.76% of cases, respectively. Sequencing indicated that both genes were hypervariable. Notably, UL144 genotype B was significantly associated with aspartate aminotransferase (P = 0.028) and lactate dehydrogenase (P = 0.046). Similarly, UL146 genotypes G1 and G13 were significantly associated with CMV IgM (P = 0.026), CMV IgG (P = 0.034), alanine aminotransferase (P = 0.019), and aspartate aminotransferase (P = 0.032). In conclusion, the dominant UL144 genotype (B) and UL146 genotypes (G1 and G13) are associated with elevated levels of enzymes and of CMV IgM and IgG in cytomegalovirus infection.


Introduction
Human cytomegalovirus (HCMV) belongs to the subfamily Betaherpesvirinae in the family Herpesviridae, and is congenitally transmitted to about 1.8% of neonates in China [1], and to 0.2−2.2% of newborns in other countries [2]. About 11% of congenitally infected infants born alive are symptomatic and present with multisystemic or fatal disease [1], including hepatosplenomegaly, petechiae, megacolon, microcephaly, neurodevelopmental disorders, and hepatic involvement. In particular, the virus tends to infect the reticuloendothelial system, especially the liver [3]. Although 85−90% of infected neonates do not show clinical evidence of infection, they may develop several clinical outcomes in the following years, including motor deficits, ocular abnormalities, and hearing loss [4]. Why only some, but not all, neonates infected with HCMV develop symptoms is unknown [5]. Although the host immune system is believed to largely determine the outcome of infection, sequence polymorphisms in the infecting strains are also thought to be associated with outcome and tissue tropism [6,7]. Human cytomegalovirus is one of the largest human viruses. It carries approximately 230−235 kb of double-stranded DNA and >200 predicted open reading frames [8−10]. Notably, the laboratory strain AD169 lacks the UL/b' sequence, in contrast to the low-passage clinical strain Towne and several other low-passage clinical isolates. This fragment contains at least 19 open reading frames, including UL133−UL151, and is dispensable for growth in vitro, but may be essential for viral infectivity and pathogenicity in vivo [11]. Of these, UL144 is expressed early in lytic infection, and encodes a structural homologue of a herpesvirus receptor that mediates viral entry. Poole et al. found that, although IE86 represses the UL144-mediated activation of a synthetic NF-κB promoter, it is unable to block UL144-mediated activation of the CCL22 promoter, and this lack of responsiveness to IE86 appears to be regulated by binding of the CREB transcription factor [12]. Moreover, Cheung et al. found that UL144 binds the Ig superfamily member B and T lymphocyte attenuator (BTLA), but not LIGHT, and inhibits T cell proliferation, selectively mimicking the inhibitory co-signaling function of herpesvirus entry mediator (HVEM) [13]. Four UL144 transcripts have been identified in infected cells as variously regulated 3'-coterminal transcripts of 1,300, 1,600, 1,700, and 3,500 nucleotides; the largest transcript, initiated from within the UL141 open reading frame, includes UL141−UL145 [14]. On the other hand, UL146 encodes a viral α (CXC) chemokine (vCXCL-1), a functional chemokine that elicits chemotaxis and mobilizes calcium [15,16]. In infected endothelial cells, the viral chemokine recruits neutrophils via cellular CXCR1 and CXCR2 receptors, and the cells subsequently transport the virus to uninfected endothelial cells. In this manner, a large population of infected endothelial cells is maintained [17].
Remarkably, unrelated strains cluster into defined UL146 genotypes, of which 14−15 have been catalogued [18,19]. Accordingly, UL146 diversity impacts binding affinity, receptor targeting, activation of peripheral blood neutrophils, and, hence, virus dissemination and pathogenesis [20]. Indeed, UL144 and UL146 are among the most polymorphic genes in many clinical isolates [7,15,18,21−24]. The relationship between these polymorphisms and several symptoms has been evaluated in various studies, with inconclusive or contradictory results [25]. However, this relationship has not been reported in congenitally infected neonates with hepatic involvement. Therefore, we investigated UL144 and UL146 polymorphisms and genotypes in 79 newborn infants with congenital cytomegalovirus infection and hepatic involvement, and explored their correlation with clinical outcome.

Study population and sample collection
From November 2014 to May 2016, 79 newborn infants with congenital cytomegalovirus infection and hepatic involvement were recruited at Second Affiliated Hospital of Wenzhou Medical University, Wenzhou, China. Average age was 6 days, with a median of 1.8 days and a range of 1-13 days.
Congenital HCMV infection was diagnosed when a newborn was confirmed to be infected with HCMV within 14 days after birth (including the 14th day), based on either of the following criteria used to define HCMV infection: a viral load in the blood or urine of >500 copies/mL, as detected by fluorescence quantitative PCR; or a serological test demonstrating CMV IgG >1.00 U/mL or CMV IgM >1.00 COI.
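The case definition above can be written as a small decision rule. The following is a minimal sketch only; the function name and parameter names are hypothetical, and missing measurements are treated as "not positive", which is an assumption rather than something stated in the text.

```python
def meets_congenital_hcmv_definition(age_days, pcr_copies_per_ml=None,
                                     igg_u_per_ml=None, igm_coi=None):
    """Apply the stated definition: infection confirmed within 14 days of birth
    (day 14 included) by PCR >500 copies/mL, or by CMV IgG >1.00 U/mL or
    CMV IgM >1.00 COI. Missing values (None) are treated as not positive,
    which is an assumption of this sketch."""
    if age_days > 14:
        return False
    pcr_positive = pcr_copies_per_ml is not None and pcr_copies_per_ml > 500
    serology_positive = (
        (igg_u_per_ml is not None and igg_u_per_ml > 1.00)
        or (igm_coi is not None and igm_coi > 1.00)
    )
    return pcr_positive or serology_positive
```

For example, a hypothetical 6-day-old with a urine load of 1,200 copies/mL meets the definition, whereas the same result obtained on day 15 does not.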
Cytomegalovirus hepatitis was diagnosed according to criteria set at the National Infant Virus Hepatitis Prevention and Cure Symposium in China. The criteria consist of (i) several significant indicators of hepatic involvement, (ii) detection of cytomegalovirus DNA in urine or blood within 2−3 weeks after birth, (iii) absence of Epstein-Barr virus and hepatitis A, B, C, D, and E, and (iv) exclusion of metabolic disorders, alcoholic hepatitis, drug-induced hepatitis, familial cholestasis, autoimmune hepatitis, or idiopathic neonatal hepatitis. Indicators of hepatic involvement include alanine aminotransferase (ALT) > 40 units/L, aspartate aminotransferase (AST) > 35 units/L, and lactate dehydrogenase (LDH) > 245 units/L. Presence of cytomegalovirus DNA was tested by fluorescence quantitative PCR (Applied Biosystems 7500 Real−Time PCR System, Foster City, CA, USA).
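The three enzyme thresholds listed above can be checked programmatically. This is an illustrative sketch; the names are hypothetical, and because the text does not say whether one or all indicators must be exceeded, the function only reports which indicators are above their cut-offs.

```python
# Cut-offs quoted in the text, in units/L.
HEPATIC_THRESHOLDS_U_PER_L = {"ALT": 40, "AST": 35, "LDH": 245}

def elevated_hepatic_indicators(values):
    """Return the sorted names of measured indicators that exceed their cut-off.

    `values` maps indicator names to measurements in units/L; indicators
    that were not measured are simply ignored."""
    return sorted(
        name
        for name, limit in HEPATIC_THRESHOLDS_U_PER_L.items()
        if name in values and values[name] > limit
    )
```

A hypothetical patient with ALT 55, AST 30, and LDH 300 units/L would be flagged for ALT and LDH only.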
Sixty neonates tested positive for serum CMV IgG, while 25 tested positive for both CMV IgG and IgM. CMV IgG and IgM were detected by chemiluminescence immunoassay, following the manufacturer's instructions (Roche Cobas 8000 Analysis System and Auxiliary Kit, Basel, Switzerland). Urine samples were collected during standard examination at admission, and sediments from 5 mL of urine were collected and stored at −80˚C until use. Written informed consent was obtained from the parents of the newborn infants who participated in the study. The corresponding author, Xiangyang Xue, was responsible for anonymizing the data collected from participants, and this anonymization procedure was approved by the Ethics Committee that approved the study. All data for this study were used and analyzed in strictly anonymous form, according to the code of conduct for medical research approved by the hospital's Medical Ethical Committee. The Medical Ethical Committee of the Second Affiliated Hospital of Wenzhou Medical University approved the consent procedure and the current study. Baoqing Li and Yiping Chen, two of the authors of this study, were involved in the collection of participant data.
DNA extraction, polymerase chain reaction, and sequencing
Viral DNA was isolated from urine sediments using the TIANamp Genomic DNA Kit (Tiangen, Beijing, China) according to the manufacturer's protocol and was eluted in 100 μL elution buffer. DNA concentration and purity were assessed by spectrophotometry (Beckman, Fullerton, CA, USA), and samples were stored at -20˚C until use. UL144 was detected using primers designed to generate a 740 bp product [19] after denaturation at 95˚C for 2 min, 40 cycles at 95˚C for 30 s, 55˚C for 30 s, 72˚C for 1 min, and extension at 72˚C for 10 min [19]. UL146 was detected using primers designed by Hassan-Walker et al. [15]. These primers amplified a 721 bp fragment encompassing UL146 and flanking sequences from UL145 and UL147 in reactions consisting of denaturation at 95˚C for 10 min, 40 cycles at 94˚C for 30 s, 55˚C for 30 s, 72˚C for 30 s, and final extension at 72˚C for 10 min [15]. Primer sequences are listed in Table 1. PCR products were cloned into the pEASY-T1 Cloning Vector (TransBionovo, Beijing, China) using the pEASY-T1 Cloning Kit (TransBionovo, Beijing, China), and transformed into Trans-T1 Phage-Resistant Chemically Competent Cells (TransBionovo, Beijing, China) according to the manufacturer's instructions. At least ten colonies were sequenced in both directions on a 3730xl DNA Analyzer (Applied Biosystems), using the universal primers M13 and T7. Clones were analyzed in Chromas and verified by Basic Local Alignment Search Tool [23].
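The two cycling programs described above can be captured as plain data, which makes them easy to compare. The sketch below encodes the stated temperatures and hold times and sums the programmed hold time; the dictionary layout and function name are hypothetical, and ramping time between steps is ignored.

```python
# Each step is (temperature in degrees C, hold time in seconds), as stated in the text.
UL144_PROGRAM = {
    "initial": (95, 120),
    "cycles": 40,
    "cycle_steps": [(95, 30), (55, 30), (72, 60)],
    "final": (72, 600),
}
UL146_PROGRAM = {
    "initial": (95, 600),
    "cycles": 40,
    "cycle_steps": [(94, 30), (55, 30), (72, 30)],
    "final": (72, 600),
}

def programmed_hold_minutes(program):
    """Total programmed hold time in minutes (ramp times are not included)."""
    per_cycle = sum(seconds for _, seconds in program["cycle_steps"])
    total = program["initial"][1] + program["cycles"] * per_cycle + program["final"][1]
    return total / 60
```

Under these figures the UL144 program holds for 92 minutes and the UL146 program for 80 minutes, before any instrument ramping.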
Template DNA used for the sensitivity tests of UL144 and UL146 PCR
HCMV clinical strains isolated in our laboratory were used as templates with full-length primers of UL144 and UL146 (Table 1) to amplify the target genes. The amplified products were electrophoresed on a 1.5% agarose gel containing ethidium bromide (EB) and the bands were photographed. They were also sequenced to ensure that the sequence was correct. The correct PCR product was cloned into the T vector. The concentration was determined and then converted to copy number. The vector was serially diluted to 10^4, 10^3, 10^2, 10^1, and 10^0 copies. Serially diluted samples were used as templates to determine the sensitivity of the primers for the detection of UL144 and UL146; the appearance of target bands was used as the criterion for a positive result.
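The conversion from plasmid concentration to copy number is not spelled out in the text. A common approach, sketched here as an assumption, uses Avogadro's number and an average mass of about 660 g/mol per double-stranded base pair; the function names and the example construct length are hypothetical.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 660.0    # assumed average mass of one double-stranded base pair

def copies_per_microliter(conc_ng_per_ul, construct_length_bp):
    """Convert a dsDNA concentration (ng/uL) to molecule copies per uL."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    grams_per_mol = construct_length_bp * BP_MASS_G_PER_MOL
    return grams_per_ul / grams_per_mol * AVOGADRO

def tenfold_series(start_exp=4, end_exp=0):
    """Copy numbers for a tenfold dilution series, e.g. 10^4 down to 10^0."""
    return [10 ** e for e in range(start_exp, end_exp - 1, -1)]
```

As a usage example, a hypothetical 1,000 bp construct at 1 ng/μL corresponds to roughly 9.1 × 10^8 copies/μL, and `tenfold_series()` lists the five template levels used in the sensitivity test.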

Results
Frequency of UL144 and UL146 detection in neonates with congenital cytomegalovirus hepatic involvement
As shown in Fig 1A and 1B, sharp bands were generated with each primer pair from samples infected with cytomegalovirus, but not from negative controls (HCMV-negative urine samples). Sequencing confirmed the specificity of the PCR reactions (S1 Fig), whose sensitivity was determined to be 10^1 copies/reaction for UL144 and 10^2 copies/reaction for UL146 (Fig 1C and 1D). Bioinformatic analysis of 250 strains indicated that homology was 87.5−100% and 91.3−100% in the forward and reverse primer regions of UL144, respectively. In particular, the last six 3' nucleotides of the UL144 primer were completely conserved, except in four strains, to which the primer did not match well. Therefore, the UL144 primer can theoretically amplify 98.4% (246/250) of the strains analyzed. A similar analysis of 265 strains indicated that the homology in the UL146 forward and reverse primer regions was 90.4−100% and 100%, respectively, with the last six nucleotides of the primer being completely conserved except in two strains. Hence, the UL146 primer should theoretically amplify 99.25% (263/265) of the strains. Out of the 79 samples, UL144 and UL146 were successfully amplified by PCR and sequenced in 58 and 48 patients, respectively (Table 2). The genes were not amplified in approximately one-third of the samples. For specimens without amplification, we tested redesigned UL144 and UL146 primers or primers reported by others [7,31], which also failed, probably due to low DNA copy number or poor PCR sensitivity.
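The in silico primer check described above rests on two quantities: the identity between a primer and its binding site, and whether the last six 3' nucleotides match exactly. The sketch below illustrates both, plus the coverage percentage; the function names are hypothetical, and both sequences are assumed to be given in the same 5' to 3' orientation and at equal length.

```python
def primer_site_identity(primer, binding_site):
    """Percent identity between a primer and an equally long binding site."""
    if len(primer) != len(binding_site):
        raise ValueError("primer and binding site must have the same length")
    matches = sum(a == b for a, b in zip(primer.upper(), binding_site.upper()))
    return 100.0 * matches / len(primer)

def three_prime_match(primer, binding_site, n=6):
    """True if the last n nucleotides (3' end) are identical."""
    return primer[-n:].upper() == binding_site[-n:].upper()

def theoretical_coverage(n_matching, n_total):
    """Percentage of analyzed strains that the primer is expected to amplify."""
    return 100.0 * n_matching / n_total
```

With the counts reported in the text, `theoretical_coverage(246, 250)` gives 98.4 and `theoretical_coverage(263, 265)` rounds to 99.25.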

Features and genotypes of UL144 in clinical isolates
UL144 genes were generally classified into 5 genotypes, named A, B, C, AB, and AC. Two hundred fifty-four previously reported UL144 sequences [5,19,26,27] were retrieved and analyzed in MEGA 5.0. As shown in Fig 2A, we found 97 in group A, 92 in group B, 49 in group C, 11 in group AB, and 5 in group AC. Comparatively, according to our UL144 sequencing results, 49 out of 58 patients were classified as A, B, or C, among which genotype B accounted for approximately half of the patients (51.02%), while genotypes AB and AC were not detected (S1 Table). Our sequences were named "UL144 XXH" (where XX stands for a number) in Fig 2A. The remaining 9 patients were defined as having mixed infection, which is described in the "Features and genotypes of mixed infections" section below. We previously reported the distribution of UL144 genotypes in asymptomatic patients with congenital infection, in whom genotypes A and B were detected, but genotypes C, AB, and AC were not [32]. The polymorphism of UL144 in the Towne strain and our 49 sequences was then examined, with homology ranging from 80% to 100% at the nucleotide level, and from 77.7% to 100% at the amino acid level (S2 Table). Sequence logos of the UL144 alignment showed that the variation was concentrated in the 5' half of the gene, especially the CRD1 domain (Fig 3A).
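Homology figures such as those above are pairwise identities computed over aligned sequences, and the Fig 2 caption states that evolutionary distances were estimated with the Poisson model. The sketch below shows both calculations for two pre-aligned amino acid sequences; it is not the MEGA implementation, the function names are hypothetical, and skipping positions where either sequence has a gap is an assumption about gap handling.

```python
import math

def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') are skipped."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(a != b for a, b in pairs) / len(pairs)

def percent_identity(seq_a, seq_b):
    """Pairwise identity in percent."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))

def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance d = -ln(1 - p); undefined when p = 1."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined for completely different sequences")
    return -math.log(1.0 - p)
```

For two toy five-residue sequences that differ at one site, the identity is 80% and the Poisson distance is about 0.223, slightly larger than the raw proportion of 0.2 because the correction allows for multiple substitutions at one site.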
Amino acid sequence polymorphism within each genotype was also analyzed. Compared with the Towne strain, we found significant differences in genotypes A (79.5−80.6% identity) and C (78.8−82.2% identity) (Fig 4A). Moreover, genotype A isolates differed from the Towne strain by 25−27 amino acid substitutions, as well as by insertion of glutamine (Gln, Q) at position 116, so that the UL144 protein was 176 amino acids in length. Similarly, genotype C isolates differed from the Towne strain by 19−23 amino acid substitutions and by insertion of glutamine at position 116, as observed in 3/7 (42.86%) isolates. Furthermore, amino acids 131−133, corresponding to an arginine (Arg, R), a histidine (His, H), and a threonine (Thr, T), were deleted from genotype C isolates, so that predicted peptide sequences ranged from 172 to 176 amino acids. Missense mutations were the most common polymorphisms in both genotypes A and C. On the other hand, genotype B was strongly conserved, with amino acid homology as high as 97.7−98.8%. Apart from a leucine (Leu, L) to phenylalanine (Phe, F) substitution at position 7, or a phenylalanine (Phe, F) to leucine (Leu, L) substitution, almost all mutations in genotype B were synonymous, and the mutation rate was 56% (14/25) (Fig 4).
To evaluate whether the variation in amino acid sequence would influence the physical and chemical properties of the proteins, protein isoelectric points (IP) and molecular weights (MW) were predicted using the ExPASy database. The median MW was 19.63, 19.49, and 19.30 kDa for genotypes A, B, and C, and the IP was 8.97, 8.87, and 9.04, respectively (S2A and S2B Fig), with significant differences among genotypes (P < 0.001).
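A predicted molecular weight of this kind is essentially the sum of the average residue masses plus one water molecule. The sketch below illustrates that arithmetic; it is a simplified stand-in and not the ExPASy tool itself, the function names are hypothetical, and post-translational modifications are ignored.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS_DA = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_DA = 18.01524

def molecular_weight_da(protein_seq):
    """Average molecular weight of an unmodified peptide, in daltons."""
    seq = protein_seq.upper()
    unknown = sorted(set(seq) - set(RESIDUE_MASS_DA))
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {unknown}")
    return sum(RESIDUE_MASS_DA[aa] for aa in seq) + WATER_DA

def molecular_weight_kda(protein_seq):
    """Same quantity expressed in kilodaltons."""
    return molecular_weight_da(protein_seq) / 1000.0
```

As a sanity check, a single glycine evaluates to about 75.07 Da, the known mass of free glycine. Because every substitution swaps one residue mass for another, the small shifts in median MW between genotypes follow directly from the amino acid differences.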
Given the hypervariable amino acid sequences, we then analyzed the functional motifs of UL144 proteins. Most essential motifs such as N-glycosylation sites, N-myristoylation sites, protein kinase C (PKC) phosphorylation sites, and TNFR/NGFR cysteine-rich regions were present in all genotypes, although genotype C isolates contained two additional N-myristoylation sites due to mutation of valine 19 and serine 53 to glycine (Table 3). Notably, only genotype B strains contained bacterial Ig-like domain 1, a prokaryotic site for binding membrane lipoprotein lipids (Prokar_Lipoprotein), and a CTCHY zinc finger (Table 3). Furthermore, even though the UL144 genotypes in congenitally infected neonates with hepatic involvement possessed the functional sites found in the genotypes of asymptomatic congenitally infected neonates, UL144 type A and type B in the former had one additional PKC phosphorylation site (amino acids 49−51) and bacterial Ig-like domain 1 (amino acids 1−8), respectively [32]. In any case, the UL144 protein from all genotypes consists of two cysteine-rich domains (CRD1 and CRD2), one transmembrane domain (tm), and one cytoplasmic domain (cyto). In comparison to the Towne strain, CRD1 was more variable in genotypes A and C, but was significantly more conserved in genotype B. However, the tm and cyto domains were strongly conserved in all genotypes (Fig 4).

Features and genotypes of UL146 in clinical isolates
UL146 genes were generally classified into 14 genotypes, named G1−G14 [8]. One hundred sixty-three previously reported UL146 sequences were retrieved [6,19,28,29] and analyzed in MEGA 5.0. As shown in Fig 2B, we found 13 in group 1, 13 in group 2, 3 in group 3, 3 in group 4, 4 in group 5, 1 in group 6, 23 in group 7, 25 in group 8, 19 in group 9, 7 in group 10, 20 in group 11, 18 in group 12, 12 in group 13, and 2 in group 14. Comparatively, we successfully sequenced and genotyped 42 out of 48 UL146 fragments into genotypes G1, G2, G6−G9, and G11−G14 (Fig 2B). G9 was the most prevalent, and was found in 15 patients. G12 and G13 were found in seven cases each, while G1 and G7 were detected in three cases each. G8 and G11 infected two newborns each, and G2, G6, and G14 were found in one patient each, while G3−G5 and G10, which were described previously [8], were not detected. Our sequences were named "UL146 XXH" (where XX stands for a number) in Fig 2B. The remaining 6 out of 48 patients were defined as having mixed infection, which is described in the "Features and genotypes of mixed infections" section below.

Fig 2. Our sequences were named "UL144 XXH" or "UL146 XXH" (where XX stands for a number) in Fig 2A or 2B. Pairwise evolutionary distances were estimated using the Poisson model, and trees were constructed by the neighbor-joining method implemented in MEGA 5.0. The reliability of each tree topology was estimated from 100 bootstrap replicates. doi:10.1371/journal.pone.0171959.g002

The polymorphism of UL146 in the Towne strain and all 42 sequences was then examined. Like UL144, UL146 was hypervariable, with homology among strains ranging from 40% to 100% at the nucleotide level, and from 18.5% to 100% at the amino acid level (S2 Table). Comparison with the Towne strain indicated that polymorphisms were distributed throughout the entire coding region except in G7 (Fig 5). Sequence logos of the UL146 alignment also showed that the variation was distributed throughout the entire coding region (Fig 3B). The predicted amino acid sequences were compared among 10 groups (Fig 5). The estimated size of the UL146 protein ranged from 114 to 120 amino acids. Moreover, the mature forms without the signal sequence and the reference sequences contain 93−98 residues [19]. Missense mutations in the ELR motif, including glutamate (E) → asparagine (N) and leucine (L) → glycine (G) substitutions, were observed, and may have far-reaching effects, as the motif is essential for chemokine activity and receptor binding [19,33,34]. The ELRCXC motif also regulates angiogenesis [35]. The Towne strain contained an ELRCPC motif, as did all 35 strains of genotypes G7, G9, and G11−G14, although amino acid homology among these 35 strains was 42.5−100%.
On the other hand, a G2 strain contained an ELRCKC motif, while five G1 and G8 strains contained an ELRCRC motif. Notably, a G6 strain contained an NGRCTC motif without an ELR motif (Fig 5). The amino acid homology between G6 and other genotypes ranged from 18.5% to 30.4%, suggesting a clear difference in protein sequence.
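Motif variants such as ELRCPC, ELRCKC, and ELRCRC all fit the ELRCXC pattern, while NGRCTC does not. A regular expression makes this distinction explicit. The sketch below is illustrative only; the function name and the toy sequences in the usage note are hypothetical and are not the study sequences.

```python
import re

# E-L-R-C, then any single residue, then C.
ELRCXC_PATTERN = re.compile(r"ELRC[A-Z]C")

def find_elrcxc(protein_seq):
    """Return (motif, 1-based start position) for the first ELRCXC match,
    or None if the sequence has no such motif."""
    match = ELRCXC_PATTERN.search(protein_seq.upper())
    if match is None:
        return None
    return match.group(0), match.start() + 1
```

On a toy sequence `"MKLAELRCPCLQ"` the function returns `("ELRCPC", 5)`, whereas a toy sequence carrying `NGRCTC` in the same place returns `None`, mirroring the G6 observation.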
In comparison to the host chemokines CXCL-1 and CXCL8, UL146 contained about 25 additional residues at the C terminus, as previously observed by Heo et al. [20]. Except for proline (P) 56, which was not conserved in cytomegalovirus isolates, the arginine (R) in the ELR motif, two cysteines (C) in the N terminus at positions 59 and 79, and a leucine (L) at position 80 were conserved in UL146 and other host chemokines (Fig 5) [17,20]. Homology in the ELR motif, N-loop, and C terminus also implies differences in chemokine receptor binding and functional response [20].
As for UL144, to evaluate whether the variation in amino acid sequence would influence the physical and chemical properties of UL146 proteins, molecular weights (MW) and protein isoelectric points (IP) were predicted using the ExPASy database. The predicted molecular weight of the UL146 protein was lowest in G13 (10.73 kDa) and highest in G1 (11.20 kDa), whereas the IP was lowest in G12 (9.30) and highest in G1 (9.89) (S2C and S2D Fig), with a significant difference among genotypes (P < 0.001).
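The supporting information names the nonparametric Kruskal-Wallis test for comparing isoelectric points and molecular weights across genotypes. The sketch below computes the tie-corrected H statistic from scratch; the function names are hypothetical, and the closed-form p-value applies only to exactly three groups (chi-square with 2 degrees of freedom), which is an approximation.

```python
import math

def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, pos = 0.0, 0
    for g in groups:
        total += sum(ranks[pos:pos + len(g)]) ** 2 / len(g)
        pos += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / correction  # fails if every value is identical

def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom."""
    return math.exp(-h / 2)
```

For three fully separated toy samples, `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]`, H is 7.2 and the approximate p-value is about 0.027. A statistics package would normally be used in practice; the point here is only to show what the reported P values summarize.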
Depending on the hypervariable amino acid sequences, we then analyzed the functional motifs of UL146 proteins. G1 was predicted to contain an amidation site, three N-glycosylation sites, a tyrosine kinase phosphorylation site, and a protein kinase C phosphorylation site. G7 was predicted to contain an amidation site and two protein kinase C phosphorylation sites, while G9, G12, and G13 contained bacterial Ig-like domain 1. These results suggest that UL146 is post−translationally modified in different ways among the five major genotypes ( Table 4).

Linkage of UL144 and UL146 sequence genotypes
Table note: The numbers represent amino acid positions. The first amino acid encoded by the initiation codon of the UL146 gene was defined as position 1 when locating each functional site. "/" indicates that the corresponding UL146 genotype has no such functional site.

Potential linkage disequilibrium between the UL144 and UL146 genotypes was investigated in 37 singly infected samples, from neonates with congenital cytomegalovirus hepatic involvement, in which both UL144 and UL146 were detected. The observation that 8 of 13 (61.54%) UL146 G9 isolates were UL144 B and that 7 of 7 (100%) UL146 G13 isolates were UL144 B shows a linkage between UL144 and UL146 genotypes (Table 5). However, we did not observe conformity between UL144 and UL146 variants (k = -0.008, p = 0.739) across all patients examined.
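The linkage percentages quoted above are conditional proportions taken from a cross-tabulation of paired genotype calls. The sketch below shows that calculation; the function names are hypothetical, and the example data reproduce only the counts given in the text, with the UL144 genotype of the five remaining G9 isolates left as a placeholder because it is not stated here.

```python
from collections import Counter

def cross_tabulate(pairs):
    """Count each (UL144 genotype, UL146 genotype) combination."""
    return Counter(pairs)

def percent_with_ul144(pairs, ul146_genotype, ul144_genotype):
    """Among isolates of one UL146 genotype, the percentage that carry
    the given UL144 genotype."""
    subset = [u144 for u144, u146 in pairs if u146 == ul146_genotype]
    if not subset:
        raise ValueError(f"no isolates with UL146 genotype {ul146_genotype}")
    return 100.0 * subset.count(ul144_genotype) / len(subset)

# Counts from the text; "non-B" is a placeholder label, not a real genotype.
example_pairs = [("B", "G9")] * 8 + [("non-B", "G9")] * 5 + [("B", "G13")] * 7
```

With `example_pairs`, `percent_with_ul144(example_pairs, "G9", "B")` rounds to 61.54 and the G13 value is 100.0, matching the reported figures.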

Features and genotypes of mixed infections
Two or more sequences were repeatedly detected in the same sample by sequencing after cloning, which indicated that these samples contained more than one HCMV strain. Moreover, alignment of amino acid sequences and clustering analysis also confirmed the presence of multiple UL144 or UL146 genotypes (S1 Table, Table). We noted that most similar analyses in the past were based on virus isolates propagated in culture, a process that may have selected a single strain and, therefore, prevented detection of mixed infection. In contrast, we tested clinical specimens directly to prevent underestimation of virus diversity.
Moreover, the clinical indicators of neonates with mixed infection and those with single-strain infection were compared. The CMV IgM level in newborns infected with a single virus strain had a median of 1.585 COI (interquartile range, 0.404 to 7.348), which was significantly higher (P = 0.003) than that in the group with mixed infection, which had a median of 0.316 COI (interquartile range, 0.255 to 0.462) (Table 6). However, no significant difference was found for CMV IgG level, urine DNA load, blood DNA load, ALT, AST, or LDH.

Distribution of UL144 and UL146 genotypes in different geographic regions
The distribution of cytomegalovirus genotypes differs with geographic region, as demonstrated by Fu et al. [36], Chen et al. [7], and Pignatelli et al. [37]. Thus, we also compared the distribution of UL144 and UL146 genotypes in our cohort with those previously reported, to determine whether geographic differences also exist. We found that the distribution of UL144 genotypes in our cohort (Table 7) was significantly different from the distribution in Illinois, USA [21], Europe [23,26,27,38], Taiwan [7], and even Shenyang, China [5], but similar to the distribution in Tennessee, USA [19] and The Netherlands [39]. We noted, however, that the distribution of UL144 genotypes was similar in Japan and Taiwan [7,22]. Similarly, the  distribution of UL146 genotypes in our cohort was significantly different from the distribution in the UK [29], Poland [6], and Japan [40], in which G2 or G1 is predominant along with G7. However, the distribution in our cohort was similar to the distribution in the US [19] and Europe [41], in which G9 or G12 predominates along with G13 (Table 8).

Correlation between UL144 and UL146 genotypes and clinical indicators
As shown in Table 9, the distribution of the clinical indicators AST (P = 0.028) and LDH (P = 0.046) differed among the UL144 genotypes. Further statistical analysis revealed that the concentrations of AST and LDH in the UL144 genotype B group were clearly higher than those in the other two groups. However, UL144 genotype did not significantly correlate with viral load in the urine or blood, although median viral load was significantly higher in the urine than in the blood. In addition, UL144 genotype did not significantly correlate with serum CMV IgG and IgM, although CMV IgG was generally higher than normal (0…).

As expected, the distribution of the clinical indicators CMV IgM (P = 0.026), CMV IgG (P = 0.034), ALT (P = 0.019), and AST (P = 0.032, Table 10) differed among the UL146 genotypes. Further statistical analysis of ALT and AST levels suggested that G1 and G13 were associated with severe hepatic involvement. High viruria, lower viremia, and lower than normal total, direct, and indirect bilirubin were noted for UL146 genotypes, as observed for UL144. High rates of hepatobiliary disorder, as indicated by elevated hepatic transaminase and conjugated hyperbilirubinemia, were not observed, in contrast to results from other studies [42,43].

Discussion
The UL144 transmembrane and cytoplasmic domains are conserved among clinical cytomegalovirus strains, although the ectodomain and signal peptide are highly variable [21,23]. Accordingly, we noted that UL144 polymorphisms are concentrated in the 5' end of the gene, especially in cysteine-rich domain 1, which may bind the B- and T-lymphocyte attenuator protein. The activity of this domain may vary due to polymorphisms, resulting in a range of clinical symptoms and prognoses [13]. Notably, Heo et al. [19] analyzed cytomegalovirus from asymptomatic infected children and showed that UL144 did not mutate over two years even under pressure from host immunity, suggesting that UL144 diversity is not due to immune selection. In addition, several UL144 motifs such as N-glycosylation sites, protein kinase C phosphorylation sites, and cysteine-rich regions were conserved, suggesting that these domains are critical for infectivity [21]. Strong conservation of UL144 within the same genotype and its long−term stability in the host suggest that immune selective pressure helps maintain UL144 genotype.

The molecular epidemiology of American [19], Japanese [22,40], Italian [6,26], and Polish [23] isolates has been investigated based on UL144 and UL146. In addition, Waters et al. [27] suggested that congenital infection with UL144 genotypes A and C is serious and may lead to long-term clinical symptoms. Indeed, genotypes A and C generate markedly higher plasma viral loads than genotype B [27]. Similarly, Arav-Boger et al. [26,30] demonstrated that genotypes A and C are associated with congenital cytomegalovirus symptoms, with genotype C identified in symptomatic patients only. Further, Pati et al. [44] found that genotype C was significantly more prevalent in symptomatic (6/20) than in asymptomatic infants (2/27). However, UL144 polymorphisms were also found to be unrelated to clinical symptoms in other studies [45,46]. On the other hand, genotype B is usually detected in asymptomatic neonates [27]. Indeed, Paradowska et al. [47] showed that only genotype B was observed in asymptomatic children, while genotypes A and A/B were observed in symptomatic children, although the genotype did not significantly correlate with viral load in the blood and urine [39]. In contrast, genotype B was the most prevalent in our patients, who were symptomatic and were from a geographically distinct Chinese cohort, followed closely by genotype A. Genotype C was the least prevalent, and no recombinant strains were identified, as observed in previous surveys [27]. We noted that the prevalence of UL144 genotypes in our cohort was more similar to those reported by Heo et al. [19] for the US, and by Nijman et al. [39] for The Netherlands, than to those reported by Lurain et al. [21] for the US, by Mao et al. [5] for China, by Waters et al. [27] for Ireland, and by Branas et al. [38] for Spain. Collectively, the data indicate that the relative prevalence of UL144 genotypes was comparable between our cohort and the State of Tennessee, USA. In contrast, our results showed that newborns infected with genotype B had significantly higher AST and LDH than those infected with genotypes A and C, suggesting that genotype B is associated with severe hepatic involvement.
We noted that Boppana et al. [48] and Yamamoto et al. [49] suggested that congenital infections are typically due to recurrent infection among pregnant women, because of virus reactivation and/or further infection with other strains. Moreover, we found that genotype B only slightly stimulated CMV IgM (1.180, 0.389, and 2.710), but strongly stimulated CMV IgG, suggesting that genotype B infection is both short- and long-lived, while infection with genotypes A and C was generally long-lived. However, UL144 genotype did not significantly correlate with viral load in the urine or blood. Indeed, median viral load was <500 copies/mL in the blood for all genotypes, while median viral load in the urine was the highest at 5.90 × 10^5 copies/mL for genotype C, and the lowest at 1.01 × 10^5 copies/mL for genotype A.
In contrast, UL146 polymorphisms, mostly missense mutations, were found throughout the entire coding region. Significant differences in post-translational modification sites in the protein were also noted among dominant genotypes as well as in the predicted isoelectric point and molecular weight, which was also in contrast to UL144. These data indicate that UL146 may be more sensitive to host immune pressure. Further, the ELRCXC motif was present in 36 strains from G6, G7, G9, and G11−14, implying that this functional motif is essential. Indeed, He et al. [31] found this motif in most clinical strains obtained from congenitally infected patients with jaundice, megacolon, and microcephaly.
The relationship between UL146 genotypes and congenital infection has also been investigated [8,19,28]. For instance, Paradowska et al. [6] found that genotypes G1, G5, and G7 were prevalent in central Poland. In other studies in Europe, G1, G2, G7, G9, G12, and G13 were prevalent [41]. As a similar genotype distribution was observed in our cohort, we deem that G9, G12, and G13 are also prevalent in China, although He et al. [14] suggested that G1 and G2 are prevalent among Chinese infants. This discrepancy is probably due to regional differences and differences in the virulence of cytomegalovirus strains [6], as highlighted by Heo et al. [19], who found that G8, G11, and G13 were prevalent in the US. Notably, Paradowska et al. [6] found that, in 121 children with symptomatic infection, G7 and G5 were prevalent in postnatal infection, but G1 was predominant in congenital infection. Heo et al. [19] also found that G8, G10, and G12 are asymptomatic, although UL146 genotype did not correlate with symptomatic infection. Similarly, Arav-Boger et al. [28], Dolan et al. [8], and Hassan-Walker et al. [15] showed that no specific UL146 genotype was associated with clinical manifestation, but probably because of small sample size.
Median CMV-specific IgM was normal in patients infected with genotypes G9 and G12, but much higher than normal in patients infected with G1, G7, and G13. In addition, CMV-specific IgG levels due to G7, G9, G12, and G13 were also elevated and comparable, suggesting that G9 and G12 infections were mainly long-lived, while G1, G7, and G13 infections were both short- and long-lived. Viremia was <500 copies/mL in patients infected with dominant genotypes, but the urine viral load was elevated and comparable among babies with G9, G12, and G13 infections. Levels of ALT and AST suggested that G1 and G13 were associated with severe hepatic involvement.
Mixed infection with multiple genotypes was detected in 15.19% of infants. Paradowska et al. [6] reported that mixed infection was approximately 7% in infants with postnatal infection and 11% in adults, but was not detected in neonates. Furthermore, Ross et al. [50] observed multiple genotypes in 39% (5/13) of urine, blood, and saliva samples. In another survey, mixed infection was also found in 59 infants (45%), but was not associated with symptoms [44]. Similarly, we did not observe a relationship between mixed infection and clinical outcome.
We found associations between specific genotypes and severe hepatic involvement based on comparisons among infants with congenital CMV infection and hepatic involvement. To draw a firm conclusion, it would be important to investigate infants with congenital infection with and without hepatic involvement in the same population; otherwise, the argument for functional effects remains somewhat speculative. We previously reported on the polymorphisms of UL144 in congenitally infected asymptomatic neonates, but did not analyze the polymorphisms of UL146 in such neonates. This will be the subject of a future study. Moreover, it is important to define the significant associations of UL144 and UL146 genotypes and polymorphisms with congenital CMV infection with or without hepatic involvement in a larger sample, and we will collect more cases for our future study.
In summary, we investigated the genotype distribution and polymorphisms in UL144 and UL146, which encode essential cytomegalovirus proteins, to assess whether polymorphisms are associated with hepatic involvement in infected infants. Because several UL144 and UL146 genotypes were detected, we speculate that various strains are transmissible from mother to child, with UL144 genotype A and B being the most prevalent along with UL146 genotypes G9, G12, and G13. We also confirmed that congenital infection with multiple strains occurs. In addition, UL144 genotype B and UL146 genotypes G1/G13 seemed to be associated with severe hepatic involvement. Taken together, the data show that different viral factors determine cytomegalovirus pathology in children [6]. However, the relationship between clinical indicators and UL144 and UL146 genotype needs further study. We emphasize that we are the first to report the prevalence of cytomegalovirus hepatic involvement in Chinese neonates.
Supporting information
S1 Fig. (A, B). The IP and MW of UL144 were predicted for the full-length proteins. The IP and MW of UL146 were predicted for the mature proteins without the signal peptide sequence. The nonparametric Kruskal-Wallis test was used to compare the isoelectric points and molecular weights of UL144 and UL146. P < 0.05 indicates statistical significance. * and ° mark outlying values that deviate from the data set; each box plot shows, from top to bottom, the upper edge value, the 3/4 quantile, the median, the 1/4 quantile, and the lower edge value. (TIF)
S1