Genetic Diversity of Circulating Rotavirus Strains in Tanzania Prior to the Introduction of Vaccination

Background: Tanzania is currently rolling out vaccination against rotavirus diarrhea, a major cause of child illness and death. As the vaccine covers a limited number of rotavirus variants, this study describes the molecular epidemiology of rotavirus among children under two years of age in Dar es Salaam, Tanzania, prior to implementation of vaccination.
Methods: Stool specimens and demographic and clinical information were collected from 690 children admitted to hospital due to diarrhea (cases) and 545 children without diarrhea (controls) during one year. Controls were inpatients or children attending child health clinics. Rotavirus antigen was detected using ELISA, and positive samples were typed by multiplex semi-nested PCR and sequencing.
Results: The prevalence of rotavirus was higher in cases (32.5%) than in controls (7.7%, P<0.001). The most common G genotypes were G1 followed by G8, G12, and G4 in cases and G1, G12 and G8 in controls. The Tanzanian G1 variants displayed 94% similarity with the Rotarix vaccine G1 variant. The commonest P genotypes were P[8], P[4] and P[6], and the commonest G/P combinations were G1P[8] (n = 123), G8P[4] and G12P[6]. Overall, rotavirus prevalence was higher in cool (23.9%) than in hot months (17.1%) of the year (P = 0.012). We also observed significant seasonal variation of G genotypes. Rotavirus was most frequently found in the age group of four to six months. The prevalence of rotavirus in cases was lower in stunted children (28.9%) than in non-stunted children (40.1%, P = 0.003) and lower in HIV-infected (15.4%, 4/26) than in HIV-uninfected children (55.3%, 42/76, P<0.001).
Conclusion: This pre-vaccination study shows predominance of genotype G1 in Tanzania, which is phylogenetically distantly related to the vaccine strains. We confirm the emergence of genotypes G8 and G12. Rotavirus infection and circulating genotypes showed seasonal variation. This study also suggests that rotavirus may not be an opportunistic pathogen in children infected with HIV.


Introduction
Rotavirus is a major cause of severe dehydrating diarrhea in both developed and developing countries [1,2]. The WHO rotavirus surveillance network estimates that more than a third of diarrhea hospitalizations among children under five years of age are attributable to rotavirus infection [3]. In 2008, rotavirus caused an estimated 453,000 deaths in children younger than five years, more than half of which occurred in developing countries [4].
Due to the limited effect of sanitation-based strategies for preventing the spread of the virus, several rotavirus vaccines have been developed, of which two oral vaccines (RotaTeq and Rotarix) have been licensed. The introduction of rotavirus vaccines in developed countries has significantly reduced diarrheal mortality and hospitalizations [5][6][7][8][9][10]. A live attenuated monovalent vaccine (Rotarix) was introduced in Tanzania in early 2013 and has been included in the national childhood vaccination schedule.
Rotavirus infection is common in both temperate and tropical climatic areas and shows distinct seasonality [29]. Before the discovery of the viral agent it was called 'winter diarrhea' or 'winter gastroenteritis' in some parts of the world [29][30][31]. Attention to the seasonality of rotavirus disease has increased, and seasonal patterns have been well documented in temperate countries [32,33]. Recently, two studies have also demonstrated an inverse relationship between temperature and rotavirus incidence in the tropics [29,34].
In Tanzania, it is estimated that rotavirus causes more than one third of diarrheal disease hospitalizations and kills more than eight thousand children under five years of age each year [35]. In a previous study conducted in 2005-2006 [36], we reported a high prevalence of rotavirus G9 among under-five children admitted with diarrhea in Dar es Salaam. The present study provides an update on the distribution of rotavirus G and P genotypes among children with diarrhea (cases) and compares it with that in children without diarrhea (controls) between August 2010 and July 2011. In addition, we assessed the impact of HIV, clinical features and seasonal variation on rotavirus infection. The findings provide baseline information on rotavirus infection shortly before vaccine introduction in Tanzania.

Ethics Statement
This study received ethical approval from the Senate Research and Publications Committee of Muhimbili University of Health and Allied Sciences in Dar es Salaam, Tanzania and from the Regional Committee for Medical and Health Research Ethics (REK) in Norway. Permission was also obtained from the respective hospital authorities where recruitment of study participants took place (i.e. MNH, Amana and Temeke Hospitals). Written informed consent was obtained from the parent, next of kin, caretaker, or guardian on behalf of all the minors/children enrolled in the study.

Study Population
This study was conducted between August 2010 and July 2011 in Dar es Salaam, Tanzania, a city with a population of about five million. Sample collection was performed during two seasons, starting in August 2010 and in March 2011, aiming for a minimum of 300 cases and 300 controls in each period. The target for cases was reached in January 2011 and in June 2011, while enrollment of controls continued into February 2011 and July 2011, respectively. A total of 1266 stool specimens were collected from children below two years of age. Cases were children hospitalized due to diarrhea at three major hospitals in Dar es Salaam: Muhimbili National Hospital (MNH), Amana Municipal Hospital and Temeke Municipal Hospital. MNH, with a bed capacity of 1200, is the largest hospital in the country and serves as a tertiary and national referral hospital. Amana and Temeke are municipal district hospitals of Dar es Salaam. Controls were children below two years of age with no history of diarrhea for one month prior to study enrollment. These were either children attending child health clinics for immunization and growth monitoring (CHC, n = 310) or children admitted to hospital due to diseases other than diarrhea (n = 235).

Inclusion Criteria and Case Definition
Children admitted to the diarrhea wards with acute or persistent diarrhea were included in the study. Diarrhea was defined as three or more watery stools within 24 hours. An episode of diarrhea was considered over when two consecutive days had passed without diarrhea. Acute diarrhea was defined as diarrhea lasting at least 24 hours but less than 14 days. Persistent diarrhea was defined as diarrhea lasting 14 days or more.
Controls included in the study were children without history of diarrhea for one month prior to enrollment. Controls were not matched by age and sex with cases.
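The case definition above can be written as a few small rules. The following Python sketch is illustrative only: the function names are hypothetical, and treating a "day without diarrhea" as a day with fewer than three watery stools is our reading of the definition, not something the study states explicitly.

```python
from typing import Optional, Sequence

WATERY_STOOLS_THRESHOLD = 3   # three or more watery stools within 24 hours
ACUTE_MIN_HOURS = 24          # acute: at least 24 hours ...
PERSISTENT_MIN_HOURS = 14 * 24  # ... and less than 14 days; persistent: 14 days or more


def is_diarrhea_day(watery_stools_in_24h: int) -> bool:
    """True if the stool count for a 24-hour period meets the diarrhea definition."""
    return watery_stools_in_24h >= WATERY_STOOLS_THRESHOLD


def classify_episode(duration_hours: float) -> Optional[str]:
    """Classify an episode by duration; None if it is shorter than 24 hours."""
    if duration_hours < ACUTE_MIN_HOURS:
        return None
    if duration_hours < PERSISTENT_MIN_HOURS:
        return "acute"
    return "persistent"


def episode_is_over(daily_watery_stools: Sequence[int]) -> bool:
    """True if the two most recent days were both free of diarrhea (assumed reading)."""
    if len(daily_watery_stools) < 2:
        return False
    return not any(is_diarrhea_day(n) for n in daily_watery_stools[-2:])
```

For example, `classify_episode(48)` falls in the acute range, while an episode lasting exactly 14 days is classified as persistent.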

Exclusion Criteria
Children above 24 months of age and cases that could not provide a stool sample on the day of admission were not included in the study. Cases and controls whose parent or guardian did not consent to participate in the study were excluded.

Collection of Information from Children with Diarrhea (Cases) and from Children without Diarrhea (Controls)
Cases were recruited on admission to the pediatric diarrhea wards of the study sites. A standardized questionnaire was used to collect demographic and clinical information, including age (date of birth), sex, place of residence, parent/guardian level of education and history of antibiotic use prior to admission. Consistency of stool (watery, bloody) and duration of diarrhea were also recorded, as were the child's length and weight. Additional clinical information, such as hydration status as assessed on the day of admission by the attending clinician, was obtained from patient files together with HIV testing results. HIV testing was done by HIV-DNA PCR at the hospital laboratory (MNH) or the research laboratory at MUHAS, as previously described [37]. Potential controls were enrolled during the recruitment of cases. Because controls had not had diarrhea for one month prior to study enrollment, the questionnaire for controls did not include clinical characteristics such as type of diarrhea and presence of dehydration.

Weight and Length Measurements
All children were weighed using 25 kg Salter hanging scales (CMS Weighing Equipment, High Holborn, London, United Kingdom). Weight was recorded to the nearest 0.1 kilogram. Length was measured to the nearest millimeter using standard length boards. Weight for age (WAZ), length for age (LAZ) and weight for length (WLZ) Z-scores were calculated using EPI Info (USD, Inc., Stone Mountain, GA). Children were categorized as having normal nutritional status or mild or severe malnutrition using Z-scores according to WHO criteria [38].

Specimen Collection
A single stool specimen was collected on inclusion from each child using wide-mouthed sterile plastic containers. One portion was frozen at −70°C on the day of collection.

Rotavirus Detection
Rotavirus antigens were detected using the commercially available ProSpecT Rotavirus ELISA kit (Oxoid, Hants, UK), with 10% fecal suspensions according to the manufacturer's instructions.

Extraction of RNA
Fifty mg of rotavirus antigen-positive stool specimen were mixed 1:10 with Bacterial Lysis buffer (Roche Applied Science, Mannheim, Germany) and centrifuged at 13,000 × g for 3 minutes. RNA was extracted from 200 µL of supernatant using the Magna Pure LC High Performance Total Nucleic Acid Isolation Kit (Roche Applied Science, Mannheim, Germany).

Reverse Transcription, G and P-typing of Rotaviruses
Reverse transcription and G and P typing were done as described in European Rotavirus Detection and Typing Methods version 4 [39]. Briefly, a total of 40 µL of RNA extract was used as template for reverse transcription with random primers. Rotavirus G and P genotyping was performed using semi-nested, type-specific multiplex PCRs that could detect eight G genotypes (G1, G2, G3, G4, G8, G9, G10 and G12) and six P genotypes (P[4], P[6], P[8], P[9], P[10] and P[11]). PCR products were subjected to electrophoresis on a 2% agarose gel, stained with Gel Red, and observed under ultraviolet light. The rotavirus G and P genotypes were determined by the specific size of their amplicons on the agarose gel. Nontypeable strains were confirmed as rotavirus using a single-round, rotavirus VP6-specific PCR. Negative and positive controls were included in all PCR assays.

Sequencing of the VP7 and VP4 Gene of the Rotavirus Positive Strains
Partial sequencing of the first-round PCR amplicons of the VP7 gene was performed for all rotavirus-positive samples in order to confirm the PCR G-typing results, as described in the European Rotavirus Detection and Typing Methods version 4 [39]. Randomly selected rotavirus-positive isolates (33%) were subjected to partial sequencing of the VP4 gene to confirm the PCR P-typing results. Sequencing of both VP4 and VP7 was done using purified first-round PCR products on an ABI3730 machine using BigDye (Applied Biosystems). The same primers as in the PCR were used: both forward and reverse primers for the VP7 gene, but only forward primers for the VP4 gene.

Sequence and Phylogenetic Analysis of the VP7 Gene
Nucleotide sequences were analyzed using the RipSeq interpretation software (iSentio Ltd., Bergen, Norway) and the nucleotide BLAST service (NCBI). The evolutionary distances between Tanzanian strains, vaccine strains and reference strains from GenBank were investigated by pairwise comparison from multiple sequence alignments using the Geneious software package (Biomatters), and the phylogenetic tree was constructed using the UPGMA method with Kimura two-parameter distances [40]. A bootstrap resampling analysis (1000 replicates) was performed to test tree reliability.
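To make the distance measure concrete, the sketch below computes a Kimura two-parameter distance for one pair of aligned sequences, d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of transitions and transversions. It is a minimal illustration, not the software used in the study; the function name is hypothetical, and skipping gaps and ambiguous bases is a simplifying assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: columns with gaps or ambiguity codes are ignored.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)
```

A matrix of such pairwise distances is the input that a UPGMA clustering step would then turn into a tree.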

Nucleotide Sequence Accession Numbers
The DNA sequences for the VP7 genes of the study strains were submitted directly to GenBank and were assigned accession numbers from KF976838 to KF976860.
The DNA sequences for the VP4 genes of the study strains were submitted directly to GenBank and were assigned accession numbers from KF976815 to KF976837.

Statistical Analysis
Weight for age, length for age and weight for length Z-scores were calculated using EPI Info (USD, Inc., Stone Mountain, GA, USA). Statistical analysis was performed using the Statistical Package for the Social Sciences (SPSS for IBM-PC, release 18.0; SPSS Inc., Chicago, IL, USA). Differences in proportions were tested using the chi-square (χ²) test. A p-value of <0.05 was considered significant. The association between rotavirus positivity, genotypes, and clinical and demographic characteristics was estimated as the odds ratio (OR) between rotavirus-infected and rotavirus-uninfected individuals in a logistic regression model. Factors from the univariate analysis were kept in the main logistic regression model. A separate regression model including HIV as a factor employed manual, backwards, stepwise elimination of nonsignificant factors (p≥0.05).
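As an illustration of the univariate comparisons described here, the following Python sketch computes an uncorrected Pearson chi-square test and a crude odds ratio with a Woolf (log-based) 95% confidence interval for a 2×2 table. The analysis in the study was run in SPSS; the function names below are hypothetical, and the example counts are the HIV comparison reported in the abstract (4/26 rotavirus-positive among HIV-infected cases vs. 42/76 among HIV-uninfected cases).

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Crude odds ratio with a Woolf confidence interval (all cells must be > 0)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Rows: HIV-infected, HIV-uninfected; columns: rotavirus-positive, rotavirus-negative.
hiv_pos_rv_pos, hiv_pos_rv_neg = 4, 26 - 4
hiv_neg_rv_pos, hiv_neg_rv_neg = 42, 76 - 42

chi2, p_value = chi_square_2x2(hiv_pos_rv_pos, hiv_pos_rv_neg, hiv_neg_rv_pos, hiv_neg_rv_neg)
odds_ratio, lower, upper = odds_ratio_ci(hiv_pos_rv_pos, hiv_pos_rv_neg, hiv_neg_rv_pos, hiv_neg_rv_neg)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, OR = {odds_ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

The crude values obtained this way are close to, but need not be identical with, the estimates reported from the regression software.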
A total of 236 samples were P genotyped using RT-PCR; 211 were from cases and 25 from controls. One third of the 236 samples underwent sequencing of the VP4 gene, which produced the same results as RT-PCR. Of the six P genotypes searched for, only three were found. The commonest circulating P genotype in cases was P[8] (n = 141, 66.8%), followed by P[4] (n = 40, 19.0%) and P[6] (n = 30, 14.2%). Rotavirus P[9], P[10] and P[11] were not found. Table 1 shows G/P combinations for the 211 samples that were successfully typed for both G and P genotypes (among the 236 P-typed samples, 25 could not be G-typed). The commonest G/P combination in cases was G1P[8], accounting for 123 samples, followed by G8P[4] (n = 27) and G12P[6] (n = 21).
We found no rotavirus G or P genotype associated with diarrhea. The common G and P genotypes circulating in children with diarrhea were also common among children without diarrhea as shown in Table 1. We did not find samples with multiple strains of rotavirus.

Analysis of VP7 Nucleotide Sequences
Phylogenetic analysis based on VP7 nucleotide sequences showed that sequences varied by only 1-2% within the rotavirus G1 strains detected in Tanzania (Fig. 1A). These G1 strains also showed 98-99% nucleotide similarity to G1 strains circulating worldwide, such as in Malawi, Bangladesh, Belgium and USA (GenBank accession numbers JN591404, EF690754, EF690759, HQ392309 and HM773752). Notably, only 94% similarity was found between the Tanzanian G1 genotype and the rotavirus vaccine strains RotaTeq (GU565057) and Rotarix (JN849114).
The rotavirus G8 strains in this study had 98-99% nucleotide similarity with each other; they are also closely related (98%-99% nucleotide similarity) to circulating rotavirus G8 strains in Kenya, Malawi and USA with GenBank accession numbers EU488721, FJ386444, JN591405, GQ496281, JF693231, KC215513. The G12 genotypes in this study displayed nucleotide similarities above 99%. These rotavirus G12 strains were also closely related to circulating G12 strains in Malawi, India and Nepal with GenBank accession numbers EU573780, EF559260, AB263988 and AB275292.
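The similarity percentages quoted above are pairwise nucleotide identities over aligned sequences. A minimal Python sketch of that calculation follows; the function name is hypothetical, the choice to ignore gapped columns is an assumption, and the study's own values came from its alignment software rather than from code like this.

```python
def percent_identity(seq1: str, seq2: str) -> float:
    """Percentage of identical nucleotides between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: alignment columns containing a gap are not counted.
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100 * matches / compared
```

Two aligned 10-nucleotide fragments that differ at a single position would give 90% identity under this definition.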

Analysis of VP4 Nucleotide Sequences
A comparison of VP4 nucleotide sequences from Tanzanian strains and representative P-genotype strains from the GenBank database is shown in Figure 1B. Tanzanian P[8] strains showed 98% nucleotide identity to each other and to circulating P[8]

Seasonality of Rotavirus Infection and Rotavirus G-types
The prevalence of rotavirus varied significantly by month in both cases and controls (P<0.001). When results were divided into cool and hot months of the year, i.e. May through August vs. November through February, rotavirus prevalence (in both cases and controls) was significantly higher in cool than in hot months (23.9% vs. 17.1%, P = 0.012, OR 1.52, 95% CI: 1.09 to 2.11). As shown in Figure 1, a high number of rotavirus infections was also detected among cases at the beginning of the cool season, i.e. April 2011, and between the cool and the hot season, periods that coincide with the rainy season.
We also observed significant variation in the detection of G genotypes across the months of the study period (P<0.001), as shown in Figure 2. The commonest genotype, G1, was detected in most months of the study, with the highest peaks in April 2011, the beginning of the cool season, and May 2011. Genotypes G8 and G12 were detected in most months of the study period, but genotype G9 was detected only in October 2010.
We compared rotavirus prevalence in studies conducted in previous years in the same region but during different seasons of the year. Prevalence was higher in studies conducted during the cool months of the year [41,42] than in studies conducted during the hot months [36,43] (Table 2).

Association between Rotavirus Infection and HIV Status
A total of 421 children were tested for HIV; of these, 33 tested positive and 388 tested negative. In univariate analysis, the prevalence of rotavirus infection among children with diarrhea (cases) was significantly lower in HIV-infected (15.4%, 4/26) than in HIV-uninfected children (55.3%, 42/76, P = 0.001, OR 0.152, 95% CI: 0.05-0.48). HIV status was not included in the final regression model because it would have introduced a high number of missing values, as 54% of rotavirus-infected children were not tested for HIV. A separate regression model was fitted including HIV status and the other significant risk factors from the univariate analysis (length for age, place of residence and type of diarrhea). After stepwise, backwards removal of all non-significant factors (P≥0.05), rotavirus infection remained significantly negatively associated with HIV infection (P = 0.027, OR 0.26, 95% CI: 0.08 to 0.85) and stunting (P = 0.005, OR 0.23, 95% CI: 0.08 to 0.63).

Distribution of Rotavirus Infection and G/P Genotypes by Age and Sex
The median age of all rotavirus-infected children (9.6 months) was significantly lower than that of rotavirus-uninfected children (10.7 months, P = 0.003). There was no significant difference in the median age of rotavirus-infected children with and without diarrhea (9.6 months vs. 9.7 months). Figure 3 shows that the prevalence of rotavirus infection in cases was significantly higher in the age group 0-6 months than in the age group ≥19 months (35.1% vs. 23.4%, P = 0.028, OR 2.14, 95% CI: 1.09 to 4.23). The prevalence of rotavirus infection in the age groups 7-12 and 13-18 months did not differ significantly from that in the age group ≥19 months. The prevalence of rotavirus infection in controls did not differ significantly between age groups (P>0.05). Rotavirus genotypes G1 and G8 were detected in all age groups, whereas G4 was detected only at 0-12 months of age, and one strain of G9 was detected in a child in the 0-6 months group.
The proportion of rotavirus infection in all children with and without diarrhea did not differ significantly by sex as shown in Table 3.
The distribution of rotavirus G and P genotypes was not significantly associated with sex of the child, presence of dehydration, nutritional status or HIV status (P>0.05).

Discussion
Rotavirus is the leading cause of severe diarrhea both in developed and developing countries. This one-year surveillance study described the molecular epidemiology of rotavirus infection in children in Dar es Salaam, the major city of Tanzania with a population of about five million inhabitants. Children with diarrhea were six times more likely to be infected with rotavirus than those without diarrhea. The study confirms findings from other studies twenty years ago in the same location and elsewhere [42,44] indicating that rotavirus is still a major pathogen causing diarrhea in children in Tanzania. The presence of rotavirus among controls may represent reservoirs for transmission in the community.
In the current study, five G and three P genotypes were detected. Rotavirus G1P[8] was the most prevalent G/P combination; this genotype combination is reported to be responsible for 50-65% of rotavirus infections in children worldwide [45]. Since more than 60% of the study subjects were affected by this genotype combination, which is included in the vaccine currently introduced in Tanzania (Rotarix), we assume that the vaccine will be protective, provided that the circulating genotype is stable. Rotavirus G8 was the second most common genotype in this study, and this is the first time it has been reported in Tanzania. Of note, all the G8 strains detected were initially genotyped as G12 by multiplex semi-nested PCR using primers described in the European Rotavirus Detection and Typing Methods version 4 [39]. The G8 genotype-specific primers were compared to nucleotide sequences of the G8 viruses isolated in this study, and three to four primer mismatches were found, predominantly at the 3′ end. When G12 genotype-specific primers were compared to the G8 sequences, the mismatches were fewer and in less critical positions than the unintended mismatches for the G8-specific primers. Consequently, in samples containing G8 genotype viruses, none of the primers had a perfect match, and the G12 primers by chance bound most strongly, producing a false-positive G12 result. Due to the high mutation rates of viral genomes, PCR-based typing strategies will generally be more error-prone than typing based on sequencing. This can result in erroneous epidemiological data and a poor foundation for further vaccine research. Other studies have also documented mistyping of rotavirus strains by multiplex RT-PCR [20,46,47,48]. We suggest that caution should be taken when interpreting rotavirus G genotypes based on multiplex PCR. Furthermore, these findings emphasize the greater robustness of sequencing for typing of rotaviruses.
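The primer comparison described above amounts to locating mismatches between a primer and its binding site and noting how close they lie to the primer's 3′ end, where mismatches most impair extension. The Python sketch below illustrates the idea with made-up sequences; the function names, the five-base 3′ window and the example sequences are all hypothetical and are not the primers or viral sequences from this study. Degenerate bases are not handled.

```python
from typing import List


def mismatch_positions_from_3prime(primer: str, binding_site: str) -> List[int]:
    """Positions (1 = 3'-terminal base) where the primer differs from its binding site.

    Both sequences are written 5'->3' on the same strand and must have equal length.
    """
    if len(primer) != len(binding_site):
        raise ValueError("primer and binding site must have the same length")
    n = len(primer)
    positions = [
        n - i
        for i, (p, t) in enumerate(zip(primer.upper(), binding_site.upper()))
        if p != t
    ]
    return sorted(positions)


def count_3prime_mismatches(primer: str, binding_site: str, window: int = 5) -> int:
    """Number of mismatches within the last `window` bases at the 3' end (assumed cutoff)."""
    return sum(1 for pos in mismatch_positions_from_3prime(primer, binding_site) if pos <= window)


# Hypothetical 20-mer and a binding site differing at two positions near the 3' end.
example_primer = "ACGTACGTACGTACGTACGT"
example_site = "ACGTACGTACGTACGTAGGA"
example_positions = mismatch_positions_from_3prime(example_primer, example_site)
```

Comparing such counts for each type-specific primer against a sequenced strain is one way to see why a primer for a different genotype might bind preferentially.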
Figure 1 (caption fragment). Strains from this study are indicated by accession numbers. The Geneious software package was used to build the tree with the UPGMA method, bootstrapped with 1,000 repetitions; the Kimura two-parameter substitution model was used. The bar indicates nucleotide substitutions per site. doi:10.1371/journal.pone.0097562.g001
Genotype G12 was previously documented by sequencing in only three African countries, namely Malawi, South Africa and Nigeria [18][19][20]. The present study is the first documentation of genotype G12 in Tanzania.
Surprisingly we found only one strain of G9 in combination with P [8] in the current study, while this genotype predominated in the same study setting 8 years earlier [36]. This emphasizes the need for continuous rotavirus strain surveillance.
Apart from defining the serotype of rotavirus, G proteins are critical to vaccine development because they are targets for neutralizing antibodies that are believed to be important for protection. The two current rotavirus vaccines, i.e. RotaTeq (RV5) and Rotarix (RV1), can control infection with the five main rotavirus genotypes, which are G1, G2, G3, G4 and G9 [49]. Phylogenetic analysis was performed to show the relationship between the Tanzanian G1P[8] strains and the strains in the Rotarix and RotaTeq vaccines. Our results revealed that the Tanzanian G1P[8] strains are distantly related to the G1P[8] vaccine strains. This suggests that circulating G1P[8] strains may have changed over time through accumulated mutations, making them different from the original vaccine strains, which were isolated over twenty-six years ago [49,50]. In this study we report a significant increase in the prevalence of P[6] and P[4] rotavirus-positive samples compared to the previous study in the same region [36]. Rotavirus samples with P[6] in this study were associated with a variety of G genotypes (G4, G8, and G12). Phylogenetic analysis revealed that rotavirus P[6] in this study is closely related to P[6] from other African countries such as Malawi and South Africa. The high prevalence of rotavirus genotypes not included in the current rotavirus vaccines, i.e. G8, G12, P[4] and P[6], in this study and in other studies from developing countries may be one of the reasons for the reported lower vaccine efficacy in developing countries [51]. The Tanzanian G1P[8] variant may be able to escape vaccine-induced immunity [50]. However, other possible contributing factors, such as maternal antibodies and changes in gut microbiota, need to be investigated [52,53]. No mixed infection with multiple rotavirus G or P genotypes was detected in this study, which concurs with previous findings in the same study setting [36].
In this study, rotavirus prevalence varied significantly between months of the year, with a peak of rotavirus infection during the cooler months. The seasonal variation observed in the current study is further supported by findings from previous studies from the same region [36,41,42,43]. Studies conducted during cool months of the year [41,42] found a higher prevalence of rotavirus than studies conducted during hot months of the year [36,43]. Understanding seasonal patterns of rotavirus will be useful when considering the appropriate timing of immunization booster programs in settings that have reported poor efficacy of rotavirus vaccine and have demonstrated strong seasonality. Rotavirus vaccine administration in the current Expanded Programme on Immunization schedule in developing countries is at 6, 10, and 14 weeks of age. However, if booster vaccination programs were to be considered for older children lacking immunity, vaccination during the pre-rotavirus season is recommended [34].
In this study we found a significantly lower incidence of rotavirus in HIV-positive children than in HIV-negative children with diarrhea. This supports findings of previous studies in Tanzania and other developing countries where rotavirus is detected less frequently or not detected at all among HIV-positive children with diarrhea [54][55][56][57][58][59]. On the other hand, rotavirus was more prevalent among HIV-positive children than HIV-negative children without diarrhea. More studies are needed to clarify this issue.
Rotavirus diarrhea occurs at an earlier age among children in developing countries than among children in industrialized countries. The mean age of rotavirus infection in this study is comparable to the mean age of rotavirus gastroenteritis in other developing countries, which ranges from 6 to 9 months [60]. However, the rotavirus prevalence of 27.8% in children less than three months of age is notably high, and non-vaccine serotypes, i.e. G8 and G12, were also found in this group. This may have implications for the rotavirus vaccine introduced in the study setting. First, this age group may not fully benefit from the immunization programme, since the first dose is given at 6 weeks. Second, these infants may have acquired immunity from natural infection with rotavirus prior to immunization, and therefore the ability to measure vaccine efficacy in the study setting may be impaired.

Conclusion
This study showed a switch from G9 genotype dominance 8 years earlier to G1 genotype dominance in this study, and a low similarity between the Tanzanian G1 genotype and the vaccine G1 genotype. We have also observed the unusual circulating genotypes G8 and G12 for the first time in Tanzania. Since early 2013, the rotavirus vaccine Rotarix has been included in the Expanded Programme on Immunization (EPI) in Tanzania. The present study represents pre-vaccination data and may be useful in the future when assessing the effectiveness of the vaccine. This study also showed seasonal variation in the prevalence of rotavirus. Rotavirus seasonality provides insights important for vaccination strategies, including potential shifts in seasonal peaks and duration of outbreaks. Our data also support the notion that rotavirus may not be an opportunistic pathogen in children infected with HIV.