Molecular characterization of human group A rotavirus genotypes circulating in Rawalpindi, Islamabad, Pakistan during 2015-2016

Group A rotaviruses (RVA) are one of the major causes of acute gastroenteritis (AGE) in young children worldwide. Owing to lack of proper surveillance programs and health facilities, developing countries of Asia and Africa carry a disproportionately heavy share of the RVA disease burden. The aim of this hospital-based study was to investigate the circulation of RVA genotypes in Rawalpindi and Islamabad, Pakistan in 2015 and 2016, prior to the implementation of RVA vaccine. 639 faecal samples collected from children under 10 years of age hospitalized with AGE were tested for RVA antigen by ELISA. Among 171 ELISA positive samples, 143 were successfully screened for RT-PCR and sequencing. The prevalence of RVA was found to be 26.8% with the highest frequency (34.9%) found among children of age group 6–11 months. The most predominant circulating genotypes were G3P[8] (22.4%) followed by G12P[6] (20.3%), G2P[4] (12.6%), G1P[8] (11.9%), G9P[6] (11.9%), G3P[4] (9.1%), G1P[6] (4.2%), G9P[8] (4.2%), and G3P[6] (0.7%). A single mixed genotype G1G3P[8] was also detected. The findings of this study provide baseline data, that will help to assess if future vaccination campaigns using currently available RVA vaccine will reduce RVA disease burden and instigate evolutionary changes in the overall RVA biology. The high prevalence of RVA infections in Pakistan require to improve and strengthen the surveillance and monitoring system for RVA. This will provide useful information for health authorities in planning public health care strategies to mitigate the disease burden caused by RVA.


Introduction
Group A rotaviruses (RVAs) belong to family Reoviridae are the leading cause of fatal dehydrating diarrhea in infants and young children worldwide particularly in the developing countries of Africa and Asia [1][2][3][4]. The rotavirus is a 11 segmented double-stranded RNA (dsRNA) a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 hospitalizations each day. The hospital provides health care to more than 10 million individuals from urban, peri-urban and rural settings of the country, and is a referral hospital for patients from all over Pakistan. Similarly, another tertiary care setup in Rawalpindi is the General Hospital (now called Benazir Bhutto Hospital (BBH). It has up to 2500 patient visits in OPDs per day. It is a public-sector hospital with a dedicated paediatric unit including a diarrhea ward. These are major metropolitan cities in the country with a large population size (4.5 million). Hence, the results obtained will probably reflect a large section of the country's population.

Ethical approval and consent
The sampling was carried out after gaining informed consents from the parents/guardians of the study participants. Ethical approval was obtained from the ethical committee of PIMS, Benazir Bhutto Shaheed Hospital (BBH) and Internal Review Board (IRB) of COMSATS University, Islamabad, Pakistan.

Sample collection
A total of 639 stool samples were collected from January 2015 to December 2016 from children less than 10 years of age, hospitalized/visited or received IV rehydration treatment in the emergency paediatric ward of two hospitals, BBH in Rawalpindi and PIMS in Islamabad.
Samples were collected in stool collection vials from patients with three or more watery non-bloody stools in a 24 hours period excluding non-infectious bloody diarrheal cases in accordance with WHO case standard guideline [38]. All the samples were initially kept at in the hospital at -20˚C and transported under cold chain to the microbiology and public health laboratory, COMSATS University, Islamabad. Upon arrival at COMSATS, all stool samples were stored at −80˚C, until serological and molecular analysis. Demographic and clinical data including age, gender, residence, hospital admission date, diarrhea onset date, date of stool sample collection, vomiting (duration and episodes per day), diarrhea (duration and episodes per day) and body temperature of the patients was recorded.

Detection of RVA by enzyme immunoassay (ELISA)
Faecal suspensions (10-20%) were prepared by adding 100-200 mg of faecal sample in 1 ml of universal transport medium (UTM, Copan Diagnostics) in a clean 1.5 ml centrifuge tube. The individual faecal suspensions were tested for the detection of RVA antigen using the ProSpectT Rotavirus ELISA Kit (Oxoid Ltd, UK). Stools positive for RVA were shipped to Rega Institute for Medical Research, Leuven, Belgium for further characterization. Specimens were packed in transport box with (ice packs) to maintain a temperature of 2-8 C˚.
The ELISA negative faecal samples were not under the scope of our study so, they were not further tested.

RT-PCR for VP7 and VP4 genes
The extracted RNA template was denatured for 2 minutes at 95˚C followed by reverse transcriptase PCR (RT-PCR) using the One-step RT-PCR Kit (Qiagen/Westburg, The Nederlands). RT-PCR was carried out for both VP7 and VP4 gene fragments using consensus primers Beg9 and End9 [39] for VP7 and VP4_1-17F and Con2Deg for VP4 [40]. Samples negative for PCR using Beg9/End9 and VP4_1-17F/Con2Deg primer sets were further characterized by RT-PCR using the 2nd primer sets (VP7F and VP7R for VP7;VP4F and VP4R for VP4 [41,42], as there might be mutation in primer binding sites which can be captured by other set of primers. The primer sequences are given in the Table 1. The RT-PCR was carried out with an initial reverse transcription step of 30 min at 50˚C followed by polymerase activation at 95˚C for 15 min, 40 cycles of amplification (denaturation: 45s at 94˚C; annealing: 45s at 45˚C for VP4 and 45s at 50˚C for VP7; product extension for 1 min at 72˚C), followed by a final extension of 10 minutes at 72˚C in a Biometra T3000 thermocycler (Biometra, Westburg). PCR products were run on a polyacrylamide gel along with a 50-bp DNA ladder (Sigma Aldrich), stained with EtBr (Sigma Aldrich) and visualized under UV-light. The sample preparation, amplification and end point analysis were kept physically separated. All steps of experiment i.e Extraction, PCR were done in a bio-safety cabinets. All the PCR runs included positive control, negative control and non-target control (reagent blank) to avoid false positive results. A rotavirus positive sample (Genotype G1P[8] was used as a positive control. Quality control sample were run in each batch to check the instrumental stability (S1 and S2 Figs).

Nucleotide sequencing
The PCR product was purified using the ExoSAP clean-up kit (Thermofisher Scientific, USA). The PCR amplicons were then sequenced using the BigDye Cycle Sequencing Kit (Applied Biosystems, USA). The sequencing was performed with the same forward primers as were used for the RT-PCR [43]. After the sequencing reaction, an ethanol precipitation was performed and the final product was loaded in ABI PRISM 3130 automated sequencer (Applied Biosystems, USA) [44].

Determination of RVA genotypes for VP7 and VP4 genes
The chromatogram obtained were analysed by using Chromas 2.6.4 (Technelysium, Queensland, Australia). The sequences were manually corrected and compared with other sequences available in Genbank using BLASTn. Their genotype were then obtained using the RVA online

Phylogenetic analysis
Multiple sequence alignments were performed using ClustalW in MEGA version 6.06 [46]. Phylogenetic trees were constructed by Maximum Likelihood method with kimura-2-parameter model in MEGA 6.06 [46]. The statistical reliability was checked with using 1000 bootstrap replicates. Nucleotide and amino acid distances were calculated using the P Distance Model.

Nucleotide sequence accession numbers
The VP7 and VP4 nucleotide sequences were submitted to the GenBank with following accession numbers: (

Statistical analysis
Statistical analyses were performed with Stata version 13 (StataCorp. 2013) [47]. A chi-square test was performed to test the possible association between gender difference and RVA status. Student's t-test was performed to ascertain the equality of means of RVA positive and negative for continuous variables; age, weight, height, temperature, vomiting (duration and episodes), diarrhea (duration and episodes) and dehydration. Statistical significance was defined as p < 0.05. There was no statistically significant gender difference in the distribution of RVA genotypes (p>0.05) ( Table 2). There was a statistically significant difference between mean age (in months) of RVA gastroenteritis cases as detected by ELISA (p<0.05). Whereas, no significant difference (p>0.05) was found between different age groups positive for RVA through ELISA (Table 2). However, the highest prevalence was observed in children of 6-11 months of age followed by children of 12-23 months of age, while the lowest number of positive cases were observed in children older than 60 months. There was no significant difference found between anthropomorphic and clinical features (weight, height, temperature, diarrhea duration and episodes, vomiting duration and episodes and dehydration) and RVA infections ( Table 2).

Prevalence of RVA genotypes
From the 171 RVA antigen-positive samples, 143(83.6%) were confirmed positive by RT-PCR and the remaining 16.4% were RT-PCR negative. All 143 RT-PCR positive samples were successfully genotyped for VP7 and 140 samples were genotyped for VP4. The most common G type was G3 followed by G12, G9, G2 and G1. A single mixed G-type (G1G3) was detected in one sample (0.7%). The most common P-type was P  (Fig 1).
G9: A total of 24 Pakistani G9 strains sequenced in this study and representative members of the G9 genotype were selected for the phylogenetic analyses (Fig 6). All Pakistani strain fell into G9 lineage 3. Twenty-tree of these G9 strains showed a high nucleotide identity

Discussion
RVA gastroenteritis is considered to be a leading cause of infant and childhood morbidity and mortality, particularly in developing countries like Pakistan [55]. The current study reported the circulating genotypes of human RVA during a two years surveillance activity (2015 and 2016) in Rawalpindi and Islamabad, Pakistan.
The prevalence of RVA found in the current study was 27% which is comparable with the results of previous studies (29-34%) reported from Pakistan during 2008-2014 [23,25,27,30,31]. However, a higher percentage of RVA infections was also reported in the past from Karachi and Faisalabad (57% and 63%, respectively) [28,29]. On the other hand, a community based study by Qazi and colleagues in Karachi Pakistan found a RVA prevalence of 17% [27], which is much lower as compared to current study. There might be several reasons for these observed prevalence differences, such as a different study design, the sampling period (throughout the year or during rotavirus peak season or epidemics), type of health facilities in target hospital, type of used diagnostic procedures, patients age criteria and patient testing criteria. The RVA prevalence in our study was lower when compared with studies reported in other countries like India, Cambodia, Thailand, Mongolia [56-59] and was high when compared with china [60].
The RT-PCR assay failed to detect RVA in 28 faecal samples positive by ELISA. This could be due to one or more of the following reasons; i) false positive ELISA result, ii) PCR inhibitors  that cause positive samples to be negative by PCR, iii) drop of the viral titer below PCR detection limit due to additional shipment and freeze-thaw cycles.
There are seasonal and geographical variations in RVA based infections globally. In temperate climates, RVA epidemics commonly appear during the dry, winter months of the year. However, in most tropical areas, RVA gastroenteritis is prevalent throughout the year without seasonal fluctuation In the present study, no gender differences were found in RVA infections, which is in accordance with the previous reported cases from Pakistan [24,27]. A large proportion (71.4%) of children with RVA gastroenteritis is in less than 1 year of age which is also in agreement with previous work from Pakistan [23,24,27,28]. Some studies have revealed that children younger than 4 months are protected against severe RVA diarrhea mainly due to maternal antibodies and breast feeding which may be the reason for the higher RVA incidence after 7 months of age [65,66]. A low incidence (3.5%) of RVA infection is also reported in older children (>5 years) in this study, an age group in which RVA infections are only poorly appreciated worldwide. The similar trends has also been documented in previous studies reported in Pakistan, Angola, Kenya, Nepal, India, Turkey, China and Bangladesh [16,23,27,60, [67][68][69][70][71][72]. In adults, RVA infections can be symptomless or can be accompanied by nausea, malaise, headache, abdominal pain, fever and diarrhea. In immunocompromised adults, it can range from symptomless to chronic infection [73].
The rotavirus molecular epidemiology varies from country to country, depending upon the socio-economic status and weather conditions [74]. The most common G-genotypes affecting humans are G1, G2, G3, G4, G9 and G12 while the most common human P-genotypes are P The detection rate of the G1 genotype in the current study is 16% which is in accordance with two previous hospital based studies (11.6% and 14.5%) conducted during the years 2008 and 2014, respectively [24,27]. However, this prevalence rate was lower than a study conducted by Tamim and colleagues who reported that 24.3% of the infection were cause by G1 strains during the year 2010 [23]. G1 is responsible for 70% of infections in Europe, North America and Australia, while in south America, Africa and Asia it accounts for 20-30% infections [70]. Phylogenetic analysis suggested that, Pakistani G1 strains isolated during current study clustered with other strains detected worldwide into two distinct G1 Lineages (Lineage 1 and 2,  Fig 3). The different G1 lineages possess antigenic variation at both amino acid and nucleotide level [80].
In this study the G2 genotype is reported in 13% of the cases, which is less than the previous studies (19-24%) reported during 2010 and 2014 [23,24]. Phylogenetic analysis showed that all Pakistani G2 strains from this study clustered in lineage 4 with reference strains from USA, India, Bangladesh, Russia, Canada, Korea, Philippines, Belgium, Italy, Taiwan and two locally circulating strains identified in previous studies (Fig 4).
The G3 genotype is recognized as the third most predominant RVA genotype in humans mostly found in combination with P[8] [81]. RotaTeq, a pentavalent human bovine reassortment vaccine contain G3P[8] genotype in its formulation [82]. Although no direct causal link has been proven, G3 strains with different P types P[4], P[6], P[8] and P[9] have been detected in humans round the world after introduction of RVA vaccines in immunization program of many countries [83,84]. In Pakistan G3 genotype was first reported in 2010 in Faisalabad region in very low percentage (2.6%) [ The human G9 RVA genotype has gained epidemiological significance since mid-1990s and is now acknowledged as the fifth main human RVA genotype [89]. According to an WHO report in 2013, G9 is most frequently found in combination with the P[8] genotype as is less prevalent with the P[4], P[6], P[11] and P[19] genotypes [90,91]. The G9 genotype accounts for 7.4% RVA infections globally and is more commonly present in south Asia and Middle East (20.3%) and is less prevalent in Southeast Asia (5.9%) [92]. Many studies have revealed a close genetic relationship between human and Pig G9 RVA strains, suggesting that interspecies transmissions in combination with reassortment has resulted in the emergence of this genotype in humans [93][94][95]. In Asia, G9 RVA strains are the second most abundant genotype in pigs with a detection rate of 11.4% [96]. In the present study the G9 prevalence rate is 18% which is more than the previously reported G9 rates in Pakistan [23-25,27]. The G9 isolates detected in this study found in combination with P[6] and P[8] genotypes. G9P[6] was more prevalent (11.9%) than G9P[8] (4.2%). These G9 strains are showing close sequence similarity, suggesting that the frequent reassortment events have occurred. Pakistani G9 RVA strains have clustered very close to each other in lineage 3 with other strains isolated all over the world. One Pakistani G9 strain (PAK102) is showing 98.7 nucleotide identity with a porcine RVA strain found in South Africa (RVA/Pig-wt/ZAF/MRC-DPRU1540/2007/G6G9PX). This sequence was directly submitted to GenBank and was identified in a mixed infected sample [97].
The first G12 genotype (G12P[4]) was discovered in 1987 from the Philippines, in children less than 2 years old, affected with acute diarrhea [98,99]. A decade later, G12 was identified as an emerging genotype in several countries of Asia, Europe, North America and South America mostly in combination with P[8] and P[6] [78,100,101]. Now, G12 has been spreading as a devastating genotype in many Asian countries including Pakistan [102]. These unusual strains such as G12P [6] are not included in the formulation of two available rotavirus vaccines Rotarix (GlaxoSmithkline) and RotaTeq (Merck). Hence, it has not yet been determined, whether or not these two available vaccines provide protective immunity against these unusual strains [102]. The continuous monitoring and whole genome-based analyses are essential in understanding the evolutionary dynamics and the spread of G12 strains in Asia. In Pakistan, G12 strains were reported by Alam and colleagues in two children of Rawalpindi, Pakistan for the first time in 2009 [103]. Other studies from the same region by Tamim in 2010 and Umair in 2014 detected G12 with 6.7% and 16.7% prevalence, respectively [23,24]. In the present study genotype G12P[6] was found as the second most abundant genotype with 21% prevalence in Pakistan. This emphasises the continuous spread of the G12 genotype in Pakistan from 2009 to 2016, as well as in Asian countries. On phylogenetic observation, Pakistani G12 strains were most closely associated with Thailand strain (B1373). This reflects the close connection with this country in terms of travel and other business activities [104,105]. The high prevalence of G12 genotype during the same period was also detected in neighbouring country Bangladesh [62]. Therefore, it is suggested that the emergence of G12 in 2015 as a dominant strain in Pakistan more likely was due to their spread from other Asian countries to Pakistan.
The dominance of G3P[8] genotype in 2014, G12P[6] genotype in 2015 and further reemergence of G3P[8] as a dominant genotype in 2016 in Pakistan in the current study suggested that the G3 and G12 strains might be competing and excluding one another in the same vulnerable population. Moreover, the increasing implementation of RVA mass vaccination worldwide might be the reason for genotypic fluctuations and emergence of new genotypes that lead to their global spread including Pakistan.
The P[4] genotype of human RVAs is mostly identified with VP7 genotypes G2 and G8 [106]. However, in the present study P[4] is found in combination with G2 and G3, and is the third most abundant genotype found in 22% of study cases. Phylogenetic analysis of the P[4] genotype revealed that all Pakistani P[4] strains fall in lineage 5 with other strains isolated from USA, India, Bangladesh, Japan, Russia, China, Canada, Ghana, Belgium, Brazil and previously identified Pakistani strains (Fig 8).
The RVA P[6] genotype is detected all over the world with many G genotypes (G1-G6, G8-G12, and G25) [107,108]. In humans the P[6] genotype is considered to be the most prevalent genotype in South Asia and Sub-Saharan Africa [109]. In the present study P[6] is the second most dominant (37%) P-genotype found in the Pakistani population. Comparison of RVA genotype P[6] with reference strains from BLAST search reported similarity with human as well as with one bat RVA strain from Kenya (RVA/Bat-wt/KEN/KE4852/2007/G25P[6]), although it is more likely that the bat P[6] was derived from humans, rather than the other way around [110].
A distinct subtype of the P[8] genotype, P[8]-4 or OP354-like P[8] has been detected in many countries of Europe, Africa and Asia [111]. According to Zeller and colleagues, South and East Asia are the main origin from where OP354-like P[8] strains migrated to Africa, Europe, and North America [112]. In the present study, the P[8] genotype is found to be dominant (39%), and based on our phylogenetic analysis nine Pakistani strains possess an OP354like P[8], whereas all other strains are closely related to the P[8] strains revolving in other countries of the world in most common P[8] Lineage 3. These results show two distinct P[8] lineages are co-circulating in Pakistan at the same time. Owing to antigenic differences in the VP4 spike of the OP354-like strain, it has been questioned whether or not the P[8] moiety in the vaccines would equally well protect against this divergent variant [113].
RVA mixed infection rate has increased globally from 7.9-11.7% from 1997 to 2007 [9, 114]. The RVA mixed infection rate observed in the present study is 0.7% which is similar to previous reports [23,24,26-28]. The total of 2.4% of the study samples were P-Non-typeable even with different primer sets. The possible reasons for genotyping failure might be the primer mismatch due to higher RVA diversity.
The unusual genotypes (G1P[6], G3P[4] and G9P[6]) accounted for 25% of total RVA genotypes in this study, which is lower than previous data reported by Masab and colleagues [24]. The unusual genotypes and mixed genotypes arise due to mixed infections that lead to the reassortment and evolution of novel genotypes. Among unusual genotypes, G3P[4] acquire attention, as it is the second time that G3 in combination with P[4] is reported in Pakistan in this study. This genotype emerged most likely after an intergenogroup re-assortment in which G2P[4] strain already circulating in population may have acquired VP7 gene from locally cocirculating G3 human RVA strains. Likewise already circulating G3P[8] and G3P[6] strains in population can be the parental strains of G3P[4] reassortant strain. Similarly OP354-like P [8] strains detected in the present study were found in combination with G1, G3 and G9. This suggests that there is a frequent ongoing reassortment phenomenon in progress.
Pakistan is one of the countries with the highest disease burden of RVA infections in children of less than five years of age. The introduction of two RVA vaccines has reduced the global mortality rate and number of hospitalizations in many countries [10,14,115]. Pakistan has completed phased introduction of RVA vaccines in the national EPI program in January 2018. The hospital based studies regarding efficacy of RVA vaccines and its effects on the country's specific strains are yet to be determined [23,24,26,27]. However, in a recent survey conducted by Richard and his colleagues, Sindh and Baluchistan province of Pakistan bear the highest RVA burden which is estimated to be 3.1-3.3 rotavirus deaths/1000 births. The vaccine coverage in these two provinces is estimated to be lowest, particularly in children living in the poorest households conditions. Overall, RVA vaccine introduction in Pakistan could prevent 3061 deaths per year with current routine immunization program with highest coverage in capital city Islamabad that could prevent an additional 1648 deaths per year [116].
However, we believe that with the inclusion of more samples for sustained time period and recruiting multiple sentinel sites throughout the country will be required for better understanding the scale of problem. Furthermore, whole genome characterization of human RVA strains and inclusion of samples from animal and environmental sources in future studies is recommended to monitor the ongoing interspecies transmission and reassortment events.

Conclusions
This study concludes the high prevalence of G12 and G3 RVA strains in Pakistan in 2015 and 2016 in Rawalpindi and Islamabad, Pakistan. Although it provides an incomplete epidemiological picture of gastroenteritis caused by RVA throughout the country, the results of this study will provide comprehensive knowledge to the researchers and public health authorities to calculate the RVA disease burden in the country. The observed large diversity of RVA genotypes and their yearly fluctuations demand an intensive, large scale surveillance system in the country. This will help to monitor evolving of unusual RVA genotypes and to assess the impact of vaccine introduction on these recently emerged RVA strains in Pakistani population.
Supporting information S1 Table. GenBank accession numbers assigned for all rotavirus genotypes based on gene segment VP4 and VP7 sequenced in this study.