Smear positive tuberculosis and genetic diversity of M. tuberculosis isolates in individuals visiting health facilities in South Gondar Zone, northwest Ethiopia

Background Tuberculosis (TB), a bacterial infectious disease, continues to be a public health concern in many developing countries. However, lack of data concerning the public health burden and potential risk factors for the disease hampers control programs in target areas. Therefore, the aims of present study were to determine the prevalence of TB and genetic diversity of M. tuberculosis isolates from individuals visiting health facilities in South Gondar Zone, northwest Ethiopia. Methods A cross-sectonal study was conducted between March 2015 and April 2017. Bacteriological examination, region of difference (RD) 9 based polymerase chain reaction (PCR) and spoligotyping were used. Results The overall prevalence of all smear positive TB was 6.3% (186/2953). Extra pulmonary TB (EPTB) was clinically characterized in about 62.4% (116/186) TB-positive cases. Some demographic characteristics, such as patients' origin (districts where patients were recruited) [patients’ origin (chi-square (χ2) value; 62.8,p<0.001) were found to be significantly associated risk factors for the occurrence of TB in the study area. All the mycobacterial isolates were found to be M. tuberculosis. Among the 35 different spoligotype patterns identified, 22 patterns were shared types.The three dominantly identified families were T, CAS and Manu, each consisting of 46.9%, 24.0% and 10.4% of the isolates, respectively. Conclusion The present study revealed that TB continues to be a public health problem in South Gondar Zone which suggests a need of implementing effective disease control strategies.


Study design
A Cross-sectional study was conducted between March 2015 and April 2017on smallholder farmers in South Gondar Zone. All forms of smear positive TB patients who visited the district health centers and perpheral rural health posts, zonal and regional referal hospitals (Debretabor, Felegehiwot and Gambi hospitals) were included in this study. TB patients who were under treatment prior to the start of the study and those under 5 years of age were excluded from the study. For convenience, the patient sources were reduced from10 to 8 districts. A questionnaire based survey was also conducted to assessthe socio-demographic and other risk factors for the occurence of TB in the study area.

Sample size and sampling techniques
The sample size for the present study was determined at district level using a population proportion with specified absolute precision formula [13]: n = z 2 p (1− p)/d2,where n = the number of TB suspects, z = standard normal distribution value at 95% CI which is1.96, p = expected prevalence of tuberculosis among the populaton (0.05, from literature in the study area [3]), d = absolute precision, taken as 5% (we took a larger d due to resource limitations). Accordingly, the sample size calculated, including a 10% non-responsive rate was 396. Hence, for 8districtsin which the study was conducted, a total sample size of 3168 was determined.
Inclusion criteria: All TB suspected smallholder farmers visiting the differnt health facilities in South Gonder Zone, residng in rural and semi-urban settings, who were above 5 years of age, were included in the present study.

Data collection
Socio-demographic data. After eligible subjects were recruited and concented on the study, socio-demographic data of the study participants was collected using a structured questionaire. Data on patient characteristics such as age, sex, district of residency, education status, previous exposure to TB, raw milk consumption habit, and family TB history were collected. Moreover, clinical data on the form of TB cases as either pulmonary or extrapulmonary was included.
Sputum sample collection and processing. Paired morning-spot sputum samples from clinically suspected pulmonary TB patients and spot fine needle aspirate (FNA) samples from suspected TB lymphadenitis patients were collected by trained laboratory technicians and pathologists.Sputum samples were collected in standard sterile containers and FNA specimens were collected with cryo-tubes with phosphate buffer saline (PBS) pH 7.2. The sputum and FNA samples were processed for smear microscopic examination using standard protocols [14]. The samples were first digested and concentrated/decontaminated by the NALC-NaOH method (N-acetyl-L-cysteine-Sodium hydroxide). Smears of the final deposits from the various specimens were stained by the Ziehl-Neelsen (Z-N) method and examined under oil immersion using a binocular light microscope.
Mycobacterium culture. The samples were processed for culturing according to the standard methods described earlier [15,16]. Both sputum and FNA samples were cultured at the Bahir Dar Regional Health Research Laboratory Centre. Briefly, about 100 μl of the sample suspension was inoculated on four sterile Lowenstein Jensen (LJ) medium slopes (two were supplemented with pyruvate and the other two with glycerol) and then incubated at 37˚C with weekly examination for growth. Cultures were considered to be negative when no visible growth was detected after eight weeks of incubation. The presence of AFB in positive cultures was detected by smear microscopic examination using the ZN staining method. AFB positive cultures were prepared as 20% glycerol stocks and stored at -80˚C for reference. Heat-killed cells of each AFB isolate were prepared by mixing~2 loopfuls of cells (� 20μl cell pellet) in 200μl distilled H 2 O followed by incubation at 80˚C for 45 minutes for the release of DNA after the breaking the cell wall. The heat killed cells were transported to the laboratory at Aklilu Lemma Institute of Pathobiology, Addis Ababa University, for molecular typing.
Region of difference (RD) 9-based polymerase chain reaction. To differentiate M. tuberculosis from other members of the M. tuberculosis complex (MTBC) species, RD9 based PCR was performed according to protocols previously described [17].
M. tuberculosis H37Rv and M. bovis bacille Calmette-Guérin (BCG) were included as positive controls and water was used as a negative control. Interpretation of the result was based on bands of different sizes (396 base pairs (bp) for M. tuberculosis and 375bp for M. bovis) as previously described [18].
Spoligotyping. Isolates genetically identified as M. tuberculosis were spoligotyped for further strain identification. Spoligotyping makes use of the variability of the MTBC chromosomal direct repeat (DR) locus for strain differentiation as previously described [19]. A PCRbased amplification of the DR region of the isolate was performed using oligonucleotide primers derived from the DR sequence. The amplified product was hybridized followed by subsequent membrane washing processes. Known strains of M. bovis and M. tuberculosis H37Rv were used as positive controls, whereas Qiagen water (Qiagen Company, Germany) was used as a negative control. Hybridized DNA was detected by the enhanced chemiluminescence method. The presence or absence of spacer was used to interpret the result.
Identification of M. tuberculosis strains and lineages. A web-based spoligotype database was utilized to assign shared international types (SITs) for the isolates [20]. Strains matching a preexisting pattern in the database were identified with SIT number otherwise considered as orphans. An online TB lineage tool was used to predict the lineages/familyof M. tuberculosis using knowledge based Bayesian network (KBBN) [21].

Data analysis
Data were double entered to Excel file format and statistical analysis was performed using SPSS software version 20. Descriptive statistics were used to depict the socio-demographic variables. Chi-square (χ 2 ) was used to determine the possible association of risk factors with the occurrence of TB. Odds ratio (OR) was used to show the association of patient characteristics with the clustering rate of mycobacterial isolates. Results were considered statistically significant whenever p-value was less than 5%.

Ethics approval and consent to participate
The study was ethically reviewed and approved by Addis Ababa University, College of Natural Sciences Research Ethics Review Board. Each study participant gave their consent using a written form (S1 Appendix) and agreed to participate in the study after a clear explanation of the study objectives and patient data confidentiality. In case of participants under the age of 18 years, consent was obtained from their parents/guardians.

Socio-demographic characteristics of the study participants
A total of 3168 TB-suspected individuals, with a response rate of 93.2% (2,953/3168) participated in the study. Of these, 1,733 (58.7%) were males and 1,220 (41.3%) were females. Other socio-demographic characteristics such as patient origin, age, education status, status of prior exposure to TB within the family members and their milk consumption habits are noted and their association with TB prevalence has been determined ( Table 1).

Prevalence of tuberculosis among the study participants
The overall prevalence of smear positive all forms of TB was 6.3% (186/2953). The study showed variability in TB prevalence among the different socio-demographic characteristics such as sex, age and patient origin (districts). Accordingly, males (3.4% (100/2953), age groups from 18 to 30 years (2.2% (66/2953), and the Fogera district (1.8% (56/2953), had the highest prevalence. Moreover, the majority of smear positive cases (72.0%) had a history of TB in their family (Table 2).
EPTB was clinically characterized in 62.4% (116/186) of TB-positive cases in the study area, and 86.0% (160/186) of them were newly diagnosed for TB ( Table 2). All of the EPTB cases were TB lymphadenitis. Smear positive tuberculosis and genetic diversity of M. tuberculosis isolates

Risk factor assessment for TB prevalence
Patients' origin (p<0.001), age groups (p<0.001) and TB history in the family were significantly associated (p<0.001) with TB prevalence. However, sex, educational status and raw milk consumption habits were not significantly associated with the occurrence of all forms of TB (p> 0.05 for all) (Table 3). Similarly, no significant association was observed between any of the different demographic or clinical characteristics and the prevalence of EPTB among the study participants(p>0.05 for all) ( Table 3).

RD9-Deletion typing
A 59.7% (111/186) culture positivity was obtained from the active TB cases. Of these, 59.5% (66/111) were isolated from extrapulmonary TB (EPTB) patients. The molecular typing of culture positive isolates using RD9-based PCR revealed that all isolates had an intact RD9 locus and were subsequently classified as M. tuberculosis.

Spoligotyping patterns of M. tuberculosis isolates
A total of 111 isolates confirmed to be M. tuberculosis using RD9-based PCR were subjected to spoligotyping. Among these, the patterns of 96 isolates were interpreted and grouped into 35 different spoligotype patterns. From the 35 patterns, 22 patterns were shared types and consisted of 79 isolates (Fig 1). The dominantidentified SITs were SIT53 (Lineage 4), SIT149 (Lineage 4), and SIT428 (Lineage 3), each consisting of 18, 14 and 12 isolates, respectively (Fig 1). These three SITs consisted of 45.8% of the total isolates. The remaining 13 were not registered in the database and were 'orphan strains (Fig 2).
The two dominant major lineages identified in this study were Lineage 4 (Euro-American, 62.5%) and Lineage 3 (Central Asia, 26.0%). MTBC strains identified as lineage 7 (Afri), Line-age1 (East African Indian) and Lineage 2 (Beijing) were only identified in seven, three and one patients, respectively. The three dominantly identified families were T, CAS and Manu, each consisting of 46.9%, 24.0% and 10.4% of the isolates, respectively (Figs 1 and 2). Association of patient characteristics with M. tuberculosis genotype clustering. Higher percentages of the total 96 M. tuberculosis isolates were identified from Dera District (30.2%) followed by Fogera District (22.9%). The clustering rates of strains isolated from these two districts were 89.7 and 72.7, respectively. The least number of M. tuberculosis isolates were identified from Este (3.1%) with a clustering rate of 33.3% (Table 4).
Similarly, there was lower odd of strain clustering in samples from Gayint district (Gayint vs. Dera: AOR = 0.16; 95% CI: 0.01-1.09;p = 0.03). Clustering of strains was more than three times higher in sample from illiterate individuals than from those with secondary school education level (Secondary vs. Illiterates: AOR = 0.27; 95% CI: 0.04-1.16;p = 0.05). Clustering level, however, was not significantly associated with the other demographic and clinical characteristics of the study participants (Table 4).
A total of 74 (77.1%) isolates were grouped into 13 different clusters of strains which ranged from 2 to 18 isolates. The remaining 22 (22.1%) isolates had single strains (unique strains). The overall clustering rate was 77.1 (Table 4). A higher number (26) of clustered strains occurred in the strains of the Dera district (Table 4).

Discussion
The 6.3%overall prevalence of all forms of TB recorded in individuals visiting public health institutions in the present study was comparable with the report (6.2%) from northeast Ethiopia [6], whereas relatively lower prevalence of TB was reported among private health clinic Smear positive tuberculosis and genetic diversity of M. tuberculosis isolates attendees (5.4%) [3] and in the prison population in Gondar, northwest Ethiopia (5.3%) [8]. A4.6%lower prevalence was reported in general population living in southwest Ethiopia [22]. Lower TB prevalence was also reported in other African countries such as Zambia (4.3%) and Botswana (4.2%) [23,24]. On the contrary, some studies in Ethiopia have revealed higher prevalence of TB within the range of 7.5% to 17.3% [10,[25][26][27]. Higher TB prevalence was also reported in South Africa (30.2%) [28]. The observed variations in TB prevalence might be due to differencesin the quality of diagnostic techniques used as well as variations in the living condition ofthe study populations.
About 62.4% of smear positive TB cases in the present study were EPTB. This was higher than reports of other two studies conducted in northwest Ethiopia (59.8% and 28.3%) and a  [29][30][31]. Lower prevalence of EPTB was also reported in northern Nigeria (32.6%) and Cameroon (12,7%) [32,33]. The reason for the high prevalence of EPTB in the present study might be due to the fact that many of the study participants were from referral hospitals where highly suspected EPTB cases were diagnosed. In this study, higher TB prevalence was found in Dera and Fogera districts. Consistent with our finding, it was reported that place of residence was found to be a risk factor for TB [6,8]. The relative higher population size of the study sites [11], living conditions in the study area and migration of working staff to these areas from other districts in the zone might have contributed to the recorded variation in TB prevalence among the different sites in the study area.
In agreement with previous studies in other parts of Ethiopia [3,34], age of study participants was observed to be significantly associated with TB prevalence in the present study. Specifically, individuals within the age range of 31 to 43 years were observed to be infected with TB in a higher proportion (25.6%). In South Africa, 88.2% prevalence of TB in the 31 to 35 years of age study participants was reported [7].The reason for the observed high prevalence of the disease in these age groups might be due to their greater chance of contact with the community due to their more interactive day to day life activities. The global high prevalence of HIV/AIDS in these age groups might have also contributed for the observed high TB prevalence [35,36].
In the present study none of the demographic or clinical characteristics of the study participants were observed to be significantly associated with the development of EPTB. Consistently, a study in Gondar reported a non-significant association of age, sex, educational status, residence and occupational status of the respondents with EPTB infection [37]. However, some studies elsewherehave reported that the occurence of EPTB depends on the region, ethnic background, age, underlying disease, immune status of the patient as well as genotype of the M. tuberculosis strain [38][39][40]. Such differences might be due to differences in the use of more advanced diagnostic techniques than smear microscopy that was used in the present study.
The identification of M. tuberculosis as the only Mycobacterium species in the present study, using RD9-based PCR, was in agreement with previous reports in other parts of Ethiopia in which all or the majority of the isolates found from human TB cases were M. tuberculosis [41][42][43][44].In contrast, previous studies conducted in large scale commercial farmsand pastoral communities in lowland areas suggested the contribution of M. bovis to the overall burden of TBin humans [45,46].The reason for the difference in Mycobacteriumspecies prevalence in this study and previous studies might be due to the low TB infection rate in cattle owned by smallholder farmers that participated in the present study.
The relatively low, 36.5% (35/96), genetic diversity of spoligotype strains in this study was consistent with similar studies conducted in Ethiopia [41][42][43]47]. This might be an indication that genetic diversity of spoligotype strains is limited in Ethiopia.
The most prevalent M. tuberculosis strains in this study, SIT53 (lineage 4) and SIT149 (lineage 4), have been frequently reported from other parts of Ethiopia [47][48][49][50], while strain SIT428 (lineage 3) appears to be specific to the study area since it was not reported in previous studies from other parts of Ethiopia and East Africa. However, SIT428 was reported in some parts of Asia such as India [51]and Iran [52]. SIT53 and SIT149 were isolated from both PTB and EPTB patients. However, all the SIT428 strains were isolated from EPTB cases only. Hence, the present finding suggested an epidemiological link of this strain, particularly to TB type, in the study area.
In agreement with previous studies conducted in Ethiopia [44,50,53,54], lineage 4 (Euro-American) was the dominant (62.5%) major lineage identified, while T3-ETH, among the most prevalent T family, constituted a considerable share (14.6%) in this study. This is in agreement with previous studies that reported a high proportion of T3-ETH from mycobacteria isolates from Ethiopia [42,47,55].
In the present study, except partly for the patients' origin,no significant association was observed between socio-demographic factors and mycobacterial strain clustering. Similarly, studies conducted in northwest Ethiopia [56] and elsewhere [57,58] also reported no significant association in this respect. This lack of association may be explained by high force of interaction (proportion of susceptible individuals who have become infected in a specified period), which would result in considerable rates of TB infections [58].In this study, the observed high clustering rates of M. tuberculosis strains in Dera (89.7%), Simada (77.8%) and Fogera (72.7%) were possibly due to the relatively high population density, suggesting an ongoing TB transmission in the study area.
The present study has some limitations in including other patient characteristics such as the HIV status of the study participants and failure in the use of more advanced molecular techniquies than spoligotyping to better support epidemiological links to the occurrence of TB in the study area.

Conclusions
The present study revealed high prevalence of all forms of TB with a higher proportion of EPTB and high clustering rate of the M. tuberculosis isolates. Moreover, patients' origin, age and family history of TB were found to be the main risk factors of TB. Hence, appropriate TB control strategies should be implemented in the study area.
Supporting information S1 Appendix. Informed consent form for the study participants. The consent form was designed for the study participants or witnesses in case of illiterate ones who attend health facilities during the study period in South Gondar Zone, northwest Ethiopia. (PDF) Writing -review & editing: Beyene Petros, Gobena Ameni.