Evaluation of Customised Lineage-Specific Sets of MIRU-VNTR Loci for Genotyping Mycobacterium tuberculosis Complex Isolates in Ghana

Background
Different combinations of variable number of tandem repeat (VNTR) loci have been proposed for genotyping Mycobacterium tuberculosis complex (MTBC). Existing VNTR schemes show different discriminatory capacity among the six human MTBC lineages. Here, we evaluated the discriminatory power of the “customized MIRU-12” loci format, proposed previously by Comas et al. on the basis of the standard 24 loci defined by Supply et al., for VNTR typing of MTBC in Ghana.

Method
One hundred and fifty-eight MTBC isolates classified into Lineage 4 and Lineage 5 were used to compare a customized lineage-specific panel of 12 MIRU-VNTR loci (“customized MIRU-12”) to the standard MIRU-15 genotyping scheme. The resolution power of each typing method was determined based on the Hunter-Gaston Discriminatory Index (HGDI). A minimal set of customized MIRU-VNTR loci for typing Lineage 4 (Euro-American) and Lineage 5 (M. africanum West African 1) strains from Ghana was defined based on the cumulative HGDI.

Results and Conclusion
Among the 106 Lineage 4 strains, customized MIRU-12 identified a total of 104 distinct genotypes, consisting of 2 clusters of 2 isolates each (clustering rate: 1.8%) and 102 unique strains, while standard MIRU-15 yielded a total of 105 different genotypes, including 1 cluster of 2 isolates (clustering rate: 0.9%) and 104 singletons. Among the 52 Lineage 5 isolates, customized MIRU-12 genotyping defined 51 patterns, with 1 cluster of 2 isolates (clustering rate: 0.9%) and 50 unique strains, whereas MIRU-15 classified all 52 strains as unique. The cumulative HGDI of customized MIRU-12 was 0.98 for both Lineage 4 and Lineage 5, whereas that of standard MIRU-15 was 0.99. A union of loci from the customised MIRU-12 and standard MIRU-15 revealed a customized set of eight highly discriminatory loci (4052, 2163B, 40, 4165, 2165, 10, 16 and 26) with a cumulative HGDI of 0.99 for genotyping Lineage 4 and 5 strains from Ghana.


Introduction
Tuberculosis (TB) is a major public health problem worldwide, causing 8.8 million new cases and more than 1.4 million deaths each year [1]. The main strategy for controlling TB, especially in low-resource countries, is case detection and treatment using the directly observed treatment short course (DOTS) strategy [2]. The conventional indicators used for assessing TB control programmes focus on the proportion of patients with new sputum smear-positive pulmonary disease who are cured by the end of treatment or whose sputum microscopy becomes negative after the first 2 months of treatment [3]. Such indicators ignore equally important aspects of TB control, such as the duration of infectivity, the frequency of reactivation, the risk of progression among infected contacts, and the risk of transmission. Thus, the control of TB also depends on understanding the patterns and dynamics of transmission, which is useful for implementing public health measures to reduce sources of infection [4,5].
A number of molecular markers are available for differentiating members of the Mycobacterium tuberculosis complex (MTBC), both for conventional epidemiological investigations of TB outbreaks and for assessing risk factors associated with recent transmission [6,7]. Mycobacterial interspersed repetitive unit-variable number of tandem repeats (MIRU-VNTR) typing has overcome most of the shortcomings of IS6110 RFLP [8][9][10] and has now replaced this older technique as the new gold standard for molecular epidemiological investigation of TB. MIRU-VNTR typing, which exploits genomic diversity at different VNTR loci, can have a cumulative resolution comparable to that of IS6110 RFLP analysis, depending on the combination of loci analysed [11][12][13][14][15][16][17].
Several combinations of MIRU-VNTR loci have been published. Initial methods relied on only a few loci, which turned out to have low discriminatory power among MTBC isolates [18,37,38,39]. Subsequently, a standard MIRU-12 loci set with discriminatory power close to that of IS6110 RFLP was proposed for molecular epidemiological studies in TB [19][20][21]. More recently, this initial MIRU-12 set was replaced by the standard MIRU-15 set, and a standard MIRU-24 loci set [34] has since been proposed for optimal discrimination of closely related strains. The standard MIRU-15 set, which includes six of the previous MIRU-12 loci together with nine additional loci, has been recommended as the standard for routine molecular epidemiology of TB, including outbreak investigations and population-based transmission studies. The MIRU-24 set comprises the same 15 loci plus nine more, which provide additional information on phylogenetic and population genetic aspects of MTBC.
The usage of the standard MIRU-15 and MIRU-24 has helped to gain insight into the transmission dynamics of MTBC. However, the initial selection of these loci was to some extent biased towards strains belonging to Lineage 4 (Euro-American lineage) [34]. The inability of the proposed loci to adequately discriminate Lineage 2 strains, which include the clinically relevant Beijing family, led to new customized sets for this lineage [18]. However, the human-associated MTBC includes 6 additional lineages [22,23,45,46,47], which show a strong phylogeographic structure [24][25][26]. As observed for Lineage 2 strains, this suggests that the standard highly discriminatory MIRU-VNTR loci may be sub-optimal in areas such as Ghana, where about 20% of all TB cases are caused by Lineages 5 and 6 of MTBC (also known as M. africanum West Africa 1 and 2) [27][28].
Comas et al. [30], using 108 global MTBC strains, showed that the majority of the loci included in standard MIRU-24 had variable discriminatory power across the different MTBC lineages. Moreover, the MIRU-VNTR loci that exhibited the highest discrimination index within one lineage were not necessarily the ones with the highest discriminatory power in other lineages. Based on the allelic diversity of individual MIRU-VNTR loci, Comas et al. [30] suggested different combinations of MIRU-VNTR loci that offered high resolution for the different MTBC lineages. These combinations offered two main advantages over the existing scheme: they maximized allelic diversity for a given MTBC lineage and allowed for cost-effective analyses [30].
Here we evaluated this concept in the Ghanaian setting and compared the standard MIRU-15 to two lineage-specific 12-loci sets (here referred to as “customized MIRU-12”), one for Lineage 4 and one for Lineage 5, which are the most frequent MTBC lineages in Ghana [27][28][49].

Ethics Statement
Ethical clearance for this study was obtained from the IRB of the Noguchi Memorial Institute for Medical Research, which has the Federalwide Assurance number FWA00001824. The procedure for sampling in this study was essentially the same as that outlined by the National Tuberculosis Programme for the routine management of TB in Ghana. Informed consent, written in the case of literate participants and oral for those who could not read or write, was sought from all participants before their inclusion in the study. For children below sixteen years, consent was sought from their parents or guardians. As per the guidelines of the institutional review board of the Noguchi Memorial Institute for Medical Research, the objectives and benefits of the study were explained to all participants, and they were assured of the confidentiality of all information collected during the study.

Isolate Selection and Lineage Classification
A total of 178 MTBC isolates consecutively selected from a pool of retrospective samples were included in the study. Specimens included in this study were collected consecutively over a period of    Greater Accra and Western regions of Ghana respectively before commencement of anti-TB drug. DNA was extracted as described previously [33]. MTBC was confirmed by IS6110 PCR [40]. The isolates were then classified into lineages by analysis of various regions of difference (RDs) as previously described [31]. Specifically, all isolates were first screened for RD9. RD9-deleted strains were then screened for RD4. Isolates identified as RD9 deleted and RD4 undeleted were further sub-typed for Lineages 5 and 6 (M. africanum West Africa I and II) using RD711 and RD702 flanking primers, respectively. Lineages were confirmed by TaqMan real-time PCR, performed according to standard procedures using probes designed by Stucki et al. [35]. Although Lineage 6 strains (M. africanum West Africa II) are present in Ghana [27,28], they were removed from further analysis because of the small number identified (6 isolates).
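The RD screening cascade described above can be sketched as a small decision function. This is only an illustration: the function name and return labels are hypothetical, and the mapping of an RD711 deletion to Lineage 5 and an RD702 deletion to Lineage 6 is an assumption drawn from the order in which the text pairs the primers with the lineages.

```python
def classify_by_rd(rd9_deleted, rd4_deleted=None,
                   rd711_deleted=None, rd702_deleted=None):
    """Walk the RD screening cascade for one isolate.

    Arguments are True (region deleted), False (region intact) or
    None (not tested). Return labels are illustrative only.
    """
    if not rd9_deleted:
        # RD9 intact: the cascade in the text does not go further.
        return "RD9 intact"
    if rd4_deleted is None:
        return "RD4 result needed"
    if rd4_deleted:
        # RD9 and RD4 both deleted: not sub-typed for Lineage 5 or 6.
        return "RD9 and RD4 deleted"
    # RD9 deleted and RD4 intact: sub-type with the flanking primers.
    # Assumption: RD711 deletion -> Lineage 5, RD702 deletion -> Lineage 6.
    if rd711_deleted and not rd702_deleted:
        return "Lineage 5"
    if rd702_deleted and not rd711_deleted:
        return "Lineage 6"
    return "unresolved"


example = classify_by_rd(True, rd4_deleted=False,
                         rd711_deleted=True, rd702_deleted=False)
print(example)
```

In the study itself the RD result was additionally confirmed by SNP typing, so a function like this would only provide the first-pass assignment.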

MIRU-VNTR Analysis
Two sets of PCRs were performed for each isolate. The first set was performed using the 12 lineage-specific MIRU-VNTR loci proposed by Comas et al. [30], while the second set consisted of the standard MIRU-15 as described by Supply et al. [34] (Table 1). Each PCR mixture contained 10× PCR buffer, 1.5 mM MgCl2, 200 μM deoxyribonucleotide triphosphates, 5 μM of each primer, 1 μl of HotStarTaq DNA polymerase, 5 μl of Q solution and 10 ng of DNA template in a total volume of 25 μl. Negative (sterile water) and positive (H37Rv) controls were included in each PCR run to validate the assay. Locus amplification was carried out under the following conditions: initial denaturation at 95°C for 15 minutes, then 40 cycles of 95°C for 1 minute, 59°C for 1 minute and 72°C for 3 minutes, followed by a final extension at 72°C for 7 minutes. Gel electrophoresis was done in 2% agarose for 5 hours at a constant 80 V. The amplicons were sized using a 100 bp marker, and the sizes obtained were compared with the allelic table published by Supply et al. [34].

SNP Typing
TaqMan real-time PCR was performed as published by Stucki et al. [35]. Briefly, in a 200 μl sterile PCR tube, 2 μl of DNA was added to 5 μl of sterile water containing 0.21 μM each of the reverse and forward primers for the targeted regions, 0.83 μM each of probe A for the ancestral allele and probe B for the mutant allele (each labelled with a different dye), and 5 μl TaqMan Universal MasterMix II

Data Analysis
The number of repeats for each locus was determined based on the allelic table by Supply et al. [34], and clustering analysis was done using the online tool at http://www.MIRU-VNTRplus.org. MIRU-VNTR clusters were defined as isolates sharing identical patterns. The clustering rate was defined as (nc - c)/n, where nc is the total number of clustered cases, c is the number of clusters, and n is the total number of cases in the sample [29].
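As an illustration of the clustering rate defined above, the following minimal sketch counts clusters in a list of genotype patterns and applies (nc - c)/n. The function name and the placeholder pattern labels are hypothetical; only the cluster counts (2 clusters of 2 isolates and 102 unique strains among 106 Lineage 4 isolates) are taken from the reported customized MIRU-12 result.

```python
from collections import Counter


def clustering_rate(patterns):
    """Return (nc - c) / n for a list of genotype patterns, one per isolate."""
    n = len(patterns)
    if n == 0:
        raise ValueError("at least one isolate is required")
    counts = Counter(patterns)
    # A cluster is any pattern shared by two or more isolates.
    cluster_sizes = [size for size in counts.values() if size >= 2]
    nc = sum(cluster_sizes)  # isolates that fall into a cluster
    c = len(cluster_sizes)   # number of clusters
    return (nc - c) / n


# Placeholder labels standing in for full MIRU-VNTR patterns:
# two clusters of two isolates plus 102 unique patterns (n = 106).
patterns = ["A", "A", "B", "B"] + [f"U{i}" for i in range(102)]
rate = clustering_rate(patterns)
print(f"{rate:.4f}")
```

With these counts the function evaluates (4 - 2)/106, that is, 2/106.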
The Hunter-Gaston Discriminatory Index (HGDI) was used to calculate the discriminatory power of each locus as well as that of each method [36].
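For reference, the HGDI is commonly written as D = 1 - [1/(N(N - 1))] Σ n_j(n_j - 1), where N is the number of isolates and n_j is the number of isolates assigned to the j-th type. The sketch below is one way to compute it for a single locus or for a whole typing method; the function name and the toy allele values are illustrative assumptions, not data from this study.

```python
from collections import Counter


def hgdi(types):
    """Hunter-Gaston Discriminatory Index for a list of types, one per isolate.

    A type may be the allele at a single locus or a tuple of alleles
    across several loci (a full MIRU-VNTR pattern).
    """
    n = len(types)
    if n < 2:
        raise ValueError("HGDI needs at least two isolates")
    counts = Counter(types)
    same_type_pairs = sum(k * (k - 1) for k in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Toy example: repeat numbers at one hypothetical locus for four isolates.
toy_alleles = [2, 2, 3, 4]
toy_hgdi = hgdi(toy_alleles)
print(round(toy_hgdi, 4))
```

An index of 1 means every isolate has a distinct type, and an index of 0 means all isolates share one type.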

Determination of a Minimal Set of MIRU-VNTR Loci
Stepwise analysis was performed to identify the set of loci needed to achieve maximum discrimination. Firstly, we combined loci from the customised sets and standard MIRU-15 for each lineage under investigation. The customised Lineage 4 set comprised 12 loci, 11 of which were shared with standard MIRU-15; addition of the remaining 4 non-shared loci from standard MIRU-15 gave a total of 16 loci for analysis. For Lineage 5, addition of 6 non-shared loci to the 9 shared loci gave a total of 17 loci. Subsequently, we calculated the HGDI of each individual locus and arranged the results in descending order. Starting with the locus with the highest HGDI, the cumulative HGDI was then calculated by successively adding one locus after the other. Finally, the clustering rate was calculated in the same manner, by successively adding one locus after the other. The results (cumulative HGDI and percentage clustering) obtained for each lineage were plotted on a graph, and the cut-off point for selection of the minimal set of loci was set where the graph plateaued, that is, where further addition of loci resulted in the same cumulative HGDI. The customized minimal loci set was then read from the graph.
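The stepwise procedure can be sketched as follows, under stated assumptions: loci are ranked by individual HGDI, added one at a time, and the minimal set is taken as the shortest prefix whose cumulative HGDI equals the final plateau value. The function names, the numerical tolerance and the toy profiles (loci "locA" to "locC") are hypothetical and only illustrate the logic; in the study the cut-off was read from the plotted graph.

```python
from collections import Counter


def hgdi(types):
    """Hunter-Gaston Discriminatory Index for a list of types."""
    n = len(types)
    counts = Counter(types)
    return 1.0 - sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def stepwise_cumulative_hgdi(profiles):
    """profiles: list of dicts mapping locus name -> repeat number.

    Returns a list of (locus_added, cumulative_hgdi) in order of addition.
    """
    loci = sorted(profiles[0])
    single = {loc: hgdi([p[loc] for p in profiles]) for loc in loci}
    # Rank loci from most to least discriminatory.
    ranked = sorted(loci, key=lambda loc: single[loc], reverse=True)
    steps = []
    for k in range(1, len(ranked) + 1):
        chosen = ranked[:k]
        combined = [tuple(p[loc] for loc in chosen) for p in profiles]
        steps.append((ranked[k - 1], hgdi(combined)))
    return steps


def minimal_set(steps, tol=1e-9):
    """Shortest prefix of loci that already reaches the plateau value."""
    plateau = steps[-1][1]
    for k, (_, value) in enumerate(steps, start=1):
        if abs(value - plateau) <= tol:
            return [locus for locus, _ in steps[:k]]
    return [locus for locus, _ in steps]


# Toy profiles for four isolates at three hypothetical loci.
profiles = [
    {"locA": 1, "locB": 1, "locC": 1},
    {"locA": 2, "locB": 1, "locC": 1},
    {"locA": 3, "locB": 1, "locC": 1},
    {"locA": 3, "locB": 2, "locC": 1},
]
steps = stepwise_cumulative_hgdi(profiles)
selected = minimal_set(steps)
print(steps, selected)
```

In this toy data set the third locus is invariant, so the cumulative HGDI already reaches its plateau after two loci and the third is dropped from the minimal set.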

MTBC Isolates and Lineage Determination
All 178 isolates included in this study were classified into Lineage 4 (N = 126) or Lineage 5 (N = 52) based on the RD and SNP typing analysis [31,35]. Discordant samples were excluded from the study. A full set of MIRU allelic data was obtained for 158/178 (88.8%) isolates, comprising 106 Lineage 4 and 52 Lineage 5 isolates. The remaining 20 of the 178 (11.2%) isolates were excluded from the analysis: 18 of these 20 (90%) had no PCR amplicon at one or several loci, whilst the remaining 2 (10%) had double alleles at one or more MIRU-VNTR loci, indicative of the possible presence of two independent strains [32].

Figure 2. Clustering rate for Lineages 4 and 5, calculated after successive addition of individual loci using the formula (nc - c)/n, where nc is the total number of clustered cases, c is the number of clusters, and n is the total number of cases in the sample. Fig. 2a and 2b show the clustering rate values for Lineages 4 and 5, respectively. doi:10.1371/journal.pone.0092675.g002

Determination of a Minimal Set of MIRU-VNTR Loci for Genotyping Main MTBC Lineages from Ghana
Customized MIRU-12 for Lineage 4 shared 11 loci with standard MIRU-15, whilst 9 loci were shared between customized MIRU-12 for Lineage 5 and standard MIRU-15. A union of both typing schemes gave a total of 16 and 17 loci for Lineages 4 and 5, respectively (Table S1). For Lineage 4, we identified the six most discriminatory loci (4052, 2163B, 40, 2165, 10 and 4165), with a cumulative HGDI of 0.99 (Table 2). Similarly, for Lineage 5, six loci (2163B, 4165, 40, 26, 4052 and 16) with a cumulative HGDI of 0.99 were identified (Table 3). Further addition of loci gave no significant change in cumulative HGDI values, as shown in Figure 1. Note that 4 loci (4052, 2163B, 4165 and 40) were among the 6 most discriminatory in both lineage-specific sets. On this basis, we propose a new customised typing system comprising the 8 loci that show the highest discriminatory power for genotyping strains from the two most common lineages circulating in Ghana.
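The step from two lineage-specific lists of six loci to one combined set of eight can be checked with a few lines of code. The locus lists below are copied from the Lineage 4 and Lineage 5 results above; the variable names are illustrative.

```python
# Six most discriminatory loci per lineage, as listed in the text.
lineage4_top6 = ["4052", "2163B", "40", "2165", "10", "4165"]
lineage5_top6 = ["2163B", "4165", "40", "26", "4052", "16"]

# Loci that appear in both lists, in Lineage 4 order.
shared = [locus for locus in lineage4_top6 if locus in lineage5_top6]

# Union of the two lists, keeping first-seen order.
combined = lineage4_top6 + [
    locus for locus in lineage5_top6 if locus not in lineage4_top6
]

print(shared)
print(combined)
```

Four loci are shared, so the union has 6 + 6 - 4 = 8 members, which corresponds to the proposed customized MIRU-8 set.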

Discussion and Conclusion
Different combinations of MIRU and other VNTR loci have been proposed to complement the standard MIRU-15 scheme to achieve higher discrimination. Results accumulated from such studies clearly show that, because of the strong phylogeographic structure exhibited by MTBC, the most relevant MIRU-VNTR typing schemes will likely differ depending on the specific geographical setting. For example, Shamputa et al. [18] successfully identified a reduced set of 8 loci from standard MIRU-24 that could be used to discriminate isolates from the Republic of Korea. Similarly, Musare et al. [37], Dong et al. [38] and Zhou et al. [39] successfully defined a minimal set of 12 loci for genotyping Beijing strains, which made up more than 90% of the isolates investigated from Asia. Most of these studies have focused on Lineage 2, including the clinically important Beijing family, because of its association with drug resistance [48]. However, no such study has been carried out in most resource-limited settings like Ghana, where M. africanum is an important pathogen [27][28][49]. If customized lineage-specific sets of MIRU-VNTR loci could be implemented in such settings, this would reduce workload and save resources. In the present study, we evaluated such an approach for genotyping MTBC strains from Ghana [27,28] and compared our results with the current gold standard typing method, standard MIRU-15 as proposed by Supply et al. [34].
Although standard MIRU-15 identified clusters among these two lineages with higher discrimination than the customised lineage-specific MIRU-12 proposed previously [30], we found that not all 15 loci were equally informative for typing MTBC strains in Ghana. Even with the customized MIRU-12, based on our data, not all 12 loci were needed to achieve maximum discrimination (Figure 1). Specifically, our analysis showed that 10 of a total of 16 loci tested for Lineage 4 strains added no or only limited additional information in terms of discriminatory power. Similarly, 11 of a total of 17 loci screened for Lineage 5 strains showed limited discriminatory power. We thus explored the possibility of a minimal set of loci, selected on the basis of HGDI, by combining the standard MIRU-15 and the customized MIRU-12 data sets. Based on individual and cumulative HGDIs, and on the clustering rate, we defined the six top discriminatory loci for Lineage 4 (4052, 2163B, 40, 2165, 10 and 4165) (Figure 1a
We now plan to apply this minimal MIRU-VNTR set to the molecular epidemiological investigation of MTBC transmission in a population-based study in Ghana. We anticipate that this approach will save a significant amount of time. In addition, we performed a cost analysis of the different VNTR schemes analysed in this study. Cost was calculated based on the direct cost of reagents, materials and equipment. We compared the cost of genotyping using standard MIRU-15 and our proposed customized MIRU-8 set. With a unit cost of $11.24 per locus, the cost of performing standard MIRU-15 on one sample was $168.60, whereas the total material cost of analysis using our proposed customized MIRU-8 set was $89.92 per sample. Hence, by screening for only the relevant loci, we not only maximize discriminatory power but also minimize genotyping costs.
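The cost comparison is a simple multiplication, sketched below under the assumption that the unit cost of $11.24 applies per locus per sample (this reproduces the $168.60 quoted for the 15-locus scheme). The function and constant names are illustrative.

```python
from decimal import Decimal

# Assumption: the quoted unit cost applies to one locus for one sample.
UNIT_COST_PER_LOCUS = Decimal("11.24")


def cost_per_sample(n_loci):
    """Direct cost in US dollars of typing one sample at n_loci loci."""
    return UNIT_COST_PER_LOCUS * n_loci


miru15_cost = cost_per_sample(15)
miru8_cost = cost_per_sample(8)
saving = miru15_cost - miru8_cost

print(miru15_cost, miru8_cost, saving)
```

Decimal arithmetic is used so that the currency amounts are not affected by binary floating-point rounding.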
Currently, human-associated MTBC is known to comprise a total of seven main phylogenetic lineages [23,46,47]. We propose that additional lineage-specific sets of MIRU-VNTR loci could be identified for molecular epidemiological investigation of TB transmission in resource-limited settings. Moreover, each MTBC lineage consists of a number of sub-lineages, some of which also show strong geographical associations [22,24,45]. For example, the “Uganda” sub-lineage of Lineage 4 causes up to 60% of TB in Kampala, Uganda [41], suggesting that a similar customized Lineage 4 set could be developed for Uganda, which would possibly include other loci, considering that most of Lineage 4 in Ghana consists of the “Cameroon” sub-lineage [42][43][44].
This study set out to define a set of loci for genotyping MTBC strains from Ghana. We acknowledge the high prevalence of M. africanum strains in Ghana; however, this prevalence is driven by Lineage 5 (M. africanum West Africa I), with only a limited number of Lineage 6 (M. africanum West Africa II) strains. This makes our proposed customized MIRU-8 country-specific, and we therefore suggest that West African countries where the high prevalence of M. africanum is driven by Lineage 6 (M. africanum West Africa II) could likewise determine the minimal set of loci that gives the highest discrimination. Nevertheless, the strength of our study is that an unknown strain in Ghana can be genotyped with the proposed customized MIRU-8 loci in the most cost-efficient way.
In conclusion, this study identified a reduced set of MIRU-VNTR loci that can be applied for strain differentiation of the main MTBC lineages from Ghana.