
Mycobacterium tuberculosis complex Lineage 1: A neglected cause of tuberculosis

Abstract

The Mycobacterium tuberculosis complex (MTBC) phylogenetic lineages 1–4 (L1–L4) are the main causes of human tuberculosis (TB). Until now, most of the focus in the TB field has been on MTBC L2 and L4, as these two lineages are geographically widespread and have been repeatedly associated with multidrug resistance. By comparison, MTBC L1 has received little attention, partly because of its restricted geographical range, which mainly includes low- to middle-income countries in South and Southeast Asia and East Africa. However, recent estimates indicate that MTBC L1 is in fact the most common cause of human TB in terms of absolute numbers of TB patients, particularly in several high TB burden countries. As more L1 strains are sampled in L1-endemic countries, the high genetic diversity of this geographically restricted MTBC lineage is slowly being uncovered. This discovery has also affected L1 nomenclature, which has been modified as new distinct L1 clades were identified. In parallel with the genomic discoveries ushered in by progress in whole genome sequencing, clinical researchers have studied several phenotypes that better describe L1 TB disease. L1 strains have been shown to have increased vulnerability to oxidative stress, which was associated with decreased virulence in animal and in vitro models. L1 infection also shows a possible association with extrapulmonary TB and asymptomatic TB. However, despite belonging to the same lineage, L1 strains display phenotypic diversity that can be attributed to high within-lineage genetic diversity and possibly to the interaction of different L1 genotypes with different human host genotypes. Among the clinical phenotypes that show heterogeneity are bacterial factors, immune profiles, and clinical virulence. The traditional view that L1 has reduced transmissibility is now being challenged by new data indicating that L1 may be as transmissible as L2 or L4. Lastly, although L1 has historically been described as negatively associated with drug resistance, there are indications that the contribution of L1 to TB drug resistance is significant and that it may evolve drug resistance in ways distinct from those of other MTBC lineages.

Introduction

With the end of the COVID-19 pandemic, tuberculosis (TB) is again the number one cause of human mortality due to a single infectious agent. An estimated 10.8 million new TB cases and 1.25 million deaths occurred in 2023 [1]. The outcome of TB infection and disease varies greatly, ranging from asymptomatic infection to pulmonary and extrapulmonary disease. Many patient and environmental factors are known to influence these variable outcomes. Increasingly, bacterial diversity is also being recognized as a contributing factor. TB in humans and other mammals is mainly caused by members of the Mycobacterium tuberculosis complex (MTBC). The MTBC comprises ten phylogenetic lineages adapted to humans and several clades adapted to different wild and domestic animals [2,3]. In human TB, most of the global burden is caused by L1, L2, L3, and L4, with L5–L10 together contributing only a minor proportion [4]. To date, most of the research on the biological and epidemiological consequences of MTBC diversity has focused on L2 and L4, largely because these two lineages are geographically widespread and have been associated with antibiotic resistance. By comparison, L1 and L3 have been neglected [5]. Several genomic epidemiology studies aiming to understand the evolution of drug resistance in TB use datasets of MTBC genomes that under-represent South and Southeast Asia, where most L1-endemic countries are located [6,7]. Undersampling often occurs in L1-endemic countries because of the logistical and financial difficulties of population-based sampling with limited resources. However, recent estimates indicate that L1 is the most common cause of human TB in terms of absolute numbers of patients affected. This is particularly true in several high TB burden countries of South and Southeast Asia [4]. Here, we review and discuss the available literature on L1 and highlight some of the key characteristics that make L1 stand out from the other MTBC lineages.

Methods

The global population structure of MTBC L1

To determine the global population structure of MTBC L1, we used a global dataset of MTBC L1 strains to create a whole-genome-based phylogenetic tree and map the distribution of L1 sublineages around the world. This dataset comprised L1 genomes stored in our in-house database and included all genomes from countries where L1 is endemic [5,8,9]. In addition, the dataset contained L1 genomes from TB low-burden countries such as those in North America and Europe. An in-house pipeline was used to extract nucleotide variants by mapping to a reference genome as previously reported [10]. Genomes were deemed of good quality and included in the dataset if they met the following criteria: a coverage of at least 15× and no evidence of mixed-lineage infection. Alignments of variable positions were used to create a phylogenetic tree with RAxML v. 8.2.11 (options -m GTRCAT -V), using Mycobacterium canettii as an outgroup.
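The two inclusion criteria stated above (coverage of at least 15× and no evidence of mixed-lineage infection) can be expressed as a simple filter. The following is a minimal sketch, not the in-house pipeline itself: the `GenomeQC` record, its field names, and the sample identifiers are hypothetical, and how coverage and mixed infection are actually measured is not specified here.

```python
from dataclasses import dataclass
from typing import List

# Threshold taken from the text; everything else in this sketch is assumed.
MIN_COVERAGE = 15.0


@dataclass
class GenomeQC:
    """Hypothetical per-genome quality summary."""
    sample_id: str        # illustrative identifier
    mean_coverage: float  # mean sequencing depth over the reference
    mixed_lineage: bool   # True if there is evidence of mixed-lineage infection


def passes_qc(genome: GenomeQC) -> bool:
    """Return True if the genome meets both inclusion criteria."""
    return genome.mean_coverage >= MIN_COVERAGE and not genome.mixed_lineage


def filter_genomes(genomes: List[GenomeQC]) -> List[GenomeQC]:
    """Keep only the genomes that pass the quality criteria."""
    return [g for g in genomes if passes_qc(g)]
```

A genome at exactly 15× is kept under this reading of "at least 15×".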

Literature search

PubMed and Ovid MEDLINE (ALL) searches were performed for this review. Different keywords were used depending on the subtopic discussed. To search broadly for the global burden, population structure, and molecular epidemiology of MTBC L1, the following keywords were used: “tuberculosis” AND (“Lineage 1” OR “Indo-Oceanic” OR “EAI”) AND (“molecular epidemiology” OR “spoligotyping” OR “RFLP”). For the nomenclature of L1, the following keywords were used: “MTBC” AND (“sublineage” OR “barcode” OR “barcodes” OR “SNP”). For the different L1 phenotypes, the following keywords were used: “Tuberculosis” AND (“Lineage 1” OR “Indo-Oceanic” OR “EAI”) AND (“virulence” OR “immune profile” OR “transmission” OR “drug resistance” OR “extrapulmonary”). All database searches were supplemented by a manual search of the reference lists of each relevant publication.

The global burden of MTBC L1

Compared to the globally widespread L2 and L4, L1 is geographically restricted. The geographical range of L1 is historically described as being around the rim of the Indian Ocean (Fig 1), namely in the regions of South Asia, Southeast Asia (SEA), and East Africa [5,8,9]. In a recently published review, the proportion estimates for the different MTBC lineages around the world were combined with the 2022 World Health Organization (WHO) global estimates for incident TB, which led to the estimate that L1 caused 2.8 million human TB cases in 2021, more than any other MTBC lineage. In contrast, L4 and L2 caused an estimated 2.5 and 1.3 million TB cases, respectively [4]. For this review, we defined L1-endemic countries as those found to have L1 genotypes consistently circulating and transmitting among individuals born in the respective countries. Using this definition, Brazil and some countries in West Africa and Oceania were considered to have L1-endemic strains, even though the total number of TB cases attributable to L1 is low relative to the other MTBC lineages circulating in these countries [5]. Some low-burden countries in Europe, North America, and Australia, although not considered L1-endemic, have high proportions of L1 cases, which most likely reflect recent migrations from high-burden countries where L1 is endemic [11]. Importantly, many of the high-burden TB countries recognized by WHO are L1-endemic countries. These include India, the Philippines, Indonesia, Bangladesh, and Myanmar. Moreover, out of the ten countries at the intersection of the WHO lists of high-TB, TB-HIV, and MDR/RR-TB burden [1], six are L1-endemic countries (Fig 2).
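The burden estimate described above combines a lineage proportion with an incidence estimate. As a rough sketch of that arithmetic (assuming a simple per-country product that is then summed, which may not match the exact method of the cited review), the function names and the dictionary keys below are illustrative only:

```python
from typing import Dict, Iterable


def lineage_cases(incident_cases: float, lineage_proportion: float) -> float:
    """Estimated cases due to one lineage = incident TB cases x lineage proportion."""
    if not 0.0 <= lineage_proportion <= 1.0:
        raise ValueError("lineage_proportion must be between 0 and 1")
    if incident_cases < 0:
        raise ValueError("incident_cases must be non-negative")
    return incident_cases * lineage_proportion


def total_lineage_burden(countries: Iterable[Dict[str, float]]) -> float:
    """Sum the per-country estimates.

    Each entry is expected to carry the hypothetical keys
    'incident_cases' and 'l1_proportion'.
    """
    return sum(
        lineage_cases(c["incident_cases"], c["l1_proportion"]) for c in countries
    )
```

For example, a country with 1,000 incident cases and an L1 proportion of 0.5 would contribute 500 estimated L1 cases; these numbers are placeholders, not values from the cited sources.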

Fig 1. Estimated global burden of MTBC L1 and the proportion of different MTBC lineages worldwide.

The pink heat map indicates the estimated absolute number of L1 TB cases in different countries, while the pie charts indicate the proportion of different MTBC lineages circulating in specific United Nations geoscheme geographical subregions. Data were previously published in a recent review [4]. The map used was created by Frank Bennett, Public Domain, via Wikimedia Commons (https://commons.wikimedia.org/wiki/File:BlankMap-World-Flattened.svg).

https://doi.org/10.1371/journal.pntd.0013513.g001

Fig 2. Estimated absolute number of cases of MTBC lineages circulating among L1-endemic high-burden TB, MDR/RR-TB, and TB-HIV countries.

The absolute numbers on the y-axis are scaled by 10,000. The six countries indicated in the figure are L1-endemic countries that are also included in the WHO lists of high-burden TB, MDR/RR-TB, and HIV/TB countries. The largest numbers of TB cases caused by L1 strains are in India and the Philippines, with approximately 1.4 million and 700,000 cases, respectively. Data used were obtained from the 2024 WHO Global TB Report [1] and a recently published review [4].

https://doi.org/10.1371/journal.pntd.0013513.g002

The global population structure of L1

Using the SNP-based nomenclature originally published by Coll and colleagues [12], five L1 sublineages can be defined: L1.1.1, L1.1.2, L1.1.3, L1.2.1, and L1.2.2. In this review, we mainly discuss the L1 sublineages at this level of hierarchical subdivision, in line with the Coll nomenclature, which is the nomenclature most often used [12,13]. These sublineages may correspond to one or more of the following spoligotype groups: EAI1-SOM, EAI2-Manila, EAI2-Nonthaburi, EAI3-IND, EAI4-VNM, EAI5, EAI6-BGD1, EAI7-BGD2, EAI8-MDG, and Zero-copy [14]. Although widely used, the nomenclature by Coll and colleagues [12] has been revised and expanded, as we discuss in more detail below.

The phylogenetic tree shown in Fig 3 is a representation of all major L1 sublineages from L1-endemic countries, based on previously reported data [8,11]. From this phylogenetic tree, one can appreciate that the global population of L1 is strongly phylogeographically structured, with some L1 sublineages being associated with particular geographical regions, as previously reported [5,15,16]. Specifically, most L1.1.1 sublineage strains occur in the mainland of Southeast Asia (SEA), which includes Vietnam, Thailand, Cambodia, and Laos. By contrast, in Island SEA, which includes the Philippines, Indonesia, Malaysia, Papua New Guinea, and East Timor, a high proportion of L1.2.1 can be observed. L1.1.2 on the other hand predominates in South Asia but is also seen in East Africa. L1.1.3 is prevalent in Bangladesh and Myanmar, as well as in East Africa, especially Malawi and Mozambique. Lastly, L1.2.2 is prevalent in South Asia, East Africa, and South Africa (Figs 3 and 4).

Fig 3. The global population structure of MTBC L1.

The phylogenetic tree consists of 4,171 MTBC L1 genomes originating from 63 countries. The tree indicates the sublineage classification according to the L1 SNP barcodes published by Coll and colleagues [12]. Within L1.1.1, there is a clade designated as L1.1.1.1 (labeled with a blue curved line). The genomes highlighted in light green (L1.1) and dark green (L1) are examples of genomes that could not be classified by the Coll nomenclature, as previously published [5,15]. However, based on their position within the tree, they belong to L1.1.2 and L1.2.2, respectively. Each strain’s geographical region of origin is based on the country of birth of the patient or, if the country of birth is unknown, the country of isolation. The deeply branching genomes within the L1.2.1 clade, emphasized here with circular tip points, comprise strains from East Africa, Southeast Asia, and Oceania.

https://doi.org/10.1371/journal.pntd.0013513.g003

Fig 4. Global distribution of L1 sublineages.

The MTBC genomes included here are the same as the ones represented in Fig 3. The genomes are classified according to the country of birth of the patient from whom the isolate was obtained, or the country of isolation when the country of birth is unknown. The size of the pie chart for each country is based on the number of genomes included in the dataset, with the smallest pie chart corresponding to fewer than five genomes and the largest to more than 20 genomes. The world map was generated with the R package “rnaturalearth” (https://CRAN.R-project.org/package=rnaturalearth).

https://doi.org/10.1371/journal.pntd.0013513.g004

Outside of South Asia and SEA, the L1 strains identified in the West African countries of the Gambia, Ghana, Sierra Leone, and Mali belong to L1.1.1, which is the sublineage that predominates in Mainland SEA (Figs 3 and 4), suggesting that L1.1.1 was introduced to West Africa from SEA. However, the L1.1.1 sublineage is generally uncommon on the African continent, and the absolute number of TB cases due to L1.1.1 in West Africa is low compared to those caused by other MTBC lineages. In comparison, L1.1.2 is the most common L1 sublineage in East Africa; however, some strains from the East African countries of Ethiopia, Djibouti, Tanzania, and Somalia form a monophyletic clade within L1.2.1, which is genetically distinct from the L1.2.1 strains occurring in countries of Island SEA. This clade is deeply branching within L1.2.1 (Fig 3) and is a sister clade of another deeply branching clade that is composed of strains from Papua New Guinea and East Timor [5]. This deeply branching clade separated early in the evolutionary history of L1 and, even though currently under-sampled, might provide information on the origin and the evolutionary forces that gave rise to the different clades of L1 we observe today. Finally, in the case of Brazil, genotyping methods based on spoligotyping and MIRU-VNTR have identified some L1 strains similar to the ones circulating in East Africa, South Africa, and South Asia (L1.1.2, L1.1.3, and L1.2.2), indicating that L1 in Brazil might have been introduced from Africa and/or South Asia [17,18], as further discussed below.

The geographical origin of MTBC L1

To date, four studies have used different phylogeographical approaches to infer the most likely geographical origin of MTBC L1 [5,9,16,19]. Three of these studies used whole genome sequencing (WGS) data and inferred a South Asian origin for the L1 Most Recent Common Ancestor (MRCA, i.e., the population from which the entire existing L1 population descended) [5,16,19]. One study used spoligotyping data and inferred a Southeast Asian origin for L1 [9]. Given the known limitations of spoligotyping compared to WGS [14,20], most of the current evidence thus supports a South Asian origin for L1. Both Mainland SEA and South Asia are regions where all five main L1 sublineages defined by Coll and colleagues [12] co-occur. However, the absolute number of L1 cases in South Asia is far greater than in Mainland SEA. Several researchers have studied the spread of L1 out of South Asia. One example is the introduction of L1.1.2 into Tanzania from South Asia about 300 years ago, which can likely be linked to Indian Ocean trade and migration [21]. Several minor dispersals of L1 from South Asia to South America and West Africa have also occurred, as reported from Brazil and several West African countries, possibly associated with the transatlantic slave trade [5,18]. Although the evidence argues against SEA as the geographical origin of the L1 MRCA, SEA likely played a role in the diversification and geographical spread of L1. This is particularly true for the MRCAs of L1.1.1 and L1.2.1, which are thought to have originated in Mainland SEA and Island SEA, respectively, and subsequently spread within their respective regions and even to countries outside these regions, such as in West and East Africa [5].

Knowing the geographical origin of the MRCA matters in the context of MTBC infection because of the observation, first published more than 20 years ago, that different MTBC strains are associated with certain human host populations [22]. This association persists even among individuals who migrate out of their birth country, regardless of whether they are part of an epidemiologic cluster [23]. Furthermore, an allopatric host–pathogen relationship, wherein the human host and the infecting MTBC pathogen originate from different geographic areas, shows lower infectivity, specifically among geographically restricted lineages such as L1 [24]. Accordingly, we can say, for example, that an L1.2.1 strain is associated with individuals from Island SEA (e.g., the Philippines) because the L1.2.1 MRCA originated in Island SEA. Host–pathogen co-evolution and co-adaptation are among the mechanisms being explored to explain this observation [5,15,25].

The evolving nomenclature of MTBC L1

The nomenclature of L1 has changed over time, reflecting changes in the genotyping methods used to classify MTBC strains. Older genotyping methods compared the presence or absence of genetic markers such as insertion sequences, direct repeats, and tandem repeats [26]. The current widely used nomenclature scheme, and the one mainly referred to in this review, is based on WGS. In this scheme, single-nucleotide polymorphisms (SNPs) specific to different monophyletic clades are identified and used as SNP-barcodes to assign MTBC strains to the corresponding lineages and sublineages. Although WGS offers a high resolution for phylogenetic classification, SNP-barcoding has its limitations, particularly in the light of biased sampling. If the sampling does not capture all representative strains of a given lineage, some genetic diversity will remain unidentified. This is an issue for L1 in particular, since the capacity to conduct WGS remains limited in many L1-endemic countries. The recent inclusion of more WGS samples from L1-endemic countries allowed for a finer delineation of various clades within L1 that were previously unidentified. A case in point is the observation that several L1 strains from Thailand could not be classified into any of the sublineages defined by the SNP-barcodes of Coll and colleagues [12] [5,15] (see sublineages with light and dark green highlights in Fig 3). As a result, a re-evaluation of these SNP barcodes became necessary to accommodate smaller monophyletic groups or sublineages [27]. The major changes that occurred since the introduction of the Coll nomenclature include the subdivision of L1.2.1 into two sublineages (L1.2.1 and L1.2.2) and the reclassification of L1.2.2 into sublineages under the L1.3 clade [9,27,28]. Table 1 shows a summary of these recent changes in L1 sublineage nomenclature. 
The rationale used for supporting these changes in nomenclature included similarities in geographical location among strains in the newly defined clades, consistency with spoligotyping genotypes, smaller mean SNP distances within clades as opposed to between clades, inclusion of a minimum number of strains per clade, and having at least one SNP common within the clade that was not present outside of the clade [9,27,28]. However, caution is required when reading the clade numberings of the different nomenclature systems. Although these systems may use the same sublineage and clade numbers, they may not refer to the same L1 genotypes [9,27]. Moreover, with more comprehensive sampling and further WGS analysis of L1 strains in the future, more L1 diversity will likely be uncovered, and the corresponding nomenclature will continue to evolve.
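The idea of SNP-barcoding, and why incomplete barcodes leave some strains classified only at a shallow level, can be illustrated with a toy example. The sketch below is an assumption-laden illustration: the genome positions and alleles in `TOY_BARCODE` are placeholders and are NOT the published barcode SNPs of any nomenclature scheme, and the rule of reporting the deepest matching label is one simple convention among several possible.

```python
from typing import Dict, Optional, Tuple

# Hypothetical barcode: genome position -> (clade-defining allele, clade label).
# Positions and alleles are placeholders, not real barcode SNPs.
TOY_BARCODE: Dict[int, Tuple[str, str]] = {
    1000: ("A", "L1"),
    2000: ("T", "L1.1"),
    3000: ("G", "L1.1.1"),
    4000: ("C", "L1.2"),
}


def assign_sublineage(
    calls: Dict[int, str],
    barcode: Dict[int, Tuple[str, str]] = TOY_BARCODE,
) -> Optional[str]:
    """Return the deepest clade label whose barcode allele is observed.

    `calls` maps genome positions to the allele called in the strain.
    Returns None when no barcode SNP matches.
    """
    matches = [
        label
        for position, (allele, label) in barcode.items()
        if calls.get(position) == allele
    ]
    if not matches:
        return None
    # Depth of a label = number of dot-separated levels.
    return max(matches, key=lambda label: label.count("."))
```

A strain carrying the markers for L1 and L1.1 but none of the deeper ones is reported as L1.1 in this sketch, which mirrors the situation of genomes that an existing barcode set cannot place in any defined sublineage.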

Table 1. Comparison of different WGS-based SNP-typing schemes used to classify MTBC L1 strains.

https://doi.org/10.1371/journal.pntd.0013513.t001

Characterizing the genome diversity of MTBC L1

The genomic characteristics of L1 were often described in parallel to the developments in MTBC genotyping technologies. Historically, MTBC genotypes were defined based on the insertion sequence 6110 (IS6110) Restriction Fragment Length Polymorphism (RFLP) method. The range of IS6110 copy numbers across L1 strains is highly variable. Some L1 strains carry no copies at all, some have low numbers (1 or 2), and others can have as many as 15 copies [31]. Several studies have documented this phenomenon within individual countries or geographical regions [32–35]. The development of spoligotyping at the end of the 1990s complemented IS6110 RFLP-based typing. The East-African-Indian (EAI) spoligotyping family, which largely corresponds to L1, was first defined in 2001 [36], and is characterized by the absence of CRISPR spacers 29–32 and 34 (DVR 39–42, and DVR 44 in the spoligotype-43 format) and the presence of spacer 33 or DVR 43 in the spoligotype-43 format [37]. Shortly thereafter, the first so-called “large sequence polymorphisms” (LSPs), also known as “regions of difference” (RDs), were identified. Among these RDs, it was the discovery of TbD1 that allowed for the characterization of L1 as an “ancestral” or “ancient” lineage of the MTBC. L1 was described as “ancestral” to L2, L3, or L4, due to the presence of the TbD1 region, which was lost in the common ancestor of the three aforementioned “modern” lineages [38]. Given the absence of ongoing horizontal gene exchange in the MTBC [39], the presence of TbD1 most likely represents the ancestral state, which is also supported by the fact that TbD1 is present in all MTBC lineages except for L2, L3, and L4 [2,40,41]. The deletion of TbD1 therefore represents a phylogenetically derived state, which indicates a unique evolutionary event that occurred in the common ancestor of L2, L3, and L4 [2,40,41]. 
Eventually, by comparative genomics of larger MTBC strain collections, RD239 was found to be deleted in all L1 strains, which led to this lineage being referred to as the Indo-Oceanic lineage by the LSP-based nomenclature [22,42].
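The EAI spoligotype signature described above (spacers 29–32 and 34 absent, spacer 33 present) lends itself to a small check. The following is a minimal sketch that assumes the spoligotype is given as a 43-character string of "1" (spacer present) and "0" (spacer absent), with spacers numbered from 1; the function name is illustrative, and matching this signature alone is not claimed to be a complete EAI classifier.

```python
ABSENT_SPACERS = (29, 30, 31, 32, 34)  # spacers stated as absent in the text
PRESENT_SPACER = 33                    # spacer stated as present in the text


def has_eai_signature(spoligotype: str) -> bool:
    """Check a 43-spacer binary spoligotype for the EAI signature.

    `spoligotype` must be a string of 43 characters, each '1' or '0'.
    """
    if len(spoligotype) != 43 or set(spoligotype) - {"0", "1"}:
        raise ValueError("expected a 43-character string of '0' and '1'")

    def present(spacer_number: int) -> bool:
        # Spacers are 1-based; string indices are 0-based.
        return spoligotype[spacer_number - 1] == "1"

    return present(PRESENT_SPACER) and not any(
        present(n) for n in ABSENT_SPACERS
    )
```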

As WGS developed and became more widely used, many SNPs were discovered that were phylogenetically informative and specific for the different MTBC lineages and sublineages, allowing for the general SNP-based nomenclature of the MTBC used today. An additional advantage of WGS-typing is that it can be used to quantify the genetic distance between strains. Based on such measurements, we know that L1 has the highest within-lineage diversity among all the MTBC lineages. In an initial set of pairwise comparisons, the average genetic distance between 44 L1 MTBC strains obtained from a global dataset was 730 SNPs, compared with around 600 and 200 SNPs for L4 (n = 64) and L2 (n = 37) strains, respectively [43]. A more recent comparison showed that the within-lineage diversity of L1 can be as high as 998 SNPs [9]. However, SNPs are not the only way of quantifying genetic distance. In fact, structural variants such as indels (e.g., RDs and LSPs) and repetitive sequences (e.g., direct repeats used in spoligotyping and insertion sequences such as IS6110) are ignored when measuring SNP distances using short-read WGS technology [44]. When all genetic variation is considered together, the genetic distance between strains could be even higher than short-read WGS SNPs indicate. Initial evidence from a recent MTBC pangenome study based on long-read sequencing showed that the diversity of L1 is high, as seen by the considerable genetic diversity of its accessory genome [45]. The high genetic diversity in L1 is not only due to diversification (or divergent evolution, as evidenced by phylogenetically informative SNPs and RDs) but also due to homoplasy or convergent evolution (as evidenced by non-phylogenetically informative RDs, such as RD3 and RD11, and similar spoligotype patterns in the direct repeat region among different L1 sublineages) [9,45,46].
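The average pairwise SNP distance used above as a measure of within-lineage diversity can be computed from an alignment of variable positions. The sketch below is illustrative only: it assumes equal-length aligned sequences, and skipping positions with a missing call ("N" or "-") in either strain is an assumption, since the cited studies' handling of missing data is not described here.

```python
from itertools import combinations
from typing import Sequence


def snp_distance(a: str, b: str, missing: str = "N-") -> int:
    """Count positions where two aligned sequences differ.

    Positions where either sequence has a missing call are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x not in missing and y not in missing
    )


def mean_pairwise_distance(sequences: Sequence[str]) -> float:
    """Average SNP distance over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(snp_distance(a, b) for a, b in pairs) / len(pairs)
```

As the paragraph above notes, a distance of this kind captures only SNPs; structural variants and repetitive regions do not contribute to it.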

Connecting the MTBC L1 genotypes to phenotypes

How much MTBC genetic variation contributes to the TB disease phenotype is an important question in TB clinical research. Unfortunately, it is one that is entangled with other factors such as host comorbidities and host–pathogen interaction [47–49]. Therefore, we present in this section of the review the important clinical phenotypes considered in TB disease and how they generally present among L1 TB cases. However, the results of different studies are not always applicable to all human host populations. This is especially true for studies done in vitro or in animal models that lack many of the nuances of human TB disease [50].

Experimental phenotypes

In 1963, Mitchison and colleagues observed that MTBC strains isolated from South Indian TB patients showed lower virulence in guinea pigs and increased susceptibility to reactive oxygen species when compared to MTBC isolates from British patients [51]. Follow-up studies around that time supported the notion that South Indian TB strains were indeed on average less virulent in guinea pigs and more susceptible to reactive oxygen stress than strains isolated from other countries [52]. Moreover, the lower virulence of these strains was associated with increased susceptibility to oxidative stress [53]. Although the association between increased susceptibility to oxidative stress and decreased virulence in the MTBC is not necessarily causal, proteins involved in the stress response have been shown to also regulate virulence in other pathogenic bacteria [54,55]. However, as mentioned previously, bacterial pathogenicity is not the only factor that dictates clinical virulence.

Today, we know that the large majority of the MTBC strains from South India belong to L1 and that endemic British MTBC isolates are mostly L4. A more recent study replicated the original experiments by Mitchison and colleagues using some of the same L1 isolates [56]. These authors established a possible role of TbD1 in MTBC virulence and used bacterial load as a measure of virulence. When a wild-type L1 strain with an intact TbD1 region was compared to a TbD1-deficient L1 strain, a higher bacterial load was observed in guinea pigs infected with the L1 mutant strain lacking TbD1. The authors also showed that the TbD1-deficient L1 strain was less susceptible to reactive oxygen and nitrogen species than the wild-type strain [56]. Other experiments used spoligotyping information and compared EAI (i.e., L1) strains with non-EAI strains, in particular strains belonging to L2 and L4. Using MTBC strains isolated from Vietnam, one study measured the bacterial load in infected BALB/c mice and observed varying degrees of decreased virulence in L1 strains compared to either L2 or L4 [57]. In another study comparing strains isolated from Tanzania, L1 strains were shown to have a lower replication rate when compared to L2 in human monocyte-derived macrophages [58]. In summary, most (albeit limited) experimental evidence from infection models suggests that L1 is less virulent than L2 and L4.

The inflammatory profile of L1 strains has also been studied to understand how the human immune system interacts with the different genetic variants of MTBC. In several studies, L1 induced higher TNF-α, IL-1β, and IL-12 levels in murine and human macrophages 24 and 48 hours after infection when compared to L2 and L4 [57,59,60]. Another study observed a wider range of values in the normalized median cytokine levels at 24 hours post-infection induced by L1 (n = 8), L5 (n = 1), and L6 (n = 7) strains, when compared to L2, L3, and L4 strains [60]. Other inflammatory cytokines that appear to be induced to a higher extent in L1 relative to L2, L3, and L4, 24–48 hours post-infection, include IL-6, IL-15, MIP-1a, CCL5, IL-8, and MCP-1 [59–61]. One study comparing L1 and L4 strains from Tanzania reported lower instead of higher inflammatory cytokines in L1 at 1, 4, and 7 days post-infection [58]. These somewhat contradictory results may point to within-lineage phenotypic diversity among L1 strains, which would be consistent with the large within-lineage genetic diversity in L1 reviewed above. However, contradictory results on the inflammatory profile can also simply be due to the inherent limitations of, and differences between, the experimental designs, such as the varying multiplicities of infection (MOIs) used in the different studies.

A few studies looked at the specific composition of the cell wall of L1 compared to other MTBC lineages. One study observed that L1 strains had a relatively higher abundance of alpha- and keto-mycolic acids when compared to L6, and relatively lower abundance of methoxy-mycolic acids compared to L2, L4, and L6 [62]. When examining a globally-representative collection of L1 strains, it was shown that the cell wall phenolic glycolipids varied across the L1 sublineages [13]. These authors also reported the presence of mycoside B in strains belonging to a subset of L1.2.2. This study is so far the first report of the presence of mycoside B in L1. Previously, mycoside B was reported in some strains belonging to L6 [13]. The fact that the L1 sublineages differ in their cell wall characteristics is further substantiated by a study demonstrating that strains belonging to L1.1.2 from India did not produce sulfolipids, while L1 strains from Vietnam did [63–65]. These findings on distinct MTBC cell wall phenotypes between L1 sublineages are important because many MTBC cell wall lipids are implicated in the host–pathogen cross-talk during TB infection and disease [66]. Finally, a recent study [67] used an in vitro granuloma-like infection model and found that L1 infections led to smaller granuloma formations when compared to L3 or L4.

Clinical virulence

Despite the increasing evidence for differences in experimental phenotypes when comparing L1 to other MTBC lineages, only a few studies have explored L1 phenotypes in the clinic. Most studies to date have focused on other lineages (in particular L2) and included L1 in the larger group of “other lineages”, making any L1-specific conclusion difficult. However, one recent analysis of clinical outcomes in Vietnam based on 158 patient isolates found that L1.1.1.1 was associated with treatment failure and cavitary disease when compared to other lineages [68]. Similarly, a study analyzing 1,305 isolates from Thailand found that infection with strains from the Indo-Oceanic lineage (i.e., L1) was associated with increased mortality when compared to the East Asian lineage (i.e., L2) [69]. Clinical virulence in human pathogens like the MTBC, however, is not only a result of bacterial pathogenicity, but can also be affected by human host comorbidities and the interaction between the host immune system and the infecting MTBC genotype [49]. Hence, measures of experimental virulence obtained in vitro and in animal models are not always applicable to what we see clinically [50]. In summary, the limited clinical data available so far are largely inconclusive and do not support a reduced clinical virulence of L1, as could be expected based on the experimental data reviewed above.

Extrapulmonary tuberculosis

While TB is primarily a pulmonary disease, it can also affect many other parts of the human body. Collectively, extrapulmonary TB can represent more than 30% of cases in certain geographical settings like East Africa [70,71]. In addition to patient factors like co-infection with HIV and geographical region of origin [72–74], several findings suggest that L1 might also be associated with extrapulmonary TB. In particular, three studies analyzed large multinational collections of MTBC strains together with the associated clinical data and found L1 to be associated with extrapulmonary TB when compared to L2 [75,76], L3 [76], or L4 [76,77]. In contrast, another study from Germany showed no significant association between L1 and extrapulmonary TB [78]. To understand the possible association between L1 and extrapulmonary TB, Saelens and colleagues [79] studied in detail an L1 strain that caused an outbreak of extrapulmonary TB in the USA. The authors found that this L1 outbreak strain carried a full-length variant of the esxM gene that was truncated in strains belonging to L2, L3, and L4. They hypothesized that carrying full-length esxM could lead to a higher propensity of developing extrapulmonary TB due to morphological changes in the TB-infected macrophages that increase their motility. Of note, and similar to the TbD1 genomic region discussed above, the full-length version of esxM likely reflects the ancestral state while the truncated form reflects the derived state characteristic of all “modern” MTBC strains. Taken together, one could hypothesize that the “modern” strains might therefore have evolved to become more streamlined to cause pulmonary TB rather than extrapulmonary TB, when compared to L1 and to the other lineages exhibiting a full-length esxM, which potentially could enhance transmission in areas with lower TB incidence or lower population density.

Transmission and infectiousness

Several genomic epidemiological studies have reported that L1 strains transmit less than other lineages (particularly L2 and L4) in both L1-endemic and non-endemic countries. These studies used genomic clustering rates based on specific SNP-distance thresholds and terminal branch lengths (TBLs) as proxies for recent transmission. A study from Vietnam reported reduced transmission of L1 strains compared to L2 and L4 [80]. One study in India and another in Tanzania also concluded that L1 has a reduced transmission potential compared to L2 and L4 [81], and L3 [21]. Several other studies that covered longer time periods or included larger sample sizes found that L1 strains were less likely to belong to transmission clusters [82–84]. SNP-based clustering is useful for detecting strains that are epidemiologically linked. However, there are some important limitations when using genetic clustering to infer differences in transmission. Confounding factors that affect the use of SNP distances to infer transmissibility include differences between lineages in the molecular clock rate and the latency period, and the extent of sampling (i.e., sampling period and sampling proportion) [85]. For example, a smaller cluster or a longer TBL can be due to any of the following: a longer latency period, a slower molecular clock rate, a shorter sampling period, or sampling of only a small proportion of cases [85]. By contrast, phylodynamic modeling can circumvent some of these limitations by simultaneously integrating evolutionary, demographic, and epidemiological parameters [10]. Phylodynamics is a statistical framework that integrates genetic data to estimate transmission parameters like the basic reproduction number (R0) and the effective reproduction number (Re). The values of R0 or Re are more informative than TBLs or SNP clusters because they are validated measures of transmissibility that can also give an idea of the stability of an epidemic or the effectiveness of infection control measures [86]. While the values of R0 and Re are determined by the sampling rate and transmission rate, TBLs are affected by the many other factors mentioned above [85].
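To make the clustering proxy and its sensitivity to sampling concrete, the following is a minimal Python sketch of threshold-based SNP clustering. It is an illustration only, not the pipeline of any study cited here: the single-linkage rule, the function names `snp_clusters` and `clustering_rate`, the 5-SNP threshold, and the toy distance matrix are all assumptions chosen for the example.

```python
from itertools import combinations


def snp_clusters(dist, threshold):
    """Group isolates by single linkage: two isolates are linked when their
    pairwise SNP distance is <= threshold (hypothetical cut-off)."""
    n = len(dist)
    parent = list(range(n))  # union-find forest, one node per isolate

    def find(i):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def clustering_rate(clusters):
    """Proportion of isolates that fall in a cluster of size two or more."""
    total = sum(len(c) for c in clusters)
    clustered = sum(len(c) for c in clusters if len(c) > 1)
    return clustered / total


# Toy symmetric SNP-distance matrix for five isolates (invented values).
dist = [
    [0, 3, 4, 40, 55],
    [3, 0, 9, 42, 57],
    [4, 9, 0, 41, 56],
    [40, 42, 41, 0, 30],
    [55, 57, 56, 30, 0],
]

full = snp_clusters(dist, threshold=5)
print(full, clustering_rate(full))

# Same population, but isolate 0 was never sampled.
keep = [1, 2, 3, 4]
sub = [[dist[i][j] for j in keep] for i in keep]
partial = snp_clusters(sub, threshold=5)
print(partial, clustering_rate(partial))
```

In this toy example, isolates 1 and 2 are 9 SNPs apart and are only clustered through isolate 0. Leaving isolate 0 unsampled therefore removes the cluster entirely, although nothing about the underlying transmission has changed. This is one way in which sampling proportion can confound clustering-based comparisons between lineages.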

A study in Tanzania used such a phylodynamic approach and found that despite the reduced transmission rate for L1 compared to L2 and L3, there were no differences in the effective reproduction number (Re) between these lineages [21]. This phylodynamic model was recently further refined and used to analyze four independent MTBC genomic datasets from Malawi, Tanzania, The Gambia, and Vietnam [10]. The authors found that the time between transmission events was consistently longer in L1 and L6 compared to the “modern” L2, L3, and L4 in all four countries. These findings were attributed to a longer initial period of non-infectiousness (i.e., a delayed onset of infectious TB disease) in L1 and L6 compared to the modern lineages. Interestingly, in contrast to the results of SNP-based clustering and TBLs, the transmission rates in L1, L2, L3, and L4 inferred by this phylodynamic model were similar [10]. Taken together, while several genomic epidemiological studies suggest that L1 shows a reduced transmission potential, more sophisticated phylodynamic analyses show a more complex picture. Hence, the notion of L1 having reduced transmissibility might be too simplistic, which is also supported by the observation that in most L1-endemic countries, there is no evidence for L1 being outcompeted by other lineages over time [87]. Instead, L1, and probably some of the other MTBC lineages still retaining intact TbD1 and esxM versions, might differ in their evolutionary strategy in more subtle ways. Specifically, one could hypothesize that a slower progression from infection to disease with overt clinical symptoms is associated with a delayed infectious period. Some support for this hypothesis comes from a recent study in Canada, where an association between L1 and asymptomatic TB was reported in two independent cohorts [88]. The high population density in many L1-endemic countries may also allow L1 to transmit despite the presumed delay in onset of infectiousness.
It has been noted that TB re-infection can occur, especially in areas with high TB incidence [89,90]. This also supports the notion that in densely populated L1-endemic countries, there is a constant supply of susceptible hosts, which allows for continuous infection and transmission. In a modeling study of TB disease that used parameters attuned to endemic TB communities, re-exposure was seen to increase the probability of entering an infectious state [91].

Drug resistance in MTBC L1

Multidrug-resistant TB (MDR-TB) has repeatedly been associated with L2 [92]. By contrast, the contribution of L1 to the MDR-TB epidemic remains unclear. This is important given that among the ten countries with the highest absolute number of incident MDR-TB cases, seven are L1-endemic countries: India, Indonesia, the Philippines, Pakistan, Myanmar, South Africa, and Vietnam [1]. However, the characteristics of MDR L1 strains have rarely been investigated. Several studies have shown that mutations in the inhA promoter located in the fabG1 (mabA) gene, which confer isoniazid resistance, are overrepresented in L1 strains compared to L2 or L4 [93–95]. The reason underlying the association between L1 and fabG1 drug resistance mutations is unknown, but could be related to the overall higher susceptibility of L1 to oxidative stress [51,53], making the catalase-peroxidase activity encoded by katG particularly essential for intracellular detoxification in L1-infected host cells. Therefore, it may be more beneficial for L1 to minimize katG mutations, which would favor other mutations that confer isoniazid resistance. As for rifampicin, the mutation rpoB S450L is globally the most frequent cause of resistance. Based on the WHO MTBC Catalog of Drug Resistance Mutations, out of around 16,000 isolates with rifampicin resistance, 64% had an rpoB S450L mutation [96]. In the Philippines, however, where most of the circulating MTBC strains are L1, only around 40% of rifampicin-resistant MTBC strains harbored this mutation [97]. A similar observation was reported in India [98]. These findings suggest that in L1, other mutations in the rpoB gene, or in genes other than rpoB, might play an important role in rifampicin resistance.
During the development of a standardized protocol for routine phenotypic drug susceptibility testing for the new anti-TB drug pretomanid, it was noticed that L1 strains exhibited a higher minimal inhibitory concentration (MIC) for pretomanid when compared to L2, L3, or L4 strains [99]. This increased MIC for pretomanid in L1 was confirmed in a later study [100]. However, the clinical implications of this for the treatment of MDR-TB in L1-endemic countries remain to be determined [100]. Conversely, it has been reported that L1 had a reduced MIC for bedaquiline [68], which may have a beneficial effect on the treatment of patients infected with L1. This may be related to the observation of a loss-of-function (LOF) mutation in the mmpL5 gene of strains from sublineage L1.1.1.1 [101]. This mmpL5 LOF mutation is thought to increase sensitivity to the new drug bedaquiline [68]. Another LOF mutation was identified in the whiB7 gene among strains belonging to sublineage L1.2.1 [102]. The whiB7 gene has previously been shown to be partially responsible for the intrinsic resistance of MTBC to many drugs, such as the macrolide class of antibiotics [103]. The LOF mutation discovered in this gene resulted in sensitivity to clarithromycin, opening the door for alternative, bacterial genotype-specific treatments for TB [102].

Conclusions

Although MTBC L1 is the most common cause of human TB in terms of absolute numbers, we still have an incomplete understanding of its biology and epidemiology. While experimental infection studies suggest a reduced virulence of L1 strains compared to strains belonging to L2, L3, and L4, the limited clinical data available draw a more complex picture. Moreover, the repeated association of L1 with extrapulmonary TB, the delayed progression to the infectious disease state indicated by phylodynamic modeling, and the association with asymptomatic TB are suggestive of distinct life history traits of L1 compared to the other human-adapted MTBC lineages. More research is necessary to understand the interaction between the human host and L1. These studies should especially be placed into the context of the environment in L1-endemic countries, where socioeconomic factors play important roles in shaping the health of individuals at risk for or already infected with MTBC.

Finally, the fact that L1 is the most genetically diverse MTBC lineage with a strong phylogeographical population structure might reflect local adaptation of L1 genotypes to different human host populations. However, more work is needed to explore this hypothesis. Beyond determining the evolutionary forces that resulted in an increased genetic diversity within L1, we also need to mechanistically study the functional consequences of this diversity. In light of the ongoing efforts to develop improved TB vaccines [104] and new anti-TB treatments [105], care should be taken to ensure a broad efficacy, including against infection and disease caused by L1.

Box 1. Key learning points

  a. MTBC L1 is globally the most frequent cause of human TB in terms of the estimated absolute number of patients affected, specifically among the high-burden TB countries India, the Philippines, Indonesia, and Bangladesh.
  b. The global population of MTBC L1 is highly phylogeographically structured, with the different L1 sublineages showing particular geographical associations.
  c. Increase in sampling of MTBC strains from L1-endemic countries resulted in the identification of new L1 genotypes. More L1 diversity will likely be uncovered in the future.
  d. In addition to being the most genetically diverse MTBC lineage, L1 strains also exhibit large phenotypic diversity, the relevance of which for global TB control remains to be established.
  e. Differences in transmission dynamics and progression to disease in L1 compared to L2, L3, and L4 suggest subtle differences in life history traits across MTBC lineages.

Box 2. Five key references

  a. Goig GA, Windels EM, Loiseau C, Stritt C, Biru L, Borrell S, et al. Ecology, global diversity and evolutionary mechanisms in the Mycobacterium tuberculosis complex. Nature Reviews Microbiology. 2025;23:602–14. https://doi.org/10.1038/s41579-025-01159-w.
  b. Bottai D, Frigui W, Sayes F, Di Luca M, Spadoni D, Pawlik A, et al. TbD1 deletion as a driver of the evolutionary success of modern epidemic Mycobacterium tuberculosis lineages. Nature Communications. 2020;11(1):684. https://doi.org/10.1038/s41467-020-14508-5.
  c. Saelens JW, Sweeney MI, Viswanathan G, Xet-Mull AM, Jurcic Smith KL, Sisk DM, et al. An ancestral mycobacterial effector promotes dissemination of infection. Cell. 2022;185(24):4507–25.e18. https://doi.org/10.1016/j.cell.2022.10.019.
  d. Netikul T, Thawornwattana Y, Mahasirimongkol S, Yanai H, Maung HMW, Chongsuvivatwong V, et al. Whole-genome single nucleotide variant phylogenetic analysis of Mycobacterium tuberculosis Lineage 1 in endemic regions of Asia and Africa. Scientific Reports. 2022;12(1):1565. https://doi.org/10.1038/s41598-022-05524-0.
  e. Windels EM, Valenzuela Agüí C, de Jong BC, Meehan CJ, Loiseau C, Goig GA, et al. Onset of infectiousness explains differences in transmissibility across Mycobacterium tuberculosis lineages. Epidemics. 2025;51:100821. https://doi.org/10.1016/j.epidem.2025.100821.

Acknowledgments

We thank all the other members of our group for the stimulating discussions.

References

  1. 1. Global tuberculosis report 2024. World Health Organization. 2024.
  2. 2. Guyeux C, Senelle G, Le Meur A, Supply P, Gaudin C, Phelan JE, et al. Newly identified Mycobacterium africanum Lineage 10, Central Africa. Emerg Infect Dis. 2024;30(3):560–3. pmid:38407162
  3. 3. Zwyer M, Çavusoglu C, Ghielmetti G, Pacciarini ML, Scaltriti E, Van Soolingen D, et al. A new nomenclature for the livestock-associated Mycobacterium tuberculosis complex based on phylogenomics. Open Res Eur. 2021;1:100. pmid:37645186
  4. 4. Goig GA, Windels EM, Loiseau C, Stritt C, Biru L, Borrell S, et al. Ecology, global diversity and evolutionary mechanisms in the Mycobacterium tuberculosis complex. Nat Rev Microbiol. 2025;23(9):602–14. pmid:40133503
  5. 5. Menardo F, Rutaihwa LK, Zwyer M, Borrell S, Comas I, Conceição EC, et al. Local adaptation in populations of Mycobacterium tuberculosis endemic to the Indian Ocean Rim. F1000Res. 2021;10:60. pmid:33732436
  6. 6. Poonawala H, Kumar N, Peacock SJ. A review of published spoligotype data indicates the diversity of Mycobacterium tuberculosis from India is under-represented in global databases. Infect Genet Evol. 2020;78:104072. pmid:31618692
  7. 7. Manson AL, Cohen KA, Abeel T, Desjardins CA, Armstrong DT, Barry CE 3rd, et al. Genomic analysis of globally diverse Mycobacterium tuberculosis strains provides insights into the emergence and spread of multidrug resistance. Nat Genet. 2017;49(3):395–402. pmid:28092681
  8. 8. Netikul T, Palittapongarnpim P, Thawornwattana Y, Plitphonganphim S. Estimation of the global burden of Mycobacterium tuberculosis lineage 1. Infect Genet Evol. 2021;91:104802. pmid:33684570
  9. 9. Netikul T, Thawornwattana Y, Mahasirimongkol S, Yanai H, Maung HMW, Chongsuvivatwong V, et al. Whole-genome single nucleotide variant phylogenetic analysis of Mycobacterium tuberculosis Lineage 1 in endemic regions of Asia and Africa. Sci Rep. 2022;12(1):1565. pmid:35091638
  10. 10. Windels EM, Valenzuela Agüí C, de Jong BC, Meehan CJ, Loiseau C, Goig GA, et al. Onset of infectiousness explains differences in transmissibility across Mycobacterium tuberculosis lineages. Epidemics. 2025;51:100821. pmid:40118009
  11. 11. Wiens KE, Woyczynski LP, Ledesma JR, Ross JM, Zenteno-Cuevas R, Goodridge A, et al. Global variation in bacterial strains that cause tuberculosis disease: a systematic review and meta-analysis. BMC Med. 2018;16(1):196. pmid:30373589
  12. 12. Coll F, McNerney R, Guerra-Assunção JA, Glynn JR, Perdigão J, Viveiros M, et al. A robust SNP barcode for typing Mycobacterium tuberculosis complex strains. Nat Commun. 2014;5:4812. pmid:25176035
  13. 13. Gisch N, Utpatel C, Gronbach LM, Kohl TA, Schombel U, Malm S. Sub-lineage specific phenolic glycolipid patterns in the Mycobacterium tuberculosis complex lineage 1. Frontiers in Microbiology. 2022;13.
  14. 14. Napier G, Couvin D, Refrégier G, Guyeux C, Meehan CJ, Sola C, et al. Comparison of in silico predicted Mycobacterium tuberculosis spoligotypes and lineages from whole genome sequencing data. Sci Rep. 2023;13(1):11368. pmid:37443186
  15. 15. Palittapongarnpim P, Ajawatanawong P, Viratyosin W, Smittipat N, Disratthakit A, Mahasirimongkol S, et al. Evidence for host-bacterial co-evolution via genome sequence analysis of 480 thai Mycobacterium tuberculosis lineage 1 isolates. Sci Rep. 2018;8(1):11597. pmid:30072734
  16. 16. O’Neill MB, Shockey A, Zarley A, Aylward W, Eldholm V, Kitchen A, et al. Lineage specific histories of Mycobacterium tuberculosis dispersal in Africa and Eurasia. Mol Ecol. 2019;28(13):3241–56. pmid:31066139
  17. 17. Duarte TA, Nery JS, Boechat N, Pereira SM, Simonsen V, Oliveira M, et al. A systematic review of East African-Indian family of Mycobacterium tuberculosis in Brazil. Braz J Infect Dis. 2017;21(3):317–24. pmid:28238627
  18. 18. Conceição EC, Refregier G, Gomes HM, Olessa-Daragon X, Coll F, Ratovonirina NH, et al. Mycobacterium tuberculosis lineage 1 genetic diversity in Pará, Brazil, suggests common ancestry with east-African isolates potentially linked to historical slave trade. Infect Genet Evol. 2019;73:337–41. pmid:31170529
  19. 19. Silcocks M, Dunstan SJ. Parallel signatures of Mycobacterium tuberculosis and human Y-chromosome phylogeography support the Two Layer model of East Asian population history. Commun Biol. 2023;6(1):1037. pmid:37833496
  20. 20. Mokrousov I. On sunspots, click science and molecular iconography. Tuberculosis (Edinb). 2018;110:91–5. pmid:29779780
  21. 21. Zwyer M, Rutaihwa LK, Windels E, Hella J, Menardo F, Sasamalo M, et al. Back-to-Africa introductions of Mycobacterium tuberculosis as the main cause of tuberculosis in Dar es Salaam, Tanzania. PLoS Pathog. 2023;19(4):e1010893. pmid:37014917
  22. 22. Hirsh AE, Tsolaki AG, DeRiemer K, Feldman MW, Small PM. Stable association between strains of Mycobacterium tuberculosis and their human host populations. Proc Natl Acad Sci U S A. 2004;101(14):4871–6. pmid:15041743
  23. 23. Gagneux S, DeRiemer K, Van T, Kato-Maeda M, de Jong BC, Narayanan S, et al. Variable host-pathogen compatibility in Mycobacterium tuberculosis. Proc Natl Acad Sci U S A. 2006;103(8):2869–73. pmid:16477032
  24. 24. Gröschel MI, Pérez-Llanos FJ, Diel R, Vargas R Jr, Escuyer V, Musser K, et al. Differential rates of Mycobacterium tuberculosis transmission associate with host-pathogen sympatry. Nat Microbiol. 2024;9(8):2113–27. pmid:39090390
  25. 25. Liu Q, Liu H, Shi L, Gan M, Zhao X, Lyu L-D, et al. Local adaptation of Mycobacterium tuberculosis on the Tibetan Plateau. Proc Natl Acad Sci U S A. 2021;118(17):e2017831118. pmid:33879609
  26. 26. Merker M, Kohl TA, Niemann S, Supply P. The evolution of strain typing in the Mycobacterium tuberculosis complex. Adv Exp Med Biol. 2017;1019:43–78. pmid:29116629
  27. 27. Shitikov E, Bespiatykh D. A revised SNP-based barcoding scheme for typing Mycobacterium tuberculosis complex isolates. mSphere. 2023;8(4):e0016923. pmid:37314207
  28. 28. Napier G, Campino S, Merid Y, Abebe M, Woldeamanuel Y, Aseffa A, et al. Robust barcoding and identification of Mycobacterium tuberculosis lineages for epidemiological and clinical studies. Genome Med. 2020;12(1):114. pmid:33317631
  29. 29. Comas I, Homolka S, Niemann S, Gagneux S. Genotyping of genetically monomorphic bacteria: DNA sequencing in Mycobacterium tuberculosis highlights the limitations of current methodologies. PLoS One. 2009;4(11):e7815. pmid:19915672
  30. 30. Homolka S, Projahn M, Feuerriegel S, Ubben T, Diel R, Nübel U, et al. High resolution discrimination of clinical Mycobacterium tuberculosis complex strains based on single nucleotide polymorphisms. PLoS One. 2012;7(7):e39855. pmid:22768315
  31. 31. Roychowdhury T, Mandal S, Bhattacharya A. Analysis of IS6110 insertion sites provide a glimpse into genome evolution of Mycobacterium tuberculosis. Sci Rep. 2015;5:12567. pmid:26215170
  32. 32. Agasino CB, Ponce de Leon A, Jasmer RM, Small PM. Epidemiology of Mycobacterium tuberculosis strains in San Francisco that do not contain IS6110. Int J Tuberc Lung Dis. 1998;2(6):518–20. pmid:9626611
  33. 33. Das S, Paramasivan CN, Lowrie DB, Prabhakar R, Narayanan PR. IS6110 restriction fragment length polymorphism typing of clinical isolates of Mycobacterium tuberculosis from patients with pulmonary tuberculosis in Madras, south India. Tuber Lung Dis. 1995;76(6):550–4. pmid:8593378
  34. 34. Park YK, Bai GH, Kim SJ. Restriction fragment length polymorphism analysis of Mycobacterium tuberculosis isolated from countries in the western pacific region. J Clin Microbiol. 2000;38(1):191–7. pmid:10618086
  35. 35. Douglas JT, Qian L, Montoya JC, Musser JM, Van Embden JDA, Van Soolingen D, et al. Characterization of the Manila family of Mycobacterium tuberculosis. J Clin Microbiol. 2003;41(6):2723–6. pmid:12791915
  36. 36. Sola C, Filliol I, Legrand E, Mokrousov I, Rastogi N. Mycobacterium tuberculosis phylogeny reconstruction based on combined numerical analysis with IS1081, IS6110, VNTR, and DR-based spoligotyping suggests the existence of two new phylogeographical clades. J Mol Evol. 2001;53(6):680–9. pmid:11677628
  37. 37. Refrégier G, Sola C, Guyeux C. Unexpected diversity of CRISPR unveils some evolutionary patterns of repeated sequences in Mycobacterium tuberculosis. BMC Genomics. 2020;21(1):841. pmid:33256602
  38. 38. Brosch R, Gordon SV, Marmiesse M, Brodin P, Buchrieser C, Eiglmeier K, et al. A new evolutionary scenario for the Mycobacterium tuberculosis complex. Proc Natl Acad Sci U S A. 2002;99(6):3684–9. pmid:11891304
  39. 39. Stritt C, Gagneux S. How do monomorphic bacteria evolve? The Mycobacterium tuberculosis complex and the awkward population genetics of extreme clonality. Peer Commun J. 2023;3.
  40. 40. Coscolla M, Gagneux S, Menardo F, Loiseau C, Ruiz-Rodriguez P, Borrell S, et al. Phylogenomics of Mycobacterium africanum reveals a new lineage and a complex evolutionary history. Microb Genom. 2021;7(2):000477. pmid:33555243
  41. 41. Ngabonziza JCS, Loiseau C, Marceau M, Jouet A, Menardo F, Tzfadia O, et al. A sister lineage of the Mycobacterium tuberculosis complex discovered in the African Great Lakes region. Nat Commun. 2020;11(1):2917. pmid:32518235
  42. 42. Tsolaki AG, Hirsh AE, DeRiemer K, Enciso JA, Wong MZ, Hannan M, et al. Functional and evolutionary genomics of Mycobacterium tuberculosis: insights from genomic deletions in 100 strains. Proc Natl Acad Sci U S A. 2004;101(14):4865–70. pmid:15024109
  43. 43. Coscolla M, Gagneux S. Consequences of genomic diversity in Mycobacterium tuberculosis. Semin Immunol. 2014;26(6):431–44. pmid:25453224
  44. 44. Stritt C, Reitsma M, Marin AMG, Goig G, Dötsch A, Borrell S, et al. Gene conversion and duplication contribute to genetic variation in an outbreak of Mycobacterium tuberculosis. Microbial Genomics. 2025;11(5).
  45. 45. Behruznia M, Marin M, Whiley D, Farhat M, Thomas JC, Domingo-Sananes MR, et al. The Mycobacterium tuberculosis complex pangenome is small and shaped by sub-lineage-specific regions of difference. eLife Sciences Publications; 2025.
  46. 46. Bespiatykh D, Bespyatykh J, Mokrousov I, Shitikov E. A comprehensive map of Mycobacterium tuberculosis complex regions of difference. mSphere. 2021;6(4):e0053521. pmid:34287002
  47. 47. Goig GA, Loiseau C, Maghradze N, Mchedlishvili K, Avaliani T, Tsutsunava A, et al. Clinical and bacterial determinants of unfavorable tuberculosis treatment outcomes: an observational study in Georgia. Cold Spring Harbor Laboratory; 2025.
  48. 48. Sweeney MI, Carranza CE, Tobin DM. Understanding Mycobacterium tuberculosis through its genomic diversity and evolution. PLoS Pathog. 2025;21(2):e1012956. pmid:40019877
  49. 49. Palittapongarnpim P, Tantivitayakul P, Aiewsakun P, Mahasirimongkol S, Jaemsai B. Genomic interactions between Mycobacterium tuberculosis and humans. Annu Rev Genomics Hum Genet. 2024;25(1):183–209. pmid:38640230
  50. 50. Li H, Li H. Animal models of tuberculosis. In: Christodoulides M, editor. Vaccines for neglected pathogens: Strategies, achievements and challenges: focus on leprosy, leishmaniasis, melioidosis and tuberculosis. Cham: Springer International Publishing; 2023. p. 139–70.
  51. 51. Mitchison DA, Selkon JB, Lloyd J. Virulence in the guinea-pig, susceptibility to hydrogen peroxide, and catalase activity of isoniazid-sensitive tubercle bacilli from South Indian and British patients. J Pathol Bacteriol. 1963;86:377–86. pmid:14068947
  52. 52. Mitchison DA. Regional variation in the guinea-pig virulence and other characteristics of tubercle bacilli. Pneumonologie. 1970;142(2):131–7. pmid:4100080
  53. 53. Goren MB, Grange JM, Aber VR, Allen BW, Mitchison DA. Role of lipid content and hydrogen peroxide susceptibility in determining the guinea-pig virulence of Mycobacterium tuberculosis. Br J Exp Pathol. 1982;63(6):693–700. pmid:6817780
  54. 54. Sharma A, Tayal S, Bhatnagar S. Analysis of stress response in multiple bacterial pathogens using a network biology approach. Sci Rep. 2025;15(1):15342. pmid:40316612
  55. 55. Cardoza E, Singh H. From stress tolerance to virulence: recognizing the roles of csps in pathogenicity and food contamination. Pathogens. 2024;13(1):69. pmid:38251376
  56. 56. Bottai D, Frigui W, Sayes F, Di Luca M, Spadoni D, Pawlik A, et al. TbD1 deletion as a driver of the evolutionary success of modern epidemic Mycobacterium tuberculosis lineages. Nat Commun. 2020;11(1):684. pmid:32019932
  57. 57. Krishnan N, Malaga W, Constant P, Caws M, Tran THC, Salmons J, et al. Mycobacterium tuberculosis lineage influences innate immune response and virulence and is associated with distinct cell envelope lipid profiles. PLoS One. 2011;6(9):e23870. pmid:21931620
  58. 58. Hiza H, Zwyer M, Hella J, Arbués A, Sasamalo M, Borrell S, et al. Bacterial diversity dominates variable macrophage responses of tuberculosis patients in Tanzania. Sci Rep. 2024;14(1):9287. pmid:38653771
  59. 59. Chakraborty P, Kulkarni S, Rajan R, Sainis K. Drug resistant clinical isolates of Mycobacterium tuberculosis from different genotypes exhibit differential host responses in THP-1 cells. PLoS One. 2013;8(5):e62966. pmid:23667550
  60. 60. Portevin D, Gagneux S, Comas I, Young D. Human macrophage responses to clinical isolates from the Mycobacterium tuberculosis complex discriminate between ancient and modern lineages. PLoS Pathog. 2011;7(3):e1001307. pmid:21408618
  61. 61. Chakraborty P, Kulkarni S, Rajan R, Sainis K. Mycobacterium tuberculosis strains from ancient and modern lineages induce distinct patterns of immune responses. J Infect Dev Ctries. 2018;11(12):904–11. pmid:31626595
  62. 62. Portevin D, Sukumar S, Coscolla M, Shui G, Li B, Guan XL, et al. Lipidomics and genomics of Mycobacterium tuberculosis reveal lineage-specific trends in mycolic acid biosynthesis. Microbiologyopen. 2014;3(6):823–35. pmid:25238051
  63. 63. Panchal V, Jatana N, Malik A, Taneja B, Pal R, Bhatt A, et al. A novel mutation alters the stability of PapA2 resulting in the complete abrogation of sulfolipids in clinical mycobacterial strains. FASEB Bioadv. 2019;1(5):306–19. pmid:32123834
  64. 64. Goren MB, Brokl O, Schaefer WB. Lipids of putative relevance to virulence in Mycobacterium tuberculosis: phthiocerol dimycocerosate and the attenuation indicator lipid. Infect Immun. 1974;9(1):150–8. pmid:4271720
  65. 65. Gangadharam PR, Cohn ML, Middlebrook G. Infectivity, pathogenicity and sulpholipid fraction of some Indian and British strains of tubercle bacilli. Tubercle. 1963;44:452–5. pmid:14101326
  66. 66. Ruhl CR, Pasko BL, Khan HS, Kindt LM, Stamm CE, Franco LH, et al. Mycobacterium tuberculosis sulfolipid-1 activates nociceptive neurons and induces cough. Cell. 2020;181(2):293-305.e11. pmid:32142653
  67. 67. Arbués A, Schmidiger S, Reinhard M, Borrell S, Gagneux S, Portevin D. Soluble immune mediators orchestrate protective in vitro granulomatous responses across Mycobacterium tuberculosis complex lineages. eLife Sciences Publications; 2025.
  68. 68. Stanley S, Spaulding CN, Liu Q, Chase MR, Ha DTM, Thai PVK, et al. Identification of bacterial determinants of tuberculosis infection and treatment outcomes: a phenogenomic analysis of clinical strains. Lancet Microbe. 2024;5(6):e570–80. pmid:38734030
  69. 69. Smittipat N, Miyahara R, Juthayothin T, Billamas P, Dokladda K, Imsanguan W. Indo-Oceanic Mycobacterium tuberculosis strains from Thailand associated with higher mortality. Int J Tuberc Lung Dis. 2019.
  70. 70. Hailu S, Hurst C, Cyphers G, Thottunkal S, Harley D, Viney K, et al. Prevalence of extra-pulmonary tuberculosis in Africa: a systematic review and meta-analysis. Trop Med Int Health. 2024;29(4):257–65. pmid:38263374
  71. 71. Baykan AH, Sayiner HS, Aydin E, Koc M, Inan I, Erturk SM. Extrapulmonary tuberculosis: an old but resurgent problem. Insights Imaging. 2022;13(1):39. pmid:35254534
  72. 72. Yang Z, Kong Y, Wilson F, Foxman B, Fowler AH, Marrs CF, et al. Identification of risk factors for extrapulmonary tuberculosis. Clin Infect Dis. 2004;38(2):199–205. pmid:14699451
  73. 73. Leeds IL, Magee MJ, Kurbatova EV, del Rio C, Blumberg HM, Leonard MK, et al. Site of extrapulmonary tuberculosis is associated with HIV infection. Clin Infect Dis. 2012;55(1):75–81. pmid:22423123
  74. 74. Wetzstein N, Drummer A-P, Bockey A, Herrmann E, Küpper-Tetzel CP, Graf C, et al. Occurrence of extrapulmonary tuberculosis is associated with geographical origin: spatial characteristics of the Frankfurt TB cohort 2013-2018. Infection. 2023;51(3):679–87. pmid:36181634
  75. 75. Click ES, Moonan PK, Winston CA, Cowan LS, Oeltmann JE. Relationship between Mycobacterium tuberculosis phylogenetic lineage and clinical site of tuberculosis. Clin Infect Dis. 2012;54(2):211–9. pmid:22198989
  76. 76. Du DH, Geskus RB, Zhao Y, Codecasa LR, Cirillo DM, van Crevel R, et al. The effect of M. tuberculosis lineage on clinical phenotype. medRxiv. 2023:2023.03.14.23287284. pmid:36993190
  77. 77. Negrete-Paz AM, Vázquez-Marrufo G, Vázquez-Garcidueñas MaS. Whole-genome comparative analysis at the lineage/sublineage level discloses relationships between Mycobacterium tuberculosis genotype and clinical phenotype. PeerJ. 2021;9:e12128.
  78. 78. Rachwal N, Idris R, Dreyer V, Richter E, Wichelhaus TA, Niemann S, et al. Pathogen and host determinants of extrapulmonary tuberculosis among 1035 patients in Frankfurt am Main, Germany, 2008-2023. Clin Microbiol Infect. 2025;31(3):425–32. pmid:39528087
  79. 79. Saelens JW, Sweeney MI, Viswanathan G, Xet-Mull AM, Jurcic Smith KL, Sisk DM, et al. An ancestral mycobacterial effector promotes dissemination of infection. Cell. 2022;185(24):4507-4525.e18. pmid:36356582
  80. 80. Holt KE, McAdam P, Thai PVK, Thuong NTT, Ha DTM, Lan NN, et al. Frequent transmission of the Mycobacterium tuberculosis Beijing lineage and positive selection for the EsxW Beijing variant in Vietnam. Nat Genet. 2018;50(6):849–56. pmid:29785015
  81. 81. Dixit A, Ektefaie Y, Kagal A, Freschi L, Karyakarte R, Lokhande R. Drug resistance and epidemiological success of modern Mycobacterium tuberculosis lineages in western India. J Infect Dis. 2024.
  82. 82. Sobkowiak B, Banda L, Mzembe T, Crampin AC, Glynn JR, Clark TG. Bayesian reconstruction of Mycobacterium tuberculosis transmission networks in a high incidence area over two decades in Malawi reveals associated risk factors and genomic variants. Microb Genom. 2020;6(4):e000361. pmid:32234123
  83. 83. Freschi L, Vargas R Jr, Husain A, Kamal SMM, Skrahina A, Tahseen S, et al. Population structure, biogeography and transmissibility of Mycobacterium tuberculosis. Nat Commun. 2021;12(1):6099. pmid:34671035
  84. 84. Walker TM, Choisy M, Dedicoat M, Drennan PG, Wyllie D, Yang-Turner F, et al. Mycobacterium tuberculosis transmission in Birmingham, UK, 2009-19: an observational study. Lancet Reg Health Eur. 2022;17:100361. pmid:35345560
  85. 85. Menardo F. Understanding drivers of phylogenetic clustering and terminal branch lengths distribution in epidemics of Mycobacterium tuberculosis. Elife. 2022;11:e76780. pmid:35762734
  86. 86. Guinat C, Vergne T, Kocher A, Chakraborty D, Paul MC, Ducatez M, et al. What can phylodynamics bring to animal health research? Trends Ecol Evol. 2021;36(9):837–47. pmid:34034912
  87. 87. Glynn JR, Alghamdi S, Mallard K, McNerney R, Ndlovu R, Munthali L, et al. Changes in Mycobacterium tuberculosis genotype families over 20 years in a population-based study in Northern Malawi. PLoS One. 2010;5(8):e12259. pmid:20808874
  88. 88. Long R, Croxen M, Lee R, Doroshenko A, Lau A, Asadi L, et al. The association between phylogenetic lineage and the subclinical phenotype of pulmonary tuberculosis: a retrospective 2-cohort study. J Infect. 2024;88(2):123–31. pmid:38104727
  89. 89. McIvor A, Koornhof H, Kana BD. Relapse, re-infection and mixed infections in tuberculosis disease. Pathog Dis. 2017;75(3):10.1093/femspd/ftx020. pmid:28334088
  90. 90. Asare P, Osei-Wusu S, Asante-Poku A, Otchere I, Prah D, Borrell S. Evidence of exogenous and endogenous re-infection with Mycobacterium tuberculosis complex strains among pulmonary TB patients with recurring TB episodes in Ghana; a call for intensifying TB monitoring. Maryland, USA, 2019.
  91. 91. Viljoen S, Pienaar E, Viljoen HJ. A state-time epidemiology model of tuberculosis: importance of re-infection. Comput Biol Chem. 2012;36:15–22. pmid:22340441
  92. 92. Loiseau C, Windels EM, Gygli SM, Jugheli L, Maghradze N, Brites D, et al. The relative transmission fitness of multidrug-resistant Mycobacterium tuberculosis in a drug resistance hotspot. Nat Commun. 2023;14(1):1988. pmid:37031225
  93. 93. Gagneux S, Burgos MV, DeRiemer K, Encisco A, Muñoz S, Hopewell PC, et al. Impact of bacterial genetics on the transmission of isoniazid-resistant Mycobacterium tuberculosis. PLoS Pathog. 2006;2(6):e61. pmid:16789833
  94. 94. Fenner L, Bodmer T, Altpeter E, Zwahlen M, Jaton K, Pfyffer GE. Effect of mutation and genetic background on drug resistance in Mycobacterium tuberculosis. Antimicrob Agents Chemother. 2012.
  95. 95. Xiao Y-X, Liu K-H, Lin W-H, Chan T-H, Jou R. Whole-genome sequencing-based analyses of drug-resistant Mycobacterium tuberculosis from Taiwan. Sci Rep. 2023;13(1):2540. pmid:36781938
  96. World Health Organization. Catalogue of mutations in Mycobacterium tuberculosis complex and their association with drug resistance, second edition. Geneva. 2023.
  97. Wang L, Lim DR, Phelan JE, Reyes LT, Palparan AG, Sanchez MGC, et al. Whole genome sequencing analysis of Mycobacterium tuberculosis reveals circulating strain types and drug-resistance mutations in the Philippines. Sci Rep. 2024;14(1):19602. pmid:39179783
  98. Shanmugam SK, Kumar N, Sembulingam T, Ramalingam SB, Selvaraj A, Rajendhiran U, et al. Mycobacterium tuberculosis lineages associated with mutations and drug resistance in isolates from India. Microbiol Spectr. 2022;10(3):e0159421. pmid:35442078
  99. Bateson A, Ortiz Canseco J, McHugh TD, Witney AA, Feuerriegel S, Merker M, et al. Ancient and recent differences in the intrinsic susceptibility of Mycobacterium tuberculosis complex to pretomanid. J Antimicrob Chemother. 2022;77(6):1685–93. pmid:35260883
  100. Rupasinghe P, Reenaers R, Vereecken J, Mulders W, Cogneau S, Merker M, et al. Refined understanding of the impact of the Mycobacterium tuberculosis complex diversity on the intrinsic susceptibility to pretomanid. Microbiol Spectr. 2024;12(3):e0007024. pmid:38334384
  101. Merker M, Kohl TA, Barilar I, Andres S, Fowler PW, Chryssanthou E, et al. Phylogenetically informative mutations in genes implicated in antibiotic resistance in Mycobacterium tuberculosis complex. Genome Med. 2020;12(1):27. pmid:32143680
  102. Li S, Poulton NC, Chang JS, Azadian ZA, DeJesus MA, Ruecker N, et al. CRISPRi chemical genetics and comparative genomics identify genes mediating drug potency in Mycobacterium tuberculosis. Nat Microbiol. 2022;7(6):766–79. pmid:35637331
  103. Morris RP, Nguyen L, Gatfield J, Visconti K, Nguyen K, Schnappinger D, et al. Ancestral antibiotic resistance in Mycobacterium tuberculosis. Proc Natl Acad Sci U S A. 2005;102(34):12200–5. pmid:16103351
  104. TB Vaccine Clinical Pipeline: Working Group on New TB Vaccines [cited 2025 April 2]. Available from: https://newtbvaccines.org/tb-vaccine-pipeline/clinical-phase/
  105. Clinical Pipeline: Working Group on New TB Drugs [cited 2025 April 25]. Available from: https://www.newtbdrugs.org/pipeline/clinical