Beyond the clone: A new ancestor for Mycobacterium tuberculosis and implications for host-specific genomic events
P Ruiz-Rodriguez(1), M Coscolla(1)
1: I2SysBio, University of Valencia-FISABIO Joint Unit, 46980 Paterna, Spain
Mycobacterium tuberculosis is an exceptionally virulent pathogen, often characterized as a monomorphic bacterium. In this study, we aimed to investigate the genetic diversity within the M. tuberculosis Complex (MTBC) to gain deeper insights into its pathogenicity. Although the MTBC has limited diversity, it comprises lineages that infect humans (L1-9) as well as those predominantly infecting animals (A1-4). Traditional genetic studies have primarily relied on Illumina mapping and a reference M. tuberculosis ancestor based on the H37Rv strain (L4), which might result in an incomplete representation of the full MTBC spectrum.
To gain a better understanding of MTBC genomic diversity, we constructed a comprehensive MTBC pangenome from 340 complete whole-genome sequences using a graph-based algorithm. We applied maximum-likelihood and Bayesian inference techniques to reconstruct the ancestral sequence of the MTBC.
We reconstructed a new, refined annotation for the MTBC ancestor, which we define as the pancestrome (the pangenome of the ancestor). Using the curated annotation of H37Rv, we corrected annotation errors introduced by automated annotation software. We discerned patterns of gene presence and absence across MTBC lineages, capturing the ancestral diversity leading to each clade. Additionally, we described within- and between-lineage diversity in terms of accumulated SNPs and indels. Moreover, we identified evolutionary selective pressures differentiating lineages within the complex, using all genes and epitopes identified in the pancestrome. Our results offer a new reference and annotation that capture the whole MTBC for genomic analysis, and they highlight the potential of host-specific genomic markers to identify unique virulence factors associated with specific hosts.
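As an illustration of the gene presence/absence comparison described above, the following is a minimal Python sketch and not the pipeline used in this study. It assumes each genome is already reduced to a lineage label and a set of gene names. The function name `lineage_gene_patterns`, the genome identifiers, and the gene names `geneA` to `geneD` are hypothetical placeholders, and the strict criterion used here (present in every genome of one lineage, absent from all others) is an assumption chosen for simplicity.

```python
from collections import defaultdict


def lineage_gene_patterns(genomes):
    """Summarise gene presence/absence by lineage.

    genomes: dict mapping genome id -> (lineage label, set of gene names).
    Returns (core, specific), where `core` holds genes present in every
    genome and `specific[lineage]` holds genes present in every genome of
    that lineage and in no genome of any other lineage.
    """
    by_lineage = defaultdict(list)
    for lineage, genes in genomes.values():
        by_lineage[lineage].append(set(genes))

    # Genes found in all genomes of a lineage, and in at least one of them.
    fixed = {lin: set.intersection(*sets) for lin, sets in by_lineage.items()}
    seen = {lin: set.union(*sets) for lin, sets in by_lineage.items()}

    # Core genes are fixed in every lineage.
    core = set.intersection(*fixed.values())

    specific = {}
    for lin in by_lineage:
        # Everything observed in any genome outside this lineage.
        elsewhere = set().union(*(seen[o] for o in by_lineage if o != lin))
        specific[lin] = fixed[lin] - elsewhere
    return core, specific


# Toy input with hypothetical genomes and gene names (not real data).
toy = {
    "g1": ("L4", {"geneA", "geneB", "geneC"}),
    "g2": ("L4", {"geneA", "geneB"}),
    "g3": ("A1", {"geneA", "geneD"}),
    "g4": ("A1", {"geneA", "geneD", "geneC"}),
}
core, specific = lineage_gene_patterns(toy)
print(sorted(core))            # ['geneA']
print(sorted(specific["L4"]))  # ['geneB']
print(sorted(specific["A1"]))  # ['geneD']
```

In this toy input, `geneC` is variable within both lineages, so it is reported neither as core nor as lineage-specific. A real analysis would likely relax the all-or-none criterion to tolerate annotation and assembly errors.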