OR16
Complete genomes reveal unprecedented genomic diversity of Mycobacterium tuberculosis complex from global to within-host variation
A M García-Marín(1,8) M Torres-Puente(1) L Martínez-Priego(2) G De Marco(2) M A Moreno-Molina(1) M Hunt(4,5) Z Iqbal(4,6) Valencia Region TB Working Group(7) M G López(1) F González-Candelas(8,9) J Alonso-del-Real(1) I Comas(1,9)
1:Instituto de Biomedicina de Valencia - Spanish National Research Council; 2:Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunitat Valenciana FISABIO; 3:EB House Austria; 4:European Bioinformatics Institute; 5:Nuffield Department of Medicine, University of Oxford; 6:Milner Centre for Evolution, University of Bath; 7:Conselleria Sanitat, Generalitat Valenciana; 8:Joint Research Unit ‘Infección y Salud Pública’ FISABIO, University of Valencia-I2SysBio; 9:CIBER Epidemiology and Public Health
Sequencing technologies have provided key insights into the epidemiology and pathogenesis of the Mycobacterium tuberculosis complex (MTBC), but these analyses are hampered by the technical limitations of conventional short-read sequencing and by the low genetic diversity of the MTBC. In particular, regions encoding key MTBC host-pathogen interaction genes, such as the pe/ppe genes, remain poorly characterized, limiting the analysis of host selection on them. We used long-read sequencing of 216 MTBC clinical isolates from Valencia, Spain, to construct high-quality complete genomes, which we were able to generate solely from long-read HiFi sequencing data.
These genomes revealed previously inaccessible diversity hotspots, often linked to pe/ppe genes. A median of 312 (-1 to 792) SNPs is gained per pairwise comparison. Hotspots in pe/ppe genes are driven both by structural variation and by point mutations mediated mainly by gene conversion events, contrary to what is observed in the rest of the genome. However, not all pe/ppe genes are the same, and here we show the diversity and gene conversion landscape across the entire pe/ppe repertoire. Paradoxically, we show that the epitopes in pe/ppe genes are hyperconserved, with some salient exceptions that happen to be part of vaccine candidates in clinical trials. Furthermore, our results provide new insights for redefining transmission boundaries based on genetic distances. Finally, using a complete genome from the same patient as a reference reveals a substantial reduction in non-fixed variation when mapping short reads from serial isolates, prompting the need to reassess our current estimates of within-host diversity and the dynamics of Mycobacterium tuberculosis during infection.
