P073
UShER for Mycobacterium tuberculosis: an evaluation of pandemic-scale tools' capacity to perform transmission surveillance
F J Martínez-Martínez(1) L Karim(2) M G López(1) R Corbett-Detig(2) I Comas(1,3)
1:Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, CSIC, Valencia, 46010 Spain; 2:Department of Biomolecular Engineering and Bioinformatics, UC Santa Cruz Genomics Institute, UCSC, Santa Cruz, 95064, United States; 3:CIBER de Epidemiología y Salud Pública, CIBERESP, Madrid, Spain
Whole-genome sequencing (WGS) of Mycobacterium tuberculosis (MTB) is increasingly standard in health services. The growing volume of data allows phylogenies to be reconstructed within a broader genomic context, giving higher resolution on transmission dynamics but also increasing computational demands. UShER revolutionized SARS-CoV-2 phylogenomics by using sample placement to reconstruct massive phylogenies, and it has now been extended to MTB. We tested its capacity to reconstruct large phylogenies and to place new sequences into transmission clusters. To evaluate the robustness of UShER, we first identified transmission clusters in the Valencian Region, Spain, using a population-based dataset collected between 2014 and 2019. To recover transmission clusters, we built a regional phylogeny (N=1455) and another that also included global strains (N=39676), and applied a 10-SNP threshold. To test the accuracy with which UShER assigns strains to their corresponding transmission clusters, we removed the samples collected in the Valencian Region between 2017 and 2019 (N=729) from both phylogenies and used phylogenetic placement to reincorporate them. In the regional phylogeny, 212 clustered samples were placed: 188/212 (88.68%) fell in the correct cluster, 12/212 (5.66%) fell in the cluster neighborhood, and 11/212 (5.19%) were misplaced. In the global phylogeny, 209 clustered samples were placed: 167/209 (79.90%) fell in the correct cluster, 30/209 (14.35%) fell near their cluster, and 12/209 (5.74%) were misplaced. UShER places new sequences with high accuracy regardless of the size of the phylogeny, providing a reliable tool for transmission surveillance.
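As an illustration of how a SNP threshold such as the 10-SNP cutoff above can define transmission clusters, the following minimal sketch groups samples by single linkage on pairwise SNP distances. It is not the procedure used in this work, which derived clusters from the phylogenies. The function name, the input format, the toy distances, and the choice of an inclusive cutoff (distance of 10 or fewer SNPs) are all assumptions made for the example.

```python
SNP_THRESHOLD = 10  # cutoff stated in the abstract; inclusive use is an assumption


def cluster_by_snp_threshold(distances, threshold=SNP_THRESHOLD):
    """Group samples into clusters by single linkage (hypothetical helper).

    distances: dict mapping frozenset({a, b}) to a pairwise SNP distance.
    Returns a list of sets, keeping only groups with two or more samples.
    """
    samples = sorted({s for pair in distances for s in pair})
    parent = {s: s for s in samples}

    def find(s):
        # Follow parent links to the group representative, compressing the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Merge the groups of any two samples within the threshold.
    for pair, dist in distances.items():
        a, b = tuple(pair)
        if dist <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for s in samples:
        groups.setdefault(find(s), set()).add(s)
    return [g for g in groups.values() if len(g) > 1]


# Toy pairwise distances (invented for illustration, not study data).
toy = {
    frozenset({"A", "B"}): 3,
    frozenset({"B", "C"}): 9,
    frozenset({"A", "C"}): 12,
    frozenset({"A", "D"}): 40,
    frozenset({"B", "D"}): 41,
    frozenset({"C", "D"}): 38,
}
clusters = cluster_by_snp_threshold(toy)
print(clusters)
```

Under single linkage, samples A and C end up in the same cluster even though they differ by 12 SNPs, because each is within the threshold of B. Whether that chaining is desirable depends on the surveillance question.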