OR12
TB-ANNOTATOR: A scalable web application that allows in-depth analysis of very large sets of publicly available Mycobacterium tuberculosis complex genomes.
G Senelle(1,2) C Guyeux(1,2) E Cambau(3,6) G Refrégier(4,5) C Sola(3,4)
1:FEMTO-ST Institute, UMR 6174 CNRS, DISC Computer Science Department, Univ. Franche-Comte (UFC); 2:Université de Bourgogne Franche-Comté; 3:INSERM UMR1137; 4:Université Paris-Saclay; 5:UMR 8079 CNRS-UPsay-AgroParisTech; 6:APHP, GHU Nord site Bichat, Service de mycobacteriologie specialisee et de reference
Tuberculosis remains one of the most threatening bacterial diseases in the world. Since the beginning of the NGS era, more than 160,000 Short Read Archives (SRAs) of the Mycobacterium tuberculosis complex have accumulated in public databases. Gathering this large amount of data could help us better understand this bacterium and fight tuberculosis. Once gathered, however, this mass of data must also be open to study, both in its entirety and in depth. We developed the “TB-Annotator” web application, which combines a database containing, at the time of writing, 102,000 quality-checked SRAs with a fully featured analysis platform to explore and query such a large amount of data. The objective is to present this platform, which is centered on the key notion of exclusivity, and to show its general functioning and its numerous capabilities (detection of single nucleotide variants, insertion sequences, deletion regions, spoligotyping, etc.). We compared TB-Annotator to existing platforms for the study of tuberculosis and showed that its objectives are original and currently have no equivalent. We will present the database on which it is based, together with the numerous advanced search queries and screening capabilities it offers, and we will detail the interest and originality of its phylogenetic tree navigation interface. We will end this presentation with examples of results made possible by TB-Annotator, followed by avenues for future improvement.