P58
XBS-Nextflow: a comprehensive automated Mtb bioinformatics pipeline for low coverage contaminated clinical samples.
T H Heupink(1) L Verboven(1) A Sharma(3) R M Warren(2) A Van Rie(1)
1:University of Antwerp; 2:Stellenbosch University; 3:Liverpool John Moores University
Various pipelines exist for the analysis of whole-genome sequencing data from Mycobacterium tuberculosis (Mtb). Most pipelines require DNA extraction from purified sub-cultures to ensure sufficient Mtb DNA and little or no contamination with DNA of human or other microbial origin. Furthermore, most pipelines exclude the complex genomic regions, which account for about 10% of the genome, and few offer both variant calling for drug resistance and phylogeny for the study of Mtb transmission.
We designed the XBS pipeline to handle DNA extracted from early positive cultures or directly from sputum. XBS uses an advanced genetic variant calling core that can accurately distinguish true from false genetic variants in low-coverage and contaminated samples. Using simulated and clinical datasets, we previously demonstrated the high accuracy of XBS for variant calling in Mtb DNA extracted from sub-cultures and directly from sputum; here we show additional findings for early positive liquid cultures. XBS also partially unlocks the complex regions of the genome (e.g. PE/PPE) with the same sequencing accuracy as for well-established variants.
We recently implemented XBS in Nextflow to enable scalable and reproducible scientific workflows using software containers. XBS-Nextflow is now freely available for easy installation on local computers, servers with job schedulers, and cloud computers. Integrating other software add-ons into XBS-Nextflow produced a comprehensive Mtb bioinformatics pipeline for the analysis of large datasets, with automated mapping, quality control, variant calling and filtering, inference of drug resistance (using TB-profiler), (sub-)lineage designation, and annotated phylogenies for Mtb transmission analyses.
