MAGMA – A novel bioinformatics pipeline developed for integration of WGS in clinical care and tuberculosis control

T H Heupink(1) L Verboven(1) A Sharma(2) V Rennie(1) R M Warren(2) A Van Rie(1)

1:Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Wilrijk, Belgium; 2:DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa

Bioinformatics pipelines for Mycobacterium tuberculosis (Mtb) that were developed for research purposes often struggle to analyze Mtb DNA from clinical samples with low mycobacterial burden and high levels of contamination. Building on the XBS variant calling core, we created the open-source MAGMA (Maximum Accessible Genome for Mtb Analysis) pipeline. MAGMA is implemented in Nextflow, combines the strengths of several software packages, and is compatible with various computing systems. After quality control (median coverage ≥10x, coverage breadth ≥90%, mixed infections or NTM frequency <20%), the pipeline starts with an individual sample workflow in which minor, major and structural variants are identified. Major variants are subsequently called at cohort level and filtered using a machine learning approach. The accurate identification of variants in both regular and complex regions of the Mtb genome increases the genetic resolution by 9%. TBProfiler is used for lineage identification and drug resistance profiling based on the major and minor variants. ClusterPicker is used for cluster identification based on cut-offs of 5 and 12 SNPs. IQ-TREE is employed to construct a maximum likelihood phylogeny, and iTOL annotations are generated for visualization of the phylogenetic tree. A case study showed that MAGMA accurately analyzes data from early positive primary cultures with variable ratios of Mtb DNA to total DNA (mapping percentage 8% to 87%, median coverage 57x) and large amounts of contaminating sequences (poly-modal GC content distribution). MAGMA accurately analyzes clinical Mtb samples and provides users with a range of data and visual outputs that can guide precision medicine and precision public health interventions.
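
The quality control gate described above can be illustrated with a minimal Python sketch. It is not MAGMA's actual implementation: the class, field and function names are hypothetical, the coverage breadth threshold is assumed to be a minimum (at least 90%), and the "mixed infections or NTM frequency" criterion is modelled as a single fraction that must stay below 20%.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the abstract. Treating breadth as a minimum is an assumption.
MIN_MEDIAN_COVERAGE = 10.0        # median depth, in x
MIN_COVERAGE_BREADTH = 0.90       # fraction of the reference genome covered
MAX_MIXED_OR_NTM_FRACTION = 0.20  # must be strictly below this value


@dataclass
class SampleQC:
    """Per-sample QC metrics (hypothetical field names, for illustration only)."""
    name: str
    median_coverage: float
    coverage_breadth: float
    mixed_or_ntm_fraction: float


def qc_failures(sample: SampleQC) -> List[str]:
    """Return the reasons a sample fails QC; an empty list means it passes."""
    reasons = []
    if sample.median_coverage < MIN_MEDIAN_COVERAGE:
        reasons.append("median coverage below 10x")
    if sample.coverage_breadth < MIN_COVERAGE_BREADTH:
        reasons.append("coverage breadth below 90%")
    if sample.mixed_or_ntm_fraction >= MAX_MIXED_OR_NTM_FRACTION:
        reasons.append("mixed infection or NTM frequency of 20% or more")
    return reasons


def passes_qc(sample: SampleQC) -> bool:
    """A sample enters the individual sample workflow only if no check fails."""
    return not qc_failures(sample)


if __name__ == "__main__":
    # Made-up metric values, chosen only to exercise each branch.
    cohort = [
        SampleQC("sample_A", 57.0, 0.98, 0.02),
        SampleQC("sample_B", 8.0, 0.85, 0.30),
    ]
    for s in cohort:
        status = "PASS" if passes_qc(s) else "FAIL: " + "; ".join(qc_failures(s))
        print(s.name, status)
```

Returning the list of failure reasons, rather than a bare boolean, makes it easier to report why a clinical sample was excluded before variant calling.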
