GPAS pipeline relatedness service for TB

R Spies(1) J Westhead(1) D W Crook(1) T E Peto(1) T M Walker(1)

1: The University of Oxford

Whole genome sequencing (WGS) of Mycobacterium tuberculosis complex (MTBC) provides data that can be used to investigate the relatedness between groups of isolates at high resolution. The Global Pathogen Analysis Service (GPAS) is a user-friendly, cloud-based bioinformatics tool that assembles and analyses MTBC WGS data and reports relatedness.

To investigate relatedness, we used the 11 well-described MIRU-VNTR-defined clusters reported in Lancet Infectious Diseases in 2013. GPAS generated a “Relatedness” CSV file containing a pairwise matrix of all uploaded sequences within 20 single nucleotide polymorphisms (SNPs) of each other. Additional outputs included a table listing all samples within 20 SNPs of the reference sample, a bar graph showing the number of samples at each SNP distance from the reference sample (up to 20 SNPs), and a neighbor-joining phylogenetic tree of the samples within 20 SNPs of the reference sample. The table, bar graph and phylogenetic tree included interactive links, enabling detailed inspection of related sequences at an individual level. Using the GPAS outputs, we were able to reconstruct the previously described cluster, with SNP distances in very close agreement with those originally reported.
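The outputs described above can be illustrated with a minimal sketch. It is not the GPAS implementation: the function names, the toy sequences and the choice to skip positions with ambiguous base calls are all assumptions made for illustration. It shows how a thresholded pairwise SNP table, a per-reference neighbor table and the counts behind a distance bar graph could be derived from sequences already aligned to a common reference.

```python
from collections import Counter
from itertools import combinations

SNP_THRESHOLD = 20  # cut-off quoted in the abstract


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Assumption: positions where either sequence is not A/C/G/T are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b)
               if x != y and x in "ACGT" and y in "ACGT")


def pairs_within_threshold(seqs, threshold=SNP_THRESHOLD):
    """Return {(name_a, name_b): distance} for pairs at or below the threshold."""
    result = {}
    for p, q in combinations(sorted(seqs), 2):
        d = snp_distance(seqs[p], seqs[q])
        if d <= threshold:
            result[(p, q)] = d
    return result


def neighbours_of(reference, seqs, threshold=SNP_THRESHOLD):
    """List (name, distance) for samples near the reference, closest first."""
    rows = [(name, snp_distance(seqs[reference], s))
            for name, s in seqs.items() if name != reference]
    return sorted((r for r in rows if r[1] <= threshold),
                  key=lambda r: (r[1], r[0]))


def distance_histogram(neighbours):
    """Count samples at each SNP distance (the data behind a bar graph)."""
    return Counter(d for _, d in neighbours)


# Hypothetical toy alignment; real MTBC genomes are about 4.4 million bases.
toy = {
    "ref": "ACGTACGTAC",
    "s1":  "ACGTACGTAT",
    "s2":  "ACGTACGAAT",
    "s3":  "TTTTACGTAC",
}

if __name__ == "__main__":
    near = neighbours_of("ref", toy, threshold=2)
    print(near)
    print(distance_histogram(near))
```

On the toy data a threshold of 2 is used in place of 20 so that the filtering is visible. A neighbor-joining tree could then be built from the same distances with a phylogenetics library.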

In conclusion, GPAS is an effective and user-friendly tool for analyzing the relatedness between MTBC whole genome sequences. The tool’s relatedness function may be particularly valuable in public health, facilitating detailed outbreak investigation and enhancing epidemiological surveillance capabilities.

