Introduction
The S. enterica SISTR Geno-Serotyping task template is based on the Salmonella In Silico Typing Resource (SISTR), a bioinformatics tool for rapidly performing in silico serotyping analyses on draft Salmonella genome assemblies [PubMed 26800248]. The serovar prediction module of SISTR uses O (somatic) and H (flagellar) antigen probes and/or serogroup-specific probes, which provide serovar identification for about 90% (n = 2,190) of serovars. SISTR does not identify or report serovar variants that require biochemical or sub-speciation tests for full characterization.

In addition, SISTR performs a cgMLST analysis using a scheme of 330 targets defined by strict consensus (cgMLST330). SISTR uses the completeness of the cgMLST330 data to assess genome assembly quality (QC Status).

Because draft genome assemblies may yield incomplete data for the antigenic query, the algorithm incorporates logic that allows partial matching of the antigenic formula. Results with multiple possible serovars are resolved using the 'phylogenetic context': the query genome is compared against a wide cross-section of genomes from different serovars and subspecies, and the predominant serovar among the genomes within the same cgMLST330 cluster is taken as the most likely serovar. When a unique serovar is identified from the antigens alone, the SISTR serovar prediction pipeline is complete. The phylogenetic context from cgMLST330 is used for serovar prediction only when antigen-based geno-serotyping is not possible or is incomplete. Finally, the phylogenetic context method is also used to determine the Subspecies.
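The decision order described above (antigen result first, phylogenetic context only as a fallback) can be sketched as follows. This is a minimal illustration, not SISTR's actual code: the function name `predict_serovar`, its parameters, and the serovar labels are hypothetical, and the tie-breaking behavior is an assumption, since the text does not say how ties within a cluster are resolved.

```python
from collections import Counter
from typing import Optional, Sequence


def predict_serovar(
    antigen_candidates: Sequence[str],
    cluster_serovars: Sequence[str],
) -> Optional[str]:
    """Hypothetical sketch of the serovar decision order described in the text.

    antigen_candidates: serovars compatible with the (possibly partial)
        antigenic formula found in the assembly.
    cluster_serovars: serovars of the reference genomes that fall into the
        same cgMLST330 cluster as the query genome.
    """
    unique_candidates = set(antigen_candidates)

    # A unique antigen-based result ends the pipeline.
    if len(unique_candidates) == 1:
        return next(iter(unique_candidates))

    # Otherwise fall back to the phylogenetic context: take the predominant
    # serovar in the cluster. Assumption: ties go to the serovar seen first.
    if cluster_serovars:
        return Counter(cluster_serovars).most_common(1)[0][0]

    # Neither antigens nor cluster membership give an answer.
    return None


# Hypothetical usage with placeholder serovar labels.
print(predict_serovar(["SerovarA"], ["SerovarB", "SerovarB"]))
print(predict_serovar(["SerovarA", "SerovarB"], ["SerovarB", "SerovarB", "SerovarA"]))
```

In the first call the antigen result is unique, so the cluster is ignored; in the second, two serovars remain possible and the cluster majority decides.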
Task Entry Overview
When an S. enterica SISTR Geno-Serotyping task template is processed, the following information is shown in the task entry overview:
Result Fields
The task entry stores the following results in searchable database fields:
The Subspecies (cgMLST330) information is stored in the corresponding field in the 'epi characteristic' section of the epidemiological metadata.

Runtimes
The following table contains the measured SISTR (1.1.1) runtimes for various Salmonella finished and draft genomes on a dual-core desktop with 16 GB RAM (using Windows Subsystem for Linux).