Overview

The Mash plasmid typing Task Template compares plasmids that were identified by MOB-suite in the Chromosome & Plasmid Overview Task Template, using the program Mash [PubMed 27323842]. Other Samples can be searched for similar plasmids whose Mash distance is below a specified threshold. Subsets for plasmid typings can be defined based on AMR targets.

Note that plasmid typing is usually only useful for sequence data with (nearly) complete contigs and plasmids (e.g. long-read data). Using plasmid typing with short-read data (e.g. Illumina reads) may result in incomplete or incorrect results. Furthermore, Mash does not take the synteny of the plasmids into consideration. However, the synteny can be checked by using the pyGenomeViz tool.

Requirements

The Mash plasmid typing requires the Chromosome & Plasmid Overview Task Template for a Sample in order to assign contigs to plasmids or to the chromosome. If plasmids are to be filtered by AMR targets, the NCBI AMRFinderPlus Task Template is also required.

Important: Mash Plasmid Typing requires the Long-read Data Plasmid Transmission Analysis Module.

Mash Plasmid Databases

The plasmid typing uses the external program Mash to create and search sketch-databases for plasmids. In this manual, these sketch-databases are also called Mash databases.

A plasmid typing is stored as a Genotyping Library for a Task Template and the associated Mash database is stored as Task Template attachment. Each plasmid typing has a single Mash database but a plasmid can be in multiple Mash databases.

The Mash databases are built using a fixed k-mer size of 21 and a selectable (1,000 or 10,000) sketch-size.
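
As an illustration of these settings, the following minimal sketch builds a command line for the Mash command-line tool with the fixed k-mer size and one of the two selectable sketch sizes. The options -k, -s and -o are regular options of mash sketch; the helper name, the file names, and the assumption that the software calls Mash in exactly this way are hypothetical.

```python
from typing import List

KMER_SIZE = 21                          # fixed k-mer size stated in this manual
ALLOWED_SKETCH_SIZES = (1000, 10000)    # the two selectable sketch sizes


def mash_sketch_command(fasta_path: str, out_prefix: str,
                        sketch_size: int = 1000) -> List[str]:
    """Build an argument list for `mash sketch` (hypothetical helper).

    The list can be passed to subprocess.run() on a machine where the
    Mash executable is installed and on the PATH.
    """
    if sketch_size not in ALLOWED_SKETCH_SIZES:
        raise ValueError(f"sketch size must be one of {ALLOWED_SKETCH_SIZES}")
    return [
        "mash", "sketch",
        "-k", str(KMER_SIZE),      # k-mer size
        "-s", str(sketch_size),    # number of hashes kept per sketch
        "-o", out_prefix,          # output prefix; Mash appends ".msh"
        fasta_path,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(mash_sketch_command("plasmid_AA001.fasta", "plasmid_AA001")))
```

A larger sketch size gives a finer-grained distance estimate at the cost of a larger database and slower searches.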

The Mash plasmid databases and settings can be edited in the plasmid typing editor. The database content can be edited in the Mash Plasmid Databases management window, which can be accessed via Options > Mash Plasmid Databases.

Different plasmid typings can be defined and filtered based on their AMR content. To make use of this function, an NCBI AMRFinderPlus Task Entry must be present for the Sample.

To create a new Task Template with a new Mash plasmid database use the menu command File > New > Create Task Template. Then select Create Task Template for Whole Genome Sequencing Data and then Create Task Template for Plasmid Mash Database. This will create a new Task Template with a plasmid typing and a plasmid database.

To use the newly created Task Template in a Project, use the menu command Options > Projects. Then select the Project in the list on the left and click the Add button. Select the created Task Template in the list and click OK.

Task Entry Overview

Mash Genotyping Result View

The task entry overview contains a tab for each plasmid of the Sample with a table showing all hits from the typing's Mash database up to the selected threshold. Each table contains the source plasmid, i.e. the plasmid belonging to the Sample that is currently opened, highlighted with an orange background.

The table is ordered by:

  • hit has same genus as query Sample (if column "Genus" is available)
  • hit has same species as query Sample (if column "Species" is available)
  • cgMLST distance to query Sample (in increasing distance; if column "cgMLST Distance" is available)
  • Genus name (if column "Genus" is available)
  • Species name (if column "Species" is available)
  • MLST Sequence Type (ST) (if available)
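
The ordering criteria above can be read as one composite sort key. The following is a minimal sketch of such a key; the Hit class and its field names are hypothetical, and placing hits with a missing cgMLST distance after hits with a known distance is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Hit:
    """Hypothetical container for one row of the hit table."""
    plasmid: str
    genus: Optional[str] = None
    species: Optional[str] = None
    cgmlst_distance: Optional[int] = None
    st: Optional[str] = None


def sort_key(hit: Hit, query_genus: str, query_species: str) -> Tuple:
    """Composite key following the documented order of criteria."""
    return (
        hit.genus != query_genus,        # False sorts first: same genus on top
        hit.species != query_species,    # then same species
        hit.cgmlst_distance is None,     # assumption: unknown distances last
        hit.cgmlst_distance if hit.cgmlst_distance is not None else 0,
        hit.genus or "",                 # then genus name
        hit.species or "",               # then species name
        hit.st or "",                    # finally MLST Sequence Type
    )


hits = [
    Hit("pA", "Escherichia", "coli", 5, "131"),
    Hit("pB", "Klebsiella", "pneumoniae", None, "258"),
    Hit("pC", "Escherichia", "coli", 2, "10"),
    Hit("pD", "Escherichia", "fergusonii", None, None),
]
ordered = sorted(hits, key=lambda h: sort_key(h, "Escherichia", "coli"))
print([h.plasmid for h in ordered])
```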

The table contains the following columns:

  • Plasmid Name: Taken from the Chromosome & Plasmid Overview.
  • Sample ID
  • Project: Project for the Sample
  • Distance: Distance calculated by Mash without length compensation
  • Length difference (%): Difference of the length of the source and target plasmid in %
  • Matching hashes: Number of matching hashes as calculated by Mash
  • Sample columns with Sample's Collection Date, City of Isolation, ZIP of Isolation, Genus and Species
  • Further Sample data with the default Comparison Table fields for the source Sample's Project
  • Table columns from the Chromosome & Plasmid Overview
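
To show how the Distance and Matching hashes columns relate, the following minimal sketch applies the distance formula from the Mash publication, D = -1/k * ln(2j / (1 + j)), where j is the fraction of matching hashes and k is the k-mer size. The function name is hypothetical, and using the configured sketch size as the denominator is a simplification.

```python
import math


def mash_distance(matching_hashes: int, sketch_size: int, k: int = 21) -> float:
    """Estimate the Mash distance from the number of shared sketch hashes."""
    if not 0 <= matching_hashes <= sketch_size:
        raise ValueError("matching_hashes must be between 0 and sketch_size")
    if matching_hashes == 0:
        return 1.0                        # no shared hashes: maximum distance
    j = matching_hashes / sketch_size     # Jaccard index estimate
    d = -math.log(2 * j / (1 + j)) / k
    return min(1.0, max(0.0, d))          # keep the result within [0, 1]


# Example: 980 of 1,000 hashes match.
print(mash_distance(980, 1000))
```

With a sketch size of 1,000, the smallest non-zero distance that can be resolved corresponds to 999 matching hashes, which is one reason to choose the larger sketch size for very similar plasmids.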

Error messages are displayed if not all requirements are met:

  • If the Chromosome & Plasmid Overview Task Entry does not contain plasmids, the error message No plasmids found. will be displayed.
  • The error message Cannot find result of MOB Suite! is displayed if the Sample does not contain a Chromosome & Plasmid Overview Task Entry.
  • If plasmids are to be filtered by AMR results, the Sample must contain an NCBI AMRFinderPlus Task Entry, otherwise the error message No plasmids for mode 'Priority AMR targets carrying plasmids' found! will be displayed.
  • Task Template X is not readable! is displayed when the user does not have the rights to view the Task Template for the Typing.

A description of the available commands for the table can be found in the section Plasmid Table.

When a plasmid typing is performed, the sequence of a query plasmid is added to the Mash database if it is not yet in the database. If the Sample data has changed, the old sequence is removed from the Mash database and the new plasmid sequence is added. Note that plasmids are not automatically removed from a plasmid database if a Sample or a Task Entry containing the plasmid is deleted from the SeqSphere server database.

Result Fields

One result field is available for each plasmid typing: Top Plasmid Match. It contains the name of the plasmid hit with the lowest length-compensated distance to the current source plasmid. If multiple plasmids share the lowest distance, their names are sorted alphabetically and the first name is used. For each source plasmid, the result field contains the text source plasmid name: target plasmid name (dist: compensated distance). For multiple source plasmids, "|" is used as the separator.
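
The selection and formatting rule for this result field can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name and input structure are hypothetical, and the number formatting of the distance (six decimal places) as well as the absence of spaces around "|" are assumptions.

```python
from typing import Dict, List, Tuple


def top_plasmid_match(hits_by_source: Dict[str, List[Tuple[str, float]]]) -> str:
    """Build a result string of the form 'source: target (dist: x)|...'.

    hits_by_source maps each source plasmid name to a list of
    (target plasmid name, length-compensated distance) pairs.
    """
    parts = []
    for source, hits in hits_by_source.items():
        if not hits:
            continue                      # assumption: sources without hits are skipped
        # Lowest distance first; ties are broken alphabetically by name.
        name, dist = min(hits, key=lambda h: (h[1], h[0]))
        parts.append(f"{source}: {name} (dist: {dist:.6f})")
    return "|".join(parts)


example = {
    "pA": [("pZ", 0.0005), ("pB", 0.0005), ("pC", 0.001)],
    "pD": [("pX", 0.0)],
}
print(top_plasmid_match(example))
```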

Plasmid Early Warning Alerts

New Early Warning Alerts (EWAs) for plasmid databases can be defined using the command Options > Mash Plasmid Databases. Unlike the Early Warning Alerts for cgMLST, plasmid EWAs are defined for a Typing and not for a Project.

Plasmid Transmission EWAs are triggered if a Sample that is processed by the pipeline using the plasmid typing has plasmids with a Mash distance below the distance threshold to another plasmid in the database.

If a Sample contains more than one plasmid, multiple EWAs can be created for this Sample, one for each plasmid.

However, to prevent plasmid alerts for clonal transmissions, no alert is reported for a plasmid hit if the respective Samples have a cgMLST distance below the specified cluster threshold. If more than one cgMLST typing is present in a Sample, only the cgMLST typing scheme that is listed first in the Project definition is considered. If no cgMLST typing is available for a species Project, each intra-species match triggers an alert.
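
The alert rule described in this section can be summarized as a small decision function. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, both comparisons are treated as strict ("below"), and a cgMLST distance of None stands for the case where no cgMLST typing is available.

```python
from typing import Optional


def triggers_plasmid_ewa(mash_distance: float,
                         mash_threshold: float,
                         cgmlst_distance: Optional[int],
                         cluster_threshold: int) -> bool:
    """Decide whether one plasmid hit should raise a Plasmid Transmission EWA."""
    if not mash_distance < mash_threshold:
        return False      # plasmids are not similar enough
    if cgmlst_distance is not None and cgmlst_distance < cluster_threshold:
        return False      # Samples look clonal: suppress the plasmid alert
    return True           # similar plasmid in a non-clonal (or untyped) Sample


# Similar plasmids in unrelated strains: alert.
print(triggers_plasmid_ewa(0.0004, 0.001, cgmlst_distance=250, cluster_threshold=10))
# Similar plasmids in clonally related strains: no alert.
print(triggers_plasmid_ewa(0.0004, 0.001, cgmlst_distance=3, cluster_threshold=10))
```

A Sample with several plasmids would be evaluated once per plasmid, which is why one Sample can produce more than one EWA.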

Note: The client that runs the pipeline requires the Linux tools to trigger a plasmid EWA.

Opening Plasmid Early Warning Alerts

The three most recent unchecked plasmid EWAs are shown at the top of the home screen in the section Unchecked Plasmid Transmission Early Warning Alerts. Clicking one of the EWAs opens a Plasmid Table with the plasmid hits. The table contains the default Project fields for the Sample with the plasmid that triggered the EWA, the MLST ST fields for all plasmids in the EWA, and the MOB-Suite results for each plasmid.


Plasmid EWAs are also listed in the Browse EWA window, which can be opened using the command Browse Early Warning Alerts from the Options menu.

Plasmid Searches

The Mash plasmid typing uses the Chromosome & Plasmid Overview Task Entry of a Sample to identify the plasmids and retrieve the contigs for each plasmid. Each plasmid can be used to manually query the created Mash databases for hits based on Mash distances. The nearest hits in the database up to a user-defined threshold (default = 0.001) are reported for each plasmid. A size compensation can be applied, where the calculated Mash distance is lowered by a user-specified value. The values entered for the plasmid search can differ from those in the plasmid typing definition.
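
The interplay of threshold and size compensation can be sketched as follows. This is only an illustration under assumptions: the compensation is modeled as a plain subtraction that is clamped at zero, the threshold is treated as inclusive ("up to"), and all names are hypothetical.

```python
from typing import Dict, List, Tuple


def search_hits(raw_distances: Dict[str, float],
                threshold: float = 0.001,
                compensation: float = 0.0) -> List[Tuple[str, float]]:
    """Return (plasmid name, compensated distance) pairs within the threshold.

    raw_distances maps each database plasmid to its uncompensated
    Mash distance to the query plasmid.
    """
    hits = []
    for name, distance in raw_distances.items():
        compensated = max(0.0, distance - compensation)   # assumed compensation rule
        if compensated <= threshold:
            hits.append((name, compensated))
    # Nearest hits first; ties are ordered by name.
    return sorted(hits, key=lambda h: (h[1], h[0]))


raw = {"pA": 0.0004, "pB": 0.0018, "pC": 0.0030}
print([name for name, _ in search_hits(raw)])                      # without compensation
print([name for name, _ in search_hits(raw, compensation=0.001)])  # with compensation
```

In this example, the compensation brings pB under the threshold, while pC remains too distant to be reported.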

A plasmid search can be invoked from the Chromosome & Plasmid Overview using the Search Similar Plasmids in Mash Database function. Alternatively, it can be started from the EWA plasmid table or the Typing Result View. The resulting plasmid table contains the default Project fields for the Sample, the MLST ST fields for all found plasmids, and the MOB-Suite results for each plasmid.

Plasmid Tables

Plasmid EWAs, the results of a plasmid similarity search, and stored plasmid table snapshots are displayed in a Plasmid Table.

Plasmid Clustering

Multiple selected plasmids in a table can be clustered using Single Linkage Clustering of the Mash sketches. The Mash distance threshold for a cluster can be defined.
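
Single linkage clustering joins two plasmids into the same cluster whenever a chain of pairwise distances at or below the threshold connects them. The following minimal sketch illustrates this with a union-find structure; the function name, the input format, and the inclusive threshold are assumptions, and the pairwise Mash distances are taken as given.

```python
from typing import Dict, Iterable, List, Tuple


def single_linkage_clusters(names: Iterable[str],
                            distances: Dict[Tuple[str, str], float],
                            threshold: float) -> List[List[str]]:
    """Group plasmids by single linkage using pairwise Mash distances."""
    names = list(names)
    parent = {name: name for name in names}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), distance in distances.items():
        if distance <= threshold:
            parent[find(a)] = find(b)     # merge the two clusters

    clusters: Dict[str, List[str]] = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


pairwise = {
    ("p1", "p2"): 0.0005,
    ("p2", "p3"): 0.0008,
    ("p1", "p3"): 0.0020,   # above the threshold, but p1 and p3 are linked via p2
    ("p3", "p4"): 0.0100,
}
print(single_linkage_clusters(["p1", "p2", "p3", "p4"], pairwise, threshold=0.001))
```

Because of this chaining effect, two plasmids in the same cluster can have a direct distance above the threshold.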