Basic Workflow in SeqSphere+

Start SeqSphere+ and Log in

The SeqSphere+ software consists of a client application and a server application (see installation). The server contains an integrated database and can be accessed by multiple clients from different locations, using different user accounts. When the SeqSphere+ client is started, it shows a login screen, where a named user account is used to log in to the SeqSphere+ server. After a successful login, the home screen is shown. It includes configurable shortcuts for frequently used functions, the most recently created (or favorite) comparison table snapshots, and recent pipeline reports. At the top of the home screen, a welcome text is shown; it can be replaced with custom text, including images.

Define a Project

Create a Project first. A Project contains one or more Task Templates and is, in most cases, species specific. A Task Template describes the targets to be analyzed and the processing to be performed, such as automatic quality checks (e.g., minimum coverage) and genotyping. Predefined Task Templates are available in SeqSphere+ for MLST, cgMLST, serotyping, and resistance/virulence. Customized Task Templates for other workflows can be created with a step-by-step dialog. Ridom SeqSphere+ is resequencing software: once a Project has been set up this way, hundreds or thousands of sequence data sets can be analyzed automatically.

Process Sequence Data

Use the function Process Assembled Genome Data to process assembled whole genome sequence (WGS) data files in FASTA, ACE, BAM, or GenBank format. Raw read files in FASTQ (Phred+33) format can be assembled (or mapped) and processed by defining and starting a highly automated SeqSphere+ pipeline. Sanger sequencing data (e.g., SCF or ABI chromatogram files) can also be processed. First, the target scan procedure searches the WGS data (with BLAST) for the targets that are defined in the task templates of the project in use.
If the matching sequences reach a threshold defined by the task template, they are imported as targets of the sample. Second, the target QC procedure checks the reliability of the imported target sequences. Third, the genotyping analyses defined in the task templates (e.g., MLST, cgMLST) are performed; the results are kept in the genotyping result fields of the sample and stored in the database on the SeqSphere+ server.

Import Epi Metadata

Ridom SeqSphere+ can import epi metadata from MS Excel and CSV files. The Excel file must contain a simple data table in which each row holds the values for one Sample. One column of the table must contain the Sample ID. This Sample ID is used to match the data to existing Samples or to create new ones. The Sample ID must be unique for each Sample in a Project.

Searching and Retrieving Stored Samples

The samples are stored in the database of the SeqSphere+ server. The search samples function can be used to find samples by Sample ID or other metadata fields and to load them into the workspace for manual editing.

Compare and Visualize Genotyping Results and Metadata

Finally, the comparison table function can be used for advanced analysis of the genotyping results together with the metadata. The data can be compared and visualized using neighbor joining trees, minimum spanning trees, epi curves, and geographical maps. The content and layout of a comparison table and minimum spanning tree can be stored in a snapshot for later reuse.

Data Objects in SeqSphere+

The Sample is the central data object in SeqSphere+. Each Sample belongs to one Project.
Each Sample can have Task Entries that are specified by the Task Templates of its Project.
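The relationships between these data objects can be sketched in a few lines of Python. This is only an illustrative model, not the SeqSphere+ database schema: all class, field, and method names (`Project`, `upsert_sample`, `result_fields`, and so on) are assumptions made for this example. It also illustrates the rule that a Sample ID is unique within a Project and is used to match imported metadata to an existing Sample or to create a new one.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TaskTemplate:
    """Describes the targets to analyze (hypothetical, simplified)."""
    name: str
    targets: list[str] = field(default_factory=list)


@dataclass
class TaskEntry:
    """Holds the results of one Task Template for one Sample."""
    template: TaskTemplate
    result_fields: dict[str, str] = field(default_factory=dict)


@dataclass
class Sample:
    sample_id: str
    metadata: dict[str, str] = field(default_factory=dict)
    task_entries: list[TaskEntry] = field(default_factory=list)


@dataclass
class Project:
    name: str
    task_templates: list[TaskTemplate]
    # Keyed by Sample ID, so an ID can occur only once per Project.
    samples: dict[str, Sample] = field(default_factory=dict)

    def upsert_sample(self, sample_id: str, metadata: dict[str, str]) -> Sample:
        """Match a metadata row to an existing Sample by ID, or create a new one."""
        sample = self.samples.get(sample_id)
        if sample is None:
            # Simplification (assumption): a new Sample gets one Task Entry
            # per Task Template of its Project.
            sample = Sample(
                sample_id=sample_id,
                task_entries=[TaskEntry(template=t) for t in self.task_templates],
            )
            self.samples[sample_id] = sample
        sample.metadata.update(metadata)
        return sample


# Usage: two metadata rows with the same Sample ID end up on one Sample.
mlst = TaskTemplate("MLST", targets=["geneA", "geneB"])
project = Project("Example species", task_templates=[mlst])
project.upsert_sample("S-001", {"Country": "DE"})
project.upsert_sample("S-001", {"Year": "2015"})
```

Keying the samples by Sample ID makes the uniqueness rule structural: a second row with the same ID can only update the existing Sample, never duplicate it.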
Example: Target QC Procedure

The targets are automatically checked for the quality issues that were defined in the Target QC Procedure of the Task Template. Each target is then rated with a state.
A list of all error positions in the imported data can be shown and used to review those positions. For Ion Torrent data in particular, a reference-based auto-correction of homopolymer-related insertion/deletion errors is integrated.

Genotyping Libraries

Each Task Template can have one or more Genotyping Libraries to perform typing or identification (e.g., MLST and cgMLST). After the queries of a Task Entry have been performed, a result state is assigned. The results of a genotyping (e.g., allele types) are stored in the result fields of the Task Entry.
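To make the idea of a genotyping query concrete, the following sketch shows the general principle of MLST-style typing: each target sequence is looked up in an allele library to obtain an allele type, and the combination of allele types is looked up in a profile table to obtain a sequence type (ST). It is a simplified illustration, not the SeqSphere+ implementation. The library contents, the names `ALLELE_LIBRARY`, `PROFILES`, and `genotype`, and the use of exact string matching are all assumptions made for this example.

```python
# Hypothetical allele library: target name -> {sequence: allele type}.
ALLELE_LIBRARY = {
    "geneA": {"ATGGCA": 1, "ATGGCC": 2},
    "geneB": {"TTGACA": 1, "TTGACG": 5},
}

# Hypothetical profile table: allele types (in target order) -> sequence type.
PROFILES = {(1, 1): 10, (2, 5): 42}


def genotype(target_sequences):
    """Return the result fields for one sample as a dict.

    Targets that are missing or have no exact match in the library
    get None, and so does the sequence type if the profile is unknown.
    """
    result_fields = {}
    for target, alleles in ALLELE_LIBRARY.items():
        sequence = target_sequences.get(target)
        result_fields[target] = alleles.get(sequence)
    profile = tuple(result_fields[target] for target in ALLELE_LIBRARY)
    result_fields["ST"] = PROFILES.get(profile)
    return result_fields


# Usage: a sample with known alleles, and one with an unknown geneB allele.
typed = genotype({"geneA": "ATGGCC", "geneB": "TTGACG"})
untyped = genotype({"geneA": "ATGGCA", "geneB": "TTGAAA"})
```

In this sketch, a value of `None` plays the role of a "no result" outcome; the actual result states used by SeqSphere+ are not reproduced here.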