This function can be invoked via Tools > Fix rep/dnaA Start and Orientation for Plasmids/Chromosomes in the menu. It searches the sequences in FASTA files for replication initiation genes (dnaA for chromosomes, rep for plasmids) and re-orients the sequences so that they start at the position of the respective gene. This can be useful for the visualization of sequence comparisons.

The assignment of a contig to either plasmid or chromosome is based on the size of the contig: contigs from 3000 to 500,000 bases are considered plasmids, larger contigs are considered chromosomes, and contigs shorter than 3000 bases are ignored.

By default, the function assumes that each contig represents a circular sequence. If the box Look for [topology=circular] in FASTA contig header instead of assuming that contig(s) are circular is checked, only contigs with the term [topology=circular] in their header are considered circular. If the box Skip non-circular contigs is checked, contigs without [topology=circular] in their headers are left unchanged.

BLAST is used to search for hits within a dnaA library for chromosomes and for hits within a rep library for plasmids. If multiple hits are found, the hits that contain rep_cluster in their name are sorted to the end. Afterwards, the best hit is used. If the best hit begins within the first 50 bases of the contig, no changes are made. If a best hit is found, further processing depends on the circularity of the contig:
Used libraries
for dnaA (chromosomes): dnaA library (amino acid) from Circlator
for rep (plasmids): rep library (nucleotide) from MOB-suite

BLAST parameters
The following BLAST parameters are used:
for dnaA (amino acid library): matrix = BLOSUM62, word size = 3, mismatch penalty = -3, match reward = 1, gap open costs = 11, gap extension costs = 1
for rep (nucleotide library): word size = 11, mismatch penalty = -3, match reward = 1, gap open costs = 5, gap extension costs = 2
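
The size-based classification, the circularity check, and the hit selection described at the start of this section can be illustrated with a minimal Python sketch. It is not the tool's actual implementation. All names (classify_contig, is_circular, Hit, pick_best_hit, needs_reorientation) are hypothetical. The sketch also assumes that "best hit" means the hit with the highest bit score and that hit positions are 1-based; neither is stated above.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_PLASMID_LEN = 3000       # contigs shorter than this are ignored
MAX_PLASMID_LEN = 500_000    # contigs longer than this count as chromosomes
START_TOLERANCE = 50         # a best hit starting within these bases needs no change
TOPOLOGY_TAG = "[topology=circular]"


def classify_contig(length: int) -> Optional[str]:
    """Return 'plasmid', 'chromosome', or None (contig is ignored)."""
    if length < MIN_PLASMID_LEN:
        return None
    if length <= MAX_PLASMID_LEN:
        return "plasmid"
    return "chromosome"


def is_circular(header: str, use_header_tag: bool = False) -> bool:
    """By default every contig is assumed circular; with use_header_tag
    only contigs whose FASTA header carries the topology tag are."""
    if not use_header_tag:
        return True
    return TOPOLOGY_TAG in header


@dataclass
class Hit:
    name: str        # name of the library entry that was hit
    start: int       # 1-based start of the hit on the contig (assumption)
    bitscore: float  # ranking criterion (assumption)


def pick_best_hit(hits: List[Hit]) -> Optional[Hit]:
    """Sort hits named '...rep_cluster...' to the end, then take the first.
    Ranking by descending bit score within each group is an assumption."""
    if not hits:
        return None
    ranked = sorted(hits, key=lambda h: ("rep_cluster" in h.name, -h.bitscore))
    return ranked[0]


def needs_reorientation(hit: Hit) -> bool:
    """No change is made if the best hit begins within the first 50 bases."""
    return hit.start > START_TOLERANCE
```

For example, for a 120,000-base contig, classify_contig returns "plasmid", so the rep library would be searched. If the best hit starts at base 1200, needs_reorientation returns True.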