Mira: Contigs failing to collapse despite similarity

  • Mira: Contigs failing to collapse despite similarity

    I have been using MIRA to assemble PacBio data for a very small circular genome, and I have been seeing a strange result in the output. For several datasets, when the contigs are compared to the closest available reference, a large number of contigs in certain regions represent the same region of the genome.

    Even though these contigs overlap extensively, they are not joined into single contigs.

    The problem is especially obvious in one dataset, where the whole genome can be represented as two contigs that overlap extensively at both ends but are not collapsed into a single contig (shown in the attached MUMmer mapview output).

    I've been running MIRA with just the most basic settings (whole genome, denovo, accurate).

    My best theory for why this is happening is that errors are prevalent enough in the PacBio data that two distinct versions of the same sequence can each end up as a separate contig.
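
To make that theory concrete, here is a minimal sketch (not MIRA's actual algorithm) of how a single base difference can stop two contig ends from being recognised as the same sequence. The function name `end_overlap`, the thresholds, and the toy sequences are all made up for illustration. It only counts mismatches at fixed positions, so it does not handle insertions or deletions the way a real aligner would.

```python
def end_overlap(left, right, min_len=5, min_identity=0.9):
    """Find the longest suffix of `left` that matches a prefix of `right`.

    Returns (overlap_length, identity), or (0, 0.0) if no overlap of at
    least `min_len` bases reaches `min_identity`. Comparison is
    position-by-position (mismatches only, no indels).
    """
    max_len = min(len(left), len(right))
    # Try the longest candidate overlap first and shrink from there.
    for length in range(max_len, min_len - 1, -1):
        suffix = left[-length:]
        prefix = right[:length]
        matches = sum(1 for x, y in zip(suffix, prefix) if x == y)
        identity = matches / length
        if identity >= min_identity:
            return length, identity
    return 0, 0.0


# Toy contigs (hypothetical): the last 9 bases of contig_a equal the
# first 9 bases of contig_b_exact.
contig_a = "TTGCAAGGCTCATGCCGTA"
contig_b_exact = "CATGCCGTAGGATTCAA"
# Same contig with one substitution inside the overlap (C -> A).
contig_b_error = "CATGACGTAGGATTCAA"

exact = end_overlap(contig_a, contig_b_exact, min_len=5, min_identity=0.85)
tolerant = end_overlap(contig_a, contig_b_error, min_len=5, min_identity=0.85)
strict = end_overlap(contig_a, contig_b_error, min_len=5, min_identity=1.0)
```

With a tolerant identity threshold the overlap is still found despite the error, while a strict threshold finds nothing, which is the kind of behaviour that would leave two near-identical contigs unmerged.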

    I would love to hear any suggestions on how to properly collapse these contigs as I am worried I am missing valuable read and quality information by having identical regions represented by different contigs.

  • #2
    I have never used MIRA, so I cannot comment specifically on the why, but I have occasionally seen this when assembling PacBio data with HGAP and Celera Assembler. It is generally due to Celera Assembler conservatively breaking contigs based on some heuristic. To force the overlap, I generally use a simpler overlapper, such as minimus2, then resequence and call a consensus with Quiver to check for the introduction of any misassemblies.
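
As a rough illustration of what forcing the overlap amounts to for the two-contig circular case described in the first post, the sketch below joins two contigs once the overlap lengths at both ends are known. It is a toy stand-in, not what minimus2 or Quiver actually do: `join_circular` and the example sequences are hypothetical, the overlap lengths are assumed to come from a separate alignment step, and the first contig's copy of each overlap is kept as-is without building a consensus. The resequencing and consensus step is still needed to check the result.

```python
def join_circular(contig_a, contig_b, overlap_ab, overlap_ba):
    """Join two contigs that together cover a circular genome.

    overlap_ab: bases at the end of contig_a that also start contig_b.
    overlap_ba: bases at the end of contig_b that also start contig_a.
    contig_a's copy of each overlapping region is kept; the duplicated
    bases are trimmed from both ends of contig_b.
    """
    if overlap_ab < 0 or overlap_ba < 0:
        raise ValueError("overlap lengths must be non-negative")
    if overlap_ab + overlap_ba > len(contig_b):
        raise ValueError("overlaps are longer than contig_b")
    # The part of contig_b that contig_a does not already contain.
    b_unique = contig_b[overlap_ab:len(contig_b) - overlap_ba]
    return contig_a + b_unique


# Toy circular genome (hypothetical) and two contigs cut from it so that
# they overlap by 3 bases at each end.
genome = "ACGTTGCAAGGCTCATGC"
contig_a = genome[:10]               # "ACGTTGCAAG"
contig_b = genome[7:] + genome[:3]   # "AAGGCTCATGCACG"

joined = join_circular(contig_a, contig_b, overlap_ab=3, overlap_ba=3)
```

Here `joined` reproduces the toy genome as one linear sequence, with each overlapping region represented once.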


    • #3
      Use the MIRA mailing list to get a quick reply and a solution from the authors.
