  • Ramprasad
    Junior Member
    • Jun 2011
    • 7

    Problems in assembling mitochondrial genome

    Hi all,

    I have assembled a genome (from Illumina data) for a non-model species using ALLPATHS, and I'm searching for the mitochondrial genome. I have a list of 13 proteins that should be encoded on the mitochondrion, but I have been unable to locate them in any sensible manner in this assembly (I searched with the proteins using exonerate, spaln, BLAST and BLAT).

    For example, blasting these proteins against the assembled genome finds only one of them, and I'm getting similar results with the other approaches. I want to know why they are not showing up. I expected the mitochondrial genome to be assembled in a single contig (about 20 kb in length), but it's puzzling that not even fragments are showing up.

    Has anyone ever run into this problem with their assembly? Or does anyone have any idea what is going on here?

    Any help would be appreciated.

    Thanks and regards,
    Ram
  • maubp
    Peter (Biopython etc)
    • Jul 2009
    • 1544

    #2
    Did you do any pre-filtering? The mitochondrial reads might have been lost (e.g. due to much higher coverage, or very different %GC).


    • colindaven
      Senior Member
      • Oct 2008
      • 417

      #3
      Hi,

      recently had a project like this for some plants.

      Observations
      1) mitochondria are NOT easy to assemble. Expect 100+ contigs, perhaps ~400, with Illumina data. I had PacBio data which did not assemble to one contig either, more like 30-60.

      2) mitochondria are highly variable in size, but as far as I know none are 20kb. See for example
      NCBI Virus is a community portal for viral sequence data from RefSeq, GenBank and other NCBI repositories.


      3) as maubp suggested, perhaps take these 13 genes as a nucleotide FASTA and map the raw reads against them prior to assembly. Are they covered?
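A quick sanity check for point 3, before running a real aligner, is to ask what fraction of each gene's k-mers turn up in the raw reads at all. The sketch below is a toy in pure Python, not a replacement for a read mapper: the function names (`gene_kmer_support` and friends), the k-mer size and the in-memory sequences are all illustrative assumptions, and real data would be streamed from FASTA/FASTQ files.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]

def canonical(kmer):
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    rc = revcomp(kmer)
    return kmer if kmer <= rc else rc

def kmers(seq, k):
    """Yield canonical k-mers, skipping any window that contains N or other symbols."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield canonical(kmer)

def gene_kmer_support(genes, reads, k=21):
    """Return {gene: (fraction of gene k-mers seen in reads, mean k-mer count)}.

    `genes` maps a name to a nucleotide sequence; `reads` is an iterable of
    read sequences. Both are hypothetical inputs for this sketch.
    """
    read_counts = Counter()
    for read in reads:
        read_counts.update(kmers(read, k))

    report = {}
    for name, seq in genes.items():
        gene_kmers = list(kmers(seq, k))
        if not gene_kmers:
            report[name] = (0.0, 0.0)  # gene shorter than k
            continue
        hits = [read_counts[km] for km in gene_kmers]
        found = sum(1 for h in hits if h > 0) / len(hits)
        report[name] = (found, sum(hits) / len(hits))
    return report
```

If most genes show a fraction near zero, the mitochondrial reads are probably missing or too divergent from the bait sequences; a high fraction with a very high mean count would instead point at a coverage problem in the assembler.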

      Maybe the data quality isn't good enough.

      cheers,
      Colin


      • maubp
        Peter (Biopython etc)
        • Jul 2009
        • 1544

        #4
        Re (3), I didn't explicitly suggest that, but it's a good idea worth trying. You could also try mapping against mitochondrial sequences from the closest published relatives.


        • Brian Bushnell
          Super Moderator
          • Jan 2014
          • 2709

          #5
          Originally posted by maubp:
          "You could also try mapping against mitochondrial sequences from the closest published relatives."
          I second this; you might be able to grab most of the mito reads that way. Alternatively, I suggest you try normalizing the data prior to assembly, to drop the mito coverage down to a level similar to the rest of the genome - that makes it much easier to assemble, and typically yields a superior assembly for things with extremely high coverage.
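To illustrate what coverage normalization means here, below is a toy sketch of one common k-mer based scheme: keep a read only while the median count of its k-mers is still under a target depth. This is an assumption-laden illustration and not how any particular normalization tool works internally; `normalize_reads`, the k-mer size and the target depth are hypothetical choices, and reverse complements are ignored to keep it short.

```python
from collections import Counter
from statistics import median

def read_kmers(read, k):
    """All overlapping k-mers of a read (forward strand only in this sketch)."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def normalize_reads(reads, k=21, target_depth=40):
    """Down-sample high-coverage regions by discarding redundant reads.

    A read is kept only if the median count of its k-mers, among the reads
    kept so far, is below `target_depth`. Reads from regions that are already
    deeply covered (e.g. the mitochondrion) are therefore dropped, while reads
    from low-coverage regions pass through unchanged.
    """
    counts = Counter()
    kept = []
    for read in reads:
        kms = read_kmers(read.upper(), k)
        if not kms:
            continue  # read shorter than k, nothing to score
        if median(counts[km] for km in kms) < target_depth:
            kept.append(read)
            counts.update(kms)
    return kept
```

With a few hundred copies of a mitochondrial read and a handful of nuclear reads as input, the output would retain roughly `target_depth` copies of the former and all of the latter, which is the levelling effect described above.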


          • Linnea
            Member
            • Mar 2010
            • 23

            #6
            I also agree, first try to get the mitochondrial reads.

            We had exactly the same problem when assembling our non-model organism (1 Gb genome, mitochondrion ~16 kb), and didn't get any mitochondrial contigs at all. It turned out that the tissue we used was so full of mitochondria that the read coverage was just too high for the assembler to handle.

            Instead, we mapped all reads to the closest mitochondrion we could find, and extracted a consensus from our mapped reads. Then we mapped our reads again against the consensus, corrected it, mapped again and continued in an iterative manner until the reads and the consensus matched perfectly. I think the software MITObim (https://github.com/chrishah/MITObim) works in a similar way (it wasn't released when we did this so I haven't tried it myself).
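The map / consensus / re-map loop described above can be sketched in a few lines. This is only a toy under strong assumptions: reads are placed without gaps by brute-force Hamming distance, the consensus is a simple majority vote, and every name (`iterate_consensus`, `max_mismatches`, `max_rounds`) is hypothetical. A real run would use a proper read mapper and consensus caller, and this is not a description of how MITObim is implemented.

```python
from collections import Counter

def best_placement(read, ref, max_mismatches):
    """Ungapped placement of `read` on `ref` with the fewest mismatches.

    Returns (start, mismatches), or None if no offset is within the limit.
    """
    best = None
    for start in range(len(ref) - len(read) + 1):
        window = ref[start:start + len(read)]
        mism = sum(a != b for a, b in zip(read, window))
        if mism <= max_mismatches and (best is None or mism < best[1]):
            best = (start, mism)
    return best

def call_consensus(ref, placements):
    """Majority base per column; uncovered positions keep the current base."""
    columns = [Counter() for _ in ref]
    for start, read in placements:
        for offset, base in enumerate(read):
            columns[start + offset][base] += 1
    return "".join(
        col.most_common(1)[0][0] if col else ref_base
        for col, ref_base in zip(columns, ref)
    )

def iterate_consensus(seed, reads, max_mismatches=3, max_rounds=10):
    """Repeat map -> consensus until the consensus stops changing."""
    consensus = seed
    for _ in range(max_rounds):
        placements = []
        for read in reads:
            hit = best_placement(read, consensus, max_mismatches)
            if hit is not None:
                placements.append((hit[0], read))
        updated = call_consensus(consensus, placements)
        if updated == consensus:
            break  # reads and consensus agree, nothing left to correct
        consensus = updated
    return consensus
```

Here `seed` plays the role of the closest available mitochondrial sequence. Each round lets reads that were previously too divergent from the reference map against the corrected consensus, which is why the loop can walk away from the starting sequence towards the sample's own mitochondrion.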

            Good luck!


            • colindaven
              Senior Member
              • Oct 2008
              • 417

              #7
              Interesting. Good suggestions everybody.

              Just stumbled across this program as well, which looks decent:
