  • strob
    Member
    • Nov 2008
    • 84

    low 454 coverage combined with high solexa coverage

    Hi,

    Does anybody have experience combining the following two datasets:

    1X coverage of 454 reads (backbone)
    30X coverage of solexa reads

    Background: we are working with an unsequenced plant genome, so I would use the 1X 454 reads as a backbone for the Solexa reads in a de novo assembly.

    Question: is 1X 454 coverage a waste of money in this case, or a real help for the assembly? Does anybody have experience with this?
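
For a rough feel of what 1X versus 30X means, here is a minimal sketch based on the standard Poisson (Lander-Waterman) approximation. It assumes reads land uniformly at random, which a repetitive plant genome will not strictly satisfy, and the function name is only illustrative.

```python
import math


def fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes read starts are uniformly random (Poisson model), so the
    chance that a given base is missed entirely is exp(-coverage).
    """
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    for label, depth in (("454", 1.0), ("Solexa", 30.0)):
        print(f"{label} at {depth:g}X: {fraction_covered(depth):.4f} of bases covered")
```

Under this model a 1X run leaves roughly a third of the bases untouched, so it would give a fragmented backbone rather than a continuous one.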
  • jimmybee
    Senior Member
    • Sep 2010
    • 119

    #2
    How repetitive is your plant genome?


    • natstreet
      Member
      • Nov 2009
      • 83

      #3
      I don't have a good answer but this is something of a hot topic to me as we are doing much the same, although I have higher 454 coverage.

      For plants a big factor can be how polymorphic your species is as well as the repeat structure.

      In general, I would be really interested to know how people are effectively integrating 454 and Illumina data. Do you assemble each dataset on its own and then combine those assemblies, or do you assemble all the data together? In either case, what assemblers are you using?


      • strob
        Member
        • Nov 2008
        • 84

        #4
        highly repetitive....
        We have the Illumina dataset available, but we are thinking of adding a low-coverage 454 set. I think we can do three things:
        - all de novo (hybrid assembly)
        - Illumina de novo assembly, then map the result back onto the 454 reads
        - map the Illumina reads directly to the 454 reads

        Before doing this, I want to know if a 454 run will bring additional information.
        Tools? I was thinking of MIRA


        • jimmybee
          Senior Member
          • Sep 2010
          • 119

          #5
          If it's highly repetitive (my definition of highly would be >80%), then a 1X coverage run wouldn't be particularly effective, nor would it complement the Illumina data in a hybrid assembly. You'll need to figure out a few things, like how finished you want the sequence to be and what information you want out of the assembly (e.g. just a good assembly of the genes, or of the repeats as well).

          To answer natstreet: hybrid assemblies with different types of data are the way to go for repetitive genomes (such as cereal crops). We've found that integrating different types of data (paired end/fragment), different insert sizes and different read lengths can be very beneficial to the assembly.


          • natstreet
            Member
            • Nov 2009
            • 83

            #6
            Quote from jimmybee (#5): "Hybrid assemblies with different types of data are the way to go for repetitive genomes (such as cereal crops). We've found that integrating differing types of data (paired end/fragment), different insert sizes and read lengths can be very beneficial to the assembly."

            I have shotgun 454, paired-end 454 and a range of paired-end Illumina libraries, as well as a mate-pair library. I haven't yet found an assembler that can take all of the data for a hybrid assembly on any machine that I have access to. Velvet and MIRA both accept both types of data but have huge RAM requirements and are simply impractical to run. For hybrid cereal assemblies, what software are you using?


            • jimmybee
              Senior Member
              • Sep 2010
              • 119

              #7
              Velvet. I feel your pain regarding the RAM requirements. We only just got something that can handle them. I've compiled SOAPdenovo and EULER-SR but have yet to play around with them.


              • glacerda
                Member
                • Aug 2008
                • 27

                #8
                It is crucial to correct your reads prior to assembly (using the SOAPdenovo correction tool, SHREC or another corrector). This will save memory in the assembly stage.
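
To illustrate why correction saves memory, here is a toy sketch. It is not the SOAPdenovo or SHREC algorithm, just a simplified k-mer frequency filter; the function names and the `min_count` threshold are assumptions made for the example. The point is that a single base error creates up to k new, rare k-mers that the assembler would otherwise have to store.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_kmers(counts, min_count=2):
    """Return k-mers seen fewer than min_count times (likely sequencing errors)."""
    return {kmer for kmer, n in counts.items() if n < min_count}


if __name__ == "__main__":
    # Five error-free copies of a read plus one copy with a single substitution.
    reads = ["ACGTACGGTCA"] * 5 + ["ACGTACTGTCA"]
    counts = count_kmers(reads, k=5)
    weak = weak_kmers(counts)
    print(f"distinct k-mers before filtering: {len(counts)}")
    print(f"distinct k-mers after dropping weak ones: {len(counts) - len(weak)}")
```

Real correctors go further and try to repair the read rather than just discard rare k-mers, but the memory argument is the same.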

                Also, SOAPdenovo uses much less memory than Velvet, although in my personal experience Velvet produces slightly better assemblies.

                Finally, don't forget to optimize the parameters, especially the k-mer size. It has a great influence on memory use, run time and assembly quality.
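
A small sketch of the k-mer size trade-off on a toy sequence, assuming nothing about any particular assembler's internals. The helper `kmer_summary` is hypothetical: it reports how many distinct k-mers there are (a rough proxy for graph size and memory) and how many of them occur more than once (a rough proxy for repeat ambiguity).

```python
from collections import Counter


def kmer_summary(sequence, k):
    """Return (distinct k-mers, distinct k-mers occurring more than once)."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    repeated = sum(1 for n in counts.values() if n > 1)
    return len(counts), repeated


if __name__ == "__main__":
    # Toy sequence containing two copies of the same 8-base repeat unit.
    toy = "TTGACCGTAGGCATCCAAGACCGTAGGTTC"
    # Only odd k values are swept here, since Velvet expects an odd k.
    for k in range(3, 12, 2):
        distinct, repeated = kmer_summary(toy, k)
        print(f"k={k}: {distinct} distinct k-mers, {repeated} repeated")
```

On real data you would run the same kind of sweep with the assembler itself and compare the resulting contig statistics, since longer k resolves more repeats but needs more coverage per k-mer.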
