454 + Illumina Combined Assembly

  • 454 + Illumina Combined Assembly

    Hi All,

    I am looking for suggestions on how to combine data from both the 454 and Illumina platforms. Briefly, I have 454-sequenced BACs with an average insert size of 150 kb, covered to approximately 30X, and an Illumina 100 bp PE genome of the same species covered to ~25X. I am interested in obtaining the least-gapped assembly of the BAC sequences only, not a completely assembled genome of this species.

    The only assembler I have attempted is Velvet, with -long libraries in addition to the -shortPaired library, and this does not produce an assembly of comparable quality to my 454-only Newbler 2.5p1 assemblies of just the BAC sequences. Any advice is greatly appreciated.

    Best,
    Kyle

  • #2
    I wrote a blog post on this subject:
    http://pathogenomics.bham.ac.uk/blog...-and-454-data/

    I'd be inclined to add your Illumina reads to a Newbler 2.6 assembly as your starting point.

    • #3
      Thanks Nick,

      I had a look at your blog and I can see there are several alternatives. Just to clarify, when you say "add your Illumina reads to a Newbler 2.6 assembly", what program would you recommend for a first shot? Presumably not Newbler itself, but a de Bruijn assembler like Velvet?

      Best,
      Kyle

      • #4
        I mean run Newbler with both the 454 and Illumina reads. Version 2.6 supports Illumina FASTQ files and seems to do a pretty good job with the bacterial datasets I've tried it on. If you find it takes too long, consider using the -large flag.

        • #5
          Can Newbler really manage the quantity of data produced by Illumina? I have 171 million 100 bp PE reads. It was my understanding that typical OLC assemblers could not handle this.
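
To put the scale concern in numbers, here is a rough back-of-the-envelope sketch. It assumes the figures from posts #1 and #5 describe the same project, and that "171 million" counts individual reads rather than pairs (if it counts pairs, double the Illumina total). The variable names are illustrative only.

```python
# Rough data-volume arithmetic from the figures quoted in the thread.
# Assumption: 171 million is the number of individual 100 bp reads.
illumina_reads = 171_000_000
illumina_read_len = 100          # bp
bac_insert_size = 150_000        # bp, average BAC insert from post #1
bac_454_coverage = 30            # approximate 454 depth per BAC

illumina_bases = illumina_reads * illumina_read_len
bases_454_per_bac = bac_insert_size * bac_454_coverage

print(f"Illumina total:      {illumina_bases / 1e9:.1f} Gbp")
print(f"454 data per BAC:    {bases_454_per_bac / 1e6:.1f} Mbp")
print(f"Illumina : 454 ratio {illumina_bases // bases_454_per_bac}x per BAC")
```

Under those assumptions the whole-genome Illumina set is thousands of times larger than the 454 data for a single BAC, so most of it is irrelevant to a BAC-only assembly.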

          • #6
            For what it's worth, I can vouch for Newbler 2.6 with 1.2 million Titanium reads plus up to 2 million HiSeq 100 bp SE reads. It did a good job with eukaryotic cDNA.

            • #7
              I am not sure whether Newbler incorporates the FASTQ reads as part of the overlapping stage or whether it uses some kind of mapping assembly algorithm, but I would certainly give it a try and see how it goes. Obviously there's always a danger of running into memory or CPU issues with that many reads.

              • #8
                Since you only want the BAC portion in your final assembly, have you thought of filtering the Illumina data for reads that have, say, 50% identity to the 454 reads prior to a combined assembly?
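
One way such a pre-filter could be sketched is with shared k-mers rather than true alignment identity. This is a stand-in for the 50% identity idea, not the poster's method: a read pair is kept if at least `min_fraction` of the k-mers in either mate occur in the 454 data (either strand). The function names, `k=21`, and the 0.5 threshold are all assumptions for illustration. In practice a read mapper would probably be the more usual tool, and indexing the assembled 454 BAC contigs instead of raw reads would keep the k-mer set much smaller.

```python
from typing import Iterable, Iterator, Set, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every k-mer of seq (upper-cased), skipping any that contain N."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:
            yield kmer


def build_index(reference_seqs: Iterable[str], k: int) -> Set[str]:
    """Collect k-mers from both strands of the 454 reads or contigs."""
    index: Set[str] = set()
    for seq in reference_seqs:
        index.update(kmers(seq, k))
        index.update(kmers(reverse_complement(seq), k))
    return index


def shared_fraction(read: str, index: Set[str], k: int) -> float:
    """Fraction of the read's k-mers found in the index (0.0 if it has none)."""
    total = hits = 0
    for kmer in kmers(read, k):
        total += 1
        hits += kmer in index
    return hits / total if total else 0.0


def filter_pairs(
    pairs: Iterable[Tuple[str, str]],
    index: Set[str],
    k: int = 21,
    min_fraction: float = 0.5,
) -> Iterator[Tuple[str, str]]:
    """Keep a pair if either mate shares enough k-mers with the index."""
    for r1, r2 in pairs:
        best = max(shared_fraction(r1, index, k), shared_fraction(r2, index, k))
        if best >= min_fraction:
            yield r1, r2
```

Keeping the pair when only one mate matches is a deliberate choice here: the other mate may fall in a 454 gap, which is exactly where the Illumina data would help.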

                • #9
                  Nick has a good blog. I would go for MIRA3.

                  -veljo

                  • #10
                    MIRA3 has done a really good job here, assembling yeast 454 and Illumina PE 90x90 data.

                    Give it a try!
