De novo assembly - plant genome - read length + amount


  • De novo assembly - plant genome - read length + amount

    Hi,
    I am looking to do my first de novo assembly:
    Plant, diploid, genome size of just under 1 Gb, no reference, 2 samples, happy to start with only a draft Illumina assembly, genome NOT transcriptome.
    I would guess PE, but 50bp, 100bp or 300bp? 30x coverage? Would a MiSeq give sufficient data?
    Any feedback appreciated, thank you.

  • #2
    The longer the reads, the better. The amount of coverage you need likely depends on the ploidy and heterozygosity; a tetraploid organism may need over 4x the coverage of a haploid. But even for a haploid I would suggest aiming for at least 50x, and for an organism that's highly heterozygous, 50x per ploidy. Bear in mind that Illumina coverage is not very even, so there will be many places where the actual coverage is substantially below the average.

    We make most of our fungal assemblies with around 100x fragment library coverage, with 2x150bp HiSeq reads. I don't work much with plants directly, but I understand that our plant group tries to get 2x250bp MiSeq data because plants are typically bigger and more polyploid than fungi. For an optimal assembly you should use both fragment and long-mate-pair libraries, but that's more difficult (lab-wise) and much more expensive, and may not be needed for a decent assembly; it depends on the genome.

    So you MIGHT be able to get all the coverage you need from a single HiSeq lane, but you'll definitely need multiple MiSeq runs; however, MiSeq will give longer reads and thus a better assembly, at a higher cost per base pair.
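The coverage figures above translate into read counts with simple arithmetic: average coverage is total sequenced bases divided by genome size. Below is a minimal sketch of that calculation. The function names are made up for illustration, the 1 Gb genome size is rounded up from the original question, and 50x is the minimum suggested in this reply. It gives nominal average coverage only, before any trimming or filtering, and says nothing about how evenly that coverage is spread.

```python
import math


def read_pairs_needed(genome_size_bp, target_coverage, read_length_bp, paired=True):
    """Fragments (read pairs if paired=True) needed for a target average coverage.

    Illustrative helper, not part of any sequencing toolkit.
    """
    bases_needed = genome_size_bp * target_coverage
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return math.ceil(bases_needed / bases_per_fragment)


def mean_coverage(genome_size_bp, n_fragments, read_length_bp, paired=True):
    """Nominal average coverage delivered by a given number of fragments."""
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return n_fragments * bases_per_fragment / genome_size_bp


if __name__ == "__main__":
    genome = 1_000_000_000  # assumption: ~1 Gb, rounded up from "just under 1 gig"
    for read_len in (150, 250):
        pairs = read_pairs_needed(genome, 50, read_len)
        print(f"2x{read_len}bp at 50x: {pairs:,} read pairs")
```

Compare the resulting read-pair counts with the yield your sequencing core quotes for a lane or run to see how many you would need.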



    • #3
      Thank you Brian, that is extremely useful, I really appreciate this.



      • #4
        Hi

        It might also be advisable to consider the homozygosity. (If you already have some coverage, you can see this in k-mer plots.) You can also estimate it from whether this is an inbred line and whether or not the plant is self-compatible.
        If in doubt, go for longer reads. There is of course a trade-off between quality and length, but 50bp is definitely too short.
        In any case, unfortunately no two plant genomes are exactly the same. One major bugbear is repetitive/transposable elements of different sizes.

        We usually do one or two MiSeq runs (2x300) to get a feeling for the genome.

        Best Wishes
        Björn
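The k-mer plot mentioned in this post is a histogram of how many distinct k-mers occur once, twice, three times, and so on in the reads. As a rough guide, a mostly homozygous genome gives one main peak near the average k-mer coverage, while a heterozygous diploid typically adds a second peak at about half that depth. The sketch below shows the idea in plain Python on reads held in memory. The function names and the choice of k=21 are assumptions for illustration; for a real data set of this size you would use a dedicated k-mer counter instead.

```python
from collections import Counter

# Translation table for reverse-complementing DNA.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    revcomp = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, revcomp)


def kmer_spectrum(reads, k=21):
    """Map multiplicity -> number of distinct canonical k-mers seen that many times."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" in kmer:  # skip k-mers containing ambiguous bases
                continue
            counts[canonical(kmer)] += 1
    return Counter(counts.values())


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "ACGTAC"]  # toy input, not real data
    for multiplicity, n_kmers in sorted(kmer_spectrum(toy_reads, k=3).items()):
        print(multiplicity, n_kmers)
```

Plotting multiplicity on the x-axis against the number of distinct k-mers on the y-axis gives the k-mer plot; the large spike at very low multiplicity is mostly sequencing errors.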



        • #5
          Thanks Bjorn, helpful comments. I doubt we'll do any pre-sequencing, we will just commit to one type and then go for it, but your comments on 300bp match what I have read in other publications.



          • #6
            Hi Elsie

            If you have your own MiSeq, you can also get slightly longer runs. (We picked this up from here.) This is totally unsupported, though.

            björn



            • #7
              Hi Bjorn, thank you for that. I spoke with the core where we are going to do our sequencing and we have decided to go for a HiSeq run with 150bp paired ends (1 lane), as the quality on the current MiSeq 300bp PE drops off significantly, so I would lose a lot after trimming.
              Thank you.
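The point about losing data after trimming can be put in numbers: effective coverage is roughly nominal coverage times the fraction of bases that survive trimming. The sketch below uses a deliberately simple rule, removing bases from the 3' end while their Phred score is below a threshold. The function names, the threshold of 20 and the Phred+33 encoding are assumptions for illustration; real trimmers offer more elaborate rules, such as sliding windows.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]


def trimmed_length(scores, min_quality=20):
    """Length kept after removing trailing bases with Phred score < min_quality."""
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return end


def retained_fraction(quality_strings, min_quality=20):
    """Fraction of all sequenced bases that survive 3'-end trimming."""
    total = sum(len(q) for q in quality_strings)
    if total == 0:
        return 0.0
    kept = sum(trimmed_length(phred_scores(q), min_quality) for q in quality_strings)
    return kept / total


if __name__ == "__main__":
    # Toy quality strings: 'I' is Phred 40, '#' is Phred 2.
    toy = ["IIII##", "IIIIII"]
    fraction = retained_fraction(toy)
    nominal_coverage = 50  # hypothetical figure for illustration
    print(f"retained {fraction:.2f}, effective coverage ~{nominal_coverage * fraction:.1f}x")
```

Running this on a sample of quality strings from an earlier run on the same instrument would show how much of the nominal 2x300 read length is likely to survive.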
