
  • Advice for de novo sequencing strategy


    We are considering sequencing the genome of one of our filamentous fungi, and would like some advice and to hear your experiences with some of the different platforms.

    The aim of the sequencing is to find a specific gene cluster involved in the biosynthesis of an interesting compound. The fungus has an expected genome size of 30 Mb, based on the most closely related fungus with a fully sequenced genome.

    Of course the best option is some sort of hybrid strategy; however, we don't have the budget for that at the moment. So instead we will just try to sequence on one platform.

    So my questions are:
    1. Which platform would you use for de novo sequencing of a 30 Mb filamentous fungus? (At the moment GS-FLX and Illumina HiSeq seem to be the best options, with the HiSeq being the best and cheapest choice.)

    2. We had an offer for sequencing on the HiSeq. However, the coverage was above 2000x, making the price higher than we expected. How much coverage would you recommend as a minimum in this case?

    3. For the assembly, do any of you have experience with the quality difference between assemblies done with CLC and, for instance, ABySS?

    Any help would be appreciated.


  • #2
    You could do a 2x250 bp MiSeq run. This could get you around 250X coverage, with longer reads than the HiSeq and a much lower price than the FLX+.
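For anyone wanting to check figures like this, coverage is roughly total sequenced bases divided by genome size. The sketch below is only that arithmetic; the function names are made up for illustration, and it assumes every read is full length and usable, which real runs will not quite achieve.

```python
import math


def read_pairs_for_coverage(genome_size_bp, target_coverage, read_length_bp):
    """Read pairs needed so that (total bases / genome size) reaches the target.

    Assumes paired-end reads of equal length and no loss to filtering or trimming.
    """
    bases_needed = genome_size_bp * target_coverage
    bases_per_pair = 2 * read_length_bp
    return math.ceil(bases_needed / bases_per_pair)


def coverage_from_pairs(genome_size_bp, n_pairs, read_length_bp):
    """Average coverage delivered by n_pairs paired-end reads."""
    return n_pairs * 2 * read_length_bp / genome_size_bp


GENOME_SIZE = 30_000_000  # 30 Mb, the expected genome size from the first post

# Pairs needed for 250X with 2x250 bp reads.
pairs_250x = read_pairs_for_coverage(GENOME_SIZE, 250, 250)
print(pairs_250x)
```

So 250X on a 30 Mb genome with 2x250 bp reads works out to about 15 million read pairs, before any losses to quality filtering.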


    • #3
      With any paired-end Illumina run, a useful strategy is to size the library so that the two reads of each pair overlap in the middle; there are a number of tools (such as FLASH) which will then fuse them into a single, longer, higher-quality read.
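To make the sizing point concrete: two reads of length L from a fragment of length F share about 2L - F bases, so the fragments have to be shorter than twice the read length for any overlap to exist. A minimal sketch of that geometry follows; the function names and the 10 bp minimum overlap are illustrative assumptions, not the defaults of any particular merging tool.

```python
def expected_overlap(read_length_bp, fragment_length_bp):
    """Bases covered by both mates of a pair.

    If the fragment is shorter than one read, both reads span the whole
    fragment, so the overlap is the fragment itself. If the fragment is
    longer than two reads, the mates never meet and the overlap is 0.
    """
    overlap = min(fragment_length_bp, 2 * read_length_bp - fragment_length_bp)
    return max(0, overlap)


def can_merge(read_length_bp, fragment_length_bp, min_overlap_bp=10):
    """True if the expected overlap reaches a chosen threshold.

    min_overlap_bp is a hypothetical cutoff picked for this example.
    """
    return expected_overlap(read_length_bp, fragment_length_bp) >= min_overlap_bp


# 2x100 reads on a 180 bp fragment overlap by 20 bp; on 300 bp they do not meet.
print(expected_overlap(100, 180), can_merge(100, 180))
print(expected_overlap(100, 300), can_merge(100, 300))
```

In practice fragment sizes form a distribution around the target, so only part of the library will merge even when the mean is chosen well.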

      100-200X should be quite sufficient. HiSeq will be cheaper than MiSeq, but with shorter read lengths, and cheaper only if you can ride along with someone else. Running it on 454 is probably going to be quite pricey in comparison.

      It's hard to know which assembler will behave best until you try it on your data, and you'll need some sort of "ground truth" to judge them by. I like Ray a lot, as it can distribute very large jobs across multiple nodes in a cluster (I believe ABySS can do this also, but Velvet cannot).
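Short of a real ground truth, a common first-pass comparison between assemblies is a contiguity statistic such as N50: the contig length at which contigs of that length or longer hold at least half of the assembled bases. It says nothing about correctness, so treat it as a rough screen only. A minimal sketch, with made-up contig lengths standing in for the output of two assemblers:

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L contain at least half of all bases.

    Returns 0 for an empty assembly.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical contig lengths (bp); both toy assemblies total 300 bp.
assembly_a = [100, 80, 50, 30, 20, 20]
assembly_b = [60, 60, 60, 60, 60]

print(n50(assembly_a))
print(n50(assembly_b))
```

Here the first toy assembly scores 80 and the second 60, even though both contain the same number of bases. A misassembled contig can inflate N50, which is why some independent check is still needed.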


      • #4
        Thanks for the advice.

        So far our plan is to sequence 2x100 PE on a HiSeq at 200x coverage and combine it with a 6 or 10 kb insert library, also sequenced on HiSeq.


        • #5
          I've sequenced a few Aspergillus genomes (30-40 Mb genomes too) looking for secondary metabolite clusters and have found 2x100 PE to be very good with a coverage around 40x. 200x coverage should be great for your assembly.

          SOAPdenovo is my assembler of choice, but ALLPATHS-LG is supposed to be very good as well. If you plan to use ALLPATHS-LG, I would be sure to take a look at their paper. You have to do some prep work on the library side (2 libraries of different insert sizes) to get the best out of the algorithm.


          • #6
            Originally posted by jgibbons1
            I've sequenced a few Aspergillus genomes (30-40 Mb genomes too) looking for secondary metabolite clusters and have found 2x100 PE to be very good with a coverage around 40x. 200x coverage should be great for your assembly.
            @jgibbons1: Was 40x sufficient to identify the polyketide synthases (PKSs) as well? Since PKSs have very conserved sequences, I can imagine that de novo assembly of these regions could cause some problems.

            I will be sequencing Penicillium strains myself, also for the purpose of looking at secondary metabolism, so I am currently considering how much coverage I need for that. Did you also try to assemble the genome against a reference Aspergillus, or did you only do de novo assembly?

