  • FrancescaRaffini
    Junior Member
    • Feb 2015
    • 3

    De novo assembly evaluation

    Dear all,

    I made a de novo assembly of a paired-end ddRAD dataset, with no reference genome or prior genomic knowledge. As you know, it is good practice to build several assemblies with different parameters (such as -n, -m and -M in Stacks) to find the settings that best fit your data and needs. Could you suggest how to assess the quality of the different assemblies without a reference genome or multiple generations (usually used to estimate the error rate)? Is there any software (the Quast tool?) you would recommend?

    Thank you in advance,

    Regards

    Francesca
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    I do recommend Quast; it's easy to use and provides some useful statistics on assembly continuity, but without a reference it is not really a complete solution, and is mainly useful for continuity statistics (N50/L50).

    We typically use BBMap to evaluate the quality of assemblies that lack a reference, as it provides a nice summary of error statistics. If you map the reads to each assembly, it will tell you:

    % of reads mapped (higher is better)
    % of reads properly paired (higher is better)
    % of reads that mapped ambiguously (lower is better)
    % of reads that matched the reference perfectly (higher is better)

    ...and also the overall error rate, and rates of each individual error type (substitutions, insertions, and deletions) on a per-base and per-read level. In each case, of course, lower is better. You can also use it to directly output per-contig coverage stats (with the covstats=file flag), which is sometimes useful for spotting collapsed repeats or contaminant contigs.
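A minimal sketch (an editorial illustration, not part of the original post) of how the mapping statistics listed above could be compared across several assemblies. It assumes you have already copied the percentages from each BBMap run into a dictionary by hand; the key names (`pct_mapped` and so on) are hypothetical labels, not BBMap output field names, and the rank-sum scoring is just one simple way to combine the metrics. The helper `bbmap_command` only builds an argument list using the `ref=`, `in=`, `in2=` and `covstats=` parameters; the file names passed to it are placeholders.

```python
from typing import Dict, List

# Direction of each metric as described in the post:
# +1 means higher is better, -1 means lower is better.
# The key names are hypothetical labels chosen for this sketch.
METRIC_DIRECTION = {
    "pct_mapped": +1,
    "pct_properly_paired": +1,
    "pct_ambiguous": -1,
    "pct_perfect": +1,
    "error_rate": -1,
}


def bbmap_command(assembly: str, reads1: str, reads2: str, covstats: str) -> List[str]:
    """Build (but do not run) a BBMap command that maps paired reads to one assembly."""
    return [
        "bbmap.sh",
        f"ref={assembly}",
        f"in={reads1}",
        f"in2={reads2}",
        f"covstats={covstats}",
    ]


def rank_assemblies(stats: Dict[str, Dict[str, float]]) -> List[str]:
    """Return assembly names ordered best first by the sum of per-metric ranks."""
    names = sorted(stats)
    score = {name: 0 for name in names}
    for metric, direction in METRIC_DIRECTION.items():
        # Sort so that the best assembly for this metric comes first.
        ordered = sorted(names, key=lambda n: direction * stats[n][metric], reverse=True)
        for position, name in enumerate(ordered):
            score[name] += position  # 0 is the best rank for a metric
    # A lower total rank means the assembly did well on more metrics.
    return sorted(names, key=lambda n: score[n])
```

Calling `rank_assemblies` with one entry per parameter set (for example one per combination of -m, -M and -n) gives a quick ordering, but the individual metrics are still worth inspecting, since a rank sum hides how large the differences are.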

    • FrancescaRaffini
      Junior Member
      • Feb 2015
      • 3

      #3
      Thank you very much, Brian Bushnell. May I ask whether your package has already been used with ddRAD data?

      • Brian Bushnell
        Super Moderator
        • Jan 2014
        • 2709

        #4
        Originally posted by FrancescaRaffini
        Thank you very much, Brian Bushnell. May I ask whether your package has already been used with ddRAD data?
        That's possible, but I don't know. I have never worked with ddRAD data, and I am unaware of it being used at JGI.

        • sarvidsson
          Senior Member
          • Jan 2015
          • 137

          #5
          I don't think Quast can be used for ddRAD data. What did you use to "assemble" your data - Stacks? I would evaluate the data based on the number of polymorphic "stacks"/loci (where >2 samples are homozygous for each allele) that are covered at least 10x in 2/3 of your samples. With a few thousand such loci, you have a pretty nice dataset. If not, there is a long list of things that can go wrong (especially in the wet lab)...
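A minimal sketch (an editorial illustration, not part of the original post) of the locus count suggested above. It assumes genotype calls and read depths have already been exported from the pipeline into plain Python dictionaries; that layout, the function names and the allele labels are all hypothetical and are not a Stacks file format. It also assumes one reading of the criterion: a locus counts if at least 2/3 of all samples have a call at 10x or more, and at least two alleles are each homozygous in more than 2 of those well-covered samples.

```python
from collections import Counter
from fractions import Fraction
from typing import Dict, Optional, Tuple

Genotype = Tuple[str, str]             # the two alleles called in one sample
Call = Tuple[Optional[Genotype], int]  # (genotype or None if missing, read depth)


def is_informative(
    calls: Dict[str, Call],
    n_samples: int,
    min_depth: int = 10,
    min_sample_fraction: Fraction = Fraction(2, 3),
    min_homozygotes: int = 3,  # ">2 samples" homozygous for each allele
) -> bool:
    """Check one locus against the coverage and polymorphism criteria."""
    # Keep only samples that have a genotype call at sufficient depth.
    well_covered = [
        genotype
        for genotype, depth in calls.values()
        if genotype is not None and depth >= min_depth
    ]
    # Fraction keeps the 2/3 comparison exact (no floating-point rounding).
    if len(well_covered) < min_sample_fraction * n_samples:
        return False
    # Count homozygous samples per allele.
    homozygotes = Counter(g[0] for g in well_covered if g[0] == g[1])
    supported = [a for a, count in homozygotes.items() if count >= min_homozygotes]
    return len(supported) >= 2


def count_informative_loci(loci: Dict[str, Dict[str, Call]], n_samples: int) -> int:
    """Count the loci that pass is_informative."""
    return sum(is_informative(calls, n_samples) for calls in loci.values())
```

Running `count_informative_loci` once per parameter set gives one number per assembly to compare against the "few thousand loci" rule of thumb.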

          @BB: Typical ddRAD protocols produce quite small fragments (100-250 bp) representing small islands with strictly defined borders (set by the restriction enzyme digestion), so you rarely get any contigs larger than the possible read-pair overlap when pushing the data through a de novo assembler.

          • Brian Bushnell
            Super Moderator
            • Jan 2014
            • 2709

            #6
            Thanks for the explanation. It does not sound like standard methods of assembly and assembly evaluation are relevant here.

            • FrancescaRaffini
              Junior Member
              • Feb 2015
              • 3

              #7
              Thank you both for your suggestions.

              Sarvidsson, yes, I am using Stacks. Since I will run several de novo assemblies with different parameters, I need some quality measures (e.g. the error rate, which I can't estimate with most known methods since I don't have a reference genome, replicates or generations) to judge how good the assembly is with each set of parameters. Your method looks interesting, but I am afraid it is not enough on its own to evaluate assembly quality.
