**Brian Bushnell** · 02-16-2015, 10:34 AM

I do recommend Quast: it is easy to use and reports useful statistics on assembly contiguity (N50/L50). Without a reference, though, it is not really a complete solution, and it is mainly useful for those contiguity statistics.
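For readers unfamiliar with the N50/L50 statistics mentioned above, here is a minimal sketch of how they can be computed from a list of contig lengths. The function name `n50_l50` and the example lengths are made up for illustration; this is not Quast's own code.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths.

    N50: length of the contig at which the running total, taken from the
         longest contig downwards, first reaches half the assembly size.
    L50: number of contigs needed to reach that point.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no contigs given")


# Hypothetical assembly of seven contigs, 300 bp in total.
print(n50_l50([80, 70, 50, 40, 30, 20, 10]))  # (70, 2)
```

A higher N50 and a lower L50 indicate a more contiguous assembly, but neither says anything about correctness.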

We typically use BBMap to evaluate the quality of assemblies that lack a reference, as it provides a nice summary of error statistics. If you map the reads to each assembly, it will tell you:

- % of reads mapped (higher is better)
- % of reads properly paired (higher is better)
- % of reads that mapped ambiguously (lower is better)
- % of reads that matched the reference perfectly (higher is better)

...and also the overall error rate, and rates of each individual error type (substitutions, insertions, and deletions) on a per-base and per-read level. In each case, of course, lower is better. You can also use it to directly output per-contig coverage stats (with the covstats=file flag), which is sometimes useful for spotting collapsed repeats or contaminant contigs.
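As a rough sketch of the workflow described above, the following builds one BBMap command line per candidate assembly. The file names and the helper `bbmap_command` are hypothetical. The parameters `ref=`, `in=`, `in2=`, `covstats=` and `nodisk` are existing `bbmap.sh` options, but check the documentation of your installed version before relying on them.

```python
import shlex


def bbmap_command(assembly, reads1, reads2=None, covstats="covstats.txt"):
    """Build an argument list that maps reads back to one assembly."""
    args = ["bbmap.sh", f"ref={assembly}", f"in={reads1}"]
    if reads2:
        # Second file of a read pair, if the reads are paired.
        args.append(f"in2={reads2}")
    # covstats= writes per-contig coverage; nodisk keeps the index in memory,
    # so several assemblies can be compared without overwriting an on-disk index.
    args += [f"covstats={covstats}", "nodisk"]
    return args


# Hypothetical file names for one of several assemblies to compare.
cmd = bbmap_command("assembly_k31.fa", "reads_1.fq.gz", "reads_2.fq.gz",
                    covstats="assembly_k31.covstats.txt")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Running the same reads against each assembly and comparing the summaries side by side gives the comparison described in the post.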

**FrancescaRaffini** · 02-16-2015, 11:20 PM

Thank you very much, Brian Bushnell. May I ask whether your package has already been used with ddRAD data?

**Brian Bushnell** · 02-16-2015, 11:40 PM

> Originally posted by FrancescaRaffini:
> Thank you very much, Brian Bushnell. May I ask whether your package has already been used with ddRAD data?

That's possible, but I don't know. I have never worked with ddRAD data, and I am unaware of it being used at JGI.

**sarvidsson** · 02-17-2015, 12:18 AM

I don't think Quast can be used for ddRAD data. What did you use to "assemble" your data: Stacks? I would evaluate the data based on the number of polymorphic "stacks"/loci (where >2 samples are homozygous for each allele) covered at 10x or more in at least 2/3 of your samples. With a few thousand such loci, you have a pretty nice dataset. If not, there is a long list of things that can go wrong (especially in the wet lab)...
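The criterion above can be sketched as a small filter. This is one possible reading of it, not Stacks output handling: the data layout (a dict mapping each sample to a `(depth, genotype)` pair), the function names, and the choice to count only genotypes from samples that pass the depth cutoff are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def is_informative(locus, n_samples, min_depth=10,
                   min_fraction=Fraction(2, 3), min_homozygotes=3):
    """True if a locus is well covered and polymorphic.

    locus: dict mapping sample name -> (depth, genotype), where genotype is
           a pair of alleles such as ("A", "G"), or None if nothing was called.
    """
    # Keep only samples covered at min_depth or more.
    covered = {s: gt for s, (depth, gt) in locus.items() if depth >= min_depth}
    if len(covered) < min_fraction * n_samples:
        return False
    # Count homozygous samples per allele (">2 samples" means at least 3).
    homozygotes = Counter(gt[0] for gt in covered.values() if gt[0] == gt[1])
    supported = [a for a, n in homozygotes.items() if n >= min_homozygotes]
    return len(supported) >= 2


def count_informative(loci, n_samples, **kwargs):
    """Number of loci passing the filter."""
    return sum(is_informative(locus, n_samples, **kwargs) for locus in loci)


# Hypothetical data: nine samples, two loci.
good = {f"s{i}": (20, ("A", "A")) for i in (1, 2, 3)}
good.update({f"s{i}": (15, ("G", "G")) for i in (4, 5, 6)})
good.update({"s7": (12, ("A", "G")), "s8": (3, ("A", "A")), "s9": (0, None)})
monomorphic = {f"s{i}": (20, ("A", "A")) for i in range(1, 10)}

print(count_informative([good, monomorphic], n_samples=9))  # 1
```

Counting such loci for each parameter set gives one number per run that can be compared without a reference genome.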

@BB: Typical ddRAD protocols produce quite small fragments (100-250 bp) representing small islands with strictly defined borders (set by the restriction enzyme cut sites), so you rarely get contigs longer than what an overlapping read pair can span when pushing the data through a de novo assembler.

**Brian Bushnell** · 02-17-2015, 12:33 AM

Thanks for the explanation. It does not sound like standard methods of assembly and assembly evaluation are relevant here.

**FrancescaRaffini** · 02-17-2015, 07:02 AM

Thank you both for your suggestions.

Sarvidsson, yes, I am using Stacks. Since I will run several de novo assemblies with different parameters, I need some measure of quality to estimate how good the assembly is for each parameter set. Error rate, for example, is something I can't measure with most known methods, since I don't have a reference genome, replicates, or multiple generations. Your method looks interesting, but I am afraid it is not enough on its own to evaluate assembly quality.


denovo assembly evaluation
