**ctseto** · 01-23-2014, 04:05 PM

For BLAST, you can either search single FASTA sequences online through NCBI, or download and compile BLAST on your end along with the database and run a search of your contigs against it.
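If you go the local route, a quick way to see what each contig looks like is to keep only the top hit per contig. Here is a minimal Python sketch, assuming the search was run with tabular output (`-outfmt 6`, default columns). The function name `best_hits` and the example rows are made up for illustration; they are not real results.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(handle):
    """Return {contig_id: (subject_id, percent_identity, bitscore)},
    keeping the highest-bitscore hit for each contig."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        score = float(rec["bitscore"])
        contig = rec["qseqid"]
        if contig not in best or score > best[contig][2]:
            best[contig] = (rec["sseqid"], float(rec["pident"]), score)
    return best

# Hypothetical rows standing in for a real BLAST output file.
example = io.StringIO(
    "contig1\tsubjA\t98.5\t1200\t18\t0\t1\t1200\t501\t1700\t0.0\t2100\n"
    "contig1\tsubjB\t91.0\t800\t70\t2\t1\t800\t10\t809\t1e-150\t950\n"
    "contig2\tsubjC\t99.9\t3000\t3\t0\t1\t3000\t1\t3000\t0.0\t5500\n"
)
hits = best_hits(example)
for contig, (subject, pident, bitscore) in sorted(hits.items()):
    print(contig, subject, pident, bitscore)
```

With a real run you would pass an open file handle instead of the `io.StringIO` example. Contigs whose best hit is to an unexpected organism are the ones worth a closer look.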

I'm curious whether something like MetaPhlAn, PhyloPhlAn, or Kraken, run against your assemblies (and your raw reads, just to check), would tell you what you have. Of course, "clade-specific markers" and k-mer searches are prone to some degree of noise.

**yueluo** · 01-23-2014, 04:54 PM

How did you make your assembly (de novo or reference-guided)?
Is this a metagenomics project?

**Brian Bushnell** · 01-23-2014, 10:02 PM

If you have a reference (and it appears that you do), I recommend QUAST; it's quite effective!

QUAST: Quality Assessment Tool for Genome Assemblies | Algorithmic Biology Lab

http://bioinf.spbau.ru/quast

Even if you don't have a reference, it still tells you things like the number of predicted genes of size >= X; better assemblies tend to have more long genes and fewer short genes.

Also, you could try ALE (Assembly Likelihood Evaluator), which does not need a reference and estimates the correctness of an assembly from a SAM file, based on statistics of variations, coverage, and insert size:

http://www.ncbi.nlm.nih.gov/pubmed/23303509

ALE is not designed to evaluate the quality of a single assembly, but rather, the relative quality of multiple assemblies from the same set of reads. But that's still quite useful when you have several assemblies and need to pick the best one.

EST capture is also a good method when you have EST data.

You can also capture metrics like the percentage of source reads that align to the assembly, and the rate of substitutions/insertions/deletions in those reads. The higher the mapping rate and the lower the error rate, the better the assembly. For this you should use a standard read aligner, not MUMmer.
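As a rough illustration of those mapping metrics, here is a minimal Python sketch that tallies them from SAM text. It assumes the aligner writes the optional `NM:i:` edit-distance tag (mismatches plus inserted and deleted bases), so substitutions are estimated as NM minus the indel bases in the CIGAR string; reads without that tag contribute no substitutions. The function name `alignment_stats` and the example records are hypothetical.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def alignment_stats(sam_lines):
    """Summarize mapping rate and per-base error rates from SAM records."""
    total = mapped = aligned = subs = ins = dele = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        f = line.rstrip("\n").split("\t")
        if len(f) < 11:
            continue  # not a complete alignment record
        flag = int(f[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if flag & 0x4:
            continue  # read is unmapped
        mapped += 1
        read_ins = read_del = 0
        for n, op in CIGAR_RE.findall(f[5]):
            n = int(n)
            if op in "M=X":
                aligned += n
            elif op == "I":
                read_ins += n
            elif op == "D":
                read_del += n
        ins += read_ins
        dele += read_del
        for tag in f[11:]:
            if tag.startswith("NM:i:"):
                # NM counts mismatches plus inserted and deleted bases.
                subs += max(int(tag[5:]) - read_ins - read_del, 0)
    return {
        "reads": total,
        "mapped_fraction": mapped / total if total else 0.0,
        "aligned_bases": aligned,
        "substitutions": subs,
        "insertions": ins,
        "deletions": dele,
        "sub_rate": subs / aligned if aligned else 0.0,
        "ins_rate": ins / aligned if aligned else 0.0,
        "del_rate": dele / aligned if aligned else 0.0,
    }

# Hypothetical records: three mapped reads and one unmapped read.
seq, qual = "A" * 50, "I" * 50
example = [
    "@SQ\tSN:contig1\tLN:1000",
    f"r1\t0\tcontig1\t1\t60\t50M\t*\t0\t0\t{seq}\t{qual}\tNM:i:2",
    f"r2\t0\tcontig1\t100\t60\t20M2I28M\t*\t0\t0\t{seq}\t{qual}\tNM:i:3",
    f"r3\t4\t*\t0\t0\t*\t*\t0\t0\t{seq}\t{qual}",
    f"r4\t16\tcontig1\t300\t60\t30M1D20M\t*\t0\t0\t{seq}\t{qual}\tNM:i:1",
]
stats = alignment_stats(example)
print(stats)
```

To compare several assemblies, map the same reads to each one and run this over each SAM file. The absolute numbers depend on the aligner and its settings, so only compare runs made the same way.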

How do I go about evaluating my assembly?