**jimmybee** · 10-07-2010, 04:46 AM

How repetitive is your plant genome?

**natstreet** · 10-07-2010, 05:16 AM

I don't have a good answer, but this is something of a hot topic for me as we are doing much the same, although I have higher 454 coverage.

For plants a big factor can be how polymorphic your species is as well as the repeat structure.

In general, I would be really interested to know how people are effectively integrating 454 and Illumina data. Do you assemble each data type on its own and then combine those assemblies, or do you assemble all the data together? In either case, what assemblers are you using?

**strob** · 10-07-2010, 05:29 AM

Highly repetitive...
We have the Illumina dataset available, but we are thinking of adding a low-coverage 454 set. I think we can do three things:
- all de novo (hybrid assembly)
- Illumina de novo, and then map the resulting contigs back onto the 454 reads
- map the Illumina reads directly to the 454 reads

Before doing this, I want to know if a 454 run will bring additional information.
Tools? I was thinking of MIRA

**jimmybee** · 10-07-2010, 05:40 AM

If it's highly repetitive (my definition of highly would be >80%), then a 1x coverage run wouldn't be particularly effective, nor would it complement the Illumina data for the hybrid assembly. You'll need to figure out a few things, like how finished you want the sequence to be and what information you want out of the assembly (e.g. a good assembly of just the genes, or of the repeats as well).
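To put a rough number on the 1x point, here is a minimal sketch using the idealized Lander-Waterman model. It assumes reads land uniformly at random on the genome, which a highly repetitive plant genome will not satisfy, so treat the result as an optimistic upper bound. The function names and the example depths are illustrative only.

```python
import math

def expected_coverage(num_reads, mean_read_length, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return num_reads * mean_read_length / genome_size

def fraction_covered(depth):
    """Idealized fraction of bases hit by at least one read (1 - e^-depth)."""
    return 1.0 - math.exp(-depth)

# Hypothetical depths, just to compare a low-coverage run with deeper ones.
for depth in (0.5, 1, 5, 20):
    print(f"{depth:>4}x -> {fraction_covered(depth):.1%} of bases covered")
```

Even under these friendly assumptions, a 1x run leaves more than a third of the bases untouched, and most of the rest is covered by a single read.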

To answer natstreet: hybrid assemblies with different types of data are the way to go for repetitive genomes (such as cereal crops). We've found that integrating different types of data (paired-end/fragment), different insert sizes and different read lengths can be very beneficial to the assembly.

**natstreet** · 10-07-2010, 06:33 AM

> Hybrid assemblies with different types of data are the way to go for repetitive genomes (such as cereal crops). We've found that integrating different types of data (paired-end/fragment), different insert sizes and different read lengths can be very beneficial to the assembly.

I have shotgun 454, paired-end 454 and a range of paired-end Illumina libraries, as well as a mate-pair library. I haven't yet found an assembler that can take all of the data for a hybrid assembly on any machine that I have access to. Velvet and MIRA both accept both types of data but have huge RAM requirements and are simply impractical to run. For hybrid cereal assemblies, what software are you using?

**jimmybee** · 10-07-2010, 06:45 AM

Velvet. I feel your pain with regard to the RAM requirements. We only just got something that can handle the requirements. I've compiled SOAPdenovo and Euler-SR but have yet to play around with them.

**glacerda** · 10-07-2010, 10:14 AM

It is crucial to correct your reads prior to assembly (using the SOAPdenovo correction tool, SHREC or other). This will save memory in the assembly stage.
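As a toy illustration of why correction helps, the sketch below flags k-mers that occur only rarely across the read set, since a sequencing error typically creates several k-mers seen nowhere else. This is a simplified k-mer frequency idea, not the actual algorithm of the SOAPdenovo corrector or SHREC; the function names, the `min_count` threshold and the example reads are all made up for the illustration.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def suspicious_positions(read, counts, k, min_count=2):
    """Start positions of k-mers seen fewer than min_count times."""
    return [i for i in range(len(read) - k + 1)
            if counts[read[i:i + k]] < min_count]

# Four identical reads plus one copy with a single substitution (T -> A).
reads = ["ACGTTGCA"] * 4 + ["ACGTAGCA"]
counts = kmer_counts(reads, 4)
print(suspicious_positions("ACGTAGCA", counts, 4))
print(suspicious_positions("ACGTTGCA", counts, 4))
```

Each uncorrected error adds several spurious k-mers that the assembler must store, which is where the memory saving comes from.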

Also, SOAPdenovo uses much less memory than Velvet, although in my personal experience Velvet produces slightly better assemblies.

Don't forget to optimize the parameters, especially the k-mer size. This has a great influence on memory, run time and assembly quality.
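One rough way to see the effect of k before committing to a full run is to count distinct k-mers in a sample of reads, since de Bruijn graph assemblers such as Velvet and SOAPdenovo keep roughly one node per distinct k-mer. The sketch below is only a proxy under that assumption: it ignores reverse complements and any assembler-specific overhead, and the reads and k values are placeholders.

```python
def distinct_kmers(reads, k):
    """Number of distinct k-mers in the reads (forward strand only)."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return len(kmers)

# Placeholder reads; in practice, use a subsample of the real data.
reads = ["ACGTACGTAC", "CGTACGTACG", "TTGACCGTAC"]
for k in (3, 5, 7):
    print(f"k={k}: {distinct_kmers(reads, k)} distinct k-mers")
```

Running this over a subsample for a handful of k values gives a feel for how the graph size will change before launching the assembler itself.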


low 454 coverage combined with high solexa coverage
