**jgibbons1** · 11-28-2012, 11:14 AM

One more thing while I'm at it...

Has anyone used Telescoper (DOI:10.1093/bioinformatics/bts399)? If so, I'd be interested in hearing about your experiences with it.

**bckirkup** · 11-28-2012, 11:32 AM

Assembly of repetitive DNA

The big question: what is repeating/how much is repeating? Sounds vaguely leninist.

One person who has experience with this is Matt Riley at U. Tennessee, at least in microbial genomes.

Anyway, if you have tandem repeats of a few bp, then read length is your big factor; if you have repeats of a gene, then you need jumps/paired ends. If you have repeats of gene clusters, you may need something more substantial. 40kb jumps are possible and published; PacBio reads are another option; an Optical Map may be the answer. Of course, you'll need to 'fill in' the map or fix the SNPs in the SMRT reads. Joint assemblies are performed by a number of groups; the folks at NCBI, the FDA, and UMD (Mihai Pop) are familiar with the strategies.
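As a rough illustration of the rule of thumb above, the sketch below checks which data types could bridge a repeat of a given length. The span values, the `anchor` size, and the function names are all hypothetical, chosen only for illustration; real libraries have insert-size distributions, not single values, and paired reads need enough unique sequence on each side to map.

```python
# Hypothetical spans in bp; real libraries have a size distribution.
SPANS = {
    "100 bp read": 100,
    "10 kb jump": 10_000,
    "40 kb jump": 40_000,
}


def can_span(repeat_len, span_len, anchor=20):
    """True if a read or insert of span_len covers the whole repeat
    plus `anchor` bp of unique flanking sequence on each side."""
    return span_len >= repeat_len + 2 * anchor


def spanning_options(repeat_len, anchor=20):
    """Names of the (assumed) data types long enough to bridge the repeat."""
    return [name for name, span in SPANS.items()
            if can_span(repeat_len, span, anchor)]


print(spanning_options(30))      # short tandem repeat
print(spanning_options(1_500))   # repeated gene
print(spanning_options(15_000))  # repeated gene cluster
```

Nested repeats break this simple picture, since the "unique" flank of one repeat may itself sit inside another.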

Ultimately, if you have the worst case scenario, some sort of scale-free nesting of repeats within repeats, you would need all these solutions combined.

Hope that points you toward some ideas.

**jgibbons1** · 11-28-2012, 11:48 AM

Originally posted by bckirkup:

The big question: what is repeating/how much is repeating? Sounds vaguely leninist.

Leninist indeed

Thanks for your response... it's VERY helpful.

I'm interested in one chromosome which is probably a worst case scenario - variable sized microsatellites, minisatellites, transposable elements, and variable sized rDNA arrays. Quite frankly, it's a mess.

I'm not looking for a complete assembly, but the chromosome is ~40 Mb and I would like to generate scaffolds large enough to give me something to work with. Using a published Illumina dataset (86 bp paired-end), I wasn't able to assemble anything larger than 1 kb in the repetitive regions, although in non-repetitive regions I was getting scaffolds as large as 400 kb.

My goal is to predict functional motifs from the assembly (ex. TF binding sites, transposable element content, CNV in satellite sequences) and identify variation in this chromosome across populations.

**HESmith** · 11-28-2012, 01:27 PM

Ugh. (sorry, I meant to say "What a challenging project!")

Regarding your initial question (close vs. large size differences in the two libraries), the larger difference will be more useful for assembly. The ideal is to have paired end read jumps that span the repeats, which delimits the number of copies in the intervening region. You can calculate transposable element content and satellite CNV by read depth (although their positions will be difficult/impossible to assign).
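A minimal sketch of the read-depth idea, assuming you already have per-base (or per-window) depth values for a repeat unit and for a set of regions believed to be single-copy. The function name and the numbers are made up for illustration, and the sketch ignores GC bias and mapping ambiguity, both of which matter in practice.

```python
from statistics import mean, median


def estimate_copy_number(repeat_depths, single_copy_depths):
    """Estimate repeat copy number as mean depth over the repeat unit
    divided by the median depth of presumed single-copy sequence."""
    baseline = median(single_copy_depths)
    if baseline <= 0:
        raise ValueError("single-copy baseline depth must be positive")
    return mean(repeat_depths) / baseline


# Made-up depths: ~30x over single-copy sequence, ~600x over a satellite unit.
single_copy = [28, 30, 32, 30, 29]
satellite = [600, 590, 610]
print(estimate_copy_number(satellite, single_copy))
```

This gives a copy-number estimate per repeat family, not positions; as noted above, placing the copies is a separate and much harder problem.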

Note that coverage issues do not necessarily limit you to two library sizes. For assembly, it would be more useful to have additional jump libraries sequenced at lower depth. IIRC, ALLPATHS-LG sequenced large (10kbp) jump libraries at 1/25th the depth of the shorter-sized inserts. A similar hybrid approach may also be your best bet.

Good luck!

**jgibbons1** · 11-28-2012, 01:50 PM

Originally posted by HESmith:

Ugh. (sorry, I meant to say "What a challenging project!")

haha...indeed!

Thanks for your comments. This is making me think that starting with 2 libraries for the pilot (probably 250 bp and 1 kb) at a higher depth and then running another 2 libraries (5 kb and 10 kb) at a lower depth would be a good strategy.
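For planning purposes, here is a back-of-the-envelope sketch of how many read pairs each library would need on a ~40 Mb target. The read length (100 bp) and the depths are assumptions picked for round numbers, not recommendations, and the function names are hypothetical. The sketch uses the usual approximations: sequence coverage = pairs × 2 × read length / genome size, and physical (span) coverage = pairs × insert size / genome size.

```python
import math


def pairs_for_depth(genome_size, read_len, depth):
    """Read pairs needed to reach `depth`-fold sequence coverage."""
    return math.ceil(depth * genome_size / (2 * read_len))


def sequence_coverage(pairs, read_len, genome_size):
    """Fold coverage by sequenced bases."""
    return pairs * 2 * read_len / genome_size


def physical_coverage(pairs, insert_size, genome_size):
    """Fold coverage by the inserts themselves (bases spanned, not sequenced)."""
    return pairs * insert_size / genome_size


GENOME = 40_000_000  # ~40 Mb chromosome from the thread
READ_LEN = 100       # assumed read length

# Short-insert library at an assumed 50x sequence depth.
print(pairs_for_depth(GENOME, READ_LEN, 50))

# A 10 kb jump library with 400,000 pairs (assumed): low sequence depth,
# but each pair spans 10 kb, so physical coverage is much higher.
print(sequence_coverage(400_000, READ_LEN, GENOME))
print(physical_coverage(400_000, 10_000, GENOME))
```

The point of the comparison is that a long-insert library can be sequenced shallowly and still link most of the chromosome many times over.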

**piyo2** · 11-29-2012, 08:42 AM

Jumping libraries are a possibility, but the cost and difficulty of making them are a consideration, and there may be value in sequencing through the entire region. The current PacBio C2 XL chemistry averages 5 kb read length, with 10% of reads at 10 kb or longer; the maximum is usually ~20 kb.

If you would like more information about potentially doing a PacBio library, we can discuss it, and I can help make some introductions to labs in the area to get your challenging region sequenced. You can email me: akieu at pacificbiosciences.com

assembly strategies for repetitive dna