What is the best way to close gaps (stretches of Ns) in a genome using single reads, and also to combine contigs/scaffolds?
-
Get more sequencing done.
It may even be cost-effective in some cases to design primers and use "Sanger" capillary sequencing to target these specific gaps or confirm particular contig junctions.
If you're doing more high-throughput sequencing, I'd consider a paired-end library, using a different insert length if you already have some paired-end data.
-
I actually meant in silico, since I already have a draft genome and additional single reads. I thought of using Velvet, supplying the contigs from the draft genome as long reads and adding the extra reads I have, but was wondering whether there are any other (better) ideas out there. Thanks in advance for any suggestions.
-
Difficult. You really want paired-end reads. (Very) long reads, e.g. PacBio, would be useful too.
There are already a number of threads on this site about genome assembly strategies.
One suggestion: create one contig set from the original reads and a separate one from the new reads, e.g. with Velvet.
Trim to 1999bp.
Input into Newbler.
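A rough sketch of the "trim to 1999bp" step, in case it helps. This takes one possible reading of it: cutting each contig into overlapping pieces of at most 1999 bp so they can be fed in as if they were reads. The 200 bp overlap, the function names, and the piece naming scheme are my own assumptions and are not from the post above.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_contig(name, seq, max_len=1999, overlap=200):
    """Cut one contig into overlapping pieces no longer than max_len.

    max_len=1999 follows the post; overlap=200 is an assumed value.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be >= 0 and smaller than max_len")
    if not seq:
        return []
    step = max_len - overlap
    pieces = []
    start = 0
    while True:
        # Name each piece after its 1-based start position in the contig.
        pieces.append((f"{name}_{start + 1}", seq[start:start + max_len]))
        if start + max_len >= len(seq):
            break
        start += step
    return pieces


def split_fasta(in_path, out_path, max_len=1999, overlap=200):
    """Write every contig in in_path to out_path as pieces <= max_len."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for name, seq in read_fasta(src):
            for piece_name, piece in split_contig(name, seq, max_len, overlap):
                dst.write(f">{piece_name}\n{piece}\n")
```

Calling `split_fasta("contigs.fa", "contigs_1999.fa")` (both file names hypothetical) would write the pieces to a new FASTA file. Contigs already shorter than 1999 bp pass through as a single piece.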
-
Have you tried GapCloser from the SOAP package? I could close/remove ~80% of the Ns in a 1 Gb draft assembly by using paired-end reads with inserts of 200-500 bp.
Of course, whether it will work or not depends a lot on the genome content, how long the stretches of N are, and whether the regions around them are hard to map to, but it might be worth a try?
Edit: Sorry - I missed the part where it said "single reads".. A bit more tricky then... :/
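Since so much depends on how long the stretches of N are, it may be worth tallying them before picking a tool. A minimal sketch, assuming the scaffold sequence is already loaded as a string; the function names are made up for illustration.

```python
import re


def find_gaps(seq, min_len=1):
    """Return (start, length) for each run of N/n; start is 0-based."""
    gaps = []
    for match in re.finditer(r"[Nn]+", seq):
        length = match.end() - match.start()
        if length >= min_len:
            gaps.append((match.start(), length))
    return gaps


def gap_summary(seq):
    """Summarise the gaps in one sequence: count, total Ns, longest run."""
    lengths = [length for _, length in find_gaps(seq)]
    return {
        "count": len(lengths),
        "total_n": sum(lengths),
        "longest": max(lengths, default=0),
    }


if __name__ == "__main__":
    scaffold = "ACGTNNNNACGTnnAC"  # toy sequence, not real data
    print(find_gaps(scaffold))
    print(gap_summary(scaffold))
```

Running the same count before and after a gap-closing attempt gives a quick check of how many Ns were actually removed.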