I used Bowtie (3 mismatches allowed) and BWA (4 mismatches allowed) to map reads to the Neurospora genome. Only 40% of the reads could be mapped; the rest were not mappable, and I don't know why. I have never experienced this before. What are the possible reasons? Does anyone have any idea?
-
Often: contamination.
Try to assemble the non-mappable reads, for example with ABySS, BLAST the resulting contigs to get an idea of what's in there, then try to align again against the organisms you identified as contaminants.
You have checked the duplication ratio of your reads first though, right?
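As a rough sketch of the first step suggested above (pulling out the non-mappable reads so they can be handed to an assembler), the snippet below filters SAM-format lines on the "segment unmapped" FLAG bit (0x4) and writes them back out as FASTQ records. The function name `unmapped_to_fastq` and the example reads are made up for illustration, and it assumes the aligner wrote unmapped reads into its SAM output at all.

```python
def unmapped_to_fastq(sam_lines):
    """Yield FASTQ records for reads whose SAM FLAG has the 0x4 (unmapped) bit set."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & 0x4:                    # 0x4 = segment unmapped
            yield f"@{name}\n{seq}\n+\n{qual}\n"


# Hypothetical two-read example: read1 is mapped (FLAG 0), read2 is unmapped (FLAG 4).
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t0\tchr1\t100\t37\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCAA\tIIIIIIII",
]
unmapped_records = list(unmapped_to_fastq(example_sam))
print("".join(unmapped_records), end="")
```

The resulting FASTQ can then be given to the assembler of your choice; the contigs, not the raw reads, are what you would BLAST.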
-
Poor quality sequence, contamination, enrichment of repetitive sequence... plenty of possible reasons.
I'd suggest running some QC on your raw sequence to see if that turns up any problems before delving any further into the failures. 40% isn't disastrously low so it may not be too serious a problem.
-
Originally posted by ffinkernagel:
Often: contamination.
Try to assemble the non-mappable reads, for example with ABySS, BLAST the resulting contigs to get an idea of what's in there, then try to align again against the organisms you identified as contaminants.
You have checked the duplication ratio of your reads first though, right?
-
Originally posted by hannat:
What is the "duplication ratio", and how should I estimate it? Thanks.
-
Originally posted by simonandrews:
It's a measure of how often each unique sequence is seen. High duplication levels indicate that your sequence may have been overamplified during library preparation. The QC report I linked to will show you a duplication level plot to see how many times you see unique, duplicated, triplicated etc. sequences. It will also spot heavily overrepresented sequences in case you have a small number of heavy contaminants (e.g. primers).
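To make "how often each unique sequence is seen" concrete, here is a minimal sketch that counts exact-match duplicates in a list of read sequences and reports what percentage of distinct sequences occur once, twice, three times and so on. It is not the algorithm used by any particular QC tool (those typically subsample and bin the counts); the function name `duplication_levels` and the toy reads are assumptions for illustration.

```python
from collections import Counter


def duplication_levels(sequences):
    """Return {copy_number: percent of distinct sequences seen that many times}."""
    copies = Counter(sequences)            # sequence -> number of times seen
    levels = Counter(copies.values())      # copy number -> number of distinct sequences
    n_distinct = len(copies)
    return {level: 100.0 * n / n_distinct for level, n in sorted(levels.items())}


# Toy input: 7 reads, 4 distinct sequences.
reads = ["ACGT", "ACGT", "TTGA", "GGCC", "GGCC", "GGCC", "CATG"]
levels = duplication_levels(reads)
duplicate_fraction = 1 - len(set(reads)) / len(reads)  # reads that repeat an earlier one
print(levels)
print(f"{duplicate_fraction:.1%} of reads are duplicates")
```

For a healthy library most of the mass sits at copy number 1; a large share at high copy numbers points to overamplification or a few heavily overrepresented sequences.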
-
Originally posted by hannat:
I see a rise at the end of the duplication plot, so I have a large number of sequences which were duplicated.
What you would hope to see on these plots is that the duplication rate immediately falls to very close to zero and stays there. Any significant amount of duplication is something to be concerned about.
-
Going along with what Simon said (ooh, bad pun), how many reads are in your data set? The Neurospora crassa genome is ~40 Mb. If you have close to, or more than, 40 million reads you would expect to see some degree of low-level duplication. The rise at the high end of the plot may be due to over-representation of the mt plasmid.
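A back-of-the-envelope way to see why read counts near the genome size produce duplicates by chance: if read start positions were sampled uniformly at random, the number of distinct start positions hit follows the usual Poisson occupancy approximation. The sketch below is only that idealised model (uniform sampling, single-end reads, duplicates defined by identical start position); real libraries are not uniform, and the function name and the choice of counting one or both strands are assumptions.

```python
import math


def expected_duplicate_fraction(n_reads, n_positions):
    """Expected fraction of reads that repeat an already-used start position,
    assuming reads land uniformly at random on n_positions possible starts."""
    lam = n_reads / n_positions                              # mean reads per position
    expected_distinct = n_positions * (1 - math.exp(-lam))   # positions hit at least once
    return 1 - expected_distinct / n_reads


genome_size = 40_000_000   # ~40 Mb, as stated in the post above
n_reads = 40_000_000       # hypothetical read count for illustration

one_strand = expected_duplicate_fraction(n_reads, genome_size)
both_strands = expected_duplicate_fraction(n_reads, 2 * genome_size)
print(f"one strand:   {one_strand:.1%}")
print(f"both strands: {both_strands:.1%}")
```

Under this model a noticeable share of reads would be duplicates purely by chance, which is a different signature from a spike at very high copy numbers.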
-
What's wrong with my ChIP seq data?
I performed an H3K9me3 ChIP experiment and built the library according to Illumina's ChIP-seq library protocol. The analysis of the data is as follows:
raw read: 46891730
map read: 42812364
uniq read: 40442380
used read: 13409805
map ratio: 91.30%
uniq ratio: 86.25%
used ratio: 28.6%
region: 253
The used read count, used ratio and region count are too low. I cannot figure out the problem. Could anyone help me? Thanks!
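As a quick sanity check on the figures in the post above, the snippet below recomputes each percentage from the read counts. All three reported percentages come out as fractions of the raw read count, which suggests the second percentage is the unique-read ratio. The variable names are illustrative; what "used read" means depends on the poster's pipeline and is not stated in the thread.

```python
# Read counts as reported in the post.
raw_reads = 46_891_730
mapped_reads = 42_812_364
unique_reads = 40_442_380
used_reads = 13_409_805


def percent_of_raw(count, total=raw_reads):
    """Express a read count as a percentage of the raw read count."""
    return 100.0 * count / total


map_ratio = percent_of_raw(mapped_reads)
uniq_ratio = percent_of_raw(unique_reads)
used_ratio = percent_of_raw(used_reads)

print(f"map ratio:  {map_ratio:.2f}%")
print(f"uniq ratio: {uniq_ratio:.2f}%")
print(f"used ratio: {used_ratio:.2f}%")
```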