Originally posted by gstone
View Post
Seqanswers Leaderboard Ad
Collapse
Announcement
Collapse
No announcement yet.
X
-
-
Great, thank you for the clarification on the first point. As for my use of BBRepair, when I was aligning with STAR I got a large amount of reads too short to map, which was due to the paired-end reads being out of order. I used BBRepair and the mapping rate drastically improved (~40% to anywhere from ~65%-97%). I then figured I'd use BBRepair then quality trim just to be safe. My repair command is as follows: bbmap/repair.sh in1=file1 in2=file2 out1=out1 out2=out2 outs=out_singles repair. As to how the reads got scrambled is a mystery to me, but I was very excited to find BBRepair.
My confusion essentially comes from the fact that using BBRepair prior to quality trimming results in fewer input reads than just quality trimming, even though BBRepair identifies no singleton reads. My understanding is that BBRepair just orders the reads between the pairs.
Thank you again for all of your help.
Leave a comment:
-
Originally posted by gstone View Post1) How would I map the consensus file, would I treat it as a single-end read?
This command assumes you have paired reads interleaved in "reads.fq". In that case, you can use BBMerge like this:
Code:bbmerge.sh in=reads.fq out=merged.fq outu=unmerged.fq
Code:bbmerge.sh in=reads.fq out=corrected.fq ecco mix
2) When I use bbrepair to match the order of my paired end reads and then quality trim I get less input reads to my aligner than if I just quality trim then align, but no reads are placed in the singletons file when I call bbrepair. Why is this?
Thank you for being so generous with you time!
Leave a comment:
-
Hi Brian, thank you so much for your quick and thoughtful response. If you don't mind taking the time I was hoping to ask 2 questions. 1) How would I map the consensus file, would I treat it as a single-end read? and 2) When I use bbrepair to match the order of my paired end reads and then quality trim I get less input reads to my aligner than if I just quality trim then align, but no reads are placed in the singletons file when I call bbrepair. Why is this?
Thank you for being so generous with you time!Last edited by gstone; 02-28-2017, 06:06 PM.
Leave a comment:
-
If you want to do quality-trimming, it should be done conservatively to avoid losing so much information that the reads no longer map uniquely, and increasing bias. Typically, I suggest either not trimming, or to at most Q8~Q12. On the other hand, you should definitely adapter-trim prior to mapping.
It's also possible to map with a more error-tolerant aligner to avoid the issues incurred by quality-trimming entirely. BBMap is slower than STAR, but yields a higher mapping rate.
Oh, and one more thing - for overlapping paired-ended reads, it's also possible to increase read quality by using the consensus in the overlapping portion, by using BBMerge like this:
Code:bbmerge.sh in=reads.fq out=corrected.fq ecco mix vstrict adapters=default
Last edited by Brian Bushnell; 02-28-2017, 05:29 PM.
Leave a comment:
-
Quality trimming prior to aligning
I am aligning reads to the GrCh37 human reference genome using STAR and am confused as to how to get the most reliable results. At first I aligned my reads with no pre-processing using STAR and got essentially 2 groups: 1 group that aligned well, with 85-95+% of reads uniquely mapping, and another group with only about 60% uniquely mapping. I then tried using sickle to trim low quality reads (q<20). Reads that mapped 85-95+% unique lost about 1 million reads and map a similar % uniquely, and reads that mapped about 60% now map >80% but lost millions of uniquely mapped reads. Though millions of reads are lost after quality trimming, on average ~90% of reads survive in every case, with a max of 21% reads lost after quality trimming. My question is which pipeline should I trust? Should I be more interested in the raw number of mapping reads, or the percentage? Are the reads lost after quality trimming untrustworthy anyways?
I have done a lot of reading on this and have yet to find a definitive consensus. Any help or guidance would be much appreciated!Last edited by gstone; 02-28-2017, 03:48 PM.
Latest Articles
Collapse
-
by seqadmin
The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...-
Channel: Articles
04-22-2024, 07:01 AM -
-
by seqadmin
Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...-
Channel: Articles
04-04-2024, 04:25 PM -
ad_right_rmr
Collapse
News
Collapse
Topics | Statistics | Last Post | ||
---|---|---|---|---|
Started by seqadmin, Today, 08:47 AM
|
0 responses
11 views
0 likes
|
Last Post
by seqadmin
Today, 08:47 AM
|
||
Started by seqadmin, 04-11-2024, 12:08 PM
|
0 responses
60 views
0 likes
|
Last Post
by seqadmin
04-11-2024, 12:08 PM
|
||
Started by seqadmin, 04-10-2024, 10:19 PM
|
0 responses
59 views
0 likes
|
Last Post
by seqadmin
04-10-2024, 10:19 PM
|
||
Started by seqadmin, 04-10-2024, 09:21 AM
|
0 responses
54 views
0 likes
|
Last Post
by seqadmin
04-10-2024, 09:21 AM
|
Leave a comment: