Seqanswers Leaderboard Ad

**Brian Bushnell** · 02-28-2017, 05:27 PM

If you want to do quality-trimming, it should be done conservatively to avoid losing so much information that the reads no longer map uniquely, and increasing bias. Typically, I suggest either not trimming, or to at most Q8~Q12. On the other hand, you should definitely adapter-trim prior to mapping.

It's also possible to map with a more error-tolerant aligner to avoid the issues incurred by quality-trimming entirely. BBMap is slower than STAR, but yields a higher mapping rate.

Oh, and one more thing - for overlapping paired-ended reads, it's also possible to increase read quality by using the consensus in the overlapping portion, by using BBMerge like this:

Code:

bbmerge.sh in=reads.fq out=corrected.fq ecco mix vstrict adapters=default

That may negate the need for quality-trimming, and it's a very fast operation.

**gstone** · 02-28-2017, 06:02 PM

Hi Brian, thank you so much for your quick and thoughtful response. If you don't mind taking the time I was hoping to ask 2 questions. 1) How would I map the consensus file, would I treat it as a single-end read? and 2) When I use bbrepair to match the order of my paired end reads and then quality trim I get less input reads to my aligner than if I just quality trim then align, but no reads are placed in the singletons file when I call bbrepair. Why is this?

Thank you for being so generous with you time!

**Brian Bushnell** · 02-28-2017, 06:40 PM

Originally posted by gstone View Post

1) How would I map the consensus file, would I treat it as a single-end read?

Sorry, let me clarify -

This command assumes you have paired reads interleaved in "reads.fq". In that case, you can use BBMerge like this:

Code:

bbmerge.sh in=reads.fq out=merged.fq outu=unmerged.fq

...which produces 2 files. merged.fq contains consensus singletons made from overlapping pairs (each pair yields one read), and unmerged.fq contains interleaved pairs that did not confidently overlap so they remain unchanged. This is nice for assembly, but sometimes annoying otherwise (like for mapping). Also, some programs only accept paired reads. So you can also do this:

Code:

bbmerge.sh in=reads.fq out=corrected.fq ecco mix

...which produces a single output file that still contains paired, interleaved reads. Reads that overlapped will have the overlapping portion corrected to the consensus - where they match, the quality score is increased, and where they don't match, the quality score is set to the difference of the quality scores and the base is set to the base with the higher quality score. With this command, the result is used for mapping just like the input would have been.

2) When I use bbrepair to match the order of my paired end reads and then quality trim I get less input reads to my aligner than if I just quality trim then align, but no reads are placed in the singletons file when I call bbrepair. Why is this?

Can you give the exact sequence of commands you are using? Also, how did the order of your paired reads get scrambled in the first place? If you start with the raw reads and use a series of tools that respects pairing, you'll never need to use repair.sh.

Thank you for being so generous with you time!

You're welcome!

**gstone** · 02-28-2017, 07:01 PM

Great, thank you for the clarification on the first point. As for my use of BBRepair, when I was aligning with STAR I got a large amount of reads too short to map, which was due to the paired-end reads being out of order. I used BBRepair and the mapping rate drastically improved (~40% to anywhere from ~65%-97%). I then figured I'd use BBRepair then quality trim just to be safe. My repair command is as follows: bbmap/repair.sh in1=file1 in2=file2 out1=out1 out2=out2 outs=out_singles repair. As to how the reads got scrambled is a mystery to me, but I was very excited to find BBRepair.

My confusion essentially comes from the fact that using BBRepair prior to quality trimming results in fewer input reads than just quality trimming, even though BBRepair identifies no singleton reads. My understanding is that BBRepair just orders the reads between the pairs.

Thank you again for all of your help.

**Brian Bushnell** · 03-02-2017, 10:09 AM

Originally posted by gstone View Post

My confusion essentially comes from the fact that using BBRepair prior to quality trimming results in fewer input reads than just quality trimming, even though BBRepair identifies no singleton reads. My understanding is that BBRepair just orders the reads between the pairs.

Your understanding is correct. The repair tool does not discard any reads - they either get sent to "out1", "out2", or "outs". How are you counting the reads before and after? Can you post the commands and results? It's possible that there's a bug somewhere, and if so, I'd like to fix it ASAP.

Topics	Statistics	Last Post
Genetic Variants and Diabetes Risk in Childhood Cancer Survivors by seqadmin Started by seqadmin, Yesterday, 08:47 AM	0 responses 12 views 0 likes	Last Post by seqadmin Yesterday, 08:47 AM
Cancer Metastasis: A Deep Dive into Cellular Plasticity by seqadmin Started by seqadmin, 04-11-2024, 12:08 PM	0 responses 60 views 0 likes	Last Post by seqadmin 04-11-2024, 12:08 PM
Proteogenomic Profiles Offer New Clues in Prostate Cancer by seqadmin Started by seqadmin, 04-10-2024, 10:19 PM	0 responses 59 views 0 likes	Last Post by seqadmin 04-10-2024, 10:19 PM
Novel Diagnostic Assay Enhances Ovarian Cancer Detection by seqadmin Started by seqadmin, 04-10-2024, 09:21 AM	0 responses 54 views 0 likes	Last Post by seqadmin 04-10-2024, 09:21 AM

Seqanswers Leaderboard Ad

Announcement

Quality trimming prior to aligning

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News