**dpryan** · 12-06-2013, 03:22 AM

The explanation given at the link is actually pretty good. What part of that wasn't satisfactory? I should note that if the input BAM file is name-sorted (or even unsorted) then you probably don't need to shuffle things (it'l...). The idea is that the insert sizes that map to a single region of the genome might not be representative of the entire experiment. If you don't shuffle the reads, then your aligner might estimate the insert size incorrectly, which will slightly bias the alignments downstream (obviously, bias is an issue if you're going to call SNPs).
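
To make the point above concrete, here is a toy simulation, not how BWA or any other aligner actually estimates insert sizes. It assumes, purely for illustration, that each genomic region has its own local mean insert size, and that the aligner estimates the insert size from the first batch of pairs it sees. The names `simulate_pairs` and `first_batch_estimate` and all the numbers are made up for this sketch.

```python
import random
import statistics


def simulate_pairs(n_regions=50, pairs_per_region=200, seed=0):
    """Return (position, insert_size) tuples in coordinate order.

    Toy assumption: every region gets its own local mean insert size,
    so a single region is not representative of the whole experiment.
    """
    rng = random.Random(seed)
    pairs = []
    for region in range(n_regions):
        local_mean = rng.uniform(200, 400)  # hypothetical per-region mean
        for _ in range(pairs_per_region):
            pos = region * 10_000 + rng.randrange(10_000)
            pairs.append((pos, rng.gauss(local_mean, 20)))
    pairs.sort()  # mimic a coordinate-sorted BAM
    return pairs


def first_batch_estimate(pairs, batch_size=200):
    """Mean insert size of the first batch_size pairs, in the order given."""
    return statistics.fmean([size for _, size in pairs[:batch_size]])


if __name__ == "__main__":
    sorted_pairs = simulate_pairs()
    true_mean = statistics.fmean([size for _, size in sorted_pairs])

    # Shuffle a copy, as one would shuffle the BAM before extracting FASTQ.
    shuffled_pairs = sorted_pairs[:]
    random.Random(1).shuffle(shuffled_pairs)

    print("mean over all pairs:        ", round(true_mean, 1))
    print("first batch, coordinate order:", round(first_batch_estimate(sorted_pairs), 1))
    print("first batch, shuffled order:  ", round(first_batch_estimate(shuffled_pairs), 1))
```

In coordinate order the first batch comes entirely from the first region, so its estimate reflects that region only; after shuffling, the first batch is drawn from across the whole genome and should land much closer to the overall mean.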

**splaisan** · 12-06-2013, 03:29 AM

Thanks for this answer, it already sheds some light.

No part is unsatisfactory, but the process leading to a wrong estimate is not really explained (or I did not understand it correctly). Is it only the initial step, where BWA collects a sample of reads and measures the distance between mates to fine-tune the remaining alignments?

As I understood it, the bias was also present in name-sorted BAM files, which did not make sense to me, since the only things exported to FASTQ are the sequences and qualities.

Considering that public BAM files often report only nicely mapped reads, I guess that the bias is unavoidable, right?

S

**dpryan** · 12-06-2013, 03:30 AM

To a certain extent yes, particularly if the original authors didn't do a good job with things. It's probably a good idea to be cautious if you can't get the original fastq files.

why should BAM be shuffled before extracting to FASTQ?
