
**gcarbajosa** · 12-13-2011, 08:37 AM

Something more about this. Going through the SEQanswers post related to fastqc I've found a link to this page:

Interpreting the duplicate sequence plot in FastQC

http://proteo.me.uk/2011/05/interpreting-the-duplicate-sequence-plot-in-fastqc/

where Simon Andrews mentions that FastQC only uses the first 50bp of each sequence to search for duplicates. I guess that, since the reads in my dataset are 100bp long, the duplication levels can be boosted by only considering the first 50bp when looking for identical reads. So now I'm thinking that the correct answer is the 2nd possibility.

**fkrueger** · 12-13-2011, 09:43 AM

Hi gcarbajosa,

As you mentioned, FastQC determines an approximate level of sequence duplication by storing the first 50bp of the first 200,000 different sequences it encounters in a sequencing file. These duplicated sequences may, for example, be adapter contamination (which would not map at all in Bismark), but could also be duplicate reads that were amplified by PCR during the library construction. These reads might align perfectly well and uniquely to the genome even though they might be technical duplicates.
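To make the effect of the 50bp truncation concrete, here is a rough Python sketch of this kind of estimate. It is not FastQC's actual code: the function name `estimate_duplication`, its parameters, and the way the percentage is defined (reads beyond the first copy of each tracked sequence, divided by all counted reads) are assumptions for illustration only.

```python
from collections import Counter


def estimate_duplication(reads, prefix_len=50, max_tracked=200_000):
    """Approximate percentage of duplicated reads (illustrative only).

    Each read is truncated to its first `prefix_len` bases. Only the first
    `max_tracked` distinct truncated sequences are tracked; reads whose
    truncated sequence first appears after that limit are ignored.
    """
    counts = Counter()
    total = 0
    for read in reads:
        key = read[:prefix_len]
        if key in counts:
            # Another copy of a sequence we are already tracking.
            counts[key] += 1
            total += 1
        elif len(counts) < max_tracked:
            # A new sequence, and there is still room to track it.
            counts[key] = 1
            total += 1
        # Otherwise: a new sequence seen after the limit, so skip it.
    if total == 0:
        return 0.0
    # Every counted read beyond the first copy of its sequence is a duplicate.
    return 100.0 * (total - len(counts)) / total


# Two 100bp reads that share their first 50bp but differ afterwards,
# plus one unrelated read.
reads = ["A" * 50 + "C" * 50, "A" * 50 + "G" * 50, "T" * 100]
full_length = estimate_duplication(reads, prefix_len=100)
truncated = estimate_duplication(reads, prefix_len=50)
```

With the full 100bp compared, none of these three reads counts as a duplicate; with only the first 50bp compared, the first two collapse into one sequence, so the estimate goes up. That is the boost described above for 100bp reads.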

So essentially the number of reads mapping non-uniquely (which are being discarded) and the number of duplicated reads are not the same thing, and Bismark does not specifically output anything regarding duplication levels. I hope this helps?


Apparent duplication levels incongruence between bismark and fastqc with BS-Seq data
