**jwfoley** · 04-03-2014, 06:59 AM

The tables are hard to read and this logic is hard to follow:

Since the observed duplication rate in the merged sample is only slightly higher, I conclude that the majority of original reads marked as duplicate really are PCR duplicates. And that the 'false PCR duplicates' rate is only about 3%, given this library preparation.

Can you elaborate?

**dpryan** · 04-03-2014, 11:03 AM

I'm not sure that Picard's MarkDuplicates command respects read groups and, if not, I wouldn't conclude anything from your test. Having said that, high alleged duplication rates are expected in RNAseq due to highly expressed species (this is also why you don't bother marking duplicates in RNAseq unless you're looking for edit sites).

**saturatedfunk** · 04-06-2014, 05:54 AM

Sorry for the tables; they looked OK before I posted.
I will attempt to simplify the question and describe my logic.

Replicate 1 had 15.5 million duplicates, which represented roughly 50% of total reads.
Replicate 2 had 20.3 million duplicates, which represented roughly 50% of total reads.

Assuming:
1. These are truly PCR duplicates
2. There is 0% PCR duplication between the two runs (since they were PCR'd separately)

The assumption in #1 is a bit naive and has been discussed exhaustively; I do not intend to rehash it here. The assumption in #2 is true by definition.

I expect to find (at least) 15.5 + 20.3 = 35.8 million duplicates in the merged file.
I found 36.6 million duplicates in the merged file.

Therefore, 1 - (35.8/36.6) = 2% of the duplicates in the merged file are reads that share a read start site between replicate 1 and replicate 2 but are not truly PCR duplicates. I refer to these 2% as "false PCR duplicates".
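To make the arithmetic above easy to check, here is a minimal Python sketch using only the counts quoted in this post. The variable names are made up for illustration, and the calculation relies on the same two assumptions listed above; it is not output from Picard or any other tool.

```python
# Duplicate counts in millions of reads, as quoted in this post.
dups_rep1 = 15.5
dups_rep2 = 20.3
dups_merged_observed = 36.6

# Assumption 2: no PCR duplicates span the two runs, so the merged file
# should contain at least the sum of the per-replicate duplicates.
dups_merged_expected = dups_rep1 + dups_rep2

# Duplicates that only appear once the two replicates are merged.
excess = dups_merged_observed - dups_merged_expected
excess_fraction = excess / dups_merged_observed

print(f"expected at least {dups_merged_expected:.1f} M, observed {dups_merged_observed:.1f} M")
print(f"excess: {excess:.1f} M ({excess_fraction:.1%} of merged duplicates)")
```

Note that the denominator here is the number of duplicates in the merged file, not the total number of reads, so the resulting percentage is relative to merged duplicates.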

My question is: can I assume that the false PCR duplication rate in the individual runs is also only about 2%?

interpretation of pcr duplication
