**john_nl** · 07-26-2012, 05:14 AM

Originally posted by Rachel Hillmer

Hello,

Shouldn't this be the number of reads NOT mapping to the upper quartile? My understanding is that the bad behavior (titrating out the bulk of the reads because of a few highly overrepresented sequences in one sample) can be corrected for by IGNORING the upper quartile.

~Rachel

Glad I'm not the only one who thinks this. I'm sure there is an explanation, but at the moment it does not seem intuitive to me.

**glados** · 08-21-2012, 11:56 PM

Wondering this as well.

**jk1124** · 10-25-2012, 01:35 PM

I am also confused by the explanation of upper quartile normalization on the Cufflinks page (i.e. adjusting for highly overexpressed genes), and would appreciate any insight on that. The paper the authors reference (Bullard 2010 BMC Bioinformatics) makes more sense, I think.

Basically, upper quartile normalization gets rid of the long tail in the distribution of read counts that occurs due to the "preponderance of zero and low-count genes." So it seems that this kind of normalization removes some of the sequencing noise.

It makes sense that an FPKM would be inflated with upper quartile normalization then, because you are basically dividing by a smaller denominator (upper quartile < total reads).

Please let me know if this is plausible reasoning, since I am new to this.
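The denominator argument above can be illustrated with a minimal Python sketch. It assumes the common FPKM form (fragments per kilobase of transcript per million units of the denominator) and uses made-up per-locus counts. The helper `fpkm` and all numbers are hypothetical, and this is not a reproduction of what Cufflinks does internally.

```python
import numpy as np

def fpkm(fragments, length_bp, denominator):
    """FPKM-style value: fragments per kilobase per million 'denominator' units."""
    return fragments * 1e9 / (length_bp * denominator)

# Hypothetical per-locus fragment counts (illustrative only, not real data).
counts = np.array([0, 0, 5, 20, 80, 300, 1200, 5000, 40000, 2000000])
lengths = np.full(counts.shape, 2000)  # assume every locus is 2 kb long

total = counts.sum()                        # classic denominator: all mapped fragments
upper_quartile = np.percentile(counts, 75)  # alternative: 75th percentile of per-locus counts

fpkm_total = fpkm(counts, lengths, total)
fpkm_uq = fpkm(counts, lengths, upper_quartile)

# Every non-zero value is scaled by the same factor, total / upper_quartile.
inflation = total / upper_quartile
print(f"total={total}, upper quartile={upper_quartile}, inflation={inflation:.1f}x")
```

Because the same factor applies to every locus, the relative ordering of loci within a sample is unchanged; only the absolute scale of the values shifts.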

**HESmith** · 10-26-2012, 08:18 AM

jk1124,

Your reasoning is not flawed, but (unless I'm missing something) the only way to increase FPKM by four orders of magnitude would be if the upper quartile read count constitutes only 1/10000 of the total read count. That seems unlikely.

Also, the distribution tail of the data would not include zero-count genes.
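The four-orders-of-magnitude condition mentioned above can be written out explicitly. As a sketch, assume the only change between the two modes is the denominator, with $N_{\text{total}}$ the total fragment count and $Q_{75}$ the upper quartile value (both symbols are introduced here for illustration):

$$
\frac{\mathrm{FPKM}_{\text{UQ}}}{\mathrm{FPKM}_{\text{total}}} = \frac{N_{\text{total}}}{Q_{75}},
\qquad
\frac{N_{\text{total}}}{Q_{75}} = 10^{4} \iff Q_{75} = \frac{N_{\text{total}}}{10^{4}}
$$

For a hypothetical library with $N_{\text{total}} = 2 \times 10^{7}$ fragments, this would require $Q_{75} = 2 \times 10^{3}$. Whether that is plausible depends on whether $Q_{75}$ is a sum of counts over many loci or the count at a single locus.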

**Richard Barker** · 10-30-2012, 01:54 PM

So would you recommend that we normalize our data by the upper quartile of the number of fragments mapping to individual loci when running Cufflinks? Or should one just omit this option?

**john_nl** · 10-31-2012, 01:46 AM

The upper quartile normalisation method essentially just uses the count value at the 75th percentile as the denominator.
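A rough sketch of picking that 75th-percentile value with NumPy is below. The function name `upper_quartile` and the `drop_zeros` switch are hypothetical. Whether zero-count loci are excluded before taking the percentile is an assumption that varies by method and is not settled in this thread, so both variants are shown.

```python
import numpy as np

def upper_quartile(counts, drop_zeros=False):
    """Return the 75th percentile of per-locus counts.

    drop_zeros is a hypothetical switch: some descriptions of the method
    exclude zero-count loci first, which raises the resulting value.
    """
    counts = np.asarray(counts, dtype=float)
    if drop_zeros:
        counts = counts[counts > 0]
    return float(np.percentile(counts, 75))

# Made-up counts with many zero and low-count loci.
counts = [0, 0, 0, 0, 2, 4, 10, 50]
uq_all = upper_quartile(counts)                       # zeros included
uq_nonzero = upper_quartile(counts, drop_zeros=True)  # zeros excluded
print(uq_all, uq_nonzero)
```

With many zero-count loci, the two variants can differ substantially, so it is worth checking which one a given tool uses before comparing normalized values across tools.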

Also, for people thinking about normalization methods I would recommend this article:

A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. (2012) Brief Bioinform

Why does quartile normalization inflate my FPKM values by ~4 orders of magnitude?
