**chadn737** · 04-05-2013, 05:24 PM

1) DESeq2 has just been released; you might want to check it out.
2) Read the DESeq vignette. It is one of the best written out there, and it explains in detail how to go about pre-filtering data.
3) The harsh truth is that you don't always get what you want. If there are only 50 truly differentially expressed genes, trying to increase that number because you want it to be higher is your bias, not what is really going on. The data are what they are. For that matter, 50 genes is a small enough number that you can check potential functions by hand.

**adumitri** · 04-05-2013, 05:52 PM

Hi chadn737,

1, 2) I will have to look into the DESeq2 R package - thanks for pointing this out.
3) In general, I would agree with you. Nevertheless, my question about variance filters came up after I had tried several methods for the DE analysis. While the version of DESeq that I used returned only 50 DE genes, other programs (including edgeR) returned surprisingly more. I like DESeq better, since it does not return genes that I do not trust (and, as you said, the vignette is amazing), but at the same time I know that some interesting genes do not pass the FDR threshold because DESeq was designed to be more conservative (as you can see from this reference). I do need to look at DESeq2, though.

Alexandra

**dietmar13** · 04-06-2013, 06:42 AM

A non-parametric approach is probably better for designs with sufficient biological replicates.

RUM - htseq-count - SAMseq (samr package) is my pipeline for clinical samples with many biological replicates.

**joachim.jacob** · 04-08-2013, 03:23 AM

adumitri,

As mentioned in your title and in the DESeq vignette, non-specific filtering of genes on certain features, to reduce the number of tests carried out, works very well.

I tested several variables to filter on; the total read count per gene worked best for me, but you can test anything that comes to mind. Remove increasing percentiles from your dataset in small steps. In my case, the maximum number of significant genes was ~15% higher than without any filtering. Looping through all the calculations does take a fair amount of time, however. Be careful when choosing the variable to filter on: it should not be correlated with the hypothesis you are testing (hence, non-specific).
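To make the percentile sweep concrete, here is a minimal Python sketch rather than the actual DESeq workflow. It assumes you already have one raw p-value and one total read count per gene; the function names `bh_adjust` and `sweep_filter`, the `alpha` of 0.1 and the toy data are made up for illustration. For each fraction of lowest-count genes removed, it re-applies the Benjamini-Hochberg correction to the remaining p-values and counts how many pass.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def sweep_filter(total_counts, pvalues, alpha=0.1, step=0.05, max_fraction=0.95):
    """Return (fraction removed, number of genes with adjusted p < alpha) pairs."""
    n = len(pvalues)
    # Filter statistic: total read count per gene, lowest first.
    by_count = sorted(range(n), key=lambda i: total_counts[i])
    results = []
    k = 0
    while k * step <= max_fraction + 1e-9:
        fraction = k * step
        kept = by_count[int(round(fraction * n)):]
        adj = bh_adjust([pvalues[i] for i in kept])
        results.append((round(fraction, 2), sum(a < alpha for a in adj)))
        k += 1
    return results


# Toy data (hypothetical): 90 low-count genes with large p-values,
# 10 high-count genes with small p-values.
toy_counts = list(range(1, 91)) + [1000] * 10
toy_pvalues = [0.9] * 90 + [0.02] * 10
for fraction, n_significant in sweep_filter(toy_counts, toy_pvalues, step=0.2, max_fraction=0.8):
    print(fraction, n_significant)
```

The sketch only re-runs the multiple-testing correction, not the DE test itself, so it is cheap; it is only valid if the filter statistic is unrelated to the test statistic under the null, which is the precaution mentioned above.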

A typical increase in the number of significant genes for DESeq: see https://dl.dropbox.com/u/18352887/sweet_spot_deseq.png

For your pathway enrichment, I advise you to use the Piano package in R. See http://nar.oxfordjournals.org/conten.../26/nar.gkt111

You can provide the complete list of p-values as assigned by DESeq (without applying your cutoff) to Piano, and let Piano run several gene set enrichment algorithms on it to assign a consensus score to the pathways.
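As a rough illustration of what a consensus across several enrichment methods can look like, the Python sketch below averages each pathway's rank over the methods. This is not Piano's API or its exact scoring; the function name `consensus_ranks`, the method names and the p-values are all hypothetical, and it assumes every method scored the same set of pathways.

```python
def consensus_ranks(scores_by_method):
    """Rank gene sets within each method by p-value, then sort by mean rank.

    scores_by_method: {method_name: {gene_set_name: p_value}}
    Ties within a method are broken by input order (a simplification).
    """
    rank_sums = {}
    for scores in scores_by_method.values():
        ordered = sorted(scores, key=lambda gene_set: scores[gene_set])
        for rank, gene_set in enumerate(ordered, start=1):
            rank_sums[gene_set] = rank_sums.get(gene_set, 0) + rank
    n_methods = len(scores_by_method)
    mean_rank = {gs: total / n_methods for gs, total in rank_sums.items()}
    # Lowest mean rank first: most consistently top-ranked across methods.
    return sorted(mean_rank.items(), key=lambda item: item[1])


# Hypothetical p-values from two enrichment methods for three pathways.
example = {
    "method_a": {"pathway_x": 0.01, "pathway_y": 0.20, "pathway_z": 0.50},
    "method_b": {"pathway_x": 0.03, "pathway_y": 0.60, "pathway_z": 0.10},
}
for gene_set, rank in consensus_ranks(example):
    print(gene_set, rank)
```

A pathway that ranks near the top in every method ends up with a low mean rank, which is the intuition behind trusting the consensus more than any single algorithm.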

Hope this helps.


RNA-Seq variance-based filter before differential expression analysis
