Dear all,
I seek your help with choosing and validating normalization method(s) for RNA-Seq time-course data for my work.
My data set has 9 time points, with 4 replicates per time point.
I want to extract co-expressed genes based on their shared expression profiles over time. So I am NOT asking you how to perform pair-wise DE gene identification.
I know there have been multiple posts on the topic of RNA-Seq data normalization. This is my 1st post here, so at the cost of being repetitive with some of my questions, and irking some or all of you, here I go:
1. For my purposes I am assuming that raw mapped count data needs to first be normalized, right?
2. Should I test different methods of data normalization of my raw, mapped counts? Like TMM, quantile etc.?
3. Strictly speaking, should the choice of normalization method be justified through some measure or test, or is it the norm to try out different methods?
4. Do both edgeR and DESeq offer different built-in methods of data normalization applicable for time-course data (NOT pair-wise comparisons)?
5. Will normalization have to be performed with respect to a reference data point, let's say time point zero (which makes intuitive and biological sense to me),
OR
are there variants of normalization that can normalize data across time without explicitly choosing a reference (such a method, if it exists, does not make intuitive or biological sense to me)?
6. What is the best place for someone like myself, new to bio-statistics and the R environment, to quickly learn tricks of the trade?
Lots of questions, I know; hoping this forum can help out a poor, starving grad student.
Thanks a ton.
Wishing you all happy holidays and a fantastic 2012!
AksR
-----------------
CTTATTGTTGAACTTOAATGGTGCTAATGATCCTCGTOTCTCCTGAACGT
(translate THAT!)