Hello,
I am new to this, and have been unable to find questions/advice related to my situation, so I hope someone can provide some insight.
I have RNA-seq data that I have processed with the following simplified pipeline:
fastq --> bowtie2 (mapped to reference transcriptome) --> eXpress (outputs count data and FPKM) --> limma (count data from eXpress, weighted voom transformation, which gives normalized log2 counts with associated precision weights) --> DE transcripts (with log2FC, Avg. Expr, P.Value, etc.)
The data is from a time-course injury experiment, so I have 0 hr (uninjured), 1 day post-injury, and 2 days post-injury. For each time-point I have 3 replicates. One of the 1 day samples and one of the 2 day samples look to be outliers, so I have down-weighted them in limma using the weighted voom transformation. I really would like to NOT throw away any data, so I kept them.
I would like to perform hierarchical clustering across the time-points and originally wanted to use FPKM values from eXpress, but realized that these values are not weighted and not normalized. Because some of the samples might be outliers, I thought this might cause an issue in the clustering. So I would like to use normalized, weighted values if possible.
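In case it helps to show what I mean by clustering on "normalized, weighted" values, here is a rough sketch. It is in Python rather than R only for illustration; the numbers are made up (not my data), the function and transcript names are my own, and I assume a single weight per sample, whereas as far as I understand the weighted voom output has a weight for every observation. It collapses the replicates to a weighted mean per time-point, centers each transcript, and then runs a naive average-linkage clustering.

```python
from math import sqrt

def weighted_timepoint_means(values, weights, groups):
    """Weighted mean of one transcript's log2 values within each time-point."""
    order, num, den = [], {}, {}
    for v, w, g in zip(values, weights, groups):
        if g not in num:
            order.append(g)          # keep time-points in first-seen order
            num[g], den[g] = 0.0, 0.0
        num[g] += w * v
        den[g] += w
    return [num[g] / den[g] for g in order]

def center(profile):
    """Subtract the transcript's own mean so clustering follows the shape over time."""
    m = sum(profile) / len(profile)
    return [x - m for x in profile]

def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, n_clusters):
    """Naive agglomerative clustering; returns lists of row indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # average pairwise distance between the two clusters
                d = sum(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy design: 3 replicates per time-point (hypothetical values).
groups = ["0h"] * 3 + ["1d"] * 3 + ["2d"] * 3
# Hypothetical sample weights: one 1 day and one 2 day replicate down-weighted.
sample_weights = [1.0, 1.0, 1.0, 1.0, 1.0, 0.1, 1.0, 1.0, 0.1]
log2_expr = {
    "tx_up_a":   [2.0, 2.1, 1.9, 4.0, 4.2, 9.0, 5.0, 5.1, 4.9],
    "tx_up_b":   [3.0, 3.1, 2.9, 5.1, 4.9, 5.0, 6.0, 6.1, 5.9],
    "tx_down_a": [6.0, 6.1, 5.9, 4.0, 4.1, 3.9, 3.0, 3.1, 2.9],
    "tx_down_b": [5.0, 5.2, 4.8, 3.0, 3.1, 2.9, 2.0, 2.1, 1.9],
}

names = list(log2_expr)
profiles = [center(weighted_timepoint_means(log2_expr[n], sample_weights, groups))
            for n in names]
clusters = average_linkage(profiles, n_clusters=2)
for members in clusters:
    print([names[i] for i in members])
```

With this approach every transcript gets a value at all three time-points, including 0 hr, which is part of why I am unsure whether log2FC is the right input.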
My questions are:
1. Which values would be best to use:
a. mean FPKM from eXpress (non-normalized, non-weighted)
b. log2FC from limma (normalized, weighted)
c. Average expression value from limma (normalized, weighted)
2. If log2FC is suggested, how should I go about clustering since I believe I would only have values for the 1 day and 2 day time-points?
Thank you,
Chris