What is the function of the MA-plot: is it the fold change?
The graph shows that relationship in the MA-plot, no?
thanks
-
Originally posted by Sol:
But what does RPKM mean?
thanks
-
Thanks.
And the RPKM: how do I normalize the data?
Do I divide the number of reads by the size of the gene and then divide by the total number of reads?
How is it calculated?
Thanks
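For reference, the calculation described in this question can be written out as a short sketch. It uses the standard RPKM definition (reads per kilobase of gene length per million mapped reads); the function name `rpkm` and all the numbers are hypothetical and only illustrate the arithmetic, they are not taken from DEGseq.

```r
# RPKM = reads per kilobase of gene length per million mapped reads.
# Equivalent to: counts * 1e9 / (gene length in bp * total mapped reads).
rpkm <- function(counts, gene_length_bp, total_mapped_reads) {
  counts * 1e9 / (gene_length_bp * total_mapped_reads)
}

# Hypothetical values for illustration only.
counts <- c(g1 = 45, g2 = 103)              # reads aligned to each gene
gene_length_bp <- c(g1 = 1500, g2 = 2000)   # gene lengths in base pairs
total_mapped_reads <- 2e6                   # all mapped reads in the library

rpkm_values <- rpkm(counts, gene_length_bp, total_mapped_reads)
print(rpkm_values)
```

So the description in the question is right in outline: divide by gene length and by the total read count, then scale by 1e9 so the units come out per kilobase and per million reads.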
-
Originally posted by Sol:
But what data is the z-score based on? And what is the difference between the q-value and the p-value? I don't understand.
Thanks
Search "Z-score" for details.
q-value is a kind of corrected p-value for multiple testing. Please refer to Section 2.3 of our DEGseq paper:
"2.3 Multiple testing correction
For the above methods, the P-values calculated for each gene are adjusted to Q-values for multiple testing corrections by two alternative strategies (Benjamini and Hochberg, 1995; Storey and Tibshirani, 2003). Users can set either a P-value or a false discovery rate (FDR) threshold to identify differentially expressed genes."
If it is still unclear, please let me know. Thanks.
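As a rough illustration of the p-value to q-value adjustment, here is a minimal sketch using base R's `p.adjust` with the Benjamini-Hochberg method. The p-values are made up, and this is not necessarily how DEGseq computes its q-value columns internally; the Storey and Tibshirani variant needs a separate package and is not shown.

```r
# Hypothetical p-values for five genes (illustration only).
p_values <- c(geneA = 0.0001, geneB = 0.004, geneC = 0.019,
              geneD = 0.03, geneE = 0.2)

# Benjamini-Hochberg adjustment: each p-value is scaled by
# (number of tests / rank), then made monotone across ranks.
q_values <- p.adjust(p_values, method = "BH")

# Compare which genes pass the same cutoff before and after adjustment.
cutoff <- 0.01
comparison <- data.frame(
  p = p_values,
  q = q_values,
  pass_p = p_values < cutoff,
  pass_q = q_values < cutoff
)
print(comparison)
```

A q-value is never smaller than the p-value it was adjusted from, so a q-value cutoff is at least as strict as the same cutoff applied to raw p-values.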
-
But what data is the z-score based on? And what is the difference between the q-value and the p-value? I don't understand.
Thanks
-
Originally posted by Sol:
Hi
I ran the DEGexp program from DEGseq. The output file contains a table with values for log2 fold change, z-score, p-value, q-value and signature (p-value < 0.001). How do I interpret gene upregulation and downregulation? What does each column mean? The input file contained RPKM values and genes. I have no replicates, but I'm comparing two conditions.
Thanks
In the output file, there are two columns for fold change: "log2(Fold_change)" and "log2(Fold_change) normalized". log2(Fold_change) = log2(value1/value2), and the normalized column is computed the same way from the normalized value1 and value2. From the fold change you can judge whether a gene is up-regulated or down-regulated. For example, if a gene has log2(Fold_change) > 0, then value1 > value2; if its signature is also TRUE, the gene is significantly down-regulated in condition 2 relative to condition 1. You can also look at the z-scores.
Hope this helps.
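To make the sign convention concrete, here is a small sketch in base R. The expression values and the labels are hypothetical; it only reproduces log2(value1/value2) and the direction rule described above, and it says nothing about significance, which comes from the signature column.

```r
# Hypothetical expression values in condition 1 and condition 2.
value1 <- c(geneA = 200, geneB = 50, geneC = 80)
value2 <- c(geneA = 50, geneB = 200, geneC = 80)

# Same definition as the "log2(Fold_change)" column: log2(value1 / value2).
log2_fc <- log2(value1 / value2)

# Positive: value1 > value2, so expression is lower in condition 2.
# Negative: value1 < value2, so expression is higher in condition 2.
direction <- ifelse(log2_fc > 0, "down in condition 2",
                    ifelse(log2_fc < 0, "up in condition 2", "unchanged"))

print(data.frame(log2_fc, direction))
```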
-
Hi
I ran the DEGexp program from DEGseq. The output file contains a table with values for log2 fold change, z-score, p-value, q-value and signature (p-value < 0.001). How do I interpret gene upregulation and downregulation? What does each column mean? The input file contained RPKM values and genes. I have no replicates, but I'm comparing two conditions.
Thanks
-
Hi Xi Wang,
Thanks for your software. I am wondering what the difference is between DESeq and DEGseq?
-
Originally posted by osvaldoreis:
Hey all, I've just started using DEGseq and I'm having some trouble. I have many libraries without replicates and I want to look at differential expression over the time course of a disease. I have Solexa RNA-seq reads (MP, ~35 bp), the genome, and a gene prediction for this species. I aligned the reads to the predicted genes using bowtie, then wrote a script to produce a file like this:
"Gene" "LB1" "LB2"
"g1" 45 103
... ... ...
where the numbers are how many reads align with that gene for each library. Then I ran DEGexp like this:
library(DEGseq)
geneExpMatrix1 <- readGeneExp(file="/root/rnaseq/Analise_diff_Expr/GPB1_X_DPB1/reads_por_gene.txt", geneCol=1, valCol=c(2))
geneExpMatrix2 <- readGeneExp(file="/root/rnaseq/Analise_diff_Expr/GPB1_X_DPB1/reads_por_gene.txt", geneCol=1, valCol=c(3))
layout(matrix(c(1,2), 3, 2, byrow=TRUE))
par(mar=c(2, 2, 2, 2))
DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=2, groupLabel1="GPB1", geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=2, groupLabel2="DPB1", method="MARS", outputDir="/root/rnaseq/Analise_diff_Expr/GPB1_X_DPB1/", pValue=1e-2, thresholdKind=1)
I got results without errors but the results have a lot of differentially expressed genes. I have ~18k genes and ~12k are differentially expressed. Looking at the file I can see that many of them aren't significant. I wanted to know if I'm doing something wrong?
Thanks for any help!
Thanks for using DEGseq for differential expression analysis.
As your samples have no biological replicates (right?), the statistical model in DEGseq only captures the measurement uncertainty of the RNA-seq technology. So some genes picked up as differentially expressed may really differ between the samples (for reasons such as individual differences) without that difference being biologically meaningful. As is often said, statistical significance does not equal biological significance.
Another thing you may try is a more stringent p-value (or q-value) cutoff, say by specifying pValue=1e-3. Better still, use thresholdKind=3 or thresholdKind=4. The q-values are adjusted from the p-values for multiple testing correction.
Hope this helps.
Last edited by Xi Wang; 10-01-2010, 01:05 AM.
-
Starting to use DEGseq
Hey all, I've just started using DEGseq and I'm having some trouble. I have many libraries without replicates and I want to look at differential expression over the time course of a disease. I have Solexa RNA-seq reads (MP, ~35 bp), the genome, and a gene prediction for this species. I aligned the reads to the predicted genes using bowtie, then wrote a script to produce a file like this:
"Gene" "LB1" "LB2"
"g1" 45 103
... ... ...
where the numbers are how many reads align with that gene for each library. Then I ran DEGexp like this:
library(DEGseq)
geneExpMatrix1 <- readGeneExp(file="/root/rnaseq/Analise_diff_Expr/GPB1_X_DPB1/reads_por_gene.txt", geneCol=1, valCol=c(2))
geneExpMatrix2 <- readGeneExp(file="/root/rnaseq/Analise_diff_Expr/GPB1_X_DPB1/reads_por_gene.txt", geneCol=1, valCol=c(3))
layout(matrix(c(1,2), 3, 2, byrow=TRUE))
par(mar=c(2, 2, 2, 2))
DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=2, groupLabel1="GPB1", geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=2, groupLabel2="DPB1", method="MARS", outputDir="/root/rnaseq/Analise_diff_Expr/GPB1_X_DPB1/", pValue=1e-2, thresholdKind=1)
I got results without errors but the results have a lot of differentially expressed genes. I have ~18k genes and ~12k are differentially expressed. Looking at the file I can see that many of them aren't significant. I wanted to know if I'm doing something wrong?
Thanks for any help!
-
Originally posted by sma:
Dear WANG Xi,
I am running into troubles using DEGseq. Admittedly, I am a novice to analyzing this kind of data and using R---in fact I only loaded R just yesterday in order to run your DEGseq package.
I am trying to compare three separate sets of 454 data (from 3 stages of development). When we get the data back from our company in Shanghai, it is complete with the annotations and the reads are already counted. There are no replicates (just 3 samples). So, I think I can simply use DEGexp with the MARS method.
I have followed your recent suggestions to "(1) download the most recent version with its “reference manual”, (2) following the examples in the manual, apply DEGseq’s functions to the test data, (3) replace the data set, and apply DEGseq to your own data." I can successfully run the test data without problem. I can also replace the data with my own data and I can successfully show the filepath to my data (so I know I have entered it correctly). However, when I try to set up the geneExpMatrix, I run into problems. I always get an error that says :
"Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 10 did not have 4 elements"
I've tried everything I can think of and I cannot seem to remedy this problem. Any suggestion?
Thank you!
Thanks for using DEGseq. I suspect your problem is caused by the format of the input file used to build geneExpMatrix. Could you paste the first 15 lines of the input file here, or email them to me at [email protected]? Thanks.
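One way to check the file before pasting it is sketched below, using only base R (`readLines` and `count.fields`). The file contents here are a hypothetical tab-delimited example written to a temporary file; with real data you would point `example_file` at your own path and set `sep` to whatever delimiter your file actually uses.

```r
# Write a small hypothetical tab-delimited file in which one line is short.
example_file <- tempfile(fileext = ".txt")
writeLines(c(
  "Gene\tS1\tS2\tS3",
  "g1\t45\t103\t12",
  "g2\t7\t9",          # this line is missing a field
  "g3\t0\t4\t8"
), example_file)

# Show the first 15 lines (or fewer) exactly as they appear in the file.
writeLines(readLines(example_file, n = 15))

# Count the fields on every line and report lines that differ from the header.
fields_per_line <- count.fields(example_file, sep = "\t", quote = "\"")
expected <- fields_per_line[1]
bad_lines <- which(fields_per_line != expected)
print(bad_lines)
```

An error such as "line 10 did not have 4 elements" usually means that line has a different number of fields than the first one, for example because of a missing value or an unquoted space inside an annotation.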
-
Dear WANG Xi,
I am running into troubles using DEGseq. Admittedly, I am a novice to analyzing this kind of data and using R---in fact I only loaded R just yesterday in order to run your DEGseq package.
I am trying to compare three separate sets of 454 data (from 3 stages of development). When we get the data back from our company in Shanghai, it is complete with the annotations and the reads are already counted. There are no replicates (just 3 samples). So, I think I can simply use DEGexp with the MARS method.
I have followed your recent suggestions to "(1) download the most recent version with its “reference manual”, (2) following the examples in the manual, apply DEGseq’s functions to the test data, (3) replace the data set, and apply DEGseq to your own data." I can successfully run the test data without problem. I can also replace the data with my own data and I can successfully show the filepath to my data (so I know I have entered it correctly). However, when I try to set up the geneExpMatrix, I run into problems. I always get an error that says :
"Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 10 did not have 4 elements"
I've tried everything I can think of and I cannot seem to remedy this problem. Any suggestion?
Thank you!