DEGseq - SEQanswers

Xi Wang replied

03-05-2015, 01:38 PM
Originally posted by a0909

Hello Xi,
This is regarding the last line of the quoted answer ("you can look into z-scores"). I would like to know whether Zscore > 0 is equivalent to log2(Fold_change) > 0, implying that genes with negative Zscores are the down-regulated genes in condition 2 (as per the example quoted in your answer).
I would appreciate your help.

Thanks

1) Yes, Zscore > 0 is equivalent to log2(Fold_change) > 0.
2) A negative Zscore means a negative log2(Fold_change), so expression in condition 1 is lower than in condition 2; the gene is thus up-regulated in condition 2.
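
A minimal sketch of this sign convention in base R, assuming made-up counts and hypothetical gene names (nothing here comes from a real DEGseq run). It only computes log2(value1/value2) and reads the direction off the sign; it does not compute z-scores or significance.

```r
# Hypothetical counts: value1 = condition 1, value2 = condition 2.
value1 <- c(geneA = 6, geneB = 68, geneC = 40)
value2 <- c(geneA = 10, geneB = 69, geneC = 10)

# log2 fold change as defined in this thread: log2(value1 / value2).
log2fc <- log2(value1 / value2)

# Direction in condition 2 relative to condition 1, read off the sign.
# Significance is a separate question; this gives the direction only.
direction <- ifelse(log2fc > 0, "down in condition 2",
             ifelse(log2fc < 0, "up in condition 2", "unchanged"))

print(data.frame(value1, value2, log2fc = round(log2fc, 3), direction))
```

A gene with value1 < value2 gets a negative log2 fold change and is labelled as up in condition 2, matching point 2 above.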
a0909 replied

03-05-2015, 08:32 AM
z-scores

Originally posted by Xi Wang

Hi Sol, thanks for using DEGseq.

In the output file, there are 2 columns for fold change: "log2(Fold_change)" and "log2(Fold_change) normalized". log2(Fold_change) = log2(value1/value2), and the normalized version is computed from the normalized value1 and value2. From the fold-change value, you can judge whether a gene is up-regulated or down-regulated. For example, if a gene's log2(Fold_change) > 0, then value1 > value2, and if its signature = TRUE, the gene is significantly down-regulated in condition 2. You can also look into the z-scores.

Hope this helps.

Hello Xi,
This is regarding the last line of the quoted answer ("you can look into z-scores"). I would like to know whether Zscore > 0 is equivalent to log2(Fold_change) > 0, implying that genes with negative Zscores are the down-regulated genes in condition 2 (as per the example quoted in your answer).
I would appreciate your help.

Thanks
Xi Wang replied

06-14-2012, 08:56 PM
Thanks for your question. The q-values are calculated by a function in the 'samr' package, and we didn't change anything regarding the calculation of q-values. You may have to add a small number (say 1e-6) to the q-values to make your volcano plot work.
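
A rough illustration of that workaround in base R, using invented fold changes and q-values (the variable names and numbers are assumptions, not samWrapper output). It shows why q-values reported as 0 break a volcano plot and how a small offset keeps every point finite.

```r
# Hypothetical results; two q-values were reported as exactly 0.
log2fc <- c(-2.1, 0.3, 1.8, -0.2, 2.5)
qvalue <- c(0, 0.40, 0.002, 0.75, 0)

eps <- 1e-6                          # small offset, as suggested above
neg_log10_q <- -log10(qvalue + eps)

# Without the offset, -log10(0) is Inf and those points cannot be drawn.
print(-log10(qvalue))
print(neg_log10_q)

pdf(NULL)                            # null device; drop this line to plot on screen
plot(log2fc, neg_log10_q,
     xlab = "log2(Fold_change)", ylab = "-log10(q-value + 1e-6)",
     main = "Volcano plot (toy data)")
invisible(dev.off())
```

The offset caps the y-axis at about 6 for q-values of 0, so the choice of offset also decides how high the most significant points sit.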
amdic2 replied

06-14-2012, 09:24 AM
Print q-value with SamWrapper

Dear all,
I am using the samWrapper function from DEGseq.
I would like to be able to get the q-values in the output of the method, as I need them in order to make a volcano plot. The problem is that for low q-values (e.g. 10e-4) samWrapper outputs "0". Can anybody help?
Thank you,
Anne-Marie
Xi Wang replied

03-26-2012, 03:53 PM
Originally posted by AsoBioInfo

Thanks Xi for your reply!

The output score data looks like this:
"GeneNames" "value1" "value2" "log2(Fold_change)"
00000000000000 6 10 -0.736 -0.643
11111111111111 68 69 -0.02 0.072
22222222222222 1 1 0 0.095
33333333333333 NA NA NA NA NA NA NA NA FALSE
44444444444444 NA NA NA NA NA NA NA NA FALSE

Note: There are other scores also.

The fold change is calculated for only three rows, even though the matrix contains all the values (printing it outputs the whole matrix). The commands I used are:

library(DEGseq)
geneExpFile <- "D:/data/MyData.txt"
geneExpMatrix1 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,11))
geneExpMatrix2 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,12))
write.table(geneExpMatrix1[1:13,],row.names=FALSE)
write.table(geneExpMatrix2[1:13,],row.names=FALSE)

layout(matrix(c(1,2,3,4,5,6), 3, 2, byrow=TRUE))
par(mar=c(2, 2, 2, 2))
DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=c(2,3,4,5,6), groupLabel1="Label1",
       geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=c(2,3,4,5,6), groupLabel2="Label2",
       method="MARS")

Hope this helps!

Thanks!

Hi, from reading your code, I guess you want to compare gene expression levels between two groups, each with 3 replicates. The expression values for Group1 come from columns 7, 9, 11 of your MyData.txt file, whilst the values for Group2 come from columns 8, 10, 12. Is that right? So far, I understand you are doing a 3 versus 3 comparison. However, in the line starting with DEGexp, it seems you performed a 5 versus 5 comparison, as you listed 5 columns for each group. Perhaps you were confused by "layout". As I said before, layout only formats the output figure and has nothing to do with your data matrix.
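
A small base-R sketch of the column mismatch, using a toy data frame in place of MyData.txt. It assumes, without checking against the package, that readGeneExp returns the gene ID column first followed by the selected value columns; the data and object names are hypothetical.

```r
# Toy stand-in for MyData.txt: a gene ID column plus 11 value columns.
set.seed(1)
raw <- data.frame(GeneNames = sprintf("gene%02d", 1:5),
                  matrix(rpois(5 * 11, lambda = 20), nrow = 5))

# Mimic geneCol = 1 with valCol = c(7, 9, 11) and valCol = c(8, 10, 12).
group1 <- raw[, c(1, 7, 9, 11)]
group2 <- raw[, c(1, 8, 10, 12)]

# Only columns 2..ncol of each selected table hold expression values.
exp_cols <- 2:ncol(group1)
print(exp_cols)

# Columns 5 and 6 do not exist in a 4-column table.
requested <- c(2, 3, 4, 5, 6)
print(requested[requested > ncol(group1)])

# Under the assumption above, the DEGexp call would use
# expCol1 = c(2, 3, 4) and expCol2 = c(2, 3, 4).
```

Checking ncol() of each matrix before calling DEGexp is a cheap way to catch this kind of mismatch.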

Besides, I'd like to make it clear that DEGseq works well with technical replicates from the same experimental manipulation. We showed in our paper that the detection variance in technical replicates can be almost totally explained by Poisson models.
In Hardcastle et al. 2010, DEGseq was shown to perform better than the other tools compared on a real-world dataset (Figure 5 of Hardcastle et al. 2010). The choice of methods/tools is your decision, but you should have a more comprehensive understanding of these tools as well as your data.

If you have any further questions, please let me know.

Ref:
Hardcastle, T.J. and Kelly, K.A. (2010) baySeq: empirical Bayesian methods for identifying differential expression in sequence count data, BMC Bioinformatics, 11, 422.

Last edited by Xi Wang; 03-26-2012, 04:06 PM.
AsoBioInfo replied

03-26-2012, 05:41 AM
Originally posted by ETHANol

Are you analyzing RNA-seq data? If so, the overwhelming opinion of the community is that the Poisson model of DEGseq is invalid and you should use edgeR or DESeq instead.

Thanks for your reply!

Yup.. it is RNA-seq data.... Okay, I'll try DESeq and edgeR

Thanks once again....
AsoBioInfo replied

03-26-2012, 05:38 AM
Originally posted by Xi Wang

Dear Aso, thanks for your questions.

The "layout" call only relates to drawing the DEGseq output plot. Specifically, that command sets up a figure with 6 panels in 3 rows and 2 columns.

For your problem, could you copy and paste the head of your data and your command lines here? Then I will be able to diagnose the issue. Thanks.

Thanks Xi for your reply!

The output score data looks like this:
"GeneNames" "value1" "value2" "log2(Fold_change)"
00000000000000 6 10 -0.736 -0.643
11111111111111 68 69 -0.02 0.072
22222222222222 1 1 0 0.095
33333333333333 NA NA NA NA NA NA NA NA FALSE
44444444444444 NA NA NA NA NA NA NA NA FALSE

Note: There are other scores also.

The fold change is calculated for only three rows, even though the matrix contains all the values (printing it outputs the whole matrix). The commands I used are:

library(DEGseq)
geneExpFile <- "D:/data/MyData.txt"
geneExpMatrix1 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,11))
geneExpMatrix2 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,12))
write.table(geneExpMatrix1[1:13,],row.names=FALSE)
write.table(geneExpMatrix2[1:13,],row.names=FALSE)

layout(matrix(c(1,2,3,4,5,6), 3, 2, byrow=TRUE))
par(mar=c(2, 2, 2, 2))
DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=c(2,3,4,5,6), groupLabel1="Label1",
       geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=c(2,3,4,5,6), groupLabel2="Label2",
       method="MARS")

Hope this helps!

Thanks!
ETHANol replied

03-26-2012, 04:38 AM
Originally posted by AsoBioInfo

Hello,

I have a question regarding DEGseq. I don't understand the syntax of layout:
layout(matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE))

I can see my graphs, but I cannot interpret them. For my data, only three rows were considered and their log fold changes were calculated. For the remaining data, no histogram was built.

The first chunk of code is able to read the whole data, so I think something is wrong only in how I set the layout and matrix.

Thanks for your help!
Aso

Are you analyzing RNA-seq data? If so, the overwhelming opinion of the community is that the Poisson model of DEGseq is invalid and you should use edgeR or DESeq instead.
Xi Wang replied

03-26-2012, 04:13 AM
Originally posted by AsoBioInfo

Hello,

I have a question regarding DEGseq. I don't understand the syntax of layout:
layout(matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE))

I can see my graphs, but I cannot interpret them. For my data, only three rows were considered and their log fold changes were calculated. For the remaining data, no histogram was built.

The first chunk of code is able to read the whole data, so I think something is wrong only in how I set the layout and matrix.

Thanks for your help!
Aso

Dear Aso, thanks for your questions.

The "layout" call only relates to drawing the DEGseq output plot. Specifically, that command sets up a figure with 6 panels in 3 rows and 2 columns.
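
To make the panel arrangement concrete, here is a short base-R sketch that uses only the standard graphics functions layout(), par() and hist(). The random data and panel titles are placeholders; this is not the DEGseq plotting code itself.

```r
pdf(NULL)   # null graphics device so the sketch runs without opening a window

# 3 rows x 2 columns, with panels numbered row by row.
panel_ids <- matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE)
print(panel_ids)

n_panels <- layout(panel_ids)   # layout() returns the number of panels
par(mar = c(2, 2, 2, 2))

# Each high-level plot call fills the next panel, in the order 1 to 6.
for (i in seq_len(n_panels)) {
  hist(rnorm(100), main = paste("Panel", i))
}
invisible(dev.off())
```

The matrix only tells R where each plot goes on the page, so changing it rearranges the figure without touching the expression data.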

For your problem, could you copy and paste the head of your data and your command lines here? Then I will be able to diagnose the issue. Thanks.
AsoBioInfo replied

03-25-2012, 10:45 PM
DEGseq Question

Hello,

I have a question regarding DEGseq. I don't understand the syntax of layout:
layout(matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE))

I can see my graphs, but I cannot interpret them. For my data, only three rows were considered and their log fold changes were calculated. For the remaining data, no histogram was built.

The first chunk of code is able to read the whole data, so I think something is wrong only in how I set the layout and matrix.

Thanks for your help!
Aso
Xi Wang replied

09-22-2011, 09:42 PM
Originally posted by wangleibio

Hi Xi,
I have a problem using DEGseq:
DEGexp(geneExpMatrix1 = geneExpMatrix1, geneCol1 = 1, expCol1 = 2, groupLabel1 = "roottip", geneExpMatrix2 = geneExpMatrix2, geneCol2 = 1, expCol2 = 2, groupLabel2 = "hypocotyl", outputDir = "./roothypocoty", method = "MARS")

Please wait...
gene id column in geneExpMatrix1 for sample1: 1
expression value column(s) in geneExpMatrix1: 2
total number of reads uniquely mapped to genome obtained from sample1: 62747041
gene id column in geneExpMatrix2 for sample2: 1
expression value column(s) in geneExpMatrix2: 2
total number of reads uniquely mapped to genome obtained from sample2: 69469907

method to identify differentially expressed genes: MARS
pValue threshold: 0.001
output directory: ./roothypocoty

Please wait ...
Identifying differentially expressed genes ...
Please wait patiently ...
output ...

Done ...
The results can be observed in directory: ./roothypocoty

Problem:

It produces the output files (outputDir) but does not produce the MA-plot.
Additionally, my two samples do not have replicates.

I hope you can help!
Thanks!
Lei

Thanks for using DEGseq.

To figure out your problem, please try the following:
(1) Run the example provided in the help document. Simply type "?DEGexp" in the R console, and copy/paste the Examples at the end of the document. Then check whether the example works properly.
(2) Run "sessionInfo()" in the R console, and paste the result here or, better, email it to me at "[email protected]".

Thanks.
wangleibio replied

09-22-2011, 09:18 PM
DEGseq problem

Hi Xi,
I have a problem using DEGseq:
DEGexp(geneExpMatrix1 = geneExpMatrix1, geneCol1 = 1, expCol1 = 2, groupLabel1 = "roottip", geneExpMatrix2 = geneExpMatrix2, geneCol2 = 1, expCol2 = 2, groupLabel2 = "hypocotyl", outputDir = "./roothypocoty", method = "MARS")

Please wait...
gene id column in geneExpMatrix1 for sample1: 1
expression value column(s) in geneExpMatrix1: 2
total number of reads uniquely mapped to genome obtained from sample1: 62747041
gene id column in geneExpMatrix2 for sample2: 1
expression value column(s) in geneExpMatrix2: 2
total number of reads uniquely mapped to genome obtained from sample2: 69469907

method to identify differentially expressed genes: MARS
pValue threshold: 0.001
output directory: ./roothypocoty

Please wait ...
Identifying differentially expressed genes ...
Please wait patiently ...
output ...

Done ...
The results can be observed in directory: ./roothypocoty

Problem:

It produces the output files (outputDir) but does not produce the MA-plot.
Additionally, my two samples do not have replicates.

I hope you can help!
Thanks!
Lei
Xi Wang replied

08-04-2011, 10:12 PM
Originally posted by townway

Hi Xi,
My data is time-course data with 6 time points but without replicates. I wonder if I can try your DEGseq.

If not, could you suggest some alternative approaches?

Thank you in advance!

Townway

Sorry Townway, DEGseq is currently not suitable for time-series data. Please try Cufflinks (http://cufflinks.cbcb.umd.edu/) instead. Thanks.
townway replied

08-03-2011, 09:32 PM
Hi Xi,
My data is time-course data with 6 time points but without replicates. I wonder if I can try your DEGseq.

If not, could you suggest some alternative approaches?

Thank you in advance!

Townway
Xi Wang replied

07-26-2011, 06:58 AM
Originally posted by mgolo

Thanks for your reply Xi

I'll try all the methods when I have my annotation file. But what are the criteria to know which one is the best?

Looking forward to your new version of DEGseq!

I think one of the most important criteria is how consistent the detected DEGs are with previous knowledge, though new findings may also be genuine novel discoveries. From a statistical point of view, the best method is one whose assumptions your data do not violate.