DEGseq


  • Xi Wang
    replied
    1) Yes, Zscore > 0 is equivalent to log2(Fold_change) > 0.
    2) A negative Zscore goes with a negative log2(Fold_change), which means expression in condition 1 is lower than in condition 2. Such genes are therefore up-regulated in condition 2, not down-regulated.
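
To make the sign convention concrete, here is a minimal base R sketch. It uses the three count pairs from the example output posted in this thread and assumes log2(Fold_change) = log2(value1/value2). The direction labels are illustrative wording only and say nothing about significance.

```r
# Count pairs taken from the example output rows posted in this thread
value1 <- c(6, 68, 1)    # condition 1
value2 <- c(10, 69, 1)   # condition 2

# Assumed definition: log2(Fold_change) = log2(value1 / value2)
log2_fc <- log2(value1 / value2)

# Positive: higher in condition 1, i.e. down in condition 2.
# Negative: lower in condition 1, i.e. up in condition 2.
direction <- ifelse(log2_fc > 0, "down in condition 2",
             ifelse(log2_fc < 0, "up in condition 2", "unchanged"))

print(data.frame(value1, value2, log2_fc = round(log2_fc, 3), direction))
```

A gene only counts as differentially expressed if it is also flagged as significant in the output; the sign alone gives the direction.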


  • a0909
    replied
    z-scores

    Originally posted by Xi Wang
    Hi Sol, Thanks for using DEGseq.

    In the output file, there are two fold-change columns: "log2(Fold_change)" and "log2(Fold_change) normalized". log2(Fold_change) = log2(value1/value2), and the normalized column is computed from the normalized value1 and value2. From the sign of the fold change you can judge whether a gene is up-regulated or down-regulated. For example, if a gene has log2(Fold_change) > 0, then value1 > value2, and if its signature = TRUE, the gene is significantly down-regulated in condition 2. You can also look into the z-scores.

    Hope this helps.

    Hello Xi,
    My question concerns the last line of the quoted answer ("you can look into z-scores"). I would like to know whether Zscore > 0 is equivalent to log2(Fold_change) > 0, which would imply that genes with negative Zscores are down-regulated in condition 2 (as in the example in your answer).
    I would appreciate your help.

    Thanks


  • Xi Wang
    replied
    Thanks for your question. The q-values are calculated by a function in the 'samr' package, and we did not change anything in that calculation. For the volcano plot, you may have to add a small number (say 1e-6) to the q-values so that values reported as 0 can still be log-transformed.
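
A minimal sketch of that workaround in base R. The data frame and its column names are hypothetical stand-ins for the samWrapper output, not the real headers; only the 1e-6 offset comes from the suggestion above.

```r
# Hypothetical values standing in for samWrapper output; the column names
# are illustrative and not the real output headers.
res <- data.frame(
  log2_fc = c(-2.1, 0.3, 1.8, -0.1),
  q_value = c(0, 0.2, 0, 0.6)   # very small q-values reported as 0
)

eps <- 1e-6                                    # small offset, as suggested
res$neg_log10_q <- -log10(res$q_value + eps)   # finite even where q_value is 0

if (!interactive()) pdf(NULL)                  # null device when run in batch mode
plot(res$log2_fc, res$neg_log10_q,
     xlab = "log2 fold change", ylab = "-log10(q-value + 1e-6)",
     main = "Volcano plot sketch")
if (!interactive()) invisible(dev.off())
```

With this offset, every gene whose q-value is printed as 0 lands at the same height (6) on the y axis, so the choice of offset sets the ceiling of the plot.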


  • amdic2
    replied
    Print q-values with samWrapper

    Dear all,
    I am using the samWrapper function from DEGseq.
    I would like to be able to get the q-values in the output of the method, as I need them in order to make a volcano plot. The problem is that for low q-values (e.g. 10e-4) samWrapper outputs "0". Can anybody help?
    Thank you,
    Anne-Marie


  • Xi Wang
    replied
    Hi, from your code I guess you wanted to compare gene expression levels between two groups, each with 3 replicates. The expression values for Group 1 came from columns 7, 9 and 11 of your MyData.txt file, and the values for Group 2 from columns 8, 10 and 12. Is that right? So far, I understand you set up a 3 versus 3 comparison. However, in the line starting with DEGexp, it seems you asked for a 5 versus 5 comparison, as you listed 5 columns for each group. Perhaps you were confused by "layout". As I said before, layout only formats the output figure and has nothing to do with your data matrix.
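
The column mismatch can be checked without DEGseq. The sketch below uses a made-up 12-column table in place of MyData.txt and assumes that readGeneExp returns the gene id column followed by the selected value columns; under that assumption only columns 2 to 4 exist for each group.

```r
# Stand-in for MyData.txt: one gene id column plus 11 numeric columns
set.seed(1)
my_data <- data.frame(gene = sprintf("g%02d", 1:5),
                      matrix(rpois(55, 20), nrow = 5))

# Mimic geneCol = 1, valCol = c(7, 9, 11)
# (assumption: the result is the gene id column followed by the value columns)
group1 <- my_data[, c(1, 7, 9, 11)]
ncol(group1)                                   # 4: gene id + 3 replicates

valid_exp_cols <- seq_len(ncol(group1))[-1]    # columns that hold values: 2 3 4
requested      <- c(2, 3, 4, 5, 6)             # expCol1 as posted
out_of_range   <- setdiff(requested, valid_exp_cols)
print(out_of_range)                            # 5 6 do not exist in a 3-replicate matrix

# Under the same assumption, a 3 versus 3 call would use
#   expCol1 = c(2, 3, 4) and expCol2 = c(2, 3, 4)
```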

    Besides, I'd like to make it clear that DEGseq works well with technical replicates from the same experimental manipulation. Our paper showed that the variance among technical replicates can be almost entirely explained by Poisson models.
    In Hardcastle and Kelly (2010), DEGseq was shown to perform better than the other tools compared on a real-world dataset (Figure 5 of that paper). The choice of methods and tools is your decision, but it is best to have a more comprehensive understanding of these tools as well as of your data.
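
As a rough illustration of what "explained by a Poisson model" means, the sketch below compares variance to mean on simulated counts. The data are simulated, not real replicates, and the negative binomial contrast (the kind of overdispersion that edgeR and DESeq model) is added here only as an assumption for comparison.

```r
set.seed(42)
mu <- 50

# Simulated "technical replicate" counts of one gene under a Poisson model
pois_counts <- rpois(10000, lambda = mu)
pois_ratio  <- var(pois_counts) / mean(pois_counts)   # close to 1

# For contrast: overdispersed counts (negative binomial, size = 5),
# where the variance is mu + mu^2 / size, well above the mean
nb_counts <- rnbinom(10000, mu = mu, size = 5)
nb_ratio  <- var(nb_counts) / mean(nb_counts)

print(round(c(poisson = pois_ratio, neg_binomial = nb_ratio), 2))
```

Running the same variance-to-mean check on your own replicates is one quick way to see whether the Poisson assumption is plausible for your data.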

    If you have any further questions, please let me know.

    Ref:
    Hardcastle, T.J. and Kelly, K.A. (2010) baySeq: empirical Bayesian methods for identifying differential expression in sequence count data, BMC Bioinformatics, 11, 422.
    Last edited by Xi Wang; 03-26-2012, 04:06 PM.


  • AsoBioInfo
    replied
    Originally posted by ETHANol
    Are you analyzing RNA-seq data? If so, the overwhelming opinion of the community is that the Poisson model of DEGseq is invalid and you should use edgeR or DESeq instead.

    Thanks for your reply!

    Yes, it is RNA-seq data. Okay, I'll try DESeq and edgeR.

    Thanks once again....


  • AsoBioInfo
    replied
    Thanks Xi for your reply!

    The output score data looks like this:
    "GeneNames" "value1" "value2" "log2(Fold_change)"
    00000000000000 6 10 -0.736 -0.643
    11111111111111 68 69 -0.02 0.072
    22222222222222 1 1 0 0.095
    33333333333333 NA NA NA NA NA NA NA NA FALSE
    44444444444444 NA NA NA NA NA NA NA NA FALSE

    Note: There are other scores also.

    The fold change is calculated for only three rows, although the matrix contains all the values (printing it shows the whole matrix). The commands I used are:

    library(DEGseq)
    geneExpFile <- "D:/data/MyData.txt"
    geneExpMatrix1 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(7,9,11))
    geneExpMatrix2 <- readGeneExp(file=geneExpFile, geneCol=1, valCol=c(8,10,12))
    write.table(geneExpMatrix1[1:13,],row.names=FALSE)
    write.table(geneExpMatrix2[1:13,],row.names=FALSE)

    layout(matrix(c(1,2,3,4,5,6), 3, 2, byrow=TRUE))
    par(mar=c(2, 2, 2, 2))
    DEGexp(geneExpMatrix1=geneExpMatrix1, geneCol1=1, expCol1=c(2,3,4,5,6), groupLabel1="Label1",
    geneExpMatrix2=geneExpMatrix2, geneCol2=1, expCol2=c(2,3,4,5,6), groupLabel2="Label2",
    method="MARS")

    Hope this helps!

    Thanks!


  • ETHANol
    replied
    Are you analyzing RNA-seq data? If so, the overwhelming opinion of the community is that the Poisson model of DEGseq is invalid and you should use edgeR or DESeq instead.


  • Xi Wang
    replied
    Dear Aso, thanks for your questions.

    The "layout" call only affects how the DEGseq output plot is drawn. Specifically, that command line generates a figure with 6 panels arranged in 3 rows and 2 columns.

    For your problem, could you copy and paste the first few lines of your data and your command lines here? That way I will be able to diagnose the issue. Thanks.
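
A minimal base R sketch of what that layout call does, independent of DEGseq. The six placeholder plots are illustrative; only the layout() and par() calls come from the thread.

```r
if (!interactive()) pdf(NULL)          # null device when run in batch mode

# 3 rows x 2 columns, filled row by row: panel 1 top-left, panel 6 bottom-right
panel_ids <- matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE)
print(panel_ids)

n_panels <- layout(panel_ids)          # layout() returns the number of panels
par(mar = c(2, 2, 2, 2))               # small margins so six panels fit

# Each new plot fills the next panel; the data being plotted are unaffected
for (i in seq_len(n_panels)) {
  plot(1:10, main = paste("Panel", i))
}

if (!interactive()) invisible(dev.off())
```

The matrix passed to layout() holds panel numbers, not expression data, so it has no influence on which rows or columns of the gene expression matrix are analyzed.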


  • AsoBioInfo
    replied
    DEGseq Question

    Hello,

    I have a question regarding DEGseq. I do not understand the syntax of layout:
    layout(matrix(c(1, 2, 3, 4, 5, 6), 3, 2, byrow = TRUE))

    I can see my graphs, but I cannot interpret anything from them. For my data, only three rows were considered and their log fold changes calculated; for the remaining data, no histogram was built.

    The first chunk of code reads the whole data set, so I think the only thing wrong is how I set the layout and the matrix.

    Thanks for your help!
    Aso


  • Xi Wang
    replied
    Thanks for using DEGseq.

    To figure out your problem, please try the following:
    (1) Run the example provided in the help document. Simply type "?DEGexp" in the R console, copy and paste the Examples at the end of the document, and check whether the example works properly.
    (2) Run "sessionInfo()" in the R console and paste the result here or, better, email it to me at "[email protected]".

    Thanks.
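
A small sketch of how the two diagnostic steps could be run from a script. The capture.output() wrapper and the installation check are conveniences assumed here, not part of the original instructions.

```r
# Step 2: collect the session details as plain text, ready to paste into a reply
info <- capture.output(sessionInfo())
cat(info, sep = "\n")

# Step 1 needs DEGseq to be installed, so check for it first
if (requireNamespace("DEGseq", quietly = TRUE)) {
  print(utils::packageVersion("DEGseq"))
  # Uncomment to run the examples shipped with the DEGexp help page:
  # example("DEGexp", package = "DEGseq")
} else {
  message("DEGseq is not installed in this R library")
}
```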


  • wangleibio
    replied
    DEGseq problem

    Hi Xi,
    I have a problem using DEGseq:
    DEGexp(geneExpMatrix1 = geneExpMatrix1, geneCol1 = 1, expCol1 = 2, groupLabel1 = "roottip",
           geneExpMatrix2 = geneExpMatrix2, geneCol2 = 1, expCol2 = 2, groupLabel2 = "hypocotyl",
           outputDir = "./roothypocoty", method = "MARS")

    Please wait...
    gene id column in geneExpMatrix1 for sample1: 1
    expression value column(s) in geneExpMatrix1: 2
    total number of reads uniquely mapped to genome obtained from sample1: 62747041
    gene id column in geneExpMatrix2 for sample2: 1
    expression value column(s) in geneExpMatrix2: 2
    total number of reads uniquely mapped to genome obtained from sample2: 69469907

    method to identify differentially expressed genes: MARS
    pValue threshold: 0.001
    output directory: ./roothypocoty

    Please wait ...
    Identifying differentially expressed genes ...
    Please wait patiently ...
    output ...

    Done ...
    The results can be observed in directory: ./roothypocoty



    Problem:

    It produces the output directory (outputDir), but it does not produce the MA-plot.
    Additionally, my two samples do not have replicates.

    I hope you can help!
    Thanks!
    Lei


  • Xi Wang
    replied
    Sorry Townway, DEGseq is currently not suitable for time series data. Please try Cufflinks (http://cufflinks.cbcb.umd.edu/) instead. Thanks.


  • townway
    replied
    Hi Xi,
    My data are time-course data with 6 time points but without replicates. I wonder if I can try your DEGseq.

    If not, could you suggest some alternative approaches?

    Thank you in advance!

    Townway


  • Xi Wang
    replied
    Originally posted by mgolo
    Thanks for your reply, Xi.

    I'll try all the methods when I have my annotation file. But what are the criteria for knowing which one is best?

    Looking forward to your new version of DEGseq!

    I think one of the most important criteria is how consistent the detected DEGs are with previous knowledge, although genes that were not known before may be genuine new discoveries. From a statistical point of view, the best method is one whose assumptions your data do not violate.
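
One simple way to look at consistency with previous knowledge is to count how many already-known genes each method recovers. The gene names below are hypothetical placeholders, and this overlap measure is only one possible sketch of such a check.

```r
# Hypothetical gene lists, for illustration only
detected_degs <- c("geneA", "geneB", "geneC", "geneD")   # DEGs reported by one method
known_genes   <- c("geneB", "geneD", "geneE")            # genes expected from prior work

overlap   <- intersect(detected_degs, known_genes)       # known genes that were recovered
recovered <- length(overlap) / length(known_genes)       # fraction of prior knowledge recovered
novel     <- setdiff(detected_degs, known_genes)         # candidates for new findings

print(overlap)
print(round(recovered, 2))
print(novel)
```

Repeating this for the DEG list from each method gives a rough, comparable number per method, while the "novel" set still needs separate validation.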
