  • chadn737
    replied
    I apologize if I came across a little strongly.

    From the DESeq paper:

    "However, it has been noted [1,8] that the assumption of Poisson distribution is too restrictive: it predicts smaller variations than what is seen in the data. Therefore, the resulting statistical test does not control type-I error (the probability of false discoveries) as advertised."

    In other words, the Poisson distribution underestimates the variance seen in real data, which leads to false positives, so it is not suitable. That is why DESeq is based on a negative binomial, not a Poisson, distribution:

    "To address this so-called overdispersion problem, it has been proposed to model count data with negative binomial (NB) distributions [9], and this approach is used in the edgeR package for analysis of SAGE and RNA-Seq [8,10]."

    The DESeq vignette provides a protocol for analyzing data without replicates.

    Go here: http://bioconductor.org/packages/rel.../doc/DESeq.pdf

    and read section 3.3, titled "Working without any replicates." That will tell you how to do this in DESeq. The purpose of the VSD normalized data is to put all samples on a common scale for clustering and similar analyses, not for differential expression testing.
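To make the "same scale" idea concrete, here is a minimal sketch of what a variance-stabilizing transformation does. It is plain Python, not DESeq code, and it swaps in the classical Anscombe transform for Poisson counts as a stand-in; DESeq derives its own transformation from the fitted dispersions, so its numbers will differ. All function names below are made up for illustration.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability P(X = k), computed in log space for stability."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def anscombe(x):
    """Classical Anscombe transform for Poisson counts (illustrative stand-in,
    not the transformation DESeq uses)."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)


def variance_under_poisson(mu, transform=lambda x: x):
    """Variance of transform(X) for X ~ Poisson(mu), obtained by summing the
    pmf over a range wide enough to hold essentially all probability mass."""
    upper = int(mu + 20 * math.sqrt(mu) + 50)
    probs = [poisson_pmf(k, mu) for k in range(upper + 1)]
    values = [transform(k) for k in range(upper + 1)]
    mean = sum(p * v for p, v in zip(probs, values))
    return sum(p * (v - mean) ** 2 for p, v in zip(probs, values))


if __name__ == "__main__":
    for mu in (20, 100, 1000):
        raw = variance_under_poisson(mu)
        stabilized = variance_under_poisson(mu, anscombe)
        print(f"mean={mu}: raw variance={raw:.3f}, transformed variance={stabilized:.3f}")
```

On the raw scale the variance grows with the mean, while after the transform it stays close to 1 at every expression level. That is what makes transformed values convenient for clustering and heat maps; it does not replace a count-based test for differential expression.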
    Last edited by chadn737; 04-29-2013, 06:35 AM.


  • a_mt
    replied
    OK, I'm not arguing, but for your reference:

    Background: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data.

    Results: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data.

    Conclusions: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the ‘limma’ method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.


    and a quote from the DESeq paper:

    "If reads were independently sampled from a population with given, fixed fractions of genes, the read counts would follow a multinomial distribution, which can be approximated by the Poisson distribution."

    and a quote from the DEGseq paper:

    "Current observations suggest that typically RNA-seq experiments have low technical background noise (which could be checked using DEGseq) and the Poisson model fits data well."

    I also think that having no replicates does not make sense, but the data I am using are published, and I am just trying out different methods to call DEGs.


  • chadn737
    replied
    Does count data follow a Poisson distribution? The authors of DESeq, edgeR and others would disagree with that.

    And the purpose of VSD normalized data is not to call differential expression, but to support clustering, heat maps, etc. The DESeq vignette actually describes a protocol for analyzing data without replicates; however, that does not mean you should! I honestly don't know how somebody would publish results without replicates; you really can't make sense of the data without them.
    Last edited by chadn737; 04-28-2013, 11:19 PM.


  • a_mt
    replied
    Sorry, I meant to say that, in general, count data follow a Poisson distribution.
    But is it OK to use VSD normalized data to detect DE genes? I don't have any replicates.


  • chadn737
    replied
    You should do replicates. And DESeq uses the negative binomial, not the Poisson.
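A rough illustration of why the choice of distribution matters, as a minimal sketch in plain Python (not DESeq code). It assumes the common negative binomial parameterization in which the variance is mu + alpha * mu^2, with alpha the dispersion; the function names and the example value alpha = 0.1 are made up for illustration.

```python
def poisson_variance(mu):
    """Variance of a Poisson count with mean mu: always equal to the mean."""
    return mu


def negative_binomial_variance(mu, alpha):
    """Variance of a negative binomial count with mean mu and dispersion alpha
    (assumed parameterization: var = mu + alpha * mu**2)."""
    return mu + alpha * mu ** 2


def variance_ratio(mu, alpha):
    """How many times larger the NB variance is than the Poisson variance."""
    return negative_binomial_variance(mu, alpha) / poisson_variance(mu)


if __name__ == "__main__":
    # alpha = 0.1 is an arbitrary example value, not an estimate from real data.
    for mu in (10, 100, 1000):
        print(f"mean={mu}: NB variance is {variance_ratio(mu, alpha=0.1):.1f}x the Poisson variance")
```

Under this parameterization the ratio works out to 1 + alpha * mu, so the gap widens as counts get larger. A test that assumes Poisson variance therefore treats highly expressed genes as far less noisy than they are, which is where the extra false positives come from.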


  • a_mt
    started a topic: finding DE genes from VSD normalized data

    Hi all,

    I have time series mRNA-seq data but without replicates.
    I have some doubts about calling DE genes with DESeq.

    I have used the
    Code:
    varianceStabilizingTransformation
    function from DESeq to normalize the count data. Can I now use this VSD-transformed data to calculate fold changes and call DE genes, maybe using the classic limma package? Is it good practice to do so?

    I have also tried
    Code:
    nbinomTest
    on the raw counts. But after the variance-stabilizing transformation the data look more like microarray data, and I was wondering whether there is any harm in calling DE genes or fold changes on the transformed data. The original DESeq paper makes clear that count data follow a Poisson distribution, unlike microarray data, which are closer to normally distributed; after the transformation, though, the data look more normally distributed.

    Thank you.
