**ECO** · 10-09-2011, 10:06 AM

If that is supposed to be a table, enclose it in [ code ] ... [ / code ] tags to preserve the spacing.

...or use the [ table ] tag (syntax example in this thread: http://seqanswers.com/forums/showthread.php?t=948)

**Chrevan** · 10-10-2011, 05:11 PM

I think edgeR is better. I compared DEGseq, DESeq and edgeR and made Venn diagrams of their results to find the overlap; DESeq and edgeR overlap with each other the most. So I think edgeR is better.

It depends on you!
Best wishes!

**tianyub836** · 10-10-2011, 06:28 PM

Thanks for your reply, Chrevan.

I know that DEGseq is based on the Poisson distribution while edgeR is based on the negative binomial distribution. What I want to know is: apart from the methodology, which package's output is more reasonable based on common sense, if there is such a thing?

**jwfoley** · 10-10-2011, 10:15 PM

DESeq is basically edgeR with some improvements, so if you want common sense, that seems to be the winner. Since DESeq and edgeR use the same distribution while DEGseq uses a different one, they naturally get more similar results, and that's not a sensible way to conclude that they're better. However, both of the negative-binomial methods' authors provide good evidence that DEGseq's Poisson assumption is invalid.

Here is the DESeq paper: http://genomebiology.com/2010/11/10/R106
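
For reference, the distributional difference discussed here can be written out. In the common parameterization (the symbols $\mu$ for the expected count of a gene and $\alpha$ for its dispersion are notation chosen for this note, not taken from the thread), the two models differ only in how the variance of a count $K$ grows with its mean:

```latex
\mathrm{Var}_{\text{Poisson}}(K) = \mu,
\qquad
\mathrm{Var}_{\text{NB}}(K) = \mu + \alpha\,\mu^{2}, \quad \alpha \ge 0 .
```

Setting $\alpha = 0$ recovers the Poisson case, so any sample-to-sample variability beyond pure counting noise appears as $\alpha > 0$.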

**tianyub836** · 10-11-2011, 12:21 AM

Thanks, jwfoley.

Well, I have read the DESeq paper and the edgeR one. They use the same NB distribution model, and both claim to be suitable for identifying DEGs from RNA-seq data without any replicates, which matches my situation.

I am looking for DEGs in plants under abiotic stress, and my samples consist of only a control group and a treated group. Both papers mentioned above suggest that a Poisson distribution model is acceptable for samples without replicates. Am I right?

As I mentioned in the earlier table, nearly 1/3 of the matched genes output by DEGseq were DEGs. Does that make any sense?

**Simon Anders** · 10-11-2011, 04:47 AM

No, the Poisson distribution is never appropriate, and I thought we said that quite clearly in our paper. You will always end up with loads of false positives.

You simply cannot perform a proper analysis without replicates. The correct solution is to start over. (See also http://seqanswers.com/forums/showpos...04&postcount=2 )

DESeq offers the possibility to perform a very conservative analysis for the no-replicates case which shows you only those genes which really "stick out". This can give you at least a few results.
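
The false-positive point can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the method of any of the packages discussed: every gene has the same true mean in both samples (so every call is a false positive), both samples have equal sequencing depth, and the test is a simple normal-approximation z-test that treats the counts as Poisson. The function names and the values `mu=100` and `alpha=0.1` are made up for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's multiplication method.

    Adequate for the moderate means used here (lam well below ~700).
    """
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_count(mu, alpha, rng):
    """Draw a count with mean mu and variance mu + alpha * mu**2."""
    if alpha == 0:
        return sample_poisson(mu, rng)
    # A gamma with shape 1/alpha and scale alpha*mu has mean mu and
    # variance alpha*mu**2; Poisson sampling on top gives a negative binomial.
    rate = rng.gammavariate(1.0 / alpha, alpha * mu)
    return sample_poisson(rate, rng)


def poisson_false_positive_rate(mu, alpha, n_genes=2000, seed=1):
    """Fraction of unchanged genes called DE by a Poisson z-test (two-sided 5%)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_genes):
        a = sample_count(mu, alpha, rng)  # "control" count
        b = sample_count(mu, alpha, rng)  # "treated" count, same true mean
        if a + b == 0:
            continue
        # Under the Poisson assumption, Var(a - b) is estimated by a + b.
        z = (a - b) / math.sqrt(a + b)
        if abs(z) > 1.96:
            hits += 1
    return hits / n_genes


print("Poisson data:      ", poisson_false_positive_rate(mu=100, alpha=0.0))
print("overdispersed data:", poisson_false_positive_rate(mu=100, alpha=0.1))
```

When the data really are Poisson, about one unchanged gene in twenty is flagged, as intended. Once extra sample-to-sample variability is present, the same test flags a far larger share of unchanged genes.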

**tianyub836** · 10-11-2011, 05:52 AM

Well, you just frightened me, Simon Anders.

I did not understand what you meant by "You simply cannot perform a proper analysis without replicates. The correct solution is to start over".

Did you mean that the data I am working on, which come from just one control and one treatment sample, are meaningless?

**frozenlyse** · 10-11-2011, 05:56 PM

If you have no measure of variability of your measurements, how can you make any conclusions about how reliable/reproducible the differential expression you observe is?

**tianyub836** · 10-11-2011, 06:31 PM

What if I assumed that the variability of my measurements was negligible, or not large enough to affect my final output?

I mean that I am sure the technical noise is minimized and can be ignored, and the biological variance is not significant.

**frozenlyse** · 10-11-2011, 06:48 PM

Well, then you'd be lying to yourself. Getting a list of DE genes isn't the problem (edgeR will of course still give you a table of p-values and logFC); the problem is knowing how many of those are at all trustworthy.

**Simon Anders** · 10-12-2011, 12:39 AM

Originally posted by tianyub836 View Post

I mean that I am sure the technical noise is minimized and can be ignored, and the biological variance is not significant.

I am curious what makes you so sure of that?

There are, of course, some possibilities to find something in your data. You might just guess the amount of sample-to-sample variability and inject this information into the DESeq workflow. For a reasonable guess, however, you had better have performed this kind of analysis before, with replication; and still, I would not want to see something like this in a publication. You might also estimate the variance by comparing your treatment and control samples and limit your hits to genes with fold changes so extreme that they stick out even there. DESeq's "blind" dispersion estimation is meant for that. Again, such an analysis is not publication quality.
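
To make the "guess the variability" idea concrete, here is a minimal sketch of how an assumed dispersion changes the verdict for a single gene. It is not DESeq's algorithm: it uses a normal approximation, assumes equal sequencing depth in both samples, and the function names, the example counts (100 vs. 200) and the dispersion values are all hypothetical.

```python
import math


def z_with_assumed_dispersion(a, b, alpha):
    """z-statistic for counts a vs. b given an assumed NB dispersion alpha.

    Under the null hypothesis both counts share one mean, estimated by their
    average, and each has variance mu + alpha * mu**2.
    """
    if a + b == 0:
        return 0.0
    mu = (a + b) / 2.0                      # pooled mean under "no change"
    var_diff = 2.0 * (mu + alpha * mu**2)   # Var(a - b), two independent counts
    return (a - b) / math.sqrt(var_diff)


def two_sided_p(z):
    """Two-sided p-value from the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# The same two-fold difference looks very different depending on the guess.
for alpha in (0.0, 0.01, 0.1):
    z = z_with_assumed_dispersion(100, 200, alpha)
    print(f"alpha={alpha}: z={z:.2f}, p={two_sided_p(z):.3g}")
```

With `alpha=0.0` (the Poisson assumption) the two-fold change looks overwhelmingly significant; with `alpha=0.1` it no longer passes a 5% threshold. The conclusion is driven by the guess, which is why such a guess needs to be backed by earlier replicated data.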

**tianyub836** · 10-12-2011, 01:44 AM

Well, that was just an assumption that the impact was not significant.

Actually, when the samples were prepared, we collected material from multiple plants for both the control and treatment groups, so we sent one mixed (pooled) sample per group for sequencing. We originally thought that this might reduce the effect of biological variability between plants.

Does that make any sense?

**Simon Anders** · 10-14-2011, 12:50 PM

Originally posted by tianyub836 View Post

Actually, when the samples were prepared, we collected material from multiple plants for both the control and treatment groups, so we sent one mixed (pooled) sample per group for sequencing. We originally thought that this might reduce the effect of biological variability between plants.

Does that make any sense?

Only a bit. If you pool N plants, then your variance goes down to 1/N (or the standard error of your expression estimates to 1/sqrt(N)) of the value for a single plant.

So, of course, the variance got smaller, but by pooling everything, you have lost all possibility of figuring out how small it is now.
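
The 1/N behaviour is easy to check numerically. The sketch below assumes that each plant contributes equally and independently to a pool and that per-plant expression is normally distributed; the mean of 10 and standard deviation of 2 are made-up values, and the function names are hypothetical.

```python
import random
import statistics


def pooled_value(n_plants, mu, sigma, rng):
    """Expression of one pool: the average over n_plants individual plants."""
    values = [rng.gauss(mu, sigma) for _ in range(n_plants)]
    return sum(values) / n_plants


def variance_of_pools(n_plants, mu=10.0, sigma=2.0, n_pools=5000, seed=7):
    """Empirical variance across many independent pools of the same size."""
    rng = random.Random(seed)
    pools = [pooled_value(n_plants, mu, sigma, rng) for _ in range(n_pools)]
    return statistics.pvariance(pools)


for n in (1, 5, 10):
    print(f"N={n}: simulated {variance_of_pools(n):.3f}, theory {2.0**2 / n:.3f}")
```

Note that the simulation only recovers the variance because it has thousands of pools to compare. A single pool per group yields one number, from which no variance can be computed.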

What you should have done is make two or three pools for each group and add multiplexing tags to the samples so that you can put them together in one sequencing lane. Comparing the pools from the same group would have enabled you to assess the variance. Without it, you have to guess it blindly, and whatever guess you may come up with, you cannot expect anybody (especially not a reviewer of your paper) to believe that to be a good guess.
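
As a rough illustration of what replicate pools make possible, the sketch below estimates a per-gene dispersion from the counts of several pools in the same group using a simple method-of-moments formula. This is not how DESeq or edgeR estimate dispersion (both share information across genes); it assumes equal sequencing depth across pools, and the function name and counts are invented for the example.

```python
import statistics


def estimate_dispersion(counts):
    """Method-of-moments NB dispersion: (s^2 - mean) / mean^2.

    Returns None when the dispersion cannot be estimated, e.g. with a
    single pool, and clips negative estimates to zero.
    """
    if len(counts) < 2:
        return None  # one pool carries no information about variability
    mean = statistics.mean(counts)
    if mean == 0:
        return None
    s2 = statistics.variance(counts)  # sample variance (n - 1 denominator)
    return max(0.0, (s2 - mean) / mean**2)


print(estimate_dispersion([90, 130, 110]))  # three pools from one group
print(estimate_dispersion([110]))           # a single pool: not estimable
```

With three pools the function returns a small positive dispersion; with one pool it returns `None`, which is the situation described above where the variance has to be guessed.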

**tianyub836** · 10-14-2011, 06:29 PM

Thanks, Simon Anders.

I admit that it was not a perfect experimental design, and I have many details to take care of.

It was wonderful to discuss this with you.

DEGseq VS edgeR, which one is more reliable?
