-
I think I have found a solution. Cuffdiff does output a count for each feature (genes.count_tracking, cds.count_tracking, etc.). I ran Cuffdiff with each sample treated as a separate condition, rather than identifying different samples from the same treatment as replicates via the -L/--labels option. This gave me count data for each feature in each sample.
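A minimal sketch of turning such a count-tracking file into a per-sample count table. It assumes the file is tab-delimited, with a `tracking_id` column and one `<label>_count` column per condition label; check the header of your own genes.count_tracking file, since the layout may differ between Cufflinks versions. The function name and the example labels are made up for illustration.

```python
import csv
import io


def read_count_tracking(handle):
    """Return (sample_labels, {feature_id: [count per sample]}).

    Assumption: tab-delimited text with a 'tracking_id' column and one
    '<label>_count' column per condition label. Other per-label columns
    (variance, status, ...) are ignored.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    suffix = "_count"
    count_cols = [c for c in reader.fieldnames if c.endswith(suffix)]
    labels = [c[: -len(suffix)] for c in count_cols]
    counts = {}
    for row in reader:
        counts[row["tracking_id"]] = [float(row[c]) for c in count_cols]
    return labels, counts


# Tiny invented example in the assumed layout (two samples, two genes).
example = (
    "tracking_id\ts1_count\ts1_count_variance\ts1_status\t"
    "s2_count\ts2_count_variance\ts2_status\n"
    "geneA\t10.0\t4.0\tOK\t25.5\t9.0\tOK\n"
    "geneB\t0.0\t0.0\tOK\t3.0\t1.5\tOK\n"
)
labels, counts = read_count_tracking(io.StringIO(example))
print(labels)
print(counts["geneA"])
```

For a real file, pass an open file handle instead of the `io.StringIO` example.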
-
I think you need an "untreated" control for every time point, or time will be an extra variable that you cannot separate from the treatment effect.
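To see why the missing controls matter, it can help to lay the ten samples out as a treatment-by-day grid. The sketch below does only that. The sample layout is the one described in this thread; the variable names are made up for illustration.

```python
# Sample layout as described in this thread: (sample number, day, treatment).
samples = [
    (1, 0, "untreated"),
    (2, 3, "A"), (3, 3, "B"), (4, 3, "C"),
    (5, 6, "A"), (6, 6, "B"), (7, 6, "C"),
    (8, 9, "A"), (9, 9, "B"), (10, 9, "C"),
]

days = sorted({day for _, day, _ in samples})
treatments = ["untreated", "A", "B", "C"]

# Count how many samples fall into each (treatment, day) cell.
grid = {(t, d): 0 for t in treatments for d in days}
for _, day, treatment in samples:
    grid[(treatment, day)] += 1

print("treatment".ljust(10) + "".join(f"day {d}".rjust(7) for d in days))
for t in treatments:
    print(t.ljust(10) + "".join(str(grid[(t, d)]).rjust(7) for d in days))

# Days on which treated samples exist but no untreated sample does.
days_without_control = [
    d for d in days
    if grid[("untreated", d)] == 0 and any(grid[(t, d)] for t in "ABC")
]
print("days without an untreated control:", days_without_control)
```

Every occupied cell holds a single sample, and the only untreated sample is at day 0, so any treated-versus-untreated comparison is at the same time a day-0-versus-later comparison.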
-
day 0
sample 1 - untreated
day 3
sample 2 - treated with "A"
sample 3 - treated with "B"
sample 4 - treated with "C"
day 6
sample 5 - treated with "A"
sample 6 - treated with "B"
sample 7 - treated with "C"
day 9
sample 8 - treated with "A"
sample 9 - treated with "B"
sample 10 - treated with "C"
They want to see DE across time as well as across treatments. I am just a bioinformatician. Somebody else did the experiments and I have the RNA-Seq data for it. This is all I have been told.
Ashu
-
45 comparisons would still take only an hour or so. However, my feeling is that you are doing something fundamentally wrong if you are comparing pairs of samples, rather than pairs of conditions or time points.
Maybe explain a bit more about the biology of your experiment.
-
I have ten samples in total, so C(10, 2) = 45 pairwise combinations.
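For reference, 45 is the number of unordered pairs among ten samples. The short sketch below confirms this and, for contrast, counts the pairs among the four time points. The variable names are made up for illustration.

```python
from itertools import combinations
from math import comb

sample_ids = list(range(1, 11))   # the ten samples in this thread
time_points = [0, 3, 6, 9]        # days

sample_pairs = list(combinations(sample_ids, 2))
time_point_pairs = list(combinations(time_points, 2))

print(len(sample_pairs), comb(10, 2))   # unordered pairs of samples
print(len(time_point_pairs))            # unordered pairs of time points
```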
-
I have been told that some genes will be down-regulated for all 9 days, some will be up-regulated for all 9 days, and some will change from down to up or vice versa. I understand that I could get a list of DE genes from pairwise comparisons of my samples across the time points d0, d3, d6 and d9. In my previous project I categorized genes with a negative log2 fold change in a pairwise comparison as down-regulated and those with a positive one as up-regulated. I wanted a way to do that for this project as well, but if I have to do it pairwise, it will take me a lot of time. I have BAM files for all samples, and I also have the HTSeq counts for all of them. What would be the best next move for me?
Please let me know if you have any further questions...
Thanks a million,
Ashu
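The sign-based categorisation described above can be sketched as follows. It only labels the direction of change at each later time point relative to day 0; it says nothing about statistical significance, which still has to come from a proper test. The function name, gene names and log2 fold-change values are invented for illustration.

```python
def direction_pattern(log2_fold_changes):
    """Label a gene by the signs of its log2 fold changes versus day 0.

    `log2_fold_changes` holds one value per later time point (for example
    day 3, day 6, day 9). Time points with exactly zero change are ignored
    when deciding between the "throughout" labels.
    """
    signs = ["up" if x > 0 else "down" if x < 0 else "flat"
             for x in log2_fold_changes]
    distinct = set(signs) - {"flat"}
    if not distinct:
        return "no change"
    if distinct == {"up"}:
        return "up throughout"
    if distinct == {"down"}:
        return "down throughout"
    return "switches: " + " -> ".join(signs)


# Invented log2 fold changes (day 3, day 6, day 9 versus day 0).
example = {
    "gene1": [1.2, 0.8, 2.1],
    "gene2": [-0.5, -1.7, -0.9],
    "gene3": [-1.1, 0.3, 1.4],
}
for gene, lfc in example.items():
    print(gene, direction_pattern(lfc))
```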
-
BTW, what did you mean by "pairwise comparisons would take me a long time"? It can't be more than a few minutes of calculation time. The issue is rather: what would the result tell you?
-
Yes, there have been a lot of updates to DESeq, especially the release of DESeq2.
And there have always been plenty of methods to analyse time series data. My post above was not to claim that it cannot be done. Rather, it can only be done once you know what it is, i.e., what you actually want.
You say you "need to know gene regulation pattern across time". What do you mean exactly by "pattern"?
DESeq is a tool to test for statistical significance of differential expression. You ask a specific question and you get a p value, i.e., a yes/no answer (or rather: a yes / can't say answer). Once you can tell me the yes/no question, I can tell you how to use DESeq for it.
The issue here is that people keep asking me about once a week "how do you analyse time course data?" but when I ask back "what is your precise question?" I never get an answer.
Not that I'm surprised: in my experience, analysing time course data is rarely about answering yes/no questions but rather about answering "which?" questions, and hence it is a job not for methods of statistical hypothesis testing but for machine learning.
Simon
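One way to read the "which?" remark: instead of testing each gene, group genes by the shape of their time profile. The sketch below uses plain k-means, chosen here purely for illustration and not something proposed in this thread. The profiles are invented log2 fold changes at day 3, 6 and 9, and the starting centres are simply the first k profiles so that the result is reproducible.

```python
def mean_profile(profiles):
    """Column-wise mean of a list of equal-length profiles."""
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, n_iter=20):
    """Plain k-means; starts from the first k profiles (deterministic)."""
    centers = [list(p) for p in profiles[:k]]
    assignment = [0] * len(profiles)
    for _ in range(n_iter):
        # Assign each profile to its nearest centre.
        assignment = [
            min(range(k), key=lambda j: sq_dist(p, centers[j]))
            for p in profiles
        ]
        # Move each centre to the mean of its members (if it has any).
        for j in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == j]
            if members:
                centers[j] = mean_profile(members)
    return assignment, centers


# Invented log2 fold changes at day 3, 6, 9 for four genes.
genes = ["g1", "g2", "g3", "g4"]
profiles = [
    [1.0, 1.5, 2.0],
    [-1.0, -1.4, -2.1],
    [0.9, 1.6, 1.9],
    [-1.2, -1.5, -1.8],
]
assignment, centers = kmeans(profiles, k=2)
for gene, cluster in zip(genes, assignment):
    print(gene, "cluster", cluster)
```

On real data the number of clusters, the scaling of the profiles and the starting centres all affect the result, so treat the output as exploratory.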
-
Need Help with Time Series analysis of RNA-Seq Data
Dear Simon or anybody with RNA-Seq data analysis expertise,
I wanted to ask whether there have been any updates to DESeq, or any other tool released since this post, that would enable the analysis of RNA-Seq data for samples across time (day 0, day 3, day 6, day 9) without having to do pairwise comparisons. I have 10 samples in total, and pairwise comparisons would take me a long time. I need to know the gene regulation pattern across time for these samples, and whether this could be done using all samples at once. Any help would be highly appreciated. I started working on RNA-Seq analysis only a month ago and do not have a lot of experience.
Thanks
Ashu
Originally posted by Simon Anders:
DESeq allows you to perform pairwise comparisons, and, to my knowledge, the same is true for all other tools out there. So, you can pick pairs of time points and compare these. Using GLMs, you can also compare differences between pairs of time points for one drug with differences for another drug or for the untreated controls.
But which comparisons (contrasts) are useful to analyze data? Figuring this out is, in my opinion, the main challenge of time course data.
Of course, all these pairwise comparisons are a bit pedestrian if you have more than three or four time points. You might be more interested in curve fits, and this is a very different statistical task, with which I have little experience. I haven't yet seen any such analysis published for RNA-Seq data, but there are lots of papers on microarray time courses. Maybe the article by Hafemeister et al. in the current issue of Bioinformatics is a good starting point. Translating such methods to the RNA-Seq setting is certainly something that needs to be done now.
S
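To make the GLM remark concrete: comparing the difference between two time points for one drug with the same difference for another drug amounts to a difference of differences on the log scale. The sketch below shows only that arithmetic on invented log2 expression values for a single gene; it is not DESeq code and performs no significance test.

```python
# Invented log2 expression values for one gene, keyed by (drug, day).
log2_expr = {
    ("A", 3): 5.0, ("A", 6): 7.0,
    ("B", 3): 5.5, ("B", 6): 6.0,
}


def time_effect(drug, day_from, day_to):
    """log2 fold change between two time points within one drug."""
    return log2_expr[(drug, day_to)] - log2_expr[(drug, day_from)]


change_a = time_effect("A", 3, 6)
change_b = time_effect("B", 3, 6)

# How much more the gene changes over time under drug A than under drug B.
difference_of_differences = change_a - change_b
print(change_a, change_b, difference_of_differences)
```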
-
For those of you reading this thread who wish to use DESeq with Cufflinks: please check out http://cufflinks.cbcb.umd.edu for release 1.0.0, which includes a major overhaul of replicate support in Cuffdiff. Cuffdiff now models overdispersion of fragment counts at the transcript model, building on ideas introduced by DESeq and edgeR to greatly improve accuracy in calling differentially expressed genes and transcripts.
Thanks to all the commenters on this thread and elsewhere for helpful feedback!
-
The SCV plot is perfectly fine. The black curve just shows that you have genes with expression strength ranging from 1 count up to 300 or 1000 counts. For genes with more than around 100 counts, the biological noise is very low. For lowly expressed genes, you not only have strong shot noise but also strong biological noise.
It is a priori surprising that the biological noise should depend so strongly on the gene's expression strength (after all, a coefficient of variation is already normalized for expression strength). However, I've seen this before; it is quite common. We are currently investigating the hypothesis that this may happen whenever the library preparation PCR was started with very low initial cDNA concentration.
In your case, however, you have good replicability for your more strongly expressed genes, i.e., you should get good results.
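The distinction between shot noise and biological noise can be put into numbers. For pure Poisson (shot) noise the variance equals the mean, so the squared coefficient of variation (SCV) is 1/mean; whatever SCV the replicates show beyond that can be attributed to biological variation. A rough sketch on invented replicate counts, ignoring library-size normalisation and the smoothing across genes that DESeq performs (the function name is made up for illustration):

```python
from statistics import mean, variance


def scv_parts(replicate_counts):
    """Split the observed SCV of one gene into shot noise and excess.

    Returns (total_scv, shot_scv, excess_scv), where
    total_scv  = sample variance / mean^2,
    shot_scv   = 1 / mean (what Poisson noise alone would give),
    excess_scv = max(total_scv - shot_scv, 0).
    """
    m = mean(replicate_counts)
    total_scv = variance(replicate_counts) / m ** 2
    shot_scv = 1.0 / m
    excess_scv = max(total_scv - shot_scv, 0.0)
    return total_scv, shot_scv, excess_scv


# Invented replicate counts for a weakly and a strongly expressed gene.
low_gene = [8, 12, 4, 16]
high_gene = [900, 1100, 1000, 1000]

for name, counts in [("low", low_gene), ("high", high_gene)]:
    total, shot, excess = scv_parts(counts)
    print(f"{name}: total={total:.4f} shot={shot:.4f} excess={excess:.4f}")
```

With only a handful of replicates, per-gene estimates like these are very noisy, which is one reason to pool information across genes of similar expression strength.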
-
Not sure if the attachment went through... trying again.
-
I really need to change this plot; it seems to confuse people. The black curve (labelled "base mean"), which I suppose you are talking about, does not show SCV values (i.e., the y axis does not apply to it). Rather, it is only there to indicate which expression strengths actually occur in your sample, in order to tell you which parts of the colored curves (which do show SCV values) are of interest.
Maybe post your plot here, then I can try to clarify it.
-
Dear Simon,
First of all, thanks for your response and clarification. I see your point regarding Fig. 3 in the paper, which is consistent with my own MvA plots - that at low expression levels, the fold change has to be bigger for DESeq to call that change significant.
To be clear, I just meant that when making the SCV plot as described in Fig. 1 of the vignette, I get a graph that looks different from the example shown. Basically, my mean curve doesn't follow a bell shape on the left side; instead of tailing off, the SCV values remain high on the left, and it looks as though there is a second, smaller peak. Is this something that other users have seen? How does one explain this?
Thanks again for your help! It is much appreciated.
Elena