  • gwilymh
    replied
    I think I have found a solution. Cuffdiff does output a count for each feature (genes.count_tracking, cds.count_tracking, etc.). I ran Cuffdiff with each sample treated as a separate condition, rather than identifying different samples from the same treatment as replicates using the -L/--labels option. This gave me count data for each feature in each sample.
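
    A minimal sketch of how such a table could be turned into per-sample counts. The column layout used here (tracking_id followed by <label>_count, <label>_count_variance, <label>_status) and the labels s1/s2 are assumptions for illustration, so check the header of your own file first. The counts are estimates and may not be integers.

```python
import csv
import io

# Hypothetical excerpt of a genes.count_tracking file. The column layout
# (tracking_id, then <label>_count, <label>_count_variance, ...) is an assumption.
SAMPLE = (
    "tracking_id\ts1_count\ts1_count_variance\ts1_status"
    "\ts2_count\ts2_count_variance\ts2_status\n"
    "geneA\t120.5\t30.1\tOK\t98.0\t25.4\tOK\n"
    "geneB\t0\t0\tOK\t14.2\t6.3\tOK\n"
)


def read_counts(handle):
    """Return {feature: {sample_label: count}} from a count_tracking-style table."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Keep only columns named "<label>_count"; variance and status columns are skipped.
    count_cols = [c for c in reader.fieldnames if c.endswith("_count")]
    counts = {}
    for row in reader:
        counts[row["tracking_id"]] = {
            col[: -len("_count")]: float(row[col]) for col in count_cols
        }
    return counts


counts = read_counts(io.StringIO(SAMPLE))
print(counts["geneA"])
```

    For a real file, open("genes.count_tracking") would replace the in-memory sample.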



  • lucer105
    replied
    Originally posted by ashuchawla View Post
    day 0
    sample 1 - untreated

    day 3
    sample 2 - treated with "A"
    sample 3 - treated with "B"
    sample 4 - treated with "C"

    day 6

    sample 5 - treated with "A"
    sample 6 - treated with "B"
    sample 7 - treated with "C"

    day 9

    sample 8 - treated with "A"
    sample 9 - treated with "B"
    sample 10 - treated with "C"

    They want to see DE across time as well as across treatments. I am just a bioinformatician. Somebody else did the experiments and I have the RNA-Seq data for it. This is all I have been told.

    Ashu

    I think you need an "untreated" control for every time point; otherwise time will be an extra variable.
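
    To make the gap concrete, here is a small sketch that lays out the design exactly as listed in the quoted post and reports which day/treatment combinations have no sample. The variable names are only illustrative.

```python
from itertools import product

# Design as listed in the quoted post: sample number -> (day, treatment).
samples = {
    1: (0, "untreated"),
    2: (3, "A"), 3: (3, "B"), 4: (3, "C"),
    5: (6, "A"), 6: (6, "B"), 7: (6, "C"),
    8: (9, "A"), 9: (9, "B"), 10: (9, "C"),
}

days = sorted({day for day, _ in samples.values()})
treatments = sorted({trt for _, trt in samples.values()})
observed = set(samples.values())

# Every (day, treatment) cell of the full design that has no sample.
missing = [cell for cell in product(days, treatments) if cell not in observed]

# Days that lack an untreated control.
days_without_control = [d for d in days if (d, "untreated") not in observed]

print("missing cells:", missing)
print("days without untreated control:", days_without_control)
```

    With this layout, days 3, 6 and 9 have no untreated sample, so a change between day 0 and a later day cannot be separated into a time effect and a treatment effect.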



  • ashuchawla
    replied
    day 0
    sample 1 - untreated

    day 3
    sample 2 - treated with "A"
    sample 3 - treated with "B"
    sample 4 - treated with "C"

    day 6

    sample 5 - treated with "A"
    sample 6 - treated with "B"
    sample 7 - treated with "C"

    day 9

    sample 8 - treated with "A"
    sample 9 - treated with "B"
    sample 10 - treated with "C"

    They want to see DE across time as well as across treatments. I am just a bioinformatician. Somebody else did the experiments and I have the RNA-Seq data for it. This is all I have been told.

    Ashu

    Originally posted by Simon Anders View Post
    45 comparisons still would only take an hour or so. However, my feeling is that you are doing something fundamentally wrong if you are comparing pairs of samples, rather than pairs of conditions or time points.

    Maybe explain a bit more about the biology of your experiment.



  • Simon Anders
    replied
    45 comparisons still would only take an hour or so. However, my feeling is that you are doing something fundamentally wrong if you are comparing pairs of samples, rather than pairs of conditions or time points.

    Maybe explain a bit more about the biology of your experiment.



  • ashuchawla
    replied
    I have ten samples in total; 10C2 = 45 pairwise combinations.
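
    The arithmetic can be checked with a short sketch. The sample names are placeholders, and the second count (pairs of time points rather than pairs of samples) is included only for comparison.

```python
from itertools import combinations
from math import comb

# Ten placeholder sample names.
samples = [f"sample{i}" for i in range(1, 11)]
sample_pairs = list(combinations(samples, 2))  # all unordered pairs of samples

# Time points mentioned in the thread.
days = [0, 3, 6, 9]
day_pairs = list(combinations(days, 2))  # all unordered pairs of time points

print(len(sample_pairs), comb(10, 2))  # 45 45
print(len(day_pairs))                  # 6
```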

    Originally posted by Simon Anders View Post
    BTW, what did you mean by "pairwise comparisons would take me a long time"? Can't be more than a few minutes calculation time. The issue is rather: what would the result tell you?



  • ashuchawla
    replied
    I have been told that there will be genes that are down-regulated for all 9 days, some that are up-regulated for all 9 days, and some that change from down to up or vice versa. I understand that I could get a list of DE genes from pairwise comparisons of my samples across the time points d0, d3, d6 and d9. In my previous project, I categorized genes with a negative log2 fold change in a pairwise comparison as down-regulated and those with a positive one as up-regulated. I wanted a way to do that for this project as well, but if I have to do it pairwise, it will take me a lot of time. I have BAM files for all samples and I also have the HTSeq counts for all of them. What would be the best next move for me?
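
    The three categories described above could be sketched roughly as follows, assuming log2 fold changes versus day 0 are already available for days 3, 6 and 9. The gene names and values are invented, and the sketch looks only at the sign; it says nothing about statistical significance.

```python
# Hypothetical log2 fold changes versus day 0 at days 3, 6 and 9.
# Gene names and values are invented for illustration only.
log2fc = {
    "gene1": [-1.2, -0.8, -1.5],
    "gene2": [0.9, 1.4, 2.0],
    "gene3": [-1.1, 0.2, 1.3],
}


def classify(changes):
    """Label a gene by the signs of its log2 fold changes over time."""
    if all(c < 0 for c in changes):
        return "down throughout"
    if all(c > 0 for c in changes):
        return "up throughout"
    # Mixed signs (or exact zeros) end up here.
    return "changes direction"


patterns = {gene: classify(values) for gene, values in log2fc.items()}
print(patterns)
```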

    Please let me know if you have any further questions...

    Thanks a million,
    Ashu


    Originally posted by Simon Anders View Post
    Yes, there have been a lot of updates to DESeq, especially the release of DESeq2.

    And there have always been plenty of methods to analyse time series data. My post above was not to claim that it cannot be done. Rather, it can only be done once you know what it is, i.e., what you actually want.

    You say you "need to know gene regulation pattern across time". What do you mean exactly by "pattern"?

    DESeq is a tool to test for statistical significance of differential expression. You ask a specific question and you get a p value, i.e., a yes/no answer (or rather: a yes / can't say answer). Once you can tell me the yes/no question, I can tell you how to use DESeq for it.

    The issue here is that people keep asking me about once a week "how do you analyse time course data?" but when I ask back "what is your precise question?" I never get an answer.

    Not that I'm surprised: In my experience, analysing time course data is rarely about answering yes/no questions but rather about answering "which?" questions, and hence it is a job not for methods of statistical hypothesis testing but for machine learning.

    Simon



  • Simon Anders
    replied
    BTW, what did you mean by "pairwise comparisons would take me a long time"? Can't be more than a few minutes calculation time. The issue is rather: what would the result tell you?



  • Simon Anders
    replied
    Yes, there have been a lot of updates to DESeq, especially the release of DESeq2.

    And there have always been plenty of methods to analyse time series data. My post above was not to claim that it cannot be done. Rather, it can only be done once you know what it is, i.e., what you actually want.

    You say you "need to know gene regulation pattern across time". What do you mean exactly by "pattern"?

    DESeq is a tool to test for statistical significance of differential expression. You ask a specific question and you get a p value, i.e., a yes/no answer (or rather: a yes / can't say answer). Once you can tell me the yes/no question, I can tell you how to use DESeq for it.

    The issue here is that people keep asking me about once a week "how do you analyse time course data?" but when I ask back "what is your precise question?" I never get an answer.

    Not that I'm surprised: In my experience, analysing time course data is rarely about answering yes/no questions but rather about answering "which?" questions, and hence it is a job not for methods of statistical hypothesis testing but for machine learning.

    Simon



  • ashuchawla
    replied
    Need Help with Time Series analysis of RNA-Seq Data

    Dear Simon or anybody with RNA-Seq data analysis expertise,

    I wanted to ask whether there have been any updates to DESeq, or another tool released since this post, that would enable the analysis of RNA-Seq data for samples across time (day 0, day 3, day 6, day 9) without having to do pairwise comparisons. I have 10 samples in total and pairwise comparisons would take me a long time. I need to know the gene regulation pattern across time for these samples, and whether this could be done using all samples at once. Any help would be highly appreciated. I started working on RNA-Seq analysis only a month ago and do not have a lot of experience.

    Thanks
    Ashu



    Originally posted by Simon Anders View Post
    DESeq allows you to perform pairwise comparisons, and, to my knowledge, the same is true for all other tools out there. So, you can pick pairs of time points and compare these. Using GLMs, you can also compare differences between pairs of time points for one drug with differences for another drug or for the untreated controls.
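
    As a rough illustration of comparing "differences between pairs of time points" for two drugs, the quantity of interest is a difference of differences on the log2 scale. The expression values below are invented, and this is plain arithmetic, not a DESeq call; a GLM would estimate the same quantity as an interaction term and attach a p value to it.

```python
# Hypothetical mean log2 expression of one gene, keyed by (drug, day).
# All values are invented for illustration.
log2_expr = {
    ("A", 3): 5.0, ("A", 9): 7.0,
    ("B", 3): 5.25, ("B", 9): 5.75,
}


def time_effect(drug, early, late):
    """log2 fold change from the early to the late time point for one drug."""
    return log2_expr[(drug, late)] - log2_expr[(drug, early)]


effect_a = time_effect("A", 3, 9)
effect_b = time_effect("B", 3, 9)

# Difference of differences: how much more the gene changes under A than under B.
interaction = effect_a - effect_b
print(effect_a, effect_b, interaction)
```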

    But which comparisons (contrasts) are useful to analyze data? Figuring this out is, in my opinion, the main challenge of time course data.

    Of course, all these pairwise comparisons are a bit pedestrian if you have more than three or four time points. You might be more interested in curve fits, and this is a very different statistical task, with which I have little experience. I haven't yet seen any such analysis published for RNA-Seq data, but there are lots of papers on microarray time courses. Maybe the article by Hafemeister et al. in the current issue of Bioinformatics is a good starting point. Translating such methods to the RNA-Seq setting is certainly something that needs to be done now.

    S



  • Cole Trapnell
    replied
    For those of you reading this thread who wished to use DESeq with Cufflinks: please check out http://cufflinks.cbcb.umd.edu for release 1.0.0, which includes a major overhaul of replicate support in Cuffdiff. Cuffdiff now models overdispersion of fragment counts at the transcript level, building on ideas introduced by DESeq and edgeR to greatly improve accuracy in calling differentially expressed genes and transcripts.

    Thanks to all the commenters on this thread and elsewhere for helpful feedback!



  • Simon Anders
    replied
    The SCV plot is perfectly fine. The black curve just shows that you have genes with expression strength ranging from 1 count up to 300 or 1000 counts. For genes with more than around 100 counts, the biological noise is very low. For lowly expressed genes, you not only have strong shot noise but also strong biological noise.
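
    A small sketch of the two noise components, assuming a negative binomial model in which variance = mean + dispersion * mean^2. Dividing by mean^2 gives a squared coefficient of variation of 1/mean (shot noise) plus the dispersion (biological noise). The dispersion values are invented and only mimic the pattern described above.

```python
def squared_cv(mean, dispersion):
    """Squared coefficient of variation under a negative binomial model.

    variance = mean + dispersion * mean**2, so
    CV^2 = variance / mean**2 = 1/mean (shot noise) + dispersion (biological noise).
    Returns (shot, biological, total).
    """
    shot = 1.0 / mean
    return shot, dispersion, shot + dispersion


# (mean count, dispersion) pairs; the dispersions are invented for illustration.
examples = [(1, 0.5), (10, 0.2), (100, 0.02), (1000, 0.02)]

for mean, disp in examples:
    shot, bio, total = squared_cv(mean, disp)
    print(f"mean={mean:>5}  shot={shot:.3f}  biological={bio:.3f}  total={total:.3f}")
```

    The shot-noise term shrinks as 1/mean regardless of the data, so whatever remains at high counts is the biological part.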

    It is a priori surprising that the biological noise should depend so strongly on the gene's expression strength (after all, a coefficient of variation is already normalized for expression strength). However, I've seen this before; it is quite common. We are currently investigating the hypothesis that this may happen whenever the library preparation PCR was started with very low initial cDNA concentration.

    In your case, however, you have good replicability for your more strongly expressed genes, i.e., you should get good results.



  • ecofriendly
    replied
    Not sure if the attachment went through...trying again



  • ecofriendly
    replied
    Hi Simon, Please see the attachment.



  • Simon Anders
    replied
    I really need to change this plot; it seems to confuse people. The black curve (labelled "base mean"), which I suppose you are talking about, does not show SCV values (i.e., the y axis does not apply to it). Rather, it is only there to indicate which expression strengths actually occur in your sample, in order to tell you which parts of the colored curves (which do show SCV values) are of interest.

    Maybe post your plot here, then I can try to clarify it.



  • ecofriendly
    replied
    Dear Simon,

    First of all, thanks for your response and clarification. I see your point regarding Fig. 3 in the paper, which is consistent with my own MvA plots - that at low expression levels, the fold change has to be bigger for DESeq to call that change significant.

    To be clear, I just meant that when making the SCV plot as described in Fig. 1 of the vignette, I get a graph that looks different from the example shown. Basically, my mean curve doesn't follow a bell shape on the left side; instead of tailing off, the SCV values remain high on the left, and it looks as though there is a second, smaller peak. Is this something that other users have seen? How does one explain this?

    Thanks again for your help! It is much appreciated.

    Elena

