  • Question on a paired test for RNA expression

    Hi, I am having trouble finding a tool for my specific analysis.

    I have a group of people with expression data under 2 different conditions.

    DESeq does not handle paired samples, I am not sure limma does, and I didn't find anything simple...

    Thank you !

  • #2
    Limma can handle paired setups. So can DESeq and edgeR ("person" is just used as a factor).



    • #3
      Thank you! I found here that DESeq does not support paired tests:


      It's not clear for DESeq2 and hard to figure out for limma. Maybe someone could give me an example?



      • #4
        It sort of depends on what method you want to use to model the paired samples. The classic method would be a mixed-effects linear model, which DESeq doesn't do. You can also just put each patient/person in as a factor, which isn't doing exactly the same thing but is at least in the right direction. Limma has some methods of its own, since it's not using raw count data. In the edgeR vignette, there's an example of this in section 3.4.1 (DESeq works the same way). In the limma vignette, look at section 16.3 (or google around for other "tumor normal comparison" examples, which are the most frequent use case).
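
        A minimal sketch of the "person as a factor" setup, in base R only. The sample layout (three subjects, each measured at week 0 and week 8) and the names samples, subject and week are made up for illustration; they are not taken from the poster's data.

        ```r
        # Hypothetical toy layout: 3 subjects, each measured at week 0 and week 8.
        samples <- data.frame(
          subject = factor(c("s1", "s2", "s3", "s1", "s2", "s3")),
          week    = factor(c("0", "0", "0", "8", "8", "8"))
        )

        # Subject enters the model as a blocking factor, so the week effect
        # is estimated within each subject rather than across subjects.
        design <- model.matrix(~ subject + week, data = samples)

        # Columns: intercept, one per additional subject, then the week effect.
        colnames(design)
        ```

        With a design like this the week effect is the last coefficient, so that is the one to test in a GLM framework (for example through the coef argument of edgeR's glmLRT).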



        • #5
          As I understand it, edgeR, like DESeq, has both the exact test and the GLM test. In both cases the GLM test allows testing multiple factors, so we can use the subject ID as a factor.
          The problem is that the GLM test seems less accurate: I compared both and the results are quite different.
          I also compared my edgeR results with my DESeq results and they are still very different.
          I am a bit new to the field, so I would be happy to get any comments. Here is my code:


          Code:
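          library("edgeR")

          # Count table: rows = miRNAs, columns = samples.
          miRNA <- read.table("merged_filtered_data.csv", header=TRUE, row.names=1)
          roundmiRNA <- round(miRNA)
          # Drop miRNAs with zero counts in every sample.
          roundmiRNA <- roundmiRNA[rowSums(roundmiRNA) != 0, ]

          # Sample annotation (one column per sample); read it before it is used.
          info <- read.table("samples_infos.csv", header=TRUE, row.names=1)
          # factor() keeps numeric subject IDs and weeks from being fitted as continuous.
          miRNAdesign <- data.frame(row.names = colnames(info),
            subject = factor(t(info)[, "subject"]),
            week = factor(t(info)[, "week"])
          )
          # Counts for the week-0 samples only (not used below).
          counts <- roundmiRNA[, miRNAdesign$week == "0"]

          # Exact test between weeks, ignoring the pairing.
          y <- DGEList(counts=roundmiRNA, group=miRNAdesign$week)
          y <- calcNormFactors(y)
          y <- estimateCommonDisp(y)
          y <- estimateTagwiseDisp(y)
          et <- exactTest(y)
          topTags(et)

          # GLM test, still unpaired.
          design <- model.matrix(~miRNAdesign$week)
          y <- DGEList(counts=roundmiRNA)
          y <- calcNormFactors(y)  # normalize here too, as for the exact test
          y <- estimateGLMCommonDisp(y, design)
          y <- estimateGLMTrendedDisp(y, design)
          y <- estimateGLMTagwiseDisp(y, design)
          fit <- glmFit(y, design)
          lrt <- glmLRT(fit, coef=2)
          topTags(lrt)

          # Paired GLM: subject as a blocking factor.
          design <- model.matrix(~miRNAdesign$subject+miRNAdesign$week)
          y <- estimateGLMCommonDisp(y, design)
          y <- estimateGLMTrendedDisp(y, design)
          y <- estimateGLMTagwiseDisp(y, design)
          fit <- glmFit(y, design)
          # Assuming week has two levels, its coefficient is the last column.
          lrt <- glmLRT(fit, coef=ncol(design))
          topTags(lrt)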



          • #6
            Limma does a contrast test that I do not understand yet... My question still stands!



            • #7
              roundmiRNA=round(miRNA)
              You need to thoroughly reconsider what you're doing. That line in your script tells me that your counts are wrong (unless that line has no effect).

              That the exact test and a GLM produce somewhat different results isn't exactly surprising. Normally the differences are at the margins.



              • #8
                I rounded my counts because each read is split between all of its possible mapping locations in the genome when I build the table. That produces floating-point numbers!

                I have to do the math because I don't really get what happened. Thank you for your help.



                • #9
                  Originally posted by raphael123
                  I rounded my counts because each read is split between all of its possible mapping locations in the genome when I build the table. That produces floating-point numbers!
                  That's not the appropriate way to go about things. All counts should be integers because they can be nothing but integers. Use featureCounts or htseq-count (or summarizeOverlaps in R, though that's probably really slow) to derive your count tables.
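
                  A quick way to check whether a table really holds raw counts, sketched in base R. The matrix frac and its values are hypothetical, standing in for a table built by splitting multi-mapped reads across locations.

                  ```r
                  # Hypothetical fractional table (rows = features, columns = samples).
                  frac <- matrix(c(10, 0.5, 3.25, 7, 0.5, 2), nrow = 3,
                                 dimnames = list(c("f1", "f2", "f3"), c("s1", "s2")))

                  # TRUE where an entry is already a whole number.
                  is_whole <- frac == round(frac)

                  # FALSE here, so this is not a raw count table.
                  all(is_whole)

                  # How far rounding moves each sample's total.
                  colSums(frac) - colSums(round(frac))
                  ```

                  If the first check fails, rounding only hides the problem; the per-sample totals show how much the rounding shifts the library sizes.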



                  • #10
                    Originally posted by dpryan
                    That's not the appropriate way to go about things. All counts should be integers because they can be nothing but integers.
                    How do you deal with a read that maps to two different places? You can make it disappear, as the tools you suggested do. At the end of the day you lose some information:
                    • featureCounts : It does not count reads overlapping with more than one feature
                    • htseq-count : If it contains more than one feature, the read is counted as ambiguous (and not counted for any features)



                    • #11
                      You don't deal with ambiguous mappings since all of the count-based statistical tests (edgeR/DESeq/Limma/etc.) aren't meant to deal with them (a general rule of high-throughput data is that you need to whittle away the parts of the data that aren't useable).
