rethinking log-log RPKM plots


  • rethinking log-log RPKM plots

    I put together my thoughts on why the log-log plots of RPKM values we are accustomed to seeing may not be the best way to go. They can be found here. I would be interested to hear what you guys think.

    thanks,
    Justin

  • #2
    Interesting discussion, and it seems worth a try on some of my data. I'd always slipped back to my microarray days and added 16 to each FPKM/RPKM, then logged. Too many days playing with Plier versus Plier+16, I guess.
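
A minimal sketch of the add-16-then-log habit described in this post. The base of the logarithm is not stated in the post; base 2 is assumed here, and the function name `started_log2` is made up for illustration.

```python
import math


def started_log2(value, offset=16.0):
    """Return log2(value + offset); the offset keeps zeros finite.

    Base 2 and the function name are assumptions for illustration.
    """
    return math.log2(value + offset)


# A zero RPKM maps to log2(16) = 4 instead of minus infinity.
for rpkm in (0.0, 1.0, 16.0, 1000.0):
    print(rpkm, round(started_log2(rpkm), 3))
```

The offset compresses everything below roughly 16 into a narrow band just above 4, which tames the noisy low end of a log-log plot but does not by itself equalize the variance across expression levels.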



    • #3
      Very interesting! Do you think that the asin transformation could also be used for normalization in differential expression analysis or other comparative approaches such as ChIP-seq enrichment (over input) calculations?



      • #4
        Interesting discussion, and it seems worth a try on some of my data. I'd always slipped back to my microarray days and added 16 to each FPKM/RPKM, then logged. Too many days playing with Plier versus Plier+16, I guess.
        Hi Jon,

        I thought it would be a good idea to give it a try on some real data, too. I tried it on some technical replicates from the Marioni paper and put the results here. Seems to support their observation that variation among technical replicates can be captured with a Poisson model, at least for the data they presented.

        Justin
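
One quick way to ask whether replicate counts look Poisson is the variance-to-mean ratio, which should sit near 1 when random sampling explains all the variation. The sketch below uses made-up toy counts for a single gene (not values from the Marioni data), and the helper name `dispersion_index` is hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson model)."""
    return variance(counts) / mean(counts)


# Toy read counts for one gene across four replicates (illustrative only).
technical_like = [90, 112, 97, 101]      # spread comparable to the mean
overdispersed_like = [60, 150, 95, 95]   # spread far larger than the mean

print(dispersion_index(technical_like))
print(dispersion_index(overdispersed_like))
```

With only a handful of replicates per gene the ratio is very noisy, so in practice it is more informative to look at its distribution over many genes than at any single value.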



        • #5
          Very interesting! Do you think that the asin transformation could also be used for normalization in differential expression analysis or other comparative approaches such as ChIP-seq enrichment (over input) calculations?
          Hi mudshark,

          These transformations are helpful for determining whether or not your data fit the Poisson model. For technical replicates, it seems that the Poisson model works well, and the variation can all be accounted for by the effects of random sampling. However, biological replicates appear to be over-dispersed in general, and so do not fit the Poisson model. The variance stabilization transformations for over-dispersed data that I have come across in my searches all rely on knowing the over-dispersion parameter, which wouldn't be feasible to estimate on a gene-by-gene basis since the number of samples is usually small. Some methods that try to account for over-dispersion will pool genes with similar expression levels (like DESeq) to try to get some sort of estimate of the over-dispersion.

          So, I don't think this particular transformation would be good for over-dispersed data, but if you used the appropriate transformation and knew the over-dispersion parameter, then I think it could be a useful plot for identifying differentially expressed genes.
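
As a rough illustration of what a transformation with a known over-dispersion parameter could look like: if the counts are assumed to be negative binomial with variance mu + phi*mu^2, the delta method suggests g(x) = (2/sqrt(phi)) * asinh(sqrt(phi*x)). The negative binomial model, the symbol `phi`, and all function names below are assumptions for this sketch; it is not the procedure used by DESeq or by the original write-up. The code sums over the probability mass function, so no simulation is involved.

```python
import math


def nb_pmf(k, mu, phi):
    """Negative binomial pmf with mean mu and variance mu + phi * mu**2."""
    r = 1.0 / phi
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
    return math.exp(log_p)


def nb_vst(x, phi):
    """Delta-method variance-stabilizing transform for a known phi."""
    return 2.0 / math.sqrt(phi) * math.asinh(math.sqrt(phi * x))


def variance_under_nb(transform, mu, phi):
    """Variance of transform(X) for X ~ NB(mu, phi), by direct summation."""
    sd = math.sqrt(mu + phi * mu * mu)
    upper = int(mu + 30 * sd) + 30  # far enough out that the tail is negligible
    ks = range(upper + 1)
    probs = [nb_pmf(k, mu, phi) for k in ks]
    center = sum(p * transform(k) for p, k in zip(probs, ks))
    return sum(p * (transform(k) - center) ** 2 for p, k in zip(probs, ks))


phi = 0.1  # assumed to be known in advance
for mu in (10, 100, 1000):
    raw = variance_under_nb(float, mu, phi)
    stabilized = variance_under_nb(lambda k: nb_vst(k, phi), mu, phi)
    print(mu, round(raw, 1), round(stabilized, 3))
```

The raw variance grows by orders of magnitude across these means while the transformed variance stays close to 1, which is the property a scatter plot of replicates would need. The catch is the one raised above: phi has to come from somewhere, and a handful of samples per gene will not pin it down.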



          • #6
            Anscombe or sqrt

            Dear Justin,

            I tried your suggestion, which makes mathematical sense, on some RNA-seq data of mine and it gives beautiful scatter plots among the technical replicates. There is not much difference between Anscombe and simple square root transformations.

            Thanks for the suggestion.

            Gunter
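
To see why the two transformations behave so similarly, here is a small sketch that computes the variance of each under a Poisson model by summing over the probability mass function. The square root is scaled by 2 so both are on the same footing as the Anscombe transform 2*sqrt(x + 3/8); the function names are made up for illustration.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of observing k, computed in log space."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def variance_under_poisson(transform, mean):
    """Variance of transform(X) for X ~ Poisson(mean), by direct summation."""
    upper = int(mean + 30 * math.sqrt(mean)) + 30  # tail beyond this is negligible
    ks = range(upper + 1)
    probs = [poisson_pmf(k, mean) for k in ks]
    center = sum(p * transform(k) for p, k in zip(probs, ks))
    return sum(p * (transform(k) - center) ** 2 for p, k in zip(probs, ks))


def scaled_sqrt(k):
    """Plain square root, scaled so its variance approaches 1."""
    return 2.0 * math.sqrt(k)


def anscombe(k):
    """Anscombe transform for Poisson counts."""
    return 2.0 * math.sqrt(k + 0.375)


for mean in (2, 20, 100):
    print(mean,
          round(variance_under_poisson(scaled_sqrt, mean), 4),
          round(variance_under_poisson(anscombe, mean), 4))
```

Both variances approach 1 as the mean grows, and the gap between them shrinks quickly, so on a scatter plot dominated by moderately expressed genes the two should be hard to tell apart. The difference is confined to genes with very low counts.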



            • #7
              Hi Gunter,

              I am glad you found that suggestion useful. I'll see what I can do about writing it up as a technical note and submitting it. By the way, my name is Justin, and I am at Washington University in St. Louis, working on the informatics pipeline for next-gen sequencing. Nice to meet you!

              Justin
