  • rethinking log-log RPKM plots

    I put together my thoughts on why the log-log plots of RPKM values we are accustomed to seeing may not be the best way to go. It can be found here. I would be interested to hear what you guys think.

    thanks,
    Justin

  • #2
    Interesting discussion, and it seems worth a try on some of my data. I'd always slipped back to my microarray days and added 16 to each FPKM/RPKM, then logged. Too many days playing with Plier versus Plier+16, I guess.



    • #3
      Very interesting! Do you think that the asin transformation could also be used for normalization in differential expression analysis, or in other comparative approaches such as ChIP-seq enrichment (over input) calculations?



      • #4
        Hi Jon,

        I thought it would be a good idea to give it a try on some real data, too. I tried it on some technical replicates from the Marioni paper and put the results here. Seems to support their observation that variation among technical replicates can be captured with a Poisson model, at least for the data they presented.
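For anyone who wants to try the same kind of check, here is a minimal sketch on simulated counts (not the Marioni data). It assumes two replicates with equal sequencing depth, and all function names are my own for illustration. Under a Poisson model the pooled ratio of squared differences to sums should be close to 1; over-dispersed counts push it well above 1.

```python
import math
import random

def poisson(rng, lam):
    """Draw one Poisson(lam) count (Knuth's multiplication method)."""
    if lam > 500:  # split large means so exp(-lam) cannot underflow
        return poisson(rng, lam / 2) + poisson(rng, lam / 2)
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def dispersion_ratio(rep1, rep2):
    """Pooled (x1 - x2)^2 / (x1 + x2); close to 1 when both replicates
    are Poisson draws around the same per-gene mean."""
    num = sum((a - b) ** 2 for a, b in zip(rep1, rep2))
    den = sum(a + b for a, b in zip(rep1, rep2))
    return num / den

def simulate(rng, means, phi=0.0):
    """One replicate. phi = 0 gives Poisson counts; phi > 0 adds gamma
    variation on top (variance = mean + phi * mean^2)."""
    counts = []
    for m in means:
        lam = m if phi == 0 else rng.gammavariate(1 / phi, m * phi)
        counts.append(poisson(rng, lam))
    return counts

rng = random.Random(1)
means = [rng.uniform(1, 200) for _ in range(2000)]  # made-up expression levels
print("Poisson replicates:", dispersion_ratio(simulate(rng, means), simulate(rng, means)))
print("over-dispersed:    ", dispersion_ratio(simulate(rng, means, 0.2), simulate(rng, means, 0.2)))
```

With real replicates of unequal depth, the counts would need to be rescaled first, which this sketch does not attempt.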

        Justin



        • #5
          Hi mudshark,

          These transformations are helpful for determining whether or not your data fit the Poisson model. For technical replicates, it seems that the Poisson model works well, and the variation can all be accounted for by the effects of random sampling. However, biological replicates generally appear to be over-dispersed, and so do not fit the Poisson model. The variance-stabilizing transformations for over-dispersed data that I have come across in my searches all rely on knowing the over-dispersion parameter, which wouldn't be feasible to estimate on a gene-by-gene basis since the number of samples is usually small. Some methods that try to account for over-dispersion (like DESeq) pool genes with similar expression levels to get some sort of estimate of the over-dispersion.

          So, I don't think this particular transformation would be good for over-dispersed data, but if you used the appropriate transformation and knew the over-dispersion parameter, then I think it could be a useful plot for identifying differentially expressed genes.
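To make the dependence on the over-dispersion parameter concrete, here is a rough sketch of one such transformation. It assumes a negative binomial model with variance = mean + phi * mean^2 and a dispersion phi that is already known; the transform 2/sqrt(phi) * asinh(sqrt(phi * x)) follows from the delta method under that assumption. The function names and simulated counts are mine, for illustration only.

```python
import math
import random
import statistics

def poisson(rng, lam):
    """Draw one Poisson(lam) count (Knuth's multiplication method)."""
    if lam > 500:  # split large means so exp(-lam) cannot underflow
        return poisson(rng, lam / 2) + poisson(rng, lam / 2)
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def nb_vst(x, phi):
    """Variance-stabilizing transform for counts with
    variance = mean + phi * mean^2, given a known dispersion phi > 0."""
    return 2.0 / math.sqrt(phi) * math.asinh(math.sqrt(phi * x))

def nb_sample(rng, mean, phi, n):
    """n gamma-Poisson (negative binomial) counts with the given mean and dispersion."""
    return [poisson(rng, rng.gammavariate(1 / phi, mean * phi)) for _ in range(n)]

rng = random.Random(2)
phi = 0.1  # assumed known here; estimating it per gene is the hard part
for mean in (20, 50, 200):
    counts = nb_sample(rng, mean, phi, 3000)
    v_asinh = statistics.variance([nb_vst(c, phi) for c in counts])
    v_sqrt = statistics.variance([2 * math.sqrt(c) for c in counts])
    # columns: mean, variance after asinh transform, variance after 2*sqrt
    print(mean, round(v_asinh, 2), round(v_sqrt, 2))
```

On simulated data like this, the asinh-transformed variance should stay roughly flat across expression levels, while the plain square root lets the variance grow with the mean. As phi approaches 0 the transform reduces to 2*sqrt(x), the Poisson case.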



          • #6
            Anscombe or sqrt

            Dear Justin,

            I tried your suggestion, which makes mathematical sense, on some RNA-seq data of mine and it gives beautiful scatter plots among the technical replicates. There is not much difference between Anscombe and simple square root transformations.
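A quick way to see why the two behave so similarly is to simulate Poisson counts and compare the variance after each transform. This is only a sketch on made-up data, not my RNA-seq counts. I scale the square root by 2 so that both transforms target a variance of about 1, and the function names are illustrative.

```python
import math
import random
import statistics

def poisson(rng, lam):
    """Draw one Poisson(lam) count (Knuth's method; fine for lam below ~700)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k

def anscombe(x):
    """Anscombe transform: 2 * sqrt(x + 3/8)."""
    return 2.0 * math.sqrt(x + 3.0 / 8.0)

def root(x):
    """Simple square root, scaled by 2 to match Anscombe's target variance."""
    return 2.0 * math.sqrt(x)

def transformed_variances(rng, mean, n=5000):
    """Sample variance of n Poisson(mean) counts after each transform."""
    counts = [poisson(rng, mean) for _ in range(n)]
    return (statistics.variance([anscombe(c) for c in counts]),
            statistics.variance([root(c) for c in counts]))

rng = random.Random(3)
for mean in (2, 10, 50, 150):
    va, vs = transformed_variances(rng, mean)
    # columns: mean, variance after Anscombe, variance after 2*sqrt
    print(mean, round(va, 3), round(vs, 3))
```

The 3/8 offset mainly matters for very small counts; once the mean is in the tens, the two transforms should be hard to tell apart on a scatter plot.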

            Thanks for the suggestion.

            Gunter



            • #7
              Hi Gunter,

              I am glad you found that suggestion useful. I'll see what I can do about writing it up as a technical note and submitting it. By the way, my name is Justin, and I am at Washington University in St. Louis, working on the informatics pipeline for next-gen sequencing. Nice to meet you!

              Justin
