  • Example RNA-seq datasets with low and high false-positive rates

    Hello,

    I am trying to obtain two example RNA-seq datasets: one with a verified low false-positive rate and one with a verified high false-positive rate.

    Specifically, I am hoping to obtain 3 things for each dataset:

    1) The processed count table (filtered, normalized, and whatever else) that was directly fed into the model that created the DEG list.

    2) The DEG list (simply which rows of the count table were designated DEGs)

    3) An estimated false-positive rate (or similar metric) showing how reliable the DEG list is, ideally derived from some gold-standard procedure. For one dataset this rate should be high; for the other it should be low.

    If I need to generate the processed count table and DEG list myself, that is of course fine too. I am just hoping for clear and reproducible documentation.

    I would be very grateful to hear from anyone who knows of even one such dataset. Thank you for any input!
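
For item 3 above, one possible way to quantify reliability, assuming a trusted "truth" set of DE genes is available (for example from spike-ins or an independent validation), is to compare the DEG list against it. The sketch below is only illustrative: the function name `error_rates` and the gene IDs are hypothetical, and it reports both the false discovery rate (share of called DEGs that are wrong) and the false-positive rate (share of truly non-DE genes that were called), since the two are easily confused.

```python
def error_rates(called, truth, all_genes):
    """Compare a DEG list against a gold-standard truth set.

    called    -- gene IDs reported as DEGs
    truth     -- gene IDs that are truly differentially expressed
    all_genes -- every gene ID in the count table (rows that were tested)
    """
    called, truth, all_genes = set(called), set(truth), set(all_genes)
    false_pos = called - truth        # called DE but not truly DE
    negatives = all_genes - truth     # genes that are truly not DE

    # FDR: fraction of the DEG list that is wrong.
    fdr = len(false_pos) / len(called) if called else 0.0
    # FPR: fraction of truly non-DE genes that were called DE.
    fpr = len(false_pos) / len(negatives) if negatives else 0.0
    return {"fdr": fdr, "fpr": fpr}


if __name__ == "__main__":
    genes = [f"g{i}" for i in range(1, 11)]   # hypothetical gene IDs
    rates = error_rates(called={"g1", "g2", "g4", "g5"},
                        truth={"g1", "g2", "g3"},
                        all_genes=genes)
    print(rates)
```

With real data the hard part is the truth set itself; the arithmetic only becomes meaningful once such a standard exists.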

  • #2
    You could simulate them yourself to have precise control over the "truth".
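
A minimal sketch of what such a simulation could look like, assuming negative-binomial counts (drawn here as a gamma-Poisson mixture) and a two-group design. Every name and parameter value below (`simulate_counts`, `base_mean`, `fold_change`, `dispersion`) is an illustrative assumption, not something taken from a specific package.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) value with Knuth's method (fine for modest lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_counts(n_genes=200, n_reps=3, frac_de=0.1, base_mean=20.0,
                    fold_change=4.0, dispersion=0.1, seed=0):
    """Return (counts, is_de): a two-group count table and the known truth.

    Each row holds n_reps control counts followed by n_reps treated counts.
    """
    rng = random.Random(seed)
    shape = 1.0 / dispersion          # gamma shape, so variance = mu + dispersion * mu^2
    counts, is_de = [], []
    for _ in range(n_genes):
        de = rng.random() < frac_de   # is this gene truly DE?
        treated_mean = base_mean * (fold_change if de else 1.0)
        means = [base_mean] * n_reps + [treated_mean] * n_reps
        # Gamma-distributed rate, then Poisson sampling, gives a negative binomial count.
        row = [poisson(rng, rng.gammavariate(shape, mu / shape)) for mu in means]
        counts.append(row)
        is_de.append(de)
    return counts, is_de


if __name__ == "__main__":
    table, truth = simulate_counts(n_genes=5, seed=1)
    for row, de in zip(table, truth):
        print(row, "DE" if de else "not DE")
```

Because `is_de` records which genes were simulated as DE, any DEG list produced from `counts` can be scored exactly.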

    • #3
      Thank you, I am trying to use real (not simulated) RNA-seq data.

      • #4
        You will find plenty of real datasets that (claim to) have low false-positive rates (everyone wants to achieve that), but it may be hard to find a real dataset with a high false-positive rate (since no reviewer would accept that).

        • #5
          Thanks GenoMax.

          1) I agree it might be hard to find a high false-positive rate example on its own. If that is the case, I am hoping to find an easily reproducible example of a dataset that has, say, a high false-positive rate when analyzed one way but a low false-positive rate when analyzed another way. This might be available in studies promoting a certain methodology. I am very interested in seeing what DEGs look like (by counts) when they come from an analysis with an established high false-positive rate.

          2) I do have one dataset that returns a suspiciously large number of DEGs (through edgeR, DESeq, and limma-voom). However, when I look at the DEGs (view their counts), I do not see the much larger variation between treatment groups than between replicates that I would expect. This makes me *suspect* that many of these DEGs are false-positive calls. However, I am looking for a dataset that has been compared to some *standard* showing it does have a high false-positive rate, and unfortunately I do not know of a way to do that with my data. Hence, I am trying to find a public dataset.
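
The visual check described in point 2 can be made a little more systematic. The sketch below is only a rough heuristic, not a statistical test: for one gene it compares the difference between group means with the largest spread among replicates, both on a log2 scale. The function name, the pseudocount of 1, and the example counts are all assumptions made for illustration.

```python
import math
import statistics


def group_vs_replicate_spread(row, n_a, pseudo=1.0):
    """Return (between, within) for one gene on the log2(count + pseudo) scale.

    row -- counts for one gene: the first n_a values are group A, the rest group B
    between -- absolute difference of the two group means
    within  -- largest replicate range (max - min) seen inside either group
    """
    logs = [math.log2(c + pseudo) for c in row]
    a, b = logs[:n_a], logs[n_a:]
    between = abs(statistics.mean(a) - statistics.mean(b))
    within = max(max(a) - min(a), max(b) - min(b))
    return between, within


if __name__ == "__main__":
    # Hypothetical counts: three replicates per group.
    for row in ([10, 12, 11, 100, 90, 110], [10, 40, 20, 15, 35, 25]):
        between, within = group_vs_replicate_spread(row, n_a=3)
        verdict = "group effect dominates" if between > within else "replicate noise dominates"
        print(row, verdict)
```

Called DEGs where replicate spread is as large as the group difference are the ones worth inspecting more closely, though this by itself does not establish a false-positive rate.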

          • #6
            RNA-seq differential expression methods are known to be affected by outliers. You used edgeR to analyse the dataset: which dispersion estimation method did you use? If you have patient replicates, you should use the robust variant of dispersion estimation. The default method is only useful if you are analysing replicates of cell lines (e.g. 3 replicates of PrEC and 3 replicates of LNCaP), which are not representative of biological tissue and its heterogeneity. There is also a robust style of limma analysis you could be using.
