  • Example RNA-seq datasets with low and high false-positive rates

    Hello,

    I am trying to obtain two example RNA-seq datasets: one with a verified low false-positive rate and one with a verified high false-positive rate.

    Specifically, I am hoping to obtain 3 things for each dataset:

    1) The processed count table (filtered, normalized, and otherwise preprocessed) that was fed directly into the model that produced the DEG list.

    2) The DEG list (simply which rows of the count table were designated as DEGs).

    3) An estimated false-positive rate (or a similar metric) showing how reliable the DEG list is, ideally from some gold-standard procedure. This rate should be high for one dataset and low for the other.

    If I need to generate the processed count table and DEG list myself, that is of course fine too. I am just hoping for clear and reproducible documentation.

    I would be very grateful to hear from anyone who knows of even just one of these datasets. Thank you for any input!
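
    To make item 3 concrete, here is a minimal sketch of how such a metric could be computed once a DEG list and a reference ("truth") set are both available. The function name, the gene names, and the 10-gene table are hypothetical; which reference counts as a gold standard is left open.

```python
def deg_error_rates(all_genes, called_degs, true_degs):
    """Compare a DEG list against a reference ("truth") set of genes."""
    all_genes = set(all_genes)
    called = set(called_degs) & all_genes
    truth = set(true_degs) & all_genes

    false_pos = called - truth          # called DE, but not DE in the reference
    non_de_pool = all_genes - truth     # genes that are not DE in the reference

    # False discovery proportion: fraction of the calls that are wrong.
    fdp = len(false_pos) / len(called) if called else 0.0
    # False-positive rate: fraction of non-DE genes that were called anyway.
    fpr = len(false_pos) / len(non_de_pool) if non_de_pool else 0.0
    return {"false_positives": len(false_pos), "fdp": fdp, "fpr": fpr}


# Hypothetical example: 10 genes, 4 calls, 3 reference DEGs.
genes = [f"gene{i}" for i in range(1, 11)]
called = ["gene1", "gene2", "gene3", "gene4"]
truth = ["gene1", "gene2", "gene9"]
rates = deg_error_rates(genes, called, truth)
print(rates)
```

    Note that the two rates use different denominators (number of calls versus number of non-DE genes), so a dataset can look "high" on one and "low" on the other.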

  • #2
    You could simulate them yourself to have precise control over the "truth".
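
    A minimal sketch of what such a simulation could look like, using only the Python standard library. Counts are drawn from a negative binomial (as a gamma-Poisson mixture), and a known subset of genes is up-regulated in the second group. All parameter values and function names here are illustrative assumptions, not taken from any particular package.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate (Knuth's method, in chunks to avoid underflow)."""
    total = 0
    while lam > 0:
        chunk = min(lam, 30.0)
        limit, k, p = math.exp(-chunk), 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        total += k
        lam -= chunk
    return total


def neg_binomial(rng, mean, dispersion):
    """Negative binomial draw with variance = mean + dispersion * mean**2."""
    lam = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return poisson(rng, lam)


def simulate_counts(n_genes=200, n_reps=3, frac_de=0.1, fold=4.0,
                    base_mean=50.0, dispersion=0.1, seed=1):
    """Return (count table, set of truly DE row indices) for two groups."""
    rng = random.Random(seed)
    true_de = set(rng.sample(range(n_genes), round(n_genes * frac_de)))
    table = []
    for g in range(n_genes):
        # Only the truly DE genes get a shifted mean in the second group.
        mean_b = base_mean * fold if g in true_de else base_mean
        row = [neg_binomial(rng, base_mean, dispersion) for _ in range(n_reps)]
        row += [neg_binomial(rng, mean_b, dispersion) for _ in range(n_reps)]
        table.append(row)
    return table, true_de


table, true_de = simulate_counts()
print(len(table), "genes,", len(true_de), "truly DE")
```

    Because `true_de` is known exactly, any DEG list produced from `table` can be scored against it.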

    • #3
      Thank you, I am trying to use real (not simulated) RNA-seq data.

      • #4
        You will find plenty of real datasets that have (or claim to have) low false-positive rates, since everyone wants to achieve that. It may be hard to find a real dataset with a high false-positive rate, since no reviewer would accept that.

        • #5
          Thanks GenoMax.

          1) I agree it might be hard to find a high false-positive-rate example on its own. If that is the case, I am hoping to find an easily reproducible example of a dataset that has, say, a high false-positive rate when analyzed one way but a low false-positive rate when analyzed another way. This might be available in studies promoting a certain methodology. I am very interested in seeing what DEGs look like (by counts) when they come from an analysis with an established high false-positive rate.

          2) I do have one dataset that returns a suspiciously large number of DEGs (through edgeR, DESeq, and limma-voom). However, when I look at the counts of these DEGs, the variation between treatment groups is not much larger than the variation between replicates, as I would expect for true DEGs. This makes me *suspect* that many of these DEGs are false-positive calls. However, I am looking for a dataset that has been compared to some *standard* showing that it does indeed have a high false-positive rate, and unfortunately I do not know of a way to do that with my data. Hence, I am trying to find a public dataset.
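
          The eyeball check described in 2) can be sketched as a rough per-gene score: the difference between group means divided by the pooled within-group spread, on a log scale. This is only an illustrative diagnostic under my own assumptions (the function name, the log2(count + 1) transform, and the example counts are all hypothetical), not a substitute for the edgeR, DESeq, or limma models.

```python
import math
import statistics


def separation_score(group_a, group_b):
    """|difference of group means| / pooled within-group SD, on log2(count + 1).

    Needs at least two replicates per group.
    """
    a = [math.log2(c + 1) for c in group_a]
    b = [math.log2(c + 1) for c in group_b]
    pooled_var = ((len(a) - 1) * statistics.variance(a)
                  + (len(b) - 1) * statistics.variance(b)) / (len(a) + len(b) - 2)
    diff = abs(statistics.mean(a) - statistics.mean(b))
    if pooled_var == 0:
        return math.inf if diff > 0 else 0.0
    return diff / math.sqrt(pooled_var)


# Hypothetical counts: groups clearly separated vs. replicates as variable as groups.
clear = separation_score([100, 110, 95], [400, 380, 420])
noisy = separation_score([100, 300, 40], [150, 90, 350])
print(f"clear: {clear:.1f}, noisy: {noisy:.1f}")
```

          A called DEG with a low score is one where the replicates vary about as much as the treatment groups do.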

          • #6
            RNA-seq differential expression methods are known to be affected by outliers. You have used edgeR to analyse the dataset. Which dispersion estimation method did you use? If you have patient replicates, you should use the robust variety of dispersion estimation. The default method is only useful if you are analysing replicates of cell lines (e.g. 3 replicates of PrEC and 3 replicates of LNCaP), which aren't representative of biological tissue and its heterogeneity. There is also a robust style of limma analysis you could use.
