  • GSEA: None of the gene sets passed the size thresholds

    I am trying to use GSEA for some RNA-Seq data. I've used it previously for microarray data and it worked fine. My guess is that using well-defined probe set IDs helped.

    For my data, I tried using the provided gene symbols as well as symbols exactly matching what I have. No matter what I do, I end up with the "None of the gene sets passed the size thresholds" error.

    Using GenePattern (which I assume should be the safest option), I get the following output:
    Code:
    1286 [INFO ] Begun importing: Chip from: /xchip/gpprod-upload/servers/genepattern/users/uploads/tmp/run4907393233174326312.tmp/chip.platform.file/1/same.chip
    1334 [WARN ] Missing chip file: >/xchip/gpprod-upload/servers/genepattern/users/uploads/tmp/run4907393233174326312.tmp/chip.platform.file/1/same.chip<	at edu.mit.broad.vdb.chip.FileInMemoryChip.initHere(?:?)
    1502 [INFO ] Parsed from dotchip : 21862
    1350 [WARN ] Missing chip file: >/xchip/gpprod-upload/servers/genepattern/users/uploads/tmp/run4907393233174326312.tmp/chip.platform.file/1/same.chip<	at edu.mit.broad.vdb.chip.FileInMemoryChip.initHere(?:?)
    1738 [INFO ] Collapsing dataset was done. Original: 21862x2 (ann: 21862,2,same.chip) collapsed: 860x2 (ann: 860,2,GENE_SYMBOL)
    to parse>c5.all.v4.0.symbols.gmt< got: [c5.all.v4.0.symbols.gmt]
    1763 [INFO ] Begun importing: GeneSetMatrix from: c5.all.v4.0.symbols.gmt
    2110 [INFO ] Got gsets: 1454 now preprocessing them ... min: 3 max: 500
    Done removeGeneSetsSmallerThan: 3 for: 501 / 1454
    Done removeGeneSetsSmallerThan: 3 for: 1001 / 1454
    2259 [INFO ] Done preproc for smaller than: 3
    2428 [INFO ] Renaming rpt dir on error to: error_.
    2276 [WARN ] Could not rename for error to: error_.	at edu.mit.broad.genome.reports.api.ToolReport.setErroredOut(?:?)
    Those warnings look questionable, but they are not exactly informative. Why would it say "Missing chip file" when the chip file is obviously present?
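One way to narrow this down by hand is to count how many genes of each gene set actually appear in the dataset, since the size thresholds are applied after a set is restricted to genes present in the expression data. The log above reports the dataset collapsing from 21862 to 860 rows and thresholds of min 3 and max 500, so a poor identifier match would be a plausible cause. The following is only a minimal sketch: `read_gmt` and `sets_passing_thresholds` are hypothetical helper names, the toy data is made up, and exact, case-sensitive symbol matching is an assumption.

```python
import io


def read_gmt(handle):
    """Parse GMT text: set name, description, then gene symbols, tab-separated."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, genes = fields[0], fields[2:]
        gene_sets[name] = {g.strip() for g in genes if g.strip()}
    return gene_sets


def sets_passing_thresholds(gene_sets, dataset_genes, min_size, max_size):
    """Return {set name: overlap size} for sets whose overlap with the dataset
    falls within [min_size, max_size]. Matching is exact (an assumption)."""
    dataset = {g.strip() for g in dataset_genes}
    passing = {}
    for name, genes in gene_sets.items():
        overlap = len(genes & dataset)
        if min_size <= overlap <= max_size:
            passing[name] = overlap
    return passing


# Toy data for illustration only; "Gapdh" deliberately fails to match "GAPDH".
gmt_text = "SET_A\tna\tTP53\tBRCA1\tEGFR\tMYC\nSET_B\tna\tGAPDH\tACTB\n"
dataset_symbols = ["TP53", "BRCA1", "EGFR", "Gapdh"]

passing = sets_passing_thresholds(
    read_gmt(io.StringIO(gmt_text)), dataset_symbols, min_size=3, max_size=500
)
print(passing)
```

If a check like this returns an empty result on the real .gmt file and the real list of row identifiers, the identifiers in the dataset probably do not match the symbols in the gene sets (for example because of case, species, or ID type).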

  • #2
    Hi!

    I just started encountering the same error ('Renaming rpt dir on error to: error_.').
    Unlike you, I'm still working with microarray data. I also tried the collapse option, as well as symbols that exactly match my dataset probes.

    Were you able to resolve the problem? Does anyone have any advice?

    Thanks!

    P.S. I'm sure that the threshold is well above the number of genes in my gene set.
    Last edited by snaporaz; 10-23-2014, 11:35 AM. Reason: clarification


    • #3
      I haven't used the Broad's GSEA in a while, but for RNA-Seq data you could try a Bioconductor package:
      The package generally provides methods for gene set enrichment analysis of high-throughput RNA-Seq data by integrating differential expression and splicing. It uses negative binomial distribution to model read count data, which accounts for sequencing biases and biological variation. Based on permutation tests, statistical significance can also be achieved regarding each gene's differential expression and splicing, respectively.


      • #4
        Set 'Collapse dataset to gene symbols' to false; that might help.
        Last edited by jamesjcai; 03-08-2017, 08:51 AM.
