Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • user 31888
    Member
    • Nov 2015
    • 19

    GAGE pathway analysis (R/Bioconductor): summarizeOverlaps error

    I am trying to use the GAGE pathway analysis tutorial with 3 samples of paired-end RNA-Seq reads (placed in a "tophat_all" directory) that I mapped with TopHat (.bam files have been indexed).

    An error occurred at the read count step:

    Code:
    > library(TxDb.Hsapiens.UCSC.hg19.knownGene)
    > exByGn <- exonsBy(TxDb.Hsapiens.UCSC.hg19.knownGene, "gene")
    > library(GenomicAlignments)
    > fls <- list.files("tophat_all/", pattern="bam$", full.names =T)
    > bamfls <- BamFileList(fls)
    > flag <- scanBamFlag(isSecondaryAlignment=FALSE, isProperPair=TRUE)
    > param <- ScanBamParam(flag=flag)
    > gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union", ignore.strand=TRUE, single.end=FALSE, param=param)
    Error in validObject(.Object) : 
      invalid class "SummarizedExperiment0" object: 'assays' nrow differs from 'mcols' nrow

    When I tried the same with only one sample I get a different error:
    Code:
    > gnCnt <- summarizeOverlaps(exByGn, bamfls, mode="Union", ignore.strand=TRUE, single.end=FALSE, param=param)
    Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x),  : 
      'data' must be of a vector type, was 'NULL'
    Could someone explain me the error?
    Does the error come from my reads (potential unpaired reads, different length...)?

    Thanks !

    Notes:
    (1) I am using the last version of R, Biconductor and its packages. Could it come from the newer versions of the packages that would not fit the tutorial anymore?

    Code:
    > sessionInfo()
    R version 3.2.2 (2015-08-14)
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    locale:
    [1] C
    
    attached base packages:
    [1] stats4    parallel  stats     graphics  grDevices utils     datasets 
    [8] methods   base     
    
    other attached packages:
     [1] GenomicAlignments_1.6.1                
     [2] Rsamtools_1.22.0                       
     [3] Biostrings_2.38.0                      
     [4] XVector_0.10.0                         
     [5] SummarizedExperiment_1.0.1             
     [6] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
     [7] GenomicFeatures_1.22.4                 
     [8] AnnotationDbi_1.32.0                   
     [9] Biobase_2.30.0                         
    [10] GenomicRanges_1.22.1                   
    [11] GenomeInfoDb_1.6.1                     
    [12] IRanges_2.4.1                          
    [13] S4Vectors_0.8.2                        
    [14] BiocGenerics_0.16.1                    
    
    loaded via a namespace (and not attached):
     [1] zlibbioc_1.16.0      BiocParallel_1.4.0   tools_3.2.2         
     [4] DBI_0.3.1            lambda.r_1.1.7       futile.logger_1.4.1 
     [7] rtracklayer_1.30.1   futile.options_1.0.0 bitops_1.0-6        
    [10] RCurl_1.95-4.7       biomaRt_2.26.0       RSQLite_1.0.0       
    [13] XML_3.98-1.3
    (2) Here is the traceback after the error occurred:
    Code:
    > traceback()
    26: stop(msg, ": ", errors, domain = NA)
    25: validObject(.Object)
    24: initialize(value, ...)
    23: initialize(value, ...)
    22: new("SummarizedExperiment0", NAMES = names, elementMetadata = rowData, 
            colData = colData, assays = assays, metadata = as.list(metadata))
    21: new_SummarizedExperiment0(from@assays, names(from@rowRanges), 
            mcols(from@rowRanges), from@colData, from@metadata)
    20: asMethod(object)
    19: as(object, superClass)
    18: slot(x, "assays")
    17: is(slot(x, "assays"), "Assays")
    16: .valid.SummarizedExperiment0.assays_current(x)
    15: valid.func(object)
    14: validityMethod(as(object, superClass))
    13: anyStrings(validityMethod(as(object, superClass)))
    12: validObject(.Object)
    11: initialize(value, ...)
    10: initialize(value, ...)
    9: new("RangedSummarizedExperiment", rowRanges = rowRanges, colData = colData, 
           assays = assays, elementMetadata = elementMetadata, metadata = as.list(metadata))
    8: .new_RangedSummarizedExperiment(assays, rowRanges, colData, metadata)
    7: .local(assays, ...)
    6: SummarizedExperiment(assays = SimpleList(counts = counts), rowRanges = features, 
           colData = colData)
    5: SummarizedExperiment(assays = SimpleList(counts = counts), rowRanges = features, 
           colData = colData)
    4: .dispatchBamFiles(features, reads, mode, match.arg(algorithm), 
           ignore.strand, inter.feature = inter.feature, singleEnd = singleEnd, 
           fragments = fragments, param = param, preprocess.reads = preprocess.reads, 
           ...)
    3: .local(features, reads, mode, algorithm, ignore.strand, ...)
    2: summarizeOverlaps(exByGn, bamfls, mode = "Union", ignore.strand = TRUE, 
           single.end = FALSE, param = param)
    1: summarizeOverlaps(exByGn, bamfls, mode = "Union", ignore.strand = TRUE, 
           single.end = FALSE, param = param)
    Last edited by user 31888; 11-16-2015, 05:06 PM.
  • gringer
    David Eccles (gringer)
    • May 2011
    • 845

    #2
    This error suggests that one of your two input arguments is empty. Try the following two commands to see if anything looks odd:
    Code:
    > str(exByGn)
    > str(bamfls)

    Comment

    • user 31888
      Member
      • Nov 2015
      • 19

      #3
      Thanks ginger for your reply !

      Here are the outputs:
      Code:
      > str(exByGn)
      Formal class 'GRangesList' [package "GenomicRanges"] with 5 slots
        ..@ unlistData     :Formal class 'GRanges' [package "GenomicRanges"] with 6 slots
        .. .. ..@ seqnames       :Formal class 'Rle' [package "S4Vectors"] with 4 slots
        .. .. .. .. ..@ values         : Factor w/ 93 levels "chr1","chr2",..: 19 8 20 18 1 23 3 14 15 11 ...
        .. .. .. .. ..@ lengths        : int [1:19380] 15 2 12 17 17 4 1 11 10 21 ...
        .. .. .. .. ..@ elementMetadata: NULL
        .. .. .. .. ..@ metadata       : list()
        .. .. ..@ ranges         :Formal class 'IRanges' [package "IRanges"] with 6 slots
        .. .. .. .. ..@ start          : int [1:272776] 58858172 58858719 58859832 58860934 58861736 58862757 58863649 58864294 58864658 58864770 ...
        .. .. .. .. ..@ width          : int [1:272776] 224 288 663 1084 282 297 273 270 36 96 ...
        .. .. .. .. ..@ NAMES          : NULL
        .. .. .. .. ..@ elementType    : chr "integer"
        .. .. .. .. ..@ elementMetadata: NULL
        .. .. .. .. ..@ metadata       : list()
        .. .. ..@ strand         :Formal class 'Rle' [package "S4Vectors"] with 4 slots
        .. .. .. .. ..@ values         : Factor w/ 3 levels "+","-","*": 2 1 2 1 2 1 2 1 2 1 ...
        .. .. .. .. ..@ lengths        : int [1:11717] 15 2 46 5 11 87 1 4 1 8 ...
        .. .. .. .. ..@ elementMetadata: NULL
        .. .. .. .. ..@ metadata       : list()
        .. .. ..@ elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
        .. .. .. .. ..@ rownames       : NULL
        .. .. .. .. ..@ nrows          : int 272776
        .. .. .. .. ..@ listData       :List of 2
        .. .. .. .. .. ..$ exon_id  : int [1:272776] 250809 250810 250811 250812 250813 250814 250815 250816 250817 250818 ...
        .. .. .. .. .. ..$ exon_name: chr [1:272776] NA NA NA NA ...
        .. .. .. .. ..@ elementType    : chr "ANY"
        .. .. .. .. ..@ elementMetadata: NULL
        .. .. .. .. ..@ metadata       : list()
        .. .. ..@ seqinfo        :Formal class 'Seqinfo' [package "GenomeInfoDb"] with 4 slots
        .. .. .. .. ..@ seqnames   : chr [1:93] "chr1" "chr2" "chr3" "chr4" ...
        .. .. .. .. ..@ seqlengths : int [1:93] 249250621 243199373 198022430 191154276 180915260 171115067 159138663 146364022 141213431 135534747 ...
        .. .. .. .. ..@ is_circular: logi [1:93] NA NA NA NA NA NA ...
        .. .. .. .. ..@ genome     : chr [1:93] "hg19" "hg19" "hg19" "hg19" ...
        .. .. ..@ metadata       : list()
        ..@ elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
        .. .. ..@ rownames       : NULL
        .. .. ..@ nrows          : int 23459
        .. .. ..@ listData       : Named list()
        .. .. ..@ elementType    : chr "ANY"
        .. .. ..@ elementMetadata: NULL
        .. .. ..@ metadata       : list()
        ..@ partitioning   :Formal class 'PartitioningByEnd' [package "IRanges"] with 5 slots
        .. .. ..@ end            : int [1:23459] 15 17 29 46 63 67 68 79 89 110 ...
        .. .. ..@ NAMES          : chr [1:23459] "1" "10" "100" "1000" ...
        .. .. ..@ elementType    : chr "integer"
        .. .. ..@ elementMetadata: NULL
        .. .. ..@ metadata       : list()
        ..@ elementType    : chr "GRanges"
        ..@ metadata       :List of 1
        .. ..$ genomeInfo:List of 19
        .. .. ..$ Db type                                 : chr "TxDb"
        .. .. ..$ Supporting package                      : chr "GenomicFeatures"
        .. .. ..$ Data source                             : chr "UCSC"
        .. .. ..$ Genome                                  : chr "hg19"
        .. .. ..$ Organism                                : chr "Homo sapiens"
        .. .. ..$ Taxonomy ID                             : chr "9606"
        .. .. ..$ UCSC Table                              : chr "knownGene"
        .. .. ..$ Resource URL                            : chr "http://genome.ucsc.edu/"
        .. .. ..$ Type of Gene ID                         : chr "Entrez Gene ID"
        .. .. ..$ Full dataset                            : chr "yes"
        .. .. ..$ miRBase build ID                        : chr "GRCh37"
        .. .. ..$ transcript_nrow                         : chr "82960"
        .. .. ..$ exon_nrow                               : chr "289969"
        .. .. ..$ cds_nrow                                : chr "237533"
        .. .. ..$ Db created by                           : chr "GenomicFeatures package from Bioconductor"
        .. .. ..$ Creation time                           : chr "2015-10-07 18:11:28 +0000 (Wed, 07 Oct 2015)"
        .. .. ..$ GenomicFeatures version at creation time: chr "1.21.30"
        .. .. ..$ RSQLite version at creation time        : chr "1.0.0"
        .. .. ..$ DBSCHEMAVERSION                         : chr "1.1"
      > str(bamfls)
      Formal class 'BamFileList' [package "Rsamtools"] with 4 slots
        ..@ listData       :List of 3
        .. ..$ sample_1.bam :Reference class 'BamFile' [package "Rsamtools"] with 8 fields
        .. .. ..$ .extptr         :<externalptr> 
        .. .. ..$ path            : chr "tophat_all//sample_1.bam"
        .. .. ..$ index           : chr "tophat_all//sample_1.bam.bai"
        .. .. ..$ yieldSize       : int NA
        .. .. ..$ obeyQname       : logi FALSE
        .. .. ..$ asMates         : logi FALSE
        .. .. ..$ qnamePrefixEnd  : chr NA
        .. .. ..$ qnameSuffixStart: chr NA
        .. .. ..and 14 methods.
        .. ..$ sample_2.bam    :Reference class 'BamFile' [package "Rsamtools"] with 8 fields
        .. .. ..$ .extptr         :<externalptr> 
        .. .. ..$ path            : chr "tophat_all//sample_2.bam"
        .. .. ..$ index           : chr "tophat_all//sample_2.bam.bai"
        .. .. ..$ yieldSize       : int NA
        .. .. ..$ obeyQname       : logi FALSE
        .. .. ..$ asMates         : logi FALSE
        .. .. ..$ qnamePrefixEnd  : chr NA
        .. .. ..$ qnameSuffixStart: chr NA
        .. .. ..and 14 methods.
        .. ..$ sample_3.bam:Reference class 'BamFile' [package "Rsamtools"] with 8 fields
        .. .. ..$ .extptr         :<externalptr> 
        .. .. ..$ path            : chr "tophat_all//sample_3.bam"
        .. .. ..$ index           : chr "tophat_all//sample_3.bam.bai"
        .. .. ..$ yieldSize       : int NA
        .. .. ..$ obeyQname       : logi FALSE
        .. .. ..$ asMates         : logi FALSE
        .. .. ..$ qnamePrefixEnd  : chr NA
        .. .. ..$ qnameSuffixStart: chr NA
        .. .. ..and 14 methods.
        ..@ elementType    : chr "BamFile"
        ..@ elementMetadata: NULL
        ..@ metadata       : list()
      Last edited by user 31888; 11-17-2015, 01:27 PM. Reason: corrected from 'sample_2.bai' to 'sample_2.bam.ai'

      Comment

      • gringer
        David Eccles (gringer)
        • May 2011
        • 845

        #4
        There's something odd about sample_2.bam:

        Code:
          .. .. ..$ path            : chr "GAGE//sample_2.bam"
          .. .. ..$ index           : chr "GAGE//sample_2.bai"
        I think this should be:

        Code:
          .. .. ..$ path            : chr "GAGE//sample_2.bam"
          .. .. ..$ index           : chr "GAGE//sample_2[b].bam.[/b]bai"
        If that's not the problem, I can't see anything else obviously out of place that might indicate an empty list.

        Comment

        • user 31888
          Member
          • Nov 2015
          • 19

          #5
          My mistake, sorry.

          It is actually 'sample_2.bam.bai' (I edited my previous post)

          Comment

          • user 31888
            Member
            • Nov 2015
            • 19

            #6
            Stupid thought: could it be something like EOF or other formatting issue that differs between my sample.bam files and TxDb.Hsapiens.UCSC.hg19.knownGene?

            (I am using a Linux cluster)

            Comment

            • Enmity_LD
              Junior Member
              • Jan 2016
              • 1

              #7
              Hello everyone

              Hi! user 31888, Does your issue being solved? if it does ,colud you please provide me some advice? I meet the same question ,Thank you
              Last edited by Enmity_LD; 01-19-2016, 07:01 PM.

              Comment

              • user 31888
                Member
                • Nov 2015
                • 19

                #8
                maybe see:


                But I ended up doing a fresh install of pretty much everything (R, Python, Bioconductor and associated packages), and it worked.

                Comment

                Latest Articles

                Collapse

                • SEQadmin2
                  Nine Things a Sample Prep Scientist Thinks About Before Sequencing
                  by SEQadmin2


                  I’m not a sequencing expert. I’m a purification scientist who uses NGS to evaluate workflows my group develops. With this perspective, we think about the sample first and the NGS workflow second. The sequencer is an exceptionally honest reporter, but it can only report on what you give it, so whether you get clean, interpretable data from an NGS workflow is largely determined before you begin.

                  Here are nine questions we think about, in roughly the order they matter, before...
                  06-18-2026, 07:11 AM
                • SEQadmin2
                  From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                  by SEQadmin2


                  Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                  The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                  ...
                  06-02-2026, 10:05 AM

                ad_right_rmr

                Collapse

                News

                Collapse

                Topics Statistics Last Post
                Started by SEQadmin2, 06-26-2026, 11:10 AM
                0 responses
                11 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-17-2026, 06:09 AM
                0 responses
                45 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-09-2026, 11:58 AM
                0 responses
                105 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-05-2026, 10:09 AM
                0 responses
                125 views
                0 reactions
                Last Post SEQadmin2  
                Working...