  • easyRNASeq errors / experiences

    Hi all,

    At the moment I'm trying to set up easyRNASeq + edgeR to analyse my paired-end data in R. I'm following the easyRNASeq manual to obtain a table of read counts, which can then be used as a DGEList object for edgeR.

    Unfortunately, I do not even get close to obtaining the DGEList object, as the easyRNASeq function crashes with the following error:

    Code:
    Checking arguments... 
    Fetching annotations... 
    Computing gene models... 
    Summarizing counts... 
    Processing EMC_18_alignment.bam 
    Updating the read length information. 
    The alignments are gapped. 
    Minimum length of 1 bp. 
    Maximum length of 101 bp. 
    Error in mk_singleBracketReplacementValue(x, value) : 
      'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(organism = "Hsapiens", annotationMethod = "biomaRt",  :
      There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    2: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
      You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Has anyone experienced a similar error message?
    My bamfiles list consists of 4 samples aligned with GSNAP, and here is how I run the function itself:

    Code:
    count.genes <- easyRNASeq(organism="Hsapiens",
                         annotationMethod="biomaRt",
                         gapped=TRUE, count="genes",
                         summarization="geneModels",
                         filesDirectory=getwd(),
                         filenames=bamfiles,
                         outputFormat="RNAseq")
    I use the devel version of easyRNASeq since it supports varying read lengths.


    Any help is greatly appreciated.


    Code:
    > sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=C                 LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods  
    [8] base     
    
    other attached packages:
     [1] BiocInstaller_1.5.12               BSgenome.Hsapiens.UCSC.hg19_1.3.19
     [3] easyRNASeq_1.3.14                  ShortRead_1.15.11                 
     [5] latticeExtra_0.6-24                RColorBrewer_1.0-5                
     [7] Rsamtools_1.9.30                   DESeq_1.9.14                      
     [9] lattice_0.20-10                    locfit_1.5-8                      
    [11] BSgenome_1.25.8                    GenomicRanges_1.9.65              
    [13] Biostrings_2.25.12                 IRanges_1.15.44                   
    [15] edgeR_2.99.8                       limma_3.13.20                     
    [17] biomaRt_2.13.2                     Biobase_2.17.7                    
    [19] genomeIntervals_1.13.3             BiocGenerics_0.3.1                
    [21] intervals_0.13.3                  
    
    loaded via a namespace (and not attached):
     [1] annotate_1.35.3       AnnotationDbi_1.19.37 bitops_1.0-4.1       
     [4] DBI_0.2-5             genefilter_1.39.0     geneplotter_1.35.1   
     [7] grid_2.15.1           hwriter_1.3           RCurl_1.91-1         
    [10] RSQLite_0.11.2        splines_2.15.1        stats4_2.15.1        
    [13] survival_2.36-14      tools_2.15.1          XML_3.9-4            
    [16] xtable_1.7-0          zlibbioc_1.3.0
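
The second warning in the output above says the alignments do not follow UCSC chromosome conventions and are being corrected. A minimal base-R sketch of what such a correction typically involves, assuming (this is not confirmed in the thread) that the BAM files use Ensembl-style names such as "1" and "MT". The helper `to_ucsc()` is hypothetical and is not part of easyRNASeq:

```r
# Illustrative only: convert Ensembl-style sequence names to UCSC style.
# to_ucsc() is a hypothetical helper, not an easyRNASeq function.
to_ucsc <- function(seqnames) {
  seqnames <- as.character(seqnames)
  # Leave names that already carry the "chr" prefix untouched
  needs_prefix <- !grepl("^chr", seqnames)
  seqnames[needs_prefix] <- paste0("chr", seqnames[needs_prefix])
  # Ensembl names the mitochondrial genome "MT"; UCSC hg19 names it "chrM"
  seqnames[seqnames == "chrMT"] <- "chrM"
  seqnames
}

# Made-up example names, mixing both conventions
ensembl_names <- c("1", "2", "X", "MT", "chr3")
ucsc_names <- to_ucsc(ensembl_names)
print(ucsc_names)
```

Comparing the converted names against the names in the annotation is a quick way to see whether a naming mismatch remains. The warning itself is not necessarily related to the error.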

  • #2
    I get the same error:

    Code:
    # get annotation
    RNASeq <- easyRNASeq(filesDirectory=getwd(),
                         organism="Hsapiens",
                         #chr.sizes=chr.sizes,
                         #readLength=80L,
                         annotationMethod="biomaRt",
                         format="bam",
                         count="genes",
                         summarization="geneModels",
                         filenames=bamfiles[1],
                         outputFormat="RNAseq")
    # use the object assigned above (R is case-sensitive)
    gAnnot <- genomicAnnotation(RNASeq)




    Checking arguments...
    Fetching annotations...
    Computing gene models...
    Summarizing counts...
    Processing RU_009_final.sorted.bam
    Updating the read length information.
    The reads have been trimmed.
    Minimum length of 50 bp.
    Maximum length of 80 bp.
    Error in mk_singleBracketReplacementValue(x, value) :
    'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
    You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
    2: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
    There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    3: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter, :
    You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Last edited by hollandorange; 09-20-2012, 01:28 AM.
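
Both posts show the warning that 16696 synthetic exons overlap, so some reads will be counted more than once. A small base-R sketch of why overlapping intervals lead to double counting. The exon coordinates are made up, and `count_hits()` is a hypothetical helper, not how easyRNASeq computes its counts:

```r
# Toy exon table on a single chromosome (coordinates invented for illustration)
exons <- data.frame(
  gene  = c("geneA", "geneA", "geneB"),
  start = c(100, 300, 350),
  end   = c(200, 400, 450),
  stringsAsFactors = FALSE
)

# Sort by start, then flag every exon that begins before an earlier one ends
exons <- exons[order(exons$start), ]
overlaps_previous <- c(FALSE,
                       exons$start[-1] <= cummax(exons$end)[-nrow(exons)])

# Hypothetical helper: number of exons that contain a given position
count_hits <- function(pos, exons) {
  sum(pos >= exons$start & pos <= exons$end)
}

print(overlaps_previous)
print(count_hits(150, exons))  # position inside a single exon
print(count_hits(375, exons))  # position inside two overlapping exons
```

A read at position 375 falls into exons of both geneA and geneB, so a per-exon tally would count it twice. This is the situation the warning describes.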



    • #3
      Hi hollandorange,

      Could you tell us which aligner (+ version) you used? I forgot to mention that I aligned to hg19 with GSNAP (version 2012-07-12).

      Cheers



      • #4
        Hi rboettcher,

        Thanks for your email pointing me to that thread.

        There indeed seems to be a bug in a sub-setting step when getting the reads' information.

        As I don't usually scan the SEQanswers forum for posts related to easyRNASeq, a better place to post about it is the Bioconductor mailing list (I've forwarded your post there). Let's continue this discussion over there.

        Cheers,

        Nico



        • #5
          Hi Rboettcher,

          The BAM files that I used for easyRNASeq were generated with TopHat. I also wanted to use GSNAP, since I heard it is more accurate.

          Could you also point me to the Bioconductor email thread for this issue? Thanks!

          Hollandorange



          • #6
            Hi hollandorange,

            You can register for that mailing list here: http://www.bioconductor.org/help/mailing-list/ (the best option IMO) or follow it on GMANE there:



            Cheers,

            Nico
