  • easyRNASeq errors / experiences

    Hi all,

    At the moment I'm trying to set up easyRNASeq + edgeR to analyse my paired-end data in R. I'm following the easyRNASeq manual to obtain a table of read counts, which can then be used to build a DGEList object for edgeR.

    Unfortunately, I do not even get close to the point of obtaining the DGElist object, as the easyRNASeq function crashes with the following error:

    Code:
    Checking arguments... 
    Fetching annotations... 
    Computing gene models... 
    Summarizing counts... 
    Processing EMC_18_alignment.bam 
    Updating the read length information. 
    The alignments are gapped. 
    Minimum length of 1 bp. 
    Maximum length of 101 bp. 
    Error in mk_singleBracketReplacementValue(x, value) : 
      'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(organism = "Hsapiens", annotationMethod = "biomaRt",  :
      There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    2: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
      You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.

    Has anyone seen a similar error message?
    My bamfiles list consists of 4 samples aligned with GSNAP, and here is how I call the function:

    Code:
    count.genes <- easyRNASeq(organism="Hsapiens",
                         annotationMethod="biomaRt",
                         gapped=TRUE, count="genes",
                         summarization="geneModels",
                         filesDirectory=getwd(),
                         filenames=bamfiles,
                         outputFormat="RNAseq")
    I use the devel version of easyRNASeq since it supports varying read lengths.


    Any help is greatly appreciated.


    Code:
    > sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=C                 LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods  
    [8] base     
    
    other attached packages:
     [1] BiocInstaller_1.5.12               BSgenome.Hsapiens.UCSC.hg19_1.3.19
     [3] easyRNASeq_1.3.14                  ShortRead_1.15.11                 
     [5] latticeExtra_0.6-24                RColorBrewer_1.0-5                
     [7] Rsamtools_1.9.30                   DESeq_1.9.14                      
     [9] lattice_0.20-10                    locfit_1.5-8                      
    [11] BSgenome_1.25.8                    GenomicRanges_1.9.65              
    [13] Biostrings_2.25.12                 IRanges_1.15.44                   
    [15] edgeR_2.99.8                       limma_3.13.20                     
    [17] biomaRt_2.13.2                     Biobase_2.17.7                    
    [19] genomeIntervals_1.13.3             BiocGenerics_0.3.1                
    [21] intervals_0.13.3                  
    
    loaded via a namespace (and not attached):
     [1] annotate_1.35.3       AnnotationDbi_1.19.37 bitops_1.0-4.1       
     [4] DBI_0.2-5             genefilter_1.39.0     geneplotter_1.35.1   
     [7] grid_2.15.1           hwriter_1.3           RCurl_1.91-1         
    [10] RSQLite_0.11.2        splines_2.15.1        stats4_2.15.1        
    [13] survival_2.36-14      tools_2.15.1          XML_3.9-4            
    [16] xtable_1.7-0          zlibbioc_1.3.0
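
For context on the end goal described in this post (a read-count table handed to edgeR), here is a minimal sketch of building a DGEList from a plain count matrix. The counts, gene names, sample names and group labels are all made up for illustration, and the sketch does not use easyRNASeq at all. It only assumes that `DGEList()` from edgeR accepts `counts` and `group`; the call is skipped if edgeR is not installed.

```r
# Hypothetical count table: rows are genes, columns are four samples.
# All values and names are invented for illustration.
counts <- matrix(
  c(10L, 0L, 250L,
    12L, 3L, 240L,
    30L, 1L, 100L,
    28L, 0L, 110L),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("s1", "s2", "s3", "s4"))
)

# Hypothetical experimental design: two samples per condition.
group <- factor(c("control", "control", "treated", "treated"))

# Build the DGEList only if edgeR is available.
if (requireNamespace("edgeR", quietly = TRUE)) {
  dge <- edgeR::DGEList(counts = counts, group = group)
  print(dge$samples)
}
```

This does not address the error above; it only shows the shape of the input edgeR expects once a count table is available.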

  • #2
    I have got the same error:

    Code:
    # get annotation
    RNASeq <- easyRNASeq(filesDirectory=getwd(),
                         organism="Hsapiens",
                         #chr.sizes=chr.sizes,
                         #readLength=80L,
                         annotationMethod="biomaRt",
                         format="bam",
                         count="genes",
                         summarization="geneModels",
                         filenames=bamfiles[1],
                         outputFormat="RNAseq")
    gAnnot <- genomicAnnotation(RNASeq)

    Code:
    Checking arguments...
    Fetching annotations...
    Computing gene models...
    Summarizing counts...
    Processing RU_009_final.sorted.bam
    Updating the read length information.
    The reads have been trimmed.
    Minimum length of 50 bp.
    Maximum length of 80 bp.
    Error in mk_singleBracketReplacementValue(x, value) :
      'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
      You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
    2: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
      There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    3: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter, :
      You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Last edited by hollandorange; 09-20-2012, 01:28 AM.
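
Both logs above warn that the alignments do not follow UCSC chromosome conventions and that easyRNASeq corrects this itself. As a rough illustration of what such a renaming amounts to, here is a base R sketch. `to_ucsc` is a hypothetical helper, not part of easyRNASeq, and it assumes Ensembl-style names such as "1", "X" and "MT"; it is not a description of how the package does the conversion internally.

```r
# Hypothetical helper (not an easyRNASeq function): convert Ensembl-style
# sequence names to UCSC-style names.
to_ucsc <- function(seqnames) {
  # Add the "chr" prefix only where it is missing.
  out <- ifelse(grepl("^chr", seqnames), seqnames, paste0("chr", seqnames))
  # The mitochondrial genome is "MT" in Ensembl but "chrM" in UCSC.
  out[out == "chrMT"] <- "chrM"
  out
}

ensembl_names <- c("1", "2", "X", "MT")
ucsc_names <- to_ucsc(ensembl_names)
print(ucsc_names)
```

Names that already carry the "chr" prefix pass through unchanged, so the helper can be applied safely to a mix of both styles.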


    • #3
      Hi hollandorange,

      Could you tell us which aligner (and version) you used? I forgot to mention that I aligned to hg19 with GSNAP (version 2012-07-12).

      Cheers


      • #4
        Hi rboettcher,

        Thanks for your email pointing me to that thread.

        There indeed seems to be a bug in a sub-setting step when getting the reads' information.

        As I don't usually scan the SEQanswers forum for posts related to easyRNASeq, a better place to post about it is the Bioconductor mailing list (I've forwarded your post there). Let's continue this discussion over there.

        Cheers,

        Nico


        • #5
          Hi Rboettcher,

          The BAM files that I used for easyRNASeq were generated by TopHat. I also wanted to try GSNAP, since I heard it is more accurate.

          Could you also point me to the Bioconductor mailing list thread for this issue? Thanks!

          Hollandorange


          • #6
            Hi hollandorange,

            You can register for that mailing list at http://www.bioconductor.org/help/mailing-list/ (the best option IMO) or follow it on GMANE.

            Cheers,

            Nico
