  • easyRNASeq errors / experiences

    Hi all,

    At the moment I'm trying to set up easyRNASeq + edgeR to analyse my paired-end data in R. I'm following the easyRNASeq manual to obtain a table of read counts, which can then be used as a DGEList object for edgeR.

    Unfortunately, I do not even get close to obtaining the DGEList object, because the easyRNASeq function crashes with the following error:

    Code:
    Checking arguments... 
    Fetching annotations... 
    Computing gene models... 
    Summarizing counts... 
    Processing EMC_18_alignment.bam 
    Updating the read length information. 
    The alignments are gapped. 
    Minimum length of 1 bp. 
    Maximum length of 101 bp. 
    Error in mk_singleBracketReplacementValue(x, value) : 
      'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(organism = "Hsapiens", annotationMethod = "biomaRt",  :
      There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    2: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
      You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Has anyone seen a similar error message?
    My bamfiles list consists of 4 samples aligned with GSNAP, and here is how I run the function:

    Code:
    count.genes <- easyRNASeq(organism="Hsapiens",
                         annotationMethod="biomaRt",
                         gapped=TRUE, count="genes",
                         summarization="geneModels",
                         filesDirectory=getwd(),
                         filenames=bamfiles,
                         outputFormat="RNAseq")
    I use the devel version of easyRNASeq since it supports varying read lengths.


    Any help is greatly appreciated.


    Code:
    > sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=C                 LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods  
    [8] base     
    
    other attached packages:
     [1] BiocInstaller_1.5.12               BSgenome.Hsapiens.UCSC.hg19_1.3.19
     [3] easyRNASeq_1.3.14                  ShortRead_1.15.11                 
     [5] latticeExtra_0.6-24                RColorBrewer_1.0-5                
     [7] Rsamtools_1.9.30                   DESeq_1.9.14                      
     [9] lattice_0.20-10                    locfit_1.5-8                      
    [11] BSgenome_1.25.8                    GenomicRanges_1.9.65              
    [13] Biostrings_2.25.12                 IRanges_1.15.44                   
    [15] edgeR_2.99.8                       limma_3.13.20                     
    [17] biomaRt_2.13.2                     Biobase_2.17.7                    
    [19] genomeIntervals_1.13.3             BiocGenerics_0.3.1                
    [21] intervals_0.13.3                  
    
    loaded via a namespace (and not attached):
     [1] annotate_1.35.3       AnnotationDbi_1.19.37 bitops_1.0-4.1       
     [4] DBI_0.2-5             genefilter_1.39.0     geneplotter_1.35.1   
     [7] grid_2.15.1           hwriter_1.3           RCurl_1.91-1         
    [10] RSQLite_0.11.2        splines_2.15.1        stats4_2.15.1        
    [13] survival_2.36-14      tools_2.15.1          XML_3.9-4            
    [16] xtable_1.7-0          zlibbioc_1.3.0
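
The second warning in the output above concerns chromosome naming: Ensembl-style references use sequence names such as `1` or `MT`, while UCSC-style annotation uses `chr1` and `chrM`. The sketch below shows the general idea of such a correction in base R. The helper name `to_ucsc_names` is hypothetical, and this is not easyRNASeq's own implementation, only an illustration of the kind of renaming the warning refers to.

```r
# Hypothetical helper: map Ensembl-style sequence names to UCSC style.
to_ucsc_names <- function(seqnames) {
  seqnames <- as.character(seqnames)
  # Add the "chr" prefix only where it is missing.
  needs_prefix <- !grepl("^chr", seqnames)
  out <- ifelse(needs_prefix, paste0("chr", seqnames), seqnames)
  # The mitochondrial genome is "MT" in Ensembl but "chrM" in UCSC.
  out[out == "chrMT"] <- "chrM"
  out
}

# Made-up names as they might appear in a BAM header.
bam_names <- c("1", "2", "X", "MT", "chr3")
print(to_ucsc_names(bam_names))
```

Comparing the converted names against the names used by the annotation is a quick way to see whether any sequences would be silently dropped.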

  • #2
    I get the same error:

    Code:
    # Get the annotation
    rnaSeq <- easyRNASeq(filesDirectory=getwd(),
                         organism="Hsapiens",
                         # chr.sizes=chr.sizes,
                         # readLength=80L,
                         annotationMethod="biomaRt",
                         format="bam",
                         count="genes",
                         summarization="geneModels",
                         filenames=bamfiles[1],
                         outputFormat="RNAseq")
    # The result is stored as rnaSeq so that this call finds it (R is case-sensitive)
    gAnnot <- genomicAnnotation(rnaSeq)




    Checking arguments...
    Fetching annotations...
    Computing gene models...
    Summarizing counts...
    Processing RU_009_final.sorted.bam
    Updating the read length information.
    The reads have been trimmed.
    Minimum length of 50 bp.
    Maximum length of 80 bp.
    Error in mk_singleBracketReplacementValue(x, value) :
    'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
    You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
    2: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
    There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    3: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter, :
    You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Last edited by hollandorange; 09-20-2012, 01:28 AM.
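
The warning about overlapping synthetic exons (16696 in both reports) means that some annotated regions share bases, so a read falling in a shared part can be counted for more than one feature. A minimal base R sketch of how such overlaps can be detected on a single chromosome follows. The function `overlapping_exons` and the toy coordinates are assumptions for illustration; real pipelines would typically rely on IRanges/GenomicRanges objects rather than plain vectors.

```r
# Toy exon table for one chromosome (coordinates are made up).
exons <- data.frame(
  id    = c("e1", "e2", "e3", "e4"),
  start = c(100, 150, 400, 700),
  end   = c(200, 300, 500, 800),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the ids of exons that overlap another exon.
overlapping_exons <- function(df) {
  n <- nrow(df)
  hit <- logical(n)
  for (i in seq_len(n)) {
    # Two closed intervals overlap when each starts before the other ends.
    ov <- df$start <= df$end[i] & df$end >= df$start[i]
    ov[i] <- FALSE  # ignore the self-comparison
    hit[i] <- any(ov)
  }
  df$id[hit]
}

print(overlapping_exons(exons))
```

Here `e1` and `e2` share the bases 150 to 200, so a read aligned there would contribute to both counts.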



    • #3
      Hi hollandorange,

      Could you tell us which aligner (+ version) you used? I forgot to mention that I aligned to hg19 with GSNAP (version 2012-07-12).

      Cheers



      • #4
        Hi rboettcher,

        Thanks for your email pointing me to that thread.

        There indeed seems to be a bug in a sub-setting step when getting the reads' information.

        As I'm usually not scanning the seqanswers forum for posts related to easyRNASeq, a better place to post about it is the bioconductor mailing list (I've forwarded your post there). Let's go on with this discussion over there.

        Cheers,

        Nico
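
When moving a report like this one to the mailing list, it helps to include the exact error message, the failing call, and the `sessionInfo()` output, as the first post does. A minimal base R sketch for capturing these in one go is shown here; `capture_failure` is a hypothetical helper name, and `failing_step` is only a stand-in for the real `easyRNASeq()` call.

```r
# Hypothetical helper: run a function and keep the error details if it fails.
capture_failure <- function(f) {
  tryCatch(
    list(ok = TRUE, value = f()),
    error = function(e) {
      list(ok = FALSE,
           message = conditionMessage(e),
           call = paste(deparse(conditionCall(e)), collapse = " "))
    }
  )
}

# Stand-in for the real analysis call, failing with the message from this thread.
failing_step <- function() {
  stop("'value' must be a CompressedIntegerList object")
}

report <- capture_failure(failing_step)
cat("Error:", report$message, "\n")
cat("In call:", report$call, "\n")

# Session details to paste into the report.
print(sessionInfo())
```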



        • #5
          Hi Rboettcher,

          The BAM files that I used for easyRNASeq were generated with TopHat. I also wanted to try GSNAP, since I heard it is more accurate.

          Could you also point me to the Bioconductor mailing list thread for this issue? Thanks!

          Hollandorange



          • #6
            Hi hollandorange,

            You can register for that mailing list there: http://www.bioconductor.org/help/mailing-list/ (the best option IMO) or follow it on GMANE there:



            Cheers,

            Nico
