  • easyRNASeq errors / experiences

    Hi all,

    At the moment I'm trying to set up easyRNASeq + edgeR to analyse my paired-end data using R. I'm following the easyRNASeq manual to acquire a table of read counts, which can then be used as a DGEList object for edgeR.

    Unfortunately, I do not even get close to the point of obtaining the DGEList object, as the easyRNASeq function crashes with the following error:

    Code:
    Checking arguments... 
    Fetching annotations... 
    Computing gene models... 
    Summarizing counts... 
    Processing EMC_18_alignment.bam 
    Updating the read length information. 
    The alignments are gapped. 
    Minimum length of 1 bp. 
    Maximum length of 101 bp. 
    Error in mk_singleBracketReplacementValue(x, value) : 
      'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(organism = "Hsapiens", annotationMethod = "biomaRt",  :
      There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    2: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter,  :
      You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Did anyone experience a similar error message yet?
    My bamfiles list consists of 4 samples aligned via GSNAP and here's how I run the function itself:

    Code:
    count.genes <- easyRNASeq(organism="Hsapiens",
                         annotationMethod="biomaRt",
                         gapped=TRUE, count="genes",
                         summarization="geneModels",
                         filesDirectory=getwd(),
                         filenames=bamfiles,
                         outputFormat="RNAseq")
    I use the devel version of easyRNASeq since it supports varying read lengths.


    Any help is greatly appreciated.


    Code:
    > sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)
    
    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
     [7] LC_PAPER=C                 LC_NAME=C                 
     [9] LC_ADDRESS=C               LC_TELEPHONE=C            
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
    
    attached base packages:
    [1] parallel  stats     graphics  grDevices utils     datasets  methods  
    [8] base     
    
    other attached packages:
     [1] BiocInstaller_1.5.12               BSgenome.Hsapiens.UCSC.hg19_1.3.19
     [3] easyRNASeq_1.3.14                  ShortRead_1.15.11                 
     [5] latticeExtra_0.6-24                RColorBrewer_1.0-5                
     [7] Rsamtools_1.9.30                   DESeq_1.9.14                      
     [9] lattice_0.20-10                    locfit_1.5-8                      
    [11] BSgenome_1.25.8                    GenomicRanges_1.9.65              
    [13] Biostrings_2.25.12                 IRanges_1.15.44                   
    [15] edgeR_2.99.8                       limma_3.13.20                     
    [17] biomaRt_2.13.2                     Biobase_2.17.7                    
    [19] genomeIntervals_1.13.3             BiocGenerics_0.3.1                
    [21] intervals_0.13.3                  
    
    loaded via a namespace (and not attached):
     [1] annotate_1.35.3       AnnotationDbi_1.19.37 bitops_1.0-4.1       
     [4] DBI_0.2-5             genefilter_1.39.0     geneplotter_1.35.1   
     [7] grid_2.15.1           hwriter_1.3           RCurl_1.91-1         
    [10] RSQLite_0.11.2        splines_2.15.1        stats4_2.15.1        
    [13] survival_2.36-14      tools_2.15.1          XML_3.9-4            
    [16] xtable_1.7-0          zlibbioc_1.3.0
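
The first warning in the output above says that overlapping synthetic exons cause some reads to be counted more than once. A minimal base R sketch of that effect follows. The exon and read coordinates are invented for illustration, and the counting here is a plain interval overlap, not a reproduction of how easyRNASeq computes its counts.

```r
# Toy annotation: two exons on the same chromosome that overlap at 150-200.
# All coordinates are made up for illustration.
exons <- data.frame(
  id    = c("exonA", "exonB"),
  start = c(100L, 150L),
  end   = c(200L, 300L)
)

# Toy reads, each reduced to a single start/end interval.
reads <- data.frame(
  start = c(110L, 160L, 250L),
  end   = c(140L, 190L, 280L)
)

# A read hits an exon when the two closed intervals share at least one base.
hits_exon <- function(exon_start, exon_end) {
  reads$start <= exon_end & reads$end >= exon_start
}

# Count the reads overlapping each exon independently of the other exons.
counts <- mapply(function(s, e) sum(hits_exon(s, e)), exons$start, exons$end)
names(counts) <- exons$id

print(counts)
# Total assigned counts versus the number of reads actually present.
cat("summed counts:", sum(counts), " reads:", nrow(reads), "\n")
```

The read at 160-190 lies in the region shared by both exons, so it contributes to both counts and the summed counts exceed the number of reads.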

  • #2
    I have got the same error:

    Code:
    # get annotation
    rnaSeq <- easyRNASeq(filesDirectory=getwd(),
                         organism="Hsapiens",
                         #chr.sizes=chr.sizes,
                         #readLength=80L,
                         annotationMethod="biomaRt",
                         format="bam",
                         count="genes",
                         summarization="geneModels",
                         filenames=bamfiles[1],
                         outputFormat="RNAseq")
    gAnnot <- genomicAnnotation(rnaSeq)

    Code:
    Checking arguments...
    Fetching annotations...
    Computing gene models...
    Summarizing counts...
    Processing RU_009_final.sorted.bam
    Updating the read length information.
    The reads have been trimmed.
    Minimum length of 50 bp.
    Maximum length of 80 bp.
    Error in mk_singleBracketReplacementValue(x, value) :
      'value' must be a CompressedIntegerList object
    In addition: Warning messages:
    1: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
      You enforce UCSC chromosome conventions, however the provided chromosome size list is not compliant. Correcting it.
    2: In easyRNASeq(filesDirectory = getwd(), organism = "Hsapiens", annotationMethod = "biomaRt", :
      There are 16696 synthetic exons as determined from your annotation that overlap! This implies that some reads will be counted more than once! Is that really what you want?
    3: In fetchCoverage(rnaSeq, format = format, filename = filename, filter = filter, :
      You enforce UCSC chromosome conventions, however the provided alignments are not compliant. Correcting it.
    Last edited by hollandorange; 09-20-2012, 01:28 AM.
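
The UCSC warnings above report that the sequence names in the alignments do not follow UCSC conventions and are being corrected. As a rough illustration of what such a correction involves, here is a base R sketch that maps Ensembl-style names to UCSC-style names. The helper `to_ucsc_names` and the example names are hypothetical, and this is not the code easyRNASeq runs internally.

```r
# Hypothetical helper: convert Ensembl-style sequence names ("1", "X", "MT")
# to UCSC-style names ("chr1", "chrX", "chrM").
to_ucsc_names <- function(seqnames) {
  seqnames <- as.character(seqnames)
  # The mitochondrial genome is "MT" in Ensembl but "chrM" in UCSC.
  seqnames[seqnames == "MT"] <- "M"
  # Add the prefix only where it is missing, so compliant names pass through.
  needs_prefix <- !grepl("^chr", seqnames)
  seqnames[needs_prefix] <- paste0("chr", seqnames[needs_prefix])
  seqnames
}

# Made-up mix of non-compliant and already compliant names.
bam_names <- c("1", "2", "X", "MT", "chr3")
print(to_ucsc_names(bam_names))
```

Comparing the converted names against the names used by the annotation is a quick way to check that both sides agree before counting.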


    • #3
      Hi hollandorange,

      could you tell us which aligner (and version) you used? I forgot to mention that I aligned to hg19 with GSNAP (version 2012-07-12).

      Cheers


      • #4
        Hi rboettcher,

        Thanks for your email pointing me to that thread.

        There indeed seems to be a bug in a sub-setting step when getting the reads' information.

        As I'm usually not scanning the SEQanswers forum for posts related to easyRNASeq, a better place to post about it is the Bioconductor mailing list (I've forwarded your post there). Let's continue this discussion over there.

        Cheers,

        Nico


        • #5
          Hi Rboettcher,

          The BAM files that I used for easyRNASeq were generated with TopHat. I also wanted to use GSNAP, since I heard it is more accurate.

          Could you also point me to the Bioconductor mailing list thread for this issue? Thanks!

          Hollandorange


          • #6
            Hi hollandorange,

            You can register for that mailing list there: http://www.bioconductor.org/help/mailing-list/ (the best option IMO) or follow it on GMANE there:



            Cheers,

            Nico
