Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • rareaquaticbadger
    Junior Member
    • Jun 2011
    • 6

    Rsamtools memory leak

    Hi there,

    I've just updated Rsamtools to the latest version ("1.22.0") and I've noticed a weird memory leak that is crashing my system.

    I have a folder of ~150 bam files that I'm trying to process, feeding in a data.frame called filter, that splits the genome into 7744 genomic coordinates

    Code:
    > head(filter)
            V1       V2       V3                         V4
    1 GL896898        0 30180132        GL896898:0-30180132
    2 GL896898 30180133 52375790 GL896898:30180133-52375790
    3 GL896899        0 24984016        GL896899:0-24984016
    4 GL896899 24984017 40953715 GL896899:24984017-40953715
    5 GL896905        0 18438707        GL896905:0-18438707
    6 GL896905 18438708 27910907 GL896905:18438708-27910907
    For some reason, Rsamtools is not releasing RAM after each iteration. I assumed it was my code, but I just ran the following simple test, and unless I'm being completely stupid, Rsamtools is doing something funny. To test this I looped one bam file with the following script and monitored RAM usage:

    Code:
    #R implementation of top
    library(NCmisc)
    
    #R value of all objects
    library(pryr)
    
    library(Rsamtools)
    
    filter=read.table('ferret_merged_sce_events.bed')
    fileName='AAAGCA.merge.bam'
    objectRam=vector()
    totalRam=vector()
    
    for(seq in 1:10)
    {
         message(paste("iteration", seq))
         bam.dataframe <- scanBam(fileName, param=ScanBamParam(which=GRanges(seqnames = c(as.character(filter[,1])), ranges = IRanges(c(filter[,2]), c(filter[,3]) )), mapqFilter=10, what=c("rname","pos","strand")))
         #measure RAM across all R objects
         objectRam <- c(objectRam, mem_used()[1]/10^6)
         #measure RAM across system
         totalRam <- c(totalRam,  suppressWarnings(top(CPU=F)$RAM$used*10^6))
         #remove scanBam object to reset for next iteration
         rm(bam.dataframe)
         #empty garbage collection
         gc()
    
    }
    plot(objectRam, type='l', xlab='iteration', ylab='Ram usage (Mb)')
    plot(totalRam, type='l', xlab='iteration', ylab='Ram usage (Mb)')
    This generates the two attached graphs. You'll see that while the RAM used for the R object remains stable, the overall RAM usage of my computer increases, and only decreases when I end my R session. Is there any way of purging the RAM? Is this a bug in Rsamtools or have I done something wrong? The additive overall RAM usage also occurs when the scanBam(which()) is not used, but occurs at a much reduced rate.

    Any help would be greatly appreciated.
    Attached Files
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    What happens if you run gc() to have the garbage collector run?

    Comment

    • rareaquaticbadger
      Junior Member
      • Jun 2011
      • 6

      #3
      Hi dpryan,

      So after running the loop described, my system memory states this session is using 2.579Gb RAM.

      If I run gc() a few times, the system memory usage doesn't change and I get the following output:

      Code:
      > gc()
                used  (Mb) gc trigger  (Mb) max used  (Mb)
      Ncells 1912702 102.2    3205452 171.2  3205452 171.2
      Vcells 1579324  12.1    7866470  60.1 21256102 162.2
      > gc()
                used  (Mb) gc trigger  (Mb) max used  (Mb)
      Ncells 1912693 102.2    3205452 171.2  3205452 171.2
      Vcells 1579327  12.1    6293176  48.1 21256102 162.2
      > gc()
                used  (Mb) gc trigger  (Mb) max used  (Mb)
      Ncells 1912693 102.2    3205452 171.2  3205452 171.2
      Vcells 1579327  12.1    5034540  38.5 21256102 162.2
      > gc()
                used  (Mb) gc trigger  (Mb) max used  (Mb)
      Ncells 1912693 102.2    3205452 171.2  3205452 171.2
      Vcells 1579327  12.1    5034540  38.5 21256102 162.2
      > gc()
                used  (Mb) gc trigger  (Mb) max used  (Mb)
      Ncells 1912693 102.2    3205452 171.2  3205452 171.2
      Vcells 1579327  12.1    5034540  38.5 21256102 162.2

      Comment

      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        Apparently RSamtools isn't using the garbage collector. You might post this to the bioconductor support site and see if the maintainers reply.

        Comment

        • rareaquaticbadger
          Junior Member
          • Jun 2011
          • 6

          #5
          Great,

          Thanks dpryan. I'll cross post at bioconductor support and wait for a site maintainer to get back to me.

          Cheers!

          Comment

          Latest Articles

          Collapse

          • SEQadmin2
            From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
            by SEQadmin2


            Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


            The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
            ...
            Yesterday, 10:05 AM
          • SEQadmin2
            Single-Cell Sequencing at an Inflection Point: Early Impacts of New Platforms and Emerging Trends
            by SEQadmin2


            With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.


            Introduction

            Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...
            05-22-2026, 06:42 AM
          • SEQadmin2
            Environmental Genomics in the Age of NGS: From Microbes to Conservation Strategies
            by SEQadmin2

            Studying ecosystems means dealing with complex, multi-species communities that are hard to observe at scale. This complexity, however, hides many important questions to be answered, from how biogeochemical cycles work and how climate change can affect species distribution to how conservation strategies can work best.


            Genomics, particularly since the expansion of NGS, has transformed ecosystem ecology. By sequencing environmental DNA, we can now assess biodiversity without direct...
            05-06-2026, 09:04 AM

          ad_right_rmr

          Collapse

          News

          Collapse

          Topics Statistics Last Post
          Started by SEQadmin2, Yesterday, 12:03 PM
          0 responses
          17 views
          0 reactions
          Last Post SEQadmin2  
          Started by SEQadmin2, Yesterday, 11:40 AM
          0 responses
          13 views
          0 reactions
          Last Post SEQadmin2  
          Started by SEQadmin2, 05-28-2026, 11:40 AM
          0 responses
          29 views
          0 reactions
          Last Post SEQadmin2  
          Started by SEQadmin2, 05-26-2026, 10:12 AM
          0 responses
          31 views
          0 reactions
          Last Post SEQadmin2  
          Working...