  • Approximate bowtie runtime

    This may be a general question. From reading the article at http://genomebiology.com/2009/10/3/R25, I get the feeling that the instance of bowtie I am running right now is behaving differently, in terms of run time, from what the article describes. I may be doing something wrong.

    I have SRA paired-end data downloaded from http://www.ncbi.nlm.nih.gov/sra/SRX026384?report=full that I am mapping against the human reference genome index provided by bowtie. The file contains roughly 19 million reads and is about 6 GB in size. The article says a normal desktop computer should be able to carry out the task in a very short time; they mention a matter of minutes, not counting the indexing step (building the BWT). But later on they mention that a server computer can take up to 21 hours to build the index. My laptop runs 32-bit Ubuntu with 2 GB of RAM, 4 GB of swap, and a dual-core CPU, and I have been running a multi-threaded bowtie instance for the past 3 days. Does this sound normal? Roughly how long did it take you, colleagues, when you ran bowtie?

    Here is what my command looks like:

    $ bowtie hg19 -q /PATH/SRR065070.fastq -S align.map --offrate 20 -p 2



    The 'hg19' argument passed to bowtie is just a placeholder for the reference, since I am invoking bowtie from within the directory where hg19.ebwt.zip was extracted. It generates an alignment file, align.map, but the file is being populated very slowly: over the past 3 days and 10 hours only 205 MB have been written to it.
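One way to turn "205 MB after 3 days and 10 hours" into a projected finish time is to count the records already written and extrapolate linearly. The following is only a rough sketch: the helper names are made up for illustration, and it assumes the SAM output contains one non-header line per processed read and that throughput stays constant, neither of which is guaranteed.

```python
def count_sam_records(path):
    """Count alignment records (non-header lines) in a SAM file.

    SAM header lines start with '@'; every other non-empty line is
    treated as one record.
    """
    records = 0
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.strip() and not line.startswith("@"):
                records += 1
    return records


def estimate_total_hours(records_done, total_reads, elapsed_hours):
    """Linearly extrapolate the total run time from progress so far.

    Assumption: one output record per input read and a constant rate.
    """
    if records_done <= 0:
        raise ValueError("no records written yet; cannot extrapolate")
    return elapsed_hours * total_reads / records_done
```

For example, calling `estimate_total_hours(count_sam_records("align.map"), 19_000_000, 82)` would project the total number of hours for the run described above (3 days and 10 hours is 82 hours).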

  • #2
    You say you only have 2GB of RAM. How much is specified as a minimum requirement in the manual and/or paper? Consider that hg19 is 3.2 billion bases.



    • #3
      According to the paper (http://genomebiology.com/2009/10/3/R25):

      'A Bowtie index for the human genome fits in 2.2 GB on disk and has a memory footprint of as little as 1.3 GB at alignment time, allowing it to be queried on a workstation with under 2 GB of RAM.'

      However, the current pre-built hg indices on their site are larger than 2.2 GB. Also, the memory footprint might be bigger when -p is greater than 1; have you tried running this single-threaded? Use something like the System Monitor or 'top' to figure out whether your job fits in the machine's RAM. If your system is forced to use swap just to hold the index, I expect the run will be desperately slow.
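As an alternative to watching 'top', the system-wide numbers can be read from /proc/meminfo on Linux. This is a minimal sketch with illustrative function names; it only reports how much swap is in use overall and does not say which process is responsible.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; the unit suffix is dropped.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kb(meminfo):
    """Swap currently in use, in kB, from a parsed meminfo dict."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

On a Linux machine, `swap_used_kb(read_meminfo())` gives the swap in use at that moment; a value that keeps growing while the aligner runs suggests the job does not fit in RAM.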



      • #4
        Speaking of parallel performance: the paper says that the memory image of the index is shared by the threads, which could increase performance on multiple cores, and that there will not be a 'substantial' increase in memory consumption when using multiple threads. So the threads synchronize their activities (fetching reads, outputting results, switching between indices, and marking jobs).

        On your cue, RDW, I checked whether swap was involved. Both processors are running at full blast and the bowtie job occupies 1.5 GB of RAM. I also see 1.3 GB of swap in use, but it is not clear to me whether this is coming from bowtie. I haven't tried running a single-threaded job; I chose two threads on the assumption that parallelism would cut the run time.

        Nilshomer, in the paper they ran bowtie on both a server and a PC and benchmarked the performance. The PC had 2 GB of RAM, which is why I was optimistic.
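To check whether the swap usage belongs to a particular process, Linux exposes per-process figures in /proc/<pid>/status. The sketch below uses hypothetical helper names and assumes a kernel recent enough to report the VmSwap field; on older kernels that field is simply absent from the result.

```python
def parse_proc_status(text):
    """Extract VmRSS and VmSwap (in kB) from /proc/<pid>/status text.

    VmRSS is the resident memory of the process; VmSwap is the part of
    the process that has been swapped out.
    """
    wanted = {"VmRSS", "VmSwap"}
    result = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            result[key] = int(rest.split()[0])
    return result


def read_proc_status(pid):
    """Read the memory figures for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_proc_status(handle.read())
```

The process ID can be looked up with a tool such as `pgrep bowtie`, then passed to `read_proc_status`. A large VmSwap value for that process would indicate that the swap usage does come from the aligner.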



        • #5
          The human genome takes about 3.3 GB of memory, so the swapping is caused by bowtie. This is your major bottleneck.
          Multithreading does not increase the memory requirement.



          • #6
            When I run Bowtie on a 2.0 GHz Core 2 Duo using both cores (-p 2) with 3.3 GB of available RAM under 32-bit Ubuntu, it takes a few hours to align 19 million reads to hg19, depending on the options I'm using. I often run it overnight, so I don't know exactly how long, but definitely less than 8 hours.

            You should probably add more RAM (install 4 GB to get the maximum of about 3.3 GB available on a 32-bit system).



            • #7
              Use -t to have bowtie report the run time.
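For scripted runs, the same invocation as in the first post can be assembled with -t added. This is only a sketch: `build_bowtie_command` is a hypothetical helper, and the argument order (options, index name, reads file, output file) is an assumption based on the command shown at the top of the thread.

```python
def build_bowtie_command(index, reads, output, threads=1, timing=True):
    """Build a bowtie argument list for FASTQ input and SAM output.

    -t asks bowtie to print timing information, -q marks the input as
    FASTQ, -S selects SAM output, and -p sets the number of threads.
    """
    cmd = ["bowtie"]
    if timing:
        cmd.append("-t")
    cmd += ["-q", "-S", "-p", str(threads), index, reads, output]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Starting with `threads=1` on a small subset of the reads gives a baseline before committing to the full run.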
