
  • bowtie2 fun

    OK, folks

    Who here has used bowtie2-build on a database >4GB in size?

    When I try it with the latest Beta3 release using the 64-bit binary (which is supposed to allow databases >4GB), it spits and dies during bowtie2-build with:

    Error: Reference sequence has more than 2^32-1 characters! Please divide the
    reference into batches or chunks of about 3.6 billion characters or less each
    and index each independently.

    Just wondering if anyone has successfully built an index using any release of bowtie2 on large dbs.
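
The workaround the error message suggests (dividing the reference into batches of about 3.6 billion characters or less) could be sketched roughly as below. This is only an illustration, not anything bowtie2 ships: `read_fasta` and `batch_records` are hypothetical helper names, the sketch assumes whole FASTA records are kept together rather than cut mid-sequence, and it holds each sequence in memory, which may not be practical for very large references.

```python
from typing import Iterable, Iterator, List, Tuple

MAX_INDEXABLE = 2**32 - 1        # limit quoted in the error message (4,294,967,295)
BATCH_LIMIT = 3_600_000_000      # "about 3.6 billion characters" from the same message

Record = Tuple[str, str]         # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batch_records(records: Iterable[Record],
                  limit: int = BATCH_LIMIT) -> Iterator[List[Record]]:
    """Greedily group whole records so each batch holds at most `limit` characters."""
    batch, size = [], 0
    for header, seq in records:
        if len(seq) > limit:
            # Assumption: records are never split, so an oversized one is an error.
            raise ValueError(f"{header!r} alone exceeds the limit of {limit}")
        if batch and size + len(seq) > limit:
            yield batch
            batch, size = [], 0
        batch.append((header, seq))
        size += len(seq)
    if batch:
        yield batch


if __name__ == "__main__":
    # Toy input with a tiny limit, just to show the grouping.
    demo = [">a", "ACGT", "AC", ">b", "GGGG", ">c", "TT"]
    for i, group in enumerate(batch_records(read_fasta(demo), limit=7)):
        print(i, [name for name, _ in group])
```

Each batch would then be written to its own FASTA file and indexed with a separate bowtie2-build run, as the error message asks. Reads would have to be aligned against every index and the results reconciled afterwards, which this sketch does not attempt.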

  • #2
    For mappers that index the genomes, RTGinvestigator/bwa-0.6 are known to work with >4GB genomes. Agile is likely to support long genomes, too. Mappers that index the reads (e.g. maq) may work with large genomes if no single chromosome is longer than 2GB.

    Supporting >4GB genomes is non-trivial and usually comes at a cost.
    Last edited by lh3; 11-18-2011, 04:27 PM.



    • #3
      OK, this might be a dumb question, but where did you get the bowtie2 src file? The manual says to look for the filename that ends in "-src.zip". There is no bowtie....-src.zip file on the bowtie sourceforge site that I can find. Only the one for 0.12.7.



      • #4
        Never mind. I found it. It just didn't have the "-src" in the name. Grrr.
