  • #46
    Dear all,

    Just passing by and saw this thread. It seems many people would like BWA to run on multiple cores. Given that INDEX is a one-off task for a given batch of files and ALN already supports -t, only SAMPE and SAMSE remain single-threaded.

    I have a version of SAMPE that can run multithreaded. Although it was done on Windows under the CRT (C Runtime), the basic idea can easily be transferred back to the Linux code base. It only requires knowledge of memory-mapped files and some threading concepts; it's just some plumbing wrapped around a few core function calls. SAMSE can be done the same way (I didn't do it because I only have PE data on hand).

    My project site is http://bow.codeplex.com/, where you can find some performance data on the release page. The source code is on GitHub: https://github.com/xied75 (more precisely, this branch: https://github.com/xied75/bwa/tree/mt-sampe).

    I recently tested this MT-SAMPE on a Windows Azure (cloud) large instance with 4 cores and 8 GB of memory; it runs without any problem with -t 4.

    Best,

    dong

    Comment


    • #47
      Hello,

      I'm trying the following job script to run pBWA over MPI on my cluster, which has 2 nodes with 8 CPUs and 32 GB of RAM each.

      It spawns 16 pBWA processes across the 2 compute nodes, which seems OK, but judging by the execution log, pBWA seems to be running the same alignment 16 times.

      Is my job script wrong?
      I would appreciate some support.

      I'm running this on SGE with Open MPI.

      Code:
      #!/bin/bash
      ### shell
      #$ -S /bin/bash
      ### env path
      #$ -V
      ### name
      #$ -N aln_left
      ### current work directory
      #$ -cwd
      ### merge outputs
      #$ -j y
      ### PE
      #$ -pe mpi 16
      ### select all.q
      #$ -q all.q
      
      
      mpirun pBWA aln -f aln_left /data_in/references/genomes/human/hg19/bwa_ref/hg19.fa /data_in/rawdata/HapMap_1.fastq > /data_out_2/tmp/mpi/HapMap_1.cloud.left.sai
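
One quick way to check whether the 16 processes really received distinct MPI ranks is to count the distinct `Proc <rank>:` prefixes in the job log. This is only a minimal sketch: it assumes every rank tags its log lines as `Proc <rank>: ...`, as in the pBWA output quoted further down this thread, and the function name `count_ranks` and the log file name in the usage comment are hypothetical.

```shell
#!/bin/bash
# count_ranks LOGFILE
# Prints the number of distinct "Proc <rank>:" prefixes found in LOGFILE.
# Assumption: each MPI rank tags its log lines as "Proc <rank>: ...".
count_ranks() {
    grep -oE '^Proc [0-9]+:' "$1" | sort -u | wc -l | tr -d ' '
}

# Hypothetical usage: with "-N aln_left" and "-j y", SGE writes the merged
# job output to aln_left.o<job id>.
# count_ranks aln_left.o12345
```

If this prints 1 for a 16-slot job, every process probably believes it is rank 0. One possible cause (an assumption, not confirmed in this thread) is that the `mpirun` found in the PATH belongs to a different MPI installation than the one pBWA was built against.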

      Comment


      • #48
        I'm stuck at the sampe step. pBWA's built-in help shows the following usage for pBWA sampe:

        Code:
        Usage:   pBWA sampe -f <output.sam> [options] <prefix> <SAI_FILE_PREFIX1> <SAI_FILE_PREFIX2> <in1.fq> <in2.fq>
        
        Options: -a INT   maximum insert size [500]
                 -o INT   maximum occurrences for one end [100000]
                 -n INT   maximum hits to output for paired reads [3]
                 -N INT   maximum hits to output for discordant pairs [10]
                 -c FLOAT prior of chimeric rate (lower bound) [1.0e-05]
                 -f FILE  sam file name/prefix to output results to
                 -M       merge all sam file prefixes into one file
                 -r STR   read group header line such as `@RG\tID:foo\tSM:bar' [null]
                 -P       preload index into memory (for base-space reads only)
                 -s       disable Smith-Waterman for the unmapped mate
                 -A       disable insert size estimate (force -s)
        
        Notes: 1. For SOLiD reads, <in1.fq> corresponds R3 reads and <in2.fq> to F3.
               2. For reads shorter than 30bp, applying a smaller -o is recommended to
                  to get a sensible speed at the cost of pairing accuracy.
               3. For the SAI prefixes, do NOT include the _1 and _2 generated by aln
                  as sampe will auto-detect these.
        The pBWA website, however, shows the following command as an example:

        Code:
        ./pBWA sampe -f SamPrefix /path/to/Index.fa SaiPrefix SaiPrefix[2] /path/to/Read_1.fq /path/to/Read_2.fq
        I don't understand why the reference .fa file is an argument in the website example but not in the built-in usage message. Also, what does the first <prefix> argument mean?

        I'm trying the following command and I'm getting an error. I've already tried rebuilding the index with bwa index, and the error persists:

        Code:
        pBWA sampe -f output.sam /share/references/genomes/human/hg19/bwa_ref/hg19.fa aln_left aln_right /home/gmarco/input/data/rawdata/HapMap_1.fastq /home/gmarco/input/data/rawdata/HapMap_2.fastq
        
        Proc 0: Found second SAI file - aln_right-1-00000.sai
        Proc 0: [bwa_seq_open] seeked to 0 in /home/gmarco/input/data/rawdata/HapMap_1.fastq
        Proc 0: [bwa_seq_open] seeked to 0 in /home/gmarco/input/data/rawdata/HapMap_2.fastq
        Proc 0: [bwa_sai2sam_pe_core] 262144 reads
        Proc 0: [bwa_sai2sam_pe_core] convert to sequence coordinate...
        Broadcasting BWT (this may take a while)... done!
        [bwt_restore_sa] SA-BWT inconsistency: seq_len is not the same. Abort!
        [sg13:30874] *** Process received signal ***
        [sg13:30874] Signal: Aborted (6)
        [sg13:30874] Signal code:  (-6)
        [sg13:30874] [ 0] /lib64/libpthread.so.0 [0x347b60eb10]
        [sg13:30874] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x3cf8c30265]
        [sg13:30874] [ 2] /lib64/libc.so.6(abort+0x110) [0x3cf8c31d10]
        [sg13:30874] [ 3] pBWA [0x404c52]
        [sg13:30874] [ 4] pBWA(bwt_restore_sa+0xce) [0x4081de]
        [sg13:30874] [ 5] pBWA(bwa_cal_pac_pos_pe+0x1b35) [0x41ad05]
        [sg13:30874] [ 6] pBWA(bwa_sai2sam_pe_core+0x3e3) [0x41b203]
        [sg13:30874] [ 7] pBWA(bwa_sai2sam_pe+0x450) [0x41bef0]
        [sg13:30874] [ 8] pBWA(main+0x96) [0x428206]
        [sg13:30874] [ 9] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3cf8c1d994]
        [sg13:30874] [10] pBWA [0x404b79]
        [sg13:30874] *** End of error message ***
        Aborted
        Thanks.
        Last edited by gmarco; 11-22-2012, 01:06 AM.

        Comment


        • #49
          Dear all,
          We are currently trying to run pBWA on our cluster as well.

          Unfortunately, we have problems generating a proper index. I suppose the pBWA index differs from the bwa one: there is an additional file, test_ref.fa.rbwt, which we cannot produce using the regular bwa index.

          Proc 0: [bwa_seq_open] seeked to 0 in reads_1.fq
          Proc 0: [bwa_seq_open] seeked to 0 in reads_2.fq
          Broadcasting BWT (this may take a while)... done!
          Broadcasting BWT (this may take a while)... [bwt_restore_bwt] fail to open file 'test_ref.fa.rbwt'. Abort!

          How do we prepare a valid pBWA index from a FASTA file?
          Tomasz Stokowy
          www.sequencing.io.gliwice.pl

          Comment


          • #50
            Originally posted by stoker
            Dear all,
            We are currently trying to run pBWA on our cluster as well.

            Unfortunately, we have problems generating a proper index. I suppose the pBWA index differs from the bwa one: there is an additional file, test_ref.fa.rbwt, which we cannot produce using the regular bwa index.

            Proc 0: [bwa_seq_open] seeked to 0 in reads_1.fq
            Proc 0: [bwa_seq_open] seeked to 0 in reads_2.fq
            Broadcasting BWT (this may take a while)... done!
            Broadcasting BWT (this may take a while)... [bwt_restore_bwt] fail to open file 'test_ref.fa.rbwt'. Abort!

            How do we prepare a valid pBWA index from a FASTA file?
            Are you generating the index with the latest BWA version, 0.6.2?

            I think I had the same problem. If your pBWA is based on version 0.5.9, you should use an index generated by BWA 0.5.9 as well. It won't work otherwise.
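
A quick sanity check on the index files can help narrow this down. The sketch below assumes that a BWA 0.5.x-style index consists of eight files next to the reference prefix (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`, `.rbwt`, `.rpac`, `.rsa`) and that newer BWA releases no longer write the three reverse (`.r*`) files. The function name `missing_index_files` is hypothetical.

```shell
#!/bin/bash
# missing_index_files PREFIX
# Prints every expected index file that does not exist for PREFIX
# (e.g. PREFIX=/path/to/hg19.fa) and returns non-zero if any is missing.
# Assumption: a 0.5.x-style index has exactly these eight extensions.
missing_index_files() {
    local prefix=$1 ext missing=0
    for ext in amb ann bwt pac sa rbwt rpac rsa; do
        if [ ! -e "${prefix}.${ext}" ]; then
            echo "${prefix}.${ext}"
            missing=1
        fi
    done
    return "$missing"
}

# Hypothetical usage:
# missing_index_files test_ref.fa || echo "index incomplete for this pBWA build"
```

If only the `.rbwt`, `.rpac` and `.rsa` files are reported, the index was most likely built with a newer BWA than the one pBWA is based on.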

            Regards,
            G.

            Comment


            • #51
              Is there an updated version of pBWA?

              Comment


              • #52
                Does this work for BWA-MEM?

                Yes, an updated version of this software would be highly desirable, since BWA itself has changed to incorporate the MEM algorithm and a better indexing algorithm.

                Comment


                • #53
                  I guess the only one who knows if an updated version will be released is dp05yk.

                  Comment
