  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #31
    Originally posted by StaciaWyman
    Good morning--I get page not found error when I go to the above link--is there an updated one? Thanks!
    Stacia
    Check out the different branches available, as well as the commits:
    Contribute to nh13/samtools development by creating an account on GitHub.


    If you are brave, I have also been working on getting this into Picard:
    A set of tools (in Java) for working with next generation sequencing data in the BAM (http://samtools.sourceforge.net) format. - nh13/picard


    The Picard developers are more receptive to patches than the samtools developers.

    Comment

    • kenietz
      Member
      • Nov 2011
      • 86

      #32
      Hi Nils, why don't you use pigz/unpigz, which are parallelized versions of gzip/gunzip? pigz takes the same arguments as normal gzip, and with -p one can specify the number of threads.

      Comment

      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #33
        Do you mean I should use the PIGZ API? Of course I could compress a SAM file with pigz, but the advantage of the BAM file (which is block gzip compressed) is the ability to index the file and then do random retrieval based on genomic coordinates.

        Can you give an example of what you mean?
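To illustrate why block compression allows random retrieval, here is a minimal Python sketch. It is not the real BGZF format (BGZF also records each block's size in a gzip extra field, and BAM indexes map genomic coordinates to block offsets); it only shows the idea that independently compressed gzip blocks plus an offset table let a reader decompress just the blocks it needs. The names `compress_blocks` and `read_range` and the 64 KiB block size are assumptions made for this example.

```python
import bisect
import gzip


def compress_blocks(data: bytes, block_size: int = 65536):
    """Compress `data` as a series of independent gzip members.

    Returns (blob, index), where index[i] is the pair
    (compressed_offset, uncompressed_offset) of block i.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(data), block_size):
        index.append((len(blob), start))
        blob += gzip.compress(data[start:start + block_size])
    return bytes(blob), index


def read_range(blob: bytes, index, pos: int, length: int) -> bytes:
    """Return `length` uncompressed bytes starting at offset `pos`.

    Assumes 0 <= pos < total uncompressed size. Only the blocks that
    overlap the requested range are decompressed.
    """
    starts = [u for _, u in index]
    first = bisect.bisect_right(starts, pos) - 1
    skip = pos - starts[first]
    need = skip + length
    out = bytearray()
    i = first
    while i < len(index) and len(out) < need:
        c_start = index[i][0]
        c_end = index[i + 1][0] if i + 1 < len(index) else len(blob)
        out += gzip.decompress(blob[c_start:c_end])
        i += 1
    return bytes(out[skip:skip + length])
```

Because the blocks are ordinary gzip members written back to back, the whole blob should still decompress with a plain gzip reader, while `read_range` touches only a block or two.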

        Comment

        • ersgupta
          Member
          • Jun 2011
          • 26

          #34
          any update on the mpileup ??

          Comment

          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #35
            Originally posted by ersgupta
            any update on the mpileup ??
            No, the individual tools were not multi-threaded, just the reading/writing of the SAM/BAM files, which can be a bottleneck.

            Comment

            • adaptivegenome
              Super Moderator
              • Nov 2009
              • 436

              #36
              Guys, we have a working version of a faster mergesort for BAMs:



              Source is also there but if you want to test speed you can grab the binary to make things easy. We implemented SAM to BAM, mergesort, mark duplicates, and some other routines.

              Would love feedback on whether it is faster or not than what others are doing...

              Comment

              • Heisman
                Senior Member
                • Dec 2010
                • 534

                #37
                As someone who is not very computer-savvy, I am confused about whether I should update beyond samtools 0.1.18 to the new multithreaded versions. My confusion mainly stems from not knowing whether they work and how to download them. Is there a web page anywhere that documents the changes being made, when they are considered working and safe to use, and where to download them from?

                Comment

                • thetaomega3
                  Junior Member
                  • Jun 2012
                  • 1

                  #38
                  Hi Nils,

                  I gave it a try (0.1.18-r572) and had mixed results.

                  Success: going from SAM to BAM (samtools import) on a 102 GB SAM file results in a ~10X speedup on a 24-core (HT) machine with 192 GB RAM, and the output BAM file (27 GB) matches one generated from the general non-mt release (0.1.18 r982:295) (using diff).

                  Failure: sort fails with error "failed to create threads" when it attempts to merge all the intermediate sorted bam files. Running samtools merge on the same set of files also fails with the same error. Tried -n 6, 12 and 24 with no success. The general non-mt release completes the sort and merge successfully.

                  Suggestions?

                  Comment

                  • Richard Finney
                    Senior Member
                    • Feb 2009
                    • 701

                    #39
                    https://github.com/nh13/samtools ?

                    How is this project going?

                    Comment

                    • nilshomer
                      Nils Homer
                      • Nov 2008
                      • 1283

                      #40
                      Originally posted by Richard Finney
                      https://github.com/nh13/samtools ?

                      How is this project going?
                      I haven't taken a look at the sort problem, but not much else.

                      Comment

                      • adaptivegenome
                        Super Moderator
                        • Nov 2009
                        • 436

                        #41
                        We have a parallelized mergesort, source is on github... See my post a couple posts up.... Would love feedback and suggestions, etc...

                        Comment

                        • nilshomer
                          Nils Homer
                          • Nov 2008
                          • 1283

                          #42
                          Originally posted by adaptivegenome
                          We have a parallelized mergesort, source is on github... See my post a couple posts up.... Would love feedback and suggestions, etc...
                          Can you plug the "pbgzf.c" source code into bamtools? I have been playing around with bamtools and I quite like it. The bottleneck in reading/writing BAM files is the compression/decompression.
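As a rough illustration of why block-wise compression parallelizes well, here is a small Python sketch. It is not the code in "pbgzf.c"; it only shows the general technique of compressing independent blocks on several threads and writing them out in their original order. The function name `parallel_compress`, the block size, and the thread count are assumptions for this example, and any speedup depends on the compression calls releasing the interpreter lock, which is also an assumption here.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_compress(data: bytes, block_size: int = 65536, threads: int = 4) -> bytes:
    """Compress fixed-size blocks concurrently and concatenate them in order."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in input order, so the output blocks
        # stay in the same sequence as the input blocks.
        compressed = list(pool.map(gzip.compress, blocks))
    return b"".join(compressed)
```

Since every block is a complete gzip member, the concatenated output can be read back with an ordinary gzip decompressor, and reading can be parallelized the same way once the block boundaries are known.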

                          Comment

                          • adaptivegenome
                            Super Moderator
                            • Nov 2009
                            • 436

                            #43
                            Originally posted by nilshomer
                            Can you plug the "pbgzf.c" source code into bamtools? I have been playing around with bamtools and I quite like it. The bottleneck in reading/writing BAM files is the compression/decompression.
                            You are correct. We built a multithreaded version of a combined merge and sort from bamtools, and after lots of work we got a 5X increase over the serial implementation. In comparison, novosort offers a 10X increase, probably because we are still stuck with bamtools' serial I/O.

                            We are now replacing the serial I/O with parallel I/O, but it's taken a bit of time to do. We should have something soon.

                            One other thing is that we also included (optionally) MarkDuplicates as part of mergesort so this speeds things up as well...
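For readers unfamiliar with the merge step discussed in this thread, here is a minimal Python sketch of a sort followed by a k-way merge. It is not the implementation described above; records are stood in for by hypothetical `(reference_index, position)` tuples, and the sorted chunks play the role of the intermediate sorted files.

```python
import heapq


def sort_in_chunks(records, chunk_size):
    """Sort fixed-size chunks independently (stand-ins for intermediate sorted files)."""
    return [sorted(records[i:i + chunk_size])
            for i in range(0, len(records), chunk_size)]


def merge_runs(runs):
    """Merge already-sorted runs into one sorted list with a k-way heap merge."""
    return list(heapq.merge(*runs))
```

In this sketch the chunks could be sorted on separate threads or processes, since each is independent; only the final merge has to see all runs at once.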

                            Comment

                            • nilshomer
                              Nils Homer
                              • Nov 2008
                              • 1283

                              #44
                              Originally posted by adaptivegenome
                              You are correct. We built a multithreaded version of a combined merge and sort from bamtools, and after lots of work we got a 5X increase over the serial implementation. In comparison, novosort offers a 10X increase, probably because we are still stuck with bamtools' serial I/O.

                              We are now replacing the serial I/O with parallel I/O, but it's taken a bit of time to do. We should have something soon.

                              One other thing is that we also included (optionally) MarkDuplicates as part of mergesort so this speeds things up as well...
                              Could you use the implementation that I made to do parallel I/O?

                              Comment

                              • adaptivegenome
                                Super Moderator
                                • Nov 2009
                                • 436

                                #45
                                Oh I see what you are saying. Yes, let me check it out and see if I can figure it out!

                                Comment
