
  • Originally posted by SylvainL
    I am using BBDuk version 36.84, I guess that's the main difference...
    Well, let us remove that difference then:

    Code:
    $ bbmap/bbduk.sh in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT
    java -Djava.library.path=/path_to/bbmap/jni/ -ea -Xmx19498m -Xms19498m -cp /path_to/bbmap/current/ jgi.BBDukF in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT
    Executing jgi.BBDukF [in=nn.fq, out=stdout.fq, mm=f, hdist=0, edist=0, ktrim=l, rcomp=f, k=29, literal=CAACAGCAATATACCTTCTCGAGAGGTCT]
    
    BBDuk version 36.84
    Initial:
    Memory: max=19594m, free=19185m, used=409m
    
    Added 1 kmers; time:    0.043 seconds.
    Memory: max=19594m, free=18469m, used=1125m
    
    Input is being processed as unpaired
    Started output streams: 0.018 seconds.
    
    @test
    CTGTCCACAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGGAATCTC
    +
    FGGGCEGGGGIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIGIIGFIIFIIIIIII
    Processing time:                0.006 seconds.
    
    Input:                          1 reads                 100 bases.
    KTrimmed:                       1 reads (100.00%)       45 bases (45.00%)
    Total Removed:                  0 reads (0.00%)         45 bases (45.00%)
    Result:                         1 reads (100.00%)       55 bases (55.00%)
    
    Time:                           0.076 seconds.
    Reads Processed:           1    0.01k reads/sec
    Bases Processed:         100    0.00m bases/sec



    • OK, thanks for all this effort. I really don't get it. Now it's running with maxlength=1 and I get the expected results.
      Quite weird.



      • Just for the record, what OS/Java version are you using?

        I am not using maxlength=1 and still get the correct answer. Strange indeed.



        • Ubuntu 12.04 server...
          java version "1.7.0_121"
          OpenJDK Runtime Environment (IcedTea 2.6.8) (7u121-2.6.8-1ubuntu0.12.04.1)
          OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)



          • Sounds like, possibly, an intermittent filesystem problem / system problem? Please let me know if it happens again.



            • Originally posted by Brian Bushnell
              Sounds like, possibly, an intermittent filesystem problem / system problem? Please let me know if it happens again.
              It appears to go away for @SylvainL when using the maxlength=1 option, which is odd.

              PS: I get the correct answer without the need of maxlength.
              Last edited by GenoMax; 01-12-2017, 09:49 AM.



              • Originally posted by GenoMax
                It appears to go away by using the maxlength=1 option, which is odd.
                Oh, my impression was that the problem occurred once but was then not reproducible either with or without "maxlength=1" (which, actually, should produce no output at all in this case).

                @SylvainL Sorry for the confusion; can you please clarify what output you are currently getting with and without "maxlength=1"? Currently, I get:

                Code:
                bbduk.sh in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT
                
                output:
                @test
                CTGTCCACAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGGAATCTC
                +
                FGGGCEGGGGIIIIIIIIIIIIIIIIGIIIIIIIIIIIIIGIIGFIIFIIIIIII
                Code:
                bbduk.sh in=nn.fq out=stdout.fq mm=f hdist=0 edist=0 ktrim=l rcomp=f k=29 literal=CAACAGCAATATACCTTCTCGAGAGGTCT maxlen=1
                
                output:
                (empty)
                ...which is what I expect.



                • Hi Brian,

                  If I do NOT set maxlength=1, the output contains only reads shorter than 10 bp after trimming.
                  If I set maxlength=1, I get the normal output, i.e. reads of around 40-50 bp after trimming.

                  I also set trimq=0 to rule out a quality-trimming problem.

                  Here is my command:

                  Code:
                  ~/Applications/bbmap/bbduk.sh in=./out/${MissMatch}_MM_${Indel}_Indel/${Samplename}.fastq outm=./out/${MissMatch}_MM_${Indel}_Indel/${Samplename}_left.fastq literal=$(cat ./out/Barcodes_with_adapters/${Samplename}) k=$(wc -m ./out/Barcodes_with_adapters/${Samplename} | awk '{print $1-1}') mm=f hdist=${MissMatch} edist=${Indel} ktrim=l rcomp=f maxlength=1 trimq=0;
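
The k=$(wc -m ... | awk '{print $1-1}') part of the command above derives k from the character count of the barcode file; the -1 presumably compensates for the trailing newline, which wc -m counts as a character. A minimal sketch of that arithmetic, assuming a one-line barcode file. The temporary file and the variable names are made up for illustration; the sequence is the literal used earlier in the thread.

```shell
# Hypothetical one-line barcode file (illustration only).
barcode_file=$(mktemp)
printf 'CAACAGCAATATACCTTCTCGAGAGGTCT\n' > "$barcode_file"

# wc -m counts the trailing newline as well, hence the "-1".
k_wc=$(wc -m < "$barcode_file" | awk '{print $1-1}')

# Alternative: strip all whitespace first, then take the string length.
literal=$(tr -d '[:space:]' < "$barcode_file")
k_len=${#literal}

echo "literal=$literal k_wc=$k_wc k_len=$k_len"   # both k values are 29 here
rm -f "$barcode_file"
```

If the file had no trailing newline, or had Windows line endings, the wc -m value would be off by one, while the whitespace-stripped length would not. Echoing the computed k before running the real command is a cheap check.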
                  Last edited by SylvainL; 01-12-2017, 10:14 PM.



                  • Hi Brian,

                    Do you think it is possible for bbduk2.sh to trim both the 5' end and the 3' end, matching them to a 22 nt reference and a 3 nt reference, respectively?
                    To be more specific, the 5' end primer is GCGGAGATCTACCTACGTACTT and the 3' end primer is TGT.

                    Thanks for your great tool!



                    • Originally posted by netasha
                      Hi Brian,

                      Do you think it is possible for bbduk2.sh to trim both the 5' end and the 3' end, matching them to a 22 nt reference and a 3 nt reference, respectively?
                      To be more specific, the 5' end primer is GCGGAGATCTACCTACGTACTT and the 3' end primer is TGT.

                      Thanks for your great tool!
                      Pretty sure the answer is yes, but you may want to do the following instead. A similar question recently came up on Biostars and this was @Brian's recommendation.

                      Code:
                      bbduk.sh in=file.fq out=stdout.fq ktrim=r k=3 mm=f literal=TGT rcomp=f ktrimexclusive | bbduk.sh in=stdin.fq out=trimmed.fq ktrim=l k=22 mm=f literal=GCGGAGATCTACCTACGTACTT rcomp=f ktrimexclusive



                      • Originally posted by GenoMax
                        Pretty sure the answer is yes, but you may want to do the following instead. A similar question recently came up on Biostars and this was @Brian's recommendation.

                        Code:
                        bbduk.sh in=file.fq out=stdout.fq ktrim=r k=3 mm=f literal=TGT rcomp=f ktrimexclusive | bbduk.sh in=stdin.fq out=trimmed.fq ktrim=l k=22 mm=f literal=GCGGAGATCTACCTACGTACTT rcomp=f ktrimexclusive

                        Thanks for your quick reply!
                        I was thinking of doing it sequentially as well. But what is "ktrimexclusive"?



                        • Originally posted by netasha
                          Thanks for your quick reply!
                          I was thinking of doing it sequentially as well. But what is "ktrimexclusive"?
                          Hi Netasha,

                          BBDuk2 can trim the left and right ends at the same time, but it can only use a single kmer length, so it won't work in your case. Two passes with BBDuk, using 2 different kmer lengths, is better. "TGT" is super short, though, which will lead to overtrimming due to coincidental matches. Is there any more fixed sequence following that?

                          BBDuk's normal trimming behavior when matching a kmer is to trim the kmer itself and everything to the right/left of it. "ktrimexclusive" tells BBDuk to only trim to the right/left, but not to trim the matched kmer itself (so TGT or whatever would still remain in the read). Whether or not you should use that flag depends on whether the sequences you want to trim are genomic. For adapters, which are artificial, the ktrimexclusive flag should not be used; when the matched sequence is genomic and should stay in the read, it should.
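
The difference between the default behavior and "ktrimexclusive", and the overtrimming risk with a 3-mer, can be sketched with plain shell string handling. This is only a toy model of right-trimming on an exact match, not BBDuk's implementation; the function name trim_right, its arguments, and the example read are made up, and trimming from the leftmost match is an assumption of the model.

```shell
# Toy model of ktrim=r on an exact kmer match (not BBDuk code).
# Usage: trim_right READ KMER MODE    (MODE: inclusive | exclusive)
trim_right() {
    read_seq=$1; kmer=$2; mode=$3
    case $read_seq in
        *"$kmer"*)
            # Keep everything left of the leftmost match (assumption of this model).
            kept=${read_seq%%"$kmer"*}
            if [ "$mode" = exclusive ]; then
                kept=$kept$kmer          # exclusive: the matched kmer stays in the read
            fi
            printf '%s\n' "$kept" ;;
        *)
            printf '%s\n' "$read_seq" ;; # no match: read is left unchanged
    esac
}

# Hypothetical read with TGT at the 3' end and a coincidental TGT near the start.
trim_right ACTGTGGAATGT TGT inclusive   # prints AC
trim_right ACTGTGGAATGT TGT exclusive   # prints ACTGT
```

In both modes the coincidental internal TGT, not the one at the 3' end, decides where the read is cut, which is the overtrimming described above.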



                          • That's why I was looking for a parameter that can restrict this short 3-mer to be anchored at the rightmost position. I thought you already provided such a parameter: restrictright=3. Am I right?

                            I ask because I was using cutadapt, which can do this trimming when a "$" is added at the end of the 3-mer: TGT$. If I'm wrong, would it be a lot of effort to add such a function to bbduk2.sh?

                            Thanks for the explanation of the "ktrimexclusive".



                            • Originally posted by netasha
                              That's why I was looking for a parameter that can restrict this short 3-mer to be anchored at the rightmost position. I thought you already provided such a parameter: restrictright=3. Am I right?
                              Oh, yes, if you know the position then use restrictright=3. If you already know that 100% of the time you want to trim the last 3 bases, then you can just use "ftr2=3" instead of kmer-matching.
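
A rough model of the two options mentioned here, as described in this thread: restricting the match to the last 3 bases only trims reads that actually end in TGT (comparable to cutadapt's TGT$), whereas "ftr2=3" removes the last 3 bases from every read regardless of sequence. The function names and example reads are hypothetical, and this is plain shell string handling rather than BBDuk code.

```shell
# Anchored trim: remove PRIMER only if the read ends with it (toy model).
trim_anchored() {
    s=$1; primer=$2
    s=${s%"$primer"}             # no-op when the read does not end in PRIMER
    printf '%s\n' "$s"
}

# Unconditional trim: drop the last N bases of every read (toy model).
trim_last_n() {
    s=$1; n=$2
    while [ "$n" -gt 0 ] && [ -n "$s" ]; do
        s=${s%?}                 # remove one trailing base
        n=$((n - 1))
    done
    printf '%s\n' "$s"
}

trim_anchored ACTGTGGAATGT TGT   # prints ACTGTGGAA (internal TGT untouched)
trim_anchored ACTGTGGAACC TGT    # prints ACTGTGGAACC (no TGT at the end)
trim_last_n   ACTGTGGAACC 3      # prints ACTGTGGA (trimmed regardless)
```

The unconditional variant is only appropriate when every read is known to end with the primer, as stated in the reply above.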



                              • Hi Brian,

                                I started using an HPC (36 CPUs @ 3.5 GHz each & 60 GB RAM) to process NGS data. I noticed that while using BBDuk, neither the RAM nor the processors are being challenged, yet BBDuk takes quite a while to process the reads (about 10 minutes for 90,000 150 bp x 2 paired reads). BBDuk parameters:

                                Adapters
                                Trim: Right End Only
                                Kmer Length: 27
                                Max Substitutions: 3
                                Max Substitutions + INDELs: 0
                                Trim partial adapters with kmer length: Yes, 7

                                Trim Low Quality - Yes
                                Both Ends
                                Minimum Quality: 20

                                Discard Short Reads - Yes
                                Minimum Length: 75 bp (changed to 75 because I found that primer dimers contributed to assembled reads when cutoff was set at 50)

                                Keep Original Order - Yes


                                I tried using the t=36 flag, but still don't get all of my processors utilized, and the RAM is set to 45 GB and only about 10 GB is utilized. BBMap on the other hand can and does cap out the processors during mapping on the HPC. I'm using BBDuk inside of Geneious, so if you think this is abnormal performance for BBDuk, I can reach out to them to inquire.

                                Thanks
                                Jake
