
  • Just 2.4% by trimming at 27. I do not know whether that is a good average for a 2x300 bp MiSeq 16S run and a metaprofiling of another marker.


    • Is the command you posted, from when you first got the error, copied and pasted from your terminal? In that case you simply made the mistake of loading only the reverse reads.
      That would also fit the error message you got, as you would have lost all mates. Interesting, however, that bbduk didn't complain and even produced two fastqs as output?!

      Originally posted by cnicolas View Post
      /data/umb/cichocki/bbmap/ in=/data/umb/cichocki/project2/bbduckdu11avril/project2_R2.fastq in=/data/umb/cichocki/project2/bbduckdu11avril/project2_R2.fastq out1=clean11avril.fastq out2=clean211avril.fastq qtrim=rl trimq=30


      • Originally posted by WhatsOEver View Post
        Is the command you posted, from when you first got the error, copied and pasted from your terminal? In that case you simply made the mistake of loading only the reverse reads.
        That would also fit the error message you got, as you would have lost all mates. Interesting, however, that bbduk didn't complain and even produced two fastqs as output?!
        Ah! Good eye...

        OK, so here's what's happening:

        in=x.fq in=x.fq
        In1 is set as x.fq. Then, in1 is set as x.fq again (you can do this as many times as you want; BBTools all just overwrite the previous setting with the latest setting). Then, since 2 output files are specified, BBDuk assumes that the input file is interleaved and forces interleaved mode to true. That's a feature, by the way! But, I guess one that could potentially cause problems.
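The behaviour described above can be illustrated with a small Python sketch. This is not BBDuk's code, only a toy model of "the latest setting wins" plus the rule that one input with two outputs implies interleaved input. The function names `parse_args` and `infer_interleaved` are hypothetical, and the file names are placeholders.

```python
def parse_args(argv):
    """Parse key=value tokens; a repeated key overwrites the earlier value."""
    settings = {}
    for token in argv:
        key, _, value = token.partition("=")
        if key == "in":
            # The post above treats in= as setting in1.
            key = "in1"
        settings[key] = value  # latest setting wins, no warning
    return settings


def infer_interleaved(settings):
    """One input file but two output files: assume the input is interleaved."""
    has_single_input = "in1" in settings and "in2" not in settings
    has_two_outputs = "out1" in settings and "out2" in settings
    return has_single_input and has_two_outputs


# The reverse reads are given twice, so the second in= just replaces the first.
args = ["in=R2.fastq", "in=R2.fastq", "out1=clean1.fastq", "out2=clean2.fastq"]
settings = parse_args(args)
print(settings)
print("interleaved assumed:", infer_interleaved(settings))
```

In this toy model the duplicated `in=` leaves a single input, so the two outputs trigger the interleaved assumption, which matches the explanation of why no error was raised.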


        • Hi,

          I see a big difference between the results of BBDuk and Trimmomatic when trimming single-end reads. With the commands below, Trimmomatic kept 99.78% of the reads, whereas BBDuk kept 91.76%. I don't know which result to trust.
          Which parameters do you use to trim single-end reads well?

          Thank you

          java -Xmx10g -jar trimmomatic-0.36.jar SE -threads 8 -phred33 D3_464_S2_L001_R1_001.fastq.gz Out_D3_464_S2_L001_R1_001.fastq.gz ILLUMINACLIP:TruSeq3-SE.fa:2:40:15:8:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36

          TrimmomaticSE: Started with arguments:
          -threads 8 -phred33 D3_464_S2_L001_R1_001.fastq.gz Out_D3_464_S2_L001_R1_001.fastq.gz ILLUMINACLIP:/home/jtazi/save/Trimmomatic-0.36/adapters/TruSeq3-SE.fa:
          2:40:15:8:true LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
          Using Long Clipping Sequence: 'AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA'
          Using Long Clipping Sequence: 'AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC'
          ILLUMINACLIP: Using 0 prefix pairs, 2 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
          Input Reads: 49512700 Surviving: 49401731 (99.78%) Dropped: 110969 (0.22%)

          Xmx8g in=D3_464_S2_L001_R1_001.fastq.gz out=D3_464_S2_L001_R1_001_trimmed.fastq.gz ref=resources/adapters.fa threads=8 k=13 ktrim=r useshortkmers=t mink=5 qtrim=rl minlength=36 trimq=27

          BBDuk version 36.11
          Set threads to 8
          maskMiddle was disabled because useShortKmers=true
          Memory: max=8232m, free=7974m, used=258m

          Added 2017 kmers; time: 0.570 seconds.
          Memory: max=8232m, free=7545m, used=687m

          Input is being processed as unpaired
          Started output streams: 0.449 seconds.
          Processing time: 118.446 seconds.

          Input: 49512700 reads 2475635000 bases.
          QTrimmed: 7288815 reads (14.72%) 218452414 bases (8.82%)
          KTrimmed: 3125475 reads (6.31%) 23413723 bases (0.95%)
          Total Removed: 4077445 reads (8.24%) 241866137 bases (9.77%)
          Result: 45435255 reads (91.76%) 2233768863 bases (90.23%)

          Time: 119.548 seconds.
          Reads Processed: 49512k 414.16k reads/sec
          Bases Processed: 2475m 20.71m bases/sec


          • The difference is primarily because you are quality-trimming to Q27, which is too high for almost any purpose. I'd suggest a command more like this:

            -Xmx8g in=D3_464_S2_L001_R1_001.fastq.gz out=D3_464_S2_L001_R1_001_trimmed.fastq.gz ref=resources/adapters.fa threads=8 k=19 mink=5 hdist=1 hdist2=0 ktrim=r qtrim=r minlength=36 trimq=14
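To put the two thresholds in perspective, here is a minimal Python sketch of the standard Phred relation, error probability = 10^(-Q/10). The helper name `phred_to_error` is just for illustration.

```python
def phred_to_error(q):
    """Convert a Phred quality score to the expected per-base error probability."""
    return 10 ** (-q / 10)


# Compare the thresholds discussed in this thread.
for q in (14, 15, 27, 30):
    print(f"Q{q}: expected error probability {phred_to_error(q):.4f}")
```

Q14 corresponds to roughly a 4% per-base error probability and Q27 to roughly 0.2%, so a Q27 cutoff discards many bases that are still very likely correct.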


            • Hi,

              Thank you for your answer. Just one question about the quality check:
              does trimq=14 refer to an average quality in a sliding window, as in Trimmomatic with SLIDINGWINDOW:4:15, or not?

              Best


              • BBDuk supports a sliding window; the flags "qtrim=w,4 trimq=15" will give similar behavior to Trimmomatic. But I don't recommend that; the Phred trimming method used by default is optimal, whereas sliding-window trimming is non-optimal.
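A rough Python sketch of why the two approaches can disagree. It rests on assumptions: the Phred-style method is modelled here as keeping the contiguous segment that maximises the sum of (cutoff error minus per-base error), which is how this kind of trimming is commonly described, and the sliding window is a simplified version that cuts at the first window whose mean falls below the threshold. Neither function is BBDuk's or Trimmomatic's actual code, and the names are hypothetical.

```python
def phred_to_error(q):
    """Phred score to expected error probability."""
    return 10 ** (-q / 10)


def max_segment_trim(quals, trimq):
    """Return (start, end) of the segment maximising sum(cutoff_err - base_err)."""
    cutoff = phred_to_error(trimq)
    best_sum, best = 0.0, (0, 0)
    cur_sum, cur_start = 0.0, 0
    for i, q in enumerate(quals):
        cur_sum += cutoff - phred_to_error(q)
        if cur_sum <= 0:
            # A segment with non-positive score is never worth extending.
            cur_sum, cur_start = 0.0, i + 1
        elif cur_sum > best_sum:
            best_sum, best = cur_sum, (cur_start, i + 1)
    return best


def sliding_window_trim(quals, window, trimq):
    """Simplified window trim: cut at the first window whose mean is below trimq."""
    for i in range(len(quals) - window + 1):
        if sum(quals[i:i + window]) / window < trimq:
            return (0, i)
    return (0, len(quals))


# Toy read: one poor base in the middle and a poor tail.
quals = [30, 30, 30, 2, 30, 30, 30, 30, 2, 2, 2, 2]
print("segment method keeps:", max_segment_trim(quals, 15))
print("window method keeps: ", sliding_window_trim(quals, 4, 15))
```

On this toy read the segment method keeps positions 4 to 7 (end index exclusive, so (4, 8)), while the window method keeps positions 0 to 6 and so retains the Q2 base at position 3, because averaging over a window hides an isolated bad base.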


                • okay good, thanks for your help.


                  • I am using BBDuk to trim fastqs to a given length with the force-trim capability. I noticed that the character # is being changed to ! in the Q score line of the trimmed fastq. I was unable to find documentation describing whether this is expected behavior. Would you be able to provide some insight into this? I ran the following command:

                    ../../tools/bbmap/ in=<sample>.fastq.gz out=<trimmed_sample>.fastq.gz ftr=50 ordered=t

                    Original fastq:
                    @SN1131:915:HFYN7ADXY:1:1101:21364:2052 1:N:0: CAACCACA

                    @SN1131:915:HFYN7ADXY:1:1101:21364:2052 1:N:0: CAACCACA

                    Thank you,


                    • Hi Brian,

                      That's intentional. The 5th base call is an N, which means the quality score should be 0 (!) not 2 (#). Some versions of Illumina software have bugs causing some Ns to be assigned quality scores above 0, or called bases to be assigned a quality score of 0. Neither of these cases should happen as they are mathematically contradictory, and can cause problems with downstream tools, so BBDuk automatically fixes both of them.

                      You can add the flag "changequality=f" to disable this behavior, but I don't recommend it.
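For readers who want to see the effect on a record, here is a minimal Python sketch of the N case only, assuming Phred+33 encoding (where "!" is Q0 and "#" is Q2). It is an illustration, not BBDuk's implementation; the function name is hypothetical, and the opposite case (called bases with Q0) is left out because the post does not say what value they are changed to.

```python
def zero_quality_for_n(bases, quals):
    """Set the quality character to '!' (Q0 in Phred+33) wherever the base is N."""
    if len(bases) != len(quals):
        raise ValueError("sequence and quality strings must have the same length")
    return "".join("!" if b in "Nn" else q for b, q in zip(bases, quals))


# Made-up read with an N as the 5th base, originally scored '#' (Q2).
print(zero_quality_for_n("ACGTNACG", "IIII#III"))
```

Only the position under the N changes; every other quality character is passed through untouched.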


                      • That makes sense. Thank you.


                        • This behaviour is a bit un-Unix like?

                          in1=R1.fq.gz in2=R2.fq.gz loglog loglogk=31 out=/dev/null

                          Unspecified format for output /dev/null; defaulting to fastq.

                          Exception in thread "main" java.lang.AssertionError: /dev/null already exists; please delete it.


                          • Originally posted by Torst View Post
                            This behaviour is a bit un-Unix like?
                            @Brian will have a more official answer but BBTools are pure Java and are coded to be OS agnostic (will run on any OS with Java).

                            Not specifying an "out" option with most BBTools produces all statistics without result output (giving you out=/dev/null effect).


                            • Haha

                              The syntax would be:

                              in1=R1.fq.gz in2=R2.fq.gz loglog loglogk=31 out=stdout.fq > /dev/null

                              But, you don't need to specify anything, as the default is to not print anything rather than writing to stdout, so just do this:

                              in1=R1.fq.gz in2=R2.fq.gz loglog loglogk=31

                              Edit: @Genomax beat me by a minute


                              • Getting different results with bbduk command line vs geneious plugin


                                I have been looking through the command-line default settings and still haven't identified the source of the discrepancy. Here is my Linux command:

                                sh ~/bbmap/ in1=~/path/to/forwards.fastq.gz in2=~/path/to/reverses.fastq.gz out=~/path/to/output.fastq.gz ref=~/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 minoverlap=24 tbo

                                result: 3628348 reads, 581467350 bases

                                and here is my plugin command from the geneious output:
                                java.exe -ea -Xmx100m -cp ...\currenjgi.BBDukF ktrim=r k=23 hdist=1 edist=0 mink=11 ref=adapters.fa minlength=10 trimbyoverlap=t minoverlap=24 qin=33 in=input1.fastq in2=input2.fastq out=output1.fastq out2=output2.fastq

                                result: 3628348 reads, 584214527 bases

                                The plugin command seems compatible with my data and with the defaults in BBDuk. Any idea why the plugin gives about 3M more bases?

                                Thanks, Aaron

