Seqanswers Leaderboard Ad

**Kermit** · 07-04-2023, 01:07 PM

Thanks for the tool. In comparison to `cutadapt`, I noticed that `bbduk` (using the options below) ran 20x faster, and excluded all of the over-represented sequences that were flagged by FastQC.

```
bbmap/bbduk.sh in1=file_1.fq.gz in2=file_2.fq.gz out1=file_1_trim.fq.gz out2=file_2_trim.fq.gz ref=adapters ktrim=r k=23 mink=11 hdist=1 tpe tbo
```

Are the above options sensible for trimming paired end Illumina RNA-seq?

Should I be using `ktrim=rl` or a lower `k`?

I did not see adapters.fa file in the bbmap folder, so I just used the 'adapters' string as mentioned in the --help description of ref.

I am new to processing this omic at the read level. Thank you

**ajverster** · 08-10-2023, 09:05 AM

Hello,

I'm noticing some non-reproducibility when bbduk is writing gzipped files. The reads appear to be the same, but the output order has changed. This leads to subtle changes in metaphlan abundance levels (changes at the 4th significant didget). Is this intended behavior or a bug? Is there anyway to force bbduk to run in a fully reproducible way?

```
bbduk.sh in=DC001_R1.fastq.gz out=DC001_R1.rep1.fastq.gz ftm=5 tpe tbo qtrim=rl trimq=25 minlen=50 ref=adapters,phix -Xmx16g
bbduk.sh in=DC001_R1.fastq.gz out=DC001_R1.rep2.fastq.gz ftm=5 tpe tbo qtrim=rl trimq=25 minlen=50 ref=adapters,phix -Xmx16g
```

**susanklein** · 11-27-2023, 05:33 PM

Hi,

Getting this strange error running mapPacBio.sh. The cluster node I'm running it on has 104 cpus, 500GB mem, and 400GB storage. I was getting the same error using 104 cpus so tried lowering it to 48.

The read fasta is 56 GB and the reference fasta is 95 MB.

I've cut down the length of the sequence and 'Sm' string shown.

Thanks,

S.

mapPacBio.sh ref=$CPREF ambig=all nodisk=t nzo=t mappedonly=t minid=0.85 out=bact-bbmap.sam scafstats=cp_bb.mapstats in=$reads fastareadlen=6000 threads=48 maxindel=100 k=15

Loading python3/3.9.2
Loading requirement: intel-mkl/2020.3.304
java -ea -Xmx229788m -cp /g/data/nm31/bin/bbmap/current/ align2.BBMapPacBio build=1 overwrite=true minratio=0.40 fastareadlen=6000 ambiguous=best minscaf=100 startpad=10000 stoppad=10000 midpad=6000 ref=/home/554/ta0341/dy44/r12.18_Roepera_similis/cp-fabids.fasta ambig=all nodisk=t nzo=t mappedonly=t minid=0.75 out=bact-bbmap.sam scafstats=bact_bb.mapstats in=/home/554/ta0341/dy44/r12.18_Roepera_similis/reads.fasta fastareadlen=6000 threads=48 maxindel=100 k=15
Executing align2.BBMapPacBio [build=1, overwrite=true, minratio=0.40, fastareadlen=6000, ambiguous=best, minscaf=100, startpad=10000, stoppad=10000, midpad=6000, ref=/home/554/ta0341/dy44/r12.18_Roepera_similis/cp-fabids.fasta, ambig=all, nodisk=t, nzo=t, mappedonly=t, minid=0.75, out=bact-bbmap.sam, scafstats=bact_bb.mapstats, in=/home/554/ta0341/dy44/r12.18_Roepera_similis/reads.fasta, fastareadlen=6000, threads=48, maxindel=100, k=15]
Version 39.01

Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.400
Set OUTPUT_MAPPED_ONLY to true
Scaffold statistics will be written to bact_bb.mapstats
Set threads to 48
Retaining all best sites for ambiguous mappings.
Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.367
Executing dna.FastaToChromArrays2 [/home/554/ta0341/dy44/r12.18_Roepera_similis/cp-fabids.fasta, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=false, minscaf=100, midpad=6000, startpad=10000, stoppad=10000, nodisk=true]

Set genScaffoldInfo=true
Set genome to 1

Loaded Reference: 0.003 seconds.
Loading index for chunk 1-1, build 1
Indexing threads started for block 0-1
Indexing threads finished for block 0-1
Generated Index: 10.393 seconds.
Analyzed Index: 33.211 seconds.
Started output stream: 1.183 seconds.
Creating scaffold statistics table: 0.036 seconds.
Cleared Memory: 0.175 seconds.
Processing reads in single-ended mode.
Started read stream.
Started 48 mapping threads.
Exception in thread "Thread-43" java.lang.AssertionError: -24, -24
4908336, 1,0,99886506,99894071,0,00,151,154695,351771,0,263804,,mSSSmSSSmSmmmSSSSSSmSSSmmSSmmmSmmmSSmmSSSSSmmmSSSSSmSSSmmSSSmmSSmSSSSSmSSSSSSmSSSSSSmSSSSSSSSSSSmmmSSSSSSSmSSmmSSmSSSSSSSSSSSSSSmmmmmmSmmmmmmmmmSmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmSmmmmmmmmmmmmmmmmSmmmmmmmmmmmmmSSmmmmmmmmmmmmmmmmmmmmSSmSSSSSmmSSSSmmSSmSmSSmSSSSSSSmSSSSmSmSSSSSSSSmmmSSmSSSSSSSSSSSSmSSSmmSSmSSSmmSSSmmSSmmmmmmmSSmmmmSmSmmSmSSSSSmSSSmmSSSSSSmSSSSmmSSSmSSSSmmSmSSSmmSmSSSmSSSmmmSSSSSmSSSmSSSSmmSSmSSSSSSSmSmSSmSSSSSmSmmSSSSmSSSSSSSmSSSmSmmSSSSSmSSSmSSSmmmSSSmSmSSSmmSmSSmSSSSSmSSSmSmmSSmSSSSmSSmSSmmmmSSSSSSSmSmSSSmSmSSSSmmSmSmSmmmmSSSSSSSSSSSSmmSSmSSSmSmSSSSSSmmSmmSSmSSSmmmSSmSSSSSmmSSSSSmSSSmmSSSSmSSSmSmmSS....SSmSm, ATGAAAGAAAATATTCACATAGCTTTGTTTGTTTGATTGGCCT....CCACAAACTACGAAAATTTCTCTTTTTCCTATCTCTATCCAGATCAGCCGAAGCCAAAGCTCTTATTATTTGCTGAGAAATAGCTCGAGTAAG
at align2.TranslateColorspaceRead.realign_new(TranslateColorspaceRead.java:407)
at align2.AbstractMapThread.genMatchStringForSite(AbstractMapThread.java:1043)
at align2.AbstractMapThread.genMatchString(AbstractMapThread.java:923)
at align2.BBMapThreadPacBio.processRead(BBMapThreadPacBio.java:564)
at align2.AbstractMapThread.run(AbstractMapThread.java:522)
Exception in thread "Thread-20" java.lang.AssertionError: -19, -19
9787423, 1,0,91446916,91454472,0,00,84,35773,287063,0,286984,,SSSSmSSSmSSm....SmSm, AGATAGATTAAAATAAAAA....AATATTCATCGTTTAATTCTATAATAGAATGC
at align2.TranslateColorspaceRead.realign_new(TranslateColorspaceRead.java:407)
at align2.TranslateColorspaceRead.realign_new(TranslateColorspaceRead.java:638)
at align2.AbstractMapThread.genMatchStringForSite(AbstractMapThread.java:1026)
at align2.AbstractMapThread.genMatchString(AbstractMapThread.java:923)
at align2.BBMapThreadPacBio.processRead(BBMapThreadPacBio.java:564)
at align2.AbstractMapThread.run(AbstractMapThread.java:522)
Detecting finished threads: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47

**************************************************************************
Warning! 2 mapping threads did not terminate normally.
Check the error log; the output may be corrupt or incomplete.
Please submit the full stderr output as a bug report, not just this message.
**************************************************************************

------------------ Results ------------------

Genome: 1
Key Length: 15
Max Indel: 100
Minimum Score Ratio: 0.367
Mapping Mode: normal
Reads Used: 12661524 (57423401385 bases)

Mapping: 27513.481 seconds.
Reads/sec: 460.19
kBases/sec: 2087.10

Read 1 data: pct reads num reads pct bases num bases

mapped: 1.9671% 249070 1.9269% 1106468960
unambiguous: 1.8172% 230081 1.8567% 1066207675
ambiguous: 0.1500% 18989 0.0701% 40261285
low-Q discards: 0.0000% 0 0.0000% 0

perfect best site: 0.0023% 288 0.0000% 20624
semiperfect site: 0.0023% 288 0.0000% 20624

Match Rate: NA NA 78.4035% 941243945
Error Rate: 86.9937% 248782 20.7111% 248639475
Sub Rate: 86.8864% 248475 5.4915% 65925963
Del Rate: 83.3997% 238504 7.8336% 94043737
Ins Rate: 83.6599% 239248 7.3860% 88669775
N Rate: 3.2615% 9327 0.8854% 10629277
Exception in thread "main" java.lang.AssertionError:
The number of reads out does not add up to the number of reads in.
This may indicate that a mapping thread crashed.
If you submit a bug report, include the entire console output, not just this error message.
125046+124024+0+12412452+0+0 = 12661522 != 12661524
at align2.AbstractMapper.printOutputStats(AbstractMapper.java:1962)
at align2.AbstractMapper.printOutput(AbstractMapper.java:1060)
at align2.BBMapPacBio.testSpeed(BBMapPacBio.java:491)
at align2.BBMapPacBio.main(BBMapPacBio.java:36)

**nicolask** · 03-29-2024, 02:23 AM

Hello, thank you very much for the tool! I have used it in the past on my older intel-based mac. However, I just downloaded the latest version onto my M3 silicon mac, and I am running into some issues...

Essentially, there are errors when trying to create an index, and when fails when trying to align some fasta sequences.

Code:

$ java -ea -Xmx32G -Xms32G -cp /Users/nicolas/bin/bbmap/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 in=testmap.fa out=nate.bam ref=/Users/nicolas/genomes/mm39/mm39_ref.fa.gz

Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, in=testmap.fa, out=nate.bam, ref=/Users/nicolas/genomes/mm39/mm39_ref.fa.gz]

Version 39.06

or

Code:

$ ~/bin/bbmap/bbmap.sh in=testmap.fa out=nate.bam ref=~/genomes/mm39/mm39_ref.fa.gz -Xmx32G

/Users/nicolas/bin/bbmap//calcmem.sh: line 78: [: -v: unary operator expected

java -ea -Xmx32G -Xms32G -cp /Users/nicolas/bin/bbmap/current/ align2.BBMap build=1 overwrite=true fastareadlen=500 in=testmap.fa out=nate.bam ref=/Users/nicolas/genomes/mm39/mm39_ref.fa.gz -Xmx32G

Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, in=testmap.fa, out=nate.bam, ref=/Users/nicolas/genomes/mm39/mm39_ref.fa.gz, -Xmx32G]

Version 39.06

Code:

Exception in thread "Thread-10" java.lang.AssertionError
at align2.IndexMaker4$BlockMaker$CountThread.fillArrays(IndexMaker4.java:413)
at align2.IndexMaker4$BlockMaker$CountThread.run(IndexMaker4.java:300)

Exception in thread "Thread-13" java.lang.AssertionError
at align2.IndexMaker4$BlockMaker$CountThread.fillArrays(IndexMaker4.java:413)
at align2.IndexMaker4$BlockMaker$CountThread.run(IndexMaker4.java:300)

Indexing threads finished for block 0-3
Indexing threads finished for block 4-6

Generated Index: 65.699 seconds.
Finished Writing: 0.824 seconds.

Exception in thread "main" java.lang.AssertionError
at align2.BBIndex.analyzeIndex(BBIndex.java:130)
at align2.BBMap.loadIndex(BBMap.java:420)
at align2.BBMap.main(BBMap.java:32)

From what I can tell, the indexing seems to be complete as I see the relevant files, but fails completely when trying to do the alignment. For now, I am only trying to align a simple fasta file with about 12 different entries. I do not know if this is a java issue, an M3 issue, or something completely wrong that I'm doing.

Thank you in advance.

Sincerely,

NK

**valeng** · 05-03-2024, 11:42 AM

Hi there!

I've been working with 16S DNA metagenomics sequences recently. Before proceeding with DADA2 and Qiime2-amplicon-2023.9, I performed quality filtering using the following code:

for file in "${INPUT_DIR}"/*.fastq; do
base_name=$(basename "${file}" .fastq)
bbduk.sh in="${file}" out="${OUTPUT_DIR}/${base_name}.fastq" qtrim=rl trimq=10
echo "Processed file: ${file}"
done

Afterwards, I conducted a FastQC analysis, and the results looked promising. Initially, I encountered an issue where setting the trimq parameter to 15 or 20 resulted in forward and reverse sequences being of different lengths, making it impossible to proceed with DADA2. To resolve this, I set trimq to 10, and adjusted the parameters in DADA2 accordingly:

qiime dada2 denoise-paired --i-demultiplexed-seqs ${INPUT_DIR}/demux-paired-end.qza --p-trunc-len-f 250 --p-trim-left-f 8 --p-trunc-len-r 230 --p-trim-left-r 8 --o-representative-sequences ${OUTPUT_DIR}/dada2-rep-seqs.qza --o-table ${OUTPUT_DIR}/dada2-table.qza --o-denoising-stats ${OUTPUT_DIR}/dada2-stats.qza

However, despite these adjustments, I encountered a significant decrease in the percentage of merged reads.

Do you have any suggestions on how to improve this situation?

Topics	Statistics	Last Post
New Model Aims to Explain Polygenic Diseases by Connecting Genomic Mutations and Regulatory Networks by seqadmin Started by seqadmin, Yesterday, 05:31 AM	0 responses 10 views 0 likes	Last Post by seqadmin Yesterday, 05:31 AM
Small Blood Stem Cell Subset Linked to Immune System Aging by seqadmin Started by seqadmin, 10-24-2024, 06:58 AM	0 responses 20 views 0 likes	Last Post by seqadmin 10-24-2024, 06:58 AM
New AI Model Designs Synthetic DNA Switches for Targeted Gene Expression in Specific Cell Types by seqadmin Started by seqadmin, 10-23-2024, 08:43 AM	0 responses 50 views 0 likes	Last Post by seqadmin 10-23-2024, 08:43 AM
Microbes in Urban Spaces Adapt to Disinfectants and Scarce Resources by seqadmin Started by seqadmin, 10-17-2024, 07:29 AM	0 responses 58 views 0 likes	Last Post by seqadmin 10-17-2024, 07:29 AM

Seqanswers Leaderboard Ad

Announcement

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News