"ecct" and extension both use k-mer data from all reads to do error-correction, so it's best to run them in the first pass, while the full set of reads is still available:
Code:
1) [adapter-trim as you have described]
2) bbmerge.sh in1=r1.fq in2=r2.fq out=merged.fq outu=unmerged#.fq ecct extend2=50 k=62 rem qtrim2=r trimq=10,15,20,25
3) bbduk.sh in=unmerged2.fq out=ftrimmed2.fq ftr=50 minlen=0 (make sure the out count is the same as the in count)
4) bbmerge.sh in1=unmerged1.fq in2=ftrimmed2.fq out=mergedB.fq outu=unmergedB#.fq ecct extend2=50 k=62 rem qtrim2=r trimq=10,15,20,25 extra=merged.fq
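For the count check in step 3, here is a minimal Python sketch. It is not part of BBTools; the function names are made up for illustration, and it assumes plain 4-line FASTQ records (no wrapped sequence lines), optionally gzipped. If read 2 loses records while read 1 does not, the two files fall out of sync and step 4 will pair the wrong reads, so it is worth confirming before re-merging.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    # Hypothetical helper: transparently handles .gz files by extension.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # Empty sequence/quality lines (zero-length reads) still count as lines.
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def same_read_count(in_path, out_path):
    """Return True if both FASTQ files hold the same number of records."""
    return count_fastq_records(in_path) == count_fastq_records(out_path)
```

With the file names from the steps above, the check would be same_read_count("unmerged2.fq", "ftrimmed2.fq"), and it should return True before you move on to step 4.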
But it's important to ask yourself whether you want to go to great effort to salvage the lowest-quality fraction of your data. I don't generally use "xloose" because it has a higher false-positive merge rate; in practice, I never go beyond "vloose". In this case, since it appears that sequencing of read 2 largely failed toward the end, it's worth making some effort to salvage those reads, or else you'll lose the longest-insert reads. It's also worth considering having the data rerun, since it's clearly a sequencing failure. I'd be interested in seeing the error rates of the reads from mapping (after adapter-trimming):
Code:
bbmap.sh in1=r1.fq in2=r2.fq ref=ref.fa mhist=mhist.txt qhist=qhist.txt bhist=bhist.txt
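To turn the match histogram into per-position error rates, something like the following Python sketch could work. It is an illustration only, and it assumes that mhist.txt is tab-delimited with a header line starting with "#" and that the read 2 columns are named Match2, Sub2, Del2 and Ins2; check the header of your own file and adjust the column names if they differ. The function name is hypothetical.

```python
import csv


def mismatch_rate_by_position(path, match_col="Match2",
                              error_cols=("Sub2", "Del2", "Ins2")):
    """Return [(position, error_rate), ...] from a tab-delimited histogram.

    Column names are assumptions about the file's header, not a documented
    contract; pass different names if your header differs.
    """
    rates = []
    header = None
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:
                continue
            if row[0].startswith("#"):
                # Header line: strip the leading '#' from the first column name.
                header = [row[0].lstrip("#")] + row[1:]
                continue
            if header is None:
                raise ValueError("no header line found before data")
            rec = dict(zip(header, row))
            match = float(rec[match_col])
            errors = sum(float(rec[c]) for c in error_cols)
            total = match + errors
            # The ratio works whether the columns hold counts or fractions.
            rate = errors / total if total else 0.0
            rates.append((int(rec[header[0]]), rate))
    return rates
```

Calling mismatch_rate_by_position("mhist.txt") gives the read 2 rates by default; a sharp rise toward the end of the read would be consistent with the failure described above.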