Thanks. The data are from a metagenome with no good reference, so mapping to get error rates is not an option. My impression was that most(?) 2x250 HiSeq runs suffer from problems on the reverse read (including the strange GC divergence problem, which we also see, though maybe that's a separate issue from quality alone). Others have recommended hard-trimming the last 50 bp, which is what inspired my harsh trimming.
I'd be curious to know if JGI or others are having better success with 2x250.
We may end up just using the forward read for the poor-quality/unmerged pairs, but given the insert size distribution, it seemed worth trying to push the merging a bit further. I find it interesting that extension and xloose did not help as much as simple trimming, which also seems to me to be the safer, more conservative option.
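For anyone following along, here is a minimal sketch of what I mean by hard-trimming: drop a fixed number of bases from the 3' end of every read, regardless of quality. This is plain Python for illustration only, not a BBTools command, and the function names (`hard_trim_right`, `trim_fastq_lines`) are made up for this example. It assumes well-formed 4-line FASTQ records.

```python
def hard_trim_right(seq, qual, n=50):
    """Drop the last n bases and their quality scores from one read."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    # Reads shorter than n are trimmed to length zero rather than failing.
    keep = max(len(seq) - n, 0)
    return seq[:keep], qual[:keep]


def trim_fastq_lines(lines, n=50):
    """Yield (header, seq, plus, qual) tuples with n bases removed from the 3' end.

    Assumes strict 4-line FASTQ records (no wrapped sequence lines).
    """
    it = iter(lines)
    for header in it:
        seq = next(it).rstrip("\n")
        plus = next(it).rstrip("\n")
        qual = next(it).rstrip("\n")
        seq, qual = hard_trim_right(seq, qual, n)
        yield header.rstrip("\n"), seq, plus, qual
```

With n=50 on a 250 bp reverse read this leaves 200 bp, so it only makes sense if the insert size distribution still leaves enough overlap for merging after trimming.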
Thanks for your help, and for making these tools so modular and customizable.
MC