Hi all,
I have an experiment in which the reads carry user-defined barcodes. I know that the first 37 bp at the 5' end will not align to the reference genome, because they contain adaptor sequence in addition to the barcode. The reads are 75 bp, from an Illumina GAII.
I've seen that bowtie can trim the 5' end, but the trimmed sequence is not reported in the results (at least not in SAM format).
I'm also using bwa, and apparently it has no option for trimming reads on the fly.
I would like to align only the last ~40 bp but still have the whole read in the results.
My idea is to produce a FASTQ file of trimmed sequences, align those, and then recover the whole reads by matching read names.
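To make the idea concrete, here is a rough Python sketch of that trim-and-rejoin approach. It is only an illustration under several assumptions: the FASTQ has four lines per record, the aligner reports the first whitespace-delimited token of the header as the read name, and all clipped prefixes fit in memory. The function names and the ZB/ZQ tags are my own invention (the SAM format leaves tags starting with X, Y or Z to the user), not something bowtie or bwa produce.

```python
TRIM = 37  # 5' bases to clip (barcode + adaptor), as described above


def trim_fastq(fin, fout, n=TRIM):
    """Write each read without its first n bases.

    Returns {read_name: (clipped_bases, clipped_qualities)} so the
    clipped prefix can be re-attached after alignment.
    Assumes a plain 4-line-per-record FASTQ.
    """
    clipped = {}
    while True:
        header = fin.readline().rstrip("\n")
        if not header:
            break  # end of file (or a trailing blank line)
        seq = fin.readline().rstrip("\n")
        fin.readline()  # '+' separator line, ignored
        qual = fin.readline().rstrip("\n")
        # Assumption: the aligner uses the first token of the header as QNAME.
        name = header[1:].split()[0]
        clipped[name] = (seq[:n], qual[:n])
        fout.write(f"@{name}\n{seq[n:]}\n+\n{qual[n:]}\n")
    return clipped


def annotate_sam(sam_in, sam_out, clipped):
    """Copy a SAM stream, appending the clipped prefix as optional tags.

    ZB holds the clipped bases and ZQ the clipped qualities, both in the
    original read orientation (they are NOT reverse-complemented for
    reads that align to the reverse strand).
    """
    for line in sam_in:
        if line.startswith("@"):
            sam_out.write(line)  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        prefix = clipped.get(fields[0])  # field 1 of a SAM record is QNAME
        if prefix is not None:
            fields += [f"ZB:Z:{prefix[0]}", f"ZQ:Z:{prefix[1]}"]
        sam_out.write("\t".join(fields) + "\n")
```

Both functions take open file objects, so the usage would be: call trim_fastq on the original FASTQ, align the trimmed file with bowtie or bwa as usual, then pass the resulting SAM and the returned dictionary to annotate_sam. For a full lane the dictionary could get large, so an on-disk lookup might be needed instead.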
Apart from text-handling tools, can anybody suggest a sound approach to this kind of problem?
thanks
d