Great tool...
Hi,
This is a great tool, by the way. I was hoping someone had already implemented exactly this, as I was scratching my head over how to do it myself. Thanks a lot.
May I make a couple of suggestions in terms of functionality? First, would it be possible to add a parameter that keeps a trimmed read only if it meets a minimum length? For example, if the parameter were set to 20, the original read were 36bp and a 17bp adapter were trimmed, the read would be left out of the output because only 19bp would remain. The default could be 0, so that nothing is filtered unless you ask for it.
Second, it would be useful to track which sequence was trimmed as adapter and where in the original read it was found. Maybe an optional dump .fastq file would help here: it would contain the trimmed adapter sequence, plus the position at which it was found and the number of mismatches in the match. For example, if a read is 36bp and the adapter is found at 1 to 15bp with 0 mismatches, you could append this to the '+' line of the fastq entry as 1_15_0, while the '@' line would stay the same as in the original read. With the adapter .fastq output it would then be possible to parse the adapter sequences as required.
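To make the two suggestions concrete, here is a rough Python sketch of what I have in mind. It is not based on the tool's actual code: the function name `split_adapter`, its arguments and the `min_length` parameter are all made up for illustration. It assumes the adapter coordinates are 1-based and inclusive, that the longer side of the read around the adapter is the part worth keeping, and that a read with exactly `min_length` bases is kept.

```python
def split_adapter(header, seq, qual, start, end, mismatches, min_length=0):
    """Split one FASTQ read into a trimmed record and an adapter record.

    Hypothetical helper, not part of the tool. `start` and `end` are the
    1-based, inclusive coordinates of the adapter in the original read.
    Returns (trimmed_record_or_None, adapter_record) as FASTQ text.
    """
    adapter = slice(start - 1, end)
    left = slice(0, start - 1)       # bases before the adapter
    right = slice(end, len(seq))     # bases after the adapter

    # Assumption: keep whichever flank of the read is longer.
    keep = right if (len(seq) - end) >= (start - 1) else left

    # Adapter dump entry: same '@' line, location and mismatches on the '+' line.
    adapter_record = "\n".join([
        header,
        seq[adapter],
        f"+{start}_{end}_{mismatches}",
        qual[adapter],
    ])

    kept_seq = seq[keep]
    if len(kept_seq) < min_length:
        # Too short after trimming: drop the read but still report the adapter.
        return None, adapter_record

    trimmed_record = "\n".join([header, kept_seq, "+", qual[keep]])
    return trimmed_record, adapter_record


# Example from the post: 36bp read, 17bp adapter at the start, minimum length 20.
read_seq = "A" * 17 + "C" * 19
read_qual = "I" * 36
trimmed, dumped = split_adapter("@read1", read_seq, read_qual, 1, 17, 0, min_length=20)
```

In this example `trimmed` is `None`, because only 19bp remain, while `dumped` still holds the adapter entry with `+1_17_0` on its third line. With `min_length=0` nothing would be filtered.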
These are just suggestions by the way. I can see this tool becoming very useful to me and I have already introduced it to all of the bioinformaticians in my lab.
I look forward to reading your response.