Hello,
I'm mapping 454 reads against a reference genome using bwa and then creating sorted BAM files with SAMtools.
However, most of my samples have only one copy of a 9 bp motif, whereas the reference has two. In some reads the motif is aligned first and followed by the gap, and in others the gap comes first and is followed by the motif, so in the end the consensus calls it as a tandem repeat.
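The ambiguity can be illustrated with a minimal Python sketch. The motif and flanking sequences here are invented for illustration, and the left-shift rule is the usual left-alignment convention for indels, not necessarily what bwa or GATK do internally. It shows that deleting either copy of the motif gives the same read sequence, and that shifting the gap as far left as possible gives one consistent placement.

```python
# Hypothetical example data: a 9 bp motif present twice in the reference.
MOTIF = "GATTACAGC"
REF = "ACGT" + MOTIF + MOTIF + "TTGA"


def apply_deletion(ref: str, pos: int, length: int) -> str:
    """Return the sequence obtained by deleting ref[pos:pos+length]."""
    return ref[:pos] + ref[pos + length:]


def left_align_deletion(ref: str, pos: int, length: int) -> int:
    """Shift a deletion left while the resulting sequence stays the same.

    The deletion can move one base to the left whenever the base just
    before it equals the last base of the deleted segment.
    """
    while pos > 0 and ref[pos - 1] == ref[pos + length - 1]:
        pos -= 1
    return pos


# Gap placed over the first copy (0-based position 4) or the second (13).
first_copy, second_copy = 4, 13

# Both placements describe exactly the same read sequence.
same_read = apply_deletion(REF, first_copy, 9) == apply_deletion(REF, second_copy, 9)

# After left-alignment both placements collapse to the same position.
aligned_first = left_align_deletion(REF, first_copy, 9)
aligned_second = left_align_deletion(REF, second_copy, 9)
print(same_read, aligned_first, aligned_second)
```

If every read had its gap placed by the same rule, the pileup over the two motif copies would be consistent and the consensus would not mix the two placements.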
I tried to use GATK for realignment, but it didn't work. Has anyone used GATK successfully for this purpose? Or does anyone have other ideas for solving the problem, other than de novo assembly?
I'm building a fully automated pipeline, so I use command-line software and scripts.
Help would be highly appreciated!
Thanks,
Stefan