Fellow bioinformatics gurus,
I have an apparently simple task: search for a specific pattern in a whole genome or in a comparable collection of reads. For example, exons coding for fragments of collagen. The protein pattern (Gly-Xaa-Yaa)n translates to GGNNNNNNNGG (Gly is GGN, followed by two arbitrary codons and the GG of the next Gly), or GGNNNNNNNGGNNNNNNNGG, and so on. The number of repeats may vary. Another example would be exons encoding fragments similar to the polyQ tract in FOXP2, i.e. CAR repeats (CAA or CAG) of variable length. What is the best Linux software to do this? I found COMPASSS, but it choked while searching for GGNNNNNNNGGNNNNNNNGG in chromosome 7 alone, which carries one of the collagen genes.
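To make the patterns concrete, here is a minimal sketch of how I would write them as regular expressions in Python. The function name `find_tandem_codons` and the two constants are my own illustrative names, not part of any existing tool. It scans only the forward strand of a sequence already held in memory, reports non-overlapping matches, and does not check the reading frame against annotated exons.

```python
import re

# IUPAC-style units written as plain regex character classes (assumed notation):
# Gly-Xaa-Yaa -> GG followed by any 7 bases; CAR -> CAA or CAG.
COLLAGEN_UNIT = "GG[ACGT]{7}"
CAR_UNIT = "CA[AG]"

def find_tandem_codons(seq, unit, min_repeats=2, tail=""):
    """Return (start, end, match) for runs of at least min_repeats copies of unit.

    tail is an optional suffix that must follow the run, e.g. "GG" for the
    first two bases of the next Gly codon. Coordinates are 0-based, end-exclusive.
    """
    pattern = re.compile(rf"(?:{unit}){{{min_repeats},}}{tail}")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]

if __name__ == "__main__":
    toy = "AAA" + "GGACCTCCA" * 3 + "GGTAAA"  # made-up test sequence
    for start, end, hit in find_tandem_codons(toy, COLLAGEN_UNIT, 2, "GG"):
        print(start, end, hit)
```

For a whole genome this would presumably be run one chromosome at a time, and again on the reverse complement to cover the other strand.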