I am aligning a large number of ESTs. Poly-A tails seem to show up in many different forms. Besides occurring at the very end of the read, they can be flanked by cloning sequence on one end, or contain mismatches/errors. What is a good rule, or which available tools handle the usual cases?
Crossposted: StackExchange, Biostars.
A few examples of the non-trivial cases I found, with their Genbank Accs:
Code:
>EE409337 ... AAAAAAAAAAAAAAAAAAAAAAAAAGGAAAAAAAAAAAAAAAAAAAAAAAAAAAACCTTGTC
>EE409340 ... TTTCTACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACTTGTC
>EE409361 ... TTGTTAAACTGAAAAAAAAAAAAAAAAAAAAAAAAAAAACCATGTCGGC TTACTGAATTGAA
>EE420306 .... AAAAAAAGTTATGTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGAAAAAAA AAAAAAAAAAAAAAAAA
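To make the kind of rule I am asking about concrete, here is a minimal Python sketch of one possible approach: score each base (+1 for A, a penalty otherwise), take the best-scoring A-rich segment, and cut the read there so that any trailing cloning sequence goes with it. This is only an illustration, not an established tool. The names `find_polya` and `trim_polya`, the scores, and the `min_score` threshold are all my own assumptions and would need tuning on real data.

```python
def find_polya(seq, match=1, mismatch=-2, min_score=15):
    """Return (start, end) of the best-scoring A-rich segment, or None.

    Scoring values and min_score are illustrative assumptions.
    """
    best_score, best_span = 0, None
    score, start = 0, 0
    for i, base in enumerate(seq.upper()):
        # Restart the candidate segment whenever the running score is used up.
        if score <= 0:
            score, start = 0, i
        score += match if base == "A" else mismatch
        if score > best_score:
            best_score, best_span = score, (start, i + 1)
    if best_score < min_score:
        return None
    return best_span


def trim_polya(seq, **kwargs):
    """Drop the poly-A segment and everything 3' of it (e.g. cloning sequence)."""
    span = find_polya(seq, **kwargs)
    return seq if span is None else seq[:span[0]]


# Synthetic read: tail interrupted by "GG" and followed by flanking sequence.
read = "ACGTACGT" + "A" * 25 + "GG" + "A" * 28 + "CCTTGTC"
print(find_polya(read))
print(trim_polya(read))
```

Because a mismatch costs less than a long run of A's gains, short interruptions inside the tail are tolerated, while reads without a long A-rich stretch are returned unchanged.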