Can polyA tails occur within (rather than at the end of) a sequenced tag? Consider, for example, the following two sequences from NCBI: DY008075
>gi|119423037|gb|DY008075.1|DY008075 19ACACYS_UP_022_A11_29OCT2004_095 Brassica napus 19ACACYS Brassica napus cDNA 5', mRNA sequence
TGGTACGGTCAGATGCTTGCTAAAGGAGAAATAAATAGAGACATGGGTGATAGTATAAGCGGAAAGGGAA
TGATTCAGGGTGTTTCTGCAGTGGGAGCGTTTTACCAACTGCTTAGTCAGTCCAGCCTAAGTATATTGCA
TTCTGAAGAGAAGAAACCTGTGGCTCCGGTTGAATCATGTCCTATTTTGAAAACACTCTACAAGATACTC
ATCACAAGAGAACAATCAACACAAGCGATTCTGCAAGCATTAAGGGATGAAACACTGAATGACCCAAGAG
ACAGGATTGAGATTGCACAGAGCCATGCATTCTACAGGCCTTCCCTTCTAGATCAGCCTTGATTAGTCTG
TCATGGCTCATAATCCGAACTTCTAAGATCTTACTTGTGCAAACTGCAGATTCTGCTATGTTAAACATCA
TGTCTTAAAATTGATTGTTGTTCAGCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACATGTC
or EE485195
>gi|126492146|gb|EE485195.1|EE485195 DHBN8DCT_UP_012_C05_25FEB2005_043 BRASSICA NAPUS SEEDS BNDH8DCT Brassica napus cDNA 5', mRNA sequence
GTTACAGCTGGTTGAGAACAGTGACAATTCCCGGTTGAGCAAAGAAATTGCAGACAAGAGCCACCAACTA
AGGCAAATGAGAGGAGAGGAACTTCAAGGACTTAACATAGAAGAGCTGCAACAGCTGGAAAAGGCCCTTG
AAGCTGGTTTGACGCGCGTGATTGAAACAAAGAGTGAGAAGATTATGAGTGAGATCAGTGACCTTCAAAG
AAAGGGAATGAAATTGATGGATGAGAACAAGCGGCTAAGGCAGCATGGAACACAACTAACAGAAGAGAAC
GAGCGACTAGGCAAGCAAATATATAATAATATGCATGAAAGATACGGTGGTGTTGAGTCGGAGAAGACCG
CCGTGTACGAGGAAGGGCAGTCGTCAGAGTCCATTACTAACGCCGGAAACTCCACCGGCGCTCCTGTTGA
CTCCGAGAGCTCCGATACCTCTCTTAGGCTCGGCTTACCGTATGGCGGTTAGAGATGGAACCATACAAAG
AAGTTCATGGAGTGAGGAGATGCTCTGTAGTAACAAGTGGCAATGTAGTAATTTCTCTTGTTTGATGTAA
GTTTTTGTCTGAGGAAGAGGTTTTCCTTTTATGTTCTCTTTGATATTATTATCTTTCTTCACTGCAAAAA
AAAAAAAAAAAAAAAAAAAAAAAACATGTC
It seems to me that the polyA stretches near the end of both sequences are some sort of tail rather than sequence that actually codes for polylysine. If we BLAST either sequence, the polyA part does not align with any reliable nucleotide or protein sequence (i.e., in the NCBI non-redundant databases). I can give more examples and show their best alignments to nr sequences, but that would make the question too long.
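To make the observation concrete, here is a minimal Python sketch that locates long runs of A's and reports which bases follow each run. The function name `find_polya_runs` and the 15-base threshold are my own assumptions for illustration, not an established definition of a polyA tail; only the final lines of the two records quoted above are used as input.

```python
import re

def find_polya_runs(seq, min_len=15):
    """Find runs of at least min_len consecutive A's.

    Returns a list of (start, end, trailing) tuples, where start/end are
    0-based half-open coordinates and trailing is the number of bases
    that follow the run. min_len=15 is an arbitrary illustrative cutoff.
    """
    seq = seq.upper()
    pattern = re.compile("A{%d,}" % min_len)
    return [(m.start(), m.end(), len(seq) - m.end())
            for m in pattern.finditer(seq)]

# Final lines of the two ESTs exactly as quoted in the question.
ends = {
    "DY008075": "TGTCTTAAAATTGATTGTTGTTCAGCCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACATGTC",
    "EE485195": "GTTTTTGTCTGAGGAAGAGGTTTTCCTTTTATGTTCTCTTTGATATTATTATCTTTCTTCACTGCAAAAA"
                "AAAAAAAAAAAAAAAAAAAAAAAACATGTC",
}

for name, seq in ends.items():
    for start, end, trailing in find_polya_runs(seq):
        # A run with trailing > 0 is internal to the read, not terminal.
        print(name, "run length:", end - start, "bases after run:", seq[end:])
```

In both records the A run is followed by the same few bases (CATGTC), which is why the polyA looks internal rather than terminal. The sketch only reports positions; it cannot tell whether the trailing bases are genuine transcript sequence or something else.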
Cross-posted on StackExchange.