Hi,
What is the fastest way to parse a FASTA file and produce another FASTA file in which identical sequences are grouped and the number of occurrences is appended to the id (id_occurrences)?
Example input:
Code:
>seq1
ATGCATGC
>seq2
ATGCCCCC
>seq3
ATGCATGC
>seq4
ATGCGGGG
>seq5
ATGCCCCC
>seq6
ATGCATGC
>seq7
ATGCAAAA
>seq8
ATGCGGGG
Result:
Code:
>seq1_3
ATGCATGC
>seq2_2
ATGCCCCC
>seq3_2
ATGCGGGG
>seq4_1
ATGCAAAA
Thanks in advance,
N.
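One possible approach, given as a minimal sketch rather than a definitive answer: read the records, count identical sequences with a `collections.Counter`, and write one record per distinct sequence. The function names `read_fasta` and `collapse` and the `prefix` parameter are made up for illustration. The sketch assumes the whole file fits in memory, that sequences are compared exactly as written (case-sensitive), and that the new ids are numbered by descending count, with ties kept in order of first appearance, which is what the example result suggests.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs; a sequence may span several lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def collapse(lines, prefix="seq"):
    """Return FASTA lines with identical sequences grouped.

    Each new id is '<prefix><rank>_<count>' (hypothetical naming scheme,
    chosen to match the example in the question).
    """
    counts = Counter(seq for _, seq in read_fasta(lines))
    out = []
    # most_common() sorts by count, highest first; equal counts stay in
    # the order in which the sequences were first seen.
    for rank, (seq, n) in enumerate(counts.most_common(), start=1):
        out.append(f">{prefix}{rank}_{n}")
        out.append(seq)
    return out
```

To use it on a file, something like `with open("input.fasta") as fh: print("\n".join(collapse(fh)))` should work, where `input.fasta` is a placeholder file name. A single pass plus a hash-based count is linear in the input size, so for most data sets the run time will be dominated by reading the file.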