Hi,
I am new to NGS and know very little about the tools used for analyzing NGS-generated sequences. I have a problem that I think somebody with good Perl scripting skills may be able to help me with.
I have two FASTA files like the ones below:
File1:
>a
GVKKDVKCTTTGGG
.
.
>f
AAATTTGGGCCCEEE
>g
SSSGGGYYYTTTGTFR
.
.
>x
DDDGGGYYYTTTGTFR
.
.
.
File2:
>1
GVKKDVKCTTTGGG
.
.
>41
FFFGGGYYYTTTGTFR
.
.
>200
AAATTTGGGCCCEEE
.
.
>1000
SSSGGGYYYTTTGTFR
.
.
.
Many, but not all, sequences are identical between these two files. I would like to compare each sequence in the first file against the second file and produce the following table:
a GVKKDVKCTTTGGG 1
f AAATTTGGGCCCEEE 200
g SSSGGGYYYTTTGTFR 1000
x DDDGGGYYYTTTGTFR 0
.
.
.
In the table, the first column is the header of each sequence in the first file, the second column is the sequence itself, and the third column is the header of the sequence in the second file that is identical to it. If the second file contains no identical sequence, the third column should be zero.
I would appreciate it if someone could help me out.
Thanks a lot.
Acyrocks
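
One possible way to build the table described above, as a minimal sketch in Python rather than the Perl asked for. It assumes both files are plain FASTA, that "identical" means an exact, case-sensitive string match over the whole sequence, and that when several records in the second file share a sequence the first header is the one reported. The function names `read_fasta` and `match_table` are made up for this illustration.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file.

    Sequences wrapped over several lines are joined into one string.
    The header is the text after '>' up to the first whitespace.
    """
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                parts = line[1:].split()
                header = parts[0] if parts else ""
                chunks = []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def match_table(file1, file2):
    """Return (header1, sequence, header2) rows for every record in file1.

    header2 is "0" when file2 has no identical sequence (assumed exact match).
    """
    # Index file2 by sequence; setdefault keeps the first header seen.
    seq_to_header = {}
    for header, seq in read_fasta(file2):
        seq_to_header.setdefault(seq, header)
    return [(h, s, seq_to_header.get(s, "0")) for h, s in read_fasta(file1)]
```

Calling `match_table("file1.fasta", "file2.fasta")` (hypothetical file names) and printing each row with `" ".join(row)` would give one line per record of the first file. Indexing the second file in a dictionary means each file is read only once, which should matter for NGS-sized inputs.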