Hi,
I would like to hear any suggestions on joining two FASTA files. Here is my problem:
I have two large FASTA files, each around 1 GB.
F1.fasta contains:
AB001.1
GTAGTGTGAGGTGTGT
AB002.1
GTAGTGTGAGGTGTGT
AB005.1
GTAGTGTGAGGTGTGT
And F2.fasta contains:
AB005.2
GTAGTGTGAGGTGTGT
AB006.2
GTAGTGTGAGGTGTGT
I made up the sequences, but as you can see, they are actually paired-end reads from an Illumina HiSeq. The files were trimmed, so they contain different numbers of sequences. I wonder if there is any program I can use to find the paired sequences (e.g. AB005.1 + AB005.2) and put a string of "--------------" between them.
So the output FASTA would look like:
AB005
GTAGTGTGAGGTGTGT-------------------GTAGTGTGAGGTGTGT
I would appreciate it if you could point me to any program, or any command I can use in R or Python.
Thanks!
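
For reference, here is a minimal Python sketch of the behaviour described above. It is not a tested tool, and it rests on several assumptions: header lines start with ">" as in standard FASTA, mates differ only in the suffix after the last "." (".1" and ".2"), and the second file fits in memory as a dictionary (for a 1 GB file that can take considerably more than 1 GB of RAM). The function names read_fasta, base_id and join_pairs and the spacer length are hypothetical choices made for illustration.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA handle.

    Assumes header lines start with '>'; multi-line sequences are joined.
    """
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            fields = line[1:].split()
            header, chunks = (fields[0] if fields else ""), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def base_id(header):
    """Drop the mate suffix after the last '.', e.g. 'AB005.1' -> 'AB005'."""
    return header.rsplit(".", 1)[0]


def join_pairs(f1_handle, f2_handle, out_handle, spacer="-" * 14):
    """Write one joined record per ID present in both files; return the count."""
    # Index the second file by base ID (assumes it fits in memory).
    mates = {base_id(h): s for h, s in read_fasta(f2_handle)}
    written = 0
    # Stream the first file and emit only reads that have a mate.
    for header, seq in read_fasta(f1_handle):
        key = base_id(header)
        mate = mates.get(key)
        if mate is not None:
            out_handle.write(f">{key}\n{seq}{spacer}{mate}\n")
            written += 1
    return written


# Small in-memory demo using the made-up records from the post.
f1 = io.StringIO(">AB001.1\nGTAGTGTGAGGTGTGT\n>AB005.1\nGTAGTGTGAGGTGTGT\n")
f2 = io.StringIO(">AB005.2\nGTAGTGTGAGGTGTGT\n>AB006.2\nGTAGTGTGAGGTGTGT\n")
out = io.StringIO()
join_pairs(f1, f2, out)
print(out.getvalue(), end="")
```

With real files, the same function could be called with handles from open("F1.fasta"), open("F2.fasta") and open("joined.fasta", "w"). Reads without a mate are simply skipped, and the output follows the order of the first file.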