Hi all,
I need to concatenate a bunch of sequences in a FASTA file. I have a file of extracted introns and would like to essentially splice them all together for use in a program. Is there a way to do either of the following using Perl (preferably) or Python (if necessary)?
1. Join all the introns of a single gene, preserving the FASTA heading for that gene.
>Gene 1 Intron 1
GTACGCC....CTGATAGAG
>Gene 1 Intron 2
GTCCAGGAC.....CTGAGTAAG
Becomes
>Gene 1 Intron 1
GTACGCC....CTGATAGAGGTCCAGGAC.....CTGAGTAAG
or
2. Join a number of introns together (not accounting for what genes they came from) under a non-specific FASTA formatted heading?
Basically I want to splice together a bunch of intron sequences like they were exons so that I can run them through a program that doesn't like how short they are. The first way would be the most biologically relevant and useful for my purposes, but if it can't be done I can live with it haha. Any help would be greatly appreciated. Thanks a lot!
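For what it's worth, both options can probably be handled with a short script. Below is a minimal Python sketch, not a tested pipeline. It assumes the headers really look like "Gene 1 Intron 2", so that everything before the word "Intron" identifies the gene; if the real headers differ, the gene_key function has to change. All function names and the default "concatenated_introns" header are made up for illustration.

```python
import re


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Drop the ">" and any space after it, so "> Gene 1" == ">Gene 1"
            header, chunks = line[1:].strip(), []
        else:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def gene_key(header):
    """Assumption: the gene name is everything before the word 'Intron'."""
    return re.split(r"\s+Intron\b", header, maxsplit=1, flags=re.IGNORECASE)[0].strip()


def join_by_gene(records):
    """Option 1: one record per gene, keeping the first intron's header.

    Introns are joined in the order they appear in the file.
    """
    merged = {}  # dicts keep insertion order (Python 3.7+)
    for header, seq in records:
        key = gene_key(header)
        if key not in merged:
            merged[key] = [header, []]
        merged[key][1].append(seq)
    return [(header, "".join(parts)) for header, parts in merged.values()]


def join_all(records, header="concatenated_introns"):
    """Option 2: everything under a single, non-specific header."""
    return [(header, "".join(seq for _, seq in records))]


def write_fasta(records, width=60):
    """Format records as FASTA text, wrapping sequences at `width` columns."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        for i in range(0, len(seq), width):
            out.append(seq[i:i + width])
    return "\n".join(out) + "\n"
```

To use it, open the intron file, pass the file handle to read_fasta, wrap the result in list(...), and hand that list to join_by_gene or join_all; write_fasta then turns the result back into FASTA text. Note that option 1 only gives introns in the right biological order if they already appear in that order in the input file.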