Hi,
I've got a custom BLAST database created from a FASTA file containing a large number of short sequences:
>seqA
GATCGATAGTTAGACTAGTAAGCAAA
>seqB
AAATCACAGTGACTAGTAGAGAGATT
>seqC
AAGGCCCCTATATAGACTGACTAGTA
and so on.
I've sequenced a large library of molecules that are ligation products between two of the sequences held in this custom DB, e.g. seqA-seqC (GATCGATAGTTAGACTAGTAAGCAAAAAGGCCCCTATATAGACTGACTAGTA).
I need to BLAST my sequencing data against the custom database, tally how many times each possible ligation product is present, and output the counts keyed by the FASTA labels, e.g. something like this:
seqA-seqC 54
seqA-seqB 102
Is this something that BLAST/BLAST+ can do intrinsically, or does it have to be done in Perl/Python? Does anybody have a script that does something similar, which I could look at as a framework to build upon?
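To make the goal concrete, here is a rough Python sketch of the tally step I have in mind. It assumes the reads have already been run through BLAST+ with tabular output (`-outfmt 6`, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The names `tally_ligation_products` and `example_rows` are made up for illustration, and the "exactly two hits per read" rule is my own simplifying assumption, not something BLAST enforces.

```python
from collections import Counter, defaultdict


def tally_ligation_products(lines):
    """Count ordered pairs of database hits per read from BLAST tabular rows."""
    hits = defaultdict(list)  # read id -> [(query start, subject id), ...]
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        qseqid, sseqid = cols[0], cols[1]
        qstart, qend = int(cols[6]), int(cols[7])
        hits[qseqid].append((min(qstart, qend), sseqid))

    counts = Counter()
    for read_hits in hits.values():
        # Assumption: a valid ligation product has exactly two hits;
        # reads with more or fewer hits are ignored here.
        if len(read_hits) != 2:
            continue
        first, second = sorted(read_hits)  # order by position in the read
        counts[f"{first[1]}-{second[1]}"] += 1
    return counts


# Hand-written rows standing in for real BLAST output (not real results).
example_rows = [
    "read1\tseqA\t100.0\t26\t0\t0\t1\t26\t1\t26\t1e-10\t52.0",
    "read1\tseqC\t100.0\t26\t0\t0\t27\t52\t1\t26\t1e-10\t52.0",
    "read2\tseqC\t100.0\t26\t0\t0\t27\t52\t1\t26\t1e-10\t52.0",
    "read2\tseqA\t100.0\t26\t0\t0\t1\t26\t1\t26\t1e-10\t52.0",
    "read3\tseqA\t100.0\t26\t0\t0\t1\t26\t1\t26\t1e-10\t52.0",
    "read3\tseqB\t100.0\t26\t0\t0\t27\t52\t1\t26\t1e-10\t52.0",
    "read4\tseqB\t100.0\t26\t0\t0\t1\t26\t1\t26\t1e-10\t52.0",
]

counts = tally_ligation_products(example_rows)
for label, n in counts.most_common():
    print(label, n)
```

With real data the rows would come from the BLAST output file rather than a list, and some filtering on identity or alignment length would probably be needed first, since short sequences like these can give spurious partial hits across the ligation junction.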
Thanks!
- Julia