I have obtained a .coords file from MUMmer after aligning my assembly to the reference genome.
I have been asked to write a script that outputs two FASTA files:
one containing the sequences that aligned to the reference and one containing the sequences that did not align to the reference.
I'm not sure how to go about this. How am I supposed to use the coordinates for the sequences that aligned in order to do this?
Also, won't there be parts of sequences that don't align? How do I go about collecting those pieces and putting them in the second FASTA file?
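
To make the question concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It assumes the .coords file was written by `show-coords -T -H` (tab-delimited, no header), so that columns 3 and 4 hold the query start and end (1-based, inclusive, with start greater than end for reverse-strand hits) and the last column holds the query sequence ID. All function names are hypothetical, and the `name:start-end` header format is just an illustration. The idea is to merge the aligned intervals per assembly sequence, write those slices to the first file, and write the gaps between them to the second.

```python
from collections import defaultdict


def parse_coords(lines):
    """Collect query intervals (0-based, half-open) per query sequence ID.

    Assumes headerless, tab-delimited show-coords output (-T -H):
    S1 E1 S2 E2 LEN1 LEN2 %IDY ... REF_ID QUERY_ID
    """
    hits = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        s2, e2 = int(fields[2]), int(fields[3])
        start, end = min(s2, e2), max(s2, e2)  # reverse-strand hits have S2 > E2
        hits[fields[-1]].append((start - 1, end))  # 1-based inclusive -> 0-based half-open
    return hits


def merge_intervals(intervals):
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def split_sequence(seq, intervals):
    """Return (aligned, unaligned) lists of (start, end, subsequence) pieces."""
    aligned, unaligned = [], []
    pos = 0
    for start, end in merge_intervals(intervals):
        if start > pos:
            unaligned.append((pos, start, seq[pos:start]))  # gap before this hit
        aligned.append((start, end, seq[start:end]))
        pos = end
    if pos < len(seq):
        unaligned.append((pos, len(seq), seq[pos:]))  # tail after the last hit
    return aligned, unaligned


def read_fasta(lines):
    """Yield (name, sequence) pairs; the name is the first word of the header."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def write_split_fastas(assembly_path, coords_path, aligned_path, unaligned_path):
    """Write aligned pieces to one FASTA file and unaligned pieces to another."""
    with open(coords_path) as fh:
        hits = parse_coords(fh)
    with open(assembly_path) as fa:
        with open(aligned_path, "w") as out_a, open(unaligned_path, "w") as out_u:
            for name, seq in read_fasta(fa):
                aligned, unaligned = split_sequence(seq, hits.get(name, []))
                for out, pieces in ((out_a, aligned), (out_u, unaligned)):
                    for start, end, piece in pieces:
                        # Header coordinates are reported 1-based, inclusive.
                        out.write(f">{name}:{start + 1}-{end}\n{piece}\n")
```

With this layout, an assembly sequence that has no hits at all ends up whole in the second file, and a partially aligned sequence is split across both files. Whether partially aligned sequences should be split like this or kept whole in one file is a design choice that depends on what the two files are for.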