I have a folder with sample files (81 total .fasta) from a barcoded MiSeq run.
Each sample file contains consensus sequences for up to 53 targets.
Each .fasta is organized so that the header line (">") gives the locus name (AT#G######), followed by the consensus sequence. I need to search all sample files (81 taxa in total) and create a new .fasta file for each locus that lists the name of each taxon, followed by that taxon's consensus sequence for the locus.
With some help from Stack Exchange, I have a script that does almost all of this. I've run into one hang-up: the new locus .fasta files are not merged across taxa, so I get a .fasta for locus ATXGXXXXX containing Sample_1 only, a separate .fasta for Sample_2 for the same locus, and so on for all samples. I can't find a command to merge all sample sequences for locus ATXGXXXXX into the same .fasta.
Does anyone have any thoughts? Here is the script:
awk '
FNR==1 { sample = FILENAME ; sub(/\.fasta/, "", sample )}
/^>/ { target = substr($0,2)".fasta" ; next }
{ print items ">" sample > target ; print > target; close(target) }
' C_*.fasta
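One thing that may be worth checking, offered as a sketch rather than a confirmed fix: in awk, `print > file` truncates the file each time it is reopened after `close(file)`, so every new sample can overwrite what earlier samples wrote to the same locus file. Switching to the append operator `>>` keeps the earlier records. The function name `merge_by_locus` is made up for this example, and the sketch assumes each consensus sequence sits on a single line and that no locus .fasta files from a previous run are left in the output directory (appending would duplicate records).

```shell
# Hypothetical helper: merge_by_locus is a name invented for this sketch.
merge_by_locus() {
    awk '
    # First line of each input file: derive the sample name from the file name.
    FNR == 1 { sample = FILENAME; sub(/\.fasta$/, "", sample) }
    # Header line: the locus name decides which output file gets the record.
    /^>/ { target = substr($0, 2) ".fasta"; next }
    # Non-empty sequence line: append (>>) so earlier samples are kept.
    NF {
        print ">" sample >> target
        print >> target
        close(target)   # avoid running out of open file handles
    }
    ' "$@"
}

# Usage (assumes the C_*.fasta sample files are in the current directory):
# merge_by_locus C_*.fasta
```

With this change each locus file should collect one `>sample` header and one sequence line per taxon that has the locus. Delete any old locus files before re-running.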