Hello,
This seems like a simple enough question but I can't find a straight answer...
I want to know how many SNP differences there are between each of my samples (=genomes). My dataset is composed of 65 bacterial genomes. I used kSNP3 to call the SNPs from the genomes using the core option, and SNP-sites to generate the VCF file from the alignment. And now I am completely stuck, for something that looks really trivial.
The fasta alignment looks like:
>seq1
AAATTTCCCGGG
>seq2
CAATTTCCCGGG
>seq3
CAAGTTCCCGGG
The sequences are the concatenated core SNPs of my whole dataset. Thus I have 1 sequence per sample, and they are aligned and all of exactly the same length (roughly 40 000 bp long).
The output I am looking for is the exact number of SNP differences (or similarities) between each pair of sequences:
      seq1  seq2  seq3
seq1  0
seq2  1     0
seq3  2     1     0
etc...
Does anyone know a simple way to get from either the alignment or the resulting VCF file to the dissimilarity matrix? I have been looking into different software for 2 days now without success...
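For reference, the count described above can be sketched directly from the FASTA alignment in plain Python, with no external packages. This is only a minimal sketch under stated assumptions: the function names (`read_fasta`, `snp_distance`, `pairwise_matrix`, `format_lower_triangle`) are made up for illustration, all sequences are assumed to be the same length, and positions where either sequence has `N`, `-` or `?` are skipped rather than counted, which may or may not be the behaviour wanted.

```python
from itertools import combinations


def read_fasta(lines):
    """Parse FASTA lines into an ordered {name: sequence} dict."""
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the ID before any whitespace
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def snp_distance(a, b, ignore="N-?"):
    """Count positions where two aligned sequences differ.

    Assumption: positions with an ambiguous base or gap in either
    sequence are skipped instead of being counted as a difference.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in ignore and y not in ignore
    )


def pairwise_matrix(seqs):
    """Return (names, dist) where dist[a][b] is the SNP count between a and b."""
    names = list(seqs)
    dist = {n: {n: 0} for n in names}
    for a, b in combinations(names, 2):
        d = snp_distance(seqs[a], seqs[b])
        dist[a][b] = d
        dist[b][a] = d
    return names, dist


def format_lower_triangle(names, dist):
    """Render the lower triangle as tab-separated text."""
    rows = ["\t" + "\t".join(names)]
    for i, n in enumerate(names):
        cells = (str(dist[n][m]) for m in names[: i + 1])
        rows.append(n + "\t" + "\t".join(cells))
    return "\n".join(rows)


# The three example sequences from the post, inlined so the sketch runs as-is.
example = """>seq1
AAATTTCCCGGG
>seq2
CAATTTCCCGGG
>seq3
CAAGTTCCCGGG
""".splitlines()

seqs = read_fasta(example)
names, dist = pairwise_matrix(seqs)
print(format_lower_triangle(names, dist))
```

For a real file, the inlined `example` lines could be replaced with an open file handle, e.g. `with open("core_SNPs.fasta") as fh: seqs = read_fasta(fh)`, where the file name is a placeholder. With 65 sequences of roughly 40 000 bp this is only 2 080 pairwise comparisons, so a pure-Python loop should be fast enough.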