Dear all,
I'm looking for suggestions on how to analyse the sequence variation of a couple of exons, so only very short sequences. FYI, I haven't worked on this topic at all so far, and this is a request from one of our cooperation partners. So if there are any false assumptions in what I've put together so far, please correct me.
Thus far, I thought it might be best to follow samtools mpileup as described here.
My idea was to use the sequences of my exons in FASTA format and provide them as ref.fa for samtools mpileup to obtain a VCF file and proceed with bcftools.
Now, I am not sure if this limits mpileup to only the locations I am interested in. Furthermore, does the VCF file contain information about every sample?
Code:
samtools mpileup -uf ref.fa sample1.bam sample2.bam | bcftools view -bvcg - > var.raw.bcf
bcftools view var.raw.bcf | vcfutils.pl varFilter -D100 > var.flt.vcf
So for instance, sample1 has a SNP at position X whilst sample2 has none.
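To make that concrete: as far as I understand the VCF format, a multi-sample file has one column per sample after the FORMAT column, so a per-sample listing could be pulled out roughly as below. This is only a sketch. The function name per_sample_genotypes is my own invention, and it assumes an uncompressed VCF in which GT is the first FORMAT subfield.

```shell
# Hypothetical helper (not part of samtools/bcftools): print one line per
# variant and sample with CHROM, POS, sample name and genotype (GT).
# Assumes an uncompressed, tab-separated VCF on stdin with GT as the first
# colon-separated subfield of each sample column.
per_sample_genotypes() {
  awk -F'\t' '
    /^##/ { next }                        # skip meta-information lines
    /^#CHROM/ {                           # header: columns 10 and up are sample names
      for (i = 10; i <= NF; i++) name[i] = $i
      next
    }
    {
      for (i = 10; i <= NF; i++) {
        split($i, field, ":")             # e.g. 0/1:30,0,40:35 -> field[1] is GT
        print $1 "\t" $2 "\t" name[i] "\t" field[1]
      }
    }
  '
}
```

It would be called as `per_sample_genotypes < var.flt.vcf`. A site where sample1 shows 0/1 and sample2 shows 0/0 would then be the case I describe above.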
Best regards