The 1000 Genomes Project provides information about the variation in thousands of people's DNA sequences relative to the human reference sequence. The variation is stored in VCF format. For each person in the project, the VCF file gives his or her variants, for example the type of each variant (e.g. SNP or insertion/deletion) and its position relative to the reference. The reference is in FASTA format. By combining one person's variants from the VCF file with the human reference in the FASTA file, I want to construct the DNA sequence for that person.
My question is: do tools already exist that perform this task well, or do I have to write the scripts myself?
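To make the task concrete, here is a minimal Python sketch of the operation I have in mind, for one chromosome and one haplotype. It is only an illustration under several assumptions: the function name `apply_variants` and the `(pos, ref, alt)` tuples are hypothetical, the variants are assumed to have been extracted from the VCF for a single sample already, and phasing, multi-allelic records, and symbolic alleles are ignored.

```python
def apply_variants(reference, variants):
    """Return the reference sequence with (pos, ref, alt) variants applied.

    pos is 1-based, as in VCF; ref and alt are the REF and ALT alleles.
    Hypothetical helper for illustration, not part of any library.
    """
    pieces = []
    cursor = 0  # 0-based index of the next reference base not yet copied
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            # Overlaps a variant that was already applied; skip it.
            continue
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF mismatch at position {pos}")
        pieces.append(reference[cursor:start])  # unchanged bases before the variant
        pieces.append(alt)                      # substitute the alternate allele
        cursor = start + len(ref)               # jump past the replaced REF bases
    pieces.append(reference[cursor:])           # remaining tail of the reference
    return "".join(pieces)


# One SNP, one insertion, and one deletion on a toy reference.
toy_variants = [(2, "C", "T"), (4, "T", "TGG"), (6, "CG", "C")]
print(apply_variants("ACGTACGT", toy_variants))  # ATGTGGACT
```

Working from the sorted variants with a moving cursor avoids the coordinate shifts that insertions and deletions would otherwise cause, since every position is still interpreted against the original reference.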