Hi all
I'm looking for some inspiration/guidance on processing NGS data from haploid individuals. In a nutshell, we've sequenced ~30 individuals from a single population at ~6X coverage. Following mapping, processing, and SNP calling, each individual has ~600K SNPs.
At this depth of coverage, ~70% of the genome is callable for each individual, and it stands to reason that individuals will not have identical genome coverage. So, to generate a single VCF for the population without sacrificing an enormous number of SNPs, it would make sense to impute missing genotypes. To this end I have experimented with Beagle (v4) on a single chromosome, and it appeared to do the job. On closer examination, however, the results indicated ~25% heterozygosity in each haploid individual. In addition, all samples were on average 85% identical by state, including 2 individuals which had been sequenced twice and should provide a reliable control for the methodology.
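For concreteness, here is a minimal sketch of how the two sanity checks above (per-sample heterozygosity and pairwise identity by state) can be computed from VCF-style GT strings. The function names, sample names, and toy genotypes are all hypothetical and only illustrate the calculation; this is not the exact script used on the real data, and a real run would read the GT fields from the imputed VCF instead.

```python
from itertools import combinations


def parse_gt(gt):
    """Split a VCF GT string ('0/1', '1|1', '1', './.') into a tuple of alleles.

    Returns None if any allele is missing ('.').
    """
    alleles = gt.replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return None
    return tuple(alleles)


def het_fraction(gts):
    """Fraction of called genotypes carrying more than one distinct allele."""
    called = [a for a in map(parse_gt, gts) if a is not None]
    if not called:
        return float("nan")
    het = sum(1 for a in called if len(set(a)) > 1)
    return het / len(called)


def ibs_fraction(gts_a, gts_b):
    """Fraction of sites called in both samples whose genotypes match.

    Phase is ignored, so '0/1' and '1|0' count as identical.
    """
    shared = same = 0
    for ga, gb in zip(gts_a, gts_b):
        a, b = parse_gt(ga), parse_gt(gb)
        if a is None or b is None:
            continue
        shared += 1
        if sorted(a) == sorted(b):
            same += 1
    return same / shared if shared else float("nan")


# Hypothetical toy data: one GT string per site, same site order in every sample.
genotypes = {
    "ind1": ["0/0", "1/1", "0/1", "./.", "1/1"],
    "ind1_rep": ["0/0", "1/1", "0/0", "1/1", "1/1"],
    "ind2": ["1/1", "0/0", "0|1", "0/0", "./."],
}

for name, gts in genotypes.items():
    print(name, "het fraction:", het_fraction(gts))

for a, b in combinations(genotypes, 2):
    print(a, b, "IBS:", ibs_fraction(genotypes[a], genotypes[b]))
```

For truly haploid samples the heterozygous fraction should be close to zero, and the two replicate pairs should show IBS close to 1, which is why the ~25% and 85% figures look wrong to me.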
Is anyone with experience in processing NGS data for haploids able to offer any insights/suggestions?
D