This is not really a "next gen" question, but I am not sure where else to ask. I would welcome suggestions for alternative venues.
I am considering using ABI 3730 machines to Sanger sequence a fairly large number of samples across a locus of approximately 10 kb. The question I am investigating is haplotype-oriented, and I am thinking about pooling the samples by haplotypes estimated from Affymetrix 6.0 (Affy 6) genotypes. The idea is that the variants I am seeking are fairly common on these haplotypes, with a frequency of maybe 25%. The haplotypes will be rare and mostly occur on only one of the two chromosomes in each sample, so about one eighth of the chromosomes in a pool (25% × 1/2) would carry the variants I'm targeting. I could tolerate a relatively high error rate in calling the variants, because the calls will be used in a kind of rare-variant burden test.
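To make the arithmetic concrete, here is a rough Python sketch of the expected signal. The function names and the pool size in the usage note are hypothetical, chosen only for illustration; it assumes diploid samples, exactly one copy of the target haplotype per sample, equal DNA contribution from every sample in the pool, and independence between samples.

```python
from math import comb


def expected_variant_fraction(freq_on_haplotype=0.25, haplotype_copies=1, ploidy=2):
    """Expected fraction of chromosomes in a pool that carry the variant.

    Assumes every pooled sample carries `haplotype_copies` copies of the
    target haplotype out of `ploidy` chromosomes.
    """
    return freq_on_haplotype * haplotype_copies / ploidy


def pool_allele_fraction(n_carriers, n_samples, ploidy=2):
    """Variant allele fraction in one pool with `n_carriers` heterozygous carriers."""
    return n_carriers / (ploidy * n_samples)


def prob_carrier_count(n_samples, n_carriers, freq_on_haplotype=0.25):
    """Binomial probability that exactly `n_carriers` of `n_samples` carry the variant."""
    p = freq_on_haplotype
    return comb(n_samples, n_carriers) * p**n_carriers * (1 - p) ** (n_samples - n_carriers)


def prob_pool_has_variant(n_samples, freq_on_haplotype=0.25):
    """Probability that at least one sample in the pool carries the variant."""
    return 1 - (1 - freq_on_haplotype) ** n_samples
```

For example, `expected_variant_fraction()` gives 0.125, the one-eighth figure above, while `pool_allele_fraction(1, 4)` shows that a single carrier in a hypothetical pool of four samples also contributes one eighth of the template. Whether a minor peak at that level is callable on a 3730 trace is exactly what I am unsure about.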
Is this just a crazy idea, because the Sanger data is analog? I'd be grateful for references to papers describing this kind of design, if any, or even just "this is crazy," if it is.
The trouble with sequencing the pools with nextgen technology is that the locus is relatively small, so structuring the samples efficiently would require something like a ludicrously complicated barcoding arrangement. However, if there is a sensible way to query such a small locus with nextgen tech, I would love to hear about that, too.