**henry.wood** · 11-12-2010, 08:06 AM

One of the problems with anonymous pooling is that a heterozygous SNP in one of your 25 samples will be present in only 2% of your reads for that region (1 of 50 chromosomes), which isn't much higher than the error rate. However, if the samples are barcoded, then the reads carrying the SNP will all be assigned to one person, where they make up about half of that person's reads, so the SNP is statistically a lot easier to spot.
I do understand why you wouldn't want to make 1000 libraries though.
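
To put rough numbers on the 2% point, here is a minimal sketch in Python. It assumes every sample contributes equally to the pool, that reads are independent, and that all sequencing errors at a site produce the same alternate base (a pessimistic simplification). The depth, error rate and significance level in the usage lines are illustrative assumptions, not values from this thread.

```python
from math import comb


def singleton_het_fraction(samples_per_pool: int) -> float:
    """Expected read fraction for one heterozygous carrier in a pool of diploid samples."""
    return 1.0 / (2 * samples_per_pool)


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


def min_alt_reads(depth: int, error_rate: float, alpha: float) -> int:
    """Smallest alt-read count that errors alone reach with probability <= alpha."""
    k = 0
    while prob_at_least(k, depth, error_rate) > alpha:
        k += 1
    return k


if __name__ == "__main__":
    depth, error_rate, alpha = 1000, 0.005, 1e-3  # assumed values, for illustration only
    threshold = min_alt_reads(depth, error_rate, alpha)
    pooled = singleton_het_fraction(25)  # 0.02 for 25 samples per pool
    print("alt reads needed to call:", threshold)
    print("power, anonymous pool of 25:", prob_at_least(threshold, depth, pooled))
    # A barcoded sample gets 1/25 of the depth but carries the allele in ~50% of its reads.
    per_sample_depth = depth // 25
    per_sample_threshold = min_alt_reads(per_sample_depth, error_rate, alpha)
    print("power, one barcoded sample:", prob_at_least(per_sample_threshold, per_sample_depth, 0.5))
```

The comparison is only as good as the assumed error model; real error rates vary by site and strand, which is part of why the pooled case is harder in practice.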

**andymay** · 11-12-2010, 02:09 PM

Have you considered doing this by PCR rather than capture? We have developed a system that allows simple preparation of PCR products in which the library preparation and barcoding take place during amplification. It scales well with large numbers of samples and amplicons, and works with both 454 and Illumina sequencing.

**snapper** · 11-15-2010, 05:16 AM

Thanks both - we are committed to the pulldown capture following a pilot project, so PCR is not currently an option. In the pilot, the analysis did seem to work reasonably well at identifying even a single het call within the pooled system, although clearly the false positive rate will be higher than if we barcoded. However, the follow-up through the pools is potentially fairly substantial.

**Loris** · 11-15-2010, 06:08 AM

If you are willing to make 40 pooled libraries, would you be willing to make 80?
If you put each sample into two internally anonymous libraries - and sequence at sufficient depth - then you will be able to determine which sample a near-unique variant came from. It'll probably work okay for rare mutations, although the more common they are, the more follow-up work is required.

**westerman** · 11-15-2010, 12:17 PM

I agree with Loris. A good way is to make 2 times the libraries in a row/column pool fashion. We used to do this with 'overgos' (ref: https://www.ncbi.nlm.nih.gov/project...chOvergo.shtml) and the same idea should be applicable to any sequencing project.
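
As a toy illustration of the row/column idea, the sketch below lays samples out on a grid, puts each sample into one row pool and one column pool, and reports which samples are consistent with the set of pools in which a variant was seen. The function names, pool labels and the 5 x 5 layout are hypothetical and chosen only for the example; a real design would need to fit the actual sample and library counts.

```python
def grid_pools(samples, n_cols):
    """Assign each sample to one row pool and one column pool of a grid."""
    layout = {}
    for idx, sample in enumerate(samples):
        layout[sample] = (f"row{idx // n_cols}", f"col{idx % n_cols}")
    return layout


def candidate_carriers(layout, positive_pools):
    """Samples whose row pool and column pool both show the variant."""
    positives = set(positive_pools)
    return sorted(
        sample
        for sample, (row, col) in layout.items()
        if row in positives and col in positives
    )


if __name__ == "__main__":
    samples = [f"S{i:02d}" for i in range(25)]  # 25 samples -> 5 row + 5 column pools
    layout = grid_pools(samples, n_cols=5)

    # A variant seen in exactly one row pool and one column pool points to one sample.
    print(candidate_carriers(layout, ["row1", "col3"]))

    # Two carriers in different rows and columns light up four intersections,
    # so more common variants leave more candidates to follow up.
    print(candidate_carriers(layout, ["row0", "row1", "col0", "col1"]))
```

With one carrier the intersection is unique; with two or more, the candidate list grows, which matches the point above that common variants need more follow-up.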

Also, I agree with henry.wood that 2% is getting very close to the noise level. In theory, with enough sequencing depth we should be able to detect variants below 1%, but in practice I find this hard to accomplish, judging by the spiked controls we have used.

**krobison** · 11-15-2010, 10:35 PM

WRT pooling, you might also look at DNA Sudoku.

As per the comments above, you could also see this as an optimization problem: what is the smallest number of pooled libraries that gives acceptable sensitivity, accepting some loss of the ability to localize a variant precisely in the first run (i.e. instead of 1 pooled anonymous library, what about 2 each with half the samples, 4 each with 1/4, etc.)?
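
One way to look at that trade-off is to tabulate, for each candidate number of pools, how many samples land in a pool and what read fraction a single heterozygous carrier would show. This is only a sketch: it assumes diploid samples, equal pool sizes (rounded up) and equal representation of every sample, and the function name is made up for the example.

```python
def pooling_options(n_samples, pool_counts):
    """For each pool count, return (pools, samples per pool, singleton het read fraction)."""
    rows = []
    for n_pools in pool_counts:
        pool_size = -(-n_samples // n_pools)  # ceiling division
        rows.append((n_pools, pool_size, 1.0 / (2 * pool_size)))
    return rows


if __name__ == "__main__":
    # Halving the pool size doubles the expected fraction of variant reads.
    for n_pools, pool_size, fraction in pooling_options(100, [1, 2, 4]):
        print(f"{n_pools} pool(s) of {pool_size}: singleton het at {fraction:.1%} of reads")
```

The smallest acceptable number of pools is then the first row whose fraction sits comfortably above whatever noise floor the sequencing and capture actually deliver.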

If you haven't run this SureSelect design yet, beware that you may get uneven coverage: some regions will capture much more than others, which further complicates designing in the right sensitivity. Also, I believe Agilent still recommends capturing each library separately, though you will certainly find folks here discussing capturing pooled libraries.

**snapper** · 11-18-2010, 02:51 AM

Thanks all - this is extremely helpful.

**mrivas** · 11-22-2010, 09:59 AM

Hello Snapper,

I developed Syzygy while at the Broad Institute. Syzygy performs well with 25 individuals per pool. In fact, we have several small targeted experiments that we designed with 50 individuals per pool (100 chromosomes) across 10 pools. We observe a high validation rate (~90%) for all variants, singletons and above. You can get more information about Syzygy from:

http://www.broadinstitute.org/software/syzygy/

We are currently optimizing Syzygy to deal with larger target sizes. The intended target size was approximately 60-100 kb.

Best Regards,
Manuel Rivas

**gfmgfm** · 01-13-2011, 12:59 PM

Hello Manuel Rivas,

I have a pooled experiment with target size of ~803 kb.
Can I use Syzygy?

If not, does anyone have suggestions for what tool to use to call the SNPs from a pooled run (10 individuals in one Illumina run)?

**james hadfield** · 01-13-2011, 01:23 PM

Check out http://genomebiology.com/2011/12/1/R1/abstract in the latest Genome Biology:

A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries

**mrivas** · 01-13-2011, 01:28 PM

Originally posted by gfmgfm:

> Hello Manuel Rivas,
>
> I have a pooled experiment with target size of ~803 kb.
> Can I use Syzygy?
>
> If not, does anyone have suggestions for what tool to use to call the SNPs from a pooled run (10 individuals in one Illumina run)?

Yes, you can use Syzygy. I am uploading an optimized version of Syzygy in the next couple of days; it should handle an 800 kb target without a problem. You can send an e-mail to [email protected]

http://www.broadinstitute.org/software/syzygy/

That is the software's website.

Best Regards,
Manuel

**mrivas** · 01-13-2011, 01:29 PM

The current version handles 800 kb target size without a problem.

**gfmgfm** · 01-14-2011, 06:37 AM

Great.
Thanks!

**vbansal** · 01-14-2011, 11:09 AM

For calling variants from pooled sequencing data, you can also try CRISP, a method specifically designed to detect variants using sequence reads from multiple pools (each with a moderate number of individuals). The statistical model behind CRISP is described in this Bioinformatics article http://bioinformatics.oxfordjournals...i318.full?etoc

A Python implementation of CRISP is available here: http://polymorphism.scripps.edu/~vba...oftware/CRISP/
A faster and more accurate C implementation is under development and is available on request. We have used CRISP to call variants (both SNPs and short indels) from pooled sequencing of ~600 kb of DNA (captured using Agilent SureSelect) from 100 individuals, using 5 pools of 20 each. The false discovery rate for detecting SNPs on this dataset was ~1%.

Barcoding vs anonymous pooling