Hi all,
I am trying to estimate the sequencing effort needed for reliable SNP calling using RAD tags in a non-model organism.
To get an idea of the average coverage per locus, I used the following equation:
Coverage per locus = [No. Illumina reads x read length] / [No. individuals x expected RAD sites across the genome x 2 RAD tags per RAD site x average length of each RAD tag]
For example, let's say we have a non-model species with a genome size of 350 Mb. We digest the DNA with an 8-cutter RE such as SbfI. Assuming an equal proportion of each base across the genome (A=T=G=C=0.25), we'd expect:
(0.25^8) x 350,000,000 = 5,340 RAD sites
After the size selection step, the average RAD tag size will be 500 bp. If we use one lane of an Illumina GAIIx (35 million reads of 75 bp) to sequence 100 pooled individuals, then the coverage per RAD locus would be:
(35,000,000 x 75) / (500 x 2 x 5340 x 100) = 4.9 X
Since at least 60 X is recommended for de novo work, i.e. species without a reference genome (Davey et al. 2011 Nature Genetics), I think it will be necessary to pool only about 8 individuals per lane (4.9 x 100 / 60 = 8.2).
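In case it helps anyone check the arithmetic, here is a minimal Python sketch of the same calculation. The function names are my own (hypothetical), and it just encodes the assumptions stated above: equal base frequencies, 2 RAD tags per site, 500 bp tags, and reads spread evenly over individuals and loci.

```python
import math


def expected_rad_sites(genome_size_bp, cutter_length, base_freq=0.25):
    """Expected number of cut sites, assuming equal and independent base frequencies."""
    return genome_size_bp * base_freq ** cutter_length


def coverage_per_locus(n_reads, read_length, n_individuals, n_sites,
                       tags_per_site=2, tag_length=500):
    """Average coverage per locus using the equation from this post."""
    sequenced_bases = n_reads * read_length
    target_bases = n_individuals * n_sites * tags_per_site * tag_length
    return sequenced_bases / target_bases


def max_individuals(n_reads, read_length, n_sites, target_coverage,
                    tags_per_site=2, tag_length=500):
    """Largest number of pooled individuals that still reaches target_coverage."""
    bases_per_individual = target_coverage * n_sites * tags_per_site * tag_length
    return math.floor(n_reads * read_length / bases_per_individual)


# Values from the example: 350 Mb genome, 8-cutter, one GAIIx lane.
# int() truncates 5340.58 to 5340, matching the figure used in the post.
n_sites = int(expected_rad_sites(350_000_000, 8))
cov_100 = coverage_per_locus(35_000_000, 75, 100, n_sites)
n_max = max_individuals(35_000_000, 75, n_sites, target_coverage=60)

print(f"Expected RAD sites: {n_sites}")
print(f"Coverage with 100 individuals: {cov_100:.1f} X")
print(f"Individuals per lane for 60 X: {n_max}")
```

Changing `target_coverage` or the lane output makes it easy to see how the number of individuals per lane scales.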
Are my calculations right? What do you think about the final decision?