**nilshomer** · 07-12-2011, 06:31 AM

Take your read length and multiply it by the number of reads to get the total bases present in your dataset. So for 1M SE reads @ 50bp, you have 50Mb. For 1M PE reads @ 50bp x 50bp, you have 100Mb. If you look at only one PE file (reads1), you get 50Mb. Note that the bfast file will contain both ends of each pair in the same file, so that would be 100Mb.
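
As a quick sketch of that arithmetic in Python (the function name `total_bases` and its `paired` flag are hypothetical, made up for illustration, and not part of bfast):

```python
def total_bases(num_reads, read_length, paired=False):
    """Total bases in a dataset.

    Assumption: for paired-end data, num_reads counts read pairs and
    both mates have the same length.
    """
    ends = 2 if paired else 1
    return num_reads * read_length * ends


se = total_bases(1_000_000, 50)               # 1M SE reads @ 50bp
pe = total_bases(1_000_000, 50, paired=True)  # 1M PE pairs @ 50bp x 50bp
print(se, pe)
```

Looking at only one file of the pair (reads1) corresponds to the single-end call.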

**arkal** · 07-12-2011, 06:51 AM

I'm sorry, I'm still a little confused...

The formula I'm using for N, the number of reads (or read pairs), is

N = (Genome size x Coverage) / (RL1 + RL2)

So if my genome size is 100Mb, coverage is 10X and RL1 = RL2 = 50 (PE):

N = 100,000,000 x 10 / 100 = 10,000,000 read pairs
i.e. *_10X_PE_1.fq = *_10X_PE_2.fq = 10,000,000 reads.

Now, if RL2 = 0, keeping coverage and genome size the same,
N = 100,000,000 x 10 / 50 = 20,000,000 reads
i.e. *_10X_SE_1.fq = 20,000,000 reads and *_10X_SE_2.fq = 0 reads.

I hope I'm right so far.

Furthermore, if I have already generated 20X coverage PE for the same genome,
N = 100,000,000 x 20 / 100 = 20,000,000 read pairs
i.e. *_20X_PE_1.fq = *_20X_PE_2.fq = 20,000,000 reads.

Is it safe to assume that either *_20X_PE_1.fq OR *_20X_PE_2.fq can be used as a substitute for *_10X_SE_1.fq, as each has the same number of reads and the same read length?
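
The three calculations above can be checked with a short Python sketch. The function names `reads_needed` and `coverage_of` are hypothetical, and rounding up to a whole read is an assumption added here for inputs that do not divide evenly:

```python
def reads_needed(genome_size, coverage, rl1, rl2=0):
    """N = (genome size x coverage) / (RL1 + RL2).

    Returns read pairs when rl2 > 0, single reads when rl2 == 0.
    Assumption: round up so the target coverage is always reached.
    """
    bases_per_fragment = rl1 + rl2
    if bases_per_fragment <= 0:
        raise ValueError("RL1 + RL2 must be positive")
    total = genome_size * coverage
    return -(-total // bases_per_fragment)  # ceiling division on integers


def coverage_of(num_reads, read_length, genome_size):
    """Coverage contributed by num_reads reads of one length."""
    return num_reads * read_length / genome_size


genome = 100_000_000
pe_10x = reads_needed(genome, 10, 50, 50)  # pairs for 10X PE
se_10x = reads_needed(genome, 10, 50)      # reads for 10X SE
pe_20x = reads_needed(genome, 20, 50, 50)  # pairs for 20X PE

# One mate file from the 20X PE set, treated as single-end data
one_mate_cov = coverage_of(pe_20x, 50, genome)
print(pe_10x, se_10x, pe_20x, one_mate_cov)
```

Here a single mate file of the 20X PE set holds as many 50bp reads as the 10X SE set, so it covers the genome at the same depth.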

**nilshomer** · 07-12-2011, 07:24 AM

The answer is yes.

**arkal** · 07-12-2011, 08:59 AM

Thanks a lot
