**fkrueger** · 02-28-2012, 02:18 PM

Hi Mixter,

I think it is fair to say that mapping efficiency in BS-Seq is a function of the read length, although the gain in mapping efficiency gets smaller with increasing read lengths. The figure you are probably referring to (Fig. 2?) was indeed done with simulated data that did not contain any Ns.

Real world datasets tend to contain quite a number of sequences that can't be mapped, and this is probably a combination of several factors:
- reads that come from regions in the genome that are not actually present in the genome assembly (e.g. plenty of sequence in the genome builds around centromeres or towards the ends is simply masked by Ns)
- reads from repetitive regions that can't be mapped uniquely
- reads with adapter or primer contamination or other artefacts generated during library generation

Just to give you some ballpark figures, we regularly see around 60-68% mapping efficiency for 40bp long RRBS (SE) reads. I have seen some high quality (quality and adapter trimmed) longer datasets of 75-100bp that were getting close to the 80% mark, and this is already quite high for standard genomic sequence mapping.
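For anyone who wants to check where their own numbers sit relative to these ballpark figures, here is a minimal sketch of the calculation. It assumes mapping efficiency means the share of analysed reads that got a unique best alignment; the function name and the read counts are hypothetical and only there for illustration.

```python
def mapping_efficiency(total_reads: int, uniquely_mapped: int) -> float:
    """Return the percentage of analysed reads that aligned uniquely.

    Assumption: 'mapping efficiency' = uniquely mapped reads / total reads.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= uniquely_mapped <= total_reads:
        raise ValueError("uniquely_mapped must be between 0 and total_reads")
    return 100.0 * uniquely_mapped / total_reads


# Hypothetical counts for a 40bp single-end RRBS run (not real data).
total = 1_000_000
unique = 640_000
print(f"Mapping efficiency: {mapping_efficiency(total, unique):.1f}%")
```

Reads that are unmapped or map to multiple locations both count against the efficiency under this definition, which is why the factors listed above all pull the number down.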

We have seen that paired-end reads tend to increase the mapping efficiency by a few percent (up to 3 or 4% for 40bp RRBS reads). However, this increase does not necessarily translate into a linear increase in methylation data, because paired-end reads may overlap and such overlaps generate redundant data. I have tried to write up a few more things about this in a brief RRBS guide that is available here. I believe the homepage might currently be experiencing some difficulties, but hopefully it'll be back up soon. If you have any specific queries about your dataset, don't hesitate to send me an email directly.
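To make the overlap point concrete, here is a rough sketch of how much of a read pair is redundant for a given fragment size. It assumes both mates have the same length, that any adapter read-through has already been trimmed off, and that the fragment lengths below are made-up examples; the function name is hypothetical.

```python
def paired_end_overlap(fragment_length: int, read_length: int) -> tuple[int, int]:
    """Return (overlapping_bases, distinct_bases_covered) for one read pair.

    Assumptions: both mates have the same length, and bases that run past
    the end of the fragment (adapter read-through) have been trimmed.
    """
    if fragment_length <= 0 or read_length <= 0:
        raise ValueError("lengths must be positive")
    # A read cannot cover more fragment sequence than the fragment contains.
    covered_by_each = min(read_length, fragment_length)
    # The two mates start at opposite ends; whatever exceeds the fragment
    # length is sequenced twice.
    overlap = max(0, 2 * covered_by_each - fragment_length)
    distinct = 2 * covered_by_each - overlap
    return overlap, distinct


# Hypothetical RRBS fragment sizes with 40bp reads.
for fragment in (30, 60, 100):
    overlap, distinct = paired_end_overlap(fragment, 40)
    print(f"fragment {fragment}bp: {overlap}bp overlap, {distinct}bp distinct")
```

With short fragments the second read adds little or no new sequence, so the extra methylation calls from going paired-end can be much smaller than the doubling in sequenced bases would suggest.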

**mixter** · 03-05-2012, 04:56 AM

Many thanks! For now, we are just looking at public data sets. I just wanted to say that we found this an extremely helpful orientation.

BS-Seq mapping efficiency, what can be expected?
