  • simonandrews
    Simon Andrews
    • May 2009
    • 870

    Mapping Bisulphite converted sequence

    What software are people using to map bisulphite converted sequence?

    We've been getting more of this kind of data recently and doing the mapping and QC has proved to have all sorts of odd quirks about it. Our mapping pipeline is just a set of scripts which sit on top of bowtie, but I wondered if anyone had done a more formal tool which took into account things such as:
    • Whether the sequence is expected to be fully converted or not
    • Eliminating preferential mapping of unconverted sequence
    • Working out overall conversion frequencies


    Does anyone have any good recommendations or are we all building our own?
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    BFAST can easily be used to align bisulphite treated sequence (see the reference manual). I don't know of a tool for summarizing the conversion frequencies (beyond personal perl scripts), but if you find one let me know.
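A minimal sketch of the kind of personal script mentioned above for summarizing conversion frequencies. It assumes gap-free alignments supplied as (read, reference) string pairs on the forward strand; the function name and input format are hypothetical and not taken from BFAST or any other tool in this thread.

```python
def conversion_summary(pairs):
    """Count converted and unconverted cytosines over gap-free alignments.

    `pairs` is an iterable of (read, reference) strings of equal length,
    both given on the forward strand of the reference (an assumption).
    """
    converted = 0
    unconverted = 0
    for read, ref in pairs:
        if len(read) != len(ref):
            raise ValueError("read and reference must be the same length")
        for read_base, ref_base in zip(read.upper(), ref.upper()):
            if ref_base != "C":
                continue            # only reference cytosines are informative
            if read_base == "T":
                converted += 1      # C read as T: converted by bisulphite
            elif read_base == "C":
                unconverted += 1    # C still read as C: not converted
            # any other base is a plain mismatch and is ignored
    total = converted + unconverted
    rate = converted / total if total else float("nan")
    return converted, unconverted, rate


pairs = [
    ("TTGATCGT", "CTGACCGT"),
    ("ATTGGTCA", "ACCGGTCA"),
]
converted, unconverted, rate = conversion_summary(pairs)
print(converted, unconverted, round(rate, 3))
```

Reads from the opposite strand would need the complementary G-to-A comparison, which this sketch leaves out.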


    • MadraghRua
      Junior Member
      • Mar 2008
      • 8

      #3
      Check out RMAPBS - it's RMAP modified for BS data. It featured in a recent Genome Research paper. Otherwise it's build your own, from what I can see.


      • simonandrews
        Simon Andrews
        • May 2009
        • 870

        #4
        Thanks - RMAPBS was one I'd not seen before. However it seems to suffer the same problem as most mapping programs I've seen, which is that it will map unconverted sequence more efficiently than converted sequence. For some applications this won't matter, but for epigenomics work where you're looking at the ratio of converted to unconverted reads in a particular region then it's essential that both forms should map with equal efficiency, even if this means removing some unconverted reads which otherwise could have been mapped.

        Maybe the applications of bisulphite conversion are too varied to be handled cleanly in a single program?
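The bias described above can be illustrated with a toy example. This is only a sketch of the general idea of converting both read and reference in silico before comparison, so that every methylation state of a read looks identical to the aligner; the sequences and function names are made up, and only the C-to-T (forward strand) case is covered.

```python
def c_to_t(seq):
    """Fully convert a sequence in silico by replacing every C with T."""
    return seq.upper().replace("C", "T")


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


reference = "ACGTTCGACCGTA"
methylated_read = "ACGTTCGATCGTA"    # CpG cytosines kept, the non-CpG C converted
unmethylated_read = "ATGTTTGATTGTA"  # every cytosine converted

# Against the unconverted reference, the methylated read looks much better.
print(mismatches(methylated_read, reference))    # 1
print(mismatches(unmethylated_read, reference))  # 4

# After in-silico conversion, both reads match the reference identically.
print(mismatches(c_to_t(methylated_read), c_to_t(reference)))
print(mismatches(c_to_t(unmethylated_read), c_to_t(reference)))
```

The price of the fair comparison is the loss of information: the converted read carries only three letters, so it is more likely to match several places in the genome.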


        • wei
          Junior Member
          • Aug 2009
          • 4

          #5
          Check out BSMAP.


          • andrewdsusc
            Junior Member
            • Sep 2009
            • 3

            #6
            Originally posted by simonandrews
            Thanks - RMAPBS was one I'd not seen before. However it seems to suffer the same problem as most mapping programs I've seen, which is that it will map unconverted sequence more efficiently than converted sequence. For some applications this won't matter, but for epigenomics work where you're looking at the ratio of converted to unconverted reads in a particular region then it's essential that both forms should map with equal efficiency, even if this means removing some unconverted reads which otherwise could have been mapped.

            Maybe the applications of bisulphite conversion are too varied to be handled cleanly in a single program?
            When the methylation of interest is CpG methylation, RMAPBS *WILL NOT* bias mapping towards a particular methylation state. It exploits unconverted Cs at non-CpG positions to gain specificity in mapping without using those at CpG positions to gain specificity.


            • What_Da_Seq
              Member
              • Jul 2008
              • 28

              #7
              The BS mode of Novoalign, then Maq pileup, and then custom Perl scripts.


              • sciencewu
                Member
                • Dec 2010
                • 12

                #8
                BSMAP may be a good fit for you, but it takes a huge amount of time to run.
                If you understand the mechanism of bisulfite alignment, many other aligners are also OK.


                • bioinfosm
                  Senior Member
                  • Jan 2008
                  • 483

                  #9
                  Adapter trimming would be an important pre-processing step for BS-seq. RRBSMAP, Bismark and BS Seeker are some tools that work exclusively for bisulphite data, but all face the challenge of missing out on a big proportion of alignments.
                  --
                  bioinfosm


                  • volks
                    Member
                    • Jun 2010
                    • 80

                    #10
                    Originally posted by bioinfosm
                    rrbsmap, bismark, bsseeker are some tools that work exclusively for bisulphite, but all face the challenge of missing out on a big proportion of alignments..
                    Why is that?


                    • simonandrews
                      Simon Andrews
                      • May 2009
                      • 870

                      #11
                      Originally posted by bioinfosm
                      adapter trimming would be an important pre-processing step for BS seq ... rrbsmap, bismark, bsseeker are some tools that work exclusively for bisulphite, but all face the challenge of missing out on a big proportion of alignments..
                      Really? Sure, the mapping efficiencies for bisulphite converted sequence are lower than for conventional sequencing, but nearly all of this is due to the loss of information in the conversion process, which means that the read can't be uniquely assigned to the original genome. In addition, some aligners specifically choose to ignore unique alignments which couldn't have been found if the methylation state of the sequence had been different, to ensure that mapping is always fair and unbiased. Other than that, I don't see a problem affecting bisulphite aligners which is any worse than the deficiencies in conventional aligners.

                      This isn't to say that there aren't still problems in bisulphite alignment. The issue of samples having a different genetic background to the reference genome leads to systematic methylation miscalls which are difficult to spot and lead to methylation change predictions which are actually genetic changes, but this is more a problem of calling than mapping.


                      • aniruddha.otago
                        Member
                        • Jan 2010
                        • 21

                        #12
                        Hi Simon,

                        We have done some methylation analysis with human RRBS samples and compared three aligners (RMAPBS, BSMAP and Bismark). RMAPBS and Bismark came up with a reasonable methylation percentage (around 40%, but the library is CGI rich). However, BSMAP showed an extremely low level of methylation, 15-18%. This was for QC-checked, high-quality data (we did dynamic trimming and removed adaptor contamination as well). Do you have any comments on that? It looks like one can tune the methylation level by choosing a particular aligner. I would really appreciate it if you could reply with your views.

                        Regards,
                        Aniruddha.


                        • simonandrews
                          Simon Andrews
                          • May 2009
                          • 870

                          #13
                          I'm surprised to hear that you're seeing such variable results from different programs. Were the mapping efficiencies wildly different between runs? You'd need quite a difference in mapping distribution to generate that kind of discrepancy. We've shown on simulated datasets that with bismark we can reliably extract the true methylation level regardless of the level of methylation in the library. The only factors which really influence this are the things you mentioned (adapters or poor quality sequence).

                          When mapping BS-Seq data it's more important that what you map is accurate than getting really good coverage. If in doubt you should make your mapping parameters more stringent. Mapping and adapter errors tend to drag the predicted methylation level towards 50% so this is especially problematic for low methylation libraries.
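As a rough illustration of the drag towards 50%, here is a toy model. It assumes (purely for illustration) that a fixed fraction of methylation calls come from mismapped or adapter-derived bases and that such calls are equally likely to read as methylated or unmethylated; neither the model nor the function name comes from any tool in this thread.

```python
def observed_methylation(true_level, error_fraction):
    """Expected methylation level when a fraction of calls are coin flips.

    true_level:      real methylation level of the library, 0.0 to 1.0
    error_fraction:  fraction of calls assumed to be random (50/50)
    """
    return (1.0 - error_fraction) * true_level + error_fraction * 0.5


for true_level in (0.05, 0.50, 0.95):
    print(true_level, round(observed_methylation(true_level, 0.10), 3))
```

Under this assumption, with 10% bad calls a 5% library reads as roughly 9.5%, nearly double the true value, while a 50% library is unaffected.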

                          If you're seeing differences of 25% in your data then I suspect something more fundamental is going wrong in the way the programs are being run. The only thing which we've ever seen which makes this kind of difference is that some programs have an option to remove any reads containing more than 3 unconverted Cs, which can have a dramatic effect on the overall level, but normally this would only be applied in non-CpG context so this shouldn't be the problem in your case if your library is CpG rich.
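A sketch of the kind of read filter described above, to make it easier to check whether something similar is switched on. It works on the read alone and treats any C not followed by a G as an unconverted non-CpG cytosine; a C at the very end of the read is skipped because its context is unknown. The function names and the default threshold are illustrative and are not the options of any particular program.

```python
def non_cpg_cytosines(read):
    """Count Cs in the read that are not followed by a G."""
    read = read.upper()
    return sum(
        1
        for i in range(len(read) - 1)   # the final base has no known context
        if read[i] == "C" and read[i + 1] != "G"
    )


def keep_read(read, max_unconverted=3):
    """Keep a read unless it has more than `max_unconverted` non-CpG Cs."""
    return non_cpg_cytosines(read) <= max_unconverted


reads = ["TTCGATTCGTTA", "TCATCCTTCATCGA"]
for read in reads:
    print(read, non_cpg_cytosines(read), keep_read(read))
```

Counting every C regardless of context would instead throw away genuinely methylated CpG-rich reads, which is the situation in which such a filter could shift the overall level dramatically.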


                          • yxibcm
                            Junior Member
                            • Jun 2010
                            • 6

                            #14
                            The new version of BSMAP (v2.2) has greatly improved the mapping speed
                            (28M 76bp PE reads mapped to the hg19 genome in about 7 hours using 8 threads; RAM usage: ~9GB).

                            It also includes an RRBS mode.

                            Best,

                            Yuanxin

                            Originally posted by sciencewu
                            BSMAP may be a good fit for you, but it takes a huge amount of time to run.
                            If you understand the mechanism of bisulfite alignment, many other aligners are also OK.
                            Last edited by yxibcm; 10-05-2011, 08:40 AM.


                            • yxibcm
                              Junior Member
                              • Jun 2010
                              • 6

                              #15
                              Hi Aniruddha,

                              I'm the developer of BSMAP. Could you provide some details about the BSMAP command line and your input reads? I'm very interested in knowing why BSMAP reports such a low level of methylation.

                              Also, BSMAP supports an RRBS mode through the option "-D", which adds digestion-site specificity to the mapping, or you can run the separate program RRBSMAP. This mode is also much faster and more memory efficient.

                              Best,

                              Yuanxin

                              Originally posted by aniruddha.otago
                              Hi Simon,

                              We have done some methylation analysis with human RRBS samples and compared three aligners (RMAPBS, BSMAP and Bismark). RMAPBS and Bismark came up with a reasonable methylation percentage (around 40%, but the library is CGI rich). However, BSMAP showed an extremely low level of methylation, 15-18%. This was for QC-checked, high-quality data (we did dynamic trimming and removed adaptor contamination as well). Do you have any comments on that? It looks like one can tune the methylation level by choosing a particular aligner. I would really appreciate it if you could reply with your views.

                              Regards,
                              Aniruddha.
                              Last edited by yxibcm; 10-05-2011, 08:43 AM.
