Bismark - A New Tool for Mapping and Analysis of Bisulfite-Seq Data


  • #31
    Oliver (and interested others)

    I've now added to GBrowse the BAM files made by mapping my paired-end data as if it were single-end data. A useful example can be seen here. The bottom track shows the paired-end mapping BAM file, and the two tracks above it are the F and R reads mapped as single-end data. In all cases forward strand alignments are coloured green and reverse strand alignments are in the salmony-orange colour.

    Some things to notice: in the paired-end mapping track, the reverse reads show the reverse strand base and not the forward strand one (which seems to be the standard). There is also a mix of + and - strand alignments in the F and R tracks, which I didn't expect (maybe I'm missing something obvious though?).

    This actually highlights an interesting point: if the - strand alignments show the + strand base, it will be impossible to see C bases that were converted by the bisulfite treatment unless there is a way to specifically colour either the methylated bases or the converted bases (the methylated bases would be more interesting). As in the paired-end BAM files, C bases that were converted still aren't shown as mismatches in the single-end files. I need to learn about the SAM file format to understand why this is. So far I've been rather blindly making things work rather than getting into the proper details.
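The strand issue described above can be made concrete with a small sketch. It assumes an ungapped alignment, reads that derive from the original top or bottom strand, and that both the reference and the read are written as + strand bases (the usual SAM convention). The function name and the labels are illustrative only and are not part of Bismark or GBrowse.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify_calls(ref_plus, read_plus, strand):
    """Label each position of an ungapped bisulfite alignment.

    Both sequences are given as + strand bases. For a '-' alignment they
    are complemented (not reversed), so a cytosine on the read's own
    strand is seen as C again while positions stay in + strand order.
    """
    if len(ref_plus) != len(read_plus):
        raise ValueError("reference and read must have the same length")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if strand == "-":
        ref_plus = ref_plus.translate(COMPLEMENT)
        read_plus = read_plus.translate(COMPLEMENT)

    labels = []
    for ref_base, read_base in zip(ref_plus, read_plus):
        if ref_base == "C" and read_base == "C":
            labels.append("methylated")   # C protected from conversion
        elif ref_base == "C" and read_base == "T":
            labels.append("converted")    # unmethylated C read as T
        elif ref_base == read_base:
            labels.append("match")
        else:
            labels.append("mismatch")     # non-bisulfite difference
    return labels


# A '-' strand read shown as + strand bases: the conversion appears as G -> A.
print(classify_calls("TGCAG", "TACAG", "-"))
# The same pair treated as a '+' alignment looks like an ordinary mismatch.
print(classify_calls("TGCAG", "TACAG", "+"))
```

Under these assumptions, a viewer that only compares + strand bases cannot tell a converted C on the - strand from a genuine G to A mismatch unless it knows which strand the read came from.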



    • #32
      Hi natstreet,

      I have to admit that I have never used paired-end BAM files in GBrowse (I use the UCSC browser, but I guess it doesn't have Arabidopsis).

      As for the F and R reads: when you made the paired-end libraries, did you specify which genomic strand was ligated to each adapter? As you can imagine, the DNA is double-stranded upon fragmentation, so unless the F adapter is made to ligate only to the "+" genomic strand, it can also ligate to the "-" genomic strand, giving you a "-" read on F. Likewise for R, it depends on whether the adapter was ligated ONLY to the "-" genomic DNA when you made the library, or ligated randomly. Let me know if this wasn't clear.

      I must confess that my scripts were built to work on UCSC genome browser, which might have greater tolerance for badly made SAM/BAM. I could try to look more into the SAM format, but thus far, I'm not sure how to handle the problem. Sorry for that.

      Cheers,
      Oliver



      • #33
        a few more features to genome_methylation_bismark2bedGraph.pl

        To Olivertam

        Thank you for your effort in writing the script "genome_methylation_bismark2bedGraph.pl". It is very useful and works great! I would like to have a few more features in the BED file if possible. Could you add a few lines to your program that show the methylation level by colour, for example blue for methylated and red for unmethylated? I think that would make it more straightforward for us to do pairwise comparisons and identify differentially methylated regions. The colour scheme I would like to use is shown below. I just don't know how to integrate it into your code since I am on the wet-bench side. Also, could meth_percentage be shown with only 3 digits? Thanks

        # Scale the percentage (0-100) to 0-1000 and choose one colour per 10% bin.
        # itemRgb is stored as an "R,G,B" string: assigning a list such as
        # (57,39,140) to a scalar would keep only the last element.
        my $meth_value = $meth_percentage * 10;
        my $itemRgb;
        if ($meth_value > 900) {
            $itemRgb = "57,39,140";
        }
        elsif ($meth_value > 800) {
            $itemRgb = "49,48,206";
        }
        elsif ($meth_value > 700) {
            $itemRgb = "57,48,198";
        }
        elsif ($meth_value > 600) {
            $itemRgb = "80,80,80";
        }
        elsif ($meth_value > 500) {
            $itemRgb = "112,112,112";
        }
        elsif ($meth_value > 400) {
            $itemRgb = "144,144,144";
        }
        elsif ($meth_value > 300) {
            $itemRgb = "192,192,192";
        }
        elsif ($meth_value > 200) {
            # As posted; 274 is outside the valid 0-255 range and needs correcting.
            $itemRgb = "148,12,274";
        }
        elsif ($meth_value > 100) {
            $itemRgb = "165,0,41";
        }
        else {
            $itemRgb = "173,0,33";
        }
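As a rough illustration of how such a colour could end up in a BED line, here is a minimal Python sketch. It is not Oliver's script. The three-bin colour choice is a simplification of the scheme above (red up to 30%, grey up to 70%, blue above), reading "3 digits" as three significant digits is an assumption, and the function names are hypothetical. The nine-column layout follows the standard BED format, where itemRgb is the ninth field and the track line needs itemRgb="On" for the UCSC browser to use it.

```python
def pick_colour(meth_percentage):
    """Simplified three-bin version of the posted colour scheme."""
    if meth_percentage > 70:
        return "57,39,140"      # blue: mostly methylated
    if meth_percentage > 30:
        return "112,112,112"    # grey: intermediate
    return "173,0,33"           # red: mostly unmethylated


def check_rgb(item_rgb):
    """Return a normalised 'R,G,B' string or raise if a component is invalid."""
    parts = [p.strip() for p in item_rgb.split(",")]
    if len(parts) != 3 or not all(p.isdigit() and int(p) <= 255 for p in parts):
        raise ValueError(f"invalid itemRgb value: {item_rgb!r}")
    return ",".join(parts)


def bed_line(chrom, start, end, meth_percentage):
    """Format one BED9 line; the name column carries the rounded percentage."""
    name = format(meth_percentage, ".3g")   # three significant digits
    score = round(meth_percentage * 10)     # BED score range is 0-1000
    rgb = check_rgb(pick_colour(meth_percentage))
    fields = [chrom, start, end, name, score, ".", start, end, rgb]
    return "\t".join(str(f) for f in fields)


print('track name="methylation" itemRgb="On"')
print(bed_line("chr1", 100, 101, 87.456))
```

The check_rgb helper would reject a value such as "148,12,274", which is one way to catch an out-of-range component before the browser does.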



        • #34
          Hi sunsnow86,

          Here's a modified version of the script as requested. I haven't tested the script extensively yet (since I don't have any data to test with), so please let me know if there are issues.

          Cheers,
          Oliver
          Attached Files



          • #35
            Thanks a lot, Oliver! I will give it a try and see how well it works.



            • #36
              New options in Bismark

              Hi everyone,

              After analysing some BS-seq data recently we decided to implement a few changes to Bismark to improve performance. Bismark can be downloaded from the Babraham Bioinformatics homepage.


              Major changes:

              - Added the new option --directional to Bismark. If the BS-Seq library was constructed in a strand-specific way, one would expect to see only sequences corresponding to the (C -> T converted) original top or bottom strands. The two strands that are complementary to the original strands are, in this case, merely theoretical and should not actually be observable in the sequencing experiment. Specifying --directional will reject alignments to these strands that exist only in silico, and will generate a small report about rejected sequences after the Bismark run has completed.

              - Changed the default alignment option of Bismark to --best to ensure the most credible alignment results. This can be turned off by specifying --no_best. Disabling --best can speed up the alignment process considerably (good for testing purposes), but this will increase the risk of mismappings at the same time.

              - Added the option -e/--maqerr so that one can play around with the maximum number of tolerated mismatches if desired (even though this can make Bowtie and thus Bismark very slow...).
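To illustrate the idea behind --directional, the following Python sketch derives the four read types that can in principle arise from a fully unmethylated fragment and then keeps only the two original strands. It is a toy model and not Bismark's implementation. The labels OT, OB, CTOT and CTOB (original top, original bottom, and the strands complementary to each) and all function names are shorthand assumed here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq):
    """Fully convert an unmethylated strand: every C is read as T."""
    return seq.replace("C", "T")


def possible_reads(top_strand):
    """Return the four read types for one genomic fragment."""
    ot = bisulfite_convert(top_strand)            # original top
    ob = bisulfite_convert(revcomp(top_strand))   # original bottom
    return {
        "OT": ot,
        "CTOT": revcomp(ot),   # complementary to original top
        "OB": ob,
        "CTOB": revcomp(ob),   # complementary to original bottom
    }


def keep_alignment(strand_label, directional):
    """In directional mode only the two original strands are accepted."""
    return not directional or strand_label in {"OT", "OB"}


reads = possible_reads("ACCGT")
for label, seq in reads.items():
    print(label, seq, keep_alignment(label, directional=True))
```

In a strand-specific library only the OT and OB sequences should be observed, so an alignment that fits one of the other two is more likely to be a mismapping than a real read.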

              Minor changes:

              - The output files generated by Bismark will now end in '_bismark.txt' for single-end files or '_bismark_pe.txt' for paired-end files. The mapping and splitting reports will also end in .txt.

              - The alignment and methylation summary reports have been slightly modified to allow better readability.


              In summary, for the most trustworthy results with as few mismapped reads as possible we recommend that Bismark is run with the following options:

              - '--best' (will now be on by default)
              - decreasing the number of allowed (non-bisulfite) mismatches (adjustable with the -n and -l parameters) to as low as possible
              - selecting '--directional' if the library is known to be strand-specific
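The recommendations above could be collected into a command line along these lines. This is only a sketch: the option names come from the post, but the order of the positional arguments (genome folder, then reads file), the example values for -n and -l, and the paths are assumptions that should be checked against the Bismark documentation for the installed version.

```python
def bismark_command(genome_dir, reads, directional=False,
                    seed_mismatches=None, seed_length=None):
    """Build an argument list for a single-end Bismark run.

    --best is not added because the post states it is now on by default.
    """
    cmd = ["bismark"]
    if directional:
        cmd.append("--directional")
    if seed_mismatches is not None:
        cmd += ["-n", str(seed_mismatches)]   # tolerated non-bisulfite mismatches
    if seed_length is not None:
        cmd += ["-l", str(seed_length)]
    # Assumed positional order: genome folder first, then the reads file.
    cmd += [genome_dir, reads]
    return cmd


# Hypothetical paths and parameter values, for illustration only.
print(" ".join(bismark_command("/data/genomes/example/", "reads.fastq",
                               directional=True, seed_mismatches=1,
                               seed_length=28)))
```

Building the command as a list keeps each option and its value together, and the list can be handed to subprocess.run without shell quoting problems.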


              If you have any questions please get in touch.

              Best wishes,
              Felix


              PS: If our homepage does not show the most up-to-date version of Bismark (v0.2.3), hit Ctrl+Refresh to force the BBSRC cache to update.



              • #37
                Could somebody write a small script that converts the Bismark alignment output into the format expected by the downstream statistical analysis pipeline developed at the MIT Broad Institute for the RRBS method
                (http://www.broadinstitute.org/~cbock...dev/index.html)? I think it would be very useful for wet-lab scientists. Thanks



                • #38
                  Does Bismark allow you to specify a different insert size range like bowtie (--minins and --maxins)?



                  • #39
                    It doesn't take the options --minins and --maxins right now, but I don't see why they shouldn't work.

                    If you think it would be a useful feature I can implement it first thing tomorrow morning and make a new release/send it to you via email.



                    • #40
                      That would be fabulous. Thanks!



                      • #41
                        BS-Seeker may have that option. It is also a Bowtie-based alignment program and works similarly to Bismark. Here is the link:

                        http://pellegrini.mcdb.ucla.edu/BS_S...BS_Seeker.html



                        • #42
                          Originally posted by sgrimm View Post
                          Does Bismark allow you to specify a different insert size range like bowtie (--minins and --maxins)?
                          Dear sgrimm,

                          I have now implemented the options -I/--minins and -X/--maxins into Bismark. I haven't tested them extensively but everything seems to work fine so far.

                          The latest version (v0.2.4) can be downloaded from the Bismark homepage. (You might need to hit Ctrl+Refresh to force the BBSRC cache to update).

                          Best wishes,
                          Felix
                          Last edited by fkrueger; 11-18-2010, 02:46 AM.
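For readers unfamiliar with these options, the sketch below shows what an insert size range means for a read pair. It assumes that Bismark passes -I/--minins and -X/--maxins through to Bowtie, where the insert size covers the whole fragment including both mates, and it uses Bowtie's documented defaults of 0 and 250. Coordinates are taken as 0-based start positions with mate 1 leftmost. The function names are hypothetical.

```python
def fragment_length(mate1_start, mate2_start, mate2_length):
    """Span from the start of mate 1 to the end of mate 2 (mate 1 leftmost)."""
    return mate2_start + mate2_length - mate1_start


def is_valid_pair(mate1_start, mate2_start, mate2_length, minins=0, maxins=250):
    """True if the fragment spanned by the pair lies within [minins, maxins]."""
    span = fragment_length(mate1_start, mate2_start, mate2_length)
    return minins <= span <= maxins


# Two 20 bp mates separated by a 20 bp gap span 60 bp in total.
print(is_valid_pair(0, 40, 20, minins=60))   # accepted with a minimum of 60
print(is_valid_pair(0, 300, 50))             # 350 bp exceeds the default maximum
print(is_valid_pair(0, 300, 50, maxins=500)) # accepted once the maximum is raised
```

Libraries with fragments longer than the default maximum would lose valid pairs unless the upper limit is raised, which is the usual reason for setting --maxins.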



                          • #43
                            Originally posted by sunsnow86 View Post
                            BS-Seeker may have that option. It is also a Bowtie-based alignment program and works similarly to Bismark. Here is the link:

                            http://pellegrini.mcdb.ucla.edu/BS_S...BS_Seeker.html
                            I don't think that BS_Seeker supports paired-end mapping at all, so it won't support insert size specifications either.



                            • #44
                              Originally posted by fkrueger View Post
                              Dear sgrimm,

                              I have now implemented the options -I/--minins and -X/--maxins into Bismark. I haven't tested them extensively but everything seems to work fine so far.

                              The latest version (v0.2.4) can be downloaded from the Bismark homepage. (You might need to hit Ctrl+Refresh to force the BBSRC cache to update).

                              Best wishes,
                              Felix

                              Thanks, Felix. The update with --maxins is working great. (I didn't try --minins.)



                              • #45
                                About the mismatch~

                                Hi fkrueger,

                                In Bismark, I couldn't see how BS mismatches and non-BS mismatches are handled. Could you give us some details about your algorithm?

                                Have you read the paper "Epigenomics and Genome Wide Methylation Profiling"? In it, the author mentions three algorithms. Can you tell me which one you chose, and how you deal with its limitations?

                                Thanks for your wonderful program!
                                Looking forward to your reply.
