  • #16
    bedgraph for CHH and CHG methylation only

    I am trying to generate separate bedGraph files for the CHH and CHG methylation contexts, as in the previous post. The only options available in Bismark when generating the bedGraph report seem to be CpG or all contexts. I was thinking the only way to split these reports is to:

    1. use the --CX option and the --cytosine_report option
    2. pull out just the CHH or CHG methylation calls from the cytosine report and intersect that with the bedgraph output

    I am not really sure what to do. This is older data and I already have the CHH and CHG methylation extractor files from an older version of bismark so I would rather not have to re-generate those files. Any advice is appreciated.

    Thanks

    Originally posted by fkrueger View Post
    Hi momokenken,

    To me the output you linked looks just fine, but you have to note a couple of things:

    - The bedGraph output is 0-based, however the genome-wide cytosine methylation report (the last format) uses 1-based coordinates (as does Bismark itself). Thus, you need to add +1 to all bedGraph coordinates to get to the cytosine report coords.
    - The methylation extractor offers the options CpG-only or all cytosine contexts, i.e. CG, CHG and CHH combined. There is no CHH context-only format unless you filter it out specifically. Thus the full cytosine output contains Cs in many different contexts.

    Finally, may I ask you to install the latest version (v0.7.12), which offers quite a few new features for the methylation extraction, bedGraph and cytosine report? In addition to being a LOT quicker than older versions, Bismark now comes with the modules bismark2bedGraph and bedGraph2cytosine, which replace any older versions of these scripts. Both of them work either from within the methylation extractor or as stand-alone tools. If you have further questions you can also contact me directly via email.

    Cheers, Felix
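To illustrate the coordinate point in the quoted post, here is a minimal Python sketch (not Bismark code). It assumes the usual tab-separated bedGraph layout of chromosome, start, end and value, and only uses the 0-based start column; the function name and the example line are made up for illustration.

```python
from typing import Tuple


def bedgraph_start_to_report_pos(bedgraph_line: str) -> Tuple[str, int]:
    """Return (chromosome, 1-based position) for one bedGraph data line.

    Assumes tab-separated columns: chromosome, start (0-based), end, value.
    """
    fields = bedgraph_line.rstrip("\n").split("\t")
    chrom, start = fields[0], int(fields[1])
    # The cytosine report is 1-based, so shift the 0-based start by one.
    return chrom, start + 1


# Hypothetical example line, not taken from real data.
print(bedgraph_start_to_report_pos("chr1\t9999\t10000\t75.0"))  # ('chr1', 10000)
```

Track header lines (if the file has one) would need to be skipped before calling this.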

    Comment


    • #17
      Hi shawpa,

      If you still have the CHH and CHG context files you can just run the bismark2bedGraph script while only selecting CHH* (or CHG*) as input files.

      Best,
      Felix
      Last edited by fkrueger; 08-05-2013, 05:34 AM. Reason: typos

      Comment


      • #18
        I actually did try that, but I couldn't get the bismark2bedGraph script to read any input files. I ran the following command from the folder containing my methylation extractor output files.

        perl /home/shawpa/lsf_hpc/bismark2bedGraph.pl --counts -o cvs194_CHHbedgraph.txt CHH_context_cvs194_merged.sam.txt

        No file is generated. This is the output that I get.

        ================================================================
        bedGraph output: cvs194_CHHbedgraph.txt
        output directory: ><
        remove whitespaces: no
        CX context: no (CpG context only, default)
        No-header selected: no
        Sort buffer size: 2G
        Coverage threshold: 1
        Counts requested: yes


        ================================================================
        Methylation information will now be written into a bedGraph file
        ================================================================

        Writing bedGraph to file: cvs194_CHHbedgraph.txt
        Using the following files as Input:



        Collecting temporary chromosome file information...
        processing the following input file(s):




        Originally posted by fkrueger View Post
        Hi shawpa,

        If you still have the CHH and CHG context files you can just run the bismark2bedGraph script while only selecting CHH* (or CHG*) as input files.

        Best,
        Felix

        Comment


        • #19
          Hi Shawpa,

          If you specify --CX it should work (but I agree that this is not how I would expect it to work, so I will think about changing this behavior for the next release).

          Felix
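Putting posts #17 to #19 together, the call would select only the CHH* (or CHG*) files and add --CX. The Python sketch below just assembles such a command line so it can be inspected before running it. The flags --CX, --counts and -o and the CHH file name are the ones that appear in this thread; the helper names, the other file names and the script path are placeholders, and behaviour may differ between Bismark versions.

```python
import fnmatch
from typing import List


def select_context_files(file_names: List[str], context: str) -> List[str]:
    """Keep only extractor output files whose name starts with the context."""
    return sorted(fnmatch.filter(file_names, context + "*"))


def build_bedgraph_command(script: str, output_name: str,
                           input_files: List[str]) -> List[str]:
    """Assemble an argument list; nothing is executed here."""
    if not input_files:
        raise ValueError("no input files selected")
    # --CX is included because, per post #19, non-CpG input is otherwise ignored.
    return [script, "--CX", "--counts", "-o", output_name] + input_files


# Only the CHH file name comes from the thread; the others are hypothetical.
names = [
    "CHH_context_cvs194_merged.sam.txt",
    "CHG_context_cvs194_merged.sam.txt",
    "CpG_context_cvs194_merged.sam.txt",
]
cmd = build_bedgraph_command("bismark2bedGraph", "cvs194_CHHbedgraph.txt",
                             select_context_files(names, "CHH"))
print(" ".join(cmd))
```

The resulting list could be passed to subprocess.run once the script path has been adjusted to the local installation.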

          Comment


          • #20
            Hi, I have a question: what are the CHH and CHG contexts? (I know CpG.)

            Comment


            • #21
              Originally posted by litc View Post
              Hi, I have a question: what are the CHH and CHG contexts? (I know CpG.)
              H is the IUPAC abbreviation for the DNA nucleotides A or C or T, so CHH and CHG are basically anything non-CpG.
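As a small illustration of that definition, the Python sketch below classifies a cytosine by the two bases that follow it on the same strand. It is not taken from Bismark; the function name and the "unknown" label for ambiguous bases such as N are choices made for this example.

```python
H_BASES = {"A", "C", "T"}  # IUPAC code H: any base except G


def cytosine_context(bases: str) -> str:
    """Classify a cytosine given itself plus the next one or two bases (5'->3')."""
    b = bases.upper()
    if len(b) < 2 or b[0] != "C":
        raise ValueError("expected a string starting with C and at least 2 bases")
    if b[1] == "G":
        return "CpG"  # the third base does not matter for CpG
    if len(b) < 3:
        raise ValueError("need 3 bases to tell CHG from CHH")
    if b[1] in H_BASES and b[2] == "G":
        return "CHG"
    if b[1] in H_BASES and b[2] in H_BASES:
        return "CHH"
    return "unknown"  # e.g. an N in the second or third position


print(cytosine_context("CGA"))  # CpG
print(cytosine_context("CAG"))  # CHG
print(cytosine_context("CTT"))  # CHH
```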

              Comment


              • #22
                Originally posted by fkrueger View Post
                H is the IUPAC abbreviation for the DNA nucleotides A or C or T, so CHH and CHG are basically anything non-CpG.
                Thanks, fkrueger!

                Comment


                • #23
                  Hi fkrueger,

                  I read part of your code and found that you use the sleep() function many times. Is it necessary for the process to wait a certain time? Can these calls be removed safely?
                  Since I need to run Bismark with a small reference tens of times, even a wait of several seconds per execution costs a lot of time.

                  Comment


                  • #24
                    Sorry about all of these. I can confirm that they can be commented out safely; they are only there to improve readability at run time (we had some discussions about these 'readability aids' in our department only today as well...)

                    Comment


                    • #25
                      You should advertise the next version as being 50% quicker when processing small genomes

                      Comment


                      • #26
                        Hi fkrueger, I think I have found a bug in the Bismark methylation extractor (version v0.9.0). I want a genome-wide cytosine report that contains CpG context information only, and this is the command I ran:
                        bismark_methylation_extractor -p --no_overlap --bedGraph --counts --cytosine_report --genome_folder /home/share/hg19/ *.sam
                        However, the results contain not only CpG but also CHG and CHH. I checked the code of bedGraph2cytosine and found that the parameter "CX|CX_context" does not affect the value of "$CpG_only", which controls which contexts are output; "$CpG_only" is not assigned a valid value whether or not I specify "CX|CX_context". So I added the following code to the function "process_commandline", before the "return" statement:
                        if (!$CX_context) {
                            $CpG_only = 1;
                        }

                        Maybe this fixes my problem.
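For a report that has already been written with all contexts, one possible workaround is to filter the rows afterwards. The Python sketch below assumes the cytosine report is tab-separated with the context (CG, CHG or CHH) in the sixth column; that layout is an assumption here and should be checked against the report produced by your Bismark version. The function name and the example rows are made up.

```python
from typing import Iterable, Iterator

CONTEXT_COLUMN = 5  # assumed 0-based index of the context column (CG/CHG/CHH)


def filter_report_by_context(lines: Iterable[str],
                             context: str = "CG") -> Iterator[str]:
    """Yield only the report lines whose context column equals `context`."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > CONTEXT_COLUMN and fields[CONTEXT_COLUMN] == context:
            yield line


# Hypothetical rows: chromosome, position, strand, methylated count,
# unmethylated count, context, trinucleotide.
rows = [
    "chr1\t10469\t+\t3\t1\tCG\tCGC",
    "chr1\t10471\t+\t0\t4\tCHH\tCCA",
    "chr1\t10484\t+\t1\t2\tCHG\tCAG",
]
for row in filter_report_by_context(rows, "CG"):
    print(row)
```

With a real file, the same generator can be fed an open file handle instead of the list.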

                        Comment


                        • #27
                          Hi litc,

                          I have spotted this problem myself recently, and have incorporated the same fix into the next version (which will be due soon).

                          Best,
                          Felix

                          Comment


                          • #28
                            We have just announced a new version of Bismark in this thread which fixes this problem amongst other things.

                            Comment
