Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • chxu02
    replied
    Hi Felix,
    When I tried to run the recently added script:
    Code:
    filter_non_conversion
    I got the following error message:
    Code:
    syntax error at ~/bismark-0.17.0/filter_non_conversion line 280, near "last nless ("
    Execution of ~/bismark-0.17.0/filter_non_conversion aborted due to compilation errors.

    Leave a comment:


  • fkrueger
    replied
    Not directly I am afraid, but one could probably write a quick script (or even use some clever awk/sed constructs) to extract this kind of information from either the coverage files or genome-wide methylation reports.

    Leave a comment:


  • Jon17
    replied
    Is there a way to use your tools to get CpG coverage? For instance

    55% of CpG sites are at 30x
    45% of CpG sites are at 20x
    25% of CpG sites are at 10x

    As well as methylation rates given certain depths at CpG sites?

    55% of CpG sites are methylated at 30x
    45% of CpG sites are methylated at 20x
    25% of CpG sites are methylated at 10x

    Leave a comment:


  • fkrueger
    replied
    Hi Jon,

    I would assume that GATK and Picard will probably run but I don't know exactly what they need to look at the metrics you mentioned, you might have to ask their developers to be honest. We personally use SeqMonk to get all of the stats we are interested in. It is a genome browser that does all sorts of other relevant quantitations as well. Cheers, Felix

    Leave a comment:


  • Jon17
    replied
    Thank you for the response. I'm currently using GATK and Picard for alignment metrics (depth of coverage, insert size, on target bases, FOLD_80_BASE_PENALTY, etc). Given I'm using trim_galore and bismark, would you consider that use of GATK & Picard ok? Or should I use a different tool? What do you recommend?

    If Bismark is "designed to to be used only with properly adapter and base call quality trimmed data", what are the pitfalls?

    Leave a comment:


  • fkrueger
    replied
    Hi Jon,

    Bismark allows you to specify a read group for each sample, so that if you were to combine several different samples afterwards you could still tell which read came from which sequencing run. Other than that I am afraid the methylation extractor doesn't do anything sophisticated such as the BQSR step, it will simply extract the methylation information from the BAM files as is.

    In contrast to tools like GATK which specifically ask you not to perform any quality trimming up-front but rather use soft clipping during the alignment step, Bismark is designed to to be used only with properly adapter and base call quality trimmed data (Trim Galore uses a Phred score cut-off of 20 by default). Because of this additional step we do not check the basecall quality again during the extraction process. I hope this helps, Best, Felix

    Leave a comment:


  • Jon17
    replied
    How does Bismark handle multiple read groups?

    I know you can set the read group flag in Bismark but that's only for 1 read group. What if you have multiple? Does Bismark align those read-groups separately against the reference? Do any re-calibrations?

    GATK seems to imply recalibration (BQSR) is necessary for SNP calling. What about methylation calls?

    I'm wondering if I should:

    1) Run Bismark on each read group separately
    2) Combine the multiple bam files
    3) Re-calibrate base scores (BQSR)
    4) Run bismark_methylation_extractor on the final re-calibrated bam file

    Leave a comment:


  • fkrueger
    replied
    You could either align the to the lambda sequence by itself which should be really quick (probably several hundred million sequences/hour), or if you have the output already I would probably just filter for reads containing the Lambda sequence (NC_001416) and then run the bismark_methylation_extractor on that new SAM file. A command like this should get you the alignments:

    cat current_alignment_file.sam | grep NC_001416 > lambda_alignments.sam

    bismark_methylation_extracor --bedGraph --gzip lambda_alignments.sam

    If you are starting from BAM files the commands need to include Samtools but the idea is the same. I hope this helps. Cheers, Felix

    Leave a comment:


  • Jon17
    replied
    How do you handle control DNA?

    I have Bisulfite Sequencing data with lambda control (NC_001416) spike in.

    I'm looking at the alignment files (SAM) and I see there are reads aligned to the lambda (NC_001416) reference.

    So how do I separate methyl stats of these two samples? bismark_methylation_extractor seems to give me only one set of statistics and doesn't differentiate the control group from the sample group.

    I don't see any guidance in the manual or tutorials.

    This command is not ideal: --split_by_chromosome
    Last edited by Jon17; 09-30-2016, 07:18 AM.

    Leave a comment:


  • Jon17
    replied
    Seems like there are a lot of pipelines out there that get big differences in methylation results. Is there a paper out that that shows a complete workflow and has control data and output results I can compare?

    I can get bismark to work but I'm doing epigenetic sequence capture and want to know my results are in line with industry standard methods.

    Leave a comment:


  • fkrueger
    replied
    We have just posted a new version of Bismark (v0.16.3) which is available from https://github.com/FelixKrueger/Bismark or http://www.bioinformatics.babraham.a...jects/bismark/. It includes essential bug fixes regarding ambiguous alignments in Bowtie 2 mode that had arisen in version 0.16.0, and includes several new convenience options, details are listed below:

    Bismark: Essential fixes (2 in total) to address a bug for Bowtie 2 alignments where reads that should be considered ambiguous were incorrectly assigned to the first alignment thread. These errors had crept in during releases 0.16.0 and 0.16.2). More info available on Github
    Bismark: Added support for large Bowtie (1) index files ending in .ebwtl which had been added in Bowtie v1.1.0

    Changed the Shebang in all scripts of the Bismark suite to #!/usr/bin/env perl instead of #!/usr/bin/perl
    deduplicate_bismark: Does now bail with a useful error message when the input files are empty

    bismark_genome_preparation: Added new option '--genomic_composition' so that the genomic composition can be calculated and written right at the genome preparation stage rather than by using bam2nuc

    bam2nuc: Now also calculates a fold coverage for the various (di-)nucleotides. The changes in the nucleotide_stats text file are also picked up and plotted by bismark2report
    bam2nuc: Added a new option '--genomic_composition_only' to just process the genomic sequence without requiring any data files

    bismark2summary: Added option -o/--basename FILENAME to specify a certain filename. If not specified the name will remain bismark_summary_report.txt/html
    bismark2summary: Added documentation and the options '--help' and '--version' to be consistent with the rest of Bismark
    bismark2summary: Added option '--title STRING' to give the HTML report a different title

    Leave a comment:


  • fkrueger
    replied
    Hi Diego,
    You used the bedGraph file as input but you need to feed it the coverage file because this is the one that contains the counts for methylated and unmethylated calls (the bedGraph file only contains a percentage methylation). Cheers, Felix

    Leave a comment:


  • fkrueger
    replied
    Sorry it is quite late now over here, I'll take a look tomorrow.

    Leave a comment:


  • dzavallo
    replied
    Hi Felix, sorry to bother you again. I ran the bedGraph2coverage script in order to use the output as input for DMRcaller program from Bioconductor


    Code:
    coverage2cytosine --CX --split_by_chromosome --genome_folder /home/zavallo.diego/Downloads/Arabidopsis/Arabidopsis_Genome/Arabidopsis_genome/ CG1_paired_bismark_bt2_pe.bedGraph.gz -o bedGraph2Coverage_CG
    It generates the files, but it looks like it doesnt make the methylation calling. Also during the run, these "errors" appears several times:

    Use of uninitialized value $nonmeth in join or string at /tools/bin/coverage2cytosine line 495, <IN> line 315.
    Use of uninitialized value $meth in join or string at /tools/bin/coverage2cytosine line 495, <IN> line 315.

    and also:

    Use of uninitialized value $nonmeth in join or string at /tools/bin/coverage2cytosine line 660, <IN> line 386.
    Use of uninitialized value $meth in join or string at /tools/bin/coverage2cytosine line 660, <IN> line 386.

    The program end well, but I can't "see" the coverage

    take a look at this:

    PHP Code:
    Chr3    157845    +    0    0    CHH    CAC
    Chr3    1157845    
    +    0    0    CHH    CCA
    Chr3    1578453    
    -    0    0    CHH    CTT
    Chr3    1578455    
    -    0    0    CHH    CAC
    Chr3    1578456    
    +    0    0    CHH    CAT
    Chr3    5157845    
    -    0    0    CHH    CAA
    Chr3    7157845    
    +    0    0    CG    CGT
    Chr3    10157845    
    -    0    0    CHH    CCA
    Chr3    11157845    
    -    0    0    CHH    CAA
    Chr3    11578450    
    -    0    0    CHH    CAT
    Chr3    11578453    
    -    0    0    CHH    CTT
    Chr3    11578459    
    +    0    0    CHH    CTT
    Chr3    15784500    
    -    0    0    CG    CGG
    Chr3    15784505    
    +    0    0    CHH    CTC
    Chr3    15784507    
    +            CHH    CCC
    Chr3    15784508    
    +            CHH    CCC
    Chr3    15784509    
    +            CHH    CCT
    Chr3    15784510    
    +    0    0    CHH    CTA
    Chr3    15784513    
    -    0    0    CHH    CTA
    Chr3    15784514    
    -            CHH    CCT
    Chr3    15784515    
    +            CHH    CCC
    Chr3    15784516    
    +            CHG    CCG
    Chr3    15784517    
    +    0    0    CG    CGA
    Chr3    15784518    
    -    0    0    CG    CGG
    Chr3    15784522    
    +            CHH    CCA
    Chr3    15784523    
    +    0    0    CHH    CAC
    Chr3    15784525    
    +    0    0    CHH    CAT
    Chr3    15784528    
    +    0    0    CHH    CAT
    Chr3    15784531    
    -    0    0    CHH    CAT
    Chr3    15784540    
    -    0    0    CHH    CTA
    Chr3    15784544    
    +    0    0    CG    CGT
    Chr3    15784545    
    -    0    0    CG    CGA
    Chr3    15784547    
    +            CHH    CCC
    Chr3    15784548    
    +            CHH    CCA
    Chr3    15784549    
    +    0    0    CHH    CAA
    Chr3    15784552    
    -            CHH    CTT
    Chr3    15784553    
    +    0    0    CG    CGT
    Chr3    15784554    
    -    0    0    CG    CGC
    Chr3    15784558    
    +    0    0    CHH    CTC
    Chr3    15784560    
    +    0    0    CG    CGA
    Chr3    15784561    
    -    0    0    CG    CGA
    Chr3    15784564    
    +    0    0    CHG    CTG
    Chr3    15784566    
    -            CHG    CAG
    Chr3    15784567    
    +            CHG    CCG
    Chr3    15784568    
    +    0    0    CG    CGA
    Chr3    15784569    
    -    0    0    CG    CGG
    Chr3    15784571    
    +            CHH    CCT
    Chr3    15784572    
    +    0    0    CHH    CTC
    Chr3    15784574    
    +    0    0    CHH    CTA
    Chr3    15784578    
    -            CHH    CTT
    Chr3    15784579    
    +    0    0    CHH    CTT
    Chr3    15784582    
    +    0    0    CG    CGC
    Chr3    15784583    
    -            CG    CGA
    Chr3    15784584    
    +            CHH    CCT
    Chr3    15784585    
    +    0    0    CHH    CTA
    Chr3    15784588    
    -    0    0    CHH    CTA
    Chr3    15784589    
    -    0    0    CHH    CCT
    Chr3    15784590    
    -    0    0    CHH    CCC
    Chr3    15784592    
    +    0    0    CHH    CTT
    Chr3    15784597    
    +            CHH    CCA
    Chr3    15784598    
    +    0    0    CHH    CAA
    Chr3    17157845    
    +    0    0    CHG    CTG
    Chr3    19157845    
    +    0    0    CHH    CTT
    Chr3    20157845    
    +    0    0    CG    CGA
    Chr3    21157845    
    +    0    0    CHH    CCT
    Chr3    21578454    
    +    0    0    CHH    CTC
    Chr3    21578456    
    +    0    0    CHH    CAA
    Chr3    21578459    
    -    0    0    CHH    CTT 
    Are these "gaps" (no 0) ok?
    I think that my bedGraph file is ok.
    What do you think?

    Thanks

    Diego
    Attached Files
    Last edited by dzavallo; 07-11-2016, 12:11 PM.

    Leave a comment:


  • dzavallo
    replied
    Hi Felix, you were right! I hope you bet a lot of money!
    I talk to my coworker and to the people at our sequencing facility and we realize that we use the same index for both libraries. Most of the sequences belong to his bug

    Thanks again for all your help

    Cheers!

    Diego

    Leave a comment:

Latest Articles

Collapse

  • seqadmin
    Essential Discoveries and Tools in Epitranscriptomics
    by seqadmin




    The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
    04-22-2024, 07:01 AM
  • seqadmin
    Current Approaches to Protein Sequencing
    by seqadmin


    Proteins are often described as the workhorses of the cell, and identifying their sequences is key to understanding their role in biological processes and disease. Currently, the most common technique used to determine protein sequences is mass spectrometry. While still a valuable tool, mass spectrometry faces several limitations and requires a highly experienced scientist familiar with the equipment to operate it. Additionally, other proteomic methods, like affinity assays, are constrained...
    04-04-2024, 04:25 PM

ad_right_rmr

Collapse

News

Collapse

Topics Statistics Last Post
Started by seqadmin, Yesterday, 11:49 AM
0 responses
13 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-24-2024, 08:47 AM
0 responses
16 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-11-2024, 12:08 PM
0 responses
61 views
0 likes
Last Post seqadmin  
Started by seqadmin, 04-10-2024, 10:19 PM
0 responses
60 views
0 likes
Last Post seqadmin  
Working...
X