Bismark - A New Tool for Mapping and Analysis of Bisulfite-Seq Data

fkrueger replied

06-09-2013, 02:30 AM
Originally posted by Julien Roux View Post

Hi Felix
Actually, I was running Bismark on a cluster and had requested 32 GB of memory. I have now run it with 48 GB and I am still encountering the same issue...
Do you have any idea what else could be going wrong? No error message is printed in the report files.
Thanks
Julien

This is somewhat odd. You should not need more than about 12 GB of memory. Perhaps your cluster farms out one of the alignment threads and Bismark can no longer read from it? Could you try to log in to a compute node and run it locally there (or on some other machine, for that matter)?

If you could send me a few reads (e.g. the first 100K or so) via email, I could take a quick look at what is going on.
fkrueger replied

06-09-2013, 02:25 AM
@pengchy,
It will probably work fine for the singleton reads, but this won't guarantee the strandedness of paired-end reads, i.e. read 1 and read 2 would most likely be regarded as coming from OT and CTOT (instead of OT only) or, similarly, from OB and CTOB (instead of OB only). In terms of the methylation information it should still be fine, though.

To be on the safe side I would probably still treat them separately and then merge the results at the level of the final methylation extractor output.
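
Merging at the level of the extractor output could look roughly like the sketch below. It assumes the single-end and paired-end results have already been reduced to per-cytosine (methylated, unmethylated) counts; the dictionaries, keys and function names are hypothetical and are not part of Bismark.

```python
def merge_counts(*count_tables):
    """Sum (methylated, unmethylated) counts per (chromosome, position) key."""
    merged = {}
    for table in count_tables:
        for key, (meth, unmeth) in table.items():
            m, u = merged.get(key, (0, 0))
            merged[key] = (m + meth, u + unmeth)
    return merged


def percent_methylated(meth, unmeth):
    """Percentage of methylated calls, or None if the position has no calls."""
    total = meth + unmeth
    return 100.0 * meth / total if total else None


# Hypothetical per-cytosine counts from the separate PE and SE runs.
paired = {("chr1", 1001): (8, 2), ("chr1", 1050): (0, 5)}
single = {("chr1", 1001): (1, 1), ("chr2", 77): (3, 0)}

merged = merge_counts(paired, single)
for key in sorted(merged):
    meth, unmeth = merged[key]
    print(key, meth, unmeth, percent_methylated(meth, unmeth))
```

Summing raw counts rather than averaging percentages keeps positions with deeper coverage weighted correctly.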
pengchy replied

06-09-2013, 01:04 AM
Hi fkrueger,

When I use bismark_methylation_extractor, I need to specify single-end or paired-end mode. I filtered the original reads with Trimmomatic, which produced paired-end and single-end reads simultaneously. Is it possible to merge the two BAM files with samtools and then run bismark_methylation_extractor with the -p parameter?

Thank you.
P
Julien Roux replied

06-08-2013, 07:27 PM
Hi Felix
Actually, I was running Bismark on a cluster and had requested 32 GB of memory. I have now run it with 48 GB and I am still encountering the same issue...
Do you have any idea what else could be going wrong? No error message is printed in the report files.
Thanks
Julien
fkrueger replied

06-08-2013, 07:36 AM
Hi Julien,
This does indeed look like the first instance of Bowtie (OT) is running out of memory... Does your machine have fairly little RAM, or are you running many instances of Bismark concurrently? Could you run the analysis on a more powerful machine and see what happens there?
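
A quick way to spot affected samples is to scan each report for a strand that received no alignments at all. The sketch below is only an illustration: it assumes the strand lines look like the ones in the report quoted in this thread, and the function names are made up for the example.

```python
import re

# Matches report lines such as "CT/GA:  19828727  ((converted) bottom strand)".
STRAND_LINE = re.compile(r"^(CT/CT|CT/GA|GA/CT|GA/GA):\s+(\d+)")


def strand_counts(report_text):
    """Collect the per-strand alignment counts from the text of a report."""
    counts = {}
    for line in report_text.splitlines():
        match = STRAND_LINE.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def empty_strands(counts, directional=True):
    """Return the expected strands that received no alignments at all."""
    if directional:
        expected = ["CT/CT", "CT/GA"]
    else:
        expected = ["CT/CT", "CT/GA", "GA/CT", "GA/GA"]
    if sum(counts.get(name, 0) for name in expected) == 0:
        return []  # nothing aligned anywhere, which is a different problem
    return [name for name in expected if counts.get(name, 0) == 0]


sample = """\
CT/CT:  0       ((converted) top strand)
CT/GA:  19828727        ((converted) bottom strand)
GA/CT:  0       (complementary to (converted) top strand)
GA/GA:  0       (complementary to (converted) bottom strand)
"""

print(empty_strands(strand_counts(sample)))
```

For a directional library, an empty CT/CT or CT/GA entry next to a large count on the other strand is a hint that one of the alignment instances did not finish.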

Julien Roux replied

06-08-2013, 06:17 AM

Thanks Felix for your answer.

I am now facing another problem: when I run Bismark on some of my samples, reads end up mapped to only the top or only the bottom strand... This does not happen for all samples, but it happens repeatedly for the same samples. Is this a memory issue?

Here is an example report file:

Code:

Bismark report for: ./C3K1_trimmed.fq.gz (version: v0.7.12)
Option '--directional' specified: alignments to complementary strands will be ignored (i.e. not performed!)
Bowtie was run against the bisulfite genome of panTro3_nonrandom+Lambda_prepared_bismark/ with the specified options: -q --phred64-quals -n 1 -k 2 --best --chunkmbs 512

Final Alignment report
======================
Sequences analysed in total:    50619505
Number of alignments with a unique best hit from the different alignments:      19828727
Mapping efficiency:     39.2%
Sequences with no alignments under any condition:       23787863
Sequences did not map uniquely: 7002915
Sequences which were discarded because genomic sequence could not be extracted: 0

Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT:  0       ((converted) top strand)
CT/GA:  19828727        ((converted) bottom strand)
GA/CT:  0       (complementary to (converted) top strand)
GA/GA:  0       (complementary to (converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total:     0

Final Cytosine Methylation Report
=================================
Total number of C's analysed:   185057365

Total methylated C's in CpG context:     6829013
Total methylated C's in CHG context:    216805
Total methylated C's in CHH context:    741531

Total C to T conversions in CpG context:        2672385
Total C to T conversions in CHG context:        41885198
Total C to T conversions in CHH context:        132712433

C methylated in CpG context:    71.9%
C methylated in CHG context:    0.5%
C methylated in CHH context:    0.6%

Thanks for your help
Julien
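
As a sanity check, the summary figures in a report like this can be recomputed from the raw counts. The sketch below uses the numbers from the report above and assumes that each context percentage is methylated / (methylated + unmethylated), which reproduces the reported values; the variable names are chosen for the example only.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place, as shown in the report."""
    return round(100.0 * part / whole, 1)


# Counts copied from the report quoted in this post.
total_reads = 50619505
unique_hits = 19828727
no_alignment = 23787863
not_unique = 7002915

methylated = {"CpG": 6829013, "CHG": 216805, "CHH": 741531}
unmethylated = {"CpG": 2672385, "CHG": 41885198, "CHH": 132712433}

mapping_efficiency = pct(unique_hits, total_reads)
context_pct = {
    ctx: pct(methylated[ctx], methylated[ctx] + unmethylated[ctx])
    for ctx in methylated
}
total_cs = sum(methylated.values()) + sum(unmethylated.values())

print("Mapping efficiency:", mapping_efficiency)
print("Methylation per context:", context_pct)
print("Total C's analysed:", total_cs)
```

The read categories and the cytosine totals add up, so the report itself is internally consistent; the anomaly is confined to the strand breakdown.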


fkrueger replied

05-30-2013, 10:27 AM
Originally posted by shadow19c View Post

Hello,
I have a question about a Bismark option, concerning the Bowtie 2 reporting option --most_valid_alignments.
If I'm not wrong, the -M option is no longer available in Bowtie 2, so how can you ask the program to keep only valid alignments that are also unique?

Bismark determines whether there are any other alignments with the same alignment score. If there are, the read is not unique and is discarded; otherwise it is kept.
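
The rule described above can be sketched as follows. This is a simplified illustration, not Bismark's actual code: it assumes the alignment scores of all candidate alignments for a read are available as plain integers (higher is better), and the function name is hypothetical.

```python
def classify_read(alignment_scores):
    """Classify a read from the scores of its candidate alignments.

    Returns "unaligned" when there are no candidates, "ambiguous" when the
    best score is shared by more than one alignment, and "unique" otherwise.
    """
    if not alignment_scores:
        return "unaligned"
    best = max(alignment_scores)
    if alignment_scores.count(best) > 1:
        return "ambiguous"  # a tie for the best score: the read is discarded
    return "unique"


# Hypothetical scores for three reads.
print(classify_read([-6, -12]))  # one clear best alignment
print(classify_read([-6, -6]))   # two equally good alignments
print(classify_read([]))         # no alignment found
```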
shadow19c replied

05-30-2013, 05:47 AM
Hello,
I have a question about a Bismark option, concerning the Bowtie 2 reporting option --most_valid_alignments.
If I'm not wrong, the -M option is no longer available in Bowtie 2, so how can you ask the program to keep only valid alignments that are also unique?
fkrueger replied

05-15-2013, 12:47 AM
Hi pengchy,

To 1) It is true that Bismark appends segment numbers to the end of the read ID. This is because Bowtie and Bowtie 2 tend to delete these tags internally while aligning and, to make matters more difficult, they don't do it in the same way. To properly keep track of which read is which, I had to make this change (btw, white spaces and tab characters in the read ID are also replaced by _).
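
To illustrate the effect on read IDs, here is a minimal sketch. It is not Bismark's code: the function name is made up, and the exact replacement rule (one underscore per whitespace character) is an assumption.

```python
import re


def tag_read_id(read_id, mate):
    """Replace whitespace with underscores and append the mate number."""
    cleaned = re.sub(r"\s", "_", read_id)  # assumed: one "_" per whitespace character
    return f"{cleaned}/{mate}"


read1_id = "HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1"

# Tagging both mates with the ID of read 1 reproduces the IDs reported
# from the BAM file in this thread.
print(tag_read_id(read1_id, 1))
print(tag_read_id(read1_id, 2))
```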

To 2) Bismark does not report singleton alignments for paired-end data but only reports paired alignments. In the Bismark help you can find:

Code:

--no-mixed This option disables Bowtie 2's behavior to try to find alignments for the individual mates if it cannot find a concordant or discordant alignment for a pair. This option is invariable and on by default.

--no-discordant Normally, Bowtie 2 looks for discordant alignments if it cannot find any concordant alignments. A discordant alignment is an alignment where both mates align uniquely, but that does not satisfy the paired-end constraints (--fr/--rf/--ff, -I, -X). This option disables that behavior and it is on by default.

If you wanted to look for singleton alignments for reads that do not produce valid paired-end alignments you could always write out unaligned reads and re-align them in single-end mode, but I would probably not advise doing this since comparing SE and PE alignments can have its own pitfalls.

To 3) To determine the sequence context of the cytosines in a read, Bismark extracts 2 extra base pairs at the start or the end of the read (where appropriate). If a read happens to align to the very end of a chromosome, Bismark can't extract 2 additional bp from the chromosomal sequence (because there is no more sequence), so it throws this warning message and moves on. This happens mostly for the MT, and it is normally fine to just ignore these warnings.
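
The reason for the 2 extra bp can be sketched as below. This is a simplified forward-strand illustration under stated assumptions, not Bismark's implementation: coordinates are 0-based, only downstream extension is shown, and the function names are hypothetical.

```python
def extract_with_context(chrom_seq, start, read_len, extra=2):
    """Return the genomic sequence under a read plus `extra` downstream bases.

    Returns None when the chromosome ends before the extra bases can be
    taken, which corresponds to a read being discarded with a warning.
    """
    end = start + read_len + extra
    if start < 0 or end > len(chrom_seq):
        return None
    return chrom_seq[start:end]


def cytosine_context(window, i):
    """Context of the C at index i, decided by the two following bases."""
    if window[i + 1] == "G":
        return "CpG"
    if window[i + 2] == "G":
        return "CHG"
    return "CHH"


chrom = "ACGTCAGCTTC"                       # toy chromosome, 11 bp
window = extract_with_context(chrom, 0, 5)  # read covers "ACGTC"
print(window, cytosine_context(window, 1), cytosine_context(window, 4))
print(extract_with_context(chrom, 6, 5))    # read reaches the chromosome end
```

The last C of the read at position 0 can only be classified because the window extends 2 bp past the read; for the read at position 6 there is no such sequence left.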
pengchy replied

05-14-2013, 07:23 PM
Originally posted by fkrueger View Post

We have just released a new version of Bismark (v0.6.4) to address a few minor issues.

The changes include:

- Adjusted the options -u and -s so that only the non-skipped part of the input file will be transcribed and analysed. This allows splitting up very large files into smaller chunks to allow parallel processing, e.g. -s 10000000 -u 20000000 would analyse sequences 10000001 to 20000000. The alignment report will be based on this reduced number of reads analysed.
- In paired-end mode, the options --unmapped and --ambiguous do now output unaligned or multiply aligned reads, respectively, to their correct output files as intended.
- Sequences in FastA format do now receive Phred score qualities of 40 throughout (ASCII 'I') to prevent the SAM to BAM conversion in SAMtools from failing.
- If a genomic sequence could not be extracted it will now also be counted and reported for use with Bowtie 1.
- Suppressed debugging warning messages that were printed in error for Bowtie 2 alignments (single-end mode only).
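
For the chunking use case in the first item, the -s/-u values for each chunk can be generated as in the sketch below. It follows the example given above (-s 10000000 -u 20000000 covers sequences 10000001 to 20000000); the helper name is hypothetical, and whether the first chunk needs an explicit -s 0 is left open.

```python
def chunk_options(total_reads, chunk_size):
    """Return (skip, upto) pairs that cover all reads in consecutive chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    skip = 0
    while skip < total_reads:
        upto = min(skip + chunk_size, total_reads)
        chunks.append((skip, upto))
        skip = upto
    return chunks


# Example: 25 million reads split into chunks of 10 million.
for skip, upto in chunk_options(25_000_000, 10_000_000):
    print(f"-s {skip} -u {upto}   (sequences {skip + 1} to {upto})")
```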

Bismark is available here.

Hi fkrueger,
In the report file of bismark, one line is:

Code:

Sequence pairs which were discarded because genomic sequence could not be extracted: 592

I can't understand this term. What do you mean by the genomic sequence could not be extracted?
Thank you.
pengchy replied

05-14-2013, 06:52 PM
Hi all,

I have two questions about Bismark.
1. The read IDs in the BAM file are not the same as in the original FASTQ file.
The original read IDs looked like this:

Code:

HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1
HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/2

After Bismark alignment, the read IDs in the BAM file looked like this:

Code:

HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1/1
HISEQ700708:127:C1LUKACXX:3:1101:1153:42732/1/2

2. The report file gives no information about how many reads were mapped with only one end of the paired-end data.
luuloi replied

05-14-2013, 09:09 AM
Originally posted by luuloi View Post

Hi Felix,
Can I run Bismark with Bowtie 1 in multi-threaded mode (the -p option) to speed it up? I did it with Bowtie 2, but as you mentioned, Bowtie 2 seems to be slower than Bowtie 1 in your experience. I have been waiting for 4 days and the .bam file is only 21M in size, so it is very slow. BTW, when will you release a multi-threaded Bismark? I am really looking forward to it. I have 14 WGBS samples to process.

It has been resolved, thanks a lot Felix! If anyone else encounters it, please just download the new version of Bismark, v0.7.12.
fkrueger replied

05-14-2013, 05:58 AM
If read 1 always aligns to the original strands you can just run it in default mode and do not need to specify --non_directional.
