
**natstreet** · 09-07-2010, 12:02 PM

I tried using the new version on some reads that I had previously aligned using an older version of bismark and ran into problems.

I started by building fresh Bowtie indexes, for two reasons. I previously had each chromosome in a separate file, and the new version of the genome preparation program can handle MFA files (which is great, thanks). My reference also contains non-ACGTN characters, which I previously had to convert but no longer need to (again, great).

I then tried running Bismark and get errors of this type for every read that gets aligned:

Chromosomal sequence could not be extracted for FC704H7AAXX:5:3:3219:9818#0 Chr2 1818938
Use of uninitialized value in substr at /usr/local/bin/bismark line 1381, <$__ANONIO__> line 395162.
substr outside of string at /usr/local/bin/bismark line 1381, <$__ANONIO__> line 395162.
Use of uninitialized value in transliteration (tr///) at /usr/local/bin/bismark line 1778, <$__ANONIO__> line 395162.
Use of uninitialized value in substr at /usr/local/bin/bismark line 1392, <$__ANONIO__> line 395162.
substr outside of string at /usr/local/bin/bismark line 1392, <$__ANONIO__> line 395162.

Here is a copy of the commands used:

Code:

bismark_genome_preparation --verbose /data/bismark/

with /data/bismark containing a file soft-linked to my reference fasta file. This ran fine with no errors.

Code:

bismark --chunkmbs 512 -q --phred64-quals -n 0 -l 32 /data/bismark/ -1 cobl-5_1_1.fq,cobl-5_2_1.fq -2 cobl-5_1_2.fq,cobl-5_2_2.fq

I tried without --chunkmbs (because it's the only thing I changed from my previous runs) but get the same errors. The fastq files are the same ones that previously worked fine.

Any ideas where I'm going wrong?

**fkrueger** · 09-08-2010, 01:57 AM

This problem was caused by the MFA file. I have hotfixed it now and hope it will work fine!

**natstreet** · 09-08-2010, 10:37 AM

I can confirm that the hotfix works and the new version is now working great.

**foxyg** · 09-08-2010, 11:26 AM

What kind of data is good for this software? If I have exon sequencing data from the Illumina pipeline, would it make sense to run it with your software?

**natstreet** · 09-08-2010, 11:32 AM

Originally posted by foxyg:

What kind of data is good for this software? If I have exon sequencing data from the Illumina pipeline, would it make sense to run it with your software?

The software is for mapping bisulfite-treated sequencing data to examine methylation. Exon sequence data requires a completely different mapping approach. I would look into TopHat and Cufflinks or something similar.

**fkrueger** · 09-08-2010, 12:40 PM

A quick overview of Bismark can be found here.

**fkrueger** · 09-13-2010, 06:47 AM

Bismark v0.2.2 has just been released which fixes a bug in the methylation extractor whereby the positions of some cytosines were offset by a few base pairs (this affected some cytosines from reverse-mapped reads in single-end mapping mode). Sorry for any inconvenience caused.

**Vivek Todur** · 04-17-2013, 11:53 PM

Originally posted by fkrueger:

This problem was caused by the MFA file. I have hotfixed it now and hope it will work fine!

I am facing the same problem too. Could you please post what the problem with the FASTA file was and how you fixed it?

Many thanks in advance.

**fkrueger** · 04-18-2013, 12:33 AM

Sorry, this post is nearly 3 years old and I seem to have forgotten the exact details... What exactly is the problem you are seeing? And which version of Bismark are you using?

One thing that springs to mind about multi-FASTA files for alignments is that Bismark expects them with alternating header and sequence lines, such as:

>1
CATGATCGAACCT
>2
AAATTTTGTTTATTTTT
...

It won't work if sequences span multiple lines, such as this:

>1
CTAGCTAG
GCAAAAAA
TTGGGTAA
>2
AAATTTTG
TTTTATTT
TTTT
...
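
If a reference does need to be brought into the alternating header/sequence layout described above, a minimal Python sketch could look like the following. `linearize_fasta` is a hypothetical helper written for illustration; it is not part of Bismark, and it assumes every record starts with a `>` header line.

```python
def linearize_fasta(lines):
    """Yield header and sequence lines alternately, joining wrapped sequence lines.

    Assumption: records start with a '>' header. Sequence lines that appear
    before the first header are ignored.
    """
    seq_parts = []
    seen_header = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if seen_header:
                # Emit the sequence collected for the previous record.
                yield "".join(seq_parts)
            seq_parts = []
            seen_header = True
            yield line
        else:
            seq_parts.append(line)
    if seen_header:
        # Emit the sequence of the last record.
        yield "".join(seq_parts)


# The multi-line example from the post above.
wrapped = [">1", "CTAGCTAG", "GCAAAAAA", "TTGGGTAA",
           ">2", "AAATTTTG", "TTTTATTT", "TTTT"]
single_line = list(linearize_fasta(wrapped))
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly and the result written out line by line, which avoids holding a whole genome in memory twice.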

**Vivek Todur** · 04-18-2013, 01:44 AM

Thanks for replying fkrueger,

I am using the latest version of Bismark, v0.7.10, and I am getting an error message exactly like this:
Chromosomal sequence could not be extracted for FC704H7AAXX:5:3:3219:9818#0 Chr2 1818938

Currently I am using a multi-line FASTA file. As per your suggestion, I will convert the reference to single-line FASTA.

Thanks again....

**fkrueger** · 04-18-2013, 03:14 AM

This is actually not an error but a warning message, and it has nothing to do with the type of file you are using. It simply means that a read aligned to the very end of a chromosome, and Bismark could not extract a further 2 bp from the end of the chromosome simply because there are no further 2 bp. It is normally safe to just ignore these warnings.

Best,
Felix
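
The situation described in this reply can be illustrated with a short Python sketch. This is not Bismark's actual code (Bismark is written in Perl); `extract_with_context` is a hypothetical function, and the 0-based `start` and the `extra=2` default are assumptions chosen to mirror the "further 2 bp" mentioned above.

```python
def extract_with_context(chrom_seq, start, read_length, extra=2):
    """Return the aligned stretch plus `extra` trailing bases, or None.

    None signals that the window runs past the end of the chromosome,
    which is the case where a warning rather than a hard error makes sense.
    `start` is a 0-based offset (an assumption of this sketch).
    """
    end = start + read_length + extra
    if start < 0 or end > len(chrom_seq):
        return None
    return chrom_seq[start:end]


chrom = "ACGTACGTAC"  # toy 10 bp "chromosome"
inside = extract_with_context(chrom, 0, 4)   # window fits
at_end = extract_with_context(chrom, 6, 4)   # read reaches the last base
```

Here the second call returns `None` because the read already ends at the last base, so no additional 2 bp exist to extract.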

**momokenken** · 05-26-2013, 07:59 AM

I tried to process a sorted bedGraph file that contains only CHH contexts using "genome_methylation_bismark2bedGraph_v5.pl". However, the output also included other contexts such as CHG and CG with methylation information, and that information was not consistent with the input file. The bedGraph file and the resulting processed file are shown below.
The options I used were --CX and --genome_folder. The version of Bismark is 0.7.8.
Any ideas where I'm going wrong?

The part of the bedGraph file that only contains CHH contexts
chr1 3003874 3003874 0 0 19
chr1 3003875 3003875 0 0 19
chr1 3003884 3003884 0 0 21
chr1 3003889 3003889 9.52380952380952 2 19
chr1 3003892 3003892 0 0 21
chr1 3003893 3003893 0 0 21
chr1 3003895 3003895 23.8095238095238 5 16
chr1 3003896 3003896 0 0 11
chr1 3003903 3003903 7.69230769230769 1 12
chr1 3003908 3003908 0 0 12
chr1 3003910 3003910 0 0 12
chr1 3003911 3003911 0 0 9
chr1 3003921 3003921 0 0 22
chr1 3003922 3003922 4.54545454545455 1 21
chr1 3003923 3003923 0 0 22

The part of the processed file by genome_methylation_bismark2bedGraph_v5.pl
chr1 3003874 + 0 0 CHH CCT
chr1 3003875 + 0 19 CHH CTT
chr1 3003881 + 0 0 CHG CAG
chr1 3003883 - 0 0 CHG CTG
chr1 3003884 - 0 0 CHH CCT
chr1 3003885 + 0 21 CG CGG
chr1 3003886 - 0 0 CG CGC
chr1 3003887 - 0 0 CHG CCG
chr1 3003889 - 0 0 CHH CTC
chr1 3003892 - 0 0 CHH CTT
chr1 3003893 - 0 21 CHH CCT
chr1 3003895 - 0 0 CHH CAC
chr1 3003896 + 5 16 CHH CCC
chr1 3003897 + 0 11 CHG CCG
chr1 3003898 + 0 0 CG CGG
chr1 3003899 - 0 0 CG CGG
chr1 3003900 - 0 0 CHG CCG
chr1 3003903 - 0 0 CHH CAT
chr1 3003905 + 0 0 CHG CTG
chr1 3003907 - 0 0 CHG CAG
chr1 3003908 - 0 0 CHH CCA
chr1 3003910 - 0 0 CHH CTC
chr1 3003911 + 0 12 CHH CCT
chr1 3003912 + 0 9 CHG CTG
chr1 3003914 - 0 0 CHG CAG
chr1 3003918 + 0 0 CHG CAG
chr1 3003920 - 0 0 CHG CTG
chr1 3003921 - 0 0 CHH CCT
chr1 3003922 - 0 22 CHH CCC
chr1 3003923 - 1 21 CHH CCC

**fkrueger** · 05-26-2013, 10:40 AM

Hi momokenken,

To me the output you linked looks just fine, but you have to note a couple of things:

- The bedGraph output is 0-based, whereas the genome-wide cytosine methylation report (the last format) uses 1-based coordinates (as does Bismark itself). Thus, you need to add 1 to all bedGraph coordinates to get the cytosine report coordinates.
- The methylation extractor offers the options CpG-only or all cytosine contexts, i.e. CG, CHG and CHH combined. There is no CHH context-only format unless you filter it out specifically. Thus the full cytosine output contains Cs in many different contexts.
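
The coordinate shift in the first point can be checked against the lines posted above with a small Python sketch. `bedgraph_to_report_key` is a hypothetical helper; it assumes the six whitespace-separated columns seen in the posted bedGraph excerpt (chromosome, start, end, percentage, methylated count, unmethylated count), which may differ in other Bismark versions.

```python
def bedgraph_to_report_key(line):
    """Convert one 0-based bedGraph line to 1-based report coordinates.

    Assumed columns (taken from the excerpt posted above):
    chrom, start, end, percent methylation, methylated, unmethylated.
    Returns (chrom, 1-based position, methylated, unmethylated).
    """
    chrom, start, _end, _percent, meth, unmeth = line.split()
    return chrom, int(start) + 1, int(meth), int(unmeth)


# A bedGraph line from the post above.
key = bedgraph_to_report_key("chr1 3003895 3003895 23.8095238095238 5 16")
```

The result, `("chr1", 3003896, 5, 16)`, lines up with the report line `chr1 3003896 + 5 16 CHH CCC` from the post, which is consistent with the offset of 1 described above.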

Finally, may I ask you to install the latest version (v0.7.12), which offers quite a few new features for the methylation extraction, bedGraph and cytosine report? In addition to being a LOT quicker than older versions, Bismark now comes with the modules bismark2bedGraph and bedGraph2cytosine that replace any older versions of these scripts. Both of them work either from within the methylation extractor or as stand-alone tools. If you have further questions you can also contact me directly via email.

Cheers, Felix

**oria34** · 07-02-2013, 05:51 AM

You may also note that the methylation report also contains the cytosines with no coverage at all. Those will never appear in the bedGraph file.

You can filter the bedGraph file that contains all the cytosine methylation averages using bedtools (in combination with the report file) plus the grep command.

Cheers
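
As an alternative to the bedtools and grep route, the same kind of filtering can be sketched in plain Python. `filter_report` is a hypothetical helper, not a Bismark tool; it assumes the seven whitespace-separated columns seen in the report excerpt posted earlier in the thread (chromosome, position, strand, methylated count, unmethylated count, context, trinucleotide).

```python
def filter_report(lines, context="CHH", covered_only=True):
    """Keep cytosine report lines of one context, optionally only covered ones.

    Assumed columns (from the excerpt posted earlier in the thread):
    chrom, position, strand, methylated, unmethylated, context, trinucleotide.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) != 7:
            continue  # skip blank or malformed lines
        meth, unmeth, ctx = int(fields[3]), int(fields[4]), fields[5]
        if ctx != context:
            continue
        if covered_only and meth + unmeth == 0:
            continue  # cytosine with no coverage at all
        kept.append(line)
    return kept


# Report lines from the post earlier in the thread.
report = [
    "chr1 3003874 + 0 0 CHH CCT",
    "chr1 3003875 + 0 19 CHH CTT",
    "chr1 3003885 + 0 21 CG CGG",
    "chr1 3003896 + 5 16 CHH CCC",
]
chh_covered = filter_report(report)
```

With the defaults, only the two covered CHH lines remain; passing `covered_only=False` keeps the uncovered CHH cytosine as well.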

Bismark Bisulfite Aligner - Now supporting CpG, CHG and CHH context
