Hi all,
I am learning WGBS analysis using Bismark ver1.9.
I am running into a low mapping efficiency problem. In paired-end (PE) mode the mapping efficiency is only 1.8%, but when I map either of the two read files on its own in single-end (SE) mode, I get 88% mapping efficiency.
My sample is not PBAT.
I can't solve this problem by myself. Could anyone help?
My procedure is as follows.
1. Remove poor-quality reads.
2. Remove adapter sequences.
3. Convert the hg19 reference genome with bismark_genome_preparation.
4. Map with bismark in either PE mode or SE mode.
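Because steps 1 and 2 filter reads, it may be worth confirming that R1.fastq and R2.fastq still hold the same reads in the same order before step 4. The following is only a rough sketch of such a check, not part of Bismark: the helper names `fastq_ids` and `check_pairs` are made up for illustration, and it assumes plain 4-line FASTQ records whose read IDs end at the first space or at a trailing `/1` or `/2`.

```shell
# Hypothetical helper: print one read ID per FASTQ record.
# Assumes 4-line records; strips the leading "@" and a trailing /1 or /2.
fastq_ids() {
    awk 'NR % 4 == 1 { id = substr($1, 2); sub(/\/[12]$/, "", id); print id }' "$1"
}

# Hypothetical helper: report whether two FASTQ files list the same
# read IDs in the same order.
check_pairs() {
    r1_ids=$(mktemp)
    r2_ids=$(mktemp)
    fastq_ids "$1" > "$r1_ids"
    fastq_ids "$2" > "$r2_ids"
    if cmp -s "$r1_ids" "$r2_ids"; then
        echo "in sync"
    else
        echo "out of sync"
    fi
    rm -f "$r1_ids" "$r2_ids"
}

# Example call (file names taken from the commands in this post):
#   check_pairs R1.fastq R2.fastq
```

If this prints "out of sync", the two files no longer pair up read for read, which is one possible (not confirmed) reason for a gap between PE and SE results.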
PE mode
Code:
bismark -q --bowtie2 -N 0 -L 20 -u 10000 -X 2000 --score_min L,0,-0.6 /refgenome --1 R1.fastq --2 R2.fastq --sam -o ./bismark_result
Sequence pairs analysed in total: 10000
Number of paired-end alignments with a unique best hit: 175
Mapping efficiency: 1.8%
Sequence pairs with no alignments under any condition: 9817
Sequence pairs did not map uniquely: 8
Sequence pairs which were discarded because genomic sequence could not be extracted: 0
Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT: 78 ((converted) top strand)
GA/CT/CT: 0 (complementary to (converted) top strand)
GA/CT/GA: 0 (complementary to (converted) bottom strand)
CT/GA/GA: 97 ((converted) bottom strand)
Number of alignments to (merely theoretical) complementary strands being rejected in total: 0
Final Cytosine Methylation Report
=================================
Total number of C's analysed: 6079
Total methylated C's in CpG context: 201
Total methylated C's in CHG context: 7
Total methylated C's in CHH context: 23
Total methylated C's in Unknown context: 0
Total unmethylated C's in CpG context: 157
Total unmethylated C's in CHG context: 1271
Total unmethylated C's in CHH context: 4420
Total unmethylated C's in Unknown context: 14
C methylated in CpG context: 56.1%
C methylated in CHG context: 0.5%
C methylated in CHH context: 0.5%
C methylated in unknown context (CN or CHN): 0.0%
=====================
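As a sanity check on the PE report, the headline figure can be recomputed from the counts it lists. This is a small sketch using the numbers above; the helper name `pct` and the variable names are illustrative only, and it assumes mapping efficiency is simply unique best hits divided by pairs analysed.

```shell
# Hypothetical helper: print 100 * $1 / $2 rounded to one decimal place.
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", 100 * n / d }'
}

# Counts copied from the PE report in this post.
pairs_total=10000
unique_hits=175
no_alignment=9817
ambiguous=8

pct "$unique_hits" "$pairs_total"     # mapping efficiency in percent
pct "$no_alignment" "$pairs_total"    # share of pairs with no alignment at all

# The three categories should add up to the number of pairs analysed.
echo $(( unique_hits + no_alignment + ambiguous ))
```

The three categories add up to the 10000 pairs analysed, so almost all of the loss is pairs that found no alignment at all rather than pairs that mapped ambiguously.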
SE mode
Code:
bismark -q --bowtie2 -N 0 -L 20 --score_min L,0,-0.6 /refgenome --se R1.fastq --sam -o ./bismark_result
Sequences analysed in total: 3014078
Number of alignments with a unique best hit from the different alignments: 2664742
Mapping efficiency: 88.4%
Sequences with no alignments under any condition: 107753
Sequences did not map uniquely: 241583
Sequences which were discarded because genomic sequence could not be extracted: 10
Number of sequences with unique best (first) alignment came from the bowtie output:
CT/CT: 1329899 ((converted) top strand)
CT/GA: 1334833 ((converted) bottom strand)
GA/CT: 0 (complementary to (converted) top strand)
GA/GA: 0 (complementary to (converted) bottom strand)
Number of alignments to (merely theoretical) complementary strands being rejected in total: 0
Final Cytosine Methylation Report
=================================
Total number of C's analysed: 43696755
Total methylated C's in CpG context: 1442578
Total methylated C's in CHG context: 29761
Total methylated C's in CHH context: 110235
Total methylated C's in Unknown context: 654
Total unmethylated C's in CpG context: 395138
Total unmethylated C's in CHG context: 9033660
Total unmethylated C's in CHH context: 32685383
Total unmethylated C's in Unknown context: 13481
C methylated in CpG context: 78.5%
C methylated in CHG context: 0.3%
C methylated in CHH context: 0.3%
C methylated in Unknown context (CN or CHN): 4.6%
===================================
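The per-context methylation percentages in the SE report can be reproduced from the raw counts in the same way. The sketch below assumes the percentage is methylated calls divided by methylated plus unmethylated calls in that context; `meth_pct` is an illustrative helper name, not a Bismark command.

```shell
# Hypothetical helper: percent methylation from methylated ($1) and
# unmethylated ($2) call counts, rounded to one decimal place.
meth_pct() {
    awk -v m="$1" -v u="$2" 'BEGIN { printf "%.1f\n", 100 * m / (m + u) }'
}

# Counts copied from the SE report in this post.
meth_pct 1442578 395138      # CpG context
meth_pct 29761 9033660       # CHG context
meth_pct 110235 32685383     # CHH context
meth_pct 654 13481           # Unknown context (CN or CHN)
```

Running the same helper on the PE counts (for example `meth_pct 201 157` for CpG) makes it easy to compare the two runs, keeping in mind that the PE values rest on only 175 aligned pairs.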
Thanks a lot,
Taiki