Hello,
I'm new to bioinformatics, and as a first training exercise I've been given a set of WGBS 100 bp PE reads from a few human cancer tissues.
I filtered the reads with prinseq, sorted them, and aligned them with bismark in PE mode to hg38 from UCSC (prepared with bismark).
Mapping efficiency is ~20%, with ~80% of C's methylated in CpG context.
OK, low mappability of reads from BS-treated DNA has been mentioned many times.
Then I tried mapping reads 1 and 2 separately in SE mode.
Read 1: mapping efficiency ~60%, with ~80% of C's methylated in CpG context.
Read 2: mapping efficiency ~50%, with ~40% of C's methylated in CpG context.
Additional trimming of 10-20 nt from either end of read 2 slightly increases mappability, but doesn't affect the methylation rate.
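For reference, the CpG percentages above are of the kind bismark reports: methylated CpG calls divided by all CpG calls. Below is a minimal sketch of that arithmetic, assuming the calls are read from the XM tag strings in bismark's alignment output (Z = methylated CpG, z = unmethylated CpG). The function name and the example strings are made up for illustration and are not real data.

```python
def cpg_methylation_percent(xm_strings):
    """Percent of CpG calls that are methylated across a set of XM strings.

    Assumes bismark-style methylation call strings, where 'Z' marks a
    methylated C in CpG context and 'z' an unmethylated one. Other
    characters (CHG/CHH calls, '.') are ignored here.
    """
    methylated = 0
    unmethylated = 0
    for xm in xm_strings:
        methylated += xm.count("Z")
        unmethylated += xm.count("z")
    total = methylated + unmethylated
    if total == 0:
        return None  # no CpG calls at all, so the rate is undefined
    return 100.0 * methylated / total


# Hypothetical call strings, one list per mate, only to show the calculation.
read1_calls = ["..Z...Z..h..Z.z", "Z....x..Z"]
read2_calls = ["..z...Z..h..z.", "z....Z"]

for name, calls in (("read 1", read1_calls), ("read 2", read2_calls)):
    print(name, cpg_methylation_percent(calls))
```

Running the same count separately on read 1 and read 2 alignments is one way to confirm that the difference is in the calls themselves and not in how the report summarises them.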
This result seems extremely odd to me.
If the DNA was treated with BS, how can it happen that only read 2 of the pair shows half as much methylation in CpG context?
Can anybody take a fresh look at this?
Thank you in advance.