Hi, I have a couple of questions regarding Bismark. I'm sorry if they are stupid questions; I am fairly new to WGBS data and Bismark analysis (and to bioinformatics in general).

We are researching non-model fish species. Their genomes recently became available in the NCBI database, but only at the scaffold assembly level, and they are still unannotated. Concurrently, there is another project in our laboratory to sequence our target organisms. I have been tasked with analyzing WGBS data from our target animal.

I already have the raw WGBS reads (library type: Accel Methyl-Seq DNA library), and I have completed the initial steps: alignment to the assembled genome (mapping efficiency: ~51%) and deduplication. The next step in the pipeline is methylation calling. My concern is that most of the papers I have read say that an annotated genome should be used for methylation calling, but our genome is still being annotated, so all we have is the assembly. Is it OK to use the unannotated assembly for methylation calling and for subsequent downstream analyses (DMRs/DMGs)?

So far I have tried running methylation calling on one sample (see the report below).
[script/commands that I used: bismark_methylation_extractor --comprehensive --bedGraph --gzip --scaffolds --cytosine_report --genome_folder]
After I ran methylation calling, I got several output files.
When I opened the [Sample]_bismark_bt2_pe.deduplicated.cytosine_context_summary.txt file, it was empty. However, [Sample]_pe.deduplicated.M-bias.txt and [Sample]_bismark_bt2_pe.deduplicated_splitting_report.txt did contain reports (a representative [Sample]_bismark_bt2_pe.deduplicated_splitting_report.txt is shown below):
[Sample]bismark_bt2_pe.deduplicated.bam
Parameters used to extract methylation information:
Bismark Extractor Version: v0.23.1
Bismark result file: paired-end (SAM format)
Output specified: comprehensive
No overlapping methylation calls specified
Processed 48784544 lines in total
Total number of methylation call strings processed: 97569088
Final Cytosine Methylation Report
Total number of C's analysed: 1891803435
Total methylated C's in CpG context: 136624254
Total methylated C's in CHG context: 1889811
Total methylated C's in CHH context: 5987407
Total C to T conversions in CpG context: 40920797
Total C to T conversions in CHG context: 415950132
Total C to T conversions in CHH context: 1290431034
C methylated in CpG context: 77.0%
C methylated in CHG context: 0.5%
C methylated in CHH context: 0.5%
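As a sanity check on the report above, the percentages can be recomputed from the raw counts. This is a minimal sketch in Python, assuming each percentage is methylated calls divided by all calls (methylated plus C-to-T conversions) in that context; the variable and function names are my own and not part of Bismark.

```python
# Counts copied from the splitting report above.
methylated = {"CpG": 136_624_254, "CHG": 1_889_811, "CHH": 5_987_407}
unmethylated = {"CpG": 40_920_797, "CHG": 415_950_132, "CHH": 1_290_431_034}


def percent_methylated(context):
    """Methylated calls as a percentage of all calls in one context."""
    m = methylated[context]
    u = unmethylated[context]
    return 100.0 * m / (m + u)


# All six counts together should equal "Total number of C's analysed".
total_cs = sum(methylated.values()) + sum(unmethylated.values())

# Round to one decimal place, as in the report.
percentages = {ctx: round(percent_methylated(ctx), 1) for ctx in methylated}

print("Total C's:", total_cs)
for ctx, pct in percentages.items():
    print(f"C methylated in {ctx} context: {pct}%")
```

Under that assumption the counts add up to the reported total, and the recomputed values agree with the 77.0%, 0.5%, and 0.5% in the report, so the report appears internally consistent.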
Is my approach correct?