Use your standard DNA variant caller for SNP calling in Bisulfite-Seq data
Until now, specialised tools have been required to call DNA variants from bisulfite sequencing data. Experts in the field are often frustrated by the output and performance of the few available variant callers for BS-Seq data, to the extent that they tend to sequence an additional whole-genome DNA-Seq run just to be able to use established, trusted tools like GATK (McKenna et al., 2010) or Freebayes (Garrison & Marth, 2012). This adds considerable cost and labor.
We present a simple bioinformatic pre-processing procedure for mapped BS-Seq data, which makes the alignments usable with all types of DNA-Seq variant callers. This interoperability also produces surprisingly good results with regard to overall quality and sensitivity. We show that the performance of GATK and Freebayes on BS-Seq data pre-processed by our tool is comparable to their performance on real DNA-Seq data, and markedly better than that of all tested BS-Seq variant callers. Our approach lets researchers call DNA variants without additional sequencing costs or time-consuming analyses, while using state-of-the-art methods that are continuously optimised through extensive uptake in the community.
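To give a feel for the idea named in the paper's title (manipulating base quality scores), here is a minimal sketch in Python. It is not the published tool and makes simplifying assumptions: reads are already aligned without gaps, the originating strand of each read is known, and every mismatch that bisulfite conversion alone could explain (reference C read as T on the top strand, reference G read as A on the bottom strand) simply gets base quality 0, so that a conventional caller would give it no weight. The function name `mask_conversion_qualities` and the strand labels are hypothetical.

```python
from typing import List

# (reference base, read base) pairs that bisulfite conversion alone could
# explain, in forward-strand coordinates. Assumption for this sketch:
# top-strand reads show C>T, bottom-strand reads show G>A.
AMBIGUOUS = {
    "top": ("C", "T"),
    "bottom": ("G", "A"),
}


def mask_conversion_qualities(
    ref: str, read: str, quals: List[int], strand: str
) -> List[int]:
    """Return a copy of quals with conversion-ambiguous bases set to 0.

    ref and read are gap-free, equal-length strings covering the same
    positions; strand is "top" or "bottom" (hypothetical labels).
    """
    if not (len(ref) == len(read) == len(quals)):
        raise ValueError("ref, read and quals must have the same length")
    ref_base, read_base = AMBIGUOUS[strand]
    return [
        0 if (r.upper() == ref_base and b.upper() == read_base) else q
        for r, b, q in zip(ref, read, quals)
    ]


# Toy alignment: positions 1 and 4 are C>T, position 5 is G>A.
ref_seq = "ACGTCG"
read_seq = "ATGTTA"
base_quals = [30] * 6

top_masked = mask_conversion_qualities(ref_seq, read_seq, base_quals, "top")
bottom_masked = mask_conversion_qualities(ref_seq, read_seq, base_quals, "bottom")
```

In this toy case a top-strand read loses the quality of its two C>T positions, while the same bases on a bottom-strand read keep full quality and can still support a genuine C>T variant. A real workflow would read and write BAM records and handle CIGAR operations, which this sketch leaves out.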
Take a look at the simplicity of the method! It is straightforward to understand even for those outside the field, and it showcases how a small change in perspective can lead to novel and unexpected solutions.
Nunn A, Otto C, Fasold M, Stadler PF, Langenberger D: 'Manipulating base quality scores enables variant calling from bisulfite sequencing alignments using conventional bayesian approaches', BMC Genomics 23, Article number: 477 (2022)