Hello,
I'm analyzing bulk WGBS data from the ENCODE project.
I would like to know how researchers typically analyze WGBS data.
Across my 20 samples, only about 30% of all CpG sites have coverage > 10 in more than 70% of the samples. So if I focus on a single region, the methylation information for that region contains many missing values.
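To make the filter concrete, here is a minimal sketch of the check described above, assuming the per-site read coverage has already been loaded into a sites x samples matrix. The function name `well_covered_mask`, its parameters, and the toy numbers are hypothetical and only for illustration; they are not from ENCODE or any particular package.

```python
import numpy as np

def well_covered_mask(coverage, min_cov=10, min_frac=0.7):
    """Flag CpG sites with coverage > min_cov in more than min_frac of samples.

    coverage: array of shape (n_sites, n_samples) holding read depth.
    Both thresholds are strict (">"), matching the description in the post.
    """
    coverage = np.asarray(coverage)
    covered = coverage > min_cov            # boolean, per site and sample
    frac_samples = covered.mean(axis=1)     # fraction of samples covered, per site
    return frac_samples > min_frac

# Toy data: 4 CpG sites x 5 samples (made-up numbers).
coverage = np.array([
    [12, 15, 11, 20, 30],   # covered in 5/5 samples
    [12,  3, 11, 20,  0],   # covered in 3/5 samples
    [ 0,  0, 11,  2,  5],   # covered in 1/5 samples
    [11, 11, 11, 11, 10],   # covered in 4/5 samples
])

mask = well_covered_mask(coverage)
kept_fraction = mask.mean()                 # share of sites passing the filter
print(mask, kept_fraction)
```

With real data, `kept_fraction` would correspond to the roughly 30% figure mentioned above.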
I don't think imputation can be applied, because there are too many missing values. Should I instead compute the average methylation rate for each region of interest and compare those regional averages?
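One possible way to compute such a regional average is sketched below, assuming methylated-read counts and total-read counts are available per CpG site and sample. It pools reads over the sufficiently covered sites in a region rather than averaging per-site rates; that pooling is my assumption, not an established recommendation, and the function name `region_methylation` and the toy counts are hypothetical.

```python
import numpy as np

def region_methylation(meth, cov, min_cov=10):
    """Per-sample methylation rate for one region.

    meth, cov: arrays of shape (n_sites_in_region, n_samples) with
    methylated read counts and total read counts.
    Sites with coverage <= min_cov are ignored for that sample.
    Returns NaN for samples with no usable site in the region.
    """
    meth = np.asarray(meth, dtype=float)
    cov = np.asarray(cov, dtype=float)
    usable = cov > min_cov
    meth_sum = np.where(usable, meth, 0.0).sum(axis=0)
    cov_sum = np.where(usable, cov, 0.0).sum(axis=0)
    rates = np.full(cov_sum.shape, np.nan)
    # Divide only where at least one usable site contributed reads.
    np.divide(meth_sum, cov_sum, out=rates, where=cov_sum > 0)
    return rates

# Toy region: 3 CpG sites x 3 samples (made-up counts).
meth = np.array([[10,  0, 5],
                 [ 6,  2, 0],
                 [ 0, 12, 1]])
cov = np.array([[20,  4, 8],
                [12,  3, 0],
                [ 5, 24, 2]])

rates = region_methylation(meth, cov)
print(rates)
```

Samples with no usable site stay as NaN, so missingness is still visible at the region level instead of being silently filled in.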
An additional question: how are batch effects and blood contamination effects removed?
I'm a beginner, so I would appreciate pointers to resources (papers or websites that show the analysis process).
I would appreciate your help.
Thank you.