Hello! I'm a tech who's starting a Ph.D. in August, so I am very new to bioinformatics. I'm wondering if anyone has some guidance on choosing coverage thresholds when calling DMRs.
I am using DMRcaller on WGBS data. I have 3 biological replicates per condition, which I am pooling and analyzing together.
In the documentation and example code, I notice they set the minimum number of reads per cytosine to 4. Is this number arbitrary, or is there something in the data that tells you what the minimum number of reads should be?
In other methods, I've seen the default set to 10, or 3, and so on, and I can't find a clear answer on where these numbers come from. All I've found is a figure showing that the higher you set the threshold, the fewer DMRs are identified, which is common sense.
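To make the question concrete, here is a minimal sketch of the kind of check I imagine could inform the choice: counting what fraction of cytosines survive each candidate threshold. This is not DMRcaller code. The function name `fraction_retained` is hypothetical and the read depths are made-up toy values, so it only illustrates the idea.

```python
def fraction_retained(read_counts, thresholds):
    """Return {threshold: fraction of cytosines with at least that many reads}."""
    total = len(read_counts)
    if total == 0:
        raise ValueError("read_counts must not be empty")
    return {
        t: sum(1 for count in read_counts if count >= t) / total
        for t in thresholds
    }


# Hypothetical per-cytosine read depths (toy values, not real WGBS data).
toy_counts = [0, 1, 2, 3, 3, 4, 5, 6, 8, 10, 12, 15]

# Candidate thresholds taken from the defaults mentioned above.
retained = fraction_retained(toy_counts, thresholds=[3, 4, 10])

for threshold, fraction in retained.items():
    print(f"min reads = {threshold}: {fraction:.2f} of cytosines kept")
```

On real data I would presumably run this on the pooled per-cytosine coverage, but I don't know whether the fraction retained is the right criterion or just one of several.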
Thanks in advance for any advice!