I am having problems with the broad mark H3K27me3, my biological replicates and identifying differential enrichment between treatment groups
I have performed H3K27me3 ChIP-seq in honeybee on three treatment groups; each treatment has two biological replicates and one input. The reads from each sample were aligned to the reference genome using bowtie, with 70-85% of reads mapping, which leaves between 20M and 50M mapped reads per sample. I have tried a number of peak callers, including diffReps, PeakSeq, CLC Genomics, MACS and MACS2. MACS2 seems to be the only one that can really deal with broad marks. I have had to keep duplicate reads in the analysis because when they are removed I get no peaks.

The peak sets I get from MACS2 differ vastly in number of peaks, both between biological replicates and between treatment groups. The differences between treatment groups could well be biologically relevant, but I am worried about how to deal with the differences between biological replicates. I have seen that people combine their replicates, i.e. concatenate or merge the files before the analysis, but when I do this the result seems to be biased towards one of the replicates.
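To put a number on how much two replicates disagree, one option is to compare the base pairs their peak sets cover rather than the raw peak counts, since broad domains are often split into different numbers of fragments in different samples. The following is only a minimal sketch: it assumes peaks are available as BED-like (chrom, start, end) tuples with half-open coordinates, and the function names and example intervals are hypothetical, not part of MACS2 or taken from my data.

```python
from collections import defaultdict


def merge_intervals(peaks):
    """Merge overlapping or touching (chrom, start, end) intervals per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Extends (or touches) the previous interval: grow it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = out
    return merged


def covered_bp(merged):
    """Total number of base pairs covered by a merged interval set."""
    return sum(end - start for ivs in merged.values() for start, end in ivs)


def shared_bp(a, b):
    """Base pairs covered by both merged sets, using a two-pointer sweep."""
    total = 0
    for chrom in a.keys() & b.keys():
        x, y = a[chrom], b[chrom]
        i = j = 0
        while i < len(x) and j < len(y):
            lo = max(x[i][0], y[j][0])
            hi = min(x[i][1], y[j][1])
            if lo < hi:
                total += hi - lo
            # Advance whichever interval finishes first.
            if x[i][1] < y[j][1]:
                i += 1
            else:
                j += 1
    return total


def jaccard(peaks1, peaks2):
    """Base-pair Jaccard index: shared bp / bp covered by either set."""
    a, b = merge_intervals(peaks1), merge_intervals(peaks2)
    inter = shared_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 0.0


# Hypothetical peak sets for two replicates (made-up coordinates).
rep1 = [("chr1", 100, 500), ("chr1", 400, 900), ("chr2", 0, 1000)]
rep2 = [("chr1", 300, 700), ("chr2", 500, 1500)]
print(f"base-pair Jaccard: {jaccard(rep1, rep2):.3f}")
```

A value near 1 would mean the replicates mark largely the same regions even if the peak counts differ; a value near 0 would mean they genuinely disagree about where the enrichment is.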