Hi, I'm dealing with some ChIP-seq data that is quite noisy. MACS finds fewer than 1,000 peaks. I checked the cross-correlation: the NSC is 1.02 and the RSC is 0.22. Based on the ENCODE recommendations, these are "bad".
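For anyone checking the numbers, here is a minimal sketch of how NSC and RSC are usually defined from the strand cross-correlation profile: NSC is the correlation at the fragment-length shift divided by the minimum correlation, and RSC is the fragment-length correlation minus the minimum, divided by the read-length ("phantom") correlation minus the minimum. The function name `nsc_rsc` and the three input values are hypothetical, picked only so that they roughly reproduce the NSC and RSC reported above; they are not my real cross-correlation values.

```python
def nsc_rsc(cc_fragment, cc_phantom, cc_min):
    """Return (NSC, RSC) from three strand cross-correlation values.

    cc_fragment: correlation at the estimated fragment-length shift
    cc_phantom:  correlation at the read-length shift (the phantom peak)
    cc_min:      minimum correlation over the scanned shift range
    """
    if cc_min <= 0:
        raise ValueError("cc_min must be positive to compute NSC")
    if cc_phantom <= cc_min:
        raise ValueError("cc_phantom must exceed cc_min to compute RSC")
    nsc = cc_fragment / cc_min
    rsc = (cc_fragment - cc_min) / (cc_phantom - cc_min)
    return nsc, rsc


# Hypothetical values for illustration only (not taken from real data).
nsc, rsc = nsc_rsc(cc_fragment=0.102, cc_phantom=0.10909, cc_min=0.100)
print(f"NSC = {nsc:.2f}, RSC = {rsc:.2f}")
```

An RSC below 1 simply means the phantom peak at the read length is taller than the peak at the fragment length, which is the situation I describe below.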
I'm wondering whether it's possible to remove the reads that make the "read length" correlation higher than the "fragment length" correlation, in order to reduce the phantom peak.
Btw, I also checked the non-duplicated read fraction (0.82, which is good) and the fraction of mapped reads in peaks (7%, which passes ENCODE's 1% threshold, though that doesn't necessarily mean the data is good).
I looked at the bigWig signal in a genome browser. It does look quite noisy. Even the peaks MACS called don't appear to be real peaks.
Is there any way to rescue this kind of data bioinformatically?
Thanks for any suggestions.