
**Heisman** · 08-06-2013, 09:51 PM

What is your duplication rate?

Even if you call no peaks, you can correlate overall signal as described in the link provided here: https://groups.google.com/forum/#!to...t/AO6mldNxIQI/

I would try that and see if your replicates give higher correlations than your non-replicates.
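For anyone who wants to see what "correlating overall signal" amounts to, here is a minimal sketch. It is not the wigCorrelation tool itself: it assumes the signal is available as bedGraph-like `(start, end, value)` intervals on a single chromosome, sums it into fixed-width bins, and computes a Pearson correlation. The function names, the bin size, and the toy tracks are all made up for illustration.

```python
import math

def bin_signal(intervals, bin_size, chrom_length):
    """Sum per-base signal into fixed-width bins.

    intervals: iterable of (start, end, value), 0-based half-open.
    """
    n_bins = (chrom_length + bin_size - 1) // bin_size
    bins = [0.0] * n_bins
    for start, end, value in intervals:
        pos = start
        while pos < end:
            b = pos // bin_size
            stop = min(end, (b + 1) * bin_size)  # do not cross the bin edge
            bins[b] += value * (stop - pos)
            pos = stop
    return bins

def pearson(x, y):
    """Pearson correlation of two equal-length lists (nan if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

# Toy tracks (hypothetical values): two samples enriched in the same bin.
rep1 = [(0, 100, 1.0), (100, 200, 5.0), (200, 300, 1.0)]
rep2 = [(0, 100, 2.0), (100, 200, 9.0), (200, 300, 2.0)]
r = pearson(bin_signal(rep1, 100, 300), bin_signal(rep2, 100, 300))
print(round(r, 3))
```

On real data you would run this per chromosome (or genome-wide) for every pair of samples and compare the replicate pairs against the non-replicate pairs.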

**leaskimo** · 08-11-2013, 04:09 PM

wigCorrelation results

Hey

To answer your first question: when the duplicates are left out of the analysis, MACS reports a redundant rate as high as 0.42 in my treatments. With --keep-dup 5, the redundant rate drops to 0.05.

I have performed the wigCorrelation, which may have thrown a massive spanner into the works.

Some of my samples correlate much better with non-replicates (marked with *) than with their own replicates.

correlation between replicates
W1 vs W2 = -0.007
A1 vs A2 = 0.314
Q2 vs Q3 = 0.906

correlation between non replicates
W1 vs Q2 = -0.319
W1 vs Q3 = -0.282
W1 vs A1 = 0.082 *
W1 vs A2 = 0.187 *
W2 vs Q2 = 0.642 *
W2 vs Q3 = 0.625 *
W2 vs A1 = 0.602 *
W2 vs A2 = 0.195 *
A1 vs Q2 = 0.603 *
A1 vs Q3 = 0.584 *
A2 vs Q2 = 0.068
A2 vs Q3 = 0.053

It is obvious that I can't combine my replicates now, but where to from here?

Thanks
Megan
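The asterisks above can be reproduced mechanically. This is a small sketch under one assumption about the flagging rule, which the post does not spell out: a non-replicate pair is flagged when its correlation is higher than the replicate correlation of at least one of the two samples involved. The numbers are copied from the lists above; the function name is made up.

```python
# Correlations copied from the post above.
replicate_r = {("W1", "W2"): -0.007, ("A1", "A2"): 0.314, ("Q2", "Q3"): 0.906}
non_replicate_r = {
    ("W1", "Q2"): -0.319, ("W1", "Q3"): -0.282,
    ("W1", "A1"): 0.082, ("W1", "A2"): 0.187,
    ("W2", "Q2"): 0.642, ("W2", "Q3"): 0.625,
    ("W2", "A1"): 0.602, ("W2", "A2"): 0.195,
    ("A1", "Q2"): 0.603, ("A1", "Q3"): 0.584,
    ("A2", "Q2"): 0.068, ("A2", "Q3"): 0.053,
}

def flag_pairs(replicate_r, non_replicate_r):
    """Return non-replicate pairs that beat either sample's own replicate r.

    The rule is an assumption made for this sketch, not a standard test.
    """
    own = {}
    for (a, b), r in replicate_r.items():
        own[a] = r
        own[b] = r
    return [
        (a, b, r)
        for (a, b), r in non_replicate_r.items()
        if r > own[a] or r > own[b]
    ]

flagged = flag_pairs(replicate_r, non_replicate_r)
for a, b, r in flagged:
    print(f"{a} vs {b} = {r} *")
```

Under that rule the flagged pairs are exactly the eight marked with * in the post, and every one of them involves W1, W2 or A1, which is where I would start looking.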

**Heisman** · 08-11-2013, 06:41 PM

I think your next step is to try to figure out what's going on. I'd start with W1 and W2 as they seem to be horribly correlated but should be biological replicates.

So I would do a few things with those two in particular. First, make a list of metrics for each: total reads, total aligned reads, duplication rate, etc. I don't know if each was sequenced on one lane or multiple; regardless, I would run all of the raw reads through FastQC and see if that shows anything. I don't know if they are single or paired end, but see if MACS2 reported a similar d value for each (and look at the Bioanalyzer run for each of them to see if the libraries had similar fragment size distributions). Also look at some of the peak regions in a viewer such as IGV; do they look similar at all between the two samples? Check some of the highest-scoring peak regions as well as a broader view.
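For the duplication-rate metric, here is a rough sketch of how a redundant rate could be computed from aligned reads, and why it drops when more duplicates are kept. It assumes the rate is the fraction of reads discarded when at most `keep_dup` reads are retained per (chromosome, strand, 5' position); I have not checked this against the MACS source, so treat the definition, the function name, and the toy reads as illustrative only.

```python
from collections import Counter

def redundant_rate(reads, keep_dup=1):
    """Fraction of reads discarded when keeping at most keep_dup per position.

    reads: iterable of (chrom, strand, five_prime_pos) tuples.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    kept = sum(min(c, keep_dup) for c in counts.values())
    return (total - kept) / total

# Toy example (hypothetical reads): one heavily duplicated position.
reads = (
    [("chr1", "+", 100)] * 6
    + [("chr1", "+", 200)] * 2
    + [("chr1", "-", 300)] * 2
)
print(redundant_rate(reads, keep_dup=1))
print(redundant_rate(reads, keep_dup=5))
```

Comparing this number between W1 and W2 at the same `keep_dup` setting is more informative than the absolute value, since the rate also depends on sequencing depth.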

**apredeus** · 10-01-2013, 12:18 PM

Originally posted by leaskimo View Post

I am having problems with the broad mark H3K27me3, my biological replicates and identifying differential enrichment between treatment groups

In my opinion, histone marks like H3K27me3, H3K36me3 are just too broad for MACS2 to effectively capture.

I have tried many (about 10) different peak callers, and I think SICER really stands out (in a good way) in how it performs. It seems to effectively capture both small and large gaps in signal, and unifies peaks where they need to be unified. So far it's by far the best broad peak caller I've tried.

**gene_x** · 10-07-2013, 08:42 PM

Originally posted by apredeus View Post

In my opinion, histone marks like H3K27me3, H3K36me3 are just too broad for MACS2 to effectively capture.

I have tried many (about 10) different peak callers, and I think SICER really stands out (in a good way) in how it performs. It seems to effectively capture both small and large gaps in signal, and unifies peaks where they need to be unified. So far it's by far the best broad peak caller I've tried.

Which peak callers have you tried? What metrics did you use to evaluate their performance?

**apredeus** · 10-07-2013, 09:06 PM

I've tried MACS, MACS2, SICER, SISSR, Rseg, BroadPeak, HotSpot, and I really can't remember what else. I've also experimented with settings on those peak callers quite a bit, especially on MACS2, SICER and Rseg.

As for the metrics, I've found that simple visual inspection of the TDF files for the ChIP-seq and input, together with the BED file of the called peaks, makes it very obvious. I'll try to look for the screenshots I've made, but I'm not sure I'll be able to find them.

At any rate, if anyone has an opinion different from mine, I'd love to hear it.

**harryzs** · 10-08-2013, 06:58 AM

Just a reminder: if you can wait two months, you will find out how the ENCODE people (Anshul) call broad peaks stably.

https://groups.google.com/forum/#!to...nt/yG8M8Sx_eTM

**Wallysb01** · 10-08-2013, 08:42 AM

I second SICER for histone marks. MACS2 is the right pick for transcription factor ChIP. The wigCorrelation is still concerning though.

One thing you should be sure to check is your input read distribution for both W1 and W2. It kind of looks like one of those replicates may not have worked at all, as you would expect a near-zero correlation between any successful ChIP-seq and basically nothing.

Also, correlation between different treatments could be caused by input bias or sequencing bias. So, if you had a poor batch of crosslinking or maybe library prep wasn't so good, and certain groups all went through those steps together, that may explain W2 being more highly related to Q2, Q3 and A1.

So you might group your samples by date processed through the various steps and see if that explains anything?

**apredeus** · 10-08-2013, 10:42 AM

Originally posted by harryzs View Post

Just a reminder: if you can wait two months, you will find out how the ENCODE people (Anshul) call broad peaks stably.

https://groups.google.com/forum/#!to...nt/yG8M8Sx_eTM

Sweet, thanks for the reminder. I should re-run some of the peak calling I've done in the past and post some screenshots here, should be fun. But maybe I'll wait until they publish their findings and/or recommended software and settings.

**harryzs** · 10-08-2013, 11:09 AM

Originally posted by apredeus View Post

In my opinion, histone marks like H3K27me3, H3K36me3 are just too broad for MACS2 to effectively capture.

I have tried many (about 10) different peak callers, and I think SICER really stands out (in a good way) in how it performs. It seems to effectively capture both small and large gaps in signal, and unifies peaks where they need to be unified. So far it's by far the best broad peak caller I've tried.

May I ask a question: for H3K27me3 (human/mouse), how many reads (what depth) do we need to get "good" results, in your experience?

**apredeus** · 10-08-2013, 11:35 AM

It really depends on the quality of the ChIP-seq experiment, i.e. the signal-to-noise ratio. As a general rule, I think ENCODE recommends a higher number of reads for "broad" marks (20M or so). This, however, would not save you at all if your library is bad and has a lot of noise. So I would say 10M aligned unique reads is the lowest you want to go.

As an example of an amazingly clean library I can give this sample: GSE38046 (GSM932947 - GSM932951) from the laboratory of M. Busslinger. It has about 23M reads with pretty low duplicate rates (in ChIP-seq analysis, I always turn on filtering of identical reads; both MACS and SICER do it by default). In general, the quality of their ChIP-seq data is astounding, the best I've ever seen. Those guys are surely doing something right.

The same experiment done by C. Murre (GSM987809) also displays a pretty good signal-to-noise ratio and correlates with the Busslinger lab ChIP-seq very well. That sample adds up to 16M aligned reads.

**harryzs** · 10-08-2013, 12:10 PM

Originally posted by apredeus View Post

It really depends on the quality of the ChIP-seq experiment, i.e. the signal-to-noise ratio. As a general rule, I think ENCODE recommends a higher number of reads for "broad" marks (20M or so). This, however, would not save you at all if your library is bad and has a lot of noise. So I would say 10M aligned unique reads is the lowest you want to go.

As an example of an amazingly clean library I can give this sample: GSE38046 (GSM932947 - GSM932951) from the laboratory of M. Busslinger. It has about 23M reads with pretty low duplicate rates (in ChIP-seq analysis, I always turn on filtering of identical reads; both MACS and SICER do it by default). In general, the quality of their ChIP-seq data is astounding, the best I've ever seen. Those guys are surely doing something right.

The same experiment done by C. Murre (GSM987809) also displays a pretty good signal-to-noise ratio and correlates with the Busslinger lab ChIP-seq very well. That sample adds up to 16M aligned reads.

Great. Thank you very much for sharing.


MACS2 ChIP-SEQ ANALYSIS WITH BIOLOGICAL REPLICATES
