Does anyone simulate a ChIP-Seq experiment prior to actually running the experiment?

This is coming up because our collaborators want to simulate the experiment to optimize the parameters (e.g. read length, depth of coverage, etc.) and make sure they are doing things correctly prior to sequencing. It makes sense, considering we have several options for read length, among other things. So my question goes out to everyone: do you perform a simulation of a ChIP-Seq experiment prior to running it?

We would like to do so for the following reasons. We are performing this experiment on an organism with a genome of about 35 Mb, and the signal in our initial experiment is low, but we have a LOT of coverage, probably more than necessary.
1) What is the optimal read length to sequence to? Is 35, 50, 75, or 100bp enough? If 50 or 75bp is enough to uniquely place 90% of the reads and 100bp doesn't give us any added benefit, then we would prefer to sequence only up to 75bp and save money.
2) How much starting material do we need? Is this really something we can answer prior to sequencing? We have enough based on our core facility's recommendations. Is that good enough?
3) How much coverage are we going to need? We expect this would dictate how much starting material we need and how many lanes to run.
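For question 3, the back-of-the-envelope arithmetic I have in mind is just average depth = reads x read length / genome size. Here is a minimal sketch of that calculation. The 30x target is a made-up placeholder, not a recommendation, and the function names are my own. Genome-wide average depth is also only a rough proxy for ChIP-Seq, since reads should pile up in enriched regions rather than spread evenly.

```python
def reads_for_coverage(genome_size_bp, read_length_bp, target_coverage):
    """Reads needed for a given average depth, from C = N * L / G."""
    needed_bases = target_coverage * genome_size_bp
    # Integer ceiling division, so we never round down below the target.
    return -(-needed_bases // read_length_bp)


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Average depth obtained from n_reads reads of a fixed length."""
    return n_reads * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 35_000_000  # ~35 Mb genome, as described above
TARGET_COVERAGE = 30         # hypothetical target depth, an assumption

for read_length in (35, 50, 75, 100):
    n_reads = reads_for_coverage(GENOME_SIZE_BP, read_length, TARGET_COVERAGE)
    print(f"{read_length:>3} bp reads: {n_reads:,} reads for {TARGET_COVERAGE}x")
```

Comparing these read counts against the expected yield of a lane would at least tell us whether we are in the right range on lanes, even if it says nothing about starting material.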
Is there anything else, or is all of this just overkill? I've read 3 or 4 papers on modeling ChIP-Seq in silico, but understanding the underlying code is proving difficult, and I'm not convinced it's right for us.
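For question 1, something much simpler than a full ChIP-Seq simulation might already be informative: count what fraction of positions in the reference give a read-length window that occurs only once. Below is a rough sketch of that idea on a toy sequence. Everything in it is an assumption for illustration: the toy genome is random with one planted repeat, only the forward strand and exact matches are considered, and a real aligner with mismatches would behave differently. Holding every window in memory like this may also be impractical for a full 35 Mb genome.

```python
import random
from collections import Counter


def unique_fraction(genome, k):
    """Fraction of k-length windows (forward strand) that occur exactly once."""
    n_windows = len(genome) - k + 1
    if n_windows <= 0:
        raise ValueError("k must not exceed the genome length")
    counts = Counter(genome[i:i + k] for i in range(n_windows))
    # A window start is "uniquely placeable" if its sequence appears once.
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_unique / n_windows


# Toy genome: random bases with one 60 bp segment planted twice (hypothetical).
rng = random.Random(0)
repeat = "".join(rng.choice("ACGT") for _ in range(60))
left = "".join(rng.choice("ACGT") for _ in range(10_000))
right = "".join(rng.choice("ACGT") for _ in range(10_000))
toy_genome = left + repeat + right + repeat

for read_length in (35, 50, 75, 100):
    frac = unique_fraction(toy_genome, read_length)
    print(f"{read_length:>3} bp: {frac:.4f} of windows are unique")
```

If the unique fraction on the real reference flattens out between two read lengths, that would suggest the longer reads are not buying much in terms of placement.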
I just want to make sure we are on the right track and would appreciate hearing what others are doing.