

ChIPSeq: comparing lanes with different number of reads




  • ChIPSeq: comparing lanes with different number of reads

    Hi users!
    I came across a comparison between IP and input samples in which the original IP file has about 25M reads and the input file has 17M reads. Would it make sense to randomly select the same number of reads from the IP sample to match the input, so that peak heights in the IP are not biased simply by higher coverage?



  • #2
    It makes more sense to express your data in RPKM (Reads Per Kilobase per Million mapped reads); that way you're not throwing any data away. That said, this kind of coverage-depth bias is probably the one I worry about least in ChIP-seq; mappability and enrichment are higher priorities for me.
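
    For anyone unsure what the RPKM calculation looks like, here is a minimal sketch. The function name and the read counts per region are made up for illustration, and the 25M/17M totals from the question are assumed to be mapped reads, which may not be the case for the original files.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads.

    Hypothetical helper for illustration: scales a raw count by the
    region length (in kb) and the library size (in millions of reads).
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and library size must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


# Invented counts over the same 2 kb region in both libraries.
ip_value = rpkm(500, 2000, 25_000_000)     # IP library, assumed 25M mapped reads
input_value = rpkm(340, 2000, 17_000_000)  # input library, assumed 17M mapped reads
print(ip_value, input_value)
```

    With these invented numbers the raw counts differ (500 vs 340), but the normalized values come out the same, because the difference is entirely explained by library size. No reads had to be discarded to get there.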


    • #3
      Peak callers take care of the disparity in read numbers, which is present almost all the time, and they normalize for sequencing depth where needed. So don't worry about it if you are running a peak caller.

      Now, if you want to compare the number of reads over the same genomic regions in two different libraries, you should always normalize to RPM or RPKM.
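
      As a rough sketch of that kind of region-by-region comparison using RPM: the region names and counts below are invented, the helper names are hypothetical, and the library sizes are simply the 25M and 17M from the original question, assumed here to be mapped reads.

```python
def rpm(read_count, total_mapped_reads):
    """Reads per million mapped reads for a single region."""
    return read_count * 1e6 / total_mapped_reads


def normalize_regions(counts, total_mapped_reads):
    """Convert a {region: raw count} mapping to {region: RPM}."""
    return {region: rpm(n, total_mapped_reads) for region, n in counts.items()}


# Invented raw counts over the same two regions in each library.
ip_counts = {"chr1:1000-3000": 500, "chr2:5000-7000": 50}
input_counts = {"chr1:1000-3000": 85, "chr2:5000-7000": 34}

ip_rpm = normalize_regions(ip_counts, 25_000_000)        # assumed 25M mapped reads
input_rpm = normalize_regions(input_counts, 17_000_000)  # assumed 17M mapped reads

for region in ip_counts:
    print(region, ip_rpm[region], input_rpm[region])
```

      In this made-up example the second region looks enriched in the IP on raw counts (50 vs 34), but after scaling by library size both libraries give the same RPM, while the first region remains clearly higher in the IP. RPM is enough when the regions being compared are identical in both libraries; RPKM additionally corrects for region length.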