CASIM: ChIP-Seq Normalisation


  • CASIM: ChIP-Seq Normalisation

    I would be interested in discussing normalization strategies for ChIP-seq data across (a large number of) samples. More specifically, how to account for library clonality artifacts, differences in IP efficiency and other ChIP-seq specific experimental sources of bias.

  • #2
    Related Question:

    What is the best way to normalize ChIP-seq data, particularly when working with multiple biological replicates and controls that may have differing numbers of sequenced tags?


    • #3
      Related Question:

      Which is the best method to use when quantitatively comparing different experiments of varying sequencing depths?


      • #4
        I came here this morning to start a very similar thread, so I will bump this one instead, although I admit I am not exactly sure what this subsection of the forum is for. If I need to start a new thread, I can; just please let me know.

        I have multiple ChIP-seq data sets for chromatin modifications that do not so much form peaks as show differential enrichment over specific genomic zones. Because the total number of mapped reads differs between samples, normalizing by the number of mapped reads skews the data in the opposite direction to what the biologists expect. The biologists maintain that the difference in reads per sample exists because there is more binding in one sample. So I need a normalization strategy that does not rely on mapped read counts.

        Since I have no input controls to work with, one strategy I imagine could be interesting is to establish a baseline signal in regions that are not enriched for binding. But I feel I am in a bit of a chicken-and-egg scenario here (I need to know which regions are unenriched before I can normalize, and I need normalized data to call them), and I cannot find a method that explains how to proceed.
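One rough way around the chicken-and-egg problem described above can be sketched as follows. This is only an illustration under stated assumptions, not a published method: the genome is assumed to be already split into fixed-size bins with one read count per bin and per sample, and the bins with the lowest combined signal are simply assumed to be background. The function name `background_scale_factor` and the `quantile` parameter are hypothetical.

```python
def background_scale_factor(ref_counts, sample_counts, quantile=0.5):
    """Factor that scales `sample_counts` to match `ref_counts` in background bins.

    ASSUMPTION: bins whose combined count (ref + sample) lies at or below the
    given quantile of all combined counts are unenriched background.
    """
    if len(ref_counts) != len(sample_counts):
        raise ValueError("both samples must use the same bins")

    pairs = list(zip(ref_counts, sample_counts))
    combined = sorted(r + s for r, s in pairs)
    cutoff = combined[int(quantile * (len(combined) - 1))]

    # Sum reads only over the bins treated as background.
    bg_ref = sum(r for r, s in pairs if r + s <= cutoff)
    bg_sample = sum(s for r, s in pairs if r + s <= cutoff)
    if bg_sample == 0:
        raise ValueError("no background reads in the sample")
    return bg_ref / bg_sample


# Toy data: four background bins and two enriched bins.
ref = [10, 10, 10, 10, 100, 100]
sample = [20, 20, 20, 20, 400, 400]

bg_factor = background_scale_factor(ref, sample)
total_factor = sum(ref) / sum(sample)  # scaling by total mapped reads
normalized = [s * bg_factor for s in sample]
```

In this toy case the sample is sequenced twice as deeply and also has twice the enrichment. Scaling by background bins keeps the enriched bins at twice the reference signal, whereas scaling by total reads would push them below that and make the background look depleted, which is the kind of skew described above.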

        Any help or hints would be greatly appreciated.


        • #5
          >The biologists proclaim that the difference in reads per sample is because in one sample there is more binding.

          So you have different treatments with the same modification and they are saying that some treatments have more binding than others?

          >What I imagine could be an interesting strategy, as I have no input controls to work with, would be to attempt to establish a baseline signal in regions that are not enriched for binding

          What about using regions that are enriched in binding but that are expected to remain consistent across all samples? For example, when we do ChIP-qPCR for some active histone modifications, we normalize to enrichment at the Gapdh promoter, since it has a strong and consistent signal in all our treatments. It would be up to the biologists to identify these positive control sites, and having several would probably be better than just one.
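The positive-control idea above could be turned into per-sample scale factors roughly like this. It is a minimal sketch, assuming read counts have already been tallied at each control site for each sample; the function name, the sample names, and the choice of the median ratio across sites are all illustrative assumptions rather than an established procedure.

```python
from statistics import median


def control_site_scale_factors(counts_by_sample, reference):
    """Return {sample: factor} that aligns each sample to `reference`.

    `counts_by_sample` maps a sample name to its read counts at the same
    ordered list of control sites. The median of the per-site ratios is used
    so that a single misbehaving control site has limited influence.
    """
    ref_counts = counts_by_sample[reference]
    factors = {}
    for name, counts in counts_by_sample.items():
        if len(counts) != len(ref_counts):
            raise ValueError("all samples need counts for the same sites")
        # Skip sites with zero reads in either sample to avoid dividing by zero.
        ratios = [r / c for r, c in zip(ref_counts, counts) if r > 0 and c > 0]
        if not ratios:
            raise ValueError(f"no usable control sites for {name}")
        factors[name] = median(ratios)
    return factors


# Hypothetical counts at three control sites (for example, the Gapdh promoter
# plus two other sites chosen by the biologists).
counts = {
    "untreated": [100, 200, 50],
    "treated": [200, 400, 100],
}
factors = control_site_scale_factors(counts, reference="untreated")
```

Multiplying every bin of a sample by its factor puts the control sites on a common scale, so remaining differences elsewhere can be read relative to those sites.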


          • #6
            Originally posted by biocomputer:
            >So you have different treatments with the same modification and they are saying that some treatments have more binding than others?

            Thanks for replying.

            Yes, we are studying multiple modifications (multiple antibodies) under 2 conditions (treatments), so I need a way to normalize data from the same antibody across conditions in order to get differential binding. From there, I assume I can compare the differential binding between antibodies without further normalization (this is only an assumption, because I am not there yet, so I am not totally sure).

            Originally posted by biocomputer:
            >What about using regions that are enriched in binding but that are expected to remain consistent across all samples? For example, when we do ChIP-qPCR for some active histone modifications we normalize to enrichment at the Gapdh promoter since it has a strong and consistent signal in all our treatments. It'd be up to the biologists to identify these positive controls sites, and probably having several would be better than just one.

            This idea has occurred to us, and we also have RNA-seq data from the same cells with the same treatment. So I guess we can use that to find genes that do not change in expression and then use those loci to define regions for normalization. Is there anything published on this? An R package, possibly?
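Picking unchanged genes from the RNA-seq data, as proposed above, might look something like the sketch below. It assumes depth-normalized expression values (for example TPM) per gene and condition are already available; the function name and the thresholds are arbitrary illustrative choices. Note that stable expression does not guarantee a stable chromatin mark at the locus, so the selected loci would still need checking.

```python
from math import log2


def stable_genes(expr_a, expr_b, max_abs_log2fc=0.2, min_expr=10.0, pseudocount=1.0):
    """Return genes expressed in both conditions with a small fold change.

    `expr_a` and `expr_b` map gene name to normalized expression in each
    condition. Lowly expressed genes are dropped because their fold changes
    are noisy.
    """
    selected = []
    for gene in sorted(expr_a.keys() & expr_b.keys()):
        a, b = expr_a[gene], expr_b[gene]
        if min(a, b) < min_expr:
            continue
        log2fc = log2((b + pseudocount) / (a + pseudocount))
        if abs(log2fc) <= max_abs_log2fc:
            selected.append(gene)
    return selected


# Hypothetical expression values for two conditions.
untreated = {"Gapdh": 1000.0, "Actb": 800.0, "Fos": 50.0, "Low": 2.0}
treated = {"Gapdh": 1050.0, "Actb": 790.0, "Fos": 400.0, "Low": 2.0}

candidates = stable_genes(untreated, treated)
```

The promoters or gene bodies of the returned genes could then serve as candidate control regions for computing ChIP-seq scale factors.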

            In any event, this is a start and we are going to try it now. Thanks again.
