I would be interested in discussing normalization strategies for ChIP-seq data across (a large number of) samples. More specifically, how to account for library clonality artifacts, differences in IP efficiency and other ChIP-seq specific experimental sources of bias.
-
I came here this morning to start a very similar thread, so I will bump this one instead, although I admit I am not exactly sure what this subsection of the forum is for. If I need to start a new thread I can; just please let me know.
I have multiple ChIP-seq data sets for chromatin modifications that do not so much form peaks but instead have differential enrichment over specific genomic zones. But due to the difference in the total number of mapped reads per sample, normalization by number of mapped reads skews the data in the opposite direction of the biologist's expectations. The biologists proclaim that the difference in reads per sample is because in one sample there is more binding. And so I need a method that does not use mapped read counts as a normalization strategy.
What I imagine could be an interesting strategy, as I have no input controls to work with, would be to attempt to establish a baseline signal in regions that are not enriched for binding. But I feel I am in a bit of a chicken-and-egg scenario here (I need the normalization to find the unenriched regions, and the unenriched regions to do the normalization) and cannot find a method that explains how to proceed.
Any help or hints would be greatly appreciated.
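One way the baseline idea could be sketched, purely as an illustration and not an established method: bin the genome, call a bin "background" if it sits at or below a chosen quantile of its own sample's bin counts in every sample, and scale each sample so the background totals match. The function name, the `quantile` parameter, and the toy counts are all hypothetical, and the approach assumes that background reads scale with sequencing depth rather than with binding.

```python
import numpy as np

def background_scale_factors(bin_counts, quantile=0.5):
    """Hypothetical helper: per-sample scale factors from low-signal bins.

    bin_counts: array-like of shape (n_samples, n_bins) with raw read counts.
    A bin counts as background only if it is at or below the given
    per-sample quantile in every sample (an assumption, not a standard rule).
    """
    counts = np.asarray(bin_counts, dtype=float)
    # Per-sample threshold, so deeper libraries get a higher cutoff.
    thresholds = np.quantile(counts, quantile, axis=1, keepdims=True)
    is_background = np.all(counts <= thresholds, axis=0)
    if not is_background.any():
        raise ValueError("no bin is background in every sample")
    bg_totals = counts[:, is_background].sum(axis=1)
    if np.any(bg_totals == 0):
        raise ValueError("a sample has zero background reads")
    # Scale every sample to the mean background total.
    return bg_totals.mean() / bg_totals

# Toy data: 8 background bins and 2 enriched bins per sample.
sample_a = [10] * 8 + [100] * 2
sample_b = [20] * 8 + [1000] * 2   # deeper library AND more binding
factors = background_scale_factors([sample_a, sample_b])
scaled = np.array([sample_a, sample_b], dtype=float) * factors[:, None]
```

In this toy case the background bins end up equal across the two samples while the enriched bins stay different, whereas dividing by total mapped reads would shrink the apparent gain in the second sample. The chicken-and-egg problem is only sidestepped, not solved: the result depends on the quantile chosen and on background bins really being free of signal.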
-
>The biologists proclaim that the difference in reads per sample is because in one sample there is more binding.
So you have different treatments with the same modification and they are saying that some treatments have more binding than others?
>What I imagine could be an interesting strategy, as I have no input controls to work with, would be to attempt to establish a baseline signal in regions that are not enriched for binding
What about using regions that are enriched in binding but that are expected to remain consistent across all samples? For example, when we do ChIP-qPCR for some active histone modifications we normalize to enrichment at the Gapdh promoter, since it has a strong and consistent signal in all our treatments. It'd be up to the biologists to identify these positive control sites, and having several would probably be better than just one.
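A minimal sketch of how several control sites could be turned into per-sample scale factors, assuming you already have raw read counts over each control region. It borrows the median-of-ratios idea used for RNA-seq size factors and restricts it to the control regions; the function name and the example counts are made up for illustration.

```python
import numpy as np

def control_scale_factors(control_counts):
    """Hypothetical helper: scale factors from positive control regions.

    control_counts: array-like of shape (n_samples, n_control_regions)
    holding raw read counts over sites assumed to be invariant.
    """
    counts = np.asarray(control_counts, dtype=float)
    if np.any(counts <= 0):
        raise ValueError("every control region needs reads in every sample")
    log_counts = np.log(counts)
    # Reference per region: geometric mean across samples (in log space).
    reference = log_counts.mean(axis=0)
    # Size factor per sample: median ratio to the reference across regions,
    # so one misbehaving control site does not dominate.
    size_factors = np.exp(np.median(log_counts - reference, axis=1))
    return 1.0 / size_factors

# Made-up counts at three control sites for two treatments.
control = np.array([[100, 200, 50],
                    [200, 400, 100]])
factors = control_scale_factors(control)
normalized = control * factors[:, None]
```

The same factors would then be applied to the counts over the zones of interest. Using the median over several sites is the reason to ask for more than one control: with a single site, any real change at that site goes straight into the normalization.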
-
Originally posted by biocomputer:
>So you have different treatments with the same modification and they are saying that some treatments have more binding than others?
Yes, we are studying multiple modifications (multiple antibodies) and have 2 conditions (treatments), so I need a way to normalize data from the same antibody in different conditions to get differential binding. From there, I assume that I can compare the differential binding between different antibodies without further normalization (this is only an assumption, since I am not there yet, so I am not totally sure).
Originally posted by biocomputer:
>What about using regions that are enriched in binding but that are expected to remain consistent across all samples? For example, when we do ChIP-qPCR for some active histone modifications we normalize to enrichment at the Gapdh promoter since it has a strong and consistent signal in all our treatments. It'd be up to the biologists to identify these positive controls sites, and probably having several would be better than just one.
In any event, this is a start and we are going to try it now. Thanks again.