Does anyone know the syntax for taking the log of a column in Galaxy?
Also, does anyone have an idea for comparing the distributions of two chromatin-binding proteins with a correlation function? In particular:
1) what is the highest correlation you've ever seen for two proteins that should be binding together?
2) what is the best way to ask this question?
I also have another question, about the validity of an analysis method that I made up (maybe others already do this, I don't know).
To get to the point: I want to ask how often the peaks of two datasets overlap. First, I call peaks in each dataset using MACS. Then I compute the base coverage of each peak set and the fraction of the genome each one covers, which lets me estimate the overlap expected if the two datasets were placed completely at random (the product of the two fractions times the genome size). Then I take the intersection of the two peak sets and compute its base coverage, which is the actual overlap. Finally, I divide the actual overlap by the expected overlap.
I have done this analysis, and most datasets don't overlap any more than expected, but I did find a few interesting correlations.
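To make the overlap calculation concrete, here is a rough Python sketch of how I understand the steps. It is only an illustration under some assumptions: peaks are (chrom, start, end) tuples with half-open coordinates, the function names are made up for this post, and `genome_size` is whatever effective genome size you decide to use. It is not what MACS or Galaxy does internally.

```python
from collections import defaultdict


def merge_intervals(peaks):
    """Merge overlapping or touching (chrom, start, end) peaks per chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        out = [list(ivs[0])]
        for start, end in ivs[1:]:
            if start <= out[-1][1]:
                # Extends the current merged interval.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(iv) for iv in out]
    return merged


def base_coverage(merged):
    """Total number of bases covered by a merged peak set."""
    return sum(end - start for ivs in merged.values() for start, end in ivs)


def intersect(a, b):
    """Intersect two merged peak sets with a two-pointer sweep per chromosome."""
    result = {}
    for chrom in a.keys() & b.keys():
        xs, ys = a[chrom], b[chrom]
        i = j = 0
        out = []
        while i < len(xs) and j < len(ys):
            start = max(xs[i][0], ys[j][0])
            end = min(xs[i][1], ys[j][1])
            if start < end:
                out.append((start, end))
            # Advance whichever interval ends first.
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
        if out:
            result[chrom] = out
    return result


def overlap_enrichment(peaks_a, peaks_b, genome_size):
    """Return (observed, expected, observed / expected) overlap in bases.

    Expected overlap assumes the two peak sets are placed independently
    and uniformly: fraction_a * fraction_b * genome_size.
    """
    a = merge_intervals(peaks_a)
    b = merge_intervals(peaks_b)
    cov_a = base_coverage(a)
    cov_b = base_coverage(b)
    observed = base_coverage(intersect(a, b))
    expected = cov_a * cov_b / genome_size
    ratio = observed / expected if expected else float("nan")
    return observed, expected, ratio


# Toy example with a made-up 1000 bp "genome".
peaks_a = [("chr1", 0, 100), ("chr1", 50, 150)]
peaks_b = [("chr1", 100, 200), ("chr2", 0, 50)]
observed, expected, ratio = overlap_enrichment(peaks_a, peaks_b, 1000)
print(observed, expected, round(ratio, 2))
```

In the toy example each set covers 150 bases, so the expected random overlap is 150 * 150 / 1000 = 22.5 bases, while the observed overlap is 50 bases, a ratio of about 2.2. A ratio near 1 would mean no more overlap than chance under this simple model, which ignores peak clustering and unmappable regions.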