Hi Chipper,
Thanks for clarifying. I'm not sure we're talking about the same thing when it comes to sub-peaks. When you have a good model of the length distribution of the reads, you often see complex regions which don't necessarily switch from forward to reverse - but are made up of distinct "clusters" of reads. FindPeaks handles this without worrying about the forward/reverse orientation of the reads, simply by building in the model of the read lengths. Thus, the "peaks" themselves are probabilities of the number of fragments overlapping at a given point.
For both TF and histone data, you will see clear enrichments at certain locations using these models, because the contribution of any given read is clearly directional, based upon the location and strand of the tag sequenced.
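Since this is easier to show than to describe, here is a minimal sketch of the directional idea: each read is extended from its 5' end in the direction of its strand, and the number of inferred fragments covering each position is counted. This is not the FindPeaks implementation; the function name, the `(position, strand)` input format and the single fixed `fragment_len` are assumptions made for illustration (a real model would use a length distribution rather than one value).

```python
from collections import Counter

def fragment_coverage(reads, fragment_len=200):
    """Count inferred fragments overlapping each position.

    reads: iterable of (five_prime_pos, strand), strand is '+' or '-'.
    fragment_len: assumed fixed fragment length (hypothetical default).
    """
    coverage = Counter()
    for pos, strand in reads:
        if strand == "+":
            # Forward reads extend downstream from their 5' end.
            start, end = pos, pos + fragment_len
        else:
            # Reverse reads extend upstream from their 5' end.
            start, end = pos - fragment_len + 1, pos + 1
        for p in range(max(start, 0), end):
            coverage[p] += 1
    return coverage

# One forward and one reverse read whose inferred fragments overlap.
cov = fragment_coverage([(100, "+"), (400, "-")], fragment_len=200)
```

Positions covered by both extended reads get a height of 2, regardless of whether the reads "switch" from forward to reverse in any particular window.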
This is much easier to draw than to explain in text!
Anyhow, you could be right that seqing is looking only for areas of enrichment, but a good peak finder should handle those areas just as well as those with clear TF-like enrichment.
Cheers!
-
Hi,
Sorry if I was not clear enough. If FindPeaks identifies subpeaks, it will (or at least should) have opposing reads, otherwise it is not going to form a subpeak, so what SISSR does is more or less the same. But it does not require a window (was it 20 bp?) to have reads from both strands, just that you go from + to -; there could be readless windows in between. Whether it is a good method or not is another story.
My interpretation of the histone question was that he wanted to find regions that are enriched, not the histone positions. But that may be totally wrong. Anyway, if each histone gives only a few reads, or if the nucleosomes are not well positioned, there is really not much valuable information to throw away. If you, for example, take the average read density over the gene body, it can still be significantly enriched.
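To illustrate the "+ to -" idea only, here is a deliberately simplified sketch. It is not SISSR's actual algorithm (which, as I understand it, works on windowed tag counts); the function name and input format are assumptions. It just reports the midpoint wherever a forward read is directly followed by a reverse read in position order, however large the readless gap between them.

```python
def strand_transitions(reads):
    """Return midpoints where a '+' read is directly followed by a '-' read.

    reads: iterable of (position, strand), strand is '+' or '-'.
    """
    ordered = sorted(reads)
    sites = []
    for (pos_a, strand_a), (pos_b, strand_b) in zip(ordered, ordered[1:]):
        if strand_a == "+" and strand_b == "-":
            sites.append((pos_a + pos_b) // 2)
    return sites

reads = [(100, "+"), (120, "+"), (260, "-"), (280, "-"), (1000, "-"), (1100, "+")]
sites = strand_transitions(reads)
```

In this toy input the only transition is between the reads at 120 and 260; the later "-" to "+" switch is ignored.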
-
Hi Chipper
SISSR does do "subpeaks", in a sense, however it's based entirely on finding areas bracketed by reads facing opposite directions. From personal experience (we had implemented a version of this in FindPeaks at one point), it isn't a particularly reliable method, as peaks which appear in low-sequenceability regions will disappear completely (whether they're real or not is a different story), and small peaks don't always have reads in both directions even when they are real.
In any case, as for the windows, I can't think of a valid reason for using them - you'd lose resolution, and a large window would give you "blurs" instead of positions for nucleosomes, where they're available. You'd be throwing away a lot of valuable information, while peak finders will still find the blurry regions just as well as a windowed method.
-
SISSR's way of identifying peak locations makes it unnecessary to search for subpeaks, since it does not cluster reads into peaks in the first place. But I agree that you have to use the control carefully - otherwise you may end up filtering away a large proportion of your true positives.
Seqing, did I understand you correctly that you are studying histone marks like k27me3 or k36, where you would expect large regions to be enriched but with relatively few reads obtained per histone? Then I guess you would be better off trying a window-based scanning method using large windows, as opposed to identifying peaks from individual nucleosomes, which is what findpeaks/SISSRs/MACS will do.
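A window-based scan along these lines could be sketched roughly as follows. This is only an illustration for one chromosome, not any published tool's method: the function name, the 5 kb window, the ratio cutoff and the pseudocount are all assumed values, and the control is simply rescaled to the sample's read depth.

```python
from collections import Counter

def enriched_windows(sample_pos, control_pos, window=5000,
                     min_ratio=2.0, pseudocount=1.0):
    """Return (start, end, ratio) for windows enriched over the control.

    sample_pos, control_pos: read positions on a single chromosome.
    """
    # Scale control counts so the two libraries need not be the same size.
    scale = len(sample_pos) / max(len(control_pos), 1)
    sample = Counter(p // window for p in sample_pos)
    control = Counter(p // window for p in control_pos)
    hits = []
    for w in sorted(sample):
        ratio = (sample[w] + pseudocount) / (control[w] * scale + pseudocount)
        if ratio >= min_ratio:
            hits.append((w * window, (w + 1) * window, ratio))
    return hits

sample_reads = [10, 20, 30, 40, 50, 60, 5500, 11000]
control_reads = [15, 5100, 5200, 5300, 10100, 10200, 10300, 10400]
hits = enriched_windows(sample_reads, control_reads)
```

A simple ratio like this says nothing about statistical significance; it only flags broad regions worth a closer look.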
-
I figure I should respond to the points mentioned here as best as I can.
The "integrated control" feature is coming up soon for FindPeaks. However, I think that this has been WAY overblown. Integrating it into your peak finder itself is a relatively poor solution from many angles. For example, some implementations require that you have identical numbers of reads in both your control and your sample - which is never a great precondition.
With any peak finder, you can get a list of peaks from your control and your sample - it's a simple matter of scripting to compare your peak lists. The trick is then using this information wisely, which I'm not sure any of the peak finders currently do. I've been sketching out ideas for how to improve this for the past couple of days, and finally think I have a winning solution - I just need to find the time to do that, and still write up my thesis proposal. (-;
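As a rough idea of what that scripting might look like, here is a minimal sketch that splits sample peaks by whether they overlap any control peak. The `(chrom, start, end)` tuple format (half-open intervals) and the function names are assumptions, not the output format of any particular peak finder, and the quadratic loop is only suitable for small lists.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def split_by_control(sample_peaks, control_peaks):
    """Return (unique, shared): sample peaks without / with a control overlap."""
    unique, shared = [], []
    for peak in sample_peaks:
        if any(overlaps(peak, c) for c in control_peaks):
            shared.append(peak)
        else:
            unique.append(peak)
    return unique, shared

sample_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
control_peaks = [("chr1", 150, 250), ("chr2", 300, 400)]
unique, shared = split_by_control(sample_peaks, control_peaks)
```

Keeping the shared peaks in a separate list, rather than discarding them, leaves room to compare peak heights before deciding what to filter.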
Anyhow, if you have feature requests like this for findpeaks, feel free to file a request or a bug report for it -- or better yet, write a patch. (-: I do read the bug reports, and try my best to reply to all FindPeaks related email.
For the question of which peak finder should be used for histones - the honest answer is that each peak finder has its strong and weak points. I personally believe that the triangle weighted distribution in FindPeaks is a major advantage over the other peak finders, and that for this application, you'll absolutely require a sub-peak function. Both FindPeaks and MACS are probably your best bets. (The Wold lab and SISSR versions don't do sub-peaks, if I recall correctly, but that may have changed.)
I believe I'm about 2 weeks away from tagging a FindPeaks 3.2 beta release, if all goes smoothly - and hopefully this will address the points above.
-
I saw a couple good presentations by this group, and others who used their tool:
-
QuEST does use a control lane, but I could not interpret it as well as I would like to..
-
Hi Anthony
Do you know which ChIP-seq peak finder works well for widespread histone marks? I am trying MACS but am not getting satisfying results.
Thanks
HS
-
Good to hear you've found a tool you like.... (-:
and good luck with the experiment!
Anthony
-
Thanks Anthony,
I have all the upstream portions and have used a lot of the peak finders - I like yours quite a bit! BED and WIG tracks are fine, and the intersects can give some of the info I'm after. I have data for both histone variants and TFs, totally different applications indeed. I'll be on the lookout for ways to make some plots. Thanks and keep up the good work!
sf
-
Hi Seqfast,
There are TONS of tools out there for doing the first part of this: aligning the reads against the genome. Most of them are designed for ChIP-Seq: FindPeaks, Peakfinder, USeq, MACS.... etc etc etc.
The trick is then interpreting the data they return. We (the people at the BC Genome Sciences Centre) have developed lots of tools for this particular application, but it's not necessarily a straightforward interpretation - it really depends on the signal you're looking at (e.g. transcription factor vs histone modification, etc). I only work on the first part, processing the reads, but there are several people here working full time on interpreting results and writing software that performs the tasks you require. (I just don't think any of the tools have officially been released, though it's in the works, I believe.)
To get started, you might want to pick one of the tools out there for ChIP-Seq, and play with it for a while. It probably won't get you all the way to the results you require, but it probably will get you pretty far.
I'm the author of FindPeaks, so I'm a little biased towards it, but the others are all good too. (-:
Anthony
-
ChIP-Seq reads correlated with/distance to TSS/promoter etc.
Hi all,
I'm interested in producing some of those plots that illustrate the read density at some distance from the transcription start site, or from other known regions/features.
I understand basic intersects and things like this, but running all the read locations against said features is my main goal.
I see a site that can produce these for you (http://www.isrec.isb-sib.ch/chipseq/chip_cor.html), and this works for their resident data, but I can't get their ELAND2SGA tool to produce a file for me, and when I try to make my own SGA format files, things don't go well.
I realize that, given a file with TSS sites or other features, you could write a script that catalogs the average read density in windows of size X at some distance and reports this. However, I lack the requisite programming skills for this. If there's a simple, free tool or something I am missing at UCSC or Galaxy, that's great. I'm comfortable in Linux environments, just not pure programming.
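For reference, the kind of script described above might look roughly like this. It is a minimal sketch for a single chromosome with reads and TSS positions already loaded into Python lists; the function name, the `(position, strand)` TSS format, and the `flank` and `bin_size` defaults are all assumptions. Bins run from upstream to downstream relative to each gene's strand.

```python
from bisect import bisect_left

def tss_profile(read_positions, tss_list, flank=1000, bin_size=100):
    """Average read count per bin around TSSs on one chromosome.

    read_positions: read start positions.
    tss_list: list of (position, strand), strand is '+' or '-'.
    """
    reads = sorted(read_positions)
    n_bins = 2 * flank // bin_size
    if not tss_list:
        return [0.0] * n_bins
    totals = [0] * n_bins
    for tss, strand in tss_list:
        for i in range(n_bins):
            offset = -flank + i * bin_size  # distance from TSS to bin start
            if strand == "+":
                lo, hi = tss + offset, tss + offset + bin_size
            else:
                # Mirror the bin so "upstream" follows the gene's strand.
                lo, hi = tss - offset - bin_size + 1, tss - offset + 1
            # Count reads with lo <= position < hi.
            totals[i] += bisect_left(reads, hi) - bisect_left(reads, lo)
    return [t / len(tss_list) for t in totals]

profile = tss_profile([950, 1000, 1050, 1050, 2950],
                      [(1000, "+"), (3000, "-")],
                      flank=100, bin_size=50)
```

The returned list can be plotted directly against bin offsets; for whole-genome data you would run this per chromosome and combine the totals.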
Thanks!