
**apfejes** · 09-03-2008, 08:44 AM

Hi Seqfast,

There are TONS of tools out there for doing the first part of this: aligning the reads against the genome. Most of them are designed for ChIP-Seq: FindPeaks, Peakfinder, USeq, MACS.... etc etc etc.

The trick is then interpreting the data they return. We (the people at the BC Genome Sciences Centre) have developed lots of tools for this particular application, but it's not necessarily a straightforward interpretation - it really depends on the signal you're looking at (e.g. transcription factor vs histone modification, etc). I only work on the first part, processing the reads, but there are several people here working full time on interpreting results and writing software that performs the tasks you require. (I just don't think any of the tools have officially been released, though it's in the works, I believe.)

To get started, you might want to pick one of the tools out there for ChIP-Seq, and play with it for a while. It probably won't get you all the way to the results you require, but it probably will get you pretty far.

I'm the author of FindPeaks, so I'm a little biased towards it, but the others are all good too. (-:

Anthony

**seqfast** · 09-05-2008, 07:40 AM

Thanks Anthony,

I have all the upstream portions and have used a lot of the peakfinders - I like yours quite a bit! BED and WIG tracks are fine, and the intersects can give some info I'm after. I have data for both histone variants and TF's, totally different applications indeed. I'll be on the lookout for ways to make some plots. Thanks and keep up the good work!

sf

**apfejes** · 09-05-2008, 09:01 AM

Good to hear you've found a tool you like.... (-:

and good luck with the experiment!

Anthony

**seqing** · 10-06-2008, 10:49 PM

Hi Anthony
Do you know which ChIP-seq peak finder works well for widespread histone marks? I am trying MACS but am not getting satisfying results.
Thanks
HS

**seqing** · 10-06-2008, 10:53 PM

One thing that keeps me from trying FindPeaks is that it does not seem to integrate control data to find the peaks...

**seqing** · 10-06-2008, 10:54 PM

it's tough choosing the right peak finder!

**bioinfosm** · 10-07-2008, 06:34 AM

QuEST does use a control lane, but I could not interpret it as well as I would like to..

http://mendel.stanford.edu/SidowLab/downloads/quest/index.html

**ECO** · 10-07-2008, 06:41 AM

I saw a couple good presentations by this group, and others who used their tool:

SiteReorganized - Wold Lab

http://woldlab.caltech.edu/html/chipseq_peak_finder

**apfejes** · 10-07-2008, 12:25 PM

I figure I should respond to the points mentioned here as best as I can.

The "integrated control" feature is coming up soon for FindPeaks. However, I think that this has been WAY overblown. Integrating it into your peak finder itself is a relatively poor solution from many angles. For example, some implementations require that you have identical numbers of reads in both your control and your sample - which is never a great precondition.

With any peak finder, you can get a list of peaks from your control and your sample - it's a simple matter of scripting to compare your peak lists. The trick is then using this information wisely, which I'm not sure any of the peak finders currently do. I've been sketching out ideas for how to improve this for the past couple of days, and finally think I have a winning solution - I just need to find the time to do that, and still write up my thesis proposal. (-;
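
As an illustration of the "simple matter of scripting" mentioned above, here is a minimal sketch that splits sample peaks into those that overlap a control peak and those that do not. The `(chromosome, start, end)` tuple layout, the half-open coordinates, and the function names are assumptions made for this example; they are not the output format of FindPeaks or any other peak finder.

```python
from typing import List, Tuple

# Assumed peak layout: (chromosome, start, end), half-open coordinates.
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def split_by_control(sample: List[Peak], control: List[Peak]):
    """Return (sample-only peaks, sample peaks that overlap a control peak)."""
    unique, shared = [], []
    for peak in sample:
        # Quadratic scan: fine for a sketch, slow for genome-scale peak lists.
        if any(overlaps(peak, c) for c in control):
            shared.append(peak)
        else:
            unique.append(peak)
    return unique, shared


sample_peaks = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 40, 90)]
control_peaks = [("chr1", 180, 260), ("chr2", 300, 400)]
unique, shared = split_by_control(sample_peaks, control_peaks)
print(unique)  # peaks seen only in the sample
print(shared)  # peaks that also appear in the control
```

This only reports the overlap; deciding what to do with the shared peaks (discard, down-weight, or compare heights) is the part that needs care.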

Anyhow, if you have feature requests like this for FindPeaks, feel free to file a request or a bug report for it -- or better yet, write a patch. (-: I do read the bug reports, and try my best to reply to all FindPeaks-related email.

For the question of which peak finder should be used for histones - the honest answer is that each peak finder has its strong and weak points. I personally believe that the triangle weighted distribution in FindPeaks is a major advantage over the other peak finders, and that for this application, you'll absolutely require a sub-peak function. Both FindPeaks and MACS are probably your best bets. (The Wold lab and SISSR versions don't do sub-peaks, if I recall correctly, but that may have changed.)

I believe I'm about 2 weeks away from tagging a FindPeaks 3.2 beta release, if all goes smoothly - and hopefully this will address the points above.

**Chipper** · 10-07-2008, 12:53 PM

SISSRs' way of identifying peak locations makes it unnecessary to search for subpeaks since it does not cluster reads in peaks in the first place. But I agree that you have to use the control carefully - otherwise you may end up filtering away a large proportion of your true positives.

Seqing, did I understand you correctly that you are studying histone marks like k27me3 or k36, where you would expect large regions to be enriched but with relatively few reads obtained per histone? Then I guess you would be better off trying a window-based scanning method using large windows, as opposed to identifying peaks from individual nucleosomes, which is what FindPeaks/SISSRs/MACS will do.
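
A window-based scan of the kind suggested above could look roughly like the sketch below. It assumes reads from a single chromosome given as start positions, fixed non-overlapping windows, and a simple depth-scaled sample/control ratio; the function names, the 5 kb default window, the ratio threshold, and the pseudocount are all illustrative choices, not settings from any published tool.

```python
from collections import Counter


def window_counts(read_starts, window_size):
    """Count reads per fixed-size, non-overlapping window (keyed by window index)."""
    counts = Counter()
    for pos in read_starts:
        counts[pos // window_size] += 1
    return counts


def enriched_windows(sample_starts, control_starts, window_size=5000,
                     min_ratio=2.0, pseudocount=1.0):
    """Return (start, end, ratio) for windows where the sample exceeds the control."""
    sample = window_counts(sample_starts, window_size)
    control = window_counts(control_starts, window_size)
    # Scale the control to the sample's depth so the two lanes
    # do not need to contain the same number of reads.
    scale = len(sample_starts) / max(len(control_starts), 1)
    hits = []
    for w in sorted(sample):
        ratio = (sample[w] + pseudocount) / (control[w] * scale + pseudocount)
        if ratio >= min_ratio:
            hits.append((w * window_size, (w + 1) * window_size, round(ratio, 2)))
    return hits


sample_starts = [100, 900, 1500, 2500, 4000, 4800, 12000]
control_starts = [200, 6000, 7000, 11000, 12500, 13000, 14000]
print(enriched_windows(sample_starts, control_starts))
```

Larger windows pool more reads per test, at the cost of the positional resolution a peak finder would give.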

**apfejes** · 10-07-2008, 01:05 PM

Hi Chipper

SISSR does do "subpeaks", in a sense, however it's based entirely on finding areas bracketed by reads facing opposite directions. From personal experience (we had implemented a version of this in FindPeaks at one point), it isn't a particularly reliable method, as peaks which appear in low-sequenceability regions will disappear completely (whether they're real or not is a different story), and small peaks don't always have reads in both directions even when they are real.

In any case, as for the windows, I can't think of a valid reason for using them - you'd lose resolution, and a large window would give you "blurs" instead of positions for nucleosomes, where they're available. You'd be throwing away a lot of valuable information, while peak finders will still find the blurry regions just as well as a windowed method.

**Chipper** · 10-07-2008, 01:26 PM

Hi,

Sorry if I was not clear enough. If your FindPeaks identifies subpeaks, it will (or at least should) have opposing reads, otherwise it is not going to form a subpeak, so what SISSR will do is more or less the same. But it does not require a window (was it 20 bp?) to have reads from both strands, just that you go from + to -; there could be readless windows in between. Whether it is a good method or not is another story.
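
To make the "go from + to -" idea concrete, here is a deliberately simplified sketch. It is not SISSRs' actual algorithm: it just sorts reads by 5' position and reports the midpoint wherever a forward-strand read is immediately followed by a reverse-strand read within an assumed maximum gap. The `(position, strand)` layout and the `max_gap` parameter are assumptions for the example.

```python
def strand_transitions(reads, max_gap=300):
    """Return candidate sites where a '+' read is directly followed by a '-' read.

    reads: iterable of (five_prime_position, strand) with strand '+' or '-'.
    max_gap: assumed upper bound on the distance between the two reads.
    """
    ordered = sorted(reads)
    sites = []
    for (pos_a, strand_a), (pos_b, strand_b) in zip(ordered, ordered[1:]):
        if strand_a == "+" and strand_b == "-" and pos_b - pos_a <= max_gap:
            # Empty stretches between the two reads are allowed;
            # only the + to - switch matters here.
            sites.append((pos_a + pos_b) // 2)
    return sites


reads = [(100, "+"), (120, "+"), (260, "-"), (280, "-"), (900, "-"), (1500, "+")]
print(strand_transitions(reads))
```

In this toy input the lone reads at 900 and 1500 produce no site, which is the weakness raised earlier in the thread: a small but real peak with reads on only one strand is never reported.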

My interpretation of the histone question was that he wanted to find regions that are enriched, not the histone positions. But that may be totally wrong. Anyway, if each histone gives only a few reads, or if the nucleosomes are not well positioned, there is really not much valuable information to throw away. If you, for example, take the average read density over the gene body, it can still be significantly enriched.
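
The gene-body averaging mentioned above can be sketched as follows. This assumes a single chromosome, read start positions as plain integers, and a gene given as a half-open `(start, end)` interval; the function names are hypothetical, and the result is a plain fold-enrichment over the chromosome-wide density, not a significance test.

```python
from bisect import bisect_left


def reads_per_kb(sorted_starts, start, end):
    """Read density (reads per kilobase) in the half-open interval [start, end)."""
    n = bisect_left(sorted_starts, end) - bisect_left(sorted_starts, start)
    return 1000.0 * n / (end - start)


def gene_body_enrichment(read_starts, gene, chrom_len):
    """Ratio of gene-body read density to the chromosome-wide density."""
    ordered = sorted(read_starts)
    background = reads_per_kb(ordered, 0, chrom_len)
    if background == 0:
        return 0.0  # no reads at all, so nothing can be enriched
    return reads_per_kb(ordered, gene[0], gene[1]) / background


read_starts = [1000, 12000, 15000, 18000, 21000, 24000, 27000, 60000, 90000]
ratio = gene_body_enrichment(read_starts, gene=(10000, 30000), chrom_len=100000)
print(round(ratio, 2))
```

No single position in the toy gene stands out, yet the density over the whole body is several times the background, which is the situation described for poorly positioned nucleosomes.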

**apfejes** · 10-07-2008, 03:50 PM

Hi Chipper,

Thanks for clarifying. I'm not sure we're talking about the same things, when it comes to sub-peaks. When you have a good model of the length distribution of the reads, you often see complex regions which don't necessarily switch from forward to reverse - but are made up of distinct "clusters" of reads. FindPeaks does this without worrying about the forward/reverse orientation of the reads by simply building in the model of the read lengths. Thus, the "peaks" themselves are probabilities of the number of fragments overlapping at a given point.

For both TF and histone, you will see clear enrichments at certain locations, using these models because the contribution of any given read is clearly directional, based upon the location and strand of the tag sequenced.
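
Since this is hard to describe in words, here is a rough sketch of the general idea of strand-aware, length-weighted coverage. It is an assumption-laden illustration, not the FindPeaks implementation: each read contributes full weight up to an assumed minimum fragment length, then a linearly decreasing ("triangle") weight up to an assumed maximum, extending downstream for '+' reads and upstream for '-' reads. The function names and the 100/300 bp defaults are invented for the example.

```python
def triangle_weight(offset, min_len, max_len):
    """Assumed chance that a fragment still covers a base `offset` bp from its 5' end."""
    if offset < min_len:
        return 1.0
    if offset >= max_len:
        return 0.0
    return (max_len - offset) / (max_len - min_len)


def fragment_coverage(reads, region_len, min_len=100, max_len=300):
    """Sum the weighted, directional contribution of every read at every base.

    reads: iterable of (five_prime_position, strand) with strand '+' or '-'.
    """
    coverage = [0.0] * region_len
    for pos, strand in reads:
        step = 1 if strand == "+" else -1  # '+' extends right, '-' extends left
        for offset in range(max_len):
            i = pos + step * offset
            if 0 <= i < region_len:
                coverage[i] += triangle_weight(offset, min_len, max_len)
    return coverage


reads = [(100, "+"), (350, "-")]
coverage = fragment_coverage(reads, region_len=500)
print(coverage[50], coverage[100], coverage[225])
```

The height at each base is then a weighted count of fragments expected to overlap it, so clusters of reads can form separate sub-peaks without requiring an explicit + to - switch.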

This is much easier to draw than to explain in text!

Anyhow, you could be right that seqing is looking only for areas of enrichment, but a good peak finder should handle those areas just as well as those with clear TF-like enrichment.

Cheers!


ChIP-Seq reads correlated/distance to with TSS/promoter etc.
