Good ChIP-seq finder?

  • Good ChIP-seq finder?

    Hi all of you, I was wondering whether there are any peak finders for transcription factors out there that qualify as good under the following criteria:

    * can distinguish between closely adjacent peaks
    * identifies peaks with high spatial resolution
    * high sensitivity and specificity
    * accepts aligned reads in .bed format
    * does not confuse the user with millions of useless options
    * reasonably fast (processes more than 2 million Drosophila reads per minute)
    * makes full use of input sequencing by subtracting background before peak finding
    * free and open source

    Have wasted quite some time testing various peak finders and have been very disappointed by everything I looked at. The ones that I have tested include MACS, FindPeaks and PICS.

    Wrote my own peak finder today that appears to fulfill all of the above by using a simple strand-specific double-window scanning approach on a background-subtracted sample. Appears to work really well and am wondering right now why nobody has done this before.

    Would love to hear your opinion on this.
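A minimal sketch of what a strand-specific double-window scan on background-subtracted data could look like. This is not the poster's code, which was not shared in the thread. Assumptions made for illustration: the reads are already reduced to per-base counts of 5' ends on each strand, forward-strand reads pile up just upstream of a binding site and reverse-strand reads just downstream, and the score at a position is the smaller of the two window sums. The function name `double_window_score` and the parameters `window` and `scale` are hypothetical.

```python
import numpy as np

def double_window_score(fwd, rev, bg_fwd, bg_rev, window=50, scale=1.0):
    """Score each position from two adjacent strand-specific windows.

    fwd, rev       : per-base 5'-end counts of the ChIP sample, by strand
    bg_fwd, bg_rev : the same for the input/background sample
    window         : width of each window in bases (assumed parameter)
    scale          : factor applied to the background before subtraction
    """
    # Subtract the scaled background and clip negative values to zero.
    f = np.clip(np.asarray(fwd, float) - scale * np.asarray(bg_fwd, float), 0, None)
    r = np.clip(np.asarray(rev, float) - scale * np.asarray(bg_rev, float), 0, None)

    # Prefix sums turn every window sum into two array lookups.
    cf = np.concatenate(([0.0], np.cumsum(f)))
    cr = np.concatenate(([0.0], np.cumsum(r)))

    n = len(f)
    pos = np.arange(n)
    left_start = np.maximum(pos - window, 0)
    right_end = np.minimum(pos + window, n)

    left = cf[pos] - cf[left_start]    # forward signal in [pos - window, pos)
    right = cr[right_end] - cr[pos]    # reverse signal in [pos, pos + window)

    # Require support from both strands: keep the smaller of the two sums.
    return np.minimum(left, right)
```

Taking the minimum of the two windows is only one possible way to combine the strands; a product or a sum with a per-strand threshold would be equally plausible readings of the description.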

  • #2
    Hi there!

    Originally posted by steinmann View Post
    Hi all of you, I was wondering whether there are any peak finders for transcription factors out there that qualify as good under the following criteria:

    * can distinguish between closely adjacent peaks
    * identifies peaks with high spatial resolution
    * high sensitivity and specificity
    * accepts aligned reads in .bed format
    * does not confuse the user with millions of useless options
    * reasonably fast (processes more than 2 million Drosophila reads per minute)
    * makes full use of input sequencing by subtracting background before peak finding
    * free and open source

    Have wasted quite some time testing various peak finders and have been very disappointed by everything I looked at. The ones that I have tested include MACS, FindPeaks and PICS.
    Well, you are asking for the perfect software! I believe there's no general solution to your problem. I've tried CisGenome, MACS and FP4 and, depending on the biological problem, I think you'll have to tune your parameters. All the available software relies on different statistics and different assumptions, and each tool may perform better on certain analyses...
    Generally speaking, all of those are able to find TF binding sites in a reliable way... things change when you're looking for histone modifications or, possibly, megabase-wide phenomena.

    Originally posted by steinmann View Post
    Wrote my own peak finder today that appears to fulfill all of the above by using a simple strand-specific double-window scanning approach on a background-subtracted sample. Appears to work really well and am wondering right now why nobody has done this before.

    Would love to hear your opinion on this.
    Well, I'm working on something similar right now :-)

    d

    • #3
      Originally posted by dawe View Post
      Generally speaking, all of those are able to find TF binding sites in a reliable way... things change when you're looking for histone modifications or, possibly, megabase-wide phenomena.
      d
      Have not managed to do proper peak finding with those. MACS simply cannot distinguish between closely adjacent peaks, and I cannot get FP4 to stop missing a whole lot of obvious peaks.

      Originally posted by dawe View Post
      Well, I'm working on something similar right now :-)
      d
      Interesting

      Which language?
      Do you intend to publish a paper on it?
      Will you make it freely available?
      How did you solve the problem of splitting closely adjacent peaks? Similar to FP4?

      Am not exactly sure what would be the best way to identify and separate peaks that are so close that they overlap. Would want the function to be as simple and robust as possible.

      • #4
        Originally posted by steinmann View Post
        Interesting

        Which language?
        Do you intend to publish a paper on it?
        Will you make it freely available?
        How did you solve the problem of splitting closely adjacent peaks? Similar to FP4?
        I'm using Python, especially for the numpy/scipy modules (which are pretty fast). Hopefully there will be a paper; it depends a lot on how it performs on the real data I'm working on :-)
        About the license... well, I've included the BSD license, but there's still no code released.
        About the adjacent peaks... There's no ready solution for that, I'm still thinking about it.
        BTW, FP4 has a couple of options which could help for that (trim and subpeaks), give those a try.

        Originally posted by steinmann View Post
        Am not exactly sure what would be the best way to identify and separate peaks that are so close that they overlap. Would want the function to be as simple and robust as possible.
        Again, that would depend on the biological effect you are studying... There are cases in which two peaks should be considered as part of the same effect (e.g. pH2AX)...

        • #5
          My application would be the identification of transcription factor binding sites. Am aware of the subpeaks function in FP4 and it appears to work fairly well.

          Have now implemented something similar to analyze the enriched regions from the double-window scanning. What I essentially get from my scanning is a merging and smoothing of the double peaks (see attachment for transformation of two closely adjacent peaks). For these regions I then take the first derivative and look for sign changes to identify all possible maxima. The maximum with the highest enrichment score I then define as my first peak. I then test whether there is a valley of a certain depth between the first peak and the second-highest maximum. If this is the case I return both peaks; if not, I try the third-highest maximum, and so on.

          Seems to be reasonably robust and simple, but does not take care of triple peaks.
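
The splitting procedure described above can be sketched roughly as follows. This is an illustrative reading of the post, not the poster's implementation. Assumptions: `score` is the smoothed enrichment score of one enriched region, the valley depth is measured from the lower of the two summits down to the lowest point between them, and `min_valley_depth` is a hypothetical threshold parameter. Flat-topped maxima (plateaus) are not handled.

```python
import numpy as np

def split_peaks(score, min_valley_depth):
    """Return one or two peak positions (indices into score) for a region."""
    score = np.asarray(score, dtype=float)

    # First derivative; a maximum is where it changes sign from + to -.
    d = np.diff(score)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    if maxima.size == 0:
        return []

    # Rank candidate maxima from highest to lowest enrichment score.
    ranked = maxima[np.argsort(score[maxima])[::-1]]
    first = int(ranked[0])

    # Try the second-highest maximum, then the third-highest, and so on.
    for cand in ranked[1:]:
        cand = int(cand)
        lo, hi = sorted((first, cand))
        valley = score[lo:hi + 1].min()
        # Depth is measured from the lower of the two summits (an assumption).
        depth = min(score[first], score[cand]) - valley
        if depth >= min_valley_depth:
            return sorted([first, cand])

    # No candidate is separated by a deep enough valley: report a single peak.
    return [first]
```

For example, `split_peaks([0, 1, 5, 9, 5, 3, 4, 8, 4, 1, 0], min_valley_depth=3)` reports two peaks, while raising the threshold to 6 merges them into one. As in the post, at most two peaks are returned, so triple peaks are not covered.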