  • rebrending
    • May 2008
    • 78

    How to get TF binding sites from ChIP-Seq

    Hello all,

    I have the following question: suppose I have ChIP-Seq data for a protein, and I have called peaks, e.g. with MACS, and obtained, say, 40,000 peaks, each around 1000 bp wide. How do I proceed to determine the precise coordinates of all bound proteins? (By precise I mean a resolution of at least 10 bp.)

    Thanks!
  • mudshark
    Senior Member
    • Jan 2009
    • 138

    #2
    the coordinates of the peak summits (MACS writes a separate file for these, afaik) should give you a fairly good starting point.
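
    As a rough illustration of using the summits file as a starting point, here is a minimal sketch that reads summit positions. It assumes a 5-column BED layout (chromosome, start, end, name, score); the `Summit` type, the `parse_summits` helper and the two example lines are made up for illustration and are not real MACS output.

```python
from typing import Iterable, List, NamedTuple


class Summit(NamedTuple):
    chrom: str
    pos: int        # 0-based BED start of the 1 bp summit interval
    name: str
    score: float    # fifth BED column, whatever the peak caller stores there


def parse_summits(lines: Iterable[str]) -> List[Summit]:
    """Parse BED-like summit lines, skipping blanks and header lines."""
    summits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, _end, name, score = line.split()[:5]
        summits.append(Summit(chrom, int(start), name, float(score)))
    return summits


# Hypothetical lines in the assumed layout (not taken from a real run).
example_lines = [
    "track name=example",
    "chr1\t1000\t1001\tpeak_1\t85.0",
    "chr2\t5000\t5001\tpeak_2\t61.0",
]

for s in parse_summits(example_lines):
    print(s.chrom, s.pos, s.name, s.score)
```

    With a real file, something like `parse_summits(open("NAME_summits.bed"))` should work the same way, provided the column layout matches the assumption above.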


    • rebrending
      • May 2008
      • 78

      #3
      are the summits exactly in the middle between the peak start/end?


      • mudshark
        Senior Member
        • Jan 2009
        • 138

        #4
        that is usually not the case (it depends on the peak caller, though)


        • anurag.gautam
          Member
          • Oct 2010
          • 15

          #5
          Hi
          I mapped my ChIP-seq data to the dm3 reference genome using bowtie. Once I had the BAM file, I used MACS to call peaks and got a little over 15,000 of them, along with a NAME_summits.bed file. What does "summit" mean, and how can I use this file for downstream analysis? Secondly, how can I get confident and enriched peaks from the peaks.xls file from MACS?
          Please let me know your views and suggestions.
          Anurag


          • mudshark
            Senior Member
            • Jan 2009
            • 138

            #6
            Originally posted by anurag.gautam
            Hi
            What does "summit" mean, and how can I use this file for downstream analysis?
            summits are the positions of maximum enrichment within a larger peak area. most likely they map to the exact binding position of your target protein. however, this will depend on whether your target protein binds DNA directly.

            downstream analysis depends on the question you are asking. if you want, e.g., to discover a binding motif for your TF, the summit positions can be used to isolate DNA pieces for motif enrichment analysis (-> MEME)

            Originally posted by anurag.gautam
            Secondly, how can I get confident and enriched peaks from the peaks.xls file from MACS?
            please rephrase the question, as I cannot understand it. sorry.
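
            For the motif-discovery route mentioned in this reply, a minimal sketch of turning summit positions into fixed-width windows might look like the following. The `summit_window` helper, the 50 bp flank and the coordinates are illustrative assumptions, not something MACS or MEME prescribes.

```python
def summit_window(chrom, summit, flank=50, chrom_size=None):
    """Return a BED-style (chrom, start, end) window centred on a summit.

    `summit` is taken to be a 0-based position; the window is half-open
    and is clipped at 0 and, if given, at the chromosome length.
    """
    start = max(0, summit - flank)
    end = summit + flank + 1
    if chrom_size is not None:
        end = min(end, chrom_size)
    return chrom, start, end


# Hypothetical summit positions, for illustration only.
summits = [("chr1", 1000), ("chr2", 20)]

for chrom, pos in summits:
    c, s, e = summit_window(chrom, pos, flank=50)
    print(f"{c}\t{s}\t{e}")
```

            The printed lines are in BED format, so they could be handed to a sequence extraction tool (for example `bedtools getfasta`) and the resulting FASTA passed on to MEME.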


            • anurag.gautam
              Member
              • Oct 2010
              • 15

              #7
              Originally posted by mudshark
              summits are the positions of maximum enrichment within a larger peak area. most likely they map to the exact binding position of your target protein. however, this will depend on whether your target protein binds DNA directly.

              downstream analysis depends on the question you are asking. if you want, e.g., to discover a binding motif for your TF, the summit positions can be used to isolate DNA pieces for motif enrichment analysis (-> MEME)

              please rephrase the question, as I cannot understand it. sorry.

              MACS calls peaks using the p-value and mfold cutoffs you provide. Can I directly use the coordinates of the called peaks to get my DNA sequences for motif analysis, or can I apply additional cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks I am interested in, which would in turn give me confident and enriched peaks?
              I hope that clarifies my question. Let me know.


              • mudshark
                Senior Member
                • Jan 2009
                • 138

                #8
                Originally posted by anurag.gautam
                Can I directly use the coordinates of the called peaks to get my DNA sequences for motif analysis, or can I apply additional cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks I am interested in, which would in turn give me confident and enriched peaks?
                Sure, it can make sense to filter the peaks further, in particular if you have a lot of them (>1000). A straightforward approach would be to look at the top 100 or 200 (ranked by enrichment or p-value).
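
                A minimal sketch of that ranking step, assuming a tab-separated peaks table with the column names shown in the excerpt posted in #9 of this thread (the six rows below are copied from that post). `read_peaks` and `top_peaks` are hypothetical helper names, not part of MACS.

```python
import csv

# Rows copied from the peaks.xls excerpt in post #9 of this thread.
PEAKS_TSV = (
    "chr\tstart\tend\tlength\tsummit\ttags\t-10*log10(pvalue)\tfold_enrichment\n"
    "chr3L\t12092327\t12092827\t501\t294\t212\t1372.6\t41.16\n"
    "chrX\t18215330\t18215683\t354\t249\t147\t947.81\t35.95\n"
    "chrX\t587408\t587798\t391\t234\t171\t1134.01\t35.17\n"
    "chr3L\t3348888\t3349361\t474\t259\t171\t1004.39\t34.13\n"
    "chrX\t8385843\t8386276\t434\t180\t143\t793.24\t33.87\n"
    "chr3R\t2225145\t2225813\t669\t396\t212\t1117.08\t33.43\n"
)


def read_peaks(text):
    """Read a tab-separated peaks table, ignoring blank and '#' lines."""
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.startswith("#")]
    return list(csv.DictReader(lines, delimiter="\t"))


def top_peaks(rows, key="-10*log10(pvalue)", n=100):
    """Return the n rows with the largest value in column `key`."""
    return sorted(rows, key=lambda r: float(r[key]), reverse=True)[:n]


rows = read_peaks(PEAKS_TSV)
for r in top_peaks(rows, key="fold_enrichment", n=3):
    print(r["chr"], r["start"], r["end"], r["fold_enrichment"])
```

                Switching `key` between `"-10*log10(pvalue)"` and `"fold_enrichment"` gives the two rankings suggested above; the two orders need not agree.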


                • anurag.gautam
                  Member
                  • Oct 2010
                  • 15

                  #9
                  Originally posted by mudshark
                  Sure, it can make sense to filter the peaks further, in particular if you have a lot of them (>1000). A straightforward approach would be to look at the top 100 or 200 (ranked by enrichment or p-value).
                  OK. I sorted the peaks in descending order by -10*log10(pvalue). But what about the summit value? I also want to use it to rank my highly confident or enriched peaks. Is there any criterion that uses -10*log10(pvalue), fold_enrichment and the summit value together to rank my peaks?

                  For example:
                  chr    start     end       length  summit  tags  -10*log10(pvalue)  fold_enrichment
                  chr3L  12092327  12092827  501     294     212   1372.6             41.16
                  chrX   18215330  18215683  354     249     147   947.81             35.95
                  chrX   587408    587798    391     234     171   1134.01            35.17
                  chr3L  3348888   3349361   474     259     171   1004.39            34.13
                  chrX   8385843   8386276   434     180     143   793.24             33.87
                  chr3R  2225145   2225813   669     396     212   1117.08            33.43
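
                  To make the `-10*log10(pvalue)` column easier to read, here is a small sketch that converts such a score back to a p-value by inverting the formula in the column header. The function names are made up for illustration.

```python
def pvalue_from_score(score):
    """Invert score = -10 * log10(p), i.e. p = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


def log10_pvalue(score):
    """Return log10(p) directly; safer when p would underflow a float."""
    return -score / 10.0


# Scores copied from the table above.
for score in (1372.6, 947.81, 793.24):
    print(score, log10_pvalue(score), pvalue_from_score(score))
```

                  Since the score is a monotonic transform of the p-value, sorting by the score in descending order is the same as sorting by p-value in ascending order.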


                  • anurag.gautam
                    Member
                    • Oct 2010
                    • 15

                    #10
                    Does the summit value also tell me the height of my peak?


                    • mudshark
                      Senior Member
                      • Jan 2009
                      • 138

                      #11
                      no, the summit value is just the position within the peak area, not the peak height
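
                      A small sketch of what that position means in practice, assuming the summit column is an offset counted from the peak start (this appears to match the table in #9, where every summit value is smaller than the peak length, but it is worth checking against the documentation of your MACS version, including possible off-by-one conventions). The helper names are hypothetical.

```python
# (chr, start, length, summit) copied from the first three rows in post #9.
PEAKS = [
    ("chr3L", 12092327, 501, 294),
    ("chrX", 18215330, 354, 249),
    ("chrX", 587408, 391, 234),
]


def absolute_summit(start, summit_offset):
    """Genomic summit coordinate, assuming the offset is relative to start."""
    return start + summit_offset


def offset_from_midpoint(length, summit_offset):
    """How far the summit sits from the middle of the peak, in bp."""
    return summit_offset - length / 2


for chrom, start, length, summit in PEAKS:
    print(chrom,
          absolute_summit(start, summit),
          offset_from_midpoint(length, summit))
```

                      The non-zero midpoint offsets are consistent with the earlier remark in #4 that summits usually do not sit exactly in the middle of the peak.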


                      • anurag.gautam
                        Member
                        • Oct 2010
                        • 15

                        #12
                        What if I also want to use the summit value to rank my highly confident or enriched peaks? Is there any criterion that uses -10*log10(pvalue), fold_enrichment and the summit value together to rank my peaks?


                        • mudshark
                          Senior Member
                          • Jan 2009
                          • 138

                          #13
                          I suggest you just sort by p-value (or enrichment), keep the top 100, and then extract the sequence around the summit position.
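
                          The extraction step could be sketched as below. This is a toy version under stated assumptions: the genome is an in-memory dict of sequence strings, the summit is a 0-based position, and `extract_around` / `to_fasta` are hypothetical helpers. A real genome would normally be read from an indexed FASTA file instead.

```python
def extract_around(genome, chrom, summit, flank=50):
    """Return the sequence from summit - flank to summit + flank, clipped
    to the chromosome ends. `genome` maps chromosome name -> sequence."""
    seq = genome[chrom]
    start = max(0, summit - flank)
    end = min(len(seq), summit + flank + 1)
    return seq[start:end]


def to_fasta(name, seq, width=60):
    """Format one FASTA record with lines of at most `width` bases."""
    body = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([f">{name}"] + body)


# Toy genome and summit positions, for illustration only.
toy_genome = {"chrT": "ACGTACGTACGTTTGACGTCAAACGT"}
toy_summits = [("chrT", 15), ("chrT", 2)]

for chrom, pos in toy_summits:
    seq = extract_around(toy_genome, chrom, pos, flank=5)
    print(to_fasta(f"{chrom}:{pos}", seq))
```

                          The FASTA records produced this way are the kind of input a de novo motif finder such as MEME expects.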


                          • anurag.gautam
                            Member
                            • Oct 2010
                            • 15

                            #14
                            Thanks for your quick replies, mudshark.
                            I was able to rank my strong peaks based on p-value and a peak length >1000 bp (depending on the peaks called). Could you give me some further ideas about motif analysis? I also used the MEME suite for de novo motif analysis. Once I have the motifs, what kind of significant biological information can be drawn from them?
