  • How to get TF binding sites from ChIP-Seq

    Hello all,

    I have the following question: suppose I have ChIP-Seq data for a protein, ran peak calling (e.g. with MACS), and obtained, say, 40,000 peaks, each around 1000 bp wide. How do I proceed to detect the precise coordinates of all bound proteins? (By precise I mean at least 10 bp resolution.)

    Thanks!

  • #2
    The coordinates of the peak summits (MACS writes a separate file for these, afaik) should give you a fairly good starting point.

    • #3
      Are the summits exactly in the middle between the peak start and end?

      • #4
        That is usually not the case (it depends on the peak caller, though).

        • #5
          Hi,
          I mapped my ChIP-seq data to the dm3 reference genome using bowtie. Once I had the BAM file, I used MACS to call peaks and got 15,000-plus peaks. I also got a NAME_summits.bed file. What does "summit" mean, and how can I use this file for downstream analysis? Secondly, how can I get confident and enriched peaks from the peaks.xls file from MACS?
          Please let me know your views and suggestions.
          Anurag

          • #6
            Originally posted by anurag.gautam
            Hi
            What does "summit" mean, and how can I use this file for downstream analysis?
            Summits are the positions of maximum enrichment within a larger peak area. Most likely they map to the exact binding position of your target protein. However, this depends on whether your target protein binds DNA directly.

            Downstream analysis depends on the question you are asking. If, for example, you want to discover a binding motif for your TF, the summit positions can be used to extract the surrounding DNA sequences for motif enrichment analysis (-> MEME).
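One way to turn summit positions into regions for motif analysis is to take a fixed window around each summit. The Python sketch below is only an illustration: it assumes the summits file is BED-like, with one 1-bp interval per line (chromosome, 0-based start, exclusive end, name, score), and the function name, the 50 bp flank, and the example lines are all made up.

```python
def summit_windows(bed_lines, flank=50):
    """Yield (chrom, start, end, name) windows of +/- `flank` bp around each summit.

    Assumes each line is a 1-bp BED interval (0-based start, exclusive end).
    """
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments and browser/track header lines.
        if not fields or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        chrom, summit = fields[0], int(fields[1])
        name = fields[3] if len(fields) > 3 else "."
        start = max(0, summit - flank)   # clamp at the chromosome start
        end = summit + flank + 1         # BED end is exclusive
        yield chrom, start, end, name


# Made-up example lines, not real data.
example = [
    "chr2L\t1000\t1001\tpeak_1\t25.0",
    "chrX\t20\t21\tpeak_2\t12.0",
]
windows = list(summit_windows(example, flank=50))
for chrom, start, end, name in windows:
    print(chrom, start, end, name, sep="\t")
```

The window end is not clamped to the chromosome length here, so that check would still be needed for summits near chromosome ends. The resulting intervals could then be written out as BED and passed to a sequence-extraction tool (bedtools getfasta is one option) before running MEME.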

            Originally posted by anurag.gautam
            Secondly, how can I get confident and enriched peaks from the peaks.xls file from MACS?
            Please rephrase the question, as I cannot understand it. Sorry.

            • #7
              Originally posted by mudshark
              Summits are the positions of maximum enrichment within a larger peak area. Most likely they map to the exact binding position of your target protein. However, this depends on whether your target protein binds DNA directly.

              Downstream analysis depends on the question you are asking. If, for example, you want to discover a binding motif for your TF, the summit positions can be used to extract the surrounding DNA sequences for motif enrichment analysis (-> MEME).

              Please rephrase the question, as I cannot understand it. Sorry.
              MACS calls peaks based on the p-value and mfold cutoffs you provide. Can I directly use the coordinates of the called peaks to get the DNA sequences for motif analysis? Or can I apply additional cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks of interest, which would in turn give me confident and enriched peaks?
              I hope that clarifies my question. Let me know.

              • #8
                Originally posted by anurag.gautam
                Can I directly use the coordinates of the called peaks to get the DNA sequences for motif analysis? Or can I apply additional cutoffs or filtering criteria, based on the summit and fold-enrichment values, to find the strong peaks of interest, which would in turn give me confident and enriched peaks?
                Sure, it can make sense to filter the peaks further, particularly if you have a lot of them (>1000). A straightforward approach would be to look at the top 100 or 200 (ranked by enrichment or p-value).

                • #9
                  Originally posted by mudshark
                  Sure, it can make sense to filter the peaks further, particularly if you have a lot of them (>1000). A straightforward approach would be to look at the top 100 or 200 (ranked by enrichment or p-value).
                  OK. I sorted the peaks in descending order by -10*log10(pvalue). But what about the summit value? I also want to use it for ranking my highly confident or enriched peaks. Can you suggest a criterion that uses -10*log10(pvalue), fold_enrichment, and the summit value to rank my peaks?

                  For example:
                  chr    start     end       length  summit  tags  -10*log10(pvalue)  fold_enrichment
                  chr3L  12092327  12092827  501     294     212   1372.6             41.16
                  chrX   18215330  18215683  354     249     147   947.81             35.95
                  chrX   587408    587798    391     234     171   1134.01            35.17
                  chr3L  3348888   3349361   474     259     171   1004.39            34.13
                  chrX   8385843   8386276   434     180     143   793.24             33.87
                  chr3R  2225145   2225813   669     396     212   1117.08            33.43

                  • #10
                    Does the summit value also tell me the height of my peak?

                    • #11
                      The summit is just the position within the peak area.
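To make this concrete: in the table quoted in #9 the summit column is smaller than the length column in every row, which fits it being an offset from the peak start rather than a height. The Python sketch below assumes that reading and converts two of those rows to genome coordinates. Whether an off-by-one correction is needed may differ between MACS versions, so it is worth checking the result against NAME_summits.bed. The helper names are made up for illustration.

```python
# Two rows copied from the table in #9:
# chr start end length summit tags -10*log10(pvalue) fold_enrichment
rows = """chr3L 12092327 12092827 501 294 212 1372.6 41.16
chrX 18215330 18215683 354 249 147 947.81 35.95"""


def parse_peak(line):
    """Split one whitespace-separated peaks.xls row into a dict."""
    chrom, start, end, length, summit, tags, score, fold = line.split()
    return {
        "chrom": chrom,
        "start": int(start),
        "end": int(end),
        "length": int(length),
        "summit": int(summit),   # assumed: offset from the peak start
        "tags": int(tags),
        "score": float(score),   # -10*log10(pvalue)
        "fold": float(fold),     # fold_enrichment
    }


def summit_position(peak):
    """Genome coordinate of the summit, assuming summit = offset from start."""
    return peak["start"] + peak["summit"]


peaks = [parse_peak(line) for line in rows.splitlines()]
for p in peaks:
    midpoint_offset = p["length"] // 2
    print(p["chrom"], summit_position(p),
          "summit offset:", p["summit"], "midpoint offset:", midpoint_offset)
```

For these two rows the summit offsets (294 and 249) differ from the midpoint offsets (250 and 177), which matches the earlier answer that the summit is usually not in the middle of the peak.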

                      • #12
                        What if I also want to use the summit value for ranking my highly confident or enriched peaks? Can you suggest a criterion that uses -10*log10(pvalue), fold_enrichment, and the summit value to rank my peaks?

                        • #13
                          I suggest you just sort and take the top 100 based on the p-value (or enrichment) and then extract the sequence around the summit position.
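A rough Python sketch of that suggestion, under stated assumptions: peaks are already parsed into dicts with hypothetical keys `chrom`, `start`, `summit`, `score` (-10*log10(pvalue)) and `fold`; the summit is treated as an offset from the peak start; and the genome is a plain dict of chromosome name to sequence string. The toy genome and peaks are invented, and the 1-based versus 0-based coordinate conventions are ignored for simplicity.

```python
def top_peaks(peaks, n=100, key="score"):
    """Return the n peaks with the highest value of `key` (e.g. 'score' or 'fold')."""
    return sorted(peaks, key=lambda p: p[key], reverse=True)[:n]


def summit_sequence(genome, peak, flank=50):
    """Return the sequence within +/- `flank` bp of the summit, clamped to the chromosome."""
    seq = genome[peak["chrom"]]
    center = peak["start"] + peak["summit"]   # assumed: summit is an offset from start
    lo = max(0, center - flank)
    hi = min(len(seq), center + flank + 1)
    return seq[lo:hi]


# Toy data for illustration only.
genome = {"chrT": "ACGT" * 10}
peaks = [
    {"chrom": "chrT", "start": 10, "summit": 5, "score": 50.0, "fold": 3.0},
    {"chrom": "chrT", "start": 20, "summit": 2, "score": 90.0, "fold": 2.0},
]

best_by_score = top_peaks(peaks, n=1, key="score")
best_by_fold = top_peaks(peaks, n=1, key="fold")
print(summit_sequence(genome, best_by_score[0], flank=2))
print(summit_sequence(genome, best_by_fold[0], flank=2))
```

Sorting on a single column keeps the ranking easy to interpret. For a real genome the sequences would come from a FASTA file rather than an in-memory dict.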

                          • #14
                            Thanks for your quick replies, mudshark.
                            I was able to rank my strong peaks based on p-value and peak length >1000 bp (depending on my peaks called). Could you give some further ideas about motif analysis? I also used the MEME suite to do de novo motif analysis. Once I have the motifs, what kind of significant biological information can be drawn from them?
