  • autocorrelation pattern in ChIP-seq alignments

    Hello,

    We have ChIP-seq data from a single-end run with 35 bp reads. There are a few samples, each with a different antibody. We aligned the reads and created autocorrelation plots (sometimes called cross-correlation plots) using HOMER and SPP. The DNA fragment length is around 150 bp, so we expected to see a single large peak at a shift of about 150 bp.

    Some of the samples look as we expect, but some have a large peak at 35 bp, and a small peak at 150 bp. Does this mean that something is wrong with these samples?

    Thanks!
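
For readers who have not seen this kind of plot: the profile described above correlates forward-strand read starts with reverse-strand read starts at a range of strand shifts. The sketch below is a minimal, standard-library illustration of that idea on toy data. It is not how HOMER or SPP implement it; the function names, the toy binding sites, and the convention that a reverse read's 5' end sits exactly one fragment length downstream are assumptions made for illustration. The toy data contains only fragment-length signal, so it shows a single peak and no read-length peak.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def strand_cross_correlation(fwd_5p, rev_5p, chrom_len, max_shift):
    """Return profile[d] = correlation of forward 5' end counts at position i
    with reverse 5' end counts at position i + d, for d = 0..max_shift."""
    fwd = [0] * chrom_len
    rev = [0] * chrom_len
    for p in fwd_5p:
        fwd[p] += 1
    for p in rev_5p:
        rev[p] += 1
    # Slice so that fwd[i] lines up with rev[i + d] over the overlapping region.
    return [pearson(fwd[:chrom_len - d], rev[d:]) for d in range(max_shift + 1)]

# Toy data (hypothetical): three binding sites, each covered by fragments of
# exactly FRAGMENT_LEN bp. Assumed convention: the reverse read's 5' end is
# FRAGMENT_LEN bp downstream of the forward read's 5' end.
FRAGMENT_LEN = 150
sites = [1000, 2000, 3000]
fwd_5p = [s + o for s in sites for o in range(-20, 21, 5)]
rev_5p = [p + FRAGMENT_LEN for p in fwd_5p]

profile = strand_cross_correlation(fwd_5p, rev_5p, chrom_len=4000, max_shift=200)
best_shift = max(range(len(profile)), key=lambda d: profile[d])
print("strand shift with the highest correlation:", best_shift)
```

In real data the reads are noisy and fragment lengths vary, so the peak is broad rather than a single sharp point.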

  • #2
    In fact, it is a cross-correlation, not an autocorrelation.

    As regards your question: I have seen this before, and I don't think it is a problem in itself. It probably depends on the 'true' fragment size of your target-bound DNA, the signal-to-noise ratio, and the abundance of target sites. That is, if your signal-to-noise ratio is low and there are only a few target sites, you will get the average fragment size determined by the size-selection step. If you have good signal-to-noise and the target protein protects 35 bp of DNA, you might get a cross-correlation peak at 35 bp.



    • #3
      It's the other way around: good signal-to-noise gives the average fragment size; otherwise the correlation is dominated by a peak at exactly the read length. I'm not sure why, but it has nothing to do with protein-DNA protection.



      • #4
        This very insightful and helpful post by Anshul Kundaje on the MACS mailing list has a really good theory involving the mappability of the genome for why you see this pattern in non-enriched ChIP-seq data sets:



        • #5
          Thank you all for your responses!

          I've looked at the data again, and the best cross-correlation profiles are from the best antibodies, so your explanations make sense.

          I only have one lingering question: is the data from the not-as-good cross-correlation profiles still usable? That is, do we need to repeat those entire experiments, or will MACS be able to identify the real peaks?

          Many thanks!

          skip56558



          • #6
            In my experience I have not found realistic-looking or useable peaks in these types of data sets, unfortunately. I usually try to examine some of the peaks in a browser - you can tell pretty quickly if they look like real ChIP-seq peaks, which are very enriched compared to the background, or just like slightly higher regions in a noisy background. Another way to check is to run your peaks through an annotation tool like CEAS and look for enrichment in promoter regions.

            My experience is with ChIP-seq for transcription factor binding sites, so that advice might not apply for other types of experiments like histone modifications, though.
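
The browser check described above (real peaks stand well above their surroundings, spurious ones barely do) can be roughly quantified by comparing read density inside a called peak with the density in its flanks. The sketch below is only an illustration of that comparison: the function name `fold_over_flanks`, the 1 kb flank size, the pseudocount, and the toy read positions are all assumptions, and it is no substitute for a peak caller's own statistics or for a control sample.

```python
from bisect import bisect_left

def fold_over_flanks(read_positions, peak_start, peak_end, flank=1000):
    """Read density in [peak_start, peak_end) divided by the density in the
    two flanking windows. read_positions must be a sorted list of read
    positions on a single chromosome."""
    def count(lo, hi):
        # Number of reads with lo <= position < hi.
        return bisect_left(read_positions, hi) - bisect_left(read_positions, lo)

    peak_density = count(peak_start, peak_end) / (peak_end - peak_start)
    flank_reads = count(peak_start - flank, peak_start) + count(peak_end, peak_end + flank)
    # Pseudocount of 1 read (an arbitrary choice) avoids division by zero.
    flank_density = (flank_reads + 1) / (2 * flank)
    return peak_density / flank_density

# Toy data (hypothetical): uniform background of one read every 50 bp,
# plus a pile-up of 40 extra reads between 2400 and 2600.
background = list(range(0, 5000, 50))
pileup = list(range(2400, 2600, 5))
reads = sorted(background + pileup)

strong = fold_over_flanks(reads, 2400, 2600)   # region containing the pile-up
weak = fold_over_flanks(reads, 1000, 1200)     # background-only region
print("pile-up region fold:", round(strong, 2))
print("background region fold:", round(weak, 2))
```

A region that is only slightly higher than a noisy background will score close to 1 here, which matches the visual impression in a browser.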



            • #7
              Originally posted by cwhelan
              This very insightful and helpful post by Anshul Kundaje on the MACS mailing list has a really good theory involving the mappability of the genome for why you see this pattern in non-enriched ChIP-seq data sets:

              http://groups.google.com/group/macs-...595465a1f9b212
              Here is some more information from the same author: Phantom Peaks

              I've also noticed the same thing - that there are usually two peaks: one at the read length and one at the average fragment length. I have found that the strength of the fragment length peak compared to the read length peak is usually a good indicator of the signal-to-noise quality and one's ability to detect peaks in the data.

              I've always been under the impression that those peaks at the read length might be caused by PCR duplication, but the above link also has a good idea about biases in mappability.

              Justin
