  • autocorrelation pattern in ChIP-seq alignments

    Hello,

    We have ChIP-seq data that was from a single-end run with 35 bp reads. There are a few samples, with a different antibody used in each one. We aligned the reads and created autocorrelation plots (sometimes called cross-correlation) using HOMER and SPP. The DNA fragment length is around 150 bp, so we expect to see a single large peak at 150 bp.

    Some of the samples look as we expect, but some have a large peak at 35 bp, and a small peak at 150 bp. Does this mean that something is wrong with these samples?

    Thanks!
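For readers unfamiliar with these plots, here is a minimal sketch of how a strand cross-correlation profile can be computed. It is a toy illustration, not the HOMER or SPP implementation: the helper names (`pearson`, `strand_cross_correlation`, `toy_profiles`) and the idealized, noise-free read profiles are assumptions made for the example. Plus-strand read starts are placed upstream of each hypothetical binding site and minus-strand read starts downstream, one fragment length apart, so the correlation should peak when the minus strand is shifted back by the fragment length.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def strand_cross_correlation(plus, minus, max_shift):
    """Correlate plus-strand counts with minus-strand counts shifted by d bases.

    plus and minus are per-base counts of read 5' ends on each strand.
    Returns a dict mapping each shift d (0..max_shift) to a correlation value.
    """
    n = len(plus)
    return {d: pearson(plus[:n - d], minus[d:]) for d in range(max_shift + 1)}


def toy_profiles(genome_len=5000, sites=(1000, 2500, 4000), frag_len=150):
    """Idealized profiles (an assumption): no noise, one read start per base."""
    plus = [0] * genome_len
    minus = [0] * genome_len
    for site in sites:
        for i in range(site - frag_len, site):
            plus[i] += 1   # plus-strand 5' ends pile up upstream of the site
        for i in range(site, site + frag_len):
            minus[i] += 1  # minus-strand 5' ends pile up downstream
    return plus, minus


plus, minus = toy_profiles()
cc = strand_cross_correlation(plus, minus, max_shift=300)
best_shift = max(cc, key=cc.get)
print(best_shift)  # shift with the highest correlation; 150 for this toy input
```

Real data adds background reads and mappability effects, which is where a second peak at the read length can come from; this sketch models only the enriched-site signal.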

  • #2
    In fact, it is a cross-correlation, not an autocorrelation.

    As for your question: I have seen this before, and I don't think it is a problem in itself. It probably depends on the 'true' fragment size of your target-bound DNA, the signal-to-noise ratio, and the abundance of target sites. That is, if your signal-to-noise ratio is low and there are only a few target sites, you will get the average fragment size determined by the size-selection step. If you have a good signal-to-noise ratio and the target protein protects 35 bp of DNA, you might get a cross-correlation peak at 35 bp.


    • #3
      It's the other way around: a good signal-to-noise ratio gives the average fragment size; otherwise the correlation is dominated by a peak at exactly the read length. I'm not sure why, but it has nothing to do with protein-DNA protection.


      • #4
        This very insightful and helpful post by Anshul Kundaje on the MACS mailing list has a really good theory involving the mappability of the genome for why you see this pattern in non-enriched ChIP-seq data sets:


        • #5
          Thank you all for your responses!

          I've looked at the data again, and the best cross-correlation profiles are from the best antibodies, so your explanations make sense.

          I only have one lingering question: is the data from the not-as-good cross-correlation profiles still usable? That is, do we need to repeat those entire experiments, or will MACS be able to identify the real peaks?

          Many thanks!


          • #6
            Unfortunately, in my experience I have not found realistic-looking or usable peaks in these types of data sets. I usually try to examine some of the peaks in a browser. You can tell pretty quickly whether they look like real ChIP-seq peaks, which are strongly enriched compared to the background, or just like slightly higher regions in a noisy background. Another way to check is to run your peaks through an annotation tool like CEAS and look for enrichment in promoter regions.

            My experience is with ChIP-seq for transcription factor binding sites, so that advice might not apply for other types of experiments like histone modifications, though.
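As a rough illustration of the promoter check mentioned above, the sketch below computes the fraction of peak summits that fall within a fixed distance of a transcription start site (TSS). This is not how CEAS works internally; the function name `fraction_near_tss`, the 1 kb default window, and the single-chromosome coordinate lists are all assumptions made for the example.

```python
from bisect import bisect_left


def fraction_near_tss(summits, tss_positions, max_dist=1000):
    """Fraction of peak summits within max_dist bases of the nearest TSS.

    summits and tss_positions are integer coordinates on ONE chromosome
    (a simplifying assumption; real data would be handled per chromosome).
    """
    if not summits:
        return 0.0
    tss = sorted(tss_positions)
    near = 0
    for pos in summits:
        i = bisect_left(tss, pos)
        # The nearest TSS is either the one just before or just after pos.
        candidates = tss[max(i - 1, 0):i + 1]
        if any(abs(pos - t) <= max_dist for t in candidates):
            near += 1
    return near / len(summits)


# Hypothetical coordinates, for illustration only.
example = fraction_near_tss([100, 5000, 10500], [0, 10000])
print(example)
```

To judge enrichment, this fraction would need to be compared against the same statistic for a background set, for example randomly placed positions, since promoters cover only a small part of the genome.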


            • #7
              Originally posted by cwhelan
              This very insightful and helpful post by Anshul Kundaje on the MACS mailing list has a really good theory involving the mappability of the genome for why you see this pattern in non-enriched ChIP-seq data sets:

              http://groups.google.com/group/macs-...595465a1f9b212
              Here is some more information from the same author: Phantom Peaks

              I've also noticed the same thing - that there are usually two peaks: one at the read length and one at the average fragment length. I have found that the strength of the fragment length peak compared to the read length peak is usually a good indicator of the signal-to-noise quality and one's ability to detect peaks in the data.
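The comparison described above can be sketched as a simple ratio of peak heights. This is a hypothetical helper, not output from HOMER, SPP, or MACS: the name `peak_height_ratio`, the search window, and the choice to measure both peaks above the lowest correlation value in the profile are assumptions for illustration.

```python
def peak_height_ratio(cc, read_len, frag_len, window=5):
    """Ratio of the fragment-length peak to the read-length peak.

    cc maps strand shift (bp) to cross-correlation value. Each peak is taken
    as the maximum within +/- window of the expected shift, and both heights
    are measured above the minimum correlation in the profile.
    """
    def local_max(center):
        values = [cc[s] for s in range(center - window, center + window + 1) if s in cc]
        if not values:
            raise ValueError(f"no correlation values near shift {center}")
        return max(values)

    baseline = min(cc.values())
    frag_height = local_max(frag_len) - baseline
    read_height = local_max(read_len) - baseline
    if read_height <= 0:
        raise ValueError("read-length peak is not above the baseline")
    return frag_height / read_height


# Made-up profile: a tall peak at 35 bp and a smaller one at 150 bp.
profile = {35: 0.30, 150: 0.20, 300: 0.10}
ratio = peak_height_ratio(profile, read_len=35, frag_len=150)
print(round(ratio, 3))
```

A ratio well below 1, as in this made-up profile, would correspond to the case in the original question where the read-length peak dominates; a ratio above 1 means the fragment-length peak is the stronger one.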

              I've always been under the impression that those peaks at the read length might be caused by PCR duplication, but the above link also has a good idea about biases in mappability.

              Justin
