  • ChIP-Seq problems with library generation

    I am trying to sequence ChIPed DNA.
    My DNA was sonicated to 500 bp~1 kbp during the IP step.
    I tried two ways to deal with this size problem.
    One was to fragment the DNA again to below 200 bp in length before starting library preparation for Solexa.
    The other was to proceed with the DNA as it was.
    In the end, I successfully obtained enriched, adaptor-modified ChIP DNA.
    But after analyzing the data, I found that % align is only 1~2% for both methods (% PF was ~60% and total throughput was good).
    Can anyone explain which step could cause this kind of problem?
    Thank you in advance for your help.

  • #2
    It's a bit difficult to tell based on such limited information. It could be a problem with how you are using the aligner, a problem with the library prep, or something else entirely.

    Can you tell us what alignment program you are using? E.g. is it ELAND?

    If so, do you have any stats from the output? E.g. if you run the following unix command, we can see the breakdown of unique matches/repetitive matches/no matches:

    cut -f3 my.eland.output.file | sort | uniq -c

    You could also check to see if you are getting a lot of the same sequence reads (which might indicate library prep problems):

    cut -f2 my.eland.output.file | sort | uniq -c | sort
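
    For anyone who prefers a script, the two tallies above can be sketched in Python. This is only a rough sketch: it assumes the same tab-separated layout the `cut` commands imply (read sequence in column 2, match code in column 3), and the function names and example rows are made up for illustration.

```python
from collections import Counter

def tally_columns(lines, column):
    """Count distinct values in a 1-based tab-separated column.

    Roughly what `cut -fN file | sort | uniq -c` reports; lines with
    too few fields are skipped here (an illustrative choice).
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= column:
            counts[fields[column - 1]] += 1
    return counts

def top_repeated(lines, column=2, n=10):
    """Return the n most frequent values in a column, most frequent first."""
    return tally_columns(lines, column).most_common(n)

# Made-up rows for illustration only: name, sequence, match code.
example = [
    ">r1\tACGTACGT\tU0\n",
    ">r2\tACGTACGT\tNM\n",
    ">r3\tTTTTGGGG\tNM\n",
    ">r4\tACGTACGT\tR0\n",
]
match_codes = tally_columns(example, 3)
most_repeated = top_repeated(example, column=2, n=1)
```

    For a real file, pass an open file handle instead of the example list, e.g. `tally_columns(open("my.eland.output.file"), 3)`.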

    • #3
      1-2% aligned is very low. You might also want to verify you're aligning against the correct reference sequence, and to make sure that the tags that don't align aren't just adapter dimers/trimers/etc.

      I would also ask how many reads you received for each lane.

      Finally, with ChIP-Seq, the volumes of starting material are very low, so it's possible that you're getting contamination from something else. If you were running a gel to select the desired size range, I would suggest you make sure you run ONLY the ChIP-Seq results on that gel. If you also run a ladder or another experiment on the same gel, you can get a significant amount of contamination. (E.g. we found ladder sequences in our ChIP-Seq experiments, even when separated by 5+ empty wells.)
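
      A quick way to check the adapter dimer suggestion above is to count how many reads contain the start of the adapter. The sketch below assumes the reads are already loaded as plain strings; the adapter shown is a placeholder, not a real kit sequence, so substitute the one from your own library prep.

```python
def adapter_fraction(reads, adapter, min_overlap=12):
    """Fraction of reads containing the first `min_overlap` bases of `adapter`.

    Exact substring match only; sequencing errors inside the adapter are
    not tolerated, so this gives a lower bound.
    """
    if not reads:
        return 0.0
    probe = adapter[:min_overlap].upper()
    hits = sum(1 for read in reads if probe in read.upper())
    return hits / len(reads)

# Placeholder adapter and made-up reads, for illustration only.
PLACEHOLDER_ADAPTER = "GATTACAGATTACAGATTACA"
reads = [
    "GATTACAGATTACAGGGG",  # starts with the adapter prefix
    "ACGTACGTACGTACGTAC",
    "TTGATTACAGATTACAAA",  # adapter prefix after two bases
    "CCCCCCCCCCCCCCCCCC",
]
fraction = adapter_fraction(reads, PLACEHOLDER_ADAPTER)
```

      If a large share of the unaligned reads contain the adapter, the library prep is the more likely culprit than the aligner.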
      The more you know, the more you know you don't know. —Aristotle

      • #4
        I guess that is why they recommend doing size separation after adaptor ligation now...

        What I would do first is check the sequences for adaptors and possibly try aligning to some bacteria in case the sample is contaminated. Also check the reagents so that the beads are not saturated with ssDNA. And if you have reads > 30 bases, try aligning truncated reads (sequencing errors are most common at the ends of reads).

        Where are the aligned reads placed? Are they only in satellite repeats etc., or where you would expect them?
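
        Truncating reads before realigning, as suggested above, can be sketched as follows. This assumes the reads are in 4-line FASTQ records, which may not match your pipeline's output format; the function name and the cut-off of 25 bases are illustrative.

```python
def truncate_fastq(lines, keep=25):
    """Yield FASTQ lines with sequence and quality cut to the first `keep` characters.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    for i, line in enumerate(lines):
        text = line.rstrip("\n")
        # Line 2 of each record is the sequence, line 4 the quality string.
        if i % 4 in (1, 3):
            text = text[:keep]
        yield text

# One made-up 36-base record for illustration.
record = ["@read1", "ACGT" * 9, "+", "I" * 36]
truncated = list(truncate_fastq(record, keep=25))
```

        The truncated records can then be written back out and passed to the aligner to see whether the alignment rate improves.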

        • #5
          apfejes - regarding your issue of ladder contamination of a sample on the ChIP-Seq size selection gel, how do you evaluate what to excise without a ladder? tks

          • #6
            Hi sblake,

            I was told the people in the lab run the gel with a blue dye that migrates along with fragments of a particular size. They use this as a guide to indicate the approximate position to excise. I understand it took a bit of practice to get the technique right, but once you have it down, it's not too bad.

            Offhand, I don't know which dye they use, but I certainly remember the blue dyes from my (long ago) days of running gels. I'd hate to try this on a really small gel, but on a longer one, I don't see this being a problem.

            • #7
              aha! makes sense, tks

              • #8
                This low alignment rate puzzles me. elly never replied to this thread.
                My first guess was what apfejes already raised: was it the right reference genome? Otherwise the experiment went terribly wrong.

                ChIP-seq usually generates rather noisy data. But the noise is in the aligned tags: often only 4% to 5% of the aligned tags fall into clusters. Amplification plays a crucial role here; one step too many causes quite a headache. However, 4% is still enough to get good results in the end.

                Klaus

                • #9
                  Hi kmay,

                  I'm so sorry that I didn't reply to my thread.
                  I was busy setting up another application.

                  I'm sure that my reference genome sequence was correct.

                  Could you please explain why ChIP-seq usually generates noisy data?
                  And how can 4~5% aligned data be used for further analysis?
                  What are the remaining 95~96% of reads?

                  I haven't solved this problem yet and am still seeking a solution.

                  Your suggestions and explanation would be very helpful to me.
                  Thank you in advance.

                  elly.

                  • #10
                    elly,

                    sorry for having been unclear!

                    I am talking about two steps.
                    1st: mapping of the reads to the genome
                    2nd: clustering of the mapped reads from step 1 into regions (clusters) of enriched read density.

                    We only have example data for a DNase-seq experiment online, but it might be helpful in explaining the difference.

                    Step 1 statistics
                    Step 2 statistics, with arbitrary variation of tag density per bp window to demonstrate its effects. Usually we calculate the significance of tag density based on a Poisson distribution.
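
                    To make the Poisson idea above concrete, here is a minimal sketch of how a window's tag count could be scored against a uniform background. It is not the actual pipeline described in this post: the uniform-background assumption, the function names, and all the numbers are made up for illustration.

```python
import math

def expected_tags_per_window(total_tags, genome_size, window):
    """Expected tag count per window if tags were spread uniformly."""
    return total_tags * window / genome_size

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

# Made-up numbers: 1 million mapped tags, 100 Mb genome, 200 bp windows.
lam = expected_tags_per_window(total_tags=1_000_000,
                               genome_size=100_000_000,
                               window=200)
# Probability of seeing 10 or more tags in one window by chance alone.
p_value = poisson_sf(10, lam)
```

                    With these numbers the background expectation is 2 tags per window, so a window holding 10 tags is very unlikely under the background model. In practice a control sample gives a better background estimate than a uniform rate.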

                    If step 1 delivers only 5%, something is terribly wrong.
                    Did you try mapping algorithms other than ELAND? Do mappings with increasingly relaxed criteria: 1 point mutation, 2, 3, ...; indels of 1, 2, 3, ...; and see how the mapped tag numbers behave.

                    The 5% I've been talking about corresponds to the number of mapped tags falling into clusters of enriched density from step 2.
                    The "noisiness" at this stage largely depends on the specificity of the antibody. There is always a lot of unspecific binding carried over. The experimental set-up has another major effect: whether and how you do a control for subtraction (unspecific ab or just input control). In our experience the latter shows better results. Last but not least, be very careful with amplifications. Noise gets up to signal levels rather quickly.

                    5% vs. the remaining 95%: well, I am afraid there is no clear-cut statistical method to decide at the end whether the experiment succeeded. It requires some human brain to look at the raw data and clusters in the genome annotation (we do this in ElDorado). You can sign up for free for two weeks and inspect the open chromatin data, or go here to see the DGE results from our Science paper. Go down to "user data" and choose which data you want to see.

                    Statistics comes into play again at the next step: see which TF-binding sites are over-represented in the clusters (hopefully the one you IPed), and whether they are part of a complex model or phylogenetically conserved.

                    Hope this helps!

                    Cheers

                    Klaus

                    • #11
                      Hi again,
                      I guess Klaus is referring to the fact that, of all aligned reads, only a small percentage occur in peaks of significant enrichment. You will always have some genomic background, and due to the large genome size this will generate a high number of randomly aligned sequences even if you have a good enrichment ratio.

                      What beads and blocking did you use for ChIP? Are there any possible contaminants like ssDNA?

                      • #12
                        Klaus,

                        I did not see your answer to elly before. Is amplification a problem in your data even if you count only unique positions for the uniquely aligned reads?

                        • #13
                          Chipper,

                          I am not sure whether I understand your question right. The amplification in the wet lab amplifies everything, including unspecific noise. One should expect that higher copy number tags build up more quickly than unique signals, and that restricting downstream analysis to perfect and unique matches eliminates this. As a first step we always look at perfect and unique matches only. However, there is a lot of "noise" (unspecifically bound DNA or otherwise carried-over oligos) which matches perfectly and uniquely, too.
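
                          For reference, the "unique positions only" idea raised in the question can be sketched as below. This is an illustrative sketch, not our analysis code: it assumes alignments are already available as (chromosome, position, strand) tuples, and the example data is made up.

```python
def collapse_to_unique_positions(alignments):
    """Keep one alignment per (chromosome, position, strand), in first-seen order."""
    seen = set()
    unique = []
    for chrom, pos, strand in alignments:
        key = (chrom, pos, strand)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

# Made-up alignments: three copies at chr1:100 (+), likely amplification duplicates.
alignments = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 100, "-"),
    ("chr2", 5000, "+"),
    ("chr1", 100, "+"),
]
unique = collapse_to_unique_positions(alignments)
duplication_rate = 1 - len(unique) / len(alignments)
```

                          Collapsing removes stacked copies of the same fragment, but, as noted above, it cannot remove unspecific DNA that happens to map perfectly and uniquely.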

                          We don't do the wet lab work; we only do analyses. And with the many different data sets we have seen so far, we found that amplification seems to be the next crucial step after ab specificity.

                          If you are interested, I could bring in our specialist for that.

                          Klaus

                          • #14
                            Chipper, replying to your post #11 in this thread, how are you able to tell whether or not ssDNA is contaminating the sample?

                            • #15
                              Did you by chance use salmon sperm or another DNA as a block or carrier during the ChIP?

                              (We've had 2 separate groups who did just this. It leads to very low % alignment. Uggg! On the other hand, we're up to 100 million salmon reads if anyone wants to have a go at a de novo assembly.)
