
  • ChIP-seq data analysis for beginner

    I should be getting back my first ChIP-seq data this week.

    Which software do people recommend I use for data analysis, peak calling, etc.? Any help with the criteria I should use to choose software, and any resources that could answer my questions, would be appreciated.

    I'm a bench scientist, not a bioinformatician, so something reasonably user friendly would be nice, even better if I can run it on my MacBook.

    Thanks.
    --------------
    Ethan

  • #2
    USeq is a good tool with a step-by-step guide.

    For a Windows solution, CisGenome is a good idea. It requires minimal pre-processing of Solexa data before running it through CisGenome.
    --
    bioinfosm


    • #3
      Thanks!!!! I'll try USeq and report back in a week or so on how I fare.
      --------------
      Ethan


      • #4
        USeq looks like it has a lot of great tools, but is there more detailed information on how to use it? I'm really not understanding what I have to do to process my data.
        --------------
        Ethan


        • #5
          Hi,
          If you are familiar with R, perhaps one of these two packages will help you
          (try to use the latest R version):


          or


          We have developed a web page for the analysis of ChIP-seq data. It is currently a beta version, but we will make it available after this summer (only for plant genomes).


          • #6
            I checked out the Bioconductor apps and it seems you need to know R, so I got absolutely nowhere with them.

            Here's where I am. I have sorted Illumina .txt files. I assume I need to convert these to .BED files. Is there software that will do this for me without my having to know a programming language?

            Does anybody have any suggested reading to get me up to speed? I'd really like to look at my data.

            Thanks again.
            --------------
            Ethan
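For readers who do end up scripting the conversion asked about above, here is a minimal Python sketch. It is not the behaviour of any tool named in this thread. It assumes the sorted Illumina file is tab-delimited with the read sequence in column 9, the matched chromosome in column 11, a 1-based match position in column 13, and the strand (F or R) in column 14. Those column positions, and the function name `sorted_txt_to_bed`, are assumptions to check against your own file before trusting the output.

```python
def sorted_txt_to_bed(lines, seq_col=8, chrom_col=10, pos_col=12, strand_col=13):
    """Yield BED6 lines from tab-delimited alignment lines.

    Column indices are 0-based and are ASSUMPTIONS about the input layout;
    adjust them to match your file.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(seq_col, chrom_col, pos_col, strand_col):
            continue  # line too short to hold an alignment
        position = fields[pos_col]
        if not position.isdigit():
            continue  # no numeric position: treat the read as unaligned
        start = int(position) - 1  # BED starts are 0-based
        end = start + len(fields[seq_col])  # BED ends are exclusive
        chrom = fields[chrom_col]
        if chrom.endswith(".fa"):
            chrom = chrom[:-3]  # drop a FASTA file suffix if present
        strand = "+" if fields[strand_col] == "F" else "-"
        yield "\t".join([chrom, str(start), str(end), "read", "0", strand])
```

Typical use would be to open the input file, pass the file object as `lines`, and write each yielded line to a `.bed` file.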


            • #7
              Once you have the file with the short reads, you should map them to the genome of interest using SOAP, bowtie, BWA, or any other mapping tool.

              The result of this mapping process is then used by the peak-calling software (CSAR, PICS, USeq, CisGenome...) to identify the significant regions.

              With which organism are you working?
              We have a web tool for the analysis of Arabidopsis ChIP-seq data. Basically, you submit the file with the short reads, and our server will analyze the data using SOAP and CSAR. It will report the binding map in a wig file, plus a list of nearby genes.


              • #8
                Use Starr from Bioconductor. However, you need to know basic R.


                • #9
                  Have you looked at Galaxy? http://main.g2.bx.psu.edu/
                  I'm not really familiar with it as I prefer my trusty friend the command-line, but they have quite some nice tools, screencasts to explain typical analyses etc. It's all under active development and also specifically geared towards biologists.


                  • #10
                    I have a related question. This is my first ChIP-seq analysis, 8 lanes of Solexa reads from a mouse experiment, and bowtie only aligns ~40% of the reads to mm9, no matter how stringent or lax I set the bowtie parameters. This alignment percentage seems low compared to RNA-seq, but maybe it's not unusual for ChIP-seq.

                    Anyone with relevant experience, is this about right or should I be looking for an error?

                    Thanks


                    • #11
                      I'm not familiar with Bowtie, but 40% seems quite low. The fraction of mapped reads should generally be higher than that. We routinely map 75%-85% of the reads in our ChIP-seq samples to mouse. Are reads mapping to repeat regions included in this number?
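To compare numbers like the 40% and 75%-85% quoted above, it helps to count mapped reads the same way each time. The sketch below assumes the alignments are available in SAM format, which is an assumption since the thread does not say which output format was used. It counts a read as mapped when the unmapped flag bit (0x4) is not set. The function name `mapped_fraction` is made up for illustration.

```python
def mapped_fraction(sam_lines):
    """Return the fraction of reads in SAM-formatted lines that are mapped.

    Header lines (starting with '@') are skipped. Secondary (0x100) and
    supplementary (0x800) records are ignored so each read counts once.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a valid alignment record
        flag = int(fields[1])
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        if not flag & 0x4:  # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0
```

Reads that map to repeats may or may not be reported depending on the aligner settings, so the same data can give different percentages under different reporting options.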


                      • #12
                        Quality Scores

                        I have seen this a couple of times when the quality scores of the reads were on a different scale than bowtie's default setting. For example, the bowtie manual states:

                        --phred33-quals
                        Input qualities are ASCII chars equal to the Phred quality plus 33. Default: on.


                        My reads came from a newer Solexa machine, so I had to set --solexa1.3-quals, which increased the percentage of mapped reads from ~40% to ~90% in RNA-Seq experiments.

                        Because the scale was different, a lot of good reads were discarded: their qualities were interpreted as being low when they were actually quite reasonable.

                        So you could check whether your aligner discarded reads due to bad quality, and whether you used the correct quality scale.
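One way to check the scale before aligning is to look at the lowest quality character in the file. The sketch below is a common heuristic, not part of bowtie: characters below ASCII 59 can only occur with a +33 offset, characters from 59 to 63 suggest the older Solexa +64 scale, and anything else is guessed to be Phred+64. The function names and the returned labels are made up for illustration, and a file containing only very high-quality Phred+33 reads would be misclassified.

```python
def guess_quality_encoding(quality_strings):
    """Guess the quality encoding from an iterable of FASTQ quality strings."""
    codes = [ord(ch) for quality in quality_strings for ch in quality]
    if not codes:
        raise ValueError("no quality characters to inspect")
    lowest = min(codes)
    if lowest < 59:
        return "phred33"   # the default described by --phred33-quals
    if lowest < 64:
        return "solexa64"  # older Solexa-scaled qualities, offset 64
    return "phred64"       # the scale selected by --solexa1.3-quals


def decode_qualities(quality, offset):
    """Convert a quality string to numeric scores using the given ASCII offset."""
    return [ord(ch) - offset for ch in quality]
```

Decoding the same string with the wrong offset shifts every score by 31 (64 minus 33), which is why the aligner's quality-based filtering can behave very differently from what was intended.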

                        Cheers

                        "You are only young once, but you can stay immature indefinitely."


                        • #13
                          Thanks, simonvh and Dethecor.

                          The unmapped reads are not obviously repetitive, but I'll check to see that the mm9 genome I'm aligning to is not masked for repetitive sequences. If that's the case, I'll try an unmasked genome.

                          In the meantime, I'm rerunning with the quality scale specified.

                          Thanks, mceachin


                          • #14
                            Originally posted by simonvh
                            Have you looked at Galaxy? http://main.g2.bx.psu.edu/
                            I'm not really familiar with it as I prefer my trusty friend the command-line, but they have quite some nice tools, screencasts to explain typical analyses etc. It's all under active development and also specifically geared towards biologists.
                            Hi Simon,
                            I am wondering whether we can run a two-sample analysis (treated vs. control) in Galaxy.
                            Thanks


                            • #15
                              Just a note for other beginners on how I fared with my first ChIP-seq analysis.

                              First I used the CLC Genomics Workbench, as its interface was really easy to use, but ultimately I was not satisfied with its performance.

                              I could never really get FindPeaks to work, although I heard it's a great program.

                              After a little fiddling around I got USeq to work and so far I am very happy with it. The makers should be congratulated for producing a really nice package of programs. I really hope it is maintained. The ChIP-seq program wrapper doesn't work in my hands but that's no problem as it's probably better to process the data through the programs separately. The one thing that helped a lot getting it to work was when I found the "results > show results" menu option which tells you what went wrong when an error occurs.

                              The Galaxy MACS peak calling tool didn't recognize my Eland files so I never used it. But I did use my USeq peaks to map them to promoters as described in the webcast tutorial.
                              --------------
                              Ethan
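The last step above, mapping peaks to promoters, comes down to an interval-overlap test. The sketch below is not the Galaxy workflow from the webcast. It assumes peaks are (chrom, start, end) tuples with BED-style half-open coordinates, genes are (name, chrom, tss, strand) tuples, and a promoter is a window of 1000 bp upstream to 200 bp downstream of the TSS. The window sizes and the function name `peaks_in_promoters` are assumptions chosen for illustration.

```python
from collections import defaultdict


def peaks_in_promoters(peaks, genes, upstream=1000, downstream=200):
    """Return (peak, gene_name) pairs where the peak overlaps the gene's promoter."""
    promoters = defaultdict(list)
    for name, chrom, tss, strand in genes:
        if strand == "+":
            low, high = tss - upstream, tss + downstream
        else:
            # On the minus strand, upstream lies at higher coordinates.
            low, high = tss - downstream, tss + upstream
        promoters[chrom].append((max(low, 0), high, name))

    hits = []
    for chrom, start, end in peaks:
        for low, high, name in promoters.get(chrom, []):
            if start < high and end > low:  # half-open interval overlap
                hits.append(((chrom, start, end), name))
    return hits
```

This simple double loop is fine for a few thousand peaks. For genome-scale gene lists, sorting the promoters or using an interval tree would be faster.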
