  • ETHANol
    Senior Member
    • Feb 2010
    • 308

    ChIP-seq data analysis for beginner

    I should be getting back my first ChIP-seq data this week.

    Which software do people recommend for data analysis, peak calling, etc.? Any help with the criteria I should use to choose software would be appreciated, as would pointers to resources that could answer my questions.

    I'm a bench scientist, not a bioinformatician, so something reasonably user-friendly would be nice, even better if I can run it on my MacBook.

    Thanks.
    --------------
    Ethan
  • bioinfosm
    Senior Member
    • Jan 2008
    • 483

    #2
    USeq is a good tool with a step-by-step guide.

    For a Windows solution, CisGenome is a good idea. It requires minimal pre-processing of Solexa data before you run it through CisGenome.
    --
    bioinfosm

    Comment

    • ETHANol
      Senior Member
      • Feb 2010
      • 308

      #3
      Thanks!!!! I'll try USeq and report back in a week or so on how I fare.
      --------------
      Ethan

      Comment

      • ETHANol
        Senior Member
        • Feb 2010
        • 308

        #4
        USeq looks like it has a lot of great tools, but is there more detailed information on how to use it? I really don't understand what I have to do to process my data.
        --------------
        Ethan

        Comment

        • Chema76
          Junior Member
          • May 2010
          • 5

          #5
          Hi,
          If you are familiar with R, perhaps one of these two packages will help you:
          (try to use the latest R version)


          or


          We have developed a web page for the analysis of ChIP-seq data. It is currently a beta version, but we will make it available after this summer (only for plant genomes).

          Comment

          • ETHANol
            Senior Member
            • Feb 2010
            • 308

            #6
            I checked out the Bioconductor apps, and it seems you need to know R, so I got absolutely nowhere with them.

            Here's where I am: I have sorted Illumina .txt files. I assume I need to convert these to .BED files. Is there software that will do this for me without my having to know a programming language?
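For readers who do want to script this step, here is a minimal sketch of such a conversion. It assumes the sorted .txt file is tab-delimited with the read sequence, chromosome, 1-based position, and strand (F/R) in columns 9, 11, 13, and 14. That layout is an assumption, not something stated in this thread, so check the column numbers against your own file. The function and file names are made up for illustration.

```python
def sorted_txt_to_bed(lines, chrom_col=10, pos_col=12, strand_col=13, read_col=8):
    """Yield BED6 tuples (chrom, start, end, name, score, strand).

    The 0-based column indices are ASSUMPTIONS about the input layout;
    check them against your own file before trusting the output.
    """
    needed = max(chrom_col, pos_col, strand_col, read_col)
    for n, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) <= needed:
            continue  # malformed or truncated row
        chrom, pos = fields[chrom_col], fields[pos_col]
        if not chrom or not pos.isdigit() or int(pos) < 1:
            continue  # read without a usable alignment
        if chrom.endswith(".fa"):
            chrom = chrom[:-3]  # e.g. "chr1.fa" -> "chr1"
        start = int(pos) - 1  # assumed 1-based input; BED starts are 0-based
        end = start + len(fields[read_col])  # BED ends are exclusive
        strand = "+" if fields[strand_col] == "F" else "-"
        yield (chrom, start, end, f"read{n}", 0, strand)


def write_bed(rows, handle):
    """Write BED tuples as tab-separated lines to an open text handle."""
    for row in rows:
        handle.write("\t".join(str(x) for x in row) + "\n")


# Hypothetical usage (file names are placeholders):
# with open("s_1_sorted.txt") as src, open("s_1.bed", "w") as dst:
#     write_bed(sorted_txt_to_bed(src), dst)
```

Plain `split("\t")` is used rather than a CSV parser because quality strings can contain quote characters that a CSV reader would misinterpret.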

            Does anybody have suggested reading to get me up to speed? I'd really like to look at my data.

            Thanks again.
            --------------
            Ethan

            Comment

            • Chema76
              Junior Member
              • May 2010
              • 5

              #7
              Once you have the file with the short reads, you should map them to the genome of interest using SOAP, Bowtie, BWA, or any other mapping tool.

              The result of this mapping process is then used by the peak-calling software (CSAR, PICS, USeq, CisGenome...) to identify the significant regions.
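To make the second step concrete, here is a toy illustration of what a peak caller does with mapped reads: it counts read starts in fixed-size windows and reports windows that reach a threshold. This is not the algorithm used by CSAR, PICS, USeq, or CisGenome, which rely on statistical models and usually a control sample. The function name, window size, and threshold are arbitrary assumptions.

```python
from collections import Counter


def call_windows(read_starts, window=200, min_reads=5):
    """Return sorted (chrom, window_start, window_end, count) tuples.

    read_starts: iterable of (chrom, start) pairs for mapped reads.
    Only windows containing at least `min_reads` read starts are kept.
    """
    # Bin each read by chromosome and window index.
    counts = Counter((chrom, start // window) for chrom, start in read_starts)
    peaks = [
        (chrom, idx * window, (idx + 1) * window, n)
        for (chrom, idx), n in counts.items()
        if n >= min_reads
    ]
    return sorted(peaks)


# Hypothetical example: six reads piled up on chr1, two isolated reads elsewhere.
example_reads = [("chr1", 1000 + 10 * i) for i in range(6)]
example_reads += [("chr1", 5000), ("chr2", 50)]
print(call_windows(example_reads))
```

A real analysis should use one of the dedicated tools named above; the sketch only shows why the peak caller needs mapped positions rather than raw reads.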

              Which organism are you working with?
              We have a web tool for the analysis of Arabidopsis ChIP-seq data. Basically, you submit the file with the short reads, and our server analyzes the data using SOAP and CSAR. It reports the binding map in a WIG file, along with a list of nearby genes.

              Comment

              • czhang
                Junior Member
                • Jan 2010
                • 4

                #8
                Use Starr from Bioconductor. However, you need to know basic R.

                Comment

                • simonvh
                  Member
                  • Jul 2010
                  • 12

                  #9
                  Have you looked at Galaxy? http://main.g2.bx.psu.edu/
                  I'm not really familiar with it, as I prefer my trusty friend the command line, but they have quite a few nice tools, screencasts explaining typical analyses, etc. It's all under active development and is specifically geared towards biologists.

                  Comment

                  • mceachin
                    Junior Member
                    • May 2010
                    • 5

                    #10
                    I have a related question. This is my first ChIP-seq analysis, 8 lanes of Solexa reads from a mouse experiment, and Bowtie aligns only ~40% of the reads to mm9, no matter how stringent or lax I set the Bowtie parameters. This alignment percentage seems low compared to RNA-seq, but maybe it's not unusual for ChIP-seq.

                    Anyone with relevant experience: is this about right, or should I be looking for an error?

                    Thanks

                    Comment

                    • simonvh
                      Member
                      • Jul 2010
                      • 12

                      #11
                      I'm not familiar with Bowtie, but 40% seems quite low. The fraction of mapped reads should generally be higher than that. We routinely map 75%-85% of the reads in our ChIP-seq samples to mouse. Are reads mapping to repeat regions included in this number?

                      Comment

                      • Dethecor
                        Member
                        • May 2010
                        • 24

                        #12
                        Quality Scores

                        I have seen things like that a couple of times when the quality scores of the reads were on a different scale than Bowtie's default setting. For example, the Bowtie manual states:

                        --phred33-quals
                        Input qualities are ASCII chars equal to the Phred quality plus 33. Default: on.


                        My reads came from a newer Solexa machine, so I had to set --solexa1.3-quals, which increased the percentage of mapped reads from ~40% to ~90% in RNA-seq experiments.

                        This is because, with the wrong scale, a lot of good reads were discarded: their qualities were interpreted as being low when they actually were quite reasonable.

                        So you could check whether your aligner discarded reads due to bad quality, and whether you used the correct quality scale.
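One way to check the scale before aligning is to look at the range of characters in the quality strings. The sketch below is a rough heuristic, not part of Bowtie: characters below ASCII 59 (`;`) can only come from a +33 encoding, while characters above ASCII 74 (`J`) point to a +64 encoding such as the one `--solexa1.3-quals` expects. The function name is an assumption made for illustration.

```python
def guess_quality_offset(quality_strings):
    """Guess the ASCII offset (33 or 64) used by FASTQ quality strings.

    Returns 33, 64, or None when the observed range is ambiguous
    (for example, when only a few reads were sampled).
    """
    codes = [ord(c) for q in quality_strings for c in q]
    if not codes:
        return None
    if min(codes) < 59:
        return 33  # too low for any +64 encoding (Solexa+64 starts at ';')
    if max(codes) > 74:
        return 64  # too high for typical Phred+33 data (which tops out near 'J')
    return None


# Hypothetical quality strings, not taken from real data:
print(guess_quality_offset(["IIII##", "IIIIII"]))
print(guess_quality_offset(["hhhhgggf", "hhhhhhhh"]))
```

Sampling a few thousand reads from the start of the file is usually enough; if the function returns `None`, the sample did not contain a decisive character.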

                        Cheers

                        "You are only young once, but you can stay immature indefinitely."

                        Comment

                        • mceachin
                          Junior Member
                          • May 2010
                          • 5

                          #13
                          Thanks, simonvh and Dethecor.

                          The unmapped reads are not obviously repetitive, but I'll check to see that the mm9 genome I'm aligning to is not masked for repetitive sequences. If that's the case, I'll try an unmasked genome.
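A quick way to check masking is to measure how much of the reference FASTA consists of `N` (hard-masked) or lowercase (soft-masked) bases. The sketch below assumes a plain-text FASTA file; the function name and the file name in the comment are placeholders. Even an unmasked assembly contains some `N` runs for assembly gaps, so the numbers are a hint rather than a verdict.

```python
def masking_summary(fasta_lines):
    """Return (fraction_N, fraction_lowercase) over all sequence lines.

    N/n bases are counted as hard-masked or gap; other lowercase bases
    are counted as soft-masked. Header lines starting with '>' are skipped.
    """
    total = n_count = lower = 0
    for line in fasta_lines:
        if line.startswith(">"):
            continue
        seq = line.strip()
        total += len(seq)
        n_count += seq.count("N") + seq.count("n")
        lower += sum(1 for c in seq if c.islower() and c != "n")
    if total == 0:
        return (0.0, 0.0)
    return (n_count / total, lower / total)


# Hypothetical usage (file name is a placeholder):
# with open("mm9.fa") as handle:
#     print(masking_summary(handle))
```

A large `N` fraction suggests a hard-masked genome, in which reads from repeats cannot align; a large lowercase fraction only indicates soft-masking, which most aligners ignore.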

                          In the meantime, I'm rerunning with the quality scale specified.

                          Thanks, mceachin

                          Comment

                          • Bioinfo
                            Member
                            • Jul 2010
                            • 15

                            #14
                            Originally posted by simonvh
                            Have you looked at Galaxy? http://main.g2.bx.psu.edu/
                            I'm not really familiar with it, as I prefer my trusty friend the command line, but they have quite a few nice tools, screencasts explaining typical analyses, etc. It's all under active development and is specifically geared towards biologists.
                            Hi Simon,
                            I am wondering whether we can do a two-sample (treated vs. control) analysis in Galaxy.
                            thanks

                            Comment

                            • ETHANol
                              Senior Member
                              • Feb 2010
                              • 308

                              #15
                              Just a note on how I fared with my first ChIP-seq analysis, for other beginners.

                              First I used the CLC Genomics Workbench, as its interface was really easy, but ultimately I was not satisfied with its performance.

                              I could never really get FindPeaks to work, although I heard it's a great program.

                              After a little fiddling around I got USeq to work and so far I am very happy with it. The makers should be congratulated for producing a really nice package of programs. I really hope it is maintained. The ChIP-seq program wrapper doesn't work in my hands but that's no problem as it's probably better to process the data through the programs separately. The one thing that helped a lot getting it to work was when I found the "results > show results" menu option which tells you what went wrong when an error occurs.

                              The Galaxy MACS peak calling tool didn't recognize my Eland files so I never used it. But I did use my USeq peaks to map them to promoters as described in the webcast tutorial.
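For anyone who wants to do the peak-to-promoter step outside Galaxy, the idea can be sketched as a simple interval overlap. This is not the Galaxy workflow from the tutorial. The function name, the promoter definition (1000 bp upstream to 200 bp downstream of the transcription start site), and the input tuples are all assumptions chosen for illustration.

```python
def peaks_in_promoters(peaks, genes, upstream=1000, downstream=200):
    """Return sorted (gene, chrom, peak_start, peak_end) for overlapping peaks.

    peaks: iterable of (chrom, start, end), 0-based half-open as in BED.
    genes: iterable of (name, chrom, tss, strand) with a 0-based TSS.
    The promoter window is strand-aware: "upstream" means lower
    coordinates on "+" genes and higher coordinates on "-" genes.
    """
    peaks = list(peaks)  # allow the peaks to be scanned once per gene
    hits = []
    for name, chrom, tss, strand in genes:
        if strand == "+":
            p_start, p_end = tss - upstream, tss + downstream
        else:
            p_start, p_end = tss - downstream + 1, tss + upstream + 1
        p_start = max(p_start, 0)
        for p_chrom, start, end in peaks:
            # Two half-open intervals overlap when each starts before the other ends.
            if p_chrom == chrom and start < p_end and end > p_start:
                hits.append((name, chrom, start, end))
    return sorted(hits)
```

The nested loop is quadratic, which is fine for a few thousand peaks; for whole-genome annotation an interval tree or a dedicated interval tool would be the better choice.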
                              --------------
                              Ethan

                              Comment
