  • HMS, motif finding tool for ChIP-Seq

    This is such a nice forum. I am impressed by the experience and enthusiasm of you guys. I just want to bring your attention to a paper we just published:

    On the detection and refinement of transcription factor binding sites using ChIP-Seq data

    Abstract


    Coupling chromatin immunoprecipitation (ChIP) with recently developed massively parallel sequencing technologies has enabled genome-wide detection of protein-DNA interactions with unprecedented sensitivity and specificity. This new technology, ChIP-Seq, presents opportunities for in-depth analysis of transcription regulation. In this study, we explore the value of using ChIP-Seq data to better detect and refine transcription factor binding sites (TFBS). We introduce a novel computational algorithm named Hybrid Motif Sampler (HMS), specifically designed for TFBS motif discovery in ChIP-Seq data. We propose a Bayesian model that incorporates sequencing depth information to aid motif identification. Our model also allows intra-motif dependency to describe more accurately the underlying motif pattern. Our algorithm combines stochastic sampling and deterministic “greedy” search steps into a novel hybrid iterative scheme. This combination accelerates the computation process. Simulation studies demonstrate favorable performance of HMS compared to other existing methods. When applying HMS to real ChIP-Seq datasets, we find that (i) the accuracy of existing TFBS motif patterns can be significantly improved; and (ii) there is significant intra-motif dependency inside all the TFBS motifs we tested; modeling these dependencies further improves the accuracy of these TFBS motif patterns. These findings may offer new biological insights into the mechanisms of transcription factor regulation.

    It is open access at


    Our HMS program is freely available at


    Please give HMS a try. I hope that you find it useful. Questions, comments, suggestions and criticisms are welcome and can be sent to me at [email protected]. Thank you very much.

    Best,
    Ming

  • #2
    Hey

    I am actually trying to get my hands on some ChIP-Seq data. Do you happen to know of a good dataset at SRA that I can download and try with HMS?

    Thanks!
    -Abhi



    • #3
      Hey, Abhi, thank you for your interest.

      All the sources of our data are listed in our supplementary document at


      The one at BCGSC
      ChIP-Seq Transcription Factor Data — by Steven Jones — last modified Dec 05, 2008

      contains datasets that we did not analyze. You can try those.

      I will try to find more and post them here later.


      Best,



      • #4
        Hi,
        I've read the paper and found that the background model is estimated from human promoter sequences. Could you also provide the source code for estimating the background Markov chain model? I am working with Drosophila sequences, so I can't use the existing Markov chain models.

        regards



        • #5
          Hi Hanat,

          Thank you for your interest in our program. You can find the C source code for the background model at the HMS website:



          The command for running this C code is described on page 22 of the HMS manual.
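
          For readers who only want to see the general idea before compiling the C code, here is a minimal Python sketch of estimating a fixed-order background Markov model from your own sequences (for example, Drosophila promoters). It is not the HMS implementation and does not write the file format HMS expects; the function names, the default order of 3, and the pseudocount are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import product


def read_fasta(lines):
    """Yield upper-cased sequences from an iterable of FASTA lines."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks).upper()
                chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def estimate_background(sequences, order=3, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) by counting.

    Hypothetical helper, not part of HMS. Windows containing anything
    other than A, C, G, T (such as N) are skipped. Returns a dict that
    maps each context string to a {base: probability} dict.
    """
    alphabet = "ACGT"
    counts = defaultdict(lambda: dict.fromkeys(alphabet, 0.0))
    for seq in sequences:
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            if base in alphabet and all(c in alphabet for c in context):
                counts[context][base] += 1
    model = {}
    # Enumerate every possible context so unseen ones still get a row.
    for ctx in map("".join, product(alphabet, repeat=order)):
        row = counts[ctx]
        total = sum(row.values()) + pseudocount * len(alphabet)
        model[ctx] = {b: (row[b] + pseudocount) / total for b in alphabet}
    return model


if __name__ == "__main__":
    demo = [">seq1", "ACGTACGTAC", ">seq2", "ACGNACGT"]
    model = estimate_background(read_fasta(demo), order=1)
    print(model["A"])  # transition probabilities out of context "A"
```

          The pseudocount keeps contexts that never occur in a small training set from getting zero probability. To feed the result to HMS you would still need to follow the format produced by the C program mentioned above.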



          Feel free to contact me if you have any questions.

          Best,

          Ming



          • #6
            HMS-Segmentation fault

            Hi,

            I tried the test run using the sample data described in the user manual. However, I keep getting a "segmentation fault" error, which is somewhat confusing. Usually a segmentation fault points to a memory problem. I tested on two Linux workstations, one with 8 GB RAM and one with 32 GB RAM, and got the same result on both.

            I then took the top 5 FASTA lines from the input and reran the command below, but still got the same result.

            ./hms -i top500.nrsf.hpeak.seq -w 21 -dna 4 -iteration 10 -chain 20 -seqprop 0.1 -strand 2 -nobase dep 2

            Could you please comment on what might be going wrong?

            DD




            • #7
              Found the problem

              Sorry, I found the problem!

              Disregard this question.....

              DD



              • #8
                Why not share your solution?



                • #9
                  I'm also having the same problem. What was the solution?



                  • #10
                    I'm also getting 'Segmentation fault'. Any ideas?



                    • #11
                      Originally posted by Retr0
                      I'm also getting 'Segmentation fault', any ideas ?

                      All the files in HMS_source_code should be in the same folder as the executable file.
