
  • CisGenome Analysis

    I could not find any topic dedicated to this tool, so I decided to open a new one. If one already exists, feel free to merge.

    I am working with some output sequences from an Illumina GAII and trying to figure out how to analyze them.
    My ChIP is against histone modifications and I usually look at very broad distributions of signal (I would not call them "peaks", if you get what I mean).

    Now, I am trying to find a good tool to analyze them and it was suggested that I use CisGenome. I tried to figure out how it works, especially the latest version (CisGenomev2), which is supposed to be quite easy and user friendly.
    However, when I include my sample and control and call "peaks", I only get one peak, which I know cannot be right from other tools/observations.
    I am most likely making some mistake.

    If there is someone who knows the tool, here is my issue.

    1) I am using genome database (hg18)
    2) I include my sample (high signal) and control (low signal)
    3) I then select the parameters:

    # Read Extension Length E: 150 (what am I supposed to use here?)
    # Bin Size B: 500 (I tried to increase over 3000)
    # Half Window Size W: 1
    # Max Gap: 50
    # Min Peak: 100
    # Standardize Windows Statistics: checked (what am I supposed to use here?)
    # Win Stat Cutoff >= 3 (what am I supposed to use here?)
    # Apply Local Read Sampling Rate Filter: checked (I tried with and w/o this with no differences in the output)

    ## Local Rate Window: 10000
    ## Local Rate Cutoff: 1e-005

    # Boundary Refinement: checked

    ## Boundary Resolution: 5

    4) I start the search and it returns only one peak, in a place where I know the signal is high.

    I wonder how this can be possible.
    My sample and control differ a lot in terms of reads. Namely, the control has a lower depth. Does this influence the readout?

    I apologise for the long and complicated post, I hope someone can help.

    Thanks in advance.

  • #2
    I am not an expert. My opinion is only for your reference.

    1. Win Stat Cutoff >= 3 (what am I supposed to use here?)
    Choose this number from the exploration result: pick the cutoff at which negbinomial_exp/obs < 10%.
    2. Boundary refinement: I leave this unchecked for histone modifications. My thinking is that histone modifications differ from TF binding: TF binding needs signal on both strands, while a histone modification could show up on a single strand. But I may be wrong.
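
To make the cutoff rule in point 1 concrete, here is a minimal sketch in Python. It assumes the exploration result can be read as one row per window-statistic level, with the number of windows observed at that level and the number expected under the negative binomial null. The function name `pick_cutoff` and all numbers are made up for illustration; this is not CisGenome's own code or output format.

```python
def pick_cutoff(levels, observed, expected, max_ratio=0.10):
    """Return the smallest level whose expected/observed ratio is below max_ratio.

    levels   : candidate cutoff values, in increasing order
    observed : number of windows observed at each level
    expected : number of windows expected under the null model at each level
    """
    for level, obs, exp in zip(levels, observed, expected):
        # Skip levels with no observed windows to avoid dividing by zero.
        if obs > 0 and exp / obs < max_ratio:
            return level
    return None  # no level satisfies the criterion


# Hypothetical exploration table (illustrative numbers only).
levels = [1, 2, 3, 4, 5]
observed = [1000, 400, 120, 60, 30]
expected = [950, 200, 30, 5, 0.5]

cutoff = pick_cutoff(levels, observed, expected)
print("suggested cutoff:", cutoff)
```

With these made-up numbers the ratio first drops below 10% at level 4, so a cutoff of 3 would still let through a fairly large share of windows expected by chance.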


    • #3
      Before tweaking software parameters it might actually be helpful to look at the raw genomic read coverage to confirm that the raw data looks as expected, in your case containing broad regions of enrichment (relative to control). Otherwise you try to get something from the software that your data does not provide.

      How many mapped reads do you have? The difference in reads will matter a lot, as it is not simple (i.e. impossible) to normalize the read counts properly. Therefore you will probably get many false positives/negatives.
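
As a rough illustration of why depth matters, the sketch below bins read start positions, scales the control to the sample's library size, and reports a log2 enrichment per bin. Everything in it is an assumption for illustration: the function names, the toy positions, the simple total-count scaling, and the pseudocount. It is not how CisGenome normalizes, and total-count scaling is exactly the kind of crude correction that can mislead when the libraries differ a lot.

```python
import math
from collections import Counter


def bin_counts(positions, bin_size):
    """Count reads per bin, where a bin index is position // bin_size."""
    return Counter(pos // bin_size for pos in positions)


def log2_enrichment(sample_pos, control_pos, bin_size=500, pseudocount=1.0):
    """Per-bin log2(sample / scaled control) on a single chromosome."""
    if not sample_pos or not control_pos:
        raise ValueError("both read lists must be non-empty")
    sample = bin_counts(sample_pos, bin_size)
    control = bin_counts(control_pos, bin_size)
    # Crude depth correction: scale control counts to the sample's total.
    scale = len(sample_pos) / len(control_pos)
    result = {}
    for b in sorted(set(sample) | set(control)):
        s = sample.get(b, 0) + pseudocount
        c = control.get(b, 0) * scale + pseudocount
        result[b] = math.log2(s / c)
    return result


# Toy read start positions (hypothetical, one chromosome).
sample_reads = [100, 120, 480, 510, 2000, 2100, 2200, 2300]
control_reads = [90, 600, 2050, 4000]

enrichment = log2_enrichment(sample_reads, control_reads, bin_size=500)
for b, value in enrichment.items():
    print(f"bin {b} (start {b * 500}): log2 enrichment {value:+.2f}")
```

Note how a single control read in an otherwise empty bin gets doubled by the scaling and produces a clearly negative value. With a shallow control, this sort of noise is what drives false positives and negatives.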


      • #4
        Thanks a lot for the replies.

        I totally agree that tweaking should be the refining part. However, I think that even though the filtering removes a lot of reads, the total read numbers are fairly similar across the samples:
                raw           mapped       uniq. map.    non-redundant
        1a   39,462,703     29,738,325     24,113,845     22,722,389
        1b   37,286,139     28,958,710     23,448,540     21,408,884
        1c   39,499,025     29,346,076     24,366,428     22,060,482
        2a   33,161,351     26,928,919     22,220,217     17,646,007
        2b   36,682,484     28,621,303     23,846,721     20,401,399
        2c   39,406,479     28,787,186     24,292,959     17,977,853
        What do you think?
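
For anyone who wants to put numbers on "similar", a short script over the table above gives the fraction of raw reads that survive all filtering and the spread in final depth between samples. The counts are copied from the table; the variable names are just for illustration.

```python
# Columns: raw, mapped, uniquely mapped, non-redundant (from the table above).
counts = {
    "1a": (39_462_703, 29_738_325, 24_113_845, 22_722_389),
    "1b": (37_286_139, 28_958_710, 23_448_540, 21_408_884),
    "1c": (39_499_025, 29_346_076, 24_366_428, 22_060_482),
    "2a": (33_161_351, 26_928_919, 22_220_217, 17_646_007),
    "2b": (36_682_484, 28_621_303, 23_846_721, 20_401_399),
    "2c": (39_406_479, 28_787_186, 24_292_959, 17_977_853),
}

# Fraction of raw reads left after all filtering steps.
kept_fraction = {name: row[3] / row[0] for name, row in counts.items()}

# Fraction of uniquely mapped reads left after removing redundant reads.
nonredundant_fraction = {name: row[3] / row[2] for name, row in counts.items()}

# Ratio between the deepest and shallowest sample after filtering.
final_depths = [row[3] for row in counts.values()]
depth_ratio = max(final_depths) / min(final_depths)

for name in counts:
    print(
        f"{name}: {kept_fraction[name]:.1%} of raw kept, "
        f"{nonredundant_fraction[name]:.1%} of unique reads non-redundant"
    )
print(f"deepest / shallowest final depth: {depth_ratio:.2f}")
```

The non-redundant fraction is worth a look on its own, since a sample that loses noticeably more reads at that step than the others may have lower library complexity.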


        • #5
          I've gotten good output with SICER when looking at such data sets. Follow their advice on installing NumPy and SciPy in the ReadMe.txt. Other than that, it was pretty easy to get running.


