  • rikkomba
    Junior Member
    • Oct 2009
    • 4

    [ChIP-seq] MACS 1.4: assertion error

    Hi,

    I just started working with ChIP-seq and I am using MACS to predict binding sites in a fungal genome (12 Mb).
    As mentioned here, I am running into the same problem as some previous users: I get an assertion error when MACS calculates negative peaks. The only workaround I have found is to reduce the data to 75% of the reads; with 80% or more of the original reads, the error comes back.

    I have a control file with 31M 36 bp single-end reads (4.1 GB) and a sample file
    with 28M 36 bp single-end reads (3.6 GB).
    I tried it on both a laptop (4 GB RAM) and a server (16 GB RAM); in
    both cases MACS used 1.6 GB of memory and behaved the same way. I also tried it on Cistrome, with the same result.
    Changing the mfold parameter didn't help.
    With 75% of the data I did get sensible results, but I am not sure I can justify moving on after discarding 25% of the dataset. It adds another layer of complexity to the analysis.

    Does anybody know what causes the problem, and
    why reducing the amount of data works?
    Also, if anybody knows of a valid alternative tool, feel free to suggest it.
    Thanks!
  • taoliu
    Junior Member
    • Sep 2009
    • 3

    #2
    I just answered this question in the MACS user group, but since you also asked on SEQanswers, I am reposting it here.


    This error normally happens when you have too many reads for a very small genome. In your case, you used a whole GA2 lane to sequence a single factor in a genome like E. coli. Because of the extremely high coverage, an overflow error occurs: my function doesn't expect a Poisson rate higher than 740...
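To illustrate why a Poisson rate in that range can be a problem, here is a minimal sketch in Python. It is not MACS's actual code. It assumes the limit is related to double-precision arithmetic, where exp(-lambda) shrinks to zero once lambda is somewhat above 740. The function names `poisson_cdf_naive` and `poisson_cdf_log` are hypothetical.

```python
import math


def poisson_cdf_naive(k, lam):
    """P(X <= k) for X ~ Poisson(lam), built up from exp(-lam).

    For very large lam, exp(-lam) underflows to 0.0 in double
    precision, so every term (and the whole sum) collapses to zero.
    """
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        total += term
    return total


def poisson_cdf_log(k, lam):
    """Same quantity, with each term computed in log space.

    log P(X = i) = -lam + i * log(lam) - log(i!)
    The largest log term is factored out before exponentiating,
    so the sum stays within floating-point range.
    """
    log_terms = [
        -lam + i * math.log(lam) - math.lgamma(i + 1)
        for i in range(k + 1)
    ]
    biggest = max(log_terms)
    return math.exp(biggest) * sum(math.exp(t - biggest) for t in log_terms)


if __name__ == "__main__":
    for lam in (5.0, 700.0, 800.0):
        k = int(lam)
        print(lam, poisson_cdf_naive(k, lam), poisson_cdf_log(k, lam))
```

At moderate rates the two versions agree. At a rate of 800 the naive version returns zero, while the log-space version still gives a usable probability. Whether MACS fails for exactly this reason is an assumption here, not something stated in the post.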

    In practice, you should consider multiplexing so that a single lane covers multiple factors, or a single factor under multiple conditions or time points. 30 million reads for a single experiment on a 4-million-base genome is a big waste: you can even assemble the genome for this species now...

    Anyway, since you already have your 30 million reads, what you can do (instead of waiting for me to fix it (: ) is subsample your sequencing reads. My impression from human ChIP-seq is that reaching saturation for peak detection takes about 300 million reads (from our unpublished Nat Methods paper), which is equivalent to about 0.5 million reads in E. coli. You can use "samtools view -s" to subsample a portion of your BAM file.
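The scaling above can be turned into a rough subsampling fraction. The sketch below is only an illustration. It assumes the read target scales linearly with genome size, a human genome of about 3.1 Gb, and the 12 Mb genome and 28M sample reads from the original question. The function names, the seed and the file names are hypothetical. The only samtools behaviour relied on is that the argument to "samtools view -s" carries the random seed in its integer part and the fraction of reads to keep after the decimal point.

```python
def scaled_read_target(ref_reads, ref_genome_bp, genome_bp):
    """Scale a reference read count linearly by genome size (rule of thumb)."""
    return ref_reads * genome_bp / ref_genome_bp


def subsample_fraction(target_reads, total_reads):
    """Fraction of reads to keep; never more than everything."""
    return min(1.0, target_reads / total_reads)


def samtools_s_argument(seed, fraction):
    """Build the SEED.FRACTION value expected by `samtools view -s`."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    return f"{seed + fraction:.4f}"


if __name__ == "__main__":
    HUMAN_GENOME_BP = 3.1e9      # assumption: approximate human genome size
    HUMAN_SATURATION = 300e6     # figure quoted in the post above
    FUNGAL_GENOME_BP = 12e6      # from the original question
    SAMPLE_READS = 28e6          # from the original question

    target = scaled_read_target(HUMAN_SATURATION, HUMAN_GENOME_BP, FUNGAL_GENOME_BP)
    fraction = subsample_fraction(target, SAMPLE_READS)
    s_value = samtools_s_argument(42, fraction)  # 42 is an arbitrary seed

    print(f"target reads: {target:.0f}")
    print(f"fraction to keep: {fraction:.4f}")
    # sample.bam and sample.sub.bam are placeholder file names.
    print(f"samtools view -b -s {s_value} sample.bam > sample.sub.bam")
```

Under these assumptions the target comes out at a little over one million reads, far below the 75% of the data that already ran without the error. The same calculation would need to be repeated with the control file's read count.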
    梦蝶


    • rikkomba
      Junior Member
      • Oct 2009
      • 4

      #3
      Thanks for the explanation! I'll try downsampling, then.
