
Problem with Celera Assembler


  • Problem with Celera Assembler


    I have short-read sequences already in an FRG file (converted from FASTA to AMOS, then to FRG) that I would like to assemble with Celera Assembler. I run the following command:

    runCA -d testDir -p testPrefix testShortReads.frg

    After running several subprograms (gatekeeper, initialTrim, meryl, etc.) successfully, it fails while running overlapTrim:

    ERROR: Failed with signal ABRT (6)
    step OverlapTrim failed with 'Failed to build the obt store.'

    Can anyone help me with this problem?
    I would appreciate any help. Thanks.

  • #2
    CA will not assemble "short reads", if you mean Solexa or SOLiD.
    Have a look at their wiki at

    If you have FLX or Titanium data, then the information provided is not enough ;-)



    • #3
      Originally posted by sklages
      CA will not assemble "short reads", if you mean Solexa or SOLiD.
      Have a look at their wiki at

      If you have FLX or Titanium data, then the information provided is not enough ;-)

      OK, I should have said a little more about the sequences. They are 454 FLX. A sample of the original fasta file is here:

      >000038_0115_1501 length=95 uaccno=EYVY07101AKEWF

      However, I am giving CA the input data in FRG format, as required and as I said in the original message.

      Any hints?
      Thanks for your time.



      • #4
        This looks like 454 GS20 data. Are you sure it is FLX data?
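A quick way to check which platform the reads resemble is to summarise their lengths. GS20 reads are typically around 100 bases, while GS FLX reads are typically around 250 bases, and the sample header above reports length=95. The following is a minimal sketch, assuming a plain FASTA file; the function name `fasta_length_summary` is made up for this illustration and is not part of Celera Assembler.

```shell
#!/bin/sh
# Hypothetical helper (not part of CA): summarise read lengths in a FASTA file.
# Lengths are counted from the sequence lines themselves, not taken from
# the "length=" field in the header, and multi-line records are supported.
fasta_length_summary() {
    awk '
        function record(l) {
            n++; sum += l
            if (n == 1 || l < min) { min = l }
            if (n == 1 || l > max) { max = l }
        }
        /^>/ { if (seen) { record(len) } seen = 1; len = 0; next }
             { gsub(/[ \t\r]/, ""); len += length($0) }
        END  {
            if (seen) { record(len) }
            if (n) { printf "reads=%d min=%d mean=%.1f max=%d\n", n, min, sum / n, max }
            else   { print "reads=0" }
        }
    ' "$1"
}
```

Calling `fasta_length_summary reads.fasta` (where `reads.fasta` stands for your own file) prints one line with the read count and the minimum, mean, and maximum length.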

        Anyway, if possible, you should always start from the SFF files. Generating input for CA from SFF files is best done with

        There will (probably) be no general solution for this kind of failure; some error logs should have been written. Take a look at these.
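To locate those logs, it can help to search the run directory for files that mention an error. This is a minimal sketch; the function name `find_error_logs` and the file extensions searched (`.err`, `.out`, `.log`) are assumptions for illustration, so adjust them to whatever your run directory actually contains.

```shell
#!/bin/sh
# Hypothetical helper (not part of CA): list files under a run directory
# whose contents mention an error, failure, or abort (case-insensitive).
# The extensions below are an assumption about how the logs are named.
find_error_logs() {
    find "$1" -type f \( -name '*.err' -o -name '*.out' -o -name '*.log' \) \
        -exec grep -l -i -E 'error|fail|abort' {} +
}
```

For the run in the first post this would be called as `find_error_logs testDir`; each file it lists can then be read in full around the first error message.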

        Also take a look at the help page
        and contact the authors if you cannot solve the problem.

        Nevertheless, if you have a simple solution you should post it here ;-)



        • #5
          Hi All,
          I am trying to run Celera Assembler on the Sun Grid Engine with the options below.

          perl /usr/local/wgs-6.1/Linux-amd64/bin/ useGrid=1 scriptOnGrid=1 -d /roche/Trimmed_Reads/454/Unpaired/ -p grid_test /roche/Trimmed_Reads/454/Unpaired/*.frg ovlMemory="4GB --hashload 0.8 --hashstrings 100000" ovlThreads=2 ovlHashBlockSize=180000 ovlRefBlockSize=2000000 frgCorrBatchSize=200000 frgCorrThreads=2 ovlCorrBatchSize=800000 unitigger=bog

          The script executes, but I am getting a permission error in spite of changing the permissions on the bin directory containing the runCA command. I have pasted runCA.sge.out and the error messages below. I would be happy if someone could help me resolve this issue.


          Warning: no access to tty (Bad file descriptor).
          Thus no job control in this shell.
          /bin/.: Permission denied.
          syst=Linux: Command not found.
          arch=x86_64: Command not found.
          name=454rig.dhmriad.local: Command not found.
          arch: Undefined variable.

          # Attempt to (re)configure SGE. For reasons Bri doesn't know,
          # jobs submitted to SGE, and running under SGE, fail to read his
          # .tcshrc (or .bashrc, limited testing), and so they don't setup
          # SGE (or ANY other paths, etc) properly. For the record,
          # interactive SGE logins (qlogin, etc) DO set the environment.

          . $SGE_ROOT/$SGE_CELL/common/

          # On the off chance that there is a pathMap, and the host we
          # eventually get scheduled on doesn't see other hosts, we decide
          # at run time where the binary is.

          syst=`uname -s`
          arch=`uname -m`
          name=`uname -n`

          if [ "$arch" = "x86_64" ] ; then
          if [ "$arch" = "Power Macintosh" ] ; then


          /usr/bin/env perl $bin/runCA "useGrid=1" "scriptOnGrid=1" -d "/roche/Trimmed_Reads/454/Unpaired/" -p "grid_test" "/roche/Trimmed_Reads/454/Unpaired/FR6EAL4.frg" "/roche/Trimmed_Reads/454/Unpaired/FRK90FP0.frg" "/roche/Trimmed_Reads/454/Unpaired/FSIT0PR0.frg" "/roche/Trimmed_Reads/454/Unpaired/FTNMD73.frg" "/roche/Trimmed_Reads/454/Unpaired/FUORSTX0.frg" "/roche/Trimmed_Reads/454/Unpaired/GMEFJA40.frg" "ovlMemory=4GB --hashload 0.8 --hashstrings 100000" "ovlThreads=2" "ovlHashBlockSize=180000" "ovlRefBlockSize=2000000" "frgCorrBatchSize=200000" "frgCorrThreads=2" "ovlCorrBatchSize=800000" "unitigger=bog"
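One possible reading of the messages above, not confirmed anywhere in this thread: "Thus no job control in this shell", "Command not found." and "Undefined variable." are worded the way csh/tcsh reports errors, while the generated script uses Bourne-shell syntax (`syst=...` assignments, `if [ ... ] ; then`). If the job really is being interpreted by csh, Grid Engine's `qsub -S /bin/sh` option (or a `#$ -S /bin/sh` line in the script) selects the interpreter. The sketch below only checks a log for that csh-style wording; the function name `has_csh_style_errors` is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a job log contains error lines worded
# the way csh/tcsh words them. This is a heuristic on message text only;
# it does not prove which shell actually ran the job.
has_csh_style_errors() {
    grep -q -E ': (Command not found|Undefined variable)\.$|no job control in this shell' "$1"
}
```

For example, `has_csh_style_errors runCA.sge.out && echo "looks like csh output"` would flag the log quoted above.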



          • #6

            Are you sure that the directory from which you launch the command and the file path in your command line are the same?
            Making them the same could solve your permissions problem.


            • #7

              I am trying out Celera for de novo assembly of my 1.8 billion Illumina reads.
              Celera runCA version 6.1.
              I have a question about the `Sun Grid Engine Options`.
              On the following web page, they describe how to adjust the grid settings for a small Sanger dataset:
              But I can find no information on how to use this for a large Illumina dataset of 75-100 bp reads. Since I have 1.8 billion reads, I was wondering how best to adjust the CPU and memory settings for my kind of data with the Sun Grid Engine.

              Does anyone have experience with Celera and large Illumina datasets?
              I know they say CA1.6 should be able to assemble 1 billion reads, according to their website, so I am hoping it could work for more too!
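To get a feel for the scale involved, here is a rough back-of-the-envelope sketch. It assumes, as a simplification that is not taken from the CA documentation, that the number of overlap jobs is about (reads / ovlHashBlockSize) times (reads / ovlRefBlockSize). The function names are made up for this illustration, and the block sizes in the usage note are simply the ones quoted in post #5, not recommendations.

```shell
#!/bin/sh
# Ceiling division for positive integers.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Rough job-count estimate under the ASSUMPTION that every hash block is
# compared against every reference block; the real scheduler may differ.
# Arguments: number of reads, hash block size, reference block size.
estimate_overlap_jobs() {
    hash_blocks=$(ceil_div "$1" "$2")
    ref_blocks=$(ceil_div "$1" "$3")
    echo $(( hash_blocks * ref_blocks ))
}
```

With 1.8 billion reads and the block sizes from post #5, `estimate_overlap_jobs 1800000000 180000 2000000` gives 10000 hash blocks times 900 reference blocks, so the estimate under this assumption is 9000000 jobs, which suggests that block sizes chosen for a 454 dataset would need rethinking at this scale.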