
  • celera assembler to ace file problem

    Has anyone been able to work with ace files generated from Celera assemblies using the ca2ace utility in AMOS? We are finding that these files cannot actually be opened in consed. The problem seems to be a glitch in how the conversion utility writes the BS lines, which indicate which read is the source of the consensus base for each position in the contig sequence. Seems like something we can fix with some scripting, but it would be great if the conversion utility was modified to fix this issue. Any alternative suggestions for a workaround?

  • #2
    You could work with 'ca2ace' from the ca package at sourceforge. In both cases you will need to fix the TIME: tags in the DS line so they will match the TIME: tags in the phd file. This is necessary if you want to use consed without the '-nophd' switch.

    There are still occasional BS issues with 454 data; these can be fixed manually in the ace file itself.

    hth,
    Sven
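The TIME: fix described above can be scripted. Here is a minimal Python sketch, assuming the usual layouts: an ace DS line carrying a `TIME:` field, and a phd file with a `TIME:` line in its comment block. The function names `phd_time` and `sync_ds_time` are made up for illustration and are not part of consed, AMOS or ca2ace.

```python
import re


def phd_time(phd_text):
    """Return the value of the first TIME: line in a phd file, or None."""
    for line in phd_text.splitlines():
        if line.startswith("TIME:"):
            return line[len("TIME:"):].strip()
    return None


def sync_ds_time(ds_line, time_value):
    """Replace the TIME: field of an ace DS line, leaving other fields alone.

    Assumption: the TIME: value runs until the next ' KEYWORD: ' field
    (for example ' CHEM: ') or until the end of the line.
    """
    pattern = r"TIME: .*?(?= [A-Z_]+: |$)"
    # A function is used as the replacement so time_value is inserted literally.
    return re.sub(pattern, lambda m: "TIME: " + time_value, ds_line, count=1)
```

To use it, read the TIME: value from each read's phd file and rewrite the matching DS line in the ace file. Keep a backup of the ace file, since this has only been thought through against the layouts assumed above.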



    • #3
      Thanks. We had noticed the time tag issue. I was hoping there was an automated way of fixing the BS line problem, but it sounds like manual editing is needed. Do you happen to know what causes it?




      • #4
        Not really. We have never really examined what went wrong, as this only happens occasionally.

        I still have my own ca2ace on my todo list .. ;-)

        cheers,
        Sven



        • #5
          problems with celera assembler ace

          I ran an assembly using 454 reads (paired ends and Titanium paired ends), and when I load my ace file into consed I get this:

          no ~/.consedrc file so no user resources will be used--that's ok
          no ./.consedrc file so no project-specific resources--that's ok
          couldn't open readOrder.txt--that's ok
          0% done. 1 reads read so far...
          0% done. 2 reads read so far...
          0% done. 3 reads read so far...
          0% done. 4 reads read so far...
          0% done. 5 reads read so far...
          0% done. 6 reads read so far...
          0% done. 7 reads read so far...
          0% done. 8 reads read so far...
          0% done. 9 reads read so far...
          0% done. 10 reads read so far...
          0% done. 11 reads read so far...
          0% done. 12 reads read so far...
          0% done. 13 reads read so far...
          0% done. 14 reads read so far...
          0% done. 15 reads read so far...
          0% done. 16 reads read so far...
          0% done. 17 reads read so far...
          0% done. 18 reads read so far...
          0% done. 19 reads read so far...
          0% done. 1000 reads read so far...
          0% done. 2000 reads read so far...
          0% done. 3000 reads read so far...
          0% done. 4000 reads read so far...
          0% done. 5000 reads read so far...
          0% done. 6000 reads read so far...
          0% done. 7000 reads read so far...
          0% done. 8000 reads read so far...
          0% done. 9000 reads read so far...
          0% done. 10,000 reads read so far...
          1% done. 20,000 reads read so far...
          1% done. 30,000 reads read so far...
          2% done. 40,000 reads read so far...
          2% done. 50,000 reads read so far...
          3% done. 60,000 reads read so far...
          3% done. 70,000 reads read so far...
          4% done. 80,000 reads read so far...
          4% done. 90,000 reads read so far...
          5% done. 100,000 reads read so far...
          10% done. 200,000 reads read so far...
          15% done. 300,000 reads read so far...
          20% done. 400,000 reads read so far...
          25% done. 500,000 reads read so far...
          30% done. 600,000 reads read so far...
          35% done. 700,000 reads read so far...
          40% done. 800,000 reads read so far...
          46% done. 900,000 reads read so far...
          Base segment in contig 7180000185764 from padded cons pos 233 to 591 is not within read GDQTYB102FLAX8 which lies within padded cons pos 233 to 589 in contig 7180000185764
          Consed is repairing base segments.
          Base segment in contig 7180000185835 from padded cons pos 470 to 576 is not within read GDQTYB102HPTRP which lies within padded cons pos 470 to 575 in contig 7180000185835
          Consed is repairing base segments.
          Base segment in contig 7180000186047 from padded cons pos 1 to 451 is not within read GDQTYB102GBED1b which lies within padded cons pos -72 to 142 in contig 7180000186047
          Consed is repairing base segments.
          severe warning--cannot fix corrupted base segment at padded consensus base143
          Last base segment 0 in contig 7180000186047 is from 1 to 142 but contig ends at 451 so attempting to fix this
          severe warning--cannot fix corrupted base segment at padded consensus base143
          In contig 7180000186047 last base segment (0) should end on the last padded consensus base (451) but instead ends on 142
          exception thrown: assembly3.cpp:183 Failed assertion 'pContig->baseSegArray_.bGetDataStructureOk( true )'

          ace file: SCt_PE_RAW.ace
          Version 19.0 (090206)
          assembly3.cpp:183 Failed assertion 'pContig->baseSegArray_.bGetDataStructureOk( true )'
          ace file = SCt_PE_RAW.ace
          Version 19.0 (090206)


          I don't know what went wrong. Can anyone help me or give me directions on what to do?



          • #6
            Hi,

            just two questions ...

            - which version of celera assembler?
            - how did you convert from ASM to ACE?

            You may want to try http://asm2ace.sourceforge.net/ for asm-to-ace conversion.

            This issue is known and, AFAIK, can only be fixed manually. The consensus and the base segments disagree. You need to fix this by hand, shortening the contig mentioned so that it fits within the base segment boundaries. In some cases CA creates (ends of) contigs without (physical) read coverage.

            hth,
            Sven
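Before editing by hand, it can help to list which BS lines disagree with their reads. The sketch below does this under stated assumptions about the ace layout: `CO name bases reads segments U|C`, `AF read U|C padded_start`, `BS padded_start padded_end read` and `RD read padded_length ...`. A read's extent is taken as its AF start through AF start plus RD length minus one. The function names are hypothetical, and the script only reports problems; it does not repair the file.

```python
def _check(contig, af_start, rd_len, segments):
    """Return the BS entries of one contig that fall outside their read."""
    bad = []
    for seg_start, seg_end, read in segments:
        if read not in af_start or read not in rd_len:
            continue  # incomplete information for this read, skip it
        read_start = af_start[read]
        read_end = read_start + rd_len[read] - 1
        if seg_start < read_start or seg_end > read_end:
            bad.append((contig, read, seg_start, seg_end, read_start, read_end))
    return bad


def find_bad_base_segments(ace_text):
    """List (contig, read, bs_start, bs_end, read_start, read_end) tuples
    for every BS line that extends beyond the padded extent of its read."""
    results = []
    contig, af_start, rd_len, segments = None, {}, {}, []
    for line in ace_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        tag = fields[0]
        if tag == "CO" and len(fields) >= 6:
            # A new contig starts: check the previous one first, because its
            # RD lines only appear after its BS lines.
            results.extend(_check(contig, af_start, rd_len, segments))
            contig, af_start, rd_len, segments = fields[1], {}, {}, []
        elif tag == "AF" and len(fields) == 4:
            af_start[fields[1]] = int(fields[3])
        elif tag == "BS" and len(fields) == 4:
            segments.append((int(fields[1]), int(fields[2]), fields[3]))
        elif tag == "RD" and len(fields) >= 5:
            rd_len[fields[1]] = int(fields[2])
    results.extend(_check(contig, af_start, rd_len, segments))
    return results
```

Each tuple returned shows how far a base segment runs past its read, which is the part of the contig end that has no read coverage and would need trimming.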



            • #7
              I'm having similar error messages with trying to convert to ace format. I've tried asm2ace, but it chokes on my fragment files with multiple {LIB} sections. (At least I'm assuming that's why it chokes, because if I remove the 2nd {LIB} the asm2whatever.pl doesn't crash)

              I'm using Celera Assembler 6.1, and AMOS (toAmos and amos2ace) to convert from ASM to ACE format.

              I'm not sure I understand exactly what the issue is and how to shorten the contig. I'm pretty new at this, so apologies if this seems obvious. The actual warning/error messages I'm seeing from consed are below.

              Thanks,
              Arjun

              Code:
              Base segment in contig 150 from padded cons pos 98334 to 98732 is not within read F5ZRF5K02D5OURa which lies within padded cons pos 98334 to 98600 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base98601
              Base segment in contig 150 from padded cons pos 96886 to 97106 is not within read F5ZRF5K02D6XZHb which lies within padded cons pos 96886 to 97006 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base97007
              Base segment in contig 150 from padded cons pos 96619 to 96875 is not within read F5ZRF5K02EWTCTa which lies within padded cons pos 96619 to 96839 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base96840
              Base segment in contig 150 from padded cons pos 96234 to 96562 is not within read F5ZRF5K02E1MZRa which lies within padded cons pos 96234 to 96478 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base96479
              Base segment in contig 150 from padded cons pos 2860 to 3107 is not within read F5ZRF5K02C38X8b which lies within padded cons pos 2860 to 3057 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base3058
              Base segment in contig 150 from padded cons pos 1673 to 2116 is not within read F5ZRF5K02ENMKSb which lies within padded cons pos 1673 to 1858 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base1859
              Base segment in contig 150 from padded cons pos 1338 to 1672 is not within read F5ZRF5K02EDUAFb which lies within padded cons pos 1338 to 1669 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base1670
              Base segment in contig 150 from padded cons pos 519 to 1223 is not within read F5ZRF5K02DF2SBb which lies within padded cons pos 519 to 671 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base672
              Base segment in contig 150 from padded cons pos 210 to 518 is not within read F5ZRF5K02DXMSTb which lies within padded cons pos 210 to 463 in contig 150
              Consed is repairing base segments.
              severe warning--cannot fix corrupted base segment at padded consensus base464
              Base segment 0 of contig 150 is from 210 to 463 but should start at 1 so attempting to fix this
              severe warning--cannot fix corrupted base segment at padded consensus base1
              Last base segment 2128 in contig 150 is from 98334 to 98600 but contig ends at 98732 so attempting to fix this
              severe warning--cannot fix corrupted base segment at padded consensus base98601
              could not make base segments contiguous at padded pos 464 after read F5ZRF5K02DXMSTb
              could not make base segments contiguous at padded pos 672 after read F5ZRF5K02DF2SBb
              could not make base segments contiguous at padded pos 1670 after read F5ZRF5K02EDUAFb
              could not make base segments contiguous at padded pos 1859 after read F5ZRF5K02ENMKSb
              could not make base segments contiguous at padded pos 3058 after read F5ZRF5K02C38X8b
              could not make base segments contiguous at padded pos 96479 after read F5ZRF5K02E1MZRa
              could not make base segments contiguous at padded pos 96840 after read F5ZRF5K02EWTCTa
              could not make base segments contiguous at padded pos 97007 after read F5ZRF5K02D6XZHb
              Base segment 0 of contig 150 is not at position 1--it should be
              exception thrown: assembly3.cpp:183 Failed assertion 'pContig->baseSegArray_.bGetDataStructureOk( true )' 
              
              ace file: sepi.ace
              Version 19.0 (090206)
              assembly3.cpp:183 Failed assertion 'pContig->baseSegArray_.bGetDataStructureOk( true )' 
              ace file = sepi.ace
              Version 19.0 (090206)
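A log like the one above can be condensed into a list of affected reads and the number of padded positions by which each base segment overshoots its read. This is a rough Python sketch that relies only on the wording of the "Base segment in contig ..." messages shown in this thread; `summarize_consed_log` is a hypothetical helper, not a consed feature.

```python
import re

# Matches the consed message format quoted in this thread.
BAD_SEGMENT = re.compile(
    r"Base segment in contig (\S+) from padded cons pos (-?\d+) to (-?\d+) "
    r"is not within read (\S+) which lies within padded cons pos "
    r"(-?\d+) to (-?\d+)"
)


def summarize_consed_log(log_text):
    """Return one dict per reported base segment with its overhang sizes."""
    rows = []
    for match in BAD_SEGMENT.finditer(log_text):
        contig, read = match.group(1), match.group(4)
        seg_start, seg_end = int(match.group(2)), int(match.group(3))
        read_start, read_end = int(match.group(5)), int(match.group(6))
        rows.append({
            "contig": contig,
            "read": read,
            # Padded positions the segment claims before the read starts.
            "overhang_left": max(0, read_start - seg_start),
            # Padded positions the segment claims after the read ends.
            "overhang_right": max(0, seg_end - read_end),
        })
    return rows
```

A large `overhang_right` on the last segment of a contig points at a contig end without read coverage, which is the case described earlier in the thread.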



              • #8
                I am having trouble accessing asm2ace on Sourceforge. Can you provide an active link?
                Do you know of any other scripts that would convert a CA assembly to ace (or even bam)?
                Thanks.



                • #9
                  What kind of trouble did you have?
                  It can be downloaded here http://sourceforge.net/projects/asm2ace/files/

                  I haven't used CA for quite a while now, so I have no idea about other converters.

                  best,
                  Sven
