  • Fasta to Ace conversion

    Is there a program to convert a Fasta file to an Ace assembly file? While googling I came across references to fasta2ace.pl but no program itself.

    Thanks.
    Farhat Habib

  • #2
    I am looking for the exact same tool ... fasta to ace, but have not succeeded yet.
    If it can use quality values, even better...

    The ACE file can then be used by EagleView to visualize reads on a reference.
    --
    bioinfosm



    • #3
      I am looking for it for EagleView as well.

      -Farhat
      Farhat Habib



      • #4
        Originally posted by Farhat
        Is there a program to convert a Fasta file to an Ace assembly file?
        Can you be a bit more precise on what you require?
        A FASTA file is just a bunch of sequences with an ID and a description.
        What form do you want the ACE file to take?
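
To make the point concrete, here is a minimal Python sketch (not from the thread) of reading FASTA records. The function name `parse_fasta` is made up for illustration. All it can recover per record is an ID, an optional description and the bases; nothing in the file says where a read sits in an assembly.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (id, description, sequence) for each FASTA record."""
    seq_id: Optional[str] = None
    desc = ""
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seq_id is not None:
                yield seq_id, desc, "".join(chunks)
            header = line[1:].split(None, 1)
            seq_id = header[0] if header else ""
            desc = header[1] if len(header) > 1 else ""
            chunks = []
        elif seq_id is not None:
            chunks.append(line)  # sequence may span several lines
    if seq_id is not None:
        yield seq_id, desc, "".join(chunks)
```

Calling `list(parse_fasta(open("reads.fasta")))` on a hypothetical `reads.fasta` would give one tuple per read, and that is the full extent of the information available.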



        • #5
          Farhat,

          I don't think it is possible to do what you are asking. FASTA files only contain ID/definition line(s) followed by sequence line(s). You may also have an accompanying quality score file. An ACE file contains much more information than this. For each contig (an ACE file may include more than one contig) it will contain the gapped sequence and quality scores, the gapped sequences of the constituent reads as well as offset information indicating where each of the constituent reads is located on the contig (reference). This information does not exist in the FASTA files so it would be impossible to construct a meaningful ACE file.



          • #6
            Thanks for the replies. Yes, I realize the Fasta File by itself doesn't have enough information to construct the ACE file. I wrote my own script to take in a FASTA file, a FASTQ quality file and the output from a SOAP or ELAND aligner and convert that to ACE which does work with EagleView.
            Farhat Habib
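
For readers wondering what such a conversion involves, below is a rough Python sketch under stated assumptions. It is not Farhat's script, which was never posted in the thread. `PlacedRead` and `to_ace` are hypothetical names, the alignments are assumed to be ungapped and already parsed from the aligner output, and the record layout (AS, CO, BQ, AF, RD, QA) follows the ACE format as I understand it from the consed documentation. No BS (base segment) lines are written, which some tools may expect, so check the output against your viewer before relying on it.

```python
from typing import List, NamedTuple


class PlacedRead(NamedTuple):
    name: str      # read ID from the FASTA/FASTQ file
    seq: str       # read bases as they align to the forward strand
    start: int     # 1-based position of the first base on the reference
    reverse: bool  # True if the aligner reported a reverse-strand hit


def wrap(text: str, width: int = 50) -> List[str]:
    """Split a sequence into fixed-width lines."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def to_ace(contig: str, ref: str, ref_quals: List[int],
           reads: List[PlacedRead]) -> str:
    """Build a single-contig ACE text from a reference and placed reads."""
    lines = [f"AS 1 {len(reads)}", ""]
    # CO <name> <bases> <reads> <base segments> <U|C>; 0 segments here.
    lines.append(f"CO {contig} {len(ref)} {len(reads)} 0 U")
    lines += wrap(ref)
    lines += ["", "BQ"]
    for i in range(0, len(ref_quals), 50):
        lines.append(" ".join(str(q) for q in ref_quals[i:i + 50]))
    lines.append("")
    # AF lines carry the offset information that FASTA alone lacks.
    for r in reads:
        lines.append(f"AF {r.name} {'C' if r.reverse else 'U'} {r.start}")
    lines.append("")
    for r in reads:
        lines.append(f"RD {r.name} {len(r.seq)} 0 0")
        lines += wrap(r.seq)
        lines.append("")
        # QA: quality and alignment clipping; the whole read is kept here.
        lines.append(f"QA 1 {len(r.seq)} 1 {len(r.seq)}")
        lines.append("")
    return "\n".join(lines)
```

The `start` and `reverse` fields are exactly what has to come from the SOAP or ELAND output; the parsing of those files is left out because their column layouts are not described in the thread.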



            • #7
              Originally posted by Farhat
              Thanks for the replies. Yes, I realize the Fasta File by itself doesn't have enough information to construct the ACE file. I wrote my own script to take in a FASTA file, a FASTQ quality file and the output from a SOAP or ELAND aligner and convert that to ACE which does work with EagleView.
              That's great!
              I started writing a script of my own, but then got on to other things.

              Farhat - is it possible for you to share the script for format conversion?
              --
              bioinfosm



              • #8
                Originally posted by bioinfosm
                That's great!
                I started writing a script of my own, but then got on to other things.

                Farhat - is it possible for you to share the script for format conversion?
                Yes, but it is not very mature and has limitations. It works fine with EagleView but there seem to be issues making it work with pbShort. If you want it, PM me with your email.

                -F
                Farhat Habib



                • #9
                  I'm looking for exactly the same thing for EagleView too!
                  Would you mind sharing your script with me? I'll send you a message shortly. Thanks!
                  Jia


                  Originally posted by Farhat
                  Yes, but it is not very mature and has limitations. It works fine with EagleView but there seem to be issues making it work with pbShort. If you want it, PM me with your email.

                  -F



                  • #10
                    Can I have a look at your script?
                    Thanks

                    nico l'allias



                    • #11
                      This still makes no sense.

                      ACE is an assembly output, while FASTA is just a bunch of sequences with no assembly information. Are you asking for advice on what assembler to use? This will obviously depend a lot on the type of data and whether you want a de novo or mapped assembly.

                      James

                      PS. Contrary to above, I don't believe ACE supports quality values. At least I've never seen any - instead the authors of ace preferred to store qualities in "phd" files (in possibly the most inefficient format known to man). I'd love to be wrong on this though as it'll make my life easier. :-)



                      • #12
                        Originally posted by jkbonfield
                        PS. Contrary to above, I don't believe ACE supports quality values. At least I've never seen any - instead the authors of ace preferred to store qualities in "phd" files (in possibly the most inefficient format known to man). I'd love to be wrong on this though as it'll make my life easier. :-)
                        You can store PHRED qualities for a contig in an ACE file on BQ lines. I don't think the quality scores of the reads themselves are stored, which is probably what you meant.

                        P.S. The MIRA assembly format (MAF, which is a bit like ACE) stores both - using FASTQ-like encoding, which is much more space efficient.
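
A small Python illustration of both points, as a sketch only. `read_bq`, `encode_phred33` and `decode_phred33` are hypothetical helper names. The reader assumes a BQ block ends at the first blank line, and the encoder assumes the Sanger FASTQ convention of one ASCII character per score with an offset of 33; the exact encoding MAF uses is not spelled out in this thread.

```python
from typing import Iterable, List


def read_bq(ace_lines: Iterable[str]) -> List[int]:
    """Return the consensus qualities from the first BQ block."""
    quals: List[int] = []
    in_bq = False
    for raw in ace_lines:
        line = raw.strip()
        if line == "BQ":
            in_bq = True
            continue
        if in_bq:
            if not line:
                break  # assumed: a blank line closes the BQ block
            quals.extend(int(tok) for tok in line.split())
    return quals


def encode_phred33(quals: Iterable[int]) -> str:
    """One character per score: chr(score + 33), as in Sanger FASTQ."""
    return "".join(chr(q + 33) for q in quals)


def decode_phred33(text: str) -> List[int]:
    """Inverse of encode_phred33."""
    return [ord(c) - 33 for c in text]


ace_fragment = ["CO c1 5 1 0 U", "ACGTA", "", "BQ", "40 40 38", "20 2", ""]
quals = read_bq(ace_fragment)            # [40, 40, 38, 20, 2]
as_numbers = " ".join(map(str, quals))   # 13 characters
as_fastq = encode_phred33(quals)         # "IIG5#", 5 characters
```

Space-separated integers cost two or three characters per base, while the FASTQ-style string always costs exactly one, which is where the saving comes from.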



                        • #13
                          Getting off-topic, sorry.

                          However, MAF looks like a nice format. The problems of random ordering of data in CAF and the complete lack of sequence quality in ACE are one reason why I produced BAF, although it never really went anywhere and I only use it locally as an interchange format.

                          Certainly it's true that ACE and CAF are very cumbersome for next-gen data, while SAM/BAM have other major issues when it comes to mixed technologies (such as not supporting older capillary style assemblies with potentially more than two sequences per template).

                          A good find. :-)



                          • #14
                            Originally posted by jkbonfield
                            Getting off-topic, sorry.

                            However, MAF looks like a nice format. The problems of random ordering of data in CAF and the complete lack of sequence quality in ACE are one reason why I produced BAF, although it never really went anywhere and I only use it locally as an interchange format.
                            I think Bastien was thinking along the same lines when he came up with MAF for internal use in MIRA.
                            Originally posted by jkbonfield
                            Certainly it's true that ACE and CAF are very cumbersome for next-gen data, while SAM/BAM have other major issues when it comes to mixed technologies (such as not supporting older capillary style assemblies with potentially more than two sequences per template).
                            I'd like the option to include the reference sequences (not just their names and lengths; and as a further option the reference quality scores) to make a SAM/BAM file self-contained. This is probably not important for people working on model organisms, but would seem useful for early stages of projects with draft assemblies, or when working on a new strain, etc. It's something that ACE and other assembly formats have.
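
To show what "just their names and lengths" means in practice, here is a minimal Python sketch that reads the `@SQ` lines of a SAM header. `reference_lengths` is a hypothetical helper, not part of any library. Each `@SQ` line carries an `SN` (name) and `LN` (length) tag; as far as I know the optional tags can hold things like a checksum or a URI pointing at the sequence, but not the bases themselves.

```python
from typing import Dict, Iterable


def reference_lengths(sam_lines: Iterable[str]) -> Dict[str, int]:
    """Map reference name to length using the @SQ header lines."""
    refs: Dict[str, int] = {}
    for raw in sam_lines:
        line = raw.rstrip("\n")
        if not line.startswith("@"):
            break  # header lines come before the alignment records
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue
        # Header fields after the record type are TAG:VALUE pairs.
        tags = dict(field.split(":", 1) for field in fields[1:])
        refs[tags["SN"]] = int(tags["LN"])
    return refs
```

Anything beyond this, such as the draft contig sequence itself, has to be shipped as a separate FASTA file alongside the SAM/BAM.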



                            • #15
                              Originally posted by jkbonfield
                              This still makes no sense.

                              ACE is an assembly output, while FASTA is just a bunch of sequences with no assembly information. Are you asking for advice on what assembler to use? This will obviously depend a lot on the type of data and whether you want a de novo or mapped assembly.
                              The original question was probably misleading. Farhat did later on say he was able to convert FASTQ reads into an ACE assembly by getting the missing information from the SOAP/ELAND alignment.
