  • Farhat
    Member
    • Apr 2008
    • 21

    Fasta to Ace conversion

    Is there a program to convert a FASTA file to an ACE assembly file? While googling I came across references to fasta2ace.pl, but not the program itself.

    Thanks.
    Farhat Habib
  • bioinfosm
    Senior Member
    • Jan 2008
    • 483

    #2
    I am looking for the exact same tool ... fasta to ace, but have not succeeded yet.
    If it can use quality values, even better...

    The ACE file can then be used by EagleView to visualize reads on the reference.
    --
    bioinfosm

    Comment

    • Farhat
      Member
      • Apr 2008
      • 21

      #3
      I am looking for it for Eagleview as well.

      -Farhat
      Farhat Habib

      Comment

      • Torst
        Senior Member
        • Apr 2008
        • 275

        #4
        Originally posted by Farhat View Post
        Is there a program to convert a Fasta file to an Ace assembly file?
        Can you be a bit more precise on what you require?
        A FASTA file is just a bunch of sequences with an ID and a description.
        What form do you want the ACE file to take?

        Comment

        • kmcarr
          Senior Member
          • May 2008
          • 1181

          #5
          Farhat,

          I don't think it is possible to do what you are asking. FASTA files only contain ID/definition line(s) followed by sequence line(s). You may also have an accompanying quality score file. An ACE file contains much more information than this. For each contig (an ACE file may include more than one contig) it will contain the gapped sequence and quality scores, the gapped sequences of the constituent reads as well as offset information indicating where each of the constituent reads is located on the contig (reference). This information does not exist in the FASTA files so it would be impossible to construct a meaningful ACE file.
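To make the gap concrete, here is a minimal Python sketch contrasting what a FASTA record provides with what an ACE contig needs. The names `parse_fasta`, `PlacedRead` and `AceContig` are illustrative only and do not come from any ACE library.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


def _split_header(header: str, chunks: List[str]) -> Tuple[str, str, str]:
    """Split a FASTA header into ID and description, and join the sequence."""
    parts = header.split(None, 1)
    ident = parts[0] if parts else ""
    desc = parts[1] if len(parts) > 1 else ""
    return ident, desc, "".join(chunks)


def parse_fasta(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (id, description, sequence): everything a FASTA record holds."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _split_header(header, chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield _split_header(header, chunks)


@dataclass
class PlacedRead:
    """A read as an ACE contig needs it (hypothetical structure)."""
    name: str
    gapped_sequence: str        # read bases including pad characters
    offset: int                 # start on the padded consensus: NOT in FASTA
    complemented: bool = False  # strand: NOT in FASTA


@dataclass
class AceContig:
    """The per-contig information described above (hypothetical structure)."""
    name: str
    gapped_consensus: str
    consensus_qualities: List[int]
    reads: List[PlacedRead] = field(default_factory=list)


records = list(parse_fasta(">read1 first read\nACGT\nAC\n>read2\nGGTT\n"))
```

A FASTA record can fill `name` and the ungapped bases, but `offset` and `complemented` have to come from somewhere else, such as an aligner or assembler.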

          Comment

          • Farhat
            Member
            • Apr 2008
            • 21

            #6
            Thanks for the replies. Yes, I realize the FASTA file by itself doesn't have enough information to construct the ACE file. I wrote my own script to take in a FASTA file, a FASTQ quality file and the output from a SOAP or ELAND aligner and convert that to ACE, which does work with EagleView.
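The script itself was not posted in this thread, so the following is only a rough Python sketch of the general idea, not Farhat's implementation. It assumes the aligner output has already been parsed into ungapped hits (the `Hit` tuple and `to_ace` function are hypothetical names), that the reference qualities are supplied by the caller, and that the viewer tolerates a contig with no BS (base segment) lines, which has not been verified here. Gapped alignments, pads, and DS lines are not handled.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One ungapped alignment, e.g. parsed from SOAP or ELAND output."""
    read_name: str
    sequence: str   # read bases as they align to the forward reference strand
    start: int      # 1-based start position on the reference
    reverse: bool   # True if the read mapped to the reverse strand


def wrap(seq: str, width: int = 50) -> str:
    """Break a sequence into fixed-width lines."""
    return "\n".join(seq[i:i + width] for i in range(0, len(seq), width))


def to_ace(ref_name: str, ref_seq: str, ref_quals: List[int],
           hits: List[Hit]) -> str:
    """Build ACE text with one contig (the reference) and its placed reads."""
    out = [f"AS 1 {len(hits)}", ""]
    # CO <name> <bases> <reads> <base segments> <U|C>; 0 base segments here
    out.append(f"CO {ref_name} {len(ref_seq)} {len(hits)} 0 U")
    out.append(wrap(ref_seq))
    out.append("")
    # BQ: qualities for the contig consensus bases
    out.append("BQ")
    out.append(" ".join(str(q) for q in ref_quals))
    out.append("")
    # AF: where each read sits on the contig, and on which strand
    for h in hits:
        strand = "C" if h.reverse else "U"
        out.append(f"AF {h.read_name} {strand} {h.start}")
    out.append("")
    # RD/QA: the read bases and their (here, full-length) clipping ranges
    for h in hits:
        out.append(f"RD {h.read_name} {len(h.sequence)} 0 0")
        out.append(wrap(h.sequence))
        out.append("")
        out.append(f"QA 1 {len(h.sequence)} 1 {len(h.sequence)}")
        out.append("")
    return "\n".join(out)


ace = to_ace("chr", "ACGTACGT", [30] * 8,
             [Hit("r1", "ACGT", 1, False), Hit("r2", "TACG", 4, True)])
```

The `start` and `reverse` fields are exactly the information that the aligner output contributes and that the FASTA and FASTQ files lack.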
            Farhat Habib

            Comment

            • bioinfosm
              Senior Member
              • Jan 2008
              • 483

              #7
              Originally posted by Farhat View Post
              Thanks for the replies. Yes, I realize the FASTA file by itself doesn't have enough information to construct the ACE file. I wrote my own script to take in a FASTA file, a FASTQ quality file and the output from a SOAP or ELAND aligner and convert that to ACE, which does work with EagleView.
              That's great!
              I started writing a script of my own, but then got on to other things

              Farhat - is it possible for you to share the script for format conversion?
              --
              bioinfosm

              Comment

              • Farhat
                Member
                • Apr 2008
                • 21

                #8
                Originally posted by bioinfosm View Post
                That's great!
                I started writing a script of my own, but then got on to other things

                Farhat - is it possible for you to share the script for format conversion?
                Yes, but it is not very mature and has limitations. It works fine with EagleView but there seem to be issues making it work with pbShort. If you want it, PM me with your email.

                -F
                Farhat Habib

                Comment

                • jia
                  Junior Member
                  • Aug 2008
                  • 1

                  #9
                  I'm looking for exactly the same thing for eagleview too!!
                  Would you mind sharing your script with me? I'll send you a message shortly. Thanks!
                  Jia


                  Originally posted by Farhat View Post
                  Yes, but it is not very mature and has limitations. It works fine with EagleView but there seem to be issues making it work with pbShort. If you want it, PM me with your email.

                  -F

                  Comment

                  • nicolallias
                    Member
                    • Jan 2010
                    • 23

                    #10
                    Can I have a look at your script?
                    Thanks

                    nico l'allias

                    Comment

                    • jkbonfield
                      Senior Member
                      • Jul 2008
                      • 146

                      #11
                      This still makes no sense.

                      ACE is an assembly output, while FASTA is just a bunch of sequences with no assembly information. Are you asking for advice on what assembler to use? This will obviously depend a lot on the type of data and whether you want a de novo or mapped assembly.

                      James

                      PS. Contrary to above, I don't believe ACE supports quality values. At least I've never seen any - instead the authors of ace preferred to store qualities in "phd" files (in possibly the most inefficient format known to man). I'd love to be wrong on this though as it'll make my life easier. :-)

                      Comment

                      • maubp
                        Peter (Biopython etc)
                        • Jul 2009
                        • 1544

                        #12
                        Originally posted by jkbonfield View Post
                        PS. Contrary to above, I don't believe ACE supports quality values. At least I've never seen any - instead the authors of ace preferred to store qualities in "phd" files (in possibly the most inefficient format known to man). I'd love to be wrong on this though as it'll make my life easier. :-)
                        You can store PHRED qualities for a contig in an ACE file on BQ lines. I don't think the quality scores of the reads themselves are stored, which is probably what you meant.
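As a small illustration, the contig qualities on BQ lines can be pulled out with a few lines of plain Python. This is a sketch that assumes a BQ block consists of whitespace-separated integers and ends at a blank line or the next tagged line; `contig_qualities` is a made-up helper name, not a library function.

```python
from typing import Dict, List, Optional


def contig_qualities(ace_text: str) -> Dict[str, List[int]]:
    """Map each contig name to the consensus qualities from its BQ block."""
    quals: Dict[str, List[int]] = {}
    current: Optional[str] = None
    in_bq = False
    for line in ace_text.splitlines():
        fields = line.split()
        if not fields:
            in_bq = False              # a blank line closes the BQ block
            continue
        if fields[0] == "CO":
            current = fields[1]        # CO <contig name> ...
            in_bq = False
        elif fields[0] == "BQ" and current is not None:
            quals[current] = []
            in_bq = True
        elif in_bq:
            if all(f.isdigit() for f in fields):
                quals[current].extend(int(f) for f in fields)
            else:
                in_bq = False          # next tagged line, e.g. AF
    return quals


sample = (
    "AS 1 1\n\n"
    "CO c1 4 1 0 U\nACGT\n\n"
    "BQ\n 20 30\n 40 50\n\n"
    "AF r1 U 1\n"
)
parsed = contig_qualities(sample)
```

Note that nothing comparable exists for the reads: the RD blocks carry bases only.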

                        P.S. The MIRA assembly format (MAF, which is a bit like ACE), stores both - using FASTQ like encoding which is much more space efficient:
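To give a rough feel for the space saving (this is a generic FASTQ-style encoding assuming the Sanger offset of 33, not checked against the MAF specification), compare space-separated integers with one character per score:

```python
from typing import List


def encode_phred33(quals: List[int]) -> str:
    """Encode PHRED scores as one ASCII character each (offset 33)."""
    return "".join(chr(q + 33) for q in quals)


def decode_phred33(encoded: str) -> List[int]:
    """Recover PHRED scores from an offset-33 quality string."""
    return [ord(c) - 33 for c in encoded]


quals = [40, 40, 38, 30, 12, 2]
as_text = " ".join(str(q) for q in quals)  # BQ-style: "40 40 38 30 12 2"
as_fastq = encode_phred33(quals)           # FASTQ-style: one char per base

print(len(as_text), len(as_fastq))
```

For two-digit scores the integer form costs about three characters per base against one for the encoded form.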

                        Comment

                        • jkbonfield
                          Senior Member
                          • Jul 2008
                          • 146

                          #13
                          Getting off-topic, sorry.

                          However, MAF looks like a nice format. The problems of random ordering of data in CAF and the complete lack of sequence quality in ACE are one reason why I produced BAF, although it never really went anywhere and I only use it locally as an interchange format.

                          Certainly it's true that ACE and CAF are very cumbersome for next-gen data, while SAM/BAM have other major issues when it comes to mixed technologies (such as not supporting older capillary style assemblies with potentially more than two sequences per template).

                          A good find. :-)

                          Comment

                          • maubp
                            Peter (Biopython etc)
                            • Jul 2009
                            • 1544

                            #14
                            Originally posted by jkbonfield View Post
                            Getting off-topic, sorry.

                            However, MAF looks like a nice format. The problems of random ordering of data in CAF and the complete lack of sequence quality in ACE are one reason why I produced BAF, although it never really went anywhere and I only use it locally as an interchange format.
                            I think Bastien was thinking along the same lines when he came up with MAF for internal use in MIRA.
                            Originally posted by jkbonfield View Post
                            Certainly it's true that ACE and CAF are very cumbersome for next-gen data, while SAM/BAM have other major issues when it comes to mixed technologies (such as not supporting older capillary style assemblies with potentially more than two sequences per template).
                            I'd like the option to include the reference sequences (not just their names and lengths; and as a further option the reference quality scores) to make a SAM/BAM file self-contained. This is probably not important for people working on model organisms, but would seem useful for early stages of projects with draft assemblies, or if working on a new strain etc. It's something that ACE and other assembly formats have.
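For reference, this is roughly what a SAM header tells you about the references: each `@SQ` line carries a name (`SN`) and a length (`LN`), plus optional metadata tags, but not the bases themselves. The sketch below uses a made-up header and a hypothetical helper name, `reference_lengths`.

```python
from typing import Dict


def reference_lengths(sam_header: str) -> Dict[str, int]:
    """Collect reference name -> length from the @SQ lines of a SAM header."""
    refs: Dict[str, int] = {}
    for line in sam_header.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab separated TAG:VALUE pairs
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
        refs[tags["SN"]] = int(tags["LN"])
    return refs


# Illustrative header for a draft assembly (names and lengths are invented)
header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:draft_contig1\tLN:5000\n"
    "@SQ\tSN:draft_contig2\tLN:1200\n"
)
lengths = reference_lengths(header)
```

Anyone receiving only this file still needs the matching FASTA to see the reference bases, which is the limitation described above.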

                            Comment

                            • maubp
                              Peter (Biopython etc)
                              • Jul 2009
                              • 1544

                              #15
                              Originally posted by jkbonfield View Post
                              This still makes no sense.

                              ACE is an assembly output, while FASTA is just a bunch of sequences with no assembly information. Are you asking for advice on what assembler to use? This will obviously depend a lot on the type of data and whether you want a de novo or mapped assembly.
                              The original question was probably misleading. Farhat did later on say he was able to convert FASTQ reads into an ACE assembly by getting the missing information from the SOAP/ELAND alignment.

                              Comment
