.abi to fasta/fastq conversion script/program?


  • .abi to fasta/fastq conversion script/program?

    Hi All,
    We have Sanger sequencing data that we'd like to incorporate into a de novo 454 bacterial genome assembly using MIRA. However, the core facility that does our Sanger sequencing can only provide .abi files, not the more useful FASTA/FASTQ files. Any suggestions for a program or script with useful batch conversion features?

  • #2
    Hi,
    in general, it is always better to have the original run/chromatogram data.

    You should basecall the data rather than only extracting the sequence from the .abi file.

    There are two (common) basecalling programs:

    1) phred (http://www.phrap.org/consed/consed.html#howToGet)
    2) TraceTuner (https://sourceforge.net/projects/tracetuner/)

    Both produce sequences in fasta format with qualities ...

    You should also remove vector and low-quality sequence. For Sanger data, lucy does a good job (https://sourceforge.net/projects/lucy/).

    After basecalling and vector/low-quality clipping you can go on to MIRA.
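
To give a feel for what low-quality clipping means, here is a minimal Python sketch. It is not lucy's algorithm: it simply finds the stretch of bases with the highest total of (quality minus threshold), using a Kadane-style maximum-scoring-segment scan. The function name `clip_by_quality` and the default threshold of 20 are assumptions chosen for illustration only.

```python
def clip_by_quality(quals, threshold=20):
    """Return (start, end) of the best-quality segment; end is exclusive.

    Each base scores (quality - threshold); the segment with the highest
    total score is kept. Returns (0, 0) if no base scores above zero.
    NOTE: illustrative sketch only, not the method used by lucy or phred.
    """
    best_sum = 0
    best = (0, 0)
    cur_sum = 0
    cur_start = 0
    for i, q in enumerate(quals):
        if cur_sum <= 0:
            # A non-positive running total cannot help; restart here.
            cur_sum = 0
            cur_start = i
        cur_sum += q - threshold
        if cur_sum > best_sum:
            best_sum = cur_sum
            best = (cur_start, i + 1)
    return best


if __name__ == "__main__":
    quals = [5, 8, 30, 35, 40, 38, 10, 4]
    start, end = clip_by_quality(quals)
    print(start, end)
```

The returned pair can be used to slice both the sequence and its quality list, for example `seq[start:end]`. Vector clipping is a separate step and is not covered by this sketch.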

    cheers,
    Sven


    • #3
      Thanks!

      Much appreciated!


      • #4
        Currently TraceTuner's tool ttuner does not offer FASTQ output.

        You can use ttuner to output FASTA+QUAL and then use a script to merge them into a FASTQ file. Alternatively, you can use ttuner to output PHD files and convert those into FASTQ.

        Since it is open source (GPL v2 or later), I've written a patch to ttuner to directly support FASTQ output:
        https://sourceforge.net/tracker/?fun...16&atid=895507
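
The FASTA+QUAL merge mentioned above could look roughly like the following Python sketch. It assumes the two files list the same reads in the same order, that quality values are whitespace-separated integers, and that the output should use the Sanger (phred+33) FASTQ encoding. The function names `read_records` and `fasta_qual_to_fastq` are made up for this example.

```python
from itertools import zip_longest


def read_records(lines):
    """Yield (name, body_lines) from FASTA-style text (.fasta or .qual)."""
    name, body = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, body
            # Keep only the identifier, dropping any description text.
            fields = line[1:].split()
            name, body = (fields[0] if fields else ""), []
        else:
            body.append(line)
    if name is not None:
        yield name, body


def fasta_qual_to_fastq(fasta_lines, qual_lines):
    """Yield FASTQ records as strings, pairing FASTA and QUAL entries."""
    pairs = zip_longest(read_records(fasta_lines), read_records(qual_lines))
    for fasta_rec, qual_rec in pairs:
        if fasta_rec is None or qual_rec is None:
            raise ValueError("FASTA and QUAL files have different record counts")
        (name, seq_lines), (qual_name, score_lines) = fasta_rec, qual_rec
        if name != qual_name:
            raise ValueError(f"record mismatch: {name!r} vs {qual_name!r}")
        seq = "".join(seq_lines)
        quals = [int(q) for q in " ".join(score_lines).split()]
        if len(seq) != len(quals):
            raise ValueError(f"length mismatch for {name!r}")
        # Sanger FASTQ stores each quality as chr(q + 33), capped at 93.
        qual_str = "".join(chr(min(max(q, 0), 93) + 33) for q in quals)
        yield f"@{name}\n{seq}\n+\n{qual_str}\n"


if __name__ == "__main__":
    fasta = [">read1 example", "ACGT", "AC"]
    qual = [">read1", "40 30 20 10", "0 93"]
    for record in fasta_qual_to_fastq(fasta, qual):
        print(record, end="")
```

For real files, pass open file handles instead of lists, since both functions only iterate over lines.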


        • #5
          Patch to the patch

          Dear Maubp,

          Thanks for the patch! A very useful extension to a very useful program! Unfortunately, after applying your patch it did not work straight away (for me at least), so I made some adaptations to your patch and posted them at:
          https://sourceforge.net/tracker/?fun...16&atid=895507

          Cheers,
          Bart


          • #6
            Originally posted by BratdaKing:
            Dear Maubp,

            Thanks for the patch! A very useful extension to a very useful program! Unfortunately, after applying your patch it did not work straight away (for me at least), so I made some adaptations to your patch and posted them at:
            https://sourceforge.net/tracker/?fun...16&atid=895507

            Cheers,
            Bart

            Hi Bart,

            What did you need to change and why? A quick verbal summary would be great (since the diff formats we used are different, comparing them by eye is a pain).

            I should write back to the tracetuner guys about getting this merged in...

            Peter

            P.S. Wouldn't it have been simpler to post the patch on the existing feature request rather than filing a new one?
            Last edited by maubp; 10-18-2010, 06:08 AM. Reason: formatting


            • #7
              Hi, I know this thread is very old, but I'm trying to do the same thing that AppleInformatics wanted to do.

              I have no idea how to use TraceTuner.
              Could you give me some advice on how to run it?


              • #8
                - download the archive, extract it, and change directory to tracetuner_3.0.6beta/src
                - read the README
                - type
                make
                - type
                ../rel/Linux_64/ttuner -h
                ... assuming you are running a Linux system.


                • #9
                  Yes, I'll at least use the command-line version. I was wondering whether I could use the viewer
                  (java -jar ttuner_tools.jar). In the log file I got:

                  Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: No such child: 0
                  at java.awt.Container.getComponent(Container.java:306)
                  at com.paracel.tt.util.Filespec.setSingleSelection(Filespec.java:509)
                  at com.paracel.tt.util.Filespec.<init>(Filespec.java:64)
                  at com.paracel.tt.run.TTrun.<init>(TTrun.java:67)
                  at com.paracel.tt.run.TTrun.main(TTrun.java:553)


                  • #10
                    Yep, same for me. I never used the jar file; ttuner works just fine :-)

                    A few years ago we trained our ttuner with a few tens of thousands of ABI traces
                    for better performance.

                    You might want to give phred a try as well; it is a bit faster and the
                    de facto standard in Sanger sequencing.


                    • #11
                      This is a promising thread! However, I'm stuck on using the command line for ttuner. I keep getting an error saying I haven't provided an output format. Can someone give me a common command usage for ttuner? I'd like to convert multiple phd.1 files into FASTQ.

                      I've used:

                      ttuner -id /directory/*.phd.1 -fd /directory/


                      • #12
                        Little More Progress on Using ttuner - still need help

                        So, I've figured out that I had ordered the commands improperly. I am now writing it as:

                        ttuner -fd /directory/ -qd /directory/ -if /directory/*phd.1

                        I get "Can't stat: 2: No such file or directory"

                        I'm pretty sure it is the *phd.1 that is causing the problems. Can anyone suggest how to input multiple .phd.1 files? When I remove the *phd.1 the program automatically looks for .ab1 files.


                        • #13
                          ttuner is basecalling software: it reads chromatograms such as ab1 files, determines the base sequence and its qualities, and finally writes some other format, either chromatogram files (SCF) or a textual representation (fasta, phd, etc.). You cannot convert phd to fastq this way.

                          You might want to have a look at 'convert_project' from the MIRA assembly package (http://mira-assembler.sourceforge.ne...mutils_convpro).
                          There are other options too; Bioperl might be interesting (http://www.bioperl.org/wiki/HOWTO:SeqIO).
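
If installing another package is not an option, a PHD file is simple enough to convert by hand. The Python sketch below assumes the usual phred layout (a `BEGIN_SEQUENCE name` line, then a `BEGIN_DNA` ... `END_DNA` block with one `base quality position` line per base) and writes Sanger (phred+33) FASTQ. All function names here are hypothetical, and the `*.phd.1` pattern is taken from the question above.

```python
import glob
import os


def parse_phd(lines):
    """Return (name, sequence, qualities) from the lines of one PHD file."""
    name = None
    bases, quals = [], []
    in_dna = False
    for line in lines:
        line = line.strip()
        if line.startswith("BEGIN_SEQUENCE"):
            parts = line.split(None, 1)
            if len(parts) == 2:
                name = parts[1]
        elif line == "BEGIN_DNA":
            in_dna = True
        elif line == "END_DNA":
            in_dna = False
        elif in_dna and line:
            # Each DNA line is "base quality [peak position]".
            fields = line.split()
            bases.append(fields[0])
            quals.append(int(fields[1]))
    if name is None:
        raise ValueError("no BEGIN_SEQUENCE line with a read name found")
    return name, "".join(bases), quals


def phd_to_fastq(lines):
    """Return one FASTQ record (as a string) for one PHD file."""
    name, seq, quals = parse_phd(lines)
    qual_str = "".join(chr(min(max(q, 0), 93) + 33) for q in quals)
    return f"@{name}\n{seq}\n+\n{qual_str}\n"


def convert_directory(phd_dir, fastq_path):
    """Convert every *.phd.1 file in phd_dir into a single FASTQ file."""
    paths = sorted(glob.glob(os.path.join(phd_dir, "*.phd.1")))
    with open(fastq_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                out.write(phd_to_fastq(handle))
    return len(paths)


if __name__ == "__main__":
    example = [
        "BEGIN_SEQUENCE read1", "",
        "BEGIN_COMMENT", "CHROMAT_FILE: read1.ab1", "END_COMMENT", "",
        "BEGIN_DNA", "a 9 6", "c 40 18", "g 20 30", "END_DNA", "",
        "END_SEQUENCE",
    ]
    print(phd_to_fastq(example), end="")
```

Doing the wildcard matching inside the script with `glob` means the whole directory is handled in one call, with no need to pass a file list on the command line. Biopython's `Bio.SeqIO` also lists "phd" and "fastq" among its supported formats, which may be a more robust route than a hand-written parser.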

                          hth, Sven
