  • veronique
    Junior Member
    • Apr 2009
    • 1

    using sff_extract

    Being a newbie, I have a very simple question:

    I am trying to convert sff-formatted 454 data into fasta, fasta quality and xml files using sff_extract under Python 2.6, with the following command:

    sff_extract -s proj_in.454.fasta -q proj_in.454.fasta.qual -x proj_traceinfo_in.454.xml FX3UAMY01.sff

    This results in a syntax error. Does anyone have a suggestion as to what's wrong?

    Thanks, Veronique
  • AnthonyB
    Junior Member
    • Sep 2008
    • 8

    #2
    Hi,

    AFAIK you only use the -s, -q or -x options if you want just one of the files. To get all three you only need the -o option to set the output name (if you want), followed by the names of the sff files. You have to rename the xml file manually afterwards, however. E.g. "sff_extract -o proj_in.454 FX3UAMY01.sff" should output proj_in.454.fasta, proj_in.454.fasta.qual and proj_in.454.xml.
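
The naming described above can be sketched in Python. This is only an illustration of the file-name pattern mentioned in this thread: the helper names `expected_outputs`, `traceinfo_name` and `rename_xml` are made up for the example and are not part of sff_extract, and the "_traceinfo" pattern is inferred from the file names used in the first post.

```python
import os


def expected_outputs(prefix):
    """File names that 'sff_extract -o <prefix>' is described as writing."""
    return [prefix + ".fasta", prefix + ".fasta.qual", prefix + ".xml"]


def traceinfo_name(prefix):
    """Turn e.g. 'proj_in.454' into 'proj_traceinfo_in.454.xml'.

    Assumes the prefix looks like '<project>_in.<technology>'.
    """
    project, sep, rest = prefix.rpartition("_in.")
    if not sep:
        raise ValueError("prefix does not contain '_in.': " + prefix)
    return project + "_traceinfo_in." + rest + ".xml"


def rename_xml(prefix):
    """Rename '<prefix>.xml' to the traceinfo-style name, if it exists."""
    src = prefix + ".xml"
    dst = traceinfo_name(prefix)
    if os.path.exists(src):
        os.rename(src, dst)
    return dst
```

Calling `rename_xml("proj_in.454")` after the extraction would then take care of the manual rename step, provided the assumed naming matches what your assembler expects.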


    • Torst
      Senior Member
      • Apr 2008
      • 275

      #3
      I just use the "sffinfo" tool that 454 distribute.

      sffinfo -s file.sff > file.fasta
      sffinfo -q file.sff > file.qual
      sffinfo -m file.sff > file.manifest.xml


      Here is the usage:

      Usage: sffinfo [options...] [- | sfffile] [accno...]
      Options:
      -s or -seq Output just the sequences
      -q or -qual Output just the quality scores
      -f or -flow Output just the flowgrams
      -t or -tab Output the seq/qual/flow as tab-delimited lines
      -n or -notrim Output the untrimmed sequence or quality scores
      -m or -mft Output the manifest text


      • Jose Blanca
        Member
        • Aug 2009
        • 70

        #4
        What is the error message?


        • foolishbrat
          Member
          • Nov 2008
          • 45

          #5
          Is there a way I can download "sffinfo"? I searched Google, but to no avail.


          • Jose Blanca
            Member
            • Aug 2009
            • 70

            #6
            I think sffinfo is not free software.


            • AnthonyB
              Junior Member
              • Sep 2008
              • 8

              #7
              I think you can only obtain sffinfo as part of the sfftools package that Roche supplies to groups that have the 454 sequencing machines. I'm pretty sure this was the motivation for writing sff_extract, as many people cannot easily get access to the 454 software needed to extract the reads from sff files.


              • maubp
                Peter (Biopython etc)
                • Jul 2009
                • 1544

                #8
                Originally posted by Jose Blanca:
                > I think sffinfo is not free software.
                However, Roche are generally relaxed about giving end users access, see:
                Pyrosequencing in picotiter plates, custom arrays for enrichment/decomplexing. (Roche)


                Note that the Newbler tools (sffinfo, plus the assembler and read mapper) are for Linux only.
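
For readers without access to the Roche tools, Biopython's SeqIO module can also read SFF files. The following is a minimal sketch, assuming Biopython is installed; the function name and the file paths passed to it are placeholders. The "sff-trim" format is used on the assumption that you want the clipped reads, while "sff" keeps the full sequence.

```python
def sff_to_fasta_qual(sff_path, fasta_path, qual_path, trim=True):
    """Convert one SFF file to FASTA and QUAL files using Biopython.

    Returns the number of reads written to each file.
    """
    # Imported here so this module can be loaded without Biopython installed.
    from Bio import SeqIO

    # "sff-trim" applies the clipping stored in the SFF; "sff" keeps all bases.
    in_format = "sff-trim" if trim else "sff"
    n_fasta = SeqIO.convert(sff_path, in_format, fasta_path, "fasta")
    n_qual = SeqIO.convert(sff_path, in_format, qual_path, "qual")
    return n_fasta, n_qual
```

For example, `sff_to_fasta_qual("file.sff", "file.fasta", "file.qual")` should produce files comparable to the sffinfo -s and -q output, although header details may differ.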


                • chayan
                  Member
                  • Nov 2012
                  • 52

                  #9
                  I am new to this. I want to convert an Ion Torrent sff file to fastq and xml for assembly with MIRA.
                  I used the command:
                  sff_extract -s sample_in.iontor.fastq -x sample_traceinfo_in.iontor.xml sample.sff

                  and every time I get the error message:
                  usage: sff_extract [-h] [-o OUTPUT] [-c] [--min_left_clip MIN_LEFT_CLIP]
                  [--max_percentage MAX_PERCENT] [--version]
                  [input [input ...]]
                  sff_extract: error: unrecognized arguments: -s -x sample_traceinfo_in.iontor.xml sample.sff

                  Any help?


                  • Jose Blanca
                    Member
                    • Aug 2009
                    • 70

                    #10
                    It seems that you are using the new sff_extract included in the seq_crumbs package. This new sff_extract has a new interface and its behaviour has changed a little. I recommend that you use this new version; just change the parameters accordingly.
                    The support for the xml file has been removed, because it is not required any more by MIRA. If you are sure that you need it for the latest MIRA versions just let me know and we'll do something.
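
The usage line in the error message quoted earlier in this thread shows why the old options fail: -s and -x are simply not defined any more. As a rough illustration, the Python sketch below rebuilds a parser from that usage line alone. It is a reconstruction for demonstration, not the actual seq_crumbs source, and it assumes that -o names the output sequence file.

```python
import argparse


def build_parser():
    """Parser reconstructed from the usage text quoted in this thread."""
    parser = argparse.ArgumentParser(prog="sff_extract")
    parser.add_argument("-o", dest="output", metavar="OUTPUT")
    parser.add_argument("-c", action="store_true")
    parser.add_argument("--min_left_clip", metavar="MIN_LEFT_CLIP")
    parser.add_argument("--max_percentage", metavar="MAX_PERCENT")
    parser.add_argument("--version", action="version", version="sketch")
    parser.add_argument("input", nargs="*")
    return parser


old_style = ["-s", "sample_in.iontor.fastq",
             "-x", "sample_traceinfo_in.iontor.xml", "sample.sff"]
new_style = ["-o", "sample_in.iontor.fastq", "sample.sff"]

# parse_known_args returns unknown arguments instead of exiting with an error.
_, unknown_old = build_parser().parse_known_args(old_style)
parsed_new, unknown_new = build_parser().parse_known_args(new_style)

print("rejected from old-style call:", unknown_old)
print("new-style output:", parsed_new.output, "inputs:", parsed_new.input)
```

With the old-style arguments, -s and -x end up in the list of unrecognized arguments, whereas the -o form parses cleanly with the sff file as the only input.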


                    • chayan
                      Member
                      • Nov 2012
                      • 52

                      #11
                      Thank you for your suggestion, but I am afraid that the latest version of MIRA (mira-3.4.1.1) does need both the xml and fastq files for de novo assembly, so I have to use sff_extract to get the output in xml format as well. Can you help me?


                      • Jose Blanca
                        Member
                        • Aug 2009
                        • 70

                        #12
                        From the Mira mailing list:

                        On Aug 13, 2012, at 15:12 , sourakhata tiréra wrote:
                        > [...]
                        > The problem is that I trimmed my initial fastq file and some reads are missing because they come from symbiote micro-organisms of the tick.
                        > Now MIRA doesn't work with the trimmed fastq file. Can MIRA work with 454 data without the xml file?
                        Yes, it can. "--notraceinfo" is probably what you are looking for.

                        > I thought about a way to just give MIRA the IDs of the reads that I don't want to cluster. Is it possible?
                        I'm not sure I got this question right. Do you want to cluster some reads but others not? If yes, what should MIRA do with the reads not-to-be-clustered?

                        B.

                        If you trim the reads, I think the xml file is irrelevant, but I would ask on the MIRA mailing list just in case.


                        • chayan
                          Member
                          • Nov 2012
                          • 52

                          #13
                          OK, thanks for the info, I did not know this. So the fastq file alone is enough to run MIRA for a de novo assembly? If so, should I use "--notraceinfo"?


                          • Jose Blanca
                            Member
                            • Aug 2009
                            • 70

                            #14
                            You would have to trim the reads before feeding them to MIRA. I guess that the latest version does not require "--notraceinfo", but I recommend that you ask on the MIRA mailing list, because I haven't used MIRA for a while.


                            • chayan
                              Member
                              • Nov 2012
                              • 52

                              #15
                              Dear Blanca,
                              I asked on the MIRA mailing list, and Bastien recommends that I use the XML files as well for MIRA. He also pointed out that the mate-splitting tools are missing from the latest version of the seq_crumbs package. Additionally, yes, MIRA can use clipped data, but for a better and more accurate result the XML is also required, especially when working with Ion Torrent data.
                              Here's why:
                              "E.g.: assume a sequence without left clips, but with two right clips. The quality clip at 15, the adaptor clip at 52 (visualised with spaces in the next line)

                              GTACGATCGAAAAA aaaaaaaaaaattttttttttttttttttttaaaaaa gtgtgtgtgt

                              Now, Ion sometimes has interesting homopolymer artefacts (not homopolymer errors, but artefacts like above) and MIRA makes sure they get completely clipped. I.e., for MIRA the above sequence becomes

                              GTACGATCG aaaaaaaaaaaaaaaattttttttttttttttttttaaaaaa gtgtgtgtgt

                              Note that one of the clips advanced by 5 bases to the left, clipping it another couple of bases and completely hiding the Ion artefact. Now, if people use clipped sequence only, all MIRA would see is:

                              GTACGATCGAAAAA

                              Note how the homopolymer artefact, which was not entirely clipped in the SFF, now still contributes with 5 A to the sequence, and here MIRA is absolutely unable to see the true nature of those (most of the time) totally wrong A bases."
                              This was illustrated to me by Bastien.
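
The effect described in the quote can be mimicked with a few lines of Python. This is a toy sketch of the idea only, not MIRA's actual clipping algorithm: the function name `extend_right_clip` is made up, and the read is rebuilt from the example with run lengths chosen so that the clips fall at positions 15 and 52.

```python
def extend_right_clip(seq, right_clip):
    """Move a right clip leftwards over a homopolymer run that straddles it.

    right_clip is the 0-based index of the first clipped base.
    """
    # Without bases beyond the clip there is no evidence of a straddling run.
    if right_clip <= 0 or right_clip >= len(seq):
        return right_clip
    base = seq[right_clip].upper()
    while right_clip > 0 and seq[right_clip - 1].upper() == base:
        right_clip -= 1
    return right_clip


# Upper case = kept bases, lower case = bases clipped in the SFF.
full_read = "GTACGATCG" + "A" * 5 + "a" * 11 + "t" * 20 + "a" * 6 + "gtgtgtgtgt"
quality_clip = 14  # position 15 in 1-based terms

# With the full read, the clip can be pushed left over the whole A run.
new_clip = extend_right_clip(full_read, quality_clip)
kept_full = full_read[:new_clip]

# With pre-clipped input, the bases after the clip are gone, so nothing moves.
clipped_only = full_read[:quality_clip]
kept_clipped = clipped_only[:extend_right_clip(clipped_only, len(clipped_only))]

print(kept_full)
print(kept_clipped)
```

In this sketch the full read lets the clip move 5 bases to the left, while the pre-clipped read keeps the trailing A run because the information needed to recognise it has been discarded.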

                              Thanks
                              chayan
