  • Deutsche
    Junior Member
    • Apr 2011
    • 9

    SRA format

    Does anybody know about the SRA format specification? Does one exist?
    I have only found an API on the NCBI site that can help to read SRA files, but I haven't found any information about the format specification.
  • vadim
    Member
    • Sep 2009
    • 37

    #2
    ask them: [email protected]


    • maubp
      Peter (Biopython etc)
      • Jul 2009
      • 1544

      #3
      You don't mean the SRA XML Specification, which is documented?
      This documentation provides application notes for the Sequence Read Archive (SRA) at the National Center for Biotechnology Information.


      Rather I assume you mean the binary SRA files whose first 8 bytes are "NCBI.sra"? If you find a link, or you prompt the NCBI to publish this, could you post the URL here please?
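
      For what it's worth, that signature is easy to test for. Below is a minimal Python sketch, assuming only that a binary SRA file begins with those 8 bytes; the name `looks_like_sra` is made up for illustration, and nothing here attempts to decode the rest of the format:

```python
SRA_MAGIC = b"NCBI.sra"  # 8-byte signature mentioned in this thread


def looks_like_sra(path):
    """Return True if the file at `path` starts with the bytes b"NCBI.sra".

    This only inspects the first 8 bytes; it says nothing about whether
    the remainder of the file is a valid SRA archive.
    """
    with open(path, "rb") as handle:
        return handle.read(len(SRA_MAGIC)) == SRA_MAGIC
```

      A check like this can at least tell you whether a download is a binary SRA file rather than, say, a FASTQ dump or an XML metadata document.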


      • Deutsche
        Junior Member
        • Apr 2011
        • 9

        #4
        Originally posted by maubp View Post
        You don't mean the SRA XML Specification, which is documented?
        This documentation provides application notes for the Sequence Read Archive (SRA) at the National Center for Biotechnology Information.


        Rather I assume you mean the binary SRA files whose first 8 bytes are "NCBI.sra"? If you find a link, or you prompt the NCBI to publish this, could you post the URL here please?
        Yes, I mean the binary file format. OK, I will write to them and post here if I find anything.


        • vadim
          Member
          • Sep 2009
          • 37

          #5
          You could also check the source code of, say, fastq-dump.c or some other dumping tool. It worked for us.


          • Deutsche
            Junior Member
            • Apr 2011
            • 9

            #6
            Originally posted by vadim View Post
            You could also check the source code of, say, fastq-dump.c or some other dumping tool. It worked for us.
            It is very difficult to work out a format specification from 12 MB of code, so I want to try the simplest way first. If that fails, I will try to analyse the source code.


            • Deutsche
              Junior Member
              • Apr 2011
              • 9

              #7
              Originally posted by vadim View Post
              Could you please tell me how you got this address?


              • vadim
                Member
                • Sep 2009
                • 37

                #8
                Could you please explain what you are planning to do with the SRA data? Most people are happy with the fastq/fasta dumps produced by standard tools. For something more complicated you could use the API from the SDK, which in my opinion is much easier than understanding the format specs.
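
                To illustrate the fastq route: once a standard tool has dumped a run to FASTQ, reading it takes only a few lines. This is a minimal Python sketch, assuming the plain four-lines-per-record FASTQ layout (no line-wrapped sequences); `read_fastq` and the sample records are made up for illustration:

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a text-mode FASTQ stream.

    Assumes each record occupies exactly four lines:
    @name, sequence, + separator, quality string.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = handle.readline().rstrip("\n")
        plus = handle.readline()
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("not a four-line FASTQ record: %r" % header)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in %r" % header)
        yield header[1:].strip(), seq, qual


# Hypothetical two-record input standing in for a real dump file.
sample = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGG\n+\n!!\n")
records = list(read_fastq(sample))
```

                For real work an existing parser such as Biopython's `SeqIO.parse(handle, "fastq")` is likely a more robust choice than a hand-rolled reader.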


                • Deutsche
                  Junior Member
                  • Apr 2011
                  • 9

                  #9
                  Originally posted by vadim View Post
                  Could you please explain what you are planning to do with the SRA data? Most people are happy with the fastq/fasta dumps produced by standard tools. For something more complicated you could use the API from the SDK, which in my opinion is much easier than understanding the format specs.
                  I'm working on the UGENE project, and my next task is to add SRA format support to our tool. Simply bundling the SRA SDK into UGENE is not straightforward, because our tool is cross-platform but the SDK supports only UNIX.


                  • vadim
                    Member
                    • Sep 2009
                    • 37

                    #10
                    I believe the SRA SDK can be built for Windows and Mac as well, although I have never actually tried this.
                    Is UGENE written in C++? If so, I would definitely consider re-using NCBI's code.


                    • Deutsche
                      Junior Member
                      • Apr 2011
                      • 9

                      #11
                      Originally posted by vadim View Post
                      I believe the SRA SDK can be built for Windows and Mac as well, although I have never actually tried this.
                      Is UGENE written in C++? If so, I would definitely consider re-using NCBI's code.
                      Yes, it is. C++ with Qt4.


                      • Deutsche
                        Junior Member
                        • Apr 2011
                        • 9

                        #12
                        Originally posted by maubp View Post
                        If you find a link, or you prompt the NCBI to publish this, could you post the URL here please?
                        The people at NCBI told me that they don't give this documentation to anybody, and that if you want to use the SRA format you need to use their API.


                        • maubp
                          Peter (Biopython etc)
                          • Jul 2009
                          • 1544

                          #13
                          Originally posted by Deutsche View Post
                          The people at NCBI told me that they don't give this documentation to anybody, and that if you want to use the SRA format you need to use their API.
                          Well, at least they are clear about it.

                          Hurrah for the principles of openness and sharing in science! </sarcasm>


                          • jkbonfield
                            Senior Member
                            • Jul 2008
                            • 146

                            #14
                            It's perhaps a reasonable stance to take, as it gives them the flexibility to change the format without having to keep notifying people, as long as the API remains constant. However, it does rather block others from writing interfaces in alternative languages.

                            The format is almost certainly quite complex though. I remember lots of discussions and to-ing and fro-ing on the best algorithms for compressing traces, qualities and sequences, with different methods for each type. As others have suggested, I'd recommend using their API, and if it doesn't port cleanly to Windows then fixing that may be an easier task than reimplementing.

                            I'm not sure what licence they use though and whether that would be a hindrance.


                            • vadim
                              Member
                              • Sep 2009
                              • 37

                              #15
                              Originally posted by jkbonfield View Post
                              As others have suggested, I'd recommend using their API, and if it doesn't port cleanly to Windows then fixing that may be an easier task than reimplementing.

                              I'm not sure what licence they use though and whether that would be a hindrance.
                              It should work under Windows, see here:
                              SRA Tools. Contribute to ncbi/sra-tools development by creating an account on GitHub.


                              It is not licensed. I asked them about it recently and they said "no restrictions", whatever that means.
