  • SRA to fastq conversion with fastq-dump loses sequences

    Hello,

    I converted an SRA archive (ftp://ftp-trace.ncbi.nlm.nih.gov/sra...953/SRR073769/) to fastq with the fastq-dump program (sratoolkit-2.1.6). The resulting fastq file had ~160,000 fewer sequences than expected (about 2% of the total number of spots). Why does this occur?

    Thank you,

    Paul

  • #2
    I've also experienced this problem. Did you find a solution?

    Thank you,

    Elizabeth

    • #3
      How do you know the expected number of sequences?

      • #4
        In the file I downloaded, I am seeing the same number of sequences as reported on the SRA page.

        Code:
        ../sratoolkit.2.1.16-centos_linux64/bin/fastq-dump.2.1.18 SRR073769.sra 
        Written 8175900 spots for SRR073769.sra
        Written 8175900 spots total
        Code:
        $ more SRR073769.fastq | grep "@SRR073769" | wc -l
        8175900
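
The same check can be done without grep. Below is a minimal Python sketch, assuming every FASTQ record occupies exactly four lines (no wrapped sequence lines) and one record per spot. It counts records by line count and compares the result with the "Written N spots total" line that fastq-dump printed above. The function names and the tiny demo data are made up for illustration.

```python
import io
import re


def count_fastq_records(handle):
    """Count records in a FASTQ stream, assuming 4 lines per record."""
    n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"line count {n_lines} is not a multiple of 4")
    return n_lines // 4


def parse_spots_total(log_text):
    """Return N from a 'Written N spots total' line, or None if absent."""
    match = re.search(r"Written (\d+) spots total", log_text)
    return int(match.group(1)) if match else None


# Made-up demo data; note the second quality line starts with '@',
# which is why counting lines is safer than grepping for a bare '@'.
demo_fastq = io.StringIO(
    "@SRR000000.1 1 length=4\nACGT\n+\nIIII\n"
    "@SRR000000.2 2 length=4\nTTGA\n+\n@III\n"
)
demo_log = "Written 2 spots for demo.sra\nWritten 2 spots total\n"

n_records = count_fastq_records(demo_fastq)
n_spots = parse_spots_total(demo_log)
print(n_records, n_spots, n_records == n_spots)
```

If the two numbers differ, the conversion (or the download) is worth repeating before looking for other causes.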

        • #5
          Oh, I see.

          I'm still a little confused about something:

          This file, SRR035116.sra, for example, is 3.9Gb
          When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

          Usually when I convert sra to fastq, my files get a lot bigger. Help?

          Thank you!!

          • #6
            The .sra file is ~383 Mb and the .fastq file is 1.6 Gb (on my filesystem). If your .sra file is truly that large, then something must be wrong.

            Use the aspera client that SRA provides to download the .sra file.


            Originally posted by eeh_021

            I'm still a little confused about something:

            This file, SRR035116.sra, for example, is 3.9Gb
            When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

            Usually when I convert sra to fastq, my files get a lot bigger. Help?

            Thank you!!

            • #7
              383Mb?

              If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...

              • #8
                Originally posted by eeh_021
                383Mb?

                If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...
                Perhaps it is more straightforward to fetch it from Europe or Japan.

                Compressed files (.fastq.gz or .fastq.bz2) are just easier to use than those .sra files.
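
As an illustration of why gzip-compressed FASTQ is convenient, here is a small Python sketch that reads a .fastq.gz file directly with the standard gzip module, with no separate decompression step. It assumes four lines per record; the file name and contents are made up for the demo.

```python
import gzip
import os
import tempfile


def count_records_gz(path):
    """Count records in a gzip-compressed FASTQ file (4 lines per record)."""
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Write a tiny made-up .fastq.gz file, then read it back.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.fastq.gz")
with gzip.open(demo_path, "wt") as out:
    out.write("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n")

print(count_records_gz(demo_path))
```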


                Sébastien Boisvert

                • #9
                  We are talking about two different data sets.

                  My response was for the dataset (SRR073769) that was in pcantalupo's original post.

                  The dataset you are referring to below is indeed 3.9 Gb.


                  Originally posted by eeh_021
                  383Mb?

                  If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...

                  • #10
                    Originally posted by eeh_021
                    Oh, I see.

                    I'm still a little confused about something:

                    This file, SRR035116.sra, for example, is 3.9Gb
                    When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

                    Usually when I convert sra to fastq, my files get a lot bigger. Help?

                    Thank you!!
                    I don't think that's a problem if the fastq file gets bigger because the sra file is in binary anyway, which is more compact.
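
To get a feel for how large a plain-text FASTQ file should be, here is a rough back-of-the-envelope sketch in Python. It assumes fixed-length reads, one record per spot, and one byte per character. The read length and header length in the example are hypothetical, not taken from either dataset in this thread.

```python
def estimate_fastq_bytes(n_spots, read_length, header_length,
                         repeat_header_on_plus=False):
    """Estimate uncompressed FASTQ size in bytes.

    header_length is the length of the '@...' line without its newline.
    Each record has four newline-terminated lines:
    header, sequence, '+' separator, quality.
    """
    # The separator is either a bare '+' or '+' followed by the read name,
    # which has the same length as the '@' header line.
    plus_length = header_length if repeat_header_on_plus else 1
    per_record = ((header_length + 1) + (read_length + 1)
                  + (plus_length + 1) + (read_length + 1))
    return n_spots * per_record


# Hypothetical numbers: 76 bp reads with 30-character headers.
size = estimate_fastq_bytes(n_spots=8175900, read_length=76, header_length=30)
print(f"{size / 1e9:.2f} GB")
```

If the file on disk is far smaller than such an estimate, the conversion may have been cut short or the reads may be shorter than assumed.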

                    • #11
                      How do I convert an SRA file to FASTQ?

                      • #12
                        Originally posted by alireda82
                        How do I convert an SRA file to FASTQ?
                        Use SRA toolkit: http://eutils.ncbi.nih.gov/Traces/sr...lkit_doc&f=std
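
For scripted conversions, the fastq-dump call shown earlier in this thread can be wrapped in Python. This is only a sketch: it assumes fastq-dump is on your PATH (or that you pass its full path), it uses the bare `fastq-dump <file>` form from post #4 with no extra options, and the helper names are made up.

```python
import shutil
import subprocess


def build_fastq_dump_command(sra_path, executable="fastq-dump"):
    """Build the argument list for the plain 'fastq-dump <file>' call."""
    return [executable, sra_path]


def convert_sra_to_fastq(sra_path, executable="fastq-dump"):
    """Run fastq-dump on one .sra file and return what it printed."""
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable} not found; is the SRA toolkit installed?")
    result = subprocess.run(
        build_fastq_dump_command(sra_path, executable),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Only show the command here; call convert_sra_to_fastq() to actually run it.
print(" ".join(build_fastq_dump_command("SRR073769.sra")))
```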

                        • #13
                          Hi everyone, I'm new here!
                          Can someone tell me if it's possible to convert a WIG file to FASTQ? Thanks in advance.

                          • #14
                            No, WIG files do not contain the sequences, just the coverage.
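
To make that concrete, here is a minimal Python sketch of a WIG reader, assuming the standard fixedStep and variableStep layouts. Everything it yields is a chromosome name, a position, and a number; there are no bases or quality scores from which a FASTQ record could be rebuilt. The sample lines are made up.

```python
def parse_wig(lines):
    """Yield (chrom, position, value) tuples from WIG-formatted lines."""
    mode = None
    chrom = None
    pos = 0
    step = 1
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith(("fixedStep", "variableStep")):
            fields = line.split()
            mode = fields[0]
            params = dict(field.split("=", 1) for field in fields[1:])
            chrom = params["chrom"]
            if mode == "fixedStep":
                pos = int(params["start"])
                step = int(params.get("step", 1))
            continue
        if mode == "fixedStep":
            # One value per line; the position advances by 'step'.
            yield chrom, pos, float(line)
            pos += step
        elif mode == "variableStep":
            # Each line carries its own position and value.
            position, value = line.split()
            yield chrom, int(position), float(value)


demo_wig = [
    "fixedStep chrom=chr1 start=100 step=10",
    "1.5",
    "2.0",
    "variableStep chrom=chr2",
    "5 3.0",
]
for record in parse_wig(demo_wig):
    print(record)
```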
