  • SRA to fastq conversion with fastq-dump loses sequences

    Hello,

    I converted an SRA archive (ftp://ftp-trace.ncbi.nlm.nih.gov/sra...953/SRR073769/) to fastq with the fastq-dump program (sratoolkit-2.1.6). The resulting fastq file had ~160,000 fewer sequences than expected (about 2% of the total number of spots). Why does this occur?

    Thank you,

    Paul

  • #2
    I've also experienced this problem. Did you find a solution?

    Thank you,

    Elizabeth



    • #3
      How do you know the expected number of sequences?



      • #4
        In the file I downloaded, I am seeing the same number of sequences as reported on the SRA page.

        Code:
        ../sratoolkit.2.1.16-centos_linux64/bin/fastq-dump.2.1.18 SRR073769.sra 
        Written 8175900 spots for SRR073769.sra
        Written 8175900 spots total
        Code:
        $ grep -c "^@SRR073769" SRR073769.fastq
        8175900
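
As a cross-check on the grep count above, records can also be counted from the line total, which does not depend on the read-name prefix. This is a minimal sketch, assuming an unwrapped FASTQ file with exactly four lines per record; the file name, read names, and bases below are made up for illustration. It also shows why a bare `^@` pattern can over-count: a quality string may itself begin with `@`.

```shell
#!/bin/sh
# Build a tiny two-record FASTQ file (hypothetical reads, for illustration only).
tmpdir=$(mktemp -d)
fastq="$tmpdir/example.fastq"
cat > "$fastq" <<'EOF'
@SRR000000.1 1 length=8
ACGTACGT
+SRR000000.1 1 length=8
@IIIIIII
@SRR000000.2 2 length=8
TTGGCCAA
+SRR000000.2 2 length=8
IIIIIIII
EOF

# Each record spans exactly four lines when sequences are not wrapped.
total_lines=$(wc -l < "$fastq" | tr -d ' ')
records=$((total_lines / 4))

# Lines starting with "@" include the first quality string here, so this count is too high.
at_lines=$(grep -c '^@' "$fastq")

echo "records by line count: $records"
echo "lines starting with @: $at_lines"
rm -r "$tmpdir"
```

On this toy file the line-count method reports 2 records while the bare `^@` pattern matches 3 lines. A pattern that includes the run accession, as in the post above, avoids that trap in practice.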



        • #5
          Oh, I see.

          I'm still a little confused about something:

          This file, SRR035116.sra, for example, is 3.9Gb
          When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

          Usually when I convert sra to fastq, my files get a lot bigger. Help?

          Thank you!!



          • #6
            The .sra file is ~383 Mb and the .fastq file is 1.6 Gb (on my filesystem). If your .sra file is truly that large, then something must be wrong.

            Use the aspera client that SRA provides to download the .sra file.


            Originally posted by eeh_021

            I'm still a little confused about something:

            This file, SRR035116.sra, for example, is 3.9Gb
            When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

            Usually when I convert sra to fastq, my files get a lot bigger. Help?

            Thank you!!



            • #7
              383Mb?

              If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...



              • #8
                Originally posted by eeh_021
                383Mb?

                If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...

                Perhaps it is more straightforward to fetch it from Europe or Japan.

                Compressed files (.fastq.gz or .fastq.bz2) are just easier to use than those .sra files.


                Sébastien Boisvert
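
To illustrate the point about compressed files, here is a minimal sketch of working with a gzipped FASTQ file directly. The file name and reads are made up, and real reads are far less repetitive than this toy data, so the size reduction shown here is much larger than what real data would give.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
fastq="$tmpdir/example.fastq"

# Write 1000 made-up records; this is illustration data, not real reads.
i=1
while [ "$i" -le 1000 ]; do
    printf '@read.%s\nACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIII\n' "$i"
    i=$((i + 1))
done > "$fastq"

raw_bytes=$(wc -c < "$fastq" | tr -d ' ')
gzip "$fastq"    # replaces example.fastq with example.fastq.gz
gz_bytes=$(wc -c < "$fastq.gz" | tr -d ' ')

# Count records straight from the compressed stream, four lines per record.
records=$(gzip -dc "$fastq.gz" | awk 'END { print NR / 4 }')

echo "raw: $raw_bytes bytes, gzipped: $gz_bytes bytes, records: $records"
rm -r "$tmpdir"
```

The record count is read without writing an uncompressed copy to disk, which is usually the practical appeal of keeping reads as .fastq.gz.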



                • #9
                  We are talking about two different data sets.

                  My response was for the dataset (SRR073769) that was in pcantalupo's original post.

                  The dataset you are referring to below is indeed 3.9 Gb.


                  Originally posted by eeh_021
                  383Mb?

                  If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...



                  • #10
                    Originally posted by eeh_021
                    Oh, I see.

                    I'm still a little confused about something:

                    This file, SRR035116.sra, for example, is 3.9Gb
                    When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

                    Usually when I convert sra to fastq, my files get a lot bigger. Help?

                    Thank you!!

                    I don't think it's a problem when the fastq file gets bigger, because the .sra file is binary, which is more compact.



                    • #11
                      How do I convert an SRA file to FASTQ?



                      • #12
                        Originally posted by alireda82
                        How do I convert an SRA file to FASTQ?

                        Use the SRA Toolkit: http://eutils.ncbi.nih.gov/Traces/sr...lkit_doc&f=std
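
A minimal sketch of the conversion, using the same plain `fastq-dump <file>.sra` invocation shown earlier in this thread. The input path is an assumption; the script skips the conversion if the tool or the file is missing. Paired-end runs may need extra options, so check `fastq-dump --help` for your toolkit version.

```shell
#!/bin/sh
# Assumed input path; replace with the .sra file you downloaded.
sra_file="SRR073769.sra"

if command -v fastq-dump >/dev/null 2>&1 && [ -f "$sra_file" ]; then
    # Writes the FASTQ output to the current directory and reports
    # "Written N spots" when it finishes.
    if fastq-dump "$sra_file"; then
        status="converted"
    else
        status="failed"
    fi
else
    echo "fastq-dump or $sra_file not found; nothing converted" >&2
    status="skipped"
fi
echo "$status"
```

Comparing the reported spot count with the number on the SRA run page is a quick sanity check after conversion.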



                        • #13
                          Hi everyone, I'm new here!
                          Can someone tell me if it's possible to convert a WIG file to FASTQ? Thanks in advance.



                          • #14
                            No, WIG files do not contain the sequences, just the coverage.
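
To make that concrete, here is a minimal sketch with a made-up fixedStep WIG track. The chromosome, start position, and values are hypothetical. The file holds one declaration line followed by one numeric value per position, so there are no bases or quality scores from which a FASTQ record could be built.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
wig="$tmpdir/example.wig"

# A made-up fixedStep track: one declaration line, then one coverage value per position.
cat > "$wig" <<'EOF'
fixedStep chrom=chr1 start=101 step=1
0
3
7
7
2
EOF

# Count the data lines and sum the coverage values; numbers are all the file contains.
positions=$(awk 'NR > 1 { n++ } END { print n }' "$wig")
total_coverage=$(awk 'NR > 1 { sum += $1 } END { print sum }' "$wig")

echo "positions: $positions, summed coverage: $total_coverage"
rm -r "$tmpdir"
```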
