  • rnaeye
    Member
    • May 2011
    • 80

    Help needed with fastq-dump (SRA toolkit)

    Hi,
    I am trying to split an .sra file into R1.fastq and R2.fastq. However, I am getting a single file, and I think the forward and reverse reads are joined. Here is the accession number: SRR5439504 (file SRR5439504.sra).

    The command I ran is:

    Code:
       fastq-dump -I --split-files SRR5439504.sra
    The output file I get looks like this:

    Code:
    @SRR5439504.1.1 1 length=302
    CCATAACCCTAACCCTAACCCTAACCCTAACTCTATCCATAACCCTAACCCTTACCCTATCCCTAACCCTAACCCTAACCCTAACCCTAGCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAGCCTAAGCGTAGCCCTAAGCCTAAGCCTAAGCCAAAGCGTAAGCCTAAGCCTAAGCCACAGCATAAAAAAAAGCAAAAACATAAACCCAAGAAAAAG
    +SRR5439504.1.1 1 length=302
    F22F<2@2C?02GCFHF?FB0?0?02BB44B334?3B33/0B?20/0003@33BB33223B21E1G?2FG1BF2BB1BB2FA1BF1A112B2FAA3CBFE1FHFHGFAHGHHHHGHHGFBHHGFBAFFFGGGGFGEFEGFFBFFFFCCBBBCBCBCFFFCFFFGGGGGGGGGCFGHHHHGHHFFGHCFCGHCHFHHGHFCB1AA233333B3B0BA0133222333333333B3@3F322B321>>11@3BF@3333333B322BB/2333433/<</02<2@///2<<110////00000.
    @SRR5439504.2.1 2 length=302
    CTCTAACCCTAACTCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACTCTAACCCTAACCCTAACCCTAACCCTATCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCGTAACCCTAAGCCTAACCCTAACCCTAACCCAAAACATAAGCCAAAGCCTAACCCTAACCCCAAGCATAATCCTAAACATAATCACACA
    +SRR5439504.2.1 2 length=302
    ?1A?0GF0>2@@@10HGFFEG?00AF0/FBFB0HFB<00>0BF0B/0BF0HGBBFBFG@BB1CFBBB00>0GF>0>B0B0BA0BB01AB0F00/0B00FA0GF00F00B0FA0FF00A00G0G0A0G00AB1GGGGGGGFF>CFFFAAAA@BABBBFFFBFFFGGGGGGGE44AEAAFFEH2F2GF222A22222BB2A2B1FFC2BF1ABE10ABA131B2?3333B32??12F2B1B2F2111??1B133333300B3B0BFC00?B?F0B///C//01BB22?12@1111@@2>1111/
    I am not even sure the read length should be 302 bp. My guess is that the original run was 2x151 bp. Does anyone have any recommendations? Thanks.
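
One way to test the 2x151 guess is to tally read lengths in the dumped FASTQ. If every record is exactly 302 bases, that is consistent with (but does not prove) two 151 bp mates stored as one read. A minimal Python sketch follows; the function name `read_length_counts` and the file name in the usage comment are illustrative assumptions, not part of the SRA toolkit.

```python
from collections import Counter
from itertools import islice


def read_length_counts(fastq_lines, max_records=None):
    """Tally sequence lengths from an iterable of FASTQ lines.

    Assumes plain 4-line FASTQ records (header, sequence, '+', quality).
    """
    counts = Counter()
    lines = iter(fastq_lines)
    records = 0
    while max_records is None or records < max_records:
        record = list(islice(lines, 4))
        if len(record) < 4:  # end of input (or a truncated final record)
            break
        counts[len(record[1].strip())] += 1
        records += 1
    return counts


# Usage (hypothetical file name; adjust to whatever fastq-dump wrote):
#
#     with open("SRR5439504_1.fastq") as handle:
#         print(read_length_counts(handle, max_records=100000))
```

A single length of 302 across the sample would support the idea that the mates were joined before submission; a spread of lengths would suggest something else is going on.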
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    For reference, this was cross-posted on Biostars: https://www.biostars.org/p/251363/

    My answer there: Looking at the SRA record, the sequence seems to have been submitted as single 302 bp reads from a CIRCLE-seq experiment, even though the layout is described as PAIRED. So you are unlikely to get separate paired-end files out of SRA. I don't know what CIRCLE-seq is, but you can take a look at the Nature protocol paper mentioned and process the data accordingly. Perhaps every read represents a circular sequence of some sort?
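
If the 302 bp reads do turn out to be two 151 bp mates stored end to end, they could be separated after dumping. The Python sketch below cuts every record at its midpoint. That layout is an assumption this thread does not confirm, and the sketch does not reverse-complement or otherwise reorient the second half. The function names and the `/1`, `/2` read-name suffixes are illustrative choices.

```python
def split_joined_record(header, seq, plus, qual):
    """Split one FASTQ record at its midpoint into two mate records.

    Assumption: the read is mate 1 followed directly by mate 2,
    both of the same length. `plus` is accepted but rewritten as a bare '+'.
    """
    if not seq or len(seq) != len(qual):
        raise ValueError(f"malformed record: {header}")
    if len(seq) % 2:
        raise ValueError(f"odd read length, cannot halve: {header}")
    half = len(seq) // 2
    name = header[1:].split()[0]  # drop '@' and any trailing description
    mate1 = (f"@{name}/1", seq[:half], "+", qual[:half])
    mate2 = (f"@{name}/2", seq[half:], "+", qual[half:])
    return mate1, mate2


def split_fastq(in_path, r1_path, r2_path):
    """Write the two halves of every record in in_path to r1_path and r2_path."""
    with open(in_path) as src, open(r1_path, "w") as out1, open(r2_path, "w") as out2:
        while True:
            record = [src.readline().rstrip("\n") for _ in range(4)]
            if not record[0]:  # end of file
                break
            mate1, mate2 = split_joined_record(*record)
            out1.write("\n".join(mate1) + "\n")
            out2.write("\n".join(mate2) + "\n")
```

Before trusting the result, it would be worth checking the protocol paper for how CIRCLE-seq reads are structured, since a midpoint cut is only meaningful if the submitters simply concatenated the two mates.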

    • rnaeye
      Member
      • May 2011
      • 80

      #3
      Hi GenoMax,
      Thank you for answering on both sites. I checked the paper again and finally found the description of the reads: they did 150 bp paired-end sequencing. They must have prepared the files incorrectly. I emailed the author, so let's see whether they fix it. Best,
