  • juhang_62
    Junior Member
    • Nov 2011
    • 2

    how to interpret fastq

    Below are some lines I excerpted from a fastq file generated by an Illumina instrument. It looks a little different from the normal fastq format. How do I interpret it? I want to find information about the sequence type, flowcell lane, and whether the data is paired-end or multiplexed. I think the numbers in the header carry all of this information, but I do not know what they stand for.


    @SRR307074.65513 HWI-EAS88_0007:6:1:19833:18832 length=49
    NTTTCGTGTCGCAATAACAATAAGAAAGAAAGAAAAAGAAAACCAGAGA
    +SRR307074.65513 HWI-EAS88_0007:6:1:19833:18832 length=49
    #:;99=====EEEEEBEEEEEEEEEBEEEEEEEEEEEEEEEEEEEEEEB
    @SRR307074.65514 HWI-EAS88_0007:6:1:19833:10713 length=49
    NACAAATGTTAAGTTCTTCAGTTTCCTTCAAAATTGGGTTTTCTTGAGA
    +SRR307074.65514 HWI-EAS88_0007:6:1:19833:10713 length=49
    #972188===EEEEEBBBBBEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
    @SRR307074.65515 HWI-EAS88_0007:6:2:1069:8480 length=49
    CCAACCTAGAACCATAATTTANTCTCTTACANNANGAAAAAACAAGACN
    +SRR307074.65515 HWI-EAS88_0007:6:2:1069:8480 length=49
    GGDEGFGGGDGGFGDFEEGFD#DD=B=;350##1#C>=4>BB#######
    @SRR307074.65516 HWI-EAS88_0007:6:2:1070:6870 length=49
    GAAATTTTATATGGAACTCATNTATAAGAAANNANGAGATGTAAAGGCN
    +SRR307074.65516 HWI-EAS88_0007:6:2:1070:6870 length=49

    Thank you!
  • fkrueger
    Senior Member
    • Sep 2009
    • 627

    #2
    I don't know what you mean by 'it's a little different from normal fastq format'; to me it looks pretty 'normal'.

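    @SRR307074.65513 should be the SRA run accession (SRR307074) followed by the number of the read within that run (65513)
    HWI-EAS88_0007 should be the instrument name and run number
    6:1:19833:18832 should be lane, tile, x-coordinate, y-coordinate (of the sequence cluster)
    length=49 is the read length in bases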
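As a rough illustration of the field breakdown above, here is a minimal Python sketch that splits one of these header lines into its parts. The regular expression and the function name parse_header are my own assumptions, inferred only from the header lines quoted in this thread (with the part before the dot treated as an SRA accession and the number after it as a read index); other SRA exports or newer Illumina headers may need a different pattern.

```python
import re

# Assumed layout, inferred from the headers quoted in this thread:
# @<accession>.<read index> <instrument>_<run>:<lane>:<tile>:<x>:<y> length=<n>
HEADER_RE = re.compile(
    r"^@(?P<accession>\w+)\.(?P<read_index>\d+)\s+"
    r"(?P<instrument>[^\s:]+)_(?P<run>[^\s:_]+)"
    r":(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
    r"(?:\s+length=(?P<length>\d+))?\s*$"
)


def parse_header(line):
    """Split a FASTQ header line of the assumed layout into named fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"header does not match the assumed layout: {line!r}")
    fields = match.groupdict()
    # Convert the purely numeric fields; the run number stays a string
    # so that leading zeros such as '0007' are preserved.
    for key in ("read_index", "lane", "tile", "x", "y", "length"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    header = "@SRR307074.65513 HWI-EAS88_0007:6:1:19833:18832 length=49"
    for name, value in parse_header(header).items():
        print(f"{name}: {value}")
```

Collecting the lane field over a whole file this way would show which flowcell lanes the reads came from; in the excerpt every read is from lane 6.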

    Quality values in SRA data are encoded in Sanger format (Phred +33), and I can't see any indications that the data was paired-end or multiplexed.
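To make the Phred +33 encoding concrete, the sketch below converts quality characters to scores by subtracting 33 from each character's ASCII code, and turns a score Q into the usual error probability 10^(-Q/10). The function names are hypothetical, and the example input is just the first six quality characters of the first record quoted above.

```python
def phred33_scores(quality):
    """Decode a Sanger (Phred +33) quality string into integer scores."""
    return [ord(ch) - 33 for ch in quality]


def error_probability(score):
    """Estimated probability that a base call with this Phred score is wrong."""
    return 10 ** (-score / 10)


if __name__ == "__main__":
    prefix = "#:;99="  # first six quality characters of read SRR307074.65513
    scores = phred33_scores(prefix)
    print(scores)  # [2, 25, 26, 24, 24, 28]
    print([round(error_probability(q), 4) for q in scores])
```

The '#' characters decode to a score of 2, which is why the N base calls in the excerpt line up with '#' in the quality line.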


    • mgogol
      Senior Member
      • Mar 2008
      • 197

      #3
      If it's paired end, you'll have two different files, one with reads from the first end and one with reads from the second.

      If it's multiplexed and has already been split up, the file names should have the index in them. If it's multiplexed and hasn't been split, you can use fastx_barcode_splitter.
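If you do end up with two files, one quick sanity check is that they list the same reads in the same order. The following is only a sketch under assumptions: it expects plain four-line FASTQ records with no blank lines, treats an optional /1 or /2 suffix as the mate marker, and uses made-up toy records rather than real data. The function names are hypothetical.

```python
import io
from itertools import zip_longest


def read_ids(handle):
    """Yield the read ID from every fourth line (assumes 4-line records)."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            # Drop the leading '@' and anything after the first whitespace.
            yield line[1:].split()[0]


def strip_mate_suffix(read_id):
    """Remove a trailing /1 or /2 mate marker if present."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


def files_are_paired(handle1, handle2):
    """Return True if both files contain the same read IDs in the same order."""
    for id1, id2 in zip_longest(read_ids(handle1), read_ids(handle2)):
        if id1 is None or id2 is None:
            return False  # one file has more records than the other
        if strip_mate_suffix(id1) != strip_mate_suffix(id2):
            return False
    return True


if __name__ == "__main__":
    # Toy records for illustration only, not taken from any real run.
    first = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nTTGA\n+\nIIII\n")
    second = io.StringIO("@r1/2\nGGCA\n+\nIIII\n@r2/2\nCATG\n+\nIIII\n")
    print(files_are_paired(first, second))
```

With real files you would pass open file handles instead of the io.StringIO objects.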
