  • MiniRider
    Junior Member
    • Dec 2008
    • 2

    Raw solexa data processing

    Hi there,

    I am a beginner in the sequencing field and have several basic questions; I hope someone can give me some guidance.

    I just got our sequencing data from a Solexa 1G analyzer, which gave us s_*_000*_seq.txt and s_*_000*_prb.txt files.

    First question:
    Why is the length of each read 152 bp (76x2) instead of 76 bp (38x2)?

    Second question:
    What is the first processing step I should do? I have already converted the raw text files into .fasta format. Should I split each 152 bp read into two short reads of 76 bp, or even shorter?

    Third question:
    For input to Maq, we need to convert reads from fastq to bfq. But I only have s_*_000*_seq.txt and s_*_000*_prb.txt in hand. Which software can I use for this conversion, or can I do it in Matlab?

    Thanks for your time in helping me answer these questions.
  • ECO
    --Site Admin--
    • Oct 2007
    • 1360

    #2
    Hey MiniRider,

    For your 3rd question, see this page, which contains the following instructions:

    IMPORTANT NOTE: The raw reads format used by Solexa (those `s_?_sequence.txt' from the Solexa pipeline) are different from mapass' FASTQ format in that the qualities are scaled differently. To use maq, you need to first convert the format with:
    • maq sol2sanger s_1_sequence.txt s_1_sequence.fastq

    where s_1_sequence.txt is the Solexa read sequence file. Missing this step will lead to unreliable SNP calling.
    So first you'll have to use the above advice to convert your Solexa fastq to a Sanger-scaled fastq; then you can simply use "maq fastq2bfq" to make the bfq files. If you run into computational issues, you can split the reads at this step using the -n option.
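
    For anyone curious about what the quality rescaling involves, here is a minimal Python sketch. It assumes that Solexa-scaled qualities are defined as -10*log10(p/(1-p)) and stored as ASCII characters with an offset of 64, while Sanger-style FASTQ stores Phred scores, -10*log10(p), with an offset of 33. The function names are hypothetical and this is not Maq's own code; for real data, use maq sol2sanger itself.

```python
import math

SOLEXA_OFFSET = 64  # assumed ASCII offset of Solexa-scaled quality characters
SANGER_OFFSET = 33  # ASCII offset of Sanger-style (Phred) quality characters


def solexa_to_phred(q_solexa: int) -> int:
    """Convert one Solexa-scaled quality score to the nearest Phred score."""
    # Solexa: Q = -10*log10(p / (1 - p));  Phred: Q = -10*log10(p)
    # Eliminating p gives Q_phred = 10*log10(10**(Q_solexa / 10) + 1).
    return int(round(10 * math.log10(10 ** (q_solexa / 10) + 1)))


def convert_quality_string(solexa_quals: str) -> str:
    """Re-encode a Solexa quality string as a Sanger-style quality string."""
    return "".join(
        chr(solexa_to_phred(ord(c) - SOLEXA_OFFSET) + SANGER_OFFSET)
        for c in solexa_quals
    )


if __name__ == "__main__":
    # Hypothetical quality string, for illustration only.
    print(convert_quality_string("hhh@"))
```

    Under these assumptions the two scales nearly agree for high-quality bases and differ most for low-quality ones, which is why skipping the conversion mainly distorts the qualities of the least reliable bases.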

    Hope that helps.

    • kevinlu
      Junior Member
      • Oct 2008
      • 6

      #3
      Hi there all,

      I'm also a beginner, and quite inexperienced at that. What does it take to convert the raw eland output from the Solexa machine into a .bed format suitable for viewing purposes on something like the UCSC Genome Browser?

      Thanks,
      Kevin
