  • m_elena_bioinfo
    Member
    • Oct 2009
    • 99

    PE SOLiD reads alignment by bwa

    Dear users,
    I have PE reads from SOLiD to align to human genome.
    I have these files:

    - solid_data_F3.csfasta
    - solid_data_F3_QV.qual
    - solid_data_F5-P2.csfasta
    - solid_data_F5-P2_QV.qual

    I want to convert these files to fastq using bwa0.5.7/solid2fastq.pl.
    The script runs only for F3; with F5-P2 it fails (it says "Fail to open solid_data_F5-P2_F3.csfasta").

    So, if I use:
    > solid2fastq.pl solid_data_ solid_data_total
    I get only one fastq file for F3 and F5-P2. Does it include all the paired-end reads?

    This fastq is in colorspace, but the colors are represented as ACTG.
    So to index the genome and to run the bwa alignment, do I have to use the -c option?

    Thanks a lot,
    ME
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    It looks like the script doesn't support the paired end protocol. Bug the BWA mailing list ([email protected]) or the author (username:lh3).

    • drio
      Senior Member
      • Oct 2008
      • 323

      #3
      If you want to use the script with the PE data, make this change in the script (around line 98) so that the read-name pattern also accepts the F5-P2 tag:

      #if (/^>(\d+)_(\d+)_(\d+)_[FR]3/) {
      if (/^>(\d+)_(\d+)_(\d+)_(?:F3|R3|F5-P2)/) {
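
      To see which read-name tags a given pattern lets through, here is a minimal sketch in Python, whose regex syntax behaves like Perl's for these constructs. The header strings are made-up examples and `accepted` is a hypothetical helper, not part of solid2fastq.pl. One thing it illustrates: square brackets define a character class (a set of single characters), so matching whole tags such as F3, R3 or F5-P2 needs a group with alternation.

```python
import re

# Made-up read-name headers in the csfasta style; "_A1" is a deliberately bogus tag.
HEADERS = [">1_2_3_F3", ">1_2_3_R3", ">1_2_3_F5-P2", ">1_2_3_A1"]

# Pattern quoted from the script: F3 or R3 only.
ORIGINAL = re.compile(r"^>(\d+)_(\d+)_(\d+)_[FR]3")
# Grouped alternation: exactly the three whole tags.
GROUPED = re.compile(r"^>(\d+)_(\d+)_(\d+)_(?:F3|R3|F5-P2)")
# Bracketed form: a character class, so it tests one character only
# (and "5-P" inside the brackets is read as a character range).
BRACKETED = re.compile(r"^>(\d+)_(\d+)_(\d+)_[F3|R3|F5-P2]")


def accepted(pattern, headers=HEADERS):
    """Return the headers that the compiled pattern matches."""
    return [h for h in headers if pattern.search(h)]


if __name__ == "__main__":
    for name, pat in [("original", ORIGINAL), ("grouped", GROUPED), ("bracketed", BRACKETED)]:
        print(name, accepted(pat))
```

      With these examples, the bracketed form also lets the bogus `_A1` tag through, while the grouped form accepts only the three intended tags.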

      Also rename the F5-P2 files to R3:

      solid_data_F5-P2.csfasta -> solid_data_R3.csfasta
      solid_data_F5-P2_QV.qual -> solid_data_R3_QV.qual
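
      The renaming above could be scripted roughly as follows. This is only a sketch: `r3_name` and `rename_f5_to_r3` are hypothetical helper names, the `solid_data_` prefix is taken from the file names in this thread, and the function defaults to a dry run so nothing is renamed until you ask for it.

```python
from pathlib import Path


def r3_name(path):
    """Return the path with the F5-P2 tag in the file name replaced by R3."""
    p = Path(path)
    return p.with_name(p.name.replace("F5-P2", "R3"))


def rename_f5_to_r3(directory, prefix="solid_data_", dry_run=True):
    """Collect (old, new) file-name pairs; rename on disk only if dry_run is False."""
    pairs = []
    for old in sorted(Path(directory).glob(f"{prefix}F5-P2*")):
        new = r3_name(old)
        pairs.append((old.name, new.name))
        if not dry_run:
            old.rename(new)
    return pairs


if __name__ == "__main__":
    # Dry run in the current directory: prints what would be renamed.
    for old, new in rename_f5_to_r3("."):
        print(f"{old} -> {new}")
```

      Only files whose names start with the prefix followed by F5-P2 are touched, so the F3 files keep their names.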

      Also, bfast has a solid2fastq (in the git repo) that now supports bwa output and
      handles PE data. You can use that too.
      -drd

      • m_elena_bioinfo
        Member
        • Oct 2009
        • 99

        #4
        Thanks very much for your help, drio!
        I'll try it and let you know if the program runs!

        • SoftGenetics
          Registered Vendor
          • Apr 2009
          • 36

          #5
          You will lose a lot of information by converting the color space files to fasta. You would be better off aligning the SOLiD reads to a color space reference.

          John

          • drio
            Senior Member
            • Oct 2008
            • 323

            #6
            There is information lost because of the dinucleotide 'color' encoding but the alignments are performed in CS (http://seqanswers.com/forums/showthread.php?t=5245). BWA will do a good job aligning those reads.
            -drd
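
            For readers new to color space, here is a small sketch of the encoding being discussed. It assumes the standard SOLiD scheme, in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3), and the "double encoding" in which colors 0-3 are written as the letters A, C, G, T, which is why a colorspace fastq can look like ordinary bases. The function names are illustrative only and are not taken from bwa.

```python
# 2-bit base codes assumed for the dinucleotide color scheme.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

# Double encoding: color digits written as letters, "." (missing call) as N.
COLOR_TO_LETTER = str.maketrans("0123.", "ACGTN")


def to_colors(bases):
    """Encode each pair of adjacent bases as one color digit (XOR of their codes).

    The leading primer base that real csfasta reads carry is not modeled here.
    """
    return "".join(str(BASE_CODE[a] ^ BASE_CODE[b]) for a, b in zip(bases, bases[1:]))


def double_encode(colors):
    """Rewrite a string of color digits as letters (0->A, 1->C, 2->G, 3->T)."""
    return colors.translate(COLOR_TO_LETTER)


if __name__ == "__main__":
    colors = to_colors("ATGGC")
    print(colors, double_encode(colors))
```

            The letters produced by `double_encode` still stand for colors, not bases, so they only make sense when aligned against a reference handled in color space.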

            • SoftGenetics
              Registered Vendor
              • Apr 2009
              • 36

              #7
              We use a modified BWA in our NextGENe software, which adds a couple of steps to the BWA alignment and creates a much more robust alignment. Additionally, we use a fully annotated color space reference, so no information is lost. If you would like to try it, we can supply a trial.
              John

              • drio
                Senior Member
                • Oct 2008
                • 323

                #8
                Cool, any plans to integrate that into the main bwa repo?
                -drd

                • Agent47
                  Junior Member
                  • Jan 2009
                  • 3

                  #9
                  Thanks, Elena and drio!

                  This was useful. I am trying to run the SOLiD PE barcoded analysis.
                  I have just submitted the job.
                  I hope this works.
