  • BWA - samse

    Hi,
    I am a new user of BWA. I downloaded version 0.5.7.
    My aim is to align Illumina short reads to the human genome.
    At first I had problems with a bwa samse segmentation fault, as reported by others on this site. So I used the script available with MAQ to convert my sequences: fq_all2std.pl
    Currently my FASTQ file looks like this:
    (...)
    @15
    NGCANGGCCAGAATGTTTACTCCTTTGGCTCCGTG
    +
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
    @16
    NTAGNGCAAAACCATCAATACAAGACTATAGCTGC
    +
    &,;,&,;;;;98;9;;;;;9;;888;;;9;99;;;!
    @17
    NCCANCGTCTTGTCTCCGCATACAAGTGGGTCCAT
    +
    &/6/&/512866647/025266450585)4676%%!
    @18
    NTTCNCCAGACAGGACAGAAAGGACAGCAGGTGTC
    +
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
    (...)
    I hope this is the format BWA requires...

    My first tests were on a small subset of my reads. They ran quickly, without problems.
    Now that I am testing my complete set, i.e. one file of 5 GB and another of 10 GB, I'm not sure it is working. It has been running since Tuesday, and the only line in both .sam files is "[bwa_read_seq] 0.0% bases are trimmed."
    Please let me know whether this is normal. If not, what kind of problem might I have? How long does a complete alignment take (human reads against the human genome, with default options)?
    Thanks a lot for your help

  • #2
    If I'm not mistaken, your quality scores appear to be really low. I usually convert my Illumina reads to Sanger FASTQ using the 'sol2sanger' feature in the Maq package. You might want to try it out and see how the quality scores compare.
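
One way to see which quality encoding a file is likely using is to look at the range of ASCII codes in its quality lines. The following is only a minimal sketch: the function names and the thresholds (59 as the lowest Solexa+64 character, 74 as a typical upper bound for Phred+33) are assumptions for illustration, not part of BWA or Maq.

```python
def quality_range(qual_lines):
    """Return (lowest, highest) ASCII code seen across the quality strings."""
    codes = [ord(c) for line in qual_lines for c in line]
    return min(codes), max(codes)


def guess_offset(qual_lines):
    """Heuristically guess the quality encoding from the ASCII range.

    Assumption: characters below ';' (59) cannot occur in the 64-offset
    encodings, and characters above 'J' (74) are unusual for Phred+33.
    """
    lo, hi = quality_range(qual_lines)
    if lo < 59:
        return "phred33"
    if hi > 74:
        return "offset64"
    return "ambiguous"


if __name__ == "__main__":
    # Quality line of read @16 from the first post.
    sample = ["&,;,&,;;;;98;9;;;;;9;;888;;;9;99;;;!"]
    print(quality_range(sample), guess_offset(sample))
```

Under these assumptions, the sample lines in the first post contain '!' and '%', which can only be Sanger-style (Phred+33) characters, and in that encoding they do correspond to very low scores.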

    Also, are you using the -t option of 'bwa aln' in order to take advantage of multiple CPUs?

    I believe that I was able to align ~7GB of Illumina reads (76bp SE) to the whole human genome in 3-4 hours on an 8-core workstation running Ubuntu Linux.
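
For reference, the two single-end steps discussed here (bwa aln with -t, then bwa samse) can be sketched as below. This only builds the command lines and does not run them; the helper name, the file names, and the thread count are hypothetical, and it assumes the reference has already been indexed with bwa index and that both tools write to standard output.

```python
import shlex


def bwa_single_end_commands(ref, fastq, threads=8):
    """Build argv lists for 'bwa aln' and 'bwa samse' (single-end reads).

    Assumption: output goes to stdout and is redirected to the
    hypothetical files reads.sai and reads.sam by the caller.
    """
    sai = "reads.sai"
    aln = ["bwa", "aln", "-t", str(threads), ref, fastq]
    samse = ["bwa", "samse", ref, sai, fastq]
    return aln, samse


if __name__ == "__main__":
    aln, samse = bwa_single_end_commands("hg.fa", "reads.fq", threads=8)
    # Print shell-style equivalents with the redirections made explicit.
    print(shlex.join(aln), "> reads.sai")
    print(shlex.join(samse), "> reads.sam")
```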

    Originally posted by giverny (first post, above)



    • #3
      Originally posted by sperry (post #2, above)
      Thanks for your answer, and sorry for the delay in getting back to you.
      Yes, the quality of these lines is not the best in my set; they were just a few examples.
      In the end, the problem was related to the FASTQ file.
      It certainly runs faster now, and I have results.
      Have a good day and thanks again
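
The thread does not say what exactly was wrong with the FASTQ file. A basic sanity check that is often worth running first is sketched below, assuming plain 4-line FASTQ records (no wrapped sequence lines); the function name is hypothetical. It reports records whose header or separator line looks wrong, or whose sequence and quality lines differ in length. Note that in the sample records as pasted in the first post, the quality lines appear to be one character longer than the sequence lines, although nothing in the thread confirms that this was the cause.

```python
import io


def check_fastq(handle):
    """Yield (record_number, problem) for each problem found.

    Assumption: every record is exactly 4 lines (header, sequence,
    '+' separator, quality); blank lines between records are skipped.
    """
    n = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        n += 1
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@"):
            yield n, "header does not start with '@'"
        if not plus.startswith("+"):
            yield n, "third line does not start with '+'"
        if len(seq) != len(qual):
            yield n, f"sequence length {len(seq)} != quality length {len(qual)}"


if __name__ == "__main__":
    # Read @16 exactly as pasted in the first post.
    record = (
        "@16\n"
        "NTAGNGCAAAACCATCAATACAAGACTATAGCTGC\n"
        "+\n"
        "&,;,&,;;;;98;9;;;;;9;;888;;;9;99;;;!\n"
    )
    for number, problem in check_fastq(io.StringIO(record)):
        print(number, problem)
```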



      • #4
        same problem

        Hi guys,

        I have exactly the same problem!
        Giverny, I would be very grateful if you could describe what the problem with your FASTQ file was.

        best ro



        • #5
          Hi, how did you fix the FASTQ format problem? I am using Maq's sol2sanger program and still get a segmentation fault. Please explain. Thanks.



          • #6
            Hi, did anybody figure out how to solve this problem?
            I have the same problem, but I couldn't find anything wrong with my FASTQ files.
            Thanks



            • #7
              I had BWA segmentation fault issues with bwa aln. It turned out my reference fasta file was somehow damaged (I used cat to combine the equine chromosome files into one). Once I received a working genome file, it worked without issues.
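
The post does not say how the combined reference was damaged. One failure mode that can occur when joining FASTA files with cat is a missing newline at the end of a file, which glues the next header onto a sequence line. The sketch below checks for that, for unexpected characters, and for duplicate sequence names; the function name and the accepted alphabet (IUPAC nucleotide codes in either case) are assumptions for illustration.

```python
import io

# Assumed alphabet: IUPAC nucleotide codes, upper and lower case.
IUPAC = set("ACGTUNRYSWKMBDHVacgtunryswkmbdhv")


def check_fasta(handle):
    """Return a list of (line_number, problem) for suspicious lines."""
    problems = []
    seen = set()
    for lineno, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if not name:
                problems.append((lineno, "empty header"))
            elif name in seen:
                problems.append((lineno, "duplicate sequence name"))
            seen.add(name)
        elif ">" in line:
            problems.append((lineno, "'>' inside a sequence line"))
        elif set(line) - IUPAC:
            problems.append((lineno, "unexpected characters in sequence line"))
    return problems


if __name__ == "__main__":
    # Two hypothetical chromosomes joined without a newline between them.
    broken = io.StringIO(">chr1\nACGT>chr2\nGGCC\n")
    print(check_fasta(broken))
```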
