  • #16
    Originally posted by Papillon
    I now first filter out reads with a low mapping quality within the SAM-file and reads that are 'B' flagged for over 90% by Illumina within FASTQ-files...
    This doesn't seem to work. Although the alignment looks okay at first sight, creating the SAM file fails hopelessly (as you can see below). The same thing happened when I tried to map FASTQ files in which I had trimmed the bad read ends.

    Since mapping went well with the untouched FASTQ files, would it be safe for me to assume that tampering with these files can be quite tricky?
    The only other cause could be using a newer version of BWA - v. 0.5.9b - and using the -I option for Illumina scoring.
    Those are the only differences between being able to map at all and the absolute failures.

    In retrospect, this method wasn't powerful enough to begin with, so I removed it from the pipeline.

    [bwa_sai2sam_pe_core] time elapses: 10.97 sec
    [bwa_sai2sam_pe_core] changing coordinates of 18 alignments.
    [bwa_sai2sam_pe_core] align unmapped mate...
    [bwa_paired_sw] 19395 out of 60331 Q17 singletons are mated.
    [bwa_paired_sw] 238 out of 193695 Q17 discordant pairs are fixed.
    [bwa_sai2sam_pe_core] time elapses: 22668.70 sec
    [bwa_sai2sam_pe_core] refine gapped alignments... 0.81 sec
    [bwa_sai2sam_pe_core] print alignments... 2.02 sec
    [bwa_sai2sam_pe_core] 262144 sequences have been processed.
    [bwa_sai2sam_pe_core] convert to sequence coordinate...
    [infer_isize] (25, 50, 75) percentile: (16981, 39972, 70412)
    [infer_isize] low and high boundaries: 101 and 177274 for estimating avg and std
    [infer_isize] inferred external isize from 39 pairs: 42807.359 +/- 29283.488
    [infer_isize] skewness: 0.344; kurtosis: -1.096; ap_prior: 1.00e-05
    [infer_isize] inferred maximum insert size: 220265 (6.06 sigma)
    [bwa_sai2sam_pe_core] time elapses: 11.07 sec
    [bwa_sai2sam_pe_core] changing coordinates of 12 alignments.
    [bwa_sai2sam_pe_core] align unmapped mate...
    [bwa_paired_sw] 20365 out of 59571 Q17 singletons are mated.
    [bwa_paired_sw] 253 out of 194428 Q17 discordant pairs are fixed.
    [bwa_sai2sam_pe_core] time elapses: 27222.88 sec
    [bwa_sai2sam_pe_core] refine gapped alignments... 0.83 sec
    [bwa_sai2sam_pe_core] print alignments... 2.03 sec
    [bwa_sai2sam_pe_core] 524288 sequences have been processed.
    [bwa_sai2sam_pe_core] convert to sequence coordinate...
    [infer_isize] fail to infer insert size: too few good pairs

    Comment


    • #17
      It should not be tricky, but you need to remove both sequences from a pair of reads if you decide to remove one. Although I'm not familiar with BWA, that seems like your problem.
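      A minimal sketch of what removing both reads of a pair could look like, assuming two FASTQ files with exactly four lines per record and the mates in the same order. The function names (`read_fastq`, `mostly_b`, `filter_pairs`) are hypothetical, and the 90% 'B' threshold is taken from the filter described in the quoted post:

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def mostly_b(quality, max_fraction=0.9):
    """True if more than max_fraction of the quality characters are 'B'."""
    return len(quality) > 0 and quality.count("B") / len(quality) > max_fraction


def filter_pairs(in1, in2, out1, out2, is_bad=mostly_b):
    """Write a pair only if neither mate is bad, so both outputs stay in sync.

    Assumes record i in in1 is the mate of record i in in2.
    Iteration stops at the end of the shorter input.
    """
    kept = dropped = 0
    for r1, r2 in zip(read_fastq(in1), read_fastq(in2)):
        if is_bad(r1[3]) or is_bad(r2[3]):
            dropped += 1  # drop both mates, never just one
            continue
        out1.write("\n".join(r1) + "\n")
        out2.write("\n".join(r2) + "\n")
        kept += 1
    return kept, dropped
```

      Because a pair is either written to both outputs or to neither, record i in the first output is still the mate of record i in the second.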

      Comment


      • #18
        In the end I only removed ~109,000 reads out of ~92,000,000 per FASTQ file, so I expected that BWA would only have difficulty pairing those few reads. Trimming reads (rather than removing them) seemed to cause similar problems, although I have to admit that more than ~50,000 reads were 100% Q2/B flagged, so trimming those would be identical to removing them.

        Somewhere on this forum, the same problem is described (thread: 'BWA sampe hanging'). Same symptoms, slightly different causes, but it seems to agree with your answer that all pairs have to be lined up correctly and it seems that a small change can cause serious problems.

        Comment


        • #19
          I'm not familiar with BWA, but on bowtie, it just uses the read order to find read pairs, so as soon as you remove one read but not its pair, all subsequent reads go out of alignment and can't be correctly paired.
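          To find where two files go out of step, one option is to compare the read names position by position. This is only a sketch: it assumes four lines per FASTQ record and that mates share a name up to an optional trailing `/1` or `/2` (or anything after the first whitespace). The helper names `pair_key`, `read_names` and `first_unsynced_pair` are made up for illustration:

```python
from itertools import zip_longest


def pair_key(header):
    """Reduce a FASTQ header line to the part both mates are assumed to share."""
    fields = header.strip().split()
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def read_names(handle):
    """Yield the pair key of every record, assuming 4 lines per record."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            yield pair_key(line)


def first_unsynced_pair(in1, in2):
    """Return the 0-based index of the first record whose names differ.

    Returns None if both inputs have the same names in the same order.
    A length difference also counts as a mismatch at the first missing record.
    """
    names = zip_longest(read_names(in1), read_names(in2))
    for index, (n1, n2) in enumerate(names):
        if n1 != n2:
            return index
    return None
```

          If this returns an index, every pair from that record onward is suspect, which would fit a failure that only starts partway through the run.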

          Comment


          • #20
            Thank you for responding! I remapped my data using the latest version of BWA and it worked out great, so the most likely cause of my previous failure is indeed that the read order was disturbed, which in turn caused all the problems.

            BWA does seem to try to fix it, but that takes an insane amount of time and in the end the results are not what you would expect (see my previous copy-paste of BWA's output).

            Comment
