  • dmurdock
    Junior Member
    • Mar 2010
    • 9

    bfast postprocess -U option

    In the latest version of bfast (0.6.4b) does anyone have any experience with the -U option in the postprocess command? In the previous version it seems this wasn't present and I just used the standard -a 3 -O 3 options. How is the output different using -U? Also the bfast guide example commands need to be updated in that -O 3 should be replaced with -O 1 for sam output. Thanks!

    David Murdock
    Baylor College of Medicine
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    Great catch in the manual! The typo is fixed in the latest source and thus will be there in the next release.

    You caught the "silent" upgrade. With the "-U" option, the output is the same as in prior versions. Without "-U" (now the default), the alignment for each end of a paired-end or mate-pair read is selected so that the empirical insert size distribution, as well as the inversion ratio, is taken into account. This allows an ambiguous read (two or more equally likely alignments) to be anchored by its mate. I have seen this improve both power and accuracy.
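As a rough illustration of the mate-anchoring idea (not BFAST's actual implementation), the sketch below chooses, among the candidate alignments of each end, the pair whose implied insert size is most plausible. The function name `pick_pair`, the `(position, score)` tuple layout, and the normal approximation of the empirical insert size distribution are all assumptions made for this example; the inversion ratio is ignored.

```python
import math
from itertools import product


def pick_pair(cands1, cands2, mean_insert, sd_insert):
    """Choose one alignment per end so the implied insert size is most plausible.

    cands1, cands2: lists of (position, alignment_score) tuples, one entry per
    candidate alignment of each end (hypothetical layout for this sketch).
    mean_insert, sd_insert: mean and standard deviation of the empirical
    insert size distribution, approximated here as a normal distribution.
    Returns (pos1, pos2), or None if either end has no candidates.
    """
    best, best_score = None, -math.inf
    for (pos1, s1), (pos2, s2) in product(cands1, cands2):
        insert = abs(pos2 - pos1)
        z = (insert - mean_insert) / sd_insert
        # Alignment scores plus the normal log-density (up to a constant).
        pair_score = s1 + s2 - 0.5 * z * z
        if pair_score > best_score:
            best, best_score = (pos1, pos2), pair_score
    return best


# End 1 has two equally scored hits (ambiguous); end 2 maps uniquely.
end1 = [(1_000, 50.0), (250_000, 50.0)]
end2 = [(2_500, 50.0)]
chosen = pick_pair(end1, end2, mean_insert=1_500, sd_insert=200)
print(chosen)
```

In this toy case the hit at position 1,000 is kept for end 1, because only that choice gives an insert size close to the assumed mean of 1,500.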

    A next step is to add a feature that for unpaired reads (one end maps, the other doesn't) will examine the nearby region implied by the mate pair. This may be quite expensive, especially for color space and/or gapped alignment, but I have seen it successfully used in Novoalign and BWA to improve mapping power while preserving accuracy.


    • eyalbd
      Member
      • Apr 2010
      • 11

      #3
      Hi, I have a related comment and also a question. First, the entire postprocess command example in the manual in the SOLiD section 7.1.2 is outdated, and it would be great if you could update it.
      Second, I noticed that postprocess has a -A 1 option for color space (which is not even mentioned in the book but exists in the help output). Should this be used in SOLiD alignments, or is the output of the align command already in NT space?

      Thanks for this awesome tool,
      Eyal


      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        Here's where "release early, release often" allows documentation to fall out of date. The latest git master branch has an updated manual.

        The "-A" option should be set wherever possible. It now is included in the "postprocess" step (version 0.6.4* and onwards). I apologize for its sudden inclusion and ambiguity.


        • eyalbd
          Member
          • Apr 2010
          • 11

          #5
          Thanks Nils.

          I ran postprocess as written in the old manual (except for adding -A 1); I hope there aren't more changes.
          The output SAM file is about twice as large as the one I get with BWA or bowtie, even though the number of alignments is similar. Is BFAST also writing unaligned reads to the SAM file?
          What could cause this size difference?


          • nilshomer
            Nils Homer
            • Nov 2008
            • 1283

            #6
            All reads, aligned and unaligned, should be present in the SAM/BAM file (let me know if they are not). There are also a fair number of optional SAM tags in each alignment record that BWA/bowtie may or may not also produce. These tags are used to annotate each alignment and can help downstream tools. If you don't want to store the optional tags, you can always rip them out using awk/perl (they always appear as the last N columns etc.). Remember to convert your SAM file to BAM for good compaction and compression, as well as fast record retrieval.
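A minimal sketch of the tag-stripping idea, written in Python rather than awk/perl. It relies on the SAM convention that the first 11 tab-separated columns are mandatory and everything after them is optional tags; the helper names and the example records are made up for illustration. The second helper counts records with the unmapped flag bit (0x4), which is one way to see how many unaligned reads a file contains.

```python
def strip_optional_tags(sam_line):
    """Return a SAM line with only the 11 mandatory columns kept."""
    if sam_line.startswith("@"):
        return sam_line  # header lines pass through unchanged
    fields = sam_line.rstrip("\n").split("\t")
    return "\t".join(fields[:11]) + "\n"


def count_unmapped(sam_lines):
    """Count alignment records whose FLAG has the 0x4 (unmapped) bit set."""
    n = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:
            n += 1
    return n


# Made-up records: one mapped read with two optional tags, one unmapped read.
records = [
    "@HD\tVN:1.0\n",
    "read1\t0\tchrM\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tRG:Z:grp1\n",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
stripped = [strip_optional_tags(r) for r in records]
unmapped = count_unmapped(records)
print(stripped[1], end="")
print(unmapped)
```

Dropping tags discards information that downstream tools may rely on, so this is probably best applied to a copy of the file.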


            • eyalbd
              Member
              • Apr 2010
              • 11

              #7
              Thanks, I think I managed to get it to work. I think the alignment worked well, although I'm having trouble calling SNPs with pileup on the results. The problem is that every base is called as a deletion. This could also explain why I got such a huge SAM file, as well as why pileup is now taking hours to run.
              Thanks again for all your help!
              Last edited by eyalbd; 04-22-2010, 06:22 AM.


              • nilshomer
                Nils Homer
                • Nov 2008
                • 1283

                #8
                Did you try running the "samtools.pl varFilter" found in the "misc" directory of samtools (remember to adjust based on coverage)? Try filtering based on SNP quality.
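To illustrate what filtering on SNP quality means, here is a small Python sketch. It is not a substitute for varFilter. It assumes the consensus pileup layout (as produced by `samtools pileup -c`), in which column 3 is the reference base, column 4 the consensus base, column 6 the SNP quality, and column 8 the read depth; that layout, the thresholds, and the example lines are assumptions for this example, and indel lines (reference column `*`) are skipped.

```python
def filter_pileup(lines, min_snp_qual=20, max_depth=100):
    """Keep SNP lines with sufficient SNP quality and non-excessive depth.

    Assumes the consensus pileup column layout described in the text above.
    """
    kept = []
    for line in lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 8 or f[2] == "*":
            continue  # malformed line or indel line; not handled in this sketch
        ref, cons = f[2].upper(), f[3].upper()
        snp_qual, depth = int(f[5]), int(f[7])
        if ref != cons and snp_qual >= min_snp_qual and depth <= max_depth:
            kept.append(line)
    return kept


# Made-up pileup lines for illustration only.
pileup = [
    "chrM\t10\tA\tG\t45\t45\t60\t30\tGGGG\tIIII\n",  # confident SNP
    "chrM\t11\tC\tT\t5\t5\t60\t30\tTT..\tIIII\n",    # low SNP quality
    "chrM\t12\tG\tG\t60\t0\t60\t30\t....\tIIII\n",   # matches the reference
]
kept = filter_pileup(pileup)
print(len(kept))
```

The depth cutoff plays the role of "adjust based on coverage": regions with depth far above the expected coverage are often repeats or artifacts.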


                • eyalbd
                  Member
                  • Apr 2010
                  • 11

                  #9
                  Thanks, I'll try that. However, as every base, even those called like the reference, is called as a deletion (meaning, if I understand correctly, that it aligned it to the correct spot on the reference but as if it came sooner), the problem would seem to be more profound. My coverage, for some reason, starts from base 2 in the reference, not base 1. As the mitochondria genome is circular, could this lead to an artifact which also confuses the alignment? It's highly improbable I really don't have coverage for this base, as I have a lot of coverage for the adjacent bases and the mitochondrial genome is, again, circular.
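One way to check the "coverage starts at base 2" observation directly from the SAM file is to walk each record's POS and CIGAR, as in the sketch below. It counts a reference position as covered only for M, = and X operations; D and N advance along the reference without adding coverage. The function name and the example records are hypothetical.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def coverage_from_sam(lines):
    """Return a Counter mapping 1-based reference position to read depth."""
    cov = Counter()
    for line in lines:
        if line.startswith("@"):
            continue  # skip header lines
        f = line.rstrip("\n").split("\t")
        flag, pos, cigar = int(f[1]), int(f[3]), f[5]
        if flag & 0x4 or cigar == "*":
            continue  # unmapped read or no alignment information
        ref_pos = pos
        for length, op in CIGAR_OP.findall(cigar):
            n = int(length)
            if op in "M=X":
                for p in range(ref_pos, ref_pos + n):
                    cov[p] += 1
            if op in REF_CONSUMING:
                ref_pos += n
    return cov


# Made-up records: both reads start after reference position 1.
sam = [
    "r1\t0\tchrM\t2\t255\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t0\tchrM\t3\t255\t2M1D2M\t*\t0\t0\tACGT\tIIII\n",
]
cov = coverage_from_sam(sam)
print([cov[p] for p in range(1, 8)])
```

With a linear reference for a circular genome, reads spanning the origin cannot be placed as a single contiguous alignment, so thin coverage at the very first and last positions is not surprising by itself.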

                  Another problem with the SAM output I get from BFAST is that tview can't display it, for some reason (it works for me on Bowtie or BWA output with the same reference).

                  Thanks
                  Eyal


                  • nilshomer
                    Nils Homer
                    • Nov 2008
                    • 1283

                    #10
                    Originally posted by eyalbd View Post
                    Thanks, I'll try that. However, as every base, even those called like the reference, is called as a deletion (meaning, if I understand correctly, that it aligned it to the correct spot on the reference but as if it came sooner), the problem would seem to be more profound. My coverage, for some reason, starts from base 2 in the reference, not base 1. As the mitochondria genome is circular, could this lead to an artifact which also confuses the alignment? It's highly improbable I really don't have coverage for this base, as I have a lot of coverage for the adjacent bases and the mitochondrial genome is, again, circular.
                    I would be happy to take a look if you want to give me the reads and/or the SAM/BAM file.


                    Originally posted by eyalbd View Post
                    Another problem with the SAM output I get from BFAST is that tview can't display it, for some reason (it works for me on Bowtie or BWA output with the same reference).

                    Thanks
                    Eyal
                    Don't use tview from samtools, since it doesn't fully support the specification (with respect to indels). You can use IGV from the Broad Institute, which I use remotely every day so as not to have to download each BAM file from our servers.
