
  • Samtools mpileup. Which reads are excluded during SNP calling?

    Hi All,

    I have two questions about Samtools mpileup. Hope somebody can help me!

    First, I found that 'mpileup' filters out some reads/nucleotides before SNP calling (compared to the 'pileup' command). How can I find out which reads are actually filtered?

    Second, in 'pileup' output, we can directly see the nucleotides and corresponding base quality scores mapped to one position. Does mpileup provide similar output, or does it only generate the VCF file?

    Many thanks!

  • #2
    As I understand it, we shouldn't be using pileup anymore. Yes, the output is nice, but all of the same information is (somewhere) in the VCF file.

    As for what's being filtered out, however, I don't know. One of the default options is

    -Q INT skip bases with baseQ/BAQ smaller than INT [13]

    This suggests to me that you might be losing some bases with low baseQs or those with low BAQ scores (explained here: http://samtools.sourceforge.net/mpileup.shtml)
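
To make the -Q cutoff concrete, here is a minimal sketch that counts how many qualities in a string would fall below a threshold. It assumes Phred+33 encoded qualities and only looks at the raw base quality; it does not compute BAQ. The function name count_below_q and the sample string are made up for illustration and are not part of samtools.

```shell
# count_below_q QUALS [MIN_Q]
# Counts the characters in a Phred+33 quality string whose decoded
# quality is smaller than MIN_Q (default 13, matching the -Q default).
# Hypothetical helper, not a samtools command.
count_below_q() {
  quals=$1
  min_q=${2:-13}
  below=0
  while [ -n "$quals" ]; do
    rest=${quals#?}        # everything after the first character
    ch=${quals%"$rest"}    # the first character itself
    # printf with a leading quote prints the character's ASCII code
    code=$(printf '%d' "'$ch")
    if [ $((code - 33)) -lt "$min_q" ]; then
      below=$((below + 1))
    fi
    quals=$rest
  done
  echo "$below"
}

# 'I+5#?' decodes to Q40, Q10, Q20, Q2, Q30
count_below_q 'I+5#?' 13
```

With the default of 13, two of the five bases in the sample string (Q10 and Q2) are under the cutoff, so the call prints 2.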

    I wish I could be of more help.



    • #3
      Originally posted by dagarfield
      As I understand it, we shouldn't be using pileup anymore. Yes, the output is nice, but all of the same information is (somewhere) in the VCF file.

      As for what's being filtered out, however, I don't know. One of the default options is

      -Q INT skip bases with baseQ/BAQ smaller than INT [13]

      This suggests to me that you might be losing some bases with low baseQs or those with low BAQ scores (explained here: http://samtools.sourceforge.net/mpileup.shtml)

      I wish I could be of more help.

      Thank you! So we cannot get the specific nucleotides and corresponding base qualities from samtools anymore, right?



      • #4
        It is not obvious to me where that information is in the VCF file, if it is in there at all. However, you might be able to get something in the file generated by mpileup.

        Rather than generating a BCF formatted file with mpileup (as is outlined on the man page for mpileup), have you tried running mpileup without the -u and -g options? The output looks a whole lot like the output from old pileup.
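
For reference, the text output for a single BAM file has six tab-separated columns: chromosome, 1-based position, reference base, depth, read bases, and base qualities. Below is a rough sketch of pulling those fields apart with awk. The two sample lines are invented for illustration, and show_pileup is a hypothetical helper name, not a samtools command.

```shell
# show_pileup: reads text pileup lines on stdin and labels the six columns.
# Hypothetical helper for illustration only.
show_pileup() {
  awk -F'\t' '{ printf "%s:%s ref=%s depth=%s bases=%s quals=%s\n", $1, $2, $3, $4, $5, $6 }'
}

# Invented sample: "." is a match on the forward strand, "," a match on
# the reverse strand, and a letter is a mismatch to the reference.
printf 'chr1\t100\tA\t3\t.,G\tII+\nchr1\t101\tC\t2\t..\tI#\n' | show_pileup
```

In practice you would pipe the output of samtools mpileup (run without -u and -g) into the function instead of the printf line.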

        --DG



        • #5
          I've tried to run simply

          Code:
          samtools mpileup -f ref.fasta -b bam > out
          but I get "Segmentation Fault" almost immediately.
          @dagarfield
          Any ideas on what is going on? I thought I'd drop you a question here because you said you have run it without the -u and -g options.

          Thanks,
          Gareth



          • #6
            I just ran mine with the following syntax

            Code:
            samtools mpileup -f myGenome.fasta myBam.bam > myoutput.txt
            Where myGenome.fasta is a fasta file on which I have run the command

            Code:
            samtools faidx myGenome.fasta
            In the same directory in which myGenome.fasta lives.

            This looks pretty much like what you did except for the -b option. I think you can (and maybe should) leave that out when you are running just a single BAM file. For the -b option, you'd specify not a BAM file but rather a file that contains a list of the BAM files you want to analyze.
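
If you do want -b, a minimal sketch would look like the following. The BAM names and bam_list.txt are placeholders I made up; the list file simply has one BAM path per line. The samtools call is guarded so the snippet does nothing harmful when samtools or the input files are missing.

```shell
# Write a list file with one BAM path per line (placeholder names).
printf '%s\n' sample1.bam sample2.bam > bam_list.txt

# Run mpileup on the listed BAMs only if samtools and the inputs exist.
if command -v samtools >/dev/null 2>&1 \
   && [ -f myGenome.fasta ] && [ -f sample1.bam ] && [ -f sample2.bam ]; then
  samtools mpileup -f myGenome.fasta -b bam_list.txt > myoutput.txt
fi
```

Passing a BAM file itself to -b, as in the command above that segfaulted, makes samtools try to read the binary BAM as a text list of file names.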

            How'd that work?



            • #7
              You are right about the -b parameter taking a list of BAM files rather than a single BAM file. I think it's time for some coffee.

              Thanks!
