Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Samtools/bcftools problem

    I'm using samtools to look for SNPs with ultradeep (>18,000X) coverage. For some reason samtools and bcftools does not call some of the SNPs.

    Code:
    gi|251831106|ref|NC_012920.1|	16125	.	G	.	99	.	DP=18552;AF1=0;CI95=1.5,0;DP4=832,646,0,1;MQ=35;PV4=0.44,0.14,1,1	PL	0
    gi|251831106|ref|NC_012920.1|	16126	.	T	C	99	.	DP=18770;AF1=1;CI95=1,1;DP4=28,21,777,596;MQ=35;PV4=1,0,0.00042,0.28	PL	219,255,0
    For example, at position 16215, IGV gives me:
    Total count: 18522
    A: 17030 (92%, 8124+, 8906-)
    C: 0
    G: 1516 (8%, 861+, 655-)
    T: 5 (0%, 4+, 1-)
    N:1 (0%, 1+, 0-)

    Sam/bcftools should be making a SNP call at position 16215 but is not. Also, the first two counts in DP4 almost agree between bcftools and IGV, but the third and fourth do not. The sequences are quality trimmed and the reads containing the A allele are not of low quality.

    Anyone know why? Thanks.

  • #2
    Did you get any response to your post?

    I have similar data with ultra deep coverage where the total depth counts seen in IGV match the bcftools DP but the alternate allele counts are close but do not match those in the DP4 bcftools output.

    I assume this is to do with the quality of the calls but I thought I had disabled this with -BQ0 when generating the mpileup

    Any ideas?

    Thanks

    Comment


    • #3
      Originally posted by abelhj View Post
      I'm using samtools to look for SNPs with ultradeep (>18,000X) coverage. For some reason samtools and bcftools does not call some of the SNPs.
      Hi,

      Can you post the commands of samtools/bcftools you are using? If you use vcfutils.pl varFilter there is a -D option that does not call SNPs exceeding D depth (the default is 10000000 so probably this is not the problem if you didn't change it!)

      Sorry I can't be of more help...

      All the best
      Dario

      Comment


      • #4
        Hi, thanks for the quick response

        I am simply running

        samtools mpileup -l site.txt -BQ0 -d1000000 -uf ref.fasta test.sorted.bam > test_mpileup.bcf

        bcftools view -Acg test_mpileup.bcf

        I am just after the number of reads for each allele at a specific position, like the display you can get in IGV that the OP showed

        Comment


        • #5
          Maybe try:

          mpileup -l site.txt -BQ0 -d1000000 ref.fasta test.sorted.bam > test_mpileup.pileup

          To see what the quality of your alternate letters are according to mpileup. Then look at the individual sam entries for those reads. Maybe there's a discrepancy there.

          Comment


          • #6
            Thanks for the reply, I've just come across VarScan (http://varscan.sourceforge.net/index.html) which works on the pileup output from samtools.

            Using this I am able to get the counts per allele per position in the pileup which also matches the IGV calls

            All the best

            Chris

            Comment

            Latest Articles

            Collapse

            • seqadmin
              Recent Advances in Sequencing Analysis Tools
              by seqadmin


              The sequencing world is rapidly changing due to declining costs, enhanced accuracies, and the advent of newer, cutting-edge instruments. Equally important to these developments are improvements in sequencing analysis, a process that converts vast amounts of raw data into a comprehensible and meaningful form. This complex task requires expertise and the right analysis tools. In this article, we highlight the progress and innovation in sequencing analysis by reviewing several of the...
              05-06-2024, 07:48 AM
            • seqadmin
              Essential Discoveries and Tools in Epitranscriptomics
              by seqadmin




              The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
              04-22-2024, 07:01 AM

            ad_right_rmr

            Collapse

            News

            Collapse

            Topics Statistics Last Post
            Started by seqadmin, Today, 07:03 AM
            0 responses
            10 views
            0 likes
            Last Post seqadmin  
            Started by seqadmin, 05-10-2024, 06:35 AM
            0 responses
            31 views
            0 likes
            Last Post seqadmin  
            Started by seqadmin, 05-09-2024, 02:46 PM
            0 responses
            41 views
            0 likes
            Last Post seqadmin  
            Started by seqadmin, 05-07-2024, 06:57 AM
            0 responses
            33 views
            0 likes
            Last Post seqadmin  
            Working...
            X