Does bcftools generate valid VCF files?


  • Does bcftools generate valid VCF files?

    Hello,
    I am working with the output from a bcftools -cgA command, and it seems to be different from the VCF specification.
    In particular it seems to show in some positions in the gene a REF as a string of various letters and in the ALT only a dot (see screenshot). The INFO field calls an INDEL. I assume that this means a deletion, and that the ALT is only the first base, in the following manner:
    REF= CGGGT #sequence should be ...CGGGT...
    ALT= . #sequence is ...C...

    However, the VCF 4.1 format would call for a C instead of a dot (see screenshot). The header of the file specifies:
    ##fileformat=VCFv4.1
    ##samtoolsVersion=0.1.18 (r982:295)

    Am I correct in my interpretation of the deletion?

    Thank you,
    Keo.

  • #2
    Hmmm,

    my VCF files contain two bases, reference and alternative.

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
    1 717242 . T C 20 . DP=7;VDB=0.0167; .....................

    Are you sure you used consistent headers all the way through? One reason Samtools fails to
    fill in the correct base is header mismatching (search this forum).

    Alternatively, I would have a look at the BAMs in Tablet or similar.


    • #3
      Originally posted by colindaven
      Hmmm,

      my VCF files contain two bases, reference and alternative.

      #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
      1 717242 . T C 20 . DP=7;VDB=0.0167; .....................

      Are you sure you used consistent headers all the way through? One reason Samtools fails to
      fill in the correct base is header mismatching (search this forum).

      Alternatively, I would have a look at the BAMs in Tablet or similar.
      His example is a deletion, not a substitution.

      You can try to ask the samtools mailing list. http://sourceforge.net/mailarchive/f...=samtools-help


      • #4
        That's correct, but you should see the bases listed there as well.

        For a deletion that would be something like this:

        1 1900024 . TG T 57.5 . INDEL;DP=28;

        Note the TG to T 1bp deletion.
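
A minimal Python sketch of how a script could read such a record. It assumes the common left-anchored representation, where ALT is a prefix of REF; the function name `deleted_bases` is made up for illustration and is not part of samtools or bcftools.

```python
def deleted_bases(ref: str, alt: str) -> str:
    """Return the bases removed by a simple left-anchored deletion.

    Returns an empty string when the record is not such a deletion,
    including when ALT is the missing value ".".
    """
    # ALT "." means no alternate allele was reported, so nothing is deleted.
    if alt == "." or not alt:
        return ""
    # A deletion has a shorter ALT than REF.
    if len(alt) >= len(ref):
        return ""
    # Assumption: ALT is the leading part of REF (anchor base first).
    if not ref.startswith(alt):
        return ""
    return ref[len(alt):]


if __name__ == "__main__":
    print(deleted_bases("TG", "T"))     # the record quoted above
    print(deleted_bases("CGGGT", "."))  # REF with a dot in ALT
```

Records where the shared bases sit on the right-hand side of the alleles would need extra trimming that this sketch does not attempt.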


        • #5
          Having a dot in the ref column is not right. And, if you look at your DP4, they all seem to show 0 reads for the alt forward and reverse reads.

          So something is wrong with your vcf. I use samtools and bcftools all the time, and I have no problem calling small indels.


          • #6
            Hi,
            Thank you all for your suggestions, I'll try in the samtools mailing list. Just some pointers about the problem:
            The BAM files open in IGV with no problem and show the deletions in the correct number of reads and in the correct orientation, as given by the DP4 field. The problem only seems to occur at positions with very few reads (< 2+, 2-). At positions where the deletion is better represented, the format is correct.
            I'm starting to guess that the problem is that I used the -Q 20 parameter in mpileup to filter bases with less than 20 QV, but maybe this filtering is performed after some indexing with all deletions counted, so that it removes the ALT value, but leaves the REF value.
            Just to clarify, swbarnes2: I don't see any dots in the REF column, only in the ALT column, which appears to be valid (see the second-to-last row in the first example VCF on the wiki). Thanks for pointing out the low coverage.
            It is clear that the problematic positions are not really small deletions, so I don't have to worry. My concern was that I'm writing a script that can read VCF, and I needed to be sure it could read from bcftools.
            I'll post a reply if I find a solution.
            Best, Keo.
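
For anyone writing a similar reader, here is a rough Python sketch of parsing one VCF data line so that a dot in ALT is tolerated. It treats "." as the missing value (no alternate allele), which is how the VCF 4.1 specification defines it. The names `Record` and `parse_vcf_line` are invented for this example, and the second test line in the usage part is made-up data, not taken from the attached files.

```python
from typing import Dict, List, NamedTuple, Union


class Record(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alts: List[str]                    # empty when ALT is "."
    info: Dict[str, Union[str, bool]]  # flags such as INDEL map to True


def parse_vcf_line(line: str) -> Record:
    """Parse the eight fixed columns of a tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated columns")
    chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]

    # "." in ALT is the missing value: no alternate allele at this site.
    alts = [] if alt == "." else alt.split(",")

    info_dict: Dict[str, Union[str, bool]] = {}
    if info != ".":
        for item in info.split(";"):
            if not item:  # tolerate a trailing ";"
                continue
            key, sep, value = item.partition("=")
            info_dict[key] = value if sep else True
    return Record(chrom, int(pos), ref, alts, info_dict)


if __name__ == "__main__":
    # Deletion record quoted earlier in the thread.
    print(parse_vcf_line("1\t1900024\t.\tTG\tT\t57.5\t.\tINDEL;DP=28;"))
    # Hypothetical line with a dot in ALT, invented for illustration.
    print(parse_vcf_line("1\t100\t.\tCGGGT\t.\t10\t.\tINDEL;DP=3"))
```

A reader built this way can flag sites where INDEL is set but `alts` is empty, rather than failing on them.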