  • Does bcftools generate valid VCF files?

    Hello,
    I am working with the output from a bcftools -cgA command, and it seems to be different from the VCF specification.
    In particular, at some positions in the gene the REF column holds a string of several bases while the ALT column holds only a dot (see screenshot). The INFO field flags the record as an INDEL. I assume this means a deletion, and that the ALT allele is only the first base of REF, as follows:
    REF= CGGGT #sequence should be ...CGGGT...
    ALT= . #sequence is ...C...

    However, the VCF 4.1 format would call for a C instead of a dot (see screenshot). The header of the file specifies:
    ##fileformat=VCFv4.1
    ##samtoolsVersion=0.1.18 (r982:295)

    Am I correct in my interpretation of the deletion?

    Thank you,
    Keo.

  • #2
    Hmmm,

    my VCF files contain two bases, reference and alternative.

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
    1 717242 . T C 20 . DP=7;VDB=0.0167; .....................

    Are you sure you used consistent headers all the way through? One of the reasons Samtools doesn't add the correct base is header mismatching (search this forum).

    Alternatively, I would have a look at the BAMs in Tablet or similar.



    • #3
      Originally posted by colindaven
      Hmmm,

      my VCF files contain two bases, reference and alternative.

      #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT
      1 717242 . T C 20 . DP=7;VDB=0.0167; .....................

      Are you sure you used consistent headers all the way through? One of the reasons Samtools doesn't add the correct base is header mismatching (search this forum).

      Alternatively, I would have a look at the BAMs in Tablet or similar.
      His example is a deletion, not a substitution.

      You could try asking on the samtools mailing list: http://sourceforge.net/mailarchive/f...=samtools-help



      • #4
        That's correct, but you should see the bases listed there as well.

        For a deletion that would be something like this:

        1 1900024 . TG T 57.5 . INDEL;DP=28;

        Note the TG to T 1bp deletion.
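The REF/ALT convention described in this post can be checked programmatically. Below is a minimal Python sketch, assuming records with a single ALT allele made of plain bases (no symbolic alleles). The function name `classify_variant` is made up for illustration and is not part of samtools or bcftools. It compares allele lengths only, so it does not verify that REF and ALT share an anchor base.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair from a VCF data line by allele length.

    Assumption: a single ALT allele written as plain bases.
    """
    if alt == ".":
        # "." is the VCF missing value: no alternate allele is given.
        return "no_alt"
    if len(ref) == len(alt):
        # Same length: single-base or multi-base substitution.
        return "snp" if len(ref) == 1 else "mnp"
    if len(ref) > len(alt):
        # REF is longer than ALT, so bases were removed.
        return "deletion"
    # ALT is longer than REF, so bases were added.
    return "insertion"


if __name__ == "__main__":
    print(classify_variant("TG", "T"))     # the 1 bp deletion shown above
    print(classify_variant("T", "C"))      # the substitution from post #2
    print(classify_variant("CGGGT", "."))  # the record from the original question
```

With this rule, the TG to T record is reported as a deletion, while a record whose ALT is only a dot is reported as having no alternate allele at all rather than as a deletion.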



        • #5
          Having a dot in the ref column is not right. And if you look at your DP4 values, they all seem to show 0 alt reads on both the forward and the reverse strand.

          So something is wrong with your VCF. I use samtools and bcftools all the time, and I have no problem calling small indels.
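The DP4 check mentioned here can be scripted. The sketch below assumes the samtools convention that DP4 lists four counts in the order ref-forward, ref-reverse, alt-forward, alt-reverse. The helper names `parse_dp4` and `has_alt_support`, and the INFO strings in the usage lines, are made up for illustration.

```python
from typing import Optional, Tuple


def parse_dp4(info: str) -> Optional[Tuple[int, int, int, int]]:
    """Return (ref_fwd, ref_rev, alt_fwd, alt_rev) from an INFO string.

    Returns None when the INFO string has no DP4 entry.
    """
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP4":
            ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in value.split(","))
            return ref_fwd, ref_rev, alt_fwd, alt_rev
    return None


def has_alt_support(info: str) -> bool:
    """True if at least one read on either strand supports the ALT allele."""
    dp4 = parse_dp4(info)
    return dp4 is not None and (dp4[2] + dp4[3]) > 0


if __name__ == "__main__":
    # Hypothetical INFO strings, not taken from the poster's file.
    print(has_alt_support("INDEL;DP=5;DP4=3,2,0,0"))
    print(has_alt_support("INDEL;DP=28;DP4=1,1,14,12"))
```

A record that is flagged as INDEL but has zero alt-forward and alt-reverse counts is a candidate for the kind of inconsistency discussed in this thread.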



          • #6
            Hi,
            Thank you all for your suggestions; I'll try the samtools mailing list. Some more details about the problem:
            The BAM files open in IGV with no problem and show the deletions in the correct number of reads and in the correct orientation, as given by the DP4 field. However, this problem only appears at positions with very few (< 2+, 2-) reads. At positions where the deletion is better represented, the format is correct.
            I'm starting to suspect that the cause is the -Q 20 parameter I used in mpileup to filter out bases with a quality value below 20. Maybe this filtering is performed after some indexing step in which all deletions are counted, so that it removes the ALT value but leaves the REF value.
            Just to clarify, swbarnes2: I don't see any dots in the REF column, only in the ALT column, which appears to be valid (see the second-to-last row of the first example in the VCF wiki). Thanks for pointing out the low coverage.
            It is clear that the problematic positions are not really small deletions, so I don't have to worry. My concern was that I'm writing a script that reads VCF, and I needed to be sure it could read bcftools output.
            I'll post a reply if I find a solution.
            Best, Keo.
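For a script that has to read bcftools output, one way to cope with these records is to treat a dot in ALT as "no alternate allele" instead of as a base. The following is a minimal sketch, assuming tab-separated data lines with at least the eight fixed VCF columns. The names `VcfRecord` and `parse_vcf_line`, and the second example line, are hypothetical and not taken from any real file.

```python
from typing import List, NamedTuple


class VcfRecord(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alts: List[str]  # empty when ALT is the missing value "."
    info: str


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse one tab-separated VCF data line (not a '#' header line)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated columns")
    chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
    # ALT may list several alleles separated by commas; "." means none.
    alts = [] if alt == "." else alt.split(",")
    return VcfRecord(chrom, int(pos), ref, alts, info)


if __name__ == "__main__":
    # First line mirrors the deletion example from post #4.
    print(parse_vcf_line("1\t1900024\t.\tTG\tT\t57.5\t.\tINDEL;DP=28"))
    # Second line is a made-up record with ALT given only as a dot.
    print(parse_vcf_line("gene1\t100\t.\tCGGGT\t.\t10\t.\tINDEL;DP=3"))
```

Returning an empty list for ALT lets downstream code skip such positions explicitly, rather than guessing which bases were deleted.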