Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • manore
    Member
    • Jun 2011
    • 19

    bcftools view generates an empty output

    Hello,


    file.sam
    #I convert SAM to BAM
    Code:
    samtools view -bS file -o file.bam
    #I sort BAM file
    Code:
    samtools sort file.bam file.srt
    I remove duplicat, then I detect variants with samtools mpileup

    Code:
    samtools mpileup -uf genome.fasta srt_file.bam | bcftools view -bvcg - > file_raw.bcf
    I obtain no errors, bcf file contain data. (file is about gigabase). => samtools mpileup generates an empty output.

    When I convert bcf into vcf, I obtain
    Code:
    [mpileup] 1 samples in 1 input files
    <mpileup> Set max per-sample depth to 8000
    and different lines with
    Code:
    [bcfview] 434500000 sites processed.
    [afs] 0:54828.077 1:20102.090 2:25069.833
    [bcfview] 434600000 sites processed.
    [afs] 0:44750.569 1:24548.359 2:30701.072
    but the vcf file only contains the header and this line :
    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT srt_file.bam

    #convert bcf into vcf
    Code:
    bcftools view file_raw.bcf | vcfutils.pl varFilter -D100 > variant.flt.vcf
    Someone can help me?
    Thanks in advance
  • swbarnes2
    Senior Member
    • May 2008
    • 910

    #2
    Sounds like it called no SNPs. Maybe you should generate the pileup, and eyeball some of it it to see if everything looks alright.

    Comment

    • manore
      Member
      • Jun 2011
      • 19

      #3
      I decide to proceed step by step,

      Code:
      samtools mpileup -uf genome.fasta srt_file.bam > output_mpileup
      [mpileup] 1 samples in 1 input files
      <mpileup> Set max per-file depth to 8000

      ls -lah sortie_mpileup
      xxx 35G Feb x xxx output_mpileup

      Code:
      bcftools view -bvcg output_mpileup > output_bcf
      ls -lah output_bcf
      xxx 1.3G Feb x xxx output_bcf
      bcftools view output_bcf | more

      output_bcf file is not empty!

      Code:
      bcftools view output_bcf | vcfutils.pl varFilter -D100 > variant.flt.vcf
      variant.flt.vcf file only contains the header and this line :
      #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT srt_file.bam


      I don't know where is the problem!

      Comment

      • swbarnes2
        Senior Member
        • May 2008
        • 910

        #4
        Again, why don't you manaully inspect the pileup? Is filtering everything with a coverage higher than 100 really appropriate for your dataset?

        Comment

        • manore
          Member
          • Jun 2011
          • 19

          #5
          Thanks,

          The problem is that vcf output indiscates only N as an SNP!

          Code:
          less xxx.vcf
          #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT C01HGABXX_002_s_7_brut_rm.bam
          chromosome:xxx 400 . N A 3.02 . DP=1;AF1=1;CI95=0.5,1;DP4=0,0,1,0;MQ=35;FQ=-30 PL 30,3,0

          Comment

          • manore
            Member
            • Jun 2011
            • 19

            #6
            My problem is that the pileup file contains all "N's" in the reference sequence field (column 3).
            I am mapping reads to a genome reference using stampy. I have converted the bwa sam alignment file to a sorted bam file using samtools. When I try running pileup to identify variants, I get only N's in the reference base position

            *I have checked the reference fasta file and don't find anything wrong.
            * the headers are consistent between SAM and the original fasta.

            see http://seqanswers.com/forums/showthread.php?t=5506

            Someone can hepl me?

            Comment

            • swbarnes2
              Senior Member
              • May 2008
              • 910

              #7
              Double check the reference fasta index. You can make it with samtools faidx.
              That index has to be made correctly for mpileup to work right. mpileup will run that command itself if it doesn't see the fasta index, and it will say something if that step fails, but it will keep going on anyway without it, and it will output all N's like you see.

              Comment

              • manore
                Member
                • Jun 2011
                • 19

                #8
                I make a index reference sequence
                Code:
                samtools faidx genome.fasta
                It creates genome.fai file. Hovewer, I obtain the same results!

                Comment

                • swbarnes2
                  Senior Member
                  • May 2008
                  • 910

                  #9
                  Are 100% sure that the reference you indexed is the same as the file you aligned to? Spot check the names in the .sam, make sure that they match perfectly with the names in the fasta and the fasta.fai.

                  You definately can't call SNPs until you get mpileup to reconize the association bewteen the reference you aligned to, and the reference fasta you are pointing it to, and having N's in the reference allele indicates that it's not making that association yet. Are there any odd non-space chracters in your chromsome name that maybe your aligner is truncating when making the .sam?

                  Comment

                  • manore
                    Member
                    • Jun 2011
                    • 19

                    #10
                    I used v0.1.16 (r973) version of samtools; and now with v0.1.17 version, the pileup file doesn't contain "N" in the reference sequence field.
                    Older versions of SAMtools support the pileup command. As of SAMtools v0.1.17 (r973) and later, the pileup command is deprecated and has been replaced with mpileup to accommodate multi-sample calling.

                    Thanks for your help swbarnes2.

                    Comment

                    • wzhjlau2009
                      Junior Member
                      • Jun 2011
                      • 5

                      #11
                      how about the problem now?solved?

                      Comment

                      • Anjali
                        Member
                        • Dec 2011
                        • 17

                        #12
                        Hi,
                        I am also facing the similar problem.
                        I use samtools 1.1.18 and with mpileup command I get a bcf file but the vcf file is only with the header
                        when I try changing the varFilter read depth, the vcf file is of 0 size

                        Can any one of you please help me

                        Comment

                        • Jeffzheng
                          Junior Member
                          • Jan 2016
                          • 1

                          #13
                          have this problem solved?

                          I also got empty vcf file

                          Comment

                          • dvdhover
                            Junior Member
                            • Oct 2015
                            • 1

                            #14
                            htslib matters

                            I got into the same problem.
                            Code:
                            samtools mpileup -ugf <ref.fa> -Q 0 <bamfile.bam> | bcftools call -vc | less
                            I can see the called variants.
                            Code:
                            samtools mpileup -ugf <ref.fa> -Q 0 <bamfile.bam> | bcftools call -vc > <mutation.vcf>
                            And I get an empty vcf file. As have been pointed out by others, the problem lies in exporting the vcf file. This problem almost drove me crazy. I reinstalled the samtools and bcftools. Still not working. Finally, I installed htslib. It worked! It turned out that htslib is not optional! So you may need to make sure htslib is installed in your machine. Good luck.

                            Comment

                            • mmmm
                              Senior Member
                              • Jul 2013
                              • 131

                              #15
                              bcftools view -b

                              Dear all, what is wrong in the bcftools view -bvcg ? (why -b option can not be found?). thanks

                              Comment

                              Latest Articles

                              Collapse

                              • SEQadmin2
                                From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                                by SEQadmin2


                                Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                                The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                                ...
                                Yesterday, 10:05 AM
                              • SEQadmin2
                                Single-Cell Sequencing at an Inflection Point: Early Impacts of New Platforms and Emerging Trends
                                by SEQadmin2


                                With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.


                                Introduction

                                Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...
                                05-22-2026, 06:42 AM
                              • SEQadmin2
                                Environmental Genomics in the Age of NGS: From Microbes to Conservation Strategies
                                by SEQadmin2

                                Studying ecosystems means dealing with complex, multi-species communities that are hard to observe at scale. This complexity, however, hides many important questions to be answered, from how biogeochemical cycles work and how climate change can affect species distribution to how conservation strategies can work best.


                                Genomics, particularly since the expansion of NGS, has transformed ecosystem ecology. By sequencing environmental DNA, we can now assess biodiversity without direct...
                                05-06-2026, 09:04 AM

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by SEQadmin2, Yesterday, 12:03 PM
                              0 responses
                              17 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, Yesterday, 11:40 AM
                              0 responses
                              13 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 05-28-2026, 11:40 AM
                              0 responses
                              29 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 05-26-2026, 10:12 AM
                              0 responses
                              31 views
                              0 reactions
                              Last Post SEQadmin2  
                              Working...