Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Error in using vcfutil and bcftools

    outPut.sam is my sam file from bowtie2.

    I am trying to generate a consensus fq/fasta file from it.

    I ran:
    samtools-1.1/./samtools view -bS outPut.sam > bamOut.bam --> get the bam file out of sam file

    samtools-1.1/./samtools sort bamOut.bam sorted_bam.bam --> sort the bam file

    samtools-1.1/./samtools index sorted_bam.bam.bam --> create the index of the bam file sorted
    Now when I try to use this command:

    samtools-1.1/./samtools mpileup -uf sequence.fasta sorted_bam.bam.bam | bcftools/./bcftools view -cg - | bcftools/./vcfutils.pl vcf2fq > cns.fq

    I get an error,


    Error: Could not parse --min-ac g
    [mpileup] 1 samples in 1 input files
    Use of uninitialized value $l in numeric lt (<) at bcftools/./vcfutils.pl line 566.
    Use of uninitialized value $l in numeric lt (<) at bcftools/./vcfutils.pl line 566.
    <mpileup> Set max per-file depth to 8000
    And everything stops.
    It is something with bcftools.
    References:
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc




    Please guide.
    Last edited by bio_informatics; 10-15-2014, 06:59 AM.
    Bioinformaticscally calm

  • #2
    See this recent thread: http://seqanswers.com/forums/showthread.php?t=47210 If the following does not work then we can try splitting the three steps as indicated in this thread.

    Can you try running your command like this:
    Code:
    $ samtools-1.1/samtools mpileup -uf sequence.fasta sorted_bam.bam.bam | bcftools/bcftools view -cg - | bcftools/vcfutils.pl vcf2fq > cns.fq

    Comment


    • #3
      Hi GenoMax,
      Thanks for your response.

      I ran the command you provided. It didn't work, then I proceeded with individual.

      #
      Code:
      samtools-1.1/./samtools mpileup -uf sequence.fasta sorted_bam.bam.bam > bcf_file.bcf
      ran successfully.

      Below one threw error
      Code:
      bcftools/./bcftools view -cg bcf_file.bcf > bcftoolsout
      Error is Error: Could not parse --min-ac g

      When I removed -gc, it ran successfully. Strange?
      bcftools/./bcftools view bcf_file.bcf > bcftoolsout
      Originally posted by GenoMax View Post
      See this recent thread: http://seqanswers.com/forums/showthread.php?t=47210 If the following does not work then we can try splitting the three steps as indicated in this thread.

      Can you try running your command like this:
      Code:
      $ samtools-1.1/samtools mpileup -uf sequence.fasta sorted_bam.bam.bam | bcftools/bcftools view -cg - | bcftools/vcfutils.pl vcf2fq > cns.fq
      Bioinformaticscally calm

      Comment


      • #4
        It looks like the new bcftools (v.1.x) expects an integer to go with -c option. Try it with that.

        -c/C, --min-ac/--max-ac <int>[:<type>] minimum/maximum count for non-reference (nref), 1st alternate (alt1) or minor (minor) alleles [nref]

        Comment


        • #5
          Yes, I was going through the -help.
          I downloaded bcftools today.

          I am constructing a fasta sequence [which is my goal].

          Any random number would spoil my output.
          Originally posted by GenoMax View Post
          It looks like the new bcftools (v.1.x) expects an integer to go with -c option. Try it with that.
          Bioinformaticscally calm

          Comment


          • #6
            Hi Genomax,
            So, I tried few things with my file,

            Code:
            bcftools/./bcftools view --min-ac 0 -g "^miss" bcf_file.bcf
            any number above 0 in --min-ac 0 isn't generating me a file worth moving ahead.
            -g "^miss"

            If I understand this correctly, I am excluding all the calls with missing genotypes.

            Kindly throw light more on the genotype flag.
            Bioinformaticscally calm

            Comment


            • #7
              I think I have the same problem here, and I may have a clue of where it comes from, but none about how to solve it...

              I'm using Illumina paired end data, that I assembled using Soap De Novo which gave me a file of the scafolds in a fasta format.

              I then mapped this scaffold using bwa to a close reference, converted it to a .bam file using samtools (without forgetting to index the reference).

              I'm now trying to generate the consensus file with this command

              Code:
              /usit/abel/u1/maxime/8_samtools/bin/samtools mpileup  -uf chrysanthemum_indicum_chloroplast.fa 1_file_sorted.bam | /usit/abel/u1/maxime/8_samtools/bin/bcftools call   -c - | /usit/abel/u1/maxime/8_samtools/bin/vcfutils.pl vcf2fq > ass_aln.fq
              And i have the same error message :

              Code:
              [mpileup] 1 samples in 1 input files
              <mpileup> Set max per-file depth to 8000
              Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.
              Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.

              Note that bcftools view seems to have been replaced by bcftools call according to their page

              From my understanding, I may have this problem because at some point, I lost the base quality encoding (in the de novo assembly).

              Please find attached under this line the output of the bcftools call -c - command

              https://dl.dropboxusercontent.com/u/...all_output.txt

              Comment


              • #8
                I'm still using samtools 0.1.19, can someone just post the portion of vcfutils that is causing the problem? It's a perl script, you can open it in any text editor. I feel like I've seen that error in older versions, but I can't pinpoint it without know where to look.

                My first suggestion would be to reindex the reference with samtools faidx, make sure that happens without errors, as mpileup tends not to warn you if that index isn't right.

                Comment


                • #9
                  Can you switch to bcftools consensus?


                  Originally posted by MaximeOfOslo View Post
                  I think I have the same problem here, and I may have a clue of where it comes from, but none about how to solve it...

                  I'm using Illumina paired end data, that I assembled using Soap De Novo which gave me a file of the scafolds in a fasta format.

                  I then mapped this scaffold using bwa to a close reference, converted it to a .bam file using samtools (without forgetting to index the reference).

                  I'm now trying to generate the consensus file with this command

                  Code:
                  /usit/abel/u1/maxime/8_samtools/bin/samtools mpileup  -uf chrysanthemum_indicum_chloroplast.fa 1_file_sorted.bam | /usit/abel/u1/maxime/8_samtools/bin/bcftools call   -c - | /usit/abel/u1/maxime/8_samtools/bin/vcfutils.pl vcf2fq > ass_aln.fq
                  And i have the same error message :

                  Code:
                  [mpileup] 1 samples in 1 input files
                  <mpileup> Set max per-file depth to 8000
                  Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.
                  Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.

                  Note that bcftools view seems to have been replaced by bcftools call according to their page

                  From my understanding, I may have this problem because at some point, I lost the base quality encoding (in the de novo assembly).

                  Please find attached under this line the output of the bcftools call -c - command

                  https://dl.dropboxusercontent.com/u/...all_output.txt
                  Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

                  Comment


                  • #10
                    Hi,
                    I'd like to try to use consensus. Till the faidx command, everything is fine, however, I'm now stuck at the mpileup stage.

                    Here is my code :
                    Code:
                    /usit/abel/u1/maxib/8_samtools/bin/samtools  view -bS 1_align.sam | /usit/abel/u1/maxib/8_samtools/bin/samtools sort - file_sorted
                    /usit/abel/u1/maxib/8_samtools/bin/samtools index file_sorted.bam
                    /usit/abel/u1/maxib/8_samtools/bin/samtools faidx chrysanthemum_indicum_chloroplast.fasta
                    /usit/abel/u1/maxib/8_samtools/bin/samtools mpileup -v -f chrysanthemum_indicum_chloroplast.fasta file_sorted.bam -o file.vcf.gz 
                    /usit/abel/u1/maxib/8_samtools/bin/bcftools consensus -i chrysanthemum_indicum_chloroplast.fasta file.vcf.gz >  out.fa
                    And here is the output :

                    Code:
                    [mpileup] 1 samples in 1 input files
                    <mpileup> Set max per-file depth to 8000
                    ./sam.sh: line 4: 24661 Abandon                 /usit/abel/u1/maxib/8_samtools/bin/samtools mpileup -v -f chrysanthemum_indicum_chloroplast.fasta file_sorted.bam -o file.vcf.gz
                    Are the fasta lines too long ?
                    Last edited by MaximeOfOslo; 06-18-2015, 05:01 AM.

                    Comment


                    • #11
                      @Maxime: Are you using new versions of samtools and bcftools (v.1.x)? You can't mix and match old/new versions.

                      Comment


                      • #12
                        Thanks for your help @GenoMax, it's highly appreciated
                        Yes, I'm using the latest versions of both samtools and bcftools
                        Code:
                        -bash-4.1$ ls 8_samtools/install/
                        bcftools-1.2  htslib-1.2.1  samtools-1.2
                        The fasta file I'm using as a reference is the coding sequence of this chloroplast genome : http://www.ncbi.nlm.nih.gov/nuccore/JN867589.1
                        Last edited by MaximeOfOslo; 06-18-2015, 05:17 AM.

                        Comment


                        • #13
                          Was the NCBI sequence used for generating the index and doing alignments? Is that the only error you are seeing?

                          Comment


                          • #14
                            Indeed it is. The alignement was performed with bwa using the fasta file mentioned above.

                            Here is the complete output of my script :
                            Code:
                            -bash-4.1$ ./sam.sh 
                            [bam_sort_core] merging from 20 files...
                            [mpileup] 1 samples in 1 input files
                            <mpileup> Set max per-file depth to 8000
                            ./sam.sh: line 4:  4949 Abandon                 /usit/abel/u1/maxib/8_samtools/bin/samtools mpileup -v -f chrysanthemum_indicum_chloroplast.fasta file_sorted.bam -o file.vcf.gz
                            
                            About:   Create consensus sequence by applying VCF variants to a reference
                                     fasta file.
                            Usage:   bcftools consensus [OPTIONS] <file.vcf>
                            Options:
                                -f, --fasta-ref <file>     reference sequence in fasta format
                                -H, --haplotype <1|2>      apply variants for the given haplotype
                                -i, --iupac-codes          output variants in the form of IUPAC ambiguity codes
                                -m, --mask <file>          replace regions with N
                                -o, --output <file>        write output to a file [standard output]
                                -c, --chain <file>         write a chain file for liftover
                                -s, --sample <name>        apply variants of the given sample
                            Examples:
                               # Get the consensus for one region. The fasta header lines are then expected
                               # in the form ">chr:from-to".
                               samtools faidx ref.fa 8:11870-11890 | bcftools consensus in.vcf.gz > out.fa
                            Up until mpileup, there are no error. After samtools consensus print its error message because no vcf file was generated at the mpileup stage.

                            Comment


                            • #15
                              Have you tried to manually run the mpileup command? Are all the files in the current directory? Does the sorted bam file look to be of about the same size?

                              Comment

                              Latest Articles

                              Collapse

                              • seqadmin
                                Genetic Variation in Immunogenetics and Antibody Diversity
                                by seqadmin



                                The field of immunogenetics explores how genetic variations influence immune responses and susceptibility to disease. In a recent SEQanswers webinar, Oscar Rodriguez, Ph.D., Postdoctoral Researcher at the University of Louisville, and Ruben Martínez Barricarte, Ph.D., Assistant Professor of Medicine at Vanderbilt University, shared recent advancements in immunogenetics. This article discusses their research on genetic variation in antibody loci, antibody production processes,...
                                Yesterday, 07:24 PM
                              • seqadmin
                                Choosing Between NGS and qPCR
                                by seqadmin



                                Next-generation sequencing (NGS) and quantitative polymerase chain reaction (qPCR) are essential techniques for investigating the genome, transcriptome, and epigenome. In many cases, choosing the appropriate technique is straightforward, but in others, it can be more challenging to determine the most effective option. A simple distinction is that smaller, more focused projects are typically better suited for qPCR, while larger, more complex datasets benefit from NGS. However,...
                                10-18-2024, 07:11 AM

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by seqadmin, 11-01-2024, 06:09 AM
                              0 responses
                              28 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 10-30-2024, 05:31 AM
                              0 responses
                              21 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 10-24-2024, 06:58 AM
                              0 responses
                              25 views
                              0 likes
                              Last Post seqadmin  
                              Started by seqadmin, 10-23-2024, 08:43 AM
                              0 responses
                              57 views
                              0 likes
                              Last Post seqadmin  
                              Working...
                              X