Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • bio_informatics
    Senior Member
    • Nov 2013
    • 182

    Error in using vcfutil and bcftools

    outPut.sam is my sam file from bowtie2.

    I am trying to generate a consensus fq/fasta file from it.

    I ran:
    samtools-1.1/./samtools view -bS outPut.sam > bamOut.bam --> get the bam file out of sam file

    samtools-1.1/./samtools sort bamOut.bam sorted_bam.bam --> sort the bam file

    samtools-1.1/./samtools index sorted_bam.bam.bam --> create the index of the bam file sorted
    Now when I try to use this command:

    samtools-1.1/./samtools mpileup -uf sequence.fasta sorted_bam.bam.bam | bcftools/./bcftools view -cg - | bcftools/./vcfutils.pl vcf2fq > cns.fq

    I get an error,


    Error: Could not parse --min-ac g
    [mpileup] 1 samples in 1 input files
    Use of uninitialized value $l in numeric lt (<) at bcftools/./vcfutils.pl line 566.
    Use of uninitialized value $l in numeric lt (<) at bcftools/./vcfutils.pl line 566.
    <mpileup> Set max per-file depth to 8000
    And everything stops.
    It is something with bcftools.
    References:
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc




    Please guide.
    Last edited by bio_informatics; 10-15-2014, 06:59 AM.
    Bioinformaticscally calm
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    See this recent thread: http://seqanswers.com/forums/showthread.php?t=47210 If the following does not work then we can try splitting the three steps as indicated in this thread.

    Can you try running your command like this:
    Code:
    $ samtools-1.1/samtools mpileup -uf sequence.fasta sorted_bam.bam.bam | bcftools/bcftools view -cg - | bcftools/vcfutils.pl vcf2fq > cns.fq

    Comment

    • bio_informatics
      Senior Member
      • Nov 2013
      • 182

      #3
      Hi GenoMax,
      Thanks for your response.

      I ran the command you provided. It didn't work, then I proceeded with individual.

      #
      Code:
      samtools-1.1/./samtools mpileup -uf sequence.fasta sorted_bam.bam.bam > bcf_file.bcf
      ran successfully.

      Below one threw error
      Code:
      bcftools/./bcftools view -cg bcf_file.bcf > bcftoolsout
      Error is Error: Could not parse --min-ac g

      When I removed -gc, it ran successfully. Strange?
      bcftools/./bcftools view bcf_file.bcf > bcftoolsout
      Originally posted by GenoMax View Post
      See this recent thread: http://seqanswers.com/forums/showthread.php?t=47210 If the following does not work then we can try splitting the three steps as indicated in this thread.

      Can you try running your command like this:
      Code:
      $ samtools-1.1/samtools mpileup -uf sequence.fasta sorted_bam.bam.bam | bcftools/bcftools view -cg - | bcftools/vcfutils.pl vcf2fq > cns.fq
      Bioinformaticscally calm

      Comment

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        It looks like the new bcftools (v.1.x) expects an integer to go with -c option. Try it with that.

        -c/C, --min-ac/--max-ac <int>[:<type>] minimum/maximum count for non-reference (nref), 1st alternate (alt1) or minor (minor) alleles [nref]

        Comment

        • bio_informatics
          Senior Member
          • Nov 2013
          • 182

          #5
          Yes, I was going through the -help.
          I downloaded bcftools today.

          I am constructing a fasta sequence [which is my goal].

          Any random number would spoil my output.
          Originally posted by GenoMax View Post
          It looks like the new bcftools (v.1.x) expects an integer to go with -c option. Try it with that.
          Bioinformaticscally calm

          Comment

          • bio_informatics
            Senior Member
            • Nov 2013
            • 182

            #6
            Hi Genomax,
            So, I tried few things with my file,

            Code:
            bcftools/./bcftools view --min-ac 0 -g "^miss" bcf_file.bcf
            any number above 0 in --min-ac 0 isn't generating me a file worth moving ahead.
            -g "^miss"

            If I understand this correctly, I am excluding all the calls with missing genotypes.

            Kindly throw light more on the genotype flag.
            Bioinformaticscally calm

            Comment

            • MaximeOfOslo
              Junior Member
              • Jun 2015
              • 6

              #7
              I think I have the same problem here, and I may have a clue of where it comes from, but none about how to solve it...

              I'm using Illumina paired end data, that I assembled using Soap De Novo which gave me a file of the scafolds in a fasta format.

              I then mapped this scaffold using bwa to a close reference, converted it to a .bam file using samtools (without forgetting to index the reference).

              I'm now trying to generate the consensus file with this command

              Code:
              /usit/abel/u1/maxime/8_samtools/bin/samtools mpileup  -uf chrysanthemum_indicum_chloroplast.fa 1_file_sorted.bam | /usit/abel/u1/maxime/8_samtools/bin/bcftools call   -c - | /usit/abel/u1/maxime/8_samtools/bin/vcfutils.pl vcf2fq > ass_aln.fq
              And i have the same error message :

              Code:
              [mpileup] 1 samples in 1 input files
              <mpileup> Set max per-file depth to 8000
              Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.
              Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.

              Note that bcftools view seems to have been replaced by bcftools call according to their page

              From my understanding, I may have this problem because at some point, I lost the base quality encoding (in the de novo assembly).

              Please find attached under this line the output of the bcftools call -c - command

              Comment

              • swbarnes2
                Senior Member
                • May 2008
                • 910

                #8
                I'm still using samtools 0.1.19, can someone just post the portion of vcfutils that is causing the problem? It's a perl script, you can open it in any text editor. I feel like I've seen that error in older versions, but I can't pinpoint it without know where to look.

                My first suggestion would be to reindex the reference with samtools faidx, make sure that happens without errors, as mpileup tends not to warn you if that index isn't right.

                Comment

                • SNPsaurus
                  Registered Vendor
                  • May 2013
                  • 525

                  #9
                  Can you switch to bcftools consensus?


                  Originally posted by MaximeOfOslo View Post
                  I think I have the same problem here, and I may have a clue of where it comes from, but none about how to solve it...

                  I'm using Illumina paired end data, that I assembled using Soap De Novo which gave me a file of the scafolds in a fasta format.

                  I then mapped this scaffold using bwa to a close reference, converted it to a .bam file using samtools (without forgetting to index the reference).

                  I'm now trying to generate the consensus file with this command

                  Code:
                  /usit/abel/u1/maxime/8_samtools/bin/samtools mpileup  -uf chrysanthemum_indicum_chloroplast.fa 1_file_sorted.bam | /usit/abel/u1/maxime/8_samtools/bin/bcftools call   -c - | /usit/abel/u1/maxime/8_samtools/bin/vcfutils.pl vcf2fq > ass_aln.fq
                  And i have the same error message :

                  Code:
                  [mpileup] 1 samples in 1 input files
                  <mpileup> Set max per-file depth to 8000
                  Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.
                  Use of uninitialized value in length at /usit/abel/u1/maxib/8_samtools/bin/vcfutils.pl line 565, <> line 114.

                  Note that bcftools view seems to have been replaced by bcftools call according to their page

                  From my understanding, I may have this problem because at some point, I lost the base quality encoding (in the de novo assembly).

                  Please find attached under this line the output of the bcftools call -c - command

                  https://dl.dropboxusercontent.com/u/...all_output.txt
                  Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

                  Comment

                  • MaximeOfOslo
                    Junior Member
                    • Jun 2015
                    • 6

                    #10
                    Hi,
                    I'd like to try to use consensus. Till the faidx command, everything is fine, however, I'm now stuck at the mpileup stage.

                    Here is my code :
                    Code:
                    /usit/abel/u1/maxib/8_samtools/bin/samtools  view -bS 1_align.sam | /usit/abel/u1/maxib/8_samtools/bin/samtools sort - file_sorted
                    /usit/abel/u1/maxib/8_samtools/bin/samtools index file_sorted.bam
                    /usit/abel/u1/maxib/8_samtools/bin/samtools faidx chrysanthemum_indicum_chloroplast.fasta
                    /usit/abel/u1/maxib/8_samtools/bin/samtools mpileup -v -f chrysanthemum_indicum_chloroplast.fasta file_sorted.bam -o file.vcf.gz 
                    /usit/abel/u1/maxib/8_samtools/bin/bcftools consensus -i chrysanthemum_indicum_chloroplast.fasta file.vcf.gz >  out.fa
                    And here is the output :

                    Code:
                    [mpileup] 1 samples in 1 input files
                    <mpileup> Set max per-file depth to 8000
                    ./sam.sh: line 4: 24661 Abandon                 /usit/abel/u1/maxib/8_samtools/bin/samtools mpileup -v -f chrysanthemum_indicum_chloroplast.fasta file_sorted.bam -o file.vcf.gz
                    Are the fasta lines too long ?
                    Last edited by MaximeOfOslo; 06-18-2015, 05:01 AM.

                    Comment

                    • GenoMax
                      Senior Member
                      • Feb 2008
                      • 7142

                      #11
                      @Maxime: Are you using new versions of samtools and bcftools (v.1.x)? You can't mix and match old/new versions.

                      Comment

                      • MaximeOfOslo
                        Junior Member
                        • Jun 2015
                        • 6

                        #12
                        Thanks for your help @GenoMax, it's highly appreciated
                        Yes, I'm using the latest versions of both samtools and bcftools
                        Code:
                        -bash-4.1$ ls 8_samtools/install/
                        bcftools-1.2  htslib-1.2.1  samtools-1.2
                        The fasta file I'm using as a reference is the coding sequence of this chloroplast genome : http://www.ncbi.nlm.nih.gov/nuccore/JN867589.1
                        Last edited by MaximeOfOslo; 06-18-2015, 05:17 AM.

                        Comment

                        • GenoMax
                          Senior Member
                          • Feb 2008
                          • 7142

                          #13
                          Was the NCBI sequence used for generating the index and doing alignments? Is that the only error you are seeing?

                          Comment

                          • MaximeOfOslo
                            Junior Member
                            • Jun 2015
                            • 6

                            #14
                            Indeed it is. The alignement was performed with bwa using the fasta file mentioned above.

                            Here is the complete output of my script :
                            Code:
                            -bash-4.1$ ./sam.sh 
                            [bam_sort_core] merging from 20 files...
                            [mpileup] 1 samples in 1 input files
                            <mpileup> Set max per-file depth to 8000
                            ./sam.sh: line 4:  4949 Abandon                 /usit/abel/u1/maxib/8_samtools/bin/samtools mpileup -v -f chrysanthemum_indicum_chloroplast.fasta file_sorted.bam -o file.vcf.gz
                            
                            About:   Create consensus sequence by applying VCF variants to a reference
                                     fasta file.
                            Usage:   bcftools consensus [OPTIONS] <file.vcf>
                            Options:
                                -f, --fasta-ref <file>     reference sequence in fasta format
                                -H, --haplotype <1|2>      apply variants for the given haplotype
                                -i, --iupac-codes          output variants in the form of IUPAC ambiguity codes
                                -m, --mask <file>          replace regions with N
                                -o, --output <file>        write output to a file [standard output]
                                -c, --chain <file>         write a chain file for liftover
                                -s, --sample <name>        apply variants of the given sample
                            Examples:
                               # Get the consensus for one region. The fasta header lines are then expected
                               # in the form ">chr:from-to".
                               samtools faidx ref.fa 8:11870-11890 | bcftools consensus in.vcf.gz > out.fa
                            Up until mpileup, there are no error. After samtools consensus print its error message because no vcf file was generated at the mpileup stage.

                            Comment

                            • GenoMax
                              Senior Member
                              • Feb 2008
                              • 7142

                              #15
                              Have you tried to manually run the mpileup command? Are all the files in the current directory? Does the sorted bam file look to be of about the same size?

                              Comment

                              Latest Articles

                              Collapse

                              ad_right_rmr

                              Collapse

                              News

                              Collapse

                              Topics Statistics Last Post
                              Started by SEQadmin2, Today, 10:09 AM
                              0 responses
                              7 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, Yesterday, 08:59 AM
                              0 responses
                              14 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 06-02-2026, 12:03 PM
                              0 responses
                              22 views
                              0 reactions
                              Last Post SEQadmin2  
                              Started by SEQadmin2, 06-02-2026, 11:40 AM
                              0 responses
                              19 views
                              0 reactions
                              Last Post SEQadmin2  
                              Working...