  • mpile up problem

    I'm trying to derive a vcf file from a bam using:

    samtools view -bS test.sam | samtools sort - test_sort
    samtools index test_sorted.bam
    samtools mpileup -E -uf Ref.fna test_sorted.bam > test.pileup
    bcftools view -cg test.pileup > test.vcf

    The SAM and BAM files look fine, but the mpileup command finishes too quickly and gives me a small file without sequence data.

    bcftools gives me:
    [bcf_sync] incorrect number of fields (6 != 5) at 0.0

    I've re-generated the bowtie index with the Ref.fna file to confirm the files match but that doesn't help. I'm using samtools 1.2, bcftools 0.1.17.

    Any idea what's wrong? It worked fine a couple of weeks ago.

  • #2
    It's advisable not to mix old and new versions of samtools and bcftools. You may want to look at the new "call" command in the new bcftools: http://www.htslib.org/doc/bcftools.html#call
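
For reference, here is a rough sketch of what the pipeline could look like once both tools are 1.x releases. The likely reason for the `bcf_sync` error is that samtools 1.x writes BCF2 while bcftools 0.1.x expects the older BCF format, though that is an inference from the versions quoted above. `call_variants` is a made-up helper name, not part of either tool. `-c` selects the consensus caller, the closest match to the old `bcftools view -c`; `-m` is the newer multiallelic caller.

```shell
# Sketch only: assumes samtools and bcftools are both 1.x releases on PATH.
# call_variants is a hypothetical helper, not a command shipped with either tool.
call_variants() {
    ref=$1   # reference FASTA, e.g. Ref.fna
    bam=$2   # coordinate-sorted, indexed BAM, e.g. test_sorted.bam
    out=$3   # VCF file to write, e.g. test.vcf
    # mpileup -u streams uncompressed BCF; bcftools call reads it from stdin
    # and writes VCF text to stdout by default.
    samtools mpileup -E -uf "$ref" "$bam" | bcftools call -c > "$out"
}

# Usage: call_variants Ref.fna test_sorted.bam test.vcf
```

Adding `-v` to `bcftools call` should restrict the output to variant sites. Later releases move the pileup step itself into `bcftools mpileup`, so it is worth checking the documentation for whichever version gets installed.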

    • #3
      Thanks, I'll upgrade the bcftools.

      However, I think something is wrong with the mpileup output too.
      It starts out readable but then turns into binary, which doesn't look like any example of pileup format I've seen. Could this be behind the problem?

      BCF^B^Bf<^@^@##fileformat=VCFv4.2
      ##FILTER=<ID=PASS,Description="All filters passed",IDX=0>
      ##samtoolsVersion=1.2+htslib-1.2.1
      ##samtoolsCommand=samtools mpileup -E -uf Ref_new.fna test_sorted.bam
      ##reference=file://Ref_new.fna
      ##contig=<ID=comp39600_c0_seq2,length=1517,IDX=0>
      ##contig=<ID=comp39985_c0_seq4,length=1303,IDX=1>
      ##contig=<ID=comp40415_c0_seq2,length=873,IDX=2>

      .......


      ##contig=<ID=comp44856_c1_seq1,length=608,IDX=263>
      ##ALT=<ID=X,Description="Represents allele(s) other than observed.">
      ##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.",IDX=1>
      ##INFO=<ID=IDV,Number=1,Type=Integer,Description="Maximum number of reads supporting an indel",IDX=2>
      ##INFO=<ID=IMF,Number=1,Type=Float,Description="Maximum fraction of reads supporting an indel",IDX=3>
      ##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth",IDX=4>
      ##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3",IDX=5>
      ##INFO=<ID=RPB,Number=1,Type=Float,Description="Mann-Whitney U test of Read Position Bias (bigger is better)",IDX=6>
      ##INFO=<ID=MQB,Number=1,Type=Float,Description="Mann-Whitney U test of Mapping Quality Bias (bigger is better)",IDX=7>
      ##INFO=<ID=BQB,Number=1,Type=Float,Description="Mann-Whitney U test of Base Quality Bias (bigger is better)",IDX=8>
      ##INFO=<ID=MQSB,Number=1,Type=Float,Description="Mann-Whitney U test of Mapping Quality vs Strand Bias (bigger is better)",IDX=9>
      ##INFO=<ID=SGB,Number=1,Type=Float,Description="Segregation based metric.",IDX=10>
      ##INFO=<ID=MQ0F,Number=1,Type=Float,Description="Fraction of MQ0 reads (smaller is better)",IDX=11>
      ##INFO=<ID=I16,Number=16,Type=Float,Description="Auxiliary tag used for calling, see description of bcf_callret1_t in bam2bcf.h",IDX=12>
      ##INFO=<ID=QS,Number=R,Type=Float,Description="Auxiliary tag used for calling",IDX=13>
      ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods",IDX=14>
      #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT test_sorted.bam
      ^@{^@^@^@^F^@^@^@^@^@^@^@^@^@^@^@^A^@^@^@^@^@^@^@^D^@^B^@^A^@^@^A^G^WC7<X>^@^Q^D^Q^U^Q^L<F5>^Q^P^@^@<A8>A^@^@^@^@^@^@^@^@^@^@^@^@^@^@@D^@8<DC>F^@^@^@^@^@^@^@^@^@^@<A8>C^@<A0><D5>E^@^@^@^@^@^@^@^@^@<80><A7>C^@H<D9>E^@^@^@^@^@^@^@^@^Q^M%^@^@<80>?^@^@^@^@^Q^K^U^@^@^@^@^Q^N1^@?f{^@^@^@^F^@^@^@^@^@^@^@^A^@^@^@^A^@^@^@^@^@^@^@^D^@^B^@^A^@^@^A^G^WG7<X>^@^Q^D^Q^U^Q^L<F5>^Q^P^@^@<A8>A^@^@^@^@^@^@^@^@^@^@^@^@^@@;D^@*<D5>F^@^@^@^@^@^@^@^@^@^@<A8>C^@<A0><D5>E^@^@^@^@^@^@^@^@^@<80><AE>C^@<B8><E3>E^@^@^@^@^@^@^@^@^Q^M%^@^@<80>?^@^@^@^@^Q^K^U^@^@^@^@^Q^N1^@?f{^@^@^@^F^@^@^@^@^@^@^@^B^@^@^@^A^@^@^@^@^@^@^@^D^@^B^@^A^@^@^A^G^WG7<X>^@^Q^D^Q^U^Q^L<F5>^Q^P^@^@<A8>A^@^@^@^@^@^@^@
      ^@^@^@^@^@^@<80>6D^@\<CE>F^@^@^@^@^@^@^@^@^@^@<A8>C^@<A0><D5>E^@^@^@^@^@^@^@^@^@<80><B5>C^@^H<EF>E^@^@^@^@^@^@^@^@^Q^M%^@^@<80>?^@^@^@^@^Q^K^U^@^@^@^@^Q^N1^@?j{^@^@^@
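
The output above starts with `BCF^B^B`, which matches the magic bytes of uncompressed BCF2, the format that `-u` asks mpileup to produce. So the file is probably valid rather than corrupt: the header is stored as text and the records as binary. A minimal sketch for checking this from the shell, assuming a `head` that supports `-c` (GNU and BSD both do); `is_uncompressed_bcf` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the file begins with the "BCF" magic string.
is_uncompressed_bcf() {
    [ "$(head -c 3 "$1")" = "BCF" ]
}

# Usage (file name taken from the commands in the first post):
#   if is_uncompressed_bcf test.pileup; then
#       bcftools view test.pileup | less   # bcftools 1.x prints BCF as VCF text
#   fi
```

This only inspects the first three bytes, so it says nothing about whether the rest of the file is intact. Compressed BCF (from `-g`) is BGZF-wrapped and would not match this check.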
