Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • skingan
    Member
    • Feb 2010
    • 17

    bam -> vcf -> fasta

    Hi Everyone,

    I am trying to generate a FASTA sequence from a BAM file using a BCF/VCF file as an intermediate. I would like to create a temporary BCF/VCF file from which I can make a FASTA sequence as well as estimate nucleotide polymorphism using the vcftools package.

    Fasta sequences can only generated for a single sample at a time, so I need to create a BCF/VCF file with a single sample. To do this, I have two commands: one that converts from BAM to BCF/VCF and the other which converts fro BCF/VCF to FASTA. Frustratingly, when I select a single sample in the second command, my FASTA sequence contains lots of ambiguities. But when I select it in the first, it works just fine. I would like to create a single BCF/VCF file for a region of interest and then generate FASTA sequences for each sample, rather than having to make a BCF/VCF file for every sample (waste of space!).

    Bear with me while I give each pair of commands with the FASTA formatted output. The first two use a VCF intermediate, the last two use a BCF.

    samtools mpileup -gIq20 -r 2L:53202-53313 -f /home/data/mau12/mau12.fa /home/data/maupoly/maupolyI.bam | bcftools view -g - > all.vcf
    bcftools view -Ss mel.txt all.vcf | vcfutils.pl vcf2fq | seqtk fq2fa - | seqtk cutN -
    >0:53202-53313
    TGTAAGTAAATARAAAGGAACTAGAATTTAATCCTAGATCTGRATAAGTATSAAAACAAG
    ACCTMCRSGTACTTGCTARRACTCGACATTYAAAAAAYRTATTKCTTTTTAG

    samtools mpileup -gIq20 -r 2L:53202-53313 -f /home/data/mau12/mau12.fa /home/data/maupoly/maupolyI.bam | bcftools view -gs mel.txt - > mel.vcf
    bcftools view -S tmp.vcf | vcfutils.pl vcf2fq | seqtk fq2fa - | seqtk cutN -
    >2L:53202-53313
    TGTAAGTAAATARAAAGGAACTAGAATTTAATCCTAGATCTGRATAAGTATSAAAACAAG
    ACCTMCRSGTACTTGCTARRACTCGACATTYAAAAAAYRTATTKCTTTTTAG


    samtools mpileup -gIq20 -r 2L:53202-53313 -f /home/data/mau12/mau12.fa /home/data/maupoly/maupolyI.bam | bcftools view -bgs mel.txt - > mel.bcf
    bcftools view mel.bcf | vcfutils.pl vcf2fq | seqtk fq2fa - | seqtk cutN -
    >2L:53202-53313
    TGTAAGTAAATAAAAAGGAACTAGAATTTAATCCTAGATCTGAATAAGTATCAAAACAAG
    ACCTCCGCGTACTTGCTAAGACTCGACATTTAAAAAACGTATTTCTTTTTAG

    samtools mpileup -gIq20 -r 2L:53202-53313 -f /home/data/mau12/mau12.fa /home/data/maupoly/maupolyI.bam | bcftools view -bg - > all.bcf
    bcftools view -s mel.txt all.bcf | vcfutils.pl vcf2fq | seqtk fq2fa - | seqtk cutN -
    >2L:53202-53313
    TGTAAGTAAATARAAAGGAACTAGAATTTAATCCTAGATCTGRATAAGTATSAAAACAAG
    ACCTMCRSGTACTTGCTARRACTCGACATTYAAAAAAYRTATTKCTTTTTAG

    The third set of commands gives me the right output: no ambiguities, and I've check that this is 100% correct from an independent source.

    Does anyone have any idea why this happens???

    Thanks in advance.

    Sarah
  • nthompson
    Junior Member
    • Dec 2013
    • 1

    #2
    Hello,

    Does anyone knows how I could convert a .vcf file to .fasta format? I have used the following string in samtools, however, I am seeing only n's in the fasta output.

    <vcfutils.pl vcf2fq mappedreads.vcf > mappedreads.fq>

    Thanks in advance!

    Comment

    Latest Articles

    Collapse

    ad_right_rmr

    Collapse

    News

    Collapse

    Topics Statistics Last Post
    Started by SEQadmin2, Today, 06:09 AM
    0 responses
    15 views
    0 reactions
    Last Post SEQadmin2  
    Started by SEQadmin2, 06-09-2026, 11:58 AM
    0 responses
    34 views
    0 reactions
    Last Post SEQadmin2  
    Started by SEQadmin2, 06-05-2026, 10:09 AM
    0 responses
    39 views
    0 reactions
    Last Post SEQadmin2  
    Started by SEQadmin2, 06-04-2026, 08:59 AM
    0 responses
    45 views
    0 reactions
    Last Post SEQadmin2  
    Working...