Hi,
I have multiple samples from which I would like to call SNPs. I've been trying to call them either separately or all at once:
samtools mpileup -uf hg19.fa sample1.bam | bcftools view -bvcg - > out.bcf
or
samtools mpileup -uf hg19.fa sample1.bam sample2.bam | bcftools view -bvcg - > out.bcf
In the first case, I get the coverage (DP) at each position where a SNP is called, but only for the sample that has the SNP. What I want is the coverage at that position in all samples, and I could not find a way to get it.
In the second case, for SNPs I know are present in both samples, I found only one DP value, which I assume is the combined coverage of both samples (see the record below).
Is there any way to get DP and DP4 for every sample, even if the SNP is present in only one of them?
chr1 15211 . T G 51 . DP=16;AF1=1;CI95=0.8333,1;DP4=3,0,10,3;MQ=30;FQ=-37.5;PV4=1,1.3e-05,1,1 GT:PL:GQ 1/1:46,9,0:21 1/1:23,9,0:21
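To make the problem concrete, here is a small sketch that takes the record above apart with awk. It assumes the fields are tab-separated, as in a real VCF (they show as spaces in this post), and the variable names are just for illustration. It prints the site-level DP, the sum of the four DP4 counts, whether the FORMAT column has a DP key, and the number of sample columns.

```shell
# The record from my output, rebuilt with tabs between the VCF columns (assumption).
record=$(printf 'chr1\t15211\t.\tT\tG\t51\t.\tDP=16;AF1=1;CI95=0.8333,1;DP4=3,0,10,3;MQ=30;FQ=-37.5;PV4=1,1.3e-05,1,1\tGT:PL:GQ\t1/1:46,9,0:21\t1/1:23,9,0:21')

summary=$(printf '%s\n' "$record" | awk -F'\t' '
{
    # Column 8 is INFO: pull out the site-level DP and add up the DP4 counts.
    n = split($8, info, ";")
    for (i = 1; i <= n; i++) {
        if (info[i] ~ /^DP=/)  dp = substr(info[i], 4)
        if (info[i] ~ /^DP4=/) {
            split(substr(info[i], 5), d, ",")
            dp4sum = d[1] + d[2] + d[3] + d[4]
        }
    }
    # Column 9 is FORMAT: check whether a per-sample DP key is listed.
    has_dp = "no"
    m = split($9, fmt, ":")
    for (i = 1; i <= m; i++) if (fmt[i] == "DP") has_dp = "yes"
    # Columns 10 onwards are the samples.
    printf "site_DP=%s DP4_sum=%d per_sample_DP=%s samples=%d", dp, dp4sum, has_dp, NF - 9
}')
echo "$summary"
```

For this record, the DP4 counts add up to the single DP value, and the per-sample columns carry only GT, PL and GQ, which is why I read DP as a combined figure.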
Second question: I've been having a hard time annotating the SNP calls in the VCF file (produced by bcftools view) with their rsIDs. I tried vcftools:
>bgzip UCSC_SNP_v132
>zcat input.vcf.gz | vcftools_0.1.5/bin/fill-rsIDs -r UCSC_SNP_v132.gz | tabix-0.2.5/bgzip -c > out
but I keep getting the error below, which suggests to me that I'm not using the right format for the input dbSNP file. I used the BED format downloaded from UCSC. What am I doing wrong here?
tabix ./UCSC_SNP_v132.gz chr1 2>&1 |: No such file or directory at vcftools_0.1.5/bin/fill-rsIDs line 20
main::error('tabix ./UCSC_SNP_v132.gz chr1 2>&1 |: No such file or directory') called at vcftools_0.1.5/bin/fill-rsIDs line 93
main::fill_rsids('HASH(0x169602a0)', './UCSC_SNP_v132.gz') called at vcftools_0.1.5/bin/fill-rsIDs line 11
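The call that fails in the trace is a plain `tabix ./UCSC_SNP_v132.gz chr1`, so two things it presumably needs are a `tabix` executable on the PATH and an index next to the file. Below is a rough sketch of a check for both. The function name `check_tabix_input` is made up for illustration, and I'm assuming the index would be named `<file>.tbi` (the tabix default); this is a guess at the cause, not a confirmed diagnosis.

```shell
# Hypothetical helper: report whether the pieces a tabix lookup needs are present.
check_tabix_input() {
    file=$1
    status=ok
    # fill-rsIDs runs a bare "tabix", so it has to be found on PATH.
    command -v tabix >/dev/null 2>&1 || { echo "tabix not found on PATH"; status=fail; }
    # The bgzipped annotation file itself.
    [ -f "$file" ] || { echo "missing file: $file"; status=fail; }
    # Assumed index name: tabix writes <file>.tbi by default.
    [ -f "$file.tbi" ] || { echo "missing index: $file.tbi"; status=fail; }
    echo "status: $status"
}

check_tabix_input ./UCSC_SNP_v132.gz
```

Since I called bgzip by its full path (tabix-0.2.5/bgzip) and never ran tabix on the file myself, either check could plausibly fail on my setup.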
Thanks!!