**dpryan** · 02-26-2015, 11:53 AM

Be very careful with that program, as its author fundamentally misunderstands what he's doing. BAM->pileup->VCF is absolutely NOT a format conversion. Actually, bcftools doesn't even call variants, samtools mpileup does, so he didn't even get that part right. The biggest mistake the author makes is believing that if a variant exists in a read, it should be in the VCF file. This is simply false. Typical NGS reads are full of sequencing errors, and you need multiple alignments giving evidence of a variant before you can reasonably declare that it actually exists. Not requiring that is a complete fail.
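To put a rough number on why a single read is weak evidence, here is a minimal sketch. It assumes every read has the same independent per-base error rate, which is a simplification; real callers also use base and mapping qualities. The function name, the depth of 30 and the 1% error rate are illustrative assumptions, not values from any particular tool.

```python
from math import comb

def prob_at_least_k_errors(depth, k, error_rate):
    """Probability that at least k of `depth` reads show a mismatch at one
    position purely from sequencing error (simple binomial model)."""
    # Sum the probability of seeing fewer than k erroneous reads ...
    p_fewer = sum(
        comb(depth, i) * error_rate**i * (1 - error_rate) ** (depth - i)
        for i in range(k)
    )
    # ... and take the complement.
    return 1.0 - p_fewer

# Assumed example values: 30x coverage, 1% per-base error rate.
for k in (1, 2, 5):
    p = prob_at_least_k_errors(depth=30, k=k, error_rate=0.01)
    print(f"P(>= {k} mismatching reads by error alone) = {p:.2e}")
```

Under these assumptions, one mismatching read at a position is quite likely to occur by error alone, while several mismatching reads at the same position are not, which is the reason for requiring multiple supporting alignments.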

The reason many tools don't accurately report multiallelic variants is that they're designed around diploid organisms and use a model dependent on that. If you're doing an experiment that requires calling rare variants in pooled data, then use a tool intended for that (I don't know of any off-hand, but that's not what I work on).

**markusli** · 02-26-2015, 12:07 PM

But he doesn't expect you to use it to somehow generate a VCF from a BAM file directly. You're meant to pipe samtools mpileup into this tool and let mpileup do the heavy lifting.
It generates a VCF file that has the total depth, the reference-supporting read count, the variant-supporting read count and the qualities for each reported position, and it doesn't really seem to just report every inconsistency between the reads and the reference sequence.

**dpryan** · 02-26-2015, 11:51 PM

The processing that samtools mpileup is doing before being piped into that script is non-existent. There are two ways of running mpileup:

1. You can have mpileup call variants in VCF or BCF format
2. You can have mpileup create a pileup (or mpileup) of each base

His script processes the output of #2. VarScan also works from that output, but it doesn't just perform a trivial conversion of it to VCF.
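For anyone unfamiliar with what the per-base pileup text looks like, here is a minimal Python sketch of tallying the read-bases column of one pileup line (tab-separated columns: chromosome, 1-based position, reference base, depth, read bases, base qualities). The function name `count_bases` and the example line are made up for illustration. Note that this only counts what the reads say, it is not variant calling, and it ignores base and mapping qualities entirely.

```python
import re
from collections import Counter

def count_bases(ref_base, read_bases):
    """Tally alleles in the read-bases column of a single pileup line."""
    # Drop read-start markers ('^' plus one mapping-quality character)
    # and read-end markers ('$'); they carry no allele information.
    s = re.sub(r"\^.", "", read_bases).replace("$", "")
    counts = Counter()
    i = 0
    while i < len(s):
        c = s[i]
        if c in "+-":
            # Indel: sign, length in digits, then that many inserted/deleted bases.
            j = i + 1
            while j < len(s) and s[j].isdigit():
                j += 1
            length = int(s[i + 1:j])
            counts[c + s[j:j + length].upper()] += 1
            i = j + length
            continue
        if c in ".,":
            counts[ref_base.upper()] += 1   # match on forward/reverse strand
        elif c == "*":
            counts["*"] += 1                # placeholder for a deleted base
        elif c.upper() in "ACGTN":
            counts[c.upper()] += 1          # mismatch on either strand
        i += 1
    return counts

# Made-up example line, not real data.
line = "chr1\t100\tA\t6\t^I.,$.,Gg\tIIIIII"
chrom, pos, ref, depth, bases, quals = line.split("\t")
print(chrom, pos, dict(count_bases(ref, bases)))
```

Going from counts like these to a trustworthy VCF record still needs a statistical model of errors and genotypes, which is the part that a straight conversion skips.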

Indels in bacterial RNA-seq data
