
**hansdd** · 05-17-2011, 04:12 PM

Originally posted by Chiel

Hi,

In our group we use samtools for variant calling, following the example at http://samtools.sourceforge.net/mpileup.shtml as a basic guide. Samtools seems to be a good way to get from BAM to a useful variant call format that can be annotated using other resources. However, we have some difficulty understanding certain parts and applying them properly.

Unlike the example, we want to call variants on a single sample. Our first question is whether it is safe to use mpileup on a single sample in the same way as shown in the example, or whether we should use the normal pileup instead. (And is BAQ still applied in that case?)

Then the data is converted to a raw bcf file using bcftools. Our second question is whether this output contains every possible variant, regardless of quality, depth, and the number of variant-supporting calls. I assume this is the case and that further polishing is done using vcfutils, but please correct me if I'm wrong.

Finally, vcfutils' varfilter is applied for filtering. The example shows only a depth filter. Besides depth, there are some other thresholds we would like to set: a (base) quality cutoff, a strand-bias filter for reference and variant calls, and a minimum number of variant-supporting calls.

A close inspection of the varfilter help shows a couple of possibilities. I'll briefly describe how we think they should be used, or what our difficulties are.
-Using the -a flag we can set the number of variant supporting calls?
-The -1 flag seems to be a p-value cutoff for strand bias. Yet I'm unable to find any explanation of which values are useful, or of how this behaves in the conditions we are interested in (i.e. both reference and variant calls found on both strands).
-Then there are the -2, -3, and -4 flags, which imply several p-value settings. Default values are given. However, here too an explanation of how to alter them for different practical conditions would be very welcome.
-The default value for mapQ bias is 0, why?
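
To make the strand-bias question concrete, here is a minimal Python sketch of how a strand-bias p-value and a variant-supporting-call threshold could behave. It is an illustration under stated assumptions, not the vcfutils implementation: it assumes the VCF INFO column carries a DP4 field holding the ref-forward, ref-reverse, alt-forward and alt-reverse counts, it recomputes a two-sided Fisher exact test from those counts, and the function names and the default thresholds (`min_alt=2`, `min_strand_p=1e-4`) are hypothetical placeholders.

```python
from math import comb


def parse_info(info):
    """Turn a VCF INFO string such as 'DP=40;DP4=10,10,10,10' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def strand_bias_pvalue(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher exact test on the 2x2 table (allele x strand)."""
    ref_total = ref_fwd + ref_rev
    alt_total = alt_fwd + alt_rev
    fwd_total = ref_fwd + alt_fwd
    n = ref_total + alt_total
    if n == 0:
        return 1.0
    denom = comb(n, fwd_total)

    def prob(x):
        # Hypergeometric probability of x reference reads on the forward strand.
        return comb(ref_total, x) * comb(alt_total, fwd_total - x) / denom

    observed = prob(ref_fwd)
    lowest = max(0, fwd_total - alt_total)
    highest = min(ref_total, fwd_total)
    # Sum every table that is at least as unlikely as the observed one.
    tail = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-7))
    return min(1.0, tail)


def passes_filters(info, min_alt=2, min_strand_p=1e-4):
    """Assumed thresholds: keep a site with enough alt reads and no strong strand bias."""
    counts = [int(x) for x in parse_info(info)["DP4"].split(",")]
    ref_fwd, ref_rev, alt_fwd, alt_rev = counts
    if alt_fwd + alt_rev < min_alt:
        return False
    return strand_bias_pvalue(ref_fwd, ref_rev, alt_fwd, alt_rev) >= min_strand_p


if __name__ == "__main__":
    for info in ("DP=40;DP4=10,10,10,10", "DP=20;DP4=10,0,0,10", "DP=21;DP4=10,10,1,0"):
        print(info, passes_filters(info))
```

In this sketch a site where reference and variant calls are spread evenly over both strands gets a p-value near 1 and is kept, while a site where all reference calls sit on one strand and all variant calls on the other gets a very small p-value and is dropped. Raising the cutoff makes the filter stricter.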

We couldn't find much information on these issues in the literature or other resources. Nevertheless, some of these settings are crucial in variant calling, and I would expect better descriptions than what we could find so far, especially when a clinical setting comes into play. It would be greatly appreciated if anyone could give some answers. Thanks.

I have many of the same questions and cannot find answers. Can someone give some guidance or point us towards resources that explain this in more detail?

**sergiodealencar** · 06-07-2011, 09:10 AM

I would also like to know how to filter strand bias using GATK Unified Genotyper. What is the ideal SB (Strand Bias) threshold value?

Thanks,
Sérgio


Samtools variant calling questions
