I have been using mpileup with two to six samples for a while, with good results. Today I tried a run with 20 input BAM files, and a small percentage of my output lines are truncated. I haven't been able to find a reference to this on the web; my apologies if it has been discussed before.
My use of mpileup is about as simple as can be; I'm just getting the pileups, not doing any calling:
samtools mpileup -f <reference> <bam-list>
With 20 input files, I expect 63 tab-separated fields per line (chromosome, position, and reference base, plus depth, bases, and qualities for each of the 20 samples). In a small percentage of lines, I'm getting 61 fields. Every short line has exactly 61 fields, never any other count. In a subset of my data, the total mpileup output is 26,633 lines; of those, 23 are truncated.
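For anyone who wants to reproduce the count, a check along these lines should work. This is only a minimal sketch, not necessarily the exact check used for the numbers above: the function names are made up for illustration, and it assumes the usual mpileup layout of three leading columns followed by three columns per sample.

```python
def expected_fields(n_samples):
    # Assumed layout: chrom, pos, ref base, then depth/bases/quals per sample.
    return 3 + 3 * n_samples


def find_short_lines(lines, n_samples):
    """Return (line_number, field_count) for every line whose
    tab-separated field count differs from the expected one."""
    want = expected_fields(n_samples)
    bad = []
    for lineno, line in enumerate(lines, start=1):
        # Strip only the newline so trailing empty fields are still counted.
        n = len(line.rstrip("\n").split("\t"))
        if n != want:
            bad.append((lineno, n))
    return bad


# Example use on a saved pileup (the file name is hypothetical):
#
#     with open("out.pileup") as fh:
#         for lineno, n in find_short_lines(fh, 20):
#             print(lineno, n)
```

Printing the line numbers makes it easy to pull out the affected positions and compare them against the individual BAM files.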
I have run mpileup multiple times on the full set of the data and the output files are identical.
If this is not a known problem and my subset bams would be helpful, I can point lh3 at them.
Thanks for any pointers.
-Al