Hello,
I have sequencing data in which the position and frequency of mismatches play an important role in the downstream analysis. I generated short-read mappings using BWA. In the samse output file, the MD and CIGAR fields look inconsistent to me. As far as I can tell, the MD field is generated from the CIGAR field and should be consistent with it. Has anyone else had the same problem?
> less uniq_part_001.fastq.sam | cut -f 1-6,10,19 | grep -v "*" | head
seqAAA_0 16 chr12 50113894 25 36M CGCCATCTGTTTTTTTTTTTTTTTTTTTTTTTTTTT MD:Z:1A34
seqAAA_1 0 chr4 178545809 0 36M AAAAAAAAAAAAAAAAAAAAAAAAAAACATATGCCT MD:Z:34T1
seqAAA_3 16 chr10 6381930 25 36M CAGAAGACGTTTTTTTTTTTTTTTTTTTTTTTTTTT MD:Z:8T27
seqAAA_4 0 chr4 39577867 25 36M AAAAAAAAAAAAAAAAAAAAAAAAAAACGTTTGCCC MD:Z:27A8
seqAAA_7 0 chr9 117092453 25 36M AAAAAAAAAAAAAAAAAAAAAAAAAAACTTATGCCC MD:Z:26C9
seqAAA_8 16 chr20 17734163 25 36M CGGAATAAGTTTTTTTTTTTTTTTTTTTTTTTTTTT MD:Z:0A35
seqAAA_10 0 chr8 112121071 0 36M AAAAAAAAAAAAAAAAAAAAAAAAAAACTTTAGCCG MD:Z:28A7
seqAAA_11 16 chr2 66418968 0 36M GCATACCTCTTTTTTTTTTTTTTTTTTTTTTTTTTT MD:Z:36
seqAAA_12 0 chr16 73425684 0 36M AAAAAAAAAAAAAAAAAAAAAAAAAAAGCTGGGACC MD:Z:34G1
seqAAA_13 16 chr3 22762342 0 36M GGGGCATCCTTTTTTTTTTTTTTTTTTTTTTTTTTT MD:Z:1T34
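One way to test the consistency asked about above is to compare the number of reference bases implied by each field. The sketch below is only an illustration, not part of BWA or samtools, and the function names (`cigar_ref_length`, `md_ref_length`, `is_consistent`) are made up for this example. It assumes the SAM specification's conventions: the CIGAR operations M, D, N, = and X consume reference bases, and an MD value is a sequence of match counts, single mismatched reference bases, and `^`-prefixed deletions. Note that under that specification M means an alignment match, which can be either a sequence match or a mismatch, so a CIGAR of 36M alongside MD:Z:1A34 would not by itself be a contradiction.

```python
import re

# CIGAR operations that consume reference bases (SAM specification).
REF_CONSUMING = set("MDN=X")


def cigar_ref_length(cigar):
    """Number of reference bases spanned by a CIGAR string such as '36M'."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(count) for count, op in ops if op in REF_CONSUMING)


def md_ref_length(md):
    """Number of reference bases described by an MD value such as 'MD:Z:1A34'."""
    if md.startswith("MD:Z:"):
        md = md[len("MD:Z:"):]
    total = 0
    # Tokens: a run of matches, a deletion (^ACG...), or one mismatched base.
    for matches, deletion, mismatch in re.findall(
        r"(\d+)|\^([A-Za-z]+)|([A-Za-z])", md
    ):
        if matches:
            total += int(matches)
        elif deletion:
            total += len(deletion)
        else:
            total += 1
    return total


def is_consistent(cigar, md):
    """True when CIGAR and MD describe the same number of reference bases."""
    return cigar_ref_length(cigar) == md_ref_length(md)


if __name__ == "__main__":
    # CIGAR and MD values copied from the first and eighth records above.
    for cigar, md in [("36M", "MD:Z:1A34"), ("36M", "MD:Z:36")]:
        print(cigar, md, is_consistent(cigar, md))
```

This only checks lengths. It does not verify that the mismatched bases named in MD actually differ from the read sequence, which would require comparing against the reference.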
Biter Bilen