Hi,
I spent a few hours on this problem. I am using the latest HTSeq (0.5.3).
The SAM file was generated with bwa. I tried sorting it with sort, samtools, and picard, but in every case HTSeq repeatedly warned "(Is the SAM file properly sorted?)".
To check, I ran "cat my_file.sorted.sam | grep "read_name"" for individual read names; here are two examples:
1)
5G_BaTVPgyNB42 147 All_unigene019861 858 255 75M = 745 -188 GTGTAGTCCGTGTGTGAAAGGCTGGGATAAAAAGTCAGGTTATTAGGTTGCTGTAGGGAACAGTCTGTTAATTCA bbbbbbcacbacbbcbbbbbbbbbb_abbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbabb XA:i:0 MD:Z:75 NM:i:0
5G_BaTVPgyNB42 99 All_unigene019861 745 255 75M = 858 188 GTGTAAAGCTACTCGACAGTTGATTCAAACGCAATGAATTAGAATAGAAGATTTTGTTTGAATCACGAGCAACAG bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb_a_bbbbbbbcbbbbcba XA:i:0 MD:Z:32G42 NM:i:1
2)
1D_J7uMriyNB42 163 All_unigene003941 81 29 75M = 201 178 AAACCTACCACCGATTTCACAGACAAAGTCATAGAAGGCGAGGATGGACTAAAATTCGACACTCCGTCGTCCGCA bbbbbbbbbbbbbbbbbbbbbbbbbbbbcbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcbbbabbacbb`c XT:A:U NM:i:0 SM:i:29 AM:i:29 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:75
1D_J7uMriyNB42 87 All_unigene003941 201 29 58M17S = 81 -178 GTCGTCCAGTTGTCCGACAAACAAGGAGATCTGACTGAACAGAGTGACACGACGTCTGCAACTGGACATCTTGAC \aba\bcc`bcaabb`cbbbbbbabbbbbbbbbabbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb XT:A:M NM:i:1 SM:i:29 AM:i:29 XM:i:1 XO:i:0 XG:i:0 MD:Z:55T2
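For anyone reading the records above, here is a minimal Python sketch for two quick checks: decoding the FLAG column, and testing whether both mates of each pair sit on neighbouring lines of the file, which grep output alone cannot show. The function names `decode_flag` and `nonadjacent_mates` are made up for illustration and are not part of HTSeq. The bit meanings follow the SAM specification. The adjacency check assumes exactly two records per read pair (no secondary alignments).

```python
# SAM FLAG bits, as defined in the SAM specification.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


def nonadjacent_mates(sam_lines):
    """Return names of paired records that are not directly followed
    (or preceded) by a record with the same read name.

    Hypothetical helper; assumes two records per pair.
    """
    names = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.split(None, 2)  # QNAME and FLAG contain no whitespace
        if len(fields) < 2:
            continue
        if int(fields[1]) & 0x1:  # keep only paired-end records
            names.append(fields[0])

    bad = []
    i = 0
    while i < len(names):
        if i + 1 < len(names) and names[i] == names[i + 1]:
            i += 2  # mates are adjacent, move to the next pair
        else:
            bad.append(names[i])
            i += 1
    return bad


if __name__ == "__main__":
    # FLAG values taken from the four records shown above.
    for flag in (147, 99, 163, 87):
        print(flag, decode_flag(flag))
```

To run the adjacency check on the whole file, something like `nonadjacent_mates(open("my_file.sorted.sam"))` should work. One thing the decoder makes visible: flag 87 in the second example has bit 0x4 (segment unmapped) set even though the record carries a position and a CIGAR string. Whether that is related to the warning is not established here.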
Can anybody offer an explanation?
Thank you,
Douglas