Hi all,
I am working with a paired-end BAM file and get many warnings like this:
I checked the reads named in the warnings in the BAM file and found that each of them appears as three records with the same read name. For example:
The BAM file contains HiSeq reads aligned to the reference genome with bwa; picard was used to remove duplicates, and base realignment was done with gatk.
Code:
WARNING: Could not find pair for HWI-ST430:177:2:1:4979:15503#0
WARNING: Could not find pair for HWI-ST430:177:2:1:5127:13427#0
WARNING: Could not find pair for HWI-ST430:177:2:1:6521:21452#0
Code:
HWI-ST430:177:2:1:4979:15503#0 65 chr32 26100696 60 79M21S chr5 36697147 0 ACTTTGCAATTTAAGTTTTACTTACTTTTTAACTAATATACATGCCTAAAATTTACAAAAACAATAATAAAAACAACAGAACACTGGAAACATTTTTAAA >;=<>=<<=======<====;===;=======<=>>>>>><=>>==>>>>=>>>>==>?>=<<==>?>>>?>?==><=?>><=<>>>?>?=>??>?===> BD:Z:FFHFCIKKIHG@EEEHF??DGGEDGGE???DEEGGEFFFFGDHHHHGGE??FF?DGDG???EDGFGFGGF@@@FEHFEIEGFEEIJJIHBHGLJDD@EF@ MD:Z:79 PG:Z:MarkDuplicates RG:Z:Basenji BI:Z:FFIECHGIHFEAFEEHEAAFFHDFFHDAAAFEEIHFGGHGGGHHGHHHFBBGFBGGGHBBBFGHGGFGGFBBBGHIGHJGHGHFKJJJJEIKLJGHBGFB NM:i:0 AS:i:79 XS:i:19
HWI-ST430:177:2:1:4979:15503#0 129 chr5 36697147 60 72M28S chr32 26100696 0 ATTTGCCCCTGGGCTATTTTTTTCCTNCCATGTAAGATTCCGTTTTAAAAATGTTTCCAGTGTTCTGTTGTTTTTATTATTGTTTTTGTAAATTTTAGGC ===<=<<<<====<=>========<<!<<<=><<=>>>>>=5=>>>>>>>>>>=>>>==>=>=>>>>=?>=>>>>>>>>=?>=>>>?>>>??>??>;<=> SA:Z:chr32,26100739,-,36M64S,60,0; BD:Z:FFG@JKKFFHIIEHIGFF?????EGGEEEGHHEGEEDGFEGEGF??DE???FHEF?EGGHIFFGFEIFGGFG@@@EGGEGGGFHAAAHGJHBJJDDEHHI MD:Z:26T37T7 PG:Z:MarkDuplicates RG:Z:Basenji BI:Z:FFFBHHHFFHGGDGHGGEAAAAADFGEEEIHHGHFFFGFEGHHFBBGFBBBGHGFBEGIIIFGFEFHGFHHGCCCHIGHIGHHGDDDIIKIFKJGHGHGH NM:i:2 AS:i:65 XS:i:21
HWI-ST430:177:2:1:4979:15503#0 401 chr32 26100739 60 36M64H = 26100696 -79 GCCTAAAATTTACAAAAACAATAATAAAAACAACAG ===<=>>=>>===>===<=>===========>;=== SA:Z:chr5,36697147,+,72M28S,60,2; BD:Z:IHHE??FF?EGEF???FEFFFDFGE@@AHHIJFIFF MD:Z:36 PG:Z:MarkDuplicates RG:Z:Basenji BI:Z:HGHGBBFFAEGFFAAAEFFEGFEGFABBFGHGGHFF NM:i:0 AS:i:36 XS:i:22
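As a rough sketch (not part of the original tool output), the FLAG values of the three records above (65, 129 and 401) can be decoded against the bit definitions in the SAM specification. The helper name `decode_flag` and the label strings are illustrative assumptions and do not come from bwa, picard or gatk.

```python
# SAM FLAG bits as defined in the SAM specification.
# The label strings are illustrative names chosen for this sketch.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


# FLAG values taken from the three records with the same read name.
for flag in (65, 129, 401):
    print(flag, decode_flag(flag))
```

Under this decoding, 65 and 129 are the first and second reads of the pair, and 401 is a secondary, reverse-strand alignment of the second read. Together with the SA tags on the second and third records, that would be consistent with a split alignment of one read rather than a third, unrelated read.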
My confusion is:
1. Why are there three reads with the same name that appear to be unrelated?
2. Maybe the first two are treated as mate pairs and the third as a single read. If so, can I just ignore the warnings?
Could anyone help me? Many thanks for your help!