Hi,
I used SMALT to align MiSeq paired-end reads to a single-stranded, unsegmented viral reference genome. When I looked at the results, I found that some pairs were not flagged as "read mapped in proper pair" (0x2), although it seems to me that they should be.
Below are two pairs that I think should be proper pairs but did not get the 0x2 flag. The reads were adapter-trimmed and quality-trimmed with Trimmomatic before being aligned with SMALT, so read lengths differ between reads.
SMALT command:
smalt map -x -y 0.7 -j 0 -i 2000 -o sample_X.sam ref_index read_1.fastq read_2.fastq
First one:
MISEQ:1:1101:2325:10343 81 ref 5130 29 2S210M = 5130 210
GCAGCCAAGCACAAAACCACGTCCAAAAAATCCACCAAAAAAAGATGATTACCATTTTGAAGTGTTCA
ACTTTGTTCCCTGTAGTATATGTGGCAACAATCAACTCTGCAAATCCATTTGCAAAACAATACCAAGC
AACAAACCAAAAAAAAAACCAACTACAAAACCCACAAACAAACCACCCACCAAAACCACAAACAAAAG
AGACCCCA
5GE6>99C9C=,8@BCGGFB6:>EECFCF:F7FFCF<:9FGFFDGD@FECEC,FCCECFB9GFCCFDC
<=FFD<EFGGGGEAFFF<?,B,<EFFGCFFD@ADFGFEEA9GGFAGFFCE<@GFF9EFF<C5,FC,9E
C<EE7GGGEGGGGGGFD,GFF6EFFGGFCCGGFEFFCEGEGGGGFGFDCGFF@GFGGGGGGFFDDGGF
9GFCCCCC
NM:i:1 AS:i:207
MISEQ:1:1101:2325:10343 161 ref 5130 29 3S159M = 5130 -210
GGCAGCCAAGCACAAAACCACGTCCAAAAAATCCACCAAAAAAAGATGATTACCATTTTGAAGTGTTC
AACTTTGTTCCCTGTAGTATATGTGGCAACAATCAACTCTGCAAATCCATTTGCAAAACAATACCAAG
CAACAAACCAAAAAAAAAACCAACTA
CCCCCGGGGG<FGGGGGGDGGGCCFGGGGGGFGFGDFGGGGGGGGGECFE,EDCFFFAECEGFCFACF
GGDEF,@FFDEFFCFGGGGGGF9F<A=EFCFGF8FFFCFCCDFGGG=FGGCFGGCEFCGEFDFF8FG<
FFG:FFF<FEGGGGGGGGGC8EG8,>
NM:i:1 AS:i:15
Second one:
MISEQ:1:1101:2472:14296 81 ref 4225 29 4S260M = 4225 -260
GCGGGGGGTAAATAGATATCAGTTAGAGTTTAACCAATCTTAACAACCATCTATACCGCCAATCCAAT
ACATACATTGCAAATCTTAAAATGGGAAACACATCCATCACAATAGAATTCACAAGCAAATTTTGGCC
CTATTTTACACTAATATATATGATCCTAACTCTAATCTCTTTACTAATTATAATCACCATTATGATTG
CAATACTAAATAAGCTAAGTGAACATAAAATATTCTGCAACAAGACTCTTGAACAAGGAC
>9970F7FGGGGGGGGGGGFGEGGDFGED8GGFGFCGGFAGF8EEF?GGGGFFEGE=??GGFGGGGGE
EGGGGGDFGGDGFGFFEGGGGGGECFFFCGGGGFGFFGGF9GFGGGFGFFGGFGFAAF8GF<@FB<CF
FGGGGGFGGFDFGGGGGGGF@FC<CGGGGFADGGFCGFAAEDGGGGGGGGGGGDGGGGGGGGGGGFGG
GFDGGFAGGGGF@DGGGFGFCFFFEFDFAFAFEAGGGGGGGGEECGFF<AGGGGGCCCCC
NM:i:1 AS:i:257
MISEQ:1:1101:2472:14296 161 ref 4225 29 4S260M = 4225 260
GCGGGGGGTAAATAGATATCAGTTAGAGTTTAACCAATCTTAACAACCATCTATACCGCCAATCCAAT
ACATACATTGCAAATCTTAAAATGGGAAACACATCCATCACAATAGAATTCACAAGCAAATTTTGGCC
CTATTTTACACTAATATATATGATCCTAACTCTAATCTCTTTACTAATTATAATCACCATTATGATTG
CAATACTAAATAAGCTAAGTGAACATAAAATATTCTGCAACAAGACTCTTGAACAAGGAC
CCCCCGGGGFGFGFGFFG,CFGGFGGGGDFFFFE?FGGGFGAFBFGFGGGGGGGGF9FFGGGGGCGGF
GGGGGGGGF9FFGFGGGGGGGCFGG8FFFGGGGGGCFFGGGF9=FEGGGGFGFFGCBCFGFGB,@FFG
FGGGGGEF?@FFFGEGGFCFGG9ECCFGGG;9;FFGGGGG9F9CFG?C9CFGFFGCF??9CGE?GGCE
FFGGGGGGGGGGCFCEFF9C*0:FGC7CFBGGGGFFFFFCD555CFFFFF*=?FFFF>?F
NM:i:1 AS:i:257
Flag 81: read paired (0x1), read reverse strand (0x10), first in pair (0x40)
Flag 161: read paired (0x1), mate reverse strand (0x20), second in pair (0x80)
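For anyone who wants to check my reading of the flags, here is a minimal Python sketch of how the two values break down into the bits defined in the SAM specification. The function name decode_flag and the table SAM_FLAGS are just names I made up for illustration; they are not part of SMALT or samtools.

```python
# Bit values and meanings as defined in the SAM format specification.
SAM_FLAGS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def decode_flag(flag):
    """Return the names of all SAM flag bits set in `flag` (hypothetical helper)."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


if __name__ == "__main__":
    # The two flag values seen in the records above.
    for flag in (81, 161):
        print(flag, decode_flag(flag))
```

In both cases the 0x2 bit is absent, which is the behaviour I am asking about.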
Does anyone have any idea why these pairs were not flagged as proper pairs? I have not managed to figure it out, so any thoughts would be greatly appreciated.
Best,
Michael