Hi,
If anyone has worked with a report in Novoalign's native format, please help me with the questions below. The datasets used are Illumina paired reads.
A) Below is a snapshot of a Novoalign report (native format) for Illumina paired reads.
With the help of the Novocraft Alignment suite PDF (section Output Formats, page 24), I was able to understand some of the columns in the report, but please help me identify which columns in the report correspond to:
1. Aligned Sequence
2. Aligned Offset
3. Pair Sequence
4. Pair Offset
5. Mismatches
B) I am also looking for the start and end positions of the aligned reads. Is that information available in this report?
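To make questions A and B concrete, here is a rough sketch of how I am currently reading the two lines of the snapshot. The column order in it is only my guess from the manual, so please correct it if it is wrong. It also assumes that the offset is the 1-based leftmost position of the alignment and that the read aligns without insertions or deletions; `NativeRecord`, `parse_native_line` and `span` are names I made up for this post.

```python
from typing import NamedTuple


class NativeRecord(NamedTuple):
    # ASSUMPTION: this column order is my reading of the manual, not verified.
    header: str
    side: str              # L or R read of the pair
    sequence: str
    quality: str
    status: str            # e.g. U
    score: int
    align_quality: int
    aligned_sequence: str  # reference sequence header
    aligned_offset: int
    strand: str            # F or R
    pair_sequence: str     # "." assumed to mean "same as aligned_sequence"
    pair_offset: int
    pair_strand: str
    mismatches: list       # whatever follows, e.g. ["25A>G", "36G>A"]


def parse_native_line(line):
    # Split on any whitespace because the pasted snapshot lost its tabs.
    f = line.split()
    return NativeRecord(
        f[0], f[1], f[2], f[3], f[4], int(f[5]), int(f[6]),
        f[7], int(f[8]), f[9], f[10], int(f[11]), f[12], f[13:],
    )


def span(rec):
    # ASSUMPTION: offset is the 1-based leftmost base and there are no indels,
    # so the end is simply start + read length - 1.
    start = rec.aligned_offset
    return start, start + len(rec.sequence) - 1


LEFT = ("@0:1:1:34:429 L GAAGNAAAAATAAAAGCATTAGNAGAAATTTGTACA "
        "IIII$IIIII&IIIIIIIIIII$IIIIIIIIIIIII U 14 91 "
        ">gi|9629357:1-9117 2177 F . 2308 R")
RIGHT = ("@0:1:1:34:429 R TNCTTATTAAGCNCTCTGAAATNNANNNNTTTTCTC "
         "I$IIIIIIIIII$IIIIIIIII$$'$$$$IIIIIII U 126 91 "
         ">gi|9629357:1-9117 2308 R . 2177 F 25A>G 36G>A")

for line in (LEFT, RIGHT):
    rec = parse_native_line(line)
    print(rec.side, rec.aligned_sequence, span(rec), rec.mismatches)
```

If my guess about the columns is right, the start would be the Aligned Offset and the end would have to be derived from the read length, but I would like someone to confirm this.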
C) At the end of the report there are three columns of data:
# Fragment Length Distribution
# From To Count
# 27 29 4
# 30 32 30
# 33 35 141
# 36 38 696
# 39 41 1136 ..............etc
Does this mean that there are 4 reads from positions 27 to 29, and so on?
D) Finally, here are the report statistics.
# Paired Reads: 9686877
# Pairs Aligned: 6253455
# Read Sequences: 19373754
# Aligned: 14102273
# Unique Alignment: 14102068
# Gapped Alignment: 875179
# Quality Filter: 248607
# Homopolymer Filter: 1306
I understand that 2 times Paired Reads = Read Sequences. Please help me understand why 2 times Pairs Aligned < Aligned. Also, if I add Gapped Alignment to Unique Alignment, I do not get Aligned.
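For reference, this is the arithmetic I am doing on the numbers above, written as a small sketch. The dictionary keys are just the labels copied from the report and `check_counts` is a name I made up; the code only does the sums and does not explain what the counters mean.

```python
# Counts copied from the report statistics above.
stats = {
    "Paired Reads": 9686877,
    "Pairs Aligned": 6253455,
    "Read Sequences": 19373754,
    "Aligned": 14102273,
    "Unique Alignment": 14102068,
    "Gapped Alignment": 875179,
}


def check_counts(s):
    """Return the differences behind the relations I expected to hold."""
    return {
        # Holds: two reads per pair.
        "reads_equal_twice_pairs": 2 * s["Paired Reads"] == s["Read Sequences"],
        # Aligned reads beyond those accounted for by aligned pairs.
        "aligned_minus_twice_pairs_aligned": s["Aligned"] - 2 * s["Pairs Aligned"],
        # How far Unique + Gapped overshoots Aligned.
        "unique_plus_gapped_minus_aligned":
            s["Unique Alignment"] + s["Gapped Alignment"] - s["Aligned"],
        # Aligned reads that are not counted as unique.
        "aligned_minus_unique": s["Aligned"] - s["Unique Alignment"],
    }


for name, value in check_counts(stats).items():
    print(name, value)
```

So Aligned exceeds 2 times Pairs Aligned by 1595363 reads, and Unique Alignment plus Gapped Alignment overshoots Aligned by 874974. My guess is that Gapped Alignment is not a separate category from Unique Alignment, but I am not certain.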
Please advise.
Code:
@0:1:1:34:429 L GAAGNAAAAATAAAAGCATTAGNAGAAATTTGTACA IIII$IIIII&IIIIIIIIIII$IIIIIIIIIIIII U 14 91 >gi|9629357:1-9117 2177 F . 2308 R
@0:1:1:34:429 R TNCTTATTAAGCNCTCTGAAATNNANNNNTTTTCTC I$IIIIIIIIII$IIIIIIIII$$'$$$$IIIIIII U 126 91 >gi|9629357:1-9117 2308 R . 2177 F 25A>G 36G>A