Hi,
I have paired-end Illumina sequencing data (100 bp per end) for Ago iCLIP in mouse ES cells.
The majority of read pairs show that the inserted fragments are very short. One example read pair is shown below.
In that read pair, the underlined part of each read (the stretch beginning with AGATCGGAAGAGC, right after the first 17 bases) aligns to the Illumina PCR primer at the other end, and the first 17 bp of the two reads are reverse complements of each other. This means that the insert fragment is only 17 bp long and each 100 bp read runs through the insert into the PCR primer (and sequencing primer) at the other end.
Also, I am curious what could cause such short insert fragments to show up in the sequencing data. After library construction, the Bioanalyzer results showed an average insert fragment length of ~100 bp.
@DBRHHJN1:278:C11RFACXX:3:1101:1443:1167 1:N:0:
ATAGGTATGCGCCACTGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAAAGCACAC
@DBRHHJN1:278:C11RFACXX:3:1101:1443:1167 2:N:0:
CAGTGGCGCATACCTATAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAAAAACACAAGAGAG
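For anyone who wants to check the reverse-complement observation on this read pair, here is a minimal Python sketch. It assumes the adapter read-through can be located by the 13-mer AGATCGGAAGAGC that both reads share; the constant `ADAPTER_PREFIX` and the helper `insert_from_pair` are names made up for this illustration, not part of any library.

```python
# The two reads of the example pair, copied from the post.
READ1 = "ATAGGTATGCGCCACTGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAAAGCACAC"
READ2 = "CAGTGGCGCATACCTATAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAAAAACACAAGAGAG"

# Assumption: the adapter portion of each read starts with this 13-mer,
# which is common to both reads in the example.
ADAPTER_PREFIX = "AGATCGGAAGAGC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def insert_from_pair(read1, read2, adapter_prefix=ADAPTER_PREFIX):
    """Return the insert (in read 1 orientation) if both reads hit the
    adapter at the same offset and the bases before it are reverse
    complements of each other; otherwise return None."""
    pos1 = read1.find(adapter_prefix)
    pos2 = read2.find(adapter_prefix)
    if pos1 <= 0 or pos1 != pos2:
        return None
    insert1, insert2 = read1[:pos1], read2[:pos2]
    if insert1 != reverse_complement(insert2):
        return None
    return insert1


insert = insert_from_pair(READ1, READ2)
print(insert, len(insert))  # ATAGGTATGCGCCACTG 17
```

This only tests an exact match, so it would miss pairs with sequencing errors in the insert or adapter. It is meant as a sanity check on single pairs, not as a replacement for a proper adapter trimmer.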
Could anyone explain what the "AAAAAAAAAAAAAAAAGCACAC" and "AAAAAAAAAAAAAAACACAAGAGAG" stretches after the underlined sequences are? How should I interpret them?
The FASTQ quality scores are low for these regions.
Here are the same reads with their FASTQ quality scores:
@DBRHHJN1:278:C11RFACXX:3:1101:1443:1167 1:N:0:
ATAGGTATGCGCCACTGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAAAGCACAC
+
@C@FFDDFHHHHHJJJJJIIJJJJJJJGHJJJJGIJJJJJJIJIJIIJIGHGGFC>ADBD@CDDDDDDDDDDDCCDCBDDDBDDDDB############
@DBRHHJN1:278:C11RFACXX:3:1101:1443:1167 2:N:0:
CAGTGGCGCATACCTATAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTAAAAAAAAAAAAAAACACAAGAGAG
+
CCCFFFFFHHHHHJJIJJJJJJJJJIJIHGIJJJJJIDIIIJJJJJIHIJCHFHHHHHFFFDDEDDDDDDDDEEEDDCCCBDB#################
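To put numbers on "the score is low", the quality strings can be decoded into Phred scores. The sketch below assumes the Phred+33 (Sanger / Illumina 1.8+) encoding, which is what the read header format suggests but is still an assumption; `phred_scores` and `low_quality_tail_length` are hypothetical helper names for this illustration.

```python
# Quality string of read 1, copied from the post.
QUAL1 = "@C@FFDDFHHHHHJJJJJIIJJJJJJJGHJJJJGIJJJJJJIJIJIIJIGHGGFC>ADBD@CDDDDDDDDDDDCCDCBDDDBDDDDB############"

# Assumption: Phred+33 encoding, i.e. score = ASCII code - 33.
PHRED_OFFSET = 33


def phred_scores(qual, offset=PHRED_OFFSET):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in qual]


def low_quality_tail_length(qual, threshold=2, offset=PHRED_OFFSET):
    """Count how many bases at the 3' end have a score <= threshold."""
    count = 0
    for ch in reversed(qual):
        if ord(ch) - offset > threshold:
            break
        count += 1
    return count


scores = phred_scores(QUAL1)
print("highest score:", max(scores))
print("lowest score:", min(scores))
print("low-quality tail length:", low_quality_tail_length(QUAL1))
```

Under this encoding `#` decodes to Q2 and `J` to Q41, so the run of `#` at the end of each read marks bases whose calls carry essentially no confidence.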