I received raw Illumina DGE expression data in FASTQ format and am trying to remove the 3' adaptor sequence from it.
Here are some sample reads from the raw data I got from the sequencing company:
@FC81M3VABXX:4:1101:1130:2169#0/1
GGATCTGGTTGGGTTATCCAGTACTTCTCGTATGGCGTCTTCTGCTTGA
+
eceaeedec_bddI_c^bccebUecRc^cXXZ__L^BBBBBBBBBBBBB
@FC81M3VABXX:4:1101:1110:2188#0/1
TTCAGGTGGTTTCTTCTCCAGTACTTCTCGTATGCCGTCTTCTGCTTGA
+
gggggfdffdgggggggggdgggggggggedfeefffdfefd^aeefa^
@FC81M3VABXX:4:1101:1184:2239#0/1
GAACATCACTGTAGACTTCCAGTACTTCTCGTATGCCGTCTTCTGCTTG
+
fffffffffffefMfdddddffeffffeffe[db[eedbceecececd^
I can find the Gex Adapter 2 for NlaIII gene expression (TCGTATGCCGTCTTCTGCTTG) at the end of the sequence, but the problem is that the tag sequence should be just 17 bp, and the remaining sequence doesn't seem to match Adapter 1.
Does anyone know how to get the correct tag sequences from the sample FASTQ above?
Many thanks!
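
To make the problem concrete, here is a minimal Python sketch of how the adapter position could be located in the three reads above. It is only an illustration, not a recommended trimming method: the function name `find_adapter` and the `max_mismatches` parameter are made up for this example, and the scan only considers full-length, ungapped matches of the Gex Adapter 2 sequence. It prints whatever precedes the adapter in each read, which is the part that is longer than the expected 17 bp tag.

```python
# Hypothetical helper for illustration only: scan a read for a full-length,
# ungapped occurrence of the adapter, tolerating a few mismatches.
ADAPTER = "TCGTATGCCGTCTTCTGCTTG"  # Gex Adapter 2, as quoted in the post

# The three sample reads from the post.
READS = [
    "GGATCTGGTTGGGTTATCCAGTACTTCTCGTATGGCGTCTTCTGCTTGA",
    "TTCAGGTGGTTTCTTCTCCAGTACTTCTCGTATGCCGTCTTCTGCTTGA",
    "GAACATCACTGTAGACTTCCAGTACTTCTCGTATGCCGTCTTCTGCTTG",
]


def find_adapter(read, adapter=ADAPTER, max_mismatches=1):
    """Return the 0-based start of the first adapter match, or -1 if none."""
    for start in range(len(read) - len(adapter) + 1):
        window = read[start:start + len(adapter)]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= max_mismatches:
            return start
    return -1


if __name__ == "__main__":
    for read in READS:
        pos = find_adapter(read)
        if pos < 0:
            print("adapter not found:", read)
        else:
            # Everything before the adapter: the tag plus the unexplained bases.
            print(pos, read[:pos])
```

Note that the first read only matches if one mismatch is allowed, since it has GGC where the adapter has GCC.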