Hi all,
I downloaded some RNA-seq datasets from NCBI, but some libraries have three raw files, for example (from http://www.ncbi.nlm.nih.gov/sra?term=SRR059171):
300M SRX022780_SRR059171_1.fastq.bz2
300M SRX022780_SRR059171_2.fastq.bz2
11M SRX022780_SRR059171.fastq.bz2
I know the _1 and _2 files are the paired reads, but what is the third file? It is much smaller. Here are some reads from it:
@SRR059171.6873676 SL-XBB:7:120:1786:2047
NTGGNGTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+
!%%%!%%%!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@SRR059171.6873810 SL-XBB:7:120:1790:2045
NTCCANTNTCTTTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+
!%%%%!%!%%%%%!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
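To put a number on how many of the bases in that third file are uncalled, I could run something like the rough sketch below. It assumes plain 4-line FASTQ records (no wrapped sequence lines); the function names are my own, and the file name in the usage comment is simply the one listed above.

```python
import bz2
from itertools import islice


def read_sequences(lines):
    """Yield the sequence line (2nd of every 4) from 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def n_fraction(seqs):
    """Return the fraction of all bases that are 'N' across the given reads."""
    total = 0
    n_count = 0
    for seq in seqs:
        total += len(seq)
        n_count += seq.upper().count("N")
    return n_count / total if total else 0.0


def n_fraction_in_file(path, max_reads=100000):
    """Compute the N fraction over the first max_reads reads of a .fastq.bz2 file."""
    with bz2.open(path, "rt") as handle:
        return n_fraction(islice(read_sequences(handle), max_reads))


# Usage (hypothetical local path):
# print(n_fraction_in_file("SRX022780_SRR059171.fastq.bz2"))
```

A value close to 1 would mean the file is almost entirely uncalled bases rather than usable sequence.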
Does that file contain adaptor sequences? I don't really understand adaptors; normally when I get sequences, I just map them and do further analysis.
The Nature Methods paper that used this dataset (Comprehensive comparative analysis of strand-specific RNA sequencing methods) mentions that the adaptors in the NNSR, Hybrid, and SMART libraries were trimmed before the reads were mapped to the genome.
But how can I know what the adaptor is and where it is? Is it at the 5' end, like XXXX in the read XXXXTTTTTTTTTTTATCG...? And does XXXX follow some pattern, like always being AACC?
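One check I could try myself is to count the most common leading k-mers in the reads: if a fixed adaptor sits at the 5' end, a single prefix should dominate. Below is a minimal sketch under that assumption. The helper names, the choice of k=4, and the file name in the usage comment are only illustrative, and it assumes 4-line FASTQ records. I realize it would not find an adaptor located at the 3' end or at a variable position.

```python
import bz2
from collections import Counter
from itertools import islice


def read_sequences(lines):
    """Yield the sequence line (2nd of every 4) from 4-line FASTQ records."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def leading_kmer_counts(seqs, k=4):
    """Count the first k bases of each read, skipping prefixes that contain N."""
    counts = Counter()
    for seq in seqs:
        prefix = seq[:k].upper()
        if len(prefix) == k and "N" not in prefix:
            counts[prefix] += 1
    return counts


def top_prefixes_in_file(path, k=4, max_reads=100000, top=5):
    """Return the most common k-base prefixes among the first max_reads reads."""
    with bz2.open(path, "rt") as handle:
        seqs = islice(read_sequences(handle), max_reads)
        return leading_kmer_counts(seqs, k).most_common(top)


# Usage (hypothetical local path):
# for prefix, count in top_prefixes_in_file("SRX022780_SRR059171_1.fastq.bz2"):
#     print(prefix, count)
```

If no prefix stands out from the rest, the reads probably do not start with a constant adaptor sequence.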
Thank you!