Solexa adapter sequences for paired-end data
Hi
I wonder if somebody can help me here. I have some Solexa data (paired-end reads) and want to map it to the genome with bwa/maq. However, when I did this only 20% of the sequences mapped, so I gather I should check the sequences for the presence of adapters and then repeat the alignment after trimming the data.
Which adapter sequences should I screen for? As this is paired-end data, I assume it's the 5' to 3' sequence of PE adapters 1 and 2, which I found in one of your posts:
Paired-end DNA
PE Adapter1:
5' -------------------- -----ACACTCTTTCCCTAC ACGACGCTCTTCCGATCT (-) -------------------- -------------------- -------------------- - 3'
3' -------------------- -----TGTGAGAAAGGGATG TGCTGCGAGAAGGCTAGp (-) -------------------- -------------------- -------------------- - 5'
PE Adapter2:
5' -------------------- -------------------- ------------------ (-) pGATCGGAAGAGCGGTTCAG CAGGAATGCCGAG------- -------------------- - 3'
3' -------------------- -------------------- ------------------ (-) TCTAGCCTTCTCGCCAAGTC GTCCTTACGGCTC------- -------------------- - 5'
In fasta format:
>Solexa-PairedEndAdapter1
ACACTCTTTCCCTACACGACGCTCTTCCGATCT
>SolexaPairedEndAdapter2
GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG
If I had single-end Solexa data, would I then screen for the same adapters?
Genomic DNA oligonucleotide sequences (from previous posting)
Adapters 1
5' P-GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
5' ACACTCTTTCCCTACACGACGCTCTTCCGATCT
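Before trimming anything, it may help to measure how many reads actually contain adapter sequence. The following is only a rough sketch of such a check, not a replacement for a proper trimming tool: it counts reads that contain an exact copy of the first few bases of each PE adapter listed above, in both the forward and reverse-complement orientation, since I am not sure which orientation ends up in the reads. The function names, the 12-base seed length, and the file name `reads_1.fastq` are my own assumptions, and exact matching will miss adapters that carry sequencing errors.

```python
# Rough adapter screen: count reads containing the first `seed_len` bases
# of each adapter, in forward and reverse-complement orientation.
# Names and the seed length are illustrative assumptions.

ADAPTERS = {
    "PE_Adapter1": "ACACTCTTTCCCTACACGACGCTCTTCCGATCT",
    "PE_Adapter2": "GATCGGAAGAGCGGTTCAGCAGGAATGCCGAG",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_fastq_sequences(path):
    """Yield the sequence line of each record in a plain 4-line FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every 4-line record
                yield line.strip().upper()


def count_adapter_hits(reads, adapters=ADAPTERS, seed_len=12):
    """Count reads containing an exact match to each adapter seed."""
    seeds = {}
    for name, adapter in adapters.items():
        seeds[name + "_fwd"] = adapter[:seed_len]
        seeds[name + "_rc"] = reverse_complement(adapter)[:seed_len]

    counts = {key: 0 for key in seeds}
    for read in reads:
        for key, seed in seeds.items():
            if seed in read:
                counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Hypothetical default file name; pass the real FASTQ path as an argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "reads_1.fastq"
    for key, n in count_adapter_hits(read_fastq_sequences(path)).items():
        print(key, n)
```

If only a small fraction of reads show a hit, adapter contamination is probably not the main reason for the low mapping rate, and it would be worth looking at other causes before trimming.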