Hi,
I have about 100 cDNA sequences (let's call them "ref.") and I would like to know how many reads from the original Illumina dataset (10 million reads; let's call them "reads") align to each of them without gaps, either fully (i.e. the entire ref. sequence is contained in the read; see "read1" below) or partially (see "read2", "read3" and "read4" below).
Example:
Code:
ref.                  AGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATC
read1 ATGCGCTAGCTAGCATAGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATCTTGGACCGCATAGCATC
read2             ATTAAGTTCGGCCGCTCACCGCACC
read3                               CCGCACCGTCACGCCATCCAGGCATCATGCGCGATCTCAGC
read4                       GCCGCTCACCGCACC
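To make the two cases concrete, here is a rough Python sketch of how I would classify a single read against a single ref. The function name `classify` and the `min_overlap` threshold of 10 bases are my own assumptions, and it only checks exact matches on the forward strand (no mismatches, no reverse complement). It is meant to illustrate the criteria, not as something practical for 10 million reads.

```python
from collections import Counter

def classify(ref, read, min_overlap=10):
    """Return 'full', 'partial' or 'none' for an exact, ungapped match.

    min_overlap is an assumed threshold: the shortest overlap that
    still counts as a partial hit.
    """
    # Full: the whole ref. sequence is contained in the read (read1).
    if ref in read:
        return "full"
    # Partial: the read lies entirely inside the ref. (read4).
    if len(read) >= min_overlap and read in ref:
        return "partial"
    # Partial: the read hangs over one end of the ref.
    # Try the longest possible overlap first.
    longest = min(len(ref), len(read))
    for k in range(longest, min_overlap - 1, -1):
        # End of read matches start of ref. (read2), or
        # start of read matches end of ref. (read3).
        if read.endswith(ref[:k]) or read.startswith(ref[-k:]):
            return "partial"
    return "none"

REF = "AGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATC"
READS = {
    "read1": "ATGCGCTAGCTAGCATAGTTCGGCCGCTCACCGCACCGTCACGCCATCCAGGCATCTTGGACCGCATAGCATC",
    "read2": "ATTAAGTTCGGCCGCTCACCGCACC",
    "read3": "CCGCACCGTCACGCCATCCAGGCATCATGCGCGATCTCAGC",
    "read4": "GCCGCTCACCGCACC",
}

# Count how many reads fall into each class for this ref.
counts = Counter(classify(REF, seq) for seq in READS.values())
print(counts)
```

What I am after is essentially these per-ref. counts, but computed by a proper aligner over the whole dataset.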
Is there a "mapping" program that can do this? Can I use Bowtie2? It looks a bit complicated to use, given its extensive list of option arguments. It seems I would have to input one file containing all the sequences (ref. + reads), which would probably align all the sequences to each other and take ages?
Also, should I use the raw reads (paired-end) or the merged + unmerged reads?
Thanks for your help!