Hi!
Could you help me with my long list of problems? I have Illumina HiSeq paired-end RNA data from a non-model organism. The reads contain a lot of rRNA and other contamination, and their quality scores are quite low.
I'm trying to filter the rRNA out of the samples using full-length shotgun sequences from the closest model species. I can't use the species whose rRNA genes we have found in the data, because only partial sequences are available for them. So I downloaded the SSU and LSU rRNA r117 sequences from Silva as FASTA without gaps and decompressed the .tar.gz files into FASTA before use. Should I use some format other than FASTA without gaps?
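One thing I have not ruled out: as far as I understand, the Silva exports are written in the RNA alphabet (U instead of T), whereas the reads use T, and I am not sure how Bowtie2 treats U in a reference. Below is a minimal sketch of converting a reference to the DNA alphabet before indexing, assuming a plain ungapped FASTA file. The function names are made up for illustration and are not part of any tool.

```python
def rna_fasta_to_dna(lines):
    """Yield FASTA lines with U/u replaced by T/t in sequence lines.

    Header lines (starting with '>') are passed through unchanged so that
    names containing the letter U are not altered.
    """
    table = str.maketrans("Uu", "Tt")
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            yield line.translate(table)


def convert_file(src_path, dst_path):
    """Write a DNA-alphabet copy of the FASTA file at src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(rna_fasta_to_dna(src))
```

Something like convert_file("C_brenneriSSU.fasta", "C_brenneriSSU_dna.fasta") would then produce the file to index (the output name is only an example).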
I built the index from the Silva files and aligned the reads with Bowtie2 using the simplest possible commands:
bowtie2-build C_brenneriSSU.fasta C_brenneriSSU_index
bowtie2-align -x C_brenneriSSU_index -1 ove2_R1.fastq -2 ove2_R2.fastq -S result.fastq
But nothing is aligning. Why is that? The reads are untrimmed, because I've read that Bowtie2 works best with the original reads.
We have also tried aligning consensus sequences of contigs, built by a CLC assembly of the combined samples, against some example rRNA sequence files and the Silva files, but they don't align to the rRNA data with Bowtie2 either. Any ideas why not?
And most importantly, what should I do to find the rRNA sequences? I'm familiar with extracting the unmapped reads using samtools, so it's basically just the alignment that seems to be the problem.
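For context, the downstream step I have in mind is keeping only the read pairs where neither mate hit the rRNA reference. Here is a rough sketch of that selection logic on SAM text, which I believe corresponds to what samtools view -f 12 selects (flag bits 0x4 and 0x8 both set). The function names are just for illustration.

```python
READ_UNMAPPED = 0x4   # SAM flag bit: this read is unmapped
MATE_UNMAPPED = 0x8   # SAM flag bit: the mate of this read is unmapped


def both_mates_unmapped(sam_line):
    """Return True for an alignment record whose read and mate are both unmapped."""
    if sam_line.startswith("@"):
        return False  # header lines carry no flag field
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # second SAM column is the bitwise FLAG
    required = READ_UNMAPPED | MATE_UNMAPPED
    return (flag & required) == required


def unmapped_pair_names(sam_lines):
    """Collect the names of read pairs in which neither mate aligned."""
    names = set()
    for line in sam_lines:
        if both_mates_unmapped(line):
            names.add(line.split("\t", 1)[0])
    return names
```

The names returned would be the candidate non-rRNA pairs. A pair with only one mate aligned is treated as rRNA here, which is a design choice rather than a requirement.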