Hi all,
I have a question about RNA-seq data analysis.
I prepared several RNA-seq libraries from bivalve larvae reared in hatchery conditions, so they most probably include reads from unknown "contaminant" species such as algae and bacteria.
I removed ribosomal RNAs and assembled a transcriptome.
I then mapped each library against the assembled transcriptome in order to do differential expression (DE) analysis.
I'd like your opinion on how to handle reads from other species. They can end up in the assembled transcriptome (as transcripts from the contaminant species or, in the worst case, as chimeras), and they are also counted in the mapping step.
What do you suggest among these options?
1- Remove assembled transcripts based on blastn results. I would eliminate transcripts whose best match is to bacteria or plants (BLAST against the whole nr database), then map all reads against the "cleaned" transcriptome and do the DE analysis. Obviously the "contaminant" reads will then not map.
2- Do not eliminate anything: assemble the transcriptome, map the reads, do the DE analysis, and consider only the DE transcripts that do not match bacteria or plants (blastn).
Or do you have other suggestions based on your experience?
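To make the two options concrete, here is a minimal Python sketch of the filtering step they share. It assumes the BLAST hits have already been reduced to a tab-separated table with three columns: transcript ID, bit score, and a taxon-group label. That label, the group names (`Bacteria`, `Viridiplantae`, `Mollusca`), the transcript IDs, and the function names are all hypothetical and only for illustration; the taxon group would have to be derived separately from the taxonomy of each hit.

```python
import csv
import io

# Assumed labels for groups treated as contamination (hypothetical names).
CONTAMINANT_GROUPS = {"Bacteria", "Viridiplantae"}


def read_hits(handle):
    """Read (transcript_id, bitscore, taxon_group) rows from a TSV handle."""
    return [tuple(row) for row in csv.reader(handle, delimiter="\t") if row]


def best_hit_groups(hits):
    """Map each transcript to the taxon group of its highest-scoring hit."""
    best = {}
    for transcript_id, bitscore, group in hits:
        score = float(bitscore)
        if transcript_id not in best or score > best[transcript_id][0]:
            best[transcript_id] = (score, group)
    return {tid: group for tid, (_, group) in best.items()}


def flag_contaminants(hits, contaminant_groups=CONTAMINANT_GROUPS):
    """Return the transcripts whose best hit falls in a contaminant group."""
    groups = best_hit_groups(hits)
    return {tid for tid, group in groups.items() if group in contaminant_groups}


# Toy input: t1 has a weaker bacterial hit but its best hit is molluscan.
example = io.StringIO(
    "t1\t250.0\tMollusca\n"
    "t1\t90.0\tBacteria\n"
    "t2\t310.0\tBacteria\n"
    "t3\t120.0\tViridiplantae\n"
)
contaminants = flag_contaminants(read_hits(example))

# Option 1: drop flagged transcripts before mapping (t4 has no hit, so it stays).
all_transcripts = ["t1", "t2", "t3", "t4"]
cleaned = [t for t in all_transcripts if t not in contaminants]

# Option 2: map against everything, then drop flagged transcripts from the DE list.
de_transcripts = ["t1", "t2"]
de_kept = [t for t in de_transcripts if t not in contaminants]
```

In this sketch a transcript with no hit at all is kept in both options, and only the single best hit decides its fate; both are choices that may need adjusting.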
I would really appreciate your input.
Many thanks.
Marianna