Hi all,
I'll be working on a large, publicly available metatranscriptomics (meta-RNA-seq) dataset from a human host, and I'm not familiar with the experimental protocol used to generate it (e.g. how rRNA depletion was carried out).
I was wondering whether someone with metatranscriptomics analysis experience could comment on how much residual human sequence (mRNA) they have found in the raw reads after sequencing. I understand that, depending on the experimental procedures and kits used, the raw reads can contain >80% rRNA/tRNA from both the host and the associated microbiota, but I don't know how much human mRNA could remain when the host is human. Can it effectively be ~0? I'd appreciate any comments.
I scanned the literature, and people almost always use a custom mapping database whose reference sequences come from the organisms of interest (excluding the host), which in a way eliminates the need for host-specific pre-filtering of raw/QC'd reads. So it is not clear to me what would happen if the reference database contained host-specific genes.
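To make the question concrete, here is a rough sketch of how I imagine tallying this if host genes were included in the reference. It assumes I already have one best-hit reference name per read from some aligner (or None for unmapped reads). The function name `summarize_read_origins`, the input format and the toy gene names are all made up for illustration and are not taken from any particular tool.

```python
from collections import Counter


def summarize_read_origins(assignments, host_refs):
    """Return the fraction of reads hitting host, non-host, or nothing.

    assignments: iterable of (read_id, ref_name) pairs, where ref_name is
                 None for reads without a hit (hypothetical input format).
    host_refs:   set of reference names known to belong to the host.
    """
    counts = Counter()
    for _read_id, ref_name in assignments:
        if ref_name is None:
            counts["unmapped"] += 1
        elif ref_name in host_refs:
            counts["host"] += 1
        else:
            counts["non_host"] += 1

    total = sum(counts.values())
    categories = ("host", "non_host", "unmapped")
    # Avoid division by zero when no reads were supplied.
    return {c: (counts[c] / total if total else 0.0) for c in categories}


# Toy data only: four reads, one of which hits a host reference.
toy_assignments = [
    ("read1", "host_gene_A"),
    ("read2", "microbe_gene_X"),
    ("read3", None),
    ("read4", "microbe_gene_Y"),
]
fractions = summarize_read_origins(toy_assignments, {"host_gene_A"})
print(fractions)
```

If the "host" fraction came out far from zero on real data, I suppose that would argue for an explicit host-filtering step before mapping, but I'd like to hear what others actually see.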
Cheers