Hello,
I am currently analyzing sequencing data of total RNA from mouse samples. As can be expected with total RNA, I have quite a lot of ribosomal sequence. However, when aligning to my genome assemblies, I noticed that a very large fraction of my reads (half of the total) was unmapped, and that almost all of these unmapped reads could easily be mapped to "TPA: Mus musculus ribosomal DNA, complete repeating unit" using BLAST. This is where I get confused:
- According to the literature I've consulted, rDNA is composed of a multitude of tandem repeats of the above-linked sequence. However, that sequence is not present even once in any of the genome assemblies (cattle, mouse) that I use on a regular basis, despite the gene having been clearly identified, sequenced, and even mapped to specific chromosomes using silver staining. Why is that?
- Furthermore, what is present within the genomes are repetitive sequences labeled "SSU-rRNA_Hsa" and other similar-sounding names, but these are not organized in the way Nucleolar Organizer Regions (NORs) are supposed to be. Are these supposed to be inert or active copies of the rRNA genes?
- And, more practically, should I add the rDNA sequences to the bowtie2 indices I use for my alignments?
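To make that last point concrete, here is a minimal sketch of what I have in mind. It relies on `bowtie2-build` accepting a comma-separated list of reference FASTA files followed by an index basename. The file names and the helper functions `bowtie2_build_command` and `build_index` are hypothetical placeholders, not anything from my actual pipeline.

```python
import subprocess
from typing import List, Sequence


def bowtie2_build_command(fasta_paths: Sequence[str], index_basename: str) -> List[str]:
    """Return the argv for bowtie2-build over several reference FASTA files.

    bowtie2-build takes its reference input as a comma-separated list of
    FASTA files, followed by the basename of the index files it writes.
    """
    if not fasta_paths:
        raise ValueError("at least one FASTA file is required")
    if any("," in path for path in fasta_paths):
        # A comma inside a path would be read as a list separator.
        raise ValueError("FASTA paths must not contain commas")
    return ["bowtie2-build", ",".join(fasta_paths), index_basename]


def build_index(fasta_paths: Sequence[str], index_basename: str) -> None:
    """Run bowtie2-build; assumes it is installed and on PATH."""
    subprocess.run(bowtie2_build_command(fasta_paths, index_basename), check=True)


if __name__ == "__main__":
    # Hypothetical file names: the genome assembly plus the rDNA repeat unit.
    cmd = bowtie2_build_command(
        ["mouse_genome.fa", "mouse_rDNA_unit.fa"], "mouse_genome_plus_rDNA"
    )
    print(" ".join(cmd))
```

With an index built this way, the rDNA unit would show up as an extra reference sequence in the alignments, so reads hitting it could be counted or filtered separately.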
Thanks a lot for any help!