
  • How to close rRNA gaps in a genome

    Hello everyone,

    I have sequenced and assembled my bacterial genome from PE 100bp Illumina reads into a series of contigs using Velvet. However, when I attempted to align the contigs against a very closely related strain using Mauve, I noticed that all the contig breaks correspond to annotated rRNA operons on my reference genome.

    There are five other fully sequenced genomes of my species, and they all have ~8 rRNA operons; however, my assembly has no annotated 5S, 16S or 23S features.

    I have only used ~10 million reads from a total of ~255 million reads produced, so my question is this: How can I pull out the rRNA sequences from the total Illumina dataset and use these to join contigs/close gaps? Looking forward to some potential feedback, thanks!
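
    One way to pull rRNA-derived reads out of a large dataset without assembling everything is k-mer baiting against a known rRNA sequence from a related strain. The following is a minimal sketch, not a method proposed in this thread; the function names, the k-mer size and the `min_shared` threshold are illustrative assumptions.

```python
def reverse_complement(seq):
    """Return the reverse complement; unknown bases become N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_bait_index(reference_seqs, k):
    """Collect k-mers from both strands of the reference rRNA sequences."""
    index = set()
    for seq in reference_seqs:
        index |= kmers(seq, k)
        index |= kmers(reverse_complement(seq), k)
    return index


def recruit_reads(reads, bait_index, k, min_shared=1):
    """Return names of reads sharing at least min_shared k-mers with the bait.

    reads is an iterable of (name, sequence) tuples.
    """
    hits = []
    for name, seq in reads:
        if len(kmers(seq, k) & bait_index) >= min_shared:
            hits.append(name)
    return hits
```

    With real 100 bp reads a larger k (for example 25 to 31) and a higher `min_shared` would reduce spurious hits; the recruited reads could then be assembled on their own.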

  • #2
    Why did you only use a small part?

    If you have no rRNAs at all, then it first means that they couldn't be assembled (none at all is a bit weird, but whatever).
    Maybe it's an issue with Velvet, no clue.
    It might also be a coverage issue, although it shouldn't be.

    Basic first recommendation:
    Map your reads to your assembly and pull out the reads that do not map (there's a bowtie2 option for that).
    Try to assemble them separately with another assembler.
    Then see if CAP3 can stitch the rRNAs + your assembly together.
    Otherwise, try to scaffold against the reference with e.g. Contiguator.
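
    The "keep the reads that do not map" step can also be done after the fact on any SAM output, since unmapped reads carry the 0x4 bit in the FLAG column. A minimal sketch, assuming the alignments are available as plain-text SAM lines; the function name is hypothetical.

```python
def unmapped_read_names(sam_lines):
    """Return names of reads flagged as unmapped (FLAG bit 0x4), in input order.

    Header lines (starting with '@') and malformed lines are skipped.
    For paired data both mates share one name, so selecting by name
    keeps the pair together when extracting from the FASTQ files.
    """
    names = []
    seen = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        name, flag = fields[0], int(fields[1])
        if flag & 0x4 and name not in seen:
            seen.add(name)
            names.append(name)
    return names
```

    The resulting name list can be used to extract the matching records from the original FASTQ files before handing them to a second assembler.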

    But another question: If these are your only gaps, why bother? Apparently your genome is nearly complete, and making it fully circular will probably cost you quite a bit of time. Might not be worth it.



    • #3
      I can only run a small portion of my reads as I only have my personal laptop to perform assemblies on. However, it seems that including more reads does not improve the assembly. That is, roughly the same contigs are returned when assembling with more or fewer reads.

      I do have several small contigs which contain rRNA sequence, and each has an eight-fold greater read coverage than the rest of the genome, indicating the eight rRNA operons are being assembled as one. However, each contig is smaller than the size of a 16S or 23S gene, which points to Velvet not being able to extend the contigs.
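
      The coverage argument above can be made explicit: a collapsed repeat's copy number is roughly its coverage divided by the coverage of single-copy sequence. A minimal sketch with made-up contig names and numbers; using the median coverage of long contigs as the single-copy baseline is an assumption.

```python
from statistics import median


def estimate_copy_numbers(contigs, min_baseline_length=10000):
    """Estimate copy number per contig from relative coverage.

    contigs maps contig name -> (length, mean coverage).
    The single-copy baseline is the median coverage of contigs that are
    at least min_baseline_length long (assumed to be mostly unique sequence).
    """
    baseline_values = [
        cov for length, cov in contigs.values() if length >= min_baseline_length
    ]
    if not baseline_values:
        raise ValueError("no contigs long enough to define a baseline")
    baseline = median(baseline_values)
    return {name: cov / baseline for name, (length, cov) in contigs.items()}


# Hypothetical values for illustration only.
example = {
    "contig_1": (500000, 50.0),
    "contig_2": (300000, 52.0),
    "contig_3": (400000, 48.0),
    "rrn_fragment": (900, 400.0),
}
ratios = estimate_copy_numbers(example)
print(round(ratios["rrn_fragment"]))  # 400 / 50 = 8
```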

      I have gone ahead and mapped my reads to the closely related reference genome using bwa. I was able to map reads to each of the rRNA operons and the surrounding sequence, and extract a consensus sequence which ought to bridge these gaps. Are CAP3/Contiguator programs that can create larger scaffolds using additional contigs?

      As it stands, the genome is likely assembled well enough for our downstream applications of ChIP-seq and RNA-seq. However, I would like to close the gaps because 1) we seem to have the read data, it's just a matter of assembling/mapping it, and 2) I don't know if we can submit the genome as a draft when it lacks any rRNA operons and therefore the translational machinery.



      • #4
        Originally posted by Tom_C
        I can only run a small portion of my reads as I only have my personal laptop to perform assemblies on. However, it seems that including more reads does not improve the assembly. That is, roughly the same contigs are returned when assembling with more or fewer reads.
        What percentage of your data maps back to the assembly?
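
        The mapping percentage can be computed from the FLAG column of a SAM file. A minimal sketch, assuming plain-text SAM lines as input; the function name is hypothetical, and tools such as `samtools flagstat` report comparable numbers.

```python
def mapped_percentage(sam_lines):
    """Return the percentage of primary read records that are mapped.

    Secondary (0x100) and supplementary (0x800) records are ignored so
    that each read end is counted once. Header lines are skipped.
    """
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        flag = int(fields[1])
        if flag & 0x900:
            continue
        total += 1
        if not flag & 0x4:
            mapped += 1
    if total == 0:
        raise ValueError("no alignment records found")
    return 100.0 * mapped / total
```

        A value well below 100% would suggest that a substantial share of the data is not represented in the contigs.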

        Originally posted by Tom_C
        I have gone ahead and mapped my reads to the closely related reference genome using bwa. I was able to map reads to each of the rRNA operons and the surrounding sequence, and extract a consensus sequence which ought to bridge these gaps. Are CAP3/Contiguator programs that can create larger scaffolds using additional contigs?
        CAP3 is one of those really, really old assemblers that don't use a graph approach but try to overlap sequences directly.
        If you think you might have overlapping sequences, you can try throwing them in; maybe they can be merged.
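
        To illustrate what "directly overlapping sequences" means, here is a toy sketch that merges two sequences when the end of one exactly matches the start of the other. It is not CAP3's algorithm (CAP3 tolerates mismatches and uses quality values); the function names and the `min_overlap` default are assumptions.

```python
def longest_overlap(left, right, min_overlap):
    """Length of the longest exact suffix(left)/prefix(right) match, or 0."""
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_if_overlapping(left, right, min_overlap=20):
    """Merge two sequences across an exact overlap; return None if none found."""
    size = longest_overlap(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# A contig end and an extracted rRNA consensus sharing the 6 bases "GGATTC".
print(merge_if_overlapping("ACGTACGGATTC", "GGATTCAAGT", min_overlap=4))
# ACGTACGGATTCAAGT
```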

        Contiguator is a program that scaffolds contigs against a reference, so that you get one big .fasta sequence with a few gaps, which is easier to handle in this case.

        Trying to make a consensus from the extracted rRNA sequences and your assembled contigs is definitely worth a thought IMHO. If the mapping quality is good and the reads map uniquely in all cases, then I wouldn't see a problem, but it's not 100% clean science.
        Maybe some of your lab people can PCR + sequence through the dubious regions; that would be the cleanest thing.

        Originally posted by Tom_C
        2) I don't know if we can submit the genome as a draft when it lacks any rRNA operons and therefore the translational machinery.
        Submit to...
        - databases: Nobody cares
        - a journal: People might care, but since your reads map nicely to the related rRNA operons, I'd write that down and hope that people accept the reasoning.
