  • Aligning/Mapping Illumina reads to reference in Geneious problem

    I am about to give up on this because Geneious just doesn't seem to give me what I am looking for, and all the other software seems to have a steep learning curve.

    I have 4 samples. Each sample has a varying amount of DNA from a pure culture of a known bacterium. These samples were run on a MiSeq and I have all my read data. Here is the problem: when I map the paired-end reads to the 16S rRNA gene of, say, a Vibrio, Geneious reports that >99% of the sequences are Vibrio, even though I know Vibrio DNA made up only 1/3 of the pooled sample. When I map the same reads to the reference for another bacterium I get similar results: Geneious says >99% of the sequences are Pseudomonas, even though only 1/3 of the sample DNA came from that genus. How can this be possible?

    I am using medium sensitivity. Also, just as a test, I aligned the 16S rRNA genes of the two bacteria and they showed only about 78% similarity. What is going on here? Should I use other parameters or maybe switch to another program?

  • #2
    How similar are these species? How much variation do you anticipate in the 16S region? Standard alignment tools may not be the best approach for this type of project.

    • #3
      Originally posted by snetmcom
      How similar are these species? How much variation do you anticipate in the 16S region? Standard alignment tools may not be the best approach for this type of project.
      Well, based on just the 16S rRNA gene they are approximately 78% similar, which is what I expected, but that is just a small part of their genomes. Do you have any other suggestions in terms of tools?

      • #4
        Yeah, you shouldn't be seeing that. Vibrio and Pseudomonas are from different orders (both are Gammaproteobacteria).

        If you paid for Geneious, I would definitely use their support. I did the free trial and it was good, but I couldn't justify the price. I would rather spend the money on other things, like sequencing.

        If you want a user-friendly free tool, you might try Galaxy.

        • #5
          In reply to Illusive Man's original post:
          Did you sequence total DNA or just 16S? The fact that >99% of your reads map to a 16S reference suggests the latter. In that case, which region of the gene did you sequence? Also, since 16S is highly conserved, is it really that surprising that basically all your reads map to some bacterial 16S reference, especially considering that you're using "medium sensitivity"? What does that setting mean, anyway? How similar does a read have to be to the reference in order to map at "medium sensitivity"? 50%? 75%?
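
To make the threshold question concrete, here is a minimal Python sketch. It is not how Geneious works internally: the sequences are made-up toy data, the comparison is ungapped, and the 75% and 95% cutoffs are hypothetical values chosen only to show that a read about 80% identical to a reference maps under a loose cutoff and fails under a strict one.

```python
def percent_identity(read: str, ref_window: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(read) != len(ref_window):
        raise ValueError("sequences must have equal length")
    matches = sum(a == b for a, b in zip(read, ref_window))
    return 100.0 * matches / len(read)


def best_identity(read: str, reference: str) -> float:
    """Best ungapped identity of the read over every window of the reference."""
    n = len(read)
    return max(
        percent_identity(read, reference[i:i + n])
        for i in range(len(reference) - n + 1)
    )


def maps(read: str, reference: str, min_identity: float) -> bool:
    """True if the read would be counted as mapped at this (hypothetical) cutoff."""
    return best_identity(read, reference) >= min_identity


# Toy data (invented): the read differs from the reference at 4 of 20 positions.
reference = "ACGTACGTTAGCCGATAGGC"
read = "ACGTTCGTTAGGCGATTGGA"

print(best_identity(read, reference))   # identity of the toy read
print(maps(read, reference, 75.0))      # loose cutoff
print(maps(read, reference, 95.0))      # strict cutoff
```

If the real cutoff behind "medium sensitivity" is anywhere near the loose end, reads from a different genus with a 16S gene about 78% similar could still be counted as mapped.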

          • #6
            In reply to Illusive Man's original post:
            For starters, it looks like you're doing amplicon sequencing and not whole-genome sequencing. If that's the case, then it makes perfect sense that a read aligner would map nearly all of your reads to any given 16S reference, because the 16S gene is far more conserved between species than protein-coding genes are. Geneious in particular tries hard to map as many reads as possible to the reference, because it assumes they belong there, so if you just take the overall mapped percentage at face value you will see results like this. There are advanced options in Geneious that change this behaviour, but read mapping still probably isn't what you should be doing.
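
A rough Python sketch of why mapping to one reference at a time can inflate the numbers, compared with assigning each read to whichever reference it matches best. Everything here is an assumption for illustration: the two "references" and three reads are invented toy sequences, the identity measure is ungapped, and the 70% cutoff is hypothetical. It is not Geneious's algorithm, and it is not a substitute for a proper 16S amplicon pipeline.

```python
from collections import Counter


def identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def mapped_fraction(reads, reference, min_identity):
    """Fraction of reads that pass the cutoff against ONE reference."""
    hits = sum(identity(r, reference) >= min_identity for r in reads)
    return hits / len(reads)


def assign_best(reads, references, min_identity):
    """Count reads per reference, giving each read only to its best match."""
    counts = Counter()
    for r in reads:
        scores = {name: identity(r, ref) for name, ref in references.items()}
        best = max(scores, key=scores.get)
        if scores[best] >= min_identity:
            counts[best] += 1
        else:
            counts["unassigned"] += 1
    return counts


# Toy references (invented), 80% identical to each other.
refs = {
    "Vibrio": "ACGTACGTTAGCCGATAGGC",
    "Pseudomonas": "ACGTTCGTTAGGCGATTGGA",
}
# Toy reads: one from "Vibrio", two from "Pseudomonas" (one with a mismatch).
reads = [
    "ACGTACGTTAGCCGATAGGC",
    "ACGTTCGTTAGGCGATTGGA",
    "GCGTTCGTTAGGCGATTGGA",
]

for name, ref in refs.items():
    print(name, mapped_fraction(reads, ref, 70.0))  # one reference at a time
print(assign_best(reads, refs, 70.0))               # best-match assignment
```

With a permissive cutoff, every toy read passes against either reference on its own, which mirrors the ">99% Vibrio" and ">99% Pseudomonas" results; only the best-match assignment recovers the 1:2 split.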

            So, what should you do? As I said, if you really are doing amplicon sequencing, then use a package such as QIIME or mothur for your analyses. First, they're designed for 16S amplicons, and second, what you were doing would be immediately rejected by reviewers if you tried to publish it (or at least I'd reject it based on your description so far).

            Also, bioinformatics is difficult, and doing it properly is even harder. My best advice would be to read the literature to see how other people are doing what you want to do, and then delve into the manuals and tutorials for those software packages so you know what they do, how they do it, and why you get the results that you do. If you don't put in that effort, not only will you be unable to tell whether your results are correct, but you'll never be able to figure out on your own what might have gone wrong.
