  • Illusive Man
    Member
    • Sep 2013
    • 15

    Aligning/Mapping Illumina reads to reference in Geneious problem

    I am about to give up on this because Geneious just doesn't seem to give me what I am looking for, and all the other software seems to have a steep learning curve.

    I have 4 samples. Each sample has a varying amount of DNA from a pure culture of a known bacterium. These samples were run on a MiSeq and I have all my read data. Here is the problem. When I map the paired-end reads to the 16S rRNA gene of, say, a Vibrio, I get >99% of the sequences as Vibrio, even though I know the amount of Vibrio DNA was only 1/3 of the pooled sample. When I do the same map-to-reference with another bacterium I get similar results: Geneious will say >99% of the sequences are Pseudomonas, even though only 1/3 of the sample DNA came from that genus. How can this be possible?

    I am using medium sensitivity. Also, just as a test, I aligned the 16S rRNA genes of both bacteria, and they showed only about 78% similarity. What is going on here? Should I use other parameters or maybe switch to another program?
  • snetmcom
    Senior Member
    • Oct 2008
    • 159

    #2
    How similar are these species? How much variation do you anticipate in the 16S region? Standard alignment tools may not be the best approach for this type of project.

    • Illusive Man
      Member
      • Sep 2013
      • 15

      #3
      Well, based on just the 16S rRNA gene, approximately 78%, which is what I expected, but that is just a small part of their genomes. Do you have any other suggestions in terms of tools?

      • cliffbeall
        Senior Member
        • Jan 2010
        • 144

        #4
        Yeah, you shouldn't be seeing that. Vibrio and Pseudomonas are in different orders of the Gammaproteobacteria.

        If you paid for Geneious, I would definitely use their support. I did the free trial; it was good, but I couldn't justify the price. I would rather spend the money on other things, like sequencing.

        If you want a user-friendly free tool, you might try Galaxy.

        • rhinoceros
          Senior Member
          • Apr 2013
          • 372

          #5
          Did you sequence total DNA or just 16S? The fact that >99% of your reads map to a 16S reference would suggest the latter. In that case, which region of the gene did you sequence? Also, since 16S is highly conserved, is it really that surprising that basically all your reads map to some bacterial 16S reference, especially considering that you're using "medium sensitivity"? What does that setting mean, anyway? How similar does a read have to be in order to map with the "medium sensitivity" setting? 50%? 75%?
          savetherhino.org
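
To make the question about similarity cutoffs concrete, here is a minimal sketch of how a minimum-identity threshold decides whether a read "maps". It is not how Geneious scores reads (its "medium sensitivity" settings are not described in this thread); the function names, the ungapped sliding-window comparison, and the toy sequences are all assumptions made up for illustration.

```python
def best_identity(read: str, reference: str) -> float:
    """Highest ungapped identity of `read` against any same-length window of `reference`."""
    n = len(read)
    if n == 0 or n > len(reference):
        return 0.0
    best = 0.0
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        matches = sum(a == b for a, b in zip(read, window))
        best = max(best, matches / n)
    return best


def maps_to(read: str, reference: str, min_identity: float) -> bool:
    """A read 'maps' if its best window identity reaches the chosen cutoff."""
    return best_identity(read, reference) >= min_identity


# Toy sequences (hypothetical, not real 16S): REF_B differs from REF_A at 3 of 30 positions.
REF_A = "ACGTACGTTAGCCGATAGGCTTAACCGGTA"
REF_B = "ACGTACGTTAGGCGATAGCCTTAACTGGTA"
READ = REF_A[5:25]  # a 20-base read drawn from REF_A

for threshold in (0.75, 0.97):
    print(threshold, maps_to(READ, REF_A, threshold), maps_to(READ, REF_B, threshold))
```

With these toy sequences the read passes the 0.75 cutoff against both references, but passes the 0.97 cutoff only against the reference it came from. A permissive cutoff on a conserved gene therefore says little about which organism a read belongs to.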

          • mcnelson.phd
            Senior Member
            • Jul 2011
            • 162

            #6
            For starters, it looks like you're doing amplicon sequencing and not whole-genome sequencing. If that's the case, then it makes perfect sense that a read aligner would map all of your reads to any given 16S, because the 16S gene is so conserved between species compared to protein-coding genes. Geneious especially tries really hard to map as many reads as possible to the reference, because it assumes that they should map, and if you just take the blanket values then you're going to see results like this. There are advanced options in Geneious that you can use to change this behaviour, but that still probably isn't what you should be doing.

            So, what should you do? As I said, if you really are doing amplicon sequencing, then use a package such as QIIME or mothur for your analyses. First, they're designed for 16S amplicons, and second, what you were doing would be immediately rejected by reviewers if you tried to publish it (or at least I'd reject it based on your description so far).

            Also, bioinformatics is very difficult, and even more so to do properly. My best advice would be to read the literature to see how other people are doing what you want to do, and then delve into the manuals/tutorials for those software packages so you know what they do, how they do it, and why you get the results that you do. If you don't put forth that effort, not only will you not be able to tell if you're getting the correct results, but you'll never be able to figure out on your own what might have gone wrong.
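
The point about an aligner mapping nearly every read to any single 16S reference can be illustrated with a toy comparison. The sketch below contrasts mapping against each reference separately with a competitive assignment, in which every read is scored against all references at once and counted only for its best hit. It is an illustration under stated assumptions, not the method used by Geneious, QIIME, or mothur: the reference names, sequences, scoring function, and cutoff are all hypothetical.

```python
from collections import Counter


def identity(read: str, reference: str) -> float:
    """Best ungapped identity of `read` against any same-length window of `reference`."""
    n = len(read)
    if n == 0:
        return 0.0
    windows = (reference[i:i + n] for i in range(len(reference) - n + 1))
    return max((sum(a == b for a, b in zip(read, w)) / n for w in windows), default=0.0)


def separate_mapping(reads, references, min_identity):
    """Fraction of reads that map to each reference when each is used on its own."""
    return {
        name: sum(identity(r, ref) >= min_identity for r in reads) / len(reads)
        for name, ref in references.items()
    }


def competitive_mapping(reads, references, min_identity):
    """Fraction of reads assigned to their single best-scoring reference."""
    counts = Counter()
    for r in reads:
        scores = {name: identity(r, ref) for name, ref in references.items()}
        best = max(scores, key=scores.get)
        if scores[best] >= min_identity:
            counts[best] += 1
    return {name: counts[name] / len(reads) for name in references}


# Hypothetical references that differ from one another at only a few positions.
REFERENCES = {
    "ref_A": "ACGTACGTTAGCCGATAGGCTTAACCGGTA",
    "ref_B": "ACGTACGTTAGGCGATAGCCTTAACTGGTA",
    "ref_C": "ACGTACGTAAGCCGCTAGGCTGAACCGGTA",
}
# One 20-base read drawn from each reference, so the true mix is 1/3 each.
READS = [ref[5:25] for ref in REFERENCES.values()]

print(separate_mapping(READS, REFERENCES, 0.70))
print(competitive_mapping(READS, REFERENCES, 0.70))
```

Mapped separately with a permissive cutoff, every toy read maps to every reference, so each reference appears to account for all of the reads. Assigned competitively, each reference receives one third of the reads, which matches how the toy pool was built.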
