  • sd3
    Junior Member
    • Apr 2010
    • 6

    Short read alignments between species

    Hi,

    I have some Illumina paired-end genomic reads from a plant species without a genome sequence, so I wanted to align them to a related genome. I tried using bowtie (--seedmms 3 --maqerr 250) but I am getting very few alignments (<5% of paired ends, and ~10% for each end separately). I tried to use the -v option to increase the mismatches, but the limit seems to be 3 (the same as the seed mismatch limit, according to the manual). I guess my genetic distance is too great...
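    A rough way to see why a cap of 3 mismatches runs out quickly between species is to treat each base as differing independently with some probability. The sketch below is only a back-of-the-envelope binomial model: the read length (36) and the divergence values are hypothetical, and it ignores indels, sequencing errors and uneven divergence along the genome.

```python
from math import comb


def prob_within_mismatches(read_length, divergence, max_mismatches):
    """Probability that a read has at most `max_mismatches` differences
    from the reference, assuming each base differs independently with
    probability `divergence` (a simplifying assumption)."""
    return sum(
        comb(read_length, i)
        * divergence ** i
        * (1 - divergence) ** (read_length - i)
        for i in range(max_mismatches + 1)
    )


def prob_pair_within_mismatches(read_length, divergence, max_mismatches):
    """Same model for a read pair: both ends must pass independently."""
    single = prob_within_mismatches(read_length, divergence, max_mismatches)
    return single * single


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    for divergence in (0.02, 0.05, 0.10, 0.15):
        single = prob_within_mismatches(36, divergence, 3)
        pair = prob_pair_within_mismatches(36, divergence, 3)
        print(f"divergence={divergence:.2f} single={single:.3f} pair={pair:.3f}")
```

    Under this model the fraction of alignable pairs drops faster than the fraction of alignable single ends, because both mates have to stay under the limit.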

    Do people have a preferred aligner when aligning to a reference from another species, or would I be better off assembling the reads de novo and aligning them afterwards?

    Thanks, SD
  • natstreet
    Member
    • Nov 2009
    • 83

    #2
    If you want to continue using bowtie you should increase the allowed error to something much higher.

    I have also used mosaik for this kind of thing as there you can allow many more MM or specify an allowed %. One issue you will face is that to get enough reads mapping you will likely have to increase the allowed MM to such a degree that mapping becomes so ambiguous that the whole thing can be of questionable value.
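    The ambiguity problem can be put in rough numbers. The sketch below estimates how many chance hits a read would get in a purely random genome as the allowed mismatch count rises. Everything in it is an assumption for illustration: uniform base composition, substitutions only, and hypothetical read length and genome size. Real genomes contain repeats, so the true ambiguity will be worse than this estimate.

```python
from math import comb


def expected_random_hits(read_length, max_mismatches, genome_size):
    """Expected number of chance matches for one read in a random genome
    of `genome_size` bases (both strands), allowing up to
    `max_mismatches` substitutions."""
    # Number of distinct sequences within the allowed Hamming distance.
    neighbourhood = sum(
        comb(read_length, i) * 3 ** i for i in range(max_mismatches + 1)
    )
    # Start positions on both strands.
    positions = 2 * (genome_size - read_length + 1)
    return positions * neighbourhood / 4 ** read_length


if __name__ == "__main__":
    # Hypothetical values: 36 bp reads against a 1 Gb reference.
    for k in (0, 3, 6, 9, 12, 15):
        hits = expected_random_hits(36, k, 1_000_000_000)
        print(f"max mismatches={k:2d} expected chance hits={hits:.3g}")
```

    The expected count grows steeply with each extra mismatch, which is the point at which hits stop being informative.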

    I think I would go with your second option of a de novo assembly and aligning the assembled contigs. However, that's a whole other world of pain, and your success will depend heavily on how much Illumina data you have, on what combination of library insert sizes you have, and very much on the polymorphism rate of your species. Here are some very brief comments on some of the available assemblers:

    MIRA: Probably not worth trying unless your genome is very small, because it has such high memory requirements.

    SOAPdenovo: Many people report OK results, but you will likely get a very large number of very short contigs. The documentation is terrible and the mailing list is far from the best because the developers don't seem to read it.

    ABySS: Great mailing list and gives about the best result. The developers are really helpful and users on the mailing list will help with anything from newbie to advanced issues.

    Velvet: OK for small genomes but has really high RAM requirements otherwise (though not as bad as MIRA).

    Celera (CABOG): I can't say because I haven't tried it yet, but they recently added full support for Illumina data.

    clc: Commercial, so maybe not an option. Also a rather mysterious black box, but it is incredibly fast and has amazingly low RAM requirements (how they manage this is anyone's guess).


    • francois.sabot
      Member
      • Dec 2009
      • 41

      #3
      LASTZ has a specific module to perform exomapping, called FEAST.

      See http://www.ncbi.nlm.nih.gov/pubmed/20733242

      Alternatively, use BWA (which accepts more SNPs than bowtie and can handle indels) with relaxed settings.
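      As a sketch of what "relaxed settings" could look like for BWA's short-read mode, the helper below only builds the command lines; it does not run them. The file names and the numeric values are hypothetical and are not recommendations. The options used are `bwa aln` flags: -n (maximum edit distance), -o (maximum gap opens), -l (seed length) and -k (maximum edit distance in the seed). The reference is assumed to have been indexed with `bwa index` beforehand, and `bwa aln` writes to standard output, so each run needs redirecting to a .sai file.

```python
import shlex


def bwa_aln_command(reference, fastq, max_edit_distance=8,
                    max_gap_opens=2, seed_length=28, max_seed_edit=3):
    """Build a `bwa aln` argument list with loosened limits.
    Default values here are illustrative only."""
    return [
        "bwa", "aln",
        "-n", str(max_edit_distance),
        "-o", str(max_gap_opens),
        "-l", str(seed_length),
        "-k", str(max_seed_edit),
        reference, fastq,
    ]


def bwa_sampe_command(reference, sai_1, sai_2, fastq_1, fastq_2):
    """Build the `bwa sampe` argument list that pairs the two ends."""
    return ["bwa", "sampe", reference, sai_1, sai_2, fastq_1, fastq_2]


if __name__ == "__main__":
    # Hypothetical file names.
    ref = "related_genome.fa"
    for end in ("1", "2"):
        cmd = bwa_aln_command(ref, f"reads_{end}.fq")
        print(shlex.join(cmd), ">", f"reads_{end}.sai")
    print(shlex.join(bwa_sampe_command(
        ref, "reads_1.sai", "reads_2.sai", "reads_1.fq", "reads_2.fq")),
        ">", "aligned.sam")
```

      Raising -n and -k widens what can align at the cost of speed and more ambiguous hits, so it is worth comparing a few settings on a subset of reads first.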
      Francois Sabot, PhD

      Be realistic. Demand the Impossible.
      www.wikiposon.org


      • rwenang
        Member
        • Jan 2009
        • 31

        #4
        Since you already tried mapping and got few alignments, I don't think assembly will produce a better result. However, as francois.sabot said, you can try relaxing the mapping criteria (more mismatches, larger gaps), and imo you are better off using a hash-based mapping tool (maq, rmap, etc.) for that purpose.
