  • Has anyone tried RUM for aligning/counting Illumina RNA-Seq data?


    The publication for this has just arrived: http://rna-seqblog.com/data-analysis...ed-mapper-rum/

    Based on speed and accuracy, it looks like a contender. Comparable to GSNAP in the accuracy stakes whilst being considerably faster.

    I'm wondering whether anyone has trialled it yet. I ran a simulated data set through it just to see how it behaved, and it ran quickly and smoothly. I am unsure how the read counts are calculated for closely related isoforms (i.e. how it distinguishes between them). I contacted the author about this but got no response.

    Has anyone else looked into this? Can you offer any thoughts?

    Thanks in advance.

  • #2
    Hi Fabrice,

    I am the developer of GSNAP. Regarding the speed of GSNAP, the latest version (starting with 2011-08-15) is about 4-8 times faster than the one that was tested in that Bioinformatics paper. The latest version of GSNAP uses k-mer sizes of 15 by default, which accounts for the speed up over the previous k-mer size of 12.
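For intuition on why a longer k-mer can help, here is a rough back-of-the-envelope sketch. It is not a description of GSNAP's internals: it assumes a uniformly random genome of about 3 billion bases (a human-sized figure chosen for illustration), which real genomes are not, and the function name `expected_random_hits` is made up for this example. It only shows that a specific 15-mer is expected to occur far less often by chance than a specific 12-mer, so fewer candidate positions need checking.

```python
def expected_random_hits(genome_size: int, k: int) -> float:
    """Expected occurrences of one specific k-mer in a uniformly random genome."""
    # There are 4**k distinct k-mers over the alphabet A, C, G, T.
    return genome_size / 4 ** k


# Assumption for illustration only: a genome of roughly human size.
GENOME_SIZE = 3_000_000_000

for k in (12, 15):
    hits = expected_random_hits(GENOME_SIZE, k)
    print(f"k={k}: {4 ** k} possible k-mers, ~{hits:.1f} chance hits per k-mer")
```

Under these assumptions, moving from k=12 to k=15 cuts the expected chance hits per k-mer by a factor of 4^3 = 64. Actual speed gains depend on the genome's repeat content and on the rest of the algorithm.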

    In addition, the latest version is more accurate, at least for paired-end reads, since I have incorporated the GMAP algorithm to handle difficult cases involving multiple indels or splicing.

    As always, you can download the latest source code for free at http://research-pub.gene.com/gmap.

    Regards,

    Thomas Wu



    • #3
      Hi Thomas,

      I'm trying to look at both RUM and gsnap for RNA-seq alignments and compare them to TopHat. gsnap looks like really nice software, and I noticed user lh3 recommended it! What command line options would you recommend to align/map RNA-seq to human/mouse genomes for 1) Illumina 36bp single-end reads and 2) Illumina 76bp paired-end reads?

      Thanks, Chris



      • #4
        Hi Chris,

        I tend to run GSNAP myself with mostly default options. However, for RNA-Seq, you need to tell the program to find novel splices with "-N 1", otherwise the program will think you have DNA-Seq data. You will also get better results at the ends of reads if you prepare a known splicesites file (see the README file for details), and use that with the "-s" flag to make use of known splice sites.

        You also may want to provide some reasonable value for an allowable number of mismatches with the "-m" flag. If you don't provide it, then GSNAP will pick a value that allows it to compute quickly, but it might pick a value too low for 36-mers. You might therefore select "-m 3" or "-m 4" or so for the 36-mers. For 76-mers, GSNAP will explore up to "-m 4", which is probably fine, or you could specify "-m 5".
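As a minimal sketch of how these suggestions might be assembled into command lines, the Python helper below builds (but does not run) a GSNAP argument list. The "-N 1", "-s", and "-m" flags come from this post. The "-d" flag for naming the genome database, the helper name `build_gsnap_command`, the database name "hg19", and all file names are assumptions for illustration, so please check them against the README for your GSNAP version.

```python
from typing import List, Optional


def build_gsnap_command(
    genome_db: str,
    fastq_files: List[str],
    max_mismatches: Optional[int] = None,
    splicesites: Optional[str] = None,
) -> List[str]:
    """Build a GSNAP argument list for RNA-Seq reads (hypothetical helper)."""
    # "-d" (genome database) is an assumption; "-N 1" enables novel splicing.
    cmd = ["gsnap", "-d", genome_db, "-N", "1"]
    if splicesites is not None:
        # Known splice sites file, prepared as described in the README.
        cmd += ["-s", splicesites]
    if max_mismatches is not None:
        # Without "-m", GSNAP chooses its own mismatch limit.
        cmd += ["-m", str(max_mismatches)]
    return cmd + list(fastq_files)


# 36bp single-end reads with "-m 3" (file and database names are placeholders).
single_end = build_gsnap_command("hg19", ["reads_36bp.fastq"], max_mismatches=3)

# 76bp paired-end reads with "-m 5" and a known splice sites file.
paired_end = build_gsnap_command(
    "hg19",
    ["reads_76bp_1.fastq", "reads_76bp_2.fastq"],
    max_mismatches=5,
    splicesites="known_splicesites",
)

print(" ".join(single_end))
print(" ".join(paired_end))
```

Building a list rather than a single string means the result can be handed to `subprocess.run` without shell quoting issues, once the flags have been verified.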

        Regards,

        Tom



        • #5
          Thanks for that Tom,

          A bit later, I also found some other links where users had posted different command line options:



          Application of sequencing to RNA analysis (RNA-Seq, whole transcriptome, SAGE, expression analysis, novel organism mining, splice variants)


          Chris
