  • Pipeline for de novo RNA sequencing, and Galaxy

    Hi all, I'm trying to figure out a good pipeline for a de novo RNA sequencing project (hybrid assembly); I figure I will have to use MIRA. I will have paired ~100 bp Illumina HiSeq data and am hoping to someday also get 454 FLX data (they are having trouble with the new chemistry, so I don't know when this will be). I'll be making a hybrid transcriptome assembly, then mapping the Illumina sequences back to the assembly to quantify reads. This is the first time I've ever done anything like this. Can people suggest pipelines that would be good to try? Also, what metrics should I use to judge whether my assembly is good or not? I don't have a reference genome to map to.

    I'm not used to command-line interfaces, so if anyone has used MIRA and can share an example of the commands they ran, I'd be grateful.

    Also, I've come across Galaxy, which apparently lets you run MIRA from within it. Has anyone done this, and did you have problems? Has anyone had problems in general with programs not working correctly under Galaxy?

    Thanks for any help you can provide to this noob.

  • #2
    If you know someone else who has done RNA-Seq on your organism, you could test the assembly with their data. Any de novo assembly should map your own reads back to it at a high rate, but it might not do so well with someone else's reads.
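As a rough illustration of the "map the reads back" check, here is a minimal Python sketch that counts how many reads in a SAM file are flagged as mapped. It assumes you already have SAM output from whichever aligner you choose; the function name `mapping_rate` and the demo records are made up for illustration. It relies only on the SAM FLAG bits for unmapped (0x4), secondary (0x100) and supplementary (0x800) records.

```python
import io


def mapping_rate(sam_handle):
    """Return (mapped, total, fraction) counted over primary SAM records."""
    mapped = 0
    total = 0
    for line in sam_handle:
        # Skip header lines (they start with '@') and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        flag = int(line.split("\t", 2)[1])
        # Skip secondary and supplementary records so each read counts once.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        # Bit 0x4 set means the read is unmapped.
        if not flag & 0x4:
            mapped += 1
    fraction = mapped / total if total else 0.0
    return mapped, total, fraction


# Tiny in-memory demo with made-up records (not real data).
demo_sam = io.StringIO(
    "@HD\tVN:1.0\n"
    "r1\t0\tcontig1\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "r3\t16\tcontig2\t5\t255\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r3\t272\tcontig3\t9\t0\t4M\t*\t0\t0\tACGT\tIIII\n"
)
print(mapping_rate(demo_sam))
```

Running the same function on SAM files from your own reads and from someone else's reads against the same assembly gives two fractions you can compare directly.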

    • #3
      I have had good results with Trinity for Illumina data. I guess it won't be too happy about 454 reads though, unless you pre-process them to correct homopolymer errors.

      Metrics for de novo transcriptomes are difficult to define. We have tried mapping the transcript contigs to the transcripts of similar organisms to get an idea of completeness. You could also look at the contig length distribution and compare it to that of a similar organism.
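To make the contig length comparison concrete, here is a minimal Python sketch that summarizes the lengths in a FASTA file of contigs (count, total bases, longest contig, N50). The function names and the demo sequences are made up for illustration, and N50 is only one of several possible summaries; for transcriptomes it should be read with caution, since many short transcripts are expected.

```python
import io


def read_fasta_lengths(handle):
    """Yield the length of each sequence in a FASTA-formatted text stream."""
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


def summarize(lengths):
    """Collect a few simple length statistics into a dict."""
    lengths = list(lengths)
    return {
        "contigs": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Tiny in-memory demo with made-up contigs (not real data).
demo_fasta = io.StringIO(">c1\nACGT\nAC\n>c2\nACG\n>c3\nA\n")
print(summarize(read_fasta_lengths(demo_fasta)))
```

For a real assembly you would pass an open file handle instead of the in-memory demo, then run the same summary on the transcripts of a related organism and compare the two.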

      For MIRA, I suggest asking on the mailing list; Bastien is quite fast in helping out new users there. It might choke on big Illumina sets though, so make sure you have lots of RAM and time for your analysis, or subset your dataset to get a manageable run.
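If you do subset a paired-end dataset, the two mate files have to stay in sync. Here is a minimal Python sketch of one way to do that: keep each read pair with a fixed probability, using a seeded random generator so the subset is reproducible. It assumes plain 4-line FASTQ records (no wrapped sequence lines) and mate files listing reads in the same order; the function names and demo reads are made up for illustration.

```python
import io
import random
from itertools import islice


def fastq_records(handle):
    """Yield FASTQ records as 4-line tuples (header, sequence, plus, quality)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_handle, r2_handle, out1, out2, fraction, seed=0):
    """Write roughly `fraction` of the read pairs to out1/out2; return pairs kept."""
    rng = random.Random(seed)
    kept = 0
    # zip stops at the shorter file, so mates are always written together.
    for rec1, rec2 in zip(fastq_records(r1_handle), fastq_records(r2_handle)):
        if rng.random() < fraction:
            out1.writelines(rec1)
            out2.writelines(rec2)
            kept += 1
    return kept


# Tiny in-memory demo with made-up reads (not real data).
demo_r1 = io.StringIO("".join(f"@read{i}/1\nACGT\n+\nIIII\n" for i in range(6)))
demo_r2 = io.StringIO("".join(f"@read{i}/2\nTGCA\n+\nIIII\n" for i in range(6)))
demo_out1, demo_out2 = io.StringIO(), io.StringIO()
print(subsample_pairs(demo_r1, demo_r2, demo_out1, demo_out2, fraction=0.5, seed=1))
```

With real data you would pass open file handles for the two input and two output FASTQ files. The number of pairs kept varies around the requested fraction rather than hitting it exactly.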

      • #4
        Trinity vs MIRA / de novo assembly

        Hi all,

        I'm keen to see how others are getting on with de novo assemblies, particularly with Trinity. It's interesting to me that their Nat. Biotechnology paper doesn't mention MIRA, and I was wondering if anyone has compared the two programs.

        I'm doing de novo assemblies using 50 bp single-read Illumina data, with a little 454 data thrown in. When Trinity first came out, it crashed pretty quickly. But now that there are different options for the first step, Inchworm (I've been trying Jellyfish), I've been able to assemble 100 million reads on my local machine (24 GB RAM) in less than a day. This has been the case for the Illumina data alone and with the 454 data pooled in. I suspect, however, that the 454 data had little impact on the outcome, because I only have about 200,000 reads!

        So far, Trinity gives me more long contigs and has less redundancy (according to TGICL). But it's always difficult to assess these assemblies. In particular, I can't find out how much of my data is being used by Trinity. Is there a handy report file with this information somewhere? The webpage suggests using bowtie to figure out what has gone into the assembly, but this will throw out anything that aligns ambiguously. Is there an easier way? Does anyone else have experience with Trinity that they can share?

        Also, Trinity is able to assemble all of my data at once, whereas MIRA was crashing when I tried to assemble it all together (even on a cluster with 96 GB RAM). I got around this by partitioning my data for MIRA, so it was working. But doing it all in one assembly is a plus.

        And Liz, you might have found this already, but the example inputs in the MIRA HTML guide are quite useful: http://mira-assembler.sourceforge.ne...ideToMIRA.html

        Thanks!
        -Alice
