  • James
    Member
    • Mar 2010
    • 23

    Starting out with mRNA-seq analysis

    I'm new to this and have a few questions about which programs are good for a beginner, just to get me started. I've had a good look through the list of programs, but I'd like some people's opinions on an average pipeline for Illumina mRNA-seq data.

    Align to my CDS/genome: I have used Bowtie for this; it is easy to use.

    Count alignments in cds and create RPKM values: How can I do this?

    Compare RPKM values from different data sets: I guess you could use scatter graphs to compare, any other good ways?

    Visualization of the reads on the genome: The UCSC browser looks superb, however it doesn't have the genome of the critter I work on. What else is good for this?

    Sorry for the awful newbie questions. Thanks, J
  • Siva
    Member
    • Nov 2008
    • 57

    #2
    Hi
    To count alignments and create RPKM values you could use Cufflinks, which also does transcript assembly. Check out: http://cufflinks.cbcb.umd.edu/

    To get read counts per feature, you could use the htseq-count script (within http://www-huber.embl.de/users/ander.../overview.html). It takes SAM output, compares it to a reference annotation (GFF or GTF), and assigns read counts per feature (features of type exon by default, grouped by their gene_id attribute).
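
    To give a feel for what per-feature counting involves, here is a minimal Python sketch. It is not htseq-count's actual algorithm (which also deals with strandedness, overlap modes, and alignment details from the SAM file). It assumes reads and exons have already been reduced to simple 0-based, half-open intervals, and the function name and tuple layouts are made up for illustration.

```python
from collections import defaultdict


def count_reads_per_gene(reads, exons):
    """Count reads that overlap exons of exactly one gene.

    reads: iterable of (chrom, start, end), 0-based half-open (assumed layout).
    exons: iterable of (gene_id, chrom, start, end), 0-based half-open.
    Reads overlapping no gene, or more than one gene, are left uncounted.
    """
    exons_by_chrom = defaultdict(list)
    for gene_id, chrom, start, end in exons:
        exons_by_chrom[chrom].append((start, end, gene_id))

    counts = defaultdict(int)
    for chrom, r_start, r_end in reads:
        # Collect every gene with at least one exon overlapping this read.
        hit_genes = {
            gene_id
            for e_start, e_end, gene_id in exons_by_chrom.get(chrom, [])
            if r_start < e_end and e_start < r_end
        }
        if len(hit_genes) == 1:
            counts[hit_genes.pop()] += 1
    return dict(counts)


exons = [
    ("geneA", "chr1", 100, 200),
    ("geneA", "chr1", 300, 400),
    ("geneB", "chr1", 380, 500),
]
reads = [
    ("chr1", 150, 186),  # inside geneA only
    ("chr1", 310, 346),  # inside geneA only
    ("chr1", 370, 406),  # overlaps geneA and geneB: ambiguous, skipped
    ("chr2", 10, 46),    # no annotated feature
]
print(count_reads_per_gene(reads, exons))
```

    The linear scan over exons is fine for a toy example; for a real annotation you would want an interval index, or simply the existing tools.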

    If your lab is willing to pay, you could use a proprietary aligner/assembler like http://www.softgenetics.com/NextGENe.html

    This forum will have many great suggestions, so stay tuned.

    Siva


    • shurjo
      Senior Member
      • Jan 2009
      • 132

      #3
      Hi James,

      My two cents worth:

      I would use TopHat rather than Bowtie for the alignment, so as to include reads mapping to splice junctions.

      Cufflinks and ERANGE both calculate RPKMs (or FPKMs in the case of Cufflinks). For getting raw counts of reads mapping to transcripts, in addition to htseq-count, BEDTools has a utility called coverageBed, which I use.
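
      For anyone stumbling on the definition: RPKM is reads per kilobase of feature per million mapped reads, i.e. 10^9 * C / (N * L) for C reads on a feature of length L bp out of N mapped reads in total (FPKM is the same idea with fragments instead of reads). A minimal Python sketch, assuming you already have raw counts and feature lengths; the function name and example numbers are made up for illustration and this is not how Cufflinks or ERANGE estimate abundances internally:

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical raw counts and feature lengths for two genes.
counts = {"geneA": 500, "geneB": 50}
lengths_bp = {"geneA": 2000, "geneB": 1000}
total_mapped = 10_000_000

for gene in sorted(counts):
    print(gene, rpkm(counts[gene], lengths_bp[gene], total_mapped))
```

      With these numbers geneA has ten times the reads of geneB but only five times the RPKM, because it is twice as long.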

      Comparing RPKMs may be as simple as a scatter plot, or if you need to do statistical tests, there are multiple R/Bioconductor packages designed for this purpose (DEGseq/DESeq/edgeR/baySeq).
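
      As a quick, purely descriptive first look before any of those packages, you could compute per-gene log2 fold changes and the correlation of log-transformed values between two samples. The sketch below assumes two dictionaries of RPKM values keyed by gene; the function names and the pseudocount of 1 are arbitrary choices for illustration, and none of this is a statistical test.

```python
import math


def log2_fold_changes(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2((b + pseudocount) / (a + pseudocount)) for genes present in both samples."""
    shared = sorted(set(rpkm_a) & set(rpkm_b))
    return {
        gene: math.log2((rpkm_b[gene] + pseudocount) / (rpkm_a[gene] + pseudocount))
        for gene in shared
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical RPKM values for the same genes in two samples.
sample_a = {"g1": 3.0, "g2": 15.0, "g3": 0.0}
sample_b = {"g1": 7.0, "g2": 15.0, "g3": 1.0}

print(log2_fold_changes(sample_a, sample_b))

genes = sorted(set(sample_a) & set(sample_b))
log_a = [math.log2(sample_a[g] + 1.0) for g in genes]
log_b = [math.log2(sample_b[g] + 1.0) for g in genes]
print(pearson(log_a, log_b))
```

      The pseudocount keeps genes with zero RPKM from producing a division by zero or log of zero; it also shrinks fold changes for lowly expressed genes, which is usually what you want for a scatter plot.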

      Best of luck,

      Shurjo


      • James
        Member
        • Mar 2010
        • 23

        #4
        Thanks guys. RPKM/FPKM is where I'm stumbling currently; I will try some of those out.

        Thanks, J


        • Siva
          Member
          • Nov 2008
          • 57

          #5
          Originally posted by shurjo:
          "For getting raw counts of reads mapping to transcripts, in addition to htseq-count, BEDTools has a utility called coverageBed which I use."

          Hi Shurjo
          In BEDTools, can I run coverageBed with a SAM file and a GFF file as inputs, or should I convert the SAM file to a BED file first? What kind of output does it produce?

          thanks
          Siva


          • shurjo
            Senior Member
            • Jan 2009
            • 132

            #6
            Hi Siva,

            You need to convert both your reads and the gene annotation file to BED format (the latter can be easily downloaded from UCSC). BEDTools has a utility called bamToBed that will convert BAM to BED, and if you have samtools you can easily convert from SAM to BAM. The output format is explained in section 5.9 of the BEDTools manual.
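
            To show what the conversion amounts to, here is a minimal Python sketch that turns one SAM alignment line into a six-column BED line. It is only an illustration under simple assumptions (single-end reads, whole alignment span reported as one interval, MAPQ used as the BED score) and not BEDTools' own implementation; the function name is made up. The key points are that SAM positions are 1-based while BED starts are 0-based, and that the end coordinate comes from the CIGAR operations that consume reference bases.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING_OPS = set("MDN=X")  # CIGAR operations that advance along the reference


def sam_line_to_bed(line):
    """Convert one SAM alignment line to a BED6 line, or None if the read is unmapped."""
    fields = line.rstrip("\n").split("\t")
    qname, flag, rname = fields[0], int(fields[1]), fields[2]
    pos, mapq, cigar = int(fields[3]), fields[4], fields[5]
    if flag & 0x4 or rname == "*" or cigar == "*":
        return None
    ref_len = sum(
        int(length) for length, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING_OPS
    )
    start = pos - 1  # SAM is 1-based, BED is 0-based half-open
    strand = "-" if flag & 0x10 else "+"
    return "\t".join([rname, str(start), str(start + ref_len), qname, mapq, strand])


# Hypothetical spliced alignment on the reverse strand (note the N in the CIGAR).
example = "\t".join(
    ["read1", "16", "chr1", "101", "50", "20M500N16M", "*", "0", "0", "A" * 36, "I" * 36]
)
print(sam_line_to_bed(example))
```

            Note that a spliced read (the N operation, as produced by a spliced aligner such as TopHat) is reported here as one interval covering the intron as well, so for counting against exons you may prefer to split such reads into blocks.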

            Regards,

            Shurjo


            • ScottC
              Senior Member
              • Jan 2008
              • 244

              #7
              You might also find this review useful:

              Computation for ChIP-seq and RNA-seq studies
              Shirley Pepke et al.



              • alaincoletta
                Junior Member
                • Oct 2012
                • 3

                #8
                At InSilico DB (https://insilicodb.org), we use the tophat-cufflinks-cummeRbund pipeline (www.nature.com/protocolexchange/protocols/2327)

                There is an example of some of the results here: https://insilicodb.org/differential-...ng-cummerbund/


                Align to my cds/genome: I have used bowtie to do this easy to use.
                We use TopHat with Bowtie2.

                Count alignments in cds and create RPKM values: How can I do this?
                We use Cufflinks.

                Compare RPKM values from different data sets: I guess you could use scatter graphs to compare, any other good ways?
                We use Cuffdiff for differential gene expression.

                Visualization of the reads on the genome: The UCSC browser looks superb, however it doesn't have the genome of the critter I work on. What else is good for this?
                You can send your BAM files to GenomeSpace and visualize them with IGV (http://genomespace.org).


                Hope this helps

