
  • Starting out with mRNA-seq analysis

    I'm new to this and have a few questions about which programs are good for a beginner, just to get me started. I've had a good look through the list of programs, but I'd like some people's opinions on an average pipeline for Illumina mRNA-seq data.

    Align to my CDS/genome: I have used Bowtie for this; it is easy to use.

    Count alignments in cds and create RPKM values: How can I do this?

    Compare RPKM values from different data sets: I guess you could use scatter graphs to compare, any other good ways?

    Visualization of the reads on the genome: The UCSC browser looks superb, however it doesn't have the genome of the critter I work on. What else is good for this?

    Sorry for the awful newbie questions. Thanks, J

  • #2
    Hi
    To count alignments and create RPKM values you could use Cufflinks, which also does transcript assembly. Check out: http://cufflinks.cbcb.umd.edu/

    To get read counts per feature, you could use the htseq-count script (within http://www-huber.embl.de/users/ander.../overview.html). It takes SAM output, compares it to a reference annotation (GFF or GTF), and assigns read counts per feature (exon by default).

    If your lab is willing to pay, you could use a proprietary aligner/assembler like http://www.softgenetics.com/NextGENe.html

    This forum will have many great suggestions, so stay tuned.

    Siva



    • #3
      Hi James,

      My two cents worth:

      I would use TopHat rather than Bowtie for the alignment, so as to include reads mapping to splice junctions.

      Cufflinks and ERANGE both calculate RPKMs (or FPKMs in the case of Cufflinks). For getting raw counts of reads mapping to transcripts, in addition to htseq-count, BEDTools has a utility called coverageBed, which I use.
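For anyone stumbling on what an RPKM value actually is, here is a minimal sketch of the standard definition: reads per kilobase of feature per million mapped reads. It only illustrates the arithmetic and is not how Cufflinks or ERANGE compute their values internally. The gene names, counts, lengths, and library size below are made up for illustration.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads Per Kilobase of feature per Million mapped reads.

    RPKM = read_count * 1e9 / (feature_length_bp * total_mapped_reads)
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical input: gene -> (raw read count, feature length in bp).
raw_counts = {"geneA": (500, 2000), "geneB": (50, 500)}
total_mapped = 10_000_000  # hypothetical number of mapped reads in the library

rpkm_values = {
    gene: rpkm(count, length, total_mapped)
    for gene, (count, length) in raw_counts.items()
}

for gene, value in sorted(rpkm_values.items()):
    print(f"{gene}\t{value:.2f}")
```

The raw counts fed into a calculation like this would come from a counting step such as htseq-count or coverageBed.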

      Comparing RPKMs may be as simple as a scatter plot, or if you need to do statistical tests, there are multiple R/Bioconductor packages designed for this purpose (DEGseq/DESeq/edgeR/baySeq).
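As a rough sketch of the "simple" end of that comparison, the snippet below computes a per-gene log2 fold change and a Pearson correlation between two samples, which is roughly what a scatter plot shows by eye. It is exploratory only and is no substitute for the statistical packages named above. The pseudocount of 1.0 is an assumption used to avoid dividing by zero, and the sample values are made up.

```python
import math


def log2_fold_change(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2 ratio of sample B over sample A, with a pseudocount to handle zeros."""
    return math.log2((rpkm_b + pseudocount) / (rpkm_a + pseudocount))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical RPKM values for the same genes in two data sets.
sample_a = {"geneA": 3.0, "geneB": 10.0, "geneC": 0.0}
sample_b = {"geneA": 7.0, "geneB": 12.0, "geneC": 1.0}

genes = sorted(sample_a)
for gene in genes:
    lfc = log2_fold_change(sample_a[gene], sample_b[gene])
    print(f"{gene}\tlog2FC={lfc:.2f}")

r = pearson([sample_a[g] for g in genes], [sample_b[g] for g in genes])
print(f"Pearson r = {r:.3f}")
```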

      Best of luck,

      Shurjo



      • #4
        Thanks guys. RPKM/FPKM is where I'm stumbling currently. Will try some of those out.

        Thanks, J



        • #5
          Originally posted by shurjo View Post
          For getting raw counts of reads mapping to transcripts, in addition to htseq-count, BEDTools has a utility called coverageBed, which I use.

          Shurjo
          Hi Shurjo
          In BEDTools, can I run coverageBed with a SAM file and a GFF file as inputs, or should I convert the SAM file to a BED file? What kind of output does this produce?

          thanks
          Siva



          • #6
            Hi Siva,

            You need to convert both your reads and the gene annotation file to BED format (the latter can be easily downloaded from UCSC). BEDTools has a utility called bamToBed that converts BAM to BED, and if you have samtools you can easily convert from SAM to BAM. The output format is explained in section 5.9 of the BEDTools manual.
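For intuition about what per-feature counting on BED intervals involves, here is a minimal sketch. It is not how coverageBed is implemented and does not reproduce its output columns; it only counts reads that overlap each feature. It assumes BED's 0-based, half-open coordinates, and the feature names and positions are made up.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two 0-based, half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def count_reads_per_feature(features, reads):
    """Count reads overlapping each feature.

    features: list of (chrom, start, end, name) tuples
    reads:    list of (chrom, start, end) tuples
    """
    counts = {name: 0 for (_, _, _, name) in features}
    for f_chrom, f_start, f_end, name in features:
        for r_chrom, r_start, r_end in reads:
            if f_chrom == r_chrom and intervals_overlap(
                f_start, f_end, r_start, r_end
            ):
                counts[name] += 1
    return counts


# Hypothetical annotation and aligned reads in BED-style coordinates.
features = [("chr1", 100, 200, "geneA"), ("chr1", 300, 400, "geneB")]
reads = [
    ("chr1", 90, 126),   # overlaps the start of geneA
    ("chr1", 150, 186),  # inside geneA
    ("chr1", 199, 235),  # overlaps the last base of geneA
    ("chr1", 200, 236),  # starts just past geneA (half-open end), no overlap
    ("chr2", 150, 186),  # different chromosome
]

feature_counts = count_reads_per_feature(features, reads)
for name, n in sorted(feature_counts.items()):
    print(f"{name}\t{n}")
```

The nested loop is quadratic and only suitable for toy input; real tools use sorted input or interval indexes to stay fast on millions of reads.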

            Regards,

            Shurjo



            • #7
              You might also find this review useful:

              Computation for ChIP-seq and RNA-seq studies
              Shirley Pepke et al.



              • #8
                At InSilico DB (https://insilicodb.org), we use the tophat-cufflinks-cummeRbund pipeline (www.nature.com/protocolexchange/protocols/2327)

                There is an example of some of the results here: https://insilicodb.org/differential-...ng-cummerbund/


                Align to my CDS/genome: I have used Bowtie for this; it is easy to use.
                We use TopHat with Bowtie2



                Count alignments in cds and create RPKM values: How can I do this?
                We use Cufflinks

                Compare RPKM values from different data sets: I guess you could use scatter graphs to compare, any other good ways?
                We use Cuffdiff for differential gene expression

                Visualization of the reads on the genome: The UCSC browser looks superb, however it doesn't have the genome of the critter I work on. What else is good for this?
                You can send your BAM file to GenomeSpace and visualize it with IGV (http://genomespace.org)


                Hope this helps
