  • Starting out with mRNA-seq analysis

    I'm new to this, so I have a few questions about which programs are good for a beginner, just to get me started. I've had a good look through the list of programs, but I'd like some people's opinions on an average pipeline for Illumina mRNA-seq data.

    Align to my CDS/genome: I have used Bowtie for this; it is easy to use.

    Count alignments in CDS and create RPKM values: How can I do this?

    Compare RPKM values from different data sets: I guess you could use scatter graphs to compare. Are there any other good ways?

    Visualization of the reads on the genome: The UCSC browser looks superb, however it doesn't have the genome of the critter I work on. What else is good for this?

    Sorry for the awful newbie questions. Thanks, J

  • #2
    Hi
    To count alignments and calculate RPKM you could use Cufflinks, which also does transcript assembly. Check out: http://cufflinks.cbcb.umd.edu/
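For anyone stumbling on what RPKM actually is, the quantity itself is a small calculation: reads per kilobase of feature per million mapped reads. Below is a minimal sketch of that formula, not how Cufflinks computes FPKM. The function names and the example numbers are made up for illustration, and it assumes the total number of mapped reads can be approximated by the sum of the per-feature counts.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


def rpkm_table(counts, lengths):
    """counts and lengths are dicts keyed by feature ID (hypothetical inputs).

    Assumption: total mapped reads is approximated by the sum of the counts.
    """
    total = sum(counts.values())
    return {fid: rpkm(n, lengths[fid], total) for fid, n in counts.items()}


# Made-up example values, not real data.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
table = rpkm_table(counts, lengths)
```

In this toy example geneB has three times the reads of geneA but is also three times as long, so both end up with the same RPKM.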

    To get read counts per feature, you could use the htseq-count script (within http://www-huber.embl.de/users/ander.../overview.html). It takes SAM output, compares it to a reference annotation (GFF or GTF), and assigns read counts per feature (exon by default).
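If you later want to turn per-gene counts into RPKM, you also need a length for each gene. The sketch below shows one rough way to get it from a GTF annotation by summing exon lengths per `gene_id`. It is an illustration only, not what htseq-count does internally: the function name and the example lines are hypothetical, and overlapping exons from different transcripts are not merged, so genes with several isoforms would be over-counted.

```python
import re

GENE_ID = re.compile(r'gene_id "([^"]+)"')


def exon_lengths_by_gene(gtf_lines):
    """Sum exon lengths per gene_id (overlapping exons are NOT merged)."""
    lengths = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue  # only exon features contribute
        match = GENE_ID.search(cols[8])
        if match is None:
            continue
        start, end = int(cols[3]), int(cols[4])  # GTF is 1-based, end inclusive
        gene = match.group(1)
        lengths[gene] = lengths.get(gene, 0) + (end - start + 1)
    return lengths


# Made-up annotation lines for illustration.
gtf = [
    'chr1\tsrc\texon\t101\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t301\t350\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\tCDS\t101\t200\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'chr1\tsrc\texon\t1001\t1500\t.\t-\t.\tgene_id "g2"; transcript_id "t2";',
]
gene_lengths = exon_lengths_by_gene(gtf)
```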

    If your lab is willing to pay you could use a proprietary aligner/assembler like http://www.softgenetics.com/NextGENe.html

    This forum will have many more great suggestions, so stay tuned.

    Siva

    • #3
      Hi James,

      My two cents worth:

      I would use TopHat rather than Bowtie for the alignment so as to include reads mapping to splice junctions.

      Cufflinks and ERANGE both calculate RPKMs (or FPKMs in the case of Cufflinks). For getting raw counts of reads mapping to transcripts, in addition to htseq-count, BEDTools has a utility called coverageBed, which I use.
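To make "raw counts of reads mapping to transcripts" concrete, here is a conceptual sketch of counting reads that overlap each feature. It is not BEDTools' implementation (a real tool uses indexed or sorted data rather than a nested loop), and the function name and intervals are made up. Coordinates are assumed to be BED-style: 0-based start, exclusive end.

```python
def count_overlaps(features, reads):
    """Count reads overlapping each feature.

    features: list of (chrom, start, end, name), 0-based half-open intervals.
    reads:    list of (chrom, start, end), same convention.
    """
    counts = {name: 0 for _, _, _, name in features}
    for f_chrom, f_start, f_end, name in features:
        for r_chrom, r_start, r_end in reads:
            # Half-open intervals overlap when each starts before the other ends.
            if r_chrom == f_chrom and r_start < f_end and r_end > f_start:
                counts[name] += 1
    return counts


# Made-up intervals for illustration.
features = [("chr1", 100, 200, "exon1"), ("chr1", 300, 400, "exon2")]
reads = [
    ("chr1", 90, 126),   # overlaps exon1 at its left edge
    ("chr1", 150, 186),  # inside exon1
    ("chr1", 199, 235),  # overlaps exon1 by one base
    ("chr1", 200, 236),  # starts exactly at exon1's exclusive end: no overlap
    ("chr2", 150, 186),  # different chromosome
]
overlap_counts = count_overlaps(features, reads)
```

The fourth read shows why the half-open convention matters: a read starting at position 200 does not touch a feature that ends at 200.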

      Comparing RPKMs may be as simple as a scatter plot, or if you need to do statistical tests, there are multiple R/Bioconductor packages designed for this purpose (DEGseq/DESeq/edgeR/baySeq).
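As a quick descriptive comparison alongside a scatter plot, you can compute a log2 fold change per gene between two samples. This is a minimal sketch and not a statistical test, so it does not replace the packages named above. The function name, the pseudocount of 1.0 (added so that genes with zero RPKM do not cause a division by zero), and the example values are all assumptions for illustration.

```python
import math


def log2_fold_changes(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2((b + pseudocount) / (a + pseudocount)) for genes present in both samples."""
    shared = sorted(set(rpkm_a) & set(rpkm_b))
    return {
        gene: math.log2((rpkm_b[gene] + pseudocount) / (rpkm_a[gene] + pseudocount))
        for gene in shared
    }


# Made-up RPKM values for illustration.
sample_a = {"geneA": 7.0, "geneB": 31.0, "geneC": 0.0}
sample_b = {"geneA": 15.0, "geneB": 31.0, "geneC": 3.0}
lfc = log2_fold_changes(sample_a, sample_b)
```

Positive values mean higher expression in the second sample, and zero means no change. Genes near zero in both samples tend to give noisy ratios, which is one reason to move on to a proper test when it matters.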

      Best of luck,

      Shurjo

      • #4
        Thanks guys. RPKM/FPKM is where I'm stumbling currently. I will try some of those out.

        Thanks, J

        • #5
          Originally posted by shurjo
          For getting raw counts of reads mapping to transcripts, in addition to htseq-count, BEDTools has a utility called coverageBed, which I use.

          Shurjo
          Hi Shurjo
          In BEDTools, can I run coverageBed with a SAM file and a GFF file as inputs, or should I convert the SAM file to a BED file first? What kind of output does this produce?

          Thanks
          Siva

          • #6
            Hi Siva,

            You need to convert both your reads and the gene annotation file to BED format (the latter can be easily downloaded from UCSC). BEDTools has a utility called bamToBed that will convert BAM to BED, and if you have samtools you can easily convert from SAM to BAM. The output format is explained in section 5.9 of the BEDTools manual.
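To show what the conversion involves, here is a rough sketch of turning one SAM record into a six-column BED line. It is meant to illustrate the coordinate change (SAM positions are 1-based, BED starts are 0-based with an exclusive end), not to replace bamToBed. The function name and the example record are made up, and a spliced read is reported as a single interval spanning its introns.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def sam_line_to_bed(line):
    """Return a 6-column BED line for an aligned SAM record, or None if unmapped."""
    fields = line.rstrip("\n").split("\t")
    name, flag, chrom = fields[0], int(fields[1]), fields[2]
    pos, mapq, cigar = int(fields[3]), fields[4], fields[5]
    if flag & 0x4 or cigar == "*":
        return None  # unmapped read or no alignment information
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    start = pos - 1        # SAM is 1-based, BED is 0-based
    end = start + ref_len  # BED end is exclusive
    strand = "-" if flag & 0x10 else "+"
    return "\t".join([chrom, str(start), str(end), name, mapq, strand])


# Made-up record: reverse-strand read at chr1:101 with a 2 bp deletion and a soft clip.
example = (
    "read1\t16\tchr1\t101\t30\t10M2D5M3S\t*\t0\t0\t"
    "ACGTACGTACGTACGTAC\tIIIIIIIIIIIIIIIIII"
)
bed_line = sam_line_to_bed(example)
```

The soft-clipped bases do not count toward the interval, while the deletion does, because only operations that consume the reference extend the end coordinate.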

            Regards,

            Shurjo

            • #7
              You might also find this review useful:

              Computation for ChIP-seq and RNA-seq studies
              Shirley Pepke et al.

              • #8
                At InSilico DB (https://insilicodb.org), we use the TopHat-Cufflinks-cummeRbund pipeline (www.nature.com/protocolexchange/protocols/2327).

                There is an example of some of the results here: https://insilicodb.org/differential-...ng-cummerbund/


                Align to my cds/genome: I have used bowtie to do this easy to use.
                We use TopHat with Bowtie2.



                Count alignments in cds and create RPKM values: How can I do this?
                We use Cufflinks

                Compare RPKM values from different data sets: I guess you could use scatter graphs to compare, any other good ways?
                We use Cuffdiff for differential gene expression.

                Visualization of the reads on the genome: The UCSC browser looks superb, however it doesn't have the genome of the critter I work on. What else is good for this?
                You can send your BAM file to GenomeSpace and visualize it with IGV (http://genomespace.org).


                Hope this helps
