
  • RNA-Seq quality controls: golden standard tool?

    Hi,
    I need to establish a protocol for checking the quality of RNA-Seq data. A few pipelines for this purpose have been published over the last few years (ShortRead, htSeqTools, ArrayExpressHTS), and I'm wondering which of these are commonly used and could be considered gold standards. The questions I ask in my QC are: what is the degradation level of the RNA, what is the quality of the sequences coming off the Illumina platform (HiSeq at the moment), and how does a given sample differ from the rest (is it a weird outlier)?
    I have other datasets that I would like to evaluate as well, and they were produced on the GAII platform with different read lengths. So ideally, I am looking for a QC tool that:
    1) is flexible enough to handle different read lengths
    2) provides rigorous QC with nice graphs
    3) can use the output from the TopHat pipeline.

    I would very much appreciate any help, suggestions, or advice.
    Thanks!!!
    Anna

  • #2
    I might not be a great reference but I don't think there IS a standard at this point.

    FastQC is a nice tool for getting a set of quality assessment tests and graphs all at once for the raw reads (in FASTQ format): http://www.bioinformatics.babraham.a...ojects/fastqc/. You might use the FastQC output to help decide whether you want to trim bases off the 5' or 3' ends of your reads. Some aligners, like BWA, can do that for you, and most aligners provide an option to specify some type of threshold for the base qualities that are accepted for alignment. So tools like FastQC are just there for you to check on the quality of your run; they aren't directly used to control what you run through the aligners.
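To make the trimming decision concrete, here is a minimal sketch of the kind of per-position quality summary that FastQC plots. It is not FastQC's own code. It assumes four-line FASTQ records and Phred+33 quality encoding, and the function names and the cutoff of 20 are hypothetical choices for illustration.

```python
import io


def mean_quality_per_position(fastq_handle, offset=33):
    """Return the mean Phred quality at each read position.

    Assumes four-line FASTQ records and Phred+33 encoding (both assumptions).
    """
    totals = []  # sum of quality scores seen at each position
    counts = []  # number of reads covering each position
    for line_number, line in enumerate(fastq_handle):
        if line_number % 4 != 3:  # the quality string is the 4th line of each record
            continue
        quality = line.rstrip("\n")
        for position, char in enumerate(quality):
            if position == len(totals):  # first read long enough to reach this position
                totals.append(0)
                counts.append(0)
            totals[position] += ord(char) - offset
            counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]


def suggest_trim_length(means, min_quality=20):
    """Return the number of leading positions before mean quality first drops below min_quality."""
    for position, mean in enumerate(means):
        if mean < min_quality:
            return position
    return len(means)


# Two made-up reads: 'I' is Phred 40 and '#' is Phred 2 under Phred+33.
example = io.StringIO(
    "@read1\nACGTAC\n+\nIIII##\n"
    "@read2\nACGTAC\n+\nIIII#I\n"
)
means = mean_quality_per_position(example)
print(means)                       # [40.0, 40.0, 40.0, 40.0, 2.0, 21.0]
print(suggest_trim_length(means))  # 4
```

Because reads of different lengths simply contribute to fewer positions, the same function works on GAII and HiSeq runs without changes.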

    As for determining how "any given sample differs from the rest", this question could be pretty complex to answer. You can look at SNPs, differential gene expression, or splice variant differences (from a novel transcript assembler like Cufflinks). You can use the "tuxedo" pipeline to assess differential expression and splicing variation between samples. For SNPs I like to use samtools mpileup followed by bcftools for variant calling. After that I use bedtools to compare the VCF outputs from bcftools and determine which SNPs are unique to which samples, which are shared, etc.
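The shared-versus-unique comparison can also be sketched with plain set operations. This is an illustration of the idea, not the bedtools workflow described above. It assumes tab-separated VCF lines, keys each variant on (CHROM, POS, REF, ALT), and treats a multi-allelic ALT field as a single string. The function names and the example records are made up.

```python
def read_variants(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF text lines, skipping headers."""
    variants = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        variants.add((chrom, int(pos), ref, alt))
    return variants


def compare_samples(variants_by_sample):
    """Return variants shared by every sample and variants unique to each sample."""
    shared = set.intersection(*variants_by_sample.values())
    unique = {}
    for name, variants in variants_by_sample.items():
        # Union of the variants seen in all the other samples.
        others = set().union(
            *(v for other, v in variants_by_sample.items() if other != name)
        )
        unique[name] = variants - others
    return shared, unique


# Made-up records standing in for two bcftools outputs.
sample_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\t.",
    "chr1\t200\t.\tC\tT\t60\t.\t.",
]
sample_b = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t45\t.\t.",
    "chr2\t300\t.\tG\tA\t70\t.\t.",
]

shared, unique = compare_samples(
    {"a": read_variants(sample_a), "b": read_variants(sample_b)}
)
print(shared)  # {('chr1', 100, 'A', 'G')}
print(unique)  # {'a': {('chr1', 200, 'C', 'T')}, 'b': {('chr2', 300, 'G', 'A')}}
```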

    I've had good results from clustering samples in R, using its hierarchical clustering function on gene expression output across multiple samples from multiple lines. However, determining why any sample clusters separately from the others (or, more specifically, producing a list of the genes responsible for the clustering) has been neither straightforward nor "established", from what I can tell.
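As a simpler stand-in for hierarchical clustering (not the R approach described above), an outlier check can be sketched by correlating each sample's expression profile with every other sample's and looking for the one that agrees least with the rest. The log2(x + 1) transform, the function names, and the expression values are all assumptions for illustration. The sketch also assumes no sample has constant expression across genes, since that would make the correlation undefined.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_correlation_to_others(expression):
    """Map each sample name to its mean correlation with all other samples.

    `expression` maps sample name -> expression values in the same gene order.
    Values are log2(x + 1) transformed first (an assumed, common choice).
    """
    logged = {
        name: [math.log2(value + 1) for value in values]
        for name, values in expression.items()
    }
    scores = {}
    for name, profile in logged.items():
        correlations = [
            pearson(profile, other)
            for other_name, other in logged.items()
            if other_name != name
        ]
        scores[name] = sum(correlations) / len(correlations)
    return scores


# Made-up expression values for five genes in four samples.
expression = {
    "s1": [10, 200, 35, 0, 800],
    "s2": [12, 180, 40, 1, 750],
    "s3": [9, 220, 30, 0, 820],
    "s4": [500, 5, 300, 90, 20],
}
scores = mean_correlation_to_others(expression)
suspect = min(scores, key=scores.get)
print(suspect)  # s4
```

A low score only says that a sample disagrees with the others overall. Like clustering, it does not say which genes are responsible.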
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */

    • #3
      There is a program called RNASeqQC which is more useful than FastQC for this purpose.
