  • runnerBio88
    Member
    • Oct 2015
    • 10

    RNA-seq for Variant calling

    Hi

    I've read a number of papers on this topic, such as this one:

    Next-generation RNA sequencing (RNA-seq) maps and analyzes transcriptomes and generates data on sequence variation in expressed genes. There are few reported studies on analysis strategies to maximize the yield of quality RNA-seq SNP data. We evaluated the performance of different SNP-calling methods following alignment to both genome and transcriptome by applying them to RNA-seq data from a HapMap lymphoblastoid cell line sample and comparing results with sequence variation data from 1000 Genomes. We determined that the best method to achieve high specificity and sensitivity, and greatest number of SNP calls, is to remove duplicate sequence reads after alignment to the genome and to call SNPs using SAMtools. The accuracy of SNP calls is dependent on sequence coverage available. In terms of specificity, 89% of RNA-seq SNPs calls were true variants where coverage is >10X. In terms of sensitivity, at >10X coverage 92% of all expected SNPs in expressed exons could be detected. Overall, the results indicate that RNA-seq SNP data are a very useful by-product of sequence-based transcriptome analysis. If RNA-seq is applied to disease tissue samples and assuming that genes carrying mutations relevant to disease biology are being expressed, a very high proportion of these mutations can be detected.


    I'm surprised by the results. For example, when compared against coding sites from WES, RNA-seq finds only about 33% of the SNPs. When compared against WGS the percentage rises to 45%, which still seems quite low to me.

    The specificity of these methods is not bad, but the sensitivity is not so good (it depends on coverage).

    As a newbie, I'm surprised by what look to me like poor results. I would expect the percentage of SNPs found to be higher, at least for expressed regions. I'm also surprised that most papers settle for coverage of 10x or even 3x when calling a variant, when coverage shouldn't be a big deal with RNA-seq.

    I understand that some aligners can work with splice junctions, so this should not be a problem for variant calling with RNA-seq data.

    Can anyone give me some clues or more info about this? I'm just curious.

    Is there any paper where the same samples are compared using WGS, WES and RNA-seq?
    Thanks
    Last edited by runnerBio88; 12-29-2015, 07:37 AM.
  • vivek_
    PhD Student
    • Jul 2012
    • 164

    #2
    Have you looked into things like allele-specific expression, which could affect the number of common SNPs found between DNA-seq and RNA-seq? I'm not sure that alone could account for the entire divergence, but it's worth investigating.

    • HESmith
      Senior Member
      • Oct 2009
      • 512

      #3
      RNA-Seq is less sensitive for variant detection because transcript levels vary widely, whereas WGS and (for the most part) WES produce relatively even coverage. Some fraction of genes will not be expressed in your sample (so their variants are undetectable), while others are expressed at such low levels that the read depth is insufficient. For example, obtaining 10X coverage of a 1 kbp transcript present at one-in-a-million copies would require 200M 50 bp reads. Most RNA-Seq datasets are an order of magnitude smaller.
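
      The arithmetic behind that example can be checked with a minimal sketch. It assumes that reads are sampled in proportion to transcript copy number (ignoring length bias and mapping losses); the function name `reads_needed` and its parameters are illustrative, not taken from any tool.

```python
def reads_needed(transcript_len_bp, target_depth, read_len_bp, one_in_n_transcripts):
    """Total reads required so that a transcript reaches target_depth on average.

    Assumption: a transcript present at 1-in-N copies receives 1/N of all reads.
    """
    # Bases that must land on the transcript to reach the target depth.
    bases_on_transcript = target_depth * transcript_len_bp
    # Scale up by N because only 1 read in N comes from this transcript,
    # then divide by read length (integer ceiling division).
    numerator = bases_on_transcript * one_in_n_transcripts
    return -(-numerator // read_len_bp)


# 10X over a 1 kbp transcript, 50 bp reads, one-in-a-million abundance.
total = reads_needed(1_000, 10, 50, 1_000_000)
print(f"{total:,} reads")
```

      With these inputs, 200 reads must fall on the transcript itself, which scales to 200,000,000 reads in total.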

      Note that there are methods for normalizing RNA-Seq libraries, but the vast majority of experiments are designed to detect differences in transcript levels (which normalization would erase).
      Last edited by HESmith; 12-30-2015, 08:09 AM.

      • InDel
        Junior Member
        • Mar 2016
        • 7

        #4
        Generally speaking, WES (whole-exome sequencing) is preferred for identifying mutations, and it is even better if you have matched patient samples and can integrate whole-exome sequencing with RNA-seq to increase mutation detection performance. Calling mutations from RNA-Seq alone is not a typical approach, mainly due to the intrinsic complexity of the transcriptome (e.g., splicing, high/low gene expression levels).
