  • Ohad
    Member
    • Jul 2013
    • 28

    exon expression level

    Hi there.

    I would like to get the expression level of each exon in a list from my RNA-seq data.
    I thought about using cufflinks, but the problem is that cufflinks generates FPKM values at the isoform level. I usually feed cufflinks a RefSeq GTF file downloaded from the UCSC Table Browser (under Genes and Gene Predictions), and cufflinks uses that to build loci.
    I thought of going back from loci to exons (using the RefSeq ID, NM/R_somenumber), but that is the wrong way to go because:

    1) many exons are part of more than one transcript, and each transcript has a different FPKM value
    2) FPKM values are presumably calculated taking fragment length into account, and each transcript has its own, so I cannot just sum the FPKM values of all transcripts that contain a particular exon

    If I lived in a world in which every gene had only one transcript, I guess my life would be easier and I would be blonde and handsome, but reality is crueler than that.

    All the tables I can download from UCSC include variants, so I got my hands on the Illumina TruSeq exome file, which lists all exons per gene in a definitive manner, meaning each exon appears once, even if it belongs to several variants. But it seems this file is no good for cufflinks, which was not able to build the loci correctly.

    I thought of just using samtools for each exon:
    samtools view accepted_hits.bam chr1:exonstart-exonend | wc -l
    and then taking that number and calculating my own FPKM using the exon's length (and the total number of reads).

    Do you think using samtools is fine?
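For reference, the per-exon calculation described above can be sketched as follows. This is only a minimal illustration, assuming the usual definition (counts per kilobase of exon per million mapped reads); the function name `fpkm` and the numbers are hypothetical, and with single-end reads the value is strictly speaking an RPKM.

```python
def fpkm(count, length_bp, total_mapped):
    """Counts per kilobase of exon per million mapped reads (or fragments).

    count        -- reads (or fragments) assigned to the exon
    length_bp    -- exon length in base pairs
    total_mapped -- total mapped reads (or fragments) in the library
    """
    if length_bp <= 0 or total_mapped <= 0:
        raise ValueError("length_bp and total_mapped must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million)
    return count * 1e9 / (length_bp * total_mapped)


if __name__ == "__main__":
    # Hypothetical numbers: 500 reads on a 1,000 bp exon, 20 million mapped reads.
    print(fpkm(500, 1000, 20_000_000))
```

The count would come from the samtools command, the length from the exon coordinates, and the total from the whole BAM file.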
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    Just use DEXSeq and its Python script for counting exonic bins. This is a solved problem. Yes, you'll get raw counts, but you can then do whatever you want with those.


    • Ohad
      Member
      • Jul 2013
      • 28

      #3
      Do I need to install the whole package, or can I just use this one script?


      • dpryan
        Devon Ryan
        • Jul 2011
        • 3478

        #4
        As I recall, the Python scripts are just wrappers around htseq-count, so you can likely use them without the rest of the R package. Having said that, if you're interested in using the data to look at differential exon usage, then DEXSeq would be the tool to use anyway.


        • Ohad
          Member
          • Jul 2013
          • 28

          #5
          For now I only need the expression levels. Will samtools be enough?
          I would like to save the time of learning how to use this package, as I'm not that familiar with R yet.


          • dpryan
            Devon Ryan
            • Jul 2011
            • 3478

            #6
            As long as you don't have paired-end reads, then yes, that'll work. You could also just use
            Code:
            samtools view -c alignments.bam chr1:start-end
            since the -c option will do the counting for you.
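If the exons come from a BED-style list, the command above can be generated per exon. The following is a rough sketch, not a tested pipeline: the helper names are made up for illustration, and it assumes the exon file is BED (0-based, half-open), whereas samtools regions are 1-based and inclusive, hence the `+ 1` on the start. Running `count_reads` additionally assumes samtools is on the PATH and the BAM file is sorted and indexed.

```python
import subprocess


def bed_to_region(bed_line):
    """Convert a BED line (0-based, half-open) to a samtools region (1-based, inclusive)."""
    chrom, start, end = bed_line.split()[:3]
    return f"{chrom}:{int(start) + 1}-{int(end)}"


def count_command(bam_path, region):
    """Build the argument list for `samtools view -c BAM REGION`."""
    return ["samtools", "view", "-c", bam_path, region]


def count_reads(bam_path, region):
    """Run samtools and return the number of alignments overlapping the region."""
    result = subprocess.run(count_command(bam_path, region),
                            capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Hypothetical exon line; prints the command instead of running it.
    region = bed_to_region("chr1\t11873\t12227\texon1")
    print(" ".join(count_command("alignments.bam", region)))
```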


            • Ohad
              Member
              • Jul 2013
              • 28

              #7
              OK, thank you for the help.

              I do have paired-end reads, but I guess I could write a script to handle that using the read IDs.

              Nevertheless, I do perform differential exon analysis quite often, as I study alternative splicing, so it's about time I learned to use DEXSeq anyhow.
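For the paired-end case, one way to handle it by read ID is to count distinct read names rather than alignment lines, so that two mates falling in the same exon count as one fragment. A minimal sketch, assuming both mates carry the same name in the first SAM column (the function name `count_fragments` is hypothetical, and some older pipelines leave /1 and /2 suffixes that would need stripping first):

```python
def count_fragments(sam_lines):
    """Count unique read names (QNAME, SAM column 1) among SAM records.

    Header lines (starting with '@') and blank lines are skipped.
    """
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        names.add(line.split("\t", 1)[0])
    return len(names)


if __name__ == "__main__":
    import sys
    # Reads SAM records from standard input and prints the fragment count.
    print(count_fragments(sys.stdin))
```

Saved under some name such as count_fragments.py, it could be fed the output of `samtools view alignments.bam chr1:start-end` through a pipe.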
