  • lnzzz
    Junior Member
    • Jan 2014
    • 4

    RNA-seq : 5’ to 3’ transcript coverage

    Hello everybody,

    I would like to get the read distribution along transcripts in my RNA-seq libraries. Basically, I would like to see if I have a 5' or 3' shortening of transcripts.


    I have already tried two different tools, without success:
    1) RNA-SeQC. I have trouble running it on my TopHat BAM files.
    2) RSeQC (geneBody_coverage.py). With this one, I get the following error:

    ImportError: dlopen(/Library/Python/2.7/site-packages/RSeQC-2.3.7-py2.7-macosx-10.8-intel.egg/csamtools.so, 2): Symbol not found: ___ks_insertsort_heap
    Referenced from: /Library/Python/2.7/site-packages/RSeQC-2.3.7-py2.7-macosx-10.8-intel.egg/csamtools.so
    Expected in: flat namespace
    in /Library/Python/2.7/site-packages/RSeQC-2.3.7-py2.7-macosx-10.8-intel.egg/csamtools.so


    It seems to be an installation problem, but my computing skills are too weak to know how to deal with it.


    I'm stuck and would very much appreciate your help. Could you help me get RNA-SeQC or RSeQC working? Or do you perhaps know a better strategy for getting the 5’ to 3’ transcript coverage?

    Thank you!!!

    ln


    PS: I wish you all the best for the new year
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    You can use coverageBed from the bedtools suite to get the coverage information: http://bedtools.readthedocs.org/en/l.../coverage.html


    • lnzzz
      Junior Member
      • Jan 2014
      • 4

      #3
      Thank you very much.
      I tried coverageBed and got a file with the read coverage for each position in the genome. I succeeded in extracting the distribution along one particular transcript. However, what I really need is an estimate of the global read distribution along all transcripts. I don't really know how I can get that with coverageBed.


      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        Can you post the command you used for coverageBed?

        BTW: Did you try the test dataset for RSeQC with your local install? Is that generating an error? Looking at your file paths it appears that you are using OS X (10.8?)
        Last edited by GenoMax; 01-07-2014, 04:44 AM.


        • lnzzz
          Junior Member
          • Jan 2014
          • 4

          #5
          Yes, I used a BED file with the coordinates of my genome. Here is the command I used for coverageBed:
          /Users/bedtools2/bin/coverageBed -d -s -split -abam 621_hits.bam -b TAIR10.bed >coverage.txt

          This command gave me a very big file (20 GB). Because it was so large, I also used genomeCoverageBed to obtain the depth at each position:

          /Users/bedtools2/bin/genomeCoverageBed -bg -split -trackline -ibam 621_hits.bam -g TAIR10.bed > coverage2.txt

          What I would like is the global read distribution along transcripts, that is, the percentage of reads in the first 10% of the bases of the transcripts, in the next 10%, and so on. (Attached is an example of what I could get for one transcript; I would like to obtain the same for all transcripts.)
          Attached Files


          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            The -d option reports coverage at each position. Can you try coverageBed without -d and -split? Does your transcriptome BED file contain the start/stop positions of exons? If you are set on the 10% intervals, then you may need to create a custom BED file.
            Last edited by GenoMax; 01-07-2014, 07:51 AM.
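
            If 10% intervals are what is wanted, one possible way to build such a custom BED file is sketched below. It is only an illustration under simplifying assumptions: each transcript is treated as a single unspliced interval (exon structure is ignored, so introns would be included when using genome coordinates), and the function name and the "_binN" naming scheme are made up for this example. Bins are numbered from the 5' end, so they run in reverse for minus-strand transcripts.

```python
def split_into_bins(chrom, start, end, name, strand, n_bins=10):
    """Split one BED interval (0-based start, exclusive end) into n_bins
    consecutive sub-intervals and return them as six-column BED rows.

    Hypothetical helper: bin 1 is always the 5' end of the transcript,
    so for '-' strand features bin 1 is the rightmost sub-interval.
    """
    length = end - start
    rows = []
    for i in range(n_bins):
        # Integer arithmetic keeps the bins contiguous and non-overlapping.
        bin_start = start + (i * length) // n_bins
        bin_end = start + ((i + 1) * length) // n_bins
        # Number the bin relative to the transcript's 5' end.
        idx = i if strand == "+" else n_bins - 1 - i
        rows.append((chrom, bin_start, bin_end, f"{name}_bin{idx + 1}", 0, strand))
    return rows


if __name__ == "__main__":
    # Example coordinates are invented for demonstration only.
    for row in split_into_bins("Chr1", 100, 200, "example_tx", "+"):
        print("\t".join(str(field) for field in row))
```

            Written out as a tab-separated file, the rows could then be supplied to coverageBed as the -b file, and the per-bin counts summed over all transcripts by bin number. Transcripts shorter than n_bins bases would produce empty bins and are probably best filtered out first.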


            • lnzzz
              Junior Member
              • Jan 2014
              • 4

              #7
              Yes, I think I need to create a custom BED file.

              Thank you for your help


              • swbarnes2
                Senior Member
                • May 2008
                • 910

                #8
                Quick and dirty way
                This works best if you align to a list of transcripts instead of the genome. Sure, it's not quite as accurate as aligning to the genome with TopHat, but you don't need exact figures, just a ballpark.

                1) Get a list of your transcripts and how long each one is. (If you align to a list of transcripts, samtools idxstats will tell you this.)
                2) Take each line of your SAM file and associate it with a transcript. (If you align to a list of transcripts, each line will already have that info.)
                3) Go through each line of the SAM file and change the alignment position to a corrected integer position, i.e. the position / total length of the transcript.
                4) Bin up all your new positions.
                Last edited by swbarnes2; 01-07-2014, 10:28 AM.
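
                The four steps above can be sketched in Python roughly as follows. This is only an illustration, not swbarnes2's own code. It assumes the reads were aligned to a list of transcripts, takes the transcript lengths from the @SQ header lines of the SAM file (instead of from samtools idxstats), and uses only the leftmost alignment position of each mapped read. All function names are hypothetical.

```python
def transcript_lengths(sam_lines):
    """Step 1: collect transcript lengths from the @SQ header lines."""
    lengths = {}
    for line in sam_lines:
        if line.startswith("@SQ"):
            tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def bin_positions(sam_lines, lengths, n_bins=10):
    """Steps 2-4: scale each read's start position by the length of its
    transcript and count how many reads fall into each of n_bins bins."""
    counts = [0] * n_bins
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        cols = line.rstrip("\n").split("\t")
        flag, rname, pos = int(cols[1]), cols[2], int(cols[3])
        if flag & 4 or rname not in lengths:
            continue  # unmapped read, or reference missing from the header
        # SAM positions are 1-based; convert to a fraction in [0, 1).
        fraction = (pos - 1) / lengths[rname]
        counts[min(int(fraction * n_bins), n_bins - 1)] += 1
    return counts


def to_percent(counts):
    """Express the bin counts as percentages of all counted reads."""
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


if __name__ == "__main__":
    # Tiny invented example: two transcripts, three mapped reads, one unmapped.
    sam = [
        "@SQ\tSN:T1\tLN:100",
        "@SQ\tSN:T2\tLN:200",
        "r1\t0\tT1\t1\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
        "r2\t0\tT1\t95\t255\t5M\t*\t0\t0\tACGTA\tIIIII",
        "r3\t16\tT2\t101\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
        "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    ]
    print(bin_positions(sam, transcript_lengths(sam)))
```

                For a real dataset, the lines could come from a SAM file produced with samtools view -h, read one line at a time. Because only the read start is used, the profile is shifted slightly toward the 5' end, which should not matter for a ballpark figure.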
