  • Going from transcriptome to genome coordinates with a bam file

    I'm venturing into RNA editing in mouse, and one of the most common ways to avoid false positives is to map reads to a transcriptome or a custom set of junction sequences and then map to the genome. That's the easy part.

    Once I have the mapped reads for the transcriptome, does anyone know of a good tool/method to convert those coordinates to genome coordinates while updating CIGAR strings to show split reads?

  • #2
    What you are asking for is essentially another mapping, isn't it? A map from transcriptome coordinates to genome coordinates has to be constructed first (probably easier if a genome annotation is available), and then the reads can be carried over from the transcriptome to the genome using that map. This is the only idea I can think of. It certainly seems like a nice problem to invest some time in. I am not aware of any existing tools that do this. I'll work on it a bit and post back if I get anywhere.

    Just one question: did you construct your transcriptome yourself (from an annotation file)? Or do you have a GFF file at all?
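
    As a rough illustration of the "map constructed first" idea, here is a minimal Python sketch that collects the exon blocks of each transcript from GTF lines. The function name `load_exons` and the returned layout are hypothetical, chosen only for this example. It assumes standard nine-column, tab-separated GTF with 1-based inclusive coordinates and a `transcript_id "..."` attribute, and it converts to 0-based half-open intervals.

```python
import re
from collections import defaultdict

# Matches the transcript_id attribute in the ninth GTF column.
_TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def load_exons(gtf_lines):
    """Return {transcript_id: (chrom, strand, [(start, end), ...])}.

    Hypothetical helper for illustration. Exon intervals are 0-based,
    half-open, and sorted by genomic start.
    """
    exons = defaultdict(list)
    info = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon records define the transcript structure
        match = _TRANSCRIPT_ID.search(fields[8])
        if match is None:
            continue
        tid = match.group(1)
        # GTF is 1-based inclusive; convert to 0-based half-open.
        exons[tid].append((int(fields[3]) - 1, int(fields[4])))
        info[tid] = (fields[0], fields[6])
    return {
        tid: (info[tid][0], info[tid][1], sorted(blocks))
        for tid, blocks in exons.items()
    }
```

    With a real annotation this could be called as `load_exons(open("genes.gtf"))`, where the file name is a placeholder. Transcript IDs in the GTF would have to match the reference names in the transcriptome BAM for the lookup to work.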

    • #3
      In reply to cedance (#2):
      It's essentially a solved problem, since this is what mappers like Tophat do. However, my Python skills are not sufficient to figure out how that code works well enough to apply it to a separate BAM file. Any help you could provide would be great.
      I'm using mm9 with a UCSC GTF.

      • #4
        Use Tophat. You will need a GTF file of exons, CDS, etc. Tophat will map to the known transcriptome then map to the rest of the genome.

        • #5
          In reply to golharam (#4):
          Tophat's output isn't sufficient for what I want to do. One reason is that a common filtering step for RNA edits is based on MAPQ, and the MAPQ values Tophat reports don't correlate with alignment quality.

          • #6
            Excuse the bad terminology. By "mapping" I did not mean the usual sense of aligning reads to a genome, but a mapping in the "function" sense: an association between two coordinate systems.

            To rephrase: you'll need an association between every coordinate of your transcriptome and the corresponding coordinate of your genome. Imagine a read that aligns to the transcriptome at position 1500 with the CIGAR string "80M", on a transcript from chromosome "Chr1". Imagine that, had you mapped it to the reference genome instead, the CIGAR string would be "30M60N50M", meaning the read is spliced across a 60-base intron. The only way I can think of to get from one to the other is to know that positions 1500-1529 of the transcriptome correspond to 1500-1529 of the reference genome, while positions 1530-1579 of the transcriptome correspond to 1530 + 60 = 1590 through 1639. Hence the need for an association between transcriptome and genome.

            Going by this logic, if the GTF/GFF files for your transcriptome and genome share gene IDs (or you know which RNA ID in your transcriptome corresponds to which gene, and its coordinates), then it should be possible to establish this association. If I have confused you again or misunderstood the question entirely, excuse the mess!
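
            The association described above can be sketched in Python as follows. This is only a minimal illustration, not how Tophat or any other tool does it. The function name `transcript_to_genome` is hypothetical, and the sketch assumes a plus-strand transcript, 0-based coordinates, and exons given as sorted half-open `(start, end)` genomic intervals. Minus-strand transcripts would additionally need the CIGAR reversed and the read reverse-complemented, which is not handled here.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def transcript_to_genome(exons, tpos, cigar):
    """Convert a transcript-space alignment to genome space.

    exons: sorted 0-based half-open genomic intervals of one plus-strand
           transcript; tpos: 0-based alignment start within the transcript.
    Returns (genome_start, genome_cigar) with N inserted at each junction.
    """
    # Find the exon that contains the alignment start.
    idx, offset = 0, tpos
    while idx < len(exons) and offset >= exons[idx][1] - exons[idx][0]:
        offset -= exons[idx][1] - exons[idx][0]
        idx += 1
    if idx == len(exons):
        raise ValueError("position lies beyond the transcript end")
    gstart = cur = exons[idx][0] + offset

    out = []
    for count, op in CIGAR_OP.findall(cigar):
        length = int(count)
        if op not in REF_CONSUMING:
            out.append((length, op))  # I, S, H, P pass through unchanged
            continue
        while length > 0:
            if cur == exons[idx][1]:
                # At an exon boundary: emit the intron as an N operation.
                if idx + 1 == len(exons):
                    raise ValueError("alignment runs past the transcript end")
                out.append((exons[idx + 1][0] - cur, "N"))
                idx += 1
                cur = exons[idx][0]
            step = min(length, exons[idx][1] - cur)
            out.append((step, op))
            cur += step
            length -= step
    return gstart, "".join(f"{n}{op}" for n, op in out)
```

            For the example in this post, with a first exon covering 1-based positions 1-1529 and a second exon starting at 1590 (the exon ends are made up for illustration), `transcript_to_genome([(0, 1529), (1589, 1700)], 1499, "80M")` returns `(1499, "30M60N50M")`, that is, 1-based start 1500 with the spliced CIGAR.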
