  • Going from transcriptome to genome coordinates with a bam file

    I'm venturing into RNA editing in mouse, and one of the most common ways to avoid false positives is to map reads to a transcriptome or a custom set of junction sequences, then map to a genome. That's the easy part.

    Once I have the mapped reads for the transcriptome, does anyone know of a good tool/method to convert those coordinates to genome coordinates while updating CIGAR strings to show split reads?

  • #2
    What you are asking for is essentially another mapping, isn't it? A map from the transcriptome to your genome would have to be constructed first (probably easier if a genome annotation is available), and then the reads would be carried over from transcriptome to genome coordinates using that map. That's the idea I could think of. It certainly seems like a nice problem to invest some time in. I am not aware of any existing tools that do this. I'll work on it a bit and post back if I get anywhere.

    Just one question: did you construct your transcriptome yourself (from an annotation file)? Or do you have a GFF file at all?



    • #3
      Originally posted by cedance
      What you are asking for is essentially another mapping, isn't it? A map from the transcriptome to your genome would have to be constructed first (probably easier if a genome annotation is available), and then the reads would be carried over from transcriptome to genome coordinates using that map. That's the idea I could think of. It certainly seems like a nice problem to invest some time in. I am not aware of any existing tools that do this. I'll work on it a bit and post back if I get anywhere.

      Just one question: did you construct your transcriptome yourself (from an annotation file)? Or do you have a GFF file at all?

      It's essentially a solved problem, since this is what mappers like Tophat do. However, my Python skills are not sufficient to figure out how the code works so that I could apply it to a separate BAM file. Any help you could provide would be great.
      I'm using mm9 with a UCSC GTF.



      • #4
        Use Tophat. You will need a GTF file of exons, CDS, etc. Tophat will map to the known transcriptome then map to the rest of the genome.



        • #5
          Originally posted by golharam
          Use Tophat. You will need a GTF file of exons, CDS, etc. Tophat will map to the known transcriptome then map to the rest of the genome.

          Tophat's output isn't sufficient for what I want to do. One reason is that a common filtering step for RNA edits is based on MAPQ, and the MAPQ values Tophat reports don't correlate with alignment quality.



          • #6
            Excuse the bad terminology. By "mapping" I didn't mean the usual sense of aligning reads to a genome, but a mapping in the "function" sense: an association, if you will.

            To restate: you'll need an association from every coordinate of your transcriptome to the corresponding coordinate of your genome. Imagine a read that starts at position 1500 on "Chr1" in your transcriptome alignment, with CIGAR string "80M". Imagine that, had you mapped it to your reference genome, its CIGAR string would be "30M60N50M". This means the read is spliced: it crosses a junction with a 60-base gap in the genome. The only way I can think of to recover this is to know that 1500-1529 of your transcriptome corresponds to 1500-1529 of your reference genome, whereas 1530-1579 of your transcriptome corresponds to 1530+60 = 1590 to 1639. Hence the need for an association of transcriptome to genome.
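
            To make the arithmetic above concrete, here is a minimal Python sketch, not a tested tool. The function name transcript_to_genome and the exon list are made up for illustration. It assumes the transcript is on the plus strand, that exons are given as sorted 1-based inclusive genomic intervals, and that the first exon starts at genomic position 1 so transcript and genome coordinates coincide up to the junction, as in the example. Minus-strand transcripts would additionally need the read and CIGAR reversed, which is not handled here.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def transcript_to_genome(exons, t_start, cigar):
    """Convert a transcript-space alignment to genome space (plus strand only).

    exons   -- sorted list of (start, end) genomic intervals, 1-based inclusive
    t_start -- 1-based leftmost position of the read on the spliced transcript
    cigar   -- CIGAR string of the transcript-space alignment
    Returns (genomic_start, genomic_cigar).
    """
    # Find the exon that contains the read's start position.
    offset = t_start - 1
    for idx, (start, end) in enumerate(exons):
        length = end - start + 1
        if offset < length:
            break
        offset -= length
    else:
        raise ValueError("read starts beyond the end of the transcript")

    g_start = exons[idx][0] + offset
    remaining = exons[idx][1] - g_start + 1  # bases left in the current exon
    out = []  # list of [length, op], merged when adjacent ops are equal

    def emit(n, op):
        if out and out[-1][1] == op:
            out[-1][0] += n
        else:
            out.append([n, op])

    for n_str, op in CIGAR_RE.findall(cigar):
        n = int(n_str)
        if op in "MDN=X":  # operations that consume reference bases
            while n > 0:
                if remaining == 0:
                    # Crossing an exon junction: insert the intron as an N.
                    if idx + 1 >= len(exons):
                        raise ValueError("alignment runs past the last exon")
                    emit(exons[idx + 1][0] - exons[idx][1] - 1, "N")
                    idx += 1
                    remaining = exons[idx][1] - exons[idx][0] + 1
                step = min(n, remaining)
                emit(step, op)
                n -= step
                remaining -= step
        else:  # I, S, H, P do not consume reference bases
            emit(n, op)

    return g_start, "".join(f"{n}{op}" for n, op in out)

# The example from this post: junction after 1529, 60-base intron.
example_exons = [(1, 1529), (1590, 2000)]
print(transcript_to_genome(example_exons, 1500, "80M"))  # (1500, '30M60N50M')
```

            The N is only emitted when the next reference-consuming base is needed, so a read that ends exactly on the last base of an exon does not get a trailing intron.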

            Going by this logic, if the GTF/GFF files for your transcriptome and genome share gene IDs (or you know which RNA ID in your transcriptome corresponds to which gene, and its coordinates), then it is probably possible to establish this association. If I've confused you again or misunderstood the question entirely, excuse the mess!
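
            One way the association could be built from a GTF is sketched below in Python. This is only a sketch under assumptions: the function name exons_by_transcript and the sample IDs "G1" and "T1" are hypothetical, and it assumes the reference names in the transcriptome BAM match the transcript_id attribute in the GTF. It relies only on the standard GTF layout of nine tab-separated columns with 1-based inclusive start and end.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')

def exons_by_transcript(gtf_lines):
    """Collect exon intervals per transcript from GTF lines.

    Returns {transcript_id: {"chrom": str, "strand": str,
                             "exons": [(start, end), ...]}}
    with exons sorted by genomic start (1-based, inclusive).
    """
    records = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features define the spliced transcript
        match = TRANSCRIPT_ID_RE.search(fields[8])
        if match is None:
            continue
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        records[match.group(1)].append((chrom, strand, start, end))

    table = {}
    for tid, recs in records.items():
        recs.sort(key=lambda r: r[2])  # order exons by genomic start
        table[tid] = {
            "chrom": recs[0][0],
            "strand": recs[0][1],
            "exons": [(s, e) for _, _, s, e in recs],
        }
    return table

# Hypothetical two-exon transcript matching the numbers used earlier.
sample = [
    'chr1\tsketch\texon\t1\t1529\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr1\tsketch\texon\t1590\t2000\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
]
print(exons_by_transcript(sample)["T1"]["exons"])  # [(1, 1529), (1590, 2000)]
```

            The strand is kept in the table because transcripts on the minus strand are stored reverse-complemented in the transcriptome, so their positions have to be counted from the other end.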
