how to extract information about mapped genes from a genome-mapping bam file

  • how to extract information about mapped genes from a genome-mapping bam file

    Hello, there,

    I mapped reads to the genome scaffolds using TopHat and got a BAM file. I also have a GFF file for the genome sequences. Is there an easy way to extract which genes were mapped and the corresponding reads?

    Thank you for your help.

    Capricy

  • #2
    Are you looking for coverage information (http://bedtools.readthedocs.org/en/l.../coverage.html) or actually looking to extract reads that are mapped to the genes (http://seqanswers.com/forums/showthread.php?t=50390)?

    • #3
      I want to find out which gene each read maps to. For example:

      My alignment file:
      ----
      HWI-M01439:125:000000000-A7P33:1:1110:21257:22290 99 A_Cont998 55849 50 106M95N138M98N6M = 55887 481 ATTTTGAAGATATCGGAGTATTAGACCTCGACGCCTCACGTGAGCCAATGAGGGCTTTAGTTTGACTTCGTGTGACCTTCACCGCAGGATCAGTTGTGGAGAGGAACAGTTCCGTCACTGTGTTCTTATGCGTAGGATCAAATAACTTTTTCAATTCGCCAGATGCAGCAGCCACTTCAGCGGCCGTCTGCCCATAAAAGACGTCATCCTCCTGCAGTTCCCGAGGTTTAAGGCCAGTTTTATCATCTCT CDCEEFDFFFFFGGGGGGGGGGHHHHHHHGGGGGGHGHHHHHHGHHHHHHHHHGGHHHHHHHHHHHHHHGHGHHHHHHHHHHHGGGGGGHHHHHHHHGGHHHGGHHHHHHHHHGHHGGHHHHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHHHGGGGGHHHHHHHHHHHHHHHHHHHGGGGGGGHGHGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFFFFFFFFFFFFFFFFHHFFFHFFHFHF MD:Z:250 XG:i:0 NH:i:1 NM:i:0 XM:i:0 XO:i:0 AS:i:0 XS:A:-
      HWI-M01439:125:000000000-A7P33:1:1110:21257:22290 147 A_Cont998 55887 50 68M95N138M98N44M = 55849 -481 CGTGAGCCAATGAGGGCTTTAGTTTGACTTCGTGTGACCTTCACCGCAGGATCAGTTGTGGAGAGGAACAGTTCCGTCACTGTGTTCTTATGCGTAGGATCAAATAACTTTTTCAATTCGCCAGATGCAGCAGCCACTTCAGCGGCCGTCTGCCCATAAAAGACGTCATCCTCCTGCAGTTCCCGAGGTTTAAGGCCAGTTTTATCATCTCTAGTAACTATTTCCGAAACGTACTCCCAACGTGGGCCTC EFBFFFFFFFFFFFFFFFFFFFFFFFEFFFFFFFFBBGFBA9.FAGFGGGGFFGGGGFBFFFGGGHHGHHEEECGGGHHHHHHHHHHFGGGGGHHHHHHHHHHHHHGGHHHGGHHGC<CHHFHHHGHHHGGHGHHGHHGGGGGGGGGHGGHHHHHHHHHGHGGGHHHHFGGGHHHHHHGGGGGGHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHGGGGGHHHGGGGGGGGGGFCCDDDDDDDDD MD:Z:250 XG:i:0 NH:i:1 NM:i:0 XM:i:0 XO:i:0 AS:i:0 XS:A:-
      ----

      My GFF file indicates that gene B is located on scaffold A_Cont998, between 558000 and 578000.

      I would like output like this: HWI-M01439:125:000000000-A7P33:1:1110:21257:22290 "B gene"

      I don't need counts, just to match reads to genes based on the BAM file and the GFF file.

      Any idea? Thanks a lot!

      • #4
        Why would you want to do that? Probably you have your reasons. One idea: htseq-count (it takes a SAM file as input; newer versions can also read BAM with -f bam) has an option, --samout, that writes a second SAM file in which each alignment record is annotated with the feature it was assigned to, as an extra optional field (the XF tag) alongside the existing ones. You can then parse that SAM output for the information you want. Good luck.
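
        A minimal sketch of that parsing step, assuming the file written by htseq-count --samout carries the assigned feature in an XF:Z: optional field and that unassigned reads get special values starting with "__" (such as __no_feature); older HTSeq versions may name these differently. The function name reads_to_features is only illustrative.

```python
def reads_to_features(sam_lines):
    """Yield (read_name, feature) pairs from SAM lines annotated with XF:Z: tags."""
    for line in sam_lines:
        # Skip blank lines and SAM header lines, if any are present.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM record has 11 mandatory fields; optional tags follow them.
        if len(fields) < 11:
            continue
        feature = None
        for tag in fields[11:]:
            if tag.startswith("XF:Z:"):
                feature = tag[len("XF:Z:"):]
                break
        # Assumption: values starting with "__" mark reads not assigned to a gene.
        if feature is None or feature.startswith("__"):
            continue
        yield fields[0], feature
```

        To use it, open the SAM file written by --samout and print each pair, for example: for name, gene in reads_to_features(open(path)): print(name, gene, sep="\t"). Both mates of a pair appear as separate records, so the same read name can be reported twice.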

        • #5
          Ok, I will go with this idea!

          Thanks
