
  • Annotation viewing in IGV/Tablet - how to create an alias file

    Hello everyone!

    The problem of the day that has me a little stymied is the following:

    I am trying to visualise a large alignment (BAM file) in either IGV or Tablet.

    I have tried to load the feature file (GFF3) to help me navigate. No luck.

    My suspicion is that the contig names do not match between the genome reference and the GFF.

    My reference genome is a whole-genome shotgun assembly consisting of a ton of scaffolds.

    These are named as follows (example): gi|123456789|ref|NW_123456789.1|

    These names also show up in my BAM file, so all good there!

    The GFF file, if I interpret it correctly, uses only the NW_123456789.1 part as the reference name, which is why I assume Tablet and IGV cannot recognize the features.

    Now, the solution would be to create an alias file. But how do I do it? I have 100,000-odd scaffolds, so doing it by hand is out of the question...
    Last edited by TabeaK; 12-11-2012, 12:41 PM.
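One way to test the suspicion described above is to list every sequence name used in the GFF3 that has no exact match among the FASTA headers. The following is only a minimal sketch: `missing_seqids` is a hypothetical helper name, the file names in the usage comment are placeholders, and it assumes the FASTA name is the header text up to the first whitespace and that the GFF3 has no embedded `##FASTA` section.

```shell
# Hypothetical helper: print each GFF3 seqid (column 1) that does not
# exactly match any FASTA header name.
#   $1 = reference FASTA, $2 = GFF3 annotation
# Usage (placeholder file names): missing_seqids genome.fasta annotation.gff3
missing_seqids() {
    awk '
        # First file (FASTA): remember each header name, i.e. the text
        # after ">" up to the first whitespace.
        NR == FNR {
            if (/^>/) { sub(/^>/, "", $1); names[$1] = 1 }
            next
        }
        # Second file (GFF3): skip comment/directive lines and blank lines.
        /^#/ || NF == 0 { next }
        # Report each unmatched seqid once.
        !($1 in names) && !seen[$1]++ { print $1 }
    ' "$1" "$2"
}
```

If this prints names such as NW_123456789.1 while the FASTA headers read gi|123456789|ref|NW_123456789.1|, the two files do use different naming, and an alias file (or a viewer that aliases such names itself) is needed.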

  • #2
    Hi,

    What version of IGV are you using? It should automatically recognize and alias identifiers of that sort.

    I'm sure there are more direct ways to do this, but this method works. First, cut out the short identifiers like this:

    cut -f 4 -d '|' inputFile > names.txt

    Then paste the two files together:

    paste inputFile names.txt > yourGenome_alias.tab

    Jim
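The two commands above expect a list of full sequence names as input. As a hedged sketch of the same idea applied directly to the reference FASTA, the function below writes one tab-separated line per scaffold, full name first and short accession second. `make_alias_file` is a hypothetical name, the file names in the usage comment are placeholders, and it assumes headers of the form gi|<number>|ref|<accession>| with the accession in the fourth `|`-separated field, as in the example from the first post.

```shell
# Hypothetical helper: build alias lines ("full name<TAB>short name")
# from the headers of a FASTA file.
#   $1 = reference FASTA; output goes to stdout
# Usage (placeholder file names):
#   make_alias_file genome.fasta > yourGenome_alias.tab
make_alias_file() {
    awk '
        /^>/ {
            full = substr($1, 2)          # header name without the ">"
            n = split(full, part, "|")    # gi | number | ref | accession |
            # Only emit a line for headers that follow the gi|...| pattern.
            if (n >= 4 && part[1] == "gi" && part[4] != "")
                print full "\t" part[4]
        }
    ' "$1"
}
```

Headers that do not follow the gi|...| pattern are skipped rather than guessed at, so the output may contain fewer lines than the FASTA has sequences.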



    • #3
      Hi Jim, thanks for your answer!

      I am running version 2.1.28 on MacOSX 7.4.

      It is working brilliantly (great tool!), apart from viewing annotations.

      I'll try your suggestion and report back.



      • #4
        I'm a bit puzzled, because the aliasing should not be necessary in this case. It is done automatically for sequence names that start with "gi|", so gi|123456789|ref|NW_123456789.1| should automatically resolve to NW_123456789.1. If you want to send me a short snippet of your FASTA file and a sample of your GFF3, I will look into this further. You can send it to [email protected].



        • #5
          Thanks for the offer! I'll email you a bit of data from both the FASTA and the GFF ASAP.
