  • masylichu
    Member
    • Oct 2010
    • 30

    How to merge the annotations

    Hello,

    I have lincRNA annotation files from UCSC (NR*), GENCODE, and other published lincRNA collections, and I want to merge them into one larger lincRNA collection. What pipeline can do this?
  • pallevillesen
    Member
    • May 2012
    • 19

    #2
    cat file1 >combinedset.txt
    cat file2 >>combinedset.txt
    cat file3 >>combinedset.txt

    If you need to reformat:
    # Columns 1, 2, 3
    awk -v "OFS=\t" '{ print $1, $2, $3 }' file1 >combinedset.txt
    # Columns 3, 4, 5
    awk -v "OFS=\t" '{ print $3, $4, $5 }' file2 >>combinedset.txt
    # Columns 1, 2, 3: change col 2 from 1-based to 0-based
    awk -v "OFS=\t" '{ print $1, int($2)-1, $3 }' file3 >>combinedset.txt
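As a minimal sketch of the reformatting step, the commands below build three tiny made-up inputs and merge them into one three-column file. The file names (`demo_file1.txt` and so on), the column layouts, and the records are all assumptions for illustration, not the real UCSC or GENCODE layouts; check which columns hold chromosome, start, and end in your own files before reusing this.

```shell
# Hypothetical inputs (names, layouts and records are invented for this demo).
printf 'chr1\t100\t200\tlincA\n'     > demo_file1.txt   # chrom, start, end in columns 1-3
printf 'lincB\tx\tchr2\t300\t400\n'  > demo_file2.txt   # chrom, start, end in columns 3-5
printf 'chr3\t501\t600\tlincC\n'     > demo_file3.txt   # columns 1-3, but start is 1-based

# Write the first file with >, append the rest with >>.
awk -v OFS='\t' '{ print $1, $2, $3 }'          demo_file1.txt >  demo_combinedset.txt
awk -v OFS='\t' '{ print $3, $4, $5 }'          demo_file2.txt >> demo_combinedset.txt
# Shift the 1-based start to 0-based so all rows share one convention.
awk -v OFS='\t' '{ print $1, int($2) - 1, $3 }' demo_file3.txt >> demo_combinedset.txt

cat demo_combinedset.txt
```

Run it in a scratch directory, since it writes the demo files into the current directory.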


    • zinky
      Member
      • Dec 2011
      • 48

      #3
      Originally posted by masylichu
      Hello,

      I have lincRNA annotation files from UCSC (NR*), GENCODE, and other published lincRNA collections, and I want to merge them into one larger lincRNA collection. What pipeline can do this?
      Could you post the web links to those lincRNA annotation files? I would like them too.
      Last edited by zinky; 12-05-2012, 11:49 PM.


      • sdriscoll
        I like code
        • Sep 2009
        • 436

        #4
        When merging, I assume you might need to check for duplicates between the separate annotations. Is that the case?

        If not, then 'catting' them together is the right thing to do (assuming you're using a *nix-based system, or Cygwin on Windows). Just a dorky note: you can do those cats in one line:

        Code:
        cat file1 file2 file3 > combinedset.txt
        and you could also do the reformats in one line:

        Code:
        cat <(cut -f1,2,3 file1) <(cut -f3,4,5 file2) <(cut -f1,2,3 file3) > combinedset.txt
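If duplicates between annotations do matter, one rough way to see how much two sets overlap is to compare coordinates only and list the intervals that occur in both. The sketch below assumes tab-separated files with chromosome, start, and end in the first three columns; the file names and records are invented for the demo. It uses a brace group rather than `<( )` process substitution so it also runs in a plain POSIX shell, and it only catches exact coordinate matches, not partially overlapping intervals.

```shell
# Hypothetical annotation sets (names and records are invented for this demo).
printf 'chr1\t100\t200\tA\nchr2\t300\t400\tB\n' > demo_setA.bed
printf 'chr1\t100\t200\tA2\nchr3\t10\t90\tC\n'  > demo_setB.bed

# Keep coordinates only and deduplicate within each file first, so that
# uniq -d reports only intervals present in both files.
{
  cut -f1-3 demo_setA.bed | sort -u
  cut -f1-3 demo_setB.bed | sort -u
} | sort | uniq -d > demo_shared.txt

cat demo_shared.txt
```

Here the interval on chr1 is reported even though the two sets give it different names.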
        /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
        Salk Institute for Biological Studies, La Jolla, CA, USA */


        • pallevillesen
          Member
          • May 2012
          • 19

          #5
          OK, if you end up with something like:

          chr1 1002 9005 lincRNA1 . + (BED format)

          Then you can run:

          sort -k1,1 -k2,2n combinedfile.bed | uniq >combined.sorted.collapsed.bed

          The result is sorted by chromosome and start position, and contains only unique lines.
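One caveat worth illustrating: `uniq` only drops lines that are identical in every column, so the same interval carried under two different names in two source collections survives. The sketch below contrasts that with a coordinate-level collapse using `sort -u` on the chromosome, start, end, and strand columns. The records and file names are made up, and which of the competing names is kept is not something to rely on; treat this as an assumption-laden illustration rather than a recommended pipeline.

```shell
# Hypothetical BED6 records: rows 1 and 3 are identical, row 2 has the same
# coordinates and strand but a different (invented) name.
printf 'chr1\t1002\t9005\tlincRNA1\t.\t+\n'   >  demo_combined.bed
printf 'chr1\t1002\t9005\totherName1\t.\t+\n' >> demo_combined.bed
printf 'chr1\t1002\t9005\tlincRNA1\t.\t+\n'   >> demo_combined.bed
printf 'chr2\t50\t800\tlincRNA2\t.\t-\n'      >> demo_combined.bed

TAB="$(printf '\t')"

# Exact-line deduplication: the renamed copy of the interval is kept.
sort -k1,1 -k2,2n demo_combined.bed | uniq > demo_exact.bed

# Coordinate-level deduplication: one record per chrom/start/end/strand.
sort -t "$TAB" -u -k1,1 -k2,2n -k3,3n -k6,6 demo_combined.bed > demo_bycoord.bed

wc -l < demo_exact.bed
wc -l < demo_bycoord.bed
```

With these four input rows the exact-line version keeps three records and the coordinate-level version keeps two.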
