  • anagari
    Member
    • Jun 2011
    • 18

    Compare two files using awk command

    Hi all,

    I am trying to compare two tab-separated files and print the matching lines.

    File1:
    chr12 45677 45999 3000 a b
    chrX 90999 100000 34 c d

    File2:
    chr12 45680 46000 300 d g h k
    chrY 47800 80000 560 l y z m

    I want all the columns from both files for every pair of lines where:

    The first column is the same in both files
    Column 2 of file 1 < column 3 of file 2
    Column 3 of file 1 > column 2 of file 2


    I would really appreciate it if anyone could help me with this ASAP.


    Thanks in advance,

    PS: I tried this command, but I don't understand how to modify it for my needs:

    awk -F"\t" 'NR==FNR{a[$2]=$1FS$3;next} a[$3]{print $0 FS a[$3]}' 2.txt 1.txt > outputfile
  • steven
    Senior Member
    • Aug 2009
    • 269

    #2
    Hi,
    I'm not sure I understood, but my guess is that you are looking for overlapping intervals. If so, consider BEDtools.
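
If overlapping intervals are indeed what is wanted, the three conditions in the question (same first column, file 1 column 2 < file 2 column 3, file 1 column 3 > file 2 column 2) can also be sketched directly in awk. This is only a minimal sketch, assuming both files are tab-separated and that 2.txt is small enough to hold in memory. The file names 1.txt, 2.txt and outputfile are taken from the original post; the array names (count, f2start, f2end, f2line) are made up for illustration, and the printf lines just recreate the sample data so the example runs on its own.

```shell
# Recreate the two sample files from the question (tab-separated).
printf 'chr12\t45677\t45999\t3000\ta\tb\nchrX\t90999\t100000\t34\tc\td\n' > 1.txt
printf 'chr12\t45680\t46000\t300\td\tg\th\tk\nchrY\t47800\t80000\t560\tl\ty\tz\tm\n' > 2.txt

# First pass (2.txt): remember every line, grouped by its first column.
# Second pass (1.txt): for each line, check all stored 2.txt lines with the
# same first column and print both lines when the two conditions hold.
awk -F'\t' -v OFS='\t' '
NR == FNR {
    count[$1]++
    f2start[$1, count[$1]] = $2
    f2end[$1, count[$1]]   = $3
    f2line[$1, count[$1]]  = $0
    next
}
{
    for (i = 1; i <= count[$1]; i++)
        # "+ 0" forces a numeric rather than a string comparison.
        if ($2 + 0 < f2end[$1, i] + 0 && $3 + 0 > f2start[$1, i] + 0)
            print $0, f2line[$1, i]
}
' 2.txt 1.txt > outputfile

cat outputfile
```

With the sample data, only the chr12 pair meets all three conditions, so outputfile holds a single line: the chr12 line from 1.txt followed by the chr12 line from 2.txt. For large files, a dedicated interval tool is likely to be faster than this nested loop.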


    • gringer
      David Eccles (gringer)
      • May 2011
      • 845

      #3
      I'm also not sure what you're asking here. Do you have an example output?

      For comparing files side by side, 'diff -y' is probably better than awk. If you want to join files on a common field, use 'join'. If you just want to combine files so that one file appears to the left of the other, use 'paste'. [These are all standard Linux commands.]
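
As a rough illustration of what 'join' and 'paste' would do with the sample data from the first post, here is a small sketch. It assumes tab-separated input; the file names 1.sorted.txt, 2.sorted.txt, joined.txt and side_by_side.txt are made up for this example, and the printf lines only recreate the sample data.

```shell
# Recreate the two sample files from the question (tab-separated).
printf 'chr12\t45677\t45999\t3000\ta\tb\nchrX\t90999\t100000\t34\tc\td\n' > 1.txt
printf 'chr12\t45680\t46000\t300\td\tg\th\tk\nchrY\t47800\t80000\t560\tl\ty\tz\tm\n' > 2.txt

# A literal tab character, used as the field separator below.
TAB="$(printf '\t')"

# join expects both inputs to be sorted on the join field (column 1 here).
LC_ALL=C sort -t "$TAB" -k1,1 1.txt > 1.sorted.txt
LC_ALL=C sort -t "$TAB" -k1,1 2.txt > 2.sorted.txt

# Pair up lines that share the same first column. The key is printed once,
# followed by the remaining fields of file 1 and then those of file 2.
LC_ALL=C join -t "$TAB" 1.sorted.txt 2.sorted.txt > joined.txt
cat joined.txt

# paste simply glues line N of file 1 to line N of file 2, matching or not.
paste 1.txt 2.txt > side_by_side.txt
cat side_by_side.txt
```

Here join keeps only chr12, since chrX and chrY each appear in just one file. Neither command looks at the coordinate columns, so the two numeric conditions from the question would still need a separate filtering step.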


      • Clare S
        Junior Member
        • Jan 2010
        • 5

        #4
        Hi anagari,

        I don't know if this is the kind of path you want to go down (non-command-line), but Galaxy (usegalaxy.org) also has built-in tools for finding overlapping intervals.

        You can essentially upload your files and then, from the tools on the left, select Operate on Genomic Intervals -> Intersect.
