Hi there,
I have two very long files, file1 and file2, and I want to build an output file, file3, from them (samples of all three are given further down).
The idea is:
If fields 1 and 2 of a line in file1 match fields 1 and 2 of a line in file2, then print fields 1 and 2 from file1 and fields 3, 4, 5 and 6 from file2 together on one line in file3;
If fields 1 and 2 of a line in file1 do not have any match in file2, then print fields 1 and 2 followed by 0 0 0 0 on one line in file3.
I am wondering if this can be done using awk or join or any other tool on Linux? Since the files are very large, I really want it to be fast. Thanks a lot~~~
Note: Column 2 in both file1 and file2 contains only numeric values, but column 1 in both files may contain characters too. The two columns are sorted. Also, in both files each pair of values in columns 1 and 2 exists only once, so a repeated pair (for example two lines that both start with 1 123) will not happen.
Also, I only want the information in fields 3, 4, 5 and 6 of file2 for the lines that are in file1. If a line in file1 exists in file2, then add the whole file2 line to file3; if a line in file1 is not in file2, then add "0 0 0 0" for fields 3, 4, 5 and 6. But if a line in file2 does not exist in file1, then just ignore that line. For example, "1 126 2 1 0 0" is not in file1, so this line should not be added to file3.
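One possible way to implement the rules above is a two-file awk lookup. This is only a sketch: it assumes whitespace-separated fields and that the keys of file2 fit in memory (awk keeps them in an array, which may be a concern for very large files). The function name merge_fields and its argument order are made up for illustration.

```shell
#!/bin/sh
# merge_fields LOOKUP DRIVER   (hypothetical helper, not an existing tool)
# LOOKUP plays the role of file2 (6 fields per line),
# DRIVER plays the role of file1 (2 fields per line).
merge_fields() {
    awk '
        # First file (file2): remember fields 3-6 under the key "field1 field2".
        FILENAME == ARGV[1] {
            extra[$1 " " $2] = $3 " " $4 " " $5 " " $6
            next
        }
        # Second file (file1): print the key plus the stored fields,
        # or four zeros when file2 has no line with that key.
        {
            key = $1 " " $2
            if (key in extra) print key, extra[key]
            else              print key, "0 0 0 0"
        }
    ' "$1" "$2"
}
```

It would be called as `merge_fields file2 file1 > file3`. Lines that exist only in file2 are never printed, because output is driven by the lines of file1, and each file is read only once.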
The files look like this:
file1:
Code:
1 123
1 125
1 234
2 123
2 234
...
file2:
Code:
1 123 0 1 0 0
1 126 2 1 0 0
1 234 2 3 0 0
2 123 0 1 0 1
2 138 1 1 1 1
...
file3:
Code:
1 123 0 1 0 0
1 125 0 0 0 0
1 234 2 3 0 0
2 123 0 1 0 1
2 234 0 0 0 0
...
To be clear about the note above: in both files, a repeated pair in columns 1 and 2 like the following will not happen; each pair exists only once.
Code:
1 123
1 123
...
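Since any lookup keyed on columns 1 and 2 relies on each pair occurring only once, it may be worth checking that before merging. A minimal sketch, assuming the file is sorted so that repeated pairs would sit on adjacent lines; has_unique_keys is a hypothetical name, not an existing command.

```shell
#!/bin/sh
# has_unique_keys FILE   (hypothetical helper)
# Succeeds (exit status 0) only if no two adjacent lines share fields 1 and 2.
has_unique_keys() {
    awk '
        { key = $1 " " $2 }
        # Compare with the previous line only, so memory use stays constant.
        NR > 1 && key == prev { print "duplicate key: " key; dup = 1 }
        { prev = key }
        END { exit dup + 0 }
    ' "$1"
}
```

For example, `has_unique_keys file1 && has_unique_keys file2` would succeed only when neither file contains a repeated pair.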