Joining two CSV files on a common column and removing the second last column
I have two csv files:
file1:
C1, 1, 0, 1, 0, 1
C2, 1, 0, 1, 1, 0
C3, 0, 0, 1, 1, 0
file2:
C3, 1.2
C1, 2.3
C2, 1.8
I want to merge these two files based on the C column (which produces):
C1, 1, 0, 1, 0, 1, 2.3
C2, 1, 0, 1, 1, 0, 1.8
C3, 0, 0, 1, 1, 0, 1.2
And then remove the second last column (to produce):
C1, 1, 0, 1, 0, 2.3
C2, 1, 0, 1, 1, 1.8
C3, 0, 0, 1, 1, 1.2
shell-script awk join merge
asked Nov 15 at 6:19
Coder
What did you try? Post your own efforts to the question
– Inian
Nov 15 at 6:35
@Inian My try was this: (1) sort the two files on the common column, (2) apply join on that column, (3) then use awk to print all the columns except the second last.
– Coder
Nov 15 at 8:57
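For what it's worth, a rough sketch of that three-step idea (untested; it assumes bash for the process substitution and the sample file1/file2 from the question):
# sort both files on the key, join them on column 1, then print every field except the second-last
join -t, <(sort file1) <(sort file2) |
awk -F, -v OFS=, '{ out = $1; for (i = 2; i <= NF; i++) if (i != NF-1) out = out OFS $i; print out }'
The join step reproduces the merged rows shown in the question, and the awk step copies every field except the second-last one into the output.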
2 Answers
You just have to create a hash map from the second file, keyed on the C column, and use it while reading the first file, as below. The action following FNR==NR applies only to the first file named on the command line (file2 here), and the subsequent action runs on the remaining file (file1). This works because of the special variables in awk, FNR and NR, which track the line number within the current file and across all files respectively. Note that assigning the file2 value to $NF overwrites file1's last column, so the merge and the removal of the second-last column happen in a single step.
awk -v FS="," -v OFS="," 'FNR==NR { unique[$1]=$2; next } $1 in unique { $NF=unique[$1]; }1' file2 file1
answered Nov 15 at 6:38 (accepted)
Inian
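Spelled out with comments, the same program reads as follows; this is just an expanded sketch of the one-liner above, not a different solution:
awk -v FS="," -v OFS="," '
    FNR == NR {             # true only while reading the first file given (file2)
        unique[$1] = $2     # remember the file2 value, keyed by the C column
        next                # do not run the main action on file2 lines
    }
    $1 in unique {          # from here on we are reading file1
        $NF = unique[$1]    # overwrite the last column with the stored value
    }
    1                       # print every (possibly modified) file1 line
' file2 file1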
This works: awk -v FS="," -v OFS="," 'FNR==NR { unique[$1]=$2; next } $1 in unique { $NF=unique[$1]"," $NF; }1' file2 file1 (It prints that second last column after merging as the last column.)
– Coder
Nov 15 at 9:56
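On the sample data, that variant should print something like the lines below: the file2 value lands where file1's last column used to be, and the original last column is pushed to the end, so nothing is dropped.
C1, 1, 0, 1, 0, 2.3, 1
C2, 1, 0, 1, 1, 1.8, 0
C3, 0, 0, 1, 1, 1.2, 0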
Try also
join -t, -o1.1,1.2,1.3,1.4,1.5,2.2 <(sort file1) <(sort file2)
answered Nov 15 at 10:40
RudiC
It will work. But when I have more columns, it has to be something else. Can you guide?
– Coder
Nov 15 at 10:45
Be more specific.
– RudiC
Nov 15 at 10:48
Let us say file 1 has 100 columns and file 2 has 20.
– Coder
Nov 16 at 2:56
While in principle that might be doable, it will be quite some typing effort. Did you consider a cut approach?
– RudiC
Nov 16 at 8:09
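A minimal sketch of that cut idea, assuming (as in the original example) that the field to drop is the last one contributed by file1:
# join on column 1 (7 fields in the result here), then drop field 6 by position
join -t, <(sort file1) <(sort file2) | cut -d, -f1-5,7
For a 100-column file1 joined with a 20-column file2, the same pattern would just use a longer field list, e.g. cut -d, -f1-99,101-, instead of spelling out an -o list for join.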