Join two files based on a common field
I'm trying to join two files together based on their common field, and output it to a file, using the join
command.
File 1:
C01:Nancy:Jones:njones@abc.com
C02:Barbara:Madison:bmadison@bcd.com
C03:Tim:Adams:tadams@cde.com
C04:Sarah:Moore:smoore@def.com
C05:John:Polk:jpolk@efg.com
C06:Paula:Jacobs:pjacobs@fgh.com
File 2:
R001:07/04/15:123.45:C01
R002:12/20/15:167.50:C03
R003:03/14/16:298.00:C06
R005:09/15/16:36.50:C03
R005:11/27/16:58.00:C02
R006:02/28/17:72.98:C05
Expected Output
C01:Nancy:Jones:njones@abc.com:R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com:R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R004:09/15/16:36.5
C05:John:Polk:jpolk@efg.com:R006:02/28/17:72.9
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
I've tried sorting File2
based on field 4 first and then outputting it to a new file
sort -t: -k 4 File2 > File22
then joining
join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' File1 File22 > File 3
but I always get a blank output file. I'm also not sure how to handle the omission of row 4 from File1
and the fact that C03
comes up twice in File2
.
text-processing join
|
show 1 more comment
I'm trying to join two files together based on their common field, and output it to a file, using the join
command.
File 1:
C01:Nancy:Jones:njones@abc.com
C02:Barbara:Madison:bmadison@bcd.com
C03:Tim:Adams:tadams@cde.com
C04:Sarah:Moore:smoore@def.com
C05:John:Polk:jpolk@efg.com
C06:Paula:Jacobs:pjacobs@fgh.com
File 2:
R001:07/04/15:123.45:C01
R002:12/20/15:167.50:C03
R003:03/14/16:298.00:C06
R005:09/15/16:36.50:C03
R005:11/27/16:58.00:C02
R006:02/28/17:72.98:C05
Expected Output
C01:Nancy:Jones:njones@abc.com:R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com:R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R004:09/15/16:36.5
C05:John:Polk:jpolk@efg.com:R006:02/28/17:72.9
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
I've tried sorting File2
based on field 4 first and then outputting it to a new file
sort -t: -k 4 File2 > File22
then joining
join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' File1 File22 > File 3
but I always get a blank output file. I'm also not sure how to handle the omission of row 4 from File1
and the fact that C03
comes up twice in File2
.
text-processing join
1
Nicely worded question: example input, expected output, demonstration of attempts to solve the problem yourself. +1
– cryptarch
Jan 9 at 20:57
Your code works fine here, it produces the expected output. Are you working with dos files or unix files ?
– don_crissti
Jan 9 at 21:02
@don_crissti Not here. SinceFile 3
is not surrounded by quotation marks I get an errorjoin: extra operand '3'
; and an empty file namedFile
is created. Furthermore, when I run the command without redirection, the records forC04
andC05
are missing from the output, whereas Jim has stated they should appear in the output.
– cryptarch
Jan 9 at 21:06
@cryptarch - I assumeFile 3
is a typo and the OP meantFile3
(or whatever the real name of the file is). There is no mention of an error in the post. As to the records missing, onlyC04
is missing from the output here - which is correct, according to the "Expected output" posted by the OP. As I said, I get the exact output that is expected. I'm not sure how to read the part that says "how to handle the omission of row 4 from File1 and the fact that C03 comes up twice in File2" maybe the OP can elaborate on that...
– don_crissti
Jan 9 at 21:14
@don_crissti Yeah the File 3 was a typo meant File3. To clarify what you're saying, when you ran this using those two files you got the expected output? I created both files in the vi editor, i'll go back over them, maybe I mistyped something.
– Jim
Jan 9 at 21:18
|
show 1 more comment
I'm trying to join two files together based on their common field, and output it to a file, using the join
command.
File 1:
C01:Nancy:Jones:njones@abc.com
C02:Barbara:Madison:bmadison@bcd.com
C03:Tim:Adams:tadams@cde.com
C04:Sarah:Moore:smoore@def.com
C05:John:Polk:jpolk@efg.com
C06:Paula:Jacobs:pjacobs@fgh.com
File 2:
R001:07/04/15:123.45:C01
R002:12/20/15:167.50:C03
R003:03/14/16:298.00:C06
R005:09/15/16:36.50:C03
R005:11/27/16:58.00:C02
R006:02/28/17:72.98:C05
Expected Output
C01:Nancy:Jones:njones@abc.com:R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com:R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R004:09/15/16:36.5
C05:John:Polk:jpolk@efg.com:R006:02/28/17:72.9
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
I've tried sorting File2
based on field 4 first and then outputting it to a new file
sort -t: -k 4 File2 > File22
then joining
join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' File1 File22 > File 3
but I always get a blank output file. I'm also not sure how to handle the omission of row 4 from File1
and the fact that C03
comes up twice in File2
.
text-processing join
I'm trying to join two files together based on their common field, and output it to a file, using the join
command.
File 1:
C01:Nancy:Jones:njones@abc.com
C02:Barbara:Madison:bmadison@bcd.com
C03:Tim:Adams:tadams@cde.com
C04:Sarah:Moore:smoore@def.com
C05:John:Polk:jpolk@efg.com
C06:Paula:Jacobs:pjacobs@fgh.com
File 2:
R001:07/04/15:123.45:C01
R002:12/20/15:167.50:C03
R003:03/14/16:298.00:C06
R005:09/15/16:36.50:C03
R005:11/27/16:58.00:C02
R006:02/28/17:72.98:C05
Expected Output
C01:Nancy:Jones:njones@abc.com:R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com:R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R004:09/15/16:36.5
C05:John:Polk:jpolk@efg.com:R006:02/28/17:72.9
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
I've tried sorting File2
based on field 4 first and then outputting it to a new file
sort -t: -k 4 File2 > File22
then joining
join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' File1 File22 > File 3
but I always get a blank output file. I'm also not sure how to handle the omission of row 4 from File1
and the fact that C03
comes up twice in File2
.
text-processing join
text-processing join
edited Jan 10 at 19:00
Rui F Ribeiro
39.5k1479133
39.5k1479133
asked Jan 9 at 20:52
JimJim
161
161
1
Nicely worded question: example input, expected output, demonstration of attempts to solve the problem yourself. +1
– cryptarch
Jan 9 at 20:57
Your code works fine here, it produces the expected output. Are you working with dos files or unix files ?
– don_crissti
Jan 9 at 21:02
@don_crissti Not here. SinceFile 3
is not surrounded by quotation marks I get an errorjoin: extra operand '3'
; and an empty file namedFile
is created. Furthermore, when I run the command without redirection, the records forC04
andC05
are missing from the output, whereas Jim has stated they should appear in the output.
– cryptarch
Jan 9 at 21:06
@cryptarch - I assumeFile 3
is a typo and the OP meantFile3
(or whatever the real name of the file is). There is no mention of an error in the post. As to the records missing, onlyC04
is missing from the output here - which is correct, according to the "Expected output" posted by the OP. As I said, I get the exact output that is expected. I'm not sure how to read the part that says "how to handle the omission of row 4 from File1 and the fact that C03 comes up twice in File2" maybe the OP can elaborate on that...
– don_crissti
Jan 9 at 21:14
@don_crissti Yeah the File 3 was a typo meant File3. To clarify what you're saying, when you ran this using those two files you got the expected output? I created both files in the vi editor, i'll go back over them, maybe I mistyped something.
– Jim
Jan 9 at 21:18
|
show 1 more comment
1
Nicely worded question: example input, expected output, demonstration of attempts to solve the problem yourself. +1
– cryptarch
Jan 9 at 20:57
Your code works fine here, it produces the expected output. Are you working with dos files or unix files ?
– don_crissti
Jan 9 at 21:02
@don_crissti Not here. SinceFile 3
is not surrounded by quotation marks I get an errorjoin: extra operand '3'
; and an empty file namedFile
is created. Furthermore, when I run the command without redirection, the records forC04
andC05
are missing from the output, whereas Jim has stated they should appear in the output.
– cryptarch
Jan 9 at 21:06
@cryptarch - I assumeFile 3
is a typo and the OP meantFile3
(or whatever the real name of the file is). There is no mention of an error in the post. As to the records missing, onlyC04
is missing from the output here - which is correct, according to the "Expected output" posted by the OP. As I said, I get the exact output that is expected. I'm not sure how to read the part that says "how to handle the omission of row 4 from File1 and the fact that C03 comes up twice in File2" maybe the OP can elaborate on that...
– don_crissti
Jan 9 at 21:14
@don_crissti Yeah the File 3 was a typo meant File3. To clarify what you're saying, when you ran this using those two files you got the expected output? I created both files in the vi editor, i'll go back over them, maybe I mistyped something.
– Jim
Jan 9 at 21:18
1
1
Nicely worded question: example input, expected output, demonstration of attempts to solve the problem yourself. +1
– cryptarch
Jan 9 at 20:57
Nicely worded question: example input, expected output, demonstration of attempts to solve the problem yourself. +1
– cryptarch
Jan 9 at 20:57
Your code works fine here, it produces the expected output. Are you working with dos files or unix files ?
– don_crissti
Jan 9 at 21:02
Your code works fine here, it produces the expected output. Are you working with dos files or unix files ?
– don_crissti
Jan 9 at 21:02
@don_crissti Not here. Since
File 3
is not surrounded by quotation marks I get an error join: extra operand '3'
; and an empty file named File
is created. Furthermore, when I run the command without redirection, the records for C04
and C05
are missing from the output, whereas Jim has stated they should appear in the output.– cryptarch
Jan 9 at 21:06
@don_crissti Not here. Since
File 3
is not surrounded by quotation marks I get an error join: extra operand '3'
; and an empty file named File
is created. Furthermore, when I run the command without redirection, the records for C04
and C05
are missing from the output, whereas Jim has stated they should appear in the output.– cryptarch
Jan 9 at 21:06
@cryptarch - I assume
File 3
is a typo and the OP meant File3
(or whatever the real name of the file is). There is no mention of an error in the post. As to the records missing, only C04
is missing from the output here - which is correct, according to the "Expected output" posted by the OP. As I said, I get the exact output that is expected. I'm not sure how to read the part that says "how to handle the omission of row 4 from File1 and the fact that C03 comes up twice in File2" maybe the OP can elaborate on that...– don_crissti
Jan 9 at 21:14
@cryptarch - I assume
File 3
is a typo and the OP meant File3
(or whatever the real name of the file is). There is no mention of an error in the post. As to the records missing, only C04
is missing from the output here - which is correct, according to the "Expected output" posted by the OP. As I said, I get the exact output that is expected. I'm not sure how to read the part that says "how to handle the omission of row 4 from File1 and the fact that C03 comes up twice in File2" maybe the OP can elaborate on that...– don_crissti
Jan 9 at 21:14
@don_crissti Yeah the File 3 was a typo meant File3. To clarify what you're saying, when you ran this using those two files you got the expected output? I created both files in the vi editor, i'll go back over them, maybe I mistyped something.
– Jim
Jan 9 at 21:18
@don_crissti Yeah the File 3 was a typo meant File3. To clarify what you're saying, when you ran this using those two files you got the expected output? I created both files in the vi editor, i'll go back over them, maybe I mistyped something.
– Jim
Jan 9 at 21:18
|
show 1 more comment
2 Answers
2
active
oldest
votes
There are two things you need to do to get things working the way you want. First of all, by appending > File 3
to the end of the command, you are redirecting the output to File
and also adding the extra argument 3
to the join
command.
To explain why this happens, I recently (and coincidentally!) answered a similar question at Strange behaviour redirecting man pages. Basically, it has to do with Bash grammar, which you can read more about in man bash
.
To fix that behaviour, you should either ensure there are no spaces in the output file, or you should escape the space, or you should surround the filename with quotation marks. Any of the following redirections would work:
... > File3
... > File 3
... > "File 3"
Now to address the missing records from the output. This is handled by the -a
flag. From man join
:
-a FILENUM
also print unpairable lines from file FILENUM, where FILENUM is 1 or 2, corresponding to FILE1 or FILE2
Putting it all together, the following command works for me (I will use tee
rather than a redirection to show the output):
$ join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' -a 1 File1 <(sort -t: -k4 File2) | tee "File 3"
C01:Nancy:Jones:njones@abc.com :R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com :R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R005:09/15/16:36.50
C04:Sarah:Moore:smoore@def.com:::
C05:John:Polk:jpolk@efg.com:::
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
add a comment |
with Miller http://johnkerl.org/miller/doc/reference.html the command is
mlr --csv --implicit-csv-header --headerless-csv-output --fs ":" join --ul -l 4 -r 1 -j 4 --lp l --rp r -f file2.csv file1.csv
And you have
C01:R001:07/04/15:123.45:Nancy:Jones:njones@abc.com
C02:R005:11/27/16:58.00:Barbara:Madison:bmadison@bcd.com
C03:R002:12/20/15:167.50:Tim:Adams:tadams@cde.com
C03:R005:09/15/16:36.50:Tim:Adams:tadams@cde.com
C05:R006:02/28/17:72.98:John:Polk:jpolk@efg.com
C06:R003:03/14/16:298.00:Paula:Jacobs:pjacobs@fgh.com
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "106"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f493551%2fjoin-two-files-based-on-a-common-field%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
There are two things you need to do to get things working the way you want. First of all, by appending > File 3
to the end of the command, you are redirecting the output to File
and also adding the extra argument 3
to the join
command.
To explain why this happens, I recently (and coincidentally!) answered a similar question at Strange behaviour redirecting man pages. Basically, it has to do with Bash grammar, which you can read more about in man bash
.
To fix that behaviour, you should either ensure there are no spaces in the output file, or you should escape the space, or you should surround the filename with quotation marks. Any of the following redirections would work:
... > File3
... > File 3
... > "File 3"
Now to address the missing records from the output. This is handled by the -a
flag. From man join
:
-a FILENUM
also print unpairable lines from file FILENUM, where FILENUM is 1 or 2, corresponding to FILE1 or FILE2
Putting it all together, the following command works for me (I will use tee
rather than a redirection to show the output):
$ join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' -a 1 File1 <(sort -t: -k4 File2) | tee "File 3"
C01:Nancy:Jones:njones@abc.com :R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com :R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R005:09/15/16:36.50
C04:Sarah:Moore:smoore@def.com:::
C05:John:Polk:jpolk@efg.com:::
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
add a comment |
There are two things you need to do to get things working the way you want. First of all, by appending > File 3
to the end of the command, you are redirecting the output to File
and also adding the extra argument 3
to the join
command.
To explain why this happens, I recently (and coincidentally!) answered a similar question at Strange behaviour redirecting man pages. Basically, it has to do with Bash grammar, which you can read more about in man bash
.
To fix that behaviour, you should either ensure there are no spaces in the output file, or you should escape the space, or you should surround the filename with quotation marks. Any of the following redirections would work:
... > File3
... > File 3
... > "File 3"
Now to address the missing records from the output. This is handled by the -a
flag. From man join
:
-a FILENUM
also print unpairable lines from file FILENUM, where FILENUM is 1 or 2, corresponding to FILE1 or FILE2
Putting it all together, the following command works for me (I will use tee
rather than a redirection to show the output):
$ join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' -a 1 File1 <(sort -t: -k4 File2) | tee "File 3"
C01:Nancy:Jones:njones@abc.com :R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com :R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R005:09/15/16:36.50
C04:Sarah:Moore:smoore@def.com:::
C05:John:Polk:jpolk@efg.com:::
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
add a comment |
There are two things you need to do to get things working the way you want. First of all, by appending > File 3
to the end of the command, you are redirecting the output to File
and also adding the extra argument 3
to the join
command.
To explain why this happens, I recently (and coincidentally!) answered a similar question at Strange behaviour redirecting man pages. Basically, it has to do with Bash grammar, which you can read more about in man bash
.
To fix that behaviour, you should either ensure there are no spaces in the output file, or you should escape the space, or you should surround the filename with quotation marks. Any of the following redirections would work:
... > File3
... > File 3
... > "File 3"
Now to address the missing records from the output. This is handled by the -a
flag. From man join
:
-a FILENUM
also print unpairable lines from file FILENUM, where FILENUM is 1 or 2, corresponding to FILE1 or FILE2
Putting it all together, the following command works for me (I will use tee
rather than a redirection to show the output):
$ join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' -a 1 File1 <(sort -t: -k4 File2) | tee "File 3"
C01:Nancy:Jones:njones@abc.com :R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com :R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R005:09/15/16:36.50
C04:Sarah:Moore:smoore@def.com:::
C05:John:Polk:jpolk@efg.com:::
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
There are two things you need to do to get things working the way you want. First of all, by appending > File 3
to the end of the command, you are redirecting the output to File
and also adding the extra argument 3
to the join
command.
To explain why this happens, I recently (and coincidentally!) answered a similar question at Strange behaviour redirecting man pages. Basically, it has to do with Bash grammar, which you can read more about in man bash
.
To fix that behaviour, you should either ensure there are no spaces in the output file, or you should escape the space, or you should surround the filename with quotation marks. Any of the following redirections would work:
... > File3
... > File 3
... > "File 3"
Now to address the missing records from the output. This is handled by the -a
flag. From man join
:
-a FILENUM
also print unpairable lines from file FILENUM, where FILENUM is 1 or 2, corresponding to FILE1 or FILE2
Putting it all together, the following command works for me (I will use tee
rather than a redirection to show the output):
$ join -t: -1 1 -2 4 -o '1.1 1.2 1.3 1.4 2.1 2.2 2.3' -a 1 File1 <(sort -t: -k4 File2) | tee "File 3"
C01:Nancy:Jones:njones@abc.com :R001:07/04/15:123.45
C02:Barbara:Madison:bmadison@bcd.com :R005:11/27/16:58.00
C03:Tim:Adams:tadams@cde.com:R002:12/20/15:167.50
C03:Tim:Adams:tadams@cde.com:R005:09/15/16:36.50
C04:Sarah:Moore:smoore@def.com:::
C05:John:Polk:jpolk@efg.com:::
C06:Paula:Jacobs:pjacobs@fgh.com:R003:03/14/16:298.00
answered Jan 9 at 21:16
cryptarchcryptarch
76011
76011
add a comment |
add a comment |
with Miller http://johnkerl.org/miller/doc/reference.html the command is
mlr --csv --implicit-csv-header --headerless-csv-output --fs ":" join --ul -l 4 -r 1 -j 4 --lp l --rp r -f file2.csv file1.csv
And you have
C01:R001:07/04/15:123.45:Nancy:Jones:njones@abc.com
C02:R005:11/27/16:58.00:Barbara:Madison:bmadison@bcd.com
C03:R002:12/20/15:167.50:Tim:Adams:tadams@cde.com
C03:R005:09/15/16:36.50:Tim:Adams:tadams@cde.com
C05:R006:02/28/17:72.98:John:Polk:jpolk@efg.com
C06:R003:03/14/16:298.00:Paula:Jacobs:pjacobs@fgh.com
add a comment |
with Miller http://johnkerl.org/miller/doc/reference.html the command is
mlr --csv --implicit-csv-header --headerless-csv-output --fs ":" join --ul -l 4 -r 1 -j 4 --lp l --rp r -f file2.csv file1.csv
And you have
C01:R001:07/04/15:123.45:Nancy:Jones:njones@abc.com
C02:R005:11/27/16:58.00:Barbara:Madison:bmadison@bcd.com
C03:R002:12/20/15:167.50:Tim:Adams:tadams@cde.com
C03:R005:09/15/16:36.50:Tim:Adams:tadams@cde.com
C05:R006:02/28/17:72.98:John:Polk:jpolk@efg.com
C06:R003:03/14/16:298.00:Paula:Jacobs:pjacobs@fgh.com
add a comment |
with Miller http://johnkerl.org/miller/doc/reference.html the command is
mlr --csv --implicit-csv-header --headerless-csv-output --fs ":" join --ul -l 4 -r 1 -j 4 --lp l --rp r -f file2.csv file1.csv
And you have
C01:R001:07/04/15:123.45:Nancy:Jones:njones@abc.com
C02:R005:11/27/16:58.00:Barbara:Madison:bmadison@bcd.com
C03:R002:12/20/15:167.50:Tim:Adams:tadams@cde.com
C03:R005:09/15/16:36.50:Tim:Adams:tadams@cde.com
C05:R006:02/28/17:72.98:John:Polk:jpolk@efg.com
C06:R003:03/14/16:298.00:Paula:Jacobs:pjacobs@fgh.com
with Miller http://johnkerl.org/miller/doc/reference.html the command is
mlr --csv --implicit-csv-header --headerless-csv-output --fs ":" join --ul -l 4 -r 1 -j 4 --lp l --rp r -f file2.csv file1.csv
And you have
C01:R001:07/04/15:123.45:Nancy:Jones:njones@abc.com
C02:R005:11/27/16:58.00:Barbara:Madison:bmadison@bcd.com
C03:R002:12/20/15:167.50:Tim:Adams:tadams@cde.com
C03:R005:09/15/16:36.50:Tim:Adams:tadams@cde.com
C05:R006:02/28/17:72.98:John:Polk:jpolk@efg.com
C06:R003:03/14/16:298.00:Paula:Jacobs:pjacobs@fgh.com
answered Jan 10 at 8:29
aborrusoaborruso
20619
20619
add a comment |
add a comment |
Thanks for contributing an answer to Unix & Linux Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f493551%2fjoin-two-files-based-on-a-common-field%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Nicely worded question: example input, expected output, demonstration of attempts to solve the problem yourself. +1
– cryptarch
Jan 9 at 20:57
Your code works fine here, it produces the expected output. Are you working with dos files or unix files ?
– don_crissti
Jan 9 at 21:02
@don_crissti Not here. Since
File 3
is not surrounded by quotation marks I get an errorjoin: extra operand '3'
; and an empty file namedFile
is created. Furthermore, when I run the command without redirection, the records forC04
andC05
are missing from the output, whereas Jim has stated they should appear in the output.– cryptarch
Jan 9 at 21:06
@cryptarch - I assume
File 3
is a typo and the OP meantFile3
(or whatever the real name of the file is). There is no mention of an error in the post. As to the records missing, onlyC04
is missing from the output here - which is correct, according to the "Expected output" posted by the OP. As I said, I get the exact output that is expected. I'm not sure how to read the part that says "how to handle the omission of row 4 from File1 and the fact that C03 comes up twice in File2" maybe the OP can elaborate on that...– don_crissti
Jan 9 at 21:14
@don_crissti Yeah the File 3 was a typo meant File3. To clarify what you're saying, when you ran this using those two files you got the expected output? I created both files in the vi editor, i'll go back over them, maybe I mistyped something.
– Jim
Jan 9 at 21:18