How to operate on all columns with datamash?
Suppose I have the following data file:
111 222 333
444 555 666
777 888 999
I'm able to calculate the sum per column with GNU Datamash like this:
cat foo | datamash -t ' ' sum 1 sum 2 sum 3
1332 1665 1998
How would I do this with datamash if I didn't know the number of columns in my data file?
I'm asking because, for example, cut supports end-of-range symbols like - for its field selector.
shell text-processing
edited Feb 22 at 15:53
DopeGhoti
43.1k55382
asked Feb 22 at 15:44
w177us
61
4 Answers
cols=$(awk '{print NF; exit}' foo); cat foo | datamash -t ' ' sum 1-$cols
or
cat foo | datamash -t ' ' sum 1-$(awk '{print NF; exit}' foo)
datamash has a feature to specify column ranges, so calculate the number of columns and use that result as part of the range spec. In my example solution, I used awk to check only the first line of the file and exit, but you could use anything else that suits your fancy. datamash itself has a check operation whose output includes the number of columns, but in a format that would still need to be parsed for the specific number that's of interest to you.
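As a rough sketch of the approach above, the two steps can be wrapped in small shell functions. The names count_columns and sum_all_columns are made up for illustration, and the sketch assumes whitespace-delimited input in which every row has as many fields as the first one.

```shell
# Hypothetical helper: print the number of whitespace-separated
# fields on the first line of the file given as $1.
count_columns() {
    awk '{ print NF; exit }' "$1"
}

# Hypothetical helper: sum every column of the file given as $1.
# Assumes GNU datamash with field-range support is installed.
sum_all_columns() {
    cols=$(count_columns "$1")
    # An empty file (or empty first line) gives no usable count;
    # bail out rather than hand datamash a broken range.
    if [ "${cols:-0}" -le 0 ]; then
        echo "no columns found in $1" >&2
        return 1
    fi
    datamash -W sum "1-$cols" < "$1"
}
```

With the sample file from the question, `sum_all_columns foo` should print the three column sums. Counting only the first line keeps the check cheap, at the cost of not noticing rows of a different width.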
answered Feb 22 at 16:19
user1404316
2,324520
I don't see an option to specify an unknown range in the datamash manual.
Try this perl one-liner:
$ perl -lane '$s[$_]+=$F[$_] for 0..$#F; END{print join " ", @s}' ip.txt
1332 1665 1998
-a option auto-splits each input line on whitespace; the results are saved in the @F array
for 0..$#F loops over the array; $#F gives the index of the last element
$s[$_]+=$F[$_] saves the sum in the @s array; by default the initial value will be 0 in numeric context. $_ holds the index value for each iteration
END{print join " ", @s} after processing all input lines, prints the contents of the @s array with a space as separator
answered Feb 23 at 4:20
Sundeep
7,0911826
I don't know about datamash, but here is an awk solution:
$ awk '{ for (col=1; col<=NF; col++) totals[col]+=$col; if (NF>n) n=NF } END { for (col=1; col<=n; col++) printf "%s%s", totals[col], (col<n ? " " : "\n") }' input
1332 1665 1998
To make that awk script more readable:
{ # execute on all records
    for (col=1; col<=NF; col++) {
        totals[col]+=$col
    }
    if (NF>n) n=NF # remember the widest record seen
}
END { # execute after all records processed
    for (col=1; col<=n; col++) {
        printf "%s%s", totals[col], (col<n ? " " : "\n")
    }
}
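The accumulate-then-print logic above can also be packaged as a shell function that reads standard input. This is only a sketch: the name awk_column_sums is invented here, and it assumes that a row with fewer fields simply contributes nothing to the columns it lacks.

```shell
# Hypothetical helper: print the sum of each whitespace-separated column of stdin.
awk_column_sums() {
    awk '
        {
            for (col = 1; col <= NF; col++) totals[col] += $col
            if (NF > width) width = NF   # remember the widest row seen
        }
        END {
            # "+ 0" forces a numeric 0 for columns that never received a value
            for (col = 1; col <= width; col++)
                printf "%s%s", totals[col] + 0, (col < width ? " " : "\n")
        }
    '
}

# Sample data from the question; prints: 1332 1665 1998
printf '111 222 333\n444 555 666\n777 888 999\n' | awk_column_sums
```

Tracking the widest row, rather than using the array length, means the loop bounds do not depend on how a particular awk implements length() on arrays.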
answered Feb 22 at 16:02
DopeGhoti
43.1k55382
Using datamash and bash:
n=($(datamash -W check < foo)); datamash -W sum 1-${n[2]} < foo
Output:
1332 1665 1998
How it works:
datamash -W check < foo outputs the string "3 lines, 3 fields".
n=($(datamash -W check < foo)) loads that string into an array $n. We want the number of fields, which would be ${n[2]}.
datamash -W sum 1-${n[2]} < foo does the rest.
This can also be done with a POSIX shell, using a complex printf formatting string instead of an array, but it's gnarlier:
datamash -W sum 1-$(printf '%0.0s%0.0s%s%0.0s' $(datamash -W check < foo)) < foo
It can also be done with shell tools:
datamash -W sum 1-$(head -n 1 foo | wc -w) < foo
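As a sketch of the parsing step described above, the field count can also be pulled out of the check summary with awk instead of a bash array or a printf format. The function name fields_from_check is made up, and the sketch assumes the summary has the "N lines, M fields" form quoted above.

```shell
# Hypothetical helper: read a summary such as "3 lines, 3 fields" on stdin
# and print the field count (the third whitespace-separated word).
fields_from_check() {
    awk '{ print $3; exit }'
}

# Intended use (requires GNU datamash):
#   datamash -W sum "1-$(datamash -W check < foo | fields_from_check)" < foo

echo "3 lines, 3 fields" | fields_from_check   # prints 3
```

Keeping the parsing in its own function makes it easy to adjust in one place if the summary format turns out to differ from the one assumed here.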
edited Dec 17 at 3:58
answered Dec 17 at 3:50
agc
4,43111036
The first two methods were suggested by user1404316's answer.
– agc
Dec 17 at 4:06