How to operate on all columns with datamash?

Suppose I have the following data file:

111 222 333
444 555 666
777 888 999

I'm able to calculate the sum per column with GNU Datamash like this:

cat foo | datamash -t ' ' sum 1 sum 2 sum 3

1332 1665 1998

How would I do this with datamash if I didn't know the number of columns in my data file?

I'm asking because cut, for example, supports open-ended ranges such as 2- in its field selector.
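For comparison, this is the kind of open-ended range cut accepts. It is only a small illustration; the one-line sample input is made up for the example.

```shell
# "-f 2-" selects field 2 through the last field, however many fields there are.
rest=$(printf '111 222 333\n' | cut -d ' ' -f 2-)
echo "$rest"   # prints: 222 333
```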

edited Feb 22 at 15:53

DopeGhoti


asked Feb 22 at 15:44

w177us

shell text-processing


4 Answers

cols=$( awk '{print NF; exit}' foo); cat foo | datamash -t ' ' sum 1-$cols

cat foo | datamash -t ' ' sum 1-$( awk '{print NF; exit}' foo)

datamash accepts column ranges, so count the columns first and use that number as the end of the range. In my example I used awk to read the field count from the first line of the file and then exit, but any other method works. datamash itself has a check operation whose output includes the number of columns, but that output still has to be parsed to extract the number.
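The same idea can be wrapped in a pair of small shell functions. This is only a sketch: the names count_cols and sum_all_cols are made up for illustration, and it assumes whitespace-separated input (hence -W) and that the first line has as many fields as every other line.

```shell
# count_cols FILE: print the number of whitespace-separated fields on the first line.
count_cols() {
    awk '{ print NF; exit }' "$1"
}

# sum_all_cols FILE: sum every column with datamash (hypothetical helper name).
# -W tells datamash to treat runs of whitespace as the field separator.
sum_all_cols() {
    cols=$(count_cols "$1")
    # An empty file or a blank first line gives no usable count;
    # refuse rather than hand datamash a bad range.
    [ -n "$cols" ] && [ "$cols" -gt 0 ] || { echo "no columns found in $1" >&2; return 1; }
    datamash -W sum "1-$cols" < "$1"
}
```

With the sample file from the question, sum_all_cols foo should print the three column sums.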

answered Feb 22 at 16:19

user1404316


I don't see an option in the datamash manual for specifying a range of unknown length.

Try this perl one-liner:

$ perl -lane '$s[$_]+=$F[$_] for 0..$#F; END{print join " ", @s}' ip.txt

1332 1665 1998

-a auto-splits each input line on whitespace and stores the fields in the @F array

for 0..$#F loops over the array indices; $#F is the index of the last element

$s[$_]+=$F[$_] accumulates each column's sum in the @s array; elements start out as 0 in numeric context, and $_ holds the index on each iteration

END{print join " ", @s} prints the contents of @s, separated by spaces, after all input lines have been processed

answered Feb 23 at 4:20

Sundeep


I don't know about datamash, but here is an awk solution:

$ awk '{ for( col=1; col<=NF; col++ ) { totals[col]+=$col } } END { for( col=1; col<=length(totals); col++ ) {printf "%s ", totals[col]}; printf "\n" } ' input

1332 1665 1998

To make that awk script more readable:

{      # execute on all records
  for( col=1; col<=NF; col++ ) {
    totals[col]+=$col
  }
}
END {  # execute after all records processed
  # awk fields are numbered from 1, so the output loop starts at 1 as well
  for( col=1; col<=length(totals); col++ ) {
    printf "%s ", totals[col]
  }
  printf "\n"
}
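A variant of the script above, wrapped in a shell function, might look like the following. It is a sketch rather than a drop-in replacement: the function name sum_columns is invented here, and it tracks the widest row itself instead of calling length() on an array, which not every awk implementation supports.

```shell
# sum_columns [FILE...]: print per-column sums of whitespace-separated input.
# Reads standard input when no file is given.
sum_columns() {
    awk '
        {
            if (NF > max) max = NF          # remember the widest row seen so far
            for (col = 1; col <= NF; col++)
                totals[col] += $col
        }
        END {
            # Separate sums with a space and end the line after the last one.
            for (col = 1; col <= max; col++)
                printf "%s%s", totals[col], (col < max ? " " : "\n")
        }
    ' "$@"
}
```

Running sum_columns on the sample data from the question gives 1332 1665 1998 with no trailing space.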

answered Feb 22 at 16:02

DopeGhoti


Using datamash and bash:

n=($(datamash -W check < foo)); datamash -W sum 1-${n[2]} < foo

Output:

1332    1665    1998

How it works:

datamash -W check < foo outputs the string "3 lines, 3 fields".

n=($(datamash -W check < foo)) splits that string on whitespace and loads the words into the array n. We want the number of fields, which is the third word, ${n[2]}.

datamash -W sum 1-${n[2]} < foo does the rest.

This can also be done with a POSIX shell, using a complex printf formatting string instead of an array, but it's gnarlier:

datamash -W sum 1-$(printf '%0.0s%0.0s%s%0.0s' $(datamash -W check < foo)) < foo

It can also be done with standard shell tools, counting the words on the first line:

datamash -W sum 1-$(head -n 1 foo | wc -w) < foo
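If the printf trick feels too opaque, the field count can also be pulled out of the check output with awk. This is a sketch that assumes the output has the shape quoted above ("3 lines, 3 fields"); the helper name fields_from_check is made up for illustration.

```shell
# fields_from_check: read a line shaped like "3 lines, 3 fields" on stdin
# and print the field count, i.e. the third whitespace-separated word.
fields_from_check() {
    awk '{ print $3; exit }'
}

# Hypothetical usage, assuming datamash is installed and foo is the data file:
#   datamash -W sum "1-$(datamash -W check < foo | fields_from_check)" < foo
```

This works in any POSIX shell because it needs neither arrays nor a special printf format string.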

edited Dec 17 at 3:58

answered Dec 17 at 3:50

agc


The first two methods were suggested by user1404316's answer.
– agc
Dec 17 at 4:06


