How to operate on all columns with datamash?












Suppose I have the following data file:

111 222 333
444 555 666
777 888 999

I can calculate the sum of each column with GNU Datamash like this:

cat foo | datamash -t ' ' sum 1 sum 2 sum 3
1332 1665 1998

How would I do this with datamash if I didn't know the number of columns in my data file?

I'm asking because cut, for example, supports open-ended ranges such as 2- in its field selector.
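
For reference, the cut behaviour alluded to here can be sketched as follows. The input line is made up for illustration; the trailing - leaves the range open, so it runs to the last field however many fields there are.

```shell
# Select field 2 through the last field, whatever the field count is.
# The input line is made up for illustration.
rest=$(printf 'a b c d\n' | cut -d ' ' -f 2-)
echo "$rest"   # prints: b c d
```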







      shell text-processing






edited Feb 22 at 15:53 by DopeGhoti
asked Feb 22 at 15:44 by w177us






















          4 Answers


















cols=$(awk '{print NF; exit}' foo); cat foo | datamash -t ' ' sum 1-$cols

or

cat foo | datamash -t ' ' sum 1-$(awk '{print NF; exit}' foo)

datamash can take a range of columns, so count the columns first and use that number as the end of the range. In this example, awk reads only the first line of the file, prints its field count, and exits, but any other way of counting fields would do. datamash itself has a check operation whose output includes the number of columns, but in a format that still has to be parsed to extract that number.
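
The same idea could be wrapped in a small helper. This is only a sketch: field_range is a hypothetical name, and it assumes every line has as many whitespace-separated fields as the first one.

```shell
# Hypothetical helper: print a datamash-style field range "1-N" for a file,
# where N is the number of whitespace-separated fields on its first line.
field_range() {
    awk '{ print "1-" NF; exit }' "$1"
}

# Demo on a throwaway copy of the sample data from the question.
tmp=$(mktemp)
printf '%s\n' '111 222 333' '444 555 666' '777 888 999' > "$tmp"
range=$(field_range "$tmp")
echo "$range"   # prints: 1-3
# With GNU Datamash installed, the range could then be used as:
#   datamash -W sum "$range" < "$tmp"
rm -f "$tmp"
```

Only the first line is inspected, so a file with ragged rows would need a different check.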







answered Feb 22 at 16:19 by user1404316

























I don't see an option in the datamash manual for specifying an open-ended range.

Try this perl one-liner:

$ perl -lane '$s[$_]+=$F[$_] for 0..$#F; END{print join " ", @s}' ip.txt
1332 1665 1998

• The -a option splits each input line on whitespace and stores the fields in the @F array.
• for 0..$#F loops over the array indices; $#F is the index of the last element.
• $s[$_]+=$F[$_] accumulates each column's sum in the @s array. Unset elements count as 0 in numeric context, and $_ holds the index on each iteration.
• END{print join " ", @s} prints the contents of @s, separated by spaces, after all input lines have been processed.







answered Feb 23 at 4:20 by Sundeep























I don't know about datamash, but here is an awk solution:

$ awk '{ for (col = 1; col <= NF; col++) { totals[col] += $col } } END { for (col = 1; col <= length(totals); col++) { printf "%s ", totals[col] }; printf "\n" }' input
1332 1665 1998

To make that awk script more readable:

{      # executed for every record
    for (col = 1; col <= NF; col++) {
        totals[col] += $col
    }
}
END {  # executed after all records have been processed
    for (col = 1; col <= length(totals); col++) {
        printf "%s ", totals[col]
    }
    printf "\n"
}
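
A slightly more defensive variant might look like the sketch below. It is not part of the answer above: sum_columns is a hypothetical name, and the script assumes that rows may differ in length, so it tracks the widest row itself instead of calling length() on the array, and it avoids the trailing space.

```shell
# Hypothetical helper: sum every whitespace-separated column of the input.
# Reads the named files, or standard input when no file is given.
sum_columns() {
    awk '
        {
            if (NF > n) n = NF                    # remember the widest row seen
            for (c = 1; c <= NF; c++) totals[c] += $c
        }
        END {
            # Separate values with spaces and end the line with a newline.
            for (c = 1; c <= n; c++)
                printf "%s%s", totals[c] + 0, (c < n ? " " : "\n")
        }
    ' "$@"
}

# Demo on the sample data from the question.
printf '%s\n' '111 222 333' '444 555 666' '777 888 999' | sum_columns
# prints: 1332 1665 1998
```

A column that is missing from a short row simply contributes nothing to that column's total.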






answered Feb 22 at 16:02 by DopeGhoti























Using datamash and bash:

n=($(datamash -W check < foo)); datamash -W sum 1-${n[2]} < foo

Output:

1332    1665    1998

How it works:

1. datamash -W check < foo outputs the string "3 lines, 3 fields".
2. n=($(datamash -W check < foo)) splits that string on whitespace into the array n. The number of fields is the third word, ${n[2]}.
3. datamash -W sum 1-${n[2]} < foo does the rest.
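
The word-splitting in step 2 can also be sketched without bash arrays. The helper name fields_from_check is hypothetical, and the sketch assumes the check output always has the shape quoted in step 1.

```shell
# Hypothetical helper: extract the field count from text shaped like
# "3 lines, 3 fields" (the check output quoted in step 1).
fields_from_check() {
    # Intentionally unquoted: let the shell split the text into words,
    # which become the positional parameters $1, $2, $3, $4.
    set -- $1
    echo "$3"
}

fields_from_check "3 lines, 3 fields"   # prints: 3
# With GNU Datamash installed, this could be combined as:
#   datamash -W sum 1-"$(fields_from_check "$(datamash -W check < foo)")" < foo
```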





This can also be done in a POSIX shell, using a printf format string instead of an array, but it's gnarlier:

datamash -W sum 1-$(printf '%0.0s%0.0s%s%0.0s' $(datamash -W check < foo)) < foo
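
To see what the format string does on its own, here is a small sketch with made-up words standing in for the check output. Each %0.0s consumes one argument and prints none of it, so only the third argument, the field count, survives.

```shell
# The words below stand in for the output of "datamash -W check";
# the numbers are made up for illustration.
count=$(printf '%0.0s%0.0s%s%0.0s' 5 lines, 9 fields)
echo "$count"   # prints: 9
```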


It can also be done with standard shell tools:

datamash -W sum 1-$(head -n 1 foo | wc -w) < foo






edited Dec 17 at 3:58
answered Dec 17 at 3:50 by agc












                                        • The first two methods were suggested by user1404316's answer.
                                          – agc
                                          Dec 17 at 4:06

















