From gpx to csv file











0 votes












<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>
<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>
<wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt>


I have a file containing lines like the ones above, which need to be converted into:

1.345529841,103.7577152,2010-01-01 00:00:00
1.345529841,103.7577152,2010-01-01 00:00:00
1.3982529841,103.90877152,2010-01-01 00:00:00









      linux awk sed xml














      edited Nov 24 at 19:41 by Rui F Ribeiro
      asked Feb 9 '17 at 6:00 by RKR






















          5 Answers

















          1 vote, accepted










          You could use sed to strip out the characters you don't want:



          sed 's/[^0-9.T:-]\+/,/g;s/T/ /;s/^,\|,$//g' file

          s/[^0-9.T:-]\+/,/g replaces each run of unwanted characters with a comma

          s/T/ / replaces the character T with a space

          s/^,\|,$//g removes the leading and trailing commas

          (\+ and \| are GNU sed extensions; with sed -E, write + and | unescaped.)
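To see what each substitution contributes, it can help to run them on a single sample line. This is only a sketch: it uses `sed -E` (extended regular expressions, so `+` and `|` need no backslash), and the variable names `line`, `step1`, and `csv` are made up for the illustration.

```sh
# Sample line copied from the question.
line='<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>'

# First substitution only: every run of unwanted characters becomes one comma,
# which leaves a comma at both ends of the line.
step1=$(printf '%s\n' "$line" | sed -E 's/[^0-9.T:-]+/,/g')

# All three substitutions: also turn the T into a space and trim the outer commas.
csv=$(printf '%s\n' "$line" | sed -E 's/[^0-9.T:-]+/,/g; s/T/ /; s/^,|,$//g')

printf '%s\n%s\n' "$step1" "$csv"
```

This relies on every line having exactly the layout shown in the question; any other digits, dots, colons, or hyphens on a line would end up in the output as well.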





























          • Yours is OK, but the commas are missing. We have to add them ourselves.
            – RKR
            Feb 9 '17 at 10:29

          • Didn't notice the comma; answer updated.
            – oliv
            Feb 9 '17 at 10:46

          • oliv, everything is fine, but instead of 103.75... in the second column I am getting 103,75 (for every value in column 2). Is anything wrong in the code?
            – RKR
            Feb 10 '17 at 1:57

          • Look at the updated answer; it should solve the comma/dot problem.
            – oliv
            Feb 10 '17 at 6:57

          • Sorry, I know this is technically correct, but ... sed is a really awful tool for XML parsing. XML is contextual, and regular expressions are not.
            – Sobrique
            Feb 10 '17 at 8:51


















          3 votes













          GPX is an XML format, so you can't use awk or sed to parse it reliably.



          Instead, use something like XMLStarlet:



          $ xml sel -t -m '//wpt' \
              -v '@lat' -o ',' -v '@lon' -o ',' \
              -v 'time' -nl data.gpx
          1.345529841,103.7577152,2010-01-01T00:00:00Z
          1.345529841,103.7577152,2010-01-01T00:00:00Z
          1.3982529841,103.90877152,2010-01-01T00:00:00Z


          Alternatively:



          $ xml sel -t -m '//wpt' -v 'concat(@lat, ",", @lon, ",", time)' -nl data.gpx
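The commands above keep the timestamp as `2010-01-01T00:00:00Z`, while the question asks for `2010-01-01 00:00:00`. One possible way to close that gap inside the same XPath expression is the XPath 1.0 `translate()` function. The sketch below is an assumption-laden illustration: the function name `gpx_to_csv` is made up, the binary is assumed to be installed as `xmlstarlet` (it may be `xml` on some systems), and the input is assumed to have no default XML namespace, as in the question's sample.

```sh
# gpx_to_csv FILE: print "lat,lon,YYYY-MM-DD HH:MM:SS" for every wpt element.
# The function name is hypothetical; it is not part of XMLStarlet.
gpx_to_csv() {
    # translate(time, "TZ", " ") maps T to a space and drops Z,
    # because Z has no counterpart in the replacement string.
    xmlstarlet sel -t -m '//wpt' \
        -v 'concat(@lat, ",", @lon, ",", translate(time, "TZ", " "))' \
        -n "$1"
}

# Usage: gpx_to_csv data.gpx > data.csv
```

If the real file declares the GPX namespace on its root element, `//wpt` will match nothing and the namespace has to be bound to a prefix first.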




























          • No command xml found. Should I download anything?
            – RKR
            Feb 9 '17 at 10:31

          • @RKR I'm using XMLStarlet. It's likely available as a package for your Linux. The command may sometimes be xmlstarlet rather than just xml.
            – Kusalananda
            Feb 9 '17 at 10:32


















          1 vote













          This answer is based on the input given:



          awk -F"[<>\"]" '{print $3,$5,$9}' OFS=, input.txt | sed "s/[TZ]/ /g"
          1.345529841,103.7577152,2010-01-01 00:00:00
          1.345529841,103.7577152,2010-01-01 00:00:00
          1.3982529841,103.90877152,2010-01-01 00:00:00




          awk -F"[<>\"]" '{gsub(/T|Z/," ",$9);print $3,$5,$9}' OFS=, input.txt
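The field separator contains a double quote, which is easy to get wrong when typing the command: an unbalanced quote leaves the shell waiting for more input. A variant that may be less error-prone puts the separator in single quotes, and it also removes the Z instead of turning it into a trailing space. This is a sketch on one sample line; the variable names `line` and `csv` are made up for the illustration.

```sh
# Sample line copied from the question.
line='<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>'

# Split on <, > and "; fields 3, 5 and 9 are lat, lon and the timestamp.
csv=$(printf '%s\n' "$line" |
    awk -F'[<>"]' -v OFS=, '{
        t = $9
        sub(/T/, " ", t)    # date/time separator becomes a space
        sub(/Z$/, "", t)    # drop the trailing UTC marker
        print $3, $5, t
    }')

printf '%s\n' "$csv"
```

Like the original, this only works while each waypoint sits on one line in exactly this layout.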




























          • It runs for quite a long time and does not seem to finish, even for a small file.
            – RKR
            Feb 9 '17 at 10:30

          • What is your OS? Please provide the exact command you typed in your terminal.
            – Kamaraj
            Feb 9 '17 at 23:27

          • It seems you missed some double quotes in the command, so the shell expects a closing quote and keeps waiting for it.
            – Kamaraj
            Feb 9 '17 at 23:47


















          1 vote













          Please, please - don't use a regular expression based solution, like awk or sed.



          XML is contextual, where regular expressions are not - so they can NEVER work properly, they're only at best a bit of a hack.



          But XML does have a solution to this problem: XPath, which lets you 'search' in a contextual way.



          So to take your example:



          #!/usr/bin/perl

          use warnings;
          use strict;
          use XML::Twig;

          my $xml = XML::Twig -> new -> parsefile('your_file.xml');

          foreach my $wpt ( $xml -> get_xpath('//wpt') ) {
              # join only the three fields, so no comma is printed before the newline
              print join( ",", $wpt -> att('lat'),
                               $wpt -> att('lon'),
                               $wpt -> first_child_text('time') ), "\n";
          }


          Which gives the desired result, but it will also handle a variety of otherwise perfectly valid and semantically identical forms of your XML.



          Like indented:



          <xml>
            <wpt lat="1.345529841" lon="103.7577152">
              <time>2010-01-01T00:00:00Z</time>
            </wpt>
            <wpt lat="1.345529841" lon="103.7577152">
              <time>2010-01-01T00:00:00Z</time>
            </wpt>
            <wpt lat="1.3982529841" lon="103.90877152">
              <time>2010-01-01T00:00:00Z</time>
            </wpt>
          </xml>


          All on a single line:



          <xml><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt></xml>


          Another style of indenting:



          <xml>
            <wpt
                lat="1.345529841"
                lon="103.7577152">
              <time>2010-01-01T00:00:00Z</time>
            </wpt>
            <wpt
                lat="1.345529841"
                lon="103.7577152">
              <time>2010-01-01T00:00:00Z</time>
            </wpt>
            <wpt
                lat="1.3982529841"
                lon="103.90877152">
              <time>2010-01-01T00:00:00Z</time>
            </wpt>
          </xml>


          Or even:



          <xml
          ><wpt
          lat="1.345529841"
          lon="103.7577152"
          ><time
          >2010-01-01T00:00:00Z</time></wpt><wpt
          lat="1.345529841"
          lon="103.7577152"
          ><time
          >2010-01-01T00:00:00Z</time></wpt><wpt
          lat="1.3982529841"
          lon="103.90877152"
          ><time
          >2010-01-01T00:00:00Z</time></wpt></xml>


          These are all semantically identical, and should be parsed the same way. Hopefully it's fairly clear that a regular expression to do this is a LOT more complicated than just using an XML parser.
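To make the point concrete, here is what a line-oriented splitter does with the indented form. The sketch below feeds one indented waypoint to an awk command that splits on `<`, `>` and `"` and prints fields 3, 5 and 9; the variable names `indented` and `out` are made up for the illustration.

```sh
# One waypoint in the indented layout: same data, different line breaks.
indented='<xml>
  <wpt lat="1.345529841" lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
</xml>'

# Each physical line is split on its own, so the three values
# never end up together in one record.
out=$(printf '%s\n' "$indented" | awk -F'[<>"]' -v OFS=, '{ print $3, $5, $9 }')

printf '%s\n' "$out"
```

Instead of one CSV row, this prints five rows: the coordinates and the timestamp land on different rows, and the remaining rows hold nothing but commas.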



          For the sake of being concise though:



          perl -MXML::Twig -0777 -e 'XML::Twig->new(twig_handlers=>{wpt=>sub{print join(",", $_->att("lat"), $_->att("lon"), $_->first_child_text("time")), "\n" }})->parse(<>)'



































            0 votes













            Assuming f.xml is our input (a valid xml):



            $ perl -MXML::DT -E 'dt("f.xml",
            time=>sub{$a=father;
            $c =~ s/[TZ]/ /g;
            say "$a->{lat},$a->{lon},$c"}
            )'




            • -MXML::DT loads the XML::DT module (XML down translator)

            • dt(file, time => sub {...}) parses the file and runs the corresponding sub every time a time element is seen

            • $a=father gets the attributes of the parent element

            • $c is the content of the current element

            Warning: I am one of the authors of XML::DT (install with cpan XML::DT)














              edited Feb 10 '17 at 6:56
              answered Feb 9 '17 at 8:43 by oliv




















              edited Feb 9 '17 at 9:49
              answered Feb 9 '17 at 9:38 by Kusalananda




















              edited Feb 9 '17 at 6:58
              answered Feb 9 '17 at 6:51 by Kamaraj












              • Itruns for quite a long time and does not seems to be finishing even for small file
                – RKR
                Feb 9 '17 at 10:30










              • what is your OS ? provide the exact command you typed in your terminal
                – Kamaraj
                Feb 9 '17 at 23:27










              • seems, you missed some double quotes in the command and it expects to close and waits for long time...
                – Kamaraj
                Feb 9 '17 at 23:47


















              • Itruns for quite a long time and does not seems to be finishing even for small file
                – RKR
                Feb 9 '17 at 10:30










              • what is your OS ? provide the exact command you typed in your terminal
                – Kamaraj
                Feb 9 '17 at 23:27










              • seems, you missed some double quotes in the command and it expects to close and waits for long time...
                – Kamaraj
                Feb 9 '17 at 23:47
















              Itruns for quite a long time and does not seems to be finishing even for small file
              – RKR
              Feb 9 '17 at 10:30




              Itruns for quite a long time and does not seems to be finishing even for small file
              – RKR
              Feb 9 '17 at 10:30












              what is your OS ? provide the exact command you typed in your terminal
              – Kamaraj
              Feb 9 '17 at 23:27




              what is your OS ? provide the exact command you typed in your terminal
              – Kamaraj
              Feb 9 '17 at 23:27












              seems, you missed some double quotes in the command and it expects to close and waits for long time...
              – Kamaraj
              Feb 9 '17 at 23:47




              seems, you missed some double quotes in the command and it expects to close and waits for long time...
              – Kamaraj
              Feb 9 '17 at 23:47










              up vote
              1
              down vote













              Please, please - don't use a regular expression based solution, like awk or sed.



              XML is contextual, where regular expressions are not - so they can NEVER work properly, they're only at best a bit of a hack.



              But XML does have a solution to this problem - it's called xpath, that lets you 'search' in a contextual way.



              So to take your example:



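              #!/usr/bin/perl

              use warnings;
              use strict;
              use XML::Twig;

              # Parse the whole document into a tree.
              my $xml = XML::Twig -> new -> parsefile('your_file.xml');

              # Visit every <wpt> element, wherever it sits in the document.
              foreach my $wpt ( $xml -> get_xpath('//wpt') ) {
                  # join() only the three fields, so no comma precedes the newline.
                  print join( ",",
                      $wpt -> att('lat'),
                      $wpt -> att('lon'),
                      $wpt -> first_child_text('time') ), "\n";
              }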


              This prints one lat,lon,time line per waypoint (the timestamp stays in its original 2010-01-01T00:00:00Z form), and it will also handle a variety of otherwise perfectly valid and semantically identical forms of your XML.



              Like indented:



              <xml>
                <wpt lat="1.345529841" lon="103.7577152">
                  <time>2010-01-01T00:00:00Z</time>
                </wpt>
                <wpt lat="1.345529841" lon="103.7577152">
                  <time>2010-01-01T00:00:00Z</time>
                </wpt>
                <wpt lat="1.3982529841" lon="103.90877152">
                  <time>2010-01-01T00:00:00Z</time>
                </wpt>
              </xml>


              All on a single line:



              <xml><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt></xml>


              Another style of indenting:



              <xml>
                <wpt
                    lat="1.345529841"
                    lon="103.7577152">
                  <time>2010-01-01T00:00:00Z</time>
                </wpt>
                <wpt
                    lat="1.345529841"
                    lon="103.7577152">
                  <time>2010-01-01T00:00:00Z</time>
                </wpt>
                <wpt
                    lat="1.3982529841"
                    lon="103.90877152">
                  <time>2010-01-01T00:00:00Z</time>
                </wpt>
              </xml>


              Or even:



              <xml
              ><wpt
              lat="1.345529841"
              lon="103.7577152"
              ><time
              >2010-01-01T00:00:00Z</time></wpt><wpt
              lat="1.345529841"
              lon="103.7577152"
              ><time
              >2010-01-01T00:00:00Z</time></wpt><wpt
              lat="1.3982529841"
              lon="103.90877152"
              ><time
              >2010-01-01T00:00:00Z</time></wpt></xml>


              These are all semantically identical, and should be parsed the same way. Hopefully it's fairly clear that a regular expression to do this is a LOT more complicated than just using an XML parser.
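              To make the point concrete, here is a minimal sketch in Python rather than Perl, using only the standard library's xml.etree.ElementTree. The names INDENTED, ODD and wpt_rows are made up for this illustration, and the samples are shortened to two waypoints. It assumes plain, un-namespaced wpt elements as in the samples above; a real GPX file normally declares a namespace, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Two of the layouts from the answer, shortened to two waypoints each.
INDENTED = """<xml>
  <wpt lat="1.345529841" lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
  <wpt lat="1.3982529841" lon="103.90877152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
</xml>"""

ODD = """<xml
><wpt
lat="1.345529841"
lon="103.7577152"
><time
>2010-01-01T00:00:00Z</time></wpt><wpt
lat="1.3982529841"
lon="103.90877152"
><time
>2010-01-01T00:00:00Z</time></wpt></xml>"""


def wpt_rows(xml_text):
    """Return one "lat,lon,time" string per <wpt> element, in document order."""
    root = ET.fromstring(xml_text)
    rows = []
    for wpt in root.iter("wpt"):
        # findtext() returns None when there is no <time> child.
        time_text = (wpt.findtext("time") or "").strip()
        rows.append(",".join([wpt.get("lat", ""), wpt.get("lon", ""), time_text]))
    return rows


if __name__ == "__main__":
    # Both layouts yield the same rows, because the parser ignores layout.
    assert wpt_rows(INDENTED) == wpt_rows(ODD)
    for row in wpt_rows(INDENTED):
        print(row)
```

              The parser only sees elements and attributes, so line breaks inside tags or between elements make no difference. A line-oriented awk or sed pattern would need a separate case for each layout.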



              For the sake of being concise though:



              perl -MXML::Twig -0777 -e 'XML::Twig->new(twig_handlers=>{wpt=>sub{print join(",", $_->att("lat"), $_->att("lon"), $_->first_child_text("time")), "\n"}})->parse(<>)' your_file.xml













                  edited May 23 '17 at 12:40









                  Community











                  answered Feb 10 '17 at 9:08









                  Sobrique























                      up vote
                      0
                      down vote













                      Assuming f.xml is our input (a valid XML file):



                      $ perl -MXML::DT -E 'dt("f.xml",
                            time => sub { $a = father;
                                          $c =~ tr/TZ/ /d;
                                          say "$a->{lat},$a->{lon},$c" }
                        )'




                      • -MXML::DT : loads the XML::DT module (XML down translator)


                      • dt( file, time => sub{....}) : parses the file and, every time a time element is seen, executes the corresponding sub


                      • $a=father : gets the attributes of the father (parent) element, here wpt


                      • $c : the content of the current element


                      Warning: I am one of the authors of XML::DT (install with cpan XML::DT)
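                      The timestamp rewrite on its own (2010-01-01T00:00:00Z to 2010-01-01 00:00:00) can also be done by parsing the value rather than substituting characters. A minimal sketch in Python, standard library only; the function name gpx_time_to_csv is made up for this illustration, and it assumes every timestamp has exactly the UTC form shown in the question (no fractional seconds, no numeric offset):

```python
from datetime import datetime


def gpx_time_to_csv(stamp):
    """Convert '2010-01-01T00:00:00Z' into '2010-01-01 00:00:00'.

    Raises ValueError if the input is not in the assumed form.
    """
    parsed = datetime.strptime(stamp.strip(), "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(gpx_time_to_csv("2010-01-01T00:00:00Z"))
```

                      Parsing rejects malformed values instead of passing them through silently, at the cost of failing on timestamp variants this sketch does not cover.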














                          edited Feb 10 '17 at 11:52

























                          answered Feb 10 '17 at 11:45









                          JJoao






























