From gpx to csv file

up vote
0
down vote


<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>

<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt> 

<wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt>

I have a file containing lines like the ones above, which need to be converted into:

         1.345529841,103.7577152,2010-01-01 00:00:00

         1.345529841,103.7577152,2010-01-01 00:00:00

         1.3982529841,103.90877152,2010-01-01 00:00:00

edited Nov 24 at 19:41

Rui F Ribeiro


asked Feb 9 '17 at 6:00

RKR


linux awk sed xml


5 Answers

up vote
1
down vote

accepted

You could use sed to strip out the characters you don't want:

sed -E 's/[^0-9.T:-]+/,/g;s/T/ /;s/^,|,$//g' file

-E enables extended regular expressions, which the + and | operators need

s/[^0-9.T:-]+/,/g replaces each run of unwanted characters with a single comma

s/T/ / replaces the character T with a space

s/^,|,$//g removes the leading and trailing commas
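To see what each of the three substitutions contributes, here is a small sketch that runs them one at a time on a single sample line from the question. The variable names (`line`, `stage1`, `stage2`, `stage3`) are made up for illustration, and it assumes a sed that accepts `-E` (GNU and BSD sed both do).

```shell
# One sample line, copied from the question.
line='<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>'

# Stage 1: every run of unwanted characters becomes a single comma.
stage1=$(printf '%s\n' "$line" | sed -E 's/[^0-9.T:-]+/,/g')

# Stage 2: the first T on the line (the date/time separator) becomes a space.
stage2=$(printf '%s\n' "$stage1" | sed 's/T/ /')

# Stage 3: the leading and trailing commas are removed.
stage3=$(printf '%s\n' "$stage2" | sed -E 's/^,|,$//g')

# Show the line after each stage.
printf '%s\n' "$stage1" "$stage2" "$stage3"
```

Note that this only works because no tag or attribute name in the sample contains a capital T or a digit; anything like that in the markup would leak into the output.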

edited Feb 10 '17 at 6:56

answered Feb 9 '17 at 8:43

oliv


Yours is OK, but the commas do not appear. We have to add them ourselves.
– RKR
Feb 9 '17 at 10:29

I didn't notice the comma; answer updated.
– oliv
Feb 9 '17 at 10:46

oliv, everything is fine, but instead of 103.75... in the second column I am getting 103,75 (for everything in column 2). Is anything wrong in the code?
– RKR
Feb 10 '17 at 1:57

Look at the updated answer; it should solve the comma/dot notation.
– oliv
Feb 10 '17 at 6:57

Sorry, I know this is technically correct, but ... sed is a really awful tool for XML parsing. XML is contextual, and regular expressions are not.
– Sobrique
Feb 10 '17 at 8:51

up vote
3
down vote

GPX is an XML format, so you can't use awk or sed to parse it reliably.

Instead, use something like XMLStarlet:

$ xml sel -t -m '//wpt' \
          -v '@lat' -o ',' -v '@lon' -o ',' \
          -v 'time' -nl data.gpx

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.3982529841,103.90877152,2010-01-01T00:00:00Z

Alternatively:

$ xml sel -t -m '//wpt' -v 'concat(@lat, ",", @lon, ",", time)' -nl data.gpx
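The rows printed above still carry the ISO 8601 `T` separator and the trailing `Z`, while the question asks for a plain space and no `Z`. One possible way to finish the job is to post-process the CSV (which is no longer XML, so a line-based tool is fine here). This is only a sketch: the `rows` and `converted` variables are made up for illustration, the sample rows are copied from the output above, and it assumes every timestamp ends in `Z`.

```shell
# Sample rows copied from the XMLStarlet output shown above.
rows='1.345529841,103.7577152,2010-01-01T00:00:00Z
1.3982529841,103.90877152,2010-01-01T00:00:00Z'

# Replace the date/time separator T with a space and drop the trailing Z.
converted=$(printf '%s\n' "$rows" | sed -e 's/T/ /' -e 's/Z$//')

printf '%s\n' "$converted"
```

In practice the same sed expressions could be appended to the `xml sel` command with a pipe instead of reading from a variable.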

edited Feb 9 '17 at 9:49

answered Feb 9 '17 at 9:38

Kusalananda


No command xml found. Should I download anything?
– RKR
Feb 9 '17 at 10:31

@RKR I'm using XMLStarlet. It's likely available as a package for your Linux. The command may sometimes be xmlstarlet rather than just xml.
– Kusalananda
Feb 9 '17 at 10:32

up vote
1
down vote

This answer is based on the input given...

awk -F'[<>"]' '{print $3,$5,$9}' OFS=, input.txt | sed "s/[TZ]/ /g"

1.345529841,103.7577152,2010-01-01 00:00:00

1.345529841,103.7577152,2010-01-01 00:00:00

1.3982529841,103.90877152,2010-01-01 00:00:00

awk -F'[<>"]' '{gsub(/T|Z/," ",$9);print $3,$5,$9}' OFS=, input.txt
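The field numbers 3, 5 and 9 are not obvious at first sight. A quick way to check them is to split one sample line on the same separator set (`<`, `>` and `"`) and print every field with its index. This is just a sketch; the `line` and `fields` variables are made up for illustration.

```shell
# One sample line, copied from the question.
line='<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>'

# Print every field with its index so the column numbers can be read off.
fields=$(printf '%s\n' "$line" |
    awk -F'[<>"]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

The listing shows lat in field 3, lon in field 5 and the timestamp in field 9, but only for input laid out exactly like the sample line.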

edited Feb 9 '17 at 6:58

answered Feb 9 '17 at 6:51

Kamaraj


It runs for quite a long time and does not seem to finish, even for a small file.
– RKR
Feb 9 '17 at 10:30

What is your OS? Please provide the exact command you typed in your terminal.
– Kamaraj
Feb 9 '17 at 23:27

It seems you missed some double quotes in the command, so the shell expects a closing quote and waits for more input...
– Kamaraj
Feb 9 '17 at 23:47

up vote
1
down vote

Please, please - don't use a regular expression based solution, like awk or sed.

XML is contextual, where regular expressions are not - so they can NEVER work properly, they're only at best a bit of a hack.

But XML does have a solution to this problem: it's called XPath, and it lets you 'search' in a contextual way.

So to take your example:

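#!/usr/bin/perl

use warnings;
use strict;
use XML::Twig;

# Parse the whole document into a tree.
my $xml = XML::Twig -> new -> parsefile('your_file.xml');

# Visit every wpt element and print lat, lon and time as one CSV row.
# The newline sits outside join() so that no trailing comma is printed.
foreach my $wpt ( $xml -> get_xpath('//wpt') ) {
   print join( ",", $wpt -> att('lat'),
                    $wpt -> att('lon'),
                    $wpt -> first_child_text('time') ), "\n";
}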

Which gives the desired result, but it will also handle a variety of otherwise perfectly valid and semantically identical forms of your XML.

Like indented:

<xml>
  <wpt lat="1.345529841" lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
  <wpt lat="1.345529841" lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
  <wpt lat="1.3982529841" lon="103.90877152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
</xml>

All on a single line:

<xml><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt></xml>

Another style of indenting:

<xml>
  <wpt
      lat="1.345529841"
      lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
  <wpt
      lat="1.345529841"
      lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
  <wpt
      lat="1.3982529841"
      lon="103.90877152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
</xml>

Or even:

<xml
><wpt
lat="1.345529841"
lon="103.7577152"
><time
>2010-01-01T00:00:00Z</time></wpt><wpt
lat="1.345529841"
lon="103.7577152"
><time
>2010-01-01T00:00:00Z</time></wpt><wpt
lat="1.3982529841"
lon="103.90877152"
><time
>2010-01-01T00:00:00Z</time></wpt></xml>

These are all semantically identical, and should be parsed the same way. Hopefully it's fairly clear that a regular expression to do this is a LOT more complicated than just using an XML parser.
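To make that concrete, here is a small sketch that feeds the same waypoint to a line-based sed command in two layouts. It assumes the sed command from the accepted answer (written with `-E` for extended regular expressions); `to_csv` and the variable names are made up for illustration.

```shell
# The sed command from the accepted answer, wrapped in a function.
to_csv() {
    sed -E 's/[^0-9.T:-]+/,/g;s/T/ /;s/^,|,$//g'
}

# The same waypoint in two layouts that an XML parser treats as identical.
one_line='<wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt>'
indented='<wpt lat="1.345529841" lon="103.7577152">
  <time>2010-01-01T00:00:00Z</time>
</wpt>'

# Count the output lines that look like a complete three-column row.
# "|| true" keeps the script going when grep finds nothing and exits non-zero.
rows_one_line=$(printf '%s\n' "$one_line" | to_csv | grep -c '^[^,]*,[^,]*,[^,]*$' || true)
rows_indented=$(printf '%s\n' "$indented" | to_csv | grep -c '^[^,]*,[^,]*,[^,]*$' || true)

echo "one line: $rows_one_line complete row(s)"
echo "indented: $rows_indented complete row(s)"
```

The single-line layout yields one complete row, while the indented layout yields none, because the coordinates and the timestamp end up on separate output lines.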

For the sake of being concise though:

perl -MXML::Twig -0777 -e 'XML::Twig->new(twig_handlers=>{wpt=>sub{print join(",", $_->att("lat"), $_->att("lon"), $_->first_child_text("time")), "\n"}})->parse(<>)' your_file.xml

edited May 23 '17 at 12:40

Community♦

answered Feb 10 '17 at 9:08

Sobrique


up vote
0
down vote

Assuming f.xml is our input (valid XML):

$ perl -MXML::DT -E 'dt("f.xml",
                         time=>sub{$a=father;
                                   $c =~ s/[TZ]/ /g;
                                   say "$a->{lat},$a->{lon},$c"}
                       )'

-MXML::DT loads the XML::DT module (XML Down Translator).

dt(file, time => sub{...}) parses the file and runs the given sub each time a time element is seen.

$a=father gets the attributes of the parent element.

$c is the content of the current element.

Warning: I am one of the authors of XML::DT (install with cpan XML::DT)

edited Feb 10 '17 at 11:52

answered Feb 10 '17 at 11:45

JJoao



5 Answers
5

active

oldest

votes

5 Answers
5

active

oldest

votes

up vote
1
down vote

accepted

You could use sed to strip out the characters you don't want:

sed 's/[^0-9.T:-]+/,/g;s/T/ /;s/^,|,$//g' file

s/[^0-9.T:-]+/,/g is replacing unwanted characters with a comma

s/T/ / is replacing the character T with a space

s/^,|,$//g is removing the first and last comma

edited Feb 10 '17 at 6:56

answered Feb 9 '17 at 8:43

oliv

1,651311

Yours is ok but comma will not come.We have to add it by ourselves
– RKR
Feb 9 '17 at 10:29

1

didn't notice the comma, answer updated
– oliv
Feb 9 '17 at 10:46

oliv everything is fine but instead of 103.75... in the second column ,I am getting 103,75.(For everything in column 2) Anything wrong in the code ?
– RKR
Feb 10 '17 at 1:57

1

look at the updated answer, it should the solve the comma/dot notation
– oliv
Feb 10 '17 at 6:57

1

Sorry, I know this is technically correct, but ... sed is a really awful tool for XML parsing. XML is contextual, and regular expressions are not.
– Sobrique
Feb 10 '17 at 8:51

|
show 1 more comment

up vote
1
down vote

accepted

You could use sed to strip out the characters you don't want:

sed 's/[^0-9.T:-]+/,/g;s/T/ /;s/^,|,$//g' file

s/[^0-9.T:-]+/,/g is replacing unwanted characters with a comma

s/T/ / is replacing the character T with a space

s/^,|,$//g is removing the first and last comma

edited Feb 10 '17 at 6:56

answered Feb 9 '17 at 8:43

oliv

1,651311

Yours is ok but comma will not come.We have to add it by ourselves
– RKR
Feb 9 '17 at 10:29

1

didn't notice the comma, answer updated
– oliv
Feb 9 '17 at 10:46

oliv everything is fine but instead of 103.75... in the second column ,I am getting 103,75.(For everything in column 2) Anything wrong in the code ?
– RKR
Feb 10 '17 at 1:57

1

look at the updated answer, it should the solve the comma/dot notation
– oliv
Feb 10 '17 at 6:57

1

Sorry, I know this is technically correct, but ... sed is a really awful tool for XML parsing. XML is contextual, and regular expressions are not.
– Sobrique
Feb 10 '17 at 8:51

|
show 1 more comment

up vote
1
down vote

accepted

You could use sed to strip out the characters you don't want:

sed 's/[^0-9.T:-]+/,/g;s/T/ /;s/^,|,$//g' file

s/[^0-9.T:-]+/,/g is replacing unwanted characters with a comma

s/T/ / is replacing the character T with a space

s/^,|,$//g is removing the first and last comma

edited Feb 10 '17 at 6:56

answered Feb 9 '17 at 8:43

oliv

1,651311

You could use sed to strip out the characters you don't want:

sed 's/[^0-9.T:-]+/,/g;s/T/ /;s/^,|,$//g' file

s/[^0-9.T:-]+/,/g is replacing unwanted characters with a comma

s/T/ / is replacing the character T with a space

s/^,|,$//g is removing the first and last comma

edited Feb 10 '17 at 6:56

answered Feb 9 '17 at 8:43

oliv

1,651311

edited Feb 10 '17 at 6:56

answered Feb 9 '17 at 8:43

oliv

1,651311

answered Feb 9 '17 at 8:43

oliv

1,651311

answered Feb 9 '17 at 8:43

oliv

1,651311

Yours is ok but comma will not come.We have to add it by ourselves
– RKR
Feb 9 '17 at 10:29

1

didn't notice the comma, answer updated
– oliv
Feb 9 '17 at 10:46

oliv everything is fine but instead of 103.75... in the second column ,I am getting 103,75.(For everything in column 2) Anything wrong in the code ?
– RKR
Feb 10 '17 at 1:57

1

look at the updated answer, it should the solve the comma/dot notation
– oliv
Feb 10 '17 at 6:57

1

Sorry, I know this is technically correct, but ... sed is a really awful tool for XML parsing. XML is contextual, and regular expressions are not.
– Sobrique
Feb 10 '17 at 8:51

|
show 1 more comment

Yours is ok but comma will not come.We have to add it by ourselves
– RKR
Feb 9 '17 at 10:29

1

didn't notice the comma, answer updated
– oliv
Feb 9 '17 at 10:46

oliv everything is fine but instead of 103.75... in the second column ,I am getting 103,75.(For everything in column 2) Anything wrong in the code ?
– RKR
Feb 10 '17 at 1:57

1

look at the updated answer, it should the solve the comma/dot notation
– oliv
Feb 10 '17 at 6:57

1

Sorry, I know this is technically correct, but ... sed is a really awful tool for XML parsing. XML is contextual, and regular expressions are not.
– Sobrique
Feb 10 '17 at 8:51

Yours is ok but comma will not come.We have to add it by ourselves
– RKR
Feb 9 '17 at 10:29

didn't notice the comma, answer updated
– oliv
Feb 9 '17 at 10:46

oliv everything is fine but instead of 103.75... in the second column ,I am getting 103,75.(For everything in column 2) Anything wrong in the code ?
– RKR
Feb 10 '17 at 1:57

look at the updated answer, it should the solve the comma/dot notation
– oliv
Feb 10 '17 at 6:57

Sorry, I know this is technically correct, but ... sed is a really awful tool for XML parsing. XML is contextual, and regular expressions are not.
– Sobrique
Feb 10 '17 at 8:51

|
show 1 more comment

up vote
3
down vote

GPX is an XML format, so you can't use awk or sed to parse it reliably.

Instead, use something like XMLStarlet:

$ xml sel -t -m '//wpt' 

          -v '@lat' -o ',' -v '@lon' -o ',' 

          -v 'time' -nl data.gpx

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.3982529841,103.90877152,2010-01-01T00:00:00Z

Alternatively:

$ xml sel -t -m '//wpt' -v 'concat(@lat, ",", @lon, ",", time)' -nl data.wpx

edited Feb 9 '17 at 9:49

answered Feb 9 '17 at 9:38

Kusalananda

118k16223361

No command xml found.Should I download anything?
– RKR
Feb 9 '17 at 10:31

2

@RKR I'm using XMLStarlet. It's likely available as a package for your Linux. The command may sometimes be xmlstarlet rather than just xml.
– Kusalananda
Feb 9 '17 at 10:32

add a comment |

up vote
3
down vote

GPX is an XML format, so you can't use awk or sed to parse it reliably.

Instead, use something like XMLStarlet:

$ xml sel -t -m '//wpt' 

          -v '@lat' -o ',' -v '@lon' -o ',' 

          -v 'time' -nl data.gpx

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.3982529841,103.90877152,2010-01-01T00:00:00Z

Alternatively:

$ xml sel -t -m '//wpt' -v 'concat(@lat, ",", @lon, ",", time)' -nl data.wpx

edited Feb 9 '17 at 9:49

answered Feb 9 '17 at 9:38

Kusalananda

118k16223361

No command xml found.Should I download anything?
– RKR
Feb 9 '17 at 10:31

2

@RKR I'm using XMLStarlet. It's likely available as a package for your Linux. The command may sometimes be xmlstarlet rather than just xml.
– Kusalananda
Feb 9 '17 at 10:32

add a comment |

up vote
3
down vote

GPX is an XML format, so you can't use awk or sed to parse it reliably.

Instead, use something like XMLStarlet:

$ xml sel -t -m '//wpt' 

          -v '@lat' -o ',' -v '@lon' -o ',' 

          -v 'time' -nl data.gpx

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.3982529841,103.90877152,2010-01-01T00:00:00Z

Alternatively:

$ xml sel -t -m '//wpt' -v 'concat(@lat, ",", @lon, ",", time)' -nl data.wpx

edited Feb 9 '17 at 9:49

answered Feb 9 '17 at 9:38

Kusalananda

118k16223361

GPX is an XML format, so you can't use awk or sed to parse it reliably.

Instead, use something like XMLStarlet:

$ xml sel -t -m '//wpt' 

          -v '@lat' -o ',' -v '@lon' -o ',' 

          -v 'time' -nl data.gpx

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.345529841,103.7577152,2010-01-01T00:00:00Z

1.3982529841,103.90877152,2010-01-01T00:00:00Z

Alternatively:

$ xml sel -t -m '//wpt' -v 'concat(@lat, ",", @lon, ",", time)' -nl data.wpx

edited Feb 9 '17 at 9:49

answered Feb 9 '17 at 9:38

Kusalananda

118k16223361

edited Feb 9 '17 at 9:49

answered Feb 9 '17 at 9:38

Kusalananda

118k16223361

answered Feb 9 '17 at 9:38

Kusalananda

118k16223361

answered Feb 9 '17 at 9:38

Kusalananda

118k16223361

No command xml found.Should I download anything?
– RKR
Feb 9 '17 at 10:31

2

@RKR I'm using XMLStarlet. It's likely available as a package for your Linux. The command may sometimes be xmlstarlet rather than just xml.
– Kusalananda
Feb 9 '17 at 10:32

add a comment |

No command xml found.Should I download anything?
– RKR
Feb 9 '17 at 10:31

2

@RKR I'm using XMLStarlet. It's likely available as a package for your Linux. The command may sometimes be xmlstarlet rather than just xml.
– Kusalananda
Feb 9 '17 at 10:32

No command xml found.Should I download anything?
– RKR
Feb 9 '17 at 10:31

@RKR I'm using XMLStarlet. It's likely available as a package for your Linux. The command may sometimes be xmlstarlet rather than just xml.
– Kusalananda
Feb 9 '17 at 10:32

add a comment |

up vote
1
down vote

this answer is Based on the input given...

awk -F"[<>"]" '{print $3,$5,$9}' OFS=, input.txt | sed "s/[TZ]/ /g"

1.345529841,103.7577152,2010-01-01 00:00:00

1.345529841,103.7577152,2010-01-01 00:00:00

1.3982529841,103.90877152,2010-01-01 00:00:00

awk -F"[<>"]" '{gsub(/T|Z/," ",$9);print $3,$5,$9}' OFS=, input.txt

edited Feb 9 '17 at 6:58

answered Feb 9 '17 at 6:51

Kamaraj

2,9161513

Itruns for quite a long time and does not seems to be finishing even for small file
– RKR
Feb 9 '17 at 10:30

what is your OS ? provide the exact command you typed in your terminal
– Kamaraj
Feb 9 '17 at 23:27

seems, you missed some double quotes in the command and it expects to close and waits for long time...
– Kamaraj
Feb 9 '17 at 23:47

add a comment |

up vote
1
down vote

this answer is Based on the input given...

awk -F"[<>"]" '{print $3,$5,$9}' OFS=, input.txt | sed "s/[TZ]/ /g"

1.345529841,103.7577152,2010-01-01 00:00:00

1.345529841,103.7577152,2010-01-01 00:00:00

1.3982529841,103.90877152,2010-01-01 00:00:00

awk -F"[<>"]" '{gsub(/T|Z/," ",$9);print $3,$5,$9}' OFS=, input.txt

edited Feb 9 '17 at 6:58

answered Feb 9 '17 at 6:51

Kamaraj

2,9161513

Itruns for quite a long time and does not seems to be finishing even for small file
– RKR
Feb 9 '17 at 10:30

what is your OS ? provide the exact command you typed in your terminal
– Kamaraj
Feb 9 '17 at 23:27

seems, you missed some double quotes in the command and it expects to close and waits for long time...
– Kamaraj
Feb 9 '17 at 23:47

add a comment |

up vote
1
down vote

this answer is Based on the input given...

awk -F"[<>"]" '{print $3,$5,$9}' OFS=, input.txt | sed "s/[TZ]/ /g"

1.345529841,103.7577152,2010-01-01 00:00:00

1.345529841,103.7577152,2010-01-01 00:00:00

1.3982529841,103.90877152,2010-01-01 00:00:00

awk -F"[<>"]" '{gsub(/T|Z/," ",$9);print $3,$5,$9}' OFS=, input.txt

edited Feb 9 '17 at 6:58

answered Feb 9 '17 at 6:51

Kamaraj

2,9161513

this answer is Based on the input given...

awk -F"[<>"]" '{print $3,$5,$9}' OFS=, input.txt | sed "s/[TZ]/ /g"

1.345529841,103.7577152,2010-01-01 00:00:00

1.345529841,103.7577152,2010-01-01 00:00:00

1.3982529841,103.90877152,2010-01-01 00:00:00

awk -F"[<>"]" '{gsub(/T|Z/," ",$9);print $3,$5,$9}' OFS=, input.txt

edited Feb 9 '17 at 6:58

answered Feb 9 '17 at 6:51

Kamaraj

2,9161513

edited Feb 9 '17 at 6:58

answered Feb 9 '17 at 6:51

Kamaraj

2,9161513

answered Feb 9 '17 at 6:51

Kamaraj

2,9161513

answered Feb 9 '17 at 6:51

Kamaraj

2,9161513

Itruns for quite a long time and does not seems to be finishing even for small file
– RKR
Feb 9 '17 at 10:30

what is your OS ? provide the exact command you typed in your terminal
– Kamaraj
Feb 9 '17 at 23:27

seems, you missed some double quotes in the command and it expects to close and waits for long time...
– Kamaraj
Feb 9 '17 at 23:47

add a comment |

Itruns for quite a long time and does not seems to be finishing even for small file
– RKR
Feb 9 '17 at 10:30

what is your OS ? provide the exact command you typed in your terminal
– Kamaraj
Feb 9 '17 at 23:27

seems, you missed some double quotes in the command and it expects to close and waits for long time...
– Kamaraj
Feb 9 '17 at 23:47

Itruns for quite a long time and does not seems to be finishing even for small file
– RKR
Feb 9 '17 at 10:30

what is your OS ? provide the exact command you typed in your terminal
– Kamaraj
Feb 9 '17 at 23:27

seems, you missed some double quotes in the command and it expects to close and waits for long time...
– Kamaraj
Feb 9 '17 at 23:47

add a comment |

up vote
1
down vote

Please, please - don't use a regular expression based solution, like awk or sed.

XML is contextual, where regular expressions are not - so they can NEVER work properly, they're only at best a bit of a hack.

But XML does have a solution to this problem - it's called xpath, that lets you 'search' in a contextual way.

So to take your example:

#!/usr/bin/perl



use warnings;

use strict;

use XML::Twig;



my $xml = XML::Twig -> new -> parsefile('your_file.xml'); 



foreach my $wpt ( $xml -> get_xpath('//wpt') ) {

   print join ",", $wpt -> att('lat'), 

                   $wpt -> att('lon'),

                   $wpt -> first_child_text('time'), "n";

}

Which gives the desired result, but it will also handle a variety of otherwise perfectly valid and semantically identical forms of your XML.

Like indented:

<xml>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.3982529841" lon="103.90877152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

</xml>

All on a single line:

<xml><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt></xml>

Another style of indenting:

<xml>

  <wpt

      lat="1.345529841"

      lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt

      lat="1.345529841"

      lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt

      lat="1.3982529841"

      lon="103.90877152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

</xml>

Or even:

<xml

><wpt

lat="1.345529841"

lon="103.7577152"

><time

>2010-01-01T00:00:00Z</time></wpt><wpt

lat="1.345529841"

lon="103.7577152"

><time

>2010-01-01T00:00:00Z</time></wpt><wpt

lat="1.3982529841"

lon="103.90877152"

><time

>2010-01-01T00:00:00Z</time></wpt></xml>

These are all semantically identical, and should be parsed the same way. Hopefully it's fairly clear that a regular expression to do this is a LOT more complicated than just using an XML parser.

For the sake of being concise though:

perl -MXML::Twig -0777 -e 'XML::Twig->new(twig_handlers=>{wpt=>sub{print join ",", $_->att("lat", $_->att("lon"),$_->first_child_text("time"), "n" }})->parse(<>)'

edited May 23 '17 at 12:40

Community♦

answered Feb 10 '17 at 9:08

Sobrique

3,759517

add a comment |

up vote
1
down vote

Please, please - don't use a regular expression based solution, like awk or sed.

XML is contextual, where regular expressions are not - so they can NEVER work properly, they're only at best a bit of a hack.

But XML does have a solution to this problem - it's called xpath, that lets you 'search' in a contextual way.

So to take your example:

#!/usr/bin/perl



use warnings;

use strict;

use XML::Twig;



my $xml = XML::Twig -> new -> parsefile('your_file.xml'); 



foreach my $wpt ( $xml -> get_xpath('//wpt') ) {

   print join ",", $wpt -> att('lat'), 

                   $wpt -> att('lon'),

                   $wpt -> first_child_text('time'), "n";

}

Which gives the desired result, but it will also handle a variety of otherwise perfectly valid and semantically identical forms of your XML.

Like indented:

<xml>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.3982529841" lon="103.90877152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

</xml>

All on a single line:

<xml><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt></xml>

Another style of indenting:

<xml>

  <wpt

      lat="1.345529841"

      lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt

      lat="1.345529841"

      lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt

      lat="1.3982529841"

      lon="103.90877152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

</xml>

Or even:

<xml

><wpt

lat="1.345529841"

lon="103.7577152"

><time

>2010-01-01T00:00:00Z</time></wpt><wpt

lat="1.345529841"

lon="103.7577152"

><time

>2010-01-01T00:00:00Z</time></wpt><wpt

lat="1.3982529841"

lon="103.90877152"

><time

>2010-01-01T00:00:00Z</time></wpt></xml>

These are all semantically identical, and should be parsed the same way. Hopefully it's fairly clear that a regular expression to do this is a LOT more complicated than just using an XML parser.

For the sake of being concise though:

perl -MXML::Twig -0777 -e 'XML::Twig->new(twig_handlers=>{wpt=>sub{print join ",", $_->att("lat", $_->att("lon"),$_->first_child_text("time"), "n" }})->parse(<>)'

edited May 23 '17 at 12:40

Community♦

answered Feb 10 '17 at 9:08

Sobrique

3,759517

add a comment |

up vote
1
down vote

Please, please - don't use a regular expression based solution, like awk or sed.

XML is contextual, where regular expressions are not - so they can NEVER work properly, they're only at best a bit of a hack.

But XML does have a solution to this problem - it's called xpath, that lets you 'search' in a contextual way.

So to take your example:

#!/usr/bin/perl



use warnings;

use strict;

use XML::Twig;



my $xml = XML::Twig -> new -> parsefile('your_file.xml'); 



foreach my $wpt ( $xml -> get_xpath('//wpt') ) {

   print join ",", $wpt -> att('lat'), 

                   $wpt -> att('lon'),

                   $wpt -> first_child_text('time'), "n";

}

Which gives the desired result, but it will also handle a variety of otherwise perfectly valid and semantically identical forms of your XML.

Like indented:

<xml>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.3982529841" lon="103.90877152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

</xml>

All on a single line:

<xml><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt></xml>

Another style of indenting:

<xml>

  <wpt

      lat="1.345529841"

      lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt

      lat="1.345529841"

      lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt

      lat="1.3982529841"

      lon="103.90877152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

</xml>

Or even:

<xml

><wpt

lat="1.345529841"

lon="103.7577152"

><time

>2010-01-01T00:00:00Z</time></wpt><wpt

lat="1.345529841"

lon="103.7577152"

><time

>2010-01-01T00:00:00Z</time></wpt><wpt

lat="1.3982529841"

lon="103.90877152"

><time

>2010-01-01T00:00:00Z</time></wpt></xml>

These are all semantically identical, and should be parsed the same way. Hopefully it's fairly clear that a regular expression to do this is a LOT more complicated than just using an XML parser.

For the sake of being concise though:

perl -MXML::Twig -0777 -e 'XML::Twig->new(twig_handlers=>{wpt=>sub{print join ",", $_->att("lat", $_->att("lon"),$_->first_child_text("time"), "n" }})->parse(<>)'

edited May 23 '17 at 12:40

Community♦

answered Feb 10 '17 at 9:08

Sobrique

3,759517

Please, please - don't use a regular expression based solution, like awk or sed.

XML is contextual, where regular expressions are not - so they can NEVER work properly, they're only at best a bit of a hack.

But XML does have a solution to this problem - it's called xpath, that lets you 'search' in a contextual way.

So to take your example:

#!/usr/bin/perl



use warnings;

use strict;

use XML::Twig;



my $xml = XML::Twig -> new -> parsefile('your_file.xml'); 



foreach my $wpt ( $xml -> get_xpath('//wpt') ) {

   print join ",", $wpt -> att('lat'), 

                   $wpt -> att('lon'),

                   $wpt -> first_child_text('time'), "n";

}

Which gives the desired result, but it will also handle a variety of otherwise perfectly valid and semantically identical forms of your XML.

Like indented:

<xml>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.345529841" lon="103.7577152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

  <wpt lat="1.3982529841" lon="103.90877152">

    <time>2010-01-01T00:00:00Z</time>

  </wpt>

</xml>

All on a single line:

<xml><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.345529841" lon="103.7577152"><time>2010-01-01T00:00:00Z</time></wpt><wpt lat="1.3982529841" lon="103.90877152"><time>2010-01-01T00:00:00Z</time></wpt></xml>

Another style of indenting:

<xml>
  <wpt
      lat="1.345529841"
      lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
  <wpt
      lat="1.345529841"
      lon="103.7577152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
  <wpt
      lat="1.3982529841"
      lon="103.90877152">
    <time>2010-01-01T00:00:00Z</time>
  </wpt>
</xml>

Or even:

<xml
><wpt
lat="1.345529841"
lon="103.7577152"
><time
>2010-01-01T00:00:00Z</time></wpt><wpt
lat="1.345529841"
lon="103.7577152"
><time
>2010-01-01T00:00:00Z</time></wpt><wpt
lat="1.3982529841"
lon="103.90877152"
><time
>2010-01-01T00:00:00Z</time></wpt></xml>

These are all semantically identical, and should be parsed the same way. Hopefully it's fairly clear that a regular expression to do this is a LOT more complicated than just using an XML parser.

For the sake of being concise though:

perl -MXML::Twig -0777 -e 'XML::Twig->new(twig_handlers=>{wpt=>sub{print join(",", $_->att("lat"), $_->att("lon"), $_->first_child_text("time")), "\n"}})->parse(<>)'
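The question asks for the time as 2010-01-01 00:00:00 rather than 2010-01-01T00:00:00Z. A minimal sketch of how that could be added on top of the same XML::Twig approach is below. The helper name wpt_to_csv, the gpx root element and the two sample strings are my own assumptions for illustration, and the conversion simply assumes every timestamp has the exact form YYYY-MM-DDTHH:MM:SSZ (it does no time zone arithmetic).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not part of XML::Twig): return one CSV line
# "lat,lon,YYYY-MM-DD HH:MM:SS" for every <wpt> element in an XML string.
sub wpt_to_csv {
    my ($xml_text) = @_;
    my @rows;

    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called once for each completely parsed <wpt> element
            wpt => sub {
                my ( $t, $wpt ) = @_;
                my $time = $wpt->first_child_text('time');

                $time =~ s/^\s+|\s+$//g;    # drop surrounding whitespace
                $time =~ tr/TZ/ /d;         # T becomes a space, Z is deleted

                push @rows,
                    join( ",", $wpt->att('lat'), $wpt->att('lon'), $time );
            },
        },
    );
    $twig->parse($xml_text);

    return @rows;
}

# Two layouts of the same (assumed) sample data
my $compact = '<gpx><wpt lat="1.345529841" lon="103.7577152">'
            . '<time>2010-01-01T00:00:00Z</time></wpt></gpx>';

my $spread = <<'END_XML';
<gpx
><wpt
lat="1.345529841"
lon="103.7577152"
><time
>2010-01-01T00:00:00Z</time></wpt></gpx>
END_XML

# Both print: 1.345529841,103.7577152,2010-01-01 00:00:00
print "$_\n" for wpt_to_csv($compact);
print "$_\n" for wpt_to_csv($spread);
```

Because the parser does the tokenising, the same handler gives the same line for both layouts, which is the point made above about not depending on how the file happens to be formatted.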

edited May 23 '17 at 12:40

Community♦

answered Feb 10 '17 at 9:08

Sobrique

3,759517


Assuming f.xml is our input (valid XML):

$ perl -MXML::DT -E 'dt("f.xml",
                         time=>sub{$a=father;
                                   $c =~ tr/TZ/ /d;
                                   say "$a->{lat},$a->{lon},$c"}
                       )'

-MXML::DT : load the XML::DT module (XML down translator)

dt( file, time => sub{....}) : parse the file and, every time a time element is seen, execute the corresponding sub

$a=father : get the attributes of the parent element

$c : the content of the current element

$c =~ tr/TZ/ /d : turn the T into a space and delete the Z, so no trailing space is left in the output

Warning: I am one of the authors of XML::DT (install with cpan XML::DT)

edited Feb 10 '17 at 11:52

answered Feb 10 '17 at 11:45

JJoao

6,9841827
