Bash script to read a web page list from a text file

I want to read a list of web pages and check whether any of them have been updated. Is it better to use wget or curl, and how should I do that?

The list of web pages is in a plain text file. If a page's contents are unchanged, the script should print nothing; if the contents have changed since the last time the script ran, it should print the page's address to stdout.

bash shell-script wget

asked Nov 23 at 11:50 (edited Nov 23 at 12:32)
Βάσω Κουπετσιδου

  • Please don't re-post the same question. I have now reopened this one and closed your first one as a duplicate.
    – terdon
    Nov 23 at 12:28

  • Βάσω, please edit your question and show us a few lines of your file with the webpages so we know what we're dealing with. And do you keep the previous versions of the webpages somewhere? Are they simple html files? More complex pages which are generated on the fly?
    – terdon
    Nov 23 at 12:28

  • Also, please look at the comments on your original question and edit this one to provide the details requested.
    – terdon
    Nov 23 at 12:29

1 Answer

#!/bin/sh

i=1
while IFS= read -r url; do
    file="data-$i.out"

    # Fetch the page into a temporary file next to the saved copy.
    curl -o "$file.new" "$url"

    # cmp -s is silent; it exits non-zero when the files differ or
    # when "$file" does not exist yet, so new URLs are also reported.
    if ! cmp -s "$file" "$file.new"; then
        printf '%s\n' "$url"
    fi

    # Keep the fetched copy around for the next run.
    mv -f "$file.new" "$file"

    i=$(( i + 1 ))
done <url-list.txt


This would read the URLs from url-list.txt, line by line, and use curl to fetch each one, saving the output in a file called data-N.out.new, where N is an integer (the ordinal number of the URL in the list).



If there is no old data-N.out file, or if this file differs from data-N.out.new, then the URL is printed to standard output.
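
As a quick illustration of the cmp -s behaviour the test relies on (a throwaway sketch, not part of the script; a and b are scratch files):

printf 'hello\n' > a
printf 'hello\n' > b
cmp -s a b && echo identical    # exit status 0: contents match
printf 'world\n' > b
cmp -s a b || echo changed      # non-zero: contents differ (or a file is missing)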



The fetched data file is then renamed, so that it is available for comparison the next time the script runs.
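
To get the "since the last run" behaviour automatically, you could run the script periodically, e.g. from cron. A hypothetical crontab entry (the directory and script name are placeholders) might look like:

# Run hourly; cd first so url-list.txt and the data-*.out state
# files resolve relative to one fixed directory.
0 * * * * cd /path/to/checker && ./check-pages.sh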



The first time you run the script, all URLs will be output, as none of them has been seen before.



Reordering the URLs, or adding new URLs at the top, would cause URLs to be flagged as changed, since the contents of the corresponding data files would have changed. You could fix this by using e.g. the base64-encoded URL as part of the output filename instead of $i.
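
A minimal sketch of that variation, assuming a base64(1) utility with GNU's -w option is available (everything else is unchanged from the script above):

#!/bin/sh

while IFS= read -r url; do
    # Derive a stable filename from the URL itself; -w 0 disables
    # line wrapping (GNU base64), and tr maps "/" and "+" (valid in
    # base64 output, awkward in filenames) to "_" and "-".
    name=$(printf '%s' "$url" | base64 -w 0 | tr '/+' '_-')
    file="data-$name.out"

    curl -o "$file.new" "$url"

    if ! cmp -s "$file" "$file.new"; then
        printf '%s\n' "$url"
    fi

    mv -f "$file.new" "$file"
done <url-list.txt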



Whether you use curl, wget, or some other web client is essentially unimportant.
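
For instance, the fetch line could be swapped for wget without changing anything else; -q and -O are standard wget options:

# -q silences wget's progress output; -O writes the page to the
# given file instead of a name derived from the URL.
wget -q -O "$file.new" "$url"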

answered Nov 23 at 12:00 (edited Nov 23 at 12:35)
Kusalananda

  • This question is a duplicate. Vote to open the old one, and put answer there. (And maybe edit question to make it clear.)
    – ctrl-alt-delor
    Nov 23 at 12:03

  • Yes that would work. Note there are some edits of the other question (just grammar, punctuation).
    – ctrl-alt-delor
    Nov 23 at 12:18