Replace AWORD or BWORD with CWORD in sed
So I have a list of random websites of the following kind:
rapido21655bonk.a.sweetpotato.net
rapido26230bonk.a.sourpotato.net
rapido29926bonk.b.sourpotato.net
rapido29926bonk.b.sweetpotato.net
rapido30179bonk.a.sweetpotato.net
rapido30648bonk.b.sourpotato.net
rapido30761bonk.c.sweetpotato.net
Now I need a sed string to only leave the number, and take everything else out. What I did was:
sed s/rapido//
to get rid of the first part of it. For the second part, I could use sed twice to get rid of them both, but I want to know if I can use some kind of OR logic to remove both in one sed. I know I can use sed to match a or b or c using [abc], but I want something like that for whole words. So what I did after this was:
sed s/rapido//|sed s/bonk.[abc].sweetpotato.net//
and then I would put another one with just sourpotato.net, but I can't seem to do the following:
sed s/rapido//|sed s/bonk.[abc].(sweet|sour)potato.net//
This doesn't work. It gives me this:
-bash: syntax error near unexpected token `('
Only keeping the number does not work, because sometimes I might get stuff like rapido22452boonkers.red, which I would want to still have there. I would want to ONLY remove the 2 alternatives sweetpotato.net OR sourpotato.net.
[111@111 ~]$ sed s/rapido// sedster|sed 's/bonk.[abc].(sweetpotato|sourpotato).net//'
21655bonk.a.sweetpotato.net
26230bonk.a.sourpotato.net
29926bonk.b.sourpotato.net
29926bonk.b.sweetpotato.net
30179bonk.a.sweetpotato.net
30648bonk.b.sourpotato.net
30761bonk.c.sweetpotato.net
text-processing sed
edited 1 hour ago
Rui F Ribeiro
asked 3 hours ago
sweetsourpotato
3 Answers
With
sed -r 's/([^0-9]*)([0-9]*)([^0-9]*)/\2/g'
you can keep only the number in the middle. The unescaped parentheses only work with extended regular expressions, so you need the -r option to sed.
Actually, it suffices to use
sed -r 's/([^0-9]*)([0-9]*)(.*)/\2/g'
This uses back-references: \1, \2, ... refer to parts of the expression. You have to put parentheses (...) around each part of the expression you want to reference. In the above code, the second part, ([0-9]*), will match the number in the middle, and you can refer to it by \2.
Edit: As terdon pointed out, we don't need to capture the initial part since we don't use it again. So
sed -n -r 's/[^0-9]*([0-9]+).*/\1/p'
is enough.
To summarize, the command above keeps only the first number in your input line.
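As a quick check of the back-reference idea, here is a minimal sketch. The helper name first_number is made up for illustration, and it assumes a sed that accepts -E (GNU and BSD sed both do; GNU sed treats it the same as -r).

```shell
# first_number is a hypothetical helper, not part of sed.
# It prints the first run of digits on each input line and skips
# lines that contain no digits at all.
first_number() {
    # [^0-9]*   : skip leading non-digits
    # ([0-9]+)  : capture the first run of digits as group 1
    # .*        : swallow the rest of the line
    # \1        : replace the whole match with group 1
    sed -n -E 's/[^0-9]*([0-9]+).*/\1/p'
}

printf '%s\n' 'rapido21655bonk.a.sweetpotato.net' 'no digits here' | first_number
```

Note that this would also reduce a line such as rapido22452boonkers.red to 22452, which is exactly what the question wants to avoid.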
edited 2 hours ago
answered 3 hours ago
Stefan Hamcke
I don't understand this part of it though: /\2/. I want it replaced with nothing, like //
– sweetsourpotato
3 hours ago
@sweetsourpotato It works, just try it, for example: echo ab234bc | sed -r 's/([^0-9]*)([0-9]*)(.*)/\2/g'
– Stefan Hamcke
3 hours ago
There's no reason to capture the groups you won't be using, but you do need + or [0-9][0-9]* so it won't match when there are no numbers. You also don't want the g there as far as I can tell. This should be enough: sed -r 's/[^0-9]*([0-9]+)(.*)/\1/'
– terdon♦
3 hours ago
What I don't understand is why I get a syntax error if I use (word1|word2)
– sweetsourpotato
3 hours ago
@sweetsourpotato we can't help you if you don't show your syntax error. If you used the command from your updated question, the error is because you're not quoting the sed pattern. Note how both this answer and my own have sed 'blah blah' and not just sed blah blah.
– terdon♦
3 hours ago
If you just want to extract the numbers, you can do this with GNU grep:
$ grep -oP '\d+' file
21655
26230
29926
29926
30179
30648
30761
Or, portably with perl:
$ perl -pe 's/[^\d\n]+//g' file
21655
26230
29926
29926
30179
30648
30761
Or sed:
$ sed -nE 's/[^0-9]+//gp' file
21655
26230
29926
29926
30179
30648
30761
If you need something more specific to your input data, you can try:
$ sed -nE 's/.*rapido([0-9]+)bonk...(sweet|sour)potato.net.*/\1/p' file
21655
26230
29926
29926
30179
30648
30761
edited 3 hours ago
answered 3 hours ago
terdon♦
Thanks for the help, but only keeping the number doesn't help me, because sometimes I might get stuff like rapido22452boonkers.red, which I would want to still have there. I would want to ONLY remove the 2 alternatives sweetpotato.net OR sourpotato.net.
– sweetsourpotato
3 hours ago
@sweetsourpotato see update. But please remember to make that sort of thing clear when asking a question. Ideally, you need to ask showing an example of your input data that covers all possible cases and the output you want from it.
– terdon♦
3 hours ago
Thank you, but what does this do: /\1/p? I mean, using (word1|word2) gave me a bash error, so will that help?
– sweetsourpotato
3 hours ago
@sweetsourpotato what error? Did you use the exact command? The \1 refers to the 1st captured group, the ([0-9]+), so it will replace the entire match with what was captured. The p means "only print if the substitution was successful" so any lines not matching the pattern will be skipped (the -n means "don't print unless I tell you to"). Your question suggested you only want the numbers, so I wanted to skip any lines that didn't match. If that's not what you want, remove the -n and the p flag and edit your question to clarify.
– terdon♦
3 hours ago
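To make the effect of -n and p concrete, the sketch below runs the same substitution with and without them. The sample lines come from the question; the function names are made up for illustration, and the dots are escaped here (an assumption, not necessarily what the answer used) so that they match only literal dots.

```shell
# Two hypothetical wrappers around the same substitution.

# With -n and the p flag: only lines where the substitution
# succeeded are printed; everything else is dropped.
only_matches() {
    sed -nE 's/.*rapido([0-9]+)bonk\.[abc]\.(sweet|sour)potato\.net.*/\1/p'
}

# Without -n and p: every line is printed, changed or not.
keep_all() {
    sed -E 's/.*rapido([0-9]+)bonk\.[abc]\.(sweet|sour)potato\.net.*/\1/'
}

sample='rapido21655bonk.a.sweetpotato.net
rapido22452boonkers.red'

printf '%s\n' "$sample" | only_matches   # prints the number from the first line only
printf '%s\n' "$sample" | keep_all       # also prints the second line unchanged
```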
Your attempt
sed s/rapido// | sed s/bonk.[abc](sweet|sour)potato.net//
was actually pretty close, but you made two mistakes. The first is that you didn't put the command inside quotes, so bash
interpreted the special characters "(" and "|". (The fact that you got a bash error message should have tipped you off to this).
The second mistake is more subtle. Sed
and grep
use basic regular expressions, in which only a few characters ( . * ^ $ [ ] ) have special meaning. If you want to use extended regex operators ( | () {} ), you need to precede them with a backslash. So here's what your command should have looked like:
sed < t 's/rapido//' | sed 's/bonk.[abc].(sweet|sour)potato.net//'
and since sed
can handle multiple commands in one run, you can simplify this to
sed < t 's/rapido//; s/bonk.[abc].(sweet|sour)potato.net//'
Your attempt
sed s/rapido// | sed s/bonk.[abc](sweet|sour)potato.net//
was actually pretty close, but you made two mistakes. The first is that you didn't put the command inside quotes, so bash
interpreted the special characters "(" and "|". (The fact that you got a bash error message should have tipped you off to this).
The second mistake is more subtle. Sed
and grep
use basic regular expressions, in which only a few characters ( . * ^ $ [ ] ) have special meaning. If you want to use extended regex operators ( | () {} ), you need to precede them with a backslash. So here's what your command should have looked like:
sed < t 's/rapido//' | sed 's/bonk.[abc].(sweet|sour)potato.net//'
and since sed
can handle multiple commands in one run, you can simplify this to
sed < t 's/rapido//; s/bonk.[abc].(sweet|sour)potato.net//'
answered 18 mins ago
AndyB