Can I use a variable storing a regular expression wherever a regular expression is expected?























In Awk, when I store a regular expression in a variable, can I use the variable wherever a regular expression is expected?



The AWK Programming Language by Aho says




Note that the string-matching pattern



/Asia/ 


is a shorthand for



$0 ~ /Asia/



I have a text file:



$ cat f1
line 1; li
ne
2
line 3
lin
e 4


Why do the following two ways work



$ awk -v pat='in' '{if (match($0, pat)) print $0; } ' f1
line 1; li
line 3
lin
$ awk -v pat='in' ' $0 ~ pat {print $0} ' f1
line 1; li
line 3
lin


while the following doesn't



$ awk -v pat='in' ' pat {print $0} ' f1
line 1; li
ne
2
line 3
lin
e 4


?



Thanks.


































  • You can't replace the syntax /pattern/ with a variable pat.
    – Kusalananda
    Nov 15 at 20:40










  • Is there a rule governing that?
    – Tim
    Nov 15 at 20:43










  • @Tim The grammar for the language disallows it. What you have in your non-working example is an expression that evaluates to true (it's non-zero), therefore all lines are printed.
    – Kusalananda
    Nov 15 at 20:48















awk














edited Nov 15 at 20:43

























asked Nov 15 at 20:36









Tim























1 Answer



accepted










Only /foo/ alone is short for $0 ~ /foo/.



In ... ~ /.../ or match(..., /.../), the slashes are only some form of quoting operator for regexps, while in other contexts /.../ is more an operator that resolves to a number (0 or 1).



That double meaning can be a bit confusing. There are a lot of those double meanings / ambiguities in awk.



/foo/ expands to 1 or 0 depending on whether $0 matches the foo regexp or not, but "1" ~ /foo/ is not "1" ~ "1" when $0 happens to match foo: here /foo/ is no longer short for ($0 ~ /foo/). In the case of "1" ~ (/foo/) or "1" ~ +/foo/, you'll see the behaviour varies between implementations though.
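As a quick check of those two roles, here is a minimal sketch (the variable names bare and op are made up for the illustration). It assigns /foo/ on its own to a variable, where it should stand for $0 ~ /foo/, and compares that with "1" ~ /foo/, where /foo/ is only the regexp matched against the string "1":

```shell
# Two records: the first one matches foo, the second one does not.
out=$(printf 'foo\nbar\n' | awk '{
    bare = /foo/          # /foo/ alone: result of $0 ~ /foo/ (1 or 0)
    op = ("1" ~ /foo/)    # /foo/ as the right-hand operand of ~: matched against "1"
    printf "%d %d\n", bare, op
}')
printf '%s\n' "$out"
```

The first column follows the input record (1 for foo, 0 for bar), while the second stays at 0 because the string "1" never matches foo.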



var is only var.



var as a condition is true if the variable is numeric or a numeric string and resolves to a number other than zero, or if it's a string and is non-empty.



Variables declared with -v var=value are among those that may be considered numeric strings: they are treated as numbers if they look like numbers, and as strings otherwise.



awk -v var=in 'var {print "x"}'


prints x for every record because in doesn't look like a number and is not the empty string.



awk -v var=0 'var {print "x"}'


would not print x, while:



awk 'BEGIN{var = "0"}; var {print "x"}'


would print x for every record as var was explicitly declared as a string variable. So even though it looks like a number, it's not considered as such.
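To tie this back to the question, here is a small sketch that counts how many records each form selects. It assumes the sample file f1 from the question, recreated here with printf so the snippet runs on its own; the variable names input, all and matched are only for the illustration:

```shell
# Sample data copied from the question's f1 (six short lines).
input=$(printf 'line 1; li\nne\n2\nline 3\nlin\ne 4\n')

# pat alone is a non-empty, non-numeric string: the condition is true for every record.
all=$(printf '%s\n' "$input" | awk -v pat='in' 'pat { n++ } END { print n + 0 }')

# $0 ~ pat uses the value of pat as a regular expression matched against the record.
matched=$(printf '%s\n' "$input" | awk -v pat='in' '$0 ~ pat { n++ } END { print n + 0 }')

echo "pat alone: $all records; \$0 ~ pat: $matched records"
```

With that data, pat alone selects all 6 records, while $0 ~ pat selects only the 3 that contain "in".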



That's another one of those double meanings. A variable may be considered as numerical or string depending on context. See also > which, depending on context, is taken as a comparison operator or a redirection operator (which again has several ambiguous situations where the behaviour varies between implementations).



Note that you can also do things like:



awk '{print /foo/ + /bar/}'


Which is the same as:



awk '{print ($0 ~ /foo/) + ($0 ~ /bar/)}'


But if using concatenation instead of +:



awk '{print /foo/ /bar/}'


that doesn't work as there's again an ambiguity between the /RE/ operator and the / division operator. When in doubt, use parens:



awk '{print (/foo/) (/bar/)}'


By the way, you should avoid using -v to store regexps or anything that may contain backslashes, as ANSI C escape sequences (such as \n or \\) are expanded in them. Instead, you should use environment variables:



RE='\.txt$' awk '$0 ~ ENVIRON["RE"] {...}'


for instance.
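The difference can be seen with a value containing \t, an escape that every awk is expected to expand in -v assignments. This is only a sketch; the names s, S, via_v and via_env are made up for the illustration:

```shell
# -v processes backslash escapes: \t is turned into a real tab character.
via_v=$(awk -v s='a\tb' 'BEGIN { print s }')

# ENVIRON hands the value over untouched: the backslash and the t stay as two characters.
via_env=$(S='a\tb' awk 'BEGIN { print ENVIRON["S"] }')

printf '%s\n' "$via_v" "$via_env"
```

The first line printed contains a tab between a and b; the second still contains the literal backslash, which is what you want when the value is meant to be a regexp.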





























  • Thanks. (1) If I am correct, in a pattern-action statement, the pattern can be an expression, which may or may not be a regular expression. So using a variable as a pattern is not using it where only a regular expression is expected. That's the reason it doesn't work. (2) For RE='\.txt$' awk '$0 ~ ENVIRON["RE"] {...}', can I just use awk -v RE='\\.txt$' '$0 ~ RE {...}' (doubling the backslash) equally well?
    – Tim
    Nov 15 at 21:14













edited Nov 15 at 22:43

























answered Nov 15 at 20:51









Stéphane Chazelas












