Hex code for '(' in bash regex












I see strange behaviour in the shell.

When I try to match '_' in a regex using its hex code, it works, but the same approach fails for '('.



$ regex1=$'\x5f'
$ pattern1='_'
$ if [[ $pattern1 =~ $regex1 ]]; then echo yes; else echo no; fi
yes

$ regex2=$'\x28'
$ pattern2='('
$ if [[ $pattern2 =~ $regex2 ]]; then echo yes; else echo no; fi
no


Can you explain this behaviour?










      bash shell regular-expression ascii hex






      asked Dec 19 '18 at 15:03 by CLB
      edited Dec 19 '18 at 15:11 by ilkkachu






















          1 Answer














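          regex2=$'\x28' is exactly equivalent to regex2='(': the shell processes the $'...' quotes when the assignment happens, so the regex engine only ever sees the literal character. That is harmless for _, which has no special meaning in a regex, but ( by itself is an invalid regex, so [[ =~ ]] reports an error by returning an exit status of 2: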



          $ re='('; [[ "(" =~ $re ]]; echo "$?"
          2


          (Of course within an if statement you can't tell the difference between an exit code of 1 for "no match" and a 2 for "error", but it's there.)
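To make that distinction visible, the exit status can be captured right after the test. The following is only a minimal sketch: `regex_status` is a hypothetical helper name, and the status values 0, 1 and 2 are the ones described in this answer.

```bash
#!/usr/bin/env bash
# regex_status is a hypothetical helper, not part of Bash.
# It runs [[ string =~ regex ]] and names the resulting exit status.
regex_status() {
    local string=$1 re=$2 status
    [[ $string =~ $re ]]
    status=$?
    case $status in
        0) echo "match" ;;
        1) echo "no match" ;;
        2) echo "invalid regex" ;;
        *) echo "unexpected status $status" ;;
    esac
}

regex_status '(' '('      # lone parenthesis: the regex does not compile
regex_status '(' '[(]'    # bracket expression: valid regex that matches
regex_status '(' '_'      # valid regex that simply does not match
```

Capturing `$?` in a variable before the `case` matters, because any later command would overwrite it.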



          You need to escape the opening parenthesis in the regex:



          $ re='\('; [[ "(" =~ $re ]] && echo match
          match


          or put it in a bracket expression:



          $ re='[(]'; [[ "(" =~ $re ]] && echo match
          match


          In a quick test, Bash's regexes don't support hex or octal character escapes, so re='\050' or re='\x28' do not work.
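A quick way to check what the regex engine actually receives is to print the variable after the assignment. This is a small sketch with made-up variable names (`re_ansi`, `re_plain`, `re_bracket`); it only illustrates the quoting difference discussed above.

```bash
#!/usr/bin/env bash
re_ansi=$'\x28'    # ANSI-C quoting: the shell turns \x28 into ( at assignment
re_plain='\x28'    # single quotes: the four characters \ x 2 8 are kept as-is

# printf only interprets escapes in its format string, not in the arguments,
# so the variables are shown exactly as stored.
printf 'ansi:  %s (length %d)\n' "$re_ansi"  "${#re_ansi}"
printf 'plain: %s (length %d)\n' "$re_plain" "${#re_plain}"

# Wrapping the hex escape in brackets keeps the hex notation in the script
# while the regex engine receives the harmless bracket expression [(].
re_bracket=$'[\x28]'
[[ '(' =~ $re_bracket ]] && echo "bracket form matches"
```

The first variable has length 1 and the second length 4, which shows that the hex code is gone before `[[ =~ ]]` ever runs.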






          • Hello, thanks for the quick answer. I have a very long regex with matches on hex codes (\x2c for example), which work. But none of these is processed by the shell, which is why I had no error before. To have a clear regex, what's your advice? Replace every \x.. with its escaped ASCII character, or something else? I read that replacing all characters with their hex values is a best practice, because it consumes less CPU on systems that process a lot of regexes.
            – CLB
            Dec 19 '18 at 15:26












          • @CLB, what I was trying to say is that if you assign re=$'...\x2c...' or whatever, then the regex will not contain the hex code; it will contain the literal character. Also, that's the only way to do it, since in Bash, [[ "," =~ \x2c ]] will not match, but [[ "x2c" =~ \x2c ]] matches. That is, the hex code \x2c isn't interpreted by the regex engine. Only the $'...' quoting processes it.
            – ilkkachu
            Dec 19 '18 at 15:43










          • Which means that you need to escape the characters that need escaping, while making sure not to escape anything that turns into something special when escaped (i.e. \( to match a literal left parenthesis, but watch for others, like \w). It may be easier to just put brackets around any characters you want to take as literals, e.g. [(] or $'[\x28]'. That should actually work with any character; the special cases are ] and -, but I think they both work as []] and [-].
            – ilkkachu
            Dec 19 '18 at 15:47
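The bracket-wrapping idea from this comment can be sketched as a small helper. `regex_literal` is a hypothetical name, not a Bash builtin, and the special handling of `^` and the backslash is an assumption added here: `[^]` would not be a valid bracket expression, so those two characters are backslash-escaped instead.

```bash
#!/usr/bin/env bash
# regex_literal is a hypothetical helper. It turns a string into a regex
# that matches the string literally by wrapping each character in a
# bracket expression.
regex_literal() {
    local input=$1 out='' ch i
    for (( i = 0; i < ${#input}; i++ )); do
        ch=${input:i:1}
        case $ch in
            '^')  out+='\^' ;;     # assumption: [^] is not a usable set
            '\')  out+='\\' ;;     # assumption: escape the backslash itself
            *)    out+="[$ch]" ;;  # covers []] and [-] from the comment
        esac
    done
    printf '%s\n' "$out"
}

re=$(regex_literal 'f(x)+[y]')
printf 'regex: %s\n' "$re"
[[ 'f(x)+[y]' =~ ^${re}$ ]] && echo "literal match"
```

The regex variable is left unquoted on the right-hand side of `=~`, because quoting it there would make Bash treat the whole thing as a plain string rather than a regex.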










          • @ilkkachu The user previously tried to use PCRE with bash. I believe that their encoded characters are coming from there. Extended regular expressions do not support characters encoded in that way.
            – Kusalananda
            Dec 19 '18 at 16:43











          answered Dec 19 '18 at 15:16 by ilkkachu











