Should we use UTF-8 characters like ⏰ in bash/shell script?
The simple code here works as expected on my machine when launched with bash:
function ⏰(){
date
}
⏰
Could there be a problem for other people using this, or is it universal?
I'm wondering because I've never seen anything like this in other source code so far.
Edit: There are unlimited possibilities; for example, an emoji can be used to quickly distinguish a function's role.
A 💣 for something that can modify or remove files, a 🔧 for a work in progress, a 📃 for an interactive menu...
I guess we would need to create a standard for all of that, but it seems to be an interesting idea.
Maybe a line of ~5 characters could help us a lot in understanding what the code is doing. (Of course we would need to learn how to read them.)
More edit: I'm giving it a shot. For now, if I fold all my functions in my editor (or run cat myscript.sh|grep function) they look like this. (My Unicode looks much better in geany or my terminal than it does here.)
function ⬚_1(){
function ⬚⬚_2(){
function ⬚⬚⬚_📃_D(){
function ⬚⬚⬚⬚_📃_X(){
function ⬚⬚⬚⬚⬚_📃_Y(){
function ⬚⬚⬚⬚⬚⬚_❓_P(){
function ⬚⬚⬚⬚_📃_Z(){
function ⬚⬚⬚⬚⬚_❓_U(){
function ⬚⬚⬚⬚⬚_❓_O(){
I use a strange indentation ⬚ to show how the functions are related to each other and a symbol 📃/❓ to clearly distinguish their role. (Of course these are not my real function names, I just put a random letter at the end, but even without them we can clearly see the relationships.)
bash shell unicode
7
I'd say it's unsafe for backward-compatibility reasons: if you have to use your script on an old server, it might not work, as bash emoji support is recent. But it's probably OK on recent Linux.
– Kiwy
2 days ago
17
@Ipor no, it stands for Unicode (and the “Uni” in Unicode stands for universal).
– Stephen Kitt
2 days ago
4
How "universal" do you want universal to be? Works on Cygwin, with the usual UTF-8 vs. UTF-16 problems? On modern IBM z/OS system services, which still have to deal with the EBCDIC charset? On historical Unix computers which don't use 8-bit bytes as smallest unit? The POSIX restriction is there for a reason...
– dirkt
2 days ago
6
The names of functions must be made up of characters from the portable character set, according to POSIX. If "universal" means "any shell", then it would not be universal in this sense.
– Kusalananda
2 days ago
5
If you find yourself asking whether it is safe to do <whatever> in a shell script, the answer is most probably no. Heck, not even doing echo $foo is safe.
– Matteo Italia
2 days ago
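To illustrate the echo $foo remark, here is a minimal sketch. The variable value is made up for the demonstration. An unquoted $foo goes through word splitting and pathname expansion, and echo itself may treat a leading -n as an option, so what gets printed depends on the shell and on the current directory. Quoting the variable and using printf avoids both.

```shell
#!/bin/sh
# Hypothetical value chosen to trigger the usual pitfalls:
# a leading "-n", runs of spaces, and a glob character.
foo='-n  two   spaces  *'

# Unquoted: the value is split on whitespace, "*" may expand to the file
# names in the current directory, and many echo implementations treat
# "-n" as an option. The result is not the original string.
echo $foo

# Quoted, with printf: the value is printed exactly as stored.
printf '%s\n' "$foo"
```

The second form is the one to prefer whenever the content of the variable is not fully under your control.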
edited yesterday
asked 2 days ago
bob dylan
1 Answer
accepted
A useful guideline for this is the "Portable Operating System Interface" (POSIX), a family of standards that is implemented by most Unix-like systems. It is usually a good idea to limit shell scripts to features mandated by POSIX to make sure they will be usable across different shells and platforms.
According to the POSIX specification of function definitions in the "Shell Command Language":
The function is named fname; the application shall ensure that it is a name (see the Base Definitions volume of IEEE Std 1003.1-2001, Section 3.230, Name). An implementation may allow other characters in a function name as an extension.
Following the link to the definition of a "name":
In the shell command language, a word consisting solely of underscores, digits, and alphabetics from the portable character set.
That character set contains only characters between U+0000 and U+007E.
Therefore characters like "⏰" (U+23F0) are not valid in a POSIX-compliant identifier.
Your shell might accept them, but that doesn't guarantee that others will as well.
To be able to use your script across different platforms and software versions, you should avoid using non-compliant identifiers like this.
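As a rough sketch of how one might check a function name against that definition: the helper below, is_posix_name, is a hypothetical name introduced here and is not part of any standard tool. It accepts only underscores, digits and ASCII letters, and it also rejects a leading digit, which the full POSIX definition of a name forbids as well. The LC_ALL=C assignment is an assumption made so that ranges such as A-Z cover ASCII letters only.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if $1 is a POSIX "name".
# The body runs in a subshell so the locale change does not leak out.
is_posix_name() (
    LC_ALL=C   # match bytes, so A-Z and a-z mean ASCII letters only
    case $1 in
        # empty, starts with a digit, or contains any other character
        '' | [0-9]* | *[!A-Za-z0-9_]*) exit 1 ;;
        *) exit 0 ;;
    esac
)

# Example names; only the first and last are portable.
for name in show_clock ⏰ 2fast menu_D; do
    if is_posix_name "$name"; then
        printf '%s: portable\n' "$name"
    else
        printf '%s: not a POSIX name\n' "$name"
    fi
done
```

A plain ASCII name such as show_clock passes the check, so it can stand in for ⏰ where the script has to run under shells other than your own.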
15
Good rule of thumb... if your standard keyboard doesn't have a key for it... don't use it.
– SnakeDoc
2 days ago
6
@SnakeDoc youtube.com/watch?v=3AtBE9BOvvk "standard" emoji keyboard ;)
– Jorn
2 days ago
8
@Jorn Maybe I should have said "if you can't buy the keyboard from a normal retail store"... lol
– SnakeDoc
2 days ago
4
@SnakeDoc It's a good start - but the keyboard I am typing this on has a key for £, €, and ¬ all of which are outside the portable character set. More seriously, some colleagues have keyboards with ä, ö, ü, è, é, and ß on them. They are all letters but are not good for portable function names.
– Martin Bonner
yesterday
2
POSIX-compliant but not POSIX-limited?
– bob dylan
yesterday
edited 2 days ago
answered 2 days ago
n.st