Bash sort array according to length of elements?
Given an array of strings, I would like to sort the array according to the length of each element.
For example...
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
Should sort to...
"the longest string in the list"
"also a medium string"
"medium string"
"middle string"
"short string"
"tiny string"
(As a bonus, it would be nice if strings of the same length were sorted alphabetically. In the above example, "medium string" is sorted before "middle string" even though they are the same length. But that's not a hard requirement if it overcomplicates the solution.)
It is OK if the array is sorted in-place (i.e. "array" is modified) or if a new sorted array is created.
bash shell-script sort array
Some interesting answers over here; you should be able to adapt one to test for string length as well: stackoverflow.com/a/30576368/2876682
– frostschutz
Nov 17 at 20:20
asked Nov 17 at 20:11
PJ Singh
6 Answers
10 votes, accepted
If the strings don't contain newlines, the following should work. It sorts the indices of the array by the length, using the strings themselves as the secondary sort criterion.
#!/bin/bash
array=(
    "tiny string"
    "the longest string in the list"
    "middle string"
    "medium string"
    "also a medium string"
    "short string"
)
expected=(
    "the longest string in the list"
    "also a medium string"
    "medium string"
    "middle string"
    "short string"
    "tiny string"
)
# Emit "index length string" for each element, sort by length (numeric,
# descending), then by the string itself, and keep only the indices.
indexes=( $(
    for i in "${!array[@]}" ; do
        printf '%s %s %s\n' "$i" "${#array[i]}" "${array[i]}"
    done | sort -k2,2nr -k3 | cut -f1 -d' '
))
for i in "${indexes[@]}" ; do
    sorted+=("${array[i]}")
done
# No output from diff means the result matches the expectation.
diff <(echo "${expected[@]}") \
     <(echo "${sorted[@]}")
Note that moving to a real programming language can greatly simplify the solution, e.g. in Perl, you can just
sort { length $b <=> length $a or $a cmp $b } @array
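As a side note on the sort invocation in the bash script above: ordering flags can be attached to individual keys, so the length key can be numeric and descending while the string key stays in plain ascending order. A minimal sketch with hand-written sample records (the records and variable names are made up for illustration), assuming a POSIX-style sort:

```bash
#!/usr/bin/env bash
# Sample "index length string" records, in the shape the loop above emits.
lines=$'0 13 middle string\n1 13 medium string\n2 30 the longest string in the list'

# -k2,2nr : second field only, numeric, reversed (longest first)
# -k3     : from the third field to end of line, ascending, as the tie-breaker
order=$(printf '%s\n' "$lines" | LC_ALL=C sort -k2,2nr -k3 | cut -d' ' -f1)

# Print the indices in sorted order, one per line.
printf '%s\n' "$order"
```

Flags given without a key (for example a leading -nr) act as global defaults for every key that has no flags of its own, which is why attaching them per key is the less surprising form.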
In Python: sorted(array, key=lambda s: (len(s), s))
– wjandrea
Nov 18 at 16:47
In Ruby: array.sort { |a| a.size }
– Dmitry Kudriavtsev
Nov 19 at 0:07
8 votes
readarray -t array < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\n' "${#str}" "$str"
    done | sort -k 1,1nr -k 2 | cut -f 2- )
This reads the values of the sorted array from a process substitution.
The process substitution contains a loop. The loop outputs each element of the array, prepended by the element's length and a tab character.
The output of the loop is sorted numerically by length from largest to smallest (and alphabetically if the lengths are the same; use -k 2r in place of -k 2 to reverse the alphabetical order), and the result is sent to cut, which deletes the column with the string lengths.
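To make the decoration step concrete, this sketch prints what sort receives for a few sample strings. The decorate function name is only illustrative and not part of the answer.

```bash
#!/usr/bin/env bash
array=("tiny string" "the longest string in the list" "middle string")

# Prefix each argument with its length and a tab, one record per line.
decorate() {
    local str
    for str in "$@"; do
        printf '%d\t%s\n' "${#str}" "$str"
    done
}

decorated=$(decorate "${array[@]}")
printf '%s\n' "$decorated"
```

Each record starts with the numeric length, so sort can use field 1 as the primary key and cut can drop it again afterwards.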
Sort test script followed by a test run:
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
readarray -t array < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\n' "${#str}" "$str"
    done | sort -k 1,1nr -k 2 | cut -f 2- )
printf '%s\n' "${array[@]}"
$ bash script.sh
the longest string in the list
also a medium string
medium string
middle string
short string
tiny string
This assumes that the strings do not contain newlines. On GNU systems with a recent bash, you can support embedded newlines in the data by using the nul character as the record separator instead of newline:
readarray -d '' -t array < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\0' "${#str}" "$str"
    done | sort -z -k 1,1nr -k 2 | cut -z -f 2- )
Here, the data is printed with a trailing \0 in the loop instead of a newline, sort and cut read nul-delimited lines through their GNU -z options, and readarray finally reads the nul-delimited data with -d ''.
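A quick way to check the nul-delimited variant is to feed it an element with an embedded newline. This is a sketch assuming bash 4.4 or later and GNU sort and cut; the sample data and the sorted variable name are made up for illustration.

```bash
#!/usr/bin/env bash
array=("short" $'two\nlines here' "mid size")

# Same pipeline as above, but every record ends in NUL instead of newline.
readarray -d '' -t sorted < <(
    for str in "${array[@]}"; do
        printf '%d\t%s\0' "${#str}" "$str"
    done | sort -z -k 1,1nr -k 2 | cut -z -f 2- )

# declare -p shows each element unambiguously, including the newline.
declare -p sorted
```

The element containing the newline should stay a single array element rather than being split in two.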
Note that -d '\0' is in fact -d '' as bash can't pass NUL characters to commands, even its builtins. But it does understand -d '' as meaning delimit on NUL. Note that you need bash 4.4+ for that.
– Stéphane Chazelas
Nov 18 at 8:29
@StéphaneChazelas No, it is not '\0', it is $'\0'. And yes, it converts (almost exactly) to ''. But that is a way to communicate to other readers the actual intent of using a NUL delimiter.
– Isaac
Nov 18 at 15:11
@Isaac Sorry, which edit?
– Kusalananda
Nov 18 at 15:14
@Isaac Are you suggesting I plagiarised it from your solution?
– Kusalananda
Nov 18 at 16:10
@Isaac Ah, ok. Yes, there are many things that you really can only do in a limited number of ways.
– Kusalananda
Nov 18 at 18:21
4 votes
I won't completely repeat what I've already said about sorting in bash; in short, you can sort within bash, but maybe you shouldn't. Below is a bash-only implementation of an insertion sort, which is O(n^2) and so is only tolerable for small arrays. It sorts the array elements in place by their length, in decreasing order. It does not do a secondary alphabetical sort.
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
function sort_inplace {
    local i j tmp ivalue jvalue
    for ((i=0; i <= ${#array[@]} - 2; i++))
    do
        for ((j=i + 1; j <= ${#array[@]} - 1; j++))
        do
            ivalue=${#array[i]}
            jvalue=${#array[j]}
            # Arithmetic comparison; [[ ... < ... ]] would compare as strings
            if (( ivalue < jvalue ))
            then
                tmp=${array[i]}
                array[i]=${array[j]}
                array[j]=$tmp
            fi
        done
    done
}
echo Initial:
declare -p array
sort_inplace
echo Sorted:
declare -p array
As evidence that this is a specialized solution, consider the timings of the existing three answers on various size arrays:
# 6 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.018s ## already 4 times slower!
# 1000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.021s ## up to 5 times slower, now!
# 5000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.019s
# 10000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.006s
Jeff: 0m0.020s
# 99000 elements
Choroba: 0m0.015s
Kusalananda: 0m0.012s
Jeff: 0m0.119s
Choroba and Kusalananda have the right idea: compute the lengths once and use dedicated utilities for sorting and text processing.
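One detail matters when comparing lengths inside bash itself: within [[ ]], the < operator compares strings, while (( )) compares integers. A small sketch of the difference (the variable names are illustrative):

```bash
#!/usr/bin/env bash
# String comparison: "9" sorts after "10" because '9' > '1'.
if [[ 9 < 10 ]]; then string_result=true; else string_result=false; fi

# Arithmetic comparison: 9 is less than 10.
if (( 9 < 10 )); then numeric_result=true; else numeric_result=false; fi

printf 'string: %s, numeric: %s\n' "$string_result" "$numeric_result"
```

With two-digit lengths only, as in the sample array, both forms happen to agree, so the difference shows up only once lengths with different digit counts are mixed.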
4 votes
A hackish (complex) but fast one-line way to sort the array by length (safe for newlines and sparse arrays):
#!/bin/bash
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
"test * string"
"*"
"?"
"[abc]"
)
readarray -td $'\0' sorted < <(
    for i in "${in[@]}"
    do printf '%s %s\0' "${#i}" "$i";
    done |
        sort -bz -k1,1rn -k2 |
        cut -zd " " -f2-
)
printf '%s\n' "${sorted[@]}"
On one line:
readarray -td $'\0' sorted < <(for i in "${in[@]}";do printf '%s %s\0' "${#i}" "$i"; done | sort -bz -k1,1rn -k2 | cut -zd " " -f2-)
On execution
$ ./script
the longest
string also containing
newlines
also a medium string
medium string
middle string
test * string
short string
tiny string
[abc]
?
*
4 votes
This also handles array elements with newlines in them; it works by passing through sort only the length and the index of each element. It should work with bash and ksh.
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
)
out=()
unset IFS
for a in $(for i in ${!in[@]}; do echo ${#in[i]}/$i; done | sort -rn); do
    out+=("${in[${a#*/}]}")
done
for a in "${out[@]}"; do printf '"%s"\n' "$a"; done
If the elements of the same length also have to be sorted lexicographically, the loop could be changed like this:
IFS='
'
for a in $(for i in ${!in[@]}; do printf '%s\n' "$i ${#in[i]} ${in[i]//$IFS/ }"; done | sort -k 2,2nr -k 3 | cut -d' ' -f1); do
    out+=("${in[$a]}")
done
This will also pass the strings to sort (with newlines changed to spaces), but they would still be copied from the source to the destination array by their indexes. In both examples, the $(...) will see only lines containing numbers (and the / character in the first example), so it won't be tripped up by globbing characters or spaces in the strings.
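The length/index trick from the first loop can be looked at in isolation. In this sketch (the sample data and the keys, order, first and idx names are made up for illustration), sort only ever sees keys such as 12/1, and the ${first#*/} expansion strips everything up to the slash to recover the index.

```bash
#!/usr/bin/env bash
in=("bb" $'a\nlonger one' "cccc")

# Build "length/index" keys; the strings themselves never reach sort.
keys=()
for i in "${!in[@]}"; do
    keys+=("${#in[i]}/$i")
done

# sort -rn orders by the leading number, largest first.
order=$(printf '%s\n' "${keys[@]}" | sort -rn)

first=${order%%$'\n'*}   # first line of the sorted keys
idx=${first#*/}          # drop the "length/" prefix, leaving the index
printf 'longest element is at index %s\n' "$idx"
```

Because only digits and a slash pass through the pipeline, newlines or glob characters inside the elements cannot affect the sort.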
Cleaned comments. Now it breaks if in contains something like "testing * here" and shopt -s nullglob (and/or some others) gets set in the script before the for loop. I'll insist: quote your expansions, avoid the pain.
– Isaac
Nov 18 at 15:23
Cannot reproduce. In the second example, the $(...) command substitution sees only the indexes (a list of numbers separated by newlines), because of the cut -d' ' -f1 after the sort. This could be easily demonstrated by a tee /dev/tty at the end of the $(...).
– mosvy
2 days ago
Sorry, my bad, I missed the cut.
– Stéphane Chazelas
2 days ago
@Isaac There's no need to quote the ${!in[@]} or ${#in[i]}/$i variable expansions because they only contain digits which are not subject to glob expansion and the unset IFS will reset the IFS to space, tab, newline. In fact, quoting them would be harmful, because it will give the false impression that such quoting is useful and effective, and that the setting of IFS and/or filtering the output of sort in the second example could be safely done away with.
– mosvy
11 hours ago
@Isaac It does NOT break if in contains "testing * here" and shopt -s nullglob is set before the loop.
– mosvy
11 hours ago
3 votes
In case switching to zsh is an option, a hackish way there (for arrays containing any sequence of bytes):
array=('' blah $'x\ny\nz' $'x\0y' '1 2 3')
sorted_array=( /(e'{reply=("$array[@]")}'nOe'{REPLY=$#REPLY}') )
zsh allows defining sort orders for its glob expansion via glob qualifiers. So here, we're tricking it into doing that for arbitrary arrays by globbing on /, but replacing / with the elements of the array (e'{reply=("$array[@]")}') and then numerically ordering (n, in reverse with uppercase O) the elements based on their length (Oe'{REPLY=$#REPLY}').
Note that it's based on the length in number of characters. For number of bytes, set the locale to C (LC_ALL=C).
Another bash 4.4+ approach (assuming not too big an array):
readarray -td '' sorted_array < <(
perl -l0 -e 'print for sort {length $b <=> length $a} @ARGV
' -- "${array[@]}")
(that's length in bytes).
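bash's own ${#var} counts characters in the current locale, so the two notions of length can differ for multibyte text. A sketch, assuming the string is UTF-8 encoded (the variable names are illustrative):

```bash
#!/usr/bin/env bash
s=$'caf\xc3\xa9'    # "café" in UTF-8: 4 characters, 5 bytes

chars=${#s}                               # depends on the current locale
bytes=$(LC_ALL=C; printf '%s' "${#s}")    # C locale: every byte is one character

printf 'chars=%s bytes=%s\n' "$chars" "$bytes"
```

The LC_ALL=C assignment is made inside the command substitution so that it does not leak into the rest of the script.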
With older versions of bash, you could always do:
eval "sorted_array=($(
    perl -l0 -e 'for (sort {length $b <=> length $a} @ARGV) {
      '"s/'/'\\\\''/g"'; printf " '\''%s'\''", $_}' -- "${array[@]}"
))"
(which would also work with ksh93, zsh, yash, mksh).
Accepted answer (sorting the array indices): answered Nov 17 at 20:21 by choroba, edited Nov 17 at 20:29.
Answer using readarray with sort and cut: answered Nov 17 at 20:36 by Kusalananda, edited Nov 18 at 8:32.
3
Note that-d ''
is in fact-d ''
asbash
can't pass NUL characters to commands, even its builtins. But it does understand-d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.
– Stéphane Chazelas
Nov 18 at 8:29
@StéphaneChazelas No, it is not''
, it is$''
. And yes, it converts (almost exactly) to''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.
– Isaac
Nov 18 at 15:11
@Isaac Sorry, which edit?
– Kusalananda
Nov 18 at 15:14
@Isaac Are you suggesting I plagiarised it from your solution?
– Kusalananda
Nov 18 at 16:10
@Isaac Ah, ok. Yes, there are many things that you really can only do in a limited number of ways.
– Kusalananda
Nov 18 at 18:21
|
show 1 more comment
3
Note that-d ''
is in fact-d ''
asbash
can't pass NUL characters to commands, even its builtins. But it does understand-d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.
– Stéphane Chazelas
Nov 18 at 8:29
@StéphaneChazelas No, it is not''
, it is$''
. And yes, it converts (almost exactly) to''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.
– Isaac
Nov 18 at 15:11
@Isaac Sorry, which edit?
– Kusalananda
Nov 18 at 15:14
@Isaac Are you suggesting I plagiarised it from your solution?
– Kusalananda
Nov 18 at 16:10
@Isaac Ah, ok. Yes, there are many things that you really can only do in a limited number of ways.
– Kusalananda
Nov 18 at 18:21
3
3
Note that
-d ''
is in fact -d ''
as bash
can't pass NUL characters to commands, even its builtins. But it does understand -d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.– Stéphane Chazelas
Nov 18 at 8:29
Note that
-d ''
is in fact -d ''
as bash
can't pass NUL characters to commands, even its builtins. But it does understand -d ''
as meaning delimit on NUL. Note that you need bash 4.4+ for that.– Stéphane Chazelas
Nov 18 at 8:29
@StéphaneChazelas No, it is not
''
, it is $''
. And yes, it converts (almost exactly) to ''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.– Isaac
Nov 18 at 15:11
@StéphaneChazelas No, it is not
''
, it is $''
. And yes, it converts (almost exactly) to ''
. But that is a way to comunicate to other readers the actual intent of using a NUL delimiter.– Isaac
Nov 18 at 15:11
@Isaac Sorry, which edit?
– Kusalananda
Nov 18 at 15:14
@Isaac Sorry, which edit?
– Kusalananda
Nov 18 at 15:14
@Isaac Are you suggesting I plagiarised it from your solution?
– Kusalananda
Nov 18 at 16:10
@Isaac Are you suggesting I plagiarised it from your solution?
– Kusalananda
Nov 18 at 16:10
@Isaac Ah, ok. Yes, there are many things that you really can only do in a limited number of ways.
– Kusalananda
Nov 18 at 18:21
@Isaac Ah, ok. Yes, there are many things that you really can only do in a limited number of ways.
– Kusalananda
Nov 18 at 18:21
|
show 1 more comment
up vote
4
down vote
I won't completely repeat what I've already said about sorting in bash, just you can sort within bash, but maybe you shouldn't. Below is a bash-only implementation of an insertion sort, which is O(n2), and so is only tolerable for small arrays. It sorts the array elements in-place by their length, in decreasing order. It does not do a secondary alphabetical sort.
array=(
"tiny string"
"the longest string in the list"
"middle string"
"medium string"
"also a medium string"
"short string"
)
function sort_inplace {
local i j tmp
for ((i=0; i <= ${#array[@]} - 2; i++))
do
for ((j=i + 1; j <= ${#array[@]} - 1; j++))
do
local ivalue jvalue
ivalue=${#array[i]}
jvalue=${#array[j]}
if [[ $ivalue < $jvalue ]]
then
tmp=${array[i]}
array[i]=${array[j]}
array[j]=$tmp
fi
done
done
}
echo Initial:
declare -p array
sort_inplace
echo Sorted:
declare -p array
As evidence that this is a specialized solution, consider the timings of the existing three answers on various size arrays:
# 6 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.018s ## already 4 times slower!
# 1000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.021s ## up to 5 times slower, now!
5000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.004s
Jeff: 0m0.019s
# 10000 elements
Choroba: 0m0.004s
Kusalananda: 0m0.006s
Jeff: 0m0.020s
# 99000 elements
Choroba: 0m0.015s
Kusalananda: 0m0.012s
Jeff: 0m0.119s
Choroba and Kusalananda have the right idea: compute the lengths once and use dedicated utilities for sorting and text processing.
edited Nov 18 at 0:34
answered Nov 17 at 23:47
Jeff Schaller
A hackish (complex) but fast one-line way to sort the array by length
(safe for newlines and sparse arrays):
#!/bin/bash
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
"test * string"
"*"
"?"
"[abc]"
)
readarray -td $'\0' sorted < <(
for i in "${in[@]}"
do printf '%s %s\0' "${#i}" "$i";
done |
sort -bz -k1,1rn -k2 |
cut -zd " " -f2-
)
printf '%s\n' "${sorted[@]}"
On one line:
readarray -td $'\0' sorted < <(for i in "${in[@]}";do printf '%s %s\0' "${#i}" "$i"; done | sort -bz -k1,1rn -k2 | cut -zd " " -f2-)
On execution:
$ ./script
the longest
string also containing
newlines
also a medium string
medium string
middle string
test * string
short string
tiny string
[abc]
?
*
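To make the stages of that pipeline easier to follow, here is a small annotated sketch run on a sparse array. It assumes bash 4.4 or later (for readarray -d) and GNU sort and cut (for -z); the sample data is made up for illustration, and LC_ALL=C is added here only to make the ordering of ties predictable.

```bash
#!/usr/bin/env bash
# Assumptions: bash 4.4+ (readarray -d) and GNU sort/cut (-z option).
# Sparse sample array (indexes 3, 10, 42, 50), one element with a newline.
in=([3]="bb" [10]=$'a\nb c' [42]="*" [50]="dddd")

readarray -td '' sorted < <(
  for i in "${in[@]}"; do
    printf '%s %s\0' "${#i}" "$i"      # stage 1: "length string", NUL-terminated
  done |
  LC_ALL=C sort -bz -k1,1rn -k2 |      # stage 2: length descending, then the string
  cut -zd ' ' -f2-                     # stage 3: drop the leading length field
)

declare -p sorted
```

Because every record ends in a NUL byte rather than a newline, the embedded newline travels through sort and cut untouched, and readarray renumbers the result from index 0.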
edited Nov 18 at 15:36
answered Nov 18 at 2:39
Isaac
This also handles array elements with newlines in them; it works by passing through sort only the length and the index of each element. It should work with bash and ksh.
in=(
"tiny string"
"the longest
string also containing
newlines"
"middle string"
"medium string"
"also a medium string"
"short string"
)
out=()
unset IFS
for a in $(for i in ${!in[@]}; do echo ${#in[i]}/$i; done | sort -rn); do
out+=("${in[${a#*/}]}")
done
for a in "${out[@]}"; do printf '"%s"\n' "$a"; done
If the elements of the same length also have to be sorted lexicographically, the loop could be changed like this:
IFS='
'
for a in $(for i in ${!in[@]}; do printf '%s\n' "$i ${#in[i]} ${in[i]//$IFS/ }"; done | sort -k 2,2nr -k 3 | cut -d' ' -f1); do
out+=("${in[$a]}")
done
This will also pass the strings to sort (with newlines changed to spaces), but they would still be copied from the source to the destination array by their indexes. In both examples, the $(...) will see only lines containing numbers (and the / character in the first example), so it won't be tripped up by globbing characters or spaces in the strings.
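The claim that the command substitution only ever sees numbers can be checked by capturing the intermediate records before looping over them. The following is a small sketch of the first example with made-up sample data; the variable keys is introduced here purely for illustration.

```bash
#!/usr/bin/env bash
unset IFS                      # default splitting: space, tab, newline
in=("tiny string" $'two\nlines here' "a * glob" "mid")

# What passes through sort: one "length/index" record per element.
keys=$(for i in ${!in[@]}; do echo ${#in[i]}/$i; done | sort -rn)
printf '%s\n' "$keys"

out=()
for a in $keys; do             # unquoted on purpose: only digits and "/"
  out+=("${in[${a#*/}]}")      # ${a#*/} strips "length/", leaving the index
done
declare -p out
```

The strings themselves, including the one containing * and the one containing a newline, never leave the shell; they are only copied from in to out by index.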
edited 2 days ago
answered Nov 18 at 1:34
mosvy
Cleaned comments. Now it breaks if in contains something like "testing * here" and shopt -s nullglob (and/or some others) gets set in the script before the for loop. I'll insist: quote your expansions, avoid the pain.
– Isaac
Nov 18 at 15:23
Cannot reproduce. In the second example, the $(...) command substitution sees only the indexes (a list of numbers separated by newlines), because of the cut -d' ' -f1 after the sort. This could be easily demonstrated by a tee /dev/tty at the end of the $(...).
– mosvy
2 days ago
Sorry, my bad, I missed the cut.
– Stéphane Chazelas
2 days ago
@Isaac There's no need to quote the ${!in[@]} or ${#in[i]}/$i variable expansions because they only contain digits, which are not subject to glob expansion, and the unset IFS will reset IFS to space, tab, newline. In fact, quoting them would be harmful, because it would give the false impression that such quoting is useful and effective, and that the setting of IFS and/or the filtering of the output of sort in the second example could be safely done away with.
– mosvy
11 hours ago
@Isaac It does NOT break if in contains "testing * here" and shopt -s nullglob is set before the loop.
– mosvy
11 hours ago
In case switching to zsh is an option, a hackish way there (for arrays containing any sequence of bytes):
array=('' blah $'x\ny\nz' $'x\0y' '1 2 3')
sorted_array=( /(e'{reply=("$array[@]")}'nOe'{REPLY=$#REPLY}') )
zsh allows defining sort orders for its glob expansion via glob qualifiers. So here, we're tricking it into doing that for arbitrary arrays by globbing on /, but replacing / with the elements of the array (e'{reply=("$array[@]")}') and then numerically (n) ordering (o, in reverse with uppercase O) the elements based on their length (Oe'{REPLY=$#REPLY}').
Note that it's based on the length in number of characters. For number of bytes, set the locale to C (LC_ALL=C).
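The same character-versus-byte distinction applies to bash's ${#var}, which counts characters according to the current locale. As a rough sketch, the byte length can be obtained by switching to the C locale inside a function; byte_len is a hypothetical helper name, not something from the answers here.

```bash
#!/usr/bin/env bash
s=$'h\xc3\xa9llo'        # "héllo" encoded as UTF-8: 5 characters, 6 bytes

char_len=${#s}           # counted in characters of the current locale

byte_len() {             # hypothetical helper: length of $1 in bytes
  local LC_ALL=C         # in the C locale every byte is one character
  printf '%s\n' "${#1}"
}

echo "characters (locale-dependent): $char_len"
echo "bytes: $(byte_len "$s")"
```

In a UTF-8 locale the first number would typically be 5, while the byte count is 6 regardless of the locale the script is started in.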
Another bash 4.4+ approach (assuming not too big an array):
readarray -td '' sorted_array < <(
perl -l0 -e 'print for sort {length $b <=> length $a} @ARGV
' -- "${array[@]}")
(that's length in bytes).
With older versions of bash, you could always do:
eval "sorted_array=($(
perl -l0 -e 'for (sort {length $b <=> length $a} @ARGV) {
'"s/'/'\\\\''/g"'; printf " '\'%s\''", $_}' -- "${array[@]}"
))"
(which would also work with ksh93, zsh, yash, mksh).
edited 2 days ago
answered Nov 18 at 8:40
Stéphane Chazelas