Quickselect algorithm in Swift











up vote
9
down vote

favorite
1












I recently answered a question
on Stack Overflow about finding the k-largest element in an array, and
present my implementation of the Quickselect algorithm in Swift for review.



This is essentially the iterative version described in
Wikipedia: Quickselect, only that the
paritioning does not move the pivot element to the front of the second partition.
That saves some swap operations but requires an additional check when updating the lower bound.



The language is Swift 3, compiled with Xcode 8.1.



extension Array where Element: Comparable {

public func kSmallest(_ k: Int) -> Element {
precondition(1 <= k && k <= count, "k must be in the range 1...count")

var a = self // A mutable copy.
var low = startIndex
var high = endIndex

while high - low > 1 {

// Choose random pivot element:
let pivotElement = a[low + Int(arc4random_uniform(UInt32(high - low)))]

// Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low
while a[pivotIndex] < pivotElement {
pivotIndex += 1
}
for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}

if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k == pivotIndex + 1 {
// Pivot element is the k-smallest:
return pivotElement
} else {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
}
}

// Only single candidate left:
return a[low]
}

public func kLargest(_ k: Int) -> Element {
return kSmallest(count + 1 - k)
}
}


Examples:



let a = [ 9, 7, 6, 3, 4, 2, 5, 1, 8 ]
for k in 1 ... a.count {
print(a.kSmallest(k))
}
// 1 2 3 4 5 6 7 8 9

let b = [ "b", "a", "c", "h" ]
print(b.kLargest(2))
// "c"


Feedback on all aspects of the code is welcome, such as (but not limited to):




  • Possible improvements (speed, clarity, swiftyness, ...).

  • Naming. In particular: what would be a better name for kSmallest(_ k: Int)
    in consideration of the
    Swift API Design Guidelines?

  • The check if a[low] == pivotElement looks artificial, but without that
    an infinite loop can occur, e.g. for an array with all elements equal.
    Is there a better solution?




Remark: To make this code compile with Swift 4 (or later), just replace the line



swap(&a[pivotIndex], &a[i])


with



a.swapAt(pivotIndex, i)









share|improve this question
























  • Don't you need to manipulate k before iterating the 2nd partition? (See wikipedia talk page.)
    – greybeard
    Nov 5 '16 at 8:54










  • @greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
    – Martin R
    Nov 12 '16 at 10:53















up vote
9
down vote

favorite
1












I recently answered a question
on Stack Overflow about finding the k-largest element in an array, and
present my implementation of the Quickselect algorithm in Swift for review.



This is essentially the iterative version described in
Wikipedia: Quickselect, only that the
paritioning does not move the pivot element to the front of the second partition.
That saves some swap operations but requires an additional check when updating the lower bound.



The language is Swift 3, compiled with Xcode 8.1.



extension Array where Element: Comparable {

public func kSmallest(_ k: Int) -> Element {
precondition(1 <= k && k <= count, "k must be in the range 1...count")

var a = self // A mutable copy.
var low = startIndex
var high = endIndex

while high - low > 1 {

// Choose random pivot element:
let pivotElement = a[low + Int(arc4random_uniform(UInt32(high - low)))]

// Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low
while a[pivotIndex] < pivotElement {
pivotIndex += 1
}
for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}

if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k == pivotIndex + 1 {
// Pivot element is the k-smallest:
return pivotElement
} else {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
}
}

// Only single candidate left:
return a[low]
}

public func kLargest(_ k: Int) -> Element {
return kSmallest(count + 1 - k)
}
}


Examples:



let a = [ 9, 7, 6, 3, 4, 2, 5, 1, 8 ]
for k in 1 ... a.count {
print(a.kSmallest(k))
}
// 1 2 3 4 5 6 7 8 9

let b = [ "b", "a", "c", "h" ]
print(b.kLargest(2))
// "c"


Feedback on all aspects of the code is welcome, such as (but not limited to):




  • Possible improvements (speed, clarity, swiftyness, ...).

  • Naming. In particular: what would be a better name for kSmallest(_ k: Int)
    in consideration of the
    Swift API Design Guidelines?

  • The check if a[low] == pivotElement looks artificial, but without that
    an infinite loop can occur, e.g. for an array with all elements equal.
    Is there a better solution?




Remark: To make this code compile with Swift 4 (or later), just replace the line



swap(&a[pivotIndex], &a[i])


with



a.swapAt(pivotIndex, i)









share|improve this question
























  • Don't you need to manipulate k before iterating the 2nd partition? (See wikipedia talk page.)
    – greybeard
    Nov 5 '16 at 8:54










  • @greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
    – Martin R
    Nov 12 '16 at 10:53













up vote
9
down vote

favorite
1









up vote
9
down vote

favorite
1






1





I recently answered a question
on Stack Overflow about finding the k-largest element in an array, and
present my implementation of the Quickselect algorithm in Swift for review.



This is essentially the iterative version described in
Wikipedia: Quickselect, only that the
paritioning does not move the pivot element to the front of the second partition.
That saves some swap operations but requires an additional check when updating the lower bound.



The language is Swift 3, compiled with Xcode 8.1.



extension Array where Element: Comparable {

public func kSmallest(_ k: Int) -> Element {
precondition(1 <= k && k <= count, "k must be in the range 1...count")

var a = self // A mutable copy.
var low = startIndex
var high = endIndex

while high - low > 1 {

// Choose random pivot element:
let pivotElement = a[low + Int(arc4random_uniform(UInt32(high - low)))]

// Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low
while a[pivotIndex] < pivotElement {
pivotIndex += 1
}
for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}

if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k == pivotIndex + 1 {
// Pivot element is the k-smallest:
return pivotElement
} else {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
}
}

// Only single candidate left:
return a[low]
}

public func kLargest(_ k: Int) -> Element {
return kSmallest(count + 1 - k)
}
}


Examples:



let a = [ 9, 7, 6, 3, 4, 2, 5, 1, 8 ]
for k in 1 ... a.count {
print(a.kSmallest(k))
}
// 1 2 3 4 5 6 7 8 9

let b = [ "b", "a", "c", "h" ]
print(b.kLargest(2))
// "c"


Feedback on all aspects of the code is welcome, such as (but not limited to):




  • Possible improvements (speed, clarity, swiftyness, ...).

  • Naming. In particular: what would be a better name for kSmallest(_ k: Int)
    in consideration of the
    Swift API Design Guidelines?

  • The check if a[low] == pivotElement looks artificial, but without that
    an infinite loop can occur, e.g. for an array with all elements equal.
    Is there a better solution?




Remark: To make this code compile with Swift 4 (or later), just replace the line



swap(&a[pivotIndex], &a[i])


with



a.swapAt(pivotIndex, i)









share|improve this question















I recently answered a question
on Stack Overflow about finding the k-largest element in an array, and
present my implementation of the Quickselect algorithm in Swift for review.



This is essentially the iterative version described in
Wikipedia: Quickselect, only that the
paritioning does not move the pivot element to the front of the second partition.
That saves some swap operations but requires an additional check when updating the lower bound.



The language is Swift 3, compiled with Xcode 8.1.



extension Array where Element: Comparable {

public func kSmallest(_ k: Int) -> Element {
precondition(1 <= k && k <= count, "k must be in the range 1...count")

var a = self // A mutable copy.
var low = startIndex
var high = endIndex

while high - low > 1 {

// Choose random pivot element:
let pivotElement = a[low + Int(arc4random_uniform(UInt32(high - low)))]

// Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low
while a[pivotIndex] < pivotElement {
pivotIndex += 1
}
for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}

if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k == pivotIndex + 1 {
// Pivot element is the k-smallest:
return pivotElement
} else {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
}
}

// Only single candidate left:
return a[low]
}

public func kLargest(_ k: Int) -> Element {
return kSmallest(count + 1 - k)
}
}


Examples:



let a = [ 9, 7, 6, 3, 4, 2, 5, 1, 8 ]
for k in 1 ... a.count {
print(a.kSmallest(k))
}
// 1 2 3 4 5 6 7 8 9

let b = [ "b", "a", "c", "h" ]
print(b.kLargest(2))
// "c"


Feedback on all aspects of the code is welcome, such as (but not limited to):




  • Possible improvements (speed, clarity, swiftyness, ...).

  • Naming. In particular: what would be a better name for kSmallest(_ k: Int)
    in consideration of the
    Swift API Design Guidelines?

  • The check if a[low] == pivotElement looks artificial, but without that
    an infinite loop can occur, e.g. for an array with all elements equal.
    Is there a better solution?




Remark: To make this code compile with Swift 4 (or later), just replace the line



swap(&a[pivotIndex], &a[i])


with



a.swapAt(pivotIndex, i)






algorithm swift swift3






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 15 at 13:22

























asked Nov 1 '16 at 21:14









Martin R

15.2k12261




15.2k12261












  • Don't you need to manipulate k before iterating the 2nd partition? (See wikipedia talk page.)
    – greybeard
    Nov 5 '16 at 8:54










  • @greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
    – Martin R
    Nov 12 '16 at 10:53


















  • Don't you need to manipulate k before iterating the 2nd partition? (See wikipedia talk page.)
    – greybeard
    Nov 5 '16 at 8:54










  • @greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
    – Martin R
    Nov 12 '16 at 10:53
















Don't you need to manipulate k before iterating the 2nd partition? (See wikipedia talk page.)
– greybeard
Nov 5 '16 at 8:54




Don't you need to manipulate k before iterating the 2nd partition? (See wikipedia talk page.)
– greybeard
Nov 5 '16 at 8:54












@greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
– Martin R
Nov 12 '16 at 10:53




@greybeard: No, all indices (low, high, pivotIndex) always refer to the original array indices. I am fairly sure that the method works correctly. Of course I may have overlooked a special case, but I tested it with many random and constructed arrays.
– Martin R
Nov 12 '16 at 10:53










2 Answers
2






active

oldest

votes

















up vote
2
down vote













Let's start by updating this line to Swift 4.2 :



let pivotElement = a[Int.random(in: low..<high)]


In my tests, Int.random(in:) is way faster than Int(arc4random_uniform), and the comparisons won't take that into consideration.





Efficiency



There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:



if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}


The first two conditions are equiprobable. The least probable case is left for last.



Benchmarks



The benchmarking code is the following:



let a = Array(1...10_000).shuffled()

var timings: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")

var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")


It prints the average time for looking up one kth smallest element.



kSmallest is the original, kSmallest2 is the new one. They both operate on the same array a to ensure fairness.



kSmallest2 is up to 7μs faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms execution time gain for a 10.000-element array:



kSmallest    1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)


In the worst case, in my tests, kSmallest2 may rarely be 2μs slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.






share|improve this answer























  • @MartinR With an array like this: [1, 2, 3, 2, 3, 4, 2] Should the 2 be considered the 2nd smallest, or the 3rd, or the fourth, or all of them?
    – Carpsen90
    2 days ago










  • In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1<=k<=a.count. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
    – Martin R
    2 days ago












  • Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
    – Martin R
    2 days ago












  • @MartinR Yes it supposes that all elements are unique. If not, an endless loop will be entered. I needed the clarification from your previous comment. I'll work on something later.
    – Carpsen90
    2 days ago




















up vote
1
down vote













There are a lot of loops in here. I would probably turn at least one of them into a recursive function; actually, refactoring high low and pivotIndex into let constants with recursion should be faster due to optimizations. (I'm not that good of an FP programmer to mock something up for you quickly).



Otherwise, one suggestion I'd make (just my preference), you have two major loops smooshed next to each other, perhaps another empty line could help readability:



            // Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low

while a[pivotIndex] < pivotElement {
pivotIndex += 1
}

for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}


I would definitely try refactoring some of this to FP, then running it through some measureBlock()s in Xcode to see if it's faster.



Hope this helps!






share|improve this answer























  • Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
    – Martin R
    Nov 4 '16 at 15:11












  • @MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space)-- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1should be sped up with let and recursion.
    – Fluidity
    Nov 4 '16 at 15:33












  • I hope my info isn't wrong xD I'm certainly still learning... :)
    – Fluidity
    Nov 4 '16 at 15:43










  • It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
    – Martin R
    Nov 6 '16 at 15:40










  • I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
    – Martin R
    Nov 12 '16 at 10:55











Your Answer





StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f145877%2fquickselect-algorithm-in-swift%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























2 Answers
2






active

oldest

votes








2 Answers
2






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
2
down vote













Let's start by updating this line to Swift 4.2 :



let pivotElement = a[Int.random(in: low..<high)]


In my tests, Int.random(in:) is way faster than Int(arc4random_uniform), and the comparisons won't take that into consideration.





Efficiency



There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:



if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}


The first two conditions are equiprobable. The least probable case is left for last.



Benchmarks



The benchmarking code is the following:



let a = Array(1...10_000).shuffled()

var timings: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")

var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")


It prints the average time for looking up one kth smallest element.



kSmallest is the original, kSmallest2 is the new one. They both operate on the same array a to ensure fairness.



kSmallest2 is up to 7μs faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms execution time gain for a 10.000-element array:



kSmallest    1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)


In the worst case, in my tests, kSmallest2 may rarely be 2μs slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.






share|improve this answer























  • @MartinR With an array like this: [1, 2, 3, 2, 3, 4, 2] Should the 2 be considered the 2nd smallest, or the 3rd, or the fourth, or all of them?
    – Carpsen90
    2 days ago










  • In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1<=k<=a.count. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
    – Martin R
    2 days ago












  • Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
    – Martin R
    2 days ago












  • @MartinR Yes it supposes that all elements are unique. If not, an endless loop will be entered. I needed the clarification from your previous comment. I'll work on something later.
    – Carpsen90
    2 days ago

















up vote
2
down vote













Let's start by updating this line to Swift 4.2 :



let pivotElement = a[Int.random(in: low..<high)]


In my tests, Int.random(in:) is way faster than Int(arc4random_uniform), and the comparisons won't take that into consideration.





Efficiency



There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:



if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}


The first two conditions are equiprobable. The least probable case is left for last.



Benchmarks



The benchmarking code is the following:



let a = Array(1...10_000).shuffled()

var timings: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")

var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")


It prints the average time for looking up one kth smallest element.



kSmallest is the original, kSmallest2 is the new one. They both operate on the same array a to ensure fairness.



kSmallest2 is up to 7μs faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms execution time gain for a 10.000-element array:



kSmallest    1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)


In the worst case, in my tests, kSmallest2 may rarely be 2μs slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.






share|improve this answer























  • @MartinR With an array like this: [1, 2, 3, 2, 3, 4, 2] Should the 2 be considered the 2nd smallest, or the 3rd, or the fourth, or all of them?
    – Carpsen90
    2 days ago










  • In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1<=k<=a.count. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
    – Martin R
    2 days ago












  • Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
    – Martin R
    2 days ago












  • @MartinR Yes it supposes that all elements are unique. If not, an endless loop will be entered. I needed the clarification from your previous comment. I'll work on something later.
    – Carpsen90
    2 days ago















up vote
2
down vote










up vote
2
down vote









Let's start by updating this line to Swift 4.2 :



let pivotElement = a[Int.random(in: low..<high)]


In my tests, Int.random(in:) is way faster than Int(arc4random_uniform), and the comparisons won't take that into consideration.





Efficiency



There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:



if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}


The first two conditions are equiprobable. The least probable case is left for last.



Benchmarks



The benchmarking code is the following:



let a = Array(1...10_000).shuffled()

var timings: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")

var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")


It prints the average time for looking up one kth smallest element.



kSmallest is the original, kSmallest2 is the new one. They both operate on the same array a to ensure fairness.



kSmallest2 is up to 7μs faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms execution time gain for a 10.000-element array:



kSmallest    1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)


In the worst case, in my tests, kSmallest2 may rarely be 2μs slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.






share|improve this answer














Let's start by updating this line to Swift 4.2 :



let pivotElement = a[Int.random(in: low..<high)]


In my tests, Int.random(in:) is way faster than Int(arc4random_uniform), and the comparisons won't take that into consideration.





Efficiency



There is a small improvement to this algorithm, but it still gives a performance gain by rearranging the conditions from the most probable to the least, in order to take advantage of shortcut execution:



if k <= pivotIndex {
// k-smallest element is in the first partition:
high = pivotIndex
} else if k > pivotIndex + 1 {
// k-smallest element is in the second partition
low = pivotIndex
if a[low] == pivotElement {
low += 1
}
} else {
// Pivot element is the k-smallest:
return pivotElement
}


The first two conditions are equiprobable. The least probable case is left for last.



Benchmarks



The benchmarking code is the following:



let a = Array(1...10_000).shuffled()

var timings: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest(k)
let end = mach_absolute_time()
timings.append(Double(end - start)/Double(1e3))
}
let average: Double = timings.reduce(0, +) / Double(timings.count)
print(average, "us")

var timings2: [Double] =
for k in 1 ... a.count {
let start = mach_absolute_time()
_ = a.kSmallest2(k)
let end = mach_absolute_time()
timings2.append(Double(end - start)/Double(1e3))
}
let average2: Double = timings2.reduce(0, +) / Double(timings2.count)
print(average2, "us")


It prints the average time for looking up one kth smallest element.



kSmallest is the original, kSmallest2 is the new one. They both operate on the same array a to ensure fairness.



kSmallest2 is up to 7μs faster per lookup. The fluctuation is due to the randomness of the arrangement of the elements of the array. Which translates into up to ~70ms execution time gain for a 10.000-element array:



kSmallest    1.215636265 s (total time)
kSmallest2 1.138085315 s (total time)


In the worst case, in my tests, kSmallest2 may rarely be 2μs slower per lookup, and it is to be blamed on the randomness of choosing a pivot. Comparisons should probabilistically favor the second version.







share|improve this answer














share|improve this answer



share|improve this answer








edited yesterday

























answered 2 days ago









Carpsen90

1798




1798












  • @MartinR With an array like this: [1, 2, 3, 2, 3, 4, 2] Should the 2 be considered the 2nd smallest, or the 3rd, or the fourth, or all of them?
    – Carpsen90
    2 days ago










  • In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1<=k<=a.count. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
    – Martin R
    2 days ago












  • Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
    – Martin R
    2 days ago












  • @MartinR Yes it supposes that all elements are unique. If not, an endless loop will be entered. I needed the clarification from your previous comment. I'll work on something later.
    – Carpsen90
    2 days ago




















  • @MartinR With an array like this: [1, 2, 3, 2, 3, 4, 2] Should the 2 be considered the 2nd smallest, or the 3rd, or the fourth, or all of them?
    – Carpsen90
    2 days ago










  • In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1<=k<=a.count. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
    – Martin R
    2 days ago












  • Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
    – Martin R
    2 days ago












  • @MartinR Yes it supposes that all elements are unique. If not, an endless loop will be entered. I needed the clarification from your previous comment. I'll work on something later.
    – Carpsen90
    2 days ago


















@MartinR With an array like this: [1, 2, 3, 2, 3, 4, 2] Should the 2 be considered the 2nd smallest, or the 3rd, or the fourth, or all of them?
– Carpsen90
2 days ago




@MartinR With an array like this: [1, 2, 3, 2, 3, 4, 2] Should the 2 be considered the 2nd smallest, or the 3rd, or the fourth, or all of them?
– Carpsen90
2 days ago












In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1<=k<=a.count. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
– Martin R
2 days ago






In this context, the 2nd smallest element of an array is the second element after sorting the elements in increasing order, i.e. a.kSmallest(k) == a.sorted()[k-1] for 1<=k<=a.count. In your example, the 2nd, 3rd, and fourth largest elements are all equal to 2.
– Martin R
2 days ago














Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
– Martin R
2 days ago






Thanks for your suggestions. I will check them in more detail later, but in my first test [1, 2, 3, 2, 3, 4, 2].kSmallest3(2) never returned.
– Martin R
2 days ago














@MartinR Yes it supposes that all elements are unique. If not, an endless loop will be entered. I needed the clarification from your previous comment. I'll work on something later.
– Carpsen90
2 days ago






@MartinR Yes it supposes that all elements are unique. If not, an endless loop will be entered. I needed the clarification from your previous comment. I'll work on something later.
– Carpsen90
2 days ago














up vote
1
down vote













There are a lot of loops in here. I would probably turn at least one of them into a recursive function; actually, refactoring high low and pivotIndex into let constants with recursion should be faster due to optimizations. (I'm not that good of an FP programmer to mock something up for you quickly).



Otherwise, one suggestion I'd make (just my preference), you have two major loops smooshed next to each other, perhaps another empty line could help readability:



            // Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low

while a[pivotIndex] < pivotElement {
pivotIndex += 1
}

for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}


I would definitely try refactoring some of this to FP, then running it through some measureBlock()s in Xcode to see if it's faster.



Hope this helps!






share|improve this answer























  • Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
    – Martin R
    Nov 4 '16 at 15:11












  • @MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space)-- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1should be sped up with let and recursion.
    – Fluidity
    Nov 4 '16 at 15:33












  • I hope my info isn't wrong xD I'm certainly still learning... :)
    – Fluidity
    Nov 4 '16 at 15:43










  • It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
    – Martin R
    Nov 6 '16 at 15:40










  • I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
    – Martin R
    Nov 12 '16 at 10:55















up vote
1
down vote













There are a lot of loops in here. I would probably turn at least one of them into a recursive function; actually, refactoring high low and pivotIndex into let constants with recursion should be faster due to optimizations. (I'm not that good of an FP programmer to mock something up for you quickly).



Otherwise, one suggestion I'd make (just my preference), you have two major loops smooshed next to each other, perhaps another empty line could help readability:



            // Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low

while a[pivotIndex] < pivotElement {
pivotIndex += 1
}

for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}


I would definitely try refactoring some of this to FP, then running it through some measureBlock()s in Xcode to see if it's faster.



Hope this helps!






share|improve this answer























  • Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
    – Martin R
    Nov 4 '16 at 15:11












  • @MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space)-- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1should be sped up with let and recursion.
    – Fluidity
    Nov 4 '16 at 15:33












  • I hope my info isn't wrong xD I'm certainly still learning... :)
    – Fluidity
    Nov 4 '16 at 15:43










  • It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
    – Martin R
    Nov 6 '16 at 15:40










  • I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
    – Martin R
    Nov 12 '16 at 10:55













up vote
1
down vote










up vote
1
down vote









There are a lot of loops in here. I would probably turn at least one of them into a recursive function; actually, refactoring high low and pivotIndex into let constants with recursion should be faster due to optimizations. (I'm not that good of an FP programmer to mock something up for you quickly).



Otherwise, one suggestion I'd make (just my preference), you have two major loops smooshed next to each other, perhaps another empty line could help readability:



            // Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low

while a[pivotIndex] < pivotElement {
pivotIndex += 1
}

for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}


I would definitely try refactoring some of this to FP, then running it through some measureBlock()s in Xcode to see if it's faster.



Hope this helps!






share|improve this answer














There are a lot of loops in here. I would probably turn at least one of them into a recursive function; actually, refactoring high low and pivotIndex into let constants with recursion should be faster due to optimizations. (I'm not that good of an FP programmer to mock something up for you quickly).



Otherwise, one suggestion I'd make (just my preference), you have two major loops smooshed next to each other, perhaps another empty line could help readability:



            // Partition elements such that:
// a[i] < pivotElement for low <= i < pivotIndex,
// a[i] >= pivotElement for pivotIndex <= i < high.
var pivotIndex = low

while a[pivotIndex] < pivotElement {
pivotIndex += 1
}

for i in pivotIndex+1 ..< high {
if a[i] < pivotElement {
swap(&a[pivotIndex], &a[i])
pivotIndex += 1
}
}


I would definitely try refactoring some of this to FP, then running it through some measureBlock()s in Xcode to see if it's faster.



Hope this helps!







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 4 '16 at 15:44

























answered Nov 4 '16 at 14:53









Fluidity

15811




15811












  • Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
    – Martin R
    Nov 4 '16 at 15:11












  • @MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space)-- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1should be sped up with let and recursion.
    – Fluidity
    Nov 4 '16 at 15:33












  • I hope my info isn't wrong xD I'm certainly still learning... :)
    – Fluidity
    Nov 4 '16 at 15:43










  • It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
    – Martin R
    Nov 6 '16 at 15:40










  • I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
    – Martin R
    Nov 12 '16 at 10:55


















  • Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
    – Martin R
    Nov 4 '16 at 15:11












  • @MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space)-- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1should be sped up with let and recursion.
    – Fluidity
    Nov 4 '16 at 15:33












  • I hope my info isn't wrong xD I'm certainly still learning... :)
    – Fluidity
    Nov 4 '16 at 15:43










  • It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
    – Martin R
    Nov 6 '16 at 15:40










  • I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
    – Martin R
    Nov 12 '16 at 10:55
















Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
– Martin R
Nov 4 '16 at 15:11






Thank you for your answer! – I intentionally chose the iterative version in order to avoid that arrays/slices (which are values in Swift) are passed down and then copied when mutated. I am not sure if a recursive version would really be faster, but I haven't tried it.
– Martin R
Nov 4 '16 at 15:11














@MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space)-- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1should be sped up with let and recursion.
– Fluidity
Nov 4 '16 at 15:33






@MartinR from the limited testing I've done, let tends to be significantly faster when passing in values (especially ones that are in-scope such as with recursion). high and low are both out of scope. However, I think keeping a as a var would be a good idea, since swift doesn't have persistent collections (to my knowledge). Also, from what I've heard "copying" data doesn't actually happen (value) until the data has mutated (they share the same memory space)-- thus, it is essentially a reference. All of those calls to mutate pivotIndex + 1should be sped up with let and recursion.
– Fluidity
Nov 4 '16 at 15:33














I hope my info isn't wrong xD I'm certainly still learning... :)
– Fluidity
Nov 4 '16 at 15:43




I hope my info isn't wrong xD I'm certainly still learning... :)
– Fluidity
Nov 4 '16 at 15:43












It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
– Martin R
Nov 6 '16 at 15:40




It is correct that arrays (and array slices) are copy-on-write. But here the function does mutate the array (slice), which means that additional copies would be made in the nested calls.
– Martin R
Nov 6 '16 at 15:40












I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
– Martin R
Nov 12 '16 at 10:55




I am sorry, but in its present form ("I would probably turn at least one of them into a recursive function", "I would definitely try refactoring some of this to FP") this answer is too vague to award the bounty.
– Martin R
Nov 12 '16 at 10:55


















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f145877%2fquickselect-algorithm-in-swift%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Morgemoulin

Scott Moir

Souastre