[swift-evolution] [Discussion] Binary search, lower bound, upper bound, equal range

Pyry Jahkola pyry.jahkola at iki.fi
Tue Feb 16 06:12:53 CST 2016

>> On Mon Feb 15 2016, Jeff Hajewski <swift-evolution at swift.org> wrote:
>> I am working on SR-368 <https://bugs.swift.org/browse/SR-368> and am hoping to get some feedback from the community.

Definitely +1!

Shameless plug:

I've played with this idea in the past, implementing all of the four suggested binary searches as generic functions over index ranges or collections, using a custom 3-way comparator. The commented code and unit tests are found in https://github.com/knomi/Allsorts <https://github.com/knomi/Allsorts>. The library is also somewhat tested in production.

let i: Int        = ["a","b","c","c","c","d","f","f"].binarySearch("c")           //=> 2, 3 or 4
let j: Int?       = ["a","b","c","c","c","d","f","f"].binaryFind("e")             //=> nil
// Compare values to "e". `a <=> b` performs a 3-way comparison, returning .LT, .EQ or .GT
let k: Int        = ["a","b","c","c","c","d","f","f"].lowerBound {x in x <=> "e"} //=> 6
// Reverse order (the arguments of <=> are flipped):
let r: Range<Int> = ["e","e","d","c","c","c","b","a"].equalRange {x in "c" <=> x} //=> 3 ..< 6

I'd be happy to join forces in making the algorithms available in the standard library.

I believe that my code tackles many points Dave mentioned:

(1) You don't need to construct an element just to perform the search. You only need a function to tell if the search should proceed to the left or right, or if there's a match.

(2) You don't really need to a concrete collection either. A Range<Int> will do just as fine, as long as you're able to say how those indices should be compared to each other (usually by looking up somewhere else).

(3) The collection can be sorted by any comparator (which I call an "ordering"), as long as you pass that information along as the closure (the ordering).

(4) I use three-way comparisons to avoid extra work comparing e.g. strings.
Turns out comparisons defined that way also compose amazingly well, see first examples in README.
There's an operator "x <=> y" returning Ordering.LT, .EQ, or .GT instead of just Bool. However, that operator has little to do with the implementation of binary search.
(The topic of three-way comparisons was discussed earlier <https://lists.swift.org/pipermail/swift-evolution/Week-of-Mon-20160201/009127.html> in swift-evolution. I think it should find its way into the Comparable protocol directly, as an alternative to defining "<".)
(5) By convention, I do not return nil just because an index wasn't found. You might still want to use the index for the insertion of new elements. For convenience, there's binaryFind which is guaranteed to return an index exactly when there's a match, otherwise nil.

(6) There are convenience overloads for sorted collections of Generator.Element : Comparable so that you can give a value to look up.

(7) The project also contains more experimental stuff like a working (but slow?) port of libc++'s binary heap algorithms. Over time, those would be nice to port to Swift standard library as well.

The existing interface on GitHub is still preferring top-level functions. I'm modernising the interface into the following (plus the usual public, @warn_unused_result, @noescape, throws and rethrows dance). An already working implementation can be found in the protocol-extensions branch <https://github.com/knomi/Allsorts/blob/protocol-extensions/Allsorts/BinarySearch.swift>:

enum Ordering : Int { case LT = -1, EQ, GT } // NSComparisonResult, basically

extension CollectionType where Index : RandomAccessIndexType {
    func binaryFind(ordering: Generator.Element -> Ordering) -> Self.Index?
    func binarySearch(ordering: Generator.Element -> Ordering) -> Self.Index
    func lowerBound(ordering: Generator.Element -> Ordering) -> Self.Index
    func upperBound(ordering: Generator.Element -> Ordering) -> Self.Index
    func equalRange(ordering: Generator.Element -> Ordering) -> Range<Self.Index>

extension CollectionType where Index : RandomAccessIndexType, Generator.Element : Comparable {
    func binaryFind(value: Generator.Element) -> Self.Index?
    func binarySearch(value: Generator.Element) -> Self.Index
    func lowerBound(value: Generator.Element) -> Self.Index
    func upperBound(value: Generator.Element) -> Self.Index
    func equalRange(value: Generator.Element) -> Range<Self.Index>

> On 15 Feb 2016, at 23:34, Dave Abrahams via swift-evolution <swift-evolution at swift.org> wrote:

> Note that you don't need random access to get an advantage from binary search if the comparison is sufficiently expensive.

One current restriction is that I only define these functions over collections with random access. What it boils down to is the implementation of midIndex:

internal func midIndex<Ix : RandomAccessIndexType>(range: Range<Ix>) -> Ix {
    let fullDistance = range.startIndex.distanceTo(range.endIndex)
    return range.startIndex.advancedBy(fullDistance / 2)

That could certainly be written with weaker constraints as well; I just haven't checked how it affects to the interface in terms of duplication of code, and to the performance of the random access version.

— Pyry Jahkola

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160216/8a5cd012/attachment.html>

More information about the swift-evolution mailing list