[swift-evolution] [Proposal] Add Binary Search functions to SequenceType

Jeff Hajewski jeff.hajewski at gmail.com
Tue Mar 15 12:07:21 CDT 2016


This is something I've been working on as well. The issue I've been stuck
on is how we define the predicate and match the definitions of lower_bound
and upper_bound from C++. It seems to me that the focus should be on these
implementations, since once we have these a simple binary_search follows.
Specifically, the issue I'm encountering at the moment is handling
`!predicate`. If `predicate` is sorting on a rule like $0 < 3, for example,
then the complement is $0 >= 3, which has the effect of creating an upper
bound inclusive of 3. To be consistent with the C++ implementation, the
upper bound should be strictly greater than 3. In C++ they are able to get
around this issue by swapping the indices within the `comp` function (our
`predicate`) to eliminate the equality case (e.g.,  looking at `comp(i,j)`
and `comp(j,i)`). I don't see a way to easily do this in our case if we
want keep `predicate` of the form `Generator.Element -> Bool`.

The following implementations are as far as I've gotten so far.

extension CollectionType {

    /// Returns an index such that each element at or above the index is
partitioned from below by the partition predicate

  ///

  /// - Parameter partitionPredicate: The partioning predicate returns
`true` for elements in the collection that are

  ///                                 ordered below, with respet to the
partitioning predicate.

    /// - Complexity: O(lg(n))

    ///

  /// - Returns: An index such that each element at or above the returned
index evaluates as `false` with respect to `partitionPredicate(_:)`

  @warn_unused_result

    func lowerBound(@noescape partitionPredicate: Self.Generator.Element ->
Bool) -> Index {

        var len = self.startIndex.distanceTo(self.endIndex)

        var firstIndex = self.startIndex

        while len > 0 {

            let half = len/2

            let middle = firstIndex.advancedBy(half)

            if partitionPredicate(self[middle]) {

                firstIndex = middle.advancedBy(1)

                len -= half + 1

            } else {

                len = half

            }

        }

        return firstIndex

  }

  /// Returns an index such that each element below the index is strictly
less than the partition predicate

  ///

    /// - Parameter partitionPredicate: The partioning predicate. Returns
`true` for elements in the collection that are

    ///                             ordered below, with respet to the
partitioning predicate.

  /// - Complexity: O(lg(n))

    ///

  /// - Returns: An index such that each element evaluates as `false` with
respect to `partitionPredicate(_:)`

  @warn_unused_result

  func upperBound(@noescape partitionPredicate:Self.Generator.Element ->
Bool) -> Index {

        var len = self.startIndex.distanceTo(self.endIndex)

        var firstIndex = self.startIndex

        while len > 0 {

            let half = len/2

            let middle = firstIndex.advancedBy(half)

            if !partitionPredicate(self[middle]) {

                len = half

            } else {

                firstIndex = middle.advancedBy(1)

                len -= half + 1

            }

        }

        return firstIndex

  }



    /// Returns `true` if element is in Collection, `false` otherwise

    @warn_unused_result

    func binarySearch(@noescape partitionPredicate:Self.Generator.Element
-> Bool) -> Bool {

        let lb = lowerBound(partitionPredicate)

        return (lb != self.endIndex) && !partitionPredicate(self[lb])

    }



}

Again, `upperBound` isn't working like it should, mainly because I'm
looking at the complement of `predicate`. The only way I see getting around
this is making `predicate` take the form `(Generator.Element,
Generator.Element) -> Bool`, but this comes at the cost of some added
complexity on the user end as you can't simply pass a closure like `{ $0 <
3 }`. I suspect there is an easy solution here and I'm just having a mental
block...

Jeff

On Tue, Mar 15, 2016 at 11:37 AM, Haravikk via swift-evolution <
swift-evolution at swift.org> wrote:

> I’m not sure the documentation matches the .lowerBound() and .upperBound()
> behaviours accurately enough; it suggests that a bound will be found for a
> predicate “match”, however if my predicate is { $0 == foo } then these
> won’t actually work at all unless I get lucky and the middle element is a
> match, to me this means that they will only work with comparison predicates
> such as { $0 > foo }, but in that case the lower bound is for the first
> element that is not greater than foo, but I wouldn’t call it a “match”
> until I can also test it for equality.
>
>
> I’ve been doing some work recently on a sorted collection and I
> implemented a binary search a bit differently. My method returns the
> nearest index given an isOrderedBefore predicate, returning .endIndex if
> nothing matches. Implementation below:
>
> @warn_unused_result
> public func binarySearch(withinRange theRange:Range<Index>?=nil,
> isOrderedBefore:(Generator.Element) -> Bool) -> Index {
> var low:Index, high:Index
> if let theSpecifiedRange = theRange { low = theSpecifiedRange.startIndex;
> high = theSpecifiedRange.endIndex }
> else { low = self.startIndex; high = self.endIndex }
>
>
> while low != high {
> let mid = low.advancedBy(low.distanceTo(high) / 2)
> if isOrderedBefore(self[mid]) { low = mid.successor() }
> else { high = mid }
> }
> return low
> }
>
> What this gives me is really an insertion index, however I can implement
> searching for a specific element like so:
>
>     let isOrderedBefore:(Generator.Element, Generator.Element) -> Bool
>     func indexOf(theElement:Equatable) -> Index? {
>         let theNearestIndex = self.binarySearch(){ self.isOrderedBefore($0,
> theElement) }
>         if theNearestIndex < self.endIndex && self[theNearestIndex] ==
> theElement { return theNearestIndex }
>         return nil
>     }
>
> I can also do things like a generic insert() method like so:
>
>     func insert(theElement:Comparable) { self.insert(theElement, atIndex:
> self.binarySearch(){ self.isOrderedBefore($0, theElement) }) }
>
> (note I’ve simplified these as examples as they won’t go into
> CollectionType as-is, it’s just to give you the idea).
>
> Anyway, it seems to me that this enables simple matching of both an exact
> element and the lower bound (this is what the .indexOf() above should
> return), while also giving a useful value if the element isn’t found,
> though it does require testing the result to be sure. When it comes to
> matching a specific element, the above actually eliminates tests for
> equality during the search, only having to test once at the end to confirm
> the result is a match.
>
> I also added a variable starting range, as I had to handle merging two
> sorted collections, in which case it’s useful to restrict the search range
> as you go since there’s no point revisiting indices that you’ve passed
> already if the elements are sorted properly.
>
> The above can also perform an upper bound search by passing in a predicate
> such as { $0 <= foo }, though in that case you’re performing all the extra
> comparisons *plus* a check at the end, so there might still be an
> argument for two methods, e.g nearestLowerBound() and nearestUpperBound().
>
>
> Lastly, in my case I’ve defined the above method on CollectionType
> directly, rather than requiring Comparable; since it takes a predicate its
> only requirement is the collection is sorted prior to search it. In fact,
> constraining the method to Comparable doesn’t guarantee that the collection
> is sorted, only that its elements can be compared in isolation.
>
> On 15 Mar 2016, at 14:28, Lorenzo Racca via swift-evolution <
> swift-evolution at swift.org> wrote:
>
> Hi all,
>
> Using the library for an app with wide sorted arrays I’ve found out that
> Swift doesn’t (yet) have binary search functions.
> Standing on the fact that C++, Java and .NET all have Binary Search
> functions in their standard libs, and given the notorious difficulty to
> create the algorithm (hence the need of developers to trust the library
> function) I’m proposing to adopt these.
> I worked out some code, recalling the C++ functions binary_search,
> lower_bound and upper_bound.
> I’ve tested them and they seem all right.
>
> Also, they all return the `Index` of a found element (say, in an Array),
> so that they can be implemented to return either a boolean value, or the
> element itself. They either return nil or -1 if not any or if the predicate
> isn’t matched, but they all could surely be arranged to return nil.
> The code for binarySearch is :
>
> extension CollectionType where Generator.Element : Comparable {
>     /// Returns `element.Index` if `element` is in `self`.
>     /// If `element` is not in `self`, returns nil.
>     @warn_unused_result
>     public func binarySearch(element: Generator.Element) -> Index? {
>
>         var left = startIndex
>         var right = endIndex
>
>         while (left != right) {
>             let mid = left.advancedBy(left.distanceTo(right) / 2)
>             let value = self[mid]
>
>             if (value == element) {
>                 return mid
>             }
>             if (value < element) {
>                 left = mid.advancedBy(1)
>             }
>             if (value > element) {
>                 right = mid
>             }
>         }
>         return nil
>     }
> }
>
> lowerBound and upperBound:
>
> extension CollectionType where Generator.Element : Comparable {
>     /// Returns the Index of the smallest element in collection matching
> `predicate`.
>     /// If none, returns -1.
>     @warn_unused_result
>     func lowerBound(predicate: Generator.Element -> Bool) -> Index {
>         var result = self.startIndex.advancedBy(-1)
>         var low = self.startIndex
>         var hi = self.endIndex
>
>         while low != hi {
>             let mid = low.advancedBy(low.distanceTo(hi) / 2)
>
>             if predicate(self[mid]) {
>                 result = mid
>                 hi = mid
>             } else {
>                 low = mid.advancedBy(1)
>             }
>         }
>         return result
>     }
> }
>
> extension CollectionType where Generator.Element : Comparable {
>     /// Returns the Index of the biggest element in collection matching
> `predicate`.
>     /// If none, returns -1.
>     func upperBound(predicate: Generator.Element -> Bool) -> Index {
>         var result = startIndex.advancedBy(-1)
>         var low = startIndex
>         var hi = endIndex
>
>         while low != hi {
>             let mid = low.advancedBy(low.distanceTo(hi) / 2)
>
>             if predicate(self[mid]) {
>                 result = mid
>                 low = mid.advancedBy(1)
>             } else {
>                 hi = mid
>             }
>         }
>         return result
>     }
> }
>
> If you wish to try them, usage is :
> var array : Array = [1,2,3,4,5]
>
> let a = array.upperBound{$0 < 3} //array[a] = 2
> let b = array.lowerBound{$0 > 3} //array[b] = 4
>
> What do you think? Should I commit a new file to proposals?
> Should they be added to CollectionType or SequenceType?
> It’s obviously open to discussion and change.
>
> Thank you.
>
> Best,
>
> Lorenzo Racca
> +39 345 9294756
> lorenzo.racca at live.it
>
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160315/9fd09460/attachment.html>


More information about the swift-evolution mailing list