<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class="">After having had this code laying around on my Mac since 6/26/15 (waiting for Swift to go open source)</div><div class="">I figured it’s about damn time I submit the actual RFC for it. So without further ado…</div><div class=""><br class=""></div><div class="">One of the areas where Swift’s stdlib is still quite lacking is searching.</div><div class="">Which is a shame as searching (along with indexing) might be among</div><div class="">the most important tasks of computer science/software development.</div><div class="">One might not need fast collection searches for writing a banal fart or flashlight app,</div><div class="">but almost every other app is likely to benefit from having a proper searching API.</div><div class=""><br class=""></div><div class="">So I’d like to fix that.</div><div class=""><br class=""></div><div class="">Attached you find a full RFC along with the full and functional source code to make it happen.</div><div class=""><br class=""></div><div class="">I’d love to hear your opinion on this and will be more than happy to answer questions.</div><div class=""><br class=""></div><div class="">Rendered Version + Full Implementation: <a href="https://gist.github.com/regexident/2b7531bd748f57679671" class="">https://gist.github.com/regexident/2b7531bd748f57679671</a></div><div class="">(The code is tested using Quick/Nimble. Tests would thus have to be ported to Swift’s XCTest.)</div><div class=""><br class=""></div><div class="">- Vincent</div><div class=""><br class=""></div><div class=""></div><blockquote type="cite" class=""><div class="">Markdown-dump for</div><div class=""><br class=""></div><div class=""># Improved Collection Search</div><div class=""><br class=""></div><div class="">* Proposal: [SE-NNNN](<a href="https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md" class="">https://github.com/apple/swift-evolution/blob/master/proposals/NNNN-name.md</a>)</div><div class="">* Author(s): [Vincent Esche](<a href="https://github.com/regexident" class="">https://github.com/regexident</a>)</div><div class="">* Status: **Review**</div><div class="">* Review manager: TBD</div><div class=""><br class=""></div><div class="">## Introduction</div><div class=""><br class=""></div><div class="">This RFC proposes an extension to the currently rather limited and linear searching API of `CollectionType`, that is `indexOf(element:)` and `indexOf(predicate:)`.</div><div class="">It proposes the backwards-compatible refactoring of those existing methods as well as the introduction of a view-based ensemble of methods for efficient binary-search.</div><div class=""><br class=""></div><div class="">Swift-evolution thread: [link to the discussion thread for that proposal](<a href="https://lists.swift.org/pipermail/swift-evolution" class="">https://lists.swift.org/pipermail/swift-evolution</a>)</div><div class=""><br class=""></div><div class="">## Motivation</div><div class=""><br class=""></div><div class="">Indexing of and searching in data is a big deal in software development. As such it is crucial to have capable means of doing so in a language that aspires to be a highly performant programming language.</div><div class=""><br class=""></div><div class="">Swift's current API for searching is basically limited to these two methods on `CollectionType`:</div><div class=""><br class=""></div><div class="">> ```swift</div><div class="">> func indexOf(element: Self.Generator.Element) -> Self.Index?</div><div class="">> ```</div><div class="">> Returns the first index where value appears in self or nil if value is not found.</div><div class="">></div><div class="">> **Complexity**: `O(self.count)`.</div><div class=""><br class=""></div><div class="">and:</div><div class=""><br class=""></div><div class="">> ```swift</div><div class="">> func indexOf(@noescape predicate: (Self.Generator.Element) throws -> Bool) rethrows -> Self.Index?</div><div class="">> ```</div><div class="">> Returns the first index where predicate returns true for the corresponding value, or nil if such value is not found.</div><div class="">> </div><div class="">> **Complexity**: `O(self.count)`.</div><div class=""><br class=""></div><div class="">The author sees a couple of issue with these two methods:</div><div class=""><br class=""></div><div class="">1. They do not provide an option for restricting the "haystack" to a certain `range` of `self`.</div><div class="">2. They do not provide a fast (`O(log2(self.count))`) path for sorted collections.</div><div class="">3. They should not be limited to `CollectionType` but instead be moved to `Indexable`.</div><div class=""><br class=""></div><div class="">In many situations it is desirable to perform fast searches on sorted collections instead of having to fall back to naïve linear searches.</div><div class=""><br class=""></div><div class="">Further more while a single `index` might be the most common subject of interest, there are many scenarios where a `range` or `count` of matches are of more use.</div><div class=""><br class=""></div><div class="">And anyone who is writing a custom ordered collection type will eventually want to find the insertion index for a given element and will thus end up writing some variant of `lowerBoundOf(…)` or `upperBoundOf(…)` (upon which all the other methods can be implemented, actually).</div><div class=""><br class=""></div><div class="">## Proposed solution</div><div class=""><br class=""></div><div class="">The author proposes:</div><div class=""><br class=""></div><div class="">1. A backwards-compatible refactoring of `CollectionType.indices`, moving it to `Indexable`.</div><div class=""><br class=""></div><div class="">2. A backwards-compatible refactoring of `indexOf(…)` (adding optional `range:` and moving it to `Indexable`).</div><div class=""><br class=""></div><div class="">3. The addition of `rangeOf(…)`, `countOf(…)` and `isSorted(…)` to `Indexable` with a TIME complexity of `O(self.count)`.</div><div class=""><br class=""></div><div class="">4. The introduction of a `BinarySearchView` on `Indexable`, allowing for fast (`O(log2(self.count))`) searches on `Indexable` via `indexOf(…)`, `rangeOf(…)`, `countOf(…)`, `lowerBoundOf(…)`, `upperBoundOf(…)` without cluttering `Indexable`'s interface.</div><div class=""><br class=""></div><div class="">## Detailed design</div><div class=""><br class=""></div><div class="">### `CollectionType.indices`:</div><div class=""><br class=""></div><div class="">The author proposes the relocation of `.indices` from `CollectionType` to `Indexable`:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">extension Indexable {</div><div class=""> /// Return the range of valid index values.</div><div class=""> ///</div><div class=""> /// The result's `endIndex` is the same as that of `self`. Because</div><div class=""> /// `Range` is half-open, iterating the values of the result produces</div><div class=""> /// all valid subscript arguments for `self`, omitting its `endIndex`.</div><div class=""> public var indices: Range<Self.Index> { get }</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">After all all it does is provide a convenience method for generating a Range from its `startIndex` and `endIndex`. There is no need for `self` to be a `CollectionType` for this.</div><div class=""><br class=""></div><div class="">This change should not break any existing code as it generalizes the property.</div><div class=""><br class=""></div><div class="">### `Indexable` of `Comparable` elements:</div><div class=""><br class=""></div><div class="">The author proposes the addition of the following method on `Indexable where Element : Comparable`:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">extension Indexable where Index: RandomAccessIndexType, _Element: Comparable {</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>/// Return true iff all elements in `self` within `range` are sorted according to `isOrderedBefore`.</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>///</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>/// - Complexity: O(`self.count`)</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>public func isSorted(range: Range<Index>? = nil) -> Bool</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">This method would perform a linear scan of the `Indexable` with a TIME complexity of `O(self.count)`.</div><div class=""><br class=""></div><div class="">An actual implementation of these methods can be found in `Indexable.swift`.</div><div class=""><br class=""></div><div class="">### `Indexable` of `Equatable ` elements:</div><div class=""><br class=""></div><div class="">The author proposes the addition of the following method on `Indexable where Element : Equatable`:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">extension Indexable where Index : RandomAccessIndexType, _Element : Equatable {</div><div class=""><br class=""></div><div class=""> /// Returns the first index where `element` appears in `self`</div><div class=""> /// within range `range` or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`self.count`)</div><div class=""> public func indexOf(element: _Element, range: Range<Index>? = default) -> Self.Index?</div><div class=""><br class=""></div><div class=""> /// Returns the first range where `element` appears in `self`</div><div class=""> /// within range `range` or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`self.count`)</div><div class=""> public func rangeOf(element: _Element, range: Range<Index>? = default) -> Range<Self.Index>?</div><div class=""><br class=""></div><div class=""> /// Returns the number of `element`s that appear in `self`</div><div class=""> /// within range `range` or `nil` if such an element is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`self.count`)</div><div class=""> public func countOf(element: _Element, range: Range<Index>? = default) -> Int</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">These methods would perform a linear scan of the `Indexable` with a TIME complexity of `O(self.count)`.</div><div class=""><br class=""></div><div class="">An actual implementation of these methods can be found in `Indexable.swift`.</div><div class=""><br class=""></div><div class="">### `Indexable` of any elements:</div><div class=""><br class=""></div><div class="">The author proposes the addition of the following methods on `Indexable`:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">extension Indexable where Index : RandomAccessIndexType {</div><div class=""><br class=""></div><div class=""> /// Returns the first index where an element appears in `self`</div><div class=""> /// within range `range` for which `lensView(element) == value` or `nil` if such an element is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`self.count`)</div><div class=""> public func indexOf(range: Range<Index>? = default, @noescape predicate: (_Element) throws -> Bool) rethrows -> Self.Index?</div><div class=""><br class=""></div><div class=""> /// Returns the first range where an element appears in `self`</div><div class=""> /// within range `range` for which `lensView(element) == value` or `nil` if such an element is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`self.count`)</div><div class=""> public func rangeOf(range: Range<Index>? = default, @noescape predicate: (_Element) throws -> Bool) rethrows -> Range<Self.Index>?</div><div class=""><br class=""></div><div class=""> /// Returns the number of `element`s that appear in `self`</div><div class=""> /// within range `range` or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`self.count`)</div><div class=""> public func countOf(range: Range<Index>? = default, @noescape predicate: (_Element) throws -> Bool) rethrows -> Int</div><div class=""><br class=""></div><div class=""> /// Return true iff all elements in `self` within `range` are sorted according to `isOrderedBefore`.</div><div class=""> /// </div><div class=""> /// - Complexity: O(`self.count`)</div><div class=""> public func isSorted(range: Range<Index>? = default, @noescape isOrderedBefore: (_Element, _Element) throws -> Bool) rethrows -> Bool</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">These methods would perform a linear scan of the `Indexable` with a TIME complexity of `O(self.count)`.</div><div class=""><br class=""></div><div class="">An actual implementation of these methods can be found in `Indexable.swift`.</div><div class=""><br class=""></div><div class="">**Note**: For explanation of and reasoning behind the `lensView:` argument as well as the distinction between `T` and `Base._Element`, see below.</div><div class=""><br class=""></div><div class="">### `Indexable.binarySearchView(validateAgainst:?)`:</div><div class=""><br class=""></div><div class="">The author proposes the addition of a `binarySearchView(validateAgainst:?)` method on `Indexable` for obtaining a specialized search view for performing searches with O(`log2(self.count)`) complexity:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">extension Indexable where Index: RandomAccessIndexType {</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>/// Create a view into a sorted indexable that allows access within `bounds`.</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>///</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>/// - Complexity: O(`self.count`) if a validation closure is provided, otherwise O(`1`).</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>public func binarySearchView(validateAgainst isOrderedBefore: ((_Element, _Element) -> Bool)? = nil) -> BinarySearchView<Self></div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">### `BinarySearchView<T: Indexable>`:</div><div class=""><br class=""></div><div class="">The author proposes the introduction of a `BinarySearchView` struct with the following interface:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">public struct BinarySearchView<Base : Indexable where Base.Index : RandomAccessIndexType> : Indexable {</div><div class=""> public typealias Index = Base.Index</div><div class=""><br class=""></div><div class=""> public var startIndex: Base.Index { get }</div><div class=""><br class=""></div><div class=""> public var endIndex: Base.Index { get }</div><div class=""><br class=""></div><div class=""> public subscript (position: Index) -> Base._Element { get }</div><div class=""><br class=""></div><div class=""> /// Create a view into a sorted collection `base` that allows access within `bounds`.</div><div class=""> ///</div><div class=""> /// - Complexity: O(1).</div><div class=""> public init(base: Base, validateAgainst isOrderedBefore: ((Base._Element, Base._Element) -> Bool)? = default)</div><div class=""><br class=""></div><div class=""> /// Returns first index of the first `element` in `self` within range `range` for which</div><div class=""> /// `lensView(element) < value` evaluates to false or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func lowerBoundOf<T : Comparable>(value: T, range: Range<Index>? = default, @noescape isOrderedBefore: (T, T) -> Bool = default, @noescape lensView: Base._Element -> T) -> Base.Index</div><div class=""><br class=""></div><div class=""> /// Returns first index of the first `element` in `self` within range `range` for which</div><div class=""> /// `value < lensView(element)` evaluates to true or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func upperBoundOf<T : Comparable>(value: T, range: Range<Index>? = default, @noescape isOrderedBefore: ((T, T) -> Bool) = default, @noescape lensView: Base._Element -> T) -> Base.Index</div><div class=""><br class=""></div><div class=""> /// Returns the first index where an element appears in `self`</div><div class=""> /// within range `range` for which `lensView(element) == value` or `nil` if such an element is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func indexOf<T : Comparable>(value: T, range: Range<Index>? = default, @noescape isOrderedBefore: (T, T) -> Bool = default, @noescape lensView: Base._Element -> T) -> Base.Index?</div><div class=""><br class=""></div><div class=""> /// Returns the range where an element appears in `self`</div><div class=""> /// within range `range` for which `lensView(element) == value` or `nil` if such an element is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func rangeOf<T : Comparable>(value: T, range: Range<Index>? = default, @noescape isOrderedBefore: (T, T) -> Bool = default, @noescape lensView: Base._Element -> T) -> Range<Base.Index>?</div><div class=""><br class=""></div><div class=""> /// Returns the number of `element`s that appear in `self`</div><div class=""> /// within range `range` for which `lensView(element) == value` or `nil` if such an element is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func countOf<T : Comparable>(value: T, range: Range<Index>? = default, @noescape isOrderedBefore: (T, T) -> Bool = default, @noescape lensView: Base._Element -> T) -> Int</div><div class=""><br class=""></div><div class=""> internal let _base: Base</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">As well as the following extension for `Indexable where Element: Comparable`:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">extension BinarySearchView where Base._Element : Comparable {</div><div class=""> /// Returns first index of the first `element` in `self` within range `range` for which</div><div class=""> /// `… < element` evaluates to false or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func lowerBoundOf(element: Base._Element, range: Range<Index>? = default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool = default) -> Base.Index</div><div class=""><br class=""></div><div class=""> /// Returns first index of the first `element` in `self` within range `range` for which</div><div class=""> /// `element < …` evaluates to true or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func upperBoundOf(element: Base._Element, range: Range<Index>? = default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool = default) -> Base.Index</div><div class=""><br class=""></div><div class=""> /// Returns the first index where `element` appears in `self`</div><div class=""> /// within range `range` or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func indexOf(element: Base._Element, range: Range<Index>? = default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool = default) -> Base.Index?</div><div class=""><br class=""></div><div class=""> /// Returns the range where `element` appears in `self`</div><div class=""> /// within range `range` or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func rangeOf(element: Base._Element, range: Range<Index>? = default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool = default) -> Range<Base.Index>?</div><div class=""><br class=""></div><div class=""> /// Returns the number of `element`s that appear in `self`</div><div class=""> /// within range `range` or `nil` if `element` is not found.</div><div class=""> ///</div><div class=""> /// - Complexity: O(`log2(self.count)`).</div><div class=""> public func countOf(element: Base._Element, range: Range<Index>? = default, @noescape isOrderedBefore: (Base._Element, Base._Element) -> Bool = default) -> Int</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">These methods would perform a linear scan of the `Indexable` with a TIME complexity of `O(log2(self.count))`.</div><div class=""><br class=""></div><div class="">An actual implementation of these methods can be found in `BinarySearchView.swift`.</div><div class=""><br class=""></div><div class="">### Why `lensView`?</div><div class=""><br class=""></div><div class="">Let's assume one had a collection of Persons like this:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">struct Person {</div><div class=""> let name: String</div><div class=""> let age: Int</div><div class="">}</div><div class=""><br class=""></div><div class="">let persons: [Person] = [ … ]</div><div class="">```</div><div class=""><br class=""></div><div class="">Now let's assume one wanted to find the first person that's called `John Doe`. One could either change `Person` to conform to `Equatable`, allowing `indexOf(element:)` to be called on `persons`. Now that was easy, wasn't it?</div><div class=""><br class=""></div><div class="">Fine, but what if one wanted to search `Person`s by both their `name` or their `age` depending on the given use case? You're out of luck now.</div><div class=""><br class=""></div><div class="">Luckily there is a variant of `indexOf(…)` called `indexOf(predicate:)` that is available for any type of `Element` regardless of its conformance to `Equatable`:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">let searchView = persons.binarySearchView()</div><div class="">let index = persons.indexOf { $0.name == "John Doe" }</div><div class="">```</div><div class=""><br class=""></div><div class="">Unfortunately however such a `predicate` cannot be used when searching for an element's property in a sorted(!) collection or a collection of non-`Comparable` elements within a TIME complexity of `O(log2(self.count))` as binary search uses `<` instead of `==` internally and as the "needle" gets passed as both the first (in `indexOf(…)` itself) and second argument (in `lowerBoundOf(…)` which gets called by `indexOf(…)`) to the operator, one cannot simply change `predicate: (E) -> Bool` to `isOrderedBefore: (V, E) -> Bool` as that would result in a type mismatch (`(V, E)` vs. `(E, V)`).</div><div class=""><br class=""></div><div class="">The use of `lensView` however allows one to do just that without making a mess.</div><div class=""><br class=""></div><div class="">```swift</div><div class="">let searchView = persons.binarySearchView() // sorted!</div><div class="">let index = searchView.indexOf("John Doe") { $0.name }</div><div class="">```</div><div class=""><br class=""></div><div class="">The `lensView` works similarly to how keypaths work in Objective-C, just type-safe.</div><div class=""><br class=""></div><div class="">## Impact on existing code</div><div class=""><br class=""></div><div class="">The author proposes:</div><div class=""><br class=""></div><div class="">1. Moving `CollectionType.indices` to `Indexable` should not cause any issues with existing code.</div><div class=""><br class=""></div><div class="">2. Changing `indexOf(element:)` to `indexOf(element:range:?)` should not cause any issues with existing code as the `range` argument is optional and the default behaves like it used to.</div><div class=""><br class=""></div><div class="">2. Changing `indexOf(predicate:)` to `indexOf(range:?predicate:)` should not cause any issues with existing code either, as the `range` argument is optional and the default behaves like it used to. Swift's trailing-closure magic allows `indexOf(predicate: …)` to still properly map to `indexOf(range: nil, predicate: …)`.</div><div class=""><br class=""></div><div class="">3. The addition of `rangeOf(…)`, `countOf(…)` and `isSorted(…)` to `Indexable` do not conflict with any existing API.</div><div class=""><br class=""></div><div class="">4. The introduction of a `BinarySearchView` on `Indexable` does not conflict with any existing API.</div><div class=""><br class=""></div><div class="">## Alternatives considered</div><div class=""><br class=""></div><div class="">The author's initial approach was to implement all methods on Indexable. This however required all methods to either be provided in two variants (one for unsorted, one for sorted `Indexable`s):</div><div class=""><br class=""></div><div class="">```swift</div><div class="">final public func indexOf(element: _Element, range: Range<Index>? = nil) -> Index? {</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>return …</div><div class="">}</div><div class=""><br class=""></div><div class="">final public func sortedIndexOf(element: _Element, range: Range<Index>? = nil) -> Index? {</div><div class=""><span class="Apple-tab-span" style="white-space:pre">        </span>return …</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">Or require the introduction of an additional argument:</div><div class=""><br class=""></div><div class="">```swift</div><div class="">final public func indexOf(element: _Element, range: Range<Index>? = nil, isSorted: Bool = true) -> Index? {</div><div class=""> if isSorted {</div><div class=""> return …</div><div class=""> } else {</div><div class=""> return …</div><div class=""> }</div><div class="">}</div><div class="">```</div><div class=""><br class=""></div><div class="">It also would have cluttered `Indexable` with methods such as `lowerBoundOf` and `upperBoundOf`, which are only ever applicable when `self` is sorted accordingly. The introduction of a dedicated BinarySearchView fixes these issues.</div><div class=""><br class=""></div><div class="">One also would have had to switch from `predicate: (T) -> Bool` to `isOrderedBefore: (T, T) -> Bool` for all methods for the second unified API alternative.</div></blockquote><div class=""><br class=""></div></body></html>