<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">Purely additive, so +1 from me. Side note, I’m wondering how problematic these same discussions will be in third-party library code. Should authors use StringProtocol or String as often as possible?<div class=""><br class=""><div><blockquote type="cite" class=""><div class="">On 28 Jun 2017, at 18:37, Ben Cohen via swift-evolution <<a href="mailto:swift-evolution@swift.org" class="">swift-evolution@swift.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><meta http-equiv="Content-Type" content="text/html charset=utf-8" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi swift-evolution,<br class=""><br class="">Below is a short pitch for some performance improvements to be made to String to accommodate Substrings. <br class=""><br class="">As outlined in SE-0163, the more general question of implicit conversion from Substring to String was deferred pending feedback on the initial implementation. To date, the feedback we’ve received hasn’t suggested that such an implicit conversion is necessary – that migrations from 3.2 to 4.0 haven’t led to an extensive need to perform Substring->String conversion. 
Any further input, either on or off list, about this particular aspect of migration would be very gratefully received.<br class=""><br class=""># Substring performance affordances<br class=""><br class="">* Proposal: [SE-NNNN](NNNN-substring-affordances.md)<br class="">* Authors: [Ben Cohen](<a href="https://github.com/airspeedswift" class="">https://github.com/airspeedswift</a>)<br class="">* Review Manager: TBD<br class="">* Status: **Awaiting review**<br class=""><br class="">## Introduction<br class=""><br class="">This proposal modifies a small number of methods in the standard library that<br class="">are commonly used with the `Substring` type:<br class=""><br class="">- Modify the `init` on floating point and integer types, to construct them<br class=""> from `StringProtocol` rather than `String`. <br class="">- Change `join` to be an extension `where Element: StringProtocol`<br class="">- Add extensions to `Dictionary where Key == String` and `Set where Element ==<br class=""> String` to test for presence of a `Substring`.<br class=""><br class="">## Motivation<br class=""><br class="">Swift 4 introduced `Substring` as the slice type for `String`. Previously,<br class="">`String` had been its own slice type, but this leads to issues where string<br class="">buffers can be unexpectedly retained. This approach was adopted instead of the<br class="">alternative of having the slicing operation make a copy. A copying slicing<br class="">operation would have negative performance consequences, and would also conflict<br class="">with the requirement that `Collection` be sliceable in constant time. In cases<br class="">where an API requires a `String`, the user must construct a new `String` from a<br class="">`Substring`. 
This can be thought of as a "deferral" of the copy that was<br class="">avoided at the time of the slice.<br class=""><br class="">There are a few places in the standard library where it is notably inefficient<br class="">to force a copy of a substring in order to use it with a string: performing<br class="">lookups in hashed containers, joining substrings, and converting substrings to<br class="">integers. In particular, these operations are likely to be used inside a loop<br class="">over a number of substrings extracted from a string. For example, suppose you<br class="">had a string of key/value pairs, where the values were integers and you wanted<br class="">to sum them by key. You would be forced to convert both the `Substring` keys<br class="">and values to `String` to do this.<br class=""><br class="">## Proposed solution<br class=""><br class="">Add the following to the standard library:<br class=""><br class="">```swift<br class="">extension FixedWidthInteger {<br class=""> public init?<S : StringProtocol>(_ text: S, radix: Int = 10)<br class="">}<br class=""><br class="">extension Float/Double/Float80 {<br class=""> public init?<S : StringProtocol>(_ text: S, radix: Int = 10)<br class="">}<br class=""><br class="">extension Sequence where Element: StringProtocol {<br class=""> public func joined(separator: String = "") -> String<br class="">}<br class=""><br class="">extension Dictionary where Key == String {<br class=""> public subscript(key: Substring) -> Value? 
{ get set }<br class=""> public subscript(key: Substring, default defaultValue: @autoclosure () -> Value) -> Value { get set }<br class="">}<br class=""><br class="">extension Set where Element == String {<br class=""> public func contains(_ member: Substring) -> Bool<br class=""> public func index(of member: Substring) -> Index?<br class=""> public mutating func insert(_ newMember: Substring) -> (inserted: Bool, memberAfterInsert: Element)<br class=""> public mutating func remove(_ member: Substring) -> Element?<br class="">}<br class="">```<br class=""><br class="">These additions are deliberately narrow in scope. They are _not_ intended to<br class="">solve a general problem of being able to interchange substrings for strings (or<br class="">more generally slices for collections) generically in different APIs. See the<br class="">alternatives considered section for more on this.<br class=""><br class="">## Source compatibility<br class=""><br class="">No impact: these are either additive (in case of hashed containers) or<br class="">generalize an existing API to a protocol (in case of numeric<br class="">conversion/joining).<br class=""><br class="">## Effect on ABI stability<br class=""><br class="">The hashed container changes are additive so no impact. The switch from concrete<br class="">to generic types for the numeric conversions needs to be made before ABI<br class="">stability.<br class=""><br class="">## Alternatives considered<br class=""><br class="">While they have a convenience benefit as well, this is not the primary goal of<br class="">these additions, but a side-effect of helping avoid a performance problem. 
In<br class="">many other cases, the performance issues can be avoided via modified use, e.g.<br class="">`Sequence.contains` of a `Substring` in a sequence of strings can be written as<br class="">`sequence.contains { $0 == substring }` .<br class=""><br class="">These changes are limited in scope, and further additions could be considered<br class="">in the future. For example, should the `Dictionary.init(grouping:by:) where Key<br class="">== String` operation be enhanced to similarly take a sequence of substrings?<br class="">There is a long tail of these cases, and the need to keep unnecessary overloads<br class="">to a minimum, avoiding typechecker work and code bloat, must be weighed against<br class="">the likelihood that string copies will be a performance problem.<br class=""><br class="">There is a more general problem of interoperating between collections and<br class="">slices. In the future, there may be other affordances for converting/comparing<br class="">them. For example, it might be desirable to require equatable collections to<br class="">have equatable slices, and to automatically provide default implementations of<br class="">`==` that efficiently compare a collection to its default slice. These<br class="">enhancements rely on features such as conditional conformance, and so may be<br class="">worth considering in later versions of Swift but are not an option currently.</div>_______________________________________________<br class="">swift-evolution mailing list<br class=""><a href="mailto:swift-evolution@swift.org" class="">swift-evolution@swift.org</a><br class="">https://lists.swift.org/mailman/listinfo/swift-evolution<br class=""></div></blockquote></div><br class=""></div></body></html>