[swift-evolution] Strings in Swift 4

Tue Jan 31 19:28:22 CST 2017

On Tue, Jan 31, 2017 at 7:08 PM, Matthew Johnson <matthew at anandabits.com>
wrote:

>
> On Jan 31, 2017, at 6:54 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>
> On Tue, Jan 31, 2017 at 6:40 PM, Matthew Johnson <matthew at anandabits.com>
> wrote:
>
>>
>> On Jan 31, 2017, at 6:15 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>
>> On Tue, Jan 31, 2017 at 6:09 PM, Matthew Johnson <matthew at anandabits.com>
>> wrote:
>>
>>>
>>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution <
>>> swift-evolution at swift.org> wrote:
>>>
>>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris <davesweeris at mac.com>
>>> wrote:
>>>
>>>>
>>>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>>>
>>>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution <
>>>> swift-evolution at swift.org> wrote:
>>>>
>>>>>
>>>>> On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution <
>>>>> swift-evolution at swift.org> wrote:
>>>>>
>>>>> I think that is perfectly reasonable, but then it seems weird to be
>>>>> able to iterate over it (with no upper bound) independently of a
>>>>> collection. It would surprise me if
>>>>> ```
>>>>> for x in arr[arr.startIndex…] { print(x) }
>>>>> ```
>>>>> yielded different results than
>>>>> ```
>>>>> for i in arr.startIndex… { print(arr[i]) } // CRASH
>>>>> ```
>>>>> which it does under this model.
>>>>>
>>>>>
>>>>> (I *think* this is how it works... semantically, anyway) Since the upper
>>>>> bound isn't specified, it's inferred from the context.
>>>>>
>>>>> In the first case, the context is as an index into an array, so the
>>>>> upper bound is inferred to be the last valid index.
>>>>>
>>>>> In the second case, there is no context, so it goes to Int.max. Then,
>>>>> *after* the "wrong" context has been established, you try to index an
>>>>> array with numbers from the too-large range.
>>>>>
>>>>> Semantically speaking, they're pretty different operations. Why is it
>>>>> surprising that they have different results?
>>>>>
>>>>
>>>> I must say, I was originally rather fond of `0...` as a spelling, but
>>>> IMO, Jaden and others have pointed out a real semantic issue.
>>>>
>>>> A range is, to put it simply, the "stuff" between two end points. A
>>>> "range with no upper bound" _has to be_ one that continues forever. The
>>>> upper bound _must_ be infinity.
>>>>
>>>>
>>>> Depends… Swift doesn’t allow partial initializations, and neither the
>>>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional.
>>>> From a strictly syntactic PoV, a “Range without an upperBound” can’t exist
>>>> without getting into undefined behavior territory.
>>>>
>>>> Plus, mathematically speaking, an infinite range would be written “[x,
>>>> ∞)”, with an open upper bracket. If you write “[x, ∞]”, with a *closed*
>>>> upper bracket, that’s kind of a meaningless statement. I would argue that
>>>> if we’re going to represent that “infinite” range, the closest Swift
>>>> spelling would be “x..<”. That leaves the mathematically undefined notation
>>>> of “[x, ∞]”, spelled as “x…” in Swift, free to let us have “x…” or “…x”
>>>> (which by similar reasoning can’t mean “(∞, x]”) return one of these:
>>>>
>>>> enum IncompleteRange<T> {
>>>>     case upperValue(T)
>>>>     case lowerValue(T)
>>>> }
>>>>
>>>> which we could then pass to the subscript function of a collection to
>>>> create the actual Range like this:
>>>>
>>>> extension Collection {
>>>>     subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>>>         switch ir {
>>>>         case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>>>         case .upperValue(let upper): return self[self.startIndex ..< upper]
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>> I understand that you can do this from a technical perspective. But I'm
>>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a
>>> number.
>>>
>>>
>>> It’s not any more devoid of semantics than a partially applied function.
>>>
>>
>> Yes, but this here is not a partially applied type.
>>
>> Nor does it square with your proposal that you should be able to use `for
>> i in 0...` to mean something different from `array[0...]`. We don't have
>> partially applied functions doubling as function calls with default
>> arguments.
>>
>>
>> I’m not trying to say it’s *exactly* like a partially applied function.
>>
>
> I'm not saying you're arguing that point. I'm saying that there is a
> semantic distinction between (1) a range with two bounds where you've only
> specified the one, and (2) a range with one bound. There must be an answer
> to the question: what is the nature of the upper bound of `0...`? Either it
> exists but is not yet known, or it is known that it does not exist (or, it
> is not yet known whether or not it exists). But these are not the same
> thing!
>
>>> It is a number or index with added semantics that it provides a lower (or
>>> upper) bound on the possible value specified by its type.
>>>
>>>
>>> What is such an `IncompleteRange<T>` other than a value of type T? It's
>>> not an upper bound or lower bound of anything until it's used to index a
>>> collection. Why have a new type (IncompleteRange<T>), a new set of
>>> operators (prefix and postfix range operators), and these muddied semantics
>>> for something that can be written `subscript(upTo upperBound: Index) ->
>>> SubSequence { ... }`? _That_ has unmistakable semantics and requires no new
>>> syntax.
>>>
>>>
>>> Arguing that it adds too much complexity relative to the value it
>>> provides is reasonable.  The value in this use case is mostly syntactic
>>> sugar so it’s relatively easy to make the case that it doesn’t carry its
>>> weight here.
>>>
>>> The value in Ben’s use case is a more composable alternative to
>>> `enumerated`.  I find this to be a reasonably compelling example of the
>>> kind of thing a partial range might enable.
>>>
>>
>> Ben's use case is not a "partial range." It's a bona fide range with no
>> upper bound.
>>
>>
>> Ok, fair enough.  Let’s call it an infinite range then.
>>
>> We can form an infinite range with an Index even if it’s an opaque type
>> that can’t be incremented or decremented.  All we need is a comparable
>> Bound which all Indices meet.  We can test whether other indices are
>> contained within that infinite range and can clamp it to a tighter range as
>> well.  This clamping is what would need to happen when an infinite range is
>> passed to a collection subscript by providing an upper bound.
>>
>> The only thing unusual about this is that we don’t usually do a bounds
>> check of any kind when subscripting a collection.
>>
>
> Precisely. This would be inconsistent. If lenient subscripts as once
> proposed were accepted, however, then perhaps `arr[lenient: 0...]` would
> make sense.
>
> But that's not getting to the biggest hitch with your proposal. If
> subscript were lenient, then `arr[lenient: 42...]` would also have to give
> you a result even if `arr.count == 21`.
>
>
> This is not at all what Dave Abrahams was proposing, though (unless I
> totally misunderstand). He truly doesn't want an infinite range. He wants
> to use a terser notation for saying: I want x to be the lower bound of a
> range for which I don't yet know (or haven't bothered to find out) the
> finite upper bound. It would be plainly clear, if spelled as `arr[from:
> 42]`, that if `arr.count < 43` then this expression will trap, but if
> `arr.count >= 43` then this expression will give you the rest of the
> elements.
>
>
> Right.  I was not making the necessary distinction between incomplete
> ranges and infinite ranges.  Jaden provided an accurate description of what
> I was trying to get at and it *does* require both `IncompleteRange` and
> `InfiniteRange` to do it properly.
>

Cool, I think we broadly agree on the conclusion here. The reason I'm
harping on this point is that one obviously needs to demonstrate compelling
use cases. By conflating different concepts together, we're inflating all
the wonderful things that you can do.
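
To make the distinction concrete, here is a minimal sketch of the two
concepts as separate types. The names `IncompleteRange` and `InfiniteRange`
come from this thread, but the declarations below (including `contains` and
`clamped(to:)`) are assumptions for illustration only, not a proposed design:

```swift
// Hypothetical: a lower bound whose upper bound is not yet known.
// It only becomes a range once a collection supplies `endIndex`.
struct IncompleteRange<Bound: Comparable> {
    let lowerBound: Bound
}

// Hypothetical: a range that is known to have no upper bound at all.
struct InfiniteRange<Bound: Comparable> {
    let lowerBound: Bound

    // Every value at or above the lower bound is inside the range.
    func contains(_ value: Bound) -> Bool {
        return lowerBound <= value
    }

    // Clamp to a finite range; the result may be empty.
    func clamped(to limits: Range<Bound>) -> Range<Bound> {
        let lower = Swift.min(Swift.max(lowerBound, limits.lowerBound),
                              limits.upperBound)
        return lower..<limits.upperBound
    }
}

extension Collection {
    // Completing an incomplete range: the usual subscript preconditions
    // apply, so an out-of-bounds lower bound still traps.
    subscript(_ r: IncompleteRange<Index>) -> SubSequence {
        return self[r.lowerBound..<endIndex]
    }
}

let arr = [10, 20, 30]
let rest = Array(arr[IncompleteRange(lowerBound: 1)])

let unbounded = InfiniteRange(lowerBound: 42)
let isInside = unbounded.contains(1_000_000)
let clamped = unbounded.clamped(to: 0..<21)
```

In this sketch `rest` holds the elements from index 1 onward, while clamping
an infinite range that starts at 42 to `0..<21` yields an empty range rather
than a trap. Those are two different behaviors, which is why they should be
argued for separately.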

> I’m not necessarily trying to argue that we *should* do this, only that
> there isn’t a fundamental semantic problem with it.  In a language like
> Swift there is no fundamental reason that `0…` must have semantics independent
> of context.  Allowing context to provide the semantics doesn’t seem any
> more magical than allowing context to define the type of literals like `0`.
>

Hmm, disagree here. Literals aren't typed, they aren't instances of
anything, and thus they do not have any particular semantics. When they are
used to express a value, that value has a particular type with particular
semantics.
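
A short sketch of what I mean, using only standard literal behavior. The
`Meters` type is a hypothetical example of a user-defined type, not anything
proposed here:

```swift
// Hypothetical type, used only to show that a literal can become a
// value of a user-defined type.
struct Meters: ExpressibleByIntegerLiteral {
    let value: Int
    init(integerLiteral value: Int) { self.value = value }
}

let a = 0            // no other context: the default type, Int
let b: Double = 0    // same literal, now a Double
let c: UInt8 = 0     // ... or a UInt8
let d: Meters = 0    // ... or a user-defined type

print(type(of: a), type(of: b), type(of: c), type(of: d))
```

The literal `0` is the same in all four lines; only the resulting values have
types, and each of those types has fixed semantics once chosen.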

That we have been talking about `0...` clouds the fact that we are talking
about a function that takes a single argument which doesn't have to be a
literal, and which must return a value of a particular type. (That is,
unless you want to overload the function, in which case every naked `0...`
would need to be written `0... as IncompleteRange` or `0... as
UnboundedRange`.) And since you're going to get an instance of some
particular type, this implies some particular semantics. Given that
`arr[upTo: 42]` is perfectly nice-looking and does exactly what you'd want
it to do, it is hard to argue that a superior alternative is one that
requires new types, new operators, context-dependent semantics, and
compiler magic.
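
For comparison, a minimal sketch of the labeled-subscript alternative. These
two subscripts are hypothetical additions (they are not in the standard
library as of this writing) and are built entirely on the existing half-open
range subscript:

```swift
extension Collection {
    // Hypothetical: everything before `upperBound`.
    subscript(upTo upperBound: Index) -> SubSequence {
        return self[startIndex..<upperBound]
    }

    // Hypothetical: everything from `lowerBound` onward.
    // Traps, like any other subscript, if `lowerBound` is past `endIndex`.
    subscript(from lowerBound: Index) -> SubSequence {
        return self[lowerBound..<endIndex]
    }
}

let arr = [10, 20, 30, 40, 50]
let head = Array(arr[upTo: 2])
let tail = Array(arr[from: 3])
```

No new types or operators are involved: each subscript takes a plain `Index`
and returns a `SubSequence`, so there is no question about what the argument
means outside of the subscript.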

>>> I also tend to find concise notation important for clarity as long as it
>>> isn’t obscure or idiosyncratic.  With that in mind, I think I lean in favor
>>> of `…` so long as we’re confident we won’t regret it if / when we take up
>>> variadic generics and / or tuple unpacking.
>>>
>>>
>>>
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>
>>>
>>>
>>
>>
>
>