[swift-evolution] [swift-evolution-announce] [Revised and review extended] SE-0180 - String Index Overhaul

Kevin Ballard kevin at sb.org
Thu Jun 22 19:59:40 CDT 2017

> https://github.com/apple/swift-evolution/blob/master/proposals/0180-string-index-overhaul.md

Given the discussion in the original thread about potentially having
Strings backed by something other than UTF-16 code units, I'm somewhat
concerned about having this kind of vague `encodedOffset` that happens
to be UTF-16 code units. If this is supposed to represent an offset into
whatever code units the String is backed by, then it's going to be a
problem because the user isn't supposed to know or care what the
underlying storage for the String is. And I can imagine potential issues
with archiving/unarchiving where the unarchived String has a different
storage type than the archived one, and therefore `encodedOffset` would
gain a new meaning that screws up unarchived String.Index values.

The other problem with using this as UTF-16 is how am I supposed to
archive/unarchive a String.Index that comes from String.UTF8View? AFAICT
the only way to do that is to ignore encodedOffset entirely and instead
calculate the distance between s.utf8.startIndex and my index (and then
recreate the index later on by advancing from startIndex). But this RFC
explicitly says that archiving/unarchiving indices is one of the goals
of this overhaul.
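
To make that workaround concrete, here is a minimal sketch of archiving a
UTF8View index as a plain distance and rebuilding it later. The sample
string and the variable names are mine, purely for illustration; only the
standard Collection methods `distance(from:to:)` and `index(_:offsetBy:)`
are used, and nothing here touches `encodedOffset`.

```swift
// Sample string: "a", then U+00EF (two UTF-8 bytes: 0xC3 0xAF), then "b".
let s = "a\u{EF}b"
let utf8 = s.utf8

// An index pointing at the second byte of U+00EF.
let original = utf8.index(utf8.startIndex, offsetBy: 2)

// "Archive": store the index as its distance from startIndex.
let archived: Int = utf8.distance(from: utf8.startIndex, to: original)

// "Unarchive": rebuild the index by advancing from startIndex.
let restored = utf8.index(utf8.startIndex, offsetBy: archived)

print(archived, restored == original, utf8[restored] == 0xAF)
```

This round-trips correctly, but it costs a walk from startIndex in each
direction, which is exactly the cost an archivable index was supposed to
avoid.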

The section on comparison still describes this as a weak ordering.
In the other thread this was explained as a consequence of the
internal transcodedOffset not being public, but that still makes the
explanation very odd. String.Index comparison should not be a weak
ordering, because all indices can be expressed in the utf8View if
nothing else, and in that view they have a total order. So it should
just be defined as a total order, based on the position in the utf8View
that the index corresponds with.
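
As a rough sketch of what I mean by ordering on utf8View position: the
helpers `utf8Position(of:in:)` and `precedes(_:_:in:)` below are
hypothetical names of my own, not part of the proposal, and they assume
the index is valid for the given string. Every index maps to a single
integer, so comparing those integers gives a total order.

```swift
// Hypothetical helper: the number of UTF-8 code units before `i` in `s`.
func utf8Position(of i: String.Index, in s: String) -> Int {
    return s.utf8.distance(from: s.utf8.startIndex, to: i)
}

// Hypothetical total order: compare indices by their UTF-8 position.
func precedes(_ a: String.Index, _ b: String.Index, in s: String) -> Bool {
    return utf8Position(of: a, in: s) < utf8Position(of: b, in: s)
}

// "a", then U+00EF (two UTF-8 bytes), then "b".
let s = "a\u{EF}b"
let charIndex = s.index(after: s.startIndex)                 // start of U+00EF
let midIndex = s.utf8.index(s.utf8.startIndex, offsetBy: 2)  // second byte of U+00EF
let nextChar = s.index(after: charIndex)                     // start of "b"

print(precedes(charIndex, midIndex, in: s), precedes(midIndex, nextChar, in: s))
```

The index into the middle of U+00EF sorts strictly between the two
character-aligned indices instead of comparing equal to one of them.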

The detailed design of the index has encodedOffset being mutable (and
this was confirmed in the other thread as intentional). I don't think
this is a good idea, because it makes the following code behave oddly:
  let x = index.encodedOffset
  index.encodedOffset = x

Specifically, this resets the private transcodedOffset, so if you do
this with an intra-code-unit Index taken from the utf8View, the modified
Index may point to a different byte.

I'm also not sure why you'd ever want to do this operation anyway. If
you want to change the encodedOffset, you can just say `index =
String.Index(encodedOffset: x)`.
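
To illustrate the oddity without depending on the proposed API, here is a
toy model. `ModelIndex` is a hypothetical stand-in of my own, not the real
String.Index; it only assumes what was described in the other thread,
namely that assigning to encodedOffset resets the hidden transcodedOffset.

```swift
// Hypothetical stand-in for the proposed index layout.
struct ModelIndex: Equatable {
    // Position within the transcoded form of a single code unit.
    private(set) var transcodedOffset: Int

    // Assumed behavior: any assignment resets transcodedOffset.
    var encodedOffset: Int {
        didSet { transcodedOffset = 0 }
    }

    // Property observers do not fire during initialization.
    init(encodedOffset: Int, transcodedOffset: Int = 0) {
        self.encodedOffset = encodedOffset
        self.transcodedOffset = transcodedOffset
    }
}

// An index pointing partway into the code unit at offset 1.
var index = ModelIndex(encodedOffset: 1, transcodedOffset: 1)
let before = index

// The apparently no-op round trip from above.
let x = index.encodedOffset
index.encodedOffset = x

print(index == before, index.transcodedOffset)
```

In this model the round trip is not a no-op: encodedOffset is unchanged,
yet the index no longer compares equal to its earlier value.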

-Kevin Ballard