[swift-evolution] [Review] SE-0027 Expose code unit initializers on String

Dave Abrahams dabrahams at apple.com
Mon Feb 15 18:55:38 CST 2016


on Mon Feb 15 2016, Zach Waldowski <swift-evolution at swift.org> wrote:

> Responses are inline.
>
> Best,
> Zachary Waldowski
> zach at waldowski.me
>
> On Mon, Feb 15, 2016, at 02:10 PM, Dave Abrahams via swift-evolution
> wrote:
>> 
>> on Mon Feb 15 2016, Zach Waldowski <swift-evolution at swift.org> wrote:
>> 
>> > See responses inline.
>> >
>> > Aside: omgomgomg I got Dave Abrahams to respond to a proposal of mine
>> 
>> Don't worry; it'll wear off soon enough...
>> 
>> > Excitedly,
>> >   Zachary Waldowski
>> >   zach at waldowski.me
>> >
>> > On Mon, Feb 15, 2016, at 11:48 AM, Dave Abrahams via swift-evolution
>> > wrote:
>> >> 
>> >> on Thu Feb 11 2016, Douglas Gregor <swift-evolution at swift.org> wrote:
>> >> 
>> >> > Hello Swift community,
>> >> >
>> >> > The review of SE-0027 "Expose code unit initializers on String" begins
>> >> > now and runs through February 16, 2016. The proposal is available
>> >> > here:
>> >> >
>> >> > https://github.com/apple/swift-evolution/blob/master/proposals/0027-string-from-code-units.md
>> >> > Reviews are an important part of the Swift evolution process. All
>> >> > reviews should be sent to the swift-evolution mailing list at
>> >> >
>> >> > https://lists.swift.org/mailman/listinfo/swift-evolution
>> >> > <https://lists.swift.org/mailman/listinfo/swift-evolution>
>> >> > or, if you would like to keep your feedback private, directly to the
>> >> > review manager. When replying, please try to keep the proposal link at
>> >> > the top of the message:
>> >> >
>> >> > Proposal link:
>> >> >
>> >> > https://github.com/apple/swift-evolution/blob/master/proposals/0027-string-from-code-units.md
>> >> > Reply text
>> >> >
>> >> > Other replies
>> >> >  <https://github.com/apple/swift-evolution#what-goes-into-a-review-1>What goes into a review?
>> >> >
>> >> > The goal of the review process is to improve the proposal under review
>> >> > through constructive criticism and, eventually, determine the
>> >> > direction of Swift. When writing your review, here are some questions
>> >> > you might want to answer in your review:
>> >> >
>> >> > What is your evaluation of the proposal?
>> >> 
>> >> I support the intent of the proposal but am -1 on the specific proposal
>> >> as offered.
>> >> 
>> >> First, in the presentation, there are several things that make it hard
>> >> to evaluate:
>> >> 
>> >> 1. The "decode" function signature is so wide that it can't be read
>> >>    without scrolling at
>> >>    https://github.com/apple/swift-evolution/blob/master/proposals/0027-string-from-code-units.md
>> >> 
>> >> 2. The proposal doesn't show any usage of the proposed APIs, so it
>> >>    is hard to understand what effect these APIs would have on real
>> >>    code.  There are some examples of uses in
>> >>    https://github.com/apple/swift/compare/master...zwaldowski:string-from-code-units
>> >>    but I have to dig for them.
>> >> 
>> >> 3. The description of the “Detailed Design” doesn't show any examples
>> >>    either, so I have to imagine what “backporting the Swift 3.0 versions
>> >>    of the CString constructors, then making them generic over their
>> >>    input and codec” means.
>> >
>> > The notes are appreciated. I'll take care to add examples and fix
>> > formatting. The proposal was always intended to be paired with the PR to
>> > the stdlib, and that lack of clarity did not come up during the proposal
>> > vetting stage.
>> >
>> >> Next, when I look at *uses* that I can find (in the tests), I don't find
>> >> them to be clear.
>> >> 
>> >>   String(validatingCodeUnits: result, as: UTF8.self)
>> >> 
>> >> What does “validatingCodeUnits” mean?  Clearly we're going to do some
>> >> checking.  Is there a repair?  Is the initializer failable?  
>> >
>> > I'm open to suggestions. The names as they are were lifted from the
>> > changes to the CString APIs made on swift-3-api-guidelines.
>> 
>> I suspected that they might be so I went and checked on GitHub... but
>> now I see that only searches the default branch :-P
>> 
>>     https://help.github.com/articles/searching-code/
>> 
>> But if the naming error was on our side, my apologies.  We should do
>> better, regardless.
>> 
>> > "repairingCodeUnits" implies to me that there is something already wrong
>> > with them. 
>> 
>> That doesn't bother me.  If you are requesting the repair, you are
>> effectively assuming that there is something already wrong with them.
>> 
>> > I'm also not sure as to the point of the initializer being failable;
>> > many domain initializers in the stdlib don't indicate that they can
>> > fail directly in the name.
>> 
>> The point I'm trying to make is that the implications of “validating”
>> aren't clear, not that I can't tell whether it's failable.
>
> Okay, point taken. Is your recommendation behind "repairingCodeUnits"?

If repairingInvalidCodeUnits reflects the correct semantics, that's what
I'd use.

> Or something else?
>
>> 
>> >> 
>> >> In this change, for example:
>> >> https://github.com/apple/swift/compare/master...zwaldowski:string-from-code-units#diff-d39ec2c4819c950aeef95f899f53b104R79
>> >> I find the old code much clearer than the new code (I'm not sure why
>> >> _decode starts with an underscore here; isn't it part of your proposed
>> >> public API?)
>> >
>> > The PR is ahead of the proposal here. It was pointed out in the review
>> > thread up to this point that that that "primitive" isn't truly needed;
>> > the initializers on String should be the API used by all.
>> 
>> Okay, well, proposals (including any diffs they reference) should remain
>> stable under review so we know what we're reviewing.  I really don't
>> know what's being proposed here.
>
> Noted. I was following the example set by many other reviews, and I
> apologize for not knowing what the right was here.

n/p.

> I intend to continue to push changes to the PR (or make a new branch,
> more likely), but will strive to make the proposal not dependent on the
> branch anyway. 
>
>> 
>> > static func decode<
>> >   Encoding: UnicodeCodecType, 
>> >   Input: CollectionType where Input.Generator.Element == Encoding.CodeUnit
>> > >(
>> >    _: Input, as: Encoding.Type, repairingInvalidCodeUnits: Bool = default
>> > ) -> (result: String, repairsMade: Bool)?
>> 
>> It seems like you are withdrawing decode from the proposal?

?

>> > For convenience, the Bool flag here is also separated out to a more
>> > common-case pair of String initializers:
>> 
>> I don't understand what it means for a “bool flag to be separated out to
>> a... pair of ... initializers.”
>> 
>> > init<...>(codeUnits: Input, as: Encoding.Type)
>> > init?<...>(validatingCodeUnits: Input, as: Encoding.Type)
>> 
>> Are you saying that the first initializer corresponds to
>> `repairingInvalidCodeUnits: true` and the second to
>> `repairingInvalidCodeUnits: false`?
>
> Yes.

Okay, well the language you used doesn't make that clear.

>> What about the user who wants repairs if necessary and wants to know
>> about it when repairs were made?
>
> That's valid enough reason to keep the full version of decode around,
> though I will again note that the flag is typically ignored. I've yet to
> find someone actually using it.

Does the standard library use it?

>> > Finally, for more direct compatibility with String.Type.fromCString(_:)
>> > and String.Type.fromCStringRepairingIllFormedUTF8(_:), these
>> > constructors are overloaded for pointer-based strings of unknown length:
>> > 
>> > init(cString: UnsafePointer<CChar>)
>> > init?(validatingCString: UnsafePointer<CChar>)
>> 
>> Why are we trying to “be compatible” with those static methods?  
>
> They are existing public API and they are being deprecated and replaced
> by this proposal. It was noted during the proposal shopping period that
> requiring strlen and creating an UnsafeBufferPointer would be, in as
> many words, too much migration effort.

Ah.

>> > The ugliness in this instance is transitional; 
>> 
>> What ugliness?
>
> "I find the old code much clearer than the new code", being the
> forwarding implementation of
> String.Type.fromCStringRepairingIllFormedUTF8(_:).

I'm sorry, I don't mean to be difficult, really, but I don't understand
how that answers my quetsion.

>> > it stems from bridging the old-style constructor to the new-style
>> > one. 
>> 
>> I don't know what that means either.
>
> String.Type.fromCString(_:) and
> String.Type.fromCStringRepairingIllFormedUTF8(_:) are replaced by
> String(validatingCString:) and String(cString:), respectively, as
> described under "Impact on existing code".

Nor that.

>> > The old signature's return type is awkward in that it returns the
>> > extra hadError flag regardless of whether decoding failed, which is
>> > discarded almost universally by clients in the stdlib.
>> >
>> > I intend to revert the change to `fromCStringRepairingIllFormedUTF8`, or
>> > elide `_decode` into it instead.
>> 
>> I don't know what that means either.
>
> You found the changes fromCStringRepairingIllFormedUTF8 unclear, I
> intend to return it to the previous version.
>
>> 
>> >> Lastly, I am very concerned about the “Alternatives Considered” section,
>> >> where, of one alternative, it says:
>> >> 
>> >>       This might be the better long-term solution from the perspective
>> >>       of API maintenance, but in the meantime this proposal has a fairly
>> >>       low impact.
>> >> 
>> >> We can't accept changes into the standard library “in the meantime,”
>> >> with the expectation that something more comprehensive will make them
>> >> obsolete.  Even though we've had migration tools, we never operated that
>> >> way in the past, and as we head toward API and ABI stability it is even
>> >> more true today.
>> >
>> > I understand the sentiment here, but I don't think "wait for the next
>> > rewrite of String" is a good solution to a problem that the stdlib
>> > already resolves. 
>> 
>> I'm really confused.  If the stdlib already resolves the problem, why is
>> there a proposal?
>
> It has implementations in underscored form, which I can't use.
>
>> 
>> > See below.
>> >
>> >> 
>> >> > Is the problem being addressed significant enough to warrant a change
>> >> > to Swift?
>> >> 
>> >> Yes.
>> >> 
>> >> > Does this proposal fit well with the feel and direction of Swift?  
>> >> 
>> >> W.r.t. direction, the fact that we have a major String overhaul planned
>> >> means that tackling this one corner of the API is probably not entirely
>> >> appropriate.
>> >
>> > "Corner" is an unfair characterization; 
>> 
>> Not at all; this addresses an important use-case, but accounts for a
>> very small part of the API.
>> 
>> > users of Swift want to implement things to the level of safety and
>> > performance bar set by the stdlib, but we are currently at a
>> > disadvantage. No better example can be found but in
>> > corelibs-foundation:
>> > https://github.com/apple/swift-corelibs-foundation/blob/546dc8e16c3c34ca50f5752c6d0f39c3524f5f0a/Foundation/NSString.swift#L305.
>> 
>> Heh, well, that's ironic.  Before Swift was released we had all of this
>> functionality for unicode transcoding available in the String API, and
>> it was decided that to avoid treading into Foundation's territory, any
>> functionality already exposed by NSString should not be public API on
>> String without Foundation loaded.
>> 
>
> That's unfortunate.
>
>> I have ambitions that Swift strings will be first-class standalone types
>> without the need to rely on Foundation some day soon, but under the
>> circumstances we may need a more comprehensive change than merely adding
>> fast unicode transcoding to justify treading into this area.
>> 
>> > Is it not a deficiency of the stdlib when code has to resort to
>> > non-public methods? 
>> 
>> Yes.  There are lots of deficiencies in the stdlib.  I'm not saying this
>> shouldn't be addressed, but I'm concerned about addressing this in a
>> temporary way that we think may be sub-optimal.
>> 
>> > The stdlib (i.e., the parts touched by the PR) and corelibs-foundation
>> > would have to move in lockstep to adopt a replacement, so the existing
>> > underscored versions are as good as public API.
>> >
>> > I understand, and 100% encourage, the reticence around new API. However,
>> > we've created a worse problem by encouraging slow, buggy, custom
>> > versions of behavior that already exists in the stdlib, or implying that
>> > the underscored API should be used because we haven't managed something
>> > better yet. :/
>> 
>> I am not reticent to have new API.  I am reticent to accept partial or
>> known-suboptimal temporary solutions when we are developing a
>> comprehensive plan that ought to address the same problems (among many
>> others).
>
> Not it's my turn to be confused. I don't think of this proposal as a
> temporary solution, and would like more color on that if you would. 

You have said that in the fullness of time this problem would be better
addressed a different way.

> Even if a "more complete" variant found its way to the String views,
> the number of clients in the stdlib and Foundation make it clear that
> this group of constructors are necessary. Unless the entire
> UnicodeCodec concept is going away, I don't see this proposal causing
> a drastic migration problem; if they are, then the proposed
> constructors would be just a small component of another heroic
> migration anyway.

We're not going to have that many more heroic migrations; that's my
point.  We should approve the right answer, whatever that is.

-- 
-Dave



More information about the swift-evolution mailing list