[swift-evolution] [Review] SE-0184: Unsafe[Mutable][Raw][Buffer]Pointer: add missing methods, adjust existing labels for clarity, and remove deallocation size

Sat Sep 2 16:40:37 CDT 2017

On Sat, Sep 2, 2017 at 4:06 PM, Taylor Swift <kelvin13ma at gmail.com> wrote:

>
>
> On Sat, Sep 2, 2017 at 3:39 PM, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>
>> On Sat, Sep 2, 2017 at 2:13 PM, Taylor Swift <kelvin13ma at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Sat, Sep 2, 2017 at 10:03 AM, Xiaodi Wu via swift-evolution <
>>> swift-evolution at swift.org> wrote:
>>>
>>>> On Sat, Sep 2, 2017 at 12:27 AM, Douglas Gregor via swift-evolution <
>>>> swift-evolution at swift.org> wrote:
>>>>
>>>>> Hello Swift community,
>>>>>
>>>>> The review of SE-0184 "Unsafe[Mutable][Raw][Buffer]Pointer: add
>>>>> missing methods, adjust existing labels for clarity, and remove
>>>>> deallocation size" begins now and runs through September 7, 2017. The
>>>>> proposal is available here:
>>>>>
>>>>> https://github.com/apple/swift-evolution/blob/master/proposa
>>>>> ls/0184-unsafe-pointers-add-missing.md
>>>>>
>>>>> Reviews are an important part of the Swift evolution process. All
>>>>> reviews should be sent to the swift-evolution mailing list at
>>>>>
>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>>
>>>>> or, if you would like to keep your feedback private, directly to the
>>>>> review manager. When replying, please try to keep the proposal link at the
>>>>> top of the message:
>>>>>
>>>>> Proposal link:
>>>>>
>>>>> https://github.com/apple/swift-evolution/blob/master/proposa
>>>>> ls/0184-unsafe-pointers-add-missing.md
>>>>>
>>>>> Reply text
>>>>>
>>>>> Other replies
>>>>>
>>>>>
>>>>> <https://github.com/apple/swift-evolution/pulls#what-goes-into-a-review-1>What
>>>>> goes into a review?
>>>>>
>>>>> The goal of the review process is to improve the proposal under review
>>>>> through constructive criticism and, eventually, determine the direction of
>>>>> Swift. When writing your review, here are some questions you might want to
>>>>> answer in your review:
>>>>>
>>>>>    - What is your evaluation of the proposal?
>>>>>
>>>>> Overall, this is an improvement. However, I do have some questions and
>>>> concerns:
>>>>
>>>>
>>>> Questions:
>>>>
>>>> ## UnsafeMutableRawPointer
>>>>
>>>> Regarding the options given for "whose `count` to use"--which option is
>>>> actually being proposed?
>>>>
>>>
>>> I don’t understand the question,, `UnsafeMutableRawPointer` takes an
>>> explicit `count:`. the “whose count to use” option is irrelevant.
>>>
>>
>> In "Proposed Solution," under subheading "UnsafeMutableRawPointer," you
>> write "the question of whose `count` to use becomes important." You then
>> outline "[o]ne option" as well as "[a] better option." Which of these
>> options are you actually proposing? For clarity, could you please excise
>> the non-proposed option from the "Proposed Solution" section and move it to
>> the "Alternatives Considered" section?
>>
>
> The *Proposed solution* section is divided into two parts, one dealing
> with plain pointers and one dealing with buffer pointers. The sections are
> separated by the horizontal rule above “*Buffer pointers are conceptually
> similar…*”. I don’t know why we’re arguing over typographic formatting
> but I am glad this proposal is noncontroversial enough to induce debate
> over its horizontal rules. As for the two options, the first is a strawman
> to explain why we are going with the second option.
>

I am not arguing over formatting. I'm only asking for clarification as to
which of two options, listed in contiguous paragraphs without any
intervening or surrounding horizontal rules, is actually being proposed. Am
I correct in understanding that you are proposing that there be a
precondition that `offset + source.count <= destination.count`? I do agree
that this is the superior design. In that case, the text could use some
clarification as to which is the strawman.

## UnsafeMutableBufferPointer
>>>>
>>>> Please clarify: why are you proposing that the `at:` arguments in
>>>> `UnsafeMutableBufferPointer` and `UnsafeMutableRawBufferPointer`
>>>> _should not_ receive default values, but the `at:` arguments in
>>>> `UnsafeMutableRawPointer` _should_ receive a default value of `0`?
>>>>
>>>>
>>>  The extant API for `UnsafeMutableRawPointer` already included these
>>> default arguments which seem to be widely used in the stdlib,, the proposal
>>> tries to avoid these kinds of source breakages wherever possible. We avoid
>>> providing the default arguments on buffer pointers because we want the fact
>>> that it takes a *start–length* segment pair to be obvious at the call
>>> site.
>>>
>>
>> Thanks for the clarification; that would be helpful information to put
>> into the proposal text. It is not an intuitive start-length pair, since the
>> `at` refers to an offset of the destination buffer but `count` refers to a
>> length of the source buffer. I appreciate how you separated the proposed
>> new argument `at` and the existing argument `count` in what is currently
>> named `initializeMemory<T>(as:from:count:)`, which helps to reinforce
>> that fact.
>>
>>
>>> Concerns:
>>>>
>>>> ## UnsafeMutablePointer
>>>>
>>>> It's alarming that omitting `count` in `initialize(repeating:count:)`
>>>> (and assign, etc.) means initialize _one_ element, but elsewhere (such as
>>>> `UnsafeMutableBufferPointer` means initialize _all_ elements. The behavior
>>>> of the proposed API also contradicts its own spelling on its face:
>>>> `initialize(repeating: foo)` means *do not repeat* `foo`.
>>>>
>>>> Yes, I understand the argument that `*BufferPointer` types have an
>>>> intrinsic count, etc., but in the context of code where types are inferred,
>>>> `let foo = T.allocate(capacity: 100); foo.initialize(repeating: bar)`
>>>> should not mean one thing for `*BufferPointer` types and a totally
>>>> different thing for plain `*Pointer` types--particularly when both can be
>>>> allocated with a certain capacity greater than one.
>>>>
>>>> Either `count` should always be required, or for convenience there
>>>> should be a separate overload `initialize(pointee:)` that does not require
>>>> `count`.
>>>>
>>>>
>>> I understand the naming is not optimal, but reams of discussion on this
>>> list have concluded that it’s the least bad alternative available. We can’t
>>> just get rid of the default value for `count:` because usage in real
>>> code bases shows that this default argument is actually extremely useful. I
>>> believe well over 90% of the calls to these methods in the standard library
>>> currently rely on the default argument value. Renaming the `repeating:`
>>> argument to `to:` would make the API inconsistent and hide the fact that
>>> plain pointers are still capable of operating on many elements in sequence
>>> — “`to:count:`” makes no grammatical sense to read — “to” is a singular
>>> preposition.
>>>
>>
>> Let me clarify my concern here. This is not intended to be a bikeshedding
>> exercise and I agree with your choice of `repeating` (as I did in the
>> original conversations). For the sake of clarity, however, I'll proceed
>> with this discussion as though the argument were named `xxxx`. My point
>> here is that, regardless of what `xxxx` is called, we have a problem here
>> as follows:
>>
>> ```
>> let foo = UnsafeMutablePointer<Int>.allocate(capacity: 21)
>> foo.initialize(xxxx: 42) // initializes 1 value
>>
>> let bar = UnsafeMutableBufferPointer<Int>.allocate(capacity: 21)
>> bar.initialize(xxxx: 42) // initializes 21 values
>> ```
>>
>> The same spelling, `initialize(xxxx:)`, does two *different* things under
>> your proposal depending on whether it's invoked on a UMP or a UMBP. Even
>> though it's admirable that your proposal is filling in the gaps of Swift's
>> pointer design and increasing consistency by making similar things have
>> similar names, different things need to have different names; otherwise, we
>> are actively creating footguns.
>>
>> Therefore, while I agree with your choice for `xxxx`, and while I also
>> agree that it is very useful to have a method that initializes a single
>> value on a UMP, we need to have a different label `yyyy` for that method.
>> My suggestion is `pointee`, but I would be equally happy with `value`,
>> `to`, or whatever else you may choose.
>>
>
> yes this is a problem, but your solution is to add a second set of methods
> that are functionally identical, except for the names of the argument
> labels. This strikes me as silly and wasteful of API surface.
>

Disagree strongly. If the default value of 1 is important enough to merit
inclusion and even important enough that you are willing to risk the
problem of causing an identical spelling to do two different things, then
it certainly merits its own function. That is an appropriate and necessary
increase in the API surface and not at all wasteful.

## UnsafeMutableRawBufferPointer
>>>>
>>>> In `copyBytes`, the use of `Bytes` to emphasize that it's the memory
>>>> that's being copied is thoughtful, but it is inconsistent with the other
>>>> method names that use the terminology `Memory` for the same purpose (e.g.,
>>>> `moveInitializeMemory` to clarify the meaning of `moveInitialize`).
>>>>
>>>> For better consistency--and since you're proposing to rename
>>>> `copyBytes(from:count:)` on `UnsafeMutableRawPointer` to
>>>> `copy(from:bytes:)`--this particular API on `UnsafeMutableRawBufferPointer`
>>>> should be named `copyMemory(from:)` and not `copyBytes(from:)`.
>>>>
>>>> Although, actually, looking at the APIs on `UnsafeMutableRawPointer`
>>>> itself, that particular method too might best be written as
>>>> `copyMemory(from:bytes:)` instead of merely `copy(from:bytes:)` for better
>>>> consistency with the rest of the methods on that type as well.
>>>>
>>>>
>>>>
>>> “Memory” methods are distinct from “Bytes” methods in that they assume
>>> typed memory. “Bytes” on the other hand does not care about types.
>>>
>>
>> It's unclear to me that this distinction is either consistently observed
>> or helpful. For instance, by that standard, `initializeMemory(as:from:)`
>> should be named `initializeBytes(as:fromMemory:)`, since the memory
>> being initialized is raw until after initialization. This strikes me as not
>> at all necessary for user comprehension of the APIs. (On the other hand, if
>> it _is_ necessary for comprehension, then `copyBytes` should never be
>> shortened to `copy`, since it would be necessary to emphasize that both the
>> source and destination are treated as "bytes" rather than "memory.")
>>
>
> `copyMemory` would probably be better and I am not opposed to adding it to
> the proposal. at the same time I don’t think it’s so bad a problem that it
> really *needs* to be renamed.
>
>
>>
>> ## General comment
>>>>
>>>> Many `at:` arguments, especially such as in the case of
>>>> `copyBytes(at:from:)`, make sense only when read in a list with all other
>>>> methods. Standing alone, `at` is ambiguous as to whether it's referring to
>>>> the _source_ or the _destination_. Why do these APIs on `*BufferPointer`
>>>> types not take advantage of subscripts? That is, why not:
>>>>
>>>>   `foo[x...].copyMemory(from: bar)`
>>>>
>>>> instead of
>>>>
>>>>   `foo.copyBytes(at: x, from: bar)`
>>>>
>>>> The first seems dramatically clearer as to its meaning. The same
>>>> feedback applies to nearly all uses of `at` on `*BufferPointer` types: they
>>>> would seem to be greatly clarified (in terms of the "what does `at` mean"
>>>> question) by the use of a subscript spelling.
>>>>
>>>
>>> This idea sounds elegant on the surface but it’s deeply problematic. `
>>> foo[x...]` suggests that whatever happens to it, will happen to the
>>> entire rest of the buffer slice as well, which is not always the case.
>>>
>>
>> No more so than `foo.copyBytes(at:from:)` suggests the same?
>>
>
> `at:` suggests an unbounded starting point. `x...` suggests a concrete
> range of the buffer since it’s really shorthand for `x ... endIndex`.
>
>
>>
>>
>>> It would have to be written as `foo[x ... x + count].copyMemory(from:
>>> bar)` or `foo[x ... x + bar.count].copyMemory(from: bar)` which seems
>>> *less* clear.  Having to write `foo[0...]` to operate with no offset
>>> also seems silly.
>>>
>>
>> It is unclear to me why one would have to write `foo[0...]`.
>>
>
> How else would you operate at offset 0?
>

As I wrote earlier, `copyMemory` would be available on a protocol (a la
`StringProtocol`) to which both buffer pointers and their slices would
conform.

>
>>
>>> It also means the API would have to be written on `
>>> MutableRandomAccessSlice<UnsafeMutable*BufferPointer>`.
>>>
>>
>> Not necessarily; it's unclear to me that `MutableRandomAccessSlice<UnsafeMutable*BufferPointer>`
>> is the right slice type for `UnsafeMutable*BufferPointer` in the first
>> place (see below).
>>
>>
>>> Finally, what happens when you subscript a raw pointer?
>>>
>>
>> As you know, only buffer pointers conform to `Collection`, so it's
>> unclear to me what your question is here.
>>
>
> I meant a raw *buffer* pointer.
>

Those would be, just as they are now, subscripted by bytes. Although of
course, with generic subscripts, we could permit additional conveniences by
specifying, as a second parameter, the type whose stride is to be used
(strawman syntax: `umrbp[42, stride: Int.self]`).

the subscript doesn’t know about the stride that you want to use. If you
>>> want to get rid of the `at:` ambiguity, you have to get rid of it
>>> everywhere, or users will just wind up having to remember two ways (one
>>> ambiguous and one less ambiguous) of doing the same thing, instead of one
>>> (ambiguous) way of doing it.
>>>
>>
>> Certainly, that's a good point. On rethink and another re-reading of the
>> proposal, it's unclear to me that the addition of `at` arguments to
>> UnsafeMutablePointer is sufficiently justified by the proposal text. Is it
>> merely that it's shorter than writing `foo + MemoryLayout<T>.stride *
>> offset`? With the ambiguity of `at`, it seems that the current way of
>> writing it, though longer, is certainly less ambiguous.
>>
>
> Please reread it; UnsafeMutablePointer’s methods do *not* use `at:`.
>

Apologies; I meant UnsafeMutableRawPointer (as opposed to buffer pointers).

I notice that you comment that you feel there are ergonomic issues with
>>>> buffer pointer slicing; can you please comment on what is "wasteful"
>>>> currently? Is there a performance hit to slicing a `*BufferPointer` type?
>>>> If so, we should fix that, as the whole rationale of having these types (as
>>>> I understand it) is to improve the ergonomics of working with pointers to
>>>> multiple elements by conforming them to `*Collection` APIs.
>>>>
>>>
>>> Slices are not the right abstraction for this because of Swift’s
>>> indexing semantics.
>>>
>>
>> This is an alarming statement if true, in that it would seem to undermine
>> the basic premise of `*BufferPointer` types conforming to `Collection`,
>> would it not?
>>
>
> I think Andy’s message has a good explanation for this.
>

Agreed. My takeaway from that message is that, indeed, `*BufferPointer`
types do not currently implement all the functionality necessary to
interact with them as bona fide collections in all respects, despite their
stated conformance. This is indeed problematic; I hope that it is agreed
that, at some later point, they either *ought to*, given the stated
conformance, or they ought to drop the conformance.

A buffer pointer requires two words of information (beginning and end
>>> address), while a buffer pointer slice occupies four words of information
>>> (beginning, end, start-index, end-index) to preserve the index semantics.
>>> Creating a way to “slice” a buffer pointer into another buffer pointer
>>> without going through `init(rebasing:)` is also problematic because you
>>> can’t `deallocate()` a debased buffer pointer, so we should make that
>>> operation explicit.
>>>
>>
>> It would seem, then, that to properly support slicing--which collection
>> types should do--we will need a custom slice type a la `Substring`. This
>> slice type would clearly not support deallocation, but it would conform to
>> a protocol (a la `StringProtocol`) which would require all the operations
>> such as `copyMemory` that makes sense for both buffer pointers and their
>> slices.
>>
>
> You’re basically proposing Protocol Oriented Pointers. Probably a good
> eventual goal, but we need to take things step by step.
>

I like the sound of it, and I'm fine with this being put off as an eventual
goal. My concern is that the `at:` arguments may not be a step forward or
even orthogonal to the eventual goal, in that there seems to be (at least,
to me) added rather than reduced ambiguity in APIs that use this argument.

The reasoning is in part, as I've outlined in other messages, the
inconsistent meaning of the `at:` argument (for example, `at: x` on methods
in UMBP<T> would correspond to `umbp[x]`, but `at: y` on methods in UMRBP
would *not* correspond to `umrbp[y]`), particularly given that the chief
ambiguity of these terms (`at`, `count`, etc.), has to do with the question
of "which one--source index/count or destination index/count?". This is
especially the case if, as Andy says, the eventual goal is to create slice
types so that subscripting becomes the preferred idiom.

>
>> It seems deeply unsatisfactory to invent new methods that use `at:`
>>>> arguments _on a type whose purpose is to expose `*Collection` APIs_ if we
>>>> agree that the slicing notation is demonstrably clearer. I do not have the
>>>> same concerns about adding `at:` arguments to `UnsafeMutableRawPointer`
>>>> methods, which are quite appropriate.
>>>>
>>>>
>>>>
>>>>>    - Is the problem being addressed significant enough to warrant a
>>>>>    change to Swift?
>>>>>
>>>>> Yes.
>>>>
>>>>
>>>>>
>>>>>    - Does this proposal fit well with the feel and direction of Swift?
>>>>>
>>>>> In parts, yes. In others (see above), it could be improved to fit
>>>> better with the feel and direction of other Swift APIs.
>>>>
>>>>
>>>>>
>>>>>    - If you have used other languages or libraries with a similar
>>>>>    feature, how do you feel that this proposal compares to those?
>>>>>
>>>>> There is much more friction to using pointers in Swift than in C.
>>>> However, making Swift pointers like C pointers is clearly a non-goal. This
>>>> proposal appropriate addresses major pain points to Swift-specific usages
>>>> of pointers.
>>>>
>>>>
>>>>>
>>>>>    - How much effort did you put into your review? A glance, a quick
>>>>>    reading, or an in-depth study?
>>>>>
>>>>> A moderately thorough reading, drawing on some experience using
>>>> pointer APIs in Swift, and based on prior readings of previous iterations
>>>> of this proposal and the on-list discussion.
>>>>
>>>>
>>>>> More information about the Swift evolution process is available at
>>>>>
>>>>> https://github.com/apple/swift-evolution/blob/master/process.md
>>>>>
>>>>> Thank you,
>>>>>
>>>>> -Doug
>>>>>
>>>>> Review Manager
>>>>>
>>>>> _______________________________________________
>>>>> swift-evolution mailing list
>>>>> swift-evolution at swift.org
>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> swift-evolution mailing list
>>>> swift-evolution at swift.org
>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170902/627d4f5b/attachment.html>