[swift-evolution] [Review] SE-0184: Unsafe[Mutable][Raw][Buffer]Pointer: add missing methods, adjust existing labels for clarity, and remove deallocation size

Xiaodi Wu xiaodi.wu at gmail.com
Sat Sep 2 15:54:49 CDT 2017

On Sat, Sep 2, 2017 at 3:36 PM, Andrew Trick <atrick at apple.com> wrote:

> Thanks for the review as always…
> On Sep 2, 2017, at 12:13 PM, Taylor Swift via swift-evolution <
> swift-evolution at swift.org> wrote:
> On Sat, Sep 2, 2017 at 10:03 AM, Xiaodi Wu via swift-evolution <swift-
> evolution at swift.org> wrote:
>> On Sat, Sep 2, 2017 at 12:27 AM, Douglas Gregor via swift-evolution <
>> swift-evolution at swift.org> wrote:
>>> Hello Swift community,
>>> The review of SE-0184 "Unsafe[Mutable][Raw][Buffer]Pointer: add missing
>>> methods, adjust existing labels for clarity, and remove deallocation size"
>>> begins now and runs through September 7, 2017. The proposal is available
>>> here:
>>> https://github.com/apple/swift-evolution/blob/master/proposa
>>> ls/0184-unsafe-pointers-add-missing.md
>>> ## UnsafeMutableBufferPointer
>> Please clarify: why are you proposing that the `at:` arguments in
>> `UnsafeMutableBufferPointer` and `UnsafeMutableRawBufferPointer` _should
>> not_ receive default values, but the `at:` arguments in
>> `UnsafeMutableRawPointer` _should_ receive a default value of `0`?
>  The extant API for `UnsafeMutableRawPointer` already included these
> default arguments which seem to be widely used in the stdlib,, the proposal
> tries to avoid these kinds of source breakages wherever possible. We avoid
> providing the default arguments on buffer pointers because we want the fact
> that it takes a *start–length* segment pair to be obvious at the call
> site.
> That’s right, the buffer pointer API is just one level above pointers. It
> primarily offers the safety of tracking its capacity and bounds checking in
> debug mode. It does not help you safely manage fully initialized buffer
> slices (we *do* want to provide that convenience later as a higher-level
> non-pointer type). Instead, it exposes segments within the buffer for
> initialization, assignment, deinitialization. It needs to be obvious in the
> client code, at every call site, that the caller is responsible for
> tracking those segments. Earlier iterations of this proposal attempted to
> hide this as a convenience, but that led to dangerous scenarios.
> Taking at Kelvin’s example:
> var image = UnsafeMutableBufferPointer<Pixel>.allocate(capacity :
> maxPixels)
> var filled:Int = 0
> for scanline: UnsafeMutableBufferPointer<Pixel> in scanlines {
>     image.moveInitialize(at: filled, from: scanline)
>     filled += scanline.count
> }
>> // We don’t want to allow any defaults here.
> // It’s important for the client code to explicitly correlate the range
> // that it initialized with the range that it deinitialized.
> image.deinitialize(at: 0, count: filled)
> Concerns:
>> ## UnsafeMutablePointer
>> It's alarming that omitting `count` in `initialize(repeating:count:)`
>> (and assign, etc.) means initialize _one_ element, but elsewhere (such as
>> `UnsafeMutableBufferPointer` means initialize _all_ elements. The behavior
>> of the proposed API also contradicts its own spelling on its face:
>> `initialize(repeating: foo)` means *do not repeat* `foo`.
>> Yes, I understand the argument that `*BufferPointer` types have an
>> intrinsic count, etc., but in the context of code where types are inferred,
>> `let foo = T.allocate(capacity: 100); foo.initialize(repeating: bar)`
>> should not mean one thing for `*BufferPointer` types and a totally
>> different thing for plain `*Pointer` types--particularly when both can be
>> allocated with a certain capacity greater than one.
>> Either `count` should always be required, or for convenience there should
>> be a separate overload `initialize(pointee:)` that does not require `count`.
> I understand the naming is not optimal, but reams of discussion on this
> list have concluded that it’s the least bad alternative available. We can’t
> just get rid of the default value for `count:` because usage in real code
> bases shows that this default argument is actually extremely useful. I
> believe well over 90% of the calls to these methods in the standard library
> currently rely on the default argument value. Renaming the `repeating:`
> argument to `to:` would make the API inconsistent and hide the fact that
> plain pointers are still capable of operating on many elements in sequence
> — “`to:count:`” makes no grammatical sense to read — “to” is a singular
> preposition.
> The *only* thing I really don’t like about the proposed API is the
> potential confusion between UnsafeMutablePointer calls with a default count
> and UnsafeMutableBufferPointer calls using the implied buffer count. But
> this is where we ended up because the default count turns out to be useful.
> I’m willing to live with it since I don’t see a great alternative.

This only thing is also the major thing that causes me concern. I have no
problem with the name itself. In my view, it is important to provide the
very useful function with a default count of 1 using a different name, as
I've written in the other reply.

>> ## UnsafeMutableRawBufferPointer
>> In `copyBytes`, the use of `Bytes` to emphasize that it's the memory
>> that's being copied is thoughtful, but it is inconsistent with the other
>> method names that use the terminology `Memory` for the same purpose (e.g.,
>> `moveInitializeMemory` to clarify the meaning of `moveInitialize`).
>> For better consistency--and since you're proposing to rename
>> `copyBytes(from:count:)` on `UnsafeMutableRawPointer` to
>> `copy(from:bytes:)`--this particular API on `UnsafeMutableRawBufferPointer`
>> should be named `copyMemory(from:)` and not `copyBytes(from:)`.
>> Although, actually, looking at the APIs on `UnsafeMutableRawPointer`
>> itself, that particular method too might best be written as
>> `copyMemory(from:bytes:)` instead of merely `copy(from:bytes:)` for better
>> consistency with the rest of the methods on that type as well.
> Xiaodi has a fair point. I’m a moderate +1 on his suggestion. I can’t
> remember why it wasn’t originally called “copyMemory”. Someone may have
> taken issue with the fact that it’s the contents of memory being copied,
> but that’s a silly distinction.
> “Memory” methods are distinct from “Bytes” methods in that they assume
> typed memory. “Bytes” on the other hand does not care about types.
> The other Memory methods take a type parameter out of necessity, but that
> doesn’t need to be a rule. The “Memory” suffix is there because the
> semantics of operating on untyped memory is slightly different. I think
> copyMemory would fit with that convention. After all, it’s really meant to
> be Swift's ‘memcpy’.
> ## General comment
>> Many `at:` arguments, especially such as in the case of
>> `copyBytes(at:from:)`, make sense only when read in a list with all other
>> methods. Standing alone, `at` is ambiguous as to whether it's referring to
>> the _source_ or the _destination_. Why do these APIs on `*BufferPointer`
>> types not take advantage of subscripts? That is, why not:
>>   `foo[x...].copyMemory(from: bar)`
>> instead of
>>   `foo.copyBytes(at: x, from: bar)`
>> The first seems dramatically clearer as to its meaning. The same feedback
>> applies to nearly all uses of `at` on `*BufferPointer` types: they would
>> seem to be greatly clarified (in terms of the "what does `at` mean"
>> question) by the use of a subscript spelling.
> In the proposed API, we have a consistent convention that some segment
> within potentially larger buffer is being replaced with the contents of a
> smaller buffer.
> Ultimately, we want buffer slices to work the way you describe, but first
> we need to introduce new kind of buffer slice. That won’t happen in Swift
> 5. Our current buffer slices are fundamentally dangerous, and I don’t think
> it’s right to build convenience around danger.

Got it. I think we've circled back to the same conclusion here about the
need for a new kind of buffer slice.

> This idea sounds elegant on the surface but it’s deeply problematic. `
> foo[x...]` suggests that whatever happens to it, will happen to the
> entire rest of the buffer slice as well, which is not always the case. It
> would have to be written as `foo[x ... x + count].copyMemory(from: bar)`
> or `foo[x ... x + bar.count].copyMemory(from: bar)` which seems *less* clear.
> Having to write `foo[0...]` to operate with no offset also seems silly.
> It also means the API would have to be written on `
> MutableRandomAccessSlice<UnsafeMutable*BufferPointer>`. Finally, what
> happens when you subscript a raw pointer? the subscript doesn’t know about
> the stride that you want to use. If you want to get rid of the `at:`
> ambiguity, you have to get rid of it everywhere, or users will just wind up
> having to remember two ways (one ambiguous and one less ambiguous) of doing
> the same thing, instead of one (ambiguous) way of doing it.
> Yes, Xiaodi is reopening this can of worms again. This is simply out of
> scope since the proposed API isn’t making any of these problems harder to
> solve in the future.

It isn't making future solutions harder, but my concern is that the
ambiguity of `at` and its use with different meanings (`at` in typed
pointers referring to an offset that is a multiple of the *destination*
pointee type, but `at` in raw pointers referring to an offset that is a
multiple of the *source* pointee type instead of raw bytes) is confusing
enough that designing a whole set of new methods to increase this usage is
not clearly an improvement. At least, as compared to the status quo of
tolerating the less ergonomic, but also unambiguous, `foo +
MemoryLayout<T>.stride * offset`. Especially given that, as you say, these
problems do need to be solved some other way in the future. Unless I am
missing another major argument for incorporating additional uses of `at`,
it would seem (to me) preferable to eliminate all or most of these uses
entirely even if new slice types are out of scope for Swift 5.

> I notice that you comment that you feel there are ergonomic issues with
>> buffer pointer slicing; can you please comment on what is "wasteful"
>> currently? Is there a performance hit to slicing a `*BufferPointer` type?
>> If so, we should fix that, as the whole rationale of having these types (as
>> I understand it) is to improve the ergonomics of working with pointers to
>> multiple elements by conforming them to `*Collection` APIs.
> Slices are not the right abstraction for this because of Swift’s indexing
> semantics. A buffer pointer requires two words of information (beginning
> and end address), while a buffer pointer slice occupies four words of
> information (beginning, end, start-index, end-index) to preserve the index
> semantics. Creating a way to “slice” a buffer pointer into another buffer
> pointer without going through `init(rebasing:)` is also problematic
> because you can’t `deallocate()` a debased buffer pointer, so we should
> make that operation explicit.
> Today’s buffer slices can certainly be useful when you want to expose a
> slice as a Sequence. They correctly refer back to the parent buffer and are
> exactly as efficient as they should be. You probably wouldn’t want to
> persistently store slices though. You’re better off rebasing the slice as a
> new buffer. When you do that it’s reasonably obvious that those rebased
> buffers don’t own their underlying memory. We can definitely come up with a
> better higher-level API, but there are bigger fish to fry.
> It seems deeply unsatisfactory to invent new methods that use `at:`
>> arguments _on a type whose purpose is to expose `*Collection` APIs_ if we
>> agree that the slicing notation is demonstrably clearer. I do not have the
>> same concerns about adding `at:` arguments to `UnsafeMutableRawPointer`
>> methods, which are quite appropriate.
> The proposed new methods provide reasonable functionality on
> UnsafeBufferPointers when you’re *not* using collection APIs. A major
> motivation for this proposal is that to do anything low-level with an
> UnsafeBufferPointer, you currently need to drop down to an UnsafePointer
> first, which is hideous, and you lose bounds checking.
> -Andy
>>>    - Is the problem being addressed significant enough to warrant a
>>>    change to Swift?
>>> Yes.
>>>    - Does this proposal fit well with the feel and direction of Swift?
>>> In parts, yes. In others (see above), it could be improved to fit better
>> with the feel and direction of other Swift APIs.
>>>    - If you have used other languages or libraries with a similar
>>>    feature, how do you feel that this proposal compares to those?
>>> There is much more friction to using pointers in Swift than in C.
>> However, making Swift pointers like C pointers is clearly a non-goal. This
>> proposal appropriate addresses major pain points to Swift-specific usages
>> of pointers.
>>>    - How much effort did you put into your review? A glance, a quick
>>>    reading, or an in-depth study?
>>> A moderately thorough reading, drawing on some experience using pointer
>> APIs in Swift, and based on prior readings of previous iterations of this
>> proposal and the on-list discussion.
>>> More information about the Swift evolution process is available at
>>> https://github.com/apple/swift-evolution/blob/master/process.md
>>> Thank you,
>>> -Doug
>>> Review Manager
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170902/d13cc4eb/attachment.html>

More information about the swift-evolution mailing list