[swift-evolution] Pitch: Improved Swift pointers
Andrew Trick
atrick at apple.com
Mon Jul 17 11:09:26 CDT 2017
> On Jul 14, 2017, at 7:24 PM, Taylor Swift <kelvin13ma at gmail.com> wrote:
>
> For the sake of argument, I’m gonna claim that instead, UnsafeMutableBufferPointer is the low level necessity, and UnsafeMutablePointer is the low level convenience structure.
>
> Suppose that every block of memory has a starting point, and a length. An UnsafeBufferPointer represents that, since all an UnsafeBufferPointer is, is a start address, and a length value, bundled together into one variable instead of two. Then we can say that UnsafePointer is a special case of UnsafeBufferPointer, where the length value is unknown.
>
> Mind you I didn’t say that the length value equals 1. But it might as well be. Because when the length is unknown, the only index where you can reasonably expect there to be a valid value is index zero. You wouldn’t feel confident trying to do something to the memory address 3 strides down from the pointer. If you did feel confident doing that, well now you have a length value. The length value is 4. Your regular Pointer is now a BufferPointer.
>
> The only reason any of this matters, is because all of our memory and memorystate -related functions take size arguments. You have to fill in the size argument with something. That something is the buffer `count`. Even if you don’t actually know the number for the `count`, you still have to supply some value, in which case the “only” sensible choice is “1”. That’s why you could argue that UnsafePointer is just a special case of a buffer pointer where `count` is 1. The UnsafePointer API demands information that only BufferPointers know. You could design an API where buffer pointers are the only pointers that exist, and everything would still work fine. Just because they have high-level capabilities doesn’t mean they can’t do everything plain pointers can do just as efficiently. A plain pointer takes up one word of storage, and you use another word of storage yourself to track the size. A buffer pointer stores the two words next to each other. You could ignore the size word and track the size yourself, taking up three words of storage, but that would just be silly.
>
> But wait! I don’t need to track the size, you say! The pointer just points to one item! I just need the address word, not the size word! Well, yeah. That’s why I proposed a sizeless singular pointer API, so you don’t have to go around plugging “1”s everywhere and save yourself a word of storage. You can argue that’s the actual high-level API, since it abstracts away the length stuff. But if “1” isn’t your answer to the question “on how many instances should this method operate on”, go back to paragraph 3 and convince yourself that what you really have is an buffer pointer, not a plain pointer.
>
> Now I thought of an exception to this, like when you’re assigning 8-bit RGBA values to an image buffer and it might make sense to write something like this in the current API:
>
> let size:Int = height * width << 2
> let base = UnsafeMutablePointer<UInt8>.allocate(capacity: size)
> defer
> {
> base.deallocate(capacity: -314159) // anything goes
> }
> var pixel:UnsafeMutablePointer<UInt8> = base
> while pixel < base + size
> {
> pixel.initialize(from: bg_color_rgba, count: 4)
> pixel += 4
> }
>
> And it’s convenient to be able to initialize 4 instances at a time without creating a 4-count buffer pointer. But this doesn’t really contradict my point if you think carefully. The groups of 4 RGBA values are your atom here, you’re not really working on 4 instances at a time with a stride 1 UInt8 big, you’re working on 1 single instance at a time with a stride 4 UInt8s big. Writing this is currently painful without the sized UnsafeMutablePointer API, but I think this is a deficiency of the UnsafeMutableBufferPointer API and that of its slice type, not a reason for keeping the sized UnsafeMutablePointer API, which if you ask me is a hack to get around the fact that we can’t use buffer pointer slices easily.
>
> Anyway, I mention all of this only because everyone seems convinced that UnsafePointer is supposed to be “edgier” than UnsafeBufferPointer. If you have ideas to make this better, by all means share them
UnsafeBufferPointer was developed to add safety and convenience on top of UnsafePointer. That’s just a historical fact, not necessarily detracting from your argument.
The problem with your use case above is that UnsafePointer doesn’t get the debug-mode bounds checks.
There’s an argument to be made for keeping the UnsafePointer API complete (as a single point of truth for the memory model), and an argument to be made for removing it (it’s not the *right* API to use in nearly all cases). I would be fine with either approach and people can argue it out. At the moment I would lean toward not breaking source for something that isn’t a critical problem for the language.
That doesn’t need to be resolved to make UnsafeBufferPointer clean and convenient.
-Andy
> On Fri, Jul 14, 2017 at 2:27 PM, Michael Ilseman <milseman at apple.com <mailto:milseman at apple.com>> wrote:
>
>> On Jul 13, 2017, at 6:55 PM, Taylor Swift via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>>
>>
>>
>> On Thu, Jul 13, 2017 at 6:56 PM, Andrew Trick <atrick at apple.com <mailto:atrick at apple.com>> wrote:
>>
>>> On Jul 12, 2017, at 12:16 PM, Taylor Swift via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>>>
>>> Hi all, I’ve written up a proposal to modify the unsafe pointer API for greater consistency, safety, and ease of use.
>>>
>>> ~~~
>>>
>>> Swift currently offers two sets of pointer types — singular pointers such as UnsafeMutablePointer, and vector (buffer) pointers such as UnsafeMutableBufferPointer. This implies a natural separation of tasks the two kinds of pointers are meant to do. For example, buffer pointers implement Collection conformance, while singular pointers do not.
>>>
>>> However, some aspects of the pointer design contradict these implied roles. It is possible to allocate an arbitrary number of instances from a type method on a singular pointer, but not from a buffer pointer. The result of such an operation returns a singular pointer, even though a buffer pointer would be more appropriate to capture the information about the number of instances allocated. It’s possible to subscript into a singular pointer, even though they are not real Collections. Some parts of the current design turn UnsafePointers into downright DangerousPointers, leading users to believe that they have allocated or freed memory when in fact, they have not.
>>>
>>> This proposal seeks to iron out these inconsistencies, and offer a more convenient, more sensible, and less bug-prone API for Swift pointers.
>>>
>>> <https://gist.github.com/kelvin13/a9c033193a28b1d4960a89b25fbffb06 <https://gist.github.com/kelvin13/a9c033193a28b1d4960a89b25fbffb06>>
>>>
>>> ~~~
>>>
>>
>> Thanks for taking time to write this up.
>>
>> General comments:
>>
>> UnsafeBufferPointer is an API layer on top of UnsafePointer. The role
>> of UnsafeBufferPointer is direct memory access sans lifetime
>> management with Collection semantics. The role of UnsafePointer is
>> primarily C interop. Those C APIs should be wrapped in Swift APIs that
>> take UnsafeBufferPointer whenever the pointer represents a C array. I
>> suppose making UnsafePointer less convenient would push developers
>> toward UnsafeBufferPointer. I don't think that's worth outright
>> breaking source, but gradual deprecation of convenience methods, like
>> `susbscript` might be acceptable.
>>
>> Gradual deprecation is exactly what I am proposing. As the document states <https://gist.github.com/kelvin13/a9c033193a28b1d4960a89b25fbffb06#proposed-solution>, the only methods which should be marked immediately as unavailable are the `deallocate(capacity:)` methods, for safety and source compatibility reasons. Removing `deallocate(capacity:)` now and forcing a loud compiler error prevents catastrophic *silent* source breakage in the future, or worse, from having to *support our own bug*.
>>
>>
>> I have mixed feelings about stripping UnsafePointer of basic
>> functionality. Besides breaking source, doing that would be
>> inconsistent with its role as a lower API layer. The advantage would
>> just be descreasing API surface area and forcing developers to use a
>> higher-level API.
>>
>> UnsafePointer is as much a high level API as UnsafeBufferPointer is. You wouldn’t create a buffer pointer of length 1 just so you can “stick with the high level API”. UnsafePointer and UnsafeBufferPointer are two tools that do related but different things and they can exist at whatever abstract level you need them at. After all, UnsafeBufferPointer is nothing but an UnsafePointer? with a length value attached to it. If you’re allocating more than one instance of memory, you almost certainly need to track the length of the buffer anyway.
>>
>
> I disagree. UnsafePointer can be viewed as the low level representation of Swift’s memory model. UnsafeBufferPointer is a convenient facility for programming at this level, as illustrated by e.g. RandomAccessCollection conformance. One could drop UnsafeBufferPointer from the standard library without compromising Swift’s core capabilities, it would just be a large convenience regression.
>
> UnsafePointer is a low level necessity; UnsafeBufferPointer is a low level convenience.
>
>> The additive changes you propose are fairly obvious. See [SR-3088]
>> UnsafeMutableBufferPointer doesn't have an allocating init.
>>
>> I haven't wanted to waste review cycles on small additive
>> changes. It may make sense to batch them up into one coherent
>> proposal. Here are a few more to consider.
>>
>> - [SR-3929] UnsafeBufferPointer should have init from mutable
>> - [SR-4340] UnsafeBufferPointer needs a withMemoryRebound method
>> - [SR-3087] No way to arbitrarily initialise an Array's storage
>>
>> The feature requests you mention are all very valuable, however with Michael’s point about fixing the memorystate API’s, the size of this proposal has already grown to encompass dozens of methods in five types. I think this says a lot about just how broken the current system is
>
> I fail to see how the current system is broken. It accurately reflects Swift’s memory model while providing the facilities to stay in the bounds of defined behavior. I do agree that it would be far more convenient for UnsafeBufferPointer to provide allocation/deallocation and initialization/deinitialization convenience functionality as well, as I’ve often wanted it myself.
>
>> , but I think it’s better to try to fix one class of problems at a time, and save the less closely-related issues for separate proposals.
>
> These seem very closely related. Isn’t the overall goal of this proposal to have API parity for UnsafeBufferPointer?
>
>>
>>
>> Point by point:
>>
>> > drop the capacity parameter from UnsafeMutablePointer.allocate() and deallocate().
>>
>> I do not agree with removing the capacity parameter and adding a
>> single-instance allocation API. UnsafePointer was not designed for
>> single instances, it was primarily designed for C-style arrays. I
>> don't see the value in providing a different unsafe API for single
>> vs. multiple values.
>>
>> Although it’s common to *receive* Unsafe__Pointers from C API’s, it’s rare to *create* them from the Swift side. 95% of the time your Swift data lives in a Swift Array, and you use withUnsafePointer(_:) to send them to the C API, or just pass them directly with Array bridging.
>>
>> The only example I can think of where I had to allocate memory from the Swift side to pass to a C API is when I was using the Cairo C library and I wanted the Swift code to own the image buffer backing the Cairo C structs and I wanted to manage the memory manually to prevent the buffer backing from getting deallocated prematurely. I think I ended up using UnsafeMutableBufferPointer and extracting baseAddresses to manage the memory. This proposal tries to mitigate that pain of extracting baseAddresses by giving buffer pointers their own memory management methods.
>>
>> As for the UnsafePointers you get from C APIs, they almost always come with a size (or you specify it beforehand with a parameter) so you’re probably going to be turning them into UnsafeBufferPointers anyway.
>>
>> I also have to say it’s not common to deallocate something in Swift that you didn’t previously allocate in Swift.
>>
>
> I’m not sure it is even defined to use `deallocate` on memory that wasn’t `allocated` in Swift. Andy, thoughts here? Perhaps more clarity in deallocate’s documentation is needed?
>
>>
>> I agree the primary allocation API should be
>> UnsafeMutableBufferPointer.allocate(capacity:). There is an argument
>> to be made for removing UnsafeMutablePointer.allocate(capacity:)
>> entirely. But, as Michael Ilseman pointed out, that would involve
>> reevaluating several other members of the UnsafePointer API. I think
>> it's reasonable for UnsafePointer to retain all its functionality as a
>> lower level API.
>>
>>
>> I think duplication of functionality is something to be avoided if possible.
>>
>
> You have to weigh duplication against convenience. If avoiding duplication is more important, then we should keep the APIs only on UnsafePointer. But, I would prefer convenience in this scenario.
>
>> I don't understand what is misleading about
>> UnsafePointer.deallocate(capacity:). It *is* inconvenienent for the
>> user to keep track of memory capacity. Presumably that was done so
>> either the implementation can move away from malloc/free or some sort
>> of memory tracking can be implemented on the standard library
>> side. Obviously, UnsafeBufferPointer.deallocate() would be cleaner in
>> most cases.
>>
>> It’s misleading because it plain doesn’t deallocate `capacity` instances. It deletes the whole memory block regardless of what you pass in the capacity argument. If the implementation is ever “fixed” so that it actually deallocates `capacity` instances, suddenly every source that uses `deallocate(capacity:)` will break, and *no one will know* until their app starts mysteriously crashing. If the method is not removed, we will have to support this behavior to avoid breaking sources, and basically say “yes the argument label says it deallocates a capacity, but what it *really* does is free the whole block and we can’t fix it because existing code assumes this behavior”.
>>
>
> Note that many of these things are basically programming at the memory semantics level, not necessarily at a execution level.
>
>>
>> > add an allocate(count:) type method to UnsafeMutableBufferPointer
>>
>> `capacity` should be used for allocating uninitialized memory not
>> `count`. `count` should only refer to a number of initialized objects!
>>
>> We can decide on what the correct term should be, but the current state of Swift pointers is that *neither* convention is being followed. Just look at the API for UnsafeMutableRawPointer. It’s a mess. This proposal at the minimum establishes a consistent convention. It can be revised if you feel `capacity` is more appropriate than `count`. If what you mean is that it’s important to maintain the distinction between “initialized counts” and “uninitialized counts”, well that can be revised in too.
>>
>
> So I just looked over all of the APIs, and they are very consistent in their use of `count` and `capacity`. Where are you seeing a violation?
>
>> > add a deallocate() instance method to UnsafeMutableBufferPointer
>>
>> Yes, of course! I added a mention of that in SR-3088.
>>
>> > remove subscripts from UnsafePointer and UnsafeMutablePointer
>>
>> It's often more clear to perform arithmetic on C array indices rather
>> than pointers. That said, I'm happy to push developers to use
>> UnsafeBufferPointer whenever that have a known capacity. To me, this
>> is a question of whether the benefit of making a dangerous thing less
>> convenient is worth breaking source compatibility.
>>
>> Again, I think this is more about what the real use patterns are. If you are subscripting into a C array with integers, then UnsafeBufferPointer is the tool for the job, since it give you Collection conformance. If you can’t make an UnsafeBufferPointer, it’s probably because you don’t know the length of the array, and so you’re probably iterating through it one element at a time. UnsafeMutablePointer.successor() is perfect for this job. If you want to extract or set fields at fixed but irregular offsets, UnsafeRawPointer is the tool for the job. But I’m hard-pressed to think of a use case for random access into a singular typed pointer.
>>
>
> I’m also very interested in more convenience facilities for low level programming and I view UnsafeBufferPointer as the first low level convenience construct. The point of the very careful dance that the Unsafe*Pointers go through is to help the programmer avoid unwittingly falling into undefined behavior.
>
>
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org <mailto:swift-evolution at swift.org>
>> https://lists.swift.org/mailman/listinfo/swift-evolution <https://lists.swift.org/mailman/listinfo/swift-evolution>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170717/ea552ebd/attachment.html>
More information about the swift-evolution
mailing list