[swift-evolution] RFC: Proposed rewrite of Unmanaged<T>

Janosch Hildebrand jnosh at jnosh.com
Mon Feb 22 17:00:50 CST 2016

> On 22 Feb 2016, at 16:45, Dave Abrahams <dabrahams at apple.com> wrote:
>> on Mon Feb 22 2016, Janosch Hildebrand <jnosh-AT-jnosh.com> wrote:
>> But yeah, that would work as well and might be a nicer solution overall.
>> (And you could easily create your own type from these + unowned(unsafe))
>> The downside I see is that being free functions and working with
>> AnyObject makes them much more discoverable than being hidden
>> inside some other type. 
> How discoverable *are* free functions, really, after all?  If you want
> we could nest them:
> enum UnsafeManualReferenceCounting {
>   static func retain(_ object: AnyObject) {...}
>   static func release(_ object: AnyObject) {...}
> }

I try to have a general overview of the standard library in terms of what is
available and how it is structured. And the stdlib is compact enough that
that is feasible.

Now a type naturally scopes and encloses functionality so I don't (need to)
have as precise a mental overview over the type's members & methods.
Knowing the type or how to find it also means that 'lookups' via auto-
completion or documentation will produce a relatively narrow set of results
for me to process.

Finding a free function that way is much harder. The enclosing scope is
much larger so autocompletion for example becomes pretty much useless
unless you already know what you are looking for. This is even worse if you
have Darwin imported (unless you do 'Swift.').

So I want a decent overview over what free functions are available because
I probably should know about them before I need them. Being free functions
also signals that they are potentially special or different in some way so that
also makes them of interest.

Now this is mostly a sub- or semi-conscious affair and I have no idea how
other people approach this...

That said, if this hasn't been an issue with the other unsafe* functions then I
doubt it would be one in this case.
And nesting them doesn't feel right without any other precedent and
opens up the discussion about nesting other free functions which is
probably not something we want to do right now.
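
For concreteness, the two shapes under discussion could look roughly like
this. A minimal sketch only: unsafeRetain() and unsafeRelease() are the
hypothetical names used in this thread, not existing stdlib functions, and
building them on top of today's Unmanaged is my assumption.

```swift
// Hypothetical free functions (names taken from this thread, not the stdlib).
// Assumption: they are implemented on top of the existing Unmanaged type.
func unsafeRetain(_ object: AnyObject) {
    // Adds one unbalanced retain that ARC does not know about.
    _ = Unmanaged.passUnretained(object).retain()
}

func unsafeRelease(_ object: AnyObject) {
    // Balances one earlier unsafeRetain().
    Unmanaged.passUnretained(object).release()
}

// The nested alternative: a caseless enum used purely as a namespace.
enum UnsafeManualReferenceCounting {
    static func retain(_ object: AnyObject) { unsafeRetain(object) }
    static func release(_ object: AnyObject) { unsafeRelease(object) }
}
```

Either way the call site has to pair every retain with exactly one release;
the only difference is where autocompletion and documentation surface them.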

>> But given the other unsafe* free functions that already exist that
>> might be fine.
> Yeah, IMO that would be reasonable.
>> Also having a predefined wrapper type has some use, e.g. when you
>> want to store your unowned(unsafe) objects into some collection.
> Yeah, though it seems to me that wrapper should be UnsafeReference<T>.

Yes, it would certainly be the simplest solution.

IIRC, my biggest concern with that was that this continues to combine
"transfer unannotated objects" and "manual reference counting" into a
single type. Which might make it harder for someone new to the topic
that just needs to use some object that comes from an unannotated API.

But again that is probably not a huge issue in practice and we have the
same issue with Unmanaged today. Also UnsafeReference's improved
method names & documentation should hopefully mitigate this as well.

A dedicated type or free functions have other tradeoffs so ultimately I'm
pretty much open to any of these given that, as you mentioned, MRC is
probably going to be retained (no pun intended) in some form...

>> But I'd guess you'll end up with a dedicated wrapper type anyway in
>> most circumstances where this is an issue so it's probably OK.  And
>> this gets rid of having a separate type that also plays the role of
>> unowned(unsafe) which seems like a plus.
>> I have some more answers below but I'll summarize my opinion here.
>> Preferences (in descending order):
>> 1) unsafeRetain() + unsafeRelease() + unowned(unsafe)
>> 2) "UnsafeReferenceCountedPointer"
>> 3) No dedicated functionality so just abuse UnsafeReference instead of
>>   Unmanaged
>> I think ultimately it's a question of whether we want to expose the
>> reference counting implementation to manual use...
>> Is it something I can live without? Absolutely.
>> Is it something I would use often? Absolutely not.
>> But then again, it is going to be exposed through UnsafeReference anyway.
>> And having access to a solid manual reference counting solution
>> that integrates very well with the rest of the language is kinda neat in
>> my opinion. And the integration with ARC is something an external
>> library cannot provide without becoming even more "hacky".
>> And I don't think it's any more dangerous than any other manual
>> memory management we have access to. 
> Okay.  My current feeling about all this is that UnsafeReference should
> have retain and release directly on it, for MRC.
> The thing I'm really unclear about at this point is whether it's
> reasonable to ask users to use “release()” for the maximally-safe usage
> pattern, or if that's just too weird.  My sense all along has been that
> asking them to get used to it is a much better design choice than
> expanding the API, especially when we don't really have a better name for
> it than “release.”  If a better name presented itself, that might change
> the picture.

I've taken issue with it before but I've reread the thread and I still don't have
any new ideas.

I think 'transferByReleasing()' & 'transfer()' as suggested by Nevin might be
my favorite of the proposed alternatives. Other verbs than 'transfer' might
work too.
Whether it's better than release() and .object... I don't know.

At any rate, at least retain() would fit somewhat nicely in the state diagram
described by the "ownership states" section so retain() could be documented
as going from Unretained->Retained.
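
As a rough illustration of those transitions, here is a minimal sketch with
today's Unmanaged standing in for the proposed UnsafeReference. Treating
takeRetainedValue() as the counterpart of the proposed release() is my
reading of the draft and should be taken as an assumption; Resource and
demoOwnershipStates() are made-up names.

```swift
// Placeholder class, only here so there is something to reference-count.
final class Resource {}

// Walks one reference through Unretained -> Retained -> back under ARC.
// Returns true if the object that comes out is the one that went in.
func demoOwnershipStates() -> Bool {
    let resource = Resource()

    // Unretained: wraps the reference without touching the reference count.
    let unretained = Unmanaged.passUnretained(resource)

    // Unretained -> Retained: we now hold one +1 that ARC does not track.
    let retained = unretained.retain()

    // Retained -> managed: the +1 is consumed and ownership goes back to ARC.
    let managed: Resource = retained.takeRetainedValue()

    return managed === resource
}
```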

But I don't know what the plan would be regarding documentation if MRC
behaviour is done through UnsafeReference. Trying to document it would
probably really detract from and add confusion to the simple model that
is currently laid out.

[Quick aside: I was looking at the draft for UnsafeReference as quoted in
Proposal 6 and there is a period missing after
"No API should pass or return a *released* UnsafeReference`"
and a double-space right before the sentence (and in a few other places).]

>> I also wonder if it has some minor use for learning/teaching.
>> Yes, ARC is great because most of the time you don't need to think
>> about this but if you're trying to understand reference counting it's
>> kinda nice to be able to actually interact and play around with it.
>> Then again I'm the kind of person that likes doing that but YMMV...
> As long as you don't need to share anything across threads, it's easy
> enough to build MRC using malloc, free, and a counter in each allocated
> block, so this doesn't sound like a strong argument to me.

Right and it's certainly no justification for the feature by itself.
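
For reference, the do-it-yourself version is indeed small. A minimal,
single-threaded sketch, using UnsafeMutablePointer's allocate/deallocate in
place of raw malloc/free; CountedBlock, makeBlock, retainBlock and
releaseBlock are names I made up for illustration.

```swift
// The reference count lives in the same allocation as the payload.
// Deliberately not thread-safe.
struct CountedBlock {
    var refCount: Int
    var payload: Int
}

typealias BlockRef = UnsafeMutablePointer<CountedBlock>

// Allocates a block that starts out with a single owner.
func makeBlock(payload: Int) -> BlockRef {
    let block = BlockRef.allocate(capacity: 1)
    block.initialize(to: CountedBlock(refCount: 1, payload: payload))
    return block
}

func retainBlock(_ block: BlockRef) {
    block.pointee.refCount += 1
}

// Returns true when this call dropped the last reference and freed the block.
@discardableResult
func releaseBlock(_ block: BlockRef) -> Bool {
    block.pointee.refCount -= 1
    guard block.pointee.refCount == 0 else { return false }
    block.deinitialize(count: 1)
    block.deallocate()
    return true
}
```

What this cannot give you is any integration with ARC-managed objects, which
is the part an external library would struggle to provide.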

I think it could be nice as a real world example / demonstration tool though.
Along the lines of:
"So you've been learning about ref counting now and you've also been using
Swift. Well guess what, Swift does that and you can even watch it do its thing".

Demystifying the "magic" in complex systems with simple building blocks is
something I love about all of Science and Engineering.

>>>>>> As Joe mentioned, `Unmanaged` has a use for manual ref counting
>>>>>> beyond immediate transfer from un-annotated APIs.
>>>>>> I have used it for performance reasons myself (~ twice) and while I
>>>>>> think it's a pretty small use case there isn't really any
>>>>>> alternative.
>>>>>> If it would help I can also describe my use-cases in more detail.
>>>>> Yes please!
>>>> One place I used Unmanaged is in a small project where I experiment
>>>> with binary heaps in Swift. I've put the project on Github
>>>> (https://github.com/Jnosh/SwiftBinaryHeapExperiments) but basically
>>>> I'm using `Unmanaged` in two places here:
>>>> 1) Testing the 'overhead' of (A)RC.
>>>> Basically comparing the performance of using ARC-managed objects in
>>>> the heaps vs. using 'unmanaged' objects. In Swift 1.2 the difference
>>>> was still ~2x but with Swift 2+ it's likely approaching the cost of
>>>> the retain/release when entering and exiting the collection.
>>>> Now this could also be accomplished using `unowned(unsafe)` but
>>>> `Unmanaged` has some minor advantages:
>>>> 	a) I can keep the objects alive without keeping them in a
>>>> separate collection. Not a big issue here since I'm doing that anyway
>>>> but I also find that `Unmanaged` makes it clearer that & how the
>>>> objects are (partly) manually managed.
>>>> 	b) I had previously experimented with using `unowned(unsafe)`
>>>> for this purpose but found that `Unmanaged` performed better. However,
>>>> that was in a more complex example and in the Swift 1.2 era. A quick
>>>> test indicates that in this case and with Swift 2.1 `unowned(unsafe)`
>>>> and `Unmanaged` perform about equally.
>>> They should.  unowned(unsafe) var T is essentially just an
>>> UnsafePointer.  unowned/unowned(safe) do incur reference-counting cost
>>> in exchange for their safety.
>> I'll come back to this further down.
>>>> 2) A (object only) binary heap that uses `Unmanaged` internally
>>>> Not much practical use either in this case since the compiler seems to
>>>> do quite well by itself but still a somewhat interesting exercise.
>>>> `Unmanaged` is pretty much required here to make CoW work by manually
>>>> retaining the objects.
>>> It's hard for me to imagine why that would be the case.  Would I have
>>> needed to use Unmanaged in implementing Arrays of objects, if it were?
>> Sorry, I wasn't clear enough. I (ab)use Unmanaged for two different reasons here.
>> 1) To have a performance baseline where the ARC overhead inside the collection
>> is essentially zero beyond the mandatory retain on insert, i.e. as if the compiler was
>> able to eliminate all (redundant) retains and releases.
>> One part of this is exempting the objects from ARC which is done by storing the
>> elements in Unmanaged instances but a wrapper type using unowned(unsafe)
>> would work just as well.
>> However, I still need a strong reference to the objects to keep them alive. Using a
>> separate data structure would work but that has a space, time and code complexity
>> cost.
>> Instead I use Unmanaged to manually retain the objects on insert and release on
>> removal. unowned cannot do that on its own hence the need for something like
>> unsafeRetain() & unsafeRelease().
> OK.
>> 2) I then abuse Unmanaged's capabilities a second time to retain the elements
>> when the collection is copied (which would happen 'automatically' with
>> ARC).
>> Btw, with Swift 2.2 under WMO the performance of a normal ManagedBuffer
>> is on par with this "hack". Go Swift team!
> :-)
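
The retain-on-insert, release-on-removal idea from 1) can be sketched as
follows. This is a simplified illustration, not the code from the linked
project: ManuallyRetainedStack is a made-up name, and it is a class so that
copies share storage. A value-type version would also have to retain every
element whenever the storage is copied, which is point 2).

```swift
// Hypothetical container whose elements are exempt from ARC while stored.
final class ManuallyRetainedStack<Element: AnyObject> {
    private var storage: [Unmanaged<Element>] = []

    var count: Int { return storage.count }

    func push(_ element: Element) {
        // passRetained adds the +1 that keeps the element alive in storage.
        storage.append(Unmanaged.passRetained(element))
    }

    func pop() -> Element? {
        // takeRetainedValue hands the stored +1 back to ARC.
        return storage.popLast()?.takeRetainedValue()
    }

    deinit {
        // Balance the retains of anything that was never popped.
        for reference in storage {
            reference.release()
        }
    }
}
```
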
>>>> The other project was a simple 2D sprite engine (think a simplified
>>>> version of SpriteKit) I experimented with about a year ago.
>>>> Textures and Shaders were abstracted as value types privately backed
>>>> by reference types that managed the underlying OpenGL objects,
>>>> i.e. destroy the OpenGL texture object on deinit, etc...
>>>> I found this to be quite nice to use but ARC overhead during batching
>>>> & rendering amounted to something like 20-30% of CPU time IIRC. (This
>>>> was under Swift 1.2 and with WMO). Using `Unmanaged` was one of the
>>>> things I played around with to get around this and it worked very
>>>> well.
>>> Another case where you can use unowned(unsafe), is it not?
>> Indeed, and that was what I originally tried to use.
>> Ultimately I settled on Unmanaged, however. Now it's been a long time and
>> I don't recall the exact details so take this with a grain of salt: 
>> One reason certainly was that I ended up needing Unmanaged anyway
>> to perform manual retain & releases at which point why not also use it
>> for 'storage'...
>> But I also vaguely recall that Unmanaged gave the larger performance improvement.
>> Now one possibility is that there was some issue with unowned(unsafe) (this
>> was with 1.2.β1) but much more likely is that Unmanaged was easier to apply
>> consistently and correctly.
>> e.g. assume you have some struct that contains an unowned(unsafe) variable.
>> Now if you extract that into a local variable you add a perhaps unwanted
>> retain/release so you might need to mark the local variable as unowned(unsafe)
>> as well, etc...
> Right; makes sense.
>> What I'm trying to say is that with unowned you need to be careful and
>> considerate with how you use it at all times since the 'obvious' thing
>> generally leads to retain/release.  
> That's safer, though, but in this case you're more concerned about performance.

Yep. In my experience Unmanaged also worked surprisingly well.
You can (and will) of course make all the obvious mistakes but at least it forces
you to think about what exactly you want whenever you access it.

And whenever you screw up, all the potential failure points are relatively easy to
find since you can mostly just ignore all the "normal" objects.
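
To make the unowned(unsafe) point quoted above concrete, here is a minimal
sketch. Texture, DrawCommand and textureName(of:) are hypothetical names, and
whether the compiler actually emits a retain/release for the plain local
depends on the optimizer, so take the comments as describing the semantics
rather than guaranteed codegen.

```swift
// Hypothetical resource type standing in for a texture wrapper.
final class Texture {
    let name: UInt32
    init(name: UInt32) { self.name = name }
}

struct DrawCommand {
    // Stored without ownership: storing or copying the struct does not
    // retain the texture. The caller must keep the texture alive.
    unowned(unsafe) var texture: Texture
}

func textureName(of command: DrawCommand) -> UInt32 {
    // A plain `let texture = command.texture` would be a strong reference
    // again, so the local is marked unowned(unsafe) as well.
    unowned(unsafe) let texture = command.texture
    return texture.name
}
```

With Unmanaged the equivalent access would be an explicit
takeUnretainedValue() call, which is what makes those spots easy to find.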

>> Unmanaged is fine to pass around, store in a local variable,
>> etc... and any ARC related interactions are obvious because they
>> manifest as method calls on the Unmanaged instance.
>> For their main application, breaking retain cycles, weak and unowned work fine
>> because you want to retain the objects when they are not 'at rest'.
>> But if you want to avoid retains even when working with the objects, a type is
>> just a much more comfortable way to handle this.
>> Like I mentioned before, I imagine that in many cases you'll end up making a
>> custom wrapper anyway but it's something I'm a bit apprehensive about.
>> Still, I think unowned(unsafe) together with unsafeRetain() and unsafeRelease()
>> free functions makes for a nicer API and I don't think I can adequately judge it
>> beyond that. I've barely used this in its current form, have no real experience
>> with a potential future form and I hopefully won't use it (often) anyway.
>> So I think it's more than appropriate to prioritize the general API over making
>> this esoteric use case more comfortable to use.
> I think you've already talked me into the idea that we need a type for
> transporting manually reference-counted references.
>>>> The `Unmanaged` instances were created when draw commands are
>>>> submitted to the renderer so they were only used inside the rendering
>>>> pipeline.
>>>> I eventually switched to using the OpenGL names (i.e. UInts) directly
>>>> inside the renderer since they are already available anyway but that
>>>> also requires extra logic to ensure the resources are not destroyed
>>>> prematurely (e.g. retaining the object until the end of the frame or
>>>> delaying the cleanup of the OpenGL resources until the end of the
>>>> frame, ...). In many ways it's quite a bit messier than just using
>>>> `Unmanaged`.
>>> I don't see how Unmanaged could have been less messy; don't you still
>>> need a strong reference somewhere to ensure the lifetime?
>> Absolutely. You can retain the object directly through Unmanaged
>> (via passRetained() or retain()) and make the Unmanaged instance
>> a strong reference in effect.
> Right.  Thanks for refreshing my ability to think about these issues
> again :-)

:-) I should know better by now but I'm still often surprised how something
that felt relatively simple and sensible when writing the code turns out to be
very hard to put into words and properly justify.

You had me going back to some code more than once because I was
wondering myself why on earth I couldn't have just done X.

>> Not a big difference to retaining by putting the objects in some container.
>> Just a different set of tradeoffs.
>> I don't think it's the best solution for this case but it's pretty simple. Retain
>> when creating the draw command, release when discarding the draw
>> command - nothing different than malloc/free.
>> Collecting the objects in some collection is likely to be a cleaner solution
>> and more efficient too, since you don't retain objects multiple times if they are
>> used multiple times in the same frame (which is likely for shaders, textures).
>> But then someone somewhere needs to manage this, and you need to
>> access that state when creating or submitting the draw command.
>> Or perhaps make sure the referenced resources stay valid until the frame
>> is drawn so you don't need to retain here at all but now you need to track
>> all the scene contents, etc...
>> Hopefully I don't come across as too petulant. :-)
> Not a bit.
>> I don't really want to argue in favor of or defend these approaches.
>> I'm merely trying to give some examples of when and what for I actually
>> used this stuff, not to prove the merits of these cases but instead to argue
>> for the existence of better justified uses based on the same ideas.
>> Not sure if that makes any sense but there you go :-)
> Thanks, it's been very helpful.
>>>> I don't think these are particularly great examples and I could
>>>> certainly live without 'native' MRC but ultimately I think it's an
>>>> interesting capability so I'd like to keep it around. 
>>>> Although I'd be in favor of keeping it out of the stdlib but I don't
>>>> think that's really an option just yet...
>>>> It would also be interesting to be able to do the same with indirect
>>>> enum instances and closures but it's not like I have a particular use
>>>> case for that ;-)
>>> I don't understand what you might be hinting at here.
>> Just that AFAIK closures and indirect enum instances also use ARCed
>> references under the hood. So in theory they could potentially also be
>> stored unowned and manually retained/released.
>> I just find it slightly interesting that with (Any)Objects certain things are
>> exposed (unsafeAddressOf, retain/release, ...) whereas with indirect enums
>> and closures they are not.
>> I don't want to imply that that would be a good idea and it would certainly
>> be hard, complicated, and annoying to implement with essentially no
>> benefit so I don't want to go anywhere with this other than the partial similarity.
>> Basically it's just my brain going:
>> "Oh look, some pyramids. Hmm, you could store these much more efficiently
>> if you stacked them up against each other" ;-)
> Say, you must be some kind of engineer or something! ;-)
> -- 
> -Dave

- Janosch
