[swift-evolution] RFC: Proposed rewrite of Unmanaged<T>

Mon Feb 22 09:45:01 CST 2016

on Mon Feb 22 2016, Janosch Hildebrand <jnosh-AT-jnosh.com> wrote:

>>> And speaking of a separate type for MRC, how about `ManagedReference`
>>> as a name? Seems much better than `Unmanaged`, nicely contrasts with
>>> `UnsafeReference` and `ManuallyManagedReference` is a bit of a
>>> mouthful...
>> 
>> I think we want “managed” to mean “managed for you,” not “managed by
>> you.”  It's also quite unsafe because you can overrelease it, etc., so
>> it would have to have “unsafe” in the name somewhere I think.
>
> That makes sense. Something like `UnsafeReferenceCountedPointer` but
> shorter?

Something like that.  I'm not attached to it being shorter.

>>>>> I don't think this use case even needs to be described in the
>>>>> documentation for `UnsafeReference` and it's fine if its use is
>>>>> very much discouraged.
>>>>> 
>>>>> Personally I prefer the proposed
>>>>> `manuallyRetain()`/`manuallyRelease()` over plain
>>>>> `retain()`/`release()` as it clearly separates the returning and
>>>>> more generally applicable `release()` from the MRC
>>>>> methods. `retain()` would probably also have to return the object
>>>>> which would interfere with the max safe usage pattern.
>>>> 
>>>> I don't understand your last sentence; care to clarify?
>>> 
>>> My main reason for preferring `manuallyRetain()`/`manuallyRelease()`
>>> over `retain()`/`release()` would be that the former would *not*
>>> return the object, thus more cleanly separating them from the current
>>> `release()` which returns the object to be used from now on, with the
>>> `UnsafeReference` to be discarded at that point.
>>> 
>>> I just think it might be more confusing to also use `release()` for
>>> MRC and also introducing `retain()` would only exacerbate the
>>> issue. For symmetry reasons `retain()` would likely also return the
>>> object. 
>> 
>> There might be other reasons to do it, but I don't think symmetry is
>> necessarily a design goal here.
>> 
>>> That would make it very similar to `release()` and `.object` which it
>>> really shouldn't be as it shouldn't ever be used for handling objects
>>> from unannotated CF APIs.
>>> 
>>> I think having a third method/property with a very similar signature
>>> would likely cause confusion regarding the "Maximally Safe Usage" pattern
>>> you described.
>>> 
>>> But as mentioned above I would actually prefer having two separate
>>> types which would also make this a non-issue.
>> 
>> Questions:
>> 
>> 1. How would these types interact?  Does one need to be able to convert
>>   between them liberally, or is it sufficient to use strong references
>>   as the common currency?
>
> If we were to have two separate types I think it would be more than fine to
> use strong references as a go-between. The use cases for the two types are
> so different that I doubt it would be an issue.
>
>> 2. Do you really want a type at all?  Why not just retain() and
>>   release() as free functions?
>
> I assume these would be unsafeRetain() and unsafeRelease() ;-)

Yes of course.

> But yeah, that would work as well and might be a nicer solution overall.
> (And you could easily create your own type from these + unowned(unsafe))
>
> The downside I see is that being free functions and working with
> AnyObject makes them much more discoverable than being hidden
> inside some other type. 

How discoverable *are* free functions, really, after all?  If you want
we could nest them:

enum UnsafeManualReferenceCounting {
  // Caseless enum used purely as a namespace.
  static func retain(_ object: AnyObject) {...}
  static func release(_ object: AnyObject) {...}
}

> But given the other unsafe* free functions that already exist, that
> might be fine.

Yeah, IMO that would be reasonable.
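
For illustration, here is a minimal sketch of how the nested functions above
might be filled in on top of the existing Unmanaged API. The bodies are an
assumption and not something proposed in this thread; the Token class and
the tokenDeinitCount counter are hypothetical helpers that only exist to
observe deallocation.

```swift
// Sketch only: the bodies are an assumption built on the existing Unmanaged API.
enum UnsafeManualReferenceCounting {
  /// Adds one strong reference that ARC will not balance for you.
  static func retain(_ object: AnyObject) {
    _ = Unmanaged.passRetained(object)
  }

  /// Removes one strong reference; the caller must own a matching retain.
  static func release(_ object: AnyObject) {
    Unmanaged.passUnretained(object).release()
  }
}

// Hypothetical helper class that records its own deallocation.
var tokenDeinitCount = 0
final class Token {
  deinit { tokenDeinitCount += 1 }
}

var token: Token? = Token()
weak var observer = token                       // does not keep the object alive
UnsafeManualReferenceCounting.retain(token!)    // manual +1
token = nil                                     // drop the only ARC-managed strong reference
let aliveWhileManuallyRetained = (observer != nil)
UnsafeManualReferenceCounting.release(observer!) // balance the manual +1
```

Forgetting the final release leaks the object and releasing twice
over-releases it, which is why "unsafe" belongs in the name.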

> Also having a predefined wrapper type has some use, e.g. when you
> want to store your unowned(unsafe) objects into some collection.

Yeah, though it seems to me that wrapper should be UnsafeReference<T>.

> But I'd guess you'll end up with a dedicated wrapper type anyway in
> most circumstances where this is an issue so it's probably OK.  And
> this gets rid of having a separate type that also plays the role of
> unowned(unsafe) which seems like a plus.
>
> I have some more answers below but I'll summarize my opinion here.
> Preferences (in descending order):
>
> 1) unsafeRetain() + unsafeRelease() + unowned(unsafe)
> 2) "UnsafeReferenceCountedPointer"
> 3) No dedicated functionality so just abuse UnsafeReference instead of
>     Unmanaged
>
> I think ultimately it's a question of whether we want to expose the
> reference counting implementation to manual use...
>
> Is it something I can live without? Absolutely.
> Is it something I would use often? Absolutely not.
> But then again, it is going to be exposed through UnsafeReference anyway.
>
> And having access to a solid manual reference counting solution
> that integrates very well with the rest of the language is kinda neat in
> my opinion. And the integration with ARC is something an external
> library cannot provide without becoming even more "hacky".
> And I don't think it's any more dangerous than any other manual
> memory management we have access to. 

Okay.  My current feeling about all this is that UnsafeReference should
have retain and release directly on it, for MRC.

The thing I'm really unclear about at this point is whether it's
reasonable to ask users to use “release()” for the maximally-safe usage
pattern, or if that's just too weird.  My sense all along has been that
asking them to get used to it is a much better design choice than
expanding the API, especially when we don't really have a better name for
it than “release.”  If a better name presented itself, that might change
the picture.

> I also wonder if it has some minor use for learning/teaching.
> Yes, ARC is great because most of the time you don't need to think
> about this but if you're trying to understand reference counting it's
> kinda nice to be able to actually interact and play around with it.
> Then again I'm the kind of person that likes doing that but YMMV...

As long as you don't need to share anything across threads, it's easy
enough to build MRC using malloc, free, and a counter in each allocated
block, so this doesn't sound like a strong argument to me.
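
A minimal single-threaded sketch of that idea, written in Swift for
consistency with the rest of the thread. It uses
UnsafeMutablePointer.allocate and deallocate as stand-ins for malloc and
free; CountedBlock, makeBlock, retainBlock and releaseBlock are hypothetical
names chosen for this example.

```swift
// A manually reference-counted block: the counter lives next to the payload.
struct CountedBlock {
  var refCount: Int
  var value: Int
}

typealias BlockRef = UnsafeMutablePointer<CountedBlock>

/// Allocates a block with an initial count of 1 (the creator owns it).
func makeBlock(value: Int) -> BlockRef {
  let block = BlockRef.allocate(capacity: 1)   // stands in for malloc
  block.initialize(to: CountedBlock(refCount: 1, value: value))
  return block
}

/// Increments the count. Not atomic, so not safe to share across threads.
func retainBlock(_ block: BlockRef) {
  block.pointee.refCount += 1
}

/// Decrements the count and frees the block when it reaches zero.
/// Returns true if this call freed the block.
@discardableResult
func releaseBlock(_ block: BlockRef) -> Bool {
  block.pointee.refCount -= 1
  guard block.pointee.refCount == 0 else { return false }
  block.deinitialize(count: 1)
  block.deallocate()                           // stands in for free
  return true
}

let block = makeBlock(value: 42)
retainBlock(block)                        // count is now 2
let valueWhileAlive = block.pointee.value
let freedEarly = releaseBlock(block)      // count is now 1
let freedLast = releaseBlock(block)       // count is now 0, block is freed
```

The block must not be touched after the last release. Making this safe
across threads would need an atomic counter, which is the part the runtime
provides for real objects.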

>>>>> As Joe mentioned, `Unmanaged` has a use for manual ref counting
>>>>> beyond immediate transfer from un-annotated APIs.
>>>>> 
>>>>> I have used it for performance reasons myself (~ twice) and while I
>>>>> think it's a pretty small use case there isn't really any
>>>>> alternative.
>>>>> If it would help I can also describe my use-cases in more detail.
>>>> 
>>>> Yes please!
>>> 
>>> One place I used Unmanaged is in a small project where I experiment
>>> with binary heaps in Swift. I've put the project on Github
>>> (https://github.com/Jnosh/SwiftBinaryHeapExperiments) but basically
>>> I'm using `Unmanaged` in two places here:
>>> 
>>> 1) Testing the 'overhead' of (A)RC.
>>> Basically comparing the performance of using ARC-managed objects in
>>> the heaps vs. using 'unmanaged' objects. In Swift 1.2 the difference
>>> was still ~2x but with Swift 2+ it's likely approaching the cost of
>>> the retain/release when entering and exiting the collection.
>>> 
>>> Now this could also be accomplished using `unowned(unsafe)` but
>>> `Unmanaged` has some minor advantages:
>>> 	a) I can keep the objects alive without keeping them in a
>>> separate collection. Not a big issue here since I'm doing that anyway
>>> but I also find that `Unmanaged` makes it clearer that & how the
>>> objects are (partly) manually managed.
>>> 	b) I had previously experimented with using `unowned(unsafe)`
>>> for this purpose but found that `Unmanaged` performed better. However,
>>> that was in a more complex example and in the Swift 1.2 era. A quick
>>> test indicates that in this case and with Swift 2.1 `unowned(unsafe)`
>>> and `Unmanaged` perform about equally.
>> 
>> They should.  unowned(unsafe) var T is essentially just an
>> UnsafePointer.  unowned/unowned(safe) do incur reference-counting cost
>> in exchange for their safety.
>
> I'll come back to this further down.
>
>>> 2) A (object only) binary heap that uses `Unmanaged` internally
>>> Not much practical use either in this case since the compiler seems to
>>> do quite well by itself but still a somewhat interesting exercise.
>>> `Unmanaged` is pretty much required here to make CoW work by manually
>>> retaining the objects.
>> 
>> It's hard for me to imagine why that would be the case.  Would I have
>> needed to use Unmanaged in implementing Arrays of objects, if it were?
>
> Sorry, I wasn't clear enough. I (ab)use Unmanaged for two different reasons here.
>
> 1) To have a performance baseline where the ARC overhead inside the collection
> is essentially zero beyond the mandatory retain on insert, i.e. as if the compiler was
> able to eliminate all (redundant) retains and releases.
>
> One part of this is exempting the objects from ARC which is done by storing the
> elements in Unmanaged instances but a wrapper type using unowned(unsafe)
> would work just as well.
>
> However, I still need a strong reference to the objects to keep them alive. Using a
> separate data structure would work but that has a space, time and code complexity
> cost.
> Instead I use Unmanaged to manually retain the objects on insert and release on
> removal. unowned cannot do that on its own hence the need for something like
> unsafeRetain() & unsafeRelease().

OK.
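
To make the retain-on-insert, release-on-removal pattern concrete, here is a
rough sketch using the current Unmanaged API. It is not the code from the
linked project; ManuallyRetainedStack and Sample are hypothetical names, and
a stack stands in for the binary heap to keep the example short.

```swift
// Sketch: a container that stores its elements outside of ARC and balances
// the reference counts by hand.
struct ManuallyRetainedStack<Element: AnyObject> {
  private var storage: [Unmanaged<Element>] = []

  /// Retains the element manually; from here on the stack owns a +1.
  mutating func push(_ element: Element) {
    storage.append(Unmanaged.passRetained(element))
  }

  /// Hands the manual +1 back to ARC, so no separate release is needed.
  mutating func pop() -> Element? {
    return storage.popLast()?.takeRetainedValue()
  }
}

// Hypothetical element type that records its own deallocation.
var sampleDeinitCount = 0
final class Sample {
  deinit { sampleDeinitCount += 1 }
}

var stack = ManuallyRetainedStack<Sample>()
stack.push(Sample())                          // the temporary goes away, the manual +1 remains
let aliveInStack = (sampleDeinitCount == 0)
_ = stack.pop()                               // ARC takes over the +1 and drops it
let destroyedAfterPop = (sampleDeinitCount == 1)
```

Note what this sketch does not handle: copying the struct duplicates the
references without retaining them, and elements still stored when the stack
goes away are leaked. Both would need extra manual work in a real collection.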

> 2) I then abuse Unmanaged's capabilities a second time to retain the elements
> when the collection is copied (which would happen 'automatically' with
> ARC).
>
> Btw, with Swift 2.2 under WMO the performance of a normal ManagedBuffer
> is on par with this "hack". Go Swift team!

:-)

>>> The other project was a simple 2D sprite engine (think a simplified
>>> version of SpriteKit) I experimented with about a year ago.
>>> Textures and Shaders were abstracted as value types privately backed
>>> by reference types that managed the underlying OpenGL objects,
>>> i.e. destroy the OpenGL texture object on deinit, etc...
>>> 
>>> I found this to be quite nice to use but ARC overhead during batching
>>> & rendering amounted to something like 20-30% of CPU time IIRC. (This
>>> was under Swift 1.2 and with WMO). Using `Unmanaged` was one of the
>>> things I played around with to get around this and it worked very
>>> well.
>> 
>> Another case where you can use unowned(unsafe), is it not?
>
> Indeed, and that was what I originally tried to use.
> Ultimately I settled on Unmanaged however. Now it's been a long time and
> I don't recall the exact details so take this with a grain of salt: 
>
> One reason certainly was that I ended up needing Unmanaged anyway
> to perform manual retain & releases at which point why not also use it
> for 'storage'...
>
> But I also vaguely recall that Unmanaged had more of a performance impact.
> Now one possibility is that there was some issue with unowned(unsafe) (this
> was with 1.2.β1) but much more likely is that Unmanaged was easier to apply
> consistently and correctly.
> e.g. assume you have some struct that contains an unowned(unsafe) variable.
> Now if you extract that into a local variable you add a perhaps unwanted
> retain/release so you might need to mark the local variable as unowned(unsafe)
> as well, etc...

Right; makes sense.
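
A small sketch of the situation described above, assuming a hypothetical
Texture class and Sprite struct (neither comes from the original project).
The stored property is unowned(unsafe), and the local that copies it has to
repeat the annotation, since a plain local would be a strong reference.

```swift
// Hypothetical resource class; `name` mimics an OpenGL object name.
final class Texture {
  let name: UInt32
  init(name: UInt32) { self.name = name }
}

struct Sprite {
  // Stored without retaining; the texture must be kept alive elsewhere.
  unowned(unsafe) let texture: Texture
}

func textureName(of sprite: Sprite) -> UInt32 {
  // `let texture = sprite.texture` would declare a strong local, which may
  // add a retain/release pair. Repeating the annotation avoids that.
  unowned(unsafe) let texture = sprite.texture
  return texture.name
}

let owner = Texture(name: 7)           // the strong reference that ensures the lifetime
let sprite = Sprite(texture: owner)
let name = textureName(of: sprite)
```

Every place the reference travels through needs the same care, which is the
consistency burden being discussed.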

> 
> What I'm trying to say is that with unowned you need to be careful and
> considerate with how you use it at all times since the 'obvious' thing
> generally leads to retain/release.  

That's safer, though, but in this case you're more concerned about performance.

> Unmanaged is fine to pass around, store in a local variable,
> etc... and any ARC related interactions are obvious because they
> manifest as method calls on the Unmanaged instance.
>
> For their main application, breaking retain cycles, weak and unowned work fine
> because you want to retain the objects when they are not 'at rest'.
> But if you want to avoid retains even when working with the objects, a type is
> just a much more comfortable way to handle this.
> Like I mentioned before, I imagine that in many cases you'll end up making a
> custom wrapper anyway but it's something I'm a bit apprehensive about.
>
> Still, I think unowned(unsafe) together with unsafeRetain() and unsafeRelease()
> free functions makes for a nicer API and I don't think I can adequately judge it
> beyond that. I've barely used this in its current form, have no real experience
> with a potential future form and I hopefully won't use it (often) anyway.
> So I think it's more than appropriate to prioritize the general API over making
> this esoteric use case more comfortable to use.

I think you've already talked me into the idea that we need a type for
transporting manually reference-counted references.

>>> The `Unmanaged` instances were created when draw commands are
>>> submitted to the renderer so they were only used inside the rendering
>>> pipeline.
>>> I eventually switched to using the OpenGL names (i.e. UInts) directly
>>> inside the renderer since they are already available anyway but that
>>> also requires extra logic to ensure the resources are not destroyed
>>> prematurely (e.g. retaining the object until the end of the frame or
>>> delaying the cleanup of the OpenGL resources until the end of the
>>> frame, ...). In many ways it's quite a bit messier than just using
>>> `Unmanaged`.
>> 
>> I don't see how Unmanaged could have been less messy; don't you still
>> need a strong reference somewhere to ensure the lifetime?
>
> Absolutely. You can retain the object directly through Unmanaged
> (via passRetained() or retain()) and make the Unmanaged instance
> a strong reference in effect.

Right.  Thanks for refreshing my ability to think about these issues
again :-)
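
As a sketch of the "Unmanaged as a strong reference" idea applied to draw
commands: retain when the command is created, release when it is discarded.
Shader and DrawCommand are hypothetical types for this example and are not
taken from the sprite engine being discussed.

```swift
// Hypothetical resource class that records its own deallocation.
var shaderDeinitCount = 0
final class Shader {
  deinit { shaderDeinitCount += 1 }
}

struct DrawCommand {
  let shader: Unmanaged<Shader>

  /// Retains the shader manually so the command keeps it alive.
  init(shader: Shader) {
    self.shader = Unmanaged.passRetained(shader)
  }

  /// Balances the retain from init. Must be called exactly once per command.
  func discard() {
    shader.release()
  }
}

var shader: Shader? = Shader()
let command = DrawCommand(shader: shader!)
shader = nil                                    // only the command's manual +1 is left
let aliveWhileQueued = (shaderDeinitCount == 0)
_ = command.shader.takeUnretainedValue()        // how the renderer would reach the object
command.discard()                               // end of frame: release
let destroyedAfterDiscard = (shaderDeinitCount == 1)
```

As with malloc and free, nothing enforces the pairing: a command that is
dropped without discard() leaks its shader.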

> Not a big difference to retaining by putting the objects in some container.
> Just a different set of tradeoffs.
>
> I don't think it's the best solution for this case but it's pretty simple. Retain
> when creating the draw command, release when discarding the draw
> command - nothing different than malloc/free.
>
> Collecting the objects in some collection is likely to be a cleaner solution
> and more efficient too, since you don't retain objects multiple times if they are
> used multiple times in the same frame (which is likely for shaders, textures).
> But then someone somewhere needs to manage this, and you need to
> access that state when creating or submitting the draw command.
>
> Or perhaps make sure the referenced resources stay valid until the frame
> is drawn so you don't need to retain here at all but now you need to track
> all the scene contents, etc...
>
> Hopefully I don't come across as too petulant. :-)

Not a bit.

> I don't really want to argue in favor of or defend these approaches.
> I'm merely trying to give some examples of when and what for I actually
> used this stuff not to prove the merits of these cases but instead to argue
> for the existence of better justified uses based on the same ideas.
> Not sure if that makes any sense but there you go :-)

Thanks, it's been very helpful.

>>> I don't think these are particularly great examples and I could
>>> certainly live without 'native' MRC but ultimately I think it's an
>>> interesting capability so I'd like to keep it around. 
>>> Although I'd be in favor of keeping it out of the stdlib but I don't
>>> think that's really an option just yet...
>>> 
>>> It would also be interesting to be able to do the same with indirect
>>> enum instances and closures but it's not like I have a particular use
>>> case for that ;-)
>> 
>> I don't understand what you might be hinting at here.
>
> Just that AFAIK closures and indirect enum instances also use ARCed
> references under the hood. So in theory they could potentially also be
> stored unowned and manually retained/released.
>
> I just find it slightly interesting that with (Any)Objects certain things are
> exposed (unsafeAddressOf, retain/release, ...) whereas with indirect enums
> and closures they are not.
>
> I don't want to imply that that would be a good idea and it would certainly
> be hard, complicated, and annoying to implement with essentially no
> benefit so I don't want to go anywhere with this other than the partial similarity.
>
> Basically it's just my brain going:
> "Oh look, some pyramids. Hmm, you could store these much more efficiently
> if you stacked them up against each other" ;-)

Say, you must be some kind of engineer or something! ;-)

-- 
-Dave