[swift-evolution] [Concurrency] async/await + actors

Fri Aug 18 14:35:11 CDT 2017

> On Aug 17, 2017, at 11:33 PM, Brent Royal-Gordon <brent at architechies.com> wrote:
> 
>> On Aug 17, 2017, at 3:24 PM, Chris Lattner via swift-evolution <swift-evolution at swift.org> wrote:
>> 
>> Anyway, here is the document, I hope it is useful, and I’d love to hear comments and suggestions for improvement:
>> https://gist.github.com/lattner/31ed37682ef1576b16bca1432ea9f782
> 
> 
> I think you're selecting the right approaches and nailing many of the details, but I have a lot of questions and thoughts. A few notes before I start:

Thanks!

> ## Dispatching back to the original queue
> 
> You correctly identify one of the problems with completion blocks as being that you can't tell which queue the completion will run on, but I don't think you actually discuss a solution to that in the async/await section. Do you think async/await can solve that? How?  Does GCD even have the primitives needed? (`dispatch_get_current_queue()` was deprecated long ago and has never been available in Swift.)
> 

Async/await does not itself solve this - again, think of async/await as sugar for completion handlers.  However, a follow-on to this proposal would be another proposal that describes how existing ObjC completion handlers are imported.  It would surely be controversial and is possibly unwise, but we could build magic into the thunks for those. This is described here:
https://gist.github.com/lattner/429b9070918248274f25b714dcfc7619#fix-queue-hopping-objective-c-completion-handlers

Iff it were a good idea to do this, we can work with the GCD folks to figure out the best implementation approach, including potentially new API.
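
To make the "magic in the thunks" idea concrete, here is a rough sketch of what such a thunk could do by hand with today's GCD. Everything named here (`hopping(to:_:)`, `loadGreeting`, the queue label) is made up for illustration, and the caller has to pass its queue explicitly, since there is no supported way to ask GCD for the current queue:

```swift
import Dispatch

// Hypothetical helper: wraps a completion handler so that it always runs on
// `queue`, regardless of which queue the callee invokes it from.
func hopping<T>(to queue: DispatchQueue,
                _ completion: @escaping (T) -> Void) -> (T) -> Void {
    return { value in
        queue.async { completion(value) }
    }
}

// Stand-in for an imported ObjC-style API that calls back on an arbitrary queue.
func loadGreeting(completion: @escaping (String) -> Void) {
    DispatchQueue.global().async { completion("hello") }
}

let callerQueue = DispatchQueue(label: "caller")
let key = DispatchSpecificKey<String>()
callerQueue.setSpecific(key: key, value: "caller")

let done = DispatchSemaphore(value: 0)
var ranOnCallerQueue = false

loadGreeting(completion: hopping(to: callerQueue) { greeting in
    // getSpecific only returns "caller" when this runs on callerQueue.
    ranOnCallerQueue = (DispatchQueue.getSpecific(key: key) == "caller")
    done.signal()
})
done.wait()
```

An imported-handler thunk would do the equivalent of `hopping(to:_:)` implicitly, which is exactly the part that would need help from the GCD side.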

> ## Error handling
> 
> Do you imagine that `throws` and `async` would be orthogonal to one another? If so, I suspect that we could benefit from adding typed `throws` and making `Never` a subtype of `Error`, which would allow us to handle this through the generics system.

Great question, I explore some of it here:
https://gist.github.com/lattner/429b9070918248274f25b714dcfc7619#alternate-syntax-options

> (Also, I notice that a fire-and-forget message can be thought of as an `async` method returning `Never`, even though the computation *does* terminate eventually. I'm not sure how to handle that, though)

Yeah, I think that actor methods deserve a bit of magic:

- Their bodies should be implicitly async, so they can call async methods without blocking their current queue or having to use beginAsync.
- However, if they are void “fire and forget” messages, I think the caller side should *not* have to use await on them, since enqueuing the message will not block.

If the actor method throws an error or returns a value, then yes, you'd have to await.
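
The caller-side difference can be emulated today with a serial queue. This is only a sketch of the distinction, not the proposed feature: `Counter` and its methods are hypothetical, and a completion handler stands in for the place where `await` would be required.

```swift
import Dispatch

// A hand-rolled "actor": all state is touched only on a private serial queue.
final class Counter {
    private let queue = DispatchQueue(label: "Counter")
    private var value = 0

    // Void "fire and forget" message: enqueuing does not block, so under the
    // idea above the caller would not need to write `await`.
    func increment() {
        queue.async { self.value += 1 }
    }

    // Message that returns a value: the caller has to wait for the reply,
    // which is where `await` would appear. A completion handler stands in here.
    func read(completion: @escaping (Int) -> Void) {
        queue.async { completion(self.value) }
    }
}

let counter = Counter()
counter.increment()   // returns immediately
counter.increment()

let replied = DispatchSemaphore(value: 0)
var observed = -1
counter.read { value in
    observed = value
    replied.signal()
}
replied.wait()
```

Because the queue is serial, the two `increment()` messages are processed before `read`, even though the caller never waited on them.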

> ## Interop, again
> 
> There are a few actor-like types in the frameworks—the WatchKit UI classes are the clearest examples—but I'm not quite worried about them. What I'm more concerned with is how this might interoperate with Cocoa delegates. Certain APIs, like `URLSession`, either take a delegate and queue or take a delegate and call it on arbitrary queues; these seem like excellent candidates for actor-ization, especially when the calls are all one-way. But that means we need to be able to create "actor protocols" or something. It's also hard to square with the common Cocoa (anti?)pattern of implementing delegate protocols on a controller—you would want that controller to also be an actor.
> 
> I don't have any specific answers here—I just wanted to point this out as something we should consider in our actor design.

As part of the manifesto, I’m not proposing that existing APIs be “actorized”, though that is a logical thing to look into once the basic model is nailed down.

> ## Value-type annotation
> 
> The big problem I see with your `ValueSemantical` protocol is that developers are very likely to abuse it. If there's a magic "let this get passed into actors" switch, programmers will flip it for types that don't really qualify; we don't want that switch to have too many other effects.

I agree.  That is one reason that I think it is important for it to have a (non-defaulted) protocol requirement.  Requiring someone to implement some code is a good way to get them to think about the operation… at least a little bit.  That said, the design does not try to *guarantee* memory safety, so there will always be an opportunity for error.
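
As a sketch of what I mean by a non-defaulted requirement, assume the protocol has a single copy method (the name `valueSemanticCopy` and the example types are placeholders, not a settled design):

```swift
// Assumed shape of the protocol: one requirement with no default implementation.
protocol ValueSemantical {
    func valueSemanticCopy() -> Self
}

// For a struct made of value types the copy is trivial...
struct Point: ValueSemantical {
    var x: Int
    var y: Int
    func valueSemanticCopy() -> Point { return self }
}

// ...but a class has to spell out a deep copy, which is the nudge to stop
// and think about whether the type really qualifies.
final class Playlist: ValueSemantical {
    var tracks: [String]
    init(tracks: [String]) { self.tracks = tracks }
    func valueSemanticCopy() -> Playlist {
        return Playlist(tracks: tracks)
    }
}

let original = Playlist(tracks: ["a"])
let copy = original.valueSemanticCopy()
copy.tracks.append("b")   // must not be visible through `original`
```

Nothing stops someone from writing `return self` in the class case, which is the "opportunity for error" that remains.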

>  I also worry that the type behavior of a protocol is a bad fit for `ValueSemantical`. Retroactive conformance to `ValueSemantical` is almost certain to be an unprincipled hack; subclasses can very easily lose the value-semantic behavior of their superclasses, but almost certainly can't have value semantics unless their superclasses do. And yet having `ValueSemantical` conformance somehow be uninherited would destroy Liskov substitutability.

Indeed.  See NSArray vs NSMutableArray.

OTOH, I tend to think that retroactive conformance is really a good thing, particularly in the transition period where you’d be dealing with other people’s packages that haven’t adopted the model.  You may be adopting it for their structs, after all.

An alternate approach would be to just say “no, you can’t do that.  If you want to work around someone else’s problem, define a wrapper struct and mark it as ValueSemantical”.  That design could also work.
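
The two options look roughly like this. It is a sketch only: the protocol requirement is an assumed shape, and `ThirdPartyConfig` and `ConfigBox` are invented stand-ins.

```swift
// Assumed shape of the protocol, repeated so the sketch is self-contained.
protocol ValueSemantical {
    func valueSemanticCopy() -> Self
}

// Stand-in for a struct from someone else's package that has not adopted the model.
struct ThirdPartyConfig {
    var name: String
    var retries: Int
}

// Option 1: retroactive conformance, adopting the protocol on their behalf.
extension ThirdPartyConfig: ValueSemantical {
    func valueSemanticCopy() -> ThirdPartyConfig { return self }
}

// Option 2: if retroactive conformance were disallowed, define a wrapper
// struct and mark the wrapper instead.
struct ConfigBox: ValueSemantical {
    var config: ThirdPartyConfig
    func valueSemanticCopy() -> ConfigBox { return self }
}

var boxed = ConfigBox(config: ThirdPartyConfig(name: "a", retries: 1))
let snapshot = boxed.valueSemanticCopy()
boxed.config.retries = 5   // does not affect `snapshot`
```

The wrapper costs a little boilerplate at every use site, but it keeps the decision local to the code that needs it.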

> ## Plain old classes
> 
> In the section on actors, you suggest that actors can either be a variant of classes or a new fundamental type, but one option you don't seem to probe is that actors could simply *be* subclasses of an `Actor` class:
> 
> 	class Storage: Actor {
> 		func fetchPerson(with uuid: UUID) async throws -> Person? {
> 			...
> 		}
> 	}
> 
> You might be able to use different concurrency backends by using different base classes (`GCDActor` vs. `DillActor` vs. whatever), although that would have the drawback of tightly coupling an actor class to its backend. Perhaps `Actor` could instead be a generic class which took an `ActorBackend` type parameter; subclasses could either fix that parameter (`Actor<DispatchQueue>`) or expose it to their users.

Yes, that is possible, I’ll mention it.

> 
> Another possibility doesn't involve subclasses at all. In this model, an actor is created by an `init() async` initializer. An async initializer on `Foo` returns an instance of type `Foo.Async`, an implicitly created pseudo-class which contains only the `async` members of `Foo`.
...
> A third possibility is to think of the actor as a sort of proxy wrapper around a (more) synchronous class, which exposes only `actor`-annotated members and wraps calls to them in serialization logic. This would require some sort of language feature to make transparent wrappers, though. This design would allow the user, instead of the actor, to select a "backend" for it, so an iOS app could use `GCDActor<Storage>` while its server backend could use `DillActor<Storage>`. (`Storage` is a bad example for shared code, but you get the idea.)
> 
> My point here is simply that, although you show the actor-ness of a type as being fundamental to it, I'm not sure it needs to be.

It would be a perfectly valid design approach to implement actors as a framework or design pattern instead of as a first class language feature.  You’d end up with something very close to Akka, which provides a lot of the high level abstractions, but doesn’t nudge coders to do the right thing w.r.t. shared mutable state enough (IMO).

> ### Lifting parameter type restrictions into `async`
> 
> The major downside of an "actors are not special types" model is that it wouldn't enforce the parameter type restrictions.

Right.

> One solution would be to apply those restrictions to *all* `async` functions—their parameters would all have to conform to the magic "okay for actors" protocol (well, it'd be "okay for async" now). That strikes me as a pretty sane restriction, since the shared-state problems we want to avoid with actors are also questionable with other async calls.
> 
> However, this would move the design of the magic protocol forward in the schedule, and might delay the deployment of async/await. If we *want* these restrictions on all async calls, that might be worth it, but if not, that's a problem.

I’m not sure it makes sense either, given the extensive completion-handler-based APIs, which take lots of non-value-type parameters.

> ## The inevitable need for metadata
> 
> GCD started with a very simple model: you put blocks on a queue and the queue runs them in order. This was much more lightweight than `NSOperationQueue`, which had a lot of extra stuff for canceling operations, prioritizing them, etc. Unfortunately, within a few years Apple decided that GCD *needed* to be able to cancel and prioritize operations, so they had to pack this information into weird pseudo-block objects. In Swift, this manifested as the `DispatchWorkItem` class.
> 
> My point is, in anything that involves background processing, you always end up needing more configurability than you think at the start. We should anticipate this in our design and have a plan for how we'll attach metadata to actor messages, even if we don't implement that feature right away. Because we'll surely need to sooner or later.

Agreed.

> # Reliability
> 
> Overall, I like reliability at the actor level; it seems like an appropriate unit of trap-resistance.
> 
> I don't think we should incorporate traps into normal error-handling mechanisms; that is, I don't think resilient actors should throw on traps. When an invariant is violated within an actor, that means *something went wrong* in a way that wasn't anticipated. The mistake may be completely internal to the actor in question, but it may also have stemmed from invalid data passed into it—data which may be present in other parts of the system. In other words, I don't think we should think of reliable actors as a way to normalize trapping; we should think of it as a way to mitigate the damage caused by a trap, to trap gracefully. Failure handlers encourage the thinking we want; throwing errors encourages the opposite.

I tend to agree with you.

> To that end, I think failure handlers are the right approach. I also think we should make it clear that, once a failure handler is called, there is no saving the process—it is *going* to crash eventually. Maybe failure handlers are `Never`-returning functions, or maybe we simply make it clear that we're going to call `fatalError` after the failure handler runs, but in either case, a failure handler is a point of no return.
> 
> (In theory, a failure handler could keep things going by pulling some ridiculous shenanigans, like re-entering the runloop. We could try to prevent that with a time limit on failure handlers, but that seems like overengineering.)
> 
> I have a few points of confusion about failure handlers, though:
> 
> 1. Who sets up a failure handler? The actor that might fail, or the actor which owns that actor?

I imagine it being something set up by the actor’s init method.  That way the actor failure behavior is part of the contract the actor provides.  Parameters to the init can be used by clients to customize that behavior.

> 2. Can there be multiple failure handlers?
> 
> 3. When does a failure handler get invoked? Is it queued like a normal message, or does it run immediately? If the latter, and if it runs in the context of an outside actor, how do we deal with the fact that invariants might not currently hold?

These all need to be defined.  I haven’t gone into the design of the API because there are numerous good answers here.
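
For the first question only, "set up by the actor's init method" could look something like the sketch below. All of it is hypothetical: `ImageCache`, the handler signature, and `simulateFailure` are invented, and how the runtime would detect a trap and call the handler is exactly the part that is not designed.

```swift
// Hypothetical: how a trap would be detected and routed here is left open.
final class ImageCache {
    typealias FailureHandler = (_ reason: String) -> Void
    private let onFailure: FailureHandler

    // The failure behavior is part of the init contract; clients customize
    // it by passing a handler, and get a default otherwise.
    init(onFailure: @escaping FailureHandler = { reason in
        print("ImageCache failed: \(reason)")
    }) {
        self.onFailure = onFailure
    }

    // Stand-in for whatever would invoke the handler after a trap.
    func simulateFailure(_ reason: String) {
        onFailure(reason)
    }
}

var reported: [String] = []
let cache = ImageCache(onFailure: { reported.append($0) })
cache.simulateFailure("index out of range")
```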

> # Distributed actors
> 
> I love the feature set you envision here, but I have two major criticisms.
> 
> ## Heterogeneity is the rule
> 
> Swift everywhere is a fine idea, but heterogeneity is the reality. It's the reality today and it will probably be the reality in twenty years. A magic "distributed actor" model isn't going to do us much good if it doesn't work when the actor behind it is implemented in Node, PHP, or Java.
> 
> That means that we should expect most distributed actors to be wrappers around marshaling code. Dealing with things like XPC or Neo-Distributed Objects is great, but we also need to think about "distributed actors" based on `JSONEncoder`, `URLSession`, and some custom glue code to stick them together. That's probably most of what we'll end up doing.

I completely agree, we want both.

> ## It's just a tweaked backend
> 
> You describe this as a `distributed` keyword, but I don't think the keyword actually adds much. I don't think there's a simple, binary distinction between distributed and non-distributed actors. Rather, there are a variety of actor "backends"—some in-process, some in-machine, some in-network—which vary in two dimensions:
> 
> 1. **Is the backend inherently error-prone?** Basically, should actor methods that normally are not `throws` be exposed as `throws` methods because the backend itself is expected to introduce errors in the normal course of operation?
> 
> 2. **How strictly does the backend constrain the types of parameters you can pass?** In-process, anything that can be safely used by multiple threads is fine. In-machine, it needs to be `Codable` or support `mmap`ing. In-network, it needs to be `Codable`. But that's only the common case, of course! A simple in-machine backend might not support `mmap`; a sophisticated in-network backend might allow you to pass one of your `Actor`s to the other side (where calls would be sent back the other way).
> 
> Handling these two dimensions of variation basically requires new protocol features. For the error issue, we basically need typed `throws`, `Never` as a universal subtype (or at least a universal subtype of all `Error`s), and an operation equivalent to `#commonSupertype(BackendError, MethodError)`. For the type-constraining issue, we need an "associated protocol" feature that allows you to constrain `ActorBackend`'s parameters to a protocol specified by the conforming type. And, y'know, a way to reject actor/backend combinations that aren't compatible.

Sure, that’s reasonable.  I’m mostly concerned with programmers having an expressive way to describe what they want, and for the language to drag them into doing the right thing when they’ve said they need a capability.

-Chris
