[swift-evolution] [Concurrency] async/await + actors

Pierre Habouzit pierre at habouzit.net
Sat Sep 2 17:38:06 CDT 2017


> On Sep 2, 2017, at 2:19 PM, Charles Srstka via swift-evolution <swift-evolution at swift.org> wrote:
> 
>> On Sep 2, 2017, at 4:05 PM, David Zarzycki via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>> 
>>> On Sep 2, 2017, at 14:15, Chris Lattner via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>>> 
>>> My understanding is that GCD doesn’t currently scale to 1M concurrent queues / tasks.
>> 
>> Hi Chris!
>> 
>> [As a preface, I’ve only read a few of these concurrency related emails on swift-evolution, so please forgive me if I missed something.]
>> 
>> When it comes to GCD scalability, the short answer is that millions of of tiny heap allocations are cheap, be they queues or closures. And GCD has fairly linear performance so long as the millions of closures/queues are non-blocking.
>> 
>> The real world is far messier though. In practice, real world code blocks all of the time. In the case of GCD tasks, this is often tolerable for most apps, because their CPU usage is bursty and any accidental “thread explosion” that is created is super temporary. That being said, programs that create thousands of queues/closures that block on I/O will naturally get thousands of threads. GCD is efficient but not magic.
>> 
>> As an aside, there are things that future versions of GCD could do to minimize the “thread explosion” problem. For example, if GCD interposed the system call layer, it would gain visibility into *why* threads are stalled and therefore GCD could 1) be more conservative about when to fire up more worker threads and 2) defer resuming threads that are at “safe” stopping points if all of the CPUs are busy.
>> 
>> That being done though, the complaining would just shift. Instead of an “explosion of threads”, people would complain about an “explosion of stacks" that consume memory and address space. While I and others have argued in the past that solving this means that frameworks must embrace callback API design patterns, I personally am no longer of this opinion. As I see it, I don’t think the complexity (and bugs) of heavy async/callback/coroutine designs are worth the memory savings. Said differently, the stack is simple and efficient. Why fight it?
>> 
>> I think the real problem is that programmers cannot pretend that resources are infinite. For example, if one implements a photo library browsing app, it would be naive to try and load every image at launch (async or otherwise). That just won’t scale and that isn’t the operating system's fault.
> 
> Problems like thread explosion can be solved using higher-level constructs, though. For example, (NS)OperationQueue has a .maxConcurrentOperationCount property. If you make a global OperationQueue, set the maximum to whatever you want it to be, and run all your “primitive” operations through the queue, you can manage the thread count rather effectively.
> 
> I have a few custom Operation subclasses that easily wrap arbitrary asynchronous operations as Operation objects; once the new async/await API comes out, I plan to adapt my subclass to support it, and I’d be happy to submit the code to swift-evolution if people are interested.
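
(For concreteness, the pattern described above is roughly the following; a minimal sketch where the width of 8 and the placeholder work are invented for illustration.)

import Foundation

// One global OperationQueue with a capped width; every "primitive"
// operation is funneled through it. The width and the fake work are
// placeholders.
let globalQueue = OperationQueue()
globalQueue.maxConcurrentOperationCount = 8

func expensiveWork(_ i: Int) {
    // Stand-in for a real operation.
    Thread.sleep(forTimeInterval: 0.01)
}

for i in 0..<10_000 {
    globalQueue.addOperation {
        expensiveWork(i)   // at most 8 of these run concurrently
    }
}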

NSOperation has several implementation issues, and using it to encapsulate asynchronous work means that you don't get the correct priorities (I'm not saying it can't be fixed, I honestly don't know; I just know from the mouth of the maintainer that NSOperation only makes guarantees if you do all your work from -[NSOperation main]).

Second, what Dave is saying is exactly the opposite of what you just wrote. If you use NSOperationQueue's maxConcurrentOperationCount and throw an unbounded amount of work at it, *sure*, the thread explosion is fixed, but:
- cancellation is a problem
- scalability is a problem
- memory growth is a problem.

The better design is to have a system that works like this (a rough Swift sketch follows the list):

(1) have a scheduler that knows how many operations are in flight and admits two levels, "low" and "high";
(2) when you submit work to the scheduler, it tells you whether it could take it; if not, it's up to you to serialize that work somewhere "for later" or to propagate the error to your client;
(3) the scheduler admits up to "high" work items, and as they finish and you drop back down to "low", it uses some notification mechanism to ask you to feed it again (possibly pulling the work back out of a database).
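
To make this concrete, here is a rough Swift sketch of such a scheduler. The type, its names and its numbers are invented for illustration; a real one would also need cancellation, priorities and error propagation.

import Dispatch

/// Admits at most `high` items; when completions bring the in-flight count
/// back down to `low`, it asks the producer for the next batch.
final class WatermarkScheduler {
    private let queue = DispatchQueue(label: "scheduler.state")  // protects inFlight
    private let low: Int
    private let high: Int
    private var inFlight = 0

    /// Called on the scheduler's queue when it is time to refill;
    /// may return up to `high - low` items at once.
    var refill: ((_ capacity: Int) -> [() -> Void])?

    init(low: Int, high: Int) {
        self.low = low
        self.high = high
    }

    /// Returns false when full; the caller must then park the work
    /// "for later" or report the error upstream.
    func submit(_ work: @escaping () -> Void) -> Bool {
        return queue.sync { () -> Bool in
            guard inFlight < high else { return false }
            start(work)
            return true
        }
    }

    private func start(_ work: @escaping () -> Void) {
        inFlight += 1
        DispatchQueue.global().async {
            work()
            self.queue.async { self.didFinish() }
        }
    }

    private func didFinish() {
        inFlight -= 1
        if inFlight == low, let refill = refill {
            // Batch effect: ask for up to (high - low) items in one go.
            for work in refill(high - inFlight) where inFlight < high {
                start(work)
            }
        }
    }
}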

This is how any OS construct built around a bounded resource works (network sockets, file descriptors, ...), where the notification mechanism that tells you writing to them is possible again is select/poll/epoll/kevent/... you name it.
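
(A GCD-flavored illustration of that kind of notification: a write source on a socket. The function and its parameters are made up for the example.)

import Dispatch

// Fires when `socketFD` (assumed to be an existing non-blocking socket)
// can accept more bytes, i.e. when it is worth feeding it again.
func monitorWritability(of socketFD: Int32,
                        on queue: DispatchQueue) -> DispatchSourceWrite {
    let source = DispatchSource.makeWriteSource(fileDescriptor: socketFD,
                                                queue: queue)
    source.setEventHandler {
        // The kernel says the socket is writable again: send the next batch.
    }
    source.resume()
    return source
}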

By doing it this way, you can actually write smarter policies about what the next piece of work should be. Computing what to do next is usually relatively expensive, especially when work keeps arriving, when your decision can go stale quickly, or when priorities are live. Because the scheduler only calls you back once you reach "low", you can hand it "high - low" items at once: you get a batching effect and the opportunity to reorder work. And you work in constant memory.
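
With the sketch above, that batching shows up directly in the refill callback: it is the only place where you decide what runs next, and you decide for a whole batch. The helper names below are hypothetical.

struct PendingItem {
    let priority: Int
    let run: () -> Void
}

// Hypothetical stand-in for pulling parked work back out of a database.
func fetchPendingWork(limit: Int) -> [PendingItem] { return [] }

let scheduler = WatermarkScheduler(low: 16, high: 64)

scheduler.refill = { capacity in
    // Only called when we drop to "low": grab up to `capacity` (= high - low)
    // items at once and order them by their priority *now*, instead of making
    // one stale decision per item.
    fetchPendingWork(limit: capacity)
        .sorted { $0.priority > $1.priority }
        .map { $0.run }
}

if !scheduler.submit({ print("work") }) {
    // Full: persist the request "for later" instead of piling it up in memory.
}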


NSOperationQueue is not that; it is just a turnstile at the door that doesn't let too many people in, but will happily let the waiting line overflow into the street and cause a car accident because the pavement is not large enough to hold it.


Dave said it very well so I'll not try to reword it:

>> I think the real problem is that programmers cannot pretend that resources are infinite.

From there it follows that you need to reason about your resources, and execution contexts and the number of Actors in flight are such resources. It's fine to ignore these problems when your working set is small enough, but in my experience there are *lots* of cases where the working set is still bounded, yet large enough in a way that lets the developer believe they're fine.

The most typical example, which we actually tried to express in the WWDC GCD talk my team gave this year, is when your working set scales with the number of clients/connections/... your process handles: this is usually not a good property for a real system, even on many-core machines. For these cases you need to make sure that all related work happens on the same execution context, to benefit from cache effects at every level (from the CPU all the way up the software stack), to have better affinity for the locking domains you use, etc., etc.
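
(A small GCD illustration of that last point: you can target several queues at a single serial queue so that everything related to one client/connection stays on one execution context. The labels are made up.)

import Dispatch

// One serial root queue per connection; per-subsystem queues target it, so
// all related work is serialized on the same execution context and shares
// one locking domain and the same caches.
let connectionRoot = DispatchQueue(label: "com.example.connection-42")

let parserQueue = DispatchQueue(label: "com.example.connection-42.parser",
                                target: connectionRoot)
let ioQueue = DispatchQueue(label: "com.example.connection-42.io",
                            target: connectionRoot)

parserQueue.async { /* parse the next frame */ }
ioQueue.async { /* flush pending writes */ }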

And that brings us back to the reply I made to Chris earlier today: nesting actors is like nesting locks. When you have a clear ordering it's all fine, but when you don't, you're screwed.

-Pierre
