On Sep 2, 2017, at 11:09 PM, Pierre Habouzit <phabouzit@apple.com> wrote:

>> On Sep 2, 2017, at 12:19 PM, Pierre Habouzit <pierre@habouzit.net> wrote:
>>
>>>>> What do you mean by this?
>>>>
>>>> My understanding is that GCD doesn't currently scale to 1M concurrent queues / tasks.
>>>
>>> It completely does, provided these 1M queues / tasks are organized on several well-known independent contexts.
>>
>> Ok, I stand corrected. My understanding was that you could run into situations where you get stack explosions, fragment your VM and run out of space, but perhaps that is a relic of 32-bit systems.
>
> A queue on 64-bit systems is 128 bytes (nowadays). Provided you have that amount of VM available to you (1M queues is 128M after all), then you're good.
> If a large number of them fragments the VM beyond that, it is a malloc/VM bug on 64-bit systems, which are supposed to have enough address space.

Right, I was referring to the fragmentation you get by having a large number of 2M stacks allocated for each kernel thread. I recognize the queues themselves are small.

> What doesn't scale is asking for threads, not having queues.

Right, agreed.
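
To put numbers on that: a minimal Dispatch sketch of "many cheap queues, few independent contexts". The labels and counts below are made up for illustration; the point is only that the per-object queues are tiny and that thread demand is bounded by the handful of contexts they target, not by how many queues exist.

    import Dispatch

    // A small, fixed set of "well-known independent contexts" (bottom serial queues).
    let contexts = (0..<4).map { DispatchQueue(label: "com.example.context.\($0)") }

    // Many cheap queues, stand-ins for per-object actors. Each targets one of the
    // few contexts, so no matter how many of these exist, at most `contexts.count`
    // of them can be running (and need a thread) at any moment.
    let actors = (0..<100_000).map { i in
        DispatchQueue(label: "com.example.actor.\(i)",
                      target: contexts[i % contexts.count])
    }

    // Work submitted to any of them is serialized on its own queue and ultimately
    // runs on its context; the system is never asked for 100,000 threads.
    actors[12_345].async { print("hello from a cheap queue") }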

>> Agreed, to be clear, I have no objection to building actors on top of (perhaps enhanced) GCD queues. In fact I *hope* that this can work, since it leads to a naturally more incremental path forward, which is therefore much more likely to actually happen.
>
> Good :)

I meant to be pretty clear about that all along, but perhaps I missed the mark. In any case, I've significantly revised the "scalable runtime" section of the doc to reflect some of this discussion; please let me know what you think:
https://gist.github.com/lattner/31ed37682ef1576b16bca1432ea9f782#scalable-runtime

>>> My currently not very well formed opinion on this subject is that GCD queues are just what you need, with these possibilities:
>>> - this Actor queue can be targeted to other queues by the developer when he means for these actors to be executed in an existing execution context / locking domain,
>>> - we disallow Actors to be directly targeted to GCD global concurrent queues ever,
>>> - for the other ones we create a new abstraction with stronger and better guarantees (typically limiting the number of possible threads servicing actors to a low number, not greater than NCPU).
>>
>> Is there a specific important use case for being able to target an actor to an existing queue? Are you looking for advanced patterns where multiple actors (each providing disjoint mutable state) share an underlying queue? Would this be for performance reasons, for compatibility with existing code, or something else?
>
> Mostly for interaction with current designs where being on a given bottom serial queue gives you the locking context for resources naturally attached to it.

Ok. I don't understand the use-case well enough to know how we should model this. For example, is it important for an actor to be able to change its queue dynamically as it goes (something that sounds really scary to me), or can the "queue to use" be specified at actor initialization time?
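
For what it's worth, here is roughly what the "queue specified at initialization time" option already looks like with plain Dispatch today. `DeviceRegistry` and the queue labels are invented for the example; the wrapper is handed an existing bottom serial queue (the locking domain for some resource) and targets its own queue at it.

    import Dispatch

    // `DeviceRegistry` is a made-up name standing in for an "actor" whose
    // execution context is chosen once, by whoever owns the resource.
    final class DeviceRegistry {
        private let queue: DispatchQueue
        private var devices: [String: Int] = [:]   // mutable state guarded by `queue`

        init(targeting lockingContext: DispatchQueue) {
            // The registry keeps its own identity (its own queue), but its work
            // ultimately runs on the caller-provided bottom serial queue.
            self.queue = DispatchQueue(label: "com.example.device-registry",
                                       target: lockingContext)
        }

        func register(_ name: String, id: Int) {
            queue.async { self.devices[name] = id }
        }
    }

    // Usage: the subsystem that owns the resource decides the execution context once.
    let ioContext = DispatchQueue(label: "com.example.io-context")
    let registry = DeviceRegistry(targeting: ioContext)
    registry.register("en0", id: 1)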

>> One plausible way to model this is to say that it is a "multithreaded actor" of some sort, where the innards of the actor allow an arbitrary number of client threads to call into it concurrently. The onus would be on the implementor of the NIC or database to implement the proper synchronization on the mutable state within the actor.
>
> I think what you said made sense.

Ok, I captured this in yet another speculative section:
https://gist.github.com/lattner/31ed37682ef1576b16bca1432ea9f782#intra-actor-concurrency

> But it wasn't what I meant. I was really thinking of sqlite, where the database is strongly serial (you can't use it in a multi-threaded way well, or rather you can, but it has a big lock inside). It is much better to interact with that dude on the same exclusion context all the time. What I meant is really having some actors that have a "strong affinity" with a given execution context, which eases the task of the actor scheduler.

Ah ok. Yes, I think that wrapping a "subsystem with a big lock" in an actor is a very natural thing to do, just as much as it makes sense to wrap a non-threadsafe API in an actor. Any internal locking would be subsumed by the outer actor queue, but that's ok - all the lock acquires would be uncontended and fast :)
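
A minimal sketch of that "wrap the subsystem with a big lock in an actor" idea, with a plain serial queue standing in for the actor's queue. It assumes an Apple platform where the SQLite3 module is importable; the class and label names are illustrative.

    import Dispatch
    import SQLite3   // assumes an Apple platform SDK

    // `Database` is a made-up wrapper: all access to the (strongly serial)
    // sqlite connection funnels through one serial queue, so sqlite's
    // internal lock is only ever taken uncontended.
    final class Database {
        private let queue = DispatchQueue(label: "com.example.database")
        private var handle: OpaquePointer?

        init?(path: String) {
            guard sqlite3_open(path, &handle) == SQLITE_OK else { return nil }
        }

        // "Messages" to the database actor: fire-and-forget work on its queue.
        func execute(_ sql: String) {
            queue.async {
                if sqlite3_exec(self.handle, sql, nil, nil, nil) != SQLITE_OK {
                    print("sqlite error: \(String(cString: sqlite3_errmsg(self.handle)))")
                }
            }
        }

        deinit {
            // Runs only after all queued work has completed (the blocks retain self).
            sqlite3_close(handle)
        }
    }

    let db = Database(path: "/tmp/example.db")
    db?.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER);")
    db?.execute("INSERT INTO t VALUES (42);")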

> Another problem I haven't touched either is kernel-issued events (inbound IPC from other processes, networking events, etc.). Dispatch for the longest time used an indirection through a manager thread for all such events, and that had two major issues:
>
> - the thread hops it caused, making networking workloads use up to 15-20% more CPU time than an equivalent manually made pthread parked in kevent(), because networking, even when busy, idles back all the time as far as the CPU is concerned, so dispatch queues never stay hot, and the context switch is not only a scheduled context switch but also has the cost of a thread bring-up;
>
> - if you deliver all possible events this way, you also deliver events that cannot possibly make progress because the execution context that will handle them is already "locked" (as in busy running something else).
>
> It took us several years to get to the point we presented at WWDC this year, where we deliver events directly to the right dispatch queue. If you only have very anonymous execution contexts then all this machinery is wasted and unused. However, this machinery has been evaluated and saves full percents of CPU load system-wide. I'd hate for us to go back 5 years here.

I don't have anything intelligent to say here, but it sounds like you understand the issues well :-) I agree that giving up 5 years of progress is not appealing.

> Declaring that an actor targets a given existing serial context also means that if that actor needs to make urgent progress, the context in question has to be rushed and its priority elevated. It's really hard to do the same on an anonymous global context (the way dispatch still does it is to actually enqueue stealer work that tries to steal the "actor" at a higher priority; this approach is terribly wasteful).

Ok.

> It is clear today that it was a mistake and that there should have been 3 kinds of queues:
> - the global queues, which aren't real queues but represent which family of system attributes your execution context requires (mostly priorities), and we should have disallowed enqueuing raw work on them;
> - the bottom queues (which GCD since last year tracks and calls "bases" in the source code), which are known to the kernel when they have work enqueued;
> - any other "inner" queue, which the kernel couldn't care less about.
>
> In dispatch, we regret every passing day that the difference between the 2nd and 3rd group of queues wasn't made clear in the API originally.

Well, this is an opportunity to encourage the right thing to happen. IMO the default should strongly be for an "everyday actor" to be the third case, which is what developers will reach for when building out their app abstractions. We can provide some API or other system that allows sufficiently advanced developers to tie an actor into the second bucket, and we can disallow the first entirely if that is desirable (e.g. by a runtime trap if nothing else).
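
Spelled with today's Dispatch API, the three buckets might look roughly like this; the labels and QoS choices are illustrative, and bucket 1 deliberately gets no work enqueued on it.

    import Dispatch

    // 1. Global queues: really just priority bands. Per the discussion, raw work
    //    arguably shouldn't be enqueued on them directly, so none is here.
    //
    // 2. A "bottom" queue: a real serial queue the kernel can reason about when it
    //    has work enqueued. This is what an advanced developer would tie an actor
    //    to; its QoS stands in for the system attributes a global band represents.
    let networkContext = DispatchQueue(label: "com.example.network", qos: .utility)

    // 3. "Inner" queues: everyday actors. They target the bottom queue, and the
    //    kernel never needs to know they exist.
    let connectionA = DispatchQueue(label: "com.example.connection.a", target: networkContext)
    let connectionB = DispatchQueue(label: "com.example.connection.b", target: networkContext)

    connectionA.async { /* per-connection state mutations, serialized on networkContext */ }
    connectionB.async { /* likewise */ }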

> I like to call the 2nd category execution contexts, but I can also see why you want to pass them as Actors; it's probably more uniform (and GCD did the same by presenting them both as queues). Such top-level "Actors" should be few, because if they all become active at once they will need as many threads in your process, and this is not a resource that scales. This is why it is important to distinguish them. And, as we're discussing, they usually also wrap some kind of shared mutable state, resource, or similar, which inner actors probably won't do.

The thing I still don't understand is where #2 comes from: are these system-defined queues that the developer interacts with, or are these things that developers define in their code? How does the kernel know about them if the developer defines new ones?

>>> You can obviously model this as "the database is an actor itself", and have the queues of other actors that only do database work target this database actor queue. But while this looks very appealing, in practice this creates a system which is hard to understand for developers. Actors talking to each other / messaging each other is fine. Actors nesting their execution inside each other is not, because the next thing people will ask from such a system is a way to execute code from the outer actor when in the context of the inner Actor, IOW what a recursive mutex is to a mutex, but for the Actor queue. This obviously has all the terrible issues of recursive locks, where you think you hold the lock for the first time and expect your object invariants to be valid, except that you're really in a nested execution and see broken invariants from the outer call, and this creates terribly hard bugs to find.
>>
>> Yes, I understand the problem of recursive locks, but I don't see how or why you'd want an outer actor to have an inner actor call back to it.
>
> I don't see why you'd need this with dispatch queues either today; however, my radar queue disagrees strongly with this statement. People want this all the time, mostly because the outer actor has a state machine and the inner actor wants it to make progress before it continues working.
>
> In CFRunloop terms it's like running the runloop from your work by calling CFRunLoopRun() yourself again until you can observe that some event happened.
>
> It's not great and problematic for tons of reasons. If actors are nested, we need a way to make sure people don't ever have to do something like that.
>
> Just as a data point, my reactions to this thread yielded a private discussion off list *exactly* about someone wanting us to provide something like this for dispatch (or rather, in this instance, having the inner actor be able to suspend the outer one, but it's just a corollary / similar layering violation where the inner actor wants to affect the outer one in a way the outer one didn't expect).

I get that this comes up now and then, but I also don't understand what we can do about it. Allowing "recursive lock" style reentrancy into an actor breaks a fundamental aspect of the design of actors: that you know at the start of any actor message that the actor invariants are intact.

It would be possible to relax this and give up on that (perhaps rather theoretical) win, but I'd rather not. I think that any recursive case can be expressed through a refactoring of the code, e.g. to use async sending instead of sync calls. Also, I think we should strongly encourage pure async "fire and forget" actor methods anyway - IOW, we should encourage push, not pull - since they provide much stronger guarantees in general.
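
To illustrate with today's serial queues standing in for actors (queue names made up): the reentrant "wait on the outer context while running inside the inner one" shape, versus the async-send refactoring.

    import Dispatch

    let outer = DispatchQueue(label: "com.example.outer")
    let inner = DispatchQueue(label: "com.example.inner")

    // The "recursive lock" shape: the outer context calls the inner one
    // synchronously, and the inner one tries to get back onto the outer
    // context before returning. With serial queues this deadlocks outright;
    // with reentrant actors it would instead run outer code on top of
    // half-updated outer state.
    outer.async {
        inner.sync {
            // outer.sync { ... }   // never do this: `outer` is already busy -> deadlock
        }
    }

    // The async-send refactoring: the inner actor never waits on the outer
    // one. It posts a message and returns; the outer actor handles it as its
    // next message, with its invariants intact at entry.
    outer.async {
        inner.async {
            outer.async {
                // handle the "reply" as a fresh message on the outer context
            }
        }
    }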

> It is definitely the family of problems I'm worried about. I want to make sure we have a holistic approach here, because I think that recursive mutexes, recursive CFRunloop runs, and similar ideas are flawed and dangerous. I want to make sure we understand which limitations we want to impose here - and by limitations I really mean layering/architecture rules - and that we document them upfront and explain how to work with them.

+1

> I think that what I'm getting at is that we can't build Actors without defining how they interact with the operating system and what guarantees of latency and liveness they have.

Agreed, thanks Pierre!

-Chris