[swift-dev] Resilient dynamic dispatch ABI. Notes and mini-proposal.

Fri Feb 3 13:55:55 CST 2017

> On Feb 3, 2017, at 10:58 AM, John McCall <rjmccall at apple.com> wrote:
> 
>> On Feb 2, 2017, at 9:57 PM, Andrew Trick via swift-dev <swift-dev at swift.org> wrote:
>> ---
>> #1. (thunk export) The simplest, most flexible way to expose dispatch
>> across resilience boundaries is by exporting a single per-method entry
>> point. Future compilers could improve dispatch and gradually expose
>> more ABI details.
> 
> Probably the most important such case is that many of these "dispatch" symbols
> for a non-open class could simply be aliases to the actual method definition.
> 
> Note that open classes have to support super-dispatch, not just ordinary dynamic
> dispatch; that's something that needs to be discussed at the same time, since it
> may affect the trade-offs.  #1 has some serious disadvantages here; the others
> are likely fine, since their mechanisms are all easily parameterized by the isa.
> 
>> ---
>> #3. (method index) This is an alternative that I've alluded to before,
>> but was not discussed in yesterday's meeting. One that makes a
>> tradeoff between exporting symbols vs. exposing vtable layout. I want
>> to focus on direct cost of the ABI support and flexibility of this
>> approach vs. approach #1 without arguing over how to micro-optimize
>> various dispatching schemes. Here's how it works:
>> 
>> The ABI specifies a sort function for public methods that gives each
>> one a per-class index. Version availability takes sort precedence, so
>> public methods can be added without affecting other
>> indices. [Apparently this is the same approach we're taking with
>> witness tables].
>> 
>> As with #2 this avoids locking down the vtable format for now--in the
>> future we'll likely optimize it further. To avoid locking all methods
>> into the vtable mechanism, the offset can be tagged. The alternative
>> dispatch mechanism for tagged offsets will be hidden within the
>> class-defining framework.
> 
> As a note, I went down a garden path here — I didn't realize at first that #3
> wasn't an option on its own, but really just front matter for options #3a and #3b.
> The paragraph about the sort, especially coming after #2, made me think that
> you were talking about the option — let's call it #2b — where the class only
> exports an offset to the start of its v-table and an availability-sort allows the
> caller to hard-code a fixed offset to add to that.  This greatly cuts down on the
> number of symbols required to be exported, but of course it does force all
> the methods to actually end up in the v-table.

Right, I didn't go down that path initially because, as Joe pointed
out, the resilient base class problem means the vtable offsets aren't
statically knowable. But of course, the vtable base offset could be
exported, and we can rely on the class being realized before any of
its methods is invoked. This is great if we're willing to commit to
vtable dispatch.
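
To make the #2b variant concrete, here is a minimal Python model (a
sketch, not Swift runtime code). It assumes each public method can be
described by a hypothetical (introduced_in_version, name) pair, that the
availability sort orders by version first and name second, and that the
class exports a single vtable base value that the client reads once the
class is realized. Every name below is illustrative.

```python
# Toy model of option #2b: availability-sorted indices plus an exported
# vtable base offset. All names here are hypothetical, for illustration only.

def method_indices(methods):
    """Assign each public method a per-class index.

    `methods` is a list of (introduced_in_version, name) pairs. Sorting by
    version first means methods added in a later release get higher indices
    and never renumber the older ones.
    """
    ordered = sorted(methods)  # tuples sort by version, then by name
    return {name: index for index, (_, name) in enumerate(ordered)}


def dispatch(metadata, vtable_base, index):
    """Load the method entry at the exported base plus the fixed index."""
    return metadata[vtable_base + index]


# Version 1 of the library exports two methods.
v1 = method_indices([(1, "draw"), (1, "area")])

# Version 2 adds a method whose name would sort first alphabetically.
v2 = method_indices([(1, "draw"), (1, "area"), (2, "aardvark")])

# A resilient base class occupies the first three metadata slots, so the
# derived class's own entries start at slot 3. The client cannot know this
# statically; it reads the exported base at run time.
vtable_base = 3
metadata = ["base0", "base1", "base2", "impl_area", "impl_draw", "impl_aardvark"]

# A client compiled against version 1 hard-codes the index of "draw".
draw_entry = dispatch(metadata, vtable_base, v1["draw"])
```

In this model a client built against version 1 keeps working against
version 2, because "aardvark" sorts after every version 1 method even
though its name sorts first alphabetically.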

>> This avoids the potential explosion of exported symbols--it's limited
>> to one per public class. It avoids explosion of metadata by allowing
>> alternative dispatch for some subset of methods. These tradeoffs can
>> be explored in the future, independent of the ABI.
>> 
>> ---
>> #3a. (offset table export) A single per-class entry point provides a
>> pointer to an offset table. [It can be optionally cached on the client
>> side].
> 
> Why would this need to be a function?  Just to allow its resolution to be lazy?
> It seems to me that the class definition always knows the size of this table.

I don't know what you're asking. In #3b, the offset is resolved by a
class-exported function.

>> method_index = immediate
>> { // common per-class method lookup
>>   isa = load[obj]
>>   isa = isa & @isa_mask
>>   offset = load[@class_method_table + method_index]
>>   if (isVtableOffset(offset))
>>     method_entry = load[isa + offset]
>>   else
>>     method_entry = @resolveMethodAddress(isa, @class_method_table, method_index)
>> }
>> call method_entry
> 
> I hope the proposal is that the stuff in braces is a locally-emitted function,
> not something inlined into every call site.

This approach is based on generating local per-class "nocallersave" helper functions
for method lookup (all the stuff in curly braces).
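
As a rough illustration of what such a helper computes in #3a, here is a
small Python model. The tag encoding (low bit set means "not a vtable
offset") is an assumption made for the sketch; nothing in this thread
fixes the encoding. The isa masking step is omitted, the isa is modeled
as a plain list of slots, and the resolver is a hypothetical stand-in
for the framework's entry point.

```python
# Toy model of the #3a lookup helper. The tag encoding and all names are
# assumptions for illustration; the thread does not specify them.

NON_VTABLE_TAG = 1  # assumed: low bit set marks a non-vtable method


def is_vtable_offset(offset):
    return (offset & NON_VTABLE_TAG) == 0


def lookup(isa, class_method_table, method_index, resolve_method_address):
    """Per-class helper: the part in curly braces in the pseudocode."""
    offset = class_method_table[method_index]
    if is_vtable_offset(offset):
        return isa[offset]  # fast path: a plain load from the metadata
    # Slow path: the defining framework resolves the method however it likes.
    return resolve_method_address(isa, class_method_table, method_index)


# Offsets 2 and 4 are vtable slots; the value 1 is tagged as non-vtable.
class_method_table = [2, 4, 1]
isa = ["header0", "header1", "impl_a", "unused", "impl_b"]

# Hypothetical framework-side storage for methods kept out of the vtable.
side_table = {2: "impl_c"}


def framework_resolver(isa, table, method_index):
    return side_table[method_index]


entries = [lookup(isa, class_method_table, i, framework_resolver)
           for i in range(3)]
```

The caller only ever sees a method entry; whether it came from the
vtable or from the framework's private mechanism is hidden behind the
tag check.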

> A sort of #4 is that this function could be exported by the class definition.
> It would be passed an isa and a method index and could resolve it however it likes.

Yeah, let's call that option #4. I like that from an ABI perspective, but didn't go there
because it seems strictly worse than #1 in terms of runtime cost.

-Andy

>> Cost - client code size: Worst case 3 instructions to dispatch vs 1
>> instruction for approach #1. Method lookups can be combined, so groups
>> of calls will be more compact.
>> 
>> Cost - library size: the offset tables themselves need to be
>> materialized on the framework side. I believe this can be done
>> statically in read-only memory, but that needs to be verified.
>> 
>> ABI: The offset table format and tag bit are baked into the ABI.
>> 
>> ---
>> #3b. (lazy resolution) Offset tables can be completely localized.
>> 
>> method_index = immediate
>> { // common per-class method lookup
>>   isa = load[obj]
>>   offset = load[@local_class_method_table + method_index]
>>   if (!isInitializedOffset(offset)) {
>>     offset = @resolveMethodOffset(@class_id, method_index)
>>     store[@local_class_method_table + method_index] = offset
>>   }
>>   if (isVtableOffset(offset))
>>     method_entry = load[isa + offset]
>>   else
>>     method_entry = @resolveMethodAddress(isa, @class_id, method_index)
>> }
>> call method_entry
> 
> The size of @local_class_method_table is not statically knowable.
> Fortunately, it doesn't matter, because this mechanism does not actually
> care about the table being contiguous; the lookup function could be
> passed a per-method cache variable.  This would also allow the lookup
> function to be shared between classes.
> 
> John.
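
A minimal Python model of the per-method cache variant described above,
under stated assumptions: each call target gets its own hypothetical
cache cell in the client, the cell carries the class id and method
index, and the same low-bit tag convention (an assumption, not part of
the proposal) marks non-vtable methods. The two resolver functions stand
in for the class-exported entry points.

```python
# Toy model of #3b with a per-method cache variable instead of a
# contiguous local table. Names and the tag encoding are hypothetical.

UNINITIALIZED = None
NON_VTABLE_TAG = 1  # assumed: low bit set marks a non-vtable method


class MethodCache:
    """One cache cell per (class, method) pair, emitted in the client."""

    def __init__(self, class_id, method_index):
        self.class_id = class_id
        self.method_index = method_index
        self.offset = UNINITIALIZED


def lookup(isa, cache, resolve_method_offset, resolve_method_address):
    """Lookup helper that can be shared between classes, because the
    cache cell carries the class id and method index."""
    if cache.offset is UNINITIALIZED:
        # First use: ask the defining framework, then remember the answer.
        cache.offset = resolve_method_offset(cache.class_id, cache.method_index)
    if (cache.offset & NON_VTABLE_TAG) == 0:
        return isa[cache.offset]
    return resolve_method_address(isa, cache.class_id, cache.method_index)


# Stand-ins for the two entry points exported by the defining framework.
offset_calls = []


def resolve_offset(class_id, method_index):
    offset_calls.append((class_id, method_index))
    return {("Shape", 0): 2, ("Shape", 1): 1}[(class_id, method_index)]


def resolve_address(isa, class_id, method_index):
    return "impl_perimeter_side"


isa = ["header0", "header1", "impl_area"]
area_cache = MethodCache("Shape", 0)
perimeter_cache = MethodCache("Shape", 1)

first = lookup(isa, area_cache, resolve_offset, resolve_address)
second = lookup(isa, area_cache, resolve_offset, resolve_address)
calls_after_area = len(offset_calls)
side = lookup(isa, perimeter_cache, resolve_offset, resolve_address)
```

Because no table is indexed, the client never needs to know how many
methods the class has; it only needs one cell per method it calls.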
> 
>> ABI: This avoids exposing the offset table format as ABI. All that's
>> needed is a symbol for the class, a single entry point for method
>> offset resolution, and a single entry point for non-vtable method
>> resolution.
>> 
>> Benefit: The library no longer needs to statically materialize
>> tables. Instead they are initialized lazily in each client module.
>> 
>> Cost: Lazy initialization of local tables requires an extra check and
>> burns some code size.
>> 
>> ---
>> Caveat:
>> 
>> This is the first time I've thought through approach #3, and it hasn't
>> been discussed, so there are likely a few things I'm missing at the
>> moment.
>> 
>> ---
>> Side Note:
>> 
>> Regardless of the resilient dispatch mechanism, within a module the
>> dispatch mechanism should be implemented with thunks to avoid type
>> checking classes from other files and improve compile time in non-WMO
>> builds, as Slava requested.
>> 
>> -Andy