[swift-dev] Resilient dynamic dispatch ABI. Notes and mini-proposal.
atrick at apple.com
Fri Feb 3 18:06:51 CST 2017
> On Feb 3, 2017, at 3:12 PM, John McCall <rjmccall at apple.com> wrote:
> I think we can generalize this discussion a bit by describing some mostly-independent axes of variation:
That's great (sorry I used up the arabic+letter naming convention earlier in the thread)...
> I. A call site:
> I1) inlines the lookup
> I2) calls a function that performs the lookup and returns a function pointer
> I3) calls a function that performs the lookup and calls the function
> I3 minimizes code size at the call site, but it prevents the lookup (or components thereof) from being hoisted. I1 requires at least some details of the lookup to be ABI.
> II. The function that performs the lookup:
> II1) is emitted lazily
> II2) is emitted eagerly
> II2 can use static information to implement the lookup inline without affecting the ABI. II1 cannot; it must either rely on ABI assumptions or just recursively use another strategy, in which case the function is probably just a call-site code-size optimization.
Sure, but I was thinking of it as client side vs. class-definition
side. My suggestions explored the tradeoff between II1 and II2, where
the client code could optionally optimize for vtable dispatch by
emitting some lazy lookup helpers with a fall-back to the eagerly
> III. The function that performs the lookup:
> III1) is method-specific
> III2) is class-specific
> III3) is part of the language runtime
> III1 requires a lot more functions, but if emitted with the class definition, it allows the lookup to be statically specialized based on the method called, so e.g. if a non-open method is never overridden, it can resolved to a specific function pointer immediately, possibly by just aliasing a symbol to the function definition. III3 minimizes the number of functions required but doesn't allow us to evolve dispatch beyond what's supported by the current language runtime and doesn't allow class-specific optimizations (e.g. if the class definition statically knows the offset to its v-table).
III1 is the sort of optimization that can be selectively added later
if we're willing to version the ABI.
I suppose by that logic we should start with III3. We could never
ditch the runtime/compiler support, but newer code could move toward
III2 through ABI versioning.
I actually prefer to start with III2 because the static overhead is
not significantly worse than III3 and it can be optimized without ABI
> IV. The function that performs the lookup:
> IV1) is parameterized by an isa
> IV2) is not parameterized by an isa
> IV1 allows the same function to be used for super-dispatch but requires extra work to be inlined at the call site (possibly requiring a chain of resolution function calls).
In my first message I was trying to accomplish IV1. But IV2 is simpler
and I can't see a fundamental advantage to IV1. Why would it need a
> V. For any particular function or piece of information, it can be accessed:
> V1) directly through a symbol
> V2) through a class-specific table
> V3) through a hierarchy-specific table (e.g. the class object)
> V1 requires more global symbols, especially if the symbol is per-method, but doesn't have any index-computation problems, and it's generally a bit more efficient.
> V2 allows stable assignment of fixed indexes to entries because of availability-sorting.
> V3 does not; it requires some ability to (at least) slide indexes of entries because of changes elsewhere in the hierarchy.
> If there are multiple instantiations of a table (perhaps with different information, like a v-table), V2 and V3 can be subject to table bloat.
I had proposed V2 as an option, but am strongly leaning toward V1 for
ABI simplicity and lower static costs (why generate vtables and offset
> So I think your alternatives were:
> 1. I3, II2, III1, IV2, V1 (for the dispatch function): a direct call to a per-method global function that performs the dispatch. We could apply V2 to this to decrease the number of global symbols required, but at the cost of inflating the call site and requiring a global variable whose address would have to be resolved at load time. Has an open question about super dispatch.
> 2. I1, V3 (for the v-table), V1 (for the global offset): a load of a per-method global variable giving an offset into the v-table. Joe's suggestion adds a helper function as a code-size optimization that follows I2, II1, III1, IV2. Again, we could also use V2 for the global offset to reduce the symbol-table costs.
> 3. I2, II2, III2, IV1, V2 (for the class offset / dispatch mechanism table). At least I think this is right? The difference between 3a and 3b seems to be about initialization, but maybe also shifts a lot of code-generation to the call site?
I'll pick the following option as a starting point because it constrains the ABI the least in
terms of static costs and potential directions for optimization:
"I2; (II1+II2); III2; IV1; V1"
method_entry = resolveMethodAddress_ForAClass(isa, method_index, &vtable_offset)
(where both modules would need to opt into the vtable_offset.)
I think any alternative would need to be demonstrably better in terms of code size or dynamic dispatch cost.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the swift-dev