[swift-dev] Resilient dynamic dispatch ABI. Notes and mini-proposal.

John McCall rjmccall at apple.com
Fri Feb 3 12:58:27 CST 2017

> On Feb 2, 2017, at 9:57 PM, Andrew Trick via swift-dev <swift-dev at swift.org> wrote:
> I'm following up on a resilient dynamic dispatch discussion kicked off by
> Slava during a performance team meeting to summarize some key
> points on public [swift-dev].
> It's easy to get sidetracked by the details of dynamic
> dispatch and various ways to generate code. I suggest approaching the
> problem by focusing on the ABI aspects and flexibility the ABI affords
> for future optimization. I'm including a proposal for one specific
> approach (#3) that wasn't discussed yet.
> ---
> #1. (thunk export) The simplest, most flexible way to expose dispatch
> across resilience boundaries is by exporting a single per-method entry
> point. Future compilers could improve dispatch and gradually expose
> more ABI details.

Probably the most important such case is that many of these "dispatch" symbols
for a non-open class could simply be aliases to the actual method definition.

Note that open classes have to support super-dispatch, not just ordinary dynamic
dispatch; that's something that needs to be discussed at the same time, since it
may affect the trade-offs.  #1 has some serious disadvantages here; the others
are likely fine, since their mechanisms are all easily parameterized by the isa.
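
To make the super-dispatch point concrete, here is a minimal C sketch. The
metadata layout and every name in it (ClassMetadata, Base_describe_dispatch,
Base_describe_lookup) are hypothetical stand-ins, not Swift's actual runtime
structures. It contrasts a #1-style exported thunk, which can only dispatch
on the object's own isa, with a lookup that takes the isa as a parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout for illustration only; not Swift's real metadata. */
typedef struct ClassMetadata ClassMetadata;
typedef struct Object { const ClassMetadata *isa; } Object;
typedef int (*Method)(Object *self);

struct ClassMetadata {
    const ClassMetadata *superclass;
    Method vtable[1];               /* slot 0: describe() */
};

static int Base_describe(Object *self)    { (void)self; return 1; }
static int Derived_describe(Object *self) { (void)self; return 2; }

static const ClassMetadata BaseMeta    = { NULL,      { Base_describe } };
static const ClassMetadata DerivedMeta = { &BaseMeta, { Derived_describe } };

/* Option #1: the only exported symbol for the method. How it finds the
   implementation stays private to the defining library, but it always
   dispatches on the object's own isa. */
int Base_describe_dispatch(Object *self) {
    return self->isa->vtable[0](self);
}

/* A lookup parameterized by an explicit isa serves both cases: ordinary
   dispatch passes self->isa, super-dispatch passes the superclass. */
Method Base_describe_lookup(const ClassMetadata *isa) {
    assert(isa != NULL);
    return isa->vtable[0];
}
```

With a Derived object d, Base_describe_dispatch(&d) can only reach the Derived
override, whereas Base_describe_lookup(d.isa->superclass)(&d) reaches the Base
implementation, which is what a super call needs.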

> Cost: We're forced to export all those symbols in perpetuity.
> [The cost of the symbols is questionable. The symbol trie should compress the
> names, so the size may be small, and they should be lazily resolved,
> so the startup cost should be amortized].
> ---
> #2. (offset export) An alternative approach was proposed by JoeG a
> while ago and revisited in the meeting yesterday. It involves a
> client-side vtable offset lookup helper.
> This allows more opportunity for micro-optimization on the client
> side. This exposes the isa-based vtable mechanism as ABI. However, it
> stops short of exposing the vtable layout itself. Guaranteeing vtable
> dispatch may become a problem in the future because it forces an
> explosion of metadata. It also has the same problem as #1 because the
> framework must export a per-method symbol for the dispatch
> offset. What's worse, the symbols need to be eagerly resolved (AFAIK).
> ---
> #3. (method index) This is an alternative that I've alluded to before,
> but was not discussed in yesterday's meeting. It makes a
> tradeoff between exporting symbols and exposing vtable layout. I want
> to focus on direct cost of the ABI support and flexibility of this
> approach vs. approach #1 without arguing over how to micro-optimize
> various dispatching schemes. Here's how it works:
> The ABI specifies a sort function for public methods that gives each
> one a per-class index. Version availability takes sort precedence, so
> public methods can be added without affecting other
> indices. [Apparently this is the same approach we're taking with
> witness tables].
> As with #2 this avoids locking down the vtable format for now--in the
> future we'll likely optimize it further. To avoid locking all methods
> into the vtable mechanism, the offset can be tagged. The alternative
> dispatch mechanism for tagged offsets will be hidden within the
> class-defining framework.

As a note, I went down a garden path here: I didn't realize at first that #3
wasn't an option on its own, but really just front matter for options #3a and #3b.
The paragraph about the sort, especially coming after #2, made me think that
you were talking about the option — let's call it #2b — where the class only
exports an offset to the start of its v-table and an availability-sort allows the
caller to hard-code a fixed offset to add to that.  This greatly cuts down on the
number of symbols required to be exported, but of course it does force all
the methods to actually end up in the v-table.
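
A rough C sketch of what #2b might look like, under assumed names and an
assumed metadata layout (ShapeMetadata, Shape_vtable_start, and
lookup_fixed_slot are all invented for illustration). The library exports one
value per class, the byte offset of the first v-table slot, and the client
adds an index that the availability sort has fixed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Object { const void *isa; } Object;
typedef int (*Method)(Object *self);

/* Library side (hypothetical): private fields precede the v-table, and
   the client never learns how many there are. */
typedef struct ShapeMetadata {
    const void *private_header[3];
    Method methods[2];
} ShapeMetadata;

static int Shape_area(Object *self)      { (void)self; return 10; }
static int Shape_perimeter(Object *self) { (void)self; return 20; }

static const ShapeMetadata ShapeMeta = {
    { NULL, NULL, NULL }, { Shape_area, Shape_perimeter }
};

/* The single exported value for the class. */
const size_t Shape_vtable_start = offsetof(ShapeMetadata, methods);

/* Client side: indices assumed to be fixed by the ABI's sort order. */
enum { SHAPE_AREA = 0, SHAPE_PERIMETER = 1 };

static Method lookup_fixed_slot(const Object *obj, size_t vtable_start,
                                size_t index) {
    assert(obj != NULL && obj->isa != NULL);
    const char *first_slot = (const char *)obj->isa + vtable_start;
    return ((const Method *)first_slot)[index];
}
```

Because the client indexes straight into the table, every public method has to
occupy a real slot, which is the cost noted above.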

> This avoids the potential explosion of exported symbols--it's limited
> to one per public class. It avoids explosion of metadata by allowing
> alternative dispatch for some subset of methods. These tradeoffs can
> be explored in the future, independent of the ABI.
> ---
> #3a. (offset table export) A single per-class entry point provides a
> pointer to an offset table. [It can be optionally cached on the client
> side].

Why would this need to be a function?  Just to allow its resolution to be lazy?
It seems to me that the class definition always knows the size of this table.

>  method_index = immediate
>  { // common per-class method lookup
>    isa = load[obj]
>    isa = isa & @isa_mask
>    offset = load[@class_method_table + method_index]
>    if (isVtableOffset(offset))
>      method_entry = load[isa + offset]
>    else
>      method_entry = @resolveMethodAddress(isa, @class_method_table, method_index)
>  }
>  call method_entry

I hope the proposal is that the stuff in braces is a locally-emitted function,
not something inlined into every call site.

A sort of #4 is that this function could be exported by the class definition.
It would be passed an isa and a method index and could resolve it however it likes.
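
As an illustration, the quoted lookup could be written as a single
out-of-line C function taking an isa and a method index. Everything here is
an assumption made for the sketch: the low-bit tag, the Metadata layout, and
the names Widget_lookup and resolve_method_address. The isa masking step from
the pseudocode is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Object { const void *isa; } Object;
typedef int (*Method)(Object *self);
typedef struct Metadata { const void *header; Method vtable[1]; } Metadata;

/* Assumed tag: a set low bit marks an entry that is not a v-table offset. */
#define NON_VTABLE_TAG ((uintptr_t)1)

static int Widget_draw(Object *self)  { (void)self; return 7; }
static int Widget_debug(Object *self) { (void)self; return 9; } /* no slot */

static const Metadata WidgetMeta = { NULL, { Widget_draw } };

/* Offset table: one entry per public method index. */
static const uintptr_t Widget_method_table[2] = {
    offsetof(Metadata, vtable),   /* index 0: byte offset from the isa */
    NON_VTABLE_TAG,               /* index 1: resolved by the library  */
};

static int is_vtable_offset(uintptr_t entry) {
    return (entry & NON_VTABLE_TAG) == 0;
}

/* Library-private fallback, standing in for @resolveMethodAddress. */
static Method resolve_method_address(const void *isa, const uintptr_t *table,
                                     size_t index) {
    (void)isa; (void)table;
    return index == 1 ? Widget_debug : NULL;
}

/* The contents of the braces as one function rather than inline code. */
Method Widget_lookup(const void *isa, size_t method_index) {
    assert(method_index < 2);
    uintptr_t entry = Widget_method_table[method_index];
    if (is_vtable_offset(entry))
        return *(const Method *)((const char *)isa + entry);
    return resolve_method_address(isa, Widget_method_table, method_index);
}
```

If the client emits this function locally, the table format and tag bit are
ABI. If the class definition exports it, as in the #4 variant, both can stay
private to the library.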

> Cost - client code size: Worst case 3 instructions to dispatch vs 1
> instruction for approach #1. Method lookups can be combined, so groups
> of calls will be more compact.
> Cost - library size: the offset tables themselves need to be
> materialized on the framework side. I believe this can be done
> statically in read-only memory, but that needs to be verified.
> ABI: The offset table format and tag bit are baked into the ABI.
> ---
> #3b. (lazy resolution) Offset tables can be completely localized.
>  method_index = immediate
>  { // common per-class method lookup
>    isa = load[obj]
>    offset = load[@local_class_method_table + method_index]
>    if (!isInitializedOffset(offset)) {
>      offset = @resolveMethodOffset(@class_id, method_index)
>      store[@local_class_method_table + method_index] = offset
>    }
>    if (isVtableOffset(offset))
>      method_entry = load[isa + offset]
>    else
>      method_entry = @resolveMethodAddress(isa, @class_id, method_index)
>  }
>  call method_entry

The size of @local_class_method_table is not statically knowable.
Fortunately, it doesn't matter, because this mechanism does not actually
care about the table being contiguous; the lookup function could be
passed a per-method cache variable.  This would also allow the lookup
function to be shared between classes.
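
A minimal C sketch of the per-method cache variant, with assumptions stated
up front: the names (lookup_cached, resolve_method_offset, Point_y_cache) are
hypothetical, an offset of zero is taken to mean "not yet resolved", the
tagged non-v-table path is omitted, and thread safety is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Object { const void *isa; } Object;
typedef int (*Method)(Object *self);
typedef struct Metadata { const void *header; Method vtable[2]; } Metadata;

/* Assumes a real offset is never 0 because a header precedes the slots. */
#define UNRESOLVED ((uintptr_t)0)

static int Point_x(Object *self) { (void)self; return 3; }
static int Point_y(Object *self) { (void)self; return 4; }
static const Metadata PointMeta = { NULL, { Point_x, Point_y } };

static int resolve_calls = 0;   /* counts trips into the library */

/* Library side, standing in for @resolveMethodOffset. */
static uintptr_t resolve_method_offset(const void *class_id,
                                       size_t method_index) {
    (void)class_id;
    resolve_calls++;
    return offsetof(Metadata, vtable) + method_index * sizeof(Method);
}

/* Client side: one cache variable per method the client actually calls,
   so no table size needs to be known. */
static uintptr_t Point_y_cache = UNRESOLVED;

/* Shared by all classes: nothing here is specific to Point. */
static Method lookup_cached(const Object *obj, uintptr_t *cache,
                            const void *class_id, size_t method_index) {
    assert(cache != NULL);
    uintptr_t offset = *cache;
    if (offset == UNRESOLVED) {
        offset = resolve_method_offset(class_id, method_index);
        *cache = offset;        /* later calls skip the resolver */
    }
    return *(const Method *)((const char *)obj->isa + offset);
}
```

Calling lookup_cached twice with &Point_y_cache enters the resolver only on
the first call, and the same function can serve any other class and method by
passing a different cache variable.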


> ABI: This avoids exposing the offset table format as ABI. All that's
> needed is a symbol for the class, a single entry point for method
> offset resolution, and a single entry point for non-vtable method
> resolution.
> Benefit: The library no longer needs to statically materialize
> tables. Instead they are initialized lazily in each client module.
> Cost: Lazy initialization of local tables requires an extra check and
> burns some code size.
> ---
> Caveat:
> This is the first time I've thought through approach #3, and it hasn't
> been discussed, so there are likely a few things I'm missing at the
> moment.
> ---
> Side Note:
> Regardless of the resilient dispatch mechanism, within a module the
> dispatch mechanism should be implemented with thunks to avoid type
> checking classes from other files and improve compile time in non-WMO
> builds, as Slava requested.
> -Andy