<html><head><meta http-equiv="Content-Type" content="text/html charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div><blockquote type="cite" class=""><div class="">On Feb 3, 2017, at 7:06 PM, Andrew Trick <<a href="mailto:atrick@apple.com" class="">atrick@apple.com</a>> wrote:</div><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><blockquote type="cite" class=""><div class="">On Feb 3, 2017, at 3:12 PM, John McCall <<a href="mailto:rjmccall@apple.com" class="">rjmccall@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">I think we can generalize this discussion a bit by describing some mostly-independent axes of variation:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div class=""><br class=""></div><div class="">That's great (sorry I used up the arabic+letter naming convention earlier in the thread)...</div><div class=""><br class=""></div><blockquote type="cite" class=""><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">I. A call site:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> I1) inlines the lookup</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> I2) calls a function that performs the lookup and returns a function pointer</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> I3) calls a function that performs the lookup and calls the function</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">I3 minimizes code size at the call site, but it prevents the lookup (or components thereof) from being hoisted. I1 requires at least some details of the lookup to be ABI.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div class=""><br class=""></div>Exactly.</div><div class=""><br class=""><blockquote type="cite" class=""><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">II. The function that performs the lookup:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">II1) is emitted lazily</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">II2) is emitted eagerly</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">II2 can use static information to implement the lookup inline without affecting the ABI. II1 cannot; it must either rely on ABI assumptions or just recursively use another strategy, in which case the function is probably just a call-site code-size optimization.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div class=""><br class=""></div><div class="">Sure, but I was thinking of it as client side vs. class-definition</div><div class="">side. My suggestions explored the tradeoff between II1 and II2, where</div><div class="">the client code could optionally optimize for vtable dispatch by</div><div class="">emitting some lazy lookup helpers with a fall-back to the eagerly</div><div class="">emitted lookup.</div></div></div></div></blockquote><div><br class=""></div>That's how I see it, too.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><blockquote type="cite" class=""><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">III. The function that performs the lookup:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> III1) is method-specific</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> III2) is class-specific</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> III3) is part of the language runtime</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">III1 requires a lot more functions, but if emitted with the class definition, it allows the lookup to be statically specialized based on the method called, so e.g. if a non-open method is never overridden, it can resolved to a specific function pointer immediately, possibly by just aliasing a symbol to the function definition. III3 minimizes the number of functions required but doesn't allow us to evolve dispatch beyond what's supported by the current language runtime and doesn't allow class-specific optimizations (e.g. if the class definition statically knows the offset to its v-table).</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div class=""><br class=""></div><div class="">III1 is the sort of optimization that can be selectively added later</div><div class="">if we're willing to version the ABI.</div></div></div></div></blockquote><div><br class=""></div>Well, at the cost of leaving us with a legacy exported function.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><div class="">I suppose by that logic we should start with III3. We could never</div><div class="">ditch the runtime/compiler support, but newer code could move toward</div><div class="">III2 through ABI versioning.</div></div></div></div></blockquote><div><br class=""></div>It would depend on whether we build features on top of this. For example,</div><div>I could see a reflection system assuming that all methods could be dispatched</div><div>via the global runtime function, if we had one.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><div class="">I actually prefer to start with III2 because the static overhead is</div><div class="">not significantly worse than III3 and it can be optimized without ABI</div><div class="">migration.</div></div></div></div></blockquote><div><br class=""></div>If it's between III2 and III3, yeah, I think I agree.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><blockquote type="cite" class=""><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">IV. The function that performs the lookup:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> IV1) is parameterized by an isa</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> IV2) is not parameterized by an isa</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">IV1 allows the same function to be used for super-dispatch but requires extra work to be inlined at the call site (possibly requiring a chain of resolution function calls).</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div class=""><br class=""></div><div class="">In my first message I was trying to accomplish IV1. But IV2 is simpler</div><div class="">and I can't see a fundamental advantage to IV1.</div></div></div></div></blockquote><div><br class=""></div><div>Well, you can use IV1 to implement super dispatch (+ sibling dispatch, if we add it)</div><div>by passing in the isa of either the superclass or the current class. IV2 means</div><div>that the dispatch function is always based on the isa from the object, so those</div><div>dispatch schemes need something else to implement them.</div><br class=""><blockquote type="cite" class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><div class=""> Why would it need a lookup chain?</div></div></div></blockquote><div><br class=""></div>Code size, because you might not want to inline the isa load at every call site.</div><div>So, for a normal dispatch, you'd have an IV2 function (defined client-side?)</div><div>that just loads the isa and calls the IV1 function (defined by the class).</div><div><div><br class=""></div><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><blockquote type="cite" class=""><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">V. For any particular function or piece of information, it can be accessed:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> V1) directly through a symbol</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> V2) through a class-specific table</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class=""> V3) through a hierarchy-specific table (e.g. the class object)</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">V1 requires more global symbols, especially if the symbol is per-method, but doesn't have any index-computation problems, and it's generally a bit more efficient.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">V2 allows stable assignment of fixed indexes to entries because of availability-sorting.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">V3 does not; it requires some ability to (at least) slide indexes of entries because of changes elsewhere in the hierarchy.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">If there are multiple instantiations of a table (perhaps with different information, like a v-table), V2 and V3 can be subject to table bloat.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""></div></blockquote><div class=""><br class=""></div><div class="">I had proposed V2 as an option, but am strongly leaning toward V1 for</div><div class="">ABI simplicity and lower static costs (why generate vtables and offset</div><div class="">tables?)</div></div></div></div></blockquote><div><br class=""></div><div>V1 doesn't remove the need for tables, it just hides them from the ABI.</div><br class=""><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><blockquote type="cite" class=""><div class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">So I think your alternatives were:</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">1. I3, II2, III1, IV2, V1 (for the dispatch function): a direct call to a per-method global function that performs the dispatch. We could apply V2 to this to decrease the number of global symbols required, but at the cost of inflating the call site and requiring a global variable whose address would have to be resolved at load time. Has an open question about super dispatch.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">2. I1, V3 (for the v-table), V1 (for the global offset): a load of a per-method global variable giving an offset into the v-table. Joe's suggestion adds a helper function as a code-size optimization that follows I2, II1, III1, IV2. Again, we could also use V2 for the global offset to reduce the symbol-table costs.</span><br style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">3. I2, II2, III2, IV1, V2 (for the class offset / dispatch mechanism table). At least I think this is right? The difference between 3a and 3b seems to be about initialization, but maybe also shifts a lot of code-generation to the call site?</span></div></blockquote></div><br class=""><div class=""><div class="">I'll pick the following option as a starting point because it constrains the ABI the least in</div><div class="">terms of static costs and potential directions for optimization:</div><div class=""><br class=""></div><div class="">"I2; (II1+II2); III2; IV1; V1"</div><div class=""><br class=""></div><div class="">method_entry = resolveMethodAddress_ForAClass(isa, method_index, &vtable_offset)</div><div class=""><br class=""></div><div class="">(where both modules would need to opt into the vtable_offset.)</div></div></div></div></blockquote><div><br class=""></div>Wait, remind me what this &vtable_offset is for at this point? Is it basically just a client-side cache? I can't figure out what it's doing for us.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><div class=""><div class="">I think any alternative would need to be demonstrably better in terms of code size or dynamic dispatch cost.</div></div></div></div></blockquote><div><br class=""></div>That's a lot of stuff to materialize at every call site. It makes calls</div><div>into something like a 10 instruction sequence on ARM64, ignoring</div><div>the actual formal arguments:</div><div><br class=""></div><div><font face="Menlo" class=""> %raw_isa = load %object // 1 instruction</font></div><div><font face="Menlo" class=""> %isa_mask = load @swift_isaMask // 3: 2 to materialize address from GOT (not necessarily with ±1MB), 1 to load from it</font></div><div><font face="Menlo" class=""> %isa = and %raw_isa, %isa_mask // 1</font></div><div><font face="Menlo" class=""> %method_index = 13 // 1</font></div><div><font face="Menlo" class=""> %cache = @local.A.foo.cache // 2: not necessarily within ±1MB</font></div><div><font face="Menlo" class=""> %method = call @A.resolveMethod(%isa, %method_index, %cache) // 1</font></div><div><font face="Menlo" class=""> call %method(...) // 1</font></div><div><br class=""></div><div>On x86-64, it'd just be 8 instructions because the immediate range for leaq/movq</div><div>is ±2GB, which is Good Enough for the standard code model, but of course it still</div><div>expands to roughly the same amount of code.</div><div><br class=""></div><div>Even without vtable_offset, it's a lot of code to inline.</div><div><br class=""></div><div>So we'd almost certainly want a client-side resolver function that handled</div><div>the normal case. Is that what you mean when you say II1+II2? So the local</div><div>resolver would be I2; II1; III2; IV2; V1, which leaves us with a three-instruction</div><div>call sequence, which I think is equivalent to Objective-C, and that function</div><div>would do this sequence:</div><div><font face="Menlo" class=""><br class=""></font></div><div><font face="Menlo" class="">define @local_resolveMethodAddress(%object, %method_index)</font></div><div><font face="Menlo" class=""> %raw_isa = load %object // 1 instruction</font></div><div><font face="Menlo" class=""> %isa_mask = load @swift_isaMask // 3: 2 to materialize address from GOT (not necessarily with ±1MB), 1 to load from it</font></div><div><font face="Menlo" class=""> %isa = and %raw_isa, %isa_mask // 1</font></div><div><font face="Menlo" class=""> %cache_table = @local.A.cache_table // 2: not necessarily within ±1MB</font></div><div><font face="Menlo" class=""> %cache = add %cache_table, %method_index * 8 // 1</font></div><div><font face="Menlo" class=""> tailcall @A.resolveMethod</font><span style="font-family: Menlo;" class="">(%isa, %method_index, %cache)</span><span style="font-family: Menlo;" class=""> // 1</span></div><div><br class=""></div><div>John.</div></body></html>