[swift-dev] Reconsidering the global uniqueness of type metadata and protocol conformance instances

Fri Jul 28 17:18:37 CDT 2017

> On Jul 28, 2017, at 2:53 PM, Michael Ilseman <milseman at apple.com> wrote:
> 
> 
>> On Jul 28, 2017, at 2:20 PM, Joe Groff via swift-dev <swift-dev at swift.org> wrote:
>> 
>> The Swift runtime currently maintains globally unique pointer identities for type metadata and protocol conformances. This makes checking type equivalence a trivial pointer equality comparison, but most operations on generic values do not really care about exact type identity and only need to invoke value or protocol witness methods or consult other data in the type metadata structure. I think it's worth reevaluating whether having globally unique type metadata objects is the correct design choice. Maintaining global uniqueness of metadata instances carries a number of costs. Any code that wants type metadata for an instance of a generic type, even a fully concrete one, must make a potentially expensive runtime call to get the canonical metadata instance. This also greatly complicates our ability to emit specializations of type metadata, value witness tables, or protocol witness tables for concrete instances of generic types, since specializations would need to be registered with the runtime as canonical metadata objects,
> 
> This seems pretty compelling
> 
>> and it would be difficult to do this lazily and still reliably favor specializations over more generic witnesses.
> 
> What do you mean by doing this lazily and favoring the specializations?

We've generally tried to make metadata instantiation as lazy as possible. If we have runtime-mediated unique metadata but the compiler can also generate specialized metadata candidates for concrete instances, then you either have to eagerly scan for and register specialized instances the first time you try to instantiate any instance of a generic type, or you keep it lazy and let the first metadata the runtime sees win, which runs the risk of an unspecialized, fully runtime-synthesized instance getting anointed as the official instance before any specialization can. To be fair, there's a potential mitigation here too since the value witness table is independent of the type metadata; we could keep the canonical metadata address stable and redirect the value witness table to a better candidate if we discover one. This is still all a lot more complex than forgoing the need for a canonical instance altogether.
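
To make the "first one wins" hazard concrete, here is a toy model of lazy registration. It is only a sketch: Origin, Candidate, and FirstWinsCache are hypothetical names made up for illustration, and none of this reflects the runtime's actual data structures.

```swift
// Hypothetical model: where a metadata candidate came from.
enum Origin { case runtimeSynthesized, compilerSpecialized }

// Hypothetical stand-in for one metadata candidate of a type.
struct Candidate {
    let typeName: String   // placeholder key, not a real mangled name
    let origin: Origin
}

// Lazy uniquing: whichever candidate is seen first becomes canonical.
final class FirstWinsCache {
    private var canonical: [String: Candidate] = [:]

    func canonicalize(_ candidate: Candidate) -> Candidate {
        if let existing = canonical[candidate.typeName] {
            return existing            // a later, better candidate is ignored
        }
        canonical[candidate.typeName] = candidate
        return candidate
    }
}

let cache = FirstWinsCache()
// Unspecialized code asks first, so the runtime-synthesized instance is anointed.
let first = cache.canonicalize(
    Candidate(typeName: "Foo<Int>", origin: .runtimeSynthesized))
// The specialized candidate shows up later and loses.
let second = cache.canonicalize(
    Candidate(typeName: "Foo<Int>", origin: .compilerSpecialized))
```

Avoiding that outcome while keeping uniqueness means either scanning for specialized candidates eagerly or adding some kind of redirection, as described above.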

> 
>> The lack of witness table specializations leaves an obnoxious performance cliff for instances of generic types that end up inside existential containers or cross into unspecialized code. The runtime also obligates binaries to provide the canonical metadata for all of their public types, along with all the dependent value witnesses, class methods, and protocol witness tables, meaning a type abstraction can never be completely "zero-cost" across modules.
>> 
> 
> Do you have some examples here to illustrate? E.g. if I pass an instance of concrete type to something taking a T:Hashable, how does that currently work vs how it would work with this change? Would this mean that I can just pass off a function pointer to the hashing function? Do I need some kind of id scheme if the callee might want to cast or do something else with it?

Today, given:

struct Foo<T>: Hashable { ... }

We'll generate one unspecialized protocol witness table forall T. Foo<T>: Hashable. The uniqueness of the witness table is significant since it may be part of the parameterization of a type like Dictionary with a Hashable type parameter. If you pass Foo<Int> as a U: Hashable to an unspecialized generic function, we'll still pass that set of fully unspecialized witnesses, so we get no benefit from knowing T == Int at runtime. If there weren't that uniqueness requirement on the witness table, then the compiler could synthesize a specialized witness table for Foo<Int>.
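
As a rough source-level picture of that call shape (a sketch only: the stored property, the T: Hashable constraint, and the function name hashThroughWitness are assumptions added so the example compiles on a current toolchain):

```swift
// Assumed body for Foo; the constraint on T lets the compiler
// synthesize the Hashable conformance.
struct Foo<T: Hashable>: Hashable {
    var value: T
}

// Stand-in for unspecialized generic code. Inside, hashValue is reached
// through whatever Hashable witness table the caller passes for U.
// @inline(never) only discourages inlining; the optimizer may still
// specialize a call like this when it can see both sides.
@inline(never)
func hashThroughWitness<U: Hashable>(_ x: U) -> Int {
    return x.hashValue
}

// The caller statically knows U == Foo<Int>.
let viaGeneric = hashThroughWitness(Foo(value: 42))

// Direct call in fully concrete code, for comparison.
let direct = Foo(value: 42).hashValue
```

Both paths compute the same hash; the difference is only in which witnesses get used to compute it.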

> 
>> On the other hand, if type metadata did not need to be unique, then the compiler would be free to emit specialized type metadata and protocol witness tables for fully concrete instances of generic value types without consulting the runtime. This would let us avoid runtime calls to fetch metadata in specialized code, and would make it much easier for us to implement witness specialization. It would also give us the ability to potentially extend the "inlinable" concept to public fragile types, making it a client's responsibility to emit metadata for the type when needed and keeping the type from affecting its home module's ABI. This could significantly reduce the size and ABI surface area of the standard library, since the standard library contains a lot of generic lightweight adapter types for collections and other abstractions that are intended to be optimized away in most use cases.
>> 
>> There are of course benefits to globally unique metadata objects that we would lose if we gave up uniqueness. Operations that do check type identity, such as comparison, hashing, and dynamic casting, would have to perform more expensive checks, and nonunique metadata objects would need to carry additional information to enable those checks.
> 
> How do you think this will work? Do you think we will want a way to go from a non-unique type metadata to some kind of canonical, uniqued type metadata? Would it make sense to key this off of a mangled name?

Adding a pointer to the mangled type string or some other unique identifier to the witness table seems workable to me.
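
A minimal sketch of what an identity check could then look like, assuming each nonunique record carries such an identifier. MetadataRecord and sameType are hypothetical names, and the identifier strings are placeholders rather than real mangled names:

```swift
// Hypothetical stand-in for a nonunique metadata record.
final class MetadataRecord {
    let uniqueName: String   // assumed unique identifier for the type
    init(uniqueName: String) { self.uniqueName = uniqueName }
}

func sameType(_ a: MetadataRecord, _ b: MetadataRecord) -> Bool {
    // Fast path: the same record is trivially the same type.
    if a === b { return true }
    // Slow path: distinct records may still describe the same type,
    // so fall back to comparing the identifiers.
    return a.uniqueName == b.uniqueName
}

// Two copies of "the same" metadata, as if emitted by two different modules.
let copyA = MetadataRecord(uniqueName: "Foo<Int>")
let copyB = MetadataRecord(uniqueName: "Foo<Int>")
let other = MetadataRecord(uniqueName: "Foo<String>")

let copiesMatch = sameType(copyA, copyB)   // distinct pointers, same identifier
let othersMatch = sameType(copyA, other)
```

Pointer equality still works as a cheap positive check; only a pointer mismatch pays for the string comparison.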

> 
>> It is likely that class objects would have to remain globally unique, if for no other reason than that the Objective-C runtime requires it on Apple platforms. Having multiple equivalent copies of type metadata has the potential to increase the working set of an app in some situations, although it's likely that redundant compiler-emitted copies of value type metadata would at least be able to live in constant pages mapped from disk instead of getting dynamically instantiated by the runtime like everything is today. There could also be subtle source-breaking behavior for code that bitcasts metatype values to integers or pointers and expects bit-level equality to indicate type equality.
> 
> This sounds very niche, and I don’t think we have promised any kind of stability or compatibility in that area.

Sure, but it is something that people could conceivably rely on that we would break.
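
The kind of code I have in mind looks roughly like this (a sketch; rawBits is a made-up helper name, and the bit-level result is only meaningful under today's unique metadata):

```swift
// Reinterprets a metatype value as an integer. Any.Type is used so the
// value is a single pointer-sized word, matching the size of UInt.
func rawBits(of metatype: Any.Type) -> UInt {
    return unsafeBitCast(metatype, to: UInt.self)
}

// Code like this assumes equal bits mean equal types, which holds only
// as long as metadata is globally unique.
let sameByBits = rawBits(of: Int.self) == rawBits(of: Int.self)

// Comparing the metatypes themselves, or going through ObjectIdentifier,
// expresses the same intent without reinterpreting the bits by hand.
let sameByEquality = (Int.self == Int.self)
let sameByIdentifier = ObjectIdentifier(Int.self) == ObjectIdentifier(Int.self)
```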

-Joe