[swift-dev] Implementation of swift's value types

Thu Jul 14 19:17:28 CDT 2016

> On Jul 14, 2016, at 4:04 PM, Johannes Neubauer <neubauer at kingsware.de> wrote:
> 
> Hi Arnold,
> 
> Thank you and atrick at apple.com very much for your answers. That already helps a lot. Are you Arnold from the WWDC video?

Yes.

> I removed all the answers that are clear and have some follow-up questions inline.
> 
>>> On 14 Jul 2016, at 23:02, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
>>> 
>>> 3. Then Arnold says that four existential containers pointing to the same value (greater than the value buffer) use 4 heap allocations. He proposes copy-on-write (using a local storage class), but **does he mean this should be implemented by hand or is this an optimization that the swift compiler does on its own?**
>> 
>> No, the Swift compiler does not do this on its own.
> 
> Are there corresponding plans? Storing „big“ values (using some statistics to evaluate where the break-even is) as reference types always, storing them in a big table per type (each value occurs only once) should be possible, right? Is this something I should post on the swift-evolution mailing list?

No hashed-out plans, only hallway discussions. The Swift team is focused on finishing up Swift 3.

But yes, similar ideas are being discussed.

> 
>> 
>>> The issue here is, that the switch between "swift subscript" for showing an abstraction of internals and real swift code that one should write is sometimes not clear.
>> 
>> Sorry about this.
> 
> This was meant as positive criticism. I really like the video!
> 
>> 
>>> Doing this by hand has some clear disadvantages as this would add a reference to each usage of `Line` (and reference counting) even in the first examples of Kyle.
>> 
>> Yes, it is a tradeoff. Either you accept that copies of large value types are expensive in the type-erased context (as an instance of a protocol value), while the non-generic, non-existential concrete context stays fast.
> 
> Is it really fast in the latter context for „big“ values?

I was being imprecise. I meant big plain-old-data (POD) types: a struct that contains only non-reference-counted values.

As soon as you have more than one reference in your struct, indirection will typically be cheaper, just looking at reference counting.

This is a simplification that might overlook specific cases where this is not necessarily true. (The Swift compiler can flatten a struct into its individual member properties -- scalarize the struct -- and that can lead to better optimization ...)

> All collection types use copy-on-write. Especially if you have a lot of reference type properties, copying seems to be slow even in the latter context. And since all collection types (including String) are backed by a reference type, even properties of these types incur additional reference counting.

Yes. Correct.
> 
>> Or you make the type have indirect storage with copy on write to preserve value semantics and then you have an overhead for reference counting.
>> 
>> You can get the best of both worlds if you write a wrapper.
>> 
>> All of this comes at the cost of a lot of boilerplate code, and discussions are taking place on how to improve this situation.
> 
> Are there already proposals available or is it still in early discussions? Can you give me a hint where to find more about it?

No proposals at the moment.
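
For reference, the hand-written indirect storage with copy-on-write mentioned above looks roughly like the sketch below. The names LineStorage and Line are made up for illustration, only one property is shown, and a real type would repeat the same boilerplate for every property.

```swift
// Hypothetical example types; not taken from the standard library.
final class LineStorage {
    var x0: Double
    var y0: Double
    var x1: Double
    var y1: Double

    init(x0: Double, y0: Double, x1: Double, y1: Double) {
        self.x0 = x0
        self.y0 = y0
        self.x1 = x1
        self.y1 = y1
    }

    // Makes an independent copy of the stored fields.
    func copy() -> LineStorage {
        return LineStorage(x0: x0, y0: y0, x1: x1, y1: y1)
    }
}

struct Line {
    // Copying a Line copies only this one reference (one retain).
    private var storage: LineStorage

    init(x0: Double, y0: Double, x1: Double, y1: Double) {
        storage = LineStorage(x0: x0, y0: y0, x1: x1, y1: y1)
    }

    var x0: Double {
        get { return storage.x0 }
        set {
            // Copy the storage only if another Line value shares it.
            if !isKnownUniquelyReferenced(&storage) {
                storage = storage.copy()
            }
            storage.x0 = newValue
        }
    }
}

let a = Line(x0: 0, y0: 0, x1: 1, y1: 1)
var b = a            // shares the storage, no heap allocation
b.x0 = 5             // first write copies the storage; a is unaffected
print(a.x0, b.x0)    // 0.0 5.0
```

The cost is the one mentioned above: every Line now carries a reference, so even the concrete, non-existential uses pay for reference counting.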
> 
>> 
>>> 6. Is the Value-Witness-Table a (kind of) dictionary for all values of a given value type (bigger than the value buffer), so that you do not store any value twice on the heap, but share the reference? If this is the case the answer of *3.* should be *automatically*. But then again, the "manual" implementation of `String` and `Array` (see *4.*) makes no sense anymore, does it? Or are `Array` and `String` implemented only on the lower-level and the copy-on-write implementation is not visible in their Swift implementation?
>> 
>> 
>> I don’t fully understand the question.
>> 
>> The value-witness-table exists per type and contains functions that describe how to copy, allocate, and deallocate values of that type.
> 
> OK. I am still a little bit unsure what it is now. Andrew Trick says:
> 
>> A value witness table is a dictionary for all values of a type regardless of whether it fits in a buffer.
>> The keys are operations that can be done to any value (copy/destroy), the implementation for
>> that type knows where the value is stored.
> 

This is an abstract description that does not contradict mine.
Conceptually, the value witness table is a dictionary (mapping a value witness's function kind to its implementation). It is implemented as a block of memory (a table) that contains pointers to functions. At offset 2, say (a made-up number; I would have to look up the real offset), you will find the function that knows how to allocate memory for that type.
For an Int this function will just return a pointer into the inline value buffer.

For a struct of 4 integers this function will malloc memory on the heap, store the pointer in the inline value buffer, and return that pointer.
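
The allocate behavior described above can be modeled in plain Swift. This is only a conceptual sketch: ValueBuffer, ValueWitnessTable, and makeTable are hypothetical names that do not exist in the Swift runtime, the three-word buffer size is taken from the talk, and the real table holds more entries (copy, destroy, ...) at offsets not shown here.

```swift
// Conceptual model only; every name here is made up for illustration.
typealias ValueBuffer = (Int, Int, Int)   // three inline words

struct ValueWitnessTable {
    // Returns the address where a value of the type lives.
    let allocate: (UnsafeMutablePointer<ValueBuffer>) -> UnsafeMutableRawPointer
    // Releases whatever allocate set up.
    let deallocate: (UnsafeMutablePointer<ValueBuffer>) -> Void
}

// Builds the (hypothetical) table for a type of the given size.
func makeTable(size: Int, alignment: Int) -> ValueWitnessTable {
    let fitsInline = size <= MemoryLayout<ValueBuffer>.size
    return ValueWitnessTable(
        allocate: { buffer in
            if fitsInline {
                // Small value: it lives directly in the inline buffer.
                return UnsafeMutableRawPointer(buffer)
            }
            // Big value: allocate on the heap and remember the pointer
            // in the first word of the inline buffer.
            let heap = UnsafeMutableRawPointer.allocate(byteCount: size,
                                                        alignment: alignment)
            UnsafeMutableRawPointer(buffer)
                .storeBytes(of: heap, as: UnsafeMutableRawPointer.self)
            return heap
        },
        deallocate: { buffer in
            if !fitsInline {
                let heap = UnsafeMutableRawPointer(buffer)
                    .load(as: UnsafeMutableRawPointer.self)
                heap.deallocate()
            }
        }
    )
}

let word = MemoryLayout<Int>.size
let intTable = makeTable(size: word, alignment: word)
let fourIntsTable = makeTable(size: 4 * word, alignment: word)

var buffer: ValueBuffer = (0, 0, 0)
var intIsInline = false
var fourIntsIsInline = true
withUnsafeMutablePointer(to: &buffer) { buf in
    let inlineAddress = UnsafeMutableRawPointer(buf)
    intIsInline = intTable.allocate(buf) == inlineAddress
    fourIntsIsInline = fourIntsTable.allocate(buf) == inlineAddress
    fourIntsTable.deallocate(buf)
}
print(intIsInline, fourIntsIsInline)   // true false
```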

> In the example Arnold (you?) used the value witness table to copy the value (see slide 171 last line of code):
> 
> ```swift
> pwt.draw(vwt.projectBuffer(&local))
> ```
> 
> Since I didn’t know that the „storage“-trick is done manually, I thought there is a (hash)table/dictionary of all values of a „big“ value type where each value is uniquely stored (constant time access).

No, this is not how it works.

Buffers for big values are created when we create a protocol value. This code is not clever. It allocates memory on the heap and copies the value's storage to that heap memory.
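
One way to see which values need such a heap buffer is to compare type sizes with MemoryLayout. The sketch below uses made-up types (Point, Rect) and assumes the usual layout of a single-protocol existential container: a three-word value buffer plus a metadata pointer and one protocol witness table pointer.

```swift
protocol Drawable {
    func draw()
}

// Hypothetical example types.
struct Point: Drawable {
    var x = 0.0
    var y = 0.0
    func draw() {}
}

struct Rect: Drawable {
    var x = 0.0
    var y = 0.0
    var width = 0.0
    var height = 0.0
    func draw() {}
}

let word = MemoryLayout<Int>.size
let inlineBufferSize = 3 * word   // assumption: three-word value buffer

// Size of the protocol value itself, measured in words.
let containerWords = MemoryLayout<Drawable>.size / word
print(containerWords)             // 5 under the layout assumed above

// Point fits into the inline buffer; Rect does not, so creating a
// Drawable from a Rect allocates heap memory and copies the value there.
print(MemoryLayout<Point>.size <= inlineBufferSize)   // true
print(MemoryLayout<Rect>.size <= inlineBufferSize)    // false on 64-bit
```

Every assignment of a Rect to a variable of type Drawable, and every copy of that protocol value, goes through this allocate-and-copy path unless the type uses indirect storage.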

> For „small“ values it would just point to the concrete value of this very value. As mentioned above: should I post this on the swift-evolution mailing list?
> 
> All the best
> Johannes
>