[swift-dev] Implementation of swift's value types

Thu Jul 14 16:02:35 CDT 2016

Hi Johannes,

Thank you for your questions. Answers inline.

> On Jul 14, 2016, at 12:39 PM, Johannes Neubauer via swift-dev <swift-dev at swift.org> wrote:
> 
> Dear Devs,
> 
> I saw the WWDC 2016 video [Understanding Swift Performance][0] as well as some others regarding value types in Swift from WWDC 2015. I think there are a few ambiguities which make it hard both to decide which weapon to choose and to make proposals on how to evolve these implementations for the better (swift-evolution).
> 
> Is there a place, where low-level decisions in the language are documented? Is there an adequate place/forum where we can ask questions regarding low-level implementations?

I think this list is a good place to ask such questions.

> It would be great if you add a (moderated) comment section to each of the WWDC videos, so that we can discuss the contents (with transcript-like links) as well as an errata section containing a list of Apple-Approved mistakes/ambiguities in a given video/talk.
> 
> # So, here come my questions
> 
> In the talk [Understanding Swift Performance][0] Kyle says that value types are stored on the stack and copied. He uses a point and a line, which are both copied. Later on Arnold uses a similar example with protocols. Then the Existential Container is used, which either uses the value buffer (for small values like `Point`) or allocates some memory on the heap and adds a pointer to this value (e.g. for a `Line`):
> 
> 1. If I have an object (instance of a class) in a variable (or a container like an array) of a protocol type, will it be stored into an Existential Container, too? Or are reference types always stored as a reference (storing it in an Existential Container makes more sense to me).

Yes, the reference to the instance is stored inside the existential container. It fits into the three-word value buffer, so no out-of-line allocation for the existential container’s value buffer is necessary.
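
A small sketch of this case, assuming a hypothetical `Drawable` protocol and `Canvas` class (neither name is from the talk): both protocol-typed constants hold a reference to the same instance, so a change made to the instance is visible through each of them.

```swift
// Hypothetical protocol and class, used only for illustration.
protocol Drawable {
    var name: String { get }
}

final class Canvas: Drawable {
    var name: String
    init(name: String) { self.name = name }
}

let canvas = Canvas(name: "first")

// Each protocol-typed constant stores a reference to `canvas`;
// the instance itself is not copied.
let a: Drawable = canvas
let b: Drawable = canvas

canvas.name = "second"
print(a.name, b.name)
```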

> 2. If I use a variable of the concrete type (although it implements a protocol), will it always be copied (no matter its size) or does the compiler choose an existential container if it is greater than some given size (or perhaps even always, because it gives a good tradeoff?)

Assuming you are talking about value types like “struct Line”: yes, it will be copied. If you want to make the storage of a struct type indirect, you have to wrap it in a class (similar to the IndirectStorage example).
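
As a minimal illustration, assuming `Point` and `Line` definitions along the lines of the ones in the talk (the exact field names here are my assumption): assigning a plain struct creates an independent copy, so mutating the copy leaves the original untouched.

```swift
// Assumed shapes for Point and Line; field names are illustrative.
struct Point {
    var x: Double
    var y: Double
}

struct Line {
    var start: Point
    var end: Point
}

let original = Line(start: Point(x: 0, y: 0), end: Point(x: 1, y: 1))

// Assignment copies the whole struct, whatever its size.
var copy = original
copy.end.x = 5

print(original.end.x, copy.end.x)
```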

> 3. Then Arnold says that four existential containers pointing to the same value (greater than the value buffer) use 4 heap allocations. He proposes copy-on-write (using a local storage class), but **does he mean this should be implemented by hand or is this an optimization that the swift compiler does on its own?**

No, the Swift compiler does not do this on its own.

> The issue here is, that the switch between "swift subscript" for showing an abstraction of internals and real swift code that one should write is sometimes not clear.

Sorry about this.

> Doing this by hand has some clear disadvantages as this would add a reference to each usage of `Line` (and reference counting) even in the first examples of Kyle.

Yes, it is a tradeoff. Either you keep the plain struct, in which case copies are expensive for large value types in the type-erased context (as a value of protocol type) but fast in the concrete, non-generic/non-existential context. Or you give the type indirect storage with copy-on-write to preserve value semantics, and then you have the overhead of reference counting.
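
A hand-written version of the second option could look roughly like the sketch below. The names `LineStorage`, `makeUnique` and `sharesStorage(with:)` are illustrative assumptions, not code from the talk; the uniqueness check uses the standard library function `isKnownUniquelyReferenced(_:)`.

```swift
struct Point {
    var x: Double
    var y: Double
}

// Heap-allocated storage; hypothetical name.
final class LineStorage {
    var start: Point
    var end: Point
    init(start: Point, end: Point) {
        self.start = start
        self.end = end
    }
}

// Value type with indirect storage and copy-on-write.
struct Line {
    private var storage: LineStorage

    init(start: Point, end: Point) {
        storage = LineStorage(start: start, end: end)
    }

    var start: Point {
        get { return storage.start }
        set { makeUnique(); storage.start = newValue }
    }

    var end: Point {
        get { return storage.end }
        set { makeUnique(); storage.end = newValue }
    }

    // Copies the storage only if another Line value still refers to it.
    private mutating func makeUnique() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = LineStorage(start: storage.start, end: storage.end)
        }
    }

    // Helper for the demonstration only.
    func sharesStorage(with other: Line) -> Bool {
        return storage === other.storage
    }
}

let first = Line(start: Point(x: 0, y: 0), end: Point(x: 1, y: 1))
var second = first                       // copies one reference, no allocation
let sharedBeforeWrite = first.sharesStorage(with: second)

second.end = Point(x: 5, y: 5)           // first write triggers the copy
let sharedAfterWrite = first.sharesStorage(with: second)

print(sharedBeforeWrite, sharedAfterWrite, first.end.x, second.end.x)
```

The cost mentioned above is visible here: every `Line` now carries a reference, even in purely concrete code that never goes through a protocol type.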

You can get the best of both worlds if you write a wrapper.

All of this comes at the cost of a lot of boilerplate code, and discussions are taking place on how to improve this situation.

> Doing this as a compiler optimization would allow to use a struct in different scenarios and always the best tradeoff is used. Otherwise, I would perhaps even need to create two different types for different situations and choose it wisely. This would add a big burden on the developer.

There are things that we can do in the compiler (change the representation of generics and protocol values) that will improve the situation.

These would be future changes ...

> 4. If Arnold really means *manually* (see *3.*) and reference types are not stored in existential containers (see *1.*) the slides are wrong, because there an existential container is still used and the instance on the heap is named `Line` instead of `Line._storage`. So what is the case?

Yes, the suggestion was to do this manually. Reference types are stored in the existential container.

> 5. The implementations of `String` and `Array` seem to follow the copy-on-write strategy "manually", but I think they do that because this behavior is wanted even if the values would be copied automatically (if this is true, the answer for *3.* would be *manually*). Or am I wrong here?

The answer to 3. is manually.

> 6. Is the Value-Witness-Table a (kind of) dictionary for all values of a given value type (bigger than the value buffer), so that you do not store any value twice on the heap, but share the reference? If this is the case the answer of *3.* should be *automatically*. But then again, the "manual" implementation of `String` and `Array` (see *4.*) make no sense anymore, does it? Or are `Array` and `String` implemented only on the lower-level and the copy-on-write implementation is not visible in their Swift implementation?

I don’t fully understand the question.

The value-witness-table exists per type and contains functions that describe how to copy, allocate, and deallocate values of that type.
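
To make the "per type, not per value" point concrete, here is a purely conceptual model written in Swift. `ValueWitnessSketch` and `makeWitnessSketch(for:)` are made-up names, and the real table in the runtime has a different layout and more entries; the sketch only shows that the table holds operations for a type and no values.

```swift
// Conceptual model only; not the runtime's actual value witness table.
struct ValueWitnessSketch {
    let size: Int
    let alignment: Int
    let copy: (_ source: UnsafeRawPointer, _ destination: UnsafeMutableRawPointer) -> Void
    let destroy: (_ value: UnsafeMutableRawPointer) -> Void
}

// Builds one table per type T; it knows how to handle any value of T.
func makeWitnessSketch<T>(for type: T.Type) -> ValueWitnessSketch {
    return ValueWitnessSketch(
        size: MemoryLayout<T>.size,
        alignment: MemoryLayout<T>.alignment,
        copy: { source, destination in
            // Read the value and initialize the destination with a copy.
            let value = source.assumingMemoryBound(to: T.self).pointee
            _ = destination.initializeMemory(as: T.self, repeating: value, count: 1)
        },
        destroy: { value in
            // Run the type's cleanup (e.g. release references it holds).
            _ = value.assumingMemoryBound(to: T.self).deinitialize(count: 1)
        }
    )
}

let witnesses = makeWitnessSketch(for: String.self)

var original = "hello"
let buffer = UnsafeMutableRawPointer.allocate(
    byteCount: witnesses.size,
    alignment: witnesses.alignment
)

// Copy `original` into the freshly allocated buffer via the table.
withUnsafePointer(to: &original) { pointer in
    witnesses.copy(UnsafeRawPointer(pointer), buffer)
}
let copied = buffer.assumingMemoryBound(to: String.self).pointee

witnesses.destroy(buffer)
buffer.deallocate()

print(copied)
```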

The copy-on-write implementation is visible in the standard library code for Array and String: Array really does call isUniquelyReferenced, for example, before it performs a subscript set.
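
The effect can be observed from the outside with a small experiment. `bufferAddress(of:)` is a helper written for this sketch, and comparing buffer addresses relies on observed behavior of Array rather than on a documented guarantee.

```swift
// Helper for this sketch: address of the array's element storage.
func bufferAddress(of array: [Int]) -> UnsafeRawPointer? {
    return array.withUnsafeBufferPointer { UnsafeRawPointer($0.baseAddress) }
}

let a = [1, 2, 3]
var b = a   // copies the Array struct; the element buffer is shared

let sharedBefore = bufferAddress(of: a) == bufferAddress(of: b)

b[0] = 99   // buffer is not uniquely referenced, so Array copies it first

let sharedAfter = bufferAddress(of: a) == bufferAddress(of: b)

print(sharedBefore, sharedAfter, a, b)
```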

> 7. If you want to have a reference-type (like `NSData`) with value semantics, then I need to implement my own copy-on-write of course, but if I want to have it only on the swift-value-type level the compiler should be able to do it all by itself, shouldn't it?
> 
> I read some [posts like this one][1] describing how Swift implements value types in a manner that conflicts with some of the things Kyle and Arnold said at WWDC 2016 (see above). Did Swift’s implementation change here between v2 and v3 or what do you think? The article's interpretation of the changes of the memory address (and the padding ints for the address struct; see the post) suggests that an existential container is always used for structs (see *2.*) and copy-on-write is done automatically (see *3.*)…
> 
> It would be great, if someone could give me the answers to these questions :). Thanks in advance.
> 

This blog post draws the wrong conclusion from what it observes. Plain struct types are not copy-on-write; assigning them really creates a copy.

The test case in the blog post really shows that the string stored in the variable “streetAddress” changes. He overlays the memory of “struct Address” with “struct AddressBits”. If you read “bits1.underlyingPtr” you get whatever is in the first 4 or 8 bytes (on a 32-bit or 64-bit platform, respectively) of the String struct instance stored in Address.streetAddress. If you dig deep enough into the implementation of Swift’s String (https://github.com/apple/swift/blob/master/stdlib/public/core/StringCore.swift), you will see that this is a variable that holds a pointer (the pointer to the String’s storage).

Best,
Arnold