[swift-dev] [discussion notes] SIL address types and borrowing

Tue Oct 11 16:14:59 CDT 2016

> On Oct 11, 2016, at 11:49 AM, Joe Groff <jgroff at apple.com> wrote:
>> On Oct 11, 2016, at 11:44 AM, John McCall <rjmccall at apple.com> wrote:
>> 
>>> On Oct 11, 2016, at 11:22 AM, Joe Groff <jgroff at apple.com> wrote:
>>>> On Oct 11, 2016, at 11:19 AM, Andrew Trick <atrick at apple.com> wrote:
>>>> 
>>>> 
>>>>> On Oct 11, 2016, at 11:02 AM, Joe Groff <jgroff at apple.com> wrote:
>>>>> 
>>>>> 
>>>>>> On Oct 11, 2016, at 10:50 AM, John McCall <rjmccall at apple.com> wrote:
>>>>>> 
>>>>>>> On Oct 11, 2016, at 10:10 AM, Joe Groff via swift-dev <swift-dev at swift.org> wrote:
>>>>>>>> On Oct 10, 2016, at 6:58 PM, Andrew Trick <atrick at apple.com> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Oct 10, 2016, at 6:23 PM, Joe Groff <jgroff at apple.com> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Oct 7, 2016, at 11:10 PM, Andrew Trick via swift-dev <swift-dev at swift.org> wrote:
>>>>>>>>>> ** World 1: SSA @inout
>>>>>>>>>> 
>>>>>>>>>> Projecting an element produces a new SILValue. Does this SILValue have
>>>>>>>>>> its own ownership associated with its lifetime, or is it derived
>>>>>>>>>> from its parent object by looking through projections?
>>>>>>>>>> 
>>>>>>>>>> Either way, projecting any subelement requires reconstructing the
>>>>>>>>>> entire aggregate in SIL, through all nesting levels. This will
>>>>>>>>>> generate a massive amount of SILValues. Superficially they all need
>>>>>>>>>> their own storage.
>>>>>>>>>> 
>>>>>>>>>> [We could claim that projections don't need storage, but that only
>>>>>>>>>> solves one side of the problem.]
>>>>>>>>>> 
>>>>>>>>>> [I argue that this actually obscures the producer/consumer
>>>>>>>>>> relationship, which is the opposite of the intention of moving to
>>>>>>>>>> SSA. Projecting subelements for mutation fundamentally doesn't make
>>>>>>>>>> sense. It does make sense to borrow a subelement (not for
>>>>>>>>>> mutation). It also makes sense to project a mutable storage
>>>>>>>>>> location. The natural way to project a storage location is by
>>>>>>>>>> projecting an address...]
>>>>>>>>> 
>>>>>>>>> I think there's a size threshold at which SSA @inout is manageable, and might lead to overall better register-oriented code, if the aggregates can be exploded into a small number of individual values. The cost of reconstructing the aggregate could be mitigated somewhat by introducing 'insert' instructions for aggregates to pair with the projection instructions, similar to how LLVM has insert/extractelement. "%x = project_value %y.field; %x' = transform(%x); %y' = insert %y.field, %x" isn't too terrible compared to the address-oriented formulation. Tracking ownership state through projections and insertions might be tricky; haven't thought about that aspect.
>>>>>>>>> 
>>>>>>>>> -Joe
>>>>>>>> 
>>>>>>>> We would have to make sure SROA+mem2reg could still kick in. If that happens, I don’t think we need to worry about inout ownership semantics anymore. A struct_extract is then essentially a borrow. Its parent’s lifetime needs to be guaranteed, but I don’t know if the subobject needs explicit scoping in SIL since there are no inout scopes to worry about and nothing for the runtime to do when the scope ends.
>>>>>>>> 
>>>>>>>> (Incidentally, this would never happen to a CoW type that has a uniqueness check: to mutate a CoW type, its value needs to be in memory).
>>>>>>> 
>>>>>>> Does a uniqueness check still need to be associated with a memory location once we associate ownership with SSA values? It seems to me like it wouldn't necessarily need to be. One thing I'd like us to work toward is being able to reliably apply uniqueness checks to rvalues, so that code in a "pure functional" style gets the same optimization benefits as code that explicitly uses inouts.
>>>>>> 
>>>>>> As I've pointed out in the past, this doesn't make any semantic sense.  Projecting out a buffer reference as a true r-value creates an independent value and therefore requires bumping the reference count.  The only query that makes semantic sense is "does this value hold a unique reference to its buffer", which requires some sort of language tool for talking abstractly about values without creating new, independent values.  Our only existing language tool for that is inout, which allows you to talk about the value stored in a specific mutable variable.  Ownership will give us a second and more general tool, borrowing, which allows you to abstractly refer to immutable existing values.
>>>>> 
>>>>> If we have @owned values, then we also have the ability to do a uniqueness check on that value, don't we? This would necessarily consume the value, but we could conditionally produce a new known-unique value on the path where the uniqueness check succeeds.
>>>>> 
>>>>> entry(%1: @owned $X):
>>>>> is_uniquely_referenced %1, yes, no
>>>>> yes(%2: /*unique*/ @owned $X):
>>>>> // %2 is unique, until copied at least
>>>>> no(%3: @owned $X):
>>>>> // %3 is not
>>>>> 
>>>>> -Joe
>>>> 
>>>> You had to copy $X to make it @owned.
>>> 
>>> This is the part I think I'm missing. It's not clear to me why this is the case, though. You could have had an Array return value that has never been stored in memory, so never needed to be copied. If you have an @inout memory location, and we enforce the single-owner property on inouts so that they act like a Rust-style mutable borrow, then you should also be able to take the value out of the memory location as long as you move a value back in before the scope of the inout expires.
>> 
>> I'm not sure what your goal is here vs. relying on borrowing.  Both still require actual analysis to prove uniqueness at any given point, as you note with your "until copied at least" comment.
>> 
>> Also, from a higher level, I'm not sure why we care whether a value that was semantically an r-value was a unique reference.  CoW types are immutable even if the reference is shared, and that should be structurally straightforward to take advantage of under any ownership representation.
> 
> My high-level goal was to get to a point where we could support in-place optimizations on unique buffers that belong to values that are semantically rvalues at the language level. It seems to me that we ought to be able to make 'stringA + B + C + D' as efficient as '{ var tmp = stringA; tmp += B; tmp += C; tmp += D; tmp }()' by enabling uniqueness checks and in-place mutation of the unique-by-construction results of +-ing strings. If you think that works under the borrow/inout-in-memory model, then no problem; I'm also trying to understand the design space a bit more.

Ah right, that optimization.  The problem here with using borrows is that you really want static enforcement that both (1) you've really got ownership of a unique reference (so e.g. you aren't just forwarding a borrowed value down)  and (2) you're not accidentally copying the reference and so ruining the uniqueness check.  Those are hard guarantees to get with an implicitly-copyable type.
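
To make point (2) concrete, here is a minimal sketch in today's Swift using the standard library's isKnownUniquelyReferenced. The Buffer class and the function name are invented for illustration and are not taken from any real CoW type; the sketch only shows that an ordinary, unmarked copy of the reference is enough to make the check fail.

```swift
// Hypothetical buffer class, standing in for a CoW type's storage.
final class Buffer {
    var storage: [Int]
    init(_ storage: [Int]) { self.storage = storage }
}

// Returns the result of the uniqueness check before and after an
// implicit copy of the reference.
func uniquenessBeforeAndAfterCopy() -> (before: Bool, after: Bool) {
    var ref = Buffer([1, 2, 3])
    let before = isKnownUniquelyReferenced(&ref)

    // Nothing in the type system flags this as a copy of the reference.
    let alias = ref
    // Keep the alias alive across the second check so that it cannot be
    // released early.
    let after = withExtendedLifetime(alias) {
        isKnownUniquelyReferenced(&ref)
    }
    return (before, after)
}

let result = uniquenessBeforeAndAfterCopy()
print(result.before, result.after)  // true false
```

With an implicitly-copyable reference, the only thing separating the two outcomes is a line that looks entirely harmless.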

I wonder if it would make more sense to make copy-on-write buffer references a move-only type, so that as long as you were just working with the raw reference (as opposed to the CoW aggregate, which would remain copyable) it wouldn't get implicitly copied anymore.  You could have mutable and immutable buffer reference types, both move-only, and there could be a consuming checkUnique operation on the immutable one that, I dunno, returned an Either of the mutable and immutable versions.

For CoW aggregates, you'd need some @copied attribute on the field to make sure that the CoW aggregate was still copyable.  Within the implementation of the type, though, you would be projecting out the reference immediately, and thereafter you'd be certain that you were borrowing / moving it around as appropriate.
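
For comparison, the pattern a CoW aggregate uses today looks roughly like the sketch below. IntStack and StackStorage are hypothetical names, and there is no @copied attribute or move-only reference here, since neither exists; the sketch only shows where the implementation projects out the reference and checks it, with nothing but convention preventing a stray copy.

```swift
// Hypothetical heap storage for the aggregate.
final class StackStorage {
    var elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

// A copyable CoW aggregate wrapping a single buffer reference.
struct IntStack {
    private var buffer = StackStorage([])

    init() {}

    var elements: [Int] { buffer.elements }

    mutating func push(_ x: Int) {
        // Project the reference and check it in place, through inout.
        if !isKnownUniquelyReferenced(&buffer) {
            // Shared: clone the storage before mutating.
            buffer = StackStorage(buffer.elements)
        }
        buffer.elements.append(x)
    }
}

var a = IntStack()
a.push(1)
let b = a      // copies the aggregate, sharing the buffer
a.push(2)      // buffer is shared, so this clones before appending
print(a.elements, b.elements)  // [1, 2] [1]
```

Under the move-only idea, the check inside push would be the place where the raw reference gets consumed and handed back as either the mutable or the immutable flavor.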

I dunno.  It's an idea.

John.