[swift-dev] [discussion notes] SIL address types and borrowing

Karl razielim at gmail.com
Sat Oct 8 12:09:27 CDT 2016

Could you add this (and John’s previous writeup) to the docs in the repo?

I was reasonably far along with adding unowned optionals a while back, but got totally lost in SILGen.
This info looks really valuable, but personally I find that with the mailing list format it’s hard to ever find this kind of stuff when I need it.



P.S. going to pick up that unowned optional stuff soon, once I have time to read the docs about SILGen

> On 8 Oct 2016, at 08:10, Andrew Trick via swift-dev <swift-dev at swift.org> wrote:
> On swift-dev, John already sent out a great writeup on SIL SSA:
> Representing "address-only" values in SIL.
> While talking to John I also picked up a lot of insight into how
> address types relate to SIL ownership and borrow checking. I finally
> organized the information into these notes. This is not a
> proposal. It's background information for those of us writing and
> reviewing proposals. Just take it as a strawman for future
> discussions. (There's also a good chance I'm getting something
> wrong).
> [My commentary in brackets.]
> ** Recap of address-only.
> Divide address-only types into two categories:
> 1. By abstraction (compiler doesn't know the size).
> 2. The type is "memory-linked". i.e. the address is significant at runtime.
>    - weak references (anything that registers its address).
>    - C++ this.
>    - Anything with interior pointers.
>    - Any shared-borrowed value of a type with "nonmutating" properties.
>      ["nonmutating" properties allow mutation of state attached to a value.
>       Rust atomics are an example.]
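A minimal Rust sketch of the last bullet, since the notes cite Rust atomics as the example. It shows state attached to a value being mutated through a shared borrow, which is why the storage address of such a value matters. `HitCounter`, `record`, and `total` are hypothetical names introduced only for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical type: its state can change through a shared borrow.
pub struct HitCounter {
    hits: AtomicUsize,
}

impl HitCounter {
    pub fn new() -> Self {
        HitCounter { hits: AtomicUsize::new(0) }
    }

    // Takes `&self` (a shared borrow), yet updates the stored count in place.
    // Returns the count after this hit.
    pub fn record(&self) -> usize {
        self.hits.fetch_add(1, Ordering::SeqCst) + 1
    }

    // Reads the current count through a shared borrow.
    pub fn total(&self) -> usize {
        self.hits.load(Ordering::SeqCst)
    }
}
```

Two shared references to the same `HitCounter` observe each other's updates, so copying the value to a temporary instead of passing its address would change behavior.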
> Address-only will not be reflected in SIL types. SIL addresses should
> only be used for formal memory (pointers, globals, class
> properties, captures). We'll get to inout arguments later...
> As with opaque types, when IRGen lowers a memory-linked borrowed type,
> it needs to allocate storage.
> Concern: SILGen has built-in tracking of managed values that automates
> insertion of cleanups. Lowering address-only types after SILOpt would
> require rediscovering that information based on CFG analysis. Is this
> too heroic?
> This was already described by John. Briefly recapping:
> e.g. Constructing Optional<Any>
> We want initialization to happen in place, like this:
> %0 = struct_element_addr .. #S.any
> %1 = init_existential_addr %0, $*Any, $Optional<X>
> %2 = inject_enum_data_addr %1, $Optional<X>.Some
> apply @initX(%2)
> SILValue initialization would look something like:
> %0 = apply @initX()
> %1 = enum #Optional.Some, %0 : $X
> %2 = existential %1 : $Any
> [I'm not sure we actually want to represent an existential container
> this way, but enum, yes.]
> Lowering now requires discovering the storage structure, bottom-up,
> hoisting allocation, inserting cleanups as John explained.
> Side note: Before lowering, something like alloc_box would directly
> take its initial value.
> ** SILFunction calling convention.
> For ownership analysis, there's effectively no difference between the
> value/address forms of argument ownership:
> @owned                       / @in
> @guaranteed                  / @in_guaranteed
> return                       / @out
> @owned arg + @owned return   / @inout
> Regardless of the representation we choose for @inout, @in/@out will
> now be scalar types. SILFunction will maintain the distinction between
> @owned/@in etc. based on whether the type is address-only. We need
> this for reabstraction, but it only affects the function type, not the
> calling convention.
> Rather than building a tuple, John prefers SIL support for anonymous
> aggregates as "exploded values".
> [I'm guessing because tuples are a distinct formal type with their own
> convention and common ownership. This may need some discussion though.]
> Example SIL function type:
> $(@in P, @owned Q) -> (@owned R, @owned S, @out T, @out U)
> %p = apply f: $() -> P
> %q = apply g: $() -> Q
> %exploded = apply h(%p, %q)
> %r = project_exploded %exploded, #0 : $R
> %s = project_exploded %exploded, #1 : $S
> %t = project_exploded %exploded, #2 : $T
> %u = project_exploded %exploded, #3 : $U
> Exploded types require all their elements to be projected with their
> own independent ownership.
> ** Ownership terminology.
> Swift "owned"    = Rust values           = SIL @owned      = implicitly consumed
> Swift "borrowed" = Rust immutable borrow = SIL @guaranteed = shared
> Swift "inout"    = Rust mutable borrow   = SIL @inout      = unique
> Swift "inout" syntax is already (nearly) sufficient.
> "borrowed" may not need syntax on the caller side, just a way to
> qualify parameters. Swift still needs syntax for returning a borrowed
> value.
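For readers who know the Rust column of the terminology table better than the SIL one, here is a small sketch with one function per row. The function names are hypothetical, and this only shows the Rust side of the analogy, not how SIL would represent these conventions.

```rust
// Hypothetical helpers, one per row of the terminology table.

// "owned": the callee consumes the value; the caller can no longer use it.
pub fn consume(v: Vec<i32>) -> usize {
    v.len()
}

// "borrowed": shared, read-only access for the duration of the call.
pub fn inspect(v: &[i32]) -> i32 {
    v.iter().sum()
}

// "inout": unique, mutable access for the duration of the call.
pub fn append_one(v: &mut Vec<i32>) {
    v.push(1);
}
```

Note that only the mutable borrow needs a marker at the call site in a way that resembles Swift's `&` for inout; the shared borrow is where Swift would need new syntax on the parameter side.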
> ** Representation of borrowed values.
> Borrowed values represent some shared storage location.
> We want some borrowed value references to be passed as SIL values, not SIL addresses:
> - Borrowed class references should not be indirected.
> - Optimize borrowing other small non-memory linked types.
> - Support capture promotion, and other SSA optimizations.
> - Borrow CoW values directly.
> [Address-only borrowed types will still be passed as SIL addresses (why not?)]
> Borrowed types with potentially mutating properties must be passed by
> SIL address because they are not actually immutable and their storage
> location is significant.
> Borrowed references have a scope and need an end-of-borrow marker.
> [The end-of-borrow marker semantically changes the memory state, and
> statically enforces non-overlapping memory states. It does not
> semantically write-back a value. Borrowed values with mutating fields
> are semantically modified in-place.]
> [Regardless of whether borrowed references are represented as SIL
> values or addresses, they must be associated with formal storage. That
> storage must remain immutable at the language level (although it may
> have mutating fields) and the value cannot be destroyed during the
> borrowed scope].
> [Trivial borrowed values can be demoted to copies so we can eliminate
> their scope]
> [Anything borrowed from global storage (and not demoted to a copy)
> needs its scope to be dynamically enforced. Borrows from local storage
> are sufficiently statically enforced. However, in both cases the
> optimizer must respect the static scope of the borrow.]
> [I think borrowed values are effectively passed @guaranteed. The
> end-of-borrow scope marker will then always be at the top-level
> scope. You can't borrow in a caller and end its scope in the callee.]
> ** Borrowed and inout scopes.
> inout value references are also scoped. We'll get to their
> representation shortly. Within an inout scope, memory is in an
> exclusive state. No borrowed scopes may overlap with an inout state,
> which is to say, memory is either shared or exclusive.
> We need a flag for stored properties, even for simple trivial
> types. That's the only way to provide a simple user model. At least we
> don't need this to be implemented atomically, since we're not detecting race
> conditions. Optimizations will come later. We should be able to prove
> that some stored properties are never passed as inout.
> The stored property flag needs to be a tri-state: owned, borrowed, exclusive.
> The memory value can only be destroyed in the owned state.
> The user may mark some storage locations as "unchecked" as an
> opt-out. That doesn't change the optimizer's constraints. It simply
> bypasses the runtime check.
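The tri-state flag described above can be sketched as a small state machine. This is only an illustration under assumptions: the names `AccessFlag`, `AccessState`, and `AccessError` are hypothetical, and the count of live shared borrows inside `Borrowed` is my addition (the notes only say "tri-state"). The flag is deliberately non-atomic, matching the point that race detection is not a goal.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AccessState {
    Owned,
    Borrowed(usize), // number of live shared borrows (an assumption)
    Exclusive,
}

// Returned when a requested transition would overlap shared and exclusive access.
#[derive(Debug, PartialEq)]
pub struct AccessError;

// Hypothetical per-property flag; plain (non-atomic) state on purpose.
pub struct AccessFlag {
    state: AccessState,
}

impl AccessFlag {
    pub fn new() -> Self {
        AccessFlag { state: AccessState::Owned }
    }

    pub fn state(&self) -> AccessState {
        self.state
    }

    // Shared borrows may nest, but never overlap an exclusive access.
    pub fn begin_shared(&mut self) -> Result<(), AccessError> {
        self.state = match self.state {
            AccessState::Owned => AccessState::Borrowed(1),
            AccessState::Borrowed(n) => AccessState::Borrowed(n + 1),
            AccessState::Exclusive => return Err(AccessError),
        };
        Ok(())
    }

    pub fn end_shared(&mut self) -> Result<(), AccessError> {
        self.state = match self.state {
            AccessState::Borrowed(1) => AccessState::Owned,
            AccessState::Borrowed(n) if n > 1 => AccessState::Borrowed(n - 1),
            _ => return Err(AccessError),
        };
        Ok(())
    }

    // Exclusive (inout) access is only allowed from the owned state.
    pub fn begin_exclusive(&mut self) -> Result<(), AccessError> {
        match self.state {
            AccessState::Owned => {
                self.state = AccessState::Exclusive;
                Ok(())
            }
            _ => Err(AccessError),
        }
    }

    pub fn end_exclusive(&mut self) -> Result<(), AccessError> {
        match self.state {
            AccessState::Exclusive => {
                self.state = AccessState::Owned;
                Ok(())
            }
            _ => Err(AccessError),
        }
    }

    // The memory value can only be destroyed in the owned state.
    pub fn can_destroy(&self) -> bool {
        self.state == AccessState::Owned
    }
}
```

An "unchecked" location would simply skip these calls at runtime while the optimizer still treats the scopes as if they had been checked.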
> ** Ownership of loaded values.
> [MikeG already explained possibilities of load ownership in
> [swift-dev] [semantic-arc][proposal] High Level ARC Memory Operations]
> For the sake of understanding the model, it's worth realizing that we
> only need one form of load ownership: load_borrow. We don't
> actually need an operation that loads an owned value out of formal
> storage. This makes canonical sense because:
> - Semantically, a load must at least be a borrow because the storage
>   location's non-exclusive flag needs to be dynamically checked
>   anyway, even if the value will be copied.
> - Code motion in the SIL optimizer has to obey the same limitations
>   within borrow scopes regardless of whether we fuse loads and copies
>   (retains).
> [For the purpose of semantic ARC, the copy_value would be the RC
> root. The load and copy_value would effectively be "coupled" by the
> static scope of the borrow. e.g. we would not want to move a release
> inside the static scope of a borrow.]
> [Purely in the interest of concise SIL, I still think we want a load [copy].]
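A rough Rust analogy for "load_borrow followed by copy_value", using `Rc` as a stand-in for a reference-counted class reference. `Holder` and `load_copy` are hypothetical names, and `Rc::clone` only approximates a retain; this is not a claim about how SIL lowers either instruction.

```rust
use std::rc::Rc;

// Hypothetical formal storage holding a reference-counted value.
pub struct Holder {
    pub field: Rc<String>,
}

// Borrow the stored reference, then copy it (retain) inside the borrow scope.
// The returned copy is owned and outlives the borrow.
pub fn load_copy(holder: &Holder) -> Rc<String> {
    let borrowed: &Rc<String> = &holder.field; // analogous to load_borrow
    let owned = Rc::clone(borrowed); // analogous to copy_value (the RC root)
    owned // the borrow of `holder.field` ends when this function returns
}
```

The copy happens while the borrow is live, which mirrors the point that the load and the copy are coupled by the static scope of the borrow.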
> ** SIL value ownership and aggregates
> Operations on values:
> 1. copy
> 2. forward (move)
> 3. borrow (share)
> A copy or forward produces an owned value.
> An owned value has a single consumer.
> A borrow has static scope.
> For simplicity, passing a bb argument only has move semantics (it
> forwards the value). Later that can be expanded if needed.
> We want to allow simultaneous access to independent subelements of a
> fragile aggregate. We should be able to borrow one field while
> mutating another.
> Is it possible to forward a subelement within an aggregate? No. But we
> can fully explode an owned aggregate into individual owned elements
> and reconstruct the aggregate. This makes use of the @exploded type
> feature described in the calling convention.
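Both ideas in this passage have direct Rust counterparts, sketched here with a hypothetical `Pair` type: fully exploding an owned aggregate into owned elements and rebuilding it, and borrowing one field while mutating an independent one. The function names are made up for the example.

```rust
#[derive(Debug, PartialEq)]
pub struct Pair {
    pub name: String,
    pub items: Vec<i32>,
}

// Explode an owned aggregate into owned elements, replace one, and rebuild.
pub fn rename(pair: Pair, new_name: String) -> Pair {
    let Pair { name, items } = pair; // `pair` is consumed; each field is owned
    drop(name); // the old name is destroyed independently
    Pair { name: new_name, items } // reconstruct the aggregate
}

// Borrow one field while mutating another, independent field.
pub fn push_name_len(pair: &mut Pair) {
    let name: &str = &pair.name; // shared borrow of one field
    pair.items.push(name.len() as i32); // exclusive access to a different field
}
```

In `rename`, no single field is forwarded out of a still-live aggregate; the whole value is taken apart first, which is the restriction the notes describe.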
> [I don't think forwarding a subelement is useful anyway except for
> modeling @inout semantics...]
> That leads us to this question: Does an @inout value reference have
> formal storage (thus a SIL address) or is it just a convention for
> passing owned SSA values?
> ** World 1: SSA @inout
> Projecting an element produces a new SILValue. Does this SILValue have
> its own ownership associated with its lifetime, or is it derived
> from its parent object by looking through projections?
> Either way, projecting any subelement requires reconstructing the
> entire aggregate in SIL, through all nesting levels. This will
> generate a massive amount of SILValues. Superficially they all need
> their own storage.
> [We could claim that projections don't need storage, but that only
> solves one side of the problem.]
> [I argue that this actually obscures the producer/consumer
> relationship, which is the opposite of the intention of moving to
> SSA. Projecting subelements for mutation fundamentally doesn't make
> sense. It does make sense to borrow a subelement (not for
> mutation). It also makes sense to project a mutable storage
> location. The natural way to project a storage location is by
> projecting an address...]
> ** World 2: @inout formal storage
> In this world, @inout references continue to have SILType $*T with
> guaranteed exclusive access.
> Memory state can be:
> - uninitialized
> - holds an owned value
>   - has exclusive access
>   - has shared access
> --- expected transitions need to be handled
>   - must become uninitialized
>   - must become initialized
>   - must preserve initialization state
> We need to mark initializers with some "must initialize" marker,
> similar to how we mark deinitializers [this isn't clear to me yet].
> We could give address types qualifiers to distinguish the memory state
> of their pointee (uninitialized, shared, exclusive). Addresses
> themselves could be pseudo-linear types. This would provide the same
> use-def guarantees as the SSA @inout approach, but producing a new
> address each time memory changes state would also be complicated and
> cumbersome (though not as bad as SSA).
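To get a feel for "address types with state qualifiers" as pseudo-linear types, here is a typestate sketch in Rust. Everything in it is hypothetical (`Slot`, the `Uninit`/`Exclusive`/`Shared` markers, and the method names); it models the idea with an `Option` rather than real uninitialized memory, and it is not a proposal for SIL syntax.

```rust
use std::marker::PhantomData;

// Marker types standing in for the memory-state qualifiers.
pub struct Uninit;
pub struct Exclusive;
pub struct Shared;

// Hypothetical handle: an "address" whose type records the pointee's state.
pub struct Slot<T, State> {
    value: Option<T>,
    _state: PhantomData<State>,
}

impl<T> Slot<T, Uninit> {
    pub fn alloc() -> Self {
        Slot { value: None, _state: PhantomData }
    }

    // Consumes the uninitialized handle and yields an initialized, exclusive one.
    pub fn init(self, v: T) -> Slot<T, Exclusive> {
        Slot { value: Some(v), _state: PhantomData }
    }
}

impl<T> Slot<T, Exclusive> {
    // Overwrites the stored value; initialization state is preserved.
    pub fn set(&mut self, v: T) {
        self.value = Some(v);
    }

    // Transition exclusive -> shared; the old handle is gone afterwards.
    pub fn share(self) -> Slot<T, Shared> {
        Slot { value: self.value, _state: PhantomData }
    }

    // Move the value out; the memory must become uninitialized.
    pub fn take(self) -> (T, Slot<T, Uninit>) {
        let v = self.value.expect("exclusive slot is always initialized");
        (v, Slot::<T, Uninit>::alloc())
    }
}

impl<T> Slot<T, Shared> {
    // Read-only access while the memory is in the shared state.
    pub fn get(&self) -> &T {
        self.value.as_ref().expect("shared slot is always initialized")
    }

    // Transition shared -> exclusive.
    pub fn unshare(self) -> Slot<T, Exclusive> {
        Slot { value: self.value, _state: PhantomData }
    }
}
```

Every state change consumes the old handle and produces a new one, which gives the use-def guarantees mentioned above and also shows why the approach gets cumbersome: each transition needs a fresh name.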
> [[
> We didn't talk about the alternative, but presumably exclusive
> vs. shared scope would be delimited by pseudo memory operations as
> such:
> %a = alloc_stack
> begin_exclusive %a
> apply foo(%a) // must be marked an initializer?
> end_exclusive %a
> begin_shared %a
> apply bar(%a) // immutable access
> end_shared %a
> dealloc_stack %a
> Values loaded from shared memory also need to be scoped. They must be
> consumed within the shared region. e.g.
> %a2 = ref_element_addr
> %x = load_borrow %a2
> end_borrow %x, %a2
> It makes sense to me that a load_borrow would implicitly transition
> memory to shared state, and end_borrow would implicitly return memory
> to an owned state. If the address type is already ($* @borrow T), then
> memory would remain in the shared state.
> ]]
> For all sorts of analysis and optimization, from borrow checking to
> CoW to ARC, we really need aliasing guarantees. Knowing we have a
> unique address to a location is about as good as having an owned
> value.
> To get this guarantee we need to structurally guarantee
> unique addresses.
> [Is there a way to do this without making all the element_addr
> operations scoped?]
> With aliasing guarantees, verification should be able to statically
> prove that most formal storage locations are properly initialized and
> uninitialized (pseudo-linear type) by inspecting the memory
> operations.
> Likewise, we can verify the shared vs. exclusive states.
> Representing @inout with addresses doesn't really add features to
> SIL. In any case, SIL address types are still used for
> formal storage. Exclusive access through any of the following
> operations must be guaranteed dynamically:
> - ref_element_addr
> - global_addr
> - pointer_to_address
> - alloc_stack
> - project_box
> We end up with these basic SIL Types:
> $T = owned value
> $@borrowed T = shared value
> $*T = exclusively accessed
> $* @borrowed T = shared access
> [I think the non-address @borrowed type is only valid for concrete
> types that the compiler knows are not memory-linked? This can be used
> to avoid passing borrowed values indirectly for arrays and other
> small, free-to-copy values].
> [We obviously need to work through concrete examples before we can
> claim to have a real design.]
> -Andy
