<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Oct 4, 2016, at 1:04 PM, John McCall <<a href="mailto:rjmccall@apple.com" class="">rjmccall@apple.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" class=""><div class=""><br class="Apple-interchange-newline">On Sep 30, 2016, at 11:54 PM, Michael Gottesman via swift-dev <<a href="mailto:swift-dev@swift.org" class="">swift-dev@swift.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div class="">The document attached below contains the first "Semantic ARC" mini proposal: the High Level ARC Memory Operations Proposal.</div><div class=""><br class=""></div><div class="">An html rendered version of this markdown document is available at the following URL:</div><div class=""><br class=""></div><div class=""><a href="https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html" class="">https://gottesmm.github.io/proposals/high-level-arc-memory-operations.html</a></div><div class=""><br class=""></div><div class="">----</div><div class=""><br class=""></div><div class=""><div class=""># Summary</div><div class=""><br class=""></div><div class="">This document proposes:</div><div class=""><br class=""></div><div class="">1. adding the `load_strong`, `store_strong` instructions to SIL. 
These can only</div><div class=""> be used with memory locations of `non-trivial` type.</div></div></div></div></blockquote><div class=""><br class=""></div>I would really like to avoid using the word "strong" here. Under the current proposal, these instructions will be usable with arbitrary non-trivial types, not just primitive class references. Even if you think of an aggregate that happens to contain one or more strong references as some sort of aggregate strong reference (which is questionable but not completely absurd), we already have loadable non-strong class references that this operation would be usable with, like native unowned references. "load_strong %0 : $*@sil_unowned T" as an operation yielding a scalar "@sil_unowned T" is ridiculous, and it will only get more ridiculous when we eventually allow this operation to work with types that are currently address-only, like weak references.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Brainstorming:</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; 
font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">Something like load_copy and store_copy would be a bit unfortunate, since store_copy doesn't actually copy the source operand and we want to have a load_copy [take].</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">load_value and store_value seem excessively generic. It's not like non-trivial types aren't values.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">One question that comes to mind: do we actually need new instructions here other than for staging purposes? 
We don't actually need new instructions for pseudo-linear SIL to work; we just need to say that we only enforce pseudo-linearity for non-trivial types.</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">If we just want the instruction to be explicit about ownership so that we can easily distinguish these cases, we can make the rule always explicit, e.g.:</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""> load [take] %0 : $*MyClass</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""> load [copy] %0 : $*MyClass</div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><div class=""> load 
[trivial] %0 : $*Int</div><div class=""><br class=""></div><div class=""><div class=""> store %0 to [initialization] %1 : $*MyClass</div><div class=""> store %0 to [assignment] %1 : $*MyClass</div><div class=""><div class=""> store %0 to [trivial] %1 : $*Int</div><div class=""><br class=""></div></div></div></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class="">John.</div></div></blockquote><div><br class=""></div><div>The reason why I originally suggested to go the load_strong route is that we already have load_weak, load_unowned instructions. If I could add a load_strong instruction, then it would make sense to assign an engineer to do a pass over all 3 of these instructions and combine them into 1 load instruction. That is, first transform into a form amenable for canonicalization and then canonicalize all at once.</div><div><br class=""></div><div>As you pointed out, both load_unowned and load_weak involve representation changes in type (for instance the change of weak pointers to Optional<T>). Such a change would be against the "spirit" of a load instruction to perform such representation changes versus ownership changes.</div><div><br class=""></div><div>In terms of the properties that we actually want here, what is important is that we can verify that no non-trivially typed values are loaded in an unsafe unowned manner. That can be done also with ownership flags on load/store.</div><div><br class=""></div><div>Does this sound reasonable:</div><div><br class=""></div><div>1. We introduce two enums that define memory ownership changes, one for load and one for store. Both of these enums will contain a [trivial] ownership.</div><div>2. 
We enforce in the verifier that non-trivial types must have a non-trivial ownership modifier on any memory operations that they are involved in.</div><div><br class=""></div><div>Michael</div><br class=""><blockquote type="cite" class=""><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><br class=""></div><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" class=""><div class="" style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"><div class=""><div class="">2. banning the use of `load`, `store` on values of `non-trivial` type.</div><div class=""><br class=""></div><div class="">This will allow for:</div><div class=""><br class=""></div><div class="">1. eliminating optimizer miscompiles that occur due to releases being moved into</div><div class=""> the region in between a `load`/`retain`, `load`/`release`,</div><div class=""> `store`/`release`. (For a specific example, see the appendix).</div><div class="">2. modeling `load`/`store` as having `unsafe unowned` ownership semantics. This</div><div class=""> will be enforced via the verifier.</div><div class="">3. more aggressive ARC code motion.</div><div class=""><br class=""></div><div class=""># Definitions</div><div class=""><br class=""></div><div class="">## load_strong</div><div class=""><br class=""></div><div class="">We propose three different forms of load_strong differentiated via flags. 
First</div><div class="">define `load_strong` as follows:</div><div class=""><br class=""></div><div class=""> %x = load_strong %x_ptr : $*C</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> %x = load %x_ptr : $*C</div><div class=""> retain_value %x : $C</div><div class=""><br class=""></div><div class="">Then define `load_strong [take]` as:</div><div class=""><br class=""></div><div class=""> %x = load_strong [take] %x_ptr : $*Builtin.NativeObject</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> %x = load %x_ptr : $*Builtin.NativeObject</div><div class=""><br class=""></div><div class="">**NOTE** `load_strong [take]` implies that the loaded from memory location no</div><div class="">longer owns the result object (i.e. a take is a move). Loading from the memory</div><div class="">location again without reinitialization is illegal.</div><div class=""><br class=""></div><div class="">Next we provide `load_strong [guaranteed]`:</div><div class=""><br class=""></div><div class=""> %x = load_strong [guaranteed] %x_ptr : $*Builtin.NativeObject</div><div class=""> ...</div><div class=""> fixLifetime(%x)</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> %x = load %x_ptr : $*Builtin.NativeObject</div><div class=""> ...</div><div class=""> fixLifetime(%x)</div><div class=""><br class=""></div><div class="">`load_strong [guaranteed]` implies that in the region before the fixLifetime,</div><div class="">the loaded object is guaranteed semantically to remain alive. The fixLifetime</div><div class="">communicates to the optimizer the location up to which the value's lifetime is</div><div class="">guaranteed to live. An example of where this construct is useful is when one has</div><div class="">a let binding to a class instance `c` that contains a let field `f`. 
In that</div><div class="">case `c`'s lifetime guarantees `f`'s lifetime.</div><div class=""><br class=""></div><div class="">## store_strong</div><div class=""><br class=""></div><div class="">Define a store_strong as follows:</div><div class=""><br class=""></div><div class=""> store_strong %x to %x_ptr : $*C</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> %old_x = load %x_ptr : $*C</div><div class=""> store %new_x to %x_ptr : $*C</div><div class=""> release_value %old_x : $C</div><div class=""><br class=""></div><div class="">*NOTE* store_strong is defined as a consuming operation. We also provide</div><div class="">`store_strong [init]` in the case where we know statically that there is no</div><div class="">previous value in the memory location:</div><div class=""><br class=""></div><div class=""> store_strong %x to [init] %x_ptr : $*C</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> store %new_x to %x_ptr : $*C</div><div class=""><br class=""></div><div class=""># Implementation</div><div class=""><br class=""></div><div class="">## Goals</div><div class=""><br class=""></div><div class="">Our implementation strategy goals are:</div><div class=""><br class=""></div><div class="">1. zero impact on other compiler developers until the feature is fully</div><div class=""> developed. This implies all work will be done behind a flag.</div><div class="">2. separation of feature implementation from updating passes.</div><div class=""><br class=""></div><div class="">Goal 2 will be implemented via a pass that blows up `load_strong`/`store_strong`</div><div class="">right after SILGen.</div><div class=""><br class=""></div><div class="">## Plan</div><div class=""><br class=""></div><div class="">We begin by adding initial infrastructure for our development. This means:</div><div class=""><br class=""></div><div class="">1. 
Adding to SILOptions a disabled-by-default flag called</div><div class=""> "EnableSILOwnershipModel". This flag will be set by a false-by-default frontend</div><div class=""> option called "-enable-sil-ownership-model".</div><div class=""><br class=""></div><div class="">2. Bots will be brought up to test the compiler with</div><div class=""> "-enable-sil-ownership-model" set to true. The specific bots are:</div><div class=""><br class=""></div><div class=""> * RA-OSX+simulators</div><div class=""> * RA-Device</div><div class=""> * RA-Linux.</div><div class=""><br class=""></div><div class=""> The bots will run once a day until the feature is close to completion. Then a</div><div class=""> polling model will be followed.</div><div class=""><br class=""></div><div class="">Now that change isolation is guaranteed, we develop building blocks for the</div><div class="">optimization:</div><div class=""><br class=""></div><div class="">1. load_strong, store_strong will be added to SIL, and IRGen, serialization,</div><div class="">printing, and SIL parsing support will be implemented. SILGen will not be modified</div><div class="">at this stage.</div><div class=""><br class=""></div><div class="">2. A pass called the "OwnershipModelEliminator" will be implemented. It will</div><div class="">(initially) blow up load_strong/store_strong instructions into their constituent</div><div class="">operations.</div><div class=""><br class=""></div><div class="">3. An option called "EnforceSILOwnershipModel" will be added to the verifier. If</div><div class="">the option is set, the verifier will assert if unsafe unowned loads or stores are</div><div class="">used to access memory locations of non-trivial type.</div><div class=""><br class=""></div><div class="">Finally, we wire up the building blocks:</div><div class=""><br class=""></div><div class="">1. 
If SILOption.EnableSILOwnershipModel is true, then the post-SILGen SIL</div><div class=""> verification will be performed with EnforceSILOwnershipModel set to true.</div><div class="">2. If SILOption.EnableSILOwnershipModel is true, then the pass manager will run</div><div class=""> the OwnershipModelEliminator pass right after SILGen before the normal pass</div><div class=""> pipeline starts.</div><div class="">3. SILGen will be changed to emit load_strong, store_strong instructions when</div><div class=""> the EnableSILOwnershipModel flag is set. We will use verifier failures to</div><div class=""> guarantee that we are not missing any specific cases.</div><div class=""><br class=""></div><div class="">Then once all of the bots are green, we change SILOption.EnableSILOwnershipModel</div><div class="">to be true by default. After a cooling off period, we move all of the code</div><div class="">behind the SILOwnershipModel flag in front of the flag. We do this so we can</div><div class="">reuse that flag for further SILOwnershipModel changes.</div><div class=""><br class=""></div><div class="">## Optimizer Changes</div><div class=""><br class=""></div><div class="">Since the SILOwnershipModel eliminator will eliminate the load_strong,</div><div class="">store_strong instructions right after ownership verification, there will be no</div><div class="">immediate effects on the optimizer and thus the optimizer changes can be done in</div><div class="">parallel with the rest of the ARC optimization work.</div><div class=""><br class=""></div><div class="">But, in the long run, we need IRGen to eliminate the load_strong, store_strong</div><div class="">instructions, not the SILOwnershipModel eliminator, so that we can enforce</div><div class="">Ownership invariants all through the SIL pipeline. Thus we will need to update</div><div class="">passes to handle these new instructions. 
The main optimizer changes can be</div><div class="">separated into the following areas: memory forwarding, dead stores, ARC</div><div class="">optimization. In all of these cases, the necessary changes are relatively</div><div class="">trivial to respond to. We give a quick taste of two of them: store->load</div><div class="">forwarding and ARC Code Motion.</div><div class=""><br class=""></div><div class="">### store->load forwarding</div><div class=""><br class=""></div><div class="">Currently we perform store->load forwarding as follows:</div><div class=""><br class=""></div><div class=""> store %x to %x_ptr : $C</div><div class=""> ... NO SIDE EFFECTS THAT TOUCH X_PTR ...</div><div class=""> %y = load %x_ptr : $C</div><div class=""> use(%y)</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> store %x to %x_ptr : $C</div><div class=""> ... NO SIDE EFFECTS THAT TOUCH X_PTR ...</div><div class=""> use(%x)</div><div class=""><br class=""></div><div class="">In a world, where we are using load_strong, store_strong, we have to also</div><div class="">consider the ownership implications. *NOTE* Since we are not modifying the</div><div class="">store_strong, `store_strong` and `store_strong [init]` are treated the</div><div class="">same. Thus without any loss of generality, lets consider solely `store_strong`.</div><div class=""><br class=""></div><div class=""> store_strong %x to %x_ptr : $C</div><div class=""> ... NO SIDE EFFECTS THAT TOUCH X_PTR ...</div><div class=""> %y = load_strong %x_ptr : $C</div><div class=""> use(%y)</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> store_strong %x to %x_ptr : $C</div><div class=""> ... 
NO SIDE EFFECTS THAT TOUCH X_PTR ...</div><div class=""> strong_retain %x</div><div class=""> use(%x)</div><div class=""><br class=""></div><div class="">### ARC Code Motion</div><div class=""><br class=""></div><div class="">If ARC Code Motion wishes to move `load_strong`, `store_strong` instructions, it</div><div class="">must now consider read/write effects. On the other hand, it will be able to now</div><div class="">not consider the side-effects of destructors when moving retain/release</div><div class="">operations.</div><div class=""><br class=""></div><div class="">### Normal Code Motion</div><div class=""><br class=""></div><div class="">Normal code motion will lose some effectiveness since many of the load/store</div><div class="">operations that it used to be able to move now must consider ARC information. We</div><div class="">may need to consider running ARC code motion earlier in the pipeline where we</div><div class="">normally run Normal Code Motion to ensure that we are able to handle these</div><div class="">cases.</div><div class=""><br class=""></div><div class="">### ARC Optimization</div><div class=""><br class=""></div><div class="">The main implication for ARC optimization is that instead of eliminating just</div><div class="">retains, releases, it must be able to recognize `load_strong`, `store_strong`</div><div class="">and set their flags as appropriate.</div><div class=""><br class=""></div><div class="">### Function Signature Optimization</div><div class=""><br class=""></div><div class="">Semantic ARC affects function signature optimization in the context of the owned</div><div class="">to guaranteed optimization. Specifically:</div><div class=""><br class=""></div><div class="">1. A `store_strong` must be recognized as a release of the old value that is</div><div class=""> being overridden. 
In such a case, we can move the `release` of the old value</div><div class=""> into the caller and change the `store_strong` into a `store_strong</div><div class=""> [init]`.</div><div class="">2. A `load_strong` must be recognized as a retain in the callee. Then function</div><div class=""> signature optimization will transform the `load_strong` into a `load_strong</div><div class=""> [guaranteed]`. This would require the addition of a new `@guaranteed` return</div><div class=""> value convention.</div><div class=""><br class=""></div><div class=""># Appendix</div><div class=""><br class=""></div><div class="">## Partial Initialization of Loadable References in SIL</div><div class=""><br class=""></div><div class="">In SIL, a value of non-trivial loadable type is loaded from a memory location as</div><div class="">follows:</div><div class=""><br class=""></div><div class=""> %x = load %x_ptr : $*S</div><div class=""> ...</div><div class=""> retain_value %x : $S</div><div class=""><br class=""></div><div class="">At first glance, this looks reasonable, but in truth there is a hidden drawback:</div><div class="">the partially initialized zone in between the load and the retain</div><div class="">operation. This zone creates a period of time when an "evil optimizer" could</div><div class="">insert an instruction that causes x to be deallocated before the copy is</div><div class="">finished being initialized. Similar issues come up when trying to perform a</div><div class="">store of a non-trivial value into a memory location.</div><div class=""><br class=""></div><div class="">Since this sort of partial initialization is allowed in SIL, the optimizer is</div><div class="">forced to be overly conservative when attempting to move releases past retains</div><div class="">lest the release trigger a deinit that destroys a value like `%x`. 
Lets look at</div><div class="">two concrete examples that show how semantically providing load_strong,</div><div class="">store_strong instructions eliminate this problem.</div><div class=""><br class=""></div><div class="">**NOTE** Without any loss of generality, we will speak of values with reference</div><div class="">semantics instead of non-trivial values.</div><div class=""><br class=""></div><div class="">## Case Study: Partial Initialization and load_strong</div><div class=""><br class=""></div><div class="">### The Problem</div><div class=""><br class=""></div><div class="">Consider the following swift program:</div><div class=""><br class=""></div><div class=""> func opaque_call()</div><div class=""><br class=""></div><div class=""> final class C {</div><div class=""> var int: Int = 0</div><div class=""> deinit {</div><div class=""> opaque_call()</div><div class=""> }</div><div class=""> }</div><div class=""><br class=""></div><div class=""> final class D {</div><div class=""> var int: Int = 0</div><div class=""> }</div><div class=""><br class=""></div><div class=""> var GLOBAL_C : C? = nil</div><div class=""> var GLOBAL_D : D? = nil</div><div class=""><br class=""></div><div class=""> func useC(_ c: C)</div><div class=""> func useD(_ d: D)</div><div class=""><br class=""></div><div class=""> func run() {</div><div class=""> let c = C()</div><div class=""> GLOBAL_C = c</div><div class=""> let d = D()</div><div class=""> GLOBAL_D = d</div><div class=""> useC(c)</div><div class=""> useD(d)</div><div class=""> }</div><div class=""><br class=""></div><div class="">Notice that both `C` and `D` have fixed layouts and separate class hierarchies,</div><div class="">but `C`'s deinit has a call to the function `opaque_call` which may write to</div><div class="">`GLOBAL_D` or `GLOBAL_C`. 
Additionally assume that both `useC` and `useD` are</div><div class="">known to the compiler to not have any affects on instances of type `D`, `C`</div><div class="">respectively and useC assigns `nil` to `GLOBAL_C`. Now consider the following</div><div class="">valid SIL lowering for `run`:</div><div class=""><br class=""></div><div class=""> sil_global GLOBAL_D : $D</div><div class=""> sil_global GLOBAL_C : $C</div><div class=""><br class=""></div><div class=""> final class C {</div><div class=""> var x: Int</div><div class=""> deinit</div><div class=""> }</div><div class=""><br class=""></div><div class=""> final class D {</div><div class=""> var x: Int</div><div class=""> }</div><div class=""><br class=""></div><div class=""> sil @useC : $@convention(thin) () -> ()</div><div class=""> sil @useD : $@convention(thin) () -> ()</div><div class=""><br class=""></div><div class=""> sil @run : $@convention(thin) () -> () {</div><div class=""> bb0:</div><div class=""> %c = alloc_ref $C</div><div class=""> %global_c = global_addr @GLOBAL_C : $*C</div><div class=""> strong_retain %c : $C</div><div class=""> store %c to %global_c : $*C (1)</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d = global_addr @GLOBAL_D : $*D</div><div class=""> strong_retain %d : $D</div><div class=""> store %d to %global_d : $*D (2)</div><div class=""><br class=""></div><div class=""> %c2 = load %global_c : $*C (3)</div><div class=""> strong_retain %c2 : $C (4)</div><div class=""> %d2 = load %global_d : $*D (5)</div><div class=""> strong_retain %d2 : $D (6)</div><div class=""><br class=""></div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () (7)</div><div class=""><br class=""></div><div class=""> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()</div><div class=""> apply %useD_func(%d2) : $@convention(thin) 
(@owned D) -> () (8)</div><div class=""><br class=""></div><div class=""> strong_release %d : $D (9)</div><div class=""> strong_release %c : $C (10)</div><div class=""> }</div><div class=""><br class=""></div><div class="">Let's optimize this function! First we perform the following operations:</div><div class=""><br class=""></div><div class="">1. Since `(2)` is storing to an identified object that cannot be `GLOBAL_C`, we</div><div class=""> can store-to-load forward `(1)` to `(3)`.</div><div class="">2. Since a retain does not block store-to-load forwarding, we can forward `(2)`</div><div class=""> to `(5)`. But let's, for the sake of argument, assume that the optimizer keeps</div><div class=""> such information as an analysis and does not perform the actual store->load</div><div class=""> forwarding.</div><div class="">3. Even though we do not forward `(2)` to `(5)`, we can still move `(4)` over</div><div class=""> `(6)` so that `(4)` is right before `(7)`.</div><div class=""><br class=""></div><div class="">This yields (using the ' marker to designate that a register has had store->load</div><div class="">forwarding applied to it):</div><div class=""><br class=""></div><div class=""> sil @run : $@convention(thin) () -> () {</div><div class=""> bb0:</div><div class=""> %c = alloc_ref $C</div><div class=""> %global_c = global_addr @GLOBAL_C : $*C</div><div class=""> strong_retain %c : $C</div><div class=""> store %c to %global_c : $*C (1)</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d = global_addr @GLOBAL_D : $*D</div><div class=""> strong_retain %d : $D</div><div class=""> store %d to %global_d : $*D (2)</div><div class=""><br class=""></div><div class=""> strong_retain %c : $C (4')</div><div class=""> %d2 = load %global_d : $*D (5)</div><div class=""> strong_retain %d2 : $D (6)</div><div class=""><br class=""></div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div 
class=""> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () (7')</div><div class=""><br class=""></div><div class=""> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()</div><div class=""> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () (8)</div><div class=""><br class=""></div><div class=""> strong_release %d : $D (9)</div><div class=""> strong_release %c : $C (10)</div><div class=""> }</div><div class=""><br class=""></div><div class="">Then by assumption, we know that `%useC_func` does not perform any releases of any</div><div class="">instances of class `D`. Thus `(6)` can be moved past `(7')` and we can then pair</div><div class="">and eliminate `(6)` and `(9)` via the rules of ARC optimization using the</div><div class="">analysis information that `%d2` and `%d` are the same due to the possibility of</div><div class="">performing store->load forwarding. After performing such transformations, `run`</div><div class="">looks as follows:</div><div class=""><br class=""></div><div class=""> sil @run : $@convention(thin) () -> () {</div><div class=""> bb0:</div><div class=""> %c = alloc_ref $C</div><div class=""> %global_c = global_addr @GLOBAL_C : $*C</div><div class=""> strong_retain %c : $C</div><div class=""> store %c to %global_c : $*C (1)</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d = global_addr @GLOBAL_D : $*D</div><div class=""> strong_retain %d : $D</div><div class=""> store %d to %global_d : $*D</div><div class=""><br class=""></div><div class=""> %d2 = load %global_d : $*D (5)</div><div class=""> strong_retain %c : $C (4')</div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () (7')</div><div class=""><br class=""></div><div class=""> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()</div><div class=""> apply %useD_func(%d2) : 
$@convention(thin) (@owned D) -> () (8)</div><div class=""><br class=""></div><div class=""> strong_release %c : $C (10)</div><div class=""> }</div><div class=""><br class=""></div><div class="">Now by assumption, we know that `%useD_func` does not touch any instances of</div><div class="">class `C`, and that `C` does not contain any ivars of type `D` and is final so none</div><div class="">can be added. At first glance, this seems to suggest that we can move `(10)`</div><div class="">before `(8)` and then pair/eliminate `(4')` and `(10)`. But is this a safe</div><div class="">optimization to perform? Absolutely not! Why? Remember that since `%useC_func`</div><div class="">assigns `nil` to `GLOBAL_C`, after `(7')`, `%c` could have a reference count</div><div class="">of 1. Thus `(10)` _may_ invoke the destructor of `C`. Since this destructor</div><div class="">calls an opaque function that _could_ potentially write to `GLOBAL_D`, we may be</div><div class="">passing `%d2`, an already deallocated object, to `%useD_func`: an illegal</div><div class="">optimization!</div><div class=""><br class=""></div><div class="">Let's think a bit more about this example and consider it at the</div><div class="">language level. Remember that while Swift's deinits are not asynchronous, we do</div><div class="">not allow user-level code to create dependencies from the body of the</div><div class="">destructor into the normal control flow that has called it. 
This means that</div><div class="">there are two valid results of this code:</div><div class=""><br class=""></div><div class="">- Operation Sequence 1: No optimization is performed and `%d2` is passed to</div><div class=""> `%useD_func`.</div><div class="">- Operation Sequence 2: We shorten the lifetime of `%c` before `%useD_func` and</div><div class=""> a different instance of `$D` is passed into `%useD_func`.</div><div class=""><br class=""></div><div class="">The fact that 1 occurs without optimization is just a result of an</div><div class="">implementation detail of SILGen. 2 is also a valid sequence of operations.</div><div class=""><br class=""></div><div class="">Given that:</div><div class=""><br class=""></div><div class="">1. As a principle, the optimizer does not consider such dependencies to avoid</div><div class=""> being overly conservative.</div><div class="">2. We provide ways to ensure appropriate lifetimes via</div><div class=""> constructs such as `fix_lifetime`.</div><div class=""><br class=""></div><div class="">We need to figure out how to express our optimization such that Operation Sequence 2</div><div class="">happens. 
Remember that one of the optimizations that we performed at the</div><div class="">beginning was to move `(6)` over `(7')`, i.e., transform this:</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d_addr = global_addr @GLOBAL_D : $*D</div><div class=""> %d2 = load %global_d_addr : $*D (5)</div><div class=""> strong_retain %d2 : $D (6)</div><div class=""><br class=""></div><div class=""> // Call the user functions passing in the instances that we loaded from the globals.</div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () (7')</div><div class=""><br class=""></div><div class="">into:</div><div class=""><br class=""></div><div class=""> %global_d_addr = global_addr @GLOBAL_D : $*D</div><div class=""> %d2 = load %global_d_addr : $*D (5)</div><div class=""><br class=""></div><div class=""> // Call the user functions passing in the instances that we loaded from the globals.</div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () (7')</div><div class=""> strong_retain %d2 : $D (6)</div><div class=""><br class=""></div><div class="">This transformation in Swift corresponds to transforming:</div><div class=""><br class=""></div><div class=""> let d = GLOBAL_D</div><div class=""> useC(c)</div><div class=""><br class=""></div><div class="">to:</div><div class=""><br class=""></div><div class=""> let d_raw = load_d_value(GLOBAL_D)</div><div class=""> useC(c)</div><div class=""> let d = take_ownership_of_d(d_raw)</div><div class=""><br class=""></div><div class="">This is clearly an instance where we have moved a side-effect in between the</div><div class="">loading of the data and the taking of ownership of such data, that is, before the</div><div class="">`let` is fully initialized. 
What if instead of just moving the retain, we moved</div><div class="">the entire `let` statement? This would then result in the following Swift code:</div><div class=""><br class=""></div><div class=""> useC(c)</div><div class=""> let d = GLOBAL_D</div><div class=""><br class=""></div><div class="">and would correspond to the following SIL snippet:</div><div class=""><br class=""></div><div class=""> %global_d_addr = global_addr @GLOBAL_D : $*D</div><div class=""><br class=""></div><div class=""> // Call the user functions passing in the instances that we loaded from the globals.</div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () (7')</div><div class=""> %d2 = load %global_d_addr : $*D (5)</div><div class=""> strong_retain %d2 : $D (6)</div><div class=""><br class=""></div><div class="">Moving the load together with the strong_retain, so that the full initialization is</div><div class="">still performed after code motion, causes our SIL to look as follows:</div><div class=""><br class=""></div><div class=""> sil @run : $@convention(thin) () -> () {</div><div class=""> bb0:</div><div class=""> %c = alloc_ref $C</div><div class=""> %global_c = global_addr @GLOBAL_C : $*C</div><div class=""> strong_retain %c : $C</div><div class=""> store %c to %global_c : $*C (1)</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d = global_addr @GLOBAL_D : $*D</div><div class=""> strong_retain %d : $D</div><div class=""> store %d to %global_d : $*D</div><div class=""><br class=""></div><div class=""> strong_retain %c : $C (4')</div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c) : $@convention(thin) (@owned C) -> () (7')</div><div class=""><br class=""></div><div class=""> %d2 = load %global_d : $*D (5)</div><div class=""> %useD_func = function_ref 
@useD : $@convention(thin) (@owned D) -> ()</div><div class=""> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () (8)</div><div class=""><br class=""></div><div class=""> strong_release %c : $C (10)</div><div class=""> }</div><div class=""><br class=""></div><div class="">This gives us the exact result that we want: Operation Sequence 2!</div><div class=""><br class=""></div><div class="">### Defining load_strong</div><div class=""><br class=""></div><div class="">Given that we wish the load and the retain to be tightly coupled together, it is natural</div><div class="">to express this operation as a `load_strong` instruction. Let's define the</div><div class="">`load_strong` instruction as follows:</div><div class=""><br class=""></div><div class=""> %1 = load_strong %0 : $*C</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> %1 = load %0 : $*C</div><div class=""> retain_value %1 : $C</div><div class=""><br class=""></div><div class="">Now let's transform our initial example to use this instruction:</div><div class=""><br class=""></div><div class="">Notice that the retains `(4)` and `(6)` are now folded into the loads `(3)` and `(5)`, giving the following SIL:</div><div class=""><br class=""></div><div class=""> sil @run : $@convention(thin) () -> () {</div><div class=""> bb0:</div><div class=""> %c = alloc_ref $C</div><div class=""> %global_c = global_addr @GLOBAL_C : $*C</div><div class=""> strong_retain %c : $C</div><div class=""> store %c to %global_c : $*C (1)</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d = global_addr @GLOBAL_D : $*D</div><div class=""> strong_retain %d : $D</div><div class=""> store %d to %global_d : $*D (2)</div><div class=""><br class=""></div><div class=""> %c2 = load_strong %global_c : $*C (3)</div><div class=""> %d2 = load_strong %global_d : $*D (5)</div><div class=""><br class=""></div><div class=""> %useC_func = function_ref @useC : $@convention(thin) 
(@owned C) -> ()</div><div class=""> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () (7)</div><div class=""><br class=""></div><div class=""> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()</div><div class=""> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () (8)</div><div class=""><br class=""></div><div class=""> strong_release %d : $D (9)</div><div class=""> strong_release %c : $C (10)</div><div class=""> }</div><div class=""><br class=""></div><div class="">We then perform the previous code motion:</div><div class=""><br class=""></div><div class=""> sil @run : $@convention(thin) () -> () {</div><div class=""> bb0:</div><div class=""> %c = alloc_ref $C</div><div class=""> %global_c = global_addr @GLOBAL_C : $*C</div><div class=""> strong_retain %c : $C</div><div class=""> store %c to %global_c : $*C (1)</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d = global_addr @GLOBAL_D : $*D</div><div class=""> strong_retain %d : $D</div><div class=""> store %d to %global_d : $*D (2)</div><div class=""><br class=""></div><div class=""> %c2 = load_strong %global_c : $*C (3)</div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () (7)</div><div class=""> strong_release %d : $D (9)</div><div class=""><br class=""></div><div class=""> %d2 = load_strong %global_d : $*D (5)</div><div class=""> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()</div><div class=""> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () (8)</div><div class=""> strong_release %c : $C (10)</div><div class=""> }</div><div class=""><br class=""></div><div class="">We then would like to eliminate `(9)` and `(10)` by pairing them with `(3)` and</div><div class="">`(5)`. Can we still do so? One way we could do this is by introducing the</div><div class="">`[take]` flag. 
The `[take]` flag on a `load_strong` says that one is semantically</div><div class="">loading a value from a memory location and taking ownership of the value,</div><div class="">thus eliding the retain. In terms of SIL this flag is defined as:</div><div class=""><br class=""></div><div class=""> %x = load_strong [take] %x_ptr : $*C</div><div class=""><br class=""></div><div class=""> =></div><div class=""><br class=""></div><div class=""> %x = load %x_ptr : $*C</div><div class=""><br class=""></div><div class="">Why do we care about having such a `load_strong [take]` instruction when we</div><div class="">could just use a `load`? The reason is that a normal `load` has unsafe</div><div class="">unowned ownership (i.e. it has no implications on ownership). We would like for</div><div class="">memory that has non-trivial type to only be able to be loaded via instructions</div><div class="">that maintain said ownership. We will allow for casting to trivial types as</div><div class="">usual to provide such access if it is required.</div><div class=""><br class=""></div><div class="">Thus we have achieved the desired result:</div><div class=""><br class=""></div><div class=""> sil @run : $@convention(thin) () -> () {</div><div class=""> bb0:</div><div class=""> %c = alloc_ref $C</div><div class=""> %global_c = global_addr @GLOBAL_C : $*C</div><div class=""> strong_retain %c : $C</div><div class=""> store %c to %global_c : $*C (1)</div><div class=""><br class=""></div><div class=""> %d = alloc_ref $D</div><div class=""> %global_d = global_addr @GLOBAL_D : $*D</div><div class=""> strong_retain %d : $D</div><div class=""> store %d to %global_d : $*D (2)</div><div class=""><br class=""></div><div class=""> %c2 = load_strong [take] %global_c : $*C (3)</div><div class=""> %useC_func = function_ref @useC : $@convention(thin) (@owned C) -> ()</div><div class=""> apply %useC_func(%c2) : $@convention(thin) (@owned C) -> () (7)</div><div class=""><br class=""></div><div 
class=""> %d2 = load_strong [take] %global_d : $*D (5)</div><div class=""> %useD_func = function_ref @useD : $@convention(thin) (@owned D) -> ()</div><div class=""> apply %useD_func(%d2) : $@convention(thin) (@owned D) -> () (8)</div><div class=""> }</div></div><div class=""><br class=""></div></div></blockquote></div></div></blockquote></div><br class=""></body></html>