[swift-dev] [arc optimization] Why doesn't enum destructuring use guaranteed references?
Félix Cloutier
felixcloutier at icloud.com
Fri Dec 29 15:58:35 CST 2017
Thanks Michael! Other people might have different opinions, but knowing that a "definitive" fix is on the way is plenty for me and I'm not looking to make a fuss.
If you still have some time, I have another question or two. You said that only self is passed at +0. Does this mean, for instance, that custom operators would still get +1 Expr values? It seems to me that if you receive any boxed value at +0, then it's always safe to also pass it further at +0 since it is immutable. That would have a huge impact on this program. Are there downsides to more pervasive +0 parameters?
Félix
> Le 29 déc. 2017 à 15:42, Michael Gottesman <mgottesman at apple.com> a écrit :
>
>
>
>> On Dec 28, 2017, at 4:27 PM, Chris Lattner via swift-dev <swift-dev at swift.org> wrote:
>>
>> <moving to swift-dev, where most of the perf optimization people lurk>
>>
>> This is a great question, I’m not sure what the answer is: maybe it is a simple case missed by the arc optimizer?
>
> It is a bit more complicated than that. There are codegen aspects and optimization aspects. In both cases there are "structural issues" that I am currently in the process of fixing as part of the "semantic sil" effort. There are some retain/release pairs that we could eliminate by teaching the optimizer about enums/indirect boxes. I go over all of this below.
>
> TLDR: There are structural issues here that make this difficult/unsafe in the general case. Semantic SIL eliminates all of the structural issues with ARC and makes this problem (among other problems) "trivial", so we have been focusing our engineering time on implementing that. That being said, we could use the effects analysis to handle this trivial case. But it would have to be very conservative. Chris, feel like doing some ARC Optimization? = p.
>
> # Codegen Aspects
>
> ## Background Info: Local Retain/Release Pairs
>
> If one does not know about local vs non-local retain/release pairs, see the appendix for an explanation.
>
> ## +0 Self
>
> In Swift today all parameters except for self are passed at +1. Self is passed at +0. Even though this is true, often times SILGen will be conservative around self and will emit a copy/destroy around the call site. This is actually better since it eliminates a non-local retain/release pair in the callee/caller favor of a local retain/release pair in the caller that can be eliminated by the optimizer without any form of inter-procedural optimization. That is why there is a retain/release at the count call. I will go into why we are not optimizing it below in the optimization section.
>
> ## Pattern Matching at +1
>
> All pattern matching in Swift today is done at +1. That means that in the following Swift:
>
> ----
> enum Expr: CustomStringConvertible {
> case Int(n: Int)
> indirect case Var(x: String)
> indirect case Add(f: Expr, g: Expr)
> indirect case Mul(f: Expr, g: Expr)
> indirect case Pow(f: Expr, g: Expr)
> indirect case Ln(f: Expr)
>
> ...
>
> var count: Int {
> switch self {
> case .Int, .Var: return 1
> case let .Ln(f): return f.count
> case let .Add(f, g), let .Mul(f, g), let .Pow(f, g):
> return f.count + g.count
> }
> }
> }
> ----
>
> even though self is passed at +0 to count, SILGen will emit a retain before the switch. This is unfortunate and adds unnecessary ARC traffic in read only methods on enums. There is no reason not to emit switches that way, so as part of the work to make SILGenPattern able to pass the ownership verifier, I plan on changing all pattern emission to be done at +0. As an added benefit once that work is complete, refactorings in SILGenPattern will be exposed that will enable us to significantly simplify SILGenPattern's handling of cleanups. This complex code has lead to obscure bugs in the past and is IMO overly complex.
>
> # Optimizer Aspects
>
> Lets start by looking at count.getter. Here is the SIL in CFG form:
>
> <count.getter.pdf>
>
> Lets analyze bb4 (I think bb5 suffers from similar issues).
>
> The retain, release pair on %0 here suffers from the turning off of the guaranteed optimization. The guaranteed optimization says that any retain/release pairs on a guaranteed value can be eliminated if they are "semantic pairings". We had to turn off this optimization since without semantic-sil we were guessing what retain/release pairings were "semantic pairings". In certain cases this would result in mispairings so we had to turn it off in the general case. Specifically, consider the following straw man:
>
> ----
> strong_retain %x
> ...
> strong_release obfuscated(%x)
> ...
> strong_retain obfuscated'(%x)
> ...
> strong_release %x
> ----
>
> in this case we would pair/remove the outer retain, release pair yielding:
>
> ----
> strong_release obfuscated(%x)
> ...
> strong_retain obfuscated'(%x)
> ----
>
> While the reference count on %x is still balanced, since the retain count balancing is in inverse order this can result in pre-mature releases and use-after-frees. Once semantic-sil is complete, all pairings are explicit in the IR so such a bug can not happen. To work around that problem, today we are conservative and require any retain/release pair around a guaranteed parameter to be +1 before and have a use afterwards.
>
> The second retain, release pair (i.e. %13) is slightly different since this suffers from the optimizer needing to make a connection in between the box in the enum and the value loaded from the box. Today we have enough knowledge in the optimizer to make the first connection without issue via the RCIdentity analysis. We do not make the connection though in between the box being guaranteed and the value loaded from the box being effectively guaranteed as a result. Even if we made that connection today, I am not sure if in the general case it would be safe. With semantic-sil all of those problems go away.
>
> For both of these, I think that by using appropriate inter procedural effects analysis (which we have), we could teach the ARC optimizer to be more aggressive around guaranteed. The specific properties would be that:
>
> 1. No side-effects (ignoring retain/releases).
> 2. No retains/releases that are unbalanced.
> 3. No escapes.
>
> I would have to think in more detail in order to be sure though.
>
> # Appendix
>
> ## Local Retain/Release Pairs
>
> There are two types of retain/release pairs in Swift, function local and non-local. An example of a local retain/release is when one creates a new binding, i.e.:
>
> ----
> func f(x: Klass) {
> let xhat = x
> ...
> }
> ----
>
> When the assignment to xhat occurs we retain x and at the end of the function (assuming that xhat is not used), we release xhat. Since the retain/release pair are both in f, we are able to potentially pair and eliminate them without needing to modify the call graph. In contrast, a function non-local pairing is created when a parameter is passed at +1, e.g.:
>
> ----
> func g(x: Klass) { ... x is used but not consumed ... }
> func f(x: Klass) {
> g(x)
> }
> ----
>
> In this case when we call g (ignoring any peepholes) we retain x first in f (the callee) and then release x in g (the caller). This is bad for optimization purposes since it means that we can not eliminate the retain/release pair without modifying the call graph or inlining. For non-resilient functions, we of course can/do inline and modify the callgraph via function signature optimization. In the case of resilient functions this is impossible by the nature of resilience.
>
>>
>> -Chris
>>
>>
>>
>>> On Dec 27, 2017, at 9:19 PM, Félix Cloutier via swift-evolution <swift-evolution at swift.org> wrote:
>>>
>>> Running this on my MBP with 10 as the parameter: https://gist.github.com/zneak/ae33bb970a08632cfb2925e2049f9e7a
>>>
>>> I get a runtime of about 10 seconds, 45% of which is spent in retain/release calls according to Instruments (!!), and at least half of that from Expr.count. Looking at the IR, Swift generously sprinkles retain/release calls through the outlined copy method:
>>>
>>> • `self` is retained at the beginning of `count`
>>> • The values that you get out of destructuring are retained
>>> • Of course, when you get `count` on these, they are retained again
>>>
>>> Of course, Expr.count cannot modify itself or its subexpressions because the functions are not mutating; in fact, no function in that program can mutate an enum case. Why, then, is Swift retaining/releasing `self` and the values obtained from destructured patterns? They can't go away. Shouldn't we be getting guaranteed references instead of owning references?
>>>
>>> That seems to hit pattern-matching-heavy programs with indirect cases pretty hard, and it's pretty frustrating because that seems to be about the nicest way to write this program, and there's no workaround from the developer's perspective. I don't think that this is a fatal flaw of refcounting, but unless I'm missing something, that's sub-par behavior.
>>>
>>> Félix
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>
>> _______________________________________________
>> swift-dev mailing list
>> swift-dev at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-dev
>
More information about the swift-dev
mailing list