[swift-dev] Miscompile with protocol compositions

Slava Pestov spestov at apple.com
Thu Apr 27 02:35:17 CDT 2017


Hi all,

I’ve spent most of the last two days debugging various issues with property and subscript accesses on class-constrained existentials and I’ve just now realized the root cause is a much more fundamental issue.

Consider the following code:

protocol P {
  var x: Int { get set }

  init()
}

func takesP(p: inout P & AnyObject) {
  p.x = 1
}

Note that ‘takesP’ has a class-constrained existential type as a parameter. You can replace AnyObject with any class protocol, it won’t change anything below.

Also, the init() requirement in the protocol will come into play later — for now, just consider the ‘var x’, which has a mutating setter.

The above code compiles fine. Let’s take a look at the SILGen though:

  %2 = load [copy] %0 : $*AnyObject & P
  %3 = open_existential_ref %2 : $AnyObject & P to $@opened("52860F0E-2B19-11E7-8595-34363BD08DA0") AnyObject & P
...
  %8 = alloc_stack $@opened("52860F0E-2B19-11E7-8595-34363BD08DA0") AnyObject & P
  store %3 to [init] %8 : $*@opened("52860F0E-2B19-11E7-8595-34363BD08DA0") AnyObject & P
  %10 = witness_method $@opened("52860F0E-2B19-11E7-8595-34363BD08DA0") AnyObject & P, #P.x!setter.1, %3 : $@opened("52860F0E-2B19-11E7-8595-34363BD08DA0") AnyObject & P : $@convention(witness_method) <τ_0_0 where τ_0_0 : P> (Int, @inout τ_0_0) -> ()
  %11 = apply %10<@opened("52860F0E-2B19-11E7-8595-34363BD08DA0") AnyObject & P>(%7, %8) : $@convention(witness_method) <τ_0_0 where τ_0_0 : P> (Int, @inout τ_0_0) -> ()

So we load the ‘AnyObject & P’ existential value from our inout argument, then in order to call the protocol requirement for x’s setter, which takes the ‘self’ value as an indirect argument, we allocate a box on the stack, store the reference into the box, call the requirement with the address of the box as the value of ‘self’, and destroy the box. We never write the result back to the inout.

So consider the following implementation of a concrete type conforming to P:

extension P {
  var x: Int {
    get { return 0 }
    set {
      self = Self()
    }
  }
}

class C : P {
  required init() {}
}

Note that the mutating setter for ‘x’ assigns to ‘self’, which is totally fine — it’s defined in a protocol extension of a non-class protocol P.

Let’s try to use our concrete class type with our function takesP():

let c = C()
var cc = c

takesP(p: &cc)

print(c === cc)

When you compile and run this, it outputs ‘true’, which is not what you would expect from reading the code, since the setter re-assigns ‘self’.

Ok, maybe we can just tell people that if your class conforms to a non-class protocol, then assignments to ‘self’ inside implementations of the protocol requirement are not allowed, and are silently dropped.

Now look at how we compile this version of takesP:

extension P {
  mutating func foo() {
    self = Self()
  }
}

func takesP(p: inout P & AnyObject) {
  p.foo()
}

Here instead of calling a mutating setter, we call a mutating method.

If we look at the SILGen, there is no stack allocation anymore, instead we pass the class reference directly as an inout argument, which is wrong:

  %7 = open_existential_ref %6 : $AnyObject & P to $@opened("F78B53CA-2B18-11E7-A73C-34363BD08DA0") AnyObject & P
  %8 = witness_method $@opened("F78B53CA-2B18-11E7-A73C-34363BD08DA0") AnyObject & P, #P.foo!1, %7 : $@opened("F78B53CA-2B18-11E7-A73C-34363BD08DA0") AnyObject & P : $@convention(witness_method) <τ_0_0 where τ_0_0 : P> (@inout τ_0_0) -> ()
  %9 = apply %8<@opened("F78B53CA-2B18-11E7-A73C-34363BD08DA0") AnyObject & P>(%7) : $@convention(witness_method) <τ_0_0 where τ_0_0 : P> (@inout τ_0_0) -> ()

In a no-asserts build, like Swift 3.1 GM, this produces a binary which crashes at runtime. In an asserts build, we hit an assertion in Sema:

Assertion failed: (fromType->is<LValueType>() && "Can only convert lvalues to inout"), function coerceObjectArgumentToType, file /Users/slava/new/swift/lib/Sema/CSApply.cpp, line 6336.

There seem to be two fundamental problems here:

1) Sema models the base expression of a mutating member as an RValue, because it checks if the base expression type has reference semantics, not the *self type of the member*.

2) SIL cannot express an inout access performed on the payload of class existential.

#1 is easy enough to fix, and allows my code examples involving subclass existentials to type check without hitting the LValue assertion.

#2 is more fundamental. At first I thought it would be sufficient to change SILGen to only use class existential representation for protocol compositions where *all* members are class-constrained, instead of protocol compositions where *any* member is class-constrained. However this is insufficient because a class-constrained protocol can refine a non-class constrained protocol.

We could say that a class-constrained protocol only uses class existential representation if all protocols it inherits also use class existential representation. Basically, split up ProtocolDecl::requiresClass() into two predicates — the current requiresClass() that expresses a constraint in the formal type system, and a stricter requiresNonMutatingSelfAccess() or similar which expresses that requirements and extension methods cannot be ‘mutating’. The latter would only be true if all refined protocols are also class constrained, and would be the condition for using a class existential representation.

Now the reason the first example generates valid (but surprising) SIL, while the second one crashes, is that we seem to have a hack somewhere to handle mutating property access on a non-mutating value by consing up a box, copying the value into the box, and tearing down the box. We could conceivably generalize this hack to work on mutating methods as well, but it would lead to the same surprising behavior where the assignment to ’self’ is ‘lost'.

However, there would be no loss of efficiency in that case, as we would still use the class existential representation in all the cases where we can currently use it.

Another potential fix is to simply disallow mutating methods from being called on class-constrained existentials altogether, but that is a source breaking change (even though the old behavior was extremely flaky).

Any thoughts?

Slava


More information about the swift-dev mailing list