[swift-dev] question about performance of dispatches on existentials

Sat Jul 8 09:07:37 CDT 2017

Thanks very much Arnold, also for filing the bug!

> On 7 Jul 2017, at 8:07 pm, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> 
> The optimizer fails to recognize that two loads return the same value, so it can’t remove a retain/release pair.
> 
> 
> // protocol witness for B.foo(_:) in conformance OtherB
> sil shared [transparent] [serialized] [thunk] @_T04test6OtherBCAA1BA2aDP3fooyAA1ACFTW : $@convention(witness_method) (@owned A, @in_guaranteed OtherB) -> () {
> // %0                                             // user: %7
> // %1                                             // user: %3
> bb0(%0 : $A, %1 : $*OtherB):
>   %2 = alloc_stack $OtherB                        // users: %9, %4, %11, %7
>   %3 = load %1 : $*OtherB                         // users: %6, %4
>   store %3 to %2 : $*OtherB                       // id: %4
>   // function_ref B.foo(_:)
>   %5 = function_ref @_T04test1BPAAE3fooyAA1ACF : $@convention(method) <τ_0_0 where τ_0_0 : B> (@owned A, @in_guaranteed τ_0_0) -> () // user: %7
>   strong_retain %3 : $OtherB                      // id: %6
>   %7 = apply %5<OtherB>(%0, %2) : $@convention(method) <τ_0_0 where τ_0_0 : B> (@owned A, @in_guaranteed τ_0_0) -> ()
>   %8 = tuple ()                                   // user: %12
>   %9 = load %2 : $*OtherB                         // user: %10
>   strong_release %9 : $OtherB                     // id: %10
>   dealloc_stack %2 : $*OtherB                     // id: %11
>   return %8 : $()                                 // id: %12
> } // end sil function '_T04test6OtherBCAA1BA2aDP3fooyAA1ACFTW'
> 
> If load-store forwarding could tell that the apply does not write to the alloc_stack (it could, because @in_guaranteed guarantees no write), I would expect it to mem-promote this. ARC could then remove the retain/release pair (AFAICT).
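
To check my understanding of the SIL above, here is a rough Swift-level model of what the witness thunk does. It is only a sketch: `Counted`, `genericFoo`, `thunkModel` and the manual counters are hypothetical names I made up for illustration, not compiler output, and the real retain/release is of course done by the runtime. The point is that the callee only reads the stack slot, so the reload must return the same reference that was stored, and the retain and release act on one object.

```swift
// Hypothetical model of the witness thunk; all names are made up for illustration.
final class Counted {
    var retains = 0
    var releases = 0
}

// Stand-in for the generic B.foo(_:). It only receives a read-only pointer
// (mirroring @in_guaranteed) and never writes to the slot.
func genericFoo(_ selfSlot: UnsafePointer<Counted>) {}

// Mirrors the SIL: load, store to a stack temp, retain, apply, reload, release.
// Returns whether both loads produced the same reference.
func thunkModel(_ selfPtr: UnsafePointer<Counted>) -> Bool {
    let first = selfPtr.pointee   // %3 = load %1
    var temp = first              // store %3 to %2 (the alloc_stack)
    first.retains += 1            // strong_retain %3
    genericFoo(&temp)             // apply %5<OtherB>(%0, %2)
    let second = temp             // %9 = load %2
    second.releases += 1          // strong_release %9
    return first === second
}

var modelObject = Counted()
let sameObject = thunkModel(&modelObject)
print(sameObject, modelObject.retains, modelObject.releases)
```

If the optimizer can prove what `sameObject` shows here, the retain and the release cancel out and both can go away.
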
> 
> 
> https://bugs.swift.org/browse/SR-5403
> 
> 
>> On Jul 7, 2017, at 11:27 AM, Johannes Weiß via swift-dev <swift-dev at swift.org> wrote:
>> 
>> Hi swift-dev,
>> 
>> If I have basically this program (see the full program at the end of this mail)
>> 
>> public class A { func bar() { ... }}
>> public protocol B {
>>    func foo(_ a: A)
>> }
>> extension B {
>>    func foo(_ a: A) { a.bar() }
>> }
>> public class ActualB: B {
>> }
>> public class OtherB: B {
>> }
>> func abc() {
>>    let b: B = makeB()
>>    b.foo(a)
>> }
>> 
>> I get the following call frames when running it (compiled with `swiftc -O -g -o test test.swift`):
>> 
>>    frame #1: 0x0000000100001dbf test`specialized A.bar() at test.swift:6 [opt]
>>    frame #2: 0x0000000100001e6f test`specialized B.foo(_:) [inlined] test.SubA.bar() -> () at test.swift:0 [opt]
>>    frame #3: 0x0000000100001e6a test`specialized B.foo(a=<unavailable>) at test.swift:23 [opt]
>>    frame #4: 0x0000000100001a6e test`B.foo(_:) at test.swift:0 [opt]
>>    frame #5: 0x0000000100001b3e test`protocol witness for B.foo(_:) in conformance OtherB at test.swift:0 [opt]
>>    frame #6: 0x0000000100001ccd test`abc() at test.swift:45 [opt]
>>    frame #7: 0x0000000100001969 test`main at test.swift:48 [opt]
>> 
>> 1, 6, and 7 are obviously totally fine and expected.
>> 
>> In 6 we are also building and destroying an existential box, which is also understandable and fine.
>> 
>> But there are two things I don't quite understand:
>> 
>> I) Why (in 5) will the existential container be retained and released?
>> 
>> --- SNIP ---
>>                     __T04test6OtherBCAA1BA2aDP3fooyAA1ACFTW:        // protocol witness for test.B.foo(test.A) -> () in conformance test.OtherB : test.B in test
>> 0000000100001b20         push       rbp                                         ; CODE XREF=__T04test7ActualBCAA1BA2aDP3fooyAA1ACFTW+4
>> 0000000100001b21         mov        rbp, rsp
>> 0000000100001b24         push       r14
>> 0000000100001b26         push       rbx
>> 0000000100001b27         mov        r14, rdi
>> 0000000100001b2a         mov        rbx, qword [r13]
>> 0000000100001b2e         mov        rdi, rbx
>> 0000000100001b31         call       _swift_rt_swift_retain
>> 0000000100001b36         mov        rdi, r14                                    ; argument #1 for method __T04test1BPAAE3fooyAA1ACF
>> 0000000100001b39         call       __T04test1BPAAE3fooyAA1ACF                  ; (extension in test):test.B.foo(test.A) -> ()
>> 0000000100001b3e         mov        rdi, rbx
>> 0000000100001b41         pop        rbx
>> 0000000100001b42         pop        r14
>> 0000000100001b44         pop        rbp
>> 0000000100001b45         jmp        _swift_rt_swift_release
>>                        ; endp
>> --- SNAP ---
>> 
>> II) Why are 2, 3, 4 and 5 not one stack frame? It seems we could just JMP from one to the next. Sure, in 5 the call is surrounded by a retain/release, but in the others we could just JMP.
>> 
>> 
>> We see quite a measurable performance issue in a project we're working on (email me directly for details/code) and so I thought I'd ask because I'd like to understand why this is all needed (if it is).
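
One thing I could try on my side, as an assumption and not a measured result: take the callee as a generic parameter instead of as the existential `B`, which at least gives the optimizer the chance to specialize the call for the concrete type. This is a minimal sketch; `abcGeneric` and the `calls` counter are hypothetical additions that are not part of the original program.

```swift
// Sketch only: `abcGeneric` and the `calls` counter are hypothetical additions.
public class A {
    public private(set) var calls = 0
    public init() {}
    public func bar() { calls += 1 }
}

public protocol B {
    func foo(_ a: A)
}

public extension B {
    func foo(_ a: A) { a.bar() }
}

public final class OtherB: B {
    public init() {}
}

// Generic over the conforming type instead of taking the existential `B`.
func abcGeneric<T: B>(_ b: T, _ a: A) {
    b.foo(a)
}

let sketchA = A()
abcGeneric(OtherB(), sketchA)
print(sketchA.calls)
```

This only helps where the concrete type is known statically at the call site. In the test program `makeB()` picks the type at runtime, so the existential would still be needed there.
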
>> 
>> 
>> Many thanks,
>>  Johannes
>> 
>> --- SNIP ---
>> import Darwin
>> 
>> public class A {
>>    @inline(never)
>>    public func bar() {
>>        print("bar")
>>    }
>> }
>> public class SubA: A {
>>    @inline(never)
>>    public override func bar() {
>>        print("bar")
>>    }
>> }
>> 
>> public protocol B {
>>    func foo(_ a: A)
>> }
>> 
>> public extension B {
>>    @inline(never)
>>    func foo(_ a: A) {
>>        a.bar()
>>    }
>> }
>> 
>> public class ActualB: B {
>> }
>> 
>> public class OtherB: B {
>> }
>> 
>> public func makeB() -> B {
>>    if arc4random() == 1231231 {
>>        return ActualB()
>>    } else {
>>        return OtherB()
>>    }
>> }
>> 
>> @inline(never)
>> func abc() {
>>    let a = SubA()
>>    let b: B = makeB()
>>    b.foo(a)
>> }
>> 
>> abc()
>> --- SNAP ---
>> 