[swift-dev] Weird dyn_cast code gen / performance problem
Michael Gottesman
mgottesman at apple.com
Mon Jan 1 20:47:56 CST 2018
I posted a quick comment there + the llvm ir. It seems like something that could be a simple SimplifyInstructions transformation.
Michael
> On Jan 1, 2018, at 6:37 PM, David Zarzycki <dave at znu.io> wrote:
>
>
>
>> On Jan 1, 2018, at 18:20, John McCall <rjmccall at apple.com> wrote:
>>
>>>
>>> On Jan 1, 2018, at 5:51 PM, David Zarzycki via swift-dev <swift-dev at swift.org> wrote:
>>>
>>> Hi Michael,
>>>
>>> I reduced it down to a simple test case. I was wrong about this requiring two or more dyn_casts. This actually affects any C++ code that uses the “if (auto x = y(z))” convention. What follows is the reduction (compiled with “clang++ -O3 -c” if it matters):
>>>
>>> // Uncomment the next line to see the expected code gen (albeit not inlined)
>>> //__attribute__((used,noinline))
>>> int *x(void *arg) {
>>> return ((long long)arg & 1) ? (int *)arg : nullptr;
>>> }
>>>
>>> int test(void *arg) {
>>> if (auto y = x(arg))
>>> return *y;
>>> return 42;
>>> }
>>>
>>> It seems like inlining ‘x’ causes the compiler to effectively generate the following pseudo-code:
>>>
>>> int test(void *arg) {
>>> if (arg != nullptr)
>>> if (arg & 1)
>>> return *arg;
>>> return 42;
>>> }
>>>
>>> Which is surprising in multiple ways and (as far as I can tell) difficult to workaround without lots of source churn.
>>>
>>> Where should I file a bug?
>>
>> bugs.llvm.org would be best. Including both your reduced test case and the fact that it was reduced from dyn_cast patterns should make them sit up and take notice.
>
> Thanks John,
>
> In case anybody wants to cherry-pick the fix once it is found:
> https://bugs.llvm.org/show_bug.cgi?id=35790
>
>> John.
>>
>>>
>>> Dave
>>>
>>>
>>>> On Jan 1, 2018, at 13:10, David Zarzycki via swift-dev <swift-dev at swift.org> wrote:
>>>>
>>>> I don’t have the IR handy. You can easily generate it for yourself though. Just drop the following into any file (I use swift/lib/AST/Type.cpp) and recompile swift.
>>>>
>>>> Decl *my_test_function(Type t) {
>>>> return t->getClassOrBoundGenericClass();
>>>> }
>>>>
>>>>
>>>>> On Jan 1, 2018, at 12:53, Michael Gottesman <mgottesman at apple.com> wrote:
>>>>>
>>>>> Do you have the llvm-ir handy?
>>>>>
>>>>>> On Jan 1, 2018, at 11:30 AM, David Zarzycki via swift-dev <swift-dev at swift.org> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I noticed recently that the code gen of CanType::getClassOrBoundGenericClass() could be better and along the way I found a clang/LLVM bug. Where exactly, I do not know, although my bet is the LLVM optimizer.
>>>>>>
>>>>>> When more than one dyn_cast() happens in a row, LLVM/clang emits redundant and pointless nullptr checks. Both Apple clang-900.0.39.2 and clang/llvm top-of-tree generate essentially the same code:
>>>>>>
>>>>>> <+35>: movb 0x8(%rbx), %cl ; getKind()
>>>>>> <+38>: testq %rbx, %rbx ; XXX - nullptr check after deref is pointless
>>>>>> <+41>: je 0x1377df6 ; <+54>
>>>>>> <+43>: cmpb $0x12, %cl ; isa<ClassType>()
>>>>>> <+46>: jne 0x1377df6 ; <+54>
>>>>>> <+48>: addq $0x10, %rbx ; (void*)this + offsetof(ClassType, TheDecl)
>>>>>> <+52>: jmp 0x1377e06 ; <+70>
>>>>>> <+54>: xorl %eax, %eax ; the default return value (nullptr)
>>>>>> <+56>: testq %rbx, %rbx ; XXX - another pointless nullptr check?
>>>>>> <+59>: je 0x1377e09 ; <+73>
>>>>>> <+61>: cmpb $0x29, %cl ; isa<BoundGenericClassType>()
>>>>>> <+64>: jne 0x1377e09 ; <+73>
>>>>>> <+66>: addq $0x18, %rbx ; (void*)this + offsetof(BoundGenericClassType, TheDecl)
>>>>>> <+70>: movq (%rbx), %rax ; load the decl pointer
>>>>>> <+73>: popq %rbx
>>>>>> <+74>: retq
>>>>>>
>>>>>> I’ve tried adding different “nonnull” spellings in various parts of both Swift and LLVM’s casting machinery, but with no luck. The only thing that seems to work is to create a free function that takes a non-null “const TypeBase *” parameter and then have CanType::getClassOrBoundGenericClass() call that.
>>>>>>
>>>>>> FWIW – I *suspect* this is because LLVM’s casting machinery internally converts traditional pointers into C++ references before ultimately calling classof(&Val).
>>>>>>
>>>>>> Before I file a bug against clang/llvm, might I be missing something? Can anybody think of a good workaround?
>>>>>>
>>>>>> Dave
>>>>>> _______________________________________________
>>>>>> swift-dev mailing list
>>>>>> swift-dev at swift.org
>>>>>> https://lists.swift.org/mailman/listinfo/swift-dev
>>>>>
>>>>
>>>> _______________________________________________
>>>> swift-dev mailing list
>>>> swift-dev at swift.org
>>>> https://lists.swift.org/mailman/listinfo/swift-dev
>>>
>>> _______________________________________________
>>> swift-dev mailing list
>>> swift-dev at swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-dev
More information about the swift-dev
mailing list