[swift-users] Why are Swift Loops slow?
Joe Groff
jgroff at apple.com
Wed Oct 12 11:05:24 CDT 2016
> On Oct 12, 2016, at 2:25 AM, Gerriet M. Denkmann via swift-users <swift-users at swift.org> wrote:
>
> uint64_t nbrBytes = 4e8;
> uint64_t count = 0;
> for( uint64_t byteIndex = 0; byteIndex < nbrBytes; byteIndex++ )
> {
> count += byteIndex;
> if ( ( byteIndex & 0xffffffff ) == 0 ) { count += 1.3; } (AAA)
> };
>
> Takes 260 msec.
>
> Btw.: Without the (AAA) line the whole loop is done in 10 μsec. A really clever compiler!
> And with “count += 1” instead of “count += 1.3” it takes 410 msec. Very strange.
> But this is beside the point here.
>
>
> Now Swift:
> let nbrBytes = 400_000_000
> var count = 0
> for byteIndex in 0 ..< nbrBytes
> {
> count += byteIndex
> if ( ( byteIndex & 0xffffffff ) == 0 ) {count += Int(1.3);}
> }
>
> takes 390 msec - about 50 % more.
>
> Release build with default options.
This is a useless benchmark because the loop does no observable work in either language. The C version, if you look at the generated assembly, in fact optimizes away to nothing:
~/src/s/swift$ clang ~/butt.c -O3 -S -o -
        .section        __TEXT,__text,regular,pure_instructions
        .macosx_version_min 10, 9
        .globl  _main
        .p2align        4, 0x90
_main:                                  ## @main
        .cfi_startproc
## BB#0:                                ## %entry
        pushq   %rbp
Ltmp0:
        .cfi_def_cfa_offset 16
Ltmp1:
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
Ltmp2:
        .cfi_def_cfa_register %rbp
        xorl    %eax, %eax
        popq    %rbp
        retq
        .cfi_endproc
It looks like LLVM does not recognize the overflow traps Swift emits on arithmetic operations as dead code, which prevents it from completely eliminating the Swift loop. That's a bug worth fixing in Swift, but it's unlikely to make a major difference in real, non-dead code. However, if we make the work useful by factoring the loop into a function in both languages, so that the result is actually used, the perf difference is unmeasurable. Try comparing:
#include <stdint.h>

__attribute__((noinline))
uint64_t getCount(uint64_t nbrBytes) {
    uint64_t count = 0;
    for( uint64_t byteIndex = 0; byteIndex < nbrBytes; byteIndex++ )
    {
        count += byteIndex;
        if ( ( byteIndex & 0xffffffff ) == 0 ) { count += 1.3; }
    };
    return count;
}

int main() {
    uint64_t nbrBytes = 4e8;
    return getCount(nbrBytes);
}
with:
import Darwin

@inline(never)
func getCount(nbrBytes: Int) -> Int {
    var count = 0
    for byteIndex in 0 ..< nbrBytes
    {
        count += byteIndex
        if ( ( byteIndex & 0xffffffff ) == 0 ) {count += Int(1.3);}
    }
    return count
}

exit(Int32(truncatingBitPattern: getCount(nbrBytes: 400_000_000)))
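
If you want a feel for what the overflow traps add to the loop, a rough C approximation is to route every addition through a checked add. This is only a sketch under stated assumptions: `checked_add` and `getCountChecked` are hypothetical names, `__builtin_add_overflow` is a GCC/Clang builtin, and `abort()` merely stands in for a trap. It mirrors the idea of Swift's checked `+` on `Int`; it is not what the Swift compiler actually emits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: signed 64-bit add that aborts on overflow,
   roughly the behavior of `+` on a Swift Int. */
static int64_t checked_add(int64_t a, int64_t b) {
    int64_t result;
    if (__builtin_add_overflow(a, b, &result)) {
        abort();  /* stand-in for the overflow trap */
    }
    return result;
}

/* Same loop as getCount above, with each addition checked.
   Int(1.3) in the Swift version truncates to 1, so 1 is added here. */
__attribute__((noinline))
int64_t getCountChecked(int64_t nbrBytes) {
    int64_t count = 0;
    for (int64_t byteIndex = 0; byteIndex < nbrBytes; byteIndex++) {
        count = checked_add(count, byteIndex);
        if ((byteIndex & 0xffffffff) == 0) { count = checked_add(count, 1); }
    }
    return count;
}
```

Calling `getCountChecked(400000000)` from `main` and returning or printing the result keeps the work observable, in the same way as the two programs above. The `abort()` call is a side effect the optimizer has to preserve unless it can prove the overflow never happens, which is the kind of obstacle that keeps the dead loop from disappearing.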
-Joe