[swift-users] Why are Swift Loops slow?

Joe Groff jgroff at apple.com
Wed Oct 12 11:05:24 CDT 2016


> On Oct 12, 2016, at 2:25 AM, Gerriet M. Denkmann via swift-users <swift-users at swift.org> wrote:
> 
> uint64_t nbrBytes = 4e8;
> uint64_t count = 0;
> for( uint64_t byteIndex = 0; byteIndex < nbrBytes; byteIndex++ )
> {
> 	count += byteIndex;
> 	if ( ( byteIndex & 0xffffffff ) == 0 ) { count += 1.3; }  (AAA) 
> };
> 
> Takes 260 msec.
> 
> Btw.: Without the (AAA) line the whole loop is done in 10 μsec. A really clever compiler!
> And with “count += 1” instead of “count += 1.3” it takes 410 msec. Very strange. 
> But this is beside the point here.
> 
> 
> Now Swift:
> let nbrBytes = 400_000_000
> var count = 0
> for byteIndex in 0 ..< nbrBytes
> {
> 	count += byteIndex
> 	if ( ( byteIndex & 0xffffffff ) == 0 ) {count += Int(1.3);}
> }
> 
> takes 390 msec - about 50 % more.
> 
> Release build with default options.

This is a useless benchmark because the loop does no observable work in either language. The C version, if you look at the generated assembly, in fact optimizes away to nothing:

~/src/s/swift$ clang ~/butt.c -O3  -S -o -
	.section	__TEXT,__text,regular,pure_instructions
	.macosx_version_min 10, 9
	.globl	_main
	.p2align	4, 0x90
_main:                                  ## @main
	.cfi_startproc
## BB#0:                                ## %entry
	pushq	%rbp
Ltmp0:
	.cfi_def_cfa_offset 16
Ltmp1:
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
Ltmp2:
	.cfi_def_cfa_register %rbp
	xorl	%eax, %eax
	popq	%rbp
	retq
	.cfi_endproc

It looks like LLVM does not recognize the overflow traps Swift emits on arithmetic operations as dead code, which prevents it from completely eliminating the Swift loop. That's a bug worth fixing in Swift, but it is unlikely to make a major difference in real, non-dead code. However, if we make the work observable by factoring the loop into a function in both languages and using its result, the performance difference is unmeasurable. Try comparing:

#include <stdint.h>

__attribute__((noinline))
uint64_t getCount(uint64_t nbrBytes) {
  uint64_t count = 0;
  for( uint64_t byteIndex = 0; byteIndex < nbrBytes; byteIndex++ )
  {
    count += byteIndex;
    if ( ( byteIndex & 0xffffffff ) == 0 ) { count += 1.3; }
  }
  return count;
}

int main() {
  uint64_t nbrBytes = 4e8;
  return getCount(nbrBytes);
}


with:

import Darwin

@inline(never)
func getCount(nbrBytes: Int) -> Int {
  var count = 0
  for byteIndex in 0 ..< nbrBytes
  {
    count += byteIndex
    if ( ( byteIndex & 0xffffffff ) == 0 ) { count += Int(1.3) }
  }
  return count
}

exit(Int32(truncatingBitPattern: getCount(nbrBytes: 400_000_000)))
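
To get a feel for the overflow traps mentioned above, here is a rough C sketch of the same loop with every addition checked. It is only an approximation of what Swift's `+=` on Int does, not the code the Swift compiler actually emits. The names checkedAdd and getCountChecked are made up for illustration, and the sketch assumes a clang or GCC recent enough to provide __builtin_add_overflow and __builtin_trap.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): add two signed
   64-bit values and trap on overflow instead of wrapping, roughly
   what Swift's `+` on Int does. */
static inline int64_t checkedAdd(int64_t a, int64_t b) {
  int64_t result;
  if (__builtin_add_overflow(a, b, &result)) {
    __builtin_trap();  /* abort the program on overflow */
  }
  return result;
}

/* Same loop as getCount above, using signed 64-bit integers (like
   Swift's Int on a 64-bit platform) and checked additions. */
__attribute__((noinline))
int64_t getCountChecked(int64_t nbrBytes) {
  int64_t count = 0;
  for (int64_t byteIndex = 0; byteIndex < nbrBytes; byteIndex++) {
    count = checkedAdd(count, byteIndex);
    if ((byteIndex & 0xffffffff) == 0) {
      count = checkedAdd(count, 1);  /* Int(1.3) truncates to 1 */
    }
  }
  return count;
}
```

Calling getCountChecked(10) should return 46: the sum of 0 through 9 is 45, plus 1 from the branch taken when byteIndex is 0. Because a trap is an observable side effect, the optimizer has to prove it can never fire before it may delete the loop, which is the kind of reasoning the plain unsigned C version does not need.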

-Joe