[swift-evolution] For-loop revisited

Wed Mar 9 10:37:10 CST 2016

BTW, I got curious about this so I actually tested it out. I have written a very rudimentary C program (attached) that implements the same simple loop in three styles:

1. A classical for(int i … ) loop
2. A loop using functions for controlling the counter
3. A object-oriented iterator protocol (this is essentially what Swift would compile to)

To make sure that the compiler would not optimise the loop away, it produces side effects (console output). 

In release mode, I can't see any substantial difference in the output produced by the compiler. Its not identical, but the actual loop looks very efficient, everything is inlined and all iterator fields are eliminated (they are pushed to registers). So the optimisations in LLVM are there already. If Swift doesn’t do it yet, then its probably the case of few missing heuristics. 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: main.c
Type: application/octet-stream
Size: 1137 bytes
Desc: not available
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160309/f4d69a5f/attachment.obj>
-------------- next part --------------

> On 09 Mar 2016, at 16:45, Taras Zakharko <taras.zakharko at uzh.ch> wrote:
> 
> 
>> On 09 Mar 2016, at 14:52, Ted F.A. van Gaalen <tedvgiosdev at gmail.com> wrote:
>> 
>> Hello Taras
>> If I am not mistaken, the compiler cannot optimize: 
>> 
>>  for element in collection {}
>> 
>> simply because the contents of the collection are
>> unknown at compile time.
> 
> 
> The progression 
> 
>  for(int i=0; i<=n; i++) {...}
> 
> is also unknown at compile time. Compiler needs to generate a series of instructions that increment a loop counter and check whether the exit condition has been reached.
> 
> Now, in case of a higher abstraction such as
> 
>  for i in stride(0, n, by=1) {...}
> 
> there is indeed more code being generated, but the structure of the loop is exactly the same. Specifically, the above loop can essentially be rewritten in C as
> 
> for(int i = get_initial_value(), !should_exit_on_counter(i), i = next_value(i)) {...}
> 
> There is a piece of code that increments a loop counter and a piece of code that checks the exit condition. In an unoptimised build, the overhead is indeed significant because of all the extra function calls. However, an optimising compiler can easily
> 
> a)  detect that the iterator does not escape the loop (and therefore eliminate refcounting for it)
> b)  inline the methods that implement loop increment and exit condition check
> c)  infer (e.g. via SSA form transformations) that i and the internal counter used by the iterator variable are always identical and eliminate any unnecessary assignments
> 
> All of these optimisations are fairly trivial and have existed for quite some time in modern compilers. At this point the code of the abstract collection-driven loop becomes identical to an optimised for loop. Furthermore, the compiler can use additional heuristics to unroll or otherwise optimise the loop (such as moving the loop counter to a register). 
> 
>> On 09 Mar 2016, at 14:25, Goffredo Marocchi <panajev at gmail.com> wrote:
>> 
>> Sometimes programmers directives and runtime knowledge are essential though and the compilers should be optimised but not held to a practically impossible standards.
> 
> 
> I do not think that expecting the compiler to optimise away a simple iterator is a practically impossible standard. There are compilers out there that perform much more impressive feats (such as autovectorisation). In fact, I am quite sure that GCC and clang will optimise away the for loop that uses simple functions for  exit condition/increment. 
> 
> I certainly agree with you that a compiler can’t do everything — but thats what the while loop is for. 
> 
>> On 09 Mar 2016, at 16:11, David Owens II <david at owensd.io> wrote:
>> 
>> And yet, developers spend the vast majority of their time running and validating code in non-optimized builds. 
> 
> This is a good point! But I believe that this can be alleviated somehow by performing partial optimisations on loops even in the debug mode (e.g. refcounting elimination, inlining)
> 
> — Taras
>