[swift-evolution] [Review] SE-0007 Remove C-style for-loops with conditions and incrementers

Fri Dec 11 11:17:58 CST 2015

This does not take continue into account which would then require a
duplication of expr3.
It would also change the scope of variables defined in expr1
potentially causing
collisions or unexpected shadowing.

On Fri, Dec 11, 2015 at 5:27 PM, Erica Sadun via swift-evolution <
swift-evolution at swift.org> wrote:

> For many of these number-crunching performance-stretching scenarios, many
> I suggest once again, that if you're doing serious number crunching that
> Accelerate or similar approaches is to be preferred? As for c-style-for vs
> while, the two are mechanically convertible.
>
>
> Where heavy performance is not required, for-in is more readable,
> maintainable, and optimizable to a sufficient extent that I do not see it
> as a bar to conversion.
>
> -- E
>
>
> On Dec 11, 2015, at 8:44 AM, Paul Cantrell via swift-evolution <
> swift-evolution at swift.org> wrote:
>
> Your revised results are now right in line with what I get in my test
> harness, so that’s reassuring!
>
> I’d quibble with this:
>
>
>    1. The optimized builds are still slower than the for-in “equivalent”
>    functionality.
>
>
> That’s not an accurate summary. Depending on precisely what’s in the loop,
> the for-in flavor is clocking in anywhere from 80% slower to 20% *faster*.
>
> None of this performance testing undercuts your entirely valid concerns
> about syntax. We have, I think, widespread agreement on the list that the
> C-style for is very rarely used in most Swift code in the wild — but if
> your usage patterns are unusual and you use it a lot, I can see why you’d
> be reluctant to part with it!
>
> It’s a question, then, of whether it’s worth having a leaner language at
> the expense of making some less-common code more verbose when optimized.
> I’m not sure that any of the C-style audits people have done on the list
> have been games. Are there other game developers on the list using Swift
> who could do the audit on their code?
>
> Cheers,
>
> Paul
>
>
> On Dec 11, 2015, at 2:33 AM, David Owens II <david at owensd.io> wrote:
>
> I don’t know what you did, your gist 404s.
>
> Here’s an update with the while-loop:
> https://gist.github.com/owensd/b438bea90075907fa6ec and using both i and
> j within the loops. This is a simple OS X framework project with unit tests.
>
> *Debug Build:*
>
>    - testZipStride - 2.496s
>    - testCStyleFor - 0.210s
>    - testWhileLoop - 0.220s
>
>
> *Release Build:*
>
>    - testZipStride - 0.029s
>    - testCStyleFor - 0.018s
>    - testWhileLoop - 0.019s
>
>
> I ran these tests from my MacBook Pro, the previous tests were from my
> iMac.
>
> When you use the sum += (i - j) construct, I think all you are ending up
> with a hot-path that the optimizer can end up optimizing better (my guess
> is that the i-j turns into a constant expression - after all, the
> difference is always 1, but I don’t know enough about the SIL
> representation to confirm that). If you use a code path where that
> expression is not constant time (again, assuming my suspicion is correct),
> the zip+stride is against slower.
>
> I would argue the following:
>
>
>    1. The code is not objectively easier to read or understand with the
>    zip+stride construct (arguably, they are not even semantically equivalent).
>    2. The debug builds are prohibitively slower, especially in the
>    context of high-performance requirement code (I’m doing a lot of
>    prototyping Swift in the context of games, so yes, performance matters
>    considerably).
>    3. The optimized builds are still slower than the for-in “equivalent"
>    functionality.
>    4. The optimizer is inconsistent, like all optimizers are (this is a
>    simple truth, not a value judgement - optimizers are not magic, they are
>    code that is run like any other code and can only do as well as they are
>    coded under the conditions they are coded against), at actually producing
>    similar results with code that ends up with slightly different shapes.
>    5. There is not functionally equivalent version of the code that I can
>    write that is not more verbose, while requiring artificial scoping
>    constructs, to achieve the same behavior.
>
>
> So no, there is no evidence that I’ve seen to reconsider my opinion that
> this proposal should not be implemented. If there is evidence to show that
> my findings are incorrect or a poor summary of the general problem I am
> seeing, then of course I would reconsider my opinion.
>
> -David
>
> On Dec 10, 2015, at 9:12 PM, Paul Cantrell <cantrell at pobox.com> wrote:
>
> Hold the presses.
>
> David, I found the radical differences in our results troubling, so I did
> some digging. It turns out that the zip+stride code:
>
>     var sum = 0
>     for (i, j) in zip(first.stride(to: 0, by: -1), second.stride(to: 0,
> by: -2)) {
>         if i % 2 == 0 { continue }
>         sum += 1
>     }
>
> …runs *much* faster if you actually use both i and j inside the loop:
>
>     var sum = 0
>     for (i, j) in zip(first.stride(to: 0, by: -1), second.stride(to: 0,
> by: -2)) {
>         if i % 2 == 0 { continue }
>         sum += *i-j*
>     }
>
> Weird, right? This is with optimization on (default “production” build).
> It smells like a compiler quirk.
>
> With that tweak, the zip+stride approach actually clocks in faster than
> the C-style for. Yes, you read that right: *faster*. Also smells like a
> quirk. Am I doing something fantastically stupid in my code? Or maybe it’s
> just my idiosyncratic taste in indentation? :P
>
> Here’s my test case, which was a command-line app with manual timing,
> followed by David’s dropped into the same harness, followed by David’s but
> with sum += i-j instead of sum += 1:
>
>     https://gist.github.com/pcantrell/6bbe80e630d227ed0262
>
> Point is: *no big performance difference here; even a performance
> advantage* (that is probably a compiler artifact).
>
> David and Thorsten, you might want to reconsider your reviews?
>
> Results:
>
> —————— Paul’s comparison ——————
>
> zip+stride
>
>   Iter 0: 0.519110977649689
>   Iter 1: 0.503385007381439
>   Iter 2: 0.503321051597595
>   Iter 3: 0.485216021537781
>   Iter 4: 0.524757027626038
>   Iter 5: 0.478078007698059
>   Iter 6: 0.503880977630615
>   Iter 7: 0.498068988323212
>   Iter 8: 0.485781013965607
>          ——————————————
>   Median: 0.524757027626038
>
> C-style
>
>   Iter 0: 0.85480797290802
>   Iter 1: 0.879491031169891
>   Iter 2: 0.851797997951508
>   Iter 3: 0.836017966270447
>   Iter 4: 0.863684952259064
>   Iter 5: 0.837742984294891
>   Iter 6: 0.839070022106171
>   Iter 7: 0.849772989749908
>   Iter 8: 0.819278955459595
>          ——————————————
>   Median: 0.863684952259064
>
> Zip+stride takes 0.607579217692143x the time of C-style for
>
> —————— David’s comparison ——————
>
> zip+stride
>
>   Iter 0: 1.15285503864288
>   Iter 1: 1.1244450211525
>   Iter 2: 1.24192994832993
>   Iter 3: 1.02782195806503
>   Iter 4: 1.13640999794006
>   Iter 5: 1.15879601240158
>   Iter 6: 1.12114900350571
>   Iter 7: 1.21364599466324
>   Iter 8: 1.10698300600052
>          ——————————————
>   Median: 1.13640999794006
>
> C-style
>
>   Iter 0: 0.375869989395142
>   Iter 1: 0.371365010738373
>   Iter 2: 0.356527984142303
>   Iter 3: 0.384984970092773
>   Iter 4: 0.367590010166168
>   Iter 5: 0.365644037723541
>   Iter 6: 0.384257972240448
>   Iter 7: 0.379297018051147
>   Iter 8: 0.363133013248444
>          ——————————————
>   Median: 0.367590010166168
>
> Zip+stride takes 3.09151491202482x the time of C-style for
>
> —————— David’s comparison, actually using indices in the loop ——————
>
> zip+stride
>
>   Iter 0: 0.328687965869904
>   Iter 1: 0.332105994224548
>   Iter 2: 0.336817979812622
>   Iter 3: 0.321089029312134
>   Iter 4: 0.338591992855072
>   Iter 5: 0.348567008972168
>   Iter 6: 0.34687602519989
>   Iter 7: 0.34755402803421
>   Iter 8: 0.341500997543335
>          ——————————————
>   Median: 0.338591992855072
>
> C-style
>
>   Iter 0: 0.422354996204376
>   Iter 1: 0.427953958511353
>   Iter 2: 0.403640985488892
>   Iter 3: 0.415378987789154
>   Iter 4: 0.403639018535614
>   Iter 5: 0.416707038879395
>   Iter 6: 0.415345013141632
>   Iter 7: 0.417587995529175
>   Iter 8: 0.415713012218475
>          ——————————————
>   Median: 0.403639018535614
>
> Zip+stride takes 0.838848518865867x the time of C-style for
>
> Program ended with exit code: 0
>
> Cheers,
>
> Paul
>
> On Dec 10, 2015, at 5:36 PM, David Owens II <david at owensd.io> wrote:
>
> Here’s my basic test case:
>
> let first = 10000000
> let second = 20000000
>
> class LoopPerfTests: XCTestCase {
>
>     func testZipStride() {
>         self.measureBlock {
>             var sum = 0
>             for (i, j) in zip(first.stride(to: 0, by:
> -1), second.stride(to: 0, by: -2)) {
>                 if i % 2 == 0 { continue }
>                 sum += 1
>             }
>             print(sum)
>         }
>     }
>
>     func testCStyleFor() {
>         self.measureBlock {
>             var sum = 0
>             for var i = first, j = second; i > 0 && j > 0; i -= 1, j -= 2 {
>                 if i % 2 == 0 { continue }
>                 sum += 1
>             }
>             print(sum)
>         }
>
>     }
>
> }
>
>
> Non-optimized timings:
>
>    - testCStyleFor - 0.126s
>    - testZipStride - 2.189s
>
>
> Optimized timings:
>
>    - testCStyleFor - 0.008s
>    - testZipStride - 0.015s
>
>
> That’s a lot worse than 34%; even in optimized builds, that’s 2x slower
> and in debug builds, that’s 17x slower. I think it’s unreasonable to force
> people to write a more verbose while-loop construct to simply get the
> performance they need.
>
> Also, the readability argument is very subjective; for example, I don’t
> find the zip version more readability. In fact, I think it obscures what
> the logic of the loop is doing. But again, that’s subjective.
>
> -David
>
> On Dec 10, 2015, at 2:41 PM, Paul Cantrell <cantrell at pobox.com> wrote:
>
> Is there any guarantee that these two loops have the exact same runtime
> performance?
>
> for (i, j) in zip(10.stride(to: 0, by: -1), 20.stride(to: 0, by: -2)) {
>    if i % 2 == 0 { continue }
>    print(i, j)
> }
>
> for var i = 10, j = 20; i > 0 && j > 0; i -= 1, j -= 2 {
>    if i % 2 == 0 { continue }
>    print(i, j)
> }
>
>
> In a quick and dirty test, the second is approximately 34% slower.
>
> I’d say that’s more than acceptable for the readability gain. If you’re in
> that rare stretch of critical code where the extra 34% actually matters,
> write it using a while loop instead.
>
> P
>
>
>
> On Dec 10, 2015, at 4:07 PM, David Owens II via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>
> On Dec 10, 2015, at 1:57 PM, thorsten--- via swift-evolution <
> swift-evolution at swift.org> wrote:
>
> Yes, performance is one thing neglected by the discussions and the
> proposal.
>
>
> This is my primary objection to to this proposal; it assumes (or
> neglects?) that all of the types used can magically be inlined to nothing
> but the imperative code. This isn’t magical, someone has to implement the
> optimizations to do this.
>
> Is there any guarantee that these two loops have the exact same runtime
> performance?
>
> for (i, j) in zip(10.stride(to: 0, by: -1), 20.stride(to: 0, by: -2)) {
>    if i % 2 == 0 { continue }
>    print(i, j)
> }
>
> for var i = 10, j = 20; i > 0 && j > 0; i -= 1, j -= 2 {
>    if i % 2 == 0 { continue }
>    print(i, j)
> }
>
>
> And can you guarantee that performance is relatively the same across debug
> and release builds? Because historically, Swift has suffered greatly in
> this regard with respects to the performance of optimized versus
> non-optimized builds.
>
> These types of optimizer issues are real-world things I’ve had to deal
> with (and have written up many blog posts about). I get the desire to
> simplify the constructs, but we need an escape hatch to write performant
> code when the optimizer isn’t up to the job.
>
> -David
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
>
>
>
>  _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20151211/b0acb653/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2015-12-08 at 3.54.37 PM.png
Type: image/png
Size: 93414 bytes
Desc: not available
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20151211/b0acb653/attachment.png>