[swift-evolution] [Review] SE-0007 Remove C-style for-loops with conditions and incrementers

inbox only inboxonly4 at gmail.com
Fri Dec 11 12:20:10 CST 2015


I don't know why do you try to force for-in loop more than it is.It is for
SequenceType(s). For-in mostly it is for arrays(because it is a
sequence).Even though you can force everything to be a sequence because no
matter what the optimizer will be ,the SequenceType requires more time
because you have to call next() which can be a little bit slower or take a
lot more time for what in a C style for-loop can be done by a simple
subtraction or addition.

2015-12-11 19:17 GMT+02:00 Marc Knaup via swift-evolution <
swift-evolution at swift.org>:

> This does not take continue into account which would then require a
> duplication of expr3.
> It would also change the scope of variables defined in expr1 potentially causing
> collisions or unexpected shadowing.
>
> On Fri, Dec 11, 2015 at 5:27 PM, Erica Sadun via swift-evolution <
> swift-evolution at swift.org> wrote:
>
>> For many of these number-crunching performance-stretching scenarios, many
>> I suggest once again, that if you're doing serious number crunching that
>> Accelerate or similar approaches is to be preferred? As for c-style-for vs
>> while, the two are mechanically convertible.
>>
>>
>> Where heavy performance is not required, for-in is more readable,
>> maintainable, and optimizable to a sufficient extent that I do not see it
>> as a bar to conversion.
>>
>> -- E
>>
>>
>> On Dec 11, 2015, at 8:44 AM, Paul Cantrell via swift-evolution <
>> swift-evolution at swift.org> wrote:
>>
>> Your revised results are now right in line with what I get in my test
>> harness, so that’s reassuring!
>>
>> I’d quibble with this:
>>
>>
>>    1. The optimized builds are still slower than the for-in “equivalent”
>>    functionality.
>>
>>
>> That’s not an accurate summary. Depending on precisely what’s in the
>> loop, the for-in flavor is clocking in anywhere from 80% slower to 20%
>> *faster*.
>>
>> None of this performance testing undercuts your entirely valid concerns
>> about syntax. We have, I think, widespread agreement on the list that the
>> C-style for is very rarely used in most Swift code in the wild — but if
>> your usage patterns are unusual and you use it a lot, I can see why you’d
>> be reluctant to part with it!
>>
>> It’s a question, then, of whether it’s worth having a leaner language at
>> the expense of making some less-common code more verbose when optimized.
>> I’m not sure that any of the C-style audits people have done on the list
>> have been games. Are there other game developers on the list using Swift
>> who could do the audit on their code?
>>
>> Cheers,
>>
>> Paul
>>
>>
>> On Dec 11, 2015, at 2:33 AM, David Owens II <david at owensd.io> wrote:
>>
>> I don’t know what you did, your gist 404s.
>>
>> Here’s an update with the while-loop:
>> https://gist.github.com/owensd/b438bea90075907fa6ec and using both i and
>> j within the loops. This is a simple OS X framework project with unit tests.
>>
>> *Debug Build:*
>>
>>    - testZipStride - 2.496s
>>    - testCStyleFor - 0.210s
>>    - testWhileLoop - 0.220s
>>
>>
>> *Release Build:*
>>
>>    - testZipStride - 0.029s
>>    - testCStyleFor - 0.018s
>>    - testWhileLoop - 0.019s
>>
>>
>> I ran these tests from my MacBook Pro, the previous tests were from my
>> iMac.
>>
>> When you use the sum += (i - j) construct, I think all you are ending up
>> with a hot-path that the optimizer can end up optimizing better (my guess
>> is that the i-j turns into a constant expression - after all, the
>> difference is always 1, but I don’t know enough about the SIL
>> representation to confirm that). If you use a code path where that
>> expression is not constant time (again, assuming my suspicion is correct),
>> the zip+stride is against slower.
>>
>> I would argue the following:
>>
>>
>>    1. The code is not objectively easier to read or understand with the
>>    zip+stride construct (arguably, they are not even semantically equivalent).
>>    2. The debug builds are prohibitively slower, especially in the
>>    context of high-performance requirement code (I’m doing a lot of
>>    prototyping Swift in the context of games, so yes, performance matters
>>    considerably).
>>    3. The optimized builds are still slower than the for-in “equivalent"
>>    functionality.
>>    4. The optimizer is inconsistent, like all optimizers are (this is a
>>    simple truth, not a value judgement - optimizers are not magic, they are
>>    code that is run like any other code and can only do as well as they are
>>    coded under the conditions they are coded against), at actually producing
>>    similar results with code that ends up with slightly different shapes.
>>    5. There is not functionally equivalent version of the code that I
>>    can write that is not more verbose, while requiring artificial scoping
>>    constructs, to achieve the same behavior.
>>
>>
>> So no, there is no evidence that I’ve seen to reconsider my opinion that
>> this proposal should not be implemented. If there is evidence to show that
>> my findings are incorrect or a poor summary of the general problem I am
>> seeing, then of course I would reconsider my opinion.
>>
>> -David
>>
>> On Dec 10, 2015, at 9:12 PM, Paul Cantrell <cantrell at pobox.com> wrote:
>>
>> Hold the presses.
>>
>> David, I found the radical differences in our results troubling, so I did
>> some digging. It turns out that the zip+stride code:
>>
>>     var sum = 0
>>     for (i, j) in zip(first.stride(to: 0, by: -1), second.stride(to: 0,
>> by: -2)) {
>>         if i % 2 == 0 { continue }
>>         sum += 1
>>     }
>>
>> …runs *much* faster if you actually use both i and j inside the loop:
>>
>>     var sum = 0
>>     for (i, j) in zip(first.stride(to: 0, by: -1), second.stride(to: 0,
>> by: -2)) {
>>         if i % 2 == 0 { continue }
>>         sum += *i-j*
>>     }
>>
>> Weird, right? This is with optimization on (default “production” build).
>> It smells like a compiler quirk.
>>
>> With that tweak, the zip+stride approach actually clocks in faster than
>> the C-style for. Yes, you read that right: *faster*. Also smells like a
>> quirk. Am I doing something fantastically stupid in my code? Or maybe it’s
>> just my idiosyncratic taste in indentation? :P
>>
>> Here’s my test case, which was a command-line app with manual timing,
>> followed by David’s dropped into the same harness, followed by David’s but
>> with sum += i-j instead of sum += 1:
>>
>>     https://gist.github.com/pcantrell/6bbe80e630d227ed0262
>>
>> Point is: *no big performance difference here; even a performance
>> advantage* (that is probably a compiler artifact).
>>
>> David and Thorsten, you might want to reconsider your reviews?
>>
>> Results:
>>
>> —————— Paul’s comparison ——————
>>
>> zip+stride
>>
>>   Iter 0: 0.519110977649689
>>   Iter 1: 0.503385007381439
>>   Iter 2: 0.503321051597595
>>   Iter 3: 0.485216021537781
>>   Iter 4: 0.524757027626038
>>   Iter 5: 0.478078007698059
>>   Iter 6: 0.503880977630615
>>   Iter 7: 0.498068988323212
>>   Iter 8: 0.485781013965607
>>          ——————————————
>>   Median: 0.524757027626038
>>
>> C-style
>>
>>   Iter 0: 0.85480797290802
>>   Iter 1: 0.879491031169891
>>   Iter 2: 0.851797997951508
>>   Iter 3: 0.836017966270447
>>   Iter 4: 0.863684952259064
>>   Iter 5: 0.837742984294891
>>   Iter 6: 0.839070022106171
>>   Iter 7: 0.849772989749908
>>   Iter 8: 0.819278955459595
>>          ——————————————
>>   Median: 0.863684952259064
>>
>> Zip+stride takes 0.607579217692143x the time of C-style for
>>
>> —————— David’s comparison ——————
>>
>> zip+stride
>>
>>   Iter 0: 1.15285503864288
>>   Iter 1: 1.1244450211525
>>   Iter 2: 1.24192994832993
>>   Iter 3: 1.02782195806503
>>   Iter 4: 1.13640999794006
>>   Iter 5: 1.15879601240158
>>   Iter 6: 1.12114900350571
>>   Iter 7: 1.21364599466324
>>   Iter 8: 1.10698300600052
>>          ——————————————
>>   Median: 1.13640999794006
>>
>> C-style
>>
>>   Iter 0: 0.375869989395142
>>   Iter 1: 0.371365010738373
>>   Iter 2: 0.356527984142303
>>   Iter 3: 0.384984970092773
>>   Iter 4: 0.367590010166168
>>   Iter 5: 0.365644037723541
>>   Iter 6: 0.384257972240448
>>   Iter 7: 0.379297018051147
>>   Iter 8: 0.363133013248444
>>          ——————————————
>>   Median: 0.367590010166168
>>
>> Zip+stride takes 3.09151491202482x the time of C-style for
>>
>> —————— David’s comparison, actually using indices in the loop ——————
>>
>> zip+stride
>>
>>   Iter 0: 0.328687965869904
>>   Iter 1: 0.332105994224548
>>   Iter 2: 0.336817979812622
>>   Iter 3: 0.321089029312134
>>   Iter 4: 0.338591992855072
>>   Iter 5: 0.348567008972168
>>   Iter 6: 0.34687602519989
>>   Iter 7: 0.34755402803421
>>   Iter 8: 0.341500997543335
>>          ——————————————
>>   Median: 0.338591992855072
>>
>> C-style
>>
>>   Iter 0: 0.422354996204376
>>   Iter 1: 0.427953958511353
>>   Iter 2: 0.403640985488892
>>   Iter 3: 0.415378987789154
>>   Iter 4: 0.403639018535614
>>   Iter 5: 0.416707038879395
>>   Iter 6: 0.415345013141632
>>   Iter 7: 0.417587995529175
>>   Iter 8: 0.415713012218475
>>          ——————————————
>>   Median: 0.403639018535614
>>
>> Zip+stride takes 0.838848518865867x the time of C-style for
>>
>> Program ended with exit code: 0
>>
>> Cheers,
>>
>> Paul
>>
>> On Dec 10, 2015, at 5:36 PM, David Owens II <david at owensd.io> wrote:
>>
>> Here’s my basic test case:
>>
>> let first = 10000000
>> let second = 20000000
>>
>> class LoopPerfTests: XCTestCase {
>>
>>     func testZipStride() {
>>         self.measureBlock {
>>             var sum = 0
>>             for (i, j) in zip(first.stride(to: 0, by:
>> -1), second.stride(to: 0, by: -2)) {
>>                 if i % 2 == 0 { continue }
>>                 sum += 1
>>             }
>>             print(sum)
>>         }
>>     }
>>
>>     func testCStyleFor() {
>>         self.measureBlock {
>>             var sum = 0
>>             for var i = first, j = second; i > 0 && j > 0; i -= 1, j
>> -= 2 {
>>                 if i % 2 == 0 { continue }
>>                 sum += 1
>>             }
>>             print(sum)
>>         }
>>
>>     }
>>
>> }
>>
>>
>> Non-optimized timings:
>>
>>    - testCStyleFor - 0.126s
>>    - testZipStride - 2.189s
>>
>>
>> Optimized timings:
>>
>>    - testCStyleFor - 0.008s
>>    - testZipStride - 0.015s
>>
>>
>> That’s a lot worse than 34%; even in optimized builds, that’s 2x slower
>> and in debug builds, that’s 17x slower. I think it’s unreasonable to force
>> people to write a more verbose while-loop construct to simply get the
>> performance they need.
>>
>> Also, the readability argument is very subjective; for example, I don’t
>> find the zip version more readability. In fact, I think it obscures what
>> the logic of the loop is doing. But again, that’s subjective.
>>
>> -David
>>
>> On Dec 10, 2015, at 2:41 PM, Paul Cantrell <cantrell at pobox.com> wrote:
>>
>> Is there any guarantee that these two loops have the exact same runtime
>> performance?
>>
>> for (i, j) in zip(10.stride(to: 0, by: -1), 20.stride(to: 0, by: -2)) {
>>    if i % 2 == 0 { continue }
>>    print(i, j)
>> }
>>
>> for var i = 10, j = 20; i > 0 && j > 0; i -= 1, j -= 2 {
>>    if i % 2 == 0 { continue }
>>    print(i, j)
>> }
>>
>>
>> In a quick and dirty test, the second is approximately 34% slower.
>>
>> I’d say that’s more than acceptable for the readability gain. If you’re
>> in that rare stretch of critical code where the extra 34% actually matters,
>> write it using a while loop instead.
>>
>> P
>>
>>
>>
>> On Dec 10, 2015, at 4:07 PM, David Owens II via swift-evolution <
>> swift-evolution at swift.org> wrote:
>>
>>
>> On Dec 10, 2015, at 1:57 PM, thorsten--- via swift-evolution <
>> swift-evolution at swift.org> wrote:
>>
>> Yes, performance is one thing neglected by the discussions and the
>> proposal.
>>
>>
>> This is my primary objection to to this proposal; it assumes (or
>> neglects?) that all of the types used can magically be inlined to nothing
>> but the imperative code. This isn’t magical, someone has to implement the
>> optimizations to do this.
>>
>> Is there any guarantee that these two loops have the exact same runtime
>> performance?
>>
>> for (i, j) in zip(10.stride(to: 0, by: -1), 20.stride(to: 0, by: -2)) {
>>    if i % 2 == 0 { continue }
>>    print(i, j)
>> }
>>
>> for var i = 10, j = 20; i > 0 && j > 0; i -= 1, j -= 2 {
>>    if i % 2 == 0 { continue }
>>    print(i, j)
>> }
>>
>>
>> And can you guarantee that performance is relatively the same across
>> debug and release builds? Because historically, Swift has suffered greatly
>> in this regard with respects to the performance of optimized versus
>> non-optimized builds.
>>
>> These types of optimizer issues are real-world things I’ve had to deal
>> with (and have written up many blog posts about). I get the desire to
>> simplify the constructs, but we need an escape hatch to write performant
>> code when the optimizer isn’t up to the job.
>>
>> -David
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>
>>
>>
>>
>>
>>  _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>
>>
>>
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>
>>
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20151211/e9943375/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2015-12-08 at 3.54.37 PM.png
Type: image/png
Size: 93414 bytes
Desc: not available
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20151211/e9943375/attachment.png>


More information about the swift-evolution mailing list