[swift-evolution] [Review] SE-0007 Remove C-style for-loops with conditions and incrementers
Paul Cantrell
cantrell at pobox.com
Thu Dec 10 23:12:13 CST 2015
Hold the presses.
David, I found the radical differences in our results troubling, so I did some digging. It turns out that the zip+stride code:
var sum = 0
for (i, j) in zip(first.stride(to: 0, by: -1), second.stride(to: 0, by: -2)) {
if i % 2 == 0 { continue }
sum += 1
}
…runs much faster if you actually use both i and j inside the loop:
var sum = 0
for (i, j) in zip(first.stride(to: 0, by: -1), second.stride(to: 0, by: -2)) {
if i % 2 == 0 { continue }
sum += i-j
}
Weird, right? This is with optimization on (default “production” build). It smells like a compiler quirk.
With that tweak, the zip+stride approach actually clocks in faster than the C-style for. Yes, you read that right: faster. Also smells like a quirk. Am I doing something fantastically stupid in my code? Or maybe it’s just my idiosyncratic taste in indentation? :P
Here’s my test case, which was a command-line app with manual timing, followed by David’s dropped into the same harness, followed by David’s but with sum += i-j instead of sum += 1:
https://gist.github.com/pcantrell/6bbe80e630d227ed0262 <https://gist.github.com/pcantrell/6bbe80e630d227ed0262>
Point is: no big performance difference here; even a performance advantage (that is probably a compiler artifact).
David and Thorsten, you might want to reconsider your reviews?
Results:
—————— Paul’s comparison ——————
zip+stride
Iter 0: 0.519110977649689
Iter 1: 0.503385007381439
Iter 2: 0.503321051597595
Iter 3: 0.485216021537781
Iter 4: 0.524757027626038
Iter 5: 0.478078007698059
Iter 6: 0.503880977630615
Iter 7: 0.498068988323212
Iter 8: 0.485781013965607
——————————————
Median: 0.524757027626038
C-style
Iter 0: 0.85480797290802
Iter 1: 0.879491031169891
Iter 2: 0.851797997951508
Iter 3: 0.836017966270447
Iter 4: 0.863684952259064
Iter 5: 0.837742984294891
Iter 6: 0.839070022106171
Iter 7: 0.849772989749908
Iter 8: 0.819278955459595
——————————————
Median: 0.863684952259064
Zip+stride takes 0.607579217692143x the time of C-style for
—————— David’s comparison ——————
zip+stride
Iter 0: 1.15285503864288
Iter 1: 1.1244450211525
Iter 2: 1.24192994832993
Iter 3: 1.02782195806503
Iter 4: 1.13640999794006
Iter 5: 1.15879601240158
Iter 6: 1.12114900350571
Iter 7: 1.21364599466324
Iter 8: 1.10698300600052
——————————————
Median: 1.13640999794006
C-style
Iter 0: 0.375869989395142
Iter 1: 0.371365010738373
Iter 2: 0.356527984142303
Iter 3: 0.384984970092773
Iter 4: 0.367590010166168
Iter 5: 0.365644037723541
Iter 6: 0.384257972240448
Iter 7: 0.379297018051147
Iter 8: 0.363133013248444
——————————————
Median: 0.367590010166168
Zip+stride takes 3.09151491202482x the time of C-style for
—————— David’s comparison, actually using indices in the loop ——————
zip+stride
Iter 0: 0.328687965869904
Iter 1: 0.332105994224548
Iter 2: 0.336817979812622
Iter 3: 0.321089029312134
Iter 4: 0.338591992855072
Iter 5: 0.348567008972168
Iter 6: 0.34687602519989
Iter 7: 0.34755402803421
Iter 8: 0.341500997543335
——————————————
Median: 0.338591992855072
C-style
Iter 0: 0.422354996204376
Iter 1: 0.427953958511353
Iter 2: 0.403640985488892
Iter 3: 0.415378987789154
Iter 4: 0.403639018535614
Iter 5: 0.416707038879395
Iter 6: 0.415345013141632
Iter 7: 0.417587995529175
Iter 8: 0.415713012218475
——————————————
Median: 0.403639018535614
Zip+stride takes 0.838848518865867x the time of C-style for
Program ended with exit code: 0
Cheers,
Paul
> On Dec 10, 2015, at 5:36 PM, David Owens II <david at owensd.io> wrote:
>
> Here’s my basic test case:
>
> let first = 10000000
> let second = 20000000
>
> class LoopPerfTests: XCTestCase {
>
> func testZipStride() {
> self.measureBlock {
> var sum = 0
> for (i, j) in zip(first.stride(to: 0, by: -1), second.stride(to: 0, by: -2)) {
> if i % 2 == 0 { continue }
> sum += 1
> }
> print(sum)
> }
> }
>
> func testCStyleFor() {
> self.measureBlock {
> var sum = 0
> for var i = first, j = second; i > 0 && j > 0; i -= 1, j -= 2 {
> if i % 2 == 0 { continue }
> sum += 1
> }
> print(sum)
> }
>
> }
>
> }
>
> Non-optimized timings:
> testCStyleFor - 0.126s
> testZipStride - 2.189s
>
> Optimized timings:
> testCStyleFor - 0.008s
> testZipStride - 0.015s
>
> That’s a lot worse than 34%; even in optimized builds, that’s 2x slower and in debug builds, that’s 17x slower. I think it’s unreasonable to force people to write a more verbose while-loop construct to simply get the performance they need.
>
> Also, the readability argument is very subjective; for example, I don’t find the zip version more readability. In fact, I think it obscures what the logic of the loop is doing. But again, that’s subjective.
>
> -David
>
>> On Dec 10, 2015, at 2:41 PM, Paul Cantrell <cantrell at pobox.com <mailto:cantrell at pobox.com>> wrote:
>>
>>> Is there any guarantee that these two loops have the exact same runtime performance?
>>>
>>> for (i, j) in zip(10.stride(to: 0, by: -1), 20.stride(to: 0, by: -2)) {
>>> if i % 2 == 0 { continue }
>>> print(i, j)
>>> }
>>>
>>> for var i = 10, j = 20; i > 0 && j > 0; i -= 1, j -= 2 {
>>> if i % 2 == 0 { continue }
>>> print(i, j)
>>> }
>>
>>
>> In a quick and dirty test, the second is approximately 34% slower.
>>
>> I’d say that’s more than acceptable for the readability gain. If you’re in that rare stretch of critical code where the extra 34% actually matters, write it using a while loop instead.
>>
>> P
>>
>>
>>
>>> On Dec 10, 2015, at 4:07 PM, David Owens II via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>>>
>>>
>>>> On Dec 10, 2015, at 1:57 PM, thorsten--- via swift-evolution <swift-evolution at swift.org <mailto:swift-evolution at swift.org>> wrote:
>>>>
>>>> Yes, performance is one thing neglected by the discussions and the proposal.
>>>
>>> This is my primary objection to to this proposal; it assumes (or neglects?) that all of the types used can magically be inlined to nothing but the imperative code. This isn’t magical, someone has to implement the optimizations to do this.
>>>
>>> Is there any guarantee that these two loops have the exact same runtime performance?
>>>
>>> for (i, j) in zip(10.stride(to: 0, by: -1), 20.stride(to: 0, by: -2)) {
>>> if i % 2 == 0 { continue }
>>> print(i, j)
>>> }
>>>
>>> for var i = 10, j = 20; i > 0 && j > 0; i -= 1, j -= 2 {
>>> if i % 2 == 0 { continue }
>>> print(i, j)
>>> }
>>>
>>>
>>> And can you guarantee that performance is relatively the same across debug and release builds? Because historically, Swift has suffered greatly in this regard with respects to the performance of optimized versus non-optimized builds.
>>>
>>> These types of optimizer issues are real-world things I’ve had to deal with (and have written up many blog posts about). I get the desire to simplify the constructs, but we need an escape hatch to write performant code when the optimizer isn’t up to the job.
>>>
>>> -David
>>>
>>> _______________________________________________
>>> swift-evolution mailing list
>>> swift-evolution at swift.org <mailto:swift-evolution at swift.org>
>>> https://lists.swift.org/mailman/listinfo/swift-evolution <https://lists.swift.org/mailman/listinfo/swift-evolution>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20151210/ccfa190f/attachment.html>
More information about the swift-evolution
mailing list