[swift-dev] Shouldn't the optimizer make this manual loop-unrolling unnecessary?

Jens Persson jens at bitcycle.com
Fri Dec 11 01:28:33 CST 2015


I've been doing a lot of performance testing related to generic value types
and SIMD lately, and I've built Swift from sources in order to get an idea
of what's coming up optimizer-wise. Things have improved and the optimizer
is impressive overall. But I still see no improvement in the case
exemplified below.

Manually unrolling the simple for loop makes it ~4 times faster (exactly
as fast as the equivalent code using SIMD float4):

struct V4<T> {
    var elements: (T, T, T, T)
    /.../
    subscript(index: Int) -> T { /.../ }
    /.../
    func addedTo(other: V4) -> V4 {
        var r = V4()
        // Manually unrolling makes code ~ 4 times faster:
        // for i in 0 ..< 4 { r[i] = self[i] + other[i] }
        r[0] = self[0] + other[0]
        r[1] = self[1] + other[1]
        r[2] = self[2] + other[2]
        r[3] = self[3] + other[3]
        return r
    }
    /.../
}

Shouldn't the optimizer be able to handle that for loop and make the manual
unrolling unnecessary?

(Compiled the test with -O -whole-module-optimization; also tried
-Ounchecked, with the same results.)
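
For anyone who wants to try the comparison, here is a minimal self-contained
sketch of the two variants side by side. It is not the original test code:
the elided parts of V4 are filled in with guesses, it uses current Swift
syntax rather than the 2015 syntax above, and the AdditiveArithmetic
constraint, the four-argument initializer, and the names addedToLooped and
addedToUnrolled are assumptions made only for illustration.

```swift
// Hypothetical completion of V4; the constraint on T is an assumption,
// since the original elides how T gets its `+` and its default value.
struct V4<T: AdditiveArithmetic> {
    var elements: (T, T, T, T)

    // Assumed default initializer: all four lanes start at zero.
    init() {
        elements = (T.zero, T.zero, T.zero, T.zero)
    }

    // Assumed convenience initializer, added only for the usage example.
    init(_ a: T, _ b: T, _ c: T, _ d: T) {
        elements = (a, b, c, d)
    }

    // Maps an index in 0..<4 onto the matching tuple element.
    subscript(index: Int) -> T {
        get {
            switch index {
            case 0: return elements.0
            case 1: return elements.1
            case 2: return elements.2
            case 3: return elements.3
            default: fatalError("V4 index out of range")
            }
        }
        set {
            switch index {
            case 0: elements.0 = newValue
            case 1: elements.1 = newValue
            case 2: elements.2 = newValue
            case 3: elements.3 = newValue
            default: fatalError("V4 index out of range")
            }
        }
    }

    // Loop variant: the form one would hope the optimizer unrolls.
    func addedToLooped(_ other: V4) -> V4 {
        var r = V4()
        for i in 0 ..< 4 { r[i] = self[i] + other[i] }
        return r
    }

    // Manually unrolled variant, as in the snippet above.
    func addedToUnrolled(_ other: V4) -> V4 {
        var r = V4()
        r[0] = self[0] + other[0]
        r[1] = self[1] + other[1]
        r[2] = self[2] + other[2]
        r[3] = self[3] + other[3]
        return r
    }
}

// Usage: both variants must produce the same result.
let a = V4<Float>(1, 2, 3, 4)
let b = V4<Float>(10, 20, 30, 40)
let looped = a.addedToLooped(b)
let unrolled = a.addedToUnrolled(b)
```

Both methods compute the same sums, so any timing difference between them
comes from how the loop and the subscript are compiled, not from the
arithmetic. The sketch only checks that the results agree; it does not
measure anything by itself.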

/Jens