# [swift-users] Implementing a few statistics functions in Swift

Nevin Brackett-Rozinsky nevin.brackettrozinsky at gmail.com
Mon Oct 10 15:02:41 CDT 2016

```
I rolled my own (rather simple) statistics struct. It had been using Double
and Array, but I just went back and made it work generically with
FloatingPoint and Sequence. Here’s what it looks like:

struct Statistic<Number: FloatingPoint> {
    // Running sum of squared deviations from the current mean.
    private var ssqDev: Number = 0
    private(set) var count: Number = 0
    private(set) var average: Number = 0
    // Start at the opposite extremes so the first value replaces both.
    private(set) var maximum: Number = -Number.infinity
    private(set) var minimum: Number = Number.infinity

    var variance: Number { return ssqDev / (count - 1) }
    var standardDeviation: Number { return sqrt(variance) }

    init() {}

    init<T: Sequence> (values: T) where T.Iterator.Element == Number {
        addValues(values)
    }

    mutating func addValues<T: Sequence> (_ vals: T) where
        T.Iterator.Element == Number {
        for val in vals { addValue(val) }
    }

    mutating func addValue(_ value: Number) {
        count += 1 as Number
        let diff = value - average       // deviation from the old mean
        let frac = diff / count
        average += frac
        ssqDev += diff * (diff - frac)   // diff - frac == value - new mean
        minimum = min(minimum, value)
        maximum = max(maximum, value)
    }
}

(Sorry for the lack of syntax highlighting—Gmail strips the formatting when
I paste it.)

Some notes:

• The approach is to look at each data point once and keep the statistics
correct for the numbers seen so far. This saves memory if the values are
being computed or fetched, since you don’t need to store them. However it
also means that the median cannot be found.

• The calculation to update “average” and “ssqDev” is simplified from the
online-algorithm
<https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm>
found on Wikipedia. (“ssqDev” stores the sum of squared deviations from the
mean, which is just the sample variance times (count - 1).)

• If you want to ignore NaNs, just add “if value.isNaN { return }” at the
start of addValue().

• The “as Number” coercion shouldn’t be necessary, but I was getting an
“ambiguous use of +=” error without it.

• All occurrences of “count” were originally “n”, which was private, and I
had a computed “count” that just returned Int(n). But when I switched from
“Double” to “Number: FloatingPoint” I lost the ability to write “Int(n)”.

Nevin

On Mon, Oct 10, 2016 at 1:13 PM, Harlan Haskins via swift-users <
swift-users at swift.org> wrote:

> Oh yeah, I'd love contributions and feedback! I'm essentially implementing
> this as I learn things in stats 101 so it's probably woefully inadequate. 😅
>
> -- Harlan
>
> On Oct 10, 2016, at 1:04 PM, Michael Ilseman <milseman at apple.com> wrote:
>
>
> On Oct 8, 2016, at 11:29 AM, Georgios Moschovitis via swift-users <
> swift-users at swift.org> wrote:
>
> Hey everyone,
>
> I would like to implement a few statistics functions in Swift (e.g.
> variance, standardDeviation, etc) that are computed over a collection.
>
> I am aware of this library:
>
> https://github.com/evgenyneu/SigmaSwiftStatistics
>
> My problem is that it only supports Doubles and Arrays. Also the API
> doesn’t look very ‘swifty’ to me.
>
>
> You might find this library to be more Swifty: https://github.com/
>
> It’s not as generic as possible nor has all the features you might need,
> but the author is very responsive to feedback.
>
>
> I am wondering how someone would implement such functionality in a more
> generic way: to allow usage of multiple collections (even custom, e.g. a
> RingBuffer) and multiple value types (e.g. Decimal, Double). Extra points
> for being 'swifty'.
>
> Thanks in advance for any ideas.
>
> -g.
```
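The notes in Nevin's message (single-pass updates, skipping NaNs, and the lost `Int` count) can be pulled together in a small sketch. This is only an illustration, not the code from the thread: the type name `RunningStatistic` is hypothetical, it assumes a current Swift toolchain (so `S.Element` instead of `T.Iterator.Element`), and it assumes that keeping `count` as an `Int` and converting with `Number(count)` (the `init(_: Int)` requirement on `FloatingPoint`) is an acceptable way to get an integer count back.

```swift
// Hypothetical variant of the struct from the thread; names are illustrative.
struct RunningStatistic<Number: FloatingPoint> {
    private var ssqDev: Number = 0             // sum of squared deviations from the mean
    private(set) var count: Int = 0            // integer count (assumption: not in the original)
    private(set) var average: Number = 0
    private(set) var minimum: Number = Number.infinity
    private(set) var maximum: Number = -Number.infinity

    // Sample variance; not meaningful until at least two values have been added.
    var variance: Number { return ssqDev / Number(count - 1) }
    var standardDeviation: Number { return variance.squareRoot() }

    init() {}

    init<S: Sequence>(values: S) where S.Element == Number {
        addValues(values)
    }

    mutating func addValues<S: Sequence>(_ values: S) where S.Element == Number {
        for value in values { addValue(value) }
    }

    mutating func addValue(_ value: Number) {
        if value.isNaN { return }              // ignore NaNs, as suggested in the notes
        count += 1
        let diff = value - average             // deviation from the old mean
        let frac = diff / Number(count)
        average += frac                        // new mean
        ssqDev += diff * (diff - frac)         // (x - oldMean) * (x - newMean)
        minimum = min(minimum, value)
        maximum = max(maximum, value)
    }
}

// The NaN is skipped, so three values are counted.
let stats = RunningStatistic<Double>(values: [2, 4, .nan, 6])
print(stats.count, stats.average, stats.variance, stats.minimum, stats.maximum)
```

The update works because `diff - frac` equals `value` minus the new mean, so each step adds (x - old mean)(x - new mean) to `ssqDev`, which is the increment used by the online algorithm linked in the message. Because values are consumed one at a time, any `Sequence` with matching elements can be passed in, including a custom collection such as the RingBuffer mentioned in the original question.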