> On Jun 12, 2017, at 5:29 PM, Michael Gottesman <mgottesman at apple.com> wrote:
>> I don't know what that is. 
> Check it out: https://en.wikipedia.org/wiki/Mann–Whitney_U_test. It is a non-parametric test of whether two sets of samples are from the same distribution. As a bonus, it does not assume that our data is from a normal distribution (a problem with using mean/standard deviation, which assumes a normal distribution).

This is a fairly important point that I didn’t stress enough. In my experience with other benchmark suites, the sample distribution is nowhere close to normal, which is why I’ve always thought MEAN/SD was silly. But the “noise” I was dealing with came from the underlying H/W and OS mode transitions. General system noise from other processes might lead to a more normal distribution… but, as I’ve said, benchmarking on a noisy system is something to be avoided.
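
To make the test concrete, here is a minimal Python sketch of a Mann-Whitney U computation. It is an illustration under stated assumptions, not what any existing benchmark driver does: the function name `mann_whitney_u` is hypothetical, the p-value uses the large-sample normal approximation, and no tie or continuity correction is applied to the variance.

```python
import math


def mann_whitney_u(xs, ys):
    """Return (U, two-sided p-value) for two independent samples.

    Sketch only: the p-value uses the normal approximation without tie or
    continuity correction, so it is rough for small samples or many ties.
    """
    n1, n2 = len(xs), len(ys)
    # Pool both samples, tagging each value with its group (0 = xs, 1 = ys).
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])

    # Sum the ranks of the xs values; tied values share their average rank.
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    # Under the null hypothesis U is approximately normal with these moments.
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, p
```

Calling `mann_whitney_u(old_times, new_times)` on two lists of per-iteration timings (hypothetical names) gives a small p-value when the two runs are unlikely to come from the same distribution. Because the test only looks at ranks, a handful of extreme outliers from H/W or OS mode transitions shifts the result far less than it would shift a mean.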

