[swift-build-dev] Performance testing via SwiftPM and XCTest

Fri Jul 29 12:47:23 CDT 2016

I received some feedback from Paulo Faria that I’d like to share here:

Is there a way to change the performance test baselines by code? I can’t
find a way to do it today. So my question would be, why can’t we set the
baselines by code? Is it because we need a different baseline for each
different machine configuration? I think some of the configurations make
more sense to be defined in code — the maximum standard deviation, for
example. Sometimes we want to test performance of a memory intensive
operation and usually the standard deviation is bigger in those cases. This
wouldn’t be a SwiftPM issue though… it would be more of an XCTest issue.

Also, isn’t the long term goal for SwiftPM to allow different testing
frameworks? I think we should take that into consideration when designing
the performance tests integration. The proposal feels very coupled with
XCTest… like the comment about using plist instead of JSON.

Thanks for the feedback!! To answer your questions:

On coupling with XCTest

Yes, the current proposal focuses on XCTest. Its goal is to support XCTest
performance tests in SwiftPM. As I’ve looked into them over the past couple
of days, I’ve begun to realize that performance tests are a fascinating
topic. XCTest is merely one implementation. It’s not the best one, but it
is the one that Apple provides to developers that use Xcode, and
SwiftPM/corelibs-xctest do have an active mission to maintain source
compatibility.

I think limiting the scope here to just XCTest helps make forward progress
possible for now. I plan on drafting a separate proposal for third-party
testing support soon.

You mention using plists as evidence of coupling to XCTest, but I don’t
think this is the case for three reasons:

   1. swift-corelibs-foundation provides utilities to parse plist files.
   Cross-platform, Swift-first testing libraries should have no problem
   parsing plists.
   2. The motivation to use plists comes from Xcode, not XCTest. Many
   developers already have baseline plist files. Using plists in SwiftPM
   theoretically allows them to “reuse” those plist files.
   3. My opinion is that using plists allows us to share more with the
   Apple Xcode and XCTest systems, but that opinion isn’t a strong one. We
   could define an entirely new JSON format for baseline files. I just think
   reusing plists is less work, and requires less discussion on
   swift-evolution. :)
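To illustrate point 1, here is a minimal sketch of reading an Xcode-style baseline plist with Foundation's PropertyListSerialization. The wallClockBaseline helper is a name I made up for this example (it is not part of XCTest or SwiftPM), and the plist is trimmed down from the Xcode-generated file quoted later in this thread.

```swift
import Foundation

// Baseline plist, trimmed down from the Xcode-generated example in this thread.
let baselineXML = """
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>classNames</key>
    <dict>
        <key>PerforateTests</key>
        <dict>
            <key>test_uniqueOrdered_performance</key>
            <dict>
                <key>com.apple.XCTPerformanceMetric_WallClockTime</key>
                <dict>
                    <key>baselineAverage</key>
                    <real>0.5</real>
                </dict>
            </dict>
        </dict>
    </dict>
</dict>
</plist>
"""

// Hypothetical helper: looks up the wall clock baseline average for one test
// method, returning nil if the plist cannot be parsed or any key is missing.
func wallClockBaseline(in plistData: Data, className: String, testName: String) -> Double? {
    guard
        let plist = try? PropertyListSerialization.propertyList(from: plistData, options: [], format: nil),
        let root = plist as? [String: Any],
        let classes = root["classNames"] as? [String: Any],
        let tests = classes[className] as? [String: Any],
        let metrics = tests[testName] as? [String: Any],
        let wallClock = metrics["com.apple.XCTPerformanceMetric_WallClockTime"] as? [String: Any]
    else { return nil }

    // Accept either a native Double or an NSNumber, since the concrete type
    // handed back for <real> values may differ between platforms.
    let value = wallClock["baselineAverage"]
    if let average = value as? Double { return average }
    return (value as? NSNumber)?.doubleValue
}

let baseline = wallClockBaseline(
    in: Data(baselineXML.utf8),
    className: "PerforateTests",
    testName: "test_uniqueOrdered_performance"
)
```

Nothing here depends on Xcode or Apple XCTest, which is why I don't see plists as tying us to either one.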

On defining baselines programmatically

Apple XCTest does not provide this functionality, and swift-corelibs-xctest
does not yet provide any APIs that are not provided by Apple XCTest. I
think proposing APIs that don’t exist in Apple XCTest is a leap that many
would be opposed to. After all, Swift committers believe
swift-corelibs-xctest provides an API that is inherently incompatible with
Swift. Convincing them that we should work on additional APIs on top of it
would be a lot of work!
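
To be concrete about what is being asked for, a baseline defined in code would boil down to something like the sketch below. Every name in it (BaselineExpectation, BaselineVerdict, evaluate) is hypothetical, and the pass/fail rule is my own guess for the sake of discussion; I don't know exactly how Apple XCTest makes this decision.

```swift
import Foundation

// Hypothetical type for discussion only; neither Apple XCTest nor
// swift-corelibs-xctest exposes anything like this.
struct BaselineExpectation {
    var average: Double                       // baseline average, in seconds
    var maxRegression: Double                 // allowed slowdown as a fraction of the baseline (0.1 == 10%)
    var maxRelativeStandardDeviation: Double  // allowed spread as a fraction of the measured mean
}

enum BaselineVerdict: Equatable {
    case passed
    case tooSlow
    case tooNoisy
}

// Assumed rule: reject noisy measurements first, then compare the mean of the
// samples against the baseline plus the allowed regression.
func evaluate(samples: [Double], against expectation: BaselineExpectation) -> BaselineVerdict {
    precondition(!samples.isEmpty, "need at least one measurement")
    let mean = samples.reduce(0, +) / Double(samples.count)
    let variance = samples.map { ($0 - mean) * ($0 - mean) }.reduce(0, +) / Double(samples.count)
    let relativeStandardDeviation = mean > 0 ? variance.squareRoot() / mean : 0

    if relativeStandardDeviation > expectation.maxRelativeStandardDeviation { return .tooNoisy }
    if mean > expectation.average * (1 + expectation.maxRegression) { return .tooSlow }
    return .passed
}
```

A memory-intensive test, as in Paulo's example, would simply pass a larger maxRelativeStandardDeviation. The open question is whether an API like this belongs in XCTest at all.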

- Brian Gesiak

On Tue, Jul 26, 2016 at 12:38 PM, Brian Gesiak <modocache at gmail.com> wrote:

> Hi, it’s me again. :)
>
> I figured out how we can do this on Darwin: by using .xctestconfiguration
> files.
>
> SwiftPM performance tests on Darwin, using XCTestConfiguration files
>
> Xcode passes all sorts of variables to XCTest by specifying the
> XCTestConfigurationFilePath=/path/to/an.xctestconfiguration plist file as
> an environment variable.
>
> You can see this for yourself by running a unit test suite via Xcode or
> xcodebuild on Darwin. When you do, Xcode prints the path to a “test
> session log”. That file contains logs that show Xcode is launching XCTest
> with XCTestConfiguration environment variables set. The XCTestConfiguration
> files are binary plists, which you can convert to XML by using plutil
> -convert xml1 <plist_path>.
>
> XCTestConfiguration files appear to specify the paths to baseline metrics
> plist files using the baselineFileURL and baselineFileRelativePath keys.
> XCTest then parses these plists to determine the baseline metrics to run
> performance tests against.
>
> So, SwiftPM could run performance tests on Darwin by passing XCTest
> <https://github.com/apple/swift-package-manager/blob/dfdcd2de5fc1bfc64a14690ed186e147f5ea95f5/Sources/Commands/SwiftTestTool.swift#L254-L260>
> an XCTestConfigurationFilePath=/path/to/an.xctestconfiguration environment
> variable, and by specifying the baseline file paths in that plist file.
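
As a rough sketch of that last step, the environment for the XCTest process could be assembled like this. The xctestEnvironment function is a made-up name, not existing SwiftPM code; only the XCTestConfigurationFilePath variable comes from the Xcode logs described above, and producing the .xctestconfiguration file itself is not covered here.

```swift
import Foundation

// Hypothetical helper: returns a copy of `base` with the variable that Xcode
// sets when launching XCTest pointing at the given configuration file.
func xctestEnvironment(
    configurationPath: String,
    base: [String: String] = ProcessInfo.processInfo.environment
) -> [String: String] {
    var environment = base
    environment["XCTestConfigurationFilePath"] = configurationPath
    return environment
}
```

The resulting dictionary could then be assigned to the environment property of the Process that launches the tests.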
>
> JSON vs. plist, and other questions
>
> Because the Darwin path for SwiftPM performance testing requires plists be
> used, I wonder whether we should use plists to store baseline metrics on
> all platforms.
>
> I think this is about as far as I can go short of either:
>
>    1. Getting feedback from Apple employees (and others!)
>    2. Submitting an official evolution proposal
>
> I’ll submit a proposal within a week or so. Feedback before then would be
> very much appreciated!! :)
>
> - Brian Gesiak
> 
>
> On Tue, Jul 26, 2016 at 2:42 AM, Brian Gesiak <modocache at gmail.com> wrote:
>
>> I received some feedback on this proposal from Ankit Aggarwal, which
>> centered on how developers would edit and update their baseline metrics.
>> Here’s what I’m envisioning specifically:
>>
>> Two new command-line options for swift test
>>
>>    1. swift test --performance-metrics <path>. This is a path to a
>>    directory where JSON files containing the baseline metrics for the tests
>>    will be stored. By default, this path will be
>>    MyPackage/Tests/PerformanceMetrics. In my previous email, I suggested
>>    the default --performance-metrics path could be set to the same path
>>    as the swift test --build-path directory, but I have reconsidered.
>>    This is because I think developers would want to check their baseline
>>    metrics JSON files into source control, so that they can share metrics with
>>    one another, and with their continuous integration servers.
>>    2. swift test --performance-metrics-update <mode>, where <mode> is
>>    one of {all|new|better|worse|none}. This specifies the behavior
>>    SwiftPM should take when writing baseline metrics data into the JSON files
>>    at the --performance-metrics path.
>>       - all: Write baseline metrics data for all performance test cases.
>>       If metrics for those test cases already exist in the JSON, they are
>>       overwritten.
>>       - new: Only write baseline metrics for performance test cases that
>>       did not already exist in the baseline metrics JSON. This is the default.
>>       - better: Only write baseline metrics for performance test cases
>>       whose performance has improved compared to the last time they were run. If
>>       baseline metrics for those test cases already exist in the JSON, they are
>>       overwritten. If baseline metrics for those test cases does not exist in the
>>       JSON, they are written to the JSON.
>>       - worse: Only write baseline metrics for performance test cases
>>       whose performance has worsened compared to the last time they were run. If
>>       baseline metrics for those test cases already exist in the JSON, they are
>>       overwritten. If baseline metrics for those test cases do not exist in the
>>       JSON, they are written to the JSON.
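
The update modes above can be sketched as a small enum. The names are hypothetical and this is not an existing SwiftPM or XCTest API. Two assumptions are baked in: "compared to the last time they were run" is read as "compared to the stored baseline", and the none mode, which the list above does not describe, is taken to mean that nothing is ever written.

```swift
// Hypothetical names throughout: a sketch of the proposed semantics only.
enum MetricsUpdateMode: String {
    case all, new, better, worse
    // Assumption: "none" never writes anything; the proposal doesn't spell it out.
    case disabled = "none"

    /// Decides whether `measured` should be written as the baseline for a test case.
    /// `existing` is the stored baseline average in seconds (nil if there is none).
    /// Lower values mean better performance.
    func shouldWrite(existing: Double?, measured: Double) -> Bool {
        switch self {
        case .all:
            return true
        case .new:
            return existing == nil
        case .better:
            // No stored baseline yet: write one, as the description requires.
            guard let existing = existing else { return true }
            return measured < existing
        case .worse:
            guard let existing = existing else { return true }
            return measured > existing
        case .disabled:
            return false
        }
    }
}
```

Parsing the command-line value is then just MetricsUpdateMode(rawValue:), which returns nil for anything outside the five listed modes.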
>>
>> Two new command-line options for swift-corelibs-xctest executables
>>
>>    1. --performance-metrics <path>. This is a path to a JSON file
>>    containing a mapping from test cases to baseline metrics.
>>       - If not specified, performance tests are not run against any
>>       baseline metrics, and so will never fail.
>>       - If specified, performance test cases will be run against these
>>       metrics. Based on the --performance-metrics-update mode (see
>>       below), performance test cases may fail if their performance does not meet
>>       the baseline.
>>    2. --performance-metrics-update <mode>. Same as the swift test
>>    --performance-metrics-update parameter.
>>
>> PerformanceMetrics directory
>>
>> If a package’s tests contain any performance tests (i.e.: tests that call
>> XCTestCase.measure(), XCTestCase.measureMetrics(), etc.), running swift
>> test will result in the following directories and files being generated:
>>
>> MyPackage/
>>     .build/
>>     Sources/
>>     Tests/
>>         LinuxMain.swift
>>         MyPackage/
>>             MyPackageTests.swift
>>         PerformanceMetrics/  # Generated if any performance tests are run. This is the path specified by --performance-metrics.
>>             MyPackage/
>>                 Destinations.json                          # Contains a mapping of "runDestinationsByUUID".
>>                 8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5.json  # An individual run destination's baseline metrics.
>>
>> In order to avoid name collisions, developers will no longer be able to
>> name their test modules “PerformanceMetrics”.
>>
>> What happens when swift test is run
>>
>>    1. swift test, using the default arguments, would be the equivalent
>>    of swift test --performance-metrics ./Tests/PerformanceMetrics
>>    --performance-metrics-update new.
>>    2. SwiftPM determines which of the destinations defined in
>>    Destinations.json to pass to XCTest. For example, if testing on a
>>    macOS 64-bit system with one processor, SwiftPM attempts to find a run
>>    destination UUID in Destinations.json that matches those criteria. If
>>    no Destinations.json file exists, SwiftPM creates a mapping in memory.
>>    3. SwiftPM invokes LinuxMain.swift, passing swift-corelibs-xctest the
>>    path to a run destination’s baseline metrics file (in this case, --performance-metrics
>>    8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5.json), as well as the update
>>    behavior (all, new, better, or worse). This file may not already
>>    exist, such as in the case that a Destinations.json file did not
>>    exist, or that a JSON file for this particular run destination did not
>>    exist.
>>    4. swift-corelibs-xctest parses the JSON in the baseline metrics JSON
>>    file it is given, and stores in memory the mappings from test cases to
>>    their baseline metrics. If the file is empty or does not exist,
>>    swift-corelibs-xctest stores an empty mapping.
>>    5. swift-corelibs-xctest runs the tests. If a test exists in the
>>    mapping from step 4, it compares its performance to the baseline metric. If
>>    the performance is worse, and the update behavior is new or better,
>>    the test case is failed.
>>    6. swift-corelibs-xctest writes to the baseline metrics JSON file,
>>    based on the specified update behavior. If the file does not already exist,
>>    swift-corelibs-xctest creates the file, then writes to it.
>>    7. After running the tests, SwiftPM determines whether the
>>    8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5.json file contains any data. If
>>    it does, and this run destination did not exist in Destinations.json
>>    in step (2), then SwiftPM writes the new destination to the
>>    Destinations.json file.
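
Step 2 above (finding a matching run destination, or creating one in memory) could look roughly like the sketch below. The RunDestination fields are a flattened, made-up simplification of the localComputer/targetArchitecture/targetDevice structure, and the type and method names are hypothetical.

```swift
import Foundation

// Simplified, hypothetical description of a host/target pair.
struct RunDestination: Codable, Equatable {
    var targetArchitecture: String
    var platformIdentifier: String
    var cpuCount: Int
}

struct Destinations: Codable {
    var runDestinationsByUUID: [String: RunDestination] = [:]

    // Returns the UUID of the stored destination matching `host`. If there is
    // no match, registers `host` under a fresh UUID in memory only; writing
    // the mapping back to disk is a separate step (step 7 above).
    mutating func uuid(for host: RunDestination) -> String {
        if let match = runDestinationsByUUID.first(where: { $0.value == host }) {
            return match.key
        }
        let uuid = UUID().uuidString
        runDestinationsByUUID[uuid] = host
        return uuid
    }
}
```

The returned UUID gives the name of the per-destination metrics file to hand to swift-corelibs-xctest.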
>>
>> How will this work on Darwin?
>>
>> I realized while writing this email that I have no clue how to get this
>> working on Darwin. Is it even possible to specify the paths to performance
>> baseline plist files to Apple XCTest on the command line? This seems like a
>> prerequisite to supporting performance testing via SwiftPM on Darwin.
>>
>> It would be great to hear from someone on the developer tools team on
>> this topic (+cc Daniel Dunbar, Mike Ferris). I’ll try and figure out how
>> this works in Apple XCTest, and will send an update when I do.
>>
>> Thoughts?
>>
>> As before, I’d love to hear any feedback you all may have on this
>> proposal.
>>
>> - Brian Gesiak
>> 
>>
>> On Sun, Jul 24, 2016 at 1:01 PM, Brian Gesiak <modocache at gmail.com>
>> wrote:
>>
>>> Hello corelibs-dev and build-dev,
>>>
>>> Back in May, Brian Croom implemented performance testing in
>>> swift-corelibs-xctest:
>>> https://github.com/apple/swift-corelibs-xctest/pull/109
>>>
>>> I’d love to see Swift developers use this feature to measure the
>>> performance of their code. I think we’ll need to add functionality to
>>> swift-corelibs-xctest and SwiftPM in order to do so.
>>>
>>> The problem: recording performance test baselines
>>>
>>> In order for performance tests to be useful, Apple’s Xcode provides a
>>> way to record “baseline” metrics. Baseline metrics allow a developer to
>>> indicate “this performance test should never be slower than 1.2 seconds on
>>> average, with 10% standard deviation as ‘wiggle room’”. When Apple XCTest
>>> tests are run, they are informed of the baseline metrics that have been set
>>> in Xcode. Apple XCTest performance tests that have a baseline registered
>>> will fail if performance becomes slower than the acceptable amount.
>>>
>>> If we could provide swift-corelibs-xctest with a mapping from each
>>> performance test to its baseline metric, it would be easy to write the code
>>> to fail a test if it didn’t perform well enough. That mapping, however, is
>>> the tricky part. Here’s why:
>>>
>>>    - The mapping needs to group metrics based on the host machine
>>>    running the test. Performance will of course vary based on the hardware, so
>>>    it’s important to make sure performance baselines set on a Raspberry Pi
>>>    aren’t used when testing on a Mac Pro.
>>>    - The mapping also needs to group metrics based on the target
>>>    machine. Using Apple XCTest, a developer can start a test suite run from
>>>    their MacBook Pro (macOS 64-bit), and see the results of the performance
>>>    tests when run on their iPhone 6s (iOS arm64). I don’t think this is
>>>    relevant to swift-corelibs-xctest just yet — as far as I know, SwiftPM is
>>>    not capable of cross-compilation, so the host machine will always be
>>>    identical to the target machine. Still, we should design something flexible
>>>    enough for this scenario.
>>>
>>> Xcode’s solution: plist files
>>>
>>> Xcode solves this problem using two kinds of .plist files. I tried
>>> creating a sample project, named Perforate.xcodeproj, which contained a
>>> single performance test. Here’s what Xcode created:
>>>
>>> <!-- Perforate.xcodeproj/xcshareddata/xcbaselines/DA77262F1D447DB300735C93.xcbaseline/Info.plist -->
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
>>> <plist version="1.0">
>>> <dict>
>>>         <!-- runDestinationsByUUID: These are the host/target machine groups. -->
>>>         <key>runDestinationsByUUID</key>
>>>         <dict>
>>>                 <!--
>>>                         It appears each group is given a UUID, but to be honest, I'm not sure why.
>>>                         It seems like these should be "keyed" on aspects of the host/target machines.
>>>                         As-is, I imagine Xcode and Apple XCTest need to traverse each group's
>>>                         `localComputer`, `targetArchitecture`, and `targetDevice`'s values in order to find a match.
>>>                 -->
>>>                 <key>8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5</key>
>>>                 <dict>
>>>                         <!-- Information about the host machine: number of CPUs, cores, etc. -->
>>>                         <key>localComputer</key>
>>>                         <dict>
>>>                                 <key>busSpeedInMHz</key>
>>>                                 <integer>100</integer>
>>>                                 <key>cpuCount</key>
>>>                                 <integer>1</integer>
>>>                                 <key>cpuKind</key>
>>>                                 <string>Intel Core i7</string>
>>>                                 <key>cpuSpeedInMHz</key>
>>>                                 <integer>2800</integer>
>>>                                 <key>logicalCPUCoresPerPackage</key>
>>>                                 <integer>8</integer>
>>>                                 <key>modelCode</key>
>>>                                 <string>MacBookPro11,3</string>
>>>                                 <key>physicalCPUCoresPerPackage</key>
>>>                                 <integer>4</integer>
>>>                                 <key>platformIdentifier</key>
>>>                                 <string>com.apple.platform.macosx</string>
>>>                         </dict>
>>>                         <!-- The target architecture and device are stored as separate keys. -->
>>>                         <key>targetArchitecture</key>
>>>                         <string>x86_64</string>
>>>                         <key>targetDevice</key>
>>>                         <dict>
>>>                                 <key>modelCode</key>
>>>                                 <string>iPhone8,2</string>
>>>                                 <key>platformIdentifier</key>
>>>                                 <string>com.apple.platform.iphonesimulator</string>
>>>                         </dict>
>>>                 </dict>
>>>         </dict>
>>> </dict>
>>> </plist>
>>>
>>> <!-- Perforate.xcodeproj/xcshareddata/xcbaselines/DA77262F1D447DB300735C93.xcbaseline/8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5.plist -->
>>> <!-- Notice that this file is named after the `runDestinationsByUUID` key from the first file: 8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5. -->
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
>>> <plist version="1.0">
>>> <dict>
>>>         <key>classNames</key>
>>>         <dict>
>>>                 <key>PerforateTests</key>
>>>                 <dict>
>>>                         <!-- The metrics are mapped by class name and test method name to performance metrics. -->
>>>                         <key>test_uniqueOrdered_performance</key>
>>>                         <dict>
>>>                                 <!-- There are several categories of performance metrics. The only one publicly available in Apple XCTest so far is wall clock time. -->
>>>                                 <key>com.apple.XCTPerformanceMetric_WallClockTime</key>
>>>                                 <dict>
>>>                                         <key>baselineAverage</key>
>>>                                         <real>0.5</real>
>>>                                         <key>baselineIntegrationDisplayName</key>
>>>                                         <string>Local Baseline</string>
>>>                                 </dict>
>>>                         </dict>
>>>                 </dict>
>>>         </dict>
>>> </dict>
>>> </plist>
>>>
>>> Proposed solution for SwiftPM/swift-corelibs-xctest: JSON files
>>>
>>> I think we can mimic Xcode’s approach here. Here’s what I’m proposing:
>>>
>>>    - swift-corelibs-xctest’s test runner should take a --performance-metrics
>>>    <PATH> argument, where <PATH> is the location of a file containing
>>>    JSON that looks pretty much exactly like the
>>>    8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5.plist from above:
>>>
>>> {
>>>   "classNames": {
>>>     "PerforateTests": {
>>>       "test_uniqueOrdered_performance": {
>>>         "baselineAverage": "0.5",
>>>         "baselineIntegrationDisplayName": "Local Baseline"
>>>       }
>>>     }
>>>   }
>>> }
>>>
>>>
>>>    - SwiftPM’s swift test command should also take a --performance-metrics
>>>    <PATH> argument, where <PATH> is the location of a file containing
>>>    JSON that looks pretty much exactly like the
>>>    xcbaselines/DA77262F1D447DB300735C93.xcbaseline/Info.plist from
>>>    above (by default, --performance-metrics could be set to the same
>>>    path as the swift test --build-path directory):
>>>
>>> {
>>>   "runDestinationsByUUID": {
>>>     "8CE9E051-9AB6-44AF-8B80-F2DEFD409CB5": {
>>>       "localComputer": {
>>>         "busSpeedInMHz": "100",
>>>         # ...
>>>       },
>>>       "targetArchitecture": "x86_64",
>>>       "targetDevice": {
>>>         # We might need to change these keys, since "modelCode" seems very Apple-specific.
>>>         "modelCode": "linux",
>>>         "platformIdentifier": "Ubuntu 15.04",
>>>       }
>>>     }
>>>   }
>>> }
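
For what it's worth, the per-destination metrics JSON proposed above is easy to read with Foundation's JSONDecoder. The sketch below is only an illustration: BaselineMetric, BaselineFile, and loadBaselines are names invented here, and baselineAverage is decoded as a string because that is how the example JSON writes it.

```swift
import Foundation

// Hypothetical model types mirroring the proposed JSON layout.
struct BaselineMetric: Codable {
    var baselineAverage: String
    var baselineIntegrationDisplayName: String

    // The example JSON stores the average as a string, so convert on demand.
    var average: Double? { Double(baselineAverage) }
}

struct BaselineFile: Codable {
    // Class name -> test method name -> baseline metric.
    var classNames: [String: [String: BaselineMetric]]
}

func loadBaselines(from data: Data) throws -> BaselineFile {
    return try JSONDecoder().decode(BaselineFile.self, from: data)
}

let metricsJSON = """
{
  "classNames": {
    "PerforateTests": {
      "test_uniqueOrdered_performance": {
        "baselineAverage": "0.5",
        "baselineIntegrationDisplayName": "Local Baseline"
      }
    }
  }
}
"""

let baselines = try? loadBaselines(from: Data(metricsJSON.utf8))
let metric = baselines?.classNames["PerforateTests"]?["test_uniqueOrdered_performance"]
```

Storing the average as a JSON number rather than a string would remove the conversion step, but I've kept the string to match the example.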
>>>
>>> Personally, I think the format of the plist files Xcode and Apple XCTest
>>> generate could be improved. Still, I think it’d be nice to stick to the
>>> same format (as much as possible) for swift-corelibs-xctest, just to keep
>>> things simple.
>>>
>>> Thoughts?
>>>
>>> I admit that I don’t have much experience using Apple XCTest’s
>>> performance testing functionality, so I might be missing something here.
>>> Does anyone have any feedback on this idea? I’d like to incorporate your
>>> feedback, and perhaps submit a Swift Evolution proposal for this feature.
>>>
>>> - Brian Gesiak
>>> 
>>>
>>
>>
>