[swift-dev] Potential contributions to compilation time reporting?

Graydon Hoare ghoare at apple.com
Thu Nov 16 23:54:16 CST 2017



> On Nov 12, 2017, at 8:30 AM, Brian Gesiak via swift-dev <swift-dev at swift.org> wrote:
> 
> Hello all!
> 
> I'm looking for a body of work to do on the Swift compiler for an upcoming programmer retreat I'm attending in January [1]. I've read a lot of blog posts with tips for diagnosing slow Swift compile times [2], and I was wondering if I could contribute to tooling in this area.

Hey, this is great! Sorry I didn't get back to you earlier. Happy to talk this over.

> (Just to make it clear: I'm talking about improving and expanding the set of tools developers have to figure out why their projects take a long time to compile. I'm *not* talking about working on speeding Swift compile times -- although the tools may indirectly help with that.)

This seems plausible to me, though like others I'm a bit hesitant. There are often _many_ things going on in a given compilation, and one needs to be careful about surfacing any particular signal as a putative "singular cause" of a slow compilation: the user may see that signal and respond by making significant adjustments to their code, only to find the overall time isn't helped much. But I totally get the motive.

> Using these options, developers can find function bodies and expressions that took longer to compile than others.

Sadly, there's not always a 1:1 mapping between source entities and time like that. Certainly _some_ cases can be so egregious (say, typechecking time on an expression that's triggering an exponential-time inference) that they dominate compile time, and can be identified-and-fixed in isolation; but often the total amount of work attributable to a given source entity is spread around the compilation, occurs in multiple phases, emerges out of interaction between entities, overlaps with others, etc.

That's not to say that one couldn't build machinery to do a better job attributing costs to source entities than what we have now; just that it'll be a bit of work to get there from here. Maybe fun work! I'd be happy to discuss more :)

> However, it should be noted that, of these options, only the first is a user-facing "supported" option. The others are `swift -frontend` options, and as such the Swift team has been clear that this means the options may be changed or removed at any point in the future [3].

Yeah, there are a few good reasons for being careful about which options get surfaced; but I generally agree with you that _some_ at least partly-supported ones would be nice. IMO it's easiest and most flexible to aim to _export_ information out of the compiler in a machine-readable form, so you can post-process it with other tools. I've mostly been using JSON and CSV in recent efforts (e.g. see -stats-output-dir or -trace-stats-events). But I don't have an especially strong feeling about the degree-of-support / stability of such features; I'm going to have to leave that part of your question to others.
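
To illustrate the kind of post-processing meant here, the following is a minimal sketch, not a supported tool. It assumes a directory (such as one passed to -stats-output-dir) containing `*.json` files, each holding a flat JSON object that maps counter names to numbers; the function names `sum_counters` and `top_counters` are hypothetical and chosen only for this example.

```python
import json
from collections import defaultdict
from pathlib import Path


def sum_counters(stats_dir):
    """Sum every numeric counter across all *.json files in stats_dir.

    Assumes each file is a flat JSON object of counter-name -> number;
    non-numeric values are ignored.
    """
    totals = defaultdict(float)
    for path in sorted(Path(stats_dir).glob("*.json")):
        with open(path) as f:
            data = json.load(f)
        for name, value in data.items():
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                totals[name] += value
    return dict(totals)


def top_counters(totals, n=5):
    """Return the n (name, total) pairs with the largest totals."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Calling `top_counters(sum_counters("some-stats-dir"))` would list the largest counters summed over all files. Whether a large counter actually explains a slow build is a separate question, for the reasons discussed elsewhere in this message.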

> What's more, several contributors have noted the behavior of the options themselves could also be improved. Here's what I gathered from reading several JIRA bugs, commit messages, and mailing list discussions:
> 
> - SR-2910 <https://bugs.swift.org/browse/SR-2910> points out that `-debug-time-function-bodies` prints just `get {}` and `set {}` for struct getters and setters, and that this could be improved by printing the variable name as well.
> - The commit that added `-warn-long-function-message=` notes in its commit message <https://github.com/apple/swift/commit/18c75928639acf0ccf0e1fb6729eea75bc09cbd5> that the option only measures some aspects of type-checking, that it doesn't provide any information on how checking a function for the first time will take longer, doesn't report on other phases of compilation, and doesn't catch anything being type-checked multiple times.

Yes, this latter comment is the part I'm most concerned about. I think if you're going to go this way -- providing a diagnostic mode that assigns costs to source entities rather than compiler subsystems -- the difficult/interesting parts will be (a) making sure you capture enough of the costs of compilation to approximate "total time" meaningfully and (b) making sure you attribute "causes" meaningfully. The latter is really quite tricky since a lot of the compiler's work is lazy / demand-driven: the first time some (lazy and memoized) work is demanded, it's tempting to attribute the cost of the work to the entity first requesting it. But if a dozen other entities also demand that the work gets done (reusing the result, regardless of which of them requested it first) then the attribution will be ... unhelpful to the user trying to change their code to avoid the cost.
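
The attribution problem can be shown with a toy model. This is only a sketch under stated assumptions: the class `LazyWork`, the work item name, and the cost units are all made up for illustration and do not correspond to anything in the compiler.

```python
from collections import defaultdict


class LazyWork:
    """Toy model of memoized work that records which entities demanded it."""

    def __init__(self, costs):
        self.costs = costs                  # work item -> cost (made-up units)
        self.first_requester = {}           # work item -> entity that triggered it
        self.requesters = defaultdict(set)  # work item -> every entity that used it

    def demand(self, entity, item):
        self.requesters[item].add(entity)
        # Only the first demand "performs" the work; later ones reuse the result.
        self.first_requester.setdefault(item, entity)

    def blame_first(self):
        """Charge each item's full cost to whichever entity demanded it first."""
        blame = defaultdict(float)
        for item, entity in self.first_requester.items():
            blame[entity] += self.costs[item]
        return dict(blame)

    def blame_shared(self):
        """Split each item's cost evenly among all entities that demanded it."""
        blame = defaultdict(float)
        for item, entities in self.requesters.items():
            share = self.costs[item] / len(entities)
            for entity in entities:
                blame[entity] += share
        return dict(blame)


# Three hypothetical functions all need the same piece of memoized work.
work = LazyWork({"resolve protocol P": 12.0})
for func in ["f", "g", "h"]:
    work.demand(func, "resolve protocol P")

print(work.blame_first())   # {'f': 12.0}
print(work.blame_shared())  # f, g and h are each charged 4.0
```

Under first-requester blame, `f` looks expensive, yet deleting `f` would simply move the 12 units onto `g`. Even splitting is shown only as a contrast; it is one possible policy, not a claim about the right one.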

> I'd like to solicit ideas for future work here:
> 
> - At the very least, I could fix SR-2910 <https://bugs.swift.org/browse/SR-2910>.
> - I'd also like to address some of the issues mentioned in this commit message <https://github.com/apple/swift/commit/18c75928639acf0ccf0e1fb6729eea75bc09cbd5>, but I would like to confirm they're still something the Swift team would like work to be done on. Jordan, are these still relevant?
> - I'm wondering if anyone else has some work in-flight here, or if they have ideas. If you've been longing for a "killer feature", please reply here and let me know!

I'll let the others who feel strongly about the "official support" aspect speak to that; myself, I'd suggest looking at one of two approaches:

  1. See if you can leverage the existing counters and stats-gathering / reporting machinery in the compiler; for example the -trace-stats-events infrastructure lets you bundle together work done on behalf of a given source range and any changes to compiler counters during that work, as a single virtual "stats-event" in processing, and trace those events to a .csv file. Maybe something related to that would be helpful for the task you're interested in?

  2. Consider going up a level from declarations or functions to _files_, and see if there's a useful way to visualize hot-spots in the inter-file dependency graph that the driver interprets during incremental compilation. The units of work at this level are likely to be large (especially if they involve cascading dependencies that invalidate "downstream" files), and often cutting or changing an edge in that graph can be a simple matter of moving a declaration or changing its visibility: reasonably easy changes that don't cost the user much to experiment with.
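
As a rough sketch of the second approach, the snippet below ranks files by how many other files a change to them would transitively invalidate. The graph, the file names, and the functions `downstream` and `rank_by_fanout` are hypothetical; the sketch treats every edge as cascading, which is a simplifying assumption, and it does not show how to obtain the real graph from the driver.

```python
from collections import deque


def downstream(dependents, start):
    """Return every file transitively reachable from `start` (excluding it)."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def rank_by_fanout(dependents):
    """Sort files by number of downstream files, largest first, then by name."""
    files = set(dependents)
    for targets in dependents.values():
        files.update(targets)
    return sorted(((f, len(downstream(dependents, f))) for f in files),
                  key=lambda pair: (-pair[1], pair[0]))


# Hypothetical graph: an edge A -> B means "a change to A forces B to rebuild".
dependents = {
    "Model.swift": ["View.swift", "Store.swift"],
    "Store.swift": ["App.swift"],
    "View.swift": ["App.swift"],
}
print(rank_by_fanout(dependents))
# [('Model.swift', 3), ('Store.swift', 1), ('View.swift', 1), ('App.swift', 0)]
```

Files near the top of such a ranking are candidates for the kind of experiment described above, such as moving a declaration or narrowing its visibility to cut an edge.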

Hope that helps! Happy to discuss any of this further.

-Graydon
