[swift-dev] Advice on adding PGO support
vsk at apple.com
Thu Sep 8 10:27:40 CDT 2016
Thanks for your advice!
> On Sep 7, 2016, at 12:53 PM, Erik Eckstein <eeckstein at apple.com> wrote:
> Hi Vedant,
> nice work!
>> On Sep 6, 2016, at 12:36 PM, Vedant Kumar via swift-dev <swift-dev at swift.org> wrote:
>> Hi swift-dev,
>> I've been working on some patches which add basic support for PGO to swift .
>> What I have so far is just a proof-of-concept. I'd like to get some feedback on
>> the approach I've taken.
>> I've added support for loading profile data, matching up execution counts to
>> the right parts of the AST, and attaching those execution counts to conditional
>> branches. I added two fields to CondBranchInst (in SIL):
>> /// The number of times the True branch was executed.
>> Optional<uint64_t> TrueBBCount;
>> /// The number of times the False branch was executed.
>> Optional<uint64_t> FalseBBCount;
>> I fixed up the SILCloner and a few other sites where conditional branches are
>> created to propagate the branch taken counts. I have a patch to propagate
>> branch taken counts through the SILOptimizer, but I didn't include it in 
>> because I don't know how to write tests for it.
>> In IRGen, I added some logic to scale execution counts and create llvm
>> branch_weight metadata.
>> Some questions:
>> 1. Is it acceptable to make some SIL objects larger in order to store
>> execution counts (e.g CondBranchInst)?
> Yes, I don’t see a problem with that.
> But for saving some space, you might want to store the counts as a plain uint64_t instead of an Optional and use a special value (e.g. ~0) for the none-case.
Right, this makes sense. Reserving that value should be fine, since it looks
like llvm expects weights to be scaled below UINT32_MAX.
>> If not, what's the best way to make
>> this information visible to SILOptimizer and IRGen?
>> 2. Is it better to associate counts with SIL instructions, or with
> I think storing the counts in SILSuccessor makes sense, because counts are needed for all kind of terminator instructions (like switch_enum), except for the unconditional br, but IMO we can live with wasting some bytes for this instructions.
Storing counts in SILSuccessor has pros and cons, like you mentioned. It may
lead to wasted space in some cases, but would require fewer invasive API
changes to implement.
> Another thing to consider is that the count information should be printed/parsed/serialized when writing and reading SIL.
Yes! I jotted these down as "TODO's" in my commit messages.
>> 3. Does anyone have tips on modifying swift/benchmark? I'd like to add a
>> benchmark driver which generates profile data, and another driver which
>> uses that data to turn on PGO.
I still need to work out how to configure swift/benchmark.
At any rate, I'm going to start gradually cleaning up these patches and
submitting PR's. I expect swift to support frontend-based PGO in the future,
but if it doesn't, some of these patches will still be useful. I'd rather have
them in-tree than let them bitrot.
Here's the first one:
Reviews appreciated :).
>>  https://github.com/vedantk/swift/tree/profile_use
>> swift-dev mailing list
>> swift-dev at swift.org
More information about the swift-dev