[swift-dev] Profiling ARC

Thu Feb 16 22:03:39 CST 2017

> On Feb 16, 2017, at 6:48 PM, Jiho Choi via swift-dev <swift-dev at swift.org> wrote:
> 
> Hi,
> 
> I was curious about the overhead of ARC and started profiling some benchmarks from the Computer Language Benchmarks Game (http://benchmarksgame.alioth.debian.org/u64q/measurements.php?lang=swift).  So far, it seems that ARC sequence optimization is surprisingly good, and most benchmarks don't have to perform ARC operations as often as I expected.  I have some questions regarding this finding.
> 
> I compiled all benchmarks with the "-O -wmo" flags and counted the number of calls into the ARC runtime (e.g., swift_rt_swift_retain) using Pin.
> 
> 1. Reference counting is considered to have high overhead because counting operations are frequent and also have to be atomic.  At least for the benchmarks I tested, this is not the case and there is almost no overhead.  Is this expected behavior?  Or is it because the benchmarks are too simple (they are all single-file programs)?  What would you estimate the overhead of ARC to be?

It is possible that the optimizer eliminated many of the reference counting operations here. Also, my understanding is that while atomic operations are more expensive than non-atomic ones, the real cost only comes into play when there is actual contention, with cache lines bouncing between cores. In a single-threaded workload the overhead is not that great.
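
As a rough illustration of where retain/release traffic can and cannot be removed, here is a minimal sketch. It is not taken from the benchmarks, the names `Node`, `collect`, and `sumValues` are made up for this example, and what the optimizer actually does with it depends on the compiler version and flags.

```swift
// Hypothetical class used only for this illustration.
final class Node {
    var value: Int
    init(value: Int) { self.value = value }
}

// Each append stores another strong reference to `node` in the array,
// so this loop generally needs one retain per iteration.
func collect(_ node: Node, copies: Int) -> [Node] {
    var result: [Node] = []
    result.reserveCapacity(copies)
    for _ in 0..<copies {
        result.append(node)
    }
    return result
}

// This loop only reads through the reference and stores no new strong
// reference, so any retain/release pairs around the access are
// candidates for removal by the optimizer.
func sumValues(_ node: Node, times: Int) -> Int {
    var total = 0
    for _ in 0..<times {
        total += node.value
    }
    return total
}
```

If most of a benchmark's hot loops look like the second function, a low count of runtime retain/release calls would not be surprising.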

> 
> 2. I also tried compiling the same benchmarks with "-Xfrontend -assume-single-threaded" to measure the overhead of atomic operations.  Judging from the source code of this experimental pass and the SIL optimizer's statistics, the pass seems to work as expected and converts all ARC operations in user code to nonatomic.  However, even with this flag, some atomic ARC runtime functions are still called from user code (not the library).  More strangely, the SIL output shows that all ARC operations in the user code have been turned into nonatomic ones.  The documentation says ARC operations are never implicit in SIL.  So if there are no atomic ARC operations at the SIL level, I would expect user code never to call the atomic ARC runtime.  Am I missing something?

IRGen still emits atomic reference counting operations when it produces value witness operations. I think there’s a PR open right now to address this: https://github.com/apple/swift/pull/7421
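
To make that concrete, here is a small sketch of code whose copies may go through value witnesses rather than through explicit SIL retain instructions. The names `Payload`, `Wrapper`, and `duplicate` are hypothetical, and whether the generic function stays unspecialized depends on the optimizer, so treat this as an assumption-laden example rather than a guaranteed reproduction.

```swift
// Hypothetical types used only for this illustration.
final class Payload {
    let id: Int
    init(id: Int) { self.id = id }
}

struct Wrapper {
    var payload: Payload   // copying a Wrapper must retain this reference
}

// If the optimizer does not specialize this function for Wrapper, the
// copies of `value` are made through T's value witness table. The
// retains then happen inside code emitted by IRGen, not in the SIL of
// the user code.
@inline(never)
func duplicate<T>(_ value: T) -> (T, T) {
    return (value, value)
}

let pair = duplicate(Wrapper(payload: Payload(id: 7)))
print(pair.0.payload === pair.1.payload)  // both copies share one Payload
```

Copying a value stored in an existential such as `Any` is another case where the copy is performed by a value witness.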

> 
> 3. Are there more realistic benchmarks available?  Swift's official benchmarks also seem pretty small.

Contributions are welcome :-)

> 
> Thanks,
> Jiho
