[swift-dev] Profiling ARC

Thu Feb 16 23:40:22 CST 2017

Hi,

I just created a new PR #7557 (https://github.com/apple/swift/pull/7557)
which replaces #7421.

Thanks,
-- Mikio

2017-02-17 13:03 GMT+09:00 Slava Pestov via swift-dev <swift-dev at swift.org>:

>
> On Feb 16, 2017, at 6:48 PM, Jiho Choi via swift-dev <swift-dev at swift.org>
> wrote:
>
> Hi,
>
> I was curious about the overhead of ARC and started profiling some
> benchmarks from the Computer Language Benchmarks Game (
> http://benchmarksgame.alioth.debian.org/u64q/measurements.php?lang=swift).
> So far, it seems that ARC sequence optimization is surprisingly good and
> most benchmarks don't have to perform ARC operations as often as I
> expected.  I have some questions regarding this finding.
>
> I compiled all benchmarks with the "-O -wmo" flags and used Pin to count the
> number of calls into the ARC runtime (e.g., swift_rt_swift_retain).
>
> 1. Reference counting is generally considered to have high overhead because
> counting operations are frequent and must also be atomic.  At least for the
> benchmarks I tested, that is not the case, and there is almost no overhead.
> Is this expected behavior?  Or is it because the benchmarks are too simple
> (they are all single-file programs)?  How large would you estimate the
> overhead of ARC to be?
>
>
> It is possible that the optimizer eliminated many reference counting
> operations here. Also my understanding is that while atomic operations are
> more expensive than non-atomic operations, the real cost only comes into
> play if you actually have contention due to bouncing cache lines. In a
> single-threaded workload the overhead is not that great.
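To make the single-threaded case above concrete, here is a minimal Swift
sketch of a workload that asks for retain/release traffic on one thread.
The names Node and churn are made up for illustration and do not come from
the benchmarks discussed in this thread. Depending on the optimization
level, the compiler may remove some or all of the reference counting the
source appears to require, so the sketch is only informative together with
a tool that counts runtime calls.

```swift
// Hypothetical micro-workload; Node and churn are illustrative names.
final class Node {
    var value: Int
    init(value: Int) { self.value = value }
}

// Storing a reference into an array slot conceptually retains the object,
// and overwriting a slot releases the reference that was stored there before.
// The optimizer is free to remove pairs it can prove redundant.
func churn(_ node: Node, iterations: Int) -> Int {
    var slots: [Node] = []
    slots.reserveCapacity(8)
    for _ in 0..<8 { slots.append(node) }

    var sum = 0
    for i in 0..<iterations {
        slots[i % 8] = node                  // store: retain new, release old
        sum &+= slots[(i + 1) % 8].value     // read through another slot
    }
    return sum
}

let total = churn(Node(value: 1), iterations: 1_000)
print(total)
```

Building this with and without the flags mentioned in this thread and
comparing the counted runtime calls would show how much of the traffic
survives optimization.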
>
>
> 2. I also tried compiling the same benchmarks with "-Xfrontend
> -assume-single-threaded" to measure the overhead of atomic operations.
> Judging from the source code of this experimental pass and the SIL
> optimizer's statistics, the pass works as expected and converts all ARC
> operations in user code to nonatomic.  However, even with this flag, some
> atomic ARC runtime functions are still called from the user code (not the
> library).  More strangely, the SIL output shows that all ARC operations in
> the user code have been turned into nonatomic ones.  The documentation says
> ARC operations are never implicit in SIL.  So if there is no atomic ARC at
> the SIL level, I would expect the user code never to call the atomic ARC
> runtime.  Am I missing something?
>
>
> IRGen still emits atomic reference counting operations when it produces
> value witness operations. I think there’s a PR open right now to address
> this: https://github.com/apple/swift/pull/7421
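As a rough illustration of where value witness operations can come from,
consider an unspecialized generic function that copies a value whose type
contains a class reference. This is only a sketch: Payload, Box, and
duplicate are hypothetical names, and whether the function stays
unspecialized (and therefore copies through the value witness table rather
than through explicit SIL retains) depends on what the optimizer decides
for a given build.

```swift
// Hypothetical types; not taken from the benchmarks in this thread.
final class Payload {
    let id: Int
    init(id: Int) { self.id = id }
}

struct Box {
    var payload: Payload   // copying a Box must retain this reference
}

// If this generic function is not specialized for Box, the copies of `value`
// are performed without static knowledge of T, via T's value witnesses.
// @inline(never) only prevents inlining; it does not forbid specialization.
@inline(never)
func duplicate<T>(_ value: T) -> (T, T) {
    return (value, value)
}

let pair = duplicate(Box(payload: Payload(id: 7)))

// Both copies refer to the same Payload instance.
print(pair.0.payload === pair.1.payload)
```

In such a case the reference counting happens inside code that IRGen
generates for the type, which is consistent with the SIL for the user code
showing only nonatomic operations.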
>
>
> 3. Are there more realistic benchmarks available?  Swift's official
> benchmarks also seem pretty small.
>
>
> Contributions are welcome :-)
>
>
> Thanks,
> Jiho
> _______________________________________________
> swift-dev mailing list
> swift-dev at swift.org
> https://lists.swift.org/mailman/listinfo/swift-dev