[swift-evolution] What about garbage collection?

Mon Feb 8 15:00:57 CST 2016

On Feb 8, 2016, at 11:56 AM, Félix Cloutier via swift-evolution <swift-evolution at swift.org> wrote:
> Has there been a garbage collection thread so far? I understand that reference counting vs. garbage collection can be a heated debate, but it might be relevant to have it.

Technically speaking, reference counting is a form of garbage collection, but I get what you mean.  Since there are multiple forms of GC, I'll assume that you mean a generational mark and sweep algorithm like you’d see in a Java implementation.

> It seems to me that the two principal upsides of reference counting are that destruction is (essentially) deterministic and performance is more easily predicted.

Yes, deterministic destruction is a major feature.  Not having to explain what finalizers are (and why they shouldn’t generally be used) is a pretty huge.  Keep in mind that Swift interops with C, so deinit is unavoidable for certain types.

More pointedly, not relying on GC enables Swift to be used in domains that don’t want it - think boot loaders, kernels, real time systems like audio processing, etc.

We have discussed in the passed using hybrid approaches like introducing a cycle collector, which runs less frequently than a GC would.  The problem with this is that if you introduce a cycle collector, code will start depending on it. In time you end up with some libraries/packages that works without GC, and others that leak without it (the D community has relevant experience here).  As such, we have come to think that adding a cycle collector would be bad for the Swift community in the large.

> However, it comes with many downsides:
> 
> object references are expensive to update

Most garbage collectors have write barriers, which execute extra code when references are updated.  Most garbage collectors also have safe points, which means that extra instructions get inserted into loops.

> object references cannot be atomically updated

This is true, but Swift currently has no memory model and no concurrency model, so it isn’t clear that this is actually important (e.g. if you have no shared mutable state).

> heap fragmentation

This is at best a tradeoff depending on what problem you’re trying to solve (e.g. better cache locality or smaller max RSS of the process).  One thing that I don’t think is debatable is that the heap compaction behavior of a GC (which is what provides the heap fragmentation win) is incredibly hostile for cache (because it cycles the entire memory space of the process) and performance predictability.

Given that GC’s use a lot more memory than ARC systems do, it isn’t clear what you mean by GC’s winning on heap fragmentation.

> the closure capture syntax uses up an unreasonable amount of mindshare just because of [weak self]

I think that this specific point is solvable in others ways, but I’ll interpret this bullet as saying that you don’t want to worry about weak/unowned pointers.  I completely agree that we strive to provide a simple programming model, and I can see how "not having to think about memory management" seems appealing.

On the other hand, there are major advantages to the Swift model.  Unlike MRR, Swift doesn’t require you to micromanage memory: you think about it at the object graph level when you’re building out your types.  Compared to MRR, ARC has moved memory management from being imperative to being declarative.  Swift also puts an emphasis on value types, so certain problems that you’d see in languages like Java are reduced.

That said, it is clear that it takes time and thought to use weak/unowned pointers correctly, so the question really becomes: does reasoning about your memory at the object graph level and expressing things in a declarative way contribute positively to your code?  

My opinion is yes: while I think it is silly to micromanage memory, I do think thinking about it some is useful. I think that expressing that intention directly in the code adds value in terms of maintenance of the code over time and communication to other people who work on it.

> Since Swift doesn't expose memory management operations outside of `autoreleasepool`, it seems to me that you could just drop in a garbage collector instead of reference counting and it would work (for most purposes).
> 
> Has a GC been considered at all?

GC also has several *huge* disadvantages that are usually glossed over: while it is true that modern GC's can provide high performance, they can only do that when they are granted *much* more memory than the process is actually using.  Generally, unless you give the GC 3-4x more memory than is needed, you’ll get thrashing and incredibly poor performance.  Additionally, since the sweep pass touches almost all RAM in the process, they tend to be very power inefficient (leading to reduced battery life).

I’m personally not interested in requiring a model that requires us to throw away a ton of perfectly good RAM to get an “simpler" programming model - particularly on that adds so many tradeoffs.

-Chris

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160208/9d99990e/attachment.html>