[swift-evolution] What about garbage collection?

Mon Feb 8 15:55:53 CST 2016

TL;DR - I agree generational garbage collection is not a great idea

> On Feb 8, 2016, at 2:00 PM, Chris Lattner via swift-evolution <swift-evolution at swift.org> wrote:
>> However, it comes with many downsides:
>> 
>> object references are expensive to update
> 
> Most garbage collectors have write barriers, which execute extra code when references are updated.  Most garbage collectors also have safe points, which means that extra instructions get inserted into loops.

Many even have read barriers for when the heap is compacted.
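
To make the write barrier cost concrete, here is a minimal sketch of a card-marking barrier, assuming a toy heap of numbered slots. The names (CardMarkingHeap, store, cardSize) are hypothetical and not taken from any real collector; the point is only that every reference store does extra bookkeeping.

```swift
// Toy heap: each slot holds an optional reference to another slot.
// All names here are made up for illustration.
struct CardMarkingHeap {
    static let cardSize = 4          // slots covered by one card (assumed value)
    var fields: [Int?]
    var dirtyCards: [Bool]
    var barrierHits = 0

    init(slotCount: Int) {
        fields = [Int?](repeating: nil, count: slotCount)
        let cardCount = (slotCount + Self.cardSize - 1) / Self.cardSize
        dirtyCards = [Bool](repeating: false, count: cardCount)
    }

    // Every reference store is routed through this function.
    mutating func store(_ target: Int?, into slot: Int) {
        fields[slot] = target                     // the store the program asked for
        dirtyCards[slot / Self.cardSize] = true   // extra work: remember which card changed
        barrierHits += 1                          // bookkeeping for the demo only
    }
}
```

A collector built this way would later rescan only the dirty cards, which is the benefit paid for by the extra instructions on each store.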

>> heap fragmentation
> 
> This is at best a tradeoff depending on what problem you’re trying to solve (e.g. better cache locality or smaller max RSS of the process).  One thing that I don’t think is debatable is that the heap compaction behavior of a GC (which is what provides the heap fragmentation win) is incredibly hostile for cache (because it cycles the entire memory space of the process) and performance predictability.
> 
> Given that GC’s use a lot more memory than ARC systems do, it isn’t clear what you mean by GC’s winning on heap fragmentation

In my experience, only copying collectors focus on cache locality. Mark and sweep is usually a fallback for when the copying collectors fail to prevent heap growth, and compacting the mature objects to actually reduce RSS is a last resort.

> 
>> the closure capture syntax uses up an unreasonable amount of mindshare just because of [weak self]
> 
> I think that this specific point is solvable in other ways, but I’ll interpret this bullet as saying that you don’t want to worry about weak/unowned pointers.  I completely agree that we strive to provide a simple programming model, and I can see how "not having to think about memory management" seems appealing.
> 
> On the other hand, there are major advantages to the Swift model.  Unlike MRR, Swift doesn’t require you to micromanage memory: you think about it at the object graph level when you’re building out your types.  Compared to MRR, ARC has moved memory management from being imperative to being declarative.  Swift also puts an emphasis on value types, so certain problems that you’d see in languages like Java are reduced.
> 

Small changes in the relationships between objects and in object lifetimes in an object-oriented design can have a huge impact when doing manual memory management. What was a scope-lived instance can become a long-lived, shared instance as part of new requirements, and with manual memory management that can force a large number of changes to accommodate the new ownership rules. This is sometimes what prompts people to move from alloc/free or scoped memory management to MRR.

MRR can then have issues when you are not tracking that objects may reference one another in a cycle, which weak references can solve.

The important part of declarative management IMHO is that there are rules for how separate subsystems within your architecture reference one another (e.g. weak references to delegates). That eliminates a lot of the macro-level problems. Comparatively, micro-level problems can be understood as part of a subsystem's design.
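
As an illustration of such a rule, here is a minimal sketch of the "weak references to delegates" convention. The types (DownloadDelegate, Downloader, Screen) and runDemo are hypothetical names invented for this example.

```swift
// Hypothetical subsystem boundary: a downloader reports back to whoever owns it.
protocol DownloadDelegate: AnyObject {
    func downloadDidFinish(_ name: String)
}

final class Downloader {
    // The rule, declared once in the type: a subsystem never owns its delegate.
    weak var delegate: DownloadDelegate?

    func finish(_ name: String) {
        delegate?.downloadDidFinish(name)
    }
}

final class Screen: DownloadDelegate {
    let downloader = Downloader()     // strong: the screen owns its downloader
    var finished: [String] = []

    init() {
        downloader.delegate = self    // the back-reference is weak, so no cycle forms
    }

    func downloadDidFinish(_ name: String) {
        finished.append(name)
    }
}

func runDemo() -> (finished: [String], screenWasFreed: Bool) {
    var screen: Screen? = Screen()
    weak var probe = screen           // observes the screen without keeping it alive
    screen?.downloader.finish("a.png")
    let finished = screen?.finished ?? []
    screen = nil                      // drop the only strong reference
    return (finished, probe == nil)
}
```

If `delegate` were declared as a strong reference, the screen and its downloader would keep each other alive and the probe would still be non-nil after `screen = nil`.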

> That said, it is clear that it takes time and thought to use weak/unowned pointers correctly, so the question really becomes: does reasoning about your memory at the object graph level and expressing things in a declarative way contribute positively to your code?
> 
> My opinion is yes: while I think it is silly to micromanage memory, I do think thinking about it some is useful. I think that expressing that intention directly in the code adds value in terms of maintenance of the code over time and communication to other people who work on it.

In generational systems there are de-optimization issues when an application (or application server) is composed of components with differing memory needs.

For example, a web application may have a current request, current transaction, user state, and a cache of persisted state. All of these have differing lifetimes. If the GC is optimized toward handling a younger generation consisting of just the current transaction (due to timing or space tuning), current transaction data may very well get promoted to the mature generation, where it is likely to stay until there is a full GC. Even with tuning, heavy use of caching may still push data into the mature generation, including cache data which is meant to expire. You wind up having threads doing concurrent and incremental garbage collection to deal with the fact that while the GC can be tuned for performance of any one component, it cannot be tuned to handle them well in aggregate.
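
To make the promotion problem concrete, here is a toy model, not how any real collector is implemented. It assumes time is measured in allocations, that every survivor of a minor collection is promoted immediately, and that a full collection never runs. All names (ToyGenerationalHeap, strandedGarbage, and so on) are hypothetical.

```swift
// Toy model of a two-generation heap. Time is counted in allocations.
struct Obj {
    let diesAt: Int   // clock value at which the object becomes unreachable
}

struct ToyGenerationalHeap {
    let nurseryCapacity: Int
    var nursery: [Obj] = []
    var mature: [Obj] = []
    var clock = 0

    init(nurseryCapacity: Int) {
        self.nurseryCapacity = nurseryCapacity
    }

    mutating func allocate(lifetime: Int) {
        if nursery.count == nurseryCapacity { minorCollect() }
        clock += 1
        nursery.append(Obj(diesAt: clock + lifetime))
    }

    // Simplification: anything still alive at a minor collection is promoted.
    mutating func minorCollect() {
        let survivors = nursery.filter { $0.diesAt > clock }
        mature.append(contentsOf: survivors)
        nursery.removeAll()
    }

    // Dead objects that only a full collection would reclaim.
    var deadInMature: Int {
        mature.filter { $0.diesAt <= clock }.count
    }
}

// Allocate 1000 "transaction" objects that each live for 50 allocations.
func strandedGarbage(nurseryCapacity: Int) -> Int {
    var heap = ToyGenerationalHeap(nurseryCapacity: nurseryCapacity)
    for _ in 0..<1000 { heap.allocate(lifetime: 50) }
    return heap.deadInMature
}
```

In this model a 16-slot nursery strands 950 dead objects in the mature space, and even a 200-slot nursery strands 200, because whatever happens to be alive at collection time gets promoted regardless of how soon it will die.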

MRR and declarative memory management systems generally are not negatively impacted by other in-process components.

>> Since Swift doesn't expose memory management operations outside of `autoreleasepool`, it seems to me that you could just drop in a garbage collector instead of reference counting and it would work (for most purposes).
>> 
>> Has a GC been considered at all?
> 
> GC also has several *huge* disadvantages that are usually glossed over: while it is true that modern GC's can provide high performance, they can only do that when they are granted *much* more memory than the process is actually using.  Generally, unless you give the GC 3-4x more memory than is needed, you’ll get thrashing and incredibly poor performance.  Additionally, since the sweep pass touches almost all RAM in the process, they tend to be very power inefficient (leading to reduced battery life).

Externalizing the marks from objects can turn much of the sweep into reads of memory rather than writes, which I believe does help power usage (although I’ve seen no studies of GC impact on battery life). It certainly helps for scripting languages, particularly in server-oriented applications on unix platforms, which are more apt to fork() processes to handle requests than to use threads, and which want to retain the benefits of CoW memory usage past the first GC cycle.
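
A minimal sketch of what externalized marks look like, assuming a toy heap where objects are identified by slot index. SideMarkedHeap and its members are hypothetical names; the point is that marking writes only to the side table, so the pages holding the objects themselves are never dirtied.

```swift
// Toy heap: objects are slot indices, and edges[i] lists the slots that slot i references.
struct SideMarkedHeap {
    let edges: [[Int]]
    // Mark bits live in their own array, separate from the object payloads.
    private(set) var marks: [Bool]

    init(edges: [[Int]]) {
        self.edges = edges
        self.marks = [Bool](repeating: false, count: edges.count)
    }

    // Mark everything reachable from the roots. Only `marks` is written.
    mutating func mark(from roots: [Int]) {
        marks = [Bool](repeating: false, count: edges.count)
        var stack = roots
        while let slot = stack.popLast() {
            if marks[slot] { continue }
            marks[slot] = true
            stack.append(contentsOf: edges[slot])
        }
    }

    // Sweep consults the side table and reports dead slots without touching payloads.
    func deadSlots() -> [Int] {
        marks.indices.filter { !marks[$0] }
    }
}
```

With the mark stored in each object header instead, the mark phase would write to every live object, and a forked child would end up with private copies of those pages after its first collection.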

The memory and CPU impact of using a fully automatic GC are not to be understated though. This is a strong reason that different mobile phone platforms have drastically different memory requirements for their devices, for instance.

-DW
