[swift-evolution] Pitch: Wrap calls to NSFileHandle and NSData in autorelease pools

John McCall rjmccall at apple.com
Fri Jul 14 15:24:48 CDT 2017


> On Jul 14, 2017, at 4:15 PM, Charles Srstka <cocoadev at charlessoft.com> wrote:
> 
>> On Jul 14, 2017, at 2:35 PM, John McCall <rjmccall at apple.com> wrote:
>> 
>>> On Jul 14, 2017, at 1:12 PM, Charles Srstka via swift-evolution <swift-evolution at swift.org> wrote:
>>> MOTIVATION:
>>> 
>>> Meet Bob. Bob is a developer with mostly C++ and Java experience, but who has been learning Swift. Bob needs to write an app to parse some proprietary binary data format that his company requires. Bob’s written this app, and it’s worked pretty well on Linux:
>>> 
>>> import Foundation
>>> 
>>> do {
>>>     let url = ...
>>>     
>>>     let handle = try FileHandle(forReadingFrom: url)
>>>     let bufsize = 1024 * 1024 // read 1 MiB at a time
>>>     
>>>     while true {
>>>         let data = handle.readData(ofLength: bufsize)
>>>         
>>>         if data.isEmpty {
>>>             break
>>>         }
>>>         
>>>         data.withUnsafeBytes { (bytes: UnsafePointer<UInt8>) in
>>>             // do something with bytes
>>>         }
>>>     }
>>> } catch {
>>>     print("Error occurred: \(error.localizedDescription)")
>>> }
>>> 
>>> Later, Bob needs to port this same app to macOS. All seems to work well, until Bob tries opening a large file of many gigabytes in size. Suddenly, the simple act of running the app causes Bob’s Mac to completely lock up, beachball, and finally pop up with the dreaded “This computer is out of system memory” message. If Bob’s particularly unlucky, things will lock up tight enough that he can’t even recover from there, and he may have to hard-reboot the machine.
>>> 
>>> What happened?
>>> 
>>> Experienced Objective-C developers will spot the problem right away: the Foundation APIs that Bob used generate autoreleased objects, which are not released until Bob’s loop finishes. However, Bob’s never programmed in Objective-C, and to him, this behavior is completely undecipherable.
>>> 
>>> After a copious amount of time spent Googling for answers and asking for help on various mailing lists and message boards, Bob finally gets the recommendation from someone to try wrapping the file handle read in an autorelease pool. So he does:
>>> 
>>> import Foundation
>>> 
>>> do {
>>>     let url = ...
>>>     
>>>     let handle = try FileHandle(forReadingFrom: url)
>>>     let bufsize = 1024 * 1024 // read 1 MiB at a time
>>>     
>>>     while true {
>>>         let data = autoreleasepool { handle.readData(ofLength: bufsize) }
>>>         
>>>         if data.isEmpty {
>>>             break
>>>         }
>>>         
>>>         data.withUnsafeBytes { (bytes: UnsafePointer<UInt8>) in
>>>             // do something with bytes
>>>         }
>>>     }
>>> } catch {
>>>     print("Error occurred: \(error.localizedDescription)")
>>> }
>>> 
>>> Unfortunately, Bob’s program still eats RAM like Homer Simpson at an all-you-can-eat buffet. It turns out the data.withUnsafeBytes call *also* causes the data to be autoreleased.
>> 
>> This seems like a bug that should be fixed.  I don't know why the other one would cause an unreclaimable autorelease.
> 
> Sticking a break at the end of my original code snippet so the loop runs only once and then running it through Instruments, it seems the NSConcreteData instance gets autoreleased… 32,769 times. o_O
> 
> First autorelease occurs inside -[NSConcreteFileHandle readDataOfLength:], the next 32,768 occur in Data.Iterator.next(), which is called by specialized RangeReplaceableCollection.init<A>.
> 
> I can send you the trace off-list if you’d like.

We should absolutely not need to do the later autoreleases.  We have logic to autorelease objects when calling returns-inner-pointer methods on them, but we shouldn't need to do that in safe patterns like what Data does here by scoping the pointer to the closure.  We probably just don't actually have a way to turn that logic off, i.e. an equivalent of objc_precise_lifetime in ObjC ARC.

I have no idea why the first autorelease wouldn't be reclaimed.  There's a well-known issue with micro-reductions involving autoreleases on x86, where the first autorelease from the executable doesn't get reclaimed because the dynamic linker's lazy-binding stub interferes somehow.  Can you verify that you still see that initial autorelease on subsequent Data creations?
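
In the meantime, here is a minimal sketch of the usual workaround, assuming the extra autoreleases described above are still present: put the entire loop body inside the pool, not just the read, so that anything autoreleased while the chunk is being processed is also drained once per iteration. The names withPoolIfAvailable and forEachChunk are hypothetical helpers invented for this illustration, and the canImport(ObjectiveC) check is an assumption so that the same source also builds where autoreleasepool is not available.

```swift
import Foundation

// Hypothetical helper: runs `body` inside an autorelease pool when the
// Objective-C runtime is present, and calls it directly otherwise.
func withPoolIfAvailable<T>(_ body: () throws -> T) rethrows -> T {
    #if canImport(ObjectiveC)
    return try autoreleasepool(invoking: body)
    #else
    return try body()
    #endif
}

// Hypothetical helper: reads the file at `url` in chunks of `chunkSize`
// bytes and hands each chunk to `body`. Both the read and the processing
// happen inside the same per-iteration pool.
func forEachChunk(of url: URL,
                  chunkSize: Int = 1024 * 1024,
                  _ body: (Data) throws -> Void) throws {
    let handle = try FileHandle(forReadingFrom: url)
    defer { handle.closeFile() }

    while true {
        // The closure returns true once the end of the file is reached.
        let reachedEnd = try withPoolIfAvailable { () -> Bool in
            let data = handle.readData(ofLength: chunkSize)
            if data.isEmpty {
                return true
            }
            try body(data)
            return false
        }
        if reachedEnd {
            break
        }
    }
}
```

A caller would write something like try forEachChunk(of: url) { data in ... } and do the withUnsafeBytes work inside that closure.  This only bounds the growth to one iteration's worth of objects; it does not address why the autoreleases happen in the first place.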

John.