[swift-users] Data races with copy-on-write
Romain Jacquinot
rjacquinot at me.com
Tue Dec 5 04:20:33 CST 2017
Hi,
I'm trying to better understand how copy-on-write works, especially in a multithreaded environment, but there are a few things that confuse me.
The documentation for isKnownUniquelyReferenced(_:) says:
"If the instance passed as object is being accessed by multiple threads simultaneously, isKnownUniquelyReferenced(_:) may still return true. Therefore, you must only call this function from mutating methods with appropriate thread synchronization. That will ensure that isKnownUniquelyReferenced(_:) only returns true when there is really one accessor, or when there is a race condition, which is already undefined behavior."
Let's consider this sample code:
    import Dispatch

    // Takes the array by value, then mutates a local copy.
    func mutateArray(_ array: [Int]) {
        var elements = array
        elements.append(1)
    }

    let q1 = DispatchQueue(label: "testQ1")
    let q2 = DispatchQueue(label: "testQ2")
    let q3 = DispatchQueue(label: "testQ3")
    let iterations = 1000
    var array: [Int] = [1, 2, 3]

    q1.async {
        for _ in 0..<iterations {
            mutateArray(array)
        }
    }
    q2.async {
        for _ in 0..<iterations {
            mutateArray(array)
        }
    }
    q3.async {
        for _ in 0..<iterations {
            mutateArray(array)
        }
    }
// ...
From what I understand, since Array<T> implements copy-on-write, the array should be copied only when the mutating append(_:) method is called in mutateArray(_:). Therefore, isKnownUniquelyReferenced(_:) should be called to determine whether a copy is required or not.
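To make the mechanism described above concrete, here is a minimal single-threaded sketch of a copy-on-write type built on isKnownUniquelyReferenced(_:). The names Storage, CowIntList and storageID are hypothetical and exist only for illustration; this is not how Array is implemented in the standard library, only the general pattern.

```swift
// Hypothetical names: Storage and CowIntList are illustrations only,
// not the standard library's Array implementation.
final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct CowIntList {
    private var storage: Storage

    init(_ values: [Int]) { storage = Storage(values) }

    var values: [Int] { return storage.values }

    // Identity of the backing object, exposed only so sharing can be observed.
    var storageID: ObjectIdentifier { return ObjectIdentifier(storage) }

    mutating func append(_ value: Int) {
        // Copy the storage only if another value still references it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.values)
        }
        storage.values.append(value)
    }
}

let original = CowIntList([1, 2, 3])
var copy = original                 // shares storage, nothing is copied yet
let sharedBeforeWrite = (copy.storageID == original.storageID)
copy.append(1)                      // the first write triggers the copy
let sharedAfterWrite = (copy.storageID == original.storageID)
```

In this sketch, assigning the struct only shares the backing object; the uniqueness check and the copy happen inside the mutating method, which is the step that matters for the threading question.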
However, the array is being accessed by multiple threads simultaneously, which may cause a race condition, right? So why does the thread sanitizer never detect a race condition here? Is there some compiler optimization going on?
On the other hand, the thread sanitizer always detects a race condition if, for instance, I add the following code to mutate the array directly:
    for _ in 0..<iterations {
        array.append(1)
    }
In this case, is it because I mutate the array buffer that is being copied from other threads?
Even stranger, let's consider the following sample code:
    import Foundation

    class SynchronizedArray<Element> {
        // [...]
        private let lock = NSLock()
        private var _elements: Array<Element>

        // Returns the current value of the array while holding the lock.
        var elements: Array<Element> {
            lock.lock()
            defer { lock.unlock() }
            return _elements
        }

        // Runs the closure with exclusive, in-place access to the array.
        @discardableResult
        public final func access<R>(_ closure: (inout Array<Element>) throws -> R) rethrows -> R {
            lock.lock()
            defer { lock.unlock() }
            return try closure(&_elements)
        }
    }
    let syncArray = SynchronizedArray<Int>()

    func mutateArray() {
        syncArray.access { array in
            array.append(1)
        }
        var elements = syncArray.elements
        var copy = elements // [X] no race condition detected by TSan when I add this line
        elements.append(1) // race condition detected by TSan (if previous line is missing)
    }

    // Call mutateArray() from multiple threads like in the first sample code.
The line marked with [X] does nothing useful, yet adding it prevents the race condition on the next line from being detected by the thread sanitizer. Is this again because of some compiler optimization?
However, while the array buffer is being copied on one thread, another thread can mutate that same buffer with the append(_:) method, right? So, shouldn't the thread sanitizer detect a race condition here?
Please let me know if I have misunderstood how copy-on-write works in Swift.
Also, I'd like to know:
- besides capture lists, what are the correct ways to pass a copy-on-write value between threads?
- for thread-safe classes that expose an array as a property, should I always copy the private array variable before returning it from the public getter? If so, is there any recommended way to force-copy a value type in Swift?
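As a reference point for the two questions above, here is a minimal sketch of both approaches: a capture list that hands a closure its own copy of an array, and a getter that returns the array while a lock is held. LockedIntArray and the queue label are hypothetical names, and the sketch assumes that a value read under the lock can then be used as an independent read-only snapshot; it is not a statement about what the thread sanitizer will report.

```swift
import Foundation

// Hypothetical wrapper, for illustration only.
final class LockedIntArray {
    private let lock = NSLock()
    private var _elements: [Int] = []

    // Returns a snapshot; the value is read while the lock is held.
    var elements: [Int] {
        lock.lock()
        defer { lock.unlock() }
        return _elements
    }

    func append(_ value: Int) {
        lock.lock()
        defer { lock.unlock() }
        _elements.append(value)
    }
}

let shared = LockedIntArray()
let queue = DispatchQueue(label: "example.worker", attributes: .concurrent)
let group = DispatchGroup()

// 1. Capture list: the closure captures the value of `numbers` as it is now.
var numbers = [1, 2, 3]
queue.async(group: group) { [numbers] in
    // `numbers` here is the closure's own immutable copy.
    shared.append(numbers.count)
}
numbers.append(4)   // does not affect the copy captured above
group.wait()
let countSeenByClosure = shared.elements[0]

// 2. Locked getter: each worker appends under the lock, then reads a snapshot.
for i in 0..<100 {
    queue.async(group: group) {
        shared.append(i)
        let snapshot = shared.elements   // local value, read-only here
        _ = snapshot.count
    }
}
group.wait()
let finalCount = shared.elements.count
```

The getter does not force a copy explicitly; it relies on Array's value semantics, which is the assumption the second question is really asking about.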
Any help would be greatly appreciated.
Thanks.
Note: I'm using Swift 4 with the latest Xcode version (9.2 (9C40b)) and the thread sanitizer enabled.