[swift-users] Data races with copy-on-write
Romain Jacquinot
rjacquinot at me.com
Tue Dec 5 17:53:10 CST 2017
Hi Jordan,
For which specific code sample(s) do you think it’s a bug? In the previous code samples, are there cases where you think a data race is to be expected?
Thanks.
> On Dec 6, 2017, at 12:05 AM, Jordan Rose <jordan_rose at apple.com> wrote:
>
> I'm seeing the race too when compiling with -O (and TSan enabled). I'm 95% sure this should be valid Swift code without any further synchronization, so please file a bug at https://bugs.swift.org <https://bugs.swift.org/>. (But I'm only 95% sure because concurrency is hard.)
>
> Looking at the backtraces, it looks like one thread thinks it has exclusive ownership of the buffer while the other thread is still copying things out of it. This is a bug on Swift's side; this code should work. I'm pretty sure this is actually a situation I was just talking about with Michael I from the stdlib team a few days ago, so it's good to have a test case for it.
>
> Jordan
>
>
>> On Dec 5, 2017, at 14:23, Romain Jacquinot via swift-users <swift-users at swift.org <mailto:swift-users at swift.org>> wrote:
>>
>> Also, on the official Swift blog (https://developer.apple.com/swift/blog/?id=10 <https://developer.apple.com/swift/blog/?id=10>), it is stated that:
>>
>> "One of the primary reasons to choose value types over reference types is the ability to more easily reason about your code. If you always get a unique, copied instance, you can trust that no other part of your app is changing the data under the covers. This is especially helpful in multi-threaded environments where a different thread could alter your data out from under you.
>> […]
>> In Swift, Array, String, and Dictionary are all value types. They behave much like a simple int value in C, acting as a unique instance of that data. You don’t need to do anything special — such as making an explicit copy — to prevent other code from modifying that data behind your back. Importantly, you can safely pass copies of values across threads without synchronization. In the spirit of improving safety, this model will help you write more predictable code in Swift.”
>>
>> Yet, data race can occur here:
>>
>> public class MyClass {
>>
>> private var _myArray: Array<Int> = []
>> private var _lock = NSLock()
>>
>> public var myArray: Array<Int> {
>> lock.lock()
>> defer {
>> lock.unlock()
>> }
>> let copy = _myArray
>> return copy
>> }
>>
>> public func doSomethingThatMutatesArray() {
>> _lock.lock()
>> _myArray.append(1) // data race here: write of size 8 by thread 1
>> _lock.unlock()
>> }
>> }
>>
>>
>> let myClass = MyClass()
>>
>> func mutateArray() {
>> myClass.doSomethingThatMutatesArray()
>> var arrayCopy = myClass.myArray
>> arrayCopy.append(2) // data race here: read of size 8 by thread 10
>> }
>>
>> let q1 = DispatchQueue(label: "testQ1")
>> let q2 = DispatchQueue(label: "testQ2")
>> let q3 = DispatchQueue(label: "testQ3")
>>
>> let iterations = 100_000
>>
>> q1.async {
>> for _ in 0..<iterations {
>> mutateArray()
>> }
>> }
>>
>> q2.async {
>> for _ in 0..<iterations {
>> mutateArray()
>> }
>> }
>>
>> q3.async {
>> for _ in 0..<iterations {
>> mutateArray()
>> }
>> }
>>
>> for _ in 0..<iterations {
>> mutateArray()
>> }
>>
>> q1.sync {}
>> q2.sync {}
>> q3.sync {}
>> NSLog("done")
>>
>> It also can occur for instance if I replace the mutateArray() function with the following method:
>>
>> func enumerateArray() {
>> myClass.doSomethingThatMutatesArray()
>> for element in myClass.myArray { // data race here: read of size 8 by thread 5
>> let _ = element
>> }
>> }
>>
>> How can I return a “predictable” copy from the MyClass.myArray getter, so that I can later mutate the copy without synchronization like in the mutateArray() function?
>>
>>> On Dec 5, 2017, at 9:23 PM, Romain Jacquinot via swift-users <swift-users at swift.org <mailto:swift-users at swift.org>> wrote:
>>>
>>> Hi Jens,
>>>
>>> In the SynchronizedArray class, I use a lock to perform mutating operations on the array. However, my questions concern the case where an array is exposed as a public property of a thread-safe class, like this:
>>>
>>> public class MyClass {
>>>
>>> private var _myArray: Array<Int> = []
>>> private var _lock = NSLock()
>>>
>>> public var myArray: Array<Int> {
>>> _lock.lock()
>>> defer { _lock.unlock() }
>>> return _myArray
>>> }
>>>
>>> public func doSomethingThatMutatesArray() {
>>> _lock.lock()
>>> _myArray.append(1)
>>> _lock.unlock()
>>> }
>>> }
>>>
>>> Arrays are implemented as structs in Swift. This is a value type, but because Array<T> implements copy-on-write, there is an issue if you do something like this from multiple threads:
>>>
>>> let myClass = MyClass()
>>>
>>> func callFromMultipleThreads() {
>>> let arrayCopy = myClass.myArray
>>> arrayCopy.append(2) // race condition here
>>> }
>>>
>>> This is quite problematic, because the user of MyClass shouldn’t have to worry about side-effects when using a copy of the value from myArray.
>>>
>>>> On Dec 5, 2017, at 8:22 PM, Jens Alfke <jens at mooseyard.com <mailto:jens at mooseyard.com>> wrote:
>>>>
>>>>
>>>>
>>>>> On Dec 5, 2017, at 6:24 AM, Michel Fortin via swift-users <swift-users at swift.org <mailto:swift-users at swift.org>> wrote:
>>>>>
>>>>> The array *storage* is copy on write. The array variable (which essentially contains a pointer to the storage) is not copy on write. If you refer to the same array variable from multiple threads, you have a race. Rather, use a different copy of the variable to each thread. Copied variables will share the same storage but will make a copy of the storage when writing to it.
>>>>
>>>> I think you’ve misunderstood. The race condition Romain is referring to is when the two threads both access the shared storage, through separate array variables.
>>>>
>>>> Romain:
>>>> From the thread sanitizer warnings, it sounds like the implementation of Array is not using synchronization around its call(s) to isKnownUniquelyReferenced … which would mean the class is not thread-safe.
>>>>
>>>> It’s pretty common for the regular (mutable) collection classes supplied by a framework to be non-thread-safe; for example consider Foundation and Java. The reason is that the overhead of taking a lock every time you access an array element is pretty high. Generally it’s preferable to use larger-granularity locks, i.e. grab an external lock before performing a number of array operations.
>>>>
>>>> —Jens
>>>
>>> _______________________________________________
>>> swift-users mailing list
>>> swift-users at swift.org <mailto:swift-users at swift.org>
>>> https://lists.swift.org/mailman/listinfo/swift-users <https://lists.swift.org/mailman/listinfo/swift-users>
>>
>> _______________________________________________
>> swift-users mailing list
>> swift-users at swift.org <mailto:swift-users at swift.org>
>> https://lists.swift.org/mailman/listinfo/swift-users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-users/attachments/20171206/2014fbd1/attachment.html>
More information about the swift-users
mailing list