[swift-users] Problem with mutable views and COW

Adrian Zubarev adrian.zubarev at devandartist.com
Fri Nov 18 13:47:03 CST 2016


Hi John,

Thank you for your long and informative answer. Only after reading it did I realize that I had made a couple of mistakes in my code snippets. Thank you for correcting me. :)

I have also done some further digging and found the thread "[swift-dev] [idle] COW wrapper in 30 lines", which explains a few things.

I believe the magical language support, which currently exists only internally, is called mutableAddressWithNativeOwner?!

I really wish I could build this now, but I have dropped the idea.

I will probably move both arrays (keys/values) directly into Document, declare them as public internal(set) var, and conform Document itself to MutableCollection so that the values array is mutated correctly.
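
A minimal sketch of that plan, assuming the arrays live directly in Document so that the arrays' own copy-on-write does the work. The append(key:value:) helper is hypothetical and exists only to populate the example:

```swift
public struct Document: MutableCollection {

    // Stored directly in the struct, so each array's own copy-on-write applies.
    public internal(set) var keys: [String] = []
    public internal(set) var values: [Int] = []

    public init() {}

    // Hypothetical helper, not part of the original code.
    mutating func append(key: String, value: Int) {
        keys.append(key)
        values.append(value)
    }

    public var startIndex: Int { return values.startIndex }
    public var endIndex: Int { return values.endIndex }

    public func index(after i: Int) -> Int {
        return values.index(after: i)
    }

    public subscript(position: Int) -> Int {
        get { return values[position] }
        set { values[position] = newValue }
    }
}

var document = Document()
document.append(key: "a", value: 1)

let copy = document
document[0] = 10 // mutates `document` only; `copy` keeps its own values
```

With this layout there is no separate view type, so no reference counting has to be managed by hand.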

Thank you very much for summing everything up nicely.



-- 
Adrian Zubarev
Sent with Airmail

Am 18. November 2016 um 20:18:44, John McCall (rjmccall at apple.com) schrieb:


On Nov 18, 2016, at 7:40 AM, Karl <razielim at gmail.com> wrote:


On 18 Nov 2016, at 13:05, Adrian Zubarev via swift-users <swift-users at swift.org> wrote:

Hi there,

I just can’t get my head around mutable views and COW.

Here is a small example:

final class Storage {
      
    var keys: [String] = []
    var values: [Int] = []
}

public struct Document {
      
    var _storageReference: Storage
      
    public init() {
          
        self._storageReference = Storage()
    }
      
    public init(_ values: DocumentValues) {
          
        self._storageReference = values._storageReference
    }
      
    public var values: DocumentValues {
          
        get { return DocumentValues(self) }
          
        set { self = Document(newValue) }
    }
}

public struct DocumentValues : MutableCollection {
      
    unowned var _storageReference: Storage
      
    init(_ document: Document) {
          
        self._storageReference = document._storageReference
    }
      
    public var startIndex: Int {
          
        return self._storageReference.keys.startIndex
    }
      
    public var endIndex: Int {
          
        return self._storageReference.keys.endIndex
    }
      
    public func index(after i: Int) -> Int {
          
        return self._storageReference.keys.index(after: i)
    }
      
    public subscript(position: Int) -> Int {
          
        get { return _storageReference.values[position] }
          
        set { self._storageReference.values[position] = newValue } // That will break COW
    }
}

First of all, the _storageReference property is unowned because I wanted to check the following:

var document = Document()

print(CFGetRetainCount(document._storageReference)) //=> 2
print(isKnownUniquelyReferenced(&document._storageReference)) // true

var values = document.values

print(CFGetRetainCount(values._storageReference)) //=> 2
print(isKnownUniquelyReferenced(&values._storageReference)) // false

Why is the second check false, even though the property is marked as unowned for the view?

Next, I have no idea how to correctly COW-optimize this view. Consider the following two scenarios:

Scenario A:

var document = Document()

// just assume we already added some values and can mutate safely on a given index
// mutation in place
document.values[0] = 10

versus:

Scenario B:

var document = Document()

let copy = document

// just assume we already added some values and can mutate safely on a given index
// mutation in place
document.values[0] = 10 // <--- this should only mutate `document` but not `copy`

We could change the subscript setter on the mutable view like this:

set {
              
    if !isKnownUniquelyReferenced(&self._storageReference) {
                  
        self._storageReference = ... // clone
    }
    self._storageReference.values[position] = newValue
}

There is only one problem here: we would end up cloning the storage every time because, as shown in the very first example, isKnownUniquelyReferenced returns false for scenario A even with unowned.

Any suggestions? 

PS: In general I also wouldn’t want to use unowned, because the view should be able to outlive its parent.




-- 
Adrian Zubarev
Sent with Airmail

_______________________________________________
swift-users mailing list
swift-users at swift.org
https://lists.swift.org/mailman/listinfo/swift-users


This is kind of an invalid/unsafe design IMO; DocumentValues may escape the scope of the Document and the underlying storage may be deallocated.

Instead, I’d recommend a function:

func withDocumentValues<T>(_ invoke: (inout DocumentValues)->T) -> T {
    var view = DocumentValues(self)
    defer { _fixLifetime(view) }
    return invoke(&view)
}

(Unfortunately, this isn’t completely safe, because somebody could still copy the DocumentValues out of their closure, the same way you can copy the pointer out of String’s withCString, but that’s a limitation of Swift right now.)

CC: John McCall, because I read his suggestion in the thread about contiguous memory/borrowing that we could have a generalised @noescape. In this example, you would want the DocumentValues parameter in the closure to be @noescape.

I think you guys understand this stuff, but let me talk through it, and I hope it will be illuminating about where we're thinking of taking the language.

In value semantics, you expect something like:
  let values = document.values
to produce an independent value, and mutations of it shouldn't affect the original document value.

But there is a situation where values aren't independent, which is when one value is just a projected component of another.  In Swift, this is (currently, at least) always expressed with properties and subscripts. So when you write:
  document.values.mutateInSomeWay()
this is expected to actually change the document.  So it makes language sense for views like "values" to be expressed in this way; the only question is whether that can be done efficiently while still providing a satisfactory level of safety etc.
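
The two expectations can be sketched with a plain stored property standing in for the view. SimpleDocument is a hypothetical type used only for illustration:

```swift
struct SimpleDocument {
    var values: [Int] = [1, 2, 3]
}

var document = SimpleDocument()

// Copying the component out produces an independent value.
var values = document.values
values[0] = 10            // `document.values` is unchanged

// Mutating through the property changes the document itself.
document.values[1] = 20
```

A stored array gets both behaviours for free; the difficulty discussed here is getting the same behaviour from a view that has to be created on demand.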

When a property is actually stored directly in a value, Swift allows direct access to it (although for subscripts this mechanism is intentionally not currently documented or exposed).  This sort of direct access is optimal, but it's not general enough for use cases like views and slices because the slice value doesn't actually exist anywhere; it needs to be created.  We do allow properties to be defined with get / set, but there are problems with that, which are exactly what you're seeing: slice values need to assert ownership of the underlying data if they're going to be used as independent values, but they also need to not assert ownership so that they don't interfere with copy-on-write.  get / set isn't good enough for this because get is used to both derive an independent value (which should assert ownership) and initiate a mutation (which should not).  The obvious solution is to allow a third accessor to be provided which is used when a value is mutated, as opposed to just copied (get) or overwritten wholesale (set).  We're still working out various ideas for how this will look at the language level.
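
A small sketch of the double role of get, assuming a computed property with only get / set. LoggedDocument and AccessLog are hypothetical names; the log simply records which accessor runs:

```swift
final class AccessLog {
    var events: [String] = []
}

struct LoggedDocument {
    let log = AccessLog()
    private var storedValues: [Int] = [1, 2, 3]

    var values: [Int] {
        get {
            log.events.append("get")
            return storedValues
        }
        set {
            log.events.append("set")
            storedValues = newValue
        }
    }
}

var document = LoggedDocument()

let snapshot = document.values   // a copy: calls get
document.values[0] = 10          // a mutation: calls get, then set

print(document.log.events)       // ["get", "get", "set"]
```

The getter cannot tell the copy from the start of the mutation, which is why a view returned from it cannot know whether it should assert ownership.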

So that would be sufficient to allow DocumentValues to store either a strong or an unowned reference to the storage, depending on how the property is being used.  However, that creates the problem that, like with Karl's solution, the value can be copied during the mutation, and the user would expect that to create an independent value, i.e. to promote an unowned reference to strong.  The most general solution for this is to provide some sort of "copy constructor" feature which would be used to create an independent value.  But that's a pretty large hammer to pull out for this nail.

A third problem is that the original document can be copied and/or mutated during the projection of the DocumentValues, leaving the copy / the view in a potentially inconsistent state.  But this is a problem that we expect to thoroughly solve with the ownership system, which will statically (or dynamically when necessary) prevent simultaneous conflicting accesses to a value.

In the meantime, I think the best alternative is to
  - allow the view to hold either an unowned or owned reference and
  - create a callback-based accessor like Karl's and document that copies from the value are not permitted
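
One way the callback-based approach might look, as a rough sketch rather than a definitive design. It simplifies the suggestion above: the view always holds a strong reference, and the uniqueness check runs on the document before the view is created, so the view's reference cannot interfere with it. The names clone(), makeUnique(), append(key:value:), value(at:) and withValues are hypothetical:

```swift
final class Storage {
    var keys: [String] = []
    var values: [Int] = []

    // Hypothetical helper: produces an independent copy of the storage.
    func clone() -> Storage {
        let copy = Storage()
        copy.keys = keys
        copy.values = values
        return copy
    }
}

struct DocumentValues {
    // Simplification: always a strong reference. Copies of this view must not
    // escape the closure passed to `withValues`.
    fileprivate let storage: Storage

    subscript(position: Int) -> Int {
        get { return storage.values[position] }
        set { storage.values[position] = newValue }
    }
}

struct Document {
    private var storage = Storage()

    // Clones the storage if another Document value shares it.
    private mutating func makeUnique() {
        if !isKnownUniquelyReferenced(&storage) {
            storage = storage.clone()
        }
    }

    // Hypothetical helper for populating the example.
    mutating func append(key: String, value: Int) {
        makeUnique()
        storage.keys.append(key)
        storage.values.append(value)
    }

    func value(at position: Int) -> Int {
        return storage.values[position]
    }

    // The uniqueness check happens before the view takes its reference.
    mutating func withValues<T>(_ body: (inout DocumentValues) -> T) -> T {
        makeUnique()
        var view = DocumentValues(storage: storage)
        return body(&view)
    }
}

var document = Document()
document.append(key: "a", value: 1)

let copy = document
document.withValues { view in view[0] = 10 }   // clones once, then mutates in place
```

Nothing in this sketch stops a caller from copying the view out of the closure; as noted above, that restriction can only be documented for now.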

On a purely technical level:
print(isKnownUniquelyReferenced(&values._storageReference)) // false
Why is the second check false, even if the property is marked as unowned for the view?

A function taking an "inout T" expects to be passed an l-value for an ordinary (strong) reference.  Swift makes this work when passing an unowned or weak reference by passing a temporary variable holding a temporarily-promoted strong reference.  That's usually good, but it's wrong for isKnownUniquelyReferenced, and even more unfortunately, I don't think there's any supported way to make this work in the current compiler; you need language support.
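
For contrast, a sketch of the check on ordinary strong references, which is the case the function is designed for. It deliberately leaves out the unowned case, since that depends on the temporary promotion described above:

```swift
final class Storage {}

var storage = Storage()

// Exactly one strong reference exists at this point.
let uniqueBefore = isKnownUniquelyReferenced(&storage)

// A second strong reference; keep it alive explicitly while checking.
let second = storage
let uniqueAfter = withExtendedLifetime(second) {
    isKnownUniquelyReferenced(&storage)
}
```

Here uniqueBefore is true and uniqueAfter is false, which is the behaviour a copy-on-write setter relies on.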

John.