[swift-evolution] [swift-users] Problem with COW optimization

Adrian Zubarev adrian.zubarev at devandartist.com
Mon Sep 19 05:18:32 CDT 2016


I think I just found a solution to my problem:

/// - Parameter child: The new `child` to add to the `children` array.
public mutating func add(_ child: Element) {

   let clonedChildReference = Reference(cloning: child.reference)
   let index = self.reference.children.endIndex
             
   self.mutableInsert(clonedChildReference, at: index, isNotOwnReference: child.reference !== self.reference)
}
         
/// Warning: Pass only clonded nodes of type Element to this function!
private mutating func mutableInsert(_ node: XML.Node, at index: Int, isNotOwnReference: Bool) {

   // * If `self.reference` is NOT uniquely referenced and `node` is a String,
   //   we should rebuilt own reference.
   // * If `self.reference` is NOT uniquely referenced and `node` is a Reference
   //   `isNotOwnReference` is true, we should rebuilt own reference.
   // * If `self.reference` is NOT uniquely referenced and `node` is a clone  
   //   reference to `self.reference` where is `isNotOwnReference` is false, we
   //   should check if there are more than **two** strong references to rebuild
   //   own reference, otherwise it's an implementation artifact and we can keep
   //   old reference (any `node` of type Reference is cloned before it's added  
   //   to the child array).
   let isNotKnownUniquelyReferenced = !isKnownUniquelyReferenced(&self.reference)
             
   var shouldRebuildReference = false
             
   switch node.kind {
                 
   case .element(_):
      let hasMoreThanTwoStrongReferences = (CFGetRetainCount(self.reference) - 1) > 2
      shouldRebuildReference = (isNotKnownUniquelyReferenced && isNotOwnReference) || hasMoreThanTwoStrongReferences
                 
   case .text(_):
      shouldRebuildReference = isNotKnownUniquelyReferenced
   }
             
   if shouldRebuildReference {
                 
      self.reference = Reference(cloning: self.reference, wholeTree: true)
   }

   self.reference.insert(node, at: index)
}
I’m using CFGetRetainCount(self.reference) to catch that implementation artifact.



-- 
Adrian Zubarev
Sent with Airmail

Am 19. September 2016 um 09:59:24, Adrian Zubarev (adrian.zubarev at devandartist.com) schrieb:

Hello Dave,

thank you for trying to help me. I’ll try to explain the issue with some more details.

First here is some code:

extension XML {
      
    public struct Element {

        // public for testing   
        public var reference: Reference

        public var name: String { return self.reference.name }

        public var children: [Element] {
              
            return self.reference.children.flatMap {
                  
                guard case .element(let element) = $0.kind else { return nil }
                return Element(wrapping: element)
            }
        }
          
        public init(name: String) {
              
            self.reference = Reference(name: name)
        }

        public mutating func add(_ child: Element) {
              
            self.mutableInsert(Reference(cloning: child.reference), at: self.reference.children.endIndex)
        }
          
        // Ignore XML.Node, it's a String or Reference
        // Parameter `Node` is assumed to be a clone of a reference passed to `add` or `insert` method.
        private mutating func mutableInsert(_ node: XML.Node, at index: Int) {
              
            // Clone own reference all way up to the root
            if !isKnownUniquelyReferenced(&self.reference) {
                  
                self.reference = Reference(cloning: self.reference, wholeTree: true)
            }
              
            // Extract a reference or just insert a string as a child
            guard case .element(let nodeReference) = node.kind else {
                  
                self.reference.insert(node, at: index)
                return
            }
              
            // Check for possible debelopment bug
            if nodeReference === self.reference {
                  
                fatalError("wrong usage of `mutableInsert` function")
            }
              
            self.reference.insert(nodeReference, at: index)
        }
          
        ...
    }
}

extension XML.Element {
      
    // public for testing
    public class Reference : XML.Node {
          
        let name: String

        private(set) weak var parent: Reference?

        private(set) var children: [XML.Node]

        var kind: XML.Node.Kind { return .element(self) }

        ...
    }
}
Now lets focus on the problem.

Every Element is baked with it’s own Reference to be able to traverse the tree from any of it’s node all way up to the root for example.

Lets look again at the scenario I already described:

var root = XML.Element(name: "root")
var elem = XML.Element(name: "elem")

ObjectIdentifier(root.reference) // 0x000060000026ab40
ObjectIdentifier(elem.reference) // 0x000060800026bb00

isKnownUniquelyReferenced(&root.reference) // true
isKnownUniquelyReferenced(&elem.reference) // true

root.add(elem)

isKnownUniquelyReferenced(&root.reference) // true

root.add(root)

// The reference of root has changed even if the second child   
// was cloned and added as a new object to the reference.
// 0x000060000026ab40 <-- was thrown away
isKnownUniquelyReferenced(&root.reference) // true

ObjectIdentifier(root.reference) // 0x000060000026c680 <— new one
The way I’m adding children to the tree is that every passed element of type XML.Element stores a Reference, which will be cloned and added as a new standalone object to the children array.

The same happens when we try adding root as it’s own child. We copy root struct which contains the same reference, then we clone it inside add method, then we pass the new object to the mutableInsert function. At that point we don’t need the old reference anymore, I’m speaking of root.add(root). The problem here is that at that time root.reference has 2 strong references which I cannot escape.

I could workaround the problem if I knew the reference counter value, because I could check if the passed Element contains the same reference first. And if it does and we have exactly 2 strong references, I don’t need to recreate root.reference here.

But I couldn’t find any API for that. :/



-- 
Adrian Zubarev
Sent with Airmail

Am 19. September 2016 um 05:50:57, Dave Abrahams via swift-users (swift-users at swift.org) schrieb:


on Sun Sep 18 2016, Adrian Zubarev <swift-users-AT-swift.org> wrote:

> Dear Swift community,
>
> currently I’m building a value type XML library which is baked behind
> the scene with a reference type to manage graph traversing between
> nodes. I also like to COW optimize the xml graph, but I run into one
> single problem atm.
>
> Image this xml tree:
>
> <root>
> <item/>
> </root>
> It’s just a root element with one single child. As for value types it
> should be totally fine to do something like this:
>
> // The given xml tree
> var root = XML.Element(name: "root")
> let item = XML.Element(name: "item")
> root.add(item)
>
> // The problematic behavior
> root.add(root)
> If this would be a simple value type without any references behind the
> scenes you could imagine that the result of the last code line will
> look like this:
>
> <root>
> <item/>
> <root>
> <item/>
> </root>
> </root>

Yep, that's exactly the right answer for a tree with value semantics.
The simplest way to implement this tree is to use an Array for the child
nodes.

> Basically we copied the whole tree and added it as the second child
> into the original root element.
>
> As for COW optimization this is a problem, just because the passed
> root is a copy of a struct that contains the exact same reference as
> the original root element.

I don't understand why that's a problem.

> isKnownUniquelyReferenced(&self.reference) will result in false inside
> the add method.

...as it should.

> Is there any chance I could force my program to decrease the reference
> counter of that last item after I’m sure I don’t need it?!

Which last item? When are you sure you don't need it? What result do
you hope for?

> A few more details: inside the add method I’m always cloning the
> passed reference just because graphs aren’t that trivial and otherwise
> I could possibly end up with a cycle graph, which would be really
> bad. After that job I’m sure that I don’t need the passed reference
> anymore and I need a way to escape from it.
>
> I’d appreciate any suggestions and help. :)

It's not clear what you want to acheive nor can I picture the code
you're using to acheive it, so it's hard to give useful feedback.

Sorry,

--
-Dave

_______________________________________________
swift-users mailing list
swift-users at swift.org
https://lists.swift.org/mailman/listinfo/swift-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160919/d22621c3/attachment.html>


More information about the swift-evolution mailing list