[swift-evolution] [swift-users] Problem with COW optimization
Dave Abrahams
dabrahams at apple.com
Tue Sep 20 10:15:17 CDT 2016
on Tue Sep 20 2016, Adrian Zubarev <swift-evolution at swift.org> wrote:
> True, but how do you traverse back the tree from an inner node?
>
> Lets look at this simple tree:
>
> <root>
> <first>
> <second/>
> </first>
> </root>
> let second = /* get the second node */
>
> second.parent // <— How would you implement this with pure value type?
Two possibilities:
1. You don't. Instead your traversal state becomes a path from the root
of the tree.
2. You represent the graph more explicitly, e.g.:
struct XMLNode {
tag: String
attributes: [String]
parent: Int
children: [Int]
}
typealias XMLTree = [XMLNode]
where parent and children are indices into XMLTree.
> If parent were only a struct without a reference type behind the
> scenes the parent will also contain the current child, where every
> child has the parent and so one. The reference type is there as a
> layer to achieve this.
A CoW tree that uses references for relationships internally has some
interesting limitations:
* you have to choose between the ability to share unmodified subparts of
the tree between copies and the ability to reach a node's parent.
* if you share unmodified subparts, you need to check uniqueness of
all its ancestors before modifying a node in place
To me, it doesn't make sense to use references for relationships unless
they let you share unmodified subparts. If you don't share unmodified
subparts, you'll get much worse performance from such a CoW
representation than you would using the array representation I quoted
above.
So if you're really committed to using a reference for a parent
relationship, I'd ditch CoW either give the XML tree reference
semantics, or make it immutable (in which case its semantics are
indistinguishable from value semantics).
Hope this helps,
Dave
>
>
> What kind of a layer does indirect add on enums?
> Can indirect solve the problem here and can it be added to structs as well?
> If I’m missing here something, and there might be a true value type solution for this problem, I’d throw the whole project away and rebuild it from the beginning. :)
>
> Best regards,
>
> --
> Adrian Zubarev
> Sent with Airmail
>
> Am 20. September 2016 um 08:56:20, Dave Abrahams via swift-evolution (swift-evolution at swift.org) schrieb:
>
> on Mon Sep 19 2016, Adrian Zubarev <swift-evolution at swift.org> wrote:
>
>> I think I just found a solution to my problem:
>>
>> /// - Parameter child: The new `child` to add to the `children` array.
>> public mutating func add(_ child: Element) {
>>
>> let clonedChildReference = Reference(cloning: child.reference)
>> let index = self.reference.children.endIndex
>>
>> self.mutableInsert(clonedChildReference, at: index, isNotOwnReference: child.reference !== self.reference)
>> }
>>
>> /// Warning: Pass only clonded nodes of type Element to this function!
>
> I don't understand why any explicit cloning should be needed. An XML
> tree can be modeled perfectly well using arrays of value types, which
> will “clone” themselves as needed to maintain value semantics.
>
>> private mutating func mutableInsert(_ node: XML.Node, at index: Int,
>> isNotOwnReference: Bool) {
>>
>> // * If `self.reference` is NOT uniquely referenced and `node` is a String,
>> // we should rebuilt own reference.
>> // * If `self.reference` is NOT uniquely referenced and `node` is a Reference
>> // `isNotOwnReference` is true, we should rebuilt own reference.
>> // * If `self.reference` is NOT uniquely referenced and `node` is a clone
>> // reference to `self.reference` where is `isNotOwnReference` is false, we
>> // should check if there are more than **two** strong references to rebuild
>> // own reference, otherwise it's an implementation artifact and we can keep
>> // old reference (any `node` of type Reference is cloned before it's added
>> // to the child array).
>> let isNotKnownUniquelyReferenced = !isKnownUniquelyReferenced(&self.reference)
>>
>> var shouldRebuildReference = false
>>
>> switch node.kind {
>>
>> case .element(_):
>> let hasMoreThanTwoStrongReferences = (CFGetRetainCount(self.reference) - 1) > 2
>> shouldRebuildReference = (isNotKnownUniquelyReferenced && isNotOwnReference) || hasMoreThanTwoStrongReferences
>>
>> case .text(_):
>> shouldRebuildReference = isNotKnownUniquelyReferenced
>> }
>>
>> if shouldRebuildReference {
>>
>> self.reference = Reference(cloning: self.reference, wholeTree: true)
>> }
>>
>> self.reference.insert(node, at: index)
>> }
>> I’m using CFGetRetainCount(self.reference) to catch that implementation artifact.
>>
>> --
>> Adrian Zubarev
>> Sent with Airmail
>>
>> Am 19. September 2016 um 09:59:24, Adrian Zubarev (adrian.zubarev at devandartist.com) schrieb:
>>
>> Hello Dave,
>>
>> thank you for trying to help me. I’ll try to explain the issue with some more details.
>>
>> First here is some code:
>>
>> extension XML {
>>
>> public struct Element {
>>
>> // public for testing
>> public var reference: Reference
>>
>> public var name: String { return self.reference.name }
>>
>> public var children: [Element] {
>>
>> return self.reference.children.flatMap {
>>
>> guard case .element(let element) = $0.kind else { return nil }
>> return Element(wrapping: element)
>> }
>> }
>>
>> public init(name: String) {
>>
>> self.reference = Reference(name: name)
>> }
>>
>> public mutating func add(_ child: Element) {
>>
>> self.mutableInsert(Reference(cloning: child.reference), at: self.reference.children.endIndex)
>> }
>>
>> // Ignore XML.Node, it's a String or Reference
>> // Parameter `Node` is assumed to be a clone of a reference passed to `add` or `insert` method.
>> private mutating func mutableInsert(_ node: XML.Node, at index: Int) {
>>
>> // Clone own reference all way up to the root
>> if !isKnownUniquelyReferenced(&self.reference) {
>>
>> self.reference = Reference(cloning: self.reference, wholeTree: true)
>> }
>>
>> // Extract a reference or just insert a string as a child
>> guard case .element(let nodeReference) = node.kind else {
>>
>> self.reference.insert(node, at: index)
>> return
>> }
>>
>> // Check for possible debelopment bug
>> if nodeReference === self.reference {
>>
>> fatalError("wrong usage of `mutableInsert` function")
>> }
>>
>> self.reference.insert(nodeReference, at: index)
>> }
>>
>> ...
>> }
>> }
>>
>> extension XML.Element {
>>
>> // public for testing
>> public class Reference : XML.Node {
>>
>> let name: String
>>
>> private(set) weak var parent: Reference?
>>
>> private(set) var children: [XML.Node]
>>
>> var kind: XML.Node.Kind { return .element(self) }
>>
>> ...
>> }
>> }
>> Now lets focus on the problem.
>>
>> Every Element is baked with it’s own Reference to be able to
>> traverse the tree from any of it’s node all way up to the root for
>> example.
>>
>> Lets look again at the scenario I already described:
>>
>> var root = XML.Element(name: "root")
>> var elem = XML.Element(name: "elem")
>>
>> ObjectIdentifier(root.reference) // 0x000060000026ab40
>> ObjectIdentifier(elem.reference) // 0x000060800026bb00
>>
>> isKnownUniquelyReferenced(&root.reference) // true
>> isKnownUniquelyReferenced(&elem.reference) // true
>>
>> root.add(elem)
>>
>> isKnownUniquelyReferenced(&root.reference) // true
>>
>> root.add(root)
>>
>> // The reference of root has changed even if the second child
>> // was cloned and added as a new object to the reference.
>> // 0x000060000026ab40 <-- was thrown away
>> isKnownUniquelyReferenced(&root.reference) // true
>>
>> ObjectIdentifier(root.reference) // 0x000060000026c680 <— new one
>> The way I’m adding children to the tree is that every passed element
>> of type XML.Element stores a Reference, which will be cloned and
>> added as a new standalone object to the children array.
>>
>> The same happens when we try adding root as it’s own child. We copy
>> root struct which contains the same reference, then we clone it
>> inside add method, then we pass the new object to the mutableInsert
>> function. At that point we don’t need the old reference anymore, I’m
>> speaking of root.add(root). The problem here is that at that time
>> root.reference has 2 strong references which I cannot escape.
>>
>> I could workaround the problem if I knew the reference counter
>> value, because I could check if the passed Element contains the same
>> reference first. And if it does and we have exactly 2 strong
>> references, I don’t need to recreate root.reference here.
>>
>> But I couldn’t find any API for that. :/
>>
>> --
>> Adrian Zubarev
>> Sent with Airmail
>>
>> Am 19. September 2016 um 05:50:57, Dave Abrahams via swift-users
>> (swift-users at swift.org) schrieb:
>>
>> on Sun Sep 18 2016, Adrian Zubarev <swift-users-AT-swift.org> wrote:
>>
>>> Dear Swift community,
>>>
>>> currently I’m building a value type XML library which is baked behind
>>> the scene with a reference type to manage graph traversing between
>>> nodes. I also like to COW optimize the xml graph, but I run into one
>>> single problem atm.
>>>
>>> Image this xml tree:
>>>
>>> <root>
>>> <item/>
>>> </root>
>>> It’s just a root element with one single child. As for value types it
>>> should be totally fine to do something like this:
>>>
>>> // The given xml tree
>>> var root = XML.Element(name: "root")
>>> let item = XML.Element(name: "item")
>>> root.add(item)
>>>
>>> // The problematic behavior
>>> root.add(root)
>>> If this would be a simple value type without any references behind the
>>> scenes you could imagine that the result of the last code line will
>>> look like this:
>>>
>>> <root>
>>> <item/>
>>> <root>
>>> <item/>
>>> </root>
>>> </root>
>>
>> Yep, that's exactly the right answer for a tree with value semantics.
>> The simplest way to implement this tree is to use an Array for the child
>> nodes.
>>
>>> Basically we copied the whole tree and added it as the second child
>>> into the original root element.
>>>
>>> As for COW optimization this is a problem, just because the passed
>>> root is a copy of a struct that contains the exact same reference as
>>> the original root element.
>>
>> I don't understand why that's a problem.
>>
>>> isKnownUniquelyReferenced(&self.reference) will result in false inside
>>> the add method.
>>
>> ...as it should.
>>
>>> Is there any chance I could force my program to decrease the reference
>>> counter of that last item after I’m sure I don’t need it?!
>>
>> Which last item? When are you sure you don't need it? What result do
>> you hope for?
>>
>>> A few more details: inside the add method I’m always cloning the
>>> passed reference just because graphs aren’t that trivial and otherwise
>>> I could possibly end up with a cycle graph, which would be really
>>> bad. After that job I’m sure that I don’t need the passed reference
>>> anymore and I need a way to escape from it.
>>>
>>> I’d appreciate any suggestions and help. :)
>>
>> It's not clear what you want to acheive nor can I picture the code
>> you're using to acheive it, so it's hard to give useful feedback.
>>
>> Sorry,
>>
>> --
>> -Dave
>>
>> _______________________________________________
>> swift-users mailing list
>> swift-users at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-users
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>
>
> --
> -Dave
>
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
--
-Dave
More information about the swift-evolution
mailing list