<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body><div>On Thu, Dec 17, 2015, at 09:37 AM, Joe Groff via swift-evolution wrote:<br></div>
<blockquote type="cite"><div>Hi everyone. Chris stole my thunder already—yeah, I've been working on a design for allowing properties to be extended with user-defined <s>delegates^W</s> behaviors. Here's a draft proposal that I'd like to open up for broader discussion. Thanks for taking a look!<br></div>
</blockquote><div> </div>
<div>Thanks for posting this! I just read through it, and there's a lot to like in here, but I also have a bunch of concerns. I'll go back through the document in order and respond to bits of it.<br></div>
<div> </div>
<div>I apologize in advance for the massive size of this email, and for its rambling nature. It is a bit of stream-of-consciousness. I also apologize if anything in here has already been addressed in this thread, as I've been writing it over several hours and I know the thread has had discussion during that time.<br></div>
<div> </div>
<blockquote><div>A var or let declaration can specify its behavior in parens after the keyword<br></div>
</blockquote><div> </div>
<div>I like this syntax.<br></div>
<div> </div>
<blockquote><div>Furthermore, the behavior can provide additional operations, such as clear-ing a lazy property, by accessing it with property.behavior syntax:<br></div>
</blockquote><div> </div>
<div>You already mentioned this at the end, but I'm concerned about the ambiguity between `foo.behavior` and `foo.someProp`. If the compiler always resolves ambiguity in one way, that makes it impossible to explicitly choose the alternative resolution (e.g. if `foo.lazy` resolves in favor of a property of the type, how do you access the lazy behavior? If it resolves in favor of the behavior, how do you get at the property instead?). Not just that, but it's also ambiguous to any reader; if I see `self.foo.bar` I have to know up-front whether "bar" is a behavior or a property of the variable's type.<br></div>
<div> </div>
<div>I'm mildly tempted to say we should use<br></div>
<div> </div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;"> `foo.lazy`.reset()<br></span></div>
<div> </div>
<div>but I admit it does look a bit odd, especially if accessing methods of behaviors ends up being common. Another idea might look like<br></div>
<div> </div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;"> foo.(lazy).reset()<br></span></div>
<div> </div>
<div>Or maybe we could even come up with a syntax that lets you omit the behavior name if it's unambiguous (e.g. only one behavior, or if the method/property you're accessing only exists on one behavior). Being able to omit the behavior name would be nice for defining resettable properties because saying something like `foo.resettable.reset()` is annoyingly redundant. Maybe something like `foo::reset()` or `foo#reset()`, which would be shorthand for `foo::lazy.reset()` or `foo#lazy.reset()`.<br></div>
<div> </div>
<blockquote><div><div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;"> public subscript&lt;Container&gt;(varIn _: Container,</span><br></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;"> initializer initial: () -> Value) -> Value {</span><br></div>
</div>
</blockquote><div> </div>
<div>I'm a bit concerned about passing in the Container like this. For class types it's probably fine, but for value types, it means we're passing a copy of the value in to the property, which just seems really weird (both because it's a copy, and because that copy includes a copy of the property).<br></div>
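<div> </div>
<div>A minimal sketch of the value-type concern, in plain Swift. Point and readX(varIn:) are hypothetical stand-ins for a container and a behavior accessor, not anything from the proposal:<br></div>
<div> </div>
<pre><code>// Hypothetical container type with two stored properties.
struct Point {
    var x = 0
    var y = 0
}

// Hypothetical stand-in for a behavior accessor that receives its container.
func readX(varIn container: Point) -&gt; Int {
    // `container` is an independent copy of the caller's struct,
    // including its own copy of `x`.
    var copy = container
    copy.y = 99  // invisible to the caller
    return copy.x
}

var p = Point()
p.x = 5
let seen = readX(varIn: p)
assert(seen == 5)
assert(p.y == 0)  // the original is untouched
</code></pre>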
<div> </div>
<div>Also the only example you gave that actually uses the container is Synchronized, but even there it's not great, because it means all the synchronized properties in the class share the same lock. But that's not how Obj-C atomic properties work, and there's really no benefit at all to locking the entire class when accessing a single property because it doesn't provide any guarantees about access to multiple properties (as the lock is unlocked in between each access).<br></div>
<div> </div>
<div>FWIW, the way Obj-C atomic properties work is that for scalars they use atomic unordered loads/stores (which is even weaker than memory_order_relaxed; all it guarantees is that every load sees a value that was written at some point, i.e. no half-written values). For structs it calls objc_copyStruct(), which uses a bank of 128 spinlocks and picks two of them based on the hash of the src/dst addresses (there's a comment saying the API was designed wrong, hence the need for 2 spinlocks; ideally it would only use one lock based on the address of the property because the other address is a local stack value). For objects it calls objc_getProperty() / objc_setProperty(), which use a separate bank of 128 spinlocks (and pick one based on the address of the ivar). The getter retains the object with the spinlock held and then autoreleases it outside of the spinlock. The setter just uses the spinlock to protect writing to the ivar, doing any retains/releases outside of it. I haven't tested but it appears that Obj-C++ properties containing C++ objects use yet another bank of 128 spinlocks, using the spinlock around the C++ copy operation.<br></div>
<div> </div>
<div>Ultimately, the point here is that the only interesting synchronization that can be done at the property level is unordered atomic access, and for any properties that can't actually use an atomic load/store (either because they're aggregates or because they're reference-counted objects) you really do want to use a spinlock to minimize the cost. But adding a spinlock to every single property is a lot of wasted space (especially because safe spinlocks on iOS require a full word), which is why the Obj-C runtime uses those banks of spinlocks.<br></div>
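<div> </div>
<div>For illustration only, a bank of locks keyed by storage address might be sketched like this. LockBank and its methods are made-up names, NSLock stands in for the runtime's spinlocks, and the shift-and-modulo hash is an assumption rather than what the Obj-C runtime actually does:<br></div>
<div> </div>
<pre><code>import Foundation

// Hypothetical fixed bank of locks shared by every property in the process.
final class LockBank {
    private let locks: [NSLock]

    init(count: Int = 128) {
        locks = (0..&lt;count).map { _ in NSLock() }
    }

    // Pick a lock from the address of the storage being protected.
    func index(for address: UnsafeRawPointer) -&gt; Int {
        // Drop the low bits, which are usually zero because of alignment.
        let bits = UInt(bitPattern: address) &gt;&gt; 4
        return Int(bits % UInt(locks.count))
    }

    // Run `body` while holding the lock that covers `address`.
    func withLock&lt;T&gt;(for address: UnsafeRawPointer, _ body: () -&gt; T) -&gt; T {
        let lock = locks[index(for: address)]
        lock.lock()
        defer { lock.unlock() }
        return body()
    }
}

let bank = LockBank()
let slot = UnsafeMutablePointer&lt;Int&gt;.allocate(capacity: 1)
slot.initialize(to: 0)
bank.withLock(for: UnsafeRawPointer(slot)) { slot.pointee += 1 }
assert(slot.pointee == 1)
slot.deallocate()
</code></pre>
<div> </div>
<div>No per-property storage is needed for the lock, at the cost that two unrelated properties can hash to the same lock.<br></div>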
<div> </div>
<div>In any case, I guess what I'm saying is we should ditch the Container argument. It's basically only usable for classes, and even then it's kind of strange for a property to actually care about its container.<br></div>
<div> </div>
<blockquote><div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">var `foo.lazy` = lazy(var: Int.self, initializer: { 1738 })<br></span></div>
</blockquote><div> </div>
<div>This actually won't work to replace existing lazy properties. It's legal today to write<br></div>
<div> </div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;"> lazy var x: Int = self.y + 1<br></span></div>
<div> </div>
<div>This works because the initializer expression isn't actually run until the property is accessed. But if the initializer is passed to the behavior function, then it can't possibly reference `self` as that runs before stage-1 initialization.<br></div>
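<div> </div>
<div>A small runnable example of the existing behavior (Counter is just an illustrative name):<br></div>
<div> </div>
<pre><code>final class Counter {
    var y = 41
    // Legal today: the initializer expression runs on first access,
    // after `self` is fully initialized.
    lazy var x: Int = self.y + 1
}

let c = Counter()
c.y = 9            // changed before `x` is first read
assert(c.x == 10)  // the initializer saw the updated `y`
</code></pre>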
<div> </div>
<div>So we need some way to distinguish behaviors that initialize immediately vs behaviors that initialize later. The former want an initializer on the behavior function, and may or may not care about having an initializer on the getter/setter. The latter don't want an initializer on the behavior, and do want one on the getter/setter. In theory you could use the presence of a declared `initializer` argument on the behavior function to distinguish between eager-initialized and lazy-initialized, though that feels a little odd. <br></div>
<div> </div>
<blockquote><div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">let (memoized) address = "\(street)\n\(city) \(postalCode)"</span><br></div>
</blockquote><div> </div>
<div>You're using un-qualified accesses to properties on self in the initializer here. I'm not actually against allowing that, but `lazy` properties today require you to use `self.`, otherwise any unqualified property access is resolved against the type instead of the value. I believe the current behavior is because non-lazy properties resolve unqualified properties this way, so `lazy` properties do too in order to allow you to add `lazy` to any property without breaking the existing initializer.<br></div>
<div> </div>
<div>This property declaration also runs into the eager-vs-delayed initializer issue I mentioned above.<br></div>
<div> </div>
<blockquote><div>A property behavior can model "delayed" initialization behavior, where the DI rules for var and let properties are enforced dynamically rather than at compile time<br></div>
</blockquote><div> </div>
<div>It looks to me that the only benefit this has versus IOUs is you can use a `let` instead of a `var`. It's worth pointing out that this actually doesn't even replace IOUs for @IBOutlets because it's commonly useful to use optional-chaining on outlets for code that might run before the view is loaded (and while optional chaining is possible with behavior access, it's a lot more awkward).<br></div>
<div> </div>
<blockquote><div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">let (delayed) x: Int</span><br></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">...<br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">self.x.delayed.initialize(x)<br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">...</span><br></div>
</blockquote><div> </div>
<div>Allowing `let` here is actually a violation of Swift's otherwise-strict rules about `let`. Specifically, Delayed here is a struct, but initializing it requires it to be mutable. So `let (delayed) x: Int` can't actually ever be initialized. You could make it a class, but that's a fairly absurd performance penalty for something that provides basically the same behavior as IOUs. You do remark later in detailed design about how the backing storage is always `var`, which solves this at a technical level, but it still appears to the user as though they're mutating a `let` property and that's strictly illegal today.<br></div>
<div> </div>
<div>I think the right resolution here is just to remove the `letIn` constructor and use `var` for these properties. The behavior itself (e.g. delayed) can document write-once behavior if it wants to. Heck, that behavior was only enforcing write-once in a custom initialize() method anyway, so nothing about the API would actually change.<br></div>
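<div> </div>
<div>As a sketch of why `var` is the honest spelling, here is a hypothetical write-once struct loosely modeled on the proposal's Delayed. The member names are assumptions, not the proposal's API:<br></div>
<div> </div>
<pre><code>// Hypothetical write-once storage.
struct Delayed&lt;Value&gt; {
    private var storage: Value?

    init() { storage = nil }

    var value: Value {
        guard let v = storage else { fatalError("read before initialization") }
        return v
    }

    // `mutating` is the crux: this cannot be called through a `let`.
    mutating func initialize(_ initial: Value) {
        precondition(storage == nil, "initialized twice")
        storage = initial
    }
}

var x = Delayed&lt;Int&gt;()  // must be `var`; with `let` the next line fails to compile
x.initialize(1738)
assert(x.value == 1738)
</code></pre>
<div> </div>
<div>The write-once rule is enforced by initialize() at runtime either way, so nothing is lost by declaring the property `var`.<br></div>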
<div> </div>
<blockquote><div>Resettable properties<br></div>
</blockquote><div> </div>
<div>The implementation here is a bit weird. If the property is nil, it invokes the initializer expression, every single time it's accessed. And the underlying value is optional. This is really implemented basically like a lazy property that doesn't automatically initialize itself.<br></div>
<div> </div>
<div>Instead I'd expect a resettable property to have eager-initialization, and to just eagerly re-initialize the property whenever it's reset. This way the underlying storage isn't Optional, the initializer expression is invoked at more predictable times, and it only invokes the initializer once per reset.<br></div>
<div> </div>
<div>The problem with this change is the initializer expression needs to be provided to the behavior when reset() is invoked rather than when the getter/setter is called.<br></div>
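<div> </div>
<div>A rough sketch of the eager version, assuming the initializer closure can be handed to the behavior up front and kept around for reset(). That assumption is exactly the open question above, and Resettable here is a hypothetical type:<br></div>
<div> </div>
<pre><code>// Hypothetical eager Resettable: storage is non-optional, and the
// initializer runs once at creation and once per reset().
struct Resettable&lt;Value&gt; {
    private let initializer: () -&gt; Value
    var value: Value

    init(initializer: @escaping () -&gt; Value) {
        self.initializer = initializer
        self.value = initializer()
    }

    mutating func reset() {
        value = initializer()
    }
}

var runs = 0
var foo = Resettable&lt;Int&gt; { runs += 1; return 10 }
foo.value = 42
_ = foo.value; _ = foo.value  // reads never re-run the initializer
foo.reset()
assert(foo.value == 10)
assert(runs == 2)             // once at creation, once for reset()
</code></pre>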
<div> </div>
<blockquote><div>NSCopying<br></div>
</blockquote><div> </div>
<div>We really need to support composition. Adding NSCopying to a property doesn't really change the behavior of the property itself, it just makes assignments to it automatically call copy() on the new value before doing the actual assignment. Composition in general is good, but NSCopying seems like an especially good example of where adding this kind of behavior should work fine with everything else.<br></div>
<div> </div>
<div>Based on the examples given here, there's really several different things behaviors do:<br></div>
<div> </div>
<div>* Behaviors that "decorate" the getter/setter, without actually changing the underlying value get/set. This includes property observers and Synchronized (although atomic properties ideally should alter the get/set to use atomic instructions when possible, but semantically it's the same as taking a per-property spinlock).<br></div>
<div>* Behaviors that transform the value. This is basically NSCopying, because it copies the value but otherwise wants to preserve any existing property behavior (just with the new value instead of the old). But e.g. lazy can also be thought of as doing this where the transform is from T to T? (the setter converts T into T? and assigns it to the underlying value; the getter unwraps the T? or initializes it if nil and returns T). Of course there is probably a difference between transformers that keep the same type and ones that change the type; e.g. property observers with NSCopying may want to invoke willSet with the initial uncopied value (in case the observer wants to change the assigned value), but didSet should of course be invoked with the resulting copied value. But transformers where the transformation is an implementation detail (such as lazy, which transforms T to T?) don't want to expose that implementation detail to the property observers. So maybe there's two types of transformers; one that changes the underlying type, and one that doesn't.<br></div>
<div>* Behaviors that don't alter the getter/setter but simply provide additional functionality. This is exemplified by Resettable (at least, with my suggested change to make it eagerly initialize), because it really just provides a .reset() function.<br></div>
<div>* The lazy vs eager initialized thing from before<br></div>
<div> </div>
<div>I suspect that we really should have a behavior definition that acknowledges these differences and makes them explicit in the API.<br></div>
<div> </div>
<div>There's also a lot of composition concerns here. For example, synchronized should probably always be the innermost decorator, because the lock is really only protecting the storage of the value and shouldn't e.g. cover property observers or NSCopying. Property observers should probably always be the outermost decorator (and willSet should even fire before NSCopying, partially because it should be whatever value the user actually tried to assign, and because willSet observers can actually change the value being assigned and any such new value should then get copied by NSCopying).<br></div>
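<div> </div>
<div>Written out by hand, that ordering might look like the setter below. Document and its log are hypothetical, the log strings stand in for real observers, and the log itself is deliberately left unsynchronized to keep the sketch short:<br></div>
<div> </div>
<pre><code>import Foundation

// Hypothetical hand-written setter showing the ordering argued for above.
final class Document {
    private let lock = NSLock()
    private var storedTitle = NSString()
    private(set) var log: [String] = []

    var title: NSString {
        get {
            lock.lock()
            defer { lock.unlock() }
            return storedTitle
        }
        set {
            log.append("willSet")                      // 1. observer sees the caller's value
            let copied = newValue.copy() as! NSString  // 2. NSCopying
            log.append("copy")
            lock.lock()                                // 3. lock covers only the store
            storedTitle = copied
            lock.unlock()
            log.append("store")
            log.append("didSet")                       // 4. observer sees the copied value
        }
    }
}

let doc = Document()
let name = NSMutableString(string: "draft")
doc.title = name
name.append(" v2")  // a later mutation must not leak into the property
assert(doc.title.isEqual(to: "draft"))
</code></pre>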
<div> </div>
<div>Speaking of composition, mixing lazy and synchronized seems really problematic. If Synchronized uses a bank of locks like the Obj-C runtime, then lazy can't execute inside of the lock because the initializer might access something else that hits the same lock and causes an unpredictable deadlock. But it can't execute outside of the lock either because the initializer might then get executed twice (which would surprise everyone). So really the combination of lazy + synchronized needs to actually use a completely separate combined LazySynchronized type, one that provides the expected dispatch_once-like behavior.<br></div>
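<div> </div>
<div>One possible shape for such a combined type, as a sketch only: it assumes a dedicated lock per property (NSLock standing in for a once token), and every name in it is hypothetical:<br></div>
<div> </div>
<pre><code>import Foundation
import Dispatch

// Hypothetical combined behavior: the initializer runs at most once,
// under a lock that belongs to this property alone.
final class LazySynchronized&lt;Value&gt; {
    private let lock = NSLock()
    private var storage: Value?
    private let initializer: () -&gt; Value
    private(set) var initializerRuns = 0  // bookkeeping for the demo below

    init(_ initializer: @escaping () -&gt; Value) {
        self.initializer = initializer
    }

    var value: Value {
        lock.lock()
        defer { lock.unlock() }
        if let existing = storage { return existing }
        initializerRuns += 1
        let fresh = initializer()
        storage = fresh
        return fresh
    }
}

let shared = LazySynchronized&lt;Int&gt; { 7 }
// Hit the property from several threads at once.
DispatchQueue.concurrentPerform(iterations: 8) { _ in _ = shared.value }
assert(shared.value == 7)
assert(shared.initializerRuns == 1)
</code></pre>
<div> </div>
<div>Like dispatch_once, this still deadlocks if the initializer reads the same property, but it can no longer collide with an unrelated property's lock.<br></div>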
<div> </div>
<blockquote><div>Referencing Properties with Pointers<br></div>
<div>...<br></div>
<div>A production-quality stdlib implementation could use compiler magic to ensure the property is stored in-line in an addressable way.<br></div>
</blockquote><div> </div>
<div>Sounds like basically an implementation that just stores the value inline as a value and uses Builtin.addressOf(). This behavior is problematic for composition. It also doesn't work at all for computed properties (although any behavior that directly controls value storage, such as lazy, also has the same limitation). The behavior design should acknowledge the split between behaviors that work on computed properties and those that don't.<br></div>
<div> </div>
<div>More thoughts on composition: The "obvious" way to compose behaviors is to just have a chain of them where each behavior wraps the next one, e.g. Copying&lt;Resettable&lt;Synchronized&lt;NSString&gt;&gt;&gt;. But this doesn't actually work for behaviors like Lazy that change the type of the underlying value, because the "underlying value" in this case is the wrapped behavior, and you can't have a nil behavior (it would break most of the functionality of behaviors, as well as break the ability to say `foo.behavior.bar()`).<br></div>
<div> </div>
<div>Based on the previous behavior categories, I'm tempted to say that we need to model behaviors with a handful of protocols (e.g. one for decorators, one for transformers, etc), and have the logic of the property itself call the appropriate methods on the collection of protocols at the appropriate times. Transformer behaviors could have an associated type that is the transformed value type (and the behavior itself would be generic, taking the value type as its parameter, as you already have). The compiler can then calculate the ordering of behaviors, and use the associated types to figure out the "real" underlying value, and pass appropriately-transformed value types to the various behaviors depending on where in the chain they execute. By that I mean a chain of (observed, lazy, sync) for a property of type Int (ignoring for a moment the issues with sync + lazy) would create an Observed&lt;Int&gt;, a Lazy&lt;Int&gt;, and a Sync&lt;Int?&gt; (because the Lazy&lt;Int&gt;'s associated type says it transforms to Int?). The problem with this model is the behavior can no longer actually contain the underlying value as a property. And that's actually fine. If we can split up any stored values the behavior needs from the storage of the property itself, that's probably a good thing.<br></div>
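<div> </div>
<div>To make the associated-type idea concrete, here is a sketch in which the behavior does not own the property's storage. The protocol, its requirements, and the Lazy type are all hypothetical names chosen for illustration:<br></div>
<div> </div>
<pre><code>// Hypothetical protocol for transformer behaviors.
protocol TransformerBehavior {
    associatedtype Value             // what the user of the property sees
    associatedtype TransformedValue  // what is actually stored

    func store(_ value: Value) -&gt; TransformedValue
    func load(_ stored: inout TransformedValue) -&gt; Value
}

// Lazy transforms V into V? and fills the storage in on first load.
struct Lazy&lt;V&gt;: TransformerBehavior {
    typealias Value = V
    typealias TransformedValue = V?

    let initializer: () -&gt; V

    func store(_ value: V) -&gt; V? {
        return value
    }

    func load(_ stored: inout V?) -&gt; V {
        if let existing = stored { return existing }
        let fresh = initializer()
        stored = fresh
        return fresh
    }
}

let lazyInt = Lazy&lt;Int&gt;(initializer: { 1738 })
var storage: Int? = nil  // owned by the property, typed via TransformedValue
assert(lazyInt.load(&amp;storage) == 1738)
storage = lazyInt.store(5)
assert(lazyInt.load(&amp;storage) == 5)
</code></pre>
<div> </div>
<div>Because the storage is typed by TransformedValue and lives outside the behavior, an inner behavior such as a synchronizer could be handed the Int? directly.<br></div>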
<div> </div>
<blockquote><div>Property Observers<br></div>
</blockquote><div> </div>
<div>Property Observers need to somehow support the behavior of letting accessors reassign to the property without causing an infinite loop. They also need to support subclassing such that the observers are called in the correct order in the nested classes (and again, with reassignment, such that the reassigned value is visible to the future observers without starting the observer chain over again).<br></div>
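<div> </div>
<div>For reference, this is the reassignment behavior that exists in the language today and would need to be preserved (Volume is an illustrative name):<br></div>
<div> </div>
<pre><code>final class Volume {
    var observerRuns = 0
    var level: Int = 0 {
        didSet {
            observerRuns += 1
            // Assigning to the property inside its own observer stores
            // directly and does not fire didSet again.
            if level &gt; 10 { level = 10 }
        }
    }
}

let v = Volume()
v.level = 25
assert(v.level == 10)
assert(v.observerRuns == 1)
</code></pre>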
<div> </div>
<div>Property Observers also pose a special challenge for subclasses. Overriding a property to add a behavior in many cases would actually want to create brand new underlying storage (e.g. adding lazy to a property needs different storage). But property observers explicitly don't want to do that, they just want to observe the existing property. I suspect this may actually line up quite well with the distinction between decorators and other behaviors.<br></div>
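<div> </div>
<div>A short example of how this works today, where the subclass observes the inherited storage rather than adding its own (Base and Derived are illustrative names):<br></div>
<div> </div>
<pre><code>class Base {
    var log: [String] = []
    var value: Int = 0 {
        willSet { log.append("Base.willSet") }
        didSet { log.append("Base.didSet") }
    }
}

final class Derived: Base {
    // Adds observers only; the storage still belongs to Base.
    override var value: Int {
        willSet { log.append("Derived.willSet") }
        didSet { log.append("Derived.didSet") }
    }
}

let d = Derived()
d.value = 1
assert(d.log == ["Derived.willSet", "Base.willSet", "Base.didSet", "Derived.didSet"])
</code></pre>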
<div> </div>
<div>On a similar note, I'm not sure if there's any other behaviors where overriding actually wants to preserve any existing behaviors. Property observers definitely want to, but if I have a lazy property and I override it in a subclass for any reason beyond adding observers, the subclass property probably shouldn't be lazy. Conversely, if I have an observed property and I override it to be lazy, it should still preserve the property observers (but no other behaviors). This actually suggests to me that Property Observers are unique among behaviors, and are perhaps worthy of leaving as a language feature instead of as a behavior. Of course, I can always override a property with a computed property and call `super` in the getter/setter, at which point any behaviors of the superclass property are expected to apply, but I don't think there's any actual problems there.<br></div>
<div> </div>
<div>Speaking of that, how do behaviors interact with computed properties? A lazy computed property doesn't make sense (which is why the language doesn't allow it). But an NSCopying computed property is fine (the computed getter would be handed the copied value).<br></div>
<div> </div>
<blockquote><div>The backing property has internal visibility by default<br></div>
</blockquote><div> </div>
<div>In most cases I'd recommend private by default. Just because I have an internal property doesn't mean the underlying implementation detail should be internal. In 100% of the cases where I've written a computed property backed by a second stored property (typically named with a _ prefix), the stored property is always private, because nobody has any business looking at it except for the class/struct it belongs to.<br></div>
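<div> </div>
<div>The pattern in question, as a minimal sketch (Temperature and the clamping rule are made up for the example):<br></div>
<div> </div>
<pre><code>struct Temperature {
    // Backing storage is an implementation detail, so it stays private.
    private var _celsius: Double = 0

    var celsius: Double {
        get { return _celsius }
        set { _celsius = max(newValue, -273.15) }  // clamp at absolute zero
    }
}

var t = Temperature()
t.celsius = -500
assert(t.celsius == -273.15)
</code></pre>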
<div> </div>
<div>Although actually, having said that, there's at least one behavior (resettable) that only makes sense if it's just as visible as the property itself (e.g. so it should be public on a public property).<br></div>
<div> </div>
<div>And come to think of it, just because the class designer didn't anticipate a desire to access the underlying storage of a lazy property (e.g. to check if it's been initialized yet) doesn't mean the user of the property doesn't have a reason to get at that.<br></div>
<div> </div>
<div>So I'm actually now leaning to making it default to the same accessibility as the property itself (e.g. public, if the property is public). Any behaviors that have internal implementation details that should never be exposed (e.g. memoized should never expose its box, but maybe it should expose an accessor to check if it's initialized) can mark those properties/methods as internal or private and that accessibility modifier would be obeyed. Which is to say, the behavior itself should always be accessible on a property, but implementation details of the behavior are subject to the normal accessibility rules there.<br></div>
<div> </div>
<div>The proposed (public lazy) syntax can still be used to lower visibility, e.g. (private lazy).<br></div>
<div> </div>
<blockquote><div>Defining behavior requirements using a protocol<br></div>
</blockquote><div> </div>
<div>As mentioned above, I think we should actually model behaviors using a family of protocols. This will let us represent decorators vs value transformers (and a behavior could even be both, by implementing both protocols). We could also use protocols for eager initialization vs lazy initialization (which is distinguished only by the presence of the initializer closure in the behavior initializer). We'd need to do something like<br></div>
<div> </div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">protocol Behavior { init(...) }<br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">protocol LazyBehavior { init(...) }<br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">protocol DecoratorBehavior : Behavior { ... }<br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">protocol LazyDecoratorBehavior : LazyBehavior { ... }<br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">protocol TransformerBehavior : Behavior { ... }<br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">protocol LazyTransformerBehavior : LazyBehavior { ... }<br></span></div>
<div> </div>
<div>and that way a type could conform to both DecoratorBehavior and TransformerBehavior without any collision in init (because the init requirement comes from a shared base protocol).<br></div>
<div> </div>
<div>As for actually defining the behavior name, you still do need the global function, but it could maybe return the behavior type, e.g. behavior functions are functions that match either of the following:<br></div>
<div> </div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">func name&lt;T: Behavior&gt;(...) -> T.Type</span><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;"><br></span></div>
<div><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;">func name&lt;T: LazyBehavior&gt;(...) -> T.Type</span><span class="font" style="font-family: menlo, consolas, "courier new", monospace, sans-serif;"><br></span></div>
<div> </div>
<div>I'm not really a big fan of having two "root" protocols here, but I also don't like magical arguments (e.g. treating the presence of an argument named "initializer" as meaningful) which is why the protocols take initializers. I guess the protocols also need to declare typealiases for the Value type (and TransformerBehavior can declare a separate typealias for the TransformedValue, i.e. the underlying storage, e.g. T? for lazy).<br></div>
<div> </div>
<blockquote><div>A behavior declaration<br></div>
</blockquote><div> </div>
<div>This has promise as well. By using a declaration like this, you can have basically a DSL (using contextual keywords) to specify things like whether it's lazy-initialized, decorators, and transformers. Same benefits as the protocol family (e.g. good compiler checking of the behavior definition before it's even used anywhere), allows for code completion too, and it doesn't litter the global function namespace with behavior names.<br></div>
<div> </div>
<div>The more I think about this, the more I think it's a good idea. Especially because it won't litter the global function namespace with behavior names. Behavior constructors should not be callable by the user, and behaviors may be named things we would love to use as function names anyway (if a behavior implements some functionality that is useful to be exposed to the user anyway, it can vend a type like your proposal has and people can just instantiate that type directly).<br></div>
<div> </div>
<blockquote><div>Can properties with behaviors be initialized from init rather than with inline initializers?<br></div>
</blockquote><div> </div>
<div>I think the answer to this has to be "absolutely". Especially if property observers are a behavior (as the initial value may need to be computed from init args or other properties, which can't be done as an inline initializer).<br></div>
<div> </div>
<div>-Kevin Ballard</div>
</body>
</html>