[swift-evolution] Should we rename "class" when referring to protocol conformance?

Sat May 7 15:48:13 CDT 2016

on Sat May 07 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

>> On May 6, 2016, at 8:54 PM, Dave Abrahams <dabrahams at apple.com> wrote:
>> 
>> 
>>> on Fri May 06 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>> 
>>>    On May 6, 2016, at 7:48 PM, Dave Abrahams via swift-evolution
>>>    <swift-evolution at swift.org> wrote:
>>> 
>>>    on Thu May 05 2016, Matthew Johnson <swift-evolution at swift.org> wrote:
>>> 
>>>        On May 5, 2016, at 10:02 PM, Dave Abrahams
>>>        <dabrahams at apple.com> wrote:
>>> 
>>>        on Thu May 05 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>> 
>>>        On May 5, 2016, at 4:59 PM, Dave Abrahams
>>>        <dabrahams at apple.com> wrote:
>>> 
>>>        on Wed May 04 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:
>>> 
>>>        On May 4, 2016, at 5:50 PM, Dave Abrahams via swift-evolution
>>>        <swift-evolution at swift.org> wrote:
>>> 
>>>        on Wed May 04 2016, Matthew Johnson
>>>        <swift-evolution at swift.org> wrote:
>>> 
>>>        On May 4, 2016, at 1:29 PM, Dave Abrahams via swift-evolution
>>>        <swift-evolution at swift.org> wrote:
>>> 
>>>        on Wed May 04 2016, Adrian Zubarev
>>>        <swift-evolution at swift.org>
>>>        wrote:
>>> 
>>>        Not sure what to think about the enum cases inside a protocol
>>>        (if AnyEnum would even exist). It could be a nice addition to
>>>        the language, but this is its own proposal, I guess.
>>> 
>>>        We should start by adding an AnyValue protocol to which all
>>>        value types conform.
>>> 
>>>        Having a way to constrain conformance to things with value
>>>        semantics is something I've long wanted. *However*, the approach
>>>        described is too simplistic. It's possible to build classes whose
>>>        instances have value semantics (just make them immutable) and it's
>>>        possible to build structs whose instances have reference semantics
>>>        (just put the struct's storage in a mutable class instance that it
>>>        holds as a property, and don't do copy-on-write).
>>> 
>>>        In order for something like AnyValue to have meaning, we need to
>>>        impose greater order. After thinking through many approaches over
>>>        the years, I have arrived at the (admittedly rather drastic)
>>>        opinion that the language should effectively outlaw the creation
>>>        of structs and enums that don't have value semantics. (I have no
>>>        problem with the idea that immutable classes that want to act as
>>>        values should be wrapped in a struct.) The language could then do
>>>        lots of things much more intelligently, such as correctly
>>>        generating implementations of equality.
>>> 
>>>        That is a drastic solution indeed! How would this impact things like
>>>        Array<UIView>? While Array itself has value semantics, the aggregate
>>>        obviously does not, as it contains references which can usually be
>>>        mutated underneath us.
>>> 
>>>        Value semantics and mutation can only be measured with respect to
>>>        equality. The definition of == for all class types would be equivalent
>>>        to ===. Problem solved.
>>> 
>>>        Similar considerations apply to simpler wrapper structs such as Weak.
>>> 
>>>        Same answer.
>>> 
>>>        Hmm. If those qualify as “value semantic” then what kind of structs
>>>        and enums would not? A struct wrapping a mutable reference type
>>>        certainly doesn’t “feel” value semantic to me and certainly doesn’t
>>>        have the guarantees usually associated with value semantics (won’t
>>>        mutate behind your back, thread safe, etc).
>>> 
>>>        Sure it does.
>>> 
>>>        public struct Wrap<T: AnyObject> : Equatable {
>>>          init(_ x: T) { self.x = x }
>>>          private let x: T  // the wrapped reference; never reassigned
>>>        }
>>> 
>>>        func == <T>(lhs: Wrap<T>, rhs: Wrap<T>) -> Bool {
>>>          return lhs.x === rhs.x  // equality is reference identity
>>>        }
>>> 
>>>        I defy you to find any scenario where Wrap<T> doesn't have value
>>>        semantics, whether T is mutable or not.
>>> 
>>>        Alternately, you can look at the Array implementation. Array is a
>>>        struct wrapping a mutable class. It has value semantics by virtue of
>>>        CoW.
>>> 
>>>        This goes back to where you draw the line as to the “boundary of the
>>>        value”. Wrap and Array are “value semantic” in a shallow sense and
>>>        are capable of deep value semantics when T is deeply value semantic.
>>> 
>>>        No, I'm sorry; this “deep-vs-shallow” thing is a fallacy that comes from
>>>        not understanding the boundaries of your value. Or, put more
>>>        solicitously: sure, you can look at the world that way, but it just
>>>        makes everything prohibitively complicated, so why would you want to?
>>> 
>>>        In my world, there's no such thing as a “deep copy” or a “shallow copy;”
>>>        there's just “copy,” which logically creates an independent version of
>>>        everything up to the boundaries of the value. Likewise, there's no
>>>        “deep value semantics” or “shallow value semantics.” 
>>> 
>>>        Equality defines
>>>        value semantics, and the boundaries of an Array value always include
>>>        the values of its elements. The *only* problem here is that we have no
>>>        way to do equality comparison on some arrays because some types aren't
>>>        Equatable. IMO the costs of not having everything be equatable, in
>>>        complexity-of-programming-model terms, are too high.
>>> 
>>>        Thank you for clarifying the terminology for me. This is helpful. 
>>> 
>>>        I think I may have misunderstood what you meant by “boundary of the
>>>        value”. Do you mean that the boundary of an Array value stops at the
>>>        reference identity for elements with reference semantics?
>>> 
>>>    Yes.
>>> 
>>>        If you have an Array whose elements are of an immutable reference type
>>>        that has value semantics would you say the boundary extends past the
>>>        reference identity of an element and includes a definition of equality
>>>        defined by that type?
>>> 
>>>    Yes!
>>> 
>>>        Are you arguing that reference types should be equatable by default,
>>>        using equality of the reference if the type does not provide a custom
>>>        definition of equality?
>>> 
>>>    Yes!!
>>> 
>>>        Both have their place, but the maximum benefit of value semantics
>>>        (purity) 
>>> 
>>>        I don't know what definition of purity you're using. The only one I
>>>        know of applies to functions and implies no side effects. In that
>>>        world, there is no mutation and value semantics is equivalent to
>>>        reference semantics.
>>> 
>>>        I was using it in the sense of “PureValue” as discussed in this
>>>        thread. 
>>> 
>>>    Sorry, this is the first mention I can find in the whole thread, honest.
>>>    Oh, it was a different thread. Joe describes it as a protocol for
>>>    “types that represent fully self-contained values,” which is just fuzzy
>>>    enough that everyone reading it can have his own interpretation of what
>>>    it means.
>>> 
>>>        I was using it to mean values for which no *observable* mutation is
>>>        possible (allowing for CoW, etc). Is there a better term for this than
>>>        purity?
>>> 
>>>    You're still not making any sense to me. A type for which no observable
>>>    mutation is possible is **immutable**. The “write” part of
>>>    copy-on-write is a pretty clear indicator that it's all about
>>>    **mutation**. I don't see how they're compatible.
>>> 
>>> Sorry, I did not write that very clearly. I should have said no observable
>>> mutation *that happens behind your back*. In other words, the only *observable*
>>> mutation possible is local.
>> 
>> Yeah, but you need to ask the question, “mutation in what?”  The answer:
>> mutation in the value instance.  Then you need to ask, “how do you
>> determine whether there was mutation?”  
>> 
>>> Immutability accomplishes this by simply prohibiting all
>>> mutation. Primitive value types like Int and structs or enums that
>>> only contain primitive value types accomplish this by getting copied
>>> everywhere.
>>> 
>>> Swift’s collections also accomplish this through copying, but only when the
>>> elements they contain also have the same property.
>> 
>> Only if you think mutable class instances are part of the value of the
>> array that stores references to those class instances.  As I said
>> earlier, you can *take* that point of view, but why would you want to?
>> Today, we have left that question wide open, which makes the whole
>> notion of what is a logical value very indistinct.  I am proposing to
>> close it.
>
> I think part of the disconnect here might be the domains in which we
> work.  Maybe you're coming at this primarily from an algorithmic
> perspective and I'm coming at it primarily from an app development
> perspective.

IMO that's a false distinction.  Suggestion: look up the definition of
“algorithm.”  Your apps are built out of algorithms.  FWIW, I was an app
developer long before I was a library writer.  What I discovered, after
many years living with my own software and learning from mistakes, was
that “an algorithmic perspective” is essential to building any piece of
software that you or someone else might have to maintain, that users can
rely on, that doesn't have catastrophic performance problems, etc.

> For example, I think it is perfectly reasonable to write a generic
> view controller that works with various data types and is initialized
> with an Array<T> but only works properly when it isn't possible to
> observe any mutation in the subgraph of T.

And my claim is that you have picked a really complicated way of saying
“T has value semantics,” or if there are differences in your intended
constraint, you don't actually care about those differences.

Just taking the nontrivial case where T is a reference type, let's look
at the phrase “it isn't possible to observe any mutation in the
subgraph of T.”  This is still a rather fuzzy notion, but let me try to
nail it down.  To me that means, if the behavior of “f” only depends on
data reachable through this array, and f makes no mutations, then in
this code, the two calls to f() are guaranteed to have the same effect.

    func g<T>(a: [T]) {
      var vc = MyViewController(a)
      vc.f() // #1
      h()
      vc.f() // #2
    }

But clearly, the only way that can be the case is if T is actually
immutable (and contains no references to mutable data), because
otherwise anybody can write:

    class X { ... }
    let global: [X] = [ X() ]
    func h() { global[0].mutatingMethod() }
    g(global)

Conclusion: your definition of PureValue, as written, implies conforming
reference types must be immutable.  I'm not saying that's necessarily
what you meant, but if it isn't, you need to try to define it again.
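
For readers who want to run the scenario above, here is a minimal self-contained sketch. It is not from the original message: Viewer (standing in for MyViewController), read, and runDemo are hypothetical names chosen for illustration, and h is passed in as a closure instead of being a global function so the example carries no global state.

```swift
// Hypothetical element type: a class with one piece of mutable state.
final class X {
    var count = 0
    func mutatingMethod() { count += 1 }
}

// Hypothetical stand-in for MyViewController: f() depends only on data
// reachable through the array and makes no mutations itself.
struct Viewer<T> {
    let items: [T]
    let read: (T) -> Int
    func f() -> [Int] { return items.map(read) }
}

// Same shape as g in the message: call f, run unrelated code h, call f again.
func g<T>(_ a: [T], read: @escaping (T) -> Int, h: () -> Void)
    -> (first: [Int], second: [Int]) {
    let vc = Viewer(items: a, read: read)
    let first = vc.f()   // #1
    h()
    let second = vc.f()  // #2
    return (first, second)
}

func runDemo() -> (first: [Int], second: [Int]) {
    let shared: [X] = [X()]
    // h mutates an object that the array passed to g also references.
    return g(shared, read: { $0.count }, h: { shared[0].mutatingMethod() })
}

let result = runDemo()
print(result.first, result.second) // prints "[0] [1]"
```

The two calls to f() disagree even though neither f nor g mutates anything, which is the point of the argument: the guarantee can only hold if instances of X cannot be mutated at all.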

>>> On the other hand, it is immediately obvious that non-local mutation
>>> is quite possible in the elements of a Swift Array<AnyObject> unless
>>> they are all uniquely referenced.
>> 
>> If you interpret the elements of the array as being *references* to
>> objects, there is no possibility of non-local mutation.  If you
>> interpret the elements as being objects *themselves*, then you've got
>> problems.
>
> In application code we are concerned with the objects, not the
> references.

Not necessarily, not at all.  A Set<UIView> where you're interested in
the references is totally reasonable.
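
As a rough illustration of a set where only the references matter, the sketch below uses a hypothetical class Node in place of UIView (so it runs without UIKit) and assumes equality and hashing are defined by object identity.

```swift
// Hypothetical class standing in for UIView. Equality and hashing use
// object identity only; the mutable title plays no part in either.
final class Node: Hashable {
    var title: String
    init(title: String) { self.title = title }

    static func == (lhs: Node, rhs: Node) -> Bool { return lhs === rhs }
    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(self))
    }
}

let first = Node(title: "same")
let second = Node(title: "same")

// Two distinct objects with identical contents are two distinct members.
let selected: Set<Node> = [first, second]

// Mutating an object does not change the set's value, because the value of
// the set is made of references, not of the objects they point to.
first.title = "changed"

let memberCount = selected.count
let stillMember = selected.contains(first)
print(memberCount, stillMember) // prints "2 true"
```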

>>>    I think perhaps what you mean by “purity” is just, “has value
>>>    semantics.” But I could be wrong.
>>> 
>>> No, an array storing instances of reference types that are not immutable would
>>> not be “pure” (or whatever you want to call it).
>>> 
>>>        is derived from deep value semantics. This is when there is no
>>>        possibility of shared mutable state. This is an extremely important
>>>        property.
>>> 
>>>        It's the wrong property, IMO.
>>> 
>>>        Wrong in what sense? 
>>> 
>>>    Wrong in the sense that it rules out using things like Array that are
>>>    logically value types but happen to be implemented with CoW, and if you
>>>    have proper encapsulation there's no way for these types to behave as
>>>    anything other than values, so it would be extremely limiting. 
>>> 
>>> I’m a big fan of CoW as an implementation detail. We have definitely been
>>> miscommunicating if you thought I was suggesting something that would prohibit
>>> CoW.
>> 
>> Then what, precisely, are the syntactic and semantic requirements of “PureValue?”
>
> I believe it is a purely semantic concept.  It means that every name
> binding is logically and observably distinct, including any objects in
> the aggregate (if it includes references).  

That sounds like “value semantics” to me, although I get the sense maybe
you're also adding the restriction that you're *not allowed* to define
the boundary of values as stopping at a reference but not including the
instance it references.  IMO that restriction is not actually useful and
probably harmful.

> This allows for local mutation on the same binding and also for
> unobservable mutation such as CoW in the implementation.  But it does
> not allow for a mutation applied to one name binding having the type
> to be observed through another name binding having the type.

It sounds like you're trying to capture some notion of “can't possibly
reach shared mutable state through this instance,” but IMO there's a
false distinction here.  Fundamentally, there's no difference between a
reference to an object and an integer that can be used as an index into
a global array that contains a reference to the object, or even an
integer that can be used as an index into a global array that contains
an equivalent struct.
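
A small sketch of that equivalence, under assumed names (Account, AccountObject, Handle, and table are all hypothetical): a struct holding nothing but an Int can reach shared mutable state just as a reference can.

```swift
struct Account { var balance: Int }

final class AccountObject {
    var balance: Int
    init(balance: Int) { self.balance = balance }
}

// A struct that stores only an Int: no references anywhere inside it.
struct Handle { let index: Int }

// Top-level array playing the role of "a global array" of structs.
var table: [Account] = [Account(balance: 10)]

// Case 1: two copies of a reference observe the same mutation.
let ref1 = AccountObject(balance: 10)
let ref2 = ref1
ref1.balance += 5
let seenThroughRef2 = ref2.balance

// Case 2: two copies of an integer handle observe the same mutation too.
let handle1 = Handle(index: 0)
let handle2 = handle1
table[handle1.index].balance += 5
let seenThroughHandle2 = table[handle2.index].balance

print(seenThroughRef2, seenThroughHandle2) // prints "15 15"
```

Handle contains only a primitive value type, yet a change made through one copy is visible through the other, so a rule based on "contains no references" would not by itself rule out shared mutable state.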

Again, I would like to see some piece of code that *actually depends on
this PureValue property for its correctness*.

>>>        I don’t mean to imply that it is the *only* valuable
>>>        property. However, I (and many others) do believe it is an extremely
>>>        valuable property in many cases. Do you disagree?
>>> 
>>>    I think I do. What is valuable about such a protocol? What generic
>>>    algorithms could you write that work on models of PureValue but don't
>>>    work just as well on Array<Int>?
>>> 
>>> Array<Int> provides the semantics I have in mind just fine so there wouldn’t be
>>> any.  Array<AnyObject> is a completely different story. With
>>> Array<AnyObject> you cannot rely on a guarantee that the objects contained
>>> in the array will not be mutated by code elsewhere that also happens
>>> to have a reference to the same objects.
>> 
>> Okay then, what algorithms can you write that operate on PureValue that
>> don't work equally well on Array<AnyObject>?
>
> I am not sure.  It is possible that it does not apply to purely
> algorithmic work.  That does not mean it is unimportant.  
> It is quite valuable in application level code.  I think it would be
> valuable to reify it with a protocol rather than leaving it to
> documentation even if the compiler can't always prove our code meets
> this semantic.

For the purposes of library and language design, the ability to produce
use-cases (and solid definitions) is crucial.  The ability to show how
it substantively differs from concepts we already have is crucial.  If
we can't find these things, it doesn't belong.

> If you don't like the name PureValue for this concept, let's bikeshed.
> I only used it because others had already used it.  Maybe there is a
> better name.

It's not the name that's the problem.  I don't even understand what
you're reaching for, or why.  Without a demonstration of what this is
for, I'm going to continue to argue against it (though I'm about to be
on vacation so I'll be out of your hair for a week).

>> I have been trying to get you to nail down what you mean by PureValue,
>> and I was trying to illustrate that merely being “a struct wrapping a
>> mutable reference type” is not enough to disqualify anything from being
>> in the category you're trying to describe.  What are the properties of
>> types in that category, and what generic code would depend on those
>> properties?
>> 
>
> I hope my previous comments have helped to clarify this.

I'm afraid not yet.

-- 
-Dave