[swift-evolution] [Proposal] Foundation Swift Archival & Serialization

Matthew Johnson matthew at anandabits.com
Sun Mar 19 22:24:47 CDT 2017



Sent from my iPad

> On Mar 19, 2017, at 10:19 PM, Matthew Johnson via swift-evolution <swift-evolution at swift.org> wrote:
> 
> 
> 
> Sent from my iPad
> 
>> On Mar 19, 2017, at 9:14 PM, Brent Royal-Gordon <brent at architechies.com> wrote:
>> 
>>> On Mar 19, 2017, at 5:51 PM, Matthew Johnson <matthew at anandabits.com> wrote:
>>> 
>>> I generally agree with you about casting.  However, my dislike isn’t the cast itself, but instead it is the lack of a static guarantee.  I’m not sure we’ll find a solution that provides a static guarantee that a required context exists that is also acceptable to the Foundation team.
>> 
>> I don't think we can get a static guarantee that the context is present, but I still would like a static guarantee that the context is of the expected type. That's what I'm trying to provide here.
> 
> This doesn't do any better job of that than a cast in user code.  I can see two meaningful differences.  First, your solution does not allow a user to see a context if they can't name the type (you can't get it as Any and use reflection, etc).  I don't see this restriction as being beneficial.  Second, your solution introduces several subtle problems mentioned in my last email which you didn't respond to (overlapping context types, etc).  
> 
>> 
>>>> 
>>>> 	protocol Encoder {
>>>> 		// Retrieve the context instance of the indicated type.
>>>> 		func context<Context>(ofType type: Context.Type) -> Context?
>>>> 		
>>>> 		// This context is visible for `encode(_:)` calls from this encoder's containers all the way down, recursively.
>>>> 		func addContext<Context>(_ context: Context, ofType type: Context.Type)
>>> 
>>> What happens if you call `addContext` more than once with values of the same type?
>> 
>> It overrides the previous context, but only for the containers created by this `encode(to:)` method and any containers nested within them.
>> 
>> (Although that could cause trouble for an encoder which only encodes objects with multiple instances once. Hmm.)
>> 
>>> And why do you require the type to be passed explicitly when it is already implied by the type of the value?
>> 
>> As you surmised later, I was thinking in terms of `type` being used as a dictionary key; in that case, if you stored a `Foo` into the context, you would not later be able to look it up using one of `Foo`'s supertypes. But if we really do expect multiple contexts to be rare, perhaps we don't need a dictionary at all—we can just keep an array, loop over it with `as?`, and return the first (or last?) match. If that's what we do, then we probably don't need to pass the type explicitly.
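
A minimal, purely illustrative sketch of the array-plus-`as?` lookup described above (the function name is mine, not the proposal's); whether the first or the last match should win is exactly the open question here:

	func firstContext<Context>(ofType type: Context.Type, in contexts: [Any]) -> Context? {
		// Scan in insertion order and return the first value that casts to Context.
		for candidate in contexts {
			if let match = candidate as? Context {
				return match
			}
		}
		return nil
	}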
> 
> The array approach is better because at least there is an order to the contexts, and we can assign precise semantics in the presence of overlapping context types by saying you get the first (most recent) context that can be cast to the type you ask for.
> 
> That said, I think what you're really trying to model here is a context stack, isn't it?  Why don't we just do that?
> 
>> 
>>>> 	}
>>>> 	// Likewise on Decoder
>>>> 	
>>>> 	// Encoder and decoder classes should accept contexts in their top-level API:
>>>> 	open class JSONEncoder {
>>>> 		open func encode<Value : Codable>(_ value: Value, withContexts contexts: [Any] = []) throws -> Data
>>>> 	}
>>> 
>>> What happens if more than one context of the same type is provided here?
>> 
>> Fail a precondition, probably.
> 
> I would never support this design.  Good news though: the context stack approach avoids the problem.  We allow multiple contexts of the same type to be on the stack and the topmost context that can be cast to the requested type is used.
> 
>> 
>>> Also, it’s worth pointing out that whatever reason you had for explicitly passing the type above you’re not requiring type information to be provided here.  Whatever design we have it should be self-consistent.
>> 
>> Yeah. I did this here because there was no way to specify a dictionary literal of `(T.Type, T)`, where `T` could be different for different elements.
>> 
>>> Do you think it’s really important to allow users to dynamically provide context for children?  Do you have real world use cases where this is needed?  I’m sure there could be case where this might be useful.  But I also think there is some benefit in knowing that the context used for an entire encoding / decoding is the one you provide at the top level.  I suspect the benefit of a static guarantee that your context is used for the entire encoding / decoding has a lot more value than the ability to dynamically change the context for a subtree.
>> 
>> The problem with providing all the contexts at the top level is that then the top level has to *know* what all the contexts needed are. Again, if you're encoding a type from FooKit, and it uses a type from GeoKit, then you—the user of FooKit—need to know that FooKit uses GeoKit and how to make contexts for both of them. There's no way to encapsulate GeoKit's role in encoding.
> 
> The use cases I know of for contexts are really around helping a type choose an encoding strategy.  I can't imagine a real world use case where a Codable type would have a required context - it's easy enough to choose one strategy as the default.  That said, I can imagine really evil and degenerate API designs that would require the same type to be encoded differently in different parts of the tree.  I could imagine dynamic contexts being helpful in solving some of these cases, but often you would need to look at the codingKeyContext to get it right.
> 
> If you have a concrete real world use case involving module boundaries please elaborate.  I'm having trouble imagining the details about a precise problem you would solve using dynamic contexts.  I get the impression you have something more concrete in mind than I can think of.
> 
>> 
>> On the other hand, there *could* be a way to encapsulate it. Suppose we had a context protocol:
>> 
>> 	protocol CodingContext {
>> 		var underlyingContexts: [CodingContext] { get }
>> 	}
>> 	extension CodingContext {
>> 		var underlyingContexts: [CodingContext] { return [] }
>> 	}
>> 
>> Then you could have this as your API surface:
>> 
>> 	protocol Encoder {
>> 		// Retrieve the context instance of the indicated type.
>> 		func context<Context: CodingContext>(ofType type: Context.Type) -> Context?
>> 	}
>> 	// Likewise on Decoder
>> 	
>> 	// Encoder and decoder classes should accept contexts in their top-level API:
>> 	open class JSONEncoder {
>> 		open func encode<Value : Codable>(_ value: Value, with context: CodingContext? = nil) throws -> Data
>> 	}
>> 
>> And libraries would be able to add additional contexts for dependencies as needed.
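
To make that encapsulation concrete, here is a purely hypothetical sketch using the FooKit/GeoKit names from the discussion and the `CodingContext` protocol sketched above:

	// Hypothetical context types; GeoKit knows nothing about FooKit.
	struct GeoKitContext: CodingContext {
		var coordinatePrecision: Int
	}

	struct FooKitContext: CodingContext {
		var apiVersion: String

		// FooKit surfaces its dependency's context itself, so a caller
		// only needs to know about FooKitContext.
		var underlyingContexts: [CodingContext] {
			return [GeoKitContext(coordinatePrecision: 6)]
		}
	}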
>> 
>> (Hmm. Could we maybe do this?
>> 
>> 	protocol Codable {
>> 		associatedtype CodingContextType: CodingContext = Never
>> 		
>> 		func encode(to encoder: Encoder) throws
>> 		init(from decoder: Decoder) throws
>> 	}
>> 
>> 	protocol Encoder {
>> 		// Retrieve the context instance of the indicated type.
>> 		func context<CodableType: Codable>(for instance: CodableType) -> CodableType.CodingContextType?
>> 	}
>> 	// Likewise on Decoder
>> 	
>> 	// Encoder and decoder classes should accept contexts in their top-level API:
>> 	open class JSONEncoder {
>> 		open func encode<Value : Codable>(_ value: Value, with context: Value.CodingContextType? = nil) throws -> Data
>> 	}
>> 
>> That would make sure that, if you did use a context, it would be the right one for the root type. And I don't believe it would have any impact on types which didn't use contexts.)
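
Under that sketch (and reusing the hypothetical FooKitContext/GeoKitContext above), a conforming type might look roughly like this; note it is written against the `Codable`/`Encoder`/`JSONEncoder` declarations sketched here, not the shipping Foundation API:

	// Hypothetical conforming type, illustrative only.
	struct Landmark: Codable {
		// Ties Landmark to the context type it expects.
		typealias CodingContextType = FooKitContext

		var name: String

		func encode(to encoder: Encoder) throws { /* ... */ }
		init(from decoder: Decoder) throws { self.name = "" }
	}

	// The compiler can then reject a mismatched context at the top level:
	// try JSONEncoder().encode(landmark, with: FooKitContext(apiVersion: "1.4"))      // OK
	// try JSONEncoder().encode(landmark, with: GeoKitContext(coordinatePrecision: 6)) // error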
> 
> I think this is far more than we need.  I think we could just say encoders and decoders keep a stack of contexts.  Calls to encode or decode (including the top-level call) can provide a context (or an array of contexts, interpreted as a stack with the bottom on the left and the top on the right).  When the call returns, the stack is popped back to where it was before the call.  We could also include an explicit


> `func push(contexts: Context...)`

This should have been `func push(contexts: Any...)`.

> method on encoder and decoder to allow a Codable to set the context used by all of its members.  Everything pushed via `push` would be popped when the current call to encode / decode returns.
> 
> Users ask for a context from an encoder / decoder using `func context<Context>(of: Context.Type) -> Context?`.  The stack is searched from the top to the bottom for a value that can be successfully cast to Context.
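
A rough sketch of those stack semantics (the type and method names here are mine, purely illustrative):

	// Illustrative stack an encoder or decoder could keep internally.
	final class ContextStack {
		private var storage: [Any] = []

		// Codable types call this (via the encoder/decoder) to provide
		// context for their members.
		func push(contexts: Any...) {
			storage.append(contentsOf: contexts)
		}

		// Searched from the top down, so the most recently pushed matching
		// value shadows anything pushed earlier.
		func context<Context>(of type: Context.Type) -> Context? {
			for candidate in storage.reversed() {
				if let match = candidate as? Context {
					return match
				}
			}
			return nil
		}

		// The encoder/decoder would wrap each call into encode(to:) /
		// init(from:) like this, so anything pushed during the call is
		// popped when it returns.
		func withScope(_ body: () throws -> Void) rethrows {
			let depth = storage.count
			defer { storage.removeLast(storage.count - depth) }
			try body()
		}
	}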
> 
>> 
>>> What benefit do you see in using types as context “keys” rather than something like `CodingUserInfoKey`?  As far as I can tell, it avoids the need for explicit keys, which you could argue are somewhat redundant (it would be weird to have two context values of the same type in the cases I know of), and puts the cast in the Encoder / Decoder rather than user code.  These seem like modest but reasonable wins.
>> 
>> I also see it as an incentive for users to build a single context type rather than sprinkling in a whole bunch of separate keys. I really would prefer not to see people filling a `userInfo` dictionary with random primitive-typed values like `["json": true, "apiVersion": "1.4"]`; it seems too easy for names to clash or for people to forget the type they're actually using. `context(…)` being a function instead of a subscript is similarly about ergonomics: it discourages you from trying to mutate your context during the encoding process (although it doesn't prevent it for reference types).
>> 
> 
> I agree with this sentiment and indicated to Tony the desire to steer people away from treating this as a dictionary to put a lot of stuff in and towards defining an explicit context type.  This and the fact that keys will feel pretty arbitrary are behind my desire to avoid the keys and dictionary approach.
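
To make the contrast concrete, a hypothetical example of the two styles:

	// Discouraged: a grab bag of primitive values behind stringly-typed
	// keys, where names can clash and stored types are easy to get wrong.
	let userInfo: [String: Any] = ["json": true, "apiVersion": "1.4"]

	// Encouraged: one purpose-built, strongly-typed context type.
	struct MyAPIEncodingContext {
		var isJSON: Bool
		var apiVersion: String
	}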
> 
>>> Unfortunately, I don't think there is a good answer to the question about multiple context values with the same type though.  I can’t think of a good way to prevent this statically.  Worse, the values might not have the same type, but be equally good matches for a type a user requests (i.e. both conform to the same protocol).  I’m not sure how a user-defined encoder / decoder could be expected to find the “best” match using semantics that would make sense to Swift users (i.e. following rules that are kind of the inverse of overload resolution).
>>> 
>>> Even if this were possible there are ambiguous cases where there would be equally good matches.  Which value would a user get when requesting a context in that case?  We definitely don’t want accessing the context to be a trapping or throwing operation.  That leaves returning nil or picking a value at random.  Both are bad choices IMO.
>> 
>> If we use the `underlyingContexts` idea, we could say that the context list is populated breadth-first and the first context of a particular type encountered wins. That would tend to prefer the context "closest" to the top-level one provided by the caller, which will probably have the best fidelity to the caller's preferences.
> 
> I'm not totally sure I follow you here, but I think you're describing stack-like semantics that are at least similar to what I have described.  I think the stack approach is a pretty cool one that targets the kinds of problems multiple contexts are trying to solve more directly than the dictionary approach would.
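
For what it's worth, a sketch of the breadth-first population Brent describes, built on the `CodingContext` protocol sketched earlier (names illustrative):

	// Flatten a root context and its underlyingContexts breadth-first, so
	// contexts closer to the caller's top-level context come earlier and
	// win any type-based lookup (first match of the requested type wins).
	func collectContexts(startingWith root: CodingContext) -> [CodingContext] {
		var collected: [CodingContext] = []
		var queue: [CodingContext] = [root]
		while !queue.isEmpty {
			let next = queue.removeFirst()
			collected.append(next)
			queue.append(contentsOf: next.underlyingContexts)
		}
		return collected
	}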
> 
>> 
>> -- 
>> Brent Royal-Gordon
>> Architechies
>> 
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution