[swift-evolution] [Proposal] Foundation Swift Archival & Serialization

Itai Ferber iferber at apple.com
Thu Mar 16 14:33:18 CDT 2017


Thanks for the thorough and detailed review, Brent! Responses inline.

On 15 Mar 2017, at 21:19, Brent Royal-Gordon wrote:

>> On Mar 15, 2017, at 3:40 PM, Itai Ferber via swift-evolution 
>> <swift-evolution at swift.org> wrote:
>>
>> Hi everyone,
>>
>> The following introduces a new Swift-focused archival and 
>> serialization API as part of the Foundation framework. We’re 
>> interested in improving the experience and safety of performing 
>> archival and serialization, and are happy to receive community 
>> feedback on this work.
>
> Thanks to all of the people who've worked on this. It's a great 
> proposal.
>
>> Specifically:
>>
>> 	• It aims to provide a solution for the archival of Swift struct 
>> and enum types
>
> I see a lot of discussion here of structs and classes, and an example 
> of an enum without associated values, but I don't see any discussion 
> of enums with associated values. Can you sketch how you see people 
> encoding such types?
>
> For example, I assume that `Optional` is going to get some special 
> treatment, but if it doesn't, how would you write its `encode(to:)` 
> method?
`Optional` values are accepted and vended directly through the API. The 
`encode(_:forKey:)` methods take optional values directly, and the 
`decodeIfPresent(_:forKey:)` methods vend optional values.

`Optional` is special in this way — it’s a primitive part of the 
system. It’s actually not possible to write an `encode(to:)` method 
for `Optional`, since the representation of null values is up to the 
encoder and the format it’s working in; `JSONEncoder`, for instance, 
decides on the representation of `nil` (JSON `null`). It wouldn’t be 
possible to ask `nil` to encode itself in a reasonable way.
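
To make that concrete, here’s a minimal sketch of the intended pattern 
under the proposal (the `Person` type and its properties are 
illustrative, not from the proposal itself):

```swift
struct Person : Codable {
    var name: String
    var nickname: String?

    // Derived conformance supplies the string values for these keys.
    private enum CodingKeys : CodingKey {
        case name
        case nickname
    }

    func encode(to encoder: Encoder) throws {
        let container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name, forKey: .name)
        try container.encode(nickname, forKey: .nickname) // accepts String? directly
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        name = try container.decode(String.self, forKey: .name)
        nickname = try container.decodeIfPresent(String.self, forKey: .nickname)
    }
}
```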

> What about a more complex enum, like the standard library's 
> `UnicodeDecodingResult`:
>
> 	enum UnicodeDecodingResult {
> 		case emptyInput
> 		case error
> 		case scalarValue(UnicodeScalar)
> 	}
>
> Or, say, an `Error`-conforming type from one of my projects:
>
> 	public enum SQLError: Error {
> 	    case connectionFailed(underlying: Error)
> 	    case executionFailed(underlying: Error, statement: SQLStatement)
> 	    case noRecordsFound(statement: SQLStatement)
> 	    case extraRecordsFound(statement: SQLStatement)
> 	    case columnInvalid(underlying: Error, key: ColumnSpecifier, 
> statement: SQLStatement)
> 	    case valueInvalid(underlying: Error, key: AnySQLColumnKey, 
> statement: SQLStatement)
> 	}
>
> (You can assume that all the types in the associated values are 
> `Codable`.)
Sure — these cases specifically do not derive `Codable` conformance 
because the specific representation to choose is up to you. Here are two 
possible ways to write this, though there are many others (I’m 
simplifying these cases a bit, but you can extrapolate from them):

```swift
// Approach 1
// This produces either {"type": 0} for `.noValue`, or {"type": 1, "value": …} for `.associated`.
public enum EnumWithAssociatedValue : Codable {
     case noValue
     case associated(Int)

     private enum CodingKeys : CodingKey {
         case type
         case value
     }

     public init(from decoder: Decoder) throws {
         let container = try decoder.container(keyedBy: CodingKeys.self)
         let type = try container.decode(Int.self, forKey: .type)
         switch type {
         case 0:
             self = .noValue
         case 1:
             let value = try container.decode(Int.self, forKey: .value)
             self = .associated(value)
         default:
             throw …
         }
     }

     public func encode(to encoder: Encoder) throws {
         let container = encoder.container(keyedBy: CodingKeys.self)
         switch self {
         case .noValue:
             try container.encode(0, forKey: .type)
         case .associated(let value):
             try container.encode(1, forKey: .type)
             try container.encode(value, forKey: .value)
         }
     }
}

// Approach 2
// Produces `0`, `1`, or `2` for `.noValue1`, `.noValue2`, and `.noValue3` respectively.
// Produces {"type": 3, "value": …} and {"type": 4, "value": …} for `.associated1` and `.associated2`.
public enum EnumWithAssociatedValue : Codable {
     case noValue1
     case noValue2
     case noValue3
     case associated1(Int)
     case associated2(String)

     private enum CodingKeys : CodingKey {
         case type
         case value
     }

     public init(from decoder: Decoder) throws {
         if let container = try? decoder.singleValueContainer() {
             let type = try container.decode(Int.self)
             switch type {
             case 0: self = .noValue1
             case 1: self = .noValue2
             case 2: self = .noValue3
             default: throw …
             }
         } else {
             let container = try decoder.container(keyedBy: CodingKeys.self)
             let type = try container.decode(Int.self, forKey: .type)
             switch type {
             case 3:
                 let value = try container.decode(Int.self, forKey: .value)
                 self = .associated1(value)
             case 4:
                 let value = try container.decode(String.self, forKey: .value)
                 self = .associated2(value)
             default: throw …
             }
         }
     }
}
```

There are, of course, many more approaches that you could take, but 
these are just two examples. The first is likely simpler to read and 
comprehend, but may not be appropriate if you’re trying to optimize 
for space.

> I don't necessarily assume that the compiler should write conformances 
> to these sorts of complicated enums for me (though that would be 
> nice!); I'm just wondering what the designers of this feature envision 
> people doing in cases like these.
>
>> 	• protocol Codable: Adopted by types to opt into archival. 
>> Conformance may be automatically derived in cases where all 
>> properties are also Codable.
>
> Have you given any consideration to supporting types which only need 
> to decode? That seems likely to be common when interacting with web 
> services.
We have. Ultimately, we decided that the introduction of several 
protocols to cover encodability, decodability, and both was too much of 
a cognitive overhead, considering the number of other types we’re also 
introducing. You can always implement `encode(to:)` as `fatalError()`.
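
For illustration, a decode-only type under this design might look like 
the following (a sketch; the type and its key are hypothetical):

```swift
struct SearchResponse : Codable {
    var count: Int

    private enum CodingKeys : CodingKey {
        case count
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        count = try container.decode(Int.self, forKey: .count)
    }

    // This type is only ever decoded from a payload, never encoded.
    func encode(to encoder: Encoder) throws {
        fatalError("SearchResponse does not support encoding")
    }
}
```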

>> 	• protocol CodingKey: Adopted by types used as keys for keyed 
>> containers, replacing String keys with semantic types. Conformance 
>> may be automatically derived in most cases.
>> 	• protocol Encoder: Adopted by types which can take Codable values 
>> and encode them into a native format.
>> 		• class KeyedEncodingContainer<Key : CodingKey>: Subclasses of 
>> this type provide a concrete way to store encoded values by 
>> CodingKey. Types adopting Encoder should provide subclasses of 
>> KeyedEncodingContainer to vend.
>> 		• protocol SingleValueEncodingContainer: Adopted by types which 
>> provide a concrete way to store a single encoded value. Types 
>> adopting Encoder should provide types conforming to 
>> SingleValueEncodingContainer to vend (but in many cases will be able 
>> to conform to it themselves).
>> 	• protocol Decoder: Adopted by types which can take payloads in a 
>> native format and decode Codable values out of them.
>> 		• class KeyedDecodingContainer<Key : CodingKey>: Subclasses of 
>> this type provide a concrete way to retrieve encoded values from 
>> storage by CodingKey. Types adopting Decoder should provide 
>> subclasses of KeyedDecodingContainer to vend.
>> 		• protocol SingleValueDecodingContainer: Adopted by types which 
>> provide a concrete way to retrieve a single encoded value from 
>> storage. Types adopting Decoder should provide types conforming to 
>> SingleValueDecodingContainer to vend (but in many cases will be able 
>> to conform to it themselves).
>
> I do want to note that, at this point in the proposal, I was sort of 
> thinking you'd gone off the deep end modeling this. Having read the 
> whole thing, I now understand what all of these things do, but this 
> really is a very large subsystem. I think it's worth asking if some of 
> these types can be eliminated or combined.
In the past, the concepts of `SingleValueContainer` and `Encoder` were 
not distinct — all of the methods on `SingleValueContainer` were just 
part of `Encoder`. Sure, this is a simpler system, but unfortunately 
promotes the wrong thing altogether. I’ll address this below.

>> Structured types (i.e. types which encode as a collection of 
>> properties) encode and decode their properties in a keyed manner. 
>> Keys may be String-convertible or Int-convertible (or both),
>
> What does "may" mean here? That, at runtime, the encoder will test for 
> the preferred key type and fall back to the other one? That seems a 
> little bit problematic.
Yes, this is the case. A lot is left up to the `Encoder` because it can 
choose to do something for its format that your implementation of 
`encode(to:)` may not have considered.
If you try to encode something with an `Int` key in a string-keyed 
dictionary, the encoder may choose to stringify the integer if 
appropriate for the format. If not, it can reject your key, ignore the 
call altogether, `preconditionFailure()`, etc. It is also perfectly 
legitimate to write an `Encoder` which supports a flat encoding format 
— in that case, keys are likely ignored altogether, in which case 
there is no error to be had. We’d like to not arbitrarily constrain an 
implementation unless necessary.

FWIW, 99.9% of the time, the appropriate thing to do is to simply throw 
an error or `preconditionFailure()`; nasal demons should not be the 
common case. But for some encoding formats, ignoring keys altogether is 
appropriate.

> I'm also quite worried about how `Int`-convertible keys will interact 
> with code synthesis. The obvious way to assign integers—declaration 
> order—would mean that reordering declarations would invisibly break 
> archiving, potentially (if the types were compatible) without breaking 
> anything in an error-causing way even at runtime. You could sort the 
> names, but then adding a new property would shift the integers of the 
> properties "below" it. You could hash the names, but then there's no 
> obvious relationship between the integers and key cases.
>
> At the same time, I also think that using arbitrary integers is a poor 
> match for ordering. If you're making an ordered container, you don't 
> want arbitrary integers wrapped up in an abstract type. You want 
> adjacent integers forming indices of an eventual array. (Actually, you 
> may not want indices at all—you may just want to feed elements in 
> one at a time!)
For these exact reasons, integer keys are not produced by code 
synthesis, only string keys. If you want integer keys, you’ll have to 
write them yourself. :)

Integer keys are fragile, as you point out yourself, and while we’d 
like to encourage their use as appropriate, they require explicit 
thought and care as to their use.
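
For illustration, here’s a sketch of hand-written keys carrying both 
representations, assuming the proposal’s `CodingKey` requirements 
(optional `stringValue`/`intValue` plus the corresponding failable 
initializers); the key names are hypothetical, and the enum would be 
nested inside your `Codable` type:

```swift
private enum CodingKeys : CodingKey {
    case name
    case email

    var stringValue: String? {
        switch self {
        case .name:  return "name"
        case .email: return "email"
        }
    }

    var intValue: Int? {
        switch self {
        case .name:  return 0
        case .email: return 1
        }
    }

    init?(stringValue: String) {
        switch stringValue {
        case "name":  self = .name
        case "email": self = .email
        default:      return nil
        }
    }

    init?(intValue: Int) {
        switch intValue {
        case 0:  self = .name
        case 1:  self = .email
        default: return nil
        }
    }
}
```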

> So I would suggest the following changes:
>
> * The coding key always converts to a string. That means we can 
> eliminate the `CodingKey` protocol and instead use `RawRepresentable 
> where RawValue == String`, leveraging existing infrastructure. That 
> also means we can call the `CodingKeys` associated type `CodingKey` 
> instead, which is the correct name for it—we're not talking about an 
> `OptionSet` here.
>
> * If, to save space on disk, you want to also people to use integers 
> as the serialized representation of a key, we might introduce a 
> parallel `IntegerCodingKey` protocol for that, but every `CodingKey` 
> type should map to `String` first and foremost. Using a protocol here 
> ensures that it can be statically determined at compile time whether a 
> type can be encoded with integer keys, so the compiler can select an 
> overload of `container(keyedBy:)`.
>
> * Intrinsically ordered data is encoded as a single value containers 
> of type `Array<Codable>`. (I considered having an `orderedContainer()` 
> method and type, but as I thought about it, I couldn't think of an 
> advantage it would have over `Array`.)
This is possible, but I don’t see this as necessarily advantageous 
over what we currently have. In 99.9% of cases, `CodingKey` types will 
have string values anyway — in many cases you won’t have to write 
the `enum` yourself to begin with, but even when you do, derived 
`CodingKey` conformance will generate string values on your behalf.
The only time a key will not have a string value is if the `CodingKey` 
protocol is implemented manually and a value is either deliberately left 
out, or there was a mistake in the implementation; in either case, there 
wouldn’t have been a valid string value anyway.

>>     /// Returns an encoding container appropriate for holding a 
>> single primitive value.
>>     ///
>>     /// - returns: A new empty single value container.
>>     /// - precondition: May not be called after a prior 
>> `self.container(keyedBy:)` call.
>>     /// - precondition: May not be called after a value has been 
>> encoded through a previous `self.singleValueContainer()` call.
>>     func singleValueContainer() -> SingleValueEncodingContainer
>
> Speaking of which, I'm not sure about single value containers. My 
> first instinct is to say that methods should be moved from them to the 
> `Encoder` directly, but that would probably cause code duplication. 
> But...isn't there already duplication between the 
> `SingleValue*Container` and the `Keyed*Container`? Why, yes, yes there 
> is. So let's talk about that.
In the Alternatives Considered section of the proposal, we detail having 
done just this. Originally, the requirements now on 
`SingleValueContainer` sat on `Encoder` and `Decoder`.
Unfortunately, this made it too easy to do the wrong thing, and required 
extra work (in comparison) to do the right thing.

When `Encoder` has `encode(_ value: Bool?)`, `encode(_ value: Int?)`, 
etc. on it, it’s very intuitive to try to encode values that way:

```swift
func encode(to encoder: Encoder) throws {
     // The very first thing I try to type is encoder.enc… and guess what pops up in autocomplete:
     try encoder.encode(myName)
     try encoder.encode(myEmail)
     try encoder.encode(myAddress)
}
```

This might look right to someone expecting to be able to encode in an 
ordered fashion, which is _not_ what these methods do.
In addition, for someone expecting keyed encoding methods, this is very 
confusing. Where are those methods? Why don’t these "default" methods 
have keys?

The very first time that code block ran, it would 
`preconditionFailure()` or throw an error, since those methods intend to 
encode only one single value.
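
By contrast, with a separate container the single-value case is an 
explicit opt-in. A minimal sketch of the intended usage under the 
proposal (the `Celsius` type is illustrative):

```swift
struct Celsius : Codable {
    var degrees: Double

    func encode(to encoder: Encoder) throws {
        // Explicitly request a single-value container, then encode
        // exactly one primitive value into it.
        let container = encoder.singleValueContainer()
        try container.encode(degrees)
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        degrees = try container.decode(Double.self)
    }
}
```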

>>     open func encode<Value : Codable>(_ value: Value?, forKey key: 
>> Key) throws
>>     open func encode(_ value: Bool?,   forKey key: Key) throws
>>     open func encode(_ value: Int?,    forKey key: Key) throws
>>     open func encode(_ value: Int8?,   forKey key: Key) throws
>>     open func encode(_ value: Int16?,  forKey key: Key) throws
>>     open func encode(_ value: Int32?,  forKey key: Key) throws
>>     open func encode(_ value: Int64?,  forKey key: Key) throws
>>     open func encode(_ value: UInt?,   forKey key: Key) throws
>>     open func encode(_ value: UInt8?,  forKey key: Key) throws
>>     open func encode(_ value: UInt16?, forKey key: Key) throws
>>     open func encode(_ value: UInt32?, forKey key: Key) throws
>>     open func encode(_ value: UInt64?, forKey key: Key) throws
>>     open func encode(_ value: Float?,  forKey key: Key) throws
>>     open func encode(_ value: Double?, forKey key: Key) throws
>>     open func encode(_ value: String?, forKey key: Key) throws
>>     open func encode(_ value: Data?,   forKey key: Key) throws
>
> Wait, first, a digression for another issue: I'm concerned that, if 
> you look at the `decode` calls, there are plain `decode(…)` calls 
> which throw if a `nil` was originally encoded and `decodeIfPresent` 
> calls which return optional. The result is, essentially, that the 
> encoding system eats a level of optionality for its own 
> purposes—seemingly good, straightforward-looking code like this:
>
> 	struct MyRecord: Codable {
> 		var id: Int?
> 		
> 		func encode(to encoder: Encoder) throws {
> 			let container = encoder.container(keyedBy: CodingKey.self)
> 			try container.encode(id, forKey: .id)
> 		}
> 		
> 		init(from decoder: Decoder) throws {
> 			let container = decoder.container(keyedBy: CodingKey.self)
> 			id = try container.decode(Int.self, forKey: .id)
> 		}
> 	}
>
> Will crash. (At least, I assume that's what will happen.)
The return type of `decode(Int.self, forKey: .id)` is `Int`. I’m not 
convinced that it’s possible to misconstrue that as the correct thing 
to do here. How would that return a `nil` value if the value was `nil` 
to begin with?
The only other method that would be appropriate is 
`decodeIfPresent(Int.self, forKey: .id)`, which is exactly what you 
want.
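
Spelled out, the corrected version of the example above would look like 
this (a sketch; `CodingKeys` written out for clarity):

```swift
struct MyRecord : Codable {
    var id: Int?

    private enum CodingKeys : CodingKey {
        case id
    }

    func encode(to encoder: Encoder) throws {
        let container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(id, forKey: .id) // accepts Int? directly
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        // decodeIfPresent returns Int?, matching the property's type.
        id = try container.decodeIfPresent(Int.self, forKey: .id)
    }
}
```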

> I think we'd be better off having `encode(_:forKey:)` not take an 
> optional; instead, we should have `Optional` conform to `Codable` and 
> behave in some appropriate way. Exactly how to implement it might be a 
> little tricky because of nested optionals; I suppose a `none` would 
> have to measure how many levels of optionality there are between it 
> and a concrete value, and then encode that information into the data. 
> I think our `NSNull` bridging is doing something broadly similar right 
> now.
`Optional` cannot conform to `Codable` for the reasons given above. It 
is a primitive type, much like `Int` and `String`, and it’s up to the 
encoder and the format to represent it.
How would `Optional` itself encode `nil`?

> I know that this is not the design you would use in Objective-C, but 
> Swift uses `Optional` differently from how Objective-C uses `nil`. 
> Swift APIs consider `nil` and absent to be different things; where 
> they can both occur, good Swift APIs use doubled-up Optionals to be 
> precise about the situation. I think the design needs to be a little 
> different to accommodate that.
>
> Now, back to the `SingleValue*Container`/`Keyed*Container` issue. The 
> list above is, frankly, gigantic. You specify a *lot* of primitives in 
> `Keyed*Container`; there's a lot to implement here. And then you have 
> to implement it all *again* in `SingleValue*Container`:
>
>>     func encode(_ value: Bool) throws
>>     func encode(_ value: Int) throws
>>     func encode(_ value: Int8) throws
>>     func encode(_ value: Int16) throws
>>     func encode(_ value: Int32) throws
>>     func encode(_ value: Int64) throws
>>     func encode(_ value: UInt) throws
>>     func encode(_ value: UInt8) throws
>>     func encode(_ value: UInt16) throws
>>     func encode(_ value: UInt32) throws
>>     func encode(_ value: UInt64) throws
>>     func encode(_ value: Float) throws
>>     func encode(_ value: Double) throws
>>     func encode(_ value: String) throws
>>     func encode(_ value: Data) throws
>
>
> This is madness.
>
> Look, here's what we do. You have two types: `Keyed*Container` and 
> `Value*Container`. `Keyed*Container` looks something like this:
>
> 	final public class KeyedEncodingContainer<EncoderType: Encoder, Key: 
> RawRepresentable> where Key.RawValue == String {
> 	    public let encoder: EncoderType
> 	
> 	    public let codingKeyContext: [RawRepresentable where RawValue == 
> String]
> 	    // Hmm, we might need a CodingKey protocol after all.
> 	    // Still, it could just be `protocol CodingKey: RawRepresentable 
> where RawValue == String {}`
> 	
> 	    subscript (key: Key) -> ValueEncodingContainer {
> 	        return encoder.makeValueEncodingContainer(forKey: key)
> 	    }
> 	}
>
> It's so simple, it doesn't even need to be specialized. You might even 
> be able to get away with combining the encoding and decoding variants 
> if the subscript comes from a conditional extension. `Value*Container` 
> *does* need to be specialized; it looks like this (modulo the 
> `Optional` issue I mentioned above):
Sure, let’s go with this for a moment. Presumably, then, `Encoder` 
would be able to vend out both `KeyedEncodingContainer`s and 
`ValueEncodingContainer`s, correct?

> 	public protocol ValueEncodingContainer {
> 	    func encode<Value : Codable>(_ value: Value?, forKey key: Key) 
> throws
I’m assuming that the key here is a typo, correct?
Keep in mind that combining these concepts changes the semantics of how 
single-value encoding works. Right now `SingleValueEncodingContainer` 
only allows values of primitive types; this would allow you to encode a 
value in terms of a different arbitrarily-codable value.

> 	    func encode(_ value: Bool?) throws
> 	    func encode(_ value: Int?) throws
> 	    func encode(_ value: Int8?) throws
> 	    func encode(_ value: Int16?) throws
> 	    func encode(_ value: Int32?) throws
> 	    func encode(_ value: Int64?) throws
> 	    func encode(_ value: UInt?) throws
> 	    func encode(_ value: UInt8?) throws
> 	    func encode(_ value: UInt16?) throws
> 	    func encode(_ value: UInt32?) throws
> 	    func encode(_ value: UInt64?) throws
> 	    func encode(_ value: Float?) throws
> 	    func encode(_ value: Double?) throws
> 	    func encode(_ value: String?) throws
> 	    func encode(_ value: Data?) throws
> 	
> 	    func encodeWeak<Object : AnyObject & Codable>(_ object: Object?) 
> throws
Same comment here.
  	
> 	    var codingKeyContext: [CodingKey]
> 	}
>
> And use sites would look like:
>
> 	func encode(to encoder: Encoder) throws {
> 		let container = encoder.container(keyedBy: CodingKey.self)
> 		try container[.id].encode(id)
> 		try container[.name].encode(name)
> 		try container[.birthDate].encode(birthDate)
> 	}
For consumers, this doesn’t seem to make much of a difference. We’ve 
turned `try container.encode(id, forKey: .id)` into `try 
container[.id].encode(id)`.

> Decoding is slightly tricker. You could either make the subscript 
> `Optional`, which would be more like `Dictionary` but would be 
> inconsistent with `Encoder` and would give the "never force-unwrap 
> anything" crowd conniptions, or you could add a `contains()` method to 
> `ValueDecodingContainer` and make `decode(_:)` throw. Either one 
> works.
>
> Also, another issue with the many primitives: swiftc doesn't really 
> like large overload sets very much. Could this set be reduced? I'm not 
> sure what the logic was in choosing these particular types, but many 
> of them share protocols in Swift—you might get away with just this:
>
> 	public protocol ValueEncodingContainer {
> 	    func encode<Value : Codable>(_ value: Value?, forKey key: Key) 
> throws
> 	    func encode(_ value: Bool?,   forKey key: Key) throws
> 	    func encode<Integer: SignedInteger>(_ value: Integer?, forKey 
> key: Key) throws
> 	    func encode<UInteger: UnsignedInteger>(_ value: UInteger?, forKey 
> key: Key) throws
> 	    func encode<Floating: FloatingPoint>(_ value: Floating?, forKey 
> key: Key) throws
> 	    func encode(_ value: String?, forKey key: Key) throws
> 	    func encode(_ value: Data?,   forKey key: Key) throws
> 	
> 	    func encodeWeak<Object : AnyObject & Codable>(_ object: Object?, 
> forKey key: Key) throws
> 	
> 	    var codingKeyContext: [CodingKey]
> 	}
These types were chosen because we want the API to make static 
guarantees about concrete types which all `Encoder`s and `Decoder`s 
should support. This is somewhat less relevant for JSON, but more 
relevant for binary formats where the difference between `Int16` and 
`Int64` is critical.
Making these generic turns the concrete type check into a runtime check 
that `Encoder` authors need to keep in mind. More importantly, any type 
can conform to `SignedInteger` or `UnsignedInteger` as long as it 
fulfills the protocol requirements. I can write an `Int37` type, but no 
encoder could make sense of that type, and that failure is a runtime 
failure. If you want a concrete example, `Float80` conforms to 
`FloatingPoint`, but no popular binary format I’ve seen supports 
80-bit floats — we cannot prevent that call statically…

Instead, we want to offer a static, concrete list of types that 
`Encoder`s and `Decoder`s must be aware of, and that consumers have 
guarantees about support for.

>
> To accommodate my previous suggestion of using arrays to represent 
> ordered encoded data, I would add one more primitive:
>
> 	    func encode(_ values: [Codable]) throws
Collection types are purposefully not primitives here:

* If `Array` is a primitive, but does not conform to `Codable`, then you 
cannot encode `Array<Array<Codable>>`.
* If `Array` is a primitive, and conforms to `Codable`, then there may 
be ambiguity between `encode(_ values: [Codable])` and `encode(_ value: 
Codable)`.
   * Even in cases where there is no ambiguity, inside of `encode(_ 
values: [Codable])`, if I call `encode([[1,2],[3,4]])`, you’ve lost 
type information about what’s contained in the array — all you see 
is `Codable`.
   * If you change it to `encode<Value : Codable>(_ values: [Value])` to 
compensate for that, you still cannot infinitely recurse on what type 
`Value` is. Try it with `encode([[[[1]]]])` and you’ll see what I 
mean; at some point the inner types are no longer preserved.

> (Also, is there any sense in adding `Date` to this set, since it needs 
> special treatment in many of our formats?)
We’ve considered adding `Date` to this list. However, this means that 
any format that is a part of this system needs to be able to make a 
decision about how to format dates. Many binary formats have no native 
representations of dates, so this is not necessarily a guarantee that 
all formats can make.

Looking for additional opinions on this one.

>> Encoding Container Types
>>
>> For some types, the container into which they encode has meaning. 
>> Especially when coding for a specific output format (e.g. when 
>> communicating with a JSON API), a type may wish to explicitly encode 
>> as an array or a dictionary:
>>
>> // Continuing from before
>> public protocol Encoder {
>>     func container<Key : CodingKey>(keyedBy keyType: Key.Type, type 
>> containerType: EncodingContainerType) -> KeyedEncodingContainer<Key>
>> }
>>
>> /// An `EncodingContainerType` specifies the type of container an 
>> `Encoder` should use to store values.
>> public enum EncodingContainerType {
>>     /// The `Encoder`'s preferred container type; equivalent to 
>> either `.array` or `.dictionary` as appropriate for the encoder.
>>     case `default`
>>
>>     /// Explicitly requests the use of an array to store encoded 
>> values.
>>     case array
>>
>>     /// Explicitly requests the use of a dictionary to store encoded 
>> values.
>>     case dictionary
>> }
>
> I see what you're getting at here, but I don't think this is fit for 
> purpose, because arrays are not simply dictionaries with integer 
> keys—their elements are adjacent and ordered. See my discussion 
> earlier about treating inherently ordered containers as simply 
> single-value `Array`s.
You’re right in that arrays are not simply dictionaries with integer 
keys, but I don’t see where we make that assertion here.
If an `Encoder` is asked for an array and is provided with integer keys, 
it can use those keys as indices. If the keys are non-contiguous, the 
intervening spaces can be filled with null values (if appropriate for 
the format; if not, the operation can error out).

The way these containers are handled is completely up to the `Encoder`. 
An `Encoder` producing an array may choose to ignore keys altogether and 
simply produce an array from the values given to it sequentially. (This 
is not recommended, but possible.)
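
As an illustrative sketch of how that might look with the 
`EncodingContainerType` API quoted above (the `Point` type is 
hypothetical, and the keys are hand-written since integer keys are never 
synthesized):

```swift
struct Point : Codable {
    var x: Double
    var y: Double

    // Hand-written keys mapping x → 0 and y → 1.
    private enum CodingKeys : CodingKey {
        case x
        case y

        var stringValue: String? {
            switch self {
            case .x: return "x"
            case .y: return "y"
            }
        }

        var intValue: Int? {
            switch self {
            case .x: return 0
            case .y: return 1
            }
        }

        init?(stringValue: String) {
            switch stringValue {
            case "x": self = .x
            case "y": self = .y
            default:  return nil
            }
        }

        init?(intValue: Int) {
            switch intValue {
            case 0:  self = .x
            case 1:  self = .y
            default: return nil
            }
        }
    }

    func encode(to encoder: Encoder) throws {
        // Explicitly request an array; the integer keys serve as indices.
        let container = encoder.container(keyedBy: CodingKeys.self, type: .array)
        try container.encode(x, forKey: .x)
        try container.encode(y, forKey: .y)
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        x = try container.decode(Double.self, forKey: .x)
        y = try container.decode(Double.self, forKey: .y)
    }
}
```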

>> Nesting
>>
>> In practice, some types may also need to control how data is nested 
>> within their container, or potentially nest other containers within 
>> their container. Keyed containers allow this by returning nested 
>> containers of differing key types:
>
> [snip]
>
>> This can be common when coding against specific external data 
>> representations:
>>
>> // User type for interfacing with a specific JSON API. JSON API 
>> expects encoding as {"id": ..., "properties": {"name": ..., 
>> "timestamp": ...}}. Swift type differs from encoded type, and 
>> encoding needs to match a spec:
>
> This comes very close to—but doesn't quite—address something else 
> I'm concerned about. What's the preferred way to handle differences in 
> serialization to different formats?
>
> Here's what I mean: Suppose I have a BlogPost model, and I can both 
> fetch and post BlogPosts to a cross-platform web service, and store 
> them locally. But when I fetch and post remotely, I ned to conform to 
> the web service's formats; when I store an instance locally, I have a 
> freer hand in designing my storage, and perhaps need to store some 
> extra metadata. How do you imagine handling that sort of situation? Is 
> the answer simply that I should use two different types?
This is a valid concern, and one that should likely be addressed.

Perhaps the solution is to offer a `userInfo : [UserInfoKey : Any]` 
(`UserInfoKey` being a `String`-`RawRepresentable` struct or similar) on 
`Encoder` and `Decoder` set at the top-level to allow passing this type 
of contextual information from the top level down.
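
As a sketch of how that might look (everything here — `UserInfoKey`, 
the "target" key, and the check itself — is hypothetical, not part of 
the proposal):

```swift
struct BlogPost : Codable {
    var title: String
    var localNotes: String?

    private enum CodingKeys : CodingKey {
        case title
        case localNotes
    }

    // Hypothetical context key; assumes UserInfoKey is the suggested
    // String-RawRepresentable struct with a non-failable initializer.
    static let targetKey = UserInfoKey(rawValue: "target")

    func encode(to encoder: Encoder) throws {
        let container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(title, forKey: .title)

        // Only the local store gets the extra metadata.
        if encoder.userInfo[BlogPost.targetKey] as? String == "localStorage" {
            try container.encode(localNotes, forKey: .localNotes)
        }
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
        localNotes = try container.decodeIfPresent(String.self, forKey: .localNotes)
    }
}
```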

>> To remedy both of these points, we adopt a new convention for 
>> inheritance-based coding — encoding super as a sub-object of self:
>
> [snip]
>
>>         try super.encode(to: container.superEncoder())
>
> This seems like a good idea to me. However, it brings up another 
> point: What happens if you specify a superclass of the originally 
> encoded class? In other words:
>
> 	let joe = Employee(…)
> 	let payload = try SomeEncoder().encode(joe)
> 	let someone = try SomeDecoder().decode(Person.self, from: payload)
> 	print(type(of: someone))		// Person, Employee, or does 
> `decode(_:from:)` fail?
We don’t support this type of polymorphic decoding. Because no type 
information is written into the payload (there’s currently no way to 
do this that isn’t brittle), there’s no way to tell what’s in 
there prior to decoding it (and there wouldn’t be a reasonable way to 
trust what’s in the payload to begin with).
We’ve thought through this a lot, but in the end we’re willing to 
make this tradeoff, primarily for security, and secondarily for 
simplicity.

>> The encoding container types offer overloads for working with and 
>> processing the API's primitive types (String, Int, Double, etc.). 
>> However, for ease of implementation (both in this API and others), it 
>> can be helpful for these types to conform to Codable themselves. 
>> Thus, along with these overloads, we will offer Codable conformance 
>> on these types:
>
> [snip]
>
>> Since Swift's function overload rules prefer more specific functions 
>> over generic functions, the specific overloads are chosen where 
>> possible (e.g. encode("Hello, world!", forKey: .greeting) will choose 
>> encode(_: String, forKey: Key) over encode<T : Codable>(_: T, forKey: 
>> Key)). This maintains performance over dispatching through the 
>> Codable existential, while allowing for the flexibility of fewer 
>> overloads where applicable.
>
> How important is this performance? If the answer is "eh, not really 
> that much", I could imagine a setup where every "primitive" type 
> eventually represents itself as `String` or `Data`, and each 
> `Encoder`/`Decoder` can use dynamic type checks in 
> `encode(_:)`/`decode(_:)` to define whatever "primitives" it wants for 
> its own format.
Does this imply that `Int32` should decide how it’s represented as 
`Data`? What if an encoder forgets to implement that?
Again, we want to provide a static list of types that `Encoder`s know 
they _must_ handle, and thus, consumers have _guarantees_ that those 
types are supported.

> * * *
>
> One more thing. In Alternatives Considered, you present two 
> designs—#2 and #3—where you generate a separate instance which 
> represents the type in a fairly standardized way for the encoder to 
> examine.
>
> This design struck me as remarkably similar to the reflection system 
> and its `Mirror` type, which is also a separate type describing an 
> original instance. My question was: Did you look at the reflection 
> system when you were building this design? Do you think there might be 
> anything that can be usefully shared between them?
We did, quite a bit, and spent a lot of time considering reflection and 
its place in our design. Ultimately, the reflection system does not 
currently have the features we would need, and although the Swift team 
has expressed desire to improve the system considerably, it’s not 
currently a top priority, AFAIK.

> Thank you for your attention. I hope this was helpful!
Thanks for all of these comments! Looking to respond to your other email 
soon.

> -- 
> Brent Royal-Gordon
> Architechies