[swift-evolution] [Proposal] Foundation Swift Encoders

Thu Mar 16 15:51:54 CDT 2017

On 16 Mar 2017, at 1:00, Brent Royal-Gordon wrote:

>> On Mar 15, 2017, at 3:43 PM, Itai Ferber via swift-evolution 
>> <swift-evolution at swift.org> wrote:
>>
>> Hi everyone,
>> This is a companion proposal to the Foundation Swift Archival & 
>> Serialization API. This introduces new encoders and decoders to be 
>> used as part of this system.
>> The proposal is available online and inlined below.
>
> Executive summary: I like where you're going with this, but I'm 
> worried about flexibility.
>
> I'm not going to quote every bit of the JSON section because Apple 
> Mail seems to destroy the formatting when I reply, but: I think you've 
> identified several of the most important customization points (Date, 
> Data, and illegal Floats). However, I think:
>
> * People may want to map illegal Floats to legal floating-point values 
> (say, `greatestFiniteMagnitude`, `-greatestFiniteMagnitude`, and `0`) 
> or map them to `null`s. They may also want different behavior for 
> different things: imagine `(positiveInfinity: 
> Double.greatestFiniteMagnitude, negativeInfinity: 
> -Double.greatestFiniteMagnitude, nan: .throw)`.
I agree, this may be something that users could want.

>
> * Large integers are another big concern that you don't address. 
> Because JSON only supports doubles, APIs that use 64-bit IDs often 
> need them to be passed as strings, frequently with a different key 
> ("id_str" instead of "id").
This is not true: JSON places no limit on the numbers it can 
represent. 340282366920938463463374607431768211455 (2^128-1) is a 
perfectly legitimate number in JSON, though you may have a hard time 
reading it in on some platforms. _JavaScript_ numbers are IEEE 754 
doubles, but that’s a JavaScript problem, not a JSON problem.

If what you mean here is that some large numbers should be encoded as 
strings instead of integers for the benefit of the other side reading it 
in a valid way, then perhaps.
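
To illustrate the concern about the reading side, here is a minimal 
sketch of what happens when a 64-bit ID passes through an IEEE 754 
double, which is what a JavaScript consumer would do. The ID value is 
chosen for illustration only.

```swift
// 2^53 + 1 is the smallest positive integer a Double cannot represent exactly.
let id: Int64 = 9_007_199_254_740_993

// Simulate a reader that stores every JSON number as a double.
let roundTripped = Int64(Double(id))

print(id == roundTripped)        // false: the lowest bit was rounded away
print(Int64(String(id)) == id)   // true: the string form keeps every digit
```

The JSON text itself is fine in both cases; the loss happens only in 
the consumer's number type.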

> * For that matter, style and capitalization are a problem. JSON style 
> varies, but it *tends* to be snake_case, where Cocoa favors camelCase. 
> You can address this at the CodingKey level by manually specifying 
> string equivalents of all the coding keys, but that's kind of a pain, 
> and it affects all of your code and all of your serializations.
>
> I'm sorely tempted to suggest that we give the JSON encoder and 
> decoder a delegate:
>
> 	public protocol JSONCodingDelegate {
> 		/// Returns the string name to be used when encoding or decoding the
> 		/// given CodingKey as JSON.
> 		///
> 		/// - Returns: The string to use, or `nil` for the default.
> 		func jsonName(for key: CodingKey, at keyPath: [CodingKey], in encoderOrDecoder: AnyObject) throws -> String?
>
> 		// These are used when encoding/decoding any of the integer types.
> 		func jsonValue(from integer: Int64, at keyPath: [CodingKey], in encoder: JSONEncoder) throws -> JSONValue?
> 		func integer(from jsonValue: JSONValue, at keyPath: [CodingKey], in decoder: JSONDecoder) throws -> Int64?
>
> 		// These are used when encoding/decoding any of the floating-point types.
> 		func jsonValue(from number: Double, at keyPath: [CodingKey], in encoder: JSONEncoder) throws -> JSONValue?
> 		func number(from jsonValue: JSONValue, at keyPath: [CodingKey], in decoder: JSONDecoder) throws -> Double?
>
> 		// These are used when encoding/decoding Date.
> 		func jsonValue(from date: Date, at keyPath: [CodingKey], in encoder: JSONEncoder) throws -> JSONValue?
> 		func date(from jsonValue: JSONValue, at keyPath: [CodingKey], in decoder: JSONDecoder) throws -> Date?
>
> 		// These are used when encoding/decoding Data.
> 		func jsonValue(from data: Data, at keyPath: [CodingKey], in encoder: JSONEncoder) throws -> JSONValue?
> 		func data(from jsonValue: JSONValue, at keyPath: [CodingKey], in decoder: JSONDecoder) throws -> Data?
> 	}
> 	public enum JSONValue {
> 		case string(String)
> 		case number(Double)
> 		case bool(Bool)
> 		case object([String: JSONValue])
> 		case array([JSONValue])
> 		case null
> 	}
I disagree with generalizing this to the point of being on a delegate. 
This is all work that you could be doing in `encode(to:)` and 
`init(from:)`. In `encode(to:)`, it’s always possible to clamp an 
invalid floating-point number to `Double.greatestFiniteMagnitude`, and 
always possible to `encode("\(id)", forKey: .id)` instead of `encode(id, 
forKey: .id)`.
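
As a rough sketch of what that looks like in a type's own 
`encode(to:)`: the `Reading` type, its properties, and the choice to 
map NaN to `0` are hypothetical, and the code uses `Encodable` and 
`JSONEncoder` as they exist in Foundation today, which may not match 
the draft under discussion exactly.

```swift
import Foundation

// Hypothetical model type, used only for illustration.
struct Reading: Encodable {
    var id: Int64
    var value: Double

    enum CodingKeys: String, CodingKey {
        case id = "id_str"   // the string-typed key mentioned earlier in the thread
        case value
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)

        // Write the 64-bit ID as a string so double-based readers keep every digit.
        try container.encode("\(id)", forKey: .id)

        // Clamp infinities to the largest finite magnitude; NaN becomes 0 here.
        let clamped: Double
        if value.isNaN {
            clamped = 0
        } else {
            clamped = min(max(value, -Double.greatestFiniteMagnitude),
                          Double.greatestFiniteMagnitude)
        }
        try container.encode(clamped, forKey: .value)
    }
}

let data = try JSONEncoder().encode(Reading(id: 9_007_199_254_740_993, value: .infinity))
let json = String(data: data, encoding: .utf8)!
print(json)
```

The policy lives with the type that owns the data, so no encoder-wide 
hook is needed for this case.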

The options that we have on `JSONEncoder` and `JSONDecoder` straddle a 
fine line between being pedantically correct (and refusing to break 
encapsulation for encoded types), and being pragmatically useful. In 
theory, it certainly feels "wrong" that we would allow someone to change 
the way in which a `Date` is encoded, or how `Double`s are represented; 
in a pragmatic sense, though, JSON has no native representation of 
non-finite `Double` values (infinities and NaN), or a standardized 
representation of dates, and it’s useful to provide options for 
controlling that.

However, allowing a delegate to intercept all such calls feels like it 
leans too much in the wrong direction. We’d like to offer as limited a 
set of knobs as possible while still being useful.
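
For a sense of how small that set of knobs is, here is a sketch using 
the option names found on `JSONEncoder` in Foundation today. These 
names are an assumption relative to this thread, since the draft 
spelling may differ, and the replacement strings are arbitrary.

```swift
import Foundation

let encoder = JSONEncoder()

// JSON has no date type, so the caller picks a representation.
encoder.dateEncodingStrategy = .secondsSince1970

// Binary data has to become a string; Base64 is one option.
encoder.dataEncodingStrategy = .base64

// Non-finite doubles throw by default; here they map to agreed-upon strings.
encoder.nonConformingFloatEncodingStrategy = .convertToString(
    positiveInfinity: "Infinity",
    negativeInfinity: "-Infinity",
    nan: "NaN")

let data = try encoder.encode([Double.infinity, 1.5, -Double.infinity])
let json = String(data: data, encoding: .utf8)!
print(json)
```

Each option applies to one type whose JSON representation is 
genuinely ambiguous, rather than intercepting every encode call.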

>
> Or, perhaps, that a more general form of this delegate be available on 
> all encoders and decoders. But that may be overkill, and even if it 
> *is* a good idea, it's one we can add later.
>
>> Property List
>>
>> We also intend to support the property list format, with 
>> PropertyListEncoder and PropertyListDecoder:
>
> No complaints here.
>
>> Foundation-Provided Errors
>>
>> Along with providing the above encoders and decoders, we would like 
>> to promote the use of a common set of error codes and messages across 
>> all new encoders and decoders. A common vocabulary of expected errors 
>> allows end-users to write code agnostic about the specific 
>> encoder/decoder implementation they are working with, whether 
>> first-party or third-party:
>>
>> extension CocoaError.Code {
>>     /// Thrown when a value incompatible with the output format is
>>     /// encoded.
>>     public static var coderInvalidValue: CocoaError.Code
>>
>>     /// Thrown when a value of a given type is requested but the
>>     /// encountered value is of an incompatible type.
>>     public static var coderTypeMismatch: CocoaError.Code
>>
>>     /// Thrown when read data is corrupted or otherwise invalid for
>>     /// the format. This value already exists today.
>>     public static var coderReadCorrupt: CocoaError.Code
>>
>>     /// Thrown when a requested key or value is unexpectedly null or
>>     /// missing. This value already exists today.
>>     public static var coderValueNotFound: CocoaError.Code
>> }
>
> [snip]
>
>> All of these errors will include the coding key path that led to the 
>> failure in the error's userInfo dictionary under 
>> NSCodingKeyContextErrorKey, along with a non-localized, 
>> developer-facing failure reason under NSDebugDescriptionErrorKey.
>
> Now comes the part where I whine like a four-year-old:
>
> "Do we haaaaaaaave to use the `userInfo` dictionary, papa?"
>
> An enum with an associated value would be a much more natural way to 
> express these errors and the data that comes with them. Failing that, 
> at least give us some convenience properties. The untyped bag of stuff 
> in the `userInfo` dictionary fills developers who spend all their time 
> in Swift with fear and loathing.
>
> Actually, if you wanted to help us out with the "untyped bag of stuff" 
> problem in general, I for one wouldn't say "no":
>
> 	public struct TypedKey<Key: Hashable, Value> {
> 		public var key: Key
> 		public init(key: Key, valueType: Value.Type) {
> 			self.key = key
> 		}
> 	}
> 	extension Dictionary where Value == Any {
> 		public subscript<CastedValue>(typedKey: TypedKey<Key, CastedValue>) -> CastedValue? {
> 			get {
> 				return self[typedKey.key] as? CastedValue
> 			}
> 			set {
> 				self[typedKey.key] = newValue
> 			}
> 		}
> 	}
As explained in a different email, `NSError` is Foundation’s common 
currency for errors, and we are not looking to change that as part of 
this proposal. If we were to add a `userInfo` dictionary to `Encoder` 
and `Decoder`, as mentioned in my other email to you, it would likely 
take on more of the form that you suggest here.
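
A small sketch of how a typed key along those lines might be used, 
assuming the `TypedKey` definition quoted above and a Swift version 
with generic subscripts. The key name `retryCount` is hypothetical, 
and the setter is spelled out explicitly so that assigning `nil` 
removes the entry.

```swift
struct TypedKey<Key: Hashable, Value> {
    var key: Key
    init(key: Key, valueType: Value.Type) { self.key = key }
}

extension Dictionary where Value == Any {
    subscript<CastedValue>(typedKey: TypedKey<Key, CastedValue>) -> CastedValue? {
        get { return self[typedKey.key] as? CastedValue }
        set {
            if let newValue = newValue {
                self[typedKey.key] = newValue
            } else {
                removeValue(forKey: typedKey.key)
            }
        }
    }
}

// Hypothetical key: the value stored under "retryCount" is expected to be an Int.
let retryCount = TypedKey(key: "retryCount", valueType: Int.self)

var userInfo: [String: Any] = [:]
userInfo[retryCount] = 3
let count = userInfo[retryCount]        // inferred as Int?

// A value of the wrong type reads back as nil instead of needing a manual cast.
userInfo["retryCount"] = "three"
let mismatched = userInfo[retryCount]
```

The cast still happens at runtime; the key only moves the expected 
type from each call site to one declaration.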

>> NSKeyedArchiver & NSKeyedUnarchiver Changes
>>
>> Although our primary objectives for this new API revolve around 
>> Swift, we would like to make it easy for current consumers to make 
>> the transition to Codable where appropriate. As part of this, we 
>> would like to bridge compatibility between new Codable types (or 
>> newly-Codable-adopting types) and existing NSCoding types.
>>
>> To do this, we want to introduce changes to NSKeyedArchiver and 
>> NSKeyedUnarchiver in Swift that allow archival of Codable types 
>> intermixed with NSCoding types:
>>
>> // These are provided in the Swift overlay, and included in
>> // swift-corelibs-foundation.
>> extension NSKeyedArchiver {
>>     public func encodeCodable(_ codable: Codable?, forKey key: String) { ... }
>> }
>>
>> extension NSKeyedUnarchiver {
>>     public func decodeCodable<T : Codable>(_ type: T.Type, forKey key: String) -> T? { ... }
>> }
>>
>> NOTE: Since these changes are being made in extensions in the Swift 
>> overlay, it is not yet possible for these methods to be overridden. 
>> These can therefore not be added to NSCoder, since NSKeyedArchiver 
>> and NSKeyedUnarchiver would not be able to provide concrete 
>> implementations. In order to call these methods, it is necessary to 
>> downcast from an NSCoder to NSKeyedArchiver/NSKeyedUnarchiver 
>> directly. Since subclasses of NSKeyedArchiver and NSKeyedUnarchiver 
>> in Swift will inherit these implementations without being able to 
>> override them (which is wrong), we will 
>> NSRequiresConcreteImplementation() dynamically in subclasses.
>>
>> The addition of these methods allows the introduction of Codable 
>> types into existing NSCoding structures, allowing for a transition to 
>> Codable types where appropriate.
>
> I wonder about this.
>
> Could `NSCoding` be imported in Swift as refining `Codable`? Then we 
> could all just forget `NSCoding` exists, other than that certain types 
> are less likely to properly handle being put into a 
> JSONEncoder/Decoder. (Which, to tell the truth, is probably inevitable 
> here; the Encoder and Decoder types look like they're probably too 
> loosely defined to truly guarantee that you can mix-and-match coders 
> and types without occasional problems.)
Likely not. `NSCoding` and `Codable` don’t support the same 
features:

* `NSCoding` implementations currently write type information into 
produced archives. Off the top of my head, I don’t think this is a 
strict necessity of the API, but a _lot_ of `NSCoding` implementations 
rely on this.
* `Codable` requires type information on decode; `NSCoding` does not 
(because of the aforementioned type information in archives), so we 
cannot translate decodes properly.
* `NSCoding` has different primitive types than `Codable`.

They don’t translate 1-to-1, and would have to stay completely 
distinct.
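
The second point can be sketched as follows, assuming the `Codable`, 
`JSONEncoder`, and `JSONDecoder` API as it exists in Foundation today. 
The `Point` type is hypothetical.

```swift
import Foundation

// Hypothetical value type, used only for illustration.
struct Point: Codable, Equatable {
    var x: Int
    var y: Int
}

let data = try JSONEncoder().encode(Point(x: 1, y: 2))

// The payload holds only the fields; nothing in it names the Swift type.
let json = String(data: data, encoding: .utf8)!

// So the caller has to supply the type in order to decode.
let point = try JSONDecoder().decode(Point.self, from: data)
```

An `NSCoding` archive works the other way around: the class name is 
in the archive, and the caller of `decodeObject(forKey:)` does not 
supply a type.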

>> Semantics of Codable Types in Archives
>>
>> There are a few things to note about including Codable values in 
>> NSKeyedArchiver archives:
>>
>> 	• Bridgeable Foundation types will always bridge before encoding. 
>> This is to facilitate writing Foundation types in a compatible format 
>> from both Objective-C and Swift
>> 		• On decode, these types will decode either as their Objective-C 
>> or Swift version, depending on user need (decodeObject(forKey:) will 
>> decode as an Objective-C object; decodeCodable(_:forKey:) as a Swift 
>> value)
>
> This sounds sensible.
>
>> 	• User types, which are not bridgeable, do not write out a $class 
>> and can only be decoded in Swift. In the future, we may add API to 
>> allow Swift types to provide an Objective-C class to decode as, 
>> effectively allowing for user bridging across archival
>
> Even pure Swift class types? I guess that's probably necessary since 
> even our private ability to look up classes at runtime doesn't cover 
> things like generics, but...ugh.
I’m not sure what you mean by this comment. If you have an old 
codebase which you’re converting from Objective-C to Swift, but want 
archives written by newer versions of the codebase to remain readable 
by old versions, it isn’t unreasonable to provide an Objective-C class 
to encode as or decode from.

>> Along with these, the Array, Dictionary, and Set types will gain 
>> Codable conformance (as part of the Conditional Conformance feature), 
>> and encode through NSKeyedArchiver as NSArray, NSDictionary, and 
>> NSSet respectively.
>
> You might need to be careful here—we'll need to make sure that data 
> structures of Swift types bridge properly. I suppose that means 
> `_SwiftValue` will need to support `NSCoding` after all...
It shouldn’t have to. NSKeyedArchiver can make callbacks into Swift 
for the encoding of `_SwiftValue`s it finds which did not bridge.

> -- 
> Brent Royal-Gordon
> Architechies