[swift-evolution] [Proposal] Foundation Swift Archival & Serialization

Colin Barrett colin at springsandstruts.com
Tue Mar 21 13:59:14 CDT 2017


Hi Itai,

On Tue, Mar 21, 2017 at 1:03 PM Itai Ferber <iferber at apple.com> wrote:

Hi Colin,

Thanks for your comments! Are you talking about Codable synthesis, or
encoding in general?

Yeah, I meant specifically in the case where things are synthesized
automatically. As you point out below, if someone implements a custom
Codable instance, all bets are off.

On 21 Mar 2017, at 8:44, Colin Barrett wrote:

Hi Itai,

Glad to see this proposal! I'm curious, have you or the other Swift folks
thought about how *users* of these new Codable protocols will interact with
resilience domains?

What I mean is that what appear to be private or internal identifiers, and
thus changeable at will, may actually be fragile in that changing them will
break the ability to decode archives encoded by previous versions.
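
To make that concrete, here is a rough sketch of the failure mode. It is
written against the Codable and JSONEncoder/JSONDecoder API as it later
shipped in Swift 4, which may differ from this draft, and the UserV1,
UserV2Renamed and UserV2Pinned types are hypothetical names invented for
illustration only.

```swift
import Foundation

// Version 1 of a hypothetical type: the archived key "nick" is derived
// from the name of a private property.
struct UserV1: Codable {
    private var nick: String
    init(nick: String) { self.nick = nick }
}

// Version 2 renames the private property, which silently changes the
// derived key from "nick" to "nickname".
struct UserV2Renamed: Codable {
    private var nickname: String
}

// Version 2 again, this time pinning the archived key to its old spelling.
struct UserV2Pinned: Codable {
    private var nickname: String
    var displayName: String { nickname }

    private enum CodingKeys: String, CodingKey {
        case nickname = "nick"
    }
}

let archive = try JSONEncoder().encode(UserV1(nick: "colin"))

// The rename breaks decoding of the old archive (the key is not found).
let renamed = try? JSONDecoder().decode(UserV2Renamed.self, from: archive)

// Pinning the key keeps old archives readable.
let pinned = try? JSONDecoder().decode(UserV2Pinned.self, from: archive)
```

Nothing in the public interface changed between the versions, yet only the
pinned variant can still read the old payload.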

Making this safer could mean:
- Encoding only public properties

Unfortunately, property accessibility in code does not always map 1-to-1
with accessibility for archival (nor do I think they should be tied to one
another).
There are certainly cases where you’d want to include private information
in an archive, but that is not useful to expose to external clients, e.g.,
a struct/class version:

public struct MyFoo {
    // Should be encoded.
    public var title: String
    public var identifier: Int

    // This should be encoded too — in case the struct changes in the
    // future, want to be able to refer to the payload version.
    private let version = 1.0
}

Of course, there can also be public properties that you don’t find useful
to encode. At the moment, I’m not sure there’s a much better answer than
"the author of the code will have to think about the representation of
their data"; even if there were an easier way to annotate "I definitely
want this to be archived"/"I definitely don’t want this to be archived",
the annotation would still need to be manual.

(The above applies primarily in the case of Codable synthesis; when
implementing Codable manually I don’t think the compiler should ever
prevent you from doing what you need.)

- Adding some form of indirection (a la ObjC non-fragile ivars?)

What do you mean by this?

I'm not sure exactly how or if it would work in detail, unfortunately, but
I know that the ObjC runtime emits symbols which are used to look up the
offset in the object struct for non-fragile ivars. Maybe some similar form
of indirection would be useful for encoding non-public ivars. Like I said,
I don't know exactly how/if that would work, just sharing :)


- Compiler warning (or disallowing) changes to properties in certain
situations.

We’ve thought about this with regards to identifying classes uniquely
across renaming, moving modules, etc.; this is a resilience problem in
general.
In order for the compiler to know about changes to your code it’d need to
keep state across compilations. While possible, this feels pretty fragile
(and potentially not very portable).

   - Compiler warns about changing a property? Blow away the cache
   directory!
   - Cloning the code to a new machine for the first time? Hmm, all the
   warnings went away…

This would be nice to have, but yes: I imagine the specifics would need to
follow the rest of the resilience plans for Swift in general.


Right. Thus my concern about allowing non-public fields to be automatically
serialized. The most conservative option would be to only automatically
synthesize a Codable instance for the public members of public types.
Seems overly restrictive, so maybe anything goes for internal types, or
there's some sort of warning (overridable via an attribute?)

I want to emphasize btw that I'm enthusiastic about this proposal in
general. The support for integer keys is welcome and, as it's one of my pet
projects, eases support for a Cap'n Proto-style serialization format.[1]

-Colin

[1]: https://capnproto.org

It's likely that this could be addressed by a future proposal, as for the
time being developers can simply "not hold it wrong" ;)

Thanks,
-Colin

On Wed, Mar 15, 2017 at 6:52 PM Itai Ferber via swift-evolution <
swift-evolution at swift.org> wrote:

Hi everyone,

The following introduces a new Swift-focused archival and serialization
API as part of the Foundation framework. We’re interested in improving the
experience and safety of performing archival and serialization, and are
happy to receive community feedback on this work.

Because of the length of this proposal, the *Appendix* and *Alternatives
Considered* sections have been omitted here, but are available in the full
proposal <https://github.com/apple/swift-evolution/pull/639> on the
swift-evolution repo. The full proposal also includes an *Unabridged API*
for further consideration.

Without further ado, inlined below.

— Itai

Swift Archival & Serialization

- Proposal: SE-NNNN <https://github.com/apple/swift-evolution/pull/639>
- Author(s): Itai Ferber <https://github.com/itaiferber>, Michael LeHew
<https://github.com/mlehew>, Tony Parker <https://github.com/parkera>
- Review Manager: TBD
- Status: *Awaiting review*
- Associated PRs:
   - #8124 <https://github.com/apple/swift/pull/8124>
   - #8125 <https://github.com/apple/swift/pull/8125>



Introduction

Foundation's current archival and serialization APIs (NSCoding,
NSJSONSerialization, NSPropertyListSerialization, etc.), while fitting
for the dynamism of Objective-C, do not always map optimally into Swift.
This document lays out the design of an updated API that improves the
developer experience of performing archival and serialization in Swift.

Specifically:

- It aims to provide a solution for the archival of Swift struct and
enum types
- It aims to provide a more type-safe solution for serializing to
external formats, such as JSON and plist

Motivation

The primary motivation for this proposal is the inclusion of native Swift
enum and struct types in archival and serialization. Currently,
developers targeting Swift cannot participate in NSCoding without being
willing to abandon enum and struct types: NSCoding is an @objc protocol,
conformance to which excludes non-class types. This can be limiting in
Swift because small enums and structs can be an idiomatic approach to
model representation; developers who wish to perform archival have to
either forgo the Swift niceties that constructs like enums provide, or
provide an additional compatibility layer between their "real" types and
their archivable types.

Secondarily, we would like to refine Foundation's existing serialization
APIs (NSJSONSerialization and NSPropertyListSerialization) to better
match Swift's strong type safety. From experience, we find that the
conversion from the unstructured, untyped data of these formats into
strongly-typed data structures is a good fit for archival mechanisms,
rather than taking the less safe approach that 3rd-party JSON conversion
approaches have taken (described further in an appendix below).

We would like to offer a solution to these problems without sacrificing
ease of use or type safety.

Agenda

This proposal is the first stage of three that introduce different facets
of a whole Swift archival and serialization API:

1. This proposal describes the basis for this API, focusing on the
protocols that users adopt and interface with
2. The next stage will propose specific API for new encoders
3. The final stage will discuss how this new API will interop with
NSCoding as it is today

SE-NNNN provides stages 2 and 3.

Proposed solution

We will be introducing the following new types:

- protocol Codable: Adopted by types to opt into archival. Conformance
may be automatically derived in cases where all properties are also
Codable.

- protocol CodingKey: Adopted by types used as keys for keyed
containers, replacing String keys with semantic types. Conformance may
be automatically derived in most cases.

- protocol Encoder: Adopted by types which can take Codable values and
encode them into a native format.

- class KeyedEncodingContainer<Key : CodingKey>: Subclasses of this
type provide a concrete way to store encoded values by CodingKey.
Types adopting Encoder should provide subclasses of
KeyedEncodingContainer to vend.

- protocol SingleValueEncodingContainer: Adopted by types which
provide a concrete way to store a single encoded value. Types adopting
Encoder should provide types conforming to
SingleValueEncodingContainer to vend (but in many cases will be
able to conform to it themselves).

- protocol Decoder: Adopted by types which can take payloads in a
native format and decode Codable values out of them.

- class KeyedDecodingContainer<Key : CodingKey>: Subclasses of this
type provide a concrete way to retrieve encoded values from storage by
CodingKey. Types adopting Decoder should provide subclasses of
KeyedDecodingContainer to vend.

- protocol SingleValueDecodingContainer: Adopted by types which
provide a concrete way to retrieve a single encoded value from storage.
Types adopting Decoder should provide types conforming to
SingleValueDecodingContainer to vend (but in many cases will be
able to conform to it themselves).

For end users of this API, adoption will primarily involve the Codable
and CodingKey protocols. In order to participate in this new archival
system, developers must add Codable conformance to their types:

// If all properties are Codable, implementation is automatically derived:
public struct Location : Codable {
    public let latitude: Double
    public let longitude: Double
}

public enum Animal : Int, Codable {
    case chicken = 1
    case dog
    case turkey
    case cow
}

public struct Farm : Codable {
    public let name: String
    public let location: Location
    public let animals: [Animal]
}

With developer participation, we will offer encoders and decoders
(described in SE-NNNN, not here) that take advantage of this conformance to
offer type-safe serialization of user models:

let farm = Farm(name: "Old MacDonald's Farm",
                location: Location(latitude: 51.621648, longitude: 0.269273),
                animals: [.chicken, .dog, .cow, .turkey, .dog, .chicken, .cow, .turkey, .dog])
let payload: Data = try JSONEncoder().encode(farm)

do {
    let farm = try JSONDecoder().decode(Farm.self, from: payload)

    // Extracted as user types:
    let coordinates = "\(farm.location.latitude), \(farm.location.longitude)"
} catch {
    // Encountered error during deserialization
}

This gives developers access to their data in a type-safe manner and a
recognizable interface.

Detailed design

To support user types, we expose the Codable protocol:

/// Conformance to `Codable` indicates that a type can marshal itself into and out of an external representation.
public protocol Codable {
    /// Initializes `self` by decoding from `decoder`.
    ///
    /// - parameter decoder: The decoder to read data from.
    /// - throws: An error if reading from the decoder fails, or if read data is corrupted or otherwise invalid.
    init(from decoder: Decoder) throws

    /// Encodes `self` into the given encoder.
    ///
    /// If `self` fails to encode anything, `encoder` will encode an empty `.default` container in its place.
    ///
    /// - parameter encoder: The encoder to write data to.
    /// - throws: An error if any values are invalid for `encoder`'s format.
    func encode(to encoder: Encoder) throws
}

By adopting Codable, user types opt in to this archival system.

Structured types (i.e. types which encode as a collection of properties)
encode and decode their properties in a keyed manner. Keys may be
String-convertible or Int-convertible (or both), and user types which have
properties should declare semantic key enums which map keys to their
properties. Keys must conform to the CodingKey protocol:

/// Conformance to `CodingKey` indicates that a type can be used as a key for encoding and decoding.
public protocol CodingKey {
    /// The string to use in a named collection (e.g. a string-keyed dictionary).
    var stringValue: String? { get }

    /// Initializes `self` from a string.
    ///
    /// - parameter stringValue: The string value of the desired key.
    /// - returns: An instance of `Self` from the given string, or `nil` if the given string does not correspond to any instance of `Self`.
    init?(stringValue: String)

    /// The int to use in an indexed collection (e.g. an int-keyed dictionary).
    var intValue: Int? { get }

    /// Initializes `self` from an integer.
    ///
    /// - parameter intValue: The integer value of the desired key.
    /// - returns: An instance of `Self` from the given integer, or `nil` if the given integer does not correspond to any instance of `Self`.
    init?(intValue: Int)
}

For most types, String-convertible keys are a reasonable default; for
performance, however, Int-convertible keys are preferred, and Encoders may
choose to make use of Ints over Strings. Framework types should provide
keys which have both for flexibility and performance across different types
of Encoders. It is generally an error to provide a key which has neither
a stringValue nor an intValue.

By default, CodingKey conformance can be derived for enums which have
either String or Int backing:

enum Keys1 : CodingKey {
    case a // (stringValue: "a", intValue: nil)
    case b // (stringValue: "b", intValue: nil)
}

enum Keys2 : String, CodingKey {
    case c = "foo" // (stringValue: "foo", intValue: nil)
    case d         // (stringValue: "d", intValue: nil)
}

enum Keys3 : Int, CodingKey {
    case e = 4 // (stringValue: "e", intValue: 4)
    case f     // (stringValue: "f", intValue: 5)
    case g = 9 // (stringValue: "g", intValue: 9)
}

Coding keys which are not enums, have associated values, or have other
raw representations must implement these methods manually.
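
As a minimal sketch of both cases, the snippet below repeats the Int-backed
Keys3 enum and then writes a non-enum key by hand. It assumes the CodingKey
protocol as it later shipped in Swift 4, where stringValue is a non-optional
String rather than the String? declared in this draft, so details may
differ from the proposal text. IndexKey is a hypothetical type invented for
this example.

```swift
// Derived conformance for an Int-backed enum, as in the Keys3 example.
enum Keys3: Int, CodingKey {
    case e = 4
    case f
    case g = 9
}

// A hypothetical non-enum key, so the requirements are written by hand.
// Assumption: the shipped protocol shape (non-optional stringValue).
struct IndexKey: CodingKey {
    let stringValue: String
    let intValue: Int?

    init?(stringValue: String) {
        // Only accept strings that spell an integer index.
        guard let index = Int(stringValue) else { return nil }
        self.stringValue = stringValue
        self.intValue = index
    }

    init?(intValue: Int) {
        self.stringValue = String(intValue)
        self.intValue = intValue
    }
}
```

A key like IndexKey carries both representations, so an encoder is free to
pick whichever suits its format.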

In addition to automatic CodingKey conformance derivation for enums,
Codable conformance can be automatically derived for certain types as well:

1. Types whose properties are all either Codable or primitive get an
automatically derived String-backed CodingKeys enum mapping properties
to case names

2. Types falling into (1) and types which provide a CodingKeys enum
(directly or via a typealias) whose case names map to properties which are
all Codable get automatic derivation of init(from:) and encode(to:) using
those properties and keys. Types may choose to provide a custom
init(from:) or encode(to:) (or both); whichever they do not provide
will be automatically derived

3. Types which fall into neither (1) nor (2) will have to provide a
custom key type and provide their own init(from:) and encode(to:)

Many types will either allow for automatic derivation of all codability
(1), or provide a custom key subset and take advantage of automatic method
derivation (2).
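
A custom key subset as in case (2) might look roughly like the sketch
below. It assumes the Codable synthesis and JSONEncoder API as they later
shipped in Swift 4, which may not match this draft exactly; the Note type
and its properties are hypothetical. It also shows that the set of archived
properties need not follow access control: a private property is archived
and a public one is left out.

```swift
import Foundation

struct Note: Codable {
    var title: String
    var identifier: Int

    // Private, but archived so later versions can tell payload formats apart.
    private var version = 1

    // Visible to clients, but deliberately left out of the archive.
    // Omitted properties need a default value for decoding to be derived.
    var isSelected = false

    init(title: String, identifier: Int) {
        self.title = title
        self.identifier = identifier
    }

    // Only the properties listed here are encoded and decoded;
    // init(from:) and encode(to:) are still derived automatically.
    private enum CodingKeys: String, CodingKey {
        case title, identifier, version
    }
}

let noteData = try JSONEncoder().encode(Note(title: "Groceries", identifier: 7))

// Inspect which keys actually ended up in the payload.
let noteKeys = (try JSONSerialization.jsonObject(with: noteData) as? [String: Any] ?? [:])
    .keys.sorted()
```

Here noteKeys contains identifier, title and version, but not isSelected.
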

Encoding and Decoding

Types which are encodable encode their data into a container provided by
their Encoder:

/// An `Encoder` is a type which can encode values into a native format for external representation.
public protocol Encoder {
    /// Populates `self` with an encoding container (of `.default` type) and returns it, keyed by the given key type.
    ///
    /// - parameter type: The key type to use for the container.
    /// - returns: A new keyed encoding container.
    /// - precondition: May not be called after a previous `self.container(keyedBy:)` call of a different `EncodingContainerType`.
    /// - precondition: May not be called after a value has been encoded through a prior `self.singleValueContainer()` call.
    func container<Key : CodingKey>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key>

    /// Returns an encoding container appropriate for holding a single primitive value.
    ///
    /// - returns: A new empty single value container.
    /// - precondition: May not be called after a prior `self.container(keyedBy:)` call.
    /// - precondition: May not be called after a value has been encoded through a previous `self.singleValueContainer()` call.
    func singleValueContainer() -> SingleValueEncodingContainer

    /// The path of coding keys taken to get to this point in encoding.
    var codingKeyContext: [CodingKey] { get }
}
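
To illustrate the two container kinds side by side, here is a small sketch.
It is written against the API as it later shipped in Swift 4, where
KeyedEncodingContainer is a struct (hence `var container`) rather than the
class described in this draft, so treat it as an approximation of the
design rather than the proposal's exact interface. Celsius and Reading are
hypothetical types.

```swift
import Foundation

// Hypothetical wrapper that archives as a bare number instead of a keyed
// object, using the single value container.
struct Celsius: Codable {
    var degrees: Double

    init(degrees: Double) { self.degrees = degrees }

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        degrees = try container.decode(Double.self)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(degrees)
    }
}

// Hypothetical keyed type. encode(to:) is written by hand, while
// init(from:) is still derived from CodingKeys.
struct Reading: Codable {
    var station: String
    var temperature: Celsius

    private enum CodingKeys: String, CodingKey {
        case station, temperature
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(station, forKey: .station)
        try container.encode(temperature, forKey: .temperature)
    }
}

let reading = Reading(station: "north", temperature: Celsius(degrees: 21.5))
let readingData = try JSONEncoder().encode(reading)
let decodedReading = try JSONDecoder().decode(Reading.self, from: readingData)
```

Because Celsius uses a single value container, the temperature appears in
the payload as a plain number under its key rather than as a nested object.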

// Continuing examples from before; below is automatically generated by the compiler if no customization is needed.
public struct Location : Codable {
    private enum CodingKeys : CodingKey {
        case latitude
        case longitude
    }

    public func encode(to encoder: Encoder) throws {
        // Generic keyed encoder gives type-safe key access: cannot encode with keys of the wrong type.
        let container = encoder.container(keyedBy: CodingKeys.self)

        // The encoder is generic on the key -- free key autocompletion here.
        try container.encode(latitude, forKey: .latitude)
        try container.encode(longitude, forKey: .longitude)
    }
}

public struct Farm : Codable {
    private enum CodingKeys : CodingKey {
        case name
        case location
        case animals
    }

    public func encode(to encoder: Encoder) throws {
        let container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(name