[swift-evolution] Proposal: Add a sequence-based initializer to Dictionary
Nicola Salmoria
nicola.salmoria at gmail.com
Fri Jan 15 04:18:58 CST 2016
To handle the case of duplicate keys, why not allow to pass in a 'combine'
function?
The function could default to a preconditionFailure to be consistent with
the DictionaryLiteral behavior, but be overridden by the caller as needed.
extension Dictionary {
/// Creates a dictionary with the keys and values in the given sequence.
init<S: SequenceType where S.Generator.Element == Generator.Element>(_
sequence: S, combine: (existing: Value, other: Value) -> Value = {
preconditionFailure("Sequence contains duplicate keys"); return $1 } ) {
self.init()
for (key, value) in sequence {
if let existing = updateValue(value, forKey: key) {
updateValue(combine(existing: existing, other: value),
forKey: key)
}
}
}
}
usage examples:
let samples = [("Rome", 40.2), ("New York", 35.1), ("Rome", 42.5), ("New
York", 32.8)]
let minTemperatures = Dictionary(samples, combine: min)
// ["Rome": 40.2, "New York": 32.8]
let maxTemperatures = Dictionary(samples, combine: max)
// ["Rome": 42.5, "New York": 35.1]
let probabilities = [("a", 0.25), ("b", 0.25), ("c", 0.25), ("a", 0.25)]
let stateProbabilities = Dictionary(probabilities, combine: +)
// ["b": 0.25, "a": 0.5, "c": 0.25]
Nicola
> A brief draft is below... I had mostly written this up before I saw the
thread and Gwendal's similar contribution -- happy to hear feedback and
fold in comments/revisions!
>
> Nate
>
>
> ---
>
>
> Introduction
>
> The Dictionary type should allow initialization from a sequence of (Key,
Value) tuples.
>
> Motivation
>
> Array and Set both have initializers that create a new instance from a
sequence of elements. The Array initializer is useful for converting other
sequences and collections to the "standard" collection type, but the Set
initializer is essential for recovering set operations after performing any
functional operations on a set. For example, filtering a set produces a
collection without any kind of set operations available:
>
> let numberSet = Set(1 ... 100)
> let fivesOnly = numberSet.lazy.filter { $0 % 5 == 0 }
>
> "fivesOnly" is a LazyFilterCollection<Set<Int>> instead of a Set --
sending that back through the Set sequence initializer restores the
expected methods:
>
> let fivesOnlySet = Set(numberSet.lazy.filter { $0 % 5 == 0 })
> fivesOnlySet.isSubsetOf(numberSet) // true
>
> Dictionary, on the other hand, has no such initializer, so a similar
operation leaves no room except for building a mutable Dictionary via
iteration or functional methods with dubious performance. These techniques
also don't support type inference from the source sequence, increasing
verbosity:
>
> var viaIteration: [String: Int] = [:]
> for (key, value) in evenOnly {
> viaIteration[key] = value
> }
>
> let viaFunction: [String: Int] = evenOnly.reduce([:]) { (cumulative,
keyValue) in
> var mutableDictionary = cumulative
> mutableDictionary[keyValue.0] = keyValue.1
> return mutableDictionary
> }
>
> Proposed solution
>
> The proposed solution would add an initializer to Dictionary that accepts
any sequence of (Key, Value) tuple pairs, matching the Dictionary's element
type when treated as a sequence:
>
> init<S: SequenceType where S.Generator.Element == Generator.Element>(_
sequence: S)
>
> Instead of the techniques for recovering a Dictionary shown above, the
proposed initializer would allow a much cleaner syntax to be written:
>
> let viaProposed = Dictionary(evenOnly)
>
> Moreover, this new initializer would allow for some convenient uses that
aren't currently possible.
>
> ???? Initializing from an array of tuples:
>
> let dictFromArray = Dictionary([("a", 1), ("b", 2), ("c", 3), ("d", 4)])
>
> ???? Initializing from a DictionaryLiteral (the type, not an actual
literal):
>
> let literal: DictionaryLiteral = ["a": 1, "b": 2, "c": 3, "d": 4]
> let dictFromDL = Dictionary(literal)
>
> ?? Initializing from a pair of zipped sequences (examples abound):
>
> let letters = "abcdefghij".characters.lazy.map { String($0) }
> let dictFromZip = Dictionary(zip(letters, 1...10))
> // ["b": 2, "a": 1, "i": 9, "j": 10, "c": 3, "e": 5, "f": 6, "g": 7, "d":
4, "h": 8]
>
> Potential pitfalls
>
> One caveat is that the new initializer doesn't prevent using a sequence
with multiple identical keys. In such a case, the last key/value would
"win" and exist in the dictionary. Such an initialization is a compile-time
error with a dictionary literal, but succeeds under the new initializer:
>
> let _ = ["z": 1, "z": 2, "z": 3, "z": 4]
> // fatal error: Dictionary literal contains duplicate keys
> Dictionary([("z", 1), ("z", 2), ("z", 3), ("z", 4)])
> // ["z": 4]
>
> This behavior is particularly troublesome when used in conjunction with a
mapping operation that modifies a dictionary's keys, since dictionaries
have no particular guaranteed order:
>
> let overlapping = Dictionary(dictFromArray.lazy.map { (_, value) in ("z",
value) })
> // ["z": ???]
>
> While a pitfall, this behavior is less a symptom of the proposed API and
more an inherent problem with recovering a dictionary after modifying its
keys. The current ways of rebuilding a dictionary (as shown above) are just
as susceptible to silently dropping values. Moreover, the sequence-based
initializer for Set exhibits the same behavior, though slightly less
problematic in most cases:
>
> let dividedNumbers = Set(numberSet.map { $0 / 20 })
> // {4, 5, 2, 0, 1, 3}
>
> Given the potential lossiness of the initializer, should it use a
parameter name for the sequence? I would suggest not, to match the syntax
of Array.init(_:) and Set.init(_:), but a parameter like "collapsingKeys"
would make the risk clear to users.
>
> Detailed design
>
> The implementation is simple enough to show in the proposal:
>
> extension Dictionary {
> /// Creates a dictionary with the keys and values in the given
sequence.
> init<S: SequenceType where S.Generator.Element ==
Generator.Element>(_ sequence: S) {
> self.init()
> for (key, value) in sequence {
> updateValue(value, forKey: key)
> }
> }
> }
>
> (As part of the standard library, this could use the nativeUpdateValue
method.)
>
> Impact on existing code
>
> As a new API, this will have no impact on existing code.
>
> Alternatives considered
>
> As suggested in the thread below, a method could be added to SequenceType
that would build a dictionary. This approach seems less of a piece with the
rest of the standard library, and overly verbose when used with a
Dictionary that is only passing through filtering or mapping operations. I
don't think the current protocol extension system could handle a
passthrough case (i.e., something like "extension SequenceType where
Generator.Element == (Key, Value)").
>
> Alternately, the status quo could be maintained. Which would be sad.
>
>
>
>
> > On Jan 13, 2016, at 11:55 AM, Gwendal Roué via swift-evolution
<swift-evolution at swift.org> wrote:
> >
> > Doesn’t Swift prefer initializers?
> >
> > So let’s build a Dictionary initializer that eats any sequence of (key,
value) pairs:
> >
> > extension Dictionary {
> > init<S: SequenceType where S.Generator.Element == (Key,
Value)>(keyValueSequence s: S) {
> > self.init()
> > for (key, value) in s {
> > self[key] = value
> > }
> > }
> > }
> >
> > do {
> > // From array of (key, value) pairs
> > let input = [("foo", 1), ("bar", 2)]
> > let d = Dictionary(keyValueSequence: input)
> > print(d)
> > }
> > do {
> > // From another dictionary
> > let input = [1: "foo", 2: "bar"]
> > let d = Dictionary(keyValueSequence: input)
> > print(d)
> > }
> > do {
> > // Reverse key and values
> > let input = [1: "foo", 2: "bar"]
> > let d = Dictionary(keyValueSequence: input.map { ($1, $0) })
> > print(d)
> > }
> >
> > Gwendal
> >
> >> Le 13 janv. 2016 à 18:41, Thorsten Seitz via swift-evolution
<swift-evolution at swift.org <mailto:swift-evolution at swift.org>> a
écrit :
> >>
> >> I'd prefer "mapToDict" otherwise it sounds like a dictionary gets
mapped, at least for me.
> >>
> >> -Thorsten
> >>
> >>> Am 13.01.2016 um 17:13 schrieb Kenny Leung via swift-evolution
<swift-evolution at swift.org <mailto:swift-evolution at swift.org>>:
> >>>
> >>> This solution looks great! How do you feel about “mapDict”?
> >>>
> >>> -Kenny
> >>>
> >>>
> >>>>> On Jan 12, 2016, at 10:28 AM, Craig Cruden <ccruden at novafore.com
<mailto:ccruden at novafore.com>> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> I named the method(s) „toDict“ instead of „map“ because map
normally returns a collection which is either the same as the receiver or a
simple one.
> >>>>> The second version is more general and allows to do things like
> >>>>>
> >>>>> let dict = ["Tom", "Dick", "Harry"].enumerate().toDict { (index,
value) in (index + 1, value) }
> >>>>
> >>>> Map would probably be a more correct mathematically speaking — but
it would be inconsistent with the naming convention already chosen for
Swift. So for Swift - toDict (or toDictionary) would be the best choice.
> >>>
> >>> _______________________________________________
> >>> swift-evolution mailing list
> >>> swift-evolution at swift.org <mailto:swift-evolution at swift.org>
> >>> https://lists.swift.org/mailman/listinfo/swift-evolution
> >> _______________________________________________
> >> swift-evolution mailing list
> >> swift-evolution at swift.org <mailto:swift-evolution at swift.org>
> >> https://lists.swift.org/mailman/listinfo/swift-evolution
> >
> >
> > _______________________________________________
> > swift-evolution mailing list
> > swift-evolution at swift.org
> > https://lists.swift.org/mailman/listinfo/swift-evolution
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160115/f003880c/attachment.html>
More information about the swift-evolution
mailing list