[swift-evolution] [Pitch] Add `mapValues` method to Dictionary
Honza Dvorsky
jan.dvorsky at me.com
Sat May 21 09:47:32 CDT 2016
While I agree that it'd be nice to add a Map abstraction into which we
could move a lot of the Dictionary-ness, my original pitch is *just* about
adding the specific implementation of `mapValues` in its regular, non-lazy
form. My example was about keeping only a subset of the information in
memory in a Dictionary to allow for quick and frequent access (a lazy
version goes against that). I think it'd be better to get that in first, or
at least evaluate it separately from a comprehensive refactoring of
Dictionary, which would just accumulate more opinions and slow this
specific step down.
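
To make the eager/lazy distinction concrete, here is a rough sketch rather than a proposed design. `LazyMappedValues` and `compareEagerAndLazy` are hypothetical names invented for illustration, the repository sizes are made-up numbers, and the eager half assumes a toolchain where Dictionary already has a `mapValues` of the shape pitched here (the standard library has shipped one since Swift 4).

```swift
// Hypothetical lazy view: keeps the whole base dictionary plus the transform,
// and applies the transform on every lookup instead of storing results.
struct LazyMappedValues<Key: Hashable, Base, Value> {
    let base: [Key: Base]
    let transform: (Base) -> Value

    subscript(key: Key) -> Value? {
        guard let value = base[key] else { return nil }
        return transform(value)
    }
}

// Returns the eagerly built dictionary and how often the lazy transform ran.
func compareEagerAndLazy() -> (eager: [String: Int], lazyCalls: Int) {
    // Made-up data standing in for the fetched repository info.
    let repos = ["/apple/swift": ["size": 400, "stars": 30],
                 "/apple/llvm": ["size": 900, "stars": 10]]

    // Eager: the transform runs once per entry; only the sizes need to be kept.
    let eager = repos.mapValues { $0["size"] ?? 0 }

    // Lazy: nothing runs up front, but each access re-runs the transform.
    var lazyCalls = 0
    let lazy = LazyMappedValues(base: repos) { (info: [String: Int]) -> Int in
        lazyCalls += 1
        return info["size"] ?? 0
    }
    _ = lazy["/apple/swift"]
    _ = lazy["/apple/swift"]   // second lookup of the same key transforms again

    return (eager, lazyCalls)
}
```

The lazy view has to hold on to the full `repos` dictionary and redoes the work on each access, which is the opposite of what the size-only use case wants.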
If either of you has specific ideas about the potential Map protocol, I
encourage you to start a separate thread for that, to focus the
conversation on the parameters of what it would look like.
I guess what I'm asking now is: would you support a proposal for adding the
basic mapValues function as the first step, leaving room to extend it later
to a Map protocol that allows for a lazy version? I'd like to keep the
proposal as focused as possible to increase the chance of an on-point
discussion.
Thanks,
Honza
On Sat, May 21, 2016 at 3:27 PM Matthew Johnson <matthew at anandabits.com>
wrote:
>
> > On May 21, 2016, at 8:45 AM, Haravikk via swift-evolution <
> swift-evolution at swift.org> wrote:
> >
> > I think that before this can be done there needs to be an abstraction of
> what a Dictionary is, for example a Map<Key, Value> protocol. This would
> allow us to also implement the important lazy variations of what you
> suggest, which would likely be more important for very large dictionaries
> as dictionaries are rarely consumed in their entirety; in other words,
> calculating and storing the transformed value for every key/value pair is
> quite a performance overhead if only a fraction of those keys may actually
> be accessed. Even if you are consuming the whole transformed dictionary, the
> lazy version is better since it doesn’t store any intermediate values; you
> only really want a fully transformed dictionary if you know that the
> transformation is very costly or that the transformed values will be
> accessed frequently.
> >
> > Anyway, long way of saying that while the specific implementation is
> definitely wanted, the complete solution requires a few extra steps which
> should be done too, as lazy computation can have big performance benefits.
> >
> > That and it’d be nice to have a Map protocol in stdlib for defining
> other map types, such as trees, since these don’t require Hashable keys,
> but dictionaries do.
>
> +1 to defining map abstractions in the standard library (separating read
> only from read write). The value associatedtype should not take a position
> on optionality, allowing for maps which have a valid value for all possible
> keys. I have done similar things in other languages and found it extremely
> useful. It is not uncommon to have code that just needs to read from and/or
> write to a map without concern for the implementation of the
> map.
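
One way the read-only / read-write split could look, as a sketch only. Every name below (`ReadableMap`, `WritableMap`, `DictionaryMap`, `DefaultingMap`, `lookUp`) is hypothetical; nothing like this exists in the standard library. The point is that `Value` is left open, so a partial map can pick an Optional and a total map can pick a plain value.

```swift
// Hypothetical protocols; the names are invented for this sketch.
protocol ReadableMap {
    associatedtype Key
    associatedtype Value   // may or may not be Optional; the protocol does not decide
    subscript(key: Key) -> Value { get }
}

protocol WritableMap: ReadableMap {
    subscript(key: Key) -> Value { get set }
}

// A partial map: wraps a Dictionary, so Value is Optional.
struct DictionaryMap<K: Hashable, V>: WritableMap {
    var storage: [K: V] = [:]
    subscript(key: K) -> V? {
        get { return storage[key] }
        set { storage[key] = newValue }
    }
}

// A total map: every possible key has a value, so Value is non-optional.
struct DefaultingMap<K: Hashable, V>: WritableMap {
    var storage: [K: V] = [:]
    let defaultValue: V
    subscript(key: K) -> V {
        get { return storage[key] ?? defaultValue }
        set { storage[key] = newValue }
    }
}

// Generic code that only reads, without caring about the implementation.
func lookUp<M: ReadableMap>(_ keys: [M.Key], in map: M) -> [M.Value] {
    return keys.map { map[$0] }
}
```

With these definitions, `lookUp` returns `[V?]` for a `DictionaryMap` and `[V]` for a `DefaultingMap`, so the caller sees optionality only where the map is actually partial.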
>
> One issue I think we should sort out alongside this is some kind of
> abstraction which allows code to use functions or user-defined types
> without regard for which it is accessing. The map abstraction would build
> on this abstraction, allowing single argument functions to be viewed as a
> read only map.
>
> One option is to allow functions to conform to protocols that only have
> subscript { get } requirements (we would probably only allow them to be
> subscripted through the protocol interface). I think this feels like the
> most Swifty direction.
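
Functions cannot conform to protocols in Swift today, so the closest approximation of this first option is a small wrapper. The sketch below assumes a hypothetical `ReadOnlyMap` protocol with a single get-only subscript; `FunctionMap`, `Squares`, and `values(of:for:)` are likewise invented for illustration.

```swift
// Hypothetical protocol standing in for the read-only map abstraction.
protocol ReadOnlyMap {
    associatedtype Key
    associatedtype Value
    subscript(key: Key) -> Value { get }
}

// Wrapper that lets a single-argument function be read through a subscript.
struct FunctionMap<Key, Value>: ReadOnlyMap {
    let function: (Key) -> Value
    subscript(key: Key) -> Value { return function(key) }
}

// A user-defined type conforming to the same protocol.
struct Squares: ReadOnlyMap {
    subscript(key: Int) -> Int { return key * key }
}

// Code written against the protocol does not care which one it is reading.
func values<M: ReadOnlyMap>(of map: M, for keys: [M.Key]) -> [M.Value] {
    return keys.map { map[$0] }
}

let doubled = FunctionMap(function: { (n: Int) in n * 2 })
```

If functions could conform directly, the `FunctionMap` wrapper would disappear and a closure could be passed to `values(of:for:)` as-is.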
>
> Another option is to take the path I have seen in several languages which
> is to allow overloading of the function call "operator". I originally
> wanted this in Swift but now wonder if the first option might be a better
> way to accomplish the same goals.
>
> -Matthew
>
> >
> >> On 21 May 2016, at 11:27, Honza Dvorsky via swift-evolution <
> swift-evolution at swift.org> wrote:
> >>
> >> Hello everyone,
> >>
> >> I have added a very simple but powerful method to a Dictionary
> extension in multiple projects over the last few weeks, so I'd like to bring up
> the idea of adding it into the standard library, in case other people can
> see its benefits as well.
> >>
> >> Currently, Dictionary conforms to Collection with its Element being the
> tuple of Key and Value. Thus transforming the Dictionary with regular map
> results in [T], whereas I'd find it more useful to also have a method which
> results in [Key:T].
> >>
> >> Let me present an example of where this makes sense.
> >>
> >> I recently used the GitHub API to crawl some information about
> repositories. I started with just names (e.g. "/apple/swift",
> "/apple/llvm") and fetched a JSON response for each of the repos, each
> returning a dictionary, which got saved into one large dictionary at the
> end of the full operation, keyed by its name, so the structure was
> something like
> >>
> >> {
> >>   "/apple/swift": { "url":..., "size":..., "homepage":... },
> >>   "/apple/llvm": { "url":..., "size":..., "homepage":... },
> >>   ...
> >> }
> >>
> >> To perform analysis, I just needed a dictionary mapping the name of the
> repository to its size, freeing me to discard the rest of the results.
> >> This is where things get interesting, because you can't keep this
> action nicely functional anymore. I had to do the following:
> >>
> >> let repos: [String: JSON] = ...
> >> var sizes: [String: Int] = [:]
> >> for (key, value) in repos {
> >>     sizes[key] = value["size"].int
> >> }
> >> // use sizes...
> >>
> >> Which isn't a huge amount of work, but it creates unnecessary mutable
> state in your transformation pipeline (and your current scope). And I had
> to write it enough times to justify bringing it up on this list.
> >>
> >> I suggest we add the following method to Dictionary:
> >>
> >> extension Dictionary {
> >>     public func mapValues<T>(_ transform: @noescape (Value) throws -> T) rethrows -> [Key: T] {
> >>         var transformed: [Key: T] = [:]
> >>         for (key, value) in self {
> >>             transformed[key] = try transform(value)
> >>         }
> >>         return transformed
> >>     }
> >> }
> >>
> >> It is modeled after Collection's `map` function, with the difference
> that
> >> a) only values are transformed, instead of the Key,Value tuple and
> >> b) the returned structure is a transformed Dictionary [Key:T], instead
> of [T]
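
The two differences listed above can be seen side by side in a small example. This is only an illustration: the sizes and the "KB" unit are made up, and it assumes a toolchain where Dictionary already provides `mapValues` (the standard library has included one since Swift 4).

```swift
// Illustrative numbers, not real repository sizes.
let sizes = ["/apple/swift": 400, "/apple/llvm": 900]

// Collection's map sees each (key, value) pair and returns an Array.
let viaMap: [String] = sizes.map { "\($0.key): \($0.value) KB" }

// mapValues sees only the value and keeps the keys, returning a Dictionary.
let viaMapValues: [String: String] = sizes.mapValues { "\($0) KB" }
```

After `map` the key lookup is gone and the array order is unspecified, whereas the `mapValues` result can still be indexed by repository name.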
> >>
> >> This now allows a much nicer workflow:
> >>
> >> let repos: [String: JSON] = ...
> >> var sizes = repos.mapValues { $0["size"].int }
> >> // use sizes...
> >>
> >> and even multi-step transformations on Dictionaries, previously only
> possible on Arrays, e.g.
> >> var descriptionTextLengths = repos.mapValues { $0["description"].string }
> >>     .mapValues { $0.characters.count }
> >>
> >> You get the idea.
> >>
> >> What do you think? I welcome all feedback; I'd like to see whether people
> would support it before I write a proper proposal.
> >>
> >> Thanks! :)
> >> Honza Dvorsky
> >>
> >> _______________________________________________
> >> swift-evolution mailing list
> >> swift-evolution at swift.org
> >> https://lists.swift.org/mailman/listinfo/swift-evolution