[swift-evolution] Inconsistencies related to prepositions

Wed Aug 2 22:38:57 CDT 2017

> On Aug 2, 2017, at 10:49 AM, Xiaodi Wu via swift-evolution <swift-evolution at swift.org> wrote:
> 
> Brent had a draft proposal to revise the names of collection methods to improve the situation here. There is room for improvement.

It didn't change `remove(at:)`, though. (Well, it might have affected its return value or something, I don't remember.) It mostly addressed the `dropLast()`, `prefix(_:)` etc. calls.

To respond to the original post:

Some of the APIs you cite are not very well named, but I think your diagnosis is incorrect and so your prescription isn't going to help.

> The reason for this is, whenever a preposition is used in English, it almost always takes a dyadic form, relating a subject to the preposition's object. The two most common dyadic formats are:
> 
> <subject> [<preposition> <object of preposition>]
> <The boy> [<with> <the dog>] crossed the street. 
> 
> [<preposition> < object of preposition>] <subject>
> [<In> <space>], <no one> can hear you scream.
> [<On> <the Moon>] are <many craters>.
> 
> Now, in Objective C through Swift 1 and 2, prepositions' dyadic nature were generally respected in method signatures. However, Swift 3's migration of the preposition inside the parentheses also seems to have been accompanied by the stripping away of either the subject, the prepositional object, or both—according to no discernible pattern. For example: 
> 
> (1) CloudKit:
> 
> old: myCKDatabase.fetchRecordWithID(recordID)
> new: myCKDatabase.fetch(withRecordID: recordID)
> (subject "Record" got removed)
> 
> (2) String:
> 
> old: myString.capitalizedStringWithLocale(_: myLocale)
> new: myString.capitalized(with: myLocale)
> (subject "String" and prep. object "Locale" both got removed)
> 
> (3) Dictionary:
> 
> old: myDict.removeAtIndex(myIndex)
> new: myDict.remove(at: myIndex)
> (subject "element" already missing from both; prep. object "Index" got removed)

The thing is, the subjects and prepositional objects *are* present in all of these—they are the parameters and targets of the calls. 

In your own example, you say "In space, no one can hear you scream", not "In location space, group-of-people no one can hear you scream". So why is it a problem that we say "myString, capitalized with myLocale" instead of "myString, string capitalized with locale myLocale"? These are redundancies that we would never tolerate in natural language; I don't see why you think code should be different.

> (4) Date:
> 
> old: myDate.timeIntervalSinceDate(myDate)
> new: myDate.timeIntervalSince(date: myDate)
> (subject "timeInterval" and object "Date" both still present; but oddly, preposition "since" got left outside of the parentheses)

This is actually inaccurate—the parameter to `timeIntervalSince(_:)` is unlabeled, so it's:

		new: myDate.timeIntervalSince(myDate)

> (5) Array:
> 
> 	    old: myArray.forEach({ thing in code})
> new: myArray.forEach() { thing in //code }
>             (preposition “for” is outside of the parentheses)

Yes, because the preposition does not apply to the parameter—it applies to the operation as a whole. I'll have more to say on this in a moment.

> The inconsistency between the examples is shown in the bold text of each example, but lets go over why this matters. It matters because any language is easier to learn the more consistently it sticks to its own rules.

This is true, but you aren't just proposing sticking more closely to our existing standards—you're proposing *changing* our standards. And I don't think the changes you propose are an improvement. In fact, I'd say each of these examples is worse:

> (1) myCKDatabase.fetchRecord(withRecordID:)

"Fetch record with record ID"? I mean, you could at least remove the `Record` before `ID`. What other ID do you suppose it would be?

I *can* see the case for going back to `fetchRecord` instead of just `fetch`, though. On the other hand, I do think that, if you know it's a database, the most obvious thing for `fetch` to be fetching is a record from that database. It's not a dog—it won't be fetching a stick.

> (2) myString.stringCapitalized(withLocale:)

Let's translate this to an actual use site, which is what we care about.

	func tableView(_: UITableView, titleForHeaderInSection section: Int) -> String? {
		return sections[section].title.stringCapitalized(withLocale: .current)
	}

What is `string` contributing here? We already know it's a "title", which sounds a lot like a string. If you asked for a "capitalized" string, what else would you get back if not another string?

The locale parameter is a little more tricky. You're right that `(with: .current)` is vague, but I can imagine plenty of call sites where `with` wouldn't be:

	title.capitalized(with: german)
	title.capitalized(with: docLocale)
	title.capitalized(with: otherUser.locale)

Something at the call site needs to imply this is a locale, and there's nothing in `(with: .current)` that does so. This is arguably just a style issue, though: even though the language allows you to say `(with: .current)`, you really ought to say `(with: Locale.current)` to be clearer. Or perhaps the problem is with the name `current`—it ought to be `currentLocale` (even though that's redundant when you write it out as `Locale.currentLocale`), or it should use some location-ish terminology like `home` or `native`.

(Actually, I think there might be a new guideline there: Variable and property names should at least hint at the type of the value they contain. Names like `current` or `shared` or `default` are too vague, and should usually be paired with a word that implies the type.)

It might also help to change the `with` preposition to `in`, which would at least imply that the parameter is related to some kind of location.

	title.capitalized(in: german)
	title.capitalized(in: docLocale)
	title.capitalized(in: otherUser.locale)
	title.capitalized(in: .current)			// Still not great, but better

> (3) myDictionary.elementRemoved(atIndex:)

This naming is exactly backwards, and is a perfect example of why we *don't* want rigid consistency:

	1. It emphasizes the element being returned, while making the "Removed" operation an afterthought, even though the removal is the main thing you want to happen and the element is returned as an afterthought.

	2. It mentions the "Index", even though there is no other plausible thing that could be labeled "at". (The only other plausible parameters to a `remove` method are an element, an array of elements, a predicate, or a range of indices. Of those four, only the range of indices could possibly make sense with "at", but that ambiguity is a harmless overloading.)

Again, think about a use site:

	func tableView(_: UITableView, commit edit: UITableViewCellEditingStyle, forRowAt indexPath: IndexPath) {
		assert(edit == .delete)

		sections[indexPath.section].rows.elementRemoved(atIndex: indexPath.row)
		// vs.
		sections[indexPath.section].rows.remove(at: indexPath.row)
	}

In context, `elementRemoved` obscures the purpose of the line, and there is no ambiguity about what `at` means. The current name is *far* better.

> (4) myDate.timeInterval(sinceDate:)

I have a hard time thinking of a call site where `sinceDate` would be an improvement over `since`; the preposition already *strongly* implies a temporal aspect to the parameter.

If we remove `Date`, then in isolation I kind of like this change. The problem arrives when you step out of isolation and think about the other methods in its family. It would be strange to have `myDate.timeInterval(since: otherDate)`, but `myDate.timeIntervalSinceNow()`.

One solution would be to add `now`, `referenceDate`, and `unixEpoch` (or `utc1970`) as static properties on `Date`. Then you could have just the one `timeInterval(since:)` method:

		myDate.timeInterval(since: otherDate)
		myDate.timeInterval(since: .now)
		myDate.timeInterval(since: .referenceDate)
		myDate.timeInterval(since: .unixEpoch)

Another solution would be to add a `-` operator that takes two `Date`s and returns a `TimeInterval`, sidestepping the wording issue entirely.

> (5) myArray.each(inClosure: )   

I don't get this name at all. This operation is completely imperative and side-effectful, but there's no verb? How is it "in" the closure? Why "closure" when you can pass a function, and in fact you probably will *only* see this label if you're passing a function?

I do think there's a problem with `forEach(_:)`—it ought to be `forEach(do:)`. This is much like `DispatchQueue.async(execute:)` or the `withoutActuallyEscaping(_:do:)` function in the standard library. When you pass a function parameter, and the call's primary purpose is to run that parameter, it's often best to attach the verb to that parameter instead of putting it in the base name. Of course, the complication there is that the verb doesn't actually get written if you use trailing closure syntax, but the curlies sort of fill that role. Kind of.

> Although I do understand removing "string" from the latter was to reduce redundancy in function/method declarations, we only make one declaration, yet we make many calls. So increasing ambiguity in calls does not seem like a good trade-off for decreased boilerplate in declarations. More often than not it's calls that we're reading, not the declarations—unless of course the call was ambiguous and we had to read the declaration to make sense out of it. So perhaps we might question if increased ambiguity is an overall good thing. 

I think you misunderstand the current Guidelines' goals. The Guidelines are not trying to reduce redundancy at declaration sites—they're trying to reduce redundancy at *call sites*. The idea is that, if the variable names for the method's target and parameters imply something about the types they contain, those names along with the prepositions will imply the purpose of each parameter, and therefore the call. The types are just a more formal version of that check.

That's why the very first paragraph of the API Design Guidelines <https://swift.org/documentation/api-design-guidelines/> says:

>>> Clarity at the point of use is your most important goal. Entities such as methods and properties are declared only once but used repeatedly. Design APIs to make those uses clear and concise. When evaluating a design, reading a declaration is seldom sufficient; always examine a use case to make sure it looks clear in context.

So the tradeoff is not between *declaration redundancy* and call site clarity—it is between *call site redundancy* and call site ambiguity. Because their parameters are unlabeled, most languages have severe call site ambiguity problems. Objective-C has a pretty serious call site redundancy problem. Swift's design is trying to hit the baby bear "just right" point.

It is quite possible that, in some areas, we have swung too far back towards ambiguity. But I don't think `elementRemoved(atIndex:)` is going to fix that.

> However this removal of explicit contextual cues from the method signature harms readability, since now, the compiler will let people write code like:
> 
> { return $0.fetch(withRecordID:$1) }
> 
> Clearly, the onus is now on the developer not to use cryptic, short variable names or NO variable names. However, spend much time on GitHub or in CocoaPods and you will see an increasing number of codebases where that's exactly what they do, especially in closures.

What I think you're missing with this example—and in fact with all of your closure-based examples—is that closures don't exist in isolation; they're in some larger context. (Otherwise, they won't compile.) For instance, the above closure might be in a line like:

	return zip(databases, recordIDs)
		.map { return $0.fetch(withRecordID:$1) }

Read in the context of its line, the meanings of $0 and $1 are fairly clear.

> Another problem is that the compiler doesn't care if you write:
> 
> { ambiguousName in
> let myRecordID = ambiguousName.fetch(withRecordID:myID) 
> return myRecordID }
> 
> This is highly problematic because someone reading this code will have no reason to expect the type of "myRecordID" not to be CKRecordID. (In fact, it's CKRecord.)

Again, in the larger context, this line will end up generating a `[CKRecord]` array instead of a `[CKRecordID]` array, which is probably going to cause a type mismatch once you try to actually use the alleged record IDs. (But as I said earlier, I can see the case for using `fetchRecord(withID:)` or `fetchRecord(with:)` instead of `fetch(withRecordID:)`.)

Ambiguous names can hide bugs in their ambiguity, but verbose names can also hide bugs in the sheer mass of code they generate. The difference is, developers can manage the ambiguity in their code by naming variables well, but they can't manage verbosity if verbose names are imposed on them.

> We also have examples like: 
> 
> { return $0.draw(with:$1) }
> 
> What is $0? What is $1? This is a real Apple API, BTW.

Again, the context of the closure would tell you, but part of the problem here is that they held onto an Objective-C preposition which was poorly chosen. If the line were `$0.draw(in:$1)`, you would know that `$1` specified an area of the screen and `$0` was something that could be drawn, which frankly is all you really *need* to know to understand what this line does.

> {array, key in 
> let number = array.remove(at:key)
> return number }
> 
> This will compile and run even though number will be a tuple key-value pair, array will be a dict, and key will be an index type! This may seem like a ridiculous example, but I have literally seen things like this.

Where have you seen something like this? `array` would have to be passed `inout` for this to work at all.

Nevertheless, how would more verbose names help with this problem? This is every bit as incorrect, and the compiler will still accept it:

	{array, key in 
	let number = array.elementRemoved(atIndex:key)
	return number }

Are you thinking that they'll notice that `atIndex` is not `atKey`? There is already a much stronger safeguard against that: `Dictionary.Index` and `Dictionary.Key` are different, totally incompatible types. Every mechanism I can think of to get a `Dictionary.Index` has "index" or "indices" in its name, so you could only make this mistake if you confused dictionary indices with dictionary keys, in which case `atIndex:` would not stand out to you either.

Ultimately, unless the compiler actually understands identifiers, it's just not going to be able to catch mistakes like calling a dictionary an "array", or many of the other problems you describe here. But the type system can and often does flag these kinds of problems pretty close to the source.

> Orphaning method signatures by stripping useful return type and argument type information wouldn't be so bad if variables were all named descriptively, but that is a strangely optimistic hope for a language that's as paranoid about safety that it was specifically designed to prevent many categories of common mistakes.

Personally, I think of Swift's approach to safety as similar to Reagan's "trust, but verify". Our naming conventions trust the programmer to write code with clear names; our picky type system verifies that the code is plausible. We don't force the programmer to explain herself to us until we notice that something doesn't seem right.

The bottom line is, a language can't force you to write clear and correct code, but it has many tools it can use to encourage it. Swift chooses not to use the "heavily redundant names" tool because its cost to good code is too high. We instead rely more on other tools, like strong typing, value types, and definite initialization, to encourage high-quality code at a lower cost.

-- 
Brent Royal-Gordon
Architechies

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20170802/30217847/attachment.html>