[swift-evolution] [SHORT Review] SE-0132: Rationalizing Sequence end-operation names

Tue Jul 26 10:15:06 CDT 2016

on Tue Jul 26 2016, Brent Royal-Gordon <brent-AT-architechies.com> wrote:

>> On Jul 25, 2016, at 6:35 PM, Dave Abrahams via swift-evolution
>> <swift-evolution at swift.org> wrote:
>> 
>> First, though, I have to apologize for those wide tables, since I'm
>> listed as a co-author (because of a small design contribution).  The
>> only way I've been able to read them is by looking at the markdown
>> source, so that's how I'm going to quote it here.
>
> Sorry about that. I felt that the systematic way the names were
> arranged couldn't be conveyed in any other way, but the resulting
> formatting is regrettably difficult to read.
>
>> Unfortunately there's a semantic difference that I hadn't noticed
>> before: the mutating “remove” operations have a precondition that there
>> be at least as many elements as are being removed.  “Drop,” like “pop,”
>> is forgiving of such overruns.  I think this is solvable; my suggestion
>> is below
>
> Wow, there's a lot going on in this next section.

Yeah, my turn to say sorry.

> To preface this discussion, I'll merely point out that, in writing
> this proposal, I tried to narrowly focus on renaming. My inability to
> rename `prefix(upTo/through:)` and `suffix(from:)` in a way that made
> sense led me to a more aggressive redesign, but for the other calls I
> considered larger redesigns out of scope. I say this not as an
> argument against redesigning the `remove` calls, but merely to explain
> why I didn't fiddle with their semantics already.

Makes sense.

>
>
> To address things one at a time:
>
>> My suggestion would be to make the remove()
>> operations more forgiving:
>> 
>>   rename popFirst() to removeFirst()
>>   rename popLast() to removeLast()
>
> I actually quite agree that `removeFirst/Last()` and `popFirst/Last()`
> are redundant. I think I briefly touched on that in "Future
> Directions".

Missed that.  Your proposal is a bit long, so it's hard to take
everything in.  Here's what you say:

   removeFirst/Last() and popFirst/Last() are very nearly redundant;
   their only difference is that the remove methods have a non-Optional
   return type and require the collection not be empty, while the pop
   methods have an Optional return type and return nil if it's empty.

   These operations could be merged, with the remove operations taking
   on the preconditions of the current pop operations; 

The current pop operations *have* no preconditions.  By “taking on the
preconditions of the current pop operations” do you mean “dropping their
preconditions?”

   additionally, removePrefix(_:) and removeSuffix(_:) could drop their
   equivalent preconditions requiring that the elements being removed
   exist.

IIUC you are suggesting making the remove operations uniformly
“forgiving,” yes?  I think I'm in favor.

> Going out of order for a moment:
>
>> We could of course just make
>> removePrefix(n) and removeSuffix(n) forgiving,
>
> If we're going to go beyond renaming, this is the approach I favor,
> because it is the most straightforward and readable. When you see a
> piece of code that says `removePrefix(n)`, it doesn't take much effort
> to figure out that it's removing a series of elements from the
> beginning of the sequence.

Of course not, that's entirely captured by “removePrefix.”  It's the
clarity of the meaning of “n” that concerns me.

> Specifically, if we go this route, I would:
>
> * Remove the `popFirst/Last` methods.
> * Remove the preconditions on removeFirst/Last/Prefix/Suffix requiring
> the collection to not be smaller than the number of elements to be
> removed.
> * Change the signatures to something like:
>
> 	@discardableResult
> 	func removeFirst() -> Iterator.Element?
> 	@discardableResult
> 	func removeLast() -> Iterator.Element?

This part is equivalent to renaming popFirst/ popLast to
removeFirst/removeLast, right?

> 	
> 	@discardableResult func removePrefix(_ n: Int) ->
> 	[Iterator.Element] // or SubSequence 
>       @discardableResult func
> 	removeSuffix(_ n: Int) -> [Iterator.Element] // or SubSequence
>
> I've added return types to `removePrefix/Suffix` so that they follow
> our general rule about not throwing away information that isn't
> necessarily easy to compute, but that part of the change isn't
> strictly necessary.

Yeah, I don't think we can do those without loss of efficiency; the
return value would prevent removals from being done in-place in CoW
structures.

>> but I have long believed
>> that the “prefix/suffix” methods should go one of two ways:
>> 
>> a. they get a label that clarifies the meaning of the argument, e.g.
>> 
>>   x.removePrefix(ofMaxLength: n)
>>   x.removeSuffix(ofMaxLength: n)
>
> I struggled for a while to find a good label for the count parameters,
> but couldn't find a good convention.
>
> The fundamental problem is that English counting grammar doesn't line
> up with Swift syntax. In English, you would like to say "prefix of 3
> elements", but you can't put a label after a parameter in Swift. Thus,
> there may not be an idiomatic way to label these parameters. (This is
> a problem throughout the standard library, actually; I ran into the
> same issues when I was looking at the UnsafeRawPointer proposal.)

I don't see how “a prefix of maximum length 3” fails to be idiomatic.

> Labels like `ofMaxLength` are verbose, 

Yes.

> and the use of "length" in particular isn't great when the standard
> library otherwise uses "count". 

The name `count` causes all kinds of problems, so I view conflicts with
it as inevitable.

> If you put a gun to my head and demanded I add a label, I would make
> it `x.removePrefix(ofUpTo: n)` 

That one is easily misinterpreted as removing the longest prefix that
doesn't include the value `n`.

> or just `x.removePrefix(of: n)`. 

That one is easily misinterpreted as removing the first element if it is
equal to `n`.

> But I'm not convinced any of these bring enough to the table.

I agree that they don't clarify enough (or anything in the case of “of”).

> When you see an unlabeled Int parameter to a call with a name like
> `removePrefix`, there's really only a couple of things it could mean,
> and the count is an unsurprising one.
>
> On the other hand, the internal parameter name should probably not be
> `n`; it should be something like `count` or even `countToRemove`.

The parameter name should first serve the documentation comment.  IMO
`count` is worse than `n` in that regard and `countToRemove` makes the
comment read awkwardly.  But this is really tangential at best.

> (Probably obvious, but worth mentioning explicitly: if we add a label,
> we should apply it consistently to all `prefix` and `suffix` calls
> which take a count.)

Obvious, yes.

>
>
> Okay, back to the right order:
>
>>   kill removeFirst(n)
>>   kill removeLast(n)
>> 
>> The “forgiving” forms of x.removeFirst(n) and x.removeLast(n) can be
>> expressed as:
>> 
>>   let i = x.index(x.startIndex, offsetBy: n, limitedBy: x.endIndex)
>>   x.removeSubrange(..<i)
>> 
>>   let i = x.index(x.endIndexIndex, offsetBy: -n, limitedBy: x.startIndex)
>>   x.removeSubrange(i..<)
>> 
>> I realize that's quite verbose.
>
> No kidding. It requires:
>
> * Two statements, four references to the collection, and 88 characters
> (for an example with one-letter variable names)
> * The use of a complex index-manipulation function
> * The simultaneous use of the startIndex, endIndex, and a sign on the
> count, all of which must be coordinated
>
> I think that, if `removePrefix/Suffix(_:)` were not in the standard
> library, people would be forced to invent them.

They might, but probably not in that form.  I'm not convinced people
need them to be forgiving.  They just have to be forgiving if we name
them the same as the renamed popXXX methods.  But that doesn't change
your point substantially.

>
>
>> b. they are given a recognizable domain-specific notation such as:
>> 
>>   x.removeSubrange($+n..<)
>>   x.removeSubrange(..<$-n)
>
> Does $ represent the start, the end, or either one depending on which
> side of the range we're on? Because if it's the third option, I think
> these two operations are actually inverted: the first is removing
> everything *except* the `prefix(n)`, and the second is removing
> everything except the `suffix(n)`.

Wow, that was impressive!  With one stroke, you have just convinced me
that we can't do this.  The fact that I got it wrong, along with other
excellent feedback in this thread, kills my interest in using $ in this
way.

> (Or maybe it's based on the sign, with a negative offset going from
> the endIndex, and a positive offset going from the startIndex? Don't
> we usually try to avoid that kind of runtime branching?)
>
>>  That would admittedly leave single-pass Sequences without an API for
>>  dropping the first N elements. I am inclined to think that interface
>>  should be moved to Iterator.
>
> Why would it be a good thing to use two different syntaxes for the
> same operation?

It wouldn't.  I was just trying to make the world fit my preconceived
idea of the right solution ;-)

>>  The introduction of such notation raises the question of whether we
>>  need unary range operators, and could instead go with
>> 
>>     x[i..<$] and x[$..<i]
>> 
>>  which is after all only one character longer than
>> 
>>     x[i..<] and x[..<i]
>> 
>>  and stays out of the territory of the prefix/suffix “pack/unpack”
>>  operators that are likely to be used for generic variadics.
> …
>> I believe the `$+n..<i` idea is still implementable with these basic
>> types, just with an enum instead of optionals.  I'll take a shot at it
>> tonight if I can get a few minutes.
>
> Xcode 8 beta 3-compatible hack based on my assumptions about how this
> supposed to work, suitable for at least testing the ergonomics:
> https://gist.github.com/brentdax/3c5c64d3b7ca3ff6b68f1c86163c39c4
>
> At a technical level, I think the biggest problem is the `$`
> symbol. If it's a constant or variable, then it can't be generic, and
> we need to fix a specific SignedInteger type for the offset. That
> means we need to convert to whatever IndexDistance ends up being. That
> could be fixed if either `$` became a magic syntax or we get generic
> constants, but either of those will be a significant effort. Or it can
> be a function, but then it's less convenient.
>
> Stepping back from implementation, though, I think the `$` syntax is
> just too opaque. It gets better when it's spaced out:
>
> 	array[$ + 2 ..< $ - 1]
>
> But it's still not something whose meaning you will even make a wild
> guess at without looking it up. Even once you learn it, the $ symbol
> will almost always be squeezed between two other punctuation
> characters, making visual parsing more difficult. The effect is not
> unlike stereotypes of Perl.
>
> (Note: I'm an old Perl hand, so others might better read that sentence
> as "The effect is not unlike Perl.")

OK, now you're just driving the dagger in deeper.  My poor idea is dead already!
>
> I mean, look at this example you gave earlier:
>
>>   It also implies we can replace
>> 
>>     x.removingPrefix(n)
>>     x.removingSuffix(n)
>> 
>>   with
>> 
>>     x[$+n..<]
>>     x[..<$-n]
>> 
>>  for Collections.  
>
> The first version is clear as crystal; the second is clear as mud. The
> first version gives the parameter a position of prominence; the second
> buries it in the middle of a complex expression.
>
> Now, abandoning the `$` could help here. I've worked up an alternate
> design which uses descriptive names at
> <https://gist.github.com/brentdax/0946a99528f6e6500d93bbf67684c0b3>. The
> example I gave above is instead written:
>
> 	array[.startIndex + 2 ..< .endIndex - 1]
>
> Or, for the removingPrefix equivalent:
>
> 	x[.startIndex + n ..< .endIndex]
>
> But this is longer than a removingPrefix call and *still* buries the
> intent. And adding prefix/suffix operators back in doesn't really
> help; it merely allows you to omit the straightforward bits, without
> doing anything to help make the non-straightforward bits more
> clear. (And in fact, it makes those worse by forcing you to either
> compress the whitespace in them or parenthesize.)
>
> And even if we get past the syntax issues, there's an attractive
> nuisance problem. This feature, whatever it is, will undoubtedly
> circulate as "the way to make String indexes work". Is that something
> we want to bring to the language?
>
> In short, I think the idea of relative ranges is pretty neat. It's
> brief, it's parsimonious, it's flexible, it's clever. But it makes
> code less clear and it hides performance problems. I think either of
> those two factors by itself ought to make us rethink a Swift proposal.

Sold.

Thanks for indulging my unworkable fantasy,

-- 
Dave