[swift-evolution] Implicit truncation

Xiaodi Wu xiaodi.wu at gmail.com
Tue May 23 00:44:33 CDT 2017


On Mon, May 22, 2017 at 5:21 PM, Haravikk <swift-evolution at haravikk.me>
wrote:

>
> On 22 May 2017, at 21:16, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>
>
> On Mon, May 22, 2017 at 10:39 Haravikk <swift-evolution at haravikk.me>
> wrote:
>
>> On 22 May 2017, at 15:51, Xiaodi Wu <xiaodi.wu at gmail.com> wrote:
>>
>> If we're to speak of intuition for new developers who've never used a
>> programming language, who are falling back to what they know about
>> mathematics, then quite literally a decimal point _is_ about division by
>> ten.
>>
>>
>> I don't think this necessarily follows; the issue here is that the
>> constructor isn't explicit enough that it is simply lopping off the
>> fractional part. My own experience of maths as taught in school, to go from
>> a decimal to an integer I would expect to round,
>>
>
> By the same reasoning, you would also expect 3 / 4 in integer math to give
> you 1. With integer division, however, 3 / 4 == 0. By definition, the
> decimal point separates an integer from a fractional part, so the behaviors
> are inextricably linked. To test this out in practice, I asked the first
> person with no programming experience whom I encountered today.
>
> I said: "Let me teach you one fact about integers in a programming
> language. When two integers are divided, the integer result has the
> fractional part discarded; for example, 3/4 computes to 0. What would you
> expect to be the result of converting 0.75 to an integer?"
>
> He answered immediately: "I would have expected that 3/4 gives you 1, but
> since 3/4 gives you 0, I'd expect 0.75 to convert to 0."
>
>
> These are two different cases; Int(3) / Int(4) is a division of integer
> values with an integer result, so there's no intermediate floating point
> value that needs to be coerced back into an Int. The issue here is
> conversion of a Float/Double to an Integer, which is a different operation.
>
> 3.0 / 4.0 = 0.75 is a property of Float
> 3 / 4 = 0 is a property of Int
>
> What's under discussion here is conversion between the two; they're not
> really comparable cases.
>

We're not disagreeing here. Or at least, I'm not disagreeing with you. Of
course, integer division is not the same operation as conversion of Float
to Int. However, division and decimals are related concepts, obviously. I
don't think you dispute that.

How integer types behave under division and how they behave when converting
from a decimal value are therefore inevitably related. And as my
non-programming colleague demonstrated today, new users will use their
knowledge regarding the behavior of one operation to shape expectations
regarding the behavior of the other.
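
A minimal sketch of the two operations being compared, written against the
standard library as I understand it; the constant names are hypothetical and
chosen only for illustration. Both operations discard the fractional part in
the same direction, toward zero:

```swift
// Integer division discards the fractional part of the quotient.
let quotient = 3 / 4                 // Int, value 0

// Floating-point division keeps it.
let ratio = 3.0 / 4.0                // Double, value 0.75

// Converting the Double to Int discards the fractional part as well.
let converted = Int(ratio)           // 0

// Both operations round toward zero, for negative values too.
let negativeQuotient = -3 / 4        // 0
let negativeConverted = Int(-ratio)  // 0

print(quotient, ratio, converted, negativeQuotient, negativeConverted)
```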


>
> Discarding the fractional part of a floating point value is a bit pattern
> operation only in the sense that any operation on any data is a bit pattern
> operation. It is clearly not, however, an operation truncating a bit
> pattern.
>
>> as the conversion is simplistically taking the significand, dropping it
>> into an Int then shifting by the exponent.
>>
>
> That's not correct. If you shift the significand or the significand bit
> pattern of pi by its exponent, you don't get 3.
>
>
> I think you're misunderstanding me. If you think of it in base 10 terms
> 1.2345 is equivalent to 12345 x 10^-4; when you convert that to an Int it
> effectively becomes 12345 shifted four places to the right, leaving you
> with 1. In that sense it's a truncation of the bit-pattern as you're
> chopping part of it off, or at the very least are manipulating it.
>

No. It is of course true that the most significant bit of the binary
representation of 12345 is 1. However:

12345 >> 1 == 6172
12345 >> 2 == 3086
12345 >> 3 == 1543
12345 >> 4 == 771
12345 >> 5 == 385
etc.

That is to say, you can't get 1234, 123, or 12 from truncating the bit
pattern of 12345. There is no sense in which converting 1234.5, 123.45, or
12.345 to an integer involves truncating the bit pattern of 12345. You are
not in that case performing repeated integer division by 2 (which is what a
right shift does), but rather repeated integer division by 10, which goes
back to how decimals and division are inextricably related.
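
The same contrast as a short sketch; the constant names are hypothetical.
Shifting right divides by a power of two, while dropping decimal digits
divides by a power of ten, and only the latter agrees with the conversion
from a floating-point value:

```swift
let digits = 12345

// A right shift by n bits is integer division by 2 to the nth power.
let shifted = digits >> 4              // 771
let dividedBy16 = digits / 16          // 771, the same result

// Dropping four decimal digits is integer division by 10 to the 4th power.
let dividedBy10000 = digits / 10_000   // 1

// Converting 1.2345 to Int agrees with the division by ten thousand,
// not with the shift.
let value = 1.2345
let converted = Int(value)             // 1

print(shifted, dividedBy16, dividedBy10000, converted)
```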


>
> Regardless it's also very literally a truncation since you're specifically
> truncating any fraction part, it's simply the most correct term to use;
> frankly I find restricting that to bit-pattern truncation to be entirely
> arbitrary and unhelpful.
>

It is important and not at all arbitrary. Binary integers model two things
simultaneously: an integral value and a sequence of bits. Much care was
taken during the design and review of the revised integer protocols to
make sure that the names of operations that view integers as integral
values are distinguished from those that view integers as sequences of
bits. It was accepted that "truncating" and "extending" would be applied to
operations on bit patterns only, which is why it was OK to shorten the
label from `truncatingBitPattern` to `truncating` (later renamed
`extendingOrTruncating`, for other fairly obvious reasons). By analogy, the
expected value for a hypothetical `Int32(truncating: 42.0 as Double)` would
be 252867936, which is of questionable usefulness. It would, however, be
confusing and unhelpful to describe an operation on the represented real or
integral value with the same word that is now used only for a very
different operation on a sequence of bits.
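
A small sketch of the two views of an integer. It assumes a toolchain in
which the bit-pattern initializer is spelled `init(truncatingIfNeeded:)`,
which as far as I know is the spelling that shipped after
`extendingOrTruncating`; the constant names are hypothetical:

```swift
let wide: Int = 300                  // binary 1_0010_1100

// Bit-pattern view: keep only the low 8 bits, 0010_1100.
let lowBits = Int8(truncatingIfNeeded: wide)   // 44

// Value view: 300 is not representable in Int8, so the conversion
// either fails or clamps to the nearest representable value.
let exact = Int8(exactly: wide)                // nil
let clamped = Int8(clamping: wide)             // 127

// Value view of a floating-point number: discard the fractional part.
let fraction = 42.9
let fromDouble = Int32(fraction)               // 42

print(lowBits, exact as Any, clamped, fromDouble)
```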

> The types involved should make it clear whether the value is being made
> narrower or not. Int64 -> Int32 is a form of truncation, but so too is Float
> -> Int; in both cases the target can't represent all values of the source,
> so something will be lost.
>
>>> init(rounding: Float, _ strategy: FloatingPointRoundingRule) { … }
>>>
>>
>> Again, here, as an addition to the API, this fails the six criteria of
>> Ben Cohen, as it is strictly duplicative of `T(value.rounded(strategy))`.
>>
>>
>> Maybe, but init(rounding:) is explicit that something is being done to
>> the value, at which point there's no obvious harm in clarifying what (or
>> allowing full freedom). While avoiding redundancy is good as a general
>> rule, it doesn't mean there can't be any at all if there's some benefit to
>> it; in this case clarity of exactly what kind of rounding is taking place
>> to the Float/Double value.
>>
>
> The bar for adding new API to the standard library is *far* higher than
> "some benefit"; `Int(value.rounded(.up))` is the approved spelling for
> which you are proposing a second spelling that does the same thing.
>
>>
> The main benefit is that the constructor I proposed would actually require
> the developer to do this, whereas what you're showing is entirely optional;
> i.e. any value can be passed without consideration of the rounding that is
> occurring, or that it may not be as desired. With a label the constructor
> at least would remind the developer that rounding is occurring (i.e. the
> value may not be as passed).
>

I do not think any users expect 0.75 to be represented exactly as an
integer; that's not at issue here. The question is, which users see
`Int(0.75)` and think, "this must mean that 0.75 is rounded up to 1"? My
answer, from prior experiences teaching beginners, is that the subset of
users who make this mistake (or a similar one in other languages) largely
overlaps the subset of users who see `3 / 4` and think "this must evaluate
to 1."


> Going further and requiring them to provide a rounding strategy would also
> force them to consider what method of rounding should actually be used,
> eliminating any confusion entirely. What you're demonstrating there does
> not provide any of these protections against mistakes, as you can omit the
> rounding operation without any warning, and end up with a value you didn't
> expect.
>

It is, of course, true that a user who does not read the documentation and
expects a function that does A instead to do B will use that function
incorrectly. The question is whether it is reasonably common for users to
make the incorrect assumption, and whether such incorrectness ought to be
accommodated by a breaking change to the language. Here I am arguing that
users who are aware of integer division are unlikely to make the incorrect
assumption; they will at minimum look up what the actual behavior is, and
based on my teaching experience and today's mini-experiment, they are
likely actually to expect the existing behavior. (And, of course, I am
arguing that users who are unaware of integer division have a much more
serious gap in knowledge that is the primary issue, not fixable by tweaking
the name of integer initializers.)
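
For reference, a minimal sketch of the existing conversion next to the
existing explicit-rounding spelling, using `rounded(_:)` and
`FloatingPointRoundingRule` from the standard library; the constant names
are hypothetical:

```swift
let x = 0.75

// The existing conversion discards the fractional part.
let truncated = Int(x)                           // 0

// Rounding is requested explicitly before converting.
let nearest = Int(x.rounded())                   // 1, ties go away from zero
let up = Int(x.rounded(.up))                     // 1
let down = Int(x.rounded(.down))                 // 0

// A tie shows where the rounding rules differ from one another.
let half = 2.5
let awayFromZero = Int(half.rounded())                 // 3
let toEven = Int(half.rounded(.toNearestOrEven))       // 2

print(truncated, nearest, up, down, awayFromZero, toEven)
```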

> A secondary benefit is that any rounding that does take place can do so
> within the integer type itself, potentially eliminating a Float to Float
> rounding followed by truncation; i.e. since rounding towards zero is the
> same as truncation it can optimise away entirely.
>

Renaming the `Int.init(_: Float)` initializer, by definition, cannot
recover any performance benefits. I'm not aware of any optimizations of
other rounding modes that can make such an initializer faster than what is
currently possible.
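
A short sketch checking the one equivalence mentioned in the quoted text,
that rounding toward zero and the plain conversion give the same result. It
demonstrates equality of results only and makes no claim about performance;
the sample values and names are hypothetical:

```swift
let samples: [Double] = [0.75, -0.75, 1.2345, -42.9, 3.0]

// The plain conversion, applied to every sample.
let direct = samples.map { Int($0) }

// Explicit rounding toward zero, followed by the same conversion.
let viaRounding = samples.map { Int($0.rounded(.towardZero)) }

// The two spellings agree for every sample.
let agree = direct == viaRounding

print(direct, viaRounding, agree)
```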