[swift-evolution] [proposal]Decouple definition of Int8 from target char type
William Dillon
william at housedillon.com
Fri Feb 26 16:31:33 CST 2016
> On Feb 26, 2016, at 9:09 AM, Dmitri Gribenko <gribozavr at gmail.com> wrote:
>
> On Fri, Feb 26, 2016 at 9:01 AM, William Dillon <william at housedillon.com> wrote:
>>
>>> On Feb 25, 2016, at 11:13 PM, Dmitri Gribenko <gribozavr at gmail.com> wrote:
>>>
>>> On Thu, Feb 25, 2016 at 9:58 PM, William Dillon <william at housedillon.com> wrote:
>>>>>> Swift currently maps the Int8 type to be equal to the char type of the target platform. On targets where char is unsigned by default, Int8 becomes an unsigned 8-bit integer, which is a clear violation of the Principle of Least Astonishment. Furthermore, it is impossible to specify a signed 8-bit integer type on platforms with unsigned chars.
>>>>>
>>>>> I'm probably misunderstanding you, but are you sure that's what is
>>>>> happening? I can't imagine how the standard library would just
>>>>> silently make Int8 unsigned on Linux arm.
>>>>>
>>>>
>>>> I think the best way to demonstrate this is through an example. Here is a sample swift program:
>>>>
>>>> import Foundation
>>>> print(NSNumber(char: Int8.min).shortValue)
>>>
>>> There is a lot happening in this snippet of code (including importing
>>> two completely different implementations of Foundation, and the pure
>>> Swift one not being affected by the Clang importer at all). Could you
>>> provide AST dumps for both platforms for this code?
>>>
>>
>> Of course. Here’s the AST on ARM:
>>
>> wdillon at tegra-ubuntu:~$ swiftc -dump-ast example.swift
>> (source_file
>> ...
>>
>> And Darwin:
>>
>> Falcon:~ wdillon$ xcrun -sdk macosx swiftc -dump-ast example.swift
>> (source_file
>> ...
>>
>> I want to point out that these are identical, as far as I can tell.
>
> I agree. Then, the difference in behavior should be contained in the
> NSNumber implementation. As far as this piece of code is concerned,
> it correctly passes the value as Int8. Could you debug what's
> happening in the corelibs Foundation, to find out why it is not
> printing a negative number?
>
I want to be clear that this isn’t a problem specific to NSNumber. I chose that example because I wanted something that was trivial to check on your own, and limited to Swift project code, that demonstrates the issue. This behavior will occur in any case where a char is imported into Swift from C. Fixing NSNumber would address the issue in only that one place. Even if all of the stdlib and CoreFoundation were modified to hide this problem, any user code that interfaces with C would still have these issues, and would require fixes of its own.
I don’t think it’s reasonable to expect that the issue be known and addressed in the literally thousands of places where chars from C APIs are present, especially since the issue is hidden from view by the nature of mapping char onto Int8. An implementor would have to know that a given API returns char, that it will be imported as Int8, and that it might be an Int8 that was intended to be unsigned, and then do the right thing.
In contrast, if C’s char is imported as CChar, it’s very clear what’s happening, and it leads the user toward a course of action that is more likely to be appropriate.
I’ve created a GitHub project that demonstrates this problem without using Foundation or CoreFoundation at all. The code builds a small C-based object with three functions that return a char: one returns -1, one returns 1, and the last returns 255.
On signed-char platforms:
From Swift: Type: Int8
From Swift: Negative value: -1, positive value: 1, big positive value: -1
From clang: Negative value: -1, positive value: 1, big positive value: -1
On unsigned-char platforms:
From Swift: Type: Int8
From Swift: Negative value: -1, positive value: 1, big positive value: -1
From clang: Negative value: 255, positive value: 1, big positive value: 255
Code: https://github.com/hpux735/badCharExample.git
It’s clear that Swift is interpreting the bit pattern of the input value as a signed 8-bit integer regardless of how char is defined on the target platform.
>> As another exercise, you can tell clang to use signed or unsigned chars and there will be no change:
>>
>> wdillon at tegra-ubuntu:~$ swiftc example.swift -Xcc -funsigned-char
>> wdillon at tegra-ubuntu:~$ ./example
>> 128
>> wdillon at tegra-ubuntu:~$ swiftc example.swift -Xcc -fsigned-char
>> wdillon at tegra-ubuntu:~$ ./example
>> 128
>
> And it makes sense, since the program you provided does not compile
> any C code. It is pure Swift (though it calls into C via corelibs
> Foundation).
>
Yep, that’s right.
>>>>> What about a proposal where we would always map 'char' to Int8,
>>>>> regardless of the C's idea of signedness?
>>>>>
>>>>
>>>> In a very real sense this is exactly what is happening currently.
>>>
>>> Sorry, I don't see that yet -- it is still unclear to me what is happening.
>>>
>>
>> That’s ok. We’ll keep working on it until I’ve proven to everyone’s satisfaction that there really is a problem.
>
> Given what you showed with corelibs Foundation, I agree there's a
> problem. I'm just trying to understand how much of that behavior was
> intended, if there are any bugs in the compiler (in implementing our
> intended behavior), if there are any bugs in Foundation, and what
> would the behavior be if we fixed those bugs. When we have that, we
> can analyze our model (as if it were implemented as intended) and make
> a judgement about whether it works, and whether it is a good one.
>
I believe, based on this comment in CTypes.swift:
/// This will be the same as either `CSignedChar` (in the common
/// case) or `CUnsignedChar`, depending on the platform.
public typealias CChar = Int8
that the dual meaning of Int8 is expected and intended; otherwise the authors of the comment and the code (Ted and Jordan, respectively) didn’t understand the intended behavior, and I find that hard to believe.
> For example, if it turns out that the issue above is due to a bug in
> the C parts of CoreFoundation that assumes signed char on arm (because
> of iOS, say), then there's nothing that a language change in Swift
> could do.
>
Hopefully I’ve been able to demonstrate that CoreFoundation is not, per se, a party to this issue. Really, any time a char gets imported into Swift there is the possibility of unintended (and potentially very frustrating to diagnose) behavior.
Cheers,
- Will