[swift-evolution] [proposal]Decouple definition of Int8 from target char type

William Dillon william at housedillon.com
Thu Feb 25 23:58:46 CST 2016

>> Swift currently maps the Int8 type to be equal to the char type of the target platform.  On targets where char is unsigned by default, Int8 becomes an unsigned 8-bit integer, which is a clear violation of the Principle of Least Astonishment.  Furthermore, it is impossible to specify a signed 8-bit integer type on platforms with unsigned chars.
> I'm probably misunderstanding you, but are you sure that's what is
> happening?  I can't imagine how the standard library would just
> silently make Int8 unsigned on Linux arm.

I think the best way to demonstrate this is through an example.  Here is a sample Swift program:

import Foundation
print(NSNumber(char: Int8.min).shortValue)

Compile and run this on Darwin and you get what you would expect:

Falcon:~ wdillon$ ./example; uname -a
Darwin Falcon.local 15.3.0 Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64 x86_64

On Linux/ARM you’ll get something entirely unexpected:

wdillon@tegra-ubuntu:~$ ./example; uname -a
Linux tegra-ubuntu 3.10.40-gdacac96 #1 SMP PREEMPT Thu Jun 25 15:25:11 PDT 2015 armv7l armv7l armv7l GNU/Linux

> What I would expect to happen is that on Linux arm the Clang importer
> would map 'char' to UInt8, instead of mapping it to Int8 like it does
> on x86_64.

That would come close to a satisfactory solution, but it would lead to frustration in the long term and ultimately to more special cases.  Any API that relies on the definition of char would be bifurcated.  The user would have to either bracket the code with #if blocks (and know which platforms define char as signed and which as unsigned), or explicitly cast to a consistent type at every entry point where char is used.  The reverse is true when providing values to C: the user would have to know which platforms do what, and explicitly cast their internally used type to the correct type for char first.
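To make that burden concrete, here is a rough sketch of what per-platform bracketing might look like.  The names ImportedChar and normalized are hypothetical and exist only for illustration, and the platform condition is an assumption about one target with unsigned char, not a complete list.

```swift
// Hypothetical alias standing in for whatever type the importer would
// hand back for C `char` under the alternative mapping.
#if os(Linux) && arch(arm)
typealias ImportedChar = UInt8   // assumption: char is unsigned on this target
#else
typealias ImportedChar = Int8    // assumption: char is signed on this target
#endif

// Every entry point that touches `char` then has to normalize to one
// consistent type before the rest of the program can use the value.
func normalized(_ c: ImportedChar) -> Int {
    return Int(c)
}
```

Each new platform with a different default would need another branch in every such block.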

By using CChar, the user isn’t required to maintain this knowledge and list of platforms in countless locations in their code.  All they would have to do is cast from CChar to whatever type they want to use within Swift.  When going the other way, to get a value from Swift to C, they just cast to CChar and the correct action is taken.  In cases where the Swift type is the same as CChar, the cast is basically a no-op.
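As a minimal sketch of the casts described above, written in current Swift syntax rather than the Swift of this thread: the helper names byte(fromC:) and cChar(fromByte:) are hypothetical.  The generic truncatingIfNeeded initializer is used because it compiles whether CChar is an alias for a signed or an unsigned 8-bit type.

```swift
// Hypothetical helpers, not part of the standard library.

// C -> Swift: reinterpret a C `char` as an unsigned byte.
func byte(fromC c: CChar) -> UInt8 {
    return UInt8(truncatingIfNeeded: c)
}

// Swift -> C: convert an unsigned byte back to the platform's `char`.
func cChar(fromByte b: UInt8) -> CChar {
    return CChar(truncatingIfNeeded: b)
}

// Round trip: the bit pattern is preserved regardless of CChar's signedness.
let original: UInt8 = 0xE9
let roundTripped = byte(fromC: cChar(fromByte: original))
```

The calling code names only CChar and the Swift type it wants, so no platform list appears at the call site.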

Another benefit is that the process brings awareness to the fact that char is not defined consistently across platforms.  I believe that it is worthwhile for people to understand the implications of the code they write, and the cast from CChar to something else provides an opportunity for a moment of reflection and for asking the question “what am I doing here, and what do I want?”

>>    C type    |   Swift type
>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>         char |   CChar
>> unsigned char |   UInt8
>>  signed char |    Int8
> This brings in the notion of the CChar type, and requires us to define
> (hopefully!) some rules for type-based aliasing, since you want to be
> able to freely cast UnsafePointer<CChar> to UnsafePointer<UInt8> or
> UnsafePointer<Int8>.

Swift already has a CChar.  It’s defined in CTypes.swift: https://github.com/apple/swift/blob/master/stdlib/public/core/CTypes.swift#L19  The usage in CTypes.swift relies on the fact that Int8 has this dual meaning.  I agree that the ability to cast between UnsafePointers specialized to each type is desirable.
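For illustration only, here is one way such a pointer cast can be written in current Swift, using withMemoryRebound(to:capacity:), which postdates this thread.  The helper unsignedBytes(of:) is a hypothetical name, and this is a sketch of the idea rather than the aliasing rules being asked for.

```swift
// Hypothetical helper: view a buffer of CChar as unsigned bytes.
func unsignedBytes(of chars: [CChar]) -> [UInt8] {
    return chars.withUnsafeBufferPointer { buffer -> [UInt8] in
        // An empty array may have no base address.
        guard let base = buffer.baseAddress else { return [] }
        // Temporarily rebind the same memory to UInt8 and copy it out.
        return base.withMemoryRebound(to: UInt8.self, capacity: buffer.count) { bytes in
            Array(UnsafeBufferPointer(start: bytes, count: buffer.count))
        }
    }
}

// "Hi" followed by a NUL terminator, as C would store it.
let greeting: [CChar] = [72, 105, 0]
let greetingBytes = unsignedBytes(of: greeting)
```

The rebinding is scoped to the closure, so the bytes are copied into a new array before it ends.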

> What about a proposal where we would always map 'char' to Int8,
> regardless of the C's idea of signedness?

In a very real sense this is exactly what is happening currently.

Thanks for sharing your thoughts,
- Will
