<html><head><style>
body {
        font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
        padding:1em;
        margin:auto;
        background:#fefefe;
}
h1, h2, h3, h4, h5, h6 {
        font-weight: bold;
}
h1 {
        color: #000000;
        font-size: 28pt;
}
h2 {
        border-bottom: 1px solid #CCCCCC;
        color: #000000;
        font-size: 24px;
}
h3 {
        font-size: 18px;
}
h4 {
        font-size: 16px;
}
h5 {
        font-size: 14px;
}
h6 {
        color: #777777;
        background-color: inherit;
        font-size: 14px;
}
hr {
        height: 0.2em;
        border: 0;
        color: #CCCCCC;
        background-color: #CCCCCC;
display: inherit;
}
p, blockquote, ul, ol, dl, li, table, pre {
        margin: 15px 0;
}
a, a:visited {
        color: #4183C4;
        background-color: inherit;
        text-decoration: none;
}
#message {
        border-radius: 6px;
        border: 1px solid #ccc;
        display:block;
        width:100%;
        height:60px;
        margin:6px 0px;
}
button, #ws {
        font-size: 12 pt;
        padding: 4px 6px;
        border-radius: 5px;
        border: 1px solid #bbb;
        background-color: #eee;
}
code, pre, #ws, #message {
        font-family: Monaco;
        font-size: 10pt;
        border-radius: 3px;
        background-color: #F8F8F8;
        color: inherit;
}
code {
        border: 1px solid #EAEAEA;
        margin: 0 2px;
        padding: 0 5px;
}
pre {
        border: 1px solid #CCCCCC;
        overflow: auto;
        padding: 4px 8px;
}
pre > code {
        border: 0;
        margin: 0;
        padding: 0;
}
#ws { background-color: #f8f8f8; }
.bloop_markdown table {
border-collapse: collapse;
font-family: Helvetica, arial, freesans, clean, sans-serif;
color: rgb(51, 51, 51);
font-size: 15px; line-height: 25px;
padding: 0; }
.bloop_markdown table tr {
border-top: 1px solid #cccccc;
background-color: white;
margin: 0;
padding: 0; }
.bloop_markdown table tr:nth-child(2n) {
background-color: #f8f8f8; }
.bloop_markdown table tr th {
font-weight: bold;
border: 1px solid #cccccc;
margin: 0;
padding: 6px 13px; }
.bloop_markdown table tr td {
border: 1px solid #cccccc;
margin: 0;
padding: 6px 13px; }
.bloop_markdown table tr th :first-child, table tr td :first-child {
margin-top: 0; }
.bloop_markdown table tr th :last-child, table tr td :last-child {
margin-bottom: 0; }
.bloop_markdown blockquote{
border-left: 4px solid #dddddd;
padding: 0 15px;
color: #777777; }
blockquote > :first-child {
margin-top: 0; }
blockquote > :last-child {
margin-bottom: 0; }
code, pre, #ws, #message {
word-break: normal;
word-wrap: normal;
}
hr {
display: inherit;
}
.bloop_markdown :first-child {
-webkit-margin-before: 0;
}
code, pre, #ws, #message {
font-family: Menlo, Consolas, Liberation Mono, Courier, monospace;
}
.send { color:#77bb77; }
.server { color:#7799bb; }
.error { color:#AA0000; }</style></head><body><p>Please see the <a href="https://gist.github.com/hpux735/eafad78108ed42879690">gist</a> for the most up-to-date drafts.</p>
<p>I appreciate any comments, concerns and questions!</p>
<h1 id="improvetheportabilityofswiftwithdifferentlysignedchar.">Improve the portability of Swift with differently signed char.</h1>
<ul>
<li>Proposal: <a href="https://github.com/apple/swift-evolution/blob/master/proposals/004x-target-specific-chars.md">SE-004x</a></li>
<li>Author: <a href="https://github.com/hpux735">William Dillon</a></li>
<li>Status: <strong>Draft</strong></li>
<li>Review manager: TBD</li>
</ul>
<h2 id="introduction">Introduction</h2>
<p>In C, the signedness of <code>char</code> is implementation-defined. The convention is set either by the platform, as on Windows, or by the architecture's ABI specification, as is typical on System V-derived systems. The table below lists the <code>char</code> signedness of a subset of known platforms germane to this discussion.</p>
<table>
<colgroup>
<col style="text-align:center;">
<col style="text-align:center;">
<col style="text-align:center;">
<col style="text-align:center;">
<col style="text-align:center;">
<col style="text-align:center;">
<col style="text-align:center;">
</colgroup>
<thead>
<tr>
        <th style="text-align:center;">char</th>
        <th style="text-align:center;">ARM</th>
        <th style="text-align:center;">mips</th>
        <th style="text-align:center;">PPC</th>
        <th style="text-align:center;">PPC64</th>
        <th style="text-align:center;">i386</th>
        <th style="text-align:center;">x86_64</th>
</tr>
</thead>
<tbody>
<tr>
        <td style="text-align:center;">Linux/ELF</td>
        <td style="text-align:center;">unsigned <a href="http://www.eecs.umich.edu/courses/eecs373/readings/ARM-AAPCS-EABI-v2.08.pdf">1</a></td>
        <td style="text-align:center;">unsigned <a href="http://math-atlas.sourceforge.net/devel/assembly/mipsabi32.pdf">2</a></td>
        <td style="text-align:center;">unsigned <a href="https://uclibc.org/docs/psABI-ppc.pdf">3</a></td>
        <td style="text-align:center;">unsigned <a href="http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi.html">4</a></td>
        <td style="text-align:center;">signed <a href="http://www.sco.com/developers/devspecs/abi386-4.pdf">5</a></td>
        <td style="text-align:center;">signed <a href="http://www.x86-64.org/documentation/abi.pdf">6</a></td>
</tr>
<tr>
        <td style="text-align:center;">Mach-O</td>
        <td style="text-align:center;">signed [7]</td>
        <td style="text-align:center;">N/A</td>
        <td style="text-align:center;">signed [7]</td>
        <td style="text-align:center;">signed [7]</td>
        <td style="text-align:center;">signed [7]</td>
        <td style="text-align:center;">signed [7]</td>
</tr>
<tr>
        <td style="text-align:center;">Windows</td>
        <td style="text-align:center;">signed [8]</td>
        <td style="text-align:center;">signed [8]</td>
        <td style="text-align:center;">signed [8]</td>
        <td style="text-align:center;">signed [8]</td>
        <td style="text-align:center;">signed [8]</td>
        <td style="text-align:center;">signed [8]</td>
</tr>
</tbody>
</table>
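<p>To check which convention a particular toolchain follows, a minimal sketch along these lines may help. It assumes a hosted C compiler that provides <code>&lt;limits.h&gt;</code>; the function name <code>char_is_signed</code> is hypothetical and not part of any library.</p>
<pre><code class="C">#include &lt;limits.h&gt;

/* Hypothetical helper: returns 1 when plain `char` is signed on this
 * target and 0 when it is unsigned. CHAR_MIN is 0 exactly when `char`
 * is unsigned, and equals SCHAR_MIN otherwise. */
int char_is_signed(void) {
    return CHAR_MIN &lt; 0;
}
</code></pre>
<p>With GCC and clang, the default can also be overridden per compilation with <code>-fsigned-char</code> or <code>-funsigned-char</code>, which makes it possible to observe both behaviors on one machine.</p>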
<p>This is not a great problem in C, and indeed many programmers aren’t even aware of the issue. Part of the reason is that C silently converts between similar types as necessary. Notably, even with <code>-Wall</code>, clang produces no warnings when converting between any pair of <code>char</code>, <code>unsigned char</code>, <code>signed char</code> and <code>int</code>. Swift, in contrast, does not convert between types without explicit direction from the programmer. As implemented, <code>char</code> is imported by Swift as <code>Int8</code>, regardless of whether the underlying platform uses <code>signed</code> or <code>unsigned char</code>. As every Apple platform (seemingly) uses <code>signed char</code> as a convention, this was an appropriate choice. However, now that Swift is being ported to more and more platforms, it is important that we decide how to handle the alternate case.</p>
<p>The problem at hand may be most simply demonstrated by a small example. Consider a C API where a set of functions return values as <code>char</code>:</p>
<pre><code class="C">char charNegFunction(void) { return -1; }
char charBigPosFunction(void) { return 255; }
char charPosFunction(void) { return 1; }
</code></pre>
<p>Then, if the API is used in C thusly:</p>
<pre><code class="C">char negValue = charNegFunction();
char posValue = charPosFunction();
char bigValue = charBigPosFunction();
printf("From clang: Negative value: %d, positive value: %d, big positive value: %d\n", negValue, posValue, bigValue);
</code></pre>
<p>You get exactly what you would expect on <code>signed char</code> platforms:</p>
<pre><code>From clang: Negative value: -1, positive value: 1, big positive value: -1
</code></pre>
<p>and on <code>unsigned char</code> platforms:</p>
<pre><code>From clang: Negative value: 255, positive value: 1, big positive value: 255
</code></pre>
<p>In its current state, Swift imports <code>char</code> as <code>Int8</code> everywhere, so it behaves like C on <code>signed char</code> platforms:</p>
<pre><code>From Swift: Negative value: -1, positive value: 1, big positive value: -1
</code></pre>
<p>This code is available <a href="https://github.com/hpux735/badCharExample">here</a>, if you would like to play with it yourself.</p>
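<p>The two C results differ only in how the same byte is interpreted. As a rough sketch of that relationship (the helper names below are hypothetical, and the first one assumes a two's complement <code>signed char</code>, which holds on every platform in the table above):</p>
<pre><code class="C">#include &lt;string.h&gt;

/* Hypothetical helper: reinterpret the bits of a byte as a signed
 * value, which is what importing an unsigned `char` as Int8 amounts
 * to. Assumes a two's complement representation. */
int byte_as_signed(unsigned char byte) {
    signed char result;
    memcpy(&amp;result, &amp;byte, sizeof result);
    return result;
}

/* Hypothetical helper: recover the unsigned byte value of a `char`
 * no matter how the platform signs it. Conversion to `unsigned char`
 * is defined as reduction modulo UCHAR_MAX + 1, so -1 becomes 255
 * when `char` is 8 bits wide. */
int char_as_byte(char c) {
    return (unsigned char)c;
}
</code></pre>
<p>Here <code>byte_as_signed(255)</code> yields <code>-1</code>, matching the Swift output above, while <code>char_as_byte</code> returns <code>255</code> for that byte on both kinds of platform.</p>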
<h2 id="motivation">Motivation</h2>
<p>The third stated focus area for Swift 3.0 is <strong>portability</strong>, to quote the evolution document:</p>
<blockquote>
<ul>
<li><strong>Portability</strong>: Make Swift available on other platforms and ensure that one can write portable Swift code that works properly on all of those platforms.</li>
</ul>
</blockquote>
<p>As it stands, Swift’s indifference to the signedness of <code>char</code> while importing from C can be ignored in many cases. Inaction, however, leaves the door open for extremely subtle and difficult-to-diagnose bugs any time a C API relies on values greater than 127 on platforms with <code>unsigned char</code>; in this case the current import model certainly violates the Principle of Least Astonishment.</p>
<p>This is not an abstract problem that I want to have solved “just because.” This issue has been a recurrent theme, and has come up several times during code review. I’ve included a sampling of these to provide some context to the discussion:</p>
<ul>
<li><a href="https://github.com/apple/swift/pull/1103">Swift PR-1103</a></li>
<li><a href="https://github.com/apple/swift-corelibs-foundation/pull/265">Swift Foundation PR-265</a></li>
</ul>
<p>In these discussions we clearly struggle to solve the issues at hand adequately without the changes proposed here. Indeed, this proposal was suggested in <a href="https://github.com/apple/swift-corelibs-foundation/pull/265">Swift Foundation PR-265</a> by <a href="http://github.com/jckarter">Joe Groff</a>.</p>
<p>These changes should happen during a major release. Considering them for Swift 3 will enable us to move forward efficiently while constraining any source incompatibilities to transitions where users expect them. Code that already works properly on each of these platforms is likely to continue to work properly. Further, implementing this proposal will expose cases where a problem exists but its symptoms have not yet been noticed.</p>
<h2 id="proposedsolution">Proposed solution</h2>
<p>I propose that <code>CChar</code> be aliased to <code>UInt8</code> on targets where <code>char</code> is <code>unsigned</code>, and to <code>Int8</code> on targets where <code>char</code> is <code>signed</code>.</p>
<h2 id="detaileddesign">Detailed design</h2>
<p>In principle this is a very small change to <code>swift/stdlib/public/core/CTypes.swift</code>:</p>
<pre><code class="diff"> ///
 /// This will be the same as either `CSignedChar` (in the common
 /// case) or `CUnsignedChar`, depending on the platform.
+#if os(OSX) || os(iOS) || os(Windows) || arch(i386) || arch(x86_64)
 public typealias CChar = Int8
+#else
+public typealias CChar = UInt8
+#endif
</code></pre>
<h2 id="impactonexistingcode">Impact on existing code</h2>
<p>Though the change itself is trivial, the impact on other parts of the project, including <em>stdlib</em> and <em>foundation</em>, cannot be ignored. To get a handle on the scope of the required changes, I’ve performed this change on the swift project, and I encourage any interested party to investigate: <a href="https://github.com/apple/swift/compare/master...hpux735:char">https://github.com/apple/swift/compare/master...hpux735:char</a>. This fork builds on both <code>signed</code> and <code>unsigned char</code> platforms. There is one test failure on <code>signed char</code> platforms and two test failures on <code>unsigned char</code> platforms, resulting from remaining assumptions about the signedness of <code>char</code>. They should be trivial to address by someone skilled at lit tests, and will be fixed prior to any pull request.</p>
<p>In general, code that you write will fail to compile if you assume that C APIs will continue to import <code>char</code> as <code>Int8</code>. Your choices are to write code that interfaces with <code>char</code> using <code>CChar</code>, or to break it out into separate cases. Other than one test, which relies on breaking the <code>char</code> assumption for the purposes of generating an error, I have not seen a case that justifies using conditional compilation directives over <code>CChar</code>. There are cases where it is necessary to cast from <code>CChar</code> to a concretely-signed type, such as <code>UInt8</code> or <code>Int8</code>, but in those cases the explicit cast encourages you to consider the impact of assuming the structure of the data that you’re working with. Very often, if you write your code using <code>CChar</code>, it will be portable and compile cleanly on all platforms.</p>
<h2 id="alternativesconsidered">Alternatives considered</h2>
<p>The only real alternative is the status quo. Currently, Swift treats every unspecified <code>char</code> as signed. This works most of the time, but I think we can do better.</p>
<h2 id="footnotes">Footnotes</h2>
<p>[7]: <em>proof by construction</em> (is it signed by convention?)</p>
<pre><code>$ cat test.c
char _char(char a) { return a; }
signed char _schar(signed char a) { return a; }
unsigned char _uchar(unsigned char a) { return a; }

$ clang -S -emit-llvm -target &lt;arch&gt;-unknown-{windows,darwin}
</code></pre>
<p>and look for “signext” or “zeroext” in the <code>@_char</code> definition.</p>
<p>[8]: Windows char is signed by convention.</p></body></html>