[swift-evolution] Strings in Swift 4

David Waite david at alkaline-solutions.com
Sat Feb 25 18:09:41 CST 2017


> On Feb 25, 2017, at 2:54 PM, Michael Ilseman <milseman at apple.com> wrote:
> 
> 
>> On Feb 25, 2017, at 3:26 PM, David Waite via swift-evolution <swift-evolution at swift.org> wrote:
>> 
>> Ted, 
>> 
>> It might have helped if instead of being called String and Character, they were named Text
> 
> I would oppose taking a good name like “Text” and using it for Strings which are mostly for machine processing purposes, but can be human-presentable with an explicit locale. A name like Text would be a better fit for Strings bundled with locale etc. for the purpose of presentation to humans, which must always be in the context of some locale (even if a “default” system locale). Refer to the sections in the String manifesto[1][2]. Such a Text type is definitely out-of-scope for the current discussion.
> 
Oh, I would never propose such a naming change, because I am comfortable with the existing names. I’m just acknowledging that the history of string manipulation causes friction for developers coming from other languages, in that they may expect functionality that doesn’t make sense within String’s goals.

I was merely illustrating that there is a big difference between how strings work in traditional languages and how truly Unicode-safe strings work. In scripting languages like Ruby and Python, the string type bears the brunt of binary data handling. Even in languages like Java and C#, Unicode support makes compromises that Swift seems unwilling to make.

IMO, the fact that Swift’s String lacks random access is not a deficiency in Swift, but it can lead to misunderstandings about how Swift strings differ from strings in other languages.
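
To make the random access point concrete, here is a minimal sketch, assuming Swift 4 or later (where String is itself a collection of Characters). The sample text and variable names are only illustrative, not anything from the proposal:

```swift
// "Café" spelled with a plain "e" followed by a combining acute accent (U+0301).
let word = "Cafe\u{301}"

// word[3] does not compile: String has no integer subscript.
// Positions are String.Index values, reached by walking from a known index,
// which is a linear-time operation rather than a constant-time one.
let fourthIndex = word.index(word.startIndex, offsetBy: 3)
let fourth = word[fourthIndex]              // the Character "é"

// The same text has a different length in each view.
let characterCount = word.count             // 4 user-perceived characters
let scalarCount = word.unicodeScalars.count // 5 Unicode scalar values
let utf16Count = word.utf16.count           // 5 UTF-16 code units
let utf8Count = word.utf8.count             // 6 UTF-8 code units
```

A developer expecting `word[3]` to mean "the fourth 16-bit unit" would get the bare "e" in a UTF-16 based language, while Swift’s Character view returns the whole accented letter.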

>> and ExtendedGraphemeCluster. 
>> 
> 
> What is expressed by Swift’s Character type is what the Unicode standard often refers to as a “user-perceived character”. Note that “character” by itself is not meaningful in Unicode (though it is often thrown about casually). In Swift, Character is an appropriate name here for the concept of a user-perceived character. If you want bytes, then you can use UInt8. If you want Unicode scalar values, you can use UnicodeScalar. If you want code units, you can use whatever that ends up looking like (probably an associated type named CodeUnit that is bound to UInt8 or UInt16 depending on the encoding).

In C or C++, a “char” is considered nearly universally to be an 8-bit value. A char (Character) in Java or char (Char) in C# is a 16-bit UTF-16 code unit. All of these effectively behave as integer values (with Java’s char having the unique quality of being that language’s only unsigned integer primitive).

IMO, the fact that a Swift Character doesn’t behave as an integer value, but rather more like a string holding one user-perceived character, is not a deficiency in Swift. It can, however, cause misunderstandings because of how Swift differs from other languages.
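
As a rough illustration of that difference, here is a small sketch. The variable names are hypothetical, and it goes through String(…) only so that it does not depend on any newer Character API:

```swift
let letter: Character = "A"
// let next = letter + 1   // does not compile: Character is not an integer type

// Integer-style arithmetic requires dropping down to a Unicode scalar explicitly.
let scalar = String(letter).unicodeScalars.first!
let nextValue = scalar.value + 1                        // UInt32 value 66
let nextLetter = Character(UnicodeScalar(nextValue)!)   // "B"

// One Character can hold several scalars, which no fixed-width integer models.
let accented: Character = "e\u{301}"                    // "é" built from two scalars
let accentedScalarCount = String(accented).unicodeScalars.count   // 2
let accentedUTF8Count = String(accented).utf8.count               // 3
```

The explicit hop through UnicodeScalar is what someone used to `c + 1` in C, Java, or C# has to learn, and it only makes sense for Characters that consist of a single scalar.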

-DW
