[swift-evolution] InternalString class for easy String manipulation

Félix Cloutier felixcca at yahoo.ca
Fri Aug 19 01:05:12 CDT 2016


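Even UTF-32 does not provide a 1-to-1 mapping to visual glyphs. As mentioned earlier in this thread, for instance, each flag is composed of two Unicode code points (a pair of regional indicator symbols), so a single visible flag still takes two UTF-32 code units.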
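
A small sketch of the point above, assuming Swift 4 or later (where `String.count` counts `Character`s, i.e. grapheme clusters; in the Swift 3 of this thread the spelling was `characters.count`). The constant names are only for illustration:

```swift
// The Canadian flag is two regional indicator symbols: U+1F1E8 (C) and U+1F1E6 (A).
let flag = "\u{1F1E8}\u{1F1E6}"

// One Unicode scalar corresponds to one UTF-32 code unit.
let scalarCount = flag.unicodeScalars.count
// Both scalars lie outside the BMP, so each needs a UTF-16 surrogate pair.
let utf16Count = flag.utf16.count
// Each of these scalars takes four bytes in UTF-8.
let utf8Count = flag.utf8.count
// What the user actually sees: a single glyph.
let characterCount = flag.count

// The same effect with a plain letter: "e" followed by U+0301 COMBINING ACUTE ACCENT.
let decomposed = "e\u{301}"
let decomposedScalars = decomposed.unicodeScalars.count
let decomposedCharacters = decomposed.count

print(scalarCount, utf16Count, utf8Count, characterCount)  // 2 4 8 1
print(decomposedScalars, decomposedCharacters)             // 2 1
```

So even with a fixed-width encoding, indexing by code unit is not the same as indexing by what the reader perceives as a character.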

Félix

> On Aug 18, 2016, at 12:22:10, Jean-Denis Muys <jdmuys at gmail.com> wrote:
> 
> And both are variable-length encodings. I mean that different characters do
> not necessarily occupy the same number of bytes in memory.
> 
> But now, UTF-32 (or UCS-4) is a constant-length encoding. Why not use
> UTF-32 as the encoding for an easy-to-use and easy-to-index string type?
> The memory inefficiency of it might be a small price to pay in many cases,
> including for beginners.
> 
> Finally, I oppose restricting identifiers in Swift programs to ASCII chars
> only. One reason is that in scientific programming, we can at last use
> Greek letters, or even: א.
> 
> Jean-Denis
> 
> 
> On Thu, Aug 18, 2016 at 8:51 PM, Félix Cloutier <swift-evolution at swift.org>
> wrote:
> 
>> I'm not sure I understand your comment. UTF-8 and UTF-16 are just two
>> different ways to represent Unicode data, and they can both encode the
>> whole range of Unicode. Of course you'll have problems if you try to
>> interpret UTF-8 as UTF-16 and vice-versa, but that'll happen regardless of
>> whether you use international characters or not.
>> Félix
>> 
>> 
>> On Thursday, August 18, 2016 9:33 AM, Kenny Leung via swift-evolution <
>> swift-evolution at swift.org> wrote:
>> 
>> 
>>>> Just because you are using UTF-8 as the internal format, it does not
>>>> mean that universal support is guaranteed.
>> 
>> All I meant was this, and nothing more. If the internal format was UTF-8,
>> and you were using a filesystem whose filenames were UTF-16, you would have
>> the same problems.
>> 
>> -Kenny
>> 
>> 
>>> On Aug 17, 2016, at 10:40 PM, Félix Cloutier <felixcca at yahoo.ca> wrote:
>>> 
>>>> In Félix’s case, I would expect to have to ask for a mail-friendly
>>>> representation of his name, just like you have to ask for a
>>>> filesystem-friendly representation of a filename regardless of what the
>>>> internal representation is. Just because you are using UTF-8 as the
>>>> internal format, it does not mean that universal support is guaranteed.
>>> 
>>> Can you imagine if "n" turned out to be poorly supported by systems
>>> throughout the world and dead-serious people argued that it's too hard for
>>> beginners?
>>> 
>>> "Filesystem-friendly" and "email-friendly" names are not backed by
>>> modern standards. You can have essentially any character that you like in a
>>> file name save for the directory separator on almost every platform out
>>> there (except on Windows, but the constraints are implemented in a layer
>>> above NTFS), and addresses like félix at ... are RFC-legal. Restrictions are
>>> merely wished into existence by programmers who don't want to complicate
>>> their mental model of text processing, to everyone else's detriment.
>>> 
>>> Félix
>> 
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
