[swift-evolution] Strings in Swift 4

Ted F.A. van Gaalen tedvgiosdev at gmail.com
Mon Feb 20 14:59:28 CST 2017


Hi Ben, Dave (you should not read this now, you’re on vacation :o)  & Others

As described in the Swift Standard Library API Reference:

The Character type represents a character made up of one or more Unicode scalar values, 
grouped by a Unicode boundary algorithm. Generally, a Character instance matches what 
the reader of a string will perceive as a single character. The number of visible characters is 
generally the most natural way to count the length of a string.

The smallest discrete unit that we (app programmers) mostly work with is this
perceived visible character. What else?
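
To make the quoted definition concrete, here is a minimal sketch that compares the number of Characters with the number of Unicode scalars for a few strings. It assumes a toolchain where String is itself a collection of Characters (Swift 4 or later); on Swift 3 the same count would go through .characters. The names plain, accented and flag are only illustrative.

```swift
// Assumption: Swift 4 or later, where String is a Collection of Character.
let plain = "e"                  // one scalar: U+0065
let accented = "e\u{301}"        // two scalars: U+0065 + U+0301 (combining acute accent)
let flag = "\u{1F1F3}\u{1F1F1}"  // two regional-indicator scalars, perceived as one flag

for s in [plain, accented, flag] {
    // count is the number of Characters (extended grapheme clusters);
    // unicodeScalars.count is the number of scalars those Characters are built from.
    print(s, "characters:", s.count, "scalars:", s.unicodeScalars.count)
}
```

Each of the three strings is a single Character, although the last two are made up of two scalars each.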

If that is the case, my reasoning is that Strings could (should?) be relatively simple,
because most, if not all, of the complexity of Unicode is confined within the Character object and
completely hidden** from the average application programmer, who normally only needs
to work with Strings that contain these visible Characters, right?
It then makes no difference at all what is in the Character (excellent implementation, btw):
Unicode, ASCII, EBCDIC, Elvish, KlingonIV, IntergalacticV.2, whatever,
because for the visual representation of whatever is in the Character we rely,
in sublime oblivion, on miraculous font processors hidden in the dark depths of the OS.

Then, from this perspective, my question is: why is String not implemented
directly on top of an array [Character]? In that case one could refer to the Characters of the
String directly, not only for direct subscripting but also for other String functionality, in an efficient way.
(I do have a scope of independent Swift here; that is, interaction with libraries should be
solved by the compiler, so as not to be restricted by legacy ObjC etc.)

**   (except if one needs to e.g. access individual elements and/or compose graphics directly,
      but for this purpose the Character’s properties are accessible)

For the sake of convenience, based upon the above reasoning, I now “emulate” this in
a String extension, thereby ignoring the rare cases in which a visible character could be based
upon more than a single Character (extended grapheme cluster). If that would occur,
they should be merged into one extended grapheme cluster, a single Character that is.

//: Playground - implement direct subscripting using a Character array
// Of course, if String were defined as a directly accessible array of Characters,
// that would be more efficient than these extension functions.
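extension String
{
    // Number of Characters (extended grapheme clusters) in the string.
    var count: Int
    {
        return self.characters.count
    }

    // The n-th Character, returned as a String.
    // Note: every call builds a new [Character] from the whole string.
    subscript (n: Int) -> String
    {
        return String(Array(self.characters)[n])
    }

    // The Characters in a half-open range, e.g. zoo[0..<7].
    subscript (r: Range<Int>) -> String
    {
        return String(Array(self.characters)[r])
    }

    // The Characters in a closed range, e.g. zoo[18...26].
    subscript (r: ClosedRange<Int>) -> String
    {
        return String(Array(self.characters)[r])
    }
}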

func test()
{
    let zoo = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"
    print("zoo has \(zoo.count) characters (discrete extended graphemes):")
    for i in 0..<zoo.count
    {
        print(i,zoo[i],separator: "=", terminator:" ")
    }
    print("\n")
    print(zoo[0..<7])
    print(zoo[9..<16])
    print(zoo[18...26])
    print(zoo[29...39])
    print("images:" + zoo[6] + zoo[15] + zoo[26] + zoo[39])
}

test()

This works as intended and generates the following output:

zoo has 40 characters (discrete extended graphemes):
0=K 1=o 2=a 3=l 4=a 5=  6=🐨 7=, 8=  9=S 10=n 11=a 12=i 13=l 14=  15=🐌 16=, 17=  
18=P 19=e 20=n 21=g 22=u 23=i 24=n 25=  26=🐧 27=, 28=  29=D 30=r 31=o 32=m 
33=e 34=d 35=a 36=r 37=y 38=  39=🐪 

Koala 🐨
Snail 🐌
Penguin 🐧
Dromedary 🐪
images:🐨🐌🐧🐪

I don’t know how (in)efficient this method is,
but in many cases this is not as important as it is with e.g. numerical computation.
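
On the efficiency question: each subscript call in the extension rebuilds Array(self.characters) from the whole string, so a loop over all indices does work that grows with the square of the string length. Below is a minimal sketch of one way around that, converting to [Character] once and indexing the array afterwards. It assumes Swift 4 or later, where Array(zoo) yields [Character] (on Swift 3 this would be Array(zoo.characters)); the names chars, first and images are only illustrative.

```swift
// Assumption: Swift 4 or later, where Array(aString) is a [Character].
let zoo = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"

// Convert once: this walks the whole string a single time.
let chars = Array(zoo)  // [Character]

// Every access below is a constant-time array lookup, no rescanning.
let first = String(chars[0..<7])
let images = String([chars[6], chars[15], chars[26], chars[39]])

print(first)
print(images)
```

The price is a second copy of the text held as an array, which seems acceptable for short strings but is a trade-off for large ones.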

I still fail to understand why direct subscripting of strings would be unnecessary,
and would like to see this built into Swift asap.
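
For comparison, here is a sketch of what the same access looks like with the standard library's String.Index instead of Int subscripts. It assumes Swift 4 or later, where a String can be indexed directly and a range subscript yields a Substring; the variable names are only illustrative. Note that index(_:offsetBy:) walks the string from the given index, so it is not constant-time either.

```swift
// Assumption: Swift 4 or later (String is a Collection, slices are Substring).
let zoo = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"

// Single Character at offset 6.
let koalaIndex = zoo.index(zoo.startIndex, offsetBy: 6)
let koala = zoo[koalaIndex]  // Character

// Characters at offsets 9..<16.
let start = zoo.index(zoo.startIndex, offsetBy: 9)
let end = zoo.index(start, offsetBy: 7)
let snail = String(zoo[start..<end])

print(koala, snail)
```

This is the verbosity that the Int-based subscripts in my extension are meant to hide.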

Btw, I do share the concern as expressed by Rien regarding the increasing complexity of the language.

Kind Regards, 

TedvG


 