[swift-evolution] Faster/lower-level external String initialization
crk at akkyra.com
Fri Jan 8 16:51:42 CST 2016
> I'd like to see _fromCodeUnitSequence become public API
I am very much in favor of this. I have had *exactly* the same experience.
String.reserveCapacity() seems to act like a no-op for some reason, so append() is incredibly slow, and fromCString() often necessitates a copy to an intermediate buffer because of the null-byte requirement.
This has been one of the weakest areas of Swift performance for me.
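
For concreteness, here is a rough sketch of what the initializer suggested in the quoted message might look like. It is an illustration only: String.init?(codeUnits:encoding:) is the hypothetical name from this thread, not a standard library API, and the code is written against current Swift syntax, which postdates this discussion. It decodes one scalar at a time through the public UnicodeCodec protocol, so it makes no claim to match the speed of the internal _fromCodeUnitSequence path.

```swift
// Hypothetical sketch of the initializer discussed in this thread.
// The name and labels are assumptions; this is not standard library API.
extension String {
    /// Builds a string from a collection of code units, or returns nil
    /// if the input is not valid in the given encoding.
    init?<C: Collection, Codec: UnicodeCodec>(
        codeUnits: C, encoding: Codec.Type
    ) where C.Element == Codec.CodeUnit {
        var decoder = Codec()
        var iterator = codeUnits.makeIterator()
        var result = String()
        // Only a hint: the count matches the UTF-8 size just for UTF-8 input.
        result.reserveCapacity(codeUnits.count)
        decoding: while true {
            switch decoder.decode(&iterator) {
            case .scalarValue(let scalar):
                result.unicodeScalars.append(scalar)
            case .emptyInput:
                break decoding      // reached the end of the input
            case .error:
                return nil          // invalid sequence: fail instead of repairing
            }
        }
        self = result
    }
}

// A length-prefixed payload: no null terminator, and an embedded null byte.
let payload: [UInt8] = [3, 0x61, 0x00, 0x62, 0xFF]
let length = Int(payload[0])
let text = String(codeUnits: payload[1..<(1 + length)], encoding: UTF8.self)
```

Because the input is any Collection, a slice of the original buffer can be passed directly, with no intermediate copy to add a terminator, and the resulting string can contain the null character.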
> On Jan 8, 2016, at 12:21 PM, Zach Waldowski via swift-evolution <swift-evolution at swift.org> wrote:
> Going back and forth from Strings to their byte representations is an
> important part of solving many problems, including object
> serialization, binary file formats, wire/network interfaces, and
> In developing such a parser, a coworker did the yeoman's work of
> Swift's Unicode types. He swore up and down that
> String.Type.fromCString(_:) 
> was the fastest way he found. I, stubborn and noobish as I am, was
> that a better way couldn't be wrought from Swift's UnicodeCodecTypes.
> After reading through stdlib source and doing my own testing, this is no
> tale. fromCString is essentially the only public user of
> String.Type._fromCodeUnitSequence(_:input:), which serves the exact role
> of both efficient and safe initialization-by-buffer-copy.
> Of course, fromCString isn't a silver bullet; it has to have a null
> requiring a copy of the origin buffer if one needs to be added (as is the
> case with formats that specify the length up front, or unstructured
> that use unescaped double quotes as the terminator). It also prevents
> the string itself from containing the null character.
> I'd like to see _fromCodeUnitSequence become public API as (just
> spitballing here) String.init?<Collection, Codec>(codeUnits:encoding:).
> If that can't happen, an alternative to fromCString that doesn't use
> strlen would be nice, and we can just eat the performance hit on other
> code unit
> I can't really think of a reason why it's not exposed yet, so I'm led to
> believe I'm just missing something major, and not that a reason doesn't exist.
> There's also discussion to be had of if API is needed. Try as I might, I
> can't seem to get the reserveCapacity/append(UnicodeScalar) workflow to
> anything close to the same speed.  Profiling indicates that I keep
> _StringBuffer.grow. I don't know if that means the buffer isn't uniquely
> referenced, or it's a bug, or what, but it's consistently slower than
> an Array of the bytes and performing fromCString on it. Similar story
> crossing the NSString bridge, which is even stranger. 
> Anyway, I wanted to stir up discussion, see if I'm way off base and/or
> this can be turned into a proposal.
> Zachary Waldowski
> zach at waldowski.me