[swift-users] Why does String.CharacterView have reserveCapacity(:)?

Brent Royal-Gordon brent at architechies.com
Thu May 4 06:33:34 CDT 2017


> On May 2, 2017, at 12:35 PM, Kelvin Ma via swift-users <swift-users at swift.org> wrote:
> 
> I’m wondering why the String.CharacterView structure has a reserveCapacity(:) member?

Because it conforms to the RangeReplaceableCollection protocol, which requires `reserveCapacity(_:)`.

More broadly, because you can append characters to the collection, so you might want to pre-size it to reduce how often its storage has to be reallocated as it grows.
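
As a rough illustration of both points, here is a minimal sketch written against current Swift, where `String` itself is a `RangeReplaceableCollection`. The helper name `repeated` is hypothetical and not part of the standard library; it only shows that generic code can call `reserveCapacity(_:)` on any conforming type before appending.

```swift
// Hypothetical helper (not a standard library API): builds any
// RangeReplaceableCollection by appending one element `count` times.
func repeated<C: RangeReplaceableCollection>(_ element: C.Element, count: Int) -> C {
    var result = C()
    // Hint the expected final size up front. This is only a hint:
    // the collection can still grow past it later.
    result.reserveCapacity(count)
    for _ in 0..<count {
        result.append(element)
    }
    return result
}

// The same generic code works for String and Array alike.
let dashes: String = repeated("-", count: 5)
let zeros: [Int] = repeated(0, count: 3)
print(dashes)  // -----
print(zeros)   // [0, 0, 0]
```

Because the protocol requires the method, a generic function like this compiles for every conforming type, whether or not pre-sizing actually helps that type.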

> And even more strangely, why String itself has the same method?

Because `String` duplicates those `CharacterView` methods that don't address individual characters. (In Swift 4, `String` will be merged with `CharacterView`.)

> It’s even weirder that String.UnicodeScalarView has this method, but it reserves `n` `UInt8`s of storage, instead of `n` `UInt32`s of storage.

Because the views are simply different wrappers around a single underlying buffer type, which stores the string in 8-bit code units (if all characters are ASCII) or 16-bit code units (if some are non-ASCII). That means that `UnicodeScalarView` isn't backed by a UTF-32 buffer; it's backed by an ASCII or UTF-16 buffer, but it only generates and accepts indices corresponding to whole Unicode scalars, never the second half of a surrogate pair.
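
A small sketch of how the views differ in what they count. The sample string is my own choice; the counts follow from the Unicode encodings themselves and do not depend on how the buffer is stored internally.

```swift
// "a" followed by U+1F600, a scalar outside the Basic Multilingual Plane.
let s = "a\u{1F600}"

print(s.unicodeScalars.count)  // 2  (one entry per whole scalar)
print(s.utf16.count)           // 3  (U+1F600 needs a surrogate pair)
print(s.utf8.count)            // 5  (1 byte for "a", 4 for U+1F600)

// Walking the scalar view yields whole scalars only,
// never one half of a surrogate pair.
for scalar in s.unicodeScalars {
    print(String(scalar.value, radix: 16))  // 61, then 1f600
}
```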

Why not allocate a larger buffer anyway? Most strings use no extraplanar characters (those outside the Basic Multilingual Plane), and many strings use only ASCII characters. (Even when the user works in a non-ASCII language, strings representing code, file system paths, URLs, identifiers, localization keys, etc. are usually ASCII-only.) By reserving only `n` `UInt8`s, Swift avoids wasting memory, at the cost of sometimes having to reallocate and copy the buffer when a string contains relatively rare characters. I believe Swift doubles the buffer size on each reallocation, so we're talking no more than one reallocation for a non-ASCII string and two for an extraplanar string. That's quite acceptable.

> Also why String.UTF8View and String.UTF16View do not have this method, when it would make more sense for them to have it than for String itself and String.CharacterView to have it.

Because `UTF8View` and `UTF16View` are immutable. They don't conform to `RangeReplaceableCollection` and cannot be used to modify the string, since editing raw code units could produce an invalid string.
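
To make the contrast concrete, here is a minimal sketch in current Swift. The variable names are illustrative only.

```swift
var s = "abc"

// UnicodeScalarView is range-replaceable, so it can reserve and append.
s.unicodeScalars.reserveCapacity(16)
let d: Unicode.Scalar = "d"
s.unicodeScalars.append(d)
print(s)  // abcd

// The code-unit views can be read, but not mutated.
let bytes = Array(s.utf8)
print(bytes)  // [97, 98, 99, 100]

// s.utf8.append(0xFF)  // does not compile: UTF8View has no append(_:)
```

Appending a whole scalar always leaves the string valid, whereas appending a lone byte such as `0xFF` would not, which is the reason given above for keeping the code-unit views read-only.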

-- 
Brent Royal-Gordon
Architechies
