[swift-users] Why does String.CharacterView have reserveCapacity(:)?

Kelvin Ma kelvin13ma at gmail.com
Sat May 6 23:12:49 CDT 2017


Okay, I understand most of that, but I still feel it’s misleading to put
`reserveCapacity(_:)` on `CharacterView` and `UnicodeScalarView`.
`reserveCapacity(_:)` should live in a type where the unit it reserves
matches the unit counted by the `.count` property, ideally the `UTF8View`.
Otherwise it should at least be removed from `CharacterView` and
`UnicodeScalarView` and only live in the parent `String` type.
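
To illustrate the mismatch I mean, here is a minimal sketch (the sample
string is only an illustration) showing that each view reports a different
`.count` for the same string, so an `n` passed to `reserveCapacity(_:)`
cannot mean "n elements" in every view at once:

```swift
// The same string, measured in the unit of each view.
// Escapes are used so the exact scalars are unambiguous.
let sample = "caf\u{E9} \u{1F1E8}\u{1F1E6}"   // "café" + space + a flag

let characterCount = sample.count                 // grapheme clusters
let scalarCount = sample.unicodeScalars.count     // Unicode scalars
let utf16Count = sample.utf16.count               // UTF-16 code units
let utf8Count = sample.utf8.count                 // UTF-8 code units

print(characterCount, scalarCount, utf16Count, utf8Count)
// prints "6 7 9 14"
```

Which of these units `reserveCapacity(_:)` actually reserves depends on the
underlying storage, not on the view it is called through.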

On Thu, May 4, 2017 at 6:33 AM, Brent Royal-Gordon <brent at architechies.com>
wrote:

> On May 2, 2017, at 12:35 PM, Kelvin Ma via swift-users <
> swift-users at swift.org> wrote:
>
> I’m wondering why the String.CharacterView structure has a
> reserveCapacity(_:) member?
>
>
> Because it conforms to the RangeReplaceableCollection protocol, which
> requires `reserveCapacity(_:)`.
>
> More broadly, because you can append characters to the collection, and so
> you might want to pre-size it to reduce the amount of reallocating you
> might need to do in the future.
>
> And even more strangely, why does String itself have the same method?
>
>
> Because `String` duplicates those `CharacterView` methods that don't
> address individual characters. (In Swift 4, `String` will be merged with
> `CharacterView`.)
>
> It’s even weirder that String.UnicodeScalarView has this method, but it
> reserves `n` `UInt8`s of storage, instead of `n` `UInt32`s of storage.
>
>
> Because the views are simply different wrappers around a single underlying
> buffer type, which stores the string as 8-bit code units (if all characters
> are ASCII) or 16-bit code units (if some are non-ASCII). That means that
> `UnicodeScalarView` isn't backed by a UTF-32 buffer; it's backed by an ASCII
> or UTF-16 buffer, but it only generates and accepts indices corresponding to
> whole Unicode scalars, never the second half of a surrogate pair.
>
> Why not allocate a larger buffer anyway? Most strings use no extraplanar
> characters, and many strings use only ASCII characters. (Even when the user
> works in a non-ASCII language, strings representing code, file system
> paths, URLs, identifiers, localization keys, etc. are usually ASCII-only.)
> By reserving only `n` `UInt8`s, Swift avoids wasting memory, at the cost of
> sometimes having to reallocate and copy the buffer when a string contains
> relatively rare characters. I believe Swift doubles the buffer size on each
> allocation, so we're talking no more than one reallocation for a non-ASCII
> string and two for an extraplanar string. That's quite acceptable.
>
> Also, why do String.UTF8View and String.UTF16View not have this method,
> when it would make more sense for them to have it than for String itself
> and String.CharacterView to have it?
>
>
> Because UTF8View and UTF16View are immutable. They don't conform to
> RangeReplaceableCollection and cannot be used to modify the string (if they
> could, you could use them to produce an invalid string).
>
> --
> Brent Royal-Gordon
> Architechies
>
>
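
For reference, the behaviour described in the quoted reply can be sketched
roughly as follows. This is only an illustration, not a statement about the
internal storage: the variable names and capacities are arbitrary, and the
unit that `reserveCapacity(_:)` reserves is an implementation detail that may
differ between Swift versions.

```swift
// reserveCapacity(_:) is available on String and on its mutable views
// because they conform to RangeReplaceableCollection.
var buffer = String()
buffer.reserveCapacity(64)                  // called on String itself
buffer.append("abc")

// UnicodeScalarView is also range-replaceable, so it can reserve and append.
let grin: Unicode.Scalar = "\u{1F600}"
buffer.unicodeScalars.reserveCapacity(128)
buffer.unicodeScalars.append(grin)

// UTF8View and UTF16View are read-only collections: they can be inspected
// but offer no append or reserveCapacity(_:).
let scalarTotal = buffer.unicodeScalars.count
let utf8Total = buffer.utf8.count
print(scalarTotal, utf8Total)
// prints "4 7"
```

Reserving capacity is only a hint to avoid later reallocation; it never
changes any of the counts above.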