[swift-evolution] [Draft proposal] Faster/lower-level external String initialization

Zach Waldowski zach at waldowski.me
Wed Feb 3 12:18:45 CST 2016


Charles —

This certainly makes a lot of sense. My primary response is that I think
the bad behavior of reserveCapacity should be reported by one of us as a
bug. My second thought is that the extra method should be proposed
separately; whereas the current proposal surfaces things that already
exist, what you need is purely additive but would require underlying
changes. I don't see a point in implementing it now for API completeness
if it can't make good on its performance; that's the exact predicament
we're in today with reserveCapacity and append/appendContentsOf.
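
For reference, the two shapes being compared look roughly like the sketch
below. It uses current Swift spellings (`append(contentsOf:)` rather than
the `appendContentsOf` named in this thread), and the `pieces` array is
made up for illustration. Whether the reserved capacity actually saves
allocations depends on the String implementation, which is the behavior in
question here.

```swift
// Hypothetical input: five short pieces to be joined into one String.
let pieces = ["alpha", "beta", "gamma", "delta", "epsilon"]

// Shape 1: plain concatenation. Each `+` may produce an intermediate String.
let concatenated = pieces[0] + pieces[1] + pieces[2] + pieces[3] + pieces[4]

// Shape 2: reserve once, then append in place.
var assembled = ""
// Capacity is requested in UTF-8 code units; it is a request for storage,
// and the implementation decides how it is honored.
assembled.reserveCapacity(pieces.reduce(0) { $0 + $1.utf8.count })
for piece in pieces {
    assembled.append(contentsOf: piece)
}
```

Both shapes produce the same value; only the allocation pattern is expected
to differ.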

Zach Waldowski
zach at waldowski.me

On Tue, Feb 2, 2016, at 03:24 AM, Charles Kissinger wrote:
> 
> > On Feb 1, 2016, at 8:53 PM, Zach Waldowski via swift-evolution <swift-evolution at swift.org> wrote:
> > 
> > That'd seem reasonable.
> > 
> > I guess I'm not entirely sold on the benefit of the extra method here,
> > and all the weight on maintenance that'd entail. Obviously I get the
> > benefit of skipping the storage reservation, but I can't imagine a
> > scenario where building something up using
> > `appendContentsOf(_:encoding:)` would be that much better than plain
> > concatenation. I'd love to hear an example, though.
> 
> Zach,
> 
> Here’s a real-world example:
> 
> I have a case where I am assembling a String from five short ASCII
> character sequences scattered around different parts of each line of an
> input file. The maximum length of the resulting String is predictable, so
> in an ideal world I could create an empty string, call
> String.reserveCapacity() and then suck up all of the ASCII character
> sequences with a series of String.appendContentsOf(_, encoding:), all
> with just a single memory allocation per String. (But as you mentioned,
> it would appear to require a significant change in the String
> implementation for things to be that efficient.)
> 
> Obviously, the alternative approach of instantiating a string for each of
> the subsequences and concatenating them would involve a minimum of six
> allocations. It matters in my case, because the input files are large
> (sometimes millions of lines).
> 
> Right now, my approach is to allocate a byte buffer, assemble the
> substrings in it, null-terminate and call String.fromCString(). That
> performs reasonably well, but it still involves an extra copy of the
> characters and the byte buffer allocation, neither of which would be
> necessary with the String.appendContentsOf(_, encoding:) method. 
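
A minimal sketch of that workaround, written in current Swift rather than
the Swift 2 API named above (`String(cString:)` stands in for
`String.fromCString()`). The `fragments` values are hypothetical stand-ins
for the five ASCII sequences pulled from one input line.

```swift
// Hypothetical ASCII fragments gathered from different parts of a line.
let fragments: [[UInt8]] = ["chr1", "12345", "A", "T", "PASS"].map { Array($0.utf8) }

// Assemble all fragments in one byte buffer, leaving room for the terminator.
var buffer = [UInt8]()
buffer.reserveCapacity(fragments.reduce(1) { $0 + $1.count })
for fragment in fragments {
    buffer.append(contentsOf: fragment)
}
buffer.append(0) // null terminator required by the C-string initializer

// Decode the null-terminated bytes into a String.
// The buffer is non-empty, so baseAddress is non-nil here.
let viaCString = buffer.withUnsafeBufferPointer { bytes in
    String(cString: bytes.baseAddress!)
}
```

The intermediate `buffer` and the copy made while decoding it are the two
costs described above.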
> 
> I hope that example was clear. If single-character String.append() became
> more efficient, that would reduce the need for the function I’m
> proposing. And if Swift strings were to get short-string optimization it
> would make this all much easier, but I have no idea if that is in the
> cards.
> 
> —CK
> 
> > 
> > Cheers!
> > Zach Waldowski
> > zach at waldowski.me
> > 
> > On Mon, Feb 1, 2016, at 08:36 PM, Charles Kissinger via swift-evolution
> > wrote:
> >> 
> >>> On Feb 1, 2016, at 2:07 PM, Dave Abrahams via swift-evolution <swift-evolution at swift.org> wrote:
> >>> 
> >>> 
> >>> on Mon Feb 01 2016, Zach Waldowski <swift-evolution at swift.org> wrote:
> >>> 
> >>>> Due to the semantics of _StringCore and _StringBuffer (as far as I
> >>>> understand them), such a method would not be more efficient than
> >>>> creating another String with the new initializer and concatenating the
> >>>> two, and would require more significant plumbing changes to
> >>>> _StringBuffer.
> >>> 
> >>> We are very interested in making significant plumbing changes to String, FWIW.
> >>> 
> >> 
> >> In that case, perhaps it would make sense to add String.append() for code
> >> unit sequences over the existing plumbing just for completeness of the
> >> API, on the assumption that efficiency would come later when String gets
> >> its makeover.
> >> 
> >> —CK
> >> 
> >>>> 
> >>>> 
> >>>> It would be good to shop around for this proposal, though; maybe if
> >>>> someone on the core team wants to chime in.
> >>>> 
> >>>> Cheers,
> >>>> Zachary Waldowski
> >>>> zach at waldowski.me
> >>>> 
> >>>> On Mon, Feb 1, 2016, at 03:07 AM, Charles Kissinger wrote:
> >>>>> It occurred to me that this proposal provides a way to efficiently
> >>>>> initialize Strings from UTF code unit sequences, but it doesn’t provide a
> >>>>> way to *append* code unit sequences to existing strings. String has an
> >>>>> existing method to append Character sequences:
> >>>>> 
> >>>>> String.appendContentsOf<S : SequenceType where S.Generator.Element ==
> >>>>> Character>(_: S)
> >>>>> 
> >>>>> The equivalent for code units would presumably be:
> >>>>> 
> >>>>> String.appendContentsOf<S : SequenceType, Encoding: UnicodeCodecType
> >>>>> where Encoding.CodeUnit == S.Generator.Element>(_: S, encoding:
> >>>>> Encoding.Type)
> >>>>> 
> >>>>> Is there any interest in adding that to the proposal? It would only have
> >>>>> a lot of value if it could be implemented in a more efficient way than
> >>>>> just calling String.append() for each decoded Character. From looking at
> >>>>> the code, that might not be straightforward.
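
As a rough sketch of the shape such a method might take in current Swift
(the name `append(contentsOf:encoding:)` is hypothetical and not part of the
standard library), here is the naive version that decodes into a temporary
String and appends it. That temporary is exactly the overhead a real
implementation would want to avoid.

```swift
extension String {
    /// Hypothetical method, for illustration only: decodes `codeUnits`
    /// using `encoding` and appends the result to this string.
    mutating func append<S: Sequence, Encoding: Unicode.Encoding>(
        contentsOf codeUnits: S,
        encoding: Encoding.Type
    ) where S.Element == Encoding.CodeUnit {
        // Naive approach: build a temporary String, then append it.
        // Invalid code unit sequences are repaired during decoding.
        self.append(String(decoding: Array(codeUnits), as: encoding))
    }
}

// Usage: append two UTF-8 code units (the ASCII digits "4" and "2").
var label = "id="
label.append(contentsOf: [0x34, 0x32] as [UInt8], encoding: UTF8.self)
```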
> >>>>> 
> >>>>> —CK
> >>>>> 
> >>>>>> On Jan 26, 2016, at 3:14 PM, Zach Waldowski via swift-evolution <swift-evolution at swift.org> wrote:
> >>>>>> 
> >>>>>> Since this seems to have gone quiet, and the code was already done, I've
> >>>>>> posted the PR to Swift itself:
> >>>>>> 
> >>>>>> https://github.com/apple/swift/pull/1109
> >>>>>> 
> >>>>>> The existing proposal PR:
> >>>>>> 
> >>>>>> https://github.com/apple/swift-evolution/pull/101
> >>>>>> 
> >>>>>> -- 
> >>>>>> Sincerely,
> >>>>>> Zachary Waldowski
> >>>>>> zach at waldowski.me
> >>>>>> 
> >>>>>> On Wed, Jan 20, 2016, at 06:08 PM, Zach Waldowski via swift-evolution
> >>>>>> wrote:
> >>>>>>> Thanks, Dave.
> >>>>>>> 
> >>>>>>> I definitely wasn't hard to convince on this. The change has already
> >>>>>>> been made to the proposal, its PR, and the pending PR to the stdlib.
> >>>>>>> 
> >>>>>>> Cheers!
> >>>>>>> Zach Waldowski
> >>>>>>> zach at waldowski.me
> >>>>>>> 
> >>>>>>> On Wed, Jan 20, 2016, at 01:23 PM, Dave Abrahams via swift-evolution
> >>>>>>> wrote:
> >>>>>>>> 
> >>>>>>>> on Fri Jan 15 2016, Zach Waldowski via swift-evolution
> >>>>>>>> <swift-evolution-m3FHrko0VLzYtjvyW6yDsg-AT-public.gmane.org> wrote:
> >>>>>>>> 
> >>>>>>>>> Charles -
> >>>>>>>>> 
> >>>>>>>>> I shared the same concerns, and mention them in the proposal. I thought
> >>>>>>>>> `decode(_:as:)` to be too simple to the point of being
> >>>>>>>>> non-descriptive,
> >>>>>>>> 
> >>>>>>>> The names of methods don't need to be descriptive.  It's the use-sites
> >>>>>>>> (and secondarily, declarations) that need to be clear.  Trying to make
> >>>>>>>> the names of methods descriptive by themselves just hurts readability at
> >>>>>>>> the use-site.
> >>>>>>>> 
> >>>>>>>> -Dave
> >>>>>>>> 
> >>>>>>>> _______________________________________________
> >>>>>>>> swift-evolution mailing list
> >>>>>>>> swift-evolution at swift.org
> >>>>>>>> https://lists.swift.org/mailman/listinfo/swift-evolution
> >>>>> 
> >>> 
> >>> -- 
> >>> -Dave
> >>> 
> >> 
> 

