[swift-dev] Very slow Set<String>(arrayOfStrings) for some arrayOfStrings
Jens Persson
jens at bitcycle.com
Wed Mar 2 18:02:13 CST 2016
The following slight modification of the extension, however, makes
test(strings) run as fast as test(caseSwappedStrings) (i.e. 0.07 seconds):
extension String {
    func componentsSeparatedByNewLineCharacter() -> [String] {
        var lines = [String]()
        var currStr = String.UnicodeScalarView()
        let newLineUCS = UnicodeScalar("\n")
        for ucs in self.unicodeScalars {
            switch ucs {
            case newLineUCS:
                lines.append(String(currStr) + " ")
                currStr.removeAll()
            default:
                currStr.append(ucs)
            }
        }
        return lines
    }
}
Note that the only change is that a space is appended to each line
(the + " " in the append call).
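
For readers on a current toolchain, the same scalar-by-scalar loop might be
sketched as below. This is only an illustration in Swift 5 syntax, not the
code that was measured in this thread; the name linesSplitOnNewline is made
up for the example, and unlike the extension above it also keeps any text
that follows the last "\n". No claim is made about which storage or fast
path the resulting strings end up using.

```swift
extension String {
    // Hypothetical name; splits on "\n" by walking the unicode scalars.
    func linesSplitOnNewline() -> [String] {
        var lines = [String]()
        var current = String.UnicodeScalarView()
        for scalar in unicodeScalars {
            if scalar == "\n" {
                // End of a line: build a fresh String from the collected scalars.
                lines.append(String(current))
                current.removeAll()
            } else {
                current.append(scalar)
            }
        }
        // Assumption for this sketch: keep a final line that has no trailing "\n".
        if !current.isEmpty {
            lines.append(String(current))
        }
        return lines
    }
}

let sample = "alpha\nbeta\ngamma"
print(sample.linesSplitOnNewline())  // prints ["alpha", "beta", "gamma"]
```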
So I guess that for some reason adding that space sets the String's isASCII
bit. The strange thing is that if I then try to remove the space again, no
matter how I do it, test(strings) goes back to taking 2.3 seconds
(instead of 0.07 seconds).
It's almost as if there is a cached version of the original String (one
that has its isASCII bit cleared) that is being reused as soon as I modify
the new String in a way that makes it the same as the original.
If so, I'm guessing that it is the String.init(contentsOfFile: path) that
is to blame (it's making an NSString-backed String with its isASCII bit
cleared), because I'm unable to reproduce the slow (now 2.3 seconds)
behavior without loading from disk.
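
One way to check the "only when loaded from disk" observation is to time
Set construction on strings built purely in memory. The sketch below is a
guess at what such a check could look like on a current toolchain (Swift
5.1 or later): timeSetConstruction is a hypothetical stand-in for the
test(_:) helper used in this thread, and nativeCopy is a hypothetical
helper that calls the standard library's makeContiguousUTF8() to force
native contiguous storage. Whether this affects the isASCII behavior
discussed here is an assumption, not something measured.

```swift
import Foundation

// Hypothetical stand-in for the thread's test(_:) helper:
// builds a Set from the array and reports how long that took.
func timeSetConstruction(_ strings: [String]) -> (count: Int, seconds: Double) {
    let start = Date()
    let set = Set(strings)
    return (set.count, Date().timeIntervalSince(start))
}

// Hypothetical helper: returns a copy backed by contiguous native UTF-8
// storage, copying out of any bridged (NSString-backed) storage if needed.
func nativeCopy(_ s: String) -> String {
    var copy = s
    copy.makeContiguousUTF8()
    return copy
}

// ASCII-only input built in memory, so nothing is loaded from disk.
let inMemory = (0..<100_000).map { "line number \($0)" }
let native = inMemory.map(nativeCopy)

let result = timeSetConstruction(native)
print("\(result.count) unique strings in \(result.seconds) seconds")
```

Running the same timing on strings obtained from String.init(contentsOfFile:)
before and after nativeCopy would show whether the storage is what matters.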
/Jens
On Wed, Mar 2, 2016 at 10:02 PM, Jens Persson <jens at bitcycle.com> wrote:
> Interesting, thanks!
> I tried using this extension
> extension String {
>     func componentsSeparatedByNewLineCharacter() -> [String] {
>         var lines = [String]()
>         var currStr = String.UnicodeScalarView()
>         let newLineUCS = UnicodeScalar("\n")
>         for ucs in self.unicodeScalars {
>             switch ucs {
>             case newLineUCS:
>                 lines.append(String(currStr))
>                 currStr.removeAll()
>             default:
>                 currStr.append(ucs)
>             }
>         }
>         return lines
>     }
> }
> instead of componentsSeparatedByString("\n")
>
> This made the slow non-caseSwapped test(strings) run in 2.3 seconds
> instead of the previous 9.5 seconds, but that is still relatively slow
> compared to the 0.066 seconds of the test(caseSwappedStrings).
>
> Is there a way to make sure a String in Swift has the isASCII bit set
> (provided the original string contains only ASCII of course)?
>
> /Jens
>
>
> On Wed, Mar 2, 2016 at 7:24 PM, Daniel Duan via swift-dev <
> swift-dev at swift.org> wrote:
>
>> Arnold Schwaighofer via swift-dev <swift-dev <at> swift.org> writes:
>>
>> >
>> > That is the difference between a “String” type instance that can use the
>> > ascii fast path and NSString backed “String” type instances.
>> >
>>
>> This makes total sense now :) I was very mystified by this issue and
>> thought
>> it's a weird bias in the hashing function at some point.
>>
>> Thanks for the insight Arnold.
>> _______________________________________________
>> swift-dev mailing list
>> swift-dev at swift.org
>> https://lists.swift.org/mailman/listinfo/swift-dev
>>
>
>
>
> --
> bitCycle AB | Smedjegatan 12 | 742 32 Östhammar | Sweden
> http://www.bitcycle.com/
> Phone: +46-73-753 24 62
> E-mail: jens at bitcycle.com
>
>