[swift-dev] Very slow Set<String>(arrayOfStrings) for some arrayOfStrings

Jens Persson jens at bitcycle.com
Fri Mar 4 02:48:43 CST 2016


Are you already tracking/looking into this in someway or should I file a
bug?
/Jens

On Thu, Mar 3, 2016 at 5:55 AM, Arnold <aschwaighofer at apple.com> wrote:

> Yes this does not explain what you are seeing. (I could repeat this on my
> end)
>
> If you look at the two runs under instruments: one run with the slow
> strings file and one with the fast strings file (I tested with space
> appended to each line); you will notice that we spend a lot more time in
> the Set's find function doing NSString comparisons. The only explanation I
> have for this is hash collisions leading to more comparisons as we are
> probing non empty buckets. Why one of the files would trigger this and the
> other not makes no sense to me.
> This will need looking at.
>
> Sent from my iPhone
>
> On Mar 2, 2016, at 6:43 PM, Jens Persson <jens at bitcycle.com> wrote:
>
> Just thought I should mention that I tried another thing:
>
> Using my original example:
>
> http://sloppyfocus.com/slowSetFromParticularButSeeminglyNormalArrayOfStrings.html
> Keeping the code as it is but adding a space to the end of every line in
> the actual text file (strings.txt).
>
> This actually makes the slow test(strings) run almost as fast as the fast
> test(caseSwappedStrings), 0.09 seconds, so about a hundred times faster
> just because the text file has an extra space at the end of each line.
>
> I just can't see how Arnold Schwaighofer's explanation explains that ...
> An added space character at the end of each line of that text file is just
> one more ASCII character, how can that make test(strings) so much faster?
>
> And other operations on the text file makes the slow test faster too, like
> eg:
> Inserting a space as the first char of each line, 0.9 seconds.
> Removing all spaces, 0.9 seconds.
> Replacing all spaces with "-" (replacing the dashes back to spaces made it
> slow again), 0.9 seconds.
> Replacing all "a" with "*", 0.9 seconds.
> Swapping case (now in the actual file instead of programmatically), 2.3
> seconds.
> (I only tried operations that kept the resulting strings unique and their
> characters plain ASCII.)
>
> /Jens
>
> On Wed, Mar 2, 2016 at 6:28 PM, Arnold Schwaighofer <
> aschwaighofer at apple.com> wrote:
>
>> That is the difference between a “String” type instance that can use the
>> ascii fast path and NSString backed “String” type instances.
>>
>> When the code translates the string (the map function that calls upper
>> case) it becomes a natively backed “String” that isASCII.
>>
>> // The strings in strings.txt are not strange in any way, they all match
>> ^[a-zA-Z ]+$.
>>
>> Set the isASCII bit and things are blazing fast.
>>
>> The other path is that componentsSeparatedByString(separator: String) ->
>> [String] returns an “[String]” where every “String” is backed by
>> “NSString”. When it comes to do comparison we call a NSString function on
>> the NSString and this is where the difference in performance comes from.
>>
>>
>>
>>
>> > On Mar 2, 2016, at 12:15 AM, Jens Persson via swift-dev <
>> swift-dev at swift.org> wrote:
>> >
>> > I guess it would be possible to isolate the problem by analyzing my
>> example, and thereby making it possible to write a smaller benchmark which
>> doesn't need that big textfile. I didn't manage to do that however, hence
>> my question about the problem here.
>> > /Jens
>> >
>> > On Wed, Mar 2, 2016 at 9:11 AM, Jens Persson <jens at bitcycle.com> wrote:
>> > Thanks, but I'll have to invite anyone with more time and experience to
>> do that.
>> >
>> > One (of many) thing(s) I wouldn't know how include/handle is the 1.3 MB
>> text file for the particular arrayOfStrings that makes Set(arrayOfStrings)
>> slow. It seems a bit unnecessary/bloating to put it in the code base.
>> >
>> > /Jens
>> >
>> > On Tue, Mar 1, 2016 at 11:58 PM, Nadav Rotem <nrotem at apple.com> wrote:
>> > Hi Jens,
>> >
>> > Thanks for reporting this issue. I don’t know what’s going on but we’ll
>> take a look.
>> >
>> > I think it would be great if you could add this program as a benchmark
>> under swift/benchmarks/.  This will allow us to track the performance of
>> this test and ensure that we don’t regress.
>> >
>> > Thanks,
>> > Nadav
>> >
>> >> On Mar 1, 2016, at 5:01 AM, Jens Persson via swift-dev <
>> swift-dev at swift.org> wrote:
>> >>
>> >> For some particular arrays of strings, creating a
>> Set<String>(arrayOfStrings) takes about 100 to 200 times longer than for
>> other very similar arrays of strings (equally many unique simple [a-zA-Z ]+
>> strings).
>> >>
>> >> I've put together a tiny program to demonstrate the problem here:
>> >>
>> http://sloppyfocus.com/slowSetFromParticularButSeeminglyNormalArrayOfStrings.html
>> >>
>> >> Is this due to a bug / performance problem in Set or can it be
>> explained (and solved) in some way?
>> >>
>> >> /Jens
>> >>
>> >> _______________________________________________
>> >> swift-dev mailing list
>> >> swift-dev at swift.org
>> >> https://lists.swift.org/mailman/listinfo/swift-dev
>> >
>> >
>> >
>> >
>> > --
>> > bitCycle AB | Smedjegatan 12 | 742 32 Östhammar | Sweden
>> > http://www.bitcycle.com/
>> > Phone: +46-73-753 24 62
>> > E-mail: jens at bitcycle.com
>> >
>> >
>> >
>> >
>> > --
>> > bitCycle AB | Smedjegatan 12 | 742 32 Östhammar | Sweden
>> > http://www.bitcycle.com/
>> > Phone: +46-73-753 24 62
>> > E-mail: jens at bitcycle.com
>> >
>> > _______________________________________________
>> > swift-dev mailing list
>> > swift-dev at swift.org
>> > https://lists.swift.org/mailman/listinfo/swift-dev
>>
>>
>
>
> --
> bitCycle AB | Smedjegatan 12 | 742 32 Östhammar | Sweden
> http://www.bitcycle.com/
> Phone: +46-73-753 24 62
> E-mail: jens at bitcycle.com
>
>


-- 
bitCycle AB | Smedjegatan 12 | 742 32 Östhammar | Sweden
http://www.bitcycle.com/
Phone: +46-73-753 24 62
E-mail: jens at bitcycle.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-dev/attachments/20160304/13dd6a66/attachment.html>


More information about the swift-dev mailing list