[swift-dev] Reducing the size of Swift binaries by shortening symbols

Sun Dec 20 15:52:59 CST 2015

I would love to add that uploading iOS apps is much bigger with swift 

Sent from my iPhone

> On 20 Dec 2015, at 15:35, Stephen Canon via swift-dev <swift-dev at swift.org> wrote:
> 
> Nadav, can you clarify what we’re really trying to accomplish here?  "Smaller binaries” isn’t too important of a goal in and of itself.
> 
> Are we trying to:
> – reduce storage used on disk
> – reduce load time
> – reduce loaded memory footprint
> – make emitting swift binaries more efficient
> – something else?
> 
> Yes, I know, “all of the above”, but understanding something about what’s most important would help evaluate the proposal.
> 
> It’s also worth keeping in mind that iOS and OS X have been aggressively adopting pervasive system-wide compression both on disk and in memory.  This trend will continue, and it makes it quite a bit less important for individual components to explicitly adopt compression techniques themselves, except in cases where there’s a lot of special structure that those components can leverage to get better compression than a general-purpose lossless compressor can manage (images and sound are the two obvious examples of this, but also cases like huge arrays of floating-point data where the low-order bits don’t matter, etc).  Linux hasn’t been as aggressive about doing this yet, but pervasive system-level compression is The Future.
> 
> – Steve
> 
>> On Dec 20, 2015, at 5:17 AM, Dmitri Gribenko <gribozavr at gmail.com> wrote:
>> 
>> + Stephen Canon, because he probably has good ideas in this domain.
>> 
>> On Fri, Dec 18, 2015 at 3:42 PM, Nadav Rotem via swift-dev <swift-dev at swift.org> wrote:
>>> 
>>> What’s next?
>>> 
>>> The small experiment I described above showed that compressing the names in the string table has a huge potential for reducing the size of swift binaries. I’d like for us (swift-developers) to talk about the implications of this change and start working on the two tasks of tightening our existing mangling format and on implementing a new compression layer on top. 
>> 
>> Hi Nadav,
>> 
>> This is a great start that shows that there is a potential for improvement in our mangled names!
>> 
>> To make this effort more visible, I would suggest creating a bug on https://bugs.swift.org/ .
>> 
>> I think we survey existing solutions that industry has developed for compressing short messages.  What comes to mind:
>> 
>> - header compression in HTTP2:
>> https://http2.github.io/http2-spec/compression.html
>> 
>> - PPM algorithms are one of the best-performing compression algorithms for text.
>> 
>> - Arithmetic coding is also a natural starting point for experimentation.
>> 
>> Since the input mangled name also comes in a restricted character set, we could also remove useless bits first, and try an existing compression algorithm on the resulting binary string.
>> 
>> We should also build a scheme that uses shortest one between the compressed and non-compressed names.
>> 
>> For running experiments it would be useful to publish a sample corpus of mangled names that we will be using for comparing the algorithms and approaches.
>> 
>> I also have a concern about making mangled names completely unreadable.  Today, I can frequently at least get a gist of what the referenced entity is without a demangler.  What we could do is make the name consist of a human-readable prefix that encodes just the base name and a compressed suffix that encodes the rest of the information.
>> 
>> _T<length><class name><length><method name><compressed suffix>
>> 
>> We would be able to use references to the class and the method name from the compressed part, so that character data isn't completely wasted.
>> 
>> This scheme that injects human-readable parts will also allow the debugger to quickly match the names without the need to decompress them.
>> 
>> We should also investigate improving existing mangling scheme to produce shorter results.  For example, one idea that comes to mind is using base-60 instead of base-10 for single-digit numbers that that specify identifier length, falling back to base-10 for longer numbers to avoid ambiguity.  This would save one character for every identifier longer than 9 characters and shorter than 60, which is actually the common case.
>> 
>> Dmitri
>> 
>> -- 
>> main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
>> (j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr at gmail.com>*/
> 
> 
> _______________________________________________
> swift-dev mailing list
> swift-dev at swift.org
> https://lists.swift.org/mailman/listinfo/swift-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-dev/attachments/20151220/bb758f96/attachment.html>