[swift-server-dev] [swift-dev] Performance Refinement of Data

Helge Heß me at helgehess.eu
Tue Nov 29 14:16:45 CST 2016


On Nov 29, 2016, at 4:57 PM, Philippe Hausler <phausler at apple.com> wrote:
> Furthermore another very common task is to build up buffers of data (appending small chunks; buckets in the brigade in the parlance of Apache).

I was originally confused by “build up buffers of data”; I take it you mean “build up buffers of buffers of data” :-)

This is pretty common, yes. Notably this is often “small buffers of large buffers of data”. For example, if you have a markdown template like this:

  … tons of html …
  {{ today }}
  … tons of html …

It may be desirable to represent that as a brigade of

  [FileData offset:0 length:nn]       // use sendfile()
  [MemoryData]                        // regular send() here
  [FileData offset:nn + 11 length:-1] // sendfile()

E.g. in a CalDAV file-based server that would be pretty common. If you fetch batches of CalDAV events, you would ideally stream them directly off-store, with a lot of static MemoryData for the XML wrapping and very few dynamic Data objects for things like URLs or sync-tokens.
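
A minimal sketch of how such a brigade could be modelled, in case it helps the discussion. `Bucket` and `makeBrigade` are hypothetical names made up for this illustration, not an existing API, and the sketch only handles the first occurrence of the placeholder:

```swift
// Hypothetical bucket type for illustration only; not an existing API.
enum Bucket: Equatable {
    /// A byte range of a file on disk; `length == nil` means "to end of file".
    /// A server could hand such a range to sendfile().
    case file(offset: Int, length: Int?)
    /// Bytes held in memory, to be written with a regular send().
    case memory([UInt8])
}

/// Splits a template around the first occurrence of `placeholder` and
/// returns a three-bucket brigade: file range, in-memory value, file range.
/// If the placeholder is not found, the whole file is a single bucket.
func makeBrigade(template: [UInt8], placeholder: [UInt8], value: [UInt8]) -> [Bucket] {
    guard !placeholder.isEmpty, template.count >= placeholder.count else {
        return [.file(offset: 0, length: nil)]
    }
    for start in 0...(template.count - placeholder.count) {
        let end = start + placeholder.count
        if template[start..<end].elementsEqual(placeholder) {
            return [
                .file(offset: 0, length: start),
                .memory(value),
                .file(offset: end, length: nil)
            ]
        }
    }
    return [.file(offset: 0, length: nil)]
}
```

The placeholder `{{ today }}` is 11 bytes long, which is where the `+ 11` in the second file offset comes from.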

> There is a caveat here of course; if the end result will not be needed to be contiguous DispatchData may be a better type in the end for this but that of course depends on your usage.

I think this is the common case on the server today. Loading everything into memory is often the naive/simpler way to get stuff running quickly. But if you look at it, many of the backend servers are just fancy proxies primarily translating between different representations of the data (like PostgreSQL-wire protocol to JSON). More often than not you don’t really need to spool up the data here but can just transform on the fly.
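
To illustrate what “transform on the fly” could look like, here is a rough sketch that turns rows into a JSON array fragment by fragment instead of spooling the whole result. `streamJSONArray` and the `emit` callback are assumptions made up for this example; real code would also have to escape the field values, which this sketch deliberately skips:

```swift
/// Emits a JSON array of string arrays one fragment at a time.
/// `rows` can be any sequence (e.g. rows arriving from a database cursor),
/// and `emit` stands in for "write this chunk to the client".
/// Note: field values are NOT escaped here; this is only a sketch.
func streamJSONArray<Rows: Sequence>(_ rows: Rows, emit: (String) -> Void)
    where Rows.Element == [String]
{
    emit("[")
    var isFirst = true
    for row in rows {
        if !isFirst { emit(",") }
        isFirst = false
        // Only the current row is held in memory at any time.
        let fields = row.map { "\"" + $0 + "\"" }.joined(separator: ",")
        emit("[" + fields + "]")
    }
    emit("]")
}
```

Nothing beyond the current row is buffered, so memory use stays flat regardless of how many rows pass through.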

This is often (not always) a little different on the client side. E.g. if you load an image from a server via HTTP, you quite likely need it in full and place it into an on-disk cache anyways. And then you just grab one mmap-data of that resource once it is fully downloaded.


> There is a reason why we have both contiguous data (NSData) and discontiguous data (_NSDispatchData/OS_dispatch_data/DispatchData). So those portions aside of differing usage patterns I will focus on the contiguous case since that is what my proposal is about.

I’m a little confused about that because it seems to conflict with what Alex has been saying. I think what we need for the server effort is clearly ‘discontiguous data’. Is this going to be supported by `Data` (as in NSData), as something which we can pass to readv and writev (and companions)?
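
For reference, this is roughly what passing discontiguous chunks to writev looks like from Swift today. It is only a sketch: `gatherWrite` is a hypothetical helper, plain `[UInt8]` arrays stand in for whatever buffer type we end up with, and it assumes `import Foundation` re-exports `writev` and `iovec` on your platform:

```swift
import Foundation

/// Writes all chunks with a single writev() call, without first copying
/// them into one contiguous buffer. Returns writev()'s result
/// (bytes written, or -1 on error). Partial writes are not retried here.
func gatherWrite(_ chunks: [[UInt8]], to fd: Int32) -> Int {
    var iovecs = [iovec]()
    iovecs.reserveCapacity(chunks.count)

    // Each chunk's pointer is only valid inside its withUnsafeBytes closure,
    // so the closures are nested recursively and writev() is called innermost.
    func descend(_ index: Int) -> Int {
        if index == chunks.count {
            return writev(fd, iovecs, Int32(iovecs.count))
        }
        return chunks[index].withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
            iovecs.append(iovec(
                iov_base: UnsafeMutableRawPointer(mutating: raw.baseAddress),
                iov_len: raw.count))
            return descend(index + 1)
        }
    }
    return descend(0)
}
```

The recursion is the awkward part: without a type that can vend stable pointers to all of its regions at once, every chunk needs its own closure scope.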

hh


