[swift-server-dev] Prototype of the discussed HTTP API Spec
Paulo Faria
paulo at zewo.io
Thu Jun 1 11:54:20 CDT 2017
Oh! And just to show you guys how that is totally possible. One of the
seeds for this project is Open Swift. Open Swift was a project which aimed
to get the high level frameworks to share protocols and concrete types
which are common to any web framework. Here's the link:
https://github.com/open-swift
At its peak Open Swift had 3 frameworks based on completely different IO
models.
Zewo <https://github.com/Zewo/Zewo> - libmill with non-blocking synchronous
APIs (now using libdill)
Vapor <https://github.com/vapor/vapor> - libdispatch with blocking
synchronous APIs over threads
Slimane <https://github.com/noppoMan/Slimane> - libuv with non-blocking
asynchronous APIs
It also had Swifton <https://github.com/necolt/Swifton> and Flamingo, which
later changed its name to Champagne <https://github.com/hyperoslo/Champagne>,
but these frameworks were also based on Zewo (libmill).
On 1 June 2017 at 13:40, Paulo Faria <paulo at zewo.io> wrote:
> I think I understand what Michael is trying to say, but I think we're not
> in the realm of just HTTP anymore. This is more about the underlying IO
> model. Strictly speaking the base API for POSIX non-blocking IO *IS*
> synchronous. The only thing is that the IO call might return EAGAIN or
> EWOULDBLOCK in case the call would block. Now, on a higher level how you
> deal with EAGAIN or EWOULDBLOCK is what will define if the higher-level API is
> synchronous or asynchronous. I think what Michael is suggesting is that we
> provide *for the base API* (Networking/IO) synchronous APIs with access to
> the underlying file descriptor, which would allow higher level frameworks
> to use either libdispatch/libdill/libuv/libwhatever to write their
> scheduling mechanism and then define if their version of the higher level
> APIs should be synchronous or asynchronous. This doesn't exclude the goal
> of this group to provide a libdispatch based default implementation.
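A minimal sketch of the point above, written in Python only because its `os` and `select` modules map directly onto the POSIX calls. The helper `try_read` is hypothetical and not part of any proposed API. It shows a synchronous read that never blocks, with the caller deciding how to wait on the exposed file descriptor:

```python
import os
import select

# Hypothetical helper, not part of any proposed API: a synchronous read that
# never blocks. It returns None where POSIX read() would fail with
# EAGAIN/EWOULDBLOCK.
def try_read(fd, max_bytes=4096):
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:  # Python raises this for EAGAIN and EWOULDBLOCK
        return None

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)

first = try_read(read_fd)  # nothing has been written yet, so this would block

os.write(write_fd, b"ping")

# The caller owns the descriptor, so the caller picks the waiting strategy.
# select() is used here; kqueue, epoll, libdispatch or libuv could stand in.
readable, _, _ = select.select([read_fd], [], [], 1.0)
second = try_read(read_fd) if readable else None

os.close(read_fd)
os.close(write_fd)
```

Whether the layer built on top looks synchronous or asynchronous then depends only on what replaces the `select()` call.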
>
> I agree with this. Although the overlap and reuse of code between these
> different frameworks (each based on libdispatch or libdill or libuv, for
> example) would not be that high, there's still value in having them use
> what is possible to be common. For HTTP, separating the message heads from
> the body is a great example. This way HTTPRequestHead and HTTPResponseHead
> can be easily shared between synchronous and asynchronous frameworks since
> this part doesn't involve IO. This means that these frameworks can share
> for example authentication middleware which only depends on the
> Authorization header.
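As a rough illustration of why the heads can be shared, here is a sketch in Python. The field names and the `require_bearer_token` function are assumptions made for this example; only the type names HTTPRequestHead and HTTPResponseHead come from the discussion. The middleware reads the header that carries credentials (`Authorization` in HTTP) and never touches the body or any IO:

```python
from dataclasses import dataclass, field

# Field names below are illustrative assumptions, not the proposed API.
@dataclass
class HTTPRequestHead:
    method: str
    target: str
    headers: dict = field(default_factory=dict)

@dataclass
class HTTPResponseHead:
    status: int
    headers: dict = field(default_factory=dict)

def require_bearer_token(head, expected_token):
    """Return a 401 response head to reject, or None to let the request pass.

    Assumes header names are stored in their canonical capitalisation.
    """
    supplied = head.headers.get("Authorization", "")
    if supplied == "Bearer " + expected_token:
        return None
    return HTTPResponseHead(status=401, headers={"WWW-Authenticate": "Bearer"})

allowed = require_bearer_token(
    HTTPRequestHead("GET", "/", {"Authorization": "Bearer s3cret"}), "s3cret")
rejected = require_bearer_token(HTTPRequestHead("GET", "/"), "s3cret")
```

Because nothing here reads or writes a socket, the same function could sit in front of a blocking, coroutine-based or callback-based body handler.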
>
> This is not actually about sync/async. I think the takeaway is that
> although we are providing a libdispatch-based default implementation,
> that's just a detail. I think what Michael means is that we must allow
> other frameworks based on libdispatch/libuv/libdill/libwhatever to
> implement their approach too. And this means providing synchronous APIs for
> the base IO and exposing the file descriptor so people can choose how they
> are going to poll/schedule for reads and writes. I agree with him 100%.
>
>
> On 1 June 2017 at 13:08, Michael Chiu <hatsuneyuji at icloud.com> wrote:
>
>> Hi Johannes
>>
>>
>> I think I need to clarify something: I’m OK with an asynchronous API that
>> executes synchronously. For example, if the API is something like [[ a. {
>> b() } ; c() ]] and executes as [[ a(); b(); c() ]], it is totally fine since
>> it’s just a synchronous API with syntactic sugar.
>>
>>
>> We actually have a synchronous implementation of the proposed API next to
>> the DispatchIO one that we normally use. The synchronous one uses blocking
>> system calls and only services one request per thread. It's handy for unit
>> testing and for specialised use-cases. The synchronous implementation only
>> uses the following syscalls: open, close, read and write. That's it, so
>> nothing fancy.
>>
>>
>> I think even exposing these APIs to users would be good. No need for fancy
>> support; just including it would be good enough.
>>
>>
>> i.e. you use write as a blocking system call because the file descriptor
>> isn't set to be non-blocking.
>>
>> Just as a side note: you won't be able to repro this issue by replacing
>> the macOS `telnet` with the macOS `nc` (netcat), as netcat will only read
>> more from the socket after it was able to write to it. I.e. the
>> implementation of standard macOS `nc` happens to make your implementation
>> appear non-blocking. But the macOS-provided telnet seems to do the right
>> thing. You can use pjbnc
>> (http://www.chiark.greenend.org.uk/~peterb/linux/pjbnc/) if you prefer,
>> which also doesn't have the same bug as `nc`.
>>
>>
>> As I said, both snippets of code are just sketches, only for proof of
>> concept. But I did miss the kevent write one, that’s for sure.
>>
>>
>>
>> I'd guess that most programmers prefer an asynchronous API with callback
>> (akin to Node.js/DispatchIO) to using the eventing mechanism directly and I
>> was therefore assuming you wanted to build that from kevent() (which is
>> what they're often used for). Nevertheless, kevent() won't make your
>> programming model any nicer than asynchronous APIs and as I mentioned
>> before you can build one from the other in a quite straightforward way.
>> What we don't get from that is ordinary synchronous APIs that don't block
>> kernel threads and that happens to be what most people would prefer
>> eventually. Hence libdill/mill/venice and Zewo :).
>>
>>
>> Johannes, I totally agree with you: an asynchronous API is more
>> intuitive. But since we are providing a low-level API for ppl like Zewo,
>> Perfect, and Kitura, it is not right for us to assume their model of
>> programming.
>>
>> For libdill/mill/venice, even with green threads they will block when
>> there’s nothing to do,
>>
>>
>> If you read in libdill/mill/venice, it will switch the user-level thread
>> to _not_ block a kernel thread. That's the difference and that's what we
>> can't achieve with Swift today (without using UB).
>>
>>
>> I’m quite confused by this one, since a green thread, if that’s what we
>> are referring to, cannot enter the kernel by itself (it can, but what
>> actually happens is that the kernel thread associated with it enters the
>> kernel).
>> So you can’t switch to another user-level thread to avoid blocking a
>> kernel thread.
>> AFAIK most major OSes (Linux, FreeBSD, Solaris, …) adopted the 1:1
>> threading model instead of n:m. I'm not sure about Darwin, but I think it
>> applies to Darwin as well according to an old WWDC video (I could be
>> wrong), hence any user threads (except for green threads) are in fact
>> kernel threads. Since kevent and epoll are designed to block when they
>> should, I don’t think anyone could avoid blocking something.
>>
>> In fact, the examples you listed above all use event APIs internally.
>> Hence I don’t think whether an API will block a kernel thread is a good
>> argument here.
>>
>>
>> kernel threads are a finite resource and most modern networking APIs try
>> hard to only spawn a finite number of kernel threads way smaller than the
>> number of connections handled concurrently. If you use Dispatch as your
>> concurrency mechanism, your thread pool will have a maximum size of 64
>> threads by default on Darwin. (Sure you can spawn more using (NS)Thread
>> from Foundation or pthreads or so)
>>
>>
>> Yes, kernel threads are a finite resource, especially in the 1:1 model, but
>> I’m not sure how that is relevant. My concern about not including a
>> synchronous API is that it makes it impossible for people to write
>> synchronous code with server-side Swift tools, blocking or not, which
>> they might want to do. I’m not saying sync is better, I’m just saying we
>> could give them a chance.
>>
>> And even if such a totally non-blocking programming model existed, it would
>> be expensive, since the kernel would be constantly scheduling a
>> do-nothing-thread. (( If the IO thread of a server-side application needs
>> to do something constantly even though there’s no user and no connection,
>> it sounds like a ghost story to me. ))
>>
>>
>> What is the do-nothing-thread? The IO thread will only be scheduled if
>> there's something to do and then normally the processing starts on that
>> very thread. In systems like Netty they try very hard to reduce hopping
>> between different threads and to only spawn a few. Node.js is the extreme
>> which handles everything on one thread. It will be able to do thousands of
>> connections with only one thread.
>>
>>
>> The kernel has no idea whether a thread has anything to do unless it
>> sleeps or enters the kernel. Unless a thread meets one of these
>> conditions, it will always be scheduled by the kernel.
>>
>> I’m saying that if there exists a truly non-blocking programming model,
>> defined as “never call any ‘wait’ system call”, then any IO thread of
>> that model must constantly poll the kernel. Hence such a thread
>> _cannot_be_scheduled_on_demand, since the thread itself has no idea if it
>> has anything to do. For an IO thread to know that it has something to
>> do, one of the following is needed:
>>
>> 1) An external listener calls a blocking event API and pokes the IO thread
>> on demand.
>> 2) The IO thread constantly polls the kernel.
>> 3) An external listener constantly polls the kernel and pokes the IO
>> thread when ready.
>>
>> 2 and 3 are the do-nothing-thread I’m referring to: they are running,
>> polling, and wasting kernel resources but not actually being productive
>> (when there’s no connection).
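To make the difference between the options concrete, here is a small Python sketch (the function names are made up for illustration). It counts how many read attempts a spinning reader makes while no data is available, compared with a reader that sleeps in a blocking `select()` call until the descriptor is ready:

```python
import os
import select
import threading

def busy_poll_read(fd):
    """Options 2/3: spin on a non-blocking read and count the attempts."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return os.read(fd, 4096), attempts
        except BlockingIOError:
            continue  # nothing to do, yet the thread keeps running

def blocking_wait_read(fd):
    """Option 1: sleep in the kernel until the descriptor is readable."""
    select.select([fd], [], [])  # the thread is descheduled while it waits
    return os.read(fd, 4096), 1

def run(reader):
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    # Simulate a client that sends one byte after 50 ms.
    writer = threading.Timer(0.05, os.write, args=(write_fd, b"x"))
    writer.start()
    data, attempts = reader(read_fd)
    writer.join()
    os.close(read_fd)
    os.close(write_fd)
    return data, attempts

polled_data, polled_attempts = run(busy_poll_read)
waited_data, waited_attempts = run(blocking_wait_read)
```

Both readers receive the same byte; the polling one simply burns CPU time during the 50 ms in which there is nothing to read.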
>>
>>
>> You should definitely check the return value of write(), it's very
>> important. Even if positive you need to handle the case that it's fewer
>> bytes than you wanted it to write. And if negative, the bytes are _lost_,
>> which happens all the time with the current implementation.
>>
>> Anyway, to fix the EAGAIN you'll need to ask kevent() when you can write
>> next.
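A sketch of what that advice amounts to, in Python with `select()` standing in for kevent() so that it runs on any POSIX system. The name `write_all` is an assumption for this example. It handles both short writes and EAGAIN on a non-blocking descriptor:

```python
import os
import select
import socket
import threading

def write_all(fd, data):
    """Write every byte of data to a non-blocking fd.

    Returns how many times the call had to wait after EAGAIN.
    """
    view = memoryview(data)
    waits = 0
    while len(view) > 0:
        try:
            written = os.write(fd, view)  # may accept fewer bytes than offered
            view = view[written:]         # keep only the unsent remainder
        except BlockingIOError:
            # EAGAIN: nothing was written. Wait until writable, then retry.
            waits += 1
            select.select([], [fd], [])
    return waits

left, right = socket.socketpair()
left.setblocking(False)
payload = b"a" * (1 << 20)
received = bytearray()

def drain():
    # Stand-in for a slow client that reads everything the server sends.
    while len(received) < len(payload):
        received.extend(right.recv(65536))

reader = threading.Thread(target=drain)
reader.start()
waits = write_all(left.fileno(), payload)
reader.join()
left.close()
right.close()
```

Ignoring the return value of `os.write` here would silently drop whatever did not fit into the socket buffer.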
>>
>>
>> It was supposed to be a proof-of-concept sketch. As mentioned in the
>> comments of the code, it assumed a single client. Now I’ve improved it
>> so it handles multiple clients while remaining synchronous and
>> non-blocking. EAGAIN is the only “error” it will raise, if you consider it
>> an error, but for me it’s part of non-blocking IO.
>>
>>
>> Foundation/Cocoa is I guess the Swift standard library and they abandon
>> synchronous&blocking APIs completely. I don't think we should create
>> something different (without it being better) than what people are used to.
>>
>> Again, there are two options for IO at the moment:
>> 1) synchronous & blocking kernel threads
>> 2) asynchronous/inversion of control & not blocking kernel threads
>>
>> Even though I would love a synchronous programming model, I'd choose
>> option (2) because the drawbacks of (1) are just too big. The designers of
>> Foundation/Cocoa/Netty/Node.js/many more have made the same decision.
>> Not saying all other options aren't useful but I'd like the API to be
>> implementable with high-performance and not requiring the implementors to
>> block a kernel thread per connection.
>>
>>
>> To be honest, I would choose 2 as well. But we are not in a
>> pick-one-of-two situation. The main difference between us and
>> Netty/Node.js is that ppl use them to write a server, whereas what we are
>> doing is writing something ppl use to write something like Netty and
>> Node.js. So it is reasonable to think there’s demand for a lower-level,
>> synchronous API, despite the possible “drawbacks” they might encounter.
>>
>> Maybe we have some misunderstanding here. I’m not talking about a
>> synchronous API that happens to be able to handle a vector of sockets in
>> a single call without blocking anything. I’m talking about a synchronous
>> API that does just one simple thing: read/write in a synchronous way,
>> whether it would block or not. If it would block, just let the caller know
>> by throwing an exception; that way the API call itself will not block
>> anything.
>>
>> Cheers,
>> Michael.
>>
>>
>