[swift-server-dev] Prototype of the discussed HTTP API Spec

Thu Jun 1 12:09:48 CDT 2017

> 100% agreed. I don't think we even decided how the default impl should be
implemented.

Yeah, I don't think we ever *officially* decided on that. But the core team
discussed that we would likely provide an implementation based on
libdispatch, since it's the official Apple library for concurrency, and
based on the first proposal. This is likely what's going to happen, haha.

On 1 June 2017 at 14:06, Johannes Weiss <johannesweiss at apple.com> wrote:

> Hi Paulo,
>
> > On 1 Jun 2017, at 5:40 pm, Paulo Faria <paulo at zewo.io> wrote:
> >
> > I think I understand what Michael is trying to say, but I think we're
> not in the realm of just HTTP anymore. This is more about the underlying IO
> model. Strictly speaking the base API for POSIX non-blocking IO *IS*
> synchronous. The only thing is that the IO call might return EAGAIN or
> EWOULDBLOCK in case the call would block. Now, on a higher level how you
> deal with EAGAIN or EWOULDBLOCK is what will define if the higher-level API is
> synchronous or asynchronous. I think what Michael is suggesting is that we
> provide *for the base API* (Networking/IO) synchronous APIs with access to
> the underlying file descriptor, which would allow higher level frameworks
> to use either libdispatch/libdill/libuv/libwhatever to write their
> scheduling mechanism and then define if their version of the higher level
> APIs should be synchronous or asynchronous. This doesn't exclude the goal
> of this group to provide a libdispatch based default implementation.
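
A minimal C sketch of the point above, under stated assumptions: the call on a non-blocking descriptor is itself synchronous, and it simply reports EAGAIN/EWOULDBLOCK when it would otherwise block. The helper names `set_nonblocking` and `empty_pipe_read_would_block` are hypothetical and are not part of any proposed API; a pipe stands in for a socket.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical demo helper: reads from an empty, non-blocking pipe.
 * Returns 1 if read() reported "would block", 0 if it did not,
 * and -1 if the setup itself failed. */
static int empty_pipe_read_would_block(void) {
    int fds[2];
    char buf[16];
    if (pipe(fds) == -1) return -1;
    if (set_nonblocking(fds[0]) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    /* Nothing has been written, so a blocking read would hang here.
     * In non-blocking mode the call returns immediately with -1. */
    ssize_t n = read(fds[0], buf, sizeof buf);
    int saved = errno; /* keep errno before close() can change it */
    close(fds[0]);
    close(fds[1]);
    return (n == -1 && (saved == EAGAIN || saved == EWOULDBLOCK)) ? 1 : 0;
}
```

What the caller does after seeing EAGAIN (retry later, register with kevent/epoll, yield a coroutine) is the part that makes the higher-level API synchronous or asynchronous.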
>
> Agreed. Just wanted to point out that there's no libdispatch built into
> the APIs I was proposing. Sure, we have _one_ implementation on top of
> Dispatch but it can be done completely differently. We have a synchronous
> one and Helge even made one on top of Apache, also synchronous.
>
> I believe this API is totally implementable with
> libdispatch/libdill/libuv/libwhatever.
>
> Sure, the group will probably come up with one reference
> implementation but I don't think there's anything wrong with implementing
> it again on top of other low-level libraries.
>
>
> > I agree with this. Although the overlap and reuse of code between these
> different frameworks (each based on libdispatch or libdill or libuv, for
> example) would not be that high, there's still value in having them use
> what is possible to be common. For HTTP, separating the message heads from
> the body is a great example. This way HTTPRequestHead and HTTPResponseHead
> can be easily shared between synchronous and asynchronous frameworks since
> this part doesn't involve IO. This means that these frameworks can share
> for example authentication middleware which only depends on the
> Authorization header.
>
> that too. And if the lower-level implementation provides the API that
> we're coming up with, then even the web apps could be reused.
>
>
> > This is not actually about sync/async. I think the takeaway is that,
> although we are providing a libdispatch based default implementation,
> that's just a detail.
>
> 100% agreed. I don't think we even decided how the default impl should be
> implemented.
>
>
> > I think what Michael means is that we must allow other frameworks based
> on libdispatch/libuv/libdill/libwhatever to implement their approach too.
> And this means providing synchronous APIs for the base IO and exposing the
> file descriptor so people can choose how they are going to poll/schedule
> for reads and writes. I agree with him 100%.
>
> but the base APIs are orthogonal to the HTTP API, right? The HTTP API is
> implemented using some underlying mechanisms, be it libdill/mill/venice,
> kevent/epoll, Apache, Perfect-Net, DispatchSource, or DispatchIO. But as
> you say, that's a detail.
>
>
> -- Johannes
>
>
> >
> >
> > On 1 June 2017 at 13:08, Michael Chiu <hatsuneyuji at icloud.com> wrote:
> > Hi Johannes
> >
> >>>
> >>> I think I need to clarify something: I’m ok with an asynchronous API
> that executes synchronously. For example, if the API is something like [[ a.
> {  b() } ; c() ]] and executes as [[ a(); b(); c() ]], it is totally fine
> since it’s just a synchronous API with syntactic sugar.
> >>
> >> We actually have a synchronous implementation of the proposed API next
> to the DispatchIO one that we normally use. The synchronous one uses
> blocking system calls and only services one request per thread. It's handy
> for unit testing and for specialised use-cases. The synchronous
> implementation only uses the following syscalls: open, close, read and
> write. That's it, so nothing fancy.
> >
> > I think even exposing these APIs to users would be good. No need for fancy
> support; just include them and it will be good enough.
> >
> >>
> >> ie. you use write as a blocking system call because the file descriptor
> isn't set to be non-blocking.
> >>
> >> Just as a side note: You won't be able to repro this issue by replacing
> the macOS `telnet` with the macOS `nc` (netcat), as netcat will only read
> more from the socket after it was able to write to it. I.e. the implementation of
> standard macOS `nc` happens to make your implementation appear
> non-blocking. But the macOS-provided telnet seems to do the right thing.
> You can use pjbnc (http://www.chiark.greenend.org.uk/~peterb/linux/pjbnc/)
> if you prefer, which also doesn't have the same bug as `nc`.
> >
> > As I said, both snippets of code are just proof-of-concept sketches.
> But I did miss the kevent write one, that’s for sure.
> >
> >
> >>
> >>>> I'd guess that most programmers prefer an asynchronous API with
> callback (akin to Node.js/DispatchIO) to using the eventing mechanism
> directly and I was therefore assuming you wanted to build that from
> kevent() (which is what they're often used for). Nevertheless, kevent()
> won't make your programming model any nicer than asynchronous APIs and as I
> mentioned before you can build one from the other in a quite
> straightforward way. What we don't get from that is ordinary synchronous
> APIs that don't block kernel threads and that happens to be what most
> people would prefer eventually. Hence libdill/mill/venice and Zewo :).
> >>>
> >>> Johannes, I totally agree with you. An asynchronous API is more
> intuitive. But since we are providing a low-level API
> for ppl like Zewo, Perfect, and Kitura, it is not right for us to assume
> their model of programming.
> >>>
> >>> For libdill/mill/venice, even with green threads they will block when
> there’s nothing to do,
> >>
> >> If you read in libdill/mill/venice, it will switch the user-level
> thread to _not_ block a kernel thread. That's the difference and that's
> what we can't achieve with Swift today (without using UB).
> >
> > I’m quite confused by this one, since a green thread, if that’s what we
> are referring to, cannot enter the kernel by itself (it can, but what
> actually happens is that the kernel thread it runs on enters the
> kernel).
> > So you can’t switch to another user-level thread to avoid blocking a kernel
> thread.
> > AFAIK the majority of OSes (Linux, FreeBSD, Solaris, ...) adopted the 1:1 threading
> model instead of n:m. I'm not sure about Darwin, but I think it applies to
> Darwin as well according to an old WWDC video (I could be wrong). Hence any
> user threads (except for green threads) are in fact kernel threads. Since
> kevent and epoll are designed to block when they should, I don’t think
> anyone could avoid blocking something.
> >
> >>> in fact all the examples you listed above use event APIs
> internally. Hence I don’t think whether an API will block a kernel thread is a
> good argument here.
> >>
> >> kernel threads are a finite resource and most modern networking APIs
> try hard to only spawn a finite number of kernel threads way smaller than
> the number of connections handled concurrently. If you use Dispatch as your
> concurrency mechanism, your thread pool will have a maximum size of 64
> threads by default on Darwin. (Sure you can spawn more using (NS)Thread
> from Foundation or pthreads or so)
> >
> > Yes, kernel threads are a finite resource, especially in the 1:1 model, but I’m
> not sure how that is relevant. My concern about not including a synchronous API is
> that it makes it impossible for people to write synchronous code with server-side
> Swift tools, blocking or not, which they might want to do. I’m not
> saying sync is better, I’m just saying we could give them a chance.
> >
> >>> And even if there were such a totally non-blocking programming model, it would be
> expensive since the kernel would be constantly scheduling a do-nothing-thread. ((
> if the IO thread of a server-side application needs to do something
> constantly despite there being no user and no connection, it sounds like a ghost
> story to me )).
> >>
> >> what is the do-nothing-thread? The IO thread will only be scheduled if
> there's something to do and then normally the processing starts on that
> very thread. In systems like Netty they try very hard to reduce hopping
> between different threads and to only spawn a few. Node.js is the extreme
> which handles everything on one thread. It will be able to do thousands of
> connections with only one thread.
> >>
> >
> > The kernel has no idea whether a thread has anything to do unless the thread
> sleeps or enters the kernel. Unless a thread meets one of these conditions, it will
> always be scheduled by the kernel.
> >
> > I’m saying, if there exists a real non-blocking programming model,
> defined as “never call any ‘wait’ system calls”, then any IO thread
> of that model must constantly poll the kernel, hence such a thread
> _cannot_be_scheduled_on_demand since the thread itself has no idea if it
> has anything to do. For an IO thread to know that it has something to do,
> one of the following is needed:
> >
> > 1) An external listener calls a blocking event API and pokes the IO thread
> on demand
> > 2) The IO thread has to constantly poll the kernel
> > 3) An external listener polls the kernel constantly and pokes the IO
> thread when ready.
> >
> > 2 and 3 are the do-nothing-thread I’m referring to, they are running,
> polling, wasting kernel resources but not actually being productive (when
> there’s no connection).
> >
> >>
> >> You should definitely check the return value of write(), it's very
> important. Even if positive you need to handle the case that it's less
> bytes than you wanted it to write. And if negative, the bytes are _lost_
> which happens all the time with the current implementation.
> >>
> >> Anyway, to fix the EAGAIN you'll need to ask kevent() when you can
> write next.
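
The advice above (check the return value of write(), handle short writes, and wait for writability after EAGAIN) can be sketched in C roughly as follows. This is an illustration only, not the code under discussion: the name `write_all` is hypothetical, and poll() is used here as a portable stand-in for the kevent() call mentioned in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes to fd, which may be in
 * non-blocking mode. Returns 0 on success, -1 on error (errno is set). */
static int write_all(int fd, const char *buf, size_t len) {
    assert(buf != NULL || len == 0);
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);
        if (n >= 0) {
            /* Possibly a short write: advance past what was accepted
             * and try again with the remainder. */
            off += (size_t)n;
            continue;
        }
        if (errno == EINTR) continue; /* interrupted, just retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Nothing was written. Ask the kernel when the descriptor
             * becomes writable instead of dropping the bytes. */
            struct pollfd p = { .fd = fd, .events = POLLOUT, .revents = 0 };
            if (poll(&p, 1, -1) == -1 && errno != EINTR) return -1;
            continue;
        }
        return -1; /* a real error such as EPIPE or EBADF */
    }
    return 0;
}
```

Note that this helper blocks the calling thread inside poll() while it waits. An event-loop based server would instead remember the unwritten remainder and return to its kevent()/epoll loop.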
> >
> > It was supposed to be a proof-of-concept sketch. As mentioned in the
> comments of the code, it assumed it only had to serve one single client. Now I’ve
> improved it so it handles multiple clients while remaining synchronous and
> non-blocking. EAGAIN is the only “error” it will raise, if you consider it an error,
> but for me it’s part of non-blocking IO.
> >
> >
> >> Foundation/Cocoa is I guess the Swift standard library and they abandon
> synchronous&blocking APIs completely. I don't think we should create
> something different (without it being better) than what people are used to.
> >>
> >> Again, there are two options for IO at the moment:
> >> 1) synchronous & blocking kernel threads
> >> 2) asynchronous/inversion of control & not blocking kernel threads
> >>
> >> Even though I would love a synchronous programming model, I'd choose
> option (2) because the drawbacks of (1) are just too big. The designers of
> Foundation/Cocoa/Netty/Node.js/many more have made the same decision. Not
> saying all other options aren't useful but I'd like the API to be
> implementable with high-performance and not requiring the implementors to
> block a kernel thread per connection.
> >
> > To be honest I would choose 2 as well. But we are not in a pick-one-of-two
> situation. The main difference between us and Netty/Node.js is that ppl use
> them to write a server; what we do is write something ppl use to write
> something like Netty and Node.js. So it is reasonable to think there’s
> demand for a lower-level, synchronous API, despite the possible “drawbacks”
> they might encounter.
> >
> > Maybe we have some misunderstanding here. I’m not talking about a synchronous
> API that happens to be able to handle a vector of sockets in a single call
> without blocking anything. I’m talking about a synchronous API that does just
> one simple thing: read/write in a synchronous way, whether it would block
> or not. If it would block, just let the caller know by throwing an exception;
> that way the API call itself will not block anything.
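
One possible shape for such a call, sketched in C under stated assumptions: C has no exceptions, so a status code stands in for the thrown error, and the names `io_status` and `try_write` are hypothetical rather than anything proposed on this list.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical result type: IO_WOULD_BLOCK plays the role of the
 * exception described above. */
typedef enum { IO_OK, IO_WOULD_BLOCK, IO_ERROR } io_status;

/* Hypothetical wrapper: one synchronous write attempt that never blocks.
 * On IO_OK, *written holds the number of bytes accepted, which may be
 * fewer than len. On IO_ERROR, errno describes the failure. */
static io_status try_write(int fd, const void *buf, size_t len, size_t *written) {
    assert(written != NULL);
    *written = 0;
    /* Ensure non-blocking mode. Done on every call for simplicity;
     * a real API would set this once when the descriptor is created. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        return IO_ERROR;
    }
    ssize_t n = write(fd, buf, len);
    if (n >= 0) {
        *written = (size_t)n;
        return IO_OK;
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK) return IO_WOULD_BLOCK;
    return IO_ERROR;
}
```

A framework built on libdispatch, libdill, or libuv could call this directly and decide for itself how to wait when it sees IO_WOULD_BLOCK, using the exposed file descriptor.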
> >
> > Cheers,
> > Michael.
> >
> >
>
>
>