[wsf-c-dev] Concurrency
James Clark
james at wso2.com
Sun Dec 10 00:53:49 PST 2006
Concurrency is a particularly challenging area for the C stack because
of the need to integrate both with a wide variety of applications
and with a variety of platforms which provide different primitives.
Here are a couple of different kinds of scenario that I think will be
challenging:
1. Consider a GUI application which needs a SOAP client. It's
single-threaded, and wants to stay that way. It's event-driven but uses
the GUI toolkit's built-in event loop (e.g. Gnome, Qt or Windows). It
can't block for any length of time because this would freeze the UI. To
handle this the C stack needs to get the event loop to watch the
appropriate file descriptors and notify it when they become ready for
IO.
2. Consider a server-side, middle-tier application. To handle each
request, the server needs to fire off a couple of requests to backend
services, which may take some time to complete. You want to be
able to scale up to large numbers of simultaneous clients (say,
100,000), so dedicating a thread to each request is not workable. On the
other hand, a completely non-threaded approach is also not workable,
because it won't take advantage of multi-core hardware. You need to
leverage the most efficient, scalable primitives on each platform, epoll
on Linux and IO completion ports on Windows, which happen to have
completely different semantics. The messages involved may be large: it
might be desirable to stream the response from one of the backend
services into the response going out to the client, or from the client
request into one of the requests going out to the backend.
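To make scenario 1 concrete, here is a minimal sketch of the kind of interface through which the stack could ask a host event loop to watch a descriptor. It uses POSIX poll() as a stand-in for the GUI toolkit's loop; a real integration would call the toolkit's own registration functions instead. All names here (host_loop, host_loop_watch_read, host_loop_iterate, drain_one_byte) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Notification function supplied by the stack (hypothetical signature). */
typedef void (*fd_ready_fn)(int fd, void *ctx);

#define MAX_WATCHES 8

/* Stand-in for the toolkit's event loop: a fixed table of watched fds. */
struct host_loop {
    struct pollfd fds[MAX_WATCHES];
    fd_ready_fn   ready[MAX_WATCHES];
    void         *ctx[MAX_WATCHES];
    nfds_t        count;
};

/* Called by the stack: watch fd for readability.
   Returns 0 on success, -1 if the table is full. */
int host_loop_watch_read(struct host_loop *loop, int fd,
                         fd_ready_fn ready, void *ctx)
{
    assert(loop != NULL && ready != NULL);
    if (loop->count == MAX_WATCHES)
        return -1;
    loop->fds[loop->count].fd = fd;
    loop->fds[loop->count].events = POLLIN;
    loop->fds[loop->count].revents = 0;
    loop->ready[loop->count] = ready;
    loop->ctx[loop->count] = ctx;
    loop->count++;
    return 0;
}

/* One loop iteration: wait up to timeout_ms, then notify the stack for
   each ready descriptor. Returns the number of notifications made,
   0 on timeout, or -1 if poll() failed. */
int host_loop_iterate(struct host_loop *loop, int timeout_ms)
{
    int n = poll(loop->fds, loop->count, timeout_ms);
    if (n <= 0)
        return n;
    int delivered = 0;
    for (nfds_t i = 0; i < loop->count; i++) {
        if (loop->fds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            loop->ready[i](loop->fds[i].fd, loop->ctx[i]);
            delivered++;
        }
    }
    return delivered;
}

/* Example notification: consume one byte and count the wake-up. */
void drain_one_byte(int fd, void *ctx)
{
    char c;
    if (read(fd, &c, 1) == 1)
        ++*(int *)ctx;
}
```

The point of the shape is that the stack never blocks itself: it only registers interest and returns, and the single UI thread stays in control of when IO happens.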
At a high level, the approach I envisage would have a major pluggable
subsystem, which I will call the event manager. This would have a
fairly high proportion of system dependent code. The SOAP engine would
be able to request that various operations be performed asynchronously.
Given such a request the event manager would decide whether to perform
it synchronously or asynchronously. If the former, it just performs the
operation normally and returns. If the latter, it would queue up the
operation to be performed asynchronously and indicate that it is doing
the operation asynchronously by returning an appropriate error code;
when the operation has been performed, it would then queue up a
completion notification to be delivered by calling a notification
function supplied by the SOAP engine. In a multi-threaded event
manager, it would be up to the event manager to maintain a pool of
threads and select which thread to make the completion notification on.
Note that you can handle a simple blocking client just by plugging in an
event manager that never chooses to do anything asynchronously.
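The request/notification contract described above might look roughly like the following. This is only a sketch under stated assumptions: the names (event_manager, em_request, em_dispatch, EM_PENDING) and the fixed-size queue are invented for illustration, and a real event manager would perform the queued operation when the underlying IO is ready rather than in a simple dispatch pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes: EM_PENDING plays the role of the
   "appropriate error code" meaning "this will complete asynchronously". */
enum em_status { EM_OK = 0, EM_PENDING = 1 };

typedef int  (*em_op_fn)(void *arg);               /* the operation itself  */
typedef void (*em_notify_fn)(void *ctx, int result); /* supplied by engine  */

struct em_pending {
    em_op_fn     op;
    void        *arg;
    em_notify_fn notify;
    void        *ctx;
};

#define EM_QUEUE_MAX 16

struct event_manager {
    int               go_async; /* policy: 0 = never async (blocking client) */
    struct em_pending queue[EM_QUEUE_MAX];
    size_t            count;
};

/* Perform op synchronously (result stored in *result, EM_OK returned),
   or queue it and return EM_PENDING. The choice is the manager's. */
enum em_status em_request(struct event_manager *em, em_op_fn op, void *arg,
                          em_notify_fn notify, void *ctx, int *result)
{
    assert(em != NULL && op != NULL && result != NULL);
    if (!em->go_async || em->count == EM_QUEUE_MAX) {
        *result = op(arg);
        return EM_OK;
    }
    em->queue[em->count++] = (struct em_pending){ op, arg, notify, ctx };
    return EM_PENDING;
}

/* Run the queued operations and deliver their completion notifications.
   The batch is copied out first so that a notification function may
   safely issue new requests. Returns the number of notifications. */
size_t em_dispatch(struct event_manager *em)
{
    struct em_pending batch[EM_QUEUE_MAX];
    size_t n = em->count;
    memcpy(batch, em->queue, n * sizeof batch[0]);
    em->count = 0;
    for (size_t i = 0; i < n; i++) {
        int r = batch[i].op(batch[i].arg);
        if (batch[i].notify != NULL)
            batch[i].notify(batch[i].ctx, r);
    }
    return n;
}

/* Example operation and notification function, for illustration. */
int double_op(void *arg) { return *(int *)arg * 2; }
void store_notify(void *ctx, int result) { *(int *)ctx = result; }
```

With go_async left at 0 this is the simple blocking client: every em_request() returns EM_OK with the result already filled in, and the notification function is never called.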
The xml_reader API would need to have support for asynchronous
operations. It would be very inconvenient if every next_event()
operation could complete asynchronously, so instead:
- next_event() would always be synchronous
- there would be a next_event_async() which would be asynchronous (at
the event manager's option as usual)
- there would also be a wait_rest_of_element_async(); you would call
this on a start element event; it would complete when the matching end
element is available (this could be implemented either by buffering the
events, or doing a quick pre-parse of the buffer looking for the
end-tag)
- similarly, there would be a read_rest_of_element_async(); you would
call this on a start element event; it would complete when the matching
end element is available and the completion callback would provide the
xml_element containing that element
- for situations where you can't handle things asynchronously, you can
just call wait_rest_of_document_async(), which would complete when the
underlying input stream has been read completely (of course in this case
you would end up holding the complete input in memory as a list of
buffers)
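The "quick pre-parse" option for wait_rest_of_element_async() could be as simple as counting tag depth over the bytes buffered so far. The sketch below is deliberately naive and makes assumptions a real reader could not: the buffer starts at the start-tag, and there are no comments, CDATA sections, processing instructions, or '<' and '>' characters inside attribute values. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Scan buf, which is assumed to begin at a start-tag, and return the
   offset just past the matching end-tag, or 0 if the element is not
   yet completely available (or the input is malformed). */
size_t rest_of_element_available(const char *buf, size_t len)
{
    size_t depth = 0;
    size_t i = 0;

    assert(buf != NULL);
    while (i < len) {
        if (buf[i] != '<') {      /* character data: skip */
            i++;
            continue;
        }
        size_t j = i + 1;         /* find the '>' that closes this tag */
        while (j < len && buf[j] != '>')
            j++;
        if (j == len)
            return 0;             /* the tag itself is still incomplete */

        if (buf[i + 1] == '/') {  /* end-tag */
            if (depth == 0)
                return 0;         /* stray end-tag: malformed */
            depth--;
        } else if (buf[j - 1] != '/') {
            depth++;              /* start-tag (not an empty-element tag) */
        }
        i = j + 1;
        if (depth == 0)
            return i;             /* matching end-tag has been seen */
    }
    return 0;
}
```

The event manager would rerun this check each time more input arrives and fire the completion notification once it returns non-zero. Buffering parsed events instead avoids the rescan and the simplifying assumptions, at the cost of more memory.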
So going back to the reading of the SOAP envelope, the strategy might be:
- use next_event_async to get to the SOAP header element
- use wait_rest_of_element_async to wait for the header to be completely
available
- process the header blocks using non-async methods
- use next_event_async to get to the payload start-tag
The event manager would also centralize locking. For this, I think we
need the notion of a logical thread of control, which I will call a
fiber. An asynchronous callback is associated with a fiber. If one
callback running on fiber F makes another asynchronous request, then
the notification callback for that request will by default also be
associated with fiber F. The event manager will perform locking so as to
guarantee that there are no two callbacks associated with the same fiber
executing simultaneously. Given this, I'm hoping that the only
situation where a message handler will need to explicitly lock things is
the case where there is mutable service-level state.
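One way to picture the fiber guarantee is a lock per fiber that the event manager holds while it runs any callback belonging to that fiber. The sketch below assumes POSIX threads and C11 thread-local storage; struct fiber, fiber_deliver and fiber_for_request are hypothetical names, not part of any existing API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* A logical thread of control: at most one of its callbacks runs at a time. */
struct fiber {
    pthread_mutex_t lock;
};

typedef void (*fiber_callback_fn)(struct fiber *f, void *ctx);

/* The fiber whose callback is executing on the calling thread, if any. */
static _Thread_local struct fiber *current_fiber;

void fiber_init(struct fiber *f)
{
    pthread_mutex_init(&f->lock, NULL);
}

/* Fiber to attach to a new asynchronous request: the explicit choice if
   given, otherwise (the default) the fiber of the running callback. */
struct fiber *fiber_for_request(struct fiber *explicit_fiber)
{
    return explicit_fiber != NULL ? explicit_fiber : current_fiber;
}

/* Called by the event manager, on whichever pool thread it selects, to
   deliver a notification. Holding the fiber's lock guarantees that no
   two callbacks of the same fiber execute simultaneously. Notifications
   are assumed to be queued, never delivered from inside a callback of
   the same fiber, since that would self-deadlock on this lock. */
void fiber_deliver(struct fiber *f, fiber_callback_fn cb, void *ctx)
{
    assert(f != NULL && cb != NULL);
    pthread_mutex_lock(&f->lock);
    struct fiber *saved = current_fiber;
    current_fiber = f;
    cb(f, ctx);
    current_fiber = saved;
    pthread_mutex_unlock(&f->lock);
}

/* Example callback: records which fiber a request made here would inherit. */
void record_inherited_fiber(struct fiber *f, void *ctx)
{
    (void)f;
    *(struct fiber **)ctx = fiber_for_request(NULL);
}
```

Callbacks on different fibers can still run in parallel on different pool threads, so state shared across fibers (such as mutable service-level state) still needs its own locking.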
James