[e-lang] Events as Bad Ideas
Jonathan S. Shapiro
shap at eros-os.org
Sat Nov 27 16:34:55 EST 2004
Okay, here is my 2 cents, for which each of you should please donate
$1.50 to SRL. :-)
First, the events vs. threads argument is very old and rather silly.
Needham had it right: the two are duals, and the correct choice depends
largely on what problem you are trying to solve.
In partial rebuttal of Ousterhout's 1996 USENIX presentation, many of
the issues he raised were artifacts of the implementations then
available (rather than fundamental objections). It is certainly true
that early threading algorithms were not designed for scalability, and
that thread synchronization (most notably mutexes) was incredibly slow.
See more recent work on K42/Hurricane for the current state of the art.
The "lower overhead (no stack)" argument made by Ousterhout was wrong.
When a thread is stopped, it has a certain amount of state that needs to
be retained. Even event-driven systems sometimes wish to be able to
yield cooperatively. They either need a continuation or a stack to do
this. The two are interchangeable. The main difference is that today's
programming languages provide more direct support for stacks.
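To make the stack/continuation interchangeability concrete, here is a minimal sketch in Python (all names are invented for the example): the generator frame plays the role of the continuation, storing exactly the state that a suspended thread would keep on its stack, and a tiny dispatcher resumes each suspended task in turn.

```python
import collections

def fetch_pages(urls):
    """A cooperative task: each 'yield' suspends it, handing control
    back to the dispatcher. The generator frame holds the saved state
    that a thread would keep on its stack."""
    for url in urls:
        yield ("request", url)  # suspend until the dispatcher resumes us

def run(tasks):
    """A minimal round-robin dispatcher: resumes each suspended
    continuation in turn until all tasks run to completion."""
    queue = collections.deque(tasks)
    log = []
    while queue:
        task = queue.popleft()
        try:
            event = next(task)   # resume the continuation
            log.append(event)
            queue.append(task)   # re-queue: it still has work to do
        except StopIteration:
            pass                 # task ran to completion
    return log

log = run([fetch_pages(["a", "b"]), fetch_pages(["c"])])
# the two tasks' events interleave round-robin
```

Whether the saved state lives in a generator frame, an explicit continuation record, or a thread stack is an implementation choice; the information retained is the same.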
But Ousterhout's key argument concerned complexity, which is something
that von Behren, Condit, and Brewer don't address at all. In failing to
address this, they are arguing against a straw man, not against
Ousterhout's actual position.
In my opinion, von Behren, Condit, and Brewer mis-state the real
argument for user-mode threads, which has relatively little to do with
kernel thread switch overheads, and a very great deal to do with the
unreliability of large-scale MP kernel schedulers. For example, Sun's
m:n threading model was originally motivated by concerns for stability,
not performance. Yes, it is true that a sufficiently bad kernel designer
can make kernel threading expensive. It is also true that several good
kernel designers have beaten this problem conclusively (L4, EROS,
Nemesis). None of this is an argument against kernel-scheduled threads.
If anything, it is an argument against Pentium and (with apologies to
Alan) Itanium machines. It is also a frequently mis-measured argument,
because people tend to evaluate it with microbenchmarks, and the context
switch time is noise relative to the cache reload time. The real issue
is who decides when a context switch occurs.
It *is* true that an m:n threading model (which is essentially what von
Behren et al. advocate) *does* preserve locality, because the
application retains control over logical preemption. In this sense,
there has been a significant advance in the state of the art.
The argument that thread-based control flow is less flexible is
absolutely correct. The question is: is this relevant to the problem you
are trying to solve? It is instructive that *all* modern microprocessors
use event-based concurrency (register renaming and comparable techniques).
This allows significant out-of-order execution and effectively hides
latency. The same benefit does *not* occur in multithreaded apps in the
local case -- the relative latencies of blocking and rescheduling don't
match up well in practice. The event-driven model comes back into the
forefront when network latencies need to be hidden, as in E.
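As a back-of-the-envelope illustration of the network case, consider this toy latency model (hypothetical numbers; Python used only for the arithmetic): issuing all requests up front as events lets the round trips overlap, while a blocking caller pays each round trip in full.

```python
def latency_sequential(requests, rtt_ms):
    """Blocking model: each request waits out its full round trip
    before the next one is issued."""
    return len(requests) * rtt_ms

def latency_pipelined(requests, rtt_ms, per_msg_ms=1):
    """Event model: all requests are issued immediately; replies
    arrive as events, so the round trips overlap. Only the first
    request pays the full RTT; the rest cost per-message overhead."""
    return rtt_ms + per_msg_ms * (len(requests) - 1)

# hypothetical workload: 10 requests at 100 ms round-trip time
# sequential: 1000 ms total; pipelined: 109 ms total
```

The model is deliberately crude, but it captures why event dispatch wins when the dominant cost is wire latency rather than local rescheduling.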
In the end, however, von Behren, Condit, and Brewer ignore the most
compelling argument that John (Ousterhout) made against threading:
complexity. John didn't attempt to claim that threading was a bad idea
(indeed, the subtitle of his talk was "for most purposes"). Rather, he
claimed that for the vast majority of cases where threading was actually
getting used, it wasn't the right tool for the job. The difficulty is
that generalized concurrency is well beyond the capabilities of most
programmers. Language support for concurrency today comes in only two
flavors: (1) nonexistent, and (2) wrong (e.g. Java). This leaves events
as the right mechanism for most purposes. We are currently looking very
hard at this problem in our lab.
Now, if you have well-isolated sub-problems, and you can handle two or
more events concurrently without interference, John would certainly say:
go ahead and thread that. He didn't make the case clearly in his talk,
but what he was actually arguing for was a unified data-driven control
structure to manage concurrency. Today, we would call this "dataflow",
and it is fairly universally accepted as a good idea for many problems.
He would also argue that if you *are* going to multithread, it's better
to multithread sub-steps that are completely independent, and handle all
of the complexity of scheduling and dispatch and concurrency control in
a single place (the event dispatcher).
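A minimal sketch of that "single place" idea, in illustrative Python (the scheduler and the step names are invented for the example): independent sub-steps run as soon as their inputs are available, and every scheduling and dispatch decision lives in one loop.

```python
def run_dataflow(steps):
    """steps: name -> (dependency names, function). Runs each step
    once all of its inputs are available. All scheduling, dispatch,
    and ordering decisions live in this single loop."""
    results = {}
    pending = dict(steps)
    while pending:
        ready = [name for name, (deps, _) in pending.items()
                 if all(d in results for d in deps)]
        if not ready:
            raise RuntimeError("cycle or missing dependency")
        for name in ready:
            deps, fn = pending.pop(name)
            results[name] = fn(*(results[d] for d in deps))
    return results

# two independent sub-steps, then a join that consumes both results
out = run_dataflow({
    "parse_a": ((), lambda: 2),
    "parse_b": ((), lambda: 3),
    "combine": (("parse_a", "parse_b"), lambda a, b: a + b),
})
# out["combine"] == 5
```

In this structure, "parse_a" and "parse_b" are exactly the well-isolated sub-steps one could safely hand to threads; the dispatcher would not change.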
What E does, which is (so far as I know) unique to the E/Joule line of
work, is directly integrate dataflow and event-driven dispatch into the
language architecture in a coherent way. My own feeling (and I think
MarkM would agree) is that the Joule approach is theoretically better
integrated, but its cost is that it visibly violates the programming
paradigm, which makes it a very tough sell. E chooses to introduce
dataflow in a different place, and the result is much more usable.
A couple of specific responses to comments by MarcS:
> > if you care about performance more than
> > everything else in the world, we do not have anything to
> > offer. I still feel that that is not correct, but so far,
> > anything else has eluded us.
I think that what I am about to say is off-topic in connection with E,
but the event/thread duality is relevant to performance. Eric Northup
and I have been looking lately at doing an SMP kernel for Coyotos. Our
provisional conclusion is that doing an SMP implementation of an atomic-
style microkernel ought to be faster for several reasons:
1. Strictly less concurrency management required.
2. Object granularity finer than in conventional kernels.
3. Lock release strategy differs enough to justify a
   completely different locking design.
4. Fine-grain knowledge of object behavior coupled with
simplicity of concurrency gives us engineering options
that a conventional kernel does not have.
We've just submitted an NSF proposal on this. I'll put it online as soon
as I get a domain propagation problem straightened out.
NOTE: Please do NOT go pointing your website at anything in
coyotos.org yet. There is nothing to see, but you *will* manage
to contaminate your DNS cache for 24 hours if you try it too soon.
I'll send an announce to the list when it's ready.
> > "Common event patterns...map cleanly...onto threads...They
> > need a return event on the event model." We agree that a
> > return is needed, and this is where the promises come in...
Oddly enough, it looks like Coyotos is going to abandon this. It turns
out that you *don't* need a return. What you need is the ability to
transmit a capability on which the returned message should be sent. You
also need a way to invalidate this efficiently.
This is, of course, a way of describing one implementation of return. My
point is that there are several viable low-level mechanisms, and the one
you want is heavily influenced by usage patterns.
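Here is a rough sketch of the reply-capability pattern in Python (ReplyCap and its methods are hypothetical names for illustration, not Coyotos APIs): the callee never "returns"; it invokes a capability it was handed, and revoking that capability invalidates the pending reply.

```python
class ReplyCap:
    """A one-shot capability on which the reply message is sent.
    revoke() invalidates it, so a late reply is silently dropped."""
    def __init__(self):
        self.value = None
        self._valid = True
        self._used = False

    def send(self, value):
        if self._valid and not self._used:
            self._used = True
            self.value = value

    def revoke(self):
        self._valid = False

def server(request, reply_cap):
    # The server knows nothing about 'return'; it simply invokes
    # whatever capability it was handed along with the request.
    reply_cap.send(request * 2)

cap = ReplyCap()
server(21, cap)
# cap.value == 42: the reply arrived on the transmitted capability

late = ReplyCap()
late.revoke()
server(21, late)
# late.value is None: the revoked capability dropped the reply
```

The point survives the sketch: "return" is just one policy layered over "send a message on this capability," and revocation is what makes the policy cheap to cancel.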
> > Exception handling: E once again has a mechanism that was not
> > foreseen by the authors, the broken promise. This wipes out
> > their argument that threads are easier. "Fixing problems with
> > events is tantamount to switching to threads" is simply not true.
I disagree. A broken promise constitutes a remote computation that has
failed. This is completely orthogonal to the thread/event issue. The
problem that a broken promise addresses is the possibility that
computations can fail for lack of resource.
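For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of broken-promise semantics in Python (a loose approximation for illustration, not E's actual API): a remote failure surfaces by the promise breaking, and the registered broken-handler fires instead of an exception unwinding some stack.

```python
class Promise:
    """Minimal promise: it either resolves to a value or breaks with
    a reason. Handlers registered via when() fire on either outcome."""
    def __init__(self):
        self._state = "unresolved"   # "fulfilled" | "broken"
        self._payload = None
        self._callbacks = []

    def when(self, on_value, on_broken):
        self._callbacks.append((on_value, on_broken))
        self._fire()

    def resolve(self, value):
        if self._state == "unresolved":
            self._state, self._payload = "fulfilled", value
            self._fire()

    def smash(self, reason):
        """Break the promise, e.g. because the remote vat failed."""
        if self._state == "unresolved":
            self._state, self._payload = "broken", reason
            self._fire()

    def _fire(self):
        if self._state == "unresolved":
            return
        while self._callbacks:
            on_value, on_broken = self._callbacks.pop(0)
            if self._state == "fulfilled":
                on_value(self._payload)
            else:
                on_broken(self._payload)

seen = []
p = Promise()
p.when(seen.append, lambda reason: seen.append(("broken", reason)))
p.smash("partition")   # remote failure shows up as a broken promise
# seen == [("broken", "partition")]
```

Note that nothing in this sketch depends on whether the handlers run on threads or in an event loop, which is exactly the orthogonality claimed above.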
In my opinion, the one interesting concurrency mechanism that E lacks is
an event queue that is shared across multiple co-equal vats. This is
exactly the same problem that KeyKOS/EROS have in the current
implementation.
Jonathan S. Shapiro <shap at eros-os.org>