EROS retrospective -- threading

shapj@us.ibm.com shapj@us.ibm.com
Sun, 26 Dec 1999 16:43:31 -0500


>A major premise of the whole EROS design is that a
>message/invocation is fast. True, handling redirection
>in a process rather than the kernel introduces one
>extra message. I wouldn't call it a "fatal" flaw.
>If sending messages between processors is really
>"deadly", you are going to have many other problems.

Charlie:

I agree that it shouldn't be an issue on uniprocessors, or on tightly
coupled multiprocessors (though even there it's getting dicey).  However, I
believe that you may not be adequately considering the implications of NUMA
architectures. The problem here is not just that an intervening processor
gets control (and thereby becomes the bottleneck in a server that would
otherwise have been nicely load balanced).  The problem is that the *data*
ends up on the wrong CPU as well, and then has to be transshipped by the
hardware.

Better to put the data and control flow on the right CPU to begin with.

Empirically, getting the info to the processor where it will be used can
make a big difference on such machines.
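
As a rough illustration of the argument above, here is a minimal sketch that counts how many times an invocation crosses a CPU boundary, with and without a redirector in the path. The function name and the CPU placements are hypothetical and are not part of EROS. The model counts only transfers and says nothing about their actual cost.

```python
def cross_cpu_transfers(path):
    """Count the hops in an invocation path that move to a different CPU.

    `path` lists the CPU id each party runs on, in invocation order.
    Each such hop moves both control and the message data.
    """
    return sum(1 for src, dst in zip(path, path[1:]) if src != dst)


# Hypothetical placements (illustrative only):
direct = [0, 2]               # client on CPU 0 invokes a worker on CPU 2
redirected = [0, 1, 2]        # client -> redirector on CPU 1 -> worker on CPU 2
rescheduled = [0, 0, 2]       # redirector rescheduled onto the caller's CPU

for name, path in [("direct", direct),
                   ("redirected", redirected),
                   ("rescheduled", rescheduled)]:
    print(name, cross_cpu_transfers(path))
```

In this model the redirected path pays for two transfers where the direct path pays for one. The rescheduled case avoids the second transfer only by running the redirector on the calling processor, which has its own scheduling cost that the sketch does not capture.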

Jonathan S. Shapiro, Ph. D.
Research Staff Member
IBM T.J. Watson Research Center
Email: shapj@us.ibm.com
Phone: +1 914 784 7085  (Tieline: 863)
Fax: +1 914 784 6576


Charles Landau <clandau@macslab.com>@eros-os.org on 12/26/99 01:58:25 PM

Please respond to clandau@macslab.com

Sent by:  owner-eros-arch@eros-os.org


To:   Jonathan S Shapiro/Watson/IBM@IBMUS
cc:   eros-arch@eros-os.org
Subject:  Re: EROS retrospective -- threading



shapj@us.ibm.com wrote:

> One problem in EROS is multithreading. There are two issues, one technical
> and the other social.
>
> The technical issue is more serious. Oran Krieger has pointed out a fatal
> flaw in the redirector design: in a multiprocessor, an invocation of the
> redirector necessarily involves either a rescheduling of the redirector on
> the calling processor or a cross-processor transfer. The redirector then
> does the same thing in dispatching some particular thread.  Even on
> small-scale multiprocessors the cost of the L2 cache line transfers can be
> considerable. On larger-scale multiprocessors, where memory access times
> are non-uniform, it's deadly.

A major premise of the whole EROS design is that a message/invocation is fast.
True, handling redirection in a process rather than the kernel introduces one
extra message. I wouldn't call it a "fatal" flaw. If sending messages between
processors is really "deadly", you are going to have many other problems.