Long answer to quick question
shapj@us.ibm.com
Thu, 2 Sep 1999 12:38:21 -0400
John: [others in case interested]
You're right. There's no short answer.
In general, going to 32-bit processors using the current code base should be
simple. There are minimal byte-order dependencies in the current kernel. The
major change would be in the trap handler (easy) and the memory manager (more
involved, depends on the CPU). Doing an initial port, if uninterrupted, would
take 3-6 months. A port to a machine with hierarchical page tables (e.g. 68k)
could reuse substantially the current memory logic, and would be closer to the
3-month end of that range. A quick and dirty MIPS port accomplished by
implementing the hierarchy in software could be done in about that time frame.
32-bit PPC, because of the hash-structured page tables, requires a good bit more
thought. Also, a port that takes proper advantage of the MIPS software reload
would take more thought. Both of these get into the 6 month span. In the PPC
case, we could in the interest of time view the hashed lookup mechanism as a 2nd
level TLB cache, back that with software-implemented tree structured
translation, and arrive at a quick port in ~4 months. The main issue on these
CPUs is that I suspect a good bit of the current code is heavily dependent on
the fact that we presently run on a hierarchically translated machine.
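
To make the "hashed lookup as a 2nd level TLB cache" idea concrete, here is a minimal sketch in C. It is not EROS code: the table sizes, the hash function, and every name (translate, map_page, tree_walk, hash_table, tree_root) are hypothetical, and a plain array stands in for the hardware hash table. The tree stays the authoritative translation; the hashed table is only a cache that is refilled on a miss.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* All sizes and names below are illustrative assumptions, not EROS definitions. */
#define PAGE_SHIFT 12
#define L2_BITS    10
#define L2_MASK    ((1u << L2_BITS) - 1)
#define TOP_SLOTS  (1u << (32 - PAGE_SHIFT - L2_BITS))
#define HASH_SLOTS 1024u              /* must be a power of two */
#define PTE_VALID  0x1u

typedef struct { uint32_t vpn; uint32_t pfn; int valid; } HashEntry;

static HashEntry hash_table[HASH_SLOTS];  /* stands in for the hashed page table */
static uint32_t *tree_root[TOP_SLOTS];    /* software tree: top level -> leaf tables */

static unsigned hash_vpn(uint32_t vpn) {
    return (vpn ^ (vpn >> 10)) & (HASH_SLOTS - 1);
}

/* Install a mapping in the authoritative tree; drop any stale cached copy. */
int map_page(uint32_t va, uint32_t pfn) {
    uint32_t vpn = va >> PAGE_SHIFT;
    uint32_t top = va >> (PAGE_SHIFT + L2_BITS);
    if (!tree_root[top]) {
        tree_root[top] = calloc(1u << L2_BITS, sizeof(uint32_t));
        if (!tree_root[top]) return 0;
    }
    tree_root[top][vpn & L2_MASK] = (pfn << PAGE_SHIFT) | PTE_VALID;
    HashEntry *e = &hash_table[hash_vpn(vpn)];
    if (e->valid && e->vpn == vpn) e->valid = 0;
    return 1;
}

/* Walk the software tree. Returns 1 and sets *pfn if the page is mapped. */
static int tree_walk(uint32_t va, uint32_t *pfn) {
    uint32_t *leaf = tree_root[va >> (PAGE_SHIFT + L2_BITS)];
    if (!leaf) return 0;
    uint32_t pte = leaf[(va >> PAGE_SHIFT) & L2_MASK];
    if (!(pte & PTE_VALID)) return 0;
    *pfn = pte >> PAGE_SHIFT;
    return 1;
}

/* Try the hashed cache first; on a miss, walk the tree and refill the slot. */
int translate(uint32_t va, uint32_t *pa) {
    assert(pa != NULL);
    uint32_t vpn = va >> PAGE_SHIFT;
    HashEntry *e = &hash_table[hash_vpn(vpn)];
    uint32_t pfn;
    if (e->valid && e->vpn == vpn) {
        pfn = e->pfn;                     /* cache hit */
    } else {
        if (!tree_walk(va, &pfn)) return 0;   /* genuine page fault */
        e->vpn = vpn; e->pfn = pfn; e->valid = 1;
    }
    *pa = (pfn << PAGE_SHIFT) | (va & ((1u << PAGE_SHIFT) - 1));
    return 1;
}
```

The point of the arrangement is that the existing tree-oriented logic is left alone and only the refill path is new. A real port would also have to deal with hash collisions, permissions, and address space identifiers, none of which this sketch attempts.
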
Going to 64 bit isn't substantially harder from a memory mapping perspective --
the internal translation logic is already using 96 bit segment offsets. The
more complex issue is dealing with in-memory capability logic. A "quick and
dirty" solution would simply increase the size of the in-memory Node structure
to allow for the larger pointers, or (equally good) use object indices rather
than pointers and leave the size alone. Indices are probably the better answer
in the short term, and the change is not visible outside the kernel in any case.
Those are the two "simple" solutions, but I think that, time permitting, I'd
want to use this as an excuse to investigate using in-memory GC in preference to
in-memory key rings. I'd therefore want to go at this with a bit more
deliberation if given the opportunity.
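
As an illustration of the "object indices rather than pointers" option, here is a minimal sketch in C. The layout is an assumption made for the example, not the actual EROS structures: Key, Node, ObjectHeader, object_table, key_target, the slot count, and the table size are all hypothetical. The point is only that a key holding a 32-bit index keeps the same size whether the kernel is built with 32-bit or 64-bit pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes and layouts, chosen for illustration only. */
#define NUM_OBJECTS 4096u
#define NODE_SLOTS  32
#define OBJ_NONE    0xFFFFFFFFu   /* marks a key with no in-memory target */

typedef struct { uint64_t oid; } ObjectHeader;   /* stand-in for a real header */

typedef struct {
    uint32_t type;
    uint32_t obj_index;   /* index into object_table, not an ObjectHeader* */
} Key;

typedef struct { Key slot[NODE_SLOTS]; } Node;

/* The in-memory key is 8 bytes regardless of the machine's pointer width. */
_Static_assert(sizeof(Key) == 8, "Key must not grow with pointer size");

static ObjectHeader object_table[NUM_OBJECTS];

/* Resolve a key to its target object, or NULL if it has none. */
ObjectHeader *key_target(const Key *k) {
    if (k->obj_index >= NUM_OBJECTS) return NULL;
    return &object_table[k->obj_index];
}
```

The cost is one extra bounds check and an indexed load on each dereference; the benefit is that nothing outside the kernel sees the change.
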
As to the "more registers" problem, there are two effects that have fairly major
impact on overall performance. One is the user/supervisor crossing delay (which
is where the x86 eats performance), the other is register save. The main
challenge to getting a satisfactory implementation on Merced is the latter.
That said, note that Merced has a relatively wide memory bus, and can be placed
in a mode in which it will work behind the scenes to "clean" the register file
by saving the registers. It remains to be determined if this can be leveraged
to reduce the context switching overhead. I expect that Jochen Liedtke will
have results on this before I do, as I'm not actively attending to Merced at the
moment.
Beyond this, Merced raises some serious challenges to the hand-coding of the IPC
path, but this presents no fundamental difficulties other than a lot of hard
work.
Jonathan S. Shapiro, Ph. D.
IBM T.J. Watson Research Center
Email: shapj@us.ibm.com
Phone: +1 914 784 7085 (Tieline: 863)
Fax: +1 914 784 7595
"John C. Randolph" <jcr@idiom.com> on 09/01/99 11:44:51 AM
Please respond to jcr@idiom.com
To: Jonathan S Shapiro/Watson/IBM@IBMUS
cc:
Subject: Other processors?
Shap,
Quick question: (which may not have a quick answer)
Suppose you had to port EROS to
1) Merced
2) PPC
3) anything else
What kind of pain are you looking at?
Also, how does the size of a processor's register file affect your
context-switching speed?
-jcr