Backing up one more step...
Bryan Ford
baford@schirf.cs.utah.edu
Wed, 07 Dec 94 13:17:13 MST
> I wasn't proposing to try to maintain consistency automatically between
> different districts - just keep them completely independent. As I said
> in my first posting, _all_ connections between two districts get severed if
> _either_ of them dies and restarts.
>
>I understood. That was why I chose the space bank as my example.
>Keeping them independent is impossible in the KeyKOS architecture. At
>the top of the world there will be the Prime Space Bank that all
>domains, directly or indirectly, derive from.
OK, I see what you're saying. I guess the real issue now is how much
we're willing to deviate from the original KeyKOS architecture in order
to get needed functionality. To take up the space bank example, there
would probably now have to be a separate "Prime Space Bank" for each
district, representing all the backing store assigned to the district.
Other examples will have to be dealt with individually.
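As a rough illustration of the per-district arrangement, here is a
minimal Python sketch. Everything in it (the SpaceBank class, the page
counts, the district names "A" and "B") is made up for illustration and
is not the KeyKOS space bank interface. It only shows that every bank in
a district derives from that district's own prime bank, and that the
prime banks of different districts share no state.

```python
class SpaceBank:
    """Hypothetical stand-in for a space bank: a budget of pages."""

    def __init__(self, pages, parent=None):
        self.free = pages
        self.parent = parent

    def sub_bank(self, pages):
        # A sub-bank's pages are carved out of this bank's budget.
        if pages > self.free:
            raise MemoryError("bank exhausted")
        self.free -= pages
        return SpaceBank(pages, parent=self)

    def prime(self):
        # Walk up to the bank this one ultimately derives from.
        bank = self
        while bank.parent is not None:
            bank = bank.parent
        return bank


# One prime bank per district (assumed sizes), instead of one global bank.
prime_banks = {"A": SpaceBank(1000), "B": SpaceBank(500)}

# A bank two levels down in district A still traces back to A's prime bank.
user_bank = prime_banks["A"].sub_bank(200).sub_bank(50)
```

Allocation in district A never touches district B's budget, which is the
independence property I'm after.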
As you've pointed out, it's either impossible or extremely difficult
to distribute a "single-district" persistent system across a large number
of nodes without incurring large global restart/rollback costs.
My proposed approach is, basically, to modify the original KeyKOS scheme
enough to make this problem tractable without giving up all the benefits
of the KeyKOS design. Think of that as a "research challenge". :-)
>If you propose to change this, you haven't got a single system image
>anymore, because what you've done is said "these subsystems are not
>allowed to communicate with each other through the persistence layer."
I never really knew exactly what the term "single system image" meant,
but I'll take your word for it. :-)
>It's not simple to specify what a "well defined" connection between
>districts is, particularly given that there are lots of services you
>*want* to have span the boundary. Factories, for example, shouldn't
>need to be replicated.
I admit that I can't propose an obviously-correct or definitely-good-enough
definition of "well defined" at this point, or specify how all the
different interactions would be affected. Perhaps it would help if
you (or someone) could post a list of services or connections in a
KeyKOS system that might be problematical in a "multiple-district"
organization. We already have two: space banks and factories.
I already dealt with space banks (at a first-pass level anyway).
Factories, if I understand them correctly, are services that are
highly trusted but hold almost no dynamic state: internally they change
little, if at all, from request to request. Hence it seems it wouldn't
cause much of a problem if factories were in a different district from
their clients: if the client district goes down while a factory is
building a new <whatever> in that district, the connection gets severed,
and upon restart the program in the client district must retry the
operation by reconnecting with the factory. (Again, this reconnection
could be made transparent through the use of proxies.)
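To make the proxy idea a bit more concrete, here is a minimal Python
sketch. It assumes each district carries a restart counter (an "epoch")
and that a connection is valid only while both ends are still in the
epoch in which it was made. The names District, Connection, Factory,
FactoryProxy and Severed are all hypothetical; nothing here is an
existing KeyKOS interface.

```python
class Severed(Exception):
    """Raised when a connection is used after either end has restarted."""


class District:
    def __init__(self, name):
        self.name = name
        self.epoch = 0  # incremented on every restart

    def restart(self):
        self.epoch += 1


class Connection:
    """Link between two districts, valid only for the epochs it was made in."""

    def __init__(self, a, b):
        self.ends = (a, b)
        self.epochs = (a.epoch, b.epoch)

    def check(self):
        # A restart of _either_ district severs the connection.
        if tuple(d.epoch for d in self.ends) != self.epochs:
            raise Severed("a district restarted")


class Factory:
    """Holds no per-request state, so a retried request is harmless."""

    def __init__(self, district):
        self.district = district

    def build(self, conn, product):
        conn.check()
        return f"{product} built"


class FactoryProxy:
    """Lives in the client district; hides reconnection from the client."""

    def __init__(self, client_district, factory):
        self.client = client_district
        self.factory = factory
        self.conn = Connection(client_district, factory.district)
        self.reconnects = 0

    def build(self, product, retries=1):
        for attempt in range(retries + 1):
            try:
                return self.factory.build(self.conn, product)
            except Severed:
                if attempt == retries:
                    raise
                # Reconnect in the current epochs and retry the operation.
                self.conn = Connection(self.client, self.factory.district)
                self.reconnects += 1
```

The client just calls proxy.build(...) and never sees the severed
connection unless reconnection itself keeps failing. Retrying is only
safe here because the factory is assumed to keep no state between
requests.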
Bryan