[cap-talk] Another "core" principle - virtualize capabilities
Jed Donnelley
capability at webstart.com
Thu Jan 4 13:20:22 CST 2007
At 06:04 AM 1/4/2007, Marcus Brinkmann wrote:
>Hi,
>
>some technical notes.
>
>At Wed, 03 Jan 2007 17:07:23 -0800,
>Jed Donnelley <capability at webstart.com> wrote:
> > >I claim that this is merely the beginning of a model for computation,
> > >and that it lacks, among other things, in the words of Jonathan,
> > >considerations of:
> > >
> > > "durability, resource exhaustion, and synchronicity"
> > >
> > >In other words, it lacks *anything* related to the question of
> > >executing instructions on a real machine architecture.
> >
> > Hmmm. I'm afraid I don't understand what you're getting at.
> > In the object-capability model processes are assumed to
> > execute the instructions of their underlying processor
> > architecture in their own memory except for the "invoke"
> > instruction (virtual instruction if you will) that behaves as
> > above, namely that it allows:
> >
> > 1. Some amount of data and some number of capabilities
> > to be sent to whatever services the invocation, and
> >
> > 2. The process to block until some amount of data and
> > some number of capabilities are received in "reply" to
> > the invocation.
>
>Note that you introduce a new concept, that of a process, here. You
>then have to answer several questions:
I'm sorry, but it escapes me why I "have to" answer these questions?
Answer to meet what goal? Certainly not to defend my argument
that mappable object-capability systems can be implemented efficiently.
Do you believe so? The answers to these questions are (generally)
system dependent, and they are independent of the issue
of the system "call" interface.
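For readers who want something concrete, the "invoke" primitive quoted above (data and capabilities go in, the caller blocks, data and capabilities come back) can be sketched roughly as follows. This is only an illustration in Python; the names `Capability`, `invoke`, and `echo_service` are hypothetical and are not taken from NLTSS or any other system discussed here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A message is some bytes plus some capabilities (hypothetical shape).
Message = Tuple[bytes, List["Capability"]]


@dataclass
class Capability:
    """Opaque reference to whatever services the invocation."""
    _handler: Callable[[bytes, List["Capability"]], Message] = field(repr=False)

    def invoke(self, data: bytes, caps: List["Capability"]) -> Message:
        # 1. Send data and capabilities to the servicer.
        # 2. Block until data and capabilities come back in reply.
        # A plain function call stands in for the blocking here.
        return self._handler(data, caps)


def echo_service(data: bytes, caps: List[Capability]) -> Message:
    # Hypothetical service: upper-cases the data and hands the caps back.
    return data.upper(), caps


echo = Capability(echo_service)
reply_data, reply_caps = echo.invoke(b"hello", [])
```

Nothing in this shape says how the servicer is scheduled, whether it runs in another process, or how priorities are handled; those choices sit behind `invoke`.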
> Are objects passive or active?
In any system there are both active (call them processes
or active objects - the "things" that execute machine
instructions) and passive ("things" such as files in which
you don't load code and say "go") objects.
> How are concurrent processes scheduled?
Why does it matter for this discussion? In any system
I believe the scheduling issues are substantially the
same - regardless of the system call interface.
> How do you avoid priority inversion?
Ditto.
> How can a process unblock a blocked process safely (important for
> signal delivery)?
It can be done. Many (most, all) systems do so and again I view
this as largely independent of the system call interface (object
"invocation" vs., say, a Unix or Windows sort of interface).
>The answers to these questions heavily influence the feasible system
>structures that can be implemented.
Perhaps "system structures" in some deep sense, but not the
system "call" interface - I argue. That is, one can adopt an
object-capability system call interface and still have the
flexibility to meet any desires along the lines of the above
questions.
> > I'm not quite sure what you mean by "durability", but I expect
> > that should also be included at a higher level.
>
>Resources have different durability. A page of memory allocated from
>the system has different durability than a page of memory mapped by
>the client, if the principle holds that the resource owner can destroy
>the resource at any time.
I expect it will be true that a resource "owner" can destroy such
a resource at any time in any case. What's the problem?
> > Fine. Provide such an interface. There's nothing about the base
> > object-capability model that prohibits doing so.
> >
> > >Note that people who advocate a synchronous call primitive do not say
> > >that select() can be implemented on top of it if pressed on this
> > >point.
> >
> > Sorry, I do (as I did above).
>
>Are you saying that it is adequate to start 5000 processes to poll
>5000 file descriptors? I would like to hear about your experiences in
>that area.
No. One process suffices - in my experience with an object-capability
system, the NLTSS system.
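As a loose illustration of one process watching many channels, here is a sketch using Python's standard `selectors` and `socket` modules. It relies on the host operating system's readiness mechanism and makes no claim about how NLTSS did it; the count of 50 channels and the chosen channel indexes are arbitrary stand-ins.

```python
import selectors
import socket

N = 50  # small stand-in for the 5000 descriptors in the question

sel = selectors.DefaultSelector()
readers, writers = [], []
for i in range(N):
    r, w = socket.socketpair()
    sel.register(r, selectors.EVENT_READ, data=i)  # data = channel index
    readers.append(r)
    writers.append(w)

# Simulate three clients becoming active.
for i in (3, 17, 42):
    writers[i].send(b"x")

ready = set()
for _ in range(10):  # bounded number of rounds so the sketch always terminates
    for key, _events in sel.select(timeout=1.0):
        key.fileobj.recv(1)      # consume the byte that made it readable
        ready.add(key.data)
    if len(ready) == 3:
        break

# Clean up every channel.
for r, w in zip(readers, writers):
    sel.unregister(r)
    r.close()
    w.close()
sel.close()
```

A single loop services every channel that becomes ready; no per-descriptor process is started.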
> > I guess I should also mention that I'm not concerned about
> > what I would call "out of band" perfection in the wrapping
> > mechanism, specifically anything like "EQ?" or "MyCap?".
> > It is enough for me that functionally a wrapped object
> > (capability) behaves in the same way as the original
> > in terms of what happens when data and capabilities
> > are sent in and come back.
>
>Note that there is no system to my knowledge which implements a
>membrane primitive,
and I believe it's appropriate that they not. What systems do
implement (e.g. RATS, KeyKOS, EROS, etc.) is an extension mechanism
which provides some amount of ability to emulate (wrap) other
objects in the system. My argument amounts to suggesting that
the extension mechanism on such systems can and should be
able to wrap all their primitive objects and that it should
be able to map to a "standard" object (e.g. through a remoting
network interface).
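A rough sketch of what "wrapping" means functionally: the wrapper exposes the same invocation shape as the object it wraps and forwards whatever is sent in. The Python below is illustrative only; `make_counter`, `wrap`, and the logging behaviour are hypothetical and do not describe the extension mechanism of RATS, KeyKOS, or EROS.

```python
from typing import Callable, List, Tuple

Reply = Tuple[bytes, list]
Invoke = Callable[[bytes, list], Reply]


def make_counter() -> Invoke:
    """Hypothetical primitive object: replies with its invocation count."""
    count = 0

    def invoke(data: bytes, caps: list) -> Reply:
        nonlocal count
        count += 1
        return str(count).encode(), []

    return invoke


def wrap(target: Invoke, log: List[bytes]) -> Invoke:
    """Return an object with the same interface as target.

    Every invocation is forwarded unchanged; the wrapper only records
    the data that was sent in.
    """
    def invoke(data: bytes, caps: list) -> Reply:
        log.append(data)
        return target(data, caps)

    return invoke


original = make_counter()
seen: List[bytes] = []
wrapped = wrap(original, seen)

first = wrapped(b"tick", [])
second = wrapped(b"tick", [])
```

A client holding `wrapped` cannot tell it from `original` by what comes back from an invocation, which is the only sense of equivalence argued for here; an "EQ?" style test is deliberately out of scope.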
>and that all capabilities fetched through wrappers
>are transferred, not delegated through the membrane.
By "delegated" are you referring to the Responsible Delegation
mechanism that we so recently started discussing (namely including
the notion of identity)? If so, then this isn't surprising, since
as far as I know this is a newly invented mechanism. However,
I believe it important that even existing object-capability
systems can provide for such delegation through their
existing extension mechanism.
>That means that
>all existing (to my knowledge) performance measurements are biased
>towards the wrong (in my opinion) model.
Hmmm. I've argued that the Horton mechanism (I'll just use that term
to avoid repeating the "responsible delegation" phrase so often)
is only important in exchanges managed by people - either person to
person delegation or person to process delegation. Any time a
human action is an initiator I believe the performance requirements
are significantly less. Still, you may have a point. It remains
to be seen whether such a mechanism proves valuable in practice and
what the performance trade-offs are for its use.
> > >It's not that "an" interface imposes necessarily performance problems.
> > >However, the "wrong" interface certainly will. What is the "correct"
> > >and what is the "wrong" interface depends greatly on system details
> > >that lie well outside of what is covered by your proposal.
> >
> > At this point I don't see anything we can do but respectfully disagree.
> > I have my experience in this area, described above. I can only
> > guess that you have some experience where an interface faithful
> > to the object-capability paradigm imposed unacceptable overhead
> > and you believe no object-capability interface could provide acceptable
> > overhead. Perhaps you can describe your experience in this
> > regard and we can discuss it.
>
>Just for the record: I don't claim that no object-capability interface
>could provide acceptable overhead. I am saying that we do not have
>sufficient evidence to claim that a single universal mechanism is good
>enough for all applications.
And, for the record ;-), I am not arguing that a single universal
mechanism is good enough for all applications, but that a wrappable
and mappable object-capability interface is sufficient for all
applications. There are a variety of interfaces available within
that constraint.
>[Points about Mach deleted]
>
> > That is, the trade-off isn't in the interface, it's in the
> > service implementation.
>
>I don't think it's one or the other. Services can only make use of
>the available interfaces, so the interfaces determine the feasible
>design space for services. Also, there may be outside constraints on
>the feasible server designs. It simply isn't an option to put
>everything into the kernel.
Why not? If the comparison is to a monolithic kernel system and
you require comparable performance, you may have no choice but
to eliminate domain changes within one or more service implementations.
Domain changes have a cost. This cost may vary between a relatively
small language thread exchange cost to a rather heavy register
exchange cost for a heavy weight scientific processor (as we
had to deal with for NLTSS). Whatever the cost, if you have
more of them in a modular service implementation those additional
domain changes will add time (cycles) to a service that need
not be there in a monolithic kernel implementation.
If the constraints are tight enough on the service times
(e.g. in our case low latency access to solid state memory
through a file system interface) then you may have no choice
but to cut out domain changes in the service implementation.
I simply argue that in my experience (which I believe is
pretty generally applicable) it isn't the wrappable
and mappable nature of the object-capability "call" interface
that will constrain performance.
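The arithmetic behind this argument is simple enough to write down. The sketch below is in Python, and every number in it is made up purely for illustration (none are NLTSS measurements); `service_time_us` is a hypothetical helper.

```python
def service_time_us(work_us: int, domain_changes: int, change_cost_us: int) -> int:
    """Total service time: useful work plus one fixed cost per domain change."""
    return work_us + domain_changes * change_cost_us


# Hypothetical figures: 20 us of real work, 15 us per domain change.
modular = service_time_us(work_us=20, domain_changes=6, change_cost_us=15)
merged = service_time_us(work_us=20, domain_changes=2, change_cost_us=15)
saved = modular - merged
```

With these invented figures the modular layout costs 110 us and the merged one 50 us. The client makes the same call in both cases; only the number of domain changes behind it differs.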
> > >An analysis of how these failures can be avoided by careful
> > >construction of the microkernel primitives is contained in the
> > >following paper. Also contains references to other projects.
> > >http://www.l4ka.org/publications/paper.php?docid=642
> > >
> > >This last paper illustrates that the requirements for the
> > >communication primitives go beyond what you specified. This does not
> > >mean that the decisions made by Liedtke are the only feasible ones,
> > >but they show that the design constraints are tight. You have to make
> > >an effort to get a fast IPC mechanism, and you can not design the IPC
> > >mechanism arbitrarily.
> >
> > I didn't argue that care isn't required. Certainly any interface must
> > be carefully designed with performance in mind. This is particularly
> > true for an interface (e.g. an underlying invocation or RPC interface)
> > on which much of a system will depend. Even then, however,
> > I argue that techniques are available for improving performance
> > that can be done behind the scenes and needn't impact the
> > base object-capability interface.
>
>What is "behind the scenes" for you?
Whatever is on the other side of the "invoke" call.
>For me, it is everything that is
>not in the kernel, which means just about everything except for the
>basic mechanisms. Maybe that's a difference between us that can
>explain some of the confusion.
Sure. When I implement system functions I see all parts of the
system as "in play". I typically begin with a design that's as modular
as possible (e.g. think postfix with multiple internal processes).
However, since what matters to the user (client, customer) is
what services are provided and how well they perform, if I
started with communicating processes within a service implementation,
I have no problem merging their implementation ("behind the scenes"
of the invocation/system call) to improve performance when needed.
Of course it helps if the basic architecture is preserved. E.g.
in our case the modularity and code was preserved, but we traded
off using a lower latency "thread" switch between what were
previously protected separate "process" domains (requiring a
heavy weight register exchange).
> > ><snip>
> > >It took years for the microkernel community to develop a fast IPC
> > >mechanism (due to Liedtke). Now it is taking years for the L4 group
> > >to solve the resource allocation problems involved in managing the
> > >mapping database. And even if they succeed it is unclear if their
> > >mechanisms are universal. Comparing KeyKOS/EROS/Coyotos with L4 shows
> > >many similarities, but also some fundamental differences. The fact that
> > >even a single level of indirection makes these systems non-competitive
> > >with traditional monolithic kernels without an object/capability system
> > >should make one careful about any exaggerated claims.
> >
> > I described my approach to dealing with a "single level of indirection"
> > above. Namely, eliminate it where needed, smash the mechanism
> > into a monolithic kernel (which may be needed to provide the required
> > performance), but leave the interface the same. In my experience
> > this works - though it may not be as pretty or analyzable, etc. as
> > some may wish. It can meet the performance goals, and it may
> > be the only approach that will. It can also meet the functional
> > goals of fully wrappable object-capabilities.
>
>Frankly, I don't believe that it is the only approach that will.
Hmmm. I think that, given the way I stated the problem (cost too high
for domain exchanges internal to an implementation), the only
solution is to speed up or eliminate the domain changes that
are the source of the unacceptable cost. If you assume that
the cost for the individual change is irreducible, then it
seems eliminating some domain changes is all that's left. That
is certainly the situation that we found ourselves in. I
expect (hope) that this situation is uncommon. All I'm pointing
out is that it's available. It's available in a form that's
independent of the "call" interface - whether a wrappable/mappable
object-capability "invocation" or a more traditional larger
set of semantically specific system calls as with Unix or Windows.
>I see now that you do not seem to be all that interested in using IPC
>as a feasible basis for user-level system construction, and are
>satisfied with a nicer set of kernel interfaces, that achieve, as you
>say below, POLA and virtualization. Ok, I see the value in that. I
>am not sure if people will go that way beyond what is already
>happening with privilege separation and the likes, but I can see where
>you are going.
>
> > Clearly you can't demand a modular design with many domain
> > changes per request, where domain changes have a significant
> > cost, and meet a stringent performance goal. Something
> > has to give. If the performance requirement is ascendent
> > then I argue that the modularity must be sacrificed.
>
>I think that's overly pessimistic. Instead, I think the solution is
>to avoid many domain changes per request.
Isn't that what I said?
>I optimistically believe
>that functional membranes can be implemented in the kernel efficiently
>as a fundamental mechanism.
Fine. Do your best. More power to you. The more efficient you
make your domain change the more modular your design can be and
still meet whatever performance requirements are driving your
implementation.
>Furthermore, Neal has a proposal for
>fine-grained resource delegation without interposition, so that
>resource management can be done without many domain changes per
>request as well.
No problem. As I say, my goal is at the interface level, arguing
for value in providing a wrappable/mappable (including an implemented
mapping) object-capability interface on all systems.
> > I don't believe that an appropriately designed object-capability interface
> > is (in my experience) or will be a significant issue. That interface
> > has a value of its own (POLA, virtualization, etc.) that can be
> > maintained along with meeting performance goals.
>
>[...]
>
> > >You make it sound like Mach never happened,
> >
> > Heh. For me Mach didn't happen.
>
>Some guys are born lucky :)
>
>[...]
>
> > All I can say is that in our system we had no limitations on message
> > size (the buffers could be as large as the process memory). Any of
> > our operations could be what you regard as "stateful". What's the
> > problem? Our servers handled the worst case behavior by clients
> > (as I described elsewhere with "good guy" timeouts - an issue I
> > hope to get back to) and we considered it quite a "pretty sight."
>
>The first problem is that the bound on the IPC send operation is
>potentially too large, unless you have a preemptible IPC mechanism.
>This can be fixed by limiting the message size appropriately by the
>kernel.
As you note, the preemptible IPC mechanism is an alternative.
>The second problem is that the server naturally absolutely must
>restrict the size of an incoming message to the resources it is
>willing to allocate to this request. This can be fixed by limiting
>the message size in the receive operation.
It can also be addressed by having the server simply return an
error to a sender of more data than it's designed to accept.
I'm just describing what we did that worked for us. The
discussion seems to be somewhat "argumentative". I'm not trying
to suggest that what we did was the only right way. What I
am trying to argue is that there is no significant cost in
providing a base interface that is a wrappable and mappable
object-capability interface.
Even this approach you are arguing for with fixed buffer limits
on a base object-capability "invoke" operation can work fine
in a wrappable/mappable interface. Just because we did it
differently (and I mentioned so in various discussions) doesn't
mean that I'm arguing that it even should be done that way.
All I'm arguing is that the interface "should" be a wrappable/
mappable object-capability interface (to provide the value
of POLA and network level interoperability between systems
at the object sharing level).
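The server-side alternative mentioned above (return an error to a sender of too much data, rather than capping the message size in the interface) might look roughly like this. It is a Python sketch with hypothetical names; the 4096-byte limit and the reply strings are invented for illustration.

```python
from typing import Tuple

MAX_REQUEST_BYTES = 4096  # hypothetical limit chosen by this server, not by the interface


def handle(data: bytes, caps: list) -> Tuple[bytes, list]:
    """Service one invocation, refusing requests larger than the server allows."""
    if len(data) > MAX_REQUEST_BYTES:
        # The base interface accepted the message; the server rejects it.
        return b"ERR too large", []
    return b"OK " + str(len(data)).encode(), []


small = handle(b"a" * 10, [])
large = handle(b"a" * 5000, [])
```

The limit is a property of this one service. A different service behind the same invocation interface could pick another limit, or none.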
>The third problem is that stateful communication has a significant
>build-up overhead if sessions are instantiated frequently and are of
>short duration, as often is the case in a fine-grained capability
>system. This can not be fixed except by using a stateless server
>design, or by avoiding delegation of capabilities.
>
>The fourth problem is that sessions expose the identity of the invoker
>of a capability, which may be a security violation (depending on your
>level of paranoia).
>
>The fifth problem is that N cooperating clients can perform a DoS
>attack at the server.
>
>The sixth problem is that in many cases it is not clear what an
>appropriate time out is (although some people say that it would be a
>mistake to build systems without timing considerations in 2007, so
>this may be a virtue as well).
>
>Given your example, I consider problems 1, 2, and 6 as non-critical.
>But 3 and 5 are serious, and I am of mixed opinion on 4.
You consider these problems as barriers to use of a wrappable/mappable
object-capability interface (I'm starting to get into a rhythm here)?
Problem 5 will be there with any interface. Regarding problem
4, I don't see why you argue that a "session" exposes the identity
of an invoker.
Does that leave us with problem 3? How is problem 3 impacted by
the nature of the base interface, whether a wrappable/mappable
object-capability interface or otherwise?
> > One other thing I'll mention in passing regarding your designs
> > with fixed RPC buffers. I can only assume you are suggesting
> > such designs because the user buffers are copied at some point
> > into system buffers apart from their ultimate destination.
>
>No, the motivation is to get a bound on the length of the IPC
>invocation in the kernel.
I appreciate that now. There is the trade-off noted above in
that choice.
>Also, it seems clear to me that the server
>must limit the length of messages it accepts.
Right, but this need not be an imposition on the base interface
that's used.
> > I sense we may be at something of an impasse.
>
>It probably was just a misunderstanding of what you said earlier.
Let's see. Interesting discussion.
--Jed http://www.webstart.com/jed-signature.html