Here's the response to Linus and others on the messaging/syscall discussion.
>That's why the OS boundary HAS to be equivalent to
> read(handle, buffer, size)
>and NOT be equivalent to
> handle->op(READ, buffer, size);
>because by definition, if you can do the "handle->op" lookup, then it's
>not a OS boundary any more - or at least it is a very BAD one. See?
Linus and several other people seem to share the same misunderstanding about what I wrote, so I obviously said it badly.
The confusion, I think, is over whether the descriptor dereference (I did not *say* handle, and I did not *mean* handle) is done by the application or the OS. Several people assumed that I meant it should be done by the application. Indeed, if the application is able to dereference the name (in which case it would indeed be a handle), then there is no real protection boundary. This would indeed be a poor design.
The issue I was trying to get at is both simpler and more complicated. The real question is: "What is an operating system?"
Never mind whether you do it all in one executable (monolithic style) or as a microkernel. There is a core nucleus to any operating system that demultiplexes and dispatches events. In typical designs, these events include interrupts and system calls. I am proposing that a better answer is for the nucleus to demultiplex and dispatch interrupts and *descriptors*, and then let the dispatched-to code figure out what call was made.
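A minimal sketch of descriptor-first dispatch, assuming a toy nucleus: every name here (struct object, struct process, nucleus_invoke, file_invoke, the error values) is hypothetical and comes from no real kernel. The application passes only a small integer; the nucleus looks it up in a table the application cannot touch, and the dispatched-to code decides what the requested operation means.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are illustrative assumptions, not a real kernel API. */
enum op { OP_READ, OP_WRITE };

struct object;
typedef long (*invoke_fn)(struct object *self, enum op op,
                          void *buf, size_t len);

struct object {
    invoke_fn invoke;   /* selected by the nucleus, never by the application */
    const char *data;   /* toy backing store for the example */
};

#define MAX_DESC 8

struct process {
    struct object *desc[MAX_DESC];  /* kernel-private descriptor table */
};

/* Dispatched-to code: it, not the nucleus, works out which call was made. */
static long file_invoke(struct object *self, enum op op,
                        void *buf, size_t len)
{
    if (op != OP_READ)
        return -1;                  /* operation not supported by this object */
    size_t n = strlen(self->data);
    if (n > len)
        n = len;
    memcpy(buf, self->data, n);
    return (long)n;
}

/* The nucleus: validate the descriptor first, then hand off. */
long nucleus_invoke(struct process *p, int d, enum op op,
                    void *buf, size_t len)
{
    assert(p != NULL);
    if (d < 0 || d >= MAX_DESC || p->desc[d] == NULL)
        return -2;                  /* void descriptor: well-defined failure */
    return p->desc[d]->invoke(p->desc[d], op, buf, len);
}
```

The pointer dereference happens only inside nucleus_invoke, on the OS side of the boundary, so the application never holds anything it could dereference itself.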
If you are building a monolithic system, the change in demultiplex order (operation first vs. descriptor first) has no impact on performance.
That said, there are a number of reasons to prefer it:
Before you jump on this, consider that all of the file I/O calls in UNIX are actually capability invocations. So is the open() call: open() implicitly references either the current root descriptor or the CWD descriptor, both of which are part of the process state. If you want to have a discussion in UNIX-centric terms, I'm arguing that *all* system calls should rely on some descriptor in this way, and that it should be well-defined what happens when the descriptor is void.
For example, the whole notion of controlling signal delivery on the basis of parent/child trees is a bad idea. A better design would have explicit per-process descriptors, and give authority to signal based on whether the signaller held the appropriate descriptor. Such a design *can* support the current semantics, but is not limited to the current semantics.
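A rough sketch of descriptor-based signal authority, under stated assumptions: struct proc_desc, RIGHT_SIGNAL, signal_via_descriptor and grant_child_descriptor are invented for illustration and correspond to no existing UNIX interface. The authority check looks only at what the sender holds; the parent/child relationship enters solely through who is handed a descriptor.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DESC 8
#define RIGHT_SIGNAL 0x1u   /* hypothetical right bit */

struct process;

struct proc_desc {
    struct process *target; /* NULL means the slot is void */
    unsigned rights;
};

struct process {
    int pending_signal;     /* toy stand-in for real signal state */
    struct proc_desc desc[MAX_DESC];
};

/* Deliver signo only if the sender holds a descriptor carrying the right. */
int signal_via_descriptor(struct process *sender, int d, int signo)
{
    assert(sender != NULL);
    if (d < 0 || d >= MAX_DESC || sender->desc[d].target == NULL)
        return -1;          /* void descriptor */
    if (!(sender->desc[d].rights & RIGHT_SIGNAL))
        return -2;          /* descriptor held, but without signal authority */
    sender->desc[d].target->pending_signal = signo;
    return 0;
}

/* Emulating the current semantics: hand the parent a signalling
 * descriptor for each child at creation time. */
int grant_child_descriptor(struct process *parent, struct process *child)
{
    assert(parent != NULL && child != NULL);
    for (int i = 0; i < MAX_DESC; i++) {
        if (parent->desc[i].target == NULL) {
            parent->desc[i].target = child;
            parent->desc[i].rights = RIGHT_SIGNAL;
            return i;
        }
    }
    return -1;              /* table full */
}
```

Nothing restricts grant_child_descriptor to parents: the same descriptor could be handed to an unrelated supervisor process, which is the sense in which such a design is not limited to the current semantics.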
Ultimately, I'd note that when you strip away the syntax transformations you discover that above the vnode boundary UNIX is a capability system, with the notable exception of the signalling and process control mechanisms, which are rather crufty. With the introduction of /proc, processes have been given descriptors as well. What we now need to do is find a way to eliminate the legacy interfaces that render principled security in UNIX so difficult.
Jonathan S. Shapiro, Ph. D.
IBM T.J. Watson Research Center
Phone: +1 914 784 7085 (Tieline: 863)
Fax: +1 914 784 7595
Bill Huey <firstname.lastname@example.org> on 06/21/99 07:07:13 PM
To: email@example.com (Linus Torvalds)
cc: firstname.lastname@example.org (bcc: Jonathan S Shapiro/Watson/IBM)
Subject: Re: Some very thought-provoking ideas about OS architecture.
> because by definition, if you can do the "handle->op" lookup, then it's
> not a OS boundary any more - or at least it is a very BAD one. See?
I'm going to wait before I comment on this one.
Some of what's going on is a misrepresentation of current OS theory within academic circles, which changes the context surrounding the message passing thread.
Syscall overhead is an issue, but things are different now in the late '90s, so folks' arguments are rather dated.
I'm going to wait for the IBM guy to reply first.