[e-lang] [cap-talk] Can We Make Operating Systems Reliable and Secure?

Jed at Webstart donnelley1 at webstart.com
Tue May 9 20:16:52 EDT 2006


At 07:19 PM 5/8/2006, David Hopwood wrote:
>Jed at Webstart wrote:
> > At 03:22 PM 5/8/2006, David Hopwood wrote:
> >>Ian G wrote:
> >>
> >>>May be of interest...
> >>>
> >>>=================
> >>>http://www.osnews.com/story.php?news_id=14532
> >>>
> >>>The micro vs. monolithic kernel debate is now very much alive.
> >>>...
> >>
> >>The actual article by Tanenbaum:
> >><http://www.computer.org/portal/site/computer/menuitem.5d61c1d591162e4b0ef1bd108bcd45f3/index.jsp?&pName=computer_level1_article&TheCat=1005&path=computer/homepage/0506&file=cover1.xml&xsl=article.xsl&>
> >>is interesting and well-written. Don't bother with the comments and
> >>other links at osnews.com; they're just a waste of time, with a
> >>superficiality to rival Slashdot.
> >
> > That is an interesting article.  I'll make a few comments here.
> >
> > It's interesting to me that so much focus is given to device drivers.
> > While device drivers are out of control of the writers of the operating
> > system, I would think that the amount of code in the device drivers
> > is relatively small compared to the other code in the system.
>
>You would be wrong. As the article says:
>
># To make matters worse, typically, about 70 percent of the operating system
># consists of device drivers,

I of course noticed the above, but it surprised me.

>which have error rates three to seven times
># higher than ordinary code, (3) so the bug counts cited above are probably
># gross underestimates.
>
>where reference (3) is
>
>   A. Chou, J. Yang, B. Chelf, S. Hallem, and D. Engler.
>   An empirical study of operating system errors.
>   In Proceedings of the 18th ACM Symposium on Operating Systems Principles,
>   Oct. 2001. pp. 73-78.
>   <http://citeseer.ist.psu.edu/chou01empirical.html>
>
>Of course, 70 percent of the code run by any particular OS instance is not
>in device drivers, since most of the supported devices are not present in a
>given hardware setup.

Ah, I think the above hits on the difference.  If I am understanding you, it
seems you're saying that if one includes all the code written for all the
devices that might be connected to a system then the total of the driver
code is 70% of the code written for the system.  I can believe that.
From the above paper:

"Drivers account for over 90% of the Block, Free, and Intr bugs, and over 70%
of the Lock, Null, and Var bugs.

Since drivers account for the majority of the code (over 70% in this
release), they should also have the most bugs."

However, in any given system the amount of code in drivers would depend
on the quantity and type of devices connected to the system.  For a simple
system with, say, well tested disk drivers and a network connection and
perhaps KVM or serial console access I guess the percentage of driver
code is quite small and the code itself is relatively stable and debugged.

>However, that just means that at least some drivers
>aren't tested as consistently as the code that is common to all installations.

I think we are on the same page as above.

>I would *guess*, considering the diversity of PC hardware, that the
>distribution of drivers needed by a system is a "long tail distribution".
>If this guess is correct, then most users will have some drivers that are
>poorly tested.

Hmmm.  Why do you think that?  I would *guess* that most users
would use only common drivers that are relatively well tested and
that only relatively few users would use the less common drivers
that likely have most of the bugs.  Do we see this differently?
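
To put rough numbers on this disagreement, here is a back-of-envelope
sketch in Python.  It assumes (an assumption for illustration, not anything
from the paper) that each driver on a machine is independently a rare, poorly
tested one with probability p_rare.  The function name and the figures are
made up.

```python
def prob_some_rare_driver(p_rare: float, n_drivers: int) -> float:
    """Chance that at least one of n_drivers independently chosen
    drivers is a rare (poorly tested) one."""
    return 1.0 - (1.0 - p_rare) ** n_drivers


# Hypothetical figures: 5% of the drivers in use are rare ones,
# and a machine loads 20 drivers.
print(round(prob_some_rare_driver(0.05, 20), 3))  # prints 0.642
```

With these made-up figures each individual driver is almost always a common
one, yet about 64 percent of machines still carry at least one rare driver.
So both guesses could hold at once, depending on what the real values are.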

> > Regarding paravirtualization approach - of course new Pentium
> > and AMD processors are now coming out that are fully virtualizable.
>
>Well, more easily virtualizable.

Right.  I was just commenting on Tanenbaum's statement:

"...because the Pentium is not fully virtualizable, a concession was 
made to the idea of running an unmodified operating system in the 
virtual machine. This concession allows modifications to be made to 
the operating system to make sure it does not do anything that cannot 
be virtualized. To distinguish it from true virtualization, this 
technique is called paravirtualization."

from 
http://www.computer.org/portal/site/computer/menuitem.5d61c1d591162e4b0ef1bd108bcd45f3/index.jsp?&pName=computer_level1_article&TheCat=1005&path=computer/homepage/0506&file=cover1.xml&xsl=article.xsl&

Given that the newer processors are fully virtualizable (all sensitive
instructions trap), the above concession will no longer be necessary.
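
As a toy illustration of what "all sensitive instructions trap" buys you,
here is a minimal trap-and-emulate sketch in Python.  Everything in it is
hypothetical: the instruction names are just labels, and cpu_execute stands
in for hardware that refuses sensitive instructions in user mode.  It is not
a model of VT-x or Pacifica.

```python
class Trap(Exception):
    """Raised by the (simulated) hardware for a sensitive instruction."""
    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr


# Hypothetical sensitive instructions: disable / enable interrupts.
SENSITIVE = {"cli", "sti"}


def cpu_execute(instr, privileged):
    """Simulated CPU: sensitive instructions trap unless privileged."""
    if instr in SENSITIVE and not privileged:
        raise Trap(instr)
    return f"executed {instr}"


class VMM:
    """Runs an unmodified guest in user mode and emulates what traps."""
    def __init__(self):
        self.virtual_interrupts_enabled = True

    def emulate(self, instr):
        # Apply the instruction to the guest's virtual state only.
        if instr == "cli":
            self.virtual_interrupts_enabled = False
        elif instr == "sti":
            self.virtual_interrupts_enabled = True

    def run_guest(self, program):
        log = []
        for instr in program:
            try:
                log.append(cpu_execute(instr, privileged=False))
            except Trap as trap:
                self.emulate(trap.instr)
                log.append(f"emulated {trap.instr}")
        return log


vmm = VMM()
print(vmm.run_guest(["add", "cli", "mov"]))
```

The guest program needs no changes: the VMM only sees the instructions that
trap.  If some sensitive instruction failed to trap, the guest would touch
(or silently fail to touch) real state, which is the case
paravirtualization works around.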

>VT-x and Pacifica boxes still need quite
>a substantial and complicated virtual machine monitor. Perhaps too complicated
>to be secure, since the reliance set of a typical VMM is larger than that of
>a typical microkernel presenting its "native" kernel interface.

I'm not so sure about that.  Two points.  1.  The interface for a user into
the VMM is quite implicit (e.g. execute a sensitive instruction and get
the VMM to trap and handle it) - leaving potential hackers with less data
flow into the VMM to try to subvert it.  2.  One would hope that the trap
and general VM hardware feature set is relatively stable and can be
cleaned up over some number of years for a given architecture.

Still, I don't want to be seen as advocating the VM approach to
limiting the damage that drivers can do.  I favor the microkernel approach
with some sort of capability tokens passed between the servers
to manage POLA.
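
For concreteness, a minimal sketch in Python of what capability tokens
passed between servers might look like.  The Kernel and Disk classes and all
of their methods are hypothetical names invented for this example; a real
microkernel would keep the table in protected kernel memory rather than rely
on unguessable strings.

```python
import secrets


class Kernel:
    """Toy kernel: maps unguessable tokens to (resource, rights)."""
    def __init__(self):
        self._table = {}

    def _lookup(self, token):
        if token not in self._table:
            raise PermissionError("unknown capability")
        return self._table[token]

    def mint(self, resource, rights):
        """Create a new capability for resource with the given rights."""
        token = secrets.token_hex(16)
        self._table[token] = (resource, frozenset(rights))
        return token

    def attenuate(self, token, rights):
        """Derive a weaker capability; rights can only shrink."""
        resource, held = self._lookup(token)
        return self.mint(resource, held & frozenset(rights))

    def invoke(self, token, op, *args):
        """Perform op on the resource if the capability grants it."""
        resource, rights = self._lookup(token)
        if op not in rights:
            raise PermissionError(f"capability does not grant {op!r}")
        return getattr(resource, op)(*args)


class Disk:
    """Stand-in for a device server's resource."""
    def __init__(self):
        self.blocks = {}

    def read(self, n):
        return self.blocks.get(n, b"")

    def write(self, n, data):
        self.blocks[n] = data


kernel = Kernel()
rw_cap = kernel.mint(Disk(), {"read", "write"})
ro_cap = kernel.attenuate(rw_cap, {"read"})   # hand this to a less trusted server
kernel.invoke(rw_cap, "write", 0, b"boot")
print(kernel.invoke(ro_cap, "read", 0))
```

A server that is handed only ro_cap can read the disk but has no way to
write to it or to reach any other resource, which is the POLA property in
question.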

> > That means that this approach would not have to deal with modifications
> > to the operating systems to run under the virtual machine monitor.
>
>Right, although the distinction between virtualization and paravirtualization
>is not sharp: most VMMs just try to provide a virtualization that works well
>enough to run the most commonly used guest OSes, perhaps with minor changes to
>their configuration or additional/replacement device drivers.

Hmmm.  In that case I must be missing something.  My understanding of
a true VMM is that it provides a virtual image of the real hardware which
can run any OS as a guest OS that can run on the real hardware - except for
timing - which I don't think is generally a problem.  The only relatively minor
issue that I'm aware of is what the guest OS does when it is otherwise
idle.  If it has some sort of idle loop that it keeps running through, that
is awkward - it should still work, but it wastes resources.  If it
blocks (waits) then it can waste fewer resources under a VMM
that doesn't need to be tricky (e.g. looking at instructions, etc. the
way some VMMs like VMware and Xen have to these days).  A VMM
for a fully virtualizable processor can be pretty simple - IF it doesn't
have to get into device management on a device specific basis (e.g.
emulating virtual devices).

I wonder if this is an area where some of these technologies could
be blended?  Perhaps to make VMMs more reliable they could use
separate user mode processes for device management?  It might
be difficult to really constrain such device driver processes as they
ultimately must be given permission to write to users' memory spaces,
but perhaps even that could be effectively limited.

>In support of Tanenbaum's point, Xen considered as an OS is less monolithic
>than most (although it is not a true microkernel, and relies heavily on giving
>the "dom0" domain full privileges to access the hardware). L4 can be used as a
>VMM, and VMWare's ESX Server is sort-of a microkernel, IIUC.

There we go.  It seems to me only the language based protection approach
falls strictly outside the generalized micro kernel methodology.  That seems
to be for strictly performance reasons, though they could be pretty serious
performance reasons.  I wonder if this sort of thinking might come full
circle and see discussion of what one might call a more complex instruction
set that at least contains an efficient domain change instruction?

I noticed that in the Microsoft video they even called their language based
approach a "micro kernel".  I suppose there is technically some sort of
micro "kernel" there, but in that case it is depending on the fact that it
is only running "trusted" software that will only cross domain boundaries
through the communication primitives.

Tanenbaum mentioned the B5500 series.  While I believe it was completely
broken with regard to security/integrity in its dependence on safe languages,
its tagged mechanism did allow memory references to automatically enter
separate "domains" (through a so-called Program Control Word) for access
to shared parameters.  It seems to me that something even simpler to allow
efficient communication between domains is possible.  In my experience
the efficiency/performance problems in this area come from having to save/
restore large register sets.  I'm not sure how that can be done efficiently,
but I expect it can be.

> > Regarding "multiserver operating systems", the idea has been around
> > for at least 27 years (since we implemented such a system starting
> > in 1979, and I don't think it was new at that time).
> >
> > In this section I think I begin to understand a bit more what Tanenbaum
> > means by "device driver".  He says that these "device drivers" "cannot
> > execute privileged instructions or read or write the computer's I/O ports;
> > they must make kernel calls to obtain these services".  It is exactly the
> > code that can directly read/write the computer's I/O ports (often by
> > executing privileged instructions) that I have heard previously referred
> > to as "device driver" code.  It seems that what Tanenbaum (et al.) is
> > referring to as a "device driver" is some higher level software.
> >
> > One question to ask in this scenario is whether kernel software can
> > be written so as to provide the needed services to the various "device
> > driver" processes in a way that is device independent - hopefully while
> > still providing as much protection as possible from failures of the
> > device drivers.  Perhaps this question is answered in the Minix 3
> > implementation.
>
>It isn't the support for standard buses and I/O interfaces (USB, PCI, DMA,
>etc.) that take up the vast amount of device-related code in an OS; it's the
>drivers for particular devices that are accessed via these standard
>interfaces.

That sounds to me like you are saying "yes" to my question above: namely,
that it is possible for the micro kernel to provide device access for a
separate user mode device driver on a device independent basis, that is,
without requiring the micro kernel to have any device specific code.  That's
good to know.

> > Those were
> > tagged architectures that might be called descriptor based machines.
> > They depended on language protection even for user applications.  This
> > approach was very unwise as any user application that was able to
> > exploit a bug in a compiler would forevermore constitute a hole in the
> > security/protection perimeter of the system.  I know a bit about this
> > because I owned such a hole at one time.
>
>I've broken trusted language implementations (Java VMs) as well. But I don't
>see why you say "forevermore"; a security bug in a language implementation
>is not fundamentally different from a security bug in an operating system
>using hardware memory protection, and is no less fixable.

What I'm referring to in the Burroughs implementation is that if a user
could at any one time write arbitrary code (e.g. I got access to
"Communications Algol" and wrote a little routine to accept an array as a
parameter and execute it) then they could save that code as a binary on disk
and execute it at will - effectively taking over the system any time they
want.  The compiler could be fixed (or access to the unsafe compiler removed,
as in this case), but the damage is done.  The binary stays on disk,
available to the user to take over the system at any time.

>It is true that most language implementations are more complicated than most
>microkernels. That's why MSIL and JVM code are bad choices for this; they
>are far too difficult to implement in a simple, verifiable way.
>
> > I don't see any discussion of any sort of authority token that can
> > be communicated to support POLA in Singularity.
>
>It's a message passing system, and as far as I understand, the references in
>MSIL are supposed to be unforgeable (from managed code; all non-managed code
>must be fully trusted). I don't know how successfully it avoids ambient
>authority traps. The two tech reports referenced at
><http://research.microsoft.com/os/singularity/> seem relevant, but I haven't
>read them yet.

If all authorities are communicated through messages as references then
the problem is solved.  If not then not.  There is a lot to read to be sure.
I hope this stuff gets worked out.  I find it a bit difficult deciding what to
read and how to communicate to have any positive effect.

> > At least Minix 3 is available as open source software.  I don't understand
> > how Microsoft Research expects to be able to compete with closed source
> > software like Singularity.
>
>There is a video where the main designers of Singularity basically say that
>they don't expect to produce a successful consumer OS; that it's all just
>blue sky research, from which various ideas may or may not be folded into
>Windows. I'll try to find the URL for it.

What a waste.  I was interested to see that video (below).  I just had an
email from an old colleague, Roy Levin, who is listed as one of the "members"
on the Singularity page:

http://research.microsoft.com/os/singularity/

I suppose I could ask him in passing.  He worked on capabilities many years
ago at CMU with Hydra, but says he does distributed systems research and
hasn't worked with "capabilities" in many years.  I have to admit that I
don't really understand that.  How can you do distributed systems work
without communicating permissions (authorities) between components of the
distributed systems?  I call such communicated permissions "capabilities".
Maybe what we're talking about is the technical means for communicating
such permissions, or perhaps the problem is deeper.

I keep digging in wherever I get a chance.  To me the base problems and
solutions seem pretty clear.  I can understand some of the performance
trade-offs (e.g. our processes moved into the kernel), but it seems to me
that a base micro kernel with communicable permissions can be made
efficient enough (delta from a monolithic kernel) that such an approach
should ultimately ascend - perhaps mirroring Tanenbaum.

I sure hope such work comes to the fore before blue sky may or may not
get folded into Windows.  Things seem such a mess today. ... <time
passes while I watch the video> ...

I listened to the video that you shared:

http://channel9.msdn.com/ShowPost.aspx?PostID=68302

Near the end of the video I see that they say they only accept code
in an intermediate language that they check before running...  Does
that in a way just move the overhead from one place to another?  Instead
of having the overhead at the process domain change (changing registers,
memory map, etc.) you have the overhead at the time you load the
software.  Hmmm.  I guess that at least only happens at initialization.
Perhaps that is a good trade-off.

I wonder what that intermediate language looks like?  He said they injected
safety checks into the code that they generated from that intermediate
language.  Does that mean that they could in principle accept code from any
compiler as long as it generated the appropriate intermediate language?
Seemingly not, as a C compiler couldn't generate safe code if it can generate
arbitrary pointers.  Perhaps this is an area worth some study.

It seems it would be possible to get the best of both worlds.  Namely
one could accept safe code into a trusted heavy weight process to interact
with other safe code modules.  Untrusted binary code would need to run
alone in a heavy weight process and pay the overhead of context switches
for every communication.  That seems reasonable to me.  I can imagine
many applications (e.g. some of those that run on our scientific systems)
that can make good use of unsafe code to improve their performance
enough to make it worthwhile to pay the price of being in a heavy weight
process for communication.
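
A small sketch of that placement rule in Python, with made-up names (Module,
verified_safe, assign_processes) that come from neither Singularity nor
Minix 3: verified safe modules share one process, and each untrusted binary
gets a heavy weight process of its own.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    verified_safe: bool  # True if it passed the (hypothetical) safety check


def assign_processes(modules):
    """Return a list of processes, each a list of module names.

    All verified safe modules share the first process; every other
    module is isolated in its own process.
    """
    shared = [m.name for m in modules if m.verified_safe]
    isolated = [[m.name] for m in modules if not m.verified_safe]
    return ([shared] if shared else []) + isolated


layout = assign_processes([
    Module("file_server", True),
    Module("net_server", True),
    Module("fortran_solver", False),  # unsafe but fast scientific code
])
print(layout)  # prints [['file_server', 'net_server'], ['fortran_solver']]
```

Communication inside the shared process needs no context switch, while every
message to or from fortran_solver pays for one.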

It was good to hear from those folks at Microsoft Research that work is
ongoing in this area.  I still find it sad that it is happening in the
closed environment and not in the open source (e.g. academic) community.
Still, given that much of the work seems to be in the language area, perhaps
there is hope that it will filter out effectively over time.

--Jed http://www.webstart.com/jed/ 



