[cap-talk] Notes from Butler's 2006 Usenix (was: Re: Butler Lampson's upcoming talk)

Jed Donnelley jed at nersc.gov
Wed Apr 9 21:13:05 CDT 2008


Here are my notes on Butler's talk.  Some places
literal transcription (when I think others on cap-talk
might care), other times I didn't write anything
when it didn't interest me, many times just some
overview statements.  Very rough, but at least you
can skim/scan it, which is difficult with audio.

If you want to focus just on his negative comments
about capabilities scan down to the first "_______".
His canonical example of a capability system that failed
is the Cambridge Cap system.  He basically argues
that all capability systems will fail for the same
reason and he knows because he did "serious" work
on capability systems.  The capability paradigm was
not defended in the questions and it wasn't discussed
except at that one point.  It's rather short and I
have a note where it ends.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Mostly reasonable speculation and overview until chart 7

On chart 7 he says up front "Highly opinionated overview
of computer research in the last 30 years."

* marks things that he did serious work on himself.  So ...

He believes that he did "serious" work on capabilities.
I believe his work was rather insignificant (one system
that never got into production followed by a lot of
talk with others and much criticism), but I guess
that's debatable.

An interesting point he made is that he argued that
object oriented programming, despite being very
successful (he seems to be measuring "success" by
market penetration), has caused any systems with
a "reasonable amount of modularity" to be "badly
hurt by OO programming" "because of all this
'nonsense' about inheritance and subclassing,
that 'completely wrecks the modularity of your
system'."  He argues that OO programming has done
a "tremendous amount of harm" - despite being
"successful."

Grumble.  He starts his discussion about things
that "looked promising" but turned out to be
failures by saying,

"I hope there are people in this room working on
these things and that I will be successful in
dissuading them from continuing to work on them."

In other words, Butler says "Don't work on
capabilities." (among other things).

The top of the list is capabilities, where
he says, "Some of these things might possibly
work in the future, but the evidence is against
it."

Then he launches in against capabilities which
I've tried my best to transcribe:
____________
Capabilities seemed like a wonderful notion
for structuring a computing system.  You would
have these fine grained things that gave you
the authority to do this and that, and that way
you could limit the damage that a particular
hunk of the program could do.  It would only be
able to hurt the things that it actually had
capabilities for.

This idea has been tried quite a few times.
It's pretty much been proven to be a bust.
The most compelling story I know came from
one of the most serious attempts to build
a capability system which was done at the
University of Cambridge in England in the
late 1970s I guess it was, mid '70s, late
'70s.  They built the Cambridge "Cap" and
of course the working hypothesis was that if
you had this fine grained capability protection
you'd be able to get a system with a high degree
of reliability much more quickly.  So when they
got it more or less finished and they had real
users and so forth, I said 'Well, was it true?'
Their first reaction was that 'we have no idea.'
because they totally lost sight of what the
original goal of the project had been during
the years of engineering that it took to get
this system actually working.  But then they
went away and they thought it over and they
studied all their debugging logs and they came
back and they said to me (Butler as God again...),
'We found about 1/2 dozen bugs that we think
would have been very difficult to find if we'd
written this system in C to run on a vanilla
machine.  They were found very easily because
they immediately tripped over the capability
protection mechanism.'  They and I agreed that
meant that the idea had been a bust, because
no way was all the engineering that had been
put into this thing justified by this small
improvement.
___________

<that's the end of the discussion about capabilities>
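
<jed-style aside, not from the talk>  For anyone skimming who
wants the "fine grained authority" idea in concrete form, here is
a minimal sketch in Python.  The names (make_read_cap, word_count,
the store dict) are made up for illustration, and Python itself does
not enforce the confinement; a real capability system such as the
Cambridge CAP enforces it below the language level.  The point is
only that a hunk of the program handed one capability can reach that
one thing and nothing else.

```python
def make_read_cap(store, name):
    """Return a hypothetical read capability for exactly one entry."""
    def read():
        return store[name]
    return read


def word_count(read_cap):
    # This code was never given 'store', only a capability to one entry,
    # so the only damage it can do is limited to what that capability allows.
    return len(read_cap().split())


store = {"notes.txt": "capabilities limit damage", "keys.txt": "secret"}
notes_cap = make_read_cap(store, "notes.txt")
count = word_count(notes_cap)
```

Here word_count can count the words in notes.txt but has no name
by which to reach keys.txt.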

About "software engineering" he says that what
it comes down to is "have interfaces and count
the number of lines of code."  He admits that
having interfaces is a good idea, but argues
that "we" didn't need a large discipline like
software engineering to have that.

He says that RPC has been a failure because the
basic idea of it was to mask the fact that the
call is remote and 'that turns out to be a mistake'.
He says that CGI calls are basically remote
procedure calls that work, but they are visible
and thus not strictly RPC (paraphrasing).  "The
consequences for failure model and performance
are so serious that if you try to do that <mask
the remoteness> that you get into serious trouble"
The basic contribution of all the RPC projects
that Butler worked on was negative, because when
you do run into trouble you can't do anything about
it because you can't "see" the remoteness (again
paraphrasing).  The RPC system is unlikely to be
what you need - very sad but true.
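
A rough sketch of the masked versus visible distinction, not from
the talk.  Everything here is hypothetical: flaky_transport stands
in for a network hop, masked_call is a deliberately simplified
caricature of a stub that pretends to be a local call, and
visible_call is the CGI-like style where the caller knows the call
is remote and gets an explicit status back.

```python
class RemoteError(Exception):
    """Raised when the simulated remote side cannot be reached."""


def flaky_transport(request, up):
    # Stand-in for a network hop; 'up' simulates server reachability.
    if not up:
        raise RemoteError("no reply from server")
    return request.upper()


def masked_call(request, up):
    # Looks like a local call.  The failure is swallowed, so the caller
    # cannot tell a real empty answer from "the server never replied".
    try:
        return flaky_transport(request, up)
    except RemoteError:
        return ""


def visible_call(request, up, retries=2):
    # The remoteness is part of the interface: explicit status, and the
    # caller decides how many retries are worth paying for.
    last = ""
    for _ in range(retries + 1):
        try:
            return ("ok", flaky_transport(request, up))
        except RemoteError as err:
            last = str(err)
    return ("unreachable", last)
```

With the server down, masked_call("hi", False) hands back an empty
string that looks like data, while visible_call("hi", False) hands
back a status the caller can act on.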

Persistent Objects, almost literal:
_____________
Persistent objects is an idea that dates from
around 1980.  The fundamental problem with
persistent objects has turned out to be ...
again it seems to be a very appealing idea,
in my programming language I can say, new,
new, new, and set up a bunch of pointers and
build data structures and do all kinds of great
stuff, but I have to do it from scratch every
time I start the program.  The whole idea of
persistent objects is that's not going to be
the case.  We're going to be able to 'persist'
these things and we're going to be able to build
all kinds of wonderful, complex, long lived
data structures.

The problem with that turns out to be that if
you actually build a system like that and you
solve all the implementation problems of making
sure you don't lose any bits and having transactional
updates and all that other stuff and then you do
a few million or hundred million news, and now
you've got 100M objects connected together in some
complex way out there on your disk, the question is
'do you have anything useful or do you just have
a rubble of objects?'  I'm afraid the answer is
'rubble of objects', because of course the program
that constructed these 100M interconnected objects
has bugs and it's changed over time, and the invariants
that it maintains now are not the same ones that
it was maintaining four months ago.  The amount of
discipline that it requires to overcome these difficulties
is just beyond our capabilities <interesting word...>.

So persistent objects have not been very successful
and I must say this is one thing that I would really
like to see work, but it seems very tough, cause this
is a really hard problem to deal with the fact that the
world is changing out from under this persistent pile
of stuff.

The closest analog that I know to persistent objects
that doesn't involve actually using a persistent object
system is using the Unix file system in a certain way
that involves creating lots of directories, populating
them with lots of small files, and maintaining complex
invariants among the files.  Everyone that I know who's
tried to do that has discovered that the amount of
manual labor required to keep the whole thing from
collapsing into a rubble of files is far greater than
they anticipated.  One of the charms of relational
databases is that they force you into a straitjacket
that makes it much more difficult to run into these
problems.
____________
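
To make the "invariants now are not the ones from four months ago"
point concrete, here is a small sketch (not from the talk).  The
record layout, save_v1 and check_v2 are invented for illustration,
and JSON stands in for a real persistent object store.  Old objects
were written when the invariant was "balance in dollars"; the
current program expects "integer cents".

```python
import json


def save_v1(accounts):
    # Old program: invariant was "every record has a 'balance' in dollars".
    return json.dumps(accounts)


def check_v2(record):
    # Current program: invariant is "every record has integer 'cents' >= 0".
    return isinstance(record.get("cents"), int) and record["cents"] >= 0


stored = save_v1([{"id": 1, "balance": 12.5}, {"id": 2, "balance": 3.0}])

# Months later, new records are written under the new invariant and
# land in the same persistent heap as the old ones.
heap = json.loads(stored) + [{"id": 3, "cents": 400}]

# Every object that predates the change now violates the invariant.
violations = [r["id"] for r in heap if not check_v2(r)]
```

Nothing was lost and no bits were corrupted, yet two of the three
objects no longer mean what the current program thinks they mean.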

Security:

Security is something that I've worked at off and on
for almost my whole career in computing.  I think it's
fair to say that security has pretty much been a bust.
We had visions that we were going to be able to build
actual real live computer systems that would behave
in predictable ways more or less regardless of what
attacks were mounted against them.  We've been quite
unsuccessful in doing that, and I
would say that on the whole things have been getting
worse, not better.  The reason for that is easy to
understand.  We have lots more software than we used
to have and people are not really interested in
security.  I mean, they don't like it when the virus
gets in there and scrambles their machine, but other than
that the job of the security system is to say 'no', and
people want their computers to say 'yes'.  So it's not
very surprising that security hasn't been very successful
and I think it will continue to not be successful.  It's
an interesting question that in the light of the fact that
we definitely do have lots of bad guys out there on the
net trying to mess things up, what should we do about
this, but that's a different talk. <Ah, again Butler
knows as God, but again in another talk...>.

RISC (doesn't interest me so much.  You can listen)
Mostly based on need to make Alpha 2x faster than
x86.  Intel was able to make x86 fast by essentially
putting RISC under x86.

On to the Maybes...

We have no clue how to write parallel programs.

Garbage collection has more credibility because of Java,
but there are some problems with it.  C# and the CLR...
Microsoft had to back off from using it.  Very controversial.
Butler didn't expect this at all.  He'd seen garbage
collection used successfully, but it hasn't worked in
Windows.  Mort (the programmer) has different needs than
a serious programmer.  What happens when you run out of
memory?  Mort doesn't care about that.  He believes it
can work, but it turns out to be harder than Butler predicted.

Interfaces and specifications are costly.  Willpower can
go a long ways (e.g. Microsoft), but it has broken down
now at Microsoft.  Programmers hate this because it really
gets in the way, but without it there are limitations
on size.  Successful in hardware, but dubious in software.

Reuse - e.g. ACM algorithms.  Snap together the components.
No business model.  Components have their own world view.
The world views get confused.  Unix filters have worked
and there are some other working examples, but few.  Big
components like browsers, databases, etc. have worked, but
not OLE, COM, web services, etc.  Hundreds of Amazon
services that can be called on...  In the Amazon world
they don't have to work.  Wrong information doesn't matter
so much, all that has to work is the "buy" button and the
accounting.  It is wrong most of the time, but it works.
Microsoft audience hated this because it would produce a
broken OS, but it's better if it doesn't "have to work".


Ah, his comments about why Systems Research didn't invent
the Web I think are applicable to why Systems Research
hasn't yet succeeded with capabilities, but I'm getting
too tired to type more in detail.

Interesting comment about Jim Gray's challenges - now that
Jim is missing and presumed dead.

Butler's grand challenge - reduce highway deaths to zero.

A thought on Butler's speaking - I think his speech
impediment is actually somewhat of an advantage (??).
It tends to keep one's attention better than if he was
just droning on.

Argues that the reliability of Windows is pretty well
matched to users' wants/needs.

Ariane 5 story (floating-point conversion overflow and how it
caused the rocket to fail).
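
For reference, the Ariane 5 failure is usually described as a 64-bit
floating-point value being converted to a 16-bit signed integer that
could not hold it.  A minimal sketch of that kind of conversion, with
a made-up helper name (to_int16_checked) and made-up input values:

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(x):
    # Convert a float to a 16-bit signed integer, failing loudly when
    # the value is out of range instead of silently wrapping.
    value = int(x)
    if not INT16_MIN <= value <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in 16 bits")
    return value


small = to_int16_checked(1234.9)      # fits; truncates toward zero

try:
    to_int16_checked(40000.0)         # out of range for 16 bits
    overflowed = False
except OverflowError:
    overflowed = True
```

Whether the right reaction to the overflow is to stop, saturate, or
carry on is exactly the kind of "avoid catastrophe" question the
talk is about; the sketch only shows where the error arises.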

Butler's definition of "dependable" = no catastrophes.

Catastrophes:

    USS Yorktown  (DB failure stopped the engines)
    Therac-25     (well known)  Butler argues situation is no better
    Loss of crypto Keys
    Destruction of big power transformers

Most effort in avionics and nuclear reactors.  Car can pull over
to the side and stop.  Plane can't.

Trusted Computing Base concept hasn't worked except for national security.

FAA attempt to rebuild system failed.  Can't procure a high tech
system when the contractor is viewed as "the enemy."

"The best is the enemy of the good"  (Jed, what about the "better"?)

Unix is successful because it uses streams of bytes so heavily.

Web is successful because it is simple, HTML and URLs.


Conclusions for Engineers:

    Understand Moore's Law
    Aim for mass markets - Computers are everywhere
    Learn how to deal with uncertainty
    Learn how to avoid catastrophe


When he says why distributed computing doesn't work,
he means computing when there are more than two computers
involved.  <jed's comment - this seems to me to tie into
the "mashup" problem>.  With the Web there are always only
two computers involved.  He says SETI@home is an example
where distributed computing works, but there are few.

AI is in such a mess because whenever any part of it is
successful it gets spun off (e.g. computer vision).

Human centered cyborgs  (book?  if 1/2 of our bodies
    are replaced by machines.  He says we are already
    dependent on technology )

National academies committee, 1.5 trillion traded every
day.  If trading is stopped for days or months Butler doesn't
think that would be a catastrophe.

Butler does like to be controversial.  Nobody defended
capabilities.

RISC is OK, but it didn't come out the way people thought.
It turned out to make much less difference than people thought
it would.  OK, OK, RISC is a huge success because it is
hidden under the x86 CISC.  However, all these claims about
compilers working better with RISC, etc. didn't work out.

"Failure oblivious computing"  Don't generate a segmentation
fault.  Just return 0.  Martin Rinard.  Today's PC is
about 30k times better than an Alto.  Today's PC doesn't
do 30k times as much.  Therefore most of the instructions that
are being executed aren't "useful".
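
The "just return 0" idea from the failure oblivious remark above,
as a toy sketch.  The function names are invented, and this is only
the simplest reading of the idea (bad reads yield 0, bad writes are
dropped), not a description of the actual research system.

```python
def oblivious_read(buf, index):
    # An out-of-bounds read yields 0 instead of raising an error.
    if 0 <= index < len(buf):
        return buf[index]
    return 0


def oblivious_write(buf, index, value):
    # An out-of-bounds write is silently discarded.
    if 0 <= index < len(buf):
        buf[index] = value


data = [7, 8, 9]
oblivious_write(data, 10, 99)   # dropped, data is unchanged
in_range = oblivious_read(data, 1)
out_of_range = oblivious_read(data, 10)
```

The program keeps running either way; whether the answers it then
produces are worth anything is the open question.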

Butler's motto is "Reboot often".

Grid he doesn't understand.  Seems to just be a way to get money.

There is still a lot of angst over why computer science research
didn't come up with the Web...

The end.



