Side-effect free containers for E

Dean Tribble tribble@netcom.com
Wed, 16 Aug 2000 00:54:50 -0700


I did my one (or was that two) E message this week, but I can't resist.

> From now on, I'll refer to the
>object-without-security-per-se perspective as simply the object perspective.

Say rather, the "object-only" perspective.

>For concreteness, let's contrast two protocols for a Mutable Cell (ie, Slot
>or Location).  Using Java interface declarations as an interface description
>language (for which they are not bad):
>
>Protocol M (for MarkM):
>
>      public interface CellReader {
>          Object getValue();
>      }
>
>      public interface CellEditor extends CellReader {
>          CellReader readOnly();
>          void setValue(Object newValue);
>      }
>
>Protocol DT (for Dean & Tyler):
>
>      public interface CellReader {
>          Object getValue();
>      }
>
>      public interface CellEditor {
>          CellReader readOnly();
>          void setValue(Object newValue);
>      }

First a point about protocols.  With smaller, more focused protocols, you 
can have decent method names!  "get" and "set" or "value" and "setValue" 
are entirely sufficient names in the above case, but I won't quibble 
further about them.  "readOnly" on the other hand, is not the *intent* of 
the message for the DT protocol, so I will rename it to "view", which 
better captures the intended contract.  Note that "readOnly" does seem 
appropriate for M because it is the read-only variant of the same instance 
(i.e., readOnly is a plausible thing to ask for from the readWrite variant).
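
To make the renaming concrete, here is a minimal Java sketch of protocol DT with "view" in place of "readOnly".  SimpleCell and ViewDemo are names invented for this illustration and do not appear in the thread; the only point is that the view is a separate object exposing nothing but getValue.

```java
// Protocol DT with "readOnly" renamed to "view", as suggested above.
interface CellReader {
    Object getValue();
}

// In protocol DT the editor does NOT extend CellReader.
interface CellEditor {
    CellReader view();
    void setValue(Object newValue);
}

// SimpleCell is a hypothetical implementation, not part of the original thread.
final class SimpleCell implements CellEditor {
    private Object value;

    SimpleCell(Object initial) {
        this.value = initial;
    }

    @Override
    public CellReader view() {
        // The lambda is a distinct object; it reads the field each time it is called.
        return () -> value;
    }

    @Override
    public void setValue(Object newValue) {
        this.value = newValue;
    }
}

final class ViewDemo {
    public static void main(String[] args) {
        CellEditor ce = new SimpleCell("first");
        CellReader cr = ce.view();
        ce.setValue("second");
        // The view tracks the cell rather than taking a snapshot.
        System.out.println(cr.getValue()); // prints "second"
    }
}
```

Reading through an editor is then spelled ce.view().getValue(), the Java analogue of "ce view getValue".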

>I claim that Protocol DT is preferred from a capability perspective, but that
>protocol M is preferred from an object perspective.  I believe Dean claims
>that protocol DT is preferred from both perspectives.  If Dean should prove
>right on this example, then he's probably right on the more general claim.

The problem with this as a challenge is that most of the consequences of 
subtle differences in design manifest in *systems* rather than small 
examples.  Nonetheless, I will take a stab at it.

Note that for the real Cell protocol, it's not clear to me why the 
"readOnly" method is present at all.  I'd occasionally want to give 
components the ability to report without the ability to read the report.

OK, here we go.  Since it's late, I anticipate being appropriately 
punchy.  We'll see....

>In protocol M, when someone holding a CellEditor, ce,
>
>      define ce :CellEditor := ...
>
>wishes to read the value of the Cell, they merely do:
>
>      define value := ce getValue
>
>whereas in protocol DT, they must do
>
>      define value := ce readOnly getValue

define value := ce view getValue

>(which would also work in protocol M).

Well as soon as the same basic operation can work in more than one 
protocol, you've got maintenance problems.  Six months later you are 
looking at code and wondering why you chose to do it one way in one place, 
and a different way in another.  You see both alternatives in the same hunk 
of code, but don't have the source for the Cell library routine, so 
you're not quite sure what the difference is any more, but 
you...wait...damn...I guess one must take a snapshot of the value...no...hmmm.

>For this case, the second is more
>complicated than the first with no compensating virtue.

This is a little bit like a straw man.  With appropriate design, if what 
you need is to read the value, you probably already have the read 
facet.  It just works out that way more often than seems likely.  I always 
remember Norm's story about questions people ask when they are learning O-O 
programming:  how do you know you'll have the object with the right 
behavior and state when you need to invoke it?  After some experience the question 
evaporates (or often inverts: why would you need to invoke it if you didn't 
already have it?).

Failing the above case, it is four clear, obvious characters that a type 
checking compiler will happily complain about the absence of, that address 
other key issues in O-O systems (not to mention "security per se" :-).

>Now, let's say we have a foo function that only needs read access to the
>Cell.  It might be defined as
>
>      define foo(cr :CellReader) ... { ... }
>
>The holder of ce can call this either as
>
>      foo(ce)

Let's call this case 1 (I'll avoid pejorative names for the cases :-)

>or
>      foo(ce readOnly)

Let's call this case 2, but note that for DT, this would be foo(ce view)

>The first, of course, violates the principle of least authority, and so is
>much worse from a capability perspective.  However, the second is arguably
>more complex, and so is arguably worse from an object perspective.  In
>protocol DT, the first is an error that's caught early; either statically,
>if we've got a static type checker; or dynamically, by the ":CellReader"
>SlotGuard.

Can you say Const Correctness?  :-)  It's six months later.  The basic 
argument against M in this example is: *what did you mean*?  In 
case 1, did you *mean* to pass the writeable version of the cell (file, DB 
record, etc.)?  Let me argue with an all too plausible scenario:  I guess 
you must have, because you obviously could have passed the view 
facet.  I've now inherited the foo code, and in all cases but one, the 
calls to foo pass the edit authority.  Since my job is to add a feature to 
foo that ensures that the value of the Cell is always greater than 0, I'll 
just change the interface, oh wait, there's that one call, but it's 
guaranteed never to have a negative value because it's passing me the value 
from getFoo (which oh by the way is typed to return a CellReader).  I'll 
just check if the value is less than 0, and only then cast it down to the 
CellEditor and change the value to 0.  Since this code is part of a 
brokerage system for high rollers, they don't often run negative 
balances.  As a result, I will have cashed out before this trivially 
preventable violation of invariants assumed by your code about when the 
value could change results in the loss of a major account :-).
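
The scenario can be sketched in Java.  This is an illustration under assumptions, not code from the thread: the M and DT suffixes, the classes CellM and CellDT, and the method fooClampsNegative are all invented so that both protocols fit in one file.  It shows that under M a reference typed as CellReader can be cast back down to the editor, while under DT the view facet has nothing to cast down to.

```java
interface CellReader {
    Object getValue();
}

// Protocol M: the editor IS a reader (suffix added for this sketch only).
interface CellEditorM extends CellReader {
    CellReader readOnly();
    void setValue(Object newValue);
}

// Protocol DT: the editor only hands out a separate view facet.
interface CellEditorDT {
    CellReader view();
    void setValue(Object newValue);
}

final class CellM implements CellEditorM {
    private Object value;
    CellM(Object initial) { value = initial; }
    @Override public Object getValue() { return value; }
    @Override public CellReader readOnly() { return () -> value; }
    @Override public void setValue(Object newValue) { value = newValue; }
}

final class CellDT implements CellEditorDT {
    private Object value;
    CellDT(Object initial) { value = initial; }
    @Override public CellReader view() { return () -> value; }
    @Override public void setValue(Object newValue) { value = newValue; }
}

final class Maintainer {
    // The "six months later" foo: declared to need only read access, but it
    // clamps negative values by casting down to the editor when it can.
    static boolean fooClampsNegative(CellReader cr) {
        int v = (Integer) cr.getValue();
        if (v < 0 && cr instanceof CellEditorM) {
            ((CellEditorM) cr).setValue(0);
            return true; // wrote through a reference typed as read-only
        }
        return false;
    }

    public static void main(String[] args) {
        CellM passedWhole = new CellM(-5);
        System.out.println(fooClampsNegative(passedWhole));   // true: case 1 under M
        System.out.println(passedWhole.getValue());           // 0: the caller's value changed

        CellDT dt = new CellDT(-5);
        // fooClampsNegative(dt) would not compile: CellEditorDT is not a CellReader.
        System.out.println(fooClampsNegative(dt.view()));     // false: nothing to cast to
        System.out.println(dt.view().getValue());             // -5: invariant preserved
    }
}
```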

Insufficiently persuaded?  There was a lovely Java exploit that I believe 
our Vulcan cohort Vijay Saraswat helped to document using the fact that 
reading a cell is covariant and writing a cell is contravariant.  I won't 
work too hard to explain this, but here's a brief explanatory 
example:  type B is a subtype of A (abbreviated A < B).  Is a collection of B 
(which I will simply abbreviate as B[]) a subtype of A[]?  Let's consider 
reading:  For an operation that takes an A[] as an argument, a caller can 
pass in a B[] because all the objects read from the collection will be of 
type A as expected by the operation (they all happen to be of subtype B, 
but that's perfectly fine), and everything is type safe.  Thus, for 
reading, A < B implies A[] < B[], so [] is covariant.

Writing however is not covariant.  If that same operation assigns a value 
of type A into the array that it thinks of as A[], the caller will all of a 
sudden have an array ostensibly of type B[] that contains an instance of 
type A that is not a B.  Ooops, type error.  If we reverse the types and 
the caller has an A[] which the operation treats as a B[], the operation 
assigns a B into the A[], and the caller is happy:  it has an A[] that 
happens to have Bs in it--no problem.  Thus, for writing operations, A < B 
implies B[] < A[], which is contravariant.  Note that I'm using arrays as 
examples because this is how I keep this relatively subtle but nasty 
inheritance problem straight.
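
Java arrays make the read/write asymmetry easy to observe, because Java treats B[] as a subtype of A[] and checks stores at run time.  The following is a small sketch with placeholder classes A and B and an invented holder class ArrayVariance; it is not the exploit mentioned above, only the underlying typing rule.

```java
class A {}
class B extends A {}

final class ArrayVariance {
    // An operation written against A[] that only reads.
    static A readFirst(A[] items) {
        return items[0];
    }

    // An operation written against A[] that writes a plain A.
    static void writeFirst(A[] items) {
        items[0] = new A();
    }

    // Reports whether the write was rejected by the run-time store check.
    static boolean writeFails(A[] items) {
        try {
            writeFirst(items);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        B[] bs = { new B() };

        // Reading a B[] through an A[] reference is safe: every element is an A.
        A first = readFirst(bs);
        System.out.println(first instanceof B);        // true

        // Writing an A into what is really a B[] is caught only at run time.
        System.out.println(writeFails(bs));            // true

        // The same write into a genuine A[] is fine.
        System.out.println(writeFails(new A[1]));      // false
    }
}
```

The compiler accepts all three calls; only the run-time check on the store distinguishes them, which is the cost of letting a writable array be covariant.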

What does all this mean?  If arrays (or any other parameterized type) have 
separate read and write facets, they can be correctly covariant and 
contravariant respectively.  As a relatively trivial example, this enables 
correct support for inheritance of parameterized collection classes (in 
contrast to the, um, more limited Java array classes).  This would show up 
in your Cell example were it parameterized.  What's most surprising is that 
it crops up very late in very nasty design corners even in systems without 
type checking.  It arises for any abstraction in which A and B are partner 
types in a pattern, such that both will be sub-typed to produce a concrete 
realization of the pattern and the sub-types will have a more specialized 
relationship to each other.  Since it's bedtime, I'll just make 
up some bogus but suggestive examples on the fly:  View and Controller, 
Component and ComponentPeer (oops, how did that get in there?! :-) 
PaymentMechanism and Payment, CompilerBackend and Encoder, 
MeasurementReporter and NumberWithUnits.
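
As a sketch of what separate facets buy for a parameterized Cell, here is one way to write it with Java generics.  Java expresses the variance at the use site with wildcards rather than on the interface declaration, so this is an approximation of the idea; the generic CellReader<T> and CellWriter<T> interfaces, the Cell<T> class, and FacetVariance are all invented for this illustration.

```java
class A {}
class B extends A {}

// Read facet: only produces T values.
interface CellReader<T> {
    T getValue();
}

// Write facet: only consumes T values.
interface CellWriter<T> {
    void setValue(T newValue);
}

// A hypothetical parameterized cell that hands out the two facets separately.
final class Cell<T> {
    private T value;

    Cell(T initial) {
        value = initial;
    }

    CellReader<T> reader() {
        return () -> value;
    }

    CellWriter<T> writer() {
        return v -> value = v;
    }
}

final class FacetVariance {
    // Covariant use: any reader of A or of a subtype of A will do.
    static A read(CellReader<? extends A> r) {
        return r.getValue();
    }

    // Contravariant use: any writer of B or of a supertype of B will do.
    static void writeB(CellWriter<? super B> w) {
        w.setValue(new B());
    }

    public static void main(String[] args) {
        Cell<B> cellOfB = new Cell<>(new B());
        Cell<A> cellOfA = new Cell<>(new A());

        A a = read(cellOfB.reader());   // a reader of B serves as a reader of A
        writeB(cellOfA.writer());       // a writer of A serves as a writer of B

        System.out.println(a instanceof B);                            // true
        System.out.println(cellOfA.reader().getValue() instanceof B);  // true

        // Passing cellOfB.writer() where a writer of A is required does not
        // compile, so the array problem is rejected statically here.
    }
}
```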

There are other arguments I could make here if these are insufficiently 
intimidating^H^H^H^H^H^H^H^H^H persuasive.  Well, OK, here's one more:

As a result of new requirements (for logging change timestamps or 
collapsing redundant updates or something), we extend the system with a new 
interface, TimedCellReader:

      public interface TimedCellReader extends CellReader {
          long lastChanged();
      }

that will return the timestamp of the last change (we postulate that this 
method would not be a sensible enhancement to most of the existing uses of 
CellReader, of course).  Since I could watch the clock while polling, the 
designer considers this to not be a distinct authority, so it is an allowed 
extension of the Read facet.  Do ya see where I'm going here?  :-)

DeansCellEditor is an implementation of the CellEditor interface that 
returns an instance of TimedCellReader for the "view" method.  The new 
clients of CellEditor can now test whether they received a TimedCellReader, 
and if so, ask for the timestamp.  Oh but wait, read-only return values are 
covariant, so DeansCellEditor can declare the "view" method to return a 
TimedCellReader, and that is completely type and contract compatible with 
all existing clients of the Cell contract.  The result is that a holder of 
a DeansCellEditor can know by construction that they will always receive a 
TimedCellReader, and so can just ask for the timestamp (And yes all this 
can be type checked in an appropriate language, or programmed with low risk 
in a latently-typed language like E).
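
In Java this falls out of covariant return types.  The sketch below assumes the DT interfaces with "view", and it injects the clock as a LongSupplier purely so the example is deterministic; that parameter and the OldAndNewClients class are inventions for illustration, not part of the proposal.

```java
import java.util.function.LongSupplier;

interface CellReader {
    Object getValue();
}

interface CellEditor {
    CellReader view();
    void setValue(Object newValue);
}

interface TimedCellReader extends CellReader {
    long lastChanged();
}

final class DeansCellEditor implements CellEditor {
    private final LongSupplier clock; // assumption: injected time source
    private Object value;
    private long changedAt;

    DeansCellEditor(Object initial, LongSupplier clock) {
        this.clock = clock;
        this.value = initial;
        this.changedAt = clock.getAsLong();
    }

    // Covariant return type: still a valid implementation of CellEditor.view().
    @Override
    public TimedCellReader view() {
        return new TimedCellReader() {
            @Override public Object getValue() { return value; }
            @Override public long lastChanged() { return changedAt; }
        };
    }

    @Override
    public void setValue(Object newValue) {
        value = newValue;
        changedAt = clock.getAsLong();
    }
}

final class OldAndNewClients {
    // Existing client: knows only the original Cell contract.
    static Object oldClient(CellEditor ce) {
        return ce.view().getValue();
    }

    // New client: holds a DeansCellEditor, so no instanceof test or cast is needed.
    static long newClient(DeansCellEditor ce) {
        return ce.view().lastChanged();
    }

    public static void main(String[] args) {
        DeansCellEditor ce = new DeansCellEditor("x", System::currentTimeMillis);
        ce.setValue("y");
        System.out.println(oldClient(ce)); // prints "y"
        System.out.println(newClient(ce)); // timestamp of the setValue call
    }
}
```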

How would one do that with the M proposal?  All the proposals that I can 
come up with are obviously bad, but it's late so I won't presume that means 
there aren't any good ones :-)  Note that there are several other protocol 
extensions that similarly fall out naturally in the DT protocol.

Vaguely relevant example here: In Java 1.0.2, "controllers" were 
implemented as subclasses of specific Components, and it actually appeared 
relatively nice and convenient and pleasing.  In all later Java versions, 
this abuse of inheritance was replaced with composition because of the subtle 
but absolute ways in which it limited extensibility and reuse.

As a final comment on the general topic, it's strange to me that most 
everyone nowadays agrees that read streams should be a different type and 
different objects from write streams, but they don't generalize :-)

>>A subclass that extends a superclass with more authority violates this
>>rule (or rather, its instances do).  Note that a difference in function
>>(potentially including additional operations) is not necessarily a
>>difference in authority.  BTW: I do think that this particular problem is
>>sufficiently subtle that it deserves its own principle.
>
>Dean & I just had a long phone conversation.  I believe we agreed that the
>above paragraph conflates two issues.  The above example *does* obey

The statement conflates the two issues, though the intent of the statement did 
not.  The intent of the statement was better represented by MarkM's rephrasing:

>reaching for a rule like Capability Design Rule #X:  All distinctions in
>static authority contracts are represented by different interface types.
>Rule #X is a static analog of Rule #1, but they are quite different.

Yup.

>>I know of no principle that says extending interfaces is a priori good,
>>only that it is possible.  Do you?
>
>I prefer empirical to a priori arguments anyway, so I refer again to the
>above Cell protocol designs.  The principle is expressive simplicity.

(Said with appropriate taunting twinkle...)  "Pleasing expressions of 
subtle but far-reaching bugs are not exactly what I had in mind :-)"

"As simple as possible, but no simpler"  Collapsing authority distinctions 
into a single type abstracts away the essential expression of programmer 
intent under the guise of eliminating one-time keystrokes, and interferes 
with the separate extension and evolution of the different facets of the 
interface.  These are strictly O-O issues that only incidentally have 
potentially disastrous consequences for security :->.

>The DT protocol follows this description exactly.  We see it has a cost.

Presumably I've addressed this above.

>And the protocols of the primitive capabilities exported from the Kernel.
>I'm very curious how KeyKOS and EROS people will react to the proposal that
>their protocols be redesigned in this manner.  Guys, please speak up!

It should be the case that the transformations are straightforward.

Ironically, as I recall, the general issue was first identified as an 
*object design* issue that we then realized had security implications, not 
the other way around.

Enjoy!
dean