[E-Lang] Hydro & E
Mark S. Miller
markm@caplet.com
Tue, 20 Mar 2001 10:51:47 -0800
Security problems with E's current collections
A while ago, Tyler successfully argued on the e-lang list that E's current
collection classes ( http://www.erights.org/elang/collect/tables.html ,
http://www.erights.org/javadoc/org/erights/e/elib/tables/package-tree.html )
had at least the following two security flaws, both instances of the general
flaw "Don't make polymorphic subtypes that add authority.":
1) By having both mutable (FlexList, FlexMap) & immutable (ConstList, ConstMap)
collections share common protocol (as defined in their supertypes EList,
EMap), E makes it too easy to leak mutation authority.
A piece of code that holds a mutable collection, and need only pass an
immutable snapshot() to another, may all too easily pass the mutable
collection itself by accident. While we can't prevent authority leakage bugs in
general, this kind of leakage is especially bad because the bug is not
easily catchable either with static or dynamic checks. This would be true
even with a conventional static type checker.
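To make flaw 1 concrete, here is a minimal Java sketch. The names EList, FlexList, ConstList and snapshot() come from the text above, but the classes below are hypothetical stand-ins rather than the real ELib classes, and push() and tryToMutate() are invented for illustration. Java's instanceof cast plays the role that plain dynamic dispatch would play in E.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for E's EList / FlexList / ConstList; NOT the real ELib API.
final class MutationLeakDemo {

    /** Shared read-only protocol, playing the role of the EList supertype. */
    interface EList {
        int size();
        Object get(int index);
        EList snapshot();
    }

    /** Immutable list: holding one conveys no authority to mutate. */
    static final class ConstList implements EList {
        private final List<Object> items;
        ConstList(List<Object> items) { this.items = new ArrayList<>(items); }
        public int size() { return items.size(); }
        public Object get(int index) { return items.get(index); }
        public EList snapshot() { return this; }
    }

    /** Mutable list: a polymorphic subtype that adds mutation authority. */
    static final class FlexList implements EList {
        private final List<Object> items = new ArrayList<>();
        void push(Object item) { items.add(item); }  // hypothetical mutator
        public int size() { return items.size(); }
        public Object get(int index) { return items.get(index); }
        public EList snapshot() { return new ConstList(items); }
    }

    /** A recipient that was only ever meant to read the list. */
    static boolean tryToMutate(EList list) {
        if (list instanceof FlexList) {
            ((FlexList) list).push("injected");
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        FlexList mine = new FlexList();
        mine.push("a");

        // Intended call: hand out an immutable snapshot.
        System.out.println(tryToMutate(mine.snapshot())); // false
        // The accident: this also type-checks, since FlexList is an EList.
        System.out.println(tryToMutate(mine));            // true
        System.out.println(mine.size());                  // 2
    }
}
```

Both calls are well typed against EList, which is why neither a static checker nor a dynamic check at the call site notices the slip.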
2) By using the keys of EMaps by convention as sets, and by representing sets
by convention as mappings from the set elements to null, E makes the
following accident too easy:
A piece of code C is holding on to a mapping M that it got from somewhere
else. C thinks of M as a set, and uses the set-oriented mapping operations
("and", "&" for intersection, "or", "|" for union, etc) to derive a new "set" S from
the old "set" M. However, M had values that provide authority, and these
operators actually derive new *mappings* from old mappings, where these new
mappings contain values from the old mappings. C may then innocently hand
out S to someone thinking only about the keys. E's current collections
encourage making an unwarranted and unchecked assumption that the values of
a "set" are null.
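Flaw 2 can be sketched the same way. The and() helper below is a hypothetical imitation of the set-oriented mapping operation described above, not the real EMap API. It assumes the result takes its values from the left operand, and fileCapability is just a placeholder for any authority-bearing value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; and() imitates the set-oriented mapping operation
// described in the text and is NOT the real EMap API.
final class SetLeakDemo {

    /** Intersection on keys; the result carries the values of the left operand. */
    static <K, V> Map<K, V> and(Map<K, V> left, Map<K, ?> right) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : left.entrySet()) {
            if (right.containsKey(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    /** Builds a conventional "set": every key maps to null. */
    static Map<String, Object> setOf(String... keys) {
        Map<String, Object> set = new LinkedHashMap<>();
        for (String key : keys) {
            set.put(key, null);
        }
        return set;
    }

    public static void main(String[] args) {
        Object fileCapability = new Object(); // stands in for an authority-bearing value

        // M arrives from elsewhere; C merely assumes its values are null.
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("alice", fileCapability);
        m.put("bob", null);

        // C derives a new "set" S by intersection.
        Map<String, Object> s = and(m, setOf("alice", "carol"));

        System.out.println(s.keySet());                       // [alice]
        System.out.println(s.get("alice") == fileCapability); // true
    }
}
```

Whoever receives S now also holds the value stored under "alice", although C only meant to share the key.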
Examining the Hydro alternative
Tyler also proposed his own Hydro 2.0 library
( http://www.waterken.com/Hydro/2.0/index.html ) as a replacement. Hydro,
like E, is covered by the Mozilla license, and Hydro, as part of Droplets,
is designed from the ground up
to support distributed persistent capability programming in a framework
that's partially inspired by E and is E-like in many regards. For some time
back then the list was actively discussing issues regarding fitting Hydro
into E. This discussion resulted in Tyler changing Hydro in several ways to
repair conflicts with E. (In particular, the adoption of E's notion of
irreflexive partial ordering.)
In order to productively continue the discussion using live experimentation,
I've bundled all of Hydro in with every E since E stl-0.8.9k. However,
since then I got distracted by other priorities and allowed myself to coast
on "the current collection classes are good enough", and decided to continue
with these rather than resume the discussion. Well, the current collection
classes aren't good enough: the security flaws Tyler points out make E too
accident prone -- it would be too easy to write authority leakage bugs that
would be too hard to find. E should be willing to compromise on almost
anything else in order to avoid compromising security.
Hydro is uncompromised in these dimensions. (I don't mean to imply that
it's compromised in others. I know of no such compromises.) If we do adopt
Hydro as E's new collections, and make the changes to both Hydro and E
required to integrate them, many useful synergies would result. Tyler would
like to use E as a scripting language. This integration would make this
both more sensible and more feasible.
E's greatest missing feature is persistence. While Tyler's scalable
persistence system, Lock, is proprietary and so can't be made part of E
itself, the interface to his persistence system, Acid, is under the Mozilla
license, and a
cheezy implementation of the contract defined by Acid should be adequate for
E's early persistence needs. Hydro's collections are also built to
integrate smoothly with the persistence framework defined by Acid, so an
E-Hydro integration would pave the way for an E-Acid-CheezyPersistence
integration.
However, the hard requirement is fixing the security flaws Tyler pointed
out, and this doesn't *necessarily* mean adopting Hydro into E. For
example, these flaws in the current collection classes could be fixed.
Hydro remains largely unexamined on this list, and until the various hard
integration issues are examined, we must remain skeptical that
Hydro is the answer. Hydro and E have had enough independent evolution that
we should expect any attempted integration to be a non-trivial affair.
There's been a bit of private correspondence about this lately between
Tyler, Dean, MarcS, and myself. I expect the cream of this correspondence
will be resent to the list soon so we'll all share context. From this
correspondence, I'd say there are five main integration issues:
These four from Dean (with explanations by me):
- E syntax for supporting immutable collections
(Primarily, what should "[]" and E's operators expand to? And what should
be the meaning of what they expand to? Most pressing for immutable-only
collections, what should be the syntactic shorthand for update: "map := map
with(key, value)", as that's the closest replacement we'll have for the
familiar "map[key] := value".)
- Hydro as an E library
(Syntax aside, how well does the Hydro API integrate with the rest of E?)
- Hydro as an ELibJava library
(How well does Hydro integrate with ELib, and how usable is it by the
ELib/Java programmer?)
- specific collection classes included with E
(Should E include all of Hydro as standard equipment, or only a subset?)
And finally, a theme through the discussion:
- How learnable is Hydro?
This came up repeatedly, but it's unclear how much of the issue here is the
design of Hydro itself, and how much is simply an artifact of Hydro having
used javadoc as its primary means of documentation. While
javadoc-umentation is great for reference, it's not suited for introductory
material. In light of these issues, Tyler has rewritten the first page
accessible at http://www.waterken.com/Hydro/2.0/doc/abridged-index.html to
serve as a better introduction.
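On the first of Dean's issues, the update shorthand, it may help to see what "map := map with(key, value)" amounts to for an immutable map. The ConstMap below is a hypothetical Java sketch, taken from neither Hydro nor ELib, and it copies naively where a real library would presumably share structure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable map; with() mirrors the "map with(key, value)" form
// in the text and is NOT taken from the Hydro or ELib APIs.
final class ConstMapDemo {

    static final class ConstMap<K, V> {
        private final Map<K, V> entries;

        private ConstMap(Map<K, V> entries) { this.entries = entries; }

        static <K, V> ConstMap<K, V> empty() {
            return new ConstMap<>(new HashMap<>());
        }

        /** Returns a new map with the extra association; the receiver is unchanged. */
        ConstMap<K, V> with(K key, V value) {
            Map<K, V> copy = new HashMap<>(entries); // naive full copy, for clarity
            copy.put(key, value);
            return new ConstMap<>(copy);
        }

        V get(K key) { return entries.get(key); }
        int size() { return entries.size(); }
    }

    public static void main(String[] args) {
        ConstMap<String, Integer> map = ConstMap.empty();
        ConstMap<String, Integer> before = map;

        // The spelled-out replacement for "map[key] := value":
        // rebind the variable rather than mutate the collection.
        map = map.with("x", 1);

        System.out.println(before.size()); // 0
        System.out.println(map.get("x"));  // 1
    }
}
```

Anyone still holding the old reference sees no change, so only the holder of the variable can "update" the map. That is the property any syntactic shorthand would need to preserve.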
Can we postpone deciding?
While we examine the overall integration question, we should also ask the
earlier question:
Can we define what E is, such that the collections may be provided by Hydro
or not, but the language is still the same? Language design and library
design should normally be kept separate, but primitive data types are
where they intersect. How much of E do we need to agree on in order for
Tyler and E to proceed separately before deciding on the Hydro-E
question, while still proceeding with compatible notions of E? If we
can't find satisfying answers here, should we fork E and hope to rejoin later?
Cheers,
--MarkM