[e-lang] Joe-E

David Wagner daw at cs.berkeley.edu
Fri Apr 30 20:43:01 PDT 2010


Rob Meijer wrote:
> What I suggested was that if you have 3 classes:
> 
> * String          ( Powerless)
> * PowerfullFoo    ( non powerless )
> * PowerlessBar    ( Explicitly Powerless)
> 
> Than for any collection class for example the priority queue you could
> transitively infer:
> 
> *  PriorityQueue<String>         ( Powerless )
> *  PriorityQueue<PowerfullFoo>   ( non powerless )
> *  PriorityQueue<PowerlessBar>   ( Powerless )
> 
> But if as you stated those containers like PriorityQueue aren't capability
> safe, such inference would not be possible.

OK, thanks for the explicit proposal.  This is a credible
enough alternative to be worth considering explicitly.  Let
me lay out the challenges I perceive in this approach (but feel
free to correct flaws or omissions in my analysis):

Challenge 1: In Java, you can't really trust the types inside the
angle brackets.  It's possible to have a variable whose declared type is
PriorityQueue<String>, but where at runtime that variable actually holds a
queue containing an element of type PowerfullFoo.  (This is known as "heap
pollution".)  It's true that you're likely to obtain a ClassCastException
sometime not long after extracting that element, if you try to store it
in another variable whose type is PowerfullFoo.  Still, heap pollution
is a pain: it complicates reasoning about security.  For instance, in
your system, if variable v has declared type PriorityQueue<String>, it
would not be safe to assume that the value of v is necessarily powerless.

Dealing with this is a headache.  I don't know if this is a showstopper
problem or not.  It might be something we could live with; I'm not sure.
I find it hard to reason about the consequences of this Java quirk
for Joe-E.

In principle, I suppose one could try to design a static analysis to
prove that the program never pollutes the heap, and require Joe-E code
to pass this static analysis, but that sounds non-trivial and hairy to me.

Challenge 2: Some of the Java collection classes expose non-determinism.
Their iteration order is non-deterministic: the order in which elements
are returned by an iterator may depend upon the value of the hashCode()
of the collection's elements.  (An exception is LinkedHashMap, which
does have deterministic iteration order and thus is recommended over an
ordinary HashMap.)  The Joe-E taming database does not allow Joe-E code
to invoke .hashCode() on objects where that might expose non-determinism
(e.g., because the object hashCode() is based upon the address of the
object).  However, the Java collection classes are not restricted by the
Joe-E taming database, and some of them freely call .hashCode on their
elements, even when this might yield non-deterministic behavior.

Dealing with this is another headache.  It feels simpler to me to
write our own collection classes that do not expose non-determinism.
I realize this issue may not be a showstopper; maybe there are cleverer
ways to conceal the non-determinism which haven't occurred to me.

Challenge 3: Writing static analyses is a more specialized skill than
writing libraries.  Anyone skilled at Java and Joe-E programming can
write or review a library class for the Joe-E library.  In contrast,
of the main Joe-E contributors, only Adrian knows the intricacies of
writing Eclipse code to implement a powerful static analysis (such as the
one you propose).  This effectively means that it would be difficult for
the rest of us to review the static analysis implementation.  Moreover,
writing sound static analyses is challenging and tricky.  These are
arguments for limiting the use of sophisticated static analyses.

Another way to put it is that writing a specialized static analysis
feels more expensive to me than writing a library class.  That said,
I realize this argument is fairly weak: if the static analysis yields
a better experience for users of Joe-E, it's better, and that's that.

> But given this major issue, and given the fact that a
> CapabilitySafePriorityQueue  would be needed anyway if anyone needed a
> priority queue, would it not still make more sense to look at the explicit
> tagging for generics, for example introducing a 'CapabilitySafeContainer'
> concept, for what these simple inference rules would make sense? That
> would seem like a lot less work than having to rewrite capability safe
> alternatives for all of the java ones?

Amusingly, to me writing a special static analysis for capability-safe
containiners sounds like more work than writing capability-safe
alternative container classes.  I don't know if my perception is
correlated to reality or not.


A third alternative is to add ImmutableHolder and PowerlessHolder classes
to the Joe-E library, as Steven suggested.  That'd be a reasonable step.
I don't know of any reason not to do that.  It's something we'd talked
about before in the past.  I think this is just a case of "we never
got around to it" and "hadn't yet run into a case that required it",
rather than any fundamental reason not to do it.  It's far from an ideal
solution, and it doesn't fully address Steven's issue, but it'd be a
partial workaround.


More information about the e-lang mailing list