[e-lang] An object-capability subset of Python

Mark Seaborn mrs at mythic-beasts.com
Fri Aug 15 14:00:18 CDT 2008


"Ben Laurie" <benl at google.com> wrote:

> On Mon, Aug 11, 2008 at 9:31 PM, Mark Seaborn <mrs at mythic-beasts.com> wrote:

> > Although Python does not provide encapsulation, there is a widely-used
> > naming convention for private attributes of objects: their names start
> > with an underscore.  CapPython proposes to enforce this convention by
> > only allowing private attributes to be accessed through the "self"
> > variable that is bound inside methods.  There are more details here:
> > http://lackingrhoticity.blogspot.com/2008/08/introducing-cappython.html
> > along with some less readable notes here:
> > http://plash.beasts.org/wiki/CapPython
> 
> I tried to do this years ago. It wasn't well received by the Python
> developers :-)

Yes, I've been reading the past discussions on python-dev.  From what
I have read so far, the previous attempts at making Python safer have
involved some combination of

 - wrapping objects, e.g. the Bastion wrapper from old Python versions
 - modifying the interpreter (I'll include rexec in this)
 - using Cajita-style objects built from multiple closures
 - wrapping all uses of getattr/setattr (whether via the getattr/setattr
   functions or via the "." syntax), e.g. Zope's RestrictedPython

I'd be interested to see if someone has proposed CapPython's scheme
before.

> Yes, I think you need to take a CaPerl-like approach (and the name
> should be CaPython :-).

Maybe when it improves it can drop the excess "p" and be renamed to
CaPython, and perhaps further down the line it can drop the prefix and
be renamed to Python 4.0. :-)

> >  Python's variable binding
> > semantics are quite complex, so there would be loopholes if the
> > verifier's interpretation of the program did not match Python's
> > interpretation.  A rewriter can perform variable renaming which could
> > give us more confidence in the result.
> 
> Not sure that mere renaming is going to suffice.

I'm not suggesting that renaming variables is strictly necessary to
enforce anything.  The issue is that some constructs, in particular
list comprehensions and generators, have non-obvious variable binding
behaviour.

For example:

(lambda: ([self for self in [some_object]], self._foo))()[1]

This is actually equivalent to:
some_object._foo
(which should be rejected by the verifier)

They are equivalent because the list comprehension does not create a
new binding for "self"; it only assigns to it.  All three "self"s
therefore refer to the same variable, which implicitly becomes a
local variable of the lambda.

According to the Python Reference Manual in
<http://docs.python.org/ref/lists.html>, "this behavior is deprecated,
and relying on it will not work once this bug is fixed in a future
release".

If this were fixed, the list comprehension would behave like a
generator expression, so the expression above would become equivalent to:

(lambda: (list(self for self in [some_object]), self._foo))()[1]

which is equivalent to:
self._foo
(which should be allowed by the verifier, assuming "self" is really
a self variable)
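
To make the "fixed" behaviour concrete, here is a small sketch that
assumes a Python version in which list comprehensions have their own
scope (Python 3 behaves this way).  The names Box, probe, outer and
inner are made up for illustration.

```python
class Box(object):
    def __init__(self, foo):
        self._foo = foo

def probe(self, some_object):
    # The comprehension's "self" lives in the comprehension's own scope,
    # so the "self" in "self._foo" is the parameter of probe(), which
    # the lambda captures as a free variable.
    return (lambda: ([self for self in [some_object]], self._foo))()[1]

outer = Box("outer")
inner = Box("inner")
result = probe(outer, inner)
print(result)
```

Under the old binding rules described above, the same expression would
read _foo from some_object instead, which is exactly the mismatch a
verifier has to guard against.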

It is not too difficult to determine what the variable binding rules
are, but there is always a slight risk that the verifier's version of
the rules does not match up with the Python implementation.  One way
to address that risk, if you want to be extra cautious, is to do
variable renaming.  David Wagner suggested to me that the verifier
could prohibit variable shadowing.  That would work without requiring
rewriting.


> In CaPerl I did run-time checking.

I don't think it will be necessary to insert run-time checks in
CapPython.  CapPython uses the "same-object" meaning of "private" (as
in E) rather than the "same-class" meaning (as in Java).  This means
the ability to access private attributes can be determined
syntactically.
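
As a rough illustration of what "determined syntactically" could mean,
here is a minimal sketch using Python's standard ast module.  It is
not CapPython's actual verifier: the function name
private_access_violations and the Counter sample are hypothetical, and
the sketch assumes that the first parameter of a function defined
directly in a class body is the self variable.  It does not check that
the self variable is never rebound, and it skips annotations, so it is
far from a complete check.

```python
import ast

def private_access_violations(source):
    """Return (line, attribute) pairs for private accesses not made via self."""
    violations = []

    def visit(node, self_name):
        if isinstance(node, ast.ClassDef):
            # Base classes and decorators are evaluated outside any method.
            for part in node.bases + node.keywords + node.decorator_list:
                visit(part, None)
            for item in node.body:
                if isinstance(item, ast.FunctionDef) and item.args.args:
                    # Assumption: the first parameter of a method is its
                    # self variable.  Decorators and defaults get none.
                    for part in item.decorator_list + [item.args]:
                        visit(part, None)
                    for stmt in item.body:
                        visit(stmt, item.args.args[0].arg)
                else:
                    visit(item, None)
            return
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            # Conservative choice: nested functions get no self variable.
            self_name = None
        if isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            target = node.value
            if not (isinstance(target, ast.Name) and target.id == self_name):
                violations.append((node.lineno, node.attr))
        for child in ast.iter_child_nodes(node):
            visit(child, self_name)

    visit(ast.parse(source), None)
    return violations

SAMPLE = '''\
class Counter(object):
    def __init__(self):
        self._count = 0
    def merge(self, other):
        self._count += other._count
'''

print(private_access_violations(SAMPLE))
```

The check never looks at run-time values; it only asks whether the
expression to the left of the "." is the method's own self variable.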

CaPerl uses "same-class"/"same-module" private [1], so it needs to do
run-time checks.

This kind of code pattern will be rejected by CapPython because
"other" is not a self variable:

class C(object):
    ...
    def __eq__(self, other):
        return self._attr == other._attr

The way to do this will be to use some kind of explicit unsealer on
"other" to access _attr.

Regards,
Mark

[1] From http://www.links.org/pics/usenix-security2.pdf --
    "Can only look inside objects belonging to the same module"

