[e-lang] Need help with program analysis for Transparent auditor (or: code zippers)

Kevin Reid kpreid at mac.com
Sun May 4 09:25:14 CDT 2008


Context: I'm trying to implement the Transparent auditor.  
(Transparent is the property that an object's __optUncall accurately  
portrays it; that performing the described call produces an  
equivalent object. This is the most complex auditor I've attempted  
yet, and I came up with an analysis technique to help, but it doesn't  
seem quite sufficient and I'm not sure which direction to take now.

This message was originally written as a blog post without E-specific  
terminology <http://kpreid.livejournal.com/9816.html>; I have  
modified it slightly for e-lang.

------
It occurs to me that there is a similarity between a zipper <http:// 
www.haskell.org/haskellwiki/Zipper> and the combination of an  
expression and an environment, as for evaluation. I've got a problem  
that seems almost solvable by use of this.

For example, let's start with the expression

   def x := 74
   x + 1
For those unfamiliar with E, the AST of this expression is:

   SeqExpr(DefineExpr(FinalPattern(NounExpr("x"), null), LiteralExpr 
(74)),
           CallExpr(NounExpr("x"), "add", LiteralExpr(1)))
If we move “down” zipperwise into the second subexpression of the  
sequence, then we get the expression “x + 1” with the context “def  
x := 74 before this in a SeqExpr”. Similarly, an interpreter uses an  
intermediate result of “x + 1”, “x is bound to 74”.

------
With a slightly extended context, I'm trying to use this express  
program analysis in an auditor. The zipper context now tracks a small  
amount of information about variables, as well as the enclosing  
expressions.

The first problem is to determine that all occurrences of a given  
variable are used in a particular fashion: e.g. for f, permitting “f 
(1)” but not “f("foo")” or “somethingElse(f)” (where the third case  
would permit f to be used arbitrarily since the auditor doesn't get  
to see somethingElse’s code). This is easily accomplished by walking  
the entire expression looking for occurrences of NounExpr("f"), and  
walking “up” from each of those occurrences to check that the  
enclosing expressions form a permitted usage.

The second problem is to determine that the variable in a particular  
position is a free variable, and that its guard is of a particular  
form. This can be done by extending the context objects to record  
what variable names are bound by any expression preceding the current  
location (that is not inside its own block); thus walking up the  
contexts to the top-level determines whether the variable is free.

The third problem, I'm not sure what to do about. I have a situation  
roughly like this:

   def foo {
     method a() {
       def bar {
         method b() { foo }
       }
     }
   }

   ObjectExpr("",
              FinalPattern(NounExpr("foo"), null),
              [],
              EScript(
     EMethod("", "b", [], null,
             ObjectExpr(FinalPattern(NounExpr("bar"), null),
                        EScript(
               EMethod("", "b", [], null,
                       NounExpr("foo")))))))
As part of the first problem, I have found the object “bar”; I then  
walk upward to determine that it is generated by a method of the  
object “foo”. (There are additional constraints on the structure that  
I haven't mentioned.) I need to determine that the reference to “foo”  
in “b” is in fact to the outer object, and not to some other  
intervening lexical binding.

I've thought of two possible solutions so far:

1. Starting from EMethod(..., "a", ...), scan its children for  
occurrences of NounExpr("foo") and reject any which bind it. This  
would reject more than it needs to.

2. Switch to a pre-processing stage (instead of the incremental  
operation of a zipper) to assign an identity (or a mutable analysis- 
information field, equivalently) to each variable binding and all of  
its uses, and add upward references (like in the zipper) to each  
node; this would make the is-this-foo-that-foo test a simple comparison.
At the moment, the second option seems attractive; tying uses to  
definitions of variables in a generic fashion should be useful for  
other types of analysis. And since this is strictly analysis, not  
transformation, I don't need the modify-without-mutation function of  
a zipper. The only reason I haven't done that yet is I think zippers  
are neat (and there might be other uses for an established zipper  
over E ASTs).

What do you think I should do?

--
Kevin Reid                            <http://homepage.mac.com/kpreid/>




More information about the e-lang mailing list