Distribute Parse Trees, Not Bytecode

Bill Frantz frantz@communities.com
Thu, 24 Jun 1999 12:14:09 -0700


At 12:32 AM 6/24/99 -0700, Ka-Ping Yee wrote:
>On Wed, 23 Jun 1999, Bill Frantz wrote:
>> >In my
>> >mind you want the first basic passes done like parsing, getting symbols into
>> >the table, but don't go as far as removing information needed to reconstruct
>> >the source.  
>> 
>> [#] If you separate the data needed to reconstruct the source from that
>> needed to run the program, then you can compress the download by only
>> sending the "run" data.
>
>[-] But i want to get source code given exactly the runnable form.
>If you present the code in two forms, one for running and one for
>reconstructing source, that defeats the purpose of signing the code
>in the first place, and you no longer get to inspect the code that
>you're about to run.

[-] How can you be sure that the part of the runnable that allows you to
reconstruct the source code matches the part that is executable?  It could
have been modified after compilation but before being signed.  I admit that
the space for social engineering here is small, because the object-code
part determines the behavior.  I don't think there is any difference
between post-compilation modification and compiling deceptive source, but
there might be a bit more space to sow confusion.

[-] There are two kinds of code inspection going on.  There is
"verification", which in Java is performed by the JVM and assures the
integrity of the type system.  Then there is "auditing", which is performed
by a human.  Even something as source-oriented as the GPL does not require
that the source always accompany the object.  It merely requires that the
source always be available.  This separation may be particularly valuable
for mobile code applications (e.g. Java Applets), where we do not expect a
human verifier to be at every node.

In our case, what we can do is:

* Compile the program.
* Compute a secure hash of the source-reconstruction part.
* Compute a secure hash of the representation of the AST.
* Sign both hashes saying that the signer has verified that the source-part
and the AST-part come from the same compilation.
* Distribute the hashes, signature, and URLs to retrieve the two parts.
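
The steps above could be sketched roughly as follows.  This is only an
illustration under stated assumptions: SHA-256 stands in for "a secure
hash", and an HMAC over a shared key stands in for a real public-key
signature (the Python standard library has no public-key signing).  The
names make_manifest and check_manifest, the manifest layout, and the URLs
are all hypothetical.

```python
import hashlib
import hmac
import json


def _statement(source_part: bytes, ast_part: bytes) -> dict:
    """Hash each part separately; the pair of hashes is what gets signed."""
    return {
        "source_hash": hashlib.sha256(source_part).hexdigest(),
        "ast_hash": hashlib.sha256(ast_part).hexdigest(),
    }


def _sign(statement: dict, signing_key: bytes) -> str:
    # Canonical encoding so signer and verifier sign identical bytes.
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    # ASSUMPTION: HMAC is a stand-in for a public-key signature.
    return hmac.new(signing_key, payload, hashlib.sha256).hexdigest()


def make_manifest(source_part: bytes, ast_part: bytes,
                  source_url: str, ast_url: str, signing_key: bytes) -> dict:
    """Build what gets distributed: hashes, signature, and the two URLs."""
    statement = _statement(source_part, ast_part)
    return {
        **statement,
        "signature": _sign(statement, signing_key),
        "source_url": source_url,
        "ast_url": ast_url,
    }


def check_manifest(manifest: dict, source_part: bytes, ast_part: bytes,
                   signing_key: bytes) -> bool:
    """Check that the retrieved parts match the hashes the signer vouched for."""
    statement = _statement(source_part, ast_part)
    return (
        manifest.get("source_hash") == statement["source_hash"]
        and manifest.get("ast_hash") == statement["ast_hash"]
        and hmac.compare_digest(_sign(statement, signing_key),
                                manifest.get("signature", ""))
    )
```

Because both hashes are covered by one signature, a node that only wants
to run the code can fetch just the AST part and check its hash, while an
auditor can fetch both parts and check that they are the pair the signer
vouched for.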

Anyone who wants to allow that code to run at some level of trust can
reconstruct the source, re-compile it, sign the result so that machines and
people who trust his signature will assign the code some level of trust,
and run the code.  Note that SPKI signatures may be the right model here as
they can instruct machines about special privileges to give the code (i.e.
capabilities).
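
As a rough illustration of that last point, an auditor's endorsement could
bind the hash of the runnable part to the privileges being granted.  This
is not SPKI and does not use its certificate format; it is a minimal sketch
in which an HMAC stands in for a public-key signature, the re-compilation
step is left out, and the names endorse and granted_capabilities and the
capability strings are hypothetical.

```python
import hashlib
import hmac
import json


def _sign(grant: dict, auditor_key: bytes) -> str:
    # Canonical encoding so that signing and checking cover identical bytes.
    payload = json.dumps(grant, sort_keys=True).encode("utf-8")
    # ASSUMPTION: HMAC is a stand-in for a public-key signature.
    return hmac.new(auditor_key, payload, hashlib.sha256).hexdigest()


def endorse(ast_part: bytes, capabilities: list, auditor_key: bytes) -> dict:
    """Sign a statement tying the runnable part to the privileges granted."""
    grant = {
        "ast_hash": hashlib.sha256(ast_part).hexdigest(),
        "capabilities": sorted(capabilities),
    }
    return {**grant, "signature": _sign(grant, auditor_key)}


def granted_capabilities(ast_part: bytes, endorsement: dict,
                         auditor_key: bytes) -> list:
    """Return the granted capabilities, or [] if the endorsement fails."""
    grant = {
        "ast_hash": endorsement.get("ast_hash", ""),
        "capabilities": endorsement.get("capabilities", []),
    }
    if not hmac.compare_digest(_sign(grant, auditor_key),
                               endorsement.get("signature", "")):
        return []  # not signed by an auditor this machine trusts
    if grant["ast_hash"] != hashlib.sha256(ast_part).hexdigest():
        return []  # endorsement is for different code
    return list(grant["capabilities"])
```

A machine that trusts the auditor would call granted_capabilities on the
code it is about to run and hand it only what comes back; code with no
valid endorsement gets nothing.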