[e-lang] [cap-talk] Initial draft Caja design doc

Jed Donnelley jed at nersc.gov
Wed Oct 17 19:02:34 EDT 2007


On 10/12/2007 5:22 PM, Mark Miller wrote:
> [Narrowing down to just e-lang]
<note about the mixup/delay in a separate message>

<OK, for this message I'm following that lead,
though as noted later I believe such a 'narrowing'
to be unwise.>

> Hi Jed, thanks for the comments! Mostly I reply simply by revising the
> text. Remaining issues below.
> 
> 
> On 10/12/07, Jed Donnelley <jed at nersc.gov> wrote:
>> I'll imagine passages -> content throughout the
>> rest of the paper and see if I notice any
>> problems. ... <I noticed a case where you use
>> the term 'content' somewhat separately, but
>> nothing that jumped out at me regarding a
>> real need for the 'passages' term.  I'd like
>> to hear a bit more justification for such a
>> new term.>
> 
> But "content" is a mass noun, like "water" whose singular refers to
> an indistinct quantity. I hate "passage" myself, but I would like a
> count noun whose singular refers to an individual unit. Please keep
> the suggestions coming. Did I mention that I *hate* "passage"?

I see your point about the "mass" noun aspect of the
word 'content.'  Considering your use of the term
'passage' in the spec, however, I don't consider that
a stopper.  What about 'page' or 'block' as alternatives?

The definitions I get for 'passage' look like:

way; route; course; section; piece

Only the last two seem to reasonably fit your usage
as I understand it.  Once one gets past the idea that
the 'way; route' definition doesn't apply, there is
still the lingering feeling that such a 'passage' is
just part of a something larger.  Is that what you
intend?  If so, what is that larger something?

Page or block can both be "content" that can be
referred to as individual elements without the
seeming requirement that they be part of some
larger whole.

Of course 'page' has come to be bound so tightly
to the Web (less so virtual memory elements) that
perhaps you might want to avoid that term - unless
the association could be considered positively without
confusion.  'Block' to me seems more generic.

"Chunk" is another possibility.  "Passage" has this
flow, route, direction, etc. connotation.  Is that
what you are hoping for?  I don't get any sense of
that in the spec, but perhaps if that is what you
are shooting for it might not be so bad as my
initial reaction would suggest.

>> 4.  When you first note the "powerbox" term I think
>> it's fine to have the references, but I think at least
>> a parenthetical definition/description is needed.
>> E.g. 'an interface that can allow a user to introduce
>> access to objects into a least authority execution
>> environment that doesn't yet have access to them.'
>>
>> 5.  Hmmm.  If I've been understanding your use of
>> the colored numbers and arrows, then I think there is
>> an error in their use in the paragraph where you
>> introduce the 'powerbox' term (page 3 third paragraph).
>> In the sentence there, "A web app, on detecting the
>> presence of a powerbox, could offer to edit a local
>> file chosen by the user 2 -> 6."  Shouldn't that
>> be 3 -> 5?  That is, didn't you argue earlier that
>> web apps don't have access to local objects?
> 
> Unless a browser extension makes a powerbox available to web apps.
> Unlike web apps, browser extensions do run with the user's full
> authority (at 2). A web app that detects that such a powerbox is
> present can, by asking the powerbox (to ask the user), obtain access
> to a local file (at 6).

Certainly.  The browser extension runs at 2, but the
web app that you refer to:

A web app, on detecting the presence of a powerbox, could
offer to edit a local file chosen by the user 2 -> 6

runs at 3 (or 1 - not sure of the distinction there at this
time).  I believe in your example it is the web app
that is going to transit from not having the authority
to access the file for editing to having the needed
authority.  That is, from inadequate authority to
least authority, 3 -> 5.  If it was the browser
extension that you were considering for the editing
work then transiting from 2 -> 6 (or 4 -> 5) might
be appropriate - though how you would achieve that
I'm not sure.

Since I'm on the topic of figure 1 and the numbering,
let me just ask: Why wouldn't 1 (excess authority) <->
2 (inadequate authority) and both with arrows to
3 (least authority) be an effective simplification of
that figure?  What do the splits between 2 and 4,
1 and 3, and 5 and 6 contribute that justify their
additional complexity?  If those distinctions are
indeed important, perhaps you should put some sort
of labels in the figure between those elements to
name the distinctions?

>> I'm a bit
>> puzzled by "DOM and other APIs" being included above.
> 
> The DOM is a substantial additional library provided by browsers. In
> order for Caja to enable safe active web content, we must specify a
> tamed form of this API. The present document deals only with the
> Javascript library speced in ES3, which doesn't include the DOM.

Would this 'tamed' DOM then include a standard interface
to a powerbox?  I was going to suggest that seems important
to me.  Perhaps this is just for future work?

>> 7.  When on page #3 you say, "To facilitate
>> development, it is easy to write a Caja program
>> so it can run correctly whether it is run as a Caja
>> program or run directly as an untranslated Javascript
>> program."
>>
>> are you noting this more as a transition mechanism?
> 
> Yes.
> 
> 
>> That is, that Caja programs can run with excess authority
>> when it is inappropriately given to them until the
>> execution environments are tightened?
> 
> Yes.
> 
>>  In that case
>> it might be worthwhile to note that such Caja programs
>> are not able to access that excess authority - as I
>> believe is the case if I'm understanding correctly.
> 
> No. When a Caja program is run as an untranslated Javascript program,
> it has the full authority that Javascript programs running in that
> environment have, whatever that is.

This is a point that I think might be worth exploring a
bit more.  I believe I understand your statement above:

"When a Caja program is run as an untranslated Javascript program,
it has the full authority that Javascript programs running in that
environment have"

However, isn't part of the value of Caja the fact that the
language itself doesn't provide the interfaces that allow
access to the additional authority that a straight
Javascript program would have in that environment?
That is, if the program truly is Caja (e.g. so that
it could properly be run in a Caja only environment)
then it will have no interfaces that would allow it
to exploit the additional authority that would be
available in the overly authorized environment that
was intended for the Javascript program whose place
it is taking.

> Once again, think of Caja as the "user mode" subset of the Javascript
> "instruction set". If you take a program that was allegedly written to
> run in user mode, but you run it in system mode instead, it might
> still work. But you should no longer assume it is running with reduced
> authority.

I understand this analogy and I believe it helps make
my point.  Perhaps we are just talking around the same
point from different perspectives?  To tie into the
above analogy, if a program running only the "user mode"
subset of an instruction set runs in system (monitor/kernel)
mode then it won't be able to exercise any of its
additional potential authority simply because it doesn't
have the instructions to do so (in the code).

Of course in the case of machine instructions, if the
program could modify its own code then it could
change some of its instructions and then execute them
to get at the additional authority provided by its
environment.  Presumably that isn't possible with
Caja.

For the compatibility case I imagine there would
be nothing to check that the code actually is
Caja and not more dangerous Javascript, but
perhaps some transition strategy might consider
such an approach - that is, run the script through
some sort of a safety checker and then if it passes
as Caja then run it in an overly authorized Javascript
environment, taking some solace in the hope that
because it is actually Caja it won't be able to
exercise the excess authority that it is 'given.'
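
A minimal sketch of the "check, then run" idea, purely as an illustration. The deny-list and the function names (findForbidden, runIfSafe) are assumptions made up for this example; this is not the Caja verifier or translator, and a word scan like this is far weaker than a real check would need to be.

```javascript
// Hypothetical illustration only: NOT the Caja verifier or translator.
// A toy "safety checker" that rejects scripts mentioning any name on a
// deny-list, then runs accepted scripts in the ordinary (overly
// authorized) Javascript environment.

// Assumed deny-list, chosen for illustration.
const FORBIDDEN = ["eval", "Function", "window", "document", "XMLHttpRequest"];

function findForbidden(source) {
  // Split the source into identifier-like words and report denied ones.
  // This also flags names inside strings and comments (conservative), and
  // it misses names built at run time, e.g. obj["con" + "structor"].
  const words = source.match(/[A-Za-z_$][A-Za-z0-9_$]*/g) || [];
  return FORBIDDEN.filter((name) => words.includes(name));
}

function runIfSafe(source) {
  const violations = findForbidden(source);
  if (violations.length > 0) {
    return { ok: false, violations };
  }
  // Accepted scripts run with whatever authority the host environment has.
  return { ok: true, value: (0, eval)(source) };
}

runIfSafe("1 + 2");            // accepted and evaluated
runIfSafe("document.cookie");  // rejected: mentions "document"
```

The bypass noted in the comments is the weak spot in this approach: the solace is only as good as the checker's ability to rule out every path to the excess authority.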

Am I missing something important here?

>> 7.  Regarding the OS/machine analogy - e.g. when you
>> say, "When a Caja object A invokes an object B
>> written directly in Javascript, the operations
>> provided by B serve the role of system calls."
>>
>> Doesn't this depend on the environment the Javascript
>> is running in?  I usually think of Javascript as
>> running in a browser sandbox.  In that case it seems
>> neither the Javascript object nor the Caja object
>> have access to any significant authority.
> 
> The Javascript does indeed (typically) not have any authority that
> would conventionally be considered interesting, such as the local file
> system. However, Javascript running on my gmail page has the authority
> to delete my gmail inbox.

An authority which it presumably gets from interacting
with the server side (i.e. "call home" as you suggest below)?

> The browser, by preventing the user from
> giving web apps local authority + preventing the user from being able
> to deny the web app the authority to call home, left the web apps no
> choice but to place their user's valuable assets on their site of
> origin.

Of course.  To me this fits into the "do no harm"
philosophy.  The server side has any access it needs to
the server side resources.  If you only allow the
client side "web app" supplied by the server but
running under the browser that same access, then you
have done no harm (i.e. provided no additional authority).
The web app is acting with the same authority as the
server but able to provide some additional interactivity.
If you aren't used to thinking in terms of least
authority, with object access passed as parameters,
it makes perfect sense.

The user has no way to distinguish sub domains from
the server's service.  From the user perspective what
is on the other side of an ssl connection may as
well be a fully integrated extra terrestrial
civilization - all intimately sharing access.
The fact that a web app running on behalf of that
service shares that intimacy should not be a surprising
model.

> This is the point of the (3)->(4) transition.

Vs., say, (1)->(2)?  That distinction is still lost on
me, so I thought I'd mention it again.  Are you suggesting
that (4) has less authority than (2) (if so what less?) and
that (3) has more authority than (1) (again if so, what?)
and even that (5) has less authority than (6) (what?)?

I really do think this discussion should be shared with
cap-talk where I expect you would get another level of
interaction more focused on the authority aspects of
Caja and the various environments.

> Regarding their
> ability to do what their user may do at this origin site, untranslated
> Javascript running in the browser sandbox on a web page has
> undiminished authority.

... on the server side I assume?  However, in what
sense is that really true?  Such a Javascript can
communicate with no filtering to the server side
to be sure.  However, does that indeed constitute
"undiminished authority"?  That is, if the gmail
server chose to require some out of band and
independently authenticated communication in order
to delete your mailbox, wouldn't you then say that
the Javascript running on gmail's behalf had
"diminished" authority?  Doesn't it in fact simply
have whatever authority the server grants it
through the available communication channel?
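
The question can be put as a small model. Everything here (makeMailServer, listInbox, deleteInbox, the confirmation code) is hypothetical and says nothing about how gmail or any real service works; it only shows a script whose authority is exactly what the server exposes over the channel.

```javascript
// Hypothetical model: makeMailServer and its methods are invented here and
// do not describe any real mail service.
function makeMailServer(outOfBandCode) {
  let inbox = ["msg1", "msg2"];
  return {
    // Operations the server chooses to expose over the channel.
    listInbox: () => inbox.slice(),
    deleteInbox(confirmationCode) {
      // The server demands a code delivered outside the script's channel.
      if (confirmationCode !== outOfBandCode) {
        return false;
      }
      inbox = [];
      return true;
    },
  };
}

// The page script holds only the channel, not the out-of-band code.
const server = makeMailServer("code-sent-by-other-means");
const attempt = server.deleteInbox("guess");
```

In this model the script can talk to the server without any filtering, yet it cannot delete the inbox unless the server decides to let it.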

Of course from the user's perspective that could
be anything.  That is, the user has no control over
the authority that the server is granting to the
running Javascript program.  Is that your point?

I don't believe that situation changes substantively
with Caja.  As soon as a Caja program has the authority
to communicate to the server then it may be given any
authority possessed by the server (communicating
conspirators).

>> You mention the powerbox concept, but are there
>> means available for injecting, say, access to
>> a single local file into a sandboxed Javascript
>> environment with any sort of a powerbox?
> 
> Yes. When you install a browser extension, you grant it all of your
> authority. It may then provide web apps whatever subset of this
> authority it wishes, such as the authority to edit an individual file.

This is where I was suggesting that an interface standard
would be useful.  Essentially a standard "open" call.
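
A rough sketch of what such an "open" call might look like. All names here (makePowerbox, requestFile, askUser) are invented for illustration and are not part of Caja, the draft, or any browser API; the file system is modeled as a plain Map so that the example is self-contained.

```javascript
// Hypothetical sketch: makePowerbox, requestFile and askUser are invented
// names, not part of Caja or any browser API.

// The trusted side (e.g. a browser extension) holds the whole file system,
// modeled here as a Map from path to contents.
function makePowerbox(fileSystem, askUser) {
  return Object.freeze({
    // The "open" call: the app asks, the user chooses, and the app gets
    // back a capability to that one file only.
    requestFile(purpose) {
      const path = askUser(purpose, Array.from(fileSystem.keys()));
      if (path === null || !fileSystem.has(path)) {
        return null; // user declined, or chose nothing valid
      }
      return Object.freeze({
        read: () => fileSystem.get(path),
        write: (text) => { fileSystem.set(path, String(text)); },
      });
    },
  });
}

// Usage: the user (simulated by a callback) picks notes.txt.
const files = new Map([["notes.txt", "hello"], ["secrets.txt", "s3cret"]]);
const powerbox = makePowerbox(files, (purpose, choices) => "notes.txt");
const file = powerbox.requestFile("edit a document");
file.write(file.read() + " world");
```

The web app receives only the frozen read/write pair for the chosen file. It never sees the Map, the path, or the other files, which is the 3 -> 5 style transition from inadequate authority to least authority.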

>>  If so that would seem to me to provide most of the
>> value that web app programmers would like to have
>> in Javascript - regardless of the finer grained
>> mutual suspicion that you hope to be able to
>> provide with Caja.
> 
> Both are valuable. Which one is "most" depends on what you value.

Hmmm.  I don't agree with the above statement and am
willing to debate it a bit in case you might find it
helpful.

From my perspective any support for fine grained mutual
suspicion that you can provide in a Caja environment
"only" provides for potentially greater reliability
of downloaded modules.  It doesn't provide the user
or the server any additional protection.  To me this
is a bit like breaking up a server (e.g. postfix)
into separate mutually suspicious processes.  The whole
conglomeration has a single trust boundary through
its communication AND a single trust boundary to
the local system (whatever "user" authority).  Any
internal mutual suspicion is in some sense "its"
business.  That is, if, for example, the code was proven
correct, then it wouldn't matter to anybody whether
it internally used mutually suspicious processes
or not.

On the other hand I argue that the least authority
environment provided to me by the Caja runtime
environment (e.g. coupled with a powerbox) and
the Caja language restrictions is qualitatively
more important to me as a user.  If I am going to
download some Caja code and execute it on my
machine (in a browser "sandbox" of some sort),
it is vitally important to me that it can't
get access to resources on my system without
my permission.  Whether there is some internal
separation in any such web app with mutual suspicion
between the separated pieces seems to me rather a
secondary concern.

Perhaps you were thinking of something else
when you suggested that both sorts of
'separation' may have comparable value in
different contexts?

>>  If not (I haven't heard of
>> such a Javascript powerbox mechanism) then I
>> don't see how you can get access to local objects
>> into your global Javascript environment so that
>> access can be safely communicated between Caja
>> objects.  If you are proposing to provide such
>> a mechanism, I didn't see it.
> 
> It won't be concretely proposed in this document. We should make that clear.

If I understand you above I think that is unfortunate.
What value is a language without an "open" statement
that allows it to safely interact with its
environment?

--Jed  http://www.webstart.com/jed/

P.S. Sorry not to see your message sooner MarkM!

