[E-Lang] Security Breach: Nominee for the Stock Exchange Prize

Tyler Close tclose@oilspace.com
Wed, 18 Apr 2001 19:37:19 +0100


At 08:46 AM 4/18/01 -0700, Marc Stiegler wrote:
> > >Btw, even though the motivation for exploring these issues was the
> > >current controversy over connection and timeout semantics, the above
> > >problem & fix seem to be equivalent in any of the current proposals.
> >
> > I disagree. In a system where ERiaSR, the user's bidding agent, upon being
> > revived, would have resent the original bid request, not issued a
> > completely new bid request. The receiving Vat guarantees that a message is
> > processed at most once (possibly using the scheme I laid out in my
> > original proposal for always sturdy refs).
>
>It is not the bidder's agent, but rather the bidder's client

Sorry, I meant to be referring to the "bidder's client".

>, that has to
>resend the message. So your assertion assumes that the bidder client is
>persistent and revived and that this is the one true client always used to
>make bids.

No, my assertion assumes that the return value reference can survive the 
lost connection and be used to send subsequent requests. I'll explain more 
further down.

>With this assumption, the breach cannot occur even with liverefs
>for reasons identified in the beginning of this thread: when the bidder's
>persistent client vat reconnects, the old connections are broken, the old
>agent is dead, and the delayed play of the old message is discarded.

Liverefs try to fake the "surviving reference" behaviour by magically 
killing the old proxy objects. Unfortunately, they can't make this illusion 
complete, so some clean-up is left to the user software, or to the user. 
The fact that such clean-up is not possible when the user changes client 
Vats is what caused the current security breach. (Again, more further down.)


>One of David Wagner's questions which led me to identify this breach was,
>what if the bidder decides to use his neighbor's computer when his own
>coughs and says it has lost its connection? Our bidder would presumably just
>put a copy of his capability onto the alternate machine, not a complete copy
>of his one true client (certainly, in a capability system, this seems like a
>reasonable notion :-) (once again, if he does copy the one true client,
>neither ERiaSR nor liveref architectures fail). The bidder proceeds down the
>same erroneous path with the same result. The solution remains the same: I
>have to kill the old agent in an outside-of-infrastructure revocation.

The issue is what capability is "pasted", how the user gets this 
capability, and what that capability means.

You are assuming that the pasted capability is an E SturdyRef for the 
"market account" that is immutable for its whole life and is not bound by 
any message timelines. This ability to break out of the message timeline is 
what caused the security breach. You pasted the E SturdyRef into the other 
Vat and started a new timeline that was unconnected to the timeline you 
just left. It was this act that caused "normal" assumptions about order to 
break down.

If, on the other hand, all references were sturdy references, and there was 
no such thing as a "proxy" object, then you would have switched Vats by 
forking the return value reference from your "bid" message (perhaps a 
reference to a Bid object). You would have then pasted this forked 
reference into the new Vat and started using it. The Market Vat, upon 
receiving your first message on this reference from the new Vat would find 
one of two cases: the reference exists, or the reference does not yet 
exist. If the reference exists, then the message from the old Vat did get 
through, and everything proceeds as if nothing had gone wrong. If the 
reference does not yet exist, then the message from the old Vat has not yet 
been delivered to the Market Vat. In the latter case, the Market Vat will 
round-trip to the old Vat and try the reference again. If the reference 
still does not exist, then it is resolved to a smashed reference. The 
smashed message is returned to the user in the new Vat. Upon receiving the 
smashed message, the user knows for sure that the first "bid" did not, and 
will not, go through. It is now safe to send a new "bid" message to the 
"bidding agent". If, at some later time, the "bid" message from the old Vat 
does arrive, then the Market Vat will find that the return value reference 
has already been resolved (to the smashed reference) and will return the 
smashed message rather than process the "bid" message.
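
The recovery logic above can be sketched roughly as follows. This is only 
an illustration in Python, not E code and not an actual implementation of 
the proposal: the names MarketVat, deliver_bid, use_reference and SMASHED 
are all hypothetical, and the "round trip to the old Vat" is modelled as a 
plain callback that may or may not flush a message still in flight.

```python
SMASHED = "smashed"  # hypothetical marker for a smashed reference


class MarketVat:
    """Toy model of a Vat that resolves each return-value reference once."""

    def __init__(self):
        self.resolved = {}  # return-value reference id -> Bid record or SMASHED
        self.bids = []      # amounts of the bids that were actually processed

    def deliver_bid(self, ref_id, amount):
        """Deliver a "bid" message whose return value is named by ref_id."""
        if ref_id in self.resolved:
            # Already resolved: either processed earlier or smashed by a
            # probe from another Vat. Either way, do not process it again.
            return self.resolved[ref_id]
        self.bids.append(amount)
        self.resolved[ref_id] = {"bid": amount}
        return self.resolved[ref_id]

    def use_reference(self, ref_id, round_trip_to_old_vat):
        """Handle the first message from the new Vat on the forked reference."""
        if ref_id not in self.resolved:
            # The old "bid" has not arrived yet: round-trip to the old Vat,
            # which gives any message still in flight a chance to land.
            round_trip_to_old_vat()
        if ref_id not in self.resolved:
            # Still nothing: resolve to smashed so a late "bid" is refused.
            self.resolved[ref_id] = SMASHED
        return self.resolved[ref_id]


# Case 1: the old "bid" was lost. The new Vat learns that for certain.
market = MarketVat()
print(market.use_reference("r1", lambda: None))  # the smashed marker
print(market.deliver_bid("r1", 100))             # late arrival is refused
print(market.bids)                               # nothing was processed

# Case 2: the old "bid" was merely delayed and lands during the round trip.
market2 = MarketVat()
print(market2.use_reference("r2", lambda: market2.deliver_bid("r2", 250)))
```

In both cases the new Vat ends up knowing whether the first "bid" went 
through, which is the property the LiveRef model cannot offer here.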

E's LiveRef model does not permit this sort of recovery, since the return 
value reference from the "bid" message is a LiveRef that died with the 
originating connection, and whose state cannot be recovered. It is this 
simple fact that is at the root of the current exploit. The number of 
outstanding bids on the market, etc., are all extraneous details.

Note that in an ideal world, the user would not "paste" just the Bid object 
cap (i.e. the return value), but would paste a "recovery message" generated 
by the client Vat. In this case, the "recovery message" would be the 
original "bid" message on the "bidding agent" reference. Doing this would 
mean that the user would never be exposed to the possibility of a lost 
message. Essentially, the new Vat would follow exactly the same logic that 
a revived old Vat would.
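
A minimal sketch of why resending the original message is safe, assuming 
the receiving Vat processes each message at most once. The Receiver class 
and the message layout (an "id" plus a "body") are hypothetical and only 
stand in for whatever identity the real message would carry.

```python
class Receiver:
    """Toy receiving Vat that processes each message id at most once."""

    def __init__(self):
        self.answers = {}    # message id -> answer given the first time
        self.processed = []  # bodies that were actually acted upon

    def deliver(self, message):
        msg_id = message["id"]
        if msg_id not in self.answers:
            self.processed.append(message["body"])
            self.answers[msg_id] = "accepted: " + message["body"]
        # A repeat of the same message gets the original answer back.
        return self.answers[msg_id]


# The "recovery message" is the original "bid" message, identity included.
recovery_message = {"id": "bid-1", "body": "bid 100"}

receiver = Receiver()
receiver.deliver(recovery_message)        # sent from the old Vat
receiver.deliver(dict(recovery_message))  # pasted into and resent by the new Vat
print(receiver.processed)                 # the bid was acted upon only once
```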


>If I understand the ERiaSR proposal correctly (which I doubt, actually, I
>sense nuances to the proposal I have not grokked properly), ERiaSR is
>easier, more natural, and does not correctly model the real world, which
>means I would spend more time trying to work around the incorrect model.

The assertion that the ERiaSR model "does not correctly model the real 
world" is a silly and unsubstantiated claim. The ERiaSR model actually does 
a much better job of representing message timelines. The LiveRef model is 
an incomplete and broken representation of the message timeline that was 
motivated solely by ill-conceived ideas about the implementation 
implications. In no way does the LiveRef model provide more information 
about the state of the world than the ERiaSR model does. In fact, the 
LiveRef model provides less information, in crucially important areas, as 
you have just discovered.

(I get very annoyed when people try to argue using speculation about the 
"real world". It somehow feels like they're saying I lack the competence to 
be presenting my side of the case. So, hopefully, you'll pardon some 
excessive rhetoric on my part ;)


>For example:
>
>I have lately been revamping eDesk to work with 089t, so I have
>re-familiarized myself with that code. One of the clever things it does is,
>if an eDesk loses a connection to a file-server vat, it closes all the
>windows on all the files and directories on that vat.

This sort of functionality is easy to implement with an application-level 
timeout. That it is somewhat easier to achieve in the LiveRef model does 
not make up for the significant failures of the LiveRef model when 
implementing smart contracts, or other software that requires a complete 
model of the message timeline in order to fulfill its contract.
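
As a rough sketch of what I mean by an application-level timeout (Python 
for illustration only; this is not eDesk code, and WindowManager and its 
methods are hypothetical names): the application records when it last heard 
from each file-server vat and closes that vat's windows once the silence 
exceeds a limit it chooses itself.

```python
class WindowManager:
    """Toy desktop that closes a vat's windows after a period of silence."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_reply = {}  # vat name -> time of the last reply seen
        self.windows = {}     # vat name -> titles of open windows

    def open_window(self, vat, title, now):
        self.windows.setdefault(vat, []).append(title)
        self.last_reply.setdefault(vat, now)

    def reply_received(self, vat, now):
        self.last_reply[vat] = now

    def close_stale(self, now):
        """Close and return the windows of every vat silent for too long."""
        closed = []
        for vat, last in list(self.last_reply.items()):
            if now - last > self.timeout:
                closed.extend(self.windows.pop(vat, []))
                del self.last_reply[vat]
        return closed


# Times are passed in explicitly so the policy is easy to follow and test.
desk = WindowManager(timeout_seconds=30)
desk.open_window("file-server", "notes.txt", now=0)
desk.reply_received("file-server", now=10)
print(desk.close_stale(now=35))  # still within the limit, nothing closes
print(desk.close_stale(now=41))  # silent for more than 30s, window closes
```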

Tyler