One of the challenges with capability systems comes from storing them. If you allow objects containing capabilities to be written to disk, then you get into a bunch of update ordering issues. The general rule is that you want the object to get written before any of its capabilities get written, but in a capability system there can be circular write dependencies. This is one of the reasons that EROS uses transparent systemwide persistence.
However, persistence seems to generate confusion (in humans, often including me) about how things like transactions work. The other day somebody asked me to explain that to them and I botched it horribly. This note is an attempt to get it right.
I. TRANSACTIONS IN NON-PERSISTENT SYSTEMS
Before proceeding to a discussion of transactions in persistent systems, I want to draw attention to three things that all correct transaction clients must deal with:
1. Network failure
2. Commit agreement failure
3. No blind updates
I want to describe each in turn and what happens when they occur. Then I'll get to persistent systems.
1. Network Failure
Routers may go down at any time. Therefore, at any point where there is an interaction between transaction client and transaction server -- up to and including the commit -- either client or server may get the error "your network connection has died". Generally, connection keepalives are used in this situation to allow both sides to do an orderly timeout and independently abort the connection.
The point is that a correct transaction client must be programmed to allow for the possibility of a network failure.
2. Commit Agreement Failure
I'm sure there is a technical term for this, and I don't know what it is. The issue is that in any commit architecture, it is possible for a failure to occur in such a way that (a) the commit has succeeded, and (b) the client isn't told. Consider, for example, the following two-phase commit scenario:
    Client          Server          Phone Company

    Prepare
    to commit
                    Ready to
                    Commit
    Commit
                                    Cuts line with backhoe
                    Committed
    ????
Don't think this doesn't happen -- it's happened to me.
This is really just a special case of case (1). What is interesting about it is that it occurs *after* the commit. It is unknown to the client whether the commit succeeded or failed.
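To make the client's view of this scenario concrete, here is a minimal sketch. Everything in it (ToyServer, LinkDown, Outcome, client_commit) is a hypothetical name invented for illustration, and the "server" is just an in-memory object. The only point is that the client ends up in a third state, neither "committed" nor "aborted", even though the server's copy of the truth is unambiguous.

```python
import enum


class Outcome(enum.Enum):
    COMMITTED = "committed"
    UNKNOWN = "unknown"  # commit request was sent, no reply was seen


class LinkDown(Exception):
    """Stands in for 'your network connection has died'."""


class ToyServer:
    """Hypothetical in-memory server; not a real database API."""

    def __init__(self):
        self.committed = set()
        self.cut_line_after_commit = False  # the backhoe

    def commit(self, txn_id):
        self.committed.add(txn_id)  # the commit really happens on the server
        if self.cut_line_after_commit:
            raise LinkDown()  # the "Committed" reply never arrives
        return "Committed"


def client_commit(server, txn_id):
    try:
        server.commit(txn_id)
        return Outcome.COMMITTED
    except LinkDown:
        # The client cannot tell whether the line died before or after
        # the server committed, so it must not assume either one.
        return Outcome.UNKNOWN
```

With cut_line_after_commit set, client_commit() returns Outcome.UNKNOWN while the transaction id is present in server.committed, which is exactly the "????" row in the diagram.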
3. No Blind Updates
The following code logic is NOT correct:
    begin transaction
        check if some update has occurred
    end transaction
    ....
    begin transaction
        do update
    end transaction
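The problem is that someone else might do the update between the two transactions, so the answer you got from the check may already be stale by the time you act on it. The check and the update belong in the same transaction.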
The one case where blind update is arguably "safe" is the case where you are doing an initial data load of a dataset. Even there, humans forget and do the process more than once, and I'd argue that the first step of the transaction ought to be a check to see if the table already exists.
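For contrast with the incorrect logic above, here is a sketch of the check and the update done inside one transaction, using Python's standard sqlite3 module. The jobs table, its columns, and the claim_job() helper are assumptions made up for the example; the connection is assumed to have been opened with isolation_level=None so that the BEGIN/COMMIT statements are issued explicitly.

```python
import sqlite3


def claim_job(conn, job_id, worker):
    """Check and update in ONE transaction. Returns True if we claimed the job.

    Assumes a hypothetical table: jobs(id INTEGER PRIMARY KEY, owner TEXT).
    """
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before checking
    try:
        row = conn.execute(
            "SELECT owner FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        if row is None or row[0] is not None:
            # No such job, or somebody else already did the update.
            conn.execute("ROLLBACK")
            return False
        conn.execute(
            "UPDATE jobs SET owner = ? WHERE id = ?", (worker, job_id)
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```

Because the check and the update share a transaction, a second caller asking for the same job sees the first caller's update and gets False rather than overwriting it.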
II. PERSISTENCE AND TRANSACTIONS: THE SERVER PERSPECTIVE
From the server perspective, persistence doesn't introduce any complications, but rollback does. The problem is that commitments must not be undone if the machine crashes. The bad sequence is
    take checkpoint
    begin transaction A
    end transaction A (committing)
    crash
After the failure, the commit of transaction A must not be lost.
EROS solves this problem by building an exception into the checkpoint mechanism called journalling. This allows a database to say "this here page must come back, even if a failure occurs." When the database system does a commit, it first writes the necessary information into a write-ahead log, journals those pages, and then announces that the commit has occurred. On restart, any modifications in the write-ahead log are re-applied, so transactions are not lost by the server.
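The ordering described above (log first, journal the log, then announce) can be sketched as a toy in-memory model. This is not the EROS interface: ToyStore and its methods are hypothetical, the "journal" is just a list that the simulated rollback does not touch, and truncating the log at each checkpoint is a simplifying assumption of the sketch.

```python
import copy


class ToyStore:
    """Toy model of checkpoint + journalled write-ahead log (hypothetical)."""

    def __init__(self):
        self.state = {}       # ordinary persistent state; rolled back on crash
        self.checkpoint = {}  # image of `state` at the last checkpoint
        self.journal = []     # journalled log pages; these "must come back"

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)
        # Assumption: logged updates are now covered by the checkpoint.
        self.journal.clear()

    def commit(self, updates):
        self.journal.append(dict(updates))  # 1. write-ahead log, journalled
        self.state.update(updates)          # 2. apply to the live state
        return "committed"                  # 3. only now announce the commit

    def crash_and_restart(self):
        # Everything except the journal goes back to the checkpoint...
        self.state = copy.deepcopy(self.checkpoint)
        # ...and the logged commits are re-applied on top of it.
        for updates in self.journal:
            self.state.update(updates)
```

Running the bad sequence (take_checkpoint, commit, crash_and_restart) against this model leaves the committed update in state, because the replay step restores it from the journal.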
III. PERSISTENCE AND TRANSACTIONS: THE CLIENT PERSPECTIVE
I'll go through the cases in a minute, but the crucial thing to remember is that network connections are NOT included in the persistence contract. The upshot of this is that the client will see what appears to be a network failure when the system recovers from a checkpoint.
Here are the scenarios and consequences as seen by the client:

    Client                  System

                            Take Checkpoint A
                            (failures here are not seen
                             by the client)
    Open server
    connection
                            (failures after the connection
                             manifest as network failures)
    Begin transaction
                            Take Checkpoint B
                            (failures after ckpt B still
                             manifest as network failures)
    Commit
                            (rollback to Ckpt B will see
                             network failure, but commit
                             has definitely occurred. See
                             Note 1)
                            Take Checkpoint C
                            (failures from here will see
                             that the network failed after
                             the commit)
    Close Connection
Note 1: This case must already be handled by correct clients, per failures (1) and (2) described above.
The only remaining problem is the situation where a client has some sort of work queue of pending requests, and sits in a loop of the form
    while (more in work queue)
        grab next job
        begin transaction
            handle job
        end transaction
In this situation, items in the work queue might be processed more than once, if the checkpoint occurs as follows:
    while (more in work queue)
        grab next job
        TAKE_CHECKPOINT()
        begin transaction
            handle job
        end transaction
        SYSTEM_CRASH()
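There is, however, an easy fix to this, which is to *simulate* a network failure. What you do is build a transaction into the initial connection to the database, and have both sides count the successful commits. You then alter the begin_transaction() code to send the number of commits that the client believes have occurred in the current session. If the server does not agree, it immediately aborts the transaction. After a rollback to the checkpoint, the client's count is stale (lower than the server's), so the mismatch is caught before the job is handled a second time, and the client deals with the abort exactly as it would a network failure.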
Where the remote database does not directly support such a protocol, it can be synthesized on the client side using a database front end that is built on the logging mechanism.
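The commit-count check described above can be sketched as follows. This is a toy model under stated assumptions, not a real database protocol: ToyServer, ToyClient, SessionMismatch, and the checkpoint()/rollback() helpers are hypothetical names, the rollback is simulated by restoring a snapshot of the client's state only, and for brevity the snapshot is taken just before the job is grabbed rather than just after.

```python
class SessionMismatch(Exception):
    """Client and server disagree about commits in this session."""


class ToyServer:
    def __init__(self):
        self.commits = 0  # successful commits in the current session
        self.done = []    # jobs that were actually committed

    def begin_transaction(self, client_commits):
        if client_commits != self.commits:
            # Simulated network failure: abort before any work is done.
            raise SessionMismatch(
                f"client believes {client_commits}, server has {self.commits}"
            )

    def commit(self, job):
        self.done.append(job)
        self.commits += 1


class ToyClient:
    def __init__(self, server, queue):
        self.server = server
        self.queue = list(queue)
        self.commits = 0  # what the client believes has committed

    def checkpoint(self):
        # Only client state is captured; the server is outside the checkpoint.
        return (list(self.queue), self.commits)

    def rollback(self, snapshot):
        self.queue, self.commits = list(snapshot[0]), snapshot[1]

    def run_one(self):
        job = self.queue.pop(0)
        self.server.begin_transaction(self.commits)
        self.server.commit(job)
        self.commits += 1
```

If the client takes a checkpoint, runs one job, and is then rolled back, its next run_one() raises SessionMismatch instead of committing the same job twice. What the client does next is the same as for any network failure: reconnect, and check what actually happened before updating.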
Jonathan S. Shapiro, Ph. D.
IBM T.J. Watson Research Center
Email: shapj@us.ibm.com
Phone: +1 914 784 7085 (Tieline: 863)
Fax: +1 914 784 7595