[cap-talk] How desirable / feasible is a persistent OCAP language?

James A. Donald jamesd at echeque.com
Thu Jul 17 20:23:10 CDT 2008


Stiegler, Marc D wrote:
 > E's persistence is just plain hard to use. Someone who
 > is interested should go through the cap-talk and
 > e-lang archives to see where we left this discussion
 > when I finished my last piece of work in this area. I
 > think we had some ideas to make it better, but none to
 > make it good.
 >
 > I am tempted to say that Waterken's persistence is too
 > easy to use, although that is not quite right.
 > Waterken's orthogonal persistence runs quite behind
 > the scenes and is utterly unstoppable -- your stuff
 > gets persisted, and revives exactly where it left off,
 > it is like a bulldozer ... Or perhaps like a
 > Terminator. On a normal day this is really cool. But
 > on a bad day -- and bad days occur too often to just
 > write them off as anomalies -- this can be a serious
 > problem.  Recovering from an error can be so difficult
 > that you have to throw away the whole universe -- not
 > just one computer's worth, but a whole distributed
 > network's worth. One of the specific circumstances
 > that is most dreadworthy is the software upgrade,
 > which in a Waterken world is necessarily a live system
 > upgrade. Alan Karp and I recently started a pilot
 > program with half a dozen users using a Waterken
 > application we have under development. I have not
 > added a single significant feature since, and am
 > terrified of the need to do so.
 >
 > Truthfully, I do not know whether our application
 > would be any easier to upgrade if we were using a
 > nonpersistent language. Upgrading the software in a
 > distributed system when there is any persistent state
 > is just hard.

The reason Linux systems tend to run forever while
Windows systems have to be rebooted at frequent
intervals is that the Linux/Unix architecture encourages
one to continually spawn new processes and shut down old
ones, with each new process getting most of its state
from human-edited plain text files, and very little of
its state from existing processes.  So bad state can
seldom persist.
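A toy sketch of that pattern in Python (the file name,
its contents, and the service loop are invented for
illustration): a service that rebuilds all of its state
from a plain text file on every start, inheriting
nothing from its predecessor.

    import json

    def load_config(path="service.conf.json"):
        # All durable state lives in a human-editable file,
        # so a freshly spawned process starts clean; any bad
        # in-memory state died with the old process.
        with open(path) as f:
            return json.load(f)

    if __name__ == "__main__":
        config = load_config()
        # Stand-in for the real service loop: on a restart,
        # everything above simply runs again from the file.
        print("started with", config)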

Persistent state with mutual dependencies, persistent
state where it is *possible* for that state to be bad or
internally inconsistent, will give you problems.
Persistent distributed state with mutual consistency
requirements will give you bigger problems.

Long ago I wrote the networking OS for the Atari Lynx, a
handheld video game intended to support impromptu
networks - two or three kids gather around in a circle
and play together with shared machines.  I took over the
project after the previous architect had run into
enormous problems with the network nondeterministically
going into bad states.

I therefore re-architected the system so that there were
never any assumptions about shared or related state
between one machine and the next.  Wherever there needed
to be a relationship between the state in one machine
and the state in another, there was no assumption that
the system was in that state; rather, state transitions
were such that no matter what the initial state, the
network would randomly and nondeterministically tend
towards a state where relationships were mostly, most of
the time, what was desired.

Since determinism and synchrony could not be reliably
achieved, nondeterminism and asynchrony were built in at
every level, with many state transitions made on the
basis of a true random number generator choosing with
probabilities that depended on statistical estimates of
the state of the rest of the network.  Instead of trying
various possibilities systematically, a machine would
try them at random, thereby minimizing stored state.  I
went to considerable effort to ensure the numbers were
truly random, so as to avoid unplanned correlation
between the random decisions of one machine on the
network and the random decisions of another.

Since no machine could ever be sure about the state of
other machines, each machine diversified its portfolio,
like a cautious investor taking a bet on each possible
outcome.  A machine's beliefs about the state of the
network and the state of other machines on the network
were always statistical rather than certainties, and
past evidence of network state was continually
discounted as new evidence became available, so that
incorrect beliefs about network state always faded over
time.
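A minimal Python sketch of that style of belief-keeping
(the decay rate and all names here are my own invention,
not anything from the Lynx code):

    import random

    class PeerBelief:
        # A statistical belief about another machine's state.
        # Confidence decays every tick, so stale evidence
        # loses weight and incorrect beliefs fade rather than
        # persist.

        DECAY = 0.9  # per-tick confidence multiplier (assumed)

        def __init__(self):
            self.state = None      # best guess at peer's state
            self.confidence = 0.0  # 0.0 no idea, 1.0 just seen

        def observe(self, state):
            # Fresh evidence replaces the guess at full
            # confidence.
            self.state = state
            self.confidence = 1.0

        def tick(self):
            # Past evidence is continually discounted.
            self.confidence *= self.DECAY

        def should_probe(self, rng=random):
            # The less confident we are, the more likely we
            # are to re-check the peer.  The probe is
            # randomized, not scheduled, so probes from
            # different machines stay uncorrelated.
            return rng.random() > self.confidence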

So in answer to the question:  Persistent state hurts.
Persistent distributed state hurts a lot.  Don't do
that!  Anything truly persistent must live at one
authoritative source, with other nodes' beliefs about
this persistent data fading over time, so that at random
times they check with the authoritative source.  If you
arbitrarily change this persistent data at the
authoritative node, then over time the rest of the
network must eventually come to be consistent with the
changed data, rather than needing to change data
simultaneously and consistently at multiple nodes.
Further, the transitional period, during which other
nodes have erroneous beliefs about the state of this
persistent authoritative node, will last a finite time,
and if bad things happen, other nodes should recover
transparently and automatically by coming to
statistically doubt their beliefs about the present
state of the authoritative node, in accordance with
Bayesian inference, discounting their beliefs about
authoritative state until doubt leads them to check with
the authority.
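A rough Python sketch of that pattern: one node's fading
belief about state held at a single authoritative
source.  The half-life and the fetch callback are
assumptions for illustration, not a real API.

    import random
    import time

    class CachedBelief:
        # A node's belief about data held at one
        # authoritative source.  Confidence decays with age,
        # and the node re-reads the authority at random
        # moments whose probability rises as confidence
        # falls, so a change at the authority propagates
        # eventually with no simultaneous multi-node update.

        HALF_LIFE = 30.0  # seconds to half confidence (assumed)

        def __init__(self, fetch_from_authority):
            self.fetch = fetch_from_authority
            self.value = self.fetch()
            self.stamp = time.monotonic()

        def confidence(self):
            age = time.monotonic() - self.stamp
            return 0.5 ** (age / self.HALF_LIFE)

        def read(self, rng=random):
            # Doubt grows with age; with probability
            # (1 - confidence) consult the authority and
            # refresh the local belief.
            if rng.random() > self.confidence():
                self.value = self.fetch()
                self.stamp = time.monotonic()
            return self.value

A node would wrap each remote datum it cares about in
such a belief, so stale or erroneous views are
guaranteed to fade rather than persist.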

 > As I pointed out years ago in an earlier discussion of
 > persistence, one of the nice things about the
 > traditional relational dbms and its schema is, when
 > you need to do an upgrade, the schema is a clean
 > compact representation of the core of the upgrade
 > problem. Its format makes it easier to wrap your head
 > around all the issues at one time.

Assume you have many machines, subject to different
administrators.  They are all using the same schema, and
exchanging data from databases governed by this schema.
You change the schema - but not all machines will change
their schema, and those that do will not change their
schema at the same time.  So each interaction has to
involve beliefs about the version of the software that
the other machine is running, and a machine with new
software has to translate data in the course of
interacting with a machine running older software.
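A hypothetical Python sketch of that kind of
interaction, with invented field names and version
numbers: every record carries its schema version, and
the node running the newer schema translates in both
directions.

    # Hypothetical: schema v2 split v1's "name" field into
    # "first" and "last".

    def upgrade_v1_to_v2(record):
        first, _, last = record["name"].partition(" ")
        return {"first": first, "last": last, "version": 2}

    def downgrade_v2_to_v1(record):
        name = f"{record['first']} {record['last']}".strip()
        return {"name": name, "version": 1}

    def receive(record):
        # Accept either version; normalize to the local (v2)
        # schema, treating an untagged record as v1.
        if record.get("version", 1) == 1:
            record = upgrade_v1_to_v2(record)
        return record

    def send(record, peer_version):
        # Translate on the way out if we believe the peer is
        # still running the v1 schema.
        if peer_version == 1:
            return downgrade_v2_to_v1(record)
        return record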

Persistent state is bad!  Consider carefully how little
persistent state you can get away with.  Distributed
persistent state is so bad that you should never, never,
ever, ever use it, except in the form of swiftly fading
statistical estimates by one node about the state of
other nodes.

Therefore, easy, cool, engineer-friendly ways of
providing and managing persistent state are bad.  The
more hostile the environment is to doing things by means
of persistent state, the better.  Don't do that!

