[e-lang] Alternative persistence system
Thomas Leonard
tal at it-innovation.soton.ac.uk
Mon Oct 12 05:58:17 EDT 2009
Summary
I've spent some time using the E persistence mechanism now, and I've
found it to be clever, elegant and very hard to use for my purposes. I
ended up writing my own replacement for the timeMachine and
makeSturdyRef objects. Below is a report on the problems I had and the
solution I ended up with. Has anyone else had similar issues?
Problems with the default system
My test case is a fairly large program with many different kinds of
objects, most of which need to be made persistent and exported as
SturdyRefs. The program is divided into many modules, and plugins may
extend it. Using the default E mechanism (makeSturdyRef and
timeMachine), I found these problems:
1. Scalability: The system does not scale well. When an object's state
changes, the entire object graph has to be written out, which takes time
proportional to the number of objects in the system (roughly 10 ms per
object on my machine), not to the number of changes. Fixing this would,
presumably, require making each persistent object's representation be
independent of that of all other persistent objects. Ideally, I'd like
to save after creating every persistent object, before giving the sturdy
ref to the user. I expect to be creating many such objects per second in
some cases.
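
To put rough numbers on point 1, here is a back-of-the-envelope sketch in Python (not E). It uses only the figure of roughly 10 ms per object quoted above; the object counts are illustrative assumptions, not measurements.

```python
# Rough cost model for whole-graph saves, using the ~10 ms/object figure
# quoted above. The object counts below are illustrative assumptions.
MS_PER_OBJECT = 10


def full_save_ms(total_objects):
    """Time to write the whole object graph once, in milliseconds."""
    return total_objects * MS_PER_OBJECT


def max_saves_per_second(total_objects):
    """Upper bound on how often we could save after each creation."""
    return 1000 / full_save_ms(total_objects)


for n in (10, 100, 1000):
    print(n, "objects:", full_save_ms(n), "ms per save,",
          max_saves_per_second(n), "saves per second at most")
```

With 1000 objects a single save takes about 10 seconds under this model, so saving after every creation is not compatible with creating many objects per second.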
2. Redundant information: an object's portrayal must include all of its
authority. If I have 1000 "job" objects, each with access to "timer",
then the saved file will include 1000 references to "timer". This could
be fixed by referring to the "parent" object (the one that originally
created this one) instead (e.g. revive using "parent.makeJob()"), but
this still results in 1000 references to the parent. Also, this solution
conflicts with (1), since we want to make the portrayals independent,
and it requires giving the object access to its parent, which may not be
desirable from a security point of view.
3. Difficulty upgrading: If I have a saved file containing "job" objects
without timers, and I now decide that job objects should have timers,
there is no easy way to add them later (at least, not without messing
around with the surgeon's exits). Also, if I give an object access to a
subdirectory, it persists with an absolute pathname and I can't restart
in a different directory.
4. Organisation: In my systems at least, the owner of the service/vat
should be able to see the state of the system and discover all objects.
Each object is owned by some parent object, which maintains a list of
its children. The E persistence system makes it easy to have objects
which are exported and persistent but which are not owned by any object.
You would have to catch and handle exceptions very carefully to ensure
that this couldn't happen. Also, if I give makeSturdyRef to an object, I
have no control over the objects it creates. I want to group objects so
that I know where they came from and can destroy the whole group at
once.
5. Safety: Without persistence, objects accept authority but don't
generally give it out (unless that's part of their function). Making an
object persistent can be done in two ways (__optUncall and
__optSealedDispatch). The first is easy but unsafe; the second is harder
but safer (though still with some issues, as mentioned previously). A
typical programmer, not too concerned with security, has a reasonable
chance of writing a fairly secure E object that doesn't expose more
authority than it should. However, they are very likely to take the easy
and less secure option of using __optUncall for persistence.
6. Too many code paths: A persistable object must implement three code
paths: create, save and revive. Most objects are not designed for
persistence and only support the first case. An object which depends on
an unpersistable object is also unpersistable. For example, if I call
makeObject(file.deepReadOnly()) then the resulting object cannot be
persisted, because read-only files are not themselves persistable. Also, the revive operation
must be made public so that the persistence system can call it. This may
be safe, but it is not good API design as other people may start using
it by mistake.
A possible solution
While perhaps not as elegant as E's system, my replacement works for my
largish use-case and solves the above problems (while preserving the
essential property that objects can't take advantage of the persistence
system to gain authority).
Persistent objects are arranged in a tree. When saved, each node
contains the Swiss base of the object and a method call on the parent
that would re-create the object.
Each node in the tree is actually three E objects:
- A "builder" object.
- A "persistNode" object (provided by the persistence system and holding
the Swiss base).
- A "public" object, created by the builder. Holders of the SturdyRef
can call methods on this object.
The root builder object is provided by the application. All other
builders are created by their parent builders. For example, a chat
server managing chat rooms might look like this:
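  def makeChatServer() {
      # Builder for the chat server (the root builder in this example).
      return def chatServer {
          # Called by the persistence system with this node's persistNode.
          to makePublic(persistNode) {
              return def chatServerPub {
                  to createChatRoom(name) {
                      require(validRoomName(name))
                      # Saves the call loadChatRoom(name) as the recipe for
                      # re-creating the child.
                      return persistNode.makeSturdyChild("loadChatRoom", [name])
                  }
              }
          }
          # Used both for first creation and for revival.
          to loadChatRoom(name) {
              return makeChatRoom(name)
          }
      }
  }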
(Imagine that this chat system is just a small sub-module of the main
application, without access to the surgeon, etc.)
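
For readers who want to experiment with the idea outside E, here is a minimal Python model of the tree described above. It is only a sketch under assumptions: the PersistNode class, the record layout, and the use of a random hex string in place of a real Swiss base are all hypothetical stand-ins, not my actual implementation.

```python
import secrets


class PersistNode:
    """Hypothetical stand-in for the persistNode described above."""

    def __init__(self, builder, call=None):
        self.builder = builder
        self.swiss_base = secrets.token_hex(16)  # placeholder for a Swiss base
        self.call = call        # (method, args) on the parent builder; None for root
        self.children = []
        self.public = None

    def make_sturdy_child(self, method, args):
        # Creation goes through the same builder method that revival will use.
        child_builder = getattr(self.builder, method)(*args)
        child = PersistNode(child_builder, call=(method, list(args)))
        child.public = child_builder.make_public(child)
        self.children.append(child)
        return child.swiss_base  # stands in for the SturdyRef

    def to_record(self):
        # Only the Swiss base and the re-creation call are saved.
        return {"swiss_base": self.swiss_base, "call": self.call,
                "children": [c.to_record() for c in self.children]}


class ChatRoom:
    def __init__(self, name):
        self.name = name

    def make_public(self, persist_node):
        return {"room": self.name}  # placeholder public facet


class ChatServer:
    def make_public(self, persist_node):
        class ChatServerPub:
            def create_chat_room(self, name):
                return persist_node.make_sturdy_child("load_chat_room", [name])
        return ChatServerPub()

    def load_chat_room(self, name):
        return ChatRoom(name)


root = PersistNode(ChatServer())
root.public = root.builder.make_public(root)
ref = root.public.create_chat_room("lobby")
record = root.to_record()
print(record["children"][0]["call"])
```

Note that the saved record for the room holds no authority at all, only its Swiss base and the call to replay on the parent builder.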
On startup, the persistence system will:
- Take the root builder (perhaps a chatServer) as input.
- Create a persistNode for it.
- Revive all saved children, by calling methods on chatServer (e.g.
"loadChatRoom").
- Create the public object (chatServer.makePublic(persistNode)) and
register it with identityMgr.
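
The startup steps above can be modelled in a few lines of Python (again, not E). This is a sketch under assumptions: the record layout, the plain dict standing in for identityMgr, and the log list are hypothetical and exist only to make the ordering visible.

```python
class ChatRoom:
    def __init__(self, name):
        self.name = name

    def make_public(self, persist_node):
        return ("room facet", self.name)


class ChatServer:
    def load_chat_room(self, name):
        return ChatRoom(name)

    def make_public(self, persist_node):
        return ("server facet",)


def revive(builder, record, registry, log):
    """Revive one node: node created, children revived, then public object made."""
    node = {"swiss_base": record["swiss_base"], "children": []}
    log.append(("builder ready", record["swiss_base"]))
    for child_record in record["children"]:
        method, args = child_record["call"]
        # Replay the saved method call on the parent builder.
        child_builder = getattr(builder, method)(*args)
        node["children"].append(revive(child_builder, child_record, registry, log))
    # The dict stands in for registration with identityMgr.
    registry[record["swiss_base"]] = builder.make_public(node)
    log.append(("public ready", record["swiss_base"]))
    return node


saved = {
    "swiss_base": "root", "call": None,
    "children": [
        {"swiss_base": "room-1", "call": ["load_chat_room", ["lobby"]],
         "children": []},
    ],
}
registry, log = {}, []
revive(ChatServer(), saved, registry, log)
for step in log:
    print(step)
```

In this model the root builder is ready first, then each child is fully revived, and only then is the root's public object created and registered.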
Similarly, makeChatRoom() returns a builder for chat rooms. This
builder's makePublic will be called with its own persistNode, allowing
the chat room to manage its own children (e.g. bots).
If a chat room needs extra authority (e.g. a timer or a file for saving
the history), we don't need to change the on-disk format, just the
loadChatRoom method, e.g.
  to loadChatRoom(name) {
      return makeChatRoom(name, timer, <file:rooms>[name])
  }
We can give any authority this way, not just persistable authorities
(e.g. we could pass a verb facet or a shallow-read-only directory to
makeChatRoom, which we couldn't do with the default system).
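
A small Python model (not E) of why this works: the saved record is only a method name and arguments, so a newer builder can hand out extra authority when replaying it. The class names, the string standing in for a timer, and the directory paths are hypothetical.

```python
from pathlib import PurePosixPath

# The on-disk record for one room; it is the same before and after the upgrade.
saved_call = ("load_chat_room", ["lobby"])


class ChatServerV1:
    def load_chat_room(self, name):
        return {"name": name}


class ChatServerV2:
    """Newer builder: same saved records, more authority passed at revival."""

    def __init__(self, timer, rooms_dir):
        self.timer = timer
        self.rooms_dir = PurePosixPath(rooms_dir)

    def load_chat_room(self, name):
        return {"name": name, "timer": self.timer,
                "history": self.rooms_dir / name}


def replay(builder, call):
    method, args = call
    return getattr(builder, method)(*args)


old_room = replay(ChatServerV1(), saved_call)
new_room = replay(ChatServerV2("timer", "/srv/a/rooms"), saved_call)
moved_room = replay(ChatServerV2("timer", "/srv/b/rooms"), saved_call)
print(old_room, new_room, moved_room, sep="\n")
```

The same record revives under a different base directory as well, since the path is computed at revival time rather than stored.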
This seems to address the points above:
1. Scalability: It seems feasible that each object can be persisted
independently (although my implementation doesn't currently do this).
2. Redundant information: We only need to persist unique information
about each object. Everything else can be calculated anew at revival
time. Saved files are smaller and easier to read.
3. Difficulty upgrading: Because the on-disk format only contains the
key information, not incidental authority, it's easy to add or remove
authority, regenerate pathnames relative to a new base, etc.
4. Organisation: Every persistent object is organised into a hierarchy.
An object without a parent cannot be represented. Destroying an object
destroys all of its descendants automatically.
5. Safety: Objects don't need to export their authority ever, and they
don't need to hold a reference to their parents.
6. Too many code paths: The creation path exercises all code (e.g.
createChatRoom() uses the persistence system to create each room object
the first time too, not just to revive them). If an object can be made
sturdy, it is very likely it will save and restore correctly too.
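
Point 4 can be illustrated with a short Python model (not E). It is a sketch under assumptions: the Node class and the dict registry keyed by name are hypothetical stand-ins for the persistNode tree and the table of exported objects.

```python
class Node:
    """Hypothetical tree node; every node except the root has a parent."""

    def __init__(self, name, registry, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        registry[name] = self
        if parent is not None:
            parent.children.append(self)

    def destroy(self, registry):
        # Destroy descendants first, then unregister and detach this node.
        for child in list(self.children):
            child.destroy(registry)
        registry.pop(self.name, None)
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None


registry = {}
server = Node("server", registry)
room = Node("room", registry, parent=server)
bot = Node("bot", registry, parent=room)

room.destroy(registry)
print(sorted(registry))
```

Destroying the room also removes its bot, and the owner can always enumerate what exists by walking down from the root.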
Finally, this system ensures that objects are revived in a predictable
order (a parent builder before its children, the parent public object
after them).
--
Dr Thomas Leonard
IT Innovation Centre
2 Venture Road
Southampton
Hampshire SO16 7NP
Tel: +44 0 23 8076 0834
Fax: +44 0 23 8076 0833
mailto:tal at it-innovation.soton.ac.uk
http://www.it-innovation.soton.ac.uk