[e-lang] Alternative persistence system
Kevin Reid
kpreid at mac.com
Mon Oct 12 09:25:31 EDT 2009
On Oct 12, 2009, at 5:58, Thomas Leonard wrote:
> Summary
>
> I've spent some time using the E persistence mechanism now, and I've
> found it to be clever, elegant and very hard to use for my purposes. I
> ended up writing my own replacement for the timeMachine and
> makeSturdyRef objects. Below is a report on the problems I had and the
> solution I ended up with. Has anyone else had similar issues?
>
>
> Problems with the default system
>
> My test case is a fairly large program with many different kinds of
> objects, most of which need to be made persistent and exported as
> SturdyRefs. The program is divided into many modules, and plugins may
> extend it. Using the default E mechanism (makeSturdyRef and
> timeMachine), I found these problems:
>
> 1. Scalability: The system does not scale well. When an object's state
> changes, the entire object graph has to be written out, which takes
> time proportional to the number of objects in the system (roughly
> 10 ms per object on my machine), not to the number of changes.
> Fixing this would, presumably, require making each persistent
> object's representation be independent of that of all other
> persistent objects. Ideally, I'd like to save after creating every
> persistent object, before giving the sturdy ref to the user. I expect
> to be creating many such objects per second in some cases.
This is an efficiency problem; E-on-Java is just not very fast at
executing E code, which includes the implementation of the persistence
subsystem.

Due to E's requirements for consistency on revival, an entire vat
*must* be persisted as a unit. (Of course, if an application such as
yours has weaker requirements you can use an alternate system.)
> 2. Redundant information: an object's portrayal must include all of
> its authority. If I have 1000 "job" objects, each with access to
> "timer", then the saved file will include 1000 references to "timer".
> This could be fixed by referring to the "parent" object (the one that
> originally created this one) instead (e.g. revive using
> "parent.makeJob()"), but this still results in 1000 references to the
> parent. Also, this solution conflicts with (1), since we want to make
> the portrayals independent, and it requires giving the object access
> to its parent, which may not be desirable from a security point of
> view.
The repeated references are necessary to preserve capability security.
However, if an object needs multiple authorities, say 'timer' and
'stdout', then one thing you can do is have it persist as a reference
to a bundle of them:
  def jobAuthority { # which is an exit or gotten from some loader
    to timer() { return timer }
    to stdout() { return stdout }
  }
> 3. Difficulty upgrading: If I have a saved file containing "job"
> objects without timers, and I now decide that job objects should have
> timers, there is no easy way to add them later (at least, not without
> messing around with the surgeon's exits).
This is a hard problem in general, but if you use the authority bundle
above then you can just change the bundle and every job automatically
gets that authority when revived.

I've also imagined having a tool to basically do robust
search-and-replace on serialized files, which would be able to handle
the 'adding authority' problem in general.
> Also, if I give an object access to a subdirectory, it persists with
> an absolute pathname and I can't restart in a different directory.
This is a problem in the legacy file access subsystem, not the
persistence subsystem.
One possible solution (which *could* be built as a layer on top, or
built in): Create "RootRelativeFile" objects with the interface
  makeRootRelativeFile(root :any, subpath :String)
such that they behave like the file root[subpath] but persist as this
representation, and construct more of themselves (a membrane) when
sub-file references are retrieved from them. Then make the root
directory your app uses a switchable forwarder to whatever you
currently want the application root directory to be -- or, perhaps,
just a graph exit which you revive as whatever directory you choose.

Yes, this is additional complexity, but it is, I think, useful for many
applications besides yours. Realize that E's standard library is
nowhere near "complete" in having every basic capability utility one
ought to want.
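
To make the shape of the root-relative idea concrete, here is a rough sketch. It is written in Python rather than E so that it runs standalone, and it is only an analogy: RootSlot, the child() and portrayal() methods, the revive() helper and the "appRoot" exit name are all made up for illustration and are not part of E's file or persistence APIs.

```python
import os


class RootSlot:
    """Hypothetical switchable forwarder: holds the current application root."""

    def __init__(self, path):
        self.path = path


class RootRelativeFile:
    """Behaves like root[subpath], but remembers (root, subpath) separately."""

    def __init__(self, root, subpath):
        self._root = root
        self._subpath = subpath

    def child(self, name):
        # Membrane behaviour: sub-file references are also root-relative.
        return RootRelativeFile(self._root, os.path.join(self._subpath, name))

    def resolve(self):
        # The absolute path is computed on demand and never stored.
        return os.path.join(self._root.path, self._subpath)

    def portrayal(self):
        # What gets persisted: a maker name, a symbolic exit name for the
        # root, and the relative subpath. No absolute pathname appears.
        return ("makeRootRelativeFile", "appRoot", self._subpath)


def revive(portrayal, exits):
    """Rebuild a file reference, binding the root to whatever exit is supplied."""
    maker, root_name, subpath = portrayal
    if maker != "makeRootRelativeFile":
        raise ValueError("unknown maker: " + maker)
    return RootRelativeFile(exits[root_name], subpath)


# Usage: save under one root, revive under another.
saved = RootRelativeFile(RootSlot("/srv/old"), "").child("data").portrayal()
revived = revive(saved, {"appRoot": RootSlot("/srv/new")})
print(revived.resolve())
```

The point of the design is that the only place an absolute directory is named is the exit supplied at revival time, so restarting in a different directory means changing one binding.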
> 4. Organisation: In my systems at least, the owner of the service/vat
> should be able to see the state of the system and discover all
> objects. Each object is owned by some parent object, which maintains
> a list of its children. The E persistence system makes it easy to
> have objects which are exported and persistent but which are not
> owned by any object. You would have to catch and handle exceptions
> very carefully to ensure that this couldn't happen.
This is fixable generically: write the objects so that they (have just
enough authority to) check with their parents to make sure they are
properly registered, and become nonfunctional if they aren't.
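
A minimal sketch of that registration check, in Python rather than E so that it runs standalone. Parent, Job, disown() and NotRegisteredError are hypothetical names chosen for the example. Note that Python does not enforce encapsulation the way E does, so this shows only the control flow, not the security property.

```python
class NotRegisteredError(Exception):
    """Raised when a child is used without being listed by its parent."""


class Job:
    def __init__(self, name, is_registered):
        self._name = name
        # The only authority over the parent that the child holds:
        # a yes/no query about its own registration.
        self._is_registered = is_registered

    def run(self):
        # Refuse to function unless the parent currently lists this object.
        if not self._is_registered(self):
            raise NotRegisteredError(self._name + " is not registered")
        return "ran " + self._name


class Parent:
    def __init__(self):
        self._children = []

    def make_job(self, name):
        job = Job(name, self._is_registered)
        self._children.append(job)
        return job

    def disown(self, job):
        self._children = [c for c in self._children if c is not job]

    def children(self):
        return list(self._children)

    def _is_registered(self, child):
        # Compare by identity, not equality.
        return any(c is child for c in self._children)
```

A job that was exported and persisted but never made it into the parent's list (for example because an exception interrupted creation) then fails on first use instead of lingering as an unowned but working object.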
> Also, if I give makeSturdyRef to an object, I have no control over
> the objects it creates. I want to group objects so that I know where
> they came from and can destroy the whole group at once.
Follow capability practice by subdividing authority. Write a caretaker
wrapper around makeSturdyRef which records the refs it creates, so
that they can be destroyed as a group.
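
Roughly what such a wrapper might look like, sketched in Python rather than E so that it runs standalone. FakeSturdyRefMaker is a stand-in invented for the example and says nothing about the real makeSturdyRef interface. RefGroup, Caretaker, issued() and destroy() are likewise hypothetical names.

```python
import itertools


class RevokedError(Exception):
    """Raised when a reference from a destroyed group is used."""


class Caretaker:
    """Forwards method calls to the target while the group's gate is open."""

    def __init__(self, target, gate):
        self._target = target
        self._gate = gate

    def __getattr__(self, name):
        def forward(*args, **kwargs):
            # Checked at call time, so destroying the group takes effect
            # even for holders that already looked the method up.
            if not self._gate["open"]:
                raise RevokedError("reference group has been destroyed")
            return getattr(self._target, name)(*args, **kwargs)
        return forward


class RefGroup:
    """Attenuated sturdy-ref maker: records what it issues, revocable as a unit."""

    def __init__(self, make_sturdy_ref):
        self._make_sturdy_ref = make_sturdy_ref
        self._gate = {"open": True}
        self._issued = []

    def make_sturdy_ref(self, obj):
        if not self._gate["open"]:
            raise RevokedError("reference group has been destroyed")
        # Export the caretaker, never the object itself.
        ref = self._make_sturdy_ref(Caretaker(obj, self._gate))
        self._issued.append(ref)
        return ref

    def issued(self):
        return list(self._issued)

    def destroy(self):
        self._gate["open"] = False


class FakeSturdyRefMaker:
    """Stand-in for the underlying maker (assumption, not E's API)."""

    def __init__(self):
        self._table = {}
        self._ids = itertools.count(1)

    def __call__(self, obj):
        ref = "ref-%d" % next(self._ids)
        self._table[ref] = obj
        return ref

    def live(self, ref):
        return self._table[ref]
```

You would hand group.make_sturdy_ref to the untrusted object in place of the real maker, and keep the RefGroup itself so you can list what was created and shut the whole group off.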
> 5. Safety: Without persistence, objects accept authority but don't
> generally give it out (unless that's part of their function). Making
> an object persistent can be done in two ways (__optUncall and
> __optSealedDispatch). The first is easy but unsafe, the second is
> harder but safer (though still with some issues, as mentioned
> previously). A typical programmer, not too concerned with security,
> has a reasonable chance of writing a fairly secure E object that
> doesn't expose more authority than it should. However, they are very
> likely to take the easy and less secure option of using __optUncall
> for persistence.
In principle it could be reduced to one extra call with a suitable
library:
  to __optSealedDispatch(b) {
    return doPersistence(b, fn { [makeWhatever, ...] })
  }
But I suspect that your hypothetical "almost knows what to do"
programmer would fail to write secure code in other ways anyway.
> 6. Too many code paths: A persistable object must implement three code
> paths: create, save and revive. Most objects are not designed for
> persistence and only support the first case. An object which depends
> on an unpersistable object is also unpersistable. For example, if I
> call makeObject(file.deepReadOnly()) then the resulting object cannot
> be persisted, because read-only files cannot be. Also, the revive
> operation must be made public so that the persistence system can call
> it. This may be safe, but it is not good API design as other people
> may start using it by mistake.
The "revive" operation *should*, when possible, be the same as the
"create" operation. Exceptions should be reviewed with suspicion.

Makers-for-revival *are* part of the public API because as soon as
your app is deployed, people have saved data which uses those
interfaces. You have to preserve compatibility or announce breakage/
support migration just like with any other public interface. (Think of
it like ABI/"binary compatibility" in C shared libraries.)

That read-only files are unpersistable is a bug. (You can work around
it by adding a loader to the surgeon which recognizes read-only files.)
> A possible solution
...
I suspect that your solution is, in general, able to work more
straightforwardly *for your application* because you have additional
constraints:

1. Your objects are arranged in a hierarchy.
2. You have no objects with which your application is mutually
   suspicious.

To expand on the second point, your scheme of reviving objects with
authority based on their parents would fail dangerously if the child
object was not actually one of yours, but something which did not have
that authority in the previous incarnation and now gets it.

I don't know your real persistence infrastructure, so I can't say
whether this actually makes sense, but that is the general form of my
suspicion: that you have something which is easier to use, but either
less powerful or unsafe-given-untrusted-code (depending on the details).
--
Kevin Reid <http://switchb.org/kpreid/>