[e-lang] Newbie questions about persistence
Thomas Leonard
tal at it-innovation.soton.ac.uk
Fri Aug 28 10:27:25 EDT 2009
Hi,
I didn't find any solution to the previous problem, but it hasn't been
causing too much trouble and I'm trying to learn about persistence
now.
If you visit "erights.org -> ELib -> Persistence" you just get a load
of pages marked "to be written", but there's actually quite a lot of
good documentation if you can find it. In particular, I found these
links useful:
* http://www.erights.org/data/serial/jhu-paper/deconstructing.html
* http://www.eros-os.org/pipermail/e-lang/2004-January/009483.html
* http://www.erights.org/elib/distrib/captp/index.html
There are a number of things which are probably obvious if you know
the system, but took me a while to realise. I'll summarise them here
for other beginners, and so people can correct the bits I get wrong.
The prototype I'm writing has a client/server architecture. A client
generally requests objects be created on a server, and the server
monitors them. Both the client and server are able to persist their
current state. There are multiple clients and servers and quite a lot
of sharing goes on between them.
Sturdy refs
My first mistake was assuming that, since a client can invoke methods
on a live ref and can pass the live ref to other clients, it must know
the remote object's Swiss number. I therefore assumed that the client
should call "makeSturdyRef(farRef)" to be able to reconnect or persist
the reference.
The CapTP documentation says this about what is sent to the client (in
DeliverOnlyOp):
"The NewFarDesc encoding must have all the information needed to
create such a new Far reference, which is the position the Far
reference should be assigned in the Imports table (the same as the
position at which Carol is Exported), and the SwissNumber that,
together with VatA's VatID, represents Carol's sameness identity."
However, the JavaDoc shows that a NewFarDesc actually contains only
the Swiss hash (i.e. the identity of the object), not the Swiss number
(the ability to call it):
http://www.erights.org/javadoc/net/captp/jcomm/NewFarDesc.html
I'm not sure whether we're using NewFarDesc or OldFarDesc in practice,
though I assume it doesn't make any difference here.
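The difference between holding a Swiss hash and holding a Swiss number can be modelled in a few lines of Python. This is a rough sketch, not E and not the CapTP implementation. It assumes the Swiss hash is a one-way cryptographic hash of the Swiss number; the choice of SHA-1 and the names swiss_hash and can_invoke are assumptions made for illustration only.

```python
import hashlib
import secrets

def swiss_hash(swiss_number: bytes) -> bytes:
    """One-way hash of a Swiss number (SHA-1 is assumed here for illustration)."""
    return hashlib.sha1(swiss_number).digest()

# The hosting vat picks an unguessable number for an exported object.
swiss_number = secrets.token_bytes(20)

# This is all a far-ref descriptor would carry in this model: enough to
# compare identities, but not enough to look the object up again.
identity = swiss_hash(swiss_number)

def can_invoke(presented: bytes) -> bool:
    """The host accepts only the Swiss number itself, never its hash."""
    return presented == swiss_number
```

In this model a client that was only ever sent identity can test two references for sameness, but presenting identity back to the host gets it nowhere, which matches the observation that a client cannot build a sturdy ref from a live ref.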
My understanding of the situation now is that:
* When a live ref is sent to the client, it is assigned a small
integer ID (like a Unix file descriptor), which is only valid in the
context of that TCP connection. There is a special mechanism for
passing these live refs between clients, via the server.
* The client cannot get a Swiss number from a live ref.
* The server (hosting vat) must always call makeSturdyRef on an object
before passing it to another vat if it wants the remote end to be
able to persist it or reconnect after the TCP connection is closed.
* Calling makeSturdyRef creates a new Swiss base and, from that, a new
Swiss number. It adds a mapping from this to the object, in a map it
shares with the time machine. It then registers the object with the
IdentityMgr (so that clients can connect to it) and creates the
actual SturdyRef object with the Swiss number in it.
* Calling makeSturdyRef.temp is similar, except that it doesn't add
anything to the map shared with the time machine.
* The time machine works by saving the object graph starting from the
swiss-number-to-object mapping.
* Since makeSturdyRef(obj) makes a sturdy ref that lasts forever and
is persisted, and throws away the SwissRetainer used to cancel it,
refs made this way will accumulate forever.
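The points above about makeSturdyRef and makeSturdyRef.temp can be pictured with a toy Python model. This is only a sketch of my understanding as summarised in the list, not the real implementation; ToyVat, make_sturdy_ref, make_sturdy_ref_temp, lookup and checkpoint are all hypothetical names, and a hex string stands in for the SturdyRef.

```python
import secrets

class ToyVat:
    """Toy model of a hosting vat's Swiss tables (hypothetical names throughout)."""

    def __init__(self):
        self.persistent = {}  # swiss number -> object; shared with the time machine
        self.temporary = {}   # swiss number -> object; forgotten on restart

    def make_sturdy_ref(self, obj):
        swiss = secrets.token_hex(20)
        self.persistent[swiss] = obj  # nothing is kept that could cancel this entry
        return swiss                  # stands in for the SturdyRef

    def make_sturdy_ref_temp(self, obj):
        swiss = secrets.token_hex(20)
        self.temporary[swiss] = obj
        return swiss

    def lookup(self, swiss):
        """What a reconnecting client would trigger by presenting a Swiss number."""
        if swiss in self.persistent:
            return self.persistent[swiss]
        return self.temporary[swiss]

    def checkpoint(self):
        """What the time machine would save in this model: the persistent table only."""
        return dict(self.persistent)

vat = ToyVat()
job = object()
refs = [vat.make_sturdy_ref(job) for _ in range(3)]
temp = vat.make_sturdy_ref_temp(job)
saved = vat.checkpoint()
```

Three calls on the same object leave three separate entries in saved, and nothing in this model can remove them, which is the accumulation problem described in the last bullet. The temporary ref resolves while the vat is running but is absent from the checkpoint.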
Persistence
Starting with the mapping from SwissRetainers to objects (i.e. all
persistent exported objects), the time machine serialises the object
graph to a file, along with the vat's private key and network address.
The format of this serialisation is a subset of E called Data-E.
Internally, the time machine uses a "surgeon" to serialise and
unserialise object graphs. Objects which should not be serialised, but
which will be available when reloading, are called "exits" of the
graph, and must be registered with the surgeon first.
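The idea of exits can be sketched in Python as follows. This is a simplified model and not the E surgeon: ToySurgeon and add_exit are hypothetical names, JSON stands in for Data-E, and only integers, strings and lists are treated as data.

```python
import json

class ToySurgeon:
    """Toy serialiser in which registered exits are saved by name, not by value."""

    def __init__(self):
        self._name_of = {}  # id(exit object) -> exit name
        self._exit = {}     # exit name -> exit object (also keeps the object alive)

    def add_exit(self, obj, name):
        self._name_of[id(obj)] = name
        self._exit[name] = obj

    def serialize(self, obj):
        return json.dumps(self._depict(obj))

    def _depict(self, obj):
        name = self._name_of.get(id(obj))
        if name is not None:
            return {"exit": name}  # cut the graph here
        if isinstance(obj, list):
            return {"list": [self._depict(x) for x in obj]}
        if isinstance(obj, (int, str)):
            return {"data": obj}
        raise ValueError(f"Can't uneval {obj!r}")

    def unserialize(self, text):
        return self._revive(json.loads(text))

    def _revive(self, node):
        if "exit" in node:
            return self._exit[node["exit"]]  # reconnect to the live object
        if "list" in node:
            return [self._revive(x) for x in node["list"]]
        return node["data"]

class Console:
    pass

console = Console()
surgeon = ToySurgeon()
surgeon.add_exit(console, "stdout")
text = surgeon.serialize(["log", 3, console])
revived = surgeon.unserialize(text)
```

The console object never appears in text, only its exit name does, and the revived graph points at whatever object is registered under that name at load time. An object that is neither data nor an exit is refused.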
The system is designed to be able to serialise mutually-suspicious and
untrusted objects. No object should be able to revive with more
authority than it started with.
Some objects in E can be serialised automatically, including numbers,
strings, lists, maps, etc.
Some objects can be serialised if an appropriate "loader" or
"uncaller" is registered with the surgeon. These include files and
modules.
Many things cannot be serialised by default, including:
- facets (obj.method)
- caretakers
- promises
- near problems (bug in minimalUncaller?)
Persistence of sturdy refs
A sturdy ref can only be serialised if fully resolved (is this a bug?),
e.g.
? introducer.onTheAir()
? def surgeon := <elib:serial.makeSurgeon>.withSrcKit("de: ").diverge()
? surgeon.addLoader(introducer, "cap__uriGetter")
? def a
? def b := makeSturdyRef.temp(3)
? bind a := b
? surgeon.serialize(a)
# problem: Can't uneval <SturdyRef to 3>
? a == b
# value: true
? surgeon.serialize(b)
# value: "de: <cap://*qvcmrxvqy66rest6gfzhhxwrjni6nrdq@192.9.206.110:59905/jo6sk4nehnrewzo7bvntfz7rl32f6q2g>"
I used this code to work around this:
# Uncalls SturdyRefs to <cap:...>
def sturdyLoader extends introducer {
    to optUncall(obj) {
        return super.optUncall(Ref.resolution(obj))
    }
}
surgeon.addLoader(sturdyLoader, "cap__uriGetter")
Persistence of cycles
Some cyclic data structures can be persisted, while others can't (is
this a bug?). I haven't worked out what the rule is, but I do have
some examples:
def surgeon := <elib:serial.makeSurgeon>.withSrcKit("de: ").diverge()
def roundTrip(obj) {
    println(`Serializing $obj...`)
    def data := surgeon.serialize(obj)
    println(`Serialized as $data`)
    def obj2 := surgeon.unserialize(data)
    println(`Unserialized as $obj2`)
    println("")
}
def a := [a]
roundTrip(a)
def b := [a, a]
roundTrip(b)
This prints:
Serializing [<***CYCLE***>]...
Serialized as de: def t__0 := [t__0]
Unserialized as [<***CYCLE***>]
Serializing [[<***CYCLE***>], [<***CYCLE***>]]...
Serialized as de: [def t__0 := [t__0], t__0]
# problem: <IndexOutOfBoundsException: not found: t__0>
This is quite annoying, because a SwissRetainer includes a pointer to
its object, in addition to the pointer in the table for which the
retainer is a key. Therefore, an object containing a single cycle ends
up with the broken double-cycle structure when saved.
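For comparison, here is a generic way to round-trip both shapes, sketched in Python. It is not how the E surgeon or Data-E works and says nothing about where the failure above comes from; it only shows that a scheme which numbers every distinct list before filling any of them in copes with a cycle that is referenced twice. The names depict and revive are made up for this sketch.

```python
def depict(root):
    """Flatten nested lists into a table of nodes, one entry per distinct list."""
    index = {}  # id(list) -> node number
    nodes = []  # node number -> entries, each ("ref", n) or ("data", value)

    def visit(obj):
        if not isinstance(obj, list):
            return ("data", obj)
        key = id(obj)
        if key not in index:
            index[key] = len(nodes)
            nodes.append(None)  # reserve the slot before recursing
            nodes[index[key]] = [visit(x) for x in obj]
        return ("ref", index[key])

    return visit(root), nodes

def revive(depiction):
    """Rebuild the graph: allocate every list first, then fill in the entries."""
    root, nodes = depiction
    lists = [[] for _ in nodes]

    def value(entry):
        kind, payload = entry
        return lists[payload] if kind == "ref" else payload

    for target, entries in zip(lists, nodes):
        target.extend(value(e) for e in entries)
    return value(root)

a = []
a.append(a)  # like `def a := [a]`
b = [a, a]   # like `def b := [a, a]`
a2 = revive(depict(a))
b2 = revive(depict(b))
```

Both a2 and b2 come back with the original sharing intact: b2 holds the same cyclic list twice rather than two copies of it.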
Persistence of identity
Transparent objects can implement __optUncall to allow them to be
persisted. This method returns all of the object's authority, proving
to the surgeon that it is permitted to be revived with it. However,
this also allows anyone else to get an object's authority by calling
__optUncall themselves.
Therefore, most objects should use __optSealedDispatch to seal the
result. There don't seem to be many examples of this (e.g. FileGetter
uses "obj instanceof File" instead). I'm having trouble seeing how to
use this. Should I add a new loader/uncaller? How is identity handled?
For example, if I have a one-shot object, how can I ensure that an object
holding it will revive with only one copy of the one-shot object?
Similarly for a caretaker.
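The sealing idea can be modelled in Python as below. This is a sketch of the general sealer/unsealer pattern, not of E's actual API: make_brand_pair, Purse and opt_sealed_dispatch are hypothetical names, and Python does not really hide the private fields, so this illustrates the flow rather than enforcing it. It also does not address the identity question above.

```python
def make_brand_pair():
    """Return (brand, seal, unseal); only this unseal can open boxes made by this seal."""
    brand = object()
    vault = {}  # box -> contents, reachable only through the two functions below

    def seal(contents):
        box = object()  # opaque token, reveals nothing by itself
        vault[box] = contents
        return box

    def unseal(box):
        return vault[box]  # KeyError for a box from any other pair

    return brand, seal, unseal

class Purse:
    """An object that discloses its uncall portrayal only inside a sealed box."""

    def __init__(self, balance, brand, seal):
        self._balance = balance
        self._brand = brand
        self._seal = seal

    def opt_sealed_dispatch(self, brand):
        if brand is not self._brand:
            return None  # unknown brand: disclose nothing
        return self._seal(("makePurse", "run", [self._balance]))

brand, seal, unseal = make_brand_pair()
purse = Purse(10, brand, seal)

# Any client may ask, but it only ever receives an opaque box.
box = purse.opt_sealed_dispatch(brand)

# Only the holder of the matching unsealer (the persistence system) can look inside.
portrayal = unseal(box)
```

In this model the object still proves to the unsealer's holder what it should be revived with, but a client that calls opt_sealed_dispatch itself gets only the box.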
Persistence of functions
The result of uncalling an object is usually of the form [makeFoo,
"run", [...]], where makeFoo is a top-level function in some module.
Is there any way to get the time machine to handle these
automatically, without having to add them all as exits manually?
My current solution is that every module looks like this:
def makeFooInternal(state) {
    return def foo {
        to __optUncall() {
            return [makeFooInternal, "run", [state]]
        }
    }
}
def makeFoo() {
    def state := ...
    return makeFooInternal(state)
}
[ =>makeFoo, =>makeFooInternal ]
I then add an uncaller that recognises top-level functions by their
name:
# Cache of imported modules, keyed by package name
def modules := [].asMap().diverge()
def loader(pkg, name) {
    if (!modules.maps(pkg)) {
        modules[pkg] := <import>[pkg]
    }
    return modules[pkg][name]
}
def functionUncaller {
    to optUncall(obj) {
        if (Ref.isNear(obj)) {
            def type := obj.__getAllegedType()
            # A top-level function's name has the form "pkg$function"
            def name := type.getFQName().split("$")
            if (name.size() != 2) {
                return null
            }
            # fnName avoids shadowing the obj parameter
            def [pkg, fnName] := name
            return [loader, "run", [pkg, fnName]]
        }
        return null
    }
}
This seems a bit clumsy. Is there a better way? There are too many
functions to add them manually.
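As a point of comparison only, the same name-based trick looks like this in Python, where a top-level function already knows its module and name. It is an analogy and not a suggestion for E; uncall_function and load_function are hypothetical names.

```python
import importlib
import json

def uncall_function(fn):
    """Describe a top-level function as (module, name), or None if that is not possible."""
    module = getattr(fn, "__module__", None)
    name = getattr(fn, "__qualname__", None)
    if module is None or name is None:
        return None
    if "." in name or "<" in name:
        return None  # nested function, method or lambda: no stable top-level name
    return (module, name)

def load_function(module, name):
    """Revive a function from its description by importing the module again."""
    return getattr(importlib.import_module(module), name)

description = uncall_function(json.dumps)
revived = load_function(*description)
```

As with the E version, this trusts the name completely: anything that can claim a top-level name gets revived as whatever that name refers to at load time.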
Thanks, and sorry for all the questions!
--
Dr Thomas Leonard
IT Innovation Centre
2 Venture Road
Southampton
Hampshire SO16 7NP
Tel: +44 0 23 8076 0834
Fax: +44 0 23 8076 0833
mailto:tal at it-innovation.soton.ac.uk
http://www.it-innovation.soton.ac.uk