Default database / namespace
Kragen Sitaker
kragen@pobox.com
Mon, 3 Jul 2000 17:16:58 -0400 (EDT)
Robert Wittams writes:
> Warning: I may be clueless! Don't flame too hard ;-)
I may be too, so I'm in no position to flame you :)
> But there are
> still a lot of situations in which it will be necessary to organise keys
> & data, and having a unified namespace is a goal towards which most OSs
> seem to be moving.
Naming is important. Have you read "The Hideous Name"?
http://achille.cs.bell-labs.com/cm/cs/doc/85/1-05.ps.gz
> Of course, as this is not mandated by the EROS kernel, any implementation
> provided is merely a default and can be removed or ignored (by not
> passing the key to access it to any objects that do not need it).
Right --- it doesn't even need to be provided by the kernel at all.
> If an object possesses a write key to the namespace (i.e. it can bind
> things in to the namespace), it has a channel to communicate to any
> other object which can read the namespace. This must be prevented, so,
> for example, a sandboxed domain which wanted access to *a* namespace can
> be passed a key to a recently constructed one, or a filtered view of the
> "old" namespace. The key would obviously need to be explicitly passed in
> all situations. And most native EROS programs would never need a
> namespace key.
I'm not sure about that last sentence --- let's wait and see --- but I
agree with the rest.
A Unix directory is a namespace. A tree of namespaces is a
hierarchical namespace. A directed graph of namespaces with some
reference point can also be thought of as a hierarchical namespace.
In EROS, each key constituting an edge of this graph can carry read
and/or write permission. This way, it is possible to give an object a
write key to a private namespace which bears read-only keys to shared
namespaces, preventing that object from communicating with other
objects that can read the shared namespaces.
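Here is a rough sketch of that arrangement in Python. It is only an illustration under my own assumptions: the Namespace and Key classes, and the method names lookup(), bind_name(), and read_only(), are made up for this example and are not EROS interfaces. A key is modelled as an object carrying read and/or write permission to one namespace.

```python
class Namespace:
    """A flat name -> key table; a hypothetical stand-in for a namespace object."""

    def __init__(self):
        self._entries = {}


class Key:
    """A capability to one namespace, carrying read and/or write permission."""

    def __init__(self, target, can_read=True, can_write=False):
        self._target = target
        self.can_read = can_read
        self.can_write = can_write

    def lookup(self, name):
        if not self.can_read:
            raise PermissionError("key lacks read permission")
        return self._target._entries[name]

    def bind_name(self, name, value):
        if not self.can_write:
            raise PermissionError("key lacks write permission")
        self._target._entries[name] = value

    def read_only(self):
        # Derive a weaker key to the same namespace.
        return Key(self._target, can_read=self.can_read, can_write=False)


# A shared namespace, populated by whoever holds the read-write key.
shared = Namespace()
shared_rw = Key(shared, can_read=True, can_write=True)
shared_rw.bind_name("motd", "hello")

# A private namespace that holds only a read-only key to the shared one.
private = Namespace()
private_rw = Key(private, can_read=True, can_write=True)
private_rw.bind_name("shared", shared_rw.read_only())

# The sandboxed object is handed private_rw and nothing else.
private_rw.bind_name("scratch", "my data")
shared_view = private_rw.lookup("shared")
motd = shared_view.lookup("motd")

blocked = False
try:
    shared_view.bind_name("leak", "secret")
except PermissionError:
    blocked = True
```

The sandboxed object can write freely into its private namespace and can read the shared one, but the attempt to bind "leak" into the shared namespace fails, so it has no channel to the other readers.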
> Having been impressed by the ideas presented by Hans Reiser in this
> paper:
> http://devlinux.com/projects/reiserfs/whitepaper.html
I am impressed by them too.
> and also the Microsoft COM/OLE idea of a moniker,
Can you elaborate on monikers?
> I propose that the default namespace of EROS be flexible enough to
> subsume hierarchical, keyword, and relational systems.
Has anyone yet come up with a usable mapping from namespaces into
relational databases? It seems like a very hard problem to me.
I think you should implement your idea and see if it is a good one.
> Also, it should be possible to stack namespaces, and mount them at any
> point within each other.
Sure.
> So, the way I think this could be done:
>
> An interface "namespace" with these operations:
> get(query) - return an object matching a query. May be a set.
Here "query" is analogous to "filename" in the Unix world?
> link(query, key) - add an association into the database for the key.
> unlink(query, key) - remove an association for the key.
>     Need both as multiple objects may be linked to the same criteria.
> bind(query, key) - maybe should be called mount?
>     Key must be to another namespace. Any queries which match go into
>     this namespace for further searching. Any "link" operation will try to
>     link from the top down, on any namespaces which have a write key.
> unbind(query, key)
>
> ( maybe there needs to be a way to ask which namespaces are bound in
> currently, or maybe this comes in the answer to get() ? )
link() and bind() seem like the same thing to me; likewise unlink() and
unbind(). Is there a semantic difference?
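To make the question concrete, here is one possible reading of the proposed interface as a Python sketch. Everything beyond the five operation names is my assumption: queries are treated as plain path strings, keys as hashable stand-in values, and bind() as a prefix mount. Under this reading the difference is that a linked key is returned as-is, while a bound namespace is searched further with the rest of the query.

```python
class Namespace:
    """One hypothetical reading of the get/link/unlink/bind/unbind interface."""

    def __init__(self):
        self._links = {}   # query string -> set of keys
        self._mounts = {}  # query prefix -> child Namespace

    def link(self, query, key):
        # Several keys may be linked to the same query.
        self._links.setdefault(query, set()).add(key)

    def unlink(self, query, key):
        keys = self._links.get(query, set())
        keys.discard(key)
        if not keys:
            self._links.pop(query, None)

    def bind(self, prefix, namespace):
        # Assumption: a bound namespace takes over every query under prefix.
        self._mounts[prefix] = namespace

    def unbind(self, prefix, namespace):
        if self._mounts.get(prefix) is namespace:
            del self._mounts[prefix]

    def get(self, query):
        # Always returns a set, possibly empty.
        results = set(self._links.get(query, ()))
        for prefix, child in self._mounts.items():
            if query == prefix or query.startswith(prefix + "/"):
                # Hand the remainder of the query to the mounted namespace.
                results |= child.get(query[len(prefix):] or "/")
        return results


linux = Namespace()
linux.link("/usr/local/bin", "key-to-bin-dir")

root = Namespace()
root.bind("/linux", linux)
root.link("/recipes/yumyum", "k1")

found_mounted = root.get("/linux/usr/local/bin")
found_direct = root.get("/recipes/yumyum")
```

If the intended semantics are different (for instance, if get() on a bound name should return the namespace key itself), then link() and bind() really would collapse into one operation.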
Have you looked at the Linux VFS interface, the BSD VNode interface,
the Coda interface, or the Plan 9 9P interface, to see how they do
these things?
> and she does these queries:
>
> /linux/usr/local/bin - returns a set which has the contents of that
> directory under linux
Not a namespace, I take it --- just a set? You're talking about
building a flat, nonhierarchical namespace, then?
> Anyway, once you have a key out of the namespace, you check the alleged
> key type, and do what you like with it. I think something like
> QueryInterface could be very useful here.
QueryInterface is the COM interface for determining whether an object
supports a particular interface, isn't it?
> She has a key, k1 to a file (it is a stream of bytes) about a recipe.
> She goes link("/recipes/yumyum", k1)
> Then she goes link("[milk cheese flour]", k1)
> Then link("[ cooking-time/45mins ]", k1)
>
> What happens is that the namespace asks each namespace on / to link it
> to recipes/yumyum. First it asks the record namespace. The key is not a
> record, so it doesn't add it. Next it asks the file namespace. It is a
> file, so it is added.
This is a very interesting idea. Perhaps you should implement it to
see if it is good.
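A first experiment might look something like the following Python sketch. The class names, the dictionary used as a stand-in for a key, and its "type" field (playing the role of the alleged key type) are all assumptions of mine. I have also assumed that the root stops at the first namespace that accepts the key, which your description does not say either way.

```python
class TypedNamespace:
    """A hypothetical backend that only accepts keys of one alleged type."""

    def __init__(self, accepted_type):
        self.accepted_type = accepted_type
        self.links = {}  # query -> list of keys

    def link(self, query, key):
        # Decline keys of the wrong type rather than raising an error.
        if key.get("type") != self.accepted_type:
            return False
        self.links.setdefault(query, []).append(key)
        return True


class RootNamespace:
    """Asks each namespace mounted on / in order until one accepts the key."""

    def __init__(self, backends):
        self.backends = list(backends)

    def link(self, query, key):
        for backend in self.backends:
            if backend.link(query, key):
                return backend
        raise TypeError("no mounted namespace accepts this key")


records = TypedNamespace("record")
files = TypedNamespace("file")
root = RootNamespace([records, files])

# k1 is a key to a byte-stream file, so the record namespace declines it.
k1 = {"type": "file", "name": "yumyum recipe"}
accepted_by = root.link("/recipes/yumyum", k1)
```

Here the record namespace is asked first and declines, and the file namespace ends up holding the link.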
> In this way, efficient storage of things as either records or byte
> streams can be achieved. And legacy systems can be bolted in too.
The usual way to achieve this is for files to live in some places and
records to live in other places in the namespace. What kinds of
advantages do you get from mounting them all in the same place?
> from a string. The advantage of this is that different systems could be
> supported - e.g. a Windows name parser, a Unix name parser, even an SQL
> parser. eek. Also, elements of the tuples could be other keys. Meaning
> that parts of names could be arbitrary objects, not just strings. E.g. in
> a record, another record type.
At this point I think you have gotten into OODBMSs :)
> Is there enough difference between a set and a namespace to warrant a
> separate interface? It might be best to just return another namespace,
> then more querying could be done on that.
That's a good question. I'd err in the direction of unification at
first, splitting the objects if it turned out to be necessary.
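As a sketch of what the unified version could look like, here is a keyword namespace in Python whose get() returns another namespace of the same kind, so a result can be queried again. The KeywordNamespace class and its keys() method are hypothetical, and keyword queries are modelled as lists of strings rather than parsed from "[milk cheese flour]".

```python
class KeywordNamespace:
    """A hypothetical keyword namespace where a query result is itself a namespace."""

    def __init__(self):
        self._tags = {}  # key -> set of keywords linked to it

    def link(self, keywords, key):
        self._tags.setdefault(key, set()).update(keywords)

    def get(self, keywords):
        # Return a new namespace holding only keys that carry every keyword.
        wanted = set(keywords)
        result = KeywordNamespace()
        for key, tags in self._tags.items():
            if wanted <= tags:
                result._tags[key] = set(tags)
        return result

    def keys(self):
        return set(self._tags)


ns = KeywordNamespace()
ns.link(["milk", "cheese", "flour"], "k1")
ns.link(["cooking-time/45mins"], "k1")
ns.link(["milk", "eggs"], "k2")

with_milk = ns.get(["milk"])
quick_with_milk = with_milk.get(["cooking-time/45mins"])
```

Because the result has the same interface as the thing queried, narrowing a search is just another get(), and no separate set type is needed until something forces one.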
> How do you optimise for disk and memory usage bearing checkpointing in
> mind? Is this (eg the indexes) something for which explicit
> journaling/IO may make sense?
Explicit journaling/IO is useful when you need durability guarantees:
that if the system crashes, the data won't be lost. OODBMSs and RDBMSs
frequently do this.
> Does some kind of transaction model need to be supported? If so, in what
> form? It might be best to have another interface for this on record
> oriented systems. So you do all your heavy database work with
> transactions via that interface, but it is still possible to add in /
> view stuff via the namespace interface, which internally would use a
> transaction for each change.
Transactions are necessary for some things, not worth the cost for
others, IMHO. I think that whether you need transactions is more or
less orthogonal to whether your data is relational, object-oriented,
textual, or hierarchical. Are you familiar with two-phase commit?
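In case it helps, here is a bare-bones Python sketch of the idea behind two-phase commit. It is a toy under stated assumptions: the Participant class and the two_phase_commit() function are invented for illustration, everything lives in memory, and there is none of the durable logging or crash recovery that a real implementation needs.

```python
class Participant:
    """A hypothetical resource (e.g. one namespace backend) taking part in a commit."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "idle"
        self.staged = None
        self.committed = []

    def prepare(self, change):
        # Phase 1: stage the change and vote on whether it can be committed.
        if not self.will_vote_yes:
            self.state = "aborted"
            return False
        self.staged = change
        self.state = "prepared"
        return True

    def commit(self):
        # Phase 2: make the staged change permanent.
        self.committed.append(self.staged)
        self.staged = None
        self.state = "committed"

    def abort(self):
        self.staged = None
        self.state = "aborted"


def two_phase_commit(participants, change):
    """Commit change everywhere, or nowhere if any participant votes no."""
    prepared = []
    for p in participants:
        if p.prepare(change):
            prepared.append(p)
        else:
            for q in prepared:
                q.abort()
            return False
    for p in participants:
        p.commit()
    return True


records = Participant("records")
files = Participant("files")
ok = two_phase_commit([records, files], "link /recipes/yumyum")

index = Participant("index", will_vote_yes=False)
failed = two_phase_commit([records, index], "unlink /recipes/yumyum")
```

The point is that a change spanning several namespaces either takes effect in all of them or in none, whatever kind of data each one stores.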
> Hope some of this made sense,
Much of it sounds interesting. If you build it and it turns out to be
good, I suspect it will be adopted.
--
<kragen@pobox.com> Kragen Sitaker <http://www.pobox.com/~kragen/>
The Internet stock bubble didn't burst on 1999-11-08. Hurrah!
<URL:http://www.pobox.com/~kragen/bubble.html>
The power didn't go out on 2000-01-01 either. :)