Kragen Sitaker wrote:
>
> Robert Wittams writes:
> > Warning: I may be clueless! Don't flame too hard ;-)
>
> I may be too, so I'm in no position to flame you :)
I hope everyone is as accommodating ;-)
> > But there are
> > still a lot of situations in which it will be necessary to organise keys
> > & data, and having a unified namespace is a goal towards which most OSs
> > seem to be moving.
>
> Naming is important. Have you read "The Hideous Name"?
> http://achille.cs.bell-labs.com/cm/cs/doc/85/1-05.ps.gz
A long, long time ago, in a galaxy far away, but I am reading it again now.
> > Of course as this is not mandated by the eros kernel, any implementation
> > provided is merely a default and can be removed or ignored (by not
> > passing the key to access it to any objects that do not need it).
>
> Right --- it doesn't even need to be provided by the kernel at all.
>
> > If an object posseses a write key to the namespace (ie it can bind
> > things in to the namespace), it has a channel to communicate to any
> > other object which can read the namespace. This must be prevented, so,
> > for example, a sandboxed domain which wanted access to *a* namespace can
> > be passed a key to a recently constructed one, or a filtered view of the
> > "old" namespace. The key would obviously need to be explicitly passed in
> > all situations. And most native eros programs would never need a
> > namespace key.
>
> I'm not sure about that last sentence --- let's wait and see --- but I
> agree with the rest.
Well, if you think about it - these are the things you need to use names for in most unix/windows/mac progs:
Interactive user apps, eg a text file editor loading a file. The app has a
key to a file selector factory and makes a file selector. The file selector
waits for user input to select the file, then gives a key to the file back
to the app. Similarly for saving.
So the app knows about files, but not about namespaces.
Command line - the shell opens things for the programs and passes them the keys.
The only stuff which does need to know about any names is the direct
portion that the user selects a name from, eg a web/ftp server. And the nice
thing is that this webserver could be given a key to an arbitrary
namespace (This could even be a query! Or a filter program supporting
the right interface. That would be very cool.)
But I think a lot of things could get away with no namespace knowledge
at all.
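
A rough sketch of the file selector idea, in Python rather than real EROS code. Everything here (FileKey, FileSelector, Editor, the dict standing in for a namespace, the ask_user callback) is a made-up name for illustration only; the point is just that the app ends up holding a key to a file and never a key to the namespace.

```python
class FileKey:
    """Hypothetical stand-in for a key to a file (a stream of bytes)."""
    def __init__(self, data=b""):
        self._data = data

    def read(self):
        return self._data


class FileSelector:
    """Holds the namespace on the user's behalf; the app never sees it."""
    def __init__(self, namespace, ask_user):
        self._namespace = namespace   # assumption: a dict of name -> FileKey
        self._ask_user = ask_user     # assumption: callable that picks a name

    def choose(self):
        name = self._ask_user(sorted(self._namespace))
        return self._namespace[name]  # only the file key leaves the selector


class Editor:
    """The app: it is handed a selector factory, not a namespace."""
    def __init__(self, make_selector):
        self._make_selector = make_selector

    def open(self):
        return self._make_selector().choose().read()


namespace = {"notes.txt": FileKey(b"hello"), "todo.txt": FileKey(b"eros")}
# Simulated user input: always pick the first name offered.
factory = lambda: FileSelector(namespace, ask_user=lambda names: names[0])
editor = Editor(factory)
print(editor.open())
```

A web/ftp server would be the opposite case: it would be handed the namespace (or a filtered view of it) directly, because mapping names to objects is its job.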
> A Unix directory is a namespace. A tree of namespaces is a
> hierarchical namespace. A directed graph of namespaces with some
> reference point can also be thought of as a hierarchical namespace.
>
> In EROS, each key constituting an edge of this graph can carry read
> and/or write permission. This way, it is possible to give an object a
> write key to a private namespace which bears read-only keys to shared
> namespaces, preventing that object from communicating with other
> objects that can read the shared namespaces.
Yep, exactly.
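
To make that concrete, here is a minimal Python sketch of a private writable namespace holding a read-only key to a shared one. The Namespace and ReadOnly classes are hypothetical, and the rule that namespaces fetched through a read-only key come back read-only as well is an assumption of the sketch, not a statement about how EROS keys behave.

```python
class Namespace:
    """Hypothetical namespace: a full key can both read and write."""
    def __init__(self):
        self._entries = {}

    def get(self, name):
        return self._entries[name]

    def link(self, name, key):
        self._entries[name] = key


class ReadOnly:
    """A weakened key: forwards get(), refuses link()."""
    def __init__(self, target):
        self._target = target

    def get(self, name):
        value = self._target.get(name)
        # Assumption: namespaces reached through a read-only key are
        # weakened too, so write authority cannot leak out this way.
        return ReadOnly(value) if isinstance(value, Namespace) else value

    def link(self, name, key):
        raise PermissionError("read-only key")


shared = Namespace()
shared.link("motd", "welcome")

# The sandboxed object gets a write key to 'private' only.
private = Namespace()
private.link("shared", ReadOnly(shared))
private.link("scratch", "sandbox data")

print(private.get("shared").get("motd"))
```

The sandboxed object can write as much as it likes into "private", but nothing it writes is visible to readers of "shared".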
> > Having been impressed by the ideas presented by Hans Reiser in this
> > paper:
> > http://devlinux.com/projects/reiserfs/whitepaper.html
>
> I am impressed by them too.
>
> > and also the Microsoft COM/OLE idea of a moniker,
>
> Can you elaborate on monikers?
IIRC, they are a list of things, eg.
[ "http", "www.eros-os.org", "tada.xls", "A1:B5"]
this would ordinarily be stringified like:
http://www.eros-os.org/tada.xls!A1:B5
looking a bit like a URI.
so what happens is:
it loops over each element, calling get_object (I forget if this is the right name).
    obj = root_object
    for element in moniker:
        obj = obj.get_object(element)
so get_object activates an object for each sub item.
some of this might be a bit wrong (long time no COM/Windows!) but you get the idea.
then at the end you do QueryInterface on the object and do whatever you like.
Microsoft do get some things right!
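
The loop above as a small runnable Python sketch. Node and get_object are placeholder names carried over from the description (as said, get_object may not be the real COM method name), and the object tree is invented for the example.

```python
class Node:
    """Hypothetical object that can activate a named sub-object."""
    def __init__(self, name, children=None):
        self.name = name
        self._children = children or {}

    def get_object(self, element):
        return self._children[element]


def resolve(root, moniker):
    """Walk the moniker one element at a time."""
    obj = root
    for element in moniker:
        # Each step starts from the object the previous step returned.
        obj = obj.get_object(element)
    return obj


cells = Node("A1:B5")
sheet = Node("tada.xls", {"A1:B5": cells})
host = Node("www.eros-os.org", {"tada.xls": sheet})
http = Node("http", {"www.eros-os.org": host})
root = Node("root", {"http": http})

target = resolve(root, ["http", "www.eros-os.org", "tada.xls", "A1:B5"])
print(target.name)
```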
> > I propose that the default namespace of eros be flexible enough to
> > subsume hierarchical, keyword, and relational systems.
>
> Has someone yet come up with a usable mapping from namespaces into
> relational databases? It seems like a very hard problem to me.
In that reiser paper, there is a mapping proposed. Whether it is usable remains to be seen ;-) But his case seems reasonable... do you see problems with it?
> I think you should implement your idea and see if it is a good one.
> > Also, it should be possible to stack namespaces, and mount them at any
> > point within each other.
>
> Sure.
>
> > So, the way I think this could be done:
> >
> > An interface "namespace" with these operations:
> > get(query) - return an object matching a query. May be a set.
>
> Here "query" is analogous to "filename" in the Unix world?
Yep.
> > link(query, key) - add an association into the database for the key.
Well, link() just inserts the key as an entry in the namespace.
bind() means that the key is treated as an extension of the namespace.
But I see what you mean... I can't quite get it straight in my head
right now, but it might be possible to do away with one of them, which
would be nice. hmmmmm.....
I don't know how you would do proper stacking in this situation.
> > unlink(query, key) - remove an association for the key.
> > Need both as multiple objects may be linked
> > to the same criteria.
> > bind(query, key) - maybe should be called mount?
> > Key must be to another namespace. any querys which match go into
> > this namespace for further searching. Any "link" operation will try to
> > link from the top down, on any namespaces which have a write key.
> > unbind(query, key)
> >
> > ( maybe there needs to be a way to ask which namespaces are bound in
> > currently,
> > or maybe this comes in the answer to get() ? )
>
> link() and bind() seem like the same thing to me; likewise unlink() and
> unbind(). Is there a semantic difference?
This would be a bit more like monikers in COM I think...
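
One possible reading of the link()/bind() distinction, sketched in Python. This is only an illustration under simplifying assumptions: queries are plain path strings, a bound namespace is matched by path prefix, and the "link from the top down on any namespace with a write key" behaviour from the proposal is not modelled at all.

```python
class Namespace:
    """Hypothetical namespace with the five proposed operations."""
    def __init__(self):
        self._links = {}   # query -> set of keys (plain entries)
        self._binds = {}   # prefix -> list of mounted namespaces

    def link(self, query, key):
        self._links.setdefault(query, set()).add(key)

    def unlink(self, query, key):
        self._links.get(query, set()).discard(key)

    def bind(self, prefix, namespace):
        self._binds.setdefault(prefix, []).append(namespace)

    def unbind(self, prefix, namespace):
        self._binds.get(prefix, []).remove(namespace)

    def get(self, query):
        results = set(self._links.get(query, ()))
        for prefix, mounted in self._binds.items():
            if query.startswith(prefix + "/"):
                rest = query[len(prefix) + 1:]
                for ns in mounted:
                    # Bound namespaces continue the search with the remainder.
                    results |= ns.get(rest)
        return results


linux = Namespace()
linux.link("usr/local/bin", "key-to-bin-dir")

root = Namespace()
root.bind("/linux", linux)        # searches descend into 'linux'
root.link("/saved/linux", linux)  # 'linux' is just an entry here

print(root.get("/linux/usr/local/bin"))        # found via the bind
print(root.get("/saved/linux/usr/local/bin"))  # empty: a link is not searched
```

Under this reading the two could collapse into one operation if get() always descended into any linked key that happens to be a namespace, which is roughly the unification being wondered about above.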
> Have you looked at the Linux VFS interface, the BSD VNode interface,
> the Coda interface, or the Plan 9 9P interface, to see how they do
> these things?
I have looked at Linux VFS and 9P, but not recently ;-) I'll try looking at Coda and BSD VNode, and maybe the others again.
> > and she does these queries:
Yeah, I think returning another namespace is better, come to think of it.
> >
> > /linux/usr/local/bin - returns a set which has the contents of that
> > directory under linux
>
> Not a namespace, I take it --- just a set? You're talking about
> building a flat, nonhierarchical namespace, then?
> > Anyway, once you have a key out of the namespace, you check the alleged
> > key type, and do what you like with it. I think something like
> > QueryInterface could be very useful here.
>
> QueryInterface is the COM interface for determining whether an object
> supports a particular interface, isn't it?
Yep, I think it could be very useful in eros, if most "high level" domains supported this kind of component framework. Of course to be lazy, I think we should call it qi!
Of course this all implies some kind of interface definition and repository (another bit of the namespace!). I can't think of what security implications that might have right now.
I believe there were a few posts about a Capability IDL, but not much
discussion.
I think this would be pretty good, but it would require a lot of thought
to get the mapping right from any kind of interface description to
a set of eros operations (eg how do you pick the numbers? registers?
what goes in the data string?). And how would this change with remote
things? Use GIOP/IIOP/DCOM/SOAP argh!
The other bits of IUnknown (which is the base interface in COM) are just
reference counting. But I thought there was some talk of a garbage
collector following capabilities like references in an ordinary garbage
collector, and destroying unreferenced objects.
This would certainly be nicer than explicit reference counting and avoid
circular fun.
This is also a hard thing to get right in a distributed case (argh!
distributed garbage collection!)
But having a component framework in eros from the base system would certainly be cool, and lead to an easier to grok system. QueryInterface is a truly simple idea, but it is very powerful.
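
A toy Python version of the qi idea, to show how little is needed. This is not the COM signature (real QueryInterface takes an interface ID and reports failure through an error code); Component, provide, qi and the interface names are all invented for the sketch.

```python
class Component:
    """Hypothetical base for 'high level' domains that answer qi()."""
    def __init__(self):
        self._interfaces = {}

    def provide(self, interface_name, facet):
        self._interfaces[interface_name] = facet

    def qi(self, interface_name):
        # Return the facet for this interface, or None if unsupported.
        return self._interfaces.get(interface_name)


class ByteStream:
    """Hypothetical 'stream of bytes' facet."""
    def __init__(self, data):
        self._data = data

    def read(self):
        return self._data


recipe = Component()
recipe.provide("stream", ByteStream(b"flour, milk, cheese"))

stream = recipe.qi("stream")
if stream is not None:
    print(stream.read())

# Asking for an interface the object does not support gives None.
print(recipe.qi("record"))
```

So after pulling a key out of the namespace, the caller asks for the interface it wants and only proceeds if it gets one back.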
> > She has a key, k1 to a file (it is a stream of bytes) about a recipe.
> > She goes link("/recipes/yumyum", k1)
> > Then she goes link("[milk cheese flour]", k1)
> > Then link("[ cooking-time/45mins ]", k1)
> >
> > What happens is that the namespace asks each namespace on / to link it
> > to recipes/yumyum . First it asks the record namespace. The key is not a
> > record, so it
> > doesn't add it. Next it asks the file namespace. It is a file, so it is
> > added.
>
> This is a very interesting idea. Perhaps you should implement it to
> see if it is good.
That's my plan (in my copious spare time); gotta get eros compiled and stuff first ;-)
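
In the meantime, here is a small Python sketch of the "ask each namespace in turn" behaviour from the recipe example. TypedNamespace, Stack, Record and File are hypothetical names, and using isinstance as the "alleged key type" check is an assumption made to keep the sketch short.

```python
class TypedNamespace:
    """Hypothetical namespace that accepts only keys of one type."""
    def __init__(self, accepted_type):
        self.accepted_type = accepted_type
        self.entries = {}

    def link(self, query, key):
        if not isinstance(key, self.accepted_type):
            return False   # not ours; let the next namespace try
        self.entries.setdefault(query, []).append(key)
        return True


class Stack:
    """Namespaces bound at the same point, asked from the top down."""
    def __init__(self, *namespaces):
        self._namespaces = namespaces

    def link(self, query, key):
        for ns in self._namespaces:
            if ns.link(query, key):
                return ns
        raise TypeError("no bound namespace accepts this key")


class Record(dict):
    """Stand-in for a key to a record."""


class File(bytes):
    """Stand-in for a key to a stream of bytes."""


records = TypedNamespace(Record)
files = TypedNamespace(File)
root = Stack(records, files)

k1 = File(b"yumyum recipe")
accepted_by = root.link("recipes/yumyum", k1)
print(accepted_by is files)
```

The record namespace is asked first and declines because k1 is not a record; the file namespace then accepts it.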
> > In this way, efficient storage of things as either records or byte
> > streams can be achieved. And legacy systems can be bolted in too.
>
> The usual way to achieve this is for files to live in some places and
> records to live in other places in the namespace. What kinds of
> advantages do you get from mounting them all in the same place?
It means you don't have to think about where you are putting something.
That's the whole problem with hierarchical systems.
In some cases, you won't even bind something to a hierarchical name at
all, only to keywords or to a relation. Then it will only show up in
queries on that keyword or relation. Eg a quick note about something: you
just write it on your desktop, and the desktop manager quietly indexes it...
The way I'm thinking is this -
The namespace is a way of getting keys from a query. A lot of keys will be keys to either files (streams of bytes) or records (unordered lists of ordered pairs; see the reiser paper). These can both be stored efficiently, and indeed there are already efficient implementations available. However, they are hard to merge, and this is a way of making the merge less painful.
Of course, for a lot of things (stuff done in an RDBMS), it will be best to make a special table object which is very efficient given a set number of keys, and bind this into the namespace at a hierarchical point, eg /addresses for an address book. Then only records fitting the table will be able to be inserted.
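
A sketch of what such a table object might look like, in Python. The Table class, its column names, and the sample address are all made up for illustration; the dict called mounts just stands in for binding the table at /addresses.

```python
class Table:
    """Hypothetical table object: only records with exactly these fields fit."""
    def __init__(self, *columns):
        self._columns = frozenset(columns)
        self._rows = []

    def link(self, record):
        if set(record) != self._columns:
            raise ValueError("record does not fit the table")
        self._rows.append(dict(record))

    def get(self, **criteria):
        # Return every row whose fields match all the given criteria.
        return [row for row in self._rows
                if all(row.get(k) == v for k, v in criteria.items())]


addresses = Table("name", "email")
mounts = {"/addresses": addresses}   # stand-in for bind("/addresses", ...)

mounts["/addresses"].link({"name": "Rob", "email": "rob@example.org"})
print(mounts["/addresses"].get(name="Rob"))

try:
    mounts["/addresses"].link({"name": "Note", "body": "not an address"})
except ValueError as err:
    print("rejected:", err)
```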
> > from a string. The advantage of this is that different systems could be
> > supported - eg a windows name parser, a unix name parser, even an SQL
> > parser. eek. Also, elements of the tuples could be other keys. Meaning
> > that parts of names could be arbitrary objects, not just strings. Eg in
> > a record, another record type.
>
> At this point I think you have gotten into OODBMSs :)
That's the point ;-)
Eliminate & subsume all other namespaces!
> > Is there enough difference between a set and a namespace to warrant a
> > seperate interface? It might be best to just return another namespace,
> > then more querying could be done on that.
>
> That's a good question. I'd err in the direction of unification at
> first, splitting the objects if it turned out to be necessary.
Yeah, that's what I'm thinking too...
> > How do you optimise for disk and memory usage bearing checkpointing in
Should the default name space do this? I would say yes
> > mind? Is this (eg the indexes) something for which explicit
> > journaling/IO may make sense?
>
> Explicit journaling/IO is useful when you need durability guarantees:
> that if the system crashes, the data won't be lost. OODBMSs and RDBMSs
> frequently do this.
What I am thinking about is this:
Disks and memory have different properties, meaning different data
structures are appropriate for speed. Given that you don't know whether
you are jumping to a page that must be faulted in off disk, how do you
work out what is the best data structure to use?
Or do you just access it as if it is on disk ( eg cache stuff elsewhere,
use disk friendly data structures) and on average it *will* be on disk?
Other than this, you use some kind of block device interface directly.
But this is probably a bit evil.
> > Does some kind of transaction model need to be supported? If so, in what
> > form? It might be best to have another interface for this on record
> > oriented systems. So you do all your heavy database work with
> > transactions via that interface, but it is still possible to add in /
> > view stuff via the namespace interface, which internally would use a
> > transaction for each change.
>
> Transactions are necessary for some things, not worth the cost for
> others, IMHO. I think that whether you need transactions is more or
> less orthogonal to whether your data is relational, object-oriented,
> textual, or hierarchical. Are you familiar with two-phase commit?
Yep, I'm wondering whether it should be part of the default interface.
As in you must ask for a transaction, do all other operations on that,
then commit the transaction.
It seems like it might be a bit scary for someone who is used to just
doing unix style file stuff.
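
One way to keep it un-scary, sketched in Python: explicit transactions for people who want them, plus a plain link() that quietly wraps a single change in its own transaction. All names here (Namespace, Transaction, begin, commit, abort) are hypothetical, and this is a single-process toy, not two-phase commit.

```python
class Namespace:
    """Hypothetical namespace offering both styles of update."""
    def __init__(self):
        self._entries = {}

    def get(self, query):
        return self._entries.get(query)

    def begin(self):
        return Transaction(self)

    def link(self, query, key):
        # The "unix style" path: one implicit transaction per change.
        txn = self.begin()
        txn.link(query, key)
        txn.commit()


class Transaction:
    """Collects changes and applies them all at commit."""
    def __init__(self, namespace):
        self._namespace = namespace
        self._pending = {}

    def link(self, query, key):
        self._pending[query] = key

    def commit(self):
        self._namespace._entries.update(self._pending)
        self._pending = {}

    def abort(self):
        self._pending = {}


ns = Namespace()

txn = ns.begin()
txn.link("/recipes/yumyum", "k1")
print(ns.get("/recipes/yumyum"))   # not visible before commit
txn.commit()
print(ns.get("/recipes/yumyum"))   # visible after commit

ns.link("/notes/quick", "k2")      # no transaction in sight for the caller
print(ns.get("/notes/quick"))
```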
> > Hope some of this made sense,
>
> Much of it sounds interesting. If you build it and it turns out to be
> good, I suspect it will be adopted.
;-)
Rob