Default database / namespace

Jonathan S. Shapiro shap@eros-os.org
Tue, 4 Jul 2000 06:58:31 -0400


> > > And most native eros programs would never need a
> > > namespace key.
> >
> > I'm not sure about that last sentence --- let's wait and see --- but I
> > agree with the rest.

The last sentence is correct. In fact, I'd suggest that the difference
between a shell and other applications is that the shell has access to the
user's namespace.

Consider an editor as an example. The editor doesn't need to know about file
systems. It only needs to know about files. When a new file is to be edited,
the shell can spawn a new editor.

Okay. This may be a weak example, because we have come to use editors as
shells, but the point is that most programs really only need access to the
files they operate on.
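
As a rough illustration of the split described above, here is a minimal sketch in Python. The names `File`, `Editor`, and `Shell` are hypothetical and are not EROS interfaces, and Python object references only stand in for capabilities (the language does not enforce the isolation). The point is simply that the shell resolves the name and the editor is handed nothing but the file.

```python
class File:
    """A toy file object; holding a reference to it is the only way to use it."""

    def __init__(self, text=""):
        self.text = text

    def read(self):
        return self.text

    def write(self, text):
        self.text = text


class Editor:
    """Holds exactly one file reference and no namespace (hypothetical)."""

    def __init__(self, file):
        self._file = file

    def append_line(self, line):
        self._file.write(self._file.read() + line + "\n")


class Shell:
    """The only component that holds the user's namespace (name -> file)."""

    def __init__(self, namespace):
        self._namespace = namespace

    def edit(self, name):
        # Resolve the name here, then hand the new editor only the file.
        return Editor(self._namespace[name])


namespace = {"notes.txt": File("first line\n"), "secret.txt": File("keep out\n")}
shell = Shell(namespace)
editor = shell.edit("notes.txt")
editor.append_line("second line")
```

The editor can change `notes.txt` but has no path to `secret.txt`, because it was never given anything that names it.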

A middle ground is to give access to the file system by way of a "standard
file" box, which ensures that the user can see all of the accesses.
> > > /linux/usr/local/bin - returns a set which has the contents of that
> > > directory under linux
> >
> > Not a namespace, I take it --- just a set?  You're talking about
> > building a flat, nonhierarchical namespace, then?
>
> Yeah, I think returning another namespace is better come to think of it.

This has to be done with great care. It is desirable in some cases for the
user to be able to traverse a namespace without being able to modify or
examine that space. This is analogous to being able to traverse the nodes of
an address space without being able to read/write the nodes themselves.

Therefore, in many cases you will wish to return a highly restricted
capability to the namespace that only allows traversal.
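
One way to picture a traversal-only capability is as a restricted facet over the namespace object. The sketch below is an assumption-laden toy: `Namespace` and `TraverseOnly` are hypothetical names, not EROS interfaces, and Python cannot truly hide the wrapped object the way a capability system can. It shows a facet that permits lookup by a known name but offers no listing and no binding, and that hands back sub-namespaces wrapped in the same restriction.

```python
class Namespace:
    """Toy hierarchical namespace: names map to sub-namespaces or leaf objects."""

    def __init__(self):
        self._entries = {}

    def bind(self, name, value):
        self._entries[name] = value

    def lookup(self, name):
        return self._entries[name]

    def list(self):
        return sorted(self._entries)


class TraverseOnly:
    """Restricted facet: lookup by known name only; no listing, no binding."""

    def __init__(self, namespace):
        self.__ns = namespace

    def lookup(self, name):
        value = self.__ns.lookup(name)
        # Sub-namespaces come back equally restricted, so the
        # restriction carries through the whole traversal.
        if isinstance(value, Namespace):
            return TraverseOnly(value)
        return value


root = Namespace()
usr = Namespace()
root.bind("usr", usr)
usr.bind("motd", "hello")

facet = TraverseOnly(root)
print(facet.lookup("usr").lookup("motd"))
```

A holder of `facet` can walk to `usr` and on to `motd` if it already knows those names, but it cannot enumerate entries or add new ones.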

> The other bits of IUnknown (which is the base interface in COM) are just
> reference counting. But I thought there was some talk of a garbage
> collector following capabilities like references in an ordinary garbage
> collector, and destroying unreferenced objects.

EROS does not have a garbage collector. The security implications of an
object that can examine all other objects are not pretty. Also, disk GC is
not as nice as memory GC for reasons of latency.

> > > In this way, efficient storage of things as either records or byte
> > > streams can be achieved. And legacy systems can be bolted in too.
> >
> > The usual way to achieve this is for files to live in some places and
> > records to live in other places in the namespace.  What kinds of
> > advantages do you get from mounting them all in the same place?

Also, you get uniformity of mechanisms for security. Many UNIX security
holes have arisen because there is a semantic gap in the security provisions
of one namespace that can be leveraged in another by compromising a program
that straddles them.

> > >       How do you optimise for disk and memory usage bearing
> > > checkpointing in mind? Is this (eg the indexes) something for which
> > > explicit journaling/IO may make sense?
> >
> > Explicit journaling/IO is useful when you need durability guarantees:
> > that if the system crashes, the data won't be lost.  OODBMSs and RDBMSs
> > frequently do this.
>
> Should the default name space do this? I would say yes...

This is just what you *don't* want to do. The last thing on earth you want
is to force writes in the namespace that point to an object that won't exist
when the system recovers. You don't want all of the state of the system to
come back. You want all of the pieces of the system to come back consistent
with respect to each other. Also, note that there exists no means to journal
objects that contain capabilities, because this violates the causality of
the system security.
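
The consistency argument can be made concrete with a toy model. The sketch below is not how EROS checkpointing is implemented; the `live` dictionary, the `dangling` helper, and the use of `copy.deepcopy` as a stand-in for an atomic whole-system checkpoint are all assumptions for illustration. It compares recovering from the checkpoint alone with recovering after the namespace has separately forced one of its own writes to disk.

```python
import copy


def dangling(state):
    """Return the names whose target object does not exist in this state."""
    return [name for name, oid in state["namespace"].items()
            if oid not in state["objects"]]


# Toy system state: objects keyed by id, and a namespace mapping names to ids.
live = {"objects": {1: "a"}, "namespace": {"a.txt": 1}}

# Stand-in for an atomic whole-system checkpoint.
checkpoint = copy.deepcopy(live)

# After the checkpoint: create object 2 and bind a name to it.
live["objects"][2] = "b"
live["namespace"]["b.txt"] = 2

# Case 1: crash, then recover from the checkpoint alone.
recovered = copy.deepcopy(checkpoint)

# Case 2: the namespace forced its own write before the crash,
# but the new object was never part of any checkpoint.
recovered_forced = copy.deepcopy(checkpoint)
recovered_forced["namespace"]["b.txt"] = 2

print(dangling(recovered))
print(dangling(recovered_forced))
```

In the first case the recent work is lost but every name still resolves. In the second case the name `b.txt` survives the crash and points at an object that no longer exists.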

One of the real beauties of transparent persistence is that the name space
doesn't need to do anything special to get things written down in the right
order. Have a look at:

    http://www.eros-os.org/eros-src/domain/directory/

Note that it does no explicit storage management at all.

> What I am thinking about is this:
> Disks and memory have different properties, meaning different data
> structures are appropriate for speed.

Sometimes, and sometimes not. EROS gets a large performance gain out of
using exactly the same data structure for memory objects on disk as in
memory.

> Given that you don't know whether
> you are jumping to a page that must be faulted in off disk, how do you
> work out what is the best data structure to use?
> Or do you just access it as if it is on disk ( eg cache stuff elsewhere,
> use disk friendly data structures) and on average it *will* be on disk?

EROS is a single level store system. It is none of the programmer's business
whether a given object is presently in memory or on the disk. For the most
part, you write your code as though it were in memory. For the very very few
applications where it really matters, you use working sets to avoid having
important high-frequency content paged out at the wrong time. Note that
working sets are not yet implemented.

> Other than this, you use some kind of block device interface directly.
> But this is probably a bit evil.

This would cause you to lose all of the performance, simplicity, and
security that EROS provides.

> > > Hope some of this made sense,
> >
> > Much of it sounds interesting.  If you build it and it turns out to be
> > good, I suspect it will be adopted.

Much of this is thought provoking, which is good. I am concerned, however,
by a pattern I see in the proposal. Instead of asking how to use the
existing EROS mechanisms effectively, much of the name space discussion
explores how to subvert the system, and many of the proposed subversions
create serious problems. I'd encourage you to try living within the
persistent world before working around it.