[e-lang] Lazy Interning Fantasy (was: TraversalKey bug fixed; Equalizer questions)

Mark S. Miller markm at cs.jhu.edu
Sat Apr 14 15:57:32 CDT 2007


Kevin Reid wrote:
> 4. It has occurred to me that the equalizer could, having determined  
> that two references are the same but not pointer-equal, replace one  
> of them with the other, thus saving memory and allowing future  
> comparisons to be cheap. Can you think of a good way for the  
> equalizer to communicate this to the objects it uncalls?

Not as an incremental modification to the E-on-Java implementation. But the
Weizmann implementation of Flat Concurrent Prolog had a lazy interning idea
that did something similar: Since the implementation already knew how to
transparently follow forwarding pointers, in order to support incremental
migrating collectors and resolved logic variables, whenever two values were
successfully unified, one was smashed in place with a forwarding pointer to
the other. This avoided the creation-time costs of hash-consing but still
obtains most of the benefit: repeated unification attempts of the same values
could often quickly succeed (after following forwarding pointers) by an EQ check.

The migrating garbage collector would shorten forwarding chains, cleaning up
after 1) itself, 2) logic variable resolution, and 3) this lazy interning
scheme. I don't know that the benefits of this scheme were ever actually
measured, but it certainly had a nice intuitive appeal.

For a from-scratch ELib implementation, this suggests the following fantasy:

Each allocation record representing an E value in the E heap is in one of
three states:

* Normal - the default initial state.
* Forwarding - as in FCP, in which case it points at the record to use in its
stead. Resolved promises become Forwarding, and these are shortened and
cleaned up by the garbage collector.
* Settled & Interned - as with a hash consing scheme, but using a weak
interning table.

Given two settled objects x and y, x == y (x is the same as y as far as E is
concerned) iff intern(x) == intern(y) (the interned form of x has the same
memory address as the interned form of y). Since only settled objects can
serve as keys in E's primitive hash tables, with this technique, E's primitive
hash tables could individually just be pointer equality tables (what Java
calls identity hash tables or something). They would all share one big weak
interning table as the only bridge from sameness-equality to pointer-equality.

In an E system where multiple vats share an address space and pass 
DeepPassByCopy objects between vats by pointer sharing, it is interesting to 
ponder whether they should all share one weak lazy interning table, or whether 
they should each have their own. If they share one, it is also interesting to 
wonder whether DeepPassByCopy objects should be always be interned before 
crossing a vat boundary.

-- 
Text by me above is hereby placed in the public domain

     Cheers,
     --MarkM




More information about the e-lang mailing list