[cap-talk] Challenge in VM design

Jonathan S. Shapiro shap at eros-os.com
Mon Aug 4 11:32:40 CDT 2008


On Mon, 2008-08-04 at 12:16 -0400, Sandro Magi wrote:
> Jonathan S. Shapiro wrote:
> > Both of these facts impose a high indirection burden. The very best VMs
> > for these languages invest an unreasonable amount of effort trying to
> > remove this indirection. The philosophy in BitC is a bit different:
> > don't put that indirection in in the first place by injecting artificial
> > constraints about type representations!
> 
> Tagless/tag-free GC attempts to do this. Henderson's technique is one 
> possible implementation that I know you are aware of, and TIL also uses 
> a variant.

Yes. The Henderson technique is a great method when you have to compile
to C. The overhead is too high for a native implementation, where (much)
better techniques are available.

> Tagless GC places no assumptions on the representations, but it imposes 
> a type reconstruction burden on the runtime (or type-passing, ie. 
> dictionaries, for all polymorphic function arguments).

So far, I'm not that impressed with tagless GC. Perhaps I have not read
the right things yet.

> Do you have a hint as to how GC can be done absent whole program 
> compilation, type-passing, or runtime type reconstruction?

BitC rejects type-passing (or dictionary passing) as a run-time
technique for performance and calling convention compatibility reasons.
Given this, it doesn't seem like a good option here.

I don't know enough about runtime type reconstruction to have a sensible
opinion there.

Whole-program compilation is where I currently (and ignorantly) assume
that we should focus. When people speak about whole-program compilation,
they often assume that this means "whole program at static link time".
This is not actually required. It is sufficient if it is "whole program
at dynamic link time." This is true because:

 1. You do not have any obligation to collect objects of a given
    type until it is possible to create an object of that type.
    Until the constructor code gets loaded (and its uses propagated),
    you don't have a problem. After the constructor is loaded, you
    have all the information that you need.

 2. All type instantiations are ultimately rooted in concrete types,
    so either the propagation of instantiation succeeds or the program
    cannot be typed at all.

> "Slim Binaries" [1] are probably better than a bytecode for a VM. They 
> are form of compressed AST; in your case, the AST of a linking language.

I was using the term "bytecode" generically. I need to re-read this
dissertation (I had forgotten about it), but I was thinking by default
more or less along the lines of something like an augmented CLR.

> As for the current "function templates", I imagine you're doing 
> something along the lines of [2]?

>From a brief re-read, I would say "no". I am thinking about something
much lower level and less sophisticated. The approach they adopted is
interesting, but I think a much simpler approach will work here.



More information about the cap-talk mailing list