[cap-talk] Dan Bernstein's qmail security lessons paper

Jonathan S. Shapiro shap at eros-os.com
Mon Dec 17 11:40:13 EST 2007


On Mon, 2007-12-17 at 11:32 -0500, Sandro Magi wrote:
> Jonathan S. Shapiro wrote:
> > First, there is the problem of garbage retention. This results from dead
> > pointer slots on the stack that have not been nullified. Solutions have
> > long been known, but are not widely implemented.
> 
> I don't see how this is a problem for server-side systems. As you say,
> request/response cycles are short, which means the stack is filled then
> emptied in a single short request/response.

This is exactly how the problem arises. Typically, the request-response
loop has a local variable somewhere that receives the "answer" from
all of the processing. That variable must still be in scope at the
response. In addition, there are many other local variables on the
stack that are out of scope or no longer reachable from the control
flow. Unfortunately, those other variables usually still hold non-null
values, and the objects they point to therefore cannot be GC'd.
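
As an illustration of the retention described above, here is a minimal
sketch in Python. It assumes CPython, whose reference counting reclaims
an object as soon as its last reference is dropped, so it models the
effect of a dead slot rather than a tracing collector scanning a native
stack. The names `Payload` and `handle_request` are hypothetical.

```python
import weakref


class Payload:
    """Hypothetical stand-in for a large per-request data structure."""

    def __init__(self, size):
        self.data = bytearray(size)


def handle_request(clear_dead_slot):
    """Return (answer, scratch_still_alive) for one simulated request."""
    scratch = Payload(1_000_000)   # intermediate state, needed only briefly
    probe = weakref.ref(scratch)   # observes the object without retaining it
    answer = len(scratch.data)     # the only value the response needs

    # From here on `scratch` is dead: nothing below reads it. The slot still
    # holds a non-null reference unless we clear it explicitly.
    if clear_dead_slot:
        scratch = None

    still_alive = probe() is not None
    return answer, still_alive


retained = handle_request(clear_dead_slot=False)
released = handle_request(clear_dead_slot=True)
```

Under the CPython assumption, the payload in the first call stays alive
until the function returns, while in the second call it is reclaimed as
soon as the slot is cleared.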

The results are very application specific, but the degree of retention
can be very surprising.

Andrew Appel wrote a nice paper on all of this, but I now forget the
title.

> > Second, there is the pragmatic problem that most of the popular
> > languages of this sort have to interface with C. In many cases (though
> > perhaps not in Java) this forces them to conservative collectors, which
> > are horrible.
> 
> There are relatively few languages that use conservative GC [1]. Mono is
> moving to an accurate GC soon.

Thank God! This is a lot of what has kept us away from Mono.

> While I'm
> not a fan of conservative GC, Hans Boehm has studied its overheads
> and found them to be minimal.

I have *enormous* respect for Hans, but he has made a number of
inadequately qualified statements in this regard. The cold truth is that
the overheads are extremely application dependent, and no general
conclusion of this sort can be drawn accurately. Our experience with
conservative GC in OpenCM was DISMAL. In fairness, it may also have been
atypical. The point is that nobody (including Hans) can say with
confidence what the cause of the problem was or whether it was typical
or not.

> > The only substantive execution-time overheads of
> > these languages have to do with things like "true" integers, and the
> > fact that many implementations therefore cannot exploit simplified
> > register arithmetic. A lot of that can be made to go away by a
> > sufficiently aggressive compiler, but I do not know if such compiler
> > technology is widely deployed. I doubt it.
> 
> I'm not sure I follow. Just about every language I know of exposes
> native integers by default without transparent conversions to big integers.

Counter-examples: ML, Lisp, Haskell
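
The cost of "true" integers can be sketched as follows. Python is used
here purely for illustration (it is not one of the languages named
above); its built-in int is arbitrary precision. The helper
`wrap_int64` is hypothetical and simulates what a single 64-bit
register add would produce.

```python
INT64_MAX = 2**63 - 1


def wrap_int64(n):
    """Hypothetical helper: reduce n to a two's-complement 64-bit value."""
    n &= (1 << 64) - 1                              # keep the low 64 bits
    return n - (1 << 64) if n >= (1 << 63) else n   # reinterpret as signed


true_sum = INT64_MAX + 1                  # "true" integer: grows past 64 bits
register_sum = wrap_int64(INT64_MAX + 1)  # machine add: silently wraps around
```

A runtime that promises the first result has to check for overflow, or
prove it cannot occur, on each operation. That check is the overhead a
sufficiently aggressive compiler would try to remove.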

> I would say that the substantial execution overheads result from boxed
> representations of simple types like integers and floats, which must
> then be traced by the GC. OCaml performs substantial analysis to unbox
> such small values.

Yes. Unfortunately this unboxing is not supported by .NET, which is
probably the most widely used runtime implementation of this sort. By
the time you get to the JIT layer you've lost a lot of information here.
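
The boxed-versus-unboxed distinction discussed above can be sketched in
Python, assuming CPython's object model: a list stores references to
separately allocated float objects, while the standard-library `array`
module stores raw machine doubles in one buffer. The variable names are
illustrative only, and this says nothing about how OCaml or .NET
implement unboxing.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]

boxed = list(values)            # 1000 references to individual float objects
unboxed = array("d", values)    # one contiguous buffer of raw C doubles

bytes_per_boxed_float = sys.getsizeof(values[0])   # object header + payload
bytes_per_unboxed_float = unboxed.itemsize         # payload only
```

In the boxed form every element is a separate heap object that the
memory manager has to track; in the unboxed form there is a single
buffer with no interior pointers to follow.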

shap


