[cap-talk] Dan Bernstein's qmail security lessons paper
Sandro Magi
smagi at higherlogics.com
Mon Dec 17 11:32:35 EST 2007
Jonathan S. Shapiro wrote:
> First, there is the problem of garbage retention. This results from dead
> pointer slots on the stack that have not been nullified. Solutions have
> long been known, but are not widely implemented.
I don't see how this is a problem for server-side systems. As you say,
request/response cycles are short, which means the stack is filled then
emptied in a single short request/response. Sometimes a continuation is
built and saved for a subsequent request/response, but the continuation
saves only the relevant state; Waterken behaves like this, but the
continuations are explicit.
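To make the explicit-continuation point concrete, here is a minimal Java sketch. It is not Waterken's actual API: the class, the `pending` map and both method names are hypothetical, made up only for illustration. The first request allocates a large temporary, but the saved continuation captures only the one value the follow-up request needs, so everything else is garbage as soon as the handler returns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; none of these names come from Waterken.
class ExplicitContinuation {
    // Saved continuations, keyed by a session token supplied by the caller.
    static final Map<String, Function<String, String>> pending = new HashMap<>();

    // First request/response cycle: do the work, save a small continuation.
    static String firstRequest(String token, String userName) {
        byte[] scratch = new byte[1 << 20]; // large temporary used only in this call
        scratch[0] = 1;
        // The lambda captures userName and nothing else; scratch is not
        // referenced by it, so it is unreachable once this method returns.
        pending.put(token, answer -> userName + " answered " + answer);
        return "question for " + userName;
    }

    // Second request/response cycle: resume from the saved continuation.
    static String secondRequest(String token, String answer) {
        Function<String, String> k = pending.remove(token);
        if (k == null) {
            throw new IllegalStateException("no continuation for " + token);
        }
        return k.apply(answer);
    }

    public static void main(String[] args) {
        System.out.println(firstRequest("t1", "alice"));
        System.out.println(secondRequest("t1", "42"));
    }
}
```

The stack from the first call is gone by the time the second request arrives, so dead stack slots have nothing to retain.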
> Second, there is the pragmatic problem that most of the popular
> languages of this sort have to interface with C. In many cases (though
> perhaps not in Java) this forces them to conservative collectors, which
> are horrible.
There are relatively few languages that use conservative GC [1]. Mono is
moving to an accurate GC soon. Java, MS .NET, OCaml, SML, Haskell, Ruby,
Python, Perl, various Lisps and Schemes, all use accurate GC. While I'm
not a fan of conservative GC, Hans Boehm has studied its overheads and
found them to be minimal.
> Finally, there is the problem that when something *does* go wrong with
> GC, there are no tools to help you find the culprit (and it isn't clear
> how one would design such tools).
Apparently Haskell has been developing a fairly sophisticated heap
profiler. They're in the worst spot resource leak-wise given Haskell's
laziness. I agree that this can be a problem in theory, but I've never
run into it using call-by-value languages.
> Note, however, that all of these costs are associated with GC rather
> than safety per se. The only substantive execution-time overheads of
> these languages have to do with things like "true" integers, and the
> fact that many implementations therefore cannot exploit simplified
> register arithmetic. A lot of that can be made to go away by a
> sufficiently aggressive compiler, but I do not know if such compiler
> technology is widely deployed. I doubt it.
I'm not sure I follow. Just about every language I know of exposes
native integers by default without transparent conversions to big integers.
I would say that the substantial execution overheads result from boxed
representations of simple types like integers and floats, which must
then be traced by the GC. OCaml performs substantial analysis to unbox
such small values.
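As a rough illustration of both points in Java (a sketch only; the class and method names are made up for this example, and `Math.addExact` requires Java 8 or later): the default `int` is a fixed-width machine integer that wraps on overflow, arbitrary precision has to be requested explicitly through `BigInteger`, and the same values can be stored either as boxed `Integer` objects or as a flat `int[]`.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

class IntegerRepresentationDemo {
    // Plain int addition wraps silently on overflow (two's complement).
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Checked addition: throws ArithmeticException instead of wrapping.
    static int checkedAdd(int a, int b) {
        return Math.addExact(a, b);
    }

    // Arbitrary precision is opt-in, never a transparent promotion.
    static BigInteger bigAdd(int a, int b) {
        return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
    }

    // Boxed: each element is a reference to an Integer object that the
    // collector has to trace (small values may come from a shared cache).
    static long sumBoxed(List<Integer> xs) {
        long total = 0;
        for (Integer x : xs) {
            total += x; // auto-unboxing on every iteration
        }
        return total;
    }

    // Unboxed: one contiguous array of raw ints, nothing inside to trace.
    static long sumUnboxed(int[] xs) {
        long total = 0;
        for (int x : xs) {
            total += x;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1)); // -2147483648
        System.out.println(bigAdd(Integer.MAX_VALUE, 1));      // 2147483648

        int n = 1000;
        List<Integer> boxed = new ArrayList<>();
        int[] flat = new int[n];
        for (int i = 0; i < n; i++) {
            boxed.add(i);
            flat[i] = i;
        }
        System.out.println(sumBoxed(boxed) == sumUnboxed(flat)); // true
    }
}
```

Both sums give the same answer; the difference is only in how many heap objects exist and how much the collector has to walk.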
Escape analysis has also been shown to substantially reduce memory
overheads and execution times. Because .NET retains types at runtime, it
performs such unboxing during JITC for its ValueTypes. Many languages
attempt the same optimizations. Region-based analysis is a
generalization of escape analysis, so region-based systems should reduce
this memory overhead even more aggressively.
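A small Java sketch of what escape analysis looks for (the class and method names are invented for this example, and whether a given JIT actually applies the optimization is not guaranteed): in the first method the temporary object never leaves the method, so a compiler is free to keep its fields in registers or on the stack; in the second the object is returned to the caller and so escapes.

```java
class EscapeDemo {
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    // The Point allocated here is never stored or returned, so it does not
    // escape. An escape-analysing compiler may replace it with two locals.
    static double distanceSquared(double x1, double y1, double x2, double y2) {
        Point d = new Point(x2 - x1, y2 - y1);
        return d.x * d.x + d.y * d.y;
    }

    // The Point allocated here is returned, so it escapes this method and
    // in general has to be a real heap object (unless the call is inlined).
    static Point midpoint(Point a, Point b) {
        return new Point((a.x + b.x) / 2, (a.y + b.y) / 2);
    }

    public static void main(String[] args) {
        System.out.println(distanceSquared(0, 0, 3, 4)); // 25.0
        Point m = midpoint(new Point(0, 0), new Point(2, 4));
        System.out.println(m.x + ", " + m.y); // 1.0, 2.0
    }
}
```

The source code is the same either way; the saving comes entirely from the compiler proving that the temporary cannot be observed outside the method.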
> Hotspot is largely irrelevant for server applications. Technologies like
> hotspot have a higher startup cost to reach a more efficient stable
> state. The problem is that web transactions are short, so you never
> really get the payoff from this type of technology.
The number of execution paths in a given program is generally fixed, so
Hotspot would optimize these fairly well.
Sandro
[1] http://www.hpl.hp.com/personal/Hans_Boehm/gc/#users