[cap-talk] small notes re: waterken
dmbarbour at gmail.com
Fri Mar 4 19:52:34 PST 2011
On Fri, Mar 4, 2011 at 5:46 PM, Jonathan S. Shapiro <shap at eros-os.org> wrote:
>> Even assuming that order of destruction isn't a problem for you,
>> the highest performance real-time or concurrent GCs allow a
>> certain epsilon of 'floating' garbage that will never be collected.
> Can you source that statement? Such a collector would, by its nature, be
> imprecise, and there are just a hoard of issues with an imprecise collector
> of any sort.
The term 'floating garbage' is standard in the garbage-collection literature.
The statement that the highest performance collectors will have
'floating garbage' is based on my own research, though I should have
been more precise: I'm talking about performance in large-scale,
concurrent, real-time garbage-collection where most of the 'image' is
on disk. It is infeasible to run a sweep across all of the data in such
systems. You preferentially favor collecting 'regions' that are
mostly in-cache or in-memory. But, because you collect only part of a
system at a time, you will have floating garbage. Good design can keep
floating garbage down to an epsilon (e.g. you guarantee a maximum of
10% floating garbage).
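A small sketch of the idea (my own illustration of the partial-collection scheme described above, not Waterken's actual collector): the heap is split into regions, each cycle collects only the "hot" (in-memory) regions precisely, and garbage in cold regions floats until its region is eventually chosen.

```python
# Illustrative region-based partial collection. Collecting only hot
# regions leaves garbage in cold regions "floating", but the floating
# fraction stays small as long as cold regions are mostly live.

class Region:
    def __init__(self, live, garbage, hot):
        self.live = live        # bytes of reachable data
        self.garbage = garbage  # bytes of unreachable data
        self.hot = hot          # in cache/memory, cheap to collect

def collect_hot(regions):
    """Collect only hot regions; garbage elsewhere floats."""
    for r in regions:
        if r.hot:
            r.garbage = 0       # precise within the region

def floating_fraction(regions):
    total = sum(r.live + r.garbage for r in regions)
    floating = sum(r.garbage for r in regions)
    return floating / total if total else 0.0

heap = [Region(live=80, garbage=20, hot=True),
        Region(live=90, garbage=10, hot=True),
        Region(live=95, garbage=5, hot=False)]  # cold: left alone

collect_hot(heap)
print(floating_fraction(heap))  # only the cold region's 5 bytes float
```

After the cycle, the floating fraction here is about 2%, comfortably under a 10% bound; a real collector would preferentially schedule cold regions whose estimated garbage threatens the bound.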
I am curious what 'hoard of issues' you anticipate. My understanding
is that the epsilon bound takes care of most issues. E.g. you can
still support real-time and bounded-space programming. The only major
problems with it I know of are: (a) you cannot depend on gc-triggered
finalizers, and (b) you might be concerned about passwords and such
hanging around too long or making it into long-term memory.
In the gc/allocation system I was designing, (a) was handled by other
design elements, and (b) was handled by guarantees that small and
short-lived things are properly collected before ever making it to
long-term memory.
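To see why bounded-space programming survives an epsilon of floating garbage, here is the arithmetic (my worked example; the 10% figure is the illustrative bound from above): if floating garbage is at most a fraction eps of the heap, then a program whose live set never exceeds L still has a hard heap bound of L / (1 - eps).

```python
# With floating garbage capped at eps of total heap, the heap holds
# live data plus at most eps * heap of garbage:
#   heap <= L + eps * heap   =>   heap <= L / (1 - eps)

def heap_bound(live_max, eps):
    """Worst-case heap size given a live-set bound and garbage cap."""
    return live_max / (1.0 - eps)

print(heap_bound(90.0, 0.10))  # 100.0: still a hard space bound
```

So the epsilon weakens the constant factor but not the bounded-space guarantee itself.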
> I'm not aware of any true-parallel concurrent and incremental collector, for
> example, that allows garbage to survive for more than two generations.
It is true that whole-memory GCs will be precise. Perhaps the
region-based variations never came up in your research?
I.e. I don't even use whole-system 'generations'. I have a couple
generations per processor, but a processor doesn't even bother doing
a larger collection if its small collections (each processor has a
rotating nursery and 1st gen pages it 'owns') are getting the job
done. (This fits the epsilon requirement since no new floating garbage
is being introduced, but it also means that no 'old' garbage is being
collected.)
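The per-processor scheme above can be sketched roughly like this (a hypothetical model of "rotating nursery" ownership, with names of my own invention): each processor owns a small ring of nursery pages plus its gen-1 pages, and a minor collection touches exactly one owned page before rotating it back for reuse, so no cycle ever scans another processor's pages.

```python
# Sketch: each processor collects only pages it owns, visiting its
# rotating nursery round-robin. Survivors (modeled here just by page
# id) are promoted into the processor's own gen-1 set.

from collections import deque

class Processor:
    def __init__(self, nursery_pages):
        self.nursery = deque(range(nursery_pages))  # rotating page ids
        self.gen1 = set()                           # owned gen-1 pages

    def minor_collect(self):
        """Collect one nursery page; rotate it back for reuse."""
        page = self.nursery.popleft()
        self.nursery.append(page)   # ring rotates, bounded work
        self.gen1.add(page)         # stand-in for promoted survivors
        return page

p = Processor(nursery_pages=4)
order = [p.minor_collect() for _ in range(6)]
print(order)  # round-robin over owned pages: [0, 1, 2, 3, 0, 1]
```

Since each minor cycle touches a fixed, owned set of pages, its cost is bounded and independent of the other processors and of the heap on disk.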
> Feasible, but you don't want it, because you don't want to take the latency
> consequences of the chain of deletes that can result from this.
Yeah, unpredictable latency spikes are a problem for real-time
programming. You need to amortize or defer long chains of deletes to
keep latency bounded.
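One common way to bound that latency (my sketch of the general technique, not a claim about any particular system): instead of recursively tearing down a whole delete chain at once, push objects onto a free queue and process at most a fixed budget per collection step, re-queueing children so the chain unwinds incrementally.

```python
# Deferred, budgeted teardown: release() is O(1), and each gc_step()
# frees at most `budget` objects, pushing children back on the queue
# so long delete chains are spread across many bounded steps.

from collections import deque

free_queue = deque()

def release(obj):
    free_queue.append(obj)          # no recursive teardown here

def gc_step(budget):
    """Free at most `budget` objects; children continue next step."""
    freed = 0
    while free_queue and freed < budget:
        obj = free_queue.popleft()
        for child in obj.get("children", []):
            free_queue.append(child)
        freed += 1
    return freed

# A three-link ownership chain: freeing the head would otherwise
# cascade through all three in one pause.
chain = {"children": [{"children": [{"children": []}]}]}
release(chain)
steps = 0
while free_queue:
    gc_step(budget=1)
    steps += 1
print(steps)  # 3: one bounded step per link in the chain
```

The worst-case pause per step is now set by the budget, not by the length of whatever chain happened to die.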