The High Cost of Core Jonathan Shapiro (shap@viper.cis.upenn.edu)
Wed, 23 Nov 94 10:48:09 -0500

Following a long conversation with Norm last night, several problems have emerged with the working set proposal. There are a number of tweaks, but one of the problems is fundamental. Here it is:

Given our principle that the kernel must not deadlock due to the failure of a user process, we arrive at a problem in responding to core pressure. If the kernel is unable to evict a page due to insufficient knowledge of the page's backing store, we must take the pessimistic view that the page is pinned in memory.

I've heard the story about the fascist eviction policy proposal (kernel tells the manager "you have N microseconds to do something appropriate with the page before I drop it"). My response is to suggest that this policy should be applied recursively.

Nonpurgeable pages are not a virtualizable resource; they must be conserved. One consequence of this is that the accounting for such resources must be strict. Because pinned pages cannot be summarily stolen, it isn't straightforward to build a mechanism for load balancing across pools of pinned pages.

By contrast, any core page that the kernel knows how to purge can be viewed as living in a cache, and we can come up with a variety of strategies for managing that cache dynamically.

In effect, this creates two different categories of page. The allocation cost of a page the kernel cannot purge is considerably higher than the allocation cost of a purgeable page.

I conclude with some reluctance that there are really three categories of page in the system we're discussing:

pinned pages	application has effectively told us that it has no
		intention of removing this page from core in the
		foreseeable future.

unbacked pages	the kernel cannot purge these pages, but in principle
		there exists a process that might.  The distinction
		between pinned and unbacked pages is that it is
		reasonable to contemplate load balancing strategies
		for unbacked page allocation.

backed pages	the kernel can purge these, and their frames in memory
		can be viewed as cache.
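
As a rough sketch of the three categories above (all names here are hypothetical, not taken from any kernel source), the distinctions could be written down in C as follows. The quota check is only an illustration of the strict accounting that nonpurgeable pages require; it assumes a simple per-pool counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three categories listed above. */
enum page_category {
    PAGE_PINNED,    /* application does not intend to let the page leave core */
    PAGE_UNBACKED,  /* kernel cannot purge it, but some process might */
    PAGE_BACKED     /* kernel can purge it; its frame is effectively cache */
};

/* Assumed inputs: whether the application pinned the page, and whether
 * the kernel knows the page's backing store. */
enum page_category classify(bool app_pinned, bool kernel_knows_store)
{
    if (app_pinned)
        return PAGE_PINNED;
    return kernel_knows_store ? PAGE_BACKED : PAGE_UNBACKED;
}

/* Only backed pages can be purged by the kernel on its own. */
bool kernel_can_purge(enum page_category c)
{
    return c == PAGE_BACKED;
}

/* Load balancing is reasonable to contemplate for anything not pinned. */
bool can_load_balance(enum page_category c)
{
    return c != PAGE_PINNED;
}

/* Assumed accounting pool for pages the kernel cannot purge. */
struct pool {
    unsigned quota;  /* maximum nonpurgeable pages allowed */
    unsigned used;   /* nonpurgeable pages currently charged */
};

/* Charge an allocation. Purgeable pages are not charged in this sketch,
 * since the kernel can reclaim them at will; nonpurgeable pages are
 * refused once the quota is reached. */
bool charge(struct pool *p, enum page_category c)
{
    assert(p->used <= p->quota);
    if (kernel_can_purge(c))
        return true;
    if (p->used == p->quota)
        return false;
    p->used++;
    return true;
}
```

In this sketch pinned and unbacked pages draw on the same pool; whether they should share a quota is exactly the kind of policy question left open here.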

The bad news is that there isn't any one place to localize the policy; the "keeper" for a working set does not, in general, know enough to evict the pages either.

The problem is worse for object frames, which cannot be evicted under any circumstances.

Some thoughts about all of this:

1) The problem of backed vs. unbacked pages can be viewed as a latency issue; a failed backing manager is viewed as having infinite latency, and such pages eventually run out if the system-wide allocation quota is exceeded.

   A problem with this view is that the kernel-implemented backing manager does not compete for resources with any other backing manager, and can be trusted. An external backing manager cannot be trusted by independent third parties, which raises denial of service issues.

2) External backing managers are very expensive; every pageout notification sent to a backing manager implies at least four context switches, and there is a question of whether it should be a push or a pull model; both have problems.

3) It might be reasonable to insist that all unpinned pages be backed. In effect, this ties core pages to the disk, at which point we might as well go back to KeyKOS-style pages.
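
To make the latency view in the first point concrete, here is a minimal sketch in C of the deadline-based eviction described earlier ("you have N microseconds..."). Everything in it is an assumption for illustration: the function name, the outcome names, and the modelling of a failed manager as a latency of UINT64_MAX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A failed backing manager never answers: model that as infinite latency. */
#define LATENCY_INFINITE UINT64_MAX

enum evict_outcome {
    EVICT_CLEANED,  /* manager dealt with the page before the deadline */
    EVICT_DROPPED   /* deadline passed; kernel dropped the page anyway */
};

/* Decide the outcome of one eviction request. In both cases the frame
 * is reclaimed, so the kernel never waits on a failed manager. */
enum evict_outcome evict_with_deadline(uint64_t manager_latency_us,
                                       uint64_t deadline_us)
{
    /* The deadline itself must be finite, or the kernel could wait forever. */
    assert(deadline_us != LATENCY_INFINITE);
    return manager_latency_us <= deadline_us ? EVICT_CLEANED : EVICT_DROPPED;
}
```

Under this model a slow manager and a dead manager are indistinguishable to the kernel; the difference shows up only as lost page contents.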

So the question arises: why don't these problems exist for disk page allocation in KeyKOS? I observe that KeyKOS does not use anything like a meter mechanism to regulate disk page allocation. Is it really true that the saving grace in KeyKOS is simply that disk pages are not as scarce?

Jonathan