Small Spaces for the x86 FamilyThis note describes how we have adapted the ideas from Liedtke's small address space paper [1] for EROS. 1. The Small Spaces IdeaThe basic idea behind small spaces relies on the availability of the x86 segmentation mechanism. Just as the EROS kernel is mapped in common in all address spaces, we can map a number of user regions into all address spaces. Access to these regions can be controlled with the segment registers. A process can first be allocated to one of these regions, and can be moved to a traditional address space if it is discovered not to fit in the so-called ``small space.'' Norm Hardy and I first did a design for small address spaces for EROS in 1991. This fell out of a larger conversation in which we were trying to figure out how to let an EROS process believe that it owned the full 4Gbyte virtual address space. Our technique was to have a region that the kernel lived in, but bounce this region around on demand when the process wanted those virtual addresses. In the end, I concluded that the complexity of the kernel bounce logic outweighed its benefit, and decided to settle for a 3 GByte address space for EROS processes on the x86. This, at least, is 1 GByte more than any of the major operating systems, and leaves enough spare room for kernel emulation within the target process address space if desired. A side effect was that we realized that small spaces were possible. We did not at that time appreciate their performance impact, or we would have implemented them at the time. We did not publish this design. Since I had many other things to implement at the time, I set it aside intending to go back to it later. Jochen Liedtke arrived at the idea independently, and in 1995 put out a paper describing an implementation of it for L4. His performance numbers were very impressive, and I resolved to implement them in EROS as well. The existence of small spaces can be discovered by means of the load segment limit operation, but we consider this a reasonable tradeoff. If it is helpful, the operating system can implement several sizes of small space (e.g. a large number of 128 Kbyte spaces and a smaller number of 1 Mbyte spaces). The current EROS implementation includes only a single size: 128 Kbyte. There are several advantages to small spaces:
2. Planned ImplementationMuch to my surprise, the implementation of small spaces looks to be very simple. The address map for the kernel was rearranged, and I corrected a number of places where assumptions about the previous virtual address had managed to creep into the code. 2.1. Changes to Address MappingsIn the EROS implementation of small spaces, the address space is divided into three regions: +--------------+--------------+----------------+ | Large Space | Small Spaces | Kernel Virtual | | (unique) | (shared) | (shared) | +--------------+--------------+----------------+ 2.1. Changes to the Process StructureI added three new fields to the Process structure:
struct Process {
...
uva_t limit; /* uppermost virtual address */
uva_t bias; /* true VA = process VA + bias */
PTE *smallPTE; /* true VA = process VA + bias */
};
While the limit information can be obtained by inquiring of the segment register itself, it is faster to load it from the Process structure. The bias is process specific, and must be applied to sent and received string addresses during string transfer in order for the kernel to reference the memory map in the correct place.Previous to this change, the lower bound of string addresses and argument blocks was not checked by the fast-path IPC code. Since the uppermost pages of kernel virtual space were unmapped, arithmetic rollover guaranteed that an invalid lower bound would result in a page fault. Rather than place unmapped buffer regions between small spaces I elected to pack them densely and start checking the lower bounds. Every EROS Process might potentially be a small (128 Kbyte) space. Since a 128 Kbyte space takes only 32 page tables to map, I decided to simply dedicate a special collection of page tables. These page tables are mapped into every process and also into the kernel-special address space. Every Process carries with it 32 dedicated PTE entries. When the Process structure is first loaded, the smallPTE field is initialized to point to the first entry. If this field is nonzero, the associated process resides in a small space, and no page table root pointer is loaded on return to the process. The net effect is that the small space runs parasitically in all address spaces, which is what we wanted. 2.2. Changes to Context SwitchThe new context switch logic must update the global descriptor table with the correct base and bound for the small space before switching into it. This amounts to rewriting four words, all of which prove to be in the same cache line. 2.3. Small Process Page FaultsWhen a process performs a page fault, the page fault code must now check to see if the smallPTE field is nonzero. If so, the faulting process is a small process, and should not attempt to share page tables. A different path is taken for address resolution that sidesteps the efficiencies of the main path, because in most cases a small process will have only a single Node or Page as its entire address space. If the fault address exceeds the bound for this process, then one of three things must be true:
2.2. Changes to IPCThe only other piece of code that needs to be touched is the fast and slow IPC paths. The slow IPC path needs to be updated to reflect the new mapping table traversal logic. The fast path needs to be taught a new string transfer method. 3. Results
Copyright 1998 by Jonathan Shapiro. All rights reserved. For terms of redistribution, see the GNU General Public License |