[cap-talk] KeyKOS kernel safety

David Hopwood david.nospam.hopwood at blueyonder.co.uk
Tue Aug 23 19:42:18 EDT 2005


Charles Landau wrote:
> This question is directed at the KeyKOS folks who have a better memory 
> than I.
> 
> In KeyKOS there was a page range designated as the kernel range. The 
> boot code loaded the range without consulting checkpoint data. To update 
> the kernel, you could write data to these pages, wait for it to be 
> migrated from the checkpoint area to the home range, and reboot the system.
> 
> KeyKOS was designed to maintain a safe working state if the system 
> crashed and rebooted at any instant. My question is, what if the system 
> crashed during the write or migration of the kernel range?
> 
> My recollection is that the design, if not the implementation, called 
> for two kernel ranges, a primary and a backup. If the primary failed to 
> boot, the secondary was booted. You would update and test one before 
> updating the other.
>
> (To be safe, you would have to first write the first page of the primary 
> kernel with code that reliably fails. At the instant that page is 
> migrated, any reboot will reliably go to the secondary kernel. Then 
> write the other pages of the new primary kernel, and wait for them to be 
> migrated. Then write a good first page; at the instant that page is 
> migrated, the new kernel is bootable.)

This is essentially a slightly unconventional implementation of Challis'
algorithm. It depends on the fact that what is being updated is executable
code: a first page that reliably fails to boot plays the role of the
"this copy is invalid" marker.
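
As a rough illustration of the update order in the quoted message, here is
a small Python model. It is only a sketch under assumptions that are not in
the original thread: each page migrates atomically and in the order it was
written, a range counts as bootable exactly when its first page is not the
failing stub, and the names FAIL, boots, update_steps and choose_kernel are
invented for the example.

```python
FAIL = "FAIL"  # hypothetical stand-in for a first page that reliably fails to boot


def boots(kernel_pages):
    """Assumption: a range boots iff its first page is not the failing stub."""
    return kernel_pages[0] != FAIL


def update_steps(old, new):
    """Yield the primary range after each page migration, in the quoted order.

    Assumes old and new have the same number of pages.
    """
    pages = list(old)
    pages[0] = FAIL                  # 1. the failing first page migrates first
    yield list(pages)
    for i in range(1, len(new)):     # 2. then the other pages of the new kernel
        pages[i] = new[i]
        yield list(pages)
    pages[0] = new[0]                # 3. the good first page migrates last
    yield list(pages)


def choose_kernel(primary, secondary):
    """Boot the primary if it boots, otherwise fall back to the secondary."""
    return primary if boots(primary) else secondary
```

In this model, whichever step a crash lands on, choose_kernel returns
either the secondary or the complete new primary, never a mix of old and
new pages.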

If you're also asking how to do this in CapROS (or any new OS), wouldn't
it be more straightforward to implement Challis' algorithm in the usual
way, which doesn't depend on the contents being code? In that case it's
still possible to have a primary kernel and a backup, but each could be
updated atomically and independently of the other.
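
For comparison, a minimal sketch of the content-independent scheme, along
the lines of how Challis' algorithm is commonly described: two slots, each
carrying a version stamp before and after the payload, with every update
overwriting the slot that is not the current valid one. The names Slot,
is_valid, pick and update are hypothetical, and the model assumes that each
of the three writes (head stamp, payload, tail stamp) is atomic and reaches
the disk in order.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    head: int       # version stamp written before the payload
    payload: bytes  # the data being protected (here: a kernel image)
    tail: int       # the same version stamp, written after the payload


def is_valid(slot):
    """A slot is complete only if both stamps agree."""
    return slot.head == slot.tail


def pick(slots):
    """Return the valid slot with the highest version, or None if neither is valid."""
    valid = [s for s in slots if is_valid(s)]
    return max(valid, key=lambda s: s.head) if valid else None


def update(slots, payload, crash_after=3):
    """Write a new version into the slot that is not current.

    crash_after is the number of writes that complete before a simulated
    crash; 3 means the update finishes.
    """
    current = pick(slots)
    target = 0 if current is slots[1] else 1
    version = (current.head if current else 0) + 1
    writes = [("head", version), ("payload", payload), ("tail", version)]
    for field, value in writes[:crash_after]:
        setattr(slots[target], field, value)
```

A crash after any prefix of the three writes leaves the target slot with
mismatched stamps, so pick still returns the previous version; only the
final tail write makes the new payload current. Nothing here depends on the
payload being code.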

-- 
David Hopwood <david.nospam.hopwood at blueyonder.co.uk>


