Question from Bakin

Norman Hardy norm@netcom.com
Sun, 4 Jul 1999 08:38:26 -0700


At 9:17 -0700 99/6/24, David Bakin (Exchange) wrote:
>Jonathon,
> 
>I have a question about how device state is restarted/recovered after a
>crash and checkpoint recovery. 
> 
>In the ongoing thread "some very thought provoking questions" you
>distinguish between the kernel, which is rebooted/restarted, and the
>processes running, which are checkpointed and continued.  And you also
>mentioned how 'external connections' are revoked, and a process can then
>get a notification and perform some recovery (like revalidate a
>capability).
> 
>So is it the case that on reboot the device drivers - presumably part of
>the kernel - reinitialize all external hardware to some known state?  And
>then processes which talk to hardware, on discovering that their
>capability is now invalid, know that they need to recover which may
>including setting the hardware to other states?  I guess part of the
>question may be are "device drivers" entirely kernel mode, or do they tend
>to be written as a kernel mode part and a process part, or what?  In
>practice what kinds of device state are kept in the kernel and which in
>processes (checkpointed, and then verified?)
> 
>Thanks!  -- Dave

I have written a note: <http://www.mediacity.com/~norm/CapTheory/CPIO.html>
on the I/O interface state upon restart for Keykos. That part of the
Gnosis (=Keykos) manual is not online. The IBM 370 architecture allows the
kernel to be oblivious to the attached device types. On restart the
hardware (microcode) resets the devices to a known state documented in the
spec for the device type. The device savvy code (device driver?) runs in
user mode with persistent state. After a restart, its next interaction with
the device is via a capability, and that interaction reports a missing
device. The driver must cope: it may ask higher authorities for another
device, or it may report to its client that it is quitting.
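
The recovery path described above can be sketched roughly as follows. This
is only an illustrative model in Python, not Keykos code: every name in it
(DeviceCap, DeviceMissing, Authority, Driver, simulate_restart) is
hypothetical and does not come from the Gnosis/Keykos manual. It assumes
that a restart leaves the driver's checkpointed state intact while the
device capability it holds stops designating a device.

```python
class DeviceMissing(Exception):
    """Hypothetical: the capability no longer designates a device."""


class DeviceCap:
    """Hypothetical stand-in for a capability to a single device."""

    def __init__(self, name):
        self.name = name
        self.valid = True

    def invoke(self, request):
        # Every interaction with the device goes through the capability.
        if not self.valid:
            raise DeviceMissing(self.name)
        return f"{self.name} did {request}"


class Authority:
    """Hypothetical higher authority that may grant a replacement device."""

    def __init__(self, spares):
        self.spares = list(spares)

    def request_device(self):
        return self.spares.pop(0) if self.spares else None


class Driver:
    """User-mode driver; its fields model state that survives a checkpoint."""

    def __init__(self, cap, authority):
        self.cap = cap
        self.authority = authority
        self.log = []

    def do_io(self, request):
        try:
            return self.cap.invoke(request)
        except DeviceMissing:
            # First interaction after a restart: the device is reported missing.
            self.log.append("device missing")
        replacement = self.authority.request_device()
        if replacement is None:
            # No other device available: report to the client that we quit.
            self.log.append("quitting")
            return None
        # A real driver would re-establish device state here before retrying.
        self.cap = replacement
        self.log.append("reinitialized " + replacement.name)
        return self.cap.invoke(request)


def simulate_restart(caps):
    # Assumption: a restart invalidates the device capabilities held in
    # checkpointed state, while the driver objects themselves persist.
    for cap in caps:
        cap.valid = False
```

The point of the sketch is that the driver needs no special restart
notification: it finds out on its next ordinary use of the capability, and
the decision about what to do next stays in user-mode code.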

Another note: <http://www.mediacity.com/~norm/CapTheory/Pragmatics.html>
discusses the pragmatics of not being able to abort an application by
rebooting.

A third note: <http://www.mediacity.com/~norm/CapTheory/WindowsCP.html>
speculates on how a Windows RAM recovery might work and how that differs
from that of Keykos.

Norman Hardy  <http://www.mediacity.com/~norm>