Hardware for EROS?
Jonathan S. Shapiro
Tue, 11 Apr 2000 15:13:50 -0400
> If a disk seek is the cost of a commit, then many realistic distributed
> computing scenarios cannot wait that long before releasing a message.
Typical seek delay will be lower on a system using EROS-style checkpoints,
because the disk arm is very likely to be in the checkpoint zone already.
It is still on the order of milliseconds, though.
> It would seem that a vanilla EROS system
> + a UPS sufficient to keep things going until the next two commits (ie,
> until the next checkpoint is stable) could validly be considered an EROS
> that never needed to roll back because of power outages.
Better still, you can detect remaining power and just start checkpointing
more frequently. The journaled stuff never rolls back anyway.
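
As a rough illustration of "checkpoint more frequently as power runs down",
here is a minimal sketch. It is not EROS code: the function name, the
300-second normal interval, and the halving policy are all assumptions made
up for the example. The only idea taken from the discussion is that enough
battery time must be held back for the next two commits.

```python
def checkpoint_interval(remaining_s, commit_s,
                        normal_interval_s=300.0, reserve_commits=2):
    """Return how long to wait (seconds) before starting the next checkpoint.

    remaining_s       -- estimated battery runtime left (hypothetical input)
    commit_s          -- estimated time to complete one commit
    normal_interval_s -- interval used on mains power (assumed value)
    reserve_commits   -- commits that must still fit in the remaining power
    """
    # Battery time that must be held back so the reserved commits can finish.
    reserve_s = reserve_commits * commit_s
    slack_s = remaining_s - reserve_s
    if slack_s <= 0:
        # No slack left: start checkpointing immediately.
        return 0.0
    # Wait at most half of the slack, so the interval shrinks as power drains.
    return min(normal_interval_s, slack_s / 2)


print(checkpoint_interval(3600, 5))  # plenty of power: normal interval
print(checkpoint_interval(110, 5))   # low power: shortened interval
print(checkpoint_interval(8, 5))     # nearly empty: checkpoint now
```

With plenty of power the normal interval is used; as the estimate drops, the
interval shortens until checkpoints run back to back.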
> Does EROS check invariants before taking a checkpoint?
Yes. If the invariant check fails, the system reboots. At the moment it
doesn't know how to warm-boot, but I hope to fix that. Reboot means
that it recovers from the most recent stable checkpoint.
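
The check-then-checkpoint behaviour can be sketched as a toy model. This is
not the EROS implementation: the class, its dictionary-based state, and the
invariant callback are hypothetical, and "reboot" is modelled simply as
restoring the last stable snapshot.

```python
class ToySystem:
    """Toy model: state is a dict, a checkpoint is a copy of that dict."""

    def __init__(self, state, invariant):
        self.state = dict(state)
        self.invariant = invariant      # callable: state -> bool
        # Assumption: the initial state is consistent and already stable.
        self.stable = dict(state)

    def checkpoint(self):
        """Check invariants, then checkpoint. Return True if one was taken."""
        if not self.invariant(self.state):
            # Invariant failure: "reboot" by discarding the current state
            # and recovering from the most recent stable checkpoint.
            self.state = dict(self.stable)
            return False
        # Invariants hold: the current state becomes the stable checkpoint.
        self.stable = dict(self.state)
        return True


system = ToySystem({"balance": 10}, lambda s: s["balance"] >= 0)
system.state["balance"] = 5
print(system.checkpoint())   # True: consistent state is checkpointed
system.state["balance"] = -1
print(system.checkpoint())   # False: invariant failed, state rolled back
print(system.state)          # {'balance': 5}
```

The point of checking first is that an inconsistent state never becomes the
checkpoint you would later recover from.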
I think that in practice the checkpoint really has to be stabilized to
non-volatile storage. If there is any practical chance that the battery dies
on you, the entire system ends up unrecoverable unless the checkpoint has
also been stabilized.
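
For readers more used to ordinary file systems, "stabilized" here is
analogous to forcing data to the device before treating it as committed. The
sketch below is only that analogy, written against a conventional file
system; it says nothing about how EROS lays out or commits its checkpoint
area. The function name and file names are made up for the example.

```python
import os
import tempfile


def stabilize(path, data):
    """Write data so that path holds either the old or the new contents."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # Ask the OS to push the bytes to the storage device.
            os.fsync(f.fileno())
        # Atomically switch the name over to the fully written file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "checkpoint.img")
    stabilize(target, b"snapshot-1")
    stabilize(target, b"snapshot-2")
    with open(target, "rb") as f:
        print(f.read())
```

A fuller version would also sync the containing directory so that the rename
itself survives a power loss; that step is left out to keep the sketch short.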