Hardware for EROS?

Jonathan S. Shapiro shap@eros-os.org
Tue, 11 Apr 2000 15:13:50 -0400


> If a disk seek is the cost of a commit, then many realistic distributed
> computing scenarios cannot wait that long before releasing a message.

Typical seek delay will be lower on a system using EROS-style checkpoints,
because the arm is very likely already in the checkpoint zone. It is still
on the order of milliseconds, though.

> It would seem that a vanilla EROS system
> + a UPS sufficient to keep things going until the next two commits (i.e.,
> until the next checkpoint is stable) could validly be considered an EROS
> that never needed to roll back because of power outages.

Better still, you can detect remaining power and just start checkpointing
more frequently. The journaled stuff never rolls back anyway.
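
As a rough illustration of "checkpoint more frequently as power runs down",
here is a minimal sketch. It is not EROS code: the function name, the
parameters, and the default numbers are all hypothetical, and it assumes the
UPS can report an estimate of remaining runtime in seconds.

```python
def next_checkpoint_interval(remaining_s, normal_interval_s=300.0,
                             checkpoint_cost_s=5.0, safety_commits=2):
    """Return how long to wait (seconds) before starting the next checkpoint.

    Hypothetical parameters, chosen only for illustration:
      remaining_s       -- estimated UPS runtime left, or None on mains power
      normal_interval_s -- checkpoint interval when power is not a concern
      checkpoint_cost_s -- assumed time to write one checkpoint to disk
      safety_commits    -- number of commits we want to be sure still fit
    """
    if remaining_s is None:
        # On mains power: keep the normal cadence.
        return normal_interval_s
    # Time left over after reserving room for the commits we must complete.
    slack = remaining_s - safety_commits * checkpoint_cost_s
    if slack <= 0:
        # No slack left: start a checkpoint immediately.
        return 0.0
    # Shrink the interval as the battery drains, never exceeding the normal one.
    return min(normal_interval_s, slack / safety_commits)
```

With these assumed defaults, a full battery leaves the interval unchanged,
and the interval shrinks toward zero as the reported runtime approaches the
time needed for the reserved commits.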

> Does EROS check invariants before taking a checkpoint?

Yes. If the invariant check fails, the system reboots. At the moment it
doesn't know how to warm-boot, but I hope to fix that. Reboot means that it
recovers from the most recent stable checkpoint.
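
The check-then-commit behaviour can be sketched as a toy model. This is only
an illustration under stated assumptions: the `System` class, its dictionary
state, and the `invariant` callback are hypothetical and say nothing about
how the EROS kernel actually represents state or performs the check.

```python
class System:
    """Toy model: running state plus the most recent stable checkpoint."""

    def __init__(self, state):
        self.state = dict(state)    # current (volatile) state
        self.stable = dict(state)   # most recent stable checkpoint

    def checkpoint(self, invariant):
        """Commit the current state only if the invariant holds.

        `invariant` is a hypothetical callable taking the state dict and
        returning True when the state is consistent.
        """
        if not invariant(self.state):
            # Invariant failure: model the reboot by discarding the current
            # state and recovering from the most recent stable checkpoint.
            self.state = dict(self.stable)
            return False
        # Invariants hold: the current state becomes the new stable checkpoint.
        self.stable = dict(self.state)
        return True
```

The point of the sketch is that a state which fails the check never becomes
a checkpoint, so recovery always lands on a state that passed it.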

In practice, I think the checkpoint really has to be stabilized to
non-volatile storage. If there is any practical chance that the battery dies
on you, the entire system ends up unrecoverable unless the checkpoint has
also been stabilized.

shap