Hardware for EROS?
Bill Frantz
frantz@communities.com
Tue, 11 Apr 2000 13:42:50 -0700
Many people have direct experience with battery-backed main memory used as
permanent storage, because that is what the Palm Pilot uses. Those
experiences are directly applicable to larger systems.
Like most things, it works most of the time, but sometimes you need to go
to another level of backup. You can certainly postulate failures that will
kill all your backups. The sun going nova is my favorite example.
At 03:13 PM 4/11/00 -0400, Jonathan S. Shapiro wrote:
>> If a disk seek is the cost of a commit, then many realistic distributed
>> computing scenarios cannot wait that long before releasing a message.
>
>Typical seek delay will be lower on a system using EROS-style checkpoints,
>because the arm is already in the checkpoint zone with high likelihood. It's
>still units of milliseconds, though.
>
>> It would seem that a vanilla EROS system
>> + a UPS sufficient to keep things going until the next two commits (i.e.,
>> until the next checkpoint is stable) could validly be considered an EROS
>> that never needed to roll back because of power outages.
>
>Better still, you can detect remaining power and just start checkpointing
>more frequently. The journaled stuff never rolls back anyway.
>
>> Does EROS check invariants before taking a checkpoint?
>
>Yes. If the invariant check fails the system reboots. At the moment it
>doesn't know how to warm-boot yet, but I hope to fix that. Reboot means
>that it recovers from the most recent stable checkpoint.
>
>I think in practice that the checkpoint really has to be stabilized to
>non-volatile storage. If there is any practical chance that the battery dies
>on you, the entire system ends up unrecoverable unless the checkpoint has
>also been stabilized.
>
>shap
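
To make the "detect remaining power and checkpoint more frequently" idea
in the quoted message concrete, here is a minimal sketch of one possible
policy. It is not EROS code. The function name, the 300-second normal
interval, the 5-second checkpoint cost, and the 0.5 safety factor are all
hypothetical values chosen for illustration. The only thing taken from the
thread is the goal: leave room for two more commits before the battery
runs out.

```python
def checkpoint_interval(remaining_s, normal_interval_s=300.0,
                        checkpoint_cost_s=5.0, safety=0.5):
    """Return seconds to wait before starting the next checkpoint.

    remaining_s is the estimated battery runtime in seconds, or None
    when the machine is on mains power. All defaults are illustrative
    assumptions, not measured EROS figures.
    """
    if remaining_s is None:
        # On mains power: keep the normal schedule.
        return normal_interval_s

    # Trust only a fraction of the battery estimate.
    budget_s = remaining_s * safety

    # Fit two (wait + checkpoint) cycles into the budget:
    #   2 * (interval + cost) <= budget
    interval_s = budget_s / 2.0 - checkpoint_cost_s

    # Never wait longer than normal; never return a negative wait.
    # An interval of 0 means "checkpoint back to back".
    return min(normal_interval_s, max(0.0, interval_s))


if __name__ == "__main__":
    for remaining in (None, 3600, 120, 10):
        print(remaining, checkpoint_interval(remaining))
```

With these assumed numbers, a healthy battery leaves the schedule alone,
and the interval shrinks toward zero as the estimate drops. A real system
would still need the checkpoint stabilized to non-volatile storage, as the
quoted message points out.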