[EROS-Arch] Error logging

Bill Frantz frantz@pwpconsult.com
Tue, 25 Sep 2001 13:15:09 -0700


At 10:49 AM -0700 9/25/01, Jonathan S. Shapiro wrote:
>> The problem with bounded circular logs is that they lend themselves to
>> track-covering. I presume what you would usually want is a bounded
>> non-circular log - once full, attempts to access it would fail or block,
>> depending on your choice, I guess.
>>
>> Circular logs should only be used for non-critical information.
>
>So should we stop the machine because gcc can't write an assertion()?
>
>A diagnostic log is not a guarantee that you can figure out what happened.
>It is merely a potentially useful tool. Viewed in that light, a bounded
>circular log is acceptable for this application.

The problem is perhaps worse than it seems.  When debugging a program with
log entries, one of the big problems is that the frequently occurring
entries push the rare ones out of the circle, but it is the rare ones you
need to see.
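
To make the eviction problem concrete, here is a minimal sketch in Python.
The function name, the capacity, and the event strings are all made up for
illustration; nothing here is taken from EROS or any real logger.  It only
shows that a single rare entry is gone once enough routine entries follow it.

```python
from collections import deque


def run_circular_log(capacity, events):
    """Append events to a bounded circular log and return what survives."""
    log = deque(maxlen=capacity)  # when full, the oldest entry is discarded
    for event in events:
        log.append(event)
    return list(log)


# Hypothetical workload: one rare entry, then a burst of routine ones.
events = ["RARE: disk checksum mismatch"] + ["routine: heartbeat"] * 10
survivors = run_circular_log(capacity=8, events=events)

# True only if the rare entry is still in the circle.
rare_survived = any(e.startswith("RARE") for e in survivors)
```

With these numbers the rare entry is overwritten after the eighth heartbeat,
so only routine entries remain by the time anyone reads the log.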

With lots of storage, and tools to search the log, these problems become
less important in the real world.  (Tools that highlight unusual patterns
in the logs would be most welcome.)

One would like to dream that the log entries would help debug the first
occurrence of those bugs that defy reproduction, and occur rarely.  The low
level circular trace of interrupts and system calls maintained in main
memory by certain IBM operating systems had this property.

>
>This is decidedly *not* true of an audit log, which must indeed last a long
>time. I think we may be talking about two different cases here.
>
>The questions, which I had not considered, would then seem to be:
>
>    1. Should these logs be distinct?
>    2. How should the audit log be handled?

I think there might be a place for rate limiting log entries in an audit
log, on the assumption that, if a component is producing log entries too
fast (for some definition of "too"), it is behaving abnormally, and should
not be permitted to flood the log.  Also, as a practical matter, there
probably needs to be a method of copying the audit log to cheaper, better
protected, or offline storage.
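
One possible shape for the rate limit, as a rough Python sketch: a token
bucket per component.  The class name, the rate and burst parameters, and
the count of suppressed entries are my assumptions for illustration, not a
description of any existing audit log.  The caller passes in the timestamp
so the behavior is easy to follow.

```python
class RateLimitedAuditLog:
    """Hypothetical audit log that caps each component's entry rate.

    rate_per_sec and burst stand in for "some definition of too fast".
    """

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self.buckets = {}  # component -> (tokens left, time of last attempt)
        self.entries = []  # accepted (timestamp, component, message) tuples
        self.dropped = {}  # component -> number of suppressed entries

    def append(self, now, component, message):
        """Record the entry if the component is within its rate limit."""
        tokens, last = self.buckets.get(component, (self.burst, now))
        # Refill in proportion to elapsed time, never above the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[component] = (tokens - 1.0, now)
            self.entries.append((now, component, message))
            return True
        # Over the limit: suppress the entry but remember that we did.
        self.buckets[component] = (tokens, now)
        self.dropped[component] = self.dropped.get(component, 0) + 1
        return False


log = RateLimitedAuditLog(rate_per_sec=1.0, burst=3)
# A component emits five entries at the same instant, then one more later.
results = [log.append(0.0, "driver", f"event {i}") for i in range(5)]
later = log.append(2.0, "driver", "event 5")
```

Here the first three entries are accepted and the next two are suppressed;
the entry two seconds later is accepted again.  Keeping a per-component
count of suppressed entries is a design choice on my part: in an audit log
the fact that something was flooding is itself worth recording.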


>    3. In a decomposed system, how useful is an audit log?

I think it will help people gain confidence that the system is working as
it is supposed to be working (which may be different from how it was
designed to work, or how it was implemented to work).  In other words, an
audit log can highlight design and implementation errors.

Cheers - Bill


-------------------------------------------------------------------------
Bill Frantz           | The principal effect of| Periwinkle -- Consulting
(408)356-8506         | DMCA/SDMI is to prevent| 16345 Englewood Ave.
frantz@pwpconsult.com | fair use.              | Los Gatos, CA 95032, USA