[Return to Top] [Concepts]

EROS Object Reference

Concepts

Scheduling

D R A F T

Description

This description is still under construction.

The EROS kernel implements a two-level scheduling mechanism. The first tier is a scheduler based on processor capacity reserves [Mercer93], while the second is a conventional priority scheduling system. Every process contains a Schedule Key under which it runs. Schedules come in three varieties:

  • A reserve schedule owns the processor during the time period defined by its processor capacity reserve, but does not authorize execution at any other time.

  • A hybrid schedule combines a reserve and a priority. Hybrid schedules own the processor during the time period defined by its processor capacity reserve, and allows the process to run at the specified priority when no reserve is active.

  • A priority schedule enables its process to run at the specified priority when no reserve is active.

When a capacity reserve is in control of the processor and a program operating under that schedule is ready to run, that process will gain control of the processor. If no process is ready to run, or if no reserve is in control of the processor, cycles are allocated according to priority. Within a priority class, processes are executed in round-robin fashion.

Processor Capacity Reserves

A processor capacity reserve is defined by two values: a duration (D) indicating how much computation time is required, and a period (P) describing how often this duration must be made available. The EROS scheduler assumes that activities are preemtable provided that the requirements of their reserves are met. In consequence, both the duration and the period impact the scheduling of the process. A (3 tick,10 tick) reserve and a (30 tick, 100 tick) reserve both allocate 30% of the processor, but the variance permitted by the second reserve is much higher.

I am contemplating altering the reserve abstraction to define a duration, active period, and inactive period in order to facilitate jitter control. It is not clear if this is an improvement.

Once a reserve has been successfully allocated, the holder is entitled to D ticks out of every P ticks. Once these guaranteed ticks have been consumed, they are not replenished until the period P has expired. Depending on the reserve, the process may elect to run opportunistically on a priority basis during the remainder of its period. Guaranteed ticks are always consumed in preference to opportunistic ticks.

Multiple processes may run under the same reserve. Runnable processes within a reserve are initiated in FIFO order. Once activated, a process running under a reserve will continue until the reserve runs out or the process performs some action causing it to yield. A process yields when:

  • It performs a non-prompt action, such as touching a page that must be faulted in from the disk. Under normal circumstances, a real-time program will operate under one or more working set reserves, limiting this impact to startup overhead and/or explicit I/O.
  • It performs a Call, Return, or Send to a process that is not entitled to execute under the same reserve.

Should reserves have a preemption quanta defining when its active process should be rotated? Would doing so provide neater integration of reserves and preemption?

In order to be eligable for guaranteed ticks during a given period, the process must be in the running state at the beginning of that period. If a process running under a reserve R invokes another process entitled to run under the same reserve R, the behavior depends on the type of the invocation:

    Invocation Action
    Call Control flow passes to the invokee without preeption, unless the reserve's period expires in mid-call.
    Return Control flow passes to the invokee without preeption, unless the reserve's period expires in mid-call.
    Send Control flow continues with the invokee without preemption. The invoker is placed at the tail of the activation queue for the reserve (i.e. in the last-in position).

There is an open issue concerning whether we need to add an Activate invocation, in which control flow continues with the invokee and the invoker is placed at the tail of the FIFO. It's a nearly trivial addition, and I would appreciate feedback on whether it is needed. My guess is that the use of the Send primitive in real-time programs should be fairly minimal. Come to think of it, the Activate Primitive would appear to be necessary in the ethernet packet filter to ensure that a packet destined for multiple protocol stacks is delivered to all of them in time to ensure that they get a fair shot at executing within their reserve period.

Reserve Management

Processor reserves are allocated and managed by the Reserve Agent, which in turn holds a Schedule Creator key. Given a request for a processor capacity reserve, the reserve agent either constructs the requested reserve and returns a schedule key to it, or indicates that the reserve cannot be constructed. The reserve agent in turn makes use of the primitive kernel reserve mechanism to support the schedules it issues.

Certain system processes, including the login agent and the prime space bank, run under a pre-allocated reserve. This reserve grants 5ms out of every 200ms, and is generally underutilized. The purpose of this reserve is to ensure that an administrator desiring to log in to the system will be able to do so.

The current kernel implementation of reserves is based on a fixed-priority mechanism. This strategy was easy to implement, and allows up to 69% of the total processing capacity to be allocated to reserved tasks. While this should be be sufficient for the programs we currently anticipate, future implementations may switch to dynamic reprioritization using an earliest deadline first (EDF) algorithm.

Priorities

In addition to the capacity reserve mechanism, the EROS kernel implements a conventional priority scheduler. A process may run under a reserve, under a given priority, or both. Processes entitled to time under a reserve are dispatched in preference to priority based execution. Within a priority level, processes are dispatched in round-robin fashion according to a system-defined quanta. In all current implementations, the priority scheduling quanta is approximately 10 ms. The accuracy of quanta expiration is subject to the limits of the underlying hardware.

EROS provides 16 priority levels. The conventional interpretation of these priorities is:

    Value Name Description
    16 High Process will run in preference to all lower priority processes.
    8 Normal The priority level assigned to most interactive processes.
    0 Idle Process will run only when there is nothing better to do.

A schedule key conveying the authority to run at a given priority conveys the authority to fabricate schedule keys conveying lesser or equal priority. Processes with access to their own schedule key are thereby able to construct other processes having lesser priority.


References

[Mercer93] Clifford W. Mercer, Stefan Savage, and Hideyuki Tokuda. ``Processor Capacity Reserves: An Abstraction for Managing Processor Usage,'' Proceedings of the Fourth Workshop on Workstation Operating Systems (WWOS-IV), October 1993

Copyright 1998 by Jonathan Shapiro. All rights reserved. For terms of redistribution, see the GNU General Public License