Self Managing Managers

Overview

Managers are responsible for maintaining a consistent and recoverable stable copy of the data represented by a container. Managers are themselves ordinary programs running inside containers, which raises the classic chicken-and-egg problem: who manages the managers? The answer is other managers. However, some fixed point is required in the system, and this may be provided by a single self managing manager; in practice, more than one self managing manager may exist in the system. Self managing managers are responsible for their own stability and resilience, which requires a small amount of support from the kernel: the kernel takes responsibility for the resilience of a single page through a single kernel function, map_root_page:

map_root_page(
	Capref cont,
	cont_offset addr )

which requests the kernel to act as the manager for the single page at address addr of the specified container. For a manager to be completely self managing, this page must contain the container's invocation point, that is, its code, together with enough of its own data structures for the manager to bootstrap itself. In practice this means the disk address of a single page of data from which the manager's own page maps may be re-created. All access faults on the root page are satisfied by the kernel; in other words, the kernel is the manager of that page.
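
The division of labour described above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the system's actual implementation: the `Kernel` class, the `handle_fault` method, the `PAGE_SIZE` value and the dictionary-based tables are all hypothetical. Only the name map_root_page and its two parameters come from the text.

```python
PAGE_SIZE = 4096  # assumed page size; the text does not specify one


class Kernel:
    """Toy model: the kernel manages at most one root page per container."""

    def __init__(self):
        self.root_pages = {}  # container capability -> page-aligned root address
        self.managers = {}    # container capability -> manager fault handler

    def map_root_page(self, cont, addr):
        """Ask the kernel to act as manager for the page containing addr."""
        self.root_pages[cont] = addr - (addr % PAGE_SIZE)

    def handle_fault(self, cont, addr):
        """Route an access fault to whoever manages the faulting page."""
        page = addr - (addr % PAGE_SIZE)
        if self.root_pages.get(cont) == page:
            return "kernel"  # root page: satisfied by the kernel itself
        manager = self.managers.get(cont)
        if manager is None:
            raise LookupError("no manager for container %r" % (cont,))
        return manager(cont, page)  # any other page: delegated to the manager


kernel = Kernel()
# Hypothetical manager callback that simply reports who handled the fault.
kernel.managers["c1"] = lambda cont, page: "manager"
kernel.map_root_page("c1", 0x2000)
print(kernel.handle_fault("c1", 0x2010))  # fault inside the root page
print(kernel.handle_fault("c1", 0x5000))  # fault elsewhere in the container
```

In this model the first fault is answered by the kernel and the second by the manager callback, which mirrors the statement that the kernel is the manager of the root page and of nothing else.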

Self management is best illustrated by a description of the stabilisation protocol for self managing managers. We assume that some entity has already specified that the shaded page in the figure below is the root page.

Figure: A Self Managing Manager

The stabilisation protocol for self managing managers is as follows:

1. The manager makes a snapshot_request call on the kernel.
2. The kernel makes a container_snapshot call on the manager. As part of the semantics of this call, all other loci within the manager are frozen and their state is snapshotted. The kernel guarantees that the locus executing the container_snapshot call is the only locus running in the container.
3. The manager writes its volatile data to disk. This data is structured so that it can be recovered from a single disk block, whose address is held in the root page along with the bootstrap code.
4. The container_snapshot call returns and the kernel makes a copy of the root page on stable media. If space is not at a premium, the kernel may choose to retain every root page that the manager has asked to be stabilised. Otherwise the kernel may discard root pages; it need only keep those required to recover consistent system states. Since it is the kernel that forms consistent cuts, this discarding is invisible to the managers.
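
The stabilisation protocol can be walked through with a single-threaded Python model. It is a minimal sketch, not the real kernel interface: the `Disk`, `SelfManagingManager` and `SnapshotKernel` classes, the `recover` and `discard_old_roots` methods, and the "keep only the newest root page" policy are assumptions made for illustration. Only the call names snapshot_request and container_snapshot are taken from the text.

```python
class Disk:
    """Toy stable store: an append-only list of blocks addressed by index."""

    def __init__(self):
        self.blocks = []

    def write(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # address of the block just written


class SelfManagingManager:
    def __init__(self, disk):
        self.disk = disk
        self.page_maps = {}  # volatile state that must survive a crash
        self.root_page = {"bootstrap": "code", "map_block": None}

    def container_snapshot(self):
        """Step 3: write volatile data and record its single block address."""
        self.root_page["map_block"] = self.disk.write(dict(self.page_maps))

    def recover(self, root_page):
        """Rebuild volatile state starting from a saved root page."""
        self.root_page = dict(root_page)
        self.page_maps = dict(self.disk.blocks[root_page["map_block"]])


class SnapshotKernel:
    def __init__(self):
        self.saved_roots = []  # stable copies of root pages, oldest first

    def snapshot_request(self, manager):
        """Step 1 arrives here; the kernel then performs steps 2 and 4."""
        # Step 2: other loci would be frozen here (this model has only one).
        manager.container_snapshot()
        # Step 4: once the call returns, copy the root page to stable media.
        self.saved_roots.append(dict(manager.root_page))

    def discard_old_roots(self, keep=1):
        """Assumed policy: retain only the newest `keep` root pages."""
        if keep < 1:
            raise ValueError("at least one root page must be kept")
        self.saved_roots = self.saved_roots[-keep:]


disk, kernel = Disk(), SnapshotKernel()
mgr = SelfManagingManager(disk)
mgr.page_maps[0x1000] = "disk block 7"
kernel.snapshot_request(mgr)
mgr.page_maps[0x2000] = "disk block 9"
kernel.snapshot_request(mgr)
kernel.discard_old_roots()  # invisible to the manager

# Simulated restart: a fresh manager bootstraps from the saved root page.
fresh = SelfManagingManager(disk)
fresh.recover(kernel.saved_roots[-1])
print(sorted(fresh.page_maps))
```

The restarted manager needs nothing beyond the root page copy held by the kernel: the block address stored in it leads to the page maps, which is the sense in which the root page acts as the fixed point for recovery.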
