Self Managing Managers
Overview
Managers are responsible for maintaining a consistent and recoverable stable copy of the
data represented by the container. Managers are ordinary programs running
inside containers. This raises the classic chicken-and-egg problem:
who manages the managers? The answer is other managers, but some fixed
point is required in the system, and this may be provided by a single self managing
manager. In practice more than one self managing manager may exist in the
system. Self managing managers are responsible for their own stability and
resilience; this requires a small amount of support from the kernel. The
kernel provides this support by taking responsibility for the resilience of a
single page through a single kernel function,
map_root_page.
map_root_page( Capref cont, cont_offset addr )
which requests the kernel to act as a manager for the single page at address
addr of the specified container. For a manager to be completely self
managing, this page must contain the container invocation point, i.e. code, and
enough of its own data structures to be able to bootstrap itself. In practice
these data structures amount to the disk address of a single page of data from
which the manager's page maps may be re-created. All access faults on the root
page are satisfied by the kernel; in other words, the kernel is the manager of
that page.
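The following is a minimal sketch of what a root page and its registration might look like. It is illustrative only: the page size, the concrete typedefs for Capref and cont_offset, the disk_addr type, the root_page layout, the int return value and the page-alignment check are all assumptions, and the map_root_page body here is a user-level stand-in for the real kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u            /* assumed page size */

typedef uint32_t Capref;           /* assumed: opaque capability reference */
typedef uint64_t cont_offset;      /* assumed: byte offset within a container */
typedef uint64_t disk_addr;        /* assumed: disk block address */

/* Hypothetical root page layout: the bootstrap code plus the one disk
 * address from which the manager's page maps can be re-created. */
struct root_page {
    disk_addr map_block;                                 /* where the page maps live on disk */
    uint8_t   bootstrap[PAGE_SIZE - sizeof(disk_addr)];  /* invocation point code and data */
};

/* The kernel manages exactly one page, so the layout must fit in one. */
static_assert(sizeof(struct root_page) == PAGE_SIZE,
              "root page must occupy exactly one page");

/* Stand-in record of which page the "kernel" has agreed to manage. */
struct root_registration {
    Capref      cont;
    cont_offset addr;
    int         valid;
};

static struct root_registration registered_root;

/* Stand-in for the kernel call. Returns 0 on success, -1 on failure.
 * Assumption: the root page must start on a page boundary. */
int map_root_page(Capref cont, cont_offset addr)
{
    if (addr % PAGE_SIZE != 0)
        return -1;
    registered_root.cont  = cont;
    registered_root.addr  = addr;
    registered_root.valid = 1;
    return 0;
}
```

Keeping the disk address at a fixed position in the page means the bootstrap code can find it without consulting any other data structure.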
Self management is best illustrated by a description of the stabilisation
protocol for self managing managers. We assume that some entity has already
specified that the shaded page in the figure below is the root page.
Figure: A Self Managing Manager
The stabilisation protocol for self managing managers is as follows:
- The manager makes a snapshot_request call on the kernel.
- The kernel makes a container_snapshot call on the manager. All other loci
within the manager will be frozen and their
state snapshotted as part of the container_snapshot call semantics. The
locus executing the container_snapshot call is guaranteed by the kernel
to be the only locus running in the container.
- The manager writes its volatile data to disk. This data is structured in
such a way that it is recoverable using the address of a single disk block
whose address is contained in the root page along with the bootstrap code.
- The container_snapshot call returns, and the kernel makes a copy of
the root page on stable media. If space is not at a premium, the kernel may
choose to retain every root page that the manager has asked it to stabilise.
Otherwise it may discard root pages, keeping only those required to enable
recovery of consistent system states. Since it is the kernel that forms
consistent cuts, this discarding is invisible to the managers.
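The four steps above can be sketched as a small single-threaded simulation. This is a toy model under stated assumptions, not the kernel's implementation: the in-memory disk, the struct layouts, the shadow-style block allocation, the oldest-first discard policy and the recover function are all hypothetical, and only the names snapshot_request and container_snapshot come from the protocol itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE   4096u   /* assumed page size */
#define DISK_BLOCKS 8u      /* size of the toy in-memory disk */
#define MAX_ROOTS   4u      /* root page copies the toy kernel retains */

typedef uint32_t disk_addr; /* assumed: disk block number */

/* Hypothetical root page: one disk address plus the bootstrap code. */
struct root_page {
    disk_addr map_block;
    uint8_t   bootstrap[PAGE_SIZE - sizeof(disk_addr)];
};

struct manager {
    struct root_page root;                     /* page registered as the root page */
    uint8_t          volatile_data[PAGE_SIZE]; /* state not yet on disk */
    disk_addr        next_block;               /* next block to write (never in place) */
};

struct kernel {
    uint8_t          disk[DISK_BLOCKS][PAGE_SIZE]; /* toy disk, kept here for brevity */
    struct root_page stable_roots[MAX_ROOTS];      /* stable copies of root pages */
    unsigned         nroots;
    int              others_frozen;                /* 1 while other loci are frozen */
};

/* Step 3: the manager writes its volatile data to a fresh block and
 * records that block's address in the root page. */
void container_snapshot(struct kernel *k, struct manager *m)
{
    assert(k->others_frozen);  /* only one locus runs in the container */
    disk_addr b = m->next_block;
    memcpy(k->disk[b], m->volatile_data, PAGE_SIZE);
    m->root.map_block = b;
    m->next_block = (b + 1) % DISK_BLOCKS;
}

/* Steps 1, 2 and 4: the request, the upcall, then the root page copy. */
void snapshot_request(struct kernel *k, struct manager *m)
{
    k->others_frozen = 1;      /* step 2: freeze the other loci */
    container_snapshot(k, m);
    k->others_frozen = 0;

    if (k->nroots == MAX_ROOTS) {  /* discard the oldest retained root page */
        memmove(&k->stable_roots[0], &k->stable_roots[1],
                (MAX_ROOTS - 1) * sizeof k->stable_roots[0]);
        k->nroots--;
    }
    k->stable_roots[k->nroots++] = m->root;  /* step 4: copy to stable media */
}

/* Recovery: restore the newest root page, then reload the data it names. */
void recover(const struct kernel *k, struct manager *m)
{
    assert(k->nroots > 0);
    m->root = k->stable_roots[k->nroots - 1];
    memcpy(m->volatile_data, k->disk[m->root.map_block], PAGE_SIZE);
    m->next_block = (m->root.map_block + 1) % DISK_BLOCKS;
}
```

Because the toy disk has more blocks than the kernel retains root pages, every retained root page still names a block that has not been overwritten. The manager never learns which root pages were discarded, which mirrors the point that discarding is invisible to managers.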
See Also
Container
managers
map_root_page
snapshot_request
container_snapshot