Core Components

Pacemaker

Pacemaker is a high-availability cluster resource manager. At its core, Pacemaker is a distributed finite state machine capable of co-ordinating the startup and recovery of inter-related services across a set of machines.

Pacemaker supports a number of resource agent standards (LSB init scripts, OCF resource agents, systemd unit files, etc.) to manage any service, and can model complex relationships among them (colocation, ordering, etc.).
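To make the idea of an ordering relationship concrete, here is a minimal sketch that derives a start sequence from a set of "start after" constraints. It is an illustration only, not Pacemaker's scheduler; the resource names and the start_order helper are hypothetical, and colocation is left out.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical resources: each key must start after every resource in its set.
start_after = {
    "virtual-ip": set(),
    "database": {"virtual-ip"},
    "web-server": {"virtual-ip", "database"},
}


def start_order(constraints):
    """Return one start sequence that satisfies every ordering constraint."""
    return list(TopologicalSorter(constraints).static_order())


order = start_order(start_after)
print(order)
```

Reversing the list gives a matching stop sequence, so dependents are stopped before the resources they rely on.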

Pacemaker supports advanced service configurations such as groups of dependent resources, cloned resources that must be active on multiple machines, resources that can switch between two different roles, and containerized services.

Corosync

Corosync APIs provide membership (a list of peers), messaging (the ability to talk to processes on those peers), and quorum (whether a majority of peers is present) capabilities to projects such as Pacemaker that need to be cluster-aware.
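The majority test behind quorum can be sketched in a few lines. This is a simplified model that assumes one vote per peer; the has_quorum function is a hypothetical name, not a Corosync API, and real deployments can adjust the rule through configuration.

```python
def has_quorum(online_votes: int, total_votes: int) -> bool:
    """Return True when strictly more than half of all votes are present."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    if not 0 <= online_votes <= total_votes:
        raise ValueError("online_votes must be between 0 and total_votes")
    # An exact half is not a majority, so a 2-of-4 split has no quorum.
    return online_votes > total_votes // 2


print(has_quorum(2, 3))
print(has_quorum(2, 4))
```

Under this rule an evenly split cluster leaves neither half with quorum, which is why odd node counts are a common choice.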

libqb

libqb is a library with the primary purpose of providing high-performance, reusable features for client/server applications, including high-performance logging, tracing, IPC, and polling.

Resource Agents

Resource agents are the abstraction that allows Pacemaker to manage services it knows nothing about. They contain the logic for what to do when the cluster wishes to start, stop or check the health of a service.

This set of agents conforms to the Open Cluster Framework (OCF) specification. A guide to writing agents is also available.
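The start, stop, and health-check logic described above can be sketched as a small action dispatcher. This is an illustration under stated assumptions, not a complete agent: the "service" is a hypothetical marker file, the function names are invented for the example, and other actions an OCF agent is expected to support (such as meta-data) are omitted. The numeric exit codes are the ones defined by the OCF resource agent standard.

```python
import os
import tempfile

# Exit codes defined by the OCF resource agent standard.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7

# Hypothetical marker file standing in for a real service's state.
STATE_FILE = os.path.join(tempfile.gettempdir(), "demo-agent.state")


def monitor():
    """Report whether the pretend service is running."""
    return OCF_SUCCESS if os.path.exists(STATE_FILE) else OCF_NOT_RUNNING


def start():
    """Start the pretend service; starting twice is not an error."""
    if monitor() == OCF_SUCCESS:
        return OCF_SUCCESS
    with open(STATE_FILE, "w") as handle:
        handle.write("running\n")
    return monitor()


def stop():
    """Stop the pretend service; stopping a stopped service succeeds."""
    if os.path.exists(STATE_FILE):
        os.remove(STATE_FILE)
    return OCF_SUCCESS


ACTIONS = {"start": start, "stop": stop, "monitor": monitor}


def run_action(action):
    """Dispatch the requested action name to its handler."""
    handler = ACTIONS.get(action)
    if handler is None:
        return OCF_ERR_UNIMPLEMENTED
    try:
        return handler()
    except OSError:
        return OCF_ERR_GENERIC
```

A real agent receives the action name as its command-line argument and exits with the returned code, so a wrapper would typically end with something like sys.exit(run_action(sys.argv[1])).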

Fence Agents

Fence agents are the abstraction that allows Pacemaker to isolate badly behaving nodes, by either powering off the node or disabling its access to common resources. The fence-agents project provides fence agents for commonly used fence devices, including intelligent power and network switches, IPMI, popular cloud services, virtualization hosts, and shared storage access.

OCF specification

The Open Cluster Framework specification is a set of standards for cluster components. Currently, only the resource agent standard is in use.

Configuration Tools

Pacemaker's internal configuration format is XML, which is great for machines but terrible for humans.

The community's best minds have created command-line and graphical interfaces to hide the XML and allow the configuration to be viewed and updated in a more human-friendly format.

crm shell

The original configuration shell for Pacemaker. Written and actively maintained by SUSE, it may be used either as an interactive shell with tab completion, for single commands directly on the shell's command line, or as a batch mode scripting tool.

Hawk

Hawk is a web-based GUI for managing and monitoring Pacemaker HA clusters. It is generally intended to be run on every node in the cluster, so that you can point your web browser at any node to access it. There is a usage guide at hawk-guide.readthedocs.io, and it is also documented as part of the SUSE Linux Enterprise High Availability Extension documentation.

LCMC

The Linux Cluster Management Console (LCMC) is a GUI with an innovative approach for representing the status of and relationships between cluster services. It uses SSH to let you install, configure, and manage clusters from your desktop.

pcs

pcs provides both a command-line tool and Web-based GUI for managing the complete life cycle of all cluster components, including Pacemaker, Corosync, QDevice, SBD, and Booth.

pygui

The original GUI for Pacemaker, written in Python by IBM China. It is no longer actively developed.

Striker

Striker is the user interface for the Anvil! (virtual) server platform and the ScanCore autonomous self-defence and alert system.

Other Add-ons

booth

The Booth cluster ticket manager extends Pacemaker to support geographically distributed clustering. It does this by managing the granting and revoking of 'tickets', each of which authorizes one cluster site, potentially in a geographically distant location, to run certain resources.
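The ticket idea can be modelled as a table in which each ticket is held by at most one site at a time. The sketch below is a single-process illustration with hypothetical names (TicketBoard, grant, revoke, may_run); it does not reproduce Booth's network protocol or its command-line interface.

```python
class TicketBoard:
    """Track which site, if any, currently holds each ticket."""

    def __init__(self):
        self._holder = {}  # ticket name -> site name

    def grant(self, ticket, site):
        """Grant a ticket unless a different site already holds it."""
        current = self._holder.get(ticket)
        if current is not None and current != site:
            return False
        self._holder[ticket] = site
        return True

    def revoke(self, ticket):
        """Take the ticket away from whichever site holds it."""
        self._holder.pop(ticket, None)

    def may_run(self, ticket, site):
        """A site may run ticket-dependent resources only while it holds the ticket."""
        return self._holder.get(ticket) == site


board = TicketBoard()
board.grant("ticket-db", "site-a")
print(board.may_run("ticket-db", "site-a"))
print(board.grant("ticket-db", "site-b"))
```

The second grant is refused until the ticket is revoked from the first site, which is the property that keeps the dependent resources from running at two sites at once.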

sbd

SBD provides a node fencing mechanism through the exchange of messages via shared block storage such as a SAN, iSCSI, or FCoE. This isolates the fencing mechanism from changes in firmware version or dependencies on specific firmware controllers, and it can be used as a STONITH mechanism in all configurations that have reliable shared storage. It can also be used as a pure watchdog-based fencing mechanism.