RHEL 6.4 onwards

Install

Pacemaker ships as part of the Red Hat High Availability Add-on. The easiest way to try it out on RHEL is to install it from the Scientific Linux or CentOS repositories.

If you are already running CentOS or Scientific Linux, you can skip this step. Otherwise, to teach the machine where to find the CentOS packages, run:

[ALL] # cat <<'EOF' > /etc/yum.repos.d/centos.repo
[centos-6-base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
enabled=1
EOF
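
If you want to confirm yum can now see the repository, you can list the enabled repositories (the exact output depends on your mirrors and any other repositories you have configured):

[ALL] # yum repolist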

Next, we use yum to install pacemaker and the other packages we will need:

[ALL] # yum install pacemaker cman pcs ccs resource-agents
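
If you want to double-check that everything was installed, querying the RPM database is one way to do it (each package should be listed with a version rather than reported as not installed):

[ALL] # rpm -q pacemaker cman pcs ccs resource-agents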

Configure Cluster Membership and Messaging

The supported stack on RHEL6 is based on CMAN, so that's what Pacemaker uses too.

We now create a CMAN cluster and populate it with some nodes. Note that the name cannot exceed 15 characters (we'll use 'pacemaker1').

[ONE] # ccs -f /etc/cluster/cluster.conf --createcluster pacemaker1
[ONE] # ccs -f /etc/cluster/cluster.conf --addnode node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addnode node2

Next we need to teach CMAN how to send its fencing requests to Pacemaker. We do this regardless of whether or not fencing is enabled within Pacemaker.

[ONE] # ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
[ONE] # ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node2
[ONE] # ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node1 pcmk-redirect port=node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node2 pcmk-redirect port=node2
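
At this point /etc/cluster/cluster.conf should look roughly like the following (the config_version, node IDs, attribute order, and any additional default elements may differ on your system):

[ONE] # cat /etc/cluster/cluster.conf
<cluster config_version="9" name="pacemaker1">
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
</cluster>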

Now copy /etc/cluster/cluster.conf to all the other nodes that will be part of the cluster.
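
Assuming the second node is reachable over ssh as node2 (as in the examples above), something like scp will do:

[ONE] # scp /etc/cluster/cluster.conf node2:/etc/cluster/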

Start the Cluster

CMAN was originally written for rgmanager and assumes the cluster should not start until the node has quorum, so before we try to start the cluster, we need to disable this behavior:

[ALL] # echo "CMAN_QUORUM_TIMEOUT=0" >> /etc/sysconfig/cman

Now, on each machine, run:

[ALL] # service cman start
[ALL] # service pacemaker start
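
To check that CMAN came up and can see both machines, you can ask it directly; the output should list node1 and node2 as members:

[ALL] # cman_tool nodes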

A note for users of prior RHEL versions

The original cluster shell (crmsh) is no longer available on RHEL. To help people make the transition, there is a quick reference guide for those wanting to know what the pcs equivalent is for various crmsh commands.
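
As a rough taste of the mapping (not taken from the reference guide itself), two equivalents you will use constantly are:

crm status           ->  pcs status
crm configure show   ->  pcs config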

Set Cluster Options

With so many devices and possible topologies, it is nearly impossible to cover fencing in a document like this. For now, we will disable it.

[ONE] # pcs property set stonith-enabled=false
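
If you want to confirm the cluster has no configuration errors after this change, crm_verify can check the live configuration:

[ONE] # crm_verify -L -V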

One of the most common ways to deploy Pacemaker is in a 2-node configuration. However, quorum as a concept makes no sense in this scenario (you only have quorum when more than half of the nodes are available), so we'll disable it too.

[ONE] # pcs property set no-quorum-policy=ignore

For demonstration purposes, we will force the cluster to move services after a single failure:

[ONE] # pcs resource defaults migration-threshold=1
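
You can review what has been set so far; with no arguments, these commands simply list the current values:

[ONE] # pcs property
[ONE] # pcs resource defaults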

Add a Resource

Let's add a cluster service. To keep things easy, we'll choose one that doesn't require any configuration and works everywhere. Here's the command:

[ONE] # pcs resource create my_first_svc Dummy op monitor interval=120s

"my_first_svc" is the name the service will be known as.

"ocf:pacemaker:Dummy" tells Pacemaker which script to use (Dummy - an agent that's useful as a template and for guides like this one), which namespace it is in (pacemaker) and what standard it conforms to (OCF).

"op monitor interval=120s" tells Pacemaker to check the health of this service every 2 minutes by calling the agent's monitor action.

You should now be able to see the service running using:

[ONE] # pcs status

or

[ONE] # crm_mon -1

Simulate a Service Failure

We can simulate an error by telling the service to stop directly (without telling the cluster):

[ONE] # crm_resource --resource my_first_svc --force-stop

If you now run crm_mon in interactive mode (the default), you should see (within the monitor interval of 2 minutes) the cluster notice that my_first_svc failed and move it to another node.
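
Once the cluster has recovered the service, you may want to clear the recorded failure so the failure count does not keep the resource away from its original node; a cleanup does that:

[ONE] # pcs resource cleanup my_first_svc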

Next Steps