9.3.2. Moving Resources Due to Failure
New in 1.0 is the concept of a migration threshold. Simply define migration-threshold=N for a resource and it will migrate to a new node after N failures. There is no threshold defined by default. To determine the resource’s current failure status and limits, use crm_mon --failcounts.
By default, once the threshold has been reached, this node will no longer be allowed to run the failed resource until the administrator manually resets the resource’s failcount using crm_failcount (hopefully after first fixing the failure’s cause). However, it is possible to make failures expire automatically by setting the resource’s failure-timeout option.
So a setting of migration-threshold=2 and failure-timeout=60s would cause the resource to move to a new node after 2 failures, and allow it to move back (depending on the stickiness and constraint scores) after one minute.
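The interaction between the two options can be illustrated with a small toy model. This is only a sketch of the rules described above, not Pacemaker's implementation: the FailureTracker class and its method names are hypothetical, and it assumes that the timeout is measured from the most recent failure and that expiry clears the whole failcount.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureTracker:
    """Toy model of one resource's failures on one node (hypothetical)."""
    migration_threshold: Optional[int] = None  # None: no threshold defined
    failure_timeout: Optional[float] = None    # seconds; None: never expires
    count: int = 0
    last_failure: Optional[float] = None       # time of the latest failure

    def _expired(self, now: float) -> bool:
        # Assumption: the timeout runs from the most recent failure.
        return (
            self.failure_timeout is not None
            and self.last_failure is not None
            and now - self.last_failure >= self.failure_timeout
        )

    def record_failure(self, now: float) -> None:
        if self._expired(now):
            self.count = 0  # old failures are treated as if they never happened
        self.count += 1
        self.last_failure = now

    def failcount(self, now: float) -> int:
        return 0 if self._expired(now) else self.count

    def node_allowed(self, now: float) -> bool:
        # Without a threshold, failures never ban the node in this model.
        if self.migration_threshold is None:
            return True
        return self.failcount(now) < self.migration_threshold


# migration-threshold=2 and failure-timeout=60s, with times in seconds
tracker = FailureTracker(migration_threshold=2, failure_timeout=60)
tracker.record_failure(now=0)
tracker.record_failure(now=10)
print(tracker.node_allowed(now=10))  # False: threshold reached
print(tracker.node_allowed(now=70))  # True: failures have expired
```

Whether the resource actually moves back once the node is allowed again still depends on stickiness and constraint scores, which this sketch does not model.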
There are two exceptions to the migration threshold concept; they occur when a resource either fails to start or fails to stop. Start failures cause the failcount to be set to INFINITY and thus always cause the resource to move immediately.
Stop failures are slightly different and crucial. If a resource fails to stop and STONITH is enabled, then the cluster will fence the node in order to be able to start the resource elsewhere. If STONITH is not enabled, then the cluster has no way to continue and will not try to start the resource elsewhere, but will try to stop it again after the failure timeout.
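The start and stop exceptions can be summarized as a small decision function. This is a hedged sketch of the rules in this section only: plan_recovery, the Recovery enum and the operation names are hypothetical, math.inf stands in for INFINITY, and the "stay" outcome for failures below the threshold is an assumption about what "not yet migrated" means.

```python
import math
from enum import Enum
from typing import Optional

INFINITY = math.inf  # stand-in for the cluster's INFINITY value


class Recovery(Enum):
    STAY = "stay on the current node"
    MOVE = "move to another node"
    FENCE_THEN_MOVE = "fence the node, then start elsewhere"
    RETRY_STOP = "retry the stop after the failure timeout"


def plan_recovery(
    operation: str,
    failcount: int,
    migration_threshold: Optional[int],
    stonith_enabled: bool,
) -> Recovery:
    """Pick a reaction to a failed operation; failcount excludes this failure."""
    if operation == "stop":
        # A failed stop can only be resolved by fencing the node.
        return Recovery.FENCE_THEN_MOVE if stonith_enabled else Recovery.RETRY_STOP

    # A failed start sets the failcount straight to INFINITY.
    new_count = INFINITY if operation == "start" else failcount + 1
    if math.isinf(new_count):
        return Recovery.MOVE  # applies even when no threshold is defined
    if migration_threshold is not None and new_count >= migration_threshold:
        return Recovery.MOVE
    return Recovery.STAY


print(plan_recovery("monitor", 1, 2, stonith_enabled=True).value)
print(plan_recovery("stop", 0, 2, stonith_enabled=False).value)
```

Handling stop failures before any failcount arithmetic mirrors the text: the outcome of a failed stop depends on whether STONITH is enabled, not on the migration threshold.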