Fencing instances of an unreachable host

Registered by Ehud Trainin

Fencing is essential to support high availability.
The following blueprint addresses fencing integration within Nova.
This is part of a cross-project effort to add fencing to OpenStack.
Background and an overall design for fencing across OpenStack are given at
https://wiki.openstack.org/wiki/Fencing_Instances_of_an_Unreachable_Host.

Fencing state
----------------------
A host may have one of the following fencing states: FENCED or UNFENCED.

It is important to note that while the fencing state is associated with the host, the host is not necessarily completely fenced. The fencing is targeted only at the instances of the host, since those instances need to be restarted on other hosts. If possible, the host is not fenced from the controller, so that it can be made usable again once the problem is resolved. For example, network fencing may disconnect only the data network, and power fencing may perform a hard reboot rather than leaving the host powered off.

Nova should maintain the fencing state for the following reasons:
1) Nova should agree to remote-restart an instance only if its host is fenced.
2) While a host is fenced, the Nova controller should not schedule instances to it, even if it becomes reachable again.
3) When a fenced host becomes reachable again, the Nova controller should clean up and then un-fence the host.

The fencing state would be changed by operations exposed through the API (the fence API and the power-down API) and by automatic procedures triggered by events (self fencing, rejoin of a disconnected host).
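
As an illustration, here is a minimal sketch of how such a per-host state might be represented; the FencingState and HostFencingRecord names are hypothetical, not existing Nova code:

    import enum


    class FencingState(enum.Enum):
        UNFENCED = "UNFENCED"
        FENCED = "FENCED"


    class HostFencingRecord:
        """Tracks the fencing state of a single compute host."""

        def __init__(self, host_name):
            self.host_name = host_name
            self.state = FencingState.UNFENCED

        def mark_fenced(self):
            # Fencing is idempotent: fencing an already fenced host is a no-op.
            self.state = FencingState.FENCED

        def mark_unfenced(self):
            # Called only after the rejoin clean-up has succeeded.
            self.state = FencingState.UNFENCED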

Fixing the remote restart path
----------------------------------------------
Currently, the remote restart ("evacuate") method makes it possible to remote-restart the instances of a failed host without checking any fencing state.
It is suggested to allow a remote restart only if both of the following hold (sketched below):
1) Host failure state = UNREACHABLE
2) Fencing state = FENCED
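
A minimal sketch of such a pre-condition check, assuming hypothetical failure_state and fencing_state attributes on a host record:

    def can_remote_restart(host):
        # Allow "evacuate" only for a host that is both unreachable
        # and already fenced.
        return (host.failure_state == "UNREACHABLE"
                and host.fencing_state == "FENCED")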

Currently, remote restart detaches the volumes attached to the obsolete instance before attaching them to the remotely restarted instance. The Cinder fencing task would also detach these volumes (while also forcing the detachment at the storage level). Therefore, the detachment currently done in remote restart may be skipped, in order to achieve a faster recovery.

Scheduler fencing awareness
---------------------------------------------
It is suggested to allow scheduling to a host only if both of the following hold (sketched below):
1) Host failure state = UP
2) Fencing state = UNFENCED
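
In Nova, this check could live in a scheduler host filter; a standalone sketch of the predicate, assuming the same hypothetical host attributes as above:

    def scheduler_host_passes(host):
        # A fenced host must not receive new instances, even if it
        # has become reachable again.
        return (host.failure_state == "UP"
                and host.fencing_state == "UNFENCED")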

Adding fence-host API
-----------------------------------
While Nova would not perform fencing actions by itself, it is recommended that Nova be able to manage fencing, for the following reasons:
1) As the Nova controller provides an API method for remote restart, and since fencing is a pre-condition for remote restart, it is recommended that Nova also provide the user an API for fencing.
2) Such an API would also hide the fencing details, i.e. the combination of several fencing methods, from the user.

The fence-host method would do the following (sketched below):
1) Check if the host is already in the FENCED state. If so, return success.
2) Check if the host is up. If the host is up, return an error.
3) Issue a fence-host-from-storage request to Cinder.
4) Issue a fence-host-from-data-network request to Neutron.
5) Issue a power fencing request.
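
A sketch of this flow follows; the cinder, neutron and power client objects and their method names are hypothetical placeholders, since the actual fencing requests would be defined by the corresponding cross-project blueprints:

    def fence_host(host, cinder, neutron, power):
        if host.fencing_state == "FENCED":
            return  # already fenced; fencing is idempotent
        if host.failure_state == "UP":
            raise RuntimeError("host %s is up; refusing to fence it"
                               % host.host_name)
        # The three fencing actions may also run concurrently; see
        # "Changes in fencing state" below.
        cinder.fence_host_from_storage(host.host_name)
        neutron.fence_host_from_data_network(host.host_name)
        power.fence(host.host_name)  # hard reboot or power-off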

It would be possible to fall back to manual fencing (aka meatware), for example by powering down the host manually. After manual fencing is done, the user would be able to use the API to confirm that the host is fenced. In such a case, Nova would trust the user and change the fencing state to FENCED, with no further actions.
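
A sketch of that confirmation path, reusing the hypothetical host record from the earlier sketches:

    def confirm_host_fenced(host):
        # Trust the operator's confirmation of manual fencing and
        # record the state change without issuing any fencing request.
        host.fencing_state = "FENCED"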

It should be noted that the fencing, as well as the fencing sub-tasks performed by other components (e.g. Cinder or Neutron), should be idempotent, so there is no problem if a host is already fenced. In such a case, the fencing tasks should quickly acknowledge the fencing request as successful.

Changes in fencing state
-------------------------------------
Once the host power state changes to powered off, due to a hard reboot or a power-off invoked either by the fence-host method or directly, the host fencing state would be changed to FENCED.
Are there further API commands that affect fencing state?

If both the storage and the network were fenced, the host fencing state would be changed to FENCED.

After the self fencing timeout has elapsed, the fencing state would be set to FENCED.

Since all of the above may happen concurrently, the fencing state would be changed to FENCED after the first successful fencing action; the remaining actions would still be carried out, but would not change the fencing state, since it is already FENCED.
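
A sketch of this first-success-wins behaviour, assuming the fencing actions report their completions concurrently:

    import threading


    class FencingStateTracker(object):
        """Flips a host to FENCED on the first successful fencing action."""

        def __init__(self, host_name):
            self.host_name = host_name
            self.fenced = False
            self._lock = threading.Lock()

        def on_fencing_success(self, action):
            with self._lock:
                if self.fenced:
                    # Already FENCED; later successes change nothing.
                    return
                self.fenced = True
                print("%s set to FENCED by first successful action: %s"
                      % (self.host_name, action))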

Joining back a fenced host
-----------------------------------------
When a fenced host becomes reachable again, the following actions should be done (sketched below):
1) Wait for the self fencing timeout to elapse, if it has not already elapsed. Alternatively, if self fencing is not activated, shut down all instances in case they are not already shut down.
2) Delete all obsolete instances.
3) Reconnect the remaining instances to their volumes.
4) Reconnect the network.
5) If the above tasks succeeded, change the host's fencing state to UNFENCED.
6) Power on the remaining instances.
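
A sketch of this rejoin sequence; every helper called below is a hypothetical placeholder for the corresponding clean-up task:

    def rejoin_fenced_host(host, obsolete_instances, remaining_instances):
        # 1) Make sure no instance can still be running un-fenced.
        if host.self_fencing_enabled:
            host.wait_for_self_fencing_timeout()
        else:
            for instance in remaining_instances:
                instance.shut_down()
        # 2) Remove instances that were already restarted elsewhere.
        for instance in obsolete_instances:
            instance.delete()
        # 3) and 4) Restore storage and network connectivity.
        for instance in remaining_instances:
            instance.reconnect_volumes()
        host.reconnect_network()
        # 5) Only a fully cleaned-up host may be un-fenced.
        host.fencing_state = "UNFENCED"
        # 6) Bring the surviving instances back up.
        for instance in remaining_instances:
            instance.power_on()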

Listening to changes in fencing state
--------------------------------------------------------
It would be possible to subscribe to notifications about changes in the fencing state. Subscribers would get a notification for every such change, whatever event, action or initiator caused it.
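
A minimal sketch of such a publish/subscribe mechanism; in practice this would more likely ride on the existing OpenStack notification bus:

    class FencingNotifier(object):
        """Fans fencing-state changes out to all subscribers."""

        def __init__(self):
            self._subscribers = []

        def subscribe(self, callback):
            self._subscribers.append(callback)

        def notify(self, host_name, new_state, cause):
            for callback in self._subscribers:
                callback(host_name, new_state, cause)

    notifier = FencingNotifier()
    notifier.subscribe(lambda host, state, cause:
                       print("%s -> %s (%s)" % (host, state, cause)))
    notifier.notify("compute-1", "FENCED", "power fencing succeeded")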

Blueprint information

Status: Not started
Approver: None
Priority: Undefined
Drafter: Ehud Trainin
Direction: Needs approval
Assignee: Ehud Trainin
Definition: Drafting
Series goal: None
Implementation: Unknown
Milestone target: None

Whiteboard

If you are still working on this, please re-submit via nova-specs. If not, please mark as obsolete, and add a quick comment to describe why. --johnthetubaguy (20th April 2014)
