Services heartbeat with ZooKeeper

Registered by Alex Glikson

Today the heartbeat information for Nova services/nodes is maintained in the DB: each service periodically updates its record in the Service table (every 10 seconds by default) with the timestamp of the last update. This mechanism is inefficient and does not scale. For example, maintaining the heartbeat information for 1,000 nodes/services requires 100 DB updates per second (1,000 services / 10 seconds), just for the heartbeat.
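The write load quoted above follows directly from the number of services and the reporting interval. A minimal sketch of that arithmetic, assuming every service writes exactly once per interval (the function name and the extra sample sizes are illustrative and not taken from Nova):

```python
def heartbeat_writes_per_second(num_services: int, report_interval_s: float) -> float:
    """Average DB updates per second when every service writes once per interval."""
    if report_interval_s <= 0:
        raise ValueError("report_interval_s must be positive")
    return num_services / report_interval_s


# Illustrative deployment sizes; 10 seconds is the default interval cited above.
for n in (100, 1_000, 10_000):
    print(n, "services:", heartbeat_writes_per_second(n, 10), "updates/s")
```

The load grows linearly with the number of services, so under this assumption a tenfold larger deployment means tenfold more heartbeat writes.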
A much more lightweight, scalable and reliable heartbeat mechanism can be implemented using ZooKeeper, which could also be used for other purposes to further improve the scalability and resiliency of Nova.
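One common way to build membership tracking on ZooKeeper is with ephemeral znodes, which the server deletes automatically when the session that created them ends. Whether the driver for this blueprint works exactly this way is an assumption here. The sketch below is a toy in-memory model of that behaviour, not a ZooKeeper client; the class, the session ids and the znode paths are all hypothetical.

```python
class ToyEphemeralRegistry:
    """In-memory stand-in for ephemeral znodes (illustration only, not a real client)."""

    def __init__(self):
        self._owner_by_path = {}  # znode path -> id of the session that created it

    def join(self, session_id, path):
        # Creating an ephemeral node ties its lifetime to the creating session.
        self._owner_by_path[path] = session_id

    def expire_session(self, session_id):
        # When a session ends, every ephemeral node it owned disappears.
        self._owner_by_path = {
            p: s for p, s in self._owner_by_path.items() if s != session_id
        }

    def members(self, group_prefix):
        # List the live members registered under a group path.
        return sorted(p for p in self._owner_by_path if p.startswith(group_prefix + "/"))

    def is_up(self, path):
        return path in self._owner_by_path


registry = ToyEphemeralRegistry()
registry.join("session-1", "/servicegroups/compute/host-a")  # hypothetical paths
registry.join("session-2", "/servicegroups/compute/host-b")
registry.expire_session("session-1")  # host-a's service dies or loses its session
print(registry.members("/servicegroups/compute"))
```

In this model a service is considered up for as long as its node exists, so liveness is a by-product of the session rather than of a periodic write to a shared table.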

Blueprint information

Status:
Complete
Approver:
None
Priority:
Low
Drafter:
Alex Glikson
Direction:
Needs approval
Assignee:
Yun Mao
Definition:
Approved
Series goal:
Accepted for grizzly
Implementation:
Implemented
Milestone target:
2013.1
Started by
Alex Glikson
Completed by
Yun Mao

Related branches

Sprints

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/zk-service-heartbeat,n,z

Addressed by: https://review.openstack.org/10903
    Add pluggable ServiceGroup monitoring APIs

Addressed by: https://review.openstack.org/16396
    Set node_availability_zone in XenAPIAggregateTestCase

Addressed by: https://review.openstack.org/19008
    Implement ZooKeeper driver for ServiceGroup API.

Work Items