Monitoring Physical Devices

Registered by Toni Zehnder on 2012-11-15

It should be possible to monitor physical devices in the OpenStack environment. The monitored devices are:
- the physical servers on which Glance, Cinder, Quantum, Swift, the Nova compute nodes and the Nova controller run
- the network devices used in the OpenStack environment (switches, firewalls ...)

Blueprint information

Status:
Complete
Approver:
Julien Danjou
Priority:
Low
Drafter:
Toni Zehnder
Direction:
Approved
Assignee:
Oleksii Serhiienko
Definition:
Approved
Series goal:
Accepted for icehouse
Implementation:
Implemented
Milestone target:
2014.1
Started by
Toni Zehnder on 2013-03-19
Completed by
Julien Danjou on 2014-03-04

Related branches

Sprints

Whiteboard

Patches are already merged; can anybody mark this bp as DONE?

2013-05-14: Pinged Toni asking him to post his patchset before havana-1 -- jd

Status Update 2013-05-11
Done:
- hardware agent
- snmp inspector
- config stuff for the agent
- supported data: CPU, memory space, disk space, network traffic
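At its core, the snmp inspector maps OIDs from the standard HOST-RESOURCES-MIB and IF-MIB to meter names. The sketch below is a simplified model of that mapping, not the actual inspector code; the OIDs are real standard MIB OIDs, but the meter names and function names here are illustrative assumptions.

```python
# Illustrative sketch of the snmp-inspector idea: map standard MIB OIDs
# (HOST-RESOURCES-MIB, IF-MIB) to meter names. NOT the actual Ceilometer
# inspector code; meter names below are simplified for illustration.

# Standard OIDs (these are real, from HOST-RESOURCES-MIB / IF-MIB):
OID_TO_METER = {
    "1.3.6.1.2.1.25.3.3.1.2": "hardware.cpu.load",        # hrProcessorLoad
    "1.3.6.1.2.1.25.2.3.1.5": "hardware.disk.size",       # hrStorageSize
    "1.3.6.1.2.1.25.2.3.1.6": "hardware.disk.used",       # hrStorageUsed
    "1.3.6.1.2.1.2.2.1.10": "hardware.network.incoming",  # ifInOctets
}

def inspect(snmp_values):
    """Turn a raw {oid: value} dict (as an SNMP GET would return)
    into (meter_name, value) samples, skipping unknown OIDs."""
    for oid, value in snmp_values.items():
        meter = OID_TO_METER.get(oid)
        if meter is not None:
            yield (meter, value)

# Example: pretend these values came back from snmpd on a physical host.
raw = {"1.3.6.1.2.1.25.3.3.1.2": 12, "1.3.6.1.9.9.9": 0}
print(list(inspect(raw)))  # → [('hardware.cpu.load', 12)]
```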

Toni, can you update your architecture diagram?
    1. Is the hardware agent going to be just another agent, similar to the central agent, consisting of pollsters?
         - Yes, it is very similar to the compute agent.
    2. Where are you going to store the data? As metrics or as meters?
         - The data is stored similarly to the compute agent and lives in the same db as well.
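The "agent of pollsters" pattern discussed above can be sketched as follows. This is a self-contained simplification, not the real Ceilometer plugin API: the class and function names (Sample, Pollster, poll_all) and the meter name are invented for illustration.

```python
import abc
import datetime

# Simplified sketch of the pollster pattern shared by the compute agent
# and the hardware agent. The real Ceilometer plugin interface differs;
# names here are invented for illustration.

class Sample:
    def __init__(self, name, volume, resource_id):
        self.name = name
        self.volume = volume
        self.resource_id = resource_id
        self.timestamp = datetime.datetime.utcnow()

class Pollster(abc.ABC):
    @abc.abstractmethod
    def get_samples(self, resources):
        """Yield Sample objects for the given resources."""

class CPUPollster(Pollster):
    def __init__(self, inspector):
        # In practice the inspector would wrap SNMP GETs against the host.
        self.inspector = inspector

    def get_samples(self, resources):
        for host in resources:
            load = self.inspector(host)
            yield Sample("hardware.cpu.load", load, resource_id=host)

def poll_all(pollsters, resources):
    """What the agent loop does each interval: run every pollster and
    collect the samples, which are then published to the same db the
    compute agent writes to."""
    samples = []
    for pollster in pollsters:
        samples.extend(pollster.get_samples(resources))
    return samples

# Usage with a fake inspector standing in for SNMP:
fake_inspector = lambda host: 42
samples = poll_all([CPUPollster(fake_inspector)], ["node-1", "node-2"])
print([(s.resource_id, s.volume) for s in samples])
# → [('node-1', 42), ('node-2', 42)]
```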

[lsmola | 26.10.2013]

As the hardware agent is a kind of central agent, there was a thought to merge them. However, after talking to the tripleo guys, it makes sense to have a separate hardware agent:

Talking to tripleo guys:
=================

<lsmola> lifeless, so does it make sense that the Undercloud Ceilometer central agent will talk directly to snmpd?
<lifeless> In some ways.
<lifeless> For virt instances, the hypervisor gathers metrics
<lifeless> This gives an implicit distribution of load which is good for scaling.
<lifeless> Any 'central X' is suspect from a scaling perspective.
<lifeless> So I guess I'd like to understand how it scales; what the options are.
<lifeless> And how we gather things not known to snmp (if any exist)
<lifeless> having the instances talk to the message bus seems suspect, so snmp does seem better
<lifeless> but the abstraction point would be the hypervisor, which is Ironic in future, and nova-bm today.

<lsmola> lifeless, ok so it make sense that the Ironic as a hypervisor, would collect data both from SNMP and IPMI and send them to message bus, where the Ceilometer Collector will process them?
<lsmola> lifeless, so the hypervisor would contain the hardware agent
<lifeless> lsmola: I think so
<lifeless> lsmola: I'm fairly sure devananda will also think so
<lifeless> lsmola: [but be sure to consult w/devananda]

[lsmola | 27.9.2013]

So most concerns are about how the Central Agent will scale?

[lsmola | 4.10.2013]

Summary
=======

1. From yesterday's ceilometer meeting, we have agreed with jd__ that this agent should be merged into the central agent. (Nobody sees any problems with that for now.)

2. Apart from manually defining in pipeline.yaml what resources it should poll, it should also allow fetching resources dynamically from services (list of baremetals from nova, list of routers from neutron, etc.).

3. About scaling: there will be work on horizontal scaling of the central agent in the future. This is a must-have for larger deployments.
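Point 2 above (static resources in pipeline.yaml) could look roughly like this. This is only a sketch of the kind of configuration the support-resource-pipeline-item work enables; the exact keys, URL scheme, and publisher should be checked against the merged reviews, and the addresses are documentation-range examples.

```yaml
# Sketch of a pipeline.yaml source with statically defined SNMP
# resources (illustrative; verify against the merged patches).
sources:
    - name: hardware_source
      interval: 600
      meters:
          - "hardware.*"
      resources:
          - snmp://public@192.0.2.10
          - snmp://public@192.0.2.11
      sinks:
          - meter_sink
sinks:
    - name: meter_sink
      transformers:
      publishers:
          - rpc://
```

Dynamic discovery (the list of baremetals from nova, routers from neutron, etc.) would then populate the same resource list at runtime instead of hard-coding endpoints here.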

Btw, I have tested obtaining CPU and memory stats in the Undercloud on bm_poseur nodes, and it works great.

[lianhao | 1.9.2014]
I've rebased the following patches onto bp support-resource-pipeline-item and moved them into the central agent.

Gerrit topic: https://review.openstack.org/#q,topic:bp/monitoring-physical-devices,n,z

Addressed by: https://review.openstack.org/43073
    Added hardware agent's inspector and snmp implementation

Addressed by: https://review.openstack.org/43074
    Added pollsters for the hardware agent


Work Items

Work items:
Unittests: TODO
Multiple-publisher pipeline: TODO
New architecture diagram: TODO

Dependency tree
