grouped resources alarm
This blueprint is aimed at setting up and managing alarms that observe the same metrics but are set up on many entities.
In general, such an alarm group would be defined by a Sample-API query that results in many entities (an entity could be a value of any attribute that can be grouped by, e.g. user, project, resource...).
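As a rough illustration of how a query result could be turned into the set of entities in a group, here is a minimal sketch. The sample records, their field names, and the helper `group_entities` are assumptions made for this example; they are not the actual Sample-API schema or any Ceilometer function.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are assumptions for
# illustration only, not the real Sample-API schema.
samples = [
    {"resource_id": "vm-1", "project_id": "p-a", "user_id": "u-1", "volume": 0.93},
    {"resource_id": "vm-2", "project_id": "p-a", "user_id": "u-2", "volume": 0.41},
    {"resource_id": "vm-3", "project_id": "p-b", "user_id": "u-1", "volume": 0.77},
]


def group_entities(samples, group_by):
    """Map each distinct value of the group-by attribute to its samples."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[group_by]].append(sample)
    return dict(groups)


# The same query result yields different entity sets depending on the
# attribute it is grouped by.
by_project = group_entities(samples, "project_id")
by_resource = group_entities(samples, "resource_id")
print(sorted(by_project))
print(sorted(by_resource))
```

Each key of the returned mapping would then be one entity that needs its own inner alarm within the group.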
Blueprint information
- Status: Not started
- Approver: None
- Priority: Undefined
- Drafter: None
- Direction: Needs approval
- Assignee: Liusheng
- Definition: New
- Series goal: None
- Implementation: Unknown
- Milestone target: None
- Started by
- Completed by
Whiteboard
From mailing list conversation:
[lsmola]
>> 5. There is a thought about generating default alarms that could observe
>> the most important things (verifying good behaviour, showing bad behaviour).
>> Does anybody have an idea which alarms could be the most important and
>> usable for everybody?
[jd__]
> I'm not sure you want to create alarms by default; alarms are resources, I
> don't think we should create resources without the user asking for it.
>
> Maybe you were talking about generating alarm templates? You could start
> with things like CPU usage staying at >90% for more than 1 hour, and
> having an action that alerts the user via mail.
> Same for disk usage.
[lsmola]
Well, for example, if we find metrics that can be used for measuring health
(this is probably more undercloud talk, or hardware metrics in general),
we could do something like "I want this alarm on all resources of this type".
If there are e.g. 100s of resources of the same type, it would be pretty
tedious to connect an alarm to each of them, or to change them later.
Btw. it doesn't have to be a list of resource ids; once the sample-api is finished,
it can be any query that produces a list of resources, tenants, etc. (anything that
can be grouped by).
So it could serve as some kind of alarm group management (let's say the group
is tagged somehow so you can recognize it ^^); it would add an alarm when a
new resource is added, and you could manage all alarms from one form.
Then, when we have some alarm groups that are likely to be used by 80% of
clouds, we could e.g. switch them on by default for Admins. An Admin could then
change the alarm group, or delete it if needed.
And yes, preparing general templates is also a good idea, probably categorized by use case.
Users would have something pre-prepared, and they could set up the most used alarms
without needing to read the whole docs.
[jd__]
Agreed, but I don't know/think that Ceilometer has such capabilities
right now.
[lsmola]
Yes, it would be good if something like this were supported -> a relation of an alarm to multiple entities that
are the result of a sample-api query. Could it be worth creating a BP?
Though we could do a simple implementation using tags (a special description) and keeping track that
every entity of some query has its alarm. Using a composite alarm as a wrapper around the group of
alarms could probably also be good; we could use it to store the shared query.
So I guess there could be a way to implement this, though a non-optimal one. So it might be better
to support this in Ceilometer. Not sure. But the upgrade from one way to the other would definitely
be problematic.
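The tag-based approach mentioned above (a marker in the alarm description, plus keeping track that every entity of the query has its alarm) could be sketched roughly as follows. This is only an illustration: the `[group:...]` marker format, the dict keys, and the `reconcile` helper are all hypothetical and do not correspond to an existing Ceilometer API.

```python
def reconcile(group_tag, entity_ids, existing_alarms):
    """Work out which per-entity alarms to create or delete for one group.

    existing_alarms is a list of dicts with the hypothetical keys
    'alarm_id', 'entity_id' and 'description'; group membership is
    recorded as a marker inside the description.
    """
    marker = "[group:%s]" % group_tag
    # Alarms that already belong to this group, keyed by the entity they watch.
    tagged = {
        alarm["entity_id"]: alarm["alarm_id"]
        for alarm in existing_alarms
        if marker in alarm["description"]
    }
    wanted = set(entity_ids)
    # Entities returned by the query that have no alarm in the group yet.
    to_create = sorted(wanted - set(tagged))
    # Group alarms whose entity is no longer returned by the query.
    to_delete = sorted(tagged[entity] for entity in set(tagged) - wanted)
    return to_create, to_delete


existing = [
    {"alarm_id": "a1", "entity_id": "vm-1", "description": "cpu high [group:cpu-health]"},
    {"alarm_id": "a2", "entity_id": "vm-9", "description": "cpu high [group:cpu-health]"},
    {"alarm_id": "a3", "entity_id": "vm-2", "description": "unrelated alarm"},
]
to_create, to_delete = reconcile("cpu-health", ["vm-1", "vm-2", "vm-3"], existing)
print(to_create)  # entity ids that still need an alarm
print(to_delete)  # alarm ids that are now stale
```

Running such a reconciliation whenever the query result changes is what would make the group "add an alarm when a new resource is added"; untagged alarms such as `a3` are left alone.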
[jd__]
Probably indeed.
[lsmola | 24.9.2013]
Summary
=======
The Alarm Group would be defined by a sample-api query. Its result would be the number of inner elements that are in the Alarm state (so it would create the Alarms in the background), also shown as a time series chart.
I could list the alarms of the Alarm Group. I could list the history of each inner alarm.
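A minimal sketch of what the group result in this summary might look like, assuming each inner alarm carries one of the alarm states "ok", "alarm" or "insufficient data". The snapshot structure and the helpers `group_summary` and `alarm_count_series` are hypothetical names introduced for this example.

```python
from collections import Counter

STATES = ("ok", "alarm", "insufficient data")


def group_summary(inner_alarms):
    """Count the inner alarms of a group per state, plus the total."""
    counts = Counter(alarm["state"] for alarm in inner_alarms)
    summary = {state: counts.get(state, 0) for state in STATES}
    summary["total"] = len(inner_alarms)
    return summary


def alarm_count_series(snapshots):
    """Turn (timestamp, inner_alarms) snapshots into chart points of
    (timestamp, number of inner alarms in the 'alarm' state)."""
    return [(ts, group_summary(alarms)["alarm"]) for ts, alarms in snapshots]


# Hypothetical snapshots of one alarm group at two points in time.
snapshots = [
    ("2013-09-24T10:00", [{"entity_id": "vm-1", "state": "ok"},
                          {"entity_id": "vm-2", "state": "alarm"}]),
    ("2013-09-24T11:00", [{"entity_id": "vm-1", "state": "alarm"},
                          {"entity_id": "vm-2", "state": "alarm"}]),
]
latest = group_summary(snapshots[-1][1])
series = alarm_count_series(snapshots)
print(latest)
print(series)
```

The per-state counts would back the group's current result, while the series would feed the time series chart.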
-------
Hi Ladislav Smola, do you plan to continue this bp? ---by liusheng 2015.6.23
Not in the near future; I can assign it to you if you want -- by lsmola 2015.6.23
Hi Ladislav Smola, sorry for the late reply. I'm glad to give it a try if you can assign this to me --by liusheng 2015.7.2