Maintenance As a Service - MAAS(2)

Registered by Tomasz Trębski

Maintenance for Monasca would be a feature that helps with the problem of alarm flooding.
At a high level, maintenance is meant to support cases such as:

- maintenance is being performed on a server, for instance at night
- it is known that data migrations are performed on a regular basis during weekends
- certain nodes/applications/services will be turned off for a certain period of time

In detail, the idea of maintenance-as-a-service would be based on:

- maintenance definition
An entity that would allow pointing at the alarms that need to be muted/suppressed (see board below).
It would also define a time-based expression describing when the maintenance happens, for how long, and whether it is periodic or repeated only X times.
Additionally, it should be possible to disable a maintenance_definition if, for some reason, it is not needed.

- maintenance engine
A new component that would be responsible for:
-- checking whether the definition is active (i.e. its time expression matches the current date/time)
-- listing all alarm definitions (and, further, alarms) that need to be suppressed. Which entity this would be depends on the decision whether maintenance should be based on notification or alarm suppression (see board below)
-- changing the state of the objects found, in order to keep alarms/notifications from flooding the user because of abnormalities in the monitored environment that occur during the maintenance
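
The definition and the first two engine steps could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed Monasca API: the class MaintenanceDefinition, its fields (start, duration, period, repeat_count, enabled) and the helper alarm_definitions_to_suppress are hypothetical names, and the time expression is simplified to a fixed start plus an optional repeat period. How the found objects are then affected is left out, since that depends on the open decisions on the whiteboard.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Set


@dataclass
class MaintenanceDefinition:
    """Hypothetical shape of a maintenance definition (assumption, not a Monasca API)."""
    name: str
    alarm_definition_ids: List[str]      # alarm definitions to mute/suppress
    start: datetime                      # start of the first maintenance window
    duration: timedelta                  # length of each window
    period: Optional[timedelta] = None   # None means a single, one-off window
    repeat_count: Optional[int] = None   # total number of windows; None means unlimited
    enabled: bool = True                 # lets an operator switch the definition off

    def is_active(self, now: datetime) -> bool:
        """Return True if `now` falls inside one of the maintenance windows."""
        if not self.enabled or now < self.start:
            return False
        if self.period is None:
            return now < self.start + self.duration
        # Index of the latest window that starts at or before `now`.
        index = (now - self.start) // self.period
        if self.repeat_count is not None and index >= self.repeat_count:
            return False
        window_start = self.start + index * self.period
        return now < window_start + self.duration


def alarm_definitions_to_suppress(
    definitions: Iterable[MaintenanceDefinition], now: datetime
) -> Set[str]:
    """Collect the IDs of alarm definitions covered by any active maintenance."""
    suppressed: Set[str] = set()
    for definition in definitions:
        if definition.is_active(now):
            suppressed.update(definition.alarm_definition_ids)
    return suppressed


# Example: a 48-hour window that repeats every week (weekend data migration).
weekend_migration = MaintenanceDefinition(
    name="weekend-migration",
    alarm_definition_ids=["ad-1", "ad-2"],
    start=datetime(2024, 1, 6, 0, 0),
    duration=timedelta(hours=48),
    period=timedelta(weeks=1),
)
```

An engine built along these lines would call alarm_definitions_to_suppress periodically with the current time and then act on the returned IDs, either by suppressing notifications or by suppressing state transitions.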

Resources:
- https://docs.google.com/drawings/d/1hokCc3KxTaeWiEyJi8K2ZRP3cnSpAqMBOySYCwnpQSE/edit?usp=sharing

Blueprint information

Status:
Not started
Approver:
Roland Hochmuth
Priority:
Undefined
Drafter:
Tomasz Trębski
Direction:
Needs approval
Assignee:
None
Definition:
New
Series goal:
None
Implementation:
Unknown
Milestone target:
None

Related branches

Sprints

Whiteboard

There are several places where decisions have not yet been taken regarding the implementation approach:
1) Should maintenance define a new alarm state (i.e. MAINTENANCE)?
2) Should maintenance suppress notifications or transitions?


Work Items
