Find the root cause proactively
Currently, Vitrage relies on monitored events for root cause analysis. It may deduce alarms on virtual resources, but such alarms are always in a confirmed status. In the real world, things can be more complicated. Suppose there are two possible causes (A or B) for fault C. When fault C is monitored, either A or B is suspected to have occurred and to be the root cause. We need a way to take action in such cases to find the root cause more proactively.
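The idea can be illustrated with a minimal Python sketch. It is not Vitrage code: the `CANDIDATE_CAUSES` table, the `find_root_causes` function, and the `fake_probe` stub are all hypothetical names introduced here, and the sketch assumes a diagnose action can be modelled as a callable that reports whether a suspected cause is actually present.

```python
from typing import Callable, Dict, List

# Hypothetical table: observed fault -> suspected root causes.
CANDIDATE_CAUSES: Dict[str, List[str]] = {"C": ["A", "B"]}


def find_root_causes(fault: str, diagnose: Callable[[str], bool]) -> List[str]:
    """Run the diagnose action on each suspected cause of `fault`
    and return only the causes it confirms."""
    suspects = CANDIDATE_CAUSES.get(fault, [])
    return [cause for cause in suspects if diagnose(cause)]


def fake_probe(cause: str) -> bool:
    """Stub diagnose action: pretends that only A is really failing."""
    return cause == "A"


confirmed = find_root_causes("C", fake_probe)
```

With this stub, fault C leaves A as the only confirmed cause, while B stays a suspicion that was checked and ruled out.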
Blueprint information
- Status: Not started
- Approver: None
- Priority: Undefined
- Drafter: Yujun Zhang
- Direction: Needs approval
- Assignee: Yujun Zhang
- Definition: New
- Series goal: None
- Implementation: Unknown
- Milestone target: None
- Started by
- Completed by
Whiteboard
As the dependency blueprint "aggregate equivalent alarms" of "proactive-rca" is too complex to implement, the proposal has been updated.
Summary of the updated proposal:
When the current and new statuses of an alarm are monitored and deduced,
respectively, we prescribe rules for merging the two, even if their states
conflict. When a conflict happens, a diagnose action can be used to
further confirm the alarm in the environment.
link:
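The merging behaviour summarized above can be sketched roughly as follows. The actual merging rules are not spelled out in this whiteboard, so the rule used here (keep the monitored alarm and flag the conflict for a diagnose action) is an assumption made only for illustration. The `Alarm` class and `merge_alarms` function are hypothetical and do not correspond to Vitrage's API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Alarm:
    name: str
    source: str   # "monitored" or "deduced"
    active: bool  # state reported by that source


def merge_alarms(current: Alarm, new: Alarm) -> Tuple[Alarm, bool]:
    """Merge a monitored `current` alarm with a deduced `new` alarm.

    Returns (merged_alarm, needs_diagnosis). Assumed rule: the monitored
    alarm is kept, and a state conflict requests a diagnose action.
    """
    if current.name != new.name:
        raise ValueError("can only merge two statuses of the same alarm")
    if current.source != "monitored" or new.source != "deduced":
        raise ValueError("sketch only covers monitored -> deduced updates")
    conflict = current.active != new.active
    return current, conflict


monitored = Alarm("host_down", "monitored", active=False)
deduced = Alarm("host_down", "deduced", active=True)
merged, needs_diagnosis = merge_alarms(monitored, deduced)
```

In this example the two sources disagree, so `needs_diagnosis` is true and the caller would trigger a diagnose action before trusting either state.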