Notify LAVA Admins when a board is offlined for a failed health check
We can already mark boards offline automatically when they fail a health check. The board then shows up as offline in the main scheduler view, and the reason it was marked offline is logged. However, no active notification is sent when this happens, so admins have to check the jobs daily to confirm that boards are still running and that problems are being resolved.
When a board is marked offline, this should trigger an action that sends an email to the users or mailing lists responsible for maintaining the system.
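As a rough illustration of the notification described above, the following sketch builds such an email with Python's standard library. The function name, the sender address, and the subject format are assumptions made for this example and are not taken from the LAVA code base.

```python
from email.message import EmailMessage


def build_offline_notification(board, reason, recipients,
                               sender="lava@example.com"):
    """Build (but do not send) an email saying that a board went offline.

    All names and the default sender are hypothetical placeholders.
    """
    msg = EmailMessage()
    msg["Subject"] = f"[LAVA] Board {board} marked offline"
    msg["From"] = sender
    # Users and mailing lists are treated alike: both are plain addresses.
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        f"Board {board} was marked offline.\n"
        f"Reason: {reason}\n"
    )
    return msg
```

The returned message could then be handed to `smtplib.SMTP(...).send_message(msg)`. Building and sending are kept separate here so the message can be checked without a mail server.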
Blueprint information
- Status: Complete
- Approver: Paul Larson
- Priority: Medium
- Drafter: Paul Larson
- Direction: Approved
- Assignee: Michael Hudson-Doyle
- Definition: Approved
- Series goal: Accepted for trunk
- Implementation: Implemented
- Milestone target: 2012.05
- Started by: Michael Hudson-Doyle
- Completed by: Michael Hudson-Doyle
Whiteboard
[fboudra, 2012-03-22] Re-target to 2012.04.
[fboudra, 2012-04-27] Re-target to 2012.05 milestone.
Meta:
Headline: Admin users are notified when a board is marked offline for a failed health check
Acceptance: When a board is marked offline, an email is sent to a defined user or users, containing a link to the board details page and a link to the failed health job
Roadmap id: LAVA2012-
Work Items
Work items:
Investigate whether django-signals is appropriate to use here (raised at Connect): DONE
Add a way to define the addresses (users?) to notify when a job fails: DONE
Add code to send emails from LAVA: DONE
Define a template email to send including links to the device details page and the scheduler job details page: DONE
Trigger an email using the template when a job completes: DONE
Test that this works: DONE
Add people to notify to all health job definitions in the lab: DONE
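The work items above can be sketched roughly as follows: a template carrying the two links, and a handler that runs when a job completes and notifies only for failed health jobs. This is not the actual LAVA implementation. The dictionary keys (`is_health_check`, `status`, `notify`, `base_url`), the URL layout, and the `notify` callback are all assumptions made for the example, and a plain function call stands in for whatever signal mechanism was chosen.

```python
from string import Template

# Hypothetical URL layout; the real LAVA paths may differ.
BODY_TEMPLATE = Template(
    "Board $board was marked offline after a failed health check.\n"
    "Device details: $base_url/scheduler/device/$board\n"
    "Failed health job: $base_url/scheduler/job/$job_id\n"
)


def on_job_completed(job, notify):
    """Call notify(recipients, body) for a failed health job.

    Returns True if a notification was triggered, False otherwise.
    `job` is a plain dict here; the keys are illustrative assumptions.
    """
    # Only failed health checks take a board offline and need a notification.
    if not job.get("is_health_check") or job.get("status") != "failed":
        return False
    # Recipients are assumed to come from the health job definition.
    recipients = job.get("notify", [])
    if not recipients:
        return False
    body = BODY_TEMPLATE.substitute(
        board=job["board"],
        job_id=job["id"],
        base_url=job["base_url"].rstrip("/"),
    )
    notify(recipients, body)
    return True
```

Passing the sender in as a callback keeps the trigger logic independent of how mail is delivered, which also makes the "Test that this works" item easy to exercise with a stub.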