Notify LAVA Admins when a board is offlined for a failed health check

Registered by Paul Larson

We can already mark boards offline automatically when they fail a health check. The board then shows up as offline in the main scheduler view, and the reason it was marked offline is logged. However, no notification is sent when this happens, so admins have to check the jobs daily to confirm that they are still running and that any problems are resolved.

When a board is marked offline, this should trigger an email to the users or mailing lists responsible for maintaining the system.

Blueprint information

Status:
Complete
Approver:
Paul Larson
Priority:
Medium
Drafter:
Paul Larson
Direction:
Approved
Assignee:
Michael Hudson-Doyle
Definition:
Approved
Series goal:
Accepted for trunk
Implementation:
Implemented
Milestone target:
2012.05
Started by
Michael Hudson-Doyle
Completed by
Michael Hudson-Doyle

Sprints

Whiteboard

[fboudra, 2012-03-22] re-target to 2012.04.
[fboudra, 2012-04-27] Re-target to 2012.05 milestone.

Meta:
Headline: Admin users are notified when a board is marked offline for a failed health check
Acceptance: When a board is marked offline, an email is triggered that notifies a defined user or users with a link to the board details page, and a link to the failed health job
Roadmap id: LAVA2012-LAVA-HEALTH-MANAGEMENT

Work Items

Work items:
Investigate whether django-signals is appropriate to use here (raised at Connect): DONE
Add a way to define the addresses (users?) to notify when a job fails: DONE
Add code to send emails from LAVA: DONE
Define a template email to send including links to the device details page and the scheduler job details page: DONE
Trigger an email using the template when a job completes: DONE
Test that this works: DONE
Add people to notify to all health job definitions in the lab: DONE
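
The email-related work items above can be illustrated with a minimal sketch. This is not the code that landed in LAVA; it only shows the shape of the feature using the Python standard library. The function names, the template wording, the sender address and the example URLs are all assumptions made for illustration, and the `send` callable stands in for whatever mail backend is actually used.

```python
from email.message import EmailMessage
from string import Template

# Hypothetical template text; the wording and placeholder names are assumptions.
OFFLINE_TEMPLATE = Template(
    "Board $device was marked offline after a failed health check.\n"
    "\n"
    "Device details: $device_url\n"
    "Failed health job: $job_url\n"
)


def build_offline_notification(device, device_url, job_url, recipients,
                               sender="lava@example.com"):
    """Build the notification email with links to the device and job pages."""
    msg = EmailMessage()
    msg["Subject"] = f"[LAVA] {device} offline: health check failed"
    msg["From"] = sender  # placeholder address, not a real LAVA setting
    msg["To"] = ", ".join(recipients)
    msg.set_content(
        OFFLINE_TEMPLATE.substitute(
            device=device, device_url=device_url, job_url=job_url
        )
    )
    return msg


def on_health_job_finished(device, passed, device_url, job_url,
                           recipients, send):
    """Call send(message) only when the health job failed.

    Returns the message that was sent, or None if nothing was sent.
    """
    if passed or not recipients:
        return None
    msg = build_offline_notification(device, device_url, job_url, recipients)
    send(msg)
    return msg


if __name__ == "__main__":
    outbox = []  # collects messages instead of talking to a mail server
    on_health_job_finished(
        "panda01",
        passed=False,
        device_url="https://lava.example.com/device/panda01",
        job_url="https://lava.example.com/job/42",
        recipients=["admin@example.com"],
        send=outbox.append,
    )
    print(outbox[0])
```

Passing `send` in as a parameter keeps the trigger logic testable without a mail server. In a real deployment it could be the `send_message` method of an `smtplib.SMTP` connection, or the hook could be wired to a Django signal as the first work item considers.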
