Teach the scheduler about health check jobs and have it treat them as special

Registered by Paul Larson on 2012-02-21

Health check jobs are not regular jobs, but the scheduler currently treats them as if they were. This works, but it does not let us use them as fully as we would like. What we discussed at Connect is that:
1. The scheduler should store information about each board's "health check" job
2. The health check job can be automatically dispatched when the board transitions from offline->idle (comes back after being fixed)
3. The health check job can be automatically scheduled once per day to keep a running status, but should not be executed if the board is offline

Blueprint information

Status:
Complete
Approver:
Paul Larson
Priority:
Medium
Drafter:
Michael Hudson-Doyle
Direction:
Approved
Assignee:
Michael Hudson-Doyle
Definition:
Approved
Series goal:
Accepted for trunk
Implementation:
Implemented
Milestone target:
2012.03
Started by
Paul Larson on 2012-02-23
Completed by
Michael Hudson-Doyle on 2012-03-08

Related branches

Sprints

Whiteboard

Meta:
Headline: LAVA Scheduler can store health check jobs to execute automatically when boards come back online
Acceptance:
1. The scheduler should store information about each board's "health check" job
2. The health check job can be automatically dispatched when the board transitions from offline->idle (comes back after being fixed)
3. The health check job can be automatically scheduled once per day to keep a running status, but should not be executed if the board is offline
Roadmap id: LAVA2012-LAVA-HEALTH-MANAGEMENT

I think we can do this with a few judicious changes to getJobForBoard.


Work Items

Work items:
Add health check definition to DeviceType: DONE
Return health check job from getJobForBoard if board state unknown: DONE
Return health check job from getJobForBoard if last health check was more than 24 hours ago: DONE
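As a rough illustration of the work items above, the decision could look something like the following sketch. It is not the scheduler's actual code: the classes, field names, the in-memory queue, and the `bring_online` helper are all hypothetical stand-ins, and `get_job_for_board` is only a plain-Python approximation of what getJobForBoard might decide. It assumes the health check definition lives on the device type, that a board coming back from offline has its health reset to unknown, and that an offline board is never handed a job.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import List, Optional

# Assumed interval, taken from the "once per day" acceptance item.
HEALTH_CHECK_INTERVAL = timedelta(hours=24)


class Health(Enum):
    UNKNOWN = "unknown"
    PASS = "pass"
    FAIL = "fail"


@dataclass
class DeviceType:
    name: str
    # Hypothetical field: the health check job definition for this type.
    health_check_job: Optional[str] = None


@dataclass
class Board:
    hostname: str
    device_type: DeviceType
    online: bool = True
    health: Health = Health.UNKNOWN
    last_health_check: Optional[datetime] = None


def bring_online(board: Board) -> None:
    """Offline -> idle transition: forget the old health result (assumption)."""
    board.online = True
    board.health = Health.UNKNOWN


def health_check_due(board: Board, now: datetime) -> bool:
    """True if the board should run its health check before anything else."""
    if board.device_type.health_check_job is None:
        return False  # nothing configured for this device type
    if not board.online:
        return False  # never run health checks on offline boards
    if board.health is Health.UNKNOWN or board.last_health_check is None:
        return True
    return now - board.last_health_check > HEALTH_CHECK_INTERVAL


def get_job_for_board(board: Board, queue: List[str],
                      now: datetime) -> Optional[str]:
    """Pick the next job: the health check if due, else the next queued job."""
    if not board.online:
        return None
    if health_check_due(board, now):
        return board.device_type.health_check_job
    return queue.pop(0) if queue else None
```

In this sketch the health check takes priority over queued jobs whenever it is due, so a board that has just been repaired proves itself healthy before it receives regular work. Recording the result (updating `health` and `last_health_check`) is left to whatever handles job completion.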

This blueprint contains Public information 
Everyone can see this information.