Resource locks should have means to break on demand

Registered by aeva black

Resource locks should denote the time when the lock was taken, and allow that timestamp to be updated in the case of very-long-running tasks. This will allow other processes (human or automated) to gauge whether a process has become stalled or timed out, and take action if necessary.
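
As a rough illustration of the idea above, here is a minimal Python sketch of a lock record that carries a timestamp the holder can refresh. The class `ResourceLock`, its fields, and the `max_idle` threshold are all hypothetical names chosen for this example; they are not taken from Ironic's code or data model.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ResourceLock:
    """Hypothetical lock record; not Ironic's actual data model."""

    holder: str  # assumed identifier of the lock owner, e.g. a conductor host
    taken_at: float = field(default_factory=time.time)
    updated_at: Optional[float] = None

    def __post_init__(self):
        # Until the holder refreshes it, the lock was last "seen" when taken.
        if self.updated_at is None:
            self.updated_at = self.taken_at

    def touch(self, now=None):
        """Refresh the timestamp; a very-long-running task calls this periodically."""
        self.updated_at = time.time() if now is None else now

    def seconds_since_update(self, now=None):
        """Return how long ago the holder last refreshed the lock."""
        now = time.time() if now is None else now
        return now - self.updated_at

    def looks_stalled(self, max_idle, now=None):
        """Report whether the lock has gone unrefreshed for more than max_idle seconds."""
        return self.seconds_since_update(now) > max_idle


lock = ResourceLock(holder="conductor-1", taken_at=1000.0)
lock.touch(now=1300.0)                               # holder reports progress
print(lock.looks_stalled(max_idle=600, now=1500.0))  # False
print(lock.looks_stalled(max_idle=600, now=2000.0))  # True
```

The sketch only reports that a lock looks stalled; it deliberately does not release anything, leaving the decision to break the lock to whichever human or process is observing it.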

At the Icehouse summit, several proposals were put forward. It was generally agreed that the direction is good, but that automatic lock-breaking is too risky without substantial safeguards in place to prevent uninterruptible operations (e.g., firmware updates) from being interrupted.

  https://etherpad.openstack.org/p/IcehouseIronicFaultTolerance

Blueprint information

Status:
Not started
Approver:
aeva black
Priority:
Undefined
Drafter:
None
Direction:
Needs approval
Assignee:
None
Definition:
New
Series goal:
None
Implementation:
Unknown
Milestone target:
None

Whiteboard

Addressed by: https://review.openstack.org/#/c/55549/
  Allow clean reservation at update node

-----------
Based on discussion in yesterday's meeting
  http://eavesdrop.openstack.org/meetings/ironic/2013/ironic.2013-12-09-19.01.log.html
I am going to target the related bug report to I-2 (it's already set to HIGH priority), and also lower this blueprint's priority and untarget it from the Icehouse cycle. We need a way to manually break locks held by dead conductors and to prevent users from setting a lock in the API (both addressed by the bug), but we don't need automatic, time-based lock timeouts (discussed at the summit, fraught with potential issues).

--Devananda, 2013-12-10

Gerrit topic: https://review.openstack.org/#q,topic:bug/1250348,n,z

Addressed by: https://review.openstack.org/55549
    Allow clean reservation at update node

Gerrit topic: https://review.openstack.org/#q,topic:bp/breaking-resource-locks,n,z

Addressed by: https://review.openstack.org/70273
    Minor update for _check_clear_reservation

Addressed by: https://review.openstack.org/71212
    Add ability to break TaskManager locks via REST API

I'm still inclined to see this work happen, personally...
// jroll 2015-10-15

We're moving from using blueprints to track features to RFE bugs. I've filed one for your change (see related bugs section). Please track further work there using Closes-Bug, Partial-Bug or Related-Bug in commit messages and use this newly created RFE bug.
//vdrok 2015-12-16
