Nova scheduler may race for compute resources

Registered by Brian Elliott

The scheduler is subject to a race condition that can cause it to misjudge the resources available on a particular compute host. The problem occurs when multiple scheduler instances or threads concurrently issue an instance build request (i.e. run_instance) to the same compute host, each relying on a view of the host's free resources that does not yet account for the other requests. This can oversubscribe the compute host and cause one or more run_instance requests to fail.
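The race described above is a check-then-act problem. The following is a minimal sketch of it, not Nova code: HostState, host_passes, and simulate_race are hypothetical names, and the RAM figures are made up for illustration. The interleaving is written out sequentially so the result is deterministic.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical snapshot of a compute host as one scheduler sees it."""
    name: str
    free_ram_mb: int


def host_passes(snapshot: HostState, requested_ram_mb: int) -> bool:
    # The "check" step: decided against a snapshot that may already be stale.
    return snapshot.free_ram_mb >= requested_ram_mb


def simulate_race() -> int:
    """Return the host's free RAM after two racing build requests."""
    actual_free_ram_mb = 4096  # assumed capacity of the host
    requested_ram_mb = 3072    # assumed size of each instance

    # Both schedulers take their snapshot before either build lands.
    snapshot_a = HostState("compute1", actual_free_ram_mb)
    snapshot_b = HostState("compute1", actual_free_ram_mb)

    for snapshot in (snapshot_a, snapshot_b):
        if host_passes(snapshot, requested_ram_mb):
            # The "act" step: nothing re-validates against current usage.
            actual_free_ram_mb -= requested_ram_mb

    return actual_free_ram_mb


if __name__ == "__main__":
    # A negative value means the host was oversubscribed.
    print(simulate_race())
```

Each scheduler's decision is correct against its own snapshot; the failure only appears once both requests reach the host.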

Blueprint information

Status:
Complete
Approver:
Vish Ishaya
Priority:
Medium
Drafter:
Brian Elliott
Direction:
Approved
Assignee:
Brian Elliott
Definition:
Approved
Series goal:
Accepted for folsom
Implementation:
Implemented
Milestone target:
2012.2
Started by
Vish Ishaya
Completed by
Vish Ishaya

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/scheduler-resource-race,n,z

Addressed by: https://review.openstack.org/9402
    Keep the ComputeNode model updated with usage

Addressed by: https://review.openstack.org/9540
    Adds generic retries for build failures.

Work Items

Added scheduling retries when build errors occur: DONE
Added resource tracking in the compute host to more gracefully control resource usage and provide up-to-date information to the scheduler: INPROGRESS
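The two work items above can be sketched together as follows. This is an illustration under stated assumptions, not the code from the linked reviews: HostResourceTracker, InsufficientResources, and schedule_with_retries are hypothetical names, only RAM is tracked, and the "scheduler" simply tries hosts in dictionary order. The idea shown is that the compute host atomically claims resources under a lock and rejects a request it cannot fit, and the caller retries on a different host.

```python
import threading


class InsufficientResources(Exception):
    """Raised when a host cannot fit the requested instance."""


class HostResourceTracker:
    """Hypothetical host-side tracker; not Nova's actual class."""

    def __init__(self, total_ram_mb):
        self._free_ram_mb = total_ram_mb
        self._lock = threading.Lock()

    def claim(self, ram_mb):
        # Check and deduct under one lock so concurrent builds
        # cannot both pass the check against the same free amount.
        with self._lock:
            if ram_mb > self._free_ram_mb:
                raise InsufficientResources(
                    "need %d MB, have %d MB" % (ram_mb, self._free_ram_mb))
            self._free_ram_mb -= ram_mb

    @property
    def free_ram_mb(self):
        with self._lock:
            return self._free_ram_mb


def schedule_with_retries(hosts, ram_mb, max_attempts=3):
    """Try to claim ram_mb on one of hosts (name -> tracker).

    Returns the chosen host name, or None if every attempt failed.
    """
    tried = set()
    for _ in range(max_attempts):
        candidates = [name for name in hosts if name not in tried]
        if not candidates:
            break
        name = candidates[0]
        try:
            hosts[name].claim(ram_mb)
            return name
        except InsufficientResources:
            tried.add(name)  # do not retry the host that just failed
    return None


if __name__ == "__main__":
    hosts = {
        "compute1": HostResourceTracker(4096),
        "compute2": HostResourceTracker(8192),
    }
    print(schedule_with_retries(hosts, 3072))
    print(schedule_with_retries(hosts, 3072))
```

In this sketch the second request is rejected by compute1, which has too little RAM left after the first claim, and is retried on compute2 instead of failing outright.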