Record scheduler information for each run

Registered by Alvaro Lopez on 2013-10-11

Currently the scheduler only records information in the log files. We should implement some kind of (optional) monitoring for each of the scheduler runs (i.e. host states and why, instance resource utilization and the scheduler decision). This way it would be possible to use this information to debug what's going on during the scheduling. Moreover, this information could be used to simulate and analyze the scheduler's behaviour.

Blueprint information

Status:
Not started
Approver:
Andrew Laski
Priority:
Undefined
Drafter:
None
Direction:
Needs approval
Assignee:
Qiu Yu
Definition:
Review
Series goal:
None
Implementation:
Unknown
Milestone target:
None

Related branches

Sprints

Whiteboard

There's no assignee, is anyone stepping up to do the work? --alaski

Also the blueprint needs to be a lot more specific on exactly what would be added. --russellb

I drafted this after this review: https://review.openstack.org/#/c/48894/ As an operator, I find it very useful to have a trace (not a log, but a trace in a specific format) of all the scheduling decisions. This way it is possible to study each of the scheduler's passes and try to reproduce or simulate the results. I may be biased by batch systems (such as Gridengine) that normally enable this kind of logging; see http://arc.liv.ac.uk/SGE/htmlman/htmlman5/sge_schedule.html -- aloga
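As a rough illustration of what such a trace format could look like, here is a minimal sketch of a pipe-delimited record per scheduling decision, loosely modelled on Gridengine's schedule file. The function name and field set are purely hypothetical, not an established Nova format:

```python
def format_trace_line(request_id, instance_uuid, host, verdict, reason=""):
    """Build one pipe-delimited trace record per scheduling decision.

    The field set (request id, instance, host, verdict, reason) is an
    illustration only, loosely inspired by Gridengine's schedule file.
    """
    return "|".join([request_id, instance_uuid, host, verdict, reason])
```

A fixed, line-oriented format like this is trivial to grep and to feed into a simulator, which is the operator use case described above.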

Recently I've spent some time thinking about and investigating this issue, so I'm willing to assign myself to this BP.

Some drafting ideas:

== Information to be recorded ==

I'd like to take the following patch from Phil Day as a starting point.
    Adds useful debug logging to filter_scheduler
    https://review.openstack.org/#/c/28179/

     - instance_uuids at the start of schedule
     - request_spec at the start of schedule
     - hosts in weighted order and weight values
     - host selected for which instance_uuid
     - and others

Since the scheduler is now a queryable entity, and the cast-to-scheduler run_instance logic will soon be replaced by build_and_run_instance logic in the conductor [1], the debug logging patch above is going to stop working. Scheduling info should be collected in the conductor instead.

[1] https://blueprints.launchpad.net/nova/+spec/remove-cast-to-schedule-run-instance
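To make the bullet list above concrete, here is a minimal sketch of logging one scheduling pass. Everything here is illustrative (the logger name, function, and the `(host, weight)` tuple shape are assumptions, not Nova's actual internals), and it applies equally whether the collection point ends up being the scheduler or the conductor:

```python
import logging

LOG = logging.getLogger("nova.scheduler.trace")  # illustrative name

def log_schedule_pass(instance_uuids, request_spec, weighed_hosts):
    """Record one scheduling pass: inputs, weighed hosts, and the winner.

    `weighed_hosts` is assumed to be a list of (host_name, weight)
    tuples already sorted best-first.
    """
    LOG.debug("schedule start: instance_uuids=%s", instance_uuids)
    LOG.debug("request_spec=%s", request_spec)
    for host, weight in weighed_hosts:
        LOG.debug("weighed host %s weight=%.2f", host, weight)
    selected = weighed_hosts[0][0]
    LOG.debug("selected host %s for instance %s", selected, instance_uuids[0])
    return selected
```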

== API ==

Currently I have two API formats in mind.

Choice #1:

POST v2/{tenant_id}/os-schedulers/action
Request JSON:
{
    "list": {
    }
}

POST v2/{tenant_id}/os-schedulers/action
Request JSON:
{
    "show": {
        "action_id": "req-829d8550-2145-4672-994e-5077b27f3c72"
    }
}

NOTE(qiuyu): need to investigate further how to specify a time range, how many results to retrieve, result paging, and sorting.

Choice #2:
GET v2/{tenant_id}/os-scheduler-actions
Lists all scheduler actions. Permission could be specified in policy.json. By default, only admin can list actions.

GET v2/{tenant_id}/os-scheduler-actions/{action_id}
Gets details of a specified action for a specified scheduler request. Permission could be specified in policy.json. By default, only admin can show action details.
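Under choice #2 the two endpoints differ only in whether an action id is appended to the collection URL. A tiny helper (hypothetical, for illustration only, not part of any proposed implementation) shows the two URL shapes:

```python
def scheduler_actions_url(tenant_id, action_id=None):
    """Build the hypothetical os-scheduler-actions URL for choice #2.

    Without an action_id this is the list endpoint; with one, the
    detail endpoint for a single scheduler request.
    """
    base = "v2/%s/os-scheduler-actions" % tenant_id
    if action_id is None:
        return base
    return "%s/%s" % (base, action_id)
```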

I'm in favor of choice #1, for two reasons:
1. Since the scheduler is now a queryable entity, this API format can be extended, for example to a scheduling dry run.
2. It may be more flexible for handling paging logic? Not sure for the moment.

--qiuyu

@qiuyu: Where are you going to store all this information? In the DB? Wouldn't this impact performance? I was thinking of a solution at a lower level, since with my operator hat on I'd prefer a plain text file so I can easily take a glance at it.

qiuyu and I have been discussing this a bit on IRC and probably should have updated the whiteboard. I share your concerns about storing this information in the DB for an API to expose: not just for performance, but also for the added complexity of retention policies and cleanup, since it's unlikely operators will want it stored indefinitely. Logging or notifications seem like the simplest solutions, but notifications are more difficult to consume, and logging means the data may get mixed in with unrelated information. But maybe for now we leave it up to deployers to deal with that and work out a mechanism for separating it later.

Please find more discussion at the following link. --qiuyu
http://lists.openstack.org/pipermail/openstack-dev/2013-December/021650.html

[alaski] My preference would be to start small, and then improve on it. For a first pass I think it's reasonable to log this information along with the other logging that occurs. That would provide some usefulness immediately with little overhead. From there additional value can be added based on how it's used once it's available. At this point I'm not convinced that an API is going to be useful for this but once the information is available maybe I'll see the need.
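One low-overhead way to start small while still keeping scheduling lines out of the unrelated service logs (the mixing concern raised above) is a dedicated logger with its own file handler. This is a sketch under assumed names; nothing here is an actual Nova configuration:

```python
import logging

def get_schedule_trace_logger(path):
    """Return a logger that writes scheduling decisions to a dedicated
    file, keeping trace lines separate from the main service log.

    The logger name and format are illustrative assumptions.
    """
    logger = logging.getLogger("nova.scheduler.trace_file")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # don't duplicate lines into the root logger
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

Because the trace lives in its own file, retention becomes an ordinary logrotate problem for deployers rather than a DB cleanup policy.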

deferred from icehouse-3 to "next": http://lists.openstack.org/pipermail/openstack-dev/2014-February/026335.html

Removed from next, as next is now reserved for near misses from the last milestone --johnthetubaguy


Work Items
