Policy rule for host status is UNKNOWN

Registered by melanie witt on 2019-06-06

Today, the status of a server is shown as ACTIVE when nova believes it should
be running. However, when there is an interruption in communication between
nova-api and nova-compute greater than the configured ``service_down_time``,
the server status still shows as ACTIVE, even though nova cannot actually tell
whether the server is running or shut off.

The ``host_status`` field shows the status of the underlying compute host and
is governed by the policy ``os_compute_api:servers:show:host_status``, which
defaults to admin-only. The policy is all-or-nothing: it exposes either every
host status (``UP``, ``DOWN``, ``MAINTENANCE``, and ``UNKNOWN``) or none. Some
operators may not wish to expose details such as a compute host status of
``UP``, ``DOWN``, or ``MAINTENANCE`` to end users, in a public cloud for example.

We propose adding a new API policy rule,
``os_compute_api:servers:show:host_status_unknown``, that defaults to
``admin_or_owner``. When this rule passes, the response will show a
``host_status`` of ``UNKNOWN`` if the ``get_instance_host_status`` API returns
``UNKNOWN``. Otherwise, the ``host_status`` field will be an empty string.
Today, the ``host_status`` field is not included in the server response for
non-admin users by default. The reasoning behind including it as an empty
string when it is not ``UNKNOWN`` is (1) to ensure the ``host_status`` field is
always present, whether the status is ``UNKNOWN`` or not, and (2) to avoid
inconsistency with the other values (``UP``, ``DOWN``, ``MAINTENANCE``) that
would be available for an admin user.
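
The behaviour proposed in this description can be sketched roughly as follows. This is an illustration only, not nova's code: the two rule names come from the text above, while ``visible_host_status`` and the ``allowed_rules`` set (the rules the caller has already passed) are hypothetical names introduced for the example.

```python
from typing import Optional, Set

# Rule names as written in the blueprint description.
SHOW_ALL_RULE = "os_compute_api:servers:show:host_status"
SHOW_UNKNOWN_RULE = "os_compute_api:servers:show:host_status_unknown"


def visible_host_status(host_status: str,
                        allowed_rules: Set[str]) -> Optional[str]:
    """Return the host_status value to expose, or None to omit the field.

    Hypothetical helper: allowed_rules stands in for the result of the
    policy checks for the requesting user.
    """
    if SHOW_ALL_RULE in allowed_rules:
        # Existing rule: every host status is visible.
        return host_status
    if SHOW_UNKNOWN_RULE in allowed_rules:
        # Proposed rule: only UNKNOWN is revealed; anything else is
        # reported as an empty string so the field is always present.
        return "UNKNOWN" if host_status == "UNKNOWN" else ""
    # Neither rule passes: the field is left out, as it is today.
    return None
```

Under this sketch, a user who passes only the new rule sees ``UNKNOWN`` for an unreachable host and an empty string for ``UP``, ``DOWN``, or ``MAINTENANCE``.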

This is proposed as a user experience improvement: non-admin end users would
get more transparent information about the actual state of their servers than
they have today.

Blueprint information

Status: Started
Approver: None
Priority: Undefined
Drafter: melanie witt
Direction: Needs approval
Assignee: melanie witt
Definition: Approved
Series goal: Accepted for ussuri
Implementation: Needs Code Review
Milestone target: None
Started by: Matt Riedemann on 2019-08-29

Whiteboard

Gerrit topic: https://review.opendev.org/#/q/topic:bp/server-status-unknown-if-host-status-unknown

Addressed by: https://review.opendev.org/666181
    Propose showing server status UNKNOWN when host status UNKNOWN

Gerrit topic: https://review.opendev.org/#/q/topic:vm-status-unknown

mriedem 20190801: I'm definitely more on board with the idea proposed in the blueprint description than the earlier revisions of the spec that changed the server status based on the host_status. A few thoughts/concerns:

- The proposed policy rule should conform to the new rules, e.g. "compute" prefix and such: https://docs.openstack.org/oslo.policy/latest/user/usage.html#naming-policies

- The host_status field is only returned for microversion >= 2.16 today - that should stay the same here regardless of policy.

- Since host_status in the response is conditional on passing a policy check today (admin or not) I think that should remain if the non-admin passes the policy check but the host_status is not UNKNOWN, e.g. if the non-admin with microversion 2.16 has host_status=DOWN then we'd NOT return the host_status field in the response.

- Keep an eye on performance impacts when listing servers with details since we don't want to have to make the same policy check per server while listing servers, so can we cache that or re-use a variable or something. This is an implementation detail but I'm reminded of https://bugs.launchpad.net/nova/+bug/1830260.

- Following on the last point, I think the rule should, at least initially, default to admin-only to match the existing rule for backward compatibility and then deployers can opt into exposing this to their users. One could probably make an argument either way about what the default should be though, but I tend to opt for backward compatibility especially when there is no microversion gating this.
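
A rough sketch of how the conditions in the list above could fit together when listing servers with details. It is not nova's implementation: ``build_views``, the ``authorize`` callable, and the ``(server_id, host_status)`` tuples are hypothetical, and the rule names are the ones used in the blueprint description rather than the renamed rule the first point asks for.

```python
from typing import Callable, Dict, Iterable, List, Tuple

SHOW_ALL_RULE = "os_compute_api:servers:show:host_status"
SHOW_UNKNOWN_RULE = "os_compute_api:servers:show:host_status_unknown"


def build_views(servers: Iterable[Tuple[str, str]],
                microversion: Tuple[int, int],
                authorize: Callable[[str], bool]) -> List[Dict[str, str]]:
    """Build per-server response dicts for a detailed server listing.

    servers yields (server_id, host_status) pairs; authorize(rule) is a
    stand-in for the policy check and returns True when the rule passes.
    """
    show_all = show_unknown = False
    if microversion >= (2, 16):
        # Evaluate each rule once per request and reuse the result,
        # rather than repeating the policy check for every server.
        show_all = authorize(SHOW_ALL_RULE)
        show_unknown = not show_all and authorize(SHOW_UNKNOWN_RULE)

    views = []
    for server_id, host_status in servers:
        view = {"id": server_id}
        # With only the new rule, the field is returned solely for UNKNOWN
        # and omitted for every other status.
        if show_all or (show_unknown and host_status == "UNKNOWN"):
            view["host_status"] = host_status
        views.append(view)
    return views
```

In this variant a non-admin user whose server is on a ``DOWN`` host gets no ``host_status`` key at all, and requests below microversion 2.16 never trigger a policy check.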

gibi 20190805: I agree with mriedem's proposal above.

Gerrit topic: https://review.opendev.org/#/q/topic:host-status-unknown-policy

Addressed by: https://review.opendev.org/679181
    Add new policy rule for viewing host status UNKNOWN

[efried 20190829] Discussed in nova meeting, agreed that barring some solid justification for this to preempt other work already bp/spec-approved and code-ready, it doesn't make sense to add it to our already-full plate.

melwitt 20190829: So you’re gonna not approve this to prevent people from reviewing my 117 line patch if they feel like it? OK. ¯\_(ツ)_/¯

Putting this back that got removed during a mid-air collision:

Deferred to Ussuri, let's re-propose/discuss when that opens up. -- mriedem 20190829

[mriedem 20191017] Per the nova meeting today:

http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-10-17-14.00.log.html#l-209

We agreed to approve this for Ussuri given melwitt said the patch accounts for the conditions above.
