Just a note that I've finished investigating an issue that turned out to be the first item above, so it's a practical problem rather than a theoretical one.
We had a race (in Kilo, but with very similar code to what is in Liberty) between instances being migrated that are in the RESIZE_MIGRATED state (so the host/node have been updated but the numa_topology is stale) and the resource audit running on the destination.
The audit sees the instance and processes it in _update_usage_from_instances(), but using the stale instance.numa_topology, thus possibly accounting for the wrong host CPUs.
We've just submitted a local workaround that modifies _update_usage_from_instances() to ignore instances with a task_state of RESIZE_MIGRATED (so that they get handled by _update_usage_from_migrations() instead). So far it seems to help.
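In case it helps, here's a minimal sketch of the shape of that workaround. This is not the actual Nova code; the class and helper names below are stand-ins, and the real logic lives on the resource tracker. The idea is just to filter out mid-migration instances so the per-instance audit path never touches their stale numa_topology:

```python
# Hypothetical sketch of the workaround, not actual Nova code.
# Instances with task_state RESIZE_MIGRATED are excluded from the
# per-instance usage audit; the migration-based audit path
# (_update_usage_from_migrations() in the real code) accounts for
# them using the migration record instead.

RESIZE_MIGRATED = 'resize_migrated'


class Instance:
    """Stand-in for the Nova instance object, for illustration only."""
    def __init__(self, uuid, task_state=None):
        self.uuid = uuid
        self.task_state = task_state


def instances_to_audit(instances):
    """Return the instances the per-instance audit should account for.

    Instances that are mid-resize-migration are skipped, since their
    numa_topology may still describe the source host and would pin
    usage to the wrong host CPUs on the destination.
    """
    return [inst for inst in instances
            if inst.task_state != RESIZE_MIGRATED]


# Example: the mid-migration instance is left to the migration audit.
insts = [Instance('aaa-111'), Instance('bbb-222', task_state=RESIZE_MIGRATED)]
audited = instances_to_audit(insts)
```

With that filter in place, the destination-side audit only double-checks instances whose topology is settled, which is why it avoids the race window described above.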