Comment 14 for bug 1818914

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/c/openstack/nova/+/751367
Committed: https://opendev.org/openstack/nova/commit/75f9b288f8edfd24affe5ecbc1f3efb6a63726e4
Submitter: "Zuul (22348)"
Branch: stable/train

commit 75f9b288f8edfd24affe5ecbc1f3efb6a63726e4
Author: Stephen Finucane <email address hidden>
Date: Wed Aug 5 14:27:06 2020 +0100

    Don't unset Instance.old_flavor, new_flavor until necessary

    Since change Ia6d8a7909081b0b856bd7e290e234af7e42a2b38, the resource
    tracker's 'drop_move_claim' method has been capable of freeing up
    resource usage. However, this relies on accurate resource reporting.
    It transpires that there's a race whereby the resource tracker's
    'update_available_resource' periodic task can end up not accounting for
    usage from migrations that are in the process of being completed. The
    root cause is the resource tracker's reliance on the stashed flavor in a
    given migration record [1]. Previously, this information was deleted by
    the compute manager at the start of the confirm migration operation [2].
    The compute manager would then call the virt driver [3], which could
    take a not insignificant amount of time to return, before finally
    dropping the move claim. If the periodic task ran between the clearing
    of the stashed flavor and the return of the virt driver, it would find a
    migration record with no stashed flavor and would therefore ignore this
    record for accounting purposes [4], resulting in an incorrect record for
    the compute node, and an exception when the 'drop_move_claim' attempts
    to free up the resources that aren't being tracked.

    The solution to this issue is pretty simple. Instead of unsetting the
    old flavor record from the migration at the start of the various move
    operations, do it afterwards.

    Conflicts:
      nova/compute/manager.py

    NOTE(stephenfin): Conflicts are due to a number of missing cross-cell
    resize changes.

    [1] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1288
    [2] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4310-L4315
    [3] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4330-L4331
    [4] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1300

    Change-Id: I4760b01b695c94fa371b72216d398388cf981d28
    Signed-off-by: Stephen Finucane <email address hidden>
    Partial-Bug: #1879878
    Related-Bug: #1834349
    Related-Bug: #1818914
    (cherry picked from commit 44376d2e212e0f9405a58dc7fc4d5b38d70ac42e)
    (cherry picked from commit ce95af2caf69cb1b650459718fd4fa5f00ff28f5)