PCI devices are sometimes not freed after a migration

Bug #1641750 reported by Ludovic Beliveau
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Steven Webster

Bug Description

Description
===========

During stress testing of cold migration, it has been observed that sometimes the PCI devices are not freed by the resource tracker on the source node.

If the periodic resource audit kicks in on the source node in the middle of the migration, the instance uuid is moved from tracked_migrations to tracked_instances. In that case the PCI devices won't get freed, because the current logic in the code only checks tracked_migrations (see https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L355).
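
The race can be illustrated with a toy model. The sketch below is not nova's resource tracker: the class ToyTracker, the helper simulate, every method other than the name drop_move_claim, and the device counts are hypothetical. It only assumes the behaviour described above, namely that the audit re-files the uuid under tracked_instances and that drop_move_claim consults tracked_migrations alone.

```python
class ToyTracker:
    """Toy stand-in for the source-node tracker; NOT nova's real class."""

    def __init__(self, pci_pool_size):
        self.free_pci = pci_pool_size
        self.tracked_migrations = {}  # instance uuid -> PCI devices held
        self.tracked_instances = {}   # instance uuid -> PCI devices held

    def claim_for_migration(self, uuid, pci_count):
        # The migrating instance holds pci_count devices on this node.
        self.free_pci -= pci_count
        self.tracked_migrations[uuid] = pci_count

    def periodic_audit(self):
        # Assumed effect of the audit: in-flight migrations are re-filed
        # under tracked_instances.
        self.tracked_instances.update(self.tracked_migrations)
        self.tracked_migrations.clear()

    def drop_move_claim(self, uuid):
        # Behaviour reported in this bug: only tracked_migrations is checked,
        # so a uuid that was re-filed by the audit is silently skipped.
        if uuid in self.tracked_migrations:
            self.free_pci += self.tracked_migrations.pop(uuid)


def simulate(audit_during_migration):
    """Return the number of free PCI devices after the confirm stage."""
    tracker = ToyTracker(pci_pool_size=4)
    tracker.claim_for_migration("uuid-1", pci_count=1)
    if audit_during_migration:
        tracker.periodic_audit()
    tracker.drop_move_claim("uuid-1")
    return tracker.free_pci


if __name__ == "__main__":
    print("no audit during migration:", simulate(False))
    print("audit during migration:   ", simulate(True))
```

In this model, simulate(False) returns 4 (the device is released) while simulate(True) returns 3 (the device stays claimed), which mirrors the intermittent leak seen under stress testing.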

Steps to reproduce
==================

1) Boot a guest with an SR-IOV device.
2) Migrate the guest and confirm the migration.
3) Repeat step 2 over and over.

Expected result
===============

The PCI devices should be freed right away during confirm_resize. PCI resources such as PCI passthrough devices are limited in number, so they should not stay claimed until the next periodic audit, which is what happens in this case.

Actual result
=============

The PCI devices are not freed during the confirm_resize stage.

Environment
===========

$ git log -1
commit 633c817de5a67e798d8610d0df1135e5a568fd8a
Author: Matt Riedemann <email address hidden>
Date: Sat Nov 12 11:59:13 2016 -0500

    api-ref: fix server_id in metadata docs

    The api-ref was saying that the server_id was in the body of the
    server metadata requests but it's actually in the path for all
    of the requests.

    Change-Id: Icdecd980767f89ee5fcc5bdd4802b2c263268a26
    Closes-Bug: #1641331

Changed in nova:
assignee: nobody → Ludovic Beliveau (ludovic-beliveau)
status: New → In Progress
Changed in nova:
assignee: Ludovic Beliveau (ludovic-beliveau) → Steven Webster (swebster-wr)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/370374
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3a4909ae7e6294e45f09950ebca0b3d7126c80af
Submitter: Jenkins
Branch: master

commit 3a4909ae7e6294e45f09950ebca0b3d7126c80af
Author: Ludovic Beliveau <email address hidden>
Date: Wed Sep 14 14:44:46 2016 -0400

    Release PCI devices on drop_move_claim()

    On cold migration, drop_move_claim() is called in the confirm stage on the
    source node. Since the migration is being tracked by the resource tracker on
    the destination node, the source node has the instance in it's
    tracked_instances.

    So in this case the PCI devices were only freed on the next periodic audit.
    For PCI resources such as PCI passthrough, those are limited in number and
    should be freed right away.

    This patch fixes drop_move_claim() to also free PCI devices when an instance
    is in self.tracked_instances().

    Co-Authored-By: Steven Webster <email address hidden>
    Change-Id: Ie3392f80dfd2650048519c571ffaa11c025ad048
    Closes-Bug: #1641750
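
The idea behind the patch can be sketched in a toy model. This is not the merged nova code: the class FixedToyTracker, the helper simulate_fixed, and the device counts are hypothetical, and the only thing taken from the commit message is that drop_move_claim now also frees PCI devices when the instance is found in tracked_instances.

```python
class FixedToyTracker:
    """Toy stand-in for the source-node tracker; NOT nova's real class."""

    def __init__(self, pci_pool_size):
        self.free_pci = pci_pool_size
        self.tracked_migrations = {}  # instance uuid -> PCI devices held
        self.tracked_instances = {}   # instance uuid -> PCI devices held

    def claim_for_migration(self, uuid, pci_count):
        self.free_pci -= pci_count
        self.tracked_migrations[uuid] = pci_count

    def periodic_audit(self):
        # Assumed effect of the audit: in-flight migrations are re-filed
        # under tracked_instances.
        self.tracked_instances.update(self.tracked_migrations)
        self.tracked_migrations.clear()

    def drop_move_claim(self, uuid):
        # Check both containers, so the devices are released no matter
        # where the uuid is currently tracked.
        for tracked in (self.tracked_migrations, self.tracked_instances):
            if uuid in tracked:
                self.free_pci += tracked.pop(uuid)
                return


def simulate_fixed(audit_during_migration):
    """Return the number of free PCI devices after the confirm stage."""
    tracker = FixedToyTracker(pci_pool_size=4)
    tracker.claim_for_migration("uuid-1", pci_count=1)
    if audit_during_migration:
        tracker.periodic_audit()
    tracker.drop_move_claim("uuid-1")
    return tracker.free_pci


if __name__ == "__main__":
    print("no audit during migration:", simulate_fixed(False))
    print("audit during migration:   ", simulate_fixed(True))
```

In this model both calls return 4, so the device goes back to the pool at confirm time without waiting for the next periodic audit.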

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b1

This issue was fixed in the openstack/nova 16.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.opendev.org/641806
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ad9f37350ad1f4e598a9a5df559b9160db1a11c1
Submitter: Zuul
Branch: master

commit ad9f37350ad1f4e598a9a5df559b9160db1a11c1
Author: Matt Riedemann <email address hidden>
Date: Thu Mar 7 16:07:18 2019 -0500

    Update usage in RT.drop_move_claim during confirm resize

    The confirm resize flow in the compute manager
    runs on the source host. It calls RT.drop_move_claim
    to drop resource usage from the source host for the
    old flavor. The problem with drop_move_claim is it
    only decrements the old flavor from the reported usage
    if the instance is in RT.tracked_migrations, which will
    only be there on the source host if the update_available_resource
    periodic task runs before the resize is confirmed, otherwise
    the instance is still just tracked in RT.tracked_instances on
    the source host. This leaves the source compute incorrectly
    reporting resource usage for the old flavor until the next
    periodic runs, which could be a large window if resizes are
    configured to automatically confirm, e.g. resize_confirm_window=1,
    and the periodic interval is big, e.g. update_resources_interval=600.

    This fixes the issue by also updating usage in drop_move_claim
    when the instance is not in tracked_migrations but is in
    tracked_instances.

    Because of the tight coupling with the instance.migration_context
    we need to ensure the migration_context still exists before
    drop_move_claim is called during confirm_resize, so a test wrinkle
    is added to enforce that.

    test_drop_move_claim_on_revert also needed some updating for
    reality because of how drop_move_claim is called during
    revert_resize.

    And finally, the functional recreate test is updated to show the
    bug is fixed.

    Change-Id: Ia6d8a7909081b0b856bd7e290e234af7e42a2b38
    Closes-Bug: #1818914
    Related-Bug: #1641750
    Related-Bug: #1498126
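
To get a feel for how large the stale-usage window mentioned in the commit message can be, here is a rough back-of-the-envelope sketch. The function stale_usage_window is hypothetical, and it assumes a simplified schedule in which the periodic task fires at exact multiples of update_resources_interval, counted in seconds from time 0; real scheduling in nova is not guaranteed to be that regular.

```python
def stale_usage_window(confirm_at, update_resources_interval):
    """Seconds between a resize confirm and the next periodic run.

    Assumption (not taken from nova): the periodic task fires at exact
    multiples of update_resources_interval. A run that lands on the same
    second as the confirm is treated as having happened just before it.
    """
    if update_resources_interval <= 0:
        raise ValueError("update_resources_interval must be positive")
    elapsed_since_last_run = confirm_at % update_resources_interval
    return update_resources_interval - elapsed_since_last_run


if __name__ == "__main__":
    # Values from the commit message: the resize is auto-confirmed after
    # 1 second and the periodic task runs every 600 seconds.
    print(stale_usage_window(confirm_at=1, update_resources_interval=600))
```

Under these assumptions, a confirm 1 second after a periodic run leaves the old flavor's usage reported for 599 seconds, which is the kind of window the fix closes.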

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/665138
