OpenStack Compute (nova)

Automatic recovery from transient db connection failures

Registered by aeva black on 2012-11-02

A variety of circumstances can cause a transient failure in database connections, for example: a restart or upgrade of the database, migration of a VIP between an HA pair, or just a network failure. Nova (and all projects connecting to a database) would benefit from the db/api catching these "db-has-gone-away" errors and automatically reconnecting and retrying the last operation, in such a way that the caller is able to continue whatever operation was in progress. It is not necessary to abort long-running operations (such as nova boot or glance image-create) just because of a momentary interruption in db connectivity.

A (slightly brute-force) patch was previously proposed: https://review.openstack.org/#/c/10797/. To enable retries safely, more work is probably going to be required.
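The reconnect-and-retry behaviour described above can be illustrated with a minimal sketch. It is not the proposed patch or the code that eventually merged. `DBConnectionError`, `retry_on_disconnect`, and `FlakyDB` are hypothetical names invented for illustration, and the sketch assumes the database driver's "gone away" errors have already been translated into a single exception type.

```python
import functools
import time


class DBConnectionError(Exception):
    """Hypothetical stand-in for a driver's 'db has gone away' error."""


def retry_on_disconnect(max_retries=3, retry_interval=0.01, backoff=2.0):
    """Retry the wrapped call when the (assumed) connection error is raised.

    The call is attempted once, then retried up to max_retries times,
    sleeping between attempts with an exponentially growing delay.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = retry_interval
            attempt = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except DBConnectionError:
                    attempt += 1
                    if attempt > max_retries:
                        # Retries exhausted: let the caller see the failure.
                        raise
                    time.sleep(delay)
                    delay *= backoff
        return wrapper
    return decorator


class FlakyDB:
    """Toy db/api object that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    @retry_on_disconnect(max_retries=3)
    def instance_get(self, instance_id):
        self.calls += 1
        if self.calls <= self.failures:
            raise DBConnectionError("connection lost")
        return {"id": instance_id, "state": "active"}
```

A caller such as `FlakyDB(failures=2).instance_get(42)` would not see the two simulated disconnects. Blind retries like this are only safe for operations that can be repeated without side effects, which is presumably part of the extra work the description says is needed before retries can be enabled safely.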

Blueprint information

Status: Complete

Approver: Russell Bryant

Priority: Low

Drafter: aeva black

Direction: Approved

Assignee: Viktor Serhieiev

Definition: Approved

Series goal: Accepted for icehouse

Implementation: Implemented

Milestone target: 2014.1

Started by: Viktor Serhieiev on 2013-07-05

Completed by: Viktor Serhieiev on 2014-03-12

Related branches

Related bugs

Bug #10797: nautilus smb browsing in hoary fails to show files

Invalid

Sprints

Whiteboard

johnthetubaguy: re-setting priority, need to go through a design discussion, and this is not yet targeted for icehouse-1 anyways.

Gerrit topic: https://review.openstack.org/#q,topic:bp/db-reconnect,n,z

Addressed by: https://review.openstack.org/#/c/33831/
Automatic retry db.api query if db connection lost

Patch that should implement the current blueprint

Addressed by: https://review.openstack.org/35610
Automatic reconnect to database (WIP)
'Proof-of-concept' patch in Nova

This blueprint was implemented in Oslo and came to Nova with patch https://review.openstack.org/#/c/75922/ (Sync the latest DB code from oslo-incubator)

Updating to icehouse rc1, since it merged with the above change. --johnthetubaguy

Work Items

Subscribers

Brent Eagles

David Xie

Frank Borkin

Kashi Reddy

Qiu Yu

Thuy Christenson