Restarting libvirtd breaks Eucalyptus NC

Bug #512887 reported by ariel
This bug affects 1 person
Affects              Status        Importance  Assigned to      Milestone
eucalyptus (Ubuntu)  Fix Released  Medium      Dustin Kirkland
  Lucid              Fix Released  Medium      Dustin Kirkland

Bug Description

Steps to reproduce:

- make sure Eucalyptus NC is up and running
- run /etc/init.d/libvirt-bin restart
- try to start a new instance. If all other NCs are busy and the new request lands on this NC, then
      /var/log/eucalyptus/nc.log
  will show:

[EUCAINFO ] doRunInstance() invoked (id=i-4912092F cores=2 disk=10 memory=512)
...
[EUCAERROR ] libvirt: cannot send data: Broken pipe (code=38)
...
[EUCAINFO ] currently running/booting: i-4912092F
[EUCAERROR ] libvirt: cannot send data: Broken pipe (code=38)
[EUCAFATAL ] hypervisor failed to start domain
...
[EUCAINFO ] doTerminateInstance() invoked (id=i-4912092F)
[EUCAERROR ] libvirt: cannot send data: Broken pipe (code=38)
[EUCAWARN ] warning: domain i-4912092F to be terminated not running on hypervisor

Apparently the NC keeps trying to use the old libvirtd socket and doesn't notice that the daemon was restarted.
Another VERY nasty consequence is that Eucalyptus loses track of the instances that were already running on that node (even though they keep running in KVM)!

In CC.log:
[EUCAINFO ] TerminateInstances(): calling terminate instance (i-42A10746) on (141.x.x.x)
[EUCAERROR ] ERROR: TerminateInstance() could not be invoked (check NC host, port, and credentials)

while on the NC we still have:
# virsh list
Connecting to uri: qemu:///system
 Id Name State
----------------------------------
 107 i-42A10746 running

At the very least, the libvirt-bin and eucalyptus-nc upstart dependencies should be set up so that restarting libvirtd also restarts the NC. Ideally, and much better, the Eucalyptus NC itself should be fixed to handle a restarted libvirtd.
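
To illustrate the second option, here is a minimal sketch of the general "reopen the connection on a broken pipe" pattern. It is not Eucalyptus or libvirt code: `ReconnectingClient`, the `connect` factory and the method names passed to `call` are all hypothetical, and it assumes the stale connection surfaces as a `BrokenPipeError` or `ConnectionResetError`, which may not match how a real libvirt client reports the error.

```python
class ReconnectingClient:
    """Hypothetical wrapper: reopens its connection when the old one is broken."""

    def __init__(self, connect, max_retries=1):
        # `connect` is an assumed factory that returns a fresh connection object.
        self._connect = connect
        self._max_retries = max_retries
        self._conn = connect()

    def call(self, method, *args):
        """Invoke `method` on the connection, reconnecting on a broken pipe."""
        attempts = 0
        while True:
            try:
                return getattr(self._conn, method)(*args)
            except (BrokenPipeError, ConnectionResetError):
                # The peer (e.g. a restarted daemon) closed the old socket.
                if attempts >= self._max_retries:
                    raise  # give up and let the caller see the error
                attempts += 1
                # Drop the stale handle and open a new connection.
                self._conn = self._connect()
```

The retry limit is there so that a daemon which is really down still produces an error instead of an endless reconnect loop. Note that reconnecting alone would not restore the NC's knowledge of instances that were already running; that state would have to be re-read from the hypervisor after the new connection is opened.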

Description: Ubuntu 9.10
Release: 9.10

ii euca2ools 1.0+bzr20091007-0ubuntu1.1 managing cloud instances for Eucalyptus
ii eucalyptus-common 1.6~bzr931-0ubuntu7.4 Elastic Utility Computing Architecture - Com
ii eucalyptus-gl 1.6~bzr931-0ubuntu7.4 Elastic Utility Computing Architecture - Log
ii eucalyptus-nc 1.6~bzr931-0ubuntu7.4 Elastic Utility Computing Architecture

Thierry Carrez (ttx) wrote :

Needs reproduction on lucid / 1.6.2

Changed in eucalyptus (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed

Dustin Kirkland  (kirkland) wrote :

We should test and confirm/fix and close this next week at the sprint.

Changed in eucalyptus (Ubuntu):
milestone: none → lucid-alpha-3

Dustin Kirkland  (kirkland) wrote :

Still a problem in current Lucid.

Dustin Kirkland  (kirkland) wrote :

Dan, looking for some advice here...

eucalyptus-nc definitely is not surviving libvirt-bin restarts. It loses the socket. Is there an obvious way we can get eucalyptus-nc to re-attach the socket on a libvirt restart?

Dustin Kirkland  (kirkland) wrote :

Never mind, Dan, I'm on it...

Changed in eucalyptus (Ubuntu Lucid):
assignee: nobody → Dustin Kirkland (kirkland)
status: Confirmed → In Progress
Changed in eucalyptus (Ubuntu Lucid):
status: In Progress → Fix Committed

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package eucalyptus - 1.6.2~bzr1166-0ubuntu3

---------------
eucalyptus (1.6.2~bzr1166-0ubuntu3) lucid; urgency=low

  * debian/eucalyptus-nc.upstart: handle libvirt restarts, LP: #512887
  * eucalyptus-cc.eucalyptus-cc-publication.upstart,
    eucalyptus-cloud.eucalyptus-cloud-publication.upstart,
    eucalyptus-cloud.upstart, eucalyptus-common.eucalyptus.upstart,
    eucalyptus-nc.eucalyptus-nc-publication.upstart,
    eucalyptus-nc.upstart,
    eucalyptus-sc.eucalyptus-sc-publication.upstart,
    eucalyptus-sc.upstart,
    eucalyptus-walrus.eucalyptus-walrus-publication.upstart,
    eucalyptus-walrus.upstart, uec-component-listener.upstart: add a few
    inline comments, including a comment at the top of every upstart script
    that seems to be required to get vim syntax highlighting to work
  * eucalyptus-cc.postrm, eucalyptus-cloud.postrm,
    eucalyptus-common.postrm, eucalyptus-sc.postrm,
    eucalyptus-walrus.postrm, uec-component-listener.postrm: fix package
    purging with per-package file purging lists, LP: #503063
  * eucalyptus-cc.eucalyptus-cc-publication.upstart,
    eucalyptus-sc.eucalyptus-sc-publication.upstart,
    eucalyptus-walrus.eucalyptus-walrus-publication.upstart: stop publication
    jobs if the relevant service stops running
 -- Dustin Kirkland <email address hidden> Wed, 03 Feb 2010 19:01:47 -0800

Changed in eucalyptus (Ubuntu Lucid):
status: Fix Committed → Fix Released

ariel (garcia) wrote :

Great, thanks!

BTW, shouldn't this issue be reported upstream, so that the NC is fixed to not lose the socket?
Fixing eucalyptus-nc.upstart will handle the case of a "civilized" restart by the sysadmin, but not the case where, for instance, libvirtd dies and gets started by hand with /usr/sbin/libvirtd -d, right? And the fix applies only to Ubuntu ;-)
Should I report it upstream, does it get forwarded automatically, or does somebody else take care of it?

Dustin Kirkland  (kirkland) wrote : Re: [Bug 512887] Re: Restarting libvirtd breaks Eucalyptus NC

Hmm, you can report it upstream, if you like. Upstream uses sysvinit
scripts rather than upstart, so the issue may or may not be present
there, I'm not entirely sure...

ariel (garcia) wrote :

It is upstream bug https://bugs.launchpad.net/eucalyptus/+bug/517340

No, IMHO they shouldn't "fix" it at the init script level but in the code itself.
