Upstart service readiness

Registered by James Hunt on 2012-05-01

Upstart currently considers a service "ready" (fully initialised) once:

- [Services] The process has forked the expected number of times (0-2)
- [Tasks] The process has been exec'd successfully

For daemons therefore, "service readiness" is inextricably linked to the
overloaded 'expect' stanza which is also used for PID tracking.

The problem is that some services (such as cups) are _not_ ready once they have forked 'n' times.

The proposal is to introduce a new 'ready on' stanza coupled with a 'ready' event that would allow explicit control over when Upstart deems a service to be in a usable state:


- No change to existing 'expect' behaviour.
- If no 'ready on' condition specified, 'ready' event emitted immediately
  after 'started'.
- If 'ready on' condition specified, 'ready' event emitted if and when
  condition becomes true.
- 'ready' event can optionally be used by other services as a more
  reliable way to know when a service is fully initialized and thus usable.


- Possible to specify multiple values in a 'ready on' condition, such as:
  "ready on (dbus and file FPATH=/var/log/myapp.log and socket PROTO=inet PORT=80)"
  "ready on stopped myjob and started myjob2"
- upstart-socket-bridge will be retained, but with the advent of (C) it is
  no longer necessary to modify any daemons, as systemd requires for
  "socket activation".
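
The intended semantics of a 'ready on' condition can be sketched with a small Python model. This is only an illustration, not Upstart code: the class name `ReadyCondition` and the plain-string event names are hypothetical, and it models only the all-of ("and") case, ignoring event variables such as FPATH or PORT.

```python
class ReadyCondition:
    """Toy model of 'ready on A and B and C' (hypothetical, not Upstart's API)."""

    def __init__(self, required_events=()):
        # Events that must all be seen before the job counts as ready.
        self.required = frozenset(required_events)
        self.seen = set()

    def notify(self, event):
        """Record an event and report whether the job is now ready."""
        if event in self.required:
            self.seen.add(event)
        return self.is_ready()

    def is_ready(self):
        # With no 'ready on' condition the required set is empty, so the job
        # is ready at once, mirroring "'ready' emitted immediately after
        # 'started'".
        return self.seen >= self.required
```

A job with no condition is ready as soon as it is constructed, while a job waiting on "dbus", "file" and "socket" only becomes ready after `notify()` has been called with all three.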


- No change to existing 'expect' behaviour.
- Solves the readiness problem since .conf files would have a rich
  palette of sources of readiness to choose from which should cover 99%
  of all cases (udev, dbus, file, socket).
- More reliable behaviour.
- Would allow for simplification for jobs that currently fail to work
  solely via ptrace (for example, see gross hacks in /etc/init/cups.conf).

Work required:

- Finish (C).
- Implement (D) and (E).
- Modify upstart-udev-bridge to look at "ready on" job stanzas to allow
  "ready on <udev-event>".


- (D) would need to be accepted into the upstream kernel.
- (D) would not currently work in LXC containers since netlink is effectively disabled (as it is not namespace-aware). Correct fix would presumably be to make netlink ns-aware?
- (D) ties this feature to Linux rather heavily
  (*could* provide a very crude /proc/net/{tcp,udp} implementation but
  performance would be poor as file must be continually re-read!)
- (C) would need to use inotify (or fsnotify, to avoid the complexity of working around races in recursive inotify watches), but could be ported to other platforms
  (such as FreeBSD using kqueue).
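
To make the "very crude" /proc/net fallback mentioned above concrete, here is a minimal Python sketch of what polling /proc/net/tcp might look like. It assumes the usual Linux layout (a header row, the local address as `HEXIP:HEXPORT` in the second column, and state `0A` meaning LISTEN in the fourth) and covers TCP over IPv4 only. The function names are hypothetical.

```python
import time

TCP_LISTEN = "0A"  # state code for a listening socket in /proc/net/tcp


def listening_ports(proc_net_tcp_text):
    """Return the set of local ports in LISTEN state."""
    ports = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == TCP_LISTEN:
            # The port is the hexadecimal part after the last colon.
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports


def wait_for_port(port, path="/proc/net/tcp", interval=0.1, timeout=5.0):
    """Poll until something listens on `port`; re-reads the file every time."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with open(path) as f:
            if port in listening_ports(f.read()):
                return True
        time.sleep(interval)
    return False
```

The loop in `wait_for_port` shows the cost noted above: the whole file is read and parsed again on every iteration, whereas a netlink-based notification would avoid polling altogether.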


Alternative idea (from apw): put the onus on the daemons to inform Upstart when they are ready.

This is in fact already possible using 'expect stop' where Upstart waits for the application to send SIGSTOP before considering it ready. It could be extended to obtain the PID directly via sigaction(2) to avoid the need to obtain it via ptrace(2). Could go a stage further and provide some sort of formal API rather than a signal to allow a daemon to indicate readiness (coupled with a utility command to do the same).
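
The SIGSTOP handshake can be illustrated with a small self-contained Python sketch for POSIX systems. It is a model of the idea rather than of Upstart's implementation: the function names are hypothetical, and the supervisor here uses waitpid(2) with WUNTRACED to notice the stop, then sends SIGCONT so the process can carry on.

```python
import os
import signal


def daemon_main():
    # ... real initialisation would happen here ...
    os.kill(os.getpid(), signal.SIGSTOP)  # signal readiness by stopping
    os._exit(0)  # reached only after the supervisor sends SIGCONT


def supervise():
    """Fork a toy daemon; return (signalled_ready, exited_cleanly)."""
    pid = os.fork()
    if pid == 0:
        try:
            daemon_main()
        finally:
            os._exit(1)  # never fall back into the parent's code path
    # WUNTRACED makes waitpid() report stopped children, not just exits.
    _, status = os.waitpid(pid, os.WUNTRACED)
    ready = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGSTOP
    if not ready:
        return False, False  # child ended without signalling readiness
    os.kill(pid, signal.SIGCONT)  # let the now-ready daemon continue
    _, status = os.waitpid(pid, 0)
    return True, os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Calling `supervise()` returns a pair saying whether the child stopped itself and whether it then exited cleanly. Note that the supervisor already knows the PID here because it forked the child directly, which is not the case for a daemon that forks again before raising the signal.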


+ simple.
+ puts onus on daemons rather than Upstart.
+ potentially removes the need to use ptrace for PID tracking.
+ if the API idea were selected, this could be used with SysV jobs too (by providing a NOP implementation for the traditional SysV init).
+ no kernel support required (so would map across to other systems such as BSD/Hurd, if desirable).
+ could be standardized as part of the LSB since it would be init-system-agnostic.


- daemons may ignore the standard behaviour.
- we would need to modify every daemon in the archive to work with this model.
- highly unlikely that commercial vendors would modify their products unless it were an approved standard.
- putting control in the hands of the daemons is not necessarily desirable: consider if they go haywire - Upstart would not be able to control the problem as it may not yet know the PID.

Blueprint information

Not started
Steve Langasek
James Hunt
James Hunt
Series goal:
Accepted for quantal
Milestone target:
ubuntu-12.10-beta-1


Decision taken to not implement additional bridges since although they would allow the problem to be solved fully, real-world experience suggests the time between the final fork and the time the service is ready is not significant enough to worry about (and there are apparently no known services that couldn't be fully fixed by implementing the fork *and* exit counting).

For cookbook update on udev events (see work item below), see:

From the etherpad...


Service Readiness
We have issues with certain daemons: we cannot identify when they are 'ready'. How can we identify the correct moment? There are two main options:
1) update daemons to behave in a predictable manner
2) add new 'wait for X' methods to cover the main ready indicators (socket appears etc)
New stanza to list 'when they are ready'. 'expect' is currently intended to indicate that something is ready, with multiple of these possible. Following the PID is only a side effect of 'expect'.
cgroups: could we use those to track PIDs? We currently use setpgrp for this, but processes may escape. Note that escaping is used and expected for things like ssh.
rather than introducing a new 'ready on' stanza, could enhance 'expect' stanza to take an arbitrary condition such as 'expect daemon and socket...'. This implies we should make 'fork' and 'daemon' events.
Start on dbus name appearing on the bus. However there are none that implement this correctly. So this is not worth implementing because services don't actually do this in a non-racy manner.
'expect socket' is a very accurate measure of socket as we do not listen until we are ready, and if we are listening we will queue incoming requests.
'expect pidfile' is likely not accurate either: the pidfile is either written by the parent, or not written at the moment we are actually ready.
there are bugs in expect daemon but this is really felt to be the right solution to the problem. we are working on fixing these.
= Action =
- fix upstart-socket-bridge to work with IPv6.
- fix ptrace handling to count exits *as well as* forks [slangasek has a branch].
- fix post-start issue (bug 711635)
- fix pre-stop restart issue (bug 703800)
- update cookbook to explain that a 'start on *-device-added' event is *NOT* an indication that the device is "ready". Jobs should consider the specific behaviour of each h/w device. A general strategy is to check for the '*-device-added' AND the '*-device-changed' event with a variable 'ID_*=...'. For some kernel subsystems, udev add events *are* an indication of readiness.
kernel subsystems that this technique can be used for:
Generated by:
awk 'BEGIN{RS="";ORS="\n\n"}; /ACTION=add/ && /ID.*=/ { print; }' /var/log/udev|grep SUBSYSTEM=|sort -u
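
For readers less familiar with awk's paragraph mode, the pipeline above can be approximated in Python as follows. This is a sketch under assumptions: the function name is hypothetical, and the log is assumed to consist of blank-line-separated records of KEY=value lines, as the awk command implies.

```python
import re


def add_event_subsystems(udev_log_text):
    """Approximate the awk | grep | sort -u pipeline on a udev log."""
    subsystems = set()
    # RS="" in awk: records are separated by one or more blank lines.
    for record in re.split(r"\n{2,}", udev_log_text.strip()):
        # Keep records for "add" events that also carry some ID...= variable.
        # DOTALL lets '.' cross line breaks inside the multi-line record.
        if "ACTION=add" in record and re.search(r"ID.*=", record, re.DOTALL):
            # grep SUBSYSTEM=: keep only the subsystem lines.
            for line in record.splitlines():
                if "SUBSYSTEM=" in line:
                    subsystems.add(line)
    # sort -u: unique lines in sorted order.
    return sorted(subsystems)
```

The result is the list of distinct SUBSYSTEM= lines from add events that already include an ID variable, which is the set of subsystems the technique is said to apply to.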

- Since the concern is mostly that sockets do not yet exist (and similarly for other connection methods such as dbus), which surprises daemon users, launchd solves this by creating the sockets before it starts the daemon and then handing them off.
- Would it be preferable to reuse the work Red Hat put into the daemons by doing the same?
[vorlon] existence of sockets is not a major concern; traditional sequence- or dependency-based boot systems already require correct non-racy daemonization handling, which is sufficient to let upstart do the right thing. Furthermore, socket-based activation is fundamentally flaky in that it requires address or port lists to be specified in multiple places to ensure consistent, race-free startup behavior. While we want upstart to support socket-based activation correctly, we have no intention of relying on it within Ubuntu.


Work Items

Work items:
[vorlon] add in any additional test cases required for exit tracking: TODO
[jamesodhunt] update cookbook to explain semantics of 'start on *-device-added': DONE
[jamesodhunt] Update upstart-events(7) on udev *-device-added semantics: TODO
[jamesodhunt] Update upstart-udev-bridge(8) on udev *-device-added semantics: TODO