LTS upgrades

Registered by Michael Vogt

Prepare for the lucid -> precise LTS upgrade
* testing
* backporting apt from oneiric/precise to lucid for multiarch enabled upgrades
* dpkg-maintscript-helper is not available in lucid but is used in preinst scripts: we need some archive analysis to catch these cases and make sure they won't impact upgrades

Blueprint information

Status: Started
Approver: Steve Langasek
Priority: Essential
Drafter: Michael Vogt
Direction: Approved
Assignee: Matthias Klose
Definition: Approved
Series goal: Accepted for precise
Implementation: Good progress
Milestone target: None
Started by: Colin Watson

Whiteboard

WORK ITEMS:
[mvo] provide lts->lts upgrade tests in the auto-upgrade tester: DONE
[mvo] provide amd64 and i386 upgrade tests: DONE
check if the new host for the auto-upgrade-testing is capable of doing universe upgrade tests as well (clearly is as Jenkins is reporting them now): DONE
[mvo] backport multiarch libapt-pkg, libapt-inst for lucid https://bugs.launchpad.net/lucid-backports/+bug/899281: DONE
[mvo] backport multiarch python-apt build against backported libapt-pkg for lucid https://launchpad.net/~mvo/+archive/lucid-precise-upgrades/+packages: DONE
backport multiarch dpkg for lucid (we don't need this, do we?): POSTPONED
[cjwatson] ensure those backports are on the alternate CD for lts cdrom only upgrades: DONE
[mvo] modify the upgrade to ensure it's using the backported libapt/python-apt: DONE
test cdrom -> cdrom upgrades (no network)
for conffile prompts it would be great to have a way to specify "md5sum /etc/conffile keep" in some configuration (request from Google and global services): POSTPONED
profile upgrade time to look for low-hanging fruit (man-db, maybe?): POSTPONED
[jibel] to take over automatic upgrade testing: DONE
need to appoint someone on desktop team for upgrade testing
[mvo] re-evaluate the --dry-run upgrade mode using AUFS
[mvo] test overlayfs for --dry-run (but will not work for lucid->precise, only oneiric->precise)
[mvo] improve apt-clone to store modified conffiles and unknown files in /etc
[mvo] backport apt-clone to lucid (lp:~mvo/apt-clone/lucid-backport): POSTPONED
get the apt-clone file data from launchpad bug reports and test upgrading those systems (apt-clone isn't available in lucid) (system state could be restored from system_state.tar.gz attached to bug reports)
[jibel] auto-upgrade-tester should be able to take an apt-clone image to create a base image: DONE
ensure that we check for transitional packages from lucid+1 -> precise (both if they are dropped or demoted)
grep maintainer scripts for maverick,natty,oneiric,precise looking for dpkg --compare-versions checks that have been dropped in precise and need to be re-added
provide test profile dapper->hardy->lucid->precise
[mvo] we can have a python-apt-upgrader binary package that just depends on libapt-pkg4.11 and the upgrader can fetch that and make apt-upgrader source only build the library, then it can be put into -updates and the upgrader can just install that: DONE
consider "defrag" of /var/lib/dpkg/info: cp -a info info.new && sync && mv info info.old && mv info.new info || mv info.old info.new && rm -rf info.old to speed up the operations (however there is a risk here, as there is a brief time when there is no info dir; u-m could clean up the mess if that happens): TODO
[jibel] provide an option to run the upgrade with "eatmydata" (for the upgrade tester): DONE
[jibel] Add test profile for universe (limit universe to applications ie packages with a .desktop file): DONE
[jibel] auto-upgrade-tester should collect all debconf prompts, not only configuration change prompts: DONE
[pitti] Provide a script that sets/reads popular settings we'd like to preserve on upgrade (keyboard layouts, desktop background, GTK theme, custom panel/desktop launchers): DONE
[pitti] Provide a script that checks the gdm -> lightdm upgrade (autologin, lightdm enabled upon upgrade): DONE
[pitti] Provide a script that sets/checks language settings: TODO
[jibel] Add test scripts provided by the desktop team to auto-upgrade-tester: DONE
improve failure handling when maintainer scripts fail by removing the failing package
improve failure handling when upgrades cannot be calculated by removing packages that cause the calculation failures
[cjwatson] Fix all packages using dpkg-maintscript-helper without suitable Pre-Depends on dpkg: DONE
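
The dpkg-maintscript-helper item above comes down to a per-package check: does a preinst call the helper, and if so, does the package declare a Pre-Depends on dpkg? The following is a minimal sketch of that check, not the tooling actually used for the archive scan. The function names are hypothetical, and it assumes the script text and the Pre-Depends field have already been extracted from each package. The comment stripping is a rough heuristic.

```python
HELPER = "dpkg-maintscript-helper"


def uses_helper(script_text):
    """Return True if a non-comment part of any line mentions the helper."""
    for line in script_text.splitlines():
        # Heuristic: treat everything after the first '#' as a comment.
        code = line.split("#", 1)[0]
        if HELPER in code:
            return True
    return False


def predepends_on_dpkg(pre_depends):
    """Return True if the Pre-Depends field names dpkg (any version)."""
    for clause in pre_depends.split(","):
        for alternative in clause.split("|"):
            # Drop the version restriction and any architecture qualifier.
            name = alternative.strip().split("(")[0].split(":")[0].strip()
            if name == "dpkg":
                return True
    return False


def is_risky(preinst_text, pre_depends):
    """Flag a preinst that calls the helper without a dpkg Pre-Depends."""
    return uses_helper(preinst_text) and not predepends_on_dpkg(pre_depends)
```

A scan would call is_risky() once per binary package and report the flagged ones for manual review.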

Depends on blueprint foundations-p-python-dh-improvements for dh_python2 lucid backport, other-p-plusonemaint-infrastructure for lintian dpkg-maintscript-helper check.

Other important upgrade concerns:

- Gathering success rate feedback: there is no tool in place yet to measure success and failure rates, gather the necessary feedback, and send in an upgrade report to further improve the process and the upgrade experience for the user, business or institution.
- Backups: Update manager currently does not prompt users to back up their data to safeguard it against unexpected events such as a power failure.
See some discussion here (reducing upgrade risks):
https://bugs.launchpad.net/bugs/876146

Session notes:
Provide as good an experience as possible
Leverage auto upgrade tester as soon as possible
 * multiarch by default in precise is a challenge
  * the versions in lucid are not capable of handling multiarch, so we need a process that makes sure the new apt is available before we even calculate the upgrade

Identify power users to help out with upgrades, selection of them?
 * maybe by those with ubuntu-dev-tools installed offer them to help out with the testing earlier
 * proposal: gradual rollout of the upgrade announcement after the release date, so that we get some initial feedback on the release before rolling out to everyone
 - DNS server change? segmenting logic
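
The notes leave the "segmenting logic" for a gradual rollout open. One possible approach, shown here purely as an illustration, is to hash a stable per-machine identifier into a fixed number of buckets and announce the upgrade only to buckets below the current rollout percentage. The identifier, the function names and the bucket count are all assumptions; nothing in the session specifies them.

```python
import hashlib


def rollout_bucket(machine_id, buckets=100):
    """Map a stable machine identifier to a bucket in [0, buckets)."""
    digest = hashlib.sha256(machine_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def should_announce(machine_id, percent_released):
    """Return True if this machine falls inside the released percentage."""
    return rollout_bucket(machine_id) < percent_released
```

Because the bucket depends only on the identifier, a machine that has been offered the upgrade keeps being offered it as the percentage grows.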

 If currently running lucid, we'll not prompt them for LTS to LTS upgrade until the point release comes out. (12.04.1).

We probably want to keep statistics for future on percentage of users that upgrade on their own, vs those because update-manager tells them.
Lintian - setting it up. RT request? Hardware? Colin looking into this.
Automatic testing of upgrades exists - running on special machine in data center, ideally needs to be moved to Jenkins.
 Already do some automated testing
  * non-interactive
  * if it can be upgraded, they upgrade it
  * install with most packages of main
  * ideally we would test universe
  * make conffile prompts a high priority bug
  * lts-lts-lts upgrade testing (dapper->hardy->lucid->precise)
  * test cdrom -> cdrom only upgrades (automatically if possible)
  * on cdrom -> cdrom upgrades we need to provide a backported apt with multiarch support on this CD
 * our default is now overlayfs instead of aufs - we need to check if we can do a dry-run upgrade using this new technology
  * for conffile prompts it would be great to have a way to specify "md5sum /etc/conffile keep" in some configuration (request from Google and global services)
 * easy way to avoid notifications for upgrades (only the admin group gets the notification; check if we have an easy way to disable it)
 * run the test upgrade with unsafe IO to get results faster
Nice to have silencing of conflict checker.
 * dpkg passes this info to libapt; we believe update-manager could be modified to allow some kind of site-based "preseeding" of answers for conffiles
  * mvo says this is very easy, well volunteered
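
As a rough illustration of the site-based "preseeding" idea, the sketch below parses rules of the form "md5sum /etc/conffile keep" and looks up the answer for a conffile whose current content matches the recorded checksum. The file format details, the "replace" action and the function names are assumptions made for this example, not an existing update-manager feature. Paths containing whitespace are not handled.

```python
import hashlib


def parse_rules(text):
    """Parse lines of '<md5sum> <conffile path> <action>' into a dict."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        md5, path, action = line.split()
        # "keep" comes from the notes; "replace" is a hypothetical counterpart.
        if action not in ("keep", "replace"):
            raise ValueError("unknown action: %s" % action)
        rules[(path, md5.lower())] = action
    return rules


def decide(rules, path, content):
    """Return the preseeded action for this conffile, or None to prompt."""
    digest = hashlib.md5(content).hexdigest()
    return rules.get((path, digest))
```

Keying the rule on the checksum means the answer only applies to the exact local modification the site administrator reviewed; any other content still produces a prompt.
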
Testing resource during cycle for upgrade testing? need to figure out what is appropriate.

 update-manager recoverability? big sync at the end. More risk actually. File system unpacks not guaranteed to be there. 90 minutes isn't so slow.
 Analysis of why it's slow? (see last UDS) Automatic upgrade testing has the stats, mvo has info.
 defrag before? saves time... consider
 ACTIONS:
 * profile upgrade time to look for low-hanging fruit (man-db, maybe?)
 * Patrick Wright and jibel to take over automatic upgrade testing
 * make the auto-upgrade-tester test both amd64 and i386
 * foundations team to sort out backport of apt (APT got an ABI break), python-apt, dh-python2 to lucid for multiarch, for use in update-manager
 * need to appoint someone on desktop team for upgrade testing
 * re-evaluate the --dry-run upgrade mode using AUFS
 * test overlayfs for --dry-run (but will not work for lucid->precise, only oneiric->precise)
 * get the apt-clone file data from launchpad bug reports and test upgrading those systems (apt-clone isn't available in lucid) (system state could be restored from system_state.tar.gz attached to bug reports)
 * auto-upgrade-tester should be able to take an apt-clone image to create a base image
 * adam creates a kvm image with a bizarre but functioning RAID/LVM setup as a base-image test
 * ensure that we check for transitional packages from lucid+1 -> precise (both if they are dropped or demoted)
 * grep maintainer scripts for dpkg --compare-versions
* scan the archive for dpkg-maintscript-helper usage in preinst without dpkg-dev pre-dependency and/or provide dpkg-maintscript-helper backport
* provide test profile dapper->hardy->lucid->precise
* concern: backports is not required to be mirrored so putting the backported apt/python-apt there may not be good enough
* we could simply put it into -proposed and never put it into -updates, this way we know it's on the mirror
* we can have a python-apt-upgrader binary package that just depends on libapt-pkg4.11 and the upgrader can fetch that and make apt-upgrader source only build the library, then it can be put into -updates and the upgrader can just install that
* ensure that the updated apt is also on the lucid CDROM so that it can be pulled in by the upgrader
* "defrag" /var/lib/dpkg/info: cp -a info info.new && sync && mv info info.old && mv info.new info || mv info.old info.new && rm -rf info.old to speed up the operations (however there is a risk here, as there is a brief time when there is no info dir; u-m could clean up the mess if that happens)
 * provide an option to run the upgrade with "eatmydata" (for the upgrade tester)
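
The "defrag" action item can be sketched with an explicit rollback step. This is a minimal illustration, assuming Python 3 on a Unix system, and the function name is hypothetical. Unlike the one-liner as written, it restores info.old to its original name if the final rename fails. The brief window with no directory at the original path is still there.

```python
import os
import shutil


def defrag_dir(path):
    """Rewrite the directory at `path` by copying it and swapping the copy in."""
    new, old = path + ".new", path + ".old"
    # Copy first; the original is untouched if this step fails.
    # copytree keeps modes and timestamps but, unlike `cp -a`, not ownership.
    shutil.copytree(path, new, symlinks=True)
    os.sync()
    os.rename(path, old)
    try:
        os.rename(new, path)
    except OSError:
        # Roll back so that `path` exists again; the copy is left in place.
        os.rename(old, path)
        raise
    shutil.rmtree(old)
```
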
nice to have:
 * changing the update-manager interface to make download vs. install two separate stages, *or*, parallelizing the download vs. install
 * Etienne to provide a list of settings that need to be preserved across upgrades
What kind of package level problems needed fixing?
 * depends a lot on the debian release cycle, like conflicts getting dropped, new features getting used,
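
One such package-level problem, noted in the action items, is a dpkg --compare-versions check that existed in an intermediate release's maintainer script but was dropped by precise. The sketch below extracts those checks from two versions of a script and reports the ones that disappeared. It is an illustration only: the function names are hypothetical, the regular expression covers just the simple one-line form, and fetching the scripts from the archive is left out.

```python
import re

# Matches: dpkg --compare-versions <lhs> <operator> <rhs>
COMPARE = re.compile(r"dpkg\s+--compare-versions\s+(\S+)\s+(\S+)\s+([^\s;]+)")


def find_version_checks(script_text):
    """Return (line number, lhs, operator, rhs) for each check found."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        for match in COMPARE.finditer(line):
            hits.append((number,) + match.groups())
    return hits


def dropped_checks(old_script, new_script):
    """Return checks present in old_script but missing from new_script."""
    old = {hit[1:] for hit in find_version_checks(old_script)}
    new = {hit[1:] for hit in find_version_checks(new_script)}
    return sorted(old - new)
```

Each reported check still needs a human decision, since a check may have been dropped deliberately.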

pitti, 2012-01-19:
 * Tests for gdm -> lightdm default and autologin migration (flexible to test more stuff): http://people.canonical.com/~pitti/tmp/test_lts_upgrade_system.py
 * Script to set up custom user configuration for lucid: http://people.canonical.com/~pitti/tmp/prepare_lts_upgrade_user.sh
 * Script to test custom user configuration migration after precise upgrade: http://people.canonical.com/~pitti/tmp/test_lts_upgrade_user.py

jibel 2012-01-19
eatmydata doesn't improve the performance of the automated upgrade tests. Virtual disks are set up with dynamic resizing, which causes most of the IO usage (tested on server upgrades). We cannot preallocate disk size due to host disk space limitations.
Other tests done to improve performance:
* raw disk instead of qcow2: no improvement
* virtual drive setting: cache=unsafe : 20s better (over 4 minutes)
