Upgrade control to Ubuntu 12.04

Registered by Andy Doan on 2012-10-25

Our production instance of LAVA, control, is currently running on Oneiric. We have recently encountered various issues related to not being on an LTS release as well as not matching the OS level of our staging/dogfood servers.

We need to put together a transition plan to move control to 12.04 with minimal service disruption.

Blueprint information

Status:
Started
Approver:
Alan Bennett
Priority:
Medium
Drafter:
Tyler Baker
Direction:
Approved
Assignee:
Dave Pigott
Definition:
Approved
Series goal:
Accepted for trunk
Implementation:
Started
Milestone target:
2013.05
Started by:
Dave Pigott on 2013-04-03

Whiteboard

[doanac, 2012-11-27] December is a short month. Let's just focus on the 192.168 network stuff and try this next year.
[mwhudson, 2012-11-30] Started a google doc to flesh out the plan https://docs.google.com/a/linaro.org/document/d/1K_FrpM0qaDCKd6fRHyt_NDf10lbqxVK0lyQRvXYPEW4/edit
[asac, 2013-02-20] Considering the workload for this milestone and the added work for the LAVA workshop, we decided that this shouldn't be done this month. We will keep it on the milestone for now to ensure it goes through post-mortem, and we will reassess priority and milestone then.
[danilo, 2013-03-27] Moving to next month, keeping the priority at medium as per post-mortem. Waiting for instructions from mwhudson.
[davepigott, 2013-04-19] mwhudson tried the failover script and it failed. Will try again next week.
[asac, 2013-04-24] ok; will we try this next month? Do we have the resources/plan in place to really make that happen? Can we set up stuff on a different machine and just flip DNS over once done? I would like to see this chapter closed as it has been plaguing us for far too long.
[mwhudson, 2013-04-26] "Setting up stuff on a different machine and flipping DNS over" is more or less what my failover scripts do. I tested them again last week and they mostly worked -- I forgot to shut down LAVA on the other nodes before I tested it though. The problem with testing the scripts is that you need to offline all boards before testing them and some Android jobs take a _very_ long time to finish.
[matty-hart, 2013-05-20] Agreed to delay until after the 13.05 release and to drop the failover setup. Instead there will be an agreed downtime: start taking boards offline on 2013-06-02 and upgrade on 2013-06-03.
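
The constraint mwhudson describes (all boards offline and long-running jobs finished before the switch) could be checked with a small polling loop. This is only a minimal sketch: `count_running_jobs` is a hypothetical callable supplied by the operator, not a LAVA API, and the timeout and polling interval are placeholder values.

```python
import time


def wait_for_drain(count_running_jobs, timeout_s=6 * 3600, poll_interval_s=60,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until no jobs are running or the timeout expires.

    count_running_jobs is a hypothetical callable returning the number of
    jobs still running; how it queries the scheduler is left to the operator.
    Returns True if the queue drained, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if count_running_jobs() == 0:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

A long timeout is the point here: the upgrade should only start once this returns True, or after someone decides to cancel the remaining jobs.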

Meta:
Headline: LAVA production server upgraded to Ubuntu 12.04 LTS
Acceptance: We've successfully upgraded control to 12.04 and are properly managing job submissions
Roadmap Id: LAVA MAINTENANCE

Work Items

Work items:
Audit packages that do not come from the primary archive on control: TODO
Upgrade Postgres to 9.1: DONE
Offline all boards (2013-06-02): TODO
Run full backup of control using something like "partimage": TODO
Do a backup of known important files (/usr/local/bin, /etc, /srv/lava, etc): TODO
Upgrade to 12.04 on control (2013-06-03): TODO
Test functionality: TODO
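
The "backup of known important files" item could be done with something like the sketch below. It is an illustration, not the agreed procedure: the function name and the archive destination are assumptions, and only the three paths named in the work item are listed (the trailing "etc" in that item is left for whoever runs it to fill in).

```python
import os
import tarfile

# Paths named in the work item; the original list is open-ended,
# so extend this tuple as needed.
IMPORTANT_PATHS = ("/usr/local/bin", "/etc", "/srv/lava")


def backup_paths(paths, archive_path):
    """Write a gzip-compressed tarball of the paths that exist.

    Returns the list of paths that were skipped because they were missing.
    """
    skipped = []
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in paths:
            if os.path.exists(path):
                archive.add(path)  # directories are added recursively
            else:
                skipped.append(path)
    return skipped
```

For example, `backup_paths(IMPORTANT_PATHS, "/mnt/backup/control-pre-upgrade.tar.gz")` (a hypothetical destination) would need to run as root to read everything under /etc. It complements the full-disk image rather than replacing it.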
