Upgrade control to Ubuntu 12.04
Our production instance of LAVA, control, is currently running on Oneiric. We have recently encountered various issues related to not being on an LTS release as well as not matching the OS level of our staging/dogfood servers.
We need to put together a transition plan to move control to 12.04 with minimal service disruption.
Blueprint information
- Status: Started
- Approver: Alan Bennett
- Priority: Medium
- Drafter: Tyler Baker
- Direction: Approved
- Assignee: Dave Pigott
- Definition: Approved
- Series goal: Accepted for trunk
- Implementation: Started
- Milestone target: 2013.05
- Started by: Dave Pigott
- Completed by:
Whiteboard
[doanac, 2012-11-27] December is a short month. Let's just focus on the 192.168 network stuff and try this next year.
[mwhudson, 2012-11-30] Started a Google doc to flesh out the plan: https:/
[asac, 2013-02-20] Considering the workload for this milestone and the added work for the LAVA workshop, we decided that this shouldn't be done this month. We will keep it on the milestone for now to ensure it goes through the post-mortem, and we will reassess priority and milestone then.
[danilo, 2013-03-27] Moving to next month, keeping the priority at medium as per post-mortem. Waiting for instructions from mwhudson.
[davepigott, 2013-04-19] mwhudson tried the failover script and it failed. Will try again next week.
[asac, 2013-04-24] OK. Will we try this next month? Do we have the resources and plan in place to really make that happen? Can we set up everything on a different machine and just flip DNS over once done? I would like to see this chapter closed, as it has been plaguing us for far too long.
[mwhudson, 2013-04-26] "Setting up stuff on a different machine and flipping DNS over" is more or less what my failover scripts do. I tested them again last week and they mostly worked, although I forgot to shut down LAVA on the other nodes before I tested. The problem with testing the scripts is that you need to offline all boards first, and some Android jobs take a _very_ long time to finish.
[matty-hart, 2013-05-20] Agreed to delay until after the 13.05 release and to drop the failover setup. Instead there will be an agreed downtime: start taking boards offline on 2013-06-02 and upgrade on 2013-06-03.
Meta:
Headline: LAVA production server upgraded to Ubuntu 12.04 LTS
Acceptance: We've successfully upgraded control to 12.04 and are properly managing job submissions
Roadmap Id: LAVA MAINTENANCE
Work Items
Work items:
Audit packages that do not come from the primary archive on control: TODO
Upgrade Postgres to 9.1: DONE
Offline all boards (2013-06-02): TODO
Run full backup of control using something like "partimage": TODO
Do a backup of known important files (/usr/local/bin, /etc, /srv/lava, etc): TODO
Upgrade control to 12.04 (2013-06-03): TODO
Test functionality: TODO
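
The "backup of known important files" item above could be scripted roughly as follows. This is only a sketch, not the procedure actually used on control: it assumes Python 3.6 or later is available, the function name backup_paths and the destination directory /var/backups/lava are hypothetical, and it would need to run as root to read everything under /etc. It complements, rather than replaces, the full-disk image.

```python
import tarfile
import time
from pathlib import Path


def backup_paths(paths, dest_dir, label="control-pre-upgrade"):
    """Archive every existing path in `paths` into one gzip tarball.

    Returns the archive path and a list of paths that were skipped
    because they do not exist.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Timestamped name so repeated runs never overwrite an earlier backup.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{label}-{stamp}.tar.gz"

    skipped = []
    with tarfile.open(str(archive), "w:gz") as tar:
        for path in map(Path, paths):
            if path.exists():
                # Store members relative to / so extraction is not absolute.
                tar.add(str(path), arcname=str(path).lstrip("/"))
            else:
                skipped.append(str(path))
    return archive, skipped


if __name__ == "__main__":
    # Paths taken from the work item; the destination is an assumption.
    archive, skipped = backup_paths(
        ["/usr/local/bin", "/etc", "/srv/lava"],
        "/var/backups/lava",
    )
    print("wrote", archive)
    for path in skipped:
        print("skipped (missing):", path)
```

The archive should be copied off control before the upgrade starts, since a backup that lives only on the machine being upgraded is of little use if the upgrade goes wrong.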