Switch CI to use custom AMIs

Registered by Fathi Boudra

We should create our own custom AMI(s) with all needed packages preinstalled.
Their creation should be fully scripted and general enough to support all our needs.

Blueprint information

Status:
Complete
Approver:
Данило Шеган
Priority:
High
Drafter:
Paul Sokolovsky
Direction:
Approved
Assignee:
Paul Sokolovsky
Definition:
Approved
Series goal:
Accepted for trunk
Implementation:
Implemented
Milestone target:
2012.06
Started by
Paul Sokolovsky
Completed by
Paul Sokolovsky

Whiteboard

Meta:
Headline: Linaro EC2 build systems now use optimized, custom AMIs as the basis for build slave creation.
Acceptance: Tools and process are established to set up, create and update AMIs across Linaro Cloud build systems (ci.linaro.org, android-build, cbuild).

[michaelh1] I have similar things with the cbuild EC2 instances. I start a plain Precise image then run a script to download and install the rest. It would be nice to turn the basics of this into an AMI to cut the startup time. See http://bazaar.launchpad.net/~linaro-toolchain-dev/cbuild/tools/view/head:/ec2slave-init.sh
[pfalcon 2012-06-11] To michaelh1: Yes, that's the idea: to establish a tool/process to create and maintain custom AMIs consistently across Linaro. It would definitely be based on the instance init scripts in use now (like yours; for Jenkins we have similar ones), but they may need some tweaking (e.g. /tmp mounting & at poweroff in your case).
[asac, Jun 11, 2012]: The blueprint needs a headline/acceptance and a complete set of work items that convey the delivery of the goals. At least the steps for rollout, validation and documentation of how to update the AMIs, plus a definition of which triggers would usually cause us to update the AMIs, would be good. Further, encoding the process for requesting changes to the AMI and rolling out new AMIs to the buildfarm in a "template" blueprint feels like the way to go.
[pfalcon 2012-06-11] Added an initial headline/acceptance (those always need some polishing by TL/PM) and extended the WIs. As for the AMI update process, I guess it will be simpler than that: either AMI owners will be able to run a script and produce an AMI directly as they need, or they can submit a ticket for the Infra team to do that. Again, this is worth settling at the TL/PM level.
[pfalcon 2012-06-13] It turns out there are 2 types of AMIs: inststore-based (instance store) and EBS-based ones. Dealing with the latter is fairly easy: create an instance from a base AMI, install the needed packages, stop the instance, create an AMI from it. Dealing with the former is considerably more complicated and requires working with raw FS images and manually uploading them to EC2. We want a streamlined process, so we would rather use custom EBS AMIs, but so far we use inststore ones on a-b and partially on ci.*. So our first step would be migrating away from them.
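The EBS-based flow described in the entry above (boot from a base AMI, install packages, stop, create an AMI) can be sketched roughly as follows. This is only an illustration, not the linaro-ami tool: it assumes an object that behaves like a boto3 EC2 client (boto3 postdates this blueprint), and the function name, provisioning script and default instance type are hypothetical. As a variation on the steps above, the instance powers itself off once the install finishes, so the "stopped" state doubles as the "provisioning done" signal.

```python
# Hypothetical provisioning script passed as EC2 user data. It installs the
# requested packages and then powers the instance off. With "set -e", a failed
# install never reaches poweroff, so no AMI is created from a broken instance.
PROVISION_SCRIPT = """#!/bin/sh
set -e
apt-get update
apt-get install -y {packages}
poweroff
"""


def build_custom_ami(ec2, base_ami, name, packages, instance_type="m1.small"):
    """Create an EBS-backed AMI from base_ami with packages preinstalled.

    ec2 is assumed to behave like boto3.client("ec2"); it is passed in so the
    flow can be exercised against a stub without touching AWS.
    """
    user_data = PROVISION_SCRIPT.format(packages=" ".join(packages))

    # 1. Create an instance from the base AMI; user data runs at first boot.
    reservation = ec2.run_instances(
        ImageId=base_ami,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        UserData=user_data,
        # An OS-level poweroff should stop the instance, not terminate it.
        InstanceInitiatedShutdownBehavior="stop",
    )
    instance_id = reservation["Instances"][0]["InstanceId"]

    try:
        # 2. Wait until the script has finished and the instance has stopped.
        ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

        # 3. Create the AMI from the stopped instance and wait until usable.
        image = ec2.create_image(InstanceId=instance_id, Name=name)
        ec2.get_waiter("image_available").wait(ImageIds=[image["ImageId"]])
        return image["ImageId"]
    finally:
        # 4. The builder instance is no longer needed, on success or failure.
        ec2.terminate_instances(InstanceIds=[instance_id])
```

With boto3 installed and credentials configured, a call would look like build_custom_ami(boto3.client("ec2"), "ami-xxxxxxxx", "my-build-slave", ["git", "make"]), where the AMI ID, AMI name and package list are placeholders. Long installs may exceed the default waiter timeout, in which case the waiter raises and the instance is still terminated.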
[pfalcon 2012-06-13] After successful initial testing, switching a-b from inststore ami-68ad5201 to EBS ami-87c31aee (both are Natty 64-bit).
[stevanr 2012-06-14] Changed inststore AMIs on ci.l.o to EBS, description here: https://pastebin.linaro.org/581/
[pfalcon 2012-06-15] Test build using custom AMI: https://android-build.linaro.org/builds/~pfalcon/step5-test/#build=4 (success)
[pfalcon 2012-06-15] Created https://code.launchpad.net/~linaro-aws-devs/linaro-aws-tools/linaro-ami with project code/config skeleton, suitable for TDD for example.
[stevanr 2012-06-15] Meeting minutes pfalcon vs. stevanr on https://wiki.linaro.org/Platform/Infrastructure/Spec/CustomAMIs#Meeting_minutes_pfalcon_vs_stevanr_on_15.06.
[stevanr 2012-06-15] After initial testing with a custom EBS AMI on ci.l.o, slave startup time dropped by about 3 min and job build execution time by around 5 min on https://ci.linaro.org/jenkins/job/stevanr-precise-nano-test/
Slave boot time with old AMI: 4 min 02 sec
Slave boot time with new AMI: 1 min 15 sec
[stevanr 2012-06-15] I think we should change the skeleton of the project just a bit. Let's add a separate directory for the Python code, since we will have more than one class (and thus more than one file) for this project.
[pfalcon 2012-06-18] Started pair programming style development of linaro-ami code.
[pfalcon 2012-06-20] The linaro-ami tool is able to produce a custom AMI based on the specified configuration. (It still requires refactoring and cleaning up.)
[pfalcon 2012-06-21] android-build has been switched to a custom AMI produced by the linaro-ami tool; works as expected.
[stevanr 2012-06-22] Finished linaro-ami tool cleanup. Writing README for the tool.
[stevanr 2012-06-29] Implemented changes from danilo's code review.
[danilo 2012-06-29] Moved the postponed items for ci.linaro.org into bug 1019257.
[pfalcon 2012-07-02] Moved the postponed items for cbuild into bug 1020022.
[pfalcon 2012-07-02] With the postponed items filed as bugs to follow up on, marking this as Implemented, as agreed at the postmortem meeting.

Work Items

Work items:
Switch android-build to EBS-based AMIs: DONE
[stevanr] Switch ci.* to EBS-based AMIs: DONE
Produce test AMI for android-build manually: DONE
Set up test build on android-build using test AMI: DONE
[stevanr] Set up test build on android-build using test AMI on ci.l.o: DONE
Settle on interface for package install/system prep scripts for various Linaro systems (ci.linaro.org, android-build, cbuild): DONE
Prepare skeleton structure of linaro-ami (sub)project: DONE
Develop script to automate AMI creation/update: DONE
[stevanr] Develop script to automate AMI creation/update: DONE
[stevanr] Produce AMIs for ci.linaro.org: POSTPONED
Produce AMIs for android-build: DONE
Produce AMIs for cbuild: POSTPONED
[stevanr] Update ci.linaro.org to use new AMIs: POSTPONED
Update android-build to use new AMIs: DONE
Update cbuild to use new AMIs: POSTPONED
Document Linaro custom AMI system: DONE