Create Ubuntu Archive Snapshots swift-mirror

Registered by Dimitri John Ledkov

Initially this came out of a discussion on #ubuntu-cloud (or was it #juju?) when swift CDN-backed mirrors were announced for the Amazon EC2 cloud.

It is in a way related to servercloud-q-apt-improvements (the hash-based apt repositories).

Debian has the http://snapshot.debian.org/ repository. In essence it holds every package, in every version and for every architecture, ever published in Debian (more or less). Furthermore, it makes all of these packages available to standard apt-get by storing and serving the Release/Packages files produced by each publisher run (more or less).
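To make the apt-get side concrete, here is a minimal sketch that builds a sources.list line pointing at snapshot.debian.org for a given moment. The function name and its defaults (suite "unstable", component "main") are illustrative choices, not part of this blueprint; the timestamped URL layout is the one snapshot.debian.org documents.

```python
from datetime import datetime, timezone

SNAPSHOT_BASE = "http://snapshot.debian.org/archive"


def snapshot_sources_line(when, archive="debian", suite="unstable",
                          components=("main",)):
    """Return a 'deb' line for the archive as it looked at `when`.

    `when` must be a timezone-aware datetime; it is converted to UTC
    and rendered in the YYYYMMDDTHHMMSSZ form used in snapshot URLs.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return "deb {}/{}/{}/ {} {}".format(
        SNAPSHOT_BASE, archive, stamp, suite, " ".join(components))


if __name__ == "__main__":
    print(snapshot_sources_line(datetime(2012, 11, 1, tzinfo=timezone.utc)))
```

Release files in old snapshots usually carry an expired Valid-Until field, so apt typically has to be run with Acquire::Check-Valid-Until set to false when pointed at such a line.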

Currently in Ubuntu/Launchpad we do not have such a facility.

It is possible to retrieve individual source and binary packages from launchpadlibrarian, but the whole Ubuntu Archive as it was at a given point in time is not available.

Ideally, if the Ubuntu mirror is backed by an immutable CDN, it should be relatively cheap to also store and serve the repository files over the CDN, creating a service equivalent to snapshot.debian.org but only for the Ubuntu archive.

Another use case for this type of mirror would be the archive that currently holds automatically generated dbgsym packages. Currently dbgsym packages are not available for every version of the packages in the archive. For launchpad/whoopsie/daisy retraces it has been requested that all versions of dbgsym packages be kept and never removed; otherwise retraces fail and we lose important information from submitted core dumps.

Rationale:

Goal:

Blueprint information

Status:
Started
Approver:
Dave Walker
Priority:
Undefined
Drafter:
Ubuntu Server
Direction:
Approved
Assignee:
Dimitri John Ledkov
Definition:
Approved
Series goal:
Accepted for raring
Implementation:
Started
Milestone target:
None
Started by
Dave Walker

Related branches

Sprints

Whiteboard

# FEEDBACK
Discussed with xnox, this is making good progress.

see also: https://blueprints.launchpad.net/ubuntu/+spec/bibisect
User Stories:
Colin is pinged about a horrible and nasty bug in the installer. After performing a binary bisection across the whole archive over the past month, he narrows it down to a small number of uploads. After testing images with binaries from those uploads, the bug is pinpointed and fixed.
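The bisection in this story can be sketched as follows. This is only an illustration under stated assumptions: snapshots are ordered oldest to newest, the first one is known good, the last one is known bad, and the bug persists once it has appeared. The `is_bad` callback is hypothetical and stands for "build or install from this snapshot and check for the bug".

```python
from datetime import date, timedelta


def bisect_snapshots(snapshots, is_bad):
    """Return (last_good, first_bad) from an ordered list of snapshots.

    Assumes snapshots[0] is good and snapshots[-1] is bad, so those two
    endpoints are never tested themselves.
    """
    lo, hi = 0, len(snapshots) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(snapshots[mid]):
            hi = mid  # bug already present: look earlier
        else:
            lo = mid  # still fine: look later
    return snapshots[lo], snapshots[hi]


if __name__ == "__main__":
    # One month of daily snapshots; pretend the bug landed on 17 November.
    days = [date(2012, 11, 1) + timedelta(days=i) for i in range(30)]
    print(bisect_snapshots(days, lambda d: d >= date(2012, 11, 17)))
```

With daily snapshots over a month, this needs only about five test runs before the remaining uploads between the two returned snapshots can be inspected by hand.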

CD/cloud images are built from a snapshot, instead of having their own mirror.

Risks:
Wasted bandwidth / disk-space.

Test Plans:

Release Note:

Session Notes:
How to do this?
- expensive on the public cloud
- potentially a lot of storage (i386, amd64 ~ 1TB during quantal)
- preserve signatures

- Ideally we want daily granularity for the tradeoffs
- as well as taking manual snapshots
- useful for cdimage building (actually a local mirror is already doing this for ISOs)

- this is a pilot, test it for one cycle (due to resource constraints)
- keep a month at a time.....

- dbg symbols is a use-case (well, it is currently stored in a separate archive with

- backup to launchpadlibrarian (use a smart-proxy as a fallback to launchpadlibrarian)

- useful for full archive binary bisect

- avoiding cloud image and iso image skew
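The retention idea in these notes (daily snapshots, plus manual ones, kept for a month at a time) could look roughly like the sketch below. The notes do not say whether manual snapshots should outlive the one-month window; keeping them is an assumption made here, and it can be turned off with `keep_manual`. All names are hypothetical.

```python
from datetime import date, timedelta


def snapshots_to_keep(snapshots, today, keep_days=30, keep_manual=True):
    """Return the sorted snapshot dates that survive pruning.

    `snapshots` maps a date to either "daily" or "manual".
    Anything newer than `today - keep_days` is kept; older manual
    snapshots are kept only if `keep_manual` is true (an assumption,
    not something decided in the session).
    """
    cutoff = today - timedelta(days=keep_days)
    return sorted(
        day for day, kind in snapshots.items()
        if day > cutoff or (keep_manual and kind == "manual")
    )


if __name__ == "__main__":
    existing = {
        date(2012, 10, 1): "manual",
        date(2012, 10, 2): "daily",
        date(2012, 11, 20): "daily",
    }
    print(snapshots_to_keep(existing, today=date(2012, 11, 30)))
```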

Workitems:
- use cases, costs, stakeholders, solutions
- how much does the data grow?
- how is snapshot.debian.org created?

see also http://summit.ubuntu.com/uds-r/meeting/21105/bibisect/


Work Items

Work items:
[xnox] find out how snapshot.debian.org is set up/created (copies .debs + uses a database, code linked from snapshot.debian.org): DONE
[xnox] estimate how much capacity is required to store a rolling 1 month of snapshots (decided against copying .debs, we have mirrors + old-release + launchpadlibrarian as sources to pull the debs from): DONE
[xnox] implement a smart-proxy to launchpadlibrarian with rsynced packages: POSTPONED
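For reference, the postponed smart-proxy item could be approached along these lines: serve a package from the rsynced tree when it is there, otherwise redirect to launchpadlibrarian. This is a sketch, not the planned implementation. How a file name maps to a librarian URL is not described in this blueprint, so that step is left to a caller-supplied, hypothetical `librarian_lookup` function, and the URL in the demo is a placeholder.

```python
import os
import tempfile


def resolve_package(pool_path, local_root, librarian_lookup):
    """Decide how to serve `pool_path` (e.g. 'pool/main/h/hello/hello.deb').

    Returns ("local", file_path), ("redirect", url) or ("missing", None).
    `librarian_lookup` is a hypothetical callable: file name -> URL or None.
    """
    root = os.path.abspath(local_root)
    candidate = os.path.normpath(os.path.join(root, pool_path))
    # Refuse paths that escape the rsynced tree (e.g. via '..').
    if not candidate.startswith(root + os.sep):
        return ("missing", None)
    if os.path.isfile(candidate):
        return ("local", candidate)
    url = librarian_lookup(os.path.basename(pool_path))
    if url is not None:
        return ("redirect", url)
    return ("missing", None)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "pool"))
        open(os.path.join(root, "pool", "a.deb"), "w").close()
        known = {"b.deb": "http://librarian.example.invalid/b.deb"}
        for name in ("pool/a.deb", "pool/b.deb", "pool/c.deb"):
            print(name, resolve_package(name, root, known.get))
```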