Multiple image cache handlers

Registered by Alvaro Lopez on 2013-08-07

This blueprint aims to implement support for multiple image cache handlers on
compute nodes (only one manager will be chosen from the set of available
drivers). Currently there is only one ImageCacheManager, which removes
images older than N seconds. This applies to the libvirt driver (and baremetal,
since it uses libvirt's manager).

Features to implement:

BaseImageCacheManager
=====================

Split the current manager into an abstract base class, BaseImageCacheManager,
that all the other image cache managers will inherit from.
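
As a rough illustration of the split, the sketch below uses Python's abc module. It is not the actual Nova code: the method names (select_removable, update), the AgeBasedImageCacheManager class and the shape of the images mapping are all assumptions made for the example.

```python
import abc


class BaseImageCacheManager(abc.ABC):
    """Hypothetical common interface for image cache managers."""

    @abc.abstractmethod
    def select_removable(self, images):
        """Return the ids of cached images that may be deleted."""

    def update(self, images):
        # Shared entry point: delegate the selection policy to the subclass
        # and return the result in a stable order.
        return sorted(self.select_removable(images))


class AgeBasedImageCacheManager(BaseImageCacheManager):
    """Mimics the current policy: drop unused images older than N seconds."""

    def __init__(self, max_age_seconds):
        self.max_age_seconds = max_age_seconds

    def select_removable(self, images):
        # images maps image id -> (age in seconds, in_use flag); this layout
        # is an assumption for the example only.
        return [
            image_id
            for image_id, (age, in_use) in images.items()
            if not in_use and age > self.max_age_seconds
        ]
```

Other managers would then only need to override select_removable, while the periodic invocation stays in the shared code.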

Thresholds
==========

The current manager deletes all the aged files, regardless of how much free space is left on disk. We may add a threshold so that images are removed only once that threshold is reached.
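
One possible reading of the threshold, sketched below, is a minimum fraction of free disk space: cleanup runs only when free space drops below it. The function name cleanup_needed and the min_free_fraction parameter are hypothetical; shutil.disk_usage is the standard library call used to read the disk state.

```python
import shutil


def cleanup_needed(cache_dir, min_free_fraction, disk_usage=shutil.disk_usage):
    """Return True when free space under cache_dir is below the threshold.

    min_free_fraction is a hypothetical setting, e.g. 0.2 for "20% free".
    disk_usage is injectable so the check can be exercised without a real disk.
    """
    usage = disk_usage(cache_dir)
    return usage.free / usage.total < min_free_fraction
```

With this check in front of the removal step, a node with plenty of free space would keep its cached images even after they have aged.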

Configurable image cache manager
================================

Once managers share a common base class, the manager can be swapped, so we
should add a configuration option that selects which manager to use.
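
A minimal sketch of how such an option could be resolved, assuming its value is a dotted path to the manager class. The option itself and the helper name load_manager_class are hypothetical; only the standard library importlib is used here, not Nova's own configuration machinery.

```python
import importlib


def load_manager_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object it names."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.ClassName', got %r" % dotted_path)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Usage with a standard library class standing in for a real manager:
manager_class = load_manager_class("collections.OrderedDict")
```

A misspelled path fails with ImportError or AttributeError when the option is read, rather than later when the cache is first cleaned.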

Implementation of some other ImageCacheManager
==============================================

There are some more advanced managers that can be implemented:

 - A popularity-based ImageCacheManager.
 - LRU manager.
 - Protected images manager (i.e. once an image is cached, do not remove it).
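
The three ideas above differ only in how removal candidates are ordered or filtered. The sketch below illustrates that with plain functions; the names, the last_used and use_counts mappings, and the protected set are all assumptions for the example, not an existing API.

```python
def lru_order(last_used):
    """Image ids ordered least recently used first.

    last_used maps image id -> timestamp of the last use.
    """
    return sorted(last_used, key=last_used.get)


def least_popular_order(use_counts):
    """Image ids ordered least used first.

    use_counts maps image id -> number of times the image was used.
    """
    return sorted(use_counts, key=use_counts.get)


def pick_evictions(candidates_in_order, protected, count):
    """Take up to count candidates, skipping protected image ids."""
    return [i for i in candidates_in_order if i not in protected][:count]


# Usage: evict the two least recently used images, never touching "c".
order = lru_order({"a": 30, "b": 10, "c": 20})
evicted = pick_evictions(order, protected={"c"}, count=2)
```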

Blueprint information

Status:
Not started
Approver:
Joe Gordon
Priority:
Undefined
Drafter:
Alvaro Lopez
Direction:
Needs approval
Assignee:
Alvaro Lopez
Definition:
Review
Series goal:
None
Implementation:
Not started
Milestone target:
None

Whiteboard

Sounds good to me, but this may be hard to get in by icehouse-1 -- jogo

One nit on this. I'd like it to be updated such that the scope is not open ended. Specifically, the last part about another image cache manager should be more specific. What exactly *will* be implemented before we call this done? --russellb

russell, jogo: I've been thinking about this a bit. Maybe as a first step the first three items could be targeted for icehouse-1 (that is, refactoring the manager (I already have code for this), adding threshold support and making the manager configurable), although I also think this is difficult, and another blueprint could be created for the implementation of other cache managers. -- aloga

This sounds good, but I don't see any code yet, so I can see this slipping into Icehouse-2, marking as low priority -- johnthetubaguy

@johnthetubaguy I'm OK with that.

I'd like to understand more why we need multiple managers, instead of just improving the current one? -- mikalstill

I too would like to understand why we need multiple ones. If I understand correctly this is the image cache manager code that resides in the virt/libvirt driver. - garyk

@mikalstill: I will change the description of this BP so that it is focused on improving the current manager. Then I will file another BP for the LRU and popularity-based managers to discuss whether they are worthwhile, or whether they can just be extensions to the current manager.

@garyk: The current manager removes images that are older than a given age if they are not in use at that moment. This is great, but as an operator I would find it useful for my infrastructure to have a popularity-based manager (i.e. keep the popular images on the nodes, even if they are not in use at that moment). In some other scenarios I would also find an LRU manager useful. Both are just additions (just a different selection algorithm) to the manager that exists right now, not a replacement. Anyway, I think that I will refocus this blueprint on just improving the manager (for example, implementing thresholds) and create another one to discuss the need for different managers (time is tight at the moment), if you agree. -- aloga

@aloga: I have yet to understand the way that you intend to do this, and the picture is not really clear. At the moment we have the following: image_cache_manager is invoked every X seconds. This does the following: the driver selects which images are 'candidates' for aging. At the moment the only supported algorithm is simple aging, that is, if the image is not used for Y seconds then it is deleted. From the description above it is not clear whether you want to change the way that image_cache_manager is called or just add different aging algorithms - for example, have the driver register an image that is a candidate for aging, and then the aging driver returns whether it can be deleted.
Can you please elaborate? I see a number of pain points:
- Different drivers have different datastores - for libvirt the images may be on the local machine (maybe there is a shared file server - not sure if multiple nova-computes can delete at the same time). My point here is that we should be aware that access to the images is driver-specific.
- Do we want to have thresholds for the cache sizes? Would this be a % of the datastore size? In most cases one would not even need to age unless this is reached.

I'd like to take part in the developments so it would be nice if you could share some more information and we can discuss ideas about the details and implementation.

Gerrit topic: https://review.openstack.org/#q,topic:bp/multiple-image-cache-handlers,n,z

Addressed by: https://review.openstack.org/59994
    Image cache: move all of the variables to a common place

Gerrit topic: https://review.openstack.org/#q,topic:bp/vmware-image-cache-management,n,z

Addressed by: https://review.openstack.org/52630
    VMware: fix bug when more than one datacenter exists

Addressed by: https://review.openstack.org/62587
    VMware: fix bug when more than one datacenter exists

deferred from icehouse-3 to "next": http://lists.openstack.org/pipermail/openstack-dev/2014-February/026335.html

Unapproved - please re-submit via nova-spec --johnthetubagy (20th March 2014)

Removed from next, as next is now reserved for near misses from the last milestone --johnthetubaguy
