Allow groups of objects to have different replication strategies

Registered by clayg on 2013-04-19

In order to experiment with drastic changes (or even incremental improvements) to Swift's consistency processes, it is desirable to segregate groups of objects into different classes of storage policy.

Blueprint information

Needs approval
Series goal:
Milestone target:
Completed by John Dickinson on 2013-11-20


From an API perspective the most natural grouping of objects is probably the container level.

From an implementation point of view, it may be difficult to intermix objects with different storage policies inside the same partitions without exactly the kind of drastic changes that we would want to experiment with in a sandboxed pool of storage.

Either way, it seems likely that a second (or third, or fourth) storage policy will want to have its own ring. In fact, the first practical application may leverage the configurability of the existing ring and run Swift with both 2- and 3-replica rsync storage policies in the same cluster. We already manage and distribute 3 distinct rings - what's a few more? :\
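The "one ring per policy" idea above could be sketched as a tiny lookup helper. The naming convention used here (a `-<index>` suffix on the ring file, with policy 0 keeping the legacy unsuffixed name) is an assumption for illustration, not a settled design:

```python
import os.path

def object_ring_path(swift_dir, policy_index):
    """Hypothetical mapping from a policy index to its ring file.

    Assumes policy 0 keeps the legacy unsuffixed "object.ring.gz" name
    (so existing clusters keep working), and additional policies get a
    numeric suffix, e.g. "object-1.ring.gz".
    """
    if policy_index == 0:
        name = "object.ring.gz"                 # legacy, unsuffixed ring
    else:
        name = "object-%d.ring.gz" % policy_index
    return os.path.join(swift_dir, name)

print(object_ring_path("/etc/swift", 0))  # /etc/swift/object.ring.gz
print(object_ring_path("/etc/swift", 2))  # /etc/swift/object-2.ring.gz
```

Keeping policy 0 unsuffixed means a cluster that never opts into additional policies sees no change at all.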

Since it will likely continue to be possible to migrate from some storage policies to others (e.g. 3-replica -> 2-replica rsync, rsync -> ssync[1]), there should be a strong bias toward configurability in the name, the ring, and the data dir - even if a merger would require a time-consuming migration.
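That configurability might take a shape like the following hypothetical swift.conf fragment; the section names, option keys, and policy names here are illustrative assumptions, not a proposed interface. The index would tie each policy to its own ring and on-disk data dir, while the name stays a free-form label:

```ini
# Hypothetical /etc/swift/swift.conf fragment: each policy's index would
# select its ring (e.g. object-1.ring.gz) and data dir (e.g. objects-1);
# the name is just a human-readable, renameable label.
[storage-policy:0]
name = gold          # existing 3-replica rsync policy
default = yes

[storage-policy:1]
name = silver        # reduced-redundancy 2-replica rsync policy
```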

In the interest of scope, I'd like to limit this blueprint to objects. Many more parties have expressed interest in improving object replication than in rewriting container replication - and the usefulness of reduced-redundancy containers seems questionable. Of course, if in the course of design it becomes obviously sensible to do both, we can draft a second blueprint.


P.Luse> So I'm definitely interested in working on this, but possibly from a different angle where the implementations may line up. A few others and I have been thinking about a plan where we characterize the storage nodes in a cluster and use that data, along with an application hint, to route traffic to the most appropriate node(s). A possible implementation would be the use of separate rings, and one possible use case would be, for example, identifying nodes with faster storage versus slower storage (although a number of specific variants, both HW and SW, come to mind), effectively creating a set of tiers within the cluster. I would have put this comment on John's blueprint regarding tiers, but I think there could be some real value in thinking about reliability hints and performance hints at the same time. Not sure how many folks are familiar with Differentiated Storage Services, but there's a close tie-in there as well, and some potentially cool combinations. Will post a note on John's blueprint as well; looking forward to discussing this further.

Clay - take a look at the EC work, as it proposes an implementation of general multiple rings for storage policies

FYI: the Storage Policies patch set has been updated


Work Items
