On-demand data migration for Swift

Registered by Gil Vernik

We propose adding a dedicated data migration layer to Swift to migrate data efficiently from other storage clouds. This does not depend on any special support from those clouds. The proposed mechanism provides three main functions:

1. Per-container setup. In this step an existing (or new) local Swift container is "linked" to a remote container in another cloud (which need not be Swift).

2. Unified view and operations on the local Swift container. The Data Migration Layer provides a unified view of the data for the local container from step 1, i.e., objects in the remote container appear to be in the local container. This includes list operations.

3. Object migration on demand. Any object that is accessed through the local Swift container and has not yet been migrated is migrated immediately and stored in Swift.

Data migration functionality will be provided by proxy middleware (MW) that accesses the remote storage provider and migrates objects that do not yet exist in the local Swift container. The Data Migration MW sits on the data path, so no existing Swift API needs to change. The middleware also provides a unified view of the data in the container.
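
To make "on the data path" concrete, here is a minimal sketch of a WSGI filter wrapped around the proxy application. It is not the proposed implementation: the class name, the fetch_remote callable, the fake_swift stand-in and the AUTH_test account are all hypothetical, the response body is buffered for simplicity, and storing the fetched object back into Swift is omitted.

```python
class DataMigrationMiddleware:
    """Hypothetical WSGI filter that sits in front of the proxy app."""

    def __init__(self, app, fetch_remote):
        self.app = app
        # Assumed callable: path -> bytes, or None if the remote lacks it.
        self.fetch_remote = fetch_remote

    def __call__(self, environ, start_response):
        # Anything other than GET passes through untouched, so the
        # existing API behaves exactly as before.
        if environ.get("REQUEST_METHOD") != "GET":
            return self.app(environ, start_response)

        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold back the wrapped app's response until we inspect it.
            captured["status"] = status
            captured["headers"] = headers
            return lambda data: None

        # Simplification: buffer the whole body instead of streaming it.
        body = list(self.app(environ, capture))
        if captured["status"].startswith("404"):
            data = self.fetch_remote(environ.get("PATH_INFO", ""))
            if data is not None:
                start_response("200 OK", [("Content-Length", str(len(data)))])
                return [data]
        start_response(captured["status"], captured["headers"])
        return body


def fake_swift(environ, start_response):
    """Stand-in for the local Swift proxy: every object is missing."""
    start_response("404 Not Found", [("Content-Length", "0")])
    return [b""]


def fetch_remote(path):
    """Stand-in for the remote cloud: it only holds obj2."""
    return b"two" if path.endswith("/obj2") else None


app = DataMigrationMiddleware(fake_swift, fetch_remote)
statuses = []


def record(status, headers, exc_info=None):
    statuses.append(status)


result = app(
    {"REQUEST_METHOD": "GET", "PATH_INFO": "/v1/AUTH_test/container_new/obj2"},
    record,
)
print(statuses[0], b"".join(result))
```

The client still issues an ordinary GET; the only change is the extra layer between the client and the proxy application.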

Specifically, data migration to Swift will include:

Data Migration Proxy middleware
1. Migrates objects on demand.
2. Provides a unified view.

Storage Access Module
1. Accesses remote clouds for read-only operations.
2. Exposes an internal "public" interface. The implementation will be based on Apache Libcloud.
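
As an illustration of what a read-only interface for the Storage Access Module might look like, here is a minimal sketch. The class and method names are hypothetical and not taken from the proposed code. An in-memory backend stands in for a real provider; an actual implementation would presumably delegate these calls to an Apache Libcloud storage driver.

```python
from abc import ABC, abstractmethod


class RemoteStorageReader(ABC):
    """Hypothetical read-only interface to a container in a remote cloud."""

    @abstractmethod
    def list_objects(self, container):
        """Return the names of the objects in the remote container."""

    @abstractmethod
    def read_object(self, container, name):
        """Return the object's content as bytes; raise KeyError if absent."""


class InMemoryReader(RemoteStorageReader):
    """Illustration-only backend holding containers in a nested dict."""

    def __init__(self, containers):
        self._containers = containers

    def list_objects(self, container):
        return sorted(self._containers[container])

    def read_object(self, container, name):
        return self._containers[container][name]


remote = InMemoryReader(
    {"container_old": {"obj1": b"one", "obj2": b"two", "obj3": b"three"}}
)
print(remote.list_objects("container_old"))
print(remote.read_object("container_old", "obj2"))
```

Because the interface has no write or delete methods, the middleware cannot modify the remote cloud through it.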

To illustrate the flows, we detail two examples of user operations on "container_new" in Swift, which is linked to "container_old" in another cloud. Assume that "container_old" contains three objects (obj1, obj2, obj3) and "container_new" contains only obj1.

Container list operation:

A client sends Swift a list request for "container_new". The Data Migration MW intercepts this request and sends it both to the local Swift and to the other cloud, to list the objects of "container_old". Having both results, the Data Migration MW merges them into a single response comprising obj1, obj2, and obj3, even though obj2 and obj3 do not yet exist in Swift.
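
The merge step can be sketched as a set union over object names. This is a simplified illustration: the function name is hypothetical, and a real listing carries per-object metadata (size, timestamps) and pagination markers that are ignored here.

```python
def merge_listings(local_names, remote_names):
    """Return the sorted union of local and remote object names."""
    return sorted(set(local_names) | set(remote_names))


local_listing = ["obj1"]                   # what container_new holds
remote_listing = ["obj1", "obj2", "obj3"]  # what container_old holds
merged = merge_listings(local_listing, remote_listing)
print(merged)  # ['obj1', 'obj2', 'obj3']
```

obj1 appears once in the merged listing even though both containers hold it.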

Object read operation:

A client sends a GET request for container_new/obj2. Since obj2 does not yet exist in Swift, the Data Migration MW sends a GET container_old/obj2 request to the other cloud. Upon receiving obj2, the Data Migration MW stores it in Swift and then returns it to the client.
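
The read flow amounts to a read-through copy. The sketch below uses plain dictionaries as stand-ins for the local Swift container and the remote container; the class name and the object contents are hypothetical.

```python
class MigratingContainer:
    """Read-through view: serve locally, else fetch, store, then serve."""

    def __init__(self, local_store, remote_store):
        self.local = local_store    # stands in for container_new in Swift
        self.remote = remote_store  # stands in for container_old elsewhere

    def get(self, name):
        if name in self.local:
            return self.local[name]
        # Not migrated yet: fetch from the remote side. A KeyError here
        # means the object exists in neither container.
        data = self.remote[name]
        # Store in the local container before returning to the client.
        self.local[name] = data
        return data


container_new = {"obj1": b"one"}
container_old = {"obj1": b"one", "obj2": b"two", "obj3": b"three"}
view = MigratingContainer(container_new, container_old)

first = view.get("obj2")       # triggers the migration
print("obj2" in container_new)
second = view.get("obj2")      # now served from the local copy
```

After the first GET the object is in the local container, so later reads no longer touch the remote cloud.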

Blueprint information

Status:
Started
Approver:
None
Priority:
Undefined
Drafter:
Gil Vernik
Direction:
Needs approval
Assignee:
Gil Vernik
Definition:
New
Series goal:
None
Implementation:
Started
Milestone target:
None
Started by
Gil Vernik

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/and,n,z

Addressed by: https://review.openstack.org/64430
    On demand data migration for Swift

Gerrit topic: https://review.openstack.org/#q,topic:bp/on-demand-data-migration-for-swift,n,z

Addressed by: https://review.openstack.org/123473
    Functional tests for data migration middleware

