Add CDMI Support To Swift

Registered by Doug Davis on 2012-02-16

SNIA has produced a specification called CDMI (Cloud Data Management Interface) which defines a set of APIs for managing various storage-based resources. While the specification defines how to interact with a variety of storage devices and resource types, this blueprint proposes to have Swift support just the APIs for managing object (file) and container (directory) resources and their capabilities. These APIs will be offered as a standards-based alternative to the current Swift and S3 APIs, without modifications to those existing APIs or to the back-end storage system.

The initial submitted code of the implementation can be found here:

https://review.openstack.org/#change,5539 Initial submit

The Swift team decided to support third-party APIs (like CDMI) via third-party repositories, which distros can then pick up. For CDMI, the third-party repository can be found at:
    https://github.com/OSaddon/cdmi

Blueprint information

Status:
Complete
Approver:
None
Priority:
Undefined
Drafter:
Doug Davis
Direction:
Needs approval
Assignee:
Tong Li
Definition:
Obsolete
Series goal:
None
Implementation:
Unknown
Milestone target:
None
Completed by
John Dickinson on 2012-08-07

Sprints

Whiteboard

Note: there is another CDMI-based blueprint ( https://blueprints.launchpad.net/swift/+spec/swift-cdmi ) but it wasn't clear how much of the CDMI specification that blueprint wanted to tackle. This blueprint proposes a very limited scope for CDMI adoption - just objects and containers. While some other parts of the specification may need to be exposed (simply for compliance), they will be very limited in nature and clearly called out below.

This blueprint proposes that OpenStack expose the CDMI APIs as a standards-based alternative to the existing Swift and S3 APIs. The APIs will be supported via an additional middleware component that is modeled after the existing S3/Swift components. This will allow each of the APIs to continue to exist without interference.

While CDMI defines APIs for managing a wide range of storage systems and resources, this blueprint proposes to focus just on:
- object/file resources
- container/directory resources
- capabilities - metadata allowing clients to discover the capabilities offered by the storage provider.

Below are some key design points that will guide the implementation.

Objects/Files:
- Support CRUD operations for files.
- While CDMI also supports accessing objects by a unique ID (rather than by path), we propose to limit this work to just path-based access.
- In CDMI, metadata for objects is accessible via a "CDMI" GET to the resource. A "CDMI" operation is an HTTP request with an "X-CDMI-Specification-Version" HTTP header included. This indicates that the request is not a normal HTTP request operating over the file's contents - rather it is operating over the CDMI metadata defined for the file. This means that unlike the Swift/S3 APIs, where the metadata for a file is included as HTTP headers on the file itself, CDMI users will use a separate HTTP GET/PUT to retrieve/update the metadata. While this is a syntactical difference, there is limited semantic difference.
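
As a rough illustration of the "CDMI" vs. normal request distinction described above, here is a minimal sketch in Python. Only the header name comes from this blueprint; the function names and the "cdmi-metadata" / "file-contents" labels are hypothetical and are not taken from the submitted code.

```python
# Header name as given in this blueprint.
CDMI_VERSION_HEADER = "X-CDMI-Specification-Version"


def is_cdmi_request(headers):
    """Return True if the request headers include the CDMI version header."""
    # HTTP header names are case-insensitive, so compare lowercased names.
    wanted = CDMI_VERSION_HEADER.lower()
    return any(name.lower() == wanted for name in headers)


def get_target(headers):
    """Say what a GET operates on (hypothetical labels, for illustration)."""
    return "cdmi-metadata" if is_cdmi_request(headers) else "file-contents"
```

A middleware component could use a check like this to decide whether to handle a request itself or pass it through untouched to the existing Swift/S3 handling.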

Containers/Directories:
- CDMI models directories as first class resources in its model - it uses the term "container" for these - also see "Root Container" below.
- This means that unlike Swift, where directories are files with a well-known MIME type but no semantic differences from normal files, CDMI containers/directories do impose some constraints. The constraints that result in a different user experience for existing Swift users are noted below in "Design call-outs".
- CDMI containers/directories will behave like traditional directories in a file-system.
- Swift pseudo-directories will be mapped to CDMI containers; resources that bear a pseudo-directory's name in their path will be considered child resources of that container in CDMI terms. Removal of such a container will be restricted while children or grandchildren exist.
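
The removal restriction above could be sketched as follows. This is only an illustration under stated assumptions: object paths are plain "/"-separated strings, "restricted" is read as "refused", and both function names are hypothetical.

```python
def has_descendants(container_path, object_paths):
    """Return True if any path lies underneath container_path."""
    base = container_path.rstrip("/")
    prefix = base + "/"
    # Match on "base/" so that /c1/c2x is not mistaken for a child of /c1/c2,
    # and skip the entry for the container itself (e.g. "/c1/c2/").
    return any(
        p.startswith(prefix) and p.rstrip("/") != base for p in object_paths
    )


def can_remove_container(container_path, object_paths):
    """Assumed policy: removal is allowed only for an empty container."""
    return not has_descendants(container_path, object_paths)
```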

Capabilities:
- CDMI has a rich capability model allowing clients to discover the features offered by the server.
- However, we propose to limit the use of capabilities to just the bare minimum required by the specification. In this case, the spec requires that all resources (files and containers) support retrieval of their list of capabilities.

Root Container:
- Both existing Swift APIs and CDMI have the notion of a 'root container' under which all operations take place.
- CDMI also uses the term "container" for what people typically call directories, so we need to be careful when using the term to make clear whether we mean a "swift container", a "cdmi container" (aka directory), or a top-level container, which exists in both Swift and CDMI.

Design call-outs (notable differences between the CDMI and Swift APIs):
- Because Swift doesn't require pseudo-directories to be created before a nested file can be created, some CDMI containers might end up being virtual. That is, if a Swift data store has just one file (/c1/c2/c3/f1), a CDMI client will see c1, c2 and c3 as containers. CDMI does not allow a file to exist without its parent container also existing, so not virtualizing these containers would lead to a CDMI client seeing inconsistent or invalid data. In this case, c1 has to exist in Swift as a container, but c2 and c3 do not have to exist. To a CDMI client, however, c2 and c3 should exist; they are therefore virtual containers, since no such resources actually exist in Swift.
- Likewise, the previous scenario could not have been instantiated via a CDMI client. The /c1/c2/c3/ path would have had to be created before "f1" could be placed into "c3".
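
To make the virtual container idea concrete, the sketch below derives the containers a CDMI client would expect from a list of Swift object paths, then subtracts the ones that really exist. It assumes paths are "/"-separated strings; the function names are hypothetical and this is not the submitted implementation.

```python
def implied_containers(object_paths):
    """Return every ancestor path implied by the given object paths."""
    containers = set()
    for path in object_paths:
        parts = path.strip("/").split("/")
        # Every proper prefix of a path is a container a CDMI client expects.
        for depth in range(1, len(parts)):
            containers.add("/" + "/".join(parts[:depth]))
    return containers


def virtual_containers(object_paths, existing):
    """Return implied containers that have no real resource behind them."""
    return implied_containers(object_paths) - set(existing)
```

With the single file /c1/c2/c3/f1 and only /c1 existing in Swift, this yields /c1/c2 and /c1/c2/c3 as the virtual containers, matching the scenario described above.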

Open issues to discuss with the community:
- Some interesting situations can arise due to Swift treating pseudo-directories as files. For example, in Swift you can create a file at /c1/xx and then create a file at /c1/xx/dd. There is no requirement for /c1/xx to be a pseudo-directory before creating a file that _appears_ to be a child of it. CDMI does not allow this. So, while a CDMI client will not be allowed to get into this situation, a Swift client can - which means a CDMI client viewing the end results of this scenario might get weird results. In this case, when they perform a CDMI GET against /c1/xx/dd they will get an error due to /c1/xx not being a container. The GET would normally return a URI to dd's parent and that parent, per the CDMI spec, must be a container. Since it is not, rather than misleading the client we generate a fault stating that the system is in an inconsistent state.
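
The parent check described in this open issue might look roughly like the sketch below. The exception name, the function names, and the idea of passing in the set of paths known to be plain files are all assumptions made for illustration; how the fault is actually reported is not specified here.

```python
class InconsistentStateError(Exception):
    """Hypothetical fault raised when a file's parent is not a container."""


def parent_of(path):
    """Return the parent path of a "/"-separated path ("/" at the top)."""
    return path.rstrip("/").rsplit("/", 1)[0] or "/"


def check_parent_is_container(path, plain_files):
    """Return the parent path, or raise if the parent is a plain file."""
    parent = parent_of(path)
    if parent in plain_files:
        # CDMI requires the parent to be a container, so report the
        # inconsistency instead of returning a misleading parent URI.
        raise InconsistentStateError(
            "%s is a file, not a container, but has child %s" % (parent, path)
        )
    return parent
```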

Gerrit topic: https://review.openstack.org/#q,topic:bp/swift/cdmi,n,z

Addressed by: https://review.openstack.org/5640
    coding style changes for cdmi implementation


Work Items