MongoDB as dogpile caching backend

Registered by Arun Kant

Proposal :

To provide MongoDB as another backend choice for dogpile caching. MongoDB can be a good alternative solution for distributed caching.

Why MongoDB?

For a distributed caching solution, data consistency, a coherent view of updates, partitioning for load balancing, and replication for high availability are critical aspects of most caching schemes.

MongoDB, a NoSQL database, has extremely fast reads and already has some of these features built in, e.g. high availability and high read throughput through replica sets with primary and secondary members, data partitioning via sharding, and automatic synchronization among nodes for a coherent view of updates. In addition, MongoDB and the drivers supported by mongodb.org are open source.

Advantages:
• Extremely fast random reads on large datasets.
     o MongoDB uses memory-mapped files, and it usually takes only nanoseconds to resolve the minor page faults that map file-system-cached pages into MongoDB’s memory space.
• Cache key hashing (mangling) for optimal reads can be handled by defining an index on the key field in the collection.
• MongoDB scales out very well as the size of the data increases.
• MongoDB has built-in support for TTL (time-to-live) data via TTL indexes on collections.
• Data can be distributed across multiple shards as a data partitioning alternative, which is usually needed for load balancing.
     o With a replica set, cache reads can be directed to secondary nodes to improve load balancing of cache calls.
• MongoDB offers a much richer programmatic query interface. Other caching solutions typically offer only key-based access through their APIs. Although search APIs are being added, they are still fairly primitive in expressiveness and run-time performance.
• In MongoDB, a document’s representation in the database is similar to its representation in application memory. This means the database already stores the data in a usable form, so the same data can serve both the persistent store and the application cache. In contrast, distributed cache implementations typically invent proprietary and opaque persistence formats.
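
The index and TTL points above can be sketched roughly as follows. This is a minimal illustration, not the proposed keystone implementation: the function names and the `created_at` and `value` field names are hypothetical, and `collection` is assumed to be a pymongo collection (or any object exposing the same `create_index`, `replace_one`, and `find_one` methods).

```python
import datetime


def ensure_indexes(collection, ttl_seconds):
    """Create a TTL index so MongoDB expires cache documents on its own."""
    # expireAfterSeconds is the server-side TTL index option; the indexed
    # field must hold a date value for expiry to apply.
    collection.create_index("created_at", expireAfterSeconds=ttl_seconds)


def cache_set(collection, key, value):
    """Store a value under the cache key, replacing any previous entry."""
    doc = {
        "_id": key,  # _id is always indexed, so key lookups use an index
        "value": value,
        "created_at": datetime.datetime.now(datetime.timezone.utc),
    }
    collection.replace_one({"_id": key}, doc, upsert=True)


def cache_get(collection, key, default=None):
    """Return the cached value, or default when the key is absent."""
    doc = collection.find_one({"_id": key})
    return default if doc is None else doc["value"]
```

Using the cache key as `_id` avoids a separate key index, since MongoDB always indexes `_id`. Note that MongoDB removes expired documents with a background task, so an entry can remain readable for a short time after its TTL has passed.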

The MongoDB backend needs the pymongo driver for MongoDB operations. This package dependency and its usage need to be loosely coupled with keystone. If needed, persisted data can be encrypted to secure sensitive data. MongoDB also supports SSL between the application and its server nodes.

Blueprint information

Status:
Complete
Approver:
Morgan Fainberg
Priority:
Low
Drafter:
Arun Kant
Direction:
Approved
Assignee:
Arun Kant
Definition:
Approved
Series goal:
Accepted for icehouse
Implementation:
Implemented
Milestone target:
2014.1
Started by
Arun Kant
Completed by
Dolph Mathews

Whiteboard

(morganfainberg - Notes on implementation):
Make sure that you load the mongo library with the same semantics dogpile uses for its other backends, such as memcached:

https://bitbucket.org/zzzeek/dogpile.cache/src/1f6c6b50fed188ba68c2e98a3798a94de5df9307/dogpile/cache/backends/memcached.py?at=master#cl-99

which calls a method like:

https://bitbucket.org/zzzeek/dogpile.cache/src/1f6c6b50fed188ba68c2e98a3798a94de5df9307/dogpile/cache/backends/memcached.py?at=master#cl-233
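
A minimal sketch of that loading pattern, assuming the linked lines show the backend calling an `_imports()`-style method from its constructor so that the driver is imported only when the backend is actually built. The class name, the `driver_module` attribute, and the error message are hypothetical and are not part of dogpile or keystone.

```python
import importlib


class LazyDriverBackend:
    """Hypothetical backend that defers importing its driver library."""

    driver_module = "pymongo"  # assumed module name for the MongoDB driver

    def __init__(self, arguments):
        self.arguments = dict(arguments)
        self._imports()

    def _imports(self):
        # Import at construction time, not at module import time, so that
        # importing this module never requires the driver to be installed.
        try:
            self.driver = importlib.import_module(self.driver_module)
        except ImportError as exc:
            raise RuntimeError(
                "cache backend requires the %r package" % self.driver_module
            ) from exc
```

With this shape, a deployment that never configures the MongoDB backend never triggers the import, which keeps the dependency optional.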

The pymongo driver is already part of the global requirements (https://github.com/openstack/requirements/blob/master/global-requirements.txt#L63). Be sure that the version is compatible with your use of it.

If you are explicitly testing the mongodb backend, and the tests need pymongo, add it to the test-requirements for keystone. It shouldn't need to be added to the main requirements.txt file. Be sure to add relevant documentation to the proper .rst files as appropriate for the new dogpile backend.

Gerrit topic: https://review.openstack.org/#q,topic:bp/mongodb-dogpile-caching-backend,n,z

Addressed by: https://review.openstack.org/72026
    Support for mongo as dogpile cache backend
