Ingest large amounts of data from a big data analytics process

Registered by Keith Newstadt on 2014-04-01

MagnetoDB will be useful for interactive access to the results of large analytics processes. For example, suppose a user reduces multiple petabytes of raw data into a few tens of terabytes and billions of rows of information. The user will likely want interactive access to that information, retrieving sets of rows based on different query conditions. To support this scenario, MagnetoDB needs an efficient process for importing very large data sets from a big data cluster running MapReduce jobs. It should support both full table updates and delta updates to the result data stored in MagnetoDB.
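The import path described above amounts to streaming reducer output into the table in fixed-size write batches. As a minimal client-side sketch: the batch size of 25 (mirroring DynamoDB's BatchWriteItem limit, which MagnetoDB's API resembles) and the newline-delimited JSON framing are illustrative assumptions, not the actual MagnetoDB bulk-load protocol.

```python
import json

# Assumed write-batch limit; chosen to mirror DynamoDB's BatchWriteItem.
# Not taken from the MagnetoDB specification.
BATCH_SIZE = 25


def to_batches(rows, batch_size=BATCH_SIZE):
    """Group reducer output rows into fixed-size write batches."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch


def serialize_batch(batch):
    """Encode one batch as newline-delimited JSON for a streaming upload."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in batch) + "\n"


# Example: 60 reducer output rows become three batches (25 + 25 + 10).
rows = [{"id": i, "value": "v%d" % i} for i in range(60)]
batches = list(to_batches(rows))
payloads = [serialize_batch(b) for b in batches]
```

A delta update would reuse the same batching, feeding it only the changed rows rather than the full result set.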

Blueprint information

Status:
Complete
Approver:
Keith Newstadt
Priority:
High
Drafter:
Keith Newstadt
Direction:
Approved
Assignee:
Illia Khudoshyn
Definition:
New
Series goal:
Accepted for juno
Implementation:
Implemented
Milestone target:
juno-3
Started by
Ilya Sviridov on 2014-05-28
Completed by
Ilya Sviridov on 2014-08-12

Related branches

Sprints

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:bp/ingest-from-big-data-analytics,n,z

Addressed by: https://review.openstack.org/96427
    Bulk data load

Addressed by: https://review.openstack.org/99049
    Add gunicorn support to streaming API


Work Items

Dependency tree


