Ingest large amounts of data from a big data analytics process
MagnetoDB will be useful for interactive access to the results of large analytics processes. For example, suppose a user reduces multiple petabytes of raw data down to a few tens of terabytes and billions of rows of information. The user will likely want interactive access to that information, retrieving sets of rows based on different query conditions. To support this scenario, MagnetoDB needs a process that efficiently imports very large data sets from a big data cluster running MapReduce jobs. It should support both full table updates and delta updates to the result data stored in MagnetoDB.
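To make the intended workflow concrete, the sketch below shows a client that pipes reducer output into a table over a single chunked HTTP request. The endpoint path, row format, and "mode" parameter are illustrative assumptions for this sketch, not the actual MagnetoDB bulk-load API.

```python
# A minimal sketch of a bulk-load client, assuming a hypothetical
# streaming endpoint of the form
#   POST /v1/{project_id}/data/tables/{table}/bulk_load
# that accepts newline-delimited JSON rows. The URL, row format, and
# "mode" query parameter are assumptions made for illustration.
import json
import requests

MAGNETODB = "http://magnetodb.example.com:8480/v1/demo-project"  # hypothetical


def stream_rows(rows):
    """Yield newline-delimited JSON; a generator body makes requests
    send the data with chunked transfer encoding."""
    for row in rows:
        yield (json.dumps(row) + "\n").encode("utf-8")


def bulk_load(table, rows, mode="full"):
    """Stream rows into a table.

    mode="full"  -- replace the table contents (full table update)
    mode="delta" -- upsert only the supplied rows (delta update)
    """
    url = "%s/data/tables/%s/bulk_load?mode=%s" % (MAGNETODB, table, mode)
    resp = requests.post(
        url,
        data=stream_rows(rows),  # generator body => chunked transfer
        headers={"Content-Type": "application/x-ndjson"},
    )
    resp.raise_for_status()
    return resp.json()  # e.g. counts of rows read and processed


if __name__ == "__main__":
    # Example: load the reduced output of a MapReduce job as a delta update.
    results = ({"user_id": str(i), "score": i * 2} for i in range(1000))
    print(bulk_load("analytics_results", results, mode="delta"))
```

Streaming the rows rather than buffering them lets the client forward reducer output of arbitrary size with constant memory, which matters at the billions-of-rows scale described above.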
Blueprint information
- Status: Complete
- Approver: Keith Newstadt
- Priority: High
- Drafter: Keith Newstadt
- Direction: Approved
- Assignee: Illia Khudoshyn
- Definition: New
- Series goal: Accepted for juno
- Implementation: Implemented
- Milestone target: juno-3
- Started by: Ilya Sviridov
- Completed by: Ilya Sviridov
Whiteboard
Gerrit topic: https:/
Addressed by: https:/ (Bulk data load)
Addressed by: https:/ (Add gunicorn support to streaming API)