Add streaming tags to mapreduce workflows
Oozie supports streaming MapReduce. Savanna should allow the streaming tag to be specified for MapReduce jobs.
This tag allows arbitrary scripts or executables to be specified as the mapper and reducer. The files specified must either exist on the execution node, be bundled in the /lib directory of the job, or be referenced in the <files> and <archives> tags (see the edp-oozie-
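As a rough illustration of what the streaming tag looks like inside an Oozie map-reduce action, the sketch below builds the element with Python's standard library. It is not Savanna's implementation: the function name build_streaming_action, the ${jobTracker} and ${nameNode} placeholders, and the /bin/cat and /usr/bin/wc executables are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET


def build_streaming_action(mapper, reducer):
    """Return a <map-reduce> element containing a <streaming> child.

    Hypothetical helper for illustration; not taken from the Savanna code.
    """
    action = ET.Element("map-reduce")
    # Placeholder values; a real workflow fills these in from the cluster.
    ET.SubElement(action, "job-tracker").text = "${jobTracker}"
    ET.SubElement(action, "name-node").text = "${nameNode}"
    # The streaming element names the executables to run as mapper and reducer.
    streaming = ET.SubElement(action, "streaming")
    ET.SubElement(streaming, "mapper").text = mapper
    ET.SubElement(streaming, "reducer").text = reducer
    return action


action = build_streaming_action("/bin/cat", "/usr/bin/wc")
workflow_fragment = ET.tostring(action, encoding="unicode")
print(workflow_fragment)
```

Both executables in this example are assumed to exist on the execution node already, so nothing needs to be shipped in /lib or listed in the <files> and <archives> tags.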
Blueprint information
- Status: Complete
- Approver: Sergey Lukjanov
- Priority: Medium
- Drafter: Trevor McKay
- Direction: Approved
- Assignee: Trevor McKay
- Definition: Approved
- Series goal: Accepted for icehouse
- Implementation: Implemented
- Milestone target: 2014.1
- Started by: Sergey Lukjanov
- Completed by: Sergey Lukjanov
Whiteboard
Gerrit topic: https:/
Addressed by: https:/
Allow boolean "streaming" in Job JSON
Addressed by: https:/
Add <streaming> tag generation to mapreduce workflow
Addressed by: https:/
Extract configs beginning with "savanna." from job_configs[
Addressed by: https:/
Generate streaming tag in mapreduce job
Addressed by: https:/
Add validation check for streaming elements on MapReduce without libs
Addressed by: https:/
Add integration test for streaming mapreduce
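Two of the changes listed above, extracting configs that begin with "savanna." and validating streaming elements on MapReduce jobs without libs, can be sketched as follows. This is one plausible reading of those change titles rather than the merged code: the function names and the key names savanna.streaming.mapper and savanna.streaming.reducer are hypothetical.

```python
def split_prefixed_configs(configs, prefix="savanna."):
    """Separate keys starting with `prefix` from ordinary job configs."""
    internal = {k: v for k, v in configs.items() if k.startswith(prefix)}
    regular = {k: v for k, v in configs.items() if not k.startswith(prefix)}
    return internal, regular


def check_streaming_without_libs(internal, libs):
    """Raise ValueError if a job has no libs and lacks streaming values.

    Assumption: a MapReduce job with no libs is only runnable as a
    streaming job, so it must name both a mapper and a reducer.
    """
    if libs:
        return
    # Hypothetical key names, used only for this example.
    required = ("savanna.streaming.mapper", "savanna.streaming.reducer")
    missing = [key for key in required if not internal.get(key)]
    if missing:
        raise ValueError("missing streaming values: " + ", ".join(missing))


job_configs = {
    "savanna.streaming.mapper": "/bin/cat",
    "savanna.streaming.reducer": "/usr/bin/wc",
    "mapred.reduce.tasks": "1",
}
internal, regular = split_prefixed_configs(job_configs)
check_streaming_without_libs(internal, libs=[])  # passes: both values present
```

Splitting on the prefix keeps the tool's own settings out of the configuration passed to Hadoop, and running the check before submission surfaces an incomplete streaming job early instead of at execution time.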