High Performance Distributed Storage Driver
Requirements
- high performance, close to that of in-memory queues
- persistent, as with the MongoDB driver
- scalable
- configurable (eventual consistency, etc.)
Implementation
- each node runs Marconi (so it serves the API)
- each node has storage (any disk-mapped storage, Redis, etc.)
- nodes talk to each other via ZeroMQ
- messages are distributed to nodes via consistent hashing
- data is replicated to N nodes, where N is set by configuration
- writes are acknowledged after hitting M nodes (M is configurable, M <= N)
- reads are served after hitting P nodes (P is configurable, P <= N)
- replication is handled by the node that serves the initial API request
- there is no central catalog; the location of a message is calculated by consistent hashing initially and then looked up in a map stored locally on each node
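The placement and acknowledgement rules above can be illustrated with a minimal sketch. It is not Marconi code: `ReplicationConfig`, `HashRing`, `write`, the virtual-node count, and the in-memory `stores` dictionary standing in for per-node storage are all hypothetical names chosen for illustration. Network transport (ZeroMQ) and the locally stored location map are left out.

```python
import bisect
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationConfig:
    """Hypothetical settings: N replicas, M write acks, P read responses."""
    n: int
    m: int
    p: int

    def __post_init__(self):
        # Enforce the constraints from the list above: M <= N and P <= N.
        if not (1 <= self.m <= self.n and 1 <= self.p <= self.n):
            raise ValueError("require 1 <= M <= N and 1 <= P <= N")


def _hash(key: str) -> int:
    # MD5 is used only to spread keys evenly, not for security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring with virtual nodes (the vnode count is an assumption)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def replicas(self, message_id: str, n: int):
        """Return up to n distinct nodes, walking clockwise from the message's hash."""
        start = bisect.bisect(self._points, _hash(message_id))
        found = []
        for offset in range(len(self._ring)):
            if len(found) >= n:
                break
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
        return found


def write(message_id, body, ring, config, stores, down=frozenset()):
    """Coordinator-side write: send to the N replicas, report success if >= M acked."""
    acks = 0
    for node in ring.replicas(message_id, config.n):
        if node in down:
            continue  # simulated unreachable node, so no acknowledgement
        stores[node][message_id] = body
        acks += 1
    return acks >= config.m
```

With four nodes and `ReplicationConfig(n=3, m=2, p=1)`, a write still succeeds when one of the three replicas is down, and fails when two are. A read path would follow the same shape, collecting responses until P replicas have answered.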
Pros
- resources are better utilized, as CPU, memory, and disks are used on all nodes
- scaling is smooth, as all nodes are the same; there is no special node
- any node can fail; replacing a node is just adding one more node
- data is persistent
- performance increases with each node added
- no per-queue performance limit, as messages are distributed individually
Cons
- complicated
- very complicated
Reference
- zeromq performance: http://
- redis performance: http://
Blueprint information
- Status: Complete
- Approver: None
- Priority: Undefined
- Drafter: Ozgur Akan
- Direction: Needs approval
- Assignee: None
- Definition: Obsolete
- Series goal: None
- Implementation: Unknown
- Milestone target: None
- Started by:
- Completed by: Flavio Percoco
Whiteboard
We've discussed this, and it's not part of the project's goals right now.