Cloud databases and datastores

Registered by Thierry Carrez on 2010-04-28

In order to make Ubuntu Server the platform of choice in cloud environments, we need to support the workloads that users typically run in such environments. This includes cloud-oriented databases and datastores. We'll consider various options, including packaging Cassandra, Drizzle... Suggestions welcome!

Blueprint information

Status:
Complete
Approver:
Jos Boumans
Priority:
High
Drafter:
Clint Byrum
Direction:
Approved
Assignee:
Clint Byrum
Definition:
Approved
Series goal:
Accepted for maverick
Implementation:
Implemented
Milestone target:
maverick-alpha-2
Started by
Clint Byrum on 2010-05-28
Completed by
Clint Byrum on 2010-06-29

Related branches

Sprints

Whiteboard

Status:
Cassandra PPA is up and running with the latest release (0.6.3). Upstream was contacted, but the cassandra devs showed little interest.

Work items for maverick-alpha-2:
couchdb - (LP: #591444) merge latest: DONE
mongodb - (LP: #589566) ensure latest version synced/merged (and send patches to Debian): DONE
sponsorship - upload of couchdb/mongodb: DONE
cassandra - upload upstream debian source packages to ppa: DONE
cassandra - investigate PPA/multiverse options and a sources.list.d package: DONE
cassandra - create cassandra-ubuntu team and 'stable' PPA: DONE
cassandra - contact upstream about providing links to said PPA on download page: DONE

ttx review / 20100526:
 * No Rationale in spec, maybe move Summary to rationale ?
   * resolved, moved Summary to Rationale and wrote more concise summary. (SpamapS)
 * About missing Thrift language bindings: could we investigate the cost of packaging at least one language ?
   * added work-item to build language bindings packages, which is easier than I previously had understood. (SpamapS)
 * Move the "Unresolved issues/Other Databases" section to "Notes" as it's a future development rather than an unresolved issue with the spec
   * done (SpamapS)
 * Couldn't find digg thrift packages to "adopt", and they probably wouldn't do the java packaging in the right way ? That may need to be split into multiple WI
    * added link to digg debian package repository to spec (SpamapS)
 * I'd split the json-simple++ work item into three WIs
  * done (SpamapS)
 * I'd rename the avro WI into "avro and dependencies", I can already see paranamer as missing and needed
  * done (SpamapS)
 * Suggested assignees: SpamapS / ttx
 * Estimated complexity: 4
 * Suggested priority: 1/High
 * Suggested Subcycle: Iteration 1 (Alpha-2)

mathiaz review / 20100526:
 * I would revisit the accuracy of packaging cassandra for this release cycle given its dependency on thrift:
   - cassandra is moving away from thrift, as mentioned by Monty Taylor during UDS.
     * I approached the cassandra community about this via #cassandra on freenode and they said this would be possible when avro is more stable, possibly in version 0.8. This release could be soon, but may be very close to Maverick's release date, and so I think thrift is still necessary to make cassandra useful in Maverick (SpamapS)
   - thrift seems to be complicated to package - see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=524135.
     * Digg.com has packaged thrift already. http://mirrors.digg.com/digg/pool/main/t/thrift/ It should be easy enough to package newer snapshots given Eric's comments about needing a specific revision of thrift. Also I'm not clear on the policy regarding just leaving a working snapshot of thrift embedded in Cassandra, which at least one Cassandra community member suggested would be their preference. (SpamapS)
   - I would contact Eric Evans, the Debian developer, to figure out how we can coordinate packaging cassandra in Ubuntu/Debian -- Eric is also a Cassandra upstream dev, which makes him quite central in that equation (ttx).
     * Email sent (SpamapS)
   - As a middle ground for Maverick we could focus on packaging the necessary dependencies for cassandra (leaving thrift out) and focus on getting cassandra in the next release cycle once thrift support has been dropped upstream.
     * Before deciding on this, I need guidance on whether or not it's OK to allow the Cassandra package to use the embedded thrift, given upstream suggestions. (SpamapS)

Notes (SpamapS):
 * couchdb was already in main; I was a bit confused when writing the spec. What I meant was to do the merge and add it to the server team's list of packages.

Work Items