Need performance controls per user

Registered by Lee Bieber on 2010-06-26

Per request from Rackspace, we need performance isolation and controls by user (resource caps with a global default and per-user overrides):

- On CPU utilization (thread scheduler)
  Dependent on the current_session work that Brian is doing, scheduled for Dexter

- On memory consumption (InnoDB buffer pool per user, other cache LRU per user)
  Dependent on the work Stewart is doing with embedded InnoDB

- On I/O operations (to the degree this is practical without forking the storage engines)
  Dependent on the work Stewart is doing with embedded InnoDB

- On concurrent query execution (limit to no more than N concurrent queries per user)
  Needs the current_session work to be done first - https://blueprints.launchpad.net/drizzle/+spec/remove-thread-specific-callers
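
A minimal sketch of how "global default with per-user overrides" could be resolved for the four caps listed above. This is illustrative Python, not Drizzle code: ResourceCaps, effective_caps, the field names and the example values are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class ResourceCaps:
    # Hypothetical cap set. None means "not set at this level".
    cpu_seconds: Optional[float] = None
    memory_bytes: Optional[int] = None
    io_ops_per_sec: Optional[int] = None
    max_concurrent_queries: Optional[int] = None


def effective_caps(global_default: ResourceCaps,
                   overrides: Dict[str, ResourceCaps],
                   user: str) -> ResourceCaps:
    """Return the caps for a user: each override field wins if it is set,
    otherwise the global default for that field applies."""
    user_caps = overrides.get(user)
    if user_caps is None:
        return global_default
    merged = {}
    for f in fields(ResourceCaps):
        value = getattr(user_caps, f.name)
        merged[f.name] = value if value is not None else getattr(global_default, f.name)
    return ResourceCaps(**merged)


# Example values are made up for illustration.
GLOBAL_DEFAULT = ResourceCaps(cpu_seconds=60.0,
                              memory_bytes=256 * 1024 * 1024,
                              io_ops_per_sec=500,
                              max_concurrent_queries=10)
OVERRIDES = {"reporting": ResourceCaps(max_concurrent_queries=2)}
```

With these values, the hypothetical user "reporting" is limited to 2 concurrent queries but inherits every other cap from the global default; any user without an override gets the global default unchanged.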

Blueprint information

Status: Not started
Approver: None
Priority: High
Drafter: Lee Bieber
Direction: Needs approval
Assignee: Andrew Hutchings
Definition: Approved
Series goal: Accepted for 7.1
Implementation: Not started
Milestone target: ongoing

Whiteboard

So this will probably fit quite nicely into the authorization plugins. CPU utilization shouldn't be too hard, as we already have per-session counters for this. I will have to talk to Stewart about the two InnoDB items, as I don't yet know how to implement them. Concurrent query execution shouldn't be too difficult to cap either.

Also, question: When will the counters for these caps reset?
From Adrian on 01/12/10 - None of them really reset. They are based on current utilization. For example, if you are over the concurrent query limit, you can queue work for some reasonable time (a few seconds perhaps), after which you must return an error to the client. As queries finish, the concurrency check will see a lower number that should fall below the limit at some point. This same logic applies to all limits. Different logic must apply to "write" type queries than to SELECT queries. SELECT queries can be safely rejected with an error, so long as they are not part of any transaction, but rejecting an INSERT, UPDATE, or DELETE could be a real mess. Most of the trouble we have in production has to do with the handling of SELECT queries when tons of them are runnable concurrently. Ideally we would have some sort of policy we could create and apply per user (or per some grouping of users by some classification), and that policy would indicate what types of queries the limits apply to and what should happen when they are approached and/or reached.
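
A rough sketch of the queue-then-reject behaviour described in the comment above, in illustrative Python rather than Drizzle code. The names ConcurrentQueryLimiter and QueryRejected are hypothetical. How "write" queries should be treated is left open in the discussion; purely as an assumption, this sketch lets non-rejectable queries wait without a deadline instead of failing them.

```python
import threading
import time


class QueryRejected(Exception):
    """Raised when a rejectable query waited too long for a free slot."""


class ConcurrentQueryLimiter:
    def __init__(self, limit: int, max_wait_seconds: float):
        self._limit = limit
        self._max_wait = max_wait_seconds
        self._running = {}                  # user -> queries currently running
        self._cond = threading.Condition()

    def acquire(self, user: str, rejectable: bool) -> None:
        # rejectable=True is meant for e.g. a SELECT outside any transaction.
        deadline = time.monotonic() + self._max_wait
        with self._cond:
            while self._running.get(user, 0) >= self._limit:
                if rejectable:
                    remaining = deadline - time.monotonic()
                    if remaining <= 0:
                        raise QueryRejected(f"{user} is over the concurrent query limit")
                    self._cond.wait(remaining)
                else:
                    # Assumption: writes queue until a slot frees up.
                    self._cond.wait()
            self._running[user] = self._running.get(user, 0) + 1

    def release(self, user: str) -> None:
        with self._cond:
            self._running[user] -= 1
            self._cond.notify_all()         # wake queued queries to re-check
```

Nothing is reset on a timer here: the check always compares against the number of queries running right now, so the count drops back under the limit on its own as queries finish.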

From Andrew on 2010-01-13 - OK, but with CPU limits the only per-user accounting we can do is CPU time (i.e. seconds/microseconds). It can never be based on current utilization, due to the platform-specific implementation of threads.
My other concern is the InnoDB buffer pool per user. If every user had an InnoDB buffer pool, it could lead to lots of duplication of tables in the buffer pools, and memory usage could be difficult to control with lots of users.
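
To illustrate what accounting by CPU time (as opposed to current utilization) could look like, here is a small Python sketch. It uses the standard library's time.thread_time() as a stand-in for the per-session counters mentioned earlier; CpuTimeAccounting and its methods are hypothetical names. When, or whether, the accumulated total should reset is still an open question in this discussion, so the sketch never resets it.

```python
import threading
import time
from contextlib import contextmanager


class CpuTimeAccounting:
    def __init__(self):
        self._lock = threading.Lock()
        self._used = {}                     # user -> accumulated CPU seconds

    @contextmanager
    def charge(self, user: str):
        # thread_time() counts CPU time of the calling thread only, so the
        # block must start and finish on the same thread.
        start = time.thread_time()
        try:
            yield
        finally:
            elapsed = time.thread_time() - start
            with self._lock:
                self._used[user] = self._used.get(user, 0.0) + elapsed

    def used_seconds(self, user: str) -> float:
        with self._lock:
            return self._used.get(user, 0.0)

    def over_budget(self, user: str, budget_seconds: float) -> bool:
        return self.used_seconds(user) > budget_seconds


accounting = CpuTimeAccounting()
with accounting.charge("alice"):
    sum(i * i for i in range(100_000))      # stand-in for executing a query
```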

From Andrew on 2010-01-14 - After the conf call, what we are looking for is basically resource constraints per catalog (originally per user, since this was specced before catalogs were in). The general idea is that no catalog should eat up all the I/O, RAM or CPU, and some catalogs should have a higher priority than others. Also, instead of a buffer pool per catalog (which would suck when getting into hundreds/thousands of catalogs), there should be some dynamic tuning of a global allocation, again with weighting.
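
One possible reading of "dynamic tuning of a global allocation with weighting" is a weighted fair share: each catalog gets a slice of the global pool in proportion to its weight, and whatever a catalog does not need is redistributed to the others. The Python sketch below shows that technique only; it is not from the spec, and allocate_pool, the weights and the demand figures are hypothetical.

```python
from typing import Dict


def allocate_pool(total: float,
                  weights: Dict[str, float],
                  demand: Dict[str, float]) -> Dict[str, float]:
    """Split `total` units across catalogs by weight, never giving a catalog
    more than it asked for, and handing unused share to the rest."""
    alloc = {c: 0.0 for c in demand}
    active = {c for c, d in demand.items() if d > 0}
    remaining = total
    while active and remaining > 0:
        weight_sum = sum(weights[c] for c in active)
        # Catalogs whose whole demand fits inside their weighted share.
        satisfied = {c for c in active
                     if demand[c] <= remaining * weights[c] / weight_sum}
        if not satisfied:
            # Everyone left wants more than their share: split by weight.
            for c in active:
                alloc[c] = remaining * weights[c] / weight_sum
            break
        for c in satisfied:
            alloc[c] = demand[c]
            remaining -= demand[c]
        active -= satisfied
    return alloc


# Made-up example: 100 units of pool, catalog "a" has three times the weight.
shares = allocate_pool(100.0,
                       {"a": 3.0, "b": 1.0, "c": 1.0},
                       {"a": 100.0, "b": 10.0, "c": 50.0})
```

In this example "b" needs less than its share, so it gets its full 10 units and the other 90 are split 3:1 between "a" and "c" (67.5 and 22.5). Re-running the function as demand changes gives the dynamic part; rounding to whole pages is left out.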

From Andrew on 2010-01-15 - It is looking like the buffer pool work, at least, may take until at least April to complete. The first idea I had was this: Percona has a patch to save buffer pools to disk. I suspect we could have fixed buffer pools per catalog, and any inactive catalogs (unused for a minute, for example) could save their pool to disk and free the memory until they are used again.
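
A sketch of the bookkeeping the idea above would need: track when each catalog was last used and spill the ones that have been idle too long. This is illustrative Python; CatalogPoolRegistry is a hypothetical name, and save_to_disk is a placeholder callback standing in for whatever the Percona patch actually provides, which is not described here.

```python
import time
from typing import Callable, Dict, List


class CatalogPoolRegistry:
    def __init__(self,
                 idle_seconds: float,
                 save_to_disk: Callable[[str], None],
                 clock: Callable[[], float] = time.monotonic):
        self._idle_seconds = idle_seconds
        self._save_to_disk = save_to_disk   # placeholder for the real spill step
        self._clock = clock
        self._last_used: Dict[str, float] = {}   # resident catalogs only

    def touch(self, catalog: str) -> None:
        """Record that a catalog was just used (reloading is not modelled)."""
        self._last_used[catalog] = self._clock()

    def resident(self) -> List[str]:
        return sorted(self._last_used)

    def evict_idle(self) -> List[str]:
        """Spill every catalog unused for at least idle_seconds."""
        now = self._clock()
        idle = sorted(c for c, t in self._last_used.items()
                      if now - t >= self._idle_seconds)
        for catalog in idle:
            self._save_to_disk(catalog)
            del self._last_used[catalog]    # its memory can now be freed
        return idle
```

Calling evict_idle() periodically with idle_seconds=60 would match the "unused for a minute" example.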

Work items:
Modify authorization plugin API: TODO
Create a basic example authorization plugin: TODO
Create DD table to show usage (in the auth plugin): TODO
Add CPU limits: TODO
Add memory limits: TODO
Add I/O limits: TODO
Add concurrent query limits: TODO
Add global defaults: TODO
Test cases: TODO
Documentation: TODO
