Support for GET_LOCK()/RELEASE_LOCK() functions

Registered by Alex Yurchenko on 2009-12-08

Support for GET_LOCK()/RELEASE_LOCK() functions

Blueprint information

Seppo Jaakola
Alex Yurchenko
Needs approval
Series goal:
Milestone target:
Started by
Seppo Jaakola on 2012-06-08


GET_LOCK()/RELEASE_LOCK() seems to be a widely used MySQL-specific mechanism to synchronize database application processes running on different hosts. Note that "widely-used" here refers to a segment which is a direct target for MySQL/Galera - massively distributed applications.

Without support for GET_LOCK()/RELEASE_LOCK() we cannot readily offer MySQL/Galera to the users of such applications. It also seems to be a widespread complaint that MySQL replication does not support replication of these mutexes. We could add this advantage to out solution, especially since we're in a good position to do that with all the required instrumentation (group communication, TO) in place.

The alternative is of course to use ddlockd or some such thing. This is however unsatisfactory (I'd say unacceptable) for the following reasons:
1) It requires users to modify their application
2) It requires users not just change their SQL and write some function calls in place of SQL queries, it requires users to implement deadlock detection and resolution code which they had for free in mysqld.
3) It requires users to add support for one more protocol to their application
4) It requires users to install and maintain one more cluster (of lock servers)
5) and last but not least: lock server cluster and mysqld cluster views are not synchronized. This can easily lead to deadlock: application instance grabs the lock, but then hangs because its server lost prim component.

Support for GET_LOCK()/RELEASE_LOCK() should be based on presumption that locked "mutexes" is a part of mysqld state. Thus they should be reflected in an explicit table that can be a part of SST.

2 reasons why this must be implemented at mysql and not at galera level:
1) This is a MySQL-specific feature
2) As mentioned above, these locks are global and so must be transfered to newly joined node. GCS state exchange is not suitable for that since GCS state messages cannot be fragmented and should be as small as possible.

Acquiring locks can be done in two ways:
1) translating GET_LOCK/RELEASE_LOCK into transactions. This might lead to a high certification collision rate which the use of these functions actually tries to eliminate by preventing applications to concurrently modify the same data.

2) using totally-ordered reference counting. Reference counting allows to obtain semantics very close to that of original GET_LOCK and friends, with the exception: timeouts are impossible to implement. Rather tedious to implement will be: global connection IDs for IS_USED_LOCK() and RELEASE_LOCK() and releasing lock on connection close/node failure.

Given the need to deterministically process connection close and connection disappearance from the cluster to avoid having stale locks, implementation perspective looks rather grim even to me. Should be postponed until we have a real request for that. However it IS very possible to implement.

As for the algorithm... LP text formatting capabilities suck (God, this does not even begin to give a due credit to it! It is no less than Neanderthal. Why do I even care to write anything here?). I'll need to do it in trac. Suffice to say it borrows from the traditional Linux scheduler and TO queue. So it is a tried and trusted algorithm.
So here it is:


Work Items

This blueprint contains Public information 
Everyone can see this information.