tarantool

Implement infrastructure that allows for automated creation of concurrency tests

Registered by Kostja Osipov on 2012-04-17

Implement infrastructure that allows for automatic testing of concurrency bugs and problems.

This must be a debug-only facility that makes it possible to represent a concurrent sequence of events sequentially,
as a sequence of server actions, waits, synchronization points, signals and continuations.

This task is inspired by a similar mechanism in MySQL test suite, called 'debug sync points'.
Its description can be found at:
http://forge.mysql.com/wiki/MySQL_Internals_Test_Synchronization#Debug_Sync_Facility

How this works
--------------------

The facility consists of two parts: client and server.
On the client, it is possible to send a request to the server and continue execution
without waiting for the request to complete.
The counterpart of 'send' is 'reap', which collects the result of a previously sent request.
It is also possible to establish multiple client connections and use all of them
in a test (we already have this feature and actively use it to test our replication).

For example:

server.sql.send("insert into t1 values (1)")
con1 = new TarantoolConnection;
con1.execute("select * from t1 where k0=1")
server.sql.reap() // receives the result of the previous send.
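
The send/reap pair described above can be sketched in Python as follows. This is only an illustration, not the test suite's API: the class name AsyncConnection, its methods, and the fake_server stand-in are all hypothetical, and a thread pool stands in for a real non-blocking network client.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncConnection:
    """Hypothetical client wrapper: send() returns at once, reap() collects."""

    def __init__(self, execute):
        self._execute = execute            # blocking callable: request -> result
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = None

    def send(self, request):
        # Start the request in the background and return immediately.
        if self._pending is not None:
            raise RuntimeError("previous request has not been reaped")
        self._pending = self._pool.submit(self._execute, request)

    def reap(self):
        # Block until the previously sent request completes, return its result.
        if self._pending is None:
            raise RuntimeError("nothing to reap")
        future, self._pending = self._pending, None
        return future.result()

    def close(self):
        self._pool.shutdown(wait=True)


def fake_server(request):
    # Stand-in for a real server round trip (assumption for this sketch).
    return "ok: " + request


con = AsyncConnection(fake_server)
con.send("insert into t1 values (1)")
# ... the test is free to do other work here ...
result = con.reap()
con.close()
```

Allowing only one outstanding request per connection keeps the pairing between a send and its reap unambiguous; that restriction is a choice made for this sketch.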

The second part is a server facility that works similarly to error injection (and perhaps shares pieces of architecture and implementation with it):

- there is a list of statically defined synchronization points

- each synchronization point is a named object, produced by a line such as SYNC_POINT_INJECT("name") in the source code

- each sync point can be a) raised, b) signalled, c) waited upon for a certain state or state change (wait until the sync point is set, entered, or signalled)

The closest analogy to a sync point would be a semaphore as defined by the Single UNIX Specification.
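
To make the intended semantics concrete, here is a minimal Python model of one sync point. It is a sketch under assumptions: the class SyncPoint and the method names enable, inject, wait_entered and signal are invented for illustration, and threads stand in for whatever execution units the server uses. A real implementation would live in the server and be compiled in for debug builds only.

```python
import threading


class SyncPoint:
    """Hypothetical model of a named sync point with semaphore-like signals."""

    def __init__(self, name):
        self.name = name
        self._cond = threading.Condition()
        self._enabled = False
        self._waiters = 0      # callers currently blocked in inject()
        self._signals = 0      # signals posted but not yet consumed

    def enable(self):
        # "Raise" the sync point: later inject() calls will block.
        with self._cond:
            self._enabled = True

    def inject(self):
        # Models SYNC_POINT_INJECT("name"): a no-op unless enabled.
        with self._cond:
            if not self._enabled:
                return
            self._waiters += 1
            self._cond.notify_all()          # wake anyone in wait_entered()
            while self._signals == 0:
                self._cond.wait()
            self._signals -= 1
            self._waiters -= 1

    def wait_entered(self, timeout=None):
        # Block until some caller is parked inside inject().
        with self._cond:
            return self._cond.wait_for(lambda: self._waiters > 0, timeout)

    def signal(self):
        # Release one caller blocked in inject().
        with self._cond:
            self._signals += 1
            self._cond.notify_all()


sp = SyncPoint("before_insert")
sp.inject()                                  # disabled: returns immediately
sp.enable()
worker = threading.Thread(target=sp.inject)
worker.start()
entered = sp.wait_entered(timeout=5)         # worker is parked at the point
sp.signal()
worker.join(timeout=5)
```

Counting signals instead of using a single flag mirrors the semaphore analogy: a signal posted before the waiter arrives is not lost.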

Using send/reap + debug sync points, it should be possible to:
- send a query to the server in one connection
- this query blocks on a sync point that has been enabled
- wait synchronously in another connection until the first query is blocked
- execute a statement
- unblock the sync point (signal it)
- reap the results of the query sent in the first connection.
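
The steps above can be sketched as a sequential Python test. Everything here is a stand-in chosen for illustration: a list plays the role of table t1, two threading.Event objects play the role of one sync point, and a single-worker thread pool plays the role of the first connection.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

table = []                        # stand-in for table t1
entered = threading.Event()       # set when the query reaches the sync point
released = threading.Event()      # set when the test signals the sync point


def insert_with_sync_point(value):
    # Hypothetical server-side handler with a sync point before the write.
    entered.set()
    released.wait()
    table.append(value)
    return "inserted"


def select_all():
    # Hypothetical read executed from the second connection.
    return list(table)


with ThreadPoolExecutor(max_workers=1) as con1:
    pending = con1.submit(insert_with_sync_point, 1)   # send the query
    reached = entered.wait(timeout=5)                  # wait until it is blocked
    seen_while_blocked = select_all()                  # execute a statement
    released.set()                                     # signal the sync point
    reaped = pending.result(timeout=5)                 # reap the result

seen_after = select_all()
```

Because the insert is parked at the sync point while the select runs, the interleaving is the same on every run, which is what lets a concurrency scenario be written as an ordinary sequential test.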

This mechanism makes it possible to write sequential tests for various concurrency problems, such as multi-index updates, transaction visibility (visibility of phantoms), etc.

Blueprint information

Status: Not started
Approver: None
Priority: High
Drafter: None
Direction: Approved
Assignee: None
Definition: Approved
Series goal: None
Implementation: Unknown
Milestone target: None
