Transaction Replaying
A wsrep write set contains lock information only for the rows that are referenced in the write set. When MySQL/InnoDB processes a transaction, it may need more resources (e.g. GAP locks) than are visible in the resulting write set. It can therefore happen that a local transaction in its committing phase has locked more rows than the write set contains.
If a slave applier conflicts with this local transaction's GAP lock, we need to abort the local transaction. If this victim has already replicated, all other nodes will look only at the write set's locks, and the certification verdict will be positive: the write set will be accepted on every node except the originating node.
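The mismatch described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: row keys are integers, a GAP lock is modelled as a Python range, and certification is reduced to a key-overlap check. The names `LocalTransaction`, `certifies` and `conflicts_locally` are hypothetical and are not part of wsrep or InnoDB.

```python
from dataclasses import dataclass


@dataclass
class LocalTransaction:
    written_keys: set   # row keys that end up in the write set
    gap_lock: range     # extra key range locked locally, NOT in the write set


def certifies(write_set_keys, other_keys):
    """Toy certification: write sets conflict only if they share a row key."""
    return write_set_keys.isdisjoint(other_keys)


def conflicts_locally(local, remote_keys):
    """On the originating node, GAP locks also block the applier."""
    return any(k in local.written_keys or k in local.gap_lock
               for k in remote_keys)


# The local transaction wrote rows 10 and 20 and holds a lock on the gap between.
local = LocalTransaction(written_keys={10, 20}, gap_lock=range(10, 21))
remote_keys = {15}  # a remote write set touching a row inside the gap

passes_certification = certifies(local.written_keys, remote_keys)
blocked_by_gap_lock = conflicts_locally(local, remote_keys)
```

In this model both write sets pass certification everywhere, yet on the originating node the applier is blocked by the GAP lock, so the local transaction has to be aborted there.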
To fix this issue, we can save the write set on the originating node and replay the transaction just as any remote transaction is applied by a slave applier. The only difference is that a replaying transaction does not necessarily need to go through the certification test again; it can start applying directly.
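A minimal sketch of this idea follows, assuming a node can be modelled as a key/value table and a write set as a list of (key, value) pairs. The `Node` class and its methods are hypothetical illustrations, not the wsrep API.

```python
class Node:
    """Toy node: a key/value table plus write sets saved for replay."""

    def __init__(self):
        self.table = {}
        self.saved_write_sets = {}

    def apply(self, write_set):
        # Same code path for slave appliers and for replaying.
        for key, value in write_set:
            self.table[key] = value

    def save_for_replay(self, trx_id, write_set):
        # Keep a copy of the replicated write set on the originating node.
        self.saved_write_sets[trx_id] = list(write_set)

    def replay(self, trx_id):
        # In this sketch certification is skipped and the write set
        # is applied directly.
        self.apply(self.saved_write_sets.pop(trx_id))


remote_ws = [(15, "remote")]
local_ws = [(10, "local"), (20, "local")]

origin, peer = Node(), Node()

# Originating node: the local transaction replicated, then lost to the applier.
origin.save_for_replay("trx-1", local_ws)
origin.apply(remote_ws)   # slave applier wins the lock conflict
origin.replay("trx-1")    # victim is re-applied from its saved write set

# Any other node simply applies both write sets in replication order.
peer.apply(remote_ws)
peer.apply(local_ws)

tables_match = origin.table == peer.table
```

After the replay, the originating node holds the same rows as a node that applied both write sets as remote transactions.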
Replaying can be processed by the client connection, which is promoted to brute force priority for the duration of the replay.
Replaying can be started in two different positions, depending on where the local transaction was interrupted:
1. in the certification TO queue, if the local transaction was interrupted in the cert TO wait state
2. directly in applying, if the local transaction was interrupted in the commit TO queue
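The choice between the two start positions can be sketched as a simple mapping. The enum and function names below are hypothetical and chosen only for illustration; the actual state names in the implementation may differ.

```python
from enum import Enum


class InterruptedIn(Enum):
    """Where the local transaction was when it was interrupted."""
    CERT_TO_WAIT = "cert TO wait"
    COMMIT_TO_QUEUE = "commit TO queue"


class ReplayStart(Enum):
    """Where replaying begins."""
    CERT_TO_QUEUE = "certification TO queue"
    APPLYING = "applying"


def replay_start(state):
    """Pick the replay start position from the interruption point."""
    if state is InterruptedIn.CERT_TO_WAIT:
        return ReplayStart.CERT_TO_QUEUE
    if state is InterruptedIn.COMMIT_TO_QUEUE:
        return ReplayStart.APPLYING
    raise ValueError(f"unexpected state: {state!r}")
```

For example, `replay_start(InterruptedIn.COMMIT_TO_QUEUE)` returns `ReplayStart.APPLYING`, matching the second case in the list above.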
Blueprint information
- Status: Complete
- Approver: None
- Priority: Essential
- Drafter: Seppo Jaakola
- Direction: Needs approval
- Assignee: Seppo Jaakola
- Definition: Approved
- Series goal: None
- Implementation: Implemented
- Milestone target: None
- Started by: Seppo Jaakola
- Completed by: Seppo Jaakola
Whiteboard
The replaying logic faced an unexpected issue, lp:502559, which seems to be fixed by a recent commit.