Page cleaner thread tuning for 5.6
Tune page cleaner thread algorithms and heuristics.
Timeouts:
- Implement a hard time limit for a single flush batch. The thread's heuristics assume that one iteration takes at most 1 s, but under heavy load we have observed iterations taking up to 17 seconds. Such situations confuse the heuristics, and a long-running flush of one kind prevents the other kind of flush from running, which in turn may cause query threads to perform sync preflushes or single page LRU flushes, depending on which flush type is starved. For LRU flushes, the timeout is checked before each LRU chunk flush. To implement the same check for flush list flushes, the flush requests for each buffer pool instance were broken up into chunks too.
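A minimal sketch of the "check the deadline before each chunk" idea, assuming a hypothetical `flush_chunk` callback and an injectable clock. None of these names come from the server source; the real implementation lives in the InnoDB page cleaner code.

```python
import time


def flush_in_chunks(pending_pages, chunk_size, time_limit_s, flush_chunk,
                    clock=time.monotonic):
    """Flush up to pending_pages pages in chunks, stopping at a hard deadline.

    flush_chunk(n) is a hypothetical callback that flushes n pages.
    The deadline is checked before each chunk, so a chunk that has already
    started is allowed to finish. Returns the number of pages flushed.
    """
    deadline = clock() + time_limit_s
    flushed = 0
    for start in range(0, pending_pages, chunk_size):
        if clock() >= deadline:
            break  # out of time: leave the rest for the next iteration
        n = min(chunk_size, pending_pages - start)
        flush_chunk(n)
        flushed += n
    return flushed
```

Because the check happens only between chunks, the batch can overrun the limit by at most the duration of one chunk, which is why breaking flush list requests into chunks matters.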
LRU flush chunk dispatching to instances:
- In the case of multiple buffer pool instances, recognize that their use is not uniform [1]. Attempting to make them uniform somehow is likely to be fruitless. The non-uniform buffer pool instance use means non-uniform free list depletion levels for LRU flushes. This is addressed by the changes described below.
- Reduce LRU list mutex contention and needless LRU list scanning that stems from LRU list scan restarts. If there are dirty pages at the tail of the LRU list, a restarted scan has to needlessly pass over previously seen, still-dirty pages whose flush requests have already been issued. It is better to scan such an instance later, when the flush requests have completed and the pages have become available for eviction.
- The two previous items are addressed by the following implementation: instead of issuing all the chunk-sized flush requests for the first instance, then for the second one, and so on, issue requests to all instances in parallel. If a particular instance has a nearly-depleted free list (<10% of innodb_
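The dispatch order described in the last item can be illustrated with a small sketch. It assumes that "in parallel" means interleaving chunk requests round-robin across instances, which is one possible reading. The function name and the per-instance chunk counts are hypothetical, and the special handling of nearly-depleted free lists is not modeled here.

```python
def interleaved_chunks(chunks_per_instance):
    """Return (instance, chunk_no) pairs in round-robin order.

    chunks_per_instance[i] is the number of chunk-sized flush requests
    wanted for buffer pool instance i (a hypothetical input). Each pass
    issues one chunk per instance that still has work, so no instance has
    to wait for another instance's whole batch to be issued first.
    """
    remaining = list(chunks_per_instance)
    order = []
    chunk_no = 0
    while any(r > 0 for r in remaining):
        for inst, r in enumerate(remaining):
            if r > 0:
                order.append((inst, chunk_no))
                remaining[inst] -= 1
        chunk_no += 1
    return order
```

For example, `interleaved_chunks([2, 1, 3])` visits instances 0, 1, 2, then 0, 2, then 2, instead of finishing instance 0 before touching the others.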
Furious flushing:
- Minimize mutex contention coming from single page LRU flushes by refilling free lists more aggressively (an equivalent of furious flushing for LRU), making an empty free list and the consequent single page LRU flush as rare an occurrence as possible. This was already started by separate work [2]. Here we make the LRU flushes run more aggressively if needed, possibly violating innodb_
- Likewise, furiously flush the flush list if the checkpoint age is in the sync preflush zone.
Both kinds of furious flushing are implemented by adapting the cleaner thread sleep time to the current checkpoint age and free list length. A target sleep time for LRU flush is calculated (depending on free list refill levels: <1%: no sleep; <5%: 50ms shorter sleep; 5%-20%: no change; >20%: 50ms longer sleep), which is then reduced to zero if the checkpoint age is in the sync preflush zone.
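The sleep time adaptation above can be sketched as follows. The thresholds and the 50 ms step come from the description. The function name, the parameters, the clamp to a 1000 ms maximum, and the reading of the refill level as a plain percentage are assumptions made for illustration.

```python
def next_sleep_ms(prev_sleep_ms, free_list_pct, in_sync_preflush,
                  max_sleep_ms=1000):
    """Return the next cleaner thread sleep time in milliseconds.

    free_list_pct: free list refill level as a percentage (assumed baseline).
    in_sync_preflush: True if the checkpoint age is in the sync preflush zone.
    max_sleep_ms: assumed upper bound, not stated in the description.
    """
    if in_sync_preflush:
        return 0  # furious flush list flushing: do not sleep at all
    if free_list_pct < 1:
        return 0  # free list nearly empty: do not sleep
    if free_list_pct < 5:
        target = prev_sleep_ms - 50  # sleep 50 ms less
    elif free_list_pct <= 20:
        target = prev_sleep_ms       # no change
    else:
        target = prev_sleep_ms + 50  # sleep 50 ms more
    return max(0, min(max_sleep_ms, target))
```

With a previous sleep time of 200 ms, a 3% refill level yields 150 ms, a 50% level yields 250 ms, and any level yields 0 ms once the checkpoint age reaches the sync preflush zone.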
New UNIV_PERF_DEBUG tuning variables:
- innodb_
- innodb_
- innodb_
- innodb_
- innodb_
- innodb_
[1] http://
[2] https:/
Blueprint information
- Status: Complete
- Approver: None
- Priority: High
- Drafter: Laurynas Biveinis
- Direction: Approved
- Assignee: Laurynas Biveinis
- Definition: Approved
- Series goal: Accepted for 5.6
- Implementation: Implemented
- Milestone target: 5.6.13-61.0
- Started by: Laurynas Biveinis
- Completed by: Laurynas Biveinis