Page cleaner thread tuning for 5.6

Registered by Laurynas Biveinis

Tune page cleaner thread algorithms and heuristics.

Timeouts:

- Implement a hard time limit for a single flush batch. The thread's heuristics assume that one iteration takes at most 1 s, but under heavy load we have observed iterations taking up to 17 seconds. Such situations confuse the heuristics, and a long-running flush of one kind prevents the other kind of flush from running. This in turn may force query threads to perform sync preflushes or single page LRU flushes, depending on which flush type is starved. For LRU flushes, the timeout is checked before each LRU chunk flush. To implement the same check for flush list flushes, the flush requests for each buffer pool instance were broken up into chunks too.
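
The per-chunk timeout check described above can be illustrated with a minimal Python sketch. This is not the server code: flush_with_time_limit, flush_chunk and the injectable now clock are hypothetical names introduced here for illustration only, and the real implementation lives in the InnoDB page cleaner.

```python
import time
from typing import Callable


def flush_with_time_limit(
    pages_to_flush: int,
    chunk_size: int,
    max_time_s: float,
    flush_chunk: Callable[[int], int],
    now: Callable[[], float] = time.monotonic,
) -> int:
    """Flush up to pages_to_flush pages in chunks, stopping at the deadline.

    flush_chunk(n) is a hypothetical callback that flushes at most n pages
    and returns how many it actually flushed.
    """
    deadline = now() + max_time_s
    flushed = 0
    while flushed < pages_to_flush:
        # The timeout is checked before each chunk, never in the middle of one.
        if now() >= deadline:
            break
        request = min(chunk_size, pages_to_flush - flushed)
        done = flush_chunk(request)
        if done == 0:
            break  # nothing left to flush
        flushed += done
    return flushed
```

Because the deadline is only checked between chunks, a batch can overrun the limit by at most the duration of one chunk, which is why a smaller chunk size gives a tighter bound.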

LRU flush chunk dispatching to instances:

- In the case of multiple buffer pool instances, recognize that their use is not uniform [1]. Attempting to make them uniform somehow is likely to be fruitless. The non-uniform buffer pool instance use means non-uniform free list depletion levels for LRU flushes. This is addressed by the changes described below.

- Reduce LRU list mutex contention and needless LRU list scanning that stems from LRU list scan restarts. If there are dirty pages at the tail of the LRU list, a restarted scan has to needlessly pass over still-dirty, previously-seen pages whose flush requests have already been issued. It is better to scan such an instance later, when the flush requests complete and the pages become available for eviction.

- The two previous items are addressed by the following implementation: instead of issuing all the chunk-sized flush requests for the 1st instance, then for the 2nd one, etc., issue requests to all instances in parallel. If a particular instance has a nearly-depleted free list (<10% of innodb_lru_scan_depth), then keep on issuing requests for that instance until it is no longer depleted or the flushing limit has been reached. To support this, also provide two modes of a single chunk flush: a regular one that scans as much as needed until the chunk size goal is reached, and a limited one, which limits the total number of scanned pages, scan restarts included, to innodb_lru_scan_depth.
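
The dispatching order described above can be sketched as follows. This is a simplified, single-threaded model under stated assumptions: Instance, flush_chunk and dispatch_lru_flushes are hypothetical names, requests are interleaved round-robin rather than issued truly in parallel, the per-instance flushing limit is assumed to default to innodb_lru_scan_depth pages, and the regular/limited scan modes are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instance:
    free_len: int    # current free list length
    flushable: int   # hypothetical: pages at the LRU tail that can be freed


def flush_chunk(inst: Instance, n: int) -> int:
    """Hypothetical chunk flush: move up to n pages to the free list."""
    done = min(n, inst.flushable)
    inst.flushable -= done
    inst.free_len += done
    return done


def dispatch_lru_flushes(instances: List[Instance], scan_depth: int,
                         chunk_size: int, lwm_pct: int = 10,
                         limit: Optional[int] = None) -> List[int]:
    """Return the order in which chunk requests were issued to instances."""
    lwm = scan_depth * lwm_pct / 100
    limit = scan_depth if limit is None else limit  # assumed per-instance cap
    flushed = [0] * len(instances)
    order: List[int] = []
    active = [i for i, inst in enumerate(instances)
              if inst.free_len < scan_depth]
    while active:
        next_active = []
        for i in active:
            inst = instances[i]
            while True:
                done = flush_chunk(inst, min(chunk_size, limit - flushed[i]))
                flushed[i] += done
                order.append(i)
                exhausted = done == 0 or flushed[i] >= limit
                # Stay on this instance only while its free list is depleted.
                if exhausted or inst.free_len >= lwm:
                    break
            if not exhausted and inst.free_len < scan_depth:
                next_active.append(i)
        active = next_active
    return order
```

With one depleted and one half-full instance, the depleted one receives consecutive chunks first and then the two alternate, instead of the first instance being flushed to completion before the second one is touched.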

Furious flushing:

- Minimize mutex contention coming from single page LRU flushes by refilling free lists more aggressively (an equivalent of furious flushing for LRU), making an empty free list and the consequent single page LRU flush as rare an occurrence as possible. This was already started by a separate work [2]. Here we make the LRU flushes run more aggressively if needed, possibly violating innodb_lru_scan_depth_limit.

- Likewise furiously flush the flush list if the checkpoint age is in sync preflush zone.

Both kinds of furious flushing are implemented by adapting the cleaner thread sleep time to the current checkpoint age and free list length. A target sleep time for the LRU flush is calculated from the free list refill level (<1%: no sleep; <5%: 50ms shorter sleep; 5%-20%: no change; >20%: 50ms longer sleep), and is then reduced to zero if the checkpoint age is in the sync preflush zone.
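
The sleep time adaptation can be written down as a small function. This is a sketch only: next_cleaner_sleep_ms and its parameters are hypothetical names, the 1000 ms upper bound is an assumption derived from the 1 s iteration length mentioned earlier, and what the free list percentage is measured against is not stated in this blueprint.

```python
def next_cleaner_sleep_ms(prev_sleep_ms: int, free_list_pct: float,
                          in_sync_preflush: bool,
                          max_sleep_ms: int = 1000) -> int:
    """Compute the next page cleaner sleep time in milliseconds.

    free_list_pct is the free list refill level in percent (the reference
    value for 100% is an assumption left to the caller).
    """
    # Checkpoint age in the sync preflush zone overrides everything else.
    if in_sync_preflush:
        return 0
    if free_list_pct < 1:
        return 0                                   # <1%: no sleep
    if free_list_pct < 5:
        return max(0, prev_sleep_ms - 50)          # <5%: 50 ms shorter
    if free_list_pct <= 20:
        return prev_sleep_ms                       # 5%-20%: no change
    return min(max_sleep_ms, prev_sleep_ms + 50)   # >20%: 50 ms longer
```

For example, starting from a 1000 ms sleep with the free list at 3%, successive iterations would shorten the sleep to 950 ms, 900 ms and so on, until the free list recovers into the 5%-20% band.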

New UNIV_PERF_DEBUG tuning variables:
- innodb_cleaner_max_lru_time to specify the timeout for the LRU flush of one page cleaner thread iteration.
- innodb_cleaner_max_flush_time likewise for the flush list flush.
- innodb_cleaner_lru_chunk_size replaces the hardcoded 100 constant as a chunk size for the LRU flushes.
- innodb_cleaner_flush_chunk_size for specifying the chunk size for the flush list flushes.
- innodb_cleaner_free_list_lwm to specify the percentage of free list length below which LRU flushing will keep on iterating on the same buffer pool instance to prevent an empty free list.
- innodb_cleaner_eviction_factor for choosing between flushed and evicted page counts for LRU flushing heuristics.

[1] http://mikaelronstrom.blogspot.com/2010/09/multiple-buffer-pools-in-mysql-55.html
[2] https://blueprints.launchpad.net/percona-server/+spec/free-list-priority-refill

Blueprint information

Status:
Complete
Approver:
None
Priority:
High
Drafter:
Laurynas Biveinis
Direction:
Approved
Assignee:
Laurynas Biveinis
Definition:
Approved
Series goal:
Accepted for 5.6
Implementation:
Implemented
Milestone target:
5.6.13-61.0
Started by
Laurynas Biveinis
Completed by
Laurynas Biveinis
