Improve menu governor behaviour on ARM platform

Registered by Amit Kucheria

Real-time use cases (high-frequency events) are currently not handled well by cpuidle.

Blueprint information

Amit Kucheria
Amit Kucheria
Amit Daniel Kachhap
Series goal:
Accepted for 11.05
Milestone target:
11.05-final
Started by
Amit Kucheria
Completed by
Amit Kucheria

The cpuidle "menu" governor relies mainly on two kinds of events when deciding which target C/sleep state the processor should enter:
1. the next scheduled timer event
2. device interrupts (disk, network, mouse and other peripherals)

Looking up the next scheduled timer event is trivial, and it provides an upper bound on how long the corresponding CPU can sleep. However, random device interrupts can wake the CPU long before the next timer fires. Today the "menu" governor has some intelligence built in that helps it make relatively better decisions. This includes the latest addition to the heuristics by Arjan, which takes into account what was learned from the last 8 sleep periods (see links below).
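The idea behind the last-8 heuristic can be illustrated with a small sketch. This is not the kernel code: it is a simplified Python model, loosely based on what detect_repeating_patterns is described as doing here. The function name predict_sleep_us, the variance threshold of 400 and the exact rules are assumptions made for illustration only.

```python
from statistics import mean, pvariance

INTERVALS = 8              # number of recent sleep periods remembered
VARIANCE_THRESHOLD = 400   # assumed cut-off (usec^2) below which history counts as "repeating"


def predict_sleep_us(next_timer_us, recent_intervals_us,
                     threshold=VARIANCE_THRESHOLD):
    """Return a predicted idle duration in usec (illustrative model only).

    The next timer event is the upper bound. If the last INTERVALS sleep
    periods were short and nearly identical, assume the pattern repeats
    and predict their average instead.
    """
    predicted = next_timer_us
    history = list(recent_intervals_us)[-INTERVALS:]
    if len(history) < INTERVALS:
        return predicted          # not enough history to detect a pattern

    avg = mean(history)
    if avg > next_timer_us:
        return predicted          # an average beyond the next timer is useless

    # A small variance means the wakeups look periodic.
    if avg > 0 and pvariance(history) < threshold:
        predicted = avg
    return predicted
```

With a timer 10000 usec away and eight previous sleeps of 50 usec each, this model predicts 50 usec. With sleeps that alternate between 50 and 9000 usec it falls back to the timer bound of 10000 usec.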

It has been a known issue that high-frequency events are not handled very well by the "menu" governor. We need to evaluate whether this is still the case and, if it is, get it fixed.

Some related discussions can be found here:

In order to reproduce the shortcomings (slow responsiveness to external events) of the menu governor,
the following tests were performed on the Samsung Orion platform. To make the tests useful, 3 C
states were added with latencies of 1, 100 and 1000 usec. Two sets of tests were performed. In the
first set, the current menu governor was used as is. In the second set, the function
detect_repeating_patterns was commented out. In each set, 4 test cycles were run.
Three user scenarios were considered; they are described below.
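To show why the predicted sleep time matters with these three states, here is a minimal sketch of state selection. It assumes a state is eligible when its exit latency fits both the predicted sleep time and a latency limit. The names CState, STATES and select_state, and the state labels, are hypothetical; the real governor applies further checks that are not modelled here.

```python
from collections import namedtuple

CState = namedtuple("CState", ["name", "exit_latency_us"])

# Exit latencies taken from the test setup (1, 100 and 1000 usec).
# The labels are illustrative only.
STATES = [
    CState("shallow", 1),
    CState("medium", 100),
    CState("deep", 1000),
]


def select_state(predicted_us, latency_limit_us, states=STATES):
    """Pick the deepest state whose exit latency fits both limits.

    Simplified assumption: a state qualifies if its exit latency does not
    exceed the latency limit or the predicted sleep time.
    """
    chosen = states[0]            # the shallowest state is always allowed
    for state in states[1:]:
        if state.exit_latency_us > latency_limit_us:
            continue              # would violate the latency requirement
        if state.exit_latency_us > predicted_us:
            continue              # sleep too short to be worth entering
        chosen = state
    return chosen
```

For example, select_state(50, 2000) returns the shallow state, while select_state(5000, 2000) returns the deep one. A prediction that is too long therefore sends the CPU into a deep state it has to leave again almost immediately.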

SCENARIO 1) Increasing the computational load of the CPU with the cpufreq-bench tool. This tool is used
only to increase the CPU load in a periodic manner. powertop is executed in parallel.
The test script is:
                cpufreq-bench -l 50000 -s 100000 -x 50000 -y 100000 -g performance -r 5 -n 5 -v > temp_result &
                powertop -t 5

RESULT a): (No modification in menu governor)
Cycle 1) C0 = 6.2%, C1 = 0.0%, C2 = 97.3%
Cycle 2) C0 = 2.3%, C1 = 0.0%, C2 = 11.0%
Cycle 3) C0 = 0.1%, C1 = 0.0%, C2 = 16.9%
Cycle 4) C0 = 3.4%, C1 = 0.0%, C2 = 93.3%
RESULT b): (modification in menu governor)
Cycle 1) C0 = 13.0%, C1 = 0.0%, C2 = 86.8%
Cycle 2) C0 = 2.5%, C1 = 0.0%, C2 = 10.8%
Cycle 3) C0 = 0.0%, C1 = 0.0%, C2 = 18.2%
Cycle 4) C0 = 11.2%, C1 = 0.0%, C2 = 86.6%

Conclusion: The last-8 pattern detection code has no effect on computationally intensive work.

SCENARIO 2) External character events are supplied repeatedly and manually to the target board
from the host through the UART console. Although this method is not very precise, the powertop
results are shown below:
RESULT a): (No modification in menu governor)
Cycle 1) C0 = 96.0%, C1 = 0.0%, C2 = 3.9%
Cycle 2) C0 = 94.5%, C1 = 0.0%, C2 = 5.3%
Cycle 3) C0 = 95.0%, C1 = 0.0%, C2 = 4.8%
Cycle 4) C0 = 89.0%, C1 = 0.0%, C2 = 10.0%
RESULT b): (modification in menu governor)
Cycle 1) C0 = 1.9%, C1 = 0.0%, C2 = 98.0%
Cycle 2) C0 = 94.5%, C1 = 0.0%, C2 = 5.4%
Cycle 3) C0 = 2.7%, C1 = 0.0%, C2 = 97.2%
Cycle 4) C0 = 3.7%, C1 = 0.0%, C2 = 96.1%

Conclusion: The last-8 pattern detection code has a big effect in the case of intensive, periodic external events.

SCENARIO 3) A test case was performed using random USB data transfers from the host PC to the target board.
                       The test script and measurement data can be found in
                       The results show that the menu governor is able to scale up to low-latency C states in the USB transfer use case.

In progress


Work Items

Work items:
[amitdanielk] Investigate issues with cpuidle menu governor, specifically on ARM platform: DONE
[amitdanielk] Check the behaviour and performance of the menu governor by disabling the intelligence and heuristics and relying only on the policy decisions (wakeup latency): DONE
[amitdanielk] Added a new pm_qos parameter to mask a specific C state: DONE
[amitdanielk] Investigate other ways to mask C-states based on performance requirements: DONE
[amitdanielk] Created a test program to transfer random data from a Linux host PC to the target USB mass storage device: DONE
[amitdanielk] Verify with other partner platform members whether the above USB test script shows the same behaviour: DONE
