Improve cpufreq ondemand governor for ARM

Registered by Amit Kucheria

cpufreq on ARM needs tuning. The ondemand governor is sometimes too slow to trigger a DVFS transition.

Blueprint information

Status:
Complete
Approver:
Amit Kucheria
Priority:
High
Drafter:
vishwanath sripathy
Direction:
Approved
Assignee:
vishwanath sripathy
Definition:
Approved
Series goal:
Accepted for 11.05
Implementation:
Implemented
Milestone target:
11.05-02
Started by
Amit Kucheria
Completed by
Amit Kucheria

Whiteboard

The current ondemand governor in Linux does not scale frequencies appropriately. The major issues seen are:
1. Slow responsiveness - the ondemand governor does not detect the need for higher MIPS at the start of a use case, leading to lag or dropped frames.
2. Lower performance - some short, CPU-intensive use cases suffer under the ondemand governor.
3. No I/O awareness - the ondemand governor looks only at CPU load and ignores I/O operations, which can hurt I/O performance.

Further discussion can be found at:
http://lwn.net/Articles/384132/

There is a tool called cpufreq-bench, which can be used to measure the performance impact of various cpufreq governors.
Running this utility with various parameters (load time, sleep time, etc.) demonstrates the issue with ondemand.
The tool is available at: http://ftp.riken.go.jp/archives/Linux/suse/people/ckornacker/cpufreq-bench/
More details at: http://lwn.net/Articles/339862/
cpufreq-bench was run on OMAP and x86 platforms with the parameters below:
cpufreq-bench -l 50000 -s 100000 -x 50000 -y 100000 -g ondemand -r 5 -n 5 -v
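
To make the command line easier to read, the sketch below expands it into per-round load and sleep durations. It assumes, based on my reading of the cpufreq-bench help text, that -l and -s are the initial load and sleep times in microseconds, -x and -y are the amounts added after each round, -r is the number of rounds, and -n is the number of load/sleep cycles averaged per round. The function name is hypothetical and not part of the tool.

```python
def round_schedule(load_us, sleep_us, load_step_us, sleep_step_us, rounds):
    """Return one (round, load_us, sleep_us) tuple per benchmark round.

    Assumption: round 1 uses the initial values and every later round
    adds the step values once more.
    """
    schedule = []
    for r in range(rounds):
        schedule.append(
            (r + 1, load_us + r * load_step_us, sleep_us + r * sleep_step_us)
        )
    return schedule


if __name__ == "__main__":
    # Values taken from the command line quoted above.
    for rnd, load, sleep in round_schedule(50000, 100000, 50000, 100000, 5):
        print(f"round {rnd}: load {load} us, sleep {sleep} us")
```

Under these assumptions the load phase grows from 50 ms to 250 ms over the five rounds, so the early rounds are the ones that stress the governor's reaction time most.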

The conclusion of this exercise is that the ondemand performance issue is seen on ARM SoCs (OMAP, Freescale) as well as on x86, but the performance degradation on x86 is smaller.
Performance on OMAP degrades to about 40% with ondemand, whereas on x86 it is around 89%. Further investigation showed that x86 performance is higher because of optimized governor parameters (especially cpufreq_transition_latency).

There is also an interesting patch from David C Niemi (http://kerneltrap.org/mailarchive/linux-kernel/2010/10/6/4628889/thread) that reduces frequent OPP changes when the system is under high load, which improves ondemand performance.

After optimizing cpufreq_transition_latency for OMAP (reduced from 300ms to 30ms; patch available at https://patchwork.kernel.org/patch/356752/) and applying the patch above, cpufreq-bench results on OMAP are much better (worst-case performance is 87%).
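
The 300ms and 30ms figures are easiest to understand as governor sampling periods. As a rough illustration only, the sketch below assumes that the ondemand governor of this kernel generation derives its default sampling period by multiplying the driver's transition latency (converted to microseconds) by a fixed multiplier of 1000. That multiplier and the function name are assumptions of this sketch and should be checked against the kernel tree in use; the governor's lower bound on the sampling period is ignored here.

```python
LATENCY_MULTIPLIER = 1000  # assumed value; verify in the kernel headers


def default_sampling_period_us(transition_latency_ns):
    """Estimate the default ondemand sampling period in microseconds.

    Assumption: period = latency in microseconds * LATENCY_MULTIPLIER,
    with the latency clamped to at least 1 us.
    """
    latency_us = max(transition_latency_ns // 1000, 1)
    return latency_us * LATENCY_MULTIPLIER


if __name__ == "__main__":
    for latency_ns in (300_000, 30_000):
        period_ms = default_sampling_period_us(latency_ns) / 1000
        print(f"latency {latency_ns} ns -> sampling period {period_ms:.0f} ms")
```

Under this assumption a driver-reported latency of 300 us corresponds to a 300 ms sampling period and 30 us to 30 ms, which would explain why lowering the reported latency makes the governor react sooner.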

RESULTS:
cpufreq-bench results without optimization on OMAP:
Round 1 - 41.11%
Round 2 - 41.61%
Round 3 - 40.79%
Round 4 - 41.17%
Round 5 - 52.58%

Time spent in different P states:
300M - 12.26%
600M - 0.28%
800M - 0%
1000M - 87.33%

cpufreq-bench results with optimization on OMAP:
Round 1 - 90.24%
Round 2 - 94.48%
Round 3 - 96.06%
Round 4 - 96.6%
Round 5 - 86.89%

Time spent in different P states:
300M - 3.26%
600M - 0.4%
800M - 0%
1000M - 96.33%
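
Residency figures like the ones above can be derived from the cpufreq statistics interface. The sketch below assumes the counters are available as lines of "<frequency in kHz> <time>", which is the format of the cpufreq-stats time_in_state file as far as I know, and converts them to percentages. The sample text is invented for illustration and is not measured data; the function name is hypothetical.

```python
def residency_percent(time_in_state_text):
    """Convert 'freq_khz time' lines into {freq_khz: percent of total time}."""
    counts = {}
    for line in time_in_state_text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        counts[int(fields[0])] = int(fields[1])
    total = sum(counts.values())
    if total == 0:
        return {freq: 0.0 for freq in counts}
    return {freq: 100.0 * t / total for freq, t in counts.items()}


# Made-up sample input, NOT taken from the measurements in this blueprint.
SAMPLE = """\
300000 25
600000 0
800000 0
1000000 75
"""

if __name__ == "__main__":
    for freq_khz, pct in residency_percent(SAMPLE).items():
        print(f"{freq_khz // 1000}M - {pct:.2f}%")
```

On a real system the input would typically be read from a file such as /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state, provided the kernel was built with cpufreq statistics enabled.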

cpufreq-bench results on X86 platform:
Round 1 - 88.67%
Round 2 - 94.71%
Round 3 - 95.53%
Round 4 - 96.34%
Round 5 - 98.03%
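
The per-round figures above can be condensed into a worst case and a mean for each configuration. The numbers in the sketch are copied from the results in this section; the dictionary keys and the function name are only labels chosen for this example.

```python
# Per-round cpufreq-bench scores (percent), copied from the results above.
RESULTS = {
    "OMAP, no optimization": [41.11, 41.61, 40.79, 41.17, 52.58],
    "OMAP, optimized": [90.24, 94.48, 96.06, 96.6, 86.89],
    "x86": [88.67, 94.71, 95.53, 96.34, 98.03],
}


def summarize(rounds):
    """Return (worst, mean) for a list of per-round scores."""
    return min(rounds), sum(rounds) / len(rounds)


if __name__ == "__main__":
    for name, rounds in RESULTS.items():
        worst, mean = summarize(rounds)
        print(f"{name}: worst {worst:.2f}%, mean {mean:.2f}%")
```

The worst case for the optimized OMAP run is round 5 (86.89%), which is the 87% figure quoted earlier.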

Status:
In Progress
The sampling down factor patch has been merged into the 2.6.37 kernel and is available at
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3f78a9f7fcee0e9b44a15f39ac382664e301fad5
By tuning this value appropriately (100 for OMAP) along with an optimized cpufreq transition latency value, ondemand performance was found to improve on OMAP platforms when tested with cpufreq-bench.
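
A minimal sketch of applying this tuning at run time. It assumes the ondemand tunables are exposed in a sysfs directory such as /sys/devices/system/cpu/cpufreq/ondemand (the location differs between kernel versions, so it is passed in as a parameter) and that the process is allowed to write there. The helper name and the lower bound of 1 are assumptions of this sketch.

```python
import os

# Assumed default location; it varies between kernel versions.
ONDEMAND_DIR = "/sys/devices/system/cpu/cpufreq/ondemand"


def set_sampling_down_factor(value, tunables_dir=ONDEMAND_DIR):
    """Write sampling_down_factor and return the value read back."""
    if value < 1:
        raise ValueError("sampling_down_factor must be at least 1")
    path = os.path.join(tunables_dir, "sampling_down_factor")
    with open(path, "w") as f:
        f.write(f"{value}\n")
    with open(path) as f:
        return int(f.read().strip())
```

With the value mentioned above for OMAP, a call would look like set_sampling_down_factor(100), run with sufficient privileges while the ondemand governor is active.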

Optimizing the cpufreq transition latency value does not seem to affect power consumption, because the governor's sampling timer is deferrable and does not cause any wakeups.

Work Items

[vishwanath-bs] cpufreq: start following up the discussions on ondemand governor issues on the cpufreq and linux-pm mailing lists: DONE
[vishwanath-bs] cpufreq: reproduce the issue on OMAP platform: DONE
[vishwanath-bs] cpufreq: Come up with a simple use case to reproduce ondemand governor problems: DONE
[vishwanath-bs] cpufreq: Try reproducing ondemand issue using cpufreq-bench on x86 platform: DONE
[vishwanath-bs] cpufreq: Optimize cpufreq parameters for OMAP and post patch for review: DONE
[vishwanath-bs] cpufreq: Document the sysfs entry for sampling multiplier for ondemand governor: DONE
[vishwanath-bs] cpufreq: Upstream documentation changes: DONE
[vishwanath-bs] cpufreq: Do actual power measurement with/without the patch: DONE
[vincent-guittot] cpufreq: Try reproducing ondemand issue using cpufreq-bench on ST platform: DONE
[vincent-guittot] cpufreq: Optimize cpufreq parameters for ST platform (if needed) and post patch for review: DONE
[amitdanielk] cpufreq: Try reproducing ondemand issue using cpufreq-bench on samsung platform: DONE
[amitdanielk] cpufreq: Optimize cpufreq parameters for Samsung (if needed) and post patch for review: DONE
[yong.shen] cpufreq: Try reproducing ondemand issue using cpufreq-bench on Freescale platform: DONE
[yong.shen] cpufreq: Optimize cpufreq parameters for Freescale (if needed) and post patch for review: DONE
