ARM power mgmt - Policy management

Registered by Amit Kucheria

NOTE: This blueprint is out of the PMWG Roadmap and will not be worked on unless that changes.

Power management is spread across multiple modules within the Linux kernel. As the level of integration of a system-on-a-chip (SOC) increases, so does its power consumption, and in order to hit the lowest power states for any given use case, additional kernel modules are added or adapted to manage the total SOC power consumption. These kernel modules are increasingly interdependent, and managing all possible factors that impact power consumption in isolation is fast becoming unwieldy. It is desirable to have a single point that can read from all relevant frameworks that impact device power management, as well as feed desired goals or state information to such frameworks or their governors. A policy manager is to be defined and implemented that will read common interfaces (either kernel signals or user-space interfaces via the filesystem) regarding power state, take inputs from a user or governor, and provide constraints or requirements directly to the PM frameworks.

  Modules affected by this policy manager include, but are not limited to, cpufreq, cpuidle, cpu hotplug, PM QoS, Runtime PM, the clock framework, and the regulator framework. It is anticipated that some of these frameworks may need to be enhanced, either to provide new functionality or to allow values to be changed from an external source. The overriding goal for this policy manager, however, is to impact the core structure of the Linux PM frameworks as little as possible, and to serve more as an aggregator and cross-module constraint manager.

  Thermal management will also be needed. A thermal manager will be designed independently of the policy manager. The thermal manager will monitor the device temperature and, when certain thresholds are reached, it will set constraints on cpufreq (control speed), cpuidle (turn features off) and cpu hotplug (remove cores). There is already a separate blueprint (cpufreq-thermal-management) that is in progress; it is possible that this overriding thermal manager will roll into that work.

Blueprint information

Status:
Complete
Approver:
Amit Kucheria
Priority:
Low
Drafter:
None
Direction:
Needs approval
Assignee:
None
Definition:
Obsolete
Series goal:
Accepted for trunk
Implementation:
Not started
Milestone target:
backlog
Completed by
Serge Broslavsky

Related branches

Sprints

Whiteboard

Status:
[sjahnke] Purpose and high level goals: Done
[sjahnke] PM Block Diagram: Done (needs to be put on a common site, as the Whiteboard here apparently does not allow diagrams; will ask where the right place to put it is). Required to get a common understanding of the underlying PM infrastructure and the signaling/dependencies among the frameworks.

Policy Manager Design Architecture: In Progress
[sjahnke] Design specification: In progress. Will first add a means to manage CPU hotplug and build off of that.

Implementation Needs
[sjahnke] As a stand-alone project, both the policy manager and the thermal manager will need to be hosted independently. Right now, git.linaro.org/people/stevejahnke/thermal_manager.git holds the OS- and architecture-specific functions, while git.linaro.org/people/stevejahnke/thermal_library.git holds the thermal management core. The focus is on the thermal manager at the moment, as it is a required function and the design architecture used can be readily adapted to a general power policy manager as well.

---------------------------
SOC Thermal Manager
---------------------------
Note that this is a holding place to get the discussions going. The information below is most likely more appropriately hosted elsewhere.

Overview
  This project will initially be a user space application that periodically reads a silicon temperature sensor. If a certain threshold is exceeded, it will start to cap the maximum frequency at which the cores may run (restricting the states that cpufreq may enter) and then, if the temperature still does not drop below the threshold, will start to remove processor cores via cpu hotplug. Setting constraints on drivers as well (such as the memory interface or the interconnect bus) will be considered, although that is beyond the scope of this initial definition.

  Note that this thermal policy manager will initially make use of the hwmon class to read the temp sensor directly, and affect core states by writing to the appropriate sysfs interface for the cpufreq governor. However, this will move to reading the thermal framework class interfaces once an ARM-specific implementation of the thermal framework (kernel space) is defined and implemented.
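The two sysfs accesses described above can be sketched as follows. This is only an illustration, not the project's code: the function names and the directory arguments are hypothetical. It relies on the standard hwmon convention that tempN_input reports millidegrees Celsius and on cpufreq's scaling_max_freq taking a value in kHz.

```python
from pathlib import Path


def read_temp_millicelsius(hwmon_dir: Path, channel: int = 1) -> int:
    """Read tempN_input from one hwmon device directory.

    hwmon reports temperatures in millidegrees Celsius.
    """
    return int((hwmon_dir / f"temp{channel}_input").read_text().strip())


def cap_max_freq(cpu_dir: Path, freq_khz: int) -> None:
    """Cap the highest frequency (in kHz) that cpufreq may select for one CPU.

    The active cpufreq governor (ondemand, etc.) keeps running; only its
    upper bound is lowered.
    """
    (cpu_dir / "cpufreq" / "scaling_max_freq").write_text(f"{freq_khz}\n")
```

On a real system the arguments would be something like Path("/sys/class/hwmon/hwmon0") and Path("/sys/devices/system/cpu/cpu0"); which hwmon index maps to the SOC sensor is platform dependent, and writing scaling_max_freq normally requires root.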

Design Parameters
  The following are the goals for the project:

  1. All user space at the moment. If there is a kernel module that needs to export any information to the system for this project to use, that should be added as part of that kernel project separately (and hence becomes a dependency of this project). In the future, a kernel space version of this code can be developed if required. Thermal changes are relatively “slow” compared with other PM functions, so any timing inefficiencies from going to user space SHOULD have minimal impact.

  2. Initially restricted to consider thermal only. Whether this code ties into a general policy manager or remains independent will be considered once prototypes of this system are done.

  3. SOC agnostic. As a user-space application, there is no need to put any SOC-specific dependencies in this code.

  4. Follow a “framework – governor” methodology. The code shall be designed to have a clear distinction between the system handlers that provide data I/O for the application and the governor or decision-maker. The governor aspects of the code shall be easily interchangeable so that various governor algorithms can be tried without impacting the framework aspects of the code. Whether the framework component is a separate library (and the governor a stand-alone application) or both are simply a single application is to be decided.

  5. Initial focus will be on setting CPU states and hotplug events only. How to expand to drivers will be considered as a next step.

  6. Allow existing governors to be used by cpufreq (ondemand, etc.). We do NOT want to require a user-space governor for cpufreq.
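One possible way to keep the governor interchangeable, as goal 4 asks, is for the framework to depend only on a small governor interface. The sketch below is an assumption about how that split could look; the class names, the decide() method and the "throttle"/"hold" return values are all hypothetical.

```python
from typing import Callable, Protocol


class Governor(Protocol):
    """Decision-maker interface: the only thing the framework relies on."""

    def decide(self, temp_mc: int) -> str:
        ...


class ThresholdGovernor:
    """Example governor: request throttling above a fixed threshold."""

    def __init__(self, threshold_mc: int) -> None:
        self.threshold_mc = threshold_mc

    def decide(self, temp_mc: int) -> str:
        return "throttle" if temp_mc > self.threshold_mc else "hold"


class AlwaysHoldGovernor:
    """Alternative governor that never throttles (useful as a baseline)."""

    def decide(self, temp_mc: int) -> str:
        return "hold"


class Framework:
    """Owns the data I/O; knows nothing about the governor's algorithm."""

    def __init__(self, read_temp: Callable[[], int], governor: Governor) -> None:
        self.read_temp = read_temp
        self.governor = governor

    def poll_once(self) -> str:
        return self.governor.decide(self.read_temp())
```

Swapping algorithms then means passing a different governor object to Framework, with no change to the I/O side.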

Design Architecture

Framework
  1. At boot, read in valid speed ranges for each CPU – store the cpufreq/scaling_available_frequencies.

  2. Receive signal from governor requesting current temperature.

  3. Read the appropriate hwmon for the SOC temperature sensor. The SOC temperature sensor driver must export its information via the thermal framework so that it is appropriately registered as a /sys/class/hwmon device.

  4. Send the temp data (via a /sys/class/hwmon read) to the governor.

  5. Retrieve the governor data pointer for the frequency to set. The structure should have at least 2 fields – an “is changed” field and the desired value. If the governor does not make a change, it should set “is changed” to false. Which core the value should be applied to also needs to be passed.

  6. Retrieve the governor data pointer for which cores to remove. This can be set up as a “remove these cores” type structure or a structure that says “here are all the cores that can be active”; which approach is TBD.

  7. Write the received frequency to cpufreq/scaling_max_freq if “is changed” is set to true.

  8. Echo a 0 to /sys/devices/system/cpu/cpuX/online for each core that is to go offline.
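Steps 1 and 5 to 8 could look roughly like the sketch below. It assumes the “remove these cores” variant from step 6 (the blueprint leaves that choice TBD), and FreqRequest, read_available_freqs and apply_requests are hypothetical names. The cpu_root argument stands in for /sys/devices/system/cpu so the code can be exercised against a scratch directory.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class FreqRequest:
    """Governor-to-framework structure from step 5."""
    cpu: int                  # which core the value applies to
    is_changed: bool = False  # False means "leave this core alone"
    max_freq_khz: int = 0     # desired cap, in kHz


def read_available_freqs(cpu_root: Path, cpu: int) -> List[int]:
    """Step 1: read the valid frequencies for one CPU, sorted ascending."""
    path = cpu_root / f"cpu{cpu}" / "cpufreq" / "scaling_available_frequencies"
    return sorted(int(f) for f in path.read_text().split())


def apply_requests(cpu_root: Path, requests: List[FreqRequest],
                   offline: List[int]) -> None:
    """Steps 7 and 8: write changed caps, then take listed cores offline."""
    for req in requests:
        if req.is_changed:
            target = cpu_root / f"cpu{req.cpu}" / "cpufreq" / "scaling_max_freq"
            target.write_text(f"{req.max_freq_khz}\n")
    for cpu in offline:
        (cpu_root / f"cpu{cpu}" / "online").write_text("0\n")
```

Against the real sysfs tree these writes normally need root, and not every driver exposes scaling_available_frequencies, so a production version would need error handling that this sketch omits.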

Governor (default)
  1. Read in valid frequencies for each CPU (or store once at boot).

  2. Wake up after a predetermined interval (500 ms, but this must be configurable).

  3. Signal to the framework to pass the existing SOC temperature.

  4. Compare SOC temperature to existing threshold.

  5. If temperature > threshold, set the signalling pointer's “is changed” field to true and pass the next lower valid frequency. If temperature < threshold, leave “is changed” set to false so that the current, higher frequency remains valid. If already at the lowest frequency and still > threshold, reduce the number of cores by one. If already reduced to one core, send an abort, error, or similar message back to the system. The exact “total failure” message and its format are TBD.
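The decision in step 5 can be written as a single pure function, sketched below under a few assumptions that the text leaves open: a temperature exactly equal to the threshold is treated as "no change", the current frequency is one of the listed valid frequencies, and "total failure" is reported when only one core remains and the temperature is still above the threshold. The Decision structure and step() are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Decision:
    is_changed: bool      # True only when a new frequency cap is requested
    max_freq_khz: int     # frequency cap to apply (unchanged if not is_changed)
    online_cores: int     # number of cores that should remain online
    failed: bool = False  # "total failure": nothing left to reduce


def step(temp_mc: int, threshold_mc: int, freqs_khz: List[int],
         cur_khz: int, online_cores: int) -> Decision:
    """One governor wakeup: step frequency down first, then cores."""
    freqs = sorted(freqs_khz)
    if temp_mc <= threshold_mc:
        # Assumption: at or below the threshold nothing is changed.
        return Decision(False, cur_khz, online_cores)
    idx = freqs.index(cur_khz)  # assumes cur_khz is a valid frequency
    if idx > 0:
        # Pass the next lower valid frequency.
        return Decision(True, freqs[idx - 1], online_cores)
    if online_cores > 1:
        # Already at the lowest frequency: remove one core.
        return Decision(False, cur_khz, online_cores - 1)
    # Lowest frequency and a single core left: report failure.
    return Decision(False, cur_khz, online_cores, failed=True)
```

Note that the text describes only the step-down path; how and when to raise the cap again or bring cores back online is not specified, so the sketch does not attempt it.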

Signalling between Governor and Framework
  For non-Android based systems, D-Bus will be used. External applications may connect to the exported thermal_manager signals and monitor thermal events. If any action at the system level is needed, these external applications can act upon a thermal event.
  For Android, a JNI wrapper is to be created that will allow any other Android app to connect to a ThermalObserver class; this is an extension of the UEventObserver class.

Configuration
  Both the framework and governors should have any configurable values set in a config file (as per most user space applications). The config file should follow common formatting conventions. Separate config files for the governor and framework aspects should be set as they are distinct modules.
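As an illustration of what a governor config file might contain, the sketch below parses an INI-style file with Python's standard configparser. The section name, key names and the 85000 threshold value are hypothetical; only the 500 ms default interval comes from the governor description above.

```python
import configparser

# Hypothetical contents of a governor config file.
EXAMPLE_GOVERNOR_CONF = """
[governor]
poll_interval_ms = 500
threshold_millicelsius = 85000
"""


def load_governor_config(text: str) -> dict:
    """Parse governor settings; the wakeup interval falls back to 500 ms."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "poll_interval_ms": parser.getint(
            "governor", "poll_interval_ms", fallback=500),
        # No fallback: a missing threshold raises configparser.NoOptionError.
        "threshold_millicelsius": parser.getint(
            "governor", "threshold_millicelsius"),
    }
```

A framework config file could follow the same pattern with its own section, for example for the hwmon path to read, keeping the two modules' settings separate as described.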


Work Items