DVFS for the Common Clock Framework

Registered by Michael Turquette on 2012-06-03

This Blueprint has been moved to JIRA: https://cards.linaro.org/browse/PMWG-17

Implement and merge the needed infrastructure for platforms to start using the clock framework as the basis for DVFS transitions.

Blueprint information

Status: Complete
Approver: Amit Kucheria
Priority: High
Drafter: None
Direction: Approved
Assignee: Michael Turquette
Definition: Approved
Series goal: Accepted for trunk
Implementation: Implemented
Milestone target: 2013.05
Started by: Amit Kucheria on 2012-06-06
Completed by: Serge Broslavsky on 2013-08-27

Whiteboard

Meta:
Roadmap id: CARD-120
Headline: Implement and merge the needed infrastructure for platforms to start using the clock framework as the basis for DVFS transitions.
Acceptance: TBD

Misc notes:
* spent Week 1 getting a home office and workstation set up for development
* catching up on LAKML mails from Nov 30, 2012 to present. To be completed by end of Week 2
* Targeting Week 3 to begin another RFC

[ototo, 2013-02-01] Moving unfinished items to 2013.01.
[mturquette, 2013-02-10] reworked the work items to break up large tasks into more discrete pieces

Reentrancy notes:
* need to figure out if .prepare callbacks can call clk_prepare_enable(dependant_clk) without incurring lockdep's wrath
* test rate-change notifiers to see if they keep the same task id as their parent caller
** good news! the same task is used for both the notifier dispatcher and the notifier handler(s)
** this means that rate-change notifiers can re-enter the clock framework using the get_current() method
** thus scaling voltage via an i2c message should no longer result in a deadlock
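
The owner-tracking idea in these notes can be illustrated in user space. This is a minimal Python sketch, not the kernel patch: threading.get_ident() stands in for get_current(), and the names ReentrantClkLock, set_rate and scale_voltage_notifier are hypothetical.

```python
import threading


class ReentrantClkLock:
    """Lock that the owning thread may re-acquire (hypothetical sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None  # thread id of the current holder, if any
        self._depth = 0     # how many times the owner has entered

    def __enter__(self):
        me = threading.get_ident()  # user-space stand-in for get_current()
        if self._owner == me:
            # Same task re-entering, e.g. from a rate-change notifier.
            self._depth += 1
            return self
        self._lock.acquire()
        self._owner = me
        self._depth = 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._depth -= 1
        if self._depth == 0:
            self._owner = None
            self._lock.release()
        return False


clk_lock = ReentrantClkLock()
events = []


def scale_voltage_notifier(rate_khz):
    # Runs in the caller's thread and re-enters the "framework".
    with clk_lock:
        events.append(("notifier", rate_khz, threading.get_ident()))


def set_rate(rate_khz):
    with clk_lock:
        events.append(("set_rate", rate_khz, threading.get_ident()))
        # With a plain threading.Lock this nested entry would deadlock.
        scale_voltage_notifier(rate_khz)


set_rate(700000)
```

Because the dispatcher and the handler share one thread id, the nested entry only bumps a depth counter instead of blocking on the lock.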

RCU performance:
* The RCU lock implementation takes about 12 seconds longer to boot than rwlock (init memory is freed at 15.2 s versus 2.9 s)
** RCU bootlog:
Uncompressing Linux... done, booting the kernel.
Warning: Neither atags nor dtb found
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 3.8.0-rc3-00032-g1215458 (mturquette@quantum) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-1ubuntu1) ) #29 SMP Wed Feb 13 16:57:43 PST 2013
...
[ 14.861938] twl_rtc twl_rtc: setting system clock to 2000-01-01 02:25:08 UTC (946693508)
[ 14.945495] clk_test_reentrancy_late_init: here1
[ 15.195281] Freeing init memory: 5124K

** rwlock bootlog:
Uncompressing Linux... done, booting the kernel.
Warning: Neither atags nor dtb found
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 3.8.0-rc3-00032-gb47b9d7 (mturquette@quantum) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-1ubuntu1) ) #31 SMP Wed Feb 13 17:06:51 PST 2013
...
[ 2.756225] twl_rtc twl_rtc: setting system clock to 2000-01-01 02:31:04 UTC (946693864)
[ 2.840820] clk_test_reentrancy_late_init: here1
[ 2.859954] Freeing init memory: 5124K

* synchronize_rcu() is called twice for every new entry into the clock framework
** this is very expensive and does not fit well within RCU's read-centric design

** atomic ops bootlog:
Uncompressing Linux... done, booting the kernel.
Warning: Neither atags nor dtb found
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 3.8.0-rc3-00029-g06eb8a3-dirty (mturquette@quantum) (gcc version 4.7.2 (Ubuntu/Linaro 4.7.2-1ubuntu1) ) #3 SMP Tue Feb 26 16:55:44 PST 2013
...
[ 2.524383] twl_rtc twl_rtc: setting system clock to 2000-01-02 04:08:28 UTC (946786108)
[ 2.604705] clk_test_reentrancy_late_init: here1
[ 2.623260] Freeing init memory: 5124K

* We save about 0.2 seconds of boot time (2.86 s versus 2.62 s) by falling back to atomic ops for tracking task context instead of the heavier-weight rwlocks
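
The boot-time differences quoted in this section can be recomputed from the "Freeing init memory" timestamps in the three bootlogs. This is a small sketch of that arithmetic; the dictionary keys and the delta helper are made up for illustration.

```python
# Timestamps (seconds) of "Freeing init memory", copied from the bootlogs.
boot_done = {
    "rcu": 15.195281,
    "rwlock": 2.859954,
    "atomic": 2.623260,
}


def delta(slow, fast):
    """Seconds by which the 'slow' variant boots later than the 'fast' one."""
    return round(boot_done[slow] - boot_done[fast], 6)


rcu_vs_rwlock = delta("rcu", "rwlock")        # roughly 12.3 s
rwlock_vs_atomic = delta("rwlock", "atomic")  # roughly 0.24 s

print("RCU vs rwlock: %.3f s" % rcu_vs_rwlock)
print("rwlock vs atomic ops: %.3f s" % rwlock_vs_atomic)
```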

[mturquette, 2013-02-13] clk_prepare, clk_unprepare, clk_enable & clk_disable are all converted. The remaining APIs must be converted and validated before publishing the RFC next week.
[mturquette, 2013-02-15] nice slide presentation explaining task_struct: http://www.cs.columbia.edu/~nahum/w4118/lectures/Processes.ppt

[mturquette, 2013-02-20] remove the internal __clk_whatever() APIs and only expose the top-level APIs. Since we're reentrant this should work?

[mturquette, 2013-02-25] David Brown looked over the reentrancy patch and had a good suggestion to use atomic ops instead of rwlocks. He also thought that the general approach was not incorrect. Getting close!

[mturquette, 2013-02-26] Smoke testing of both DVFS and atomic ops (in place of rwlock) appears sane. Log from testing cpufreq with the userspace governor:
/sys/devices/system/cpu/cpu0/cpufreq #
/sys/devices/system/cpu/cpu0/cpufreq # echo 350000 > scaling_setspeed
/sys/devices/system/cpu/cpu0/cpufreq # cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 10 (v7l)
BogoMIPS : 694.85
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10

processor : 1
model name : ARMv7 Processor rev 10 (v7l)
BogoMIPS : 698.17
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10

Hardware : OMAP4 Panda board
Revision : 0000
Serial : 0000000000000000
/sys/devices/system/cpu/cpu0/cpufreq # cat /sys/class/regulator/regulator.3/microvolts
1025000
/sys/devices/system/cpu/cpu0/cpufreq #
/sys/devices/system/cpu/cpu0/cpufreq # echo 900000 > scaling_setspeed
/sys/devices/system/cpu/cpu0/cpufreq # cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 10 (v7l)
BogoMIPS : 1826.11
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10

processor : 1
model name : ARMv7 Processor rev 10 (v7l)
BogoMIPS : 1834.82
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10

Hardware : OMAP4 Panda board
Revision : 0000
Serial : 0000000000000000
/sys/devices/system/cpu/cpu0/cpufreq #
/sys/devices/system/cpu/cpu0/cpufreq # cat /sys/class/regulator/regulator.3/microvolts
1313000
/sys/devices/system/cpu/cpu0/cpufreq #
/sys/devices/system/cpu/cpu0/cpufreq # echo 700000 > scaling_setspeed
/sys/devices/system/cpu/cpu0/cpufreq # cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 10 (v7l)
BogoMIPS : 1389.71
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10

processor : 1
model name : ARMv7 Processor rev 10 (v7l)
BogoMIPS : 1396.34
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part : 0xc09
CPU revision : 10

Hardware : OMAP4 Panda board
Revision : 0000
Serial : 0000000000000000
/sys/devices/system/cpu/cpu0/cpufreq #
/sys/devices/system/cpu/cpu0/cpufreq # cat /sys/class/regulator/regulator.3/microvolts
1200000
/sys/devices/system/cpu/cpu0/cpufreq # echo "it works!"
it works!
/sys/devices/system/cpu/
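
The frequency/voltage pairs seen in the log above can be written down as a small table. This is a sketch only: the values are the ones printed in the log (scaling_setspeed in kHz, regulator in microvolts), while the dvfs_steps helper and the ordering rule it applies (raise voltage before raising frequency, lower it after lowering frequency) are assumptions, not something shown in the log.

```python
# Frequency (kHz) -> voltage (uV) pairs observed in the log.
OPP_TABLE = {
    350000: 1025000,
    700000: 1200000,
    900000: 1313000,
}


def dvfs_steps(cur_khz, new_khz):
    """Return the ordered steps for one transition (hypothetical helper)."""
    if new_khz == cur_khz:
        return []
    set_volt = ("set_voltage_uv", OPP_TABLE[new_khz])
    set_rate = ("set_rate_khz", new_khz)
    if new_khz > cur_khz:
        # Scaling up: the higher voltage must be in place first.
        return [set_volt, set_rate]
    # Scaling down: drop the frequency first, then the voltage.
    return [set_rate, set_volt]


for cur, new in [(350000, 900000), (900000, 700000)]:
    print(cur, "->", new, dvfs_steps(cur, new))
```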

Refactoring notes:
Certain aspects of the common clk framework need to be rewritten or refactored. This includes moving past limitations of the current clk.h API as well as fixing some design mistakes made in the days before DT existed. Some todo items:
* create per-user clock objects for tracking user constraints
** http://article.gmane.org/gmane.linux.kernel/1402006
** I've already emailed Vincent about this patch
* remove clk-private.h
** OMAP is almost to the point where this can happen
* remove internal __clk_whatever() functions
** replace with reentrant top-level calls?
** might need to rethink this considering the per-user tracking item above
** lowest priority
* clk_set_rate should select the best parent
** replace most instances of clk_set_parent with clk_set_rate in drivers
* track per-user clock rate requests
** introduce some small policy bits into the clock framework for this
** going to fork pretty hard from the legacy clk.h api on this one...
** necessary for dvfs
* don't use immutable string parent clock names for linking parents to their children during clk_register
** DT might provide a way to do this, need to dig deeper
* implement clk_unregister
** should migrate the children of a clk to the orphan list
** then should free the clock entirely
* implement clk_put
** should free the per-user clock object
* improve __clk_init and _clk_register mess with respect to clk_init_data
* make it so that 100% of in-tree clock data can be marked as initdata
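
The clk_unregister item above (migrate children to the orphan list, then free the clock) can be sketched with a toy clock tree. This is an illustration only, not the framework's data structures; the Clk class, the orphans list and the clock names are hypothetical.

```python
class Clk:
    """Toy clock node with a parent pointer and a child list."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


orphans = []  # clocks whose parent is currently not registered


def clk_unregister(clk):
    """Detach clk from the tree and move its children to the orphan list."""
    if clk.parent is not None:
        clk.parent.children.remove(clk)
        clk.parent = None
    elif clk in orphans:
        orphans.remove(clk)
    for child in clk.children:
        child.parent = None
        orphans.append(child)
    # Drop the last references; in C this is where the clock would be freed.
    clk.children = []


osc = Clk("osc")
pll = Clk("pll", parent=osc)
div_a = Clk("div_a", parent=pll)
div_b = Clk("div_b", parent=pll)

clk_unregister(pll)
print([c.name for c in orphans])
```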

[mturquette, 2013-04-06]
* of_init_opp_table is a good start for parsing frequency/voltage pairs
** drivers/base/power/opp.c
* create a link between clocks and regulators in DT
** should this go in the clock binding, or should it be used by the device that consumes the clock?
* can of_clk_get take in a "con_id", like clkdev.h clk_get?
** or at least can the enum names of the clocks be shared via the new device tree "include chroot" instead of using a raw index number?

[mturquette, 2013-06-28]
Some of the work items I had below had nothing to do with DVFS but were instead part of ongoing maintenance of the clock framework. I have moved them to the whiteboard for posterity:

* fix the "2GHz problem" for clk_round_rate
** http://article.gmane.org/gmane.linux.kernel/1224685
* publish clk torture tests to list
* create per-user clock objects for tracking user constraints
** http://article.gmane.org/gmane.linux.kernel/1402006
* don't use immutable string parent clock names for linking parents to their children during clk_register
* improve __clk_init and _clk_register mess with respect to clk_init_data
* make it so that 100% of in-tree clock data can be marked as initdata
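
One way to picture the clk_round_rate item in this list, assuming it refers to clk_round_rate returning a signed long: on a 32-bit system a rate above 2^31 - 1 Hz no longer fits and reads back as a negative number, which callers treat as an error code. This sketch emulates the 32-bit long with ctypes; the helper names are hypothetical and the linked thread may frame the problem differently.

```python
import ctypes

LONG_MAX_32 = 2**31 - 1  # largest value a 32-bit signed long can hold


def as_long32(rate_hz):
    """Return rate_hz as a 32-bit signed long would store it."""
    return ctypes.c_int32(rate_hz).value


def looks_like_error(ret):
    """Callers conventionally treat a negative return as an error code."""
    return ret < 0


for rate in (1200000000, 2400000000):
    ret = as_long32(rate)
    print(rate, "->", ret, "error?", looks_like_error(ret))
```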


Work Items

Work items:
[mturquette] hack together prototype: DONE
[mturquette] enable clock tree reentrancy: DONE
[mturquette] design OPP library interface: DONE
[mturquette] publish RFC: DONE
[mturquette] publish 3.5 clk-fixes branch, Cc stable: DONE
[mturquette] publish clk-next branch, merge into linux-next: DONE
[mturquette] send pull request to Linus during 3.6 merge window: DONE

Work items for 2012.10:
RFC patches sent to list: DONE

Work items for 2013.01:
[mturquette] improve re-parenting in clk_set_rate: DONE
[mturquette] create new clk-next branch based on 3.8-rcN: DONE
[mturquette] brainstorm alternative locking mechanism: DONE
[mturquette] set up meeting with rajagopal to discuss locking schemes: DONE

Work items for 2013.02:
[mturquette] consider splitting DVFS framework out from clock framework: DONE
[mturquette] rework dvfs/reentrancy RFC v3: DONE
[mturquette] test RFC v3: DONE
[mturquette] publish V3: DONE
[mturquette] review OPP RCU locking scheme: DONE
[mturquette] test replacing RCU lock with rwlock: DONE
[mturquette] test replacing rwlock with atomic ops and measure performance impact: DONE
[mturquette] make all clk_ops reentrant: DONE
[mturquette] write clk torture tests: DONE

Work items for 2013.03:
[mturquette] deliver dvfs talk at LCA and gather feedback: DONE

Work items for 2013.04:
[mturquette] merge reentrancy patch for 3.10: DONE
[mturquette] consider removing internal __clk_whatever() functions: DONE
[mturquette] implement clk_unregister: DONE
[mturquette] beautify Kconfig and Makefile: DONE

Work items for 2013.05:
[mturquette] explore if clk flags can be made into DT properties or "compatible" strings (see arch/arm/mach-kirkwood/board-km_kirkwood.c): DONE
[mturquette] experiment with new clk.h APIs for dvfs: DONE
[mturquette] clk_set_rate should select the best parent (http://article.gmane.org/gmane.linux.kernel/1462214): DONE

Work items for 2013.06:
[mturquette] DT bindings for OPPs that tie in clocks and regulators: DONE
[mturquette] publish DT bindings for basic clocks: DONE
[mturquette] publish OMAP4 clock conversion to DT: DONE

Work items for 2013.07:
[mturquette] cpufreq-cpu0 gets clock and regulator from DT: DONE
[mturquette] make CLK_SET_PARENT_RATE enabled by default: TODO
[mturquette] remove all instances of clk-private.h: TODO
[mturquette] remove clk-private.h: TODO
[mturquette] implement clk_put: INPROGRESS
