Dynamic timer irq affinity

Registered by Daniel Lezcano on 2013-02-20

When the cpuidle driver chooses a deep idle state, the timer device switches to the broadcast timer.
When this timer expires, it may wake up a processor that is not concerned by the event, which then just sends an IPI to wake up the target cpu. This could be improved by setting the timer irq affinity to the cpu concerned by the first timer expiration. This way, we prevent an extra and unnecessary wake up.
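
The effect described above can be illustrated with a toy model. This is a minimal sketch and not kernel code: the function names and the list of target cpus are hypothetical, and it assumes every expiration costs one wake up on the cpu receiving the irq, plus one more when an IPI has to be forwarded to another cpu.

```python
# Toy model only: not kernel code, all names are hypothetical.

def wakeups_fixed_affinity(expiries, irq_cpu):
    """Count wake ups when the broadcast timer irq always lands on irq_cpu.

    expiries is a list of target cpu ids, one per timer expiration.
    """
    wakeups = 0
    for target in expiries:
        wakeups += 1          # irq_cpu is woken up by the broadcast timer
        if target != irq_cpu:
            wakeups += 1      # irq_cpu sends an IPI which wakes up the target
    return wakeups


def wakeups_dynamic_affinity(expiries):
    """Count wake ups when the irq is steered to the cpu owning the expiry."""
    return len(expiries)      # only the target cpu is woken up


def next_affinity_cpu(next_event):
    """Pick the cpu whose timer expires first (next_event maps cpu -> time)."""
    return min(next_event, key=next_event.get)


if __name__ == "__main__":
    expiries = [0, 1, 1, 0, 1]
    print("fixed affinity on cpu0:", wakeups_fixed_affinity(expiries, irq_cpu=0))
    print("dynamic affinity:", wakeups_dynamic_affinity(expiries))
    print("next irq target:", next_affinity_cpu({0: 12.5, 1: 3.0}))
```

In this model the saving grows with the share of expirations that target a cpu other than the one holding the irq.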

Investigate whether this approach could be generalized to more peripherals.

Blueprint information

Status: Complete
Approver: Amit Kucheria
Priority: Essential
Drafter: Daniel Lezcano
Direction: Approved
Assignee: Daniel Lezcano
Definition: Approved
Series goal: Accepted for trunk
Implementation: Implemented
Milestone target: 2013.03
Started by: Amit Kucheria on 2013-02-22
Completed by: Daniel Lezcano on 2013-05-06

Whiteboard

Link to the discussion with Thomas Gleixner https://lkml.org/lkml/2013/2/19/555

[daniel-lezcano, Feb 23, 2013] : on u8500, noticed a change in how a process is scheduled. At first glance, without this patch a process runs on a different cpu at each time slice; with this patch the process stays on the same processor (not sure this is bad).

[daniel-lezcano, Feb 27, 2013] : on u8500, ran a test with a simple program doing sleep on each cpu. Measuring with trace-cmd is tricky: don't use trace-cmd record, because the processor running the command will never sleep. Instead, increase the trace buffer size and use: trace-cmd start -ecpu_idle; sleep 10; trace-cmd stop; trace-cmd extract -o trace.dat; trace-cmd report trace.dat. Then run 'idlestat' on the traces:

With dynamic irq affinity:
Log is 10.042298 secs long with 4190 events
cpu0/state0, 24 hits, total 2718.00us, avg 113.25us, min 0.00us, max 854.00us
cpu0/state1, 994 hits, total 9874827.00us, avg 9934.43us, min 30.00us, max 10346.00us
cpu1/state0, 73 hits, total 17001.00us, avg 232.89us, min 0.00us, max 10040.00us
cpu1/state1, 1002 hits, total 9883840.00us, avg 9864.11us, min 0.00us, max 10742.00us
cluster/state0, 0 hits, total 0.00us, avg 0.00us, min 0.00us, max 0.00us
cluster/state1, 1931 hits, total 9762328.00us, avg 5055.58us, min 30.00us, max 9308.00us

Without dynamic irq affinity:
Log is 10.036834 secs long with 6574 events
cpu0/state0, 114 hits, total 20107.00us, avg 176.38us, min 0.00us, max 7233.00us
cpu0/state1, 1951 hits, total 9833836.00us, avg 5040.41us, min 0.00us, max 9217.00us
cpu1/state0, 223 hits, total 21140.00us, avg 94.80us, min 0.00us, max 2960.00us
cpu1/state1, 997 hits, total 9879748.00us, avg 9909.48us, min 0.00us, max 10346.00us
cluster/state0, 5 hits, total 5462.00us, avg 1092.40us, min 580.00us, max 2899.00us
cluster/state1, 2298 hits, total 9740988.00us, avg 4238.90us, min 30.00us, max 9217.00us

Results for the specific test case 'usleep 10000':
 * reduced the number of wake ups on the system by 40%
 * reduced the number of wake ups for CPU0 by 49%
 * increased the average idle period for CPU0 by a factor of two
 * increased package idle hits by 16% and average package idle time by 16%
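
The comparison can be recomputed from the two idlestat reports above. The following is a minimal sketch: the parsing helper and the choice of counters (trace event count, state1 hits, state1 average) are assumptions, since the summary does not say which counters were compared, so the computed figures may differ from the rounded ones.

```python
import re

# idlestat reports copied from the whiteboard above.
WITH_AFFINITY = """\
cpu0/state0, 24 hits, total 2718.00us, avg 113.25us, min 0.00us, max 854.00us
cpu0/state1, 994 hits, total 9874827.00us, avg 9934.43us, min 30.00us, max 10346.00us
cpu1/state0, 73 hits, total 17001.00us, avg 232.89us, min 0.00us, max 10040.00us
cpu1/state1, 1002 hits, total 9883840.00us, avg 9864.11us, min 0.00us, max 10742.00us
cluster/state0, 0 hits, total 0.00us, avg 0.00us, min 0.00us, max 0.00us
cluster/state1, 1931 hits, total 9762328.00us, avg 5055.58us, min 30.00us, max 9308.00us
"""
WITHOUT_AFFINITY = """\
cpu0/state0, 114 hits, total 20107.00us, avg 176.38us, min 0.00us, max 7233.00us
cpu0/state1, 1951 hits, total 9833836.00us, avg 5040.41us, min 0.00us, max 9217.00us
cpu1/state0, 223 hits, total 21140.00us, avg 94.80us, min 0.00us, max 2960.00us
cpu1/state1, 997 hits, total 9879748.00us, avg 9909.48us, min 0.00us, max 10346.00us
cluster/state0, 5 hits, total 5462.00us, avg 1092.40us, min 580.00us, max 2899.00us
cluster/state1, 2298 hits, total 9740988.00us, avg 4238.90us, min 30.00us, max 9217.00us
"""
EVENTS_WITH, EVENTS_WITHOUT = 4190, 6574  # from the "Log is ..." lines

LINE_RE = re.compile(r"(\S+), (\d+) hits, total ([\d.]+)us, avg ([\d.]+)us")


def parse(report):
    """Return {state name: (hits, total_us, avg_us)} for one idlestat report."""
    stats = {}
    for match in LINE_RE.finditer(report):
        name, hits, total, avg = match.groups()
        stats[name] = (int(hits), float(total), float(avg))
    return stats


def pct_change(old, new):
    """Relative change from old to new, in percent (negative is a reduction)."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    w, wo = parse(WITH_AFFINITY), parse(WITHOUT_AFFINITY)
    print("trace events: %+.1f%%" % pct_change(EVENTS_WITHOUT, EVENTS_WITH))
    for name in ("cpu0/state1", "cluster/state1"):
        print("%s hits: %+.1f%%" % (name, pct_change(wo[name][0], w[name][0])))
        print("%s avg:  x%.2f" % (name, w[name][2] / wo[name][2]))
```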

Work Items

Work items for 2013.02:
[daniel-lezcano] : send RFC patchset upstream and collect feedback : DONE
[daniel-lezcano] : fix irq initialization in nomadik-mtu driver : DONE
[daniel-lezcano] : investigate the benefit in terms of wakeup : DONE
[daniel-lezcano] : investigate the impact on the scheduling of the processes : DONE
[daniel-lezcano] : investigate dynamic irq affinity vs /proc/irq/<N> : DONE

Work items for 2013.03:
[daniel-lezcano] : upstream the dynamic irq affinity core : DONE
[daniel-lezcano] : upstream the local timer flag for u8500 : DONE
[daniel-lezcano] : upstream the local timer flag for TC2 : DONE
