Performance inside GCC

Registered by Michael Hope on 2010-10-18

This session covers non-NEON performance improvements inside GCC. It includes current investigations by Linaro, areas that other groups are working on, and potential long-term topics that could help performance on ARM.

Blueprint information

Status: Not started
Approver: None
Priority: Undefined
Drafter: Yao Qi
Direction: Needs approval
Assignee: None
Definition: New
Series goal: None
Implementation: Unknown
Milestone target: None

Related branches

Sprints

Whiteboard

[This is still in progress; I'll update it over the next few days.]

1. Tune instruction scheduling for ARM
GCC has various options related to instruction scheduling, but how much can ARM benefit from them?
 - Swing Modulo Scheduling: Turning on -fmodulo-sched loses speed (-2% on EEMBC, FSF GCC r165607). Why? Is it easy to improve for ARM?
 - Selective scheduling:
   * -fselective-scheduling gives some speed gain, but -fselective-scheduling2 loses speed. Why does -fselective-scheduling2 hurt? How can it be improved?
   * -fsel-sched-pipelining improves speed slightly. Can we make it better on ARM?
 - sched-pressure:
    * "-funroll-loops -fsched-pressure" (+11.3%) is only marginally better than "-funroll-loops" (+11.2%). Can we improve "-fsched-pressure" further on ARM?
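
A minimal sketch of how the flag combinations above could be compared against a common baseline. The -O2 baseline, the helper names (command_line, speed_change_percent) and the output name "bench" are assumptions made for illustration; the notes do not say how the EEMBC numbers were produced.

```python
# Flag sets taken from the whiteboard notes above.
# Assumption: the baseline optimisation level is -O2 (the notes do not state it).
BASE_FLAGS = ["-O2"]

FLAG_SETS = {
    "modulo-sched": ["-fmodulo-sched"],
    "selective-scheduling": ["-fselective-scheduling"],
    "selective-scheduling2": ["-fselective-scheduling2"],
    # -fsel-sched-pipelining only has an effect together with one of the
    # selective scheduling options, so it is paired with the first one here.
    "sel-sched-pipelining": ["-fselective-scheduling", "-fsel-sched-pipelining"],
    "unroll-loops": ["-funroll-loops"],
    "unroll-loops+sched-pressure": ["-funroll-loops", "-fsched-pressure"],
}


def command_line(source, extra_flags, compiler="gcc"):
    """Build a compiler invocation for one flag set (hypothetical helper)."""
    return [compiler, *BASE_FLAGS, *extra_flags, source, "-o", "bench"]


def speed_change_percent(baseline_seconds, variant_seconds):
    """Speed change of a variant relative to the baseline, in percent.

    Positive means the variant is faster, negative means it is slower.
    Speed is taken as the inverse of the run time.
    """
    if baseline_seconds <= 0 or variant_seconds <= 0:
        raise ValueError("run times must be positive")
    return (baseline_seconds / variant_seconds - 1.0) * 100.0
```

With this convention, a result such as "-2% on EEMBC" corresponds to a negative return value from speed_change_percent. For a cross build the compiler argument would be replaced with the appropriate cross compiler.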

2. Avoid speed regressions
 - Continuous speed evaluation: measuring speed on every commit may be impractical, but we could check performance numbers weekly. Can the Linaro infrastructure team help with this?
 - Avoid big changes in one commit: small commits make it easier to identify the cause of a speed regression.
 - When we merge from upstream or apply patches from elsewhere, what should we do if we find a speed regression? For example, after merging from FSF 4.5.1, Linaro GCC 4.5 has speed regressions on some EEMBC cases. Here are some options for handling it:
     * Assign someone to look into the speed regression,
     * Open a ticket to track the speed regression,

(?)
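
The weekly check suggested in section 2 could be sketched roughly as follows: compare this week's benchmark scores with last week's and report anything that dropped by more than a threshold. The function name find_regressions, the 1% default threshold, and the "higher score is better" convention are assumptions for illustration, not an existing Linaro tool.

```python
def find_regressions(previous, current, threshold_percent=1.0):
    """Return {benchmark: percent_change} for scores that dropped too much.

    previous and current map benchmark names to scores, where a higher
    score means faster (assumption). Benchmarks missing from the current
    run, or with a non-positive previous score, are skipped.
    """
    regressions = {}
    for name, old_score in previous.items():
        new_score = current.get(name)
        if new_score is None or old_score <= 0:
            continue
        change = (new_score - old_score) / old_score * 100.0
        if change < -threshold_percent:
            regressions[name] = change
    return regressions
```

Each reported benchmark could then be handled with one of the options listed above, for example by opening a ticket that records the benchmark name and the percentage drop.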

Work Items