Linaro GCC

Current documentation for “Linaro GCC”

To classify a blueprint as documentation, set the Implementation status to “Informational”. When the blueprint's Definition status is marked “Approved”, it will appear in this listing.

Add support for GProf to AArch64 backend of GCC for Linaro GCC Add support for GProf to AArch64 backend of GCC
AArch64 bootstrap for Linaro GCC This blueprint is to cover the initial post-bootstrap work for AArch64 in GCC. Proposed topics include: * IFUNC * Stack protection support * GProf support * Boehm GC See the dependent blueprints below for more.
Better end of loop counter optimisation for Linaro GCC GCC can calculate the final value of a loop counter and use that in later optimisations. The value is calculated in the original signed type which, due to overflow rules, can lead to a strange looking value which is not supported and reduced by later optimisations. Improve either through reducing the calculated va...
Cortex-A15 Theme for Linaro GCC Meta Blueprint covering work scheduled for the Cortex-A15 Focus Iteration
Disable peeling for Linaro GCC Peeling aligns memory pointers at run time both to give a performance boost where the CPU is faster with aligned accesses, and to support CPUs that can only do aligned vector accesses. ARM supports unaligned vector access for no penalty over aligned and a small penalty over aligned-with-alignment hints. As a first...
Investigate LRA in GCC for ARM for Linaro GCC LRA is a proposed replacement to reload in GCC. We should investigate whether it actually provides a benefit to x86 and x86_64, and if it does then work out what benefit turning it on will give to ARM, and the steps needed to turn it on.
Improve generation of conditional execution instructions for Linaro GCC Improve conditional execution code generation (cond-exec) especially for store-flag sequences. This is in the cases that we compare values with 0 or with other registers. It might be possible to improve these in certain cases to avoid conditional instructions and replace them with equivalent arithmetic instructions....
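As a rough illustration of the kind of rewrite this entry describes, the C sketch below computes a store-flag result for a comparison with 0 using plain arithmetic instead of a conditional. The function names are hypothetical and nothing here is taken from the blueprint's patches; it only shows that such an equivalence exists, assuming 32-bit unsigned values.

```c
#include <stdint.h>

/* Store-flag written the obvious way: yields 1 when x is non-zero. */
uint32_t flag_nonzero_cond(uint32_t x)
{
    return x != 0 ? 1u : 0u;
}

/* Same result using only arithmetic: for any non-zero x, either x or
 * its two's-complement negation (0 - x) has bit 31 set, so OR-ing the
 * two and shifting bit 31 down gives the flag. */
uint32_t flag_nonzero_arith(uint32_t x)
{
    return (x | (uint32_t)(0u - x)) >> 31;
}
```

Whether the arithmetic form is actually cheaper than a conditional sequence depends on the target and the surrounding code, which is what the blueprint proposes to investigate.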
Check common programs for areas the vectoriser could improve for Linaro GCC There are a few libraries and programs out there that have hand-written NEON chunks. Look into these and see if the vectoriser can replace them or what needs to be added.
Fix any NEON vs core regressions for Linaro GCC We tell people to enable -mfpu=neon by default. In some cases this code runs slower than the non-NEON code. Investigate and fix.
Improve SMS on code with memory dependencies for Linaro GCC Investigate whether SMS can use the rtl loop infrastructure to avoid unnecessary memory dependencies.
Track and investigate performance regression areas for GCC for Linaro GCC We would like to be able to track performance regressions along certain parameters in terms of GCC for Cortex A9. When run on a Cortex A9, the following should be true: * A9 vs A8: code tuned with -mtune=cortex-a9 should run faster than the same code tuned with -mtune=cortex-a8 * ARMv7 vs ARMv5: code built with -...
64 bit divide by constant for Linaro GCC GCC can convert a 32 bit divide by constant into the corresponding multiplies and shifts. Implement the same for 64 bit values. PENDING: we found this but michaelh1 can't remember where.
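The multiply-and-shift technique mentioned in this entry can be sketched in C for the divisor 10. This is only an illustration under stated assumptions: the function names are hypothetical, the divisor is fixed, the values are unsigned, and the code is written by hand rather than being what GCC emits. The 64-bit case needs the high half of a 64x64-bit product, which is built here from 32-bit pieces.

```c
#include <stdint.h>

/* 32-bit case: x / 10 == (x * ceil(2^35 / 10)) >> 35 for all uint32_t x. */
uint32_t div10_u32(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* High 64 bits of the 128-bit product a * b, using 32-bit halves. */
uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum of the middle terms; this cannot overflow 64 bits. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* 64-bit case: x / 10 == (x * ceil(2^67 / 10)) >> 67 for all uint64_t x.
 * Taking the high 64 bits accounts for 64 of the 67 shift positions. */
uint64_t div10_u64(uint64_t x)
{
    return mulhi_u64(x, 0xCCCCCCCCCCCCCCCDull) >> 3;
}
```

For example, `div10_u64(100)` yields 10. Other divisors need their own magic constant and shift, and some need an extra correction step that this sketch does not cover.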
AArch64 GCC support for Stack Protection for Linaro GCC Add support to libssp for AArch64
Backport conditional execution work for Linaro GCC Backport any conditional execution work done by ARM into GCC 4.5
Detect smin / umin idiom for Linaro GCC Detect and optimise idioms like: #define min(x, y) ((x) <= (y)) ? (x) : (y) unsigned int foo (unsigned int i, unsigned int x ,unsigned int y) { return i < (min (x, y)); } int bar (int i, int x, int y) { return i < (min (x, y)); } See https://code.launchpad.net/~ramana/gcc-linaro/47-smin-umin-idiom/+merge/10...
Equivalent opposite condition detection for Linaro GCC In some conditions the compiler generates a pair of conditional stores with the opposite condition codes. These could be folded into one unconditional store. Seen in libav in vp8.e
Fix EPILOGUE_USES regression in CoreMark for Linaro GCC CoreMark regresses in Thumb-2 mode when using the LR regnum. Otherwise it's a good improvement. Investigate and fix. The idea is to use LR register as a general purpose register fit for use in a number of cases. The change proposed was the one upstream http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg05706.h...
Generic tuning for all Cortex-A devices for Linaro GCC The TSC have asked for a GCC optimisation option that gives code that performs well across all of the Cortex-A series. Discuss what that means, how to balance across the targets, how to present it to the user, approach for implementing, and how to qualify the results. Is it worth discussing adding -mcpu=native at th...
Hot/cold partitioning in PGO for Linaro GCC Enable hot/cold partitioning when doing a profile guided optimisation build. One feature of PGO is to see what code is hot and what is cold and then split this into different sections. This is difficult on ARM due to the constant pools. Implement.
Improve the auto increment/decrement pass for Linaro GCC Improve GCC's auto increment/decrement pass, with particular emphasis on NEON loads and stores. Status: - Requires a change to the Cortex A9 scheduling description that can only really be provided by ARM employees. - Current patches posted here: http://lists.linaro.org/pipermail/linaro-toolchain/2011-Decembe...
Improve constant pool support for Linaro GCC Investigate and improve the current constant pool generation and code. The current constant pool placement code doesn't take profile info into consideration or whether something gets placed in the innermost kernel of a loop etc. There are a few cases when it does get placed in the middle of the innermost loop in...
Improve IV opts #1 for Linaro GCC Finish the already upstream induction variable opts patch by backporting it.
Improve the register choice in the allocator for Linaro GCC Investigate the register allocator with respect to choice of Thumb1 vs Thumb2 instructions as discussed in the TSC commentary and write a proposal of what can be done (1MM).
Improve the Neon max and min intrinsics. for Linaro GCC The Neon max and min intrinsics could be represented as actual RTL (s/umax and s/umin ) for all types except for polynomial types rather than the current unspec form. In general they could also end up being folded into the GIMPLE form for max / min if possible.
Investigate -funroll-loops and -fvariable-expansion-in-unroller for Linaro GCC Investigate performance of -funroll-loops, alone and in combination with -fvariable-expansion-in-unroller. Potentially tweak default parameters and/or implementation, also taking into account similar changes in the CodeSourcery toolchain. If it proves useful, work on enabling unrolling by default upstream.
Prefer movw movt over literal pools where possible. for Linaro GCC Look at https://bugs.launchpad.net/gcc-linaro/+bug/886124 for more information. There are a number of places where this shows up as an issue in the headroom analysis that was done with respect to benchmarks.
NEON instruction coverage for Linaro GCC Check the coverage of the NEON instruction set by the vectoriser and backend, including different operand types such as registers and constants. There are two parts to vectorisation - detecting the patterns in the original code, and using those patterns in the backend. We know all of the operations that NEON imple...
Turn on Shrink Wrapping for Linaro 47 and upstream 48 for Linaro GCC Shrink-wrapping is a feature that was enabled upstream in GCC 4.7 but not for the ARM port. This requires at the minimum the definition of a new backend pattern "simple_return" to enable this feature to work. However this requires co-ordination with the way in which the epilogue is being generated which is being rew...
Add multiply pipeline bypass for Linaro GCC The A9 NEON pipeline has a bypass which can make the result of a multiply (or MLA?) quickly available to a following MLA. Describe this in the pipeline so that SMS can use it. PENDING: Michael can't find the bypass in the A9 NEON TRM and doesn't know the right term for it.
<arm_neon.h>/intrinsics improvements for Linaro GCC Check for any outstanding upstream enhancement requests and implement. Please fill out the summary when the work is started.
Improve CRC16 for Linaro GCC A simple bitwise CRC16 like used in a popular embedded benchmark has a range of possible improvements we can do in the middle end and backend. See the sandbox page at: https://wiki.linaro.org/MichaelHope/Sandbox/CRC16 for more. The initial steps are to do a hand written version to see the optimum and investigate...
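For readers unfamiliar with the shape of code this entry refers to, a bit-at-a-time CRC16 might look like the C sketch below. The polynomial (reflected 0xA001, as in CRC-16/ARC), the zero initial value, and the function names are assumptions chosen for illustration; the benchmark's own routine may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold one byte into the CRC, one bit per loop iteration.
 * Assumed parameters: reflected polynomial 0xA001 (CRC-16/ARC). */
uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++) {
        if (crc & 1u)
            crc = (uint16_t)((crc >> 1) ^ 0xA001u);
        else
            crc = (uint16_t)(crc >> 1);
    }
    return crc;
}

/* CRC of a whole buffer, starting from the assumed initial value 0. */
uint16_t crc16_buffer(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++)
        crc = crc16_update(crc, data[i]);
    return crc;
}
```

The data-dependent branch inside the inner loop is the sort of construct that middle-end and backend improvements could plausibly target, for example by turning it into a mask-and-XOR sequence.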
Improve vectoriser narrowing operations for Linaro GCC Investigate how effectively GCC uses the NEON narrowing arithmetic instructions. Implement, upstream, and backport.
Room for a private call for Linaro GCC Room booking for a private call.
ARMv5 saturating add/subtract support for Linaro GCC Add support for the ARMv5 saturated math operations
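The semantics of a signed 32-bit saturating add and subtract can be sketched in portable C as below. This is an illustration of the arithmetic only: the function names are hypothetical, and it is an assumption that source patterns like these are what the compiler would be expected to recognise and map onto the saturating instructions (such as QADD and QSUB).

```c
#include <stdint.h>

/* Clamp a 64-bit intermediate result to the int32_t range. */
static int32_t clamp_to_i32(int64_t value)
{
    if (value > INT32_MAX)
        return INT32_MAX;
    if (value < INT32_MIN)
        return INT32_MIN;
    return (int32_t)value;
}

/* Saturating add: the result sticks at INT32_MAX or INT32_MIN
 * instead of wrapping around on overflow. */
int32_t sat_add32(int32_t a, int32_t b)
{
    return clamp_to_i32((int64_t)a + (int64_t)b);
}

/* Saturating subtract with the same clamping behaviour. */
int32_t sat_sub32(int32_t a, int32_t b)
{
    return clamp_to_i32((int64_t)a - (int64_t)b);
}
```

Widening to 64 bits keeps the sketch free of signed-overflow undefined behaviour.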
Add ARMv6 SIMD support for Linaro GCC Add GCC support for the short-vector SIMD instructions that work on core registers.
Improve block memory operations by GCC for Linaro GCC This blueprint is for investigating and improving block memory operations like memset and memclr generated by GCC and for backporting improvements done for unaligned access from upstream.
Transform statics to locals for Linaro GCC A popular embedded benchmark uses a lot of function level static variables that can be transformed into local variables. The speed up is very significant. Our 4.4 had a -fremove-local-statics option that did this. The original discussion is here: http://lists.linaro.org/pipermail/linaro-toolchain/2010-July/00005...
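The kind of function this entry has in mind can be sketched in C as follows. Both functions and their names are hypothetical. The sketch assumes the static variables are always written before they are read on every call and that their addresses never escape, which is what makes replacing them with locals safe; it does not describe how -fremove-local-statics was implemented.

```c
/* Before: function-level statics used purely as scratch storage.
 * Each access may go through memory rather than a register. */
int sum_squares_static(const int *values, int count)
{
    static int total;
    static int index;

    total = 0;                      /* always written before it is read */
    for (index = 0; index < count; index++)
        total += values[index] * values[index];
    return total;
}

/* After: the same computation with ordinary locals, which the
 * register allocator is free to keep in registers. */
int sum_squares_local(const int *values, int count)
{
    int total = 0;

    for (int index = 0; index < count; index++)
        total += values[index] * values[index];
    return total;
}
```

The two versions return the same result for the same input; the difference lies only in where the intermediate values live.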

37 blueprint(s) listed.