DirectFB NEON Optimization

Registered by Ilias Biris

DirectFB home page:
DirectFB wiki:
DirectFB is a complete hardware abstraction layer with software fallbacks for every graphics operation that the underlying hardware does not support. Many 2D operations (such as blending, color format conversion and blitting) could be accelerated with NEON. Because GPU acceleration showed little improvement, this blueprint tracks NEON optimization only.

Blueprint information

Kurt Taylor
Kui Zheng
Kui Zheng
Series goal:
Accepted for 2011.11
Milestone target:
2011.11
Started by
Kui Zheng
Completed by
Kui Zheng


The optimization targets the rgb16 and argb pixel formats.
It would benefit the following 2D operations: Fill Rectangles (blend)/Triangles (blend)/Spans (blend), Blit, Blit with format conversion, Blit with colorizing, Blit from 32bit (blend), Blit from 32bit (blend) with colorizing, Blit from 8bit palette, Blit from 8bit palette (blend), Blit with mask, Blit180, Blit colorkeyed, Blit destination colorkeyed, Stretch Blit, Stretch Blit colorkeyed, Stretch Blit index.
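
To make the two target formats concrete, here is a minimal scalar sketch in C of the per-pixel work that a routine such as Bop_argb_blend_alphachannel_src_invsrc_Aop_rgb16 has to do: unpack rgb16 (assumed here to be RGB565), blend an argb source over it using the source alpha, and pack the result again. The function names are hypothetical and are not taken from the DirectFB source; the rounding is one reasonable choice, not necessarily the one DirectFB uses. A NEON version would process several pixels per iteration with the same arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand one RGB565 pixel to ARGB8888 with opaque alpha (hypothetical helper). */
static uint32_t rgb16_to_argb(uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1F;
    uint32_t g = (p >> 5) & 0x3F;
    uint32_t b = p & 0x1F;

    /* Replicate the high bits into the low bits so that full scale maps to 0xFF. */
    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);

    return 0xFF000000u | (r << 16) | (g << 8) | b;
}

/* Pack ARGB8888 into RGB565 by truncating each channel; alpha is dropped. */
static uint16_t argb_to_rgb16(uint32_t p)
{
    return (uint16_t)(((p >> 8) & 0xF800) | ((p >> 5) & 0x07E0) | ((p >> 3) & 0x001F));
}

/* Blend n argb source pixels over an rgb16 destination:
 * dst = src * alpha + dst * (255 - alpha), per channel, with rounding. */
static void blend_argb_over_rgb16(uint16_t *dst, const uint32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < n; i++) {
        uint32_t s = src[i];
        uint32_t a = s >> 24;
        uint32_t d = rgb16_to_argb(dst[i]);
        uint32_t out = 0;

        /* Blue, green and red live at bit offsets 0, 8 and 16. */
        for (int shift = 0; shift <= 16; shift += 8) {
            uint32_t sc = (s >> shift) & 0xFF;
            uint32_t dc = (d >> shift) & 0xFF;
            uint32_t c = (sc * a + dc * (255 - a) + 127) / 255;
            out |= c << shift;
        }
        dst[i] = argb_to_rgb16(out);
    }
}
```

A scalar reference like this is also useful as the expected result when checking a NEON routine pixel by pixel.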

DirectFB accelerated by the GPU (2D functions accelerated by GLES2)
>> No performance improvement, and even a decrease in many operations.

Combine NEON with a 3D solution and balance the workload between CPU and GPU. This needs research into ARM CPU/GPU synergetic parallel computing.
>> It is better to track this in other blueprints.

1) DirectFB NEON optimization
[JesseBarker] From a technical perspective, though, as I say, it would really be worthwhile seeing if pixman could be integrated into DirectFB (similarly to how it is used to implement the basis for most X11 drivers).
[JesseBarker] Certainly, it's an excellent base of NEON-based fill and composite routines. If you want to see how it is used, see libfb within the Xserver (let me know if you need pointers).
2) DirectFB with 3D acceleration
[JesseBarker] I think you would want to measure the results carefully. It is possible that you would get good results in some cases and poor results in others. There are many factors (how large and how many primitives are part of a given drawing request, whether the 3D library is trying to use the GPU at the same time, etc.).
[JesseBarker] I would suggest that you address the hot spots that you've found one at a time and run as many test cases against each solution as you can (switching between NEON routines and GPU-enabled routines).
[JesseBarker] There are two keys here: first, will the GPU render the 2D primitives efficiently at all; second, will it do so when there is also 3D rendering going on at the same time.
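
The measurement approach suggested above (run the same test case against each candidate routine) can be sketched as a small timing harness in C. Everything here is an assumption made for illustration: the fill16_fn signature, the fill16_c reference routine and time_fill16 are hypothetical names, not DirectFB APIs. Wall-clock time is used rather than CPU time so that a GPU-backed routine, which may spend its time waiting on the device, would be measured fairly.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() under strict C modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical common signature for every implementation under test. */
typedef void (*fill16_fn)(uint16_t *dst, uint16_t color, size_t n);

/* Plain C reference: fill n 16-bit pixels with one color. */
static void fill16_c(uint16_t *dst, uint16_t color, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = color;
}

/* Return the average wall-clock seconds per call of fn over `iterations` runs. */
static double time_fill16(fill16_fn fn, uint16_t *dst, size_t n, int iterations)
{
    struct timespec start, end;

    assert(fn != NULL && dst != NULL && iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++)
        fn(dst, (uint16_t)i, n);   /* vary the color so each run writes new data */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (double)(end.tv_sec - start.tv_sec)
                   + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return elapsed / iterations;
}
```

Calling time_fill16 once per implementation, across several buffer sizes and with and without concurrent 3D load, would give the kind of side-by-side numbers the comments ask for.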


Work Items

Work items:
Add configure flag to enable NEON: DONE
Auto-detect whether to enable NEON: DONE
Runtime NEON detection: DONE
Sop_rgb16_to_Dacc_NEON: DONE
Sop_argb_to_Dacc_NEON: DONE
Sacc_to_Aop_rgb16_NEON: DONE
Cop_to_Aop_16_NEON: DONE
SCacc_add_to_Dacc_NEON: DONE
Sacc_add_to_Dacc_NEON: DONE
Xacc_blend_invsrcalpha_NEON: DONE
Xacc_blend_srcalpha_NEON: DONE
Dacc_modulate_argb_NEON: DONE
Dacc_modulate_rgb_NEON: DONE
Bop_argb_blend_alphachannel_src_invsrc_Aop_rgb16_NEON: DONE
Submit upstream: TODO
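
The first three work items (configure flag, auto-detection, runtime detection) fit together roughly as in the sketch below. This is an illustration under stated assumptions, not the code from this blueprint: cpu_has_neon, fill16_c, fill16_neon and select_fill16 are hypothetical names, the compile-time switch is approximated by the compiler's __ARM_NEON macro, and the runtime check uses glibc's getauxval() with HWCAP_NEON on 32-bit ARM Linux. The fill routine stands in for a simple case such as Cop_to_Aop_16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>    /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>   /* HWCAP_NEON */
#endif

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

typedef void (*fill16_fn)(uint16_t *dst, uint16_t color, size_t n);

/* Runtime check: does the CPU we are running on report NEON support? */
static int cpu_has_neon(void)
{
#if defined(__aarch64__)
    return 1;   /* assumption: Advanced SIMD is present on AArch64 builds */
#elif defined(__linux__) && defined(__arm__) && defined(HWCAP_NEON)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0;   /* not ARM, or no known way to ask: use the C fallback */
#endif
}

/* Portable software fallback. */
static void fill16_c(uint16_t *dst, uint16_t color, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = color;
}

#if defined(__ARM_NEON)
/* NEON path: store eight 16-bit pixels per iteration, then finish the tail. */
static void fill16_neon(uint16_t *dst, uint16_t color, size_t n)
{
    uint16x8_t v = vdupq_n_u16(color);
    size_t i = 0;

    for (; i + 8 <= n; i += 8)
        vst1q_u16(dst + i, v);
    for (; i < n; i++)
        dst[i] = color;
}
#endif

/* Pick the implementation once, at startup. */
static fill16_fn select_fill16(void)
{
#if defined(__ARM_NEON)
    if (cpu_has_neon())
        return fill16_neon;
#endif
    return fill16_c;
}
```

Selecting through a function pointer keeps a single binary usable on cores without NEON, since the NEON routine is compiled in but only called when the runtime check succeeds.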
