Improve the Neon vector permute intrinsics.

Registered by Ramana Radhakrishnan on 2012-05-26

The Neon vector permute intrinsics, namely

vtrn
vtbl
vtbx
vrev32/64/16
vzip
vuzp
vext

can be implemented using the generic __builtin_shuffle extension mechanism. Doing so should also help make sure that all tests in this area work properly.
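As a rough illustration of the idea, the sketch below expresses three of these permutes with __builtin_shuffle on a generic GCC vector type, so it does not need arm_neon.h or an ARM target. The type v4si and the helpers rev64_s32, zip_lo_s32, ext1_s32 and check_permutes are names made up for this example; they are not the definitions used in GCC's arm_neon.h, and the lane masks simply follow the documented behaviour of vrev64q_s32, vzipq_s32 (first result) and vextq_s32 with an offset of 1. It assumes a GCC release that provides __builtin_shuffle (4.7 or later).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector of four 32-bit lanes. */
typedef int32_t v4si __attribute__((vector_size(16)));

/* Like vrev64q_s32: swap the two 32-bit lanes inside each 64-bit half. */
static v4si rev64_s32(v4si a)
{
    const v4si mask = {1, 0, 3, 2};
    return __builtin_shuffle(a, mask);
}

/* Like the first result of vzipq_s32: interleave the low halves of a and b.
   With two inputs, mask indices 0..3 select from a and 4..7 select from b. */
static v4si zip_lo_s32(v4si a, v4si b)
{
    const v4si mask = {0, 4, 1, 5};
    return __builtin_shuffle(a, b, mask);
}

/* Like vextq_s32(a, b, 1): lanes 1..3 of a followed by lane 0 of b. */
static v4si ext1_s32(v4si a, v4si b)
{
    const v4si mask = {1, 2, 3, 4};
    return __builtin_shuffle(a, b, mask);
}

/* Small self-check; aborts through assert if a lane is not where expected. */
static void check_permutes(void)
{
    const v4si a = {10, 11, 12, 13};
    const v4si b = {20, 21, 22, 23};

    v4si r = rev64_s32(a);
    assert(r[0] == 11 && r[1] == 10 && r[2] == 13 && r[3] == 12);

    v4si z = zip_lo_s32(a, b);
    assert(z[0] == 10 && z[1] == 20 && z[2] == 11 && z[3] == 21);

    v4si e = ext1_s32(a, b);
    assert(e[0] == 11 && e[1] == 12 && e[2] == 13 && e[3] == 20);
}
```

Calling check_permutes() from a main function exercises all three helpers. Because each permute is a plain shuffle with a constant mask rather than an opaque builtin call, the compiler is free to treat it as an ordinary vector permute.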

This is something that came up only in 4.7 and should have a reasonably high priority. It can improve our vectorization numbers in quite a few cases, and it can also improve quite a few cases in the intrinsics, because we now have a way of telling the vectorizer and the generic optimizers that this really isn't a builtin call but just a vector permute operation :)

Blueprint information

Status:
Complete
Approver:
Michael Hope
Priority:
Medium
Drafter:
Ramana Radhakrishnan
Direction:
Approved
Assignee:
Ramana Radhakrishnan
Definition:
Approved
Series goal:
Accepted for 4.7
Implementation:
Implemented
Milestone target:
None
Started by
Ramana Radhakrishnan on 2012-06-22
Completed by
Matthew Gretton-Dann on 2013-05-22

Whiteboard

[2013-05-22] This has been completed and is in FSF GCC 4.8


Work Items

Work items:
* Improve vrev64/32/16, vzip, vuzp, vtrn using __builtin_shuffle. Commit upstream: DONE
* Implement support for __builtin_shuffle in the C++ frontend : DONE
* Implement support for constexpr and __builtin_shuffle in C++ frontend : DONE
* Improve support for vdup intrinsics in the backend : INPROGRESS
* Improve neon intrinsics tests : INPROGRESS
* Improve support for vld3 intrinsics in the backend - Do not move values to and from the stack gratuitously. Done by tweaking costs so that lower-subreg lowers only the values you want : INPROGRESS
* Validate improvements obtained : TODO
* Backport to Linaro 4.7: TODO
