I want to hand-optimize some code with NEON, using inline NEON assembly inside C++ functions. The target is an ARM Cortex-A9 (i.MX6Q).
When choosing the right compiler flags, I got a bit confused by -mfpu. My goal is to use the hardware FPU for floating-point operations and to use NEON only from ASM code.
1) Is it safe to assume that with -mfpu=vfpv3, the NEON coprocessor is still accessible through ASM NEON instructions?
2) With -mfpu=neon-fp16, will the FPU core be unused?
3) Will the FPU outperform NEON for non-vectorized floating-point operations?
Best How To :
1) No. GCC passes the -mfpu value to the assembler, and the assembler will refuse to assemble your code, whether you are using inline asm or separate assembler files. For example, with a foo.s that contains only this NEON instruction:
vmov q1, q2
gcc foo.s -c -mfpu=vfpv3
foo.s: Assembler messages:
foo.s:1: Error: selected FPU does not support instruction -- `vmov q1,q2'
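For comparison, here is a minimal sketch of inline NEON asm inside a C++ function that should assemble once -mfpu names a NEON-capable FPU. The function name add4, the file name add4.cpp, and the toolchain prefix in the comment are assumptions made for illustration, not something from the question. The asm path is guarded by predefined macros so that the same file still builds (with a scalar fallback) when NEON is not enabled.

```cpp
#include <cstddef>

// Hypothetical helper: adds four floats lane by lane.
// Assumed build line for the NEON path (toolchain prefix is an assumption):
//   arm-linux-gnueabihf-g++ -O2 -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=hard -c add4.cpp
void add4(const float* a, const float* b, float* out) {
#if defined(__arm__) && (defined(__ARM_NEON) || defined(__ARM_NEON__))
    // 32-bit ARM with NEON enabled: load, add and store one 128-bit vector.
    asm volatile(
        "vld1.32  {q0}, [%[a]]   \n\t"  // q0 = a[0..3]
        "vld1.32  {q1}, [%[b]]   \n\t"  // q1 = b[0..3]
        "vadd.f32 q0, q0, q1     \n\t"  // q0 = q0 + q1 (four lanes at once)
        "vst1.32  {q0}, [%[out]] \n\t"  // out[0..3] = q0
        :
        : [a] "r"(a), [b] "r"(b), [out] "r"(out)
        : "q0", "q1", "memory");
#else
    // NEON not enabled for this build: plain scalar fallback.
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

Building the same file with -mfpu=vfpv3 would take the fallback branch, because the compiler does not define the NEON macros in that configuration; removing the guard would instead reproduce the assembler error shown above.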
2) No. -mfpu=neon-fp16 in GCC also enables the use of instructions from the VFPv3 instruction set, so scalar floating-point code can still use the VFP.
3) I am not sure what you mean by this question. The scalar versions of the floating-point instructions are in the various revisions of the VFP instruction set, and the vector versions are in the NEON (Advanced SIMD) instruction set.
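If it is unclear which of these instruction sets a given -mfpu / -mfloat-abi combination actually turned on, one option is to ask the compiler. The sketch below assumes a compiler that defines the ACLE predefined macros (__ARM_NEON, __ARM_FP, __ARM_PCS_VFP); the function name fp_config is hypothetical. On a non-ARM host every line simply reports "no".

```cpp
#include <string>

// Hypothetical helper: summarizes the floating-point features that the
// compiler enabled for this translation unit, based on ACLE macros.
std::string fp_config() {
    std::string s;

    // Defined when NEON (Advanced SIMD) instructions are available.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    s += "NEON (Advanced SIMD): yes\n";
#else
    s += "NEON (Advanced SIMD): no\n";
#endif

    // Defined when hardware floating-point (VFP) instructions are available.
#if defined(__ARM_FP)
    s += "hardware FP (VFP): yes\n";
#else
    s += "hardware FP (VFP): no\n";
#endif

    // Defined when the hard-float calling convention is in use
    // (for GCC, -mfloat-abi=hard).
#if defined(__ARM_PCS_VFP)
    s += "hard-float ABI: yes\n";
#else
    s += "hard-float ABI: no\n";
#endif

    return s;
}
```

Printing the returned string from a build that uses the flags under test shows at a glance whether both the VFP and NEON sets are available to the compiler and assembler.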