Fix NEON sqrt for 32-bit, add prsqrt.

With !406, we accidentally broke 32-bit Arm NEON builds, since
`vsqrt_f32` is only available on 64-bit.

Here we add back the `rsqrt` implementation for 32-bit, relying
on a `prsqrt` implementation with better handling of edge cases.
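
As a rough scalar illustration of the idea (not the NEON code in this commit), a square root can be built on a reciprocal square root estimate refined by Newton-Raphson steps, with zero and infinity handled explicitly because `x * rsqrt(x)` would otherwise give NaN for them. The names `rsqrt_sketch` and `sqrt_sketch` are hypothetical, and the bit-trick initial guess merely stands in for a hardware estimate instruction:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical scalar sketch: reciprocal square root with explicit edge cases.
// Only intended for normal floats; denormal inputs are not handled accurately.
inline float rsqrt_sketch(float x) {
  if (std::isnan(x)) return x;
  if (x < 0.0f) return std::numeric_limits<float>::quiet_NaN();
  if (x == 0.0f) return std::numeric_limits<float>::infinity();  // zero maps to +inf here
  if (std::isinf(x)) return 0.0f;

  // Crude initial guess (well-known bit trick), standing in for a
  // low-precision hardware estimate.
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  bits = 0x5f3759dfu - (bits >> 1);
  float y;
  std::memcpy(&y, &bits, sizeof(y));

  // Newton-Raphson refinement: y <- y * (1.5 - 0.5 * x * y * y).
  for (int i = 0; i < 3; ++i) {
    y = y * (1.5f - 0.5f * x * y * y);
  }
  return y;
}

// sqrt(x) = x * rsqrt(x), except where that product would be 0 * inf.
inline float sqrt_sketch(float x) {
  if (x == 0.0f) return x;                // keeps the sign of zero
  if (std::isinf(x) && x > 0.0f) return x;
  return x * rsqrt_sketch(x);             // negative or NaN input yields NaN
}
```

Without the explicit checks in `sqrt_sketch`, an input of 0 would produce `0 * inf` and an input of +inf would produce `inf * 0`, both NaN.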

Note that several of the 32-bit NEON packet tests are currently
failing - either due to denormal handling (NEON versions flush
to zero, but scalar paths don't) or due to accuracy (e.g. sin/cos).
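
To make the denormal mismatch concrete, here is a small scalar sketch, assuming a hypothetical helper `flush_denormal_sketch` that imitates flush-to-zero behaviour. It is not Eigen code; it only shows why a flushing path and a non-flushing scalar reference can disagree on the same input:

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of flush-to-zero: subnormal values become a signed zero,
// everything else passes through unchanged.
inline float flush_denormal_sketch(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL) {
    return std::copysign(0.0f, x);
  }
  return x;
}

// Returns true when the flushed value differs from the unflushed one,
// i.e. when a flushing path and a scalar reference would disagree.
inline bool paths_disagree_sketch(float x) {
  return flush_denormal_sketch(x) != x;
}
```

For the smallest positive subnormal, `std::numeric_limits<float>::denorm_min()`, the flushed value is zero while the scalar value is not, so a test comparing the two exactly would fail.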
Antonio Sanchez
2021-02-26 13:59:46 -08:00
parent fe19714f80
commit 29ebd84cb7
3 changed files with 51 additions and 2 deletions

@@ -684,7 +684,7 @@ Packet plog2(const Packet& a) {
/** \internal \returns the square-root of \a a (coeff-wise) */
template<typename Packet> EIGEN_DECLARE_FUNCTION_ALLOWING_MULTIPLE_DEFINITIONS
-Packet psqrt(const Packet& a) { EIGEN_USING_STD(sqrt); return sqrt(a); }
+Packet psqrt(const Packet& a) { return numext::sqrt(a); }
/** \internal \returns the reciprocal square-root of \a a (coeff-wise) */
template<typename Packet> EIGEN_DECLARE_FUNCTION_ALLOWING_MULTIPLE_DEFINITIONS