eigen/Eigen/src
Rasmus Munk Larsen f1e8307308
1. Fix a bug in psqrt and make it return 0 for +inf arguments.
2. Simplify handling of special cases by taking advantage of the fact that the
   builtin vrsqrt approximation handles negative, zero and +inf arguments correctly.
   This speeds up the SSE and AVX implementations by ~20%.
3. Make the Newton-Raphson formula used for rsqrt more numerically robust:

Before: y = y * (1.5 - x/2 * y^2)
After: y = y * (1.5 - y * (x/2) * y)

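Forming y^2 can overflow for very small (denormalized) values of x and underflow for very large ones, while the intermediate product y * (x/2) ~= sqrt(x)/2 stays in range and the full product y * (x/2) * y ~= 1/2. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.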

4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.
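
A minimal scalar sketch of the reasoning in items 3 and 4, not the packet code from this commit. The helper names (approx_rsqrt, nr_step_before, nr_step_after) are hypothetical, and the hardware estimate is simulated by perturbing the exact value with a chosen relative error, which is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a hardware rsqrt estimate (vrsqrt-style):
// the exact value perturbed by a chosen relative error. Not an Eigen API.
template <typename T>
T approx_rsqrt(T x, double rel_err) {
  assert(x > T(0));
  const double exact = 1.0 / std::sqrt(static_cast<double>(x));
  return static_cast<T>(exact * (1.0 + rel_err));
}

// "Before" form: y * (1.5 - x/2 * y^2).
// y^2 is formed explicitly; in float it overflows to +inf once y exceeds
// roughly 1.8e19, which happens for denormal x.
template <typename T>
T nr_step_before(T x, T y) {
  const T y2 = y * y;
  return y * (T(1.5) - (T(0.5) * x) * y2);
}

// "After" form: y * (1.5 - y * (x/2) * y), evaluated left to right.
// The intermediate y * (x/2) is about sqrt(x)/2, and multiplying by y
// again gives about 1/2, so no intermediate leaves the representable range.
template <typename T>
T nr_step_after(T x, T y) {
  const T t = y * (T(0.5) * x);
  return y * (T(1.5) - t * y);
}
```

With x = 1e-42f and a starting estimate that is off by 0.1%, nr_step_before returns an infinite result, while nr_step_after returns a finite value close to 1/sqrt(x). The same step also suggests why a single iteration can be enough in item 4: if y = (1 + e) / sqrt(x), one step gives (1 - 1.5 e^2 - 0.5 e^3) / sqrt(x), so an estimate with relative error 2^-28 is refined to roughly 1.5 * 2^-56, below double precision rounding error.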

Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o
2019-11-15 17:09:46 -08:00