Add ULP accuracy measurement tool and documentation for vectorized math functions

libeigen/eigen!2153 Co-authored-by: Rasmus Munk Larsen <rmlarsen@gmail.com>
2026-04-10 11:34:33 +08:00 · 2026-03-01 13:22:16 -08:00
parent c20b6f5c41
commit c66fc52868
5 changed files with 973 additions and 0 deletions
--- a/doc/CoeffwiseMathFunctionsTable.dox
+++ b/doc/CoeffwiseMathFunctionsTable.dox
@@ -608,6 +608,133 @@ This also means that, unless specified, if the function \c std::foo is available

 \n

+\section CoeffwiseMathFunctionsAccuracy Accuracy of vectorized math functions
+
+The following tables summarize the accuracy of %Eigen's vectorized implementations, measured
+in units of ULP (Unit in the Last Place) on an x86-64 system (Intel Xeon, GCC) with the SSE2
+SIMD target. The reference values were computed using
+<a href="https://www.mpfr.org/">MPFR</a> at 128-bit precision.
+Float results are exhaustive over all ~4.28 billion finite representable values.
+Double results sample ~2.88 billion values using geometric stepping with a relative step of 10<sup>-6</sup>.
+
+These numbers may differ for other SIMD targets (AVX, AVX512, NEON, SVE, etc.)
+since each has its own packet math implementations. Functions marked "delegates to std"
+do not have a custom vectorized implementation for the tested SIMD target &mdash; they call
+the standard library function element-by-element.
+
+The full histograms for each function can be generated with the \c ulp_accuracy tool
+in <tt>test/ulp_accuracy/</tt>.
+
+\subsection CoeffwiseMathFunctionsAccuracy_float Float precision
+
+<table class="manual-hl">
+<tr><th>Function</th><th>Max |ULP|</th><th>Mean |ULP|</th><th>% Exact</th><th>Notes</th></tr>
+<tr><th colspan="5">Trigonometric</th></tr>
+<tr><td>sin</td>  <td>2</td>  <td>0.087</td> <td>91.3%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note1 "1"</td></tr>
+<tr><td>cos</td>  <td>2</td>  <td>0.088</td> <td>91.2%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note1 "1"</td></tr>
+<tr><td>tan</td>  <td>5</td>  <td>0.238</td> <td>77.3%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note1 "1"</td></tr>
+<tr><td>asin</td> <td>4</td>  <td>0.726</td> <td>51.3%</td> <td></td></tr>
+<tr><td>acos</td> <td>4</td>  <td>0.057</td> <td>95.0%</td> <td></td></tr>
+<tr><td>atan</td> <td>4</td>  <td>0.061</td> <td>94.0%</td> <td></td></tr>
+<tr><th colspan="5">Hyperbolic</th></tr>
+<tr><td>sinh</td>  <td>2</td>  <td>0.017</td> <td>98.3%</td> <td></td></tr>
+<tr><td>cosh</td>  <td>2</td>  <td>0.004</td> <td>99.6%</td> <td></td></tr>
+<tr><td>tanh</td>  <td>6</td>  <td>0.030</td> <td>97.2%</td> <td></td></tr>
+<tr><td>asinh</td> <td>2</td>  <td>0.145</td> <td>85.5%</td> <td></td></tr>
+<tr><td>acosh</td> <td>2</td>  <td>0.057</td> <td>94.3%</td> <td></td></tr>
+<tr><td>atanh</td> <td>2</td>  <td>0.004</td> <td>99.6%</td> <td></td></tr>
+<tr><th colspan="5">Exponential / Logarithmic</th></tr>
+<tr><td>exp</td>   <td>1</td>  <td>0.018</td> <td>98.2%</td> <td></td></tr>
+<tr><td>exp2</td>  <td>6</td>  <td>0.034</td> <td>97.3%</td> <td></td></tr>
+<tr><td>expm1</td> <td>5</td>  <td>0.060</td> <td>94.6%</td> <td></td></tr>
+<tr><td>log</td>   <td>3</td>  <td>0.120</td> <td>88.0%</td> <td></td></tr>
+<tr><td>log1p</td> <td>5</td>  <td>0.134</td> <td>87.5%</td> <td></td></tr>
+<tr><td>log10</td> <td>2</td>  <td>0.007</td> <td>99.3%</td> <td></td></tr>
+<tr><td>log2</td>  <td>5</td>  <td>0.005</td> <td>99.5%</td> <td></td></tr>
+<tr><th colspan="5">Error / Special</th></tr>
+<tr><td>erf</td>      <td>7</td>     <td>0.332</td> <td>67.5%</td> <td></td></tr>
+<tr><td>erfc</td>     <td>8</td>     <td>0.010</td> <td>99.2%</td> <td></td></tr>
+<tr><td>lgamma</td>   <td colspan="3"><em>delegates to std</em></td> <td>\ref CoeffwiseMathFunctionsAccuracy_note3 "3"</td></tr>
+<tr><th colspan="5">Other</th></tr>
+<tr><td>logistic</td> <td>7</td>     <td>0.040</td> <td>97.0%</td> <td></td></tr>
+<tr><td>sqrt</td>     <td>0</td>     <td>0.000</td> <td>100%</td>  <td>Uses hardware sqrt</td></tr>
+<tr><td>cbrt</td>     <td>2</td>     <td>0.552</td> <td>49.1%</td> <td></td></tr>
+<tr><td>rsqrt</td>    <td>&infin;</td> <td>0.114</td> <td>88.2%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note4 "4"</td></tr>
+</table>
+
+\subsection CoeffwiseMathFunctionsAccuracy_double Double precision
+
+<table class="manual-hl">
+<tr><th>Function</th><th>Max |ULP|</th><th>Mean |ULP|</th><th>% Exact</th><th>Notes</th></tr>
+<tr><th colspan="5">Trigonometric</th></tr>
+<tr><td>sin</td>  <td>13,879,755</td> <td>0.093</td> <td>93.2%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note1 "1"</td></tr>
+<tr><td>cos</td>  <td>2,024,130</td>  <td>0.043</td> <td>98.2%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note1 "1"</td></tr>
+<tr><td>tan</td>  <td>13,879,755</td> <td>0.128</td> <td>92.7%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note1 "1"</td></tr>
+<tr><td>asin</td> <td>1</td>          <td>&lt;0.001</td> <td>&gt;99.9%</td> <td></td></tr>
+<tr><td>acos</td> <td>1</td>          <td>&lt;0.001</td> <td>&gt;99.9%</td> <td></td></tr>
+<tr><td>atan</td> <td>5</td>          <td>0.013</td> <td>98.8%</td> <td></td></tr>
+<tr><th colspan="5">Hyperbolic</th></tr>
+<tr><td>sinh</td>  <td>2</td>  <td>0.004</td> <td>99.6%</td> <td></td></tr>
+<tr><td>cosh</td>  <td>2</td>  <td>0.001</td> <td>99.9%</td> <td></td></tr>
+<tr><td>tanh</td>  <td>8</td>  <td>0.008</td> <td>99.3%</td> <td></td></tr>
+<tr><td>asinh</td> <td>2</td>  <td>0.098</td> <td>90.2%</td> <td></td></tr>
+<tr><td>acosh</td> <td>2</td>  <td>0.047</td> <td>95.3%</td> <td></td></tr>
+<tr><td>atanh</td> <td>2</td>  <td>&lt;0.001</td> <td>&gt;99.9%</td> <td></td></tr>
+<tr><th colspan="5">Exponential / Logarithmic</th></tr>
+<tr><td>exp</td>   <td>2</td>   <td>0.001</td> <td>99.9%</td> <td></td></tr>
+<tr><td>exp2</td>  <td>214</td> <td>0.107</td> <td>99.6%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note2 "2"</td></tr>
+<tr><td>expm1</td> <td>3</td>   <td>0.010</td> <td>99.1%</td> <td></td></tr>
+<tr><td>log</td>   <td>2</td>   <td>0.147</td> <td>85.3%</td> <td></td></tr>
+<tr><td>log1p</td> <td>3</td>   <td>0.097</td> <td>90.6%</td> <td></td></tr>
+<tr><td>log10</td> <td>2</td>   <td>0.001</td> <td>99.9%</td> <td></td></tr>
+<tr><td>log2</td>  <td>2</td>   <td>&lt;0.001</td> <td>99.9%</td> <td></td></tr>
+<tr><th colspan="5">Error / Special</th></tr>
+<tr><td>erf</td>      <td>&infin;</td> <td>0.050</td> <td>70.5%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note5 "5"</td></tr>
+<tr><td>erfc</td>     <td>11</td>      <td>0.002</td> <td>99.9%</td> <td></td></tr>
+<tr><td>lgamma</td>   <td colspan="3"><em>delegates to std</em></td> <td>\ref CoeffwiseMathFunctionsAccuracy_note3 "3"</td></tr>
+<tr><th colspan="5">Other</th></tr>
+<tr><td>logistic</td> <td>3</td>       <td>0.008</td> <td>99.2%</td> <td></td></tr>
+<tr><td>sqrt</td>     <td>0</td>       <td>0.000</td> <td>100%</td>  <td>Uses hardware sqrt</td></tr>
+<tr><td>cbrt</td>     <td>2</td>       <td>0.119</td> <td>88.1%</td> <td></td></tr>
+<tr><td>rsqrt</td>    <td>&infin;</td> <td>0.135</td> <td>86.5%</td> <td>\ref CoeffwiseMathFunctionsAccuracy_note4 "4"</td></tr>
+</table>
+
+\subsection CoeffwiseMathFunctionsAccuracy_notes Notes
+
+\anchor CoeffwiseMathFunctionsAccuracy_note1
+<b>1. sin/cos/tan argument reduction:</b>
+%Eigen's vectorized sin, cos, and tan use a Cody-Waite argument reduction scheme that
+subtracts multiples of &pi;/2 from the input. For very large arguments (|x| &gt; ~10<sup>4</sup>
+in float, |x| &gt; ~10 in double), this reduction loses precision, producing
+occasional large ULP errors. The mean error remains low because most representable
+values are small. Applications that need high accuracy for large arguments should
+perform argument reduction in user code before calling these functions.
+
+\anchor CoeffwiseMathFunctionsAccuracy_note2
+<b>2. exp2 double precision:</b>
+The exp2 implementation for double shows a max error of 214 ULP near the overflow
+boundary (x &asymp; 1022). The mean error is still low (0.107 ULP, 99.6% exact), so
+the large max error affects only inputs very close to overflow.
+
+\anchor CoeffwiseMathFunctionsAccuracy_note3
+<b>3. lgamma:</b>
+The vectorized lgamma delegates to the standard library function \c std::lgamma for
+this SIMD target (SSE2) and therefore has the same accuracy as the platform's C math library.
+
+\anchor CoeffwiseMathFunctionsAccuracy_note4
+<b>4. rsqrt max=&infin;:</b>
+The infinite max ULP is due to a disagreement at the input -0:
+rsqrt(-0) returns -&infin; in %Eigen but the MPFR reference produces NaN (rsqrt of a negative
+value). Apart from this edge case and the float subnormals noted next, the error is &lt; 2 ULP.
+For float, 16.8 million subnormal negative inputs (0.4%) also produce &plusmn;&infin; where the reference gives NaN.
+The mean error excluding these outliers is well below 1 ULP.
+
+\anchor CoeffwiseMathFunctionsAccuracy_note5
+<b>5. erf double &infin;:</b>
+The vectorized erf for double returns NaN for &plusmn;&infin; instead of the correct &plusmn;1.
+This produces an infinite max ULP error at those inputs only. Excluding &plusmn;&infin;,
+the max error is 3 ULP and the mean is 0.050 ULP.
+
 */

 }
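
Note on the ULP metric used in the tables above: the sketch below shows one common way to measure a ULP distance for float results. It is only an illustration under stated assumptions, not the code of the \c ulp_accuracy tool. The helper names ordered_bits and ulp_distance are hypothetical, the reference is passed as a double rather than an MPFR value, and reporting a NaN or infinity mismatch as an infinite error is a convention chosen here to mirror the &infin; entries in the tables.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <limits>

// Hypothetical helper: map a float to an integer that increases with the float's value,
// so that adjacent finite floats differ by exactly 1. Both -0 and +0 map to 0.
inline std::int64_t ordered_bits(float x) {
  std::uint32_t u;
  std::memcpy(&u, &x, sizeof u);  // read the bit pattern without aliasing problems
  const std::int64_t magnitude = static_cast<std::int64_t>(u & 0x7fffffffu);
  return (u & 0x80000000u) ? -magnitude : magnitude;
}

// Hypothetical helper: distance in float ULPs between a computed value and a
// higher-precision reference that is first rounded to float.
// Assumption: a NaN or infinity mismatch counts as an infinite error.
inline double ulp_distance(float got, double reference) {
  const double inf = std::numeric_limits<double>::infinity();
  const float ref = static_cast<float>(reference);
  if (std::isnan(got) && std::isnan(ref)) return 0.0;
  if (std::isnan(got) || std::isnan(ref)) return inf;
  if (std::isinf(got) || std::isinf(ref)) return (got == ref) ? 0.0 : inf;
  return static_cast<double>(std::llabs(ordered_bits(got) - ordered_bits(ref)));
}
```

With these definitions, a result that is the next float above 1.0 measured against a reference of 1.0 has a distance of 1, and a result of -&infin; measured against a NaN reference has an infinite distance, which is the situation described in note 4.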