Commit Graph

168 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
72db3f0fa5 Remove references to M_PI_2 and M_PI_4. 2022-10-11 00:27:16 +00:00
Rasmus Munk Larsen
e95c4a837f Simpler range reduction strategy for atan<float>(). 2022-10-04 18:11:00 +00:00
Antonio Sánchez
80efbfdeda Unconditionally enable CXX11 math. 2022-10-04 17:37:47 +00:00
Rasmus Munk Larsen
c475228b28 Vectorize atan() for double. 2022-10-01 01:49:30 +00:00
Rasmus Munk Larsen
1e1848fdb1 Add a vectorized implementation of atan2 to Eigen. 2022-09-28 20:46:49 +00:00
Rasmus Munk Larsen
273e0c884e Revert "Add constexpr, test for C++14 constexpr." 2022-09-16 21:14:29 +00:00
Tobias Schlüter
133498c329 Add constexpr, test for C++14 constexpr. 2022-09-07 03:42:34 +00:00
Antonio Sanchez
3e44f960ed Reduce compiler warnings for tests. 2022-09-06 18:20:56 +00:00
Rasmus Munk Larsen
bd393e15c3 Vectorize acos, asin, and atan for float. 2022-08-29 19:49:33 +00:00
Charles Schlosser
e5af9f87f2 Vectorize pow for integer base / exponent types 2022-08-29 19:23:54 +00:00
Rasmus Munk Larsen
7064ed1345 Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>. 2022-08-26 17:02:37 +00:00
Rasmus Munk Larsen
98e51c9e24 Avoid undefined behavior in array_cwise test due to signed integer overflow 2022-08-26 16:19:03 +00:00
Rasmus Munk Larsen
6aad0f821b Fix psign for unsigned integer types, such as bool. 2022-08-22 20:19:35 +00:00
Rasmus Munk Larsen
7c67dc67ae Use proper double word division algorithm for pow<double>. Gives 11-15% speedup. 2022-08-17 18:36:23 +00:00
Charles Schlosser
76a669fb45 add fixed power unary operation 2022-08-16 21:32:36 +00:00
Lexi Bromfield
66ea0c09fd Don't double-define Half functions on aarch64 2022-08-09 20:00:34 +00:00
Rasmus Munk Larsen
97e0784dc6 Vectorize the sign operator in Eigen. 2022-08-09 19:54:57 +00:00
Rasmus Munk Larsen
7a87ed1b6a Fix code and unit test for a few corner cases in vectorized pow() 2022-08-08 18:48:36 +00:00
Alexander Richardson
b7668c0371 Avoid including <sstream> with EIGEN_NO_IO 2022-07-29 18:02:51 +00:00
Tobias Schlüter
f3ba220c5d Remove EIGEN_EMPTY_STRUCT_CTOR 2022-04-08 18:27:26 +00:00
Antonio Sánchez
73b2c13bf2 Disable f16c scalar conversions for MSVC. 2022-03-30 18:35:32 +00:00
Tobias Schlüter
cb1e8228e9 Convert bit calculation to constexpr, avoid casts. 2022-03-13 22:38:36 +09:00
Rasmus Munk Larsen
0e6f4e43f1 Fix a few confusing comments in psincos_float. 2022-03-04 20:41:49 +00:00
Sean McBride
f1b9692d63 Removed EIGEN_UNUSED decorations from many functions that are in fact used 2022-03-03 20:19:33 +00:00
Rasmus Munk Larsen
92d0026b7b Provide a definition for numeric_limits static data members 2022-02-08 20:34:53 +00:00
Antonio Sánchez
6b60bd6754 Fix 32-bit arm int issue. 2022-02-04 21:59:33 +00:00
Rasmus Munk Larsen
96dc37a03b Some fixes/cleanups for numeric_limits & fix for related bug in psqrt 2022-01-07 01:10:17 +00:00
Rasmus Munk Larsen
7b5a8b6bc5 Improve plog: 20% speedup for float + handle denormals 2022-01-05 23:40:31 +00:00
Rasmus Munk Larsen
8eab7b6886 Improve exp<float>(): Don't flush denormal results +4% speedup.
1. Speed up exp(x) by reducing the polynomial approximant from degree 7 to
degree 6. With exactly representable coefficients computed by the Sollya tool,
this still gives a maximum relative error of 1 ulp, i.e. faithfully rounded, for
arguments where exp(x) is a normalized float. This change results in a speedup
of about 4% for AVX2.


2. Extend the range where exp(x) returns a non-zero result to from ~[-88;88] to
~[-104;88] i.e. return denormalized values for large negative arguments instead
of zero. Compared to exp<double>(x) the denormalized results gradually decrease
in accuracy down to 0.033 relative error for arguments around x = -104 where
exp(x) is ~std::numeric<float>::denorm_min(). This is expected and acceptable.
2021-12-28 15:00:19 +00:00
Erik Schultheis
dee6428a71 fixed clang warnings about alignment change and floating point precision 2021-12-18 17:18:16 +00:00
Rasmus Munk Larsen
f04fd8b168 Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385. 2021-12-08 17:57:23 +00:00
Erik Schultheis
ec2fd0f7ed Require recent GCC and MSCV and removed EIGEN_HAS_CXX14 and some other feature test macros 2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen
5137a5157a Make numeric_limits members constexpr as per the newer C++ standards.
Author: majnemer@google.com
2021-11-19 15:58:36 +00:00
Antonio Sanchez
e559701981 Fix compile issue for gcc 4.8 2021-10-28 08:23:19 -07:00
Rohit Santhanam
48e40b22bf Preliminary HIP bfloat16 GPU support. 2021-10-27 18:36:45 +00:00
Kolja Brix
afa616bc9e Fix some typos found 2021-09-23 15:22:00 +00:00
sciencewhiz
4b6036e276 fix various typos 2021-09-22 16:15:06 +00:00
Alexander Grund
b5eaa42695 Fix alias violation in BFloat16
reinterpret_cast between unrelated types is undefined behavior and leads
to misoptimizations on some platforms.
Use the safer (and faster) version via bit_cast
2021-09-20 10:37:50 +02:00
Rasmus Munk Larsen
d7d0bf832d Issue an error in case of direct inclusion of internal headers. 2021-09-10 19:12:26 +00:00
Gauri Deshpande
e6a5a594a7 remove denormal flushing in fp32tobf16 for avx & avx512 2021-08-09 22:15:21 +00:00
Rasmus Munk Larsen
7b35638ddb Fix breakage of conj_helper in conjunction with custom types introduced in !537. 2021-07-02 20:42:15 +00:00
Rasmus Munk Larsen
bbfc4d54cd Use padd instead of +. 2021-07-02 02:51:48 +00:00
Rasmus Munk Larsen
9312a5bf5c Implement a generic vectorized version of Smith's algorithms for complex division. 2021-07-01 23:31:12 +00:00
Rasmus Munk Larsen
5aebbe9098 Get rid of redundant pabs instruction in complex square root. 2021-06-29 23:26:15 +00:00
Rohit Santhanam
2d132d1736 Commit 52a5f982 broke conjhelper functionality for HIP GPUs.
This commit addresses this.
2021-06-25 19:28:00 +00:00
Rasmus Munk Larsen
52a5f98212 Get rid of code duplication for conj_helper. For packets where LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations. 2021-06-24 15:47:48 -07:00
Antonio Sanchez
12e8d57108 Remove pset, replace with ploadu.
We can't make guarantees on alignment for existing calls to `pset`,
so we should default to loading unaligned.  But in that case, we should
just use `ploadu` directly. For loading constants, this load should hopefully
get optimized away.

This is causing segfaults in Google Maps.
2021-06-16 18:41:17 -07:00
Rasmus Munk Larsen
fc87e2cbaa Use bit_cast to create -0.0 for floating point types to avoid compiler optimization changing sign with --ffast-math enabled. 2021-06-11 02:35:53 +00:00
Antonio Sanchez
87729ea39f Eliminate round_impl double-promotion warnings for c++03. 2021-03-25 16:52:19 +00:00
Antonio Sanchez
8dfe1029a5 Augment NumTraits with min/max_exponent() again.
Replace usage of `std::numeric_limits<...>::min/max_exponent` in
codebase where possible.  Also replaced some other `numeric_limits`
usages in affected tests with the `NumTraits` equivalent.

The previous MR !443 failed for c++03 due to lack of `constexpr`.
Because of this, we need to keep around the `std::numeric_limits`
version in enum expressions until the switch to c++11.

Fixes #2148
2021-03-16 20:12:46 -07:00