1367 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
283d871a3f Add missing EIGEN_DEVICE_FUNCTION decorations. 2024-11-08 14:25:57 -08:00
Rasmus Munk Larsen
0d366f6532 Vectorize erfc(x) for double and improve erfc(x) for float. 2024-11-08 17:21:11 +00:00
Charles Schlosser
8adf43640e more avx predux_any 2024-11-07 19:58:48 +00:00
Charles Schlosser
bc424f617a add missing avx predux_any functions 2024-11-07 19:11:29 +00:00
Antonio Sánchez
dd4c2805d9 Fix clang6 failures. 2024-10-29 22:18:30 +00:00
Rasmus Munk Larsen
58b252e5b3 Fix typo in PacketMath.h 2024-10-28 18:19:52 +00:00
Rasmus Munk Larsen
6c04d0cd68 Add missing exp2 definition for Altivec. 2024-10-28 18:12:36 +00:00
Peter Gavin
b15ebb1c2d add nextafter for bfloat16 2024-10-26 00:08:25 +00:00
Rasmus Munk Larsen
3f067c4850 Add exp2() as a packet op and array method. 2024-10-22 22:09:34 +00:00
Rasmus Munk Larsen
74dcfbbd0f Use ppolevl for polynomial evaluation in more places. 2024-10-07 13:27:28 -07:00
Sean McBride
b6b8b54e5e Fixed issue #2858: removed unneeded call to _mm_setzero_si128 2024-09-24 16:29:45 +00:00
Antonio Sánchez
132f281f50 Fix generic ceil for SSE2. 2024-09-14 01:31:21 +00:00
qile lin
072ec9d954 Fix a bug for pcmp_lt_or_nan and Add sqrt support for SVE 2024-09-04 21:45:39 +00:00
Rasmus Munk Larsen
9315389795 Fix bug in bug fix for atanh. 2024-09-04 09:37:59 -07:00
Rasmus Munk Larsen
f33af052e0 Fix bug for atanh(-1). 2024-09-03 20:54:01 +00:00
Rasmus Munk Larsen
66927f7807 Fix out-of-range arguments to _mm_permute_pd. 2024-08-30 17:31:52 +00:00
Rasmus Munk Larsen
bbdabebf44 Vectorize atanh<double>. Make atanh(x) standard compliant for |x| >= 1. 2024-08-30 17:27:55 +00:00
Morris Hafner
26e2c4f617 Add nvc++ support 2024-08-30 12:34:48 +00:00
Charles Schlosser
648bce6cae SSE/AVX Complex FMA 2024-08-29 17:37:57 +00:00
qile lin
3b5a1b4157 sve instrinsics with "_x" suffix will be faster than "_z" suffix 2024-08-23 12:52:22 +00:00
Rasmus Munk Larsen
98f1ac5e65 Fix breakage in GPU build. 2024-08-23 06:08:37 +00:00
Tobias Wood
2bf8fe1489 NEON Complex Intrinsics 2024-08-22 22:46:16 +00:00
Rasmus Munk Larsen
f91f8e9ab9 Consolidate float and double implementations of patan(). 2024-08-21 20:44:18 +00:00
Rasmus Munk Larsen
32d95bb097 Add vectorized implementation of tanh<double> 2024-08-21 02:29:45 +00:00
Rasmus Munk Larsen
cc240eea2f Speed up and improve accuracy of tanh. 2024-08-16 23:46:28 +00:00
Charles Schlosser
59498c96fe SSE/AVX use fmaddsub for complex products 2024-08-05 21:26:05 +00:00
Tyler Veness
d14b0a4e53 Remove C++23 check around has_denorm deprecation suppression 2024-08-03 21:34:27 +00:00
Jatin Chaudhary
24db460503 hlog symbol lookup should not restricted to global namespace 2024-08-03 03:59:13 +00:00
Alexander Grund
767e60e290 Fix Woverflow warnings in PacketMathFP16 2024-08-03 03:57:18 +00:00
Alexander Grund
8025683226 Fix conversion of Eigen::half to _Float16 in AVX512 code 2024-08-03 03:49:51 +00:00
Mike Taves
c593e9e948 Fix typos 2024-08-02 00:06:24 +00:00
Frédéric Chapoton
6331da95eb fixing a lot of typos 2024-07-30 22:15:49 +00:00
Rasmus Munk Larsen
d791d48859 Fix AVX512FP16 build failure 2024-06-18 22:34:32 +00:00
Charles Schlosser
b430eb31e2 AVX512F double->int64_t cast 2024-06-15 17:45:02 +00:00
Tyler Veness
b9b1c8661e Suppress C++23 deprecation warnings for std::has_denorm and std::has_denorm_loss 2024-05-17 15:55:22 +00:00
Chip Kerchner
4d1d14e069 Change predux on PowerPC for Packet4i to NOT saturate the sum of the elements (like other architectures). 2024-05-08 22:39:27 +00:00
Rasmus Munk Larsen
9000b37677 Fix new generic nearest integer ops on GPU. 2024-04-30 22:18:25 +00:00
Charles Schlosser
fb95e90f7f Add truncation op 2024-04-29 23:45:49 +00:00
Antonio Sánchez
dcceb9afec Unbork avx512 preduce_mul on MSVC. 2024-04-26 15:28:03 +00:00
Antonio Sanchez
1c8c734c8b Fix sin/cos on PPC. 2024-04-24 15:58:03 -07:00
Rasmus Munk Larsen
112ad8b846 Revert part of !1583, which may cause underflow on ARM. 2024-04-22 21:14:38 +00:00
Charles Schlosser
5635d37f46 more pblend optimizations 2024-04-19 02:02:27 +00:00
Antonio Sánchez
f0795d35e3 Fix new psincos for ppc and arm32. 2024-04-19 00:31:09 +00:00
Chip Kerchner
ad452e575d Fix compilation problems with PacketI on PowerPC. 2024-04-18 14:55:15 +00:00
Charles Schlosser
fcaf03ef7c fix pendantic compiler warnings 2024-04-17 16:55:45 +00:00
Rasmus Munk Larsen
b5feca5d03 Fix build for pblend and psin_double, pcos_double when AVX but not AVX2 is supported. 2024-04-16 16:12:41 +00:00
Damiano Franzò
888fca0e2b Simd sincos double 2024-04-15 21:12:32 +00:00
Charles Schlosser
6ad2ccea4e Eigen pblend 2024-04-15 16:19:53 +00:00
Charles Schlosser
9099c5eac7 Handle missing AVX512 intrinsic 2024-04-14 16:41:23 +00:00
Charles Schlosser
122befe54c Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings 2024-04-12 19:35:04 +00:00