Rasmus Munk Larsen
|
283d871a3f
|
Add missing EIGEN_DEVICE_FUNCTION decorations.
|
2024-11-08 14:25:57 -08:00 |
|
Rasmus Munk Larsen
|
0d366f6532
|
Vectorize erfc(x) for double and improve erfc(x) for float.
|
2024-11-08 17:21:11 +00:00 |
|
Charles Schlosser
|
8adf43640e
|
more avx predux_any
|
2024-11-07 19:58:48 +00:00 |
|
Charles Schlosser
|
bc424f617a
|
add missing avx predux_any functions
|
2024-11-07 19:11:29 +00:00 |
|
Antonio Sánchez
|
dd4c2805d9
|
Fix clang6 failures.
|
2024-10-29 22:18:30 +00:00 |
|
Rasmus Munk Larsen
|
58b252e5b3
|
Fix typo in PacketMath.h
|
2024-10-28 18:19:52 +00:00 |
|
Rasmus Munk Larsen
|
6c04d0cd68
|
Add missing exp2 definition for Altivec.
|
2024-10-28 18:12:36 +00:00 |
|
Peter Gavin
|
b15ebb1c2d
|
add nextafter for bfloat16
|
2024-10-26 00:08:25 +00:00 |
|
Rasmus Munk Larsen
|
3f067c4850
|
Add exp2() as a packet op and array method.
|
2024-10-22 22:09:34 +00:00 |
|
Rasmus Munk Larsen
|
74dcfbbd0f
|
Use ppolevl for polynomial evaluation in more places.
|
2024-10-07 13:27:28 -07:00 |
|
Sean McBride
|
b6b8b54e5e
|
Fixed issue #2858: removed unneeded call to _mm_setzero_si128
|
2024-09-24 16:29:45 +00:00 |
|
Antonio Sánchez
|
132f281f50
|
Fix generic ceil for SSE2.
|
2024-09-14 01:31:21 +00:00 |
|
qile lin
|
072ec9d954
|
Fix a bug for pcmp_lt_or_nan and Add sqrt support for SVE
|
2024-09-04 21:45:39 +00:00 |
|
Rasmus Munk Larsen
|
9315389795
|
Fix bug in bug fix for atanh.
|
2024-09-04 09:37:59 -07:00 |
|
Rasmus Munk Larsen
|
f33af052e0
|
Fix bug for atanh(-1).
|
2024-09-03 20:54:01 +00:00 |
|
Rasmus Munk Larsen
|
66927f7807
|
Fix out-of-range arguments to _mm_permute_pd.
|
2024-08-30 17:31:52 +00:00 |
|
Rasmus Munk Larsen
|
bbdabebf44
|
Vectorize atanh<double>. Make atanh(x) standard compliant for |x| >= 1.
|
2024-08-30 17:27:55 +00:00 |
|
Morris Hafner
|
26e2c4f617
|
Add nvc++ support
|
2024-08-30 12:34:48 +00:00 |
|
Charles Schlosser
|
648bce6cae
|
SSE/AVX Complex FMA
|
2024-08-29 17:37:57 +00:00 |
|
qile lin
|
3b5a1b4157
|
sve intrinsics with "_x" suffix will be faster than "_z" suffix
|
2024-08-23 12:52:22 +00:00 |
|
Rasmus Munk Larsen
|
98f1ac5e65
|
Fix breakage in GPU build.
|
2024-08-23 06:08:37 +00:00 |
|
Tobias Wood
|
2bf8fe1489
|
NEON Complex Intrinsics
|
2024-08-22 22:46:16 +00:00 |
|
Rasmus Munk Larsen
|
f91f8e9ab9
|
Consolidate float and double implementations of patan().
|
2024-08-21 20:44:18 +00:00 |
|
Rasmus Munk Larsen
|
32d95bb097
|
Add vectorized implementation of tanh<double>
|
2024-08-21 02:29:45 +00:00 |
|
Rasmus Munk Larsen
|
cc240eea2f
|
Speed up and improve accuracy of tanh.
|
2024-08-16 23:46:28 +00:00 |
|
Charles Schlosser
|
59498c96fe
|
SSE/AVX use fmaddsub for complex products
|
2024-08-05 21:26:05 +00:00 |
|
Tyler Veness
|
d14b0a4e53
|
Remove C++23 check around has_denorm deprecation suppression
|
2024-08-03 21:34:27 +00:00 |
|
Jatin Chaudhary
|
24db460503
|
hlog symbol lookup should not be restricted to global namespace
|
2024-08-03 03:59:13 +00:00 |
|
Alexander Grund
|
767e60e290
|
Fix Woverflow warnings in PacketMathFP16
|
2024-08-03 03:57:18 +00:00 |
|
Alexander Grund
|
8025683226
|
Fix conversion of Eigen::half to _Float16 in AVX512 code
|
2024-08-03 03:49:51 +00:00 |
|
Mike Taves
|
c593e9e948
|
Fix typos
|
2024-08-02 00:06:24 +00:00 |
|
Frédéric Chapoton
|
6331da95eb
|
fixing a lot of typos
|
2024-07-30 22:15:49 +00:00 |
|
Rasmus Munk Larsen
|
d791d48859
|
Fix AVX512FP16 build failure
|
2024-06-18 22:34:32 +00:00 |
|
Charles Schlosser
|
b430eb31e2
|
AVX512F double->int64_t cast
|
2024-06-15 17:45:02 +00:00 |
|
Tyler Veness
|
b9b1c8661e
|
Suppress C++23 deprecation warnings for std::has_denorm and std::has_denorm_loss
|
2024-05-17 15:55:22 +00:00 |
|
Chip Kerchner
|
4d1d14e069
|
Change predux on PowerPC for Packet4i to NOT saturate the sum of the elements (like other architectures).
|
2024-05-08 22:39:27 +00:00 |
|
Rasmus Munk Larsen
|
9000b37677
|
Fix new generic nearest integer ops on GPU.
|
2024-04-30 22:18:25 +00:00 |
|
Charles Schlosser
|
fb95e90f7f
|
Add truncation op
|
2024-04-29 23:45:49 +00:00 |
|
Antonio Sánchez
|
dcceb9afec
|
Unbork avx512 preduce_mul on MSVC.
|
2024-04-26 15:28:03 +00:00 |
|
Antonio Sanchez
|
1c8c734c8b
|
Fix sin/cos on PPC.
|
2024-04-24 15:58:03 -07:00 |
|
Rasmus Munk Larsen
|
112ad8b846
|
Revert part of !1583, which may cause underflow on ARM.
|
2024-04-22 21:14:38 +00:00 |
|
Charles Schlosser
|
5635d37f46
|
more pblend optimizations
|
2024-04-19 02:02:27 +00:00 |
|
Antonio Sánchez
|
f0795d35e3
|
Fix new psincos for ppc and arm32.
|
2024-04-19 00:31:09 +00:00 |
|
Chip Kerchner
|
ad452e575d
|
Fix compilation problems with PacketI on PowerPC.
|
2024-04-18 14:55:15 +00:00 |
|
Charles Schlosser
|
fcaf03ef7c
|
fix pedantic compiler warnings
|
2024-04-17 16:55:45 +00:00 |
|
Rasmus Munk Larsen
|
b5feca5d03
|
Fix build for pblend and psin_double, pcos_double when AVX but not AVX2 is supported.
|
2024-04-16 16:12:41 +00:00 |
|
Damiano Franzò
|
888fca0e2b
|
Simd sincos double
|
2024-04-15 21:12:32 +00:00 |
|
Charles Schlosser
|
6ad2ccea4e
|
Eigen pblend
|
2024-04-15 16:19:53 +00:00 |
|
Charles Schlosser
|
9099c5eac7
|
Handle missing AVX512 intrinsic
|
2024-04-14 16:41:23 +00:00 |
|
Charles Schlosser
|
122befe54c
|
Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings
|
2024-04-12 19:35:04 +00:00 |
|