Chip Kerchner
ce60a7be83
Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API.
2022-08-03 18:15:19 +00:00
Ilya Tokar
e618c4a5e9
Improve pblend AVX implementation
2022-07-29 18:45:33 +00:00
Alexander Richardson
b7668c0371
Avoid including <sstream> with EIGEN_NO_IO
2022-07-29 18:02:51 +00:00
Antonio Sánchez
2cf4d18c9c
Disable AVX512 GEMM kernels by default.
2022-07-20 21:22:48 +00:00
b-shi
4a56359406
Add option to disable avx512 GEBP kernels
2022-07-18 17:59:09 +00:00
Chip Kerchner
84cf3ff18d
Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial.
2022-06-27 19:18:00 +00:00
Chip Kerchner
c603275dc9
Better performance for Power10 using more load and store vector pairs for GEMV
2022-06-27 18:11:55 +00:00
b-shi
37673ca1bc
AVX512 TRSM kernels use alloca if EIGEN_NO_MALLOC requested
2022-06-17 18:05:26 +00:00
Chip Kerchner
4d1c16eab8
Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX.
2022-06-15 16:06:43 +00:00
Shi, Brian
28812d2ebb
AVX512 TRSM Kernels respect EIGEN_NO_MALLOC
2022-06-07 11:28:42 -07:00
aaraujom
8fbb76a043
Fix build issues with MSVC for AVX512
2022-06-03 14:55:40 +00:00
aaraujom
d49ede4dc4
Add AVX512 s/dgemm optimizations for compute kernel (2nd try)
2022-05-28 02:00:21 +00:00
Chip Kerchner
aa8b7e2c37
Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster)
2022-05-23 15:18:29 +00:00
Guoqiang QI
32a3f9ac33
Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h
2022-05-23 09:33:49 +00:00
Eisuke Kawashima
ac5c83a3f5
unset executable flag
2022-05-22 22:47:43 +09:00
Antonio Sánchez
9b9496ad98
Revert "Add AVX512 optimizations for matrix multiply"
This reverts commit 25db0b4a82
2022-05-13 18:50:33 +00:00
aaraujom
25db0b4a82
Add AVX512 optimizations for matrix multiply
2022-05-12 23:41:19 +00:00
Chip Kerchner
c2f15edc43
Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV.
2022-04-25 16:23:01 +00:00
Chip Kerchner
44ba7a0da3
Fix compiler bugs for GCC 10 & 11 for Power GEMM
2022-04-20 15:59:00 +00:00
Chip Kerchner
b02c384ef4
Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub
2022-04-18 16:16:32 +00:00
Shi, Brian
fc1d888415
Remove AVX512VL dependency in trsm
2022-04-14 12:44:24 -07:00
Antonio Sánchez
07db964bde
Restrict new AVX512 trsm to AVX512VL, rename files for consistency.
2022-04-14 16:58:32 +00:00
Chip Kerchner
53eec53d2a
Fix Power GEMV order of operations in predux for MMA.
2022-04-11 21:29:05 +00:00
Tobias Schlüter
f3ba220c5d
Remove EIGEN_EMPTY_STRUCT_CTOR
2022-04-08 18:27:26 +00:00
Chip Kerchner
403fa33409
Performance improvements in GEMM for Power
2022-04-05 12:18:53 +00:00
Antonio Sánchez
73b2c13bf2
Disable f16c scalar conversions for MSVC.
2022-03-30 18:35:32 +00:00
b-shi
0611f7fff0
Add missing explicit reinterprets
2022-03-23 21:10:26 +00:00
Chip Kerchner
0699fa06fe
Split general_matrix_vector_product interface for Power into two macros - one ColMajor and RowMajor.
2022-03-23 18:09:33 +00:00
Antonio Sánchez
4451823fb4
Fix ODR violation in trsm.
2022-03-20 15:56:53 +00:00
Antonio Sánchez
9a14d91a99
Fix AVX512 builds with MSVC.
2022-03-18 16:04:53 +00:00
Chip Kerchner
7b10795e39
Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's...
2022-03-17 22:35:27 +00:00
Antonio Sanchez
e34db1239d
Fix missing pound
2022-03-16 12:26:12 -07:00
Antonio Sánchez
591906477b
Fix up PowerPC MMA flags so it builds by default.
2022-03-16 19:16:28 +00:00
b-shi
518fc321cb
AVX512 Optimizations for Triangular Solve
2022-03-16 18:04:50 +00:00
Erik Schultheis
421cbf0866
Replace Eigen type metaprogramming with corresponding std types and make use of alias templates
2022-03-16 16:43:40 +00:00
Rasmus Munk Larsen
9ad5661482
Revert "Fix up PowerPC MMA flags so it builds by default."
2022-03-15 20:51:03 +00:00
Antonio Sánchez
65eeedf964
Fix up PowerPC MMA flags so it builds by default.
2022-03-15 20:22:23 +00:00
Tobias Schlüter
cb1e8228e9
Convert bit calculation to constexpr, avoid casts.
2022-03-13 22:38:36 +09:00
Duncan McBain
a3b64625e3
Remove ComputeCpp-specific code from SYCL Vptr
2022-03-08 22:44:18 +00:00
Rasmus Munk Larsen
0e6f4e43f1
Fix a few confusing comments in psincos_float.
2022-03-04 20:41:49 +00:00
Sean McBride
f1b9692d63
Removed EIGEN_UNUSED decorations from many functions that are in fact used
2022-03-03 20:19:33 +00:00
Antonio Sánchez
9c07e201ff
Modified sqrt/rsqrt for denormal handling.
2022-03-02 17:20:47 +00:00
Antonio Sánchez
19c39bea29
Fix mixingtypes for g++-11.
2022-02-25 19:28:10 +00:00
Rasmus Munk Larsen
8b875dbef1
Changes to fast SQRT/RSQRT
2022-02-23 17:32:21 +00:00
Ramil Sattarov
f9b7564faa
E2K: initial support of LCC MCST compiler for the Elbrus 2000 CPU architecture
2022-02-23 17:07:34 +00:00
Antonio Sánchez
28e008b99a
Fix sqrt/rsqrt for NEON.
2022-02-15 21:31:51 +00:00
Erik Schultheis
7197b577fb
Remove unused macros in AVX packetmath.
The following macros are removed:
* EIGEN_DECLARE_CONST_Packet8f
* EIGEN_DECLARE_CONST_Packet4d
* EIGEN_DECLARE_CONST_Packet8f_FROM_INT
* EIGEN_DECLARE_CONST_Packet8i
2022-02-14 10:34:23 +00:00
Chip Kerchner
cb5ca1c901
Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC
2022-02-09 18:47:08 +00:00
Rasmus Munk Larsen
92d0026b7b
Provide a definition for numeric_limits static data members
2022-02-08 20:34:53 +00:00
Rasmus Munk Larsen
979fdd58a4
Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.
2022-02-05 00:20:13 +00:00