aaraujom
25db0b4a82
Add AVX512 optimizations for matrix multiply
2022-05-12 23:41:19 +00:00
Chip Kerchner
c2f15edc43
Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV.
2022-04-25 16:23:01 +00:00
Chip Kerchner
44ba7a0da3
Fix compiler bugs for GCC 10 & 11 for Power GEMM
2022-04-20 15:59:00 +00:00
Chip Kerchner
b02c384ef4
Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub
2022-04-18 16:16:32 +00:00
Shi, Brian
fc1d888415
Remove AVX512VL dependency in trsm
2022-04-14 12:44:24 -07:00
Antonio Sánchez
07db964bde
Restrict new AVX512 trsm to AVX512VL, rename files for consistency.
2022-04-14 16:58:32 +00:00
Chip Kerchner
53eec53d2a
Fix Power GEMV order of operations in predux for MMA.
2022-04-11 21:29:05 +00:00
Tobias Schlüter
f3ba220c5d
Remove EIGEN_EMPTY_STRUCT_CTOR
2022-04-08 18:27:26 +00:00
Chip Kerchner
403fa33409
Performance improvements in GEMM for Power
2022-04-05 12:18:53 +00:00
Antonio Sánchez
73b2c13bf2
Disable f16c scalar conversions for MSVC.
2022-03-30 18:35:32 +00:00
b-shi
0611f7fff0
Add missing explicit reinterprets
2022-03-23 21:10:26 +00:00
Chip Kerchner
0699fa06fe
Split general_matrix_vector_product interface for Power into two macros - one ColMajor and one RowMajor.
2022-03-23 18:09:33 +00:00
Antonio Sánchez
4451823fb4
Fix ODR violation in trsm.
2022-03-20 15:56:53 +00:00
Antonio Sánchez
9a14d91a99
Fix AVX512 builds with MSVC.
2022-03-18 16:04:53 +00:00
Chip Kerchner
7b10795e39
Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's...
2022-03-17 22:35:27 +00:00
Antonio Sanchez
e34db1239d
Fix missing pound
2022-03-16 12:26:12 -07:00
Antonio Sánchez
591906477b
Fix up PowerPC MMA flags so it builds by default.
2022-03-16 19:16:28 +00:00
b-shi
518fc321cb
AVX512 Optimizations for Triangular Solve
2022-03-16 18:04:50 +00:00
Erik Schultheis
421cbf0866
Replace Eigen type metaprogramming with corresponding std types and make use of alias templates
2022-03-16 16:43:40 +00:00
Rasmus Munk Larsen
9ad5661482
Revert "Fix up PowerPC MMA flags so it builds by default."
2022-03-15 20:51:03 +00:00
Antonio Sánchez
65eeedf964
Fix up PowerPC MMA flags so it builds by default.
2022-03-15 20:22:23 +00:00
Tobias Schlüter
cb1e8228e9
Convert bit calculation to constexpr, avoid casts.
2022-03-13 22:38:36 +09:00
Duncan McBain
a3b64625e3
Remove ComputeCpp-specific code from SYCL Vptr
2022-03-08 22:44:18 +00:00
Rasmus Munk Larsen
0e6f4e43f1
Fix a few confusing comments in psincos_float.
2022-03-04 20:41:49 +00:00
Sean McBride
f1b9692d63
Removed EIGEN_UNUSED decorations from many functions that are in fact used
2022-03-03 20:19:33 +00:00
Antonio Sánchez
9c07e201ff
Modified sqrt/rsqrt for denormal handling.
2022-03-02 17:20:47 +00:00
Antonio Sánchez
19c39bea29
Fix mixingtypes for g++-11.
2022-02-25 19:28:10 +00:00
Rasmus Munk Larsen
8b875dbef1
Changes to fast SQRT/RSQRT
2022-02-23 17:32:21 +00:00
Ramil Sattarov
f9b7564faa
E2K: initial support of LCC MCST compiler for the Elbrus 2000 CPU architecture
2022-02-23 17:07:34 +00:00
Antonio Sánchez
28e008b99a
Fix sqrt/rsqrt for NEON.
2022-02-15 21:31:51 +00:00
Erik Schultheis
7197b577fb
Remove unused macros in AVX packetmath.
The following macros are removed:
* EIGEN_DECLARE_CONST_Packet8f
* EIGEN_DECLARE_CONST_Packet4d
* EIGEN_DECLARE_CONST_Packet8f_FROM_INT
* EIGEN_DECLARE_CONST_Packet8i
2022-02-14 10:34:23 +00:00
Chip Kerchner
cb5ca1c901
Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC
2022-02-09 18:47:08 +00:00
Rasmus Munk Larsen
92d0026b7b
Provide a definition for numeric_limits static data members
2022-02-08 20:34:53 +00:00
Rasmus Munk Larsen
979fdd58a4
Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.
2022-02-05 00:20:13 +00:00
Antonio Sánchez
4bffbe84f9
Restrict GCC<6.3 maxpd workaround to only gcc.
2022-02-04 22:47:34 +00:00
Antonio Sánchez
e7f4a901ee
Define EIGEN_HAS_AVX512_MATH in PacketMath.
2022-02-04 22:25:52 +00:00
Antonio Sánchez
6b60bd6754
Fix 32-bit arm int issue.
2022-02-04 21:59:33 +00:00
Antonio Sánchez
96da541cba
Fix AVX512 math function consistency, enable for ICC.
2022-02-04 19:35:18 +00:00
Antonio Sánchez
cafeadffef
Fix ODR violations.
2022-02-04 19:01:07 +00:00
Chip Kerchner
66464bd2a8
Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV
2022-01-27 20:35:53 +00:00
Rasmus Munk Larsen
8f2c6f0aa6
Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf.
2022-01-26 20:38:05 +00:00
Rasmus Munk Larsen
51311ec651
Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub.
2022-01-26 04:25:41 +00:00
Rasmus Munk Larsen
ea2c02060c
Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512.
2022-01-21 23:49:18 +00:00
Ilya Tokar
a0fc640c18
Add support for packets of int64 on x86
2022-01-21 19:55:23 +00:00
Erik Schultheis
970640519b
Cleanup
2022-01-21 01:48:59 +00:00
Chip Kerchner
708fd6d136
Add MMA and performance improvements for VSX in GEMV for PowerPC.
2022-01-13 13:23:18 +00:00
Kolja Brix
8d81a2339c
Reduce usage of reserved names
2022-01-10 20:53:29 +00:00
Matthias Möller
c4b1dd2f6b
Add support for Cray, Fujitsu, and Intel ICX compilers
The following preprocessor macros are added:
- EIGEN_COMP_CPE and EIGEN_COMP_CLANGCPE: version number of the Cray compiler if
  Eigen is compiled with the Cray C++ compiler, 0 otherwise.
- EIGEN_COMP_FCC and EIGEN_COMP_CLANGFCC: version number of the FCC compiler if
  Eigen is compiled with the Fujitsu C++ compiler, 0 otherwise.
- EIGEN_COMP_CLANGICC: version number of the ICX compiler if Eigen is compiled
  with the Intel oneAPI C++ compiler, 0 otherwise.
All three compilers (Cray, Fujitsu, Intel) offer a traditional and a Clang-based
frontend. The CLANG prefix in the macro name marks the Clang-based frontend.
2022-01-07 18:46:16 +00:00
Rasmus Munk Larsen
96dc37a03b
Some fixes/cleanups for numeric_limits & fix for related bug in psqrt
2022-01-07 01:10:17 +00:00
Rasmus Munk Larsen
7b5a8b6bc5
Improve plog: 20% speedup for float + handle denormals
2022-01-05 23:40:31 +00:00