eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Chip Kerchner	ab8725d947	Turn off vectorize version of rsqrt - doesn't match generic version	2023-01-27 18:28:54 +00:00
Chip Kerchner	6fc9de7d93	Fix slowdown in bfloat16 MMA when rows is not a multiple of 8 or columns is not a multiple of 4.	2023-01-25 18:22:20 +00:00
Sean McBride	d70b4864d9	issue #2581 : review and cleanup of compiler version checks	2023-01-17 18:58:34 +00:00
Mehdi Goli	b523120687	[SYCL-2020 Support] Enabling Intel DPCPP Compiler support to Eigen	2023-01-16 07:04:08 +00:00
Sergey Fedorov	4d05765345	Altivec fixes for Darwin: do not use unsupported VSX insns	2023-01-12 16:33:33 +00:00
Martin Burchell	c54785b071	Fix error: unused parameter 'tmp' [-Werror,-Wunused-parameter] on clang/32-bit arm	2023-01-10 21:15:28 +00:00
Chip Kerchner	d20fe21ae4	Improve performance for Power10 MMA bfloat16 GEMM	2023-01-06 23:08:37 +00:00
Ryan Senanayake	fe7f527787	Fix guard macros for emulated FP16 operators on GPU	2023-01-06 22:02:51 +00:00
Arthur	311cc0f9cc	Enable NEON pcmp, plset, and complex psqrt	2022-12-22 05:38:34 +00:00
Antonio Sánchez	bb6675caf7	Fix incorrect NEON native fp16 multiplication.	2022-12-19 20:46:44 +00:00
Arthur Feeney	c4fb6af24b	Enable NEON pabs for unsigned int types	2022-12-19 17:07:36 +00:00
Lianhuang Li	d194167149	Fix the bug using neon instruction fmla for data type half	2022-12-01 17:28:57 +00:00
Pedro Caldeira	31ab62d347	Add support for Power10 (AltiVec) MMA instructions for bfloat16.	2022-11-30 23:33:37 +00:00
Charles Schlosser	02805bd56c	Fix AVX2 psignbit	2022-11-16 13:43:11 +00:00
Chip Kerchner	399ce1ed63	Fix duplicate execution code for Power 8 Altivec in pstore_partial.	2022-11-16 13:41:42 +00:00
Antonio Sánchez	8588d8c74b	Correct pnegate for floating-point zero.	2022-11-15 18:07:23 +00:00
Antonio Sanchez	5eacb9e117	Put brackets around unsigned type names.	2022-11-15 09:09:45 -08:00
Antonio Sánchez	37e40dca85	Fix ambiguity in PPC for vec_splats call.	2022-11-14 18:58:16 +00:00
Charles Schlosser	9b6d624eab	fix neon	2022-11-08 20:03:01 +00:00
Rasmus Munk Larsen	7e398e9436	Add missing return keyword in psignbit for NEON.	2022-11-04 16:13:09 +00:00
Charles Schlosser	82b152dbe7	Add signbit function	2022-11-04 00:31:20 +00:00
Antonio Sánchez	886aad1361	Disable patan for double on PPC.	2022-10-27 17:56:08 +00:00
Rasmus Munk Larsen	462758e8a3	Don't use generic sign function for sign(complex) unless it is vectorizable	2022-10-12 16:03:29 +00:00
Rasmus Munk Larsen	72db3f0fa5	Remove references to M_PI_2 and M_PI_4.	2022-10-11 00:27:16 +00:00
Rasmus Munk Larsen	e95c4a837f	Simpler range reduction strategy for atan<float>().	2022-10-04 18:11:00 +00:00
Antonio Sánchez	80efbfdeda	Unconditionally enable CXX11 math.	2022-10-04 17:37:47 +00:00
Antonio Sánchez	e5794873cb	Replace assert with eigen_assert.	2022-10-04 17:11:23 +00:00
Rasmus Munk Larsen	1414a76fa9	Only vectorize atan<double> for Altivec if VSX is available.	2022-10-03 22:06:58 +00:00
Rasmus Munk Larsen	c475228b28	Vectorize atan() for double.	2022-10-01 01:49:30 +00:00
Rasmus Munk Larsen	1e1848fdb1	Add a vectorized implementation of atan2 to Eigen.	2022-09-28 20:46:49 +00:00
Rasmus Munk Larsen	13b69fc1b0	Try to reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR	2022-09-23 20:09:42 +00:00
Rasmus Munk Larsen	ed8cda3ce4	Move EIGEN_NEON_GEBP_NR macro to the right place in GeneralBlockPanelKernel.h	2022-09-23 02:24:27 +00:00
Rasmus Munk Larsen	e2ea866515	Add a macro to set the nr trait in the GEBP kernel for NEON.	2022-09-22 23:56:34 +00:00
Lianhuang Li	23299632c2	Use 3px8/2px8/1px8/1x8 gebp_kernel on arm64-neon	2022-09-21 16:36:40 +00:00
Rasmus Munk Larsen	7b2901e2aa	Add vectorized integer division for int32 with AVX512, AVX or SSE.	2022-09-21 00:27:23 +00:00
Rasmus Munk Larsen	f913a40678	Revert "Add AVX int32_t pdiv" This reverts commit `ea84e7ad63`	2022-09-16 22:48:08 +00:00
Rasmus Munk Larsen	273e0c884e	Revert "Add constexpr, test for C++14 constexpr."	2022-09-16 21:14:29 +00:00
Charles Schlosser	ea84e7ad63	Add AVX int32_t pdiv	2022-09-16 17:06:29 +00:00
Rasmus Munk Larsen	f9dfda28ab	Add missing comparison operators for GPU packets.	2022-09-07 21:13:45 +00:00
Tobias Schlüter	133498c329	Add constexpr, test for C++14 constexpr.	2022-09-07 03:42:34 +00:00
Antonio Sanchez	3e44f960ed	Reduce compiler warnings for tests.	2022-09-06 18:20:56 +00:00
Rasmus Munk Larsen	bd393e15c3	Vectorize acos, asin, and atan for float.	2022-08-29 19:49:33 +00:00
Charles Schlosser	e5af9f87f2	Vectorize pow for integer base / exponent types	2022-08-29 19:23:54 +00:00
Rasmus Munk Larsen	7064ed1345	Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>.	2022-08-26 17:02:37 +00:00
Rasmus Munk Larsen	98e51c9e24	Avoid undefined behavior in array_cwise test due to signed integer overflow	2022-08-26 16:19:03 +00:00
Rasmus Munk Larsen	6aad0f821b	Fix psign for unsigned integer types, such as bool.	2022-08-22 20:19:35 +00:00
Rasmus Munk Larsen	1a09defce7	Protect new pblend implementation with EIGEN_VECTORIZE_AVX2	2022-08-22 18:28:03 +00:00
Rasmus Munk Larsen	7c67dc67ae	Use proper double word division algorithm for pow<double>. Gives 11-15% speedup.	2022-08-17 18:36:23 +00:00
Matthew Sterrett	7a3b667c43	Add support for AVX512-FP16 for vectorizing half precision math	2022-08-17 18:15:21 +00:00
Charles Schlosser	76a669fb45	add fixed power unary operation	2022-08-16 21:32:36 +00:00

1210 Commits