eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Rasmus Larsen	cb3c059fa4	Merged eigen/eigen into default	2019-01-09 15:04:17 -08:00
Gael Guennebaud	3492a1ca74	fix plog(+inf) with AVX512	2019-01-09 16:53:37 +01:00
Gael Guennebaud	47810cf5b7	Add dedicated implementations of predux_any for AVX512, NEON, and Altivec/VSX	2019-01-09 16:40:42 +01:00
Gael Guennebaud	3f14e0d19e	fix warning	2019-01-09 15:45:21 +01:00
Gael Guennebaud	aeec68f77b	Add missing pcmp_lt and others for AVX512	2019-01-09 15:36:41 +01:00
Gael Guennebaud	e6b217b8dd	bug #1652 : implements a much more accurate version of vectorized sin/cos. This new version achieves the same speed for SSE/AVX, and is slightly faster with FMA. Guarantees are as follows: - no FMA: 1ULP up to 3pi, 2ULP up to sin(25966) and cos(18838), fallback to std::sin/cos for larger inputs - FMA: 1ULP up to sin(117435.992) and cos(71476.0625), fallback to std::sin/cos for larger inputs	2019-01-09 15:25:17 +01:00
Rasmus Munk Larsen	055f0b73db	Add support for pcmp_eq and pnot, including for complex types.	2019-01-07 16:53:36 -08:00
Mark D Ryan	bc5dd4cafd	PR560: Fix the AVX512f only builds. Commit `c53eececb0` introduced AVX512 support for complex numbers but required avx512dq to build. Commit `1d683ae2f5` fixed some, but it would seem not all, of the hard avx512dq dependencies. Build failures are still evident on Eigen and TensorFlow when compiling with just avx512f and no avx512dq using gcc 7.3. Looking at the code there does indeed seem to be a problem. Commit `c53eececb0` calls avx512dq intrinsics directly, e.g., _mm512_extractf32x8_ps and _mm512_and_ps. This commit fixes the issue by replacing the direct intrinsic calls with the various wrapper functions that are safe to use on avx512f only builds.	2019-01-03 14:33:04 +01:00
Gael Guennebaud	60d3fe9a89	One more stupid AVX 512 fix (I don't have direct access to AVX512 machines)	2018-12-24 13:05:03 +01:00
Gael Guennebaud	4aa667b510	Add EIGEN_STRONG_INLINE where required	2018-12-24 10:45:01 +01:00
Gael Guennebaud	961ff567e8	Add missing pcmp_lt_or_nan for AVX512	2018-12-23 22:13:29 +01:00
Gael Guennebaud	0f6f75bd8a	Implement a faster fix for sin/cos of large entries that also correctly handles INF input.	2018-12-23 17:26:21 +01:00
Gael Guennebaud	38d704def8	Make sure that psin/pcos return a number in [-1,1] for large inputs (though sin/cos on large entries is quite useless because it's inaccurate)	2018-12-23 16:13:24 +01:00
Gael Guennebaud	5713fb7feb	Fix plog(+INF): it returned ~87 instead of +INF	2018-12-23 15:40:52 +01:00
Gael Guennebaud	efa4c9c40f	bug #1615 : slightly increase the default unrolling limit to compensate for changeset `101ea26f5e` . This solves a performance regression with clang and 3x3 matrix products.	2018-12-13 10:42:39 +01:00
Gael Guennebaud	0a7e7af6fd	Properly set the number of registers for AVX512	2018-12-11 15:33:17 +01:00
Gael Guennebaud	81c27325ae	bug #1641 : fix testing of pandnot and fix pandnot for complex on SSE/AVX/AVX512	2018-12-08 14:27:48 +01:00
Gael Guennebaud	7b6d0ff1f6	Enable FMA with MSVC (through /arch:AVX2). To make this possible, I also had to turn the #warning regarding AVX512-FMA into a #error.	2018-12-07 15:14:50 +01:00
Gael Guennebaud	f233c6194d	bug #1637 : workaround register spilling in gebp with clang>=6.0+AVX+FMA	2018-12-07 10:01:09 +01:00
Gael Guennebaud	cbf2f4b7a0	AVX512f includes FMA but GCC does not define __FMA__ with -mavx512f only	2018-12-06 18:21:56 +01:00
Gael Guennebaud	1d683ae2f5	Fix compilation with avx512f only, i.e., no AVX512DQ	2018-12-06 18:11:07 +01:00
Gael Guennebaud	c53eececb0	Implement AVX512 vectorization of std::complex<float/double>	2018-12-06 15:58:06 +01:00
Gael Guennebaud	1ac2695ef7	bug #1636 : fix compilation with some ABI versions.	2018-12-06 00:05:10 +01:00
Gael Guennebaud	0ea7ae7213	Add missing padd for Packet8i (it was implicitly generated by clang and gcc)	2018-11-30 21:52:25 +01:00
Gael Guennebaud	c785464430	Add packet sin and cos to Altivec/VSX and NEON	2018-11-30 16:21:33 +01:00
Gael Guennebaud	69ace742be	Several improvements regarding packet-bitwise operations: - add unit tests - optimize their AVX512f implementation - add missing implementations (half, Packet4f, ...)	2018-11-30 15:56:08 +01:00
Gael Guennebaud	fa87f9d876	Add psin/pcos on AVX512 -> almost for free, at last!	2018-11-30 14:33:13 +01:00
Gael Guennebaud	c68bd2fa7a	Cleanup	2018-11-30 14:32:31 +01:00
Gael Guennebaud	f91500d303	Fix pandnot order in AVX512	2018-11-30 14:32:06 +01:00
Gael Guennebaud	b477d60bc6	Extend the generic psin_float code to handle cosine and make SSE and AVX use it (-> this adds pcos for AVX)	2018-11-30 11:26:30 +01:00
Gael Guennebaud	e19ece822d	Disable fma gcc's workaround for gcc >= 8 (based on GEMM benchmarks)	2018-11-28 17:56:24 +01:00
Gael Guennebaud	41052f63b7	same for pmax	2018-11-28 17:17:28 +01:00
Gael Guennebaud	3e95e398b6	pmin/pmax on SSE: make sure to use AVX instructions with AVX enabled, and disable gcc workaround for fixed gcc versions	2018-11-28 17:14:20 +01:00
Eugene Zhulenev	80f1651f35	Use explicit packet type in SSE/PacketMath pldexp	2018-11-27 17:25:49 -08:00
Gael Guennebaud	b131a4db24	bug #1631 : fix compilation with ARM NEON and clang, and cleanup the weird pshiftright_and_cast and pcast_and_shiftleft functions.	2018-11-27 23:45:00 +01:00
Gael Guennebaud	a1a5fbbd21	Update pshiftleft to pass the shift as a true compile-time integer.	2018-11-27 22:57:30 +01:00
Gael Guennebaud	fa7fd61eda	Unify SSE/AVX psin functions. It is based on the SSE version which is much more accurate, though very slightly slower. This changeset also includes the following required changes: - add packet-float to packet-int type traits - add packet float<->int reinterpret casts - add faster pselect for AVX based on blendv	2018-11-27 22:41:51 +01:00
Gael Guennebaud	b5695a6008	Unify Altivec/VSX pexp(double) with default implementation	2018-11-27 13:53:05 +01:00
Gael Guennebaud	7655a8af6e	cleanup	2018-11-26 23:21:29 +01:00
Gael Guennebaud	502f92fa10	Unify SSE and AVX pexp for double.	2018-11-26 23:12:44 +01:00
Gael Guennebaud	4a347a0054	Unify NEON's pexp with generic implementation	2018-11-26 22:15:44 +01:00
Gael Guennebaud	5c8406babc	Unify Altivec/VSX's pexp with generic implementation	2018-11-26 16:47:13 +01:00
Gael Guennebaud	cf8b85d5c5	Unify SSE and AVX implementation of pexp	2018-11-26 16:36:19 +01:00
Gael Guennebaud	c2f35b1b47	Unify Altivec/VSX's plog with generic implementation, and enable it!	2018-11-26 15:58:11 +01:00
Gael Guennebaud	c24e98e6a8	Unify NEON's plog with generic implementation	2018-11-26 15:02:16 +01:00
Gael Guennebaud	2c44c40114	First step toward a unification of packet log implementation, currently only SSE and AVX are unified. To this end, I added the following functions: pzero, pcmp_*, pfrexp, pset1frombits functions.	2018-11-26 14:21:24 +01:00
Gael Guennebaud	5f6045077c	Make SSE/AVX pandnot(A,B) consistent with generic version, i.e., "A and not B"	2018-11-26 14:14:07 +01:00
Gael Guennebaud	0836a715d6	bug #1611 : fix plog(0) on NEON	2018-11-26 09:08:38 +01:00
Christian von Schultz	4a40b3785d	Collapsed revision (based on pull request PR-325) * Support compiling without IO streams Add the preprocessor definition EIGEN_NO_IO which, if defined, disables all use of the IO streams part of the standard library.	2018-10-22 21:14:40 +02:00
Gael Guennebaud	0f780bb0b4	Fix float-to-double warning	2018-10-16 09:19:45 +02:00

711 Commits