eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Deven Desai	39a038f2e4	Fix for ROCm (and CUDA?) breakage - 201029. Commit `e265f7ed8e` breaks Eigen for ROCm (and probably CUDA too) with the following error: ``` Building HIPCC object test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o In file included from /home/rocm-user/eigen/test/gpu_basic.cu:20: In file included from /home/rocm-user/eigen/test/main.h:355: In file included from /home/rocm-user/eigen/Eigen/QR:11: In file included from /home/rocm-user/eigen/Eigen/Core:169: /home/rocm-user/eigen/Eigen/src/Core/arch/Default/Half.h:825:76: error: use of undeclared identifier 'numext'; did you mean 'Eigen::numext'? return Eigen::half_impl::raw_uint16_to_half(__ldg(reinterpret_cast<const numext::uint16_t>(ptr))); ^~~~~~ Eigen::numext /home/rocm-user/eigen/Eigen/src/Core/MathFunctions.h:968:11: note: 'Eigen::numext' declared here namespace numext { ^ 1 error generated when compiling for gfx900. CMake Error at gpu_basic_generated_gpu_basic.cu.o.cmake:192 (message): Error generating file /home/rocm-user/eigen/build/test/CMakeFiles/gpu_basic.dir//./gpu_basic_generated_gpu_basic.cu.o test/CMakeFiles/gpu_basic.dir/build.make:63: recipe for target 'test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o' failed make[3]: [test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o] Error 1 CMakeFiles/Makefile2:16611: recipe for target 'test/CMakeFiles/gpu_basic.dir/all' failed make[2]: * [test/CMakeFiles/gpu_basic.dir/all] Error 2 CMakeFiles/Makefile2:16618: recipe for target 'test/CMakeFiles/gpu_basic.dir/rule' failed make[1]: * [test/CMakeFiles/gpu_basic.dir/rule] Error 2 Makefile:5401: recipe for target 'gpu_basic' failed make: * [gpu_basic] Error 2 ``` The fix in this commit is trivial. Please review and merge.	2020-10-29 15:34:05 +00:00
David Tellenbach	f895755c0e	Remove unused functions in Half.h. The following functions have been removed: Eigen::half fabsh(const Eigen::half&) Eigen::half exph(const Eigen::half&) Eigen::half sqrth(const Eigen::half&) Eigen::half powh(const Eigen::half&, const Eigen::half&) Eigen::half floorh(const Eigen::half&) Eigen::half ceilh(const Eigen::half&)	2020-10-29 07:37:52 +01:00
David Tellenbach	09f015852b	Replace numext::as_uint with numext::bit_cast<numext::uint32_t>	2020-10-29 07:28:28 +01:00
David Tellenbach	e265f7ed8e	Add support for Armv8.2-a __fp16. Armv8.2-a provides a native half-precision floating point type (__fp16, a.k.a. float16_t). This patch introduces: __fp16 as the underlying type of Eigen::half if this type is available; the packet types Packet4hf and Packet8hf, representing float16x4_t and float16x8_t respectively; and packet-math for the above packets with corresponding scalar type Eigen::half. The packet-math functionality has been implemented by Ashutosh Sharma <ashutosh.sharma@amperecomputing.com>. This closes #1940.	2020-10-28 20:15:09 +00:00
mehdi-goli	b9ff791fed	[Missing SYCL math op]: Adding the missing LDEXP function for SYCL.	2020-10-28 08:32:57 +00:00
mehdi-goli	61461d682a	[Fixing expf issue]: Eigen uses the packet type operation for scalar type float in the Sigmoid function (https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/functors/UnaryFunctors.h#L990). As a result, the SYCL backend breaks, since it only supports packet operations for the vectorized types float4 and double2. The issue has been fixed by adding scalar type float to the packet operation pexp for the SYCL backend.	2020-10-28 08:30:34 +00:00
guoqiangqi	28aef8e816	Improve polynomial evaluation with instruction-level parallelism for pexp_float and pexp<Packet16f>	2020-10-20 11:37:09 +08:00
guoqiangqi	4a77eda1fd	Remove unnecessary template specialization of pexp for scalar float/double	2020-10-19 00:51:42 +00:00
Antonio Sanchez	d9f0d9eb76	Fix missing `pfirst<Packet16b>` for MSVC. It was only defined under one `#ifdef` case. This fixes the `packetmath_14` test for MSVC.	2020-10-16 16:22:00 -07:00
Rasmus Munk Larsen	21edea5edd	Fix the specialization of pfrexp for AVX to be faster when AVX2/AVX512DQ is not available, and avoid undefined behavior in C++. Also mask off the sign bit when extracting the exponent.	2020-10-15 18:39:58 -07:00
Rasmus Munk Larsen	6ea8091705	Revert change from `4e4d3f32d1` that broke BFloat16.h build with older compilers.	2020-10-15 01:20:08 +00:00
Guoqiang QI	4700713faf	Add AVX plog<Packet4d> and AVX512 plog<Packet8d> ops; also unify the AVX512 plog<Packet16f> op with the generic API	2020-10-15 00:54:45 +00:00
Rasmus Munk Larsen	af6f43d7ff	Add specializations for pmin/pmax with prescribed NaN propagation semantics for SSE/AVX/AVX512.	2020-10-14 23:11:24 +00:00
acxz	807e51528d	undefine EIGEN_CONSTEXPR before redefinition	2020-10-12 20:28:56 -04:00
Rasmus Munk Larsen	4e4d3f32d1	Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512.	2020-10-09 20:05:49 +00:00
Rasmus Munk Larsen	f93841b53e	Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h.	2020-10-02 14:47:15 -07:00
Rasmus Munk Larsen	9078f47cd6	Fix build breakage with MSVC 2019, which does not support MMX intrinsics for 64 bit builds, see: https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c Instead use the equivalent SSE2 intrinsics.	2020-10-01 12:37:55 -07:00
Rasmus Munk Larsen	44b9d4e412	Specialize pldexp_double and pfrexp_double and get rid of Packet2l definition for SSE. SSE does not support conversion between 64 bit integers and double, and the existing implementation of casting between Packet2d and Packet2l results in undefined behavior when casting NaN to int. Since pldexp and pfrexp only manipulate exponent fields that fit in 32 bit, this change provides specializations that use existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead.	2020-09-30 13:33:44 -07:00
Rasmus Munk Larsen	74ff5719b3	Fix compilation of 64 bit constant arguments to pset1frombits in TypeCasting.h on platforms where uint64_t != unsigned long.	2020-09-28 22:47:11 +00:00
Rasmus Munk Larsen	3a0b23e473	Fix compilation of pset1frombits calls on iOS.	2020-09-28 22:30:36 +00:00
Christoph Hertzberg	6b0c0b587e	Provide a more efficient Packet2l->Packet2d cast method	2020-09-28 22:14:02 +00:00
Deven Desai	ce5c59729d	Fix for ROCm/HIP breakage - 200921. The following commit causes regressions in the ROCm/HIP support for Eigen: `e55182ac09`. I suspect the same breakages occur on the CUDA side too. The above commit puts the EIGEN_CONSTEXPR attribute on the `half_base` constructor. `half_base` is derived from `__half_raw`. When compiling with GPU support, the definition of `__half_raw` gets picked up from the GPU compiler specific header files (`hip_fp16.h`, `cuda_fp16.h`). Properly supporting the above commit would require adding the `constexpr` attribute to the `__half_raw` constructor (and other `half` routines) in those header files. While that is something we can explore in the future, for now we need to undo the above commit when compiling with GPU support, which is what this commit does. This commit also reverts a small change in the `raw_uint16_to_half` routine made by the above commit. Similar to the case above, that change was leading to compile errors due to the fact that `__half_raw` has a different definition when compiling with GPU support.	2020-09-22 22:26:45 +00:00
Guoqiang QI	821702e771	Fix issues #1997 and #1991, triggered by the a[index] operation (type a: __i28d) being unsupported by the MSVC compiler	2020-09-21 15:49:00 +00:00
Rasmus Munk Larsen	c4b99f78c7	Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86. If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.	2020-09-18 18:13:20 -07:00
guoqiangqi	9aad16b443	Fix undefined reference to pset1frombits bug on different platforms	2020-09-19 00:53:21 +00:00
Rasmus Munk Larsen	e55182ac09	Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr. Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.	2020-09-18 17:38:58 +00:00
Rasmus Munk Larsen	14022f5eb5	Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h. 'vmvnq_u64' does not exist for some reason.	2020-09-18 04:14:13 +00:00
Rasmus Munk Larsen	a5b226920f	Fix typo in PacketMath.h	2020-09-18 01:22:23 +00:00
Rasmus Munk Larsen	3af744b023	Add missing packet op pcmp_lt_or_nan for Packet2d on ARM.	2020-09-18 01:07:01 +00:00
Brad King	880fa43b2b	Add support for CastXML on ARM aarch64. CastXML simulates the preprocessors of other compilers, but actually parses the translation unit with an internal Clang compiler. Use the same `vld1q_u64` workaround that we do for Clang. Fixes: #1979	2020-09-16 13:40:23 -04:00
Benoit Jacob	cc0c38ace8	Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.	2020-09-15 20:54:14 -04:00
Tim Shen	bb56a62582	Make bfloat16(float(-nan)) produce -nan, not nan.	2020-09-15 13:24:23 -07:00
Guoqiang QI	3012e755e9	Add plog op support for Packet2d on NEON	2020-09-15 17:10:35 +00:00
Guoqiang QI	7c5d48f313	Unify SSE pldexp_double API	2020-09-12 10:56:55 +00:00
Niels Dekker	5328c9be43	Fix half_impl::float_to_half_rtne(float) warning: '<<' causes overflow. Fixed Visual Studio 2019 Code Analysis (C++ Core Guidelines) warning C26450 from inside `half_impl::float_to_half_rtne(float)`: "Arithmetic overflow: '<<' operation causes overflow at compile time."	2020-09-10 16:22:28 +02:00
Pedro Caldeira	35d149e34c	Add missing functions for Packet8bf in Altivec architecture. Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.	2020-09-08 09:22:11 -05:00
Guoqiang QI	85428a3440	Add Neon psqrt<Packet2d> and pexp<Packet2d>	2020-09-08 09:04:03 +00:00
Everton Constantino	6fe88a3c9d	MatrixProduct enhancements: - Changes to Altivec/MatrixProduct: Adapting code to gcc 10. Generic code style and performance enhancements. Adding PanelMode support. Adding stride/offset support. Enabling float64, std::complex and std::complex. Fixing lack of symm_pack. Enabling mixedtypes. - Adding std::complex tests to blasutil. - Adding an implementation of storePacketBlock when Incr != 1.	2020-09-02 18:21:36 -03:00
Everton Constantino	6568856275	Changing u/int8_t to un/signed char because clang does not understand it. Implementing pcmp_eq to Packet8 and Packet16.	2020-09-02 17:02:15 -03:00
Chip Kerchner	e5886457c8	Change Packet8s and Packet8us to use vector commands on Power for pmadd, pmul and psub.	2020-08-28 19:27:32 +00:00
Guoqiang QI	8bb0febaf9	Add psqrt op support for Packet2f/Packet4f on NEON	2020-08-21 03:17:15 +00:00
David Tellenbach	8ba1b0f41a	bfloat16 packetmath for Arm Neon backend	2020-08-13 15:48:40 +00:00
Pedro Caldeira	704798d1df	Add support for Bfloat16 to use vector instructions on Altivec architecture	2020-08-10 13:22:01 -05:00
Zachary Garrett	21122498ec	Temporarily turn off the NEON implementation of pfloor as it does not work for large values. The NEON implementation mimics the SSE implementation, but did not mention the caveat that, due to the integer conversions involved, not all values representable in the original floating point type are supported.	2020-08-04 16:28:23 +00:00
Teng Lu	3ec4f0b641	Fix undefined behavior of the BF16 union in AVX512.	2020-07-29 02:20:21 +00:00
David Tellenbach	99da2e1a8d	Fix clang-tidy warnings in generic bfloat16 implementation. See !172 for related discussions.	2020-07-27 16:00:24 +02:00
David Tellenbach	c1ffe452fc	Fix bfloat16 casts. If we have explicit conversion operators available (C++11) we define explicit casts from bfloat16 to other types. If not (C++03), we don't define conversion operators but rely on implicit conversion chains from bfloat16 over float to other types.	2020-07-23 20:55:06 +00:00
Rasmus Munk Larsen	1b84f21e32	Revert change that made conversion from bfloat16 to {float, double} implicit. Add roundtrip tests for casting between bfloat16 and complex types.	2020-07-22 18:09:00 -07:00
David Tellenbach	38b91f256b	Fix cast of bfloat16 to std::complex<T>. This fixes https://gitlab.com/libeigen/eigen/-/issues/1951	2020-07-22 19:00:17 +00:00
Rasmus Munk Larsen	bed7fbe854	Make sure we take the little-endian path if __BYTE_ORDER__ is not defined.	2020-07-22 18:54:38 +00:00

909 Commits