eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Benoit Steiner	05089aba75	Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts	2015-02-27 09:27:30 -08:00
Benoit Steiner	573b377110	Added support for vectorized type casting of tensors	2015-02-27 08:46:04 -08:00
Benoit Steiner	f41b1f1666	Added support for fast reciprocal square root computation.	2015-02-26 09:42:41 -08:00
Benoit Steiner	7765039f1c	Marked the CUDA packet primitives as EIGEN_DEVICE_FUNC since they'll end up being executed on the GPU device.	2015-02-19 21:22:51 -08:00
Benoit Jacob	9bd8a4bab5	bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path This is substantially faster on ARM, where it's important to minimize the number of loads. This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome. Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower).	2015-02-18 15:03:35 -05:00
Gael Guennebaud	6f4adc9e94	Add missing install directives for arch/CUDA	2015-02-18 11:40:06 +01:00
Gael Guennebaud	eb563049f7	Remove some dead stores.	2015-02-18 11:26:48 +01:00
Gael Guennebaud	159fb181c2	Disable __m128* wrappers when compiling with AVX and -fabi-version=4	2015-02-17 16:27:20 +01:00
Gael Guennebaud	91ab2489dd	Fix compilation with GCC/AVX (workaround __m128 and __m256 being the same type with default ABI)	2015-02-17 16:08:07 +01:00
Gael Guennebaud	98604576d1	Merged in chtz/eigen-indexconversion (pull request PR-92) bug #877, bug #572: Get rid of Index conversion warnings, summary of changes: - Introduce a global typedef Eigen::Index making Eigen::DenseIndex and AnyExpr<>::Index deprecated (default is std::ptrdiff_t). - Eigen::Index is used throughout the API to represent indices, offsets, and sizes. - Classes storing an array of indices uses the type StorageIndex to store them. This is a template parameter of the class. Default is int. - Methods that explicitly set or return an element of such an array take or return a StorageIndex type. In all other cases, the Index type is used.	2015-02-16 15:29:00 +01:00
Gael Guennebaud	45cbb0bbb1	The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index	2015-02-16 15:05:41 +01:00
Benoit Steiner	e2cfddf75f	Pulled latest updates from trunk	2015-02-13 16:21:59 -08:00
Benoit Steiner	0927801a84	Optimized version of the sin(), exp(), log() and sqrt() function for AVX	2015-02-13 16:07:08 -08:00
Gael Guennebaud	0918c51e60	merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper	2015-02-12 21:48:41 +01:00
Gael Guennebaud	029d236ceb	merge	2015-02-10 23:12:47 +01:00
Gael Guennebaud	fe25f3b8e3	FMA has been wrongly disabled	2015-02-10 23:11:35 +01:00
Benoit Steiner	cc5d7ff523	Added vectorized implementation of the exponential function for ARM/NEON	2015-02-10 14:02:38 -08:00
Benoit Steiner	c739102ef9	Pulled the latest changes from the trunk	2015-02-06 05:25:03 -08:00
Benoit Jacob	5ef95fabee	bug #936, patch 3/3: Properly detect FMA support on ARM (requires VFPv4) and use it instead of MLA when available, because it's both more accurate, and faster.	2015-01-30 17:45:03 -05:00
Benoit Jacob	0f21613698	bug #936, patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD	2015-01-30 17:44:26 -05:00
Benoit Jacob	340b8afb14	bug #936, patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_, because this is what they are about. "Fused" means "no intermediate rounding between the mul and the add, only one rounding at the end". Instead, what we are concerned about here is whether a temporary register is needed, i.e. whether the MUL and ADD are separate instructions. Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA. But a true fused mul-add is only available on VFPv4: VFMA.	2015-01-31 14:15:57 -05:00
Benoit Jacob	9f99f61e69	bug #936, patch 1/3: some cleanup and renaming for consistency.	2015-01-30 17:43:56 -05:00
Gael Guennebaud	ae4644cc68	bug #907, ARM64: workaround ICE in xcode/clang	2015-01-13 10:03:00 +01:00
Gael Guennebaud	36f7c1337f	bug #907, ARM64: workaround vreinterpretq_u64_* not defined in xcode/clang	2015-01-13 09:57:37 +01:00
Gael Guennebaud	63974bcb88	bug #907: workaround some missing intrinsics in current NDK's gcc version (ARM64)	2015-01-07 09:44:25 +01:00
Gael Guennebaud	79f4a59ed9	bug #907: fix compilation with ARM64	2015-01-07 09:41:56 +01:00
Benoit Steiner	509e4ddc02	Added reduction packet primitives for CUDA	2014-11-19 10:34:11 -08:00
Gael Guennebaud	ee06f78679	Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively.	2014-11-04 21:58:52 +01:00
Benoit Steiner	1946cc4478	Added missing packet primitives for CUDA.	2014-10-30 17:52:32 -07:00
Konstantinos Margaritis	fcb3573d17	Merged eigen/eigen into default	2014-10-22 10:42:18 +03:00
Konstantinos Margaritis	fae4fd7a26	Added ARMv8 support	2014-10-22 07:39:49 +00:00
Konstantinos Margaritis	b508619392	working 64-bit support in PacketMath.h, Complex.h needed	2014-10-21 18:10:33 +00:00
Christoph Hertzberg	84aaa03182	Addendum to bug #859: pexp(NaN) for double did not return NaN, also, plog(NaN) did not return NaN. psqrt(NaN) and psqrt(-1) shall return NaN if EIGEN_FAST_MATH==0	2014-10-20 13:13:43 +02:00
Gael Guennebaud	aa5f79206f	Fix bug #859: pexp(NaN) returned Inf instead of NaN	2014-10-20 11:38:51 +02:00
Benoit Steiner	95a430a2ca	Vector primitives for CUDA	2014-10-03 19:45:19 -07:00
Konstantinos Margaritis	9d3c69952b	fixed to make big-endian VSX work as well	2014-10-01 09:43:56 +00:00
Konstantinos Margaritis	de38ff2499	prefetch are noops on VSX, actually disable the prefetch trait	2014-09-21 11:56:07 +00:00
Konstantinos Margaritis	60e093a9dc	Merged eigen/eigen into default	2014-09-21 14:02:51 +03:00
Konstantinos Margaritis	56408504e4	fix compile error on big endian altivec	2014-09-21 13:59:30 +03:00
Konstantinos Margaritis	974fe38ca3	prefetch are noops on VSX	2014-09-21 11:24:30 +00:00
Konstantinos Margaritis	c0205ca4af	VSX supports vec_div, implement where appropriate (float/doubles)	2014-09-21 08:12:22 +00:00
Konstantinos Margaritis	10f8aabb61	VSX port passes packetmath_[1-5] tests!	2014-09-20 22:31:31 +00:00
Konstantinos Margaritis	60663a510a	32-bit floats/ints, 64-bit doubles pass packetmath tests, complex 32/64-bit remaining	2014-09-19 21:05:01 +00:00
Benoit Steiner	10a79ca3a3	Merged latest updates from the Eigen trunk.	2014-09-15 09:18:16 -07:00
Konstantinos Margaritis	470aa15c35	First time it compiles, but fails to pass the tests.	2014-09-09 16:58:48 +00:00
Konstantinos Margaritis	7ff266e3ce	Initial VSX commit	2014-08-29 20:03:49 +00:00
Benoit Steiner	16047c8d4a	Pulled in the latest changes from the Eigen trunk	2014-08-13 22:25:29 -07:00
Jitse Niesen	25bceefb4e	Replace asm by __asm__ (bug #873)	2014-09-06 11:47:24 +01:00
Gael Guennebaud	0369db12af	bug #871: fix compilation on ARM/Neon regarding __has_builtin usage	2014-09-01 10:52:58 +02:00
Konstantinos Margaritis	2c625ec9ba	Simplification of some Altivec constants, reuse existing constants and avoid loading from RAM esp in the case of p16uc_COMPLEX_TRANSPOSE*	2014-07-22 20:46:03 +00:00
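
Commit 05089aba75 above switches vectorized float-to-integer casts to truncation so that they match scalar casts. The sketch below illustrates only the scalar semantics being matched, in plain C++. The helper names (cast_truncate, cast_round, cast_truncate_batch) are hypothetical and are not taken from the Eigen sources; the batch helper is a stand-in for a vectorized path, not Eigen's packet code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar reference behaviour: static_cast from float to int discards the
// fractional part, i.e. it rounds toward zero (for values representable in int).
inline int cast_truncate(float x) { return static_cast<int>(x); }

// Round-to-nearest conversion, shown only for contrast. std::lround rounds
// halfway cases away from zero.
inline int cast_round(float x) { return static_cast<int>(std::lround(x)); }

// Hypothetical stand-in for a batched (vectorized) cast. To stay consistent
// with the scalar cast, every element must be truncated, not rounded.
inline std::vector<int> cast_truncate_batch(const std::vector<float>& in) {
  std::vector<int> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = cast_truncate(in[i]);
  }
  return out;
}

// Small self-check: the batched result agrees element-wise with the scalar cast.
inline void check_batch_matches_scalar(const std::vector<float>& in) {
  const std::vector<int> out = cast_truncate_batch(in);
  for (std::size_t i = 0; i < in.size(); ++i) {
    assert(out[i] == static_cast<int>(in[i]));
  }
}
```

With these helpers, cast_truncate(2.7f) gives 2 and cast_truncate(-2.7f) gives -2, whereas cast_round gives 3 and -3 for the same inputs. A vectorized cast that rounded would therefore disagree with the scalar cast on such values, which is the inconsistency the commit message describes.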


291 Commits