eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	21633e585b	bug #1462 : remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER	2017-08-24 11:06:47 +02:00
Gael Guennebaud	bbd97b4095	Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases	2017-07-17 01:02:51 +02:00
Gael Guennebaud	24fe1de9b4	merge	2017-06-15 10:17:39 +02:00
Gael Guennebaud	b240080e64	bug #1436 : fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing.	2017-06-15 10:16:30 +02:00
Benoit Steiner	3baef62b9a	Added missing __device__ qualifier	2017-06-13 12:56:55 -07:00
Benoit Steiner	449936828c	Added missing __device__ qualifier	2017-06-13 12:54:57 -07:00
Gael Guennebaud	26f552c18d	fix compilation of Half in C++98 (issue introduced in previous commit)	2017-06-09 13:36:58 +02:00
Gael Guennebaud	1d59ca2458	Fix compilation with gcc 4.3 and ARM NEON	2017-06-09 13:20:52 +02:00
Gael Guennebaud	d588822779	Add missing std::numeric_limits specialization for half, and complete NumTraits<half>	2017-06-09 11:51:53 +02:00
Abhijit Kundu	9bc0a35731	Fixed nested angle brackets >> issue when compiling with cuda 8	2017-04-27 03:09:03 -04:00
Benoit Jacob	61160a21d2	ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32.	2017-03-15 06:57:25 -04:00
Gael Guennebaud	e958c2baac	remove UTF8 symbols	2017-03-07 10:47:40 +01:00
Benoit Steiner	7b61944669	Made most of the packet math primitives usable within CUDA kernel when compiling with clang	2017-02-28 17:05:28 -08:00
Benoit Steiner	34d9fce93b	Avoid unnecessary float to double conversions.	2017-02-27 16:33:33 -08:00
Gael Guennebaud	cbbf88c4d7	Use int32_t instead of int in NEON code. Some platforms with 16-bit int support ARM NEON.	2017-02-17 14:39:02 +01:00
Rasmus Munk Larsen	5c9ed4ba0d	Reverse arguments for pmin in AVX.	2017-01-25 09:21:57 -08:00
Rasmus Munk Larsen	7b6aaa3440	Fix NaN propagation for AVX512.	2017-01-24 13:37:08 -08:00
Rasmus Munk Larsen	5e144bbaa4	Make NaN propagation consistent between the pmax/pmin and std::max/std::min. This makes the NaN propagation consistent between the scalar and vectorized code paths of Eigen's scalar_max_op and scalar_min_op. See #1373 for details.	2017-01-24 13:32:50 -08:00
Gael Guennebaud	ca79c1545a	Add std:: namespace prefix to all (hopefully) instances of size_t/ptrdiff_t	2017-01-23 22:02:53 +01:00
Benoit Steiner	354baa0fb1	Avoid using horizontal adds since they're not very efficient.	2016-12-21 20:55:07 -08:00
Benoit Steiner	d7825b6707	Use native AVX512 types instead of Eigen Packets whenever possible.	2016-12-21 20:06:18 -08:00
Benoit Steiner	923acadfac	Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics	2016-12-19 13:02:27 -08:00
Benoit Jacob	751e097c57	Use 32 registers on ARM64	2016-12-19 13:44:46 -05:00
Gael Guennebaud	8c0e701504	bug #1360 : fix sign issue with pmull on altivec	2016-12-18 22:13:19 +00:00
Gael Guennebaud	fc94258e77	Fix unused warning	2016-12-18 22:11:48 +00:00
Gael Guennebaud	5d00fdf0e8	bug #1363 : fix mingw's ABI issue	2016-12-15 11:58:31 +01:00
Srinivas Vasudevan	f7d7c33a28	Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).	2016-12-05 12:19:01 -08:00
Srinivas Vasudevan	09ee7f0c80	Fix small nit where I changed name of plog1p to pexpm1.	2016-12-02 15:30:12 -08:00
Srinivas Vasudevan	218764ee1f	Added support for expm1 in Eigen.	2016-12-02 14:13:01 -08:00
Rasmus Munk Larsen	a0329f64fb	Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.	2016-11-29 13:18:09 -08:00
Gael Guennebaud	e340866c81	Fix compilation with gcc and old ABI version	2016-11-23 14:04:57 +01:00
Gael Guennebaud	74637fa4e3	Optimize predux<Packet8f> (AVX)	2016-11-22 21:57:52 +01:00
Gael Guennebaud	178c084856	Disable usage of SSE3 _mm_hadd_ps that is extremely slow.	2016-11-22 21:53:14 +01:00
Gael Guennebaud	7dd894e40e	Optimize predux<Packet4d> (AVX)	2016-11-22 21:41:30 +01:00
Gael Guennebaud	f3fb0a1940	Disable usage of SSE3 haddpd that is extremely slow.	2016-11-22 16:58:31 +01:00
Konstantinos Margaritis	672aa97d4d	implement float/std::complex<float> for ZVector as well, minor fixes to ZVector	2016-11-17 13:27:33 -05:00
Benoit Steiner	dff9a049c4	Optimized the computation of exp, sqrt, ceil and floor for fp16 on Pascal GPUs	2016-11-16 09:01:51 -08:00
Benoit Steiner	c80587c92b	Merged eigen/eigen into default	2016-11-03 03:55:11 -07:00
Gael Guennebaud	598de8b193	Add pinsertfirst function and implement pinsertlast for complex on SSE/AVX.	2016-11-02 10:38:13 +01:00
Benoit Steiner	7a0e96b80d	Gate the code that refers to cuda fp16 primitives more thoroughly	2016-11-01 12:08:09 -07:00
Gael Guennebaud	aad72f3c6d	Add missing inline keywords	2016-10-25 20:20:09 +02:00
Benoit Steiner	3e194a6a73	Fixed a typo	2016-10-25 08:42:15 -07:00
Gael Guennebaud	13fc18d3a2	Add a pinsertlast function replacing the last entry of a packet by a scalar. (useful to vectorize LinSpaced)	2016-10-25 16:48:49 +02:00
Benoit Steiner	38b6048e14	Deleted redundant implementation of predux	2016-10-12 14:37:56 -07:00
Benoit Steiner	78d2926508	Merged eigen/eigen into default	2016-10-12 13:46:29 -07:00
Benoit Steiner	2e2f48e30e	Take advantage of AVX512 instructions whenever possible to speedup the processing of 16 bit floats.	2016-10-12 13:45:39 -07:00
Gael Guennebaud	5c366fe1d7	Merged in rmlarsen/eigen (pull request PR-230) Fix a bug in psqrt for SSE and AVX when EIGEN_FAST_MATH=1	2016-10-12 16:30:51 +00:00
Rasmus Munk Larsen	47150af1c8	Fix copy-paste error: Must use _mm256_cmp_ps for AVX.	2016-10-12 08:34:39 -07:00
Gael Guennebaud	89e315152c	bug #1325 : fix compilation on NEON with clang	2016-10-12 16:55:47 +02:00
Benoit Steiner	507b661106	Renamed predux_half into predux_downto4	2016-10-06 17:57:04 -07:00

591 Commits