eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Benoit Steiner	0b9e3dcd06	Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%.	2016-05-10 11:05:33 -07:00
Benoit Steiner	8adf5cc70f	Added support for packet processing of fp16 on kepler and maxwell gpus	2016-05-06 19:16:43 -07:00
Benoit Steiner	0451940fa4	Relaxed the dummy precision for fp16	2016-05-05 15:40:01 -07:00
Christoph Hertzberg	dacb469bc9	Enable and fix -Wdouble-conversion warnings	2016-05-05 13:35:45 +02:00
Ola Røer Thorsen	be78aea6b3	fix double-promotion/float-conversion in Core/SpecialFunctions.h	2016-05-04 10:52:08 +02:00
Gael Guennebaud	75a94b9662	Improve documentation of BDCSVD	2016-05-04 12:53:14 +02:00
Gael Guennebaud	e2ca478485	bug #1214 : consider denormals as zero in D&C SVD. This also works around an infinite binary search when compiling with ICC's unsafe optimizations.	2016-05-03 23:15:29 +02:00
Benoit Steiner	6c3e5b85bc	Fixed compilation error with cuda >= 7.5	2016-05-03 09:38:42 -07:00
Benoit Steiner	da50419df8	Made a cast explicit	2016-05-02 19:50:22 -07:00
Gael Guennebaud	b1bd53aa6b	Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed-size assignment.	2016-05-01 23:25:06 +02:00
Benoit Steiner	2b890ae618	Fixed compilation errors generated by clang	2016-04-29 18:30:40 -07:00
Benoit Steiner	46bcb70969	Don't turn on const expressions when compiling with gcc >= 4.8 unless the -std=c++11 option has been used	2016-04-29 15:20:59 -07:00
Gael Guennebaud	0f3c4c8ff4	Fix compilation of sparse.cast<>().transpose().	2016-04-29 18:26:08 +02:00
Benoit Steiner	dacb23277e	Fixed the igamma and igammac implementations to make them callable from a gpu kernel.	2016-04-28 18:54:54 -07:00
Benoit Steiner	a5d4545083	Deleted unused variable	2016-04-28 14:14:48 -07:00
Justin Lebar	40d1e2f8c7	Eliminate mutual recursion in igamma{,c}_impl::Run. Presently, igammac_impl::Run calls igamma_impl::Run, which in turn calls igammac_impl::Run. This isn't actually mutual recursion; the calls are guarded such that we never get into a loop. Nonetheless, it's a stretch for clang to prove this. As a result, clang emits a recursive call in both igammac_impl::Run and igamma_impl::Run. That this is suboptimal code is bad enough, but it's particularly bad when compiling for CUDA/nvptx. nvptx allows recursion, but only begrudgingly: If you have recursive calls in a kernel, it's on you to manually specify the kernel's stack size. Otherwise, ptxas will dump a warning, make a guess, and who knows if it's right. This change explicitly eliminates the mutual recursion in igammac_impl::Run and igamma_impl::Run.	2016-04-28 13:57:08 -07:00
Benoit Steiner	2b917291d9	Merged in rmlarsen/eigen2 (pull request PR-183) Detect cxx_constexpr support when compiling with clang.	2016-04-27 15:19:54 -07:00
Rasmus Munk Larsen	09b9e951e3	Depend on the more extensive support for constexpr in clang: http://clang.llvm.org/docs/LanguageExtensions.html#c-1y-relaxed-constexpr	2016-04-27 14:59:11 -07:00
Rasmus Munk Larsen	1a325ef71c	Detect cxx_constexpr support when compiling with clang.	2016-04-27 14:33:51 -07:00
Benoit Steiner	c61170e87d	fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen.	2016-04-27 14:22:20 -07:00
Benoit Steiner	f629fe95c8	Made the index type a template parameter to evaluateProductBlockingSizes. Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes.	2016-04-27 13:11:19 -07:00
Benoit Steiner	25141b69d4	Improved support for min and max on 16 bit floats when running on recent cuda gpus	2016-04-27 12:57:21 -07:00
Benoit Steiner	6744d776ba	Added support for fpclassify in Eigen::Numext	2016-04-27 12:10:25 -07:00
Benoit Steiner	5c372d19e3	Merged in rmlarsen/eigen (pull request PR-179) Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.	2016-04-21 18:06:36 -07:00
Rasmus Munk Larsen	a3256d78d8	Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.	2016-04-21 16:49:28 -07:00
Benoit Steiner	80200a1828	Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instructions.	2016-04-20 12:10:27 -07:00
Benoit Steiner	1d0238375d	Made sure all the required header files are included when trying to use fp16	2016-04-19 17:44:12 -07:00
Gael Guennebaud	e4fe611e2c	Enable lazy-coeff-based-product for vector*(1x1) products	2016-04-16 15:17:39 +02:00
Benoit Steiner	1a16fb1532	Deleted extraneous comma.	2016-04-15 15:50:13 -07:00
Gael Guennebaud	2a7115daca	bug #1203 : by-pass large stack-allocation in stableNorm if EIGEN_STACK_ALLOCATION_LIMIT is too small	2016-04-15 22:34:11 +02:00
Benoit Steiner	1d23430628	Improved the matrix multiplication blocking in the case where mr is not a power of 2 (e.g. on Haswell CPUs).	2016-04-15 10:53:31 -07:00
Gael Guennebaud	1e80bddde3	Fix trmv for mixing types.	2016-04-15 17:58:36 +02:00
Benoit Steiner	a62e924656	Added ability to access the cache sizes from the tensor devices	2016-04-14 21:25:06 -07:00
Benoit Steiner	18e6f67426	Added support for exclusive or	2016-04-14 20:37:46 -07:00
Gael Guennebaud	20f387fafa	Improve numerical robustness of JacobiSVD: - avoid noise amplification in complex to real conversion - compare off-diagonal entries to the current biggest diagonal entry: no need to bother about a 2x2 block containing ridiculously small entries compared to the rest of the matrix.	2016-04-14 22:46:55 +02:00
Benoit Steiner	7718749fee	Force the inlining of the << operator on half floats	2016-04-14 11:51:54 -07:00
Benoit Steiner	5379d2b594	Inline the << operator on half floats	2016-04-14 11:40:48 -07:00
Benoit Steiner	5c13765ee3	Added ability to printf fp16	2016-04-14 10:24:52 -07:00
Gael Guennebaud	3551dea887	Cleaning pass on rcond estimator.	2016-04-14 16:45:41 +02:00
Gael Guennebaud	d402adc3d7	Better use .data() than &coeffRef(0)	2016-04-14 15:18:08 +02:00
Gael Guennebaud	ea7087ef31	Merged in rmlarsen/eigen (pull request PR-174) Add matrix condition number estimation module.	2016-04-14 15:11:33 +02:00
Benoit Steiner	36f5a10198	Properly gate the definition of the error and gamma functions for fp16	2016-04-13 18:44:48 -07:00
Benoit Steiner	10b69810d1	Improved support for trigonometric functions on GPU	2016-04-13 16:00:51 -07:00
Benoit Steiner	d6105b53b8	Added basic implementation of the lgamma, digamma, igamma, igammac, polygamma, and zeta function for fp16	2016-04-13 15:26:02 -07:00
Gael Guennebaud	703251f10f	merge	2016-04-13 23:45:10 +02:00
Gael Guennebaud	39211ba46b	Fix JacobiSVD for complex when the complex-to-real update already gives a diagonal 2x2 block.	2016-04-13 23:43:26 +02:00
Benoit Steiner	2986253259	Cleaned up the implementation of digamma	2016-04-13 14:24:06 -07:00
Benoit Steiner	d5de1a8220	Pulled latest updates from trunk	2016-04-13 14:17:11 -07:00
Benoit Steiner	87ca15c4e8	Added support for sin, cos, tan, and tanh on fp16	2016-04-13 14:12:38 -07:00
Gael Guennebaud	feef39e2d1	Fix underflow in JacobiSVD's complex to real preconditioner	2016-04-13 22:49:51 +02:00

4618 Commits