4776 Commits

Author SHA1 Message Date
Benoit Jacob
6136f4fdd4 Remove the rotating kernel. It was only useful on some ARM CPUs (Qualcomm Krait) that are not as ubiquitous today as they were when I introduced it. 2016-05-24 10:00:32 -04:00
Benoit Steiner
e617711306 Don't attempt to use MMX instructions with Visual Studio since they're only partially supported. 2016-05-24 06:43:58 -07:00
Benoit Steiner
334e76537f Worked around missing clang intrinsic 2016-05-24 00:29:28 -07:00
Benoit Steiner
b517ab349b Use the generic ploadquad intrinsic since it does the job 2016-05-24 00:11:17 -07:00
Benoit Steiner
646872cb3b Worked around missing clang intrinsics 2016-05-24 00:07:08 -07:00
Benoit Steiner
3dfc391a61 Added missing EIGEN_DEVICE_FUNC qualifier 2016-05-23 20:56:59 -07:00
Benoit Steiner
3d0741f027 Include mmintrin.h to make it possible to use mmx instructions when needed. For example, this will enable the definition of a half packet for the Packet4f type. 2016-05-23 20:43:48 -07:00
Benoit Steiner
33a94f5dc7 Use the Index type instead of integers to specify the strides in pgather/pscatter 2016-05-23 20:37:30 -07:00
Benoit Steiner
6bc684ab6a Added missing alignment in the fp16 packet traits 2016-05-23 20:32:30 -07:00
Benoit Steiner
283e33dea4 ptranspose is not a template. 2016-05-23 19:55:55 -07:00
Benoit Steiner
a5a3ba2b80 Avoid unnecessary float to double conversions 2016-05-23 17:16:09 -07:00
Benoit Steiner
5ba0ebe7c9 Avoid unnecessary float to double conversion. 2016-05-23 17:14:31 -07:00
Benoit Steiner
7d980d74e5 Started to vectorize the processing of 16bit floats on CPU. 2016-05-23 15:21:40 -07:00
Benoit Steiner
5d51a7f12c Don't optimize the processing of the last rows of a matrix matrix product in cases that violate the assumptions made by the optimized code path. 2016-05-23 15:13:16 -07:00
Christoph Hertzberg
88654762da Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit. 2016-05-23 10:03:03 +02:00
Christoph Hertzberg
718521d5cf Silenced several double-promotion warnings 2016-05-22 18:17:04 +02:00
Gael Guennebaud
ccaace03c9 Make EIGEN_HAS_CONSTEXPR user configurable 2016-05-20 15:10:08 +02:00
Gael Guennebaud
c3410804cd Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable 2016-05-20 15:05:38 +02:00
Gael Guennebaud
abd1c1af7a Make EIGEN_HAS_STD_RESULT_OF user configurable 2016-05-20 15:01:27 +02:00
Gael Guennebaud
1395056fc0 Make EIGEN_HAS_C99_MATH user configurable 2016-05-20 14:58:19 +02:00
Gael Guennebaud
48bf5ec216 Make EIGEN_HAS_RVALUE_REFERENCES user configurable 2016-05-20 14:54:20 +02:00
Gael Guennebaud
f43ae88892 Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES 2016-05-20 14:48:51 +02:00
Gael Guennebaud
998f2efc58 Add an EIGEN_MAX_CPP_VER option to limit the C++ version to be used. 2016-05-20 14:44:28 +02:00
Gael Guennebaud
c028d96089 Improve doc of special math functions 2016-05-20 14:18:48 +02:00
Gael Guennebaud
0ba32f99bd Rename UniformRandom to UnitRandom. 2016-05-20 13:21:34 +02:00
Gael Guennebaud
7a9d9cde94 Fix coding practice in Quaternion::UniformRandom 2016-05-20 13:19:52 +02:00
Joseph Mirabel
eb0cc2573a bug #823: add static method to Quaternion for uniform random rotations. 2016-05-20 13:15:40 +02:00
Gael Guennebaud
6761c64d60 zeta and polygamma are not unary functions, but binary ones. 2016-05-19 18:34:16 +02:00
Gael Guennebaud
7a54032408 zeta and digamma do not require C++11/C99 2016-05-19 17:36:47 +02:00
Gael Guennebaud
ce12562710 Add some c++11 flags in documentation 2016-05-19 17:35:30 +02:00
Gael Guennebaud
b6ed8244b4 bug #1201: optimize affine*vector products 2016-05-19 16:09:15 +02:00
Gael Guennebaud
73693b5de6 bug #1221: disable gcc 6 warning: ignoring attributes on template argument 2016-05-19 15:21:53 +02:00
Gael Guennebaud
df9a5e13c6 Fix SelfAdjointEigenSolver for some input expression types, and add new regression unit tests for sparse and selfadjointview inputs. 2016-05-19 13:07:33 +02:00
Gael Guennebaud
6a2916df80 DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag. 2016-05-19 13:06:21 +02:00
Gael Guennebaud
a226f6af6b Add support for SelfAdjointView::diagonal() 2016-05-19 13:05:33 +02:00
Gael Guennebaud
ee7da3c7c5 Fix SelfAdjointView::triangularView for complexes. 2016-05-19 13:01:51 +02:00
Gael Guennebaud
b6b8578a67 bug #1230: add support for SelfAdjointView::triangularView. 2016-05-19 11:36:38 +02:00
Gael Guennebaud
84df9142e7 bug #1231: fix compilation regression regarding complex_array/=real_array and add respective unit tests 2016-05-18 23:00:13 +02:00
Gael Guennebaud
21d692d054 Use coeff(i,j) instead of operator(). 2016-05-18 17:09:20 +02:00
Gael Guennebaud
8456bbbadb bug #1224: fix regression in (dense*dense).sparseView() by specializing evaluator<SparseView<Product>> for sparse products only. 2016-05-18 16:53:28 +02:00
Gael Guennebaud
b507b82326 Use default sorting strategy for square products. 2016-05-18 16:51:54 +02:00
Gael Guennebaud
747e3290c0 bug #1213: rename some enum types for consistency. 2016-05-18 13:26:56 +02:00
Rasmus Munk Larsen
0dbd68145f Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. 2016-05-17 10:25:19 -07:00
Rasmus Munk Larsen
e55deb21c5 Improvements to parallelFor.
Move some scalar functors from TensorFunctors. to Eigen core.
2016-05-12 14:07:22 -07:00
Benoit Steiner
fae0493f98 Fixed a couple of bugs related to the Pascal family of GPUs
2016-05-11 23:02:26 -07:00
Benoit Steiner
b6a517c47d Added the ability to load fp16 using the texture path.
Improved the performance of some reductions on fp16
2016-05-11 21:26:48 -07:00
Benoit Steiner
518149e868 Misc fixes for fp16 2016-05-11 20:11:14 -07:00
Benoit Steiner
56a1757d74 Made predux_min and predux_max on fp16 less noisy 2016-05-11 17:37:34 -07:00
Benoit Steiner
9091351dbe __ldg is only available with cuda architectures >= 3.5 2016-05-11 15:22:13 -07:00
Benoit Steiner
02f76dae2d Fixed a typo 2016-05-11 15:08:38 -07:00