Commit Graph

3126 Commits

Author SHA1 Message Date
Benoit Jacob
6136f4fdd4 Remove the rotating kernel. It was only useful on some ARM CPUs (Qualcomm Krait) that are not as ubiquitous today as they were when I introduced it. 2016-05-24 10:00:32 -04:00
Benoit Steiner
e617711306 Don't attempt to use MMX instructions with visualstudio since they're only partially supported. 2016-05-24 06:43:58 -07:00
Benoit Steiner
334e76537f Worked around missing clang intrinsic 2016-05-24 00:29:28 -07:00
Benoit Steiner
b517ab349b Use the generic ploadquad intrinsics since it does the job 2016-05-24 00:11:17 -07:00
Benoit Steiner
646872cb3b Worked around missing clang intrinsics 2016-05-24 00:07:08 -07:00
Benoit Steiner
3dfc391a61 Added missing EIGEN_DEVICE_FUNC qualifier 2016-05-23 20:56:59 -07:00
Benoit Steiner
33a94f5dc7 Use the Index type instead of integers to specify the strides in pgather/pscatter 2016-05-23 20:37:30 -07:00
Benoit Steiner
6bc684ab6a Added missing alignment in the fp16 packet traits 2016-05-23 20:32:30 -07:00
Benoit Steiner
283e33dea4 ptranspose is not a template. 2016-05-23 19:55:55 -07:00
Benoit Steiner
a5a3ba2b80 Avoid unnecessary float to double conversions 2016-05-23 17:16:09 -07:00
Benoit Steiner
5ba0ebe7c9 Avoid unnecessary float to double conversion. 2016-05-23 17:14:31 -07:00
Benoit Steiner
7d980d74e5 Started to vectorize the processing of 16bit floats on CPU. 2016-05-23 15:21:40 -07:00
Benoit Steiner
5d51a7f12c Don't optimize the processing of the last rows of a matrix matrix product in cases that violate the assumptions made by the optimized code path. 2016-05-23 15:13:16 -07:00
Christoph Hertzberg
88654762da Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit. 2016-05-23 10:03:03 +02:00
Christoph Hertzberg
718521d5cf Silenced several double-promotion warnings 2016-05-22 18:17:04 +02:00
Gael Guennebaud
ccaace03c9 Make EIGEN_HAS_CONSTEXPR user configurable 2016-05-20 15:10:08 +02:00
Gael Guennebaud
c3410804cd Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable 2016-05-20 15:05:38 +02:00
Gael Guennebaud
abd1c1af7a Make EIGEN_HAS_STD_RESULT_OF user configurable 2016-05-20 15:01:27 +02:00
Gael Guennebaud
1395056fc0 Make EIGEN_HAS_C99_MATH user configurable 2016-05-20 14:58:19 +02:00
Gael Guennebaud
48bf5ec216 Make EIGEN_HAS_RVALUE_REFERENCES user configurable 2016-05-20 14:54:20 +02:00
Gael Guennebaud
f43ae88892 Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES 2016-05-20 14:48:51 +02:00
Gael Guennebaud
998f2efc58 Add a EIGEN_MAX_CPP_VER option to limit the C++ version to be used. 2016-05-20 14:44:28 +02:00
Gael Guennebaud
c028d96089 Improve doc of special math functions 2016-05-20 14:18:48 +02:00
Gael Guennebaud
6761c64d60 zeta and polygamma are not unary functions, but binary ones. 2016-05-19 18:34:16 +02:00
Gael Guennebaud
7a54032408 zeta and digamma do not require C++11/C99 2016-05-19 17:36:47 +02:00
Gael Guennebaud
73693b5de6 bug #1221: disable gcc 6 warning: ignoring attributes on template argument 2016-05-19 15:21:53 +02:00
Gael Guennebaud
6a2916df80 DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag. 2016-05-19 13:06:21 +02:00
Gael Guennebaud
a226f6af6b Add support for SelfAdjointView::diagonal() 2016-05-19 13:05:33 +02:00
Gael Guennebaud
ee7da3c7c5 Fix SelfAdjointView::triangularView for complexes. 2016-05-19 13:01:51 +02:00
Gael Guennebaud
b6b8578a67 bug #1230: add support for SelfadjointView::triangularView. 2016-05-19 11:36:38 +02:00
Gael Guennebaud
84df9142e7 bug #1231: fix compilation regression regarding complex_array/=real_array and add respective unit tests 2016-05-18 23:00:13 +02:00
Gael Guennebaud
747e3290c0 bug #1213: rename some enums type for consistency. 2016-05-18 13:26:56 +02:00
Rasmus Munk Larsen
0dbd68145f Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. 2016-05-17 10:25:19 -07:00
Rasmus Munk Larsen
e55deb21c5 Improvements to parallelFor.
Move some scalar functors from TensorFunctors. to Eigen core.
2016-05-12 14:07:22 -07:00
Benoit Steiner
fae0493f98 Fixed a couple of bugs related to the Pascalfamily of GPUs
H: Enter commit message.  Lines beginning with 'HG:' are removed.
2016-05-11 23:02:26 -07:00
Benoit Steiner
b6a517c47d Added the ability to load fp16 using the texture path.
Improved the performance of some reductions on fp16
2016-05-11 21:26:48 -07:00
Benoit Steiner
518149e868 Misc fixes for fp16 2016-05-11 20:11:14 -07:00
Benoit Steiner
56a1757d74 Made predux_min and predux_max on fp16 less noisy 2016-05-11 17:37:34 -07:00
Benoit Steiner
9091351dbe __ldg is only available with cuda architectures >= 3.5 2016-05-11 15:22:13 -07:00
Benoit Steiner
02f76dae2d Fixed a typo 2016-05-11 15:08:38 -07:00
Christoph Hertzberg
131e5a1a4a Do not copy for trivial 1x1 case. This also avoids a "maybe-uninitialized" warning in some situations. 2016-05-11 23:50:13 +02:00
Benoit Steiner
70195a5ff7 Added missing EIGEN_DEVICE_FUNC 2016-05-11 14:10:09 -07:00
Benoit Steiner
09a19c33a8 Added missing EIGEN_DEVICE_FUNC qualifiers 2016-05-11 14:07:43 -07:00
Christoph Hertzberg
33ca7e3c8d bug #1207: Add and fix logical-op warnings 2016-05-11 19:36:34 +02:00
Benoit Steiner
217d984abc Fixed a typo in my previous commit 2016-05-11 10:22:15 -07:00
Benoit Steiner
0b9e3dcd06 Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%. 2016-05-10 11:05:33 -07:00
Benoit Steiner
8adf5cc70f Added support for packet processing of fp16 on kepler and maxwell gpus 2016-05-06 19:16:43 -07:00
Christoph Hertzberg
a11bd82dc3 bug #1213: Give names to anonymous enums 2016-05-06 11:31:56 +02:00
Benoit Steiner
0451940fa4 Relaxed the dummy precision for fp16 2016-05-05 15:40:01 -07:00
Christoph Hertzberg
dacb469bc9 Enable and fix -Wdouble-conversion warnings 2016-05-05 13:35:45 +02:00