Commit Graph

  • 283e33dea4 ptranspose is not a template. Benoit Steiner 2016-05-23 19:55:55 -07:00
  • a5a3ba2b80 Avoid unnecessary float to double conversions Benoit Steiner 2016-05-23 17:16:09 -07:00
  • 5ba0ebe7c9 Avoid unnecessary float to double conversion. Benoit Steiner 2016-05-23 17:14:31 -07:00
  • 7d980d74e5 Started to vectorize the processing of 16bit floats on CPU. Benoit Steiner 2016-05-23 15:21:40 -07:00
  • 5d51a7f12c Don't optimize the processing of the last rows of a matrix matrix product in cases that violate the assumptions made by the optimized code path. Benoit Steiner 2016-05-23 15:13:16 -07:00
  • 7aa5bc9558 Fixed a typo in the array.cpp test Benoit Steiner 2016-05-23 14:39:51 -07:00
  • a09cbf9905 Merged in rmlarsen/eigen (pull request PR-188) Benoit Steiner 2016-05-23 12:55:12 -07:00
  • 88654762da Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit. Christoph Hertzberg 2016-05-23 10:03:03 +02:00
  • 718521d5cf Silenced several double-promotion warnings Christoph Hertzberg 2016-05-22 18:17:04 +02:00
  • b5a7603822 fixed macro name Christoph Hertzberg 2016-05-22 16:49:29 +02:00
  • 25a03c02d6 Fix some sign-compare warnings Christoph Hertzberg 2016-05-22 16:42:27 +02:00
  • 0851d5d210 Identify clang++ even if it is not named llvm-clang++ Christoph Hertzberg 2016-05-22 15:21:14 +02:00
  • 6a15e14cda Document EIGEN_MAX_CPP_VER and user controllable compiler features. Gael Guennebaud 2016-05-20 15:26:09 +02:00
  • ccaace03c9 Make EIGEN_HAS_CONSTEXPR user configurable Gael Guennebaud 2016-05-20 15:10:08 +02:00
  • c3410804cd Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable Gael Guennebaud 2016-05-20 15:05:38 +02:00
  • abd1c1af7a Make EIGEN_HAS_STD_RESULT_OF user configurable Gael Guennebaud 2016-05-20 15:01:27 +02:00
  • 1395056fc0 Make EIGEN_HAS_C99_MATH user configurable Gael Guennebaud 2016-05-20 14:58:19 +02:00
  • 48bf5ec216 Make EIGEN_HAS_RVALUE_REFERENCES user configurable Gael Guennebaud 2016-05-20 14:54:20 +02:00
  • f43ae88892 Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES Gael Guennebaud 2016-05-20 14:48:51 +02:00
  • 8d6bd5691b polygamma is C99/C++11 only Gael Guennebaud 2016-05-20 14:45:33 +02:00
  • 998f2efc58 Add a EIGEN_MAX_CPP_VER option to limit the C++ version to be used. Gael Guennebaud 2016-05-20 14:44:28 +02:00
  • c028d96089 Improve doc of special math functions Gael Guennebaud 2016-05-20 14:18:48 +02:00
  • 0ba32f99bd Rename UniformRandom to UnitRandom. Gael Guennebaud 2016-05-20 13:21:34 +02:00
  • 7a9d9cde94 Fix coding practice in Quaternion::UniformRandom Gael Guennebaud 2016-05-20 13:19:52 +02:00
  • eb0cc2573a bug #823: add static method to Quaternion for uniform random rotations. Joseph Mirabel 2016-05-20 13:15:40 +02:00
  • 2f656ce447 Remove std:: to enable custom scalar types. Gael Guennebaud 2016-05-19 23:13:47 +02:00
  • b1e080c752 Merged eigen/eigen into default Rasmus Larsen 2016-05-18 15:21:50 -07:00
  • 5624219b6b Merge. Rasmus Munk Larsen 2016-05-18 15:16:06 -07:00
  • 7df811cfe5 Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL. Rasmus Munk Larsen 2016-05-18 15:09:48 -07:00
  • f519fca72b Reduce overhead for small tensors and cheap ops by short-circuiting the const computation and block size calculation in parallelFor. Rasmus Munk Larsen 2016-05-17 16:06:00 -07:00
  • a910bcee43 Merged latest updates from trunk Benoit Steiner 2016-05-17 09:14:22 -07:00
  • 8d06c02ffd Allow vectorized padding on GPU. This helps speed things up a little. Benoit Steiner 2016-05-17 09:13:27 -07:00
  • ccc7563ac5 made a fix to the GMRES solver so that it now correctly reports the error achieved in the solution process David Dement 2016-05-16 14:26:41 -04:00
  • 575bc44c3f Fix unit test. Gael Guennebaud 2016-05-19 22:48:16 +02:00
  • ccb408ee6a Improve unit tests of zeta, polygamma, and digamma Gael Guennebaud 2016-05-19 18:34:41 +02:00
  • 6761c64d60 zeta and polygamma are not unary functions, but binary ones. Gael Guennebaud 2016-05-19 18:34:16 +02:00
  • 7a54032408 zeta and digamma do not require C++11/C99 Gael Guennebaud 2016-05-19 17:36:47 +02:00
  • ce12562710 Add some c++11 flags in documentation Gael Guennebaud 2016-05-19 17:35:30 +02:00
  • b6ed8244b4 bug #1201: optimize affine*vector products Gael Guennebaud 2016-05-19 16:09:15 +02:00
  • 73693b5de6 bug #1221: disable gcc 6 warning: ignoring attributes on template argument Gael Guennebaud 2016-05-19 15:21:53 +02:00
  • df9a5e13c6 Fix SelfAdjointEigenSolver for some input expression types, and add new regression unit tests for sparse and selfadjointview inputs. Gael Guennebaud 2016-05-19 13:07:33 +02:00
  • 6a2916df80 DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag. Gael Guennebaud 2016-05-19 13:06:21 +02:00
  • a226f6af6b Add support for SelfAdjointView::diagonal() Gael Guennebaud 2016-05-19 13:05:33 +02:00
  • ee7da3c7c5 Fix SelfAdjointView::triangularView for complexes. Gael Guennebaud 2016-05-19 13:01:51 +02:00
  • b6b8578a67 bug #1230: add support for SelfadjointView::triangularView. Gael Guennebaud 2016-05-19 11:36:38 +02:00
  • bb3ff8e9d9 Advertize the packet api of the tensor reducers iff the corresponding packet primitives are available. Benoit Steiner 2016-05-18 14:52:49 -07:00
  • 84df9142e7 bug #1231: fix compilation regression regarding complex_array/=real_array and add respective unit tests Gael Guennebaud 2016-05-18 23:00:13 +02:00
  • 21d692d054 Use coeff(i,j) instead of operator(). Gael Guennebaud 2016-05-18 17:09:20 +02:00
  • 8456bbbadb bug #1224: fix regression in (dense*dense).sparseView() by specializing evaluator<SparseView<Product>> for sparse products only. Gael Guennebaud 2016-05-18 16:53:28 +02:00
  • b507b82326 Use default sorting strategy for square products. Gael Guennebaud 2016-05-18 16:51:54 +02:00
  • 1fa15ceee6 Extend sparse*sparse product unit test to check that the expected implementation is used (conservative vs auto pruning). Gael Guennebaud 2016-05-18 16:50:54 +02:00
  • 548a487800 bug #1229: bypass usage of Derived::Options which is available for plain matrix types only. Better use column-major storage anyway. Gael Guennebaud 2016-05-18 16:44:05 +02:00
  • 43790e009b Pass argument by const ref instead of by value in pow(AutoDiffScalar...) Gael Guennebaud 2016-05-18 16:28:02 +02:00
  • 1fbfab27a9 bug #1223: fix compilation of AutoDiffScalar's min/max operators, and add regression unit test. Gael Guennebaud 2016-05-18 16:26:26 +02:00
  • 448d9d943c bug #1222: fix compilation in AutoDiffScalar and add respective unit test Gael Guennebaud 2016-05-18 16:00:11 +02:00
  • 5a71eb5985 Bug #1213: add regression unit test. Gael Guennebaud 2016-05-18 14:03:03 +02:00
  • 747e3290c0 bug #1213: rename some enums type for consistency. Gael Guennebaud 2016-05-18 13:26:56 +02:00
  • 86ae94462e #if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if !defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non blocking thread pool is the default since it's more scalable, and one needs to request the old thread pool explicitly. Benoit Steiner 2016-05-17 14:06:15 -07:00
  • 997c335970 Fixed compilation error Benoit Steiner 2016-05-17 12:54:18 -07:00
  • ebf6ada5ee Fixed compilation error in the tensor thread pool Benoit Steiner 2016-05-17 12:33:46 -07:00
  • 0bb61b04ca Merge upstream. Rasmus Munk Larsen 2016-05-17 10:26:10 -07:00
  • 0dbd68145f Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. Rasmus Munk Larsen 2016-05-17 10:25:19 -07:00
  • 00228f2506 Merged eigen/eigen into default Rasmus Larsen 2016-05-17 09:49:31 -07:00
  • e7e64c3277 Enable the use of the packet api to evaluate tensor broadcasts. This speeds things up quite a bit: Benoit Steiner 2016-05-17 09:24:35 -07:00
  • 5fa27574dd Allow vectorized padding on GPU. This helps speed things up a little Benoit Steiner 2016-05-17 09:17:26 -07:00
  • 86da77cb9b Pulled latest updates from trunk. Benoit Steiner 2016-05-17 07:21:48 -07:00
  • 92fc6add43 Don't rely on c++11 extension when we don't have to. Benoit Steiner 2016-05-17 07:21:22 -07:00
  • 2d74ef9682 Avoid float to double conversion Benoit Steiner 2016-05-17 07:20:11 -07:00
  • a80d875916 Added missing costPerCoeff method Benoit Steiner 2016-05-16 09:31:10 -07:00
  • 83ef39e055 Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions. Benoit Steiner 2016-05-16 08:55:21 -07:00
  • b789a26804 Fixed syntax error Benoit Steiner 2016-05-16 08:51:08 -07:00
  • 83dfb40f66 Turn on the new thread pool by default since it scales much better over multiple cores. It is still possible to revert to the old thread pool by compiling with the EIGEN_USE_SIMPLE_THREAD_POOL define. Benoit Steiner 2016-05-13 17:23:15 -07:00
  • 97605c7b27 New multithreaded contraction that doesn't rely on the thread pool to run the closure in the order in which they are enqueued. This is needed in order to switch to the new non blocking thread pool since this new thread pool can execute the closure in any order. Benoit Steiner 2016-05-13 17:11:29 -07:00
  • 069a0b04d7 Added benchmarks for contraction on CPU. Benoit Steiner 2016-05-13 14:32:17 -07:00
  • c4fc8b70ec Removed unnecessary thread synchronization Benoit Steiner 2016-05-13 10:49:38 -07:00
  • 7aa3557d31 Fixed compilation errors triggered by old versions of gcc Benoit Steiner 2016-05-12 18:59:04 -07:00
  • 5005b27fc8 Disabled cost model by accident. Revert. Rasmus Munk Larsen 2016-05-12 16:55:21 -07:00
  • 989e419328 Address comments by bsteiner. Rasmus Munk Larsen 2016-05-12 16:54:19 -07:00
  • e55deb21c5 Improvements to parallelFor. Rasmus Munk Larsen 2016-05-12 14:07:22 -07:00
  • ae9688f313 Worked around a compilation error triggered by nvcc when compiling a tensor concatenation kernel. Benoit Steiner 2016-05-12 12:06:51 -07:00
  • 2a54b70d45 Fixed potential race condition in the non blocking thread pool Benoit Steiner 2016-05-12 11:45:48 -07:00
  • a071629fec Replace implicit cast with an explicit one Benoit Steiner 2016-05-12 10:40:07 -07:00
  • 2f9401b061 Worked around compilation errors with older versions of gcc Benoit Steiner 2016-05-11 23:39:20 -07:00
  • 09653e1f82 Improved the portability of the tensor code Benoit Steiner 2016-05-11 23:29:09 -07:00
  • fae0493f98 Fixed a couple of bugs related to the Pascal family of GPUs Benoit Steiner 2016-05-11 23:02:26 -07:00
  • 886445ce4d Avoid unnecessary conversions between floats and doubles Benoit Steiner 2016-05-11 23:00:03 -07:00
  • 595e890391 Added more tests for half floats Benoit Steiner 2016-05-11 21:27:15 -07:00
  • b6a517c47d Added the ability to load fp16 using the texture path. Improved the performance of some reductions on fp16 Benoit Steiner 2016-05-11 21:26:48 -07:00
  • 518149e868 Misc fixes for fp16 Benoit Steiner 2016-05-11 20:11:14 -07:00
  • 56a1757d74 Made predux_min and predux_max on fp16 less noisy Benoit Steiner 2016-05-11 17:37:34 -07:00
  • 9091351dbe __ldg is only available with cuda architectures >= 3.5 Benoit Steiner 2016-05-11 15:22:13 -07:00
  • 02f76dae2d Fixed a typo Benoit Steiner 2016-05-11 15:08:38 -07:00
  • 131e5a1a4a Do not copy for trivial 1x1 case. This also avoids a "maybe-uninitialized" warning in some situations. Christoph Hertzberg 2016-05-11 23:50:13 +02:00
  • 70195a5ff7 Added missing EIGEN_DEVICE_FUNC Benoit Steiner 2016-05-11 14:10:09 -07:00
  • 09a19c33a8 Added missing EIGEN_DEVICE_FUNC qualifiers Benoit Steiner 2016-05-11 14:07:43 -07:00
  • 1a1ce6ff61 Removed deprecated flag (which apparently was ignored anyway) Christoph Hertzberg 2016-05-11 23:05:37 +02:00
  • 2150f13d65 fixed some double-promotion and sign-compare warnings Christoph Hertzberg 2016-05-11 23:02:26 +02:00
  • 7268b10203 Split unit test Christoph Hertzberg 2016-05-11 19:41:53 +02:00
  • 8d4ef391b0 Don't flood test output with successful VERIFY_IS_NOT_EQUAL tests. Christoph Hertzberg 2016-05-11 19:40:45 +02:00
  • bda21407dd Fix help output of buildtests and check scripts Christoph Hertzberg 2016-05-11 19:39:09 +02:00