Commit Graph

  • a8ad8887bf Implement a more generic blocking-size selection algorithm. See explanations inline. It performs extremely well on Haswell. The main issue is to reliably and quickly find the actual cache size to be used for our 2nd level of blocking, that is: max(l2,l3/nb_core_sharing_l3) Gael Guennebaud 2015-02-26 16:04:35 +01:00
  • 400becc591 Fix typos in block-size testing code, and set peeling on k to 8. Gael Guennebaud 2015-02-26 15:57:06 +01:00
  • bffb6bdf45 Made TensorIntDiv.h compile with MSVC Benoit Steiner 2015-02-25 23:54:43 -08:00
  • 27f3fb2bcc Fixed another clang warning Benoit Steiner 2015-02-25 22:54:20 -08:00
  • f8fbb3f9a6 Fixed several compilation warnings reported by clang Benoit Steiner 2015-02-25 22:22:37 -08:00
  • 8e817b65d0 Silenced a few more compilation warnings generated by nvcc Benoit Steiner 2015-02-25 17:46:20 -08:00
  • 410070e5ab Added more tests to validate support for tensors laid out in RowMajor order. Benoit Steiner 2015-02-25 16:14:59 -08:00
  • 1cfd51908c Added support for RowMajor layout to the tensor patch extraction code. Benoit Steiner 2015-02-25 13:29:12 -08:00
  • eb21a8173e Pulled latest changes from trunk Benoit Steiner 2015-02-25 09:49:44 -08:00
  • 8afce86e64 Added support for RowMajor layout to the image patch extraction code. Sped up the unsupported_cxx11_tensor_image_patch test and reduced its memory footprint. Benoit Steiner 2015-02-25 09:48:54 -08:00
  • 692136350b So I extensively measured the impact of the offset in this prefetch. I tried offset values from 0 to 128 (on this float* pointer, so implicitly times 4 bytes). Benoit Jacob 2015-02-25 12:37:14 -05:00
  • 531fa9de77 bug #970: Add EIGEN_DEVICE_FUNC to RValue functions, in case Cuda supports RValue-references. Christoph Hertzberg 2015-02-24 21:03:28 +01:00
  • 26275b250a Fix my recent prefetch changes: - the first prefetch is actually harmful on Haswell with FMA, but it is the most beneficial on ARM. - the second prefetch... I was very stupid and multiplied by sizeof(scalar) an offset of a scalar* pointer. The old offset was 64; pk = 8, so 64=pk*8. So this effectively restores the older offset. Actually, there were two prefetches here, one with offset 48 and one with offset 64. I could not confirm any benefit from this strange 48 offset on either the Haswell or my ARM device. Benoit Jacob 2015-02-23 16:55:17 -05:00
  • 488874781b Add analyze-blocking-sizes program under bench/ to analyze multiple logs generated by benchmark-blocking-sizes. Benoit Jacob 2015-02-23 14:02:29 -05:00
  • 052b6b40f1 Fix two trivial warnings Christoph Hertzberg 2015-02-22 12:40:51 +01:00
  • ecbf2a6656 log1p is defined only for real Scalars in C++11 Christoph Hertzberg 2015-02-21 19:58:24 +01:00
  • 6af6cf0c2e I cannot reproduce any problems that justified this hack. However it makes builds fail in C++11 mode. Christoph Hertzberg 2015-02-21 19:43:56 +01:00
  • 3cf642baa3 Fix compilation of unit tests disabling assertion checking Gael Guennebaud 2015-02-21 14:13:48 +01:00
  • 458cf91cd9 Add benchmark-blocking-sizes.cpp to bench/ per mailing list discussion. Benoit Jacob 2015-02-20 17:08:04 -05:00
  • 03ec601ff7 Initial version of a small script to help tracking performance regressions Gael Guennebaud 2015-02-20 19:20:34 +01:00
  • 333b497383 update bench_gemm Gael Guennebaud 2015-02-20 11:59:49 +01:00
  • 2da1594750 Fix doc of Ref<> Gael Guennebaud 2015-02-20 11:52:22 +01:00
  • 01b8440579 With C++11 Matrix<float> + Matrix<complex<float>> does not even compile Gael Guennebaud 2015-02-20 09:32:49 +01:00
  • 3594451ee0 Remove EIGEN_TEST_C++0x option and let EIGEN_TEST_CXX11 add the -std=c++11 flag Gael Guennebaud 2015-02-20 09:31:27 +01:00
  • b192e29eae In C++11 destructors do not throw by default (fix CommaInitializer unit test) Gael Guennebaud 2015-02-20 09:28:34 +01:00
  • ab41652d81 Pulled latest changes from trunk Benoit Steiner 2015-02-19 21:23:37 -08:00
  • 7765039f1c Marked the CUDA packet primitives as EIGEN_DEVICE_FUNC since they'll end up being executed on the GPU device. Benoit Steiner 2015-02-19 21:22:51 -08:00
  • a66f5fc2fd Fix regression with C++11 support of lambda: now internal::result_of falls back to std::result_of in C++11. Gael Guennebaud 2015-02-19 23:32:12 +01:00
  • ece6b440f9 Fix a C++11 compilation issue in unit test Gael Guennebaud 2015-02-19 23:31:08 +01:00
  • 1b7e12847d Fix some calls to result_of on binary functors as unary ones. Gael Guennebaud 2015-02-19 23:30:41 +01:00
  • 0f4dd15dfc Declare const some const variables Gael Guennebaud 2015-02-19 23:28:57 +01:00
  • 92ceb02c6d Pulled latest updates from trunk Benoit Steiner 2015-02-19 11:59:52 -08:00
  • 110fb90250 Improved the documentation Benoit Steiner 2015-02-19 11:59:04 -08:00
  • 829dddd0fd Add support for C++11 result_of/lambdas Gael Guennebaud 2015-02-19 15:18:37 +01:00
  • db05f2d01e rotating kernel: avoid compiling anything outside of ARM Benoit Jacob 2015-02-18 15:43:52 -05:00
  • 0ed00d5438 remove a newly introduced redundant typedef - sorry. Benoit Jacob 2015-02-18 15:05:01 -05:00
  • 9bd8a4bab5 bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path Benoit Jacob 2015-02-18 15:03:35 -05:00
  • ee27d50633 Fixed template parameter. Hauke Heibel 2015-02-18 18:51:08 +01:00
  • 73a24de424 merge Gael Guennebaud 2015-02-18 15:51:00 +01:00
  • 63eb0f6fe6 Clean a bit computeProductBlockingSizes (use Index type, remove CEIL macro) Gael Guennebaud 2015-02-18 15:49:05 +01:00
  • fc5c3e85e2 Fix bug #961: eigen-doc.tgz included part of itself. Gael Guennebaud 2015-02-18 15:47:01 +01:00
  • 4a3e6c8be1 bug #958 - Allow testing specific blocking sizes Benoit Jacob 2015-02-18 09:43:55 -05:00
  • c7bb1e8ea8 Fix a regression when using OpenMP, and fix bug #714: the number of threads might be lower than the number of requested ones Gael Guennebaud 2015-02-18 15:19:23 +01:00
  • 548b781380 Fix bug #945: workaround MSVC warning Gael Guennebaud 2015-02-18 12:53:49 +01:00
  • 6f4adc9e94 Add missing install directives for arch/CUDA Gael Guennebaud 2015-02-18 11:40:06 +01:00
  • 371d3bef36 Workaround dead store warnings in unit tests. Gael Guennebaud 2015-02-18 11:30:44 +01:00
  • 63464754ef Add an internal assertion in makeCompressed to catch a possible risk of null-pointer access. Gael Guennebaud 2015-02-18 11:29:54 +01:00
  • eb563049f7 Remove some dead stores. Gael Guennebaud 2015-02-18 11:26:48 +01:00
  • dc7e6acc05 Fix possible usage of a null pointer in CholmodSupport Gael Guennebaud 2015-02-18 11:26:25 +01:00
  • d4eda01488 Bug 957, workaround MSVC/ICC compilation issue Gael Guennebaud 2015-02-18 11:24:32 +01:00
  • 24d65ac0b0 Removed redundant typedef which confused old gcc versions. Christoph Hertzberg 2015-02-18 01:03:32 +01:00
  • 20cac72b82 Packet must be passed by const reference and not by value to avoid alignment issue. Gael Guennebaud 2015-02-17 22:58:32 +01:00
  • 36c9d08274 Pulled latest updates from trunk Benoit Steiner 2015-02-17 10:04:25 -08:00
  • f77054f43c Silenced compilation warning Benoit Steiner 2015-02-17 10:02:04 -08:00
  • 1d3b64d32b Added support for tensor concatenation as lvalue Benoit Steiner 2015-02-17 09:57:41 -08:00
  • 00f048d44f Added support for tensor concatenation as lvalue Benoit Steiner 2015-02-17 09:54:40 -08:00
  • 97a36ecba4 Suppress some remaining Index conversion warnings Christoph Hertzberg 2015-02-17 18:52:39 +01:00
  • 159fb181c2 Disable __m128* wrappers when compiling with AVX and -fabi-version=4 Gael Guennebaud 2015-02-17 16:27:20 +01:00
  • 91ab2489dd Fix compilation with GCC/AVX (workaround __m128 and __m256 being the same type with default ABI) Gael Guennebaud 2015-02-17 16:08:07 +01:00
  • 9daf8eba6f Fix compilation of Cholmod*(matrix) ctor Gael Guennebaud 2015-02-17 15:24:52 +01:00
  • 3373c903b3 Fix compilation of int*complex with gcc Gael Guennebaud 2015-02-16 19:18:12 +01:00
  • 9f49f00feb Extend sparse-determinant unit tests Gael Guennebaud 2015-02-16 19:09:48 +01:00
  • f56d452c7e Enable atv in Blaze Benchmark Florian George 2014-05-04 17:07:17 +02:00
  • af79b158a1 Use trans(X) instead of X.transpose() in Blaze Benchmark Florian George 2014-05-04 17:06:34 +02:00
  • 756024825d Fix support for row (resp. column) of a column-major (resp. row-major) sparse matrix (grafted from 3573a10712 ) Gael Guennebaud 2014-02-17 13:46:17 +01:00
  • ec6ca4eae9 bug #1249: enable use of __builtin_prefetch for GCC, clang, and ICC only. Gael Guennebaud 2016-07-25 15:17:45 +02:00
  • eb7863ebd0 Workaround MSVC 2013 compilation issue in Reverse (users are unlikely to be affected) Gael Guennebaud 2016-07-19 17:21:49 +02:00
  • aa0d407f2e Added tag 3.2.9 for changeset dc2f92ba4a Gael Guennebaud 2016-07-18 16:28:53 +02:00
  • dc2f92ba4a bump to 3.2.9 3.2.9 Gael Guennebaud 2016-07-18 16:28:24 +02:00
  • 2eb8b99a32 Fix compilation issue in PastixSupport Gael Guennebaud 2016-07-18 14:55:06 +02:00
  • 83c726b343 merge Gael Guennebaud 2016-07-18 14:51:53 +02:00
  • 473e70e8be Fix compilation of matrix exponential Gael Guennebaud 2016-07-18 14:51:44 +02:00
  • 80e72a2653 Fix warning and remove checking of empty matrices (not supported by 3.2) Gael Guennebaud 2016-07-18 13:59:43 +02:00
  • 201a317912 Fix compilation with MSVC Gael Guennebaud 2016-07-18 10:40:14 +02:00
  • 2a3680da3d Backport numerical robustness fixes from 3.3 branch Gael Guennebaud 2016-07-11 22:48:52 +02:00
  • 4f7baefa81 bug #1017: apply Christoph's patch preventing underflows in makeHouseholder (grafted from 476beed7f8 ) Gael Guennebaud 2015-06-22 16:51:45 +02:00
  • 38b9ff8b6f Backport some cmake hacks - This fixes Ninja generator. Gael Guennebaud 2016-07-01 09:46:57 +02:00
  • 87112908be Bug 1242: fix comma init with empty matrices. (grafted from a3f7edf7e7 ) Gael Guennebaud 2016-06-23 10:25:04 +02:00
  • d5c2a01031 Add missing explicit scalar conversion (grafted from 4c61f00838 ) Gael Guennebaud 2016-06-12 22:42:13 +02:00
  • 4c8f0cbc1f Fixes for PARDISO: warnings, and defaults to metis+ in-core mode. Gael Guennebaud 2016-06-08 18:31:19 +02:00
  • 538bc98b33 Fix extraction of complex eigenvalue pairs in real generalized eigenvalue problems. (grafted from 9fc8379328 ) Gael Guennebaud 2016-06-08 16:39:11 +02:00
  • 29f5f098cc Homogeneous vectors could not be accessed with single index. Added a regression test. Christoph Hertzberg 2016-06-08 15:35:31 +02:00
  • c21f2cde34 bug #1238: fix SparseMatrix::sum() overload for un-compressed mode. Gael Guennebaud 2016-05-31 10:56:53 +02:00
  • 909747d6b2 bug #1236: fix possible integer overflow in density estimation. (grafted from e8cef383b7 ) Gael Guennebaud 2016-05-26 17:51:04 +02:00
  • 1cff196837 Fix compilation of SPlines module (grafted from bd6eca059d ) Gael Guennebaud 2014-02-17 10:00:38 +01:00
  • 15f273b63c fix reshape flag and test case yoco 2014-02-10 22:49:13 +08:00
  • b64a09acc1 fix reshape's Max[Row/Col]AtCompileTime yoco 2014-02-04 05:54:50 +08:00
  • f8ad87f226 Reshape always non-directly-access yoco 2014-02-04 05:19:56 +08:00
  • 515bbf8bb2 Improve reshape test case yoco 2014-02-04 02:50:23 +08:00
  • 009047db27 Fix Reshape traits flag calculate bug yoco 2014-02-04 02:21:41 +08:00
  • 4ecd782c31 Fixed issue #734 (thanks to Philipp Büttgenbach for reporting the issue and proposing a fix). Kept ColMajor layout if possible in order to keep derivatives of the same order adjacent in memory. (grafted from e722f36ffa ) Hauke Heibel 2014-02-01 20:49:48 +01:00
  • 84a65f996f bug #1221: disable gcc 6 warning: ignoring attributes on template argument Gael Guennebaud 2016-05-19 15:21:53 +02:00
  • 17c40e5524 bug #1222: fix compilation in AutoDiffScalar and add respective unit test (grafted from 448d9d943c ) Gael Guennebaud 2016-05-18 16:00:11 +02:00
  • 51f763eaba bug #1213: backport "Give names to anonymous enums" to workaround gcc linking issues. Gael Guennebaud 2016-05-18 13:32:35 +02:00
  • f5e01a2cde Workaround a division by zero when outerstride==0 Gael Guennebaud 2016-04-13 19:02:02 +02:00
  • 8d16e2aa27 Fix detection of same matrices for expressions not handled by extract_data Gael Guennebaud 2016-04-13 18:40:02 +02:00
  • 547a3c0d28 Add StorageIndex type to ease the transition to 3.3. Gael Guennebaud 2016-04-13 15:09:39 +02:00
  • a432b017fb bug #1200: backport aligned_allocator from 3.3 Gael Guennebaud 2016-04-13 14:56:49 +02:00
  • b4669f9036 Fix cross-compiling windows version detection (grafted from 2b457f8e5e ) Gael Guennebaud 2016-04-04 11:47:46 +02:00
  • 4854326ae8 Fix usage of nesting type in blas_traits. In practice, this fixes compilation of expressions such as A*(A*A)^T where a product is hidden behind an expression supported by blas-traits. Gael Guennebaud 2016-03-29 22:39:12 +02:00