Commit Graph

  • a9df28c95b SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrarily large amount of memory was preallocated in some cases) Gael Guennebaud 2015-03-13 21:00:21 +01:00
  • 5ffe29cb9f Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible. Gael Guennebaud 2015-03-13 20:57:33 +01:00
  • d73ccd717e Add support for dumping blocking sizes tables Benoit Jacob 2015-03-13 10:36:01 -07:00
  • 2f6f8bf31c Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests. Gael Guennebaud 2015-03-13 16:24:40 +01:00
  • f2c3e2b10f Add --only-cubic-sizes option to analyze-blocking-sizes tool Benoit Jacob 2015-03-12 13:16:33 -07:00
  • 657407227e Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of doubles instead of swapping the doubles. Doug Kwan 2015-03-11 15:13:37 -07:00
  • f89fcefa79 Add hyperbolic trigonometric functions from std array support Deanna Hood 2015-03-11 13:13:30 +10:00
  • a5e49976f5 Add log10 array support Deanna Hood 2015-03-11 08:56:42 +10:00
  • 19a71056ae Allow calling of square(array) in addition to array.square() Deanna Hood 2015-03-11 06:59:28 +10:00
  • 31fdd67756 Additional unary coeff-wise functors (isnan, round, arg, e.g.) Deanna Hood 2015-03-11 06:39:23 +10:00
  • fd78874888 Fix compilation of iterative solvers with dense matrices Gael Guennebaud 2015-03-09 21:31:03 +01:00
  • d4317a85e8 Add typedefs for return types of SparseMatrixBase::selfadjointView Gael Guennebaud 2015-03-09 21:29:46 +01:00
  • 9e885fb766 Add unit tests for CG and sparse-LLT for long int as storage-index Gael Guennebaud 2015-03-09 14:33:15 +01:00
  • 224a1fe4c6 bug #963: make IncompleteLUT compatible with non-default storage index types. Gael Guennebaud 2015-03-09 13:55:20 +01:00
  • cf9940e17b Make sparse unit-test helpers aware of StorageIndex Gael Guennebaud 2015-03-09 13:54:05 +01:00
  • 39228cb224 deserialization assumed benchmarks in same order, but we shuffle them. Benoit Jacob 2015-03-06 19:29:01 -05:00
  • a4f956b1da merge Benoit Jacob 2015-03-06 19:13:36 -05:00
  • 19bf13aa62 Automatically serialize partial results to disk, reboot, and resume, when timings are getting bad Benoit Jacob 2015-03-06 19:11:50 -05:00
  • 0ee391863e Avoid underflow when blocking sizes are tuned manually. Gael Guennebaud 2015-03-06 21:51:09 +01:00
  • 14a5f135a3 bug #969: workaround ambiguous calls to Ref using enable_if. Gael Guennebaud 2015-03-06 17:51:31 +01:00
  • d23fcc0672 bug #978: add unit test for zero-sized products Gael Guennebaud 2015-03-06 16:12:08 +01:00
  • 87681e508f bug #978: early return for vanishing products Gael Guennebaud 2015-03-06 16:11:22 +01:00
  • 4c8eeeaed6 update gemm changeset list Gael Guennebaud 2015-03-06 15:08:20 +01:00
  • cd3bbffa73 Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (allows keeping the packed rhs in L1) Gael Guennebaud 2015-03-06 14:31:39 +01:00
  • eedd5063fd Update gemm performance monitoring tool: - permit to recompute a subset of changesets - update changeset list - add a few more cases Gael Guennebaud 2015-03-06 11:47:13 +01:00
  • 58740ce4c6 Improve product kernel: replace the previous dynamic loop swapping strategy by a more general one: It consists in increasing the actual number of rows of lhs's micro horizontal panel for small depth such that L1 cache is fully exploited. Gael Guennebaud 2015-03-06 10:30:35 +01:00
  • 4ab01f7c21 slightly increase tolerance to clock speed variation Benoit Jacob 2015-03-05 14:41:16 -05:00
  • 5db2baa573 Make benchmark-blocking-sizes detect changes to clock speed and be resilient to that. Benoit Jacob 2015-03-05 13:44:20 -05:00
  • 4c8b95d5c5 Rename LSCG to LeastSquaresConjugateGradient Gael Guennebaud 2015-03-05 10:16:32 +01:00
  • 7550107028 Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large" Gael Guennebaud 2015-03-05 10:03:46 +01:00
  • 2dc968e453 bug #824: improve accuracy of Quaternion::angularDistance using atan2 instead of acos. Gael Guennebaud 2015-03-04 17:03:13 +01:00
  • 2231b3dece output to cout, not cerr, the actual results Benoit Jacob 2015-03-04 09:45:12 -05:00
  • 00ea121881 Complete the tool to analyze the efficiency of default sizes. Benoit Jacob 2015-03-04 09:30:56 -05:00
  • 0196141938 Fixed the optimized AVX implementation of the fast rsqrt function Benoit Steiner 2015-03-02 13:49:39 -08:00
  • b0f2b6f297 Updated the tensor type casting code as follows: in the case where TgtRatio < SrcRatio, disable the vectorization of the source expression unless it has direct-access. Benoit Steiner 2015-03-02 10:11:40 -08:00
  • d9cb604a5d Disabled the use of aligned memory loads when converting a tensor from float to doubles since alignment can't always be guaranteed. Benoit Steiner 2015-03-02 09:41:36 -08:00
  • 4fd7f47692 Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined. Benoit Steiner 2015-03-02 09:38:47 -08:00
  • ae73859a0a Fixed incorrect assertion Benoit Steiner 2015-02-28 08:02:02 -08:00
  • 131449298f Fixed clang compilation warning Benoit Steiner 2015-02-28 03:01:19 -08:00
  • 56ea45ff0f Silenced some compilation warnings Benoit Steiner 2015-02-28 02:37:41 -08:00
  • bb483313f6 Fixed another batch of compilation warnings Benoit Steiner 2015-02-28 02:32:46 -08:00
  • fb53384b0f Improved the default implementation of prsqrt Benoit Steiner 2015-02-28 01:51:26 -08:00
  • 61409d9449 Silenced one more compilation warning Benoit Steiner 2015-02-28 01:49:09 -08:00
  • 1a7b84dc75 Silenced a few compilation warnings Benoit Steiner 2015-02-28 01:45:15 -08:00
  • 37357a310f Fixed compilation warnings Benoit Steiner 2015-02-27 23:54:24 -08:00
  • cf1eea11de Fixed compilation warnings Benoit Steiner 2015-02-27 23:52:02 -08:00
  • 78732186ee Fixed compilation warnings Benoit Steiner 2015-02-27 23:51:16 -08:00
  • 4250a0cab0 Fixed compilation warnings Benoit Steiner 2015-02-27 21:59:10 -08:00
  • a4e37b0617 Reverted the README Benoit Steiner 2015-02-27 13:09:49 -08:00
  • 306fceccbe Pulled latest updates from trunk Benoit Steiner 2015-02-27 13:05:26 -08:00
  • 75e7f381c8 Pulled latest updates from trunk Benoit Steiner 2015-02-27 12:57:55 -08:00
  • 2386fc8528 Added support for 32bit index on a per tensor/tensor expression. This enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary. Benoit Steiner 2015-02-27 12:57:13 -08:00
  • e1f6a45b14 README.md edited online with Bitbucket Benoit Steiner 2015-02-27 20:44:24 +00:00
  • 90893bbe18 README.md edited online with Bitbucket Benoit Steiner 2015-02-27 20:44:10 +00:00
  • 473e6d4c3d README.md edited online with Bitbucket Benoit Steiner 2015-02-27 20:41:45 +00:00
  • 4369538227 README.md edited online with Bitbucket Benoit Steiner 2015-02-27 20:41:33 +00:00
  • 99cfbd6e84 README.md edited online with Bitbucket Benoit Steiner 2015-02-27 20:41:14 +00:00
  • 05089aba75 Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts Benoit Steiner 2015-02-27 09:27:30 -08:00
  • 573b377110 Added support for vectorized type casting of tensors Benoit Steiner 2015-02-27 08:46:04 -08:00
  • f41b1f1666 Added support for fast reciprocal square root computation. Benoit Steiner 2015-02-26 09:42:41 -08:00
  • 168ceb271e Really use zero guess in ConjugateGradients::solve as documented and expected for consistency with other methods. Jan Blechta 2015-02-18 14:26:10 +01:00
  • 8fdcaded5e merge Gael Guennebaud 2015-03-04 10:18:08 +01:00
  • c43154bbc5 Check for no-reallocation in SparseMatrix::insert (bug #974) Gael Guennebaud 2015-03-04 10:16:46 +01:00
  • 1ce0178363 Improve efficiency of SparseMatrix::insert/coeffRef for sequential outer-index insertion strategies (bug #974) Gael Guennebaud 2015-03-04 09:39:26 +01:00
  • 3dca4a1efc Update manual wrt new LSCG solver. Gael Guennebaud 2015-03-04 09:35:30 +01:00
  • 05274219a7 Add a CG-based solver for rectangular least-square problems (bug #975). Gael Guennebaud 2015-03-04 09:34:27 +01:00
  • 2aa09e6b4e Fix asm comments in 1px1 kernel Benoit Jacob 2015-03-03 13:44:00 -05:00
  • 5d2fd64a1a Fixed compilation error when compiling with gcc4.7 Benoit Steiner 2015-03-03 08:56:49 -08:00
  • f64b4480af Add missing copyright notices Benoit Jacob 2015-03-03 11:43:56 -05:00
  • eae8e27b7d Add a benchmark-default-sizes action to benchmark-blocking-sizes.cpp Benoit Jacob 2015-03-03 11:41:21 -05:00
  • 37a93c4263 New scoring functor to select the pivot. This can be useful for non-floating point scalars, where choosing the biggest element is generally not the best choice. Marc Glisse 2015-03-03 17:08:28 +01:00
  • ccc1277a42 must also disable complex<double> when disabling double vectorization Benoit Jacob 2015-03-03 10:17:05 -05:00
  • f839099512 Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON intrinsics. Benoit Jacob 2015-03-03 09:35:22 -05:00
  • 9930e9583b Improve analyze-blocking-sizes, and in particular give it an evaluate-defaults tool that shows the efficiency of Eigen's default blocking sizes choices, using a previously computed table from benchmark-blocking-sizes. Benoit Jacob 2015-03-02 18:08:38 -05:00
  • 1ec0f4fadf HalfPacket also needed to be disabled for double, on ARMv8. Benoit Jacob 2015-03-02 16:08:54 -05:00
  • 3109f0e74e Add SSE vectorization of Quaternion::conjugate. Significant speed-up when combined with products like q1*q2.conjugate() Gael Guennebaud 2015-03-02 20:09:33 +01:00
  • ef09ce4552 Fix for TensorIO for Fixed sized Tensors. Abhijit Kundu 2015-02-28 21:30:31 -05:00
  • 3a4b6827b4 Merged eigen/eigen into default Abhijit Kundu 2015-02-28 20:15:28 -05:00
  • 31e2ffe82c Replaced POSIX random() by internal::random Christoph Hertzberg 2015-02-28 18:39:37 +01:00
  • 73dd95e7b0 Use @CMAKE_MAKE_PROGRAM@ instead of make in buildtests.sh Christoph Hertzberg 2015-02-28 16:51:53 +01:00
  • 682196e9fc Fixed MPRealSupport Christoph Hertzberg 2015-02-28 16:41:00 +01:00
  • 33f40b2883 Cygwin does not like weak linking either. Christoph Hertzberg 2015-02-28 14:53:11 +01:00
  • 0f82a1d7b7 bug #967: Automatically add cxx11 suffix when building in C++11 mode Christoph Hertzberg 2015-02-28 14:52:26 +01:00
  • 9aee1e300a Increase unit-test L1 cache size to ensure we are doing at least 2 peeled loops within product kernel. Gael Guennebaud 2015-02-27 22:55:12 +01:00
  • b10cd3afd2 Re-enable detection of min/max parentheses protection, and re-enable mpreal_support unit test. Gael Guennebaud 2015-02-27 22:38:00 +01:00
  • 6466fa63be Reimplement the selection between rotating and non-rotating kernels using templates instead of macros and if()'s. That was needed to fix the build of unit tests on ARM, which I had broken. My bad for not testing earlier. Benoit Jacob 2015-02-27 15:30:10 -05:00
  • bf9877a92a Pulled latest updates from trunk Benoit Steiner 2015-02-27 09:23:22 -08:00
  • 90f4e90f1d Fixed off-by-one error that prevented the evaluation of small tensor expressions from being vectorized Benoit Steiner 2015-02-27 09:22:37 -08:00
  • 2fc3b484d7 remove trailing comma Benoit Jacob 2015-02-27 11:37:45 -05:00
  • 33669348c4 Disable Packet2f/2i halfpacket support in NEON. I believe that it was erroneously turned on, since Packet2f/2i intrinsics are unimplemented, and code trying to use halfpackets just fails to compile on NEON, as it tries to use the default implementation of pload/pstore and the types don't match. Benoit Jacob 2015-02-27 11:35:37 -05:00
  • f5ff4d826f Fix NEON build flags: in the current NDK, at least with the clang-3.5 toolchain, -mfpu=neon is not enough to activate NEON, since it's incompatible with the default float ABI, and I have to pass -mfloat-abi=softfp (which is what everyone does in practice). In fact, it would be a good idea to pass -mfloat-abi=softfp all the time, regardless of NEON. Also removing the -mcpu=cortex-a8, as 1) it's not needed and 2) if we really wanted to pass a specific -mcpu flag, that would presumably be to tune performance for benchmarks, and it would then not really make sense to tune for the very old cortex-a8 (it reflects ARM CPUs from 5 years ago). Benoit Jacob 2015-02-27 10:56:50 -05:00
  • b7fc8746e0 Replace a static assert by a runtime one, fixes the build of unit tests on ARM. Also safely assert in the non-implemented path that should never be taken in practice, and would return wrong results. Benoit Jacob 2015-02-27 10:01:59 -05:00
  • 4084dce038 Added CMake support for Tensor module. CMake now installs CXX11 Tensor module like the rest of the unsupported modules Abhijit Kundu 2015-02-26 16:50:09 -05:00
  • f074bb4b5f Fixed another compilation problem with TensorIntDiv.h Benoit Steiner 2015-02-26 11:14:23 -08:00
  • 57154fdb32 Can now use the tensor 'reverse' operation as a lvalue Benoit Steiner 2015-02-26 11:13:42 -08:00
  • 2fffe69b1b Added missing copy constructor Benoit Steiner 2015-02-26 09:27:53 -08:00
  • bcf9bb5c1f Avoid packing rhs multiple-times when blocking on the lhs only. Gael Guennebaud 2015-02-26 17:01:33 +01:00
  • 4ec3f04b3a Make sure that the block size computation is tested by our unit test. Gael Guennebaud 2015-02-26 17:00:36 +01:00
  • 2e9cb06a87 Update changeset list to be checked by perf_monitoring/gemm. Gael Guennebaud 2015-02-26 16:13:33 +01:00
  • a46061ab7b Make perf_monitoring/gemm script more flexible: - skip existing dataset - add a "-up" option to recompute the dataset (see script header) - allow to specify a filename prefix Gael Guennebaud 2015-02-26 16:12:58 +01:00