Commit Graph

  • f933f69021 Added a few comments Benoit Steiner 2016-02-03 14:12:18 -08:00
  • 5d82e47ef6 Properly disable nvcc warning messages in user code. Benoit Steiner 2016-02-03 14:10:06 -08:00
  • af8436b196 Silenced the "calling a __host__ function from a __host__ __device__ function is not allowed" messages Benoit Steiner 2016-02-03 13:48:36 -08:00
  • d7742d22e4 Revert the nvcc messages to their default severity instead of forcing them to be warnings Benoit Steiner 2016-02-03 13:47:28 -08:00
  • ac26e1aaf3 Pulled latest updates from trunk Benoit Steiner 2016-02-03 12:52:20 -08:00
  • 492fe7ce02 Silenced some unhelpful warnings generated by nvcc. Benoit Steiner 2016-02-03 12:51:19 -08:00
  • b70db60e4d Merged in rmlarsen/eigen (pull request PR-161) Gael Guennebaud 2016-02-03 21:37:06 +01:00
  • 5fb04ab2da Fix bad line break. Don't repeat Kahan matrix test since it is deterministic. Rasmus Munk Larsen 2016-02-03 10:12:10 -08:00
  • d9a6f86cc0 Make the array of directly computed column norms a member to avoid allocation in computeInPlace. Rasmus Munk Larsen 2016-02-03 09:55:30 -08:00
  • 70dc14e4e1 bug #1161: fix division by zero for huge scalar types Gael Guennebaud 2016-02-03 18:25:41 +01:00
  • c301f99208 bug #1164: fix list and deque specializations such that our aligned allocator is automatically activated only when the user did not specify an allocator (or specified the default std::allocator). Damien R 2016-02-03 18:07:25 +01:00
  • eb6d9aea0e Clarify error message when writing to a read-only sparse-sub-matrix. Gael Guennebaud 2016-02-03 16:58:23 +01:00
  • 040cf33e8f merge Gael Guennebaud 2016-02-03 16:09:51 +01:00
  • c85fbfd0b7 Clarify documentation on the restrictions of writable sparse block expressions. Gael Guennebaud 2016-02-03 16:08:43 +01:00
  • dc413dbe8a Merged in ville-k/eigen/explicit_long_constructors (pull request PR-158) Benoit Steiner 2016-02-02 20:58:06 -08:00
  • 783018d8f6 Use EIGEN_STATIC_ASSERT for backward compatibility. Ville Kallioniemi 2016-02-02 16:45:12 -07:00
  • 99cde88341 Don't try to use direct offsets when computing a tensor product, since the required stride isn't available. Benoit Steiner 2016-02-02 11:06:53 -08:00
  • ff0a83aaf8 Use single template constructor to avoid overload resolution issues. Ville Kallioniemi 2016-02-02 00:33:25 -07:00
  • aedea349aa Replace separate low word constructors with a single templated constructor. Ville Kallioniemi 2016-02-01 20:25:02 -07:00
  • f0fdefa96f Rebase to latest. Ville Kallioniemi 2016-02-01 19:32:31 -07:00
  • d93b71a301 Updated the packetmath test to call predux_half instead of predux4 Benoit Steiner 2016-02-01 15:18:33 -08:00
  • ef66f2887b Updated the matrix multiplication code to make it compile with AVX512 enabled. Benoit Steiner 2016-02-01 14:38:05 -08:00
  • 85b6d82b49 Generalized predux4 to support AVX512 packets, and renamed it predux_half. Disabled the implementation of pabs for avx512 since the corresponding intrinsics are not shipped with gcc Benoit Steiner 2016-02-01 14:35:51 -08:00
  • 64ce78c2ec Cleaned up a tensor contraction test Benoit Steiner 2016-02-01 13:57:41 -08:00
  • 0ce5d32be5 Sharded the cxx11_tensor_contract_cuda test Benoit Steiner 2016-02-01 13:33:23 -08:00
  • 922b5f527b Silenced a few compilation warnings Benoit Steiner 2016-02-01 13:30:49 -08:00
  • 6b5dff875e Made it possible to limit the number of blocks that will be used to evaluate a tensor expression on a CUDA device. This makes it possible to set aside streaming multiprocessors for other computations. Benoit Steiner 2016-02-01 12:46:32 -08:00
  • 00f9ef6c76 merging. Rasmus Munk Larsen 2016-02-01 11:10:30 -08:00
  • 264f8141f8 Sharded the tensor reduction test Benoit Steiner 2016-02-01 07:44:31 -08:00
  • 11bb71c8fc Sharded the tensor device test Benoit Steiner 2016-02-01 07:34:59 -08:00
  • ff1157bcbf bug #694: document that SparseQR::matrixR is not sorted. Gael Guennebaud 2016-02-01 16:09:34 +01:00
  • ec469700dc bug #557: make InnerIterator of sparse storage types more versatile by adding default-ctor, copy-ctor/assignment Gael Guennebaud 2016-02-01 15:04:33 +01:00
  • 6e0a86194c Fix integer path for num_steps==1 Gael Guennebaud 2016-02-01 15:00:04 +01:00
  • e1d219e5c9 bug #698: fix linspaced for integer types. Gael Guennebaud 2016-02-01 14:25:34 +01:00
  • 2c3224924b Fix warning and replace min/max macros by calls to mini/maxi Gael Guennebaud 2016-02-01 10:23:45 +01:00
  • e80ed948e1 Fixed a number of compilation warnings generated by the cuda tests Benoit Steiner 2016-01-31 20:09:41 -08:00
  • 6720b38fbf Fixed a few compilation warnings Benoit Steiner 2016-01-31 16:48:50 -08:00
  • 3f1ee45833 Fixed compilation errors triggered by duplicate inline declaration Benoit Steiner 2016-01-31 10:48:49 -08:00
  • 70be6f6531 Pulled latest changes from trunk Benoit Steiner 2016-01-31 10:44:45 -08:00
  • 4a2ddfb81d Sharded the CUDA argmax tensor test Benoit Steiner 2016-01-31 10:44:15 -08:00
  • d142165942 bug #667: declare several critical functions as FORCE_INLINE to make ICC happier. Gael Guennebaud 2016-01-31 16:34:10 +01:00
  • a4e4542b89 Avoid overflow in unit test. Gael Guennebaud 2016-01-30 22:26:17 +01:00
  • 3ba8a3ab1a Disable underflow unit test on the i387 FPU. Gael Guennebaud 2016-01-30 22:14:04 +01:00
  • 483082ef6e Fixed a few memory leaks in the cuda tests Benoit Steiner 2016-01-30 11:59:22 -08:00
  • bd21aba181 Sharded the cxx11_tensor_cuda test and fixed a memory leak Benoit Steiner 2016-01-30 11:47:09 -08:00
  • 9de155d153 Added a test to cover threaded tensor shuffling Benoit Steiner 2016-01-30 10:56:47 -08:00
  • 32088c06a1 Made the comparison between single and multithreaded contraction results more resistant to numerical noise to prevent spurious test failures. Benoit Steiner 2016-01-30 10:51:14 -08:00
  • 2053478c56 Made sure to use a tensor of rank 0 to store the result of a full reduction in the tensor thread pool test Benoit Steiner 2016-01-30 10:46:36 -08:00
  • d0db95f730 Sharded the tensor thread pool test Benoit Steiner 2016-01-30 10:43:57 -08:00
  • ba27c8a7de Made the CUDA contract test more robust to numerical noise. Benoit Steiner 2016-01-30 10:28:43 -08:00
  • 4281eb1e2c Added 2 benchmarks to the suite of tensor benchmarks running on GPU Benoit Steiner 2016-01-30 10:20:43 -08:00
  • 102fa96a96 Extend doc on dense+sparse Gael Guennebaud 2016-01-30 14:58:21 +01:00
  • 1bc207c528 backout changeset d4a9e61569 : the extended SparseView is not needed anymore Gael Guennebaud 2016-01-30 14:43:21 +01:00
  • 8ed1553d20 bug #632: implement general coefficient-wise "dense op sparse" operations through specialized evaluators instead of using SparseView. This makes it possible to deal with arbitrary storage order, and to bypass the more complex iterator of the sparse-sparse case. Gael Guennebaud 2016-01-30 14:39:50 +01:00
  • 699634890a bug #946: generalize Cholmod::solve to handle any rhs expression Gael Guennebaud 2016-01-29 23:02:22 +01:00
  • 15084cf1ac bug #632: add support for "dense +/- sparse" operations. The current implementation is based on SparseView to make the dense subexpression compatible with the sparse one. Gael Guennebaud 2016-01-29 22:09:45 +01:00
  • d4a9e61569 Extend SparseView to allow keeping explicit zeros. This is equivalent to sparseView(1,-1) but faster because the test is removed at compile-time. Gael Guennebaud 2016-01-29 22:07:56 +01:00
  • d8d37349c3 bug #696: enable zero-sized block at compile-time by relaxing the respective assertion Gael Guennebaud 2016-01-29 12:44:49 +01:00
  • e8ccc06fe5 merge Gael Guennebaud 2016-01-29 09:40:38 +01:00
  • 963f2d2a8f Marked several methods EIGEN_DEVICE_FUNC Benoit Steiner 2016-01-28 23:37:48 -08:00
  • c5d25bf1d0 Fixed a couple of compilation warnings. Benoit Steiner 2016-01-28 23:15:45 -08:00
  • e4f83bae5d Fixed the tensor benchmarks on apple devices Benoit Steiner 2016-01-28 21:08:07 -08:00
  • 10bea90c4a Fixed clang related compilation error Benoit Steiner 2016-01-28 20:52:08 -08:00
  • d3f533b395 Fixed compilation warning Benoit Steiner 2016-01-28 20:09:45 -08:00
  • 3fde202215 Making ceil() functor generic w.r.t packet type Abhijit Kundu 2016-01-28 21:27:00 -05:00
  • 211d350fc3 Fixed a typo Benoit Steiner 2016-01-28 17:13:04 -08:00
  • bd2e5a788a Made sure the number of floating point operations done by a benchmark is computed using 64 bit integers to avoid overflows. Benoit Steiner 2016-01-28 17:10:40 -08:00
  • 120e13b1b6 Added a readme to explain how to compile the tensor benchmarks. Benoit Steiner 2016-01-28 17:06:00 -08:00
  • a68864b6bc Updated the benchmarking code to print the number of flops processed instead of the number of bytes. Benoit Steiner 2016-01-28 16:51:40 -08:00
  • 8217281ae4 Merge latest updates from trunk Benoit Steiner 2016-01-28 16:20:53 -08:00
  • c8d5f21941 Added extra tensor benchmarks Benoit Steiner 2016-01-28 16:20:36 -08:00
  • 7b3044d086 Made sure to call nvcc with the relaxed-constexpr flag. Benoit Steiner 2016-01-28 15:36:34 -08:00
  • acce4dd050 Change Eigen's ColPivHouseholderQR to use the numerically stable norm downdate formula from http://www.netlib.org/lapack/lawnspdf/lawn176.pdf, which has been used in LAPACK's xGEQPF and xGEQP3 since 2006. With the old formula, the code chooses the wrong pivots and fails to correctly determine rank on graded matrices. Rasmus Munk Larsen 2016-01-28 15:07:26 -08:00
  • b908e071a8 bug #178: get rid of some const_cast in SparseCore Gael Guennebaud 2016-01-28 22:11:18 +01:00
  • c1d900af61 bug #178: remove additional const on nested expression, and remove several const_cast. Gael Guennebaud 2016-01-28 21:43:20 +01:00
  • 12f8bd12a2 Merged in jiayq/eigen (pull request PR-159) Benoit Steiner 2016-01-28 11:28:55 -08:00
  • 270c4e1ecd bugfix Yangqing Jia 2016-01-28 11:11:45 -08:00
  • c4e47630b1 benchmark modifications to make it compilable in a standalone fashion. Yangqing Jia 2016-01-28 10:35:14 -08:00
  • f50bb1e6f3 Fix compilation with gcc Gael Guennebaud 2016-01-28 13:25:26 +01:00
  • ddf64babde merge Gael Guennebaud 2016-01-28 13:21:48 +01:00
  • df15fbc452 bug #1158: PartialReduxExpr is a vector expression, and it thus must expose the LinearAccessBit flag Gael Guennebaud 2016-01-28 13:16:30 +01:00
  • 9bcadb7fd1 Disable stupid MSVC warning Gael Guennebaud 2016-01-28 12:14:16 +01:00
  • b4d87fff4a Fix MSVC warning. Gael Guennebaud 2016-01-28 12:12:30 +01:00
  • 2bad3e78d9 bug #96, bug #1006: fix by value argument in result_of. Gael Guennebaud 2016-01-28 12:12:06 +01:00
  • 7802a6bb1c Fix unit test filename. Gael Guennebaud 2016-01-28 09:35:37 +01:00
  • 4bf9eaf77a Deleted an invalid assertion that prevented the assignment of empty tensors. Benoit Steiner 2016-01-27 17:09:30 -08:00
  • 291069e885 Fixed some compilation problems with nvcc + clang Benoit Steiner 2016-01-27 15:37:03 -08:00
  • 47ca9dc809 Fixed the tensor_cuda test Benoit Steiner 2016-01-27 14:58:48 -08:00
  • 55a5204319 Fixed the flags passed to nvcc to compile the tensor code. Benoit Steiner 2016-01-27 14:46:34 -08:00
  • 4865e1e732 Update link to suitesparse. Gael Guennebaud 2016-01-27 22:48:40 +01:00
  • 9dfbd4fe8d Made the cuda tests compile using make check Benoit Steiner 2016-01-27 12:22:17 -08:00
  • 5973bcf939 Properly specify the namespace when calling cout/endl Benoit Steiner 2016-01-27 12:04:42 -08:00
  • c8d94ae944 digamma special function: merge shared code. Eugene Brevdo 2016-01-27 09:52:29 -08:00
  • 9c8f7dfe94 bug #1156: fix several function declarations whose arguments were passed by value instead of being passed by reference Gael Guennebaud 2016-01-27 18:34:42 +01:00
  • 9aa6fae123 bug #1154: move to dynamic scheduling for spmv products. Gael Guennebaud 2016-01-27 18:03:51 +01:00
  • 9ac8e8c6a1 Extend mixing type unit test with trmv, and the following not yet supported products: trmm, symv, symm Gael Guennebaud 2016-01-27 17:29:53 +01:00
  • 6da5d87f92 add nomalloc unit test for rank2 updates Gael Guennebaud 2016-01-27 17:26:48 +01:00
  • 9801c959e6 Fix tri = complex * real product, and add respective unit test. Gael Guennebaud 2016-01-27 17:12:25 +01:00
  • 21b5345782 Add meta_least_common_multiple helper. Gael Guennebaud 2016-01-27 17:11:39 +01:00
  • fecea26d93 Extend doc on shifting strategy Gael Guennebaud 2016-01-27 15:55:15 +01:00