Commit Graph

  • 33ca7e3c8d bug #1207: Add and fix logical-op warnings Christoph Hertzberg 2016-05-11 19:36:34 +02:00
  • a11bd82dc3 bug #1213: Give names to anonymous enums Christoph Hertzberg 2016-05-06 11:31:56 +02:00
  • 217d984abc Fixed a typo in my previous commit Benoit Steiner 2016-05-11 10:22:15 -07:00
  • 08348b4e48 Fix potential race condition in the CUDA reduction code. Benoit Steiner 2016-05-11 10:08:51 -07:00
  • cbb14ed47e Added a few tests to validate the generation of random tensors on GPU. Benoit Steiner 2016-05-11 10:05:56 -07:00
  • 6a5717dc74 Explicitly initialize all the atomic variables. Benoit Steiner 2016-05-11 10:04:41 -07:00
  • 0f61343893 Workaround maybe-uninitialized warning Christoph Hertzberg 2016-05-11 09:00:18 +02:00
  • 3bfc9b47ca Workaround "misleading-indentation" warnings Christoph Hertzberg 2016-05-11 08:41:36 +02:00
  • 4ede059de1 Properly gate the use of half2. Benoit Steiner 2016-05-10 17:04:01 -07:00
  • bf185c3c28 Extended the tests for ptanh Benoit Steiner 2016-05-10 16:21:43 -07:00
  • 661e710092 Added support for fp16 to the sigmoid functor. Benoit Steiner 2016-05-10 12:25:27 -07:00
  • 0eb69b7552 Small improvement to the full reduction of fp16 Benoit Steiner 2016-05-10 11:58:18 -07:00
  • 0b9e3dcd06 Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%. Benoit Steiner 2016-05-10 11:05:33 -07:00
  • 6bf8273bc0 Added a test to validate the new non blocking thread pool Benoit Steiner 2016-05-10 10:49:34 -07:00
  • 4013b8feca Simplified the reduction code a little. Benoit Steiner 2016-05-10 09:40:42 -07:00
  • 75bd2bd32d Fixed compilation warning Benoit Steiner 2016-05-09 19:24:41 -07:00
  • 4670d7d5ce Improved the performance of full reductions on GPU: Benoit Steiner 2016-05-09 17:09:54 -07:00
  • c3859a2b58 Added the ability to use a scratch buffer in cuda kernels Benoit Steiner 2016-05-09 17:05:53 -07:00
  • ba95e43ea2 Added a new parallelFor api to the thread pool device. Benoit Steiner 2016-05-09 10:45:12 -07:00
  • dc7dbc2df7 Optimized the non blocking thread pool: * Use a pseudo-random permutation of queue indices during random stealing. This ensures that all the queues are considered. * Directly pop from a non-empty queue when we are waiting for work, instead of first noticing that there is a non-empty queue and then doing another round of random stealing to re-discover the non-empty queue. * Steal only 1 task from a remote queue instead of half of tasks. Benoit Steiner 2016-05-09 10:17:17 -07:00
  • 05c365fb16 Pulled latest updates from trunk Benoit Steiner 2016-05-07 13:39:04 -07:00
  • 691614bd2c Worked around a bug in nvcc on tegra x1 Benoit Steiner 2016-05-07 13:28:53 -07:00
  • a2d94fc216 Merged latest updates from trunk Benoit Steiner 2016-05-06 19:17:57 -07:00
  • 8adf5cc70f Added support for packet processing of fp16 on kepler and maxwell gpus Benoit Steiner 2016-05-06 19:16:43 -07:00
  • 1660e749b4 Avoid double promotion Benoit Steiner 2016-05-06 08:15:12 -07:00
  • c54ae65c83 Marked a few tensor operations as read only Benoit Steiner 2016-05-05 17:18:47 -07:00
  • 69a8a4e1f3 Added a test to validate full reduction on tensor of half floats Benoit Steiner 2016-05-05 16:52:50 -07:00
  • 678a17ba79 Made the testing of contractions on fp16 more robust Benoit Steiner 2016-05-05 16:36:39 -07:00
  • e3d053e14e Refined the testing of log and exp on fp16 Benoit Steiner 2016-05-05 16:24:15 -07:00
  • 9a48688d37 Further improved the testing of fp16 Benoit Steiner 2016-05-05 15:58:05 -07:00
  • 0451940fa4 Relaxed the dummy precision for fp16 Benoit Steiner 2016-05-05 15:40:01 -07:00
  • 910e013506 Relaxed an assertion that was tighter than necessary. Benoit Steiner 2016-05-05 15:38:16 -07:00
  • f81e413180 Added a benchmark to measure the performance of full reductions of 16 bit floats Benoit Steiner 2016-05-05 14:15:11 -07:00
  • 28d5572658 Fixed some incorrect assertions Benoit Steiner 2016-05-05 10:02:26 -07:00
  • 2aba40d208 Avoid unnecessary type promotion Benoit Steiner 2016-05-05 09:26:57 -07:00
  • a4d6e8fef0 Strongly hint but don't force the compiler to unroll some loops in the tensor executor. This results in up to 27% faster code. Benoit Steiner 2016-05-05 09:25:55 -07:00
  • 7875437ca0 Avoided unnecessary type promotion Benoit Steiner 2016-05-05 09:08:42 -07:00
  • f363e533aa Added tests for full contractions using thread pools and gpu devices. Fixed a couple of issues in the corresponding code. Benoit Steiner 2016-05-05 09:05:45 -07:00
  • 06d774bf58 Updated the contraction code to ensure that full contractions return a tensor of rank 0 Benoit Steiner 2016-05-05 08:37:47 -07:00
  • b300a84989 Fixed some signed/unsigned comparison warnings Christoph Hertzberg 2016-05-05 13:36:28 +02:00
  • dacb469bc9 Enable and fix -Wdouble-conversion warnings Christoph Hertzberg 2016-05-05 13:35:45 +02:00
  • 62b710072e Reduced the memory footprint of the cxx11_tensor_image_patch test Benoit Steiner 2016-05-04 21:08:22 -07:00
  • dd2b45feed Removed extraneous 'explicit' keywords Benoit Steiner 2016-05-04 16:57:52 -07:00
  • be78aea6b3 fix double-promotion/float-conversion in Core/SpecialFunctions.h Ola Røer Thorsen 2016-05-04 10:52:08 +02:00
  • 75a94b9662 Improve documentation of BDCSVD Gael Guennebaud 2016-05-04 12:53:14 +02:00
  • 968ec1c2ae Use numext::isfinite instead of std::isfinite Benoit Steiner 2016-05-03 19:56:40 -07:00
  • e2ca478485 bug #1214: consider denormals as zero in D&C SVD. This also works around an infinite binary search when compiling with ICC's unsafe optimizations. Gael Guennebaud 2016-05-03 23:15:29 +02:00
  • f899e08946 Enabled a number of tests previously disabled by mistake Benoit Steiner 2016-05-03 14:07:47 -07:00
  • 4c05fb03a3 Merged eigen/eigen into default Benoit Steiner 2016-05-03 13:15:00 -07:00
  • 577a07a86e Re-enabled the product_small test now that everything compiles correctly. Benoit Steiner 2016-05-03 13:11:38 -07:00
  • 2c5568a757 Added a test to validate the computation of exp and log on 16bit floats Benoit Steiner 2016-05-03 12:06:07 -07:00
  • 6c3e5b85bc Fixed compilation error with cuda >= 7.5 Benoit Steiner 2016-05-03 09:38:42 -07:00
  • aad9a04da4 Deleted superfluous explicit keyword. Benoit Steiner 2016-05-03 09:37:19 -07:00
  • da50419df8 Made a cast explicit Benoit Steiner 2016-05-02 19:50:22 -07:00
  • 73ef5371e4 Pulled latest updates from trunk Benoit Steiner 2016-05-01 14:48:57 -07:00
  • 8a9228ed9b Fixed compilation error Benoit Steiner 2016-05-01 14:48:01 -07:00
  • b1bd53aa6b Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed size assignment. Gael Guennebaud 2016-05-01 23:25:06 +02:00
  • d6c9596fd8 Added missing accessors to fixed sized tensors Benoit Steiner 2016-04-29 18:51:33 -07:00
  • 17fe7f354e Deleted trailing commas Benoit Steiner 2016-04-29 18:39:01 -07:00
  • e5f71aa6b2 Deleted useless trailing commas Benoit Steiner 2016-04-29 18:36:10 -07:00
  • 44f592dceb Deleted unnecessary trailing commas. Benoit Steiner 2016-04-29 18:33:46 -07:00
  • 2b890ae618 Fixed compilation errors generated by clang Benoit Steiner 2016-04-29 18:30:40 -07:00
  • d217217842 Added a few tests to ensure that the dimensions of rank 0 tensors are correctly computed Benoit Steiner 2016-04-29 18:15:34 -07:00
  • f100d1494c Return the proper size (i.e. 1) for tensors of rank 0 Benoit Steiner 2016-04-29 18:14:33 -07:00
  • d14105f158 Made several tensor tests compatible with cxx03 Benoit Steiner 2016-04-29 17:22:37 -07:00
  • c0882ef4d9 Moved a number of tensor tests that don't require cxx11 to work properly outside the EIGEN_TEST_CXX11 test section Benoit Steiner 2016-04-29 17:13:51 -07:00
  • 9d1dbd1ec0 Fixed the cxx11_tensor_empty test to compile without requiring cxx11 support Benoit Steiner 2016-04-29 16:53:55 -07:00
  • a8c0405cf5 Deleted unused default values for template parameters Benoit Steiner 2016-04-29 16:34:43 -07:00
  • 4f53178e62 Made a couple of tensor tests compile without requiring c++11 support. Benoit Steiner 2016-04-29 16:09:54 -07:00
  • 1131a984a6 Made the cxx11_tensor_forced_eval compile without c++11. Benoit Steiner 2016-04-29 15:48:59 -07:00
  • 46bcb70969 Don't turn on const expressions when compiling with gcc >= 4.8 unless the -std=c++11 option has been used Benoit Steiner 2016-04-29 15:20:59 -07:00
  • c07404f6a1 Restore Tensor support for non c++11 compilers Benoit Steiner 2016-04-29 15:19:19 -07:00
  • ba32ded021 Fixed include path Benoit Steiner 2016-04-29 15:11:09 -07:00
  • 3b8da4be5a Extended the packetmath test to cover all the alignments made possible by avx512 instructions. Benoit Steiner 2016-04-29 14:13:43 -07:00
  • 2f28ccbea3 Update the makefile to make the tests compile with gcc 4.9 Benoit Steiner 2016-04-29 14:11:09 -07:00
  • 7a4bd337d9 Resolved merge conflict Benoit Steiner 2016-04-29 13:42:22 -07:00
  • 07a247dcf4 Pulled latest updates from upstream Benoit Steiner 2016-04-29 13:41:26 -07:00
  • fa5a8f055a Implemented palign_impl for AVX512 Benoit Steiner 2016-04-29 13:30:13 -07:00
  • ef3ac9d05a Fixed the AVX512 packet traits Benoit Steiner 2016-04-29 13:28:36 -07:00
  • d7b75e8d86 Added pdiv packet primitives for avx512 Benoit Steiner 2016-04-29 13:26:47 -07:00
  • 5e89ded685 Implemented preduxp for AVX512 Benoit Steiner 2016-04-29 13:00:33 -07:00
  • 5f85662ad8 Implemented the pabs and preverse primitives for avx512. Benoit Steiner 2016-04-29 12:53:34 -07:00
  • d37ee89ca8 Disabled some of the AVX512 primitives on compilers that don't support them Benoit Steiner 2016-04-29 12:50:29 -07:00
  • 0f3c4c8ff4 Fix compilation of sparse.cast<>().transpose(). Gael Guennebaud 2016-04-29 18:26:08 +02:00
  • a524a26fdc Fixed a few memory leaks Benoit Steiner 2016-04-28 18:55:53 -07:00
  • dacb23277e Fixed the igamma and igammac implementations to make them callable from a gpu kernel. Benoit Steiner 2016-04-28 18:54:54 -07:00
  • a5d4545083 Deleted unused variable Benoit Steiner 2016-04-28 14:14:48 -07:00
  • 40d1e2f8c7 Eliminate mutual recursion in igamma{,c}_impl::Run. Justin Lebar 2016-04-28 13:57:08 -07:00
  • 87294c84a6 define Packet2d constants with VSX only Konstantinos Margaritis 2016-04-28 14:39:56 -03:00
  • 6ed7a7281c remove accidentally pasted code Konstantinos Margaritis 2016-04-28 14:35:55 -03:00
  • 62f9093b31 improve state of MathFunctions as well Konstantinos Margaritis 2016-04-28 14:33:09 -03:00
  • 8ed26120c8 bring Altivec/VSX to a better state, implement some of the missing functions Konstantinos Margaritis 2016-04-28 14:32:42 -03:00
  • 950158f6d1 add name to copyrights Konstantinos Margaritis 2016-04-28 14:32:11 -03:00
  • ee0459300b minor fix, add to copyright Konstantinos Margaritis 2016-04-28 14:31:21 -03:00
  • 3ec81fc00f Fixed compilation error with clang. Benoit Steiner 2016-04-27 19:32:12 -07:00
  • 2b917291d9 Merged in rmlarsen/eigen2 (pull request PR-183) Benoit Steiner 2016-04-27 15:19:54 -07:00
  • 09b9e951e3 Depend on the more extensive support for constexpr in clang: Rasmus Munk Larsen 2016-04-27 14:59:11 -07:00
  • 1a325ef71c Detect cxx_constexpr support when compiling with clang. Rasmus Munk Larsen 2016-04-27 14:33:51 -07:00
  • 1a97fd8b4e Merged latest update from trunk Benoit Steiner 2016-04-27 14:22:45 -07:00
  • c61170e87d fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supports: remove it from Eigen. Benoit Steiner 2016-04-27 14:22:20 -07:00