Commit Graph

  • ab9b749b45 Improved a test Benoit Steiner 2016-03-14 20:03:13 -07:00
  • 5a51366ea5 Fixed a typo. Benoit Steiner 2016-03-14 09:25:16 -07:00
  • fcf59e1c37 Properly gate the use of cuda intrinsics in the code Benoit Steiner 2016-03-14 09:13:44 -07:00
  • 97a1f1c273 Make sure we only use the half float intrinsics when compiling with a version of CUDA that is recent enough to provide them Benoit Steiner 2016-03-14 08:37:58 -07:00
  • 9550be925d Merge specfun branch. Eugene Brevdo 2016-03-13 15:46:51 -07:00
  • b1a9afe9a9 Add tests in array.cpp that check igamma/igammac properties. Eugene Brevdo 2016-03-13 15:45:34 -07:00
  • e29c9676b1 Don't mark the cast operator as explicit, since this is a c++11 feature that's not supported by older compilers. Benoit Steiner 2016-03-12 00:15:58 -08:00
  • eecd914864 Also replaced uint32_t with unsigned int to make the code more portable Benoit Steiner 2016-03-11 19:34:21 -08:00
  • 1ca8c1ec97 Replaced a couple more uint16_t with unsigned short Benoit Steiner 2016-03-11 19:28:28 -08:00
  • 0423b66187 Use unsigned short instead of uint16_t since they're more portable Benoit Steiner 2016-03-11 17:53:41 -08:00
  • 048c4d6efd Made half floats usable on hardware that doesn't support them natively. Benoit Steiner 2016-03-11 17:21:42 -08:00
  • b72ffcb05e Made the comparison of Eigen::array GPU friendly Benoit Steiner 2016-03-11 16:37:59 -08:00
  • 25f69cb932 Added a comparison operator for Eigen::array. Alias Eigen::array to std::array when compiling with Visual Studio 2015. Benoit Steiner 2016-03-11 15:20:37 -08:00
  • c5b98a58b8 Updated the cxx11_meta test to work on the Eigen::array class when std::array isn't available. Benoit Steiner 2016-03-11 11:53:38 -08:00
  • 456e038a4e Fixed the +=, -=, *= and /= operators to return a reference Benoit Steiner 2016-03-10 15:17:44 -08:00
  • 86d45a3c83 Worked around visual studio compilation warnings. Benoit Steiner 2016-03-09 21:29:39 -08:00
  • 8fd4241377 Fixed a typo. Benoit Steiner 2016-03-10 02:28:46 +00:00
  • a685a6beed Made the list reductions less ambiguous. Benoit Steiner 2016-03-09 17:41:52 -08:00
  • 3149b5b148 Avoid implicit cast Benoit Steiner 2016-03-09 17:35:17 -08:00
  • b2100b83ad Made sure to include the <random> header file when compiling with visual studio Benoit Steiner 2016-03-09 16:03:16 -08:00
  • f05fb449b8 Avoid unnecessary conversion from 32bit int to 64bit unsigned int Benoit Steiner 2016-03-09 15:27:45 -08:00
  • 1d566417d2 Enable the random number generators when compiling with visual studio Benoit Steiner 2016-03-09 10:55:11 -08:00
  • 836e92a051 Update MathFunctions/SpecialFunctions with intelligent header guards. Eugene Brevdo 2016-03-09 09:04:45 -08:00
  • b084133dbf Fixed the integer division code on windows Benoit Steiner 2016-03-09 07:06:36 -08:00
  • 6d30683113 Fixed static assertion Benoit Steiner 2016-03-08 21:02:51 -08:00
  • 5e7de771e3 Properly fix merge issues. Eugene Brevdo 2016-03-08 17:35:05 -08:00
  • 73220d2bb0 Resolve bad merge. Eugene Brevdo 2016-03-08 17:28:21 -08:00
  • 5f17de3393 Merge changes. Eugene Brevdo 2016-03-08 17:22:26 -08:00
  • 14f0fde51f Add certain functions to numext (log, exp, tan) because CUDA doesn't support std:: Eugene Brevdo 2016-03-08 17:17:44 -08:00
  • 46177c8d64 Replace std::vector with our own implementation, as using the stl when compiling with nvcc and avx enabled leads to many issues. Benoit Steiner 2016-03-08 16:37:27 -08:00
  • 6d6413f768 Simplified the full reduction code Benoit Steiner 2016-03-08 16:02:00 -08:00
  • 5a427a94a9 Fixed the tensor generator code Benoit Steiner 2016-03-08 13:28:06 -08:00
  • a81b88bef7 Fixed the tensor concatenation code Benoit Steiner 2016-03-08 12:30:19 -08:00
  • 551ff11d0d Fixed the tensor layout swapping code Benoit Steiner 2016-03-08 12:28:10 -08:00
  • 8768c063f5 Fixed the tensor chipping code. Benoit Steiner 2016-03-08 12:26:49 -08:00
  • e09eb835db Decoupled the packet type definition from the definition of the tensor ops. All the vectorization is now defined in the tensor evaluators. This will make it possible to reliably support devices with different packet types in the same compilation unit. Benoit Steiner 2016-03-08 12:07:33 -08:00
  • 3b614a2358 Use NumTraits::highest() and NumTraits::lowest() instead of the std::numeric_limits to make the tensor min and max functors more CUDA friendly. Benoit Steiner 2016-03-07 17:53:28 -08:00
  • dd6dcad6c2 Merge branch specfun. Eugene Brevdo 2016-03-07 15:37:12 -08:00
  • 0bb5de05a1 Finishing touches on igamma/igammac for GPU. Tests now pass. Eugene Brevdo 2016-03-07 15:35:09 -08:00
  • 769685e74e Added the ability to pad a tensor using a non-zero value Benoit Steiner 2016-03-07 14:45:37 -08:00
  • 7f87cc3a3b Fix a couple of typos in the code. Benoit Steiner 2016-03-07 14:31:27 -08:00
  • 5707004d6b Fix Eigen's building of sharded tests that use CUDA & more igamma/igammac bugfixes. Eugene Brevdo 2016-03-07 14:08:56 -08:00
  • e5f25622e2 Added a test to validate the behavior of some of the tensor syntactic sugar. Benoit Steiner 2016-03-07 09:04:27 -08:00
  • 9f5740cbc1 Added missing include Benoit Steiner 2016-03-06 22:03:18 -08:00
  • 5238e03fe1 Don't try to compile the uint128 test with compilers that don't support uint128 Benoit Steiner 2016-03-06 21:59:40 -08:00
  • 9a54c3e32b Don't warn that msvc 2015 isn't c++11 compliant just because it doesn't claim to be. Benoit Steiner 2016-03-06 09:38:56 -08:00
  • 05bbca079a Turn on some of the cxx11 features when compiling with visual studio 2015 Benoit Steiner 2016-03-05 10:52:08 -08:00
  • 6093eb9ff5 Don't test our 128bit emulation code when compiling with msvc Benoit Steiner 2016-03-05 10:37:11 -08:00
  • 57b263c5b9 Avoid using initializer lists in test since not all versions of msvc support them Benoit Steiner 2016-03-05 08:35:26 -08:00
  • 23aed8f2e4 Use EIGEN_PI instead of redefining our own constant PI Benoit Steiner 2016-03-05 08:04:45 -08:00
  • 0b9e0abc96 Make igamma and igammac work correctly. Eugene Brevdo 2016-03-04 21:12:10 -08:00
  • c23e0be18f Use the CMAKE_CXX_STANDARD variable to turn on cxx11 Benoit Steiner 2016-03-04 20:18:01 -08:00
  • ec35068edc Don't rely on the M_PI constant since not all compilers provide it. Benoit Steiner 2016-03-04 16:42:38 -08:00
  • 60d9df11c1 Fixed the computation of leading zeros when compiling with msvc. Benoit Steiner 2016-03-04 16:27:02 -08:00
  • 4e49fd5eb9 MSVC uses __uint128 while other compilers use __uint128_t to encode 128bit unsigned integers. Make the cxx11_tensor_uint128.cpp test work in both cases. Benoit Steiner 2016-03-04 14:49:18 -08:00
  • 667fcc2b53 Fixed syntax error Benoit Steiner 2016-03-04 14:37:51 -08:00
  • 4416a5dcff Added missing include Benoit Steiner 2016-03-04 14:35:43 -08:00
  • c561eeb7bf Don't use implicit type conversions in initializer lists since not all compilers support them. Benoit Steiner 2016-03-04 14:12:45 -08:00
  • 174edf976b Made the contraction test more portable Benoit Steiner 2016-03-04 14:11:13 -08:00
  • 2c50fc878e Fixed a typo Benoit Steiner 2016-03-04 14:09:38 -08:00
  • 7ea35bfa1c Initial implementation of igamma and igammac. Eugene Brevdo 2016-03-03 19:39:41 -08:00
  • deea866bbd Added tests to cover the new rounding, flooring and ceiling tensor operations. Benoit Steiner 2016-03-03 12:38:02 -08:00
  • 5cf4558c0a Added support for rounding, flooring, and ceiling to the tensor api Benoit Steiner 2016-03-03 12:36:55 -08:00
  • dac58d7c35 Added a test to validate the conversion of half floats into floats on Kepler GPUs. Restricted the testing of the random number generation code to GPU architecture greater than or equal to 3.5. Benoit Steiner 2016-03-03 10:37:25 -08:00
  • 1032441c6f Enable partial support for half floats on Kepler GPUs. Benoit Steiner 2016-03-03 10:34:20 -08:00
  • 1da10a7358 Enable the conversion between floats and half floats on older GPUs that support it. Benoit Steiner 2016-03-03 10:33:20 -08:00
  • 2de8cc9122 Merged in ebrevdo/eigen (pull request PR-167) Benoit Steiner 2016-03-03 09:42:12 -08:00
  • ab3dc0b0fe Small bugfix to numeric_limits for CUDA. Eugene Brevdo 2016-03-02 21:48:46 -08:00
  • 6afea46838 Add infinity() support to numext::numeric_limits, use it in lgamma. Eugene Brevdo 2016-03-02 21:35:48 -08:00
  • 3fccef6f50 bug #537: fix compilation with Apple's compiler Gael Guennebaud 2016-03-02 13:22:46 +01:00
  • fedaf19262 Pulled latest updates from trunk Benoit Steiner 2016-03-01 06:15:44 -08:00
  • dfa80b2060 Compilation fix Gael Guennebaud 2016-03-01 12:48:56 +01:00
  • bee9efc203 Compilation fix Gael Guennebaud 2016-03-01 12:47:27 +01:00
  • 68ac5c1738 Improved the performance of large outer reductions on cuda Benoit Steiner 2016-02-29 18:11:58 -08:00
  • 56a3ada670 Added benchmarks for full reduction Benoit Steiner 2016-02-29 14:57:52 -08:00
  • b2075cb7a2 Made the signature of the inner and outer reducers consistent Benoit Steiner 2016-02-29 10:53:38 -08:00
  • 3284842045 Optimized the performance of narrow reductions on CUDA devices Benoit Steiner 2016-02-29 10:48:16 -08:00
  • e9bea614ec Fix shortcoming in fixed-value deduction of startRow/startCol Gael Guennebaud 2016-02-29 10:31:27 +01:00
  • 609b3337a7 Print some information to stderr when a CUDA kernel fails Benoit Steiner 2016-02-27 20:42:57 +00:00
  • 1031b31571 Improved the README Benoit Steiner 2016-02-27 20:22:04 +00:00
  • 8e6faab51e bug #1172: make valuePtr and innerIndexPtr properly return null for empty matrices. Gael Guennebaud 2016-02-27 14:55:40 +01:00
  • ac2e6e0d03 Properly vectorized the random number generators Benoit Steiner 2016-02-26 13:52:24 -08:00
  • caa54d888f Made the TensorIndexList usable on GPU without having to use the -relaxed-constexpr compilation flag Benoit Steiner 2016-02-26 12:38:18 -08:00
  • 93485d86bc Added benchmarks for type casting of float16 Benoit Steiner 2016-02-26 12:24:58 -08:00
  • 002824e32d Added benchmarks for fp16 Benoit Steiner 2016-02-26 12:21:25 -08:00
  • 2cd32cad27 Reverted previous commit since it caused more problems than it solved Benoit Steiner 2016-02-26 13:21:44 +00:00
  • d9d05dd96e Fixed handling of long doubles on aarch64 Benoit Steiner 2016-02-26 04:13:58 -08:00
  • af199b4658 Made the CUDA architecture level a build setting. Benoit Steiner 2016-02-25 09:06:18 -08:00
  • c36c09169e Fixed a typo in the reduction code that could prevent large full reductions from running properly on old cuda devices. Benoit Steiner 2016-02-24 17:07:25 -08:00
  • 7a01cb8e4b Marked the And and Or reducers as stateless. Benoit Steiner 2016-02-24 16:43:01 -08:00
  • 91e1375ba9 merge Gael Guennebaud 2016-02-23 11:09:05 +01:00
  • 055000a424 Fix startRow()/startCol() for dense Block with direct access: the initial implementation failed for empty rows/columns, for which they are ambiguous. Gael Guennebaud 2016-02-23 11:07:59 +01:00
  • 1d9256f7db Updated the padding code to work with half floats Benoit Steiner 2016-02-23 05:51:22 +00:00
  • 8cb9bfab87 Extended the tensor benchmark suite to support types other than floats Benoit Steiner 2016-02-23 05:28:02 +00:00
  • f442a5a5b3 Updated the tensor benchmarking code to work with compilers that don't support cxx11. Benoit Steiner 2016-02-23 04:15:48 +00:00
  • 72d2cf642e Deleted the coordinate based evaluation of tensor expressions, since it's hardly ever used and started to cause some issues with some versions of xcode. Benoit Steiner 2016-02-22 15:29:41 -08:00
  • 6270d851e3 Declare the half float type as arithmetic. Benoit Steiner 2016-02-22 13:59:33 -08:00
  • 5cd00068c0 include <iostream> in the tensor header since we now use it to better report cuda initialization errors Benoit Steiner 2016-02-22 13:59:03 -08:00
  • 257b640463 Fixed compilation warning generated by clang Benoit Steiner 2016-02-21 22:43:37 -08:00
  • 584832cb3c Implemented the ptranspose function on half floats Benoit Steiner 2016-02-21 12:44:53 -08:00