Commit Graph

  • 06447e0a39 Improve half-packet vectorization logic to distinguish linear versus inner traversal modes. Gael Guennebaud 2016-04-13 18:15:49 +02:00
  • bbb8854bf7 Enable half-packet in reduxions. Gael Guennebaud 2016-04-13 13:02:34 +02:00
  • e9b12cc1f7 Fixed compilation warnings generated by clang Benoit Steiner 2016-04-12 20:53:18 -07:00
  • eaeb6ca93a Enable the benchmarks for algebraic and transcendental functions on fp16. Benoit Steiner 2016-04-12 16:29:00 -07:00
  • aa1ba8bbd2 Don't put a comma at the end of an enumerator list Benoit Steiner 2016-04-12 16:28:11 -07:00
  • e49945ced4 Pulled latest update from trunk Benoit Steiner 2016-04-12 14:13:41 -07:00
  • 25d05c4b8f Fixed the vectorization logic test Benoit Steiner 2016-04-12 14:13:25 -07:00
  • 53121c0119 Turned on the contraction benchmarks for fp16 Benoit Steiner 2016-04-12 14:11:52 -07:00
  • b67c983291 Enable the use of half-packet in coeff-based product. For instance, Matrix4f*Vector4f is now vectorized again when using AVX. Gael Guennebaud 2016-04-12 23:03:03 +02:00
  • e3a184785c Fixed the zeta test Benoit Steiner 2016-04-12 11:12:36 -07:00
  • 3b76df64fc Defer the decision to vectorize tensor CUDA code to the meta kernel. This makes it possible to decide whether to vectorize depending on the capability of the target CUDA architecture. In particular, this enables us to vectorize the processing of fp16 when running on devices of compute capability >= 5.3 Benoit Steiner 2016-04-12 10:58:51 -07:00
  • 8bfe739cd2 Updated the AVX512 PacketMath to properly leverage the AVX512DQ instructions Benoit Steiner 2016-04-11 18:40:16 -07:00
  • 6498dadc2f Merged eigen/eigen into default Rasmus Larsen 2016-04-11 17:42:05 -07:00
  • d6e596174d Pull latest updates from upstream Benoit Steiner 2016-04-11 17:20:17 -07:00
  • 748c4c4599 More accurate cost estimates for exp, log, tanh, and sqrt. Benoit Steiner 2016-04-11 13:11:04 -07:00
  • 833efb39bf Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16 Benoit Steiner 2016-04-11 11:03:56 -07:00
  • e939b087fe Pulled latest update from trunk Benoit Steiner 2016-04-11 11:03:02 -07:00
  • 1744b5b5d2 Update doc regarding the genericity of EIGEN_USE_BLAS Gael Guennebaud 2016-04-11 17:16:07 +02:00
  • 91bf925fc1 Improve constness of level2 blas API. Gael Guennebaud 2016-04-11 17:13:01 +02:00
  • 0483430283 Move LAPACK declarations from blas.h to lapack.h and fix compatibility with EIGEN_USE_MKL Gael Guennebaud 2016-04-11 17:12:31 +02:00
  • 097d1e8823 Cleanup obsolete assign_scalar_eig2mkl helper. Gael Guennebaud 2016-04-11 16:09:29 +02:00
  • fec4c334ba Remove all references to MKL in BLAS wrappers. Gael Guennebaud 2016-04-11 16:04:09 +02:00
  • ddabc992fa Fix long to int conversion in BLAS API. Gael Guennebaud 2016-04-11 15:52:01 +02:00
  • 8191f373be Silence unused warning. Gael Guennebaud 2016-04-11 15:37:16 +02:00
  • 6a9ca88e7e Relax dependency on MKL for EIGEN_USE_BLAS Gael Guennebaud 2016-04-11 15:17:14 +02:00
  • 4e8e5888d7 Improve constness of blas level-3 interface. Gael Guennebaud 2016-04-11 15:12:44 +02:00
  • 675e0a2224 Fix static/inline keywords order. Gael Guennebaud 2016-04-11 15:06:20 +02:00
  • fc6a0ebb1c Typos in doc. Gael Guennebaud 2016-04-11 10:54:58 +02:00
  • 643b697649 Proper handling of domain errors. Till Hoffmann 2016-04-10 00:37:53 +01:00
  • 1f70bd4134 Merge. Rasmus Munk Larsen 2016-04-09 15:31:53 -07:00
  • 096e355f8e Add short-circuit to avoid calling matrix norm for empty matrix. Rasmus Munk Larsen 2016-04-09 15:29:56 -07:00
  • be80fb49fc Merged default (4a92b590a0) into default Rasmus Larsen 2016-04-09 13:13:01 -07:00
  • 7a8176587b Merged eigen/eigen into default Rasmus Larsen 2016-04-09 12:47:41 -07:00
  • 4a92b590a0 Merge. Rasmus Munk Larsen 2016-04-09 12:47:24 -07:00
  • ee6c69733a A few tiny adjustments to short-circuit logic. Rasmus Munk Larsen 2016-04-09 12:45:49 -07:00
  • 7f4826890c Merge upstream Till Hoffmann 2016-04-09 20:08:07 +01:00
  • de057ebe54 Added nans to zeta function. Till Hoffmann 2016-04-09 20:07:36 +01:00
  • af2161cdb4 bug #1197: fix/relax some LM unit tests Gael Guennebaud 2016-04-09 11:14:02 +02:00
  • a05a683d83 bug #1160: fix and relax some lm unit tests by turning failures into warnings Gael Guennebaud 2016-04-09 10:49:19 +02:00
  • 5da90fc8dd Use numext::abs instead of std::abs in scalar_fuzzy_default_impl to make it usable inside GPU kernels. Benoit Steiner 2016-04-08 19:40:48 -07:00
  • 01bd577288 Fixed the implementation of Eigen::numext::isfinite, Eigen::numext::isnan, and Eigen::numext::isinf on CUDA devices Benoit Steiner 2016-04-08 16:40:10 -07:00
  • 89a3dc35a3 Fixed isfinite_impl: NumTraits<T>::highest() and NumTraits<T>::lowest() are finite numbers. Benoit Steiner 2016-04-08 15:56:16 -07:00
  • 995f202cea Disabled the use of half2 on cuda devices of compute capability < 5.3 Benoit Steiner 2016-04-08 14:43:36 -07:00
  • 8d22967bd9 Initial support for taking the power of fp16 Benoit Steiner 2016-04-08 14:22:39 -07:00
  • 3394379319 Fixed the packet_traits for half floats. Benoit Steiner 2016-04-08 13:33:59 -07:00
  • 0d2a532fc3 Created the new EIGEN_TEST_CUDA_CLANG option to compile the CUDA tests using clang instead of nvcc Benoit Steiner 2016-04-08 13:16:08 -07:00
  • 0b81a18d12 Merged eigen/eigen into default Rasmus Larsen 2016-04-08 12:58:57 -07:00
  • 2d072b38c1 Don't test the division by 0 on float16 when compiling with msvc since msvc detects and errors out on divisions by 0. Benoit Steiner 2016-04-08 12:50:25 -07:00
  • cd2b667ac8 Add references to filed LLVM bugs Benoit Jacob 2016-04-08 08:12:47 -04:00
  • 3bd16457e1 Properly handle complex numbers. Benoit Steiner 2016-04-07 23:28:04 -07:00
  • 63102ee43d Turn on the coeffWise benchmarks on fp16 Benoit Steiner 2016-04-07 23:05:20 -07:00
  • 7c47d3e663 Fixed the type casting benchmarks for fp16 Benoit Steiner 2016-04-07 22:50:25 -07:00
  • 166b56bc61 Fixed the type casting benchmark for float16 Benoit Steiner 2016-04-07 22:45:54 -07:00
  • 2f2801f096 Merged in parthaEth/eigen (pull request PR-175) Benoit Steiner 2016-04-07 22:10:14 -07:00
  • d962fe6a99 Renamed float16 into cxx11_float16 since the test relies on c++11 features Benoit Steiner 2016-04-07 20:28:32 -07:00
  • c34e55c62b Merged eigen/eigen into default Rasmus Larsen 2016-04-07 20:23:03 -07:00
  • 7d5b17087f Added missing EIGEN_DEVICE_FUNC to the tensor conversion code. Benoit Steiner 2016-04-07 20:01:19 -07:00
  • a6d08be9b2 Fixed the benchmarking of fp16 coefficient wise operations Benoit Steiner 2016-04-07 17:13:44 -07:00
  • 283c51cd5e Widen short-circuiting ReciprocalConditionNumberEstimate so we don't call InverseMatrixL1NormEstimate for dec.rows() <= 1. Rasmus Munk Larsen 2016-04-07 16:45:40 -07:00
  • d51803a728 Use Index instead of int for indexing and sizes. Rasmus Munk Larsen 2016-04-07 16:39:48 -07:00
  • fd872aefb3 Remove transpose() method from LLT and LDLT classes as it would imply conjugation. Explicitly cast constants to RealScalar in ConditionEstimator.h. Rasmus Munk Larsen 2016-04-07 16:28:44 -07:00
  • 0b5546d182 Use lpNorm<1>() to compute l1 norms in LLT and LDLT. Rasmus Munk Larsen 2016-04-07 15:49:30 -07:00
  • 2d5bb375b7 Static casting scalar types so as to let the Cholesky module of Eigen work with Ceres parthaEth 2016-04-08 00:14:44 +02:00
  • a02ec09511 Worked around numerical noise in the test for the zeta function. Benoit Steiner 2016-04-07 12:11:02 -07:00
  • c912b1d28c Fixed a typo in the polygamma test. Benoit Steiner 2016-04-07 11:51:07 -07:00
  • 74f64838c5 Updated the unary functors to use the numext implementation of typical functions instead of the one provided in the standard library. The standard library functions aren't supported officially by cuda, so we're better off using the numext implementations. Benoit Steiner 2016-04-07 11:42:14 -07:00
  • 737644366f Move the functions operating on fp16 out of the std namespace and into the Eigen::numext namespace Benoit Steiner 2016-04-07 11:40:15 -07:00
  • dc45aaeb93 Added tests for float16 Benoit Steiner 2016-04-07 11:18:05 -07:00
  • 8db269e055 Fixed a typo in a test Benoit Steiner 2016-04-07 10:41:51 -07:00
  • b89d3f78b2 Updated the isnan, isinf and isfinite functions to make them compatible with cuda devices. Benoit Steiner 2016-04-07 10:08:49 -07:00
  • 48308ed801 Added support for isinf, isnan, and isfinite checks to the tensor api Benoit Steiner 2016-04-07 09:48:36 -07:00
  • cfb34d808b Fixed a possible integer overflow. Benoit Steiner 2016-04-07 08:46:52 -07:00
  • df838736e2 Fixed compilation warning triggered by msvc Benoit Steiner 2016-04-06 20:48:55 -07:00
  • 14ea7c7ec7 Fixed packet_traits<half> Benoit Steiner 2016-04-06 19:30:21 -07:00
  • 532fdf24cb Added support for hardware conversion between fp16 and full floats whenever possible. Benoit Steiner 2016-04-06 17:11:31 -07:00
  • 165150e896 Fixed the tests for the zeta and polygamma functions Benoit Steiner 2016-04-06 14:31:01 -07:00
  • 7be1eaad1e Fixed typos in the implementation of the zeta and polygamma ops. Benoit Steiner 2016-04-06 14:15:37 -07:00
  • 58c1dbff19 Made the fp16 code more portable. Benoit Steiner 2016-04-06 13:44:08 -07:00
  • cf7e73addd Added some missing conversions to the Half class, and fixed the implementation of the < operator on cuda devices. Benoit Steiner 2016-04-06 09:59:51 -07:00
  • 10bdd8e378 Merged in tillahoffmann/eigen (pull request PR-173) Benoit Steiner 2016-04-06 09:40:17 -07:00
  • 7781f865cb Renamed the EIGEN_TEST_NVCC cmake option into EIGEN_TEST_CUDA per the discussion in bug #1173. Benoit Steiner 2016-04-06 09:35:23 -07:00
  • 72abfa11dd Added support for isfinite on fp16 Benoit Steiner 2016-04-06 09:07:30 -07:00
  • 4d07064a3d Fix bug in alternate lower bound calculation due to missing parentheses. Make a few expressions more concise. Rasmus Munk Larsen 2016-04-05 16:40:48 -07:00
  • 2bba4ee2cf Merged kmargar/eigen/tip into default Konstantinos Margaritis 2016-04-05 22:22:08 +03:00
  • 317384b397 complete the port, remove float support Konstantinos Margaritis 2016-04-05 14:56:45 -04:00
  • 726bd5f077 Merged eigen/eigen into default tillahoffmann 2016-04-05 18:21:05 +01:00
  • a350c25a39 Added accuracy comments. Till Hoffmann 2016-04-05 18:20:40 +01:00
  • 4d7e230d2f bug #1189: fix pow/atan2 compilation for AutoDiffScalar Gael Guennebaud 2016-04-05 14:49:41 +02:00
  • bc0ad363c6 add remaining includes Konstantinos Margaritis 2016-04-05 06:01:17 -04:00
  • 2d41dc9622 complete int/double specialized traits for ZVector Konstantinos Margaritis 2016-04-05 06:00:51 -04:00
  • 644d0f91d2 enable all tests again Konstantinos Margaritis 2016-04-05 05:59:54 -04:00
  • 988344daf1 enable the other includes as well Konstantinos Margaritis 2016-04-05 05:59:30 -04:00
  • d7eeee0c1d Merged eigen/eigen into default Rasmus Larsen 2016-04-04 15:58:27 -07:00
  • 513c372960 Fix docstrings to list all supported decompositions. Rasmus Munk Larsen 2016-04-04 14:34:59 -07:00
  • 86e0ed81f8 Addresses comments on Eigen pull request PR-174. Rasmus Munk Larsen 2016-04-04 14:20:01 -07:00
  • 158fea0f5e bug #1190 - Don't trust __ARM_FEATURE_FMA on Clang/ARM Benoit Jacob 2016-04-04 16:42:40 -04:00
  • 03f2997a11 bug #1191 - Prevent Clang/ARM from rewriting VMLA into VMUL+VADD Benoit Jacob 2016-04-04 16:41:47 -04:00
  • b0143de177 Merge upstream. Till Hoffmann 2016-04-04 19:16:48 +01:00
  • b97911dd18 Refactored code into type-specific helper functions. Till Hoffmann 2016-04-04 19:16:03 +01:00
  • c4179dd470 Updated the scalar_abs_op struct to make it compatible with cuda devices. Benoit Steiner 2016-04-04 11:11:51 -07:00