Commit Graph

  • e741b43668 Make Transform::computeRotationScaling(0,&S) continuous Essex Edwards 2020-12-14 13:03:46 -08:00
  • 0bdc0dba20 Add missing #endif directive in Macros.h David Tellenbach 2021-01-07 12:32:41 +01:00
  • cb654b1c45 #define was defined incorrectly because the result_of function was deprecated in c++17 and removed in c++20. Also, EIGEN_COMP_MSVC (which is _MSC_VER) only affects result_of indirectly, which can cause errors. shrek1402 2021-01-07 10:12:25 +00:00
  • 52d1dd979a Fix Ref initialization. Antonio Sanchez 2021-01-06 13:14:20 -08:00
  • 166fcdecdb Allow CwiseUnaryView to be used on device. Antonio Sanchez 2021-01-06 09:13:28 -08:00
  • bb1de9dbde Fix Ref Stride checks. Antonio Sanchez 2020-12-10 14:05:38 -08:00
  • 12dda34b15 Eliminate boolean product warnings by factoring out a combine_scalar_factors helper function. Christoph Hertzberg 2021-01-01 20:54:45 +01:00
  • 070d303d56 Add CUDA complex sqrt. Antonio Sanchez 2020-12-22 22:49:06 -08:00
  • fdf2ee62c5 Fix missing EIGEN_DEVICE_FUNC rgreenblatt 2020-12-12 13:55:36 -05:00
  • 05754100fe * Add iterative psqrt<double> for AVX and SSE when FMA is available. This provides a ~10% speedup. * Write iterative sqrt explicitly in terms of pmadd. This gives up to 7% speedup for psqrt<float> with AVX & SSE with FMA. * Remove iterative psqrt<double> for NEON, because the initial rsqrt approximation is not accurate enough for convergence in 2 Newton-Raphson steps and with 3 steps, just calling the builtin sqrt insn is faster. Rasmus Munk Larsen 2020-12-16 18:16:11 +00:00
  • 3bee9422d6 Merge branch 'lambdaknight/eigen-master' Turing Eret 2020-12-16 09:18:24 -07:00
  • 19e6496ce0 Replace call to FixedDimensions() with a singleton instance of FixedDimensions. Turing Eret 2020-12-16 07:18:09 -07:00
  • 6cee8d347e Add an additional step of Newton-Raphson for psqrt<double> on Arm, which otherwise has an error of ~1000 ulps. Rasmus Munk Larsen 2020-12-15 04:06:41 +00:00
  • bc7d1599fb TensorStorage with FixedDimensions now has zero instance memory overhead. Removed m_dimension as instance member of TensorStorage with FixedDimensions and instead use the template parameter. This means that the sizeof a pure fixed-size storage is exactly equal to the data it is storing. Turing Eret 2020-12-14 07:16:38 -07:00
  • cf0b5b0344 Remove code checking for CMake < 3.5 Alexander Grund 2020-12-14 09:57:44 +00:00
  • 751f18f2c0 Remove comma at the end of enumeration list to silence C++03 warnings David Tellenbach 2020-12-13 18:11:02 +01:00
  • 5dc2fbabee Fix implicit cast to double. Antonio Sanchez 2020-12-12 09:26:20 -08:00
  • 55967f87d1 Fix NEON pmax<PropagateNumbers,Packet4bf>. Antonio Sanchez 2020-12-11 21:50:52 -08:00
  • 839aa505c3 Fix typo in AVX512 packet math. Antonio Sanchez 2020-12-11 21:35:44 -08:00
  • 536c8a79f2 Remove unused macro in Half.h David Tellenbach 2020-12-12 00:53:26 +01:00
  • 8c9976d7f0 Fix more SSE/AVX packet conversions for peven. Antonio Sanchez 2020-12-11 15:46:42 -08:00
  • c6efc4e0ba Replace M_LOG2E and M_LN2 with custom macros. Antonio Sanchez 2020-12-11 14:34:31 -08:00
  • e82722a4a7 Fix MSVC SSE casts. Antonio Sanchez 2020-12-11 08:52:59 -08:00
  • f3d2ea48f5 Fix for broken ROCm/HIP Support Deven Desai 2020-12-10 14:51:13 +00:00
  • c7eb3a74cb Don't guard psqrt for std::complex<float> with EIGEN_ARCH_ARM64 David Tellenbach 2020-12-11 12:41:52 +01:00
  • bccf055a7c Add Armv8 guard on PropagateNumbers implementation. Everton Constantino 2020-12-09 15:47:09 -03:00
  • 82c0c18a83 Remove private access of std::deque::_M_impl. Antonio Sanchez 2020-12-10 14:59:34 -08:00
  • 00be0a7ff3 Fix vectorization of complex sqrt on NEON David Tellenbach 2020-12-10 15:23:23 +00:00
  • 8eb461a431 Remove comma at end of enumerator list in NEON PacketMath David Tellenbach 2020-12-10 15:22:55 +01:00
  • a36d19c4fc Fix a typo in SparseMatrix documentation. David Tellenbach 2020-12-09 14:48:24 +01:00
  • 2e8f850c78 Fix a typo in SparseMatrix documentation. David Tellenbach 2020-12-09 14:48:24 +01:00
  • 125cc9a5df Implement vectorized complex square root. Rasmus Munk Larsen 2020-12-08 18:13:35 -08:00
  • 8cfe0db108 Fix host/device calls for __half. Antonio Sanchez 2020-12-07 19:11:07 -08:00
  • baf9d762b7 - Enabling PropagateNaN and PropagateNumbers for NEON. - Adding propagate tests to bfloat16. Everton Constantino 2020-11-16 19:03:58 +00:00
  • 634bd79b0e Fix unused warning on new dense_assignment_loop impl. Antonio Sanchez 2020-12-07 19:14:21 -08:00
  • 655c3a4042 Add specialization for compile-time zero-sized dense assignment. Antonio Sanchez 2020-12-07 08:27:03 -08:00
  • 5ec4907434 Clean up #ifs in GPU PacketPath. Antonio Sanchez 2020-12-04 15:33:19 -08:00
  • 0fd6b4f71d Bump to 3.3.9 3.3.9 David Tellenbach 2020-12-04 22:53:41 +01:00
  • f9fac1d5b0 Add log2() to Eigen. Rasmus Munk Larsen 2020-12-04 21:45:09 +00:00
  • 2dbac2f99f Fix bad NEON fp16 check Antonio Sanchez 2020-12-04 13:42:18 -08:00
  • e2f21465fe Special function implementations for half/bfloat16 packets. Antonio Sanchez 2020-12-02 14:00:57 -08:00
  • 305b8bd277 Remove duplicate #if clause David Tellenbach 2020-12-04 18:55:46 +01:00
  • 9ee9ac81de Fix shfl* macros for CUDA/HIP Antonio Sanchez 2020-12-03 15:00:18 -08:00
  • a9a2f2bebf The function 'prefetch' did not work correctly on the win64 platform shrek1402 2020-12-04 17:18:08 +00:00
  • f23dc5b971 Revert "Add log2() operator to Eigen" Rasmus Munk Larsen 2020-12-03 14:32:45 -08:00
  • 4d91519a9b Add log2() operator to Eigen Rasmus Munk Larsen 2020-12-03 22:31:44 +00:00
  • 25d8ae7465 Small cleanup of generic plog implementations: Adding the term e*ln(2) is split into two steps for no obvious reason. This dates back to the original Cephes code from which the algorithm is adapted. It appears that this was done in Cephes to prevent the compiler from reordering the addition of the 3 terms in the approximation. Rasmus Munk Larsen 2020-12-03 19:40:40 +00:00
  • eb4d4ae070 Include chrono in main for c++11. Antonio Sanchez 2020-12-03 11:27:29 -08:00
  • 71c85df4c1 Clean up the Tensor header and get rid of the EIGEN_SLEEP macro. Rasmus Munk Larsen 2020-12-02 11:04:04 -08:00
  • 70fbcf82ed Fix typo in F32MaskToBf16Mask. Antonio Sanchez 2020-12-02 07:58:34 -08:00
  • 2627e2f2e6 Fix neon cmp* functions for bf16. Antonio Sanchez 2020-12-01 16:28:48 -08:00
  • ddd48b242c Implement CUDA __shfl* for Eigen::half Antonio Sanchez 2020-12-01 14:27:52 -08:00
  • e57281a741 Fix a few issues for AVX512. This change enables vectorized versions of log, exp, log1p, expm1 when AVX512DQ is not available. Rasmus Munk Larsen 2020-12-01 11:31:47 -08:00
  • 1992af3de2 Fix #2077, EIGEN_CONSTEXPR in Half. Antonio Sanchez 2020-11-30 13:45:46 -08:00
  • 7b80609d49 add EIGEN_DEVICE_FUNC to methods acxz 2020-12-01 03:08:47 +00:00
  • 89f90b585d AVX512 missing ops. Antonio Sanchez 2020-11-24 16:28:07 -08:00
  • 52207cf6f9 Fix typo in doc Florian Maurin 2020-11-30 10:53:29 +00:00
  • c5985c46f5 Fix typo in doc Florian Maurin 2020-11-30 10:53:29 +00:00
  • 0c26611d2d Workaround for doxygen class template titles in which the template part of the class signature is lost due to a problem with forward declarations. The problem is probably caused by doxygen bug #7689. It is confirmed to be fixed in doxygen >= 1.8.19. Jim Lersch 2020-11-26 13:17:54 -07:00
  • 68f69414f7 Workaround for doxygen class template titles in which the template part of the class signature is lost due to a problem with forward declarations. The problem is probably caused by doxygen bug #7689. It is confirmed to be fixed in doxygen >= 1.8.19. Jim Lersch 2020-11-26 13:17:54 -07:00
  • 2a4fcb2c31 Fix doxygen class block that was wrongly named. Christoph Hertzberg 2020-11-27 19:40:14 +01:00
  • a7170f2aca Fix doxygen class blocks that were not associated with the correct classes. Jim Lersch 2020-11-26 13:39:04 -07:00
  • 550e8f8f57 Include CMakeDependentOption to be able to use cmake_dependent_option David Tellenbach 2020-11-27 13:21:49 +01:00
  • 9842366bba Make inclusion of doc sub-directory optional by adjusting options. Bowie Owens 2020-11-18 10:08:23 +11:00
  • aa56e1d980 check for include dirs set filippobrizzi 2020-11-25 13:20:27 +00:00
  • 54930b6b55 Remove unused variable Christoph Hertzberg 2020-11-25 17:59:18 +01:00
  • 1e74f93d55 Fix some packet-functions in the IBM ZVector packet-math. Andreas Krebbel 2020-11-25 14:11:23 +00:00
  • 79818216ed Revert "Fix Half NaN definition and test." Rasmus Munk Larsen 2020-11-24 12:57:28 -08:00
  • c770746d70 Fix Half NaN definition and test. Rasmus Munk Larsen 2020-11-24 20:53:07 +00:00
  • 22f67b5958 Fix boolean float conversion and product warnings. Antonio Sanchez 2020-11-19 10:22:42 -08:00
  • a3b300f1af Implement missing AVX half ops. Antonio Sanchez 2020-11-23 16:11:01 -08:00
  • 38abf2be42 Fix Half NaN definition and test. Antonio Sanchez 2020-11-23 14:13:59 -08:00
  • 4cf01d2cf5 Update AVX half packets, disable test. Antonio Sanchez 2020-11-19 15:44:19 -08:00
  • fd1dcb6b45 Fixes duplicate symbol when building blas Antonio Sanchez 2020-11-20 08:49:57 -08:00
  • 6c9c3f9a1a Remove explicit casts from Eigen::half and Eigen::bfloat16 to bool David Tellenbach 2020-11-19 18:49:09 +01:00
  • a8fdcae55d Fix sparse_extra_3, disable counting temporaries for testing DynamicSparseMatrix. Antonio Sanchez 2020-11-18 13:23:13 -08:00
  • 11e4056f6b Re-enable Arm Neon Eigen::half packets of size 8 David Tellenbach 2020-11-18 23:02:21 +00:00
  • 17268b155d Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom Antonio Sanchez 2020-11-17 15:32:44 -08:00
  • 41d5d5334b Initialize primitives to fix -Wuninitialized-const-reference. Antonio Sanchez 2020-11-18 08:02:15 -08:00
  • 3669498f5a Fix rule-of-3 for the Tensor module. Antonio Sanchez 2020-11-12 15:59:29 -08:00
  • 60218829b7 EOF newline added to InverseSize4. Antonio Sanchez 2020-11-18 07:58:33 -08:00
  • 2d63706545 Add missing parens around macro argument. Rasmus Munk Larsen 2020-11-18 00:24:19 +00:00
  • 6bba58f109 Replace SSE_SHUFFLE_MASK macro with shuffle_mask. Rasmus Munk Larsen 2020-11-17 15:28:37 -08:00
  • e9b55c4db8 Avoid promotion of Arm __fp16 to float in Neon PacketMath David Tellenbach 2020-11-17 20:19:44 +01:00
  • 117a4c0617 Fix missing EIGEN_CONSTEXPR pop_macro in Half. Antonio Sanchez 2020-11-17 08:24:58 -08:00
  • 4e5385c905 Enable MathJax in Doxygen.in Martin Vonheim Larsen 2020-11-16 12:59:13 +00:00
  • 394f564055 Unify Inverse_SSE.h and Inverse_NEON.h into a single generic implementation using PacketMath. Guoqiang QI 2020-11-17 12:27:01 +00:00
  • 8e9cc5b10a Eliminate double-promotion warnings. Antonio Sanchez 2020-11-16 10:39:09 -08:00
  • 9175f50d6f Add EIGEN_DEVICE_FUNC to TranspositionsBase acxz 2020-11-16 15:37:40 +00:00
  • 280f4f2407 Enable MathJax in Doxygen.in Martin Vonheim Larsen 2020-11-16 12:59:13 +00:00
  • bb69a8db5d Explicit casts of S -> std::complex<T> Antonio Sanchez 2020-11-12 13:12:00 -08:00
  • 90f6d9d23e Suppress ignored-attributes warning (same as in vectorization_logic). Remove redundant include and using namespace. Christoph Hertzberg 2020-11-13 16:21:53 +01:00
  • 8324e5e049 Fix typo in NEON/PacketMath.h guoqiangqi 2020-11-09 11:38:47 +08:00
  • 852513e7a6 Disable testing of OpenGL by default. Antonio Sanchez 2020-11-12 16:07:55 -08:00
  • bec72345d6 Simplify expression for inner product fallback in Gemv product evaluator. Rasmus Munk Larsen 2020-11-12 23:43:15 +00:00
  • 276db21f26 Remove redundant branch for handling dynamic vector*vector. This will be handled by the equivalent branch in the specialization for GemvProduct. Rasmus Munk Larsen 2020-11-12 21:54:56 +00:00
  • cf12474a8b Optimize matrix*matrix and matrix*vector products when they correspond to inner products at runtime. Rasmus Munk Larsen 2020-11-12 18:02:37 +00:00
  • c29935b323 Add support for dynamic dispatch of MMA instructions for POWER 10 Pedro Caldeira 2020-09-09 12:16:44 -05:00
  • ac632f663e bug #1746: Removed implementation of standard copy-constructor and standard copy-assign-operator from PermutationMatrix and Transpositions to allow malloc-less std::move. Added unit-test to rvalue_types Christoph Hertzberg 2019-09-24 11:09:58 +02:00
  • b714dd9701 remove annotation for first declaration of default con/destruction acxz 2020-10-12 21:52:00 -04:00