Commit Graph

  • e83af2cc24 Commit 52a5f982 broke conjhelper functionality for HIP GPUs. Rohit Santhanam 2021-06-25 19:28:00 +00:00
  • 2d132d1736 Commit 52a5f982 broke conjhelper functionality for HIP GPUs. Rohit Santhanam 2021-06-25 19:28:00 +00:00
  • 413ff2b531 Small cleanup: Get rid of the macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code. Rasmus Munk Larsen 2021-06-24 18:52:17 -07:00
  • bffd267d17 Small cleanup: Get rid of the macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code. Rasmus Munk Larsen 2021-06-24 18:52:17 -07:00
  • a235ddef39 Get rid of code duplication for conj_helper. For packets where LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations. Rasmus Munk Larsen 2021-06-24 15:47:48 -07:00
  • 52a5f98212 Get rid of code duplication for conj_helper. For packets where LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations. Rasmus Munk Larsen 2021-06-24 15:47:48 -07:00
  • 4ad30a73fc Use internal::ref_selector to avoid holding a reference to a RHS expression. Rasmus Munk Larsen 2021-06-22 14:31:32 +00:00
  • 4780d8dfb2 Fix typo in SelfAdjointEigenSolver_eigenvectors.cpp Rasmus Munk Larsen 2021-06-21 19:06:04 +00:00
  • fd5d23fdf3 Update ComplexEigenSolver_eigenvectors.cpp Rasmus Munk Larsen 2021-06-21 19:06:25 +00:00
  • ea62c937ed Update ComplexEigenSolver_eigenvectors.cpp Rasmus Munk Larsen 2021-06-21 19:06:25 +00:00
  • c8a2b4d20a Fix typo in SelfAdjointEigenSolver_eigenvectors.cpp Rasmus Munk Larsen 2021-06-21 19:06:04 +00:00
  • a2040ef796 Rewrite balancer to avoid overflows. Antonio Sanchez 2021-06-18 14:24:11 -07:00
  • e9ab4278b7 Rewrite balancer to avoid overflows. Antonio Sanchez 2021-06-18 14:24:11 -07:00
  • c2c0f6f64b Fix fix<> for gcc-4.9.3. Antonio Sanchez 2021-06-18 13:06:04 -07:00
  • 35a367d557 Fix fix<> for gcc-4.9.3. Antonio Sanchez 2021-06-18 13:06:04 -07:00
  • ee4e099aa2 Remove pset, replace with ploadu. Antonio Sanchez 2021-06-16 14:36:42 -07:00
  • 12e8d57108 Remove pset, replace with ploadu. Antonio Sanchez 2021-06-16 14:36:42 -07:00
  • 9fc93ce31a EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with Tensorflow. Changing to EIGEN_ALWAYS_INLINE where appropriate. Chip-Kerchner 2021-06-16 08:49:22 -05:00
  • ef1fd341a8 EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with Tensorflow. Changing to EIGEN_ALWAYS_INLINE where appropriate. Chip-Kerchner 2021-06-16 08:49:22 -05:00
  • 175f0cc1e9 changed documentation to make example compile jenswehner 2021-06-16 11:45:06 +02:00
  • 1374f49f28 Add missing ppc pcmp_lt_or_nan<Packet8bf> Antonio Sanchez 2021-06-15 13:42:17 -07:00
  • 9e94c59570 Add missing ppc pcmp_lt_or_nan<Packet8bf> Antonio Sanchez 2021-06-15 13:42:17 -07:00
  • 2d6eaaf687 Fix placement of permanent GPU defines. Antonio Sanchez 2021-06-15 12:15:58 -07:00
  • 954879183b Fix placement of permanent GPU defines. Antonio Sanchez 2021-06-15 12:15:58 -07:00
  • 47722a66f2 Fix more enum arithmetic. Rasmus Munk Larsen 2021-06-15 09:09:31 -07:00
  • 13fb5ab92c Fix more enum arithmetic. Rasmus Munk Larsen 2021-06-15 09:09:31 -07:00
  • 5e75331b9f Fix checking of version number for mingw. Antonio Sanchez 2021-06-11 10:21:07 -07:00
  • ad82d20cf6 Fix checking of version number for mingw. Antonio Sanchez 2021-06-11 10:21:07 -07:00
  • b5fc69bdd8 Add ability to permanently enable HIP/CUDA gpu* defines. Antonio Sanchez 2021-06-11 08:21:34 -07:00
  • 514977f31b Add ability to permanently enable HIP/CUDA gpu* defines. Antonio Sanchez 2021-06-11 08:21:34 -07:00
  • 4b683b65df Allow custom TENSOR_CONTRACTION_DISPATCH macro. Antonio Sanchez 2021-06-11 08:30:41 -07:00
  • 6aec83263d Allow custom TENSOR_CONTRACTION_DISPATCH macro. Antonio Sanchez 2021-06-11 08:30:41 -07:00
  • 1cb1ffd5b2 Use bit_cast to create -0.0 for floating point types to avoid compiler optimization changing sign with -ffast-math enabled. Rasmus Munk Larsen 2021-06-10 19:18:50 -07:00
  • fc87e2cbaa Use bit_cast to create -0.0 for floating point types to avoid compiler optimization changing sign with -ffast-math enabled. Rasmus Munk Larsen 2021-06-10 19:18:50 -07:00
  • 4b502a7215 Fix c++20 warnings about using enums in arithmetic expressions. Rasmus Munk Larsen 2021-06-10 17:17:39 -07:00
  • f64b2954c7 Fix c++20 warnings about using enums in arithmetic expressions. Rasmus Munk Larsen 2021-06-10 17:17:39 -07:00
  • 85868564df Fix parsing of version for nvhpc Nicolas Cornu 2021-06-08 15:48:21 +02:00
  • 001a57519a Fix parsing of version for nvhpc Nicolas Cornu 2021-06-08 15:48:21 +02:00
  • cbb6ae6296 Removed dead code from GPU float16 unit test. Rohit Santhanam 2021-05-28 20:06:48 +00:00
  • c8d40a7bf1 Removed dead code from GPU float16 unit test. Rohit Santhanam 2021-05-28 20:06:48 +00:00
  • 573570b6c9 Remove EIGEN_DEVICE_FUNC from CwiseBinaryOp's default copy constructor. Cyril Kaiser 2021-05-22 18:15:32 +01:00
  • 91cd67f057 Remove EIGEN_DEVICE_FUNC from CwiseBinaryOp's default copy constructor. Cyril Kaiser 2021-05-22 18:15:32 +01:00
  • 98cf1e076f Add missing NEON ptranspose implementations. Antonio Sanchez 2021-05-24 21:34:35 -07:00
  • dba753a986 Add missing NEON ptranspose implementations. Antonio Sanchez 2021-05-24 21:34:35 -07:00
  • ee2a8f7139 Modify Unary/Binary/TernaryOp evaluators to work for non-class types. Antonio Sanchez 2021-05-23 12:35:38 -07:00
  • ebb300d0b4 Modify Unary/Binary/TernaryOp evaluators to work for non-class types. Antonio Sanchez 2021-05-23 12:35:38 -07:00
  • 3835046309 predux_half_dowto4 test extended to all applicable packets Jakub Lichman 2021-05-21 14:12:25 +00:00
  • 4fbd01cd4b Adds macro for checking if C++14 variable templates are supported Steve Bronder 2021-05-21 16:25:32 +00:00
  • 12471fcb5d predux_half_dowto4 test extended to all applicable packets Jakub Lichman 2021-05-21 14:12:25 +00:00
  • 1720057023 Adds macro for checking if C++14 variable templates are supported Steve Bronder 2021-05-21 16:25:32 +00:00
  • a883a8797c Use derived object type in conservative_resize_like_impl Niall Murphy 2021-05-10 11:43:49 +01:00
  • 391094c507 Use derived object type in conservative_resize_like_impl Niall Murphy 2021-05-10 11:43:49 +01:00
  • 0bd9e9bc45 ptranspose test for non-square kernels added Jakub Lichman 2021-05-19 08:26:45 +00:00
  • 09f3e95447 WIP 2 starting_new_generickernels Everton Constantino 2021-05-19 17:29:42 +00:00
  • 8877f8d9b2 ptranspose test for non-square kernels added Jakub Lichman 2021-05-19 08:26:45 +00:00
  • 6533187280 WIP 2 - need to implement 2x1x1 Everton Constantino 2021-05-18 20:42:08 +00:00
  • 029f78abf0 WIP 2 Everton Constantino 2021-05-14 20:21:52 +00:00
  • 5d47f6697d WIP 2 Everton Constantino 2021-05-14 16:26:33 +00:00
  • ad67705447 WIP2 Everton Constantino 2021-05-14 12:29:37 +00:00
  • 9fc17867e5 WIP 2 Everton Constantino 2021-05-13 19:21:48 +00:00
  • 3999ab2dc7 WIP 2 Everton Constantino 2021-05-13 18:12:52 +00:00
  • 58db05afbc WIP 2 starting_new_packmapcalculator Everton Constantino 2021-05-13 15:30:08 +00:00
  • 77c66e368c Ensure all generated matrices for inverse_4x4 tests are invertible; this fixes #2248. Guoqiang QI 2021-05-13 15:03:30 +00:00
  • 3e006bfd31 Ensure all generated matrices for inverse_4x4 tests are invertible; this fixes #2248. Guoqiang QI 2021-05-13 15:03:30 +00:00
  • bfadb56107 WIP 2 Everton Constantino 2021-05-13 14:48:40 +00:00
  • 9b8cdceea8 WIP 2 Everton Constantino 2021-05-13 14:42:22 +00:00
  • a8ec6d6a36 WIP with tests Everton Constantino 2021-05-12 17:09:33 +00:00
  • 2f908f8255 Changing the storage of the SSE complex packets to that of the wrapper. This should fix #2242. guoqiangqi 2021-05-10 09:27:41 +08:00
  • 82f13830e6 Fix calls to device functions from host code Nathan Luehr 2021-05-11 22:47:49 +00:00
  • 972cf0c28a Fix calls to device functions from host code Nathan Luehr 2021-05-11 22:47:49 +00:00
  • d1825cbb68 Device implementation of log for std::complex types. Nathan Luehr 2021-04-19 18:05:27 -05:00
  • 7e6a1c129c Device implementation of log for std::complex types. Nathan Luehr 2021-04-19 18:05:27 -05:00
  • d9288f078d Fix ambiguity due to argument dependent lookup. Nathan Luehr 2021-04-16 14:04:20 -05:00
  • 6753f0f197 Fix ambiguity due to argument dependent lookup. Nathan Luehr 2021-04-16 14:04:20 -05:00
  • 3d9051ea84 Changing the storage of the SSE complex packets to that of the wrapper. This should fix #2242. guoqiangqi 2021-05-10 09:27:41 +08:00
  • 85ebd6aff8 Fix for issue where numext::imag and numext::real are used before they are defined. Rohit Santhanam 2021-05-10 19:20:32 +00:00
  • 54f80f442d WIP - Vector Everton Constantino 2021-05-10 20:06:34 +00:00
  • 70c0363c28 WIP2 Everton Constantino 2021-05-10 19:59:47 +00:00
  • 39ec31c0ad Fix for issue where numext::imag and numext::real are used before they are defined. Rohit Santhanam 2021-05-10 19:20:32 +00:00
  • b2cd094863 WIP Everton Constantino 2021-05-10 16:53:17 +00:00
  • 2947c0cc84 Restore ABI compatibility for conj with 3.3, fix conflict with boost. Antonio Sanchez 2021-05-06 19:49:49 -07:00
  • c0eb5f89a4 Restore ABI compatibility for conj with 3.3, fix conflict with boost. Antonio Sanchez 2021-05-06 19:49:49 -07:00
  • 25424f4cf1 Clean up gpu device properties. Antonio Sanchez 2021-05-06 12:50:51 -07:00
  • 42acbd5700 Fix numext::arg return type. Antonio Sanchez 2021-05-07 08:24:32 -07:00
  • 0eba8a1fe3 Clean up gpu device properties. Antonio Sanchez 2021-05-06 12:50:51 -07:00
  • 90e9a33e1c Fix numext::arg return type. Antonio Sanchez 2021-05-07 08:24:32 -07:00
  • 9e0dc8f09b Revert addition of unused paddsub<Packet2cf>. This fixes #2242 Christoph Hertzberg 2021-05-06 18:36:47 +02:00
  • 722ca0b665 Revert addition of unused paddsub<Packet2cf>. This fixes #2242 Christoph Hertzberg 2021-05-06 18:36:47 +02:00
  • da19f7a910 Simplify TensorRandom and remove time-dependence. Antonio Sanchez 2021-04-30 08:19:48 -07:00
  • e3b7f59659 Simplify TensorRandom and remove time-dependence. Antonio Sanchez 2021-04-30 08:19:48 -07:00
  • fc2cc10842 Better CUDA complex division. Antonio Sanchez 2021-04-23 16:04:01 -07:00
  • 1c013be2cc Better CUDA complex division. Antonio Sanchez 2021-04-23 16:04:01 -07:00
  • a33855f6ee Add missing pcmp_lt_or_nan for NEON Packet4bf. Antonio Sanchez 2021-04-27 14:12:11 -07:00
  • 172db7bfc3 Add missing pcmp_lt_or_nan for NEON Packet4bf. Antonio Sanchez 2021-04-27 14:12:11 -07:00
  • 83df5df61b Added complex matrix unit tests for SelfAdjointEigenSolve Theo Fletcher 2021-04-26 16:52:44 +01:00
  • 2ced0cc233 Added complex matrix unit tests for SelfAdjointEigenSolve Theo Fletcher 2021-04-26 16:52:44 +01:00
  • ac3c5aad31 Tests added and AVX512 bug fixed for pcmp_lt_or_nan Jakub Lichman 2021-04-25 20:58:56 +00:00
  • d87648a6be Tests added and AVX512 bug fixed for pcmp_lt_or_nan Jakub Lichman 2021-04-25 20:58:56 +00:00
  • 63abb10000 Tests for pcmp_lt and pcmp_le added Jakub Lichman 2021-04-23 19:51:43 +00:00
  • 1115f5462e Tests for pcmp_lt and pcmp_le added Jakub Lichman 2021-04-23 19:51:43 +00:00