eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Fabian Keßler	d0bfdc1658	optimize cmake scripts for subproject use (cherry picked from commit `19cacd3ecb`)	2023-07-26 12:01:28 -07:00
Charles Schlosser	208e44c979	fix warnings in tensorreduction and memory	2023-07-19 16:48:07 +00:00
Antonio Sanchez	ac561cd038	Reduce tensor_contract_gpu test. The original test times out after 60 minutes on Windows, even when setting flags to optimize for speed. Reducing the number of contractions performed from 3600->27 for subtests 8,9 allows the two to run in just over a minute each. (cherry picked from commit `be9e7d205f`)	2023-07-11 11:27:31 -07:00
Antonio Sanchez	554982beef	Disable Tree reduction for GPU. For moderately sized inputs, running the Tree reduction quickly fills/overflows the GPU thread stack space, leading to memory errors. This was happening in the `cxx11_tensor_complex_gpu` test, for example. Disabling tree reduction on GPU fixes this. (cherry picked from commit `24ebb37f38`)	2023-07-10 16:09:30 -07:00
Antonio Sanchez	89a71f3126	Fix gpu special function tests. Some checks used incorrect values, partly from copy-paste errors, partly from the change in behaviour introduced in !398. Modified results to match scipy, simplified tests by updating `VERIFY_IS_CWISE_APPROX` to work for scalars. (cherry picked from commit `701f5d1c91`)	2023-07-10 15:57:08 -07:00
Antonio Sanchez	a605d6b996	Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS. Also add a missing space for clang. (cherry picked from commit `846d34384a`)	2023-07-10 15:30:41 -07:00
Antonio Sanchez	dfcd6de20a	Clean up CUDA CMake files. - Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt - Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed to the cuda compiler (nvcc or clang). The latter is to support passing custom flags (e.g. `-arch=` to nvcc, or to disable cuda-specific warnings). (cherry picked from commit `7b00e8b186`)	2023-07-10 15:30:41 -07:00
Antonio Sánchez	26b8fabd80	Return NaN in ndtri for values outside valid input range. (cherry picked from commit `1f79a6078f`)	2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen	f296720d7d	Make sure we return +/-1 above the clamping point for Erf(). (cherry picked from commit `b378014fef`)	2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen	d4c24eca96	Don't crash on empty tensor contraction. (cherry picked from commit `b0f877f8e0`)	2023-07-10 14:52:08 -07:00
Antonio Sánchez	72b0759451	Fix arm builds. (cherry picked from commit `2c8011c2dd`)	2023-07-10 14:52:08 -07:00
Chip Kerchner	8f1b6198c2	Fix epsilon and dummy_precision values in long double for double doubles. Prevented some algorithms from converging on PPC. (cherry picked from commit `54459214a1`)	2023-07-10 14:52:08 -07:00
Antonio Sánchez	669dc8fadf	Eliminate bool bitwise warnings. (cherry picked from commit `b8e93bf589`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	ea57f9b78f	AutoDiff depends on Core, so include appropriate header. (cherry picked from commit `e1165dbf9a`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	f55a112cb1	Fix ODR violations. (cherry picked from commit `bb51d9f4fa`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	a11bdf3965	Skip f16/bf16 bessel specializations on AVX512 if unavailable. (cherry picked from commit `8ed3b9dcd6`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	80c5b8b3c3	Fix ambiguous comparisons for c++20 (again again) (cherry picked from commit `8c2e0e3cb8`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	af912a7b5c	Fix MSVC+CUDA issues. (cherry picked from commit `5ed7a86ae9`)	2023-07-07 15:21:17 -07:00
Antonio Sanchez	ac78f84b72	Eliminate trace unused warning. (cherry picked from commit `9bc9992dd3`)	2023-07-07 15:06:18 -07:00
Antonio Sánchez	b158fcaa74	Fix edge-case in zeta for large inputs. (cherry picked from commit `9296bb4b93`)	2023-07-07 15:06:18 -07:00
Antonio Sánchez	b30a2a527e	Remove poor non-convergence checks in NonLinearOptimization. (cherry picked from commit `d819a33bf6`)	2023-07-07 11:50:25 -07:00
Antonio Sanchez	bc1b354b32	Adjust tolerance of matrix_power test for MSVC. (cherry picked from commit `1c2690ed24`)	2023-07-07 11:50:02 -07:00
Antonio Sánchez	36be6747e0	Modify test expression to avoid numerical differences (#2402). (cherry picked from commit `ae86a146b1`)	2023-07-07 11:45:56 -07:00
Antonio Sanchez	21e0ad056e	Fix ODR failures in TensorRandom. (cherry picked from commit `bded5028a5`)	2023-07-07 11:43:03 -07:00
Antonio Sánchez	52e545324e	Fix ODR violations. (cherry picked from commit `cafeadffef`)	2023-07-07 11:37:31 -07:00
Antonio Sánchez	f3aaba8705	Revert "Replace call to FixedDimensions() with a singleton instance of" This reverts commit `19e6496ce0` (cherry picked from commit `f7b31f864c`)	2022-04-10 15:34:11 +00:00
Antonio Sanchez	7e3bc4177e	Fix tensor broadcast off-by-one error. Caught by JAX unit tests. Triggered if broadcast is smaller than packet size. (cherry picked from commit `ffb78e23a1`)	2021-11-16 18:41:25 +00:00
Nico	71320af66a	Fix -Wbitwise-instead-of-logical clang warning. & and | don't short-circuit, && and || do. When both arguments to those are boolean, the short-circuiting version is usually the desired one, so clang warns on this. Here, it is inconsequential, so switch to && and || to suppress the warning. (cherry picked from commit `b17bcddbca`)	2021-11-03 23:32:57 +00:00
Antonio Sanchez	0ab1f8ec03	Fix broadcasting oob error. For vectorized 1-dimensional inputs that do not take the special blocking path (e.g. `std::complex<...>`), there was an index-out-of-bounds error causing the broadcast size to be computed incorrectly. Here we fix this, and make other minor cleanup changes. Fixes #2351. (cherry picked from commit `a500da1dc0`)	2021-11-03 23:30:47 +00:00
Antonio Sanchez	f9b2e92040	Remove bad "take" impl that causes g++-11 crash. For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes g++-11 to ICE with ``` sorry, unimplemented: unexpected AST of kind nontype_argument_pack ``` It does work with other versions of gcc, and with clang. I filed a GCC bug [here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999). Technically we should never actually run into this case, since you can't take n > 0 elements from an empty list. Commenting it out allows our Eigen tests to pass. (cherry picked from commit `8f8c2ba2fe`)	2021-11-03 23:26:34 +00:00
Maxiwell S. Garcia	b8cf1ed753	Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h (cherry picked from commit `09fc0f97b5`)	2021-09-01 17:26:59 +00:00
Antonio Sanchez	c2b6df6e60	Disable cuda Eigen::half vectorization on host. All cuda `__half` functions are device-only in CUDA 9, including conversions. Host-side conversions were added in CUDA 10. The existing code doesn't build prior to 10.0. All arithmetic functions are always device-only, so there's no reason to use vectorization on the host at all. Modified the code to disable vectorization for `__half` on host, which also required updating the `TensorReductionGpu` implementation, which previously made assumptions about available packets. (cherry picked from commit `cc3573ab44`)	2021-08-31 21:23:11 +00:00
jenswehner	338924602d	added includes for unordered_map (cherry picked from commit `e3e74001f7`)	2021-08-10 16:10:03 +00:00
Antonio Sanchez	46ecdcd745	Fix MPReal detection and support. The latest version of `mpreal` has a bug that breaks `min`/`max`. It also breaks with the latest dev version of `mpfr`. Here we add `FindMPREAL.cmake` which searches for the library and tests if compilation works. Removed our internal copy of `mpreal.h` under `unsupported/test`, as it is out-of-sync with the latest, and similarly breaks with the latest `mpfr`. It would be best to use the installed version of `mpreal` anyways, since that's what we actually want to test. Fixes #2282. (cherry picked from commit `31f796ebef`)	2021-08-03 18:13:12 +00:00
Antonio Sanchez	bb33880e57	Fix TriSycl CMake files. This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was broken by commit `00f32752`, which modified `add_sycl_to_target` for ComputeCPP. This makes the corresponding modifications for trisycl to make them consistent. Also, trisycl now requires c++17. (cherry picked from commit `8cf6cb27ba`)	2021-08-03 17:25:17 +00:00
Alexander Karatarakis	c334eece44	_DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier (cherry picked from commit `f357283d31`)	2021-07-29 18:18:47 +00:00
Jonas Harsch	5a3c9eddb4	Removed superfluous boolean `degenerate` in TensorMorphing.h. (cherry picked from commit `e9c9a3130b`)	2021-07-08 18:34:10 +00:00
Antonio Sanchez	84955d109f	Fix Tensor documentation page. The extra [TOC] tag is generating a huge floating duplicated table-of-contents, which obscures the majority of the page (see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html). Remove it. Also, headers do not support markup (see [doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so backticks like ``` ``` end up generating titles that look like ``` Constructor <tt>Tensor<double,2></tt> ``` Removing backticks for now. To generate properly formatted headers, we must directly use html instead of markdown, i.e. ``` <h2>Constructor <code>Tensor<double,2></code></h2> ``` which is ugly. Fixes #2254. (cherry picked from commit `f5a9873bbb`)	2021-07-07 17:18:20 +00:00
Jonas Harsch	601814b575	Don't crash when attempting to shuffle an empty tensor. (cherry picked from commit `aab747021b`)	2021-07-02 21:08:38 +00:00
Antonio Sanchez	8190739f12	Fix compile issues for gcc 4.8. - Move constructors can only be defaulted as NOEXCEPT if all members have NOEXCEPT move constructors. - gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter. (cherry picked from commit `6035da5283`)	2021-07-01 23:18:10 +00:00
Antonio Sanchez	d82d915047	Modify tensor argmin/argmax to always return first occurrence. As written, depending on multithreading/gpu, the returned index from `argmin`/`argmax` is not currently stable. Here we modify the functors to always keep the first occurrence (i.e. if the value is equal to the current min/max, then keep the one with the smallest index). This is otherwise causing unpredictable results in some TF tests. (cherry picked from commit `3a087ccb99`)	2021-06-29 23:28:37 +00:00
Antonio Sanchez	a2040ef796	Rewrite balancer to avoid overflows. The previous balancer overflowed for large row/column norms. Modified to prevent that. Fixes #2273. (cherry picked from commit `e9ab4278b7`)	2021-06-21 18:14:53 +00:00
Antonio Sanchez	2d6eaaf687	Fix placement of permanent GPU defines. (cherry picked from commit `954879183b`)	2021-06-15 19:18:20 +00:00
Rasmus Munk Larsen	47722a66f2	Fix more enum arithmetic. (cherry picked from commit `13fb5ab92c`)	2021-06-15 16:40:35 +00:00
Antonio Sanchez	b5fc69bdd8	Add ability to permanently enable HIP/CUDA gpu* defines. When using Eigen for gpu, these simplify portability. If `EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES` is set, then we do not undefine them. (cherry picked from commit `514977f31b`)	2021-06-11 17:48:37 +00:00
Antonio Sanchez	4b683b65df	Allow custom TENSOR_CONTRACTION_DISPATCH macro. Currently TF lite needs to hack around with the Tensor headers in order to customize the contraction dispatch method. Here we add simple `#ifndef` guards to allow them to provide their own dispatch prior to inclusion. (cherry picked from commit `6aec83263d`)	2021-06-11 17:19:29 +00:00
Rohit Santhanam	cbb6ae6296	Removed dead code from GPU float16 unit test. (cherry picked from commit `c8d40a7bf1`)	2021-06-10 17:16:47 +00:00
Nathan Luehr	82f13830e6	Fix calls to device functions from host code (cherry picked from commit `972cf0c28a`)	2021-05-12 17:01:45 +00:00
Antonio Sanchez	25424f4cf1	Clean up gpu device properties. Made a class and singleton to encapsulate initialization and retrieval of device properties. Related to !481, which already changed the API to address a static linkage issue. (cherry picked from commit `0eba8a1fe3`)	2021-05-07 18:13:40 +00:00
Antonio Sanchez	da19f7a910	Simplify TensorRandom and remove time-dependence. Time-dependence prevents tests from being repeatable. This has long been an issue with debugging the tensor tests. Removing this will allow future tests to be repeatable in the usual way. Also, the recently added macros in !476 are causing headaches across different platforms. For example, checking `_XOPEN_SOURCE` is leading to multiple ambiguous macro errors across Google, and `_DEFAULT_SOURCE`/`_SVID_SOURCE`/`_BSD_SOURCE` are sometimes defined with values, sometimes defined as empty, and sometimes not defined at all when they probably should be. This is leading to multiple build breakages. The simplest approach is to generate a seed via `Eigen::internal::random<uint64_t>()` if on CPU. For GPU, we use a hash based on the current thread ID (since `rand()` isn't supported on GPU). Fixes #1602. (cherry picked from commit `e3b7f59659`)	2021-05-05 23:37:48 +00:00

2956 Commits