eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2026-04-10 11:34:33 +08:00

Author	SHA1	Message	Date
Antonio Sanchez	1217390db4	Fix windows+CUDA builds	2023-10-25 20:55:59 +00:00
Fabian Keßler	d0bfdc1658	optimize cmake scripts for subproject use (cherry picked from commit `19cacd3ecb`)	2023-07-26 12:01:28 -07:00
Charles Schlosser	208e44c979	fix warnings in tensorreduction and memory	2023-07-19 16:48:07 +00:00
Antonio Sanchez	ac561cd038	Reduce tensor_contract_gpu test. The original test times out after 60 minutes on Windows, even when setting flags to optimize for speed. Reducing the number of contractions performed from 3600->27 for subtests 8,9 allow the two to run in just over a minute each. (cherry picked from commit `be9e7d205f`)	2023-07-11 11:27:31 -07:00
Antonio Sanchez	554982beef	Disable Tree reduction for GPU. For moderately sized inputs, running the Tree reduction quickly fills/overflows the GPU thread stack space, leading to memory errors. This was happening in the `cxx11_tensor_complex_gpu` test, for example. Disabling tree reduction on GPU fixes this. (cherry picked from commit `24ebb37f38`)	2023-07-10 16:09:30 -07:00
Antonio Sanchez	89a71f3126	Fix gpu special function tests. Some checks used incorrect values, partly from copy-paste errors, partly from the change in behaviour introduced in !398. Modified results to match scipy, simplified tests by updating `VERIFY_IS_CWISE_APPROX` to work for scalars. (cherry picked from commit `701f5d1c91`)	2023-07-10 15:57:08 -07:00
Antonio Sanchez	a605d6b996	Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS. Also add a missing space for clang. (cherry picked from commit `846d34384a`)	2023-07-10 15:30:41 -07:00
Antonio Sanchez	dfcd6de20a	Clean up CUDA CMake files. - Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt - Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed to the cuda compiler (nvcc or clang). The latter is to support passing custom flags (e.g. `-arch=` to nvcc, or to disable cuda-specific warnings). (cherry picked from commit `7b00e8b186`)	2023-07-10 15:30:41 -07:00
Antonio Sánchez	26b8fabd80	Return NaN in ndtri for values outside valid input range. (cherry picked from commit `1f79a6078f`)	2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen	f296720d7d	Make sure we return +/-1 above the clamping point for Erf(). (cherry picked from commit `b378014fef`)	2023-07-10 14:52:08 -07:00
Rasmus Munk Larsen	d4c24eca96	Don't crash on empty tensor contraction. (cherry picked from commit `b0f877f8e0`)	2023-07-10 14:52:08 -07:00
Antonio Sánchez	72b0759451	Fix arm builds. (cherry picked from commit `2c8011c2dd`)	2023-07-10 14:52:08 -07:00
Chip Kerchner	8f1b6198c2	Fix epsilon and dummy_precision values in long double for double doubles. Prevented some algorithms from converging on PPC. (cherry picked from commit `54459214a1`)	2023-07-10 14:52:08 -07:00
Antonio Sánchez	669dc8fadf	Eliminate bool bitwise warnings. (cherry picked from commit `b8e93bf589`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	ea57f9b78f	AutoDiff depends on Core, so include appropriate header. (cherry picked from commit `e1165dbf9a`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	f55a112cb1	Fix ODR violations. (cherry picked from commit `bb51d9f4fa`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	a11bdf3965	Skip f16/bf16 bessel specializations on AVX512 if unavailable. (cherry picked from commit `8ed3b9dcd6`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	80c5b8b3c3	Fix ambiguous comparisons for c++20 (again again) (cherry picked from commit `8c2e0e3cb8`)	2023-07-07 15:21:17 -07:00
Antonio Sánchez	af912a7b5c	Fix MSVC+CUDA issues. (cherry picked from commit `5ed7a86ae9`)	2023-07-07 15:21:17 -07:00
Antonio Sanchez	ac78f84b72	Eliminate trace unused warning. (cherry picked from commit `9bc9992dd3`)	2023-07-07 15:06:18 -07:00
Antonio Sánchez	b158fcaa74	Fix edge-case in zeta for large inputs. (cherry picked from commit `9296bb4b93`)	2023-07-07 15:06:18 -07:00
Antonio Sánchez	b30a2a527e	Remove poor non-convergence checks in NonLinearOptimization. (cherry picked from commit `d819a33bf6`)	2023-07-07 11:50:25 -07:00
Antonio Sanchez	bc1b354b32	Adjust tolerance of matrix_power test for MSVC. (cherry picked from commit `1c2690ed24`)	2023-07-07 11:50:02 -07:00
Antonio Sánchez	36be6747e0	Modify test expression to avoid numerical differences (#2402 ). (cherry picked from commit `ae86a146b1`)	2023-07-07 11:45:56 -07:00
Antonio Sanchez	21e0ad056e	Fix ODR failures in TensorRandom. (cherry picked from commit `bded5028a5`)	2023-07-07 11:43:03 -07:00
Antonio Sánchez	52e545324e	Fix ODR violations. (cherry picked from commit `cafeadffef`)	2023-07-07 11:37:31 -07:00
Antonio Sánchez	f3aaba8705	Revert "Replace call to FixedDimensions() with a singleton instance of" This reverts commit `19e6496ce0` (cherry picked from commit `f7b31f864c`)	2022-04-10 15:34:11 +00:00
Antonio Sanchez	7e3bc4177e	Fix tensor broadcast off-by-one error. Caught by JAX unit tests. Triggered if broadcast is smaller than packet size. (cherry picked from commit `ffb78e23a1`)	2021-11-16 18:41:25 +00:00
Nico	71320af66a	Fix -Wbitwise-instead-of-logical clang warning. & and | don't short-circuit; && and || do. When both arguments to those are boolean, the short-circuiting version is usually the desired one, so clang warns on this. Here, it is inconsequential, so switch to && and || to suppress the warning. (cherry picked from commit `b17bcddbca`)	2021-11-03 23:32:57 +00:00
Antonio Sanchez	0ab1f8ec03	Fix broadcasting oob error. For vectorized 1-dimensional inputs that do not take the special blocking path (e.g. `std::complex<...>`), there was an index-out-of-bounds error causing the broadcast size to be computed incorrectly. Here we fix this, and make other minor cleanup changes. Fixes #2351. (cherry picked from commit `a500da1dc0`)	2021-11-03 23:30:47 +00:00
Antonio Sanchez	f9b2e92040	Remove bad "take" impl that causes g++-11 crash. For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes g++-11 to ICE with ``` sorry, unimplemented: unexpected AST of kind nontype_argument_pack ``` It does work with other versions of gcc, and with clang. I filed a GCC bug [here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999). Technically we should never actually run into this case, since you can't take n > 0 elements from an empty list. Commenting it out allows our Eigen tests to pass. (cherry picked from commit `8f8c2ba2fe`)	2021-11-03 23:26:34 +00:00
Maxiwell S. Garcia	b8cf1ed753	Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h (cherry picked from commit `09fc0f97b5`)	2021-09-01 17:26:59 +00:00
Antonio Sanchez	c2b6df6e60	Disable cuda Eigen::half vectorization on host. All cuda `__half` functions are device-only in CUDA 9, including conversions. Host-side conversions were added in CUDA 10. The existing code doesn't build prior to 10.0. All arithmetic functions are always device-only, so there's therefore no reason to use vectorization on the host at all. Modified the code to disable vectorization for `__half` on host, which required also updating the `TensorReductionGpu` implementation which previously made assumptions about available packets. (cherry picked from commit `cc3573ab44`)	2021-08-31 21:23:11 +00:00
jenswehner	338924602d	added includes for unordered_map (cherry picked from commit `e3e74001f7`)	2021-08-10 16:10:03 +00:00
Antonio Sanchez	46ecdcd745	Fix MPReal detection and support. The latest version of `mpreal` has a bug that breaks `min`/`max`. It also breaks with the latest dev version of `mpfr`. Here we add `FindMPREAL.cmake` which searches for the library and tests if compilation works. Removed our internal copy of `mpreal.h` under `unsupported/test`, as it is out-of-sync with the latest, and similarly breaks with the latest `mpfr`. It would be best to use the installed version of `mpreal` anyways, since that's what we actually want to test. Fixes #2282. (cherry picked from commit `31f796ebef`)	2021-08-03 18:13:12 +00:00
Antonio Sanchez	bb33880e57	Fix TriSycl CMake files. This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was broken by commit `00f32752`, which modified `add_sycl_to_target` for ComputeCPP. This makes the corresponding modifications for trisycl to make them consistent. Also, trisycl now requires c++17. (cherry picked from commit `8cf6cb27ba`)	2021-08-03 17:25:17 +00:00
Alexander Karatarakis	c334eece44	_DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier (cherry picked from commit `f357283d31`)	2021-07-29 18:18:47 +00:00
Jonas Harsch	5a3c9eddb4	Removed superfluous boolean `degenerate` in TensorMorphing.h. (cherry picked from commit `e9c9a3130b`)	2021-07-08 18:34:10 +00:00
Antonio Sanchez	84955d109f	Fix Tensor documentation page. The extra [TOC] tag is generating a huge floating duplicated table-of-contents, which obscures the majority of the page (see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html). Remove it. Also, headers do not support markup (see [doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so backticks like ``` ``` end up generating titles that looks like ``` Constructor <tt>Tensor<double,2></tt> ``` Removing backticks for now. To generate proper formatted headers, we must directly use html instead of markdown, i.e. ``` <h2>Constructor <code>Tensor<double,2></code></h2> ``` which is ugly. Fixes #2254. (cherry picked from commit `f5a9873bbb`)	2021-07-07 17:18:20 +00:00
Jonas Harsch	601814b575	Don't crash when attempting to shuffle an empty tensor. (cherry picked from commit `aab747021b`)	2021-07-02 21:08:38 +00:00
Antonio Sanchez	8190739f12	Fix compile issues for gcc 4.8. - Move constructors can only be defaulted as NOEXCEPT if all members have NOEXCEPT move constructors. - gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter. (cherry picked from commit `6035da5283`)	2021-07-01 23:18:10 +00:00
Antonio Sanchez	d82d915047	Modify tensor argmin/argmax to always return first occurrence. As written, depending on multithreading/gpu, the returned index from `argmin`/`argmax` is not currently stable. Here we modify the functors to always keep the first occurrence (i.e. if the value is equal to the current min/max, then keep the one with the smallest index). This is otherwise causing unpredictable results in some TF tests. (cherry picked from commit `3a087ccb99`)	2021-06-29 23:28:37 +00:00
Antonio Sanchez	a2040ef796	Rewrite balancer to avoid overflows. The previous balancer overflowed for large row/column norms. Modified to prevent that. Fixes #2273. (cherry picked from commit `e9ab4278b7`)	2021-06-21 18:14:53 +00:00
Antonio Sanchez	2d6eaaf687	Fix placement of permanent GPU defines. (cherry picked from commit `954879183b`)	2021-06-15 19:18:20 +00:00
Rasmus Munk Larsen	47722a66f2	Fix more enum arithmetic. (cherry picked from commit `13fb5ab92c`)	2021-06-15 16:40:35 +00:00
Antonio Sanchez	b5fc69bdd8	Add ability to permanently enable HIP/CUDA gpu* defines. When using Eigen for gpu, these simplify portability. If `EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES` is set, then we do not undefine them. (cherry picked from commit `514977f31b`)	2021-06-11 17:48:37 +00:00
Antonio Sanchez	4b683b65df	Allow custom TENSOR_CONTRACTION_DISPATCH macro. Currently TF lite needs to hack around with the Tensor headers in order to customize the contraction dispatch method. Here we add simple `#ifndef` guards to allow them to provide their own dispatch prior to inclusion. (cherry picked from commit `6aec83263d`)	2021-06-11 17:19:29 +00:00
Rohit Santhanam	cbb6ae6296	Removed dead code from GPU float16 unit test. (cherry picked from commit `c8d40a7bf1`)	2021-06-10 17:16:47 +00:00
Nathan Luehr	82f13830e6	Fix calls to device functions from host code (cherry picked from commit `972cf0c28a`)	2021-05-12 17:01:45 +00:00
Antonio Sanchez	25424f4cf1	Clean up gpu device properties. Made a class and singleton to encapsulate initialization and retrieval of device properties. Related to !481, which already changed the API to address a static linkage issue. (cherry picked from commit `0eba8a1fe3`)	2021-05-07 18:13:40 +00:00

2957 Commits