Compare commits

..

12196 Commits

Author SHA1 Message Date
Antonio Sanchez
bc3b39870e Add 5.0.1 release notes and a few unreleased features.
(cherry picked from commit 91526464aef156bb0f847db6e00f0b971f9d9ac8)
2025-11-08 12:44:56 -08:00
Antonio Sanchez
c324ad2503 Enable tests on push 2025-11-06 09:31:16 -08:00
Antonio Sánchez
d6d4a3399f Fix MKL enum conversion warning.
<!--
Thanks for contributing a merge request!

We recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen).

Before submitting the MR, please complete the following checks:
- Create one PR per feature or bugfix,
- Run the test suite to verify your changes.
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- Add tests to cover the bug addressed or any new feature.
- Document new features.  If it is a substantial change, add it to the [Changelog](https://gitlab.com/libeigen/eigen/-/blob/master/CHANGELOG.md).
- Leave the following box checked when submitting: `Allow commits from members who can merge to the target branch`.
  This allows us to rebase and merge your change.

Note that we are a team of volunteers; we appreciate your patience during the review process.
-->

### Description
<!--Please explain your changes.-->

Fix MKL enum conversion warning.

### Reference issue
<!--
You can link to a specific issue using the gitlab syntax #<issue number>.
If the MR fixes an issue, write "Fixes #<issue number>" to have the issue automatically closed on merge.
-->
Fixes #2999.

Closes #2999

See merge request libeigen/eigen!2061

(cherry picked from commit 142caf889c)
2025-11-05 13:34:05 -08:00
Antonio Sánchez
6d5a91dae2 Remove deprecated CUDA device properties.
<!--
Thanks for contributing a merge request!

We recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen).

Before submitting the MR, please complete the following checks:
- Create one PR per feature or bugfix,
- Run the test suite to verify your changes.
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- Add tests to cover the bug addressed or any new feature.
- Document new features.  If it is a substantial change, add it to the [Changelog](https://gitlab.com/libeigen/eigen/-/blob/master/CHANGELOG.md).
- Leave the following box checked when submitting: `Allow commits from members who can merge to the target branch`.
  This allows us to rebase and merge your change.

Note that we are a team of volunteers; we appreciate your patience during the review process.
-->

### Description
<!--Please explain your changes.-->

Remove deprecated CUDA device properties.

### Reference issue
<!--
You can link to a specific issue using the gitlab syntax #<issue number>.
If the MR fixes an issue, write "Fixes #<issue number>" to have the issue automatically closed on merge.
-->
Fixes #3000.

Closes #3000

See merge request libeigen/eigen!2060

(cherry picked from commit 9e5714b93b)
2025-11-05 13:33:49 -08:00
Rasmus Munk Larsen
99e199f0eb Make assume_aligned a no-op on ARM & ARM64 when msan is used, to work around a missing linker symbol.
(cherry picked from commit 71703a9816)
2025-11-05 13:32:28 -08:00
Rasmus Munk Larsen
f471ebb8cc Fix more bugs in !2052
Fixes #2998

Closes #2998

See merge request libeigen/eigen!2057

Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
(cherry picked from commit ed9a0e59ba)
2025-11-05 13:32:02 -08:00
Rasmus Munk Larsen
f7ab506bd0 Fixes #2998.
(cherry picked from commit bfdbc031c2)
2025-11-05 13:31:45 -08:00
Antonio Sanchez
0db477863d Set 5.0.1 release version. 2025-11-05 13:31:12 -08:00
Antonio Sánchez
7a85735a1a Implement assume_aligned using the standard API
This implements `Eigen::internal::assume_aligned` to match the API for C++20 standard as best as possible using either `std::assume_aligned` or `__builtin_assume_aligned` if available. If neither is available, the function is a no-op.

The override macro `EIGEN_ASSUME_ALIGNED` was changed to a `EIGEN_DONT_ASSUME_ALIGNED`, which now forces the function to be a no-op.

See merge request libeigen/eigen!2052


(cherry picked from commit 8716f109e4)

f8191848 Fix pcmp_* for HVX to to comply with the new definition of true = Scalar(1).
7cc169d9 Revert "Fix pcmp_* for HVX to to comply with the new definition of true = Scalar(1)."
06999845 Merge branch eigen:master into master
b5fde61f Merge branch eigen:master into master
10d10d60 Merge branch eigen:master into master
9398d6ad Merge branch eigen:master into master
7804d5a4 Merge branch eigen:master into master
0068623c Merge branch eigen:master into master
b0ffc9cf Merge branch eigen:master into master
f3791c80 Merge branch eigen:master into master
74275f0c Merge branch eigen:master into master
b095614e Merge branch eigen:master into master
1312a696 Merge branch eigen:master into master
e6dd44d2 Merge branch eigen:master into master
8ac67769 Implement assume_aligned using the standard API if available.
97b299fa Format.
b31798be Fix typos.
04b3d312 Unformat.

Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
2025-11-05 21:26:22 +00:00
Antonio Sánchez
748962722c Fix SparseVector::insert(Index) assigning int to Scalar
Scalar doesn't necessarily support implicit construction from int or
assignment from int.

Here's the error message I got without this fix:
```
/home/tav/git/Sleipnir/build/_deps/eigen3-src/Eigen/src/SparseCore/SparseVector.h:180:25: error: no match for ‘operator=’ (operand types are ‘Eigen::internal::CompressedStorage<ExplicitDouble, int>::Scalar’ {aka ‘ExplicitDouble’} and ‘int’)
  180 |     m_data.value(p + 1) = 0;
      |     ~~~~~~~~~~~~~~~~~~~~^~~
```

See merge request libeigen/eigen!2046


(cherry picked from commit 9234883914)

48d2b101 Fix SparseVector::insert(Index) assigning int to Scalar

Co-authored-by: Tyler Veness <calcmogul@gmail.com>
2025-11-05 21:24:44 +00:00
Antonio Sánchez
284dcc122b Fix doc references for nullary expressions.
<!-- 
Thanks for contributing a merge request!

We recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen).

Before submitting the MR, please complete the following checks:
- Create one PR per feature or bugfix,
- Run the test suite to verify your changes.
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- Add tests to cover the bug addressed or any new feature.
- Document new features.  If it is a substantial change, add it to the [Changelog](https://gitlab.com/libeigen/eigen/-/blob/master/CHANGELOG.md).
- Leave the following box checked when submitting: `Allow commits from members who can merge to the target branch`.
  This allows us to rebase and merge your change.

Note that we are a team of volunteers; we appreciate your patience during the review process.
-->

### Description
<!--Please explain your changes.-->

Fix doc references for nullary expressions.

### Reference issue
<!--
You can link to a specific issue using the gitlab syntax #<issue number>. 
If the MR fixes an issue, write "Fixes #<issue number>" to have the issue automatically closed on merge.
-->

Fixes #2997.

### Additional information
<!--Any additional information you think is important.-->

Closes #2997

See merge request libeigen/eigen!2054


(cherry picked from commit 04eb06b354)

31a2702e Fix doc references for nullary expressions.

Co-authored-by: Antonio Sánchez <cantonios@google.com>
2025-11-03 18:54:35 +00:00
Antonio Sánchez
691009cfa3 Allow user to configure if free is allowed at runtime.
<!--
Thanks for contributing a merge request!

We recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen).

Before submitting the MR, please complete the following checks:
- Create one PR per feature or bugfix,
- Run the test suite to verify your changes.
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- Add tests to cover the bug addressed or any new feature.
- Document new features.  If it is a substantial change, add it to the [Changelog](https://gitlab.com/libeigen/eigen/-/blob/master/CHANGELOG.md).
- Leave the following box checked when submitting: `Allow commits from members who can merge to the target branch`.
  This allows us to rebase and merge your change.

Note that we are a team of volunteers; we appreciate your patience during the review process.
-->

### Description
<!--Please explain your changes.-->

Allow user to configure if `free` is allowed at runtime.

Reverts to Eigen 3.4 behavior by default, where `free(...)` is allowed if `EIGEN_RUNTIME_NO_MALLOC` is defined but `set_is_malloc_allowed(true)`.  Adds a separate `set_is_free_allowed(...)` to explicitly control use of `std::free`.

### Reference issue
<!--
You can link to a specific issue using the gitlab syntax #<issue number>.
If the MR fixes an issue, write "Fixes #<issue number>" to have the issue automatically closed on merge.
-->

Fixes #2983.

### Additional information
<!--Any additional information you think is important.-->

Closes #2983

See merge request libeigen/eigen!2047

(cherry picked from commit 60122df698)
2025-10-29 08:40:21 -07:00
Tyler Veness
a9110bfc87 Fix ambiguous sqrt() overload caused by ADL
Here's the compiler error:
```
/home/tav/git/Sleipnir/build/_deps/eigen3-src/Eigen/src/Householder/Householder.h:82:16: error: call of overloaded ‘sqrt(boost::decimal::decimal64_t)’ is ambiguous
   82 |     beta = sqrt(numext::abs2(c0) + tailSqNorm);
      |            ~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/tav/git/Sleipnir/build/_deps/eigen3-src/Eigen/src/Householder/Householder.h:82:16: note: there are 2 candidates
In file included from /home/tav/git/Sleipnir/build/_deps/eigen3-src/Eigen/Core:198,
                 from /home/tav/git/Sleipnir/test/src/optimization/cart_pole_problem_test.cpp:8:
/home/tav/git/Sleipnir/build/_deps/eigen3-src/Eigen/src/Core/MathFunctions.h:1384:75: note: candidate 1: ‘typename Eigen::internal::sqrt_retval<typename Eigen::internal::global_math_functions_filtering_base<Scalar>::type>::type Eigen::numext::sqrt(const Scalar&) [with Scalar = boost::decimal::decimal64_t; typename Eigen::internal::sqrt_retval<typename Eigen::internal::global_math_functions_filtering_base<Scalar>::type>::type = boost::decimal::decimal64_t; typename Eigen::internal::global_math_functions_filtering_base<Scalar>::type = boost::decimal::decimal64_t]’
 1384 | EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE EIGEN_MATHFUNC_RETVAL(sqrt, Scalar) sqrt(const Scalar& x) {
      |                                                                           ^~~~
In file included from /home/tav/git/Sleipnir/build/_deps/decimal-src/include/boost/decimal/detail/cmath/ellint_1.hpp:16,
                 from /home/tav/git/Sleipnir/build/_deps/decimal-src/include/boost/decimal/cmath.hpp:18,
                 from /home/tav/git/Sleipnir/build/_deps/decimal-src/include/boost/decimal.hpp:33,
                 from /home/tav/git/Sleipnir/test/include/scalar_types_under_test.hpp:6,
                 from /home/tav/git/Sleipnir/test/src/optimization/cart_pole_problem_test.cpp:19:
/home/tav/git/Sleipnir/build/_deps/decimal-src/include/boost/decimal/detail/cmath/sqrt.hpp:167:16: note: candidate 2: ‘constexpr T boost::decimal::sqrt(T) requires  is_decimal_floating_point_v<T> [with T = decimal64_t]’
  167 | constexpr auto sqrt(const T val) noexcept
      |                ^~~~
```

Calling a function via its unqualified name invokes argument-dependent lookup. In this case, since `using numext::sqrt;` was used, both `numext::sqrt()` and `boost::decimal::sqrt()` participated in overload resolution. Since only `numext::sqrt()` was intended, the fix is to call that overload directly instead.

See merge request libeigen/eigen!2044

(cherry picked from commit be56fff1ff)
2025-10-27 09:49:29 -07:00
Antonio Sánchez
da889c88f9 Clarify range spanning major versions only works with 3.4.1.
<!--
Thanks for contributing a merge request!

We recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen).

Before submitting the MR, please complete the following checks:
- Create one PR per feature or bugfix,
- Run the test suite to verify your changes.
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- Add tests to cover the bug addressed or any new feature.
- Document new features.  If it is a substantial change, add it to the [Changelog](https://gitlab.com/libeigen/eigen/-/blob/master/CHANGELOG.md).
- Leave the following box checked when submitting: `Allow commits from members who can merge to the target branch`.
  This allows us to rebase and merge your change.

Note that we are a team of volunteers; we appreciate your patience during the review process.
-->

### Description
<!--Please explain your changes.-->

Clarify range spanning major versions only works with 3.4.1.

### Reference issue
<!--
You can link to a specific issue using the gitlab syntax #<issue number>.
If the MR fixes an issue, write "Fixes #<issue number>" to have the issue automatically closed on merge.
-->

Fixes #2994.

Closes #2994

See merge request libeigen/eigen!2042

(cherry picked from commit 1a5eecd45e)
2025-10-27 09:49:14 -07:00
Antonio Sánchez
85e122408b Eliminate use of std::cout in ArpackSelfAdjointEigenSolver.
<!--
Thanks for contributing a merge request!

We recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen).

Before submitting the MR, please complete the following checks:
- Create one PR per feature or bugfix,
- Run the test suite to verify your changes.
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- Add tests to cover the bug addressed or any new feature.
- Document new features.  If it is a substantial change, add it to the [Changelog](https://gitlab.com/libeigen/eigen/-/blob/master/CHANGELOG.md).
- Leave the following box checked when submitting: `Allow commits from members who can merge to the target branch`.
  This allows us to rebase and merge your change.

Note that we are a team of volunteers; we appreciate your patience during the review process.
-->

### Description
<!--Please explain your changes.-->

Eliminate use of std::cout in ArpackSelfAdjointEigenSolver.

Instead set the appropriate error status on failure.

### Reference issue
<!--
You can link to a specific issue using the gitlab syntax #<issue number>.
If the MR fixes an issue, write "Fixes #<issue number>" to have the issue automatically closed on merge.
-->

### Additional information
<!--Any additional information you think is important.-->

See merge request libeigen/eigen!2041

(cherry picked from commit b4209fe984)
2025-10-27 09:49:02 -07:00
Tyler Veness
5aec73ab06 Fix SparseVector::insertBack() with custom scalar types
It fixes this compiler error:
```
/home/tav/git/Sleipnir/build/_deps/eigen3-src/Eigen/src/SparseCore/SparseVector.h:143:19: error: cannot convert ‘int’ to ‘const Eigen::internal::CompressedStorage<boost::decimal::decimal64_t, int>::Scalar&’ {aka ‘const boost::decimal::decimal64_t&’}
  143 |     m_data.append(0, i);
      |                   ^
      |                   |
      |                   int
```

This change matches what SparseMatrix does:
https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/SparseCore/SparseMatrix.h#L430-L438

See merge request libeigen/eigen!2040

(cherry picked from commit ac3ef16f30)
2025-10-27 09:48:45 -07:00
Charles Schlosser
9071c1cd07 CI enhancements: visual indication of flaky tests
<!--
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message.
If the MR fixes an issue, please include "Fixes #issue" in the commit message and the MR description.

In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR.

Before submitting the MR, you also need to complete the following checks:
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes.
- Rebase before committing
- For code changes, run the test suite (at least the tests that are likely affected by the change).
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- If possible, add a test (both for bug-fixes as well as new features)
- Make sure new features are documented

Note that we are a team of volunteers; we appreciate your patience during the review process.

Again, thanks for contributing! -->

### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number>  -->

### What does this implement/fix?
<!--Please explain your changes.-->

Currently, we run each test 3 times to account for flaky tests. Sometimes, the test fails so quickly that the random seed is the same for the subsequent test, which fails the exact same way.

This MR uses a nanosecond seed which resolves the issue described above. Now, if the test does not pass on the first attempt but passes on the retries, the gitlab job status will be yellow but still be treated as a pass in the ci/cd pipeline. Hopefully, this means we will get more passes and help us identify room for improvement.

### Additional information
<!--Any additional information you think is important.-->

See merge request libeigen/eigen!2025

(cherry picked from commit 40da5b64ce)
2025-10-27 09:48:36 -07:00
Antonio Sánchez
41e848c558 Support AVX for i686.
<!--
Thanks for contributing a merge request!

We recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen).

Before submitting the MR, please complete the following checks:
- Create one PR per feature or bugfix,
- Run the test suite to verify your changes.
  See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- Add tests to cover the bug addressed or any new feature.
- Document new features.  If it is a substantial change, add it to the [Changelog](https://gitlab.com/libeigen/eigen/-/blob/master/CHANGELOG.md).
- Leave the following box checked when submitting: `Allow commits from members who can merge to the target branch`.
  This allows us to rebase and merge your change.

Note that we are a team of volunteers; we appreciate your patience during the review process.
-->

### Description
<!--Please explain your changes.-->

Support AVX for i686.

There was an existing work-around for windows.  Added the more generic
architecture comparison to also apply for linux.

### Reference issue
<!--
You can link to a specific issue using the gitlab syntax #<issue number>.
If the MR fixes an issue, write "Fixes #<issue number>" to have the issue automatically closed on merge.
-->

Fixes #2991.

Closes #2991

See merge request libeigen/eigen!2037

(cherry picked from commit 8e60d4173c)
2025-10-27 09:48:15 -07:00
Antonio Sánchez
66375a8dc9 Fix commit references in changelog.
(cherry picked from commit f5907c5930)
2025-10-27 09:48:01 -07:00
Antonio Sanchez
076e524362 Try disabling the cache again for ROCm.
(cherry picked from commit 73da4623b1)
2025-10-27 09:47:47 -07:00
Saran Tunyasuvunakool
55d059108e Add a missing #include <version> to Core.
(cherry picked from commit db02f97850)
2025-10-27 09:47:32 -07:00
Antonio Sánchez
a5c3f697fa Fix DLL builds and c++ lapack declarations.
(cherry picked from commit e1f1a608be)
2025-10-27 09:47:17 -07:00
Antonio Sanchez
bdacccda15 Grammar fix "must has" --> "must have".
(cherry picked from commit 52358cb93b)
2025-10-27 09:47:04 -07:00
Antonio Sánchez
c9c2c1300a Disable ROCm job cache.
(cherry picked from commit 3bd0bfe0e0)
2025-10-27 09:46:47 -07:00
Charlie Schlosser
9ad6809b32 assume_aligned uses bytes not bits
(cherry picked from commit cd4f989f8f)
2025-10-27 09:46:30 -07:00
Antonio Sánchez
4f240eee3f Add a bunch of useful scripts for planning releases.
(cherry picked from commit ac7c192e1b)
2025-10-27 09:46:15 -07:00
Damiano Franzò
0f80c3d7e5 Fix jacobi svd for TriangularBase
(cherry picked from commit 5bc944a3ef)
2025-10-27 09:46:04 -07:00
Antonio Sánchez
e2e711aabe Fix BLAS/LAPACK DLL usage on Windows.
(cherry picked from commit dbe9e6961e)
2025-10-27 09:45:51 -07:00
Antonio Sánchez
6c1a886042 Add workaround for using std::fma for scalar multiply-add.
(cherry picked from commit ef3c5c1d1d)
2025-10-27 09:45:41 -07:00
Charles Schlosser
a34d8eee5c Fix alignment bug in avx pcast<Packet4l, Packet4d>
(cherry picked from commit 5996176b88)
2025-10-27 09:45:26 -07:00
Laurenz
92d5da40ac Fix SSE PacketMath Compilation Error on QNX
(cherry picked from commit 4bd382df56)
2025-10-27 09:45:08 -07:00
Charles Schlosser
b5848287be fix errors in windows builds and tests
(cherry picked from commit 13bd14974d)
2025-10-27 09:44:55 -07:00
Charles Schlosser
0ea30a44f3 get rid of a bunch of windows jobs
(cherry picked from commit f9f515fb55)
2025-10-27 09:43:13 -07:00
Antonio Sánchez
4abf3bd540 Update dev version number.
(cherry picked from commit ccde35bcd5)
2025-10-27 09:40:18 -07:00
Sergiu Deitsch
b89b549f96 Eliminate possible -Wstringop-overflow warning in .setZero()
(cherry picked from commit 32b0f386bc)
2025-10-27 09:39:01 -07:00
Antonio Sanchez
be7d8d37f2 Fix line-endings for MSVC files. 2025-10-27 09:37:58 -07:00
Jeremy Nimmer
b27f784ca5 Fix scalar_inner_product_op when binary ops return a different type 2025-10-06 18:19:59 +00:00
Guilhem Saurel
c08497ae1c tests: add missing link
(cherry picked from commit a67f9dabb0)
2025-10-02 10:10:05 -07:00
Antonio Sánchez
514221aee5 Update geo_homogeneous test, add eval() to PermutationMatrix.
(cherry picked from commit 4916887f2c)
2025-10-02 10:09:37 -07:00
Eugene Zhulenev
b60f763ec0 The 'CompressedStorageIterator<>' needs to satisfy the RandomAccessIterator
(cherry picked from commit 5c1029be1a)
2025-10-02 10:08:49 -07:00
Antonio Sanchez
63b5f63a6a Enable docs for the 5.0 branch on push. 2025-10-01 09:01:44 -07:00
Antonio Sanchez
549bf8c75b Add 5.0 changelog. 2025-09-28 00:24:17 -07:00
Charlie Schlosser
151b95d078 bump to 5.0.0 2025-09-28 00:24:17 -07:00
Hans Johnson
2e5447e620 STYLE: Scripts with shebang should be executable 2025-09-28 06:38:59 +00:00
Sergiu Deitsch
8d7ebac6ec Disambiguate multiplication of a permutation matrix and a homogeneous vector 2025-09-27 14:05:28 +02:00
Charles Schlosser
bea7f7c582 SparseMatrixBase: delete redundant/shadowed typedef 2025-09-26 09:32:28 +00:00
Julien Schueller
7292c78e18 blas: Fix parenthesis suggestion warning 2025-09-24 19:14:55 +00:00
Sergiu Deitsch
e524488eb2 Convert Mercurial hgeol to gitattributes 2025-09-24 19:14:40 +00:00
Charles Schlosser
dbd25f632b Fix select: return typed comparisons if vectorized 2025-09-24 05:38:12 +00:00
Antonio Sánchez
027dc5bc8d Extend the range of supported CMake package config versions 2025-09-23 19:52:35 +00:00
Sergiu Deitsch
4df215785b Support matrix multiplication of homogeneous row vectors 2025-09-23 14:56:28 +00:00
Rasmus Munk Larsen
2d170aea11 Define pcmp_le generically in terms of pcmp_eq and pcmp_lt. 2025-09-23 14:34:57 +00:00
Sergiu Deitsch
ea869e183b Add missing bool SSE2 PacketMath comparison 2025-09-22 21:28:45 +02:00
Julien Schueller
6ef18340a1 CMake: Explicit STATIC libs 2025-09-22 18:32:36 +00:00
Sergiu Deitsch
14477c5d43 Replace deprecated std::is_trivial by an internal definition 2025-09-22 16:59:10 +00:00
Antonio Sánchez
b2ec79a23c Move smoketests to small GitLab runners. 2025-09-22 16:45:02 +00:00
Sergiu Deitsch
62fbd276e0 Provide hints for deprecated functionality 2025-09-22 16:00:42 +00:00
Mark Shachkov
d38d669fdb Fix real schur exceptional shift 2025-09-22 15:57:14 +00:00
Sergiu Deitsch
4ac3e71f77 CMake: Require at least C++14 2025-09-22 15:45:39 +00:00
Antonio Sánchez
a627f72cd6 Add "Version" file and update version. 2025-09-20 02:08:59 +00:00
Evan Porter
e0a59e5a66 Fix typo 2025-09-09 06:21:42 +00:00
Evan Porter
6cd6284f7f Make the sparse matrix printer pretty 2025-09-08 20:05:46 +00:00
Antonio Sánchez
e5f3fa2d61 Add gemmtr implementation. 2025-09-05 22:31:30 +00:00
Antonio Sanchez
f426eff949 Add inline/device-function attributes to fma. 2025-09-02 22:51:35 +00:00
Antonio Sánchez
da1a34a6ba Zero-out matrix for empty set of triplets. 2025-09-02 22:51:17 +00:00
Evan Porter
52fc978c6f fixed typo sparcity -> sparsity 2025-09-02 19:34:43 +00:00
Antonio Sánchez
8a8fbc8f5e Don't enable AVX for wasm. 2025-08-29 21:50:25 +00:00
Antonio Sanchez
70d8d99d0d Only build docs on push to master branch, not MRs. 2025-08-29 18:33:09 +00:00
Antonio Sánchez
7f0cb638c5 Specialize numext::madd for half/bfloat16. 2025-08-29 18:11:25 +00:00
Antonio Sánchez
1e9d7ed7d3 Add missing semicolon to has_fma definitions to fix GPU builds. 2025-08-29 17:19:28 +00:00
Antonio Sanchez
5d4485e767 Move more jobs to gitlab runners. 2025-08-29 10:06:35 -07:00
Antonio Sánchez
2e8cc042a1 Replace calls to numext::fma with numext:madd. 2025-08-28 21:40:19 +00:00
Antonio Sánchez
52f570a409 Move GPU ci jobs to gitlab-hosted runners. 2025-08-28 18:24:41 +00:00
Charles Schlosser
38b51d5b7e Mitigate setConstant regression with custom scalars 2025-08-26 20:04:17 +00:00
Antonio Sanchez
d2a70fe4e2 Make permutation products aliasing by default. 2025-08-25 18:39:06 +00:00
Antonio Sánchez
4ae5647355 Fix direct index aliased assignment. 2025-08-25 18:17:18 +00:00
Antonio Sánchez
1a45d2168e Fix use of FMA in triangular solver for boost multiprecision. 2025-08-25 18:05:22 +00:00
anonymouspc
05e74b1a40 Tiny fix in unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h 2025-08-25 10:26:46 +00:00
Tyler Veness
d368998120 Fix MSVC error about missing std::bit_cast 2025-08-23 22:25:52 +00:00
Aleksei Nikiforov
c487a4fe9e Clean up most of testsuite on s390x 2025-08-15 20:04:25 +00:00
Charles Schlosser
4033cfcc1d Fix dangling reference in VectorwiseOp::iterator: Episode II: The Dependent Typedef Strikes Back 2025-08-14 16:30:19 +00:00
Charles Schlosser
e9dfbad618 Fix dangling reference in VectorwiseOp::iterator 2025-08-14 00:04:01 +00:00
Charles Schlosser
43a65a9cbd add RealView api 2025-08-12 16:55:05 +00:00
Rasmus Munk Larsen
954e21152e Include <limits> in test main.h 2025-08-10 21:23:31 +00:00
Artem Bishev
e15cd620a0 Remove select class 2025-08-10 17:44:09 +00:00
Cheng Wang
1c0048a08c Fix inconsistency between ptrue and pcmp_* in HVX 2025-08-09 19:32:30 +00:00
Artem Bishev
ddce1d7d12 Fixes #2952 2025-08-07 16:58:22 +00:00
Tyler Veness
8b9dbcdaaf Fix numext::bit_cast() compilation failure in C++20 2025-08-07 00:03:33 +00:00
Rasmus Munk Larsen
975a5aba4f Fix TODO: Use std::bit_cast or __builtin_bit_cast if available. 2025-08-06 19:00:08 +00:00
Rasmus Munk Larsen
4be7e6b4e0 Fix pcmp_* for HVX to comply with the new definition of true = Scalar(1) 2025-08-04 20:56:24 +00:00
Antonio Sánchez
edcf4c135f Remove fortran dependency for eigenblas. 2025-08-04 19:11:43 +00:00
Antonio Sanchez
e4493233e8 Fix EIGEN_OPTIMIZATION_BARRIER for clang-cl 2025-07-31 17:02:43 +00:00
Charles Schlosser
f5ead2d34c Fix intel packet math header inclusion order 2025-07-29 01:00:37 +00:00
Charles Schlosser
1e65707aa2 Suppress Warray-bounds warning in generic ploaduSegment, fix edge case for vectorized cast 2025-07-23 22:26:40 +00:00
Rasmus Munk Larsen
abeba85356 Use proper float literals in SpecialFunctionsImpl.h. 2025-07-19 01:17:12 +00:00
Rasmus Munk Larsen
b5bef9dcb0 Fix bug in Erfc introduced in !1862. 2025-07-18 17:58:48 -07:00
Rasmus Munk Larsen
97c7cc6200 Explicitly use the packet trait HasPow to control whether Pow is vectorized. 2025-07-18 21:51:42 +00:00
Rasmus Munk Larsen
efe5b6979d Unconditionally include <memory>. Some c++20 builds are currently broken because it is needed for std::assume_aligned. 2025-07-18 18:06:28 +00:00
Rasmus Munk Larsen
2cf66d4b0d Use numext::fma in more places in SparseCore. 2025-07-17 21:20:39 +00:00
jacques FRANC
d7fa5ebe0e Fix API incompatibility for ILU in superLU support 2025-07-17 15:27:26 +00:00
Kuan-Ting
cedf1f4c17 Fix typo: duplicated 'for' in docs 2025-07-16 01:12:48 +00:00
Charles Schlosser
302fc46bc3 arm packet alignment requirements and aligned loads/stores 2025-07-15 23:49:04 +00:00
Sean McBride
430e35fbd1 Fixed -Wshadow warning by renaming variables 2025-07-11 11:30:23 -04:00
Antonio Sánchez
bd0cd1d67b Fix self-adjoint products when multiplying by a compile-time vector. 2025-07-08 21:48:59 +00:00
Charles Schlosser
6854da2ea0 Fix 1x1 selfadjoint matrix-vector product bug 2025-07-07 17:32:54 +00:00
Sean McBride
ac1b29f823 Set CMake POLICY CMP0177 to NEW 2025-07-07 16:37:01 +00:00
Antonio Sánchez
849a336243 Move default builds/tests to GitLab runners. 2025-07-05 04:37:08 +00:00
Rasmus Munk Larsen
8ac2fb077d Use numext::fma for sparse x dense dot product. 2025-07-02 23:19:26 +00:00
Antonio Sánchez
cc0be00435 Fix docs build. 2025-07-02 22:10:33 +00:00
Antonio Sánchez
f169c13d8e Replace PPC g++-10 with g++14. 2025-07-02 17:07:44 +00:00
Henric Ryden
7fa069ef90 tensor documentation 2025-06-29 03:47:42 +00:00
Antonio Sánchez
7c636dd5db Move HIP/CUDA defines to Core. 2025-06-27 16:48:07 +00:00
Antonio Sánchez
26616fe5b8 Fix VSX packetmath psin and pcast tests. 2025-06-27 04:08:20 +00:00
Antonio Sánchez
a395ee162d Fix a collection of random failures encountered when testing with Bazel. 2025-06-26 16:58:24 +00:00
Antonio Sánchez
0bce653efc Use QEMU for arm and ppc tests. 2025-06-25 15:22:46 +00:00
Antonio Sánchez
db8bd5b825 Modify pselect and various masks to use Scalar(1) for true. 2025-06-20 22:40:46 +00:00
Antonio Sánchez
6de0515fa6 Create a changelog file. 2025-06-20 21:54:14 +00:00
Antonio Sánchez
98fbf6ed77 Decommission aarch64 ampere runner. 2025-06-20 20:33:52 +00:00
Charles Schlosser
81044ec13d Provide macro to explicitly disable alloca 2025-06-19 04:23:35 +00:00
Charles Schlosser
bcce88c99e Faster emulated half comparisons 2025-06-17 17:05:58 +00:00
Filippo Basso
ac6955ebc6 Remove MSVC warnings in FindCoeff.h 2025-06-17 00:39:02 +00:00
Antonio Sánchez
67a898a079 Fix unprotected SIZE in macro. 2025-06-16 22:54:25 +00:00
Antonio Sánchez
cdf6a1f5ed Add OpenBLAS sbgemm. 2025-06-16 18:23:03 +00:00
Charles Schlosser
d228bcdf8f Fix neon compilation bug 2025-06-10 21:52:01 +00:00
Charles Schlosser
994f3d107a Fix neon packet math tests, add missing neon intrinsics 2025-06-09 17:13:31 +00:00
AnonymousPC
cda19a6255 Make Eigen::Map<const Vector>::operator[] return correct type 2025-06-06 19:15:18 +00:00
Charles Schlosser
d0b490ee09 Optimize maxCoeff and friends 2025-06-06 14:55:49 +00:00
Antonio Sánchez
c458d68fae Fix compile warning about * with bool. 2025-06-05 22:48:57 +00:00
Adam Cogdell
3f00059beb Fix fuzzer range error for scalar parity check. 2025-06-05 22:27:35 +00:00
Charles Schlosser
21e89b930c Enable default behavior for pmin<PropagateFast>, predux_min, etc 2025-06-02 17:23:37 +00:00
Charles Schlosser
4fdf87bbf5 clean up intel packet reductions 2025-05-30 19:18:07 +00:00
Hs293Go
a7f183cadb Add factory/getters for quat coeffs in both orders 2025-05-28 18:39:55 -04:00
Sergiu Deitsch
d81aa18f4d Explicitly construct the scalar for non-implicitly convertible types 2025-05-15 17:40:29 +02:00
Charles Schlosser
171bd08ca9 fix 2849 2025-05-15 02:04:50 +00:00
Damiano Franzò
db85838ee2 Add DUCC FFT support 2025-05-12 17:56:02 +00:00
Damiano Franzò
6f1a143418 Ensure info() implementation across all SolverBase derived types 2025-05-10 01:25:26 +00:00
Damiano Franzò
f3e7d64f3d Fix: Correct Lapacke bindings for BDCSVD and JacobiSVD to match the updated API 2025-05-09 11:52:53 +00:00
Rasmus Munk Larsen
434a2fc4a4 Fix obsolete comment in InverseImpl.h. We use PartialPivLU for the general case. 2025-05-08 23:02:10 +00:00
Rasmus Munk Larsen
ae3aba99db Fix typo in CoreEvaluators.h 2025-05-08 17:43:12 +00:00
Charles Schlosser
ee4f86f909 Fix MSAN in vectorized casting evaluator 2025-05-08 09:38:35 +00:00
Duy Tran
6dbbf0a843 CMake: only create uninstall target when eigen is top level 2025-05-02 23:17:42 +00:00
Damiano Franzò
fb2fca90be Avoid unnecessary matrix copy in BDCSVD and JacobiSVD 2025-05-01 23:17:21 +00:00
Tyler Veness
d6b23a2256 Fix unused local typedef warning in matrix exponential 2025-04-29 19:54:15 +00:00
Rasmus Munk Larsen
7294434099 Avoid UB in ploaduSegment 2025-04-25 21:13:52 +00:00
Antonio Sánchez
2265a5e025 Fix commainitializer noexcept test. 2025-04-23 00:05:02 +00:00
Tyler Veness
619be0deb6 Replace instances of EIGEN_NOEXCEPT macros 2025-04-22 00:58:47 +00:00
Rasmus Munk Larsen
d2dce37767 Optimize slerp() as proposed by Gopinath Vasalamarri. 2025-04-21 14:11:42 -07:00
Rasmus Munk Larsen
66d8111ac1 Use a more conservative method to detect non-finite inputs to cbrt. 2025-04-21 20:59:46 +00:00
Tyler Veness
d6689a15d7 Replace instances of EIGEN_CONSTEXPR macro 2025-04-18 08:27:52 -07:00
Rasmus Munk Larsen
33f5f59614 Vectorize cbrt for float and double. 2025-04-17 23:31:20 +00:00
Charles Schlosser
5330960900 Enable packet segment in partial redux 2025-04-14 17:44:53 +00:00
Charles Schlosser
6266d430cc packet segment: also check DiagonalWrapper 2025-04-12 19:34:11 +00:00
Charles Schlosser
e39ad8badc fix constexpr in CoreEvaluators.h 2025-04-12 18:54:09 +00:00
Charles Schlosser
7aefb9f4d9 fix memset optimization for std::complex types 2025-04-12 16:20:09 +00:00
Charles Schlosser
73ca849a68 fix packetSegment for ArrayWrapper / MatrixWrapper 2025-04-12 12:12:48 +00:00
Charles Schlosser
28c3b26d53 masked load/store framework 2025-04-12 00:31:10 +00:00
Eugene Zhulenev
cebe09110c Fix a potential deadlock because of Eigen thread pool 2025-04-11 23:43:14 +00:00
William Kong
11fd34cc1c Fix the typing of the Tasks in ForkJoin.h 2025-04-09 17:21:36 +00:00
Hunter Belanger
2cd47d743e Fixe Conversion Warning in Parallelizer 2025-04-08 07:39:01 +00:00
Antonio Sánchez
b860042263 Add postream for ostream-ing packets more reliably. 2025-04-01 22:12:00 +00:00
Antonio Sánchez
02d9e1138a Add missing pmadd for Packet16bf. 2025-03-31 04:17:17 +00:00
Antonio Sánchez
9cc9209b9b Fix cmake warning and default to j0. 2025-03-29 16:09:40 +00:00
Rasmus Munk Larsen
e0c99a8dd6 By default, run ctests on all available cores in parallel. 2025-03-28 04:28:10 +00:00
Rasmus Munk Larsen
63a40ffb95 Use fma<float> for fma<half> and fma<bfloat16> if native fma is not available on the platform. 2025-03-28 04:26:04 +00:00
Antonio Sanchez
44fb6422be All triggering full CI if MR label containts all-tests 2025-03-27 08:37:24 -07:00
Rasmus Munk Larsen
3866cbfbe8 Fix test for TensorRef of trace. 2025-03-25 23:01:46 +00:00
Antonio Sanchez
6579e36eb4 Allow Tensor trace to be passed to a TensorRef. 2025-03-25 08:26:23 -07:00
Antonio Sanchez
8e32cbf7da Reduce flakiness of test for Eigen::half. 2025-03-23 22:31:25 -07:00
Antonio Sánchez
d935916ac6 Add numext::fma and missing pmadd implementations. 2025-03-23 01:05:53 +00:00
Charles Schlosser
754bd24f5e fix 2828 2025-03-22 17:19:44 +00:00
Charles Schlosser
ac2165c11f fix allFinite 2025-03-20 16:04:46 +00:00
William Kong
3143968195 Generalize the Eigen ForkJoin scheduler to use any ThreadPool interface. 2025-03-19 19:56:21 +00:00
Antonio Sánchez
70f2aead9a Use native _Float16 for AVX512FP16 and update vectorization. 2025-03-19 19:55:26 +00:00
Markus Vieth
0259a52b0e Use more .noalias() 2025-03-17 19:41:00 +01:00
Antonio Sánchez
14f845a1a8 Fix givens rotation. 2025-03-14 17:15:57 +00:00
Guilhem Saurel
33b04fe518 CMake: add install-doc target 2025-03-14 00:35:00 +00:00
Charles Schlosser
10e62ccd22 Fix x86 complex vectorized fma 2025-03-12 17:06:32 +00:00
Rasmus Munk Larsen
464c1d0978 Format TensorDeviceThreadPool.h & use if constexpr for c++20. 2025-03-08 01:09:36 +00:00
Rasmus Munk Larsen
21223f6bb6 Fix addition of different enum types. 2025-03-07 22:18:00 +00:00
Rasmus Munk Larsen
350544eb01 Clean up TensorDeviceThreadPool.h 2025-03-07 18:14:17 +00:00
Kevin
43810fc1be Fix extra semicolon in DeviceWrapper 2025-03-07 01:07:23 +00:00
Charles Schlosser
d28041ed5a refactor AssignmentFunctors.h, unify with existing scalar_op 2025-03-06 01:28:39 +00:00
Gopinath Vasalamarri
9a86214039 Optimize division operations in TensorVolumePatch.h 2025-02-28 22:34:13 +00:00
Antonio Sánchez
be5147b090 Fix STL feature detection for c++20. 2025-02-28 19:52:37 +00:00
Antonio Sanchez
179a49684a Fix CMake BOOST warning 2025-02-28 07:33:26 -08:00
Antonio Sanchez
dd56367554 Fix docs job for nightlies 2025-02-26 16:01:33 +00:00
Antonio Sánchez
d79bac0d3c Fix boolean scatter and random generation for tensors. 2025-02-25 21:37:09 +00:00
Tyler Veness
9935396b15 Specify constructor template arguments for ConstexprTest struct 2025-02-25 19:38:47 +00:00
Rasmus Munk Larsen
72adf891d5 Slightly simplify ForkJoin code, and make sure the test is actually run. 2025-02-25 17:22:43 +00:00
Antonio Sanchez
6aebfa9acc Build docs on push, and don't expire 2025-02-24 08:29:21 -08:00
Markus Vieth
bddaa99e15 Fix bitwise operation error when compiling as C++26 2025-02-23 02:30:55 +00:00
C. Antonio Sanchez
e42dceb3a1 Fix implicit copy-constructor warning in TensorRef. 2025-02-22 08:37:56 -08:00
Antonio Sanchez
5fc6fc9881 Initialize matrix in bicgstab test 2025-02-21 10:27:29 -08:00
Tyler Veness
0ae7b59018 Make assignment constexpr 2025-02-21 18:16:46 +00:00
Charles Schlosser
4dda5b927a fix Warray-bounds in inner product 2025-02-20 22:40:55 +00:00
C. Antonio Sanchez
66f7f51b7e Disable fno-check-new on clang. 2025-02-18 21:24:47 -08:00
Charles Schlosser
151f6127df Fix Warray-bounds warning for fixed-size assignments 2025-02-18 19:23:14 +00:00
C. Antonio Sanchez
1d8b82b074 Fix power builds for no VSX and no POWER8. 2025-02-15 13:56:47 -08:00
Charles Schlosser
eb3f9f443d refactor AssignmentEvaluator 2025-02-15 00:39:41 +00:00
Antonio Sánchez
9c211430b5 Fix TensorRef details 2025-02-14 18:33:26 +00:00
Antonio Sanchez
22cd7307dd Remove assumption of std::complex for complex scalar types. 2025-02-12 15:44:32 -08:00
Antonio Sánchez
6b4881ad48 Eliminate type-punning UB in Eigen::half. 2025-02-12 21:12:33 +00:00
Antonio Sánchez
420d891de7 Add missing mathjax/latex configuration. 2025-02-12 21:11:50 +00:00
Antonio Sánchez
becefd59e2 Returns condition number of zero if matrix is not invertible. 2025-02-12 07:09:20 +00:00
Antonio Sánchez
809d266b49 Fix numerical issues with BiCGSTAB. 2025-02-11 19:41:59 +00:00
Antonio Sánchez
ef475f2770 Add missing graphviz to doc build. 2025-02-11 16:03:41 +00:00
Antonio Sánchez
a0591cbc93 Fix doxygen-generated pages 2025-02-11 01:20:27 +00:00
Antonio Sánchez
715deac188 Add EIGEN_CI_CTEST_ARGS to allow for custom timeout. 2025-02-06 21:32:38 +00:00
Antonio Sánchez
4c38131a16 Fix android hardware_destructive_inference_size issue. 2025-02-05 23:53:55 +00:00
Antonio Sánchez
4c2611d27c Update check for std::hardware_destructive_interference_size 2025-02-05 19:41:07 +00:00
Antonio Sánchez
c079ee5e44 Fix tensor documentation. 2025-02-05 17:36:00 +00:00
Antonio Sanchez
74264c391a Add missing return statements for ppc. 2025-02-05 08:12:27 -08:00
Antonio Sánchez
3ebe898b5f Build and deploy nightly docs. 2025-02-05 00:35:34 +00:00
Antonio Sánchez
b73bb766a5 Increase max alignment to 256. 2025-02-04 20:06:28 +00:00
Antonio Sánchez
b1e74b1ccd Fix all the doxygen warnings. 2025-02-01 00:00:31 +00:00
Antonio Sánchez
9589cc4e7f Fix loongarch64 emulated tests. 2025-01-31 19:30:42 +00:00
Johannes Zipfel
2926b2e0a9 added functions to fetch L and U Factors from IncompleteLUT 2025-01-31 18:32:38 +00:00
William Kong
b6849f675d Change the midpoint chosen in Eigen::ForkJoinScheduler. 2025-01-30 20:21:30 +00:00
William Kong
1b2e84e55a Fix minor typos in ForkJoin.h 2025-01-29 20:12:04 +00:00
Tyler Veness
872c179f58 Fix UTF-8 encoding errors introduced by #1801 2025-01-28 16:52:46 -08:00
Rasmus Munk Larsen
2a35a917be Fix syntax error in NonBlockingThreadPool.h 2025-01-28 18:34:31 +00:00
Charles Schlosser
a056b93114 improve Simplicial Cholesky analyzePattern 2025-01-28 17:53:43 +00:00
William Kong
5d866a7a78 Fix potential data race on spin_count_ NonBlockingThreadPool member variable 2025-01-28 17:22:15 +00:00
William Kong
bc67025ba7 Clean up and fix the documentation of ForkJoin.h 2025-01-27 23:12:17 +00:00
Antonio Sánchez
dc1126e762 Fix threadpool for c++14. 2025-01-27 21:57:23 +00:00
Rasmus Munk Larsen
cd511a09aa Fix initialization order and remove unused variables in NonBlockingThreadPool.h. 2025-01-27 19:35:49 +00:00
Johannes Zipfel
f679843dc2 Block doc non square 2025-01-25 17:14:21 +00:00
William Kong
f9705adabb Fix typo introduced in the refactor of NonBlockingThreadPool 2025-01-25 17:13:24 +00:00
Antonio Sánchez
b75895a8b6 Try to fix loongarch 2025-01-25 16:38:41 +00:00
William Kong
4a6ac97d13 Add a ForkJoin-based ParallelFor algorithm to the ThreadPool module 2025-01-24 22:12:05 +00:00
Pengzhou0810
e986838464 Add LoongArch64 architecture LSX support.(build/test ) 2025-01-20 18:37:44 +00:00
Markus Vieth
c486af5ad3 Change Eigen::aligned_allocator to not inherit from std::allocator 2025-01-20 16:04:43 +00:00
Antonio Sánchez
abac563f5d Update documentation to clarify cross product for complex numbers. 2025-01-16 00:52:40 +00:00
Antonio Sanchez
2e76277bd0 Zero-initialize test arrays to avoid uninitialized reads. 2025-01-14 09:15:43 -08:00
Antonio Sánchez
ad13df7ea4 Fix std::fill_n reference. 2025-01-14 00:43:00 +00:00
Frédéric Simonis
9836e8d035 Fix read of uninitialized threshold in SparseQR 2025-01-08 23:40:58 +00:00
Charles Schlosser
7bb23b1e36 CI: don't add ToolChain PPA 2024-12-31 14:04:01 +00:00
xsjk
7bb8c58e7c Fix the missing CUDA device qualifier 2024-12-28 15:17:55 +00:00
Joerg Buchwald
24e0c2a125 use omp_get_max_threads if setNbThreads is not set 2024-12-20 21:16:15 +00:00
Jordan Rupprecht
a32db43966 Add missing #include <new> 2024-12-19 11:06:08 +00:00
Charles Schlosser
c01ff45312 Enable fill_n and memset optimizations for construction and assignment 2024-12-14 14:25:04 +00:00
Antonio Sánchez
af59ada0ac Use alpine for deploying nightly tag. 2024-12-10 22:48:29 +00:00
Charles Schlosser
4a9e32ae0b matrix equality operator 2024-12-10 12:40:39 +00:00
Antonio Sanchez
00776d1ba4 Remove branch name from nightly tag job. 2024-12-09 20:18:18 -08:00
Antonio Sanchez
7f23778593 Add tag to commit instead of branch 2024-12-09 07:47:48 -08:00
Antonio Sánchez
c30b35a310 Force tag to update to latest head. 2024-12-08 04:48:21 +00:00
Antonio Sánchez
a26ba67349 Add LICENSE file in correct place so it is picked up by gitlab. 2024-12-08 03:26:43 +00:00
Charles Schlosser
08c31c3ba6 try alpine for formatting 2024-12-08 01:09:33 +00:00
Antonio Sanchez
1ac1af62ef Update deploy job 2024-12-07 09:19:21 -08:00
Antonio Sánchez
7b6623af30 Fix special packetmath erfc flushing for ARM32. 2024-12-07 01:42:30 +00:00
Antonio Sánchez
fd48fbb260 Update rocm docker again again. 2024-12-06 22:13:53 +00:00
Antonio Sánchez
a885340ba5 Update rocm docker again. 2024-12-06 17:19:31 +00:00
Antonio Sanchez
45a8478d09 Update rocm docker image in CI. 2024-12-06 07:14:59 -08:00
Antonio Sánchez
de4afcf414 Add a deploy phase to the CI that tags the latest nightly pipeline if it passes. 2024-12-05 15:28:18 +00:00
Charles Schlosser
5e8916050b move constructor / move assignment doc strings 2024-12-04 17:42:20 +00:00
Charles Schlosser
77a073aaa8 fix checkformat ci stage 2024-12-04 02:45:52 +00:00
Charles Schlosser
41e46ed243 fix IOFormat alignment 2024-12-04 01:13:48 +00:00
Charles Schlosser
a0d32e40d9 fix map fill logic 2024-11-30 13:39:02 +00:00
Charles Schlosser
d34b100c13 Fix UB in setZero 2024-11-27 19:32:14 +00:00
Rasmus Munk Larsen
f19a6803c8 Refactor special case handling in pow(x,y) and revert to repeated squaring for <float,int> 2024-11-27 00:24:21 +00:00
Rasmus Munk Larsen
5064cb7d5e Add test for using pcast on scalars. 2024-11-25 22:27:26 -08:00
Rasmus Munk Larsen
1ea61a5d26 Improve pow(x,y): 25% speedup, increase accuracy for integer exponents. 2024-11-26 06:13:48 +00:00
Charles Schlosser
8ad4344ca7 optimize setConstant, setZero 2024-11-22 03:39:19 +00:00
Rasmus Munk Larsen
5610a13b77 Simplify and speed up pow() by 5-6% 2024-11-20 12:45:00 +00:00
Rasmus Munk Larsen
6c6ce9d06b Enable vectorized erf<double>(x) for SSE and AVX, which was accidentally removed in merge request 1750. 2024-11-19 22:14:29 +00:00
Rasmus Munk Larsen
e7c799b7c9 Prevent premature overflow to infinity in exp(x). The changes also provide a 3-4% speedup. 2024-11-19 13:08:18 -08:00
Rasmus Munk Larsen
00af47102d Revert 040180078d 2024-11-19 10:25:16 -08:00
Rasmus Munk Larsen
8ee6f8475a Speed up exp(x). 2024-11-19 17:50:34 +00:00
Charles Schlosser
93ec5450cb disable fill_n optimization for msvc 2024-11-19 01:38:48 +00:00
Rasmus Munk Larsen
0af6ab4b76 Remove unnecessary check for HasBlend trait. 2024-11-18 21:16:45 +00:00
Rasmus Munk Larsen
d5eec781b7 Get rid of redundant computation for large arguments to erf(x). 2024-11-18 10:51:58 -08:00
Tyler Veness
2fc63808e4 Fix C++20 constexpr test compilation failures 2024-11-18 01:56:55 +00:00
Rasmus Munk Larsen
5133c836c0 Vectorize erf(x) for double. 2024-11-16 19:05:16 +00:00
Conrad Poelman
d6e3b528b2 Update Assign_MKL.h to cast disparate enum type to int, so it can be compared... 2024-11-15 20:00:29 +00:00
breathe1
040180078d Ensure that destructor's needed by lldb make it into binary in non-inlined fashion 2024-11-15 17:15:09 +00:00
Tyler Veness
0fb2ed140d Make element accessors constexpr 2024-11-14 01:05:29 +00:00
Charles Schlosser
8b4efc8ed8 check_size_for_overflow: use numeric limits instead of c99 macro 2024-11-13 00:35:35 +00:00
Charles Schlosser
489dbbc651 make fixed_size matrices conform to std::is_standard_layout 2024-11-12 23:34:26 +00:00
Rasmus Munk Larsen
283d871a3f Add missing EIGEN_DEVICE_FUNCTION decorations. 2024-11-08 14:25:57 -08:00
Rasmus Munk Larsen
0d366f6532 Vectorize erfc(x) for double and improve erfc(x) for float. 2024-11-08 17:21:11 +00:00
Charles Schlosser
8adf43640e more avx predux_any 2024-11-07 19:58:48 +00:00
Charles Schlosser
bc424f617a add missing avx predux_any functions 2024-11-07 19:11:29 +00:00
Charles Schlosser
e52ac76ca3 use EIGEN_CPLUSPLUS instead of checking cpp version 2024-11-06 17:25:22 +00:00
Rasmus Munk Larsen
122be167cd Revert "make fixed-size objects trivially move assignable" 2024-11-06 01:09:38 +00:00
Tobias Wood
d49021212b Tensor Roll / Circular Shift / Rotate 2024-11-05 14:10:19 +00:00
Charles Schlosser
bb73be8a2e make fixed-size objects trivially move assignable 2024-11-04 17:55:27 +00:00
Antonio Sánchez
7fd305ecae Fix GPU builds. 2024-11-01 04:50:03 +00:00
Morris Hafner
c8267654f2 Don't use __builtin_alloca_with_align with nvc++ 2024-10-30 18:02:08 +00:00
Tyler Veness
84c446df2c Fix macro redefinition warning in FFTW test 2024-10-30 17:18:42 +00:00
Antonio Sánchez
a9584d8e3c Fix clang6 failures. 2024-10-30 14:41:50 +00:00
Antonio Sánchez
dd4c2805d9 Fix clang6 failures. 2024-10-29 22:18:30 +00:00
Antonio Sánchez
9e962d9c54 Fix OOB access in triangular matrix multiplication. 2024-10-29 19:07:07 +00:00
Antonio Sánchez
695e49d1bd Fix NVCC builds for CUDA 10+. 2024-10-29 18:38:14 +00:00
Antonio Sánchez
dae09773fc Don't pass matrices by value. 2024-10-29 18:19:02 +00:00
Rasmus Munk Larsen
c23ec3420e Add tests for sizeof() with one dynamic dimension. 2024-10-28 13:48:53 -07:00
Rasmus Munk Larsen
58b252e5b3 Fix typo in PacketMath.h 2024-10-28 18:19:52 +00:00
Rasmus Munk Larsen
6c04d0cd68 Add missing exp2 definition for Altivec. 2024-10-28 18:12:36 +00:00
Peter Gavin
b15ebb1c2d add nextafter for bfloat16 2024-10-26 00:08:25 +00:00
Rasmus Munk Larsen
53b83cddf9 Include <type_traits> in main.h for std::is_trivial* 2024-10-25 20:55:51 +00:00
Charles Schlosser
37563856c9 Fix stack allocation assert 2024-10-25 17:02:43 +00:00
Rasmus Munk Larsen
3f067c4850 Add exp2() as a packet op and array method. 2024-10-22 22:09:34 +00:00
Charles Schlosser
4e5136d239 make fixed size matrices and arrays trivially_default_constructible 2024-10-21 17:10:15 +00:00
Antonio Sánchez
b396a6fbb2 Add free-function swap. 2024-10-14 15:51:40 +00:00
Charles Schlosser
820e8a45fb add compile time info to reverse in place 2024-10-13 17:55:56 +00:00
Charles Schlosser
b55dab7f21 Fix DenseBase::tail for Dynamic template argument 2024-10-12 21:03:30 +00:00
Charles Schlosser
e0cbc55d92 Update README.md 2024-10-10 01:54:30 +00:00
Rasmus Munk Larsen
7eea0a9213 Vectorize erfc() for float 2024-10-09 18:38:05 +00:00
Rasmus Munk Larsen
78f3c654ee Don't use constexpr with half. 2024-10-08 16:44:40 +00:00
Antonio Sánchez
6d7af238fa Adjust array_cwise for 32-bit arm. 2024-10-07 23:15:24 +00:00
Rasmus Munk Larsen
74dcfbbd0f Use ppolevl for polynomial evaluation in more places. 2024-10-07 13:27:28 -07:00
Rasmus Munk Larsen
a097f728fe Avoid producing erf(x) = NaN for large |x|. 2024-10-04 12:15:23 -07:00
Rasmus Munk Larsen
44b16f48cb Improve speed and accuracy or erf() 2024-10-03 01:52:16 +00:00
Antonio Sánchez
12068cbcdb Fix inverse evaluator for running on CUDA device. 2024-10-01 20:59:54 +00:00
Rasmus Munk Larsen
4e8e5e7409 Add max_digits10 in NumTraits for mpreal types. 2024-10-01 18:57:17 +00:00
Rasmus Munk Larsen
8e8c319087 Add missing EIGEN_DEVICE_FUNC annotations. 2024-10-01 11:40:58 -07:00
Charles Schlosser
7ad7c1d5c5 fix implicit conversion warning (again) 2024-09-24 22:07:00 +00:00
Charles Schlosser
d052b7f864 add extra debugging info to float_pow_test_impl, clean up array_cwise tests 2024-09-24 21:08:22 +00:00
Charles Schlosser
ba5183f98c fix warning in EigenSolver::pseudoEigenvalueMatrix() 2024-09-24 17:23:58 +00:00
Charles Schlosser
3ffb4e50df fix implicit conversion in TensorChipping 2024-09-24 16:58:49 +00:00
Sean McBride
b6b8b54e5e Fixed issue #2858: removed unneeded call to _mm_setzero_si128 2024-09-24 16:29:45 +00:00
Frédéric BRIOL
2a3465102a Refactor code to use constexpr for data() functions. 2024-09-23 16:43:53 +00:00
Charles Schlosser
2d4c9b400c make fixed size matrices and arrays trivially_copy_constructible and trivially_move_constructible 2024-09-17 17:43:36 +00:00
Antonio Sánchez
132f281f50 Fix generic ceil for SSE2. 2024-09-14 01:31:21 +00:00
Charles Schlosser
84282c42fc optimize new dot product 2024-09-11 21:40:43 +00:00
Charles Schlosser
fb477b8be1 Better dot products 2024-09-10 21:02:31 +00:00
Sophie Chang
134b526d61 Update NonBlockingThreadPool.h plain asserts to use eigen_plain_assert 2024-09-10 00:18:27 +00:00
qile lin
072ec9d954 Fix a bug for pcmp_lt_or_nan and Add sqrt support for SVE 2024-09-04 21:45:39 +00:00
Rasmus Munk Larsen
9315389795 Fix bug in bug fix for atanh. 2024-09-04 09:37:59 -07:00
Rasmus Munk Larsen
f33af052e0 Fix bug for atanh(-1). 2024-09-03 20:54:01 +00:00
Rasmus Munk Larsen
66927f7807 Fix out-of-range arguments to _mm_permute_pd. 2024-08-30 17:31:52 +00:00
Rasmus Munk Larsen
bbdabebf44 Vectorize atanh<double>. Make atanh(x) standard compliant for |x| >= 1. 2024-08-30 17:27:55 +00:00
Morris Hafner
26e2c4f617 Add nvc++ support 2024-08-30 12:34:48 +00:00
Eugene Zhulenev
c59332d74a Detect "effectively inner/outer" chipping in TensorChipping 2024-08-29 17:49:59 +00:00
Charles Schlosser
648bce6cae SSE/AVX Complex FMA 2024-08-29 17:37:57 +00:00
Charles Schlosser
c21a80be3d BDCSVD: Suppress Wmaybe-uninitialized 2024-08-29 02:45:38 +00:00
Charles Schlosser
9d3d37c5b7 Complex Numtraits::HasSign and nmsub test 2024-08-28 03:02:47 +00:00
Valentin Sarthou
c5189ac656 Fix GeneralizedEigenSolver::eigenvectors() not appearing in documentation 2024-08-24 00:30:06 +00:00
qile lin
3b5a1b4157 sve instrinsics with "_x" suffix will be faster than "_z" suffix 2024-08-23 12:52:22 +00:00
Rasmus Munk Larsen
98f1ac5e65 Fix breakage in GPU build. 2024-08-23 06:08:37 +00:00
Charles Schlosser
231308f690 TensorVolumePatchOp: Suppress Wmaybe-uninitialized caused by unreachable code 2024-08-23 01:55:12 +00:00
Tobias Wood
2bf8fe1489 NEON Complex Intrinsics 2024-08-22 22:46:16 +00:00
Rasmus Munk Larsen
f91f8e9ab9 Consolidate float and double implementations of patan(). 2024-08-21 20:44:18 +00:00
Charles Schlosser
87239e058a vectorize squaredNorm() for complex types 2024-08-21 10:54:17 +00:00
Rasmus Munk Larsen
32d95bb097 Add vectorized implementation of tanh<double> 2024-08-21 02:29:45 +00:00
Rasmus Munk Larsen
cc240eea2f Speed up and improve accuracy of tanh. 2024-08-16 23:46:28 +00:00
Rasmus Munk Larsen
92e373e6f5 Speed up StableNorm for non-trivial sizes and improve consistency between aligned and unaligned inputs. 2024-08-14 21:42:04 +00:00
Rasmus Munk Larsen
1dbc7581ec Include <thread> for std::this_thread::yield(). 2024-08-14 17:44:14 +00:00
Rasmus Munk Larsen
ab310943d6 Add a yield instruction in the two spinloops of the threaded matmul implementation. 2024-08-09 10:48:24 -07:00
Rasmus Munk Larsen
99ffad1971 A few cleanups to threaded product code and test. 2024-08-09 09:35:23 -07:00
Charles Schlosser
59498c96fe SSE/AVX use fmaddsub for complex products 2024-08-05 21:26:05 +00:00
Rasmus Munk Larsen
1dcae7cefc Revert "BDCSVD fix -Wmaybe-uninitialized"
This reverts merge request !1649
2024-08-05 18:17:01 +00:00
Tyler Veness
d14b0a4e53 Remove C++23 check around has_denorm deprecation suppression 2024-08-03 21:34:27 +00:00
Jatin Chaudhary
24db460503 hlog symbol lookup should not restricted to global namespace 2024-08-03 03:59:13 +00:00
Alexander Grund
767e60e290 Fix Woverflow warnings in PacketMathFP16 2024-08-03 03:57:18 +00:00
Alexander Grund
8025683226 Fix conversion of Eigen::half to _Float16 in AVX512 code 2024-08-03 03:49:51 +00:00
Alexey Korepanov
ec18dd09c8 fix pi in kissfft 2024-08-02 22:57:47 +00:00
Rasmus Munk Larsen
2b7b7aac57 Speed up complex * complex matrix multiplication. 2024-08-02 20:40:53 +00:00
Devon Loehr
b3e3b7b0ec Remove implicit this capture in lambdas 2024-08-02 20:06:35 +00:00
Eugene Zhulenev
e44db21092 Optimize ThreadPool spinning 2024-08-02 19:18:34 +00:00
Mike Taves
c593e9e948 Fix typos 2024-08-02 00:06:24 +00:00
Eugene Zhulenev
fd98cc49f1 Avoid atomic false sharing in RunQueue 2024-08-01 17:41:16 +00:00
Charles Schlosser
0b646f3f36 Update file .clang-format 2024-08-01 03:18:50 +00:00
Charles Schlosser
1dcb07bb2a Update file eigen_navtree_hacks.js 2024-08-01 02:51:04 +00:00
Charles Schlosser
ddb163ffb1 Update file .clang-format 2024-08-01 00:29:36 +00:00
Charles Schlosser
3f06651fd6 BDCSVD fix -Wmaybe-uninitialized 2024-07-30 22:53:06 +00:00
Frédéric Chapoton
6331da95eb fixing a lot of typos 2024-07-30 22:15:49 +00:00
Alexander Hans
c29c800126 Fix formatting in README.md 2024-07-03 19:16:56 +00:00
adambanas
33d0937c6b Add async support for 'chip' and 'extract_volume_patches' 2024-06-27 09:56:06 +02:00
Rasmus Munk Larsen
d791d48859 Fix AVX512FP16 build failure 2024-06-18 22:34:32 +00:00
Charles Schlosser
2fae4d7a77 Revert "fix scalar pselect" 2024-06-15 20:02:28 +00:00
Charles Schlosser
b430eb31e2 AVX512F double->int64_t cast 2024-06-15 17:45:02 +00:00
Charles Schlosser
02bcf9b591 fix scalar pselect 2024-06-10 17:30:22 +00:00
Louis David
392b95bdf1 allow pointer_based_stl_iterator to conform to the contiguous_iterator concept if we are in c++20 2024-06-06 21:38:09 +00:00
Victor Ceballos
27f8176254 fixing warning C5054: operator '==': deprecated between enumerations of different types 2024-06-04 16:44:13 +03:00
Charles Schlosser
eac6355df2 Fix warnings created by other warnings fix 2024-06-01 03:37:04 +00:00
Rasmus Munk Larsen
7029a2e971 Vectorize allFinite() 2024-06-01 03:24:26 +00:00
Charles Schlosser
e605227030 Fix warnings 2024-05-31 14:33:37 +00:00
Rasmus Munk Larsen
38b9cc263b Fix warnings about repeated deinitions of macros. 2024-05-29 13:38:00 -07:00
Rasmus Munk Larsen
f02f89bf2c Don't redefine EIGEN_DEFAULT_IO_FORMAT in main.h. 2024-05-29 18:14:32 +00:00
Rasmus Munk Larsen
9148c47d67 Vectorize isfinite and isinf. 2024-05-29 00:20:12 +00:00
Tobias Wood
5a9f66fb35 Fix Thread tests 2024-05-24 16:50:14 +00:00
Tyler Veness
c4d84dfddc Fix compilation failures on constexpr matrices with GCC 14 2024-05-22 12:29:01 +00:00
Charles Schlosser
99adca8b34 Incorporate Threadpool in Eigen Core 2024-05-20 23:42:51 +00:00
Tyler Veness
d165c7377f Format EIGEN_STATIC_ASSERT() as a statement macro 2024-05-20 23:02:42 +00:00
Charles Schlosser
f78dfe36b0 use built in alloca with align if available 2024-05-19 19:32:49 +00:00
Tyler Veness
b9b1c8661e Suppress C++23 deprecation warnings for std::has_denorm and std::has_denorm_loss 2024-05-17 15:55:22 +00:00
Charlie Schlosser
3d2e738f29 fix performance-no-int-to-ptr 2024-05-16 23:25:42 -04:00
Antonio Sánchez
de8013fa67 Fix ubsan failure in array_for_matrix 2024-05-16 18:47:36 +00:00
Rasmus Munk Larsen
5e4f3475b5 Remove call to deprecated method initParallel() in SparseDenseProduct.h 2024-05-15 23:12:32 +00:00
Charles Schlosser
59cf0df1d6 SparseMatrix::insert add checks for valid indices 2024-05-15 16:14:32 +00:00
Anabasis
c0fe6ce223 Fixed a clerical error at documentation of class Matrix. 2024-05-13 02:51:40 +00:00
Antonio Sánchez
afb17288cb Fix gcc6 compile error. 2024-05-10 19:13:21 +00:00
Chip Kerchner
4d1d14e069 Change predux on PowerPC for Packet4i to NOT saturate the sum of the elements (like other architectures). 2024-05-08 22:39:27 +00:00
daizhirui
ff174f7926 fix issue: cmake package does not set include path correctly 2024-05-07 21:21:08 +00:00
Antonio Sánchez
e16d70bd4e Fix FFT when destination does not have unit stride. 2024-05-07 17:18:29 +00:00
Charles Schlosser
99c18bce6e Msvc muluh 2024-05-07 16:30:58 +00:00
Charles Schlosser
8e47971789 Bit shifting functions 2024-05-03 18:55:02 +00:00
Antonio Sánchez
9700fc847a Reorganize CMake and minimize configuration for non-top-level builds. 2024-05-01 17:42:53 +00:00
Antonio Sánchez
c1d637433e Judge unitary-ness relative to scaling. 2024-04-30 22:28:46 +00:00
Rasmus Munk Larsen
9000b37677 Fix new generic nearest integer ops on GPU. 2024-04-30 22:18:25 +00:00
Charles Schlosser
0ee5c90aa9 Eigen transpose product 2024-04-30 13:32:52 +00:00
Charles Schlosser
fb95e90f7f Add truncation op 2024-04-29 23:45:49 +00:00
Jonathan Freed
d5524fc57b Remove unnecessary semicolons. 2024-04-29 21:31:26 +00:00
Antonio Sánchez
ae5280aa8d Fix more hard-coded magic bounds. 2024-04-29 21:21:11 +00:00
Antonio Sánchez
a5e147305b Fix undefined behavior for generating inputs to the predux_mul test. 2024-04-29 20:32:09 +00:00
Antonio Sánchez
dcceb9afec Unbork avx512 preduce_mul on MSVC. 2024-04-26 15:28:03 +00:00
Antonio Sánchez
42aa3d17cd Slightly adjust error bound for nonlinear tests. 2024-04-25 18:04:49 +00:00
Antonio Sanchez
1c8c734c8b Fix sin/cos on PPC. 2024-04-24 15:58:03 -07:00
Charles Schlosser
34967b0b5b Revert "fix transposed matrix product bug"
This reverts merge request !1598
2024-04-23 14:07:11 +00:00
Antonio Sánchez
9cec679ef1 Don't let the PPC runner try to cross-compile. 2024-04-23 03:40:40 +00:00
Charles Schlosser
574bc8820d fix transposed matrix product bug 2024-04-23 03:25:57 +00:00
Rasmus Munk Larsen
112ad8b846 Revert part of !1583, which may cause underflow on ARM. 2024-04-22 21:14:38 +00:00
Charles Schlosser
8cafbc4736 Fix unused variable warnings in TensorIO 2024-04-22 18:14:54 +00:00
Charles Schlosser
4de870b6eb fix autodiff enum comparison warnings 2024-04-22 18:14:20 +00:00
Antonio Sánchez
2265242aa1 Update CI scripts. 2024-04-20 01:08:19 +00:00
ahmed
ee9d57347b Fix tridiagonalization_inplace_selector::run() when called from CUDA 2024-04-19 21:06:59 +00:00
Charles Schlosser
1550c99541 Eigen select 2024-04-19 17:52:34 +00:00
Charles Schlosser
5635d37f46 more pblend optimizations 2024-04-19 02:02:27 +00:00
Antonio Sánchez
f0795d35e3 Fix new psincos for ppc and arm32. 2024-04-19 00:31:09 +00:00
Chip Kerchner
ad452e575d Fix compilation problems with PacketI on PowerPC. 2024-04-18 14:55:15 +00:00
Charles Schlosser
fcaf03ef7c fix pendantic compiler warnings 2024-04-17 16:55:45 +00:00
Rasmus Munk Larsen
b5feca5d03 Fix build for pblend and psin_double, pcos_double when AVX but not AVX2 is supported. 2024-04-16 16:12:41 +00:00
Damiano Franzò
888fca0e2b Simd sincos double 2024-04-15 21:12:32 +00:00
Charles Schlosser
6ad2ccea4e Eigen pblend 2024-04-15 16:19:53 +00:00
Charles Schlosser
9099c5eac7 Handle missing AVX512 intrinsic 2024-04-14 16:41:23 +00:00
Charles Schlosser
122befe54c Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings 2024-04-12 19:35:04 +00:00
Antonio Sánchez
dcdb0233c1 Refactor indexed view to appease MSVC 14.16. 2024-04-12 17:05:20 +00:00
Rasmus Munk Larsen
5226566a14 Speed up pldexp_generic. 2024-04-12 01:32:17 +00:00
Stéphane T.
3c6521ed90 Add constexpr to accessors in DenseBase, Quaternions and Translations 2024-04-11 14:46:48 +00:00
Rasmus Munk Larsen
3c9109238f Add support for Packet8l to AVX512. 2024-04-09 22:58:44 +00:00
Charles Schlosser
2620cb930b Update file Geometry_SIMD.h 2024-04-05 18:30:39 +00:00
Dieter Dobbelaere
b2c9ba2bee Fix preprocessor condition on when to use fast float logistic implementation. 2024-04-03 21:51:39 +00:00
Rasmus Munk Larsen
283d69294b Guard AVX2 implementation of psignbit in PacketMath.h 2024-04-03 21:03:26 +00:00
Chip Kerchner
be54cc8ded Fix preverse for PowerPC. 2024-04-03 20:09:06 +00:00
Rasmus Munk Larsen
c5b234196a Fix unused variable warning in TensorIO.h 2024-04-03 19:49:31 +00:00
Charles Schlosser
86aee3d9c5 Fix long double random 2024-04-02 12:05:40 +00:00
Charles Schlosser
776d86d8df AVX: guard Packet4l definition 2024-04-01 00:31:46 +00:00
Charles Schlosser
e63d9f6ccb Fix random again 2024-03-29 21:49:27 +00:00
Charles Schlosser
f75e2297db AVX2 - double->int64_t casting 2024-03-29 21:35:09 +00:00
Antonio Sánchez
13092b5d04 Fix usages of Eigen::array to be compatible with std::array. 2024-03-29 15:51:15 +00:00
Antonio Sánchez
77833f9320 Allow symbols to be used in compile-time expressions. 2024-03-28 18:43:50 +00:00
Antonio Sánchez
d26e19714f Add missing cwiseSquare, tests for cwise matrix ops. 2024-03-28 04:26:55 +00:00
Maarten Baert
35bf6c8edc Add SimplicialNonHermitianLLT and SimplicialNonHermitianLDLT 2024-03-28 00:22:27 +00:00
Rasmus Munk Larsen
4dccaa587e Use truncation rather than rounding when casting Packet2d to Packet2l. 2024-03-27 21:39:58 +00:00
Charles Schlosser
7b5d32b7c9 Sparse move 2024-03-27 17:44:50 +00:00
Antonio Sánchez
c8d368bdaf More fixes for 32-bit. 2024-03-26 22:53:38 +00:00
Antonio Sanchez
de304ab960 Fix using ScalarPrinter redefinition for gcc. 2024-03-26 15:32:38 -07:00
Rasmus Munk Larsen
c54303848a Undef macro in TensorContractionGpu.h that causes buildbreakages. 2024-03-26 18:01:48 +00:00
Antonio Sánchez
d8aa4d6ba5 Fix another instance of Packet2l on win32. 2024-03-26 15:48:44 +00:00
Antonio Sánchez
9f77ce4f19 Add custom formatting of complex numbers for Numpy/Native. 2024-03-25 17:41:44 +00:00
Charles Schlosser
5570a27869 cross3_product vectorization 2024-03-25 00:06:33 +00:00
Antonio Sánchez
0b3df4a6e6 Remove "extern C" in CholmodSupport. 2024-03-25 00:03:28 +00:00
Antonio Sanchez
a39ade4ccf Protect use of alloca. 2024-03-23 16:35:37 +00:00
Rasmus Munk Larsen
b86641a4c2 Add support for casting between double and int64_t for SSE and AVX2. 2024-03-22 22:32:29 +00:00
Antonio Sánchez
d883932586 Fix Packet*l for 32-bit builds. 2024-03-22 17:16:42 +00:00
Tyler Veness
d792f13a61 Make more Matrix functions constexpr 2024-03-19 22:02:21 +00:00
Rasmus Munk Larsen
d3cd312652 Remove slow index check in Tensor::resize from release mode. 2024-03-18 23:43:25 +00:00
Antonio Sánchez
386e2079e4 Fix Jacobi module doc. 2024-03-17 23:08:04 +00:00
Antonio Sánchez
8b101ade2b Fix CwiseUnaryView for MSVC. 2024-03-17 16:28:17 +00:00
Antonio Sánchez
0951ad2a8e Don't hide rbegin/rend for GPU. 2024-03-14 21:11:43 +00:00
Antonio Sánchez
24f8fdeb46 Fix CwiseUnaryView const access (Attempt 2). 2024-03-14 21:04:49 +00:00
Antonio Sánchez
285da30ec3 Fix const input and c++20 compatibility in unary view. 2024-03-13 16:59:44 +00:00
Rasmus Munk Larsen
126ba1a166 Add Packet2l for SSE. 2024-03-11 19:54:55 +00:00
Antonio Sánchez
1d4369c2ff Fix CwiseUnaryView. 2024-03-11 19:08:30 +00:00
Antonio Sánchez
352ede96e4 Fix incomplete cholesky. 2024-03-08 19:18:10 +00:00
Antonio Sánchez
f1adb0ccc2 Split up cxx11_tensor_gpu to reduce timeouts. 2024-03-07 17:21:37 +00:00
Antonio Sánchez
17f3bf8985 Fix pexp test for ARM. 2024-03-07 00:19:57 +00:00
Antonio Sanchez
6da34d9d9e Allow aligned assignment in TRMV. 2024-03-06 23:53:01 +00:00
Antonio Sánchez
3e8e63eb46 Fix packetmath plog test on Windows. 2024-03-06 23:51:47 +00:00
Tyler Veness
5ffb307afa Fix deprecated anonymous enum-enum conversion warnings 2024-03-06 21:22:02 +00:00
Antonio Sánchez
55dd487478 Revert "fix unaligned access in trmv"
This reverts merge request !1536
2024-03-06 16:42:59 +00:00
Antonio Sánchez
38fcedaf8e Fix pexp complex test edge-cases. 2024-03-04 17:44:38 +00:00
Sotiris Papatheodorou
251ec42087 Return 0 volume for empty AlignedBox 2024-03-04 17:32:44 +00:00
Antonio Sanchez
64edfbed04 Fix static_assert for c++14. 2024-03-02 20:39:34 -08:00
Charles Schlosser
3f3144f538 fix unaligned access in trmv 2024-03-03 04:20:09 +00:00
Antonio Sánchez
23f6c26857 Rip out make_coherent, add CoherentPadOp. 2024-02-29 23:15:02 +00:00
Antonio Sánchez
edaf9e16bc Fix triangular matrix-vector multiply uninitialized warning. 2024-02-29 21:00:58 +00:00
Antonio Sánchez
98620b58c3 Eliminate FindCUDA cmake warning. 2024-02-29 20:49:41 +00:00
Antonio Sánchez
cc941d69a5 Update error about c++14 requirement. 2024-02-29 20:45:13 +00:00
Antonio Sánchez
6893287c99 Add degenerate checks before calling BLAS routines. 2024-02-29 18:56:36 +00:00
Antonio Sánchez
fa201f1bb3 Fix QR colpivoting warnings and test failure. 2024-02-28 15:00:13 +00:00
Charles Schlosser
b334910700 delete shadowed typedefs 2024-02-28 02:40:45 +00:00
Antonio Sánchez
a962a27594 Fix MSVC GPU build. 2024-02-27 23:26:06 +00:00
Rasmus Munk Larsen
a2f8eba026 Speed up sparse x dense dot product. 2024-02-24 19:13:33 +00:00
Antonio Sánchez
7a88cdd6ad Fix signed integer UB in random. 2024-02-24 13:16:23 +00:00
Rasmus Munk Larsen
a6dc930d16 Speed up SparseQR. 2024-02-22 16:49:10 -08:00
Antonio Sánchez
feaafda30a Change array_size result from enum to constexpr. 2024-02-22 22:52:25 +00:00
Antonio Sánchez
8a73c6490f Remove "using namespace Eigen" from blas/common.h. 2024-02-22 22:51:42 +00:00
Rasmus Munk Larsen
6ed4d80cc8 Fix crash in IncompleteCholesky when the input has zeros on the diagonal. 2024-02-22 22:22:21 +00:00
Rasmus Munk Larsen
3859e8d5b2 Add method signDeterminant() to QR and related decompositions. 2024-02-20 23:44:28 +00:00
Rasmus Munk Larsen
db6b9db33b Make header guards in GeneralMatrixMatrix.h and Parallelizer.h consistent:... 2024-02-20 20:03:18 +00:00
Antonio Sánchez
b56e30841c Enable direct access for IndexedView. 2024-02-20 18:21:45 +00:00
Antonio Sanchez
90087b990a Fix use of uninitialized memory in kronecker_product test. 2024-02-20 08:44:34 -08:00
Antonio Sánchez
6b365e74d6 Fix GPU build for ptanh_float. 2024-02-20 16:08:50 +00:00
Antonio Sánchez
b14c5d0fa1 Fix real schur and polynomial solver. 2024-02-17 15:22:11 +00:00
Charles Schlosser
8a4118746e fix exp complex test: use int instead of index 2024-02-17 03:55:32 +00:00
Charles Schlosser
960892ca13 JacobiSVD: get rid of m_scaledMatrix, m_adjoint, hopefully fix some compiler warnings 2024-02-17 03:41:55 +00:00
Charles Schlosser
18a161bf17 fix pexp_complex_test 2024-02-17 03:08:23 +00:00
Damiano Franzò
be06c9ad51 Implement float pexp_complex 2024-02-17 00:26:57 +00:00
Rasmus Munk Larsen
4d419e2209 Rename generic_fast_tanh_float to ptanh_float and move it to... 2024-02-16 21:27:22 +00:00
Antonio Sánchez
2a9055b50e Fix random for custom scalars that don't have constexpr digits(). 2024-02-16 02:30:54 +00:00
Antonio Sánchez
500a3602f0 Use traits<Matrix>::Options instead of Matrix::Options. 2024-02-16 00:11:57 +00:00
Antonio Sánchez
0b9ca1159b Fix deflation in BDCSVD. 2024-02-15 23:53:59 +00:00
Antonio Sánchez
f40ad38fda Fix failure on ARM with latest compilers. 2024-02-14 23:00:56 +00:00
Antonio Sánchez
a24bf2e9a2 Disable float16 packet casting if native AVX512 f16 is available. 2024-02-14 20:05:00 +00:00
Antonio Sánchez
5361dea833 Remove return int types from BLAS/LAPACK functions. 2024-02-14 19:51:36 +00:00
Alec Jacobson
7e655c9a5d Fixes 2780 2024-02-13 02:57:43 +00:00
Antonio Sánchez
6ea33f95df Eliminate warning about writing bytes directly to non-trivial type. 2024-02-12 23:27:48 +00:00
Antonio Sánchez
06b45905e7 Remove r_cnjg due to conflicts with f2c. 2024-02-12 23:16:03 +00:00
Antonio Sánchez
9229cfa822 Fix division by zero UB in packet size logic. 2024-02-12 21:01:19 +00:00
Antonio Sanchez
186f8205db Apply clang-format to lapack/blas directories 2024-02-12 19:36:07 +00:00
Gautam Jha
4eac211e96 Fix C++20 error, Arithmetic between different enumeration types 2024-02-12 04:25:04 +00:00
Tyler Veness
d1d87973f4 Fix segfault in CholmodBase::factorize() for zero matrix 2024-02-12 03:27:56 +00:00
Antonio Sánchez
7b87b21910 Fix UB in bool packetmath test. 2024-02-09 19:46:45 +00:00
Charles Schlosser
431e4a913b Fix the fuzz 2024-02-07 04:52:19 +00:00
Charles Schlosser
3ab8f48256 fix tests when scalar is bfloat16, half 2024-02-07 04:50:11 +00:00
Antonio Sánchez
3ebaab8a63 Fix PPC rand and other failures. 2024-02-05 20:07:15 +00:00
Charlie Schlosser
ebd13c3b14 fix skew symmetric test 2024-02-04 21:13:06 -05:00
Antonio Sanchez
128c8abf44 Fix gcc-6 bug in the rand test. 2024-02-02 15:36:23 -08:00
Charles Schlosser
d626762e3f improve random 2024-01-31 08:16:29 +00:00
Antonio Sánchez
a9ddab3e06 Fix a bunch of ODR violations. 2024-01-30 22:38:43 +00:00
Damiano Franzò
7fd7a3f946 Implement plog_complex 2024-01-30 19:06:05 +00:00
Antonio Sánchez
043442e21b Fix preshear transformation. 2024-01-30 06:37:33 +00:00
Antonio Sánchez
69ee52ed13 Remove Skyline. 2024-01-30 00:13:17 +00:00
Antonio Sánchez
0f0c76dc29 Use stableNorm in ComplexEigenSolver. 2024-01-29 23:46:23 +00:00
Antonio Sánchez
dd71c23e23 Remove MoreVectorization. 2024-01-29 18:48:28 +00:00
Pascal Getreuer
62b5474032 Fix busted formatting in Eigen::Tensor README.md. 2024-01-29 08:12:43 +00:00
Antonio Sánchez
d8d0b60b59 Fix CI for clang-6 when cross-compiled. 2024-01-27 05:13:21 +00:00
Antonio Sánchez
f391289150 Fix bug in checking subnormals. 2024-01-25 17:52:07 +00:00
Antonio Sanchez
5a90fbceaa Update documentation of lapack second/dsecnd. 2024-01-25 17:51:26 +00:00
Antonio Sánchez
9079820241 Remove simple relicense script. 2024-01-25 05:50:36 +00:00
Antonio Sánchez
851b40afd8 LAPACK CPU time functions. 2024-01-23 23:58:58 +00:00
Antonio Sánchez
a73970a864 Fix arm32 issues. 2024-01-23 22:04:55 +00:00
Antonio Sánchez
5808122017 Formatting. 2024-01-23 16:56:27 +00:00
Antonio Sánchez
92f9544f6d Remove explicit specialization of member function. 2024-01-23 16:40:43 +00:00
Antonio Sanchez
2692fb2b71 Fix compile-time error caused by chip static asserts 2024-01-22 21:40:48 +00:00
Cheng Wang
2c6b61c006 Add half and quarter vector support to HVX architecture 2024-01-22 21:23:21 +00:00
Tobias Wood
05a457534f Chipping Asserts v2 2024-01-22 18:08:23 +00:00
Martin Heistermann
fc92fe3125 SPQR: Fix build error, Index/StorageIndex mismatch. 2024-01-22 17:37:36 +00:00
Andreas Forster
eede526b7c [Compressed Storage] Use smaller type of Index & StorageIndex for determining maximum size during resize. 2024-01-22 00:35:31 +00:00
Antonio Sánchez
772057c558 Revert "Add asserts for .chip" 2024-01-20 05:14:42 +00:00
Antonio Sánchez
6163dbe2bc Allow specifying a temporary directory for fileio outputs. 2024-01-20 00:55:14 +00:00
Antonio Sanchez
6b6bb9d34e Fix unused warnings in failtest. 2024-01-19 15:30:18 -08:00
Antonio Sánchez
f6e41e6433 Revert "Clean up stableNorm" 2024-01-19 20:22:47 +00:00
Tobias Wood
b1ae206ea6 Add asserts for .chip 2024-01-19 19:18:19 +00:00
Nuno Gonçalves
b0f906419e add missing constexpr qualifier 2024-01-19 18:49:53 +00:00
Antonio Sánchez
34fd46a9b4 Update CI with testing framework from eigen_ci_cross_testing. 2024-01-19 17:55:09 +00:00
Antonio Sanchez
b2814d53a7 Fix stableNorm when input is zero-sized. 2024-01-16 10:14:51 -08:00
Charles Schlosser
c29a410116 check pointers before freeing 2024-01-12 06:09:46 +00:00
Chao Chen
48e0c827da [ROCm] MI300 related test support 2024-01-11 23:46:25 +00:00
Antonio Sánchez
5385773015 Fix TensorForcedEval in the case of the evaluator being copied. 2024-01-10 00:45:39 +00:00
Nicolas Cornu
3f3bc6d862 Improve documentation of SparseLU 2024-01-09 18:18:05 +00:00
Arnaud Billon
d33174d5ae Doc: Fix Basic slicing examples minor issues 2024-01-09 18:17:00 +00:00
Tyler Veness
19b1192886 Add factor getters to Cholmod LLT/LDLT 2024-01-09 02:14:04 +00:00
Charles Schlosser
a1a96fafde Clean up stableNorm 2024-01-08 23:28:41 +00:00
Antonio Sánchez
3026f1f296 Fix various asan errors. 2024-01-08 00:13:17 +00:00
Antonio Sánchez
a2cf99ec6f Fix GPU+clang+asan. 2024-01-04 17:29:37 +00:00
Antonio Sánchez
fee5d60b50 Fix MSAN failures. 2023-12-22 03:18:46 +00:00
Antonio Sánchez
9697d481c8 Fix up clang-format CI. 2023-12-14 00:15:11 +00:00
Tobias Wood
85efa83292 Set up clang-format in CI 2023-12-13 21:08:07 +00:00
Charles Schlosser
2c4541f735 fix msvc clz 2023-12-13 03:33:49 +00:00
Antonio Sánchez
75e273afcc Add internal ctz/clz implementation. 2023-12-11 21:03:09 +00:00
Antonio Sanchez
454f89af9d Protect kernel launch syntax from clang-format 2023-12-05 14:26:44 -08:00
Rasmus Munk Larsen
383506fcb2 Fix CUDA syntax error introduced by clang-format. 2023-12-05 14:06:27 -08:00
Antonio Sánchez
61b7155d77 Add formatting change to .git-blame-ignore-revs 2023-12-05 21:56:31 +00:00
Antonio Sánchez
46e9cdb7fe Clang-format tests, examples, libraries, benchmarks, etc. 2023-12-05 21:22:55 +00:00
Antonio Sánchez
3252ecc7a4 Fix scalar_logistic_function overflow for complex inputs. 2023-12-05 18:21:04 +00:00
Antonio Sánchez
9688081029 Add .git-blame-ignore-revs file. 2023-12-01 01:05:50 +00:00
Tobias Wood
f38e16c193 Apply clang-format 2023-11-29 11:12:48 +00:00
Drew Lewis
9ea520fc45 Ensure that mc is not smaller than Traits::nr 2023-11-28 22:48:53 +00:00
Antonio Sánchez
dd8c71e628 Fix typecasting for arm32 2023-11-23 00:47:50 +00:00
Tobias Wood
b2cb49e280 Static asserts to check for matching NumDimensions 2023-11-22 23:29:22 +00:00
Charles Schlosser
283dec7f25 Update file GeneralMatrixVector.h 2023-11-21 19:50:35 +00:00
Pavel Labath
66b9f4ed5c Fix (u)int64_t->float conversion on arm 2023-11-21 16:09:12 +00:00
Charles Schlosser
d1b03fb5c9 Gemv microoptimization 2023-11-20 17:26:39 +00:00
Rasmus Munk Larsen
3cf6bb6f1c Fix a bug in commit 76e8c04553: 2023-11-15 21:45:37 +00:00
Charles Schlosser
32165c6f0c Fix Wshorten-64-to-32 warning in gemm parallelizer 2023-11-14 13:51:27 +00:00
Rasmus Munk Larsen
b33dbb5765 Fix implicit narrowing warning in Parallelizer.h. 2023-11-13 21:30:39 +00:00
Antonio Sanchez
3a9635b20c Link pthread for product_threaded test 2023-11-13 11:34:23 -08:00
wk
f78c37f0af traits<Ref>::match: use correct strides 2023-11-11 14:10:56 +00:00
Rasmus Munk Larsen
516d08a490 Fix typo in Parallelizer.h 2023-11-10 20:29:29 +00:00
Rasmus Munk Larsen
76e8c04553 Generalize parallel GEMM implementation in Core to work with ThreadPool in addition to OpenMP. 2023-11-10 17:42:30 +00:00
Antonio Sánchez
4d54c43d6c Fix typo to allow nomalloc test to pass on AVX512. 2023-11-06 18:58:43 +00:00
Antonio Sánchez
a25f02d73e Fix int overflow causing cxx11_tensor_gpu_1 to fail. 2023-11-06 17:10:16 +00:00
Charles Schlosser
6f9ad7da61 fix Wshorten-64-to-32 warnings in div_ceil 2023-10-27 15:52:00 +00:00
Charles Schlosser
1aac9332ce TensorReduction: replace divup with div_ceil 2023-10-25 16:44:34 +00:00
Kyle Macfarlan
5de0f2f89e Fixes #2735: Component-wise cbrt 2023-10-25 03:06:13 +00:00
Antonio Sánchez
48b254a4bc Disable denorm deprecation warnings in MSVC C++23. 2023-10-23 17:56:04 +00:00
Antonio Sánchez
176821f2f7 Avoid building docs if cross-compiling or not top level. 2023-10-23 17:55:01 +00:00
Antonio Sánchez
aa6964bf3a Work around MSVC issue in Block XprType. 2023-10-19 22:02:03 +00:00
Anatoly Borisov
877c2d1e9b fix typo in comment 2023-10-18 12:58:49 +00:00
Antonio Sánchez
0c9526912c Pass div_ceil arguments by value. 2023-10-17 18:46:19 +00:00
Ioannis Assiouras
d9839718aa [ROCm] Replace HIP_PATH with ROCM_PATH for rocm 6.0 2023-10-16 20:56:35 +00:00
Antonio Sánchez
5bdf58b8df Eliminate use of _res. 2023-10-16 19:56:53 +00:00
Rasmus Munk Larsen
a96545777b Consolidate multiple implementations of divup/div_up/div_ceil. 2023-10-10 17:16:59 +00:00
Charles Schlosser
e8515f78ac Fix sparse triangular view iterator 2023-10-05 17:13:37 +00:00
Kevin
6d829e766f Fix extra semicolon in XprHelper 2023-09-14 08:18:28 +00:00
Alejandro Acosta
ba47341a14 [SYCL-2020] Enabling half precision support for SYCL. 2023-09-13 09:12:40 +00:00
François Girinon
92a77a596b Fix call to static functions from device by adding EIGEN_DEVICE_FUNC attribute to run methods 2023-09-13 04:16:52 +00:00
Antonio Sánchez
8f858a4ea8 Export ThreadPool symbols from legacy header. 2023-09-10 20:56:20 +00:00
Chip Kerchner
4e598ad259 New panel modes for GEMM MMA (real & complex). 2023-09-06 20:03:45 +00:00
Daniel Benedí
2c64a655fe Stage will not be ok if pardiso returned error 2023-09-06 14:40:06 +02:00
Charles Schlosser
18018ed013 Unwind Block of Blocks 2023-08-29 17:21:41 +00:00
Charles Schlosser
81b48065ea Fix arm32 float division and related bugs 2023-08-29 00:36:07 +00:00
Antonio Sánchez
2873916f1c Rename plugin headers to .inc. 2023-08-21 16:26:11 +00:00
Antonio Sánchez
6e4d5d4832 Add IWYU private pragmas to internal headers. 2023-08-21 16:25:22 +00:00
Antonio Sánchez
328b5f9085 Add temporary macro to allow unaligned scalar UB. 2023-08-15 15:58:41 +00:00
Charles Schlosser
a798d07659 Fix tensor stridedlinearbuffercopy 2023-08-03 20:36:42 +00:00
Charles Schlosser
8d9f467036 fix boost mp test to refer to new svd tests 2023-08-02 13:38:12 +00:00
Antonio Sánchez
0ae7d7a361 Fix unaligned scalar alignment UB. 2023-08-01 19:39:08 +00:00
cheng wang
66e8f38891 Add architecture definition files for Qualcomm Hexagon Vector Extension (HVX) 2023-08-01 17:47:57 +00:00
Antonio Sánchez
2af700aa40 Fix nullptr dereference in SVD. 2023-08-01 16:33:16 +00:00
Rasmus Munk Larsen
86a43d8c04 Fix clang-tidy warning 2023-07-31 21:26:28 +00:00
Antonio Sánchez
0cef325b07 Fix another UB access. 2023-07-31 19:18:45 +00:00
Charles Schlosser
5527e78a64 Add missing x86 pcasts 2023-07-28 23:41:38 +00:00
Alejandro Acosta
24d15e086f [SYCL-2020] Add test to validate SYCL in Eigen core. 2023-07-28 15:45:08 +00:00
Antonio Sánchez
d4ae542ed1 Fix nullptr dereference issue in triangular product. 2023-07-27 22:10:21 +00:00
Chip Kerchner
7769eb1b2e Fix problems with recent changes and Tensorflow in Power 2023-07-26 16:24:58 +00:00
Yingnan Wu
ba1cb6e45e Fixes #2703 by adding max_digits10 function 2023-07-26 16:02:52 +00:00
Charles Schlosser
9995c3da6f Fix -Wmaybe-uninitialized in SVD 2023-07-25 22:22:17 +00:00
Charles Schlosser
4e9e493b4a Fix -Waggressive-loop-optimizations 2023-07-21 03:47:40 +00:00
Charles Schlosser
6e7abeae69 fix arm build warnings 2023-07-17 20:37:27 +00:00
Charles Schlosser
81fe2d424f Fix more gcc compiler warnings / sort-of bugs 2023-07-14 21:12:45 +00:00
Charles Schlosser
21cd3fe209 Optimize check_rows_cols_for_overflow 2023-07-10 17:40:17 +00:00
Antonio Sánchez
9297aae66f Fix AVX512 nomalloc issues in trsm. 2023-07-10 16:42:13 +00:00
Charles Schlosser
1a2bfca8f0 Fix annoying warnings 2023-07-07 20:19:58 +00:00
Antonio Sánchez
63dcb429cd Fix use of arg function in CUDA. 2023-07-07 18:37:14 +00:00
Marcus Comstedt
8f927fb52e Altivec: fix compilation with C++20 and higher 2023-07-05 13:14:02 +00:00
Kevin Leonardic
d4b05454a7 Fix argument for _mm256_cvtps_ph imm parameter 2023-07-03 13:44:20 +02:00
Charles Schlosser
15ac3765c4 Fix ivcSize return type in IndexedViewMethods.h 2023-07-03 03:49:37 +00:00
Chip Kerchner
3791ac8a1a Fix supportsMMA to obey EIGEN_ALTIVEC_MMA_DYNAMIC_DISPATCH compilation flag and compiler support. 2023-06-28 17:57:21 +00:00
H S Helson Go
bc57b926a0 Add Quaternion constructor from real scalar and imaginary vector 2023-06-27 05:38:17 +00:00
Antonio Sánchez
31cd2ad371 Ensure EIGEN_HAS_ARM64_FP16_VECTOR_ARITHMETIC is always defined on arm. 2023-06-26 19:21:54 +00:00
Antonio Sánchez
7465b7651e Disable FP16 arithmetic for arm32. 2023-06-26 18:39:42 +00:00
Rasmus Munk Larsen
b3267f6936 Remove unused variable in test/svd_common.h. 2023-06-23 23:12:19 +00:00
Chip Kerchner
211c5dfc67 Add optional offset parameter to ploadu_partial and pstoreu_partial 2023-06-23 19:53:05 +00:00
Charles Schlosser
44c20bbbe3 rint round floor ceil 2023-06-23 16:29:16 +00:00
Charles Schlosser
6ee86fd473 delete deprecated function call in svd test 2023-06-23 14:17:27 +00:00
Charles Schlosser
387175c258 Fix safe_abs in int_pow 2023-06-23 04:12:41 +00:00
Charles Schlosser
c6db610bc7 Fix svd test 2023-06-22 17:37:24 +00:00
Charles Schlosser
969c31eefc Fix AVX pstore 2023-06-15 01:47:38 +00:00
wilfried.karel
6c1411e521 define a move constructor for Ref<const...> 2023-06-14 20:10:51 +00:00
wilfried.karel
d8f3eb87bf Compile- and run-time assertions for the construction of Ref<const>. 2023-06-14 15:49:58 +00:00
Charles Schlosser
59b3ef5409 Partially Vectorize Cast 2023-06-09 16:54:31 +00:00
Rasmus Munk Larsen
7d7576f326 Avoid underflow in prsqrt. 2023-06-06 14:06:19 -07:00
Charles Schlosser
b7151ffaab Fix unary pow error handling and test 2023-06-06 18:46:55 +00:00
Rasmus Munk Larsen
7ac8897431 Reduce max relative error of prsqrt from 3 to 2 ulps. 2023-06-04 22:25:33 +00:00
Charles Schlosser
1d80e23186 Optimize scalar_unary_pow_op error handling 2023-06-02 18:53:06 +00:00
Alexander Shaposhnikov
316eab8deb Do not set EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC for cuda compilation 2023-05-31 15:15:06 +00:00
Alejandro Acosta
07e4604b19 Replace usage of CudaStreamDevice with GpuStreamDevice in tensor benchmarks GPU 2023-05-30 15:44:07 +00:00
Rasmus Munk Larsen
8c43bf2b5b Clean up Redux.h and fix vectorization_logic test after changes to traversal order in Redux. 2023-05-24 20:26:52 +00:00
Charles Schlosser
da6a71faf0 Add linear redux evaluators 2023-05-24 17:07:25 +00:00
Charles Schlosser
67a1e881d9 Sparse matrix column/row removal 2023-05-24 17:04:45 +00:00
Rasmus Munk Larsen
de1c884687 Add reference to writeup of approach used in canonicalEulerAngles. 2023-05-24 15:52:26 +00:00
Charles Schlosser
307a417e1c Fix unrolled assignment evaluator 2023-05-22 16:39:24 +00:00
Juraj Oršulić
c18f94e3b0 Geometry/EulerAngles: introduce canonicalEulerAngles 2023-05-19 15:42:22 +00:00
Charles Schlosser
7d9bb90f15 SVD: fix numerous compiler warnings / failures 2023-05-15 16:56:47 +00:00
Rasmus Munk Larsen
2709f4c8fb Use relative path to include EmulateArray.h in CXX11Meta.h, and get rid of redundant meta-programming code, which was moved to Core. 2023-05-09 23:21:35 +00:00
Rasmus Munk Larsen
9a02c977ec Use relative paths to include Meta.h and MaxSizeVector.h in Tensor 2023-05-09 22:07:55 +00:00
Rasmus Munk Larsen
96c42771d6 Make it possible to override the synchonization primitives used by the threadpool using macros. 2023-05-09 19:36:17 +00:00
Rasmus Munk Larsen
1321821e86 Add missing braces in Umeyama.h 2023-05-09 19:10:50 +00:00
Rasmus Munk Larsen
524c329ab2 Work around compiler bug in Umeyama.h. 2023-05-09 18:53:56 +00:00
Charles Schlosser
fbf7189bd5 Fix cuda compilation 2023-05-08 16:15:47 +00:00
Mehdi Goli
0623791930 [SYCL-2020] Enabling USM support for SYCL. SYCL-1.2.1 did not have support for USM. 2023-05-05 17:30:36 +00:00
Andrzej Ciarkowski
1698c367a0 Use std::shared_ptr for FFTW/IMKL FFT plan implementation; Fixes #2651 2023-05-05 16:58:23 +00:00
Antonio Sánchez
1f79a6078f Return NaN in ndtri for values outside valid input range. 2023-05-05 16:27:26 +00:00
Tobias Wood
94f57867fe Thread pool 2023-05-05 16:23:34 +00:00
Charles Schlosser
9eb8e2afba Change array_cwise test name 2023-05-05 03:08:43 +00:00
Charles Schlosser
725c11719b Visitor: fix modulo by zero compiler warning 2023-05-04 18:21:09 +00:00
Chip Kerchner
b8208b363c Specialized loadColData correctly - fix previous BF16 GEMV MR 2023-05-04 16:38:17 +00:00
Charles Schlosser
2af03fb685 clean up array_cwise test 2023-05-04 16:02:08 +00:00
Chip Kerchner
fda1373a15 Fix ColMajor BF16 GEMV for when vector is RowMajor 2023-05-03 20:12:50 +00:00
Charles Schlosser
fdc749de2a JacobiSVD: set m_nonzeroSingularValues to zero if not finite 2023-05-02 17:48:21 +00:00
Chip Kerchner
6418ac0285 Unroll F32 to BF16 loop - 1.8X faster conversions for LLVM. Use vector pairs for GCC. 2023-05-01 16:54:16 +00:00
Pedro Gonnet
874f5947f4 Add half-Packet operations to StridedLinearBufferCopy. 2023-05-01 16:09:31 +00:00
Charles Schlosser
c9a14f48d9 SSE Packet4ui has pcmp, pmin, pmax 2023-04-28 20:36:08 +00:00
Rasmus Munk Larsen
0b51f763cb Revert "Geometry/EulerAngles: make sure that returned solution has canonical ranges"
This reverts commit 7f06bcae2c
2023-04-27 00:06:23 +00:00
Antonio Sánchez
2d0c6ad873 Revert "Vectorize cast"
This reverts commit eb5ff1861a
2023-04-26 18:03:36 +00:00
Charles Schlosser
8999525c29 AVX2: Packet4ul has pmul, abs2 2023-04-26 16:22:16 +00:00
Charles Schlosser
eb5ff1861a Vectorize cast 2023-04-26 02:50:13 +00:00
Antonio Sánchez
3918768be1 Fix sparse iterator and tests. 2023-04-25 19:05:49 +00:00
Antonio Sanchez
70410310a4 Fix boolean bitwise and warning. 2023-04-25 15:24:49 +00:00
Charles Schlosser
f6cf5dca80 Packet4ul does not have Abs2 2023-04-21 19:48:01 +00:00
Chip Kerchner
03f646b7e3 New VSX version of BF16 GEMV (Power) - up to 6.7X faster 2023-04-21 17:06:59 +00:00
Charles Schlosser
29c8e3c754 fix pow for uint32_t, disable pmul<Packet4ul> 2023-04-21 05:47:56 +00:00
Juraj Oršulić
7f06bcae2c Geometry/EulerAngles: make sure that returned solution has canonical ranges 2023-04-19 19:12:24 +00:00
Rasmus Munk Larsen
a347dbbab2 Delete last few occurences of HasHalfPacket. 2023-04-19 10:36:59 -07:00
Rasmus Munk Larsen
b378014fef Make sure we return +/-1 above the clamping point for Erf(). 2023-04-18 20:53:01 +00:00
Charles Schlosser
e2bbf496f6 Use select ternary op in tensor select evaulator 2023-04-18 20:52:16 +00:00
Charles Schlosser
2b954be663 fix typo in sse packetmath 2023-04-18 18:17:41 +00:00
Rasmus Munk Larsen
25685c90ad Fix incorrect packet type for unsigned int version of pfirst() in MSVC workaround in PacketMath.h. 2023-04-18 17:46:23 +00:00
Rasmus Munk Larsen
1e223a956c Add missing 'f' in float literal in SpecialFunctionsImpl.h that triggers implicit conversion warning. 2023-04-18 17:33:29 +00:00
Chip Kerchner
3f3ce214e6 New BF16 pcast functions and move type casting to TypeCasting.h 2023-04-18 02:38:38 +00:00
Pedro Gonnet
17b5b4de58 Add Packet4ui, Packet8ui, and Packet4ul to the SSE/AVX PacketMath.h headers 2023-04-17 23:33:59 +00:00
Charles Schlosser
87300c93ca Refactor IndexedView 2023-04-17 12:32:50 +00:00
Chip Kerchner
1148f0a9ec Add dynamic dispatch to BF16 GEMM (Power) and new VSX version 2023-04-14 22:20:42 +00:00
Rasmus Munk Larsen
3026fc0d3c Improve accuracy of erf(). 2023-04-14 16:57:56 +00:00
Rasmus Munk Larsen
554fe02ae3 Enable new AVX512 GEMM kernel by default. 2023-04-12 13:39:06 -07:00
Charles Schlosser
0d12fcc34e Insert from triplets 2023-04-12 20:01:48 +00:00
Rob Conde
990a282fc4 exclude Eigen/Core and Eigen/src/Core from being ignored due to core ignore rule 2023-04-12 10:42:21 -04:00
Rohit Goswami
b0eded878d DOC: Update documentation for 3.4.x 2023-04-06 19:20:41 +00:00
Rasmus Munk Larsen
b0f877f8e0 Don't crash on empty tensor contraction. 2023-04-05 17:06:14 +00:00
b-shi
15fbddaf9b ASAN fixes for AVX512 GEMM/TRSM 2023-04-04 15:54:24 -07:00
Charles Schlosser
178ef8c97f qualify non-const symbolic indexed view with is_lvalue 2023-04-04 19:06:32 +00:00
Rasmus Munk Larsen
df1049ddf4 Small packet math cleanup. 2023-04-04 16:14:32 +00:00
Antoine Hoarau
9b48d10215 Guard all malloc, realloc and free() fonctions with check_that_malloc_is_allowed() 2023-04-04 04:24:22 +00:00
Rasmus Munk Larsen
c730290fa0 Use the correct truncating intrinsic for double->int casting. 2023-04-03 13:56:41 -07:00
Charles Schlosser
766db02020 disable raw array indexed view access for 1d arrays 2023-03-29 02:39:45 +00:00
Charles Schlosser
bfbc66e078 refactor indexedviewmethods, enable non-const ref access with symbolic indices 2023-03-29 01:35:26 +00:00
Rasmus Munk Larsen
1a5dfd7c0f Fix incorrect casting in AVX512DQ path. 2023-03-27 09:28:06 -07:00
Charles Schlosser
a08649994f Optimize generic_rsqrt_newton_step 2023-03-24 22:42:57 +00:00
Rasmus Munk Larsen
b8b8a26145 Add more missing vectorized casts for int on x86, and remove redundant unit tests 2023-03-24 16:02:00 +00:00
unageek
33e206f714 Remove unused declarations of BLAS/LAPACK routines 2023-03-23 21:54:05 +00:00
Rasmus Munk Larsen
d57a79e512 Optimize float->bool cast for AVX2, based on Charles Schlosser's comments. 2023-03-21 20:59:25 -07:00
Rasmus Munk Larsen
a5ae832773 Fix reversal of arguments to _mm256_set_m128() in pcast<Packet4d, Packet8f>. 2023-03-22 03:21:44 +00:00
Rasmus Munk Larsen
09945f2cc1 Optimize casting for x86_64. 2023-03-21 18:24:16 +00:00
Colin Broderick
8f9b8e3630 Replaced all instances of internal::(U)IntPtr with std::(u)intptr_t. Remove ICC workaround. 2023-03-21 16:50:23 +00:00
Antonio Sánchez
2c8011c2dd Fix arm builds. 2023-03-20 16:59:38 +00:00
Charles Schlosser
fd8f410bbe Fix 2624 2625 2023-03-20 16:30:04 +00:00
Chip Kerchner
e887196d9d Undo cmake pools changes 2023-03-17 16:06:26 +00:00
Jonas Schulze
81cb6a51d0 Fix some typos 2023-03-16 23:11:43 +00:00
Antonio Sánchez
555cec17ed Fix parsing of command-line arguments when already specified as a cmake list. 2023-03-16 22:47:38 +00:00
Chip Kerchner
7db19baabe Remove pools if cmake is less than 3.11 2023-03-16 16:54:45 +00:00
Rasmus Munk Larsen
0488b708b4 Vectorize tensor.isnan() by using typed predicates. 2023-03-16 04:04:22 +00:00
Rasmus Munk Larsen
f02856c640 Use EIGEN_NOT_A_MACRO macro (oh the irony!) to avoid build issue in TensorFlow. 2023-03-15 11:42:57 -07:00
Rasmus Munk Larsen
690ae9502f Use C++11 standard features for detecting presence of Inf and NaN 2023-03-15 16:52:44 +00:00
Chip Kerchner
d71ac6a755 Fix recent PowerPC warnings and clang warning 2023-03-15 16:50:46 +00:00
Chip Kerchner
d54d228b49 Limit the number of build jobs to 8 and link jobs to 4 for PowerPC. This should help reduce the OOM build problems. 2023-03-15 16:29:41 +00:00
Chip Kerchner
23e1541863 Put deadcode checks back in from previous change. 2023-03-14 00:57:16 +00:00
Chip Kerchner
6c58f0fe1f Revert changes that made BF16 GEMM to cause bad register spillage for LLVM (Power) 2023-03-13 23:36:06 +00:00
Rasmus Munk Larsen
8fe6190001 Add numext::isnan for AnnoyingOrange^H^H^H^H^H^HScalar. 2023-03-13 21:19:35 +00:00
Rasmus Munk Larsen
79de101d23 Handle PropagateFast the same way as PropagateNaN in minmax visitor to 2023-03-13 20:47:11 +00:00
Chip Kerchner
9d72412385 Add MMA to BF16 GEMV - 5.0-6.3X faster (for Power) 2023-03-13 19:37:13 +00:00
Rasmus Munk Larsen
2067b54b13 Fix bug in minmax_coeff_visitor for matrix of all NaNs. 2023-03-13 18:25:22 +00:00
Rasmus Munk Larsen
ee0ff0ab3a Fix typo in MathFunctions.h 2023-03-13 15:50:40 +00:00
Rasmus Munk Larsen
21c49e8f8e Delete mystery character from Eigen/src/Core/arch/NEON/MathFunctions.h 2023-03-10 23:27:24 +00:00
Rasmus Munk Larsen
6bb9609bcb Make new Select implementation backwards compatible. 2023-03-10 23:07:47 +00:00
Antonio Sánchez
394aabb0a3 Fix failing MSVC tests due to compiler bugs. 2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen
d6235d76db Clean up generic packetmath specializations for various backends with the help of a macro. 2023-03-10 22:02:23 +00:00
Rasmus Munk Larsen
e8fdf127c6 Work around compiler bug in Tridiagonalization.h 2023-03-10 21:21:07 +00:00
Rasmus Munk Larsen
adf26b6840 Add newline to end of file. 2023-03-10 16:53:22 +00:00
Rasmus Munk Larsen
3492d9e2e5 s/Lesser/Less/ 2023-03-10 00:28:31 +00:00
Rasmus Munk Larsen
2419632cf5 Revert change to allFinite(), since the new version does not work for complex numbers. 2023-03-09 21:50:43 +00:00
Zach Davis
b1beba8a3e Fix LinAlgSVD example code 2023-03-08 17:04:59 +00:00
Charles Schlosser
7bf2968fed Specify Permutation Index for PartialPivLU and FullPivLU 2023-03-07 20:28:05 +00:00
Antonio Sánchez
eb4dbf6135 Modify failing cwise test to get it to pass. 2023-03-07 19:47:42 +00:00
Timofey Pushkin
e577f43ab2 Set CMAKE_* cache variables only when Eigen is a top-level project 2023-03-07 14:39:45 +00:00
Charles Schlosser
1ce8b25825 Vectorize any() / all() 2023-03-06 23:54:02 +00:00
Charles Schlosser
cb8e6d4975 Fix 2240, 2620 2023-03-06 23:11:06 +00:00
Charles Schlosser
d670039309 fix tensor comparison test 2023-03-06 13:11:14 +00:00
Chip Kerchner
2b513ca2a0 Added partial linear access for LHS & Output - 30% faster for bfloat16 GEMM MMA (Power) 2023-03-02 19:22:43 +00:00
Charles Schlosser
0b396c3167 Scalarize comps 2023-03-02 17:06:23 +00:00
Charles Schlosser
3abe12472e fix signed shift test 2023-03-01 14:31:13 +00:00
Antonio Sánchez
ba7417f146 Fix gpu conv3d out-of-resources failure. 2023-02-28 21:25:00 +00:00
Antonio Sánchez
62d5cfe835 Fix ODR issues with Intel's AVX512 TRSM kernels. 2023-02-27 07:54:52 +00:00
Charles Schlosser
826627f653 vectorize comparisons and select by enabling typed comparisons 2023-02-25 20:52:11 +00:00
Rasmus Munk Larsen
2e9b945baf Fix bug that disabled vectorization for coeffMin/coeffMax. 2023-02-25 20:03:54 +00:00
Antonio Sánchez
bc5cdc7a67 Guard use of long double on GPU device. 2023-02-24 21:49:59 +00:00
Chip Kerchner
e4598fedbe Fix compiler versions for certain instructions on Power. 2023-02-23 23:24:41 +00:00
Rasmus Munk Larsen
1c0a6cf228 Get rid of EIGEN_HAS_AVX512_MATH workaround. 2023-02-23 23:16:41 +00:00
Rasmus Munk Larsen
00844e3865 Fix a number of MSAN failures in SVD tests. 2023-02-23 18:44:53 +00:00
Mehdi Goli
c3f67063ed [SYCL-2020]- null placeholder accessor issue in Reduction SYCL test 2023-02-22 17:44:53 +00:00
Rasmus Munk Larsen
6bcd941ee3 Use pmsub in twoprod. This speeds up pow() on Skylake by ~1%. 2023-02-21 20:09:29 +00:00
Rasmus Munk Larsen
ce62177b5b Vectorize atanh & add a missing definition and unit test for atan. 2023-02-21 03:14:05 +00:00
Charles Schlosser
049a144798 Add typed logicals 2023-02-18 01:23:47 +00:00
Chip Kerchner
e797974689 Add and enable Packet int divide for Power10. 2023-02-17 19:04:18 +00:00
Chip Kerchner
54459214a1 Fix epsilon and dummy_precision values in long double for double doubles. Prevented some algorithms from converging on PPC. 2023-02-16 23:35:42 +00:00
Antonio Sánchez
a16fb889dd Guard complex sqrt on old MSVC compilers. 2023-02-16 19:47:00 +00:00
Charles Schlosser
94b19dc5f2 Add CArg 2023-02-15 21:33:06 +00:00
Charles Schlosser
71a8e60a7a Tweak pasin_float, fix psqrt_complex 2023-02-15 01:01:14 +00:00
Antonio Sánchez
384269937f More NEON packetmath fixes. 2023-02-14 21:45:25 +00:00
Antonio Sánchez
c15b386203 Fix MSVC atan2 test. 2023-02-14 18:30:58 +00:00
Antonio Sánchez
2dfbf1b251 Fix NEON make_packet2f. 2023-02-14 16:52:07 +00:00
Rasmus Munk Larsen
07aaa62e6f Fix compiler warnings in tests. 2023-02-14 02:29:03 +00:00
Chip Kerchner
4a03409569 Fix problem with array conversions BF16->F32 in Power. 2023-02-13 21:30:45 +00:00
Rasmus Munk Larsen
77b48c440e Fix compiler warnings. 2023-02-10 20:46:23 +00:00
Chip Kerchner
0ecae61568 Disable array BF16 to F32 conversions in Power 2023-02-10 20:06:58 +00:00
Charles Schlosser
c999284bad Print diagonal matrix 2023-02-10 18:07:29 +00:00
Chip Kerchner
fba12e02b3 Fold extra column calculations into an extra MMA accumulator and other bfloat16 MMA GEMM improvements 2023-02-10 17:32:06 +00:00
Chip Kerchner
79cfc74f4d Revert ODR changes and make gemm_extra_cols and gemm_complex_extra_cols EIGEN_ALWAYS_INLINE to avoid external functions. 2023-02-10 17:05:07 +00:00
Alexander Grund
f9659d91f1 Fix ODR violation with gemm_extra_cols on PPC 2023-02-09 22:16:06 +00:00
Charles Schlosser
325e3063d9 Optimize psign 2023-02-09 22:15:26 +00:00
Charles Schlosser
0e490d452d Update file ColPivHouseholderQR_LAPACKE.h 2023-02-09 13:45:56 +00:00
Antonio Sánchez
0a5392d606 Fix MSVC arm build. 2023-02-08 21:46:37 +00:00
Antonio Sánchez
3f7e775715 Add IWYU export pragmas to top-level headers. 2023-02-08 17:40:31 +00:00
Rasmus Munk Larsen
e4f58816d9 Get rid of custom implementation of equal_to and not_equal_no. No longer needed with c+14. 2023-02-07 21:36:44 -08:00
Antonio Sánchez
e256ad1823 Remove LGPL Code and references. 2023-02-08 01:25:06 +00:00
Chip Kerchner
e71f88abce Change in Power eigen_asserts to eigen_internal_asserts since it is putting unnecessary error checking and assertions without NDEBUG. 2023-02-08 00:57:30 +00:00
Gregory Kramida
232b18fa8a Fixes #2602 2023-02-06 22:52:39 +00:00
Antonio Sánchez
f6cc359e10 More EIGEN_DEVICE_FUNC fixes for CUDA 10/11/12. 2023-02-03 19:18:45 +00:00
Charles Schlosser
2a90653395 fix lapacke config 2023-02-03 16:40:08 +00:00
Rasmus Munk Larsen
3460f3558e Use VERIFY_IS_EQUAL to compare to zeros. 2023-02-01 13:49:56 -08:00
Jeremy Nimmer
13a1f25da9 Revert StlIterators edit from "Fix undefined behavior..." 2023-02-01 20:01:36 +00:00
Charles Schlosser
fd2fd48703 Update file ForwardDeclarations.h 2023-02-01 16:52:20 +00:00
Rasmus Munk Larsen
37b2e97175 Tweak special case handling in atan2. 2023-01-31 17:48:00 -08:00
Jeremy Nimmer
a1cdcdb038 Fix undefined behavior in Block access 2023-02-01 00:40:45 +00:00
Chip Kerchner
4a58f30aa0 Fix pre-POWER8_VECTOR bugs in pcmp_lt and pnegate and reactivate psqrt. 2023-01-31 19:40:24 +00:00
Rasmus Munk Larsen
12ad99ce60 Remove unused variables from GenericPacketMathFunctions.h 2023-01-29 18:10:28 +00:00
Charles Schlosser
6987a200bb Fix stupid sparse bugs with outerSize == 0 2023-01-28 02:03:09 +00:00
Charles Schlosser
0471e61b4c Optimize various mathematical packet ops 2023-01-28 01:34:26 +00:00
Charles Schlosser
1aa6dc2007 Fix sparse warnings 2023-01-27 22:47:42 +00:00
Antonio Sánchez
17ae83a966 Fix bugs exposed by enabling GPU asserts. 2023-01-27 21:43:00 +00:00
Chip Kerchner
ab8725d947 Turn off vectorize version of rsqrt - doesn't match generic version 2023-01-27 18:28:54 +00:00
Charles Schlosser
6d9f662a70 Tweak atan2 2023-01-26 17:38:21 +00:00
Chip Kerchner
6fc9de7d93 Fix slowdown in bfloat16 MMA when rows is not a multiple of 8 or columns is not a multiple of 4. 2023-01-25 18:22:20 +00:00
Charles Schlosser
6d4221af76 Revert qr tests 2023-01-23 22:23:08 +00:00
Charles Schlosser
7f58bc98b1 Refactor sparse 2023-01-23 17:55:50 +00:00
Rasmus Munk Larsen
576448572f More fixes for __GNUC_PATCHLEVEL__. 2023-01-23 17:04:24 +00:00
Rasmus Munk Larsen
164ddf75ab Use __GNUC_PATCHLEVEL__ rather than __GNUC_PATCH__, according to the documentation https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html 2023-01-23 16:56:14 +00:00
Charles Schlosser
5a7ca681d5 Fix sparse insert 2023-01-20 21:32:32 +00:00
Antonio Sánchez
08c961e837 Add custom ODR-safe assert. 2023-01-20 17:38:13 +00:00
Amir Masoud Abdol
3fe8c51104 Replace the Deprecated $<CONFIGURATION> with $<CONFIG> 2023-01-17 19:44:32 +00:00
Sean McBride
d70b4864d9 issue #2581: review and cleanup of compiler version checks 2023-01-17 18:58:34 +00:00
Mehdi Goli
b523120687 [SYCL-2020 Support] Enabling Intel DPCPP Compiler support to Eigen 2023-01-16 07:04:08 +00:00
tttapa
bae119bb7e Support per-thread is_malloc_allowed() state 2023-01-16 01:34:56 +00:00
Charles Schlosser
fa0bd2c34e improve sparse permutations 2023-01-15 03:21:25 +00:00
Antonio Sánchez
2e61c0c6b4 Add missing EIGEN_DEVICE_FUNC in a few places when called by asserts. 2023-01-15 02:06:17 +00:00
Charles Schlosser
4aca06f63a avoid move assignment in ColPivHouseholderQR 2023-01-15 01:34:10 +00:00
Charles Schlosser
68082b8226 Fix QR, again 2023-01-13 03:23:17 +00:00
Sergey Fedorov
4d05765345 Altivec fixes for Darwin: do not use unsupported VSX insns 2023-01-12 16:33:33 +00:00
Rasmus Munk Larsen
6156797016 Revert "Add template to specify QR permutation index type, Fix ColPivHouseholderQR Lapacke bindings"
This reverts commit be7791e097
2023-01-11 18:50:52 +00:00
Charles Schlosser
be7791e097 Add template to specify QR permutation index type, Fix ColPivHouseholderQR Lapacke bindings 2023-01-11 15:57:28 +00:00
Charles Schlosser
9463fc95f4 change insert strategy 2023-01-11 06:24:49 +00:00
Martin Burchell
c54785b071 Fix error: unused parameter 'tmp' [-Werror,-Wunused-parameter] on clang/32-bit arm 2023-01-10 21:15:28 +00:00
Antonio Sanchez
f47472603b Add missing header for GPU tests. 2023-01-09 11:21:13 -08:00
Charles Schlosser
81172cbdcb Overhaul Sparse Core 2023-01-07 22:09:42 +00:00
Robin Miquel
9255181891 Modified spbenchsolver help message because it could be misunderstood 2023-01-07 21:35:46 +00:00
Chip Kerchner
d20fe21ae4 Improve performance for Power10 MMA bfloat16 GEMM 2023-01-06 23:08:37 +00:00
Ryan Senanayake
fe7f527787 Fix guard macros for emulated FP16 operators on GPU 2023-01-06 22:02:51 +00:00
Rasmus Munk Larsen
b8422c99cd Update file jacobisvd.cpp 2023-01-06 21:14:17 +00:00
Antonio Sánchez
262194f12c Fix a bunch of minor build and test issues. 2023-01-06 16:37:26 +00:00
Antonio Sánchez
3564668908 Fix overalign check. 2023-01-05 17:10:48 +00:00
Charles Schlosser
f3929ac7ed Fix EIGEN_HAS_CXX17_OVERALIGN for icc 2023-01-03 17:30:10 +00:00
LAI Bruce
1b33a6374b Fixes git add . doesn't include scripts/buildtests.in 2023-01-03 17:06:36 +00:00
Charles Schlosser
a8bab0d8ae Patch SparseLU 2022-12-31 04:52:36 +00:00
Antonio Sánchez
910f6f65d0 Adjust thresholds for bfloat16 product tests that are currently failing. 2022-12-28 19:32:25 +00:00
Arthur
311cc0f9cc Enable NEON pcmp, plset, and complex psqrt 2022-12-22 05:38:34 +00:00
Antonio Sánchez
dbf7ae6f9b Fix up C++ version detection macros and cmake tests. 2022-12-20 18:06:03 +00:00
Antonio Sánchez
bb6675caf7 Fix incorrect NEON native fp16 multiplication. 2022-12-19 20:46:44 +00:00
Rasmus Munk Larsen
dd85d26946 Revert "Avoid mixing types in CompressedStorage.h" 2022-12-19 20:09:37 +00:00
Arthur Feeney
c4fb6af24b Enable NEON pabs for unsigned int types 2022-12-19 17:07:36 +00:00
Rasmus Munk Larsen
400bc5cd5b Add sparse_basic_1 to smoke tests. 2022-12-16 22:03:33 +00:00
Rasmus Munk Larsen
04e4f0bb24 Add missing colon in SparseMatrix.h. 2022-12-16 21:50:00 +00:00
Rasmus Munk Larsen
3d8a8def8a Avoid mixing types in CompressedStorage.h 2022-12-16 20:11:02 +00:00
Charles Schlosser
4bb2446796 Add operators to CompressedStorageIterator 2022-12-16 16:48:50 +00:00
Rasmus Munk Larsen
e1aee4ab39 Update test of numext::signbit. 2022-12-15 19:58:59 +00:00
Rasmus Munk Larsen
3717854a21 Use numext::signbit instead of std::signbit, which is not defined for bfloat16. 2022-12-15 18:41:46 +00:00
Alexander Richardson
37de432907 Avoid using std::raise() for divide by zero 2022-12-14 20:06:16 +00:00
Alexander Richardson
62de593c40 Allow std::initializer_list constructors in constexpr expressions 2022-12-14 17:05:37 +00:00
Charles Schlosser
6d3e3678b4 optimize equalspace packetop 2022-12-13 01:22:25 +00:00
Charles Schlosser
2004831941 add EqualSpaced / setEqualSpaced 2022-12-13 00:54:57 +00:00
Melven Roehrig-Zoellner
273f803846 Add BDCSVD_LAPACKE binding 2022-12-09 18:50:12 +00:00
Antonio Sánchez
03c9b4738c Enable direct access for NestByValue. 2022-12-07 18:21:45 +00:00
Chip Kerchner
b59f18b4f7 Increase L2 and L3 cache size for Power10. 2022-12-07 18:20:33 +00:00
Antonio Sánchez
c614b2bbd3 Fix index type for sparse index sorting. 2022-12-06 00:02:31 +00:00
Charles Schlosser
44fe539150 add sparse sort inner vectors function 2022-12-01 19:28:56 +00:00
Lianhuang Li
d194167149 Fix the bug using neon instruction fmla for data type half 2022-12-01 17:28:57 +00:00
Pedro Caldeira
31ab62d347 Add support for Power10 (AltiVec) MMA instructions for bfloat16. 2022-11-30 23:33:37 +00:00
Antonio Sánchez
dcb042a87d Fix serialization for non-compressed matrices. 2022-11-30 18:16:47 +00:00
Antonio Sánchez
2260e11eb0 Fix reshape strides when input has non-zero inner stride. 2022-11-29 19:39:29 +00:00
Alexandre Hoffmann
23524ab6fc Changing BiCGSTAB parameters initialization so that it works with custom types 2022-11-29 19:37:46 +00:00
Antonio Sánchez
ab2b26fbc2 Fix sparseLU solver when destination has a non-unit stride. 2022-11-29 19:37:03 +00:00
Antonio Sánchez
551eebc8ca Add synchronize method to all devices. 2022-11-29 19:35:02 +00:00
Charles Schlosser
b7551bff92 Fix a bunch of annoying compiler warnings in tests 2022-11-21 20:07:19 +00:00
Antonio Sánchez
e7b1ad0315 Add serialization for sparse matrix and sparse vector. 2022-11-21 19:43:07 +00:00
Charles Schlosser
044f3f6234 Fix bug in handmade_aligned_realloc 2022-11-18 22:35:31 +00:00
Chris
6728683938 Small cleanup of IDRS.h 2022-11-16 13:51:23 +00:00
Charles Schlosser
02805bd56c Fix AVX2 psignbit 2022-11-16 13:43:11 +00:00
Chip Kerchner
399ce1ed63 Fix duplicate execution code for Power 8 Altivec in pstore_partial. 2022-11-16 13:41:42 +00:00
Gabriele Buondonno
6431dfdb50 Cross product for vectors of size 2. Fixes #1037 2022-11-15 22:39:42 +00:00
Antonio Sánchez
8588d8c74b Correct pnegate for floating-point zero. 2022-11-15 18:07:23 +00:00
Antonio Sanchez
5eacb9e117 Put brackets around unsigned type names. 2022-11-15 09:09:45 -08:00
Antonio Sánchez
37e40dca85 Fix ambiguity in PPC for vec_splats call. 2022-11-14 18:58:16 +00:00
Antonio Sánchez
7dc6db75d4 Fix typo in CholmodSupport 2022-11-08 23:49:56 +00:00
Charles Schlosser
9b6d624eab fix neon 2022-11-08 20:03:01 +00:00
Rasmus Munk Larsen
7e398e9436 Add missing return keyword in psignbit for NEON. 2022-11-04 16:13:09 +00:00
Charles Schlosser
82b152dbe7 Add signbit function 2022-11-04 00:31:20 +00:00
Antonio Sánchez
8f8e36458f Remove recently added sparse assert in SparseMapBase. 2022-11-03 17:29:05 +00:00
Antonio Sanchez
01a31b81b2 Remove unused parameter name. 2022-11-01 15:51:25 -07:00
Antonio Sánchez
c5b896c5a3 Allow empty matrices to be resized. 2022-10-27 20:33:35 +00:00
Antonio Sánchez
886aad1361 Disable patan for double on PPC. 2022-10-27 17:56:08 +00:00
Antonio Sánchez
ab407b2b6e Fix handmade_aligned_malloc offset computation. 2022-10-27 17:33:47 +00:00
Antonio Sánchez
adb30efb25 Add assert for invalid outerIndexPtr array in SparseMapBase. 2022-10-26 22:51:33 +00:00
Antonio Sánchez
c27d1abe46 Fix pragma check for disabling fastmath. 2022-10-26 22:50:57 +00:00
Charles Schlosser
a226371371 Change handmade_aligned_malloc/realloc/free to store a 1 byte offset instead of absolute address 2022-10-22 22:51:31 +00:00
Antonio Sánchez
bf48d46338 Explicitly state that indices must be sorted. 2022-10-19 18:15:29 +00:00
Rasmus Munk Larsen
3bb6a48d8c Fix bug atan2 2022-10-12 23:49:32 +00:00
Rasmus Munk Larsen
14c847dc0e Refactor special values test for pow, and add a similar test for atan2 2022-10-12 20:12:08 +00:00
Rasmus Munk Larsen
462758e8a3 Don't use generic sign function for sign(complex) unless it is vectorizable 2022-10-12 16:03:29 +00:00
Rasmus Munk Larsen
c0d6a72611 Use pnegate(pzero(x)) as a generic way to generate -0.0. Some compiler do not handle the literal -0.0 properly in fastmath mode. 2022-10-12 01:57:05 +00:00
Laurent Rineau
7846c7387c Eigen/Sparse: fix warnings -Wunused-but-set-variable 2022-10-11 17:37:04 +00:00
Rasmus Munk Larsen
3167544873 Handle NaN inputs to atan2. 2022-10-10 19:36:36 -07:00
Rasmus Munk Larsen
72db3f0fa5 Remove references to M_PI_2 and M_PI_4. 2022-10-11 00:27:16 +00:00
Rasmus Munk Larsen
d6bc062591 Remove reference to EIGEN_HAS_CXX11_MATH. 2022-10-10 23:38:28 +00:00
Rasmus Munk Larsen
5ceed0d57f Guard GCC-specific pragmas with "#ifdef EIGEN_COMP_GNUC" 2022-10-10 20:38:53 +00:00
Alexander Richardson
528b68674c [clang-format] Add a few macros to AttributeMacros 2022-10-10 16:44:47 +00:00
Rasmus Munk Larsen
e95c4a837f Simpler range reduction strategy for atan<float>(). 2022-10-04 18:11:00 +00:00
Antonio Sánchez
80efbfdeda Unconditionally enable CXX11 math. 2022-10-04 17:37:47 +00:00
Antonio Sánchez
e5794873cb Replace assert with eigen_assert. 2022-10-04 17:11:23 +00:00
Antonio Sánchez
7d6a9925cc Fix 4x4 inverse when compiling with -Ofast. 2022-10-04 16:05:49 +00:00
Rasmus Munk Larsen
1414a76fa9 Only vectorize atan<double> for Altivec if VSX is available. 2022-10-03 22:06:58 +00:00
Rasmus Munk Larsen
c475228b28 Vectorize atan() for double. 2022-10-01 01:49:30 +00:00
Rasmus Munk Larsen
1e1848fdb1 Add a vectorized implementation of atan2 to Eigen. 2022-09-28 20:46:49 +00:00
Rasmus Munk Larsen
b3bf8d6a13 Try to reduce size of GEBP kernel for non-ARM targets. 2022-09-28 02:37:18 +00:00
Rasmus Munk Larsen
13b69fc1b0 Try to reduce compilation time/memory for GEBP kernel using EIGEN_IF_CONSTEXPR 2022-09-23 20:09:42 +00:00
Rasmus Munk Larsen
3c4637640b Remove unused typedef. 2022-09-23 19:11:31 +00:00
Rasmus Munk Larsen
ed8cda3ce4 Move EIGEN_NEON_GEBP_NR macro to the right place in GeneralBlockPanelKernel.h 2022-09-23 02:24:27 +00:00
Rasmus Munk Larsen
e2ea866515 Add a macro to set the nr trait in the BEBP kernel for NEON. 2022-09-22 23:56:34 +00:00
Lianhuang Li
23299632c2 Use 3px8/2px8/1px8/1x8 gebp_kernel on arm64-neon 2022-09-21 16:36:40 +00:00
Rasmus Munk Larsen
7b2901e2aa Add vectorized integer division for int32 with AVX512, AVX or SSE. 2022-09-21 00:27:23 +00:00
Chao Chen
5ffe7b92e0 [ROCm] fixed gpuGetDevice unused message 2022-09-20 21:38:20 +00:00
Rasmus Munk Larsen
f913a40678 Revert "Add AVX int32_t pdiv"
This reverts commit ea84e7ad63
2022-09-16 22:48:08 +00:00
Rasmus Munk Larsen
273e0c884e Revert "Add constexpr, test for C++14 constexpr." 2022-09-16 21:14:29 +00:00
Charles Schlosser
ea84e7ad63 Add AVX int32_t pdiv 2022-09-16 17:06:29 +00:00
Rasmus Munk Larsen
dceb779ecd Fix test for pow with mixed integer types. We do not convert the exponent if it is an integer type. 2022-09-12 15:51:27 -07:00
Rasmus Munk Larsen
afc014f1b5 Allow mixed types for pow(), as long as the exponent is exactly representable in the base type. 2022-09-12 21:55:30 +00:00
Antonio Sánchez
b2c82a9347 Remove bad skew_symmetric_matrix3 test. 2022-09-10 07:08:37 +00:00
Rasmus Munk Larsen
e8a2aa24a2 Fix a couple of issues with unary pow(): 2022-09-09 17:21:11 +00:00
Rohit Santhanam
07d0759951 [ROCm] Fix for sparse matrix related breakage on ROCm. 2022-09-09 14:41:00 +00:00
Antonio Sánchez
fb212c745d Fix g++-6 constexpr and c++20 constexpr build errors. 2022-09-09 03:41:45 +00:00
Thomas Gloor
ec9c7163a3 Feature/skew symmetric matrix3 2022-09-08 20:44:40 +00:00
Antonio Sánchez
311ba66f7c Fix realloc for non-trivial types. 2022-09-08 19:39:36 +00:00
Antonio Sánchez
3c37dd2a1d Tweak bound for pow to account for floating-point types. 2022-09-08 17:40:45 +00:00
Rasmus Munk Larsen
f9dfda28ab Add missing comparison operators for GPU packets. 2022-09-07 21:13:45 +00:00
Rasmus Munk Larsen
242325eca7 Remove unused variable. 2022-09-07 20:46:44 +00:00
Tobias Schlüter
133498c329 Add constexpr, test for C++14 constexpr. 2022-09-07 03:42:34 +00:00
Antonio Sánchez
69f50e3a67 Adjust overflow threshold bound for pow tests. 2022-09-06 19:53:29 +00:00
Antonio Sanchez
3e44f960ed Reduce compiler warnings for tests. 2022-09-06 18:20:56 +00:00
Florian Richer
b7e21d4e38 Call check_that_malloc_is_allowed() in aligned_realloc() 2022-09-06 18:00:37 +00:00
Gilles Aouizerate
6e83e906c2 fix typo in doc/TutorialSparse.dox 2022-09-06 17:56:13 +00:00
Michael Palomas
525f066671 fixed msvc compilation error in GeneralizedEigenSolver.h 2022-09-04 17:50:43 +00:00
Antonio Sánchez
f241a2c18a Add asserts for index-out-of-bounds in IndexedView. 2022-09-02 17:28:03 +00:00
Antonio Sánchez
f5364331eb Fix some cmake issues. 2022-09-02 16:43:14 +00:00
Antonio Sánchez
d816044b6e Fix mixingtypes tests. 2022-09-02 15:30:13 +00:00
Gilles Aouizerate
94cc83faa1 2 typos fix in the 3rd table. 2022-08-31 19:54:42 +00:00
Antonio Sánchez
30c42222a6 Fix some test build errors in new unary pow. 2022-08-30 17:24:14 +00:00
Rasmus Munk Larsen
bd393e15c3 Vectorize acos, asin, and atan for float. 2022-08-29 19:49:33 +00:00
Charles Schlosser
e5af9f87f2 Vectorize pow for integer base / exponent types 2022-08-29 19:23:54 +00:00
chuckyschluz
8acbf5c11c re-enable pow for complex types 2022-08-26 17:29:02 -04:00
Rasmus Munk Larsen
7064ed1345 Specialize psign<Packet8i> for AVX2, don't vectorize psign<bool>. 2022-08-26 17:02:37 +00:00
Rasmus Munk Larsen
98e51c9e24 Avoid undefined behavior in array_cwise test due to signed integer overflow 2022-08-26 16:19:03 +00:00
Arthur
a7c1cac18b Fix GeneralizedEigenSolver::info() and Asserts 2022-08-25 22:05:04 +00:00
Antonio Sanchez
714678fc6c Add missing ptr in realloc call. 2022-08-24 22:04:04 -07:00
Charles Schlosser
b2a13c9dd1 Sparse Core: Replace malloc/free with conditional_aligned 2022-08-23 21:44:22 +00:00
Rasmus Munk Larsen
6aad0f821b Fix psign for unsigned integer types, such as bool. 2022-08-22 20:19:35 +00:00
Rasmus Munk Larsen
1a09defce7 Protect new pblend implementation with EIGEN_VECTORIZE_AVX2 2022-08-22 18:28:03 +00:00
Rasmus Munk Larsen
7c67dc67ae Use proper double word division algorithm for pow<double>. Gives 11-15% speedup. 2022-08-17 18:36:23 +00:00
Matthew Sterrett
7a3b667c43 Add support for AVX512-FP16 for vectorizing half precision math 2022-08-17 18:15:21 +00:00
Charles Schlosser
76a669fb45 add fixed power unary operation 2022-08-16 21:32:36 +00:00
Matthew Sterrett
39fcc89798 Removed unnecessary checks for FP16C 2022-08-16 18:14:41 +00:00
Romain Biessy
2f7cce2dd5 [SYCL] Fix some SYCL tests 2022-08-16 17:37:54 +00:00
Arthur
27367017bd Disable bad "deprecated warning" edge-case in BDCSVD 2022-08-11 18:43:31 +00:00
Antonio Sánchez
b8e93bf589 Eliminate bool bitwise warnings. 2022-08-09 22:42:30 +00:00
Lexi Bromfield
66ea0c09fd Don't double-define Half functions on aarch64 2022-08-09 20:00:34 +00:00
Rasmus Munk Larsen
97e0784dc6 Vectorize the sign operator in Eigen. 2022-08-09 19:54:57 +00:00
Arthur
be20207d10 Fix vectorized Jacobi Rotation 2022-08-08 19:29:56 +00:00
Rasmus Munk Larsen
7a87ed1b6a Fix code and unit test for a few corner cases in vectorized pow() 2022-08-08 18:48:36 +00:00
Chip Kerchner
9e0afe0f02 Fix non-VSX PowerPC build 2022-08-08 18:18:17 +00:00
Chip Kerchner
84a9d6fac9 Fix use of Packet2d type for non-VSX. 2022-08-03 20:48:13 +00:00
Chip Kerchner
ce60a7be83 Partial Packet support for GEMM real-only (PowerPC). Also fix compilation warnings & errors for some conditions in new API. 2022-08-03 18:15:19 +00:00
Antonio Sánchez
5a1c7807e6 Fix inner iterator for sparse block. 2022-08-03 17:26:12 +00:00
Antonio Sánchez
39d22ef46b Fix flaky packetmath_1 test. 2022-08-02 17:42:45 +00:00
Antonio Sánchez
7896c7dc6b Use numext::sqrt in ConjugateGradient. 2022-07-29 20:17:23 +00:00
Ilya Tokar
e618c4a5e9 Improve pblend AVX implementation 2022-07-29 18:45:33 +00:00
sjusju
ef4654bae7 Add true determinant to QR and it's variants 2022-07-29 18:24:14 +00:00
Alexander Richardson
b7668c0371 Avoid including <sstream> with EIGEN_NO_IO 2022-07-29 18:02:51 +00:00
John Mather
7dd3dda3da Updated AccelerateSupport documentation after PR 966. 2022-07-29 17:42:31 +00:00
Julian Kent
69714ff613 Add Sparse Subset of Matrix Inverse 2022-07-28 18:04:35 +00:00
Antonio Sánchez
34780d8bd1 Include immintrin.h header for enscripten. 2022-07-22 02:27:42 +00:00
Antonio Sánchez
2cf4d18c9c Disable AVX512 GEMM kernels by default. 2022-07-20 21:22:48 +00:00
Charles Schlosser
a678a3e052 Fix aligned_realloc to call check_that_malloc_is_allowed() if ptr == 0 2022-07-19 20:59:07 +00:00
b-shi
4a56359406 Add option to disable avx512 GEBP kernels 2022-07-18 17:59:09 +00:00
Mathieu Westphal
1092574b26 Fix wrong doxygen group usage 2022-07-12 13:22:46 +02:00
Antonio Sánchez
e1165dbf9a AutoDiff depends on Core, so include appropriate header. 2022-07-09 23:57:09 +00:00
Antonio Sánchez
bb51d9f4fa Fix ODR violations. 2022-07-09 04:56:36 +00:00
Rohit Santhanam
06a458a13d Enable subtests which use device side malloc since this has been fixed in ROCm 5.2. 2022-06-29 17:09:43 +00:00
Chip Kerchner
84cf3ff18d Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial. 2022-06-27 19:18:00 +00:00
Chip Kerchner
c603275dc9 Better performance for Power10 using more load and store vector pairs for GEMV 2022-06-27 18:11:55 +00:00
Antonio Sanchez
0e18714167 Fix clang-tidy warnings about function definitions in headers. 2022-06-24 15:10:58 +00:00
Antonio Sánchez
8ed3b9dcd6 Skip f16/bf16 bessel specializations on AVX512 if unavailable. 2022-06-24 15:10:36 +00:00
Antonio Sánchez
bc2ab81634 Eliminate undef warnings when not compiling for AVX512. 2022-06-24 15:10:10 +00:00
Antonio Sánchez
0e083b172e Use numext::sqrt in Householder.h. 2022-06-21 16:29:59 +00:00
b-shi
37673ca1bc AVX512 TRSM kernels use alloca if EIGEN_NO_MALLOC requested 2022-06-17 18:05:26 +00:00
Chip Kerchner
4d1c16eab8 Fix tanh and erf to use vectorized version for EIGEN_FAST_MATH in VSX. 2022-06-15 16:06:43 +00:00
Mehdi Goli
7ea823e824 [SYCL-Spec] According to [SYCL-2020 spec](... 2022-06-13 15:52:29 +00:00
Arthur
ba4d7304e2 Document DiagonalBase 2022-06-08 17:46:32 +00:00
Binhao Qin
95463b59bc Mark index_remap as EIGEN_DEVICE_FUNC in src/Core/Reshaped.h (Fixes #2493) 2022-06-07 20:10:47 +00:00
Shi, Brian
28812d2ebb AVX512 TRSM Kernels respect EIGEN_NO_MALLOC 2022-06-07 11:28:42 -07:00
sfalmo
9960a30422 Fix row vs column vector typo in Matrix class tutorial 2022-06-07 17:28:19 +00:00
Antonio Sánchez
8c2e0e3cb8 Fix ambiguous comparisons for c++20 (again again) 2022-06-07 17:06:17 +00:00
Arthur
14aae29470 Provide DiagonalMatrix Product and Initializers 2022-06-06 21:43:22 +00:00
Antonio Sánchez
76cf6204f3 Revert "Fix c++20 ambiguity of comparisons."
This reverts commit 4f6354128f
2022-06-04 02:32:10 +00:00
aaraujom
8fbb76a043 Fix build issues with MSVC for AVX512 2022-06-03 14:55:40 +00:00
Antonio Sánchez
4f6354128f Fix c++20 ambiguity of comparisons. 2022-06-03 05:11:07 +00:00
Oleg Shirokobrod
f542b0a71f Adding an MKL adapter in FFT module. 2022-06-02 18:10:43 +00:00
aaraujom
d49ede4dc4 Add AVX512 s/dgemm optimizations for compute kernel (2nd try) 2022-05-28 02:00:21 +00:00
Rasmus Munk Larsen
510f6b9f15 Fix integer shortening warnings in visitor tests. 2022-05-27 18:51:37 +00:00
Arthur
705ae70646 Add R-Bidiagonalization step to BDCSVD 2022-05-27 02:00:24 +00:00
Mario Rincon-Nigro
e99163e732 fix: issue 2481: LDLT produce wrong results with AutoDiffScalar 2022-05-25 15:26:10 +00:00
Antonio Sánchez
477eb7f630 Revert "Avoid ambiguous Tensor comparison operators for C++20 compatibility"
This reverts commit 5c2179b6c3
2022-05-24 16:09:59 +00:00
Mehdi Goli
c5a5ac680c [SYCL] SYCL-2020 range does not have default constructor. 2022-05-24 03:11:46 +00:00
Benjamin Kramer
5c2179b6c3 Avoid ambiguous Tensor comparison operators for C++20 compatibility 2022-05-23 17:36:03 +00:00
Chip Kerchner
aa8b7e2c37 Add subMappers to Power GEMM packing - simplifies the address calculations (10% faster) 2022-05-23 15:18:29 +00:00
Antonio Sánchez
32348091ba Avoid signed integer overflow in adjoint test. 2022-05-23 14:46:16 +00:00
Mehdi Goli
cbe03f3531 [SYCL] Extending SYCL queue interface extension. 2022-05-23 14:45:27 +00:00
Guoqiang QI
32a3f9ac33 Improve plogical_shift_* implementations and fix typo in SVE/PacketMath.h 2022-05-23 09:33:49 +00:00
Eisuke Kawashima
ac5c83a3f5 unset executable flag 2022-05-22 22:47:43 +09:00
Antonio Sanchez
481a4a8c31 Fix BDCSVD condition for failing with numerical issue. 2022-05-20 08:18:31 -07:00
Tobias Wood
a9868bd5be Add arg() to tensor 2022-05-20 03:33:01 +00:00
Antonio Sánchez
028ab12586 Prevent BDCSVD crash caused by index out of bounds. 2022-05-19 22:29:48 +00:00
Rohan Ghige
798fc1c577 Fix 'Incorrect reference code in STL_interface.hh for ata_product' eigen/isses/2425 2022-05-18 14:42:57 +00:00
Antonio Sánchez
9b9496ad98 Revert "Add AVX512 optimizations for matrix multiply"
This reverts commit 25db0b4a82
2022-05-13 18:50:33 +00:00
aaraujom
25db0b4a82 Add AVX512 optimizations for matrix multiply 2022-05-12 23:41:19 +00:00
Guoqiang QI
00b75375e7 Adding PocketFFT support in FFT module since kissfft has some flaw in accuracy and performance 2022-05-11 17:44:22 +00:00
Rasmus Munk Larsen
73d65dbc43 Update README.md. Remove obsolete comment about RowMajor not being fully supported. 2022-05-06 18:19:35 +00:00
Francesco Romano
68e03ab240 Add uninstall target only if not already defined. 2022-05-05 17:43:08 +00:00
Alex_M
2c055f8633 make diagonal matrix cols() and rows() methods constexpr 2022-05-03 10:13:37 +02:00
Chip Kerchner
c2f15edc43 Add load vector_pairs for RHS of GEMM MMA. Improved predux GEMV. 2022-04-25 16:23:01 +00:00
John Mather
9e026e5e28 Removed need to supply the Symmetric flag to UpLo argument for Accelerate LLT and LDLT 2022-04-21 20:02:10 +00:00
Chip Kerchner
44ba7a0da3 Fix compiler bugs for GCC 10 & 11 for Power GEMM 2022-04-20 15:59:00 +00:00
Chip Kerchner
b02c384ef4 Add fused multiply functions for PowerPC - pmsub, pnmadd and pnmsub 2022-04-18 16:16:32 +00:00
Rohit Santhanam
3de96caeaa Fix HouseholderSequence.h 2022-04-17 02:46:56 +00:00
Antonio Sánchez
f845a8bb1a Fix cwise NaN propagation for scalar input. 2022-04-16 05:07:44 +00:00
Charles Schlosser
a4bb513b99 Update HouseholderSequence.h 2022-04-15 16:56:17 +00:00
Shi, Brian
fc1d888415 Remove AVX512VL dependency in trsm 2022-04-14 12:44:24 -07:00
Antonio Sánchez
07db964bde Restrict new AVX512 trsm to AVX512VL, rename files for consistency. 2022-04-14 16:58:32 +00:00
Charles Schlosser
67eeba6e72 Avoidable heap allocation in applyHouseholderToTheLeft 2022-04-13 18:45:36 +00:00
Antonio Sánchez
3342fc7e4d Allow all tests to pass with EIGEN_TEST_NO_EXPLICIT_VECTORIZATION 2022-04-12 14:48:22 +00:00
Antonio Sánchez
efb08e0bb5 Revert "Fix ambiguous DiagonalMatrix constructors."
This reverts commit a81bba962a
2022-04-12 03:54:31 +00:00
Chip Kerchner
53eec53d2a Fix Power GEMV order of operations in predux for MMA. 2022-04-11 21:29:05 +00:00
Antonio Sánchez
a81bba962a Fix ambiguous DiagonalMatrix constructors. 2022-04-11 19:13:25 +00:00
Antonio Sánchez
f7b31f864c Revert "Replace call to FixedDimensions() with a singleton instance of"
This reverts commit 19e6496ce0
2022-04-10 15:30:33 +00:00
Tobias Schlüter
f3ba220c5d Remove EIGEN_EMPTY_STRUCT_CTOR 2022-04-08 18:27:26 +00:00
Antonio Sánchez
5ed7a86ae9 Fix MSVC+CUDA issues. 2022-04-08 18:05:32 +00:00
Antonio Sánchez
734ed1efa6 Fix ODR issues in lapacke_helpers. 2022-04-08 15:31:30 +00:00
Antonio Sánchez
2c45a3846e Fix some max size expressions. 2022-04-06 22:19:57 +00:00
Antonio Sánchez
edc822666d Fix navbar scroll with toc. 2022-04-05 20:14:22 +00:00
Erik Schultheis
df87d40e34 constexpr reshape helper 2022-04-05 17:32:17 +00:00
Chip Kerchner
403fa33409 Performance improvements in GEMM for Power 2022-04-05 12:18:53 +00:00
Erik Schultheis
e1df3636b2 More constexpr helpers 2022-04-04 18:38:34 +00:00
Erik Schultheis
64909b82bd static const class members turned into constexpr 2022-04-04 17:33:33 +00:00
William Talbot
2c0ef43b48 Added Scaling function overload for vector rvalue reference 2022-04-04 16:50:09 +00:00
Antonio Sanchez
ba2cb835aa Add back std::remove* aliases - third-party libraries rely on these. 2022-04-01 17:02:52 +00:00
Antonio Sánchez
0c859cf35d Consider inf/nan in scalar test_isApprox. 2022-04-01 17:00:24 +00:00
Erik Schultheis
1ddd3e29cb fixed order of arguments in blas syrk 2022-03-30 21:45:13 +00:00
Antonio Sánchez
2c56442805 Don't include .cpp in lapack. 2022-03-30 21:41:56 +00:00
Antonio Sánchez
73b2c13bf2 Disable f16c scalar conversions for MSVC. 2022-03-30 18:35:32 +00:00
Antonio Sanchez
9bc9992dd3 Eliminate trace unused warning. 2022-03-29 22:04:50 +00:00
Tobias Schlüter
e22d58e816 Add is_constant_evaluated, update alignment checks 2022-03-25 04:00:58 +00:00
Everton Constantino
f0a91838aa Enable Aarch64 CI 2022-03-24 19:50:49 +00:00
Erik Schultheis
b9d2900e8f added a missing typename and fixed a unused typedef warning 2022-03-24 12:07:18 +02:00
b-shi
0611f7fff0 Add missing explicit reinterprets 2022-03-23 21:10:26 +00:00
Essex Edwards
cd3c81c3bc Add a NNLS solver to unsupported - issue #655 2022-03-23 20:20:44 +00:00
Chip Kerchner
0699fa06fe Split general_matrix_vector_product interface for Power into two macros - one ColMajor and RowMajor. 2022-03-23 18:09:33 +00:00
Antonio Sánchez
19a6a827c4 Optimize visitor traversal in case of RowMajor. 2022-03-23 15:27:57 +00:00
Romain Biessy
f2a3e03e9b Fix usages of wrong namespace 2022-03-21 15:07:53 +00:00
Antonio Sánchez
4451823fb4 Fix ODR violation in trsm. 2022-03-20 15:56:53 +00:00
Antonio Sánchez
9a14d91a99 Fix AVX512 builds with MSVC. 2022-03-18 16:04:53 +00:00
Chip Kerchner
7b10795e39 Change EIGEN_ALTIVEC_ENABLE_MMA_DYNAMIC_DISPATCH and EIGEN_ALTIVEC_DISABLE_MMA flags to be like TensorFlow's... 2022-03-17 22:35:27 +00:00
Antonio Sánchez
3ca1228d45 Work around MSVC compiler bug dropping const. 2022-03-17 20:50:26 +00:00
Tobias Schlüter
40eb34bc5d Fix RowMajorBit <-> RowMajor mixup. 2022-03-17 15:28:12 +00:00
Øystein Sørensen
c062983464 Completed a missing parenthesis in tutorial. 2022-03-17 14:52:07 +00:00
Antonio Sánchez
9deaa19121 Work around g++-10 docker issue for geo_orthomethods_4. 2022-03-16 21:46:04 +00:00
Antonio Sanchez
e34db1239d Fix missing pound 2022-03-16 12:26:12 -07:00
Antonio Sánchez
591906477b Fix up PowerPC MMA flags so it builds by default. 2022-03-16 19:16:28 +00:00
b-shi
518fc321cb AVX512 Optimizations for Triangular Solve 2022-03-16 18:04:50 +00:00
Antonio Sánchez
01b5bc48cc Disable schur non-convergence test. 2022-03-16 17:33:53 +00:00
Erik Schultheis
421cbf0866 Replace Eigen type metaprogramming with corresponding std types and make use of alias templates 2022-03-16 16:43:40 +00:00
Arthur
514f90c9ff Remove workarounds for bad GCC-4 warnings 2022-03-16 00:08:16 +00:00
Rasmus Munk Larsen
9ad5661482 Revert "Fix up PowerPC MMA flags so it builds by default." 2022-03-15 20:51:03 +00:00
Antonio Sánchez
65eeedf964 Fix up PowerPC MMA flags so it builds by default. 2022-03-15 20:22:23 +00:00
Tobias Schlüter
cb1e8228e9 Convert bit calculation to constexpr, avoid casts. 2022-03-13 22:38:36 +09:00
Antonio Sánchez
baf9a985ec Fix swap test for size 1 inputs. 2022-03-10 15:05:58 +00:00
Everton Constantino
7882408856 Temporarily disable aarch64 CI. 2022-03-10 09:34:19 -03:00
Rohit Santhanam
2a6be5492f Fix construct_at compilation breakage on ROCm. 2022-03-09 16:47:53 +00:00
Duncan McBain
a3b64625e3 Remove ComputeCpp-specific code from SYCL Vptr 2022-03-08 22:44:18 +00:00
Antonio Sánchez
9296bb4b93 Fix edge-case in zeta for large inputs. 2022-03-08 21:21:20 +00:00
Tobias Schlüter
cd2ba9d03e Add construct_at, destroy_at wrappers. Use throughout. 2022-03-08 20:43:22 +00:00
AlexanderMueller
dfa5176780 make SparseSolverBase and IterativeSolverBase move constructable 2022-03-08 20:03:53 +01:00
Tobias Schlüter
9883108f3a Remove copy_bool workaround for gcc 4.3 2022-03-08 17:43:11 +00:00
John Mather
3a9d404d76 Add support for Apple's Accelerate sparse matrix solvers 2022-03-08 00:09:18 +00:00
Antonio Sánchez
008ff3483a Fix broken tensor executor test, allow tensor packets of size 1. 2022-03-07 20:30:37 +00:00
Antonio Sanchez
28c7c1a629 Log position of first difference for easier debugging. 2022-03-07 19:06:27 +00:00
Antonio Sánchez
cf82186416 Adds new CMake Options for controlling build components. 2022-03-05 05:49:45 +00:00
Antonio Sánchez
b2ee235a4b Split and reduce SVD test sizes. 2022-03-05 00:15:28 +00:00
Antonio Sánchez
0ae94456a0 Remove duplicate IsRowMajor declaration. 2022-03-04 21:22:02 +00:00
Rasmus Munk Larsen
0e6f4e43f1 Fix a few confusing comments in psincos_float. 2022-03-04 20:41:49 +00:00
Sean McBride
f1b9692d63 Removed EIGEN_UNUSED decorations from many functions that are in fact used 2022-03-03 20:19:33 +00:00
Antonio Sánchez
27d8f29be3 Update vectorization_logic tests for all platforms. 2022-03-03 19:54:15 +00:00
Arthur
c9ff739af1 Fix JacobiSVD_LAPACKE bindings 2022-03-03 19:24:07 +00:00
Zhuo Zhang
d0b1aef6f6 Speed lscg by using .noalias 2022-03-03 08:52:09 +00:00
Antonio Sanchez
55c7400db5 Fix enum conversion warnings in BooleanRedux. 2022-03-03 04:44:20 +00:00
Antonio Sánchez
711803c427 Skip denormal test if Cond is false. 2022-03-03 04:32:13 +00:00
Antonio Sánchez
d819a33bf6 Remove poor non-convergence checks in NonLinearOptimization. 2022-03-02 19:31:20 +00:00
Antonio Sánchez
9c07e201ff Modified sqrt/rsqrt for denormal handling. 2022-03-02 17:20:47 +00:00
Antonio Sanchez
1c2690ed24 Adjust tolerance of matrix_power test for MSVC. 2022-03-01 23:33:05 +00:00
Antonio Sánchez
b48922cb5c Fix SVD for MSVC+CUDA. 2022-03-01 21:35:22 +00:00
Yury Gitman
bf6726a0c6 Fix any/all reduction in the case of row-major layout 2022-03-01 05:27:50 +00:00
Antonio Sánchez
f03df0df53 Fix SVD for MSVC. 2022-02-28 19:53:15 +00:00
Antonio Sánchez
19c39bea29 Fix mixingtypes for g++-11. 2022-02-25 19:28:10 +00:00
Antonio Sánchez
2ed4bee78f Fix frexp packetmath tests for MSVC. 2022-02-24 22:16:37 +00:00
Antonio Sánchez
d58e629130 Disable deprecated warnings for SVD tests on MSVC. 2022-02-24 21:20:49 +00:00
Antonio Sánchez
3d7e2d0e3e Fix packetmath compilation error. 2022-02-23 23:27:08 +00:00
Antonio Sánchez
8970719771 Fix gcc-5 packetmath_12 bug. 2022-02-23 21:56:25 +00:00
Antonio Sánchez
f0b81fefb7 Disable deprecated warnings in SVD tests. 2022-02-23 18:32:00 +00:00
Rasmus Munk Larsen
8b875dbef1 Changes to fast SQRT/RSQRT 2022-02-23 17:32:21 +00:00
Ramil Sattarov
f9b7564faa E2K: initial support of LCC MCST compiler for the Elbrus 2000 CPU architecture 2022-02-23 17:07:34 +00:00
Antonio Sánchez
ae86a146b1 Modify test expression to avoid numerical differences (#2402). 2022-02-23 16:37:03 +00:00
Arthur
cd80e04ab7 Add assert for edge case if Thin U Requested at runtime 2022-02-23 05:35:19 +00:00
Lingzhu Xiang
35727928ad Fix test macro conflicts with STL headers in C++20 2022-02-23 07:43:30 +08:00
Romain Biessy
2dd879d4b0 [SYCL] Fix CMake for SYCL support 2022-02-22 16:53:27 +00:00
Martin Heistermann
550af3938c Fix for crash bug in SPQRSupport: Initialize pointers to nullptr to avoid free() calls of invalid pointers. 2022-02-18 16:13:28 +00:00
Antonio Sánchez
58a90c7463 Use fixed-sized U/V for fixed-sized inputs. 2022-02-16 18:31:47 +00:00
Antonio Sánchez
c367ed26a8 Make FixedInt constexpr, fix ODR of fix<N> 2022-02-16 17:47:51 +00:00
Antonio Sánchez
766087329e Re-add svd::compute(Matrix, options) method to avoid breaking external projects. 2022-02-16 00:54:02 +00:00
Antonio Sánchez
a58af20d61 Add descriptions to Matrix typedefs. 2022-02-15 21:53:27 +00:00
Antonio Sánchez
28e008b99a Fix sqrt/rsqrt for NEON. 2022-02-15 21:31:51 +00:00
Antonio Sanchez
23755030c9 Fix MSVC+NVCC 9.2 pragma error. 2022-02-15 10:51:32 -08:00
Erik Schultheis
7197b577fb Remove unused macros in AVX packetmath.
The following macros are removed:

* EIGEN_DECLARE_CONST_Packet8f
* EIGEN_DECLARE_CONST_Packet4d
* EIGEN_DECLARE_CONST_Packet8f_FROM_INT
* EIGEN_DECLARE_CONST_Packet8i
2022-02-14 10:34:23 +00:00
Antonio Sanchez
bded5028a5 Fix ODR failures in TensorRandom. 2022-02-11 23:28:33 -08:00
Rasmus Munk Larsen
18eab8f997 Add convenience method constexpr std::size_t size() const to Eigen::IndexList 2022-02-12 04:23:03 +00:00
Florian Maurin
fbc62f7df9 Complete doc with MatrixXNt and MatrixNXt 2022-02-11 21:55:54 +00:00
Chip Kerchner
cb5ca1c901 Cleanup compiler warnings, etc from recent changes in GEMM & GEMV for PowerPC 2022-02-09 18:47:08 +00:00
Matt Keeter
cec0005c74 Return alphas() and betas() by const reference 2022-02-08 23:16:10 +00:00
Rasmus Munk Larsen
92d0026b7b Provide a definition for numeric_limits static data members 2022-02-08 20:34:53 +00:00
Björn Ingvar Dahlgren
b94bddcde0 Typo in COD's doc: matrixR() -> matrixT() 2022-02-07 18:30:25 +00:00
Antonio Sánchez
94bed2b80c Fix collision with resolve.h. 2022-02-07 18:17:42 +00:00
Antonio Sanchez
b88de3f24f Update MPL2 with https. 2022-02-07 17:30:31 +00:00
Antonio Sánchez
9441d94dcc Revert "Make fixed-size Matrix and Array trivially copyable after C++20"
This reverts commit 47eac21072
2022-02-05 04:40:29 +00:00
Rasmus Munk Larsen
979fdd58a4 Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments. 2022-02-05 00:20:13 +00:00
Antonio Sánchez
4bffbe84f9 Restrict GCC<6.3 maxpd workaround to only gcc. 2022-02-04 22:47:34 +00:00
Antonio Sánchez
e7f4a901ee Define EIGEN_HAS_AVX512_MATH in PacketMath. 2022-02-04 22:25:52 +00:00
Antonio Sánchez
6b60bd6754 Fix 32-bit arm int issue. 2022-02-04 21:59:33 +00:00
Antonio Sánchez
96da541cba Fix AVX512 math function consistency, enable for ICC. 2022-02-04 19:35:18 +00:00
Antonio Sánchez
cafeadffef Fix ODR violations. 2022-02-04 19:01:07 +00:00
Arthur
18b50458b6 Update SVD Module with Options template parameter 2022-02-02 00:15:44 +00:00
Erik Schultheis
89c6ab2385 removed some documentation referencing c++98 behaviour 2022-01-30 12:02:18 +00:00
Chip Kerchner
66464bd2a8 Fix number of block columns to NOT overflow the cache (PowerPC) abnormally in GEMV 2022-01-27 20:35:53 +00:00
Rasmus Munk Larsen
7db0ac977a Remove extraneous ")". 2022-01-27 02:20:03 +00:00
Rasmus Munk Larsen
09c0085a57 Only test pmsub, pnmadd, and pnmsub on signed types. 2022-01-27 02:09:25 +00:00
Rasmus Munk Larsen
8f2c6f0aa6 Make preciprocal IEEE compliant w.r.t. 1/0 and 1/inf. 2022-01-26 20:38:05 +00:00
Erik Schultheis
d271a7d545 reduce float warnings (comparisons and implicit conversions) 2022-01-26 18:16:19 +00:00
Rasmus Munk Larsen
51311ec651 Remove inline assembly for FMA (AVX) and add remaining extensions as packet ops: pmsub, pnmadd, and pnmsub. 2022-01-26 04:25:41 +00:00
Erik Schultheis
4e629b3c1b make casts explicit and fixed the type 2022-01-24 18:19:21 +00:00
Rasmus Munk Larsen
ea2c02060c Add reciprocal packet op and fast specializations for float with SSE, AVX, and AVX512. 2022-01-21 23:49:18 +00:00
Arthur Feeney
4b0926f99b Prevent heap allocation in diagonal product 2022-01-21 21:15:44 +00:00
Ilya Tokar
a0fc640c18 Add support for packets of int64 on x86 2022-01-21 19:55:23 +00:00
Erik Schultheis
970640519b Cleanup 2022-01-21 01:48:59 +00:00
Stephen Pierce
81c928ba55 Silence some MSVC warnings 2022-01-21 00:29:23 +00:00
Sean McBride
c454b8c813 Improve clang warning suppressions by checking if warning is supported 2022-01-21 00:27:43 +00:00
David Gao
fb05198bdd Port EIGEN_OPTIMIZATION_BARRIER to soft float arm 2022-01-20 00:44:17 +00:00
arthurfeeney
937c3d73cb Fix implicit conversion warning in GEBP kernel's packing 2022-01-17 17:00:59 -06:00
Essex Edwards
49a8a1e07a Minor correction/clarification to LSCG solver documentation 2022-01-14 19:48:54 +00:00
Arthur
5fe0115724 Update comment referencing removed macro EIGEN_SIZE_MIN_PREFER_DYNAMIC 2022-01-14 19:29:47 +00:00
Arthur
ff4dffc34d fix implicit conversion warning in vectorwise_reverse_inplace 2022-01-13 20:30:54 +00:00
Chip Kerchner
708fd6d136 Add MMA and performance improvements for VSX in GEMV for PowerPC. 2022-01-13 13:23:18 +00:00
Jörg Buchwald
d1bf056394 fix compilation issue with gcc < 10 and -std=c++2a 2022-01-13 01:24:20 +01:00
Rasmus Munk Larsen
a30ecb7221 Don't use the fast implementation if EIGEN_GPU_CC, since integer_packet is not defined for float4 used by the GPU compiler (even on host). 2022-01-12 20:16:16 +00:00
Erik Schultheis
5a0a165c09 fix broken asserts 2022-01-12 18:31:53 +00:00
Rasmus Munk Larsen
0b58738938 Fix two corner cases in the new implementation of logistic sigmoid. 2022-01-12 00:41:29 +00:00
Fabrice Fontaine
5d7ffe2ca9 drop COPYING.GPL 2022-01-10 23:14:33 +00:00
Matthias Möller
a32e6a4047 Explicit type casting 2022-01-10 22:06:43 +00:00
Kolja Brix
8d81a2339c Reduce usage of reserved names 2022-01-10 20:53:29 +00:00
Essex Edwards
c61b3cb0db Fix IterativeSolverBase referring to itself as ConjugateGradient 2022-01-08 08:25:15 +00:00
Rasmus Munk Larsen
80ccacc717 Fix accuracy of logistic sigmoid 2022-01-08 00:15:14 +00:00
Rasmus Munk Larsen
8b8125c574 Make sure the scalar and vectorized path for array.exp() return consistent values. 2022-01-07 23:31:35 +00:00
Matthias Möller
c9df98b071 Fix Gcc8.5 warning about missing base class initialisation (#2404) 2022-01-07 19:16:53 +00:00
Lingzhu Xiang
47eac21072 Make fixed-size Matrix and Array trivially copyable after C++20
Making them trivially copyable allows using std::memcpy() without undefined
behaviors.

Only Matrix and Array with trivially copyable DenseStorage are marked as
trivially copyable with an additional type trait.

As described in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0848r3.html
it requires extremely verbose SFINAE to make the special member functions of
fixed-size Matrix and Array trivial, unless C++20 concepts are available to
simplify the selection of trivial special member functions given template
parameters. Therefore only make this feature available to compilers that support
C++20 P0848R3.

Fix #1855.
2022-01-07 19:04:35 +00:00
Matthias Möller
c4b1dd2f6b Add support for Cray, Fujitsu, and Intel ICX compilers
The following preprocessor macros are added:

- EIGEN_COMP_CPE and EIGEN_COMP_CLANGCPE version number of the CRAY compiler if
  Eigen is compiled with the Cray C++ compiler, 0 otherwise.

- EIGEN_COMP_FCC and EIGEN_COMP_CLANGFCC version number of the FCC compiler if
  Eigen is compiled with the Fujitsu C++ compiler, 0 otherwise

- EIGEN_COMP_CLANGICC version number of the ICX compiler if Eigen is compiled
  with the Intel oneAPI C++ compiler, 0 otherwise

All three compilers (Cray, Fujitsu, Intel) offer a traditional and a Clang-based
frontend. This is distinguished by the CLANG prefix.
2022-01-07 18:46:16 +00:00
Rasmus Munk Larsen
96dc37a03b Some fixes/cleanups for numeric_limits & fix for related bug in psqrt 2022-01-07 01:10:17 +00:00
Fabian Keßler
ed27d988c1 Fixes #i2411 2022-01-06 20:02:37 +00:00
Rasmus Munk Larsen
7b5a8b6bc5 Improve plog: 20% speedup for float + handle denormals 2022-01-05 23:40:31 +00:00
Andrew Johnson
a491c7f898 Allow specifying inner & outer stride for CWiseUnaryView - fixes #2398 2022-01-05 19:24:46 +00:00
Rohit Santhanam
27a78e4f96 Some serialization API changes were made in commit... 2022-01-05 16:18:45 +00:00
Erik Schultheis
9210e71fb3 ensure that eigen::internal::size is not found by ADL, rename to ssize and... 2022-01-05 00:46:09 +00:00
Lingzhu Xiang
7244a74ab0 Add bounds checking to Eigen serializer 2022-01-03 17:00:24 +08:00
David Tellenbach
ba91839d71 Remove user survey from Doxygen header 2021-12-31 12:15:19 +01:00
Shiva Ghose
a4098ac676 Fix duplicate include guard *ALTIVEC_H -> *ZVECTOR_H
Some some header guards were repeated between the `AltiVec` package and the
`ZVector` packages. This could cause a problem if (for whatever reason) someone
attempts to include headers for both architectures.
2021-12-31 08:43:24 +00:00
David Tellenbach
22a347b9d2 Remove unused EIGEN_HAS_STATIC_ARRAY_TEMPLATE
ec2fd0f7 removed the EIGEN_HAS_STATIC_ARRAY_TEMPLATE but forgot to remove this
last occurrence.

This fixes issue #2399.
2021-12-30 15:26:55 +00:00
David Tellenbach
d705eb5f86 Revert "Select AVX2 even if the data size is not a multiple of 8"
Tests are failing for AVX and NEON.

This reverts commit eb85b97339.
2021-12-28 23:57:06 +01:00
Rasmus Munk Larsen
8eab7b6886 Improve exp<float>(): Don't flush denormal results +4% speedup.
1. Speed up exp(x) by reducing the polynomial approximant from degree 7 to
degree 6. With exactly representable coefficients computed by the Sollya tool,
this still gives a maximum relative error of 1 ulp, i.e. faithfully rounded, for
arguments where exp(x) is a normalized float. This change results in a speedup
of about 4% for AVX2.


2. Extend the range where exp(x) returns a non-zero result to from ~[-88;88] to
~[-104;88] i.e. return denormalized values for large negative arguments instead
of zero. Compared to exp<double>(x) the denormalized results gradually decrease
in accuracy down to 0.033 relative error for arguments around x = -104 where
exp(x) is ~std::numeric<float>::denorm_min(). This is expected and acceptable.
2021-12-28 15:00:19 +00:00
David Tellenbach
6e95c0cd9a Add missing internal namespace
The vectorization logic tests miss some namespace internal qualifiers.
2021-12-27 23:50:32 +00:00
David Tellenbach
d3675b2e73 Add vectorization_logic_1 test to list of CI smoketests 2021-12-28 00:32:14 +01:00
David Tellenbach
c06c3e52a0 Include immintrin.h if F16C is available and vectorization is disabled
If EIGEN_DONT_VECTORIZE is defined, immintrin.h is not included even if F16C is available. Trying to use F16C intrinsics thus fails.

This fixes issue #2395.
2021-12-25 19:51:42 +00:00
Erik Schultheis
f7a056bf04 Small fixes
This MR fixes a bunch of smaller issues, making the following changes:

* Template parameters in the documentation are documented with `\tparam` instead
  of `\param`
* Superfluous semicolon warnings fixed
* Fixed the type of literals used to initialize float variables
2021-12-21 16:46:09 +00:00
Kolja Brix
2a6594de29 Small cleanup of GDB pretty printer code 2021-12-18 17:34:38 +00:00
Erik Schultheis
dee6428a71 fixed clang warnings about alignment change and floating point precision 2021-12-18 17:18:16 +00:00
Kolja Brix
d0b4b75fbb Simplify logical_xor() 2021-12-16 20:20:47 +00:00
Rasmus Munk Larsen
e93a071774 Fix a bug introduced in !751. 2021-12-15 22:00:40 +00:00
Erik Schultheis
e939c06b0e Small speed-up in row-major sparse dense product 2021-12-15 18:46:25 +00:00
Erik Schultheis
2d39da8af5 space separated EIGEN_TEST_CUSTOM_CXX_FLAGS 2021-12-13 15:27:33 +00:00
Rohit Santhanam
6b2df80317 Fixes for enabling HIP unit tests. Includes a fix to make this work with the latest cmake. 2021-12-12 21:03:30 +00:00
Erik Schultheis
c20e908ebc turn some macros intro constexpr functions 2021-12-10 19:27:01 +00:00
Erik Schultheis
0f36e42169 Fix 2021-12-10 16:59:48 +00:00
Erik Schultheis
c35679af27 fixed customIndices2Array forgetting first index 2021-12-10 16:41:59 +00:00
Erik Schultheis
0b81185fe3 removed Find*.cmake scripts for which these are available in cmake itself 2021-12-10 02:02:34 +00:00
Erik Schultheis
495ffff945 removed helper cmake macro and don't use deprecated COMPILE_FLAGS anymore. 2021-12-09 23:09:56 +00:00
Rohit Santhanam
8a8122874b Build unit tests for HIP using C++14. 2021-12-09 08:04:19 +00:00
Rasmus Munk Larsen
f04fd8b168 Make sure exp(-Inf) is zero for vectorized expressions. This fixes #2385. 2021-12-08 17:57:23 +00:00
Erik Schultheis
39a6aff16c get rid of using namespace Eigen in sample code 2021-12-07 19:57:38 +00:00
Erik Schultheis
e4c40b092a disambiguate overloads for empty index list 2021-12-07 19:40:09 +00:00
Jens Wehner
c6fa0ca162 Idrsstabl 2021-12-06 20:00:00 +00:00
Erik Schultheis
cc11e240ac Some further cleanup 2021-12-06 18:01:15 +00:00
Erik Schultheis
14c32c60f3 fixed snippets 2021-12-05 17:31:12 +00:00
Erik Schultheis
cd83f34d3a fix typo StableNorm -> stableNorm 2021-12-04 14:52:09 +00:00
Rasmus Munk Larsen
3ffefcb95c Only include <atomic> if needed. 2021-12-02 23:55:25 +00:00
Jens Wehner
4ee2e9b340 Idrs refactoring 2021-12-02 23:32:07 +00:00
Jens Wehner
f63c6dd1f9 Bicgstabl 2021-12-02 22:48:22 +00:00
Erik Schultheis
2f65ec5302 fixed leftover else branch 2021-12-02 18:13:19 +00:00
Erik Schultheis
d60f7fa518 Improved lapacke binding code for HouseholderQR and PartialPivLU 2021-12-02 00:10:58 +00:00
Xinle Liu
7ef5f0641f Remove macro EIGEN_GPU_TEST_C99_MATH
Remove macro EIGEN_GPU_TEST_C99_MATH which is used in a single test file only and always defaults to true.
2021-12-01 14:48:56 +00:00
Antonio Sánchez
f56a5f15c6 Disable GCC-4.8 tests. 2021-12-01 02:12:52 +00:00
Erik Schultheis
ec2fd0f7ed Require recent GCC and MSCV and removed EIGEN_HAS_CXX14 and some other feature test macros 2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen
085c2fc5d5 Revert "Update SVD Module to allow specifying computation options with a... 2021-11-30 18:45:54 +00:00
Erik Schultheis
4dd126c630 fixed cholesky with 0 sized matrix (cf. #785) 2021-11-30 17:17:41 +00:00
Rohit Santhanam
4d3e50036f Fix for HIP compilation breakage in selfAdjoint and triangular view classes. 2021-11-30 14:00:59 +00:00
Erik Schultheis
63abb35dfd SFINAE'ing away non-const overloads if selfAdjoint/triangular view is not referring to an lvalue 2021-11-29 22:51:26 +00:00
Jakub Gałecki
1b8dce564a bugfix: issue #2375 2021-11-29 22:26:15 +00:00
Francesco Mazzoli
eb85b97339 Select AVX2 even if the data size is not a multiple of 8 2021-11-29 21:13:24 +00:00
Arthur
eef33946b7 Update SVD Module to allow specifying computation options with a template parameter. Resolves #2051 2021-11-29 20:50:46 +00:00
Erik Schultheis
4a76880351 Updated CMake
This patch updates the minimum required CMake version to 3.10 and removes the EIGEN_TEST_CXX11 CMake option, including corresponding logic.
2021-11-29 20:24:20 +00:00
Erik Schultheis
f33a31b823 removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks 2021-11-29 19:18:57 +00:00
Rohit Santhanam
9d3ffb3fbf Fix for HIP compilation failure in DenseBase. 2021-11-28 15:59:30 +00:00
David Tellenbach
08da52eb85 Remove DenseBase::nonZeros() which just calls DenseBase::size()
Fixes #2382.
2021-11-27 14:31:00 +00:00
Ali Can Demiralp
96e537d6fd Add EIGEN_DEVICE_FUNC to DenseBase::hasNaN() and DenseBase::allFinite(). 2021-11-27 11:27:52 +00:00
Erik Schultheis
b8b6566f0f Currently, the binding of LLT to Lapacke is done using a large macro. This factors out a large part of the functionality of the macro and implement them explicitly. 2021-11-25 16:11:25 +00:00
Erik Schultheis
ec4efbd696 remove EIGEN_HAS_CXX11 2021-11-24 20:08:49 +00:00
Rasmus Munk Larsen
cfdb3ce3f0 Fix warnings about shadowing definitions. 2021-11-23 14:34:47 -08:00
Rasmus Munk Larsen
5e89573e2a Implement Eigen::array<...>::reverse_iterator if std::reverse_iterator exists. 2021-11-20 00:22:46 +00:00
Rasmus Munk Larsen
5137a5157a Make numeric_limits members constexpr as per the newer C++ standards.
Author: majnemer@google.com
2021-11-19 15:58:36 +00:00
Erik Schultheis
7e586635ba don't use deprecated MappedSparseMatrix 2021-11-19 15:58:04 +00:00
Rasmus Munk Larsen
11cb7b8372 Add basic iterator support for Eigen::array to ease transition to std::array in third-party libraries. 2021-11-19 05:14:30 +00:00
Antonio Sanchez
c107bd6102 Fix errors for windows build. 2021-11-19 04:23:25 +00:00
Erik Schultheis
b0fb5417d3 Fixed Sparse-Sparse Product in case of mixed StorageIndex types 2021-11-18 18:33:31 +00:00
Rasmus Munk Larsen
96aeffb013 Make the new TensorIO implementation work with TensorMap with const elements. 2021-11-17 18:16:04 -08:00
Rasmus Munk Larsen
824d06eb36 Include <numeric> to get std::iota. 2021-11-18 00:47:18 +00:00
Pablo Speciale
d04edff570 Update Umeyama.h: src_var is only used when with_scaling == true. Therefore, the actual computation can be avoided when with_scaling == false. 2021-11-16 17:58:22 +00:00
Antonio Sanchez
ffb78e23a1 Fix tensor broadcast off-by-one error.
Caught by JAX unit tests.  Triggered if broadcast is smaller than packet
size.
2021-11-16 17:37:38 +00:00
cpp977
f73c95c032 Reimplemented the Tensor stream output. 2021-11-16 17:36:58 +00:00
Rasmus Munk Larsen
2b9297196c Update Transform.h to make transform_construct_from_matrix and transform_take_affine_part callable from device code. Fixes #2377. 2021-11-16 00:58:30 +00:00
Erik Schultheis
ca9c848679 use consistent StorageIndex types in SparseMatrix::Map
and `SparseMatrix::TransposedSparseMatrix`
2021-11-15 22:18:26 +00:00
Erik Schultheis
13954c4440 moved pruning code to SparseVector.h 2021-11-15 22:16:01 +00:00
Nathan Luehr
da79095923 Convert diag pragmas to nv_diag. 2021-11-15 03:42:42 +00:00
Erik Schultheis
532cc73f39 fix a typo 2021-11-13 13:11:06 +02:00
jenswehner
675b72e44b added clang format 2021-11-09 23:49:01 +01:00
Ben Barsdell
50df8d3d6d Avoid integer overflow in EigenMetaKernel indexing
- The current implementation computes `size + total_threads`, which can
  overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to
  the maximum representable value.
- The num_blocks calculation can also overflow due to the implementation
  of divup().
- This patch prevents these overflows and allows the kernel to work
  correctly for the full representable range of tensor sizes.
- Also adds relevant tests.
2021-11-05 16:39:37 +11:00
Rasmus Munk Larsen
55e3ae02ac Compare summation results against forward error bound. 2021-11-04 18:04:04 -07:00
Gengxin Xie
5c642950a5 Bug Fix: correct the bug that won't define EIGEN_HAS_FP16_C
if the compiler isn't clang
2021-11-04 22:13:01 +00:00
Gilad
0d73440fb2 Documentation of Quaternion constructor from MatrixBase 2021-11-04 16:21:26 +00:00
Minh Quan HO
4284c68fbb nestbyvalue test: fix uninitialized matrix
- Doing computation with uninitialized (zero-ed ? but thanks Linux) matrix, or
worse NaN on other non-linux systems.
- This commit fixes it by initializing to Random().
2021-11-04 14:32:12 +01:00
Xinle Liu
478a1bdda6 Fix total deflation issue in BDCSVD, when & only when M is already diagonal. 2021-11-02 16:53:55 +00:00
Antonio Sanchez
8f8c2ba2fe Remove bad "take" impl that causes g++-11 crash.
For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes
g++-11 to ICE with
```
sorry, unimplemented: unexpected AST of kind nontype_argument_pack
```
It does work with other versions of gcc, and with clang.
I filed a GCC bug
[here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999).

Technically we should never actually run into this case, since you
can't take n > 0 elements from an empty list.  Commenting it out
allows our Eigen tests to pass.
2021-11-01 17:04:41 +00:00
Antonio Sanchez
f6c8cc0e99 Fix TensorReduction warnings and error bound for sum accuracy test.
The sum accuracy test currently uses the default test precision for
the given scalar type.  However, scalars are generated via a normal
distribution, and given a large enough count and strong enough random
generator, the expected sum is zero.  This causes the test to
periodically fail.

Here we estimate an upper-bound for the error as `sqrt(N) * prec` for
summing N values, with each having an approximate epsilon of `prec`.

Also fixed a few warnings generated by MSVC when compiling the
reduction test.
2021-10-30 14:59:00 -07:00
Rasmus Munk Larsen
b3bea43a2d Don't use unrolled loops for stateful reducers. The problem is the combination step, e.g.
reducer0.reducePacket(accum1, accum0);
reducer0.reducePacket(accum2, accum0);
reducer0.reducePacket(accum3, accum0);

For the mean reducer this will increment the count as well as adding together the accumulators and result in the wrong count being divided into the sum at the end.
2021-10-28 23:52:54 +00:00
Chip Kerchner
9cf34ee0ae Invert rows and depth in non-vectorized portion of packing (PowerPC). 2021-10-28 21:59:41 +00:00
Ilya Tokar
e1cb6369b0 Add AVX vector path to float2half/half2float
Makes e. g. matrix multiplication 2x faster:
name         old cpu/op  new cpu/op  delta
BM_convers   181ms ± 1%    62ms ± 9%  -65.82%  (p=0.016 n=4+5)

Tested on all possible input values (not adding tests, since they
take a long time).
2021-10-28 13:59:01 -04:00
Antonio Sanchez
03d4cbb307 Fix min/max nan-propagation for scalar "other".
Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`.

Fixes #2362.
2021-10-28 09:28:29 -07:00
Antonio Sanchez
e559701981 Fix compile issue for gcc 4.8 2021-10-28 08:23:19 -07:00
Fabian Keßler
19cacd3ecb optimize cmake scripts for subproject use 2021-10-28 16:08:02 +02:00
Rohit Santhanam
48e40b22bf Preliminary HIP bfloat16 GPU support. 2021-10-27 18:36:45 +00:00
Antonio Sanchez
40bbe8a4d0 Fix ZVector build.
Cross-compiled via `s390x-linux-gnu-g++`, run via qemu.  This allows the
packetmath tests to pass.
2021-10-27 16:30:15 +00:00
Alex Druinsky
6bb6a6bf53 Vectorize fp16 tanh and logistic functions on Neon
Activates vectorization of the Eigen::half versions of the tanh and
logistic functions when they run on Neon. Both functions convert their
inputs to float before computing the output, and as a result of this
commit, the conversions and the computation in float are vectorized.
2021-10-27 16:09:16 +00:00
Antonio Sánchez
185ad0e610 Revert "Avoid integer overflow in EigenMetaKernel indexing"
This reverts commit 100d7caf92
2021-10-27 14:55:25 +00:00
Rasmus Munk Larsen
68e0d023c0 Remove license column in tables for builtin sparse solvers since all are MPL2 now. 2021-10-26 18:09:22 +00:00
Andreas Krebbel
8faafc3aaa ZVector: Move alignas qualifier to come first
We currently have plenty of type definitions with the alignment
qualifier coming after the type.  The compiler warns about ignoring
them:
int EIGEN_ALIGN16 ai[4];

Turn this into:
EIGEN_ALIGN16 int ai[4];
2021-10-26 15:33:47 +02:00
Ben Barsdell
100d7caf92 Avoid integer overflow in EigenMetaKernel indexing
- The current implementation computes `size + total_threads`, which can
  overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to
  the maximum representable value.
- The num_blocks calculation can also overflow due to the implementation
  of divup().
- This patch prevents these overflows and allows the kernel to work
  correctly for the full representable range of tensor sizes.
- Also adds relevant tests.
2021-10-26 00:04:28 +00:00
Alex Druinsky
d0e3791b1a Fix vectorized reductions for Eigen::half
Fixes compiler errors in expressions that look like

  Eigen::Matrix<Eigen::half, 3, 1>::Random().maxCoeff()

The error comes from the code that creates the initial value for
vectorized reductions. The fix is to specify the scalar type of the
reduction's initial value.

The cahnge is necessary for Eigen::half because unlike other types,
Eigen::half scalars cannot be implicitly created from integers.
2021-10-25 14:44:33 -07:00
Maxiwell S. Garcia
99600bd1a6 test: fix boostmutiprec test to compile with older Boost versions
Eigen boostmultiprec test redefines a symbol that is already defined
inside Boot Math [1]. Boost has fixed it recently [2], but this
patch avoids errors if Boost version was less than 1.77.

https://github.com/boostorg/math/blob/boost-1.76.0/include/boost/math/policies/policy.hpp#L18
6830712302 (diff-c7a8e5911c2e6be4138e1a966d762200f147792ac16ad96fdcc724313d11f839)
2021-10-25 20:32:33 +00:00
Yann Billeter
6c3206152a fix(CommaInitializer): pass dims at compile-time 2021-10-25 19:53:38 +00:00
Antonio Sanchez
a500da1dc0 Fix broadcasting oob error.
For vectorized 1-dimensional inputs that do not take the special
blocking path (e.g. `std::complex<...>`), there was an
index-out-of-bounds error causing the broadcast size to be
computed incorrectly.  Here we fix this, and make other minor
cleanup changes.

Fixes #2351.
2021-10-25 19:31:12 +00:00
Antonio Sanchez
0578feaabc Remove const from visitor return type.
This seems to interfere with `pload`/`ploadu`, since `pload<const
Packet**>` are not defined.

This should unbreak the arm/ppc builds.
2021-10-25 19:09:50 +00:00
benardp
b63c096fbb Extend EIGEN_QT_SUPPORT to Qt6 2021-10-23 23:43:06 +00:00
Lennart Steffen
163f11e24a Included note on inner stride for compile-time vectors. See https://gitlab.com/libeigen/eigen/-/issues/2355#note_711078126 2021-10-22 09:46:43 +00:00
Nico
b17bcddbca Fix -Wbitwise-instead-of-logical clang warning
& and | short-circuit, && and || don't. When both arguments to those
are boolean, the short-circuiting version is usually the desired one, so
clang warns on this.

Here, it is inconsequential, so switch to && and || to suppress the warning.
2021-10-21 23:32:45 -04:00
Rasmus Munk Larsen
2d3fec8ff6 Add nan-propagation options to matrix and array plugins. 2021-10-21 19:40:11 +00:00
Antonio Sanchez
b86e013321 Revert bit_cast to use memcpy for CUDA.
To elide the memcpy, we need to first load the `src` value into
registers by making a local copy. This avoids the need to resort
to potential UB by using `reinterpret_cast`.

This change doesn't seem to affect CPU (at least not with gcc/clang).
With optimizations on, the copy is also elided.
2021-10-21 08:14:11 -07:00
Antonio Sanchez
45e67a6fda Use reinterpret_cast on GPU for bit_cast.
This seems to be the recommended approach for doing type punning in
CUDA. See for example
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/
(the latter puns a double to an `int2`).
The issue is that for CUDA, the `memcpy` is not elided, and ends up
being an expensive operation.  We already have similar `reintepret_cast`s across
the Eigen codebase for GPU (as does TensorFlow).
2021-10-20 21:34:40 +00:00
Antonio Sanchez
24ebb37f38 Disable Tree reduction for GPU.
For moderately sized inputs, running the Tree reduction quickly
fills/overflows the GPU thread stack space, leading to memory errors.
This was happening in the `cxx11_tensor_complex_gpu` test, for example.
Disabling tree reduction on GPU fixes this.
2021-10-20 20:42:37 +00:00
Rasmus Munk Larsen
360290fc42 Improve accuracy of full tensor reduction for half and bfloat16 by reducing leaf size in tree reduction.
Add more unit tests for summation accuracy.
2021-10-20 19:54:06 +00:00
Antonio Sanchez
95bb645e92 Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation.
Looks like we need to update the
`EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` for newer versions of MSVC as
well when compiling with NVCC.  Fixes build issues for VS 2017.
2021-10-20 19:38:14 +00:00
Antonio Sanchez
fd5f48e465 Fix tuple compilation for VS2017.
VS2017 doesn't like deducing alias types, leading to a bunch of compile
errors for functions involving the `tuple` alias.  Replacing with
`TupleImpl` seems to solve this, allowing the test to compile/pass.
2021-10-20 19:18:34 +00:00
Antonio Sanchez
d0d34524a1 Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h
The `Complex.h` file applies equally to HIP/CUDA, so placing under the
generic `GPU` folder.

The `TensorReductionCuda.h` has already been deprecated, now removing
for the next Eigen version.
2021-10-20 12:00:19 -07:00
Rasmus Munk Larsen
f2c9c2d2f7 Vectorize Visitor.h. 2021-10-20 16:58:01 +00:00
Antonio Sanchez
2bf07fa5b5 Fix Windows CMake compiler/OS detection.
Replaced deprecated `DetermineVSServicePack`macro with recommended
`CMAKE_CXX_COMPILER_VERSION`.

Deleted custom `OSVersion` detection.  The windows-specific code is
highly outdated, and on other systems simply returns `CMAKE_SYSTEM`.
We will get values like `windows-10.0.17763`, but this is preferable
to `unknownwin`, and saves us needing to maintain a separate cmake file.
2021-10-02 16:30:38 +00:00
Rasmus Munk Larsen
1d75fab368 Speed up tensor reduction 2021-10-02 14:58:23 +00:00
Antonio Sanchez
be9e7d205f Reduce tensor_contract_gpu test.
The original test times out after 60 minutes on Windows, even when
setting flags to optimize for speed.  Reducing the number of
contractions performed from 3600->27 for subtests 8,9 allow the
two to run in just over a minute each.
2021-10-02 04:36:15 +00:00
Antonio Sanchez
701f5d1c91 Fix gpu special function tests.
Some checks used incorrect values, partly from copy-paste errors,
partly from the change in behaviour introduced in !398.

Modified results to match scipy, simplified tests by updating
`VERIFY_IS_CWISE_APPROX` to work for scalars.
2021-10-01 10:20:50 -07:00
Antonio Sanchez
f0f1d7938b Disable testing of complex compound assignment operators for MSVC.
MSVC does not support specializing compound assignments for
`std::complex`, since it already specializes them (contrary to the
standard).

Trying to use one of these on device will currently lead to a
duplicate definition error.  This is still probably preferable
to no error though.  If we remove the definitions for MSVC, then
it will compile, but the kernel will fail silently.

The only proper solution would be to define our own custom `Complex`
type.
2021-09-27 15:15:11 -07:00
Kolja Brix
51a0b4e2d2 Reorganize test main file 2021-09-27 18:30:47 +00:00
Antonio Sanchez
21640612be Disable more CUDA warnings.
For cuda 9.2 and 11.4, they changed the numbers again.

Fixes #2331.
2021-09-24 21:31:14 -07:00
Antonio Sanchez
de218b471d Add -arch=<arch> argument for nvcc.
Without this flag, when compiling with nvcc, if the compute architecture of a card does
not exactly match any of those listed for `-gencode arch=compute_<arch>,code=sm_<arch>`,
then the kernel will fail to run with:
```
cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device.
```
This can happen, for example, when compiling with an older cuda version
that does not support a newer architecture (e.g. T4 is `sm_75`, but cuda
9.2 only supports up to `sm_70`).

With the `-arch=<arch>` flag, the code will compile and run at the
supplied architecture.
2021-09-24 20:48:01 -07:00
Antonio Sanchez
846d34384a Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS
Also add a missing space for clang.
2021-09-24 20:15:55 -07:00
Antonio Sanchez
7b00e8b186 Clean up CUDA CMake files.
- Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt
- Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed
to the cuda compiler (nvcc or clang).

The latter is to support passing custom flags (e.g. `-arch=` to nvcc,
or to disable cuda-specific warnings).
2021-09-24 14:43:59 -07:00
Antonio Sanchez
e9e90892fe Disable another device warning 2021-09-23 13:43:18 -07:00
Antonio Sanchez
86c0decc48 Disable more NVCC warnings.
The 2979 warning is yet another "calling a __host__ function from a
__host__ device__ function.  Although we probably should eventually
address these, they are flooding the logs.  Most of these are
harmless since we only call the original from the host.
In cases where these are actually called from device, an error is generated
instead anyways.

The 2977 warning is a bit strange - although the warning suggests the
`__device__` annotation is ignored, this doesn't actually seem to be
the case.  Without the `__device__` declarations, the kernel actually
fails to run when attempting to construct such objects.  Again,
these warnings are flooding the logs, so disabling for now.
2021-09-23 10:52:39 -07:00
Kolja Brix
afa616bc9e Fix some typos found 2021-09-23 15:22:00 +00:00
Antonio Sanchez
76bb29c0c2 Add -mfma for AVX512DQ tests. 2021-09-22 14:06:29 -07:00
sciencewhiz
4b6036e276 fix various typos 2021-09-22 16:15:06 +00:00
Antonio Sanchez
3753e6a2b3 Add AVX512 test job to CI. 2021-09-21 15:11:31 -07:00
Antonio Sanchez
343847273d Enable AVX512 testing. 2021-09-21 15:00:36 -07:00
Alexander Grund
b5eaa42695 Fix alias violation in BFloat16
reinterpret_cast between unrelated types is undefined behavior and leads
to misoptimizations on some platforms.
Use the safer (and faster) version via bit_cast
2021-09-20 10:37:50 +02:00
Alexander Karatarakis
4d622be118 [AutodiffScalar] Remove const when returning by value
clang-tidy: Return type 'const T' is 'const'-qualified at the top level,
which may reduce code readability without improving const correctness

The types are somewhat long, but the affected return types are of the form:
```
const T my_func() { /**/ }
```

Change to:
```
T my_func() { /**/ }
```
2021-09-18 21:23:32 +00:00
Antonio Sanchez
f49217e52b Fix implicit conversion warnings in tuple_test.
Fixes #2329.
2021-09-17 19:40:22 -07:00
Rasmus Munk Larsen
5595cfd194 Run CI tests in parallel no available cores. 2021-09-17 22:35:22 +00:00
Antonio Sanchez
3c724c44cf Fix strict aliasing bug causing product_small failure.
Packet loading is skipped due to aliasing violation, leading to nullopt matrix
multiplication.

Fixes #2327.
2021-09-17 21:09:34 +00:00
Antonio Sanchez
9882aec279 Silence string overflow warning for GCC in initializer_list_construction test.
This looks to be a GCC bug.  It doesn't seem to reproduce is a smaller example,
making it hard to isolate.
2021-09-17 18:33:50 +00:00
Rasmus Munk Larsen
5dac69ff0b Added a macro to pass arguments to ctest, e.g. to run tests in parallel. 2021-09-17 18:33:12 +00:00
Antonio Sanchez
5dac0b53c9 Move Eigen::all,last,lastp1,lastN to Eigen::placeholders::.
These names are so common, IMO they should not exist directly in the
`Eigen::` namespace.  This prevents us from using the `last` or `all`
names for any parameters or local variables, otherwise spitting out
warnings about shadowing or hiding the global values.  Many external
projects (and our own examples) also heavily use
```
using namespace Eigen;
```
which means these conflict with external libraries as well, e.g.
`std::fill(first,last,value)`.

It seems originally these were placed in a separate namespace
`Eigen::placeholders`, which has since been deprecated.  I propose
to un-deprecate this, and restore the original locations.

These symbols are also imported into `Eigen::indexing`, which
additionally imports `fix` and `seq`. An alternative is to remove the
`placeholders` namespace and stick with `indexing`.

NOTE: this is an API-breaking change.

Fixes #2321.
2021-09-17 10:21:42 -07:00
Rohit Santhanam
44da7a3b9d Disable specific subtests that fail on HIP due to non-functional device side malloc/free (on HIP). 2021-09-17 16:19:03 +00:00
Antonio Sanchez
16f9a20a6f Add buildtests_gpu and check_gpu to simplify GPU testing.
This is in preparation of adding GPU tests to the CI, allowing
us to limit building/testing of GPU-specific tests for a given
GPU-capable runner.

GPU tests are tagged with the label "gpu".  The new targets
```
make buildtests_gpu
make check_gpu
```
allow building and running only the gpu tests.
2021-09-17 00:48:57 +00:00
Rasmus Munk Larsen
1239adfcab Remove -fabi-version=6 flag from AVX512 builds. It was added to fix builds with gcc 4.9, but these don't even work today, and the flag breaks compilation with newer versions of gcc. 2021-09-16 16:16:47 -07:00
Rasmus Munk Larsen
6cadab6896 Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert. 2021-09-16 20:43:54 +00:00
Rasmus Munk Larsen
7b975acb1f Remove unused variable. 2021-09-16 20:27:13 +00:00
Rasmus Munk Larsen
92849d814b Remove unused variable. 2021-09-16 20:21:31 +00:00
Rasmus Munk Larsen
da027fa20a Remove unused variable. 2021-09-16 20:02:42 +00:00
Ryan Pavlik
3c87d6b662 Fix typos in copyright dates
(cherry picked from commit 3335e0767c)
2021-09-15 20:49:43 +00:00
Antonio Sanchez
cb50730993 Default eigen_packet_wrapper constructor.
This makes it trivial, allowing use of `memcpy`.

Fixes #2326
2021-09-14 10:57:22 -07:00
Rohit Santhanam
a751225845 Minor fix for compilation error on HIP. 2021-09-12 14:06:58 +00:00
Antonio Sanchez
2e31570c16 Fix tuple_test after gpu_test_helper update.
Duplicating the namespace `tuple_impl` caused a conflict with the
`arch/GPU/Tuple.h` definitions for the `tuple_test`.  We can't
just use `Eigen::internal` either, since there exists a different
`Eigen::internal::get`.  Renaming the namespace to `test_detail`
fixes the issue.
2021-09-11 20:24:42 -07:00
Antonio Sanchez
d06c639667 Fix unused variable warning and unnecessessary gpuFree. 2021-09-11 20:02:22 -07:00
Antonio Sanchez
bf66137efc New GPU test utilities.
This introduces new functions:
```
// returns kernel(args...) running on the CPU.
Eigen::run_on_cpu(Kernel kernel, Args&&... args);

// returns kernel(args...) running on the GPU.
Eigen::run_on_gpu(Kernel kernel, Args&&... args);
Eigen::run_on_gpu_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args);

// returns kernel(args...) running on the GPU if using
//   a GPU compiler, or CPU otherwise.
Eigen::run(Kernel kernel, Args&&... args);
Eigen::run_with_hint(size_t buffer_capacity_hint, Kernel kernel, Args&&... args);
```

Running on the GPU is accomplished by:
- Serializing the kernel inputs on the CPU
- Transferring the inputs to the GPU
- Passing the kernel and serialized inputs to a GPU kernel
- Deserializing the inputs on the GPU
- Running `kernel(inputs...)` on the GPU
- Serializing all output parameters and the return value
- Transferring the serialized outputs back to the CPU
- Deserializing the outputs and return value on the CPU
- Returning the deserialized return value

All inputs must be serializable (currently POD types, `Eigen::Matrix`
and `Eigen::Array`).  The kernel must also  be POD (though usually
contains no actual data).

Tested on CUDA 9.1, 10.2, 11.3, with g++-6, g++-8, g++-10 respectively.

This MR depends on !622, !623, !624.
2021-09-10 14:22:50 -07:00
Rasmus Munk Larsen
d7d0bf832d Issue an error in case of direct inclusion of internal headers. 2021-09-10 19:12:26 +00:00
Maxiwell S. Garcia
36402e281d ci: ppc64le: disable MMA for gcc-10
This patch disables MMA for CI because the building environment is
using Ubuntu 18.04 image with LD 2.30. This linker version together
with gcc-10 causes some 'unrecognized opcode' errors.
2021-09-09 12:18:07 -05:00
Antonio Sanchez
6c10495a78 Remove unnecessary std::tuple reference. 2021-09-09 15:49:44 +00:00
Antonio Sanchez
26e5beb8cb Device-compatible Tuple implementation.
An analogue of `std::tuple` that works on device.

Context: I've tried `std::tuple` in various versions of NVCC and clang,
and although code seems to compile, it often fails to run - generating
"illegal memory access" errors, or "illegal instruction" errors.
This replacement does work on device.
2021-09-08 13:34:19 -07:00
Antonio Sanchez
fcd73b4884 Add a simple serialization mechanism.
The `Serializer<T>` class implements a binary serialization that
can write to (`serialize`) and read from (`deserialize`) a byte
buffer.  Also added convenience routines for serializing
a list of arguments.

This will mainly be for testing, specifically to transfer data to
and from the GPU.
2021-09-08 09:38:59 -07:00
Huang, Zhaoquan
558b3d4fb8 Add LLDB Pretty Printer 2021-09-07 17:28:24 +00:00
Antonio Sanchez
7792b1e909 Fix AVX2 PacketMath.h.
There were a couple typos ps -> epi32, and an unaligned load issue.
2021-09-03 19:47:57 +00:00
Antonio Sanchez
5bf35383e0 Disable MSVC constant condition warning.
We use extensive use of `if (CONSTANT)`, and cannot use c++17's `if
constexpr`.
2021-09-03 11:07:18 -07:00
Antonio Sanchez
def145547f Add missing packet types in pset1 call.
Oops, introduced this when "fixing" integer packets.
2021-09-02 16:21:07 -07:00
Antonio Sanchez
eea2a3385c Remove more DynamicSparseMatrix references.
Also fixed some typos in SparseExtra/MarketIO.h.
2021-09-02 15:36:47 -07:00
Antonio Sanchez
3b48a3b964 Remove stray DynamicSparseMatrix references.
DynamicSparseMatrix has been removed.  These shouldn't be here anymore.
2021-09-02 19:47:26 +00:00
Antonio Sanchez
ebd4b17d2f Fix tridiagonalization_inplace_selector.
The `Options` of the new `hCoeffs` vector do not necessarily match
those of the `MatrixType`, leading to build errors. Having the
`CoeffVectorType` be a template parameter relieves this restriction.
2021-09-02 12:23:27 -07:00
Sergiu Deitsch
bf426faf93 cmake: populate package registry by default 2021-09-02 17:36:01 +00:00
Jens Wehner
8286073c73 Matrixmarket extension 2021-09-02 17:23:33 +00:00
Sergiu Deitsch
e8beb4b990 cmake: use ARCH_INDEPENDENT versioning if available 2021-09-02 16:08:58 +00:00
Sergiu Deitsch
7bc90cee7d cmake: remove unused interface definitions 2021-09-02 15:41:56 +02:00
Antonio Sanchez
998bab4b04 Missing EIGEN_DEVICE_FUNCs to get gpu_basic passing with CUDA 9.
CUDA 9 seems to require labelling defaulted constructors as
`EIGEN_DEVICE_FUNC`, despite giving warnings that such labels are
ignored.  Without these labels, the `gpu_basic` test fails to
compile, with errors about calling `__host__` functions from
`__host__ __device__` functions.
2021-09-01 19:49:53 -07:00
Antonio Sanchez
74da2e6821 Rename Tuple -> Pair.
This is to make way for a new `Tuple` class that mimics `std::tuple`,
but can be reliably used on device and with aligned Eigen types.

The existing Tuple has very few references, and is actually an
analogue of `std::pair`.
2021-09-02 02:20:54 +00:00
Antonio Sanchez
3d4ba855e0 Fix AVX integer packet issues.
Most are instances of AVX2 functions not protected by
`EIGEN_VECTORIZE_AVX2`.  There was also a missing semi-colon
for AVX512.
2021-09-01 14:14:43 -07:00
Sergiu Deitsch
f2984cd077 cmake: remove deprecated package config variables 2021-09-01 19:05:51 +02:00
Maxiwell S. Garcia
09fc0f97b5 Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h 2021-09-01 16:42:51 +00:00
Antonio Sanchez
3a6296d4f1 Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang.
Clang doesn't like !621, needs the "g" constraint back.
The "g" constraint also works for GCC >= 5.

This fixes our gitlab CI.
2021-09-01 09:19:55 -07:00
jenswehner
a443a2373f updated documentation 2021-08-31 22:58:28 +00:00
Antonio Sanchez
ff07a8a639 GCC 4.8 arm EIGEN_OPTIMIZATION_BARRIER fix (#2315).
GCC 4.8 doesn't seem to like the `g` register constraint, failing to
compile with "error: 'asm' operand requires impossible reload".

Tested `r` instead, and that seems to work, even with latest compilers.

Also fixed some minor macro issues to eliminate warnings on armv7.

Fixes #2315.
2021-08-31 20:20:47 +00:00
Antonio Sanchez
cc3573ab44 Disable cuda Eigen::half vectorization on host.
All cuda `__half` functions are device-only in CUDA 9, including
conversions. Host-side conversions were added in CUDA 10.
The existing code doesn't build prior to 10.0.

All arithmetic functions are always device-only, so there's
therefore no reason to use vectorization on the host at all.

Modified the code to disable vectorization for `__half` on host,
which required also updating the `TensorReductionGpu` implementation
which previously made assumptions about available packets.
2021-08-31 19:13:12 +00:00
Adam Kallai
1415817d8d win: include intrin header in Windows on ARM
intrin header is needed for _BitScanReverse and
_BitScanReverse64
2021-08-31 10:57:34 +02:00
Rasmus Munk Larsen
6f429a202d Add missing dependency on LAPACK test suite binaries to target buildtests, so make check will work correctly when EIGEN_ENABLE_LAPACK_TESTS is ON. 2021-08-30 21:49:08 +00:00
Rasmus Munk Larsen
7e096ddcb0 Allow old Fortran code for LAPACK tests to compile despite argument mismatch errors (REAL passed to COMPLEX workspace argument) with GNU Fortran 10. 2021-08-30 19:47:30 +00:00
Turing Eret
3324389f6d Add EIGEN_TENSOR_PLUGIN support per issue #2052. 2021-08-30 19:36:55 +00:00
Antonio Sanchez
5db9e5c779 Fix fix<N> when variable templates are not supported.
There were some typos that checked `EIGEN_HAS_CXX14` that should have
checked `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`, causing a mismatch
in some of the `Eigen::fix<N>` assumptions.

Also fixed the `symbolic_index` test when
`EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is 0.

Fixes #2308
2021-08-30 08:06:55 -07:00
Jens Wehner
53ad9c75b4 included unordered_map header 2021-08-27 16:53:28 +00:00
Kolja Brix
a032397ae4 Fix PEP8 and formatting issues in GDB pretty printer. 2021-08-26 15:22:28 +00:00
jenswehner
9abf4d0bec made RandomSetter C++11 compatible 2021-08-25 20:24:55 +00:00
Antonio Sanchez
eeacbd26c8 Bump CMake files to at least c++11.
Removed all configurations that explicitly test or set the c++ standard
flags. The only place the standard is now configured is at the top of
the main `CMakeLists.txt` file, which can easily be updated (e.g. if
we decide to move to c++14+). This can also be set via command-line using
```
> cmake -DCMAKE_CXX_STANDARD 14
```

Kept the `EIGEN_TEST_CXX11` flag for now - that still controls whether to
build/run the `cxx11_*` tests.  We will likely end up renaming these
tests and removing the `CXX11` subfolder.
2021-08-25 20:07:48 +00:00
Jakub Lichman
dc5b1f7d75 AVX512 and AVX2 support for Packet16i and Packet8i added 2021-08-25 19:38:23 +00:00
Han-Kuan Chen
ab28419298 optimize predux if architecture is aarch64 2021-08-25 19:18:54 +00:00
Antonio Sanchez
4011e4d258 Remove c++11-off CI jobs.
This is step 1 in preparing to transition beyond c++03.
2021-08-24 17:42:39 +00:00
jenswehner
90b3b6b572 added doxygen flowchart 2021-08-24 17:11:51 +00:00
jenswehner
d85de1ef56 removed sparse dynamic matrix 2021-08-24 10:33:00 +02:00
Kolja Brix
e5c8f283ae Add support for Eigen::Block types to GDB pretty printer.
Submitted by Allan Leal, see #1539 (https://gitlab.com/libeigen/eigen/issues/1539).
2021-08-23 18:10:17 +02:00
Kolja Brix
58e086b8c8 Add random matrix generation via SVD 2021-08-23 16:00:05 +00:00
Rasmus Munk Larsen
82dd3710da Update version of master branch to 3.4.90. 2021-08-18 13:46:05 -07:00
Antonio Sanchez
2b410ecbef Workaround VS 2017 arg bug.
In VS 2017, `std::arg` for real inputs always returns 0, even for
negative inputs.  It should return `PI` for negative real values.
This seems to be fixed in VS 2019 (MSVC 1920).
2021-08-18 18:39:18 +00:00
Antonio Sanchez
0c4ae56e37 Remove unaligned assert tests.
Manually constructing an unaligned object declared as aligned
invokes UB, so we cannot technically check for alignment from
within the constructor.  Newer versions of clang optimize away
this check.

Removing the affected tests.
2021-08-18 18:05:24 +00:00
Jakob Struye
53a29c7e35 Clearer doc for squaredNorm 2021-08-18 15:11:15 +00:00
Antonio Sanchez
fc9d352432 Renamed shift_left/shift_right to shiftLeft/shiftRight.
For naming consistency.  Also moved to ArrayCwiseUnaryOps, and added
test.
2021-08-17 20:04:48 -07:00
Antonio Sanchez
2cc6ee0d2e Add missing PPC packet comparisons.
This is to fix the packetmath tests on the ppc pipeline.
2021-08-17 07:42:04 -07:00
Chip-Kerchner
8dcf3e38ba Fix unaligned loads in ploadLhs & ploadRhs for P8. 2021-08-16 20:28:22 -05:00
Rasmus Munk Larsen
7e6f94961c Update documentation for matrix decompositions and least squares solvers. 2021-08-16 21:56:18 +00:00
andiwand
5c6b3efead minor doc fix in Map.h 2021-08-16 12:02:33 +00:00
Chip-Kerchner
e07227c411 Reverse compare logic ƒin F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8). 2021-08-13 11:21:28 -05:00
Chip Kerchner
66499f0f17 Get rid of used uninitialized warnings for EIGEN_UNUSED_VARIABLE in gcc11+ 2021-08-12 21:38:54 +00:00
Rasmus Munk Larsen
96e3b4fc95 Add CompleteOrthogonalDecomposition to the table of linear algeba decompositions. 2021-08-12 16:11:15 +00:00
Antonio Sanchez
fb1718ad14 Update code snippet for tridiagonalize_inplace. 2021-08-12 08:30:12 -07:00
Rasmus Munk Larsen
8ce341caf2 * revise the meta_least_common_multiple function template, add a bool variable to check whether the A is larger than B.
* This can make less compile_time if A is smaller than B. and avoid failure in compile if we get a little A and a great B.

Authored by @awoniu.
2021-08-11 18:10:01 +00:00
Nikolay Tverdokhleb
f1b899eef7 Do not set AnnoyingScalar::dont_throw if not defined EIGEN_TEST_ANNOYING_SCALAR_DONT_THROW.
- Because that member is not declared if the macro is defined.
2021-08-11 10:01:21 +00:00
ChipKerchner
413bc491f1 Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl). 2021-08-10 15:03:18 -05:00
jenswehner
e3e74001f7 added includes for unordered_map 2021-08-10 13:34:57 +02:00
Gauri Deshpande
e6a5a594a7 remove denormal flushing in fp32tobf16 for avx & avx512 2021-08-09 22:15:21 +00:00
Daniel N. Miller (APD)
09d7122468 Do not build shared libs if not supported 2021-08-06 20:48:32 +00:00
Rasmus Munk Larsen
a5a7faeb45 Avoid memory allocation in tridiagonalization_inplace_selector::run. 2021-08-06 20:48:10 +00:00
Maxiwell S. Garcia
ae2abe1f58 ci: ppc64le: disable MMA and remove 'allow_failure' for gcc-10 CXX11=off
This patch disables MMA for CI because the building environment is using
Ubuntu 18.04 image with LD 2.30. This linker version together with gcc-10
causes some 'unrecognized opcode' errors.
2021-08-05 21:43:05 +00:00
Jens Wehner
4d870c49b7 updated documentation for middleCol and middleRow 2021-08-05 17:21:16 +00:00
Alexander Karatarakis
4ba872bd75 Avoid leading underscore followed by cap in template identifiers 2021-08-04 22:41:52 +00:00
Antonio Sanchez
5ad8b9bfe2 Make inverse 3x3 faster and avoid gcc bug.
There seems to be a gcc 4.7 bug that incorrectly flags the current
3x3 inverse as using uninitialized memory.  I'm *pretty* sure it's
a false positive, but it's hard to trigger.  The same warning
does not trigger with clang or later compiler versions.

In trying to find a work-around, this implementation turns out to be
faster anyways for static-sized matrices.

```
name                                            old cpu/op  new cpu/op  delta
BM_Inverse3x3<DynamicMatrix3T<float>>            423ns ± 2%   433ns ± 3%   +2.32%    (p=0.000 n=98+96)
BM_Inverse3x3<DynamicMatrix3T<double>>           425ns ± 2%   427ns ± 3%   +0.48%    (p=0.003 n=99+96)
BM_Inverse3x3<StaticMatrix3T<float>>            7.10ns ± 2%  0.80ns ± 1%  -88.67%  (p=0.000 n=114+112)
BM_Inverse3x3<StaticMatrix3T<double>>           7.45ns ± 2%  1.34ns ± 1%  -82.01%  (p=0.000 n=105+111)
BM_AliasedInverse3x3<DynamicMatrix3T<float>>     409ns ± 3%   419ns ± 3%   +2.40%   (p=0.000 n=100+98)
BM_AliasedInverse3x3<DynamicMatrix3T<double>>    414ns ± 3%   413ns ± 2%     ~       (p=0.322 n=98+98)
BM_AliasedInverse3x3<StaticMatrix3T<float>>     7.57ns ± 1%  0.80ns ± 1%  -89.37%  (p=0.000 n=111+114)
BM_AliasedInverse3x3<StaticMatrix3T<double>>    9.09ns ± 1%  2.58ns ±41%  -71.60%  (p=0.000 n=113+116)
```
2021-08-04 21:18:44 +00:00
Antonio Sanchez
31f796ebef Fix MPReal detection and support.
The latest version of `mpreal` has a bug that breaks `min`/`max`.
It also breaks with the latest dev version of `mpfr`. Here we
add `FindMPREAL.cmake` which searches for the library and tests if
compilation works.

Removed our internal copy of `mpreal.h` under `unsupported/test`, as
it is out-of-sync with the latest, and similarly breaks with
the latest `mpfr`.  It would be best to use the installed version
of `mpreal` anyways, since that's what we actually want to test.

Fixes #2282.
2021-08-03 17:55:03 +00:00
Antonio Sanchez
1cdec38653 Fix cmake warnings, FindPASTIX/FindPTSCOTCH.
We were getting a lot of warnings due to nested `find_package` calls
within `Find***.cmake` files.  The recommended approach is to use
[`find_dependency`](https://cmake.org/cmake/help/latest/module/CMakeFindDependencyMacro.html)
in package configuration files. I made this change for all instances.

Case mismatches between `Find<Package>.cmake` and calling
`find_package(<PACKAGE>`) also lead to warnings. Fixed for
`FindPASTIX.cmake` and `FindSCOTCH.cmake`.

`FindBLASEXT.cmake` was broken due to calling `find_package_handle_standard_args(BLAS ...)`.
The package name must match, otherwise the `find_package(BLASEXT)` falsely thinks
the package wasn't found.  I changed to `BLASEXT`, but then also copied that value
to `BLAS_FOUND` for compatibility.

`FindPastix.cmake` had a typo that incorrectly added `PTSCOTCH` when looking for
the `SCOTCH` component.

`FindPTSCOTCH` incorrectly added `***-NOTFOUND` to include/library lists,
corrupting them.  This led to cmake errors down-the-line.

Fixes #2288.
2021-08-03 17:26:28 +00:00
Antonio Sanchez
8cf6cb27ba Fix TriSycl CMake files.
This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was
broken by commit 00f32752, which modified `add_sycl_to_target` for ComputeCPP.
This makes the corresponding modifications for trisycl to make them consistent.

Also, trisycl now requires c++17.
2021-08-03 16:44:29 +00:00
Antonio Sanchez
3d98a6ef5c Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset.
The `memset` function and bitwise manipulation only apply to POD types
that do not require initialization, otherwise resulting in UB. We currently
violate this in `ptrue` and `pzero`, we assume bitmasks for `pselect`, and
bitwise operations are applied byte-by-byte in the generic implementations.

This is causing issues for scalar types that do require initialization
or that contain non-POD info such as pointers (#2201). We either break
them, or force specializations of these functions for custom scalars,
even if they are not vectorized.

Here we modify these functions for scalars only - instead using only
scalar operations:
- `pzero`: `Scalar(0)` for all scalars.
- `ptrue`: `Scalar(1)` for non-trivial scalars, bitset to one bits for trivial scalars.
- `pselect`: ternary select comparing mask to `Scalar(0)` for all scalars
- `pand`, `por`, `pxor`, `pnot`: use operators `&`, `|`, `^`, `~` for all integer or non-trivial scalars, otherwise apply bytewise.

For non-scalar types, the original implementations are used to maintain
compatibility and minimize the number of changes.

Fixes #2201.
2021-08-03 08:44:28 -07:00
Antonio Sanchez
7880f10526 Enable equality comparisons on GPU.
Since `std::equal_to::operator()` is not a device function, it
fails on GPU.  On my device, I seem to get a silent crash in the
kernel (no reported error, but the kernel does not complete).

Replacing this with a portable version enables comparisons on device.

Addresses #2292 - would need to be cherry-picked.  The 3.3 branch
also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get
fully working.
2021-08-03 01:53:31 +00:00
pvcStillinGradSchool
c86ac71b4f Put code in monospace (typewriter) style. 2021-08-03 01:48:32 +00:00
hyunggi-sv
02a0e79c70 fix:typo in dox (has->have) 2021-08-03 00:45:00 +00:00
Antonio Sanchez
9816fe59b4 Fix assignment operator issue for latest MSVC+NVCC.
Details are scattered across #920, #1000, #1324, #2291.

Summary: some MSVC versions have a bug that requires omitting explicit
`operator=` definitions (leads to duplicate definition errors), and
some MSVC versions require adding explicit `operator=` definitions
(otherwise implicitly deleted errors).  This mess tries to cover
all the cases encountered.

Fixes #2291.
2021-08-03 00:26:10 +00:00
Alexander Karatarakis
f357283d31 _DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier 2021-07-29 18:02:04 +00:00
Jonas Harsch
5b81764c0f Fixed typo in TutorialSparse.dox 2021-07-26 07:20:19 +00:00
Antonio Sanchez
de2e62c62d Disable vectorization of comparisons except for bool.
Packet input/output types must currently be the same, and since these
have a return type of `bool`, vectorization will only work if
input is bool.
2021-07-25 13:39:50 -07:00
derekjchow
66ca41bd47 Add support for vectorizing logical comparisons. 2021-07-23 20:07:48 +00:00
arthurfeeney
a77638387d Fixes #1387 for compilation error in JacobiSVD with HouseholderQRPreconditioner that occurs when input is a compile-time row vector. 2021-07-20 20:11:22 +00:00
Antonio Sanchez
297f0f563d Fix explicit default cache size typo. 2021-07-20 11:40:17 -07:00
Antonio Sanchez
1fd5ce1002 For GpuDevice::fill, use a single memset if all bytes are equal.
The original `fill` implementation introduced a 5x regression on my
nvidia Quadro K1200.  @rohitsan reported up to 100x regression for
HIP.  This restores performance.
2021-07-10 13:37:16 +00:00
Antonio Sanchez
9c22795d65 Put attach/detach buffer back in for TensorDeviceSycl.
Also added a test to verify the original buffer is updated correctly.
2021-07-09 10:00:05 -07:00
Rohit Santhanam
beea14a18f Enable extract et. al. for HIP GPU. 2021-07-09 14:58:07 +00:00
Rasmus Munk Larsen
0c361c4899 Defer to std::fill_n when filling a dense object with a constant value. 2021-07-09 03:59:35 +00:00
Antonio Sanchez
1e6c6c1576 Replace memset with fill to work for non-trivial scalars.
For custom scalars, zero is not necessarily represented by
a zeroed-out memory block (e.g. gnu MPFR). We therefore
cannot rely on `memset` if we want to fill a matrix or tensor
with zeroes. Instead, we should rely on `fill`, which for trivial
types does end up getting converted to a `memset` under-the-hood
(at least with gcc/clang).

Requires adding a `fill(begin, end, v)` to `TensorDevice`.

Replaced all potentially bad instances of memset with fill.

Fixes #2245.
2021-07-08 18:34:41 +00:00
Jonas Harsch
e9c9a3130b Removed superfluous boolean degenerate in TensorMorphing.h. 2021-07-08 18:02:58 +00:00
Guoqiang QI
4bcd42c271 Make a copy of input matrix when try to do the inverse in place, this fixes #2285. 2021-07-08 17:05:26 +00:00
Kolja Brix
a59cf78c8d Add Doxygen-style documentation to main.h. 2021-07-07 18:23:59 +00:00
Antonio Sanchez
f44f05532d Fix CMake directory issues.
Allows absolute and relative paths for
- `INCLUDE_INSTALL_DIR`
- `CMAKEPACKAGE_INSTALL_DIR`
- `PKGCONFIG_INSTALL_DIR`

Type should be `PATH` not `STRING`.  Contrary to !211, these don't
seem to be made absolute if user-defined - according to the doc any
directories should use `PATH` type, which allows a file dialog
to be used via the GUI.  It also better handles file separators.

If user provides an absolute path, it will be made relative to
`CMAKE_INSTALL_PREFIX` so that the `configure_packet_config_file` will
work.

Fixes #2155 and #2269.
2021-07-07 17:24:57 +00:00
Antonio Sanchez
f5a9873bbb Fix Tensor documentation page.
The extra [TOC] tag is generating a huge floating duplicated
table-of-contents, which obscures the majority of the page
(see bottom of https://eigen.tuxfamily.org/dox/unsupported/eigen_tensors.html).
Remove it.

Also, headers do not support markup (see
[doxygen bug](https://github.com/doxygen/doxygen/issues/7467)), so
backticks like
```
```
end up generating titles that looks like
```
Constructor <tt>Tensor<double,2></tt>
```
Removing backticks for now.  To generate proper formatted headers, we
must directly use html instead of markdown, i.e.
```
<h2>Constructor <code>Tensor&lt;double,2&gt;</code></h2>
```
which is ugly.

Fixes #2254.
2021-07-03 04:39:22 +00:00
Rasmus Munk Larsen
7b35638ddb Fix breakage of conj_helper in conjunction with custom types introduced in !537. 2021-07-02 20:42:15 +00:00
Jonas Harsch
aab747021b Don't crash when attempting to shuffle an empty tensor. 2021-07-02 20:33:52 +00:00
Rasmus Munk Larsen
bbfc4d54cd Use padd instead of +. 2021-07-02 02:51:48 +00:00
Rasmus Munk Larsen
9312a5bf5c Implement a generic vectorized version of Smith's algorithms for complex division. 2021-07-01 23:31:12 +00:00
Antonio Sanchez
6035da5283 Fix compile issues for gcc 4.8.
- Move constructors can only be defaulted as NOEXCEPT if all members
have NOEXCEPT move constructors.
- gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter.
2021-07-01 22:58:14 +00:00
Antonio Sanchez
154f00e9ea Fix inverse nullptr/asan errors for LU.
For empty or single-column matrices, the current `PartialPivLU`
currently dereferences a `nullptr` or accesses memory out-of-bounds.
Here we adjust the checks to avoid this.
2021-07-01 13:41:04 -07:00
Dan Miller
eb04775903 Fix duplicate definitions on Mac 2021-07-01 14:54:12 +00:00
Chip Kerchner
91e99ec1e0 Create the ability to disable the specialized gemm_pack_rhs in Eigen (only PPC) for TensorFlow 2021-06-30 23:05:04 +00:00
Alexander Karatarakis
60400334a9 Make DenseStorage<> trivially_copyable 2021-06-30 04:27:51 +00:00
大河メタル
c81da59a25 Correct declarations for aarch64-pc-windows-msvc 2021-06-30 04:09:46 +00:00
Rasmus Munk Larsen
5aebbe9098 Get rid of redundant pabs instruction in complex square root. 2021-06-29 23:26:15 +00:00
Antonio Sanchez
3a087ccb99 Modify tensor argmin/argmax to always return first occurence.
As written, depending on multithreading/gpu, the returned index from
`argmin`/`argmax` is not currently stable.  Here we modify the functors
to always keep the first occurence (i.e. if the value is equal to the
current min/max, then keep the one with the smallest index).

This is otherwise causing unpredictable results in some TF tests.
2021-06-29 10:36:20 -07:00
Rohit Santhanam
2d132d1736 Commit 52a5f982 broke conjhelper functionality for HIP GPUs.
This commit addresses this.
2021-06-25 19:28:00 +00:00
Rasmus Munk Larsen
bffd267d17 Small cleanup: Get rid of the macros EIGEN_HAS_SINGLE_INSTRUCTION_CJMADD and CJMADD, which were effectively unused, apart from on x86, where the change results in identically performing code. 2021-06-24 18:52:17 -07:00
Rasmus Munk Larsen
52a5f98212 Get rid of code duplication for conj_helper. For packets where LhsType=RhsType a single generic implementation suffices. For scalars, the generic implementation of pconj automatically forwards to numext::conj, so much of the existing specialization can be avoided. For mixed types we still need specializations. 2021-06-24 15:47:48 -07:00
Rasmus Munk Larsen
4ad30a73fc Use internal::ref_selector to avoid holding a reference to a RHS expression. 2021-06-22 14:31:32 +00:00
Rasmus Munk Larsen
ea62c937ed Update ComplexEigenSolver_eigenvectors.cpp 2021-06-21 19:06:25 +00:00
Rasmus Munk Larsen
c8a2b4d20a Fix typo in SelfAdjointEigenSolver_eigenvectors.cpp 2021-06-21 19:06:04 +00:00
Antonio Sanchez
e9ab4278b7 Rewrite balancer to avoid overflows.
The previous balancer overflowed for large row/column norms.
Modified to prevent that.

Fixes #2273.
2021-06-21 17:29:55 +00:00
Antonio Sanchez
35a367d557 Fix fix<> for gcc-4.9.3.
There's a missing `EIGEN_HAS_CXX14` -> `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`
replacement.

Fixes ##2267
2021-06-18 13:22:54 -07:00
Antonio Sanchez
12e8d57108 Remove pset, replace with ploadu.
We can't make guarantees on alignment for existing calls to `pset`,
so we should default to loading unaligned.  But in that case, we should
just use `ploadu` directly. For loading constants, this load should hopefully
get optimized away.

This is causing segfaults in Google Maps.
2021-06-16 18:41:17 -07:00
Chip-Kerchner
ef1fd341a8 EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with Tensorflow. Changing to EIGEN_ALWAYS_INLINE where appropiate. 2021-06-16 16:30:31 +00:00
jenswehner
175f0cc1e9 changed documentation to make example compile 2021-06-16 11:45:06 +02:00
Antonio Sanchez
9e94c59570 Add missing ppc pcmp_lt_or_nan<Packet8bf> 2021-06-15 13:42:17 -07:00
Antonio Sanchez
954879183b Fix placement of permanent GPU defines. 2021-06-15 12:17:09 -07:00
Rasmus Munk Larsen
13fb5ab92c Fix more enum arithmetic. 2021-06-15 09:09:31 -07:00
Antonio Sanchez
ad82d20cf6 Fix checking of version number for mingw.
MinGW spits out version strings like: `x86_64-w64-mingw32-g++ (GCC)
10-win32 20210110`, which causes the version extraction to fail.
Added support for this with tests.

Also added `make_unsigned` for `long long`, since mingw seems to
use that for `uint64_t`.

Related to #2268.  CMake and build passes for me after this.
2021-06-11 23:19:10 +00:00
Antonio Sanchez
514977f31b Add ability to permanently enable HIP/CUDA gpu* defines.
When using Eigen for gpu, these simplify portability.  If
`EIGEN_PERMANENTLY_ENABLE_GPU_HIP_CUDA_DEFINES` is set, then
we do not undefine them.
2021-06-11 17:19:54 +00:00
Antonio Sanchez
6aec83263d Allow custom TENSOR_CONTRACTION_DISPATCH macro.
Currently TF lite needs to hack around with the Tensor headers in order
to customize the contraction dispatch method. Here we add simple `#ifndef`
guards to allow them to provide their own dispatch prior to inclusion.
2021-06-11 17:02:19 +00:00
Rasmus Munk Larsen
fc87e2cbaa Use bit_cast to create -0.0 for floating point types to avoid compiler optimization changing sign with --ffast-math enabled. 2021-06-11 02:35:53 +00:00
Rasmus Munk Larsen
f64b2954c7 Fix c++20 warnings about using enums in arithmetic expressions. 2021-06-10 17:17:39 -07:00
Nicolas Cornu
001a57519a Fix parsing of version for nvhpc
As the first line of the version is empty it crashes,
so delete first line if it is empty
2021-06-10 18:30:53 +00:00
Rohit Santhanam
c8d40a7bf1 Removed dead code from GPU float16 unit test. 2021-05-28 20:06:48 +00:00
Cyril Kaiser
91cd67f057 Remove EIGEN_DEVICE_FUNC from CwiseBinaryOp's default copy constructor. 2021-05-26 19:28:13 +00:00
Antonio Sanchez
dba753a986 Add missing NEON ptranspose implementations.
Unified implementation using only `vzip`.
2021-05-25 18:25:35 +00:00
Antonio Sanchez
ebb300d0b4 Modify Unary/Binary/TernaryOp evaluators to work for non-class types.
This used to work for non-class types (e.g. raw function pointers) in
Eigen 3.3.  This was changed in commit 11f55b29 to optimize the
evaluator:

> `sizeof((A-B).cwiseAbs2())` with A,B Vector4f is now 16 bytes, instead of 48 before this optimization.

though I cannot reproduce the 16 byte result.  Both before the change
and after, with multiple compilers/versions, I always get a result of 40 bytes.

https://godbolt.org/z/MsjTc1PGe

This change modifies the code slightly to allow non-class types.  The
final generated code is identical, and the expression remains 40 bytes
for the `abs2` sample case.

Fixes #2251
2021-05-23 12:44:37 -07:00
Jakub Lichman
12471fcb5d predux_half_dowto4 test extended to all applicable packets 2021-05-21 16:42:19 +00:00
Steve Bronder
1720057023 Adds macro for checking if C++14 variable templates are supported 2021-05-21 16:25:32 +00:00
Niall Murphy
391094c507 Use derived object type in conservative_resize_like_impl
When calling conservativeResize() on a matrix with DontAlign flag, the
temporary variable used to perform the resize should have the same
Options as the original matrix to ensure that the correct override of
swap is called (i.e. PlainObjectBase::swap(DenseBase<OtherDerived> &
other). Calling the base class swap (i.e in DenseBase) results in
assertions errors or memory corruption.
2021-05-20 23:17:02 +00:00
Jakub Lichman
8877f8d9b2 ptranpose test for non-square kernels added 2021-05-19 08:26:45 +00:00
Guoqiang QI
3e006bfd31 Ensure all generated matrices for inverse_4x4 testes are invertible, this fix #2248 . 2021-05-13 15:03:30 +00:00
Nathan Luehr
972cf0c28a Fix calls to device functions from host code 2021-05-11 22:47:49 +00:00
Nathan Luehr
7e6a1c129c Device implementation of log for std::complex types. 2021-05-11 22:02:21 +00:00
Nathan Luehr
6753f0f197 Fix ambiguity due to argument dependent lookup. 2021-05-11 15:41:11 -05:00
guoqiangqi
3d9051ea84 Changing the storage of the SSE complex packets to that of the wrapper. This should fix #2242 . 2021-05-10 23:53:16 +00:00
Rohit Santhanam
39ec31c0ad Fix for issue where numext::imag and numext::real are used before they are defined. 2021-05-10 19:48:32 +00:00
Antonio Sanchez
c0eb5f89a4 Restore ABI compatibility for conj with 3.3, fix conflict with boost.
The boost library unfortunately specializes `conj` for various types and
assumes the original two-template-parameter version.  This changes
restores the second parameter.  This also restores ABI compatibility.

The specialization for `std::complex` is because `std::conj` is not
a device function. For custom complex scalar types, users should provide
their own `conj` implementation.

We may consider removing the unnecessary second parameter in the future - but
this will require modifying boost as well.

Fixes #2112.
2021-05-07 18:14:00 +00:00
Antonio Sanchez
0eba8a1fe3 Clean up gpu device properties.
Made a class and singleton to encapsulate initialization and retrieval of
device properties.

Related to !481, which already changed the API to address a static
linkage issue.
2021-05-07 17:51:29 +00:00
Antonio Sanchez
90e9a33e1c Fix numext::arg return type.
The cxx11 path for `numext::arg` incorrectly returned the complex type
instead of the real type, leading to compile errors. Fixed this and
added tests.

Related to !477, which uncovered the issue.
2021-05-07 16:26:57 +00:00
Christoph Hertzberg
722ca0b665 Revert addition of unused paddsub<Packet2cf>. This fixes #2242 2021-05-06 18:36:47 +02:00
Antonio Sanchez
e3b7f59659 Simplify TensorRandom and remove time-dependence.
Time-dependence prevents tests from being repeatable. This has long
been an issue with debugging the tensor tests. Removing this will allow
future tests to be repeatable in the usual way.

Also, the recently added macros in !476 are causing headaches across different
platforms. For example, checking `_XOPEN_SOURCE` is leading to multiple
ambiguous macro errors across Google, and `_DEFAULT_SOURCE`/`_SVID_SOURCE`/`_BSD_SOURCE`
are sometimes defined with values, sometimes defined as empty, and sometimes
not defined at all when they probably should be.  This is leading to
multiple build breakages.

The simplest approach is to generate a seed via
`Eigen::internal::random<uint64_t>()` if on CPU. For GPU, we use a
hash based on the current thread ID (since `rand()` isn't supported
on GPU).

Fixes #1602.
2021-05-04 13:34:49 -07:00
Antonio Sanchez
1c013be2cc Better CUDA complex division.
The original produced NaNs when dividing 0/b for subnormal b.
The `complex_divide_stable` was changed to use the more common
Smith's algorithm.
2021-04-29 17:39:58 +00:00
Antonio Sanchez
172db7bfc3 Add missing pcmp_lt_or_nan for NEON Packet4bf. 2021-04-27 14:12:11 -07:00
Theo Fletcher
2ced0cc233 Added complex matrix unit tests for SelfAdjointEigenSolve 2021-04-26 19:00:51 +00:00
Jakub Lichman
d87648a6be Tests added and AVX512 bug fixed for pcmp_lt_or_nan 2021-04-25 20:58:56 +00:00
Jakub Lichman
1115f5462e Tests for pcmp_lt and pcmp_le added 2021-04-23 19:51:43 +00:00
Turing Eret
3804ca0d90 Fix for issue with static global variables in TensorDeviceGpu.h
m_deviceProperties and m_devicePropInitialized are defined as global
statics which will define multiple copies which can cause issues if
initializeDeviceProp() is called in one translation unit and then
m_deviceProperties is used in a different translation unit. Added
inline functions getDeviceProperties() and getDevicePropInitialized()
which defines those variables as static locals. As per the C++ standard
7.1.2/4, a static local declared in an inline function always refers
to the same object, so this should be safer. Credit to Sun Chenggen
for this fix.

This fixes issue #1475.
2021-04-23 07:43:35 -06:00
Antonio Sanchez
045c0609b5 Check existence of BSD random before use.
`TensorRandom` currently relies on BSD `random()`, which is not always
available.  The [linux manpage](https://man7.org/linux/man-pages/man3/srandom.3.html)
gives the glibc condition:
```
_XOPEN_SOURCE >= 500
               || /* Glibc since 2.19: */ _DEFAULT_SOURCE
	       || /* Glibc <= 2.19: */ _SVID_SOURCE ||  _BSD_SOURCE
```
In particular, this was failing to compile for MinGW via msys2. If not
available, we fall back to using `rand()`.
2021-04-22 20:42:12 +00:00
Antonio Sanchez
d213a0bcea DenseStorage safely copy/swap.
Fixes #2229.

For dynamic matrices with fixed-sized storage, only copy/swap
elements that have been set.  Otherwise, this leads to inefficient
copying, and potential UB for non-initialized elements.
2021-04-22 18:45:19 +00:00
Rasmus Munk Larsen
85a76a16ea Make vectorized compute_inverse_size4 compile with AVX. 2021-04-22 15:21:01 +00:00
Jakub Lichman
d72c794ccd Compilation of basicbenchmark fixed 2021-04-21 06:53:32 +00:00
Chip-Kerchner
06c2760bd1 Fix taking address of rvalue compiler issue with TensorFlow (plus other warnings). 2021-04-21 00:47:13 +00:00
Jakub Lichman
2b1dfd1ba0 HasExp added for AVX512 Packet8d 2021-04-20 19:07:58 +00:00
Antonio Sanchez
1d79c68ba0 Fix ldexp for AVX512 (#2215)
Wrong shuffle was used.  Need to interleave low/high halves with a
`permute` instruction.

Fixes #2215.
2021-04-20 16:25:22 +00:00
David Tellenbach
3e819d83bf Before 3.4 branch 2021-04-18 23:36:14 +02:00
Antonio Sanchez
69adf26aa3 Modify googlehash use to account for namespace issues.
The namespace declaration for googlehash is a configurable macro that
can be disabled.  In particular, it is disabled within google, causing
compile errors since `dense_hash_map`/`sparse_hash_map` are then in
the global namespace instead of in `::google`.

Here we play a bit of gynastics to allow for both `google::*_hash_map`
and `*_hash_map`, while limiting namespace polution.  Symbols within
the `::google` namespace are imported into `Eigen::google`.

We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is
fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be
defined.
2021-04-12 19:00:39 -07:00
Christoph Hertzberg
9357feedc7 Avoid using uninitialized inputs and if available, use slightly more efficient movsd instruction for pset1<Packet2cf>. 2021-04-13 01:36:59 +02:00
Rasmus Munk Larsen
a2c0542010 Fix typo in TensorDimensions.h 2021-04-12 18:59:56 +00:00
Rohit Santhanam
dfd6720d82 Fix for float16 GPU unit test. 2021-04-12 10:19:06 +00:00
Christoph Hertzberg
1e1c8a735c Use EIGEN_HAS_CXX11 and EIGEN_COMP_CXXVER macros to detect C++ version for std::result_of and std::invoke_result.
Fixes #2209
2021-04-12 01:26:15 +00:00
Jens Wehner
f6fc66aa75 fixed doxygen for unsupported iterative solver module 2021-04-11 16:26:14 +00:00
Christoph Hertzberg
d58678069c Make iterators default constructible and assignable, by making... 2021-04-09 17:03:28 +00:00
Rohit Santhanam
2859db0220 This fixes an issue where the compiler was not choosing the GPU specific specialization of ScanLauncher.
The issue was discovered when the GPU scan unit test was run and resulted in a segmentation fault.

The segmantation fault occurred because the unit test allocated GPU memory and passed a pointer to that memory to the computation that it presumed would execute on the GPU.

But because of the issue, the computation was scheduled to execute on the CPU so a situation was constructed where the CPU attempted to access a GPU memory location.

The fix expands the GPU specific ScanLauncher specialization to handle cases where vectorization is enabled.

Previously, the GPU specialization is chosen only if Vectorization is not used.
2021-04-08 15:14:48 +00:00
Antonio Sanchez
fcb5106c6e Scaled epsilon the wrong way.
Should have been 0.5 to widen the bounds, since this is inverse
precision.  Setting to 0.5, however, leads to many more failing
tests at Google, so reverting to 1 for now.
2021-04-07 15:08:39 -07:00
Christoph Hertzberg
6197ce1a35 Replace -2147483648 by -0.0f or -0.0 constants (this should fix #2189).
Also, remove unnecessary `pgather` operations.
2021-04-07 11:25:27 +00:00
Rasmus Munk Larsen
22edb46823 Align local arrays to Packet boundary. 2021-04-06 16:22:36 +00:00
Antonio Sanchez
ace7f132ed Fix clang tidy warnings in AnnoyingScalar.
Clang-tidy complains that full specializations in headers can cause ODR
violations.  Marked these as `inline` to fix.

It also complains about renaming arguments in specializations.  Set the
argument names to match.
2021-04-05 12:49:38 -07:00
Antonio Sanchez
90187a33e1 Fix SelfAdjoingEigenSolver (#2191)
Adjust the relaxation step to use the condition
```
abs(subdiag[i]) <= epsilon * sqrt(abs(diag[i]) + abs(diag[i+1]))
```
for setting the subdiagonal entry to zero.

Also adjust Wilkinson shift for small `e = subdiag[end-1]` -
I couldn't find a reference for the original, and it was not
consistent with the Wilkinson definition.

Fixes #2191.
2021-04-05 11:19:09 -07:00
Rasmus Munk Larsen
3ddc0974ce Fix two bugs in commit 2021-04-02 22:06:27 +00:00
Chip Kerchner
c24bee6120 Fix address of temporary object errors in clang11.
This fixes the problem with taking the address of temporary objects which clang11 treats as errors.
2021-04-02 16:27:08 +00:00
David Tellenbach
e4233b6e3d Add CI infrastructure for pre-merge smoke tests.
This patch adds pre-merge smoke tests for x86 Linux using gcc-10 and clang-10.

Closes #2188.
2021-04-01 00:08:37 +00:00
David Tellenbach
ae95b74af9 Add CMake infrastructure for smoke testing
Necessary CMake changes to implement pre-merge smoke tests
running via CI.
2021-03-31 22:09:00 +00:00
Rasmus Munk Larsen
5bbc9cea93 Add an info() method to the SVDBase class to make it possible to tell the user that the computation failed, possibly due to invalid input.
Make Jacobi and divide-and-conquer fail fast and return info() == InvalidInput if the matrix contains NaN or +/-Inf.
2021-03-31 21:09:19 +00:00
Guoqiang QI
b5a926a0f6 Add GitLab templates for issues and merge requests
This patch adds GitLab templates for bug reports, feature and merge requests.

This closes #2117.
2021-03-31 16:01:12 +00:00
Antonio Sanchez
78ee3d6261 Fix CUDA constexpr issues for numeric_limits.
Some CUDA/HIP constants fail on device with `constexpr` since they
internally rely on non-constexpr functions, e.g.
```
\#define CUDART_INF_F            __int_as_float(0x7f800000)
```
This fails for cuda-clang (though passes with nvcc). These constants are
currently used by `device::numeric_limits`.  For portability, we
need to remove `constexpr` from the affected functions.

For C++11 or higher, we should be able to rely on the `std::numeric_limits`
versions anyways, since the methods themselves are now `constexpr`, so
should be supported on device (clang/hipcc natively, nvcc with
`--expr-relaxed-constexpr`).
2021-03-30 18:01:27 +00:00
Antonio Sanchez
af1247fbc1 Use Index type in loop over coefficients.
Previously was `int`.  Brought up by Kyle Snow (Polaris Geospatial
Services) on the mailing list.
2021-03-29 17:40:55 +00:00
Antonio Sanchez
87729ea39f Eliminate round_impl double-promotion warnings for c++03. 2021-03-25 16:52:19 +00:00
Deven Desai
748489ef9c Un-defining EIGEN_HAS_CONSTEXPR on the HIP platform
The Eigen unit-tests started failing on the HIP/ROCm platform, after the following commit

e7b8643d70

```
In file included from /home/rocm-user/eigen/test/main.h:360:
In file included from /home/rocm-user/eigen/Eigen/QR:11:
In file included from /home/rocm-user/eigen/Eigen/Core:162:
/home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:300:17: error: constexpr function never produces a constant expression [-Winvalid-constexpr]
  static float (max)() {
                ^
/home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:304:12: note: non-constexpr function '__int_as_float' cannot be used in a constant expression
    return HIPRT_MAX_NORMAL_F;
           ^
/home/rocm-user/eigen/Eigen/src/Core/arch/HIP/hcc/math_constants.h:14:28: note: expanded from macro 'HIPRT_MAX_NORMAL_F'
#define HIPRT_MAX_NORMAL_F __int_as_float(0x7f7fffff)
                           ^
/opt/rocm/hip/include/hip/hcc_detail/device_functions.h:913:32: note: declared here
__device__ static inline float __int_as_float(int x) {
                               ^
```

The problem seems to that some of the constants defined in the HIP `math_constants.h` have a call to `__int_as_float` routine which is not declared `constexpr` in the HIP runtime header file.

Working around this issue for now, be skipping the const_expr support (enabled via the above commit) on HIP
2021-03-25 13:45:52 +00:00
Chip Kerchner
d59ef212e1 Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3). 2021-03-25 11:08:19 +00:00
Steve Bronder
e7b8643d70 Revert "Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()""
This reverts commit 5f0b4a4010.
2021-03-24 18:14:56 +00:00
Antonio Sanchez
5521c65afb Eliminate mixingtypes_7 warning.
`g_called` is not used in subtest 7, so was generating a
`-Wunneeded-internal-declaration` warnings.  Here we silence
it by initializing the static variable.
2021-03-24 11:05:41 -07:00
Christoph Hertzberg
69a4f70956 Revert "Uses _mm512_abs_pd for Packet8d pabs"
This reverts commit f019b97aca
2021-03-23 18:52:19 +00:00
David Tellenbach
824272cde8 Re-enable CI for Power 2021-03-22 19:28:25 +01:00
David Tellenbach
4811e81966 Remove yet another comma at end of enum 2021-03-18 23:30:00 +01:00
Steve Bronder
f019b97aca Uses _mm512_abs_pd for Packet8d pabs 2021-03-18 15:47:52 +00:00
David Tellenbach
0cc9b5eb40 Split test commainitializer into two substests 2021-03-18 13:28:51 +01:00
Antonio Sanchez
c3fbc6cec7 Use singleton pattern for static registered tests.
The original fails with nvcc+msvc - there's a static order of initialization
issue leading to registered tests being cleared.  The test then fails on
```
VERIFY(EigenTest::all().size()>0);
```
since `EigenTest` no longer contains any tests.  The singleton pattern
fixes this.
2021-03-18 00:56:31 +00:00
Niek Bouman
ed964ba3f1 Proposed fix for issue #2187 2021-03-18 00:55:36 +00:00
Antonio Sanchez
8dfe1029a5 Augment NumTraits with min/max_exponent() again.
Replace usage of `std::numeric_limits<...>::min/max_exponent` in
codebase where possible.  Also replaced some other `numeric_limits`
usages in affected tests with the `NumTraits` equivalent.

The previous MR !443 failed for c++03 due to lack of `constexpr`.
Because of this, we need to keep around the `std::numeric_limits`
version in enum expressions until the switch to c++11.

Fixes #2148
2021-03-16 20:12:46 -07:00
David Tellenbach
eb71e5db98 Fix another warning on missing commas 2021-03-17 03:07:04 +01:00
David Tellenbach
df4bc2731c Revert "Augment NumTraits with min/max_exponent()."
This reverts commit 75ce9cd2a7.
2021-03-17 03:06:08 +01:00
Antonio Sanchez
75ce9cd2a7 Augment NumTraits with min/max_exponent().
Replace usage of `std::numeric_limits<...>::min/max_exponent` in
codebase.  Also replaced some other `numeric_limits` usages in
affected tests with the `NumTraits` equivalent.

Fixes #2148
2021-03-17 01:00:41 +00:00
David Tellenbach
9fb7062440 Silence warning on comma at end of enumerator list 2021-03-17 01:46:52 +01:00
Theo Fletcher
b8502a9dd6 Updated SelfAdjointEigenSolver documentation to include that the eigenvectors matrix is unitary. 2021-03-16 18:48:02 +00:00
Rasmus Munk Larsen
2e83cbbba9 Add NaN propagation options to minCoeff/maxCoeff visitors. 2021-03-16 17:02:50 +00:00
Jens Wehner
c0a889890f Fixed output of complex matrices 2021-03-15 21:51:55 +00:00
Antonio Sanchez
f612df2736 Add fmod(half, half).
This is to support TensorFlow's `tf.math.floormod` for half.
2021-03-15 13:32:24 -07:00
Antonio Sanchez
14b7ebea11 Fix numext::round pre c++11 for large inputs.
This is to resolve an issue for large inputs when +0.5 can
actually lead to +1 if the input doesn't have enough precision
to resolve the addition - leading to an off-by-one error.

See discussion on 9a663973.
2021-03-15 19:08:04 +00:00
Chip Kerchner
c9d4367fa4 Fix pround and add print 2021-03-15 19:07:43 +00:00
Antonio Sanchez
d24f9f9b55 Fix NVCC+ICC issues.
NVCC does not understand `__forceinline`, so we need to use `inline`
when compiling for GPU.

ICC specializes `std::complex` operators for `float` and `double`
by default, which cannot be used on device and conflict with Eigen's
workaround in CUDA/Complex.h.  This can be prevented by defining
`_OVERRIDE_COMPLEX_SPECIALIZATION_` before including `<complex>`.
Added this define to the tests and to `Eigen/Core`, but this will
not work if the user includes `<complex>` before `<Eigen/Core>`.

ICC also seems to generate a duplicate `Map` symbol in
`PlainObjectBase`:
```
error: "Map" has already been declared in the current scope
  static ConstMapType Map(const Scalar *data)

```
I tracked this down to `friend class Eigen::Map`.  Putting the `friend`
statements at the bottom of the class seems to resolve this issue.

Fixes #2180
2021-03-15 18:42:04 +00:00
Antonio Sanchez
14487ed14e Add increment/decrement operators to Eigen::half.
This is for consistency with bfloat16, and to support initialization
with `std::iota`.
2021-03-15 10:52:23 -07:00
Antonio Sanchez
b271110788 Bump up rand histogram threshold.
The previous one sometimes fails for MSVC which has a poor random
number generator.

Fixes #2182
2021-03-10 22:17:03 -08:00
Antonio Sanchez
d098c4d64c Disable EIGEN_OPTIMIZATION_BARRIER for PPC clang.
Doesn't seem to correctly select the register type, and most types
lead to compiler crashes.
2021-03-10 16:05:01 -08:00
Antonio Sanchez
543e34ab9d Re-implement move assignments.
The original swap approach leads to potential undefined behavior (reading
uninitialized memory) and results in unnecessary copying of data for static
storage.

Here we pass down the move assignment to the underlying storage.  Static
storage does a one-way copy, dynamic storage does a swap.

Modified the tests to no longer read from the moved-from matrix/tensor,
since that can lead to UB. Added a test to ensure we do not access
uninitialized memory in a move.

Fixes: #2119
2021-03-10 16:55:20 +00:00
Ben Niu
b8d1857f0d [MSVC-specific] Define EIGEN_ARCH_x86_64 for native x64 (_M_X64 is defined and _M_ARM64EC is not), and define EIGEN_ARCH_ARM64 for both the native ARM64 (_M_ARM64 is defined) or ARM64EC (_M_ARM64EC is defined). _M_ARM64EC is defined when the code is compiled by MSVC for ARM64EC, a new ARM64 ABI designed to be compatible with x64 application emulation on ARM64. If _M_ARM64EC is defined, _M_X64 and _M_AMD64 are also defined, so x64-specific code (especially intrinsics) is also compiled to ARM64 instructions (compliant with the ARM64EC ABI) for maximum x64 compatibility. Although a majority of x64-specific intrinsics can emulated by ARM64 instructions, it is still a good to simply recompile the native ARM64 code paths to ARM64EC for pure computation tasks, for performance reasons. 2021-03-10 10:21:31 +00:00
Antonio Sanchez
853a5c4b84 Fix ambiguous call to CUDA __half constructor. 2021-03-08 21:06:28 -08:00
Antonio Sanchez
94327dbfba Fix typo: DEVICE -> GPU 2021-03-08 11:21:00 -08:00
Antonio Sanchez
1296abdf82 Fix non-trivial Half constructor for CUDA.
Both CUDA and HIP require trivial default constructors for types used
in shared memory. Otherwise failing with
```
error: initialization is not supported for __shared__ variables.
```
2021-03-08 07:32:54 -08:00
Antonio Sanchez
6045243141 Revert stack allocation limit change that crept in.
This was accidentally introduced when copying changes between repos.
2021-03-05 14:29:37 -08:00
Deven Desai
1a96d49afe Changing the Eigen::half implementation for HIP
Currently, when compiling with HIP, Eigen::half is derived from the `__half_raw` struct that is defined within the hip_fp16.h header file. This is true for both the "host" compile phase and the "device" compile phase. This was causing a very hard to detect bug in the ROCm TensorFlow build.

In the ROCm Tensorflow build,
* files that do not contain ant GPU code get compiled via gcc, and
* files that contnain GPU code get compiled via hipcc.

In certain case, we have a function that is defined in a file that is compiled by hipcc, and is called in a file that is compiled by gcc. If such a function had Eigen::half has a "pass-by-value" argument, its value was getting corrupted, when received by the function.

The reason for this seems to be that for the gcc compile, Eigen::half is derived from a `__half_raw` struct that has `uint16_t` as the data-store, and for hipcc the `__half_raw` implementation uses `_Float16` as the data store. There is some ABI incompatibility between gcc / hipcc (which is essentially latest clang), which results in the Eigen::half value (which is correct at the call-site) getting randomly corrupted when passed to the function.

Changing the Eigen::half argument to be "pass by reference" seems to workaround the error.

In order to fix it such that we do not run into it again in TF, this commit changes the Eigne::half implementation to use the same `__half_raw` implementation as the non-GPU compile, during host compile phase of the hipcc compile.
2021-03-05 19:27:13 +00:00
Antonio Sanchez
2468253c9a Define EIGEN_CPLUSPLUS and replace most __cplusplus checks.
The macro `__cplusplus` is not defined correctly in MSVC unless building
with the the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the
specified c++ standard version number.

Here we introduce `EIGEN_CPLUSPLUS` which will contain the c++ version
number both for MSVC and otherwise.  This simplifies checks for supported
features.

Also replaced most instances of standard version checking via `__cplusplus`
with the existing `EIGEN_COMP_CXXVER` macro for better clarity.

Fixes: #2170
2021-03-05 18:33:18 +00:00
Antonio Sanchez
82d61af3a4 Fix rint SSE/NEON again, using optimization barrier.
This is a new version of !423, which failed for MSVC.

Defined `EIGEN_OPTIMIZATION_BARRIER(X)` that uses inline assembly to
prevent operations involving `X` from crossing that barrier. Should
work on most `GNUC` compatible compilers (MSVC doesn't seem to need
this). This is a modified version adapted from what was used in
`psincos_float` and tested on more platforms
(see #1674, https://godbolt.org/z/73ezTG).

Modified `rint` to use the barrier to prevent the add/subtract rounding
trick from being optimized away.

Also fixed an edge case for large inputs that get bumped up a power of two
and ends up rounding away more than just the fractional part.  If we are
over `2^digits` then just return the input.  This edge case was missed in
the test since the test was comparing approximate equality, which was still
satisfied.  Adding a strict equality option catches it.
2021-03-05 08:54:12 -08:00
David Tellenbach
5f0b4a4010 Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()"
This reverts commit 6cbb3038ac because it
breaks clang-10 builds on x86 and aarch64 when C++11 is enabled.
2021-03-05 13:16:43 +01:00
Steve Bronder
6cbb3038ac Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size() 2021-03-04 18:58:08 +00:00
David Tellenbach
5bfc67f9e7 Deactive CI for Power due to problems with GitLab runner 2021-03-04 17:33:40 +01:00
Eugene Zhulenev
a6601070f2 Add log2 operation to TensorBase 2021-03-04 00:13:36 +00:00
Antonio Sánchez
9a663973b4 Revert "Fix rint for SSE/NEON."
This reverts commit e72dfeb8b9
2021-03-03 18:51:51 +00:00
Antonio Sanchez
e72dfeb8b9 Fix rint for SSE/NEON.
It seems *sometimes* with aggressive optimizations the combination
`psub(padd(a, b), b)` trick to force rounding is compiled away. Here
we replace with inline assembly to prevent this (I tried `volatile`,
but that leads to additional loads from memory).

Also fixed an edge case for large inputs `a` where adding `b` bumps
the value up a power of two and ends up rounding away more than
just the fractional part.  If we are over `2^digits` then just return
the input.  This edge case was missed in the test since the test was
comparing approximate equality, which was still satisfied.  Adding
a strict equality option catches it.
2021-03-03 09:41:46 -08:00
Christoph Hertzberg
199c5f2b47 geo_alignedbox_5 was failing with AVX enabled, due to storing Vector4d in a std::vector without using an aligned allocator.
Got rid of using `std::vector` and simplified the code.
Avoid leading `_`
2021-03-01 03:59:21 +01:00
Antonio Sanchez
1e0c7d4f49 Add print for SSE/NEON, use NEON rounding intrinsics if available.
In SSE, by adding/subtracting 2^MantissaBits, we force rounding according to the
current rounding mode.

For NEON, we use the provided intrinsics for rint/floor/ceil if
available (armv8).

Related to #1969.
2021-02-27 22:42:07 +00:00
David Tellenbach
976ae0ca6f Document that using raw function pointers doesn't work with unaryExpr. 2021-02-27 22:58:42 +01:00
Antonio Sanchez
c65c2b31d4 Make half/bfloat16 constructor take inputs by value, fix powerpc test.
Since `numeric_limits<half>::max_exponent` is a static inline constant,
it cannot be directly passed by reference. This triggers a linker error
in recent versions of `g++-powerpc64le`.

Changing `half` to take inputs by value fixes this.  Wrapping
`max_exponent` with `int(...)` to make an addressable integer also fixes this
and may help with other custom `Scalar` types down-the-road.

Also eliminated some compile warnings for powerpc.
2021-02-27 21:32:06 +00:00
Christoph Hertzberg
39a590dfb6 Remove unused include 2021-02-27 19:02:33 +01:00
Christoph Hertzberg
8f686ac4ec clang 10 aggressively warns about precision loss when converting int to float (or long to double)
(cherry picked from commit cd541ad52c8152340469cae210312c0e27829c8d)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
2660d01fa7 Inherit from no_assignment_operator to avoid implicit copy constructor warnings
(cherry picked from commit 9bbb7ea4b54b1f307863be4ed8d105c38cdefe50)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
a3521d743c Fix some enum-enum conversion warnings
(cherry picked from commit 838f3d8ce22a5549ef10c7386fb03040721749a0)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
ca528593f4 Fixed/masked more implicit copy constructor warnings
(cherry picked from commit 2883e91ce5a99c391fbf28e20160176b70854992)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
81b5fe2f0a ReturnByValue is already non-copyable
(cherry picked from commit abbf95045009619f37bd92b45433eedbfcbe41cf)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
4fb3459a23 Fix double-promotion warnings
(cherry picked from commit c22c103e932e511e96645186831363585a44b7a3)
2021-02-27 18:44:26 +01:00
Jens Wehner
4bfcee47b9 Idrs iterative linear solver 2021-02-27 12:09:33 +00:00
Antonio Sanchez
29ebd84cb7 Fix NEON sqrt for 32-bit, add prsqrt.
With !406, we accidentally broke arm 32-bit NEON builds, since
`vsqrt_f32` is only available for 64-bit.

Here we add back the `rsqrt` implementation for 32-bit, relying
on a `prsqrt` implementation with better handling of edge cases.

Note that several of the 32-bit NEON packet tests are currently
failing - either due to denormal handling (NEON versions flush
to zero, but scalar paths don't) or due to accuracy (e.g. sin/cos).
2021-02-26 14:08:40 -08:00
Rasmus Munk Larsen
fe19714f80 Merge branch 'rmlarsen1/eigen-nan_prop' 2021-02-26 09:21:24 -08:00
Rasmus Munk Larsen
e67672024d Merge branch 'nan_prop' of https://gitlab.com/rmlarsen1/eigen into nan_prop 2021-02-26 09:12:44 -08:00
Rasmus Munk Larsen
5e7d4c33d6 Add TODO. 2021-02-26 09:08:45 -08:00
Rasmus Munk Larsen
fb5b59641a Defer default for minCoeff/maxCoeff to templated variant. 2021-02-26 09:07:00 -08:00
Antonio Sanchez
e19829c3b0 Fix floor/ceil for NEON fp16.
Forgot to test this.  Fixes bug introduced in !416.
2021-02-25 20:39:56 -08:00
Antonio Sanchez
5529db7524 Fix SSE/NEON pfloor/pceil for saturated values.
The original will saturate if the input does not fit into an integer
type.  Here we fix this, returning the input if it doesn't have
enough precision to have a fractional part.

Also added `pceil` for NEON.

Fixes #1969.
2021-02-25 14:39:26 -08:00
Rasmus Munk Larsen
51eba8c3e2 Fix indentation. 2021-02-25 18:21:21 +00:00
Rasmus Munk Larsen
5297b7162a Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions. 2021-02-25 18:21:21 +00:00
Antonio Sanchez
ecb7b19dfa Disable new/delete test for HIP 2021-02-25 08:04:05 -08:00
Chip-Kerchner
6eebe97bab Fix clang compile when no MMA flags are set. Simplify MMA compiler detection. 2021-02-24 20:43:23 -06:00
Rasmus Munk Larsen
f284c8592b Don't crash when attempting to slice an empty tensor. 2021-02-24 18:12:51 -08:00
Rasmus Munk Larsen
4cb0592af7 Fix indentation. 2021-02-24 17:59:36 -08:00
Rasmus Munk Larsen
6b34568c74 Merge branch 'nan_prop' of https://gitlab.com/rmlarsen1/eigen into nan_prop 2021-02-24 17:54:58 -08:00
Rasmus Munk Larsen
0065f9d322 Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions. 2021-02-25 01:54:36 +00:00
Rasmus Munk Larsen
841c8986f8 Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions. 2021-02-24 17:49:20 -08:00
Rasmus Munk Larsen
113e61f364 Remove unused function scalar_cmp_with_cast. 2021-02-24 23:59:35 +00:00
Rasmus Munk Larsen
98ca58b02c Cast anonymous enums to int when used in expressions. 2021-02-24 23:50:06 +00:00
Chip-Kerchner
c31ead8a15 Having forward template function declarations in a P10 file causes bad code in certain situations. 2021-02-24 23:43:30 +00:00
Guoqiang QI
f44197fabd Some improvements for kissfft from Martin Reinecke(pocketfft author):
1.Only computing about half of the factors and use complex conjugate symmetry for the rest instead of all to save time.
2.All twiddles are calculated in double because that gives the maximum achievable precision when doing float transforms.
3.Reducing all angles to the range 0<angle<pi/4 which gives even more precision.
2021-02-24 21:36:47 +00:00
Antonio Sanchez
a31effc3bc Add invoke_result and eliminate result_of warnings for C++17+.
The `std::result_of` meta struct is deprecated in C++17 and removed
in C++20. It was still slipping through due to a faulty definition of
`EIGEN_HAS_STD_RESULT_OF`.

Added a new macro `EIGEN_HAS_STD_INVOKE_RESULT` and
`Eigen::internal::invoke_result` implementation with fallback for
pre C++17.

Replaces the `result_of` definition with one based on `std::invoke_result`
for C++17 and higher.

For completeness, added nullary op support for c++03.

Fixes #1850.
2021-02-24 21:36:14 +00:00
Chip-Kerchner
8523d447a1 Fixes to support old and new versions of the compilers for built-ins. Cast to non-const when using vector_pair with certain built-ins. 2021-02-24 20:49:15 +00:00
Antonio Sanchez
5908aeeaba Fix CUDA device new and delete, and add test.
HIP does not support new/delete on device, so test is skipped.
2021-02-24 11:31:41 -08:00
Antonio Sanchez
119763cf38 Eliminate CMake FindPackageHandleStandardArgs warnings.
CMake complains that the package name does not match when the case
differs, e.g.:
```
CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (UMFPACK)
  does not match the name of the calling package (Umfpack).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/FindUmfpack.cmake:50 (find_package_handle_standard_args)
  bench/spbench/CMakeLists.txt:24 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
```
Here we rename the libraries to match their true cases.
2021-02-24 09:52:05 +00:00
Antonio Sanchez
6cf0ab5e99 Disable fast psqrt for NEON.
Accuracy is too poor - requires at least two Newton iterations, but then
it is no longer significantly faster than `vsqrt`.

Fixes #2094.
2021-02-23 19:52:55 -08:00
Antonio Sanchez
aba3998278 Fix check if GPU compile phase for std::hash 2021-02-23 19:52:08 -08:00
Antonio Sanchez
db5691ff2b Fix some CUDA warnings.
Added `EIGEN_HAS_STD_HASH` macro, checking for C++11 support and not
running on GPU.

`std::hash<float>` is not a device function, so cannot be used by
`std::hash<bfloat16>`.  Removed `EIGEN_DEVICE_FUNC` and only
define if `EIGEN_HAS_STD_HASH`. Same for `half`.

Added `EIGEN_CUDA_HAS_FP16_ARITHMETIC` to improve readability,
eliminate warnings about `EIGEN_CUDA_ARCH` not being defined.

Replaced a couple C-style casts with `reinterpret_cast` for aligned
loading of `half*` to `half2*`. This eliminates `-Wcast-align`
warnings in clang.  Although not ideal due to potential type aliasing,
this is how CUDA handles these conversions internally.
2021-02-24 00:16:31 +00:00
Rasmus Munk Larsen
88d4c6d4c8 Accurate pow, part 2. This change adds specializations of log2 and exp2 for double that
make pow<double> accurate the 1 ULP. Speed for AVX-512 is within 0.5% of the currect
implementation.
2021-02-23 23:11:03 +00:00
Adam Shapiro
2ac0b78739 Fixed sparse conservativeResize() when both num cols and rows decreased.
The previous implementation caused a buffer overflow trying to calculate non-
zero counts for columns that no longer exist.
2021-02-23 21:32:39 +00:00
Chip-Kerchner
10c77b0ff4 Fix compilation errors with later versions of GCC and use of MMA. 2021-02-22 15:01:47 -06:00
Christoph Hertzberg
73922b0174 Fixes Bug #1925. Packets should be passed by const reference, even to inline functions. 2021-02-20 18:56:42 +01:00
Antonio Sanchez
5f9cfb2529 Add missing adolc isinf/isnan.
Also modified cmake/FindAdolc.cmake to eliminate warnings, and added
search paths to match install layout.

Fixed: #2157
2021-02-19 22:26:56 +00:00
Christoph Hertzberg
ce4af0b38f Missing change regarding #1910 2021-02-19 20:51:35 +01:00
Christoph Hertzberg
a7749c09bc Bug #1910: Make SparseCholesky work for RowMajor matrices 2021-02-19 19:36:18 +01:00
Antonio Sánchez
128eebf05e Revert "add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if not HIPCC)."
This reverts commit 12fd3dd655
2021-02-19 17:09:16 +00:00
frgossen
33e0af0130 Return nan at poles of polygamma, digamma, and zeta if limit is not defined 2021-02-19 16:35:11 +00:00
Rasmus Munk Larsen
7f09d3487d Use the Cephes double subtraction trick in pexp<float> even when FMA is available. Otherwise the accuracy drops from 1 ulp to 3 ulp. 2021-02-18 20:49:18 +00:00
Masaki Murooka
12fd3dd655 add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if not HIPCC). 2021-02-17 22:55:47 +00:00
David Tellenbach
aa8b22e776 Bump to 3.4.99 2021-02-17 23:23:17 +01:00
David Tellenbach
5336ad8591 Define internal::make_unsigned for [unsigned]long long on macOS.
macOS defines int64_t as long long even for C++03 and therefore expects
a template specialization

  internal::make_unsigned<long long>,

for C++03. Since other platforms define int64_t as long for C++03 we
cannot add the specialization for all cases.
2021-02-17 23:03:10 +01:00
Antonio Sanchez
0845df7f77 Fix uninitialized warning on AVX. 2021-02-17 13:13:39 -08:00
Chip Kerchner
9b51dc7972 Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_product 2021-02-17 17:49:23 +00:00
Rasmus Munk Larsen
be0574e215 New accurate algorithm for pow(x,y). This version is accurate to 1.4 ulps for float, while still being 10x faster than std::pow for AVX512. A future change will introduce a specialization for double. 2021-02-17 02:50:32 +00:00
Antonio Sanchez
7ff0b7a980 Updated pfrexp implementation.
The original implementation fails for 0, denormals, inf, and NaN.

See #2150
2021-02-17 02:23:24 +00:00
David Tellenbach
9ad4096ccb Document possible inconsistencies when using Matrix<bool, ...> 2021-02-17 00:50:26 +01:00
Ashutosh Sharma
f702792a7c missing method in packetmath.h void ptranspose(PacketBlock<Packet16uc, 4>& kernel) 2021-02-16 16:33:59 +00:00
Jan van Dijk
db61b8d478 Avoid -Wunused warnings in NDEBUG builds.
In two places in SuperLUSupport.h, a local variable 'size' is
created that is used only inside an eigen_assert. Remove these,
just fetch the required values inside the assert statements.
This avoids annoying -Wunused warnings (and -Werror=unused errors)
in NDEBUG builds.
2021-02-12 18:35:35 +00:00
David Tellenbach
622c598944 Don't allow all test jobs to fail but only the currently failing ones. 2021-02-12 14:01:17 +01:00
Antonio Sanchez
90ee821c56 Use vrsqrts for rsqrt Newton iterations.
It's slightly faster and slightly more accurate, allowing our current
packetmath tests to pass for sqrt with a single iteration.
2021-02-11 11:33:51 -08:00
Antonio Sanchez
9fde9cce5d Adjust bounds for pexp_float/double
The original clamping bounds on `_x` actually produce finite values:
```
  exp(88.3762626647950) = 2.40614e+38 < 3.40282e+38

  exp(709.437) = 1.27226e+308 < 1.79769e+308
```
so with an accurate `ldexp` implementation, `pexp` fails for large
inputs, producing finite values instead of `inf`.

This adjusts the bounds slightly outside the finite range so that
the output will overflow to +/- `inf` as expected.
2021-02-10 22:48:05 +00:00
Antonio Sanchez
4cb563a01e Fix ldexp implementations.
The previous implementations produced garbage values if the exponent did
not fit within the exponent bits.  See #2131 for a complete discussion,
and !375 for other possible implementations.

Here we implement the 4-factor version. See `pldexp_impl` in
`GenericPacketMathFunctions.h` for a full description.

The SSE `pcmp*` methods were moved down since `pcmp_le<Packet4i>`
requires `por`.

Left as a "TODO" is to delegate to a faster version if we know the
exponent does fit within the exponent bits.

Fixes #2131.
2021-02-10 22:45:41 +00:00
Ashutosh Sharma
7eb07da538 loop less ptranspose 2021-02-10 10:21:37 -08:00
David Tellenbach
36200b7855 Remove vim specific comments to recognoize correct file-type.
As discussed in #2143 we remove editor specific comments.
2021-02-09 09:13:09 +01:00
David Tellenbach
54589635ad Replace nullptr by NULL in SparseLU.h to be C++03 compliant. 2021-02-09 09:08:06 +01:00
Ralf Hannemann-Tamas
984d010b7b add specialization of check_sparse_solving() for SuperLU solver, in order to test adjoint and transpose solves 2021-02-08 22:00:31 +00:00
Nikolaus Demmel
b578930657 Fix documentation typos in LDLT.h 2021-02-08 21:07:29 +00:00
Antonio Sanchez
66841ea070 Enable bdcsvd on host.
Currently if compiled by NVCC, the `MatrixBase::bdcSvd()` implementation
is skipped, leading to a linker error.  This prevents it from running on
the host as well.

Seems it was disabled 6 years ago (5384e891) to match `jacobiSvd`, but
`jacobiSvd` is now enabled on host.  Tested and runs fine on host, but
will not compile/run for device (though it's not labelled as a device
function, so this should be fine).

Fixes #2139
2021-02-08 12:56:23 -08:00
Rasmus Munk Larsen
6e3b795f81 Add more tests for pow and fix a corner case for huge exponent where the result is always zero or infinite unless x is one. 2021-02-05 16:58:49 -08:00
Antonio Sanchez
abcde69a79 Disable vectorized pow for half/bfloat16.
We are potentially seeing some accuracy issues with these.  Ideally we
would hand off to `float`, but that's not trivial with the current
setup.

We may want to consider adding `ppow<Packet>` and `HasPow`, so
implementations can more easily specialize this.
2021-02-05 12:17:34 -08:00
Antonio Sanchez
f85038b7f3 Fix excessive GEBP register spilling for 32-bit NEON.
Clang does a poor job of optimizing the GEBP microkernel on 32-bit ARM,
leading to excessive 16-byte register spills, slowing down basic f32
matrix multiplication by approx 50%.

By specializing `gebp_traits`, we can eliminate the register spills.
Volatile inline ASM both acts as a barrier to prevent reordering and
enforces strict register use. In a simple f32 matrix multiply example,
this modification reduces 16-byte spills from 109 instances to zero,
leading to a 1.5x speed increase (search for `16-byte Spill` in the
assembly in https://godbolt.org/z/chsPbE).

This is a replacement of !379.  See there for further discussion.

Also moved `gebp_traits` specializations for NEON to
`Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h` to be alongside
other NEON-specific code.

Fixes #2138.
2021-02-03 09:01:48 -08:00
Antonio Sanchez
56c8b14d87 Eliminate implicit conversions from float to double. 2021-02-01 15:31:01 -08:00
Antonio Sanchez
fb4548e27b Implement bit_* for device.
Unfortunately `std::bit_and` and the like are host-only functions prior
to c++14 (since they are not `constexpr`).  They also never exist in the
global namespace, so the current implementation  always fails to compile via
NVCC - since `EIGEN_USING_STD` tries to import the symbol from the global
namespace on device.

To overcome these limitations, we implement these functionals here.
2021-02-01 13:27:45 -08:00
Antonio Sanchez
1615a27993 Fix altivec packetmath.
Allows the altivec packetmath tests to pass.  There were a few issues:
- `pstoreu` was missing MSQ on `_BIG_ENDIAN` systems
- `cmp_*` didn't properly handle conversion of bool flags (0x7FC instead
of 0xFFFF)
- `pfrexp` needed to set the `exponent` argument.

Related to !370, #2128

cc: @ChipKerchner @pdrocaldeira

Tested on `_BIG_ENDIAN` running on QEMU with VSX.  Couldn't figure out build
flags to get it to work for little endian.
2021-01-28 18:37:09 +00:00
Chip Kerchner
1414e2212c Fix clang compilation for AltiVec from previous check-in 2021-01-28 18:36:40 +00:00
David Tellenbach
170a504c2f Add the following functions
DenseBase::setConstant(NoChange_t, Index, const Scalar&)
  DenseBase::setConstant(Index, NoChange_t, const Scalar&)

to close #663.
2021-01-28 15:13:07 +01:00
David Tellenbach
598e1b6e54 Add the following functions:
DenseBase::setZero(NoChange_t, Index)
  DenseBase::setZero(Index, NoChange_t)
  DenseBase::setOnes(NoChange_t, Index)
  DenseBase::setOnes(Index, NoChange_t)
  DenseBase::setRandom(NoChange_t, Index)
  DenseBase::setRandom(Index, NoChange_t)

This closes #663.
2021-01-28 01:10:36 +01:00
Gael Guennebaud
0668c68b03 Allow for negative strides.
Note that using a stride of -1 is still not possible because it would
clash with the definition of Eigen::Dynamic.

This fixes #747.
2021-01-27 23:32:12 +01:00
Samir Benmendil
288d456c29 Replace language_support module with builtin CheckLanguage
The workaround_9220 function was introduced a long time ago to
workaround a CMake issue with enable_language(OPTIONAL). Since then
CMake has clarified that the OPTIONAL keywords has not been
implemented[0].

A CheckLanguage module is now provided with CMake to check if a language
can be enabled. Use that instead.

[0] https://cmake.org/cmake/help/v3.18/command/enable_language.html
2021-01-27 13:26:40 +00:00
Antonio Sanchez
3f4684f87d Include <cstdint> in one place, remove custom typedefs
Originating from
[this SO issue](https://stackoverflow.com/questions/65901014/how-to-solve-this-all-error-2-in-this-case),
some win32 compilers define `__int32` as a `long`, but MinGW defines
`std::int32_t` as an `int`, leading to a type conflict.

To avoid this, we remove the custom `typedef` definitions for win32.  The
Tensor module requires C++11 anyways, so we are guaranteed to have
included `<cstdint>` already in `Eigen/Core`.

Also re-arranged the headers to only include `<cstdint>` in one place to
avoid this type of error again.
2021-01-26 14:23:05 -08:00
Chip Kerchner
0784d9f87b Fix sqrt, ldexp and frexp compilation errors. 2021-01-25 15:22:19 -06:00
Gmc2
a4edb1079c fix test of ExtractVolumePatchesOp 2021-01-25 03:23:46 +00:00
Antonio Sanchez
4c42d5ee41 Eliminate implicit conversion warning in test/array_cwise.cpp 2021-01-23 11:54:00 -08:00
Antonio Sanchez
e0d13ead90 Replace std::isnan with numext::isnan for c++03 2021-01-23 11:02:35 -08:00
Florian Maurin
c35965b381 Remove unused variable in SparseLU.h 2021-01-22 22:24:11 +00:00
Antonio Sanchez
f0e46ed5d4 Fix pow and other cwise ops for half/bfloat16.
The new `generic_pow` implementation was failing for half/bfloat16 since
their construction from int/float is not `constexpr`. Modified
in `GenericPacketMathFunctions` to remove `constexpr`.

While adding tests for half/bfloat16, found other issues related to
implicit conversions.

Also needed to implement `numext::arg` for non-integer, non-complex,
non-float/double/long double types.  These seem to be  implicitly
converted to `std::complex<T>`, which then fails for half/bfloat16.
2021-01-22 11:10:54 -08:00
Antonio Sanchez
f19bcffee6 Specialize std::complex operators for use on GPU device.
NVCC and older versions of clang do not fully support `std::complex` on device,
leading to either compile errors (Cannot call `__host__` function) or worse,
runtime errors (Illegal instruction).  For most functions, we can
implement specialized `numext` versions. Here we specialize the standard
operators (with the exception of stream operators and member function operators
with a scalar that are already specialized in `<complex>`) so they can be used
in device code as well.

To import these operators into the current scope, use
`EIGEN_USING_STD_COMPLEX_OPERATORS`. By default, these are imported into
the `Eigen`, `Eigen:internal`, and `Eigen::numext` namespaces.

This allow us to remove specializations of the
sum/difference/product/quotient ops, and allow us to treat complex
numbers like most other scalars (e.g. in tests).
2021-01-22 18:19:19 +00:00
David Tellenbach
65e2169c45 Add support for Arm SVE
This patch adds support for Arm's new vector extension SVE (Scalable Vector Extension). In contrast to other vector extensions that are supported by Eigen, SVE types are inherently *sizeless*. For the use in Eigen we fix their size at compile-time (note that this is not necessary in general, SVE is *length agnostic*).

During compilation the flag `-msve-vector-bits=N` has to be set where `N` is a power of two in the range of `128`to `2048`, indicating the length of an SVE vector.

Since SVE is rather young, we decided to disable it by default even if it would be available. A user has to enable it explicitly by defining `EIGEN_ARM64_USE_SVE`.

This patch introduces the packet types `PacketXf` and `PacketXi` for packets of `float` and `int32_t` respectively. The size of these packets depends on the SVE vector length. E.g. if `-msve-vector-bits=512` is set, `PacketXf` will contain `512/32 = 16` elements.

This MR is joint work with Miguel Tairum <miguel.tairum@arm.com>.
2021-01-21 21:11:57 +00:00
Antonio Sanchez
b2126fd6b5 Fix pfrexp/pldexp for half.
The recent addition of vectorized pow (!330) relies on `pfrexp` and
`pldexp`.  This was missing for `Eigen::half` and `Eigen::bfloat16`.
Adding tests for these packet ops also exposed an issue with handling
negative values in `pfrexp`, returning an incorrect exponent.

Added the missing implementations, corrected the exponent in `pfrexp1`,
and added `packetmath` tests.
2021-01-21 19:32:28 +00:00
Antonio Sanchez
25d8498f8b Fix stable_norm_1 test.
Test enters an infinite loop if size is 1x1 when choosing to select
unique indices for adding `inf` and `NaN` to the input. Here we
revert to non-unique indices, and split the `hypotNorm` check into
two cases: one where both `inf` and `NaN` are added, and one where
only `NaN` is added.
2021-01-21 09:44:42 -08:00
David Tellenbach
660c6b857c Remove std::cerr in iterative solver since we don't have iostream.
This fixes #2123
2021-01-21 11:40:05 +01:00
Antonio Sanchez
d5b7981119 Fix signed-unsigned comparison.
Hex literals are interpreted as unsigned, leading to a comparison between
signed max supported function `abcd[0]`  (which was negative) to the unsigned
literal `0x80000006`.  Should not change result since signed is
implicitly converted to unsigned for the comparison, but eliminates the
warning.
2021-01-20 08:34:00 -08:00
Ivan Popivanov
e409795d6b Proper CPUID 2021-01-18 17:10:11 +00:00
Rasmus Munk Larsen
cdd8fdc32e Vectorize pow(x, y). This closes https://gitlab.com/libeigen/eigen/-/issues/2085, which also contains a description of the algorithm.
I ran some testing (comparing to `std::pow(double(x), double(y)))` for `x` in the set of all (positive) floats in the interval `[std::sqrt(std::numeric_limits<float>::min()), std::sqrt(std::numeric_limits<float>::max())]`, and `y` in `{2, sqrt(2), -sqrt(2)}` I get the following error statistics:

```
max_rel_error = 8.34405e-07
rms_rel_error = 2.76654e-07
```

If I widen the range to all normal float I see lower accuracy for arguments where the result is subnormal, e.g. for `y = sqrt(2)`:

```
max_rel_error = 0.666667
rms = 6.8727e-05
count = 1335165689
argmax = 2.56049e-32, 2.10195e-45 != 1.4013e-45
```

which seems reasonable, since these results are subnormals with only couple of significant bits left.
2021-01-18 13:25:16 +00:00
Antonio Sanchez
bde6741641 Improved std::complex sqrt and rsqrt.
Replaces `std::sqrt` with `complex_sqrt` for all platforms (previously
`complex_sqrt` was only used for CUDA and MSVC), and implements
custom `complex_rsqrt`.

Also introduces `numext::rsqrt` to simplify implementation, and modified
`numext::hypot` to adhere to IEEE IEC 6059 for special cases.

The `complex_sqrt` and `complex_rsqrt` implementations were found to be
significantly faster than `std::sqrt<std::complex<T>>` and
`1/numext::sqrt<std::complex<T>>`.

Benchmark file attached.
```
GCC 10, Intel Xeon, x86_64:
---------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations
---------------------------------------------------------------------------
BM_Sqrt<std::complex<float>>           9.21 ns         9.21 ns     73225448
BM_StdSqrt<std::complex<float>>        17.1 ns         17.1 ns     40966545
BM_Sqrt<std::complex<double>>          8.53 ns         8.53 ns     81111062
BM_StdSqrt<std::complex<double>>       21.5 ns         21.5 ns     32757248
BM_Rsqrt<std::complex<float>>          10.3 ns         10.3 ns     68047474
BM_DivSqrt<std::complex<float>>        16.3 ns         16.3 ns     42770127
BM_Rsqrt<std::complex<double>>         11.3 ns         11.3 ns     61322028
BM_DivSqrt<std::complex<double>>       16.5 ns         16.5 ns     42200711

Clang 11, Intel Xeon, x86_64:
---------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations
---------------------------------------------------------------------------
BM_Sqrt<std::complex<float>>           7.46 ns         7.45 ns     90742042
BM_StdSqrt<std::complex<float>>        16.6 ns         16.6 ns     42369878
BM_Sqrt<std::complex<double>>          8.49 ns         8.49 ns     81629030
BM_StdSqrt<std::complex<double>>       21.8 ns         21.7 ns     31809588
BM_Rsqrt<std::complex<float>>          8.39 ns         8.39 ns     82933666
BM_DivSqrt<std::complex<float>>        14.4 ns         14.4 ns     48638676
BM_Rsqrt<std::complex<double>>         9.83 ns         9.82 ns     70068956
BM_DivSqrt<std::complex<double>>       15.7 ns         15.7 ns     44487798

Clang 9, Pixel 2, aarch64:
---------------------------------------------------------------------------
Benchmark                                 Time             CPU   Iterations
---------------------------------------------------------------------------
BM_Sqrt<std::complex<float>>           24.2 ns         24.1 ns     28616031
BM_StdSqrt<std::complex<float>>         104 ns          103 ns      6826926
BM_Sqrt<std::complex<double>>          31.8 ns         31.8 ns     22157591
BM_StdSqrt<std::complex<double>>        128 ns          128 ns      5437375
BM_Rsqrt<std::complex<float>>          31.9 ns         31.8 ns     22384383
BM_DivSqrt<std::complex<float>>        99.2 ns         98.9 ns      7250438
BM_Rsqrt<std::complex<double>>         46.0 ns         45.8 ns     15338689
BM_DivSqrt<std::complex<double>>        119 ns          119 ns      5898944
```
2021-01-17 08:50:57 -08:00
Maozhou, Ge
21a8a2487c fix paddings of TensorVolumePatchOp 2021-01-15 11:51:49 +08:00
Guoqiang QI
38ae5353ab 1)provide a better generic paddsub op implementation
2)make paddsub op support the Packet2cf/Packet4f/Packet2f in NEON
3)make paddsub op support the Packet2cf/Packet4f in SSE
2021-01-13 22:54:03 +00:00
Antonio Sanchez
352f1422d3 Remove inf local variable.
Apparently `inf` is a macro on iOS for `std::numeric_limits<T>::infinity()`,
causing a compile error here. We don't need the local anyways since it's
only used in one spot.
2021-01-12 10:33:15 -08:00
Antonio Sanchez
2044084979 Remove TODO from Transform::computeScaleRotation()
Upon investigation, `JacobiSVD` is significantly faster than `BDCSVD`
for small matrices (twice as fast for 2x2, 20% faster for 3x3,
1% faster for 10x10).  Since the majority of cases will be small,
let's stick with `JacobiSVD`.  See !361.
2021-01-11 11:30:01 -08:00
Antonio Sanchez
3daf92c7a5 Transform::computeScalingRotation flush determinant to +/- 1.
In the previous code, in attempting to correct for a negative
determinant, we end up multiplying and dividing by a number that
is often very near, but not exactly +/-1.  By flushing to +/-1,
we can replace a division with a multiplication, and results
are more numerically consistent.
2021-01-11 10:13:38 -08:00
Antonio Sanchez
587fd6ab70 Only specialize complex sqrt_impl for CUDA if not MSVC.
We already specialize `sqrt_impl` on windows due to MSVC's mishandling
of `inf` (!355).
2021-01-11 09:15:45 -08:00
Deven Desai
2a6addb4f9 Fix for breakage in ROCm support - 210108
The following commit breaks ROCm support for Eigen
f149e0ebc3

All unit tests fail with the following error

```
Building HIPCC object test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o
In file included from /home/rocm-user/eigen/test/gpu_basic.cu:19:
In file included from /home/rocm-user/eigen/test/main.h:356:
In file included from /home/rocm-user/eigen/Eigen/QR:11:
In file included from /home/rocm-user/eigen/Eigen/Core:166:
/home/rocm-user/eigen/Eigen/src/Core/MathFunctionsImpl.h:105:35: error: __host__ __device__ function 'complex_sqrt' cannot overload __host__ function 'complex_sqrt'
EIGEN_DEVICE_FUNC std::complex<T> complex_sqrt(const std::complex<T>& z) {
                                  ^
/home/rocm-user/eigen/Eigen/src/Core/MathFunctions.h:342:38: note: previous declaration is here
template<typename T> std::complex<T> complex_sqrt(const std::complex<T>& a_x);
                                     ^
1 error generated when compiling for gfx900.
CMake Error at gpu_basic_generated_gpu_basic.cu.o.cmake:192 (message):
  Error generating file
  /home/rocm-user/eigen/build/test/CMakeFiles/gpu_basic.dir//./gpu_basic_generated_gpu_basic.cu.o

test/CMakeFiles/gpu_basic.dir/build.make:63: recipe for target 'test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o' failed
make[3]: *** [test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o] Error 1
CMakeFiles/Makefile2:16618: recipe for target 'test/CMakeFiles/gpu_basic.dir/all' failed
make[2]: *** [test/CMakeFiles/gpu_basic.dir/all] Error 2
CMakeFiles/Makefile2:16625: recipe for target 'test/CMakeFiles/gpu_basic.dir/rule' failed
make[1]: *** [test/CMakeFiles/gpu_basic.dir/rule] Error 2
Makefile:5401: recipe for target 'gpu_basic' failed
make: *** [gpu_basic] Error 2

```

The error message is accurate, and the fix (provided in thsi commit) is trivial.
2021-01-08 18:04:40 +00:00
Antonio Sanchez
f149e0ebc3 Fix MSVC complex sqrt and packetmath test.
MSVC incorrectly handles `inf` cases for `std::sqrt<std::complex<T>>`.
Here we replace it with a custom version (currently used on GPU).

Also fixed the `packetmath` test, which previously skipped several
corner cases since `CHECK_CWISE1` only tests the first `PacketSize`
elements.
2021-01-08 01:17:19 +00:00
Antonio Sanchez
8d9cfba799 Fix rand test for MSVC.
MSVC's uniform random number generator is not quite as uniform as
others, requiring a slightly wider threshold on the histogram test.
After inspecting histograms for several runs, there's no obvious
bias -- just some bins end up having slightly more less elements
(often > 2% but less than 2.5%).
2021-01-07 12:48:40 -08:00
Essex Edwards
e741b43668 Make Transform::computeRotationScaling(0,&S) continuous 2021-01-07 17:45:14 +00:00
David Tellenbach
0bdc0dba20 Add missing #endif directive in Macros.h 2021-01-07 12:32:41 +01:00
shrek1402
cb654b1c45 #define was defined incorrectly because the result_of function was deprecated in c++17 and removed in c++20. Also, EIGEN_COMP_MSVC (which is _MSC_VER) only affects result_of indirectly, which can cause errors. 2021-01-07 10:12:25 +00:00
Antonio Sanchez
52d1dd979a Fix Ref initialization.
Since `eigen_assert` is a macro, the statements can become noops (e.g.
when compiling for GPU), so they may not execute the contained logic -- which
in this case is the entire `Ref` construction.  We need to separate the assert
from statements which have consequences.

Fixes #2113
2021-01-06 13:14:20 -08:00
Antonio Sanchez
166fcdecdb Allow CwiseUnaryView to be used on device.
Added `EIGEN_DEVICE_FUNC` to methods.
2021-01-06 09:16:52 -08:00
Antonio Sanchez
bb1de9dbde Fix Ref Stride checks.
The existing `Ref` class failed to consider cases where the Ref's
`Stride` setting *could* match the underlying referred object's stride,
but **didn't** at runtime.  This led to trying to set invalid stride values,
causing runtime failures in some cases, and garbage due to mismatched
strides in others.

Here we add the missing runtime checks.  This involves computing the
strides necessary to align with the referred object's storage, and
verifying we can actually set those strides at runtime.

In the `const` case, if it *may* be possible to refer to the original
storage at compile-time but fails at runtime, then we defer to the
`construct(...)` method that makes a copy.

Added more tests to check these cases.

Fixes #2093.
2021-01-05 10:41:25 -08:00
Christoph Hertzberg
12dda34b15 Eliminate boolean product warnings by factoring out a
`combine_scalar_factors` helper function.
2021-01-05 18:15:30 +00:00
Antonio Sanchez
070d303d56 Add CUDA complex sqrt.
This is to support scalar `sqrt` of complex numbers `std::complex<T>` on
device, requested by Tensorflow folks.

Technically `std::complex` is not supported by NVCC on device
(though it is by clang), so the default `sqrt(std::complex<T>)` function only
works on the host. Here we create an overload to add back the
functionality.

Also modified the CMake file to add `--relaxed-constexpr` (or
equivalent) flag for NVCC to allow calling constexpr functions from
device functions, and added support for specifying compute architecture for
NVCC (was already available for clang).
2020-12-22 23:25:23 -08:00
rgreenblatt
fdf2ee62c5 Fix missing EIGEN_DEVICE_FUNC 2020-12-20 23:22:53 -05:00
Rasmus Munk Larsen
05754100fe * Add iterative psqrt<double> for AVX and SSE when FMA is available. This provides a ~10% speedup.
* Write iterative sqrt explicitly in terms of pmadd. This gives up to 7% speedup for psqrt<float> with AVX & SSE with FMA.
* Remove iterative psqrt<double> for NEON, because the initial rsqrt apprimation is not accurate enough for convergence in 2 Newton-Raphson steps and with 3 steps, just calling the builtin sqrt insn is faster.

The following benchmarks were compiled with clang "-O2 -fast-math -mfma" and with and without -mavx.

AVX+FMA (float)

name                      old cpu/op  new cpu/op  delta
BM_eigen_sqrt_float/1     1.08ns ± 0%  1.09ns ± 1%    ~
BM_eigen_sqrt_float/8     2.07ns ± 0%  2.08ns ± 1%    ~
BM_eigen_sqrt_float/64    12.4ns ± 0%  12.4ns ± 1%    ~
BM_eigen_sqrt_float/512   95.7ns ± 0%  95.5ns ± 0%    ~
BM_eigen_sqrt_float/4k     776ns ± 0%   763ns ± 0%  -1.67%
BM_eigen_sqrt_float/32k   6.57µs ± 1%  6.13µs ± 0%  -6.69%
BM_eigen_sqrt_float/256k  83.7µs ± 3%  83.3µs ± 2%    ~
BM_eigen_sqrt_float/1M     335µs ± 2%   332µs ± 2%    ~

SSE+FMA (float)
name                      old cpu/op  new cpu/op  delta
BM_eigen_sqrt_float/1     1.08ns ± 0%  1.09ns ± 0%    ~
BM_eigen_sqrt_float/8     2.07ns ± 0%  2.06ns ± 0%    ~
BM_eigen_sqrt_float/64    12.4ns ± 0%  12.4ns ± 1%    ~
BM_eigen_sqrt_float/512   95.7ns ± 0%  96.3ns ± 4%    ~
BM_eigen_sqrt_float/4k     774ns ± 0%   763ns ± 0%  -1.50%
BM_eigen_sqrt_float/32k   6.58µs ± 2%  6.11µs ± 0%  -7.06%
BM_eigen_sqrt_float/256k  82.7µs ± 1%  82.6µs ± 1%    ~
BM_eigen_sqrt_float/1M     330µs ± 1%   329µs ± 2%    ~

SSE+FMA (double)
BM_eigen_sqrt_double/1      1.63ns ± 0%  1.63ns ± 0%     ~
BM_eigen_sqrt_double/8      6.51ns ± 0%  6.08ns ± 0%   -6.68%
BM_eigen_sqrt_double/64     52.1ns ± 0%  46.5ns ± 1%  -10.65%
BM_eigen_sqrt_double/512     417ns ± 0%   374ns ± 1%  -10.29%
BM_eigen_sqrt_double/4k     3.33µs ± 0%  2.97µs ± 1%  -11.00%
BM_eigen_sqrt_double/32k    26.7µs ± 0%  23.7µs ± 0%  -11.07%
BM_eigen_sqrt_double/256k    213µs ± 0%   206µs ± 1%   -3.31%
BM_eigen_sqrt_double/1M      862µs ± 0%   870µs ± 2%   +0.96%

AVX+FMA (double)
name                        old cpu/op  new cpu/op  delta
BM_eigen_sqrt_double/1      1.63ns ± 0%  1.63ns ± 0%     ~
BM_eigen_sqrt_double/8      6.51ns ± 0%  6.06ns ± 0%   -6.95%
BM_eigen_sqrt_double/64     52.1ns ± 0%  46.5ns ± 1%  -10.80%
BM_eigen_sqrt_double/512     417ns ± 0%   373ns ± 1%  -10.59%
BM_eigen_sqrt_double/4k     3.33µs ± 0%  2.97µs ± 1%  -10.79%
BM_eigen_sqrt_double/32k    26.7µs ± 0%  23.8µs ± 0%  -10.94%
BM_eigen_sqrt_double/256k    214µs ± 0%   208µs ± 2%   -2.76%
BM_eigen_sqrt_double/1M      866µs ± 3%   923µs ± 7%     ~
2020-12-16 18:16:11 +00:00
Turing Eret
3bee9422d6 Merge branch 'lambdaknight/eigen-master' 2020-12-16 09:18:24 -07:00
Turing Eret
19e6496ce0 Replace call to FixedDimensions() with a singleton instance of
FixedDimensions.
2020-12-16 07:34:44 -07:00
Rasmus Munk Larsen
6cee8d347e Add an additional step of Newton-Raphson for psqrt<double> on Arm, which otherwise has an error of ~1000 ulps. 2020-12-15 04:06:41 +00:00
Turing Eret
bc7d1599fb TensorStorage with FixedDimensions now has zero instance memory overhead.
Removed m_dimension as instance member of TensorStorage with
FixedDimensions and instead use the template parameter. This
means that the sizeof a pure fixed-size storage is exactly
equal to the data it is storing.
2020-12-14 07:19:34 -07:00
Alexander Grund
cf0b5b0344 Remove code checking for CMake < 3.5
As the CMake version is at least 3.5 the code checking for earlier versions can be removed.
2020-12-14 09:57:44 +00:00
David Tellenbach
751f18f2c0 Remove comma at the end of enumeration list to silence C++03 warnings 2020-12-13 18:11:02 +01:00
Antonio Sanchez
5dc2fbabee Fix implicit cast to double.
Triggers `-Wimplicit-float-conversion`, causing a bunch of build errors
in Google due to `-Wall`.
2020-12-12 09:26:20 -08:00
Antonio Sanchez
55967f87d1 Fix NEON pmax<PropagateNumbers,Packet4bf>.
Simple typo, the max impl called pmin instead of pmax for floats.
2020-12-11 21:50:52 -08:00
Antonio Sanchez
839aa505c3 Fix typo in AVX512 packet math. 2020-12-11 21:35:44 -08:00
David Tellenbach
536c8a79f2 Remove unused macro in Half.h 2020-12-12 00:53:26 +01:00
Antonio Sanchez
8c9976d7f0 Fix more SSE/AVX packet conversions for peven.
MSVC doesn't like function-style casts and forces us to use intrinsics.
2020-12-11 15:46:42 -08:00
Antonio Sanchez
c6efc4e0ba Replace M_LOG2E and M_LN2 with custom macros.
For these to exist we would need to define `_USE_MATH_DEFINES` before
`cmath` or `math.h` is first included.  However, we don't
control the include order for projects outside Eigen, so even defining
the macro in `Eigen/Core` does not fix the issue for projects that
end up including `<cmath>` before Eigen does (explicitly or transitively).

To fix this, we define `EIGEN_LOG2E` and `EIGEN_LN2` ourselves.
2020-12-11 14:34:31 -08:00
Antonio Sanchez
e82722a4a7 Fix MSVC SSE casts.
MSVC doesn't like __m128(__m128i) c-style casts, so packets need to be
converted using intrinsic methods.
2020-12-11 08:52:59 -08:00
Deven Desai
f3d2ea48f5 Fix for broken ROCm/HIP Support
The following commit introduced a breakage in ROCm/HIP support for Eigen.

5ec4907434 (1958e65719641efe5483abc4ce0b61806270f6f3_525_517)

```
Building HIPCC object test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o
In file included from /home/rocm-user/eigen/test/gpu_basic.cu:20:
In file included from /home/rocm-user/eigen/test/main.h:356:
In file included from /home/rocm-user/eigen/Eigen/QR:11:
In file included from /home/rocm-user/eigen/Eigen/Core:222:
/home/rocm-user/eigen/Eigen/src/Core/arch/GPU/PacketMath.h:556:10: error: use of undeclared identifier 'half2half2'; did you mean '__half2half2'?
  return half2half2(from);
         ^~~~~~~~~~
         __half2half2
/opt/rocm/hip/include/hip/hcc_detail/hip_fp16.h:547:21: note: '__half2half2' declared here
            __half2 __half2half2(__half x)
                    ^
1 error generated when compiling for gfx900.

```

The cause seems to be a copy-paster error, and the fix is trivial
2020-12-11 16:14:57 +00:00
David Tellenbach
c7eb3a74cb Don't guard psqrt for std::complex<float> with EIGEN_ARCH_ARM64 2020-12-11 12:41:52 +01:00
Everton Constantino
bccf055a7c Add Armv8 guard on PropagateNumbers implementation. 2020-12-10 22:01:55 -03:00
Antonio Sanchez
82c0c18a83 Remove private access of std::deque::_M_impl.
This no longer works on gcc or clang, so we should just remove the hack.
The default should compile to similar code anyways.
2020-12-10 14:59:34 -08:00
David Tellenbach
00be0a7ff3 Fix vectorization of complex sqrt on NEON 2020-12-10 15:23:23 +00:00
David Tellenbach
8eb461a431 Remove comma at end of enumerator list in NEON PacketMath 2020-12-10 15:22:55 +01:00
David Tellenbach
2e8f850c78 Fix a typo in SparseMatrix documentation.
This fixes issue #2091.
2020-12-09 14:48:24 +01:00
Rasmus Munk Larsen
125cc9a5df Implement vectorized complex square root.
Closes #1905

Measured speedup for sqrt of `complex<float>` on Skylake:

SSE:
```
name                      old time/op             new time/op  delta
BM_eigen_sqrt_ctype/1     49.4ns ± 0%             54.3ns ± 0%  +10.01%
BM_eigen_sqrt_ctype/8      332ns ± 0%               50ns ± 1%  -84.97%
BM_eigen_sqrt_ctype/64    2.81µs ± 1%             0.38µs ± 0%  -86.49%
BM_eigen_sqrt_ctype/512   23.8µs ± 0%              3.0µs ± 0%  -87.32%
BM_eigen_sqrt_ctype/4k     202µs ± 0%               24µs ± 2%  -88.03%
BM_eigen_sqrt_ctype/32k   1.63ms ± 0%             0.19ms ± 0%  -88.18%
BM_eigen_sqrt_ctype/256k  13.0ms ± 0%              1.5ms ± 1%  -88.20%
BM_eigen_sqrt_ctype/1M    52.1ms ± 0%              6.2ms ± 0%  -88.18%
```

AVX2:
```
name                      old cpu/op  new cpu/op  delta
BM_eigen_sqrt_ctype/1     53.6ns ± 0%  55.6ns ± 0%   +3.71%
BM_eigen_sqrt_ctype/8      334ns ± 0%    27ns ± 0%  -91.86%
BM_eigen_sqrt_ctype/64    2.79µs ± 0%  0.22µs ± 2%  -92.28%
BM_eigen_sqrt_ctype/512   23.8µs ± 1%   1.7µs ± 1%  -92.81%
BM_eigen_sqrt_ctype/4k     201µs ± 0%    14µs ± 1%  -93.24%
BM_eigen_sqrt_ctype/32k   1.62ms ± 0%  0.11ms ± 1%  -93.29%
BM_eigen_sqrt_ctype/256k  13.0ms ± 0%   0.9ms ± 1%  -93.31%
BM_eigen_sqrt_ctype/1M    52.0ms ± 0%   3.5ms ± 1%  -93.31%
```

AVX512:
```
name                      old cpu/op  new cpu/op  delta
BM_eigen_sqrt_ctype/1     53.7ns ± 0%  56.2ns ± 1%   +4.75%
BM_eigen_sqrt_ctype/8      334ns ± 0%    18ns ± 2%  -94.63%
BM_eigen_sqrt_ctype/64    2.79µs ± 0%  0.12µs ± 1%  -95.54%
BM_eigen_sqrt_ctype/512   23.9µs ± 1%   1.0µs ± 1%  -95.89%
BM_eigen_sqrt_ctype/4k     202µs ± 0%     8µs ± 1%  -96.13%
BM_eigen_sqrt_ctype/32k   1.63ms ± 0%  0.06ms ± 1%  -96.15%
BM_eigen_sqrt_ctype/256k  13.0ms ± 0%   0.5ms ± 4%  -96.11%
BM_eigen_sqrt_ctype/1M    52.1ms ± 0%   2.0ms ± 1%  -96.13%
```
2020-12-08 18:13:35 -08:00
Antonio Sanchez
8cfe0db108 Fix host/device calls for __half.
The previous code had `__host__ __device__` functions calling `__device__`
functions (e.g. `__low2half`) which caused build failures in tensorflow.
Also tried to simplify the `#ifdef` guards to make them more clear.
2020-12-08 20:31:02 +00:00
Everton Constantino
baf9d762b7 - Enabling PropagateNaN and PropagateNumbers for NEON.
- Adding propagate tests to bfloat16.
2020-12-08 17:05:05 +00:00
Antonio Sanchez
634bd79b0e Fix unused warning on new dense_assignment_loop impl. 2020-12-07 19:14:21 -08:00
Antonio Sanchez
655c3a4042 Add specialization for compile-time zero-sized dense assignment.
In the current `dense_assignment_loop` implementations, if the
destination's inner or outer size is zero at compile time and if the kernel
involves a product, we currently get a compile error (#2080).  This is
triggered by attempting to multiply a non-existent row by a column (or
vice-versa).

To address this, we add a specialization for zero-sized assignments
(`AllAtOnceTraversal`) which evaluates to a no-op. We also add a static
check to ensure the size is in-fact zero. This now seems to be the only
existing use of `AllAtOnceTraversal`.

Fixes #2080.
2020-12-07 08:38:43 -08:00
Antonio Sanchez
5ec4907434 Clean up #ifs in GPU PacketPath.
Removed redundant checks and redundant code for CUDA/HIP.

Note: there are several issues here of calling `__device__` functions
from `__host__ __device__` functions, in particular `__low2half`.
We do not address that here -- only modifying this file enough
to get our current tests to compile.

Fixed: #1847
2020-12-04 16:14:03 -08:00
Rasmus Munk Larsen
f9fac1d5b0 Add log2() to Eigen. 2020-12-04 21:45:09 +00:00
Antonio Sanchez
2dbac2f99f Fix bad NEON fp16 check 2020-12-04 13:42:18 -08:00
Antonio Sanchez
e2f21465fe Special function implementations for half/bfloat16 packets.
Current implementations fail to consider half-float packets, only
half-float scalars.  Added specializations for packets on AVX, AVX512 and
NEON.  Added tests to `special_packetmath`.

The current `special_functions` tests would fail for half and bfloat16 due to
lack of precision. The NEON tests also fail with precision issues and
due to different handling of `sqrt(inf)`, so special functions bessel, ndtri
have been disabled.

Tested with AVX, AVX512.
2020-12-04 10:16:29 -08:00
David Tellenbach
305b8bd277 Remove duplicate #if clause 2020-12-04 18:55:46 +01:00
Antonio Sanchez
9ee9ac81de Fix shfl* macros for CUDA/HIP
The `shfl*` functions are `__device__` only, and adjusted `#ifdef`s so
they are defined whenever the corresponding CUDA/HIP ones are.

Also changed the HIP/CUDA<9.0 versions to cast to int instead of
doing the conversion `half`<->`float`.

Fixes #2083
2020-12-04 17:18:32 +00:00
shrek1402
a9a2f2bebf The function 'prefetch' did not work correctly on the win64 platform 2020-12-04 17:18:08 +00:00
Rasmus Munk Larsen
f23dc5b971 Revert "Add log2() operator to Eigen"
This reverts commit 4d91519a9b.
2020-12-03 14:32:45 -08:00
Rasmus Munk Larsen
4d91519a9b Add log2() operator to Eigen 2020-12-03 22:31:44 +00:00
Rasmus Munk Larsen
25d8ae7465 Small cleanup of generic plog implementations:
Adding the term e*ln(2) is split into two step for no obvious reason.
This dates back to the original Cephes code from which the algorithm is adapted.
It appears that this was done in Cephes to prevent the compiler from reordering
the addition of the 3 terms in the approximation

  log(1+x) ~= x - 0.5*x^2 + x^3*P(x)/Q(x)

which must be added in reverse order since |x| < (sqrt(2)-1).

This allows rewriting the code to just 2 pmadd and 1 padd instructions,
which on a Skylake processor speeds up the code by 5-7%.
2020-12-03 19:40:40 +00:00
Antonio Sanchez
eb4d4ae070 Include chrono in main for c++11.
Hack to fix tensor tests, since min/max are overridden by `main.h`.
2020-12-03 11:27:32 -08:00
Rasmus Munk Larsen
71c85df4c1 Clean up the Tensor header and get rid of the EIGEN_SLEEP macro. 2020-12-02 11:04:04 -08:00
Antonio Sanchez
70fbcf82ed Fix typo in F32MaskToBf16Mask. 2020-12-02 07:58:34 -08:00
Antonio Sanchez
2627e2f2e6 Fix neon cmp* functions for bf16.
The current impl corrupts the comparison masks when converting
from float back to bfloat16.  The resulting masks are then
no longer all zeros or all ones, which breaks when used with
`pselect` (e.g. in `pmin<PropagateNumbers>`).  This was
causing `packetmath_15` to fail on arm.

Introducing a simple `F32MaskToBf16Mask` corrects this (takes
the lower 16-bits for each float mask).
2020-12-02 01:29:34 +00:00
Antonio Sanchez
ddd48b242c Implement CUDA __shfl* for Eigen::half
Prior to this fix, `TensorContractionGpu` and the `cxx11_tensor_of_float16_gpu`
test are broken, as well as several ops in Tensorflow. The gpu functions
`__shfl*` became ambiguous now that `Eigen::half` implicitly converts to float.
Here we add the required specializations.
2020-12-01 14:36:52 -08:00
Rasmus Munk Larsen
e57281a741 Fix a few issues for AVX512. This change enables vectorized versions of log, exp, log1p, expm1 when AVX512DQ is not available. 2020-12-01 11:31:47 -08:00
Antonio Sanchez
1992af3de2 Fix #2077, EIGEN_CONSTEXPR in Half.
`bit_cast` cannot be `constexpr`, so we need to remove `EIGEN_CONSTEXPR` from
`raw_half_as_uint16(...)`.  This shouldn't affect anything else, since
it is only used in `a bit_cast<uint16_t,half>()` which is not itself
`constexpr`.

Fixes #2077.
2020-12-01 03:10:21 +00:00
acxz
7b80609d49 add EIGEN_DEVICE_FUNC to methods 2020-12-01 03:08:47 +00:00
Antonio Sanchez
89f90b585d AVX512 missing ops.
This allows the `packetmath` tests to pass for AVX512 on skylake.
Made `half` and `bfloat16` consistent in terms of ops they support.

Note the `log` tests are currently disabled for `bfloat16` since
they fail due to poor precision (they were previously disabled for
`Packet8bf` via test function specialization -- I just removed that
specialization and disabled it in the generic test).
2020-11-30 16:28:57 +00:00
Florian Maurin
c5985c46f5 Fix typo in doc 2020-11-30 10:53:29 +00:00
Jim Lersch
68f69414f7 Workaround for doxygen class template titles in which the template
part of the class signature is lost due to a problem with forward
declarations.  The problem is probably caused by doxygen bug #7689.
It is confirmed to be fixed in doxygen >= 1.8.19.
2020-11-27 19:52:16 -07:00
Jim Lersch
a7170f2aca Fix doxygen class blocks that were not associated with the correct classes. 2020-11-27 08:48:11 -07:00
David Tellenbach
550e8f8f57 Include CMakeDependentOption to be able to use cmake_dependent_option 2020-11-27 13:21:49 +01:00
Bowie Owens
9842366bba Make inclusion of doc sub-directory optional by adjusting options.
Allows exclusion of doc and related targets to help when using eigen via add_subdirectory().

Requested by:

https://gitlab.com/libeigen/eigen/-/issues/1842

Also required making EIGEN_TEST_BUILD_DOCUMENTATION a dependent option on EIGEN_BUILD_DOC. This ensures documentation targets are properly defined when EIGEN_TEST_BUILD_DOCUMENTATION is ON.
2020-11-27 08:11:49 +11:00
filippobrizzi
aa56e1d980 check for include dirs set 2020-11-26 10:22:46 +00:00
Andreas Krebbel
1e74f93d55 Fix some packet-functions in the IBM ZVector packet-math. 2020-11-25 14:11:23 +00:00
Rasmus Munk Larsen
79818216ed Revert "Fix Half NaN definition and test."
This reverts commit c770746d70.
2020-11-24 12:57:28 -08:00
Rasmus Munk Larsen
c770746d70 Fix Half NaN definition and test.
The `half_float` test was failing with `-mcpu=cortex-a55` (native `__fp16`) due
to a bad NaN bit-pattern comparison (in the case of casting a float to `__fp16`,
the signaling `NaN` is quieted). There was also an inconsistency between
`numeric_limits<half>::quiet_NaN()` and `NumTraits::quiet_NaN()`.  Here we
correct the inconsistency and compare NaNs according to the IEEE 754
definition.

Also modified the `bfloat16_float` test to match.

Tested with `cortex-a53` and `cortex-a55`.
2020-11-24 20:53:07 +00:00
Antonio Sanchez
22f67b5958 Fix boolean float conversion and product warnings.
This fixes some gcc warnings such as:
```
Eigen/src/Core/GenericPacketMath.h:655:63: warning: implicit conversion turns floating-point number into bool: 'typename __gnu_cxx::__enable_if<__is_integer<bool>::__value, double>::__type' (aka 'double') to 'bool' [-Wimplicit-conversion-floating-point-to-bool]
    Packet psqrt(const Packet& a) { EIGEN_USING_STD(sqrt); return sqrt(a); }
```

Details:

- Added `scalar_sqrt_op<bool>` (`-Wimplicit-conversion-floating-point-to-bool`).

- Added `scalar_square_op<bool>` and `scalar_cube_op<bool>`
specializations (`-Wint-in-bool-context`)

- Deprecated above specialized ops for bool.

- Modified `cxx11_tensor_block_eval` to specialize generator for
booleans (`-Wint-in-bool-context`) and to use `abs` instead of `square` to
avoid deprecated bool ops.
2020-11-24 20:20:36 +00:00
Antonio Sanchez
a3b300f1af Implement missing AVX half ops.
Minimal implementation of AVX `Eigen::half` ops to bring in line
with `bfloat16`.  Allows `packetmath_13` to pass.

Also adjusted `bfloat16` packet traits to match the supported set
of ops (e.g. Bessel is not actually implemented).
2020-11-24 16:46:41 +00:00
Antonio Sanchez
38abf2be42 Fix Half NaN definition and test.
The `half_float` test was failing with `-mcpu=cortex-a55` (native `__fp16`) due
to a bad NaN bit-pattern comparison (in the case of casting a float to `__fp16`,
the signaling `NaN` is quieted). There was also an inconsistency between
`numeric_limits<half>::quiet_NaN()` and `NumTraits::quiet_NaN()`.  Here we
correct the inconsistency and compare NaNs according to the IEEE 754
definition.

Also modified the `bfloat16_float` test to match.

Tested with `cortex-a53` and `cortex-a55`.
2020-11-23 14:13:59 -08:00
Antonio Sanchez
4cf01d2cf5 Update AVX half packets, disable test.
The AVX half implementation is incomplete, causing the `packetmath_13` test
to fail.  This disables the test.

Also refactored the existing AVX implementation to use `bit_cast`
instead of direct access to `.x`.
2020-11-21 09:05:10 -08:00
Antonio Sanchez
fd1dcb6b45 Fixes duplicate symbol when building blas
Missing inline breaks blas, since symbol generated in
`complex_single.cpp`, `complex_double.cpp`, `single.cpp`, `double.cpp`

Changed rest of inlines to `EIGEN_STRONG_INLINE`.
2020-11-20 09:37:40 -08:00
David Tellenbach
6c9c3f9a1a Remove explicit casts from Eigen::half and Eigen::bfloat16 to bool
Both, Eigen::half and Eigen::Bfloat16 are implicitly convertible to
float and can hence be converted to bool via the conversion chain

  Eigen::{half,bfloat16} -> float -> bool

We thus remove the explicit cast operator to bool.
2020-11-19 18:49:09 +01:00
Antonio Sanchez
a8fdcae55d Fix sparse_extra_3, disable counting temporaries for testing DynamicSparseMatrix.
Multiplication of column-major `DynamicSparseMatrix`es involves three
temporaries:
- two for transposing twice to sort the coefficients
(`ConservativeSparseSparseProduct.h`, L160-161)
- one for a final copy assignment (`SparseAssign.h`, L108)
The latter is avoided in an optimization for `SparseMatrix`.

Since `DynamicSparseMatrix` is deprecated in favor of `SparseMatrix`, it's not
worth the effort to optimize further, so I simply disabled counting
temporaries via a macro.

Note that due to the inclusion of `sparse_product.cpp`, the `sparse_extra`
tests actually re-run all the original `sparse_product` tests as well.

We may want to simply drop the `DynamicSparseMatrix` tests altogether, which
would eliminate the test duplication.

Related to #2048
2020-11-18 23:15:33 +00:00
David Tellenbach
11e4056f6b Re-enable Arm Neon Eigen::half packets of size 8
- Add predux_half_dowto4
- Remove explicit casts in Half.h to match the behaviour of BFloat16.h
- Enable more packetmath tests for Eigen::half
2020-11-18 23:02:21 +00:00
Antonio Sanchez
17268b155d Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom
The existing `TensorRandom.h` implementation makes the assumption that
`half` (`bfloat16`) has a `uint16_t` member `x` (`value`), which is not
always true. This currently fails on arm64, where `x` has type `__fp16`.
Added `bit_cast` specializations to allow casting to/from `uint16_t`
for both `half` and `bfloat16`.  Also added tests in
`half_float`, `bfloat16_float`, and `cxx11_tensor_random` to catch
these errors in the future.
2020-11-18 20:32:35 +00:00
Antonio Sanchez
41d5d5334b Initialize primitives to fix -Wuninitialized-const-reference.
The `meta` test generates warnings with the latest version of clang due
to passing uninitialized variables as const reference arguments.
```
test/meta.cpp:102:45: error: variable 'f' is uninitialized when passed as a const reference argument here [-Werror,-Wuninitialized-const-reference]
    VERIFY(( check_is_convertible(a.dot(b), f) ));
```
We don't actually use the variables, but initializing them eliminates the
new warning.

Fixes #2067.
2020-11-18 20:23:20 +00:00
Antonio Sanchez
3669498f5a Fix rule-of-3 for the Tensor module.
Adds copy constructors to Tensor ops, inherits assignment operators from
`TensorBase`.

Addresses #1863
2020-11-18 18:14:53 +00:00
Antonio Sanchez
60218829b7 EOF newline added to InverseSize4.
Causing build breakages due to `-Wnewline-eof -Werror` that seems to be
common across Google.
2020-11-18 07:58:33 -08:00
Rasmus Munk Larsen
2d63706545 Add missing parens around macro argument. 2020-11-18 00:24:19 +00:00
Rasmus Munk Larsen
6bba58f109 Replace SSE_SHUFFLE_MASK macro with shuffle_mask. 2020-11-17 15:28:37 -08:00
David Tellenbach
e9b55c4db8 Avoid promotion of Arm __fp16 to float in Neon PacketMath
Using overloaded arithmetic operators for Arm __fp16 always
causes a promotion to float. We replace operator* by vmulh_f16
to avoid this.
2020-11-17 20:19:44 +01:00
Antonio Sanchez
117a4c0617 Fix missing EIGEN_CONSTEXPR pop_macro in Half.
`EIGEN_CONSTEXPR` is getting pushed but not popped in `Half.h` if
`EIGEN_HAS_ARM64_FP16_SCALAR_ARITHMETIC` is defined.
2020-11-17 08:29:33 -08:00
Guoqiang QI
394f564055 Unify Inverse_SSE.h and Inverse_NEON.h into a single generic implementation using PacketMath. 2020-11-17 12:27:01 +00:00
Antonio Sanchez
8e9cc5b10a Eliminate double-promotion warnings.
Clang currently complains about implicit conversions, e.g.
```
test/packetmath.cpp:680:59: warning: implicit conversion increases floating-point precision: 'typename Eigen::internal::random_retval<typename Eigen::internal::global_math_functions_filtering_base<double>::type>::type' (aka 'double') to 'long double' [-Wdouble-promotion]
          data1[0] = Scalar((2 * k + k1) * EIGEN_PI / 2 * internal::random<double>(0.8, 1.2));
                                                        ~ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test/packetmath.cpp:681:40: warning: implicit conversion increases floating-point precision: 'float' to 'long double' [-Wdouble-promotion]
          data1[1] = Scalar((2 * k + 2 + k1) * EIGEN_PI / 2 * internal::random<double>(0.8, 1.2));
```

Modified to explicitly cast to double.
2020-11-16 10:39:09 -08:00
acxz
9175f50d6f Add EIGEN_DEVICE_FUNC to TranspositionsBase
Fixes #2057.
2020-11-16 15:37:40 +00:00
Martin Vonheim Larsen
280f4f2407 Enable MathJax in Doxygen.in
Note that HTTPS must be used against the MathJax CDN when hosted on `eigen.tuxfamily.org` (which uses HTTPS) in order to avoid `Mixed Content`-errors from browsers. Using HTTPS for MathJax also works if the Eigen docs are hosted on plain HTTP.
2020-11-16 12:59:13 +00:00
Antonio Sanchez
bb69a8db5d Explicit casts of S -> std::complex<T>
When calling `internal::cast<S, std::complex<T>>(x)`, clang often
generates an implicit conversion warning due to an implicit cast
from type `S` to `T`.  This currently affects the following tests:
- `basicstuff`
- `bfloat16_float`
- `cxx11_tensor_casts`

The implicit cast leads to widening/narrowing float conversions.
Widening warnings only seem to be generated by clang (`-Wdouble-promotion`).

To eliminate the warning, we explicitly cast the real-component first
from `S` to `T`.  We also adjust tests to use `internal::cast` instead
of `static_cast` when a complex type may be involved.
2020-11-14 05:50:42 +00:00
Christoph Hertzberg
90f6d9d23e Suppress ignored-attributes warning (same as in vectorization_logic). Remove redundant include and using namespace. 2020-11-13 16:21:53 +01:00
guoqiangqi
8324e5e049 Fix typo in NEON/PacketMath.h 2020-11-13 00:46:41 +00:00
Antonio Sanchez
852513e7a6 Disable testing of OpenGL by default.
The `OpenGLSupport` module contains mostly deprecated features, and the
test is highly GL context-dependent, relies on deprecated GLUT, and
requires a display.  Until the module is updated to support modern
OpenGL and the test to use newer windowing frameworks (e.g. GLFW)
it's probably best to disable the test by default.

The test can be enabled with `cmake -DEIGEN_TEST_OPENGL=ON`.

See #2053 for more details.
2020-11-12 16:15:40 -08:00
Rasmus Munk Larsen
bec72345d6 Simplify expression for inner product fallback in Gemv product evaluator. 2020-11-12 23:43:15 +00:00
Rasmus Munk Larsen
276db21f26 Remove redundant branch for handling dynamic vector*vector. This will be handled by the equivalent branch in the specialization for GemvProduct. 2020-11-12 21:54:56 +00:00
Rasmus Munk Larsen
cf12474a8b Optimize matrix*matrix and matrix*vector products when they correspond to inner products at runtime.
This speeds up inner products where the one or or both arguments is dynamic for small and medium-sized vectors (up to 32k).

name                           old time/op             new time/op   delta
BM_VecVecStatStat<float>/1     1.64ns ± 0%             1.64ns ± 0%     ~
BM_VecVecStatStat<float>/8     2.99ns ± 0%             2.99ns ± 0%     ~
BM_VecVecStatStat<float>/64    7.00ns ± 1%             7.04ns ± 0%   +0.66%
BM_VecVecStatStat<float>/512   61.6ns ± 0%             61.6ns ± 0%     ~
BM_VecVecStatStat<float>/4k     551ns ± 0%              553ns ± 1%   +0.26%
BM_VecVecStatStat<float>/32k   4.45µs ± 0%             4.45µs ± 0%     ~
BM_VecVecStatStat<float>/256k  77.9µs ± 0%             78.1µs ± 1%     ~
BM_VecVecStatStat<float>/1M     312µs ± 0%              312µs ± 1%     ~
BM_VecVecDynStat<float>/1      13.3ns ± 1%              4.6ns ± 0%  -65.35%
BM_VecVecDynStat<float>/8      14.4ns ± 0%              6.2ns ± 0%  -57.00%
BM_VecVecDynStat<float>/64     24.0ns ± 0%             10.2ns ± 3%  -57.57%
BM_VecVecDynStat<float>/512     138ns ± 0%               68ns ± 0%  -50.52%
BM_VecVecDynStat<float>/4k     1.11µs ± 0%             0.56µs ± 0%  -49.72%
BM_VecVecDynStat<float>/32k    8.89µs ± 0%             4.46µs ± 0%  -49.89%
BM_VecVecDynStat<float>/256k   78.2µs ± 0%             78.1µs ± 1%     ~
BM_VecVecDynStat<float>/1M      313µs ± 0%              312µs ± 1%     ~
BM_VecVecDynDyn<float>/1       10.4ns ± 0%             10.5ns ± 0%   +0.91%
BM_VecVecDynDyn<float>/8       12.0ns ± 3%             11.9ns ± 0%     ~
BM_VecVecDynDyn<float>/64      37.4ns ± 0%             19.6ns ± 1%  -47.57%
BM_VecVecDynDyn<float>/512      159ns ± 0%               81ns ± 0%  -49.07%
BM_VecVecDynDyn<float>/4k      1.13µs ± 0%             0.58µs ± 1%  -49.11%
BM_VecVecDynDyn<float>/32k     8.91µs ± 0%             5.06µs ±12%  -43.23%
BM_VecVecDynDyn<float>/256k    78.2µs ± 0%             78.2µs ± 1%     ~
BM_VecVecDynDyn<float>/1M       313µs ± 0%              312µs ± 1%     ~
2020-11-12 18:02:37 +00:00
Pedro Caldeira
c29935b323 Add support for dynamic dispatch of MMA instructions for POWER 10 2020-11-12 11:31:15 -03:00
acxz
b714dd9701 remove annotation for first declaration of default con/destruction 2020-11-12 04:34:12 +00:00
mehdi-goli
e24a1f57e3 [SYCL Function pointer Issue]: SYCL does not support function pointer inside the kernel, due to the portability issue of a function pointer and memory address space among host and accelerators. To fix the issue, function pointers have been replaced by function objects. 2020-11-12 01:50:28 +00:00
Antonio Sanchez
6961468915 Address issues with openglsupport test.
The existing test fails on several systems due to GL runtime version mismatches,
the use of deprecated features, and memory errors due to improper use of GLUT.
The test was modified to:

- Run within a display function, allowing proper GLUT cleanup.
- Generate dynamic shaders with a supported GLSL version string and output variables.
- Report shader compilation errors.
- Check GL context version before launching version-specific tests.

Note that most of the existing `OpenGLSupport` module and tests rely on deprecated
features (e.g. fixed-function pipeline). The test was modified to allow it to
pass on various systems. We might want to consider removing the module or re-writing
it entirely to support modern OpenGL.  This is beyond the scope of this patch.

Testing of legacy GL (for platforms that support it) can be enabled by defining
`EIGEN_LEGACY_OPENGL`.  Otherwise, the test will try to create a modern context.

Tested on
- MacBook Air (2019), macOS Catalina 10.15.7 (OpenGL 2.1, 4.1)
- Debian 10.6, NVidia Quadro K1200 (OpenGL 3.1, 3.3)
2020-11-11 15:54:43 -08:00
Everton Constantino
348a48682e Fix erroneous forward declaration of boost nvp. 2020-11-10 13:07:34 -03:00
guoqiangqi
82fe059f35 Fix issue2045 which get a error case _mm256_set_m128d op not supported by gcc 7.x 2020-11-04 09:21:39 +08:00
Deven Desai
9d11e2c03e CMakefile update for ROCm 4.0
Starting with ROCm 4.0, the `hipconfig --platform` command will return `amd` (prior return value was `hcc`). Updating the CMakeLists.txt files in the test dirs to account for this change.
2020-10-29 18:06:31 +00:00
Deven Desai
39a038f2e4 Fix for ROCm (and CUDA?) breakage - 201029
The following commit breaks Eigen for ROCm (and probably CUDA too) with the following error

e265f7ed8e

```

Building HIPCC object test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o
In file included from /home/rocm-user/eigen/test/gpu_basic.cu:20:
In file included from /home/rocm-user/eigen/test/main.h:355:
In file included from /home/rocm-user/eigen/Eigen/QR:11:
In file included from /home/rocm-user/eigen/Eigen/Core:169:
/home/rocm-user/eigen/Eigen/src/Core/arch/Default/Half.h:825:76: error: use of undeclared identifier 'numext'; did you mean 'Eigen::numext'?
  return Eigen::half_impl::raw_uint16_to_half(__ldg(reinterpret_cast<const numext::uint16_t*>(ptr)));
                                                                           ^~~~~~
                                                                           Eigen::numext
/home/rocm-user/eigen/Eigen/src/Core/MathFunctions.h:968:11: note: 'Eigen::numext' declared here
namespace numext {
          ^
1 error generated when compiling for gfx900.
CMake Error at gpu_basic_generated_gpu_basic.cu.o.cmake:192 (message):
  Error generating file
  /home/rocm-user/eigen/build/test/CMakeFiles/gpu_basic.dir//./gpu_basic_generated_gpu_basic.cu.o

test/CMakeFiles/gpu_basic.dir/build.make:63: recipe for target 'test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o' failed
make[3]: *** [test/CMakeFiles/gpu_basic.dir/gpu_basic_generated_gpu_basic.cu.o] Error 1
CMakeFiles/Makefile2:16611: recipe for target 'test/CMakeFiles/gpu_basic.dir/all' failed
make[2]: *** [test/CMakeFiles/gpu_basic.dir/all] Error 2
CMakeFiles/Makefile2:16618: recipe for target 'test/CMakeFiles/gpu_basic.dir/rule' failed
make[1]: *** [test/CMakeFiles/gpu_basic.dir/rule] Error 2
Makefile:5401: recipe for target 'gpu_basic' failed
make: *** [gpu_basic] Error 2
```

The fix is in this commit is trivial. Please review and merge
2020-10-29 15:34:05 +00:00
David Tellenbach
f895755c0e Remove unused functions in Half.h.
The following functions have been removed:

  Eigen::half fabsh(const Eigen::half&)
  Eigen::half exph(const Eigen::half&)
  Eigen::half sqrth(const Eigen::half&)
  Eigen::half powh(const Eigen::half&, const Eigen::half&)
  Eigen::half floorh(const Eigen::half&)
  Eigen::half ceilh(const Eigen::half&)
2020-10-29 07:37:52 +01:00
David Tellenbach
09f015852b Replace numext::as_uint with numext::bit_cast<numext::uint32_t> 2020-10-29 07:28:28 +01:00
David Tellenbach
e265f7ed8e Add support for Armv8.2-a __fp16
Armv8.2-a provides a native half-precision floating point (__fp16 aka.
float16_t). This patch introduces

* __fp16 as underlying type of Eigen::half if this type is available
* the packet types Packet4hf and Packet8hf representing float16x4_t and
  float16x8_t respectively
* packet-math for the above packets with corresponding scalar type Eigen::half

The packet-math functionality has been implemented by Ashutosh Sharma
<ashutosh.sharma@amperecomputing.com>.

This closes #1940.
2020-10-28 20:15:09 +00:00
mehdi-goli
a725a3233c [SYCL clean up the code] : removing exrta #pragma unroll in SYCL which was causing issues in embeded systems 2020-10-28 08:34:49 +00:00
mehdi-goli
b9ff791fed [Missing SYCL math op]: Addin the missing LDEXP Function for SYCL. 2020-10-28 08:32:57 +00:00
mehdi-goli
61461d682a [Fixing expf issue]: Eigen uses the packet type operation for scaler type float on Sigmoid function(https://gitlab.com/libeigen/eigen/-/blob/master/Eigen/src/Core/functors/UnaryFunctors.h#L990). As a result SYCL backend breaks since SYCL backend only supports packet operation for vectorized type float4 and double2. The issue has been fixed by adding scalar type float to packet operation pexp for SYCL backend. 2020-10-28 08:30:34 +00:00
Christoph Hertzberg
ecb7bc9514 Bug #2036 make sure find_standard_math_library_test_program actually compiles (and is guaranteed to call math functions) 2020-10-24 15:22:21 +02:00
Susi Lehtola
09f595a269 Make sure compiler does not optimize away calls to math functions 2020-10-24 06:16:50 +00:00
guoqiangqi
28aef8e816 Improve polynomial evaluation with instruction-level parallelism for pexp_float and pexp<Packet16f> 2020-10-20 11:37:09 +08:00
guoqiangqi
4a77eda1fd remove unnecessary specialize template of pexp for scale float/double 2020-10-19 00:51:42 +00:00
Antonio Sanchez
d9f0d9eb76 Fix missing pfirst<Packet16b> for MSVC.
It was only defined under one `#ifdef` case.  This fixes the `packetmath_14`
test for MSVC.
2020-10-16 16:22:00 -07:00
Rasmus Munk Larsen
21edea5edd Fix the specialization of pfrexp for AVX to be faster when AVX2/AVX512DQ is not available, and avoid undefined behavior in C++. Also mask off the sign bit when extracting the exponent. 2020-10-15 18:39:58 -07:00
Deven Desai
011e0db31d Fix for ROCm/HIP breakage - 201013
The following commit seems to have introduced regressions in ROCm/HIP support.

183a208212

It causes some unit-tests to fail with the following error

```
...
Eigen/src/Core/GenericPacketMath.h:322:3: error: no member named 'bit_and' in the global namespace; did you mean 'std::bit_and'?
...
Eigen/src/Core/GenericPacketMath.h:329:3: error: no member named 'bit_or' in the global namespace; did you mean 'std::bit_or'?
...
Eigen/src/Core/GenericPacketMath.h:336:3: error: no member named 'bit_xor' in the global namespace; did you mean 'std::bit_xor'?
...
```

The error occurs because, when compiling the device code in HIP/CUDA, the compiler will pick up the some of the std functions (whose calls are prefixed by EIGEN_USING_STD) from the global namespace (i.e. use ::bit_xor instead of std::bit_xor). For this to work, those functions must be declared in the global namespace in the HIP/CUDA header files. The `bit_and`, `bit_or` and `bit_xor` routines are not declared in the HIP header file that contain the decls for the std math functions ( `math_functions.h` ), and this is the cause of the error above.

It seems that the newer HIP compilers do support the calling of `std::` math routines within device code, and the ideal fix here would have been to change all calls to std math functions in EIGEN to use the `std::` namespace (instead of the global namespace ), when compiling  with HIP compiler. However it seems there was a recent commit to remove the EIGEN_USING_STD_MATH macro and collapse it uses into the EIGEN_USING_STD macro ( 4091f6b25c ).

Replacing all std math calls will essentially require re-surrecting the EIGEN_USING_STD_MATH macro, so not choosing that option.

Also HIP compilers only have support std math calls within device code, and not all std functions (specifically not for malloc/free which are prefixed via EIGEN_USING_STD). So modyfing EIGEN_USE_STD implementation to use std:: namspace for HIP will not work either.

Hence going for the ugly solution of special casing the three calls that breaking the HIP compile, to explicitly use the std:: namespace
2020-10-15 12:17:35 +00:00
Rasmus Munk Larsen
6ea8091705 Revert change from 4e4d3f32d1 that broke BFloat16.h build with older compilers. 2020-10-15 01:20:08 +00:00
Guoqiang QI
4700713faf Add AVX plog<Packet4d> and AVX512 plog<Packet8d> ops,also unified AVX512 plog<Packet16f> op with generic api 2020-10-15 00:54:45 +00:00
Rasmus Munk Larsen
af6f43d7ff Add specializations for pmin/pmax with prescribed NaN propagation semantics for SSE/AVX/AVX512. 2020-10-14 23:11:24 +00:00
Rasmus Munk Larsen
274ef12b61 Remove leftover debug print statement in cxx11_tensor_expr.cpp 2020-10-14 22:59:51 +00:00
Rasmus Munk Larsen
208b3626d1 Revert generic implementation of predux, since it break compilation of predux_any with MSVC. 2020-10-14 21:41:28 +00:00
David Tellenbach
e3e2cf9d24 Add MatrixBase::cwiseArg() 2020-10-14 01:56:42 +00:00
Rasmus Munk Larsen
61fc78bbda Get rid of nested template specialization in TensorReductionGpu.h, which was broken by c6953f799b. 2020-10-13 23:53:11 +00:00
Rasmus Munk Larsen
c6953f799b Add packet generic ops predux_fmin, predux_fmin_nan, predux_fmax, and predux_fmax_nan that implement reductions with PropagateNaN, and PropagateNumbers semantics. Add (slow) generic implementations for most reductions. 2020-10-13 21:48:31 +00:00
acxz
807e51528d undefine EIGEN_CONSTEXPR before redefinition 2020-10-12 20:28:56 -04:00
Rasmus Munk Larsen
9a4d04c05f Make bitwise_helper a device function to unbreak GPU builds. 2020-10-10 01:45:20 +00:00
Rasmus Munk Larsen
4e4d3f32d1 Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512. 2020-10-09 20:05:49 +00:00
David Tellenbach
7a8d3d5b81 Disable test exceptions when using OpenMP. 2020-10-09 17:49:07 +02:00
David Tellenbach
9022f5aa8a Mention problems when using potentially throwing scalars and OpenMP 2020-10-09 17:04:25 +02:00
Karl Ljungkvist
d199c17b14 Fix typo in Tutorial_BlockOperations_block_assignment.cpp 2020-10-09 07:51:36 +00:00
David Tellenbach
4091f6b25c Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD 2020-10-09 02:05:05 +02:00
Rasmus Munk Larsen
183a208212 Implement generic bitwise logical packet ops that work for all types. 2020-10-08 22:45:20 +00:00
David Tellenbach
8f8d77b516 Add EIGEN prefix for HAS_LGAMMA_R 2020-10-08 18:32:19 +02:00
Eugene Zhulenev
2279f2c62f Use lgamma_r if it is available (update check for glibc 2.19+) 2020-10-08 00:26:45 +00:00
Rasmus Munk Larsen
b431024404 Don't make assumptions about NaN-propagation for pmin/pmax - it various across platforms.
Change test to only test for NaN-propagation for pfmin/pfmax.
2020-10-07 19:05:18 +00:00
David Tellenbach
f66f3393e3 Use reinterpret_cast instead of C-style cast in Inverse_NEON.h 2020-10-04 00:35:09 +02:00
Rasmus Munk Larsen
22c971a225 Don't cast away const in Inverse_NEON.h. 2020-10-02 15:06:34 -07:00
Rasmus Munk Larsen
f93841b53e Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h. 2020-10-02 14:47:15 -07:00
Rasmus Munk Larsen
ee714f79f7 Fix CUDA build breakage and incorrect result for absdiff on HIP with long double arguments. 2020-10-02 21:05:35 +00:00
janos
f7b185a8b1 dont use =* might not return a Scalar 2020-10-02 14:36:51 +02:00
Rasmus Munk Larsen
9078f47cd6 Fix build breakage with MSVC 2019, which does not support MMX intrinsics for 64 bit builds, see:
https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c

Instead use the equivalent SSE2 intrinsics.
2020-10-01 12:37:55 -07:00
Rasmus Munk Larsen
3b445d9bf2 Add a generic packet ops corresponding to {std}::fmin and {std}::fmax. The non-sensical NaN-propagation rules for std::min std::max implemented by pmin and pmax in Eigen is a longstanding source og confusion and bug report. This change is a first step towards addressing it, as discussing in issue #564. 2020-10-01 16:54:31 +00:00
Rasmus Munk Larsen
44b9d4e412 Specialize pldexp_double and pfdexp_double and get rid of Packet2l definition for SSE. SSE does not support conversion between 64 bit integers and double and the existing implementation of casting between Packet2d and Packer2l results in undefined behavior when casting NaN to int. Since pldexp and pfdexp only manipulate exponent fields that fit in 32 bit, this change provides specializations that use existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead. 2020-09-30 13:33:44 -07:00
Antonio Sanchez
d5a0d89491 Fix alignedbox 32-bit precision test failure.
The current `test/geo_alignedbox` tests fail on 32-bit arm due to small floating-point errors.

In particular, the following is not guaranteed to hold:
```
IsometryTransform identity = IsometryTransform::Identity();
BoxType transformedC;
transformedC.extend(c.transformed(identity));
VERIFY(transformedC.contains(c));
```
since `c.transformed(identity)` is ever-so-slightly different from `c`. Instead, we replace this test with one that checks an identity transform is within floating-point precision of `c`.

Also updated the condition on `AlignedBox::transform(...)` to only accept `Affine`, `AffineCompact`, and `Isometry` modes explicitly.  Otherwise, invalid combinations of modes would also incorrectly pass the assertion.
2020-09-30 08:42:03 -07:00
David Tellenbach
30960d485e Fix failure in GEBP kernel when compiling with OpenMP and FMA
Fixes #1995
2020-09-30 01:26:07 +02:00
Rasmus Munk Larsen
f9d1500f74 Revert !182. 2020-09-29 13:56:17 -07:00
Rasmus Munk Larsen
068121ec02 Add missing newline at the end of Inverse_NEON.h 2020-09-29 15:32:52 +00:00
Rasmus Munk Larsen
74ff5719b3 Fix compilation of 64 bit constant arguments to pset1frombits in TypeCasting.h on platforms where uint64_t != unsigned long. 2020-09-28 22:47:11 +00:00
Rasmus Munk Larsen
3a0b23e473 Fix compilation of pset1frombits calls on iOS. 2020-09-28 22:30:36 +00:00
Christoph Hertzberg
6b0c0b587e Provide a more efficient Packet2l->Packet2d cast method 2020-09-28 22:14:02 +00:00
Martin Pecka
6425e875a1 Added AlignedBox::transform(AffineTransform). 2020-09-28 18:06:23 +00:00
Alexander Grund
a967fadb21 Make relative path variables of type STRING
When the type is PATH an absolute path is expected and user-defined
values are converted into absolute paths relative to the current directory.

Fixes #1990
2020-09-28 16:39:48 +00:00
Zhuyie
e4b24e7fb2 Fix Eigen::ThreadPool::CurrentThreadId returning wrong thread id when EIGEN_AVOID_THREAD_LOCAL and NDEBUG are defined 2020-09-25 09:36:43 +00:00
Deven Desai
ce5c59729d Fix for ROCm/HIP breakage - 200921
The following commit causes regressions in the ROCm/HIP support for Eigen
e55182ac09

I suspect the same breakages occur on the CUDA side too.

The above commit puts the EIGEN_CONSTEXPR attribute on `half_base` constructor. `half_base` is derived from `__half_raw`.

When compiling with GPU support, the definition of `__half_raw` gets picked up from the GPU Compiler specific header files (`hip_fp16.h`, `cuda_fp16.h`). Properly supporting the above commit would require adding the `constexpr` attribute to the `__half_raw` constructor (and other `*half*` routines) in those header files. While that is something we can explore in the future, for now we need to undo the above commit when compiling with GPU support, which is what this commit does.

This commit also reverts a small change in the `raw_uint16_to_half` routine made by the above commit. Similar to the case above, that change was leading to compile errors due to the fact that `__half_raw` has a different definition when compiling with DPU support.
2020-09-22 22:26:45 +00:00
David Tellenbach
b8a13f13ca Add CI configuration for ppc64le 2020-09-22 00:26:23 +00:00
Guoqiang QI
821702e771 Fix the #issue1997 and #issue1991 bug triggered by unsupport a[index](type a: __i28d) ops with MSVC compiler 2020-09-21 15:49:00 +00:00
David Tellenbach
493a7c773c Remove EIGEN_CONSTEXPR from NumTraits<boost::multiprecision::number<...>> 2020-09-21 12:43:41 +02:00
Павел Мацула
38e4a67394 Fix using FindStandardMathLibrary.cmake with -Wall (-Wunused-value) added to CMAKE_CXX_FLAG 2020-09-19 16:13:16 +00:00
Rasmus Munk Larsen
c4b99f78c7 Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86.
If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.
2020-09-18 18:13:20 -07:00
guoqiangqi
9aad16b443 Fix undefined reference to pset1frombits bug on different platforms 2020-09-19 00:53:21 +00:00
David Tellenbach
c4aa8e0db2 Rename variable to avoid shadowing of a previously declared one 2020-09-18 22:53:15 +02:00
Rasmus Munk Larsen
e55182ac09 Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr.
Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HASH_CONSTEXPR is 1.
2020-09-18 17:38:58 +00:00
Rasmus Munk Larsen
14022f5eb5 Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h.
'vmvnq_u64' does not exist for some reason.
2020-09-18 04:14:13 +00:00
Rasmus Munk Larsen
a5b226920f Fix typo in PacketMath.h 2020-09-18 01:22:23 +00:00
Rasmus Munk Larsen
3af744b023 Add missing packet op pcmp_lt_or_nan for Packet2d on ARM. 2020-09-18 01:07:01 +00:00
Rasmus Munk Larsen
31a6b88ff3 Disable double version of compute_inverse_size4 on Inverse_NEON.h if Packet2d is not supported. 2020-09-17 23:51:06 +00:00
Brad King
880fa43b2b Add support for CastXML on ARM aarch64
CastXML simulates the preprocessors of other compilers, but actually
parses the translation unit with an internal Clang compiler.
Use the same `vld1q_u64` workaround that we do for Clang.

Fixes: #1979
2020-09-16 13:40:23 -04:00
daravi
6f0f6f792e Fix compiler error due to c++20 operator== generation rules 2020-09-16 02:06:53 +00:00
Benoit Jacob
cc0c38ace8 Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler. 2020-09-15 20:54:14 -04:00
Tim Shen
bb56a62582 Make bfloat16(float(-nan)) produce -nan, not nan. 2020-09-15 13:24:23 -07:00
Guoqiang QI
3012e755e9 Add plog ops support packet2d for NEON 2020-09-15 17:10:35 +00:00
Rasmus Munk Larsen
e4fb0ddf78 Add EIGEN_UNUSED_VARIABLE to unused variable in Memory.h 2020-09-15 01:18:55 +00:00
Pedro Caldeira
65e400896b Fix bfloat16 round on gcc 4.8 2020-09-14 10:43:59 -03:00
Rasmus Munk Larsen
5636f80d11 Fix issue #1968. Don't discard return value from "new" in C++17. 2020-09-13 17:38:45 +00:00
Guoqiang QI
7c5d48f313 Unified sse pldexp_double api 2020-09-12 10:56:55 +00:00
Rasmus Munk Larsen
71e08c702b Make blueNorm threadsafe if C++11 atomics are available. 2020-09-12 01:23:29 +00:00
David Tellenbach
adc861cabd New CI infrastructure, including AArch64 runners 2020-09-11 18:11:49 +00:00
Niels Dekker
5328c9be43 Fix half_impl::float_to_half_rtne(float) warning: '<<' causes overflow
Fixed Visual Studio 2019 Code Analysis (C++ Core Guidelines) warning
C26450 from inside `half_impl::float_to_half_rtne(float)`:
> Arithmetic overflow: '<<' operation causes overflow at compile time.
2020-09-10 16:22:28 +02:00
Pedro Caldeira
35d149e34c Add missing functions for Packet8bf in Altivec architecture.
Including new tests for bfloat16 Packets.
Fix prsqrt on GenericPacketMath.
2020-09-08 09:22:11 -05:00
Guoqiang QI
85428a3440 Add Neon psqrt<Packet2d> and pexp<Packet2d> 2020-09-08 09:04:03 +00:00
Alexander Neumann
5272106826 remove semi triggering -Wextra-semi-stmt 2020-09-07 11:42:30 +02:00
Stephen Zheng
5f25bcf7d6 Add Inverse_NEON.h
Implemented fast size-4 matrix inverse (mimicking Inverse_SSE.h) using NEON intrinsics.

```
Benchmark                   Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------
BM_float                 -0.1285         -0.1275           568           495           572           499
BM_double                -0.2265         -0.2254           638           494           641           496
```
2020-09-04 10:55:47 +00:00
Everton Constantino
6fe88a3c9d MatrixProuct enhancements:
- Changes to Altivec/MatrixProduct
  Adapting code to gcc 10.
  Generic code style and performance enhancements.
  Adding PanelMode support.
  Adding stride/offset support.
  Enabling float64, std::complex and std::complex.
  Fixing lack of symm_pack.
  Enabling mixedtypes.
- Adding std::complex tests to blasutil.
- Adding an implementation of storePacketBlock when Incr!= 1.
2020-09-02 18:21:36 -03:00
Everton Constantino
6568856275 Changing u/int8_t to un/signed char because clang does not understand
it.

Implementing pcmp_eq to Packet8 and Packet16.
2020-09-02 17:02:15 -03:00
Gael Guennebaud
27e6648074 fix #1901: warning in Mode==(Upper|Lower) 2020-09-02 15:43:58 +02:00
Hans Johnson
5b9bfc892a BUG: cmake_minimum_required must be the first command
https://cmake.org/cmake/help/v3.5/command/project.html

Note: Call the cmake_minimum_required() command at the beginning of the
top-level CMakeLists.txt file even before calling the project() command.
It is important to establish version and policy settings before invoking
other commands whose behavior they may affect. See also policy CMP0000.
2020-08-28 22:57:16 +00:00
Chip Kerchner
e5886457c8 Change Packet8s and Packet8us to use vector commands on Power for pmadd, pmul and psub. 2020-08-28 19:27:32 +00:00
Gael Guennebaud
25424d91f6 Fix #1974: assertion when reserving an empty sparse matrix 2020-08-26 12:32:20 +02:00
Guoqiang QI
8bb0febaf9 add psqrt ops support packet2f/packet4f for NEON 2020-08-21 03:17:15 +00:00
Georg Jäger
1b1082334b adding attributes to constructors to support hip-clang on ROCm 3.5 2020-08-20 16:48:11 +02:00
Deven Desai
603e213d13 Fixing a CUDA / P100 regression introduced by PR 181
PR 181 ( https://gitlab.com/libeigen/eigen/-/merge_requests/181 ) adds `__launch_bounds__(1024)` attribute to GPU kernels, that did not have that attribute explicitly specified.

That PR seems to cause regressions on the CUDA platform. This PR/commit makes the changes in PR 181, to be applicable for HIP only
2020-08-20 00:29:57 +00:00
David Tellenbach
c060114a25 Fix nightly CI configuration 2020-08-19 20:52:34 +02:00
David Tellenbach
fe8c3ef3cb Add possibility to split test suit build targets and improved CI configuration
- Introduce CMake option `EIGEN_SPLIT_TESTSUITE` that allows to divide the single test build target into several subtargets
- Add CI pipeline for merge request that can be run by GitLab's shared runners
- Add nightly CI pipeline
2020-08-19 18:27:45 +00:00
Rasmus Munk Larsen
d10b27fe37 Add missing inline keyword in Quaternion.h. 2020-08-14 17:51:04 +00:00
David Tellenbach
d4a727d092 Disable min/max NaN propagation in test cxx11_tensor_expr
The current pmin/pmax implementation for Arm Neon propagate NaNs
differently than std::min/std::max.

See issue https://gitlab.com/libeigen/eigen/-/issues/1937
2020-08-14 16:16:27 +00:00
David Tellenbach
d2bb6cf396 Fix compilation error in blasutil test 2020-08-14 18:15:18 +02:00
David Tellenbach
c6820a6316 Replace the call to int64_t in the blasutil test by explicit types
Some platforms define int64_t to be long long even for C++03. If this is
the case we miss the definition of internal::make_unsigned for this
type. If we just define the template we get duplicated definitions
errors for platforms defining int64_t as signed long for C++03.

We need to find a way to distinguish both cases at compile-time.
2020-08-14 17:24:37 +02:00
David Tellenbach
8ba1b0f41a bfloat16 packetmath for Arm Neon backend 2020-08-13 15:48:40 +00:00
Pedro Caldeira
704798d1df Add support for Bfloat16 to use vector instructions on Altivec
architecture
2020-08-10 13:22:01 -05:00
Deven Desai
46f8a18567 Adding an explicit launch_bounds(1024) attribute for GPU kernels.
Starting with ROCm 3.5, the HIP compiler will change from HCC to hip-clang.

This compiler change introduce a change in the default value of the `__launch_bounds__` attribute associated with a GPU kernel. (default value means the value assumed by the compiler as the `__launch_bounds attribute__` value, when it is not explicitly specified by the user)

Currently (i.e. for HIP with ROCm 3.3 and older), the default value is 1024. That changes to 256 with ROCm 3.5 (i.e. hip-clang compiler). As a consequence of this change, if a GPU kernel with a `__luanch_bounds__` attribute of 256 is launched at runtime with a threads_per_block value > 256, it leads to a runtime error. This is leading to a couple of Eigen unit test failures with ROCm 3.5.

This commit adds an explicit `__launch_bounds(1024)__` attribute to every GPU kernel that currently does not have it explicitly specified (and hence will end up getting the default value of 256 with the change to hip-clang)
2020-08-05 01:46:34 +00:00
Zachary Garrett
21122498ec Temporarily turn off the NEON implementation of pfloor as it does not work for large values.
The NEON implementation mimics the SSE implementation, but didn't mention the caveat that due to the unsigned of signed integer conversions, not all values in the original floating point  represented are supported.
2020-08-04 16:28:23 +00:00
David Tellenbach
23b7f0572b Disable CI buildstage again 2020-08-03 15:41:43 +02:00
Gael Guennebaud
d0f5d4bc50 add a banner to advertise the survey 2020-07-29 19:01:38 +02:00
David Tellenbach
5e484fa11d Fix StlDeque for GCC 10
StlDeque extends std::deque by accessing some of its internal members.
Since GCC 10 these are not accessible anymore.
2020-07-29 12:31:13 +00:00
Teng Lu
3ec4f0b641 Fix undefine BF16 union behavior in AVX512. 2020-07-29 02:20:21 +00:00
Rasmus Munk Larsen
b92206676c Inherit alignment trait from argument in TensorBroadcasting to avoid segfault when the argument is unaligned. 2020-07-28 19:19:37 +00:00
David Tellenbach
99da2e1a8d Fix clang-tidy warnings in generic bfloat16 implementation
See !172 for related discussions.
2020-07-27 16:00:24 +02:00
qxxxb
649fd1c2ae Fix CMake install command 2020-07-25 16:35:13 -04:00
David Tellenbach
e48d8e4725 Don't allow failure for CI build stage anymore 2020-07-24 21:12:15 +02:00
David Tellenbach
b8ca93842c Improve CI configuration
- Fix docker Fedora image to Fedora:31
  - Fix gcc version to gcc-9.2.1
  - Use GitLab CI dag
  - Fix usage of build cache
  - Introduce build artificats
2020-07-24 15:58:44 +00:00
Gael Guennebaud
fb0c6868ad Add missing footer declaration 2020-07-24 10:28:44 +02:00
David Tellenbach
c1ffe452fc Fix bfloat16 casts
If we have explicit conversion operators available (C++11) we define
explicit casts from bfloat16 to other types. If not (C++03), we don't
define conversion operators but rely on implicit conversion chains from
bfloat16 over float to other types.
2020-07-23 20:55:06 +00:00
Gael Guennebaud
2ce2f51989 remove piwik tracker 2020-07-23 13:51:39 +02:00
Rasmus Munk Larsen
1b84f21e32 Revert change that made conversion from bfloat16 to {float, double} implicit.
Add roundtrip tests for casting between bfloat16 and complex types.
2020-07-22 18:09:00 -07:00
David Tellenbach
38b91f256b Fix cast of blfoat16 to std::complex<T>
This fixes https://gitlab.com/libeigen/eigen/-/issues/1951
2020-07-22 19:00:17 +00:00
Rasmus Munk Larsen
bed7fbe854 Make sure we take the little-endian path if __BYTE_ORDER__ is not defined. 2020-07-22 18:54:38 +00:00
Niels Dekker
0e1a33a461 Faster conversion from integer types to bfloat16
Specialized `bfloat16_impl::float_to_bfloat16_rtne(float)` for normal floating point numbers, infinity and zero, in order to improve the performance of `bfloat16::bfloat16(const T&)` for integer argument types.

A reduction of more than 20% of the runtime duration of conversion from int to bfloat16 was observed, using Visual C++ 2019 on Windows 10.
2020-07-22 19:25:49 +02:00
Rasmus Munk Larsen
acab22c205 Avoid division by zero in nonZerosEstimate() for empty blocks. 2020-07-22 01:38:30 +00:00
Rasmus Munk Larsen
ac2eca6b11 Update tensor reduction test to avoid undefined division of bfloat16 by int. 2020-07-22 00:35:51 +00:00
Rasmus Munk Larsen
0aeaf5f451 Make numext::as_uint a device function. 2020-07-22 00:33:41 +00:00
Alexander Turkin
60faa9f897 user-defined copy operations removed in favor of compiler-generated ones 2020-07-20 14:59:35 +03:00
Niels Dekker
b11f817bcf Avoid undefined behavior by union type punning in float_to_bfloat16_rtne
Use `numext::as_uint`, instead of union based type punning, to avoid undefined behavior.
See also C++ Core Guidelines: "Don't use a union for type punning"
https://github.com/isocpp/CppCoreGuidelines/blob/v0.8/CppCoreGuidelines.md#c183-dont-use-a-union-for-type-punning

`numext::as_uint` was suggested by David Tellenbach
2020-07-14 19:55:20 +02:00
Sheng Yang
56b3e3f3f8 AVX path for BF16 2020-07-14 01:34:03 +00:00
Niels Dekker
4ab32e2de2 Allow implicit conversion from bfloat16 to float and double
Conversion from `bfloat16` to `float` and `double` is lossless. It seems natural to allow the conversion to be implicit, as the C++ language also support implicit conversion from a smaller to a larger floating point type.

Intel's OneDLL bfloat16 implementation also has an implicit `operator float()`: https://github.com/oneapi-src/oneDNN/blob/v1.5/src/common/bfloat16.hpp
2020-07-11 13:32:28 +02:00
Rasmus Munk Larsen
dcf7655b3d Guard operator<< test by EIGEN_NO_IO. 2020-07-09 19:54:48 +00:00
Rasmus Munk Larsen
ed00df445d Guard operator<< by EIGEN_NO_IO. 2020-07-09 19:52:44 +00:00
Rasmus Munk Larsen
fb77b7288c Add operator<< to print a quaternion. 2020-07-09 12:49:58 -07:00
David Tellenbach
ee4715ff48 Fix test basic stuff
- Guard fundamental types that are not available pre C++11
- Separate subsequent angle brackets >> by spaces
- Allow casting of Eigen::half and Eigen::bfloat16 to complex types
2020-07-09 17:24:00 +00:00
Forrest Voight
8889a2c1c6 Add operator==/operator!= to Quaternion. Fixes #1876. 2020-07-07 20:16:54 +00:00
Rasmus Munk Larsen
6964ae8d52 Change the sign operator in Eigen to return NaN for NaN arguments, not zero. 2020-07-07 01:54:04 +00:00
David Tellenbach
cb63153183 Make test packetmath C++98 compliant 2020-07-01 20:41:59 +02:00
Sheng Yang
116c5235ac BF16 for scalar_cmp_with_cast_op 2020-07-01 18:33:42 +00:00
Kan Chen
8731452b97 Delete duplicate test cases in vectorization_logic.cpp 2020-07-01 00:51:15 +00:00
Antonio Sanchez
9cb8771e9c Fix tensor casts for large packets and casts to/from std::complex
The original tensor casts were only defined for
`SrcCoeffRatio`:`TgtCoeffRatio` 1:1, 1:2, 2:1, 4:1. Here we add the
missing 1:N and 8:1.

We also add casting `Eigen::half` to/from `std::complex<T>`, which
was missing to make it consistent with `Eigen:bfloat16`, and
generalize the overload to work for any complex type.

Tests were added to `basicstuff`, `packetmath`, and
`cxx11_tensor_casts` to test all cast configurations.
2020-06-30 18:53:55 +00:00
Antonio Sanchez
145e51516f Fix denormal check pre c++11.
`float_denorm_style` is an old-style `enum`, so the `denorm_present`
symbol only exists in the `std` namespace prior to c++11.
2020-06-30 17:28:30 +00:00
David Tellenbach
689b57070d Report custom C++ flags in CMake testing summary 2020-06-30 17:18:54 +00:00
David Tellenbach
f3b8d441f6 Remote CI tags to enable shared runners 2020-06-29 22:15:41 +02:00
Christoph Grüninger
dc0b81fb1d Pass CMAKE_MAKE_PROGRAM to Fortran language support test
Otherwise the Make (or Ninja) program is used, which is
installed system wide.
2020-06-27 23:52:38 +02:00
David Tellenbach
13d25f5ed8 Add initial CI configuration file.
The initial CI configuration consists of jobs to build and run tests and
to build docs.
2020-06-27 00:03:35 +00:00
Antonio Sanchez
7222f0b6b5 Fix packetmath_1 float tests for arm/aarch64.
Added missing `pmadd<Packet2f>` for NEON. This leads to significant
improvement in precision than previous `pmul+padd`, which was causing
the `pcos` tests to fail. Also added an approx test with
`std::sin`/`std::cos` since otherwise returning any `a^2+b^2=1` would
pass.

Modified `log(denorm)` tests.  Denorms are not always supported by all
systems (returns `::min`), are always flushed to zero on 32-bit arm,
and configurably flush to zero on sse/avx/aarch64. This leads to
inconsistent results across different systems (i.e. `-inf` vs `nan`).
Added a check for existence and exclude ARM.

Removed logistic exactness test, since scalar and vectorized versions
follow different code-paths due to differences in `pexp` and `pmadd`,
which result in slightly different values. For example, exactness always
fails on arm, aarch64, and altivec.
2020-06-24 14:03:35 -07:00
Simon Pfreundschuh
14f84978e8 Replaced call to deprecated 'load' function with appropriate call to 'on'. 2020-06-23 11:23:13 +02:00
Antonio Sanchez
ff4e7a0820 Add missing Packet2l/Packet2ul ops for NEON.
The current multiply (`pmul`) and comparison operators (`pcmp_lt`,
`pcmp_le`, `pcmp_eq`) are missing for packets `Packet2l` and
`Packet2ul`. This leads to compile errors for the `packetmath.cpp` tests
in clang. Here we add and test the missing ops.

Tested:
```
$ aarch64-linux-gnu-g++ -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
$ adb push packetmath /data/local/tmp/
$ adb shell "/data/local/tmp/packetmath"

$ arm-linux-gnueabihf-g++ -mfpu=neon -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
$ adb push packetmath /data/local/tmp/
$ adb shell "/data/local/tmp/packetmath"

$ clang++ -target aarch64-linux-android21 -static -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
$ adb push packetmath /data/local/tmp/
$ adb shell "/data/local/tmp/packetmath"

$ clang++ -target armv7-linux-android21 -static -mfpu=neon -I./ '-DEIGEN_TEST_PART_9=1' '-DEIGEN_TEST_PART_10=1' test/packetmath.cpp -o packetmath
$ adb push packetmath /data/local/tmp/
$ adb shell "/data/local/tmp/packetmath"
```
2020-06-22 11:24:43 -07:00
Antonio Sanchez
03ebdf6acb Added missing NEON pcasts, update packetmath tests.
The NEON `pcast` operators are all implemented and tested for existing
packets. This requires adding a `pcast(a,b,c,d,e,f,g,h)` for casting
between `int64_t` and `int8_t` in `GenericPacketMath.h`.

Removed incorrect `HasHalfPacket`  definition for NEON's
`Packet2l`/`Packet2ul`.

Adjustments were also made to the `packetmath` tests. These include
- minor bug fixes for cast tests (i.e. 4:1 casts, only casting for
  packets that are vectorizable)
- added 8:1 cast tests
- random number generation
  - original had uninteresting 0 to 0 casts for many casts between
    floating-point and integers, and exhibited signed overflow
    undefined behavior

Tested:
```
$ aarch64-linux-gnu-g++ -static -I./ '-DEIGEN_TEST_PART_ALL=1' test/packetmath.cpp -o packetmath
$ adb push packetmath /data/local/tmp/
$ adb shell "/data/local/tmp/packetmath"
```
2020-06-21 09:32:31 -07:00
Teng Lu
386d809bde Support BFloat16 in Eigen 2020-06-20 19:16:24 +00:00
Rasmus Munk Larsen
6b9c92fe7e Add Apache 2.0 license text in COPYING.APACHE. 2020-06-18 12:45:27 -07:00
Nicolas Mellado
cf7adf3a5d Update things you can do message using cmake commands
Print cmake commands instead of make commands, which should work for any generator.
2020-06-16 21:04:33 +00:00
Ilya Tokar
231ce21535 Run two independent chains, when reducing tensors.
Running two chains exposes more instruction level parallelism,
by allowing to execute both chains at the same time.

Results are a bit noisy, but for medium length we almost hit
theoretical upper bound of 2x.

BM_fullReduction_16T/3        [using 16 threads]       17.3ns ±11%        17.4ns ± 9%        ~           (p=0.178 n=18+19)
BM_fullReduction_16T/4        [using 16 threads]       17.6ns ±17%        17.0ns ±18%        ~           (p=0.835 n=20+19)
BM_fullReduction_16T/7        [using 16 threads]       18.9ns ±12%        18.2ns ±10%        ~           (p=0.756 n=20+18)
BM_fullReduction_16T/8        [using 16 threads]       19.8ns ±13%        19.4ns ±21%        ~           (p=0.512 n=20+20)
BM_fullReduction_16T/10       [using 16 threads]       23.5ns ±15%        20.8ns ±24%     -11.37%        (p=0.000 n=20+19)
BM_fullReduction_16T/15       [using 16 threads]       35.8ns ±21%        26.9ns ±17%     -24.76%        (p=0.000 n=20+19)
BM_fullReduction_16T/16       [using 16 threads]       38.7ns ±22%        27.7ns ±18%     -28.40%        (p=0.000 n=20+19)
BM_fullReduction_16T/31       [using 16 threads]        146ns ±17%          74ns ±11%     -49.05%        (p=0.000 n=20+18)
BM_fullReduction_16T/32       [using 16 threads]        154ns ±19%          84ns ±30%     -45.79%        (p=0.000 n=20+19)
BM_fullReduction_16T/64       [using 16 threads]        603ns ± 8%         308ns ±12%     -48.94%        (p=0.000 n=17+17)
BM_fullReduction_16T/128      [using 16 threads]       2.44µs ±13%        1.22µs ± 1%     -50.29%        (p=0.000 n=17+17)
BM_fullReduction_16T/256      [using 16 threads]       9.84µs ±14%        5.13µs ±30%     -47.82%        (p=0.000 n=19+19)
BM_fullReduction_16T/512      [using 16 threads]       78.0µs ± 9%        56.1µs ±17%     -28.02%        (p=0.000 n=18+20)
BM_fullReduction_16T/1k       [using 16 threads]        325µs ± 5%         263µs ± 4%     -19.00%        (p=0.000 n=20+16)
BM_fullReduction_16T/2k       [using 16 threads]       1.09ms ± 3%        0.99ms ± 1%      -9.04%        (p=0.000 n=20+20)
BM_fullReduction_16T/4k       [using 16 threads]       7.66ms ± 3%        7.57ms ± 3%      -1.24%        (p=0.017 n=20+20)
BM_fullReduction_16T/10k      [using 16 threads]       65.3ms ± 4%        65.0ms ± 3%        ~           (p=0.718 n=20+20)
2020-06-16 15:55:11 -04:00
Pedro Caldeira
a475bf14d4 Fix pscatter and pgather for Altivec Complex double 2020-06-16 16:41:02 -03:00
David Tellenbach
c6c84ed961 Fix unused variable warning on Arm 2020-06-15 00:14:58 +02:00
Sebastien Boisvert
6228f27234 Fix #1818: SparseLU: add methods nnzL() and nnzU()
Now this compiles without errors:

$ clang++ -I ../../ test_sparseLU.cpp -std=c++03
2020-06-11 23:49:49 +00:00
Sebastien Boisvert
39cbd6578f Fix #1911: add benchmark for move semantics with fixed-size matrix
$ clang++ -O3 bench/bench_move_semantics.cpp -I. -std=c++11 \
        -o bench_move_semantics

$ ./bench_move_semantics
float copy semantics: 1755.97 ms
float move semantics: 55.063 ms
double copy semantics: 2457.65 ms
double move semantics: 55.034 ms
2020-06-11 23:43:25 +00:00
Antonio Sanchez
a7d2552af8 Remove HasCast and fix packetmath cast tests.
The use of the `packet_traits<>::HasCast` field is currently inconsistent with
`type_casting_traits<>`, and is unused apart from within
`test/packetmath.cpp`. In addition, those packetmath cast tests do not
currently reflect how casts are performed in practice: they ignore the
`SrcCoeffRatio` and `TgtCoeffRatio` fields, assuming a 1:1 ratio.

Here we remove the unsed `HasCast`, and modify the packet cast tests to
better reflect their usage.
2020-06-11 17:26:56 +00:00
Sebastien Boisvert
463ec86648 Fix #1757: remove the word 'suicide' 2020-06-11 00:56:54 +00:00
ShengYang1
b5d66b5e73 Implement scalar_cmp_with_cast_op 2020-06-09 08:12:07 +08:00
Rasmus Munk Larsen
c4059ffcb6 Fix static analyzer warning in SelfadjointProduct.h.
Fix compiler warnings in GeneralBlockPanelKernel.h.
2020-06-08 11:48:44 -07:00
Thales Sabino
1fcaaf460f Update FindComputeCpp.cmake to fix build problems on Windows
- Use standard types in SYCL/PacketMath.h to avoid compilation problems on Windows
- Add EIGEN_HAS_CONSTEXPR to cxx11_tensor_argmax_sycl.cpp to fix build problems on Windows
2020-06-05 20:51:20 +00:00
David Tellenbach
3ce18d3c8f Revert ".gitlab-ci.yml: initial commit"
This reverts commit 95177362ed to
disable GitLab CI temporarily.
2020-06-05 22:43:49 +02:00
Rasmus Munk Larsen
c2ab36f47a Fix broken packetmath test for logistic on Arm. 2020-06-04 16:24:47 -07:00
Rasmus Munk Larsen
537e2b322f Fix typo in previous update to generic predux_any. 2020-06-04 22:25:05 +00:00
Rasmus Munk Larsen
fdc1cbdce3 Avoid implicit float equality comparison in generic predux_any, but use numext::not_equal_strict to avoid breaking builds that compile with -Werror=float-equal. 2020-06-04 22:15:56 +00:00
Rasmus Munk Larsen
daf9bbeca2 Fix compilation error in logistic packet op. 2020-06-03 00:57:41 +00:00
n0mend
6d2a9a524b Update run instructions for benchCholesky 2020-06-01 18:31:46 +00:00
Gael Guennebaud
029a76e115 Bug #1777: make the scalar and packet path consistent for the logistic function + respective unit test 2020-05-31 00:53:37 +02:00
Gael Guennebaud
99b7f7cb9c Fix #556: warnings with mingw 2020-05-31 00:39:44 +02:00
Gael Guennebaud
72782d13e0 Bug #1767: increase required cmake version to 3.5.0 2020-05-31 00:31:09 +02:00
Gael Guennebaud
867a756509 Fix #1833: compilation issue of "array!=scalar" with c++20 2020-05-30 23:53:58 +02:00
Gael Guennebaud
ab615e4114 Save one extra temporary when assigning a sparse product to a row-major sparse matrix 2020-05-30 23:15:12 +02:00
Christoph Junghans
95177362ed .gitlab-ci.yml: initial commit 2020-05-29 09:23:25 -06:00
Kan Chen
8d1302f566 Add support for PacketBlock<Packet8s,4> and PacketBlock<Packet16uc,4> ptranspose on NEON 2020-05-29 00:33:45 +00:00
Antonio Sánchez
8719b9c5bc Disable test for 32-bit systems (e.g. ARM, i386)
Both i386 and 32-bit ARM do not define __uint128_t. On most systems, if
__uint128_t is defined, then so is the macro __SIZEOF_INT128__.

https://stackoverflow.com/questions/18531782/how-to-know-if-uint128-t-is-defined1
2020-05-28 17:40:15 +00:00
Yong Tang
8e1df5b082 Fix incorrect usage of if defined(EIGEN_ARCH_PPC) => if EIGEN_ARCH_PPC
This PR tries to fix an incorrect usage of `if defined(EIGEN_ARCH_PPC)`
in `Eigen/Core` header.

In `Eigen/src/Core/util/Macros.h`, EIGEN_ARCH_PPC was explicitly defined
as either 0 or 1. As a result `if defined(EIGEN_ARCH_PPC)` will always be true.
This causes issues when building on non PPC platform and `MatrixProduct.h` is not
available.

This fix changes `if defined(EIGEN_ARCH_PPC)` => `if EIGEN_ARCH_PPC`.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
2020-05-28 05:53:44 -07:00
Kan Chen
4e7046063b Fix #1874: it works on both MSVC 2017 and other platforms. 2020-05-21 18:42:56 +08:00
Pedro Caldeira
2d67af2d2b Add pscatter for Packet16{u}c (int8) 2020-05-20 17:29:34 -03:00
David Tellenbach
5328cd62b3 Guard usage of decltype since it's a C++11 feature
This fixes https://gitlab.com/libeigen/eigen/-/issues/1897
2020-05-20 16:04:16 +02:00
Rasmus Munk Larsen
cc86a31e20 Add guard around specialization for bool, which is only currently implemented for SSE. 2020-05-19 16:21:56 -07:00
Everton Constantino
8a7f360ec3 - Vectorizing MMA packing.
- Optimizing MMA kernel.
- Adding PacketBlock store to blas_data_mapper.
2020-05-19 19:24:11 +00:00
Rasmus Munk Larsen
a145e4adf5 Add newline at the end of StlIterators.h. 2020-05-15 20:36:00 +00:00
Gael Guennebaud
8ce9630ddb Fix #1874: workaround MSVC 2017 compilation issue. 2020-05-15 20:47:32 +02:00
Rasmus Munk Larsen
9b411757ab Add missing packet ops for bool, and make it pass the same packet op unit tests as other arithmetic types.
This change also contains a few minor cleanups:
  1. Remove packet op pnot, which is not needed for anything other than pcmp_le_or_nan,
     which can be done in other ways.
  2. Remove the "HasInsert" enum, which is no longer needed since we removed the
     corresponding packet ops.
  3. Add faster pselect op for Packet4i when SSE4.1 is supported.

Among other things, this makes the fast transposeInPlace() method available for Matrix<bool>.

Run on ************** (72 X 2994 MHz CPUs); 2020-05-09T10:51:02.372347913-07:00
CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB
Benchmark                        Time(ns)        CPU(ns)     Iterations
-----------------------------------------------------------------------
BM_TransposeInPlace<float>/4            9.77           9.77    71670320
BM_TransposeInPlace<float>/8           21.9           21.9     31929525
BM_TransposeInPlace<float>/16          66.6           66.6     10000000
BM_TransposeInPlace<float>/32         243            243        2879561
BM_TransposeInPlace<float>/59         844            844         829767
BM_TransposeInPlace<float>/64         933            933         750567
BM_TransposeInPlace<float>/128       3944           3945         177405
BM_TransposeInPlace<float>/256      16853          16853          41457
BM_TransposeInPlace<float>/512     204952         204968           3448
BM_TransposeInPlace<float>/1k     1053889        1053861            664
BM_TransposeInPlace<bool>/4            14.4           14.4     48637301
BM_TransposeInPlace<bool>/8            36.0           36.0     19370222
BM_TransposeInPlace<bool>/16           31.5           31.5     22178902
BM_TransposeInPlace<bool>/32          111            111        6272048
BM_TransposeInPlace<bool>/59          626            626        1000000
BM_TransposeInPlace<bool>/64          428            428        1632689
BM_TransposeInPlace<bool>/128        1677           1677         417377
BM_TransposeInPlace<bool>/256        7126           7126          96264
BM_TransposeInPlace<bool>/512       29021          29024          24165
BM_TransposeInPlace<bool>/1k       116321         116330           6068
2020-05-14 22:39:13 +00:00
Felipe Attanasio
d640276d31 Added support for reverse iterators for Vectorwise operations. 2020-05-14 22:38:20 +00:00
Christopher Moore
fa8fd4b4d5 Indexed view should have RowMajorBit when there is staticly a single row 2020-05-14 22:11:19 +00:00
Christopher Moore
a187ffea28 Resolve "IndexedView of a vector should allow linear access" 2020-05-13 19:24:42 +00:00
Mark Eberlein
ba9d18b938 Add KLU support to spbenchsolver 2020-05-11 21:50:27 +00:00
Pedro Caldeira
5fdc179241 Altivec template functions to better code reusability 2020-05-11 21:04:51 +00:00
mehdi-goli
d3e81db6c5 Eigen moved the scanLauncehr function inside the internal namespace.
This commit applies the following changes:
    - Moving the `scamLauncher` specialization inside internal namespace to fix compiler crash on TensorScan for SYCL backend.
    - Replacing  `SYCL/sycl.hpp` to `CL/sycl.hpp` in order to follow SYCL 1.2.1 standard.
    - minor fixes: commenting out an unused variable to avoid compiler warnings.
2020-05-11 16:10:33 +01:00
Rasmus Munk Larsen
c1d944dd91 Remove packet ops pinsertfirst and pinsertlast that are only used in a single place, and can be replaced by other ops when constructing the first/final packet in linspaced_op_impl::packetOp.
I cannot measure any performance changes for SSE, AVX, or AVX512.

name                                 old time/op             new time/op             delta
BM_LinSpace<float>/1                 1.63ns ± 0%             1.63ns ± 0%   ~             (p=0.762 n=5+5)
BM_LinSpace<float>/8                 4.92ns ± 3%             4.89ns ± 3%   ~             (p=0.421 n=5+5)
BM_LinSpace<float>/64                34.6ns ± 0%             34.6ns ± 0%   ~             (p=0.841 n=5+5)
BM_LinSpace<float>/512                217ns ± 0%              217ns ± 0%   ~             (p=0.421 n=5+5)
BM_LinSpace<float>/4k                1.68µs ± 0%             1.68µs ± 0%   ~             (p=1.000 n=5+5)
BM_LinSpace<float>/32k               13.3µs ± 0%             13.3µs ± 0%   ~             (p=0.905 n=5+4)
BM_LinSpace<float>/256k               107µs ± 0%              107µs ± 0%   ~             (p=0.841 n=5+5)
BM_LinSpace<float>/1M                 427µs ± 0%              427µs ± 0%   ~             (p=0.690 n=5+5)
2020-05-08 15:41:50 -07:00
David Tellenbach
5c4e19fbe7 Possibility to specify user-defined default cache sizes for GEBP kernel
Some architectures have no convinient way to determine cache sizes at
runtime. Eigen's GEBP kernel falls back to default cache values in this
case which might not be correct in all situations.

This patch introduces three preprocessor directives

  `EIGEN_DEFAULT_L1_CACHE_SIZE`
  `EIGEN_DEFAULT_L2_CACHE_SIZE`
  `EIGEN_DEFAULT_L3_CACHE_SIZE`

to give users the possibility to set these default values explicitly.
2020-05-08 12:54:36 +02:00
Rasmus Munk Larsen
225ab040e0 Remove unused packet op "palign".
Clean up a compiler warning in c++03 mode in AVX512/Complex.h.
2020-05-07 17:14:26 -07:00
Rasmus Munk Larsen
74ec8e6618 Make size odd for transposeInPlace test to make sure we hit the scalar path. 2020-05-07 17:29:56 +00:00
Rasmus Munk Larsen
49f1aeb60d Remove traits declaring NEON vectorized casts that do not actually have packet op implementations. 2020-05-07 09:49:22 -07:00
Rasmus Munk Larsen
2fd8a5a08f Add parallelization of TensorScanOp for types without packet ops.
Clean up the code a bit and do a few micro-optimizations to improve performance for small tensors.

Benchmark numbers for Tensor<uint32_t>:

name                                                       old time/op             new time/op             delta
BM_cumSumRowReduction_1T/8   [using 1 threads]             76.5ns ± 0%             61.3ns ± 4%    -19.80%          (p=0.008 n=5+5)
BM_cumSumRowReduction_1T/64  [using 1 threads]             2.47µs ± 1%             2.40µs ± 1%     -2.77%          (p=0.008 n=5+5)
BM_cumSumRowReduction_1T/256 [using 1 threads]             39.8µs ± 0%             39.6µs ± 0%     -0.60%          (p=0.008 n=5+5)
BM_cumSumRowReduction_1T/4k  [using 1 threads]             13.9ms ± 0%             13.4ms ± 1%     -4.19%          (p=0.008 n=5+5)
BM_cumSumRowReduction_2T/8   [using 2 threads]             76.8ns ± 0%             59.1ns ± 0%    -23.09%          (p=0.016 n=5+4)
BM_cumSumRowReduction_2T/64  [using 2 threads]             2.47µs ± 1%             2.41µs ± 1%     -2.53%          (p=0.008 n=5+5)
BM_cumSumRowReduction_2T/256 [using 2 threads]             39.8µs ± 0%             34.7µs ± 6%    -12.74%          (p=0.008 n=5+5)
BM_cumSumRowReduction_2T/4k  [using 2 threads]             13.8ms ± 1%              7.2ms ± 6%    -47.74%          (p=0.008 n=5+5)
BM_cumSumRowReduction_8T/8   [using 8 threads]             76.4ns ± 0%             61.8ns ± 3%    -19.02%          (p=0.008 n=5+5)
BM_cumSumRowReduction_8T/64  [using 8 threads]             2.47µs ± 1%             2.40µs ± 1%     -2.84%          (p=0.008 n=5+5)
BM_cumSumRowReduction_8T/256 [using 8 threads]             39.8µs ± 0%             28.3µs ±11%    -28.75%          (p=0.008 n=5+5)
BM_cumSumRowReduction_8T/4k  [using 8 threads]             13.8ms ± 0%              2.7ms ± 5%    -80.39%          (p=0.008 n=5+5)
BM_cumSumColReduction_1T/8   [using 1 threads]             59.1ns ± 0%             80.3ns ± 0%    +35.94%          (p=0.029 n=4+4)
BM_cumSumColReduction_1T/64  [using 1 threads]             3.06µs ± 0%             3.08µs ± 1%       ~             (p=0.114 n=4+4)
BM_cumSumColReduction_1T/256 [using 1 threads]              175µs ± 0%              176µs ± 0%       ~             (p=0.190 n=4+5)
BM_cumSumColReduction_1T/4k  [using 1 threads]              824ms ± 1%              844ms ± 1%     +2.37%          (p=0.008 n=5+5)
BM_cumSumColReduction_2T/8   [using 2 threads]             59.0ns ± 0%             90.7ns ± 0%    +53.74%          (p=0.029 n=4+4)
BM_cumSumColReduction_2T/64  [using 2 threads]             3.06µs ± 0%             3.10µs ± 0%     +1.08%          (p=0.016 n=4+5)
BM_cumSumColReduction_2T/256 [using 2 threads]              176µs ± 0%              189µs ±18%       ~             (p=0.151 n=5+5)
BM_cumSumColReduction_2T/4k  [using 2 threads]              836ms ± 2%              611ms ±14%    -26.92%          (p=0.008 n=5+5)
BM_cumSumColReduction_8T/8   [using 8 threads]             59.3ns ± 2%             90.6ns ± 0%    +52.79%          (p=0.008 n=5+5)
BM_cumSumColReduction_8T/64  [using 8 threads]             3.07µs ± 0%             3.10µs ± 0%     +0.99%          (p=0.016 n=5+4)
BM_cumSumColReduction_8T/256 [using 8 threads]              176µs ± 0%               80µs ±19%    -54.51%          (p=0.008 n=5+5)
BM_cumSumColReduction_8T/4k  [using 8 threads]              827ms ± 2%              180ms ±14%    -78.24%          (p=0.008 n=5+5)
2020-05-06 14:48:37 -07:00
Rasmus Munk Larsen
0e59f786e1 Fix accidental copy of loop variable. 2020-05-05 21:35:38 +00:00
Rasmus Munk Larsen
7b76c85daf Vectorize and parallelize TensorScanOp.
TensorScanOp is used in TensorFlow for a number of operations, such as cumulative logexp reduction and cumulative sum and product reductions.

The benchmarks numbers below are for cumulative row- and column reductions of NxN matrices.

name                                                         old time/op             new time/op     delta
BM_cumSumRowReduction_1T/4    [using 1 threads ]             25.1ns ± 1%             35.2ns ± 1%    +40.45%
BM_cumSumRowReduction_1T/8    [using 1 threads ]             73.4ns ± 0%             82.7ns ± 3%    +12.74%
BM_cumSumRowReduction_1T/32   [using 1 threads ]              988ns ± 0%              832ns ± 0%    -15.77%
BM_cumSumRowReduction_1T/64   [using 1 threads ]             4.07µs ± 2%             3.47µs ± 0%    -14.70%
BM_cumSumRowReduction_1T/128  [using 1 threads ]             18.0µs ± 0%             16.8µs ± 0%     -6.58%
BM_cumSumRowReduction_1T/512  [using 1 threads ]              287µs ± 0%              281µs ± 0%     -2.22%
BM_cumSumRowReduction_1T/2k   [using 1 threads ]             4.78ms ± 1%             4.78ms ± 2%       ~
BM_cumSumRowReduction_1T/10k  [using 1 threads ]              117ms ± 1%              117ms ± 1%       ~
BM_cumSumRowReduction_8T/4    [using 8 threads ]             25.0ns ± 0%             35.2ns ± 0%    +40.82%
BM_cumSumRowReduction_8T/8    [using 8 threads ]             77.2ns ±16%             81.3ns ± 0%       ~
BM_cumSumRowReduction_8T/32   [using 8 threads ]              988ns ± 0%              833ns ± 0%    -15.67%
BM_cumSumRowReduction_8T/64   [using 8 threads ]             4.08µs ± 2%             3.47µs ± 0%    -14.95%
BM_cumSumRowReduction_8T/128  [using 8 threads ]             18.0µs ± 0%             17.3µs ±10%       ~
BM_cumSumRowReduction_8T/512  [using 8 threads ]              287µs ± 0%               58µs ± 6%    -79.92%
BM_cumSumRowReduction_8T/2k   [using 8 threads ]             4.79ms ± 1%             0.64ms ± 1%    -86.58%
BM_cumSumRowReduction_8T/10k  [using 8 threads ]              117ms ± 1%               18ms ± 6%    -84.50%

BM_cumSumColReduction_1T/4    [using 1 threads ]             23.9ns ± 0%             33.4ns ± 1%    +39.68%
BM_cumSumColReduction_1T/8    [using 1 threads ]             71.6ns ± 1%             49.1ns ± 3%    -31.40%
BM_cumSumColReduction_1T/32   [using 1 threads ]              973ns ± 0%              165ns ± 2%    -83.10%
BM_cumSumColReduction_1T/64   [using 1 threads ]             4.06µs ± 1%             0.57µs ± 1%    -85.94%
BM_cumSumColReduction_1T/128  [using 1 threads ]             33.4µs ± 1%              4.1µs ± 1%    -87.67%
BM_cumSumColReduction_1T/512  [using 1 threads ]             1.72ms ± 4%             0.21ms ± 5%    -87.91%
BM_cumSumColReduction_1T/2k   [using 1 threads ]              119ms ±53%               11ms ±35%    -90.42%
BM_cumSumColReduction_1T/10k  [using 1 threads ]              1.59s ±67%              0.35s ±49%    -77.96%
BM_cumSumColReduction_8T/4    [using 8 threads ]             23.8ns ± 0%             33.3ns ± 0%    +40.06%
BM_cumSumColReduction_8T/8    [using 8 threads ]             71.6ns ± 1%             49.2ns ± 5%    -31.33%
BM_cumSumColReduction_8T/32   [using 8 threads ]             1.01µs ±12%             0.17µs ± 3%    -82.93%
BM_cumSumColReduction_8T/64   [using 8 threads ]             4.15µs ± 4%             0.58µs ± 1%    -86.09%
BM_cumSumColReduction_8T/128  [using 8 threads ]             33.5µs ± 0%              4.1µs ± 4%    -87.65%
BM_cumSumColReduction_8T/512  [using 8 threads ]             1.71ms ± 3%             0.06ms ±16%    -96.21%
BM_cumSumColReduction_8T/2k   [using 8 threads ]             97.1ms ±14%              3.0ms ±23%    -96.88%
BM_cumSumColReduction_8T/10k  [using 8 threads ]              1.97s ± 8%              0.06s ± 2%    -96.74%
2020-05-05 00:19:43 +00:00
Xiaoxiang Cao
a74a278abd Fix confusing template param name for Stride fwd decl. 2020-04-30 01:43:05 +00:00
Rasmus Munk Larsen
923ee9aba3 Fix the embarrassingly incomplete fix to the embarrassing bug in blocked transpose. 2020-04-29 17:27:36 +00:00
Rasmus Munk Larsen
a32923a439 Fix (embarrassing) bug in blocked transpose. 2020-04-29 17:02:27 +00:00
Rasmus Munk Larsen
1e41406c36 Add missing transpose in cleanup loop. Without it, we trip an assertion in debug mode. 2020-04-29 01:30:51 +00:00
Rasmus Munk Larsen
fbe7916c55 Fix compilation error with Clang on Android: _mm_extract_epi64 fails to compile. 2020-04-29 00:58:41 +00:00
Clément Grégoire
82f54ad144 Fix perf monitoring merge function 2020-04-28 17:02:59 +00:00
Rasmus Munk Larsen
ab773c7e91 Extend support for Packet16b:
* Add ptranspose<*,4> to support matmul and add unit test for Matrix<bool> * Matrix<bool>
* work around a bug in slicing of Tensor<bool>.
* Add tensor tests

This speeds up matmul for boolean matrices by about 10x

name                            old time/op             new time/op             delta
BM_MatMul<bool>/8                267ns ± 0%              479ns ± 0%  +79.25%          (p=0.008 n=5+5)
BM_MatMul<bool>/32              6.42µs ± 0%             0.87µs ± 0%  -86.50%          (p=0.008 n=5+5)
BM_MatMul<bool>/64              43.3µs ± 0%              5.9µs ± 0%  -86.42%          (p=0.008 n=5+5)
BM_MatMul<bool>/128              315µs ± 0%               44µs ± 0%  -85.98%          (p=0.008 n=5+5)
BM_MatMul<bool>/256             2.41ms ± 0%             0.34ms ± 0%  -85.68%          (p=0.008 n=5+5)
BM_MatMul<bool>/512             18.8ms ± 0%              2.7ms ± 0%  -85.53%          (p=0.008 n=5+5)
BM_MatMul<bool>/1k               149ms ± 0%               22ms ± 0%  -85.40%          (p=0.008 n=5+5)
2020-04-28 16:12:47 +00:00
Rasmus Munk Larsen
b47c777993 Block transposeInPlace() when the matrix is real and square. This yields a large speedup because we transpose in registers (or L1 if we spill), instead of one packet at a time, which in the worst case makes the code write to the same cache line PacketSize times instead of once.
rmlarsen@rmlarsen4:.../eigen_bench/google3$ benchy --benchmarks=.*TransposeInPlace.*float.* --reference=srcfs experimental/users/rmlarsen/bench:matmul_bench
 10 / 10 [====================================================================================================================================================================================================================] 100.00% 2m50s
(Generated by http://go/benchy. Settings: --runs 5 --benchtime 1s --reference "srcfs" --benchmarks ".*TransposeInPlace.*float.*" experimental/users/rmlarsen/bench:matmul_bench)

name                                       old time/op             new time/op             delta
BM_TransposeInPlace<float>/4               9.84ns ± 0%             6.51ns ± 0%  -33.80%          (p=0.008 n=5+5)
BM_TransposeInPlace<float>/8               23.6ns ± 1%             17.6ns ± 0%  -25.26%          (p=0.016 n=5+4)
BM_TransposeInPlace<float>/16              78.8ns ± 0%             60.3ns ± 0%  -23.50%          (p=0.029 n=4+4)
BM_TransposeInPlace<float>/32               302ns ± 0%              229ns ± 0%  -24.40%          (p=0.008 n=5+5)
BM_TransposeInPlace<float>/59              1.03µs ± 0%             0.84µs ± 1%  -17.87%          (p=0.016 n=5+4)
BM_TransposeInPlace<float>/64              1.20µs ± 0%             0.89µs ± 1%  -25.81%          (p=0.008 n=5+5)
BM_TransposeInPlace<float>/128             8.96µs ± 0%             3.82µs ± 2%  -57.33%          (p=0.008 n=5+5)
BM_TransposeInPlace<float>/256              152µs ± 3%               17µs ± 2%  -89.06%          (p=0.008 n=5+5)
BM_TransposeInPlace<float>/512              837µs ± 1%              208µs ± 0%  -75.15%          (p=0.008 n=5+5)
BM_TransposeInPlace<float>/1k              4.28ms ± 2%             1.08ms ± 2%  -74.72%          (p=0.008 n=5+5)
2020-04-28 16:08:16 +00:00
Pedro Caldeira
29f0917a43 Add support to vector instructions to Packet16uc and Packet16c 2020-04-27 12:48:08 -03:00
Rasmus Munk Larsen
e80ec24357 Remove unused packet op "preduxp". 2020-04-23 18:17:14 +00:00
René Wagner
0aebe19aca BooleanRedux.h: Add more EIGEN_DEVICE_FUNC qualifiers.
This enables operator== on Eigen matrices in device code.
2020-04-23 17:25:08 +02:00
Eugene Zhulenev
3c02fefec5 Add async evaluation support to TensorSlicingOp.
Device::memcpy is not async-safe and might lead to deadlocks. Always evaluate slice expression in async mode.
2020-04-22 19:55:01 +00:00
Pedro Caldeira
0c67b855d2 Add Packet8s and Packet8us to support signed/unsigned int16/short Altivec vector operations 2020-04-21 14:52:46 -03:00
Rasmus Munk Larsen
e8f40e4670 Fix bug in ptrue for Packet16b. 2020-04-20 21:45:10 +00:00
Rasmus Munk Larsen
2f6ddaa25c Add partial vectorization for matrices and tensors of bool. This speeds up boolean operations on Tensors by up to 25x.
Benchmark numbers for the logical and of two NxN tensors:

name                                               old time/op             new time/op             delta
BM_booleanAnd_1T/3   [using 1 threads]             14.6ns ± 0%             14.4ns ± 0%   -0.96%
BM_booleanAnd_1T/4   [using 1 threads]             20.5ns ±12%              9.0ns ± 0%  -56.07%
BM_booleanAnd_1T/7   [using 1 threads]             41.7ns ± 0%             10.5ns ± 0%  -74.87%
BM_booleanAnd_1T/8   [using 1 threads]             52.1ns ± 0%             10.1ns ± 0%  -80.59%
BM_booleanAnd_1T/10  [using 1 threads]             76.3ns ± 0%             13.8ns ± 0%  -81.87%
BM_booleanAnd_1T/15  [using 1 threads]              167ns ± 0%               16ns ± 0%  -90.45%
BM_booleanAnd_1T/16  [using 1 threads]              188ns ± 0%               16ns ± 0%  -91.57%
BM_booleanAnd_1T/31  [using 1 threads]              667ns ± 0%               34ns ± 0%  -94.83%
BM_booleanAnd_1T/32  [using 1 threads]              710ns ± 0%               35ns ± 0%  -95.01%
BM_booleanAnd_1T/64  [using 1 threads]             2.80µs ± 0%             0.11µs ± 0%  -95.93%
BM_booleanAnd_1T/128 [using 1 threads]             11.2µs ± 0%              0.4µs ± 0%  -96.11%
BM_booleanAnd_1T/256 [using 1 threads]             44.6µs ± 0%              2.5µs ± 0%  -94.31%
BM_booleanAnd_1T/512 [using 1 threads]              178µs ± 0%               10µs ± 0%  -94.35%
BM_booleanAnd_1T/1k  [using 1 threads]              717µs ± 0%               78µs ± 1%  -89.07%
BM_booleanAnd_1T/2k  [using 1 threads]             2.87ms ± 0%             0.31ms ± 1%  -89.08%
BM_booleanAnd_1T/4k  [using 1 threads]             11.7ms ± 0%              1.9ms ± 4%  -83.55%
BM_booleanAnd_1T/10k [using 1 threads]             70.3ms ± 0%             17.2ms ± 4%  -75.48%
2020-04-20 20:16:28 +00:00
dlazenby
00f6340153 Update PreprocessorDirectives.dox - Added line for the new VectorwiseOp plugin directive (and re-alphabatized the plugin section) 2020-04-17 21:43:37 +00:00
Rasmus Munk Larsen
5ab87d8aba Move eigen_packet_wrapper to GenericPacketMath.h and use it for SSE/AVX/AVX512 as it is already used for NEON.
This will allow us to define multiple packet types backed by the same vector type, e.g., __m128i.
Use this machanism to define packets for half and clean up the packet op implementations.
2020-04-15 18:17:19 +00:00
Rasmus Munk Larsen
4aae8ac693 Fix typo in TypeCasting.h 2020-04-14 02:55:51 +00:00
Rasmus Munk Larsen
1d674003b2 Fix big in vectorized casting of
{uint8, int8} -> {int16, uint16, int32, uint32, float} 
 {uint16, int16} -> {int32, uint32, int64, uint64, float} 

for NEON. These conversions were advertised as vectorized, but not actually implemented.
2020-04-14 02:11:06 +00:00
Changming Sun
b1aa07a8d3 Fix a bug in TensorIndexList.h 2020-04-13 18:22:03 +00:00
Christoph Hertzberg
d46d726e9d CommaInitializer wrongfully asserted for 0-sized blocks
commainitialier unit-test never actually called `test_block_recursion`, which also was not correctly implemented and would have caused too deep template recursion.
2020-04-13 16:41:20 +02:00
Antonio Sanchez
c854e189e6 Fixed commainitializer test.
The removed `finished()` call was responsible for enforcing that the
initializer was provided the correct number of values. Putting it back in
to restore previous behavior.
2020-04-10 13:53:26 -07:00
jangsoopark
39142904cc Resolve C4346 when building eigen on windows 2020-04-08 14:55:39 +09:00
Rasmus Munk Larsen
f0577a2bfd Speed up matrix multiplication for small to medium size matrices by using half- or quarter-packet vectorized loads in gemm_pack_rhs if they have size 4, instead of dropping down the the scalar path.
Benchmark measurements below are for computing ```c.noalias() = a.transpose() * b;``` for square RowMajor matrices of varying size.

Measured improvement with AVX+FMA:

name                           old time/op             new time/op             delta
BM_MatMul_ATB/8                 139ns ± 1%              129ns ± 1%   -7.49%          (p=0.008 n=5+5)
BM_MatMul_ATB/32               1.46µs ± 1%             1.22µs ± 0%  -16.72%          (p=0.008 n=5+5)
BM_MatMul_ATB/64               8.43µs ± 1%             7.41µs ± 0%  -12.04%          (p=0.008 n=5+5)
BM_MatMul_ATB/128              56.8µs ± 1%             52.9µs ± 1%   -6.83%          (p=0.008 n=5+5)
BM_MatMul_ATB/256               407µs ± 1%              395µs ± 3%   -2.94%          (p=0.032 n=5+5)
BM_MatMul_ATB/512              3.27ms ± 3%             3.18ms ± 1%     ~             (p=0.056 n=5+5)


Measured improvement for AVX512:

name                          old time/op             new time/op             delta
BM_MatMul_ATB/8                167ns ± 1%              154ns ± 1%   -7.63%          (p=0.008 n=5+5)
BM_MatMul_ATB/32              1.08µs ± 1%             0.83µs ± 3%  -23.58%          (p=0.008 n=5+5)
BM_MatMul_ATB/64              6.21µs ± 1%             5.06µs ± 1%  -18.47%          (p=0.008 n=5+5)
BM_MatMul_ATB/128             36.1µs ± 2%             31.3µs ± 1%  -13.32%          (p=0.008 n=5+5)
BM_MatMul_ATB/256              263µs ± 2%              242µs ± 2%   -7.92%          (p=0.008 n=5+5)
BM_MatMul_ATB/512             1.95ms ± 2%             1.91ms ± 2%     ~             (p=0.095 n=5+5)
BM_MatMul_ATB/1k              15.4ms ± 4%             14.8ms ± 2%     ~             (p=0.095 n=5+5)
2020-04-07 22:09:51 +00:00
Antonio Sanchez
8e875719b3 Replace norm() with squaredNorm() to address integer overflows
For random matrices with integer coefficients, many of the tests here lead to
integer overflows. When taking the norm() of a row/column, the squaredNorm()
often overflows to a negative value, leading to domain errors when taking the
sqrt(). This leads to a crash on some systems. By replacing the norm() call by
a squaredNorm(), the values still overflow, but at least there is no domain
error.

Addresses https://gitlab.com/libeigen/eigen/-/issues/1856
2020-04-07 19:48:28 +00:00
Antonio Sanchez
9dda5eb7d2 Missing struct definition in NumTraits 2020-04-07 09:01:11 -07:00
Akshay Naresh Modi
bcc0e9e15c Add numeric_limits min and max for bool
This will allow (among other things) computation of argmax and argmin of bool tensors
2020-04-06 23:38:57 +00:00
Bernardo Bahia Monteiro
54a0a9c9dd Bugfix: conjugate_gradient did not compile with lazy-evaluated RealScalar
The error generated by the compiler was:

    no matching function for call to 'maxi'
    RealScalar threshold = numext::maxi(tol*tol*rhsNorm2,considerAsZero);

The important part in the following notes was:

    candidate template ignored: deduced conflicting
    types for parameter 'T'"
    ('codi::Multiply11<...>' vs. 'codi::ActiveReal<...>')
    EIGEN_ALWAYS_INLINE T maxi(const T& x, const T& y)

I am using CoDiPack to provide the RealScalar type.
This bug was introduced in bc000deaa Fix conjugate-gradient for very small rhs
2020-03-29 19:44:12 -04:00
Rasmus Munk Larsen
4fd5d1477b Fix packetmath test build for AVX. 2020-03-27 17:05:39 +00:00
Rasmus Munk Larsen
393dbd8ee9 Fix bug in 52d54278be 2020-03-27 16:42:18 +00:00
Rasmus Munk Larsen
55c8fe8d0f Fix bug in 52d54278be 2020-03-27 16:41:15 +00:00
Joel Holdsworth
6d2dbfc453 NEON: Fixed MSVC types definitions 2020-03-26 20:19:58 +00:00
Joel Holdsworth
52d54278be Additional NEON packet-math operations 2020-03-26 20:18:19 +00:00
Everton Constantino
deb93ed1bf Adhere to recommended load/store intrinsics for pp64le 2020-03-23 15:18:15 -03:00
Aaron Franke
5c22c7a7de Make file formatting comply with POSIX and Unix standards
UTF-8, LF, no BOM, and newlines at the end of files
2020-03-23 18:09:02 +00:00
Everton Constantino
5afdaa473a Fixing float32's pround halfway criteria to match STL's criteria. 2020-03-21 22:30:54 -05:00
Alessio M
96cd1ff718 Fixed:
- access violation when initializing 0x0 matrices
- exception can be thrown during stack unwind while comma-initializing a matrix if eigen_assert if configured to throw
2020-03-21 05:11:21 +00:00
dlazenby
cc954777f2 Update VectorwiseOp.h to allow Plugins similar to MatrixBase.h or ArrayBase.h 2020-03-20 19:30:01 +00:00
Masaki Murooka
55ecd58a3c Bug https://gitlab.com/libeigen/eigen/-/issues/1415: add missing EIGEN_DEVICE_FUNC to diagonal_product_evaluator_base. 2020-03-20 13:37:37 +09:00
Rasmus Munk Larsen
4da2c6b197 Remove reference to non-existent unary_op_base class. 2020-03-19 18:23:06 +00:00
Rasmus Munk Larsen
eda90baf35 Add missing arguments to numext::absdiff(). 2020-03-19 18:16:55 +00:00
Joel Holdsworth
d5c665742b Add absolute_difference coefficient-wise binary Array function 2020-03-19 17:45:20 +00:00
Everton Constantino
6ff5a14091 Reenabling packetmath unsigned tests, adding dummy pabs for relevant unsigned
types.
2020-03-19 17:31:49 +00:00
Joel Holdsworth
232f904082 Add shift_left<N> and shift_right<N> coefficient-wise unary Array functions 2020-03-19 17:24:06 +00:00
Joel Holdsworth
54aa8fa186 Implement integer square-root for NEON 2020-03-19 17:05:13 +00:00
Allan Leal
37ccb86916 Update NullaryFunctors.h 2020-03-16 11:59:02 +00:00
Deven Desai
7158ed4e0e Fixing HIP breakage caused by the recent commit that introduces Packet4h2 as the Eigen::Half packet type 2020-03-12 01:06:24 +00:00
Joel Holdsworth
d53ae40f7b NEON: Added int64_t and uint64_t packet math 2020-03-10 22:46:19 +00:00
Joel Holdsworth
4b9ecf2924 NEON: Added int8_t and uint8_t packet math 2020-03-10 22:46:19 +00:00
Joel Holdsworth
ceaabd4e16 NEON: Added int16_t and uint16_t packet math 2020-03-10 22:46:19 +00:00
Joel Holdsworth
d5d3cf9339 NEON: Added uint32_t packet math 2020-03-10 22:46:19 +00:00
Joel Holdsworth
eacf97f727 NEON: Implemented half-size vectors 2020-03-10 22:46:19 +00:00
Joel Holdsworth
5f411b729e NEON: Set packet_traits<double> flags 2020-03-10 22:46:19 +00:00
Joel Holdsworth
88337acae2 test/packetmath: Add tests for all integer types 2020-03-10 22:46:19 +00:00
Joel Holdsworth
9e68977578 test/packetmath: Made negate non-mandatory 2020-03-10 22:46:19 +00:00
Sami Kama
b733b8b680 remove duplicate pset1 for half and add some comments about why we need expose pmul/add/div/min/max on host 2020-03-10 20:28:43 +00:00
Ram-Z
a45d28256d Don't restrict CMAKE_BUILD_TYPE
This prevents projects that add Eigen using `add_subdirectory` from using their own custom CMAKE_BUILD_TYPE and have Eigen respect the same custom flags.
2020-02-28 20:46:53 +00:00
Cédric Hubert
98bfc5aaa8 Update MarketIO.h 2020-02-28 12:41:51 +00:00
Rasmus Munk Larsen
52a2fbbb00 Revert "avoid selecting half-packets when unnecessary"
This reverts commit 5ca10480b0
2020-02-25 01:07:43 +00:00
Rasmus Munk Larsen
235bcfe08d Revert "Pick full packet unconditionally when EIGEN_UNALIGNED_VECTORIZE"
This reverts commit 44df2109c8
2020-02-25 01:07:28 +00:00
Rasmus Munk Larsen
d7a42eade6 Revert "do not pick full-packet if it'd result in more operations"
This reverts commit e9cc0cd353
2020-02-25 01:07:15 +00:00
Rasmus Munk Larsen
6ac37768a9 Revert "add some static checks for packet-picking logic"
This reverts commit 7769600245
2020-02-25 01:07:04 +00:00
Rasmus Munk Larsen
87cfa4862f Revert "Disable test in test/vectorization_logic.cpp, which is currently failing with AVX."
This reverts commit b625adffd8
2020-02-25 01:04:56 +00:00
Rasmus Munk Larsen
b625adffd8 Disable test in test/vectorization_logic.cpp, which is currently failing with AVX. 2020-02-24 23:28:25 +00:00
Tobias Bosch
f0ce88cff7 Include <sstream> explicitly, and don't rely on the implicit include via <complex>.
This implicit dependency does no longer exist in a recent llbm release (sha 78be61871704).
2020-02-24 23:09:36 +00:00
Ilya Tokar
eb6cc29583 Avoid a division in NonBlockingThreadPool::Steal.
Looking at profiles we spend ~10-20% of Steal on simply computing
random % size. We can reduce random 32-bit int into [0, size) range with
a single multiplication and shift. This transformation is described in
https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
2020-02-14 16:02:57 -05:00
Francesco Mazzoli
7769600245 add some static checks for packet-picking logic 2020-02-07 18:16:16 +01:00
Francesco Mazzoli
e9cc0cd353 do not pick full-packet if it'd result in more operations
See comment and
<https://gitlab.com/libeigen/eigen/merge_requests/46#note_270622952>.
2020-02-07 18:16:16 +01:00
Francesco Mazzoli
44df2109c8 Pick full packet unconditionally when EIGEN_UNALIGNED_VECTORIZE
See comment for details.
2020-02-07 18:16:16 +01:00
Francesco Mazzoli
5ca10480b0 avoid selecting half-packets when unnecessary
See
<https://stackoverflow.com/questions/59709148/ensuring-that-eigen-uses-avx-vectorization-for-a-certain-operation>
for an explanation of the problem this solves.

In short, for some reason, before this commit the half-packet is
selected when the array / matrix size is not a multiple of
`unpacket_traits<PacketType>::size`, where `PacketType` starts out
being the full Packet.

For example, for some data of 100 `float`s, `Packet4f` will be
selected rather than `Packet8f`, because 100 is not a multiple of 8,
the size of `Packet8f`.

This commit switches to selecting the half-packet if the size is
less than the packet size, which seems to make more sense.

As I stated in the SO post I'm not sure that I'm understanding the
issue correctly, but this fix resolves the issue in my program. Moreover,
`make check` passes, with the exception of line 614 and 616 in
`test/packetmath.cpp`, which however also fail on master on my machine:

    CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_i0, internal::pbessel_i0);
    ...
    CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_i1, internal::pbessel_i1);
2020-02-07 18:16:16 +01:00
Eugene Zhulenev
f584bd9b30 Fail at compile time if default executor tries to use non-default device 2020-02-06 22:43:24 +00:00
Eugene Zhulenev
3fda850c46 Remove dead code from TensorReduction.h 2020-01-29 18:45:31 +00:00
Jeff Daily
b5df8cabd7 fix hip-clang compilation due to new HIP scalar accessor 2020-01-20 21:08:52 +00:00
Deven Desai
6d284bb1b7 Fix for HIP breakage - 200115. Adding a missing EIGEN_DEVICE_FUNC attr 2020-01-16 00:51:43 +00:00
Srinivas Vasudevan
f6c6de5d63 Ensure Igamma does not NaN or Inf for large values. 2020-01-14 21:32:48 +00:00
Rasmus Munk Larsen
6601abce86 Remove rogue include in TypeCasting.h. Meta.h is already included by the top-level header in Eigen/Core. 2020-01-14 21:03:53 +00:00
Eugene Zhulenev
b9362fb8f7 Convert StridedLinearBufferCopy::Kind to enum class 2020-01-13 11:43:24 -08:00
Everton Constantino
5a8b97b401 Switching unpacket_traits<Packet4i> to vectorizable=true. 2020-01-13 16:08:20 -03:00
Everton Constantino
42838c28b8 Adding correct cache sizes for PPC architecture. 2020-01-13 16:58:14 +00:00
Christoph Hertzberg
1d0c45122a Removing executable bit from file mode 2020-01-11 15:02:29 +01:00
Christoph Hertzberg
35219cea68 Bug #1790: Make areApprox check numext::isnan instead of bitwise equality (NaNs don't have to be bitwise equal). 2020-01-11 14:57:22 +01:00
Srinivas Vasudevan
2e099e8d8f Added special_packetmath test and tweaked bounds on tests.
Refactor shared packetmath code to header file.
(Squashed from PR !38)
2020-01-11 10:31:21 +00:00
Rasmus Munk Larsen
e1ecfc162d call Explicitly ::rint and ::rintf for targets without c++11. Without this, the Windows build breaks when trying to compile numext::rint<double>. 2020-01-10 21:14:08 +00:00
Joel Holdsworth
da5a7afed0 Improvements to the tidiness and completeness of the NEON implementation 2020-01-10 18:31:15 +00:00
Anuj Rawat
452371cead Fix for gcc build error when using Eigen headers with AVX512 2020-01-10 18:05:42 +00:00
mehdi-goli
601f89dfd0 Adding RInt vector support for SYCL. 2020-01-10 18:00:36 +00:00
Matthew Powelson
2ea5a715cf Properly initialize b vector in SplineFitting
InterpolateWithDerivative does not initialize the be vector correctly. This issue is discussed In stackoverflow question 48382939.
2020-01-09 21:29:04 +00:00
Rasmus Munk Larsen
9254974115 Don't add EIGEN_DEVICE_FUNC to random() since ::rand is not available in Cuda. 2020-01-09 21:23:09 +00:00
Rasmus Munk Larsen
a3ec89b5bd Add missing EIGEN_DEVICE_FUNC annotations in MathFunctions.h. 2020-01-09 21:06:34 +00:00
Christoph Hertzberg
8333e03590 Use data.data() instead of &data (since it is not obvious that Array is trivially copyable) 2020-01-09 11:38:19 +01:00
Rasmus Munk Larsen
e6fcee995b Don't use the rational approximation to the logistic function on GPUs as it appears to be slightly slower. 2020-01-09 00:04:26 +00:00
Rasmus Munk Larsen
4217a9f090 The upper limits for where to use the rational approximation to the logistic function were not set carefully enough in the original commit, and some arguments would cause the function to return values greater than 1. This change set the versions found by scanning all floating point numbers (using std::nextafterf()). 2020-01-08 22:21:37 +00:00
Christoph Hertzberg
9623c0c4b9 Fix formatting 2020-01-08 13:58:18 +01:00
Ilya Tokar
19876ced76 Bug #1785: Introduce numext::rint.
This provides a new op that matches std::rint and previous behavior of
pround. Also adds corresponding unsupported/../Tensor op.
Performance is the same as e. g. floor (tested SSE/AVX).
2020-01-07 21:22:44 +00:00
mehdi-goli
d0ae052da4 [SYCL Backend]
* Adding Missing operations for vector comparison in SYCL. This caused compiler error for vector comparison when compiling SYCL
 * Fixing the compiler error for placement new in TensorForcedEval.h This caused compiler error when compiling SYCL backend
 * Reducing the SYCL warning by  removing the abort function inside the kernel
 * Adding Strong inline to functions inside SYCL interop.
2020-01-07 15:13:37 +00:00
Everton Constantino
eedb7eeacf Protecting integer_types's long long test with a check to see if we have CXX11 support. 2020-01-07 14:35:35 +00:00
Christoph Hertzberg
bcbaad6d87 Bug #1800: Guard against misleading indentation 2020-01-03 13:47:43 +01:00
Janek Kozicki
00de570793 Fix -Werror -Wfloat-conversion warning. 2019-12-23 23:52:44 +01:00
Deven Desai
636e2bb3fa Fix for HIP breakage - 191220
The breakage was introduced by the following commit :

ae07801dd8

After the commit, HIPCC errors out on some tests with the following error

```
Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_device_1.dir/cxx11_tensor_device_1_generated_cxx11_tensor_device.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_device.cu:17:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor💯
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlock.h:129:12: error: no matching constructor for initialization of 'Eigen::internal::TensorBlockResourceRequirements'
    return {merge(lhs.shape_type, rhs.shape_type),           // shape_type
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlock.h:75:8: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 3 were provided
struct TensorBlockResourceRequirements {
       ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlock.h:75:8: note: candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 3 were provided
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlock.h:75:8: note: candidate constructor (the implicit copy constructor) not viable: requires 5 arguments, but 3 were provided
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlock.h:75:8: note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 3 were provided
...
...
```

The fix is to explicitly decalre the (implicitly called) constructor as a device func
2019-12-20 21:28:00 +00:00
Christoph Hertzberg
1e9664b147 Bug #1796: Make matrix squareroot usable for Map and Ref types 2019-12-20 18:10:22 +01:00
Christoph Hertzberg
d86544d654 Reduce code duplication and avoid confusing Doxygen 2019-12-19 19:48:39 +01:00
Christoph Hertzberg
dde279f57d Hide recursive meta templates from Doxygen 2019-12-19 19:47:23 +01:00
Christoph Hertzberg
c21771ac04 Use double-braces initialization (as everywhere else in the test-suite). 2019-12-19 19:20:48 +01:00
Christoph Hertzberg
a3273aeff8 Fix trivial shadow warning 2019-12-19 19:13:11 +01:00
Christoph Hertzberg
870e53c0f2 Bug #1788: Fix rule-of-three violations inside the stable modules.
This fixes deprecated-copy warnings when compiling with GCC>=9
Also protect some additional Base-constructors from getting called by user code code (#1587)
2019-12-19 17:30:11 +01:00
Christoph Hertzberg
6965f6de7f Fix unit-test which I broke in previous fix 2019-12-19 13:42:14 +01:00
Eugene Zhulenev
7a65219a2e Fix TensorPadding bug in squeezed reads from inner dimension 2019-12-19 05:43:57 +00:00
Eugene Zhulenev
73e55525e5 Return const data pointer from TensorRef evaluator.data() 2019-12-18 23:19:36 +00:00
Eugene Zhulenev
ae07801dd8 Tensor block evaluation cost model 2019-12-18 20:07:00 +00:00
Christoph Hertzberg
72166d0e6e Fix some maybe-unitialized warnings 2019-12-18 18:26:20 +01:00
Christoph Hertzberg
5a3eaf88ac Workaround class-memaccess warnings on newer GCC versions 2019-12-18 16:37:26 +01:00
Jeff Daily
de07c4d1c2 fix compilation due to new HIP scalar accessor 2019-12-17 20:27:30 +00:00
Eugene Zhulenev
788bef6ab5 Reduce block evaluation overhead for small tensor expressions 2019-12-17 19:06:14 +00:00
Rasmus Munk Larsen
7252163335 Add default definition for EIGEN_PREDICT_* 2019-12-16 22:31:59 +00:00
Rasmus Munk Larsen
a566074480 Improve accuracy of fast approximate tanh and the logistic functions in Eigen, such that they preserve relative accuracy to within a few ULPs where their function values tend to zero (around x=0 for tanh, and for large negative x for the logistic function).
This change re-instates the fast rational approximation of the logistic function for float32 in Eigen (removed in 66f07efeae), but uses the more accurate approximation 1/(1+exp(-1)) ~= exp(x) below -9. The exponential is only calculated on the vectorized path if at least one element in the SIMD input vector is less than -9.

This change also contains a few improvements to speed up the original float specialization of logistic:
  - Introduce EIGEN_PREDICT_{FALSE,TRUE} for __builtin_predict and use it to predict that the logistic-only path is most likely (~2-3% speedup for the common case).
  - Carefully set the upper clipping point to the smallest x where the approximation evaluates to exactly 1. This saves the explicit clamping of the output (~7% speedup).

The increased accuracy for tanh comes at a cost of 10-20% depending on instruction set.

The benchmarks below repeated calls

   u = v.logistic()  (u = v.tanh(), respectively)

where u and v are of type Eigen::ArrayXf, have length 8k, and v contains random numbers in [-1,1].

Benchmark numbers for logistic:

Before:
Benchmark                  Time(ns)        CPU(ns)     Iterations
-----------------------------------------------------------------
SSE
BM_eigen_logistic_float        4467           4468         155835  model_time: 4827
AVX
BM_eigen_logistic_float        2347           2347         299135  model_time: 2926
AVX+FMA
BM_eigen_logistic_float        1467           1467         476143  model_time: 2926
AVX512
BM_eigen_logistic_float         805            805         858696  model_time: 1463

After:
Benchmark                  Time(ns)        CPU(ns)     Iterations
-----------------------------------------------------------------
SSE
BM_eigen_logistic_float        2589           2590         270264  model_time: 4827
AVX
BM_eigen_logistic_float        1428           1428         489265  model_time: 2926
AVX+FMA
BM_eigen_logistic_float        1059           1059         662255  model_time: 2926
AVX512
BM_eigen_logistic_float         673            673        1000000  model_time: 1463

Benchmark numbers for tanh:

Before:
Benchmark                  Time(ns)        CPU(ns)     Iterations
-----------------------------------------------------------------
SSE
BM_eigen_tanh_float        2391           2391         292624  model_time: 4242
AVX
BM_eigen_tanh_float        1256           1256         554662  model_time: 2633
AVX+FMA
BM_eigen_tanh_float         823            823         866267  model_time: 1609
AVX512
BM_eigen_tanh_float         443            443        1578999  model_time: 805

After:
Benchmark                  Time(ns)        CPU(ns)     Iterations
-----------------------------------------------------------------
SSE
BM_eigen_tanh_float        2588           2588         273531  model_time: 4242
AVX
BM_eigen_tanh_float        1536           1536         452321  model_time: 2633
AVX+FMA
BM_eigen_tanh_float        1007           1007         694681  model_time: 1609
AVX512
BM_eigen_tanh_float         471            471        1472178  model_time: 805
2019-12-16 21:33:42 +00:00
Christoph Hertzberg
8e5da71466 Resolve double-promotion warnings when compiling with clang.
`sin` was calling `sin(double)` instead of `std::sin(float)`
2019-12-13 22:46:40 +01:00
Christoph Hertzberg
9b7a2b43c2 Renamed .hgignore to .gitignore (removing hg-specific "syntax" line) 2019-12-13 19:40:57 +01:00
Ilya Tokar
06e99aaf40 Bug 1785: fix pround on x86 to use the same rounding mode as std::round.
This also adds pset1frombits helper to Packet[24]d.
Makes round ~45% slower for SSE: 1.65µs ± 1% before vs 2.45µs ± 2% after,
stil an order of magnitude faster than scalar version: 33.8µs ± 2%.
2019-12-12 17:38:53 -05:00
Rasmus Munk Larsen
73a8d572f5 Clamp tanh approximation outside [-c, c] where c is the smallest value where the approximation is exactly +/-1. Without FMA, c = 7.90531110763549805, with FMA c = 7.99881172180175781. 2019-12-12 19:34:25 +00:00
Srinivas Vasudevan
88062b7fed Fix implementation of complex expm1. Add tests that fail with previous implementation, but pass with the current one. 2019-12-12 01:56:54 +00:00
Eugene Zhulenev
381f8f3139 Initialize non-trivially constructible types when allocating a temp buffer. 2019-12-12 01:31:30 +00:00
Eugene Zhulenev
64272c7f40 Squeeze reads from two inner dimensions in TensorPadding 2019-12-11 16:54:51 -08:00
Eugene Zhulenev
963ba1015b Add back accidentally deleted default constructor to TensorExecutorTilingContext. 2019-12-11 18:47:55 +00:00
Joel Holdsworth
1b6e0395e6 Added io test 2019-12-11 18:22:57 +00:00
Joel Holdsworth
3c0ef9f394 IO: Fixed printing of char and unsigned char matrices 2019-12-11 18:22:57 +00:00
Joel Holdsworth
e87af0ed37 Added Eigen::numext typedefs for uint8_t, int8_t, uint16_t and int16_t 2019-12-11 18:22:57 +00:00
Gael Guennebaud
15b3bcfca0 Bug 1786: fix compilation with MSVC 2019-12-11 16:16:38 +01:00
Eugene Zhulenev
c9220c035f Remove block memory allocation required by removed block evaluation API 2019-12-10 17:15:55 -08:00
Eugene Zhulenev
1c879eb010 Remove V2 suffix from TensorBlock 2019-12-10 15:40:23 -08:00
Eugene Zhulenev
dbca11e880 Remove TensorBlock.h and old TensorBlock/BlockMapper 2019-12-10 14:31:44 -08:00
Deven Desai
c49f0d851a Fix for HIP breakage detected on 191210
The following commit introduces compile errors when running eigen with hipcc

2918f85ba9

hipcc errors out because it requies the device attribute on the methods within the TensorBlockV2ResourceRequirements struct instroduced by the commit above. The fix is to add the device attribute to those methods
2019-12-10 22:14:05 +00:00
Eugene Zhulenev
2918f85ba9 Do not use std::vector in getResourceRequirements 2019-12-09 16:19:55 -08:00
Artem Belevich
8056a05b54 Undo the block size change.
.z *is* used by the EigenContractionKernelInternal().
2019-12-09 11:10:29 -08:00
Eugene Zhulenev
dbb703d44e Add async evaluation support to TensorSelectOp 2019-12-09 18:36:13 +00:00
Janek Kozicki
11d6465326 fix AlignedVector3 inconsisent interface with other Vector classes, default constructor and operator- were missing. 2019-12-06 21:07:39 +01:00
Eugene Zhulenev
bb7ccac3af Add recursive work splitting to EvalShardedByInnerDimContext 2019-12-05 14:51:49 -08:00
Artem Belevich
25230d1862 Improve performance of contraction kernels
* Force-inline implementations. They pass around pointers to shared memory
  blocks. Without inlining compiler must operate via generic pointers.
  Inlining allows compiler to detect that we're operating on shared memory
  which allows generation of substantially faster code.

* Fixed a long-standing typo which resulted in launching 8x more kernels
  than we needed (.z dimension of the block is unused by the kernel).
2019-12-05 12:48:34 -08:00
Gael Guennebaud
08eeb648ea update hg to git hashes 2019-12-05 16:33:24 +01:00
Rasmus Munk Larsen
366cf005b0 Add missing initialization in cxx11_tensor_trace.cpp. 2019-12-04 23:56:37 +00:00
Gael Guennebaud
c488b8b32f Replace calls to "hg" by calls to "git" 2019-12-04 11:24:06 +01:00
Gael Guennebaud
8fbe0e4699 Update old links to bitbucket to point to gitlab.com 2019-12-04 10:57:07 +01:00
Gael Guennebaud
114a15c66a Added tag before-git-migration for changeset a7c7d329d8 2019-12-04 10:06:00 +01:00
Rasmus Larsen
a7c7d329d8 Merged in ezhulenev/eigen-01 (pull request PR-769)
Capture TensorMap by value inside tensor expression AST
2019-12-04 00:49:10 +00:00
Rasmus Larsen
cacf433975 Merged in anshuljl/eigen-2/Anshul-Jaiswal/update-configurevectorizationh-to-not-op-1573079916090 (pull request PR-754)
Update ConfigureVectorization.h to not optimize fp16 routines when compiling with cuda.

Approved-by: Deven Desai <deven.desai.amd@gmail.com>
2019-12-04 00:45:42 +00:00
Eugene Zhulenev
8f4536e852 Capture TensorMap by value inside tensor expression AST 2019-12-03 16:39:05 -08:00
Rasmus Munk Larsen
4e696901f8 Remove __host__ annotation for device-only function. 2019-12-03 14:33:19 -08:00
Rasmus Munk Larsen
ead81559c8 Use EIGEN_DEVICE_FUNC macro instead of __device__. 2019-12-03 12:08:22 -08:00
Gael Guennebaud
6358599ecb Fix QuaternionBase::cast for quaternion map and wrapper. 2019-12-03 14:51:14 +01:00
Gael Guennebaud
7745f69013 bug #1776: fix vector-wise STL iterator's operator-> using a proxy as pointer type.
This changeset fixes also the value_type definition.
2019-12-03 14:40:15 +01:00
Rasmus Munk Larsen
66f07efeae Revert the specialization for scalar_logistic_op<float> introduced in:
77b447c24e


While providing a 50% speedup on Haswell+ processors, the large relative error outside [-18, 18] in this approximation causes problems, e.g., when computing gradients of activation functions like softplus in neural networks.
2019-12-02 17:00:58 -08:00
Rasmus Larsen
3b15373bb3 Merged in ezhulenev/eigen-02 (pull request PR-767)
Fix shadow warnings in AlignedBox and SparseBlock
2019-12-02 18:23:11 +00:00
Deven Desai
312c8e77ff Fix for the HIP build+test errors.
Recent changes have introduced the following build error when compiling with HIPCC

---------

unsupported/test/../../Eigen/src/Core/GenericPacketMath.h:254:58: error:  'ldexp':  no overloaded function has restriction specifiers that are compatible with the ambient context 'pldexp'

---------

The fix for the error is to pick the math function(s) from the global namespace (where they are declared as device functions in the HIP header files) when compiling with HIPCC.
2019-12-02 17:41:32 +00:00
Rasmus Larsen
956131d0e6 Merged in codeplaysoftware/eigen/SYCL-Backend (pull request PR-691)
SYCL Backend

Approved-by: Rasmus Larsen <rmlarsen@google.com>
2019-11-28 16:19:25 +00:00
Mehdi Goli
00f32752f7 [SYCL] Rebasing the SYCL support branch on top of the Einge upstream master branch.
* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
* Adding get_tensor operation which was missing in TensorContractionMapper.
* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
* Fixing typo in Device SYCL
* Unifying load to private register for tall/skinny no shared
* Unifying load to vector tile for tensor-vector/vector-tensor operation
* Removing all the LHS/RHS class for extracting data from global
* Removing Outputfunction from TensorContractionSkinnyNoshared.
* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
* Making double buffering optional for Tensor contraction when local memory is version is used.
* Modifying benchmark to accept custom Reduction Sizes
* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
* Adding Test for SYCL
* Modifying SYCL CMake
2019-11-28 10:08:54 +00:00
Eugene Zhulenev
82a47338df Fix shadow warnings in AlignedBox and SparseBlock 2019-11-27 16:22:27 -08:00
Rasmus Munk Larsen
ea51a9eace Add missing EIGEN_DEVICE_FUNC attribute to template specializations for pexp to fix GPU build. 2019-11-27 10:17:09 -08:00
Rasmus Munk Larsen
5a3ebda36b Fix warning due to missing cast for exponent arguments for std::frexp and std::lexp. 2019-11-26 16:18:29 -08:00
Rasmus Larsen
2df57be856 Merged in realjhol/eigen/fix-warnings (pull request PR-760)
Fix warnings
2019-11-26 23:24:23 +00:00
Eugene Zhulenev
5496d0da0b Add async evaluation support to TensorReverse 2019-11-26 15:02:24 -08:00
Eugene Zhulenev
bc66c88255 Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling 2019-11-26 11:41:57 -08:00
Gael Guennebaud
c79b6ffe1f Add an explicit example for auto and re-evaluation 2019-11-20 17:31:23 +01:00
Hans Johnson
e78ed6e7f3 COMP: Simplify install commands for Eigen
Confirm that install directory is identical
before and after this simplifying patch.

```bash
hg clone <<Eigen>>
mkdir eigen-bld
cd eigen-bld
cmake ../Eigen -DCMAKE_INSTALL_PREFIX:PATH=/tmp/bef
make install
find /tmp/pre_eigen_modernize >/tmp/bef

#  Apply this patch

cmake ../Eigen -DCMAKE_INSTALL_PREFIX:PATH=/tmp/aft
make install
find /tmp/post_eigen_modernize |sed 's/post_e/pre_e/g' >/tmp/aft
diff /tmp/bef /tmp/aft
```
2019-11-17 15:14:25 -06:00
Hans Johnson
9d5cdc98c3 COMP: target_compile_definitions requires cmake 2.8.11
Features committed in 2016 have required cmake verison 2.8.11.

`sergiu Tue Nov 22 12:25:06 2016 +0100:   target_compile_definitions`

Set the minimum cmake version to the minimum version that
is capable of compiling or installing the code base.
2019-11-17 14:59:32 -06:00
Gael Guennebaud
e5778b87b9 Fix duplicate symbol linking error. 2019-11-20 17:23:19 +01:00
Joel Holdsworth
86eb41f1cb SparseRef: Fixed alignment warning on ARM GCC 2019-11-07 14:34:06 +00:00
Anshul Jaiswal
c1a67cb5af Update ConfigureVectorization.h to not optimize fp16 routines when compiling with cuda. 2019-11-06 22:40:38 +00:00
Rasmus Munk Larsen
cc3d0e6a40 Add EIGEN_HAS_INTRINSIC_INT128 macro
Add a new EIGEN_HAS_INTRINSIC_INT128 macro, and use this instead of __SIZEOF_INT128__. This fixes related issues with TensorIntDiv.h when building with Clang for Windows, where support for 128-bit integer arithmetic is advertised but broken in practice.
2019-11-06 14:24:33 -08:00
Rasmus Munk Larsen
ee404667e2 Rollback or PR-746 and partial rollback of 668ab3fc47
.

std::array is still not supported in CUDA device code on Windows.
2019-11-05 17:17:58 -08:00
Joel Holdsworth
743c925286 test/packetmath: Silence alignment warnings 2019-11-05 19:06:12 +00:00
Rasmus Larsen
0c9745903a Merged in ezhulenev/eigen-01 (pull request PR-746)
Remove internal::smart_copy and replace with std::copy
2019-11-04 20:18:38 +00:00
Hans Johnson
8c8cab1afd STYLE: Convert CMake-language commands to lower case
Ancient CMake versions required upper-case commands.  Later command names
became case-insensitive.  Now the preferred style is lower-case.
2019-10-31 11:36:37 -05:00
Hans Johnson
6fb3e5f176 STYLE: Remove CMake-language block-end command arguments
Ancient versions of CMake required else(), endif(), and similar block
termination commands to have arguments matching the command starting the block.
This is no longer the preferred style.
2019-10-31 11:36:27 -05:00
Rasmus Munk Larsen
f1e8307308 1. Fix a bug in psqrt and make it return 0 for +inf arguments.
2. Simplify handling of special cases by taking advantage of the fact that the
   builtin vrsqrt approximation handles negative, zero and +inf arguments correctly.
   This speeds up the SSE and AVX implementations by ~20%.
3. Make the Newton-Raphson formula used for rsqrt more numerically robust:

Before: y = y * (1.5 - x/2 * y^2)
After: y = y * (1.5 - y * (x/2) * y)

Forming y^2 can overflow for very large or very small (denormalized) values of x, while x*y ~= 1. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.

4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.

Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o
2019-11-15 17:09:46 -08:00
Gael Guennebaud
2cb2915f90 bug #1744: fix compilation with MSVC 2017 and AVX512, plog1p/pexpm1 require plog/pexp, but the later was disabled on some compilers 2019-11-15 13:39:51 +01:00
Gael Guennebaud
c3f6fcf2c0 bug #1747: one more fix for MSVC regarding the Bessel implementation. 2019-11-15 11:12:35 +01:00
Gael Guennebaud
b9837ca9ae bug #1281: fix AutoDiffScalar's make_coherent for nested expression of constant ADs. 2019-11-14 14:58:08 +01:00
Gael Guennebaud
0fb6e24408 Fix case issue with Lapack unit tests 2019-11-14 14:16:05 +01:00
Gael Guennebaud
8af045a287 bug #1774: fix VectorwiseOp::begin()/end() return types regarding constness. 2019-11-14 11:45:52 +01:00
Sakshi Goynar
75b4c0a3e0 PR 751: Fixed compilation issue when compiling using MSVC with /arch:AVX512 flag 2019-10-31 16:09:16 -07:00
Gael Guennebaud
8496f86f84 Enable CompleteOrthogonalDecomposition::pseudoInverse with non-square fixed-size matrices. 2019-11-13 21:16:53 +01:00
Gael Guennebaud
002e5b6db6 Move to my.cdash.org 2019-11-13 13:33:49 +01:00
Eugene Zhulenev
13c3327f5c Remove legacy block evaluation support 2019-11-12 10:12:28 -08:00
Gael Guennebaud
71aa53dd6d Disable AVX on broken xcode versions. See PR 748.
Patch adapted from Hans Johnson's PR 748.
2019-11-12 11:40:38 +01:00
Rasmus Munk Larsen
0ed0338593 Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs. 2019-11-11 12:26:41 -08:00
Eugene Zhulenev
c952b8dfda Break loop dependence in TensorGenerator block access 2019-11-11 10:32:57 -08:00
Rasmus Munk Larsen
ebf04fb3e8 Fix data race in css11_tensor_notification test. 2019-11-08 17:44:50 -08:00
Eugene Zhulenev
73ecb2c57d Cleanup includes in Tensor module after switch to C++11 and above 2019-10-29 15:49:54 -07:00
Eugene Zhulenev
e7ed4bd388 Remove internal::smart_copy and replace with std::copy 2019-10-29 11:25:24 -07:00
Eugene Zhulenev
fbc0a9a3ec Fix CXX11Meta compilation with MSVC 2019-10-28 18:30:10 -07:00
Eugene Zhulenev
bd864ab42b Prevent potential ODR in TensorExecutor 2019-10-28 15:45:09 -07:00
Mehdi Goli
6332aff0b2 This PR fixes:
* The specialization of array class in the different namespace for GCC<=6.4
* The implicit call to `std::array` constructor using the initializer list for GCC <=6.1
2019-10-23 15:56:56 +01:00
Rasmus Larsen
8e4e29ae99 Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738)
Fix for the HIP build+test errors.
2019-10-22 22:18:38 +00:00
Rasmus Munk Larsen
97c0c5d485 Add block evaluation V2 to TensorAsyncExecutor.
Add async evaluation to a number of ops.
2019-10-22 12:42:44 -07:00
Deven Desai
102cf2a72d Fix for the HIP build+test errors.
The errors were introduced by this commit :

After the above mentioned commit, some of the tests started failing with the following error


```
Built target cxx11_tensor_reduction
Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible
    DestinationBufferKind m_kind;
    ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible
  DestinationBuffer m_destination;
  ^
```


For some reason HIPCC does not like device code to contain enum types which do not have the base-type explicitly declared. The fix is trivial, explicitly state "int" as the basetype
2019-10-22 19:21:27 +00:00
Rasmus Munk Larsen
668ab3fc47 Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers. 2019-10-18 16:42:00 -07:00
Eugene Zhulenev
df0e8b8137 Propagate block evaluation preference through rvalue tensor expressions 2019-10-17 11:17:33 -07:00
Eugene Zhulenev
0d2a14ce11 Cleanup Tensor block destination and materialized block storage allocation 2019-10-16 17:14:37 -07:00
Eugene Zhulenev
02431cbe71 TensorBroadcasting support for random/uniform blocks 2019-10-16 13:26:28 -07:00
Eugene Zhulenev
d380c23b2c Block evaluation for TensorGenerator/TensorReverse/TensorShuffling 2019-10-14 14:31:59 -07:00
Gael Guennebaud
39fb9eeccf bug #1747: fix compilation with MSVC 2019-10-14 22:50:23 +02:00
Eugene Zhulenev
a411e9f344 Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op 2019-10-10 10:56:58 -07:00
Rasmus Larsen
b03eb63d7c Merged in ezhulenev/eigen-01 (pull request PR-726)
Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing
2019-10-10 16:58:11 +00:00
Gael Guennebaud
e7d8ba747c bug #1752: make is_convertible equivalent to the std c++11 equivalent and fallback to std::is_convertible when c++11 is enabled. 2019-10-10 17:41:47 +02:00
Gael Guennebaud
fb557aec5c bug #1752: disable some is_convertible tests for recent compilers. 2019-10-10 11:40:21 +02:00
Eugene Zhulenev
33e1746139 Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing 2019-10-09 12:45:31 -07:00
Gael Guennebaud
f0a4642bab Implement c++03 compatible fix for changeset 7a43af1a33 2019-10-09 16:00:57 +02:00
Gael Guennebaud
196de2efe3 Explicitly bypass resize and memmoves when there is already the exact right number of elements available. 2019-10-08 21:44:33 +02:00
Gael Guennebaud
36da231a41 Disable an expected warning in unit test 2019-10-08 16:28:14 +02:00
Gael Guennebaud
d1def335dc fix one more possible conflicts with real/imag 2019-10-08 16:19:10 +02:00
Gael Guennebaud
87427d2eaa PR 719: fix real/imag namespace conflict 2019-10-08 09:15:17 +02:00
Gael Guennebaud
7a43af1a33 Fix compilation of FFTW unit test 2019-10-08 08:58:35 +02:00
Eugene Zhulenev
f74ab8cb8d Add block evaluation to TensorEvalTo and fix few small bugs 2019-10-07 15:34:26 -07:00
Brian Zhao
3afb640b56 Fixing incorrect size in Tensor documentation. 2019-10-04 21:30:35 -07:00
Rasmus Munk Larsen
20c4a9118f Use "pdiv" rather than operator/ to support packet types. 2019-10-04 16:54:03 -07:00
Rasmus Larsen
d1dd51cb5f Merged in ezhulenev/eigen-01 (pull request PR-723)
Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect

Approved-by: Rasmus Larsen <rmlarsen@google.com>
2019-10-04 17:19:13 +00:00
Eugene Zhulenev
98bdd7252e Fix compilation warnings and errors with clang in TensorBlockV2 code and tests 2019-10-04 10:15:33 -07:00
Rasmus Munk Larsen
fab4e3a753 Address comments on Chebyshev evaluation code:
1. Use pmadd when possible.
2. Add casts to avoid c++03 warnings.
2019-10-02 12:48:17 -07:00
Eugene Zhulenev
60ae24ee1a Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect 2019-10-02 12:44:06 -07:00
Eugene Zhulenev
6e40454a6e Add beta to TensorContractionKernel and make memset optional 2019-10-02 11:06:02 -07:00
Rasmus Munk Larsen
bd0fac456f Prevent infinite loop in the nvcc compiler while unrolling the recurrent templates for Chebyshev polynomial evaluation. 2019-10-01 13:15:30 -07:00
Gael Guennebaud
9549ba8313 Fix perf issue in SimplicialLDLT::solve for complexes (again, m_diag is real) 2019-10-01 12:54:25 +02:00
Gael Guennebaud
c8b2c603b0 Fix speed issue with SimplicialLDLT for complexes: the diagonal is real! 2019-09-30 16:14:34 +02:00
Rasmus Munk Larsen
13ef08e5ac Move implementation of vectorized error function erf() to SpecialFunctionsImpl.h. 2019-09-27 13:56:04 -07:00
Eugene Zhulenev
7c8bc0d928 Fix cxx11_tensor_block_io test 2019-09-25 11:48:11 -07:00
Eugene Zhulenev
0c845e28c9 Fix erf in c++03 2019-09-25 11:31:45 -07:00
Eugene Zhulenev
71d5bedf72 Fix compilation warnings and errors with clang in TensorBlockV2 2019-09-25 11:25:22 -07:00
Deven Desai
5e186b1987 Fix for the HIP build+test errors.
The errors were introduced by this commit : d38e6fbc27


After the above mentioned commit, some of the tests started failing with the following error


```
Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:70:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsHalf.h:28:22: error: call to 'erf' is ambiguous
  return Eigen::half(Eigen::numext::erf(static_cast<float>(a)));
                     ^~~~~~~~~~~~~~~~~~
/home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1600:7: note: candidate function [with T = float]
float erf(const float &x) { return ::erff(x); }
      ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = float]
    erf(const Scalar& x) {
    ^
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:23: error: call to 'erf' is ambiguous
  return make_double2(erf(a.x), erf(a.y));
                      ^~~
/home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double]
double erf(const double &x) { return ::erf(x); }
       ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double]
    erf(const Scalar& x) {
    ^
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:29:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/../SpecialFunctions:75:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/GPU/GpuSpecialFunctions.h:87:33: error: call to 'erf' is ambiguous
  return make_double2(erf(a.x), erf(a.y));
                                ^~~
/home/rocm-user/eigen/unsupported/test/../../Eigen/src/Core/MathFunctions.h:1603:8: note: candidate function [with T = double]
double erf(const double &x) { return ::erf(x); }
       ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h:1897:5: note: candidate function [with Scalar = double]
    erf(const Scalar& x) {
    ^
3 errors generated.
```


This PR fixes the compile error by removing the "old" implementation for "erf" (assuming that the "new" implementation is what we want going forward. from a GPU point-of-view both implementations are the same).

This PR also fixes what seems like a cut-n-paste error in the aforementioned commit
2019-09-25 15:39:13 +00:00
Eugene Zhulenev
f35b9ab510 Fix a bug in a packed block type in TensorContractionThreadPool 2019-09-24 16:54:36 -07:00
Rasmus Larsen
d38e6fbc27 Merged in rmlarsen/eigen (pull request PR-704)
Add generic PacketMath implementation of the Error Function (erf).
2019-09-24 23:40:29 +00:00
Rasmus Munk Larsen
591a554c68 Add TODO to cleanup FMA cost modelling. 2019-09-24 16:39:25 -07:00
Eugene Zhulenev
c64396b4c6 Choose TensorBlock StridedLinearCopy type statically 2019-09-24 16:04:29 -07:00
Eugene Zhulenev
c97b208468 Add new TensorBlock api implementation + tests 2019-09-24 15:17:35 -07:00
Eugene Zhulenev
ef9dfee7bd Tensor block evaluation V2 support for unary/binary/broadcsting 2019-09-24 12:52:45 -07:00
Christoph Hertzberg
efd9867ff0 bug #1746: Removed implementation of standard copy-constructor and standard copy-assign-operator from PermutationMatrix and Transpositions to allow malloc-less std::move. Added unit-test to rvalue_types 2019-09-24 11:09:58 +02:00
Christoph Hertzberg
e4c1b3c1d2 Fix implicit conversion warnings and use pnegate to negate packets 2019-09-23 16:07:43 +02:00
Christoph Hertzberg
ba0736fa8e Fix (or mask away) conversion warnings introduced in 553caeb6a3
.
2019-09-23 15:58:05 +02:00
Rasmus Munk Larsen
1d5af0693c Add support for asynchronous evaluation of tensor casting expressions. 2019-09-19 13:54:49 -07:00
Rasmus Munk Larsen
6de5ed08d8 Add generic PacketMath implementation of the Error Function (erf). 2019-09-19 12:48:30 -07:00
Rasmus Munk Larsen
28b6786498 Fix build on setups without AVX512DQ. 2019-09-19 12:36:09 -07:00
Deven Desai
e02d429637 Fix for the HIP build+test errors.
The errors were introduced by this commit : 6e215cf109


The fix is switching to using ::<math_func> instead std::<math_func> when compiling for GPU
2019-09-18 18:44:20 +00:00
Srinivas Vasudevan
df0816b71f Merging eigen/eigen. 2019-09-16 19:33:29 -04:00
Srinivas Vasudevan
6e215cf109 Add Bessel functions to SpecialFunctions.
- Split SpecialFunctions files in to a separate BesselFunctions file.

In particular add:
    - Modified bessel functions of the second kind k0, k1, k0e, k1e
    - Bessel functions of the first kind j0, j1
    - Bessel functions of the second kind y0, y1
2019-09-14 12:16:47 -04:00
Eugene Zhulenev
7c73296849 Revert accidental change to GCC diagnostics 2019-09-13 14:30:58 -07:00
Eugene Zhulenev
bf8866b466 Fix maybe-unitialized warnings in TensorContractionThreadPool 2019-09-13 14:29:55 -07:00
Eugene Zhulenev
553caeb6a3 Use ThreadLocal container in TensorContractionThreadPool 2019-09-13 12:14:44 -07:00
Srinivas Vasudevan
facdec5aa7 Add packetized versions of i0e and i1e special functions.
- In particular refactor the i0e and i1e code so scalar and vectorized path share code.
  - Move chebevl to GenericPacketMathFunctions.


A brief benchmark with building Eigen with FMA, AVX and AVX2 flags

Before:

CPU: Intel Haswell with HyperThreading (6 cores)
Benchmark                  Time(ns)        CPU(ns)     Iterations
-----------------------------------------------------------------
BM_eigen_i0e_double/1            57.3           57.3     10000000
BM_eigen_i0e_double/8           398            398        1748554
BM_eigen_i0e_double/64         3184           3184         218961
BM_eigen_i0e_double/512       25579          25579          27330
BM_eigen_i0e_double/4k       205043         205042           3418
BM_eigen_i0e_double/32k     1646038        1646176            422
BM_eigen_i0e_double/256k   13180959       13182613             53
BM_eigen_i0e_double/1M     52684617       52706132             10
BM_eigen_i0e_float/1             28.4           28.4     24636711
BM_eigen_i0e_float/8             75.7           75.7      9207634
BM_eigen_i0e_float/64           512            512        1000000
BM_eigen_i0e_float/512         4194           4194         166359
BM_eigen_i0e_float/4k         32756          32761          21373
BM_eigen_i0e_float/32k       261133         261153           2678
BM_eigen_i0e_float/256k     2087938        2088231            333
BM_eigen_i0e_float/1M       8380409        8381234             84
BM_eigen_i1e_double/1            56.3           56.3     10000000
BM_eigen_i1e_double/8           397            397        1772376
BM_eigen_i1e_double/64         3114           3115         223881
BM_eigen_i1e_double/512       25358          25361          27761
BM_eigen_i1e_double/4k       203543         203593           3462
BM_eigen_i1e_double/32k     1613649        1613803            428
BM_eigen_i1e_double/256k   12910625       12910374             54
BM_eigen_i1e_double/1M     51723824       51723991             10
BM_eigen_i1e_float/1             28.3           28.3     24683049
BM_eigen_i1e_float/8             74.8           74.9      9366216
BM_eigen_i1e_float/64           505            505        1000000
BM_eigen_i1e_float/512         4068           4068         171690
BM_eigen_i1e_float/4k         31803          31806          21948
BM_eigen_i1e_float/32k       253637         253692           2763
BM_eigen_i1e_float/256k     2019711        2019918            346
BM_eigen_i1e_float/1M       8238681        8238713             86


After:

CPU: Intel Haswell with HyperThreading (6 cores)
Benchmark                  Time(ns)        CPU(ns)     Iterations
-----------------------------------------------------------------
BM_eigen_i0e_double/1            15.8           15.8     44097476
BM_eigen_i0e_double/8            99.3           99.3      7014884
BM_eigen_i0e_double/64          777            777         886612
BM_eigen_i0e_double/512        6180           6181         100000
BM_eigen_i0e_double/4k        48136          48140          14678
BM_eigen_i0e_double/32k      385936         385943           1801
BM_eigen_i0e_double/256k    3293324        3293551            228
BM_eigen_i0e_double/1M     12423600       12424458             57
BM_eigen_i0e_float/1             16.3           16.3     43038042
BM_eigen_i0e_float/8             30.1           30.1     23456931
BM_eigen_i0e_float/64           169            169        4132875
BM_eigen_i0e_float/512         1338           1339         516860
BM_eigen_i0e_float/4k         10191          10191          68513
BM_eigen_i0e_float/32k        81338          81337           8531
BM_eigen_i0e_float/256k      651807         651984           1000
BM_eigen_i0e_float/1M       2633821        2634187            268
BM_eigen_i1e_double/1            16.2           16.2     42352499
BM_eigen_i1e_double/8           110            110        6316524
BM_eigen_i1e_double/64          822            822         851065
BM_eigen_i1e_double/512        6480           6481         100000
BM_eigen_i1e_double/4k        51843          51843          10000
BM_eigen_i1e_double/32k      414854         414852           1680
BM_eigen_i1e_double/256k    3320001        3320568            212
BM_eigen_i1e_double/1M     13442795       13442391             53
BM_eigen_i1e_float/1             17.6           17.6     41025735
BM_eigen_i1e_float/8             35.5           35.5     19597891
BM_eigen_i1e_float/64           240            240        2924237
BM_eigen_i1e_float/512         1424           1424         485953
BM_eigen_i1e_float/4k         10722          10723          65162
BM_eigen_i1e_float/32k        86286          86297           8048
BM_eigen_i1e_float/256k      691821         691868           1000
BM_eigen_i1e_float/1M       2777336        2777747            256


This shows anywhere from a 50% to 75% improvement on these operations.

I've also benchmarked without any of these flags turned on, and got similar
performance to before (if not better).

Also tested packetmath.cpp + special_functions to ensure no regressions.
2019-09-11 18:34:02 -07:00
Srinivas Vasudevan
b052ec6992 Merged eigen/eigen into default 2019-09-11 18:01:54 -07:00
Deven Desai
cdb377d0cb Fix for the HIP build+test errors introduced by the ndtri support.
The fixes needed are
 * adding EIGEN_DEVICE_FUNC attribute to a couple of funcs (else HIPCC will error out when non-device funcs are called from global/device funcs)
 * switching to using ::<math_func> instead std::<math_func> (only for HIPCC) in cases where the std::<math_func> is not recognized as a device func by HIPCC
 * removing an errant "j" from a testcase (don't know how that made it in to begin with!)
2019-09-06 16:03:49 +00:00
Gael Guennebaud
747c6a51ca bug #1736: fix compilation issue with A(all,{1,2}).col(j) by implementing true compile-time "if" for block_evaluator<>::coeff(i)/coeffRef(i) 2019-09-11 15:40:07 +02:00
Gael Guennebaud
031f17117d bug #1741: fix self-adjoint*matrix, triangular*matrix, and triangular^1*matrix with a destination having a non-trivial inner-stride 2019-09-11 15:04:25 +02:00
Gael Guennebaud
459b2bcc08 Fix compilation of BLAS backend and frontend 2019-09-11 10:02:37 +02:00
Rasmus Larsen
97f1e1d89f Merged in ezhulenev/eigen-01 (pull request PR-698)
ThreadLocal container that does not rely on thread local storage

Approved-by: Rasmus Larsen <rmlarsen@google.com>
2019-09-10 23:19:33 +00:00
Eugene Zhulenev
d918bd9a8b Update ThreadLocal to use separate Initialize/Release callables 2019-09-10 16:13:32 -07:00
Gael Guennebaud
afa8d13532 Fix some implicit literal to Scalar conversions in SparseCore 2019-09-11 00:03:07 +02:00
Gael Guennebaud
c06e6fd115 bug #1741: fix SelfAdjointView::rankUpdate and product to triangular part for destination with non-trivial inner stride 2019-09-10 23:29:52 +02:00
Gael Guennebaud
ea0d5dc956 bug #1741: fix C.noalias() = A*C; with C.innerStride()!=1 2019-09-10 16:25:24 +02:00
Eugene Zhulenev
e3dec4dcc1 ThreadLocal container that does not rely on thread local storage 2019-09-09 15:18:14 -07:00
Gael Guennebaud
17226100c5 Fix a circular dependency regarding pshift* functions and GenericPacketMathFunctions.
Another solution would have been to make pshift* fully generic template functions with
partial specialization which is always a mess in c++03.
2019-09-06 09:26:04 +02:00
Gael Guennebaud
55b63d4ea3 Fix compilation without vector engine available (e.g., x86 with SSE disabled):
-> ppolevl is required by ndtri even for the scalar path
2019-09-05 18:16:46 +02:00
Srinivas Vasudevan
a9cf823db7 Merged eigen/eigen 2019-09-04 23:50:52 -04:00
Gael Guennebaud
e6c183f8fd Fix doc issues regarding ndtri 2019-09-04 23:00:21 +02:00
Gael Guennebaud
5702a57926 Fix possible warning regarding strict equality comparisons 2019-09-04 22:57:04 +02:00
Srinivas Vasudevan
99036a3615 Merging from eigen/eigen. 2019-09-03 15:34:47 -04:00
Eugene Zhulenev
a8d264fa9c Add test for const TensorMap underlying data mutation 2019-09-03 11:38:39 -07:00
Eugene Zhulenev
f68f2bba09 TensorMap constness should not change underlying storage constness 2019-09-03 11:08:09 -07:00
Gael Guennebaud
8e7e3d9bc8 Makes Scalar/RealScalar typedefs public in Pardiso's wrappers (see PR 688) 2019-09-03 13:09:03 +02:00
Srinivas Vasudevan
e38dd48a27 PR 681: Add ndtri function, the inverse of the normal distribution function. 2019-08-12 19:26:29 -04:00
Eugene Zhulenev
f59bed7a13 Change typedefs from private to protected to fix MSVC compilation 2019-09-03 19:11:36 -07:00
Eugene Zhulenev
47fefa235f Allow move-only done callback in TensorAsyncDevice 2019-09-03 17:20:56 -07:00
Srinivas Vasudevan
18ceb3413d Add ndtri function, the inverse of the normal distribution function. 2019-08-12 19:26:29 -04:00
Rasmus Munk Larsen
d55d392e7b Fix bugs in log1p and expm1 where repeated using statements would clobber each other.
Add specializations for complex types since std::log1p and std::exp1m do not support complex.
2019-08-08 16:27:32 -07:00
Rasmus Munk Larsen
85928e5f47 Guard against repeated definition of EIGEN_MPL2_ONLY 2019-08-07 14:19:00 -07:00
Rasmus Munk Larsen
facc4e4536 Disable tests for contraction with output kernels when using libxsmm, which does not support this. 2019-08-07 14:11:15 -07:00
Rasmus Munk Larsen
eab7e52db2 [Eigen] Vectorize evaluation of coefficient-wise functions over tensor blocks if the strides are known to be 1. Provides up to 20-25% speedup of the TF cross entropy op with AVX.
A few benchmark numbers:

name                              old time/op             new time/op             delta
BM_Xent_16_10000_cpu              448µs ± 3%              389µs ± 2%  -13.21%
(p=0.008 n=5+5)
BM_Xent_32_10000_cpu              575µs ± 6%              454µs ± 3%  -21.00%          (p=0.008 n=5+5)
BM_Xent_64_10000_cpu              933µs ± 4%              712µs ± 1%  -23.71%          (p=0.008 n=5+5)
2019-08-07 12:57:42 -07:00
Rasmus Munk Larsen
0987126165 Clean up unnecessary namespace specifiers in TensorBlock.h. 2019-08-07 12:12:52 -07:00
Gael Guennebaud
0050644b23 Fix doc regarding alignment and c++17 2019-08-04 01:09:41 +02:00
Rasmus Munk Larsen
e2999d4c38 Fix performance regressions due to https://bitbucket.org/eigen/eigen/pull-requests/662.
The change caused the device struct to be copied for each expression evaluation, and caused, e.g., a 10% regression in the TensorFlow multinomial op on GPU:


Benchmark                       Time(ns)        CPU(ns)     Iterations
----------------------------------------------------------------------
BM_Multinomial_gpu_1_100000_4     128173         231326           2922  1.610G items/s

VS

Benchmark                       Time(ns)        CPU(ns)     Iterations
----------------------------------------------------------------------
BM_Multinomial_gpu_1_100000_4     146683         246914           2719  1.509G items/s
2019-08-02 11:18:13 -07:00
Alberto Luaces
c694be1214 Fixed Tensor documentation formatting. 2019-07-23 09:24:06 +00:00
Gael Guennebaud
15f3d9d272 More colamd cleanup:
- Move colamd implementation in its own namespace to avoid polluting the internal namespace with Ok, Status, etc.
- Fix signed/unsigned warning
- move some ugly free functions as member functions
2019-09-03 00:50:51 +02:00
Anshul Jaiswal
a4d1a6cd7d Eigen_Colamd.h updated to replace constexpr with consts and enums. 2019-08-17 05:29:23 +00:00
Anshul Jaiswal
283558face Ordering.h edited to fix dependencies on Eigen_Colamd.h 2019-08-15 20:21:56 +00:00
Anshul Jaiswal
39f30923c2 Eigen_Colamd.h edited replacing macros with constexprs and functions. 2019-08-15 20:15:19 +00:00
Anshul Jaiswal
0a6b553ecf Eigen_Colamd.h edited online with Bitbucket replacing constant #defines with const definitions 2019-07-21 04:53:31 +00:00
Kyle Vedder
f22b7283a3 Added leading asterisk for Doxygen to consume as it was removing asterisk intended to be part of the code. 2019-07-18 18:12:14 +00:00
Michael Grupp
6e17491f45 Fix typo in Umeyama method documentation 2019-07-17 11:20:41 +00:00
Christoph Hertzberg
e0f5a2a456 Remove {} accidentally added in previous commit 2019-07-18 20:22:17 +02:00
Christoph Hertzberg
ea6d7eb32f Move variadic constructors outside #ifndef EIGEN_PARSED_BY_DOXYGEN block, to make it actually appear in the generated documentation. 2019-07-12 19:46:37 +02:00
Christoph Hertzberg
9237883ff1 Escape \# inside doxygen docu 2019-07-12 19:45:13 +02:00
Christoph Hertzberg
c2671e5315 Build deprecated snippets with -DEIGEN_NO_DEPRECATED_WARNING
Also, document LinSpaced only where it is implemented
2019-07-12 19:43:32 +02:00
Eugene Zhulenev
3cd148f983 Fix expression evaluation heuristic for TensorSliceOp 2019-07-09 12:10:26 -07:00
Rasmus Munk Larsen
23b958818e Fix compiler for unsigned integers. 2019-07-09 11:18:25 -07:00
Eugene Zhulenev
6083014594 Add outer/inner chipping optimization for chipping dimension specified at runtime 2019-07-03 11:35:25 -07:00
Deven Desai
7eb2e0a95b adding the EIGEN_DEVICE_FUNC attribute to the constCast routine.
Not having this attribute results in the following failures in the `--config=rocm` TF build.

```
In file included from tensorflow/core/kernels/cross_op_gpu.cu.cc:20:
In file included from ./tensorflow/core/framework/register_types.h:20:
In file included from ./tensorflow/core/framework/numeric_types.h:20:
In file included from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1:
In file included from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:140:
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorChipping.h:356:37: error:  'Eigen::constCast':  no overloaded function has restriction specifiers that are compatible with the ambient context 'data'
    typename Storage::Type result = constCast(m_impl.data());
                                    ^
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorChipping.h:356:37: error:  'Eigen::constCast':  no overloaded function has restriction specifiers that are compatible with the ambient context 'data'
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h:148:56: note: in instantiation of member function 'Eigen::TensorEvaluator<const Eigen::TensorChippingOp<1, Eigen::TensorMap<Eigen::Tensor<int, 2, 1, long>, 16, MakePointer> >, Eigen::Gpu\
Device>::data' requested here
    return m_rightImpl.evalSubExprsIfNeeded(m_leftImpl.data());

```

Adding the EIGEN_DEVICE_FUNC attribute resolves those errors
2019-07-02 20:02:46 +00:00
Gael Guennebaud
ef8aca6a89 Merged in codeplaysoftware/eigen (pull request PR-667)
[SYCL] :

Approved-by: Gael Guennebaud <g.gael@free.fr>
Approved-by: Rasmus Larsen <rmlarsen@google.com>
2019-07-02 12:45:23 +00:00
Eugene Zhulenev
4ac93f8edc Allocate non-const scalar buffer for block evaluation with DefaultDevice 2019-07-01 10:55:19 -07:00
Mehdi Goli
9ea490c82c [SYCL] :
* Modifying TensorDeviceSYCL to use `EIGEN_THROW_X`.
  * Modifying TensorMacro to use `EIGEN_TRY/CATCH(X)` macro.
  * Modifying TensorReverse.h to use `EIGEN_DEVICE_REF` instead of `&`.
  * Fixing the SYCL device macro in SpecialFunctionsImpl.h.
2019-07-01 16:27:28 +01:00
Eugene Zhulenev
81a03bec75 Fix TensorReverse on GPU with m_stride[i]==0 2019-06-28 15:50:39 -07:00
Rasmus Munk Larsen
8053eeb51e Fix CUDA compilation error for pselect<half>. 2019-06-28 12:07:29 -07:00
Rasmus Munk Larsen
74a9dd1102 Fix preprocessor condition to only generate a warning when calling eigen::GpuDevice::synchronize() from device code, but not when calling from a non-GPU compilation unit. 2019-06-28 11:56:21 -07:00
Rasmus Munk Larsen
70d4020ad9 Remove comma causing warning in c++03 mode. 2019-06-28 11:39:45 -07:00
Eugene Zhulenev
6e7c76481a Merge with Eigen head 2019-06-28 11:22:46 -07:00
Eugene Zhulenev
878845cb25 Add block access to TensorReverseOp and make sure that TensorForcedEval uses block access when preferred 2019-06-28 11:13:44 -07:00
Rasmus Munk Larsen
1f61aee5ca [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL.
* Abstracting the pointer type so that both SYCL memory and pointer can be captured.
* Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class.
* Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node.
* Adding SYCL macro for controlling loop unrolling.
* Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
2019-06-28 10:11:56 -07:00
Mehdi Goli
7d08fa805a [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL.
* Abstracting the pointer type so that both SYCL memory and pointer can be captured.
* Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class.
* Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node.
* Adding SYCL macro for controlling loop unrolling.
* Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
2019-06-28 10:08:23 +01:00
Mehdi Goli
16a56b2ddd [SYCL] This PR adds the minimum modifications to Eigen core required to run Eigen unsupported modules on devices supporting SYCL.
* Adding SYCL memory model
* Enabling/Disabling SYCL  backend in Core
*  Supporting Vectorization
2019-06-27 12:25:09 +01:00
Christoph Hertzberg
adec097c61 Remove extra comma (causes warnings in C++03) 2019-06-26 16:14:28 +02:00
Eugene Zhulenev
229db81572 Optimize evaluation strategy for TensorSlicingOp and TensorChippingOp 2019-06-25 15:41:37 -07:00
Deven Desai
ba506d5bd2 fix for a ROCm/HIP specificcompile errror introduced by a recent commit. 2019-06-22 00:06:05 +00:00
Rasmus Munk Larsen
c9394d7a0e Remove extra "one" in comment. 2019-06-20 16:23:19 -07:00
Rasmus Munk Larsen
b8f8dac4eb Update comment as suggested by tra@google.com. 2019-06-20 16:18:37 -07:00
Rasmus Munk Larsen
e5e63c2cad Fix grammar. 2019-06-20 16:03:59 -07:00
Rasmus Munk Larsen
302a404b7e Added comment explaining the surprising EIGEN_COMP_CLANG && !EIGEN_COMP_NVCC clause. 2019-06-20 15:59:08 -07:00
Rasmus Munk Larsen
b5237f53b1 Fix CUDA build on Mac. 2019-06-20 15:44:14 -07:00
Rasmus Munk Larsen
988f24b730 Various fixes for packet ops.
1. Fix buggy pcmp_eq and unit test for half types.
2. Add unit test for pselect and add specializations for SSE 4.1, AVX512, and half types.
3. Get rid of FIXME: Implement faster pnegate for half by XOR'ing with a sign bit mask.
2019-06-20 11:47:49 -07:00
Christoph Hertzberg
e0be7f30e1 bug #1724: Mask buggy warnings with g++-7
(grafted from 427f2f66d6
)
2019-06-14 14:57:46 +02:00
Anshul Jaiswal
fab51d133e Updated Eigen_Colamd.h, namespacing macros ALIVE & DEAD as COLAMD_ALIVE & COLAMD_DEAD
to prevent conflicts with other libraries / code.
2019-06-08 21:09:06 +00:00
Eugene Zhulenev
79c402e40e Fix shadow warnings in TensorContractionThreadPool 2019-08-30 15:38:31 -07:00
Eugene Zhulenev
edf2ec28d8 Fix block mapper type name in TensorExecutor 2019-08-30 15:29:25 -07:00
Eugene Zhulenev
f0b36fb9a4 evalSubExprsIfNeededAsync + async TensorContractionThreadPool 2019-08-30 15:13:38 -07:00
Eugene Zhulenev
619cea9491 Revert accidentally removed <memory> header from ThreadPool 2019-08-30 14:51:17 -07:00
Eugene Zhulenev
66665e7e76 Asynchronous expression evaluation with TensorAsyncDevice 2019-08-30 14:49:40 -07:00
Rasmus Munk Larsen
f6c51d9209 Fix missing header inclusion and colliding definitions for half type casting, which broke
build with -march=native on Haswell/Skylake.
2019-08-30 14:03:29 -07:00
Eugene Zhulenev
bc40d4522c Const correctness in TensorMap<const Tensor<T, ...>> expressions 2019-08-28 17:46:05 -07:00
Rasmus Munk Larsen
1187bb65ad Add more tests for corner cases of log1p and expm1. Add handling of infinite arguments to log1p such that log1p(inf) = inf. 2019-08-28 12:20:21 -07:00
Eugene Zhulenev
6e77f9bef3 Remove shadow warnings in TensorDeviceThreadPool 2019-08-28 10:32:19 -07:00
Rasmus Munk Larsen
9aba527405 Revert changes to std_falback::log1p that broke handling of arguments less than -1. Fix packet op accordingly. 2019-08-27 15:35:29 -07:00
Rasmus Munk Larsen
b021cdea6d Clean up float16 a.k.a. Eigen::half support in Eigen. Move the definition of half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories. 2019-08-27 11:30:31 -07:00
Rasmus Larsen
84fefdf321 Merged in ezhulenev/eigen-01 (pull request PR-683)
Asynchronous parallelFor in Eigen ThreadPoolDevice
2019-08-26 21:49:17 +00:00
maratek
8b5ab0e4dd Fix get_random_seed on Native Client
Newlib in Native Client SDK does not provide ::random function.
Implement get_random_seed for NaCl using ::rand, similarly to Windows version.
2019-08-23 15:25:56 -07:00
Eugene Zhulenev
6901788013 Asynchronous parallelFor in Eigen ThreadPoolDevice 2019-08-22 10:50:51 -07:00
Christoph Hertzberg
2fb24384c9 Merged in jaopaulolc/eigen (pull request PR-679)
Fixes for Altivec/VSX and compilation with clang on PowerPC
2019-08-22 15:57:33 +00:00
Rasmus Larsen
57f6b62597 Merged in rmlarsen/eigen (pull request PR-680)
Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.
2019-08-22 00:25:29 +00:00
Eugene Zhulenev
071311821e Remove XSMM support from Tensor module 2019-08-19 11:44:25 -07:00
João P. L. de Carvalho
5ac7984ffa Fix debug macros in p{load,store}u 2019-08-14 11:59:12 -06:00
João P. L. de Carvalho
db9147ae40 Add missing pcmp_XX methods for double/Packet2d
This actually fixes an issue in unit-test packetmath_2 with pcmp_eq when it is compiled with clang. When pcmp_eq(Packet4f,Packet4f) is used instead of pcmp_eq(Packet2d,Packet2d), the unit-test does not pass due to NaN on ref vector.
2019-08-14 10:37:39 -06:00
Rasmus Munk Larsen
a3298b22ec Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments.
Depending on instruction set, significant speedups are observed for the vectorized path:
log1p wall time is reduced 60-93% (2.5x - 15x speedup)
expm1 wall time is reduced 0-85% (1x - 7x speedup)

The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly.

Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM
2019-08-12 13:53:28 -07:00
João P. L. de Carvalho
787f6ef025 Fix packed load/store for PowerPC's VSX
The vec_vsx_ld/vec_vsx_st builtins were wrongly used for aligned load/store. In fact, they perform unaligned memory access and, even when the address is 16-byte aligned, they are much slower (at least 2x) than their aligned counterparts.

For double/Packet2d vec_xl/vec_xst should be prefered over vec_ld/vec_st, although the latter works when casted to float/Packet4f.

Silencing some weird warning with throw but some GCC versions. Such warning are not thrown by Clang.
2019-08-09 16:02:55 -06:00
João P. L. de Carvalho
4d29aa0294 Fix offset argument of ploadu/pstoreu for Altivec
If no offset is given, them it should be zero.

Also passes full address to vec_vsx_ld/st builtins.

Removes userless _EIGEN_ALIGNED_PTR & _EIGEN_MASK_ALIGNMENT.

Removes unnecessary casts.
2019-08-09 15:59:26 -06:00
João P. L. de Carvalho
66d073c38e bug #1718: Add cast to successfully compile with clang on PowerPC
Ignoring -Wc11-extensions warnings thrown by clang at Altivec/PacketMath.h
2019-08-09 15:56:26 -06:00
Rasmus Munk Larsen
6d432eae5d Make is_valid_index_type return false for float and double when EIGEN_HAS_TYPE_TRAITS is off. 2019-06-05 16:42:27 -07:00
Rasmus Munk Larsen
f715f6e816 Add workaround for choosing the right include files with FP16C support with clang. 2019-06-05 13:36:37 -07:00
Justin Carpentier
ffaf658ecd PR 655: Fix missing Eigen namespace in Macros 2019-06-05 09:51:59 +02:00
Mehdi Goli
0b24e1cb5c [SYCL] Adding the SYCL memory model. The SYCL memory model provides :
* an interface for SYCL buffers to behave as a non-dereferenceable pointer
  * an interface for placeholder accessor to behave like a pointer on both host and device
2019-07-01 16:02:30 +01:00
Rasmus Larsen
c1b0aea653 Merged in Artem-B/eigen (pull request PR-654)
Minor build improvements

Approved-by: Rasmus Larsen <rmlarsen@google.com>
2019-05-31 22:27:04 +00:00
Rasmus Munk Larsen
b08527b0c1 Clean up CUDA/NVCC version macros and their use in Eigen, and a few other CUDA build failures. 2019-05-31 15:26:06 -07:00
tra
b4c49bf00e Minor build improvements
* Allow specifying multiple GPU architectures. E.g.:
  cmake -DEIGEN_CUDA_COMPUTE_ARCH="60;70"
* Pass CUDA SDK path to clang. Without it it will default to /usr/local/cuda
which may not be the right location, if cmake was invoked with
-DCUDA_TOOLKIT_ROOT_DIR=/some/other/CUDA/path
2019-05-31 14:08:34 -07:00
Christoph Hertzberg
5614400581 digits10() needs to return an integer
Problem reported on https://stackoverflow.com/questions/56395899
2019-05-31 15:45:41 +02:00
Rasmus Larsen
36e0a2b93f Merged in deven-amd/eigen-hip-fix-190524 (pull request PR-649)
fix for HIP build errors that were introduced by a commit earlier this week
2019-05-24 16:05:31 +00:00
Deven Desai
2c38930161 fix for HIP build errors that were introduced by a commit earlier this week 2019-05-24 14:25:32 +00:00
Gustavo Lima Chaves
56bc4974fb GEMV: remove double declaration of constant.
That was hurting users with compilers that would object to proceed with
that:

"""
./Eigen/src/Core/products/GeneralMatrixVector.h:356:10: error: declaration shadows a static data member of 'general_matrix_vector_product<type-parameter-0-0, type-parameter-0-1, type-parameter-0-2, 1, ConjugateLhs, type-parameter-0-4, type-parameter-0-5, ConjugateRhs, Version>' [-Werror,-Wshadow]
         LhsPacketSize = Traits::LhsPacketSize,
         ^
./Eigen/src/Core/products/GeneralMatrixVector.h:307:22: note: previous declaration is here
  static const Index LhsPacketSize = Traits::LhsPacketSize;
"""
2019-05-23 14:50:29 -07:00
Christoph Hertzberg
ac21a08c13 Cast Index to RealScalar
This fixes compilation issues with RealScalar types that are not implicitly castable from Index (e.g. ceres Jet types).
Reported by Peter Anderson-Sprecher via eMail
2019-05-23 15:31:12 +02:00
Rasmus Munk Larsen
3eb5ad0ed0 Enable support for F16C with Clang. The required intrinsics were added here: https://reviews.llvm.org/D16177
and are part of LLVM 3.8.0.
2019-05-20 17:19:20 -07:00
Rasmus Larsen
e92486b8c3 Merged in rmlarsen/eigen (pull request PR-643)
Make Eigen build with cuda 10 and clang.

Approved-by: Justin Lebar <justin.lebar@gmail.com>
2019-05-20 17:02:39 +00:00
Rasmus Munk Larsen
fd595d42a7 Merge 2019-05-20 09:39:11 -07:00
Gael Guennebaud
cc7ecbb124 Merged in scramsby/eigen (pull request PR-646)
Eigen: Fix MSVC C++17 language standard detection logic
2019-05-20 07:19:10 +00:00
Eugene Zhulenev
01654d97fa Prevent potential division by zero in TensorExecutor 2019-05-17 14:02:25 -07:00
Rasmus Larsen
78d3015722 Merged in ezhulenev/eigen-01 (pull request PR-644)
Always evaluate Tensor expressions with broadcasting via tiled evaluation code path
2019-05-17 19:44:25 +00:00
Rasmus Larsen
bf9cbed8d0 Merged in glchaves/eigen (pull request PR-635)
Speed up GEMV on AVX-512 builds, just as done for GEBP previously.

Approved-by: Rasmus Larsen <rmlarsen@google.com>
2019-05-17 19:40:50 +00:00
Eugene Zhulenev
96a276803c Always evaluate Tensor expressions with broadcasting via tiled evaluation code path 2019-05-16 16:15:45 -07:00
Rasmus Munk Larsen
ab0a30e429 Make Eigen build with cuda 10 and clang. 2019-05-15 13:32:15 -07:00
Rasmus Munk Larsen
734a50dc60 Make Eigen build with cuda 10 and clang. 2019-05-15 13:32:15 -07:00
Rasmus Larsen
c8d8d5c0fc Merged in rmlarsen/eigen_threadpool (pull request PR-640)
Fix deadlocks in thread pool.

Approved-by: Eugene Zhulenev <ezhulenev@google.com>
2019-05-13 20:04:35 +00:00
Christoph Hertzberg
5f32b79edc Collapsed revision from PR-641
* SparseLU.h - corrected example, it didn't compile
* Changed encoding back to UTF8
2019-05-13 19:02:30 +02:00
Anuj Rawat
ad372084f5 Removing unused API to fix compile error in TensorFlow due to
AVX512VL, AVX512BW usage
2019-05-12 14:43:10 +00:00
Christoph Hertzberg
4ccd1ece92 bug #1707: Fix deprecation warnings, or disable warnings when testing deprecated functions 2019-05-10 14:57:05 +02:00
Rasmus Munk Larsen
d3ef7cf03e Fix build with clang on Windows. 2019-05-09 11:07:04 -07:00
Rasmus Munk Larsen
e5ac8cbd7a A) fix deadlocks in thread pool caused by EventCount
This fixed 2 deadlocks caused by sloppiness in the EventCount logic.
Both most likely were introduced by cl/236729920 which includes the new EventCount algorithm:
01da8caf00

bug #1 (Prewait):
Prewait must not consume existing signals.
Consider the following scenario.
There are 2 thread pool threads (1 and 2) and 1 external thread (3). RunQueue is empty.
Thread 1 checks the queue, calls Prewait, checks RunQueue again and now is going to call CommitWait.
Thread 2 checks the queue and now is going to call Prewait.
Thread 3 submits 2 tasks, EventCount signals is set to 1 because only 1 waiter is registered the second signal is discarded).
Now thread 2 resumes and calls Prewait and takes away the signal.
Thread 1 resumes and calls CommitWait, there are no pending signals anymore, so it blocks.
As the result we have 2 tasks, but only 1 thread is running.

bug #2 (CancelWait):
CancelWait must not take away a signal if it's not sure that the signal was meant for this thread.
When one thread blocks and another submits a new task concurrently, the EventCount protocol guarantees only the following properties (similar to the Dekker's algorithm):
(a) the registered waiter notices presence of the new task and does not block
(b) the signaler notices presence of the waiters and wakes it
(c) both the waiter notices presence of the new task and signaler notices presence of the waiter
[it's only that both of them do not notice each other must not be possible, because it would lead to a deadlock]
CancelWait is called for cases (a) and (c). For case (c) it is OK to take the notification signal away, but it's not OK for (a) because nobody queued a signals for us and we take away a signal meant for somebody else.
Consider:
Thread 1 calls Prewait, checks RunQueue, it's empty, now it's going to call CommitWait.
Thread 3 submits 2 tasks, EventCount signals is set to 1 because only 1 waiter is registered the second signal is discarded).
Thread 2 calls Prewait, checks RunQueue, discovers the tasks, calls CancelWait and consumes the pending signal (meant for thread 1).
Now Thread 1 resumes and calls CommitWait, since there are no signals it blocks.
As the result we have 2 tasks, but only 1 thread is running.

Both deadlocks are only a problem if the tasks require parallelism. Most computational tasks do not require parallelism, i.e. a single thread will run task 1, finish it and then dequeue and run task 2.

This fix undoes some of the sloppiness in the EventCount that was meant to reduce CPU consumption by idle threads, because we now have more threads running in these corner cases. But we still don't have pthread_yield's and maybe the strictness introduced by this change will actually help to reduce tail latency because we will have threads running when we actually need them running.



B) fix deadlock in thread pool caused by RunQueue

This fixed a deadlock caused by sloppiness in the RunQueue logic.
Most likely this was introduced with the non-blocking thread pool.
The deadlock only affects workloads that require parallelism.
Most computational tasks don't require parallelism.

PopBack must not fail spuriously. If it does, it can effectively lead to single thread consuming several wake up signals.
Consider 2 worker threads are blocked.
External thread submits a task. One of the threads is woken.
It tries to steal the task, but fails due to a spurious failure in PopBack (external thread submits another task and holds the lock).
The thread executes blocking protocol again (it won't block because NonEmptyQueueIndex is precise and the thread will discover pending work, but it has called PrepareWait).
Now external thread submits another task and signals EventCount again.
The signal is consumed by the first thread again. But now we have 2 tasks pending but only 1 worker thread running.

It may be possible to fix this in a different way: make EventCount::CancelWait forward wakeup signal to a blocked thread rather then consuming it. But this looks more complex and I am not 100% that it will fix the bug.
It's also possible to have 2 versions of PopBack: one will do try_to_lock and another won't. Then worker threads could first opportunistically check all queues with try_to_lock, and only use the blocking version before blocking. But let's first fix the bug with the simpler change.
2019-05-08 10:16:46 -07:00
Michael Tesch
c5019f722b Use pade for matrix exponential also for complex values. 2019-05-08 17:04:55 +02:00
Eugene Zhulenev
45b40d91ca Fix AVX512 & GCC 6.3 compilation 2019-05-07 16:44:55 -07:00
Christoph Hertzberg
e6667a7060 Fix stupid shadow-warnings (with old clang versions) 2019-05-07 18:32:19 +02:00
Christoph Hertzberg
e54dc24d62 Restore C++03 compatibility 2019-05-07 18:30:44 +02:00
Christoph Hertzberg
cca76c272c Restore C++03 compatibility 2019-05-06 16:18:22 +02:00
Rasmus Munk Larsen
8e33844fc7 Fix traits for scalar_logistic_op. 2019-05-03 15:49:09 -07:00
Scott Ramsby
ff06ef7584 Eigen: Fix MSVC C++17 language standard detection logic
To detect C++17 support, use _MSVC_LANG macro instead of _MSC_VER. _MSC_VER can indicate whether the current compiler version could support the C++17 language standard, but not whether that standard is actually selected (i.e. via /std:c++17).
See these web pages for more details:
https://devblogs.microsoft.com/cppblog/msvc-now-correctly-reports-__cplusplus/
https://docs.microsoft.com/en-us/cpp/preprocessor/predefined-macros
2019-05-03 14:14:09 -07:00
Eugene Zhulenev
e9f0eb8a5e Add masked_store_available to unpacket_traits 2019-05-02 14:52:58 -07:00
Eugene Zhulenev
96e30e936a Add masked pstoreu for Packet16h 2019-05-02 14:11:01 -07:00
Eugene Zhulenev
b4010f02f9 Add masked pstoreu to AVX and AVX512 PacketMath 2019-05-02 13:14:18 -07:00
Gael Guennebaud
578407f42f Fix regression in changeset ae33e866c7 2019-05-02 15:45:21 +02:00
Rasmus Larsen
ac50afaffa Merged in ezhulenev/eigen-01 (pull request PR-633)
Check if gpu_assert was overridden in TensorGpuHipCudaDefines
2019-04-29 16:29:35 +00:00
Gustavo Lima Chaves
d4dcb71bcb Speed up GEMV on AVX-512 builds, just as done for GEBP previously.
We take advantage of smaller SIMD registers as well, in that case.

Gains up to 3x for select input sizes.
2019-04-26 14:12:39 -07:00
Andy May
ae33e866c7 Fix compilation with PGI version 19 2019-04-25 21:23:19 +01:00
Gael Guennebaud
665ac22cc6 Merged in ezhulenev/eigen-01 (pull request PR-632)
Fix doxygen warnings
2019-04-25 20:02:20 +00:00
Eugene Zhulenev
01d7e6ee9b Check if gpu_assert was overridden in TensorGpuHipCudaDefines 2019-04-25 11:19:17 -07:00
Eugene Zhulenev
8ead5bb3d8 Fix doxygen warnings to enable statis code analysis 2019-04-24 12:42:28 -07:00
Eugene Zhulenev
07355d47c6 Get rid of SequentialLinSpacedReturnType deprecation warnings in DenseBase.h 2019-04-24 11:01:35 -07:00
Rasmus Munk Larsen
144ca33321 Remove deprecation annotation from typedef Eigen::Index Index, as it would generate too many build warnings. 2019-04-24 08:50:07 -07:00
Eugene Zhulenev
a7b7f3ca8a Add missing EIGEN_DEPRECATED annotations to deprecated functions and fix few other doxygen warnings 2019-04-23 17:23:19 -07:00
Eugene Zhulenev
68a2a8c445 Use packet ops instead of AVX2 intrinsics 2019-04-23 11:41:02 -07:00
Anuj Rawat
8c7a6feb8e Adding lowlevel APIs for optimized RHS packet load in TensorFlow
SpatialConvolution

Low-level APIs are added in order to optimized packet load in gemm_pack_rhs
in TensorFlow SpatialConvolution. The optimization is for scenario when a
packet is split across 2 adjacent columns. In this case we read it as two
'partial' packets and then merge these into 1. Currently this only works for
Packet16f (AVX512) and Packet8f (AVX2). We plan to add this for other
packet types (such as Packet8d) also.

This optimization shows significant speedup in SpatialConvolution with
certain parameters. Some examples are below.

Benchmark parameters are specified as:
Batch size, Input dim, Depth, Num of filters, Filter dim

Speedup numbers are specified for number of threads 1, 2, 4, 8, 16.

AVX512:

Parameters                  | Speedup (Num of threads: 1, 2, 4, 8, 16)
----------------------------|------------------------------------------
128,   24x24,  3, 64,   5x5 |2.18X, 2.13X, 1.73X, 1.64X, 1.66X
128,   24x24,  1, 64,   8x8 |2.00X, 1.98X, 1.93X, 1.91X, 1.91X
 32,   24x24,  3, 64,   5x5 |2.26X, 2.14X, 2.17X, 2.22X, 2.33X
128,   24x24,  3, 64,   3x3 |1.51X, 1.45X, 1.45X, 1.67X, 1.57X
 32,   14x14, 24, 64,   5x5 |1.21X, 1.19X, 1.16X, 1.70X, 1.17X
128, 128x128,  3, 96, 11x11 |2.17X, 2.18X, 2.19X, 2.20X, 2.18X

AVX2:

Parameters                  | Speedup (Num of threads: 1, 2, 4, 8, 16)
----------------------------|------------------------------------------
128,   24x24,  3, 64,   5x5 | 1.66X, 1.65X, 1.61X, 1.56X, 1.49X
 32,   24x24,  3, 64,   5x5 | 1.71X, 1.63X, 1.77X, 1.58X, 1.68X
128,   24x24,  1, 64,   5x5 | 1.44X, 1.40X, 1.38X, 1.37X, 1.33X
128,   24x24,  3, 64,   3x3 | 1.68X, 1.63X, 1.58X, 1.56X, 1.62X
128, 128x128,  3, 96, 11x11 | 1.36X, 1.36X, 1.37X, 1.37X, 1.37X

In the higher level benchmark cifar10, we observe a runtime improvement
of around 6% for AVX512 on Intel Skylake server (8 cores).

On lower level PackRhs micro-benchmarks specified in TensorFlow
tensorflow/core/kernels/eigen_spatial_convolutions_test.cc, we observe
the following runtime numbers:

AVX512:

Parameters                                                     | Runtime without patch (ns) | Runtime with patch (ns) | Speedup
---------------------------------------------------------------|----------------------------|-------------------------|---------
BM_RHS_NAME(PackRhs, 128, 24, 24, 3, 64, 5, 5, 1, 1, 256, 56)  |  41350                     | 15073                   | 2.74X
BM_RHS_NAME(PackRhs, 32, 64, 64, 32, 64, 5, 5, 1, 1, 256, 56)  |   7277                     |  7341                   | 0.99X
BM_RHS_NAME(PackRhs, 32, 64, 64, 32, 64, 5, 5, 2, 2, 256, 56)  |   8675                     |  8681                   | 1.00X
BM_RHS_NAME(PackRhs, 32, 64, 64, 30, 64, 5, 5, 1, 1, 256, 56)  |  24155                     | 16079                   | 1.50X
BM_RHS_NAME(PackRhs, 32, 64, 64, 30, 64, 5, 5, 2, 2, 256, 56)  |  25052                     | 17152                   | 1.46X
BM_RHS_NAME(PackRhs, 32, 256, 256, 4, 16, 8, 8, 1, 1, 256, 56) |  18269                     | 18345                   | 1.00X
BM_RHS_NAME(PackRhs, 32, 256, 256, 4, 16, 8, 8, 2, 4, 256, 56) |  19468                     | 19872                   | 0.98X
BM_RHS_NAME(PackRhs, 32, 64, 64, 4, 16, 3, 3, 1, 1, 36, 432)   | 156060                     | 42432                   | 3.68X
BM_RHS_NAME(PackRhs, 32, 64, 64, 4, 16, 3, 3, 2, 2, 36, 432)   | 132701                     | 36944                   | 3.59X

AVX2:

Parameters                                                     | Runtime without patch (ns) | Runtime with patch (ns) | Speedup
---------------------------------------------------------------|----------------------------|-------------------------|---------
BM_RHS_NAME(PackRhs, 128, 24, 24, 3, 64, 5, 5, 1, 1, 256, 56)  | 26233                      | 12393                   | 2.12X
BM_RHS_NAME(PackRhs, 32, 64, 64, 32, 64, 5, 5, 1, 1, 256, 56)  |  6091                      |  6062                   | 1.00X
BM_RHS_NAME(PackRhs, 32, 64, 64, 32, 64, 5, 5, 2, 2, 256, 56)  |  7427                      |  7408                   | 1.00X
BM_RHS_NAME(PackRhs, 32, 64, 64, 30, 64, 5, 5, 1, 1, 256, 56)  | 23453                      | 20826                   | 1.13X
BM_RHS_NAME(PackRhs, 32, 64, 64, 30, 64, 5, 5, 2, 2, 256, 56)  | 23167                      | 22091                   | 1.09X
BM_RHS_NAME(PackRhs, 32, 256, 256, 4, 16, 8, 8, 1, 1, 256, 56) | 23422                      | 23682                   | 0.99X
BM_RHS_NAME(PackRhs, 32, 256, 256, 4, 16, 8, 8, 2, 4, 256, 56) | 23165                      | 23663                   | 0.98X
BM_RHS_NAME(PackRhs, 32, 64, 64, 4, 16, 3, 3, 1, 1, 36, 432)   | 72689                      | 44969                   | 1.62X
BM_RHS_NAME(PackRhs, 32, 64, 64, 4, 16, 3, 3, 2, 2, 36, 432)   | 61732                      | 39779                   | 1.55X

All benchmarks on Intel Skylake server with 8 cores.
2019-04-20 06:46:43 +00:00
Christoph Hertzberg
4270c62812 Split the implementation of i?amax/min into two. Based on PR-627 by Sameer Agarwal.
Like the Netlib reference implementation, I*AMAX now uses the L1-norm instead of the L2-norm for each element. Changed I*MIN accordingly.
2019-04-15 17:18:03 +02:00
Rasmus Munk Larsen
039ee52125 Tweak cost model for tensor contraction when parallelizing over the inner dimension.
https://bitbucket.org/snippets/rmlarsen/MexxLo
2019-04-12 13:35:10 -07:00
Jonathon Koyle
9a3f06d836 Update TheadPoolDevice example to include ThreadPool creation and passing pointer into constructor. 2019-04-10 10:02:33 -06:00
Deven Desai
66a885b61e adding EIGEN_DEVICE_FUNC to the recently added TensorContractionKernel constructor. Not having the EIGEN_DEVICE_FUNC attribute on it was leading to compiler errors when compiling Eigen in the ROCm/HIP path 2019-04-08 13:45:08 +00:00
Eugene Zhulenev
629ddebd15 Add missing semicolon 2019-04-02 15:04:26 -07:00
Eugene Zhulenev
4e2f6de1a8 Add support for custom packed Lhs/Rhs blocks in tensor contractions 2019-04-01 11:47:31 -07:00
Gael Guennebaud
45e65fbb77 bug #1695: fix a numerical robustness issue. Computing the secular equation at the middle range without a shift might give a wrong sign. 2019-03-27 20:16:58 +01:00
William D. Irons
8de66719f9 Collapsed revision from PR-619
* Add support for pcmp_eq in AltiVec/Complex.h
* Fixed implementation of pcmp_eq for double

The new logic is based on the logic from NEON for double.
2019-03-26 18:14:49 +00:00
Gael Guennebaud
f11364290e ICC does not support -fno-unsafe-math-optimizations 2019-03-22 09:26:24 +01:00
David Tellenbach
3031d57200 PR 621: Fix documentation of EIGEN_COMP_EMSCRIPTEN 2019-03-21 02:21:04 +01:00
Deven Desai
51e399fc15 updates requested in the PR feedback. Also droping coded within #ifdef EIGEN_HAS_OLD_HIP_FP16 2019-03-19 21:45:25 +00:00
Deven Desai
2dbea5510f Merged eigen/eigen into default 2019-03-19 16:52:38 -04:00
Rasmus Larsen
5c93b38c5f Merged in rmlarsen/eigen (pull request PR-618)
Make clipping outside [-18:18] consistent for vectorized and non-vectorized paths of scalar_logistic_op<float>.

Approved-by: Gael Guennebaud <g.gael@free.fr>
2019-03-18 15:51:55 +00:00
Gael Guennebaud
48898a988a fix unit test in c++03: c++03 does not allow passing local or anonymous enum as template param 2019-03-18 11:38:36 +01:00
Gael Guennebaud
cf7e2e277f bug #1692: enable enum as sizes of Matrix and Array 2019-03-17 21:59:30 +01:00
Rasmus Munk Larsen
e42f9aa68a Make clipping outside [-18:18] consistent for vectorized and non-vectorized paths of scalar_logistic_<float>. 2019-03-15 17:15:14 -07:00
Rasmus Larsen
1936aac43f Merged in tellenbach/eigen/sykline_consistent_include_guards (pull request PR-617)
Fix include guard comments for Skyline module
2019-03-15 20:04:56 +00:00
David Tellenbach
bd9c2ae3fd Fix include guard comments 2019-03-15 15:29:17 +01:00
Rasmus Munk Larsen
8450a6d519 Clean up half packet traits and add a few more missing packet ops. 2019-03-14 15:18:06 -07:00
David Tellenbach
b013176e52 Remove undefined std::complex<int> 2019-03-14 11:40:28 +01:00
David Tellenbach
97f9a46cb9 PR 593: Add variadtic ctor for DiagonalMatrix with unit tests 2019-03-14 10:18:24 +01:00
Gael Guennebaud
45ab514fe2 revert debug stuff 2019-03-14 10:08:12 +01:00
Rasmus Munk Larsen
6a34003141 Remove EIGEN_MPL2_ONLY guard in IncompleteCholesky that is no longer needed after the AMD reordering code was relicensed to MPL2. 2019-03-13 11:52:41 -07:00
Gael Guennebaud
d7d2f0680e bug #1684: partially workaround clang's 6/7 bug #40815 2019-03-13 10:40:01 +01:00
Rasmus Larsen
690f0795d0 Merged in rmlarsen/eigen (pull request PR-615)
Clean up PacketMathHalf.h and add a few missing logical packet ops.
2019-03-12 16:09:48 +00:00
Thomas Capricelli
1901433674 erm.. use proper id 2019-03-12 13:53:38 +01:00
Thomas Capricelli
90302aa8c9 update tracking code 2019-03-12 13:47:01 +01:00
Rasmus Munk Larsen
77f7d4a894 Clean up PacketMathHalf.h and add a few missing logical packet ops. 2019-03-11 17:51:16 -07:00
Eugene Zhulenev
001f10e3c9 Fix segfaults with cuda compilation 2019-03-11 09:43:33 -07:00
Eugene Zhulenev
899c16fa2c Fix a bug in TensorGenerator for 1d tensors 2019-03-11 09:42:01 -07:00
Eugene Zhulenev
0f8bfff23d Fix a data race in NonBlockingThreadPool 2019-03-11 09:38:44 -07:00
Gael Guennebaud
656d9bc66b Apply SSE's pmin/pmax fix for GCC <= 5 to AVX's pmin/pmax 2019-03-10 21:19:18 +01:00
Gael Guennebaud
2df4f00246 Change license from LGPL to MPL2 with agreement from David Harmon. 2019-03-07 18:17:10 +01:00
Rasmus Munk Larsen
3c3f639fe2 Merge. 2019-03-06 11:54:30 -08:00
Rasmus Munk Larsen
f4ec8edea8 Add macro EIGEN_AVOID_THREAD_LOCAL to make it possible to manually disable the use of thread_local. 2019-03-06 11:52:04 -08:00
Rasmus Munk Larsen
41cdc370d0 Fix placement of "#if defined(EIGEN_GPUCC)" guard region.
Found with -Wundefined-func-template.

Author: tkoeppe@google.com
2019-03-06 11:42:22 -08:00
Rasmus Munk Larsen
cc407c9d4d Fix placement of "#if defined(EIGEN_GPUCC)" guard region.
Found with -Wundefined-func-template.

Author: tkoeppe@google.com
2019-03-06 11:40:06 -08:00
Eugene Zhulenev
1bc2a0a57c Add missing return to NonBlockingThreadPool::LocalSteal 2019-03-06 10:49:49 -08:00
Eugene Zhulenev
4e4dcd9026 Remove redundant steal loop 2019-03-06 10:39:07 -08:00
Rasmus Larsen
4d808e834a Merged in rmlarsen/eigen_threadpool (pull request PR-606)
Remove EIGEN_MPL2_ONLY guards around code re-licensed from LGPL to MPL2 in 2ca1e73239


Approved-by: Sameer Agarwal <sameeragarwal@google.com>
2019-03-06 17:59:03 +00:00
Rasmus Larsen
2ea18e505f Merged in ezhulenev/eigen-01 (pull request PR-610)
Block evaluation for TensorGeneratorOp
2019-03-06 16:49:38 +00:00
Eugene Zhulenev
25abaa2e41 Check that inner block dimension is continuous 2019-03-05 17:34:35 -08:00
Eugene Zhulenev
5d9a6686ed Block evaluation for TensorGeneratorOp 2019-03-05 16:35:21 -08:00
Rasmus Larsen
b4861f4778 Merged in ezhulenev/eigen-01 (pull request PR-609)
Tune tensor contraction threadpool heuristics
2019-03-05 23:54:40 +00:00
Gael Guennebaud
bfbf7da047 bug #1689 fix used-but-marked-unused warning 2019-03-05 23:46:24 +01:00
Eugene Zhulenev
a407e022e6 Tune tensor contraction threadpool heuristics 2019-03-05 14:19:59 -08:00
Eugene Zhulenev
56c6373f82 Add an extra check for the RunQueue size estimate 2019-03-05 11:51:26 -08:00
Eugene Zhulenev
b1a8627493 Do not create Tensor<const T> in cxx11_tensor_forced_eval test 2019-03-05 11:19:25 -08:00
Rasmus Munk Larsen
0318fc7f44 Remove EIGEN_MPL2_ONLY guards around code re-licensed from LGPL to MPL2 in 2ca1e73239 2019-03-05 10:24:54 -08:00
Eugene Zhulenev
efb5080d31 Do not initialize invalid fast_strides in TensorGeneratorOp 2019-03-04 16:58:49 -08:00
Eugene Zhulenev
b95941e5c2 Add tiled evaluation for TensorForcedEvalOp 2019-03-04 16:02:22 -08:00
Eugene Zhulenev
694084ecbd Use fast divisors in TensorGeneratorOp 2019-03-04 11:10:21 -08:00
Gael Guennebaud
b0d406d91c Enable construction of Ref<VectorType> from a runtime vector. 2019-03-03 15:25:25 +01:00
Sam Hasinoff
9ba81cf0ff Fully qualify Eigen::internal::aligned_free
This helps avoids a conflict on certain Windows toolchains
(potentially due to some ADL name resolution bug) in the case
where aligned_free is defined in the global namespace. In any
case, tightening this up is harmless.
2019-03-02 17:42:16 +00:00
Gael Guennebaud
22144e949d bug #1629: fix compilation of PardisoSupport (regression introduced in changeset a7842daef2
)
2019-03-02 22:44:47 +01:00
Bernhard M. Wiedemann
b071672e78 Do not keep latex logs
to make package builds more reproducible.
See https://reproducible-builds.org/ for why this is good.
2019-02-27 11:09:00 +01:00
Rasmus Munk Larsen
cf4a1c81fa Fix specialization for conjugate on non-complex types in TensorBase.h. 2019-03-01 14:21:09 -08:00
Sameer Agarwal
c181dfb8ab Consistently use EIGEN_BLAS_FUNC in BLAS.
Previously, for a few functions, eithe BLASFUNC or, EIGEN_CAT
was being used. This change uses EIGEN_BLAS_FUNC consistently
everywhere.

Also introduce EIGEN_BLAS_FUNC_SUFFIX, which by default is
equal to "_", this allows the user to inject a new suffix as
needed.
2019-02-27 11:30:58 -08:00
Rasmus Larsen
9558f4c25f Merged in rmlarsen/eigen_threadpool (pull request PR-596)
Improve EventCount used by the non-blocking threadpool.

Approved-by: Gael Guennebaud <g.gael@free.fr>
2019-02-26 20:37:26 +00:00
Rasmus Larsen
2ca1e73239 Merged in rmlarsen/eigen (pull request PR-597)
Change licensing of OrderingMethods/Amd.h and SparseCholesky/SimplicialCholesky_impl.h from LGPL to MPL2.

Approved-by: Gael Guennebaud <g.gael@free.fr>
2019-02-25 17:02:16 +00:00
Gael Guennebaud
e409dbba14 Enable SSE vectorization of Quaternion and cross3() with AVX 2019-02-23 10:45:40 +01:00
Rasmus Munk Larsen
6560692c67 Improve EventCount used by the non-blocking threadpool.
The current algorithm requires threads to commit/cancel waiting in order
they called Prewait. Spinning caused by that serialization can consume
lots of CPU time on some workloads. Restructure the algorithm to not
require that serialization and remove spin waits from Commit/CancelWait.
Note: this reduces max number of threads from 2^16 to 2^14 to leave
more space for ABA counter (which is now 22 bits).
Implementation details are explained in comments.
2019-02-22 13:56:26 -08:00
Gael Guennebaud
0b25a5c431 fix alignment in ploadquad 2019-02-22 21:39:36 +01:00
Rasmus Munk Larsen
1dc1677d52 Change licensing of OrderingMethods/Amd.h and SparseCholesky/SimplicialCholesky_impl.h from LGPL to MPL2. Google LLC executed a license agreement with the author of the code from which these files are derived to allow the Eigen project to distribute the code and derived works under MPL2. 2019-02-22 12:33:57 -08:00
Gael Guennebaud
0cb4ba98e7 update wrt recent changes 2019-02-21 17:19:36 +01:00
Gael Guennebaud
cca6c207f4 AVX512: implement faster ploadquad<Packet16f> thus speeding up GEMM 2019-02-21 17:18:28 +01:00
Gael Guennebaud
1c09ee8541 bug #1674: workaround clang fast-math aggressive optimizations 2019-02-22 15:48:53 +01:00
Gael Guennebaud
7e3084bb6f Fix compilation on ARM. 2019-02-22 14:56:12 +01:00
Gael Guennebaud
32502f3c45 bug #1684: add simplified regression test for respective clang's bug (this also reveal the same bug in Apples's clang) 2019-02-22 10:29:06 +01:00
Gael Guennebaud
42c23f14ac Speed up col/row-wise reverse for fixed size matrices by propagating compile-time sizes. 2019-02-21 22:44:40 +01:00
Rasmus Munk Larsen
4d7f317102 Add a few missing packet ops: cmp_eq for NEON. pfloor for GPU. 2019-02-21 13:32:13 -08:00
Gael Guennebaud
2a39659d79 Add fully generic Vector<Type,Size> and RowVector<Type,Size> type aliases. 2019-02-20 15:23:23 +01:00
Gael Guennebaud
302377110a Update documentation of Matrix and Array type aliases. 2019-02-20 15:18:48 +01:00
Gael Guennebaud
475295b5ff Enable documentation of Array's typedefs 2019-02-20 15:18:07 +01:00
Gael Guennebaud
44b54fa4a3 Protect c++11 type alias with Eigen's macro, and add respective unit test. 2019-02-20 14:43:05 +01:00
Gael Guennebaud
7195f008ce Merged in ra_bauke/eigen (pull request PR-180)
alias template for matrix and array classes, see also bug #864

Approved-by: Heiko Bauke <heiko.bauke@mail.de>
2019-02-20 13:22:39 +00:00
Gael Guennebaud
4e8047cdcf Fix compilation with gcc and remove TR1 stuff. 2019-02-20 13:59:34 +01:00
Gael Guennebaud
844e5447f8 Update documentation regarding alignment issue. 2019-02-20 13:54:04 +01:00
Gael Guennebaud
edd413c184 bug #1409: make EIGEN_MAKE_ALIGNED_OPERATOR_NEW* macros empty in c++17 mode:
- this helps clang 5 and 6 to support alignas in STL's containers.
 - this makes the public API of our (and users) classes cleaner
2019-02-20 13:52:11 +01:00
Gael Guennebaud
3b5deeb546 bug #899: make sparseqr unit test more stable by 1) trying with larger threshold and 2) relax rank computation for rank-deficient problems. 2019-02-19 22:57:51 +01:00
Gael Guennebaud
482c5fb321 bug #899: remove "rank-revealing" qualifier for SparseQR and warn that it is not always rank-revealing. 2019-02-19 22:52:15 +01:00
Gael Guennebaud
9ac1634fdf Fix conversion warnings 2019-02-19 21:59:53 +01:00
Gael Guennebaud
292d61970a Fix C++17 compilation 2019-02-19 21:59:41 +01:00
Rasmus Munk Larsen
071629a440 Fix incorrect value of NumDimensions in TensorContraction traits.
Reported here: #1671
2019-02-19 10:49:54 -08:00
Christoph Hertzberg
a1646fc960 Commas at the end of enumerator lists are not allowed in C++03 2019-02-19 14:32:25 +01:00
Gael Guennebaud
2cfc025bda fix unit compilation in c++17: std::ptr_fun has been removed. 2019-02-19 14:05:22 +01:00
Gael Guennebaud
ab78cabd39 Add C++17 detection macro, and make sure throw(xpr) is not used if the compiler is in c++17 mode. 2019-02-19 14:04:35 +01:00
Gael Guennebaud
115da6a1ea Fix conversion warnings 2019-02-19 14:00:15 +01:00
Gael Guennebaud
7d10c78738 bug #1046: add unit tests for correct propagation of alignment through std::alignment_of 2019-02-19 10:31:56 +01:00
Gael Guennebaud
7580112c31 Fix harmless Scalar vs RealScalar cast. 2019-02-18 22:12:28 +01:00
Gael Guennebaud
e23bf40dc2 Add unit test for LinSpaced and complex numbers. 2019-02-18 22:03:47 +01:00
Gael Guennebaud
796db94e6e bug #1194: implement slightly faster and SIMD friendly 4x4 determinant. 2019-02-18 16:21:27 +01:00
Gael Guennebaud
31b6e080a9 Fix regression: .conjugate() was popped out but not re-introduced. 2019-02-18 14:45:55 +01:00
Gael Guennebaud
c69d0d08d0 Set cost of conjugate to 0 (in practice it boils down to a no-op).
This is also important to make sure that A.conjugate() * B.conjugate() does not evaluate
its arguments into temporaries (e.g., if A and B are fixed and small, or * fall back to lazyProduct)
2019-02-18 14:43:07 +01:00
Gael Guennebaud
512b74aaa1 GEMM: catch all scalar-multiple variants when falling-back to a coeff-based product.
Before only s*A*B was caught which was both inconsistent with GEMM, sub-optimal,
and could even lead to compilation-errors (https://stackoverflow.com/questions/54738495).
2019-02-18 11:47:54 +01:00
Christoph Hertzberg
ec032ac03b Guard C++11-style default constructor. Also, this is only needed for MSVC 2019-02-16 09:44:05 +01:00
Gael Guennebaud
902a7793f7 Add possibility to bench row-major lhs and rhs 2019-02-15 16:52:34 +01:00
Gael Guennebaud
83309068b4 bug #1680: improve MSVC inlining by declaring many triavial constructors and accessors as STRONG_INLINE. 2019-02-15 16:35:35 +01:00
Gael Guennebaud
0505248f25 bug #1680: make all "block" methods strong-inline and device-functions (some were missing EIGEN_DEVICE_FUNC) 2019-02-15 16:33:56 +01:00
Gael Guennebaud
559320745e bug #1678: Fix lack of __FMA__ macro on MSVC with AVX512 2019-02-15 10:30:28 +01:00
Gael Guennebaud
d85ae650bf bug #1678: workaround MSVC compilation issues with AVX512 2019-02-15 10:24:17 +01:00
Gael Guennebaud
f2970819a2 bug #1679: avoid possible division by 0 in complex-schur 2019-02-15 09:39:25 +01:00
Rasmus Munk Larsen
65e23ca7e9 Revert b55b5c7280
.
2019-02-14 13:46:13 -08:00
Rasmus Larsen
efeabee445 Merged in ezhulenev/eigen-01 (pull request PR-590)
Do not generate no-op cast() and conjugate() expressions
2019-02-14 21:16:12 +00:00
Eugene Zhulenev
7b837559a7 Fix signed-unsigned return in RuqQueue 2019-02-14 10:40:21 -08:00
Eugene Zhulenev
f0d42d2265 Fix signed-unsigned comparison warning in RunQueue 2019-02-14 10:27:28 -08:00
Eugene Zhulenev
106ba7bb1a Do not generate no-op cast() and conjugate() expressions 2019-02-14 09:51:51 -08:00
Eugene Zhulenev
8c2f30c790 Speedup Tensor ThreadPool RunQueu::Empty() 2019-02-13 10:20:53 -08:00
Gael Guennebaud
bdcb5f3304 Let's properly use Score instead of std::abs, and remove deprecated FIXME ( a /= b does a/b and not a * (1/b) as it was a long time ago...) 2019-02-11 22:56:19 +01:00
Gael Guennebaud
2edfc6807d Fix compilation of empty products of the form: Mx0 * 0xN 2019-02-11 18:24:07 +01:00
Gael Guennebaud
eb46f34a8c Speed up 2x2 LU by a factor 2, and other small fixed sizes by about 10%.
Not sure that's so critical, but this does not complexify the code base much.
2019-02-11 17:59:35 +01:00
Gael Guennebaud
dada863d23 Enable unit tests of PartialPivLU on fixed size matrices, and increase tested matrix size (blocking was not tested!) 2019-02-11 17:56:20 +01:00
Gael Guennebaud
ab6e6edc32 Speedup PartialPivLU for small matrices by passing compile-time sizes when available.
This change set also makes a better use of Map<>+OuterStride and Ref<> yielding surprising speed up for small dynamic sizes as well.
The table below reports times in micro seconds for 10 random matrices:
           | ------ float --------- | ------- double ------- |
     size  | before   after  ratio  |  before   after  ratio |
fixed	  1	 | 0.34     0.11   2.93   |  0.35     0.11   3.06  |
fixed	  2	 | 0.81     0.24   3.38   |  0.91     0.25   3.60  |
fixed	  3	 | 1.49     0.49   3.04   |  1.68     0.55   3.01  |
fixed	  4	 | 2.31     0.70   3.28   |  2.45     1.08   2.27  |
fixed	  5	 | 3.49     1.11   3.13   |  3.84     2.24   1.71  |
fixed	  6	 | 4.76     1.64   2.88   |  4.87     2.84   1.71  |
dyn     1	 | 0.50     0.40   1.23   |  0.51     0.40   1.26  |
dyn     2	 | 1.08     0.85   1.27   |  1.04     0.69   1.49  |
dyn     3	 | 1.76     1.26   1.40   |  1.84     1.14   1.60  |
dyn     4	 | 2.57     1.75   1.46   |  2.67     1.66   1.60  |
dyn     5	 | 3.80     2.64   1.43   |  4.00     2.48   1.61  |
dyn     6	 | 5.06     3.43   1.47   |  5.15     3.21   1.60  |
2019-02-11 13:58:24 +01:00
Eugene Zhulenev
21eb97d3e0 Add PacketConv implementation for non-vectorizable src expressions 2019-02-08 15:47:25 -08:00
Eugene Zhulenev
1e36166ed1 Optimize TensorConversion evaluator: do not convert same type 2019-02-08 15:13:24 -08:00
Steven Peters
953ca5ba2f Spline.h: fix spelling "spang" -> "span" 2019-02-08 06:23:24 +00:00
Eugene Zhulenev
59998117bb Don't do parallel_pack if we can use thread_local memory in tensor contractions 2019-02-07 09:21:25 -08:00
Gael Guennebaud
013cc3a6b3 Make GEMM fallback to GEMV for runtime vectors.
This is a more general and simpler version of changeset 4c0fa6ce0f
2019-02-07 16:24:09 +01:00
Gael Guennebaud
fa2fcb4895 Backed out changeset 4c0fa6ce0f 2019-02-07 16:07:08 +01:00
Gael Guennebaud
b3c4344a68 bug #1676: workaround GCC's bug in c++17 mode. 2019-02-07 15:21:35 +01:00
Rasmus Larsen
3091c03898 Merged in ezhulenev/eigen-01 (pull request PR-581)
Parallelize tensor contraction only by sharding dimension and use 'thread-local' memory for packing

Approved-by: Rasmus Larsen <rmlarsen@google.com>
Approved-by: Gael Guennebaud <g.gael@free.fr>
2019-02-05 22:45:20 +00:00
Eugene Zhulenev
8491127082 Do not reduce parallelism too much in contractions with small number of threads 2019-02-04 12:59:33 -08:00
Eugene Zhulenev
eb21bab769 Parallelize tensor contraction only by sharding dimension and use 'thread-local' memory for packing 2019-02-04 10:43:16 -08:00
Eugene Zhulenev
6d0f6265a9 Remove duplicated comment line 2019-02-04 10:30:25 -08:00
Eugene Zhulenev
690b2c45b1 Fix GeneralBlockPanelKernel Android compilation 2019-02-04 10:29:15 -08:00
Gael Guennebaud
871e2e5339 bug #1674: disable GCC's unsafe-math-optimizations in sin/cos vectorization (results are completely wrong otherwise) 2019-02-03 08:54:47 +01:00
Rasmus Larsen
e7b481ea74 Merged in rmlarsen/eigen (pull request PR-578)
Speed up Eigen matrix*vector and vector*matrix multiplication.

Approved-by: Eugene Zhulenev <ezhulenev@google.com>
2019-02-02 01:53:44 +00:00
Sameer Agarwal
b55b5c7280 Speed up row-major matrix-vector product on ARM
The row-major matrix-vector multiplication code uses a threshold to
check if processing 8 rows at a time would thrash the cache.

This change introduces two modifications to this logic.

1. A smaller threshold for ARM and ARM64 devices.

The value of this threshold was determined empirically using a Pixel2
phone, by benchmarking a large number of matrix-vector products in the
range [1..4096]x[1..4096] and measuring performance separately on
small and little cores with frequency pinning.

On big (out-of-order) cores, this change has little to no impact. But
on the small (in-order) cores, the matrix-vector products are up to
700% faster. Especially on large matrices.

The motivation for this change was some internal code at Google which
was using hand-written NEON for implementing similar functionality,
processing the matrix one row at a time, which exhibited substantially
better performance than Eigen.

With the current change, Eigen handily beats that code.

2. Make the logic for choosing number of simultaneous rows apply
unifiormly to 8, 4 and 2 rows instead of just 8 rows.

Since the default threshold for non-ARM devices is essentially
unchanged (32000 -> 32 * 1024), this change has no impact on non-ARM
performance. This was verified by running the same set of benchmarks
on a Xeon desktop.
2019-02-01 15:23:53 -08:00
Rasmus Munk Larsen
4c0fa6ce0f Speed up Eigen matrix*vector and vector*matrix multiplication.
This change speeds up Eigen matrix * vector and vector * matrix multiplication for dynamic matrices when it is known at runtime that one of the factors is a vector.

The benchmarks below test

c.noalias()= n_by_n_matrix * n_by_1_matrix;
c.noalias()= 1_by_n_matrix * n_by_n_matrix;
respectively.

Benchmark measurements:

SSE:
Run on *** (72 X 2992 MHz CPUs); 2019-01-28T17:51:44.452697457-08:00
CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB
Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_MatVec/64                            1096       312    +71.5%
BM_MatVec/128                           4581      1464    +68.0%
BM_MatVec/256                          18534      5710    +69.2%
BM_MatVec/512                         118083     24162    +79.5%
BM_MatVec/1k                          704106    173346    +75.4%
BM_MatVec/2k                         3080828    742728    +75.9%
BM_MatVec/4k                        25421512   4530117    +82.2%
BM_VecMat/32                             352       130    +63.1%
BM_VecMat/64                            1213       425    +65.0%
BM_VecMat/128                           4640      1564    +66.3%
BM_VecMat/256                          17902      5884    +67.1%
BM_VecMat/512                          70466     24000    +65.9%
BM_VecMat/1k                          340150    161263    +52.6%
BM_VecMat/2k                         1420590    645576    +54.6%
BM_VecMat/4k                         8083859   4364327    +46.0%

AVX2:
Run on *** (72 X 2993 MHz CPUs); 2019-01-28T17:45:11.508545307-08:00
CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB
Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_MatVec/64                             619       120    +80.6%
BM_MatVec/128                           9693       752    +92.2%
BM_MatVec/256                          38356      2773    +92.8%
BM_MatVec/512                          69006     12803    +81.4%
BM_MatVec/1k                          443810    160378    +63.9%
BM_MatVec/2k                         2633553    646594    +75.4%
BM_MatVec/4k                        16211095   4327148    +73.3%
BM_VecMat/64                             925       227    +75.5%
BM_VecMat/128                           3438       830    +75.9%
BM_VecMat/256                          13427      2936    +78.1%
BM_VecMat/512                          53944     12473    +76.9%
BM_VecMat/1k                          302264    157076    +48.0%
BM_VecMat/2k                         1396811    675778    +51.6%
BM_VecMat/4k                         8962246   4459010    +50.2%

AVX512:
Run on *** (72 X 2993 MHz CPUs); 2019-01-28T17:35:17.239329863-08:00
CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB
Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_MatVec/64                             401       111    +72.3%
BM_MatVec/128                           1846       513    +72.2%
BM_MatVec/256                          36739      1927    +94.8%
BM_MatVec/512                          54490      9227    +83.1%
BM_MatVec/1k                          487374    161457    +66.9%
BM_MatVec/2k                         2016270    643824    +68.1%
BM_MatVec/4k                        13204300   4077412    +69.1%
BM_VecMat/32                             324       106    +67.3%
BM_VecMat/64                            1034       246    +76.2%
BM_VecMat/128                           3576       802    +77.6%
BM_VecMat/256                          13411      2561    +80.9%
BM_VecMat/512                          58686     10037    +82.9%
BM_VecMat/1k                          320862    163750    +49.0%
BM_VecMat/2k                         1406719    651397    +53.7%
BM_VecMat/4k                         7785179   4124677    +47.0%
Currently watchingStop watching
2019-01-31 14:24:08 -08:00
Gael Guennebaud
7ef879f6bf GEBP: improves pipelining in the 1pX4 path with FMA.
Prior to this change, a product with a LHS having 8 rows was faster with AVX-only than with AVX+FMA.
With AVX+FMA I measured a speed up of about x1.25 in such cases.
2019-01-30 23:45:12 +01:00
Gael Guennebaud
de77bf5d6c Fix compilation with ARM64. 2019-01-30 16:48:20 +01:00
Gael Guennebaud
d586686924 Workaround lack of support for arbitrary packet-type in Tensor by manually loading half/quarter packets in tensor contraction mapper. 2019-01-30 16:48:01 +01:00
Gael Guennebaud
eb4c6bb22d Fix conflicts and merge 2019-01-30 15:57:08 +01:00
Gael Guennebaud
e3622a0396 Slightly extend discussions on auto and move the content of the Pit falls wiki page here.
http://eigen.tuxfamily.org/index.php?title=Pit_Falls
2019-01-30 13:09:21 +01:00
Gael Guennebaud
df12fae8b8 According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89101, the previous GCC issue is fixed in GCC trunk (will be gcc 9). 2019-01-30 11:52:28 +01:00
Gael Guennebaud
3775926bba ARM64 & GEBP: add specialization for double +30% speed up 2019-01-30 11:49:06 +01:00
Gael Guennebaud
be5b0f664a ARM64 & GEBP: Make use of vfmaq_laneq_f32 and workaround GCC's issue in generating good ASM 2019-01-30 11:48:25 +01:00
Christoph Hertzberg
a7779a9b42 Hide some annoying unused variable warnings in g++8.1 2019-01-29 16:48:21 +01:00
Gael Guennebaud
efe02292a6 Add recent gemm related changesets and various cleanups in perf-monitoring 2019-01-29 11:53:47 +01:00
Gael Guennebaud
8a06c699d0 bug #1669: fix PartialPivLU/inverse with zero-sized matrices. 2019-01-29 10:27:13 +01:00
Gael Guennebaud
a2a07e62b9 Fix compilation with c++03 (local class cannot be template arguments), and make SparseMatrix::assignDiagonal truly protected. 2019-01-29 10:10:07 +01:00
Gael Guennebaud
f489f44519 bug #1574: implement "sparse_matrix =,+=,-= diagonal_matrix" with smart insertion strategies of missing diagonal coeffs. 2019-01-28 17:29:50 +01:00
Gael Guennebaud
803fa79767 Move evaluator<SparseCompressedBase>::find(i,j) to a more general and reusable SparseCompressedBase::lower_bound(i,j) functiion 2019-01-28 17:24:44 +01:00
Gael Guennebaud
53560f9186 bug #1672: fix unit test compilation with MSVC by adding overloads of test_is* for long long (and factorize copy/paste code through a macro) 2019-01-28 13:47:28 +01:00
Christoph Hertzberg
c9825b967e Renaming even more I identifiers 2019-01-26 13:22:13 +01:00
Christoph Hertzberg
5a52e35f9a Renaming some more I identifiers 2019-01-26 13:18:21 +01:00
Rasmus Munk Larsen
71429883ee Fix compilation error in NEON GEBP specializaition of madd. 2019-01-25 17:00:21 -08:00
Christoph Hertzberg
934b8a1304 Avoid I as an identifier, since it may clash with the C-header complex.h 2019-01-25 14:54:39 +01:00
Gael Guennebaud
ec8a387972 cleanup 2019-01-24 10:24:45 +01:00
Gael Guennebaud
6908ce2a15 More thoroughly check variadic template ctor of fixed-size vectors 2019-01-24 10:24:28 +01:00
David Tellenbach
237b03b372 PR 574: use variadic template instead of initializer_list to implement fixed-size vector ctor from coefficients. 2019-01-23 00:07:19 +01:00
Christoph Hertzberg
bd6dadcda8 Tell doxygen that cxx11 math is available 2019-01-24 00:14:02 +01:00
Gael Guennebaud
c64d5d3827 Bypass inline asm for non compatible compilers. 2019-01-23 23:43:13 +01:00
Christoph Hertzberg
e16913a45f Fix name of tutorial snippet. 2019-01-23 10:35:06 +01:00
Gael Guennebaud
80f81f9c4b Cleanup SFINAE in Array/Matrix(initializer_list) ctors and minor doc editing. 2019-01-22 17:08:47 +01:00
David Tellenbach
db152b9ee6 PR 572: Add initializer list constructors to Matrix and Array (include unit tests and doc)
- {1,2,3,4,5,...} for fixed-size vectors only
- {{1,2,3},{4,5,6}} for the general cases
- {{1,2,3,4,5,....}} is allowed for both row and column-vector
2019-01-21 16:25:57 +01:00
Gael Guennebaud
543529da6a Add more extensive tests of Array ctors, including {} variants 2019-01-22 15:30:50 +01:00
nluehr
92774f0275 Replace host_define.h with cuda_runtime_api.h 2019-01-18 16:10:09 -06:00
Gael Guennebaud
d18f49cbb3 Fix compilation of unit tests with gcc and c++17 2019-01-18 11:12:42 +01:00
Christoph Hertzberg
da0a41b9ce Mask unused-parameter warnings, when building with NDEBUG 2019-01-18 10:41:14 +01:00
Rasmus Munk Larsen
2eccbaf3f7 Add missing logical packet ops for GPU and NEON. 2019-01-17 17:45:08 -08:00
Christoph Hertzberg
d575505d25 After fixing bug #1557, boostmultiprec_7 failed with NumericalIssue instead of NoConvergence (all that matters here is no Success) 2019-01-17 19:14:07 +01:00
Gael Guennebaud
ee3662abc5 Remove some useless const_cast 2019-01-17 18:27:49 +01:00
Gael Guennebaud
0fe6b7d687 Make nestByValue works again (broken since 3.3) and add unit tests. 2019-01-17 18:27:25 +01:00
Gael Guennebaud
4b7cf7ff82 Extend reshaped unit tests and remove useless const_cast 2019-01-17 17:35:32 +01:00
Gael Guennebaud
b57c9787b1 Cleanup useless const_cast and add missing broadcast assignment tests 2019-01-17 16:55:42 +01:00
Gael Guennebaud
be05d0030d Make FullPivLU use conjugateIf<> 2019-01-17 12:01:00 +01:00
Patrick Peltzer
bba2f05064 Boosttest only available for Boost version >= 1.53.0 2019-01-17 11:54:37 +01:00
Patrick Peltzer
15e53d5d93 PR 567: makes all dense solvers inherit SoverBase (LU,Cholesky,QR,SVD).
This changeset also includes:
 * add HouseholderSequence::conjugateIf
 * define int as the StorageIndex type for all dense solvers
 * dedicated unit tests, including assertion checking
 * _check_solve_assertion(): this method can be implemented in derived solver classes to implement custom checks
 * CompleteOrthogonalDecompositions: add applyZOnTheLeftInPlace, fix scalar type in applyZAdjointOnTheLeftInPlace(), add missing assertions
 * Cholesky: add missing assertions
 * FullPivHouseholderQR: Corrected Scalar type in _solve_impl()
 * BDCSVD: Unambiguous return type for ternary operator
 * SVDBase: Corrected Scalar type in _solve_impl()
2019-01-17 01:17:39 +01:00
Gael Guennebaud
7f32109c11 Add conjugateIf<bool> members to DesneBase, TriangularView, SelfadjointView, and make PartialPivLU use it. 2019-01-17 11:33:43 +01:00
Gael Guennebaud
7b35c26b1c Doc: remove link to porting guide 2019-01-17 10:35:50 +01:00
Gael Guennebaud
4759d9e86d Doc: add manual page on STL iterators 2019-01-17 10:35:14 +01:00
Gael Guennebaud
562985bac4 bug #1646: fix false aliasing detection for A.row(0) = A.col(0);
This changeset completely disable the detection for vectors for which are current mechanism cannot detect any positive aliasing anyway.
2019-01-17 00:14:27 +01:00
Rasmus Munk Larsen
7401e2541d Fix compilation error for logical packet ops with older compilers. 2019-01-16 14:43:33 -08:00
Rasmus Munk Larsen
ee550a2ac3 Fix flaky test for tensor fft. 2019-01-16 14:03:12 -08:00
Gael Guennebaud
0f028f61cb GEBP: fix swapped kernel mode with AVX512 and complex scalars 2019-01-16 22:26:38 +01:00
Gael Guennebaud
e118ce86fd GEBP: cleanup logic to choose between a 4 packets of 1 packet 2019-01-16 21:47:42 +01:00
Gael Guennebaud
70e133333d bug #1661: fix regression in GEBP and AVX512 2019-01-16 21:22:20 +01:00
Gael Guennebaud
ce88e297dc Add a comment stating this doc page is partly obsolete. 2019-01-16 16:29:02 +01:00
Gael Guennebaud
729d1291c2 bug #1585: update doc on lazy-evaluation 2019-01-16 16:28:17 +01:00
Gael Guennebaud
c8e40edac9 Remove Eigen2ToEigen3 migration page (obsolete since 3.3) 2019-01-16 16:27:00 +01:00
Gael Guennebaud
aeffdf909e bug #1617: add unit tests for empty triangular solve. 2019-01-16 15:24:59 +01:00
Gael Guennebaud
502f717980 bug #1646: disable aliasing detection for empty and 1x1 expression 2019-01-16 14:33:45 +01:00
Gael Guennebaud
0b466b6933 bug #1633: use proper type for madd temporaries, factorize RhsPacketx4. 2019-01-16 13:50:13 +01:00
Renjie Liu
dbfcceabf5 Bug: 1633: refactor gebp kernel and optimize for neon 2019-01-16 12:51:36 +08:00
Gael Guennebaud
2b70b2f570 Make Transform::rotation() an alias to Transform::linear() in the case of an Isometry 2019-01-15 22:50:42 +01:00
Gael Guennebaud
2c2c114995 Silent maybe-uninitialized warnings by gcc 2019-01-15 16:53:15 +01:00
Gael Guennebaud
6ec6bf0b0d Enable visitor on empty matrices (the visitor is left unchanged), and protect min/maxCoeff(Index*,Index*) on empty matrices by an assertion (+ doc & unit tests) 2019-01-15 15:21:14 +01:00
Gael Guennebaud
027e44ed24 bug #1592: makes partial min/max reductions trigger an assertion on inputs with a zero reduction length (+doc and tests) 2019-01-15 15:13:24 +01:00
Gael Guennebaud
f8bc5cb39e Fix detection of vector-at-time: use Rows/Cols instead of MaxRow/MaxCols.
This fix VectorXd(n).middleCol(0,0).outerSize() which was equal to 1.
2019-01-15 15:09:49 +01:00
Gael Guennebaud
32d7232aec fix always true warning with gcc 4.7 2019-01-15 11:18:48 +01:00
Gael Guennebaud
6cf7afa3d9 Typo 2019-01-15 11:04:37 +01:00
Gael Guennebaud
e7d4d4f192 cleanup 2019-01-15 10:51:03 +01:00
Rasmus Larsen
7b3aab0936 Merged in rmlarsen/eigen (pull request PR-570)
Add support for inverse hyperbolic functions. Fix cost of division.
2019-01-14 21:31:33 +00:00
Rasmus Munk Larsen
8bf00c2baf Remove extra <tr>. 2019-01-14 13:29:29 -08:00
Rasmus Munk Larsen
ec7fe83554 Merge. 2019-01-14 13:26:58 -08:00
Rasmus Munk Larsen
2ea4efc0c3 Merge. 2019-01-14 13:26:58 -08:00
Rasmus Munk Larsen
2c5843dbbb Update documentation. 2019-01-14 13:26:34 -08:00
Gael Guennebaud
250dcd1fdb bug #1652: fix position of EIGEN_ALIGN16 attributes in Neon and Altivec 2019-01-14 21:45:56 +01:00
Rasmus Larsen
5a59452aae Merged eigen/eigen into default 2019-01-14 10:23:23 -08:00
Gael Guennebaud
3c9e6d206d AVX512: fix pgather/pscatter for Packet4cd and unaligned pointers 2019-01-14 17:57:28 +01:00
Gael Guennebaud
61b6eb05fe AVX512 (r)sqrt(double) was mistakenly disabled with clang and others 2019-01-14 17:28:47 +01:00
Gael Guennebaud
ccddeaad90 fix warning 2019-01-14 16:51:16 +01:00
Gael Guennebaud
d4881751d3 Doc: add Isometry in the list of supported Mode of Transform<> 2019-01-14 16:38:26 +01:00
Greg Coombe
9d988a1e1a Initialize isometric transforms like affine transforms.
The isometric transform, like the affine transform, has an implicit last
row of [0, 0, 0, 1]. This was not being properly initialized, as verified
by a new test function.
2019-01-11 23:14:35 -08:00
Gael Guennebaud
4356a55a61 PR 571: Implements an accurate argument reduction algorithm for huge inputs of sin/cos and call it instead of falling back to std::sin/std::cos.
This makes both the small and huge argument cases faster because:
- for small inputs this removes the last pselect
- for large inputs only the reduction part follows a scalar path,
the rest use the same SIMD path as the small-argument case.
2019-01-14 13:54:01 +01:00
Gael Guennebaud
f566724023 Fix StorageIndex FIXME in dense LU solvers 2019-01-13 17:54:30 +01:00
Rasmus Munk Larsen
1c6e6e2c3f Merge. 2019-01-11 17:47:11 -08:00
Rasmus Larsen
0ba3b45419 Merged eigen/eigen into default 2019-01-11 17:46:04 -08:00
Rasmus Munk Larsen
28ba1b2c32 Add support for inverse hyperbolic functions.
Fix cost of division.
2019-01-11 17:45:37 -08:00
Rasmus Munk Larsen
89c4001d6f Fix warnings in ptrue for complex and half types. 2019-01-11 14:10:57 -08:00
Rasmus Munk Larsen
a49d01edba Fix warnings in ptrue for complex and half types. 2019-01-11 13:18:17 -08:00
Eugene Zhulenev
1e6d15b55b Fix shorten-64-to-32 warning in TensorContractionThreadPool 2019-01-11 11:41:53 -08:00
Rasmus Munk Larsen
df29511ac0 Fix merge. 2019-01-11 10:36:36 -08:00
Rasmus Munk Larsen
8e71ed4cc9 Merge. 2019-01-11 10:35:07 -08:00
Rasmus Munk Larsen
fff5a5b579 Resolve. 2019-01-11 10:28:52 -08:00
Rasmus Munk Larsen
9396ace46b Merge. 2019-01-11 10:28:52 -08:00
Rasmus Larsen
74882471d0 Merged eigen/eigen into default 2019-01-11 10:20:55 -08:00
Rasmus Munk Larsen
e9936cf2b9 Merge. 2019-01-11 09:58:33 -08:00
Gael Guennebaud
9005f0111f Replace compiler's alignas/alignof extension by respective c++11 keywords when available. This also fix a compilation issue with gcc-4.7. 2019-01-11 17:10:54 +01:00
Mark D Ryan
3c9add6598 Remove reinterpret_cast from AVX512 complex implementation
The reinterpret_casts used in ptranspose(PacketBlock<Packet8cf,4>&)
ptranspose(PacketBlock<Packet8cf,8>&) don't appear to be working
correctly.  They're used to convert the kernel parameters to
PacketBlock<Packet8d,T>& so that the complex number versions of
ptranspose can be written using the existing double implementations.
Unfortunately, they don't seem to work and are responsible for 9 unit
test failures in the AVX512 build of tensorflow master.  This commit
fixes the issue by manually initialising PacketBlock<Packet8d,T>
variables with the contents of the kernel parameter before calling
the double version of ptranspose, and then copying the resulting
values back into the kernel parameter before returning.
2019-01-11 14:02:09 +01:00
Christoph Hertzberg
0522460a0d bug #1656: Enable failtests only if BUILD_TESTING is enabled 2019-01-11 11:07:56 +01:00
Eugene Zhulenev
0abe03764c Fix shorten-64-to-32 warning in TensorContractionThreadPool 2019-01-10 10:27:55 -08:00
Rasmus Munk Larsen
fcfced13ed Rename pones -> ptrue. Use _CMP_TRUE_UQ where appropriate. 2019-01-09 17:20:33 -08:00
Rasmus Munk Larsen
ce38c342c3 merge. 2019-01-09 17:20:33 -08:00
Rasmus Munk Larsen
a05ec7993e merge 2019-01-09 17:17:30 -08:00
Rasmus Munk Larsen
e15bb785ad Collapsed revision
* Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
* Collapsed revision
* Simplify a bit.
* Undo useless diffs.
* Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
f6ba6071c5 Fix typo. 2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
8f04442526 Collapsed revision
* Collapsed revision
* Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
* Collapsed revision
* Simplify a bit.
* Undo useless diffs.
* Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
8f178429b9 Collapsed revision
* Collapsed revision
* Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
* Collapsed revision
* Simplify a bit.
* Undo useless diffs.
* Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
1119c73d22 Collapsed revision
* Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
* Collapsed revision
* Simplify a bit.
* Undo useless diffs.
* Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
e00521b514 Undo useless diffs. 2019-01-09 16:32:53 -08:00
Rasmus Munk Larsen
f2767112c8 Simplify a bit. 2019-01-09 16:29:18 -08:00
Rasmus Munk Larsen
cb955df9a6 Add packet up "pones". Write pnot(a) as pxor(pones(a), a). 2019-01-09 16:17:08 -08:00
Rasmus Larsen
cb3c059fa4 Merged eigen/eigen into default 2019-01-09 15:04:17 -08:00
Gael Guennebaud
d812f411c3 bug #1654: fix compilation with cuda and no c++11 2019-01-09 18:00:05 +01:00
Gael Guennebaud
3492a1ca74 fix plog(+inf) with AVX512 2019-01-09 16:53:37 +01:00
Gael Guennebaud
47810cf5b7 Add dedicated implementations of predux_any for AVX512, NEON, and Altivec/VSE 2019-01-09 16:40:42 +01:00
Gael Guennebaud
3f14e0d19e fix warning 2019-01-09 15:45:21 +01:00
Gael Guennebaud
aeec68f77b Add missing pcmp_lt and others for AVX512 2019-01-09 15:36:41 +01:00
Gael Guennebaud
e6b217b8dd bug #1652: implements a much more accurate version of vectorized sin/cos. This new version achieve same speed for SSE/AVX, and is slightly faster with FMA. Guarantees are as follows:
- no FMA: 1ULP up to 3pi, 2ULP up to sin(25966) and cos(18838), fallback to std::sin/cos for larger inputs
  - FMA: 1ULP up to sin(117435.992) and cos(71476.0625), fallback to std::sin/cos for larger inputs
2019-01-09 15:25:17 +01:00
Eugene Zhulenev
e70ffef967 Optimize evalShardedByInnerDim 2019-01-08 16:26:31 -08:00
Rasmus Munk Larsen
055f0b73db Add support for pcmp_eq and pnot, including for complex types. 2019-01-07 16:53:36 -08:00
Eugene Zhulenev
190d053e41 Explicitly set fill character when printing aligned data to ostream 2019-01-03 14:55:28 -08:00
Mark D Ryan
bc5dd4cafd PR560: Fix the AVX512f only builds
Commit c53eececb0
 introduced AVX512 support for complex numbers but required
avx512dq to build.  Commit 1d683ae2f5
 fixed some but not, it would seem all,
of the hard avx512dq dependencies.  Build failures are still evident on
Eigen and TensorFlow when compiling with just avx512f and no avx512dq
using gcc 7.3.  Looking at the code there does indeed seem to be a problem.
Commit c53eececb0
 calls avx512dq intrinsics directly, e.g, _mm512_extractf32x8_ps
and _mm512_and_ps.  This commit fixes the issue by replacing the direct
intrinsic calls with the various wrapper functions that are safe to use on
avx512f only builds.
2019-01-03 14:33:04 +01:00
Gael Guennebaud
697fba3bb0 Fix unit test 2018-12-27 11:20:47 +01:00
Gael Guennebaud
60d3fe9a89 One more stupid AVX 512 fix (I don't have direct access to AVX512 machines) 2018-12-24 13:05:03 +01:00
Gael Guennebaud
4aa667b510 Add EIGEN_STRONG_INLINE where required 2018-12-24 10:45:01 +01:00
Gael Guennebaud
961ff567e8 Add missing pcmp_lt_or_nan for AVX512 2018-12-23 22:13:29 +01:00
Gael Guennebaud
0f6f75bd8a Implement a faster fix for sin/cos of large entries that also correctly handle INF input. 2018-12-23 17:26:21 +01:00
Gael Guennebaud
38d704def8 Make sure that psin/pcos return number in [-1,1] for large inputs (though sin/cos on large entries is quite useless because it's inaccurate) 2018-12-23 16:13:24 +01:00
Gael Guennebaud
5713fb7feb Fix plog(+INF): it returned ~87 instead of +INF 2018-12-23 15:40:52 +01:00
Christoph Hertzberg
6dd93f7e3b Make code compile again for older compilers.
See https://stackoverflow.com/questions/7411515/
2018-12-22 13:09:07 +01:00
Gustavo Lima Chaves
1024a70e82 gebp: Add new ½ and ¼ packet rows per (peeling) round on the lhs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The patch works by altering the gebp lhs packing routines to also
consider ½ and ¼ packet lenght rows when packing, besides the original
whole package and row-by-row attempts. Finally, gebp itself will try
to fit a fraction of a packet at a time if:

i) ½ and/or ¼ packets are available for the current context (e.g. AVX2
   and SSE-sized SIMD register for x86)

ii) The matrix's height is favorable to it (it may be it's too small
    in that dimension to take full advantage of the current/maximum
    packet width or it may be the case that last rows may take
    advantage of smaller packets before gebp goes row-by-row)

This helps mitigate huge slowdowns one had on AVX512 builds when
compared to AVX2 ones, for some dimensions. Gains top at an extra 1x
in throughput. This patch is a complement to changeset 4ad359237a
.

Since packing is changed, Eigen users which would go for very
low-level API usage, like TensorFlow, will have to be adapted to work
fine with the changes.
2018-12-21 11:03:18 -08:00
Gustavo Lima Chaves
e763fcd09e Introducing "vectorized" byte on unpacket_traits structs
This is a preparation to a change on gebp_traits, where a new template
argument will be introduced to dictate the packet size, so it won't be
bound to the current/max packet size only anymore.

By having packet types defined early on gebp_traits, one has now to
act on packet types, not scalars anymore, for the enum values defined
on that class. One approach for reaching the vectorizable/size
properties one needs there could be getting the packet's scalar again
with unpacket_traits<>, then the size/Vectorizable enum entries from
packet_traits<>. It turns out guards like "#ifndef
EIGEN_VECTORIZE_AVX512" at AVX/PacketMath.h will hide smaller packet
variations of packet_traits<> for some types (and it makes sense to
keep that). In other words, one can't go back to the scalar and create
a new PacketType, as this will always lead to the maximum packet type
for the architecture.

The less costly/invasive solution for that, thus, is to add the
vectorizable info on every unpacket_traits struct as well.
2018-12-19 14:24:44 -08:00
Gael Guennebaud
efa4c9c40f bug #1615: slightly increase the default unrolling limit to compensate for changeset 101ea26f5e
.
This solves a performance regression with clang and 3x3 matrix products.
2018-12-13 10:42:39 +01:00
Gael Guennebaud
f20c991679 add changesets related to matrix product perf. 2018-12-13 10:33:29 +01:00
Rasmus Munk Larsen
dd6d65898a Fix shorten-64-to-32 warning. Use regular memcpy if num_threads==0. 2018-12-12 14:45:31 -08:00
Gael Guennebaud
f582ea3579 Fix compilation with expression template scalar type. 2018-12-12 22:47:00 +01:00
Gael Guennebaud
cfc70dc13f Add regression test for bug #1174 2018-12-12 18:03:31 +01:00
Gael Guennebaud
2de8da70fd bug #1557: fix RealSchur and EigenSolver for matrices with only zeros on the diagonal. 2018-12-12 17:30:08 +01:00
Gael Guennebaud
72c0bbe2bd Simplify handling of tests that must fail to compile.
Each test is now a normal ctest target, and build properties (compiler+flags) are preserved (instead of starting a new build-dir from scratch).
2018-12-12 15:48:36 +01:00
Gael Guennebaud
37c91e1836 bug #1644: fix warning 2018-12-11 22:07:20 +01:00
Gael Guennebaud
f159cf3d75 Artificially increase l1-blocking size for AVX512. +10% speedup with current kernels.
With a 6pX4 kernel (not committed yet), this provides a +20% speedup.
2018-12-11 15:36:27 +01:00
Gael Guennebaud
0a7e7af6fd Properly set the number of registers for AVX512 2018-12-11 15:33:17 +01:00
Gael Guennebaud
7166496f70 bug #1643: fix compilation issue with gcc and no optimizaion 2018-12-11 13:24:42 +01:00
Gael Guennebaud
0d90637838 enable spilling workaround on architectures with SSE/AVX 2018-12-10 23:22:44 +01:00
Gael Guennebaud
cf697272e1 Remove debug code. 2018-12-09 23:05:46 +01:00
Gael Guennebaud
450dc97c6b Various fixes in polynomial solver and its unit tests:
- cleanup noise in imaginary part of real roots
 - take into account the magnitude of the derivative to check roots.
 - use <= instead of < at appropriate places
2018-12-09 22:54:39 +01:00
Gael Guennebaud
348bb386d1 Enable "old" CMP0026 policy (not perfect, but better than dozens of warning) 2018-12-08 18:59:51 +01:00
Gael Guennebaud
bff90bf270 workaround "may be used uninitialized" warning 2018-12-08 18:58:28 +01:00
Gael Guennebaud
81c27325ae bug #1641: fix testing of pandnot and fix pandnot for complex on SSE/AVX/AVX512 2018-12-08 14:27:48 +01:00
Gael Guennebaud
426bce7529 fix EIGEN_GEBP_2PX4_SPILLING_WORKAROUND for non vectorized type, and non x86/64 target 2018-12-08 09:44:21 +01:00
Gael Guennebaud
cd25b538ab Fix noise in sparse_basic_3 (numerical cancellation) 2018-12-08 00:13:37 +01:00
Gael Guennebaud
efaf03bf96 Fix noise in lu unit test 2018-12-08 00:05:03 +01:00
Gael Guennebaud
956678a4ef bug #1515: disable gebp's 3pX4 micro kernel for MSVC<=19.14 because of register spilling. 2018-12-07 18:03:36 +01:00
Gael Guennebaud
7b6d0ff1f6 Enable FMA with MSVC (through /arch:AVX2). To make this possible, I also has to turn the #warning regarding AVX512-FMA to a #error. 2018-12-07 15:14:50 +01:00
Gael Guennebaud
f233c6194d bug #1637: workaround register spilling in gebp with clang>=6.0+AVX+FMA 2018-12-07 10:01:09 +01:00
Gael Guennebaud
ae59a7652b bug #1638: add a warning if avx512 is enabled without SSE/AVX FMA 2018-12-07 09:23:28 +01:00
Gael Guennebaud
4e7746fe22 bug #1636: fix gemm performance issue with gcc>=6 and no FMA 2018-12-07 09:15:46 +01:00
Gael Guennebaud
cbf2f4b7a0 AVX512f includes FMA but GCC does not define __FMA__ with -mavx512f only 2018-12-06 18:21:56 +01:00
Gael Guennebaud
1d683ae2f5 Fix compilation with avx512f only, i.e., no AVX512DQ 2018-12-06 18:11:07 +01:00
Gael Guennebaud
aab749b1c3 fix test regarding AVX512 vectorization of complexes. 2018-12-06 16:55:00 +01:00
Gael Guennebaud
c53eececb0 Implement AVX512 vectorization of std::complex<float/double> 2018-12-06 15:58:06 +01:00
Gael Guennebaud
3fba59ea59 temporarily re-disable SSE/AVX vectorization of complex<> on AVX512 -> this needs to be fixed though! 2018-12-06 00:13:26 +01:00
Gael Guennebaud
1ac2695ef7 bug #1636: fix compilation with some ABI versions. 2018-12-06 00:05:10 +01:00
Rasmus Munk Larsen
47d8b741b2 #elif -> #else to fix GPU build. 2018-12-05 13:19:31 -08:00
Rasmus Munk Larsen
8a02883d58 Merged in markdryan/eigen/avx512-contraction-2 (pull request PR-554)
Fix tensor contraction on AVX512 builds

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-12-05 18:19:32 +00:00
Gael Guennebaud
acc3459a49 Add help messages in the quick ref/ascii docs regarding slicing, indexing, and reshaping. 2018-12-05 17:17:23 +01:00
Gael Guennebaud
e2e897298a Fix page nesting 2018-12-05 17:13:46 +01:00
Christoph Hertzberg
c1d356e8b4 bug #1635: Use infinity from Numtraits instead of creating it manually. 2018-12-05 15:01:04 +01:00
Mark D Ryan
36f8f6d0be Fix evalShardedByInnerDim for AVX512 builds
evalShardedByInnerDim ensures that the values it passes for start_k and
end_k to evalGemmPartialWithoutOutputKernel are multiples of 8 as the kernel
does not work correctly when the values of k are not multiples of the
packet_size.  While this precaution works for AVX builds, it is insufficient
for AVX512 builds where the maximum packet size is 16.  The result is slightly
incorrect float32 contractions on AVX512 builds.

This commit fixes the problem by ensuring that k is always a multiple of
the packet_size if the packet_size is > 8.
2018-12-05 12:29:03 +01:00
Rasmus Munk Larsen
b57b31cce9 Merged in ezhulenev/eigen-01 (pull request PR-553)
Do not disable alignment with EIGEN_GPUCC

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-12-04 23:47:19 +00:00
Eugene Zhulenev
0bb15bb6d6 Update checks in ConfigureVectorization.h 2018-12-03 17:10:40 -08:00
Eugene Zhulenev
fd0fbfa9b5 Do not disable alignment with EIGEN_GPUCC 2018-12-03 15:54:10 -08:00
Christoph Hertzberg
919414b9fe bug #785: Make Cholesky decomposition work for empty matrices 2018-12-03 16:18:15 +01:00
Gael Guennebaud
0ea7ae7213 Add missing padd for Packet8i (it was implicitly generated by clang and gcc) 2018-11-30 21:52:25 +01:00
Gael Guennebaud
ab4df3e6ff bug #1634: remove double copy in move-ctor of non movable Matrix/Array 2018-11-30 21:25:51 +01:00
Gael Guennebaud
c785464430 Add packet sin and cos to Altivec/VSX and NEON 2018-11-30 16:21:33 +01:00
Gael Guennebaud
69ace742be Several improvements regarding packet-bitwise operations:
- add unit tests
- optimize their AVX512f implementation
- add missing implementations (half, Packet4f, ...)
2018-11-30 15:56:08 +01:00
Gael Guennebaud
fa87f9d876 Add psin/pcos on AVX512 -> almost for free, at last! 2018-11-30 14:33:13 +01:00
Gael Guennebaud
c68bd2fa7a Cleanup 2018-11-30 14:32:31 +01:00
Gael Guennebaud
f91500d303 Fix pandnot order in AVX512 2018-11-30 14:32:06 +01:00
Gael Guennebaud
b477d60bc6 Extend the generic psin_float code to handle cosine and make SSE and AVX use it (-> this adds pcos for AVX) 2018-11-30 11:26:30 +01:00
Gael Guennebaud
e19ece822d Disable fma gcc's workaround for gcc >= 8 (based on GEMM benchmarks) 2018-11-28 17:56:24 +01:00
Gael Guennebaud
41052f63b7 same for pmax 2018-11-28 17:17:28 +01:00
Gael Guennebaud
3e95e398b6 pmin/pmax o SSE: make sure to use AVX instruction with AVX enabled, and disable gcc workaround for fixed gcc versions 2018-11-28 17:14:20 +01:00
Gael Guennebaud
aa6097395b Add missing SSE/AVX type-casting in AVX512 mode 2018-11-28 16:09:08 +01:00
Gael Guennebaud
48fe78c375 bug #1630: fix linspaced when requesting smaller packet size than default one. 2018-11-28 13:15:06 +01:00
Eugene Zhulenev
80f1651f35 Use explicit packet type in SSE/PacketMath pldexp 2018-11-27 17:25:49 -08:00
Benoit Jacob
a4159dba08 do not read buffers out of bounds -- load only the 4 bytes we know exist here. Could also have done a vld1_lane_f32 but doing so here, without the overhead of initializing the unused lane, would have triggered used-of-uninitialized-value errors in tools such as ASan. Note that this code is sub-optimal before or after this change: we should be reading either 2 or 4 float32 values per load-instruction (2 for ARM in-order cores with an affinity for 8-byte loads; 4 for ARM out-of-order cores able to dual-issue 16-byte load instructions with arithmetic instructions). Before or after this patch, we are only loading 4 bytes of useful data here (even if before this patch, we were technically loading 8, only to use only the 4 first). 2018-11-27 16:53:14 -05:00
Gael Guennebaud
b131a4db24 bug #1631: fix compilation with ARM NEON and clang, and cleanup the weird pshiftright_and_cast and pcast_and_shiftleft functions. 2018-11-27 23:45:00 +01:00
Gael Guennebaud
a1a5fbbd21 Update pshiftleft to pass the shift as a true compile-time integer. 2018-11-27 22:57:30 +01:00
Gael Guennebaud
fa7fd61eda Unify SSE/AVX psin functions.
It is based on the SSE version which is much more accurate, though very slightly slower.
This changeset also includes the following required changes:
 - add packet-float to packet-int type traits
 - add packet float<->int reinterpret casts
 - add faster pselect for AVX based on blendv
2018-11-27 22:41:51 +01:00
Rasmus Munk Larsen
08edbc8cfe Merged in bjacob/eigen/fixbuild (pull request PR-549)
fix the build on 64-bit ARM when NEON is disabled
2018-11-27 20:14:12 +00:00
Benoit Jacob
7b1cb8a440 fix the build on 64-bit ARM when NEON is disabled 2018-11-27 11:11:02 -05:00
Gael Guennebaud
b5695a6008 Unify Altivec/VSX pexp(double) with default implementation 2018-11-27 13:53:05 +01:00
Gael Guennebaud
7655a8af6e cleanup 2018-11-26 23:21:29 +01:00
Gael Guennebaud
502f92fa10 Unify SSE and AVX pexp for double. 2018-11-26 23:12:44 +01:00
Gael Guennebaud
4a347a0054 Unify NEON's pexp with generic implementation 2018-11-26 22:15:44 +01:00
Gael Guennebaud
5c8406babc Unify Altivec/VSX's pexp with generic implementation 2018-11-26 16:47:13 +01:00
Gael Guennebaud
cf8b85d5c5 Unify SSE and AVX implementation of pexp 2018-11-26 16:36:19 +01:00
Gael Guennebaud
c2f35b1b47 Unify Altivec/VSX's plog with generic implementation, and enable it! 2018-11-26 15:58:11 +01:00
Gael Guennebaud
c24e98e6a8 Unify NEON's plog with generic implementation 2018-11-26 15:02:16 +01:00
Gael Guennebaud
2c44c40114 First step toward a unification of packet log implementation, currently only SSE and AVX are unified.
To this end, I added the following functions: pzero, pcmp_*, pfrexp, pset1frombits functions.
2018-11-26 14:21:24 +01:00
Gael Guennebaud
5f6045077c Make SSE/AVX pandnot(A,B) consistent with generic version, i.e., "A and not B" 2018-11-26 14:14:07 +01:00
Gael Guennebaud
382279eb7f Extend unit test to recursively check half-packet types and non packet types 2018-11-26 14:10:07 +01:00
Gael Guennebaud
0836a715d6 bug #1611: fix plog(0) on NEON 2018-11-26 09:08:38 +01:00
Patrik Huber
95566eeed4 Fix typos 2018-11-23 22:22:14 +00:00
Gael Guennebaud
e3b22a6bd0 merge 2018-11-23 16:06:21 +01:00
Gael Guennebaud
ccabdd88c9 Fix reserved usage of double __ in macro names 2018-11-23 16:01:47 +01:00
Gael Guennebaud
572d62697d check two ctors 2018-11-23 15:37:09 +01:00
Gael Guennebaud
354f14293b Fix double = bool ! 2018-11-23 15:12:06 +01:00
Gael Guennebaud
a7842daef2 Fix several uninitialized member from ctor 2018-11-23 15:10:28 +01:00
Christoph Hertzberg
ea60a172cf Add default constructor to Bar to make test compile again with clang-3.8 2018-11-23 14:24:22 +01:00
Christoph Hertzberg
806352d844 Small typo found be Patrick Huber (pull request PR-547) 2018-11-23 12:34:27 +00:00
Gael Guennebaud
a476054879 bug #1624: improve matrix-matrix product on ARM 64, 20% speedup 2018-11-23 10:25:19 +01:00
Gael Guennebaud
c685fe9838 Move regression test to right unit test file 2018-11-21 15:59:47 +01:00
Gael Guennebaud
4b2cebade8 Workaround weird MSVC bug 2018-11-21 15:53:37 +01:00
Christoph Hertzberg
0ec8afde57 Fixed most conversion warnings in MatrixFunctions module 2018-11-20 16:23:28 +01:00
Deven Desai
e7e6809e6b ROCm/HIP specfic fixes + updates
1. Eigen/src/Core/arch/GPU/Half.h

   Updating the HIPCC implementation half so that it can declared as a __shared__ variable


2. Eigen/src/Core/util/Macros.h, Eigen/src/Core/util/Memory.h

   introducing a EIGEN_USE_STD(func) macro that calls
   - std::func be default
   - ::func when eigen is being compiled with HIPCC

   This change was requested in the previous HIP PR
   (https://bitbucket.org/eigen/eigen/pull-requests/518/pr-with-hip-specific-fixes-for-the-eigen/diff)


3. unsupported/Eigen/CXX11/src/Tensor/TensorDeviceThreadPool.h

   Removing EIGEN_DEVICE_FUNC attribute from pure virtual methods as it is not supported by HIPCC


4. unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h

   Disabling the template specializations of InnerMostDimReducer as they run into HIPCC link errors
2018-11-19 18:13:59 +00:00
Gael Guennebaud
6a510fe69c Make MaxPacketSize a true upper bound, even for fixed-size inputs 2018-11-16 11:25:32 +01:00
Gael Guennebaud
43c987b1c1 Add explicit regression test for bug #1622 2018-11-16 11:24:51 +01:00
Mark D Ryan
670d56441c PR 544: Set requestedAlignment correctly for SliceVectorizedTraversals
Commit aa110e681b
 optimised the multiplication of small dyanmically
sized matrices by restricting the packet size to a maximum of 4, increasing
the chances that SIMD instructions are used in the computation.  However, it
introduced a mismatch between the packet size and the requestedAlignment.  This
mismatch can lead to crashes when the destination is not aligned.  This patch
fixes the issue by ensuring that the AssignmentTraits are correctly computed
when using a restricted packet size.
* * *
Bind LinearPacketType to MaxPacketSize

This commit applies any packet size limit specified when instantiating
copy_using_evaluator_traits to the LinearPacketType, providing that the
size of the destination is not known at compile time.
* * *
Add unit test for restricted packet assignment

A new unit test is added to check that multiplication of small dynamically
sized matrices works correctly when the packet size is restricted to 4 and
the destination is unaligned.
2018-11-13 16:15:08 +01:00
Nikolaus Demmel
3dc0845046 Fix typo in comment on EIGEN_MAX_STATIC_ALIGN_BYTES 2018-11-14 18:11:30 +01:00
Gael Guennebaud
7fddc6a51f typo 2018-11-14 14:43:18 +01:00
Gael Guennebaud
449f948b2a help doxygen linking to DenseBase::NulllaryExpr 2018-11-14 14:42:59 +01:00
Gael Guennebaud
4263f23c28 Improve doc on multi-threading and warn about hyper-threading 2018-11-14 14:42:29 +01:00
Gael Guennebaud
db529ae4ec doxygen does not like \addtogroup and \ingroup in the same line 2018-11-14 14:42:06 +01:00
Rasmus Munk Larsen
72928a2c8a Merged in rmlarsen/eigen2 (pull request PR-543)
Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.

Approved-by: Eugene Zhulenev <ezhulenev@google.com>
2018-11-13 17:10:30 +00:00
Rasmus Munk Larsen
cda479d626 Remove accidental changes. 2018-11-12 18:34:04 -08:00
Rasmus Munk Larsen
719d9aee65 Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth. 2018-11-12 17:46:02 -08:00
Rasmus Munk Larsen
77b447c24e Add optimized version of logistic function for float. As an example, this is about 50% faster than the existing version on Haswell using AVX. 2018-11-12 13:42:24 -08:00
Gael Guennebaud
c81bdbdadc Add manual doc on STL-compatible iterators 2018-11-12 22:06:33 +01:00
Gael Guennebaud
0105146915 Fix warning in c++03 2018-11-10 09:11:38 +01:00
Rasmus Munk Larsen
93f9988a7e A few small fixes to a) prevent throwing in ctors and dtors of the threading code, and b) supporting matrix exponential on platforms with 113 bits of mantissa for long doubles. 2018-11-09 14:15:32 -08:00
Gael Guennebaud
784a3f13cf bug #1619: fix mixing of const and non-const generic iterators 2018-11-09 21:45:10 +01:00
Gael Guennebaud
db9a9a12ba bug #1619: make const and non-const iterators compatible 2018-11-09 16:49:19 +01:00
Gael Guennebaud
fbd6e7b025 add missing ref to a.zeta(b) 2018-11-09 13:53:42 +01:00
Gael Guennebaud
dffd1e11de Limit the size of the toc 2018-11-09 13:52:34 +01:00
Gael Guennebaud
a88e0a0e95 Update doxy hacks wrt doxygen 1.8.13/14 2018-11-09 13:52:10 +01:00
Gael Guennebaud
bd9a00718f Let doxygen sees lastN 2018-11-09 11:35:48 +01:00
Gael Guennebaud
d7c644213c Add and update manual pages for slicing, indexing, and reshaping. 2018-11-09 11:35:27 +01:00
Gael Guennebaud
a368848473 Recent xcode versions does support EIGEN_HAS_STATIC_ARRAY_TEMPLATE 2018-11-09 10:33:17 +01:00
Gael Guennebaud
f62a0f69c6 Fix max-size in indexed-view 2018-11-08 18:40:22 +01:00
Gael Guennebaud
bf495859ff Merged in glchaves/eigen (pull request PR-539)
Vectorize row-by-row gebp loop iterations on 16 packets as well
2018-11-07 07:21:15 +00:00
Gael Guennebaud
995730fc6c Add option to disable plot generation 2018-11-07 00:41:16 +01:00
Gustavo Lima Chaves
4ad359237a Vectorize row-by-row gebp loop iterations on 16 packets as well
Signed-off-by: Gustavo Lima Chaves <gustavo.lima.chaves@intel.com>
Signed-off-by: Mark D. Ryan <mark.d.ryan@intel.com>
2018-11-06 10:48:42 -08:00
Gael Guennebaud
9d318b92c6 add unit tests for bug #1619 2018-11-01 15:14:50 +01:00
Matthieu Vigne
8d7a73e48e bug #1617: Fix SolveTriangular.solveInPlace crashing for empty matrix.
This made FullPivLU.kernel() crash when used on the zero matrix.
Add unit test for FullPivLU.kernel() on the zero matrix.
2018-10-31 20:28:18 +01:00
Christoph Hertzberg
66b28e290d bug #1618: Use different power-of-2 check to avoid MSVC warning 2018-11-01 13:23:19 +01:00
Rasmus Munk Larsen
07fcdd1438 Merged in ezhulenev/eigen-02 (pull request PR-534)
Fix cxx11_tensor_{block_access, reduction} tests
2018-10-25 18:34:35 +00:00
Eugene Zhulenev
8a977c1f46 Fix cxx11_tensor_{block_access, reduction} tests 2018-10-25 11:31:29 -07:00
Halie Murray-Davis
fb62d6d96e Fix typo in tutorial documentation. 2018-10-25 04:55:34 +00:00
Christoph Hertzberg
b5f077d22c Document EIGEN_NO_IO preprocessor directive 2018-10-25 16:49:25 +02:00
Christian von Schultz
4a40b3785d Collapsed revision (based on pull request PR-325)
* Support compiling without IO streams

Add the preprocessor definition EIGEN_NO_IO which, if defined,
disables all use of the IO streams part of the standard library.
2018-10-22 21:14:40 +02:00
Rasmus Munk Larsen
14054e217f Do not rely on the compiler generating __device__ functions for constexpr in Cuda (via EIGEN_CONSTEXPR_ARE_DEVICE_FUNC. This breaks several target in the TensorFlow Cuda build, e.g.,
INFO: From Compiling tensorflow/core/kernels/maxpooling_op_gpu.cu.cc:
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNHWC< ::Eigen::half> ") is not allowed

/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code"

/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNCHW< ::Eigen::half> ") is not allowed

/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code

4 errors detected in the compilation of "/tmp/tmpxft_00000011_00000000-6_maxpooling_op_gpu.cu.cpp1.ii".
ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: output 'tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o' was not created
ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: Couldn't build file tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o: not all outputs were created or valid
2018-10-22 16:18:24 -07:00
Rasmus Munk Larsen
954b4ca9d0 Suppress compiler warning about unused global variable. 2018-10-22 13:48:56 -07:00
Rasmus Munk Larsen
9caafca550 Merged in rmlarsen/eigen (pull request PR-532)
Only set EIGEN_CONSTEXPR_ARE_DEVICE_FUNC for clang++ if cxx_relaxed_constexpr is available.
2018-10-19 21:37:14 +00:00
Christoph Hertzberg
449ff74672 Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file).
Manually grafted from d107a371c6
2018-10-19 21:10:28 +02:00
Rasmus Munk Larsen
39fec15d5c Merged eigen/eigen into default 2018-10-19 09:48:19 -07:00
Christoph Hertzberg
40fa6f98bf bug #1606: Explicitly set the standard before find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 in favor of EIGEN_COMPILER_SUPPORT_CPP11.
Grafted manually from a4afa90d16
2018-10-19 17:20:51 +02:00
Rasmus Munk Larsen
d8f285852b Only set EIGEN_CONSTEXPR_ARE_DEVICE_FUNC for clang++ if cxx_relaxed_constexpr is available. 2018-10-18 16:55:02 -07:00
Rasmus Munk Larsen
dda68f56ec Fix GPU build due to gpu_assert not always being defined. 2018-10-18 16:29:29 -07:00
Gael Guennebaud
1dcf5a6ed8 fix typo in doc 2018-10-17 09:29:36 +02:00
Eugene Zhulenev
9e96e91936 Move from rvalue arguments in ThreadPool enqueue* methods 2018-10-16 16:48:32 -07:00
Eugene Zhulenev
217d839816 Reduce thread scheduling overhead in parallelFor 2018-10-16 14:53:06 -07:00
Rasmus Munk Larsen
d52763bb4f Merged in ezhulenev/eigen-02 (pull request PR-528)
[TensorBlockIO] Check if it's allowed to squeeze inner dimensions

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-10-16 15:39:40 +00:00
Gael Guennebaud
0f780bb0b4 Fix float-to-double warning 2018-10-16 09:19:45 +02:00
Eugene Zhulenev
900c7c61bb Check if it's allowed to squueze inner dimensions in TensorBlockIO 2018-10-15 16:52:33 -07:00
Gael Guennebaud
a39e0f7438 bug #1612: fix regression in "outer-vectorization" of partial reductions for PacketSize==1 (aka complex<double>) 2018-10-16 01:04:25 +02:00
Gael Guennebaud
e3b85771d7 Show call stack in case of failing sparse solving. 2018-10-16 00:43:44 +02:00
Gael Guennebaud
d2d570c116 Remove useless (and broken) resize 2018-10-16 00:42:48 +02:00
Gael Guennebaud
f0fb95135d Iterative solvers: unify and fix handling of multiple rhs.
m_info was not properly computed and the logic was repeated in several places.
2018-10-15 23:47:46 +02:00
Gael Guennebaud
2747b98cfc DGMRES: fix null rhs, fix restart, fix m_isDeflInitialized for multiple solve 2018-10-15 23:46:00 +02:00
Gael Guennebaud
d835a0bf53 relax number of iterations checks to avoid false negatives 2018-10-15 10:23:32 +02:00
Gael Guennebaud
3a33db4de5 merge 2018-10-15 09:22:27 +02:00
Rasmus Munk Larsen
0ed811a9c1 Suppress unused variable compiler warning in sparse subtest 3. 2018-10-12 13:41:57 -07:00
Mark D Ryan
aa110e681b PR 526: Speed up multiplication of small, dynamically sized matrices
The Packet16f, Packet8f and Packet8d types are too large to use with dynamically
sized matrices typically processed by the SliceVectorizedTraversal specialization of
the dense_assignment_loop.  Using these types is likely to lead to little or no
vectorization.  Significant slowdown in the multiplication of these small matrices can
be observed when building with AVX and AVX512 enabled.

This patch introduces a new dense_assignment_kernel that is used when
computing small products whose operands have dynamic dimensions.  It ensures that the
PacketSize used is no larger than 4, thereby increasing the chance that vectorized
instructions will be used when computing the product.

I tested all 969 possible combinations of M, K, and N that are handled by the
dense_assignment_loop on x86 builds.  Although a few combinations are slowed down
by this patch they are far outnumbered by the cases that are sped up, as the
following results demonstrate.


Disabling Packed8d on AVX512 builds:

Total Cases:             969
Better:                  511
Worse:                   85
Same:                    373
Max Improvement:         169.00% (4 8 6)
Max Degradation:         36.50% (8 5 3)
Median Improvement:      35.46%
Median Degradation:      17.41%
Total FLOPs Improvement: 19.42%


Disabling Packet16f and Packed8f on AVX512 builds:

Total Cases:             969
Better:                  658
Worse:                   5
Same:                    306
Max Improvement:         214.05% (8 6 5)
Max Degradation:         22.26% (16 2 1)
Median Improvement:      60.05%
Median Degradation:      13.32%
Total FLOPs Improvement: 59.58%


Disabling Packed8f on AVX builds:

Total Cases:             969
Better:                  663
Worse:                   96
Same:                    210
Max Improvement:         155.29% (4 10 5)
Max Degradation:         35.12% (8 3 2)
Median Improvement:      34.28%
Median Degradation:      15.05%
Total FLOPs Improvement: 26.02%
2018-10-12 15:20:21 +02:00
Eugene Zhulenev
d9392f9e55 Fix code format 2018-11-02 14:51:35 -07:00
Eugene Zhulenev
118520f04a Workaround nbcc+msvc compiler bug 2018-11-02 14:48:28 -07:00
Christoph Hertzberg
24dc076519 Explicitly convert 0 to Scalar for custom types 2018-10-12 10:22:19 +02:00
Gael Guennebaud
8214cf1896 Make sparse_basic includable from sparse_extra, but disable it since sparse_basic(DynamicSparseMatrix) does not compile at all anyways 2018-10-11 10:27:23 +02:00
Gael Guennebaud
43633fbaba Fix warning with AVX512f 2018-10-11 10:13:48 +02:00
Gael Guennebaud
97e2c808e9 Fix avx512 plog(NaN) to return NaN instead of +inf 2018-10-11 10:13:13 +02:00
Gael Guennebaud
b3f66d29a5 Enable avx512 plog with clang 2018-10-11 10:12:21 +02:00
Gael Guennebaud
2ef1b39674 Relaxed fastmath unit test: if std::foo fails, then let's only trigger a warning is numext::foo fails too.
A true error will triggered only if std::foo works but our numext::foo fails.
2018-10-11 09:45:30 +02:00
Gael Guennebaud
1d5a6363ea relax numerical tests from equal to approx (x87) 2018-10-11 09:29:56 +02:00
Gael Guennebaud
f0aa7e40fc Fix regression in changeset 5335659c47 2018-10-10 23:47:30 +02:00
Gael Guennebaud
ce243ee45b bug #520: add diagmat +/- diagmat operators. 2018-10-10 23:38:22 +02:00
Gael Guennebaud
5335659c47 Merged in ezhulenev/eigen-02 (pull request PR-525)
Fix bug in partial reduction of expressions requiring evaluation
2018-10-10 20:59:00 +00:00
Gael Guennebaud
eec0dfd688 bug #632: add specializations for res ?= dense +/- sparse and res ?= sparse +/- dense.
They are rewritten as two compound assignment to by-pass hybrid dense-sparse iterator.
2018-10-10 22:50:15 +02:00
Eugene Zhulenev
8e6dc2c81d Fix bug in partial reduction of expressions requiring evaluation 2018-10-10 13:23:52 -07:00
Gael Guennebaud
76ceae49c1 bug #1609: add inplace transposition unit test 2018-10-10 21:48:58 +02:00
Eugene Zhulenev
2bf1a31d81 Use void type if stl-style iterators are not supported 2018-10-10 10:31:40 -07:00
Christoph Hertzberg
f3130ee1ba Avoid empty macro arguments 2018-10-10 08:23:40 +02:00
Rasmus Munk Larsen
e8918743c1 Merged in ezhulenev/eigen-01 (pull request PR-523)
Compile time detection for unimplemented stl-style iterators
2018-10-09 23:42:01 +00:00
Eugene Zhulenev
befcac883d Hide stl-container detection test under #if 2018-10-09 15:36:01 -07:00
Eugene Zhulenev
c0ca8a9fa3 Compile time detection for unimplemented stl-style iterators 2018-10-09 15:28:23 -07:00
Gael Guennebaud
1dd1f8e454 bug #65: add vectorization of partial reductions along the outer-dimension, for instance: colmajor_mat.rowwise().mean() 2018-10-09 23:36:50 +02:00
Gael Guennebaud
bfa2a81a50 Make redux_vec_unroller more flexible regarding packet-type 2018-10-09 23:30:41 +02:00
Gael Guennebaud
c0c3be26ed Extend unit tests for partial reductions 2018-10-09 22:54:54 +02:00
Christoph Hertzberg
3f2c8b7ff0 Fix a lot of Doxygen warnings in Tensor module 2018-10-09 20:22:47 +02:00
Christoph Hertzberg
f6359ad795 Small Doxygen fixes 2018-10-09 19:33:35 +02:00
Gael Guennebaud
7a882c05ab Fix compilation on CUDA 2018-10-09 17:02:16 +02:00
Gael Guennebaud
93a6192e98 fix mpreal for mpfr<4.0.0 2018-10-09 09:15:22 +02:00
Rasmus Munk Larsen
d16634c4d4 Fix out-of bounds access in TensorArgMax.h. 2018-10-08 16:41:36 -07:00
Rasmus Munk Larsen
1a737e1d6a Fix contraction test. 2018-10-08 16:37:07 -07:00
Gael Guennebaud
e00487f7d2 bug #1603: add parenthesis around ternary operator in function body as well as a harmless attempt to make MSVC happy. 2018-10-08 22:27:04 +02:00
Gael Guennebaud
2eda9783de typo 2018-10-08 21:37:46 +02:00
Gael Guennebaud
c6e2dde714 fix c++11 deprecated warning 2018-10-08 18:26:05 +02:00
Gael Guennebaud
6cc9b2c831 fix warning in mpreal.h 2018-10-08 18:25:37 +02:00
Gael Guennebaud
649d4758a6 merge 2018-10-08 17:35:18 +02:00
Gael Guennebaud
aa5820056e Unify c++11 usage in doc's examples and snippets 2018-10-08 17:32:54 +02:00
Gael Guennebaud
e29bfe8479 Update included mpreal header to 3.6.5 and fix deprecated warnings. 2018-10-08 17:09:23 +02:00
Gael Guennebaud
64b1a15318 Workaround stupid warning 2018-10-08 12:01:18 +02:00
Gael Guennebaud
c9643f4a6f Disable C++11 deprecated warning when limiting Eigen to C++98 2018-10-08 10:43:43 +02:00
Gael Guennebaud
774bb9d6f7 fix a doxygen issue 2018-10-08 09:30:15 +02:00
Gael Guennebaud
6c3f6cd52b Fix maybe-uninitialized warning 2018-10-07 23:29:51 +02:00
Gael Guennebaud
bcb7c66b53 Workaround gcc's alloc-size-larger-than= warning 2018-10-07 21:55:59 +02:00
Gael Guennebaud
16b2001ece Fix gcc 8.1 warning: "maybe use uninitialized" 2018-10-07 21:54:49 +02:00
Gael Guennebaud
6512c5e136 Implement a better workaround for GCC's bug #87544 2018-10-07 15:00:05 +02:00
Gael Guennebaud
409132bb81 Workaround gcc bug making it trigger an invalid warning 2018-10-07 09:23:15 +02:00
Gael Guennebaud
c6a1ab4036 Workaround MSVC compilation issue 2018-10-06 13:49:17 +02:00
Gael Guennebaud
e21766c6f5 Clarify doc of rowwise/colwise/vectorwise. 2018-10-05 23:12:09 +02:00
Gael Guennebaud
d92f004ab7 Simplify API by removing allCols/allRows and reusing rowwise/colwise to define iterators over rows/columns 2018-10-05 23:11:21 +02:00
Gael Guennebaud
91613bf2c2 Add support for c++11 snippets 2018-10-05 23:08:39 +02:00
Gael Guennebaud
3e64b1fc86 Move iterators to internal, improve doc, make unit test c++03 friendly 2018-10-03 15:13:15 +02:00
Gael Guennebaud
2b2b4d0580 fix unused warning 2018-10-03 14:16:21 +02:00
Gael Guennebaud
8a1e98240e add unit tests 2018-10-03 11:56:27 +02:00
Gael Guennebaud
5f26f57598 Change the logic of A.reshaped<Order>() to be a simple alias to A.reshaped<Order>(AutoSize,fix<1>).
This means that now AutoOrder is allowed, and it always return a column-vector.
2018-10-03 11:41:47 +02:00
Gael Guennebaud
0481900e25 Add pointer-based iterator for direct-access expressions 2018-10-02 23:44:36 +02:00
Christoph Hertzberg
c5f1d0a72a Fix shadow warning 2018-10-02 19:01:08 +02:00
Christoph Hertzberg
b92c71235d Move struct outside of method for C++03 compatibility. 2018-10-02 18:59:10 +02:00
Christoph Hertzberg
051f9c1aff Make code compile in C++03 mode again 2018-10-02 18:36:30 +02:00
Christoph Hertzberg
b786ce8c72 Fix conversion warning ... again 2018-10-02 18:35:25 +02:00
Gael Guennebaud
8c38528168 Factorize RowsProxy/ColsProxy and related iterators using subVector<>(Index) 2018-10-02 14:03:26 +02:00
Gael Guennebaud
12487531ce Add templated subVector<Vertical/Horizonal>(Index) aliases to col/row(Index) methods (plus subVectors<>() to retrieve the number of rows/columns) 2018-10-02 14:02:34 +02:00
Gael Guennebaud
37e29fc893 Use Index instead of ptrdiff_t or int, fix random-accessors. 2018-10-02 13:29:32 +02:00
Gael Guennebaud
de2efbc43c bug #1605: workaround ABI issue with vector types (aka __m128) versus scalar types (aka float) 2018-10-01 23:45:55 +02:00
Gael Guennebaud
b0c66adfb1 bug #231: initial implementation of STL iterators for dense expressions 2018-10-01 23:21:37 +02:00
Christoph Hertzberg
564ca71e39 Merged in deven-amd/eigen/HIP_fixes (pull request PR-518)
PR with HIP specific fixes (for the eigen nightly regression failures in HIP mode)
2018-10-01 16:51:04 +00:00
Deven Desai
94898488a6 This commit contains the following (HIP specific) updates:
- unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h
  Changing "pass-by-reference" argument to be "pass-by-value" instead
  (in a  __global__ function decl).
  "pass-by-reference" arguments to __global__ functions are unwise,
  and will be explicitly flagged as errors by the newer versions of HIP.

- Eigen/src/Core/util/Memory.h
- unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h
  Changes introduced in recent commits breaks the HIP compile.
  Adding EIGEN_DEVICE_FUNC attribute to some functions and
  calling ::malloc/free instead of the corresponding std:: versions
  to get the HIP compile working again

- unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h
  Change introduced a recent commit breaks the HIP compile
  (link stage errors out due to failure to inline a function).
  Disabling the recently introduced code (only for HIP compile), to get
  the eigen nightly testing going again.
  Will submit another PR once we have te proper fix.

- Eigen/src/Core/util/ConfigureVectorization.h
  Enabling GPU VECTOR support when HIP compiler is in use
  (for both the host and device compile phases)
2018-10-01 14:28:37 +00:00
Rasmus Munk Larsen
2088c0897f Merged eigen/eigen into default 2018-09-28 16:00:46 -07:00
Rasmus Munk Larsen
31629bb964 Get rid of unused variable warning. 2018-09-28 16:00:09 -07:00
Eugene Zhulenev
bb13d5d917 Fix bug in copy optimization in Tensor slicing. 2018-09-28 14:34:42 -07:00
Rasmus Munk Larsen
104e8fa074 Fix a few warnings and rename a variable to not shadow "last". 2018-09-28 12:00:08 -07:00
Rasmus Munk Larsen
7c1b47840a Merged in ezhulenev/eigen-01 (pull request PR-514)
Add tests for evalShardedByInnerDim contraction + fix bugs
2018-09-28 18:37:54 +00:00
Eugene Zhulenev
524c81f3fa Add tests for evalShardedByInnerDim contraction + fix bugs 2018-09-28 11:24:08 -07:00
Christoph Hertzberg
86ba50be39 Fix integer conversion warnings 2018-09-28 19:33:39 +02:00
Eugene Zhulenev
e95696acb3 Optimize TensorBlockCopyOp 2018-09-27 14:49:26 -07:00
Eugene Zhulenev
9f33e71e9d Revert code lost in merge 2018-09-27 12:08:17 -07:00
Eugene Zhulenev
a7a3e9f2b6 Merge with eigen/eigen default 2018-09-27 12:05:06 -07:00
Eugene Zhulenev
9f4988959f Remove explicit mkldnn support and redundant TensorContractionKernelBlocking 2018-09-27 11:49:19 -07:00
Rasmus Munk Larsen
1e5750a5b8 Merged in rmlarsen/eigen4 (pull request PR-511)
Parallelize tensor contraction over the inner dimension.
2018-09-27 17:18:32 +00:00
Gael Guennebaud
af3ad4b513 oops, I've been too fast in previous copy/paste 2018-09-27 09:28:57 +02:00
Gael Guennebaud
24b163a877 #pragma GCC diagnostic push/pop is not supported prioro to gcc 4.6 2018-09-27 09:23:54 +02:00
Eugene Zhulenev
b314376f9c Test mkldnn pack for doubles 2018-09-26 18:22:24 -07:00
Eugene Zhulenev
22ed98a331 Conditionally add mkldnn test 2018-09-26 17:57:37 -07:00
Rasmus Munk Larsen
d956204ab2 Remove "false &&" left over from test. 2018-09-26 17:03:30 -07:00
Rasmus Munk Larsen
3815aeed7a Parallelize tensor contraction over the inner dimension in cases where where one or both of the outer dimensions (m and n) are small but k is large. This speeds up individual matmul microbenchmarks by up to 85%.
Naming below is BM_Matmul_M_K_N_THREADS, measured on a 2-socket Intel Broadwell-based server.

Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_Matmul_1_80_13522_1                  387457    396013     -2.2%
BM_Matmul_1_80_13522_2                  406487    230789    +43.2%
BM_Matmul_1_80_13522_4                  395821    123211    +68.9%
BM_Matmul_1_80_13522_6                  391625     97002    +75.2%
BM_Matmul_1_80_13522_8                  408986    113828    +72.2%
BM_Matmul_1_80_13522_16                 399988     67600    +83.1%
BM_Matmul_1_80_13522_22                 411546     60044    +85.4%
BM_Matmul_1_80_13522_32                 393528     57312    +85.4%
BM_Matmul_1_80_13522_44                 390047     63525    +83.7%
BM_Matmul_1_80_13522_88                 387876     63592    +83.6%
BM_Matmul_1_1500_500_1                  245359    248119     -1.1%
BM_Matmul_1_1500_500_2                  401833    143271    +64.3%
BM_Matmul_1_1500_500_4                  210519    100231    +52.4%
BM_Matmul_1_1500_500_6                  251582     86575    +65.6%
BM_Matmul_1_1500_500_8                  211499     80444    +62.0%
BM_Matmul_3_250_512_1                    70297     68551     +2.5%
BM_Matmul_3_250_512_2                    70141     52450    +25.2%
BM_Matmul_3_250_512_4                    67872     58204    +14.2%
BM_Matmul_3_250_512_6                    71378     63340    +11.3%
BM_Matmul_3_250_512_8                    69595     41652    +40.2%
BM_Matmul_3_250_512_16                   72055     42549    +40.9%
BM_Matmul_3_250_512_22                   70158     54023    +23.0%
BM_Matmul_3_250_512_32                   71541     56042    +21.7%
BM_Matmul_3_250_512_44                   71843     57019    +20.6%
BM_Matmul_3_250_512_88                   69951     54045    +22.7%
BM_Matmul_3_1500_512_1                  369328    374284     -1.4%
BM_Matmul_3_1500_512_2                  428656    223603    +47.8%
BM_Matmul_3_1500_512_4                  205599    139508    +32.1%
BM_Matmul_3_1500_512_6                  214278    139071    +35.1%
BM_Matmul_3_1500_512_8                  184149    142338    +22.7%
BM_Matmul_3_1500_512_16                 156462    156983     -0.3%
BM_Matmul_3_1500_512_22                 163905    158259     +3.4%
BM_Matmul_3_1500_512_32                 155314    157662     -1.5%
BM_Matmul_3_1500_512_44                 235434    158657    +32.6%
BM_Matmul_3_1500_512_88                 156779    160275     -2.2%
BM_Matmul_1500_4_512_1                  363358    349528     +3.8%
BM_Matmul_1500_4_512_2                  303134    263319    +13.1%
BM_Matmul_1500_4_512_4                  176208    130086    +26.2%
BM_Matmul_1500_4_512_6                  148026    115449    +22.0%
BM_Matmul_1500_4_512_8                  131656     98421    +25.2%
BM_Matmul_1500_4_512_16                 134011     82861    +38.2%
BM_Matmul_1500_4_512_22                 134950     85685    +36.5%
BM_Matmul_1500_4_512_32                 133165     90081    +32.4%
BM_Matmul_1500_4_512_44                 133203     90644    +32.0%
BM_Matmul_1500_4_512_88                 134106    100566    +25.0%
BM_Matmul_4_1500_512_1                  439243    435058     +1.0%
BM_Matmul_4_1500_512_2                  451830    257032    +43.1%
BM_Matmul_4_1500_512_4                  276434    164513    +40.5%
BM_Matmul_4_1500_512_6                  182542    144827    +20.7%
BM_Matmul_4_1500_512_8                  179411    166256     +7.3%
BM_Matmul_4_1500_512_16                 158101    155560     +1.6%
BM_Matmul_4_1500_512_22                 152435    155448     -1.9%
BM_Matmul_4_1500_512_32                 155150    149538     +3.6%
BM_Matmul_4_1500_512_44                 193842    149777    +22.7%
BM_Matmul_4_1500_512_88                 149544    154468     -3.3%
2018-09-26 16:47:13 -07:00
Eugene Zhulenev
71cd3fbd6a Support multiple contraction kernel types in TensorContractionThreadPool 2018-09-26 11:08:47 -07:00
Christoph Hertzberg
0a3356f4ec Don't deactivate BVH test for clang (probably, this was failing for very old versions of clang) 2018-09-25 20:26:16 +02:00
Gael Guennebaud
41c3a2ffc1 Fix documentation of reshape to vectors. 2018-09-25 16:35:44 +02:00
Christoph Hertzberg
2c083ace3e Provide EIGEN_OVERRIDE and EIGEN_FINAL macros to mark virtual function overrides 2018-09-24 18:01:17 +02:00
Gael Guennebaud
626942d9dd fix alignment issue in ploaddup for AVX512 2018-09-28 16:57:32 +02:00
Gael Guennebaud
84a1101b36 Merge with default. 2018-09-23 21:52:58 +02:00
Gael Guennebaud
795e12393b Fix logic in diagonal*dense product in a corner case.
The problem was for: diag(1x1) * mat(1,n)
2018-09-22 16:44:33 +02:00
Gael Guennebaud
bac36d0996 Demangle Travseral and Unrolling in Redux 2018-09-21 23:03:45 +02:00
Gael Guennebaud
c696dbcaa6 Fiw shadowing of last and all 2018-09-21 23:02:33 +02:00
Christoph Hertzberg
e3c8289047 Replace unused PREDICATE by corresponding STATIC_ASSERT 2018-09-21 21:15:51 +02:00
Gael Guennebaud
1bf12880ae Add reshaped<>() shortcuts when returning vectors and remove the reshaping version of operator()(all) 2018-09-21 16:50:04 +02:00
Gael Guennebaud
4291f167ee Add missing plugins to DynamicSparseMatrix -- fix sparse_extra_3 2018-09-21 14:53:43 +02:00
Gael Guennebaud
03a0cb2b72 fix unalignedcount for avx512 2018-09-21 14:40:26 +02:00
Gael Guennebaud
371068992a Add more debug output 2018-09-21 14:32:39 +02:00
Gael Guennebaud
91716f03a7 Fix vectorization logic unit test for AVX512 2018-09-21 14:32:24 +02:00
Gael Guennebaud
b00e48a867 Improve slice-vectorization logic for redux (significant speed-up for reduxion of blocks) 2018-09-21 13:45:56 +02:00
Gael Guennebaud
a488d59787 merge with default Eigen 2018-09-21 11:51:49 +02:00
Gael Guennebaud
47720e7970 Doc fixes 2018-09-21 11:48:22 +02:00
Gael Guennebaud
3ec2985914 Merged indexing cleanup (pull request PR-506) 2018-09-21 09:36:05 +00:00
Gael Guennebaud
651e5d4866 Fix EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF_VECTORIZABLE_FIXED_SIZE for AVX512 or AVX with malloc aligned on 8 bytes only.
This change also make it future proof for AVX1024
2018-09-21 10:33:22 +02:00
Eugene Zhulenev
719e438a20 Collapsed revision
* Split cxx11_tensor_executor test
* Register test parts with EIGEN_SUFFIXES
* Fix EIGEN_SUFFIXES in cxx11_tensor_executor test
2018-09-20 15:19:12 -07:00
Gael Guennebaud
f0ef3467de Fix doc 2018-09-20 22:57:28 +02:00
Gael Guennebaud
617f75f117 Add indexing namespace 2018-09-20 22:57:10 +02:00
Gael Guennebaud
0c56d22e2e Fix shadowing 2018-09-20 22:56:21 +02:00
Rasmus Munk Larsen
8e2be7777e Merged eigen/eigen into default 2018-09-20 11:41:15 -07:00
Rasmus Munk Larsen
5d2e759329 Initialize BlockIteratorState in a C++03 compatible way. 2018-09-20 11:40:43 -07:00
Gael Guennebaud
e04faca930 merge 2018-09-20 18:33:54 +02:00
Gael Guennebaud
d37188b9c1 Fix MPrealSupport 2018-09-20 18:30:10 +02:00
Gael Guennebaud
3c6dc93f99 Fix GPU support. 2018-09-20 18:29:21 +02:00
Gael Guennebaud
e0f6d352fb Rename test/array.cpp to test/array_cwise.cpp to avoid conflicts with the array header. 2018-09-20 18:07:32 +02:00
Gael Guennebaud
eeeb18814f Fix warning 2018-09-20 17:48:56 +02:00
Gael Guennebaud
9419f506d0 Fix regression introduced by the previous fix for AVX512.
It brokes the complex-complex case on SSE.
2018-09-20 17:32:34 +02:00
Christoph Hertzberg
a0166ab651 Workaround for spurious "array subscript is above array bounds" warnings with g++4.x 2018-09-20 17:08:43 +02:00
Gael Guennebaud
e38d1ab4d1 Workaround increases required alignment warning 2018-09-20 17:07:33 +02:00
Christoph Hertzberg
c50250cb24 Avoid warning "suggest braces around initialization of subobject".
This test is not run in C++03 mode, so no compatibility is lost.
2018-09-20 17:03:42 +02:00
Gael Guennebaud
71496b0e25 Fix gebp kernel for real+complex in case only reals are vectorized (e.g., AVX512).
This commit also removes "half-packet" from data-mappers: it was not used and conceptually broken anyways.
2018-09-20 17:01:24 +02:00
Gael Guennebaud
5a30eed17e Fix warnings in AVX512 2018-09-20 16:58:51 +02:00
Gael Guennebaud
2cf6d3050c Disable ignoring attributes warning 2018-09-20 11:38:19 +02:00
Rasmus Munk Larsen
44d8274383 Cast to longer type. 2018-09-19 13:31:42 -07:00
Rasmus Munk Larsen
d638b62dda Silence compiler warning. 2018-09-19 13:27:55 -07:00
Rasmus Munk Larsen
db9c9df59a Silence more compiler warnings. 2018-09-19 11:50:27 -07:00
Rasmus Munk Larsen
febd09dcc0 Silence compiler warnings in ThreadPoolInterface.h. 2018-09-19 11:11:04 -07:00
Gael Guennebaud
c3a19527a2 Fix doc wrt previous change 2018-09-19 11:49:26 +02:00
Gael Guennebaud
dfa8439e4d Update reshaped API to use RowMajor/ColMajor directly as integral values instead of introducing RowOrder/ColOrder types.
The API changed from A.respahed(rows,cols,RowOrder) to A.template reshaped<RowOrder>(rows,cols).
2018-09-19 11:49:26 +02:00
luz.paz"
f67b19a884 [PATCH 1/2] Misc. typos
From 68d431b4c14ad60a778ee93c1f59ecc4b931950e Mon Sep 17 00:00:00 2001
Found via `codespell -q 3 -I ../eigen-word-whitelist.txt` where the whitelists consists of:
```
als
ans
cas
dum
lastr
lowd
nd
overfl
pres
preverse
substraction
te
uint
whch
```
---
 CMakeLists.txt                                | 26 +++++++++----------
 Eigen/src/Core/GenericPacketMath.h            |  2 +-
 Eigen/src/SparseLU/SparseLU.h                 |  2 +-
 bench/bench_norm.cpp                          |  2 +-
 doc/HiPerformance.dox                         |  2 +-
 doc/QuickStartGuide.dox                       |  2 +-
 .../Eigen/CXX11/src/Tensor/TensorChipping.h   |  6 ++---
 .../Eigen/CXX11/src/Tensor/TensorDeviceGpu.h  |  2 +-
 .../src/Tensor/TensorForwardDeclarations.h    |  4 +--
 .../src/Tensor/TensorGpuHipCudaDefines.h      |  2 +-
 .../Eigen/CXX11/src/Tensor/TensorReduction.h  |  2 +-
 .../CXX11/src/Tensor/TensorReductionGpu.h     |  2 +-
 .../test/cxx11_tensor_concatenation.cpp       |  2 +-
 unsupported/test/cxx11_tensor_executor.cpp    |  2 +-
 14 files changed, 29 insertions(+), 29 deletions(-)
2018-09-18 04:15:01 -04:00
Gael Guennebaud
297ca62319 ease transition by adding placeholders::all/last/and as deprecated 2018-09-17 16:24:52 +02:00
Gael Guennebaud
2014c7ae28 Move all, last, end from Eigen::placeholders namespace to Eigen::, and rename end to lastp1 to avoid conflicts with std::end. 2018-09-15 14:35:10 +02:00
Gael Guennebaud
82772e8d9d Rename Symbolic namespace to symbolic to be consistent with numext namespace 2018-09-15 14:16:20 +02:00
Rasmus Munk Larsen
400512bfad Merged in ezhulenev/eigen-02 (pull request PR-501)
Enable DSizes type promotion with c++03
2018-09-19 00:50:04 +00:00
Eugene Zhulenev
c4627039ac Support static dimensions (aka IndexList) in Tensor::resize(...) 2018-09-18 14:25:21 -07:00
Gael Guennebaud
3e8188fc77 bug #1600: initialize m_info to InvalidInput by default, even though m_info is not accessible until it has been initialized (assert) 2018-09-18 21:24:48 +02:00
Eugene Zhulenev
218a7b9840 Enable DSizes type promotion with c++03 compilers 2018-09-18 10:57:00 -07:00
Ravi Kiran
1f0c941c3d Collapsed revision
* Merged eigen/eigen into default
2018-09-17 18:29:12 -07:00
Rasmus Munk Larsen
03a88c57e1 Merged in ezhulenev/eigen-02 (pull request PR-498)
Add DSizes index type promotion
2018-09-17 21:58:38 +00:00
Rasmus Munk Larsen
5ca0e4a245 Merged in ezhulenev/eigen-01 (pull request PR-497)
Fix warnings in IndexList array_prod
2018-09-17 20:15:06 +00:00
Eugene Zhulenev
a5cd4e9ad1 Replace deprecated Eigen::DenseIndex with Eigen::Index in TensorIndexList 2018-09-17 10:58:07 -07:00
Gael Guennebaud
b311bfb752 bug #1596: fix inclusion of Eigen's header within unsupported modules. 2018-09-17 09:54:29 +02:00
Gael Guennebaud
72f19c827a typo 2018-09-16 22:10:34 +02:00
Eugene Zhulenev
66f056776f Add DSizes index type promotion 2018-09-15 15:17:38 -07:00
Eugene Zhulenev
f313126dab Fix warnings in IndexList array_prod 2018-09-15 13:47:54 -07:00
Christoph Hertzberg
42705ba574 Fix weird error for building with g++-4.7 in C++03 mode. 2018-09-15 12:43:41 +02:00
Rasmus Munk Larsen
c2383f95af Merged in ezhulenev/eigen/fix_dsizes (pull request PR-494)
Fix DSizes IndexList constructor
2018-09-15 02:36:19 +00:00
Rasmus Munk Larsen
30290cdd56 Merged in ezhulenev/eigen/moar_eigen_fixes_3 (pull request PR-493)
Const cast scalar pointer in TensorSlicingOp evaluator

Approved-by: Sameer Agarwal <sameeragarwal@google.com>
2018-09-15 02:35:07 +00:00
Eugene Zhulenev
f7d0053cf0 Fix DSizes IndexList constructor 2018-09-14 19:19:13 -07:00
Rasmus Munk Larsen
601e289d27 Merged in ezhulenev/eigen/moar_eigen_fixes_1 (pull request PR-492)
Explicitly construct tensor block dimensions from evaluator dimensions
2018-09-15 01:36:21 +00:00
Eugene Zhulenev
71070a1e84 Const cast scalar pointer in TensorSlicingOp evaluator 2018-09-14 17:17:50 -07:00
Eugene Zhulenev
4863375723 Explicitly construct tensor block dimensions from evaluator dimensions 2018-09-14 16:55:05 -07:00
Rasmus Munk Larsen
14e35855e1 Merged in chtz/eigen-maxsizevector (pull request PR-490)
Let MaxSizeVector respect alignment of objects

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-09-14 23:29:24 +00:00
Rasmus Munk Larsen
281e631839 Merged in ezhulenev/eigen/indexlist_to_dsize (pull request PR-491)
Support reshaping with static shapes and dimensions conversion in tensor broadcasting

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-09-14 22:45:52 +00:00
Eugene Zhulenev
1b8d70a22b Support reshaping with static shapes and dimensions conversion in tensor broadcasting 2018-09-14 15:25:27 -07:00
Christoph Hertzberg
007f165c69 bug #1598: Let MaxSizeVector respect alignment of objects and add a unit test
Also revert 8b3d9ed081
2018-09-14 20:21:56 +02:00
Christoph Hertzberg
d7378aae8e Provide EIGEN_ALIGNOF macro, and give handmade_aligned_malloc the possibility for alignments larger than the standard alignment. 2018-09-14 20:17:47 +02:00
Rasmus Munk Larsen
9b864cdb37 Merged in rmlarsen/eigen3 (pull request PR-480)
Avoid compilation error in C++11 test when EIGEN_AVOID_STL_ARRAY is set.
2018-09-14 00:05:09 +00:00
Rasmus Munk Larsen
d0eef5fe6c Don't use bracket syntax in ctor. 2018-09-13 17:04:05 -07:00
Rasmus Munk Larsen
6313dde390 Fix merge error. 2018-09-13 16:42:05 -07:00
Rasmus Munk Larsen
0db590d22d Backed out changeset 01197e4452 2018-09-13 16:20:57 -07:00
Rasmus Munk Larsen
b3f4c067d9 Merge 2018-09-13 16:18:52 -07:00
Rasmus Munk Larsen
2b07018140 Enable vectorized version on GPUs. The underlying bug has been fixed. 2018-09-13 16:12:22 -07:00
Rasmus Munk Larsen
53568e3549 Merged in ezhulenev/eigen/tiled_evalution_support (pull request PR-444)
Tiled evaluation for Tensor ops

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
Approved-by: Gael Guennebaud <g.gael@free.fr>
2018-09-13 22:05:47 +00:00
Eugene Zhulenev
01197e4452 Fix warnings 2018-09-13 15:03:36 -07:00
Gael Guennebaud
1141bcf794 Fix conjugate-gradient for very small rhs 2018-09-13 23:53:28 +02:00
Gael Guennebaud
7f3b17e403 MSVC 2015 supports c++11 thread-local-storage 2018-09-13 18:15:07 +02:00
Eugene Zhulenev
d138fe341d Fis static_assert in test to conform c++11 standard 2018-09-11 17:23:18 -07:00
Rasmus Munk Larsen
e289f44c56 Don't vectorize the MeanReducer unless pdiv is available. 2018-09-11 14:09:00 -07:00
Eugene Zhulenev
55bb7e7935 Merge with upstream eigen/default 2018-09-11 13:33:06 -07:00
Eugene Zhulenev
81b38a155a Fix compilation of tiled evaluation code with c++03 2018-09-11 13:32:32 -07:00
Rasmus Munk Larsen
5da960702f Merged eigen/eigen into default 2018-09-11 10:08:46 -07:00
Rasmus Munk Larsen
46f88fc454 Use numerically stable tree reduction in TensorReduction. 2018-09-11 10:08:10 -07:00
Justin Carpentier
4827bec776 LLT: correct doc and add missing reference for the return type of rankUpdate
---
 Eigen/src/Cholesky/LLT.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
2018-09-11 09:33:21 +02:00
Rasmus Munk Larsen
3d057e0453 Avoid compilation error in C++11 test when EIGEN_AVOID_STL_ARRAY is set. 2018-09-06 12:59:36 -07:00
cgs1019
c6066ac411 Make param name and docs constistent for JacobiRotation::makeGivens
Previously the rendered math in the doc string called the optional return value
'r', while the actual parameter and the doc string text referred to the
parameter as 'z'. This changeset renames all the z's to r's to match the math.
2018-09-06 11:04:17 -04:00
Alexey Frunze
edeee16a16 Fix build failures in matrix_power and matrix_exponential tests.
This fixes the static assertion complaining about double being
used in place of long double. This happened on MIPS32, where
double and long double have the same type representation.
This can be simulated on x86 as well if we pass -mlong-double-64
to g++.
2018-08-31 14:11:10 -07:00
Deven Desai
c64fe9ea1f Updates to fix HIP-clang specific compile errors.
Compiling the eigen unittests with hip-clang (HIP with clang as the underlying compiler instead of hcc or nvcc), results in compile errors. The changes in this commit fix those compile errors. The main change is to convert a few instances of "__device__" to "EIGEN_DEVICE_FUNC"
2018-08-30 20:22:16 +00:00
Rasmus Munk Larsen
8b3d9ed081 Use padding instead of alignment attribute, which MaxSizeVector does not respect. This leads to undefined behavior and hard-to-trace bugs. 2018-09-05 11:20:06 -07:00
Gael Guennebaud
5927eef612 Enable std::result_of for msvc 2015 and later 2018-09-13 09:44:46 +02:00
Christoph Hertzberg
3adece4827 Fix misleading indentation of errorCode and make it loop-local 2018-09-12 14:41:38 +02:00
Christoph Hertzberg
7e9c9fbb2d Disable type-limits warnings for g++ < 4.8 2018-09-12 14:40:39 +02:00
Christoph Hertzberg
ba2c8efdcf EIGEN_UNUSED is not supported by g++4.7 (and not portable) 2018-09-12 11:49:10 +02:00
Christoph Hertzberg
ff4e835d6b "sparse_product.cpp" must be included before "sparse_basic.cpp", otherwise EIGEN_SPARSE_CREATE_TEMPORARY_PLUGIN has no effect 2018-08-30 20:10:11 +02:00
Christoph Hertzberg
023ed6b9a8 Product of empty array must be 1 and not 0. 2018-08-30 17:14:52 +02:00
Christoph Hertzberg
c2f4e8c08e Fix integer conversion warning 2018-08-30 17:12:53 +02:00
Christoph Hertzberg
ddbc564386 Fixed a few more shadowing warnings when compiling with g++ (and c++03) 2018-08-30 16:33:03 +02:00
Deven Desai
946c3e2544 adding EIGEN_DEVICE_FUNC attribute to fix some GPU unit tests that are broken in HIP mode 2018-08-27 23:04:08 +00:00
Mehdi Goli
7ec8b40ad9 Collapsed revision
* Separating SYCL math function.
* Converting function overload to function specialisation.
* Applying the suggested design.
2018-08-28 14:20:48 +01:00
Christoph Hertzberg
20ba2eee6d gcc thinks this may not be initialized 2018-08-28 18:33:24 +02:00
Christoph Hertzberg
73ca600bca Fix numerous shadow-warnings for GCC<=4.8 2018-08-28 18:32:39 +02:00
Christoph Hertzberg
ef4d79fed8 Disable/ReenableStupidWarnings did not work properly, when included recursively 2018-08-28 18:26:22 +02:00
Gael Guennebaud
befaf83f5f bug #1590: fix collision with some system headers defining the macro FP32 2018-08-28 13:21:28 +02:00
Christoph Hertzberg
42f3ee4fb8 Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop
Workaround: Don't include "DisableStupidWarnings.h" before including other main-headers
2018-08-28 11:44:15 +02:00
Eugene Zhulenev
c144bb355b Merge with upstream eigen/default 2018-08-27 14:34:07 -07:00
Gael Guennebaud
5747288676 Disable a bonus unit-test which is broken with gcc 4.7 2018-08-27 13:07:34 +02:00
Gael Guennebaud
d5ed64512f bug #1573: workaround gcc 4.7 and 4.8 bug 2018-08-27 10:38:20 +02:00
Christoph Hertzberg
b1653d1599 Fix some trivial C++11 vs C++03 compatibility warnings 2018-08-25 12:21:00 +02:00
Christoph Hertzberg
42123ff38b Make unit test C++03 compatible 2018-08-25 11:53:28 +02:00
Christoph Hertzberg
4b1ad086b5 Fix shadow warnings in doc-snippets 2018-08-25 10:07:17 +02:00
Christoph Hertzberg
117bc5d505 Fix some shadow warnings 2018-08-25 09:06:08 +02:00
Christoph Hertzberg
f155e97adb Previous fix broke compilation for clang 2018-08-25 00:10:46 +02:00
Christoph Hertzberg
209b4972ec Fix conversion warning 2018-08-25 00:02:46 +02:00
Christoph Hertzberg
495f6c3c3a Fix missing-braces warnings 2018-08-24 23:56:13 +02:00
Christoph Hertzberg
5aaedbeced Fixed more sign-compare and type-limits warnings 2018-08-24 23:54:12 +02:00
Christoph Hertzberg
8295f02b36 Hide "maybe uninitialized" warning on gcc 2018-08-24 23:22:20 +02:00
Christoph Hertzberg
f7675b826b Fix several integer conversion and sign-compare warnings 2018-08-24 22:58:55 +02:00
Christoph Hertzberg
949b0ad9cb Merged in rmlarsen/eigen3 (pull request PR-468)
Add support for emulating thread local.
2018-08-24 17:29:03 +00:00
Rasmus Munk Larsen
744e2fe0de Address comments about EIGEN_THREAD_LOCAL. 2018-08-24 10:24:54 -07:00
Christoph Hertzberg
ad4a08fb68 Use Intel cast intrinsics, since MSVC does not allow direct casting.
Reported by David Winkler.
2018-08-24 19:04:33 +02:00
Rasmus Munk Larsen
8d9bc5cc02 Fix g++ compilation. 2018-08-23 13:06:39 -07:00
Rasmus Munk Larsen
e9f9d70611 Don't rely on __had_feature for g++.
Don't use __thread.
Only use thread_local for gcc 4.8 or newer.
2018-08-23 12:59:46 -07:00
Rasmus Munk Larsen
668690978f Pad PerThread when we emulate thread_local to prevent false sharing. 2018-08-23 12:54:33 -07:00
Rasmus Munk Larsen
6cedc5a9b3 rename mu. 2018-08-23 12:11:58 -07:00
Rasmus Munk Larsen
6e0464004a Store std::unique_ptr instead of raw pointers in per_thread_map_. 2018-08-23 12:10:08 -07:00
Rasmus Munk Larsen
e51d9e473a Protect #undef max with #ifdef max. 2018-08-23 11:42:05 -07:00
Rasmus Munk Larsen
d35880ed91 merge 2018-08-23 11:36:49 -07:00
Christoph Hertzberg
a709c8efb4 Replace pointers by values or unique_ptr for better leak-safety 2018-08-23 19:41:59 +02:00
Christoph Hertzberg
39335cf51e Make MaxSizeVector leak-safe 2018-08-23 19:37:56 +02:00
Benoit Steiner
ff8e0ecc2f Updated one more line of code to avoid making the test dependent on cxx11 features. 2018-08-17 15:15:52 -07:00
Benoit Steiner
43d9dd9b28 Removed more dependencies on cxx11. 2018-08-17 08:49:32 -07:00
Gael Guennebaud
f76c802973 Add missing empty line 2018-08-17 17:16:12 +02:00
Christoph Hertzberg
41f1cc67b8 Assertion depended on a not yet initialized value 2018-08-17 16:42:53 +02:00
Christoph Hertzberg
4713465eef Silence double-promotion warning 2018-08-17 16:39:43 +02:00
Christoph Hertzberg
595cae9b09 Silence logical-op-parentheses warning 2018-08-17 16:30:32 +02:00
Christoph Hertzberg
c9b25fbefa Silence unused parameter warning 2018-08-17 16:28:28 +02:00
Christoph Hertzberg
dbdeceabdd Silence double-promotion warning (when converting double to complex<long double>) 2018-08-17 16:26:11 +02:00
Benoit Steiner
19df4d5752 Merged in codeplaysoftware/eigen-upstream-pure/Pointer_type_creation (pull request PR-461)
Creating a pointer type in TensorCustomOp.h
2018-08-16 18:28:33 +00:00
Benoit Steiner
f641cf1253 Adding missing at method in Eigen::array 2018-08-16 11:24:37 -07:00
Benoit Steiner
ede580ccda Avoid using the auto keyword to make the tensor block access test more portable 2018-08-16 10:49:47 -07:00
Benoit Steiner
e23c8c294e Use actual types instead of the auto keyword to make the code more portable 2018-08-16 10:41:01 -07:00
Mehdi Goli
80f1a76dec removing the noises. 2018-08-16 13:33:24 +01:00
Mehdi Goli
d0b01ebbf6 Reverting the unitended delete from the code. 2018-08-16 13:21:36 +01:00
Mehdi Goli
161dcbae9b Using PointerType struct and specializing it per device for TensorCustomOp.h 2018-08-16 00:07:02 +01:00
Sameer Agarwal
f197c3f55b Removed an used variable (PacketSize) from TensorExecutor 2018-08-15 11:24:57 -07:00
Benoit Steiner
4181556907 Fixed the tensor contraction code. 2018-08-15 09:34:47 -07:00
Benoit Steiner
b6f96cf7dd Removed dependencies on cxx11 language features from the tensor_block_access test 2018-08-15 08:54:31 -07:00
Benoit Steiner
fbb834144d Fixed more compilation errors 2018-08-15 08:52:58 -07:00
Benoit Steiner
6bb3f1b43e Made the tensor_block_access test compile again 2018-08-14 14:26:59 -07:00
Benoit Steiner
43ec0082a6 Made the kronecker_product test compile again 2018-08-14 14:08:36 -07:00
Benoit Steiner
ab3f481141 Cleaned up the code and make it compile with more compilers 2018-08-14 14:05:46 -07:00
Rasmus Munk Larsen
fa0bcbf230 merge 2018-08-14 12:18:31 -07:00
Rasmus Munk Larsen
15d4f515e2 Use plain_assert in destructors to avoid throwing in CXX11 tests where main.h owerwrites eigen_assert with a throwing version. 2018-08-14 12:17:46 -07:00
Rasmus Munk Larsen
aebdb06424 Fix a few compiler warnings in CXX11 tests. 2018-08-14 12:06:39 -07:00
Rasmus Munk Larsen
2a98bd9c8e Merged eigen/eigen into default 2018-08-14 12:02:09 -07:00
Benoit Steiner
59bba77ead Fixed compilation errors with gcc 4.7 and 4.8 2018-08-14 10:54:48 -07:00
Mehdi Goli
a97aaa2bcf Merge with upstream. 2018-08-14 17:49:29 +01:00
Mehdi Goli
8ba799805b Merge with upstream 2018-08-14 09:43:45 +01:00
Rasmus Munk Larsen
6d6e7b7027 merge 2018-08-13 15:34:50 -07:00
Rasmus Munk Larsen
9bb75d8d31 Add Barrier.h. 2018-08-13 15:34:03 -07:00
Rasmus Munk Larsen
2e1adc0324 Merged eigen/eigen into default 2018-08-13 15:32:00 -07:00
Rasmus Munk Larsen
8278ae6313 Add support for thread local support on platforms that do not support it through emulation using a hash map. 2018-08-13 15:31:23 -07:00
Benoit Steiner
501be70b27 Code cleanup 2018-08-13 15:16:40 -07:00
Benoit Steiner
3d3711f22f Fixed compilation errors. 2018-08-13 15:16:06 -07:00
Gael Guennebaud
3ec60215df Merged in rmlarsen/eigen2 (pull request PR-466)
Move sigmoid functor to core and rename it to 'logistic'.
2018-08-13 21:28:20 +00:00
Rasmus Munk Larsen
0f1b2e08a5 Call logistic functor from Tensor::sigmoid. 2018-08-13 11:52:58 -07:00
Rasmus Munk Larsen
d6e283ba96 sigmoid -> logistic 2018-08-13 11:14:50 -07:00
Benoit Steiner
26239ee580 Use NULL instead of nullptr to avoid adding a cxx11 requirement. 2018-08-13 11:05:51 -07:00
Benoit Steiner
3810ec228f Don't use the auto keyword since it's not always supported properly. 2018-08-13 10:46:09 -07:00
Benoit Steiner
e6d5be811d Fixed syntax of nested templates chevrons to make it compatible with c++97 mode. 2018-08-13 10:29:21 -07:00
Mehdi Goli
1aa86aad14 Merge with upstream. 2018-08-13 15:40:31 +01:00
Eugene Zhulenev
35d90e8960 Fix BlockAccess enum in CwiseUnaryOp evaluator 2018-08-10 17:37:58 -07:00
Eugene Zhulenev
855b68896b Merge with eigen/default 2018-08-10 17:18:42 -07:00
Eugene Zhulenev
f2209d06e4 Add block evaluationto CwiseUnaryOp and add PreferBlockAccess enum to all evaluators 2018-08-10 16:53:36 -07:00
Benoit Steiner
c8ea398675 Avoided language features that are only available in cxx11 mode. 2018-08-10 13:02:41 -07:00
Benoit Steiner
4be4286224 Made the code compile with gcc 5.4. 2018-08-10 11:32:58 -07:00
Justin Carpentier
eabc7a4031 PR 465: Fix issue in RowMajor assignment in plain_matrix_type_row_major::type
The type should be RowMajor
2018-08-10 14:30:06 +02:00
Rasmus Munk Larsen
c49e93440f SuiteSparse defines the macro SuiteSparse_long to control what type is used for 64bit integers. The default value of this macro on non-MSVC platforms is long and __int64 on MSVC. CholmodSupport defaults to using long for the long variants of CHOLMOD functions. This creates problems when SuiteSparse_long is different than long. So the correct thing to do here is
to use SuiteSparse_long as the type instead of long.
2018-08-13 15:53:31 -07:00
Mehdi Goli
3a2e1b1fc6 Merge with upstream. 2018-08-10 12:28:38 +01:00
Rasmus Munk Larsen
bfc5091dd5 Cast to diagonalSize to RealScalar instead Scalar. 2018-08-09 14:46:17 -07:00
Rasmus Munk Larsen
8603d80029 Cast diagonalSize() to Scalar before multiplication. Without this, automatic differentiation in Ceres breaks because Scalar is a custom type that does not support multiplication by Index. 2018-08-09 11:09:10 -07:00
Eugene Zhulenev
cfaedb38cd Fix bug in a test + compilation errors 2018-08-09 09:44:07 -07:00
Mehdi Goli
ea8fa5e86f Merge with upstream 2018-08-09 14:07:56 +01:00
Mehdi Goli
8c083bfd0e Properly fixing the PointerType for TensorCustomOp.h. As the output type here should be based on CoeffreturnType not the Scalar type. Therefore, Similar to reduction and evalTo function, it should have its own MakePointer class. In this case, for other device the type is defaulted to CoeffReturnType and no changes is required on users' code. However, in SYCL, on the device, we can recunstruct the device Type. 2018-08-09 13:57:43 +01:00
Alexey Frunze
050bcf6126 bug #1584: Improve random (avoid undefined behavior). 2018-08-08 20:19:32 -07:00
Eugene Zhulenev
1c8b9e10a7 Merged with upstream eigen 2018-08-08 16:57:58 -07:00
Benoit Steiner
131ed1191f Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462)
Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.
2018-08-08 18:14:15 +00:00
Benoit Steiner
1285c080b3 Merged in codeplaysoftware/eigen-upstream-pure/disabling_assert_in_sycl (pull request PR-459)
Disabling assert inside SYCL kernel.
2018-08-08 18:12:42 +00:00
Benoit Steiner
c4b2845be9 Merged in rmlarsen/eigen3 (pull request PR-458)
Fix init order.
2018-08-08 18:11:49 +00:00
Benoit Steiner
7124172b83 Merged in codeplaysoftware/eigen-upstream-pure/EIGEN_UNROLL_LOOP (pull request PR-460)
Adding EIGEN_UNROLL_LOOP macro.
2018-08-08 18:10:54 +00:00
Mehdi Goli
532a0be05c Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation. 2018-08-08 12:12:26 +01:00
Mehdi Goli
67711eaa31 Fixing typo. 2018-08-08 11:38:10 +01:00
Mehdi Goli
3055e3a7c2 Creating a pointer type in TensorCustomOp.h 2018-08-08 11:19:02 +01:00
Mehdi Goli
22031ab59a Adding EIGEN_UNROLL_LOOP macro. 2018-08-08 11:07:27 +01:00
Mehdi Goli
908b906d79 Disabling assert inside SYCL kernel. 2018-08-08 10:01:10 +01:00
Rasmus Munk Larsen
693fb1d41e Fix init order. 2018-08-07 17:18:51 -07:00
Benoit Steiner
10d286f55b Silenced a couple of compilation warnings. 2018-08-06 16:00:29 -07:00
Benoit Steiner
d011d05fd6 Fixed compilation errors. 2018-08-06 13:40:51 -07:00
Rasmus Munk Larsen
36e7e7dd8f Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation. 2018-08-06 13:16:32 -07:00
Rasmus Munk Larsen
fa68342ef8 Move sigmoid functor to core. 2018-08-03 17:31:23 -07:00
Gael Guennebaud
09c81ac033 bug #1451: fix numeric_limits<AutoDiffScalar<Der>> with a reference as derivative type 2018-08-04 00:17:37 +02:00
luz.paz"
43fd42a33b Fix doxy and misc. typos
Found via `codespell -q 3 -I ../eigen-word-whitelist.txt`
---
 Eigen/src/Core/ProductEvaluators.h |  4 ++--
 Eigen/src/Core/arch/GPU/Half.h     |  2 +-
 Eigen/src/Core/util/Memory.h       |  2 +-
 Eigen/src/Geometry/Hyperplane.h    |  2 +-
 Eigen/src/Geometry/Transform.h     |  2 +-
 Eigen/src/Geometry/Translation.h   | 12 ++++++------
 doc/PreprocessorDirectives.dox     |  2 +-
 doc/TutorialGeometry.dox           |  2 +-
 test/boostmultiprec.cpp            |  2 +-
 test/triangular.cpp                |  2 +-
 10 files changed, 16 insertions(+), 16 deletions(-)
2018-08-01 21:34:47 -04:00
Jean-Christophe Fillion-Robin
2cbd9dd498 [PATCH] cmake: Support source include with add_subdirectory and
find_package use
This commit allows the sources of the project to be included in a parent
project CMakeLists.txt and support use of "find_package(Eigen3 CONFIG REQUIRED)"

Here is an example allowing to test the changes. It is not particularly
useful in itself. This change will allow to support one of the scenario
allowing to create custom 3D Slicer application bundling associated plugins.

/tmp/eigen-git-mirror  # Eigen sources

/tmp/test/CMakeLists.txt:

  cmake_minimum_required(VERSION 3.12)
  project(test)
  add_subdirectory("/tmp/eigen-git-mirror" "eigen-git-mirror")
  find_package(Eigen3 CONFIG REQUIRED)

and configuring it using:

  mkdir /tmp/test-build && cd $_
  cmake \
    -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY:BOOL=1 \
    -DEigen3_DIR:PATH=/tmp/test-build/eigen-git-mirror \
    /tmp/test

Co-authored-by: Pablo Hernandez <pablo.hernandez@kitware.com>
---
 CMakeLists.txt              | 1 +
 cmake/Eigen3Config.cmake.in | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)
2018-09-07 15:50:19 -04:00
Christoph Hertzberg
a80a290079 Fix 'template argument uses local type'-warnings (when compiled in C++03 mode) 2018-09-10 18:57:28 +02:00
Jiandong Ruan
6dcd2642aa bug #1526 - CUDA compilation fails on CUDA 9.x SDK when arch is set to compute_60 and/or above 2018-09-08 12:05:33 -07:00
Christoph Hertzberg
edfb7962fd Use static const int instead of enum to avoid numerous local-type-template-args warnings in C++03 mode 2018-09-07 14:08:39 +02:00
Alexey Frunze
ec38f07b79 bug #1595: Don't use C++11's std::isnan() in MIPS/MSA packet math.
This removes reliance on C++11 and improves generated code.
2018-09-06 15:40:09 -07:00
Eugene Zhulenev
1b0373ae10 Replace all using declarations with typedefs in Tensor ops 2018-08-01 15:55:46 -07:00
Rasmus Munk Larsen
7f8b53fd0e bug #1580: Fix cuda clang build. STL is not supported, so std::equal_to and std::not_equal breaks compilation.
Update the definition of EIGEN_CONSTEXPR_ARE_DEVICE_FUNC to exclude clang.
See also PR 450.
2018-08-01 12:36:24 -07:00
Rasmus Munk Larsen
bcb29f890c Fix initialization order. 2018-08-03 10:18:53 -07:00
Benoit Steiner
cf17794ef4 Merged in codeplaysoftware/eigen-upstream-pure/SYCL-required-changes (pull request PR-454)
SYCL required changes
2018-08-03 16:17:30 +00:00
Mehdi Goli
3074b1ff9e Fixing the compilation error. 2018-08-03 17:13:44 +01:00
Mehdi Goli
225fa112aa Merge with upstream. 2018-08-03 17:04:08 +01:00
Mehdi Goli
01358300d5 Creating separate SYCL required PR for uncontroversial files. 2018-08-03 16:59:15 +01:00
Gustavo Lima Chaves
2bf1cc8cf7 Fix 256 bit packet size assumptions in unit tests.
Like in change 2606abed53
, we have hit the threshould again. With
AVX512 builds we would never have Vector8f packets aligned at 64
bytes (the new value of EIGEN_MAX_ALIGN_BYTES after change 405859f18d
,
for AVX512-enabled builds).

This makes test/dynalloc.cpp pass for those builds.
2018-08-02 15:55:36 -07:00
Benoit Steiner
dd5875e30d Merged in codeplaysoftware/eigen-upstream-pure/constructor_error_clang (pull request PR-451)
Fixing ambigous constructor error for Clang compiler.
2018-08-02 20:46:03 +00:00
Benoit Steiner
113d8343d6 Merged in codeplaysoftware/eigen-upstream-pure/Fixing_visual_studio_error_For_tensor_trace (pull request PR-452)
Fixing compilation error for cxx11_tensor_trace.cpp on Microsoft Visual Studio.
2018-08-02 17:54:26 +00:00
Mehdi Goli
516d2621b9 fixing compilation error for cxx11_tensor_trace.cpp error on Microsoft Visual Studio. 2018-08-02 14:30:48 +01:00
Mehdi Goli
40d6d020a0 Fixing ambigous constructor error for Clang compiler. 2018-08-02 13:34:53 +01:00
Gael Guennebaud
62169419ab Fix two regressions introduced in previous merges: bad usage of EIGEN_HAS_VARIADIC_TEMPLATES and linking issue. 2018-08-01 23:35:34 +02:00
Eugene Zhulenev
64abdf1d7e Fix typo + get rid of redundant member variables for block sizes 2018-08-01 12:35:19 -07:00
Benoit Steiner
93b9e36e10 Merged in paultucker/eigen (pull request PR-431)
Optional ThreadPoolDevice allocator

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-08-01 19:14:34 +00:00
Eugene Zhulenev
385b3ff12f Merged latest changes from upstream/eigen 2018-08-01 11:59:04 -07:00
Benoit Steiner
17221115c9 Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447)
Adding variadic version of assert which can take a parameter pack as its input.
2018-08-01 16:41:54 +00:00
Benoit Steiner
0360c36170 Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446)
Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.
2018-08-01 16:13:15 +00:00
Mehdi Goli
c6a5c70712 Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h 2018-08-01 16:56:26 +01:00
Benoit Steiner
9ca1c09131 Merged in codeplaysoftware/eigen-upstream-pure/new-arch-SYCL-headers (pull request PR-448)
Adding new arch/SYCL headers, used for SYCL vectorization.
2018-08-01 15:50:54 +00:00
Benoit Steiner
45f75f1ace Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449)
Enabling per device specialisation of packetSize.
2018-08-01 15:43:03 +00:00
Benoit Steiner
90e632fd66 Merged in codeplaysoftware/eigen-upstream-pure/EIGEN_STRONG_INLINE_MACRO (pull request PR-445)
Replacing ad-hoc inline keyword with EIGEN_STRONG_INLINE MACRO.
2018-08-01 15:41:06 +00:00
Mehdi Goli
af96018b49 Using the suggested modification. 2018-08-01 16:04:44 +01:00
Mehdi Goli
b512a9536f Enabling per device specialisation of packetsize. 2018-08-01 13:39:13 +01:00
Mehdi Goli
c84509d7cc Adding new arch/SYCL headers, used for SYCL vectorization. 2018-08-01 12:40:54 +01:00
Mehdi Goli
3a197a60e6 variadic version of assert which can take a parameter pack as its input. 2018-08-01 12:19:14 +01:00
Mehdi Goli
d7a8414848 Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation. 2018-08-01 11:56:30 +01:00
Mehdi Goli
9e219bb3d3 Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. 2018-08-01 10:47:49 +01:00
Eugene Zhulenev
83c0a16baf Add block evaluation support to TensorOps 2018-07-31 15:56:31 -07:00
Benoit Steiner
edf46bd7a2 Merged in yuefengz/eigen (pull request PR-370)
Use device's allocate function instead of internal::aligned_malloc.
2018-07-31 22:38:28 +00:00
Paul Tucker
385f7b8d0c Change getAllocator() to allocator() in ThreadPoolDevice. 2018-07-31 13:52:18 -07:00
Mark D Ryan
6f5b126e6d Fix tensor contraction for AVX512 machines
This patch modifies the TensorContraction class to ensure that the kc_ field is
always a multiple of the packet_size, if the packet_size is > 8.  Without this
change spatial convolutions in Tensorflow do not work properly as the code that
re-arranges the input matrices can assert if kc_ is not a multiple of the
packet_size.  This leads to a unit test failure,
//tensorflow/python/kernel_tests:conv_ops_test, on AVX512 builds of tensorflow.
2018-07-31 09:33:37 +01:00
Gael Guennebaud
d6568425f8 Close branch tiling_3. 2018-07-31 08:13:43 +00:00
Gael Guennebaud
678a0dcb12 Merged in ezhulenev/eigen/tiling_3 (pull request PR-438)
Tiled tensor executor
2018-07-31 08:13:00 +00:00
Gael Guennebaud
679eece876 Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437. 2018-07-31 10:10:14 +02:00
Gael Guennebaud
723856dec1 bug #1577: fix msvc compilation of unit test, msvc defines ptrdiff_t as long long 2018-07-30 14:52:15 +02:00
Eugene Zhulenev
966c2a7bb6 Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible 2018-07-27 12:45:17 -07:00
Eugene Zhulenev
6913221c43 Add tiled evaluation support to TensorExecutor 2018-07-25 13:51:10 -07:00
Alexey Frunze
7b91c11207 bug #1578: Improve prefetching in matrix multiplication on MIPS. 2018-07-24 18:36:44 -07:00
Patrik Huber
f5cace5e9f Fix two small typos in the documentation 2018-07-26 19:55:19 +00:00
Gael Guennebaud
34539c4af4 Merged in rmlarsen/eigen1 (pull request PR-441)
Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.
2018-07-30 11:26:24 +00:00
Mark D Ryan
bc615e4585 Re-enable FMA for fast sqrt functions 2018-07-30 13:21:00 +02:00
Mark D Ryan
96b030a8e4 Re-enable FMA for fast sqrt functions
This commit re-enables the use of FMA for the FAST sqrt functions.
Doing so improves the performance of both algorithms.  The float32
version is now 88% the speed of the original function, while the
double version is 90%.
2018-07-30 10:19:51 +01:00
Rasmus Munk Larsen
e478532625 Reduce the number of template specializations of classes related to tensor contraction to reduce binary size. 2018-07-27 12:36:34 -07:00
Rasmus Munk Larsen
2ebcb911b2 Add pcast packet op for NEON. 2018-07-26 14:28:48 -07:00
Christoph Hertzberg
397b0547e1 DIsable static assertions only when necessary and disable double-promotion warnings in that case as well 2018-07-26 00:01:24 +02:00
Christoph Hertzberg
5e79402b4a fix warnings for doc-eigen-prerequisites 2018-07-24 21:59:15 +02:00
Christoph Hertzberg
5f79b7f9a9 Removed several shadowing types and use global Index typedef everywhere 2018-07-25 21:47:45 +02:00
Christoph Hertzberg
44ee201337 Rename variable which shadows class name 2018-07-25 20:26:15 +02:00
Gustavo Lima Chaves
705f66a9ca Account for missing change on commit "Remove SimpleThreadPool and..."
"... always use {NonBlocking}ThreadPool". It seems the non-blocking
implementation was me the default/only one, but a reference to the old
name was left unmodified. Fix that.
2018-07-23 16:29:09 -07:00
Christoph Hertzberg
fd4fe7cbc5 Fixed issue which made documentation not getting built anymore 2018-07-24 22:56:15 +02:00
Christoph Hertzberg
636126ef40 Allow to filter out build-error messages 2018-07-24 20:12:49 +02:00
Eugene Zhulenev
d55efa6f0f TensorBlockIO 2018-07-23 15:50:55 -07:00
Eugene Zhulenev
34a75c3c5c Initial support of TensorBlock 2018-07-20 17:37:20 -07:00
Gael Guennebaud
2c2de9da7d Merged in glchaves/eigen (pull request PR-433)
Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded  block
2018-07-23 19:38:55 +00:00
Gael Guennebaud
4ca3e48f42 fix typo 2018-07-23 16:51:57 +02:00
Gael Guennebaud
c747cde69a Add lastN shorcuts to seq/seqN. 2018-07-23 16:20:25 +02:00
Gustavo Lima Chaves
02eaaacbc5 Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded
block

Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail
right away without this, as this test seems to rely on those language
features. The skip under compilation with MSVC was kept.
2018-07-20 16:08:40 -07:00
Eugene Zhulenev
2bf864f1eb Disable type traits for stdlibc++ <= 4.9.3 2018-07-20 10:11:44 -07:00
Gael Guennebaud
de70671937 Oopps, EIGEN_COMP_MSVC is not available before including Eigen. 2018-07-20 17:51:17 +02:00
Gael Guennebaud
56a750b6cc Disable optimization for sparse_product unit test with MSVC 2013, otherwise it takes several hours to build. 2018-07-20 08:36:38 -07:00
Paul Tucker
d4afccde5a Add test coverage for ThreadPoolDevice optional allocator. 2018-07-19 17:43:44 -07:00
Eugene Zhulenev
c58b874727 PR430: Convert count to the reducer type in MeanReducer
Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails.

cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this)
  ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_'
(type 'const DenseIndex {aka const long int}')
to type 'const type& {aka const Eigen::half&}'
     return pdiv(vaccum, pset1<Packet>(packetCount_));
Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade.

static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.
2018-07-19 17:37:03 -07:00
Gael Guennebaud
2424e3b7ac Pass by const ref. 2018-07-19 18:48:19 +02:00
Gael Guennebaud
509a5fa77f Fix IsRelocatable without C++11 2018-07-19 18:47:38 +02:00
Gael Guennebaud
2ca2592009 Fix determination of EIGEN_HAS_TYPE_TRAITS 2018-07-19 18:47:18 +02:00
Gael Guennebaud
5e5987996f Fix stupid error in Quaternion move ctor 2018-07-19 18:33:53 +02:00
Paul Tucker
4e9848fa86 Actually add optional Allocator* arg to ThreadPoolDevice(). 2018-07-16 17:53:36 -07:00
Paul Tucker
b3e7c9132d Add optional Allocator argument to ThreadPoolDevice constructor.
When supplied, this allocator will be used in place of
internal::aligned_malloc.  This permits e.g. use of a NUMA-node specific
allocator where the thread-pool is also restricted a single NUMA-node.
2018-07-16 17:26:05 -07:00
Gael Guennebaud
40797dbea3 bug #1572: use c++11 atomic instead of volatile if c++11 is available, and disable multi-threaded GEMM on non-x86 without c++11. 2018-07-17 00:11:20 +02:00
Gael Guennebaud
add5757488 Simplify handling and non-splitted tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folder. 2018-07-16 18:55:40 +02:00
Gael Guennebaud
901c7d31f0 Fix usage of EIGEN_SPLIT_LARGE_TESTS=ON: some unit tests, such as indexed_view have to be split unconditionally. 2018-07-16 18:35:05 +02:00
Gael Guennebaud
f2b52f9946 Add the cmake option "EIGEN_DASHBOARD_BUILD_TARGET" to control the build target in dashboard mode (e.g., ctest -D Experimental) 2018-07-16 17:59:30 +02:00
Gael Guennebaud
23d82c1ac5 Merged in rmlarsen/eigen2 (pull request PR-422)
Optimize the case where broadcasting is a no-op.
2018-07-14 11:42:58 +00:00
Gael Guennebaud
a87cff20df Fix GeneralizedEigenSolver when requesting for eigenvalues only. 2018-07-14 09:38:49 +02:00
Rasmus Munk Larsen
3a9cf4e290 Get rid of alias for m_broadcast. 2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen
4222550e17 Optimize the case where broadcasting is a no-op. 2018-07-13 16:12:38 -07:00
Rasmus Munk Larsen
4a3952fd55 Relax the condition to not only work on Android. 2018-07-13 11:24:07 -07:00
Rasmus Munk Larsen
02a9443db9 Clang produces incorrect Thumb2 assembler when using alloca.
Don't define EIGEN_ALLOCA when generating Thumb with clang.
2018-07-13 11:03:04 -07:00
Gael Guennebaud
20991c3203 bug #1571: fix is_convertible<from,to> with "from" a reference. 2018-07-13 17:47:28 +02:00
Gael Guennebaud
1920129d71 Remove clang warning 2018-07-13 16:05:35 +02:00
Gael Guennebaud
195c9c054b Print more debug info in gpu_basic 2018-07-13 16:05:07 +02:00
Gael Guennebaud
06eb24cf4d Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda. 2018-07-13 16:04:27 +02:00
Gael Guennebaud
5fd03ddbfb Make EIGEN_TEST_CUDA_CLANG more friendly with OSX 2018-07-13 16:03:14 +02:00
Gael Guennebaud
86d9c0255c Forward declaring std::array does not work with all std libs, so let's just include <array> 2018-07-13 13:06:44 +02:00
David Hyde
d908afe35f bug #1558: fix a corner case in MINRES when both v_new and w_new vanish. 2018-07-08 22:06:38 -07:00
Eugene Zhulenev
6e654f3379 Reduce number of allocations in TensorContractionThreadPool. 2018-07-16 14:26:39 -07:00
Gael Guennebaud
7ccb623746 bug #1569: fix Tensor<half>::mean() on AVX with respective unit test. 2018-07-19 13:15:40 +02:00
Alexey Frunze
1f523e7304 Add MIPS changes missing from previous merge. 2018-07-18 12:27:50 -07:00
Eugene Zhulenev
e3c2d61739 Assert that no output kernel is defined for GPU contraction 2018-07-18 14:34:22 -07:00
Eugene Zhulenev
086ded5c85 Disable type traits for GCC < 5.1.0 2018-07-18 16:32:55 -07:00
Eugene Zhulenev
79d4129cce Specify default output kernel for TensorContractionOp 2018-07-18 14:21:01 -07:00
Gael Guennebaud
6e5a3b898f Add regression for bugs #1573 and #1575 2018-07-18 23:34:34 +02:00
Gael Guennebaud
863580fe88 bug #1432: fix conservativeResize for non-relocatable scalar types. For those we need to by-pass realloc routines and fall-back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable 2018-07-18 23:33:07 +02:00
Gael Guennebaud
053ed97c72 Generalize ScalarWithExceptions to a full non-copyable and trowing scalar type to be used in other unit tests. 2018-07-18 23:27:37 +02:00
Gael Guennebaud
a503fc8725 bug #1575: fix regression introduced in bug #1573 patch. Move ctor/assignment should not be defaulted. 2018-07-18 23:26:13 +02:00
Gael Guennebaud
308725c3c9 More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA 2018-07-18 13:51:36 +02:00
Alexey Frunze
3875fb05aa Add support for MIPS SIMD (MSA) 2018-07-06 16:04:30 -07:00
Gael Guennebaud
44ea5f7623 Add unit test for -Tensor<complex> on GPU 2018-07-12 17:19:38 +02:00
Gael Guennebaud
12e1ebb68b Remove local Index typedef from unit-tests 2018-07-12 17:16:40 +02:00
Gael Guennebaud
63185be8b2 Disable eigenvalues test for clang-cuda 2018-07-12 17:03:14 +02:00
Gael Guennebaud
bec013b2c9 fix unused warning 2018-07-12 17:02:18 +02:00
Gael Guennebaud
5c73c9223a Fix shadowing typedefs 2018-07-12 17:01:07 +02:00
Gael Guennebaud
98728312c8 Fix compilation regarding std::array 2018-07-12 17:00:37 +02:00
Gael Guennebaud
eb3d8f68bb fix unused warning 2018-07-12 16:59:47 +02:00
Gael Guennebaud
006e18e52b Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff at more appropriate places (Macros.h),
and alignment/vectorization logic is now in util/ConfigureVectorization.h
2018-07-12 16:57:41 +02:00
Thales Sabino
9a6a43319f Fix cxx11_tensor_fft not building on Windows.
The type used in Eigen::DSizes needs to be at least 8 bytes long. Internally Tensor tries to convert this to an __int64 on Windows and this fails to build. On Linux, long and long long are both 8 byte integer types.
* * *
Changing from "long long" to "std::int64_t".
2018-07-12 11:20:59 +01:00
Gael Guennebaud
b347eb0b1c Fix doc 2018-07-12 11:56:18 +02:00
Mark D Ryan
e79c5149bf Fix AVX512 implementations of psqrt
This commit fixes the AVX512 implementations of psqrt in the same
way that 3ed67cb0bb
 fixed the AVX2 version of this function.  The
AVX512 versions of psqrt incorrectly return -0.0 for negative
values, instead of NaN.  Fixing the issues requires adding
some additional instructions that slow down the algorithms.  A
similar test to the one used in 3ed67cb0bb
 shows that the
corrected Packet16f code runs at 73% of the speed of the existing code,
while the corrected Packed8d function runs at 68% of the original.
2018-06-25 05:05:02 -07:00
Yuefeng Zhou
1eff6cf8a7 Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances. 2018-02-20 16:50:05 -08:00
Gael Guennebaud
adb134d47e Fix implicit conversion from 0.0 to scalar 2018-02-16 22:26:01 +04:00
Gael Guennebaud
937ad18221 add unit test for SimplicialCholesky and Boost multiprec. 2018-02-16 22:25:11 +04:00
Julian Kent
6d451cf2b6 Add missing consts for rows and cols functions in SparseLU 2018-02-10 13:44:05 +01:00
Daniele E. Domenichelli
a12b8a8c75 FindEigen3: Set Eigen3_FOUND variable 2018-07-11 16:31:50 +02:00
Gael Guennebaud
8bdb214fd0 remove double ;; 2018-07-12 11:17:53 +02:00
Gael Guennebaud
a9060378d3 bug #1570: fix warning 2018-07-12 11:07:09 +02:00
Gael Guennebaud
6cd6551b26 Add deprecated header files for TensorFlow 2018-07-12 10:50:53 +02:00
Gael Guennebaud
da0c604078 Merged in deven-amd/eigen (pull request PR-402)
Adding support for using Eigen in HIP kernels.
2018-07-12 08:07:16 +00:00
Gael Guennebaud
a4ea611ca7 Remove useless specialization thanks to is_convertible being more robust. 2018-07-12 09:59:44 +02:00
Gael Guennebaud
8a40dda5a6 Add some basic unit-tests 2018-07-12 09:59:00 +02:00
Gael Guennebaud
8ef267ccbd spellcheck 2018-07-12 09:58:29 +02:00
Gael Guennebaud
21cf4a1a8b Make is_convertible more robust and conformant to std::is_convertible 2018-07-12 09:57:19 +02:00
Gael Guennebaud
8a5955a052 Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product. 2018-07-11 17:16:50 +02:00
Gael Guennebaud
d193cc87f4 Fix regression in 9357838f94 2018-07-11 17:09:23 +02:00
Gael Guennebaud
fb33687736 Fix double ;; 2018-07-11 17:08:30 +02:00
Deven Desai
876f392c39 Updates corresponding to the latest round of PR feedback
The major changes are

1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
    The above three changes effectively enable the Eigen "Packet" layer for the HIP platform

4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places

The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
1fe0b74904 deleting hip specific files that are no longer required 2018-07-11 09:28:44 -04:00
Deven Desai
dec47a6493 renaming CUDA* to GPU* for some header files 2018-07-11 09:26:54 -04:00
Deven Desai
471cfe5ff7 renaming CUDA* to GPU* for some header files 2018-07-11 09:22:04 -04:00
Deven Desai
38807a2575 merging updates from upstream 2018-07-11 09:17:33 -04:00
Gael Guennebaud
f00d08cc0a Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix. 2018-07-11 14:01:47 +02:00
Gael Guennebaud
1625476091 Add internall::is_identity compile-time helper 2018-07-11 14:00:24 +02:00
Gael Guennebaud
fe723d6129 Fix conversion warning 2018-07-10 09:10:32 +02:00
Gael Guennebaud
9357838f94 bug #1543: improve linear indexing for general block expressions 2018-07-10 09:10:15 +02:00
Gael Guennebaud
de9e31a06d Introduce the macro ei_declare_local_nested_eval to help allocating on the stack local temporaries via alloca, and let outer-products makes a good use of it.
If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.
2018-07-09 15:41:14 +02:00
Gael Guennebaud
6190aa5632 bug #1567: add optimized path for tensor broadcasting and 'Channel First' shape 2018-07-09 11:23:16 +02:00
Gael Guennebaud
ec323b7e66 Skip null numerators in triangular-vector-solve (as in BLAS TRSV). 2018-07-09 11:13:19 +02:00
Gael Guennebaud
359dd77ec3 Fix legitimate "declaration shadows a typedef" warning 2018-07-09 11:03:39 +02:00
Deven Desai
e2b2c61533 merging from master 2018-06-20 16:47:45 -04:00
Deven Desai
1bb6fa99a3 merging the CUDA and HIP implementation for the Tensor directory and the unit tests 2018-06-20 16:44:58 -04:00
Deven Desai
cfdabbcc8f removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories 2018-06-20 12:57:02 -04:00
Deven Desai
7e41c8f1a9 renaming *Cuda files to *Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories 2018-06-20 12:52:30 -04:00
Deven Desai
ee73ae0a80 Merged eigen/eigen into default 2018-06-20 12:37:11 -04:00
Mark D Ryan
90a53ca6fd Fix the Packet16h version of ptranspose
The AVX512 version of ptranpose for PacketBlock<Packet16h,16> was
reordering the PacketBlock argument incorrectly.  This lead to errors in
the multiplication of matrices composed of 16 bit floats on AVX512
machines, if at least of the matrices was using RowMajor order.  This
error is responsible for one tensorflow unit test failure on AVX512
machines:

//tensorflow/python/kernel_tests:batch_matmul_op_test
2018-06-16 15:13:06 -07:00
Gael Guennebaud
1f54164eca Fix a few issues with Packet16h 2018-07-07 00:15:07 +02:00
Gael Guennebaud
f2dc048df9 complete implementation of Packet16h (AVX512) 2018-07-06 17:43:11 +02:00
Gael Guennebaud
a937c50208 palign is not used anymore, so let's relax the unit test 2018-07-06 17:41:52 +02:00
Gael Guennebaud
56a33ae57d test product kernel with half-floats. 2018-07-06 17:14:04 +02:00
Gael Guennebaud
f4d623ffa7 Complete Packet8h implementation and test it in packetmath unit test 2018-07-06 17:13:36 +02:00
Gael Guennebaud
a8ab6060df Add unitests for inverse and selfadjoint-eigenvalues on CUDA 2018-07-06 09:58:45 +02:00
Gael Guennebaud
b8271bb368 fix md5sum of lapack_addons 2018-06-15 14:21:29 +02:00
Deven Desai
b6cc0961b1 updates based on PR feedback
There are two major changes (and a few minor ones which are not listed here...see PR discussion for details)

1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`

After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.

2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
Deven Desai
ba972fb6b4 moving Half headers from CUDA dir to GPU dir, removing the HIP versions 2018-06-13 12:26:18 -04:00
Deven Desai
d1d22ef0f4 syncing this fork with upstream 2018-06-13 12:09:52 -04:00
Benoit Steiner
d3a380af4d Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403)
Derivative of the incomplete Gamma function and the sample of a Gamma random variable

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-11 17:57:47 +00:00
Andrea Bocci
f7124b3e46 Extend CUDA support to matrix inversion and selfadjointeigensolver 2018-06-11 18:33:24 +02:00
Gael Guennebaud
0537123953 bug #1565: help MSVC to generatenot too bad ASM in reductions. 2018-07-05 09:21:26 +02:00
Gael Guennebaud
6a241bd8ee Implement custom inplace triangular product to avoid a temporary 2018-07-03 14:02:46 +02:00
Gael Guennebaud
3ae2083e23 Make is_same_dense compatible with different scalar types. 2018-07-03 13:21:43 +02:00
Gael Guennebaud
67ec37f7b0 Activate dgmres unit test 2018-07-02 12:54:14 +02:00
Gael Guennebaud
047677a08d Fix regression in changeset f05dea6b23
: computeFromHessenberg can take any expression for matrixQ, not only an HouseholderSequence.
2018-07-02 12:18:25 +02:00
Gael Guennebaud
d625564936 Simplify redux_evaluator using inheritance, and properly rename parameters in reducers. 2018-07-02 11:50:41 +02:00
Gael Guennebaud
d428a199ab bug #1562: optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x... 2018-07-02 11:41:09 +02:00
Gael Guennebaud
a7b313a16c Fix unit test 2018-07-01 22:45:47 +02:00
Gael Guennebaud
0cdacf3fa4 update comment 2018-06-29 11:28:36 +02:00
Gael Guennebaud
54f6eeda90 Merged in net147/eigen (pull request PR-411)
Use std::complex constructor instead of assignment from scalar
2018-06-28 21:01:04 +00:00
Gael Guennebaud
9a81de1d35 Fix order of EIGEN_DEVICE_FUNC and returned type 2018-06-28 00:20:59 +02:00
Jonathan Liu
b7689bded9 Use std::complex constructor instead of assignment from scalar
Fixes GCC conversion to non-scalar type requested compile error when
using boost::multiprecision::cpp_dec_float_50 as scalar type.
2018-06-28 00:32:37 +10:00
Gael Guennebaud
f9d337780d First step towards a generic vectorised quaternion product 2018-06-25 14:26:51 +02:00
Gael Guennebaud
ee5864f72e bug #1560 fix product with a 1x1 diagonal matrix 2018-06-25 10:30:12 +02:00
Rasmus Munk Larsen
2f62cc68cd merge 2018-06-22 15:09:44 -07:00
Rasmus Munk Larsen
bda71ad394 Fix typo in pbend for AltiVec. 2018-06-22 15:04:35 -07:00
Benoit Steiner
b6ffcd22e3 Merged in rmlarsen/eigen2 (pull request PR-409)
Fix oversharding bug in parallelFor.
2018-06-21 18:34:57 +00:00
Gael Guennebaud
4cc32d80fd bug #1555: compilation fix with XLC 2018-06-21 10:28:38 +02:00
Rasmus Munk Larsen
5418154a45 Fix oversharding bug in parallelFor. 2018-06-20 17:51:48 -07:00
Gael Guennebaud
cb4c9a6a94 bug #1531: make dedicatd unit testing for NumDimensions 2018-06-08 17:11:45 +02:00
Gael Guennebaud
d6813fb1c5 bug #1531: expose NumDimensions for solve and sparse expressions. 2018-06-08 16:55:10 +02:00
Gael Guennebaud
89d65bb9d6 bug #1531: expose NumDimensions for compatibility with Tensor 2018-06-08 16:50:17 +02:00
Gael Guennebaud
f05dea6b23 bug #1550: prevent avoidable memory allocation in RealSchur 2018-06-08 10:14:57 +02:00
Gael Guennebaud
7933267c67 fix prototype 2018-06-08 09:56:01 +02:00
Gael Guennebaud
f4d1461874 Fix the way matrix folder is passed to the tests. 2018-06-08 09:55:46 +02:00
Benoit Steiner
522d3ca54d Don't use std::equal_to inside cuda kernels since it's not supported. 2018-06-07 13:02:07 -07:00
Christoph Hertzberg
7d7bb91537 Missing line during manual rebase of PR-374 2018-06-07 20:30:09 +02:00
Michael Figurnov
30fa3d0454 Merge from eigen/eigen 2018-06-07 17:57:56 +01:00
Benoit Steiner
d2b0a4a59b Merged in mfigurnov/eigen/fix-bessel (pull request PR-404)
Fix compilation of special functions without C99 math.
2018-06-07 16:12:42 +00:00
Michael Figurnov
6c71c7d360 Merge from eigen/eigen. 2018-06-07 15:54:18 +01:00
Gael Guennebaud
c25034710e Fiw some warnings in dox examples 2018-06-07 16:09:22 +02:00
Gael Guennebaud
37348d03ae Fix int versus Index 2018-06-07 15:56:43 +02:00
Gael Guennebaud
c723ffd763 Fix warning 2018-06-07 15:56:20 +02:00
Gael Guennebaud
af7c83b9a2 Fix warning 2018-06-07 15:45:24 +02:00
Gael Guennebaud
7fe29aceeb Fix MSVC warning C4290: C++ exception specification ignored except to indicate a function is not __declspec(nothrow) 2018-06-07 15:36:20 +02:00
Michael Figurnov
aa813d417b Fix compilation of special functions without C99 math.
The commit with Bessel functions i0e and i1e placed the ifdef/endif incorrectly,
causing i0e/i1e to be undefined when EIGEN_HAS_C99_MATH=0. These functions do not
actually require C99 math, so now they are always available.
2018-06-07 14:35:07 +01:00
Gael Guennebaud
55774b48e4 Fix short vs long 2018-06-07 15:26:25 +02:00
Christoph Hertzberg
e5f9f4768f Avoid unnecessary C++11 dependency 2018-06-07 15:03:50 +02:00
Gael Guennebaud
b3fd93207b Fix typos found using codespell 2018-06-07 14:43:02 +02:00
Michael Figurnov
5172a32849 Updated the stopping criteria in igammac_cf_impl.
Previously, when computing the derivative, it used a relative error threshold. Now it uses an absolute error threshold. The behavior for computing the value is unchanged. This makes more sense since we do not expect the derivative to often be close to zero. This change makes the derivatives about 30% faster across the board. The error for the igamma_der_a is almost unchanged, while for gamma_sample_der_alpha it is a bit worse for float32 and unchanged for float64.
2018-06-07 12:03:58 +01:00
Michael Figurnov
4bd158fa37 Derivative of the incomplete Gamma function and the sample of a Gamma random variable.
In addition to igamma(a, x), this code implements:
* igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter
* gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter

The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.
2018-06-06 18:49:26 +01:00
Deven Desai
8fbd47052b Adding support for using Eigen in HIP kernels.
This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.

Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor)


Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
2018-06-06 10:12:58 -04:00
Benoit Steiner
e206f8d4a4 Merged in mfigurnov/eigen (pull request PR-400)
Exponentially scaled modified Bessel functions of order zero and one.

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-05 17:05:21 +00:00
Penporn Koanantakool
e2ed0cf8ab Add a ThreadPoolInterface* getter for ThreadPoolDevice. 2018-06-02 12:07:49 -07:00
Gael Guennebaud
84868da904 Don't run hg on non mercurial clone 2018-05-31 21:21:57 +02:00
Michael Figurnov
f216854453 Exponentially scaled modified Bessel functions of order zero and one.
The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x)

The code is ported from Cephes and tested against SciPy.
2018-05-31 15:34:53 +01:00
Gael Guennebaud
6af1433cb5 Doc: add aliasing in common pitfaffs. 2018-05-29 22:37:47 +02:00
Katrin Leinweber
ea94543190 Hyperlink DOIs against preferred resolver 2018-05-24 18:55:40 +02:00
Gael Guennebaud
999b552c16 Search for sequential Pastix. 2018-05-29 20:49:25 +02:00
Gael Guennebaud
eef4b7bd87 Fix handling of path names containing spaces and the likes. 2018-05-29 20:49:06 +02:00
Gael Guennebaud
647b724a36 Define pcast<> for SSE types even when AVX is enabled. (otherwise float are silently reinterpreted as int instead of being converted) 2018-05-29 20:46:46 +02:00
Gael Guennebaud
49262dfee6 Fix compilation and SSE support with PGI compiler 2018-05-29 15:09:31 +02:00
Christoph Hertzberg
750af06362 Add an option to test with external BLAS library 2018-05-22 21:04:32 +02:00
Christoph Hertzberg
d06a753d10 Make qr_fullpivoting unit test run for fixed-sized matrices 2018-05-22 20:29:17 +02:00
Gael Guennebaud
f0862b062f Fix internal::is_integral<size_t/ptrdiff_t> with MSVC 2013 and older. 2018-05-22 19:29:51 +02:00
Gael Guennebaud
36e413a534 Workaround a MSVC 2013 compilation issue with MatrixBase(Index,int) 2018-05-22 18:51:35 +02:00
Gael Guennebaud
725bd92903 fix stupid typo 2018-05-18 17:46:43 +02:00
Gael Guennebaud
a382bc9364 is_convertible<T,Index> does not seems to work well with MSVC 2013, so let's rather use __is_enum(T) for old MSVC versions 2018-05-18 17:02:27 +02:00
Gael Guennebaud
4dd767f455 add some internal checks 2018-05-18 13:59:55 +02:00
Gael Guennebaud
345c0ab450 check that all integer types are properly handled by mat(i,j) 2018-05-18 13:46:46 +02:00
Mark D Ryan
405859f18d Set EIGEN_IDEAL_MAX_ALIGN_BYTES correctly for AVX512 builds
bug #1548

The macro EIGEN_IDEAL_MAX_ALIGN_BYTES is being incorrectly set to 32
on AVX512 builds.  It should be set to 64.  In the current code it is
only set to 64 if the macro EIGEN_VECTORIZE_AVX512 is defined.  This
macro does get defined in AVX512 builds in Core, but only after Macros.h,
the file that defines EIGEN_IDEAL_MAX_ALIGN_BYTES, has been included.
This commit fixes the issue by setting EIGEN_IDEAL_MAX_ALIGN_BYTES to
64 if __AVX512F__ is defined.
2018-05-17 17:04:00 +01:00
Vamsi Sripathi
6293ad3f39 Performance improvements to tensor broadcast operation
1. Added new packet functions using SIMD for NByOne, OneByN cases
  2. Modified existing packet functions to reduce index calculations when input stride is non-SIMD
  3. Added 4 test cases to cover the new packet functions
2018-05-23 14:02:05 -07:00
Gael Guennebaud
7134fa7a2e Fix compilation with MSVC by reverting to char* for _mm_prefetch except for PGI (the later being the one that has the wrong prototype). 2018-06-07 09:33:10 +02:00
Jeff Trull
e7147f69ae Add tests for sparseQR results (value and size) covering bugs #1522 and #1544 2018-04-21 10:26:30 -07:00
Robert Lukierski
b2053990d0 Adding EIGEN_DEVICE_FUNC to Products, especially Dense2Dense Assignment
specializations. Otherwise causes problems with small fixed size matrix multiplication (call to
0x00 in call_assignment_no_alias in debug mode or trap in release with CUDA 9.1).
2018-03-14 16:19:43 +00:00
Jeff Trull
9f0c5c3669 Make sparse QR result sizes consistent with dense QR, with the following rules:
1) Q is always square
2) Q*R*P' is valid and recovers the original matrix

This implies that the size of Q is the number of rows in the original matrix, square,
and that the size of R is the size of the original matrix.
2018-02-15 15:00:31 -08:00
Christoph Hertzberg
d655900953 bug #1544: Generate correct Q matrix in complex case. Original patch was by Jeff Trull in PR-386. 2018-05-17 19:17:01 +02:00
Benoit Steiner
0371380d5b Merged in rmlarsen/eigen2 (pull request PR-393)
Rename scalar_clip_op to scalar_clamp_op to prevent collision with existing functor in TensorFlow.
2018-05-16 21:45:42 +00:00
Rasmus Munk Larsen
b8d36774fa Rename clip2 to clamp. 2018-05-16 14:04:48 -07:00
Rasmus Munk Larsen
812480baa3 Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow. 2018-05-16 09:49:24 -07:00
Benoit Steiner
1403c2c15b Merged in didierjansen/eigen (pull request PR-360)
Fix bugs and typos in the contraction example of the tensor README
2018-05-16 01:16:36 +00:00
Benoit Steiner
ad355b3f05 Merged in rmlarsen/eigen2 (pull request PR-392)
Add vectorized clip functor for Eigen Tensors
2018-05-16 01:15:56 +00:00
Christoph Hertzberg
0272f2451a Fix "suggest parentheses around comparison" warning 2018-05-15 19:35:53 +02:00
Rasmus Munk Larsen
afec3021f7 Use numext::maxi & numext::mini. 2018-05-14 16:35:39 -07:00
Rasmus Munk Larsen
b8c8e5f436 Add vectorized clip functor for Eigen Tensors. 2018-05-14 16:07:13 -07:00
Benoit Steiner
6118c6ff4f Enable RawAccess to tensor slices whenever possinle.
Avoid 32-bit integer overflow in TensorSlicingOp
2018-04-30 11:28:12 -07:00
Gael Guennebaud
6e7118265d Fix compilation with NEON+MSVC 2018-04-26 10:50:41 +02:00
Gael Guennebaud
097dd4616d Fix unit test for SIMD engine not supporting sqrt 2018-04-26 10:47:39 +02:00
Gael Guennebaud
8810baaed4 Add multi-threading for sparse-row-major * dense-row-major 2018-04-25 10:14:48 +02:00
Gael Guennebaud
2f3287da7d Fix "used uninitialized" warnings 2018-04-24 17:17:25 +02:00
Gael Guennebaud
3ffd449ef5 Workaround warning 2018-04-24 17:11:51 +02:00
Gael Guennebaud
e8ca5166a9 bug #1428: atempt to make NEON vectorization compilable by MSVC.
The workaround is to wrap NEON packet types to make them different c++ types.
2018-04-24 11:19:49 +02:00
Benoit Steiner
6f5935421a fix AVX512 plog 2018-04-23 15:49:26 +00:00
Gael Guennebaud
e9da464e20 Add specializations of is_arithmetic for long long in c++11 2018-04-23 16:26:29 +02:00
Gael Guennebaud
a57e6e5f0f workaround MSVC 2013 compilation issue (ambiguous call) 2018-04-23 15:31:51 +02:00
Gael Guennebaud
11123175db typo in doc 2018-04-23 15:30:35 +02:00
Gael Guennebaud
5679e439e0 bug #1543: fix linear indexing in generic block evaluation (this completes the fix in commit 12efc7d41b
)
2018-04-23 14:40:16 +02:00
Gael Guennebaud
35b31353ab Fix unit test 2018-04-22 22:49:08 +02:00
Christoph Hertzberg
34e499ad36 Disable -Wshadow when compiling with g++ 2018-04-21 22:08:26 +02:00
Jayaram Bobba
b7b868d1c4 fix AVX512 plog 2018-04-20 13:39:18 -07:00
Gael Guennebaud
686fb57233 fix const cast in NEON 2018-04-18 18:46:34 +02:00
Dmitriy Korchemkin
02d2f1cb4a Cast zeros to Scalar in RealSchur 2018-04-18 13:52:46 +03:00
Christoph Hertzberg
50633d1a83 Renamed .trans() et al. to .reverseFlag() et at. Adapted documentation of .setReverseFlag() 2018-04-17 11:30:27 +02:00
nicolov
39c2cba810 Add a specialization of Eigen::numext::conj for std::complex<T> to be used when compiling a cuda kernel. This fixes the compilation of TensorFlow 1.4 with clang 6.0 used as CUDA compiler with libc++.
This follows the previous change in 2a69290ddb
, which mentions OSX (I guess because it uses libc++ too).
2018-04-13 22:29:10 +00:00
Christoph Hertzberg
775766d175 Add parenthesis to fix compiler warnings 2018-04-15 18:43:56 +02:00
Christoph Hertzberg
42715533f1 bug #1493: Make representation of HouseholderSequence consistent and working for complex numbers. Made corresponding unit test actually test that. Also simplify implementation of QR decompositions 2018-04-15 10:15:28 +02:00
Christoph Hertzberg
c9ecfff2e6 Add links where to make PRs and report bugs into README.md 2018-04-13 21:05:28 +00:00
Christoph Hertzberg
c8b19702bc Limit test size for sparse Cholesky solvers to EIGEN_TEST_MAX_SIZE 2018-04-13 20:36:58 +02:00
Christoph Hertzberg
2cbb00b18e No need to make noise, if KLU is found 2018-04-13 19:14:25 +02:00
Christoph Hertzberg
84dcd998a9 Recent Adolc versions require C++11 2018-04-13 19:10:23 +02:00
Christoph Hertzberg
4d392d93aa Make hypot_impl compile again for types with expression-templates (e.g., boost::multiprecision) 2018-04-13 19:01:37 +02:00
Christoph Hertzberg
072e111ec0 SelfAdjointView<...,Mode> causes a static assert since commit d820ab9edc 2018-04-13 19:00:34 +02:00
Gael Guennebaud
7a9089c33c fix linking issue 2018-04-13 08:51:47 +02:00
Gael Guennebaud
e43ca0320d bug #1520: workaround some -Wfloat-equal warnings by calling std::equal_to 2018-04-11 15:24:13 +02:00
Weiming Zhao
b0eda3cb9f Avoid using memcpy for non-POD elements 2018-04-11 11:37:06 +02:00
Gael Guennebaud
79266fec75 extend doxygen splitter for huge screens 2018-04-11 11:31:17 +02:00
Gael Guennebaud
426052ef6e Update header/footer for doxygen 1.8.13 2018-04-11 11:30:34 +02:00
Gael Guennebaud
9c8decffbf Fix javascript hacks for oxygen 1.8.13 2018-04-11 11:30:14 +02:00
Gael Guennebaud
e798466871 bug #1538: update manual pages regarding BDCSVD. 2018-04-11 10:46:11 +02:00
Gael Guennebaud
c91906b065 Umfpack: UF_long has been removed in recent versions of suitesparse, and fix a few long-to-int conversions issues. 2018-04-11 09:59:59 +02:00
Gael Guennebaud
0050709ea7 Merged in v_huber/eigen (pull request PR-378)
Add interface to umfpack_*l_* functions
2018-04-11 07:43:04 +00:00
Guillaume Jacob
8c1652055a Fix code sample output in block(int, int, int, int) doxygen 2018-04-09 17:23:59 +02:00
vhuber
08008f67e1 Add unitTest 2018-04-09 17:07:46 +02:00
Gael Guennebaud
add15924ac Fix MKL backend for symmetric eigenvalues on row-major matrices. 2018-04-09 13:29:26 +02:00
Gael Guennebaud
04b1628e55 Add missing empty line. 2018-04-09 13:28:31 +02:00
Gael Guennebaud
c2624c0318 Fix cmake scripts with no fortran compiler 2018-04-07 08:45:19 +02:00
Gael Guennebaud
2f833b1c64 bug #1509: fix computeInverseWithCheck for complexes 2018-04-04 15:47:46 +02:00
Gael Guennebaud
b903fa74fd Extend list of MSVC versions 2018-04-04 15:14:09 +02:00
Gael Guennebaud
403f09ccef Make stableNorm and blueNorm compatible with 2D matrices. 2018-04-04 15:13:31 +02:00
Gael Guennebaud
4213b63f5c Factories code between numext::hypot and scalar_hyot_op functor. 2018-04-04 15:12:43 +02:00
Gael Guennebaud
368dd4cd9d Make innerVector() and innerVectors() methods available to all expressions supported by Block.
Before, only SparseBase exposed such methods.
2018-04-04 15:09:21 +02:00
Gael Guennebaud
e116f6847e bug #1521: avoid signalling NaN in hypot and make it std::complex<> friendly. 2018-04-04 13:47:23 +02:00
Gael Guennebaud
73729025a4 bug #1521: add unit test dedicated to numbest::hypos 2018-04-04 13:45:34 +02:00
Gael Guennebaud
13f5df9f67 Add a note on vec_min vs asm 2018-04-04 13:10:38 +02:00
Gael Guennebaud
e91e314347 bug #1494: makes pmin/pmax behave on Altivec/VSX as on x86 regading NaNs 2018-04-04 11:39:19 +02:00
Gael Guennebaud
112c899304 comment unreachable code 2018-04-03 23:16:43 +02:00
Gael Guennebaud
a1292395d6 Fix compilation of product with inverse transpositions (e.g., mat * Transpositions().inverse()) 2018-04-03 23:06:44 +02:00
Gael Guennebaud
8c7b5158a1 commit 45e9c9996da790b55ed9c4b0dfeae49492ac5c46 (HEAD -> memory_fix)
Author: George Burgess IV <gbiv@google.com>
Date:   Thu Mar 1 11:20:24 2018 -0800

    Prefer `::operator new` to `new`

    The C++ standard allows compilers much flexibility with `new`
    expressions, including eliding them entirely
    (https://godbolt.org/g/yS6i91). However, calls to `operator new` are
    required to be treated like opaque function calls.

    Since we're calling `new` for side-effects other than allocating heap
    memory, we should prefer the less flexible version.

    Signed-off-by: George Burgess IV <gbiv@google.com>
2018-04-03 17:15:38 +02:00
Gael Guennebaud
dd4cc6bd9e bug #1527: fix support for MKL's VML (destination was not properly resized) 2018-04-03 17:11:15 +02:00
Gael Guennebaud
c5b56f1fb2 bug #1528: better use numeric_limits::min() instead of 1/highest() that with underflow. 2018-04-03 16:49:35 +02:00
Gael Guennebaud
8d0ffe3655 bug #1516: add assertion for out-of-range diagonal index in MatrixBase::diagonal(i) 2018-04-03 16:15:43 +02:00
Gael Guennebaud
407e3e2621 bug #1532: disable stl::*_negate in C++17 (they are deprecated) 2018-04-03 15:59:30 +02:00
Gael Guennebaud
40b4bf3d32 AVX512: _mm512_rsqrt28_ps is available for AVX512ER only 2018-04-03 14:36:27 +02:00
Gael Guennebaud
584951ca4d Rename predux_downto4 to be more accurate on its semantic. 2018-04-03 14:28:38 +02:00
Gael Guennebaud
67bac6368c protect calls to isnan 2018-04-03 14:19:04 +02:00
Gael Guennebaud
d43b2f01f4 Fix unit testing of predux_downto4 (bad name), and add unit testing of prsqrt 2018-04-03 14:14:00 +02:00
Gael Guennebaud
7b0630315f AVX512: fix psqrt and prsqrt 2018-04-03 14:12:50 +02:00
Gael Guennebaud
6719409cd9 AVX512: add missing pinsertfirst and pinsertlast, implement pblend for Packet8d, fix compilation without AVX512DQ 2018-04-03 14:11:56 +02:00
Gael Guennebaud
524119d32a Fix uninitialized output argument. 2018-04-03 10:56:10 +02:00
vhuber
267a144da5 Remove unnecessary define 2018-03-30 23:04:53 +02:00
vhuber
baf9a5a776 Add interface to umfpack_*l_* functions 2018-03-30 18:53:34 +02:00
luz.paz
e3912f5e63 MIsc. source and comment typos
Found using `codespell` and `grep` from downstream FreeCAD
2018-03-11 10:01:44 -04:00
Gael Guennebaud
5deeb19e7b bug #1517: fix triangular product with unit diagonal and nested scaling factor: (s*A).triangularView<UpperUnit>()*B 2018-02-09 16:52:35 +01:00
Gael Guennebaud
12efc7d41b Fix linear indexing in generic block evaluation. 2018-02-09 16:45:49 +01:00
Gael Guennebaud
f4a6863c75 Fix typo 2018-02-09 16:43:49 +01:00
Viktor Csomor
000840cae0 Added a move constructor and move assignment operator to Tensor and wrote some tests. 2018-02-07 19:10:54 +01:00
Gael Guennebaud
3a2dc3869e Fix weird issue with MSVC 2013 2018-07-18 02:26:43 -07:00
Eugene Zhulenev
c95aacab90 Fix TensorContractionOp evaluators for GPU and SYCL 2018-07-17 14:09:37 -07:00
Gael Guennebaud
038b55464b Merged in deven-amd/eigen (pull request PR-425)
applying EIGEN_DECLARE_TEST to *gpu  unit tests
2018-07-17 21:14:40 +00:00
Deven Desai
f124f07965 applying EIGEN_DECLARE_TEST to *gpu* tests
Also, a few minor fixes for GPU tests running in HIP mode.

1. Adding an include for hip/hip_runtime.h in the Macros.h file
   For HIP __host__ and __device__ are macros which are defined in hip headers.
   Their definitions need to be included before their use in the file.

2. Fixing the compile failure in TensorContractionGpu introduced by the commit to
   "Fuse computations into the Tensor contractions using output kernel"

3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit
2018-07-17 14:16:48 -04:00
Gael Guennebaud
dff3a92d52 Remove usage of #if EIGEN_TEST_PART_XX in unit tests that does not require them (splitting can thus be avoided for them) 2018-07-17 15:52:58 +02:00
Gael Guennebaud
82f0ce2726 Get rid of EIGEN_TEST_FUNC, unit tests must now be declared with EIGEN_DECLARE_TEST(mytest) { /* code */ }.
This provide several advantages:
- more flexibility in designing unit tests
- unit tests can be glued to speed up compilation
- unit tests are compiled with same predefined macros, which is a requirement for zapcc
2018-07-17 14:46:15 +02:00
Gael Guennebaud
37f4bdd97d Fix VERIFY_EVALUATION_COUNT(EXPR,N) with a complex expression as N 2018-07-17 13:20:49 +02:00
Gael Guennebaud
2b2cd85694 bug #1573: add noexcept move constructor and move assignment operator to Quaternion 2018-07-17 11:11:33 +02:00
Eugene Zhulenev
43206ac4de Call OutputKernel in evalGemv 2018-07-12 14:52:23 -07:00
Eugene Zhulenev
e204ecdaaf Remove SimpleThreadPool and always use {NonBlocking}ThreadPool 2018-07-16 15:06:57 -07:00
Eugene Zhulenev
b324ed55d9 Call OutputKernel in evalGemv 2018-07-12 14:52:23 -07:00
Eugene Zhulenev
01fd4096d3 Fuse computations into the Tensor contractions using output kernel 2018-07-10 13:16:38 -07:00
Gael Guennebaud
5539587b1f Some warning fixes 2018-07-17 10:29:12 +02:00
Benoit Steiner
8f55956a57 Update the padding computation for PADDING_SAME to be consistent with TensorFlow. 2018-01-30 20:22:12 +00:00
Gael Guennebaud
09a16ba42f bug #1412: fix compilation with nvcc+MSVC 2018-01-17 23:13:16 +01:00
Lee.Deokjae
5b3c367926 Fix typos in the contraction example of tensor README 2018-01-06 14:36:19 +09:00
Eugene Chereshnev
f558ad2955 Fix incorrect ldvt in LAPACKE call from JacobiSVD 2018-01-03 12:55:52 -08:00
Benoit Steiner
22de74aa76 Disable use of recurrence for computing twiddle factors. 2018-01-09 18:32:52 +00:00
Gael Guennebaud
73629f8b68 Fix gcc7 warning 2018-01-09 08:59:27 +01:00
RJ Ryan
59985cfd26 Disable use of recurrence for computing twiddle factors. Fixes FFT precision issues for large FFTs. https://github.com/tensorflow/tensorflow/issues/10749#issuecomment-354557689 2017-12-31 10:44:56 -05:00
nluehr
f9bdcea022 For cuda 9.1 replace math_functions.hpp with cuda_runtime.h 2017-12-18 16:51:15 -08:00
Gael Guennebaud
06bf1047f9 Fix compilation of stableNorm with some expressions as input 2017-12-15 15:15:37 +01:00
Gael Guennebaud
73214c4bd0 Workaround nvcc 9.0 issue. See PR 351.
https://bitbucket.org/eigen/eigen/pull-requests/351
2017-12-15 14:10:59 +01:00
Gael Guennebaud
31e0bda2e3 Fix cmake warning 2017-12-14 15:48:27 +01:00
Gael Guennebaud
26a2c6fc16 fix unit test 2017-12-14 15:11:04 +01:00
Gael Guennebaud
546ab97d76 Add possibility to overwrite EIGEN_STRONG_INLINE. 2017-12-14 14:47:38 +01:00
Gael Guennebaud
9c3aed9d48 Fix packet and alignment propagation logic of Block<Xpr> expressions. In particular, (A+B).col(j) lost vectorisation. 2017-12-14 14:24:33 +01:00
Gael Guennebaud
76c7dae600 ignore all *build* sub directories 2017-12-14 14:22:14 +01:00
Gael Guennebaud
b2cacd189e fix header inclusion 2017-12-14 10:01:02 +01:00
Yangzihao Wang
3122477c86 Update the padding computation for PADDING_SAME to be consistent with TensorFlow. 2017-12-12 11:15:24 -08:00
Benoit Steiner
393b7c4959 Merged in ncluehr/eigen/float2half-fix (pull request PR-349)
Replace __float2half_rn with __float2half
2017-12-01 00:29:51 +00:00
nluehr
aefd5fd5c4 Replace __float2half_rn with __float2half
The latter provides a consistent definition for CUDA 8.0 and 9.0.
2017-11-28 10:15:46 -08:00
Gael Guennebaud
d0b028e173 clarify Pastix requirements 2017-11-27 22:11:57 +01:00
Gael Guennebaud
3587e481fb silent MSVC warning 2017-11-27 21:53:02 +01:00
Benoit Steiner
3a327cd3c7 Merged in ncluehr/eigen/predux_fp16_fix (pull request PR-348)
Fix incorrect integer cast in half2 predux.
2017-11-21 21:11:45 +00:00
nluehr
dd6de618c3 Fix incorrect integer cast in predux<half2>().
Bug corrupts results on Maxwell and earlier GPU architectures.
2017-11-21 10:47:00 -08:00
Gael Guennebaud
3dc6ff73ca Handle PGI compiler 2017-11-17 22:54:39 +01:00
Zvi Rackover
599a88da27 Disable gcc-specific workaround for Clang to allow build with AVX512
There is currently a workaround for an issue in gcc that requires invoking gcc with the -fabi-version flag. This workaround is not needed for Clang and moreover is not supported.
2017-11-16 19:53:38 +00:00
Gael Guennebaud
672bdc126b bug #1479: fix failure detection in LDLT 2017-11-16 17:55:24 +01:00
Basil Fierz
624df50945 Adds missing EIGEN_STRONG_INLINE to support MSVC properly inlining small vector calculations
When working with MSVC often small vector operations are not properly inlined. This behaviour is observed even on the most recent compiler versions.
2017-10-26 22:44:28 +02:00
Benoit Steiner
746a6b7b81 Merged in zzp11/eigen/zzp11/a-small-mistake-quickreferencedox-edited-1510217281963 (pull request PR-346)
a small mistake QuickReference.dox edited online with Bitbucket
2018-03-23 01:02:34 +00:00
Benoit Steiner
d2631ef61d Merged in facaiy/eigen/ENH/exp_support_complex_for_gpu (pull request PR-359)
ENH: exp supports complex type for cuda
2018-03-23 00:59:15 +00:00
Benoit Steiner
8fcbd6d4c9 Merged in dtrebbien/eigen (pull request PR-369)
Move up the specialization of std::numeric_limits
2018-03-23 00:54:58 +00:00
Rasmus Munk Larsen
e900b010c8 Improve robustness of igamma and igammac to bad inputs.
Check for nan inputs and propagate them immediately. Limit the number of internal iterations to 2000 (same number as used by scipy.special.gammainc). This prevents an infinite loop when the function is called with nan or very large arguments.

Original change by mfirgunov@google.com
2018-03-19 09:04:54 -07:00
Gael Guennebaud
f7d17689a5 Add static assertion for fixed sizes Ref<> 2018-03-09 10:11:13 +01:00
Gael Guennebaud
f6be7289d7 Implement better static assertion checking to make sure that the first assertion is a static one and not a runtime one. 2018-03-09 10:00:51 +01:00
Gael Guennebaud
d820ab9edc Add static assertion on selfadjoint-view's UpLo parameter. 2018-03-09 09:33:43 +01:00
Daniel Trebbien
0c57be407d Move up the specialization of std::numeric_limits
This fixes a compilation error seen when building TensorFlow on macOS:
https://github.com/tensorflow/tensorflow/issues/17067
2018-02-18 15:35:45 -08:00
Yan Facai (颜发才)
42a8334668 ENH: exp supports complex type for cuda 2018-01-04 16:01:01 +08:00
zhouzhaoping
912e9965ef a small mistake QuickReference.dox edited online with Bitbucket 2017-11-09 08:49:01 +00:00
Gael Guennebaud
4c03b3511e Fix issue with boost::multiprec in previous commit 2017-11-08 23:28:01 +01:00
Gael Guennebaud
e9d2888e74 Improve debugging tests and output in BDCSVD 2017-11-08 10:26:03 +01:00
Gael Guennebaud
e8468ea91b Fix overflow issues in BDCSVD 2017-11-08 10:24:28 +01:00
Benoit Steiner
3949615176 Merged in JonasMu/eigen (pull request PR-329)
Added an example for a contraction to a scalar value to README.md

Approved-by: Jonas Harsch <jonas.harsch@gmail.com>
2017-10-27 07:27:46 +00:00
Christoph Hertzberg
11ddac57e5 Merged in guillaume_michel/eigen (pull request PR-334)
- Add support for NEON plog PacketMath function
2017-10-23 13:22:22 +00:00
Benoit Steiner
a6d875bac8 Removed unecesasry #include 2017-10-22 08:12:45 -07:00
Benoit Steiner
f16ba2a630 Merged in LaFeuille/eigen-1/LaFeuille/typo-fix-alignmeent-alignment-1505889397887 (pull request PR-335)
Typo fix alignmeent ->alignment
2017-10-21 01:59:55 +00:00
Benoit Steiner
ee6ad21b25 Merged in henryiii/eigen/henryiii/device (pull request PR-343)
Fixing missing inlines on device functions for newer CUDA cards
2017-10-21 01:58:22 +00:00
Henry Schreiner
9bb26eb8f1 Restore __device__ 2017-10-21 00:50:38 +00:00
Henry Schreiner
4245475d22 Fixing missing inlines on device functions for newer CUDA cards 2017-10-20 03:20:13 +00:00
Benoit Steiner
8eb4b9d254 Merged in benoitsteiner/opencl (pull request PR-341) 2017-10-17 16:39:28 +00:00
Rasmus Munk Larsen
2dd63ed395 Merge 2017-10-13 15:58:52 -07:00
Rasmus Munk Larsen
f349507e02 Specialize ThreadPoolDevice::enqueueNotification for the case with no args. As an example this reduces binary size of an TensorFlow demo app for Android by about 2.5%. 2017-10-13 15:58:12 -07:00
Benoit Steiner
688451409d Merged in mehdi_goli/upstr_benoit/ComputeCppNewReleaseFix (pull request PR-16)
Changes required for new ComputeCpp CE version.
2017-10-13 20:56:01 +00:00
Konstantinos Margaritis
0e6e027e91 check both z13 and z14 arches 2017-10-12 15:38:34 -04:00
Konstantinos Margaritis
6c3475f110 remove debugging 2017-10-12 15:34:55 -04:00
Konstantinos Margaritis
df7644aec3 Merged eigen/eigen into default 2017-10-12 22:23:13 +03:00
Konstantinos Margaritis
98e52cc770 rollback 374f750ad4 2017-10-12 15:22:10 -04:00
Konstantinos Margaritis
c4ad358565 explicitly set conjugate mask 2017-10-11 11:05:29 -04:00
Konstantinos Margaritis
380d41fd76 added some extra debugging 2017-10-11 10:40:12 -04:00
Konstantinos Margaritis
d0b7b9d0d3 some Packet2cf pmul fixes 2017-10-11 10:17:22 -04:00
Konstantinos Margaritis
df173f5620 initial pexp() for 32-bit floats, commented out due to vec_cts() 2017-10-11 09:40:49 -04:00
Konstantinos Margaritis
3dcae2a27f initial pexp() for 32-bit floats, commented out due to vec_cts() 2017-10-11 09:40:45 -04:00
Konstantinos Margaritis
c2a2246489 fix predux_mul for z14/float 2017-10-10 13:38:32 -04:00
Konstantinos Margaritis
374f750ad4 eliminate 'enumeral and non-enumeral type in conditional expression' warning 2017-10-09 16:56:30 -04:00
Konstantinos Margaritis
bc30305d29 complete z14 port 2017-10-09 16:55:10 -04:00
Gael Guennebaud
0e85a677e3 bug #1472: fix warning 2017-09-26 10:53:33 +02:00
Gael Guennebaud
8579195169 bug #1468 (1/2) : add missing std:: to memcpy 2017-09-22 09:23:24 +02:00
Gael Guennebaud
f92567fecc Add link to a useful example. 2017-09-20 10:22:23 +02:00
Gael Guennebaud
7ad07fc6f2 Update documentation for aligned_allocator 2017-09-20 10:22:00 +02:00
LaFeuille
7c9b07dc5c Typo fix alignmeent ->alignment 2017-09-20 06:38:39 +00:00
Mehdi Goli
2062ac9958 Changes required for new ComputeCpp CE version. 2017-09-18 18:17:39 +01:00
Christoph Hertzberg
23f8b00bc8 clang provides __has_feature(is_enum) (but not <type_traits>) in C++03 mode 2017-09-14 19:26:03 +02:00
Christoph Hertzberg
0c9ad2f525 std::integral_constant is not C++03 compatible 2017-09-14 19:23:38 +02:00
Rasmus Munk Larsen
1b7294f6fc Fix cut-and-paste error. 2017-09-08 16:35:58 -07:00
Rasmus Munk Larsen
94e2213b38 Avoid undefined behavior in Eigen::TensorCostModel::numThreads.
If the cost is large enough then the thread count can be larger than the maximum
representable int, so just casting it to an int is undefined behavior.

Contributed by phurst@google.com.
2017-09-08 15:49:55 -07:00
Gael Guennebaud
6d42309f13 Fix compilation of Vector::operator()(enum) by treating enums as Index 2017-09-07 14:34:30 +02:00
Benoit Steiner
ea4e65bf41 Fixed compilation with cuda_clang. 2017-09-07 09:13:52 +00:00
Gael Guennebaud
a91918a105 Merged in infinitei/eigen (pull request PR-328)
bug #1464 : Fixes construction of EulerAngles from 3D vector expression.

Approved-by: Tal Hadad <tal_hd@hotmail.com>
Approved-by: Abhijit Kundu <abhijit.kundu@gatech.edu>
2017-09-06 08:42:14 +00:00
Gael Guennebaud
9c353dd145 Add C++11 max_digits10 for half. 2017-09-06 10:22:47 +02:00
Gael Guennebaud
b35d1ce4a5 Implement true compile-time "if" for apply_rotation_in_the_plane. This fixes a compilation issue for vectorized real type with missing vectorization for complexes, e.g. AVX512. 2017-09-06 10:02:49 +02:00
Gael Guennebaud
80142362ac Fix mixing types in sparse matrix products. 2017-09-02 22:50:20 +02:00
Jonas Harsch
810b70ad09 Merged in JonasMu/added-an-example-for-a-contraction-to-a--1504265366851 (pull request PR-1)
Added an example for a contraction to a scalar value
2017-09-01 12:01:39 +00:00
Jonas Harsch
a34fb212cd Close branch JonasMu/added-an-example-for-a-contraction-to-a--1504265366851 2017-09-01 12:01:39 +00:00
Jonas Harsch
a991c80365 Added an example for a contraction to a scalar value, e.g. a double contraction of two second order tensors and how you can get the value of the result. I lost one day to get this doen so I think it will help some guys. I also added Eigen:: to the IndexPair and and array in the same example. 2017-09-01 11:30:26 +00:00
Benoit Steiner
a4089991eb Added support for CUDA 9.0. 2017-08-31 02:49:39 +00:00
Abhijit Kundu
6d991a9595 bug #1464 : Fixes construction of EulerAngles from 3D vector expression. 2017-08-30 13:26:30 -04:00
Gael Guennebaud
304ef29571 Handle min/max/inf/etc issue in cuda_fp16.h directly in test/main.h 2017-08-24 11:26:41 +02:00
Konstantinos Margaritis
1affe3d8df Merged eigen/eigen into default 2017-08-24 12:24:01 +03:00
Gael Guennebaud
21633e585b bug #1462: remove all occurences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER 2017-08-24 11:06:47 +02:00
Gael Guennebaud
12249849b5 Make the threshold from gemm to coeff-based-product configurable, and add some explanations. 2017-08-24 10:43:21 +02:00
Gael Guennebaud
39864ebe1e bug #336: improve doc for PlainObjectBase::Map 2017-08-22 17:18:43 +02:00
Gael Guennebaud
600e52fc7f Add missing scalar conversion 2017-08-22 17:06:57 +02:00
Gael Guennebaud
9deee79922 bug #1457: add setUnit() methods for consistency. 2017-08-22 16:48:07 +02:00
Gael Guennebaud
bc4dae9aeb bug #1449: fix redux_3 unit test 2017-08-22 15:59:08 +02:00
Gael Guennebaud
bc91a2df8b bug #1461: fix compilation of Map<const Quaternion>::x() 2017-08-22 15:10:42 +02:00
Gael Guennebaud
fc39d5954b Merged in dtrebbien/eigen/patch-1 (pull request PR-312)
Work around a compilation error seen with nvcc V8.0.61
2017-08-22 12:17:37 +00:00
Gael Guennebaud
b223918ea9 Doc: warn about constness in LLT::solveInPlace 2017-08-22 14:12:47 +02:00
Konstantinos Margaritis
4ce5ec5197 initial support for z14 2017-08-07 05:54:29 -04:00
Konstantinos Margaritis
e1e71ca4e4 initial support for z14 2017-08-06 19:53:18 -04:00
Benoit Steiner
84d7be103a Fixing Argmax that was breaking upstream TensorFlow. 2017-07-22 03:19:34 +00:00
Benoit Steiner
f0b154a4b0 Code cleanup 2017-07-10 09:54:09 -07:00
Benoit Steiner
575cda76b3 Fixed syntax errors generated by xcode 2017-07-09 11:39:01 -07:00
Benoit Steiner
5ac27d5b51 Avoid relying on cxx11 features when possible. 2017-07-08 21:58:44 -07:00
Benoit Steiner
c5a241ab9b Merged in benoitsteiner/opencl (pull request PR-323)
Improved support for OpenCL
2017-07-07 16:27:33 +00:00
Benoit Steiner
b7ae4dd9ef Merged in hughperkins/eigen/add-endif-labels-TensorReductionCuda.h (pull request PR-315)
Add labels to #ifdef, in TensorReductionCuda.h
2017-07-07 04:23:52 +00:00
Benoit Steiner
9daed67952 Merged in tntnatbry/eigen (pull request PR-319)
Tensor Trace op
2017-07-07 04:18:03 +00:00
Benoit Steiner
6795512e59 Improved the randomness of the tensor random generator 2017-07-06 21:12:45 -07:00
Benoit Steiner
dc524ac716 Fixed compilation warning 2017-07-06 21:11:15 -07:00
Benoit Steiner
62b4634ebe Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14)
Applying Benoit's comment for Fixing ImageVolumePatch.

* Applying Benoit's comment for Fixing ImageVolumePatch. Fixing conflict on cmake file.

* Fixing dealocation of the memory in ImagePatch test for SYCL.

* Fixing the automerge issue.
2017-07-06 05:08:13 +00:00
Benoit Steiner
c92faf9d84 Merged in mehdi_goli/upstr_benoit/HiperbolicOP (pull request PR-13)
Adding hyperbolic operations for sycl.

* Adding hyperbolic operations.

* Adding the hyperbolic operations for CPU as well.
2017-07-06 05:05:57 +00:00
Benoit Steiner
53725c10b8 Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10)
DataDependancy

* Wrapping data type to the pointer class for sycl in non-terminal nodes; not having that breaks Tensorflow Conv2d code.

* Applying Ronnan's Comments.

* Applying benoit's comments
2017-06-28 17:55:23 +00:00
Gael Guennebaud
c010b17360 Fix warning 2017-06-27 14:29:57 +02:00
Gael Guennebaud
561f777075 Fix a gcc7 warning about bool * bool in abs2 default implementation. 2017-06-27 12:05:17 +02:00
Gael Guennebaud
b651ce0ffa Fix a gcc7 warning: Wint-in-bool-context 2017-06-26 09:58:28 +02:00
Christoph Hertzberg
157040d44f Make sure CMAKE_Fortran_COMPILER is set before checking for Fortran functions 2017-06-20 16:58:03 +02:00
Gael Guennebaud
24fe1de9b4 merge 2017-06-15 10:17:39 +02:00
Gael Guennebaud
b240080e64 bug #1436: fix compilation of Jacobi rotations with ARM NEON, some specializations of internal::conj_helper were missing. 2017-06-15 10:16:30 +02:00
Benoit Steiner
3baef62b9a Added missing __device__ qualifier 2017-06-13 12:56:55 -07:00
Benoit Steiner
449936828c Added missing __device__ qualifier 2017-06-13 12:54:57 -07:00
Benoit Steiner
b8e805497e Merged in benoitsteiner/opencl (pull request PR-318)
Improved support for OpenCL
2017-06-13 05:01:10 +00:00
Gael Guennebaud
9fbdf02059 Enable Array(EigenBase<>) ctor for compatible scalar types only. This prevents nested arrays to look as being convertible from/to simple arrays. 2017-06-12 22:30:32 +02:00
Gael Guennebaud
e43d8fe9d7 Fix compilation of streaming nested Array, i.e., cout << Array<Array<>> 2017-06-12 22:26:26 +02:00
Gael Guennebaud
d9d7bd6d62 Fix 1x1 case in Solve expression with EIGEN_DEFAULT_MATRIX_STORAGE_ORDER_OPTION==RowMajor 2017-06-12 22:25:02 +02:00
Androbin42
95ecb2b5d6 Make buildtests.in more robust 2017-06-12 17:11:06 +00:00
Androbin42
3f7fb5a6d6 Make eigen_monitor_perf.sh more robust 2017-06-12 17:07:56 +00:00
Gael Guennebaud
7f42a93349 Merged in alainvaucher/eigen/find-module-imported-target (pull request PR-324)
In the CMake find module, define the Eigen imported target as when installing with CMake

* In the CMake find module, define the Eigen imported target

* Add quotes to the imported location, in case there are spaces in the path.

Approved-by: Alain Vaucher <acvaucher@gmail.com>
2017-11-15 20:45:09 +00:00
Gael Guennebaud
7cc503f9f5 bug #1485: fix linking issue of non template functions 2017-11-15 21:33:37 +01:00
Gael Guennebaud
103c0aa6ad Add KLU in the list of third-party sparse solvers 2017-11-10 14:13:29 +01:00
Gael Guennebaud
00bc67c374 Move KLU support to official 2017-11-10 14:11:22 +01:00
Gael Guennebaud
b82cd93c01 KLU: truely disable unimplemented code, add proper static assertions in solve 2017-11-10 14:09:01 +01:00
Gael Guennebaud
6365f937d6 KLU depends on BTF but not on libSuiteSparse nor Cholmod 2017-11-10 13:58:52 +01:00
Gael Guennebaud
8cf63ccb99 Merged in kylemacfarlan/eigen (pull request PR-337)
Add support for SuiteSparse's KLU routines
2017-11-10 10:43:17 +00:00
Gael Guennebaud
1495b98a8e Merged in spraetor/eigen (pull request PR-305)
Issue with mpreal and std::numeric_limits::digits
2017-11-10 10:28:54 +00:00
Gael Guennebaud
fc45324380 Merged in jkflying/eigen-fix-scaling (pull request PR-302)
Make scaling work with non-square matrices
2017-11-10 10:11:36 +00:00
Gael Guennebaud
d306b96fb7 Merged in carpent/eigen (pull request PR-342)
Use col method for column-major matrix
2017-11-10 10:09:53 +00:00
Gael Guennebaud
1b2dcf9a47 Check that Schur decomposition succeed. 2017-11-10 10:26:09 +01:00
Gael Guennebaud
0a1cc73942 bug #1484: restore deleted line for 128 bits long doubles, and improve dispatching logic. 2017-11-10 10:25:41 +01:00
Gael Guennebaud
f86bb89d39 Add EIGEN_MKL_NO_DIRECT_CALL option 2017-11-09 11:07:45 +01:00
Gael Guennebaud
5fa79f96b8 Patch from Konstantin Arturov to enable MKL's direct call by default 2017-11-09 10:58:38 +01:00
Justin Carpentier
a020d9b134 Use col method for column-major matrix 2017-10-17 21:51:27 +02:00
Kyle Vedder
c0e1d510fd Add support for SuiteSparse's KLU routines 2017-10-04 21:01:23 -05:00
Gael Guennebaud
6dcf966558 Avoid implicit scalar conversion with accuracy loss in pow(scalar,array) 2017-06-12 16:47:22 +02:00
Gael Guennebaud
50e09cca0f fix tipo 2017-06-11 15:30:36 +02:00
Gael Guennebaud
a4fd4233ad Fix compilation with some compilers 2017-06-09 23:02:02 +02:00
Gael Guennebaud
c3e2afce0d Enable MSVC 2010 workaround from MSVC only 2017-06-09 16:25:18 +02:00
Gael Guennebaud
731c8c704d bug #1403: more scalar conversions fixes in BDCSVD 2017-06-09 15:45:49 +02:00
Gael Guennebaud
1bbcf19029 bug #1403: fix implicit scalar type conversion. 2017-06-09 14:44:02 +02:00
Gael Guennebaud
ba5cab576a bug #1405: enable StrictlyLower/StrictlyUpper triangularView as the destination of matrix*matrix products. 2017-06-09 14:38:04 +02:00
Gael Guennebaud
90168c003d bug #1414: doxygen, add EigenBase to CoreModule 2017-06-09 14:01:44 +02:00
Gael Guennebaud
26f552c18d fix compilation of Half in C++98 (issue introduced in previous commit) 2017-06-09 13:36:58 +02:00
Gael Guennebaud
1d59ca2458 Fix compilation with gcc 4.3 and ARM NEON 2017-06-09 13:20:52 +02:00
Gael Guennebaud
fb1ee04087 bug #1410: fix lvalue propagation of Array/Matrix-Wrapper with a const nested expression. 2017-06-09 13:13:03 +02:00
Gael Guennebaud
723a59ac26 add regression test for aliasing in product rewritting 2017-06-09 12:54:40 +02:00
Gael Guennebaud
8640093af1 fix compilation in C++98 2017-06-09 12:45:01 +02:00
Gael Guennebaud
a7be4cd1b1 Fix LeastSquareDiagonalPreconditioner for complexes (issue introduced in previous commit) 2017-06-09 11:57:53 +02:00
Gael Guennebaud
498aa95a8b bug #1424: add numext::abs specialization for unsigned integer types. 2017-06-09 11:53:49 +02:00
Gael Guennebaud
d588822779 Add missing std::numeric_limits specialization for half, and complete NumTraits<half> 2017-06-09 11:51:53 +02:00
Gael Guennebaud
682b2ef17e bug #1423: fix LSCG\'s Jacobi preconditioner for row-major matrices. 2017-06-08 15:06:27 +02:00
Gael Guennebaud
4bbc320468 bug #1435: fix aliasing issue in exressions like: A = C - B*A; 2017-06-08 12:55:25 +02:00
Hugh Perkins
9341f258d4 Add labels to #ifdef, in TensorReductionCuda.h 2017-06-06 15:51:06 +01:00
Benoit Steiner
1e736b9ead Merged in mehdi_goli/opencl/SYCLAlignAllocator (pull request PR-7)
Fixing SYCL alignment issue required by TensorFlow.
2017-05-26 17:23:00 +00:00
Benoit Steiner
9dee55ec33 Merged eigen/eigen into default 2017-05-26 09:01:04 -07:00
Mehdi Goli
0370d3576e Applying Ronnan's comments. 2017-05-26 16:01:48 +01:00
Benoit Steiner
615aff4d6e Merged in a-doumoulakis/opencl (pull request PR-12)
Enable triSYCL with Eigen
2017-05-25 18:18:23 +00:00
a-doumoulakis
c3bd860de8 Modification upon request
- Remove warning suppression
2017-05-25 18:46:18 +01:00
Mehdi Goli
e3f964ed55 Applying Benoit's comment;removing dead code. 2017-05-25 11:17:26 +01:00
Benoit Steiner
df90010cdd Merged in mehdi_goli/opencl/CmakeFixForUbuntu16.04 (pull request PR-11)
CmakeFixForUbuntu16.04
2017-05-24 19:03:57 +00:00
a-doumoulakis
fb853a857a Restore misplaced comment 2017-05-24 17:50:15 +01:00
a-doumoulakis
7a8ba565f8 Merge changed from upstream 2017-05-24 17:45:29 +01:00
Mehdi Goli
daf99daadd Merged in DuncanMcBain/opencl/default (pull request PR-2)
Update FindComputeCpp.cmake with new changes from SDK
2017-05-24 13:59:53 +00:00
Mehdi Goli
9ef5c948ba Fixing Cmake for gcc>=5. 2017-05-24 13:11:16 +01:00
Duncan McBain
0cb3c7c7dd Update FindComputeCpp.cmake with new changes from SDK 2017-05-24 12:24:21 +01:00
Mmanu Chaturvedi
2971503fed Specializing numeric_limits For AutoDiffScalar 2017-05-23 17:12:36 -04:00
Gael Guennebaud
26e8f9171e Fix compilation of matrix log with Map as input 2017-06-07 10:51:23 +02:00
Gael Guennebaud
f2a553fb7b bug #1411: fix usage of alignment information in vectorization of quaternion product and conjugate. 2017-06-07 10:10:30 +02:00
Christoph Hertzberg
e018142604 Make sure CholmodSupport works when included in multiple compilation units (issue was reported on stackoverflow.com) 2017-06-06 19:23:14 +02:00
Gael Guennebaud
8508db52ab bug #1417: make LinSpace compatible with std::complex 2017-06-06 17:25:56 +02:00
Mehdi Goli
9aa7c30163 Merge with Benoit. 2017-05-23 10:51:34 +01:00
Mehdi Goli
b42d775f13 Temporarry branch for synch with upstream 2017-05-23 10:51:14 +01:00
Benoit Steiner
615733381e Merged in mehdi_goli/opencl/FixingCmakeDependency (pull request PR-2)
Fixing Cmake Dependency for SYCL
2017-05-22 17:43:06 +00:00
Benoit Steiner
1500a67c41 Merged in mehdi_goli/opencl/TensorSupportedDevice (pull request PR-6)
Fixing suported device list.
2017-05-22 16:22:21 +00:00
Mehdi Goli
76c0fc1f95 Fixing SYCL alignment issue required by TensorFlow. 2017-05-22 16:49:32 +01:00
Mehdi Goli
2d17128d6f Fixing suported device list. 2017-05-22 16:40:33 +01:00
Mehdi Goli
61d7f3664a Fixing Cmake Dependency for SYCL 2017-05-22 14:58:28 +01:00
a-doumoulakis
a5226ce4f7 Add cmake file FindTriSYCL.cmake 2017-05-17 17:59:30 +01:00
a-doumoulakis
052426b824 Add support for triSYCL
Eigen is now able to use triSYCL with EIGEN_SYCL_TRISYCL and TRISYCL_INCLUDE_DIR options

Fix contraction kernel with correct nd_item dimension
2017-05-05 19:26:27 +01:00
Abhijit Kundu
4343db84d8 updated warning number for nvcc relase 8 (V8.0.61) for the stupid warning message 'calling a __host__ function from a __host__ __device__ function is not allowed'. 2017-05-01 10:36:27 -04:00
Abhijit Kundu
9bc0a35731 Fixed nested angle barckets >> issue when compiling with cuda 8 2017-04-27 03:09:03 -04:00
Gael Guennebaud
891ac03483 Fix dense * sparse-selfadjoint-view product. 2017-04-25 13:58:10 +02:00
RJ Ryan
949a2da38c Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer.
Improves support for std::complex types when compiling for CUDA.

Expands on e2e9cdd169
 and 2bda1b0d93
.
2017-04-14 13:23:35 -07:00
Gael Guennebaud
d9084ac8e1 Improve mixing of complex and real in the vectorized path of apply_rotation_in_the_plane 2017-04-14 11:05:13 +02:00
Gael Guennebaud
f75dfdda7e Fix unwanted Real to Scalar to Real conversions in column-pivoting QR. 2017-04-14 10:34:30 +02:00
Gael Guennebaud
0f83aeb6b2 Improve cmake scripts for Pastix and BLAS detection. 2017-04-14 10:22:12 +02:00
Benoit Steiner
0d08165a7f Merged in benoitsteiner/opencl (pull request PR-309)
OpenCL improvements
2017-04-05 14:28:08 +00:00
Benoit Steiner
068cc09708 Preserve file naming conventions 2017-04-04 10:09:10 -07:00
Benoit Steiner
c302ea7bc4 Deleted empty line of code 2017-04-04 10:05:16 -07:00
Benoit Steiner
a5a0c8fac1 Guard sycl specific code under a EIGEN_USE_SYCL ifdef 2017-04-04 10:03:21 -07:00
Benoit Steiner
a1304b95b7 Code cleanup 2017-04-04 10:00:46 -07:00
Benoit Steiner
66c63826bd Guard the sycl specific code with EIGEN_USE_SYCL 2017-04-04 09:59:09 -07:00
Benoit Steiner
e3e343390a Guard the sycl specific code with a #ifdef EIGEN_USE_SYCL 2017-04-04 09:56:33 -07:00
Benoit Steiner
63840d4666 iGate the sycl specific code under a EIGEN_USE_SYCL define 2017-04-04 09:54:31 -07:00
Benoit Steiner
bc050ea9f0 Fixed compilation error when sycl is enabled. 2017-04-04 09:47:04 -07:00
Gagan Goel
4910630c96 fix typos in the Tensor readme 2017-03-31 20:32:16 -04:00
Benoit Steiner
c1b3d5ecb6 Restored code compatibility with compilers that dont support c++11
Gated more sycl code under #ifdef sycl
2017-03-31 08:31:28 -07:00
Benoit Steiner
e2d5d4e7b3 Restore the old constructors to retain compatibility with non c++11 compilers. 2017-03-31 08:26:13 -07:00
Benoit Steiner
73fcaa319f Gate the sycl specific code under #ifdef sycl 2017-03-31 08:22:25 -07:00
Mehdi Goli
bd64ee8555 Fixing TensorArgMaxSycl.h; Removing warning related to the hardcoded type of dims to be int in Argmax. 2017-03-28 16:50:34 +01:00
Simon Praetorius
511810797e Issue with mpreal and std::numeric_limits, i.e. digits is not a constant. Added a digits() traits in NumTraits with fallback to static constant. Specialization for mpreal added in MPRealSupport. 2017-03-24 17:45:56 +01:00
Luke Iwanski
a91417a7a5 Introduces align allocator for SYCL buffer 2017-03-20 14:48:54 +00:00
Gael Guennebaud
aae19c70ac update has_ReturnType to be more consistent with other has_ helpers 2017-03-17 17:33:15 +01:00
Benoit Steiner
f8a622ef3c Merged eigen/eigen into default 2017-03-15 20:06:19 -07:00
Benoit Steiner
fd7db52f9b Silenced compilation warning 2017-03-15 20:02:39 -07:00
Luke Iwanski
9597d6f6ab Temporary: Disables cxx11_tensor_argmax_sycl test since it is causing zombie thread 2017-03-15 19:28:09 +00:00
Luke Iwanski
c06861d15e Fixes bug in get_sycl_supported_devices() that was reporting unsupported Intel CPU on AMD platform - causing timeouts in that configuration 2017-03-15 19:26:08 +00:00
Benoit Steiner
7f31bb6822 Merged in ilya-biryukov/eigen/fix_clang_cuda_compilation (pull request PR-304)
Fixed compilation with cuda-clang
2017-03-15 16:48:52 +00:00
Gael Guennebaud
89fd0c3881 better check array index before using it 2017-03-15 15:18:03 +01:00
Benoit Jacob
61160a21d2 ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32. 2017-03-15 06:57:25 -04:00
Benoit Steiner
f0f3591118 Made the reduction code compile with cuda-clang 2017-03-14 14:16:53 -07:00
Mehdi Goli
f499fe9496 Adding synchronisation to convolution kernel for sycl backend. 2017-03-13 09:18:37 +00:00
Rasmus Munk Larsen
bfd7bf9c5b Get rid of Init(). 2017-03-10 08:48:20 -08:00
Rasmus Munk Larsen
d56ab01094 Use C++11 ctor forwarding to simplify code a bit. 2017-03-10 08:30:22 -08:00
Rasmus Munk Larsen
344c2694a6 Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases.
* Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work stealing loop before yielding. This is wasteful for such ops that process I/O.

* This also changes the number of iterations through the steal loop to be inversely proportional to the number of threads. Since the time of each iteration is proportional to the number of threads, this yields roughly a constant spin time.

* Implement a separate worker loop for the num_threads == 1 case since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues it might reverse the order in which ops are executed, compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads the single thread pools tend to be used for.

* Store num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size().
2017-03-09 15:41:03 -08:00
Luke Iwanski
1b32a10053 Use name to distinguish name instead of the vendor 2017-03-08 18:26:34 +00:00
Mehdi Goli
aadb7405a7 Fixing typo in sycl Benchmark. 2017-03-08 18:20:06 +00:00
Gael Guennebaud
970ff78294 bug #1401: fix compilation of "cond ? x : -x" with x an AutoDiffScalar 2017-03-08 16:16:53 +01:00
Mehdi Goli
5e9a1e7a7a Adding sycl Benchmarks. 2017-03-08 14:17:48 +00:00
Mehdi Goli
e2e3f78533 Fixing potential race condition on sycl device. 2017-03-07 17:48:15 +00:00
Mehdi Goli
f84963ed95 Adding TensorIndexTuple and TensorTupleReduceOP backend (ArgMax/Min) for sycl; fixing the address space issue for const TensorMap; converting all discard_write to write due to data missmatch. 2017-03-07 14:27:10 +00:00
Gael Guennebaud
e5156e4d25 fix typo 2017-03-07 11:25:58 +01:00
Gael Guennebaud
5694315fbb remove UTF8 symbol 2017-03-07 10:53:47 +01:00
Gael Guennebaud
e958c2baac remove UTF8 symbols 2017-03-07 10:47:40 +01:00
Gael Guennebaud
d967718525 do not include std header within extern C 2017-03-07 10:16:39 +01:00
Gael Guennebaud
659087b622 bug #1400: fix stableNorm with EIGEN_DONT_ALIGN_STATICALLY 2017-03-07 10:02:34 +01:00
Ilya Biryukov
1c03d43a5c Fixed compilation with cuda-clang 2017-03-06 12:01:12 +01:00
Julian Kent
bbe717fa2f Make scaling work with non-square matrices 2017-03-03 12:58:51 +01:00
Benoit Steiner
a71943b9a4 Made the Tensor code compile with clang 3.9 2017-03-02 10:47:29 -08:00
Benoit Steiner
09ae0e6586 Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that:
* they're used consistently between the declaration and the definition of a function
  * we avoid calling host only methods from host device methods.
2017-03-01 11:47:47 -08:00
Benoit Steiner
1e2d046651 Silenced a couple of compilation warnings 2017-03-01 10:13:42 -08:00
Benoit Steiner
c1d87ec110 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-03-01 10:08:50 -08:00
Benoit Steiner
3a3f040baa Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 17:06:15 -08:00
Benoit Steiner
7b61944669 Made most of the packet math primitives usable within CUDA kernel when compiling with clang 2017-02-28 17:05:28 -08:00
Benoit Steiner
c92406d613 Silenced clang compilation warning. 2017-02-28 17:03:11 -08:00
Benoit Steiner
857adbbd52 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 16:42:00 -08:00
Benoit Steiner
c36bc2d445 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 14:58:45 -08:00
Benoit Steiner
4a7df114c8 Added missing EIGEN_DEVICE_FUNC 2017-02-28 14:00:15 -08:00
Benoit Steiner
de7b0fdea9 Made the TensorStorage class compile with clang 3.9 2017-02-28 13:52:22 -08:00
Benoit Steiner
765f4cc4b4 Deleted extra: EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet. 2017-02-28 11:57:00 -08:00
Benoit Steiner
e993c94f07 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 09:56:45 -08:00
Benoit Steiner
33443ec2b0 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 09:50:10 -08:00
Benoit Steiner
f3e9c42876 Added missing EIGEN_DEVICE_FUNC qualifiers 2017-02-28 09:46:30 -08:00
Mehdi Goli
8296b87d7b Adding sycl backend for TensorCustomOp; fixing the partial lhs modification issue on sycl when the rhs is TensorContraction, reduction or convolution; Fixing the partial modification for memset when sycl backend is used. 2017-02-28 17:16:14 +00:00
Gael Guennebaud
4e98a7b2f0 bug #1396: add some missing EIGEN_DEVICE_FUNC 2017-02-28 09:47:38 +01:00
Gael Guennebaud
478a9f53be Fix typo. 2017-02-28 09:32:45 +01:00
Benoit Steiner
889c606f8f Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops 2017-02-27 17:17:47 -08:00
Benoit Steiner
193939d6aa Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods. 2017-02-27 17:11:47 -08:00
Benoit Steiner
ed4dc9d01a Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel 2017-02-27 16:57:01 -08:00
Benoit Steiner
b1fc7c9a09 Added missing EIGEN_DEVICE_FUNC qualifiers. 2017-02-27 16:48:30 -08:00
Benoit Steiner
554116bec1 Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase 2017-02-27 16:45:31 -08:00
Benoit Steiner
34d9fce93b Avoid unecessary float to double conversions. 2017-02-27 16:33:33 -08:00
Benoit Steiner
e0bd6f5738 Merged eigen/eigen into default 2017-02-26 10:02:14 -08:00
Mehdi Goli
2fa2b617a9 Adding TensorVolumePatchOP.h for sycl 2017-02-24 19:16:24 +00:00
Mehdi Goli
0b7875f137 Converting fixed float type into template type for TensorContraction. 2017-02-24 18:13:30 +00:00
Mehdi Goli
89dfd51fae Adding Sycl Backend for TensorGenerator.h. 2017-02-22 16:36:24 +00:00
Gael Guennebaud
5c68ba41a8 typos 2017-02-21 17:10:55 +01:00
Gael Guennebaud
b0f55ef85a merge 2017-02-21 17:04:10 +01:00
Gael Guennebaud
d29e9d7119 Improve documentation of reshaped 2017-02-21 17:03:10 +01:00
Gael Guennebaud
9b6e365018 Fix linking issue. 2017-02-21 16:52:22 +01:00
Gael Guennebaud
3d200257d7 Add support for automatic-size deduction in reshaped, e.g.:
mat.reshaped(4,AutoSize); <-> mat.reshaped(4,mat.size()/4);
2017-02-21 15:57:25 +01:00
Gael Guennebaud
f8179385bd Add missing const version of mat(all). 2017-02-21 13:56:26 +01:00
Gael Guennebaud
1e3aa470fa Fix long to int conversion 2017-02-21 13:56:01 +01:00
Gael Guennebaud
b3fc0007ae Add support for mat(all) as an alias to mat.reshaped(mat.size(),fix<1>); 2017-02-21 13:49:09 +01:00
Mehdi Goli
4f07ac16b0 Reducing the number of warnings. 2017-02-21 10:09:47 +00:00
Gael Guennebaud
76687f385c bug #1394: fix compilation of SelfAdjointEigenSolver<Matrix>(sparse*sparse); 2017-02-20 14:27:26 +01:00
Gael Guennebaud
d8b1f6cebd bug #1380: for Map<> as input of matrix exponential 2017-02-20 14:06:06 +01:00
Gael Guennebaud
6572825703 bug #1395: fix the use of compile-time vectors as inputs of JacobiSVD. 2017-02-20 13:44:37 +01:00
Mehdi Goli
79ebc8f761 Adding Sycl backend for TensorImagePatchOP.h; adding Sycl backend for TensorInflation.h. 2017-02-20 12:11:05 +00:00
Gael Guennebaud
9081c8f6ea Add support for RowOrder reshaped 2017-02-20 11:46:21 +01:00
Gael Guennebaud
a811a04696 Silent warning. 2017-02-20 10:14:21 +01:00
Gael Guennebaud
63798df038 Fix usage of CUDACC_VER 2017-02-20 08:16:36 +01:00
Gael Guennebaud
deefa54a54 Fix tracking of temporaries in unit tests 2017-02-19 10:32:54 +01:00
Gael Guennebaud
f8a55cc062 Fix compilation. 2017-02-18 10:08:13 +01:00
Gael Guennebaud
cbbf88c4d7 Use int32_t instead of int in NEON code. Some platforms with 16 bytes int supports ARM NEON. 2017-02-17 14:39:02 +01:00
Gael Guennebaud
582b5e39bf bug #1393: enable Matrix/Array explicit ctor from types with conversion operators (was ok with 3.2) 2017-02-17 14:10:57 +01:00
Benoit Steiner
cfa0568ef7 Size indices are signed. 2017-02-16 10:13:34 -08:00
Mehdi Goli
91982b91c0 Adding TensorLayoutSwapOp for sycl. 2017-02-15 16:28:12 +00:00
Mehdi Goli
b1e312edd6 Adding TensorPatch.h for sycl backend. 2017-02-15 10:13:01 +00:00
Benoit Steiner
31a25ab226 Merged eigen/eigen into default 2017-02-14 15:36:21 -08:00
Mehdi Goli
0d153ded29 Adding TensorChippingOP for sycl backend; fixing the index value in the verification operation for cxx11_tensorChipping.cpp test 2017-02-13 17:25:12 +00:00
Gael Guennebaud
5937c4ae32 Fall back is_integral to std::is_integral in c++11 2017-02-13 17:14:26 +01:00
Gael Guennebaud
7073430946 Fix overflow and make use of long long in c++11 only. 2017-02-13 17:14:04 +01:00
Jonathan Hseu
3453b00a1e Fix vector indexing with uint64_t 2017-02-11 21:45:32 -08:00
Gael Guennebaud
e7ebe52bfb bug #1391: include IO.h before DenseBase to enable its usage in DenseBase plugins. 2017-02-13 09:46:20 +01:00
Gael Guennebaud
b3750990d5 Workaround some gcc 4.7 warnings 2017-02-11 23:24:06 +01:00
Gael Guennebaud
4b22048cea Fallback Reshaped to MapBase when possible (same storage order and linear access to the nested expression) 2017-02-11 15:32:53 +01:00
Gael Guennebaud
83d6a529c3 Use Eigen::fix<N> to pass compile-time sizes. 2017-02-11 15:31:28 +01:00
Gael Guennebaud
c16ee72b20 bug #1392: fix #include <Eigen/Sparse> with mpl2-only 2017-02-11 10:35:01 +01:00
Gael Guennebaud
e43016367a Forgot to include a file in previous commit 2017-02-11 10:34:18 +01:00
Gael Guennebaud
6486d4fc95 Worakound gcc 4.7 issue in c++11. 2017-02-11 10:29:10 +01:00
Gael Guennebaud
4a4a72951f Fix previous commits: disbale only problematic indexed view methods for old compilers instead of disabling everything.
Tested with gcc 4.7 (c++03) and gcc 4.8 (c++03 & c++11)
2017-02-11 10:28:44 +01:00
Benoit Steiner
fad776492f Merged eigen/eigen into default 2017-02-10 14:27:43 -08:00
Benoit Steiner
1ef30b8090 Fixed bug introduced in previous commit 2017-02-10 13:35:10 -08:00
Benoit Steiner
769208a17f Pulled latest updates from upstream 2017-02-10 13:11:40 -08:00
Benoit Steiner
8b3cc54c42 Added a new EIGEN_HAS_INDEXED_VIEW define that set to 0 for older compilers that are known to fail to compile the indexed views (I used the define from the indexed_views.cpp test).
Only include the indexed view methods when the compiler supports the code.
This makes it possible to use Eigen again in complex code bases such as TensorFlow and older compilers such as gcc 4.8
2017-02-10 13:08:49 -08:00
Gael Guennebaud
a1ff24f96a Fix prunning in (sparse*sparse).pruned() when the result is nearly dense. 2017-02-10 13:59:32 +01:00
Gael Guennebaud
0256c52359 Include clang in the list of non strict MSVC (just to be sure) 2017-02-10 13:41:52 +01:00
Alexander Neumann
dd58462e63 fixed inlining issue with clang-cl on visual studio
(grafted from 7962ac1a58
)
2017-02-08 23:50:38 +01:00
Gael Guennebaud
fc8fd5fd24 Improve multi-threading heuristic for matrix products with a small number of columns. 2017-02-07 17:19:59 +01:00
Mehdi Goli
0ee97b60c2 Adding mean to TensorReductionSycl.h 2017-02-07 15:43:17 +00:00
Mehdi Goli
42bd5c4e7b Fixing TensorReductionSycl for min and max. 2017-02-06 18:05:23 +00:00
Gael Guennebaud
4254b3eda3 bug #1389: MSVC's std containers do not properly align in 64 bits mode if the requested alignment is larger than 16 bytes (e.g., with AVX) 2017-02-03 15:22:35 +01:00
Mehdi Goli
bc128f9f3b Reducing the warnings in Sycl backend. 2017-02-02 10:43:47 +00:00
Benoit Steiner
442e9cbb30 Silenced several compilation warnings 2017-02-01 15:50:58 -08:00
Benoit Steiner
2db75c07a6 fixed the ordering of the template and EIGEN_DEVICE_FUNC keywords in a few more places to get more of the Eigen codebase to compile with nvcc again. 2017-02-01 15:41:29 -08:00
Benoit Steiner
fcd257039b Replaced EIGEN_DEVICE_FUNC template<foo> with template<foo> EIGEN_DEVICE_FUNC to make the code compile with nvcc8. 2017-02-01 15:30:49 -08:00
Gael Guennebaud
84090027c4 Disable a part of the unit test for gcc 4.8 2017-02-01 23:37:44 +01:00
Gael Guennebaud
0eceea4efd Define EIGEN_COMP_GNUC to reflect version number: 47, 48, 49, 50, 60, ... 2017-02-01 23:36:40 +01:00
Mehdi Goli
ff53050034 Converting ptrdiff_t type to int64_t type in cxx11_tensor_contract_sycl.cpp in order to be the same as other tests. 2017-02-01 15:36:03 +00:00
Mehdi Goli
bab29936a1 Reducing warnings in Sycl backend. 2017-02-01 15:29:53 +00:00
Gael Guennebaud
645a8e32a5 Fix compilation of JacobiSVD for vectors type 2017-01-31 16:22:54 +01:00
Mehdi Goli
48a20b7d95 Fixing compiler error on TensorContractionSycl.h; Silencing the compiler unused parameter warning for eval_op_indices in TensorContraction.h 2017-01-31 14:06:36 +00:00
Gael Guennebaud
53026d29d4 bug #478: fix regression in the eigen decomposition of zero matrices. 2017-01-31 14:22:42 +01:00
Benoit Steiner
fbc39fd02c Merge latest changes from upstream 2017-01-30 15:25:57 -08:00
Gael Guennebaud
63de19c000 bug #1380: fix matrix exponential with Map<> 2017-01-30 13:55:27 +01:00
Gael Guennebaud
c86911ac73 bug #1384: fix evaluation of "sparse/scalar" that used the wrong evaluation path. 2017-01-30 13:38:24 +01:00
Mehdi Goli
82ce92419e Fixing the buffer type in memcpy. 2017-01-30 11:38:20 +00:00
Gael Guennebaud
24409f3acd Use fix<> API to specify compile-time reshaped sizes. 2017-01-29 15:20:35 +01:00
Gael Guennebaud
9036cda364 Cleanup intitial reshape implementation:
- reshape -> reshaped
 - make it compatible with evaluators.
2017-01-29 14:57:45 +01:00
Gael Guennebaud
0e89baa5d8 import yoco xiao's work on reshape 2017-01-29 14:29:31 +01:00
Gael Guennebaud
d024e9942d MSVC 1900 release is not c++14 compatible enough for us. The 1910 update seems to be fine though. 2017-01-27 22:17:59 +01:00
Gael Guennebaud
83592659ba merge 2017-01-27 21:59:59 +01:00
Gael Guennebaud
4a351be163 Fix warning 2017-01-27 11:59:35 +01:00
Gael Guennebaud
251ad3e04f Fix unamed type as template parametre issue. 2017-01-27 11:57:52 +01:00
Rasmus Munk Larsen
edaa0fc5d1 Revert PR-292. After further investigation, the memcpy->memmove change was only good for Haswell on older versions of glibc. Adding a switch for small sizes is perhaps useful for string copies, but also has an overhead for larger sizes, making it a poor trade-off for general memcpy.
This PR also removes a couple of unnecessary semi-colons in Eigen/src/Core/AssignEvaluator.h that caused compiler warning everywhere.
2017-01-26 12:46:06 -08:00
Gael Guennebaud
25a1703579 Merged in ggael/eigen-flexidexing (pull request PR-294)
generalized operator() for indexed access and slicing
2017-01-26 08:04:23 +00:00
Gael Guennebaud
98dfe0c13f Fix useless ';' warning 2017-01-25 22:55:04 +01:00
Gael Guennebaud
28351073d8 Fix unamed type as template argument (ok in c++11 only) 2017-01-25 22:54:51 +01:00
Gael Guennebaud
607be65a03 Fix duplicates of array_size bewteen unsupported and Core 2017-01-25 22:53:58 +01:00
Rasmus Munk Larsen
7d39c6d50a Merged eigen/eigen into default 2017-01-25 09:22:26 -08:00
Rasmus Munk Larsen
5c9ed4ba0d Reverse arguments for pmin in AVX. 2017-01-25 09:21:57 -08:00
Gael Guennebaud
850ca961d2 bug #1383: fix regression in LinSpaced for integers and high<low 2017-01-25 18:13:53 +01:00
Gael Guennebaud
296d24be4d bug #1381: fix sparse.diagonal() used as a rvalue.
The problem was that is "sparse" is not const, then sparse.diagonal() must have the
LValueBit flag meaning that sparse.diagonal().coeff(i) must returns a const reference,
const Scalar&. However, sparse::coeff() cannot returns a reference for a non-existing
zero coefficient. The trick is to return a reference to a local member of
evaluator<SparseMatrix>.
2017-01-25 17:39:01 +01:00
Gael Guennebaud
d06a48959a bug #1383: Fix regression from 3.2 with LinSpaced(n,0,n-1) with n==0. 2017-01-25 15:27:13 +01:00
Rasmus Munk Larsen
ae3e43a125 Remove extra space. 2017-01-24 16:16:39 -08:00
Benoit Steiner
e96c77668d Merged in rmlarsen/eigen2 (pull request PR-292)
Adds a fast memcpy function to Eigen.
2017-01-25 00:14:04 +00:00
Rasmus Munk Larsen
3be5ee2352 Update copy helper to use fast_memcpy. 2017-01-24 14:22:49 -08:00
Rasmus Munk Larsen
e6b1020221 Adds a fast memcpy function to Eigen. This takes advantage of the following:
1. For small fixed sizes, the compiler generates inline code for memcpy, which is much faster.

2. My colleague eriche at googl dot com discovered that for large sizes, memmove is significantly faster than memcpy (at least on Linux with GCC or Clang). See benchmark numbers measured on a Haswell (HP Z440) workstation here: https://docs.google.com/a/google.com/spreadsheets/d/1jLs5bKzXwhpTySw65MhG1pZpsIwkszZqQTjwrd_n0ic/pubhtml This is of course surprising since memcpy is a less constrained version of memmove. This stackoverflow thread contains some speculation as to the causes: http://stackoverflow.com/questions/22793669/poor-memcpy-performance-on-linux

Below are numbers for copying and slicing tensors using the multithreaded TensorDevice. The numbers show significant improvements for memcpy of very small blocks and for memcpy of large blocks single threaded (we were already able to saturate memory bandwidth for >1 threads before on large blocks). The "slicingSmallPieces" benchmark also shows small consistent improvements, since memcpy cost is a fair portion of that particular computation.

The benchmarks operate on NxN matrices, and the names are of the form BM_$OP_${NUMTHREADS}T/${N}.

Measured improvements in wall clock time:

Run on rmlarsen3.mtv (12 X 3501 MHz CPUs); 2017-01-20T11:26:31.493023454-08:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_memcpy_1T/2                          3.48      2.39    +31.3%
BM_memcpy_1T/8                          12.3      6.51    +47.0%
BM_memcpy_1T/64                          371       383     -3.2%
BM_memcpy_1T/512                       66922     66720     +0.3%
BM_memcpy_1T/4k                      9892867   6849682    +30.8%
BM_memcpy_1T/5k                     14951099  10332856    +30.9%
BM_memcpy_2T/2                          3.50      2.46    +29.7%
BM_memcpy_2T/8                          12.3      7.66    +37.7%
BM_memcpy_2T/64                          371       376     -1.3%
BM_memcpy_2T/512                       66652     66788     -0.2%
BM_memcpy_2T/4k                      6145012   6117776     +0.4%
BM_memcpy_2T/5k                      9181478   9010942     +1.9%
BM_memcpy_4T/2                          3.47      2.47    +31.0%
BM_memcpy_4T/8                          12.3      6.67    +45.8
BM_memcpy_4T/64                          374       376     -0.5%
BM_memcpy_4T/512                       67833     68019     -0.3%
BM_memcpy_4T/4k                      5057425   5188253     -2.6%
BM_memcpy_4T/5k                      7555638   7779468     -3.0%
BM_memcpy_6T/2                          3.51      2.50    +28.8%
BM_memcpy_6T/8                          12.3      7.61    +38.1%
BM_memcpy_6T/64                          373       378     -1.3%
BM_memcpy_6T/512                       66871     66774     +0.1%
BM_memcpy_6T/4k                      5112975   5233502     -2.4%
BM_memcpy_6T/5k                      7614180   7772246     -2.1%
BM_memcpy_8T/2                          3.47      2.41    +30.5%
BM_memcpy_8T/8                          12.4      10.5    +15.3%
BM_memcpy_8T/64                          372       388     -4.3%
BM_memcpy_8T/512                       67373     66588     +1.2%
BM_memcpy_8T/4k                      5148462   5254897     -2.1%
BM_memcpy_8T/5k                      7660989   7799058     -1.8%
BM_memcpy_12T/2                         3.50      2.40    +31.4%
BM_memcpy_12T/8                         12.4      7.55    +39.1
BM_memcpy_12T/64                         374       378     -1.1%
BM_memcpy_12T/512                      67132     66683     +0.7%
BM_memcpy_12T/4k                     5185125   5292920     -2.1%
BM_memcpy_12T/5k                     7717284   7942684     -2.9%
BM_slicingSmallPieces_1T/2              47.3      47.5     +0.4%
BM_slicingSmallPieces_1T/8              53.6      52.3     +2.4%
BM_slicingSmallPieces_1T/64              491       476     +3.1%
BM_slicingSmallPieces_1T/512           21734     18814    +13.4%
BM_slicingSmallPieces_1T/4k           394660    396760     -0.5%
BM_slicingSmallPieces_1T/5k           218722    209244     +4.3%
BM_slicingSmallPieces_2T/2              80.7      79.9     +1.0%
BM_slicingSmallPieces_2T/8              54.2      53.1     +2.0
BM_slicingSmallPieces_2T/64              497       477     +4.0%
BM_slicingSmallPieces_2T/512           21732     18822    +13.4%
BM_slicingSmallPieces_2T/4k           392885    390490     +0.6%
BM_slicingSmallPieces_2T/5k           221988    208678     +6.0%
BM_slicingSmallPieces_4T/2              80.8      80.1     +0.9%
BM_slicingSmallPieces_4T/8              54.1      53.2     +1.7%
BM_slicingSmallPieces_4T/64              493       476     +3.4%
BM_slicingSmallPieces_4T/512           21702     18758    +13.6%
BM_slicingSmallPieces_4T/4k           393962    404023     -2.6%
BM_slicingSmallPieces_4T/5k           249667    211732    +15.2%
BM_slicingSmallPieces_6T/2              80.5      80.1     +0.5%
BM_slicingSmallPieces_6T/8              54.4      53.4     +1.8%
BM_slicingSmallPieces_6T/64              488       478     +2.0%
BM_slicingSmallPieces_6T/512           21719     18841    +13.3%
BM_slicingSmallPieces_6T/4k           394950    397583     -0.7%
BM_slicingSmallPieces_6T/5k           223080    210148     +5.8%
BM_slicingSmallPieces_8T/2              81.2      80.4     +1.0%
BM_slicingSmallPieces_8T/8              58.1      53.5     +7.9%
BM_slicingSmallPieces_8T/64              489       480     +1.8%
BM_slicingSmallPieces_8T/512           21586     18798    +12.9%
BM_slicingSmallPieces_8T/4k           394592    400165     -1.4%
BM_slicingSmallPieces_8T/5k           219688    208301     +5.2%
BM_slicingSmallPieces_12T/2             80.2      79.8     +0.7%
BM_slicingSmallPieces_12T/8             54.4      53.4     +1.8
BM_slicingSmallPieces_12T/64             488       476     +2.5%
BM_slicingSmallPieces_12T/512          21931     18831    +14.1%
BM_slicingSmallPieces_12T/4k          393962    396541     -0.7%
BM_slicingSmallPieces_12T/5k          218803    207965     +5.0%
2017-01-24 13:55:18 -08:00
Rasmus Munk Larsen
7b6aaa3440 Fix NaN propagation for AVX512. 2017-01-24 13:37:08 -08:00
Rasmus Munk Larsen
5e144bbaa4 Make NaN propagatation consistent between the pmax/pmin and std::max/std::min. This makes the NaN propagation consistent between the scalar and vectorized code paths of Eigen's scalar_max_op and scalar_min_op.
See #1373 for details.
2017-01-24 13:32:50 -08:00
Gael Guennebaud
d83db761a2 Add support for std::integral_constant 2017-01-24 16:28:12 +01:00
Gael Guennebaud
bc10201854 Add test for multiple symbols 2017-01-24 16:27:51 +01:00
Gael Guennebaud
c43d254d13 Fix seq().reverse() in c++98 2017-01-24 11:36:43 +01:00
Gael Guennebaud
5783158e8f Add unit test for FixedInt and Symbolic 2017-01-24 10:55:12 +01:00
Gael Guennebaud
ddd83f82d8 Add support for "SymbolicExpr op fix<N>" in C++98/11 mode. 2017-01-24 10:54:42 +01:00
Gael Guennebaud
228fef1b3a Extended the set of arithmetic operators supported by FixedInt (-,+,*,/,%,&,|) 2017-01-24 10:53:51 +01:00
Gael Guennebaud
bb52f74e62 Add internal doc 2017-01-24 10:13:35 +01:00
Gael Guennebaud
41c523a0ab Rename fix_t to FixedInt 2017-01-24 09:39:49 +01:00
Gael Guennebaud
156e6234f1 bug #1375: fix cmake installation with cmake 2.8 2017-01-24 09:16:40 +01:00
Gael Guennebaud
ba3f977946 bug #1376: add missing assertion on size mismatch with compound assignment operators (e.g., mat += mat.col(j)) 2017-01-23 22:06:08 +01:00
Gael Guennebaud
b0db4eff36 bug #1382: move using std::size_t/ptrdiff_t to Eigen's namespace (still better than the global namespace!) 2017-01-23 22:03:57 +01:00
Gael Guennebaud
ca79c1545a Add std:: namespace prefix to all (hopefully) instances if size_t/ptrdfiff_t 2017-01-23 22:02:53 +01:00
Gael Guennebaud
4b607b5692 Use Index instead of size_t 2017-01-23 22:00:33 +01:00
Luke Iwanski
bf44fed9b7 Allows AMD APU 2017-01-23 15:56:45 +00:00
Gael Guennebaud
0fe278f7be bug #1379: fix compilation in sparse*diagonal*dense with openmp 2017-01-21 23:27:01 +01:00
Gael Guennebaud
22a172751e bug #1378: fix doc (DiagonalIndex vs Diagonal) 2017-01-21 22:09:59 +01:00
Mehdi Goli
602f8c27f5 Reverting back to the previous TensorDeviceSycl.h as the total number of buffer is not enough for tensorflow. 2017-01-20 18:23:20 +00:00
Gael Guennebaud
4d302a080c Recover compile-time size from seq(A,B) when A and B are fixed values. (c++11 only) 2017-01-19 20:34:18 +01:00
Gael Guennebaud
54f3fbee24 Exploit fixed values in seq and reverse with C++98 compatibility 2017-01-19 19:57:32 +01:00
Gael Guennebaud
7691723e34 Add support for fixed-value in symbolic expression, c++11 only for now. 2017-01-19 19:25:29 +01:00
Benoit Steiner
924600a0e8 Made sure that enabling avx2 instructions enables avx and sse instructions as well. 2017-01-19 09:54:48 -08:00
Mehdi Goli
77cc4d06c7 Removing unused variables 2017-01-19 17:06:21 +00:00
Mehdi Goli
837fdbdcb2 Merging with Benoit's upstream. 2017-01-19 11:34:34 +00:00
Mehdi Goli
6bdd15f572 Adding non-deferrenciable pointer track for ComputeCpp backend; Adding TensorConvolutionOp for ComputeCpp; fixing typos. modifying TensorDeviceSycl to use the LegacyPointer class. 2017-01-19 11:30:59 +00:00
Benoit Steiner
aa7fb88dfa Merged in LaFeuille/eigen (pull request PR-289)
Fix a typo
2017-01-18 16:44:39 -08:00
Gael Guennebaud
e84ed7b6ef Remove dead code 2017-01-18 23:18:28 +01:00
Gael Guennebaud
f3ccbe0419 Add a Symbolic::FixedExpr helper expression to make sure the compiler fully optimize the usage of last and end. 2017-01-18 23:16:32 +01:00
Mehdi Goli
c6f7b33834 Applying Benoit's comment. Embedding synchronisation inside device memcpy so there is no need to externally call synchronise() for device memcopy. 2017-01-18 10:45:28 +00:00
Gael Guennebaud
15471432fe Add a .reverse() member to ArithmeticSequence. 2017-01-18 11:35:27 +01:00
Gael Guennebaud
e4f8dd860a Add missing operator* 2017-01-18 10:49:01 +01:00
Gael Guennebaud
198507141b Update all block expressions to accept compile-time sizes passed by fix<N> or fix<N>(n) 2017-01-18 09:43:58 +01:00
Gael Guennebaud
5484ddd353 Merge the generic and dynamic overloads of block() 2017-01-17 22:11:46 +01:00
Gael Guennebaud
655ba783f8 Defer set-to-zero in triangular = product so that no aliasing issue occur in the common:
A.triangularView() = B*A.sefladjointView()*B.adjoint()
case that used to work in 3.2.
2017-01-17 18:03:35 +01:00
Gael Guennebaud
5e36ec3b6f Fix regression when passing enums to operator() 2017-01-17 17:10:16 +01:00
Gael Guennebaud
f7852c3d16 Fix -Wunnamed-type-template-args 2017-01-17 16:05:58 +01:00
Gael Guennebaud
4f36dcfda8 Add a generic block() method compatible with Eigen::fix 2017-01-17 11:34:28 +01:00
Gael Guennebaud
71e5b71356 Add a get_runtime_value helper to deal with pointer-to-function hack,
plus some refactoring to make the internals more consistent.
2017-01-17 11:33:57 +01:00
Gael Guennebaud
59801a3250 Add \newin{3.x} doxygen command 2017-01-17 10:31:28 +01:00
Gael Guennebaud
23bfcfc15f Add missing overload of get_compile_time for c++98/11 2017-01-17 10:30:21 +01:00
Gael Guennebaud
edff32c2c2 Disambiguate the two versions of fix for doxygen 2017-01-17 10:29:33 +01:00
Gael Guennebaud
4989922be2 Add support for symbolic expressions as arguments of operator() 2017-01-16 22:21:23 +01:00
Gael Guennebaud
12e22a2844 typos in doc 2017-01-16 16:31:19 +01:00
Gael Guennebaud
e70c4c97fa Typo 2017-01-16 16:20:16 +01:00
Gael Guennebaud
a9232af845 Introduce a variable_or_fixed<N> proxy returned by fix<N>(val) to pass both a compile-time and runtime fallback value in case N means "runtime".
This mechanism is used by the seq/seqN functions. The proxy object is immediately converted to pure compile-time (as fix<N>) or pure runtime (i.e., an Index) to avoid redundant template instantiations.
2017-01-16 16:17:01 +01:00
Gael Guennebaud
6e97698161 Introduce a EIGEN_HAS_CXX14 macro 2017-01-16 16:13:37 +01:00
Mehdi Goli
e46e722381 Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying Tensor Contractsycl to be located in any place in the expression tree. 2017-01-16 13:58:49 +00:00
Luke Iwanski
23778a15d8 Reverting unintentional change to Eigen/Geometry 2017-01-16 11:05:56 +00:00
LaFeuille
1b19b80c06 Fix a typo 2017-01-13 07:24:55 +00:00
Fraser Cormack
8245d3c7ad Fix case-sensitivity of file include 2017-01-12 12:13:18 +00:00
Gael Guennebaud
752bd92ba5 Large code refactoring:
- generalize some utilities and move them to Meta (size(), array_size())
 - move handling of all and single indices to IndexedViewHelper.h
 - several cleanup changes
2017-01-11 17:24:02 +01:00
Gael Guennebaud
f93d1c58e0 Make get_compile_time compatible with variable_if_dynamic 2017-01-11 17:08:59 +01:00
Gael Guennebaud
c020d307a6 Make variable_if_dynamic<T> implicitely convertible to T 2017-01-11 17:08:05 +01:00
Gael Guennebaud
43c617e2ee merge 2017-01-11 14:33:37 +01:00
Gael Guennebaud
152cd57bb7 Enable generation of doc for static variables in Eigen's namespace. 2017-01-11 14:29:20 +01:00
Gael Guennebaud
b1dc0fa813 Move fix and symbolic to their own file, and improve doxygen compatibility 2017-01-11 14:28:28 +01:00
Gael Guennebaud
04397f17e2 Add 1D overloads of operator() 2017-01-11 13:17:09 +01:00
Gael Guennebaud
45199b9773 Fix typo 2017-01-11 09:34:08 +01:00
Gael Guennebaud
1b5570988b Add doc to seq, seqN, ArithmeticSequence, operator(), etc. 2017-01-10 22:58:58 +01:00
Gael Guennebaud
17eac60446 Factorize const and non-const version of the generic operator() method. 2017-01-10 21:45:55 +01:00
Gael Guennebaud
d072fc4b14 add writeable IndexedView 2017-01-10 17:10:35 +01:00
Gael Guennebaud
c9d5e5c6da Simplify Symbolic API: std::tuple is now used internally and automatically built. 2017-01-10 16:55:07 +01:00
Gael Guennebaud
407e7b7a93 Simplify symbolic API by using "symbol=value" to associate a runtime value to a symbol. 2017-01-10 16:45:32 +01:00
Gael Guennebaud
96e6cf9aa2 Fix linking issue. 2017-01-10 16:35:46 +01:00
Gael Guennebaud
e63678bc89 Fix ambiguous call 2017-01-10 16:33:40 +01:00
Gael Guennebaud
8e247744a4 Fix linking issue 2017-01-10 16:32:06 +01:00
Gael Guennebaud
b47a7e5c3a Add doc for IndexedView 2017-01-10 16:28:57 +01:00
Gael Guennebaud
87963f441c Fallback to Block<> when possible (Index, all, seq with > increment).
This is important to take advantage of the optimized implementations (evaluator, products, etc.),
and to support sparse matrices.
2017-01-10 14:25:30 +01:00
Gael Guennebaud
a98c7efb16 Add a more generic evaluation mechanism and minimalistic doc. 2017-01-10 11:46:29 +01:00
Gael Guennebaud
13d954f270 Cleanup Eigen's namespace 2017-01-10 11:06:02 +01:00
Gael Guennebaud
9eaab4f9e0 Refactoring: move all symbolic stuff into its own namespace 2017-01-10 10:57:08 +01:00
Gael Guennebaud
acd08900c9 Move 'last' and 'end' to their own namespace 2017-01-10 10:31:07 +01:00
Gael Guennebaud
1df2377d78 Implement c++98 version of seq() 2017-01-10 10:28:45 +01:00
Gael Guennebaud
ecd9cc5412 Isolate legacy code (we keep it for performance comparison purpose) 2017-01-10 09:34:25 +01:00
Gael Guennebaud
b50c3e967e Add a minimalistic symbolic scalar type with expression template and make use of it to define the last placeholder and to unify the return type of seq and seqN. 2017-01-09 23:42:16 +01:00
Gael Guennebaud
68064e14fa Rename span/range to seqN/seq 2017-01-09 17:35:21 +01:00
Gael Guennebaud
ad3eef7608 Add link to SO 2017-01-09 13:01:39 +01:00
Gael Guennebaud
75aef5b37f Fix extraction of compile-time size of std::array with gcc 2017-01-06 22:04:49 +01:00
Gael Guennebaud
233dff1b35 Add support for plain arrays for columns and both rows/columns 2017-01-06 22:01:53 +01:00
Gael Guennebaud
76e183bd52 Propagate compile-time size for plain arrays 2017-01-06 22:01:23 +01:00
Gael Guennebaud
3264d3c761 Add support for plain-array as indices, e.g., mat({1,2,3,4}) 2017-01-06 21:53:32 +01:00
Gael Guennebaud
831fffe874 Add missing doc of SparseView 2017-01-06 18:01:29 +01:00
Gael Guennebaud
a875167d99 Propagate compile-time increment and strides.
Had to introduce a UndefinedIncr constant for non structured list of indices.
2017-01-06 15:54:55 +01:00
Gael Guennebaud
e383d6159a MSVC 2015 has all we want about c++11 and MSVC 2017 fails on binder1st/binder2nd 2017-01-06 15:44:13 +01:00
Gael Guennebaud
fad1fa75b3 Propagate compile-time size with "all" and add c++11 array unit test 2017-01-06 13:29:33 +01:00
Gael Guennebaud
3730e3ca9e Use "fix" for compile-time values, propagate compile-time sizes for span, clean some cleanup. 2017-01-06 13:10:10 +01:00
Gael Guennebaud
60e99ad8d7 Add unit test for indexed views 2017-01-06 11:59:08 +01:00
Gael Guennebaud
ac7e4ac9c0 Initial commit to add a generic indexed-based view of matrices.
This version already works as a read-only expression.
Numerous refactoring, renaming, extension, tuning passes are expected...
2017-01-06 00:01:44 +01:00
Gael Guennebaud
f3f026c9aa Convert integers to real numbers when computing relative L2 error 2017-01-05 13:36:08 +01:00
Jim Radford
0c226644d8 LLT: const the arg to solveInPlace() to allow passing .transpose(), .block(), etc. 2017-01-04 14:42:57 -08:00
Jim Radford
be281e5289 LLT: avoid making a copy when decomposing in place 2017-01-04 14:43:56 -08:00
Gael Guennebaud
e27f17bf5c Gub 1453: fix Map with non-default inner-stride but no outer-stride. 2017-08-22 13:27:37 +02:00
Gael Guennebaud
21d0a0bcf5 bug #1456: add perf recommendation for LLT and storage format 2017-08-22 12:46:35 +02:00
Gael Guennebaud
2c3d70d915 Re-enable hidden doc in LLT 2017-08-22 12:04:09 +02:00
Gael Guennebaud
a6e7a41a55 bug #1455: Cholesky module depends on Jacobi for rank-updates. 2017-08-22 11:37:32 +02:00
Gael Guennebaud
e6021cc8cc bug #1458: fix documentation of LLT and LDLT info() method. 2017-08-22 11:32:55 +02:00
Gael Guennebaud
2810ba194b Clarify MKL_DIRECT_CALL doc. 2017-08-17 22:12:26 +02:00
Gael Guennebaud
f727844658 use MKL's lapacke.h header when using MKL 2017-08-17 21:58:39 +02:00
Gael Guennebaud
8c858bd891 Clarify doc regarding the usage of MKL_DIRECT_CALL 2017-08-17 12:17:45 +02:00
Gael Guennebaud
b95f92843c Fix support for MKL's BLAS when using MKL_DIRECT_CALL. 2017-08-17 12:07:10 +02:00
Gael Guennebaud
89c01a494a Add unit test for has_ReturnType 2017-08-17 11:55:00 +02:00
Gael Guennebaud
687bedfcad Make NoAlias and JacobiRotation compatible with CUDA. 2017-08-17 11:51:22 +02:00
Gael Guennebaud
1f4b24d2df Do not preallocate more space than the matrix size (when the sparse matrix boils down to a vector 2017-07-20 10:13:48 +02:00
Gael Guennebaud
d580a90c9a Disable BDCSVD preallocation check. 2017-07-20 10:03:54 +02:00
Gael Guennebaud
55d7181557 Fix lazyness of operator* with CUDA 2017-07-20 09:47:28 +02:00
Gael Guennebaud
cda47c42c2 Fix compilation in c++98 mode. 2017-07-17 21:08:20 +02:00
Gael Guennebaud
a74b9ba7cd Update documentation for CUDA 2017-07-17 11:05:26 +02:00
Gael Guennebaud
3182bdbae6 Disable vectorization when compiled by nvcc, even is EIGEN_NO_CUDA is defined 2017-07-17 11:01:28 +02:00
Gael Guennebaud
9f8136ff74 disable nvcc boolean-expr-is-constant warning 2017-07-17 10:43:18 +02:00
Gael Guennebaud
bbd97b4095 Add a EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases 2017-07-17 01:02:51 +02:00
Gael Guennebaud
2299717fd5 Fix and workaround several doxygen issues/warnings 2017-01-04 23:27:33 +01:00
Luke Iwanski
90c5bc8d64 Fixes auto appearance in functor template argument for reduction. 2017-01-04 22:18:44 +00:00
Gael Guennebaud
ee6f7f6c0c Add doc for sparse triangular solve functions 2017-01-04 23:10:36 +01:00
Gael Guennebaud
5165de97a4 Add missing snippet files. 2017-01-04 23:08:27 +01:00
Gael Guennebaud
a0a36ad0ef bug #1336: workaround doxygen failing to include numerous members of MatriBase in Matrix 2017-01-04 22:02:39 +01:00
Gael Guennebaud
29a1a58113 Document selfadjointView 2017-01-04 22:01:50 +01:00
Gael Guennebaud
a5ebc92f8d bug #1336: fix doxygen issue regarding EIGEN_CWISE_BINARY_RETURN_TYPE 2017-01-04 18:21:44 +01:00
Gael Guennebaud
45b289505c Add debug output 2017-01-03 11:31:02 +01:00
Gael Guennebaud
5838f078a7 Fix inclusion 2017-01-03 11:30:27 +01:00
Gael Guennebaud
8702562177 bug #1370: add doc for StorageIndex 2017-01-03 11:25:41 +01:00
Gael Guennebaud
575c078759 bug #1370: rename _Index to _StorageIndex in SparseMatrix, and add a warning in the doc regarding the 3.2 to 3.3 change of SparseMatrix::Index 2017-01-03 11:19:14 +01:00
NeroBurner
c4fc2611ba add cmake-option to enable/disable creation of tests
* * *
disable unsupportet/test when test are disabled
* * *
rename EIGEN_ENABLE_TESTS to BUILD_TESTING
* * *
consider BUILD_TESTING in blas
2017-01-02 09:09:21 +01:00
Valentin Roussellet
d3c5525c23 Added += and + operators to inner iterators
Fix #1340
#1340
2016-12-28 18:29:30 +01:00
Gael Guennebaud
5c27962453 Move common cwise-unary method from MatrixBase/ArrayBase to the common DenseBase class. 2017-01-02 22:27:07 +01:00
Marco Falke
4ebf69394d doc: Fix trivial typo in AsciiQuickReference.txt
* * *
fixup!
2017-01-01 13:25:48 +00:00
Gael Guennebaud
8d7810a476 bug #1365: fix another type mismatch warning
(sync is set from and compared to an Index)
2016-12-28 23:35:43 +01:00
Gael Guennebaud
97812ff0d3 bug #1369: fix type mismatch warning.
Returned values of omp thread id and numbers are int,
o let's use int instead of Index here.
2016-12-28 23:29:35 +01:00
Gael Guennebaud
7713e20fd2 Fix compilation 2016-12-27 22:04:58 +01:00
Gael Guennebaud
ab69a7f6d1 Cleanup because trait<CwiseBinaryOp>::Flags now expose the correct storage order 2016-12-27 16:55:47 +01:00
Gael Guennebaud
d32a43e33a Make sure that traits<CwiseBinaryOp>::Flags reports the correct storage order so that methods like .outerSize()/.innerSize() work properly. 2016-12-27 16:35:45 +01:00
Gael Guennebaud
7136267461 Add missing .outer() member to iterators of evaluators of cwise sparse binary expression 2016-12-27 16:34:30 +01:00
Gael Guennebaud
fe0ee72390 Fix check of storage order mismatch for "sparse cwiseop sparse". 2016-12-27 16:33:19 +01:00
Gael Guennebaud
6b8f637ab1 Harmless typo 2016-12-27 16:31:17 +01:00
Benoit Steiner
3eda02d78d Fixed the sycl benchmarking code 2016-12-22 10:37:05 -08:00
Mehdi Goli
8b1c2108ba Reverting asynchronous exec to Synchronous exec regarding random race condition. 2016-12-22 16:45:38 +00:00
Benoit Steiner
354baa0fb1 Avoid using horizontal adds since they're not very efficient. 2016-12-21 20:55:07 -08:00
Benoit Steiner
d7825b6707 Use native AVX512 types instead of Eigen Packets whenever possible. 2016-12-21 20:06:18 -08:00
Benoit Steiner
660da83e18 Pulled latest update from trunk 2016-12-21 16:43:27 -08:00
Benoit Steiner
4236aebe10 Simplified the contraction code` 2016-12-21 16:42:56 -08:00
Benoit Steiner
3cfa16f41d Merged in benoitsteiner/opencl (pull request PR-279)
Fix for auto appearing in functor template argument.
2016-12-21 15:08:54 -08:00
Benoit Steiner
519d63d350 Added support for libxsmm kernel in multithreaded contractions 2016-12-21 15:06:06 -08:00
Benoit Steiner
0657228569 Simplified the way we link libxsmm 2016-12-21 14:40:08 -08:00
Benoit Steiner
bbca405f04 Pulled latest updates from trunk 2016-12-21 13:45:28 -08:00
Benoit Steiner
b91be60220 Automatically include and link libxsmm when present. 2016-12-21 13:44:59 -08:00
Gael Guennebaud
c6882a72ed Merged in joaoruileal/eigen (pull request PR-276)
Minor improvements to Umfpack support
2016-12-21 21:39:48 +01:00
Benoit Steiner
f9eff17e91 Leverage libxsmm kernels within signle threaded contractions 2016-12-21 12:32:06 -08:00
Benoit Steiner
c19fe5e9ed Added support for libxsmm in the eigen makefiles 2016-12-21 10:43:40 -08:00
Benoit Steiner
a34d4ebd74 Merged in benoitsteiner/opencl (pull request PR-278) 2016-12-21 08:24:17 -08:00
Luke Iwanski
c55ecfd820 Fix for auto appearing in functor template argument. 2016-12-21 15:42:51 +00:00
Joao Rui Leal
c8c89b5e19 renamed methods umfpackReportControl(), umfpackReportInfo(), and umfpackReportStatus() from UmfPackLU to printUmfpackControl(), printUmfpackInfo(), and printUmfpackStatus() 2016-12-21 09:16:28 +00:00
Benoit Steiner
0f577d4744 Merged eigen/eigen into default 2016-12-20 17:02:06 -08:00
Gael Guennebaud
f2f9df8aa5 Remove MSVC warning 4127 - conditional expression is constant from the disabled list as we now have a local workaround. 2016-12-20 22:53:19 +01:00
Gael Guennebaud
2b3fc981b8 bug #1362: workaround constant conditional warning produced by MSVC 2016-12-20 22:52:27 +01:00
Luke Iwanski
29186f766f Fixed order of initialisation in ExecExprFunctorKernel functor. 2016-12-20 21:32:42 +00:00
Gael Guennebaud
94e8d8902f Fix bug #1367: compilation fix for gcc 4.1! 2016-12-20 22:17:01 +01:00
Gael Guennebaud
e8d6862f14 Properly adjust precision when saving to Market format. 2016-12-20 22:10:33 +01:00
Gael Guennebaud
e2f4ee1c2b Speed up parsing of sparse Market file. 2016-12-20 21:56:21 +01:00
Luke Iwanski
8245851d1b Matching parameters order between lambda and the functor. 2016-12-20 16:18:15 +00:00
Gael Guennebaud
684cfc762d Add transpose, adjoint, conjugate methods to SelfAdjointView (useful to write generic code) 2016-12-20 16:33:53 +01:00
Gael Guennebaud
8bd0d3aa34 merge 2016-12-20 15:56:00 +01:00
Gael Guennebaud
11f55b2979 Optimize storage layout of Cwise* and PlainObjectBase evaluator to remove the functor or outer-stride if they are empty.
For instance, sizeof("(A-B).cwiseAbs2()") with A,B Vector4f is now 16 bytes, instead of 48 before this optimization.
In theory, evaluators should be completely optimized away by the compiler, but this might help in some cases.
2016-12-20 15:55:40 +01:00
Gael Guennebaud
5271474b15 Remove common "noncopyable" base class from evaluator_base to get a chance to get EBO (Empty Base Optimization)
Note: we should probbaly get rid of this class and define a macro instead.
2016-12-20 15:51:30 +01:00
Christoph Hertzberg
1c024e5585 Added some possible temporaries to .hgignore 2016-12-20 14:45:44 +01:00
Gael Guennebaud
316673bbde Clean-up usage of ExpressionTraits in all/any implementation. 2016-12-20 14:38:05 +01:00
Benoit Steiner
548ed30a1c Added an OpenCL regression test 2016-12-19 18:56:26 -08:00
Christoph Hertzberg
10c6bcdc2e Add support for long indexes and for (real-valued) row-major matrices to CholmodSupport module 2016-12-19 14:07:42 +01:00
Gael Guennebaud
f5d644b415 Make sure that HyperPlane::transform manitains a unit normal vector in the Affine case. 2016-12-20 09:35:00 +01:00
Benoit Steiner
27ceb43bf6 Fixed race condition in the tensor_shuffling_sycl test 2016-12-19 15:34:42 -08:00
Benoit Steiner
923acadfac Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics 2016-12-19 13:02:27 -08:00
Benoit Jacob
751e097c57 Use 32 registers on ARM64 2016-12-19 13:44:46 -05:00
Benoit Steiner
fb1d0138ec Include SSE packet instructions when compiling with avx512 enabled. 2016-12-19 07:32:48 -08:00
Joao Rui Leal
95b804c0fe it is now possible to change Umfpack control settings before factorizations; added access to the report functions of Umfpack 2016-12-19 10:45:59 +00:00
Gael Guennebaud
8c0e701504 bug #1360: fix sign issue with pmull on altivec 2016-12-18 22:13:19 +00:00
Gael Guennebaud
fc94258e77 Fix unused warning 2016-12-18 22:11:48 +00:00
Benoit Steiner
0e0d92d34b Merged in benoitsteiner/opencl (pull request PR-275)
Improved support for OpenCL
2016-12-17 10:14:17 -08:00
Benoit Steiner
9e03dfb452 Made sure EIGEN_HAS_C99_MATH is defined when compiling OpenCL code 2016-12-17 09:23:37 -08:00
Benoit Steiner
70d0172f0c Merged eigen/eigen into default 2016-12-16 17:37:04 -08:00
Benoit Steiner
8910442e19 Fixed memcpy, memcpyHostToDevice and memcpyDeviceToHost for Sycl. 2016-12-16 15:45:04 -08:00
Luke Iwanski
54db66c5df struct -> class in order to silence compilation warning. 2016-12-16 20:25:20 +00:00
Mehdi Goli
35bae513a0 Converting all parallel for lambda to functor in order to prevent kernel duplication name error; adding tensorConcatinationOp backend for sycl. 2016-12-16 19:46:45 +00:00
ermak
d60cca32e5 Transformation methods added to ParametrizedLine class. 2016-12-17 00:45:13 +07:00
Jeff Trull
7949849ebc refactor common row/column iteration code into its own class 2016-12-08 19:40:15 -08:00
Jeff Trull
d7bc64328b add display of entries to gdb sparse matrix prettyprinter 2016-12-08 18:50:17 -08:00
Jeff Trull
ff424927bc Introduce a simple pretty printer for sparse matrices (no contents) 2016-12-08 09:45:27 -08:00
Jeff Trull
5ce5418631 Correct prettyprinter comment - Quaternions are in fact supported 2016-12-08 07:31:16 -08:00
Rafael Guglielmetti
8f11df2667 NumTraits.h:
For the values 'ReadCost, AddCost and MulCost', information about value Eigen::HugeCost
2016-12-16 09:07:12 +00:00
Gael Guennebaud
7d5303a083 Partly revert changeset 642dddcce2
, just in case the x87 issue popup again
2016-12-16 09:25:14 +01:00
Benoit Steiner
2f7c2459b7 Merged in benoitsteiner/opencl (pull request PR-272)
Adding asynchandler to sycl queue as lack of it can cause undefined behaviour.
2016-12-15 17:46:40 -08:00
Mehdi Goli
c5e8546306 Adding asynchandler to sycl queue as lack of it can cause undefined behaviour. 2016-12-15 16:59:57 +00:00
Christoph Hertzberg
4247d35d4b Fixed bug which (extremely rarely) could end in an infinite loop 2016-12-15 17:22:12 +01:00
Christoph Hertzberg
642dddcce2 Fix nonnull-compare warning 2016-12-15 17:16:56 +01:00
Benoit Steiner
1324ffef2f Reenabled the use of constexpr on OpenCL devices 2016-12-15 06:49:38 -08:00
Gael Guennebaud
5d00fdf0e8 bug #1363: fix mingw's ABI issue 2016-12-15 11:58:31 +01:00
Benoit Steiner
2c2e218471 Avoid using #define since they can conflict with user code 2016-12-14 19:49:15 -08:00
Benoit Steiner
3beb180ee5 Don't call EnvThread::OnCancel by default since it doesn't do anything. 2016-12-14 18:33:39 -08:00
Benoit Steiner
9ff5d0f821 Merged eigen/eigen into default 2016-12-14 17:32:16 -08:00
Mehdi Goli
730eb9fe1c Adding asynchronous execution as it improves the performance. 2016-12-14 17:38:53 +00:00
Gael Guennebaud
11b492e993 bug #1358: fix compilation for sparse += sparse.selfadjointView(); 2016-12-14 17:53:47 +01:00
Gael Guennebaud
e67397bfa7 bug #1359: fix compilation of col_major_sparse.row() *= scalar
(used to work in 3.2.9 though the expression is not really writable)
2016-12-14 17:05:26 +01:00
Gael Guennebaud
98d7458275 bug #1359: fix sparse /=scalar and *=scalar implementation.
InnerIterators must be obtained from an evaluator.
2016-12-14 17:03:13 +01:00
Mehdi Goli
2d4a091beb Adding tensor contraction operation backend for Sycl; adding test for contractionOp sycl backend; adding temporary solution to prevent memory leak in buffer; cleaning up cxx11_tensor_buildins_sycl.h 2016-12-14 15:30:37 +00:00
Gael Guennebaud
c817ce3ba3 bug #1361: fix compilation issue in mat=perm.inverse() 2016-12-13 23:10:27 +01:00
Benoit Steiner
a432fc102d Moved the choice of ThreadPool to unsupported/Eigen/CXX11/ThreadPool 2016-12-12 15:24:16 -08:00
Benoit Steiner
8ae68924ed Made ThreadPoolInterface::Cancel() an optional functionality 2016-12-12 11:58:38 -08:00
Gael Guennebaud
57acb05eef Update and extend doc on alignment issues. 2016-12-11 22:45:32 +01:00
Benoit Steiner
76fca22134 Use a more accurate timer to sleep on Linux systems. 2016-12-09 15:12:24 -08:00
Benoit Steiner
4deafd35b7 Introduce a portable EIGEN_SLEEP macro. 2016-12-09 14:52:15 -08:00
Benoit Steiner
aafa97f4d2 Fixed build error with MSVC 2016-12-09 14:42:32 -08:00
Benoit Steiner
2f5b7a199b Reworked the threadpool cancellation mechanism to not depend on pthread_cancel since it turns out that pthread_cancel doesn't work properly on numerous platforms. 2016-12-09 13:05:14 -08:00
Benoit Steiner
3d59a47720 Added a message to ease the detection of platforms on which thread cancellation isn't supported. 2016-12-08 14:51:46 -08:00
Benoit Steiner
28ee8f42b2 Added a Flush method to the RunQueue 2016-12-08 14:07:56 -08:00
Benoit Steiner
69ef267a77 Added the new threadpool cancel method to the threadpool interface based class. 2016-12-08 14:03:25 -08:00
Benoit Steiner
7bfff85355 Added support for thread cancellation on Linux 2016-12-08 08:12:49 -08:00
Benoit Steiner
6811e6cf49 Merged in srvasude/eigen/fix_cuda_exp (pull request PR-268)
Fix expm1 CUDA implementation (do not shadow exp CUDA implementation).
2016-12-08 05:14:11 -08:00
Gael Guennebaud
747202d338 typo 2016-12-08 12:48:15 +01:00
Gael Guennebaud
bb297abb9e make sure we use the right eigen version 2016-12-08 12:00:11 +01:00
Gael Guennebaud
8b4b00d277 fix usage of custom compiler 2016-12-08 11:59:39 +01:00
Gael Guennebaud
7105596899 Add missing include and use -O3 2016-12-07 16:56:08 +01:00
Gael Guennebaud
780f3c1adf Fix call to convert on linux 2016-12-07 16:30:11 +01:00
Gael Guennebaud
3855ab472f Cleanup file structure 2016-12-07 14:23:49 +01:00
Gael Guennebaud
59a59fa8e7 Update perf monitoring scripts to generate html/svg outputs 2016-12-07 13:36:56 +01:00
Angelos Mantzaflaris
7694684992 Remove superfluous const's (can cause warnings on some Intel compilers)
(grafted from e236d3443c
)
2016-12-07 00:37:48 +01:00
Gael Guennebaud
f2c506b03d Add a script example to run and upload performance tests 2016-12-06 16:46:52 +01:00
Gael Guennebaud
1b4e085a7f generate png file for web upload 2016-12-06 16:46:22 +01:00
Gael Guennebaud
f725f1cebc Mention the CMAKE_PREFIX_PATH variable. 2016-12-06 15:23:45 +01:00
Gael Guennebaud
f90c4aebc5 Update monitored changeset lists 2016-12-06 15:07:46 +01:00
Gael Guennebaud
eb621413c1 Revert vec/y to vec*(1/y) in row-major TRSM:
- div is extremely costly
- this is consistent with the column-major case
- this is consistent with all other BLAS implementations
2016-12-06 15:04:50 +01:00
Gael Guennebaud
8365c2c941 Fix BLAS backend for symmetric rank K updates. 2016-12-06 14:47:09 +01:00
Gael Guennebaud
0c4d05b009 Explain how to choose your favorite Eigen version 2016-12-06 11:34:06 +01:00
Silvio Traversaro
e049a2a72a Added relocatable cmake support also for CMake before 3.0 and after 2.8.8 2016-12-06 10:37:34 +01:00
Srinivas Vasudevan
e6c8b5500c Change comparisons to use Scalar instead of RealScalar. 2016-12-05 14:01:45 -08:00
Srinivas Vasudevan
f7d7c33a28 Fix expm1 CUDA implementation (do not shadow exp CUDA implementation). 2016-12-05 12:19:01 -08:00
Silvio Traversaro
18481b518f Make CMake config file relocatable 2016-12-05 10:39:52 +01:00
Gael Guennebaud
c68c8631e7 fix compilation of BTL's blaze interface 2016-12-05 23:02:16 +01:00
Gael Guennebaud
1ff1d4a124 Add performance monitoring for LLT 2016-12-05 23:01:52 +01:00
Srinivas Vasudevan
09ee7f0c80 Fix small nit where I changed name of plog1p to pexpm1. 2016-12-02 15:30:12 -08:00
Srinivas Vasudevan
a0d3ac760f Sync from Head. 2016-12-02 14:14:45 -08:00
Srinivas Vasudevan
218764ee1f Added support for expm1 in Eigen. 2016-12-02 14:13:01 -08:00
Gael Guennebaud
66f65ccc36 Ease compiler job to generate clean and efficient code in mat*vec. 2016-12-02 22:41:26 +01:00
Gael Guennebaud
fe696022ec Operators += and -= do not resize! 2016-12-02 22:40:25 +01:00
Angelos Mantzaflaris
18de92329e use numext::abs
(grafted from 0a08d4c60b
)
2016-12-02 11:48:06 +01:00
Angelos Mantzaflaris
e8a6aa518e 1. Add explicit template to abs2 (resolves deduction for some arithmetic types)
2. Avoid signed-unsigned conversion in comparison (warning in case Scalar is unsigned)
(grafted from 4086187e49
)
2016-12-02 11:39:18 +01:00
Gael Guennebaud
a6b971e291 Fix memory leak in Ref<Sparse> 2016-12-05 16:59:30 +01:00
Gael Guennebaud
8640ffac65 Optimize SparseLU::solve for rhs vectors 2016-12-05 15:41:14 +01:00
Gael Guennebaud
62acd67903 remove temporary in SparseLU::solve 2016-12-05 15:11:57 +01:00
Gael Guennebaud
0db6d5b3f4 bug #1356: fix calls to evaluator::coeffRef(0,0) to get the address of the destination
by adding a dstDataPtr() member to the kernel. This fixes undefined behavior if dst is empty (nullptr).
2016-12-05 15:08:09 +01:00
Gael Guennebaud
91003f3b86 typo 2016-12-05 13:51:07 +01:00
Gael Guennebaud
445c015751 extend monitoring benchmarks with transpose matrix-vector and triangular matrix-vectors. 2016-12-05 13:36:26 +01:00
Gael Guennebaud
e3f613cbd4 Improve performance of row-major-dense-matrix * vector products for recent CPUs.
This revised version does not bother about aligned loads/stores,
and rather processes 8 rows at ones for better instruction pipelining.
2016-12-05 13:02:01 +01:00
Gael Guennebaud
3abc827354 Clean debugging code 2016-12-05 12:59:32 +01:00
Benoit Steiner
462c28e77a Merged in srvasude/eigen (pull request PR-265)
Add Expm1 support to Eigen.
2016-12-05 02:31:11 +00:00
Gael Guennebaud
4465d20403 Add missing generic load methods. 2016-12-03 21:25:04 +01:00
Gael Guennebaud
6a5fe86098 Complete rewrite of column-major-matrix * vector product to deliver higher performance of modern CPU.
The previous code has been optimized for Intel core2 for which unaligned loads/stores were prohibitively expensive.
This new version exhibits much higher instruction independence (better pipelining) and explicitly leverage FMA.
According to my benchmark, on Haswell this new kernel is always faster than the previous one, and sometimes even twice as fast.
Even higher performance could be achieved with a better blocking size heuristic and, perhaps, with explicit prefetching.
We should also check triangular product/solve to optimally exploit this new kernel (working on vertical panel of 4 columns is probably not optimal anymore).
2016-12-03 21:14:14 +01:00
Benoit Steiner
2bfece5cd1 Merged eigen/eigen into default 2016-12-02 16:30:14 -08:00
Mehdi Goli
592acc5bfa Makingt default numeric_list works with sycl. 2016-12-02 17:58:30 +00:00
Gael Guennebaud
8dfb3e00b8 merge 2016-12-02 11:34:21 +01:00
Gael Guennebaud
4c0d5f3c01 Add perf monitoring for gemv 2016-12-02 11:34:12 +01:00
Gael Guennebaud
d2718d662c Re-enable A^T*A action in BTL 2016-12-02 11:32:03 +01:00
Christoph Hertzberg
22f7d398e2 bug #1355: Fixed wrong line-endings on two files 2016-12-02 11:22:05 +01:00
Gael Guennebaud
27873008d4 Clean up SparseCore module regarding ReverseInnerIterator 2016-12-01 21:55:10 +01:00
Angelos Mantzaflaris
8c24723a09 typo UIntPtr
(grafted from b6f04a2dd4
)
2016-12-01 21:25:58 +01:00
Angelos Mantzaflaris
aeba0d8655 fix two warnings(unused typedef, unused variable) and a typo
(grafted from a9aa3bcf50
)
2016-12-01 21:23:43 +01:00
Gael Guennebaud
181138a1cb fix member order 2016-12-01 17:06:20 +01:00
Gael Guennebaud
9f297d57ae Merged in rmlarsen/eigen (pull request PR-256)
Add a default constructor for the "fake" __half class when not using the __half class provided by CUDA.
2016-12-01 15:27:33 +00:00
Gael Guennebaud
f95e3b84a5 merge 2016-12-01 16:18:57 +01:00
Benoit Steiner
7ff26ddcbb Merged eigen/eigen into default 2016-12-01 07:13:17 -08:00
Gael Guennebaud
037b46762d Fix misleading-indentation warnings. 2016-12-01 16:05:42 +01:00
Mehdi Goli
79aa2b784e Adding sycl backend for TensorPadding.h; disbaling __unit128 for sycl in TensorIntDiv.h; disabling cashsize for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP ; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code. 2016-12-01 13:02:27 +00:00
Benoit Steiner
a70393fd02 Cleaned up forward declarations 2016-11-30 21:59:07 -08:00
Benoit Steiner
e073de96dc Moved the MemCopyFunctor back to TensorSyclDevice since it's the only caller and it makes TensorFlow compile again 2016-11-30 21:36:52 -08:00
Benoit Steiner
fca27350eb Added the deallocate_all() method back 2016-11-30 20:45:20 -08:00
Benoit Steiner
e633a8371f Simplified includes 2016-11-30 20:21:18 -08:00
Benoit Steiner
7cd33df4ce Improved formatting 2016-11-30 20:20:44 -08:00
Benoit Steiner
fd1dc3363e Merged eigen/eigen into default 2016-11-30 20:16:17 -08:00
Benoit Steiner
f5107010ee Udated the Sizes class to work on AMD gpus without requiring a separate implementation 2016-11-30 19:57:28 -08:00
Benoit Steiner
e37c2c52d3 Added an implementation of numeric_list that works with sycl 2016-11-30 19:55:15 -08:00
Gael Guennebaud
8df272af88 Fix slection of product implementation for dynamic size matrices with fixed max size. 2016-11-30 22:21:33 +01:00
Benoit Steiner
faa2ff99c6 Pulled latest update from trunk 2016-11-30 09:31:24 -08:00
Benoit Steiner
df3da0780d Updated customIndices2Array to handle various index sizes. 2016-11-30 09:30:12 -08:00
Gael Guennebaud
c927af60ed Fix a performance regression in (mat*mat)*vec for which mat*mat was evaluated multiple times. 2016-11-30 17:59:13 +01:00
Luke Iwanski
26fff1c5b1 Added EIGEN_STRONG_INLINE to get_sycl_supported_device(). 2016-11-30 16:55:22 +00:00
Gael Guennebaud
ab4ef5e66e bug #1351: fix compilation of random with old compilers 2016-11-30 17:37:53 +01:00
Sergiu Deitsch
5e3c5c42f6 cmake: remove architecture dependency from Eigen3ConfigVersion.cmake
Also, install Eigen3*.cmake under $prefix/share/eigen3/cmake by default.
(grafted from 86ab00cdcf
)
2016-11-30 15:46:46 +01:00
Sergiu Deitsch
3440b46e2f doc: mention the NO_MODULE option and target availability
(grafted from 65f09be8d2
)
2016-11-30 15:41:38 +01:00
Rasmus Munk Larsen
a0329f64fb Add a default constructor for the "fake" __half class when not using the
__half class provided by CUDA.
2016-11-29 13:18:09 -08:00
Mehdi Goli
577ce78085 Adding TensorShuffling backend for sycl; adding TensorReshaping backend for sycl; cleaning up the sycl backend. 2016-11-29 15:30:42 +00:00
Benoit Steiner
3011dc94ef Call internal::array_prod to compute the total size of the tensor. 2016-11-28 09:00:31 -08:00
Benoit Steiner
02080e2b67 Merged eigen/eigen into default 2016-11-27 07:27:30 -08:00
Benoit Steiner
9fd081cddc Fixed compilation warnings 2016-11-26 20:22:25 -08:00
Benoit Steiner
9f8fbd9434 Merged eigen/eigen into default 2016-11-26 11:28:25 -08:00
Benoit Steiner
67b2c41f30 Avoided unnecessary type conversion 2016-11-26 11:27:29 -08:00
Benoit Steiner
7fe704596a Added missing array_get method for numeric_list 2016-11-26 11:26:07 -08:00
Mehdi Goli
7318daf887 Fixing LLVM error on TensorMorphingSycl.h on GPU; fixing int64_t crash for tensor_broadcast_sycl on GPU; adding get_sycl_supported_devices() on syclDevice.h. 2016-11-25 16:19:07 +00:00
Benoit Steiner
7ad37606dd Fixed the documentation of Scalar Tensors 2016-11-24 12:31:43 -08:00
Benoit Steiner
3be1afca11 Disabled the "remove the call to 'std::abs' since unsigned values cannot be negative" warning introduced in clang 3.5 2016-11-23 18:49:51 -08:00
Gael Guennebaud
308961c05e Fix compilation. 2016-11-23 22:17:52 +01:00
Gael Guennebaud
21d0286d81 bug #1348: Document EIGEN_MAX_ALIGN_BYTES and EIGEN_MAX_STATIC_ALIGN_BYTES,
and reflect in the doc that EIGEN_DONT_ALIGN* are deprecated.
2016-11-23 22:15:03 +01:00
Mehdi Goli
b8cc5635d5 Removing unsupported device from test case; cleaning the tensor device sycl. 2016-11-23 16:30:41 +00:00
Gael Guennebaud
7f6333c32b Merged in tal500/eigen-eulerangles (pull request PR-237)
Euler angles
2016-11-23 15:17:38 +00:00
Gael Guennebaud
f12b368417 Extend polynomial solver unit tests to complexes 2016-11-23 16:05:45 +01:00
Gael Guennebaud
56e5ec07c6 Automatically switch between EigenSolver and ComplexEigenSolver, and fix a few Real versus Scalar issues. 2016-11-23 16:05:10 +01:00
Gael Guennebaud
9246587122 Patch from Oleg Shirokobrod to extend polynomial solver to complexes 2016-11-23 15:42:26 +01:00
Gael Guennebaud
e340866c81 Fix compilation with gcc and old ABI version 2016-11-23 14:04:57 +01:00
Gael Guennebaud
a91de27e98 Fix compilation issue with MSVC:
MSVC always messes up with shadowed template arguments, for instance in:
  struct B { typedef float T; }
  template<typename T> struct A : B {
    T g;
  };
The type of A<double>::g will be float and not double.
2016-11-23 12:24:48 +01:00
Gael Guennebaud
74637fa4e3 Optimize predux<Packet8f> (AVX) 2016-11-22 21:57:52 +01:00
Gael Guennebaud
178c084856 Disable usage of SSE3 _mm_hadd_ps that is extremely slow. 2016-11-22 21:53:14 +01:00
Gael Guennebaud
7dd894e40e Optimize predux<Packet4d> (AVX) 2016-11-22 21:41:30 +01:00
Gael Guennebaud
f3fb0a1940 Disable usage of SSE3 haddpd that is extremely slow. 2016-11-22 16:58:31 +01:00
Sergiu Deitsch
5c516e4e0a cmake: added Eigen3::Eigen imported target
(grafted from a287140f72
)
2016-11-22 12:25:06 +01:00
Gael Guennebaud
6a84246a6a Fix regression in assigment of sparse block to spasre block. 2016-11-21 21:46:42 +01:00
Benoit Steiner
f11da1d83b Made the QueueInterface thread safe 2016-11-20 13:17:08 -08:00
Benoit Steiner
ed839c5851 Enable the use of constant expressions with clang >= 3.6 2016-11-20 10:34:49 -08:00
Benoit Steiner
6d781e3e52 Merged eigen/eigen into default 2016-11-20 10:12:54 -08:00
Benoit Steiner
79a07b891b Fixed a typo 2016-11-20 07:07:41 -08:00
Gael Guennebaud
465ede0f20 Fix compilation issue in mat = permutation (regression introduced in 8193ffb3d3
)
2016-11-20 09:41:37 +01:00
Benoit Steiner
81151bd474 Fixed merge conflicts 2016-11-19 19:12:59 -08:00
Benoit Steiner
9265ca707e Made it possible to check the state of a sycl device without synchronization 2016-11-19 10:56:24 -08:00
Benoit Steiner
2d1aec15a7 Added missing include 2016-11-19 08:09:54 -08:00
Luke Iwanski
af67335e0e Added test for cwiseMin, cwiseMax and operator%. 2016-11-19 13:37:27 +00:00
Benoit Steiner
1bdf1b9ce0 Merged in benoitsteiner/opencl (pull request PR-253)
OpenCL improvements
2016-11-19 04:44:43 +00:00
Benoit Steiner
a357fe1fb9 Code cleanup 2016-11-18 16:58:09 -08:00
Benoit Steiner
1c6eafb46b Updated cxx11_tensor_device_sycl to run only on the OpenCL devices available on the host 2016-11-18 16:43:27 -08:00
Benoit Steiner
ca754caa23 Only runs the cxx11_tensor_reduction_sycl on devices that are available. 2016-11-18 16:31:14 -08:00
Benoit Steiner
dc601d79d1 Added the ability to run test exclusively OpenCL devices that are listed by sycl::device::get_devices(). 2016-11-18 16:26:50 -08:00
Benoit Steiner
8649e16c2a Enable EIGEN_HAS_C99_MATH when building with the latest version of Visual Studio 2016-11-18 14:18:34 -08:00
Benoit Steiner
110b7f8d9f Deleted unnecessary semicolons 2016-11-18 14:06:17 -08:00
Benoit Steiner
b5e3285e16 Test broadcasting on OpenCL devices with 64 bit indexing 2016-11-18 13:44:20 -08:00
Gael Guennebaud
164414c563 Merged in ChunW/eigen (pull request PR-252)
Workaround for error in VS2012 with /clr
2016-11-18 21:07:29 +00:00
Benoit Steiner
37c2c516a6 Cleaned up the sycl device code 2016-11-18 12:38:06 -08:00
Benoit Steiner
7335c49204 Fixed the cxx11_tensor_device_sycl test 2016-11-18 12:37:13 -08:00
Mehdi Goli
15e226d7d3 adding Benoit changes on the TensorDeviceSycl.h 2016-11-18 16:34:54 +00:00
Mehdi Goli
622805a0c5 Modifying TensorDeviceSycl.h to always create buffer of type uint8_t and convert them to the actual type at the execution on the device; adding the queue interface class to separate the lifespan of sycl queue and buffers,created for that queue, from Eigen::SyclDevice; modifying sycl tests to support the evaluation of the results for both row major and column major data layout on all different devices that are supported by Sycl{CPU; GPU; and Host}. 2016-11-18 16:20:42 +00:00
Luke Iwanski
5159675c33 Added isnan, isfinite and isinf for SYCL device. Plus test for that. 2016-11-18 16:01:48 +00:00
Tal Hadad
76b2a3e6e7 Allow to construct EulerAngles from 3D vector directly.
Using assignment template struct to distinguish between 3D vector and 3D rotation matrix.
2016-11-18 15:01:06 +02:00
Luke Iwanski
927bd62d2a Now testing out (+=, =) in.FUNC() and out (+=, =) out.FUNC() 2016-11-18 11:16:42 +00:00
Gael Guennebaud
8193ffb3d3 bug #1343: fix compilation regression in mat+=selfadjoint_view.
Generic EigenBase2EigenBase assignment was incomplete.
2016-11-18 10:17:34 +01:00
Gael Guennebaud
cebff7e3a2 bug #1343: fix compilation regression in array = matrix_product 2016-11-18 10:09:33 +01:00
Benoit Steiner
7c30078b9f Merged eigen/eigen into default 2016-11-17 22:53:37 -08:00
Benoit Steiner
553f50b246 Added a way to detect errors generated by the opencl device from the host 2016-11-17 21:51:48 -08:00
Benoit Steiner
72a45d32e9 Cleanup 2016-11-17 21:29:15 -08:00
Benoit Steiner
4349fc640e Created a test to check that the sycl runtime can successfully report errors (like ivision by 0).
Small cleanup
2016-11-17 20:27:54 -08:00
Benoit Steiner
a6a3fd0703 Made TensorDeviceCuda.h compile on windows 2016-11-17 16:15:27 -08:00
Chun Wang
0d0948c3b9 Workaround for error in VS2012 with /clr 2016-11-17 17:54:27 -05:00
Benoit Steiner
004344cf54 Avoid calling log(0) or 1/0 2016-11-17 11:56:44 -08:00
Konstantinos Margaritis
a1d5c503fa replace sizeof(Packet) with PacketSize else it breaks for ZVector.Packet4f 2016-11-17 13:27:45 -05:00
Konstantinos Margaritis
672aa97d4d implement float/std::complex<float> for ZVector as well, minor fixes to ZVector 2016-11-17 13:27:33 -05:00
Konstantinos Margaritis
8290e21fb5 replace sizeof(Packet) with PacketSize else it breaks for ZVector.Packet4f 2016-11-17 13:21:15 -05:00
Luke Iwanski
7878756dea Fixed existing test. 2016-11-17 17:46:55 +00:00
Luke Iwanski
c5130dedbe Specialised basic math functions for SYCL device. 2016-11-17 11:47:13 +00:00
Benoit Steiner
f2e8b73256 Enable the use of AVX512 instruction by default 2016-11-16 21:28:04 -08:00
Gael Guennebaud
7b09e4dd8c bump default branch to 3.3.90 2016-11-16 22:20:58 +01:00
Benoit Steiner
dff9a049c4 Optimized the computation of exp, sqrt, ceil anf floor for fp16 on Pascal GPUs 2016-11-16 09:01:51 -08:00
Benoit Steiner
b5c75351e3 Merged eigen/eigen into default 2016-11-14 15:54:44 -08:00
Rasmus Munk Larsen
32df1b1046 Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the scedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics). 2016-11-14 14:18:16 -08:00
Mehdi Goli
05e8c2a1d9 Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl. 2016-11-14 18:13:53 +00:00
Mehdi Goli
f8ca893976 Adding TensorFixsize; adding sycl device memcpy; adding insial stage of slicing. 2016-11-14 17:51:57 +00:00
Gael Guennebaud
0ee92aa38e Optimize sparse<bool> && sparse<bool> to use the same path as for coeff-wise products. 2016-11-14 18:47:41 +01:00
Gael Guennebaud
2e334f5da0 bug #426: move operator && and || to MatrixBase and SparseMatrixBase. 2016-11-14 18:47:02 +01:00
Gael Guennebaud
a048aba14c Merged in olesalscheider/eigen (pull request PR-248)
Make sure not to call numext::maxi on expression templates
2016-11-14 13:25:53 +00:00
Gael Guennebaud
eedb87f4ba Fix regression in SparseMatrix::ReverseInnerIterator 2016-11-14 14:05:53 +01:00
Niels Ole Salscheider
51fef87408 Make sure not to call numext::maxi on expression templates 2016-11-12 12:20:57 +01:00
Mehdi Goli
a5c3f15682 Adding comment to TensorDeviceSycl.h and cleaning the code. 2016-11-11 19:06:34 +00:00
Benoit Steiner
f4722aa479 Merged in benoitsteiner/opencl (pull request PR-247) 2016-11-11 00:01:28 +00:00
Mehdi Goli
3be3963021 Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor. 2016-11-10 19:16:31 +00:00
Mehdi Goli
12387abad5 adding the missing in eigen_assert! 2016-11-10 18:58:08 +00:00
Mehdi Goli
2e704d4257 Adding Memset; optimising MecopyDeviceToHost by removing double copying; 2016-11-10 18:45:12 +00:00
Gael Guennebaud
eeac81b8c0 bump to 3.3.0 2016-11-10 13:55:14 +01:00
Gael Guennebaud
e80bc2ddb0 Fix printing of sparse expressions 2016-11-10 10:35:32 +01:00
Benoit Steiner
75c080b176 Added a test to validate memory transfers between host and sycl device 2016-11-09 06:23:42 -08:00
Benoit Steiner
db3903498d Merged in benoitsteiner/opencl (pull request PR-246)
Improved support for OpenCL
2016-11-08 22:28:44 +00:00
Benoit Steiner
dcc14bee64 Fixed the formatting of the code 2016-11-08 14:24:46 -08:00
Benoit Steiner
b88c1117d4 Fixed the indentation of the cmake file 2016-11-08 14:22:36 -08:00
Luke Iwanski
912cb3d660 #if EIGEN_EXCEPTION -> #ifdef EIGEN_EXCEPTIONS. 2016-11-08 22:01:14 +00:00
Luke Iwanski
1b345b0895 Fix for SYCL queue initialisation. 2016-11-08 21:56:31 +00:00
Luke Iwanski
1b95717358 Use try/catch only when exceptions are enabled. 2016-11-08 21:08:53 +00:00
Mehdi Goli
d57430dd73 Converting all sycl buffers to uninitialised device only buffers; adding memcpyHostToDevice and memcpyDeviceToHost on syclDevice; modifying all examples to obey the new rules; moving sycl queue creating to the device based on Benoit suggestion; removing the sycl specefic condition for returning m_result in TensorReduction.h according to Benoit suggestion. 2016-11-08 17:08:02 +00:00
Gael Guennebaud
73985ead27 Extend unit test to check sparse solvers with a SparseVector as the rhs and result. 2016-11-06 20:29:57 +01:00
Gael Guennebaud
436a111792 Generalize Cholmod support to hanlde any sparse type as the rhs and result of the solve method 2016-11-06 20:29:23 +01:00
Gael Guennebaud
afc55b1885 Generalize IterativeSolverBase::solve to hanlde any sparse type as the results (instead of SparseMatrix only) 2016-11-06 20:28:18 +01:00
Gael Guennebaud
a5c2d8a3cc Generalize solve_sparse_through_dense_panels to handle SparseVector. 2016-11-06 15:20:58 +01:00
Gael Guennebaud
f8bfe10613 Add missing friend declaration 2016-11-06 15:20:30 +01:00
Gael Guennebaud
fc7180cda8 Add a default ctor to evaluator<SparseVector>.
Needed for evaluator<Solve>.
2016-11-06 15:20:00 +01:00
Gael Guennebaud
4d226ab5b5 Enable swapping between SparseMatrix and SparseVector 2016-11-06 15:15:03 +01:00
Benoit Steiner
ad086b03e4 Removed unnecessary statement 2016-11-05 12:43:27 -07:00
Benoit Steiner
dad177be01 Added missing includes 2016-11-05 10:04:42 -07:00
Gael Guennebaud
55b4fd1d40 Extend mpreal unit test to check LLT with complexes. 2016-11-05 11:28:53 +01:00
Gael Guennebaud
a354c3ca59 Fix compilation of LLT with complex<mpreal>. 2016-11-05 11:28:29 +01:00
Benoit Steiner
d46a36cc84 Merged eigen/eigen into default 2016-11-04 18:22:55 -07:00
Mehdi Goli
0ebe3808ca Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size; 2016-11-04 18:18:19 +00:00
Gael Guennebaud
47d1b4a609 Added tag 3.3-rc2 for changeset ba05572dcb 2016-11-04 09:09:18 +01:00
Gael Guennebaud
ba05572dcb bump to 3.3-rc2 2016-11-04 09:09:06 +01:00
Benoit Steiner
5c3995769c Improved AVX512 configuration 2016-11-03 04:50:28 -07:00
Benoit Steiner
fbe672d599 Reenable the generation of dynamic blas libraries. 2016-11-03 04:08:43 -07:00
Benoit Steiner
ca0ba0d9a4 Improved AVX512 support 2016-11-03 04:00:49 -07:00
Benoit Steiner
c80587c92b Merged eigen/eigen into default 2016-11-03 03:55:11 -07:00
Gael Guennebaud
3f1d0cdc22 bug #1337: improve doc of homogeneous() and hnormalized() 2016-11-03 11:03:08 +01:00
Gael Guennebaud
78e93ac1ad bug #1330: Cholmod supports double precision only, so let's trigger a static assertion if the scalar type does not match this requirement. 2016-11-03 10:21:59 +01:00
Benoit Steiner
3e37166d0b Merged in benoitsteiner/opencl (pull request PR-244)
Disable vectorization on device only when compiling for sycl
2016-11-02 22:01:03 +00:00
Benoit Steiner
0585b2965d Disable vectorization on device only when compiling for sycl 2016-11-02 11:44:27 -07:00
Benoit Steiner
e6e77ed08b Don't call lgamma_r when compiling for an Apple device, since the function isn't available on MacOS 2016-11-02 09:55:39 -07:00
Benoit Steiner
b238f387b4 Pulled latest updates from trunk 2016-11-02 08:53:13 -07:00
Benoit Steiner
c8db17301e Special functions require math.h: make sure it is included. 2016-11-02 08:51:52 -07:00
Gael Guennebaud
a07bb428df bug #1004: improve accuracy of LinSpaced for abs(low) >> abs(high). 2016-11-02 11:34:38 +01:00
Gael Guennebaud
598de8b193 Add pinsertfirst function and implement pinsertlast for complex on SSE/AVX. 2016-11-02 10:38:13 +01:00
Benoit Steiner
e44519744e Merged in benoitsteiner/opencl (pull request PR-243)
Fixed the ambiguity in callig make_tuple for sycl backend.
2016-11-02 02:56:58 +00:00
Rasmus Munk Larsen
0a6ae41555 Merged eigen/eigen into default 2016-11-01 15:37:00 -07:00
Rasmus Munk Larsen
b730952414 Don't attempts to use lgamma_r for CUDA devices.
Fix type in lgamma_impl<double>.
2016-11-01 15:34:19 -07:00
Benoit Steiner
7a0e96b80d Gate the code that refers to cuda fp16 primitives more thoroughly 2016-11-01 12:08:09 -07:00
Mehdi Goli
51af6ae971 Fixed the ambiguity in callig make_tuple for sycl backend. 2016-10-31 16:35:51 +00:00
Benoit Steiner
0a9ad6fc72 Worked around Visual Studio compilation errors 2016-10-28 07:54:27 -07:00
Benoit Steiner
d5f88e2357 Sharded the tensor_image_patch test to help it run on low power devices 2016-10-27 21:48:21 -07:00
Benoit Steiner
0b4b0f11e8 Fixed a few more compilation warnings 2016-10-28 04:01:01 +00:00
Benoit Steiner
306daa24a3 Fixed a compilation warning 2016-10-28 03:50:31 +00:00
Benoit Steiner
8471cf1996 Fixed compilation warning 2016-10-28 03:46:08 +00:00
Benoit Steiner
b0c5bfdf78 Added missing template parameters 2016-10-28 03:43:41 +00:00
Rasmus Munk Larsen
2ebb314fa7 Use threadsafe versions of lgamma and lgammaf if possible. 2016-10-27 16:17:12 -07:00
Gael Guennebaud
530f20c21a Workaround MSVC issue. 2016-10-27 21:51:37 +02:00
Gael Guennebaud
c3ce4f9ac0 Merged in enricodetoma/eigen (pull request PR-241)
Always enable /bigobj for tests to avoid a compile error in MSVC 2015
2016-10-27 19:21:28 +00:00
Benoit Steiner
7d64e6752c Pulled latest updates from trunk 2016-10-26 18:48:06 -07:00
Benoit Steiner
0a4c4d40b4 Removed a template parameter for fixed sized tensors 2016-10-26 18:47:37 -07:00
Gael Guennebaud
3ecb343dc3 Fix regression in X = (X*X.transpose())/s with X rectangular by deferring resizing of the destination after the creation of the evaluator of the source expression. 2016-10-26 22:50:41 +02:00
enrico.detoma
6ed571744b Always enable /bigobj for tests to avoid a compile error in MSVC 2015 2016-10-26 22:48:46 +02:00
Gael Guennebaud
97feea9d39 add a generic EIGEN_HAS_CXX11 2016-10-26 15:53:13 +02:00
Gael Guennebaud
ca6a2a5248 Fix warning with ICC 2016-10-26 14:13:05 +02:00
Benoit Steiner
5f2dd503ff Replaced tabs with spaces 2016-10-25 20:40:58 -07:00
Benoit Steiner
1644bafe29 Code cleanup 2016-10-25 20:36:14 -07:00
Gael Guennebaud
b15a5dc3f4 Fix ICC warnings 2016-10-25 22:20:24 +02:00
Gael Guennebaud
aad72f3c6d Add missing inline keywords 2016-10-25 20:20:09 +02:00
Benoit Steiner
3e194a6a73 Fixed a typo 2016-10-25 08:42:15 -07:00
Gael Guennebaud
58146be99b bug #1004: one more rewrite of LinSpaced for floating point numbers to guarantee both interpolation and monotonicity.
This version simply does low+i*step plus a branch to return high if i==size-1.
Vectorization is accomplished with a branch and the help of pinsertlast.
Some quick benchmark revealed that the overhead is really marginal, even when filling small vectors.
2016-10-25 16:53:09 +02:00
Gael Guennebaud
13fc18d3a2 Add a pinsertlast function replacing the last entry of a packet by a scalar.
(useful to vectorize LinSpaced)
2016-10-25 16:48:49 +02:00
Gael Guennebaud
2634f9386c bug #1333: fix bad usage of const_cast_derived. Better use .data() for that purpose. 2016-10-24 22:22:35 +02:00
Gael Guennebaud
9e8f07d7b5 Cleanup ArrayWrapper and MatrixWrapper by removing redundant accessors. 2016-10-24 22:16:48 +02:00
Gael Guennebaud
b027d7a8cf bug #1004: remove the inaccurate "sequential" path for LinSpaced, mark respective function as deprecated, and enforce strict interpolation of the higher range using a correction term.
Now, even with floating point precision, both the 'low' and 'high' bounds are exactly reproduced at i=0 and i=size-1 respectively.
2016-10-24 20:27:21 +02:00
Benoit Steiner
b11aab5fcc Merged in benoitsteiner/opencl (pull request PR-238)
Added support for OpenCL to the Tensor Module
2016-10-24 15:30:45 +00:00
Gael Guennebaud
53c77061f0 bug #698: rewrite LinSpaced for integer scalar types to avoid overflow and guarantee an even spacing when possible.
Otherwise, the "high" bound is implicitly lowered to the largest value allowing for an even distribution.
This changeset also disable vectorization for this integer path.
2016-10-24 15:50:27 +02:00
Gael Guennebaud
e8e56c7642 Add unit test for overflow in LinSpaced 2016-10-24 15:43:51 +02:00
Gael Guennebaud
40f62974b7 bug #1328: workaround a compilation issue with gcc 4.2 2016-10-20 19:19:37 +02:00
Benoit Steiner
cf20b30d65 Merge latest updates from trunk 2016-10-20 09:42:05 -07:00
Luke Iwanski
03b63e182c Added SYCL include in Tensor. 2016-10-20 15:32:44 +01:00
Benoit Steiner
d3943cd50c Fixed a few typos in the ternary tensor expressions types 2016-10-19 12:56:12 -07:00
Tal Hadad
15eca2432a Euler tests: Tighter precision when no roll exists and clean code. 2016-10-18 23:24:57 +03:00
Tal Hadad
6f4f12d1ed Add isApprox() and cast() functions.
test cases included
2016-10-17 22:23:47 +03:00
Tal Hadad
7402cfd4cc Add safty for near pole cases and test them better. 2016-10-17 20:42:08 +03:00
Mehdi Goli
8fb162fc85 Fixing the typo regarding missing #if needed for proper handling of exceptions in Eigen/Core. 2016-10-16 12:52:34 +01:00
Tal Hadad
58f5d7d058 Fix calc bug, docs and better testing.
Test code changes:
* better coded
* rand and manual numbers
* singularity checking
2016-10-16 14:39:26 +03:00
Mehdi Goli
e36cb91c99 Fixing the code indentation in the TensorReduction.h file. 2016-10-14 18:03:00 +01:00
Luke Iwanski
2e188dd4d4 Merged ComputeCpp to default. 2016-10-14 16:47:40 +01:00
Mehdi Goli
15380f9a87 Applyiing Benoit's comment to return the missing line back in Eigen/Core 2016-10-14 16:39:41 +01:00
Gael Guennebaud
692b30ca95 Fix previous merge. 2016-10-14 17:16:28 +02:00
Gael Guennebaud
050c681bdd Merged in rmlarsen/eigen2 (pull request PR-232)
Improve performance of parallelized matrix multiply for rectangular matrices
2016-10-14 14:51:09 +00:00
Tal Hadad
078a202621 Merge Hongkai Dai correct range calculation, and remove ranges from API.
Docs updated.
2016-10-14 16:03:28 +03:00
Luke Iwanski
e742da8b28 Merged ComputeCpp into default. 2016-10-14 13:36:51 +01:00
Mehdi Goli
524fa4c46f Reducing the code by generalising sycl backend functions/structs. 2016-10-14 12:09:55 +01:00
Hongkai Dai
014d9f1d9b implement euler angles with the right ranges 2016-10-13 14:45:51 -07:00
Benoit Steiner
737e4152c3 Merged in lukier/eigen (pull request PR-234)
Enabling CUDA in Geometry
2016-10-13 18:09:28 +00:00
Benoit Steiner
d0ee2267d6 Relaxed the resizing checks so that they don't fail with gcc >= 5.3 2016-10-13 10:59:46 -07:00
Robert Lukierski
a94791b69a Fixes for min and abs after Benoit's comments, switched to numext. 2016-10-13 15:00:22 +01:00
Avi Ginsburg
ac63d6891c Patch to allow VS2015 & CUDA 8.0 to compile with Eigen included. I'm not sure
whether to limit the check to this compiler combination
(` || (EIGEN_COMP_MSVC == 1900 &&  __CUDACC_VER__) `)
or to leave it as it is. I also don't know if this will have any affect on
including Eigen in device code (I'm not in my current project).
2016-10-13 08:47:32 +00:00
Benoit Steiner
7e4a6754b2 Merged eigen/eigen into default 2016-10-12 22:42:33 -07:00
Benoit Steiner
38b6048e14 Deleted redundant implementation of predux 2016-10-12 14:37:56 -07:00
Gael Guennebaud
e74612b9a0 Remove double ;; 2016-10-12 22:49:47 +02:00
Benoit Steiner
78d2926508 Merged eigen/eigen into default 2016-10-12 13:46:29 -07:00
Benoit Steiner
2e2f48e30e Take advantage of AVX512 instructions whenever possible to speedup the processing of 16 bit floats. 2016-10-12 13:45:39 -07:00
Gael Guennebaud
f939c351cb Fix SPQR for rectangular matrices 2016-10-12 22:39:33 +02:00
Gael Guennebaud
091d373ee9 Fix outer-stride. 2016-10-12 21:47:52 +02:00
Robert Lukierski
471075f7ad Fixes min() warnings. 2016-10-12 18:59:05 +01:00
Gael Guennebaud
5c366fe1d7 Merged in rmlarsen/eigen (pull request PR-230)
Fix a bug in psqrt for SSE and AVX when EIGEN_FAST_MATH=1
2016-10-12 16:30:51 +00:00
Robert Lukierski
86711497c4 Adding EIGEN_DEVICE_FUNC in the Geometry module.
Additional CUDA necessary fixes in the Core (mostly usage of
EIGEN_USING_STD_MATH).
2016-10-12 16:35:17 +01:00
Rasmus Munk Larsen
47150af1c8 Fix copy-paste error: Must use _mm256_cmp_ps for AVX. 2016-10-12 08:34:39 -07:00
Gael Guennebaud
89e315152c bug #1325: fix compilation on NEON with clang 2016-10-12 16:55:47 +02:00
Benoit Steiner
7f0599b6eb Manually define int16_t and uint16_t when compiling with Visual Studio 2016-10-08 22:56:32 -07:00
Benoit Steiner
5727e4d89c Reenabled the use of variadic templates on tegra x1 provides that the latest version (i.e. JetPack 2.3) is used. 2016-10-08 22:19:03 +00:00
Benoit Steiner
5266ff8966 Cleaned up a regression test 2016-10-08 19:12:44 +00:00
Benoit Steiner
5c68051cd7 Merge the content of the ComputeCpp branch into the default branch 2016-10-07 11:04:16 -07:00
Gael Guennebaud
4860727ac2 Remove static qualifier of free-functions (inline is enough and this helps ICC to find the right overload) 2016-10-07 09:21:12 +02:00
Benoit Steiner
507b661106 Renamed predux_half into predux_downto4 2016-10-06 17:57:04 -07:00
Benoit Steiner
a498ff7df6 Fixed incorrect comment 2016-10-06 15:27:27 -07:00
Benoit Steiner
8ba3c41fcf Revergted unecessary change 2016-10-06 15:12:15 -07:00
Benoit Steiner
a7473d6d5a Fixed compilation error with gcc >= 5.3 2016-10-06 14:33:22 -07:00
Benoit Steiner
5e64cea896 Silenced a compilation warning 2016-10-06 14:24:17 -07:00
Benoit Steiner
33fba3f08d Merged in rryan/eigen/tensorfunctors (pull request PR-233)
Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*.
2016-10-06 12:29:19 -07:00
RJ Ryan
bfc264abe8 Add a test that GPU complex product reductions match CPU reductions. 2016-10-06 11:10:14 -07:00
RJ Ryan
e2e9cdd169 Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*. 2016-10-06 10:49:48 -07:00
Benoit Steiner
d485d12c51 Added missing AVX intrinsics for fp16: in particular, implemented predux which is required by the matrix-vector code. 2016-10-06 10:41:03 -07:00
Rasmus Munk Larsen
48c635e223 Add a simple cost model to prevent Eigen's parallel GEMM from using too many threads when the inner dimension is small.
Timing for square matrices is unchanged, but both CPU and Wall time are significantly improved for skinny matrices. The benchmarks below are for multiplying NxK * KxN matrices with test names of the form BM_OuterishProd/N/K.

Improvements in Wall time:

Run on [redacted] (12 X 3501 MHz CPUs); 2016-10-05T17:40:02.462497196-07:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_OuterishProd/64/1                    3088      1610    +47.9%
BM_OuterishProd/64/4                    3562      2414    +32.2%
BM_OuterishProd/64/32                   8861      7815    +11.8%
BM_OuterishProd/128/1                  11363      6504    +42.8%
BM_OuterishProd/128/4                  11128      9794    +12.0%
BM_OuterishProd/128/64                 27691     27396     +1.1%
BM_OuterishProd/256/1                  33214     28123    +15.3%
BM_OuterishProd/256/4                  34312     36818     -7.3%
BM_OuterishProd/256/128               174866    176398     -0.9%
BM_OuterishProd/512/1                7963684    104224    +98.7%
BM_OuterishProd/512/4                7987913    112867    +98.6%
BM_OuterishProd/512/256              8198378   1306500    +84.1%
BM_OuterishProd/1k/1                 7356256    324432    +95.6%
BM_OuterishProd/1k/4                 8129616    331621    +95.9%
BM_OuterishProd/1k/512              27265418   7517538    +72.4%

Improvements in CPU time:

Run on [redacted] (12 X 3501 MHz CPUs); 2016-10-05T17:40:02.462497196-07:00
CPU: Intel Haswell with HyperThreading (6 cores) dL1:32KB dL2:256KB dL3:15MB
Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_OuterishProd/64/1                    6169      1608    +73.9%
BM_OuterishProd/64/4                    7117      2412    +66.1%
BM_OuterishProd/64/32                  17702     15616    +11.8%
BM_OuterishProd/128/1                  45415      6498    +85.7%
BM_OuterishProd/128/4                  44459      9786    +78.0%
BM_OuterishProd/128/64                110657    109489     +1.1%
BM_OuterishProd/256/1                 265158     28101    +89.4%
BM_OuterishProd/256/4                 274234    183885    +32.9%
BM_OuterishProd/256/128              1397160   1408776     -0.8%
BM_OuterishProd/512/1               78947048    520703    +99.3%
BM_OuterishProd/512/4               86955578   1349742    +98.4%
BM_OuterishProd/512/256             74701613  15584661    +79.1%
BM_OuterishProd/1k/1                78352601   3877911    +95.1%
BM_OuterishProd/1k/4                78521643   3966221    +94.9%
BM_OuterishProd/1k/512              258104736  89480530    +65.3%
2016-10-06 10:33:10 -07:00
Benoit Steiner
9f3276981c Enabling AVX512 should also enable AVX2. 2016-10-06 10:29:48 -07:00
Gael Guennebaud
80b5133789 Fix compilation of qr.inverse() for column and full pivoting variants. 2016-10-06 09:55:50 +02:00
Benoit Steiner
4131074818 Deleted unecessary CMakeLists.txt file 2016-10-05 18:54:35 -07:00
Benoit Steiner
cb5cd69872 Silenced a compilation warning. 2016-10-05 18:50:53 -07:00
Benoit Steiner
78b569f685 Merged latest updates from trunk 2016-10-05 18:48:55 -07:00
Benoit Steiner
9c2b6c049b Silenced a few compilation warnings 2016-10-05 18:37:31 -07:00
Benoit Steiner
6f3cd529af Pulled latest updates from trunk 2016-10-05 18:31:43 -07:00
Benoit Steiner
d7f9679a34 Fixed a couple of compilation warnings 2016-10-05 15:00:32 -07:00
Benoit Steiner
ae1385c7e4 Pull the latest updates from trunk 2016-10-05 14:54:36 -07:00
Benoit Steiner
73b0012945 Fixed compilation warnings 2016-10-05 14:24:24 -07:00
Benoit Steiner
c84084c0c0 Fixed compilation warning 2016-10-05 14:15:41 -07:00
Benoit Steiner
4387433acf Increased the robustness of the reduction tests on fp16 2016-10-05 10:42:41 -07:00
Benoit Steiner
aad20d700d Increase the tolerance to numerical noise. 2016-10-05 10:39:24 -07:00
Benoit Steiner
8b69d5d730 ::rand() returns a signed integer on win32 2016-10-05 08:55:02 -07:00
Benoit Steiner
ed7a220b04 Fixed a typo that impacts windows builds 2016-10-05 08:51:31 -07:00
Benoit Steiner
ceee1c008b Silenced compilation warning 2016-10-04 18:47:53 -07:00
Benoit Steiner
698ff69450 Properly characterize the CUDA packet primitives for fp16 as device only 2016-10-04 16:53:30 -07:00
Rasmus Munk Larsen
7f67e6dfdb Update comment for fast sqrt. 2016-10-04 15:09:11 -07:00
Rasmus Munk Larsen
765615609d Update comment for fast sqrt. 2016-10-04 15:08:41 -07:00
Rasmus Munk Larsen
3ed67cb0bb Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.
Benchmark speed in Giga-sqrts/s
Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
-----------------------------------------
                    SSE        AVX
Fast=1              2.529G     4.380G
Fast=0              1.944G     1.898G
Fast=1 fixed        2.214G     3.739G

This table illustrates the worst case in terms speed impact: It was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the different versions almost negligible.
2016-10-04 14:22:56 -07:00
Benoit Steiner
6af5ac7e27 Cleanup the cuda executor code. 2016-10-04 08:52:13 -07:00
Benoit Steiner
2f6d1607c8 Cleaned up the random number generation code. 2016-10-04 08:38:23 -07:00
Benoit Steiner
881b90e984 Use explicit type casting to generate packets of zeros. 2016-10-04 08:23:38 -07:00
Benoit Steiner
616a7a1912 Improved support for compiling CUDA code with clang as the host compiler 2016-10-03 17:09:33 -07:00
Benoit Steiner
409e887d78 Added support for constand std::complex numbers on GPU 2016-10-03 11:06:24 -07:00
Gael Guennebaud
9d6d0dff8f bug #1317: fix performance regression with some Block expressions and clang by helping it to remove dead code.
The trick is to get rid of the nested expression in the evaluator by copying only the required information (here, the strides).
2016-10-01 15:37:00 +02:00
Gael Guennebaud
8b84801f7f bug #1310: workaround a compilation regression from 3.2 regarding triangular * homogeneous 2016-09-30 22:49:59 +02:00
Benoit Steiner
422530946f Renamed the SYCL tests to follow the standard naming convention. 2016-09-30 08:22:10 -07:00
Gael Guennebaud
67b4f45836 Fix angle range 2016-09-30 12:46:33 +02:00
Gael Guennebaud
27f3970453 Remove std:: prefix 2016-09-30 12:40:41 +02:00
Gael Guennebaud
3860a0bc8f bug #1312: Quaternion to AxisAngle conversion now ensures the angle will be in the range [-pi,pi]. This also increases accuracy when q.w is negative. 2016-09-29 23:23:35 +02:00
Gael Guennebaud
33500050c3 bug #1308: fix compilation of some small products involving nullary-expressions. 2016-09-29 09:40:44 +02:00
Benoit Steiner
27d7628f16 Updated the list of warnings to reflect the new message ids introduced in cuda 8.0 2016-09-28 17:42:59 -07:00
Benoit Steiner
2bda1b0d93 Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus. 2016-09-28 17:08:41 -07:00
Mehdi Goli
dd602e62c8 Converting alias template to nested struct in order to be compatible with CXX-03 2016-09-27 16:21:19 +01:00
Gael Guennebaud
f3a00dd2b5 Merged in sergiu/eigen (pull request PR-229)
Disabled MSVC level 4 warning C4714
2016-09-27 09:28:08 +02:00
Gael Guennebaud
892afb9416 Add debug info. 2016-09-26 23:53:57 +02:00
Gael Guennebaud
779774f98c bug #1311: fix alignment logic in some cases of (scalar*small).lazyProduct(small) 2016-09-26 23:53:40 +02:00
Benoit Steiner
6565f8d60f Made the initialization of a CUDA device thread safe. 2016-09-26 11:00:32 -07:00
Gael Guennebaud
48dfe98abd bug #1308: fix compilation of vector * rowvector::nullary. 2016-09-25 14:54:35 +02:00
Sergiu Deitsch
fe29157d02 disabled MSVC level 4 warning C4714
The level 4 warning (/W4) warns about functions marked as __forceinline not
inlined, and generates a lot of noise.
2016-09-25 14:25:47 +02:00
Benoit Steiner
f6ac51a054 Made TensorEvalTo compatible with c++0x again. 2016-09-23 16:45:17 -07:00
Benoit Steiner
00d4e65f00 Deleted unused TensorMap data member 2016-09-23 16:44:45 -07:00
Gael Guennebaud
86caba838d bug #1304: fix Projective * scaling and Projective *= scaling 2016-09-23 13:41:21 +02:00
Gael Guennebaud
b9f7a17e47 Add missing file. 2016-09-23 10:26:08 +02:00
Benoit Steiner
1301d744f8 Made the gaussian generator usable on GPU 2016-09-22 19:04:44 -07:00
Benoit Steiner
2a69290ddb Added a specialization of Eigen::numext::real and Eigen::numext::imag for std::complex<T> to be used when compiling a cuda kernel. This is unfortunately necessary to be able to process complex numbers from a CUDA kernel on MacOS. 2016-09-22 15:52:23 -07:00
Gael Guennebaud
3946768916 Added tag 3.3-rc1 for changeset 77e27fbeee 2016-09-22 22:38:36 +02:00
Gael Guennebaud
77e27fbeee bump to 3.3-rc1 2016-09-22 22:37:39 +02:00
Gael Guennebaud
2ada122bc6 merge 2016-09-22 22:33:18 +02:00
Gael Guennebaud
8f2bdde373 merge 2016-09-22 22:32:55 +02:00
Gael Guennebaud
ba0f844d6b Backout changeset ce3557ca69 2016-09-22 22:28:51 +02:00
Gael Guennebaud
9bcdc8b756 Add a nullary-functor example performing index-based sub-matrices. 2016-09-22 22:27:54 +02:00
Benoit Steiner
50e3bbfc90 Calls x.imag() instead of imag(x) when x is a complex number since the former
is a constexpr while the later isn't. This fixes compilation errors triggered by nvcc on Mac.
2016-09-22 13:17:25 -07:00
Gael Guennebaud
ca3746c6f8 Bypass identity reflectors. 2016-09-22 22:07:13 +02:00
Felix Gruber
8bde7da086 fix documentation of LinSpaced
The index of the highest value in a LinSpace is size-1.
2016-09-22 14:50:07 +02:00
Gael Guennebaud
66cbabafed Add a note regarding gcc bug #72867 2016-09-22 11:18:52 +02:00
Christoph Hertzberg
4b377715d7 Do not manually add absolute path to boost-library.
Also set C++ standard for blaze to C++14
2016-09-22 00:10:47 +02:00
Gael Guennebaud
aecc51a3e8 fix typo 2016-09-21 21:53:00 +02:00
Gael Guennebaud
1fc3a21ed0 Disable a failure test if extended double precision is in use (x87) 2016-09-21 20:09:07 +02:00
Gael Guennebaud
9fa2c8650e Fix alignement of statically allocated temporaries in symv, and trmv. 2016-09-21 17:34:24 +02:00
Gael Guennebaud
ac5377e161 Improve cost estimation of complex division 2016-09-21 17:26:04 +02:00
Gael Guennebaud
5269d11935 Fix compilation if ICC. 2016-09-21 17:08:51 +02:00
Benoit Steiner
26f9907542 Added missing typedefs 2016-09-20 12:58:03 -07:00
RJ Ryan
608b1acd6d Don't use c++11 features and fix include. 2016-09-20 07:49:05 -07:00
RJ Ryan
b2c6dc48d9 Add CUDA-specific std::complex<T> specializations for scalar_sum_op, scalar_difference_op, scalar_product_op, and scalar_quotient_op. 2016-09-20 07:18:20 -07:00
Benoit Steiner
8a66ca4b10 Pulled latest updates from trunk 2016-09-19 14:13:55 -07:00
Benoit Steiner
59e9edfbf1 Removed EIGEN_DEVICE_FUNC qualifers for the lu(), fullPivLu(), partialPivLu(), and inverse() functions since they aren't ready to run on GPU 2016-09-19 14:13:20 -07:00
Gael Guennebaud
3ada6e4bed Merged hongkai-dai/eigen/tip into default (bug #1298) 2016-09-19 22:08:06 +02:00
Benoit Steiner
c3ca9b1e76 Deleted some unecessary and confusing EIGEN_DEVICE_FUNC 2016-09-19 11:33:39 -07:00
Hongkai Dai
5dcc6d301a remove ternary operator in euler angles 2016-09-19 10:30:30 -07:00
Luke Iwanski
c771df6bc3 Updated the owners of the file. 2016-09-19 14:09:25 +01:00
Luke Iwanski
b91e021172 Merged with default. 2016-09-19 14:03:54 +01:00
Luke Iwanski
cb81975714 Partial OpenCL support via SYCL compatible with ComputeCpp CE. 2016-09-19 12:44:13 +01:00
Gael Guennebaud
bf03820339 Silent warning. 2016-09-17 14:14:01 +02:00
Gael Guennebaud
de05a18fe0 fix compilation with boost::multiprec 2016-09-17 14:13:48 +02:00
Gael Guennebaud
4cc2c73e6a Fix alignement of statically allocated temporaries in gemv. 2016-09-17 12:52:27 +02:00
Christoph Hertzberg
ce3557ca69 Make makeHouseholder more stable for cases where real(c0) is not very small (but the rest is). 2016-09-16 14:24:47 +02:00
Emil Fresk
6edd2e2851 Made AutoDiffJacobian more intuitive to use and updated for C++11
Changes:
* Removed unnecessary types from the Functor by inferring from its types
* Removed inputs() function reference, replaced with .rows()
* Updated the forward constructor to use variadic templates
* Added optional parameters to the Fuctor for passing parameters,
  control signals, etc
* Has been tested with fixed size and dynamic matricies

Ammendment by chtz: overload operator() for compatibility with not fully conforming compilers
2016-09-16 14:03:55 +02:00
Gael Guennebaud
4adeababf9 Fix undeflow 2016-09-16 11:46:46 +02:00
Gael Guennebaud
18f6e47815 Fix order of "static inline". 2016-09-16 11:32:54 +02:00
Gael Guennebaud
ee62f168e6 Doc: add link from block methods to respective tutorial section. 2016-09-16 11:26:25 +02:00
Gael Guennebaud
ca7f061a5f bug #828: clarify documentation of SparseMatrixBase's methods returning a sub-matrix. 2016-09-16 11:23:19 +02:00
Gael Guennebaud
50e203c717 bug #828: clarify documentation of SparseMatrixBase's unary methods. 2016-09-16 10:40:50 +02:00
Gael Guennebaud
fa9049a544 Let be consistent and consider any denormal number as zero. 2016-09-15 11:24:03 +02:00
Gael Guennebaud
b33144e4df merge 2016-09-15 11:22:16 +02:00
Benoit Steiner
c0d56a543e Added several missing EIGEN_DEVICE_FUNC qualifiers 2016-09-14 14:06:21 -07:00
Benoit Steiner
488ad7dd1b Added missing EIGEN_DEVICE_FUNC qualifiers 2016-09-14 13:35:00 -07:00
Benoit Steiner
779faaaeba Fixed compilation warnings generated by nvcc 6.5 (and below) when compiling the EIGEN_THROW macro 2016-09-14 09:56:11 -07:00
Gael Guennebaud
1c8347e554 Fix product for custom complex type. (conjugation was ignored) 2016-09-14 18:28:49 +02:00
Benoit Steiner
ff47717f25 Suppress warning 2527 and 2529, which correspond to the "calling a __host__ function from a __host__ __device__ function is not allowed" message in nvcc 6.5. 2016-09-13 12:49:40 -07:00
Benoit Steiner
309190cf02 Suppress message 1222 when compiling with nvcc: this ensures that we don't warnings about unknown warning messages when compiling with older versions of nvcc 2016-09-13 12:42:13 -07:00
Gael Guennebaud
c10620b2b0 Fix typo in doc. 2016-09-13 09:25:07 +02:00
Gael Guennebaud
73c8f2f697 bug #1285: fix regression introduced in changeset 00c29c2cae 2016-09-13 07:58:39 +02:00
Benoit Steiner
e4d4d15588 Register the cxx11_tensor_device only for recent cuda architectures (i.e. >= 3.0) since the test instantiate contractions that require a modern gpu. 2016-09-12 19:01:52 -07:00
Benoit Steiner
4dfd888c92 CUDA contractions require arch >= 3.0: don't compile the cuda contraction tests on older architectures. 2016-09-12 18:49:01 -07:00
Benoit Steiner
028e299577 Fixed a bug impacting some outer reductions on GPU 2016-09-12 18:36:52 -07:00
Benoit Steiner
5f50f12d2c Added the ability to compute the absolute value of a complex number on GPU, as well as a test to catch the problem. 2016-09-12 13:46:13 -07:00
Benoit Steiner
8321dcce76 Merged latest updates from trunk 2016-09-12 10:33:05 -07:00
Benoit Steiner
eb6ba00cc8 Properly size the list of waiters 2016-09-12 10:31:55 -07:00
Benoit Steiner
a618094b62 Added a resize method to MaxSizeVector 2016-09-12 10:30:53 -07:00
Gael Guennebaud
228ae29591 Fix compilation on 32 bits systems. 2016-09-09 22:34:38 +02:00
Gael Guennebaud
471eac5399 bug #1195: move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX) 2016-09-08 08:36:27 +02:00
Gael Guennebaud
d780983f59 Doc: explain minimal requirements on nullary functors 2016-09-06 23:14:52 +02:00
Gael Guennebaud
85fb517eaf Generalize ScalarBinaryOpTraits to any complex-real combination as defined by NumTraits (instead of supporting std::complex only). 2016-09-06 17:23:15 +02:00
Gael Guennebaud
447f269561 Disable previous workaround. 2016-09-06 15:49:02 +02:00
Gael Guennebaud
b046a3f87d Workaround MSVC instantiation faillure of has_*ary_operator at the level of triats<Ref>::match so that the has_*ary_operator are really properly instantiated throughout the compilation unit. 2016-09-06 15:47:04 +02:00
Gael Guennebaud
3cb914f332 bug #1266: remove CUDA guards on MatrixBase::<decomposition> definitions. (those used to break old nvcc versions that we propably don't care anymore) 2016-09-06 09:55:50 +02:00
Gael Guennebaud
e1642f485c bug #1288: fix memory leak in arpack wrapper. 2016-09-05 18:01:30 +02:00
Gael Guennebaud
19a95b3309 Fix shadowing wrt Eigen::Index 2016-09-05 17:19:47 +02:00
Gael Guennebaud
dabc81751f Fix compilation when cuda_fp16.h does not exist. 2016-09-05 17:14:20 +02:00
Gael Guennebaud
e13071dd13 Workaround a weird msvc 2012 compilation error. 2016-09-05 15:50:41 +02:00
Gael Guennebaud
d123717e21 Fix for msvc 2012 and older 2016-09-05 15:26:56 +02:00
Benoit Steiner
87a8a1975e Fixed a regression test 2016-09-02 19:29:33 -07:00
Benoit Steiner
13df3441ae Use MaxSizeVector instead of std::vector: xcode sometimes assumes that std::vector allocates aligned memory and therefore issues aligned instruction to initialize it. This can result in random crashes when compiling with AVX instructions enabled. 2016-09-02 19:25:47 -07:00
Benoit Steiner
373c340b71 Fixed a typo 2016-09-02 15:41:17 -07:00
Benoit Steiner
cadd124d73 Pulled latest update from trunk 2016-09-02 15:30:02 -07:00
Benoit Steiner
05b0518077 Made the index type an explicit template parameter to help some compilers compile the code. 2016-09-02 15:29:34 -07:00
Benoit Steiner
adf864fec0 Merged in rmlarsen/eigen (pull request PR-222)
Fix CUDA build broken by changes to min and max reduction.
2016-09-02 14:11:20 -07:00
Benoit Steiner
5a6be66cef Turned the Index type used by the nullary wrapper into a template parameter. 2016-09-02 14:10:29 -07:00
Rasmus Munk Larsen
13e93ca8b7 Fix CUDA build broken by changes to min and max reduction. 2016-09-02 13:41:36 -07:00
Benoit Steiner
6c05c3dd49 Fix the cxx11_tensor_cuda.cu test on 32bit platforms. 2016-09-02 11:12:16 -07:00
Gael Guennebaud
49c0390ce0 merge 2016-09-02 15:24:14 +02:00
Gael Guennebaud
d6c8366d84 Fix compilation with MSVC 2012 2016-09-02 15:23:32 +02:00
Benoit Steiner
039e225f7f Added a test for nullary expressions on CUDA
Also check that we can mix 64 and 32 bit indices in the same compilation unit
2016-09-01 13:28:12 -07:00
Benoit Steiner
c53f783705 Updated the contraction code to support constant inputs. 2016-09-01 11:41:27 -07:00
Gael Guennebaud
ef54723dbe One more msvc fix iteration, the previous one was over-simplified for visual 2016-09-01 15:04:53 +02:00
Gael Guennebaud
46475eff9a Adjust Tensor module wrt recent change in nullary functor 2016-09-01 13:40:45 +02:00
Gael Guennebaud
72a4d49315 Fix compilation with CUDA 8 2016-09-01 13:39:33 +02:00
Gael Guennebaud
f9f32e9e2d Fix compilation with nvcc 2016-09-01 13:06:14 +02:00
Gael Guennebaud
3d946e42b3 Fix compilation with visual studio 2016-09-01 12:59:32 +02:00
Benoit Steiner
221f619bea Merged in rmlarsen/eigen (pull request PR-221)
Fix bugs to make min- and max reducers work with correctly with IEEE infinities.
2016-08-31 15:10:10 -07:00
Rasmus Munk Larsen
a1e092d1e8 Fix bugs to make min- and max reducers with correctly with IEEE infinities. 2016-08-31 15:04:16 -07:00
Gael Guennebaud
836fa25a82 Make sure sizeof is truelly needed, thus improving SFINAE portability. 2016-08-31 23:40:18 +02:00
Gael Guennebaud
84cf6e42ca minor tweaks in has_* helpers 2016-08-31 23:04:14 +02:00
Gael Guennebaud
7ae819123c Simplify CwiseNullaryOp example. 2016-08-31 15:46:04 +02:00
Gael Guennebaud
218c37beb4 bug #1286: automatically detect the available prototypes of functors passed to CwiseNullaryExpr such that functors have only to implement the operators that matters among:
operator()()
 operator()(i)
 operator()(i,j)
Linear access is also automatically detected based on the availability of operator()(i,j).
2016-08-31 15:45:25 +02:00
Gael Guennebaud
efe2c225c9 bug #1283: add regression unit test 2016-08-31 13:04:29 +02:00
Gael Guennebaud
3456247437 bug #1283: quick fix for products involving uncommon general block access to vectors. 2016-08-31 08:17:15 +02:00
Gael Guennebaud
8c48d42530 Fix 4x4 inverse with non-linear destination 2016-08-30 23:16:38 +02:00
Gael Guennebaud
e7fbbc2748 Doc: add links and discourage user to write their own expression (better use CwiseNullaryOp) 2016-08-30 15:57:46 +02:00
Gael Guennebaud
1e2ab8b0b3 Doc: add an exemple showing how custom expression can be advantageously implemented via CwiseNullaryOp. 2016-08-30 15:40:41 +02:00
Gael Guennebaud
9c9e23858e Doc: split customizing-eigen page into sub-pages and re-structure a bit the different topics 2016-08-30 11:10:08 +02:00
Gael Guennebaud
cffe8bbff7 Doc: add link to example 2016-08-30 10:45:27 +02:00
Gael Guennebaud
c57317035a Fix unit test for 1x1 matrices 2016-08-30 10:20:23 +02:00
Gael Guennebaud
1f84f0d33a merge EulerAngles module 2016-08-30 10:01:53 +02:00
Gael Guennebaud
68e803a26e Fix warning 2016-08-30 09:21:57 +02:00
Gael Guennebaud
e074f720c7 Include missing forward declaration of SparseMatrix 2016-08-29 18:56:46 +02:00
Gael Guennebaud
2915e1fc5d Revert part of changeset 5b3a6f51d3
to keep accuracy of smallest eigenvalues.
2016-08-29 14:14:18 +02:00
Gael Guennebaud
7e029d1d6e bug #1271: add SparseMatrix::coeffs() methods returning a 1D view of the non zero coefficients. 2016-08-29 12:06:37 +02:00
Gael Guennebaud
a93e354d92 Add some pre-allocation unit tests (not working yet) 2016-08-29 11:08:44 +02:00
Gael Guennebaud
6cd7b9ea6b Fix compilation with cuda 8 2016-08-29 11:06:08 +02:00
Gael Guennebaud
8f4b4ad5fb use ::hlog if available. 2016-08-29 11:05:32 +02:00
Gael Guennebaud
35a8e94577 bug #1167: simplify installation of header files using cmake's install(DIRECTORY ...) command. 2016-08-29 10:59:37 +02:00
Gael Guennebaud
0decc31aa8 Add generic implementation of conj_helper for custom complex types. 2016-08-29 09:42:29 +02:00
Gael Guennebaud
fd9caa1bc2 bug #1282: fix implicit double to float conversion warning 2016-08-28 22:45:56 +02:00
Gael Guennebaud
68d1897e8a Make sure that our log1p implementation is called as a last resort only. 2016-08-26 15:30:55 +02:00
Gael Guennebaud
fe60856fed Add overload of numext::log1p for float/double in CUDA 2016-08-26 15:28:59 +02:00
Gael Guennebaud
0f56b5a6de enable vectorization path when testing half on cuda, and add test for log1p 2016-08-26 14:55:51 +02:00
Gael Guennebaud
965e595f02 Add missing log1p method 2016-08-26 14:55:00 +02:00
Gael Guennebaud
1329c55875 Fix compilation with boost::multiprec. 2016-08-25 14:54:39 +02:00
Gael Guennebaud
441b7eaab2 Add support for non trivial scalar factor in sparse selfadjoint * dense products, and enable +=/-= assignement for such products.
This changeset also improves the performance by working on column of the result at once.
2016-08-24 13:06:34 +02:00
Gael Guennebaud
8132a12625 bug #1268: detect faillure in LDLT and report them through info() 2016-08-23 23:15:55 +02:00
Gael Guennebaud
bde9b456dc Typo 2016-08-23 21:36:36 +02:00
Gael Guennebaud
326320ec7b Fix compilation in non C++11 mode. 2016-08-23 19:28:57 +02:00
Gael Guennebaud
ea2e968257 Address several implicit scalar conversions. 2016-08-23 18:44:33 +02:00
Gael Guennebaud
0a6a50d1b0 Cleanup eiegnvector extraction: leverage matrix products and compile-time sizes, remove numerous useless temporaries. 2016-08-23 18:14:37 +02:00
Gael Guennebaud
00b2666853 bug #645: patch from Tobias Wood implementing the extraction of eigenvectors in GeneralizedEigenSolver 2016-08-23 17:37:38 +02:00
Gael Guennebaud
504a4404f1 Optimize expression matching "d?=a-b*c" as "d?=a; d?=b*c;" 2016-08-23 16:52:22 +02:00
Gael Guennebaud
e47a8928ec Fix compilation in check_for_aliasing due to ambiguous specializations 2016-08-23 16:19:10 +02:00
Gael Guennebaud
6739f6bb1b Merged in traversaro/eigen-1/traversaro/modify-findeigen3cmake-to-find-eigen3con-1469782761059 (pull request PR-213)
Modify FindEigen3.cmake to find Eigen3Config.cmake
2016-08-23 15:53:57 +02:00
Gael Guennebaud
ef3de20481 Cleanup cost of tanh 2016-08-23 14:39:55 +02:00
Gael Guennebaud
b3151bca40 Implement pmadd for float and double to make it consistent with the vectorized path when FMA is available. 2016-08-23 14:24:08 +02:00
Gael Guennebaud
a4c266f827 Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only. 2016-08-23 14:23:08 +02:00
Gael Guennebaud
82147cefff Fix possible overflow and biais in integer random generator 2016-08-23 13:25:31 +02:00
Silvio Traversaro
068ccab9fe FindEigen3.cmake : search for package only if EIGEN3_INCLUDE_DIR is not already defined 2016-08-22 22:13:10 +00:00
Gael Guennebaud
581b6472d1 bug #1265: remove outdated notes 2016-08-22 23:25:39 +02:00
Igor Babuschkin
59bacfe520 Fix compilation on CUDA 8 by removing call to h2log1p 2016-08-15 23:38:05 +01:00
Benoit Steiner
34ae80179a Use array_prod instead of calling TotalSize since TotalSize is only available on DSize. 2016-08-15 10:29:14 -07:00
Benoit Steiner
2556565b4b Merged in ibab/eigen/extend-log1p (pull request PR-218)
Fix compilation on CUDA 8 due to missing h2log1p function
2016-08-15 08:31:03 -07:00
Benoit Steiner
30dd6f5e34 Close branch extend-log1p 2016-08-15 08:31:03 -07:00
Benoit Steiner
fe73648c98 Fixed a bug in the documentation. 2016-08-12 10:00:43 -07:00
Christoph Hertzberg
9636a8ed43 bug #1273: Add parentheses when redefining eigen_assert 2016-08-12 15:34:21 +02:00
Christoph Hertzberg
c83b754ee0 bug #1272: Disable assertion when total number of columns is zero.
Also moved assertion to finished() method and adapted unit-test
2016-08-12 15:15:34 +02:00
Benoit Steiner
e3a8dfb02f std::erfcf doesn't exist: use numext::erfc instead 2016-08-11 15:24:06 -07:00
Benoit Steiner
64e68cbe87 Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything. 2016-08-08 19:29:59 -07:00
Benoit Steiner
5157ce8cbf Merged in ibab/eigen/extend-log1p (pull request PR-217)
Add log1p support for CUDA and half floats
2016-08-08 14:50:00 -07:00
Igor Babuschkin
aee693ac52 Add log1p support for CUDA and half floats 2016-08-08 20:24:59 +01:00
Benoit Steiner
72096f3bd4 Merged in suiyuan2009/eigen/fix_tanh_inconsistent_for_tensorflow (pull request PR-215)
Fix_tanh_inconsistent_for_tensorflow
2016-08-08 09:06:45 -07:00
Christoph Hertzberg
3e4a33d4ba bug #1272: Let CommaInitializer work for more border cases (enhances fix of bug #1242).
The unit test tests all combinations of 2x2 block-sizes from 0 to 3.
2016-08-08 17:26:48 +02:00
Ziming Dong
1031223c09 fix tanh inconsistent 2016-08-06 19:48:50 +08:00
Ziming Dong
5cf1e4c79b create fix_tanh_inconsistent branch 2016-08-06 15:54:33 +08:00
Christoph Hertzberg
fe4b927e9c Add aliases Eigen_*_DIR to Eigen3_*_DIR
This is to make configuring work again after project was renamed from Eigen to Eigen3
2016-08-05 15:21:14 +02:00
Benoit Steiner
fe778427f2 Fixed the constructors of the new half_base class. 2016-08-04 18:32:26 -07:00
Benoit Steiner
5eea1c7f97 Fixed cut and paste bug in debud message 2016-08-04 17:34:13 -07:00
Benoit Steiner
9506343349 Fixed the isnan, isfinite and isinf operations on GPU 2016-08-04 17:25:53 -07:00
Benoit Steiner
b50d8f8c4a Extended a regression test to validate that we basic fp16 support works with cuda 7.0 2016-08-03 16:50:13 -07:00
Benoit Steiner
fad9828769 Deleted redundant regression test. 2016-08-03 16:08:37 -07:00
Benoit Steiner
373bb12dc6 Check that it's possible to forward declare the hlaf type. 2016-08-03 16:07:31 -07:00
Gael Guennebaud
17b9a55d98 Move Eigen::half_impl::half to Eigen::half while preserving the free functions to the Eigen::half_impl namespace together with ADL 2016-08-04 00:00:43 +02:00
Benoit Steiner
ca2cee2739 Merged in ibab/eigen (pull request PR-206)
Expose real and imag methods on Tensors
2016-08-03 11:53:04 -07:00
Benoit Steiner
d92df04ce8 Cleaned up the new float16 test a bit 2016-08-03 11:50:07 -07:00
Benoit Steiner
81099ef482 Added a test for fp16 2016-08-03 11:41:17 -07:00
Benoit Steiner
a20b58845f CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running. 2016-08-03 10:00:43 -07:00
Gael Guennebaud
819d0cea1b List PARDISO solver. 2016-08-02 23:32:41 +02:00
Christoph Hertzberg
f4404777ff Change project name to Eigen3, to be compatible with FindEigen3.cmake and Eigen3Config.cmake.
This is related to pull-requests 214.
2016-08-02 17:08:57 +00:00
Benoit Steiner
fd220dd8b0 Use numext::conj instead of std::conj 2016-08-01 18:16:16 -07:00
Benoit Steiner
e256acec7c Avoid unecessary object copies 2016-08-01 17:03:39 -07:00
Gael Guennebaud
7995cec90c Fix vectorization logic for coeff-based product for some corner cases. 2016-07-31 15:20:22 +02:00
Benoit Steiner
02fe89f5ef half implementation has been moved to half_impl namespace 2016-07-29 15:09:34 -07:00
Benoit Steiner
2693fd54bf bug #1266: half implementation has been moved to half_impl namespace 2016-07-29 13:45:56 -07:00
Christoph Hertzberg
c5b893f434 bug #1266: half implementation has been moved to half_impl namespace 2016-07-29 18:36:08 +02:00
Silvio Traversaro
5e51a361fe Modify FindEigen3.cmake to find Eigen3Config.cmake 2016-07-29 08:59:38 +00:00
klimpel
ca5effa16c MSVC-2010 is making problems with SFINAE again. But restricting to the variant for very old compilers (enum, template<typename C> for both function definitions) fixes the problem. 2016-07-28 15:58:17 +01:00
Gael Guennebaud
4057f9b1fc Enable slice-vectorization+inner-unrolling when unaligned vectorization is allowed. For instance, this permits to vectorize 5x5 matrices (including product) 2016-07-28 13:47:33 +02:00
Gael Guennebaud
5fbe7aa604 Update and fix Cholesky mini benchmark 2016-07-28 11:26:30 +02:00
Gael Guennebaud
a72752caac Vectorize more small product expressions by letting the general assignement logic decides on the sizes that are OK for vectorization. 2016-07-28 11:21:07 +02:00
Gael Guennebaud
cc2f6d68b1 bug #1264: fix compilation 2016-07-27 23:30:47 +02:00
Gael Guennebaud
188590db82 Add instructions for LAPACKE+Accelerate 2016-07-27 15:07:35 +02:00
Gael Guennebaud
8972323c08 Big 1261: add missing max(ADS,ADS) overload (same for min) 2016-07-27 14:52:48 +02:00
Gael Guennebaud
5d94dc85e5 bug #1260: add regression test 2016-07-27 14:38:30 +02:00
Gael Guennebaud
0d7039319c bug #1260: remove doubtful specializations of ScalarBinaryOpTraits 2016-07-27 14:35:52 +02:00
Christoph Hertzberg
d3d7c6245d Add brackets to block matrix and fixed some typos 2016-07-27 09:55:39 +02:00
Gael Guennebaud
0eece608b4 Added tag 3.3-beta2 for changeset f6b3cf8de9 2016-07-26 23:52:14 +02:00
Gael Guennebaud
f6b3cf8de9 Bump to 3.3-beta2 2016-07-26 23:51:59 +02:00
Gael Guennebaud
9d16b6e1cf Formatting 2016-07-26 23:51:43 +02:00
Gael Guennebaud
fd2f989b1d Fix testing of nearly zero input matrices. 2016-07-26 14:46:02 +02:00
Gael Guennebaud
c9e3e438eb Add more very small numbers in the list of nearly "zero" values when testing SVD and EVD algorithms 2016-07-26 14:45:44 +02:00
Gael Guennebaud
95113cb15c Improve robustness of 2x2 eigenvalue with shifting and scaling 2016-07-26 14:43:54 +02:00
Gael Guennebaud
7f7e84aa36 Fix compilation with MKL support 2016-07-26 13:31:29 +02:00
Gael Guennebaud
429028b652 Typo. 2016-07-26 12:12:53 +02:00
Gael Guennebaud
6b89fa802c Typos. 2016-07-26 12:08:04 +02:00
Gael Guennebaud
c581c8fa79 Fix with expession template scalar types. 2016-07-26 11:33:28 +02:00
Gael Guennebaud
8021aed89e Split BLAS/LAPACK versus MKL documentation 2016-07-26 11:11:59 +02:00
Gael Guennebaud
757971e7ea bug #1258: fix compilation of Map<SparseMatrix>::coeffRef 2016-07-26 09:40:19 +02:00
Gael Guennebaud
c9425492c8 Update doc. 2016-07-25 18:41:26 +02:00
Gael Guennebaud
0592b4cfbf merge 2016-07-25 18:20:22 +02:00
Gael Guennebaud
9c663e4ee8 Clean references to MKL in LAPACKe support. 2016-07-25 18:20:08 +02:00
Gael Guennebaud
0c06077efa Rename MKL files 2016-07-25 18:00:47 +02:00
Gael Guennebaud
4d54e3dd33 bug #173: remove dependency to MKL for LAPACKe backend. 2016-07-25 17:55:07 +02:00
Benoit Steiner
3d3d34e442 Deleted dead code. 2016-07-25 08:53:37 -07:00
Gael Guennebaud
34b483e25d bug #1249: enable use of __builtin_prefetch for GCC, clang, and ICC only. 2016-07-25 15:17:45 +02:00
Gael Guennebaud
6d5daf32f5 bug #1255: comment out broken and unsused line. 2016-07-25 14:48:30 +02:00
Gael Guennebaud
f9598d73b5 bug #1250: fix pow() for AutoDiffScalar with custom nested scalar type. 2016-07-25 14:42:19 +02:00
Gael Guennebaud
fd1117f2be Implement digits10 for mpreal 2016-07-25 14:38:55 +02:00
Gael Guennebaud
9908020d36 Add minimal support for Array<string>, and fix Tensor<string> 2016-07-25 14:25:56 +02:00
Gael Guennebaud
4184a3e544 Extend boost.multiprec unit test with ET on, complexes, and general/generalized eigenvalue solvers. 2016-07-25 12:36:22 +02:00
Gael Guennebaud
1b2049fbda Enforce scalar types in calls to max/min (helps with expression template scalar types) 2016-07-25 12:35:10 +02:00
Gael Guennebaud
b118bc76eb Add digits10 overload for complex. 2016-07-25 12:33:21 +02:00
Gael Guennebaud
c96af5381f Remove custom complex division function cdiv. 2016-07-25 12:31:58 +02:00
Gael Guennebaud
e1c7c5968a Update doc. 2016-07-25 11:18:04 +02:00
Gael Guennebaud
8fffc81606 Add NumTraits::digits10() function based on numeric_limits::digits10 and make use of it for printing matrices. 2016-07-25 11:13:01 +02:00
Gael Guennebaud
5f03584752 merge 2016-07-23 17:52:44 +02:00
Gael Guennebaud
1b0353c659 Fix misuse of dummy_precesion in eigenvalues solvers 2016-07-23 17:52:31 +02:00
Benoit Steiner
c6b0de2c21 Improved partial reductions in more cases 2016-07-22 17:18:20 -07:00
Gael Guennebaud
72744d93ef Allows the compiler to inline outer products (the change from default to dont-inline in changeset 737bed19c1
was not motivated)
2016-07-22 17:02:28 +02:00
Gael Guennebaud
32d95e86c9 merge 2016-07-22 16:43:12 +02:00
Gael Guennebaud
60d5980a41 add a note 2016-07-22 15:46:23 +02:00
Gael Guennebaud
d7a0e52478 Fix testing of log nearby 1 2016-07-22 15:44:26 +02:00
Gael Guennebaud
7acf23c14c Truely split unit test. 2016-07-22 15:41:23 +02:00
Gael Guennebaud
24af67a6cc Fix boostmultiprec for C++03 2016-07-22 15:30:54 +02:00
Gael Guennebaud
395c835f4b Fix CUDA compilation 2016-07-22 15:30:24 +02:00
Gael Guennebaud
d075d122ea Move half unit test from unsupported to main tests 2016-07-22 14:34:19 +02:00
Gael Guennebaud
47afc9a365 More cleaning in half:
- put its definition and functions in its own half_impl namespace such that the free function does not polute the Eigen namespace while still making them visible for half through ADL.
 - expose Eigen::half throguh a using statement
 - move operator<< from std to half_float namespace
2016-07-22 14:33:28 +02:00
Gael Guennebaud
0f350a8b7e Fix CUDA compilation 2016-07-21 18:47:07 +02:00
Gael Guennebaud
bf91a44f4a Use ADL and log10 for printing matrices. 2016-07-21 15:48:24 +02:00
Gael Guennebaud
82798162c0 Extend unit testing of half with ADL and arrays. 2016-07-21 15:47:21 +02:00
Gael Guennebaud
87fbda812f Add missing log10 and random generator for half. 2016-07-21 15:46:45 +02:00
Gael Guennebaud
01d12d3e82 Some cleanup in Halh: standard functions should be defined in the namespace of the class half to make ADL work, and thus the global is* functions can be removed. 2016-07-21 15:10:48 +02:00
Gael Guennebaud
007edee1ac Add a doc page summarizing the true speed of Eigen's decompositions. 2016-07-21 12:32:02 +02:00
Gael Guennebaud
9b76be9d21 Update benchmark for dense solver to stress least-squares pb, and to output a HTML table 2016-07-21 12:30:53 +02:00
Gael Guennebaud
72950effdf enable testing of Boost.Multiprecision with expression templates 2016-07-20 18:21:30 +02:00
Yi Lin
7b4abc2b1d Fixed a code comment error 2016-07-20 22:28:54 +08:00
Gael Guennebaud
b64b9d0172 Add a unit test to stress our solvers with Boost.Multiprecision 2016-07-20 15:20:14 +02:00
Gael Guennebaud
5e4dda8a12 Enable custom scalar types in some unit tests. 2016-07-20 15:19:17 +02:00
Gael Guennebaud
87d480d785 Make use of EIGEN_TEST_MAX_SIZE 2016-07-20 15:14:20 +02:00
Gael Guennebaud
7722913475 Fix ambiguous specialization with custom scalar type 2016-07-20 15:13:44 +02:00
Gael Guennebaud
fd057f86b3 Complete the coeff-wise math function table. 2016-07-20 12:14:10 +02:00
Gael Guennebaud
9e8476ef22 Add missing Eigen::rsqrt global function 2016-07-20 11:59:49 +02:00
Gael Guennebaud
4b4c296d6e Simplify ScalarBinaryOpTraits by removing the Defined enum, and extend its documentation. 2016-07-20 09:56:39 +02:00
Gael Guennebaud
e3bf874c83 Workaround MSVC 2010 compilation issue. 2016-07-18 15:17:25 +02:00
Gael Guennebaud
0f89c6d6b5 Add a summary of possible values for EIGEN_COMP_MSVC 2016-07-18 15:16:13 +02:00
Gael Guennebaud
18884f17d7 Remove static constant declaration: this enforces compiler to generate costly code for thread safety. 2016-07-18 15:05:17 +02:00
Gael Guennebaud
79574e384e Make scalar_product_op the default (instead of void) 2016-07-18 12:03:05 +02:00
Gael Guennebaud
6a3c451c1c Permits call to explicit ctor. 2016-07-18 12:02:20 +02:00
Gael Guennebaud
0c3fe4aca5 merge 2016-07-18 10:44:15 +02:00
Gael Guennebaud
db9b154193 Add missing non-const reverse method in VectorwiseOp. 2016-07-16 15:19:28 +02:00
Gael Guennebaud
461cd819c2 Workaround VS2015 bug 2016-07-13 18:46:01 +02:00
Gael Guennebaud
5ea0864c81 Fix regression in a previous commit: some diagonal entry might not be treated by the 2x2 real preconditioner. 2016-07-13 18:37:54 +02:00
Benoit Steiner
20f7ef2f89 An evalTo expression is only aligned iff both the lhs and the rhs are aligned. 2016-07-12 10:56:42 -07:00
Gael Guennebaud
b4343aa67e Avoid division by very small entries when extracting singularvalues, and explicitly handle the 1x1 complex case. 2016-07-12 17:22:03 +02:00
Gael Guennebaud
e2aa58b631 Consider denormals as zero in makeJacobi and 2x2 SVD.
This also fix serious issues with x387 for which values can be much smaller than the smallest denormal!
2016-07-12 17:21:03 +02:00
Gael Guennebaud
263993a7b6 Fix test for nearly null input 2016-07-12 17:19:26 +02:00
Gael Guennebaud
9ab35d8ba4 Fix compilation of doc 2016-07-12 16:47:39 +02:00
Gael Guennebaud
19614497ae Add some doxygen's images to support both old and recent doxygen versions
(with some vague definitions of old and recent ;) )
2016-07-12 16:45:43 +02:00
Gael Guennebaud
c98bac2966 Manually add -stdd=c++11 to nvcc for old cmake versions 2016-07-12 09:29:18 +02:00
Benoit Steiner
013a904237 Pulled latest updates from trunk 2016-07-11 14:29:05 -07:00
Benoit Steiner
40eb97516c reverted unintended change. 2016-07-11 14:28:03 -07:00
Benoit Steiner
03b71c273e Made the packetmath test compile again. A better fix would be to move the special function tests to the unsupported directory where the code now resides. 2016-07-11 13:50:24 -07:00
Benoit Steiner
3a2dd352ae Improved the contraction mapper to properly support tensor products 2016-07-11 13:43:41 -07:00
Benoit Steiner
0bc020be9d Improved the detection of packet size in the tensor scan evaluator. 2016-07-11 12:14:56 -07:00
Gael Guennebaud
a96a7ce3f7 Move CUDA's special functions to SpecialFunctions module. 2016-07-11 18:39:11 +02:00
Gael Guennebaud
bec35f4c55 Clarify that SpecialFunctions is unsupported 2016-07-11 18:38:40 +02:00
Gael Guennebaud
fd60966310 merge 2016-07-11 18:11:47 +02:00
Gael Guennebaud
7d636349dc Fix configuration of CUDA:
- preserve user defined CUDA_NVCC_FLAGS
 - remove the -ansi flag that conflicts with -std=c++11
 - do not add -std=c++11 if already there
2016-07-11 18:09:04 +02:00
klimpel
8b3fc31b55 compile fix (SFINAE variant apparently didn't work for all compilers) for the following compiler/platform:
gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)
Copyright (C) 2006 Free Software Foundation, Inc.
2016-07-11 17:42:22 +02:00
Gael Guennebaud
3e348fdcf9 Workaround MSVC bug 2016-07-11 15:24:52 +02:00
Gael Guennebaud
131ee4bb8e Split test_slice_in_expr which seems to be huge for visual 2016-07-11 11:46:55 +02:00
Gael Guennebaud
194daa3048 Fix assertion (it did not make sense for static_val types) 2016-07-11 11:39:27 +02:00
Gael Guennebaud
18c35747ce Emulate _BitScanReverse64 for 32 bits builds 2016-07-11 11:38:04 +02:00
Konstantinos Margaritis
ef05463fcf Merged kmargar/eigen/tip into default, Altivec/VSX port should be working ok now. 2016-07-10 16:11:46 +03:00
Konstantinos Margaritis
9f7caa7e7d minor fixes for big endian altivec/vsx 2016-07-10 07:05:10 -03:00
Christoph Hertzberg
3c795c6923 bug #1119: Adjust call to ?gssvx for SuperLU 5
Also improved corresponding cmake module to detect versions 5.x

Based on patch by Christoph Grüninger.
2016-07-10 02:29:57 +02:00
Gael Guennebaud
57113e00f9 Relax strict equality 2016-07-09 23:37:11 +02:00
Gael Guennebaud
599f8ba617 Change runtime to compile-time conditional. 2016-07-08 11:39:43 +02:00
Gael Guennebaud
544935101a Fix warnings 2016-07-08 11:38:52 +02:00
Gael Guennebaud
59bf2774a3 Fix warnings 2016-07-08 11:38:11 +02:00
Gael Guennebaud
2f7e2614e7 bug #1232: refactor special functions as a new SpecialFunctions module, currently in unsupported/. 2016-07-08 11:13:55 +02:00
Gael Guennebaud
8b7431d8fd fix compilation with c++11 2016-07-07 15:18:23 +02:00
Gael Guennebaud
69378eed0b Split huge unit test 2016-07-07 15:18:04 +02:00
Gael Guennebaud
c684e37d32 Prevent division by zero. 2016-07-07 11:03:01 +02:00
Gael Guennebaud
179ebb88f9 Fix warning 2016-07-07 09:16:40 +02:00
Gael Guennebaud
5d2dada197 Fix warnings 2016-07-07 09:05:15 +02:00
Gael Guennebaud
f5e780fb05 split huge unit test 2016-07-07 08:59:59 +02:00
Gael Guennebaud
66917299a9 Add debug output 2016-07-06 22:27:15 +02:00
Gael Guennebaud
5ca2457fa5 Fix unit test. 2016-07-06 22:25:24 +02:00
Gael Guennebaud
9b68ed4537 Relax is_equal to is_approx because scaling might modify last bit. 2016-07-06 15:02:49 +02:00
Gael Guennebaud
c3b23d7dbf Fix support of Intel's VML 2016-07-06 14:07:32 +02:00
Gael Guennebaud
8ec4d6480d Fix compilation with recent updates of icc 2016 2016-07-06 14:07:14 +02:00
Gael Guennebaud
5b3a6f51d3 Improve numerical robustness of RealSchur: add scaling and compare sub-diag entries to largest diagonal entry instead of the 2 neighbors. 2016-07-06 13:45:30 +02:00
Gael Guennebaud
d2b5a19e0f Fix warning. 2016-07-06 11:05:30 +02:00
Gael Guennebaud
367ef66af3 Re-enable some specializations for Assignment<.,Product<>> 2016-07-05 22:58:14 +02:00
Gael Guennebaud
155d8d8603 Fix compilation with msvc 2016-07-05 14:43:42 +02:00
Gael Guennebaud
43696ede8f Revert unwanted changes. 2016-07-04 22:40:36 +02:00
Gael Guennebaud
b39fd8217f Fix nesting of SolveWithGuess, and add unit test. 2016-07-04 17:47:47 +02:00
Gael Guennebaud
ec02af1047 Fix template resolution. 2016-07-04 17:37:33 +02:00
Gael Guennebaud
fbcfc2f862 Add unit test for solveWithGuess, and fix template resolution. 2016-07-04 17:19:38 +02:00
Gael Guennebaud
7f7839c12f Add documentation and exemples for inplace decomposition. 2016-07-04 17:18:26 +02:00
Gael Guennebaud
32a41ee659 bug #707: add inplace decomposition through Ref<> for Cholesky, LU and QR decompositions. 2016-07-04 15:13:35 +02:00
Gael Guennebaud
75e80792cc Update relevent list of changesets. 2016-07-04 14:32:34 +02:00
Gael Guennebaud
dacc544b84 asm escape was not strong enough to prevent too aggressive compiler optimization let's fallback to no-inline. 2016-07-04 14:32:15 +02:00
Gael Guennebaud
b74e45906c Few fixes in perf-monitoring. 2016-07-04 14:30:50 +02:00
Gael Guennebaud
ce9fc0ce14 fix clang compilation 2016-07-04 12:59:02 +02:00
Gael Guennebaud
440020474c Workaround compilation issue with msvc 2016-07-04 12:49:19 +02:00
Gael Guennebaud
e61cee7a50 Fix compilation of some unit tests with msvc 2016-07-04 11:49:03 +02:00
Gael Guennebaud
91b3039013 Change the semantic of the last template parameter of Assignment from "Scalar" to "SFINAE" only.
The previous "Scalar" semantic was obsolete since we allow for different scalar types in the source and destination expressions.
On can still specialize on scalar types through SFINAE and/or assignment functor.
2016-07-04 11:02:00 +02:00
Gael Guennebaud
0fa9e4a15c Fix performance regression in dgemm introduced by changeset 5d51a7f12c 2016-07-02 17:35:08 +02:00
Gael Guennebaud
672076db5d Fix performance regression introduced in changeset e56aabf205
.
Register blocking sizes are better handled by the cache size heuristics.
The current code introduced very small blocks, for instance for 9x9 matrix,
thus killing performance.
2016-07-02 15:40:56 +02:00
Igor Babuschkin
78f37ca03c Expose real and imag methods on Tensors 2016-07-01 17:34:31 +01:00
Gael Guennebaud
d161b8f03a Merged in carpent/eigen (pull request PR-204)
Use complete nested namespace Eigen::internal, thus making the custom static assertion macros available outside the Eigen's namespace.
2016-07-01 09:56:44 +02:00
Benoit Steiner
cb2d8b8fa6 Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu. 2016-06-29 15:42:01 -07:00
Benoit Steiner
b2a47641ce Made the code compile when using CUDA architecture < 300 2016-06-29 15:32:47 -07:00
Benoit Steiner
b047ca765f Merged in ibab/eigen/fix-tensor-scan-gpu (pull request PR-205)
Add missing CUDA kernel to tensor scan op
2016-06-29 14:52:19 -07:00
Igor Babuschkin
85699850d9 Add missing CUDA kernel to tensor scan op
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
2016-06-29 11:54:35 +01:00
Justin Carpentier
6126886a67 Use complete nested namespace Eigen::internal 2016-06-28 20:09:25 +02:00
Benoit Jacob
328c5d876a Undo changes in AltiVec --- I don't have any way to test there. 2016-06-28 11:15:25 -04:00
Benoit Jacob
38fb606052 Avoid global variables with static constructors in NEON/Complex.h 2016-06-28 11:12:49 -04:00
Benoit Steiner
1a9f92e781 Added a test to validate the tensor scan evaluation on GPU. The test is currently disabled since the code segfaults. 2016-06-27 16:02:52 -07:00
Benoit Steiner
75c333f94c Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
Also avoid taking references to values that may becomes stale after a copy construction.
2016-06-27 10:32:38 -07:00
xantares
c52c8d76da Disable pkgconfig only for native windows builds
ie enable it for MinGW
2016-06-27 16:43:08 +00:00
Gael Guennebaud
d937a420a2 Fix compilation with MSVC by using our portable numext::log1p implementation. 2016-08-22 15:44:21 +02:00
Gael Guennebaud
2d5731e40a bug #1270: bypass custom asm for pmadd and recent clang version 2016-08-22 15:38:03 +02:00
Gael Guennebaud
49b005181a Define EIGEN_COMP_CLANG to clang version as major*100+minor (e.g., 307 corresponds to clang 3.7) 2016-08-22 15:37:05 +02:00
Gael Guennebaud
130f891bb0 bug #1278: ease parsing 2016-08-22 15:00:29 +02:00
Benoit Steiner
7944d4431f Made the cost model cwiseMax and cwiseMin methods consts to help the PowerPC cuda compiler compile this code. 2016-08-18 13:46:36 -07:00
Benoit Steiner
647a51b426 Force the inlining of a simple accessor. 2016-08-18 12:31:02 -07:00
Benoit Steiner
a452dedb4f Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
Enable efficient Tensor reduction for doubles on the GPU (continued)
2016-08-18 12:29:54 -07:00
Igor Babuschkin
18c67df31c Fix remaining CUDA >= 300 checks 2016-08-18 17:18:30 +01:00
Igor Babuschkin
1569a7d7ab Add the necessary CUDA >= 300 checks back 2016-08-18 17:15:12 +01:00
Benoit Steiner
2b17f34574 Properly detect the type of the result of a contraction. 2016-08-16 16:00:30 -07:00
Igor Babuschkin
841e075154 Remove CUDA >= 300 checks and enable outer reductin for doubles 2016-08-06 18:07:50 +01:00
Igor Babuschkin
0425118e2a Merge upstream changes 2016-08-05 14:34:57 +01:00
Igor Babuschkin
9537e8b118 Make use of atomicExch for atomicExchCustom 2016-08-05 14:29:58 +01:00
Igor Babuschkin
eeb0d880ee Enable efficient Tensor reduction for doubles 2016-07-01 19:08:26 +01:00
Gael Guennebaud
d476cadbb8 bug #1247: fix regression in compilation of pow(integer,integer), and add respective unit tests. 2016-06-25 10:12:06 +02:00
Gael Guennebaud
cfff370549 Fix hyperbolic functions for autodiff. 2016-06-24 23:21:35 +02:00
Gael Guennebaud
c50c73cae2 Fix missing specialization. 2016-06-24 23:10:39 +02:00
Gael Guennebaud
3852351793 merge pull request 198 2016-06-24 11:48:17 +02:00
Gael Guennebaud
6dd9077070 Fix some unused typedef warnings. 2016-06-24 11:34:21 +02:00
Gael Guennebaud
ce90647fa5 Fix NumTraits<AutoDiff> 2016-06-24 11:34:02 +02:00
Gael Guennebaud
fa39f81b48 Fix instantiation of ScalarBinaryOpTraits for AutoDiff. 2016-06-24 11:33:30 +02:00
Gael Guennebaud
cd577a275c Relax promote_scalar_arg logic to enable promotion to Expr::Scalar if conversion to Expr::Literal fails.
This is useful to cancel expression template at the scalar level, e.g. with AutoDiff<AutoDiff<>>.
This patch also defers calls to NumTraits in cases for which types are not directly compatible.
2016-06-24 11:28:54 +02:00
Gael Guennebaud
deb45ad4bc bug #1245: fix compilation with msvc 2016-06-24 09:52:25 +02:00
Rasmus Munk Larsen
a9c1e4d7b7 Return -1 from CurrentThreadId when called by thread outside the pool. 2016-06-23 16:40:07 -07:00
Rasmus Munk Larsen
d39df320d2 Resolve merge. 2016-06-23 15:08:03 -07:00
Gael Guennebaud
361dbd246d Add unit test for printing empty tensors 2016-06-23 18:54:30 +02:00
Gael Guennebaud
360a743a10 bug #1241: does not emmit anything for empty tensors 2016-06-23 18:47:31 +02:00
Gael Guennebaud
55fc04e8b5 Fix operator priority 2016-06-23 15:36:42 +02:00
Gael Guennebaud
bf2d5edecc Fix warning. 2016-06-23 15:35:17 +02:00
Gael Guennebaud
7c6561485a merge PR 194 2016-06-23 15:29:57 +02:00
Konstantinos Margaritis
be107e387b fix compilation with clang 3.9, fix performance with pset1, use vector operators instead of intrinsics in some cases 2016-06-23 10:19:05 -03:00
Gael Guennebaud
76faf4a965 Introduce a NumTraits<T>::Literal type to be used for literals, and
improve mixing type support in operations between arrays and scalars:
 - 2 * ArrayXcf is now optimized in the sense that the integer 2 is properly promoted to a float instead of a complex<float> (fix a regression)
 - 2.1 * ArrayXi is now forbiden (previously, 2.1 was converted to 2)
 - This mechanism should be applicable to any custom scalar type, assuming NumTraits<T>::Literal is properly defined (it defaults to T)
2016-06-23 14:27:20 +02:00
Gael Guennebaud
a3f7edf7e7 Biug 1242: fix comma init with empty matrices. 2016-06-23 10:25:04 +02:00
Benoit Steiner
a29a2cb4ff Silenced a couple of compilation warnings generated by xcode 2016-06-22 16:43:02 -07:00
Benoit Steiner
f8fcd6b32d Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers 2016-06-22 16:03:11 -07:00
Benoit Steiner
c58df31747 Handle empty tensors in the print functions 2016-06-21 09:22:43 -07:00
Benoit Steiner
de32f8d656 Fixed the printing of rank-0 tensors 2016-06-20 10:46:45 -07:00
Konstantinos Margaritis
8c34b5a0e3 mostly cleanups and modernizing code 2016-06-19 16:13:17 -03:00
Konstantinos Margaritis
b410d46482 mostly cleanups and modernizing code 2016-06-19 16:12:52 -03:00
Konstantinos Margaritis
b80379bda0 fixed pexp<Packet2d>, was failing tests 2016-06-19 16:11:58 -03:00
Tal Hadad
8e198d6835 Complete docs and add ostream operator for EulerAngles. 2016-06-19 20:42:45 +03:00
Benoit Steiner
b055590e91 Made log1p_impl usable inside a GPU kernel 2016-06-16 11:37:40 -07:00
Geoffrey Lalonde
72c95383e0 Add autodiff coverage for standard library hyperbolic functions, and tests.
* * *
Corrected tanh derivatived, moved test definitions.
* * *
Added more test cases, removed lingering lines
2016-06-15 23:33:19 -07:00
Gael Guennebaud
67c12531e5 Fix warnings with gcc 2016-06-15 18:11:33 +02:00
Gael Guennebaud
eb91345d64 Move scalar/expr to ArrayBase and fix documentation 2016-06-15 15:22:03 +02:00
Gael Guennebaud
4794834397 Propagate functor to ScalarBinaryOpTraits 2016-06-15 09:58:49 +02:00
Gael Guennebaud
c55035b9c0 Include the cost of stores in unrolling of triangular expressions. 2016-06-15 09:57:33 +02:00
Benoit Steiner
7d495d890a Merged in ibab/eigen (pull request PR-197)
Implement exclusive scan option for Tensor library
2016-06-14 17:54:59 -07:00
Benoit Steiner
aedc5be1d6 Avoid generating pseudo random numbers that are multiple of 5: this helps
spread the load over multiple cpus without havind to rely on work stealing.
2016-06-14 17:51:47 -07:00
Gael Guennebaud
4e7c3af874 Cleanup useless helper: internal::product_result_scalar 2016-06-15 00:04:10 +02:00
Gael Guennebaud
101ea26f5e Include the cost of stores in unrolling (also fix infinite unrolling with expression costing 0 like Constant) 2016-06-15 00:01:16 +02:00
Igor Babuschkin
c4d10e921f Implement exclusive scan option 2016-06-14 19:44:07 +01:00
Gael Guennebaud
76236cdea4 merge 2016-06-14 15:33:47 +02:00
Gael Guennebaud
1004c4df99 Cleanup unused functors. 2016-06-14 15:27:28 +02:00
Gael Guennebaud
70dad84b73 Generalize expr/expr and scalar/expr wrt scalar types. 2016-06-14 15:26:37 +02:00
Gael Guennebaud
62134082aa Update AutoDiffScalar wrt to scalar-multiple. 2016-06-14 15:06:35 +02:00
Gael Guennebaud
5d38203735 Update Tensor module to use bind1st_op and bind2nd_op 2016-06-14 15:06:03 +02:00
Gael Guennebaud
396d9cfb6e Generalize expr.pow(scalar), pow(expr,scalar) and pow(scalar,expr).
Internal: scalar_pow_op (unary) is removed, and scalar_binary_pow_op is renamed scalar_pow_op.
2016-06-14 14:10:07 +02:00
Gael Guennebaud
a9bb653a68 Update doc (scalar_add_op is now deprecated) 2016-06-14 12:07:00 +02:00
Gael Guennebaud
a8c08e8b8e Implement expr+scalar, scalar+expr, expr-scalar, and scalar-expr as binary expressions, and generalize supported scalar types.
The following functors are now deprecated: scalar_add_op, scalar_sub_op, and scalar_rsub_op.
2016-06-14 12:06:10 +02:00
Gael Guennebaud
756ac4a93d Fix doc. 2016-06-14 12:03:39 +02:00
Gael Guennebaud
f925dba3d9 Fix compilation of BVH example 2016-06-14 11:32:09 +02:00
Gael Guennebaud
12350d3ac7 Add unit test for AlignedBox::center 2016-06-14 11:31:52 +02:00
Gael Guennebaud
bcc0f38f98 Add unittesting plugins to scalar_product_op and scalar_quotient_op to help chaking that types are properly propagated. 2016-06-14 11:31:27 +02:00
Gael Guennebaud
f57fd78e30 Generalize coeff-wise sparse products to support different scalar types 2016-06-14 11:29:54 +02:00
Gael Guennebaud
f5b1c73945 Set cost of constant expression to 0 (the cost should be amortized through the expression) 2016-06-14 11:29:06 +02:00
Gael Guennebaud
deb8306e60 Move MatrixBase::operaotr*(UniformScaling) as a free function in Scaling.h, and fix return type. 2016-06-14 11:28:03 +02:00
Gael Guennebaud
64fcfd314f Implement scalar multiples and division by a scalar as a binary-expression with a constant expression.
This slightly complexifies the type of the expressions and implies that we now have to distinguish between scalar*expr and expr*scalar to catch scalar-multiple expression (e.g., see BlasUtil.h), but this brings several advantages:
- it makes it clear on each side the scalar is applied,
- it clearly reflects that we are dealing with a binary-expression,
- the complexity of the type is hidden through macros defined at the end of Macros.h,
- distinguishing between "scalar op expr" and "expr op scalar" is important to support non commutative fields (like quaternions)
- "scalar op expr" is now fully equivalent to "ConstantExpr(scalar) op expr"
- scalar_multiple_op, scalar_quotient1_op and scalar_quotient2_op are not used anymore in officially supported modules (still used in Tensor)
2016-06-14 11:26:57 +02:00
Gael Guennebaud
39781dc1e2 Fix compilation of evaluator unit test 2016-06-14 11:03:26 +02:00
Tal Hadad
6edfe8771b Little bit docs 2016-06-13 22:03:19 +03:00
Tal Hadad
6e1c086593 Add static assertion 2016-06-13 21:55:17 +03:00
Gael Guennebaud
3c12e24164 Add bind1st_op and bind2nd_op helpers to turn binary functors into unary ones, and implement scalar_multiple2 and scalar_quotient2 on top of them. 2016-06-13 16:18:59 +02:00
Gael Guennebaud
7a9ef7bbb4 Add default template parameters for the second scalar type of binary functors.
This enhences backward compatibility.
2016-06-13 16:17:23 +02:00
Gael Guennebaud
2ca2ffb65e check for mixing types in "array / scalar" expressions 2016-06-13 16:15:32 +02:00
Gael Guennebaud
4c61f00838 Add missing explicit scalar conversion 2016-06-12 22:42:13 +02:00
Tal Hadad
06206482d9 More docs, and minor code fixes 2016-06-12 23:40:17 +03:00
Gael Guennebaud
a3a4714aba Add debug output. 2016-06-11 14:41:53 +02:00
Gael Guennebaud
83904a21c1 Make sure T(i+1,i)==0 when diagonalizing T(i:i+1,i:i+1) 2016-06-11 14:41:36 +02:00
Benoit Steiner
65d33e5898 Merged in ibab/eigen (pull request PR-195)
Add small fixes to TensorScanOp
2016-06-10 19:31:17 -07:00
Benoit Steiner
a05607875a Don't refer to the half2 type unless it's been defined 2016-06-10 11:53:56 -07:00
Gael Guennebaud
fabae6c9a1 Cleanup 2016-06-10 15:58:33 +02:00
Gael Guennebaud
5de8d7036b Add real.pow(complex), complex.pow(real) unit tests. 2016-06-10 15:58:22 +02:00
Gael Guennebaud
5fdd703629 Enable mixing types in numext::pow 2016-06-10 15:58:04 +02:00
Gael Guennebaud
2e238bafb6 Big 279: enable mixing types for comparisons, min, and max. 2016-06-10 15:05:43 +02:00
Gael Guennebaud
0028049380 bug #1240: Remove any assumption on NEON vector types. 2016-06-09 23:08:11 +02:00
Igor Babuschkin
86aedc9282 Add small fixes to TensorScanOp 2016-06-07 20:06:38 +01:00
Christoph Hertzberg
db0118342c Fixed compilation of BVH_Example (required for make doc) 2016-06-07 19:17:18 +02:00
Benoit Steiner
84b2060a9e Fixed compilation error with gcc 4.4 2016-06-06 17:16:19 -07:00
Gael Guennebaud
2c462f4201 Clean handling for void type in EIGEN_CHECK_BINARY_COMPATIBILIY 2016-06-06 23:11:38 +02:00
Gael Guennebaud
3d71d3918e Disable shortcuts for res ?= prod when the scalar types do not match exactly. 2016-06-06 23:10:55 +02:00
Benoit Steiner
7ef9f47b58 Misc small improvements to the reduction code. 2016-06-06 14:09:46 -07:00
Benoit Steiner
ea75dba201 Added missing EIGEN_DEVICE_FUNC qualifiers to the unary array ops 2016-06-06 13:32:28 -07:00
Benoit Steiner
33f0340188 Implement result_of for the new ternary functors 2016-06-06 12:06:42 -07:00
Tal Hadad
e30133e439 Doc EulerAngles class, and minor fixes. 2016-06-06 22:01:40 +03:00
Gael Guennebaud
df24f4a01d bug #1201: improve code generation of affine*vec with MSVC 2016-06-06 16:46:46 +02:00
Benoit Steiner
9137f560f0 Moved assertions to the constructor to make the code more portable 2016-06-06 07:26:48 -07:00
Gael Guennebaud
66e99ab6a1 Relax mixing-type constraints for binary coefficient-wise operators:
- Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP>
- Remove the "functor_is_product_like" helper (was pretty ugly)
- Currently, OP is not used, but it is available to the user for fine grained tuning
- Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-=
- TODO: generalize all other binray operators (comparisons,pow,etc.)
- TODO: handle "scalar op array" operators (currently only * is handled)
- TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits
2016-06-06 15:11:41 +02:00
Benoit Steiner
1f1e0b9e30 Silenced compilation warning 2016-06-05 12:59:11 -07:00
Benoit Steiner
5b95b4daf9 Moved static assertions into the class constructor to make the code more portable 2016-06-05 12:57:48 -07:00
Christoph Hertzberg
d7e3e4bb04 Removed executable bits from header files. 2016-06-05 10:15:41 +02:00
Eugene Brevdo
c53687dd14 Add randomized properties tests for betainc special function. 2016-06-05 11:10:30 -07:00
Rasmus Munk Larsen
f1f2ff8208 size_t -> int 2016-06-03 18:06:37 -07:00
Rasmus Munk Larsen
76308e7fd2 Add CurrentThreadId and NumThreads methods to Eigen threadpools and TensorDeviceThreadPool. 2016-06-03 16:28:58 -07:00
Sean Templeton
bd21243821 Fix compile errors initializing packets on ARM DS-5 5.20
The ARM DS-5 5.20 compiler fails compiling with the following errors:

"src/Core/arch/NEON/PacketMath.h", line 113: Error:  #146: too many initializer values
    Packet4f countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
                         ^
"src/Core/arch/NEON/PacketMath.h", line 118: Error:  #146: too many initializer values
    Packet4i countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
                         ^
"src/Core/arch/NEON/Complex.h", line 30: Error:  #146: too many initializer values
  static uint32x4_t p4ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET4(0x00000000, 0x80000000, 0x00000000, 0x80000000);
                                    ^
"src/Core/arch/NEON/Complex.h", line 31: Error:  #146: too many initializer values
  static uint32x2_t p2ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET2(0x00000000, 0x80000000);
                                    ^

The vectors are implemented as two doubles, hence the too many initializer values error.
Changed the code to use intrinsic load functions which all compilers
implementing NEON should have.
2016-06-03 10:51:35 -05:00
Gael Guennebaud
1fc2746417 Make Arrays's ctor/assignment noexcept 2016-06-09 22:52:37 +02:00
Benoit Steiner
37638dafd7 Simplified the code that dispatches vectorized reductions on GPU 2016-06-09 10:29:52 -07:00
Benoit Steiner
66796e843d Fixed definition of some of the reducer_traits 2016-06-09 08:50:01 -07:00
Benoit Steiner
4434b16694 Pulled latest updates from trunk 2016-06-09 08:25:47 -07:00
Benoit Steiner
14a112ee15 Use signed integers more consistently to encode the number of threads to use to evaluate a tensor expression. 2016-06-09 08:25:22 -07:00
Benoit Steiner
8f92c26319 Improved code formatting 2016-06-09 08:23:42 -07:00
Benoit Steiner
aa33446dac Improved support for vectorization of 16-bit floats 2016-06-09 08:22:27 -07:00
Gael Guennebaud
e2b3836326 Include recent changesets that played with product's kernel 2016-06-09 17:13:33 +02:00
Gael Guennebaud
2bd59b0e0d Take advantage that T is already diagonal in the extraction of generalized complex eigenvalues. 2016-06-09 17:12:03 +02:00
Gael Guennebaud
c1f9ca9254 Update RealQZ to reduce 2x2 diagonal block of T corresponding to non reduced diagonal block of S to positive diagonal form.
This step involve a real 2x2 SVD problem. The respective routine is thus in src/misc/ to be shared by both EVD and AVD modules.
2016-06-09 17:11:03 +02:00
Gael Guennebaud
15890c304e Add unit test for non symmetric generalized eigenvalues 2016-06-09 16:17:27 +02:00
Gael Guennebaud
a20d2ec1c0 Fix shadow variable, and indexing. 2016-06-09 16:16:22 +02:00
Abhijit Kundu
0beabb4776 Fixed type conversion from int 2016-06-08 16:12:04 -04:00
Gael Guennebaud
df095cab10 Fixes for PARDISO: warnings, and defaults to metis+ in-core mode. 2016-06-08 18:31:19 +02:00
Gael Guennebaud
9fc8379328 Fix extraction of complex eigenvalue pairs in real generalized eigenvalue problems. 2016-06-08 16:39:11 +02:00
Christoph Hertzberg
9dd9d58273 Copied a regression test from 3.2 branch. 2016-06-08 15:36:42 +02:00
Benoit Steiner
8fd57a97f2 Enable the vectorization of adds and mults of fp16 2016-06-07 18:22:18 -07:00
Benoit Steiner
d6d39c7ddb Added missing EIGEN_DEVICE_FUNC 2016-06-07 14:35:08 -07:00
Gael Guennebaud
8d97ba6b22 bug #725: make move ctor/assignment noexcept. 2016-06-03 14:28:25 +02:00
Gael Guennebaud
e8b922ca63 Fix MatrixFunctions module. 2016-06-03 09:21:35 +02:00
Gael Guennebaud
82293f38d6 Fix unit test. 2016-06-03 08:12:14 +02:00
Gael Guennebaud
fe62c06d9b Fix compilation. 2016-06-03 07:47:38 +02:00
Gael Guennebaud
969b8959a0 Fix compilation: Matrix does not indirectly live in the internal namespace anymore! 2016-06-03 07:44:58 +02:00
Gael Guennebaud
f2c2465acc Fix function dependencies 2016-06-03 07:44:18 +02:00
Benoit Steiner
c3c8ad8046 Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing 2016-06-02 21:17:41 -07:00
Eugene Brevdo
39baff850c Add TernaryFunctors and the betainc SpecialFunction.
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.

Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.

Added unit tests in array.cpp and cxx11_tensor_cuda.cu


Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc().  Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe.  and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Benoit Steiner
02db4e1a82 Disable the tensor tests when using msvc since older versions of the compiler fail to handle this code 2016-06-04 08:21:17 -07:00
Benoit Steiner
c21eaedce6 Use array_prod to compute the number of elements contained in the input tensor expression 2016-06-04 07:47:04 -07:00
Benoit Steiner
36a4500822 Merged in ibab/eigen (pull request PR-192)
Add generic scan method
2016-06-03 17:28:33 -07:00
Benoit Steiner
c2a102345f Improved the performance of full reductions.
AFTER:
BM_fullReduction/10        4541       4543     154017  21.0M items/s
BM_fullReduction/64        5191       5193     100000  752.5M items/s
BM_fullReduction/512       9588       9588      71361  25.5G items/s
BM_fullReduction/4k      244314     244281       2863  64.0G items/s
BM_fullReduction/5k      359382     359363       1946  64.8G items/s

BEFORE:
BM_fullReduction/10        9085       9087      74395  10.5M items/s
BM_fullReduction/64        9478       9478      72014  412.1M items/s
BM_fullReduction/512      14643      14646      46902  16.7G items/s
BM_fullReduction/4k      260338     260384       2678  60.0G items/s
BM_fullReduction/5k      385076     385178       1818  60.5G items/s
2016-06-03 17:27:08 -07:00
Igor Babuschkin
dc03b8f3a1 Add generic scan method 2016-06-03 17:37:04 +01:00
Gael Guennebaud
5b77481d58 merge 2016-06-02 22:21:45 +02:00
Gael Guennebaud
53feb73b45 Remove dead code. 2016-06-02 22:19:55 +02:00
Gael Guennebaud
2c00ac0b53 Implement generic scalar*expr and expr*scalar operator based on scalar_product_traits.
This is especially useful for custom scalar types, e.g., to enable float*expr<multi_prec> without conversion.
2016-06-02 22:16:37 +02:00
Rasmus Munk Larsen
811aadbe00 Add syntactic sugar to Eigen tensors to allow more natural syntax.
Specifically, this enables expressions involving:

scalar + tensor
scalar * tensor
scalar / tensor
scalar - tensor
2016-06-02 12:41:28 -07:00
Tal Hadad
52e4cbf539 Merged eigen/eigen into default 2016-06-02 22:15:20 +03:00
Tal Hadad
2aaaf22623 Fix Gael reports (except documention)
- "Scalar angle(int) const"  should be  "const Vector& angles() const"
- then method "coeffs" could be removed.
- avoid one letter names like h, p, r -> use alpha(), beta(), gamma() ;)
- about the "fromRotation" methods:
 - replace the ones which are not static by operator= (as in Quaternion)
 - the others are actually static methods: use a capital F: FromRotation
- method "invert" should be removed.
- use a macro to define both float and double EulerAnglesXYZ* typedefs
- AddConstIf -> not used
- no needs for NegateIfXor, compilers are extremely good at optimizing away branches based on compile time constants:
  if(IsHeadingOpposite-=IsEven) res.alpha() = -res.alpha();
2016-06-02 22:12:57 +03:00
Benoit Steiner
6021c90fdf Merged in ibab/eigen (pull request PR-189)
Add scan op to Tensor module
2016-06-02 08:08:11 -07:00
Gael Guennebaud
8b6f53222b bug #1193: fix lpNorm<Infinity> for empty input. 2016-06-02 15:29:59 +02:00
Gael Guennebaud
d616a81294 Disable MSVC's "decorated name length exceeded, name was truncated" warning in unit tests. 2016-06-02 14:48:38 +02:00
Gael Guennebaud
61a32f2a4c Fix pointer to long conversion warning. 2016-06-02 14:45:45 +02:00
Igor Babuschkin
fbd7ed6ff7 Add tensor scan op
This is the initial implementation a generic scan operation.
Based on this, cumsum and cumprod method have been added to TensorBase.
2016-06-02 13:35:47 +01:00
Benoit Steiner
0ed08fd281 Use a single PacketSize variable 2016-06-01 21:19:05 -07:00
Benoit Steiner
8f6fedc55f Fixed compilation warning 2016-06-01 21:14:46 -07:00
Benoit Steiner
c3cada38e2 Speedup a test 2016-06-01 21:13:00 -07:00
Gael Guennebaud
360e311b66 Doc: add some cross references (also fix empty macro argument warning) 2016-06-01 23:34:09 +02:00
Benoit Steiner
873e6ac54b Silenced compilation warning generated by nvcc. 2016-06-01 14:20:50 -07:00
Benoit Steiner
d27b0ad4c8 Added support for mean reductions on fp16 2016-06-01 11:12:07 -07:00
Gael Guennebaud
cd221a62ee Doc: start of a table summarizing coefficient-wise math functions. 2016-06-01 17:09:48 +02:00
Gael Guennebaud
3c69afca4c Add missing ArrayBase::log1p 2016-06-01 17:08:47 +02:00
Gael Guennebaud
89099b0cf7 Expose log1p to Array. 2016-06-01 17:00:08 +02:00
Gael Guennebaud
afd33539dd Doc: makes the global unary math functions visible to doxygen (and docuement them) 2016-06-01 15:27:13 +02:00
Gael Guennebaud
77e652d8ad Doc: improve documentation of Map<SparseMatrix> 2016-06-01 10:03:32 +02:00
Gael Guennebaud
da4970ead2 Doc: disable inlining of inherited members, workaround Doxygen's limited C++ parsing abilities, and improve doc of MapBase. 2016-06-01 09:38:49 +02:00
Benoit Steiner
099b354ca7 Pulled latest updates from trunk 2016-05-31 10:34:16 -07:00
Benoit Steiner
5aeb3687c4 Only enable optimized reductions of fp16 if the reduction functor supports them 2016-05-31 10:33:40 -07:00
Benoit Steiner
b6e306f189 Improved support for CUDA 8.0 2016-05-31 09:47:59 -07:00
Gael Guennebaud
1d3b253329 bug #1181: help MSVC inlining. 2016-05-31 17:23:42 +02:00
Gael Guennebaud
d79eee05ef Fix compilation with old icc 2016-05-31 17:13:51 +02:00
Gael Guennebaud
2c1b56f4c1 bug #1238: fix SparseMatrix::sum() overload for un-compressed mode. 2016-05-31 10:56:53 +02:00
Benoit Steiner
c4bd3b1f21 Silenced some compilation warnings triggered by nvcc 8.0 2016-05-27 14:40:49 -07:00
Benoit Steiner
e2946d962d Reimplement clamp as a static function. 2016-05-27 12:58:43 -07:00
Benoit Steiner
e96d36d4cd Use NULL instead of nullptr to preserve the compatibility with cxx03 2016-05-27 12:54:06 -07:00
Benoit Steiner
abc815798b Added a new operation to enable more powerful tensorindexing. 2016-05-27 12:22:25 -07:00
Benoit Steiner
5707537592 Fixed option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr' warning generated by nvcc 7.5 2016-05-27 10:47:53 -07:00
Benoit Steiner
3a5d6a3c38 Disable the use of MMX instructions since the code is broken on many platforms 2016-05-27 09:13:26 -07:00
Christoph Hertzberg
f2c86384f4 Cleaner implementation of dont_over_optimize. 2016-05-27 11:13:38 +02:00
Gael Guennebaud
22a035db95 Fix compilation when defaulting to row-major 2016-05-27 10:31:11 +02:00
Gael Guennebaud
e0cb73b46b Fix compilation with old ICC version (use C99 types instead of C++11 ones) 2016-05-27 10:28:09 +02:00
Benoit Steiner
1ae2567861 Fixed some compilation warnings 2016-05-26 15:57:19 -07:00
Benoit Steiner
094f4a56c8 Deleted extra namespace 2016-05-26 14:49:51 -07:00
Benoit Steiner
1a47844529 Preserve the ability to vectorize the evaluation of an expression even when it involves a cast that isn't vectorized (e.g fp16 to float) 2016-05-26 14:37:09 -07:00
Benoit Steiner
36369ab63c Resolved merge conflicts 2016-05-26 13:39:39 -07:00
Benoit Steiner
28fcb5ca2a Merged latest reduction improvements 2016-05-26 12:19:33 -07:00
Benoit Steiner
b24cf21235 Merged latest code improvements 2016-05-26 11:57:50 -07:00
Benoit Steiner
c1c7f06c35 Improved the performance of inner reductions. 2016-05-26 11:53:59 -07:00
Benoit Steiner
22d02c9855 Improved the coverage of the fp16 reduction tests 2016-05-26 11:12:16 -07:00
Christoph Hertzberg
41dcd047d7 bug #1237: Redefine eigen_assert instead of disabling assertions for documentation snippets 2016-05-26 18:13:33 +02:00
Benoit Steiner
8288b0aec2 Code cleanup. 2016-05-26 09:00:04 -07:00
Gael Guennebaud
7ff5fadcc0 Disable usage of MMX with msvc. 2016-05-26 17:58:46 +02:00
Gael Guennebaud
e8cef383b7 bug #1236: fix possible integer overflow in density estimation. 2016-05-26 17:51:04 +02:00
Gael Guennebaud
35df3a32eb Disabled GCC6's ignored-attributes warning in packetmath unit test. 2016-05-26 17:42:58 +02:00
Gael Guennebaud
db62719eda Fix some conversion warnings in unit tests. 2016-05-26 17:42:12 +02:00
Gael Guennebaud
fdcad686ee Fix numerous pointer-to-integer conversion warnings in unit tests. 2016-05-26 17:41:28 +02:00
Gael Guennebaud
30d97c03ce Defer the allocation of the working space:
- it is not always needed,
- and this fixes a long-to-float conversion warning
2016-05-26 17:39:42 +02:00
Gael Guennebaud
e08f54e9eb Fix copy ctor prototype. 2016-05-26 17:37:25 +02:00
Gael Guennebaud
c7f54b11ec linspaced's divisor for integer is better stored as the underlying scalar type. 2016-05-26 17:36:54 +02:00
Gael Guennebaud
bebc5a2147 Fix/handle some int-to-long conversions. 2016-05-26 17:35:53 +02:00
Gael Guennebaud
00c29c2cae Store permutation's determinant as char.
This also fixes some long to float conversion warnings
2016-05-26 17:34:23 +02:00
Gael Guennebaud
2f56d91063 Fix a pointer to integer conversion warning 2016-05-26 17:31:45 +02:00
Gael Guennebaud
2a44a70142 Handle some Index to int conversions in BLAS/LAPACK support. 2016-05-26 17:29:04 +02:00
Gael Guennebaud
f253e19296 Disable some long to float conversion warnings 2016-05-26 17:27:14 +02:00
Christoph Hertzberg
2ee306e44a Temporary workaround for bug #1237. The snippet (expectedly) failed with enabled assertions. 2016-05-26 16:16:41 +02:00
Gael Guennebaud
37197b602b Remove debuging code. 2016-05-26 11:53:10 +02:00
Gael Guennebaud
27f0434233 Introduce internal's UIntPtr and IntPtr types for pointer to integer conversions.
This fixes "conversion from pointer to same-sized integral type" warnings by ICC.
Ideally, we would use the std::[u]intptr_t types all the time, but since they are C99/C++11 only,
let's be safe.
2016-05-26 10:52:12 +02:00
Gael Guennebaud
40e4637d79 Turn off ICC's conversion warning in is_convertible implementation 2016-05-26 10:48:43 +02:00
Gael Guennebaud
cc1ab64f29 Add missing inclusion of mmintrin.h 2016-05-26 09:51:50 +02:00
Benoit Steiner
2d7ed54ba2 Made the static storage class qualifier come first. 2016-05-25 22:16:15 -07:00
Benoit Steiner
e1fca8866e Deleted unnecessary explicit qualifiers. 2016-05-25 22:15:26 -07:00
Benoit Steiner
9b0aaf5113 Don't mark inline functions as static since it confuses the ICC compiler 2016-05-25 22:10:11 -07:00
Benoit Steiner
3585ff585e Silenced a compilation warning 2016-05-25 22:09:19 -07:00
Benoit Steiner
037a463fd5 Marked unused variables as such 2016-05-25 22:07:48 -07:00
Benoit Steiner
efeb89dcdb Specify the rounding mode in the correct location 2016-05-25 17:53:24 -07:00
Benoit Steiner
457204cb83 Updated the README file for the tensor benchmarks 2016-05-25 16:13:41 -07:00
Benoit Steiner
0322c66a3f Explicitly specify the rounding mode when converting floats to fp16 2016-05-25 15:56:15 -07:00
Benoit Steiner
3ac4045272 Made the IndexPair code compile in non cxx11 mode 2016-05-25 15:15:12 -07:00
Benoit Steiner
66556d0e05 Made the index pair list code more portable accross various compilers 2016-05-25 14:34:27 -07:00
Benoit Steiner
034aa3b2c0 Improved the performance of tensor padding 2016-05-25 11:43:08 -07:00
Benoit Steiner
58026905ae Added support for statically known lists of pairs of indices 2016-05-25 11:04:14 -07:00
Benoit Steiner
ed783872ab Disable the use of MMX instructions on x86_64 since too many compilers only support them in 32bit mode 2016-05-25 08:27:26 -07:00
Benoit Steiner
bcfff64f9e Use numext:: instead of std:: functions. 2016-05-25 08:08:21 -07:00
Gael Guennebaud
f57260a997 Fix typo in dont_over_optimize 2016-05-25 11:17:53 +02:00
Gael Guennebaud
2cd32be70b Fix warning. 2016-05-25 11:15:54 +02:00
Gael Guennebaud
bbf9109e25 Fix compilation with ICC. 2016-05-25 10:00:55 +02:00
Gael Guennebaud
2a1bff67fd Fix static/inline order. 2016-05-25 10:00:11 +02:00
Benoit Steiner
0835667329 There is no need to make the fp16 full reduction kernel a static function. 2016-05-24 23:11:56 -07:00
Benoit Steiner
b5d6b52a4d Fixed compilation warning 2016-05-24 23:10:57 -07:00
Benoit Steiner
d041a528da Cleaned up the fp16 code a little more 2016-05-24 22:43:26 -07:00
Benoit Steiner
cb26784d07 Pulled latest updates from trunk 2016-05-24 18:51:39 -07:00
Benoit Steiner
ff4a289572 Cleaned up the fp16 code 2016-05-24 18:50:09 -07:00
Gael Guennebaud
3f715e1701 update doc wrt to unaligned vectorization 2016-05-24 22:34:59 +02:00
Gael Guennebaud
9216abe28d Document EIGEN_UNALIGNED_VECTORIZE. 2016-05-24 22:14:34 +02:00
Gael Guennebaud
0fd953c217 Workaround clang/llvm bug in code generation. 2016-05-24 21:55:46 +02:00
Gael Guennebaud
e68e165a23 bug #256: enable vectorization with unaligned loads/stores.
This concerns all architectures and all sizes.
This new behavior can be disabled by defining EIGEN_UNALIGNED_VECTORIZE=0
2016-05-24 21:54:03 +02:00
Gael Guennebaud
78390e4189 Block<> should not disable vectorization based on inner-size, this is the responsibilty of the assignment logic. 2016-05-24 17:14:01 +02:00
Gael Guennebaud
64bb7576eb Clean propagation of Dest/Src alignments. 2016-05-24 17:12:12 +02:00
Benoit Jacob
40a16282c7 Remove now-unused protate PacketMath func 2016-05-24 11:01:18 -04:00
Benoit Jacob
6136f4fdd4 Remove the rotating kernel. It was only useful on some ARM CPUs (Qualcomm Krait) that are not as ubiquitous today as they were when I introduced it. 2016-05-24 10:00:32 -04:00
Benoit Steiner
e617711306 Don't attempt to use MMX instructions with visualstudio since they're only partially supported. 2016-05-24 06:43:58 -07:00
Benoit Steiner
334e76537f Worked around missing clang intrinsic 2016-05-24 00:29:28 -07:00
Benoit Steiner
b517ab349b Use the generic ploadquad intrinsics since it does the job 2016-05-24 00:11:17 -07:00
Benoit Steiner
646872cb3b Worked around missing clang intrinsics 2016-05-24 00:07:08 -07:00
Benoit Steiner
3dfc391a61 Added missing EIGEN_DEVICE_FUNC qualifier 2016-05-23 20:56:59 -07:00
Benoit Steiner
3d0741f027 Include mmintrin.h to make it possible to use mmx instructions when needed. For example, this will enable the definition of a half packet for the Packet4f type. 2016-05-23 20:43:48 -07:00
Benoit Steiner
33a94f5dc7 Use the Index type instead of integers to specify the strides in pgather/pscatter 2016-05-23 20:37:30 -07:00
Benoit Steiner
6bc684ab6a Added missing alignment in the fp16 packet traits 2016-05-23 20:32:30 -07:00
Benoit Steiner
283e33dea4 ptranspose is not a template. 2016-05-23 19:55:55 -07:00
Benoit Steiner
a5a3ba2b80 Avoid unnecessary float to double conversions 2016-05-23 17:16:09 -07:00
Benoit Steiner
5ba0ebe7c9 Avoid unnecessary float to double conversion. 2016-05-23 17:14:31 -07:00
Benoit Steiner
7d980d74e5 Started to vectorize the processing of 16bit floats on CPU. 2016-05-23 15:21:40 -07:00
Benoit Steiner
5d51a7f12c Don't optimize the processing of the last rows of a matrix matrix product in cases that violate the assumptions made by the optimized code path. 2016-05-23 15:13:16 -07:00
Benoit Steiner
7aa5bc9558 Fixed a typo in the array.cpp test 2016-05-23 14:39:51 -07:00
Benoit Steiner
a09cbf9905 Merged in rmlarsen/eigen (pull request PR-188)
Minor cleanups: 1. Get rid of a few unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.
2016-05-23 12:55:12 -07:00
Christoph Hertzberg
88654762da Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit. 2016-05-23 10:03:03 +02:00
Christoph Hertzberg
718521d5cf Silenced several double-promotion warnings 2016-05-22 18:17:04 +02:00
Christoph Hertzberg
b5a7603822 fixed macro name 2016-05-22 16:49:29 +02:00
Christoph Hertzberg
25a03c02d6 Fix some sign-compare warnings 2016-05-22 16:42:27 +02:00
Christoph Hertzberg
0851d5d210 Identify clang++ even if it is not named llvm-clang++ 2016-05-22 15:21:14 +02:00
Gael Guennebaud
6a15e14cda Document EIGEN_MAX_CPP_VER and user controllable compiler features. 2016-05-20 15:26:09 +02:00
Gael Guennebaud
ccaace03c9 Make EIGEN_HAS_CONSTEXPR user configurable 2016-05-20 15:10:08 +02:00
Gael Guennebaud
c3410804cd Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable 2016-05-20 15:05:38 +02:00
Gael Guennebaud
abd1c1af7a Make EIGEN_HAS_STD_RESULT_OF user configurable 2016-05-20 15:01:27 +02:00
Gael Guennebaud
1395056fc0 Make EIGEN_HAS_C99_MATH user configurable 2016-05-20 14:58:19 +02:00
Gael Guennebaud
48bf5ec216 Make EIGEN_HAS_RVALUE_REFERENCES user configurable 2016-05-20 14:54:20 +02:00
Gael Guennebaud
f43ae88892 Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES 2016-05-20 14:48:51 +02:00
Gael Guennebaud
8d6bd5691b polygamma is C99/C++11 only 2016-05-20 14:45:33 +02:00
Gael Guennebaud
998f2efc58 Add a EIGEN_MAX_CPP_VER option to limit the C++ version to be used. 2016-05-20 14:44:28 +02:00
Gael Guennebaud
c028d96089 Improve doc of special math functions 2016-05-20 14:18:48 +02:00
Gael Guennebaud
0ba32f99bd Rename UniformRandom to UnitRandom. 2016-05-20 13:21:34 +02:00
Gael Guennebaud
7a9d9cde94 Fix coding practice in Quaternion::UniformRandom 2016-05-20 13:19:52 +02:00
Joseph Mirabel
eb0cc2573a bug #823: add static method to Quaternion for uniform random rotations. 2016-05-20 13:15:40 +02:00
Gael Guennebaud
2f656ce447 Remove std:: to enable custom scalar types. 2016-05-19 23:13:47 +02:00
Rasmus Larsen
b1e080c752 Merged eigen/eigen into default 2016-05-18 15:21:50 -07:00
Rasmus Munk Larsen
5624219b6b Merge. 2016-05-18 15:16:06 -07:00
Rasmus Munk Larsen
7df811cfe5 Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL. 2016-05-18 15:09:48 -07:00
Benoit Steiner
bb3ff8e9d9 Advertize the packet api of the tensor reducers iff the corresponding packet primitives are available. 2016-05-18 14:52:49 -07:00
Gael Guennebaud
84df9142e7 bug #1231: fix compilation regression regarding complex_array/=real_array and add respective unit tests 2016-05-18 23:00:13 +02:00
Gael Guennebaud
21d692d054 Use coeff(i,j) instead of operator(). 2016-05-18 17:09:20 +02:00
Gael Guennebaud
8456bbbadb bug #1224: fix regression in (dense*dense).sparseView() by specializing evaluator<SparseView<Product>> for sparse products only. 2016-05-18 16:53:28 +02:00
Gael Guennebaud
b507b82326 Use default sorting strategy for square products. 2016-05-18 16:51:54 +02:00
Gael Guennebaud
1fa15ceee6 Extend sparse*sparse product unit test to check that the expected implementation is used (conservative vs auto pruning). 2016-05-18 16:50:54 +02:00
Gael Guennebaud
548a487800 bug #1229: bypass usage of Derived::Options which is available for plain matrix types only. Better use column-major storage anyway. 2016-05-18 16:44:05 +02:00
Gael Guennebaud
43790e009b Pass argument by const ref instead of by value in pow(AutoDiffScalar...) 2016-05-18 16:28:02 +02:00
Gael Guennebaud
1fbfab27a9 bug #1223: fix compilation of AutoDiffScalar's min/max operators, and add regression unit test. 2016-05-18 16:26:26 +02:00
Gael Guennebaud
448d9d943c bug #1222: fix compilation in AutoDiffScalar and add respective unit test 2016-05-18 16:00:11 +02:00
Gael Guennebaud
5a71eb5985 Big 1213: add regression unit test. 2016-05-18 14:03:03 +02:00
Gael Guennebaud
747e3290c0 bug #1213: rename some enums type for consistency. 2016-05-18 13:26:56 +02:00
Rasmus Munk Larsen
f519fca72b Reduce overhead for small tensors and cheap ops by short-circuiting the const computation and block size calculation in parallelFor. 2016-05-17 16:06:00 -07:00
Benoit Steiner
86ae94462e #if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if !defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non blocking thread pool is the default since it's more scalable, and one needs to request the old thread pool explicitly. 2016-05-17 14:06:15 -07:00
Benoit Steiner
997c335970 Fixed compilation error 2016-05-17 12:54:18 -07:00
Benoit Steiner
ebf6ada5ee Fixed compilation error in the tensor thread pool 2016-05-17 12:33:46 -07:00
Rasmus Munk Larsen
0bb61b04ca Merge upstream. 2016-05-17 10:26:10 -07:00
Rasmus Munk Larsen
0dbd68145f Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. 2016-05-17 10:25:19 -07:00
Rasmus Larsen
00228f2506 Merged eigen/eigen into default 2016-05-17 09:49:31 -07:00
Benoit Steiner
e7e64c3277 Enable the use of the packet api to evaluate tensor broadcasts. This speed things up quite a bit:
Before"
M_broadcasting/10        500000       3690    27.10 MFlops/s
BM_broadcasting/80        500000       4014  1594.24 MFlops/s
BM_broadcasting/640       100000      14770 27731.35 MFlops/s
BM_broadcasting/4K          5000     632711 39512.48 MFlops/s
After:
BM_broadcasting/10        500000       4287    23.33 MFlops/s
BM_broadcasting/80        500000       4455  1436.41 MFlops/s
BM_broadcasting/640       200000      10195 40173.01 MFlops/s
BM_broadcasting/4K          5000     423746 58997.57 MFlops/s
2016-05-17 09:24:35 -07:00
Benoit Steiner
5fa27574dd Allow vectorized padding on GPU. This helps speed things up a little
Before:
BM_padding/10            5000000        460   217.03 MFlops/s
BM_padding/80            5000000        460 13899.40 MFlops/s
BM_padding/640           5000000        461 888421.17 MFlops/s
BM_padding/4K            5000000        460 54316322.55 MFlops/s
After:
BM_padding/10            5000000        454   220.20 MFlops/s
BM_padding/80            5000000        455 14039.86 MFlops/s
BM_padding/640           5000000        452 904968.83 MFlops/s
BM_padding/4K            5000000        411 60750049.21 MFlops/s
2016-05-17 09:17:26 -07:00
Benoit Steiner
a910bcee43 Merged latest updates from trunk 2016-05-17 09:14:22 -07:00
Benoit Steiner
8d06c02ffd Allow vectorized padding on GPU. This helps speed things up a little.
Before:
BM_padding/10            5000000        460   217.03 MFlops/s
BM_padding/80            5000000        460 13899.40 MFlops/s
BM_padding/640           5000000        461 888421.17 MFlops/s
BM_padding/4K            5000000        460 54316322.55 MFlops/s
After:
BM_padding/10            5000000        454   220.20 MFlops/s
BM_padding/80            5000000        455 14039.86 MFlops/s
BM_padding/640           5000000        452 904968.83 MFlops/s
BM_padding/4K            5000000        411 60750049.21 MFlops/s
2016-05-17 09:13:27 -07:00
Benoit Steiner
86da77cb9b Pulled latest updates from trunk. 2016-05-17 07:21:48 -07:00
Benoit Steiner
92fc6add43 Don't rely on c++11 extension when we don't have to. 2016-05-17 07:21:22 -07:00
Benoit Steiner
2d74ef9682 Avoid float to double conversion 2016-05-17 07:20:11 -07:00
David Dement
ccc7563ac5 made a fix to the GMRES solver so that it now correctly reports the error achieved in the solution process 2016-05-16 14:26:41 -04:00
Gael Guennebaud
575bc44c3f Fix unit test. 2016-05-19 22:48:16 +02:00
Gael Guennebaud
ccb408ee6a Improve unit tests of zeta, polygamma, and digamma 2016-05-19 18:34:41 +02:00
Gael Guennebaud
6761c64d60 zeta and polygamma are not unary functions, but binary ones. 2016-05-19 18:34:16 +02:00
Gael Guennebaud
7a54032408 zeta and digamma do not require C++11/C99 2016-05-19 17:36:47 +02:00
Gael Guennebaud
ce12562710 Add some c++11 flags in documentation 2016-05-19 17:35:30 +02:00
Gael Guennebaud
b6ed8244b4 bug #1201: optimize affine*vector products 2016-05-19 16:09:15 +02:00
Gael Guennebaud
73693b5de6 bug #1221: disable gcc 6 warning: ignoring attributes on template argument 2016-05-19 15:21:53 +02:00
Gael Guennebaud
df9a5e13c6 Fix SelfAdjointEigenSolver for some input expression types, and add new regression unit tests for sparse and selfadjointview inputs. 2016-05-19 13:07:33 +02:00
Gael Guennebaud
6a2916df80 DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag. 2016-05-19 13:06:21 +02:00
Gael Guennebaud
a226f6af6b Add support for SelfAdjointView::diagonal() 2016-05-19 13:05:33 +02:00
Gael Guennebaud
ee7da3c7c5 Fix SelfAdjointView::triangularView for complexes. 2016-05-19 13:01:51 +02:00
Gael Guennebaud
b6b8578a67 bug #1230: add support for SelfadjointView::triangularView. 2016-05-19 11:36:38 +02:00
Benoit Steiner
a80d875916 Added missing costPerCoeff method 2016-05-16 09:31:10 -07:00
Benoit Steiner
83ef39e055 Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions.
Before:
BM_colReduction_12T/10       1000000       1949    51.29 MFlops/s
BM_colReduction_12T/80        100000      15636   409.29 MFlops/s
BM_colReduction_12T/640        20000      95100  4307.01 MFlops/s
BM_colReduction_12T/4K           500    4573423  5466.36 MFlops/s
BM_colReduction_4T/10        1000000       1867    53.56 MFlops/s
BM_colReduction_4T/80         500000       5288  1210.11 MFlops/s
BM_colReduction_4T/640         10000     106924  3830.75 MFlops/s
BM_colReduction_4T/4K            500    9946374  2513.48 MFlops/s
BM_colReduction_8T/10        1000000       1912    52.30 MFlops/s
BM_colReduction_8T/80         200000       8354   766.09 MFlops/s
BM_colReduction_8T/640         20000      85063  4815.22 MFlops/s
BM_colReduction_8T/4K            500    5445216  4591.19 MFlops/s
BM_rowReduction_12T/10       1000000       2041    48.99 MFlops/s
BM_rowReduction_12T/80        100000      15426   414.87 MFlops/s
BM_rowReduction_12T/640        50000      39117 10470.98 MFlops/s
BM_rowReduction_12T/4K           500    3034298  8239.14 MFlops/s
BM_rowReduction_4T/10        1000000       1834    54.51 MFlops/s
BM_rowReduction_4T/80         500000       5406  1183.81 MFlops/s
BM_rowReduction_4T/640         50000      35017 11697.16 MFlops/s
BM_rowReduction_4T/4K            500    3428527  7291.76 MFlops/s
BM_rowReduction_8T/10        1000000       1925    51.95 MFlops/s
BM_rowReduction_8T/80         200000       8519   751.23 MFlops/s
BM_rowReduction_8T/640         50000      33441 12248.42 MFlops/s
BM_rowReduction_8T/4K           1000    2852841  8763.19 MFlops/s


After:
BM_colReduction_12T/10      50000000         59  1678.30 MFlops/s
BM_colReduction_12T/80       5000000        725  8822.71 MFlops/s
BM_colReduction_12T/640        20000      90882  4506.93 MFlops/s
BM_colReduction_12T/4K           500    4668855  5354.63 MFlops/s
BM_colReduction_4T/10       50000000         59  1687.37 MFlops/s
BM_colReduction_4T/80        5000000        737  8681.24 MFlops/s
BM_colReduction_4T/640         50000     108637  3770.34 MFlops/s
BM_colReduction_4T/4K            500    7912954  3159.38 MFlops/s
BM_colReduction_8T/10       50000000         60  1657.21 MFlops/s
BM_colReduction_8T/80        5000000        726  8812.48 MFlops/s
BM_colReduction_8T/640         20000      91451  4478.90 MFlops/s
BM_colReduction_8T/4K            500    5441692  4594.16 MFlops/s
BM_rowReduction_12T/10      20000000         93  1065.28 MFlops/s
BM_rowReduction_12T/80       2000000        950  6730.96 MFlops/s
BM_rowReduction_12T/640        50000      38196 10723.48 MFlops/s
BM_rowReduction_12T/4K           500    3019217  8280.29 MFlops/s
BM_rowReduction_4T/10       20000000         93  1064.30 MFlops/s
BM_rowReduction_4T/80        2000000        959  6667.71 MFlops/s
BM_rowReduction_4T/640         50000      37433 10941.96 MFlops/s
BM_rowReduction_4T/4K            500    3036476  8233.23 MFlops/s
BM_rowReduction_8T/10       20000000         93  1072.47 MFlops/s
BM_rowReduction_8T/80        2000000        959  6670.04 MFlops/s
BM_rowReduction_8T/640         50000      38069 10759.37 MFlops/s
BM_rowReduction_8T/4K           1000    2758988  9061.29 MFlops/s
2016-05-16 08:55:21 -07:00
Benoit Steiner
b789a26804 Fixed syntax error 2016-05-16 08:51:08 -07:00
Benoit Steiner
83dfb40f66 Turnon the new thread pool by default since it scales much better over multiple cores. It is still possible to revert to the old thread pool by compiling with the EIGEN_USE_SIMPLE_THREAD_POOL define. 2016-05-13 17:23:15 -07:00
Benoit Steiner
97605c7b27 New multithreaded contraction that doesn't rely on the thread pool to run the closure in the order in which they are enqueued. This is needed in order to switch to the new non blocking thread pool since this new thread pool can execute the closure in any order. 2016-05-13 17:11:29 -07:00
Benoit Steiner
069a0b04d7 Added benchmarks for contraction on CPU. 2016-05-13 14:32:17 -07:00
Benoit Steiner
c4fc8b70ec Removed unnecessary thread synchronization 2016-05-13 10:49:38 -07:00
Benoit Steiner
7aa3557d31 Fixed compilation errors triggered by old versions of gcc 2016-05-12 18:59:04 -07:00
Rasmus Munk Larsen
5005b27fc8 Diasbled cost model by accident. Revert. 2016-05-12 16:55:21 -07:00
Rasmus Munk Larsen
989e419328 Address comments by bsteiner. 2016-05-12 16:54:19 -07:00
Rasmus Munk Larsen
e55deb21c5 Improvements to parallelFor.
Move some scalar functors from TensorFunctors. to Eigen core.
2016-05-12 14:07:22 -07:00
Benoit Steiner
ae9688f313 Worked around a compilation error triggered by nvcc when compiling a tensor concatenation kernel. 2016-05-12 12:06:51 -07:00
Benoit Steiner
2a54b70d45 Fixed potential race condition in the non blocking thread pool 2016-05-12 11:45:48 -07:00
Benoit Steiner
a071629fec Replace implicit cast with an explicit one 2016-05-12 10:40:07 -07:00
Benoit Steiner
2f9401b061 Worked around compilation errors with older versions of gcc 2016-05-11 23:39:20 -07:00
Benoit Steiner
09653e1f82 Improved the portability of the tensor code 2016-05-11 23:29:09 -07:00
Benoit Steiner
fae0493f98 Fixed a couple of bugs related to the Pascalfamily of GPUs
H: Enter commit message.  Lines beginning with 'HG:' are removed.
2016-05-11 23:02:26 -07:00
Benoit Steiner
886445ce4d Avoid unnecessary conversions between floats and doubles 2016-05-11 23:00:03 -07:00
Benoit Steiner
595e890391 Added more tests for half floats 2016-05-11 21:27:15 -07:00
Benoit Steiner
b6a517c47d Added the ability to load fp16 using the texture path.
Improved the performance of some reductions on fp16
2016-05-11 21:26:48 -07:00
Benoit Steiner
518149e868 Misc fixes for fp16 2016-05-11 20:11:14 -07:00
Benoit Steiner
56a1757d74 Made predux_min and predux_max on fp16 less noisy 2016-05-11 17:37:34 -07:00
Benoit Steiner
9091351dbe __ldg is only available with cuda architectures >= 3.5 2016-05-11 15:22:13 -07:00
Benoit Steiner
02f76dae2d Fixed a typo 2016-05-11 15:08:38 -07:00
Christoph Hertzberg
131e5a1a4a Do not copy for trivial 1x1 case. This also avoids a "maybe-uninitialized" warning in some situations. 2016-05-11 23:50:13 +02:00
Benoit Steiner
70195a5ff7 Added missing EIGEN_DEVICE_FUNC 2016-05-11 14:10:09 -07:00
Benoit Steiner
09a19c33a8 Added missing EIGEN_DEVICE_FUNC qualifiers 2016-05-11 14:07:43 -07:00
Christoph Hertzberg
1a1ce6ff61 Removed deprecated flag (which apparently was ignored anyway) 2016-05-11 23:05:37 +02:00
Christoph Hertzberg
2150f13d65 fixed some double-promotion and sign-compare warnings 2016-05-11 23:02:26 +02:00
Christoph Hertzberg
7268b10203 Split unit test 2016-05-11 19:41:53 +02:00
Christoph Hertzberg
8d4ef391b0 Don't flood test output with successful VERIFY_IS_NOT_EQUAL tests. 2016-05-11 19:40:45 +02:00
Christoph Hertzberg
bda21407dd Fix help output of buildtests and check scripts 2016-05-11 19:39:09 +02:00
Christoph Hertzberg
33ca7e3c8d bug #1207: Add and fix logical-op warnings 2016-05-11 19:36:34 +02:00
Benoit Steiner
217d984abc Fixed a typo in my previous commit 2016-05-11 10:22:15 -07:00
Benoit Steiner
08348b4e48 Fix potential race condition in the CUDA reduction code. 2016-05-11 10:08:51 -07:00
Benoit Steiner
cbb14ed47e Added a few tests to validate the generation of random tensors on GPU. 2016-05-11 10:05:56 -07:00
Benoit Steiner
6a5717dc74 Explicitely initialize all the atomic variables. 2016-05-11 10:04:41 -07:00
Christoph Hertzberg
0f61343893 Workaround maybe-uninitialized warning 2016-05-11 09:00:18 +02:00
Christoph Hertzberg
3bfc9b47ca Workaround "misleading-indentation" warnings 2016-05-11 08:41:36 +02:00
Benoit Steiner
4ede059de1 Properly gate the use of half2. 2016-05-10 17:04:01 -07:00
Benoit Steiner
bf185c3c28 Extended the tests for ptanh 2016-05-10 16:21:43 -07:00
Benoit Steiner
661e710092 Added support for fp16 to the sigmoid functor. 2016-05-10 12:25:27 -07:00
Benoit Steiner
0eb69b7552 Small improvement to the full reduction of fp16 2016-05-10 11:58:18 -07:00
Benoit Steiner
0b9e3dcd06 Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%. 2016-05-10 11:05:33 -07:00
Benoit Steiner
6bf8273bc0 Added a test to validate the new non blocking thread pool 2016-05-10 10:49:34 -07:00
Benoit Steiner
4013b8feca Simplified the reduction code a little. 2016-05-10 09:40:42 -07:00
Benoit Steiner
75bd2bd32d Fixed compilation warning 2016-05-09 19:24:41 -07:00
Benoit Steiner
4670d7d5ce Improved the performance of full reductions on GPU:
Before:
BM_fullReduction/10       200000      11751     8.51 MFlops/s
BM_fullReduction/80         5000     523385    12.23 MFlops/s
BM_fullReduction/640          50   36179326    11.32 MFlops/s
BM_fullReduction/4K            1 2173517195    11.50 MFlops/s

After:
BM_fullReduction/10       500000       5987    16.70 MFlops/s
BM_fullReduction/80       200000      10636   601.73 MFlops/s
BM_fullReduction/640       50000      58428  7010.31 MFlops/s
BM_fullReduction/4K         1000    2006106 12461.95 MFlops/s
2016-05-09 17:09:54 -07:00
Benoit Steiner
c3859a2b58 Added the ability to use a scratch buffer in cuda kernels 2016-05-09 17:05:53 -07:00
Benoit Steiner
ba95e43ea2 Added a new parallelFor api to the thread pool device. 2016-05-09 10:45:12 -07:00
Benoit Steiner
dc7dbc2df7 Optimized the non blocking thread pool:
* Use a pseudo-random permutation of queue indices during random stealing. This ensures that all the queues are considered.
 * Directly pop from a non-empty queue when we are waiting for work,
instead of first noticing that there is a non-empty queue and
then doing another round of random stealing to re-discover the non-empty
queue.
 * Steal only 1 task from a remote queue instead of half of tasks.
2016-05-09 10:17:17 -07:00
Benoit Steiner
05c365fb16 Pulled latest updates from trunk 2016-05-07 13:39:04 -07:00
Benoit Steiner
691614bd2c Worked around a bug in nvcc on tegra x1 2016-05-07 13:28:53 -07:00
Benoit Steiner
a2d94fc216 Merged latest updates from trunk 2016-05-06 19:17:57 -07:00
Benoit Steiner
8adf5cc70f Added support for packet processing of fp16 on kepler and maxwell gpus 2016-05-06 19:16:43 -07:00
Benoit Steiner
1660e749b4 Avoid double promotion 2016-05-06 08:15:12 -07:00
Christoph Hertzberg
a11bd82dc3 bug #1213: Give names to anonymous enums 2016-05-06 11:31:56 +02:00
Benoit Steiner
c54ae65c83 Marked a few tensor operations as read only 2016-05-05 17:18:47 -07:00
Benoit Steiner
69a8a4e1f3 Added a test to validate full reduction on tensor of half floats 2016-05-05 16:52:50 -07:00
Benoit Steiner
678a17ba79 Made the testing of contractions on fp16 more robust 2016-05-05 16:36:39 -07:00
Benoit Steiner
e3d053e14e Refined the testing of log and exp on fp16 2016-05-05 16:24:15 -07:00
Benoit Steiner
9a48688d37 Further improved the testing of fp16 2016-05-05 15:58:05 -07:00
Benoit Steiner
0451940fa4 Relaxed the dummy precision for fp16 2016-05-05 15:40:01 -07:00
Benoit Steiner
910e013506 Relaxed an assertion that was tighter that necessary. 2016-05-05 15:38:16 -07:00
Benoit Steiner
f81e413180 Added a benchmark to measure the performance of full reductions of 16 bit floats 2016-05-05 14:15:11 -07:00
Benoit Steiner
28d5572658 Fixed some incorrect assertions 2016-05-05 10:02:26 -07:00
Benoit Steiner
2aba40d208 Avoid unecessary type promotion 2016-05-05 09:26:57 -07:00
Benoit Steiner
a4d6e8fef0 Strongly hint but don't force the compiler to unroll a some loops in the tensor executor. This results in up to 27% faster code. 2016-05-05 09:25:55 -07:00
Benoit Steiner
7875437ca0 Avoided unecessary type promotion 2016-05-05 09:08:42 -07:00
Benoit Steiner
f363e533aa Added tests for full contractions using thread pools and gpu devices.
Fixed a couple of issues in the corresponding code.
2016-05-05 09:05:45 -07:00
Benoit Steiner
06d774bf58 Updated the contraction code to ensure that full contraction return a tensor of rank 0 2016-05-05 08:37:47 -07:00
Christoph Hertzberg
b300a84989 Fixed some singed/unsigned comparison warnings 2016-05-05 13:36:28 +02:00
Christoph Hertzberg
dacb469bc9 Enable and fix -Wdouble-conversion warnings 2016-05-05 13:35:45 +02:00
Benoit Steiner
62b710072e Reduced the memory footprint of the cxx11_tensor_image_patch test 2016-05-04 21:08:22 -07:00
Benoit Steiner
dd2b45feed Removed extraneous 'explicit' keywords 2016-05-04 16:57:52 -07:00
Ola Røer Thorsen
be78aea6b3 fix double-promotion/float-conversion in Core/SpecialFunctions.h 2016-05-04 10:52:08 +02:00
Gael Guennebaud
75a94b9662 Improve documentation of BDCSVD 2016-05-04 12:53:14 +02:00
Benoit Steiner
968ec1c2ae Use numext::isfinite instead of std::isfinite 2016-05-03 19:56:40 -07:00
Gael Guennebaud
e2ca478485 bug #1214: consider denormals as zero in D&C SVD. This also workaround infinite binary search when compiling with ICC's unsafe optimizations. 2016-05-03 23:15:29 +02:00
Benoit Steiner
f899e08946 Enabled a number of tests previously disabled by mistake 2016-05-03 14:07:47 -07:00
Benoit Steiner
4c05fb03a3 Merged eigen/eigen into default 2016-05-03 13:15:00 -07:00
Benoit Steiner
577a07a86e Re-enabled the product_small test now that everything compiles correctly. 2016-05-03 13:11:38 -07:00
Benoit Steiner
2c5568a757 Added a test to validate the computation of exp and log on 16bit floats 2016-05-03 12:06:07 -07:00
Benoit Steiner
6c3e5b85bc Fixed compilation error with cuda >= 7.5 2016-05-03 09:38:42 -07:00
Benoit Steiner
aad9a04da4 Deleted superfluous explicit keyword. 2016-05-03 09:37:19 -07:00
Benoit Steiner
da50419df8 Made a cast explicit 2016-05-02 19:50:22 -07:00
Benoit Steiner
73ef5371e4 Pulled latest updates from trunk 2016-05-01 14:48:57 -07:00
Benoit Steiner
8a9228ed9b Fixed compilation error 2016-05-01 14:48:01 -07:00
Gael Guennebaud
b1bd53aa6b Fix performance regression: with AVX, unaligned stores were emitted instead of aligned ones for fixed size assignement. 2016-05-01 23:25:06 +02:00
Benoit Steiner
d6c9596fd8 Added missing accessors to fixed sized tensors 2016-04-29 18:51:33 -07:00
Benoit Steiner
17fe7f354e Deleted trailing commas 2016-04-29 18:39:01 -07:00
Benoit Steiner
e5f71aa6b2 Deleted useless trailing commas 2016-04-29 18:36:10 -07:00
Benoit Steiner
44f592dceb Deleted unnecessary trailing commas. 2016-04-29 18:33:46 -07:00
Benoit Steiner
2b890ae618 Fixed compilation errors generated by clang 2016-04-29 18:30:40 -07:00
Benoit Steiner
d217217842 Added a few tests to ensure that the dimensions of rank 0 tensors are correctly computed 2016-04-29 18:15:34 -07:00
Benoit Steiner
f100d1494c Return the proper size (ie 1) for tensors of rank 0 2016-04-29 18:14:33 -07:00
Benoit Steiner
d14105f158 Made several tensor tests compatible with cxx03 2016-04-29 17:22:37 -07:00
Benoit Steiner
c0882ef4d9 Moved a number of tensor tests that don't require cxx11 to work properly outside the EIGEN_TEST_CXX11 test section 2016-04-29 17:13:51 -07:00
Benoit Steiner
9d1dbd1ec0 Fixed teh cxx11_tensor_empty test to compile without requiring cxx11 support 2016-04-29 16:53:55 -07:00
Benoit Steiner
a8c0405cf5 Deleted unused default values for template parameters 2016-04-29 16:34:43 -07:00
Benoit Steiner
4f53178e62 Made a coupe of tensor tests compile without requiring c++11 support. 2016-04-29 16:09:54 -07:00
Benoit Steiner
1131a984a6 Made the cxx11_tensor_forced_eval compile without c++11. 2016-04-29 15:48:59 -07:00
Benoit Steiner
46bcb70969 Don't turn on const expressions when compiling with gcc >= 4.8 unless the -std=c++11 option has been used 2016-04-29 15:20:59 -07:00
Benoit Steiner
c07404f6a1 Restore Tensor support for non c++11 compilers 2016-04-29 15:19:19 -07:00
Benoit Steiner
ba32ded021 Fixed include path 2016-04-29 15:11:09 -07:00
Benoit Steiner
3b8da4be5a Extended the packetmath test to cover all the alignments made possible by avx512 instructions. 2016-04-29 14:13:43 -07:00
Benoit Steiner
2f28ccbea3 Update the makefile to make the tests compile with gcc 4.9 2016-04-29 14:11:09 -07:00
Benoit Steiner
7a4bd337d9 Resolved merge conflict 2016-04-29 13:42:22 -07:00
Benoit Steiner
07a247dcf4 Pulled latest updates from upstream 2016-04-29 13:41:26 -07:00
Benoit Steiner
fa5a8f055a Implemented palign_impl for AVX512 2016-04-29 13:30:13 -07:00
Benoit Steiner
ef3ac9d05a Fixed the AVX512 packet traits 2016-04-29 13:28:36 -07:00
Benoit Steiner
d7b75e8d86 Added pdiv packet primitives for avx512 2016-04-29 13:26:47 -07:00
Benoit Steiner
5e89ded685 Implemented preduxp for AVX512 2016-04-29 13:00:33 -07:00
Benoit Steiner
5f85662ad8 Implemented the pabs and preverse primitives for avx512. 2016-04-29 12:53:34 -07:00
Benoit Steiner
d37ee89ca8 Disabled some of the AVX512 primitives on compilers that don't support them 2016-04-29 12:50:29 -07:00
Gael Guennebaud
0f3c4c8ff4 Fix compilation of sparse.cast<>().transpose(). 2016-04-29 18:26:08 +02:00
Benoit Steiner
a524a26fdc Fixed a few memory leaks 2016-04-28 18:55:53 -07:00
Benoit Steiner
dacb23277e Fixed the igamma and igammac implementations to make them callable from a gpu kernel. 2016-04-28 18:54:54 -07:00
Benoit Steiner
a5d4545083 Deleted unused variable 2016-04-28 14:14:48 -07:00
Justin Lebar
40d1e2f8c7 Eliminate mutual recursion in igamma{,c}_impl::Run.
Presently, igammac_impl::Run calls igamma_impl::Run, which in turn calls
igammac_impl::Run.

This isn't actually mutual recursion; the calls are guarded such that we never
get into a loop.  Nonetheless, it's a stretch for clang to prove this.  As a
result, clang emits a recursive call in both igammac_impl::Run and
igamma_impl::Run.

That this is suboptimal code is bad enough, but it's particularly bad when
compiling for CUDA/nvptx.  nvptx allows recursion, but only begrudgingly: If
you have recursive calls in a kernel, it's on you to manually specify the
kernel's stack size.  Otherwise, ptxas will dump a warning, make a guess, and
who knows if it's right.

This change explicitly eliminates the mutual recursion in igammac_impl::Run and
igamma_impl::Run.
2016-04-28 13:57:08 -07:00
Konstantinos Margaritis
87294c84a6 define Packet2d constants with VSX only 2016-04-28 14:39:56 -03:00
Konstantinos Margaritis
6ed7a7281c remove accidentally pasted code 2016-04-28 14:35:55 -03:00
Konstantinos Margaritis
62f9093b31 improve state of MathFunctions as well 2016-04-28 14:33:09 -03:00
Konstantinos Margaritis
8ed26120c8 bring Altivec/VSX to a better state, implement some of the missing functions 2016-04-28 14:32:42 -03:00
Konstantinos Margaritis
950158f6d1 add name to copyrights 2016-04-28 14:32:11 -03:00
Konstantinos Margaritis
ee0459300b minor fix, add to copyright 2016-04-28 14:31:21 -03:00
Benoit Steiner
3ec81fc00f Fixed compilation error with clang. 2016-04-27 19:32:12 -07:00
Benoit Steiner
2b917291d9 Merged in rmlarsen/eigen2 (pull request PR-183)
Detect cxx_constexpr support when compiling with clang.
2016-04-27 15:19:54 -07:00
Rasmus Munk Larsen
09b9e951e3 Depend on the more extensive support for constexpr in clang:
http://clang.llvm.org/docs/LanguageExtensions.html#c-1y-relaxed-constexpr
2016-04-27 14:59:11 -07:00
Rasmus Munk Larsen
1a325ef71c Detect cxx_constexpr support when compiling with clang. 2016-04-27 14:33:51 -07:00
Benoit Steiner
1a97fd8b4e Merged latest update from trunk 2016-04-27 14:22:45 -07:00
Benoit Steiner
c61170e87d fpclassify isn't portable enough. In particular, the return values of the function are not available on all the platforms Eigen supportes: remove it from Eigen. 2016-04-27 14:22:20 -07:00
Gael Guennebaud
318e65e0ae Fix missing inclusion of Eigen/Core 2016-04-27 23:05:40 +02:00
Benoit Steiner
f629fe95c8 Made the index type a template parameter to evaluateProductBlockingSizes
Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes.
2016-04-27 13:11:19 -07:00
Benoit Steiner
66b215b742 Merged latest updates from trunk 2016-04-27 12:57:48 -07:00
Benoit Steiner
25141b69d4 Improved support for min and max on 16 bit floats when running on recent cuda gpus 2016-04-27 12:57:21 -07:00
Rasmus Larsen
ff33798acd Merged eigen/eigen into default 2016-04-27 12:27:00 -07:00
Rasmus Munk Larsen
463738ccbe Use computeProductBlockingSizes to compute blocking for both ShardByCol and ShardByRow cases. 2016-04-27 12:26:18 -07:00
Benoit Steiner
6744d776ba Added support for fpclassify in Eigen::Numext 2016-04-27 12:10:25 -07:00
Rasmus Munk Larsen
1f48f47ab7 Implement stricter argument checking for SYRK and SY2K and real matrices. To implement the BLAS API they should return info=2 if op='C' is passed for a complex matrix. Without this change, the Eigen BLAS fails the strict zblat3 and cblat3 tests in LAPACK 3.5. 2016-04-27 19:59:44 +02:00
Gael Guennebaud
3dddd34133 Refactor the unsupported CXX11/Core module to internal headers only. 2016-04-26 11:20:25 +02:00
Benoit Steiner
4a164d2c46 Fixed the partial evaluation of non vectorizable tensor subexpressions 2016-04-25 10:43:03 -07:00
Benoit Steiner
fd9401f260 Refined the cost of the striding operation. 2016-04-25 09:16:08 -07:00
Heiko Bauke
e19b58e672 alias template for matrix and array classes 2016-04-23 00:08:51 +02:00
Konstantinos Margaritis
3f80696ae1 Merged eigen/eigen into default 2016-04-22 15:05:21 +03:00
Benoit Steiner
5c372d19e3 Merged in rmlarsen/eigen (pull request PR-179)
Prevent crash in CompleteOrthogonalDecomposition if object was default constructed.
2016-04-21 18:06:36 -07:00
Benoit Steiner
4bbc97be5e Provide access to the base threadpool classes 2016-04-21 17:59:33 -07:00
Rasmus Munk Larsen
a3256d78d8 Prevent crash in CompleteOrthogonalDecomposition if object was default constructed. 2016-04-21 16:49:28 -07:00
Benoit Steiner
33adce5c3a Added the ability to switch to the new thread pool with a #define 2016-04-21 11:59:58 -07:00
Benoit Steiner
79b900375f Use index list for the striding benchmarks 2016-04-21 11:58:27 -07:00
Benoit Steiner
f670613e4b Fixed several compilation warnings 2016-04-21 11:03:02 -07:00
Benoit Steiner
6015422ee6 Added an option to enable the use of the F16C instruction set 2016-04-21 10:30:29 -07:00
Benoit Steiner
32ffce04fc Use EIGEN_THREAD_YIELD instead of std::this_thread::yield to make the code more portable. 2016-04-21 08:47:28 -07:00
Konstantinos Margaritis
e5b2ef47d5 Merged eigen/eigen into default 2016-04-21 18:03:08 +03:00
Benoit Steiner
2dde1b1028 Don't crash when attempting to reduce empty tensors. 2016-04-20 18:08:20 -07:00
Benoit Steiner
a792cd357d Added more tests 2016-04-20 17:33:58 -07:00
Benoit Steiner
80200a1828 Don't attempt to leverage the _cvtss_sh and _cvtsh_ss instructions when compiling with clang since it's unclear which versions of clang actually support these instruction. 2016-04-20 12:10:27 -07:00
Benoit Steiner
c7c2054bb5 Started to implement a portable way to yield. 2016-04-19 17:59:58 -07:00
Benoit Steiner
1d0238375d Made sure all the required header files are included when trying to use fp16 2016-04-19 17:44:12 -07:00
Benoit Steiner
2b72163028 Implemented a more portable version of thread local variables 2016-04-19 15:56:02 -07:00
Benoit Steiner
04f954956d Fixed a few typos 2016-04-19 15:27:09 -07:00
Benoit Steiner
5b1106c56b Fixed a compilation error with nvcc 7. 2016-04-19 14:57:57 -07:00
Benoit Steiner
7129d998db Simplified the code that launches cuda kernels. 2016-04-19 14:55:21 -07:00
Benoit Steiner
b9ea40c30d Don't take the address of a kernel on CUDA devices that don't support this feature. 2016-04-19 14:35:11 -07:00
Benoit Steiner
884c075058 Use numext::ceil instead of std::ceil 2016-04-19 14:33:30 -07:00
Benoit Steiner
a278414d1b Avoid an unnecessary copy of the evaluator. 2016-04-19 13:54:28 -07:00
Benoit Steiner
f953c60705 Fixed 2 recent regression tests 2016-04-19 12:57:39 -07:00
Benoit Steiner
50968a0a3e Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors. 2016-04-19 11:53:58 -07:00
Benoit Steiner
84543c8be2 Worked around the lack of a rand_r function on windows systems 2016-04-17 19:29:27 -07:00
Benoit Steiner
5fbcfe5eb4 Worked around the lack of a rand_r function on windows systems 2016-04-17 18:42:31 -07:00
Gael Guennebaud
e4fe611e2c Enable lazy-coeff-based-product for vector*(1x1) products 2016-04-16 15:17:39 +02:00
Benoit Steiner
c8e8f93d6c Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators. 2016-04-15 16:48:10 -07:00
Benoit Steiner
1a16fb1532 Deleted extraneous comma. 2016-04-15 15:50:13 -07:00
Benoit Steiner
7cff898e0a Deleted unnecessary variable 2016-04-15 15:46:14 -07:00
Benoit Steiner
6c43c49e4a Fixed a few compilation warnings 2016-04-15 15:34:34 -07:00
Benoit Steiner
eb669f989f Merged in rmlarsen/eigen (pull request PR-178)
Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions.
2016-04-15 14:53:15 -07:00
Gael Guennebaud
2a7115daca bug #1203: by-pass large stack-allocation in stableNorm if EIGEN_STACK_ALLOCATION_LIMIT is too small 2016-04-15 22:34:11 +02:00
Rasmus Munk Larsen
3718bf654b Get rid of void* casting when calling EvalRange::run. 2016-04-15 12:51:33 -07:00
Benoit Steiner
40c9923a8a Fixed compilation errors with msvc 2016-04-15 11:27:52 -07:00
Benoit Steiner
1d23430628 Improved the matrix multiplication blocking in the case where mr is not a power of 2 (e.g on Haswell CPUs). 2016-04-15 10:53:31 -07:00
Gael Guennebaud
1e80bddde3 Fix trmv for mixing types. 2016-04-15 17:58:36 +02:00
Konstantinos Margaritis
0e8fc31087 remove pgather/pscatter for std::complex<double> for s390x 2016-04-15 07:08:57 -04:00
Benoit Steiner
a62e924656 Added ability to access the cache sizes from the tensor devices 2016-04-14 21:25:06 -07:00
Benoit Steiner
18e6f67426 Added support for exclusive or 2016-04-14 20:37:46 -07:00
Rasmus Munk Larsen
07ac4f7e02 Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. The cost model is turned off by default. 2016-04-14 18:28:23 -07:00
Benoit Steiner
9624a1ea3d Added missing definition of PacketSize in the gpu evaluator of convolution 2016-04-14 17:16:58 -07:00
Benoit Steiner
6fbedf5a4e Merged in rmlarsen/eigen (pull request PR-177)
Eigen Tensor cost model part 1.
2016-04-14 17:13:19 -07:00
Benoit Steiner
bebb89acfa Enabled the new threadpool tests 2016-04-14 16:44:10 -07:00
Benoit Steiner
9c064b5a97 Cleanup 2016-04-14 16:41:31 -07:00
Benoit Steiner
1372156c41 Prepared the migration to the new non blocking thread pool 2016-04-14 16:16:42 -07:00
Rasmus Munk Larsen
aeb5494a0b Improvements to cost model. 2016-04-14 15:52:58 -07:00
Benoit Steiner
00dfe18487 Merged latest updates from trunk 2016-04-14 15:25:20 -07:00
Benoit Steiner
a8e8837ba7 Added tests for the non blocking thread pool 2016-04-14 15:23:49 -07:00
Benoit Steiner
78a51abc12 Added a more scalable non blocking thread pool 2016-04-14 15:23:10 -07:00
Rasmus Munk Larsen
d2e95492e7 Merge upstream updates. 2016-04-14 13:59:50 -07:00
Rasmus Munk Larsen
235e83aba6 Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions. 2016-04-14 13:57:35 -07:00
Gael Guennebaud
68897c52f3 Add extreme values to the imaginary part for SVD unit tests. 2016-04-14 22:47:30 +02:00
Gael Guennebaud
20f387fafa Improve numerical robustness of JacoviSVD:
- avoid noise amplification in complex to real conversion
 - compare off-diagonal entries to the current biggest diagonal entry: no need to bother about a 2x2 block containing ridiculously small entries compared to the rest of the matrix.
2016-04-14 22:46:55 +02:00
Benoit Steiner
7718749fee Force the inlining of the << operator on half floats 2016-04-14 11:51:54 -07:00
Benoit Steiner
5379d2b594 Inline the << operator on half floats 2016-04-14 11:40:48 -07:00
Benoit Steiner
5912ad877c Silenced a compilation warning 2016-04-14 11:40:14 -07:00
Benoit Steiner
2b6e3de02f Added tests to validate flooring and ceiling of fp16 2016-04-14 11:39:18 -07:00
Benoit Steiner
6f23e945f6 Added simple test for numext::sqrt and numext::pow on fp16 2016-04-14 10:32:52 -07:00
Benoit Steiner
72510c80e1 Added basic test for trigonometric functions on fp16 2016-04-14 10:27:24 -07:00
Benoit Steiner
7b3d7acebe Added support for fp16 to test_isApprox, test_isMuchSmallerThan, and test_isApproxOrLessThan 2016-04-14 10:25:50 -07:00
Benoit Steiner
5c13765ee3 Added ability to printf fp16 2016-04-14 10:24:52 -07:00
Benoit Steiner
c7167fee0e Added support for fp16 to the sigmoid function 2016-04-14 10:08:33 -07:00
Benoit Steiner
f6003f0873 Made the test msvc friendly 2016-04-14 09:47:26 -07:00
Gael Guennebaud
3551dea887 Cleaning pass on rcond estimator. 2016-04-14 16:45:41 +02:00
Gael Guennebaud
d8a3bdaa24 remove useless include 2016-04-14 15:18:56 +02:00
Gael Guennebaud
d402adc3d7 Better use .data() than &coeffRef(0) 2016-04-14 15:18:08 +02:00
Gael Guennebaud
ea7087ef31 Merged in rmlarsen/eigen (pull request PR-174)
Add matrix condition number estimation module.
2016-04-14 15:11:33 +02:00
Benoit Steiner
36f5a10198 Properly gate the definition of the error and gamma functions for fp16 2016-04-13 18:44:48 -07:00
Benoit Steiner
10b69810d1 Improved support for trigonometric functions on GPU 2016-04-13 16:00:51 -07:00
Benoit Steiner
d6105b53b8 Added basic implementation of the lgamma, digamma, igamma, igammac, polygamma, and zeta function for fp16 2016-04-13 15:26:02 -07:00
Gael Guennebaud
703251f10f merge 2016-04-13 23:45:10 +02:00
Gael Guennebaud
39211ba46b Fix JacobiSVD for complex when the complex-to-real update already gives a diagonal 2x2 block. 2016-04-13 23:43:26 +02:00
Benoit Steiner
2986253259 Cleaned up the implementation of digamma 2016-04-13 14:24:06 -07:00
Benoit Steiner
d5de1a8220 Pulled latest updates from trunk 2016-04-13 14:17:11 -07:00
Benoit Steiner
87ca15c4e8 Added support for sin, cos, tan, and tanh on fp16 2016-04-13 14:12:38 -07:00
Gael Guennebaud
2c9e4fa417 Add debug output for random unit test 2016-04-13 22:56:12 +02:00
Gael Guennebaud
7d1391d049 Turn a converge check to a warning 2016-04-13 22:50:54 +02:00
Gael Guennebaud
feef39e2d1 Fix underflow in JacoviSVD's complex to real preconditioner 2016-04-13 22:49:51 +02:00
Gael Guennebaud
f4e12272f1 Fix corner case in unit test. 2016-04-13 22:18:02 +02:00
Gael Guennebaud
a95e1a273e Fix warning in unit tests 2016-04-13 22:00:38 +02:00
Benoit Steiner
bf3f6688f0 Added support for computing cos, sin, tan, and tanh on GPU. 2016-04-13 11:55:08 -07:00
Benoit Steiner
473c8380ea Added constructors to convert unsigned integers into fp16 2016-04-13 11:03:37 -07:00
Gael Guennebaud
42a3352a3b Workaround a division by zero when outerstride==0 2016-04-13 19:02:02 +02:00
Gael Guennebaud
6f960b83ff Make use of is_same_dense helper instead of extract_data to detect input/outputs are the same. 2016-04-13 18:47:12 +02:00
Gael Guennebaud
b7716c0328 Fix incomplete previous patch on matrix comparision. 2016-04-13 18:32:56 +02:00
Gael Guennebaud
2630d97c62 Fix detection of same matrices when both matrices are not handled by extract_data. 2016-04-13 18:26:08 +02:00
Gael Guennebaud
512ba0ac76 Add regression unit tests for half-packet vectorization 2016-04-13 18:16:35 +02:00
Gael Guennebaud
06447e0a39 Improve half-packet vectorization logic to distinguish linear versus inner traversal modes. 2016-04-13 18:15:49 +02:00
Gael Guennebaud
bbb8854bf7 Enable half-packet in reduxions. 2016-04-13 13:02:34 +02:00
Benoit Steiner
e9b12cc1f7 Fixed compilation warnings generated by clang 2016-04-12 20:53:18 -07:00
Benoit Steiner
eaeb6ca93a Enable the benchmarks for algebraic and transcendental fnctions on fp16. 2016-04-12 16:29:00 -07:00
Benoit Steiner
aa1ba8bbd2 Don't put a command at the end of an enumerator list 2016-04-12 16:28:11 -07:00
Benoit Steiner
e49945ced4 Pulled latest update from trunk 2016-04-12 14:13:41 -07:00
Benoit Steiner
25d05c4b8f Fixed the vectorization logic test 2016-04-12 14:13:25 -07:00
Benoit Steiner
53121c0119 Turned on the contraction benchmarks for fp16 2016-04-12 14:11:52 -07:00
Gael Guennebaud
b67c983291 Enable the use of half-packet in coeff-based product.
For instance, Matrix4f*Vector4f is now vectorized again when using AVX.
2016-04-12 23:03:03 +02:00
Benoit Steiner
e3a184785c Fixed the zeta test 2016-04-12 11:12:36 -07:00
Benoit Steiner
3b76df64fc Defer the decision to vectorize tensor CUDA code to the meta kernel. This makes it possible to decide to vectorize or not depending on the capability of the target cuda architecture. In particular, this enables us to vectorize the processing of fp16 when running on device of capability >= 5.3 2016-04-12 10:58:51 -07:00
Benoit Steiner
8bfe739cd2 Updated the AVX512 PacketMath to properly leverage the AVX512DQ instructions 2016-04-11 18:40:16 -07:00
Rasmus Larsen
6498dadc2f Merged eigen/eigen into default 2016-04-11 17:42:05 -07:00
Benoit Steiner
d6e596174d Pull latest updates from upstream 2016-04-11 17:20:17 -07:00
Benoit Steiner
748c4c4599 More accurate cost estimates for exp, log, tanh, and sqrt. 2016-04-11 13:11:04 -07:00
Benoit Steiner
833efb39bf Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16 2016-04-11 11:03:56 -07:00
Benoit Steiner
e939b087fe Pulled latest update from trunk 2016-04-11 11:03:02 -07:00
Gael Guennebaud
1744b5b5d2 Update doc regarding the genericity of EIGEN_USE_BLAS 2016-04-11 17:16:07 +02:00
Gael Guennebaud
91bf925fc1 Improve constness of level2 blas API. 2016-04-11 17:13:01 +02:00
Gael Guennebaud
0483430283 Move LAPACK declarations from blas.h to lapack.h and fix compatibility with EIGEN_USE_MKL 2016-04-11 17:12:31 +02:00
Gael Guennebaud
097d1e8823 Cleanup obsolete assign_scalar_eig2mkl helper. 2016-04-11 16:09:29 +02:00
Gael Guennebaud
fec4c334ba Remove all references to MKL in BLAS wrappers. 2016-04-11 16:04:09 +02:00
Gael Guennebaud
ddabc992fa Fix long to int conversion in BLAS API. 2016-04-11 15:52:01 +02:00
Gael Guennebaud
8191f373be Silent unused warning. 2016-04-11 15:37:16 +02:00
Gael Guennebaud
6a9ca88e7e Relax dependency on MKL for EIGEN_USE_BLAS 2016-04-11 15:17:14 +02:00
Gael Guennebaud
4e8e5888d7 Improve constness of blas level-3 interface. 2016-04-11 15:12:44 +02:00
Gael Guennebaud
675e0a2224 Fix static/inline keywords order. 2016-04-11 15:06:20 +02:00
Gael Guennebaud
fc6a0ebb1c Typos in doc. 2016-04-11 10:54:58 +02:00
Till Hoffmann
643b697649 Proper handling of domain errors. 2016-04-10 00:37:53 +01:00
Rasmus Munk Larsen
1f70bd4134 Merge. 2016-04-09 15:31:53 -07:00
Rasmus Munk Larsen
096e355f8e Add short-circuit to avoid calling matrix norm for empty matrix. 2016-04-09 15:29:56 -07:00
Rasmus Larsen
be80fb49fc Merged default (4a92b590a0
) into default
2016-04-09 13:13:01 -07:00
Rasmus Larsen
7a8176587b Merged eigen/eigen into default 2016-04-09 12:47:41 -07:00
Rasmus Munk Larsen
4a92b590a0 Merge. 2016-04-09 12:47:24 -07:00
Rasmus Munk Larsen
ee6c69733a A few tiny adjustments to short-circuit logic. 2016-04-09 12:45:49 -07:00
Till Hoffmann
7f4826890c Merge upstream 2016-04-09 20:08:07 +01:00
Till Hoffmann
de057ebe54 Added nans to zeta function. 2016-04-09 20:07:36 +01:00
Gael Guennebaud
af2161cdb4 bug #1197: fix/relax some LM unit tests 2016-04-09 11:14:02 +02:00
Gael Guennebaud
a05a683d83 bug #1160: fix and relax some lm unit tests by turning faillures to warnings 2016-04-09 10:49:19 +02:00
Benoit Steiner
5da90fc8dd Use numext::abs instead of std::abs in scalar_fuzzy_default_impl to make it usable inside GPU kernels. 2016-04-08 19:40:48 -07:00
Benoit Steiner
01bd577288 Fixed the implementation of Eigen::numext::isfinite, Eigen::numext::isnan, andEigen::numext::isinf on CUDA devices 2016-04-08 16:40:10 -07:00
Benoit Steiner
89a3dc35a3 Fixed isfinite_impl: NumTraits<T>::highest() and NumTraits<T>::lowest() are finite numbers. 2016-04-08 15:56:16 -07:00
Benoit Steiner
995f202cea Disabled the use of half2 on cuda devices of compute capability < 5.3 2016-04-08 14:43:36 -07:00
Benoit Steiner
8d22967bd9 Initial support for taking the power of fp16 2016-04-08 14:22:39 -07:00
Benoit Steiner
3394379319 Fixed the packet_traits for half floats. 2016-04-08 13:33:59 -07:00
Benoit Steiner
0d2a532fc3 Created the new EIGEN_TEST_CUDA_CLANG option to compile the CUDA tests using clang instead of nvcc 2016-04-08 13:16:08 -07:00
Rasmus Larsen
0b81a18d12 Merged eigen/eigen into default 2016-04-08 12:58:57 -07:00
Benoit Steiner
2d072b38c1 Don't test the division by 0 on float16 when compiling with msvc since msvc detects and errors out on divisions by 0. 2016-04-08 12:50:25 -07:00
Benoit Jacob
cd2b667ac8 Add references to filed LLVM bugs 2016-04-08 08:12:47 -04:00
Benoit Steiner
3bd16457e1 Properly handle complex numbers. 2016-04-07 23:28:04 -07:00
Benoit Steiner
63102ee43d Turn on the coeffWise benchmarks on fp16 2016-04-07 23:05:20 -07:00
Benoit Steiner
7c47d3e663 Fixed the type casting benchmarks for fp16 2016-04-07 22:50:25 -07:00
Benoit Steiner
166b56bc61 Fixed the type casting benchmark for float16 2016-04-07 22:45:54 -07:00
Benoit Steiner
2f2801f096 Merged in parthaEth/eigen (pull request PR-175)
Static casting scalar types so as to let chlesky module of eigen work with ceres
2016-04-07 22:10:14 -07:00
Benoit Steiner
d962fe6a99 Renamed float16 into cxx11_float16 since the test relies on c++11 features 2016-04-07 20:28:32 -07:00
Rasmus Larsen
c34e55c62b Merged eigen/eigen into default 2016-04-07 20:23:03 -07:00
Benoit Steiner
7d5b17087f Added missing EIGEN_DEVICE_FUNC to the tensor conversion code. 2016-04-07 20:01:19 -07:00
Benoit Steiner
a6d08be9b2 Fixed the benchmarking of fp16 coefficient wise operations 2016-04-07 17:13:44 -07:00
Rasmus Munk Larsen
283c51cd5e Widen short-circuiting ReciprocalConditionNumberEstimate so we don't call InverseMatrixL1NormEstimate for dec.rows() <= 1. 2016-04-07 16:45:40 -07:00
Rasmus Munk Larsen
d51803a728 Use Index instead of int for indexing and sizes. 2016-04-07 16:39:48 -07:00
Rasmus Munk Larsen
fd872aefb3 Remove transpose() method from LLT and LDLT classes as it would imply conjugation.
Explicitly cast constants to RealScalar in ConditionEstimator.h.
2016-04-07 16:28:44 -07:00
Rasmus Munk Larsen
0b5546d182 Use lpNorm<1>() to compute l1 norms in LLT and LDLT. 2016-04-07 15:49:30 -07:00
parthaEth
2d5bb375b7 Static casting scalar types so as to let chlesky module of eigen work with ceres 2016-04-08 00:14:44 +02:00
Benoit Steiner
a02ec09511 Worked around numerical noise in the test for the zeta function. 2016-04-07 12:11:02 -07:00
Benoit Steiner
c912b1d28c Fixed a typo in the polygamma test. 2016-04-07 11:51:07 -07:00
Benoit Steiner
74f64838c5 Updated the unary functors to use the numext implementation of typicall functions instead of the one provided in the standard library. The standard library functions aren't supported officially by cuda, so we're better off using the numext implementations. 2016-04-07 11:42:14 -07:00
Benoit Steiner
737644366f Move the functions operating on fp16 out of the std namespace and into the Eigen::numext namespace 2016-04-07 11:40:15 -07:00
Benoit Steiner
dc45aaeb93 Added tests for float16 2016-04-07 11:18:05 -07:00
Benoit Steiner
8db269e055 Fixed a typo in a test 2016-04-07 10:41:51 -07:00
Benoit Steiner
b89d3f78b2 Updated the isnan, isinf and isfinite functions to make compatible with cuda devices. 2016-04-07 10:08:49 -07:00
Benoit Steiner
48308ed801 Added support for isinf, isnan, and isfinite checks to the tensor api 2016-04-07 09:48:36 -07:00
Benoit Steiner
cfb34d808b Fixed a possible integer overflow. 2016-04-07 08:46:52 -07:00
Benoit Steiner
df838736e2 Fixed compilation warning triggered by msvc 2016-04-06 20:48:55 -07:00
Benoit Steiner
14ea7c7ec7 Fixed packet_traits<half> 2016-04-06 19:30:21 -07:00
Benoit Steiner
532fdf24cb Added support for hardware conversion between fp16 and full floats whenever
possible.
2016-04-06 17:11:31 -07:00
Benoit Steiner
165150e896 Fixed the tests for the zeta and polygamma functions 2016-04-06 14:31:01 -07:00
Benoit Steiner
7be1eaad1e Fixed typos in the implementation of the zeta and polygamma ops. 2016-04-06 14:15:37 -07:00
Benoit Steiner
58c1dbff19 Made the fp16 code more portable. 2016-04-06 13:44:08 -07:00
Benoit Steiner
cf7e73addd Added some missing conversions to the Half class, and fixed the implementation of the < operator on cuda devices. 2016-04-06 09:59:51 -07:00
Benoit Steiner
10bdd8e378 Merged in tillahoffmann/eigen (pull request PR-173)
Added zeta function of two arguments and polygamma function
2016-04-06 09:40:17 -07:00
Benoit Steiner
7781f865cb Renamed the EIGEN_TEST_NVCC cmake option into EIGEN_TEST_CUDA per the discussion in bug #1173. 2016-04-06 09:35:23 -07:00
Benoit Steiner
72abfa11dd Added support for isfinite on fp16 2016-04-06 09:07:30 -07:00
Rasmus Munk Larsen
4d07064a3d Fix bug in alternate lower bound calculation due to missing parentheses.
Make a few expressions more concise.
2016-04-05 16:40:48 -07:00
Konstantinos Margaritis
2bba4ee2cf Merged kmargar/eigen/tip into default 2016-04-05 22:22:08 +03:00
Konstantinos Margaritis
317384b397 complete the port, remove float support 2016-04-05 14:56:45 -04:00
tillahoffmann
726bd5f077 Merged eigen/eigen into default 2016-04-05 18:21:05 +01:00
Till Hoffmann
a350c25a39 Added accuracy comments. 2016-04-05 18:20:40 +01:00
Gael Guennebaud
4d7e230d2f bug #1189: fix pow/atan2 compilation for AutoDiffScalar 2016-04-05 14:49:41 +02:00
Konstantinos Margaritis
bc0ad363c6 add remaining includes 2016-04-05 06:01:17 -04:00
Konstantinos Margaritis
2d41dc9622 complete int/double specialized traits for ZVector 2016-04-05 06:00:51 -04:00
Konstantinos Margaritis
644d0f91d2 enable all tests again 2016-04-05 05:59:54 -04:00
Konstantinos Margaritis
988344daf1 enable the other includes as well 2016-04-05 05:59:30 -04:00
Rasmus Larsen
d7eeee0c1d Merged eigen/eigen into default 2016-04-04 15:58:27 -07:00
Rasmus Munk Larsen
513c372960 Fix docstrings to list all supported decompositions. 2016-04-04 14:34:59 -07:00
Rasmus Munk Larsen
86e0ed81f8 Addresses comments on Eigen pull request PR-174.
* Get rid of code-duplication for real vs. complex matrices.
* Fix flipped arguments to select.
* Make the condition estimation functions free functions.
* Use Vector::Unit() to generate canonical unit vectors.
* Misc. cleanup.
2016-04-04 14:20:01 -07:00
Benoit Jacob
158fea0f5e bug #1190 - Don't trust __ARM_FEATURE_FMA on Clang/ARM 2016-04-04 16:42:40 -04:00
Benoit Jacob
03f2997a11 bug #1191 - Prevent Clang/ARM from rewriting VMLA into VMUL+VADD 2016-04-04 16:41:47 -04:00
Till Hoffmann
b0143de177 Merge upstream. 2016-04-04 19:16:48 +01:00
Till Hoffmann
b97911dd18 Refactored code into type-specific helper functions. 2016-04-04 19:16:03 +01:00
Benoit Steiner
c4179dd470 Updated the scalar_abs_op struct to make it compatible with cuda devices. 2016-04-04 11:11:51 -07:00
Benoit Steiner
1108b4f218 Fixed the signature of numext::abs to make it compatible with complex numbers 2016-04-04 11:09:25 -07:00
tillahoffmann
b8245cc325 Merged eigen/eigen into default 2016-04-04 12:28:11 +01:00
Gael Guennebaud
2b457f8e5e Fix cross-compiling windows version detection 2016-04-04 11:47:46 +02:00
Rasmus Larsen
30242b7565 Merged eigen/eigen into default 2016-04-01 17:19:36 -07:00
Rasmus Munk Larsen
9d51f7c457 Add rcond method to LDLT. 2016-04-01 16:48:38 -07:00
Rasmus Munk Larsen
f54137606e Add condition estimation to Cholesky (LLT) factorization. 2016-04-01 16:19:45 -07:00
Rasmus Munk Larsen
fb8dccc23e Replace "inline static" with "static inline" for consistency. 2016-04-01 12:48:18 -07:00
Rasmus Munk Larsen
91414e0042 Fix comments in ConditionEstimator and minor cleanup. 2016-04-01 11:58:17 -07:00
Rasmus Munk Larsen
1aa89fb855 Add matrix condition estimator module that implements the Higham/Hager algorithm from http://www.maths.manchester.ac.uk/~higham/narep/narep135.pdf used in LPACK. Add rcond() methods to FullPivLU and PartialPivLU. 2016-04-01 10:27:59 -07:00
Till Hoffmann
80eba21ad0 Merge upstream. 2016-04-01 18:18:49 +01:00
Till Hoffmann
eb0ae602bd Added CUDA tests. 2016-04-01 18:17:45 +01:00
Till Hoffmann
ffd770ce94 Fixed CUDA signature. 2016-04-01 17:58:24 +01:00
Till Hoffmann
3cb0a237c1 Fixed suggestions by Eugene Brevdo. 2016-04-01 17:51:39 +01:00
tillahoffmann
49960adbdd Merged eigen/eigen into default 2016-04-01 14:36:15 +01:00
Till Hoffmann
57239f4a81 Added polygamma function. 2016-04-01 14:35:21 +01:00
Till Hoffmann
dd5d390daf Added zeta function. 2016-04-01 13:32:29 +01:00
Benoit Steiner
3da495e6b9 Relaxed the condition used to gate the fft code. 2016-03-31 18:11:51 -07:00
Benoit Steiner
0ea7ab4f62 Hashing was only officially introduced in c++11. Therefore only define an implementation of the hash function for float16 if c++11 is enabled. 2016-03-31 14:44:55 -07:00
Benoit Steiner
92b7f7b650 Improved code formating 2016-03-31 13:09:58 -07:00
Benoit Steiner
f197813f37 Added the ability to hash a fp16 2016-03-31 13:09:23 -07:00
Benoit Steiner
0f5cc504fe Properly gate the fft code 2016-03-31 12:59:39 -07:00
Benoit Steiner
4c859181da Made it possible to use the NumTraits for complex and Array in a cuda kernel. 2016-03-31 12:48:38 -07:00
Benoit Steiner
c36ab19902 Added __ldg primitive for fp16. 2016-03-31 10:55:03 -07:00
Benoit Steiner
b575fb1d02 Added NumTraits for half floats 2016-03-31 10:43:59 -07:00
Benoit Steiner
8c8a79cec1 Fixed a typo 2016-03-31 10:33:32 -07:00
Benoit Steiner
af4ef540bf Fixed a off-by-one bug in a debug assertion 2016-03-30 18:37:19 -07:00
Benoit Steiner
791e5cfb69 Added NumTraits for type2index. 2016-03-30 18:36:36 -07:00
Benoit Steiner
4f1a7e51c1 Pull math functions from the global namespace only when compiling cuda code with nvcc. When compiling with clang, we want to use the std namespace. 2016-03-30 17:59:49 -07:00
Benoit Steiner
bc68fc2fe7 Enable constant expressions when compiling cuda code with clang. 2016-03-30 17:58:32 -07:00
Benoit Steiner
483aaad10a Fixed compilation warning 2016-03-30 17:08:13 -07:00
Benoit Steiner
1b40abbf99 Added missing assignment operator to the TensorUInt128 class, and made misc small improvements 2016-03-30 13:17:03 -07:00
Benoit Jacob
01b5333e44 bug #1186 - vreinterpretq_u64_f64 fails to build on Android/Aarch64/Clang toolchain 2016-03-30 11:02:33 -04:00
Benoit Steiner
aa45ad2aac Fixed the formatting of the README. 2016-03-29 15:06:13 -07:00
Benoit Steiner
56df5ef1d7 Attempt to fix the formatting of the README 2016-03-29 15:03:38 -07:00
Benoit Steiner
1bcd82e31b Pulled latest updates from trunk 2016-03-29 13:36:18 -07:00
Gael Guennebaud
09ad31aa85 Add regression test for nesting type handling in blas_traits 2016-03-29 22:33:57 +02:00
Benoit Steiner
1841d6d4c3 Added missing cuda template specializations for numext::ceil 2016-03-29 13:29:34 -07:00
Benoit Steiner
7b7d2a9fa5 Use false instead of 0 as the expected value of a boolean 2016-03-29 11:50:17 -07:00
Benoit Steiner
e02b784ec3 Added support for standard mathematical functions and trancendentals(such as exp, log, abs, ...) on fp16 2016-03-29 09:20:36 -07:00
Benoit Steiner
c38295f0a0 Added support for fmod 2016-03-28 15:53:02 -07:00
Benoit Steiner
6772f653c3 Made it possible to customize the threadpool 2016-03-28 10:01:04 -07:00
Benoit Steiner
1bc81f7889 Fixed compilation warnings on arm 2016-03-28 09:21:04 -07:00
Benoit Steiner
78f83d6f6a Prevent potential overflow. 2016-03-28 09:18:04 -07:00
Konstantinos Margaritis
01e7298fe6 actually include ZVector files, passes most basic tests (float still fails) 2016-03-28 10:58:02 -04:00
Konstantinos Margaritis
f48011119e Merged eigen/eigen into default 2016-03-28 01:48:45 +03:00
Konstantinos Margaritis
ed6b9d08f1 some primitives ported, but missing intrinsics and crash with asm() are a problem 2016-03-27 18:47:49 -04:00
Benoit Steiner
74f91ed06c Improved support for integer modulo 2016-03-25 17:21:56 -07:00
Benoit Steiner
65716e99a5 Improved the cost estimate of the quotient op 2016-03-25 11:13:53 -07:00
Benoit Steiner
d94f6ba965 Started to model the cost of divisions more accurately. 2016-03-25 11:02:56 -07:00
Benoit Steiner
a86c9f037b Fixed compilation error on windows 2016-03-24 18:54:31 -07:00
Benoit Steiner
0968e925a0 Updated the benchmarking code to use Eigen::half instead of half 2016-03-24 18:00:33 -07:00
Benoit Steiner
044efea965 Made sure that the cxx11_tensor_cuda test can be compiled even without support for cxx11. 2016-03-23 20:02:11 -07:00
Benoit Steiner
2e4e4cb74d Use numext::abs instead of abs to avoid incorrect conversion to integer of the argument 2016-03-23 16:57:12 -07:00
Benoit Steiner
41434a8a85 Avoid unnecessary conversions 2016-03-23 16:52:38 -07:00
Benoit Steiner
92693b50eb Fixed compilation warning 2016-03-23 16:40:36 -07:00
Benoit Steiner
9bc9396e88 Use portable includes 2016-03-23 16:30:06 -07:00
Benoit Steiner
393bc3b16b Added comment 2016-03-23 16:22:15 -07:00
Benoit Steiner
81d340984a Removed executable bit from header files 2016-03-23 16:15:02 -07:00
Benoit Steiner
bff8cbad06 Removed executable bit from header files 2016-03-23 16:14:23 -07:00
Benoit Steiner
7a570e50ef Fixed contractions of fp16 2016-03-23 16:00:06 -07:00
Benoit Steiner
7168afde5e Made the tensor benchmarks compile on MacOS 2016-03-23 14:21:04 -07:00
Benoit Steiner
2062ee2d26 Added a test to verify that notifications are working properly 2016-03-23 13:39:00 -07:00
Benoit Steiner
fc3660285f Made type conversion explicit 2016-03-23 09:56:50 -07:00
Benoit Steiner
0e68882604 Added the ability to divide a half float by an index 2016-03-23 09:46:42 -07:00
Benoit Steiner
6971146ca9 Added more conversion operators for half floats 2016-03-23 09:44:52 -07:00
Christoph Hertzberg
9642fd7a93 Replace all M_PI by EIGEN_PI and add a check to the testsuite. 2016-03-23 15:37:45 +01:00
Benoit Steiner
28e02996df Merged patch 672 from Justin Lebar: Don't use long doubles with cuda 2016-03-22 16:53:57 -07:00
Benoit Steiner
3d1e857327 Fixed compilation error 2016-03-22 15:48:28 -07:00
Benoit Steiner
de7d92c259 Pulled latest updates from trunk 2016-03-22 15:24:49 -07:00
Benoit Steiner
002cf0d1c9 Use a single Barrier instead of a collection of Notifications to reduce the thread synchronization overhead 2016-03-22 15:24:23 -07:00
Benoit Steiner
bc2b802751 Fixed a couple of typos 2016-03-22 14:27:34 -07:00
Benoit Steiner
e7a468c5b7 Filter some compilation flags that nvcc warns about. 2016-03-22 14:26:50 -07:00
Benoit Steiner
6a31b7be3e Avoid using std::vector whenever possible 2016-03-22 14:02:50 -07:00
Benoit Steiner
65a7113a36 Use an enum instead of a static const int to prevent possible link error 2016-03-22 09:33:54 -07:00
Benoit Steiner
f9ad25e4d8 Fixed contractions of 16 bit floats 2016-03-22 09:30:23 -07:00
Benoit Steiner
8ef3181f15 Worked around a constness related issue 2016-03-21 11:24:05 -07:00
Benoit Steiner
7a07d6aa2b Small cleanup 2016-03-21 11:12:17 -07:00
Konstantinos Margaritis
a9a6710e15 add initial s390x(zEC13) ZVECTOR support 2016-03-21 13:46:47 -04:00
Benoit Steiner
e91f255301 Marked variables that's only used in debug mode as such 2016-03-21 10:02:00 -07:00
Benoit Steiner
db5c14de42 Explicitly cast the default value into the proper scalar type. 2016-03-21 09:52:58 -07:00
Christoph Hertzberg
b224771f40 bug #1178: Simplified modification of the SSE control register for better portability 2016-03-20 10:57:08 +01:00
Benoit Steiner
8e03333f06 Renamed some class members to make the code more readable. 2016-03-18 15:21:04 -07:00
Benoit Steiner
6c08943d9f Fixed a bug in the padding of extracted image patches. 2016-03-18 15:19:10 -07:00
Benoit Steiner
134d750eab Completed the implementation of vectorized type casting of half floats. 2016-03-18 13:36:28 -07:00
Benoit Steiner
7bd551b3a9 Make all the conversions explicit 2016-03-18 12:20:08 -07:00
Benoit Steiner
bb0e73c191 Gate all the CUDA tests under the EIGEN_TEST_NVCC option 2016-03-18 12:17:37 -07:00
Benoit Steiner
2db4a04827 Fixed a typo 2016-03-18 12:08:01 -07:00
Benoit Steiner
dd514de8a9 Added a test to validate the fallback path for half floats 2016-03-18 12:02:39 -07:00
Benoit Steiner
9a7ece9caf Worked around constness issue 2016-03-18 10:38:29 -07:00
Benoit Steiner
edc679f6c6 Fixed compilation warning 2016-03-18 07:12:34 -07:00
Benoit Steiner
53d498ef06 Fixed compilation warnings in the cuda tests 2016-03-18 07:04:54 -07:00
Benoit Steiner
e10e126cd0 pulled latest updates from trunk 2016-03-17 21:48:38 -07:00
Benoit Steiner
70eb70f5f8 Avoid mutable class members when possible 2016-03-17 21:47:18 -07:00
Benoit Steiner
7b98de1f15 Implemented some of the missing type casting for half floats 2016-03-17 21:45:45 -07:00
Benoit Steiner
afb81b7ded Made sure to use the hard abi when compiling with NEON instructions to avoid the "gnu/stubs-soft.h: No such file or directory" error 2016-03-17 21:24:24 -07:00
Benoit Steiner
95b8961a9b Allocate the mersenne twister used by the random number generators on the heap instead of on the stack since they tend to keep a lot of state (i.e. about 5k) around. 2016-03-17 15:23:51 -07:00
Benoit Steiner
f7329619da Fix bug in tensor contraction. The code assumes that contraction axis indices for the LHS (after possibly swapping to ColMajor!) is increasing. Explicitly sort the contraction axis pairs to make it so. 2016-03-17 15:08:02 -07:00
Christoph Hertzberg
46aa9772fc Merged in ebrevdo/eigen (pull request PR-169)
Bugfixes to cuda tests, igamma & igammac implemented, & tests for digamma, igamma, igammac on CPU & GPU.
2016-03-16 21:59:08 +01:00
Eugene Brevdo
f1f7181f53 Merge default branch. 2016-03-16 12:46:19 -07:00
Eugene Brevdo
1f69a1b65f Change the header guard around certain numext functions to be CUDA specific. 2016-03-16 12:44:35 -07:00
Benoit Steiner
ab9b749b45 Improved a test 2016-03-14 20:03:13 -07:00
Benoit Steiner
5a51366ea5 Fixed a typo. 2016-03-14 09:25:16 -07:00
Benoit Steiner
fcf59e1c37 Properly gate the use of cuda intrinsics in the code 2016-03-14 09:13:44 -07:00
Benoit Steiner
97a1f1c273 Make sure we only use the half float intrinsic when compiling with a version of CUDA that is recent enough to provide them 2016-03-14 08:37:58 -07:00
Eugene Brevdo
9550be925d Merge specfun branch. 2016-03-13 15:46:51 -07:00
Eugene Brevdo
b1a9afe9a9 Add tests in array.cpp that check igamma/igammac properties.
This adds to the set of existing tests, which compare a specific
set of values to third party calculated ground truth.
2016-03-13 15:45:34 -07:00
Benoit Steiner
e29c9676b1 Don't mark the cast operator as explicit, since this is a c++11 feature that's not supported by older compilers. 2016-03-12 00:15:58 -08:00
Benoit Steiner
eecd914864 Also replaced uint32_t with unsigned int to make the code more portable 2016-03-11 19:34:21 -08:00
Benoit Steiner
1ca8c1ec97 Replaced a couple more uint16_t with unsigned short 2016-03-11 19:28:28 -08:00
Benoit Steiner
0423b66187 Use unsigned short instead of uint16_t since they're more portable 2016-03-11 17:53:41 -08:00
Benoit Steiner
048c4d6efd Made half floats usable on hardware that doesn't support them natively. 2016-03-11 17:21:42 -08:00
Benoit Steiner
b72ffcb05e Made the comparison of Eigen::array GPU friendly 2016-03-11 16:37:59 -08:00
Benoit Steiner
25f69cb932 Added a comparison operator for Eigen::array
Alias Eigen::array to std::array when compiling with Visual Studio 2015
2016-03-11 15:20:37 -08:00
Benoit Steiner
c5b98a58b8 Updated the cxx11_meta test to work on the Eigen::array class when std::array isn't available. 2016-03-11 11:53:38 -08:00
Benoit Steiner
456e038a4e Fixed the +=, -=, *= and /= operators to return a reference 2016-03-10 15:17:44 -08:00
Benoit Steiner
86d45a3c83 Worked around visual studio compilation warnings. 2016-03-09 21:29:39 -08:00
Benoit Steiner
8fd4241377 Fixed a typo. 2016-03-10 02:28:46 +00:00
Benoit Steiner
a685a6beed Made the list reductions less ambiguous. 2016-03-09 17:41:52 -08:00
Benoit Steiner
3149b5b148 Avoid implicit cast 2016-03-09 17:35:17 -08:00
Benoit Steiner
b2100b83ad Made sure to include the <random> header file when compiling with visual studio 2016-03-09 16:03:16 -08:00
Benoit Steiner
f05fb449b8 Avoid unnecessary conversion from 32bit int to 64bit unsigned int 2016-03-09 15:27:45 -08:00
Benoit Steiner
1d566417d2 Enable the random number generators when compiling with visual studio 2016-03-09 10:55:11 -08:00
Eugene Brevdo
836e92a051 Update MathFunctions/SpecialFunctions with intelligent header guards. 2016-03-09 09:04:45 -08:00
Benoit Steiner
b084133dbf Fixed the integer division code on windows 2016-03-09 07:06:36 -08:00
Benoit Steiner
6d30683113 Fixed static assertion 2016-03-08 21:02:51 -08:00
Eugene Brevdo
5e7de771e3 Properly fix merge issues. 2016-03-08 17:35:05 -08:00
Eugene Brevdo
73220d2bb0 Resolve bad merge. 2016-03-08 17:28:21 -08:00
Eugene Brevdo
5f17de3393 Merge changes. 2016-03-08 17:22:26 -08:00
Eugene Brevdo
14f0fde51f Add certain functions to numext (log, exp, tan) because CUDA doesn't support std::
Use these in SpecialFunctions.
2016-03-08 17:17:44 -08:00
Benoit Steiner
46177c8d64 Replace std::vector with our own implementation, as using the stl when compiling with nvcc and avx enabled leads to many issues. 2016-03-08 16:37:27 -08:00
Benoit Steiner
6d6413f768 Simplified the full reduction code 2016-03-08 16:02:00 -08:00
Benoit Steiner
5a427a94a9 Fixed the tensor generator code 2016-03-08 13:28:06 -08:00
Benoit Steiner
a81b88bef7 Fixed the tensor concatenation code 2016-03-08 12:30:19 -08:00
Benoit Steiner
551ff11d0d Fixed the tensor layout swapping code 2016-03-08 12:28:10 -08:00
Benoit Steiner
8768c063f5 Fixed the tensor chipping code. 2016-03-08 12:26:49 -08:00
Benoit Steiner
e09eb835db Decoupled the packet type definition from the definition of the tensor ops. All the vectorization is now defined in the tensor evaluators. This will make it possible to relialably support devices with different packet types in the same compilation unit. 2016-03-08 12:07:33 -08:00
Benoit Steiner
3b614a2358 Use NumTraits::highest() and NumTraits::lowest() instead of the std::numeric_limits to make the tensor min and max functors more CUDA friendly. 2016-03-07 17:53:28 -08:00
Eugene Brevdo
dd6dcad6c2 Merge branch specfun. 2016-03-07 15:37:12 -08:00
Eugene Brevdo
0bb5de05a1 Finishing touches on igamma/igammac for GPU. Tests now pass. 2016-03-07 15:35:09 -08:00
Benoit Steiner
769685e74e Added the ability to pad a tensor using a non-zero value 2016-03-07 14:45:37 -08:00
Benoit Steiner
7f87cc3a3b Fix a couple of typos in the code. 2016-03-07 14:31:27 -08:00
Eugene Brevdo
5707004d6b Fix Eigen's building of sharded tests that use CUDA & more igamma/igammac bugfixes.
0. Prior to this PR, not a single sharded CUDA test was actually being *run*.
Fixed that.

GPU tests are still failing for igamma/igammac.

1. Add calls for igamma/igammac to TensorBase
2. Fix up CUDA-specific calls of igamma/igammac
3. Add unit tests for digamma, igamma, igammac in CUDA.
2016-03-07 14:08:56 -08:00
Benoit Steiner
e5f25622e2 Added a test to validate the behavior of some of the tensor syntactic sugar. 2016-03-07 09:04:27 -08:00
Benoit Steiner
9f5740cbc1 Added missing include 2016-03-06 22:03:18 -08:00
Benoit Steiner
5238e03fe1 Don't try to compile the uint128 test with compilers that don't support uint127 2016-03-06 21:59:40 -08:00
Benoit Steiner
9a54c3e32b Don't warn that msvc 2015 isn't c++11 compliant just because it doesn't claim to be. 2016-03-06 09:38:56 -08:00
Benoit Steiner
05bbca079a Turn on some of the cxx11 features when compiling with visual studio 2015 2016-03-05 10:52:08 -08:00
Benoit Steiner
6093eb9ff5 Don't test our 128bit emulation code when compiling with msvc 2016-03-05 10:37:11 -08:00
Benoit Steiner
57b263c5b9 Avoid using initializer lists in test since not all version of msvc support them 2016-03-05 08:35:26 -08:00
Benoit Steiner
23aed8f2e4 Use EIGEN_PI instead of redefining our own constant PI 2016-03-05 08:04:45 -08:00
Eugene Brevdo
0b9e0abc96 Make igamma and igammac work correctly.
This required replacing ::abs with std::abs.
Modified some unit tests.
2016-03-04 21:12:10 -08:00
Benoit Steiner
c23e0be18f Use the CMAKE_CXX_STANDARD variable to turn on cxx11 2016-03-04 20:18:01 -08:00
Benoit Steiner
ec35068edc Don't rely on the M_PI constant since not all compilers provide it. 2016-03-04 16:42:38 -08:00
Benoit Steiner
60d9df11c1 Fixed the computation of leading zeros when compiling with msvc. 2016-03-04 16:27:02 -08:00
Benoit Steiner
4e49fd5eb9 MSVC uses __uint128 while other compilers use __uint128_t to encode 128bit unsigned integers. Make the cxx11_tensor_uint128.cpp test work in both cases. 2016-03-04 14:49:18 -08:00
Benoit Steiner
667fcc2b53 Fixed syntax error 2016-03-04 14:37:51 -08:00
Benoit Steiner
4416a5dcff Added missing include 2016-03-04 14:35:43 -08:00
Benoit Steiner
c561eeb7bf Don't use implicit type conversions in initializer lists since not all compilers support them. 2016-03-04 14:12:45 -08:00
Benoit Steiner
174edf976b Made the contraction test more portable 2016-03-04 14:11:13 -08:00
Benoit Steiner
2c50fc878e Fixed a typo 2016-03-04 14:09:38 -08:00
Eugene Brevdo
7ea35bfa1c Initial implementation of igamma and igammac. 2016-03-03 19:39:41 -08:00
Benoit Steiner
deea866bbd Added tests to cover the new rounding, flooring and ceiling tensor operations. 2016-03-03 12:38:02 -08:00
Benoit Steiner
5cf4558c0a Added support for rounding, flooring, and ceiling to the tensor api 2016-03-03 12:36:55 -08:00
Benoit Steiner
dac58d7c35 Added a test to validate the conversion of half floats into floats on Kepler GPUs.
Restricted the testing of the random number generation code to GPU architecture greater than or equal to 3.5.
2016-03-03 10:37:25 -08:00
Benoit Steiner
1032441c6f Enable partial support for half floats on Kepler GPUs. 2016-03-03 10:34:20 -08:00
Benoit Steiner
1da10a7358 Enable the conversion between floats and half floats on older GPUs that support it. 2016-03-03 10:33:20 -08:00
Benoit Steiner
2de8cc9122 Merged in ebrevdo/eigen (pull request PR-167)
Add infinity() support to numext::numeric_limits, use it in lgamma.

I tested the code on my gtx-titan-black gpu, and it appears to work as expected.
2016-03-03 09:42:12 -08:00
Eugene Brevdo
ab3dc0b0fe Small bugfix to numeric_limits for CUDA. 2016-03-02 21:48:46 -08:00
Eugene Brevdo
6afea46838 Add infinity() support to numext::numeric_limits, use it in lgamma.
This makes the infinity access a __device__ function, removing
nvcc warnings.
2016-03-02 21:35:48 -08:00
Gael Guennebaud
3fccef6f50 bug #537: fix compilation with Apples's compiler 2016-03-02 13:22:46 +01:00
Benoit Steiner
fedaf19262 Pulled latest updates from trunk 2016-03-01 06:15:44 -08:00
Gael Guennebaud
dfa80b2060 Compilation fix 2016-03-01 12:48:56 +01:00
Gael Guennebaud
bee9efc203 Compilation fix 2016-03-01 12:47:27 +01:00
Benoit Steiner
68ac5c1738 Improved the performance of large outer reductions on cuda 2016-02-29 18:11:58 -08:00
Benoit Steiner
56a3ada670 Added benchmarks for full reduction 2016-02-29 14:57:52 -08:00
Benoit Steiner
b2075cb7a2 Made the signature of the inner and outer reducers consistent 2016-02-29 10:53:38 -08:00
Benoit Steiner
3284842045 Optimized the performance of narrow reductions on CUDA devices 2016-02-29 10:48:16 -08:00
Gael Guennebaud
e9bea614ec Fix shortcoming in fixed-value deduction of startRow/startCol 2016-02-29 10:31:27 +01:00
Benoit Steiner
609b3337a7 Print some information to stderr when a CUDA kernel fails 2016-02-27 20:42:57 +00:00
Benoit Steiner
1031b31571 Improved the README 2016-02-27 20:22:04 +00:00
Gael Guennebaud
8e6faab51e bug #1172: make valuePtr and innderIndexPtr properly return null for empty matrices. 2016-02-27 14:55:40 +01:00
Benoit Steiner
ac2e6e0d03 Properly vectorized the random number generators 2016-02-26 13:52:24 -08:00
Benoit Steiner
caa54d888f Made the TensorIndexList usable on GPU without having to use the -relaxed-constexpr compilation flag 2016-02-26 12:38:18 -08:00
Benoit Steiner
93485d86bc Added benchmarks for type casting of float16 2016-02-26 12:24:58 -08:00
Benoit Steiner
002824e32d Added benchmarks for fp16 2016-02-26 12:21:25 -08:00
Benoit Steiner
2cd32cad27 Reverted previous commit since it caused more problems than it solved 2016-02-26 13:21:44 +00:00
Benoit Steiner
d9d05dd96e Fixed handling of long doubles on aarch64 2016-02-26 04:13:58 -08:00
Benoit Steiner
af199b4658 Made the CUDA architecture level a build setting. 2016-02-25 09:06:18 -08:00
Benoit Steiner
c36c09169e Fixed a typo in the reduction code that could prevent large full reductionsx from running properly on old cuda devices. 2016-02-24 17:07:25 -08:00
Benoit Steiner
7a01cb8e4b Marked the And and Or reducers as stateless. 2016-02-24 16:43:01 -08:00
Gael Guennebaud
91e1375ba9 merge 2016-02-23 11:09:05 +01:00
Gael Guennebaud
055000a424 Fix startRow()/startCol() for dense Block with direct access:
the initial implementation failed for empty rows/columns for which are ambiguous.
2016-02-23 11:07:59 +01:00
Benoit Steiner
1d9256f7db Updated the padding code to work with half floats 2016-02-23 05:51:22 +00:00
Benoit Steiner
8cb9bfab87 Extended the tensor benchmark suite to support types other than floats 2016-02-23 05:28:02 +00:00
Benoit Steiner
f442a5a5b3 Updated the tensor benchmarking code to work with compilers that don't support cxx11. 2016-02-23 04:15:48 +00:00
Benoit Steiner
72d2cf642e Deleted the coordinate based evaluation of tensor expressions, since it's hardly ever used and started to cause some issues with some versions of xcode. 2016-02-22 15:29:41 -08:00
Benoit Steiner
6270d851e3 Declare the half float type as arithmetic. 2016-02-22 13:59:33 -08:00
Benoit Steiner
5cd00068c0 include <iostream> in the tensor header since we now use it to better report cuda initialization errors 2016-02-22 13:59:03 -08:00
Benoit Steiner
257b640463 Fixed compilation warning generated by clang 2016-02-21 22:43:37 -08:00
Benoit Steiner
584832cb3c Implemented the ptranspose function on half floats 2016-02-21 12:44:53 -08:00
Benoit Steiner
e644f60907 Pulled latest updates from trunk 2016-02-21 20:24:59 +00:00
Benoit Steiner
95fceb6452 Added the ability to compute the absolute value of a half float 2016-02-21 20:24:11 +00:00
Benoit Steiner
ed69cbeef0 Added some debugging information to the test to figure out why it fails sometimes 2016-02-21 11:20:20 -08:00
Benoit Steiner
96a24b05cc Optimized casting of tensors in the case where the casting happens to be a no-op 2016-02-21 11:16:15 -08:00
Benoit Steiner
203490017f Prevent unecessary Index to int conversions 2016-02-21 08:49:36 -08:00
Benoit Steiner
9ff269a1d3 Moved some of the fp16 operators outside the Eigen namespace to workaround some nvcc limitations. 2016-02-20 07:47:23 +00:00
Benoit Steiner
1e6fe6f046 Fixed the float16 tensor test. 2016-02-20 07:44:17 +00:00
Rasmus Munk Larsen
8eb127022b Get rid of duplicate code. 2016-02-19 16:33:30 -08:00
Rasmus Munk Larsen
d5e2ec7447 Speed up tensor FFT by up ~25-50%.
Benchmark                          Base (ns)  New (ns) Improvement
------------------------------------------------------------------
BM_tensor_fft_single_1D_cpu/8            132       134     -1.5%
BM_tensor_fft_single_1D_cpu/9           1162      1229     -5.8%
BM_tensor_fft_single_1D_cpu/16           199       195     +2.0%
BM_tensor_fft_single_1D_cpu/17          2587      2267    +12.4%
BM_tensor_fft_single_1D_cpu/32           373       341     +8.6%
BM_tensor_fft_single_1D_cpu/33          5922      4879    +17.6%
BM_tensor_fft_single_1D_cpu/64           797       675    +15.3%
BM_tensor_fft_single_1D_cpu/65         13580     10481    +22.8%
BM_tensor_fft_single_1D_cpu/128         1753      1375    +21.6%
BM_tensor_fft_single_1D_cpu/129        31426     22789    +27.5%
BM_tensor_fft_single_1D_cpu/256         4005      3008    +24.9%
BM_tensor_fft_single_1D_cpu/257        70910     49549    +30.1%
BM_tensor_fft_single_1D_cpu/512         8989      6524    +27.4%
BM_tensor_fft_single_1D_cpu/513       165402    107751    +34.9%
BM_tensor_fft_single_1D_cpu/999       198293    115909    +41.5%
BM_tensor_fft_single_1D_cpu/1ki        21289     14143    +33.6%
BM_tensor_fft_single_1D_cpu/1k        361980    233355    +35.5%
BM_tensor_fft_double_1D_cpu/8            138       131     +5.1%
BM_tensor_fft_double_1D_cpu/9           1253      1133     +9.6%
BM_tensor_fft_double_1D_cpu/16           218       200     +8.3%
BM_tensor_fft_double_1D_cpu/17          2770      2392    +13.6%
BM_tensor_fft_double_1D_cpu/32           406       368     +9.4%
BM_tensor_fft_double_1D_cpu/33          6418      5153    +19.7%
BM_tensor_fft_double_1D_cpu/64           856       728    +15.0%
BM_tensor_fft_double_1D_cpu/65         14666     11148    +24.0%
BM_tensor_fft_double_1D_cpu/128         1913      1502    +21.5%
BM_tensor_fft_double_1D_cpu/129        36414     24072    +33.9%
BM_tensor_fft_double_1D_cpu/256         4226      3216    +23.9%
BM_tensor_fft_double_1D_cpu/257        86638     52059    +39.9%
BM_tensor_fft_double_1D_cpu/512         9397      6939    +26.2%
BM_tensor_fft_double_1D_cpu/513       203208    114090    +43.9%
BM_tensor_fft_double_1D_cpu/999       237841    125583    +47.2%
BM_tensor_fft_double_1D_cpu/1ki        20921     15392    +26.4%
BM_tensor_fft_double_1D_cpu/1k        455183    250763    +44.9%
BM_tensor_fft_single_2D_cpu/8           1051      1005     +4.4%
BM_tensor_fft_single_2D_cpu/9          16784     14837    +11.6%
BM_tensor_fft_single_2D_cpu/16          4074      3772     +7.4%
BM_tensor_fft_single_2D_cpu/17         75802     63884    +15.7%
BM_tensor_fft_single_2D_cpu/32         20580     16931    +17.7%
BM_tensor_fft_single_2D_cpu/33        345798    278579    +19.4%
BM_tensor_fft_single_2D_cpu/64         97548     81237    +16.7%
BM_tensor_fft_single_2D_cpu/65       1592701   1227048    +23.0%
BM_tensor_fft_single_2D_cpu/128       472318    384303    +18.6%
BM_tensor_fft_single_2D_cpu/129      7038351   5445308    +22.6%
BM_tensor_fft_single_2D_cpu/256      2309474   1850969    +19.9%
BM_tensor_fft_single_2D_cpu/257     31849182  23797538    +25.3%
BM_tensor_fft_single_2D_cpu/512     10395194   8077499    +22.3%
BM_tensor_fft_single_2D_cpu/513     144053843  104242541    +27.6%
BM_tensor_fft_single_2D_cpu/999     279885833  208389718    +25.5%
BM_tensor_fft_single_2D_cpu/1ki     45967677  36070985    +21.5%
BM_tensor_fft_single_2D_cpu/1k      619727095  456489500    +26.3%
BM_tensor_fft_double_2D_cpu/8           1110      1016     +8.5%
BM_tensor_fft_double_2D_cpu/9          17957     15768    +12.2%
BM_tensor_fft_double_2D_cpu/16          4558      4000    +12.2%
BM_tensor_fft_double_2D_cpu/17         79237     66901    +15.6%
BM_tensor_fft_double_2D_cpu/32         21494     17699    +17.7%
BM_tensor_fft_double_2D_cpu/33        357962    290357    +18.9%
BM_tensor_fft_double_2D_cpu/64        105179     87435    +16.9%
BM_tensor_fft_double_2D_cpu/65       1617143   1288006    +20.4%
BM_tensor_fft_double_2D_cpu/128       512848    419397    +18.2%
BM_tensor_fft_double_2D_cpu/129      7271322   5636884    +22.5%
BM_tensor_fft_double_2D_cpu/256      2415529   1922032    +20.4%
BM_tensor_fft_double_2D_cpu/257     32517952  24462177    +24.8%
BM_tensor_fft_double_2D_cpu/512     10724898   8287617    +22.7%
BM_tensor_fft_double_2D_cpu/513     146007419  108603266    +25.6%
BM_tensor_fft_double_2D_cpu/999     296351330  221885776    +25.1%
BM_tensor_fft_double_2D_cpu/1ki     59334166  48357539    +18.5%
BM_tensor_fft_double_2D_cpu/1k      666660132  483840349    +27.4%
2016-02-19 16:29:23 -08:00
Gael Guennebaud
d90a2dac5e merge 2016-02-19 23:01:27 +01:00
Gael Guennebaud
485823b5f5 Add COD and BDCSVD in list of benched solvers. 2016-02-19 23:00:33 +01:00
Gael Guennebaud
2af04f1a57 Extend unit test to stress smart_copy with empty input/output. 2016-02-19 22:59:28 +01:00
Gael Guennebaud
6fa35bbd28 bug #1170: skip calls to memcpy/memmove for empty imput. 2016-02-19 22:58:52 +01:00
Benoit Steiner
46fc23f91c Print an error message to stderr when the initialization of the CUDA runtime fails. This helps debugging setup issues. 2016-02-19 13:44:22 -08:00
Gael Guennebaud
6f0992c05b Fix nesting type and complete reflection methods of Block expressions. 2016-02-19 22:21:02 +01:00
Gael Guennebaud
f3643eec57 Add typedefs for the return type of all block methods. 2016-02-19 22:15:01 +01:00
Benoit Steiner
670db7988d Updated the contraction code to make it compatible with half floats. 2016-02-19 13:03:26 -08:00
Benoit Steiner
180156ba1a Added support for tensor reductions on half floats 2016-02-19 10:05:59 -08:00
Benoit Steiner
5c4901b83a Implemented the scalar division of 2 half floats 2016-02-19 10:03:19 -08:00
Benoit Steiner
f268db1c4b Added the ability to query the minor version of a cuda device 2016-02-19 16:31:04 +00:00
Benoit Steiner
a08d2ff0c9 Started to work on contractions and reductions using half floats 2016-02-19 15:59:59 +00:00
Benoit Steiner
f3352e0fb0 Don't make the array constructors explicit 2016-02-19 15:58:57 +00:00
Benoit Steiner
f7cb755299 Added support for operators +=, -=, *= and /= on CUDA half floats 2016-02-19 15:57:26 +00:00
Benoit Steiner
dc26459b99 Implemented protate() for CUDA 2016-02-19 15:16:54 +00:00
Benoit Steiner
cd042dbbfd Fixed a bug in the tensor type converter 2016-02-19 15:03:26 +00:00
Benoit Steiner
ac5d706a94 Added support for simple coefficient wise tensor expression using half floats on CUDA devices 2016-02-19 08:19:12 +00:00
Benoit Steiner
0606a0a39b FP16 on CUDA are only available starting with cuda 7.5. Disable them when using an older version of CUDA 2016-02-18 23:15:23 -08:00
Benoit Steiner
f36c0c2c65 Added regression test for float16 2016-02-19 06:23:28 +00:00
Benoit Steiner
7151bd8768 Reverted unintended changes introduced by a bad merge 2016-02-19 06:20:50 +00:00
Benoit Steiner
1304e1fb5e Pulled latest updates from trunk 2016-02-19 06:17:02 +00:00
Benoit Steiner
17b9fbed34 Added preliminary support for half floats on CUDA GPU. For now we can simply convert floats into half floats and vice versa 2016-02-19 06:16:07 +00:00
Benoit Steiner
8ce46f9d89 Improved implementation of ptanh for SSE and AVX 2016-02-18 13:24:34 -08:00
Eugene Brevdo
832380c455 Merged eigen/eigen into default 2016-02-17 14:44:06 -08:00
Eugene Brevdo
06a2bc7c9c Tiny bugfix in SpecialFunctions: some compilers don't like doubles
implicitly downcast to floats in an array constructor.
2016-02-17 14:41:59 -08:00
Gael Guennebaud
f6f057bb7d bug #1166: fix shortcomming in gemv when the destination is not a vector at compile-time. 2016-02-15 21:43:07 +01:00
Gael Guennebaud
8e1f1ba6a6 Import wiki's paragraph: "I disabled vectorization, but I'm still getting annoyed about alignment issues" 2016-02-12 22:16:59 +01:00
Gael Guennebaud
c8b4c4b48a bug #795: mention allocate_shared as a condidate for aligned_allocator. 2016-02-12 22:09:16 +01:00
Gael Guennebaud
6eff3e5185 Fix triangularView versus triangularPart. 2016-02-12 17:09:28 +01:00
Gael Guennebaud
4252af6897 Remove dead code. 2016-02-12 16:13:35 +01:00
Gael Guennebaud
2f5f56a820 Fix usage of evaluator in sparse * permutation products. 2016-02-12 16:13:16 +01:00
Gael Guennebaud
0a537cb2d8 bug #901: fix triangular-view with unit diagonal of sparse rectangular matrices. 2016-02-12 15:58:31 +01:00
Gael Guennebaud
b35d1a122e Fix unit test: accessing elements in a deque by offsetting a pointer to another element causes undefined behavior. 2016-02-12 15:31:16 +01:00
Benoit Steiner
9e3f3a2d27 Deleted outdated comment 2016-02-11 17:27:35 -08:00
Benoit Steiner
de345eff2e Added a method to conjugate the content of a tensor or the result of a tensor expression. 2016-02-11 16:34:07 -08:00
Benoit Steiner
17e93ba148 Pulled latest updates from trunk 2016-02-11 15:05:38 -08:00
Benoit Steiner
3628f7655d Made it possible to run the scalar_binary_pow_op functor on GPU 2016-02-11 15:05:03 -08:00
Hauke Heibel
eeac46f980 bug #774: re-added comment referencing equations in the original paper 2016-02-11 19:38:37 +01:00
Benoit Steiner
c569cfe12a Inline the +=, -=, *= and /= operators consistently between DenseBase.h and SelfCwiseBinaryOp.h 2016-02-11 09:33:32 -08:00
Gael Guennebaud
8cc9232b9a bug #774: fix a numerical issue producing unwanted reflections. 2016-02-11 15:32:56 +01:00
Gael Guennebaud
2d35c0cb5f Merged in rmlarsen/eigen (pull request PR-163)
Implement complete orthogonal decomposition in Eigen.
2016-02-11 15:12:34 +01:00
Benoit Steiner
33e2373f01 Merged in nnyby/eigen/nnyby/doc-grammar-fix-linearly-space-linearly-1443742971203 (pull request PR-138)
[doc] grammar fix: "linearly space" -> "linearly spaced"
2016-02-10 23:29:59 -08:00
Benoit Steiner
6d8b1dce06 Avoid implicit cast from double to float. 2016-02-10 18:07:11 -08:00
Benoit Steiner
1dfaafe28a Added a regression test for tanh 2016-02-10 17:41:47 -08:00
Rasmus Munk Larsen
b6fdf7468c Rename inverse -> pseudoInverse. 2016-02-10 13:03:07 -08:00
Benoit Jacob
9d6f1ad398 I'm told to use __EMSCRIPTEN__ by an Emscripten dev. 2016-02-10 12:48:34 -05:00
Benoit Steiner
bfb3fcd94f Optimized implementation of the tanh function for SSE 2016-02-10 08:52:30 -08:00
Benoit Steiner
2d523332b3 Optimized implementation of the hyperbolic tangent function for AVX 2016-02-10 08:48:05 -08:00
Benoit Jacob
e6ee18d6b4 Make the GCC workaround for sqrt GCC-only; detect Emscripten as non-GCC 2016-02-10 11:11:49 -05:00
Benoit Steiner
2ac59e5d36 Pulled latest updates from trunk 2016-02-10 08:03:02 -08:00
Benoit Steiner
9a21b38ccc Worked around a few clang compilation warnings 2016-02-10 08:02:04 -08:00
Benoit Jacob
964a95bf5e Work around Emscripten bug - https://github.com/kripken/emscripten/issues/4088 2016-02-10 10:37:22 -05:00
Benoit Steiner
72ab7879f7 Fixed clang comilation warnings 2016-02-10 06:48:28 -08:00
Benoit Steiner
e88535634d Fixed some clang compilation warnings 2016-02-09 23:32:41 -08:00
Benoit Steiner
970751ece3 Disabling the nvcc warnings in addition to the clang warnings when clang is used as a frontend for nvcc 2016-02-09 20:55:50 -08:00
Benoit Steiner
6323851ea9 Fixed compilation warning 2016-02-09 20:43:41 -08:00
Rasmus Munk Larsen
bb8811c655 Enable inverse() method for computing pseudo-inverse. 2016-02-09 20:35:20 -08:00
Benoit Steiner
5cc0dd5f44 Fixed the code that disables the use of variadic templates when compiling with nvcc on ARM devices. 2016-02-09 10:32:01 -08:00
Benoit Steiner
a9cc6a06b9 Fixed compilation warning in the splines test 2016-02-09 05:10:06 +00:00
Benoit Steiner
d69946183d Updated the TensorIntDivisor code to work properly on LLP64 systems 2016-02-08 21:03:59 -08:00
Benoit Steiner
24d291cf16 Worked around nvcc crash when compiling Eigen on Tegra X1 2016-02-09 02:34:02 +00:00
Rasmus Munk Larsen
53f60e0afc Make applyZAdjointOnTheLeftInPlace protected. 2016-02-08 09:01:43 -08:00
Rasmus Munk Larsen
414efa47d3 Add missing calls to tests of COD.
Fix a few mistakes in 3.2 -> 3.3 port.
2016-02-08 08:50:34 -08:00
Gael Guennebaud
c2bf2f56ef Remove custom unaligned loads for SSE. They were only useful for core2 CPU. 2016-02-08 14:29:12 +01:00
Gael Guennebaud
a4c76f8d34 Improve inlining 2016-02-08 11:33:02 +01:00
Rasmus Munk Larsen
16ec450ca1 Nevermind. 2016-02-06 17:54:01 -08:00
Rasmus Munk Larsen
019fff9a00 Add my name to copyright notice in ColPivHouseholder.h, mostly for previous work on stable norm downdate formula. 2016-02-06 17:48:42 -08:00
Rasmus Munk Larsen
86d6201d7b Merge. 2016-02-06 16:36:56 -08:00
Rasmus Munk Larsen
d904c8ac8f Implement complete orthogonal decomposition in Eigen. 2016-02-06 16:32:00 -08:00
Gael Guennebaud
010afe1619 Add exemples for reshaping/slicing with Map. 2016-02-06 22:49:18 +01:00
Gael Guennebaud
8e599bc098 Fix warning in unit test 2016-02-06 20:26:59 +01:00
Gael Guennebaud
c6a12d1dc6 Fix warning with gcc < 4.8 2016-02-06 18:06:51 +01:00
Benoit Steiner
4d4211c04e Avoid unecessary type conversions 2016-02-05 18:19:41 -08:00
Benoit Steiner
d2cba52015 Only enable the cxx11_tensor_uint128 test on 64 bit machines since 32 bit systems don't support the __uin128_t type 2016-02-05 18:14:23 -08:00
Benoit Steiner
fb00a4af2b Made the tensor fft test compile on tegra x1 2016-02-06 01:42:14 +00:00
Gael Guennebaud
5b2d287878 bug #779: allow non aligned buffers for buffers smaller than the requested alignment. 2016-02-05 21:46:39 +01:00
Gael Guennebaud
e8e1d504d6 Add an explicit assersion on the alignment of the pointer returned by std::malloc 2016-02-05 21:38:16 +01:00
Gael Guennebaud
62a1c911cd Remove posix_memalign, _mm_malloc, and _aligned_malloc special paths. 2016-02-05 21:24:35 +01:00
Rasmus Munk Larsen
093f2b3c01 Merge. 2016-02-04 14:32:19 -08:00
Benoit Steiner
3ca1ae2bb7 Commented out the version of pexp<Packet8d> since it fails to compile with gcc 5.3 2016-02-04 13:49:06 -08:00
Rasmus Munk Larsen
2e39cc40a4 Fix condition that made the unit test spam stdout with bogus error messages. 2016-02-04 12:56:14 -08:00
Benoit Steiner
23f69ab936 Added implementations of pexp, plog, psqrt, and prsqrt optimized for AVX512 2016-02-04 10:36:36 -08:00
Benoit Steiner
6c9cf117c1 Fixed indentation 2016-02-04 10:34:10 -08:00
Benoit Steiner
bcdcdace48 Pulled latest updates from trunk 2016-02-04 08:56:49 -08:00
Gael Guennebaud
659fc9c159 Remove dead code 2016-02-04 09:55:09 +01:00
Gael Guennebaud
d5d7798b9d Improve heuritics for switching between coeff-based and general matrix product implementation. 2016-02-04 09:53:47 +01:00
Benoit Steiner
f535378995 Added support for vectorized type casting of int to char. 2016-02-03 18:58:29 -08:00
Benoit Steiner
4ab63a3f6f Fixed the initialization of the dummy member of the array class to make it compatible with pairs of element. 2016-02-03 17:23:07 -08:00
Benoit Steiner
727ff26960 Disable 2 more nvcc warning messages 2016-02-03 16:01:37 -08:00
Benoit Steiner
1cbb79cdfd Made sure the dummy element of size 0 array is always intialized to silence some compiler warnings 2016-02-03 15:58:26 -08:00
Benoit Steiner
bcbde37a11 Made sure the code compiles when EIGEN_HAS_C99_MATH isn't defined 2016-02-03 14:53:08 -08:00
Benoit Steiner
f933f69021 Added a few comments 2016-02-03 14:12:18 -08:00
Benoit Steiner
5d82e47ef6 Properly disable nvcc warning messages in user code. 2016-02-03 14:10:06 -08:00
Benoit Steiner
af8436b196 Silenced the "calling a __host__ function from a __host__ __device__ function is not allowed" messages 2016-02-03 13:48:36 -08:00
Benoit Steiner
d7742d22e4 Revert the nvcc messages to their default severity instead of the forcing them to be warnings 2016-02-03 13:47:28 -08:00
Benoit Steiner
ac26e1aaf3 Pulled latest updates from trunk 2016-02-03 12:52:20 -08:00
Benoit Steiner
492fe7ce02 Silenced some unhelpful warnings generated by nvcc. 2016-02-03 12:51:19 -08:00
Gael Guennebaud
b70db60e4d Merged in rmlarsen/eigen (pull request PR-161)
Change Eigen's ColPivHouseholderQR to use  numerically stable norm downdate formula
2016-02-03 21:37:06 +01:00
Rasmus Munk Larsen
5fb04ab2da Fix bad line break. Don't repeat Kahan matrix test since it is deterministic. 2016-02-03 10:12:10 -08:00
Rasmus Munk Larsen
d9a6f86cc0 Make the array of directly compute column norms a member to avoid allocation in computeInPlace. 2016-02-03 09:55:30 -08:00
Gael Guennebaud
70dc14e4e1 bug #1161: fix division by zero for huge scalar types 2016-02-03 18:25:41 +01:00
Damien R
c301f99208 bug #1164: fix list and deque specializations such that our aligned allocator is automatically activatived only when the user did not specified an allocator (or specified the default std::allocator). 2016-02-03 18:07:25 +01:00
Gael Guennebaud
eb6d9aea0e Clarify error message when writing to a read-only sparse-sub-matrix. 2016-02-03 16:58:23 +01:00
Gael Guennebaud
040cf33e8f merge 2016-02-03 16:09:51 +01:00
Gael Guennebaud
c85fbfd0b7 Clarify documentation on the restrictions of writable sparse block expressions. 2016-02-03 16:08:43 +01:00
Benoit Steiner
dc413dbe8a Merged in ville-k/eigen/explicit_long_constructors (pull request PR-158)
Add constructor for long types.
2016-02-02 20:58:06 -08:00
Ville Kallioniemi
783018d8f6 Use EIGEN_STATIC_ASSERT for backward compatibility. 2016-02-02 16:45:12 -07:00
Benoit Steiner
99cde88341 Don't try to use direct offsets when computing a tensor product, since the required stride isn't available. 2016-02-02 11:06:53 -08:00
Ville Kallioniemi
ff0a83aaf8 Use single template constructor to avoid overload resolution issues. 2016-02-02 00:33:25 -07:00
Ville Kallioniemi
aedea349aa Replace separate low word constructors with a single templated constructor. 2016-02-01 20:25:02 -07:00
Ville Kallioniemi
f0fdefa96f Rebase to latest. 2016-02-01 19:32:31 -07:00
Benoit Steiner
d93b71a301 Updated the packetmath test to call predux_half instead of predux4 2016-02-01 15:18:33 -08:00
Benoit Steiner
ef66f2887b Updated the matrix multiplication code to make it compile with AVX512 enabled. 2016-02-01 14:38:05 -08:00
Benoit Steiner
85b6d82b49 Generalized predux4 to support AVX512 packets, and renamed it predux_half.
Disabled the implementation of pabs for avx512 since the corresponding intrinsics are not shipped with gcc
2016-02-01 14:35:51 -08:00
Benoit Steiner
64ce78c2ec Cleaned up a tensor contraction test 2016-02-01 13:57:41 -08:00
Benoit Steiner
0ce5d32be5 Sharded the cxx11_tensor_contract_cuda test 2016-02-01 13:33:23 -08:00
Benoit Steiner
922b5f527b Silenced a few compilation warnings 2016-02-01 13:30:49 -08:00
Benoit Steiner
6b5dff875e Made it possible to limit the number of blocks that will be used to evaluate a tensor expression on a CUDA device. This makesit possible to set aside streaming multiprocessors for other computations. 2016-02-01 12:46:32 -08:00
Rasmus Munk Larsen
00f9ef6c76 merging. 2016-02-01 11:10:30 -08:00
Benoit Steiner
264f8141f8 Shared the tensor reduction test 2016-02-01 07:44:31 -08:00
Benoit Steiner
11bb71c8fc Sharded the tensor device test 2016-02-01 07:34:59 -08:00
Gael Guennebaud
ff1157bcbf bug #694: document that SparseQR::matrixR is not sorted. 2016-02-01 16:09:34 +01:00
Gael Guennebaud
ec469700dc bug #557: make InnerIterator of sparse storage types more versatile by adding default-ctor, copy-ctor/assignment 2016-02-01 15:04:33 +01:00
Gael Guennebaud
6e0a86194c Fix integer path for num_steps==1 2016-02-01 15:00:04 +01:00
Gael Guennebaud
e1d219e5c9 bug #698: fix linspaced for integer types. 2016-02-01 14:25:34 +01:00
Gael Guennebaud
2c3224924b Fix warning and replace min/max macros by calls to mini/maxi 2016-02-01 10:23:45 +01:00
Benoit Steiner
e80ed948e1 Fixed a number of compilation warnings generated by the cuda tests 2016-01-31 20:09:41 -08:00
Benoit Steiner
6720b38fbf Fixed a few compilation warnings 2016-01-31 16:48:50 -08:00
Benoit Steiner
3f1ee45833 Fixed compilation errors triggered by duplicate inline declaration 2016-01-31 10:48:49 -08:00
Benoit Steiner
70be6f6531 Pulled latest changes from trunk 2016-01-31 10:44:45 -08:00
Benoit Steiner
4a2ddfb81d Sharded the CUDA argmax tensor test 2016-01-31 10:44:15 -08:00
Gael Guennebaud
d142165942 bug #667: declare several critical functions as FORECE_INLINE to make ICC happier.
<g.gael@free.fr> HG: branch 'default' HG: changed Eigen/src/Core/ArrayBase.h HG: changed Eigen/src/Core/AssignEvaluator.h HG: changed
Eigen/src/Core/CoreEvaluators.h HG: changed Eigen/src/Core/CwiseUnaryOp.h HG: changed Eigen/src/Core/DenseBase.h HG: changed Eigen/src/Core/MatrixBase.h
2016-01-31 16:34:10 +01:00
Gael Guennebaud
a4e4542b89 Avoid overflow in unit test. 2016-01-30 22:26:17 +01:00
Gael Guennebaud
3ba8a3ab1a Disable underflow unit test on the i387 FPU. 2016-01-30 22:14:04 +01:00
Benoit Steiner
483082ef6e Fixed a few memory leaks in the cuda tests 2016-01-30 11:59:22 -08:00
Benoit Steiner
bd21aba181 Sharded the cxx11_tensor_cuda test and fixed a memory leak 2016-01-30 11:47:09 -08:00
Benoit Steiner
9de155d153 Added a test to cover threaded tensor shuffling 2016-01-30 10:56:47 -08:00
Benoit Steiner
32088c06a1 Made the comparison between single and multithreaded contraction results more resistant to numerical noise to prevent spurious test failures. 2016-01-30 10:51:14 -08:00
Benoit Steiner
2053478c56 Made sure to use a tensor of rank 0 to store the result of a full reduction in the tensor thread pool test 2016-01-30 10:46:36 -08:00
Benoit Steiner
d0db95f730 Sharded the tensor thread pool test 2016-01-30 10:43:57 -08:00
Benoit Steiner
ba27c8a7de Made the CUDA contract test more robust to numerical noise. 2016-01-30 10:28:43 -08:00
Benoit Steiner
4281eb1e2c Added 2 benchmarks to the suite of tensor benchmarks running on GPU 2016-01-30 10:20:43 -08:00
Gael Guennebaud
102fa96a96 Extend doc on dense+sparse 2016-01-30 14:58:21 +01:00
Gael Guennebaud
1bc207c528 backout changeset d4a9e61569
: the extended SparseView is not needed anymore
2016-01-30 14:43:21 +01:00
Gael Guennebaud
8ed1553d20 bug #632: implement general coefficient-wise "dense op sparse" operations through specialized evaluators instead of using SparseView.
This permits to deal with arbitrary storage order, and to by-pass the more complex iterator of the sparse-sparse case.
2016-01-30 14:39:50 +01:00
Gael Guennebaud
699634890a bug #946: generalize Cholmod::solve to handle any rhs expression 2016-01-29 23:02:22 +01:00
Gael Guennebaud
15084cf1ac bug #632: add support for "dense +/- sparse" operations. The current implementation is based on SparseView to make the dense subexpression compatible with the sparse one. 2016-01-29 22:09:45 +01:00
Gael Guennebaud
d4a9e61569 Extend SparseView to allow keeping explicit zeros. This is equivalent to sparseView(1,-1) but faster because the test is removed at compile-time. 2016-01-29 22:07:56 +01:00
Gael Guennebaud
d8d37349c3 bug #696: enable zero-sized block at compile-time by relaxing the respective assertion 2016-01-29 12:44:49 +01:00
Gael Guennebaud
e8ccc06fe5 merge 2016-01-29 09:40:38 +01:00
Benoit Steiner
963f2d2a8f Marked several methods EIGEN_DEVICE_FUNC 2016-01-28 23:37:48 -08:00
Benoit Steiner
c5d25bf1d0 Fixed a couple of compilation warnings. 2016-01-28 23:15:45 -08:00
Benoit Steiner
e4f83bae5d Fixed the tensor benchmarks on apple devices 2016-01-28 21:08:07 -08:00
Benoit Steiner
10bea90c4a Fixed clang related compilation error 2016-01-28 20:52:08 -08:00
Benoit Steiner
d3f533b395 Fixed compilation warning 2016-01-28 20:09:45 -08:00
Abhijit Kundu
3fde202215 Making ceil() functor generic w.r.t packet type 2016-01-28 21:27:00 -05:00
Benoit Steiner
211d350fc3 Fixed a typo 2016-01-28 17:13:04 -08:00
Benoit Steiner
bd2e5a788a Made sure the number of floating point operations done by a benchmark is computed using 64 bit integers to avoid overflows. 2016-01-28 17:10:40 -08:00
Benoit Steiner
120e13b1b6 Added a readme to explain how to compile the tensor benchmarks. 2016-01-28 17:06:00 -08:00
Benoit Steiner
a68864b6bc Updated the benchmarking code to print the number of flops processed instead of the number of bytes. 2016-01-28 16:51:40 -08:00
Benoit Steiner
8217281ae4 Merge latest updates from trunk 2016-01-28 16:20:53 -08:00
Benoit Steiner
c8d5f21941 Added extra tensor benchmarks 2016-01-28 16:20:36 -08:00
Benoit Steiner
7b3044d086 Made sure to call nvcc with the relaxed-constexpr flag. 2016-01-28 15:36:34 -08:00
Rasmus Munk Larsen
acce4dd050 Change Eigen's ColPivHouseholderQR to use the numerically stable norm downdate formula from http://www.netlib.org/lapack/lawnspdf/lawn176.pdf, which has been used in LAPACK's xGEQPF and xGEQP3 since 2006. With the old formula, the code chooses the wrong pivots and fails to correctly determine rank on graded matrices.
This change also adds additional checks for non-increasing diagonal in R11 to existing unit tests, and adds a new unit test with the Kahan matrix, which consistently fails for the original code.

Benchmark timings on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz. Code compiled with AVX & FMA. I just ran on square matrices of 3 difference sizes.

Benchmark               Time(ns)     CPU(ns) Iterations
-------------------------------------------------------
Before:
BM_EigencolPivQR/64        53677       53627      12890
BM_EigencolPivQR/512    15265408    15250784         46
BM_EigencolPivQR/4k  15403556228 15388788368          2

After (non-vectorized version):
Benchmark               Time(ns)     CPU(ns) Iterations  Degradation
--------------------------------------------------------------------
BM_EigencolPivQR/64        63736       63669      10844         18.5%
BM_EigencolPivQR/512    16052546    16037381         43          5.1%
BM_EigencolPivQR/4k  15149263620 15132025316          2         -2.0%

Performance-wise there seems to be a ~18.5% degradation for small (64x64) matrices, probably due to the cost of more O(min(m,n)^2) sqrt operations that are not needed for the unstable formula.
2016-01-28 15:07:26 -08:00
Gael Guennebaud
b908e071a8 bug #178: get rid of some const_cast in SparseCore 2016-01-28 22:11:18 +01:00
Gael Guennebaud
c1d900af61 bug #178: remove additional const on nested expression, and remove several const_cast. 2016-01-28 21:43:20 +01:00
Benoit Steiner
12f8bd12a2 Merged in jiayq/eigen (pull request PR-159)
Modifications to the tensor benchmarks to allow compilation in a standalone fashion.
2016-01-28 11:28:55 -08:00
Yangqing Jia
270c4e1ecd bugfix 2016-01-28 11:11:45 -08:00
Yangqing Jia
c4e47630b1 benchmark modifications to make it compilable in a standalone fashion. 2016-01-28 10:35:14 -08:00
Gael Guennebaud
f50bb1e6f3 Fix compilation with gcc 2016-01-28 13:25:26 +01:00
Gael Guennebaud
ddf64babde merge 2016-01-28 13:21:48 +01:00
Gael Guennebaud
df15fbc452 bug #1158: PartialReduxExpr is a vector expression, and it thus must expose the LinearAccessBit flag 2016-01-28 13:16:30 +01:00
Gael Guennebaud
9bcadb7fd1 Disable stupid MSVC warning 2016-01-28 12:14:16 +01:00
Gael Guennebaud
b4d87fff4a Fix MSVC warning. 2016-01-28 12:12:30 +01:00
Gael Guennebaud
2bad3e78d9 bug #96, bug #1006: fix by value argument in result_of. 2016-01-28 12:12:06 +01:00
Gael Guennebaud
7802a6bb1c Fix unit test filename. 2016-01-28 09:35:37 +01:00
Benoit Steiner
4bf9eaf77a Deleted an invalid assertion that prevented the assignment of empty tensors. 2016-01-27 17:09:30 -08:00
Benoit Steiner
291069e885 Fixed some compilation problems with nvcc + clang 2016-01-27 15:37:03 -08:00
Benoit Steiner
47ca9dc809 Fixed the tensor_cuda test 2016-01-27 14:58:48 -08:00
Benoit Steiner
55a5204319 Fixed the flags passed to nvcc to compile the tensor code. 2016-01-27 14:46:34 -08:00
Gael Guennebaud
4865e1e732 Update link to suitesparse. 2016-01-27 22:48:40 +01:00
Benoit Steiner
9dfbd4fe8d Made the cuda tests compile using make check 2016-01-27 12:22:17 -08:00
Benoit Steiner
5973bcf939 Properly specify the namespace when calling cout/endl 2016-01-27 12:04:42 -08:00
Eugene Brevdo
c8d94ae944 digamma special function: merge shared code.
Moved type-specific code into a helper class digamma_impl_maybe_poly<Scalar>.
2016-01-27 09:52:29 -08:00
Gael Guennebaud
9c8f7dfe94 bug #1156: fix several function declarations whose arguments were passed by value instead of being passed by reference 2016-01-27 18:34:42 +01:00
Gael Guennebaud
9aa6fae123 bug #1154: move to dynamic scheduling for spmv products. 2016-01-27 18:03:51 +01:00
Gael Guennebaud
9ac8e8c6a1 Extend mixing type unit test with trmv, and the following not yet supported products: trmm, symv, symm 2016-01-27 17:29:53 +01:00
Gael Guennebaud
6da5d87f92 add nomalloc unit test for rank2 updates 2016-01-27 17:26:48 +01:00
Gael Guennebaud
9801c959e6 Fix tri = complex * real product, and add respective unit test. 2016-01-27 17:12:25 +01:00
Gael Guennebaud
21b5345782 Add meta_least_common_multiple helper. 2016-01-27 17:11:39 +01:00
Gael Guennebaud
fecea26d93 Extend doc on shifting strategy 2016-01-27 15:55:15 +01:00
Ville Kallioniemi
02db1228ed Add constructor for long types. 2016-01-26 23:41:01 -07:00
Gael Guennebaud
412bb5a631 Remove redundant test. 2016-01-26 23:35:30 +01:00
Gael Guennebaud
0f8d26c6a9 Doc: add flip* and arrayfun MatLab equivalent. 2016-01-26 23:34:48 +01:00
Gael Guennebaud
cfa21f8123 Remove dead code. 2016-01-26 23:33:15 +01:00
Gael Guennebaud
6850eab33b Re-enable blocking on rows in non-l3 blocking mode. 2016-01-26 23:32:48 +01:00
Gael Guennebaud
aa8c6a251e Make sure that micro-panel-size is smaller than blocking sizes (otherwise we might get a buffer overflow) 2016-01-26 23:31:48 +01:00
Gael Guennebaud
5b0a9ee003 Make sure that block sizes are smaller than input matrix sizes. 2016-01-26 23:30:24 +01:00
Benoit Jacob
639b1d864a bug #1152: Fix data race in static initialization of blas 2016-01-26 11:44:16 -05:00
Christoph Hertzberg
44d4674955 bug #1153: Don't rely on __GXX_EXPERIMENTAL_CXX0X__ to detect C++11 support 2016-01-26 16:45:33 +01:00
Hauke Heibel
5eb2790be0 Fixed minor typo in SplineFitting. 2016-01-25 22:17:52 +01:00
Gael Guennebaud
8328caa618 bug #51: add block preallocation mechanism to selfadjoit*matrix product. 2016-01-25 22:06:42 +01:00
Gael Guennebaud
2f9e6314b1 update BLAS interface to general_matrix_matrix_triangular_product 2016-01-25 21:56:05 +01:00
Gael Guennebaud
e58827d2ed bug #51: make general_matrix_matrix_triangular_product use L3-blocking helper so that general symmetric rank-updates and general-matrix-to-triangular products do not trigger dynamic memory allocation for fixed size matrices. 2016-01-25 17:16:33 +01:00
Gael Guennebaud
c10021c00a bug #1144: clarify the doc about aliasing in case of resizing and matrix product. 2016-01-25 15:50:55 +01:00
Gael Guennebaud
b114e6fd3b Improve documentation. 2016-01-25 11:56:25 +01:00
Gael Guennebaud
869b4443ac Add SparseVector::conservativeResize() method. 2016-01-25 11:55:39 +01:00
Benoit Steiner
e3a15a03a4 Don't explicitely evaluate the subexpression from TensorForcedEval::evalSubExprIfNeeded, as it will be done when executing the EvalTo subexpression 2016-01-24 23:04:50 -08:00
Benoit Steiner
bd207ce11e Added missing EIGEN_DEVICE_FUNC qualifier 2016-01-24 20:36:05 -08:00
Gael Guennebaud
acf6f7af6b Merged in larsmans/eigen (pull request PR-156)
Documentation fixes
2016-01-24 22:28:49 +01:00
Lars Buitinck
cc482e32f1 Method is called visit, not visitor 2016-01-24 15:50:59 +01:00
Lars Buitinck
19e437daf0 Copyedit documentation: typos, spelling 2016-01-24 15:50:36 +01:00
Gael Guennebaud
1cf85bd875 bug #977: add stableNormalize[d] methods: they are analogues to normalize[d] but with carefull handling of under/over-flow 2016-01-23 22:40:11 +01:00
Gael Guennebaud
369d6d1ae3 Add link to reference paper. 2016-01-23 22:16:03 +01:00
Gael Guennebaud
0caa4b1531 bug #1150: make IncompleteCholesky more robust by iteratively increase the shift until the factorization succeed (with at most 10 attempts). 2016-01-23 22:13:54 +01:00
Benoit Steiner
cb4e53ff7f Merged in ville-k/eigen/tensorflow_fix (pull request PR-153)
Add ctor for long
2016-01-22 19:11:31 -08:00
Ville Kallioniemi
9f94e030c1 Re-add executable flags to minimize changeset. 2016-01-22 20:08:45 -07:00
Benoit Steiner
3aeeca32af Leverage the new blocking code in the tensor contraction code. 2016-01-22 16:36:30 -08:00
Benoit Steiner
4beb447e27 Created a mechanism to enable contraction mappers to determine the best blocking strategy. 2016-01-22 14:37:26 -08:00
Gael Guennebaud
5358c38589 bug #1095: add Cholmod*::logDeterminant/determinant (from patch of Joshua Pritikin) 2016-01-22 16:05:29 +01:00
Gael Guennebaud
6a44ccb58b Backout changeset 690bc950f7 2016-01-22 15:03:53 +01:00
Gael Guennebaud
06971223ef Unify std::numeric_limits and device::numeric_limits within numext namespace 2016-01-22 15:02:21 +01:00
Ville Kallioniemi
9b6c72958a Update to latest default branch 2016-01-21 23:08:54 -07:00
Ville Kallioniemi
73aec9219b Make use of 32 bit ints explicit and remove executable bit from headers. 2016-01-21 23:00:32 -07:00
Benoit Steiner
7b68cf2e0f Pulled latest updates from trunk 2016-01-21 17:17:56 -08:00
Benoit Steiner
c33479324c Fixed a constness bug 2016-01-21 17:08:11 -08:00
Gael Guennebaud
ee37eb4eed bug #977: avoid division by 0 in normalize() and normalized(). 2016-01-21 20:43:42 +01:00
Gael Guennebaud
7cae8918c0 Fix compilation on old gcc+AVX 2016-01-21 20:30:32 +01:00
Gael Guennebaud
8dca9f97e3 Add numext::sqrt function to enable custom optimized implementation.
This changeset add two specializations for float/double on SSE. Those
are mostly usefull with GCC for which std::sqrt add an extra and costly
check on the result of _mm_sqrt_*. Clang does not add this burden.

In this changeset, only DenseBase::norm() makes use of it.
2016-01-21 20:18:51 +01:00
Gael Guennebaud
34340458cb bug #1151: remove useless critical section 2016-01-21 14:29:45 +01:00
Jan Prach
690bc950f7 fix clang warnings
"braces around scalar initializer"
2016-01-20 19:35:59 -08:00
Benoit Steiner
f2a842294f Pulled latest updates from the trunk 2016-01-20 18:12:53 -08:00
Benoit Steiner
7ce932edd3 Small cleanup and small fix to the contraction of row major tensors 2016-01-20 18:12:08 -08:00
Gael Guennebaud
62f7e77711 add upper|lower case in incomplete_cholesky unit test 2016-01-21 00:02:59 +01:00
Benoit Steiner
47076bf00e Reduce the register pressure exerted by the tensor mappers whenever possible. This improves the performance of the contraction of a matrix with a vector by about 35%. 2016-01-20 14:51:48 -08:00
Benoit Steiner
ebd3388ee6 Pulled latest updates from trunk 2016-01-20 13:56:43 -08:00
Gael Guennebaud
ed8ade9c65 bug #1149: fix Pastix*::*parm() 2016-01-20 19:01:24 +01:00
Gael Guennebaud
4c5e96aab6 bug #1148: silent Pastix by default 2016-01-20 18:56:17 +01:00
Gael Guennebaud
db237d0c75 bug #1145: fix PastixSupport LLT/LDLT wrappers (missing resize prior to calls to selfAdjointView) 2016-01-20 18:49:01 +01:00
Gael Guennebaud
0b7169d1f7 bug #1147: fix compilation of PastixSupport 2016-01-20 18:15:59 +01:00
Gael Guennebaud
234a1094b7 Add static assertion to y(), z(), w() accessors 2016-01-20 09:18:44 +01:00
Ville Kallioniemi
915e7667cd Remove executable bit from header files 2016-01-19 21:17:29 -07:00
Ville Kallioniemi
2832175a68 Use explicitly 32 bit integer types in constructors. 2016-01-19 20:12:17 -07:00
Benoit Steiner
df79c00901 Improved the formatting of the code 2016-01-19 17:24:08 -08:00
Benoit Steiner
6d472d8375 Moved the contraction mapping code to its own file to make the code more manageable. 2016-01-19 17:22:05 -08:00
Benoit Steiner
b3b722905f Improved code indentation 2016-01-19 17:09:47 -08:00
Benoit Steiner
5b7713dd33 Record whether the underlying tensor storage can be accessed directly during the evaluation of an expression. 2016-01-19 17:05:10 -08:00
Ville Kallioniemi
63fb66f53a Add ctor for long 2016-01-17 21:25:36 -07:00
Eugene Brevdo
6a75e7e0d5 Digamma cleanup
* Added permission from cephes author to use his code
* Cleanup in ArrayCwiseUnaryOps
2016-01-15 16:32:21 -08:00
Benoit Steiner
34057cff23 Fixed a race condition that could affect some reductions on CUDA devices. 2016-01-15 15:11:56 -08:00
Benoit Steiner
0461f0153e Made it possible to compare tensor dimensions inside a CUDA kernel. 2016-01-15 11:22:16 -08:00
Benoit Steiner
aed4cb1269 Use warp shuffles instead of shared memory access to speedup the inner reduction kernel. 2016-01-14 21:45:14 -08:00
Benoit Steiner
c1a42c2d0d Don't disable the AVX implementations of plset when compiling with AVX512 enabled 2016-01-14 17:21:39 -08:00
Benoit Steiner
0366478df8 Added alignment requirement to the AVX512 packet traits. 2016-01-14 17:02:39 -08:00
Benoit Steiner
3cfd16f3af Fixed the signature of the plset primitives for AVX512 2016-01-14 16:58:01 -08:00
Benoit Steiner
67f44365ea Fixed the AVX512 signature of the ptranspose primitives 2016-01-14 16:51:11 -08:00
Benoit Steiner
a282eb1363 pscatter/pgather use Index instead of int to specify the stride 2016-01-14 16:39:39 -08:00
Benoit Steiner
7832485575 Deleted unnecessary commas and semicolons 2016-01-14 16:36:29 -08:00
Benoit Steiner
8fe2532e70 Fixed a boundary condition bug in the outer reduction kernel 2016-01-14 09:29:48 -08:00
Benoit Steiner
9f013a9d86 Properly record the rank of reduced tensors in the tensor traits. 2016-01-13 14:24:37 -08:00
Benoit Steiner
79b69b7444 Trigger the optimized matrix vector path more conservatively. 2016-01-12 15:21:09 -08:00
Benoit Steiner
d920d57f38 Improved the performance of the contraction of a 2d tensor with a 1d tensor by a factor of 3 or more. This helps speedup LSTM neural networks. 2016-01-12 11:32:27 -08:00
Benoit Steiner
bd7d901da9 Reverted a previous change that tripped nvcc when compiling in debug mode. 2016-01-11 17:49:44 -08:00
Benoit Steiner
bbdabbb379 Made the blas utils usable from within a cuda kernel 2016-01-11 17:26:56 -08:00
Benoit Steiner
c5e6900400 Silenced a few compilation warnings. 2016-01-11 17:06:39 -08:00
Benoit Steiner
f894736d61 Updated the tensor traits: the alignment is not part of the Flags enum anymore 2016-01-11 16:42:18 -08:00
Benoit Steiner
4f7714d72c Enabled the use of fixed dimensions from within a cuda kernel. 2016-01-11 16:01:00 -08:00
Benoit Steiner
01c55d37e6 Deleted unused variable. 2016-01-11 15:53:19 -08:00
Benoit Steiner
0504c56ea7 Silenced a nvcc compilation warning 2016-01-11 15:49:21 -08:00
Benoit Steiner
b523771a24 Silenced several compilation warnings triggered by nvcc. 2016-01-11 14:25:43 -08:00
Benoit Steiner
2c3b13eded Merged in jeremy_barnes/eigen/shader-model-3.0 (pull request PR-152)
Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations.
2016-01-11 11:43:37 -08:00
Benoit Steiner
2ccb1c8634 Fixed a bug in the dispatch of optimized reduction kernels. 2016-01-11 10:36:37 -08:00
Benoit Steiner
780623261e Re-enabled the optimized reduction CUDA code. 2016-01-11 09:07:14 -08:00
Jeremy Barnes
91678f489a Cleaned up double-defined macro from last commit 2016-01-10 22:44:45 -05:00
Jeremy Barnes
403a7cb6c3 Alternative way of forcing instantiation of device kernels without
causing warnings or requiring device to device kernel invocations.

This allows Tensorflow to work on SM 3.0 (ie, Amazon EC2) machines.
2016-01-10 22:39:13 -05:00
Gael Guennebaud
b557662e58 merge 2016-01-09 08:37:01 +01:00
Gael Guennebaud
8b9dc9f0df bug #1144: fix regression in x=y+A*x (aliasing), and move evaluator_traits::AssumeAliasing to evaluator_assume_aliasing. 2016-01-09 08:30:38 +01:00
Benoit Steiner
e76904af1b Simplified the dispatch code. 2016-01-08 16:50:57 -08:00
Benoit Steiner
d726e864ac Made it possible to use array of size 0 on CUDA devices 2016-01-08 16:38:14 -08:00
Benoit Steiner
3358dfd5dd Reworked the dispatch of optimized cuda reduction kernels to workaround a nvcc bug that prevented the code from compiling in optimized mode in some cases 2016-01-08 16:28:53 -08:00
Benoit Steiner
53749ff415 Prevent nvcc from miscompiling the cuda metakernel. Unfortunately this reintroduces some compulation warnings but it's much better than having to deal with random assertion failures. 2016-01-08 13:53:40 -08:00
Gael Guennebaud
f9d71a1729 extend matlab conversion table 2016-01-08 22:24:45 +01:00
Benoit Steiner
6639b7d6e8 Removed a couple of partial specialization that confuse nvcc and result in errors such as this:
error: more than one partial specialization matches the template argument list of class "Eigen::internal::get<3, Eigen::internal::numeric_list<std::size_t, 1UL, 1UL, 1UL, 1UL>>"
            "Eigen::internal::get<n, Eigen::internal::numeric_list<T, a, as...>>"
            "Eigen::internal::get<n, Eigen::internal::numeric_list<T, as...>>"
2016-01-07 18:45:19 -08:00
Benoit Steiner
0cb2ca5de2 Fixed a typo. 2016-01-06 18:50:28 -08:00
Benoit Steiner
213459d818 Optimized the performance of broadcasting of scalars. 2016-01-06 18:47:45 -08:00
Gael Guennebaud
ee738321aa rm remaining debug code 2016-01-06 14:49:40 +01:00
Christoph Hertzberg
54bf582303 bug #1143: Work-around gcc bug 2016-01-06 11:59:24 +01:00
Benoit Steiner
99093c0fe0 Added support for AVX512 to the build files 2016-01-05 10:02:49 -08:00
Benoit Steiner
cfff40b1d4 Improved the performance of reductions on CUDA devices 2016-01-04 17:25:00 -08:00
Benoit Steiner
515dee0baf Added a 'divup' util to compute the floor of the quotient of two integers 2016-01-04 16:29:26 -08:00
Gael Guennebaud
715f6f049f Improve inline documentation of SparseCompressedBase and its derived classes 2016-01-03 21:56:30 +01:00
Gael Guennebaud
8b0d1eb0f7 Fix numerous doxygen shortcomings, and workaround some clang -Wdocumentation warnings 2016-01-01 21:45:06 +01:00
Gael Guennebaud
9900782e88 Mark AlignedBit and EvalBeforeNestingBit with deprecated attribute, and remove the remaining usages of EvalBeforeNestingBit. 2015-12-30 16:47:49 +01:00
Gael Guennebaud
70404e07c2 Workaround clang -Wdocumentation warning about "/*<" 2015-12-30 16:46:45 +01:00
Gael Guennebaud
addb7066e8 Workaround "empty paragraph" warning with clang -Wdocumentation 2015-12-30 16:45:44 +01:00
Gael Guennebaud
eadc377b3f Add missing doc of Derived template parameter 2015-12-30 16:43:19 +01:00
Gael Guennebaud
29bb599e03 Fix numerous doxygen issues in auto-link generation 2015-12-30 16:04:24 +01:00
Gael Guennebaud
162ccb2938 Fix links to Eigen2-to-Eigen3 porting helpers 2015-12-30 16:03:14 +01:00
Gael Guennebaud
5fae3750b5 Recent versions of doxygen miss-parsed Eigen/* headers 2015-12-30 16:02:05 +01:00
Gael Guennebaud
b84cefe61d Add missing snippets for erf/erfc/lgamma functions. 2015-12-30 15:12:15 +01:00
Gael Guennebaud
16dd82ed51 Add missing snippet for sign/cwiseSign functions. 2015-12-30 15:11:42 +01:00
Gael Guennebaud
978c379ed7 Add missing ctor from uint 2015-12-30 12:52:38 +01:00
Gael Guennebaud
25f2b8d824 bug #1141: add missing initialization of CholmodBase::m_*IsOk 2015-12-29 15:50:11 +01:00
Eugene Brevdo
f2471f31e0 Modify constants in SpecialFunctions to lowercase (avoid name conflicts). 2015-12-28 17:48:38 -08:00
Eugene Brevdo
afb35385bf Change PI* to M_PI* in SpecialFunctions to avoid possible breakage
with external DEFINEs.
2015-12-28 17:34:06 -08:00
Eugene Brevdo
14897600b7 Protect digamma tests behind a EIGEN_HAS_C99_MATH check. 2015-12-24 21:28:18 -08:00
Eugene Brevdo
cef81c9084 Merged eigen/eigen into default 2015-12-24 21:17:33 -08:00
Eugene Brevdo
f7362772e3 Add digamma for CPU + CUDA. Includes tests. 2015-12-24 21:15:38 -08:00
Gael Guennebaud
d2e288ae50 Workaround compilers that do not even define _mm256_set_m128. 2015-12-24 16:53:43 +01:00
Benoit Steiner
bdcbc66a5c Don't attempt to vectorize mean reductions of integers since we can't use
SSE or AVX instructions to divide 2 integers.
2015-12-22 17:51:55 -08:00
Benoit Steiner
a1e08fb2a5 Optimized the configuration of the outer reduction cuda kernel 2015-12-22 16:30:10 -08:00
Benoit Steiner
9c7d96697b Added missing define 2015-12-22 16:11:07 -08:00
Benoit Steiner
e7e6d01810 Made sure the optimized gpu reduction code is actually compiled. 2015-12-22 15:07:33 -08:00
Benoit Steiner
b5d2078c4a Optimized outer reduction on GPUs. 2015-12-22 15:06:17 -08:00
Benoit Steiner
3504ae47ca Made it possible to run the lgamma, erf, and erfc functors on a CUDA gpu. 2015-12-21 15:20:06 -08:00
Benoit Steiner
1c3e78319d Added missing const 2015-12-21 15:05:01 -08:00
Benoit Steiner
9f9d8d2f62 Disabled part of the matrix matrix peeling code that's incompatible with 512 bit registers 2015-12-21 13:04:52 -08:00
Benoit Steiner
b74887d5f2 Implemented most of the packet primitives for AVX512 2015-12-21 11:46:36 -08:00
Benoit Steiner
6ffb208c77 Make sure EIGEN_HAS_MM_MALLOC is set to 1 when using the avx512 instruction set. 2015-12-21 11:23:15 -08:00
Benoit Steiner
994d1c60b9 Free memory allocated using posix_memalign() with free() instead of std::free() 2015-12-21 11:21:39 -08:00
Benoit Steiner
b407948a77 Merged in connor-k/eigen (pull request PR-149)
[doc] Remove extra ';' in Advanced Initialization sample
2015-12-21 09:44:25 -08:00
Benoit Steiner
a6c243617b Fixed a typo in previous change. 2015-12-21 09:05:45 -08:00
Benoit Steiner
51be91f15e Added support for CUDA architectures that don's support for 3.5 capabilities 2015-12-21 08:42:58 -08:00
connor-k
95dd423cca [doc] Remove extra ';' in Tutorial_AdvancedInitialization_Join.cpp 2015-12-21 01:12:26 +00:00
Tal Hadad
c006ecace1 Fix comments 2015-12-20 20:07:06 +02:00
Tal Hadad
bfed274df3 Use RotationBase, test quaternions and support ranges. 2015-12-20 16:24:53 +02:00
Tal Hadad
b091b7e6ea Remove unneccesary comment. 2015-12-20 13:00:07 +02:00
Tal Hadad
fabd8474ff Merged eigen/eigen into default 2015-12-20 12:50:07 +02:00
Tal Hadad
6752a69aa5 Much better tests, and a little bit more functionality. 2015-12-20 12:49:12 +02:00
Benoit Steiner
6d777e1bc7 Fixed a typo. 2015-12-18 19:25:50 -08:00
Benoit Steiner
1b82969559 Add alignment requirement for local buffer used by the slicing op. 2015-12-18 14:36:35 -08:00
Benoit Steiner
75a7fa1919 Doubled the speed of full reductions on GPUs. 2015-12-18 14:07:31 -08:00
Gael Guennebaud
3abd8470ca bug #1140: remove custom definition and use of _mm256_setr_m128 2015-12-18 14:18:59 +01:00
Benoit Steiner
8dd17cbe80 Fixed a clang compilation warning triggered by the use of arrays of size 0. 2015-12-17 14:00:33 -08:00
Benoit Steiner
4aac55f684 Silenced some compilation warnings triggered by nvcc 2015-12-17 13:39:01 -08:00
Benoit Steiner
40e6250fc3 Made it possible to run tensor chipping operations on CUDA devices 2015-12-17 13:29:08 -08:00
Benoit Steiner
2ca55a3ae4 Fixed some compilation error triggered by the tensor code with msvc 2008 2015-12-16 20:45:58 -08:00
Gael Guennebaud
55aef139ff Added tag 3.3-beta1 for changeset 9f9de1aaa9 2015-12-16 21:49:02 +01:00
Gael Guennebaud
9f9de1aaa9 bump to 3.3-beta1 2015-12-16 21:48:48 +01:00
Christoph Hertzberg
49d96aee64 bug #1120: Make sure that SuperLU version is checked 2015-12-16 11:37:16 +01:00
Gael Guennebaud
ae8b217a01 Update doc to make it clear that only SuperLU 4.x is supported 2015-12-16 10:47:03 +01:00
Gael Guennebaud
35d8725c73 Disable AutoDiffScalar generic copy ctor for non compatible scalar types (fix ambiguous template instantiation) 2015-12-16 10:14:24 +01:00
Christoph Hertzberg
92655e7215 bug #1136: Protect isinf for Intel compilers. Also don't distinguish GCC from ICC and don't rely on EIGEN_NOT_A_MACRO, which might not be defined when including this. 2015-12-15 11:34:52 +01:00
Benoit Steiner
17352e2792 Made the entire TensorFixedSize api callable from a CUDA kernel. 2015-12-14 15:20:31 -08:00
Benoit Steiner
75e19fc7ca Marked the tensor constructors as EIGEN_DEVICE_FUNC: This makes it possible to call them from a CUDA kernel. 2015-12-14 15:12:55 -08:00
Gael Guennebaud
140f3a02a8 Fix MKL wrapper for ComplexSchur 2015-12-11 23:31:21 +01:00
Gael Guennebaud
4483c0fdf6 Fix unused variable warning. 2015-12-11 23:29:53 +01:00
Gael Guennebaud
774dba87c8 merge 2015-12-11 23:28:44 +01:00
Gael Guennebaud
c884a8e7f4 merge 2015-12-11 23:07:33 +01:00
Gael Guennebaud
4d708457d0 Increase axpy vector size 2015-12-11 23:07:22 +01:00
Benoit Steiner
b8861b0c25 Make sure the data is aligned on a 64 byte boundary when using avx512 instructions. 2015-12-11 09:19:57 -08:00
Gael Guennebaud
b60a8967f5 bug #1134: fix JacobiSVD pre-allocation
(grafted from f22036f5f8
)
2015-12-11 11:59:11 +01:00
Gael Guennebaud
ca39b1546e Merged in ebrevdo/eigen (pull request PR-148)
Add special functions to eigen: lgamma, erf, erfc.
2015-12-11 11:52:09 +01:00
Gael Guennebaud
82152f2ae6 bug #1132: add EIGEN_MAPBASE_PLUGIN 2015-12-11 11:43:49 +01:00
Gael Guennebaud
4519fd5d40 Fix MKL compilation issue 2015-12-11 11:11:38 +01:00
Gael Guennebaud
7385e6e2ef Remove useless explicit 2015-12-11 11:11:19 +01:00
Gael Guennebaud
bcb4f126a7 Fix compilation of PardisoSupport 2015-12-11 11:11:00 +01:00
Gael Guennebaud
30b5c4cd14 Remove useless "explicit", and fix inline/static order. 2015-12-11 10:59:39 +01:00
Gael Guennebaud
79c1e6d0a6 Fix compilation of MKL support. 2015-12-11 10:55:07 +01:00
Gael Guennebaud
c684a07eba merge 2015-12-11 10:06:38 +01:00
Gael Guennebaud
836da91b3f Fix unit tests wrt EIGEN_DEFAULT_TO_ROW_MAJOR 2015-12-11 10:06:28 +01:00
Benoit Steiner
6af52a1227 Fixed a typo in the constructor of tensors of rank 5. 2015-12-10 23:31:12 -08:00
Benoit Steiner
2d8f2e4042 Made 2 tests compile without cxx11.
HdG: --
2015-12-10 23:20:04 -08:00
Benoit Steiner
8d28a161b2 Use the proper accessor to refer to the value of a scalar tensor 2015-12-10 22:53:56 -08:00
Benoit Steiner
8e00ea9a92 Fixed the coefficient accessors use for the 2d and 3d case when compiling without cxx11 support. 2015-12-10 22:45:10 -08:00
Benoit Steiner
9db8316c93 Updated the cxx11_tensor_custom_op to not require cxx11. 2015-12-10 20:53:44 -08:00
Benoit Steiner
4e324ca6ae Updated the cxx11_tensor_assign test to make it compile without support for cxx11 2015-12-10 20:47:25 -08:00
Benoit Steiner
6acf2bd472 Fixed compilation error triggered by MSVC 2008 2015-12-10 17:17:42 -08:00
Benoit Steiner
9a415fb1e2 Preliminary support for AVX512 2015-12-10 15:34:57 -08:00
Benoit Steiner
b820b097b8 Created EIGEN_HAS_C99_MATH define as Gael suggested. 2015-12-10 13:52:05 -08:00
Gael Guennebaud
df6f54ff63 Fix storage order of PartialRedux 2015-12-10 22:24:58 +01:00
Gael Guennebaud
d1862967a8 Make sure ADOLC is recent enough by searching for adtl.h 2015-12-10 22:23:21 +01:00
Mark Borgerding
22dd368ea0 sign(complex) compiles for GPU 2015-12-10 16:14:29 -05:00
Benoit Steiner
8314962ce2 Only test the lgamma, erf and erfc function when using a C99 compliant compiler 2015-12-10 13:13:45 -08:00
Benoit Steiner
58e06447de Silence a compilation warning 2015-12-10 13:11:36 -08:00
Benoit Steiner
48877a6933 Only implement the lgamma, erf, and erfc functions when using a compiler compliant with the C99 specification. 2015-12-10 13:09:49 -08:00
Gael Guennebaud
46d2f6cd78 Workaround gcc issue with -O3 and the i387 FPU. 2015-12-10 21:33:43 +01:00
Gael Guennebaud
7ad1aaec1d bug #1103: fix neon vectorization of pmul(Packet1cd,Packet1cd) 2015-12-10 16:06:33 +01:00
Gael Guennebaud
b0a1d6f2e5 Improve handling of deprecated EIGEN_INCLUDE_INSTALL_DIR variable 2015-12-10 15:47:06 +01:00
Benoit Steiner
53b196aa5f Simplified the implementation of lgamma, erf, and erfc 2015-12-08 14:17:34 -08:00
Benoit Steiner
e535450573 Cleanup 2015-12-08 14:06:39 -08:00
Benoit Steiner
b630d10b62 Only disable the erf, erfc, and lgamma tests for older versions of c++. 2015-12-07 17:08:08 -08:00
Benoit Steiner
b1ae39794c Simplified the code a bit 2015-12-07 16:46:35 -08:00
Benoit Steiner
73b68d4370 Fixed a couple of typos
Cleaned up the code a bit.
2015-12-07 16:38:48 -08:00
Eugene Brevdo
fa4f933c0f Add special functions to Eigen: lgamma, erf, erfc.
Includes CUDA support and unit tests.
2015-12-07 15:24:49 -08:00
Benoit Steiner
7dfe75f445 Fixed compilation warnings 2015-12-07 08:12:30 -08:00
Gael Guennebaud
ad3d68400e Add matrix-free solver example 2015-12-07 12:33:38 +01:00
Gael Guennebaud
b37036afce Implement wrapper for matrix-free iterative solvers 2015-12-07 12:23:22 +01:00
Benoit Steiner
f4ca8ad917 Use signed integers instead of unsigned ones more consistently in the codebase. 2015-12-04 18:14:16 -08:00
Benoit Steiner
490d26e4c1 Use integers instead of std::size_t to encode the number of dimensions in the Tensor class since most of the code currently already use integers. 2015-12-04 10:15:11 -08:00
Benoit Steiner
d20efc974d Made it possible to use the sigmoid functor within a CUDA kernel. 2015-12-04 09:38:15 -08:00
Benoit Steiner
e25e3a041b Added rsqrt() method to the Array class: this method computes the coefficient-wise inverse square root much more efficiently than calling sqrt().inverse(). 2015-12-03 18:16:35 -08:00
Benoit Steiner
029052d276 Deleted redundant code 2015-12-03 17:08:47 -08:00
Benoit Steiner
c41e9e4bd0 Merged in Unril/eigen-1/Unril/fixes-internal-compiler-error-while-comp-1449156092576 (pull request PR-147)
Fixes internal compiler error while compiling with VC2015 Update1 x64.
2015-12-03 14:26:14 -08:00
Gael Guennebaud
1562e13aba Add missing Rotation2D::operator=(Matrix2x2) 2015-12-03 22:25:26 +01:00
Nikolay Fedorov
944647c0aa Fixes internal compiler error while compiling with VC2015 Update1 x64. 2015-12-03 15:21:43 +00:00
Benoit Steiner
d2d4c45d55 Made it possible to leverage several binary functor in a CUDA kernel
Explicitely specified the return type of the various scalar_cmp_op functors.
2015-12-02 17:21:33 -08:00
Gael Guennebaud
c5b86893e7 bug #1123: add missing documentation of angle() and axis() 2015-12-01 14:45:08 +01:00
Gael Guennebaud
0bb12fa614 Add LU::transpose().solve() and LU::adjoint().solve() API. 2015-12-01 14:38:47 +01:00
Rasmus Munk Larsen
1663d15da7 Add internal method _solve_impl_transposed() to LU decomposition classes that solves A^T x = b or A^* x = b. 2015-11-30 13:39:24 -08:00
Gael Guennebaud
274b2272b7 Make bench_gemm compatible with 3.2 2015-12-01 09:57:31 +01:00
Gael Guennebaud
6c02cbbb0f Fix matrix to quaternion (and angleaxis) conversion for matrix expression. 2015-12-01 09:45:56 +01:00
Gael Guennebaud
844561939f Do not check NeedsToAlign if no static alignment 2015-11-30 22:36:14 +01:00
Gael Guennebaud
1d906d883d Fix degenerate cases in syrk and trsm 2015-11-30 22:20:31 +01:00
Gael Guennebaud
e7a1c48185 Update BLAS API unit tests 2015-11-30 22:19:20 +01:00
Gael Guennebaud
034ca5a22d Clean hardcoded compilation options 2015-11-30 17:05:42 +01:00
Gael Guennebaud
fd727249ad Update ADOL-C support. 2015-11-30 16:00:22 +01:00
Gael Guennebaud
6fcd316f23 Extend superlu cmake script to check version 2015-11-30 14:48:11 +01:00
Gael Guennebaud
afa11d646d Fix UmfPackLU ctor for exppressions 2015-11-27 22:04:22 +01:00
Gael Guennebaud
6bdeb8cfbe bug #918, umfpack: add access to umfpack return code and parameters 2015-11-27 21:58:36 +01:00
Gael Guennebaud
3f32f5ec22 ArrayBase::sign: add unit test and fix doc 2015-11-27 16:27:53 +01:00
Gael Guennebaud
da46b1ed54 bug #1112: fix compilation on exotic architectures 2015-11-27 15:57:18 +01:00
Gael Guennebaud
1261d020c3 bug #1120, superlu: mem_usage_t is now uniquely defined, so let's use it. 2015-11-27 10:39:09 +01:00
Gael Guennebaud
0ff127e896 Preserve CMAKE_CXX_FLAGS in BTL 2015-11-27 10:18:39 +01:00
Gael Guennebaud
ca001d7c2a Big 1009, part 2/2: add static assertion on LinearAccessBit in coeff(index)-like methods. 2015-11-27 10:06:47 +01:00
Gael Guennebaud
91a7059459 bug #1009, part 1/2: make sure vector expressions expose LinearAccessBit flag. 2015-11-27 10:06:07 +01:00
Mark Borgerding
7ddcf97da7 added scalar_sign_op (both real,complex) 2015-11-24 17:15:07 -05:00
Benoit Steiner
44848ac39b Fixed a bug in TensorArgMax.h 2015-11-23 15:58:47 -08:00
Benoit Steiner
547a8608e5 Fixed the implementation of Eigen::internal::count_leading_zeros for MSVC.
Also updated the code to silence bogux warnings generated by nvcc when compilining this function.
2015-11-23 12:17:45 -08:00
Benoit Steiner
562078780a Don't create more cuda blocks than necessary 2015-11-23 11:00:10 -08:00
Benoit Steiner
df31ca3b9e Made it possible to refer t oa GPUDevice from code compile with a regular C++ compiler 2015-11-23 10:03:53 -08:00
Benoit Steiner
1e04059012 Deleted unused variable. 2015-11-23 08:36:54 -08:00
Benoit Steiner
4286b2d494 Pulled latest updates from trunk 2015-11-23 08:28:34 -08:00
Gael Guennebaud
f9fff67a56 Disable "decorated name length exceeded, name was truncated" MSVC warning. 2015-11-23 15:03:24 +01:00
Gael Guennebaud
f3dca16a1d bug #1117: workaround unused-local-typedefs warning when EIGEN_NO_STATIC_ASSERT and NDEBUG are both defined. 2015-11-23 14:07:52 +01:00
Gael Guennebaud
31b661e4ca Add a note on initParallel being optional in C++11. 2015-11-23 13:28:43 +01:00
Gael Guennebaud
8a2659f0cb Improve numerical robustness of some unit tests 2015-11-23 10:53:55 +01:00
Gael Guennebaud
82bd4e546a Merged in dr15jones/eigen (pull request PR-146)
Use a class constructor to initialize CPU cache sizes
2015-11-22 22:50:31 +01:00
Gael Guennebaud
35c17a3fc8 Use overload instead of template full specialization to please old MSVC 2015-11-22 22:09:57 +01:00
Gael Guennebaud
b265979a70 Make FullPivLU::solve use rank() instead of nonzeroPivots(). 2015-11-21 15:03:04 +01:00
Benoit Steiner
9fa65d3838 Split TensorDeviceType.h in 3 files to make it more manageable 2015-11-20 17:42:50 -08:00
Benoit Steiner
a367804856 Added option to force the usage of the Eigen array class instead of the std::array class. 2015-11-20 12:41:40 -08:00
Benoit Steiner
86486eee2d Pulled latest updates from trunk 2015-11-20 11:10:37 -08:00
Benoit Steiner
383d1cc2ed Added proper support for fast 64bit integer division on CUDA 2015-11-20 11:09:46 -08:00
Chris Jones
4946d758c9 Use a class constructor to initialize CPU cache sizes
Using a static instance of a class to initialize the values for
the CPU cache sizes guarantees thread-safe initialization of the
values when using C++11. Therefore under C++11 it is no longer
necessary to call Eigen::initParallel() before calling any eigen
functions on different threads.
2015-11-20 19:58:08 +01:00
Gael Guennebaud
027a846b34 Use .data() instead of &coeffRef(0). 2015-11-20 15:30:10 +01:00
Gael Guennebaud
4522ffd17c Add regression using test for array<complex>/real 2015-11-20 15:29:32 +01:00
Gael Guennebaud
4fc36079e7 Fix overload instantiation for clang 2015-11-20 15:29:03 +01:00
Gael Guennebaud
4a985e793c Workaround msvc broken complex/complex division in unit test 2015-11-20 14:52:08 +01:00
Gael Guennebaud
5c9c0dca4d Add missing using statement to enable fast Array<complex> / real operations. (was ok for Matrix only) 2015-11-20 14:51:36 +01:00
Gael Guennebaud
e1b27bcb0b Workaround MSVC missing overloads of std::fpclassify for integral types 2015-11-20 13:55:34 +01:00
Gael Guennebaud
e52d4f8d8d Add is_integral<> type traits 2015-11-20 13:54:28 +01:00
Benoit Steiner
0ad7c7b1ad Fixed another clang compilation warning 2015-11-19 15:52:51 -08:00
Benoit Steiner
66ff9b2c6c Fixed compilation warning generated by clang 2015-11-19 15:40:32 -08:00
Benoit Steiner
f37a5f1c53 Fixed compilation error triggered by nvcc 2015-11-19 14:34:26 -08:00
Benoit Steiner
04f1284f9a Shard the uint128 test 2015-11-19 14:08:08 -08:00
Benoit Steiner
e2859c6b71 Cleanup the integer division test 2015-11-19 14:07:50 -08:00
Benoit Steiner
f8df393165 Added support for 128bit integers on CUDA devices. 2015-11-19 13:57:27 -08:00
Benoit Steiner
7d1cedd0fe Added numeric limits for unsigned integers 2015-11-18 17:17:44 -08:00
Gael Guennebaud
1994999105 Add regression unit test for prod.maxCoeff(i) 2015-11-18 23:29:07 +01:00
Benoit Steiner
1dd444ea71 Avoid using the version of TensorIntDiv optimized for 32-bit integers when the divisor can be equal to one since it isn't supported. 2015-11-18 11:37:58 -08:00
Benoit Jacob
4926251f13 bug #1115: enable static alignment on ARM outside of old-GCC 2015-11-18 10:55:23 -05:00
Gael Guennebaud
a64156cae5 Workaround i387 issue in unit test 2015-11-16 13:33:54 +01:00
Benoit Steiner
bf792f59e3 Only enable the use of constexpr with nvcc if we're using version 7.5 or above 2015-11-13 12:24:22 -08:00
Benoit Steiner
f1fbd74db9 Added sanity check 2015-11-13 09:07:27 -08:00
Benoit Steiner
1e1755352d Made it possible to compute atan, tanh, sinh and cosh on GPU 2015-11-12 20:19:38 -08:00
Benoit Steiner
7815b84be4 Fixed a compilation warning 2015-11-12 20:16:59 -08:00
Benoit Steiner
10a91930cc Fixed a compilation warning triggered by nvcc 2015-11-12 20:10:52 -08:00
Benoit Steiner
ed4b37de02 Fixed a few compilation warnings 2015-11-12 20:08:01 -08:00
Benoit Steiner
b69248fa2a Added a couple of missing EIGEN_DEVICE_FUNC 2015-11-12 20:01:50 -08:00
Benoit Steiner
0aaa5941df Silenced some compilation warnings triggered by nvcc 2015-11-12 19:11:43 -08:00
Benoit Steiner
2c73633b28 Fixed a few more typos 2015-11-12 18:39:19 -08:00
Benoit Steiner
be08e82953 Fixed typos 2015-11-12 18:37:40 -08:00
Benoit Steiner
e4d45f3440 Only enable the use of const expression when nvcc is called with the -std=c++11 option 2015-11-12 18:18:35 -08:00
Benoit Steiner
150c12e138 Completed the IndexList rewrite 2015-11-12 18:11:56 -08:00
Benoit Steiner
8037826367 Simplified more of the IndexList code. 2015-11-12 17:19:45 -08:00
Benoit Steiner
e9ecfad796 Started to make the IndexList code compile by more compilers 2015-11-12 16:41:14 -08:00
Benoit Steiner
7a1316fcc5 Fixed compilation error with xcode. 2015-11-12 11:05:54 -08:00
Benoit Steiner
737d237722 Made it possible to run some of the CXXMeta functions on a CUDA device. 2015-11-12 09:02:59 -08:00
Benoit Steiner
1e072424e8 Moved the array code into it's own file. 2015-11-12 08:57:04 -08:00
Benoit Steiner
aa5f1ca714 gen_numeric_list takes a size_t, not a int 2015-11-12 08:30:10 -08:00
Gael Guennebaud
dfbb889fe9 Fix missing Dynamic versus HugeCost changes 2015-11-12 12:09:48 +01:00
Gael Guennebaud
e701cb2c7c Update EIGEN_FAST_MATH doc 2015-11-12 12:09:19 +01:00
Benoit Steiner
9fa10fe52d Don't use std::array when compiling with nvcc since nvidia doesn't support the use of STL containers on GPU. 2015-11-11 15:38:30 -08:00
Benoit Steiner
c587293e48 Fixed a compilation warning 2015-11-11 15:35:12 -08:00
Benoit Steiner
7f1c29fb0c Make it possible for a vectorized tensor expression to be executed in a CUDA kernel. 2015-11-11 15:22:50 -08:00
Benoit Steiner
4f471146fb Allow the vectorized version of the Binary and the Nullary functors to run on GPU 2015-11-11 15:19:00 -08:00
Benoit Steiner
99f4778506 Disable SFINAE when compiling with nvcc 2015-11-11 15:04:58 -08:00
Benoit Steiner
5cb18e5b5e Fixed CUDA compilation errors 2015-11-11 14:36:33 -08:00
Benoit Steiner
228edfe616 Use Eigen::NumTraits instead of std::numeric_limits 2015-11-11 09:26:23 -08:00
Taylor Braun-Jones
b836acb799 Further fixes for CMAKE_INSTALL_PREFIX correctness
And other related cmake cleanup, including:

- Use CMAKE_CURRENT_LIST_DIR to find UseEigen3.cmake
- Use INSTALL_DIR term consistently for variable names
- Drop unnecessary extra EIGEN_INCLUDE_INSTALL_DIR
- Fix some paths in generated eigen3.pc and Eigen3Config.cmake files
    missing CMAKE_INSTALL_PREFIX
- Fix pkgconfig directory choice ignored if it doesn't exist at configure
    time (bug #711)
2015-11-07 21:29:24 -05:00
Gael Guennebaud
e73ef4f25e bug #1109: use noexcept instead of throw for C++11 compilers 2015-12-10 14:21:23 +01:00
Gael Guennebaud
145ad5d800 Use more explicit names. 2015-12-10 12:03:38 +01:00
Gael Guennebaud
75f0fe3795 Fix usage of "Index" as a compile time integral. 2015-12-10 12:01:06 +01:00
Gael Guennebaud
f248249c1f bug #1113: fix name conflict with C99's "I". 2015-12-10 11:57:57 +01:00
Gael Guennebaud
21ed29e2c9 Disable complex scalar types because the compiler might aggressively vectorize
the initialization of complex coeffs to 0 before we can check for alignedness
2015-12-09 20:46:09 +01:00
Gael Guennebaud
fbe18d5507 Forbid the creation of SparseCompressedBase object 2015-12-09 15:47:32 +01:00
Gael Guennebaud
dc73430d4b bug #1074: forbid the creation of PlainObjectBase object by making its ctor protected 2015-12-09 15:47:08 +01:00
Gael Guennebaud
1257fbd2f9 Fix sign-unsigned issue in enum 2015-12-09 10:06:42 +01:00
Gael Guennebaud
4549549992 Fix and clarify documentation of Transform wrt operator*(MatrixBase) 2015-12-08 16:21:49 +01:00
Gael Guennebaud
543bd28a24 Fix Alignment in coeff-based product, and enable unaligned vectorization 2015-12-08 11:28:05 +01:00
Gael Guennebaud
03ad4fc504 Extend unit test of coeff-based product to check many more combinations 2015-12-08 11:27:43 +01:00
Benoit Steiner
20e2ab1121 Fixed another compilation warning 2015-12-07 16:17:57 -08:00
Benoit Steiner
d573efe303 Code cleanup 2015-11-06 14:54:28 -08:00
Benoit Steiner
9fa283339f Silenced a compilation warning 2015-11-06 11:44:22 -08:00
Benoit Steiner
53432a17b2 Added static assertions to avoid misuses of padding, broadcasting and concatenation ops. 2015-11-06 10:26:19 -08:00
Benoit Steiner
6857a35a11 Fixed typos 2015-11-06 09:42:05 -08:00
Benoit Steiner
33cbdc2d15 Added more missing EIGEN_DEVICE_FUNC 2015-11-06 09:29:59 -08:00
Benoit Steiner
d27e4f1cba Added missing EIGEN_DEVICE_FUNC statements 2015-11-06 09:23:58 -08:00
Benoit Steiner
ed1962b464 Reimplement the tensor comparison operators by using the scalar_cmp_op functors. This makes them more cuda friendly. 2015-11-06 09:18:43 -08:00
Gael Guennebaud
bfd6ee64f3 bug #1105: fix default preallocation when moving from compressed to uncompressed mode 2015-11-06 15:05:37 +01:00
Benoit Steiner
29038b982d Added support for modulo operation 2015-11-05 19:39:48 -08:00
Benoit Steiner
fbcf8cc8c1 Pulled latest updates from trunk 2015-11-05 14:30:02 -08:00
Benoit Steiner
0d15ad8019 Updated the regressions tests that cover full reductions 2015-11-05 14:22:30 -08:00
Benoit Steiner
c75a19f815 Misc fixes to full reductions 2015-11-05 14:21:20 -08:00
Benoit Steiner
ec5a81b45a Fixed a bug in the extraction of sizes of fixed sized tensors of rank 0 2015-11-05 13:39:48 -08:00
Gael Guennebaud
589b839ad0 Add unit test for Hessian via AutoDiffScalar 2015-11-05 14:54:05 +01:00
Gael Guennebaud
9ceaa8e445 bug #1063: nest AutoDiffScalar by value to avoid dead references 2015-11-05 13:54:26 +01:00
Gael Guennebaud
ae87f094eb Fix "," in non SSE4 mode 2015-11-05 12:08:36 +01:00
Gael Guennebaud
2844e7ae43 SPQR and UmfPack need to link to cholmod.
(grafted from 47592d31ea
)
2015-11-05 12:05:02 +01:00
Gael Guennebaud
780eeb3be7 prevent stack overflow in unit test 2015-11-05 00:32:48 -08:00
Benoit Steiner
beedd9630d Updated the reduction code so that full reductions now return a tensor of rank 0. 2015-11-04 13:57:36 -08:00
Gael Guennebaud
90323f1751 Fix AVX round/ceil/floor, and fix respective unit test 2015-11-04 22:15:57 +01:00
Gael Guennebaud
3dd24bdf99 Merged in aavenel/eigen (pull request PR-142)
Add round, ceil and floor for SSE4.1/AVX (Bug #70)
2015-11-04 18:26:38 +01:00
Gael Guennebaud
902750826b Add support for dense.cwiseProduct(sparse)
This also fixes a regression regarding (dense*sparse).diagonal()
2015-11-04 17:42:07 +01:00
Gael Guennebaud
f6b1deebab Fix compilation of sparse-triangular to dense assignment 2015-11-04 17:02:32 +01:00
Benoit Steiner
36cd6daaae Made the CUDA implementation of ploadt_ro compatible with cuda implementations older than 3.5 2015-11-03 16:36:30 -08:00
Gael Guennebaud
29a94c8055 compilation issue 2015-11-02 16:11:59 +01:00
Alexandre Avenel
38832e0791 Merge 2015-11-01 10:55:42 +01:00
Alexandre Avenel
d46e2c10a6 Add round, ceil and floor for SSE4.1/AVX (Bug #70) 2015-11-01 10:49:27 +01:00
Gael Guennebaud
c0352197a1 bug #1099: add missing incude for CUDA 2015-10-31 18:06:28 +01:00
Gael Guennebaud
b32948c642 bug #1102: fix multiple definition linking issue 2015-10-30 22:25:59 +01:00
Gael Guennebaud
5a2007f7e4 typo 2015-10-30 22:16:23 +01:00
Gael Guennebaud
8a3151de2e Limit matrix size for other eigen and schur decompositions 2015-10-30 18:06:03 +01:00
Gael Guennebaud
fdf3030ff8 Limit matrix sizes for trmm unit test and complexes. 2015-10-30 15:07:50 +01:00
Gael Guennebaud
9285647dfe Limit matrix size when testing for NaN: they can become prohibitively expensive when running on x87 fp unit 2015-10-30 14:44:22 +01:00
Gael Guennebaud
ddaaa2d381 bug #1101: typo 2015-10-30 12:02:52 +01:00
Gael Guennebaud
c8c8821038 Biug 1100: remove explicit CMAKE_INSTALL_PREFIX prefix to please cmake install's DESTINATION argument 2015-10-30 12:00:34 +01:00
Gael Guennebaud
0e6cb08f92 Fix shadow warning 2015-10-30 11:44:22 +01:00
Gael Guennebaud
27c56bf60f Workaround compilation issue with MSVC<=2013 2015-10-30 10:57:11 +01:00
Gael Guennebaud
213bd0253a Fix gcc 4.4 compilation issue 2015-10-30 08:44:37 +01:00
Benoit Steiner
6a02c2a85d Fixed a compilation warning 2015-10-29 20:21:29 -07:00
Benoit Steiner
ca12d4c3b3 Pulled latest updates from trunk 2015-10-29 17:57:48 -07:00
Benoit Steiner
31bdafac67 Added a few tests to cover rank-0 tensors 2015-10-29 17:56:48 -07:00
Benoit Steiner
ce19e38c1f Added support for tensor maps of rank 0. 2015-10-29 17:49:04 -07:00
Benoit Steiner
3785c69294 Added support for fixed sized tensors of rank 0 2015-10-29 17:31:03 -07:00
Benoit Steiner
0d7a23d34e Extended the reduction code so that reducing an empty set returns the neural element for the operation 2015-10-29 17:29:49 -07:00
Benoit Steiner
1b0685d09a Added support for rank-0 tensors 2015-10-29 17:27:38 -07:00
Benoit Steiner
c444a0a8c3 Consistently use the same index type in the fft codebase. 2015-10-29 16:39:47 -07:00
Benoit Steiner
09ea3a7acd Silenced a few more compilation warnings 2015-10-29 16:22:52 -07:00
Benoit Steiner
0974a57910 Silenced compiler warning 2015-10-29 15:00:06 -07:00
Benoit Steiner
ac142773a7 Don't call internal::check_rows_cols_for_overflow twice in PlainObjectBase::resize since this is extremely expensive for small arrays 2015-10-29 13:13:39 -07:00
Gael Guennebaud
05a0ee25df Fix warning. 2015-10-29 21:06:07 +01:00
Gael Guennebaud
7cfbe35e49 Fix duplicated declaration 2015-10-29 21:05:52 +01:00
Gael Guennebaud
568d488a27 Fusion the two similar specialization of Sparse2Dense Assignment.
This change also fixes a compilation issue with MSVC<=2013.
2015-10-29 13:16:15 +01:00
Gael Guennebaud
7a5f83ca60 Add overloads for real times sparse<complex> operations.
This avoids real to complex conversions, and also fixes a compilation issue with MSVC.
2015-10-29 03:55:39 -07:00
Gael Guennebaud
c688cc28d6 fix copy/paste typo 2015-10-28 20:20:05 +01:00
Gael Guennebaud
5b6cff5b0e fix typo 2015-10-28 20:18:00 +01:00
Gael Guennebaud
6759a21e49 CUDA support: define more accurate min/max values for device::numeric_limits of float and double using values from cfloat header 2015-10-28 16:49:15 +01:00
Gael Guennebaud
28ddb5158d Enable std::isfinite/nan/inf on MSVC 2013 and newer and clang. Fix isinf for gcc4.4 and older msvc with fast-math. 2015-10-28 16:27:20 +01:00
Ilya Popov
1a842c0dc4 Fix typo in TutorialSparse: laplace equation contains gradient symbol (\nabla) instead of laplacian (\Delta). 2015-10-28 09:52:55 +00:00
Gael Guennebaud
8531304858 Simplify cost computations based on HugeCost being smaller that unrolling limit 2015-10-28 13:39:02 +01:00
Gael Guennebaud
1f11dd6ced Add a unit test for large chains of products 2015-10-28 12:53:13 +01:00
Gael Guennebaud
902c2db5a5 Extend vectorwiseop unit test with column/row vectors as input. 2015-10-28 11:59:20 +01:00
Gael Guennebaud
77ff3386b7 Refactoring of the cost model:
- Dynamic is now an invalid value
 - introduce a HugeCost constant to be used for runtime-cost values or arbitrarily huge cost
 - add sanity checks for cost values: must be >=0 and not too large
This change provides several benefits:
 - it fixes shortcoming is some cost computation where the Dynamic case was not properly handled.
 - it simplifies cost computation logic, and should avoid future similar shortcomings.
 - it allows to distinguish between different level of dynamic/huge/infinite cost
 - it should enable further simplifications in the computation of costs (save compilation time)
2015-10-28 11:42:14 +01:00
Gael Guennebaud
827d8a9bad Fix false negative in redux test 2015-10-27 21:37:03 +01:00
Gael Guennebaud
d4cf436cb1 Enable mpreal unit test for C++11 compiler only 2015-10-27 17:35:54 +01:00
Gael Guennebaud
946f8850e8 bug #1008: add a unit test for fast-math mode and isinf/isnan/isfinite/etc. functions. 2015-10-27 16:44:45 +01:00
Gael Guennebaud
e3031d7bfa bug #1008: improve handling of fast-math mode for older gcc versions. 2015-10-27 16:43:23 +01:00
Gael Guennebaud
2475a1de48 bug #1008: stabilize isfinite/isinf/isnan/hasNaN/allFinite functions for fast-math mode. 2015-10-27 15:39:50 +01:00
Gael Guennebaud
699c33e76a merge 2015-10-27 11:10:11 +01:00
Gael Guennebaud
8c66b6bc61 Simplify evaluator::Flags for Map<> 2015-10-27 11:06:42 +01:00
Gael Guennebaud
12f50a4697 Fix assign vectorization logic with respect to fixed outer-stride 2015-10-27 11:04:19 +01:00
Gael Guennebaud
c1e0b6dde3 merge 2015-10-27 11:02:03 +01:00
Gael Guennebaud
73f692d16b Fix ambiguous instantiation 2015-10-27 11:01:37 +01:00
Gael Guennebaud
0fc8954282 Improve readibility of EIGEN_DEBUG_ASSIGN mode. 2015-10-27 10:38:49 +01:00
Benoit Steiner
1c8312c811 Started to add support for tensors of rank 0 2015-10-26 14:29:26 -07:00
Benoit Steiner
1f4c98abb1 Fixed compilation warning 2015-10-26 12:42:55 -07:00
Benoit Steiner
9dc236bc83 Fixed compilation warning 2015-10-26 12:41:48 -07:00
Benoit Steiner
9f721384e0 Added support for empty dimensions 2015-10-26 11:21:27 -07:00
Benoit Steiner
ded4336988 Pulled latest updates from trunk 2015-10-26 10:48:29 -07:00
Benoit Steiner
a3e144727c Fixed compilation warning 2015-10-26 10:48:11 -07:00
Benoit Steiner
f8e7b9590d Fixed compilation error triggered by gcc 4.7 2015-10-26 10:47:37 -07:00
Gael Guennebaud
e6f8c5c325 Add support to directly evaluate the product of two sparse matrices within a dense matrix. 2015-10-26 18:20:00 +01:00
Gael Guennebaud
a5324a131f bug #1092: fix iterative solver ctors for expressions as input 2015-10-26 16:16:24 +01:00
Gael Guennebaud
f93654ae16 bug #1098: fix regression introduced when generalizing some compute() methods in changeset 7031a851d4
.
2015-10-26 16:00:25 +01:00
Gael Guennebaud
af2e25d482 Merged in infinitei/eigen (pull request PR-140)
bug #1097 Added ArpackSupport to cmake install target
2015-10-26 15:31:39 +01:00
Gael Guennebaud
4704bdc9c0 Make the IterativeLinearSolvers module compatible with MPL2-only mode
by defaulting to COLAMDOrdering and NaturalOrdering for ILUT and ILLT respectively.
2015-10-26 15:17:52 +01:00
Gael Guennebaud
47d44c2f37 Add missing licence header to some top header files 2015-10-26 11:46:05 +01:00
Gael Guennebaud
8a211bb1a9 bug #1088: fix setIdenity for non-compressed sparse-matrix 2015-10-25 22:01:58 +01:00
Gael Guennebaud
ac6b2266b9 Fix SparseMatrix::insert/coeffRef for non-empty compressed matrix 2015-10-25 22:00:38 +01:00
Abhijit Kundu
0ed41bdefa ArpackSupport was missing here also. 2015-10-16 18:21:02 -07:00
Abhijit Kundu
1127ca8586 Added ArpackSupport to cmake install target 2015-10-16 16:41:33 -07:00
Gael Guennebaud
e99279f444 merge 2015-10-16 22:12:54 +02:00
Benoit Steiner
de1e9f29f4 Updated the custom indexing code: we can now use any container that provides the [] operator to index a tensor. Added unit tests to validate the use of std::map and a few more types as valid custom index containers 2015-10-15 14:58:49 -07:00
Benoit Steiner
6585efc553 Tightened the definition of isOfNormalIndex to take into account integer types in addition to arrays of indices
Only compile the custom index code  when EIGEN_HAS_SFINAE is defined. For the time beeing, EIGEN_HAS_SFINAE is a synonym for EIGEN_HAS_VARIADIC_TEMPLATES, but this might evolve in the future.
Moved some code around.
2015-10-14 09:31:37 -07:00
Gael Guennebaud
c0adf6e38d Fix perm*sparse return type and nesting, and add several sanity checks for perm*sparse 2015-10-14 10:16:48 +02:00
Gael Guennebaud
527fc4bc86 Fix ambiguous instantiation issues of product_evaluator. 2015-10-14 10:14:47 +02:00
Gael Guennebaud
2598f3987e Add a plain_object_eval<> helper returning a plain object type based on evaluator's Flags,
and base nested_eval on it.
2015-10-14 10:12:58 +02:00
Gael Guennebaud
b4c79ee1d3 Update custom setFromTripplets API to allow passing a functor object, and add a collapseDuplicates method to cleanup the API. Also add respective unit test 2015-10-13 11:30:41 +02:00
Gabriel Nützi
fc7478c04d name changes 2
user: Gabriel Nützi <gnuetzi@gmx.ch>
branch 'default'
changed unsupported/Eigen/CXX11/src/Tensor/Tensor.h
changed unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h
2015-10-09 19:10:08 +02:00
Gabriel Nützi
7b34834f64 name changes
user: Gabriel Nützi <gnuetzi@gmx.ch>
branch 'default'
changed unsupported/Eigen/CXX11/src/Tensor/Tensor.h
2015-10-09 19:08:14 +02:00
Gabriel Nützi
6edae2d30d added CustomIndex capability only to Tensor and not yet to TensorBase.
using Sfinae and is_base_of to select correct template which converts to array<Index,NumIndices>


 user: Gabriel Nützi <gnuetzi@gmx.ch>
 branch 'default'
 added unsupported/Eigen/CXX11/src/Tensor/TensorMetaMacros.h
 added unsupported/test/cxx11_tensor_customIndex.cpp
 changed unsupported/Eigen/CXX11/Tensor
 changed unsupported/Eigen/CXX11/src/Tensor/Tensor.h
 changed unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h
 changed unsupported/test/CMakeLists.txt
2015-10-09 18:52:48 +02:00
Calixte Denizet
b9d81c9150 Add a functor to setFromTriplets to handle duplicated entries 2015-10-06 13:29:41 +02:00
Gael Guennebaud
9acfc7c4f3 remove reference to internal method 2015-10-13 10:55:58 +02:00
Gael Guennebaud
a44d91a0b2 extend unit test for SparseMatrix::prune 2015-10-13 10:53:38 +02:00
Gael Guennebaud
ac22b66f1c Fix macro issues 2015-10-13 10:18:09 +02:00
Gael Guennebaud
3e32f6b554 update mpreal.h 2015-10-13 09:58:54 +02:00
Gael Guennebaud
ea9749fd6c Fix packetmath unit test for pdiv not being always defined 2015-10-13 09:53:46 +02:00
Gael Guennebaud
252e89b11b bug #1086: replace deprecated UF_long by SuiteSparse_long 2015-10-12 16:20:12 +02:00
Gael Guennebaud
6407e367ee Add missing epxlicit keyword, and fix regression in DynamicSparseMatrix 2015-10-12 09:49:05 +02:00
Gael Guennebaud
63e29e7765 Workaround ICC issue with first_aligned 2015-10-11 22:47:28 +02:00
Gael Guennebaud
6163db814c bug #1085: workaround gcc default ABI issue 2015-10-10 22:38:55 +02:00
Gael Guennebaud
6536b4bad7 Implement temporary-free path for "D.nolias() ?= C + A*B". (I thought it was already implemented) 2015-10-09 15:28:09 +02:00
Gael Guennebaud
a4cc4c1e5e Clarify note in nested_eval for evaluator creating temporaries. 2015-10-09 14:57:51 +02:00
Gael Guennebaud
ae38910693 The evalautor of Solve was missing the EvalBeforeNestingBit flag. 2015-10-09 14:57:19 +02:00
Gael Guennebaud
515ecddb97 Add unit test for nested_eval 2015-10-09 14:29:46 +02:00
Gael Guennebaud
78b8c344b5 Add unit test for CoeffReadCost 2015-10-09 14:28:48 +02:00
Gael Guennebaud
321cb56bf6 Add unit test to check nesting of complex expressions in redux() 2015-10-09 13:29:39 +02:00
Gael Guennebaud
2632b3446c Improve documentation of TriangularView. 2015-10-09 12:10:58 +02:00
Gael Guennebaud
1429daf850 Add lvalue check for TriangularView::swap, and fix deprecated TriangularView::lazyAssign 2015-10-09 12:10:48 +02:00
Gael Guennebaud
72bd05b6d8 Cleaning in Redux.h 2015-10-09 12:07:42 +02:00
Gael Guennebaud
2c516ba38f Remove auto references and referenced-by relation in doc. 2015-10-09 12:07:06 +02:00
Gael Guennebaud
041e038fef Remove dead code in selfadjoint_matrix_vector_product 2015-10-09 10:42:14 +02:00
Gael Guennebaud
c2d68b984f Optimize a bit complex selfadjoint * vector product. 2015-10-09 10:34:58 +02:00
Gael Guennebaud
1932a24760 Simplify EIGEN_DENSE_PUBLIC_INTERFACE 2015-10-09 10:21:54 +02:00
Gael Guennebaud
186ec1437c Cleanup EIGEN_SPARSE_PUBLIC_INTERFACE, it is now a simple alias to EIGEN_GENERIC_PUBLIC_INTERFACE 2015-10-08 22:06:49 +02:00
Gael Guennebaud
c9718514f5 Fix nesting sub-expression in outer-products 2015-10-08 21:41:53 +02:00
Gael Guennebaud
4140ee039d Fix propagation of AssumeAliasing for expression as: "scalar * (A*B)" 2015-10-08 21:41:27 +02:00
Gael Guennebaud
d866279364 Clean a bit the implementation of inverse permutations 2015-10-08 18:36:39 +02:00
Gael Guennebaud
8d00a953af Fix a nesting issue in some matrix-vector cases. 2015-10-08 17:36:57 +02:00
Gael Guennebaud
dd934ad057 Re-enable vectorization of LinSpaced, plus some cleaning 2015-10-08 17:27:01 +02:00
Gael Guennebaud
f6f6f50272 Clean evaluator<EvalToTemp> 2015-10-08 16:34:33 +02:00
Gael Guennebaud
67bfba07fd Fix some CUDA issues 2015-10-08 16:30:28 +02:00
Gael Guennebaud
412c049ba4 Fix a warning 2015-10-08 16:27:54 +02:00
Gael Guennebaud
aa6b1aebf3 Properly implement PartialReduxExpr on top of evaluators, and fix multiple evaluation of nested expression 2015-10-08 15:57:05 +02:00
Gael Guennebaud
5cc7251188 Some cleaning in evaluators 2015-10-08 15:22:04 +02:00
Gael Guennebaud
e30bc89190 Add missing include of std vector 2015-10-08 15:20:50 +02:00
Gael Guennebaud
5d7ebfb275 Update sparse solver list to make it more complete 2015-10-08 11:33:17 +02:00
Gael Guennebaud
1b148d9e2e Move IncompleteCholesky to official modules 2015-10-08 11:32:46 +02:00
Gael Guennebaud
632e7705b1 Improve doc of IncompleteCholesky 2015-10-08 10:54:36 +02:00
Gael Guennebaud
64242b8bf3 Doc: add link to doc of sparse solver concept 2015-10-08 10:50:39 +02:00
Gael Guennebaud
131db3c552 Fix return by value versus ref typo in IncompleteCholesky 2015-10-07 16:37:46 +02:00
Gael Guennebaud
13294b5152 Unify gemm and lazy_gemm benchmarks 2015-10-07 16:06:48 +02:00
Gael Guennebaud
247259f805 Add a perfromance regression benchmark for lazyProduct 2015-10-07 15:51:06 +02:00
Gael Guennebaud
c6eb17cbe9 Add helper routines to help bypassing some compiler otpimization when benchmarking 2015-10-07 15:50:42 +02:00
Gael Guennebaud
f047ecc36a _mm_hadd_epi32 is for SSSE3 only (and not SSE3) 2015-10-07 15:48:35 +02:00
Gael Guennebaud
aba1eda71e Help clang to inline some functions, thus fixing some regressions 2015-10-07 15:44:12 +02:00
Gael Guennebaud
41cc1f9033 Remove debuging prod() and lazyprod() function, plus some cleaning in noalias assignment 2015-10-07 15:41:22 +02:00
Gael Guennebaud
ca0dd7ae26 Fix implicit cast in unit test 2015-10-07 15:36:12 +02:00
Gael Guennebaud
8bb51a87f7 Re-enable some invalid scalar type conversion checks by disabling explicit vectorization 2015-10-06 17:24:01 +02:00
Gael Guennebaud
27a94299aa Add sparse vector to Ref<SparseMatrix> conversion unit tests, and improve output of sparse_ref unit test in case of failure. 2015-10-06 17:23:11 +02:00
Gael Guennebaud
2e0ece7b66 Fix wrong casting syntax 2015-10-06 17:22:12 +02:00
Gael Guennebaud
69a7897e72 Fix storage index type in empty permutations 2015-10-06 17:21:24 +02:00
Gael Guennebaud
26cde4db3c Define Permutation*<>::Scalar to 'void', re-enable scalar type compatibility check in assignment while relaxing this test for void types. 2015-10-06 17:18:06 +02:00
Gael Guennebaud
fb51bab272 Some cleaning 2015-10-06 17:14:56 +02:00
Gael Guennebaud
2c676ddb40 Handle various TODOs in SSE vectorization (remove splitted storeu, enable SSE3 integer vectorization, plus minor tweaks) 2015-10-06 15:43:27 +02:00
Gael Guennebaud
2d287a4898 Fix Ref<SparseMatrix> for Transpose<SparseVector> 2015-10-06 15:09:04 +02:00
Gael Guennebaud
752a0e5339 bug #1076: fix scaling in IncompleteCholesky, improve doc, add read-only access to the different factors, remove debugging code. 2015-10-06 13:25:45 +02:00
Gael Guennebaud
f25bdc707f Optimise assignment into a Block<SparseMatrix> by using Ref and avoiding useless updates in non-compressed mode. This make row-by-row filling of a row-major sparse matrix very efficient. 2015-10-06 11:59:08 +02:00
Gael Guennebaud
945b80c83e Optimize Ref<SparseMatrix> by removing useless default initialisation of SparseMapBase and SparseMatrix 2015-10-06 11:57:03 +02:00
Gael Guennebaud
9a070638de Enable to view a SparseVector as a Ref<SparseMatrix> 2015-10-06 11:53:19 +02:00
Gael Guennebaud
1b43860bc1 Make SparseVector derive from SparseCompressedBase, thus improving compatibility between sparse vectors and matrices 2015-10-06 11:41:03 +02:00
Gael Guennebaud
6100d1ae64 Improve counting of sparse temporaries 2015-10-06 11:32:02 +02:00
Gael Guennebaud
1879917d35 Propagate cmake generator 2015-10-05 16:18:22 +02:00
Gael Guennebaud
deb261f64b Make abs2 compatible with custom complex types 2015-10-02 10:33:25 +02:00
nnyby
ccc7b0ffea [doc] grammar fix: "linearly space" -> "linearly spaced" 2015-10-01 23:43:06 +00:00
Gael Guennebaud
75a60d3ac0 bug #1075: fix AlignedBox::sample for runtime dimension 2015-09-30 11:44:02 +02:00
Gael Guennebaud
9136b95219 Merged in doug_kwan/eigen (pull request PR-137)
Specified signedness of char type in test
2015-09-30 11:37:04 +02:00
Gael Guennebaud
781e8c38bd merge 2015-09-29 11:12:43 +02:00
Gael Guennebaud
b2b8c1d41e Fix performance regression in sparse * dense product where "sparse" is an expression 2015-09-29 11:11:40 +02:00
Doug Kwan
239c9946cd Specified signedness of char type in test so that test passes
consistently on different targets.
2015-09-28 14:26:10 -07:00
Benoit Steiner
d46bacb6bb Call numext::mini instead of std::min in several places. 2015-09-28 10:40:41 -07:00
Gael Guennebaud
ceafed519f Add support for permutation * homogenous 2015-09-28 16:56:11 +02:00
Gael Guennebaud
ddb5650530 bug #1070: propagate last three Matrix template arguments for NumTraits<AutoDiffScalar<>>::Real 2015-09-28 15:07:03 +02:00
Gael Guennebaud
02e940fc9f bug #1071: improve doc on lpNorm and add example for some operator norms 2015-09-28 11:55:36 +02:00
Gael Guennebaud
8c1ee3629f Add support for row/col-wise lpNorm() 2015-09-28 11:36:00 +02:00
Gael Guennebaud
75861f6650 bug #1069: fix AVX support on MSVC (use of non portable C-style cast) 2015-09-28 10:08:26 +02:00
Tal Hadad
5e0a178df2 Initial fork of unsupported module EulerAngles. 2015-09-27 16:51:24 +03:00
Gael Guennebaud
d16797cfc0 Fix bug #1067: naming conflict 2015-09-19 21:44:14 +02:00
Benoit Steiner
13aee4463e Cleaned up a test 2015-09-18 09:42:08 -07:00
Benoit Steiner
58a6453d48 Fixed compilation warning 2015-09-17 10:18:49 -07:00
Benoit Steiner
31afdcb4c2 Fix return type for TensorEvaluator<TensorSlicingOp>::data 2015-09-17 09:40:21 -07:00
Gael Guennebaud
9d993c709b Fix typo in Vectowise::any() 2015-09-16 22:31:19 +02:00
Christoph Hertzberg
43ba07d4d7 Merged in daalpa/eigen/daalpa/removed-documentation-that-did-not-match-1442148941751 (pull request PR-136)
Removed documentation that did not match the member function DenseBase::outerSize()
2015-09-13 16:35:32 +02:00
daalpa
fab96f2ff3 Removed documentation that did not match the member function DenseBase::outerSize() 2015-09-13 12:55:57 +00:00
Christoph Hertzberg
d6f762d955 Fixed cuda code: EIGEN_DEVICE_FUNC must come after template<...> 2015-09-10 11:46:27 +02:00
Gael Guennebaud
680d318352 Add unit tests for bug #981: valid and invalid usage of ternary operator 2015-09-09 11:38:25 +02:00
Benoit Steiner
84e0c27b61 Fixed a compilation warning 2015-09-08 17:05:35 -07:00
Benoit Steiner
05f2f94f2b Fixed a compilation warning 2015-09-08 17:04:03 -07:00
Benoit Steiner
98f8f0db9a Added support for predux_mul for CUDA devices 2015-09-08 15:37:25 -07:00
Christoph Hertzberg
e3f69eb60d Fixed minor regression caused by 7031a851d4 2015-09-08 10:53:10 +02:00
Gael Guennebaud
5bf971e5b8 MKL is now free of charge for opensource 2015-09-07 11:23:55 +02:00
Gael Guennebaud
73a86cfcd3 Add EIGEN_QUATERNION_PLUGIN 2015-09-07 11:12:30 +02:00
Gael Guennebaud
7fad309631 Fix link and code formating 2015-09-07 11:08:41 +02:00
Gael Guennebaud
7031a851d4 Generalize matrix ctor and compute() method of dense decomposition to 1) limit temporaries, 2) forward expressions to nested decompositions, 3) fix ambiguous ctor instanciation for square decomposition 2015-09-07 10:42:04 +02:00
Gael Guennebaud
1702fcb72e Added tag 3.3-alpha1 for changeset f9303cc7c5 2015-09-04 17:27:20 +02:00
Gael Guennebaud
f9303cc7c5 bump to 3.3-alpha1 2015-09-04 17:26:36 +02:00
Gael Guennebaud
b20a55a608 Workaround wrong instanciation made by VS2010 2015-09-04 15:25:58 +02:00
Gael Guennebaud
ed265258e4 Fix returned index type of inner iterators of sparse blocks. 2015-09-03 15:07:35 +02:00
Gael Guennebaud
a835dfca73 InnerIterator::index() should really return a StorageIndex 2015-09-03 14:53:51 +02:00
Gael Guennebaud
941a99ac1a Add a few missing EIGEN_DEVICE_FUNC declarations 2015-09-03 14:14:54 +02:00
Gael Guennebaud
d91db41a31 Fix documentation example 2015-09-03 14:14:14 +02:00
Gael Guennebaud
3942db9d7c Use inline versus static free functions. 2015-09-03 13:42:54 +02:00
Sergiu Dotenco
85afb61417 use explicit Scalar types for AngleAxis initialization
(grafted from 89a222ce50
)
2015-08-28 22:20:15 +02:00
Benoit Steiner
56983f6d43 Fixed compilation warning 2015-10-23 12:03:42 -07:00
Benoit Steiner
57857775b4 Added support for arrays of size 0 2015-10-23 10:20:51 -07:00
Benoit Steiner
c40c2ceb27 Reordered the code of fft constructor to prevent compilation warnings 2015-10-23 09:38:19 -07:00
Benoit Steiner
a586fdaa91 Reworked the tensor contraction mapper code to make it compile on Android 2015-10-23 09:33:41 -07:00
Benoit Steiner
29c3b7513e Pulled latest updates from trunk 2015-10-23 09:16:14 -07:00
Benoit Steiner
9ea39ce13c Refined the #ifdef __CUDACC__ guard to ensure that when trying to compile gpu code with a non cuda compiler results in a linking error instead of bogus code. 2015-10-23 09:15:34 -07:00
Gael Guennebaud
c244081490 disable usage of INTMAX_T 2015-10-23 14:48:54 +02:00
Gael Guennebaud
0905ed5390 remove useless cstdint header 2015-10-23 14:41:25 +02:00
Gael Guennebaud
54b23cce16 Switch to MPL2 2015-10-23 10:36:33 +02:00
Benoit Steiner
ac99b49249 Added missing glue logic 2015-10-22 16:54:21 -07:00
Benoit Steiner
2dd9446613 Added mapping between a specific device and the corresponding packet type 2015-10-22 16:53:36 -07:00
Benoit Steiner
2495e2479f Added tests for the fft code 2015-10-22 16:52:55 -07:00
Benoit Steiner
a147c62998 Added support for fourier transforms (code courtesy of thucjw@gmail.com) 2015-10-22 16:51:30 -07:00
Gael Guennebaud
71b473aab1 Remove invalid typename keyword 2015-10-22 21:58:18 +02:00
Gael Guennebaud
ebc1af1683 merge 2015-10-22 21:47:47 +02:00
Benoit Steiner
825146c8fd Fixed incorrect expected value 2015-10-22 11:56:00 -07:00
Benoit Steiner
4cf7da63de Added a constructor to simplify the construction of tensormap from tensor 2015-10-22 11:48:02 -07:00
Gael Guennebaud
0eb46508e2 Avoid any openmp calls if multi-threading is explicitely disabled at runtime. 2015-10-22 16:30:28 +02:00
Gael Guennebaud
6df8e99470 bug #1089: add a warning when using a MatrixBase method which is implemented within another module by declaring them inline. 2015-10-22 16:10:28 +02:00
Gael Guennebaud
e78bc111f1 bug #1090: fix a shortcoming in redux logic for which slice-vectorization plus unrolling might happen. 2015-10-21 20:58:33 +02:00
Benoit Steiner
b178cc3479 Added some syntactic sugar to make it simpler to compare a tensor to a scalar. 2015-10-21 11:28:28 -07:00
Gael Guennebaud
5ca2e25967 merge 2015-10-21 13:49:13 +02:00
Gael Guennebaud
8afd0ce955 add FIXME 2015-10-21 13:48:15 +02:00
Gael Guennebaud
8961265889 bug #1064: add support for Ref<SparseVector> 2015-10-21 09:47:43 +02:00
Benoit Steiner
0af63493fd Disable SFINAE for versions of gcc older than 4.8 2015-10-20 11:53:30 -07:00
Benoit Steiner
73b8e719ae Removed bogus assertion 2015-10-20 11:42:34 -07:00
Benoit Steiner
eaf4b98180 Added support for boolean reductions (ie 'and' & 'or' reductions) 2015-10-20 11:41:22 -07:00
Benoit Steiner
f5c1587e4e Fixed a bug in the tensor conversion op 2015-10-20 11:37:44 -07:00
Gael Guennebaud
fe630c9873 Improve numerical accuracy in LLT and triangular solve by using true scalar divisions (instead of x * (1/y)) 2015-10-18 22:15:01 +02:00
Doug Kwan
5c9ee73eb9 Implement plog and pexp for AltiVec. 2015-07-30 11:12:42 -07:00
Gael Guennebaud
5a1cc5d24c bug #1053: fix SuplerLU::solve with EIGEN_DEFAULT_TO_ROW_MAJOR 2015-09-03 11:25:36 +02:00
Gael Guennebaud
2795ffd6a0 Fix Index vs StorageIndex naming convention 2015-09-03 11:18:27 +02:00
Gael Guennebaud
ef2b54f422 Fix AMD ordering when a column has only one off-diagonal non-zero (also fix bug #1045) 2015-09-03 11:04:06 +02:00
Christoph Hertzberg
5ad7981f73 Use full packet size for Dynamic-sized objects (otherwise, the unalignedcount unit test fails with AVX enabled) 2015-09-02 22:51:43 +02:00
Gael Guennebaud
aa768add0b Since there is no reason for evaluators to be nested by reference, let's remove the evaluator<>::nestedType indirection. 2015-09-02 22:10:39 +02:00
Gael Guennebaud
51455824ea Fix AlignedVector3 wrt previous change 2015-09-02 21:51:58 +02:00
Gael Guennebaud
f8976fdbe0 Make evaluators non-copyable. This guarantee that evaluators storing temporaries do not introduce unwanted copy overhead. 2015-09-02 21:39:49 +02:00
Gael Guennebaud
92b9f0e102 Cleaning pass on evaluators: remove the useless and error prone evaluator<>::type indirection. 2015-09-02 21:38:40 +02:00
Gael Guennebaud
cda55ab245 Fix compilation of cuda unit test 2015-09-02 16:59:07 +02:00
Gael Guennebaud
14458ec0a0 Fix packetmath unit test for exp and log 2015-09-02 15:47:58 +02:00
Gael Guennebaud
6b99afa5ae Fix LSCG::solve with a sparse destination. 2015-09-02 15:34:03 +02:00
Gael Guennebaud
b5ad3d2cf7 Remove deprecated Flagged expression. 2015-09-02 14:53:50 +02:00
Gael Guennebaud
6522c3a6f0 Add regression test for bug #817 2015-09-02 13:16:03 +02:00
Gael Guennebaud
be5e2ecc21 bug #505: add more examples of bad and correct usages of auto and eval(). 2015-09-02 13:04:30 +02:00
Gael Guennebaud
aba8c9ee17 Add a documentation page for common pitfalls 2015-09-02 11:23:55 +02:00
Gael Guennebaud
a75616887e bug #1057: fix a declaration missmatch with MSVC 2015-09-02 09:31:32 +02:00
Gael Guennebaud
280f93ff65 Fix FullPivLU::image documentation 2015-09-02 09:19:27 +02:00
Gael Guennebaud
6059188f9d Simplify implementation of the evaluation's iterator of Sparse*Diagonal products to help the compiler to generate better code. 2015-09-01 22:34:30 +02:00
Gael Guennebaud
0b2412df50 Remove duplicated temporary in Sparse to Sparse assignment 2015-09-01 22:31:30 +02:00
Gael Guennebaud
9001f4a46b Add missing specialization of evaluator of sub-sparse-matrices that can be seen as a SparseCompressedBase. This changeset enable faster iterator for such expressions. 2015-09-01 22:29:17 +02:00
Benoit Steiner
f41831e445 Added support for argmax/argmin 2015-08-31 08:18:53 -07:00
Benoit Steiner
2ab603316a Use numext::mini/numext::maxi instead of std::min/std::max in the tensor code 2015-08-28 08:14:15 -07:00
Benoit Steiner
2ed1495eec nvcc doesn't support std::min or std::max on GPU. Use our own custom implementation instead 2015-08-27 16:59:55 -07:00
Sergiu Dotenco
d4c24eb016 fixed Quaternion identity initialization for non-implicitly convertible types 2015-08-20 20:55:37 +02:00
Christoph Hertzberg
78358a7241 Fixed broken commit a09cfe650f
. Missing } and unprotected min/max calls and definitions.
2015-08-22 15:03:16 +02:00
Benoit Steiner
a09cfe650f std::numeric_limits doesn't work reliably on CUDA devices. Use our own definition of numeric_limit<T>::max() and numeric_limit<T>::min() instead of the stl ones. 2015-08-21 16:01:40 -07:00
Christoph Hertzberg
e5c78d85c8 bug #1043: Avoid integer conversion sign warning 2015-08-19 21:50:21 +02:00
Christoph Hertzberg
1bdd06a199 Fix some trivial warnings 2015-08-19 21:38:18 +02:00
Christoph Hertzberg
0721690dbb Use standard include syntax in Tensor module (<> for include-path and "" for relative path) 2015-08-18 14:34:00 +02:00
Christoph Hertzberg
8097d8d028 surpress some warnings 2015-08-17 21:50:52 +02:00
Christoph Hertzberg
d2e0927127 Define EIGEN_MAX_STATIC_ALIGN_BYTES to 0 for architectures that don't require stack alignment 2015-08-17 16:44:52 +02:00
Gael Guennebaud
dc2c103b3b merge 2015-08-16 14:22:02 +02:00
Christoph Hertzberg
d6a4805fdf Protect further isnan/isfinite/isinf calls 2015-08-16 14:00:02 +02:00
Christoph Hertzberg
a40f6ab276 Merged in ITimer/eigen (pull request PR-133)
[Doc] Fix a spelling error in TopicMultithreading.dox
2015-08-14 17:46:57 +02:00
Christoph Hertzberg
61e0977e10 Protect all calls to isnan, isinf and isfinite with parentheses. 2015-08-14 17:32:34 +02:00
Christoph Hertzberg
712e2fed17 bug #829: Introduce macro EIGEN_HAS_CXX11_CONTAINERS and do not specialize std-containers if it is enabled. 2015-08-14 16:09:48 +02:00
Christoph Hertzberg
a5d1bb2be8 bug #1054: Use set(EIGEN_CXX_FLAG_VERSION "/version") only for Intel compilers on Windows.
Also removed code calling `head -n1` and always use integrated REGEX functionality.
2015-08-14 15:30:59 +02:00
ITimer
93635cafee Fixed a spelling error 2015-08-10 15:11:10 +08:00
Gael Guennebaud
23aab82c0c merge 2015-08-09 21:24:20 +02:00
Gael Guennebaud
0d5e673baa Fix Tensor module wrt nullary functor recent change 2015-08-09 21:20:24 +02:00
Christoph Hertzberg
cac6b23033 bug #1053: SparseLU failed with EIGEN_DEFAULT_TO_ROW_MAJOR 2015-08-07 23:10:56 +02:00
Gael Guennebaud
febcce34f1 Enable vectorization with half-packets 2015-08-07 20:05:31 +02:00
Gael Guennebaud
6245591349 Fix prototype of plset and generalize linspace functor. 2015-08-07 19:27:59 +02:00
Gael Guennebaud
60e4260d0d Some functors were not generic wrt packet-type. 2015-08-07 17:41:39 +02:00
Gael Guennebaud
e68c7b8368 Include SSE packetmath when AVX is enabled, and enable AVX's sine function only in fast-math mode (as SSE) 2015-08-07 17:40:39 +02:00
Gael Guennebaud
65bfa5fce7 Allow to use arbitrary packet-types during evaluation.
This is implemented by adding a PacketType template parameter to packet and writePacket members of evaluator<>.
2015-08-07 12:01:39 +02:00
Gael Guennebaud
3602926ed5 Mark ALignedBit as deprecated. 2015-08-07 10:45:02 +02:00
Gael Guennebaud
ce57dbd937 Let unpacket_traits<> exposes the required alignment and make use of it everywhere 2015-08-07 10:44:01 +02:00
Gael Guennebaud
2afdef6a54 Generalize first_aligned to take the requested alignment as a template parameter, and add a first_default_aligned variante calling first_aligned with the requirement of the largest packet for the given scalar type. 2015-08-06 17:52:01 +02:00
Gael Guennebaud
1f5024332e First part of a big refactoring of alignment control to enable the handling of arbitrarily aligned buffers. It includes:
- AlignedBit flag is deprecated. Alignment is now specified by the evaluator through the 'Alignment' enum, e.g., evaluator<Xpr>::Alignment. Its value is in Bytes.
 - Add several enums to specify alignment: Aligned8, Aligned16, Aligned32, Aligned64, Aligned128. AlignedMax corresponds to EIGEN_MAX_ALIGN_BYTES. Such enums are used to define the above Alignment value, and as the 'Options' template parameter of Map<> and Ref<>.
 - The Aligned enum is now deprecated. It is now an alias for Aligned16.
 - Currently, traits<Matrix<>>, traits<Array<>>, traits<Ref<>>, traits<Map<>>, and traits<Block<>> also expose the Alignment enum.
2015-08-06 15:31:07 +02:00
Gael Guennebaud
65186ef18d Fix logic in compute_default_alignment, extend it to Dynamic size, and move it to XprHelper.h file. 2015-08-06 14:07:59 +02:00
Gael Guennebaud
becd89df29 Enable runtime stack alignment in gemm_blocking_space. 2015-08-06 14:00:26 +02:00
Gael Guennebaud
d4f5efc51a Add a EIGEN_DEFAULT_ALIGN_BYTES macro defining default alignment for alloca and aligned_malloc.
It is defined as the max of EIGEN_IDEAL_MAX_ALIGN_BYTES and EIGEN_MAX_ALIGN_BYTES
2015-08-06 13:56:53 +02:00
Gael Guennebaud
7e0d7a76b8 Remove dense nested loops in IncompleteCholesky 2015-08-04 18:01:38 +02:00
Gael Guennebaud
e31fc50280 Numerous fixes for IncompleteCholesky. Still have to make it fully exploit the sparse structure of the L factor, and improve robustness to illconditionned problems. 2015-08-04 16:16:02 +02:00
Gael Guennebaud
9a4713e505 Add a unit test for IncompleteCholesky 2015-08-04 16:14:06 +02:00
Gael Guennebaud
506964fc29 Propagate precondition info to the iterative solver. 2015-08-04 16:13:34 +02:00
Gael Guennebaud
db0f5c9d90 Fix conversion warning 2015-08-04 16:12:44 +02:00
Gael Guennebaud
b986c147cd Fix ForceNonZeroDiag for complexes 2015-08-04 16:12:16 +02:00
Benoit Steiner
cbce0e3b12 Fixed compilation warning 2015-08-03 21:52:29 -07:00
Benoit Steiner
a5dc49e7e8 Fixed 2 compilation warnings generated by llvm 2015-07-29 15:06:08 -07:00
Benoit Steiner
e1d28b7ea7 Added a test for shuffling 2015-07-29 15:01:21 -07:00
Benoit Steiner
0570594f2c Fixed a few compilation warnings triggered by clang 2015-07-29 11:48:38 -07:00
Benoit Steiner
099597406f Simplified and generalized the DividerTraits code 2015-07-29 10:02:42 -07:00
Gael Guennebaud
6db3a557f4 Add missing specialization of struct DividerTraits<long> 2015-07-29 11:38:53 +02:00
Gael Guennebaud
aec4814370 Many files were missing in previous changeset. 2015-07-29 11:11:23 +02:00
Gael Guennebaud
f7d5b9323d typo 2015-07-29 11:08:49 +02:00
Gael Guennebaud
175ed636ea bug #973: update macro-level control of alignement by introducing user-controllable EIGEN_MAX_ALIGN_BYTES and EIGEN_MAX_STATIC_ALIGN_BYTES macros. This changeset also removes EIGEN_ALIGN (replaced by EIGEN_MAX_ALIGN_BYTES>0), EIGEN_ALIGN_STATICALLY (replaced by EIGEN_MAX_STATIC_ALIGN_BYTES>0), EIGEN_USER_ALIGN*, EIGEN_ALIGN_DEFAULT (replaced by EIGEN_ALIGN_MAX). 2015-07-29 10:22:25 +02:00
Gael Guennebaud
76874b128e bug #1047: document the structure layout of class Matrix 2015-07-29 10:21:28 +02:00
Gael Guennebaud
41e1f3498c bug #1048: fix unused variable warning 2015-07-28 22:59:50 +02:00
Benoit Steiner
b9db19aec4 Pulled latest updates from trunk. 2015-07-27 09:39:57 -07:00
Benoit Steiner
f84417d97b Removed an incorrect assertion. 2015-07-27 09:25:22 -07:00
Benoit Steiner
1a30a8e7a2 Merged in godeffroy/eigen_tensor_generalized_contraction (pull request PR-130)
Allowed tensor contraction operation with an empty array of dimension pairs, which performs a tensor product.
2015-07-27 09:19:35 -07:00
Christoph Hertzberg
a44d022caf bug #792: SparseLU::factorize failed for structurally rank deficient matrices 2015-07-26 20:30:30 +02:00
Godeffroy Valet
2195822df6 Allowed tensor contraction operation with an empty array of dimension pairs, which performs a tensor product. 2015-07-25 11:58:36 +02:00
Benoit Steiner
f6282e451a Fixed a typo in an assertion. 2015-07-24 17:35:47 -07:00
Benoit Steiner
4b3052c54d Pulled latest update from trunk 2015-07-23 08:47:33 -07:00
Benoit Steiner
a446020b78 Reenable 2 tests previously disabled by mistake 2015-07-23 08:47:00 -07:00
Christoph Hertzberg
3d951df223 Re-enabled unit tests which were disabled in commit 4200bdec24
.
2015-07-23 10:55:03 +02:00
Benoit Steiner
6d6e6d0b88 Define EIGEN_VECTORIZE_AVX2 and EIGEN_VECTORIZE_FMA when the corresponding instructions can be used by the compiler 2015-07-22 18:22:16 -07:00
Benoit Steiner
ce65c2922a Pulled latest updates from trunk 2015-07-22 18:12:16 -07:00
Benoit Steiner
4200bdec24 Extended the range of value inputs for TensorIntDiv to support tensors with more than 4 billion elements. 2015-07-22 17:02:30 -07:00
Gael Guennebaud
3b0ad02c10 Remove wrongly pushed debugging statements 2015-07-22 14:33:57 +02:00
Jonas Adler
815fa0dbf6 Fixed some compiler bugs in NVCC, now compiles with CUDA.
(chtz: Manually joined sevaral commits to keep the history clean)
2015-07-22 12:29:18 +02:00
Benoit Steiner
d259b719d1 Made sure that the use const expressions are not enabled when compiling with nvcc even when gcc 4.9 is used as the host compiler. 2015-07-21 17:35:58 -07:00
Benoit Steiner
0dda72316f The eigen_check macro doesn't exist anymore: use assert instead 2015-07-21 17:34:15 -07:00
Gael Guennebaud
586d10f7e0 Fix compilation of tri(sparse) * dense with OpenMP 2015-07-21 22:52:21 +02:00
Gael Guennebaud
d3e5db9a80 add regression unit test for previous changeset 2015-07-21 22:23:17 +02:00
Valentin Roussellet
5e635f9ca1 AlignedVector3 accepts implicit conversions from more operators. 2015-07-21 16:42:52 +00:00
Gael Guennebaud
45ee14a13a Fix output of relative error, and add more support for long double 2015-07-21 22:22:12 +02:00
Gael Guennebaud
87f3e533f5 bug #1036: implement verify_is_approx_upto_permutation through a combinatorial search.
The previous implementation was subject to numerical cancellation issues.
2015-07-20 15:34:06 +02:00
Gael Guennebaud
ab8b497a7e Add pow(scalar,array) in quick ref 2015-07-20 13:59:21 +02:00
Gael Guennebaud
6544b49e59 Generalize pow(x,e) such that x and e can be a different expression type or a scalar for either x or e. Add x.pow(e) with e an array expression. 2015-07-20 13:57:55 +02:00
Gael Guennebaud
2d93060291 Fix trivial warnings. 2015-07-20 13:55:48 +02:00
Gael Guennebaud
c11971de37 Fix compilation of isnan(complex) 2015-07-20 12:56:01 +02:00
Gael Guennebaud
88e352adac Add support for replicate in CUDA 2015-07-20 10:53:03 +02:00
Benoit Steiner
6799c26cd6 Fixed a typo in a test and a compilation warning 2015-07-17 16:50:47 -07:00
Benoit Steiner
7a39439904 Rewrote Eigen::dimensions_match to prevent a static assertion when the rank of the tensors is different. 2015-07-17 16:46:30 -07:00
Benoit Steiner
e94f9eb637 Fixed a const correctness issue in TensorLayoutSwap 2015-07-17 15:44:26 -07:00
Benoit Steiner
513e357b48 Added support for prefetching on cuda devices 2015-07-17 15:35:16 -07:00
Benoit Steiner
943035e5bd Pulled latest updates from trunk 2015-07-17 09:42:45 -07:00
Benoit Steiner
06a22ca5bd Added support for sigmoid function to the tensor module 2015-07-17 09:29:00 -07:00
Nicolas Mellado
3275eddc24 Add const getters for LM parameters 2015-07-17 09:11:49 +02:00
Benoit Steiner
979b73cebf Fixed a typo in Macro.h 2015-07-16 14:17:50 -07:00
Benoit Steiner
a5ec25f11c Use the new EIGEN_HAS_INDEX_LIST define to enable the cxx11_tensor_index_list tests 2015-07-16 13:16:08 -07:00
Benoit Steiner
7a243959b4 Define EIGEN_HAS_INDEX_LIST whenever the class is defined. This makes it easier to support compilers that are cxx11 compliant and compilers that aren't. 2015-07-16 13:14:18 -07:00
Benoit Steiner
b756f6af5e Added missing APIs to the Eigen::Sizes class 2015-07-16 12:14:18 -07:00
Benoit Steiner
05787f8367 Added support for tensor inflation. 2015-07-16 09:04:05 -07:00
Benoit Steiner
b900fe47d5 Avoid relying on a default value for the Vectorizable template parameter of the EvalRange functor 2015-07-15 17:17:04 -07:00
Benoit Steiner
4b3d697e12 Fixed compilation error in a cuda test 2015-07-15 17:14:24 -07:00
Benoit Steiner
8315e025e1 Updated the cuda tests to use the new GpuDevice constructor 2015-07-15 12:39:26 -07:00
Benoit Steiner
e892524efe Added support for multi gpu configuration to the GpuDevice class 2015-07-15 12:38:34 -07:00
Gael Guennebaud
f5aa640862 Clean some previous changes and more cuda fixes 2015-07-15 10:57:55 +02:00
Nicolas Mellado
7cecd39a84 Merged eigen/eigen into default 2015-07-15 10:15:54 +02:00
Nicolas Mellado
592ee2a4b4 Add missing EIGEN_DEVICE_FUNC 2015-07-15 10:14:52 +02:00
Gael Guennebaud
6527dbb9f8 Merged in emartin/eigen (pull request PR-123)
Modify GEMM to handle m=0, n=0, and k=0 cases.
2015-07-13 23:58:30 +02:00
Benoit Steiner
b80036abec Enabled the construction of a fixed sized tensor directly from an expression. 2015-07-13 11:16:37 -07:00
Benoit Steiner
3912ca0d53 Fixed a bug in the integer division code that caused some large numerators to be incorrectly handled 2015-07-13 11:14:59 -07:00
Christoph Hertzberg
ea87561564 bug #1039: Redefining EIGEN_DEFAULT_DENSE_INDEX_TYPE may lead to errors 2015-07-13 16:08:25 +02:00
Gael Guennebaud
b8df8815f4 Fix operator<<(ostream,AlignedVector3) 2015-07-13 13:55:59 +02:00
Eric Martin
002c2923c2 Modify GEMM to handle m=0, n=0, and k=0 cases. 2015-07-11 21:46:13 -05:00
Nicolas Mellado
dbb3e2cf8a Cleaning 2015-07-11 18:15:31 +00:00
Nicolas Mellado
0d09845562 Revert files to remove EIGEN_USING_NUMEXT_MATH 2015-07-11 20:11:36 +02:00
Nicolas Mellado
20b96025fd Replace double constants by Scalar constants 2015-07-11 20:02:30 +02:00
Nicolas Mellado
1dd6a329e8 Cuda compatibility: remove explicit call to std math functions 2015-07-11 19:40:15 +02:00
Nicolas Mellado
bc40eb745d Merged eigen/eigen into default 2015-07-11 19:33:43 +02:00
Benoit Steiner
e6297741c9 Added support for generation of random complex numbers on CUDA devices 2015-07-07 17:40:49 -07:00
Benoit Steiner
6de6fa9483 Use NumTraits<T>::RequireInitialization instead of internal::is_arithmetic<T>::value to check whether it's possible to bypass the type constructor in the tensor code. 2015-07-07 15:23:56 -07:00
Benoit Steiner
7b7df7b6b8 Updated internal::is_arithmetic::value to be true for complex numbers 2015-07-07 12:57:35 -07:00
Benoit Steiner
6e55284e51 Pulled latest changes from trunk 2015-07-07 08:54:37 -07:00
Benoit Steiner
a93af65938 Improved and cleaned up the 2d patch extraction code 2015-07-07 08:52:14 -07:00
Gael Guennebaud
7fa6fe8d8c typo 2015-07-07 17:47:24 +02:00
Gael Guennebaud
fa17358c4b Rotation2D: fix slerp to take the shortest path, and add convenient method to get the angle in [-pi,pi] or [0,pi] 2015-07-07 17:27:12 +02:00
Benoit Steiner
3f2101b03b Use numext::swap instead of std::swap 2015-07-06 17:02:29 -07:00
Benoit Steiner
0485a2468d use Eigen smart_copy instead of std::copy 2015-07-06 17:01:51 -07:00
Benoit Steiner
ebdacfc5ea Fixed a compilation warning generated by clang 2015-07-06 15:03:11 -07:00
Benoit Steiner
81f9e968fd Only attempt to use the texture path on GPUs when it's supported by CUDA 2015-07-06 13:32:38 -07:00
Nicolas Mellado
66b30728f8 Merged eigen/eigen into default 2015-07-06 20:58:31 +02:00
Nicolas Mellado
5359e5cdb2 Protect against compilation errors with nvcc and numext/complex.
Disable functions explicitely involving std::complex when compiling with nvcc.
Improve code compatilibity using the new macro EIGEN_USING_NUMEXT_MATH (same spirit than EIGEN_USING_STD_MATH but for numext functions)
2015-07-06 20:55:01 +02:00
Benoit Steiner
864318e508 Misc small fixes to the tensor slicing code. 2015-07-06 11:45:56 -07:00
Gael Guennebaud
c2019dfeb3 Merged in Emie/eigen (pull request PR-121)
typo correction in mathFunction
2015-07-06 16:48:54 +02:00
Emilie Guy
ea7113dd0c typo correction in mathFunction 2015-07-06 14:31:08 +02:00
Nicolas Mellado
9115896590 Merged eigen/eigen into default 2015-07-03 00:41:11 +02:00
Benoit Steiner
95ef94f1ee Fixed a typo in the patch 2015-07-02 07:06:55 +00:00
Benoit Steiner
8f1d547c92 Added a default value for the cuda stream in the GpuDevice constructor 2015-07-01 18:32:18 -07:00
Benoit Steiner
1e911b276c Misc improvements and optimizations 2015-07-01 13:59:11 -07:00
Benoit Steiner
4ed213f97b Improved a previous fix 2015-07-01 13:06:30 -07:00
Benoit Steiner
56e155dd60 Fixed a couple of mistakes in the previous commit. 2015-07-01 12:40:27 -07:00
Benoit Steiner
925d0d375a Enabled the vectorized evaluation of several tensor expressions that was previously disabled by mistake 2015-07-01 11:32:04 -07:00
Benoit Steiner
44eedd8915 Marked the cast functions as EIGEN_DEVICE_FUNC to ensure that we can run casting on GPUs 2015-06-30 15:48:55 -07:00
Benoit Steiner
6021b68d8b Silenced a compilation warning 2015-06-30 15:42:25 -07:00
Benoit Steiner
f1f480b116 Added support for user defined custom tensor op. 2015-06-30 15:36:29 -07:00
Benoit Steiner
dc31fcb9ba Added support for 3D patch extraction 2015-06-30 14:48:26 -07:00
Benoit Steiner
f587075987 Made ThreadPoolDevice inherit from a new pure abstract ThreadPoolInterface class: this enables users to leverage their existing threadpool when using eigen tensors. 2015-06-30 14:21:24 -07:00
Benoit Steiner
28b36632ec Turned Eigen::array::size into a function to make the code compatible with std::array 2015-06-30 13:23:05 -07:00
Benoit Steiner
109005c6c9 Added a test for multithreaded full reductions 2015-06-30 13:08:12 -07:00
Benoit Steiner
a4aa7c6217 Fixed a few compilation warnings 2015-06-30 10:36:17 -07:00
Benoit Steiner
7d41e97fa9 Silenced a number of compilation warnings 2015-06-29 14:47:40 -07:00
Benoit Steiner
fffe63045c Added a test for full reductions on GPU 2015-06-29 14:10:32 -07:00
Benoit Steiner
db9dbbda32 Improved performance of full reduction by 2 order of magnitude on CPU and 3 orders of magnitude on GPU 2015-06-29 14:06:32 -07:00
Benoit Steiner
f0ce85b757 Improved support for fixed size tensors 2015-06-29 14:04:15 -07:00
Benoit Steiner
670c71d906 Express the full reduction operations (such as sum, max, min) using TensorDimensionList 2015-06-29 11:30:36 -07:00
Benoit Steiner
d8098ee7d5 Added support for tanh function to the tensor code 2015-06-29 11:14:42 -07:00
Benoit Steiner
3625734bc8 Moved some utilities to TensorMeta.h to make it easier to reuse them accross several tensor operations.
Created the TensorDimensionList class to encode the list of all the dimensions of a tensor of rank n. This could be done using TensorIndexList, however TensorIndexList require cxx11 which isn't yet supported as widely as we'd like.
2015-06-29 10:49:55 -07:00
Gael Guennebaud
392a30db82 Use VERIFY_IS_EQUAL instead of VERIFY(a==b) to get more feedback in case of failure 2015-06-26 16:22:49 +02:00
Gael Guennebaud
c911fc8dee split compiler intensive bdcsvd_1 unit test 2015-06-26 16:14:23 +02:00
Gael Guennebaud
98ff17eb9e Add special path for matrix<complex>/real.
This also fixes underflow issues when scaling complex matrices through complex/complex operator.
2015-06-26 16:08:15 +02:00
Gael Guennebaud
e102ddbf1f bug #1026: fix infinite loop for an empty input 2015-06-26 14:02:52 +02:00
Gael Guennebaud
555b9c6843 Doc: explain perf and multithreading issues in sparse iterative solvers 2015-06-26 10:49:40 +02:00
Gael Guennebaud
53b930887d Enable OpenMP parallelization of row-major-sparse * dense products.
I observed significant speed-up of the CG solver.
2015-06-26 10:32:34 +02:00
Gael Guennebaud
3f49cf4c90 More msvc 2013/2015 workarounds 2015-06-26 09:07:53 +02:00
Gael Guennebaud
7f824dd613 Optimize CG to enable faster spare row-major * dense vector products when the input matrix is complete (Upper|Lower) but column major. 2015-06-25 17:17:38 +02:00
Gael Guennebaud
c5f9eafcbc Fix assignement to selfadjoint-view when testing real-world problems 2015-06-25 17:08:58 +02:00
Gael Guennebaud
33e699c9fe Remove redundant accessors in Reverse 2015-06-25 14:14:59 +02:00
Gael Guennebaud
6b4d255cab Avoid division by a zero complex 2015-06-25 14:04:05 +02:00
Gael Guennebaud
973b0a90db Clarify documentation of the tolerance and error returned in iterative solvers 2015-06-25 13:51:13 +02:00
Gael Guennebaud
84264ceebc workaround msvc 2013/2015 wrong instanciation of isnan, isfinite, isinf 2015-06-25 10:00:26 +02:00
Gael Guennebaud
b4ab72678c bug #1000: MSVC 2013 does need the operator= workaround 2015-06-25 09:45:22 +02:00
Gael Guennebaud
788941d3b1 Workaround MSVC ambiguous instanciation 2015-06-24 23:35:17 +02:00
Gael Guennebaud
4c8cd13b35 Add explicit ctor for diagonal to sparse conversion 2015-06-24 18:11:06 +02:00
Gael Guennebaud
c38c195321 Document how cross behaves on complex numbers 2015-06-24 18:02:33 +02:00
Gael Guennebaud
23535ed31c Add unit test for dense = SparseQR::matrixQ 2015-06-24 17:55:41 +02:00
Gael Guennebaud
62f21e2d11 Add support for sparse = diagonal 2015-06-24 17:55:00 +02:00
Gael Guennebaud
763c833637 Make SparseSelfAdjointView, twists, and SparseQR more evaluator friendly 2015-06-24 17:54:09 +02:00
Gael Guennebaud
36643eec0c Add a call_assignment_no_alias_no_transpose shortcut 2015-06-24 17:50:43 +02:00
Gael Guennebaud
02db7c9bc6 Inherit operator+= and -= with 'using' kkeyword 2015-06-24 17:49:20 +02:00
Gael Guennebaud
53a61a067b Fallback to CMAKE_CXX_COMPILER_VERSION if VS version unknown 2015-06-24 15:17:37 +02:00
Gael Guennebaud
95e19be381 Fix compilation of MKL Pardiso support 2015-06-24 14:53:43 +02:00
Gael Guennebaud
2a33075aeb std::isnan is c++11 only 2015-06-24 10:29:17 +02:00
Gael Guennebaud
23da99492f Add unit-test for Visual2013 ambiguous call to operator= 2015-06-24 10:27:02 +02:00
Benoit Steiner
6441befbb3 Added more checks to test the correctness of the pexp implementation 2015-06-23 19:12:46 -07:00
Gael Guennebaud
c3e398d138 Fix overflow when checking SVD accuracy 2015-06-23 15:05:20 +02:00
Gael Guennebaud
b0d08869a9 Fix underflow in 3x3 tridiagonalization 2015-06-23 14:54:31 +02:00
Gael Guennebaud
18c9d155f3 Fix the fact that float(int) != float(int(float(int))) 2015-06-23 14:33:00 +02:00
Gael Guennebaud
71523a2e25 Fix a warning with icc 2015-06-23 14:20:20 +02:00
Gael Guennebaud
d9778f3391 Enable VML's pow wrapper on windows (the previous wrapper used the Fortran interface) 2015-06-23 14:04:50 +02:00
Gael Guennebaud
5f9630d7f9 bug #923: update support for Intel's VML wrt new evaluation mechanisms 2015-06-23 14:03:25 +02:00
Gael Guennebaud
793e4c6d77 bug #923: fix EIGEN_USE_BLAS mode 2015-06-23 11:13:24 +02:00
Gael Guennebaud
307c4fc292 Workaround missalignment produced by first_aligned for PacketSize==1 and size==1 2015-06-23 10:10:17 +02:00
Gael Guennebaud
bb3a9b4941 Use Ref<> to bypass forceAlignmentIf 2015-06-22 17:48:28 +02:00
Gael Guennebaud
476beed7f8 bug #1017: apply Christoph's patch preventing underflows in makeHouseholder 2015-06-22 16:51:45 +02:00
Gael Guennebaud
9fc1c92137 Fix isinf unit tests 2015-06-22 16:48:27 +02:00
Gael Guennebaud
9c7cfa7dab Update list of main modules 2015-06-22 14:17:24 +02:00
Gael Guennebaud
3ccd23efc0 Update coeff-wise quick-reference doc. 2015-06-22 14:08:54 +02:00
Gael Guennebaud
0848ba0a6e Fix return nullary return types: it must be based on the PlainObject type instead of the expression type. 2015-06-22 10:52:08 +02:00
Gael Guennebaud
b3b3dcad05 Reduce compiler memory consumption for SVD unit tests 2015-06-22 09:58:06 +02:00
Nicolas Mellado
ad5fdc7ddd Fix double to Scalar unwanted promotions 2015-06-21 20:21:23 +02:00
Gael Guennebaud
40821876ea Fix regression on CompressedStorage::operator= 2015-06-20 13:59:13 +02:00
Michael Abrahams
7043083be4 Use GCC flags in mingw 2015-06-20 18:54:41 +00:00
Gael Guennebaud
84aaef93ba Merged in vanhoucke/eigen_vanhoucke (pull request PR-118)
Fix two small undefined behaviors caught by static analysis.
2015-06-20 13:56:48 +02:00
Gael Guennebaud
6b33b29f00 Get rid of must_nest_by_value 2015-06-19 18:12:40 +02:00
Gael Guennebaud
846b227bb7 Get rid of class internal::nested<> (still have to updated Tensor module) 2015-06-19 17:56:39 +02:00
vanhoucke
368ea23406 Fix undefined behavior. When resizing a default-constructed SparseArray, we end up calling memcpy(ptr, 0, 0), which is technically UB and gets caught by static analysis. 2015-06-19 15:53:30 +00:00
vanhoucke
4cc0c961f3 Fix undefined behavior. 2015-06-19 15:46:46 +00:00
Gael Guennebaud
386d9e5ebd Fix usage of nested versus nested_eval 2015-06-19 17:42:27 +02:00
Gael Guennebaud
a5a7b68b76 Fix ambiguous instanciation using clean class-level SFINAE in product_evaluator 2015-06-19 17:25:13 +02:00
Gael Guennebaud
6fc5438205 Remove a few deprecated internal expressions 2015-06-19 17:06:12 +02:00
Gael Guennebaud
e9edb085c0 Check number of temporaries when applying permutations 2015-06-19 16:39:24 +02:00
Gael Guennebaud
6318d53b41 Factorize VERIFY_EVALUATION_COUNT in unit tests 2015-06-19 16:38:26 +02:00
Gael Guennebaud
5c84dd5665 Fix permutation/transposiitons products wrt nested_eval 2015-06-19 16:37:04 +02:00
Gael Guennebaud
0c8b0e007b Introduce a AliasFreeProduct option for Permutations and Transpositions 2015-06-19 15:38:19 +02:00
Gael Guennebaud
3f6aa4cd5d Remove useless specializations of evaluator_traits 2015-06-19 14:18:29 +02:00
Gael Guennebaud
4a8888dfbc Improbe compatibility of Transpositions and evaluators 2015-06-19 14:10:44 +02:00
Gael Guennebaud
3af4c6c1c9 Make Transpositions use evaluators 2015-06-19 11:50:24 +02:00
Gael Guennebaud
82b6ac0864 Enforce eigenvectors to be column-major (for performance reasons) 2015-06-19 11:25:46 +02:00
Gael Guennebaud
fad36cc814 Clean implementation of permutation * matrix products. 2015-06-19 10:51:57 +02:00
Gael Guennebaud
06036d8bb1 Fix compilation of BDCSVD with DEFAULT_TO_ROWMAJOR 2015-06-19 10:37:25 +02:00
Gael Guennebaud
d2db15016b Fix storage order computation in traits<Product> 2015-06-19 10:36:38 +02:00
Benoit Steiner
6a9a29e96f Fixed a compilation warning 2015-06-17 10:14:13 -07:00
Gael Guennebaud
bb6acc561e Workaround broken complex*real product on old clang versions 2015-06-17 16:11:58 +02:00
Gael Guennebaud
40f326ef2e workaround clang's broken complex division 2015-06-17 15:33:09 +02:00
Benoit Steiner
ab5db86fe9 Fixed merge conflict 2015-06-16 19:52:20 -07:00
Benoit Steiner
ea160a898c Pulled latest updates from trunk 2015-06-16 19:46:23 -07:00
Benoit Steiner
367794e668 Fixed compilation warnings triggered by clang 2015-06-16 19:43:49 -07:00
Gael Guennebaud
736a805883 Add unit test for bug #879 2015-06-16 22:11:41 +02:00
Gael Guennebaud
9ab8ac5c8b Fix compilation in TensorImagePatch 2015-06-16 14:50:08 +02:00
Gael Guennebaud
38874b1651 Fix shadow warnings in Tensor module 2015-06-16 14:43:46 +02:00
Gael Guennebaud
e2e66930c6 Fix compilation of alignedvector3 unit test 2015-06-16 14:40:55 +02:00
Gael Guennebaud
7baa1ba03e Remove the usage of result_of for DenseBase::redux as discussed in bug #1006 2015-06-15 22:40:18 +02:00
Gael Guennebaud
97cbe28829 Remove support of std::binder* in C++11 2015-06-15 15:34:05 +02:00
Gael Guennebaud
972a535288 Remove aligned-on-scalar assert and fallback to non vectorized path at runtime (first_aligned already had this runtime guard) 2015-06-14 15:04:07 +02:00
Gael Guennebaud
e5b490b654 Fix isfinite/isinf/isnan code snippets 2015-06-15 15:09:25 +02:00
Gael Guennebaud
a546be56e0 typo 2015-06-15 15:08:51 +02:00
Gael Guennebaud
3946c981b1 Relax tolerance when testing LDLT on singular problems 2015-06-15 15:08:16 +02:00
Gael Guennebaud
2212e40e95 Extend VERIFY_IS_APPROX to report the magnitude of the relative difference in case of failure. This will ease identifying strongest failing tests 2015-06-15 15:03:19 +02:00
Gael Guennebaud
321a2cbe3d Add missing forward declaration of AlignedBox 2015-06-15 15:01:20 +02:00
Gael Guennebaud
2f2a441a4d Fix use of unitialized buffers. 2015-06-13 22:19:40 +02:00
Gael Guennebaud
91b64a9c65 Relax aligned-on-scalar assert as in the 3.2 branch 2015-06-12 11:25:57 +02:00
Gael Guennebaud
84d103bee8 Enable C++11 math function in a more conservative manner. 2015-06-11 21:45:02 +02:00
Gael Guennebaud
916ef52fff merge 2015-06-11 09:35:49 +02:00
Gael Guennebaud
d93ba137f2 Introduce EIGEN_PI, get rid of M_PI and acos(-1.0) 2015-06-10 17:12:10 +02:00
Gael Guennebaud
9756c7fb4d Make more use of EIGEN_HAS_C99_MATH 2015-06-10 16:26:55 +02:00
Gael Guennebaud
93a62265dc fix isinf(complex(inf,NaN)) to return false. 2015-06-10 16:19:10 +02:00
Gael Guennebaud
b0d5aaafcc Rename free functions isFinite, isInf, isNaN to be compatible with c++11 2015-06-10 16:17:09 +02:00
Gael Guennebaud
25a98be948 bug #80: merge with d_hood branch on adding more coefficient-wise unary array functors 2015-06-10 15:52:05 +02:00
Gael Guennebaud
192bce2795 bug #890, add a more general routine to check that two dense object reference to the same data 2015-06-10 10:09:04 +02:00
Gael Guennebaud
e6832ce93d Add regression test for bug #890 2015-06-10 09:32:10 +02:00
Gael Guennebaud
0b2cbb2bdc bug #897: make umfpack support use Ref<> 2015-06-09 23:30:06 +02:00
Gael Guennebaud
feaf76c001 bug #910: add a StandardCompressedFormat option to Ref<SparseMatrix> to enforce standard compressed storage format.
If the input is not compressed, then this trigger a copy for a const Ref, and a runtime assert for non-const Ref.
2015-06-09 23:11:24 +02:00
Gael Guennebaud
f899aeb301 bug #650: fix sparse * dense wrt noalias and compound assignment 2015-06-09 18:33:24 +02:00
Gael Guennebaud
785b9c0127 bug #1003: assert in MapBase if the provided pointer is not aligned on scalar while it is expected to be. Also add a EIGEN_ALIGN8 macro. 2015-06-09 17:42:09 +02:00
Gael Guennebaud
0eb06f1391 Enable -Wshadow with clang 2015-06-09 17:44:18 +02:00
Gael Guennebaud
64753af3b7 code simplification 2015-06-09 15:35:34 +02:00
Gael Guennebaud
cacbc5679d formatting 2015-06-09 15:23:08 +02:00
Gael Guennebaud
04665ef9e1 remove redundant dynamic allocations in GMRES 2015-06-09 15:18:21 +02:00
Gael Guennebaud
d4c574707e fix some legitimate shadow warnings 2015-06-09 15:17:58 +02:00
Gael Guennebaud
f9350e70eb fix unused variable warning 2015-06-09 15:17:21 +02:00
Gael Guennebaud
4aba24a1b2 Clean argument names of some functions 2015-06-09 13:32:12 +02:00
Gael Guennebaud
302cf8ffe2 Add missing documentation for TriangularViewImpl<MatrixType,Mode,Sparse> 2015-06-09 11:40:07 +02:00
Gael Guennebaud
3a4299b245 bug #872: remove usage of deprecated bind1st. 2015-06-09 10:52:04 +02:00
Gael Guennebaud
9aef0db992 Skip too large real-world problems for solvers that do not scale (e.g., SimplicialLLT without reordering) 2015-06-09 09:29:53 +02:00
Gael Guennebaud
9a2447b0c9 Fix shadow warnings triggered by clang 2015-06-09 09:11:12 +02:00
Gael Guennebaud
cd8b996f99 Extend unit test and documentation of SelfAdjointEigenSolver::computeDirect 2015-06-08 16:16:42 +02:00
Gael Guennebaud
913a61870d Update utility for experimenting with 3x3 eigenvalues 2015-06-08 15:54:53 +02:00
Gael Guennebaud
8f031a3cee bug #997: add missing evaluators for m.lazyProduct(v.homogeneous()) 2015-06-08 15:43:41 +02:00
Gael Guennebaud
e6c5723dcd Add unit test for m.replicate(...)(index). 2015-06-08 15:42:15 +02:00
Gael Guennebaud
274b1f5d7e Fix homogeneous() for 1x1 matrix: in this case, homogeneous follows the storage order guaranteeing that v.transpose().homogeneous() == v.homogeneous().transpose() 2015-06-08 15:40:51 +02:00
Gael Guennebaud
cbe3a1a83e Add missing accessors for 1D index based access to Replicate<> expressions. 2015-06-08 15:39:09 +02:00
Gael Guennebaud
a7ae628c9f bug #1005: fix regression regarding sparse coeff-wise binary operator that did not trigger a static assertion for mismatched storage 2015-06-08 10:14:08 +02:00
Gael Guennebaud
0a9b5d1396 bug #705: fix handling of Lapack potrf return code 2015-06-05 15:59:13 +02:00
Gael Guennebaud
d0b7b5cb55 minor documentation fixes 2015-06-05 14:40:07 +02:00
Gael Guennebaud
56d4ef7ad6 BiCGSTAB: set default guess to 0, and improve restart mechanism by recomputing the accurate residual. 2015-06-05 14:37:57 +02:00
Gael Guennebaud
98a8d43457 Improve unit testing of real-word sparse problem (fix some shortcommings, use VERIFY, etc.) 2015-06-05 14:33:37 +02:00
Gael Guennebaud
b685660b22 Do go to full accuracy when testing BiCGSTAB. 2015-06-05 14:32:26 +02:00
Gael Guennebaud
8bc26562f4 Do not abort if the folder cannot be openned! 2015-06-05 14:31:29 +02:00
Gael Guennebaud
3e7bc8d686 Improve loading of symmetric sparse matrices in MatrixMarketIterator 2015-06-05 10:16:29 +02:00
Gael Guennebaud
acc761cf0c Merged in FlorianGeorge/eigen_blaze_fork_2 (pull request PR-60)
Use trans(X) instead of X.transpose() in Blaze Benchmark
2015-06-04 09:15:22 +02:00
Benoit Steiner
ea1190486f Fixed a compilation error triggered by nvcc 7 2015-05-28 11:57:51 -07:00
Benoit Steiner
0e5fed74e7 Worked around some constexpr related bugs in nvcc 7 2015-05-28 10:14:38 -07:00
Benoit Steiner
f13b3d4433 Added missing include files 2015-05-28 07:57:28 -07:00
Benoit Steiner
abec18bae0 Fixed potential compilation error 2015-05-26 10:11:15 -07:00
Benoit Steiner
9df186c140 Added a few more missing EIGEN_DEVICE_FUNC statements 2015-05-26 09:47:48 -07:00
Benoit Steiner
466bcc589e Added a few missing EIGEN_DEVICE_FUNC statements 2015-05-26 09:37:23 -07:00
Gael Guennebaud
d457734a19 Avoid calling smart_copy with null pointers. 2015-05-25 22:30:56 +02:00
Benoit Steiner
6b800744ce Moved away from std::async and std::future as the underlying mechnism for the thread pool device. On several platforms, the functions passed to std::async are not scheduled in the order in which they are given to std::async, which leads to massive performance issues in the contraction code.
Instead we now have a custom thread pool that ensures that the functions are picked up by the threads in the pool in the order in which they are enqueued in the pool.
2015-05-20 13:52:07 -07:00
Benoit Steiner
48f6b274e2 Fixed compilation error triggered by gcc 4.7 2015-05-20 08:54:44 -07:00
Benoit Steiner
2451679951 Avoid using the cuda memcpy for small tensor slices since the memcpy kernel is very expensive to launch 2015-05-19 15:19:01 -07:00
Benoit Steiner
a81d17b73a Added new version of the TensorIntDiv class optimized for 32 bit signed integers. It saves 1 register on CPU and 2 on GPU. 2015-05-19 13:59:52 -07:00
Benoit Jacob
051d5325cc Abandon blocking size lookup table approach. Not performing as well in real world as in microbenchmark. 2015-05-19 11:03:59 -04:00
Christoph Hertzberg
ebea530782 bug #1014: More stable direct computation of eigenvalues and -vectors for 3x3 matrices 2015-05-17 21:54:32 +02:00
Benoit Jacob
c88e1abaf3 also uninitialized here, see previous cset 2015-05-15 11:34:57 -04:00
Benoit Jacob
807793ec3b Fix uninitialized var warning. The compiler was clearing the register anyway, so this does not change resulting code 2015-05-15 11:15:53 -04:00
Pete Warden
140f85bb99 Check for the macro __ARM_NEON__ (with two underscores at the end) as well as __ARM_NEON. The second macro is correct according to the ARM language extensions specification, but historically the first one has been more common. Some older compilers (e.g. gcc v4.6 on a Beaglebone Black) only define the first, so without this patch NEON isn't enabled. 2015-05-12 16:03:43 -07:00
Gael Guennebaud
a852001196 Add regression test for bugs #854 and #1014, and check that the eigenvector matrix is unitary. 2015-05-12 18:45:39 +02:00
Gael Guennebaud
e66caf48e8 Make test matrices for eigensolver/selfadjoint even more tricky 2015-05-12 18:44:46 +02:00
Gael Guennebaud
ef81730625 Ignore denormal numbers in selfadjoint eigensolver. 2015-05-12 18:38:43 +02:00
Christoph Hertzberg
a605a1d7df Merged in MattPD/eigen/MattPD/doc-fix-wording-typos-in-templatekeywor-1431363009359 (pull request PR-116)
[Doc] Fix wording / typos in TemplateKeyword.dox
2015-05-11 23:37:52 +02:00
MattPD
447e060b81 [Doc] Fix wording / typos in TemplateKeyword.dox 2015-05-11 16:50:18 +00:00
Christoph Hertzberg
494fa991c3 bug #872: Avoid deprecated binder1st/binder2nd usage by providing custom functors for comparison operators 2015-05-07 17:28:40 +02:00
Gael Guennebaud
4a936974a5 bug #1013: fix 2x2 direct eigensolver for identical eiegnvalues 2015-05-07 15:55:12 +02:00
Gael Guennebaud
c2107d30ce Extend unit tests of sefladjoint-eigensolver 2015-05-07 15:54:07 +02:00
Gael Guennebaud
ebf8ca4fa8 Fix bug #1010: m_isInitialized was improperly updated 2015-05-07 14:20:42 +02:00
Konstantinos Margaritis
dd698e6680 Merged in doug_kwan/eigen (pull request PR-103)
Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of
2015-05-05 20:50:14 +03:00
Benoit Steiner
1dded10cb7 Added a double-precision implementation of the exp() function for AVX. 2015-05-04 10:42:51 -07:00
Christoph Hertzberg
6273aca9b1 small typo 2015-05-04 15:26:31 +00:00
Christoph Hertzberg
4dd7d0b5dc Merged in mvdyck/eigen-3/mvdyck/doc-multithreading-fix-old-n-eigennbthr-1430750928880 (pull request PR-114)
[Doc] Multi-threading fix
2015-05-04 17:23:21 +02:00
michiel van dyck
4b9eddaef8 [Doc] Multi-threading fix
OLD: n = Eigen::nbThreads( n );
NEW: n = Eigen::nbThreads( );

from:
You can query the number of threads that will be used with:
\code
n = Eigen::nbThreads( );
\endcode

Kr Michiel
2015-05-04 14:48:52 +00:00
Christoph Hertzberg
28a4c92cbf bug #998: Started fixing doxygen warnings 2015-05-01 22:10:41 +02:00
Christoph Hertzberg
173b34e9ab bug #999: clarify that behavior of empty AlignedBoxes is undefined, and further improvements in documentation 2015-04-30 19:30:36 +02:00
Christoph Hertzberg
da2baf685d Regression test for bug #302
(transplanted from 80fd8fab87
)
Changed DenseIndex to Index
2015-04-26 21:05:33 +02:00
Christoph Hertzberg
8c6a3b5ace Fix trivial warnings in LevenbergMarquardt module and test 2015-04-24 21:35:30 +02:00
Gael Guennebaud
de18cd413d Disable posix_memalign on Solaris and SunOS, and allows to by-pass built-in posix_memalign detection rules. 2015-04-24 11:26:51 +02:00
Gael Guennebaud
1681a665d9 Extend unit test of Map<,,Stride<>> with stack allocated buffers and less trivial operations. 2015-04-24 10:38:28 +02:00
Gael Guennebaud
834f66e9fc Extend unit test of Map<> with stack allocated buffers and less trivial operations. 2015-04-24 10:10:19 +02:00
Gael Guennebaud
40258078c6 bug #360: add value_type typedef to DenseBase/SparseMatrixBase 2015-04-24 09:44:24 +02:00
Christoph Hertzberg
c460af414e Fix bug #1000: Manually inherit assignment operators for MSVC 2013 and later (as required by the standard). 2015-04-23 13:39:03 +02:00
Benoit Steiner
fd1d4bd86c Silenced a few compilation warnings 2015-04-22 16:16:15 -07:00
Benoit Steiner
91359e1d0a Added the ability to generate a tensor from a custom user defined 'generator'. This simplifies the creation of constant tensors initialized using specific regular patterns.
Created a gaussian window generator as a first use case.
2015-04-22 11:14:58 -07:00
Benoit Steiner
8838ed39f4 Added support for non-deterministic random number generation on GPU 2015-04-22 09:14:38 -07:00
Christoph Hertzberg
e7457e419d Merge with dfa991cbae 2015-04-22 03:39:32 +02:00
Benoit Steiner
dfa991cbae Make sure that the copy constructor of the evaluator is always called before launching the evaluation of a tensor expression on a cuda device. 2015-04-21 16:15:45 -07:00
Gael Guennebaud
dbd12b4cda Make sure that BlockImpl<const SparseMatrix> ctor is called with the right type 2015-04-21 10:15:36 +02:00
Gael Guennebaud
d6a8b43b39 Fix typo in the definition of EIGEN_COMP_GNUC_STRICT 2015-04-21 10:12:38 +02:00
Benoit Steiner
e709488361 Silenced a few compilation warnings 2015-04-20 17:39:45 -07:00
Benoit Steiner
10a1f81822 Sped up the assignment of a tensor to a tensor slice, as well as the assigment of a constant slice to a tensor 2015-04-20 17:34:11 -07:00
Deanna Hood
e5048b5501 Use std::isfinite when available 2015-04-20 14:59:57 -04:00
Deanna Hood
249c48ba00 Incorporate C++11 check into EIGEN_HAS_C99_MATH macro 2015-04-20 14:57:04 -04:00
Deanna Hood
0250f4a9f2 Merged default into unary-array-cwise-functors 2015-04-20 14:01:35 -04:00
Deanna Hood
0339502a4f Only use std::isnan and std::isinf if they are available 2015-04-20 13:14:06 -04:00
Benoit Steiner
43eb2ca6e1 Improved the tensor random number generators:
* Use a mersenne twister whenebver possible instead of the default entropy source since the default one isn't very good at all.
 * Added the ability to seed the generators with a time based seed to make them non-deterministic.
2015-04-20 09:24:09 -07:00
Christoph Hertzberg
016c29f207 Merge with 70bc3b0668 2015-04-20 08:33:39 +02:00
Benoit Steiner
70bc3b0668 Silenced a warning in the tensor code 2015-04-19 12:38:00 -07:00
Benoit Steiner
3220eb2b93 Fixed some compilation warnings 2015-04-19 12:36:35 -07:00
Gael Guennebaud
fc2d5b86ce simplify previous changeset: sub-expressions are nested by value 2015-04-18 22:50:16 +02:00
Gael Guennebaud
5a3c48e3c6 bug #942: fix dangling references in evaluator of diagonal * sparse products. 2015-04-18 22:43:27 +02:00
Benoit Steiner
3b429b71e6 Fixed compilation warning triggered by gcc 4.7 2015-04-18 13:41:06 -07:00
Benoit Steiner
9c6b82bcd5 Use ptrdiff_t instead of size_t to encode fixed sizes. This silences several clang compilation warnings
(transplanted from 4400e4436ac7c5bbd305a03c21aa4bce24ae199b)
2015-04-17 09:12:18 -07:00
Christoph Hertzberg
4f126b862d Add internal assertions to purely fixed-size DenseStorage, mark optional variables always as unused 2015-04-17 11:36:21 +02:00
Benoit Steiner
da5b98a94d Updated the cxx11_tensor_convolution test to avoid using cxx11 features. This should enable the test to compile with gcc 4.7 and older 2015-04-16 12:29:16 -07:00
Benoit Steiner
d19d09ae6a Updated a regression test to avoid compilation errors when compiling with gcc 4.7 2015-04-16 12:15:27 -07:00
Christoph Hertzberg
9d7843d0d0 Add internal assertions to DenseStorage constructor 2015-04-16 15:47:06 +02:00
Christoph Hertzberg
3be9f5c4d7 Constructing a Matrix/Array with implicit transpose could lead to memory leaks.
Also reduced code duplication for Matrix/Array constructors
2015-04-16 13:25:20 +02:00
Gael Guennebaud
e0cff9ae0d Fix bug #996: fix comparisons to 0 instead of Scalar(0) 2015-04-15 14:48:53 +02:00
Gael Guennebaud
5dbe758dc3 Backed out changeset 04c8c5d9ef 2015-04-15 14:47:08 +02:00
Gael Guennebaud
04c8c5d9ef Fix bug #996: fix comparisons to 0 instead of Scalar(0) 2015-04-15 14:44:57 +02:00
Benoit Steiner
0f82399fe9 Pulled latest changes from trunk 2015-04-14 19:13:34 -07:00
Christoph Hertzberg
761691f18d Make conversion from 0 to Scalar explicit (issue reported by Brad Bell) 2015-04-13 17:15:00 +02:00
Benoit Steiner
5401fbcc50 Improved the blocking strategy to speedup multithreaded tensor contractions. 2015-04-09 16:44:10 -07:00
Deanna Hood
085aa8e601 Don't use M_PI since it's only guaranteed to be defined in Eigen/Geometry 2015-04-08 13:59:18 -05:00
Gael Guennebaud
0eb220c00d add a note on bug #992 2015-04-08 09:25:34 +02:00
Benoit Jacob
d7f51feb07 bug #992: don't select a 3p GEMM path with non-vectorizable scalar types, this hits unsupported paths in symm/triangular products code 2015-04-07 15:13:55 -04:00
Benoit Jacob
0e9753c8df Fix compiler flags on Android/ARM:
- generate position-independent code (PIE), a requirement to run binaries on Android 5.0+ devices;
 - correctly handle EIGEN_TEST_FMA + EIGEN_TEST_NEON to pass -mfpu=neon-vfpv4.
2015-04-07 14:03:21 -04:00
Benoit Steiner
1de49ef4c2 Fixed a bug when chipping tensors laid out in row major order. 2015-04-07 10:44:13 -07:00
Benoit Steiner
a1f1e1e51d Fixed the order of 2 #includes 2015-04-06 10:41:39 -07:00
Benoit Steiner
7c18ab921c Pulled latest updates from trunk 2015-04-04 20:07:04 -07:00
Gael Guennebaud
15b5adb327 Fix regression in DynamicSparseMatrix and SuperLUSupport wrt recent change on nonZeros/nonZerosEstimate 2015-04-02 22:21:41 +02:00
Benoit Steiner
74e558cfa8 Pulled latest updates from trunk 2015-04-01 23:24:11 -07:00
Benoit Steiner
03a0df2010 Fixed some compilation warnings triggered by pre-cxx11 comoilers 2015-04-01 22:51:33 -07:00
Benoit Steiner
b8b7807269 Fixed some compilation warning triggered by the cxx11 emulation code 2015-04-01 21:48:18 -07:00
Benoit Steiner
383b6dfafe Fixed 2 typos 2015-04-01 16:44:36 -07:00
Gael Guennebaud
5861cfb55e Remove unused GenericSparseBlockInnerIteratorImpl code. 2015-04-01 22:29:29 +02:00
Gael Guennebaud
3105986e71 bug #875: remove broken SparseMatrixBase::nonZeros and introduce a nonZerosEstimate() method to sparse evaluators for internal uses.
Factorize some code in SparseCompressedBase.
2015-04-01 22:27:34 +02:00
Gael Guennebaud
39dcd01b0a bug #973: enable alignment of multiples of half-packet size (e.g., Vector6d with AVX) 2015-04-01 13:55:09 +02:00
Gael Guennebaud
8481dc21ea bug #986: add support for coefficient-based product with 0 depth. 2015-04-01 13:15:23 +02:00
Gael Guennebaud
79b4e6acaf Fix bug #987: wrong alignement guess in diagonal product. 2015-03-31 23:35:12 +02:00
Gael Guennebaud
3c38589984 Remove most of the dynamic memory allocations that occured in D&C SVD. Still remains the calls to JacobiSVD and UpperBidiagonalization. 2015-03-31 22:54:47 +02:00
Gael Guennebaud
8313fb7df7 Add row/column-wise reverseInPlace feature. 2015-03-31 21:35:53 +02:00
Gael Guennebaud
dfb674a25e Make reverseInPlace really work in-place. 2015-03-31 20:17:10 +02:00
Gael Guennebaud
20d030f207 Fix vectorization of swap for non trivial expressions 2015-03-31 20:16:02 +02:00
Benoit Steiner
678207e02a Added regression tests for tensor convolutions 2015-03-31 09:08:08 -07:00
Benoit Steiner
68d4afe985 Added support for convolution of tensors laid out in RowMajor mode 2015-03-31 09:07:09 -07:00
Benoit Steiner
f873686602 Added documentation for the convolution operation 2015-03-31 08:27:23 -07:00
Benoit Jacob
73cdeae1d3 Only use blocking sizes LUTs for single-thread products for now 2015-03-31 11:17:23 -04:00
Benoit Jacob
0cbd5ae3cb Correctly detect Android with ndk_build 2015-03-31 11:17:21 -04:00
Gael Guennebaud
ae01c05e18 Fix computeProductBlockingSizes with m==0, and add respective unit test. 2015-03-31 15:19:57 +02:00
Gael Guennebaud
bd76d837e6 Fix sign of SuperLU::determinant 2015-03-31 14:57:32 +02:00
Gael Guennebaud
35d3053d55 Fix regression introduced in 3b169d792d 2015-03-31 09:23:53 +02:00
Benoit Steiner
731d7b84b4 Sharded a large test 2015-03-30 23:26:45 -07:00
Christoph Hertzberg
7bd578d11d Change CMake warning to simple message for old Metis versions 2015-03-31 00:50:04 +02:00
Christoph Hertzberg
3b169d792d Suppress unused variable warning 2015-03-31 00:49:08 +02:00
Christoph Hertzberg
3238ca6abc Addendum to last patch: k is Index and not int 2015-03-31 00:42:14 +02:00
Christoph Hertzberg
1efae98fee bug #985: RealQZ failed when either matrix had zero rows or columns (report and patch by Ben Goodrich)
Also added a regression test
2015-03-30 23:56:20 +02:00
Benoit Steiner
35722fa022 Made the index type a template parameter of the tensor class instead of encoding it in the options. 2015-03-30 14:55:54 -07:00
Benoit Steiner
71950f02e5 Deleted unnecessary semicolons 2015-03-30 14:49:10 -07:00
Christoph Hertzberg
58af8bf90c bug #982: Make sure numext::maxi and numext::mini are called correctly, in case Scalar expressions return expression templates. 2015-03-30 16:47:22 +02:00
Gael Guennebaud
2adbf6b8ca fix stupid warning with old GCC 2015-03-28 22:34:54 +01:00
Gael Guennebaud
41e20248f8 merge 2015-03-28 14:43:35 +01:00
Christoph Hertzberg
09a5361d1b bug #983: Pass Vector3 by const reference and not by value 2015-03-28 12:36:24 +01:00
Christoph Hertzberg
266a84558f Optionally build the documentation when building unit tests. 2015-03-27 16:36:59 +01:00
Christoph Hertzberg
1b4bb20cf1 Merged in d_hood/eigen/sparse-tutorial-doc-fix (pull request PR-107)
[Doc] Fix missing image in sparse tutorial
2015-03-27 16:22:16 +01:00
Gael Guennebaud
eb7e4c2b9c Pass Vector3 type by reference 2015-03-27 12:11:24 +01:00
Gael Guennebaud
ad044008da Fix transpose versus adjoint. 2015-03-27 12:07:14 +01:00
Gael Guennebaud
79cb875249 merge 2015-03-27 10:56:04 +01:00
Gael Guennebaud
7e225b6fa4 Suppress some false negatives in SVD unit test 2015-03-27 10:55:53 +01:00
Gael Guennebaud
1b8cc9af43 Slight numerical stability improvement in 2x2 svd 2015-03-27 10:55:00 +01:00
Gael Guennebaud
3d59ae0203 Fix hypot(0,0). 2015-03-27 09:59:24 +01:00
Benoit Steiner
4df8b5a75e Avoid making an unecessary copy of the tensor expression when evaluating it on a GPU device 2015-03-25 14:36:07 -07:00
Benoit Steiner
b3343bfdae Fixed the vectorized implementation of the Tensor select() method 2015-03-25 13:25:53 -07:00
Benoit Steiner
ccf290a65c Cleaned up the TensorDevice code a little bit. 2015-03-25 12:37:38 -07:00
Benoit Steiner
d3f7915aeb Pulled latest update from the eigen main codebase 2015-03-24 13:12:14 -07:00
Benoit Steiner
abdbe8562e Fixed the CUDA packet primitives 2015-03-24 10:45:46 -07:00
Gael Guennebaud
29eaa2b0f1 Make MatrixBase::is* methods aware of nested_eval. 2015-03-24 13:42:42 +01:00
Gael Guennebaud
f42b105f73 Add the possibility to make VERIFY* checks to output a warning instead of abording. 2015-03-24 13:39:14 +01:00
Gael Guennebaud
d27968eb7e D&C SVD: directly falls back to JacobiSVD for very small problems (by-pass upper-bidiagonalization) 2015-03-24 13:38:07 +01:00
Gael Guennebaud
4472f3e578 Avoid SVD: consider denormalized small numbers as zero when computing the rank of the matrix 2015-03-23 09:40:21 +01:00
Deanna Hood
83e5b7656b Use M_PI instead of acos(-1) for pi 2015-03-22 06:04:31 +10:00
Deanna Hood
4bab4790c0 Add \sa tags of isFinite/isInf for each other 2015-03-22 05:39:08 +10:00
Gael Guennebaud
4e2b18d909 Update approx. minimum ordering method to push and keep structural empty diagonal elements to the bottom-right part of the matrix 2015-03-20 16:33:48 +01:00
Gael Guennebaud
8d9bfb3a7b fix loadMarket wrt Index versus int 2015-03-20 16:00:10 +01:00
Benoit Steiner
a6a628ca6b Added the -= operator to the device classes 2015-03-19 23:22:19 -07:00
Benoit Steiner
e134226a03 Fixed a bug in the handling of packets by the MeanReducer 2015-03-19 23:11:42 -07:00
Gael Guennebaud
9ee62fdcd5 Fix random unit test for 32bits systems. 2015-03-19 21:39:37 +01:00
Gael Guennebaud
d6b2f300db Fix MSVC compilation: aligned type must be passed by reference 2015-03-19 17:28:32 +01:00
Gael Guennebaud
61c45d7cfd Fix comparison warning 2015-03-19 17:13:22 +01:00
Gael Guennebaud
d7698c18b7 Split sparse_basic unit test 2015-03-19 15:11:05 +01:00
Gael Guennebaud
f329d0908a Improve random number generation for integer and add unit test 2015-03-19 15:10:36 +01:00
Deanna Hood
2ab4922431 Make html directory before generating output image there 2015-03-18 07:24:13 +10:00
Deanna Hood
41b717de25 More extensive unit tests for recent array-wise functors 2015-03-18 03:11:03 +10:00
Benoit Steiner
cc0f89eb3b Changed the way lvalue operations are declared in TensorBase: this fixes constness isses that prevented some expressions mixing lvalues and rvalues from compiling. 2015-03-17 09:57:20 -07:00
Benoit Jacob
dc04f12967 use unsigned short instead of uint16_t which doesn't exist in c++98 2015-03-17 10:31:45 -04:00
Deanna Hood
8878e1c1de Remove ambiguity with recent numext methods isNaN and isInf 2015-03-17 22:39:51 +10:00
Deanna Hood
596be3cd86 Use std::arg for real numbers when c++11 is used, custom implementation otherwise 2015-03-17 15:28:12 +10:00
Deanna Hood
e26134ed62 Use std::round when c++11 is used, custom implementation otherwise 2015-03-17 14:55:14 +10:00
Deanna Hood
e21e29a088 Update cost of arg call to depend on if the scalar is complex or not 2015-03-17 14:04:33 +10:00
Deanna Hood
447a5a6b01 Fix VML declarations to only be for real/complex as appropriate 2015-03-17 13:33:31 +10:00
Deanna Hood
f52b78491c Remove packet isNaN, isInf, isFinite 2015-03-17 09:26:24 +10:00
Deanna Hood
1c78d6f2a6 Add boolean not operator (!) array support 2015-03-17 08:29:57 +10:00
Deanna Hood
85da0c2281 Remove test of now-missing floor, ceil, round complex implementations 2015-03-17 06:56:47 +10:00
Benoit Jacob
364cfd529d Similar to cset 3589a9c115
, also in 2px4 kernel: actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment
2015-03-16 16:28:44 -04:00
Benoit Steiner
25664afacd Pulled latest updates from trunk 2015-03-16 13:25:45 -07:00
Deanna Hood
e1d6e6c972 Make cube, inverse and abs2 free-functions 2015-03-17 06:25:24 +10:00
Benoit Jacob
577056aa94 Include stdint.h. Not going for cstdint because it is a C++11 addition. Needed for uint16_t at least, in lookup-table code. 2015-03-16 16:21:50 -04:00
Benoit Steiner
5144f66728 Fixed compilation warning 2015-03-16 13:17:52 -07:00
Benoit Steiner
0fd6d52724 Fixed compilation error with clang 2015-03-16 13:16:12 -07:00
Benoit Jacob
eb6929cb19 fix bug in maxsize calculation, which would cause products of size > 2048 to address the lookup table out of bounds 2015-03-16 16:15:47 -04:00
Benoit Steiner
f218c0181d Fixes the Lvalue computation by actually setting the LvalueBit properly when instantiating tensors of const T. Added a test to check the fix. 2015-03-16 13:05:00 -07:00
Deanna Hood
fef4e071d7 Rename isinf to isInf 2015-03-17 05:58:47 +10:00
Deanna Hood
46cf9cda32 Add isfinite array support as isFinite 2015-03-17 04:33:12 +10:00
Deanna Hood
7b829940d1 Add code snippets for new methods 2015-03-17 03:40:28 +10:00
Deanna Hood
1d76ceab55 Remove floor, ceil, round for complex numbers 2015-03-17 02:36:07 +10:00
Deanna Hood
717b7954ce Update cost of coeff-wise arg call 2015-03-17 02:11:57 +10:00
Deanna Hood
fb68b149cb Rename isnan to isNaN 2015-03-17 02:04:42 +10:00
Benoit Jacob
35c3a8bb84 Update Nexus 5 lookup table from combining now 2 runs of the benchmark, using the analyze-blocking-sizes partition tool. Gives better worst-case performance. 2015-03-16 11:05:51 -04:00
Benoit Jacob
e274607d7f fix compilation with GCC 4.8 2015-03-16 10:48:27 -04:00
Benoit Jacob
151b8b95c6 Fix bug in case where EIGEN_TEST_SPECIFIC_BLOCKING_SIZE is defined but false 2015-03-15 19:10:51 -04:00
Benoit Jacob
02babb9c0f Provide a empirical lookup table for blocking sizes measured on a Nexus 5. Only for float, only for Android on ARM 32bit for now. 2015-03-15 18:13:12 -04:00
Benoit Jacob
3589a9c115 actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment 2015-03-15 18:12:18 -04:00
Benoit Jacob
1dd3d89818 Fix a unused-var warning 2015-03-15 18:07:19 -04:00
Benoit Jacob
ca5c12587b Polish lookup tables generation 2015-03-15 18:05:53 -04:00
Benoit Jacob
e56aabf205 Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables 2015-03-15 18:05:12 -04:00
Benoit Jacob
b6b88c0808 update perf_monitoring/gemm/changesets.txt 2015-03-13 14:57:05 -07:00
Benoit Jacob
488c15615a organize a little our default cache sizes, and use a saner default L1 outside of x86 (10% faster on Nexus 5) 2015-03-13 14:51:26 -07:00
Gael Guennebaud
9f58524cbd merge 2015-03-13 21:16:39 +01:00
Gael Guennebaud
1330f8bbd1 bug #973, improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignement of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled. 2015-03-13 21:15:50 +01:00
Gael Guennebaud
d99ab35f9e Fix internal::random(x,y) for integer types. The previous implementation could return y+1. The new implementation uses rejection sampling to get an unbiased behabior. 2015-03-13 21:12:46 +01:00
Gael Guennebaud
8580eb6808 bug #949: add static assertion for incompatible scalar types in dense end-user decompositions. 2015-03-13 21:06:20 +01:00
Gael Guennebaud
a9df28c95b SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrary large amount of memory was preallocated in some cases) 2015-03-13 21:00:21 +01:00
Gael Guennebaud
5ffe29cb9f Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible. 2015-03-13 20:57:33 +01:00
Benoit Jacob
d73ccd717e Add support for dumping blocking sizes tables 2015-03-13 10:36:01 -07:00
Gael Guennebaud
2f6f8bf31c Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests. 2015-03-13 16:24:40 +01:00
Benoit Jacob
f2c3e2b10f Add --only-cubic-sizes option to analyze-blocking-sizes tool 2015-03-12 13:16:33 -07:00
Doug Kwan
657407227e Fix bug in pdiv<Packet1cd> which swaps 32-bit halves of a pair of
doubles instead of swapping the doubles.
2015-03-11 15:13:37 -07:00
Deanna Hood
f89fcefa79 Add hyperbolic trigonometric functions from std array support 2015-03-11 13:13:30 +10:00
Deanna Hood
a5e49976f5 Add log10 array support 2015-03-11 08:56:42 +10:00
Deanna Hood
19a71056ae Allow calling of square(array) in addition to array.square() 2015-03-11 06:59:28 +10:00
Deanna Hood
31fdd67756 Additional unary coeff-wise functors (isnan, round, arg, e.g.) 2015-03-11 06:39:23 +10:00
Gael Guennebaud
fd78874888 Fix compilation of iterative solvers with dense matrices 2015-03-09 21:31:03 +01:00
Gael Guennebaud
d4317a85e8 Add typedefs for return types of SparseMatrixBase::selfadjointView 2015-03-09 21:29:46 +01:00
Gael Guennebaud
9e885fb766 Add unit tests for CG and sparse-LLT for long int as storage-index 2015-03-09 14:33:15 +01:00
Gael Guennebaud
224a1fe4c6 bug #963: make IncompleteLUT compatible with non-default storage index types. 2015-03-09 13:55:20 +01:00
Gael Guennebaud
cf9940e17b Make sparse unit-test helpers aware of StorageIndex 2015-03-09 13:54:05 +01:00
Benoit Jacob
39228cb224 deserialization assumed benchmarks in same order, but we shuffle them. 2015-03-06 19:29:01 -05:00
Benoit Jacob
a4f956b1da merge 2015-03-06 19:13:36 -05:00
Benoit Jacob
19bf13aa62 Automatically serialize partial results to disk, reboot, and resume, when timings are getting bad 2015-03-06 19:11:50 -05:00
Gael Guennebaud
0ee391863e Avoid undeflow when blocking size are tuned manually. 2015-03-06 21:51:09 +01:00
Gael Guennebaud
14a5f135a3 bug #969: workaround abiguous calls to Ref using enable_if. 2015-03-06 17:51:31 +01:00
Gael Guennebaud
d23fcc0672 bug #978: add unit test for zero-sized products 2015-03-06 16:12:08 +01:00
Gael Guennebaud
87681e508f bug #978: early return for vanishing products 2015-03-06 16:11:22 +01:00
Gael Guennebaud
4c8eeeaed6 update gemm changeset list 2015-03-06 15:08:20 +01:00
Gael Guennebaud
cd3bbffa73 Improve blocking heuristic: if the lhs fit within L1, then block on the rhs in L1 (allows to keep packed rhs in L1) 2015-03-06 14:31:39 +01:00
Gael Guennebaud
eedd5063fd Update gemm performance monitoring tool:
- permit to recompute a subset of changesets
 - update changeset list
 - add a few more cases
2015-03-06 11:47:13 +01:00
Gael Guennebaud
58740ce4c6 Improve product kernel: replace the previous dynamic loop swaping strategy by a more general one:
It consists in increasing the actual number of rows of lhs's micro horizontal panel for small depth such that L1 cache is fully exploited.
2015-03-06 10:30:35 +01:00
Benoit Jacob
4ab01f7c21 slightly increase tolerance to clock speed variation 2015-03-05 14:41:16 -05:00
Benoit Jacob
5db2baa573 Make benchmark-blocking-sizes detect changes to clock speed and be resilient to that. 2015-03-05 13:44:20 -05:00
Gael Guennebaud
4c8b95d5c5 Rename LSCG to LeastSquaresConjugateGradient 2015-03-05 10:16:32 +01:00
Gael Guennebaud
7550107028 Product optimization: implement a dynamic loop-swapping startegy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large" 2015-03-05 10:03:46 +01:00
Gael Guennebaud
2dc968e453 bug #824: improve accuracy of Quaternion::angularDistance using atan2 instead of acos. 2015-03-04 17:03:13 +01:00
Benoit Jacob
2231b3dece output to cout, not cerr, the actual results 2015-03-04 09:45:12 -05:00
Benoit Jacob
00ea121881 Complete the tool to analyze the efficiency of default sizes. 2015-03-04 09:30:56 -05:00
Benoit Steiner
0196141938 Fixed the optimized AVX implementation of the fast rsqrt function 2015-03-02 13:49:39 -08:00
Benoit Steiner
b0f2b6f297 Updated the tensor type casting code as follow: in the case where TgtRatio < SrcRatio, disable the vectorization of the source expression unless is has direct-access. 2015-03-02 10:11:40 -08:00
Benoit Steiner
d9cb604a5d Disabled the use of aligned memory loads when converting a tensor from float to doubles since alignment can't always be guaranteed. 2015-03-02 09:41:36 -08:00
Benoit Steiner
4fd7f47692 Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined. 2015-03-02 09:38:47 -08:00
Benoit Steiner
ae73859a0a Fixed incorrect assertion 2015-02-28 08:02:02 -08:00
Benoit Steiner
131449298f Fixed clang compilation warning 2015-02-28 03:01:19 -08:00
Benoit Steiner
56ea45ff0f Silenced some compilation warnings 2015-02-28 02:37:41 -08:00
Benoit Steiner
bb483313f6 Fixed another batch of compilation warnings 2015-02-28 02:32:46 -08:00
Benoit Steiner
fb53384b0f Improved the default implementation of prsqrt 2015-02-28 01:51:26 -08:00
Benoit Steiner
61409d9449 Silenced one more comilation warning 2015-02-28 01:49:09 -08:00
Benoit Steiner
1a7b84dc75 Silenced a few compilation warnings 2015-02-28 01:45:15 -08:00
Benoit Steiner
37357a310f Fixed compilation warnings 2015-02-27 23:54:24 -08:00
Benoit Steiner
cf1eea11de Fixed compilation warnings 2015-02-27 23:52:02 -08:00
Benoit Steiner
78732186ee Fixed compilation warnings 2015-02-27 23:51:16 -08:00
Benoit Steiner
4250a0cab0 Fixed compilation warnings 2015-02-27 21:59:10 -08:00
Benoit Steiner
a4e37b0617 Reverted the README 2015-02-27 13:09:49 -08:00
Benoit Steiner
306fceccbe Pulled latest updates from trunk 2015-02-27 13:05:26 -08:00
Benoit Steiner
75e7f381c8 Pulled latest updates from trunk 2015-02-27 12:57:55 -08:00
Benoit Steiner
2386fc8528 Added support for 32bit index on a per tensor/tensor expression. This enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary. 2015-02-27 12:57:13 -08:00
Benoit Steiner
e1f6a45b14 README.md edited online with Bitbucket 2015-02-27 20:44:24 +00:00
Benoit Steiner
90893bbe18 README.md edited online with Bitbucket 2015-02-27 20:44:10 +00:00
Benoit Steiner
473e6d4c3d README.md edited online with Bitbucket 2015-02-27 20:41:45 +00:00
Benoit Steiner
4369538227 README.md edited online with Bitbucket 2015-02-27 20:41:33 +00:00
Benoit Steiner
99cfbd6e84 README.md edited online with Bitbucket 2015-02-27 20:41:14 +00:00
Benoit Jacob
6466fa63be Reimplement the selection between rotating and non-rotating kernels
using templates instead of macros and if()'s.
That was needed to fix the build of unit tests on ARM, which I had
broken. My bad for not testing earlier.
2015-02-27 15:30:10 -05:00
Benoit Steiner
05089aba75 Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts 2015-02-27 09:27:30 -08:00
Benoit Steiner
bf9877a92a Pulled latest updates from trunk 2015-02-27 09:23:22 -08:00
Benoit Steiner
90f4e90f1d Fixed off-by-one error that prevented the evaluation of small tensor expressions from being vectorized 2015-02-27 09:22:37 -08:00
Benoit Steiner
573b377110 Added support for vectorized type casting of tensors 2015-02-27 08:46:04 -08:00
Benoit Jacob
2fc3b484d7 remove trailing comma 2015-02-27 11:37:45 -05:00
Benoit Jacob
33669348c4 Disable Packet2f/2i halfpacket support in NEON.
I believe that it was erroneously turned on, since Packet2f/2i intrinsics are unimplemented,
and code trying to use halfpackets just fails to compile on NEON, as it tries to use the
default implementation of pload/pstore and the types don't match.
2015-02-27 11:35:37 -05:00
Benoit Jacob
f5ff4d826f Fix NEON build flags: in the current NDK, at least with the clang-3.5 toolchain,
-mfpu=neon is not enough to activate NEON, since it's incompatible with the default float ABI,
and I have to pass -mfloat-abi=softfp (which is what everyone does in practice).
In fact, it would be a good idea to pass -mfloat-abi=softfp all the time, regardless of NEON.
Also removing the -mcpu=cortex-a8, as 1) it's not needed and 2) if we really wanted to pass
a specific -mcpu flag, that would presumably to tune performance for benchmarks, and it would
then not really make sense to tune for the very old cortex-a8 (it reflects ARM CPUs from 5 years ago).
2015-02-27 10:56:50 -05:00
Benoit Jacob
b7fc8746e0 Replace a static assert by a runtime one, fixes the build of unit tests on ARM
Also safely assert in the non-implemented path that should never be taken in practice,
and would return wrong results.
2015-02-27 10:01:59 -05:00
Benoit Steiner
f074bb4b5f Fixed another compilation problem with TensorIntDiv.h 2015-02-26 11:14:23 -08:00
Benoit Steiner
57154fdb32 Can now use the tensor 'reverse' operation as a lvalue 2015-02-26 11:13:42 -08:00
Benoit Steiner
f41b1f1666 Added support for fast reciprocal square root computation. 2015-02-26 09:42:41 -08:00
Benoit Steiner
2fffe69b1b Added missing copy constructor 2015-02-26 09:27:53 -08:00
Gael Guennebaud
bcf9bb5c1f Avoid packing rhs multiple-times when blocking on the lhs only. 2015-02-26 17:01:33 +01:00
Gael Guennebaud
4ec3f04b3a Make sure that the block size computation is tested by our unit test. 2015-02-26 17:00:36 +01:00
Gael Guennebaud
2e9cb06a87 Update changeset list to be checked by perf_monitoring/gemm. 2015-02-26 16:13:33 +01:00
Gael Guennebaud
a46061ab7b Make perf_monitoring/gemm script more flexible:
- skip existing dataset
  - add a "-up" option to recompute the dataset (see script header)
  - allow to specify a filename prefix
2015-02-26 16:12:58 +01:00
Gael Guennebaud
a8ad8887bf Implement a more generic blocking-size selection algorithm. See explanations inlines.
It performs extremely well on Haswell. The main issue is to reliably and quickly find the
actual cache size to be used for our 2nd level of blocking, that is: max(l2,l3/nb_core_sharing_l3)
2015-02-26 16:04:35 +01:00
Gael Guennebaud
400becc591 Fix typos in block-size testing code, and set peeling on k to 8. 2015-02-26 15:57:06 +01:00
Benoit Steiner
bffb6bdf45 Made TensorIntDiv.h compile with MSVC 2015-02-25 23:54:43 -08:00
Benoit Steiner
27f3fb2bcc Fixed another clang warning 2015-02-25 22:54:20 -08:00
Benoit Steiner
f8fbb3f9a6 Fixed several compilation warnings reported by clang 2015-02-25 22:22:37 -08:00
Benoit Steiner
8e817b65d0 Silenced a few more compilation warnings generated by nvcc 2015-02-25 17:46:20 -08:00
Benoit Steiner
410070e5ab Added more tests to validate support for tensors laid out in RowMajor order. 2015-02-25 16:14:59 -08:00
Benoit Steiner
1cfd51908c Added support for RowMajor layout to the tensor patch extraction cofde. 2015-02-25 13:29:12 -08:00
Benoit Steiner
eb21a8173e Pulled latest changes from trunk 2015-02-25 09:49:44 -08:00
Benoit Steiner
8afce86e64 Added support for RowMajor layout to the image patch extraction code
Speeded up the unsupported_cxx11_tensor_image_patch test and reduced its memory footprint
2015-02-25 09:48:54 -08:00
Benoit Jacob
692136350b So I extensively measured the impact of the offset in this prefetch. I tried offset values from 0 to 128 (on this float* pointer, so implicitly times 4 bytes).
On x86, I tested a Sandy Bridge with AVX with 12M cache and a Haswell with AVX+FMA with 6M cache on MatrixXf sizes up to 2400.

I could not see any significant impact of this offset.

On Nexus 5, the offset has a slight effect: values around 32 (times sizeof float) are worst. Anything else is the same: the current 64 (8*pk), or... 0.

So let's just go with 0!

Note that we needed a fix anyway for not accounting for the value of RhsProgress. 0 nicely avoids the issue altogether!
2015-02-25 12:37:14 -05:00
Christoph Hertzberg
531fa9de77 bug #970: Add EIGEN_DEVICE_FUNC to RValue functions, in case Cuda supports RValue-references. 2015-02-24 21:03:28 +01:00
Benoit Jacob
26275b250a Fix my recent prefetch changes:
- the first prefetch is actually harmful on Haswell with FMA,
   but it is the most beneficial on ARM.
 - the second prefetch... I was very stupid and multiplied by sizeof(scalar)
   and offset of a scalar* pointer. The old offset was 64; pk = 8, so 64=pk*8.
   So this effectively restores the older offset. Actually, there were
   two prefetches here, one with offset 48 and one with offset 64. I could not
   confirm any benefit from this strange 48 offset on either the haswell or
   my ARM device.
2015-02-23 16:55:17 -05:00
Benoit Jacob
488874781b Add analyze-blocking-sizes program under bench/ to analyze multiple logs
generated by benchmark-blocking-sizes.
2015-02-23 14:02:29 -05:00
Christoph Hertzberg
052b6b40f1 Fix two trivial warnings 2015-02-22 12:40:51 +01:00
Christoph Hertzberg
ecbf2a6656 log1p is defined only for real Scalars in C++11 2015-02-21 19:58:24 +01:00
Christoph Hertzberg
6af6cf0c2e I can reproduce any problems that justified this hack. However it makes builds fail in C++11 mode. 2015-02-21 19:43:56 +01:00
Gael Guennebaud
3cf642baa3 Fix compilation of unit tests disabling assertion cheking 2015-02-21 14:13:48 +01:00
Benoit Jacob
458cf91cd9 Add benchmark-blocking-sizes.cpp to bench/ per mailing list discussion. 2015-02-20 17:08:04 -05:00
Gael Guennebaud
03ec601ff7 Initial version of a small script to help tracking performance regressions 2015-02-20 19:20:34 +01:00
Gael Guennebaud
333b497383 update bench_gemm 2015-02-20 11:59:49 +01:00
Gael Guennebaud
2da1594750 Fix doc of Ref<> 2015-02-20 11:52:22 +01:00
Gael Guennebaud
01b8440579 With C++11 Matrix<float> + Matrix<complex<float>> does not even compile 2015-02-20 09:32:49 +01:00
Gael Guennebaud
3594451ee0 Remove EIGEN_TEST_C++0x option and let EIGEN_TEST_CXX11 adds the -std=c++11 flag 2015-02-20 09:31:27 +01:00
Gael Guennebaud
b192e29eae In C++11 destructors do not throw by default (fix CommaInitializer unit test) 2015-02-20 09:28:34 +01:00
Benoit Steiner
ab41652d81 Pulled latest changes from trunk 2015-02-19 21:23:37 -08:00
Benoit Steiner
7765039f1c Marked the CUDA packet primitives as EIGEN_DEVICE_FUNC since they'll end up being executed on the GPU device. 2015-02-19 21:22:51 -08:00
Gael Guennebaud
a66f5fc2fd Fix regression with C++11 support of lambda: now internal::result_of falls back to std::result_of in C++11. 2015-02-19 23:32:12 +01:00
Gael Guennebaud
ece6b440f9 Fix a C++11 compilation issue in unit test 2015-02-19 23:31:08 +01:00
Gael Guennebaud
1b7e12847d Fix some calls to result_of on binary functors as unary ones. 2015-02-19 23:30:41 +01:00
Gael Guennebaud
0f4dd15dfc Declare const some const variables 2015-02-19 23:28:57 +01:00
Benoit Steiner
92ceb02c6d Pulle latest updates from trunk 2015-02-19 11:59:52 -08:00
Benoit Steiner
110fb90250 Improved the documentations 2015-02-19 11:59:04 -08:00
Gael Guennebaud
829dddd0fd Add support for C++11 result_of/lambdas 2015-02-19 15:18:37 +01:00
Benoit Jacob
db05f2d01e rotating kernel: avoid compiling anything outside of ARM 2015-02-18 15:43:52 -05:00
Benoit Jacob
0ed00d5438 remove a newly introduced redundant typedef - sorry. 2015-02-18 15:05:01 -05:00
Benoit Jacob
9bd8a4bab5 bug #955 - Implement a rotating kernel alternative in the 3px4 gebp path
This is substantially faster on ARM, where it's important to minimize the number of loads.

This is specific to the case where all packet types are of size 4. I made my best attempt to minimize how dirty this is... opinions welcome.

Eventually one could have a generic rotated kernel, but it would take some work to get there. Also, on sandy bridge, in my experience, it's not beneficial (even about 1% slower).
2015-02-18 15:03:35 -05:00
Hauke Heibel
ee27d50633 Fixed template parameter. 2015-02-18 18:51:08 +01:00
Gael Guennebaud
73a24de424 merge 2015-02-18 15:51:00 +01:00
Gael Guennebaud
63eb0f6fe6 Clean a bit computeProductBlockingSizes (use Index type, remove CEIL macro) 2015-02-18 15:49:05 +01:00
Gael Guennebaud
fc5c3e85e2 Fix bug #961: eigen-doc.tgz included part of itself. 2015-02-18 15:47:01 +01:00
Benoit Jacob
4a3e6c8be1 bug #958 - Allow testing specific blocking sizes
This is only a debugging/testing patch. It allows testing specific
product blocking sizes, typically to study the impact on performance.

Example usage:

int testk, testm, testn;
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZES
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_K testk
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_M testm
#define EIGEN_TEST_SPECIFIC_BLOCKING_SIZE_N testn
#include <Eigen/Core>
2015-02-18 09:43:55 -05:00
Gael Guennebaud
c7bb1e8ea8 Fix a regression when using OpenMP, and fix bug #714: the number of threads might be lower than the number of requested ones 2015-02-18 15:19:23 +01:00
Jan Blechta
168ceb271e Really use zero guess in ConjugateGradients::solve as documented
and expected for consistency with other methods.
2015-02-18 14:26:10 +01:00
Gael Guennebaud
8fdcaded5e merge 2015-03-04 10:18:08 +01:00
Gael Guennebaud
c43154bbc5 Check for no-reallocation in SparseMatrix::insert (bug #974) 2015-03-04 10:16:46 +01:00
Gael Guennebaud
1ce0178363 Improve efficiency of SparseMatrix::insert/coeffRef for sequential outer-index insertion strategies (bug #974) 2015-03-04 09:39:26 +01:00
Gael Guennebaud
3dca4a1efc Update manual wrt new LSCG solver. 2015-03-04 09:35:30 +01:00
Gael Guennebaud
05274219a7 Add a CG-based solver for rectangular least-square problems (bug #975). 2015-03-04 09:34:27 +01:00
Benoit Jacob
2aa09e6b4e Fix asm comments in 1px1 kernel 2015-03-03 13:44:00 -05:00
Benoit Steiner
5d2fd64a1a Fixed compilation error when compiling with gcc4.7 2015-03-03 08:56:49 -08:00
Benoit Jacob
f64b4480af Add missing copyright notices 2015-03-03 11:43:56 -05:00
Benoit Jacob
eae8e27b7d Add a benchmark-default-sizes action to benchmark-blocking-sizes.cpp 2015-03-03 11:41:21 -05:00
Marc Glisse
37a93c4263 New scoring functor to select the pivot.
This is can be useful for non-floating point scalars, where choosing the biggest element is generally not the best choice.
2015-03-03 17:08:28 +01:00
Benoit Jacob
ccc1277a42 must also disable complex<double> when disabling double vectorization 2015-03-03 10:17:05 -05:00
Benoit Jacob
f839099512 Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON intrinsics. 2015-03-03 09:35:22 -05:00
Benoit Jacob
9930e9583b Improve analyze-blocking-sizes, and in particular give it a evaluate-defaults tool
that shows the efficiency of Eigen's default blocking sizes choices, using a
previously computed table from benchmark-blocking-sizes.
2015-03-02 18:08:38 -05:00
Benoit Jacob
1ec0f4fadf HalfPacket also needed to be disabled for double, on ARMv8. 2015-03-02 16:08:54 -05:00
Gael Guennebaud
3109f0e74e Add SSE vectorization of Quaternion::conjugate. Significant speed-up when combined with products like q1*q2.conjugate() 2015-03-02 20:09:33 +01:00
Abhijit Kundu
ef09ce4552 Fix for TensorIO for Fixed sized Tensors.
The following code snippet was failing to compile:

TensorFixedSize<double, Sizes<4, 3> > t_4x3;
cout << 4x3;
2015-02-28 21:30:31 -05:00
Abhijit Kundu
3a4b6827b4 Merged eigen/eigen into default 2015-02-28 20:15:28 -05:00
Christoph Hertzberg
31e2ffe82c Replaced POSIX random() by internal::random 2015-02-28 18:39:37 +01:00
Christoph Hertzberg
73dd95e7b0 Use @CMAKE_MAKE_PROGRAM@ instead of make in buildtests.sh 2015-02-28 16:51:53 +01:00
Christoph Hertzberg
682196e9fc Fixed MPRealSupport 2015-02-28 16:41:00 +01:00
Christoph Hertzberg
33f40b2883 Cygwin does not like weak linking either. 2015-02-28 14:53:11 +01:00
Christoph Hertzberg
0f82a1d7b7 bug #967: Automatically add cxx11 suffix when building in C++11 mode 2015-02-28 14:52:26 +01:00
Gael Guennebaud
9aee1e300a Increase unit-test L1 cache size to ensure we are doing at least 2 peeled loop within product kernel. 2015-02-27 22:55:12 +01:00
Gael Guennebaud
b10cd3afd2 Re-enbale detection of min/max parentheses protection, and re-enable mpreal_support unit test. 2015-02-27 22:38:00 +01:00
Abhijit Kundu
4084dce038 Added CMake support for Tensor module. CMake now installs CXX11 Tensor module like the rest of the unsupported modules 2015-02-26 16:50:09 -05:00
Gael Guennebaud
548b781380 Fix bug #945: workaround MSVC warning 2015-02-18 12:53:49 +01:00
Gael Guennebaud
6f4adc9e94 Add missing install directives for arch/CUDA 2015-02-18 11:40:06 +01:00
Gael Guennebaud
371d3bef36 Workaround dead store warnings in unit tests. 2015-02-18 11:30:44 +01:00
Gael Guennebaud
63464754ef Add an internal assertion in makeCompressed to catch a possible risk of null-pointer access. 2015-02-18 11:29:54 +01:00
Gael Guennebaud
eb563049f7 Remove some dead stores. 2015-02-18 11:26:48 +01:00
Gael Guennebaud
dc7e6acc05 Fix possible usage of a null pointer in CholmodSupport 2015-02-18 11:26:25 +01:00
Gael Guennebaud
d4eda01488 Big 957, workaround MSVC/ICC compilation issue 2015-02-18 11:24:32 +01:00
Christoph Hertzberg
24d65ac0b0 Removed redundant typedef which confused old gcc versions. 2015-02-18 01:03:32 +01:00
Gael Guennebaud
20cac72b82 Packet must be passed by const reference and not by value to avoid alignment issue. 2015-02-17 22:58:32 +01:00
Benoit Steiner
36c9d08274 Pulled latest updates from trunk 2015-02-17 10:04:25 -08:00
Benoit Steiner
f77054f43c Silenced compilation warning 2015-02-17 10:02:04 -08:00
Benoit Steiner
1d3b64d32b Added support for tensor concatenation as lvalue 2015-02-17 09:57:41 -08:00
Benoit Steiner
00f048d44f Added support for tensor concatenation as lvalue 2015-02-17 09:54:40 -08:00
Christoph Hertzberg
97a36ecba4 Suppress some remaining Index conversion warnings 2015-02-17 18:52:39 +01:00
Gael Guennebaud
159fb181c2 Disable __m128* wrappers when compiling with AVX and -fabi-version=4 2015-02-17 16:27:20 +01:00
Gael Guennebaud
91ab2489dd Fix compilation with GCC/AVX (workaround __m128 and __m256 being the same type with default ABI) 2015-02-17 16:08:07 +01:00
Gael Guennebaud
9daf8eba6f Fix compilation of Cholmod*(matrix) ctor 2015-02-17 15:24:52 +01:00
Gael Guennebaud
3373c903b3 Fix compilation of int*complex with gcc 2015-02-16 19:18:12 +01:00
Gael Guennebaud
9f49f00feb Extend sparse-determinant unitests 2015-02-16 19:09:48 +01:00
Gael Guennebaud
f0b1b1df9b Fix SparseLU::signDeterminant() method, and add a SparseLU::determinant() method. 2015-02-16 19:09:22 +01:00
Gael Guennebaud
8768ff3c31 Add PermutationMatrix::determinant method. 2015-02-16 19:08:25 +01:00
Martin Drozdik
64b29e06b9 bug #956: Fixed bug in move constructors of DenseStorage which caused "moved-from" objects to be in an invalid state. 2015-02-16 18:18:46 +09:00
Gael Guennebaud
1c0e8bcf09 Fix unused variable warning. 2015-02-16 17:21:30 +01:00
Gael Guennebaud
69fa405096 Update circulant custom expression example 2015-02-16 17:21:16 +01:00
Gael Guennebaud
0f464d9d87 bug #897: fix regression in BiCGSTAB(mat) ctor (an all other iterative solvers).
Add respective regression unit test.
2015-02-16 17:05:10 +01:00
Gael Guennebaud
470d26d580 Remove some useless typedefs 2015-02-16 16:48:21 +01:00
Gael Guennebaud
4dded73227 bug #914: fix compiler detection on windows
(grafted from 77af14fb62
)
2015-02-16 16:26:47 +01:00
Gael Guennebaud
953d5ccfd5 Doc: explain how to free allocated memory in SparseMAtrix 2015-02-16 15:56:11 +01:00
Gael Guennebaud
98604576d1 Merged in chtz/eigen-indexconversion (pull request PR-92)
bug #877, bug #572: Get rid of Index conversion warnings, summary of changes:

- Introduce a global typedef Eigen::Index making Eigen::DenseIndex and AnyExpr<>::Index deprecated (default is std::ptrdiff_t).

 - Eigen::Index is used throughout the API to represent indices, offsets, and sizes.

 - Classes storing an array of indices uses the type StorageIndex to store them. This is a template parameter of the class. Default is int.

 - Methods that *explicitly* set or return an element of such an array take or return a StorageIndex type. In all other cases, the Index type is used.
2015-02-16 15:29:00 +01:00
Gael Guennebaud
45cbb0bbb1 The usage of DenseIndex is deprecated, so let's replace DenseIndex by Index 2015-02-16 15:05:41 +01:00
Gael Guennebaud
cc641aabb7 Remove deprecated usage of expr::Index. 2015-02-16 14:46:51 +01:00
Gael Guennebaud
aa6c516ec1 Fix many long to int conversion warnings:
- fix usage of Index (API) versus StorageIndex (when multiple indexes are stored)
 - use StorageIndex(val) when the input has already been check
 - use internal::convert_index<StorageIndex>(val) when val is potentially unsafe (directly comes from user input)
2015-02-16 13:19:05 +01:00
Christoph Hertzberg
bd511dde9d bug #952: Missing \endcode made doxygen fail to build ColPivHouseholderQR 2015-02-15 06:08:25 +01:00
Benoit Steiner
e2cfddf75f Pulled latest updates from trunk 2015-02-13 16:21:59 -08:00
Benoit Steiner
0927801a84 Optimized version of the sin(), exp(), log() and sqrt() function for AVX 2015-02-13 16:07:08 -08:00
Benoit Jacob
e972b55ec4 bug #953 - Fix prefetches in 3px4 product kernel
This gives a 10% speedup on nexus 4 and on nexus 5.
2015-02-13 14:52:36 -05:00
Gael Guennebaud
fc202bab39 Index refactoring: StorageIndex must be used for storage only (and locally when it make sense). In all other cases use the global Index type. 2015-02-13 18:57:41 +01:00
Gael Guennebaud
fe51319980 Merge Index-refactoring branch with default, fix PastixSupport, remove some useless typedefs 2015-02-13 10:03:53 +01:00
Gael Guennebaud
0918c51e60 merge Tensor module within Eigen/unsupported and update gemv BLAS wrapper 2015-02-12 21:48:41 +01:00
Gael Guennebaud
409547a0c8 update EIGEN_FAST_MATH documentation 2015-02-12 21:04:31 +01:00
Benoit Steiner
4470c99975 Added a test to validate tensor casting on cuda devices 2015-02-10 14:40:18 -08:00
Benoit Steiner
6620aaa4b3 Silenced a few compilation warnings generated by nvcc 2015-02-10 14:34:42 -08:00
Benoit Steiner
f669f5656a Marked a few functions as EIGEN_DEVICE_FUNC to enable the use of tensors in cuda kernels. 2015-02-10 14:29:47 -08:00
Gael Guennebaud
029d236ceb merge 2015-02-10 23:12:47 +01:00
Gael Guennebaud
fe25f3b8e3 FMA has been wrongly disabled 2015-02-10 23:11:35 +01:00
Benoit Steiner
ceb4c9c10b Pulled latest changes from trunk 2015-02-10 14:03:17 -08:00
Benoit Steiner
cc5d7ff523 Added vectorized implementation of the exponential function for ARM/NEON 2015-02-10 14:02:38 -08:00
Gael Guennebaud
d771295554 remove useless include 2015-02-10 22:59:27 +01:00
Benoit Steiner
fefec723aa Fixed compilation error triggered when trying to vectorize a non vectorizable cuda kernel. 2015-02-10 13:16:22 -08:00
Benoit Steiner
780b2422e2 Silenced the last batch of compilation warnings triggered by gcc 4.8 2015-02-10 12:43:55 -08:00
Benoit Steiner
c21e45fbc5 Fixed a few more compilation warnings 2015-02-10 12:36:26 -08:00
Benoit Steiner
057cfd2f02 Silenced more compilation warnings 2015-02-10 12:25:02 -08:00
Benoit Steiner
114e863f08 Silcenced a few compilation warnings 2015-02-10 12:20:24 -08:00
Benoit Steiner
410895a7e4 Silenced several compilation warnings 2015-02-10 12:13:19 -08:00
Benoit Steiner
4716c2c666 Fixed compilation error 2015-02-10 12:06:19 -08:00
Benoit Steiner
91fe3a3004 Removed a debug printf statement. 2015-02-10 10:29:28 -08:00
Jan Blechta
c3f3580b8f Fix bug #733: step by step solving is not a good example for solveWithGuess 2015-02-10 14:24:39 +01:00
Gael Guennebaud
deecff97ed typo 2015-02-10 19:22:05 +01:00
Gael Guennebaud
c6e8caf090 Allows Lower|Upper as a template argument of CG and MINRES: in this case the full matrix will be considered. 2015-02-10 18:57:41 +01:00
Gael Guennebaud
d10d6a40dd bug #897: Update unsupported iterative solvers based on IterativeSolverBased. 2015-02-10 13:02:59 +01:00
Gael Guennebaud
87629cd639 bug #897: makes iterative sparse solvers use a Ref<SparseMatrix> instead of a SparseMatrix pointer. This fixes usage of iterative solvers with a Map<SparseMatrix>. 2015-02-09 11:41:25 +01:00
Gael Guennebaud
bde98df03f merge 2015-02-09 11:15:37 +01:00
Gael Guennebaud
d4ec48575e Make Block<SparseMatrix> inherit SparseCompressedBase in the case of an inner-panels and fix valuePtr() innerIndexPtr() 2015-02-09 11:14:36 +01:00
Gael Guennebaud
554aa9b31d Add failtests for Ref<SparseMatrix> 2015-02-09 10:24:07 +01:00
Gael Guennebaud
3af29caae8 Cleaning and add more unit tests for Ref<SparseMatrix> and Map<SparseMatrix> 2015-02-09 10:23:45 +01:00
Gael Guennebaud
f2ff8c091e Add a Ref<SparseMatrix> specialization. 2015-02-07 22:04:18 +01:00
Gael Guennebaud
f3be317614 Add a Map<SparseMatrix> specialization. 2015-02-07 22:03:25 +01:00
Gael Guennebaud
08081f8293 Make SparseTranspose inherit SparseCompressBase when possible 2015-02-07 22:02:14 +01:00
Gael Guennebaud
7838fda82c Add a SparseCompressedBase class providing (un)compressed accessors (like data()/*Stride() for dense matrices),
and a CompressedAccessBit flag (similar to DirectAccessBit for dense matrices).
2015-02-07 22:00:46 +01:00
Benoit Steiner
3ba6647398 Fixed the cxx11_meta test 2015-02-06 06:00:59 -08:00
Benoit Steiner
01f7918788 Pulled latest fixes 2015-02-06 05:30:20 -08:00
Gael Guennebaud
b50ffaddf2 merge 2015-02-06 14:27:12 +01:00
Gael Guennebaud
74e460b995 Fix symmetric product 2015-02-06 14:26:24 +01:00
Gael Guennebaud
c03c73c9b7 Fix clang compilation 2015-02-06 14:26:12 +01:00
Gael Guennebaud
668518aed6 Fix non initialized entries and comparison of very small numbers 2015-02-06 14:25:41 +01:00
Benoit Steiner
c739102ef9 Pulled the latest changes from the trunk 2015-02-06 05:25:03 -08:00
Benoit Steiner
2559fa9b0f Fixed compilation error in the tensor broadcasting test 2015-02-06 02:55:18 -08:00
Benoit Steiner
dcb2a8b184 Added the EIGEN_HAS_CONSTEXPR define
Gate the tensor index list code based on the value of EIGEN_HAS_CONSTEXPR
2015-02-06 02:51:59 -08:00
Filippo Basso
a8f2c6eec7 Using numext::pow instead of std::pow in poly_eval function. 2015-02-04 18:37:51 +00:00
Gael Guennebaud
b1eca55328 Use Ref<> to ensure that both x and b in Ax=b are compatible with Umfpack/SuperLU expectations 2015-02-03 23:46:05 +01:00
Gael Guennebaud
ebdf6a2dbb SPQR: fix default threshold value 2015-02-03 22:32:34 +01:00
Benoit Steiner
f64045a060 Silenced a few more compilation warnings 2015-01-30 19:52:01 -08:00
Benoit Steiner
590f4b0aa3 Silenced some compilation warnings 2015-01-30 19:46:30 -08:00
Benoit Jacob
5ef95fabee bug #936, patch 3/3: Properly detect FMA support on ARM (requires VFPv4)
and use it instead of MLA when available, because it's both more accurate,
and faster.
2015-01-30 17:45:03 -05:00
Benoit Jacob
0f21613698 bug #936, patch 2/3: Remove EIGEN_VECTORIZE_FMA, was redundant with EIGEN_HAS_SINGLE_INSTRUCTION_MADD 2015-01-30 17:44:26 -05:00
Benoit Jacob
340b8afb14 bug #936, patch 1.5/3: rename _FUSED_ macros to _SINGLE_INSTRUCTION_,
because this is what they are about. "Fused" means "no intermediate rounding
between the mul and the add, only one rounding at the end". Instead,
what we are concerned about here is whether a temporary register is needed,
i.e. whether the MUL and ADD are separate instructions.
Concretely, on ARM NEON, a single-instruction mul-add is always available: VMLA.
But a true fused mul-add is only available on VFPv4: VFMA.
2015-01-31 14:15:57 -05:00
Benoit Jacob
9f99f61e69 bug #936, patch 1/3: some cleanup and renaming for consistency. 2015-01-30 17:43:56 -05:00
Benoit Jacob
759bd92a85 bug #935: Add asm comments in GEBP kernels to work around a bug
in both GCC and Clang on ARM/NEON, whereby they spill registers,
severely harming performance. The reason why the asm comments
make a difference is that they prevent the compiler from
reordering code across these boundaries, which has the effect
of extending the lifetime of local variables and increasing
register pressure on this register-tight code.
2015-01-30 17:27:56 -05:00
Gael Guennebaud
f1092d2f73 bug #941: fix accuracy issue in ColPivHouseholderQR, do not stop decomposition on a small pivot 2015-01-30 19:04:04 +01:00
Gael Guennebaud
9d82f7e30d Supernodes was disabled. 2015-01-30 17:24:40 +01:00
Benoit Steiner
e896c0ade7 Marked the contraction operation as read only, since its result can't be assigned. 2015-01-29 10:29:47 -08:00
Benoit Steiner
5a6ea4edf6 Added more tests to cover tensor reductions 2015-01-28 10:02:47 -08:00
Gael Guennebaud
a727a2c4ed bug #933: RealSchur, do not consider the input matrix norm to check negligible sub-diag entries. This also makes this test consistent with the complex and self-adjoint cases. 2015-01-28 16:07:51 +01:00
Benoit Steiner
9dfdbd7e56 mproved the performance of tensor reductions that preserve the inner most dimension(s). 2015-01-27 14:15:31 -08:00
Benoit Steiner
46fc881e4a Added a few benchmarks for the tensor code 2015-01-26 17:46:40 -08:00
Gael Guennebaud
c6eb84aabc Enable vectorization of transposeInPlace for PacketSize x PacketSize matrices 2015-01-26 17:09:01 +01:00
Gael Guennebaud
e1f1091fde Add support for dense ?= diagonal 2015-01-24 10:32:49 +01:00
Gael Guennebaud
b9d314ae19 bug #329: fix typo 2015-01-17 21:55:33 +01:00
Benoit Steiner
14f537c296 gcc doesn't consider that
template<typename OtherDerived> TensorStridingOp& operator = (const OtherDerived& other)
provides a valid assignment operator for the striding operation, and therefore refuses to compile code like:
result.stride(foo) = source.stride(bar);

Added the explicit
   TensorStridingOp& operator = (const TensorStridingOp& other)

as a workaround to get the code to compile, and did the same in all the operations that can be used as lvalues.
2015-01-16 09:09:23 -08:00
Benoit Steiner
641e824c56 Added cube() operation 2015-01-15 11:11:48 -08:00
Benoit Steiner
b5124e7cfd Created many additional tests 2015-01-14 15:46:04 -08:00
Benoit Steiner
54e3633b43 Updated the list of include files 2015-01-14 15:43:38 -08:00
Benoit Steiner
f697df7237 Improved support for RowMajor tensors
Misc fixes and API cleanups.
2015-01-14 15:38:48 -08:00
Benoit Steiner
6559d09c60 Ensured that each thread has it's own copy of the TensorEvaluator: this avoid race conditions when the evaluator calls a non thread safe functor, eg when generating random numbers. 2015-01-14 15:34:50 -08:00
Benoit Steiner
8a382aa119 Improved the resizing of tensors 2015-01-14 15:33:11 -08:00
Benoit Steiner
703c526355 Misc improvements 2015-01-14 15:31:52 -08:00
Benoit Steiner
4cdf3fe427 Misc fixes 2015-01-14 15:30:47 -08:00
Benoit Steiner
0feff6e987 Expanded the functionality of index lists 2015-01-14 15:29:48 -08:00
Gael Guennebaud
cd679f2c47 Fix doc: setConstant does not exist for SparseMatrix. 2015-01-14 22:06:09 +01:00
Benoit Steiner
1ac8600126 Fixed the return type of coefficient wise operations. For example, the abs function returns a floating point value when called on a complex input. 2015-01-14 12:47:46 -08:00
Benoit Steiner
378bdfb7f0 Added missing apis to the TensorMap class 2015-01-14 12:45:20 -08:00
Benoit Steiner
0526dc1bb4 Added missing apis to the tensor class 2015-01-14 12:44:08 -08:00
Benoit Steiner
1a36590e84 Fixed the printing of RowMajor tensors 2015-01-14 12:43:20 -08:00
Benoit Steiner
7e0b6c56b4 Added ability to initialize a tensor using an initializer list 2015-01-14 12:41:30 -08:00
Benoit Steiner
b12dd1ae3c Misc improvements for fixed size tensors 2015-01-14 12:39:34 -08:00
Benoit Steiner
71676eaddd Added support for RowMajor inputs to the contraction code. 2015-01-14 12:36:57 -08:00
Benoit Steiner
0a0ab6dd15 Increased the functionality of the tensor devices 2015-01-14 11:45:17 -08:00
Benoit Steiner
5692723c58 Improved the performance of the contraction code on CUDA 2015-01-14 11:42:52 -08:00
Benoit Steiner
8f4b8d204b Improved the performance of tensor reductions
Added the ability to generate random numbers following a normal distribution
Created a test to validate the ability to generate random numbers.
2015-01-14 10:19:33 -08:00
Benoit Steiner
3bd2b41b2e Created a test for tensor type casting 2015-01-14 10:17:02 -08:00
Benoit Steiner
4928ea1212 Added ability to reverse the order of the coefficients in a tensor 2015-01-14 10:15:58 -08:00
Benoit Steiner
b00fe1590d Added ability to swap the layout of a tensor 2015-01-14 10:14:46 -08:00
Benoit Steiner
c94174b4fe Improved tensor references 2015-01-14 10:13:08 -08:00
Benoit Steiner
91dd53e54d Created some documentation 2015-01-13 16:07:51 -08:00
Gael Guennebaud
279786e987 Fix missing evaluator in outer-product 2015-01-13 10:25:50 +01:00
Gael Guennebaud
ae4644cc68 bug #907, ARM64: workaround ICE in xcode/clang 2015-01-13 10:03:00 +01:00
Gael Guennebaud
36f7c1337f bug #907, ARM64: workaround vreinterpretq_u64_* not defined in xcode/clang 2015-01-13 09:57:37 +01:00
Gael Guennebaud
63974bcb88 Big 907: workaround some missing intrinsics in current NDK's gcc version (ARM64) 2015-01-07 09:44:25 +01:00
Gael Guennebaud
79f4a59ed9 bug #907: fix compilation with ARM64 2015-01-07 09:41:56 +01:00
Benoit Steiner
9f98650d0a Ensured that contractions that can be reduced to a matrix vector product work correctly even when the input coefficients aren't aligned. 2015-01-06 09:29:13 -08:00
Gael Guennebaud
db5b0741b5 Fix bug #925: typo in MatLab versions of middleRows 2015-01-04 21:39:50 +01:00
Gael Guennebaud
f5f6e2c6f4 bug #921: fix utilization of bitwise operation on enums in first_aligned 2014-12-19 14:41:59 +01:00
Gael Guennebaud
25c7d9164f bug #920: fix MSVC 2015 compilation issues 2014-12-18 22:58:15 +01:00
Gael Guennebaud
b8d9eaa19b Use true compile time "if" for Transform::makeAffine 2014-12-13 22:16:39 +01:00
Gael Guennebaud
f806c23012 Fix false negatives in geo_transformations unit tests 2014-12-16 16:50:30 +01:00
Gael Guennebaud
99501a2c4c Fix wrong negative in nullary unit test when extended precision is used (FPU). 2014-12-16 16:23:47 +01:00
Gael Guennebaud
7dad5f797e bug #821: workaround MSVC 2013 issue with using Base::Base::operator= 2014-12-16 13:33:43 +01:00
Christoph Hertzberg
dcad508986 At least CMAKE 2.8.4 is required for WORKING_DIRECTORY option in add_test 2014-12-15 12:45:29 +01:00
Christoph Hertzberg
608733415a Free functions should only be declared as static in separate compilation units
(grafted from d85abc89c5
)
2014-12-12 12:01:03 +01:00
Gael Guennebaud
57ec399ec9 Remove unused fortran files 2014-12-13 21:41:25 +01:00
Gael Guennebaud
56ca44ad1a Use f2c generated code instead of the original fortran code, except for dotc/dotu. 2014-12-11 17:03:41 +01:00
Christoph Hertzberg
e8cdbedefb bug #877, bug #572: Introduce a global Index typedef. Rename Sparse*::Index to StorageIndex, make Dense*::StorageIndex an alias to DenseIndex. Overall this commit gets rid of all Index conversion warnings. 2014-12-04 22:48:53 +01:00
Gael Guennebaud
6ccf97f3e6 Fix GL support wrt evaluators 2014-12-04 22:05:28 +01:00
Gael Guennebaud
433bce5c3a UmfPack support: fix redundant evaluation/copies when calling compute() and support generic expressions as input 2014-12-02 17:30:57 +01:00
Gael Guennebaud
775f7e5fbb bug #697: make sure empty classes are at the end in case of multiple inheritence 2014-12-02 14:40:19 +01:00
Gael Guennebaud
a819fa148d Fix MSVC compilation issue 2014-12-02 14:35:31 +01:00
Gael Guennebaud
1a8dc85142 bug #897: fix UmfPack usage with mapped sparse matrices 2014-12-02 13:57:13 +01:00
Gael Guennebaud
4974d1d2b4 Fix bug #911: m_extractedDataAreDirty was not initialized in UmfPackLU 2014-12-02 13:54:06 +01:00
Gael Guennebaud
e2f3e4e4aa Document non-const SparseMatrix::diagonal() method. 2014-12-01 14:45:15 +01:00
Gael Guennebaud
b26e697182 Make SparseMatrix::coeff() returns a const reference and add a non const version of SparseMatrix::diagonal() 2014-12-01 14:41:39 +01:00
Gael Guennebaud
b1f9f603a0 Simplify return type of diagonal(Index) (and ease compiler job) 2014-11-28 14:39:47 +01:00
Gael Guennebaud
5384e89147 Disable MatrixBase::bdcSvd with CUDA (just like MatrixBase::jacobiSvd 2014-11-26 22:29:29 +01:00
Gael Guennebaud
8518ba0bbc Fix Hyperplane::Through(a,b,c) when points are aligned or identical. We use the stratgey as in Quaternion::setFromTwoVectors. 2014-11-26 15:01:53 +01:00
Tim Murray
80cae358b0 Adds a modified f2c-generated C implmentation for BLAS.
This adds an optional implementation for the BLAS library that does
not require the use of a FORTRAN compiler. It can be enabled with
EIGEN_USE_F2C_BLAS.

The C implementation uses the standard gfortran calling convention
and does not require the use of -ff2c when compiled with gfortran.
2014-11-24 10:56:30 -08:00
Gael Guennebaud
0efaff9b3b Fix out-of-bounds write 2014-12-11 16:15:20 +01:00
Gael Guennebaud
41a20994cc In simplicial cholesky: avoid deep copy of the input matrix is this later can be used readily 2014-12-08 17:56:33 +01:00
Gael Guennebaud
a910a7466e Fix inner iterator type 2014-12-08 17:55:31 +01:00
Gael Guennebaud
4371911861 Remove useless and non standard numext::atanh2 function. 2014-12-08 16:44:34 +01:00
Gael Guennebaud
5fc4ce6449 bug #876: remove usage of atanh2 in matrix power 2014-12-08 16:44:05 +01:00
Gael Guennebaud
77294047d6 bug #876, matrix_log_compute_2x2: directly use logp1 instead of atanh2 2014-12-08 16:28:06 +01:00
Gael Guennebaud
bea36925db bug #876: implement a portable log1p function 2014-12-08 16:26:53 +01:00
Gael Guennebaud
7f7a712062 Optimize Simplicial Cholesky when NaturalOrdering is used. 2014-12-08 15:02:25 +01:00
Gael Guennebaud
30c849669d Fix dynamic allocation in JacobiSVD (regression) 2014-12-08 14:45:04 +01:00
Gael Guennebaud
e0a8615b94 Merged in infinitei/eigen (pull request PR-91)
Added cmake uninstall target
2014-12-05 15:04:19 +01:00
Gael Guennebaud
8efd9142b3 Merged in infinitei/eigen-opengl-fixes (pull request PR-90)
Adding missing OPENGL_LIBRARIES for openglsupport test.
2014-12-05 12:54:57 +01:00
Gael Guennebaud
80ed5bd90c Workaround various "returning reference to temporary" warnings. 2014-12-05 12:49:30 +01:00
Abhijit Kundu
eb3695d2fc Added cmake uninstall target.
This adds a cmake command make uninstall
Running make uninstall removes the files installed by running make install
2014-12-04 02:57:03 -05:00
Abhijit Kundu
48db34a7b9 Adding missing OPENGL_LIBRARIES for openglsupport test. Also adding OpenGL include directories as a better pratice even though these are system include directories in most systems. 2014-12-04 01:18:47 -05:00
Gael Guennebaud
da584912b6 Fix memory pre-allocation when permuting inner vectors of a sparse matrix. 2014-11-24 17:31:59 +01:00
Benoit Steiner
509e4ddc02 Added reduction packet primitives for CUDA 2014-11-19 10:34:11 -08:00
Benoit Steiner
b33cf92878 Fixed the evaluation of expressions involving tensors of 2 or 3 elements on CUDA devices. 2014-11-18 14:32:41 -08:00
Benoit Steiner
1d3c8306f8 Fixed compilation errors with clang.
H: Enter commit message.  Lines beginning with 'HG:' are removed.
2014-11-13 19:13:17 -08:00
Benoit Steiner
ec785b0180 Added support for extraction of patches from images 2014-11-13 09:28:54 -08:00
Benoit Steiner
eeabf7975e Optimized broadcasting 2014-11-12 22:35:44 -08:00
Benoit Steiner
c2d1074932 Added support for static list of indices 2014-11-12 22:25:38 -08:00
Gael Guennebaud
722916e19d bug #903: clean swap API regarding extra enable_if parameters, and add failtests for swap 2014-11-06 09:25:26 +01:00
Benoit Steiner
cb37f818ca Fixed a compilation error triggered by some operations on fixed sized tensors 2014-11-05 23:25:11 -08:00
Benoit Steiner
9a06a71627 Fixed a test 2014-11-05 07:49:51 -08:00
Gael Guennebaud
c6fefe5d8e Big 853: replace enable_if in Ref<> ctor by static assertions and add failtests for Ref<> 2014-11-05 16:15:17 +01:00
Gael Guennebaud
ee06f78679 Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively. 2014-11-04 21:58:52 +01:00
Benoit Steiner
9ea09179b5 Fixed the return type of the coefficient-wise tensor operations. 2014-11-04 10:24:42 -08:00
Benoit Steiner
b1789c112b Improved handling of 1d tensors 2014-11-03 08:51:33 -08:00
Benoit Steiner
2dde63499c Generalized the matrix vector product code. 2014-10-31 16:33:51 -07:00
Benoit Steiner
7f2c6ed2fa Fixed a compilation warning 2014-10-31 11:45:21 -07:00
Christoph Hertzberg
c5a3777666 Regression test for (invalid) bug #900. We should make it possible somehow to increase the problem size depending on the available RAM. 2014-10-31 17:19:05 +01:00
Christoph Hertzberg
0833b82efd Run sparse_basic unit tests also for rectangular matrices.
TriangularView with UnitDiag does not work properly yet (bug #901)
2014-10-31 17:12:13 +01:00
Benoit Steiner
85c3389b28 Fixed a test 2014-10-31 00:04:13 -07:00
Benoit Steiner
67fcf47ecb Merged from trunk 2014-10-30 21:59:22 -07:00
Benoit Steiner
fcecafde3a Fixed a compilation error with clang 2014-10-30 21:58:14 -07:00
Benoit Steiner
d62bfe73a9 Use the proper index type in the padding code 2014-10-30 18:15:05 -07:00
Benoit Steiner
bc99c5f7db fixed some potential alignment issues. 2014-10-30 18:09:53 -07:00
Benoit Steiner
1946cc4478 Added missing packet primitives for CUDA. 2014-10-30 17:52:32 -07:00
Benoit Steiner
5e62427e22 Use the proper index type 2014-10-30 17:49:39 -07:00
Christoph Hertzberg
4ec2f07a5b Fixed bug in SparseBlock which caused a segfault in sparse_extra_3 test 2014-10-30 21:34:10 +01:00
Christoph Hertzberg
883168ed94 Make select CUDA compatible (comparison operators aren't yet, so no test case yet) 2014-10-30 20:16:16 +01:00
Christoph Hertzberg
e5f134006b EIGEN_UNUSED_VARIABLE works better than casting to void. Make this also usable from CUDA code 2014-10-30 19:59:09 +01:00
Christoph Hertzberg
d2fc597d5b Removed deprecated header (unsupported/Eigen/BDCSVD is included in Eigen/SVD now) 2014-10-29 17:51:14 +01:00
Christoph Hertzberg
3d25b1f5b8 Split up some test cases 2014-10-29 17:46:54 +01:00
Christoph Hertzberg
acecb7b09f Fixed include in bdcsvd.cpp 2014-10-29 17:46:33 +01:00
Gael Guennebaud
21c0a2ce0c Move D&C SVD to official SVD module. 2014-10-29 11:29:33 +01:00
Benoit Steiner
debc97821c Added support for tensor references 2014-10-28 23:10:13 -07:00
Christoph Hertzberg
e2e7ba9f85 bug #898: add inline hint to const_cast_ptr 2014-10-28 14:49:44 +01:00
Christoph Hertzberg
bd2d330b25 Temporary workaround for bug #875:
Let TriangularView<Sparse>::nonZeros() return nonZeros() of the nested expression
2014-10-28 13:31:00 +01:00
Konstantinos Margaritis
79225db0b6 Merged in kmargar/eigen (pull request PR-87)
Extend NEON to add ARMv8 64-bit double support
2014-10-28 13:08:53 +02:00
Benjamin Chrétien
c426054767 BDCSVD: fix CMake install (missing separator). 2014-10-24 15:10:56 +02:00
Christoph Hertzberg
1fa793cb97 Removed weird self assignment. 2014-10-24 13:19:19 +02:00
Christoph Hertzberg
04ffb9956e Replace TEST_SET_BUT_UNUSED_VARIABLE by already defined EIGEN_UNUSED_VARIABLE 2014-10-24 13:18:23 +02:00
Konstantinos Margaritis
94ed7c81e6 Bug #896: Swap order of checking __VSX__/__ALTIVEC__ 2014-10-22 06:15:18 -04:00
Konstantinos Margaritis
fcb3573d17 Merged eigen/eigen into default 2014-10-22 10:42:18 +03:00
Konstantinos Margaritis
fae4fd7a26 Added ARMv8 support 2014-10-22 07:39:49 +00:00
Christoph Hertzberg
cf09c5f687 Prevent CUDA calling a __host__ function from a __host__ __device__ function is not allowed error. 2014-10-21 20:40:09 +02:00
Konstantinos Margaritis
b508619392 working 64-bit support in PacketMath.h, Complex.h needed 2014-10-21 18:10:33 +00:00
Konstantinos Margaritis
0f65f2762d add EIGEN_TEST_NEON64, but it's a dummy, AArch64 implies NEON support so extra CXXFLAGS are needed 2014-10-21 18:10:01 +00:00
Konstantinos Margaritis
87524922dc check for __ARM_NEON instead as it's defined in arm64 as well 2014-10-21 18:08:50 +00:00
Gael Guennebaud
a303b6a733 bug #670: add unit test for mapped input in sparse solver. 2014-10-20 16:46:47 +02:00
Gael Guennebaud
fe57b2f963 bug #701: workaround (min) and (max) blocking ADL by introducing numext::mini and numext::maxi internal functions and a EIGEN_NOT_A_MACRO macro. 2014-10-20 15:55:32 +02:00
Christoph Hertzberg
c12b7896d0 bug #766: Check minimum CUDA version 2014-10-20 14:23:11 +02:00
Gael Guennebaud
973e6a035f bug #718: Introduce a compilation error when using the wrong InnerIterator type with a SparseVector 2014-10-20 14:07:08 +02:00
Christoph Hertzberg
84aaa03182 Addendum to bug #859: pexp(NaN) for double did not return NaN, also, plog(NaN) did not return NaN.
psqrt(NaN) and psqrt(-1) shall return NaN if EIGEN_FAST_MATH==0
2014-10-20 13:13:43 +02:00
Gael Guennebaud
aa5f79206f Fix bug #859: pexp(NaN) returned Inf instead of NaN 2014-10-20 11:38:51 +02:00
Gael Guennebaud
b4a9b3f496 Add unit tests for Rotation2D's inverse(), operator*, slerp, and fix regression wrt explicit ctor change 2014-10-20 11:04:32 +02:00
Gael Guennebaud
d04f23260d Fix bug #894: the sign of LDLT was not re-initialized at each call of compute() 2014-10-20 10:48:40 +02:00
Gael Guennebaud
8838b0a1ff Fix SparseQR::rank for a completely empty matrix. 2014-10-19 22:42:20 +02:00
Benoit Steiner
f786897e4b Added access to the unerlying raw data of a tnsor slice/chip whenever possible 2014-10-17 15:33:27 -07:00
Benoit Steiner
7acd38d19e Created some benchmarks for the tensor code 2014-10-17 09:49:03 -07:00
Gael Guennebaud
b50e5bc816 merge 2014-10-17 16:53:18 +02:00
Gael Guennebaud
a370b1f2e2 Fix SparseLU::absDeterminant and add respective unit test 2014-10-17 16:52:56 +02:00
Gael Guennebaud
a13bc22204 Ignore automalically imported lapack source files 2014-10-17 15:34:39 +02:00
Gael Guennebaud
4b7c3abbea Fix D&C SVD wrt zero matrices 2014-10-17 15:32:55 +02:00
Gael Guennebaud
feacfa5f83 Fix JacobiSVD wrt undeR/overflow by doing scaling prior to QR preconditioning 2014-10-17 15:32:06 +02:00
Gael Guennebaud
8472e697ca Add lapack interface to JacobiSVD and BDCSVD 2014-10-17 15:31:11 +02:00
Benoit Steiner
65af852b54 Silenced one last warning 2014-10-16 15:02:30 -07:00
Benoit Steiner
ae697b471c Silenced a few compilation warnings
Generalized a TensorMap constructor
2014-10-16 14:52:50 -07:00
Benoit Steiner
94e47798f4 Fixed the return types of unary and binary expressions to properly handle the case where it is different from the input type (e.g. abs(complex<float>)) 2014-10-16 10:41:07 -07:00
Benoit Steiner
d853adffdb Avoid calling get_future() more than once on a given promise. 2014-10-16 10:10:04 -07:00
Mark Borgerding
880e72c130 quieted more g++ warnings of the form: warning: typedef XXX locally defined but not used [-Wunused-local-typedefs] 2014-10-16 09:19:32 -04:00
Benoit Steiner
bfdd9f3ac9 Made the blocking computation aware of the l3 cache
Also optimized the blocking parameters to take into account the number of threads used for a computation
2014-10-15 15:32:59 -07:00
Gael Guennebaud
c566cfe2ba Make SVD unit test even more tough 2014-10-15 23:37:47 +02:00
Benoit Steiner
dba55041ab Added support for promises
Started to improve multithreaded contractions
2014-10-15 11:20:36 -07:00
Gael Guennebaud
fd1aaf4772 merge 2014-10-15 16:33:14 +02:00
Gael Guennebaud
c806009453 Extend svd unit tests to stress problems with duplicated singular values. 2014-10-15 16:32:16 +02:00
Gael Guennebaud
2cc41dbe83 D&C SVD: fix some numerical issues by truly skipping deflated singular values when computing them 2014-10-15 15:21:12 +02:00
Gael Guennebaud
c26e8a1af3 D&C SVD: fix deflation of repeated singular values, fix sorting of singular values, fix case of complete deflation 2014-10-15 11:59:21 +02:00
Christoph Hertzberg
0ec1fc9e11 bug #891: Determine sizeof(void*) via CMAKE variable instead of test program 2014-10-14 14:14:25 +02:00
Benoit Steiner
99d75235a9 Misc improvements and cleanups 2014-10-13 17:02:09 -07:00
Benoit Steiner
4c70b0a762 Added support for patch extraction 2014-10-13 10:04:04 -07:00
Christoph Hertzberg
d3f52debc6 Make cuda_basic test compile again by adding lots of EIGEN_DEVICE_FUNC.
Although the test passes now, there might still be some missing.
2014-10-13 17:18:26 +02:00
Benoit Steiner
0219f8aed4 Added ability to print a tensor using an iostream. 2014-10-10 16:17:26 -07:00
Benoit Steiner
2ed1838aeb Added support for tensor chips 2014-10-10 16:11:27 -07:00
Benoit Steiner
4b36c3591f Fixed the tensor shuffling test 2014-10-10 15:43:21 -07:00
Benoit Steiner
a991f94c0e Fixed the thread pool test 2014-10-10 15:20:37 -07:00
Benoit Steiner
498b7eed25 Rewrote the TensorBase::random method to support the generation of random number on gpu. 2014-10-09 15:39:13 -07:00
Benoit Steiner
767424af18 Improved the functors defined for standard reductions
Added a functor to encapsulate the generation of random numbers on cpu and gpu.
2014-10-09 15:36:23 -07:00
Gael Guennebaud
a80e17cfe8 Remove unused and dangerous CompressedStorage::Map function 2014-10-09 23:42:33 +02:00
Gael Guennebaud
349c2c9235 bug #367: fix double copies in atWithInsertion, and add respective unit-test 2014-10-09 23:35:49 +02:00
Gael Guennebaud
48d537f59f Fix indentation 2014-10-09 23:35:26 +02:00
Gael Guennebaud
538c059aa4 bug #887: fix CompressedStorage::reallocate wrt memory leaks 2014-10-09 23:35:05 +02:00
Gael Guennebaud
a48b82eece Add a scoped_array helper class to handle locally allocated/used arrays 2014-10-09 23:34:05 +02:00
Gael Guennebaud
ccd70ba123 Various numerical fixes in D&C SVD: I cannot make it fail with double, but still need to tune for single precision, and carefully test with duplicated singular values 2014-10-09 23:29:01 +02:00
Benoit Steiner
44beee9d68 Removed dead code 2014-10-08 14:14:20 -07:00
Benoit Steiner
0a07ac574e Added support for the *= and /* operators to TensorBase 2014-10-08 13:32:41 -07:00
Benoit Steiner
6c047d398d Fixed a comment 2014-10-08 13:29:36 -07:00
Gael Guennebaud
4b886e6b39 bug #889: fix protected typedef 2014-10-08 07:48:30 +02:00
Gael Guennebaud
5741349294 bug #882: fix various const-correctness issues with *View classes. 2014-10-07 18:29:28 +02:00
Gael Guennebaud
118b1113d9 Workaround MSVC issue. 2014-10-07 09:53:39 +02:00
Gael Guennebaud
503c176d8e Fix missing outer() member in DynamicSparseMatrix 2014-10-07 09:53:27 +02:00
Gael Guennebaud
dbdd8b0883 D&C SVD: add scaling to avoid overflow, fix handling of fixed size matrices 2014-10-06 19:35:57 +02:00
Gael Guennebaud
d44d432baa Re-enable products with triangular views of sparse matrices: we simply have to treat them as a sparse matrix. 2014-10-06 16:11:26 +02:00
Gael Guennebaud
893bfcf95f bug #887: use ei_declare_aligned_stack_constructed_variable instead of manual new[]/delete[] pairs in AMD and Paralellizer 2014-10-06 11:54:30 +02:00
Gael Guennebaud
fb53ff1eda Fix SparseLU regarding uncompressed inputs and avoid manual new/delete calls. 2014-10-06 11:42:31 +02:00
Gael Guennebaud
7a17639953 Extend unit tests to check uncompressed sparse inputs in sparse solvers 2014-10-06 11:41:50 +02:00
Benoit Steiner
bbce6fa65d define EIGEN_VECTORIZE_CUDA when compiling with nvcc 2014-10-03 19:55:35 -07:00
Benoit Steiner
95a430a2ca Vector primitives for CUDA 2014-10-03 19:45:19 -07:00
Benoit Steiner
152f3218ac Improved contraction test 2014-10-03 19:33:44 -07:00
Benoit Steiner
af2e5995e2 Improved support for CUDA devices.
Improved contractions on GPU
2014-10-03 19:18:07 -07:00
Benoit Steiner
1269392822 Created the IndexPair type to store pair of tensor indices. CUDA doesn't support std::pair so we can't use them when targeting GPUs.
Improved the performance on tensor contractions
2014-10-03 10:16:59 -07:00
Benoit Steiner
b7271dffb5 Generalized the gebp apis 2014-10-02 16:51:57 -07:00
Benoit Steiner
8b2afe33a1 Fixes for the forced evaluation of tensor expressions
More tests
2014-10-02 10:39:36 -07:00
Benoit Steiner
5cc23199be More tests to validate the const-correctness of the tensor code. 2014-10-02 10:30:44 -07:00
Benoit Steiner
7caaf6453b Added support for tensor reductions and concatenations 2014-10-01 20:38:22 -07:00
Benoit Steiner
1c236f4c9a Added tests for tensors of const values and tensors of stringswwq:: 2014-10-01 20:21:42 -07:00
Christoph Hertzberg
1fa6fe2abd template keyword not allowed before non-template function call 2014-10-01 14:33:55 +02:00
Konstantinos Margaritis
9d3c69952b fixed to make big-endian VSX work as well 2014-10-01 09:43:56 +00:00
Gael Guennebaud
5180bb5e47 Add missing default ctor in Rotation2D 2014-09-30 16:59:28 +02:00
Christoph Hertzberg
0187504912 Avoid `unneeded-internal-declaration' warning 2014-09-30 16:43:52 +02:00
Christoph Hertzberg
6d26deb894 Missing outerStride in AlignedVector3 resulted in infinite recursion 2014-09-30 16:43:19 +02:00
Christoph Hertzberg
81517eebc1 Missing explicit 2014-09-30 16:42:04 +02:00
Christoph Hertzberg
12d59465cb bug #884: Copy constructor of Ref shall never malloc, constructing from other RefBase shall only malloc if the memory layout is incompatible. 2014-09-30 14:57:54 +02:00
Christoph Hertzberg
e404841235 make sure that regex does not match cmake 2014-09-29 19:28:10 +00:00
Christoph Hertzberg
15c946338f Related to bug #880: Accept make as well a gmake when searching the MakeCommand. And don't include \n in match expression 2014-09-29 19:20:01 +02:00
Gael Guennebaud
56a0bbbbee Fix compilation with GCC 2014-09-29 18:28:18 +02:00
Gael Guennebaud
842e31cf5c Let KroneckerProduct exploits the recently introduced generic InnerIterator class. 2014-09-29 13:37:49 +02:00
Gael Guennebaud
abd3502e9e Introduce a generic InnerIterator classes compatible with evaluators. 2014-09-29 13:36:57 +02:00
Gael Guennebaud
76c3cf6949 Re-enable -Wshorten-64-to-32 compilation flag. 2014-09-29 10:33:16 +02:00
Georg Drenkhahn
bc34ee3365 Using Index type instead of hard coded int type to prevent potential implicit integer conversion. 2014-09-22 18:56:36 +02:00
Georg Drenkhahn
9a04cd307c Added implicit integer conversion by using explicit integer type conversion. Adding assert to catch overflow. 2014-09-22 18:47:33 +02:00
Gael Guennebaud
f0a62c90bc Avoid comparisons between different index types. 2014-09-29 10:27:51 +02:00
Georg Drenkhahn
2946992ad4 Using StorageIndexType for loop assigning initial permutation. Adding assert for index overflow. 2014-09-22 17:59:02 +02:00
Georg Drenkhahn
821ff0ecfb Using Index instead of hard coded int type to prevent potential implicit integer conversion 2014-09-22 16:12:35 +02:00
Georg Drenkhahn
2c4cace56c Using Index instead of hard coded int type to prevent potential implicit integer conversion 2014-09-22 15:54:34 +02:00
Georg Drenkhahn
8a502233d8 Correcting the ReturnType in traits<KroneckerProduct<>> to include the correct Index type.
Fixed mixup of types Rhs::Index and Lhs:Index in various loop variables.
Added explicit type conversion for arithmetic expressions which may return a wider type.
2014-09-21 23:19:29 +02:00
Georg Drenkhahn
b2755edcdd Replaced hard coded int types with Index types preventing implicit integer conversions. 2014-09-21 23:15:35 +02:00
Georg Drenkhahn
d1ef3c3546 Changed Diagonal::index() to return an Index type instead of int to prevent possible implicit conversion from long to int.
Added inline keyword to member methods.
2014-09-21 10:21:20 +02:00
Georg Drenkhahn
edaefeb978 Using Kernel::Index type instead of int to prevent possible implicit conversion from long to int. 2014-09-21 10:01:12 +02:00
Georg Drenkhahn
3bd31e21b5 Fixed compiler warning on implicit integer conversion by separating index type for matrix and permutation matrix which may not be equal. 2014-09-20 15:00:36 +02:00
Georg Drenkhahn
75e269c77b Fixed warning on implicit integer conversion in test case code by using type VectorXd::Index instead of int. 2014-09-20 14:57:42 +02:00
Gael Guennebaud
74cde0c925 Add missing return derived() in ArrayBase::operator= 2014-09-28 09:16:13 +02:00
Jitse Niesen
ce2035af86 New doc page on implementing a new expression class. 2014-09-27 23:25:58 +01:00
Konstantinos Margaritis
6d0f0b8cec add VSX identifier 2014-09-25 16:06:16 +00:00
Christoph Hertzberg
4ba8aa1482 Fix bug #884: No malloc for zero-sized matrices or for Ref without temporaries 2014-09-25 16:05:17 +02:00
Christoph Hertzberg
27d6b4daf9 Tridiagonalization::diagonal() and ::subDiagonal() did not work. Added unit-test 2014-09-24 14:37:13 +02:00
Gael Guennebaud
446001ef51 Fix nested_eval<Product<> > which wrongly returned a Product<> expression 2014-09-24 09:39:09 +02:00
Gael Guennebaud
13cbc751c9 bug #880: automatically preserves buildtool flags when modifying DartConfiguration.tcl file. 2014-09-23 22:10:32 +02:00
Christoph Hertzberg
421feea3b2 member_redux constructor is explicit too. Renamed some typedefs for more consistency. 2014-09-23 18:55:42 +02:00
Christoph Hertzberg
7817bc19a4 Removed FIXME, as it is actually necessary. 2014-09-23 17:23:34 +02:00
Christoph Hertzberg
eb13ada3aa Renamed CwiseInverseReturnType to InverseReturnType for ArrayBase::inverse() 2014-09-23 17:21:27 +02:00
Christoph Hertzberg
36448c9e28 Make constructors explicit if they could lead to unintended implicit conversion 2014-09-23 14:28:23 +02:00
Christoph Hertzberg
de0d8a010e Suppress stupid gcc-4.4 warning 2014-09-23 12:58:14 +02:00
Gael Guennebaud
72569f17ec bug #882: add const-correctness failtests for CwiseUnaryView, TriangularView, and SelfAdjointView. 2014-09-23 10:26:02 +02:00
Gael Guennebaud
3878e6f170 Add a true ctest unit test for failtests 2014-09-23 10:25:12 +02:00
Gael Guennebaud
ff46ec0f24 bug #881: make SparseMatrixBase::isApprox(SparseMatrixBase) exploits sparse computations instead of converting the operands to dense matrices. 2014-09-22 23:33:28 +02:00
Gael Guennebaud
ae514ddfe5 bug #880: manually edit the DartConfiguration.tcl file to get it working with cmake 3.0.x 2014-09-22 22:49:20 +02:00
Gael Guennebaud
f9d6d3780f bug #879: fix compilation of tri1=mat*tri2 by copying tri2 into a full temporary. 2014-09-22 17:34:17 +02:00
Gael Guennebaud
abba11bdcf Many improvements in Divide&Conquer SVD:
- Fix many numerical issues, in particular regarding deflation.
- Add heavy debugging output to help track numerical issues (there are still fews)
- Make use of Eiegn's apply-inplane-rotation feature.
2014-09-22 15:22:52 +02:00
Christoph Hertzberg
d9e0336a78 Merged in kmargar/eigen (pull request PR-84)
Add VSX support
2014-09-22 12:57:06 +02:00
Jitse Niesen
333905b0c2 Fix typos in docs for IterativeLinearSolvers module 2014-09-21 14:20:08 +01:00
Jitse Niesen
5fa69422a2 Fix copy-and-paste typo in SolveWithGuess assignment
This fixes compilation of code snippets in BiCGSTAB docs.
2014-09-21 14:19:23 +01:00
Konstantinos Margaritis
de38ff2499 prefetch are noops on VSX, actually disable the prefetch trait 2014-09-21 11:56:07 +00:00
Konstantinos Margaritis
60e093a9dc Merged eigen/eigen into default 2014-09-21 14:02:51 +03:00
Konstantinos Margaritis
56408504e4 fix compile error on big endian altivec 2014-09-21 13:59:30 +03:00
Konstantinos Margaritis
974fe38ca3 prefetch are noops on VSX 2014-09-21 11:24:30 +00:00
Konstantinos Margaritis
c0205ca4af VSX supports vec_div, implement where appropriate (float/doubles) 2014-09-21 08:12:22 +00:00
Konstantinos Margaritis
10f8aabb61 VSX port passes packetmath_[1-5] tests! 2014-09-20 22:31:31 +00:00
Jitse Niesen
80de35b6c5 Remove double return statement in PlainObjectBase::_set() 2014-09-19 22:05:18 +01:00
Konstantinos Margaritis
60663a510a 32-bit floats/ints, 64-bit doubles pass packetmath tests, complex 32/64-bit remaining 2014-09-19 21:05:01 +00:00
Gael Guennebaud
03dd4dd91a Unify unit test for BDC and Jacobi SVD. This reveals some numerical issues in BDCSVD. 2014-09-19 15:25:48 +02:00
Gael Guennebaud
0a18eecab3 bug #100: add support for explicit scalar to Array conversion (as enable implicit conversion is much more tricky) 2014-09-19 13:25:28 +02:00
Gael Guennebaud
7b044c0ead Added tag before-evaluators for changeset 9452eb38f8 2014-09-19 10:10:29 +02:00
Gael Guennebaud
755e77266f Fix SparseQR for row-major inputs. 2014-09-19 09:58:56 +02:00
Gael Guennebaud
07c5500d70 Introduce a compilation error when using the wrong InnerIterator type. 2014-09-19 09:58:20 +02:00
Gael Guennebaud
e70506dd8f Fix inner-stride of AlignedVector3 2014-09-18 22:46:46 +02:00
Gael Guennebaud
2ae20d558b Update KroneckerProduct wrt evaluator changes 2014-09-18 22:08:49 +02:00
Gael Guennebaud
62bce6e5e6 Make MatrixFunction use nested_eval instead of nested 2014-09-18 17:31:17 +02:00
Gael Guennebaud
060e835ee9 Add evaluator for the experimental AlignedVector3 2014-09-18 17:30:21 +02:00
Gael Guennebaud
0ca43f7e9a Remove deprecated code not used by evaluators 2014-09-18 15:15:27 +02:00
Gael Guennebaud
8b3be4907d log2(int) must be inlined. 2014-09-18 10:53:53 +02:00
Gael Guennebaud
0bf5894861 workaround one more shadowing issue with MSVC 2014-09-16 18:21:39 -07:00
Gael Guennebaud
e44d78dab3 workaround ambiguous call 2014-09-16 17:10:25 -07:00
Gael Guennebaud
c2f66c65aa workaround MSVC compilation issue (shadow issue) 2014-09-16 16:23:45 -07:00
Gael Guennebaud
125619146b workaround weird MSVC compilation issue: a typdedef in a base class shadows a template parameter of a derived class 2014-09-16 16:06:32 -07:00
Gael Guennebaud
341ae8665d avoid division by 0 2014-09-16 16:05:06 -07:00
Gael Guennebaud
fc23e93707 Add a portable log2 function for integers 2014-09-17 09:56:07 +02:00
Gael Guennebaud
0f0580b97c Remove not needed template keyword. 2014-09-17 09:55:44 +02:00
Gael Guennebaud
486ca277a0 Workaround MSVC ICE 2014-09-16 10:29:29 -07:00
Benoit Steiner
10a79ca3a3 Merged latest updates from the Eigen trunk. 2014-09-15 09:18:16 -07:00
Gael Guennebaud
466d6d41c6 Avoid a potential risk of recursive definition using traits to get he scalar type 2014-09-15 17:40:17 +02:00
Gael Guennebaud
8514179aa3 Fix traits<Quaternion>::IsAligned when using evaluators 2014-09-15 13:53:52 +02:00
Gael Guennebaud
0403d49006 Fix inverse unit test making sure we try to invert an invertible matrix 2014-09-14 20:12:07 +02:00
Gael Guennebaud
c83e01f2d6 Favor column major storage for inner products 2014-09-14 19:38:49 +02:00
Gael Guennebaud
26db954776 Re-enable aliasing checks when using evaluators 2014-09-14 19:06:08 +02:00
Gael Guennebaud
fda680f9cf Adapt changeset 51b3f558bb
to evaluators:
(Fix bug #822: outer products needed linear access, and add respective unit tests)
2014-09-14 18:31:29 +02:00
Gael Guennebaud
dfc54e1bbf Fix /= when using evaluator as in changeset 2d90484450 2014-09-14 18:27:48 +02:00
Gael Guennebaud
749b56f6af merge with default branch 2014-09-14 17:34:54 +02:00
Gael Guennebaud
af9c9f7706 Fix comparison to block size 2014-09-14 17:33:39 +02:00
Jitse Niesen
9452eb38f8 Make UpperBidiagonalization accept row-major matrices (bug #769)
* Give temporary workspace the same storage order as original matrix
* Take storage order into account when determining inner stride
  of rows and columns
* Change one test to use a row-major matrix.
2014-09-12 14:52:35 +01:00
Konstantinos Margaritis
470aa15c35 First time it compiles, but fails to pass the tests. 2014-09-09 16:58:48 +00:00
Gael Guennebaud
188a13f9fe Fix compilation of coeff(Index) on sub-inner-panels 2014-09-08 09:50:03 +02:00
Benoit Steiner
efdff15749 Fixed a typo in the contraction code 2014-09-06 13:28:24 -07:00
Gael Guennebaud
dacd39ea76 Exploit sparse structure in naiveU and naiveV when updating them. 2014-09-05 17:51:46 +02:00
Benoit Steiner
74db22455a Misc fixes. 2014-09-05 07:47:43 -07:00
Gael Guennebaud
b23556bbbd Oops, a block size of 1 is not very useful, set it to 48 as in HouseholderQR 2014-09-05 08:50:50 +02:00
Benoit Steiner
1abe4ed14c Created more regression tests 2014-09-04 20:27:28 -07:00
Benoit Steiner
d43f737b4a Added support for evaluation of tensor shuffling operations as lvalues 2014-09-04 20:02:28 -07:00
Benoit Steiner
f50548e86a Added missing tensor copy constructors. As a result it is now possible to declare and initialize a tensor on the same line, as in:
Tensor<bla> T = A + B;  or
  Tensor<bla> T(A.reshape(new_shape));
2014-09-04 19:50:27 -07:00
Gael Guennebaud
15bad3670b Apply Householder U and V in-place. 2014-09-04 09:17:01 +02:00
Gael Guennebaud
8846aa6d1b Optimization: enable cache-efficient application of HouseholderSequence. 2014-09-04 09:15:59 +02:00
Gael Guennebaud
80993b95d3 Disable a test which had never worked without evalautors 2014-09-03 22:56:39 +02:00
Benoit Steiner
b24fe22b1a Improved the performance of the tensor convolution code by a factor of about 4. 2014-09-03 11:38:13 -07:00
Gael Guennebaud
c82dc227f1 Cleaning in BDCSVD (formating, handling of transpose case, remove some for loops) 2014-09-03 10:15:24 +02:00
Gael Guennebaud
a96f3d629c Clean bdcsvd 2014-09-02 22:30:23 +02:00
Gael Guennebaud
47829e2d16 Disable solve_ret_val like mechanism with evaluator enabled 2014-09-01 18:32:59 +02:00
Gael Guennebaud
1f398dfc82 Factorize *SVD::solve to SVDBase 2014-09-01 18:31:54 +02:00
Gael Guennebaud
b3a0365429 merge with default branch 2014-09-01 18:21:01 +02:00
Gael Guennebaud
eb39296028 Reafctoring in D&C SVD unsupported module: clean and merge the SVDBase class to Eigen/SVD, rm copy/pasted JacobiSVD.h file 2014-09-01 18:16:20 +02:00
Gael Guennebaud
72c4f8ca8f Disable a few unit tests in unsupported 2014-09-01 17:35:58 +02:00
Gael Guennebaud
b121eecf60 Fix regression is sparse-sparse product 2014-09-01 17:34:55 +02:00
Gael Guennebaud
8754341848 Fix remaining garbage during a merge. 2014-09-01 17:25:13 +02:00
Gael Guennebaud
daad9585a3 Fix Kronecker product in legacy mode. 2014-09-01 17:24:07 +02:00
Gael Guennebaud
b051bbd64f Make unsupport sparse solvers use SparseSolverBase 2014-09-01 17:21:47 +02:00
Gael Guennebaud
b3d63b4db2 Add evaluator for DynamicSparseMatrix 2014-09-01 17:21:05 +02:00
Gael Guennebaud
1c4b69c5fb Factorize solveWithGuess in IterativeSolverBase 2014-09-01 17:19:51 +02:00
Gael Guennebaud
8a74ce922c Make IncompleteLUT use SparseSolverBase. 2014-09-01 17:19:16 +02:00
Gael Guennebaud
863b7362bc Fix usage of m_isInitialized in SparseLU and Pastix support. 2014-09-01 17:16:32 +02:00
Gael Guennebaud
1bf3b34849 Fix regression in sparse-sparse product 2014-09-01 17:15:08 +02:00
Gael Guennebaud
f9580a3473 Fix Cholmod support without evaluators 2014-09-01 17:14:30 +02:00
Gael Guennebaud
fbb53b6cbb Fix sparse matrix times sparse vector. 2014-09-01 16:53:52 +02:00
Gael Guennebaud
85c7659574 Refactoring of sparse solvers through a SparseSolverBase class and usage of the Solve<> expression. Introduce a SolveWithGuess expression on top of Solve. 2014-09-01 15:00:19 +02:00
Gael Guennebaud
bc065c75d2 Implement the missing bits to make Solve compatible with sparse rhs 2014-09-01 14:50:59 +02:00
Gael Guennebaud
e6cc24cbd6 Fix compilation in legacy mode 2014-09-01 14:20:11 +02:00
Gael Guennebaud
0369db12af bug #871: fix compilation on ARM/Neon regarding __has_builtin usage 2014-09-01 10:52:58 +02:00
Konstantinos Margaritis
7ff266e3ce Initial VSX commit 2014-08-29 20:03:49 +00:00
Gael Guennebaud
b4a709520d merge 2014-08-29 15:31:54 +02:00
Gael Guennebaud
c1d0f15bde Enable evaluators by default 2014-08-29 15:31:32 +02:00
Gael Guennebaud
124d12a915 merge default branch 2014-08-29 15:20:31 +02:00
Gael Guennebaud
f29dbec321 undef Unsable macro 2014-08-29 15:12:03 +02:00
Gael Guennebaud
01f3ca3e8d Merged in georg_drenkhahn/eigen/georg_d/fix_warn_minmax (pull request PR-81)
Fix for warning on macro definitions of max() and min() in test.h
2014-08-29 14:34:00 +02:00
Gael Guennebaud
aec3d90ca6 Optimization in sparse-sparse matrix products for small ones 2014-08-29 14:19:03 +02:00
Gael Guennebaud
460662cbcc Fix SparseVector::coeffRef(i,j) and add missing SparseVector::insert*Unordered 2014-08-29 14:18:23 +02:00
Gael Guennebaud
1ed9e2d004 In sparse matrix product, enable sorted insertion when doing two transposition is defenitely not optimal. 2014-08-29 11:55:03 +02:00
Georg Drenkhahn
e49e84d979 Added missing STL include of <list> in main.h
Removed duplicated include of <sstream>
Added comments on the background of min/max macro definitions and STL header includes
2014-08-29 10:41:05 +02:00
Freddie Witherden
c3e4080474 Allow LevenbergMarquardt to work with non-standard types. 2014-08-27 15:24:51 +01:00
Benoit Steiner
2959045f2f Optimized the tensor padding code. 2014-08-26 09:47:18 -07:00
Benoit Steiner
36fffe48f7 Misc api improvements and cleanups 2014-08-23 14:35:41 -07:00
Benoit Steiner
fb5c1e9097 Optimized and cleaned up the tensor morphing code 2014-08-23 13:18:30 -07:00
Georg Drenkhahn
0ba490cf80 Fixed CMakeLists.txt files to prevent CMake 3.0.0 warnings about deprecated LOCATION target property.
Small whitespace cleanup in CMakelLists.txt.
2014-08-22 12:13:07 +02:00
Gael Guennebaud
25a3e65a68 In SparseQR, calling factorize() without analyzePattern() was broken. 2014-08-26 23:32:32 +02:00
Gael Guennebaud
be3477e206 bug #857: workaround MSVC compilation issue. 2014-08-26 12:52:29 +02:00
Gael Guennebaud
2e50289ba3 bug #861: enable posix_memalign with PGI 2014-08-26 12:54:19 +02:00
Gael Guennebaud
b49ef99617 Do not apply the preconditioner before starting the iterations as this might destroy a very good initial guess. 2014-08-21 22:14:25 +02:00
Gael Guennebaud
9c0aa81fbf bug #854: fix numerical issue in SelfAdjointEigenSolver::computeDirect for 3x3 matrices. The tolerance to detect stable cross products was too optimistic.
Add respective unit tests.
2014-08-21 10:49:09 +02:00
Benoit Steiner
3d298da269 Added support for broadcasting 2014-08-20 17:00:50 -07:00
Christoph Hertzberg
eeadc06e83 EIGEN_EXCEPTIONS was not defined in test/main.h, therefore all VERIFY_RAISES_ASSERT tests were not enabled 2014-08-20 16:39:25 +02:00
Christoph Hertzberg
4403800e11 Merged in vladimir_ch/eigen-1/vladimir_ch/fix-uninitialized-variable-warning-in-sp-1408513228472 (pull request PR-77)
Fix uninitialized variable warning in SparseQR
2014-08-20 11:12:18 +02:00
Christoph Hertzberg
9062f74d13 Merged in traversaro/eigen/traversaro/findeigen3cmake-add-reading-hints-of-eig-1407426517521 (pull request PR-76)
FindEigen3.cmake: Add reading hints of Eigen directory location from environment variables EIGEN3_ROOT and EIGEN3_ROOT_DIR .
2014-08-20 11:11:52 +02:00
Vladimir Chalupecky
6a3423da80 Fix uninitialized variable warning in SparseQR 2014-08-20 05:42:22 +00:00
Benoit Steiner
9ac3c821ea Improved the speed of convolutions when running on cuda devices 2014-08-19 16:57:10 -07:00
Benoit Steiner
33c702c79f Added support for fast integer divisions by a constant
Sped up tensor slicing by a factor of 3 by using these fast integer divisions.
2014-08-14 22:13:21 -07:00
Benoit Steiner
756292f8aa Fixed compilation errors 2014-08-14 00:32:59 -07:00
Benoit Steiner
8c8db49331 Added a few regression tests 2014-08-14 00:25:22 -07:00
Benoit Steiner
eeb43f9e2b Added support for padding, stridding, and shuffling 2014-08-14 00:22:47 -07:00
Benoit Steiner
16047c8d4a Pulled in the latest changes from the Eigen trunk 2014-08-13 22:25:29 -07:00
Benoit Steiner
916ef48846 Added ability to get the nth element from an abstract array type. 2014-08-13 08:44:47 -07:00
Benoit Steiner
f1d8c13dbc Fixed misc typos. 2014-08-13 08:40:26 -07:00
Benoit Steiner
9faad2932f Added missing apis. 2014-08-13 08:36:33 -07:00
Benoit Steiner
f8fad09301 Updated the convolution and contraction evaluators to follow the new EvalSubExprsIfNeeded apu. 2014-08-13 08:33:18 -07:00
Benoit Steiner
72e7529708 Fixed a typo. 2014-08-13 08:29:40 -07:00
Benoit Steiner
1aa2bf8274 Support for in place evaluation of expressions containing slicing and reshaping operations 2014-08-13 08:27:58 -07:00
Benoit Steiner
b1892ab14d Added suppor for in place evaluation to simple tensor expressions.
Use mempy to speedup tensor copies whenever possible.
2014-08-13 08:26:44 -07:00
Benoit Steiner
439feca139 Reworked the TensorExecutor code to support in place evaluation. 2014-08-13 08:22:05 -07:00
Kevin Locke
e6d55c081b Fix bug #852: define Traits type in general_matrix_matrix_product when EIGEN_USE_BLAS is defined 2014-08-08 04:05:28 -04:00
Gael Guennebaud
57f71a5552 Update bench_norm utility 2014-09-11 10:27:46 +02:00
Gael Guennebaud
5e890d3ad7 Improve further the accuracy of JacobiSVD wrt under/overflow while improving speed for small matrices (hypot is very slow). 2014-09-10 23:11:58 +02:00
Gael Guennebaud
2d90484450 mat/=scalar was transformed into mat*=(1/scalar) thus laking accuracy. This was also inconsistent with mat = mat/scalar. 2014-09-10 23:10:01 +02:00
Gael Guennebaud
84a7ead059 Add one more regression test for bug #791. 2014-09-10 11:59:45 +02:00
Gael Guennebaud
d6236d3b26 Fix bug #791: infinite loop in JacobiSVD in the presence of NaN. 2014-09-10 11:54:20 +02:00
Gael Guennebaud
921a645481 ArrayWrapper and MatrixWrapper classes should not be nested by reference. 2014-09-10 10:33:19 +02:00
Yan Zhou
4b678b96eb fix for MKL_BLAS not defined in MKL 11.2 2014-09-08 17:37:58 +08:00
Gael Guennebaud
51b3f558bb Fix bug #822: outer products needed linear access, and add respective unit tests 2014-09-08 10:21:22 +02:00
Gael Guennebaud
6162672dc5 Runtime alignement is not possible if AlignedOnScalar is not true (e.g., for complex<double>) 2014-09-08 10:04:26 +02:00
Gael Guennebaud
e54898f53e bug #619: workaround MSVC 2008 implementing std::abs for int only on WINCE 2014-09-07 23:02:30 +02:00
Gael Guennebaud
fafc829424 bug #804: copy group__TopicUnalignedArrayAssert.html to TopicUnalignedArrayAssert.html as the second is linked to by old Eigen versions. 2014-09-07 22:38:09 +02:00
Jitse Niesen
abb33258ce Doc: difference between array and matrix cosine etc (bug #830) 2014-09-06 14:59:44 +01:00
Jitse Niesen
25bceefb4e Replace asm by __asm__ (bug #873) 2014-09-06 11:47:24 +01:00
Gael Guennebaud
60314beb38 Update reference value for testNistLanczos1 test 2014-09-02 17:35:11 +02:00
Gael Guennebaud
280661e67d Remove LM::sqrt_() member function in favor of a shortcut for sqrt(epsilon()) 2014-09-02 17:29:06 +02:00
Gael Guennebaud
ff9bfc45f7 relax some LM unit tests 2014-09-02 17:10:17 +02:00
Gael Guennebaud
42e27d41a2 Fix hypot() and hypotNorm() wrt NaN and INF values. 2014-09-02 16:09:39 +02:00
Gael Guennebaud
a44a343f03 Fix blueNorm wrt NaN/INF. 2014-09-02 15:06:24 +02:00
Gael Guennebaud
18fbe7e7d4 Fix stableNorm() with respect to NaN and inf, and add respective unit tests. blueNorm() and hypotNorm() are broken wrt to NaN/inf 2014-09-02 14:49:23 +02:00
Gael Guennebaud
3eb5253ca1 Optimization: "matrix<complex> * real" did not call the special path and the real was converted to a complex. Add respective unit test to avoid future regression. 2014-09-02 14:41:14 +02:00
Gael Guennebaud
305aa1f9c5 Add examples for hnormalized and homogenous (fix bug #846) 2014-09-02 10:47:40 +02:00
Silvio Traversaro
50085d2c28 FindEigen3.cmake: Add reading hints of Eigen directory location from environment variables EIGEN3_ROOT and EIGEN3_ROOT_DIR . 2014-08-07 15:48:53 +00:00
Gael Guennebaud
e51da9c3a8 Memory allocated on the stack is freed at the function exit, so reduce iteration count to avoid stack overflow 2014-08-04 12:46:00 +02:00
Kolja Brix
953ec08089 Correct GMRES:
* Fix error in calculation of residual at restart.
* Use relative residual as stopping criterion.
* Improve documentation.
2014-08-02 18:39:15 +02:00
Gael Guennebaud
3e59163a24 Fix bug #850: workaround MSVC 2008 weird compilation bug 2014-08-02 02:47:30 +02:00
Gael Guennebaud
4dd55a2958 Optimize reduxions for Homogeneous 2014-08-01 17:00:20 +02:00
Gael Guennebaud
f25338f4d7 Fix nesting of Homogenous evaluator 2014-08-01 16:49:44 +02:00
Gael Guennebaud
51357a6622 Fix geo_orthomethods unit test for complexes 2014-08-01 16:26:23 +02:00
Gael Guennebaud
107bb308c3 Fix various small issues detected by gcc 2014-08-01 16:24:23 +02:00
Gael Guennebaud
c2ff44cbf3 Make assignment from general EigenBase object call evaluator, and support dense X= sparse 2014-08-01 16:23:30 +02:00
Benjamin Chrétien
c53f88297c Fix more typos in Ref.h (doc). 2014-08-01 15:43:47 +02:00
Benjamin Chrétien
6f58a41097 Fix typos in Ref.h (doc). 2014-08-01 15:35:45 +02:00
Gael Guennebaud
2a3c3c49a1 Fix numerous nested versus nested_eval shortcomings 2014-08-01 14:48:22 +02:00
Gael Guennebaud
fc13b37c55 Make cross product uses nested/nested_eval 2014-08-01 14:47:33 +02:00
Benjamin Chrétien
db76193bc7 Fix typo in PermutationMatrix (doc). 2014-08-01 14:41:49 +02:00
Benoit Steiner
647622281e The tensor assignment code now resizes the destination tensor as needed. 2014-07-31 17:39:04 -07:00
Gael Guennebaud
d79516660c Make loadMarket use the sparse-matrix index type, thus enabling loading huge matrices. 2014-07-31 16:43:19 +02:00
Gael Guennebaud
26d2cdefd4 Fix 4x4 inverse via SSE for submatrices 2014-07-31 16:24:29 +02:00
Gael Guennebaud
db183ca7b3 Make minimal changes to make homogenous compatible with evaluators 2014-07-31 14:54:54 +02:00
Gael Guennebaud
702a3c17db Make Transform exposes sizes: Dim+1 x Dim+1 for projective transform, and Dim x Dim+1 for all others 2014-07-31 14:54:00 +02:00
Gael Guennebaud
5f5a8d97c0 Re-enable main unit tests which are now compiling and running fine with evaluators 2014-07-31 13:43:19 +02:00
Gael Guennebaud
bae2e3327b Call product_generic_impl by default, and remove lot of boilerplate code 2014-07-31 13:35:49 +02:00
Gael Guennebaud
cd0ff253ec Make permutation compatible with sparse matrices 2014-07-30 15:22:50 +02:00
Gael Guennebaud
929e77192c Various minor fixes 2014-07-30 11:39:52 +02:00
Gael Guennebaud
ba694ce8cf add missing delete operator overloads 2014-07-30 09:32:35 +02:00
Benoit Steiner
2116e261fb Made sure that the data stored in fixed sized tensor is aligned. 2014-07-25 09:47:59 -07:00
Jitse Niesen
5f3d542b8a Fix typo in MatrixExponential noticed by Markos. 2014-07-25 13:34:03 +01:00
Gael Guennebaud
a0a87410d0 Fix bug #61: gemm was broken since we changed the blocking order 2014-07-24 22:08:10 +02:00
Konstantinos Margaritis
2c625ec9ba Simplification of some Altivec constants, reuse existing constants and avoid loading from RAM esp in the case of p16uc_COMPLEX_TRANSPOSE* 2014-07-22 20:46:03 +00:00
Benoit Steiner
1f371e78e6 Added a few tests to validate the behavior of the assignment operator. 2014-07-22 10:32:40 -07:00
Benoit Steiner
f7bb7ee3f3 Fixed the assignment operator of the Tensor and TensorMap classes. 2014-07-22 10:31:21 -07:00
Gael Guennebaud
d1e9f39a9a Ambiguous call fixes for clang. 2014-07-22 18:28:19 +02:00
Gael Guennebaud
7f15f27a9e Workaround ambiguous call of init1 with MSVC. 2014-07-22 17:01:34 +02:00
Gael Guennebaud
922694a2d1 Extend fixed-size ctor unit test and fix conversion warning. 2014-07-22 16:57:14 +02:00
Gael Guennebaud
baa77ffe38 Fix max sizes at compile time of DiagonalWrapper 2014-07-22 16:13:56 +02:00
Christoph Hertzberg
a8283e0ed2 Define EIGEN_TRY, EIGEN_CATCH, EIGEN_THROW as suggested by Moritz Klammer.
Make it possible to run unit-tests with exceptions disabled via EIGEN_TEST_NO_EXCEPTIONS flag.
Enhanced ctorleak unit-test
2014-07-22 13:16:44 +02:00
Gael Guennebaud
4aac87251f Re-enable a couple of unit tests with evaluators. 2014-07-22 12:54:03 +02:00
Gael Guennebaud
6daa6a0d16 Refactor TriangularView to handle both dense and sparse objects. Introduce a glu_shape<S1,S2> helper to assemble sparse/dense shapes with triagular/seladjoint views. 2014-07-22 11:35:56 +02:00
Gael Guennebaud
2a251ffab0 Implement evaluator for sparse-selfadjoint products 2014-07-22 09:32:40 +02:00
Gael Guennebaud
9b729f93a1 Resizing is done by call_assignment_noalias, so no need to perform it when dealing with aliasing. 2014-07-21 11:46:47 +02:00
Gael Guennebaud
946b99dd5c Extend qr unit test 2014-07-21 11:45:54 +02:00
Gael Guennebaud
50eef6dfc3 Compilation fixes 2014-07-20 15:16:34 +02:00
Gael Guennebaud
62f332fc04 Make sure we evaluate into temporaries matching evaluator storage order requirements 2014-07-19 15:19:10 +02:00
Gael Guennebaud
3eba5e1101 Implement evaluator for sparse outer products 2014-07-19 14:55:56 +02:00
Moritz Klammler
529e6cb552 Applied changes suggested by Christoph Hertzberg to c'tor leak fix.
- Enclose exception handling in '#ifdef EIGEN_EXCEPTIONS'.
- Use an object counter to demonstrate the bug more readily.
2014-07-18 23:19:56 +02:00
Gael Guennebaud
36e6c9064f bug #770: fix out of bounds access 2014-07-18 14:19:18 +02:00
Gael Guennebaud
a325d1cb1e merge with default branch 2014-07-18 11:02:22 +02:00
Gael Guennebaud
da62eb22e4 bug #843: fix jacobisvd for complexes and extend respective unit test to chack with random tricky matrices 2014-07-17 17:09:15 +02:00
Gael Guennebaud
77af4cc3c9 bug #397: add a warning for 64 to 32 bit integer conversion and fix many of these warning by splitting the index type used for storage and as size/coefficient indexes in PermutationMatrix and Transpositions. 2014-07-17 13:34:26 +02:00
Gael Guennebaud
5e72151ca5 bug #842: warn user about MPFR++ being under the GPL 2014-07-17 12:06:20 +02:00
Gael Guennebaud
2cd38a6634 merge 2014-07-17 12:01:55 +02:00
Gael Guennebaud
84ad8ce7e3 Fix bug #770: workaround thread safety in mpreal 2014-07-17 12:00:56 +02:00
Gael Guennebaud
40b74411e4 bug #842: update mpreal copy (fix compilation with clang) 2014-07-17 11:59:51 +02:00
Christoph Hertzberg
14c8793a70 Remove unnecessary <bench/BenchTimer.h>include 2014-07-17 11:14:14 +02:00
Gael Guennebaud
424c3ad266 bug #842: fix specialized product for mpreal 2014-07-17 09:41:33 +02:00
Gael Guennebaud
a53f2b0e43 bug #838: add unit test for fill-in in sparse outer product and fix abusive fill-in. 2014-07-16 17:00:54 +02:00
Christoph Hertzberg
cd0b433540 Regression test for bug #714.
Note that the bug only occurs on some compilers and is not fixed yet
2014-07-16 15:41:10 +02:00
Gael Guennebaud
338d2ec42b bug #826: fix is_convertible for MSVC and add minimalistic unit test for is_convertible 2014-07-16 13:17:06 +02:00
Konstantinos Margaritis
0a945687b7 Added HasDiv=1 to Altivec PacketMath.h, now vectorization_logic test passes.
Added comments to the constants, indicative of the actual values
2014-07-15 11:02:51 +00:00
Gael Guennebaud
a0d1aac6c5 Extend unit test of dense triangular solvers 2014-07-15 11:15:36 +02:00
Gael Guennebaud
2bdb3b1afd Extend dense*sparse product unit tests 2014-07-15 11:00:16 +02:00
Gael Guennebaud
3c7686630d merge with default branch 2014-07-15 10:55:03 +02:00
Christoph Hertzberg
4f440b8123 Test vectorization logic for int 2014-07-14 14:36:20 +02:00
Gael Guennebaud
a20e2462bf Fix bug #838: detect outer products from either the lhs or rhs 2014-07-11 17:15:26 +02:00
Gael Guennebaud
c0f76ce2cf Fix bug #838: fix dense * sparse and sparse * dense outer products 2014-07-11 16:25:36 +02:00
Gael Guennebaud
df604e4f49 Fix inner iterator on an outer-vector 2014-07-11 16:24:49 +02:00
Hauke Heibel
5f1eedd655 Merged in complexzeros/eigen (pull request PR-69)
Added Spline interpolation with derivatives.
2014-07-11 12:03:10 +02:00
Gael Guennebaud
296cb40161 merge with default branch 2014-07-10 22:04:45 +02:00
Benoit Steiner
40bb98e76a Added primitives to compare tensor dimensions 2014-07-10 11:29:51 -07:00
Benoit Steiner
9b7a6f0122 Added tests for tensor slicing 2014-07-10 11:27:27 -07:00
Benoit Steiner
ffd3654f67 Vectorized the evaluation of expressions involving tensor slices. 2014-07-10 11:09:46 -07:00
Jeff
b1169ce40c Fixed index that would cause crash with two point, two derivative interpolation. Added static_cast. 2014-07-10 12:03:42 -06:00
Christoph Hertzberg
d1460d9278 stride must be DenseIndex not int 2014-07-10 16:23:20 +02:00
Christoph Hertzberg
cf7cf7b490 Backed out of changeset 6089:f27f55bee3efc2cafd01cb07d3faadf7eb490f66
Unfortunately this breaks things at other places
2014-07-10 16:12:13 +02:00
Christoph Hertzberg
f27f55bee3 Make MatrixBase::makeHouseholder resize its output vector if it is zero 2014-07-10 14:59:18 +02:00
Kolja Brix
e955725ff1 Fix GMRES: Initialize essential Householder vector with correct dimension. Add check if initial guess is already a sufficient approximation. 2014-07-10 08:20:55 +02:00
Benoit Steiner
25b2f6624d Improved the speed of slicing operations. 2014-07-09 12:48:34 -07:00
Gael Guennebaud
23bb592a2d Fix unit test when using 80bits FPU 2014-07-09 17:21:16 +02:00
Christoph Hertzberg
75d19bb087 Determine version of Metis library. Apparently, at least version 5.x is needed for Eigen/MetisSupport.
Marked some internal variables as advanced
2014-07-09 16:54:15 +02:00
Gael Guennebaud
62f948c56a Generalize unit testing of pscatter 2014-07-09 16:01:24 +02:00
Gael Guennebaud
da1e356306 Merged in jdh8/eigen (pull request PR-72)
Fix bug #839
2014-07-09 13:07:39 +02:00
Gael Guennebaud
54fbbe7b4e Add unit test for bug #839. 2014-07-09 13:06:06 +02:00
Benoit Steiner
ea0906dfd8 Improved evaluation of tensor expressions when used as rvalues 2014-07-08 16:43:28 -07:00
Benoit Steiner
cc1bacea5b Improved the efficiency of the tensor evaluation code on thread pools and gpus. 2014-07-08 16:39:28 -07:00
Benoit Steiner
c285fda7f4 Extended the functionality of the TensorDeviceType classes 2014-07-08 16:30:48 -07:00
Chen-Pang He
1967e7f2f3 Fix bug #839 2014-07-09 03:32:32 +08:00
Gael Guennebaud
77d57cd681 bug #808: fix implicit conversions from int/longint to float/double 2014-07-08 19:07:58 +02:00
Gael Guennebaud
e3557e8dd2 bug #808: use double instead of float for the increasing size ratio in CompressedStorage::resize
(grafted from 0e0ae40084
)
2014-07-08 18:58:41 +02:00
Gael Guennebaud
5214466b7a Fix implicit long to int conversions in blas interface 2014-07-08 19:01:49 +02:00
Gael Guennebaud
5c4733f6e4 Fix bug #809: unused variable warning 2014-07-08 18:38:34 +02:00
Gael Guennebaud
b47ef1431f Fix many long to int implicit conversions 2014-07-08 16:47:11 +02:00
Christoph Hertzberg
e25e674852 bug #837: Always re-align the result of EIGEN_ALLOCA. 2014-07-08 13:57:26 +02:00
Gael Guennebaud
4b6b76463a Merged in jdh8/eigen (pull request PR-71)
Find benchmark opponents more aggressively
2014-07-08 13:13:16 +02:00
Gael Guennebaud
904509fbb6 Move using std::abs from Eigen's namespace to function scope. 2014-07-08 10:28:09 +02:00
Gael Guennebaud
0dfb73d46a Fix LDLT with semi-definite complex matrices: owing to round-off errors, the diagonal was not real. Also exploit the fact that the diagonal is real in the rest of LDLT 2014-07-08 10:04:27 +02:00
Gael Guennebaud
7fa83e7374 Fix LDLT with semi-definite complex matrices: owing to round-off errors, the diagonal was not real. Also exploit the fact that the diagonal is real in the rest of LDLT 2014-07-08 09:56:09 +02:00
Benoit Steiner
7d53633e05 Added support for tensor slicing 2014-07-07 14:10:36 -07:00
Benoit Steiner
bc072c5cba Added support for tensor slicing 2014-07-07 14:08:45 -07:00
Benoit Steiner
47981c5925 Added support for tensor slicing 2014-07-07 14:07:57 -07:00
Chen-Pang He
1eefa5a841 Find benchmark opponents also in /usr/lib64 2014-07-07 22:55:28 +08:00
Chen-Pang He
e4b6979334 Find OpenBLAS more aggressively. This made a difference on Fedora 20 2014-07-07 21:32:33 +08:00
Chen-Pang He
b9ee880f07 chmod -x Eigen/src/Core/GenericPacketMath.h 2014-07-07 21:28:00 +08:00
Chen-Pang He
2bf58316ee Fix dox at internal::tridiagonal_qr_step 2014-07-06 13:49:43 +08:00
Chen-Pang He
85777fc131 Mark internal namespace as \internal 2014-07-06 13:45:54 +08:00
Moritz Klammler
58687aa5e6 Avoid memory leak when constructor of user-defined type throws exception.
The added check `ctorleak.cpp` demonstrates how the leak can be reproduced.
The test appears to pass but it is leaking the storage of the (not created)
matrix.  I don't know how to make this test fail in the existing test suite but
you can run it through Valgrind (or another debugger) to verify the leak.

    $ ./check.sh ctorleak && valgrind --leak-check=full ./test/ctorleak

This patch fixes this leak by adding some try-catch-delete-rethrow blocks to
`Eigen/src/Core/util/Memory.h`.
2014-07-06 06:58:13 +02:00
Gael Guennebaud
339f14b8d1 bug #826: document caveats in 1x1 and 2x1 constructors. 2014-07-21 13:43:48 +02:00
Gael Guennebaud
d4cc1bdc7f Make the ordering method of SimplicialL[D]LT user configurable. 2014-07-20 14:22:58 +02:00
Gael Guennebaud
8e19027130 bug #826: fix 64 to 32 bits conversion warning when calling Matrix<int,1,1>(long) 2014-07-20 14:03:22 +02:00
Christoph Hertzberg
ef4a86d6b8 Fix trivial warnings in MPRealSupport 2014-07-18 16:39:58 +02:00
Christoph Hertzberg
68eafc10b1 Add note to EIGEN_DONT_PARALLELIZE into preprocessor documentation page (requested in IRC) 2014-07-18 15:42:12 +02:00
Christoph Hertzberg
1cb71a8782 bug #138: Make building of internal documentation configurable via cmake flag 2014-07-18 14:34:58 +02:00
Gael Guennebaud
ac1bb3e5b3 bug #770: fix out of bounds access 2014-07-18 14:22:33 +02:00
Chen-Pang He
4860da2de1 Percent "Eigen" in dox to prevent linking if not referring to the Eigen namespace 2014-07-05 23:01:27 +08:00
Chen-Pang He
7a915f6846 Move Doxygen-only stuff to *.dox 2014-07-05 22:41:58 +08:00
Chen-Pang He
1a817d3b70 Document internal namespace 2014-07-05 21:50:05 +08:00
Chen-Pang He
8ee38d2db6 Fix dox for namespaces 2014-07-05 21:48:48 +08:00
Christoph Hertzberg
f365380496 Fix regression introduced by 3117036b80
:
Matrix<Scalar,1,1>(int) did not compile if Scalar is not constructible from int. Now this falls back to the (Index size) constructor.
2014-07-04 12:52:55 +02:00
Christoph Hertzberg
3a9f9faada Fix unused typedef warning 2014-07-04 12:48:24 +02:00
Gael Guennebaud
998455a570 LDLT is not rank-revealing, so we should not attempt to use the biggest diagonal elements as thresholds. 2014-07-02 23:04:46 +02:00
Gael Guennebaud
0a8e4712d1 Do not attempt to include <intrin.h> on Windows CE 2014-07-02 16:13:05 +02:00
Gael Guennebaud
61b88d2feb merge with default branch 2014-07-02 09:35:37 +02:00
Gael Guennebaud
bf334b8ae5 Fix regeression in bicgstab: the threshold used to detect the need for a restart was much too large. 2014-07-01 22:29:04 +02:00
Gael Guennebaud
8f4cdbbc8f Fix typo in dense * diagonal evaluator. 2014-07-01 18:04:30 +02:00
Gael Guennebaud
7390af91b6 Implement evaluators for sparse*dense products 2014-07-01 17:53:18 +02:00
Gael Guennebaud
1e6f53e070 Use DiagonalShape as the storage kind of DiagonalBase<>. 2014-07-01 17:52:58 +02:00
Gael Guennebaud
6f846ef9c6 Split StorageKind promotion into two helpers: one for products, and one for coefficient-wise operations. 2014-07-01 17:51:53 +02:00
Christoph Hertzberg
324e7e8fc9 Removed the deprecated EIGEN2_SUPPORT, as previously announced. A compilation error is raised, if this compile-switch is defined. The documentation references to the corresponding pages from Eigen3.2 now. Also, the Eigen2 testsuite has been removed. 2014-07-01 16:58:11 +02:00
Gael Guennebaud
3c63446507 Update copyright dates 2014-07-01 13:27:35 +02:00
Gael Guennebaud
746d2db6ed Implement evaluators for sparse * sparse with auto pruning. 2014-07-01 13:18:56 +02:00
Gael Guennebaud
441f97b2df Implement evaluators for sparse * sparse products 2014-07-01 11:50:20 +02:00
Gael Guennebaud
0ad7a644df Implement nonZeros() for Transpose<sparse> 2014-07-01 11:49:46 +02:00
Gael Guennebaud
7ffd55c980 Do not bypass aliasing in sparse e assignments 2014-07-01 11:48:49 +02:00
Gael Guennebaud
75e574275c Fix bug #836: extend SparseQR to support more columns than rows. 2014-07-01 10:24:46 +02:00
Gael Guennebaud
c401167712 Fix double constructions of the nested CwiseBinaryOp evaluator in sparse*diagonal product iterator. 2014-06-27 16:41:45 +02:00
Gael Guennebaud
73e686c6a4 Implement evaluators for sparse times diagonal products. 2014-06-27 15:54:44 +02:00
Gael Guennebaud
ae039dde13 Add a NoPreferredStorageOrderBit flag for expression having no preferred storage order.
It is currently only used in Product.
2014-06-27 15:53:51 +02:00
Gael Guennebaud
f0648f8860 Implement evaluator for sparse views. 2014-06-26 13:52:19 +02:00
Jeff
08c615f1e4 IndexArray is now a typename.
Changed interpolate with derivatives test to use VERIFY_IS_APPROX.
2014-06-25 19:02:57 -06:00
Gael Guennebaud
54607665ab Fix inverse evaluator 2014-06-25 23:44:59 +02:00
Christoph Hertzberg
d73ee84d37 Disabled HIDE_SCOPE_NAMES (default doxygen setting). This might help to avoid API confusions as in bug #830. 2014-06-25 22:44:43 +02:00
Gael Guennebaud
a7bd4c455a Update sparse reduxions and sparse-vectors to evaluators. 2014-06-25 17:24:43 +02:00
Gael Guennebaud
b868bfb84a Make operator=(EigenBase<>) uses the new assignment mechanism and introduce a generic EigenBase to EigenBase assignment kind based on the previous evalTo mechanism. 2014-06-25 17:23:52 +02:00
Gael Guennebaud
3b19b466a7 Generalize static assertions on matching sizes to avoid the need for SizeAtCompileTime 2014-06-25 17:22:12 +02:00
Gael Guennebaud
199ac3f2e7 Implement evaluators for sparse coeff-wise views 2014-06-25 17:21:04 +02:00
Gael Guennebaud
e3ba5329ff Implement evaluators for sparse Block. 2014-06-25 09:58:26 +02:00
Gael Guennebaud
17f119689e implement evaluator for SparseVector 2014-06-25 09:58:03 +02:00
Jeff
f9496d341f Merged. 2014-06-23 20:24:31 -06:00
Jeff
e745a450de Removed tabs and fixed indentation. 2014-06-23 20:18:16 -06:00
Jeff
e86adc87e9 Fixed type mixing issues. 2014-06-23 19:52:42 -06:00
Jeff
b59f045c07 Using LU decomposition with complete pivoting for better accuracy. 2014-06-23 19:04:52 -06:00
Christoph Hertzberg
755be9016a Workaround clang error introduced by 3117036b80
:
"template argument for non-type template parameter is treated as function type 'bool (bool)'"
2014-06-23 22:33:36 +02:00
Jeff
957c2c291b Changed uint to unsigned int. 2014-06-23 08:34:11 -06:00
Christoph Hertzberg
15c2c083e8 Additional unit tests for bug #826 by Gael 2014-06-23 11:21:40 +02:00
Christoph Hertzberg
3117036b80 Fix bug #826: Allow initialization of 1x1 Arrays/Matrices by passing a value. 2014-06-23 11:15:42 +02:00
Christoph Hertzberg
1c3843bf86 Fix bug #729: Use alloca if it is defined 2014-06-23 11:04:12 +02:00
Christoph Hertzberg
0ddde223e9 Fixed typos 2014-06-23 11:00:52 +02:00
Gael Guennebaud
3849cc65ee Implement binaryop and transpose evaluators for sparse matrices 2014-06-23 10:40:03 +02:00
Jeff
5dbbe6b400 Added Spline interpolation with derivatives. 2014-06-20 22:12:45 -06:00
Gael Guennebaud
ec0a8b2e6d rm conflict 2014-06-20 16:30:34 +02:00
Gael Guennebaud
7fa87a8b12 Backport changes from old to new expression engines 2014-06-20 16:17:57 +02:00
Gael Guennebaud
b29b81a1f4 merge with default branch 2014-06-20 15:55:44 +02:00
Gael Guennebaud
47585c8ab2 merge 2014-06-20 15:49:07 +02:00
Gael Guennebaud
c415b627a7 Started to move the SparseCore module to evaluators: implemented assignment and cwise-unary evaluator 2014-06-20 15:42:13 +02:00
Gael Guennebaud
78bb808337 1- Introduce sub-evaluator types for unary, binary, product, and map expressions to ease specializing them.
2- Remove a lot of code which should not be there with evaluators, in particular coeff/packet methods implemented in the expressions.
2014-06-20 15:39:38 +02:00
Gael Guennebaud
963d338922 Fix bug #827: improve accuracy of quaternion to angle-axis conversion 2014-06-20 15:09:42 +02:00
Gael Guennebaud
98ef44fe55 Add assertion and warning on the requirements of SparseQR and COLAMDOrdering 2014-06-20 14:43:47 +02:00
Gael Guennebaud
1fdef63d1f Explain how to export sparse linear problems in matrix-market format. 2014-06-20 13:23:33 +02:00
Jitse Niesen
de150b1e14 Add documentation and very simple test for array atan(), part 2
(files I forget in the previous commit).
2014-06-19 15:12:33 +01:00
Jitse Niesen
55453c51e8 Add documentation and very simple test for array atan(). 2014-06-19 15:07:42 +01:00
Roger Martin
eb49100de9 Add component-wise atan() function (see bug #80). 2014-06-19 14:55:14 +01:00
Mark Borgerding
afb1a8c124 fixed warning: -Wunused-local-typedefs 2014-06-17 18:25:56 -04:00
Gael Guennebaud
c06ec0f464 Fix Jacobi preconditioner with zero diagonal entries 2014-06-17 23:47:30 +02:00
Gael Guennebaud
95ecd582a3 Update decompositions tables 2014-06-17 09:37:07 +02:00
Gael Guennebaud
b0979b8598 Merged in vladimir_ch/eigen/vladimir_ch/bug-796-fix-eigen3config.cmake (pull request PR-67)
Change variable names in Eigen3Config.cmake to EIGEN3_*
2014-06-17 09:20:37 +02:00
Benoit Steiner
774c3c1e0a Created additional unit tests for the tensor code and improved existing ones. 2014-06-13 10:20:28 -07:00
Benoit Steiner
f80c8e17eb Silenced a compilation warning 2014-06-13 10:12:12 -07:00
Benoit Steiner
38ab7e6ed0 Reworked the expression evaluation mechanism in order to make it possible to efficiently compute convolutions and contractions in the future:
* The scheduling of computation is moved out the the assignment code and into a new TensorExecutor class
 * The assignment itself is now a regular node on the expression tree
 * The expression evaluators start by recursively evaluating all their subexpressions if needed
2014-06-13 09:56:51 -07:00
Vladimir Chalupecky
1ee4e2db15 Change variable names in Eigen3Config.cmake to EIGEN3_* 2014-06-12 10:51:02 +09:00
Benoit Steiner
aa664eabb9 Fixed a few compilation errors. 2014-06-10 10:31:29 -07:00
Benoit Steiner
4304c73542 Pulled latest updates from the Eigen main trunk. 2014-06-10 10:23:32 -07:00
Benoit Steiner
925fb6b937 TensorEval are now typed on the device: this will make it possible to use partial template specialization to optimize the strategy of each evaluator for each device type.
Started work on partial evaluations.
2014-06-10 09:14:44 -07:00
Benoit Steiner
a77458a8ff Fixes compilation errors triggered when compiling the tensor contraction code with cxx11 enabled. 2014-06-09 10:06:57 -07:00
Benoit Steiner
a669052f12 Improved support for rvalues in tensor expressions. 2014-06-09 09:45:30 -07:00
Benoit Steiner
36a2b2e9dc Prevent the generation of unlaunchable cuda kernels when compiling in debug mode. 2014-06-09 09:43:51 -07:00
Benoit Steiner
2859a31ac8 Fixed compilation error 2014-06-09 09:42:34 -07:00
Benoit Steiner
d13711a363 Pulled latest changes from the main branch 2014-06-09 09:35:04 -07:00
Benoit Steiner
fe102248ac Fixed the threadpool test 2014-06-09 09:19:21 -07:00
Benoit Steiner
8c8ae2d819 Fixed a typo 2014-06-07 11:24:38 -07:00
Benoit Steiner
29aebf96e6 Created the pblend packet primitive and implemented it using SSE and AVX instructions. 2014-06-06 20:18:44 -07:00
Benoit Steiner
79085e08e9 Fixed a typo 2014-06-06 20:16:13 -07:00
Benoit Steiner
a961d72e65 Added support for convolution and reshaping of tensors. 2014-06-06 16:25:16 -07:00
Gael Guennebaud
abc1ca0af1 The BLAS interface is complete. 2014-06-06 11:21:19 +02:00
Gael Guennebaud
c331ce6b8e Fix bug #738: use the "current" version of cmake project directories to ease the inclusion of Eigen within other projects. 2014-06-06 11:06:44 +02:00
Gael Guennebaud
ed37c44765 Enable LinearAccessBit in Block expression for inner-panels 2014-06-06 11:02:20 +02:00
Benoit Steiner
8998f4099e Created additional tests for the tensor code. 2014-06-05 10:49:34 -07:00
Christian Seiler
96cb58fa3b unsupported/TensorSymmetry: factor out completely from Tensor module
Remove the symCoeff() method of the the Tensor module and move the
functionality into a new operator() of the symmetry classes. This makes
the Tensor module now completely self-contained without symmetry
support (even though previously it was only a forward declaration and a
otherwise harmless trivial templated method) and also removes the
inconsistency with the rest of eigen w.r.t. the method's naming scheme.
2014-06-04 20:44:22 +02:00
Christian Seiler
ea99433523 unsupported/TensorSymmetry: make symgroup construction autodetect number of indices
When constructing a symmetry group, make the code automatically detect
the number of indices required from the indices of the group's
generators. Also, allow the symmetry group to be applied to lists of
indices that are larger than the number of indices of the symmetry
group.

Before:
SGroup<4, Symmetry<0, 1>, Symmetry<2,3>> group;
group.apply<SomeOp, int>(std::array<int,4>{{0, 1, 2, 3}}, 0);

After:
SGroup<Symmetry<0, 1>, Symmetry<2,3>> group;
group.apply<SomeOp, int>(std::array<int,4>{{0, 1, 2, 3}}, 0);
group.apply<SomeOp, int>(std::array<int,5>{{0, 1, 2, 3, 4}}, 0);

This should make the symmetry group easier to use - especially if one
wants to reuse the same symmetry group for different tensors of maybe
different rank.

static/runtime asserts remain for the case where the length of the
index list to which a symmetry group is to be applied is too small.
2014-06-04 20:27:42 +02:00
Christian Seiler
cee62018fc unsupported/CXX11/Core: allow gen_numeric_list to have a starting point
Add a template parameter to gen_numeric_list that acts as a starting
point for the list, i.e. gen_numeric_list<int, 5, 4> will generate a
numeric_list<int, 4, 5, 6, 7, 8>.
2014-06-04 19:54:22 +02:00
Christian Seiler
58cfac9a12 unsupported/ C++11 workarounds: don't use hack for libc++ if not required
libc++ from 3.4 onwards supports constexpr std::get, but only if
compiled with -std=c++1y. Change the detection so that libc++'s
internals are only used if either -std=c++1y is not specified or the
library is too old, making the whole hack a bit more future-proof.
2014-06-04 18:47:42 +02:00
Christian Seiler
45515779d3 Fix compilation for CXX11/Tensor module if unsupported is not in include path 2014-06-04 18:31:02 +02:00
Benoit Steiner
6fa6cdd2b9 Added support for tensor contractions
Updated expression evaluation mechanism to also compute the size of the tensor result
Misc fixes and improvements.
2014-06-04 09:21:48 -07:00
Gael Guennebaud
0f1e321dd4 Fic bug #819: include path of details.h 2014-06-04 11:58:01 +02:00
Jitse Niesen
789674809f Fix test: EigenSolver on 1x1 matrix with NaN sets info to NumericalIssue.
This was changed in 3c66bb136b
.
2014-06-02 11:42:42 +01:00
Jitse Niesen
eb56461ac2 Fix doc'n of FullPivLU re permutation matrices (bug #815). 2014-05-31 23:05:18 +01:00
Benoit Steiner
736267cf6b Added support for additional tensor operations:
* comparison (<, <=, ==, !=, ...)
  * selection
  * nullary ops such as random or constant generation
  * misc unary ops such as log(), exp(), or a user defined unaryExpr()
Cleaned up the code a little.
2014-05-22 16:22:35 -07:00
Benoit Steiner
7402fea0a8 Vectorized the evaluation of tensor expression (using SSE, AVX, NEON, ...)
Added the ability to parallelize the evaluation of a tensor expression over multiple cpu cores.
Added the ability to offload the evaluation of a tensor expression to a GPU.
2014-05-16 15:08:05 -07:00
Mark Borgerding
e3ab46b8c9 AsciiQuickReference: added .real(), .imag()
(transplanted from 11462c1a29
)
2014-05-16 13:45:35 -04:00
Mark Borgerding
c794099e69 fixed AsciiQuickReference typo: LinSpace -> LinSpaced
(transplanted from e667819055
)
2014-05-08 15:14:12 -04:00
Christoph Hertzberg
aa524604b7 README.md edited online with Bitbucket 2014-05-21 14:08:04 +00:00
Benjamin Chrétien
c55c5763fe PolynomialSolver: fix typo. 2014-05-19 19:24:02 +02:00
Benjamin Chrétien
eda79321be PolynomialSolver: fix bugs related to linear polynomials. 2014-05-19 19:08:51 +02:00
Benjamin Chrétien
df92649379 PolynomialSolver: add missing constructors. 2014-05-19 18:40:29 +02:00
Benjamin Chrétien
0f94607947 PolynomialSolver: test template constructor in test suite. 2014-05-19 18:34:10 +02:00
Benjamin Chrétien
edebb15275 PolynomialSolver: add a test to reveal a bug. 2014-05-19 18:21:29 +02:00
Christoph Hertzberg
9aa3dc4e21 Merged in benoitsteiner/eigen-fixes (pull request PR-62)
Made it possible to call the assignment operator on an Eigen::Block from a CUDA kernel.
2014-05-08 17:06:28 +02:00
Benoit Steiner
881aab14b4 Made it possible to call the assignment operator on an Eigen::Block from a CUDA kernel. 2014-05-07 13:34:46 -07:00
Benoit Steiner
0320f7e3a7 Added support for fixed sized tensors.
Improved support for tensor expressions.
2014-05-06 11:18:37 -07:00
Christoph Hertzberg
5811e68dd1 Disabled unused warnings in Eigen2-tests 2014-05-06 12:53:18 +02:00
Christoph Hertzberg
9217de8bf2 Missed to remove IACA_END in previous commit 2014-05-05 15:10:18 +02:00
Christoph Hertzberg
84cb1d72b8 Removed IACA-defines
This caused redefinition warnings if IACA headers were included from elsewhere. For a clean solution we should define our own EIGEN_IACA_* macros
2014-05-05 15:06:37 +02:00
Christoph Hertzberg
56de8d3816 Fixed unused variable warnings 2014-05-05 15:03:29 +02:00
Christoph Hertzberg
b4beba72a2 Fix bug #807: Missing scalar type cast in umeyama() 2014-05-05 14:23:52 +02:00
Christoph Hertzberg
b5e3d76aa5 Fixed bug #806: Missing scalar type cast in Quaternion::setFromTwoVectors() 2014-05-05 14:22:27 +02:00
Florian George
f56d452c7e Enable atv in Blaze Benchmark 2014-05-04 17:07:17 +02:00
Florian George
af79b158a1 Use trans(X) instead of X.transpose() in Blaze Benchmark 2014-05-04 17:06:34 +02:00
Benjamin Chretien
0b7f95a03f Fix typo in SparseMatrix assert. 2014-05-03 12:41:37 +02:00
Gael Guennebaud
d67aa1549b Add missing add_subdirectory directive 2014-05-03 10:46:11 +02:00
Gael Guennebaud
07986189b7 Fix bug #803: avoid char* to int* conversion 2014-05-01 23:03:54 +02:00
Benoit Steiner
c0f2cb016e Extended support for Tensors:
* Added ability to map a region of the memory to a tensor
  * Added basic support for unary and binary coefficient wise expressions, such as addition or square root
  * Provided an emulation layer to make it possible to compile the code with compilers (such as nvcc) that don't support cxx11.
2014-04-28 10:32:27 -07:00
Gael Guennebaud
2fb64578aa Add a small benchmark to compare dense solvers for small to large problems. 2014-04-28 16:16:29 +02:00
Kolja Brix
ecf1f1d589 Make gdb pretty printer Python3-compatible (bug #800). 2014-04-28 14:10:22 +01:00
Gael Guennebaud
e7ef26fa44 TRMM: Make sure we have enough memory in rhs block to enforce alignment. 2014-04-25 23:36:22 +02:00
Gael Guennebaud
450d0c3de0 Make sure that calls to broadcast4 are 16 bytes aligned 2014-04-25 22:25:48 +02:00
Gael Guennebaud
f9d2f3903e Product kernel: skip loop on columns if there is no remaining rows 2014-04-25 16:54:30 +02:00
Gael Guennebaud
6f64b0b487 Fix sizeof unit test 2014-04-25 14:05:54 +02:00
Gael Guennebaud
c20e3641de Fix for mixed products 2014-04-25 13:22:34 +02:00
Gael Guennebaud
2dbfd83424 Implement pbroadcast4 on altivec 2014-04-25 02:46:57 -07:00
Gael Guennebaud
7388fdf560 pbroadcast4/2 assume aligned memory 2014-04-25 02:46:22 -07:00
Gael Guennebaud
c9788d55b9 Disable 3pX4 kernel on Altivec: despite this platform has 32 registers, this version seems significantly slower. 2014-04-25 11:48:22 +02:00
Gael Guennebaud
ae4d9434e2 Add unit test for pbroadcast4/2 2014-04-25 11:21:18 +02:00
Gael Guennebaud
4def7b1fa5 Fix ptranspose overload prototypes for NEON 2014-04-25 11:15:13 +02:00
Gael Guennebaud
c79bd4b64b Minor optimizations in product kernel:
- use pbroadcast4 (helpful when AVX is not available)
- process all remaining rows at once (significant speedup for small matrices)
2014-04-25 11:06:03 +02:00
Gael Guennebaud
cf7eaed38d Avoid blocking-size mismatch in unit tests calling Eigen's blas interface. 2014-04-25 11:04:02 +02:00
Gael Guennebaud
3d8d0f6269 Enable vectorization of pack_rhs with a column-major RHS.
Rename and generalize Kernel<*> to PacketBlock<*,N>.
2014-04-25 10:56:18 +02:00
Gael Guennebaud
b0e19db1cf Enable fused madd for Altivec 2014-04-24 23:17:18 +02:00
Gael Guennebaud
8d85ce88e1 Implement ptranspose on altivec and fix pgather/pscatter 2014-04-24 05:47:53 -07:00
Benoit Steiner
4eb92e5647 Fixed the NEON implementation of predux_max<Packet4i>. 2014-04-23 18:23:07 -07:00
Benoit Steiner
ccb4dec719 Created a NEON version of the ptranspose packet primitives 2014-04-23 18:22:10 -07:00
Gael Guennebaud
82b09fcb91 Add Altivec implementation of pgather/pscatter (not tested) 2014-04-23 13:09:26 +02:00
Gael Guennebaud
ecbd67a15a Fix EIGEN_MAKE_UNALIGNED_ARRAY_ASSERT macro 2014-04-22 17:03:57 +02:00
Gael Guennebaud
934ce93886 merge with default branch 2014-04-22 17:00:38 +02:00
Gael Guennebaud
5c5231ab71 Workaround gcc's default ABI not being able to distinghish between vector types of different sizes. 2014-04-22 16:03:19 +02:00
Gael Guennebaud
2606abed53 Fix 128bit packet size assumptions in unit tests. 2014-04-18 21:14:40 +02:00
Gael Guennebaud
a7d20038df Fix alignment assertion. 2014-04-18 17:06:31 +02:00
Gael Guennebaud
3454b4e5f1 Fix calls to lazy products (lazy product does not like matrices with 0 length) 2014-04-18 17:06:03 +02:00
Gael Guennebaud
94684721bd Smarter block size computation 2014-04-18 15:35:34 +02:00
Gael Guennebaud
1388f4f9fd Fix typo (was working with clang\!) 2014-04-18 11:43:13 +02:00
Gael Guennebaud
6d665d446b Fixes for fixed sizes and non vectorizable types 2014-04-17 23:26:34 +02:00
Gael Guennebaud
2c3c95990d merge 2014-04-17 22:50:49 +02:00
Benoit Steiner
6d6df90c9a Implemented the pgather/pscatter packet primitives for the arm/NEON architecture 2014-04-17 12:28:01 -07:00
Gael Guennebaud
c354bd47f7 Make our gemm bench a little more powerful. 2014-04-17 21:03:26 +02:00
Gael Guennebaud
9777a5ca60 Various minor fixes in BTL 2014-04-17 21:01:45 +02:00
Gael Guennebaud
9746396d1b Optimize AVX pset1 for complexes and ploaddup 2014-04-17 20:51:04 +02:00
Benjamin Chretien
e5d0cb54a5 Fix typo in Reductions tutorial. 2014-04-17 18:49:23 +02:00
Gael Guennebaud
1dd015fea6 Reduce block sizes in unit tests. 2014-04-17 16:27:58 +02:00
Gael Guennebaud
45a4aad572 add unit tests for ploadquad and predux4, and split packetmath unit test wrt real/complex 2014-04-17 16:27:22 +02:00
Gael Guennebaud
e1d461352e Extend mixingtype unit test to check transposed cases. 2014-04-17 16:26:35 +02:00
Gael Guennebaud
11fbdcbc38 Fix and optimize mixed products 2014-04-17 16:04:30 +02:00
Gael Guennebaud
0fa8290366 Optimize ploaddup for AVX 2014-04-17 16:02:27 +02:00
Gael Guennebaud
d936ddc3d1 Fallback to lazy products for very small ones. 2014-04-16 23:15:42 +02:00
Gael Guennebaud
de8336a9bc Enable alloca on MAC OSX 2014-04-16 23:14:58 +02:00
Jitse Niesen
ffc995c9e4 Implement evaluator<ReturnByValue>.
All supported tests pass apart from Sparse and Geometry,
except test in adjoint_4 that a = a.transpose() raises an assert.
2014-04-16 18:16:36 +01:00
Gael Guennebaud
d5a795f673 New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speeup on Haswell.
This changeset also introduce new vector functions: ploadquad and predux4.
2014-04-16 17:05:11 +02:00
Jitse Niesen
b30706bd5c Fix typo in Inverse.h 2014-04-15 22:51:46 +01:00
Mark Borgerding
e0dbb68c2f Check IMKL version for compatibility with Eigen 2014-04-15 13:57:03 -04:00
Jitse Niesen
59f5f155c2 Port products with permutation matrices to evaluators. 2014-04-15 15:21:38 +01:00
Gael Guennebaud
20c840be15 Merged in benoitsteiner/eigen-fixes/nvcc_fixes (pull request PR-56)
Fixed a typo in CXX11Meta.h
2014-04-15 10:38:25 +02:00
Benoit Steiner
1afd50e0f3 Fixed a typo in CXX11Meta.h 2014-04-14 14:26:30 -07:00
Gael Guennebaud
3c66bb136b bug #793: detect NaN and INF in EigenSolver instead of aborting with an assert. 2014-04-14 22:00:27 +02:00
Gael Guennebaud
7098e6d976 Add isfinite overload for complexes. 2014-04-14 21:57:49 +02:00
Benoit Steiner
feaf7c7e6d Optimized SSE unaligned loads and stores when compiling a 64bit target with a recent version of gcc (ie gcc 4.8). 2014-04-14 10:44:17 -07:00
Gael Guennebaud
d567e3b893 Merged in benoitsteiner/eigen-fixes (pull request PR-55)
CUDA fixes
2014-04-14 14:33:50 +02:00
Gael Guennebaud
148acf8e4f bug #790: fix overflow in real_2x2_jacobi_svd 2014-04-14 13:52:16 +02:00
Gael Guennebaud
0587db8bf5 bug #793: fix overflow in EigenSolver and add respective regression unit test 2014-04-14 11:43:08 +02:00
Benoit Steiner
7903d3f27b Updated the compiler flags to enable nvcc to work with clang. 2014-04-12 23:39:37 -07:00
Benoit Steiner
a803ff18a9 Fixed a typo in cuda_basic.cu 2014-04-12 20:24:05 -07:00
Freddie Witherden
91288e9bf9 Add include LevenbergMarquardt in CMakeLists.txt.
This fixes bug #768.
2014-04-12 12:53:09 +01:00
Jitse Niesen
fbd5eac7cf Merged in benoitsteiner/eigen-fixes/nvcc_fixes (pull request PR-53)
Silenced a compilation warning produced by nvcc.
2014-04-11 14:16:08 +01:00
Benoit Steiner
1b333c89c9 Updated my previous fix to avoid introducing a compilation warning on ARM platforms. 2014-04-10 17:43:13 -07:00
Benoit Steiner
a1fcf599fa Silenced a compilation warning produced by nvcc. 2014-04-10 11:19:37 -07:00
Jitse Niesen
a91a7a1964 doc: Add references to Cholesky methods in SelfAdjointView. 2014-04-07 14:14:48 +01:00
Benoit Steiner
3b2321e3ab Updated the geo_parametrizedline_2 test for AVX. 2014-04-04 17:08:47 -07:00
Benoit Steiner
b446ff037e Deleted some dead code. 2014-04-04 14:12:24 -07:00
Jitse Niesen
5afcb4965c Remove out-dated comment in cholesky test. 2014-04-04 16:48:13 +01:00
Christoph Hertzberg
096af59799 Fix bug #784: Assert if assigning a product to a triangularView does not match the size. 2014-04-04 17:48:37 +02:00
Benoit Steiner
8044b00a7f bug #782: Workaround for gcc <= 4.4 compilation error on the NEON PacketMath code. 2014-04-03 23:41:47 +02:00
Benoit Steiner
aecc78325a Pulled the latest updates from the eigen trunk. 2014-04-01 22:07:05 -07:00
Christoph Hertzberg
1cb8de1250 Make some actual verifications inside the autodiff unit test 2014-04-01 17:44:48 +02:00
Florian George
56c4851323 Fixed typo: symmretric -> symmetric 2014-04-01 15:52:25 +02:00
Gael Guennebaud
ceae5b4145 Fix lapack build 2014-04-01 11:52:23 +02:00
Gael Guennebaud
ec65e6648c bug #775: propagate generator when workingaround cmake bug #9220 2014-04-01 11:45:43 +02:00
Gael Guennebaud
d992634fbc Fix bug #776: it seems that mingw does not support weak linking 2014-04-01 11:31:21 +02:00
Benoit Steiner
5e8622477b Rename the vector() factories defined in blas/common.h into make_vector() to prevent a possible name conflict with std::vector. 2014-04-01 11:23:28 +02:00
Gael Guennebaud
1221dd90aa Fix no newline at end of file warning 2014-04-01 11:21:14 +02:00
Gael Guennebaud
93870d95b7 BTL: add blaze 2014-03-31 10:59:55 +02:00
Gael Guennebaud
f603823ef3 BTL: fix warnings and extend to 5k matrices, update GotoBlas to OpenBlas, etc. 2014-03-31 10:58:30 +02:00
Gael Guennebaud
8d0441052e Finally, prefetching seems to help getting more stable performance 2014-03-31 10:42:19 +02:00
Gael Guennebaud
82c8163067 Enable repetition in mixing type unit test 2014-03-31 10:41:40 +02:00
Gael Guennebaud
1c0728043a Workaround alignment warnings 2014-03-30 22:43:47 +02:00
Gael Guennebaud
e497a27ddc Optimize gebp kernel:
1 - increase peeling level along the depth dimention (+5% for large matrices, i.e., >1000)
2 - improve pipelining when dealing with latest rows of the lhs
2014-03-30 21:57:05 +02:00
Benoit Steiner
ad59ade116 Vectorized the loop peeling of the inner loop of the block-panel matrix multiplication code. This speeds up the multiplication of matrices which size is not a multiple of the packet size. 2014-03-28 12:11:23 -07:00
Benoit Steiner
39bfbd43f0 Properly align the input data to prevent false failures of the packetmath.cpp test. 2014-03-28 12:00:08 -07:00
Gael Guennebaud
10aa14592a Add a mechanism to recursively access to half-size packet types 2014-03-28 10:18:04 +01:00
Gael Guennebaud
8d2bb2c20d merge with default branch 2014-03-28 09:24:18 +01:00
Gael Guennebaud
c94fde118a Enable vectorization of gemv for PacketSize>4 through unaligned loads (still better than no vectorization) 2014-03-28 09:11:06 +01:00
Benoit Steiner
51e85c936d Merged latest changes from parent. 2014-03-27 18:32:15 -07:00
Benoit Steiner
8a94cb3edd Implemented the SSE version of the gather and scatter packet primitives. 2014-03-27 18:29:01 -07:00
Benoit Steiner
7f3162f707 Implemented the AVX version of the gather and scatter packet primitives. 2014-03-27 17:42:25 -07:00
Benoit Steiner
ee86679096 Introduced pscatter/pgather packet primitives. They will be used to optimize the loop peeling code of the block-panel matrix multiplication kernel. 2014-03-27 16:03:03 -07:00
Gael Guennebaud
58fe2fc2b2 enforce the use of vfmadd231ps for pmadd (gcc and clang stupidely generates the other fmadd variants plus some register moves...) 2014-03-27 23:38:50 +01:00
Benoit Steiner
729363114f Fixed compilation error when FMA instructions are enabled. 2014-03-27 11:20:41 -07:00
Benoit Steiner
1697d7a179 Silenced "unused variable" warnings when compiling with FMA. 2014-03-27 11:00:47 -07:00
Benoit Steiner
3e1fe8e416 Vectorized the packing of a col-major matrix used as the right hand side argument in a matrix-matrix product when AVX instructions are used. No vectorization takes place when SSE instructions are used, however this doesn't seem to impact performance. 2014-03-27 10:38:41 -07:00
Benoit Steiner
b776458ccb Vectorized the packing of a row-major matrix used as the left hand side argument in a matrix-matrix product. 2014-03-27 10:02:24 -07:00
Benoit Steiner
c4902a3d01 Implemented the AVX version of the ptranspose packet primitive. 2014-03-27 09:34:51 -07:00
Gael Guennebaud
7d73c7f18b Change abi version when enabling AVX with GCC 2014-03-27 15:38:40 +01:00
Gael Guennebaud
6f123d209e Fix geo_* unit tests with respect to AVX 2014-03-27 15:29:56 +01:00
Gael Guennebaud
052aedd394 Implement pcplflip, palign, predux and the likes from AVC/complexes 2014-03-27 14:47:00 +01:00
Gael Guennebaud
fb03b56647 Fix warning 2014-03-27 11:38:35 +01:00
Jitse Niesen
6a81594771 Merged in infinitei/eigen (pull request PR-50)
Fixed compilation error due to obsolete internal::abs and internal::sqrt function calls
2014-03-27 10:12:25 +00:00
Mark Borgerding
9ce0d78513 immintrin.h did not come until intel version 11 2014-03-26 22:26:07 -04:00
Benoit Steiner
a419cea4a0 Created the ptranspose packet primitive that can transpose an array of N packets, where N is the number of words in each packet. This primitive will be used to complete the vectorization of the gemm_pack_lhs and gemm_pack_rhs functions.
Implemented the primitive using SSE instructions.
2014-03-26 19:03:07 -07:00
Abhijit Kundu
ba3457cab2 Fixed compilation error due to obsolete internal::abs and internal::sqrt function calls 2014-03-26 22:02:48 -04:00
Benoit Steiner
14bc4b9704 Made sure that the version of gemm_pack_rhs specialized for row major matrices is vectorized when nr == 2*PacketSize (which is the case for SSE when compiling in 64bit mode). 2014-03-26 17:35:18 -07:00
Benoit Steiner
e45a6bed45 Specialized the pload1 packet primitive for Packet8f and Packet4d in order to take advantage of the vbroadcastss and vbroadcastsd instructions whenever possible. 2014-03-26 15:58:13 -07:00
Benoit Steiner
cc73164aa8 Merged latest updates from the parent branch 2014-03-26 15:23:59 -07:00
Gael Guennebaud
f0a4c9d5ab Update gebp kernel to process a panle of 4 columns at once for the remaining ones. 2014-03-26 23:22:36 +01:00
Gael Guennebaud
8be011e776 Remove remaining bits of the dead working buffer 2014-03-26 23:14:44 +01:00
Benoit Steiner
a078f442a3 Vectorized the multiplication and division of complex numbers using AVX instructions. 2014-03-26 15:11:18 -07:00
Benoit Steiner
cf1a7bfbe1 Used AVX instructions to vectorize the complex version of the pfirst and ploaddup packet primitives.
Silenced a few compilation warnings.
2014-03-26 12:03:31 -07:00
Gael Guennebaud
bc401eb6fa Implement new 1 packet x 8 gebp kernel 2014-03-26 18:53:00 +01:00
Gael Guennebaud
b286a1e75c add pbroadcast2/4 generic intrinsics 2014-03-26 16:46:36 +01:00
Benoit Steiner
6bf3cc2732 Use AVX instructions to vectorize pset1<Packet2cd>, pset1<Packet4cf>, preverse<Packet2cd>, and preverse<Packet4cf> 2014-03-25 09:00:43 -07:00
Benoit Steiner
7ae9b0805d Used AVX instructions to vectorize the predux_min<Packet8f>, predux_min<Packet4d>, predux_max<Packet8f>, and predux_max<Packet4d> packet primitives. 2014-03-24 13:33:40 -07:00
Benoit Steiner
08f7b3221d Added proper support for AVX and FMA in the makefiles. 2014-03-24 09:52:45 -07:00
Benoit Steiner
72707a8664 Made sure that EIGEN_ALIGN is defined when EIGEN_DONT_VECTORIZE is set to true to prevent build failures when vectorization is disabled. 2014-03-21 11:40:29 -07:00
Benoit Steiner
8a0845ebd7 Merged latest changes from the parent 2014-03-18 12:58:08 -07:00
giacomo po
3e42b775ea MINRES, bug #715: add support for zero rhs, and remove square test. 2014-03-17 16:33:52 -07:00
Bo Li
dead9085c0 fixed Spline constructor dimension bug 2014-03-16 22:26:57 +08:00
Bo Li
4fe56a0e02 fix Spline constructor 2014-03-15 08:42:20 +08:00
Christoph Hertzberg
35a2c9cde7 clang does not accept this without template keyword 2014-03-14 16:48:29 +01:00
Gael Guennebaud
bb4b67cf39 Relax Ref such that Ref<MatrixXf> accepts a RowVectorXf which can be seen as a degenerate MatrixXf(1,N) 2014-03-13 18:04:19 +01:00
Gael Guennebaud
0a6c472335 A bit of cleaning 2014-03-13 15:44:20 +01:00
Christoph Hertzberg
2db792852f Silence stupid parenthesis warnings for old GCC versions (<= 4.6.x) 2014-03-13 12:58:57 +01:00
Gael Guennebaud
847d801a4c Fix bug #760: complete Eigen's lapack interface with default Lapack for SPQR if there is no fortran compiler. 2014-03-12 21:33:45 +01:00
Gael Guennebaud
aceae8314b Resurect EvalBeforeNestingBit to control nested_eval 2014-03-12 20:25:36 +01:00
Gael Guennebaud
16d4c7a5e8 Conditionally disable unit tests that are not supported by evaluators yet 2014-03-12 20:23:44 +01:00
Gael Guennebaud
a395024d44 More debug info and use lazyProd instead of operator* to query the right flags 2014-03-12 18:14:58 +01:00
Gael Guennebaud
f74ed34539 Fix regressions in redux_evaluator flags and evaluator<Block> flags 2014-03-12 18:14:08 +01:00
Gael Guennebaud
5e26b7cf9d Extend evaluation traits debuging info 2014-03-12 18:13:18 +01:00
Gael Guennebaud
74b1d79d77 merge default and evaluator branches 2014-03-12 16:24:25 +01:00
Gael Guennebaud
0b362e0c9a This file is not needed anymore 2014-03-12 16:18:54 +01:00
Gael Guennebaud
a6be1952f4 Fix a few regression when moving the flags 2014-03-12 16:18:34 +01:00
Christoph Hertzberg
2379ccffcb bug #755: CommaInitializer produced wrong assertions in absence of ReturnValueOptimization. 2014-03-12 13:48:09 +01:00
Christoph Hertzberg
88aa18df64 bug #759: Removed hard-coded double-math from Quaternion::angularDistance.
Some documentation improvements
2014-03-12 13:43:19 +01:00
Gael Guennebaud
0bd5671b9e Fix Eigenvalues module 2014-03-12 13:35:44 +01:00
Gael Guennebaud
8dd3b716e3 Move evaluation related flags from traits to evaluator and fix evaluators of MapBase and Replicate 2014-03-12 13:34:11 +01:00
Gael Guennebaud
7eefdb948c Migrate JacobiSVD to Solver 2014-03-11 13:43:46 +01:00
Gael Guennebaud
082f7ddc37 Port Cholesky module to evaluators 2014-03-11 13:33:44 +01:00
Christoph Hertzberg
bbc0ada12a Avoid stupid "enumeral mismatch in conditional expression" warnings in GCC 2014-03-11 12:18:32 +01:00
Gael Guennebaud
9be72cda2a Port QR module to Solve/Inverse 2014-03-11 11:47:32 +01:00
Gael Guennebaud
ae40583965 Fix CoeffReadCost issues 2014-03-11 11:47:14 +01:00
Gael Guennebaud
5806e73800 It is not clear what XprType::Nested should be, so let's use nested<Xpr>::type as much as possible 2014-03-11 11:44:11 +01:00
Gael Guennebaud
2bf63c6b4a Even ReturnByValue should not evaluate when assembling the expression 2014-03-11 11:42:07 +01:00
Christoph Hertzberg
1b3d7fc04c Merged in abachrac/eigen (pull request PR-47)
Move the Base typedef's from private to public scope
2014-03-11 11:01:36 +01:00
Gael Guennebaud
da6ec81282 Move CoeffReadCost mechanism to evaluators 2014-03-10 23:24:40 +01:00
Gael Guennebaud
354bd8a428 Hide legacy dense assignment routines with EIGEN_TEST_EVALUATORS 2014-03-10 09:30:58 +01:00
Gael Guennebaud
5c0f294098 Fix evaluators unit test (i.e., when only EIGEN_ENABLE_EVALUATORS is defined 2014-03-10 09:28:00 +01:00
Abraham Bachrach
804ef2350d Move the Base typedef's from private to public scope
Move the Quaternion::Base typedef from private to public scope so that one may
create child classes of Quaternion.

NOTE: This matches the semantics of MatrixBase.
2014-03-09 16:56:44 -07:00
Gael Guennebaud
a6b130c63c swap 3.2 <-> default CTestConfig.cmake file 2014-03-05 10:07:44 +01:00
Benoit Steiner
8eac97138a Merged latest changes from the main trunk 2014-02-24 13:59:43 -08:00
Benoit Steiner
1dd176b0b0 Pulled latest changes from the Eigen main trunk 2014-02-24 13:56:01 -08:00
Benoit Steiner
131027ee0a Merged eigen/eigen into default 2014-02-24 13:54:07 -08:00
Benoit Steiner
db7d49efbb Added support for FMA instructions 2014-02-24 13:45:32 -08:00
Gael Guennebaud
9fdc6258cf Implement bug #317: use a template function call to suppress unused variable warnings. This also fix the issue of the previous changeset in a much nicer way. 2014-02-24 18:13:49 +01:00
Gael Guennebaud
21fecd5252 Workaround clang ABI change with unsed arguments (ugly fix) 2014-02-24 17:12:17 +01:00
Jitse Niesen
6fecb6f1b6 Fix bug #748 - array_5 test fails for seed 1392781168. 2014-02-24 14:10:17 +00:00
Christoph Hertzberg
3e439889e0 Specify what non-resizeable objects are in transposeInPlace and adjointInPlace (cf bug #749) 2014-02-24 13:12:10 +01:00
Gael Guennebaud
cbc572caf7 Split LU/Inverse.h to Core/Inverse.h for the generic Inverse expression, and LU/InverseImpl.h for the dense implementation of dense.inverse() 2014-02-24 11:49:30 +01:00
Gael Guennebaud
1e0c2f6ddb Hide some deprecated classes. 2014-02-24 11:41:19 +01:00
Gael Guennebaud
c98881e130 By-pass ProductBase for triangular and selfadjoint products and get rid of ProductBase 2014-02-23 22:51:13 +01:00
Gael Guennebaud
d67548f345 Get rid of GeneralProduct<> for GemvProduct 2014-02-21 17:13:28 +01:00
Gael Guennebaud
6c7ab50811 Get rid of GeneralProduct<> for GemmProduct 2014-02-21 16:43:03 +01:00
Gael Guennebaud
728c3d2cb9 Get rid of GeneralProduct for outer-products, and get rid of ScaledProduct 2014-02-21 16:27:24 +01:00
Gael Guennebaud
af31b6c37a Generalize evaluator<Inverse<>> such that there is no need to specialize it 2014-02-21 15:22:08 +01:00
Gael Guennebaud
93125e372d Port LU module to evaluators (except image() and kernel()) 2014-02-20 15:26:15 +01:00
Gael Guennebaud
b2e1453e1e Some bit flags and internal structures are deprecated 2014-02-20 15:25:06 +01:00
Gael Guennebaud
9621333545 Fix dimension of Solve expression 2014-02-20 15:24:21 +01:00
Gael Guennebaud
5f6ec95291 Propagate LvalueBit flag to TriangularView 2014-02-20 15:24:00 +01:00
Gael Guennebaud
ecd2c8f37b Add general Inverse<> expression with evaluator 2014-02-20 14:18:24 +01:00
Gael Guennebaud
5960befc20 More int versus Index fixes 2014-02-19 21:42:29 +01:00
Gael Guennebaud
2eee6eaf3c Fix mixing scalar types with evaluators 2014-02-19 16:30:17 +01:00
Gael Guennebaud
8af02d19b2 ExprType::Nested has a new meaning now... 2014-02-19 15:16:11 +01:00
Gael Guennebaud
95b0a6707b evaluator<Replicate> must evaluate its argument to avoid redundant evaluations 2014-02-19 14:51:46 +01:00
Gael Guennebaud
b1ab6a8e0b Add missing assertion in swap() 2014-02-19 14:06:35 +01:00
Gael Guennebaud
61cff28618 Disable Flagged and ForceAlignedAccess 2014-02-19 14:05:56 +01:00
Gael Guennebaud
68e8ddaf94 Fix vectorization logic wrt assignment functors 2014-02-19 13:26:07 +01:00
Gael Guennebaud
3a735a6cf1 Fix lazy evaluation in Ref 2014-02-19 13:17:41 +01:00
Gael Guennebaud
ccc41128fb Add a Solve expression for uniform treatment of solve() methods. 2014-02-19 11:33:29 +01:00
Gael Guennebaud
b3a07eecc5 Fix CoeffReadCost of products to handle Dynamic costs 2014-02-19 11:32:04 +01:00
Gael Guennebaud
c16b80746a isApprox must honors nested_eval 2014-02-19 11:30:58 +01:00
Benoit Steiner
cbd7e98174 Merged the latest version of the code from eigen/eigen 2014-02-18 18:51:24 -08:00
Benoit Steiner
7ed9441ea4 Reverted the definition of the EIGEN_ALIGN to its former meaning (i.e. a boolean)
Created a new EIGEN_ALIGN_BYTES define to encode how the data should be aligned
Fixed a few remaining alignment issues exposed when the Eigen code is compiled with avx enabled.
Created a new EIGEN_ALIGN_DEFAULT define, which is set to the minimum alignment value required for the chosen instruction set. Use this value instead of EIGEN_ALIGN32 to preserve the existing alignment on SSE/Altivec/Neon.
2014-02-18 18:06:44 -08:00
Gael Guennebaud
5b78780def Add evaluator shortcut for triangular ?= product 2014-02-18 17:43:16 +01:00
Gael Guennebaud
8169c6ac59 Simplify implementation of coeff-based products to fully exploit our reduxion mechanisms.
If this results in performance regressions, then we should optimize reduxion rather than
somehow duplicate the code.
2014-02-18 16:57:25 +01:00
Gael Guennebaud
463554c254 Merge with default branch 2014-02-18 15:45:39 +01:00
Gael Guennebaud
82c066b3c4 Cleaning 2014-02-18 15:44:32 +01:00
Gael Guennebaud
0543cb51b5 Product::coeff method are also OK for lazy products (including diagonal products) 2014-02-18 14:51:41 +01:00
Gael Guennebaud
99e27916cf Fix all()/any() for evaluators 2014-02-18 14:26:25 +01:00
Gael Guennebaud
37a1d736bf _MatrixTypeNested must be public in sparse Block 2014-02-18 13:35:24 +01:00
Gael Guennebaud
06545058bb Temporary workaround for permutations 2014-02-18 13:33:04 +01:00
Gael Guennebaud
7002aa858f Support Product::coeff(0,0) even for dynamic matrices 2014-02-18 13:32:30 +01:00
Gael Guennebaud
8cfb138e73 Finally, the simplest remains to deffer resizing at the latest 2014-02-18 13:31:44 +01:00
Gael Guennebaud
1b5de5a37b Add evaluator for Ref 2014-02-18 13:30:16 +01:00
Gael Guennebaud
a08cba6b5f Move is_diagonal to XprHelper, forward declare Ref 2014-02-18 11:03:59 +01:00
Gael Guennebaud
573c587e3d New design for handling automatic transposition 2014-02-18 10:53:14 +01:00
Gael Guennebaud
551bf5c66a Get rid of DiagonalProduct 2014-02-18 10:52:26 +01:00
Gael Guennebaud
2d136d3d7f Get rid of SeflCwiseBinaryOp 2014-02-18 10:52:00 +01:00
Gael Guennebaud
873401032b Fix scalar * product optimization when 'product' includes a selfadjoint matrix 2014-02-17 19:00:45 +01:00
Gael Guennebaud
d595fd31f5 Deal with automatic transposition in call_assignment, fix a few shortcomings 2014-02-17 16:11:55 +01:00
Gael Guennebaud
bffa15142c Add evaluator support for diagonal products 2014-02-17 16:10:55 +01:00
Christoph Hertzberg
b14a4628af Relaxed umeyama test. Problem was ill-posed if linear part was scaled with very small number. This should fix bug #744. 2014-02-17 13:48:00 +01:00
Gael Guennebaud
3573a10712 Fix support for row (resp. column) of a column-major (resp. row-major) sparse matrix 2014-02-17 13:46:17 +01:00
Gael Guennebaud
bd6eca059d Fix compilation of SPlines module 2014-02-17 10:00:38 +01:00
Gael Guennebaud
ed461ba9bc Fix sparse_product/sparse_extra unit tests 2014-02-17 09:57:47 +01:00
Gael Guennebaud
3bb57e21a8 Fix FFTW unit test with clang 2014-02-17 09:56:46 +01:00
Gael Guennebaud
4b6b3f310f Fix a few Index to int buggy conversions 2014-02-15 09:35:23 +01:00
Gael Guennebaud
cd606bbc94 Fix infinite loop in sparselu 2014-02-14 23:10:16 +01:00
Gael Guennebaud
0508af4287 Merged in martinhofernandes/eigen (pull request PR-40)
Better fix for bug #503
2014-02-14 15:31:39 +01:00
Gael Guennebaud
3283d98d13 optimize sparse-sparse Kronecker product 2014-02-14 14:46:01 +01:00
Gael Guennebaud
0d3f496233 Upload the 3.2 testing result to its own CDash project 2014-02-14 10:18:14 +01:00
Gael Guennebaud
6df3bee687 reduce false negative in the qr unit test 2014-02-14 09:58:30 +01:00
Gael Guennebaud
97965dde9b alloca is not necessarily alligned on windows 2014-02-14 00:04:38 +01:00
Gael Guennebaud
0b1430ae10 Fix propagation of index type 2014-02-13 23:58:28 +01:00
Gael Guennebaud
c0e08e9e4b fix stable norm benchmark 2014-02-13 15:53:51 +01:00
Gael Guennebaud
0715d49908 Fix stable_norm unit test for complexes 2014-02-13 15:49:54 +01:00
Gael Guennebaud
3291580630 Fix bug #740: overflow issue in stableNorm 2014-02-13 15:44:01 +01:00
Gael Guennebaud
14422decc2 Fix Fortran compiler detection 2014-02-13 09:21:13 +01:00
Jitse Niesen
7ea6ef8969 Fix documentation of MatrixBase::applyOnTheLeft (bug #739)
Add examples; move methods from EigenBase.h to MatrixBase.h
2014-02-12 14:03:39 +00:00
Gael Guennebaud
31c63ef0b4 fix compilation of Transform * UniformScaling 2014-02-12 13:37:23 +01:00
Christoph Hertzberg
e170e7070b Added examples for casting, made better examples for Maps 2014-02-11 17:27:14 +01:00
yoco
15f273b63c fix reshape flag and test case 2014-02-10 22:49:13 +08:00
Jitse Niesen
c1921ad3e2 Remove unused typedef in polynomialsolver test. 2014-02-08 20:31:35 +00:00
Jitse Niesen
c4f08cfc05 Merged in maksqwe/eigen/maksqwe/fix-typo-in-evalSolverSugarFunction (pull request PR-44)
fix typo in evalSolverSugarFunction()
2014-02-08 20:27:13 +00:00
Naumov Maks
9e71ecbeec fix typo in evalSolverSugarFunction() 2014-02-08 10:40:51 +00:00
Jitse Niesen
ff8d81762d Fix bug #736: LDLT isPositive returns false for a positive semidefinite matrix
Add unit test covering this case.
2014-02-06 11:06:06 +00:00
Hauke Heibel
6c527bd811 Fixed assignment from QMatrix to Transform for compact storage. 2014-02-04 07:02:34 +01:00
yoco
b64a09acc1 fix reshape's Max[Row/Col]AtCompileTime 2014-02-04 05:54:50 +08:00
yoco
f8ad87f226 Reshape always non-directly-access 2014-02-04 05:19:56 +08:00
yoco
515bbf8bb2 Improve reshape test case
- simplify test code
- add reshape chain
2014-02-04 02:50:23 +08:00
yoco
009047db27 Fix Reshape traits flag calculate bug 2014-02-04 02:21:41 +08:00
Hauke Heibel
e722f36ffa Fixed issue #734 (thanks to Philipp Büttgenbach for reporting the issue and proposing a fix).
Kept ColMajor layout if possible in order to keep derivatives of the same order adjacent in memory.
2014-02-01 20:49:48 +01:00
Christoph Hertzberg
febfc7b9b4 Fix bug #730: Path of OpenGL headers is different on MacOS 2014-01-29 22:05:39 +01:00
Benoit Steiner
64a85800bd Added support for AVX to Eigen. 2014-01-29 11:43:05 -08:00
Gael Guennebaud
94acccc126 Fix Random().normalized() by introducing a nested_eval helper (recall that the old nested<> class is deprecated) 2014-01-26 15:35:44 +01:00
Gael Guennebaud
34694d8828 Fix evaluator<Replicate> for fixed size objects 2014-01-26 15:34:26 +01:00
Gael Guennebaud
ee1c55f923 Add missing template keyword 2014-01-26 14:55:25 +01:00
Gael Guennebaud
f54e62e4a9 Port evaluation from selfadjoint to full to evaluators 2014-01-26 12:18:36 +01:00
Gael Guennebaud
5fa7262e4c Refactor triangular assignment 2014-01-25 23:02:14 +01:00
Gael Guennebaud
fef534f52e fix scalar * prod in evaluators unit test 2014-01-25 19:06:07 +01:00
Gael Guennebaud
a7621809fe Remove useless register keyword, and optimize predux_min/max for SSE4 2014-01-25 16:54:13 +01:00
Gael Guennebaud
6cf938df53 Add a minimalistic page on CUDA with Eigen. 2014-01-24 13:24:30 +01:00
Gael Guennebaud
afcfb560a2 NVCC: add more debug info 2014-01-24 12:51:33 +01:00
Gael Guennebaud
40c42d9788 NVCC: no need to enforce host compiler 2014-01-24 12:51:05 +01:00
Gael Guennebaud
deab937d45 NVCC: fix closed-form eigenvalue decomposition, workaround gcc4.7/nvcc5.5 issue 2014-01-24 12:50:29 +01:00
yoco
2b89080903 Remove reshape InnerPanel, add test, fix bug 2014-01-20 01:43:28 +08:00
yoco
03723abda0 Remove useless reshape row/col ctor 2014-01-20 00:22:16 +08:00
yoco
342c8e5321 Fix Reshape DirectAccessBit bug 2014-01-20 00:15:19 +08:00
Christoph Hertzberg
66f1c56aab sparse_solve_retval_base::defaultEvalTo created extremely oversized temporary matrices in some cases 2014-01-19 03:04:51 +01:00
yoco
24e1c0f2a1 add reshape test for const and static-size matrix 2014-01-18 23:27:53 +08:00
yoco
150796337a Add unit-test for reshape
- add unittest for dynamic & fixed-size reshape
- fix fixed-size reshape bugs bugs found while testing
2014-01-18 16:10:46 +08:00
yoco
497a7b0ce1 remove c++11, make c++03 compatible 2014-01-18 13:20:14 +08:00
Jitse Niesen
aa0db35185 Add doc page on computing Least Squares. 2014-01-18 01:16:17 +00:00
yoco
9c832fad60 add reshape() snippets 2014-01-18 04:53:46 +08:00
yoco
1e1d0c15b5 add example code for Reshape class 2014-01-18 04:43:29 +08:00
yoco
fe2ad0647a reshape now supported
- add member function to plugin
- add forward declaration
- add documentation
- add include
2014-01-18 04:21:20 +08:00
yoco
7bd58ad0b6 add Eigen/src/Core/Reshape.h 2014-01-18 04:16:44 +08:00
Anton Gladky
4cd4be97a7 Port unsupported constrained CG to Eigen3 2014-01-15 17:49:52 +01:00
Gael Guennebaud
548216b7ca QuaternionBase::slerp was documented twice and one explanation was ambiguous. 2014-01-12 11:09:06 +01:00
Gael Guennebaud
e15cb9f4f8 Make geo_hyperplane unit test more stable (bug #539) 2014-01-11 20:04:36 +01:00
Martinho Fernandes
4c08385b74 Merged eigen/eigen into default 2014-01-10 11:22:24 +01:00
Martinho Fernandes
4ccff2d028 Placement new must use void* to avoid user-specific overloads. 2014-01-10 11:20:40 +01:00
Martinho Fernandes
3a4616d6e3 Add C++11 allocator overloads to avoid implicit conversions. 2014-01-10 11:02:11 +01:00
Gael Guennebaud
92190a1caf Add an example showing how to use C++11 random distributions 2014-01-07 20:23:35 +01:00
Gael Guennebaud
ac409f51f1 Document the fact that Random and setRandom are not reentrant (so not thread-safe) 2014-01-07 20:17:59 +01:00
Gael Guennebaud
a6a57748dd Fix typo 2014-01-05 14:24:41 +01:00
Benoit Steiner
c8c81c1e74 Improved the efficiency if the block-panel matrix multiplication code: the change reduces the pressure on the L1 cache by removing the calls to gebp_traits::unpackRhs(). Instead the packetization of the rhs blocks is done on the fly in gebp_traits::loadRhs(). This adds numerous calls to pset1<ResPacket> (since we're packetizing on the fly in the inner loop) but this is more than compensated by the fact that we're decreasing the memory transfers by a factor RhsPacketSize. 2014-01-02 16:18:32 -08:00
Christoph Hertzberg
60cd361ebe Fix bug #222. Make temporary matrix column-major independently of EIGEN_DEFAULT_TO_ROW_MAJOR 2014-03-26 17:48:30 +01:00
Gael Guennebaud
c8bfbf4a7e Merged in prclibo/eigen (pull request PR-49)
fixed a template type conversion bug in AngleAxis found by Pei Luo
2014-03-25 10:54:40 +01:00
Gael Guennebaud
01fd880424 Revert previous change and introduce a new workaround regarding gcc generating a shufps instruction instead of the more efficient pshufd instruction.
The trick consists in introducing a new pload1 function to be used in low level product kernels for which bug #203 does not apply.
Indeed, it turned out that using inline assembly prevents gcc of doing a good job at instructtion reordering.
2014-03-20 16:03:46 +01:00
Bo Li
e3fb190edf merged incoming udpates 2014-03-20 22:11:13 +08:00
Bo Li
cfd3d6ce9c fixed a template type conversion bug in AngleAxis found by Pei Luo 2014-03-20 22:05:40 +08:00
Gael Guennebaud
c39a3fa7a1 Makes gcc to generate a pshufd instruction for pset1 2014-03-20 10:14:26 +01:00
Gael Guennebaud
2a564695f0 Simpler and hopefully more future-proof fix for bug #503 (aligned_allocator with c++11) 2014-03-19 13:28:50 +01:00
Jitse Niesen
a58325ac2f Minor corrections in QR docs. 2013-12-31 18:06:28 +00:00
Christoph Hertzberg
bbf373bbe9 Applied patch from Richard JW Roberts, resolving bug #704 2013-12-21 22:14:03 +01:00
Christoph Hertzberg
1200bd2ef0 Grafted from 5725:cdedc9e90d21099e8b3191f95425680ebe710d6f
and resolved conflicts
2013-12-21 21:46:27 +01:00
Christoph Hertzberg
8a49dd5626 Fixed typos in comments 2013-12-19 11:55:17 +01:00
Benoit Steiner
ce99b502ce Use vectorization when packing row-major rhs matrices. (bug #717) 2013-12-17 10:49:43 -08:00
Henry de Valence
033ee7f6d9 Fix typo: 'explicitely' -> 'explicitly' 2014-03-08 00:44:56 -05:00
Gael Guennebaud
ba2f79e680 Fix selfadjoint_matrix_vector_product for complex with packet size > 2 (e.g., AVX) 2014-03-07 23:18:20 +01:00
Gael Guennebaud
72461be962 Fix typo and formating 2014-03-07 23:13:14 +01:00
Gael Guennebaud
33ca9b4ee6 Add support for OSX in BTL and fix a few warnings 2014-03-07 23:11:38 +01:00
Gael Guennebaud
ce41b72eb8 Extend sizeof unit test 2014-03-07 23:09:39 +01:00
Christoph Hertzberg
d5cc083782 Fixed bug #754. Only inserted (!defined(_WIN32_WCE)) analog to alloc and free implementation (not tested, but should be correct). 2014-03-05 14:50:00 +01:00
Gael Guennebaud
7313f32efa Help MSVC to inline some trivial functions 2014-03-04 17:24:00 +01:00
Christoph Hertzberg
04e1e38eed bug #289: Removed useless static keywords 2014-03-04 15:10:29 +01:00
Olivier Saut
47679c50ae Typo in the example for Eigen::SelfAdjointEigenSolver::eigenvectors, the first eigenvector should be col(0) not col(1) 2014-03-03 14:44:39 +01:00
Gael Guennebaud
76d2ca27e5 Fix PaStiX support for Pastix 5.2 2014-02-28 13:11:39 +01:00
Christoph Hertzberg
41e89c73c7 Regression test for bug #752 2014-02-27 12:57:24 +01:00
Gael Guennebaud
ac69d8769f Remove early termination in LDLT: the zero on the diagonal of the input matrix does not mean the matrix is not full rank. Typical examples are matrices coming from LS with linear equality constraints. 2014-02-26 10:12:27 +01:00
Christoph Hertzberg
6b6071866b Make pivoting HouseholderQR compatible with custom scalar types 2014-02-25 18:55:16 +01:00
Gael Guennebaud
d357bbd9c0 Fix a few regression regarding temporaries and products 2013-12-14 22:53:47 +01:00
Gael Guennebaud
27c068e9d6 Make selfqdjoint products use evaluators 2013-12-13 18:09:07 +01:00
Gael Guennebaud
e94fe4cc3e fix resizing in noalias for blocks, and make -=/+= use evaluators 2013-12-13 18:06:58 +01:00
Gael Guennebaud
2ca0ccd2f2 Add support for triangular products with evaluators 2013-12-07 17:17:47 +01:00
Gael Guennebaud
8d8acc3ab4 Move inner product special functions to a base class to avoid ambiguous calls 2013-12-04 22:58:19 +01:00
Gael Guennebaud
6c5e915e9a Enable use of evaluators for noalias and lazyProduct, add conversion to scalar for inner products 2013-12-03 17:17:53 +01:00
Gael Guennebaud
f0b82c3ab9 Make reductions compatible with evaluators 2013-12-02 17:54:38 +01:00
Gael Guennebaud
6f1a0479b3 fix a typo triangular assignment 2013-12-02 17:54:15 +01:00
Gael Guennebaud
b5fd774775 Fix flags of Product<> 2013-12-02 17:53:26 +01:00
Gael Guennebaud
34ca81b1bf Add direct assignment of products 2013-12-02 16:37:58 +01:00
Gael Guennebaud
7f917807c6 Fix product evaluator when TEST_EVALUATOR in not ON 2013-12-02 16:19:14 +01:00
Gael Guennebaud
8af1ba5346 Make swap unit test work with evaluators 2013-12-02 15:07:45 +01:00
Gael Guennebaud
c6f7337032 Get rid of call_dense_swap_loop 2013-12-02 14:44:13 +01:00
Gael Guennebaud
626821b0e3 Add evaluator/assignment to TriangularView expressions 2013-12-02 14:06:17 +01:00
Gael Guennebaud
27ca9437a1 Fix usage of Dense versus DenseShape 2013-12-02 14:05:34 +01:00
Gael Guennebaud
d0261bd26c Fix swap in DenseBase 2013-11-30 10:42:23 +01:00
Christoph Hertzberg
276801b25a Fixed and simplified Matlab code and added further block-related examples 2013-11-29 19:54:01 +01:00
Christoph Hertzberg
d61345f366 Fix bug #609: Euler angles are in Range [0:pi]x[-pi:pi]x[-pi:pi].
Now the unit test verifies this (also that it is bijective in this range).
2013-11-29 19:42:11 +01:00
Gael Guennebaud
c15c65990f First step toward the generalization of evaluators to triangular, sparse and other fancyness.
Remove product_tag template parameter to Product.
2013-11-29 17:50:59 +01:00
Gael Guennebaud
fb6e32a62f Get rid of evalautor_impl 2013-11-29 16:45:47 +01:00
Gael Guennebaud
d331def6cc add definition of product_tag 2013-11-29 16:18:22 +01:00
Gael Guennebaud
5584275325 Remove HasEvalTo and all at once eval mode 2013-11-29 13:38:59 +01:00
Gael Guennebaud
cc6dd878ee Refactor dense product evaluators 2013-11-27 17:32:57 +01:00
Gael Guennebaud
fc6ecebc69 Simplify evaluator of EvalToTemp 2013-11-27 11:32:07 +01:00
Gael Guennebaud
49034d1570 Fix bug #708: add placement new/delete for array 2013-11-27 09:46:59 +01:00
Gael Guennebaud
230f5c3aa9 Evaluator: introduce the main Assignment class, add call_assignment to bypass NoAlias and AssumeAliasing, and some bits of cleaning 2013-11-25 15:20:31 +01:00
Gael Guennebaud
c550a0e634 extend Map unit test to check buffers allocated on the stack 2013-11-21 10:39:47 +01:00
Gael Guennebaud
28b2abdbea Fix FullPivHouseholderQR ctors for non squared fixed size matrix types 2013-11-19 12:53:46 +01:00
Gael Guennebaud
654eab3bd6 Add scaling in JacobiSVD to avoid overflows 2013-11-19 11:53:48 +01:00
Gael Guennebaud
5d1291a4de Document how to reproduce matlab's rot90 2013-11-19 11:51:16 +01:00
Gael Guennebaud
8b4dd78d57 Merged in chris-se/eigen/tensor-for-merge (pull request PR-39)
Tensor support for Eigen
2013-11-16 11:12:05 +01:00
Christian Seiler
f6bac196d5 C++11/Tensor: Fix copyright headers 2013-11-16 00:03:23 +01:00
Gael Guennebaud
46dd1bb1be Workaround fixing aliasing issue in x = SparseLU::solve(x) 2013-11-15 11:19:19 +01:00
Gael Guennebaud
6b471f205e fix overflow and ambiguity in SparseLU memory allocation 2013-11-15 10:59:19 +01:00
Christian Seiler
03a956925a CXX11/TensorSymmetry: add symmetry support for Tensor class
Add a symCoeff() method to the Tensor class template that allows the
user of the class to set multiple elements of a tensor at once if they
are connected by a symmetry operation with respect to the tensor's
indices (symmetry/antisymmetry/hermiticity/antihermiticity under
echange of two indices and combination thereof for different pairs of
indices).

A compile-time resolution of the required symmetry groups via meta
templates is also implemented. For small enough groups this is used to
unroll the loop that goes through all the elements of the Tensor that
are connected by this group. For larger groups or groups where the
symmetries are defined at run time, a standard run-time implementation
of the same algorithm is provided.

For example, the following code completely initializes all elements of
the totally antisymmetric tensor in three dimensions ('epsilon
tensor'):

SGroup<3, AntiSymmetry<0,1>, AntiSymmetry<1,2>> sym;
Eigen::Tensor<double, 3> epsilon(3,3,3);
epsilon.setZero();
epsilon.symCoeff(sym, 0, 1, 2) =  1;
2013-11-14 23:35:11 +01:00
Christian Seiler
f97b3cd024 CXX11/Tensor: add simple initial tensor implementation
This commit adds an initial implementation of a class template Tensor
that allows for the storage of objects with more than two indices.
Currently, only storing data and setting the object to zero for POD
data types are implemented.
2013-11-14 22:52:37 +01:00
Christian Seiler
5e28c41549 C++11: add template metaprogramming helpers
Create a new directory CXX11 under unsupported/Eigen that contains code
that requires C++11. In that directory, add a few generic templates
useful for any module relying on C++11. These templates may be included
with #include <[unsupported/]Eigen/CXX11/Core>. At the moment, this
will only provide templates in the Eigen::internal namespace.
2013-11-14 22:27:06 +01:00
Christoph Hertzberg
e59b38abef Implement boolean reductions for zero-sized objects 2013-11-13 16:47:02 +01:00
Gael Guennebaud
8f2d068e84 Use the specialization of Block<SparseMatrix> for const matrices too 2013-11-10 16:16:50 +01:00
Gael Guennebaud
5c2d1b4710 Add missing nonZeros() overload in Block<SparseMatrixBase<>> 2013-11-10 15:26:07 +01:00
Leszek Swirski
b93520b1a5 Install functor folder with cmake 2013-11-08 14:07:11 +00:00
Gael Guennebaud
cb8da751a0 fix broken commit 2013-11-07 22:44:37 +01:00
Gael Guennebaud
fe0b44e876 Fix stupid mistake in CMakeLists.txt 2013-11-07 18:48:17 +01:00
Christoph Hertzberg
ae83f5ede9 Fixed bug #702 and added unit test.
Thanks to Alexander Werner for the report.
2013-11-07 18:32:24 +01:00
Gael Guennebaud
76c230a84d Add an option to test evaluators globally 2013-11-07 16:38:14 +01:00
Gael Guennebaud
57327cc2d5 Drop evaluators for SwapWrapper and SelfCwiseBinaryOp 2013-11-07 14:07:27 +01:00
Gael Guennebaud
5887e82729 Clean evaluator_impl_base. It will probably be removed in the future 2013-11-07 14:02:47 +01:00
Gael Guennebaud
af9851d1d7 bug #99: move the creation of the evaluator to a central place, and make generic_dense_assignment_kernel hold the destination and source evaluators 2013-11-07 12:03:12 +01:00
Gael Guennebaud
8fe609311d Move internal::swap to numext to fix ambiguous call with std::swap 2013-11-07 09:01:26 +01:00
Gael Guennebaud
8edc964734 bug #99: refactor assignment and compound assignment mechanism through "assignment functors" and "assignement kernels".
The former is very low level and generic. The later abstarct the former for dense expressions. This refactoring permits
to get rid of the very ugly SwapWrapper and SelfCwiseBinaryOp classes.
In the future, this will also permit to simplify all these evaluation loops and perhaps to reuse them for reduxions.
That will also permit to specialize for operations like expr1 += expr2 outside Eigen, and so for any kind
of expressions (dense, sparse, tensor, etc.)
2013-11-06 18:17:59 +01:00
Gael Guennebaud
a37bdfc955 Fix static/inline order 2013-11-06 11:13:31 +01:00
Gael Guennebaud
03de5c2410 Split the huge Functors.h file 2013-11-06 10:36:10 +01:00
Gael Guennebaud
4f572e4c14 Add minimalistic unit tests for NVCC support 2013-11-05 15:41:45 +01:00
Gael Guennebaud
87aee5fda1 Allow calling attributes of dynamic size objects from device 2013-11-05 15:40:58 +01:00
Gael Guennebaud
1bb1a57ef7 merge with default branch 2013-11-05 10:31:59 +01:00
Gael Guennebaud
7c9cdd6030 SparseLU: fix estimated non-zeros in U 2013-11-05 00:12:14 +01:00
Gael Guennebaud
a236e15048 JacobiSVD: fix a 0/0 issue for complexes 2013-11-04 23:58:18 +01:00
Gael Guennebaud
ad1dc50b57 Check for minimal norm solutions 2013-11-03 13:19:55 +01:00
Gael Guennebaud
019dcfc21d JacobiSVD: move from Lapack to Matlab strategy for the default threshold 2013-11-03 13:18:56 +01:00
Gael Guennebaud
19521c83b8 bug #677: fix usage of pld instrinsics for ccomplexes 2013-11-02 12:10:48 +01:00
Gael Guennebaud
bbd49d194a Add a rank method with threshold control to JacobiSVD, and make solve uses it to return the minimal norm solution for rank-deficient problems 2013-11-01 18:21:46 +01:00
Gael Guennebaud
8f496cd3a3 Fix changeset 2702788da7
for fixed size matrices
2013-11-01 18:17:55 +01:00
Gael Guennebaud
6dc0e59b1e Fix bug #677: compilation issue on arm64 which does not have the PLD instruction 2013-10-31 13:52:43 +01:00
Gael Guennebaud
2702788da7 Fix bug #678: vectors of row and columns transpositions were not properly resized in FullPivQR 2013-10-29 18:02:18 +01:00
Gael Guennebaud
58c0a6f0fd Fix unused variable warnings 2013-10-29 17:51:19 +01:00
Gael Guennebaud
5974685866 Fix parenthesis min/max issue in mpreal 2013-10-29 17:43:21 +01:00
Christoph Hertzberg
7fae9b358d Use aligned loads in Matrix-Vector product where possible. Fixes bug #689 2013-10-29 12:42:46 +01:00
Gael Guennebaud
e14f529dac Merged in martinhofernandes/eigen (pull request PR-33)
Fix for bug #503
2013-10-29 11:39:20 +01:00
Gael Guennebaud
fe2f437642 Merged in xantares/eigen (pull request PR-36)
Add cmake config files
2013-10-29 11:31:28 +01:00
Gael Guennebaud
90b5d303db Fix bug #672: use exceptions in SuperLU if they are enabled only 2013-10-29 11:26:52 +01:00
Gael Guennebaud
9b863c1830 Merged in vanhoucke/eigen_vanhoucke_unused_variable (pull request PR-34)
Silence unused variable warning.
2013-10-29 11:04:47 +01:00
Gael Guennebaud
11fbbc51fa Fix bug #359: fix AlignedBit flag of CoeffBasedProduct thus enabling the vectorization of more matrix products 2013-10-28 17:48:32 +01:00
Gael Guennebaud
d3e84b747a Clarify the meaning of AlignedBit (bug #359) 2013-10-28 17:44:07 +01:00
Gael Guennebaud
2e606394b1 Fix bug #685: document the range of Random and setRandom 2013-10-28 17:16:03 +01:00
Gael Guennebaud
285112fc55 Fix bug #688: make it clearer that CG is for both dense and sparse matrices. 2013-10-28 15:56:30 +01:00
Gael Guennebaud
9f3f42d66a fix a few "dead stores" warnings 2013-10-26 13:59:02 +02:00
Gael Guennebaud
a0e8577b49 Fix bug #684: optimize vectorization of array-scalar and scalar-array 2013-10-18 14:56:36 +02:00
Thomas Capricelli
a6bff116f9 simplify/uniformize eigen_gen_docs 2013-10-18 12:56:15 +02:00
Christoph Hertzberg
36052c4911 Added comparisons scalar to array (previously only the array to scalar was possible) (Fixes bug #147)
Extended the unit test for that
2013-10-17 15:37:29 +02:00
Christoph Hertzberg
3d2a3bc755 Copy all format flags (not only precision) from actual output stream when calculating the maximal width 2013-10-17 14:30:09 +02:00
Christoph Hertzberg
ad9dc05663 consider all columns for aligned output (fixes bug #616) 2013-10-17 14:14:06 +02:00
Christoph Hertzberg
ff075def5c Copy and paste mistake in last commit 2013-10-17 14:02:00 +02:00
Christoph Hertzberg
4d7dfafbe7 Don't add rowSpacer if columns are not to be aligned 2013-10-17 13:49:56 +02:00
Christoph Hertzberg
3390db099a Fixes bug #681
Also fixed some spelling issues in the documentation
2013-10-17 00:03:00 +02:00
Gael Guennebaud
c6da881849 Fix bug #674: typo in documentation example for BiCGSTAB. They are now proper snippet files. 2013-10-16 15:25:39 +02:00
Christoph Hertzberg
b61facb08b Use != instead of < to check for emptiness of iterator range (fixes bug #664) 2013-10-16 13:10:15 +02:00
Christoph Hertzberg
4a42843513 Make index type of Triplet default to SparseMatrix::Index as suggested by Kolja Brix. Fixes bug #665. 2013-10-16 13:08:09 +02:00
Gael Guennebaud
b433fb2857 Allow .conservativeResize(rows,cols) on vectors 2013-10-16 12:07:33 +02:00
Gael Guennebaud
2c0303c89e bug #679: add respective unit test 2013-10-15 23:51:01 +02:00
Christoph Hertzberg
0bce534c8f Fix bug #679 2013-10-15 19:09:09 +02:00
Thomas Capricelli
6bef527f9d uniformize piwik code among branches 2013-10-11 20:46:18 +02:00
xantares
2d186da58a Add cmake config files 2013-10-09 10:25:50 +02:00
vanhoucke
3736e00ae7 Silence unused variable warning. 2013-10-04 00:21:03 +00:00
Gael Guennebaud
40f1548b32 Sparse is stable now, so Eigen/Eigen should include Sparse 2013-10-02 23:31:59 +02:00
Gael Guennebaud
446320b226 Fix dot*w to return 0 for empty vectors (BLAS interface) 2013-10-01 22:37:10 +02:00
Desire NUENTSA
54e576c88a Fix SPQR Solve() when assigning to a Map object 2013-09-26 15:00:22 +02:00
Desire NUENTSA
fe19f972e1 Fix leaked memory for successive calls to SPQR 2013-09-24 15:56:56 +02:00
Gael Guennebaud
00dc45d0f9 Reduce explicit zeros when applying SparseQR's matrix Q 2013-09-20 23:28:10 +02:00
Desire NUENTSA
4bb1c48f25 Add a block sparse matrix class. tests to be added 2013-09-20 18:54:17 +02:00
Desire NUENTSA
bd21c82a94 Fix assert bug in sparseQR 2013-09-20 18:49:32 +02:00
Gael Guennebaud
1b4623e713 Fix elimination tree and SparseQR with rows<cols 2013-09-12 22:16:35 +02:00
Martinho Fernandes
a1f056cf2a Fix bug #503
C++11 support on simple allocators comes for free. `aligned_allocator` does not
need to add any `construct` overloads to work with C++11 compilers.
2013-09-10 17:08:04 +02:00
Gael Guennebaud
4612a1cd87 Fix ploaddup and lin-spaced with AltiVec. 2013-09-10 16:13:59 +02:00
Gael Guennebaud
07417bd03f Fix bug #654: allow implicit transposition in Array to Matrix and Matrix to Array constructors 2013-09-07 00:01:04 +02:00
Gael Guennebaud
7fa007e8bf Fix sparse block 2013-09-07 00:00:13 +02:00
Gael Guennebaud
ed78a76161 Merged in advanpix/eigen-mp-devs (pull request PR-32)
Fixes for SparseMatrix to support non-POD scalar types
2013-09-03 22:05:14 +02:00
Gael Guennebaud
eda2f8948a Another compilation fix with ICC/MSVC combo 2013-09-03 21:42:59 +02:00
Jitse Niesen
16cbd3d72d BDCSVD: Use rational interpolation to solve secular equation.
Algorithm is rather ad-hoc and falls back on bisection if required.
2013-08-27 15:30:11 +01:00
Hauke Heibel
86daf2f75c Added missing inline statements in order to prevent linker errors. 2013-08-27 15:41:18 +02:00
Hauke Heibel
69c057ccb1 Fixed InnerPanel definition in the Transformation class.
Added some inital documentation on InnerPanel.
2013-08-27 14:54:57 +02:00
Gael Guennebaud
94a7a1ec00 Use unblocked version if the matrix is too small, plus some cleaning. 2013-08-27 13:47:15 +02:00
Gael Guennebaud
5864e3fbd5 Implement a blocked upper-bidiagonalization algorithm. The computeUnblocked function is currently for benchmarking purpose. 2013-08-27 07:23:31 +02:00
Pavel Holoborodko
d2c4f4ab21 Updated mpfr::mpreal. Move semantic support, RVO, other new features 2013-08-26 00:22:18 +09:00
Pavel Holoborodko
41321e4366 Replaced memcpy & memmove to smart_* alternatives for non-POD scalar types 2013-08-25 18:12:15 +09:00
Pavel Holoborodko
e6462c2ce3 Switched to smart_copy to support non-trivial scalar types 2013-08-25 18:03:49 +09:00
Pavel Holoborodko
1472f4bc61 Fixed bug #647 by using smart_copy instead of bitwise memcpy. 2013-08-25 18:02:07 +09:00
Pavel Holoborodko
a147500dee Added smart_memmove with support of non-POD scalars (e.g. needed in SparseBlock.h). 2013-08-25 18:00:28 +09:00
Jitse Niesen
d1c48f1606 BDCSVD: Use HouseholderSeq directly. 2013-08-21 14:34:48 +01:00
Gael Guennebaud
1b8394f71f Fix compilation with ICC/MSVC combo 2013-08-21 15:28:53 +02:00
Gael Guennebaud
4ecfdc4716 Add explanations of the logic behind the matrix-vector products 2013-08-21 14:29:53 +02:00
Gael Guennebaud
d9381598bc Allows EIGEN_STACK_ALLOCATION_LIMIT to be 0 for no limit 2013-08-21 14:29:00 +02:00
Jitse Niesen
403be74861 BDCSVD: Compute SVD of combined problem directly.
First step at implementing final stage in BDCSVD algorithm.
Uses bisection method to solve nonlinear equation.
Still lots of room for optimization.
2013-08-20 14:10:55 +01:00
Gael Guennebaud
1c61e28b32 Fix indentation 2013-08-20 14:13:41 +02:00
Gael Guennebaud
c06e373beb Fix compilation with non-msvc compilers. 2013-08-20 14:12:42 +02:00
Gael Guennebaud
7bca2910c7 Make the static assertions on maximal fixed size object use EIGEN_STACK_ALLOCATION_LIMIT, and raise its default value to 128KB 2013-08-20 13:59:33 +02:00
Gael Guennebaud
2cf513e973 Merged in advanpix/eigen-mp-devs (pull request PR-31)
Added support for custom scalars in SparseLU
2013-08-20 12:10:38 +02:00
Gael Guennebaud
150c9fe536 Make FullPivHouseholderQR::solve returns the least-square solution instead of aborting if no exact solution exist 2013-08-20 11:52:48 +02:00
Pavel Holoborodko
e4ffb7729a Removed unnecessary parentheses 2013-08-20 16:06:13 +09:00
Pavel Holoborodko
d908ccc01c Added support for custom scalars 2013-08-20 15:00:28 +09:00
Gael Guennebaud
2b15e00106 Make ArrayBase operator+=(scalar) and -=(scalar) use SelfCwiseBinaryOp optimization 2013-08-19 16:40:50 +02:00
Gael Guennebaud
127d7f2071 Fix bug #643: enable vectorization of compound assignement for fixed size objects 2013-08-19 16:34:09 +02:00
Gael Guennebaud
c47010e3d2 typo 2013-08-19 16:10:00 +02:00
Gael Guennebaud
d4dd6aaed2 Fix bug #642: add vectorization of sqrt for doubles, and make sqrt really safe if EIGEN_FAST_MATH is disabled 2013-08-19 16:02:27 +02:00
Jitse Niesen
d3635b08da Merged in advanpix/eigen-mp-devs (pull request PR-30)
Added support for custom-scalars
2013-08-19 11:41:22 +01:00
Pavel Holoborodko
ebd6a7a46c Added support for custom-scalars 2013-09-02 19:09:39 +09:00
Christoph Hertzberg
e0dbc2913a Documentation of deprecated struct. Closing bug #426. 2013-08-16 16:43:02 +02:00
Christoph Hertzberg
1d89554f1b Deprecate boolean sum operator (bug #426) 2013-08-13 14:54:09 +02:00
Gael Guennebaud
ace2ed7b87 Fix broken link on transforming normals 2013-08-12 13:38:25 +02:00
Gael Guennebaud
956251b738 bug #638: fix typos in sparse tutorial 2013-08-12 13:37:47 +02:00
Hauke Heibel
6f5f488a80 Switched to MPL2 license. 2013-08-12 07:39:24 +02:00
Gael Guennebaud
916d29e58f Backout parts of changeset 6719e56b5b
(these changes were not intended to be commited)
2013-08-11 19:26:41 +02:00
Gael Guennebaud
bffdc491b3 Fix cost evaluation of partial reduxions -> improve performance of vectorwise/replicate expressions involving partial reduxions 2013-08-11 19:21:43 +02:00
Gael Guennebaud
6719e56b5b Ref<> objects must be nested by reference because they potentially store a temporary object 2013-08-11 17:52:43 +02:00
Jitse Niesen
c13e9bbabf QuickReference.dox: std::tan(array) --> tan(array), same for other functions. 2013-08-11 10:17:23 +01:00
Hauke Heibel
e4acd6e2fd Added copy constructor and assignment to DenseStorage.
Required by the standard even when its not used but elided.
Added a test for DenseStorage copying and assignment.
2013-08-10 19:13:46 +02:00
Hauke Heibel
8a89ba9275 Added alternative C++11 detection. 2013-08-10 19:11:03 +02:00
Hauke Heibel
097a105603 Disabled std::log1p on Cygwin. 2013-08-10 19:10:23 +02:00
Jitse Niesen
306ce33e1c BDCSVD: Streamline compute() and copyUV() 2013-08-07 16:34:34 +01:00
Jitse Niesen
616f9cc593 doc: Explain type of result for VectorwiseOp member functions.
Prompted by a question on the forum.
2013-08-06 09:49:44 +01:00
Jitse Niesen
2f0faf117e Remove LinearLeastSquares.dox , which should not have been added.
Accidentally included in changeset e37ff98bbb
 .
2013-08-06 08:03:39 +01:00
Hauke Heibel
8710440951 Removed errornous swap for stack storage. 2013-08-03 10:09:31 +02:00
Jitse Niesen
8fdffdd573 Move inheritance from Eigen example in stand-alone file.
Also fix a small mistake (Vector3d instead of VectorXd).
2013-08-02 22:33:12 +01:00
Hauke Heibel
3444f06f68 Removed a warning when rvalue references are not unsed. 2013-08-02 22:54:01 +02:00
Hauke Heibel
8f4d93a4b7 Fix compilation.
The Matrix is required to be mutable but it also needs to be a reference and
temporaries do not bind to non-const references - thus we need a hack and
cast away the constness.
2013-08-02 22:40:36 +02:00
Hauke Heibel
51b361b3bb Ensure that (potentially aligned) stack objects are passed by reference. 2013-08-02 21:07:39 +02:00
Hauke Heibel
7c99b38b7c Added move support for Matrix and Array.
Added EIGEN_HAVE_RVALUE_REFERENCES define.
Added move unit tests.
Removed superfluous 'inline' declarations in DenseStorage.
2013-08-02 19:59:43 +02:00
Gael Guennebaud
b72a686830 Fix bug #635: add isCompressed to MappedSparseMatrix for compatibility 2013-08-02 11:11:21 +02:00
Gael Guennebaud
e3058dd88b Make Pardiso solvers non copyabe 2013-08-02 11:09:02 +02:00
Gael Guennebaud
8ea7413a64 Fix compilation and warning of PARDISO 2013-08-02 11:05:00 +02:00
Gael Guennebaud
e90229a429 reduce cancellation probablity 2013-08-02 00:36:06 +02:00
Hauke Heibel
cf884a9815 Added build name support for VC11 and its service packs. 2013-08-01 16:38:05 +02:00
Gael Guennebaud
ddf7753631 Add nvcc support for small eigenvalues decompositions and workaround lack of support for std::swap and std::numeric_limits 2013-08-01 16:26:57 +02:00
Hauke Heibel
222eedf5f3 Removed unused testing files. 2013-08-01 12:14:03 +02:00
Gael Guennebaud
d0e543be26 Remove superfluous testing files (as changeset e41bc6cbbf
but more complete)
2013-07-31 23:11:43 +02:00
Hauke Heibel
8e6d0cba5f Added a pattern which forces LF line endings for *.sh files. 2013-07-31 18:20:58 +02:00
Hauke Heibel
32d46dd9b8 Backed out changeset: e41bc6cbbf 2013-07-31 18:03:34 +02:00
Hauke Heibel
e41bc6cbbf Removed unused test files. 2013-07-31 17:21:32 +02:00
Gael Guennebaud
55b57fcba6 Disable some shortcuts with nvcc 2013-07-31 16:56:31 +02:00
Hauke Heibel
39491e3b75 Enable support for minimal rebuilds. 2013-07-31 16:16:08 +02:00
Jitse Niesen
68168e9eae MatrixFunctions: replace eval() by nested.
This eliminates an unnecessary copy in some situations, e.g. Map.
2013-07-31 14:57:20 +01:00
Gael Guennebaud
6126ad801f Extend support for nvcc to Array objects and wrappers 2013-07-31 15:30:50 +02:00
Hauke Heibel
43df1e707c Merged in advanpix/eigen-mp-3.2 (pull request PR-29)
Quick fix in order to be custom-scalar friendly.
2013-07-30 08:11:39 +02:00
Hauke Heibel
b1f4601bf9 Removed non-standard conforming (17.4.3.1.2/1) leading underscore. 2013-07-30 08:05:10 +02:00
Pavel Holoborodko
acb82c7f16 Quick fix in order to be custom-scalar friendly. 2013-07-29 20:13:52 +09:00
Hauke Heibel
9ef3645cc7 Removed 'T' prefix from types and thus fixed compilation for GCC. 2013-07-29 12:08:50 +02:00
Sven Strothoff
5f11db695b bug #502: add bool intersects() methods to AlignedBox 2013-07-28 23:59:37 +02:00
Hauke Heibel
2437215221 Fixed constness in Array- and MatrixWrapper.
This also fixes the compilation on VC11.
2013-07-28 22:46:38 +02:00
Hauke Heibel
dd27b5c4a8 Fixed dummy_precision evaluation. 2013-07-28 19:31:33 +02:00
Jitse Niesen
70131120ab Fix bug in MatrixFunctions for matrices with multiple eigenvalues.
Store indices, not eigenvalues, in clusters.
Bug was introduced in changeset a3a55357db
.
2013-07-26 15:39:18 +01:00
Jitse Niesen
6d86cd7224 merge 2013-07-26 14:30:28 +01:00
Hauke Heibel
75dab1ce5e Fixed floating point warning.
Fixed evaluation of matrix_exp_computeUV.
2013-07-26 15:13:54 +02:00
Jitse Niesen
e43934d60f MatrixFunctions: Clean up StemFunction.h 2013-07-26 13:51:10 +01:00
Hauke Heibel
75edc7cc8b Fixed VC11 compilation.
The typedefs Lhs/Rhs in the base class are now accessible by derived classes.
2013-07-26 11:05:21 +02:00
Hauke Heibel
5897695e8a Merged simple geometry asserts. 2013-07-25 21:21:21 +02:00
Jitse Niesen
a3a55357db Clean up MatrixFunction and MatrixLogarithm. 2013-07-25 15:08:53 +01:00
Jitse Niesen
084dc63b4c Clean-up of MatrixSquareRoot. 2013-07-22 13:56:15 +01:00
Jitse Niesen
463343fb37 Clean-up of MatrixExponential:
* put internal stuff in the internal namespace
* replace member functions by free functions
2013-07-21 21:31:15 +01:00
Jitse Niesen
5879937f58 Merge in jdh8's branch.
* Enable singular matrix power and complex exponents.
* Eliminate unnecessary copying for sparse Kronecker product.
2013-07-21 20:50:15 +01:00
Chen-Pang He
01190b3544 Directly code failing example, or it breaks make doc. 2013-07-21 18:09:11 +08:00
Chen-Pang He
c00f688c64 Fix doc. (It is also used by computeFracPower) 2013-07-21 05:40:56 +08:00
Chen-Pang He
51573da3a4 Warn about power of a matrix with non-semisimple 0 eigenvalue. 2013-07-21 01:00:36 +08:00
Chen-Pang He
1191949e87 Improve documentation on Kronecker product module. 2013-07-21 00:19:46 +08:00
Chen-Pang He
3d94ed9fa0 Document on MatrixExponential::ScalingOp 2013-07-21 00:18:19 +08:00
Chen-Pang He
ede27f5780 Apply argument-dependent lookup on user-defined types. (using std::) 2013-07-20 23:30:37 +08:00
Chen-Pang He
dda869051d Optimize MatrixPower::computeIntPower 2013-07-20 18:47:54 +08:00
Chen-Pang He
2320073e41 Comment on private members of MatrixPower. 2013-07-20 17:58:12 +08:00
Chen-Pang He
c587e63631 Simplify MatrixPower::split 2013-07-20 17:49:38 +08:00
Gael Guennebaud
660b905e12 Fix ICE with ICC 11 2013-07-19 11:46:54 +02:00
Gael Guennebaud
4f0bd557a4 Previous isFinite->hasNonFinite change was broken. After discussion let's rename it to allFinite 2013-07-18 11:27:04 +02:00
Desire NUENTSA
736fe99fbf Fix bug #326 : expose tridiagonal eigensolver to end-users through ComputeFromTridiagonal() 2013-07-18 10:32:31 +02:00
Gael Guennebaud
6fab4012a3 Rename isFinite to hasNonFinite to avoid future naming collisions. 2013-07-17 21:13:45 +02:00
Gael Guennebaud
2f593ee67c merge with main branch 2013-07-17 13:21:35 +02:00
Gael Guennebaud
20e535e142 Bump default branch to 3.2.90 2013-07-17 10:04:20 +02:00
Gael Guennebaud
bbaef8ebba SparseLU: make COLAMDOrdering the default ordering method. 2013-07-17 09:30:25 +02:00
Gael Guennebaud
bd689ccc28 IncompleteLUT should not raise an assert in compute if factorize failed. 2013-07-17 09:21:07 +02:00
Gael Guennebaud
e3774e93b7 Fix vompilation of bdcsvd with ICC. 2013-07-17 09:20:30 +02:00
Gael Guennebaud
db8e88c936 Fix testing issues with x87 extra precision. 2013-07-16 17:35:08 +02:00
Desire NUENTSA
cfd7f9b84a avoid unneeded const_cast 2013-07-16 15:56:05 +02:00
Desire NUENTSA
3e094af410 Fix Sparse LU for matrices in non compressed mode 2013-07-16 15:15:53 +02:00
Gael Guennebaud
adeaa657eb Expose InnerSizeAtCompileTime in SparseMatrixBase (it was already present in DenseBase) and simplify sparse_vector_assign_selector (this also fix a stupid warning in old gcc versions) 2013-07-16 09:49:01 +02:00
Gael Guennebaud
f2aba7a768 Remove obsolete sentence on LPGL in MKL doc. 2013-07-15 23:25:01 +02:00
Gael Guennebaud
d02e329218 Fix adjoint unit test: test_isApproxWithRef works for positive quantities only. 2013-07-15 21:21:14 +02:00
Gael Guennebaud
c76990664b Add bdcsvd unit test in CMakeLists 2013-07-15 21:16:57 +02:00
Chen-Pang He
4b780553e0 Eliminate unnecessary copying for sparse Kronecker product. 2013-07-15 09:10:17 +08:00
Chen-Pang He
9be658f701 generateTestMatrix can use processTriangularMatrix 2013-07-15 00:43:14 +08:00
Chen-Pang He
b8f0364a1c Test singular matrix power with square roots. Exponent laws are too unstable. 2013-07-15 00:10:17 +08:00
Gael Guennebaud
ee244d54f4 SparseVector::assign: it is not always possible to reserve according to given non-zeros. 2013-07-14 11:56:08 +02:00
Chen-Pang He
cbe92de2b5 Fix typo in testSingular. 2013-07-14 17:27:44 +08:00
Chen-Pang He
eeb744dc8d Add test3dRotation. 2013-07-14 02:00:50 +08:00
Gael Guennebaud
4bb0fff151 Rationalize assignment to sparse vectors 2013-07-13 19:45:05 +02:00
Chen-Pang He
d5501d3a90 Document on MatrixPowerAtomic. 2013-07-13 23:13:07 +08:00
Chen-Pang He
3c423ccfe2 Document on complex matrix power. 2013-07-13 22:12:09 +08:00
Chen-Pang He
738d75d3eb Document on the return type of MatrixPower::operator() 2013-07-13 22:11:36 +08:00
Gael Guennebaud
9a16519d62 Extend the "functions taking Eigen type" doc page to present the Ref<> option. 2013-07-13 12:36:55 +02:00
Gael Guennebaud
06a5bcecf6 Stabilize eulerangle unit test. 2013-07-13 10:55:04 +02:00
Gael Guennebaud
7ee378d89d Fix various scalar type conversion warnings. 2013-07-12 16:40:02 +02:00
Gael Guennebaud
61c3f55362 Relax slerp unit test 2013-07-12 14:30:28 +02:00
Gael Guennebaud
5431473d67 Fix SparseMatrix::conservativeResize() when one dimension is null 2013-07-12 14:10:02 +02:00
Desire Nuentsa
444c09e313 Fix constness of diagonal() and transpose() for MSVC. 2013-07-11 12:36:57 +02:00
Gael Guennebaud
84f52ad317 Remove double const qualifier 2013-07-10 23:54:53 +02:00
Gael Guennebaud
6d1f5dbaae Add no_assignment_operator to a few classes that must not be assigned, and fix a couple of warnings. 2013-07-10 23:48:26 +02:00
Gael Guennebaud
71cccf0ed8 Rename map unit test to mapped_matrix: without splitting unit tests, this created a "map" binary file in the include path, not a good idea! 2013-07-10 23:26:35 +02:00
Gael Guennebaud
5a4519d2b4 Revisit the implementation of random_default_impl for integer to make sure avoid overflows and compiler warnings. 2013-07-10 21:11:41 +02:00
Chen-Pang He
a992fa74eb Make non-conversion unary constructors explicit. 2013-07-11 02:31:13 +08:00
Chen-Pang He
4466875d54 The only(?) way to test complex matrix power. 2013-07-10 02:59:16 +08:00
Chen-Pang He
5c95892b83 Test power of singular matrices. 2013-07-10 02:57:54 +08:00
Chen-Pang He
639d03d900 These casts are unnecessary because isApprox already casts them. 2013-07-10 02:53:15 +08:00
Chen-Pang He
d204bb57d0 Remove unused struct definition in test. 2013-07-10 02:48:17 +08:00
Chen-Pang He
c52cbd9de9 Write doc for positive power of a matrix with a semisimple zero eigenvalue. 2013-07-10 02:44:38 +08:00
Chen-Pang He
159a3bed9e Write doc for complex power of a matrix. 2013-07-10 02:43:10 +08:00
Chen-Pang He
25544dbec3 Add assertion against undefined matrix power. 2013-07-10 02:36:34 +08:00
Jitse Niesen
f850550e3e merge 2013-07-08 14:11:25 +01:00
Chen-Pang He
04bd1e3fc0 Slightly optimize atanh2. 2013-07-08 16:49:27 +08:00
Gael Guennebaud
0567cf96cc Ease setting build options when running ctest -D 2013-07-07 17:25:58 +02:00
Chen-Pang He
00e30a5fc4 We need not prohibit assignment here. Thanks to changeset 3edd4681f2
.
2013-07-07 19:57:23 +08:00
Chen-Pang He
55ec3cc6d5 Prevent copying with internal::noncopyable. 2013-07-07 19:34:13 +08:00
Gael Guennebaud
4f28ccdd0e Rationalize the use of Index type in iterators 2013-07-06 22:05:49 +02:00
Gael Guennebaud
9b833aff42 Use numeric_limits to get NaN and inf 2013-07-06 22:01:14 +02:00
Gael Guennebaud
3edd4681f2 ReturnByValue should not be assignable! 2013-07-06 20:26:02 +02:00
Gael Guennebaud
d0142e963b Fix ambiguity from the origin of Index type in BlockImpl<Sparse>::InnerIterator 2013-07-06 17:33:49 +02:00
Gael Guennebaud
8ba7ccf16a bug #63: add lapack unit tests. They are automatically downloaded and configured if EIGEN_ENABLE_LAPACK_TESTS is ON. 2013-07-06 15:08:42 +02:00
Gael Guennebaud
cc03c9d683 bug #556: workaround mingw bug with -O3 or -fipa-cp-clone 2013-07-05 23:47:40 +02:00
Gael Guennebaud
4f14b3fa72 Fix bug #611: diag * sparse * diag 2013-07-05 22:42:46 +02:00
Chen-Pang He
9e2b4eeac0 Const-correct the scaling functor. 2013-07-05 23:28:57 +08:00
Gael Guennebaud
9b9177f1ce Fix a couple of warnings in unit tests. 2013-07-05 13:35:34 +02:00
Gael Guennebaud
7d8823c8b7 Use true compile-time branching in SparseVector::assign to handle automatic transposition. 2013-07-05 09:14:32 +02:00
Chen-Pang He
c273a6c37c Avoid pow(Scalar, int) for C++11 conformance. 2013-07-05 03:33:56 +08:00
Chen-Pang He
04a9ad6e10 Let complex power fall back to "log, scale, exp". 2013-07-05 00:28:28 +08:00
Chen-Pang He
4e26057f66 Remove unused declarations for MatrixPowerProduct. 2013-07-05 00:08:11 +08:00
Desire NUENTSA
edba612f68 Fix unresolved typename bug for MSVC 2013-07-04 16:56:01 +02:00
Chen-Pang He
cce68d4e91 Remove unused inclusions. 2013-07-04 18:39:33 +08:00
Chen-Pang He
75b3391e3f Enable singular matrix power using unitary similarities. 2013-07-04 18:37:46 +08:00
Gael Guennebaud
4020d4286f Fix bug in sparse documentation. 2013-07-04 06:49:24 +02:00
Chen-Pang He
3cda1deb52 Simplify class hierarchy. 2013-07-04 05:10:43 +08:00
Chen-Pang He
eaf92ef48c Remove unreachable MatrixPowerTriangular, paving the way to future cleanups. 2013-07-04 04:42:02 +08:00
Gael Guennebaud
155fa0ca83 Add missing namespace prefix in pconj 2013-07-03 11:36:12 +02:00
Jitse Niesen
4e458d309c Fix some doxygen errors and warnings. 2013-07-02 14:08:12 +01:00
Jitse Niesen
419b5cff44 doc: Mention vec=vec.head(n) in aliasing page. 2013-07-02 13:35:36 +01:00
Gael Guennebaud
1caeb814f0 Fix bicgstab for complexes, and avoid a duplicate computation 2013-07-02 08:14:10 +02:00
Gael Guennebaud
f8e325356a It's better to check that eigen_assert does raise an assert rather than testing the definition of NDEBUG 2013-07-01 13:48:21 +02:00
Gael Guennebaud
65cc51288a On windows CE, assert.h defines NDEBUG if DEBUG is not defined 2013-07-01 13:47:25 +02:00
Gael Guennebaud
22820e950e Improve BiCGSTAB robustness: fix a divide by zero and allow to restart with a new initial residual reference. 2013-07-01 11:49:23 +02:00
Gael Guennebaud
99bef0957b Add missing sparse matrix constructor from sparse self-adjoint views, and add documentation for sparse time selfadjoint matrix 2013-06-28 22:56:26 +02:00
Desire NUENTSA
9f035c876a Fiw bug #553: add support for sparse matrix time sparse self-adjoint view products 2013-06-28 22:27:45 +02:00
Gael Guennebaud
fc27cbd914 Fix bug #611: fix const qualifier in cwiseProduct(sparse,dense) and SparseDiagonalProduct::InnerIterator 2013-06-28 17:10:53 +02:00
Gael Guennebaud
a915f0292e Fix bug #626: add assertion on input ranges for coeff* and insert members for sparse objects 2013-06-28 16:16:02 +02:00
Gael Guennebaud
4cf742525f bug #626: add compiletime check of the Options template parameter of SparseMatrix and SparseVector. Fix eval and plain_object for sparse objects. 2013-06-28 15:56:43 +02:00
Gael Guennebaud
487d94f495 Fix bug #623: inlining test_is_equal leads to failures with x87 2013-06-27 22:30:46 +02:00
Gael Guennebaud
74beb218d2 Fix bug #554: include unistd.h before checking the presence of posix_memalign. 2013-06-26 22:49:14 +02:00
Jitse Niesen
ffbe04ae78 Merged in jdh8/eigen (pull request PR-27): Matrix power cleanup 2013-06-25 13:05:37 +01:00
Gael Guennebaud
95f8a738ea Introduce a TEST_SET_BUT_UNUSED_VARIABLE macro for initialized but unused variables in the unit tests and also fix a few other warnings. 2013-06-25 11:42:04 +02:00
Gael Guennebaud
231d4a6fda Workarounf nvcc not being able to find RowMajor when declaring a Matrix<...> inside another namespace. 2013-06-25 10:08:50 +02:00
Chen-Pang He
7b6e94fb58 Clean namespace pollution. 2013-06-25 02:56:30 +08:00
Chen-Pang He
b9543ce237 Matrix square root can process 0 eigenvalue. 2013-06-24 23:57:57 +08:00
Chen-Pang He
b9fc9d8f32 Remove mat.pow * vec specialization, which causes segfault for mat.pow * mat.pow 2013-06-24 23:56:17 +08:00
Gael Guennebaud
4cc9377941 fix casting from double* to void* in SuperLU and Cholmod support 2013-06-24 17:24:32 +02:00
Chen-Pang He
ee8a28fb85 Fix segfault and bug with equal eivals in matrix power (bug #614). 2013-06-24 13:58:51 +01:00
Gael Guennebaud
1330ca611b CwiseUnaryView should not inherit no_assignment_operator! 2013-06-24 13:45:33 +02:00
Gael Guennebaud
c21a04bcf9 fix compilation of ArrayBase::transposeInPlace 2013-06-24 13:35:13 +02:00
Gael Guennebaud
c695cbf0fa fix compilation of ArrayBase::transposeInPlace 2013-06-24 13:33:44 +02:00
Gael Guennebaud
8bbde351e7 bug #620: fix robustness issue in JacobiSVD::solve (also fix a perf. issue) 2013-06-24 13:08:09 +02:00
Gael Guennebaud
d1d7a1ade9 Workaround a bunch of stupid warnings in unit tests 2013-06-23 19:11:32 +02:00
Simon Pilgrim
fab0235369 Fix bug #590: NEON Duplicate lane load 2013-06-23 14:13:21 +02:00
Gael Guennebaud
bea4a67c92 that's getting harder and harder to make ICC, GCC and clang all happy: one wants type_name to be static and if it is so then the other one triggers 'unused function' warnings -> a forward declaration seems to do the trick 2013-06-22 10:51:45 +02:00
Gael Guennebaud
260a923334 explicit template specialization cannot have a storage class 2013-06-22 10:30:26 +02:00
Gael Guennebaud
3ed919e0ed Fix an shut down a few ICC's remarks 2013-06-22 10:19:50 +02:00
Gael Guennebaud
dd964ec08c Fix a couple of warnings 2013-06-21 19:06:45 +02:00
Gael Guennebaud
620e4277bc Disable ASM comments on non x86 architecture and do not redfine if EIGEN_ASM_COMMENT is already defined 2013-06-21 17:49:36 +02:00
Gael Guennebaud
8cc9b12589 Add missing using std::pow in lpNorm. 2013-06-21 11:37:33 +02:00
Gael Guennebaud
cf5c5ed725 Fix warning typedef XXX locally defined but not used 2013-06-21 09:27:38 +02:00
Gael Guennebaud
7adfca5af2 Shutdown clang warning: argument unused during compilation: '-ansi' at linking time 2013-06-21 09:24:57 +02:00
Gael Guennebaud
c0cad44da6 Reduce maximum number of warnings/errors. (they took GBs even for limited period of time) 2013-06-20 17:39:15 +02:00
Gauthier Brun
8105b5ed3f new unsupported and not finished SVD, using a divide and conquert algorithm, with tests and benchmark 2013-06-19 00:03:27 +02:00
Gael Guennebaud
ba79e39c5c bug #71: enable vectorization of diagonal products in more cases. 2013-06-18 17:44:25 +02:00
Gael Guennebaud
eef8d98139 Fix bug #542: fix detection of compiler version on systems without the head command. 2013-06-18 17:25:37 +02:00
Jitse Niesen
4e6d746514 Avoid phrase "static allocation" for local storage on stack (bug #615). 2013-06-18 14:35:12 +01:00
Jitse Niesen
e37ff98bbb Implement mixed static/dynamic-size .block() (bug #579) 2013-06-18 14:29:15 +01:00
Kolja Brix
05da15bf40 bug #230, fix compilation issues and wrong static assertions 2013-06-18 09:44:40 +02:00
Gael Guennebaud
33788b97dd Fix compilation issue with some compilers (when doing using Base::foo;, foo must be visible in the direct base class) 2013-06-18 00:48:47 +02:00
Jitse Niesen
79bd6fa5ee Require at least cmake version 2.8.2 (bug #606). 2013-06-17 22:12:01 +01:00
Jitse Niesen
a8494787f4 Merged in RhysU/eigen//fix-documentation-typo-1371479301909 (pull request PR-25)
Fix documentation typo
2013-06-17 15:35:44 +01:00
Rhys Ulerich
437e26d000 Fix documentation typo 2013-06-17 14:28:42 +00:00
Gael Guennebaud
55365566b2 Fix HouseholderSequence::conjugate() and ::adjoint() and add respective unit tests. 2013-06-17 00:14:42 +02:00
Gael Guennebaud
9f11f80db1 Make psqrt works with numeric_limits<float>::min 2013-06-14 10:55:05 +02:00
Gael Guennebaud
5f178e54e9 Extend sparse-block unit test to explicitly cover bug #584 2013-06-14 10:52:19 +02:00
Jeff Dean
d5fa5001a7 Fix bug #613: psqrt was incorrect for small numbers 2013-06-13 18:17:27 +02:00
Gael Guennebaud
3352b8d873 Extend the magnitude range of tested numbers in packet math unit tests 2013-06-13 18:12:58 +02:00
Gael Guennebaud
d541765e85 Fix copy constructor signature 2013-06-12 18:02:13 +02:00
Gael Guennebaud
f75419c711 Add missing changes. 2013-06-12 17:56:15 +02:00
Gael Guennebaud
f3a029e957 Remove meaningless explicit qualifier 2013-06-12 13:05:23 +02:00
Gael Guennebaud
1b92d2ca33 Suppress warning #2304: non-explicit constructor with single argument may cause implicit type conversion 2013-06-12 13:02:30 +02:00
Gael Guennebaud
f6c1841316 compilation fixes in unsupported 2013-06-12 12:52:41 +02:00
Gael Guennebaud
65c59307e2 Fix sparse_basic unit test conflict 2013-06-12 10:37:15 +02:00
Gael Guennebaud
62670c83a0 Fix bug #314: move remaining math functions from internal to numext namespace 2013-06-10 23:40:56 +02:00
Gael Guennebaud
827843bbbd Complete the lapack interface to make it complete enough for suitesparse QR. 2013-06-12 10:12:50 +02:00
Gael Guennebaud
76f4820560 Improve SuiteSparse cmake scripts 2013-06-12 10:12:05 +02:00
Gael Guennebaud
f0efe60924 Fix implicit conversion warnings 2013-06-12 09:25:58 +02:00
Gael Guennebaud
92eb807c30 Fix warning: explicitely initialize all member of IOFormat 2013-06-12 09:24:07 +02:00
Gael Guennebaud
7742eacfeb Add default value for IsRepeatable in functor_traits 2013-06-12 09:22:59 +02:00
Gael Guennebaud
f3af423c70 Add missing dependency in SparseSholesky header 2013-06-11 21:13:30 +02:00
Desire NUENTSA
1bf18bd57f Fix bug in SparseLU dfs for dense matrices 2013-06-11 14:48:04 +02:00
Desire NUENTSA
9266f65318 Fix bug #588 : Compute a determinant using SparseLU 2013-06-11 14:46:13 +02:00
Desire NUENTSA
4cd8245c96 Add support with unit test for off-diagonal sparse matrix views 2013-06-11 14:42:29 +02:00
Desire NUENTSA
b3fff170a0 Restore internal math functions for unit tests 2013-06-11 14:31:31 +02:00
Gael Guennebaud
18e476107e Fix bug #583: add compile-time check that DenseIndex is signed 2013-06-10 17:16:16 +02:00
Simon Pilgrim
ca67c60150 Fix bug #591: minor optimization in NEON vectorization support 2013-06-10 15:59:03 +02:00
Gael Guennebaud
05c9be65ce Fix bug #595: typo 2013-06-10 13:10:36 +02:00
Gael Guennebaud
a4a575e2a3 fix bug #597: typo in sparse documentation 2013-06-10 12:13:31 +02:00
Gael Guennebaud
26c35b95c7 Fix bug #598: add explicit cast to Scalar type 2013-06-10 12:03:55 +02:00
Gael Guennebaud
0525874a03 Fix bug #599: add missing documentation of some members in QR module. 2013-06-10 11:58:28 +02:00
Gael Guennebaud
2b6528effc HouseholderSequence should expose standard enums (Rows/Cols, etc.)) 2013-06-10 11:42:14 +02:00
Gael Guennebaud
47e89026d0 Check sparse matrices with short indices 2013-06-10 10:34:03 +02:00
Gael Guennebaud
e8c963568c Simplify and generalize assign_selector logic 2013-06-10 10:32:29 +02:00
Gael Guennebaud
b6d3fcf6f2 Fix bug #605: ambiguous call to std::min when calling .diagonal() on a sparse matrix with non default index type 2013-06-10 10:11:29 +02:00
Gael Guennebaud
e392948548 Fix bug #607: handle implicit transposition from sparse vector to dense vector 2013-06-10 00:06:40 +02:00
Gael Guennebaud
4811b4526c Add regression test for bug #608 2013-06-09 23:30:04 +02:00
Gael Guennebaud
a69b4b092b Fix bug #608: the sign computation in LDLT was broken 2013-06-09 23:19:32 +02:00
Gael Guennebaud
c98fd7a6ca Fix bug #609: avoid if statement and improve consistency of eulerAngles method 2013-06-09 23:14:45 +02:00
Gael Guennebaud
e04b59929e fix unused variable warning 2013-06-09 21:03:32 +02:00
Gael Guennebaud
64054ee396 Add nvcc support for normalize, initializers, and fuzzy comparisons 2013-06-05 15:38:33 +02:00
Gael Guennebaud
b3adc4face Add missing pconj specializations 2013-05-17 17:25:29 +02:00
Thomas Capricelli
62e337eb01 fix a weird typo I commited in ae76c97704
(Nov 10th, 2009)
2013-06-03 23:09:33 +02:00
Desire NUENTSA
d7cd957f10 Include misc struct declarations 2013-05-29 10:15:40 +02:00
Desire NUENTSA
e0566a817f Delete unneeded resize in SparseQR 2013-05-22 10:44:12 +02:00
Desire NUENTSA
8e050bd681 Optimize Sparse setIdentity and add a unit test 2013-05-22 10:43:12 +02:00
Desire NUENTSA
cf939f154f Fix bug #596 : Recover plain SparseMatrix from SparseQR matrixQ() 2013-05-21 17:35:10 +02:00
Gael Guennebaud
bd7511fc36 Fix return type of TriangularView::ReverseInnerIterator::operator++ 2013-05-17 14:40:32 +02:00
Gael Guennebaud
bd0474adbb Fix A=A with A a SparseMatrix 2013-05-17 14:39:31 +02:00
Gael Guennebaud
9ab3811cc5 Disallow implicit scalar conversion of SparseMatrix 2013-05-17 14:02:20 +02:00
Gael Guennebaud
b5e5b6aa57 Fix non const data() member in Array and Matrix wrappers. 2013-05-16 10:18:19 +02:00
Hauke Heibel
12e69ec896 Added asserts to AngleAxis class which verify that the initial axis is
normalized.
2013-05-15 12:05:01 +02:00
Hauke Heibel
8556ca3de5 Adapted settings for the eol extension. 2013-05-15 13:45:24 +02:00
Desire NUENTSA
f7bdbf69e1 Add support in SparseLU to solve with L and U factors independently 2013-05-14 17:15:23 +02:00
Desire NUENTSA
83736e9c61 Set back the default ordering method in SPQR support 2013-05-13 13:08:13 +02:00
Desire NUENTSA
122b16d841 fix memory leak from Cholmod data in SPQR support 2013-05-13 13:04:12 +02:00
Gael Guennebaud
43bb942365 Add missing support for x.noalias() = ReturnByValue<...> 2013-05-13 10:39:50 +02:00
Gael Guennebaud
fcdbfabf7a Fix setFromTripplet with empty inputs 2013-05-03 14:28:37 +02:00
Gael Guennebaud
aa8b897607 document the evaluation order of the comma initializer 2013-04-19 14:03:16 +02:00
Gael Guennebaud
9cd2d14005 merge with default branch 2013-04-19 11:21:39 +02:00
Gael Guennebaud
4e2e615a7c actually assertion are incompatible with nvcc even on host code 2013-04-19 11:14:17 +02:00
Gael Guennebaud
46755648ec Add a few missing standard functions for ScalarWithExceptions type. 2013-04-17 10:24:31 +02:00
Gael Guennebaud
41b3c56e61 Disable "operands are evaluated in unspecified order" ICC's remark 2013-04-17 10:23:08 +02:00
Gael Guennebaud
9a4caf2b0f Extend internal doc of ploaddup and palign 2013-04-17 09:17:34 +02:00
Gael Guennebaud
94e20f485c Big 564: add hasNaN and isFinite members 2013-04-16 15:10:40 +02:00
Desire NUENTSA
d4b0c19a46 Fix a bug in Supernodal Matrix Iterator 2013-04-15 17:24:49 +02:00
Gael Guennebaud
db43205dc6 Fix ICC warning when defining both -ansi and -strict-ansi 2013-04-12 15:51:40 +02:00
Gael Guennebaud
9816e8532e Fix bug #482: pass scalar value by const reference (it remained a few cases) 2013-04-12 15:26:55 +02:00
Gael Guennebaud
43f4fd4d71 generalize testing flags to clang and ICC 2013-04-12 15:24:41 +02:00
Gael Guennebaud
7450b23fbb Fix bug #563: assignement to Block<SparseMatrix> is now allowed on non-compressed matrices 2013-04-12 13:20:13 +02:00
Gael Guennebaud
6eaff5a098 Enable SSE with ICC even when it mimics a gcc version lower than 4.2 2013-04-11 19:48:34 +02:00
Gael Guennebaud
1e38928c64 workaround strange compilation issue with ICC and -strict-ansi 2013-04-10 17:30:25 +02:00
Gael Guennebaud
ff661a7b6f Add temporary check for triangularView += product 2013-04-10 23:13:04 +02:00
Gael Guennebaud
899c0c2b6c Clean source code and unit tests with respect to -Wunused-local-typedefs 2013-04-10 22:27:35 +02:00
Gael Guennebaud
7e04d7db02 Fix a serious bug in handmade_aligned_realloc: original data have to be moved if the alignment offset differs. 2013-04-10 13:58:20 +02:00
Gael Guennebaud
f7e52d22d4 Fix missuse of unitialized values in unit tests 2013-04-10 09:46:16 +02:00
Gael Guennebaud
84637ca58c Remove a useless variable in blueNorm 2013-04-10 09:41:42 +02:00
Gael Guennebaud
d7f3cfb56e bug #564: document the fact that minCoeff/maxCoeff members have undefined behavior if the matrix contains NaN. 2013-04-09 11:27:54 +02:00
Gael Guennebaud
3cb6e21f80 Fix bug #562: add vector-wise normalized and normalize functions 2013-04-09 11:12:35 +02:00
Gael Guennebaud
d8f1035355 Fix a couple of int versus Index issues. 2013-04-09 09:43:00 +02:00
Gael Guennebaud
bff264283d Add missing epsilon/dummy_precision function in NumTraits<Array> 2013-04-09 09:31:26 +02:00
Gael Guennebaud
8f44205671 Fix bug #581: remove useless piece of code is blueNorm 2013-04-09 09:23:40 +02:00
Desire NUENTSA
d97cd746ae Replace int by Index 2013-04-08 08:51:58 +02:00
Gael Guennebaud
12439e1249 Port SelfCwiseBinaryOp and Dot.h to nvcc, fix portability issue with std::min/max 2013-04-05 16:35:49 +02:00
Christoph Hertzberg
9b33ab62da Fixing bug #578. Thanks to Angelos <filiatra@gmail.com> 2013-04-03 16:29:16 +02:00
Gael Guennebaud
c3a6fa03a2 elif/elseif typo 2013-03-26 11:52:43 +01:00
Gael Guennebaud
0a1d9fb9ae Fix warning: implicit conversion loses integer precision in SparseMatrix. No need to use std::ptrdiff_t instead of Index since this later is requested to be signed. 2013-03-20 21:58:24 +01:00
Gael Guennebaud
225fd0f579 adapt AutoDiff to scalar_product_traits 2013-03-20 21:20:13 +01:00
Gael Guennebaud
c519be2bac Allow multiplication like binary operators to be applied on type couples supported by scalar_product_traits 2013-03-20 21:19:16 +01:00
Desire NUENTSA
f350f34560 Add complex support to dgmres and the unit test 2013-03-20 18:38:22 +01:00
Gael Guennebaud
d63712163c Add SSE4 min/max for integers 2013-03-20 18:28:40 +01:00
Desire NUENTSA
da6219b19d Bug567 : Fix iterative solvers to immediately return when the initial guess is the true solution and for trivial solution 2013-03-20 16:15:18 +01:00
Desire NUENTSA
22460edb49 Use a template Index for COLAMD ordering 2013-03-20 16:02:03 +01:00
Desire NUENTSA
4107b371e3 Handle zero right hand side in CG and GMRES 2013-03-20 11:22:45 +01:00
Gael Guennebaud
9bfeeba1c5 Add Official/Unsupported labels to unit tests and add a ctest driver to submit subprojects to cdash 2013-03-20 08:40:13 +01:00
Thomas Capricelli
11a9091084 fix a weird bug where a space was missing before a link 2013-03-19 20:09:13 +01:00
Thomas Capricelli
aba50d842e fixes #568
(files from previous build were kept on the server, with outdated/garbled
information)

The documentation update script now wipes build/doc/html
before rebuilding stuff. Most of the time/cpu consuming is spent in
compiling snippets, so we don't loose that much.
2013-03-19 19:18:14 +01:00
Gael Guennebaud
f29b4c435b Make cpuid not use %%esi -> dangerous if someone is using it. 2013-03-19 14:11:59 +01:00
Michael Schmidt
0d5a418048 Fix bug #566: rbx register has to be saved when calling cpuid on x84_64 with -fPIC and medium or large code models. 2013-03-19 14:00:42 +01:00
Claas H. Köhler
d6d638c751 Forward compiler flags to Fortran workaround 2013-03-17 14:17:44 +01:00
Christoph Hertzberg
6357fd68da Patch by Kolja Brix <brix@igpm.rwth-aachen.de> that fixes bug #565 and adds a testcase to verify that. 2013-03-17 13:55:31 +01:00
Desire NUENTSA
f8addac4e1 Include SparseLU and SparseQR 2013-03-13 18:01:47 +01:00
Gael Guennebaud
5d1a74da0a Update matlab-eigen quick ascii reff 2013-03-11 21:20:12 +01:00
Desire NUENTSA
6c68f1d787 bug #563 : Sparse block assignments should be called on compressed matrices. Uncompressed matrices will be supported later 2013-03-11 19:21:18 +01:00
Jitse Niesen
79f93247c5 Relax tolerances in matrix_power tests to avoid intermittent failures. 2013-03-09 17:20:16 +00:00
Jitse Niesen
97c9e3c74f Handle special case in atanh2(x,y) when y = 0.
This fixes matrix_power unit test on clang.
2013-03-09 16:58:05 +00:00
Gael Guennebaud
03373f41cb Fix bug #561: remove useless sign macro 2013-03-07 23:35:26 +01:00
Gael Guennebaud
f82ee241ac Added tag 3.2-beta1 for changeset 2238592062 2013-03-07 08:51:23 +01:00
Gael Guennebaud
2238592062 bump to 3.2-beta1 (3.1.91) 2013-03-07 08:49:10 +01:00
Desire NUENTSA
4fdae4dda9 Fix bug in SparseLU kernel for 32bits indices 2013-03-06 16:35:12 +01:00
Gael Guennebaud
98ce4455dd fix sparse vector assignment from a sparse matrix 2013-03-06 11:58:22 +01:00
Desire NUENTSA
69bd334d2b Fix mismatched free/delete 2013-03-05 16:35:13 +01:00
Desire NUENTSA
a1ddf2e7a8 Update doc for the sparse module 2013-03-05 12:55:03 +01:00
Gael Guennebaud
24d81aeb20 Fix overlaping operands when calling memcpy 2013-03-04 17:47:45 +01:00
Gael Guennebaud
d2e5c9d892 Do not globally disable stupid warnings in our unit test since such warnings do affect user code. 2013-03-01 14:50:20 +01:00
Gael Guennebaud
b9fe79153b Fix a couple of remaining warnings (missing newlines, inline-noinline, meaningless type qualifiers) 2013-03-01 14:42:36 +01:00
Gael Guennebaud
87142237b5 Fix "missing return statement at end of non-void function" 2013-03-01 14:33:11 +01:00
Gael Guennebaud
210a56ff48 Update to latest mpreal. 2013-03-01 14:31:11 +01:00
Gael Guennebaud
d70366d011 Remove assumption on RowMajorBit==RowMajor and ColMajor==0 2013-03-01 14:23:31 +01:00
Gael Guennebaud
01c6308d6e Add missing template keyword in evaluators 2013-03-01 00:26:52 +01:00
Gael Guennebaud
858ac9ffe0 Add missing template keyword 2013-03-01 00:03:28 +01:00
Gael Guennebaud
1bb1945078 Fix "explicit instantiation of 'Eigen::Spline' must occur in namespace 'Eigen'" warnings 2013-02-28 20:22:26 +01:00
Gael Guennebaud
3930c9b086 Fix "routine is both "inline" and "noinline"" warnings 2013-02-28 19:31:03 +01:00
Gael Guennebaud
e5bf4440c0 Fix "type qualifiers are meaningless here" warnings 2013-02-28 19:29:32 +01:00
Gael Guennebaud
0fac91ac22 Fix "storage class is not first" warnings 2013-02-28 19:27:53 +01:00
Hauke Heibel
b5d8299ee7 Prevent calling .norm() on integer matrices in the unit tests. 2013-02-28 12:33:34 +01:00
Hauke Heibel
83aac6d54c MSVC fix; the compiler failed to detect the correct overload. 2013-02-28 11:38:34 +01:00
Hauke Heibel
5882f1631d Fixed compiler warning. 2013-02-28 10:15:19 +01:00
Hauke Heibel
5e8384df2e MSVC fix; the base class typedef shadowed the local template parameter. 2013-02-28 08:47:38 +01:00
Gael Guennebaud
6dd93fc76e The ref unit test cannot be easily written to work with EIGEN_DEFAULT_TO_ROW_MAJOR 2013-02-27 23:52:10 +01:00
Hauke Heibel
c754023e72 Fixed MSVC dashboard (Experimental/Continuous) build scripts. 2013-02-27 15:54:27 +01:00
Gael Guennebaud
455e6e38b6 Fix two numerical issues in unit tests. 2013-02-27 08:07:18 +01:00
Gael Guennebaud
61a2995d03 Remove ICC warning in nomalloc unit test. 2013-02-26 18:10:19 +01:00
Gael Guennebaud
fe2c8e1c36 Fix compilation with ICC that was unable to instanciate Scaling from Eigen's namespace. 2013-02-26 17:38:37 +01:00
Gael Guennebaud
fa17a6da75 Fix compilation with ICC that was unable to instanciate first_aligned 2013-02-26 17:32:42 +01:00
Gael Guennebaud
bb94f0ebc6 Add a unit test for Ref.h and fix an extra copy. 2013-02-26 15:10:00 +01:00
Gael Guennebaud
63135a7350 Fix computation of outer-stride when calling .real() or .imag() 2013-02-26 15:08:50 +01:00
Gael Guennebaud
e8ccd07671 Add the possibility to define a custom build-string suffix 2013-02-26 13:40:13 +01:00
Gael Guennebaud
0b187a40a1 workaround "may be used uninitialized in this function" warning 2013-02-26 12:09:08 +01:00
Gael Guennebaud
5dda7842ca Add assertion on the input matrix size in factorizations relying on permutations of 32bits int 2013-02-26 11:42:32 +01:00
Gael Guennebaud
b73baa1ea4 Workaround warning: assuming signed overflow does not occur when... 2013-02-26 10:29:24 +01:00
Gael Guennebaud
5108ef01fc Fix no newline warning. 2013-02-26 10:27:55 +01:00
Gael Guennebaud
b6dc2613ac Fix bug #552: disable EIGEN_GLIBC_MALLOC_ALREADY_ALIGNED when compiling with -fsanitize=address, and allow users to manually tell whether EIGEN_MALLOC_ALREADY_ALIGNED. 2013-02-25 19:17:13 +01:00
Gael Guennebaud
12a1313b09 bug #482: pass scalar arguments by const references. Still remains a few cases that might affect the ABI (see the bug entry) 2013-02-25 18:05:57 +01:00
Desire NUENTSA
cc35c44256 Add reference for the default threshold in sparse QR 2013-02-25 14:26:55 +01:00
Desire NUENTSA
ced8dfc0d9 Fix the computation of the default pivot threshold for sparse QR 2013-02-25 13:41:59 +01:00
Gael Guennebaud
5a0c5c0393 Fix bug #483: optimize outer-products to skip setZero and a scalar multiple when not needed. 2013-02-25 13:31:42 +01:00
Gael Guennebaud
96ad13abba conservative_resize unit test was executed only once 2013-02-25 01:50:58 +01:00
Gael Guennebaud
41aa0fcc3b Output random generator seed in case of failure so that we have it in CDash. 2013-02-25 01:46:59 +01:00
Gael Guennebaud
698de91c8a Fix segfault in SparseBlock::InnerIterator 2013-02-25 01:30:18 +01:00
Gael Guennebaud
19bc418f5c Remove erroneously committed debugging stuff. 2013-02-25 01:17:44 +01:00
Gael Guennebaud
80d2a65465 Fix visitor unit test. 2013-02-25 01:12:07 +01:00
Gael Guennebaud
04367447ac Fix bug #496: generalize internal rank1_update implementation to accept uplo(A) += v * w and make A.triangularView() += v * w uses it.
Update unit tests and blas interface respectively.
2013-02-24 23:05:42 +01:00
Gael Guennebaud
08388cc712 Remove superfluous cast. 2013-02-24 20:37:52 +01:00
Gael Guennebaud
1c137496c6 Extend sparseqr unit test to check linearly dependent columns 2013-02-24 20:36:54 +01:00
Gael Guennebaud
4eeaff6d38 Cleaning pass on SparseQR 2013-02-24 20:36:28 +01:00
Gael Guennebaud
28e139ad60 Fix another issue related to summing up many signed values. 2013-02-23 23:06:45 +01:00
Gael Guennebaud
42af5870a4 Fix array unit test: isApprox(log(0),log(0)) is false, and summing up signed float value might result in very small values and thus large numerical errors 2013-02-23 22:58:14 +01:00
Gael Guennebaud
274c24c262 Avoid problematic ternary operator (http://forum.kde.org/viewtopic.php?f=74&t=109486) 2013-02-23 20:13:21 +01:00
Sebastien Barthelemy
74438f8aa9 Fix EIGEN_INITIALIZE_MATRICES_BY_NAN. 2013-02-22 15:09:03 +01:00
Gael Guennebaud
7fe6419171 remove double parenthesis 2013-02-22 14:50:47 +01:00
Gael Guennebaud
e71bc79f2a SparseLU does not accept row-major matrices for the destination. 2013-02-22 14:45:42 +01:00
Gael Guennebaud
bd8c9c69e4 Protect min with parenthesis in IncompleteLLT 2013-02-22 14:41:32 +01:00
Gael Guennebaud
d93c1c113b NVCC: EIGEN_NO_DEBUG must be defined before including Macro.h 2013-02-21 19:05:23 +01:00
Desire NUENTSA
59f9400420 Clarify the doc for column-pivoting QR 2013-02-21 13:33:31 +01:00
Gael Guennebaud
968f7591f8 Make it compile without nvcc 2013-02-21 12:51:58 +01:00
Jitse Niesen
986f60127d Guard against transposeInPlace on non-square non-resizable matrix.
Inspired by question by Martin Drozdik at stackoverflow.com/q/14954983
2013-02-20 14:03:14 +00:00
Jitse Niesen
a054b4ee27 Be more explicit about user-defined functions in Map tutorial.
See discussion on mailing list on 18 + 19 Feb 2013.
2013-02-20 13:44:40 +00:00
Desire NUENTSA
febf8e5d7b Set built-in sparse QR as the default sparse solver and add ComputationInfo for Levenberg Marquardt, 2013-02-20 14:10:14 +01:00
Desire NUENTSA
dca7190e15 Add setPivotThreshold to Sparse QR 2013-02-20 14:00:28 +01:00
Desire NUENTSA
19de016fef Correct the SPQR backend for rank-deficient matrices and delete some public functions 2013-02-20 13:59:56 +01:00
Desire NUENTSA
bc18e06fe3 Add matrixR() to get the triangular factor from the Householder QR 2013-02-20 13:58:26 +01:00
Desire NUENTSA
962c99d462 Metis ordering backend supports only squared matrices 2013-02-20 13:56:51 +01:00
Jitse Niesen
ba653105a2 Remove confusing transpose() in setLinSpaced() docs. 2013-02-18 17:27:41 +00:00
Jitse Niesen
b4f6aec195 Fix linear vectorized transversal in linspace (fixes bug #526). 2013-02-18 17:26:03 +00:00
Desire NUENTSA
1a056b408d Add a rank-revealing feature to sparse QR 2013-02-15 16:35:28 +01:00
Gael Guennebaud
9fd465ea2b Fix the following warning: "comparison between signed and unsigned integer expressions" 2013-02-15 14:31:38 +01:00
Gael Guennebaud
cf259ce590 Workaround the following warning: "assuming signed overflow does not occur when assuming that (X + c) < X is always false" 2013-02-15 14:28:20 +01:00
Gael Guennebaud
a1091caa43 Fix some unused or not initialized related warnings. 2013-02-15 14:05:37 +01:00
Gael Guennebaud
19f699ded0 "-Wno-psabi" option is not supported by all gcc version. 2013-02-15 14:01:30 +01:00
Gael Guennebaud
8745da14d8 Fix SSE plog<float> to return -INF on 0 2013-02-14 23:34:05 +01:00
Gael Guennebaud
912ba10efe Remove the following note made by gcc: "The ABI of passing structure with complex float member has changed in GCC 4.4" 2013-02-14 21:52:12 +01:00
Gael Guennebaud
24e4a3af2b Add missing using std::sqrt 2013-02-14 21:40:00 +01:00
Gael Guennebaud
a0fb885c82 Update adjoint unit test to avoid instantiating sqrt(int) 2013-02-14 21:33:42 +01:00
Gael Guennebaud
9cc016d3f9 Update basicstuff unit test to avoid instantiating the ambiguous sqrt(int) 2013-02-14 21:15:58 +01:00
Gael Guennebaud
f8407742c1 Fix some implicit int64 to int conversion warnings. However, the real issue
is that PermutationMatrix mixes the type of the stored indices and the "Index"
type used for the sizes, coeff indices, etc., which should be DenseIndex.
(transplanted from 66cbfd4d39
)
2013-02-14 18:16:51 +01:00
Gael Guennebaud
25bcbfb10c Fix bug in aligned_free with windows CE 2013-02-13 19:09:31 +01:00
Gael Guennebaud
a143c5b78c Fix bug #544: assertion in JacobiSVD when compiling with EIGEN_NO_AUTOMATIC_RESIZING 2013-02-12 19:56:48 +01:00
Gael Guennebaud
3cd32996f1 Fix bug #551: compilation issue when using EIGEN_DEFAULT_DENSE_INDEX_TYPE 2013-02-09 09:43:17 +01:00
Gael Guennebaud
5adcc6c7b4 Add support for NVCC5: most of the Core and part of LU are callable from CUDA code.
Still a lot to do.
2013-02-07 19:06:14 +01:00
Gael Guennebaud
5115f4c504 add EIGEN_INITIALIZE_MATRICES_BY_NAN 2013-02-07 18:07:07 +01:00
Gael Guennebaud
3c1ccca285 Add missing operator= in RefBase 2013-02-07 17:49:16 +01:00
Gael Guennebaud
e21dc15386 Add missing data member function in CwiseUnaryView 2013-02-07 17:44:24 +01:00
Gael Guennebaud
5154253933 Fix some MPL2/LGPL lisencing confusions 2013-02-06 11:30:33 +01:00
Jitse Niesen
14e2ab02b5 Replace assert() by eigen_assert() (fixes bug #548). 2013-02-02 22:04:42 +00:00
Desire NUENTSA
35647b0133 Correct bug in SPQR backend and replace matrixQR by matrixR 2013-01-29 17:48:30 +01:00
Desire NUENTSA
8bc00925e5 Change int to Index type for SparseLU 2013-01-29 16:21:24 +01:00
Hauke Heibel
57e50789f3 Added missing using std::sqrt. 2013-01-27 13:46:06 +01:00
Hauke Heibel
718535ac6c Added Visual Studio 2012 debug visualizers. 2013-01-26 17:32:14 +01:00
Desire NUENTSA
7f0f7ab5b4 Add additional methods in SparseLU and Improve the naming conventions 2013-01-25 20:38:26 +01:00
Desire NUENTSA
d58056bde4 Merged local branch with main trunk 2013-01-25 19:05:33 +01:00
Desire NUENTSA
81d4bfa8d9 add support for solving with sparse right hand side 2013-01-25 18:17:17 +01:00
Gael Guennebaud
e4ec63aee7 Suppress annoying "may be used uninitialized in this function" warning with gcc >= 4.6 2013-01-24 11:59:17 +01:00
Gael Guennebaud
b74c0a4413 Check that NeedsToAlign is properly sets before checking alignment 2013-01-24 11:42:04 +01:00
Gael Guennebaud
7282a45a0a Remove dummy code in MPRealSupport 2013-01-24 08:48:26 +01:00
Gael Guennebaud
29d395f769 Relax a bit the precision in mpreal unit test. 2013-01-23 23:57:28 +01:00
Gael Guennebaud
691e607d85 Specialize GEBP traits and kernel for mpreal to by-pass mpreal and remove the costly creation of many temporaries. 2013-01-23 23:56:57 +01:00
Gael Guennebaud
c22f7cef83 Workaround "error: floating-point literal cannot appear in a constant-expression" in mpreal.h when compiling with predantic.
(I really don't know how to properly fix this))
2013-01-23 20:51:38 +01:00
Gael Guennebaud
73026eab4d Fix SparseLU special gemm kernel on 32 bits system w/o SSE 2013-01-23 19:34:01 +01:00
Gael Guennebaud
ee36eaefc6 remove dummy code in ColPivHouseholderQR::solve 2013-01-23 18:34:29 +01:00
Gael Guennebaud
19c78cf510 Workaround gcc-4.7 bug #53900 (too aggressive optimization in our alignment check) 2013-01-22 22:59:09 +01:00
Gael Guennebaud
67b9f42528 Recent UMFPACK library requires to link to libSuiteSparse 2013-01-22 22:53:28 +01:00
Desire NUENTSA
ad798231ec Fix test for Metis 2013-01-21 15:43:15 +01:00
Desire NUENTSA
3d9150870d Fix documentation for SparseLU 2013-01-21 15:39:18 +01:00
Desire NUENTSA
d2dd5063b6 Documentation for the ordering methods 2013-01-21 15:37:47 +01:00
Desire NUENTSA
5b9bb00265 Test for the sparse Blue norm 2013-01-21 15:37:06 +01:00
Desire NUENTSA
5dcf6caa36 Unit test for the Metis Ordering package 2013-01-21 15:36:18 +01:00
Gael Guennebaud
392ffce3b9 Fix traits of Map<Quaternion>, and respectively extend the unit tests 2013-01-20 10:21:54 +01:00
Gael Guennebaud
fb89b66229 Some minor documentation fixes in Quaternion 2013-01-20 10:20:39 +01:00
Chen-Pang He
23c87fcde6 I think it's OK to let XprHelper.h determine the nested type. 2012-10-15 00:14:32 +08:00
Chen-Pang He
fe0ef8e609 Remove unused typedef (traits<MatrixPowerProduct>::PlainObject) for brevity. 2012-10-14 22:30:52 +08:00
Chen-Pang He
40fce01648 Simplify traits<MatrixPowerProduct>: StorageKind must be Dense because MatrixPowerProduct is derived from MatrixBase. 2012-10-14 18:36:17 +08:00
Chen-Pang He
c890cf5489 Use the nested type instead of const reference 2012-10-14 03:02:16 +08:00
Chen-Pang He
daa65c5bce Just tidy up: no need to specify template parameters inside class body. 2012-10-14 01:36:54 +08:00
Chen-Pang He
0017d8c58f Make MatrixPowerTriangularAtomic::computePade static because it should be. 2012-10-07 02:25:00 +08:00
Chen-Pang He
a5d348e30a Use simplified return type, trying to work around MSVC. 2012-10-03 19:42:02 +08:00
Chen-Pang He
4cfde4590f Make use of TRMM (speed up), and remove useless condition (the triangular don't need LU) 2012-10-02 23:04:23 +08:00
Chen-Pang He
21c2b4e327 Make better decision on PartialPivLU vs inverse(): We have specialized inverse() only for FIXED matrices. 2012-10-02 19:53:38 +08:00
Chen-Pang He
e92fe88159 Add test for real MatrixPowerTriangular. 2012-09-30 19:21:53 +08:00
Chen-Pang He
eb33d307af Avoid Schur decomposition on (quasi-)triangular matrices. (Huge speed up!) 2012-09-30 16:30:18 +08:00
Chen-Pang He
332eb36436 Implement complex MatrixPowerTriangular. There are still problems with real one. 2012-09-30 02:14:16 +08:00
Gael Guennebaud
209199a13e Move the definition of DenseBase::InnerIterator to Core module. (needed to make blueNorm generic) 2013-01-15 22:03:54 +01:00
Desire NUENTSA
f813e83bc3 Delete unused variable in SparseLU 2013-01-14 16:03:46 +01:00
Desire NUENTSA
c05848a330 Move SparseColEtree common to SparseLU and SparseQR to SparseCore and fix build issue of sparseqr 2013-01-14 15:59:46 +01:00
Desire NUENTSA
904c2f137b Fix the column permutation in SparseQR 2013-01-14 14:20:42 +01:00
Gael Guennebaud
a3b94d26c8 Remove TOC numbering, and minor improvement in overview. 2013-01-12 20:34:52 +01:00
Sergey Popov
761b3bbb69 Fix bug #540: SelfAdjointEigenSolver improperly used the upper triangular part to extract the scaling factor. 2013-01-12 12:07:49 +01:00
Gael Guennebaud
7262cf783c Cleaning documentation pass in ordering and ILUT 2013-01-12 11:56:56 +01:00
Gael Guennebaud
38fa432e07 Clean inclusion, namespace definition, and documentation of SparseLU 2013-01-12 11:55:16 +01:00
Gael Guennebaud
50625834e6 SparseQR: clean a bit the documentation, fix rows/cols methods, remove rowsQ methods and rename matrixQR to matrixR. 2013-01-12 09:40:31 +01:00
Gael Guennebaud
581e389c54 Fix installation path of SparseQR 2013-01-12 09:32:51 +01:00
Desire NUENTSA
121f3bdf04 Pass a const matrix to sparseQR 2013-01-11 17:47:32 +01:00
Desire NUENTSA
33febdb48b Add support for Schur decomposition of matrices in Hessenberg form 2013-01-11 17:36:45 +01:00
Desire NUENTSA
0f94e96342 Add support for sparse blueNorm 2013-01-11 17:27:12 +01:00
Desire NUENTSA
91b3b3aaab Add a sparse QR factorization and update the elimination tree in SparseLU 2013-01-11 17:16:14 +01:00
Gael Guennebaud
1ccd90a927 Make the MatrixFunctions documentation page looks a bit better 2013-01-11 10:48:43 +01:00
Gael Guennebaud
cc444bbbf9 update unsupported module documentation to be conformed with new documentation style 2013-01-11 10:41:26 +01:00
Gael Guennebaud
b0cb5e6d48 remove the 'Unsupported Modules' meta module 2013-01-11 10:40:35 +01:00
Gael Guennebaud
109cbb6ad3 typos 2013-01-09 17:44:25 +01:00
Gael Guennebaud
dcc1754f05 update javascript hacks for doxygen 1.8.3 2013-01-09 00:40:48 +01:00
Gael Guennebaud
2abe7d8c6e Rename the dox files: the number prefixes are not needed anymore 2013-01-06 23:57:54 +01:00
Gael Guennebaud
091a49cad5 Clean the manual page titles, links and intro. 2013-01-06 23:48:59 +01:00
Thomas Capricelli
c71c06b71f fix typo 2013-01-06 14:39:20 +01:00
Gael Guennebaud
8a50ed86f3 Check that minCoeff(int*)/maxCoeff(int*) always pick the first entry in case of multiple extrema. 2013-01-05 23:49:47 +01:00
Gael Guennebaud
f9927b4aca Fix _data() versus data() issue in SparseVector, and add a Storage typedef just like SparseMatrix. 2013-01-05 23:04:22 +01:00
Gael Guennebaud
86983fa1ff Update the overview page to reflect the new organisation 2013-01-05 21:25:41 +01:00
Gael Guennebaud
2de69c2f26 Doc presentation:
- remove the "modules|classes" link for module pages (they are already in the TOC)
 - fine tune the TOC css
2013-01-05 17:14:14 +01:00
Gael Guennebaud
93ee82b1fd Big changes in Eigen documentation:
- Organize the documentation into "chapters".
  - Each chapter include many documentation pages, reference pages organized as modules, and a quick reference page.
  - The "Chapters" tree is created using the defgroup/ingroup mechanism, even for the documentation pages (i.e., .dox files for which I added an \eigenManualPage macro that we can switch between \page or \defgroup ).
  - Add a "General topics" entry for all pages that do not fit well in the previous "chapters".
  - The highlevel struture is managed by a new eigendoxy_layout.xml file.
- remove the "index" and quite useless pages (namespace list, class hierarchy, member list, file list, etc.)
- add the javascript search-engine.
- add the "treeview" panel.
- remove \tableofcontents (replace them by a custom \eigenAutoToc macro to be able to easily re-enable if needed).
- add javascript to automatically generate a TOC from the h1/h2 tags of the current page, and put the TOC in the left side panel.
- overload various javascript function generated by doxygen to:
  - remove the root of the treeview
  - remove links to section/subsection from the treeview
  - automatically expand the "Chapters" section
  - automatically expand the current section
  - adjust the height of the treeview to take into account the TOC
- always use the default .css file, eigendoxy.css now only includes our modifications
- use Doxyfile to specify our logo
- remove cross references to unsupported modules (temporarily)
2013-01-05 16:37:11 +01:00
Jitse Niesen
eac676ff6c Set matrix to zero before inserting entries (partially fixes bug #539). 2013-01-03 18:00:45 +00:00
Chen-Pang He
8321b7ae74 Make KroneckerProductSparse inherit EigenBase instead of SparseMatrixBase, for it does not provide an InnerIterator. 2012-10-25 02:09:48 +08:00
Chen-Pang He
204a09cb82 Fix compile error caused by incomplete SparseMatrixBase. 2012-10-16 00:06:49 +08:00
Chen-Pang He
0508a0620b Let KroneckerProduct inherit ReturnByValue to eliminate temporary evaluation. It's uncommon to store the product back to one of the operands. 2012-10-15 19:45:50 +08:00
Chen-Pang He
8284e7134b Add doc for KroneckerProductSparse. 2012-10-15 00:31:09 +08:00
Chen-Pang He
c4b83461d9 Make kroneckerProduct take two arguments and return an expression, which is more straight-forward. 2012-10-15 00:21:12 +08:00
Chen-Pang He
f34db6578a KroneckerProduct: we have const_cast_derived so why not use it? 2012-10-14 01:38:38 +08:00
Jitse Niesen
20a984cd2e Remove #include of removed header file. 2013-01-03 16:44:15 +00:00
Gael Guennebaud
6fb3be9841 Remove useless empty file. 2013-01-03 17:05:20 +01:00
Gael Guennebaud
2ea1e49a08 Doc: replace manual TOC by doxygen's \tableofcontents command 2012-12-28 18:58:07 +01:00
Gael Guennebaud
ded70b8b58 Doc: remove page margins and limits to 60em paragraphes only instaead of the entire page (many declarations and tables are larger than 60em anyway) 2012-12-28 18:57:10 +01:00
Gael Guennebaud
3f82401890 Update doxygen files to doxygen version 1.8 2012-12-28 18:52:53 +01:00
Gael Guennebaud
f41d96deb9 Fix several documentation issues 2012-12-24 13:33:22 +01:00
Gael Guennebaud
f450303321 Fix MSVC compilation: std::log was not accessible. 2012-12-20 18:11:49 +01:00
Gael Guennebaud
85005ffbd1 Make sure sqrt and the likes are not compiled for integer type in cwiseop unit test. 2012-12-20 18:08:26 +01:00
Christoph Hertzberg
b7ea59556d Fix bug #507: Mark variable as unused in NDEBUG case 2012-12-20 11:21:47 +01:00
Christoph Hertzberg
0fe264869a Merge with 6300e8ca02 2012-12-17 17:01:24 +01:00
Christoph Hertzberg
6300e8ca02 replaced compiler specific __attribute__((noinline)) by EIGEN_DONT_INLINE 2012-12-17 16:55:14 +01:00
Christoph Hertzberg
c69577ea31 Fix bug #531: Empty line in <table> made doxygen render it as paragraphs 2012-12-17 16:13:42 +01:00
Jakob Schwendner
22e6741da9 updated geometry benchmark to handle additional cases 2012-12-17 09:33:22 +01:00
Jakob Schwendner
98798e904b added benchmark for test vectorization in geometry package 2012-12-16 23:30:56 +01:00
Gael Guennebaud
e90752d252 Fix bug #534: rm useless initialization of rowSpacer. 2012-12-16 20:32:48 +01:00
Gael Guennebaud
925a5b7d07 Fix bug #535: unused variable warnings 2012-12-16 20:21:28 +01:00
Gael Guennebaud
6c8cf15c06 Fix compilation of Block/SparseBlock with MSVC 2012-12-16 19:45:48 +01:00
David Harmon
23ce05971b Add arpack support module file 2012-12-16 19:11:24 +01:00
David Harmon
dbe1ab67ac Added ARPACK support for standard and generalized eigenvalue problems. Currently self-adjoint only. 2012-10-06 17:18:09 -06:00
Gael Guennebaud
8844f632fa Move work in progress Levenberg Marquardt module in unsupported 2012-12-08 18:22:23 +01:00
Gael Guennebaud
4cdcb6d793 Add missing minpack copyrights/license.
Fix LM header files and credits original MINPACK authors.
Move minimizeOneStep code into its own file to get it more properly credited.
2012-12-08 18:17:18 +01:00
Gael Guennebaud
d85253ccf4 Backed out changeset 363e506776 2012-12-07 20:53:19 +01:00
Desire NUENTSA
363e506776 Rename the old LevenbergMarquardt class to LevenbergMarquardtLegacy
Split the levenberg marquardt test and the hybrid nonlinear test
2012-12-07 15:51:25 +01:00
Desire NUENTSA
cc0fef9807 Add tests for dense and sparse levenberg-Marquardt 2012-12-07 15:48:21 +01:00
Desire NUENTSA
2aba6645f4 Move the Levenberg Marquardt to the supported branch
Add support for sparse computations... need SPQR module.
2012-12-07 15:47:13 +01:00
Desire NUENTSA
71cb78e1ba Fix Incomplete Cholesky factorization. Stable but need iterative robust shift 2012-12-07 15:33:26 +01:00
Desire NUENTSA
5afaacedc6 Update SPQR interface 2012-12-07 15:32:04 +01:00
Pavel Holoborodko
895d90d3e1 Fixed mpreal for IA64 architectures 2012-12-04 18:15:46 +09:00
Gael Guennebaud
8719b1bf16 fix geometry tutorial 2012-11-29 22:48:13 +08:00
Desire NUENTSA
36414d1f3e Update SPQR module for Sparse LM 2012-11-21 19:47:44 +01:00
Desire NUENTSA
9162a8492e ReverseInnerIterator for SparseBlock 2012-11-16 20:00:34 +01:00
Desire NUENTSA
4acc18f100 Move VectorBlock methods into plugin section 2012-11-16 19:59:11 +01:00
Gael Guennebaud
6a790058f5 remove deprecated InnerVectorSet for the deprecated DynamicSparseMatrix class 2012-11-16 09:03:42 +01:00
Gael Guennebaud
4e60283289 Remove Sparse/InnerVectorSet expression in favor of a more general Block<> specialization for Sparse expression.
The specializations for "InnerPanels" are still preserved for efficiency reasons and because they offer additional usefull features.
2012-11-16 09:02:50 +01:00
Gael Guennebaud
3dc8f8536a Generalize Block<> to support various implementation wrt StorageKind (just like other expression) 2012-11-16 09:00:27 +01:00
Gael Guennebaud
493319ae5f plugin header files can be included more than once 2012-11-15 14:33:30 +01:00
Desire NUENTSA
b40a5b8b48 Improve the IncompleteLLT ... not yet robust 2012-11-13 18:14:34 +01:00
Desire NUENTSA
0412dff97b Add more useful functions to SPQR interface 2012-11-13 18:13:13 +01:00
Desire NUENTSA
9cf77ce1d8 Add support for Sparse QR factorization 2012-11-12 15:20:37 +01:00
Desire NUENTSA
474716ec5b Add restarted GMRES with deflation 2012-11-12 10:47:55 +01:00
Gael Guennebaud
a76fbbf397 Fix bug #314:
- remove most of the metaprogramming kung fu in MathFunctions.h (only keep functions that differs from the std)
- remove the overloads for array expression that were in the std namespace
2012-11-06 15:25:50 +01:00
Gael Guennebaud
959ef37006 Fix FindUmfpack when specifying manually the related cmake variables. 2012-11-05 23:21:02 +01:00
Gael Guennebaud
691fb92690 Disable opengl demo if Qt4 or OpenGL cannot be found.
(transplanted from caf24f1c9e
)
2012-10-31 11:36:45 +01:00
Gael Guennebaud
aa858cb43a add first_multiple helper function 2012-10-30 16:27:52 +01:00
Gael Guennebaud
90fcaf11cf SparseLU: remove the "snode" path which appears to bring nearly zero speedup 2012-10-30 15:17:58 +01:00
Gael Guennebaud
ac8c694f3e add missing copyright 2012-10-30 15:16:47 +01:00
Gael Guennebaud
fea4220f37 SparseLU: add a specialized gemm kernel, and add padding to the supernodes such that supernodes columns are all properly aligned 2012-10-30 15:09:48 +01:00
Desire NUENTSA
f7e203fb0c Fix build error in matrixfunctions on MSVC 2012-10-30 11:30:37 +01:00
Gael Guennebaud
b3254c9af5 fix bug #524: Pardiso's parameter array does not have to be aligned! 2012-10-24 10:31:04 +02:00
Gael Guennebaud
138897cc06 fix bug #521: __cpuidex is not available on all architectures supported by MSVC 2012-10-24 10:21:41 +02:00
Gael Guennebaud
9b418afff6 Windows CE does not provide an aligned_malloc function. 2012-10-24 10:12:42 +02:00
Gael Guennebaud
0753463d70 Fix bug #519: AlignedBox::dim() was wrong for dynamic dimensions 2012-10-24 09:58:35 +02:00
Pavel Holoborodko
7857118f2e Fixed gcc warnings, John Westwood name and round_style function 2012-10-19 22:51:55 +09:00
Pavel Holoborodko
8b84e05739 Updated multiprecision module to support the most recent version of MPFR C++ 2012-10-19 18:12:31 +09:00
Desire NUENTSA
77f92bf0b1 the repeated solves are already present in check_sparse_solving() 2012-10-09 13:30:48 +02:00
dnuentsa
f757034001 MINRES solver 2012-10-09 13:07:09 +02:00
Desire NUENTSA
fe78c86b4a Discard failing tests in NonlinearOptimization 2012-10-09 12:20:21 +02:00
Desire NUENTSA
b722c405b7 Use Ref instead of VectorBlock 2012-10-09 12:18:47 +02:00
Desire NUENTSA
23e2de3cb6 RealShur for a already Hessenberg matrix 2012-10-09 12:16:54 +02:00
Gael Guennebaud
a67eea05c1 fix comma initializer when inserting empty matrices 2012-10-03 21:58:14 +02:00
Desire NUENTSA
cfa8032ffb bug #517 : Small typo in AsciiQuickReference 2012-10-03 09:48:33 +02:00
Gael Guennebaud
fec6df1f7d fix dense=sparse*diagonal (there was an issue in the values returned by the .outer() function of the related iterators) 2012-10-03 09:06:19 +02:00
Gael Guennebaud
f30ca7ed7e extend unit tests to check rectangular matrices for sparse*diagonal products 2012-10-02 23:03:06 +02:00
Gael Guennebaud
62b1f75a86 add an assertion when inserting an already existing element 2012-10-02 23:02:23 +02:00
giacomo po
bf81276dad spd test instead of square test. Still missing complex version of MINRES. 2012-10-01 12:23:03 -07:00
Jitse Niesen
2008f76120 Merge 2012-09-29 17:35:15 +01:00
Chen-Pang He
d7d96f6694 Make testExponentLaws in matrix_power quiet. It was too noisy. 2012-09-29 17:45:59 +08:00
Chen-Pang He
50c07e50e8 Avoid memory manipulation for simplicity, efficiency, and safety. 2012-09-29 17:41:51 +08:00
Chen-Pang He
5814a5f1a0 Abort the extension. MatrixSquareRootTriangular only takes upper triangular matrices. 2012-09-29 17:41:06 +08:00
Chen-Pang He
067a5a98c8 Extend MatrixPowerTriangularAtomic for future implementation for triangular matrix power. 2012-09-29 02:02:12 +08:00
Desire NUENTSA
b68102d9a2 MSVC needs parentheses around min and max 2012-09-28 10:44:25 +02:00
giacomo po
01cb88fff8 compiling (but failing) unit test 2012-09-27 17:44:54 -07:00
Gael Guennebaud
87074d97e5 old gcc versions do not have immintrin.h file... 2012-09-27 23:35:54 +02:00
Chen-Pang He
ed18d6f2ad Fix doc and tidy up 2012-09-28 02:08:14 +08:00
Desire NUENTSA
82c3ff3784 Fix Build error on MSVC 2012-09-27 12:04:59 +02:00
Desire NUENTSA
72bfed5e20 Add forgotten SparseLUBase 2012-09-27 11:34:56 +02:00
Chen-Pang He
3b88216d42 Move unshared items back to MatrixPower 2012-09-27 17:19:32 +08:00
Gael Guennebaud
8b83e66906 add scalar multiple to diagonal matrices
(transplanted from dc5b335f9f
)
2012-09-27 09:37:05 +02:00
Gael Guennebaud
1b004d5794 fix SparseMatrix option bit flag in eval<> helper 2012-09-27 09:22:10 +02:00
Gael Guennebaud
b648484dba fix bug #515: missing explicit scalar conversion
(transplanted from b0862dcb2f
)
2012-09-27 00:23:19 +02:00
Gael Guennebaud
44374788b5 fix bug #511: pretty printers on windows 2012-09-26 23:48:48 +02:00
Gael Guennebaud
7c4b55fda9 fix bug #509: warning with gcc 4.7 2012-09-26 23:32:22 +02:00
Chen-Pang He
73a0bfe261 Write doc on (matrix power) * (matrix expression) 2012-09-27 02:31:18 +08:00
Chen-Pang He
aa5acdb352 Create class MatrixPowerBase for further extension (like specialization for triangular or self-adjoint matrices) 2012-09-27 02:20:36 +08:00
Gael Guennebaud
7e97dd5bd8 we should not directly include the *mmintrin.h headers but include immintrin.h only 2012-09-26 19:28:57 +02:00
Gael Guennebaud
1edb396542 fix minor typo in doc 2012-09-26 19:24:41 +02:00
Desire NUENTSA
357fe3641d Correct reference to iterative scaling method 2012-09-25 11:55:33 +02:00
Desire NUENTSA
15a9f6b9c1 Doc for sparseLU 2012-09-25 11:48:18 +02:00
Hauke Heibel
5a3f49036b Removed scaling from the umeyama when it is not requested. 2012-09-25 11:39:40 +02:00
Desire NUENTSA
088379ac2f Fix MSVC compile error in SparseLU 2012-09-25 09:58:29 +02:00
Desire NUENTSA
a01371548d Define sparseLU functions as static 2012-09-25 09:53:40 +02:00
giacomo po
fd0441baee some clean-up and new comments. 2012-09-24 09:20:40 -07:00
Chen-Pang He
d387dfa9dc Remove unnecessary code. lazyAssign seems to fix all (noalias, initialization, etc.) 2012-09-24 23:36:19 +08:00
giacomo po
18c41aa04f Some minor optimization. 2012-09-24 08:33:11 -07:00
giacomo po
dd7ff3f493 moved MINRES to unsupported. Made unit test. 2012-09-24 07:47:38 -07:00
Chen-Pang He
334532b7f5 Remove class MatrixPowerEvaluator with enhanced existing MatrixPowerReturnValue to simplicity, but docs are not completed yet. 2012-09-23 23:49:50 +08:00
Chen-Pang He
1d402dac03 Fix bug in MatrixPower(expression) due to destruction of temporary objects. Sorry for ugly pointer manipulation but it prevents copying a PlainObject. 2012-09-23 18:49:44 +08:00
giacomo po
8c5e4fae61 working preconditioned MINRES solver 2012-09-22 15:29:00 -07:00
Chen-Pang He
963794b04a Eliminate unnecessary evaluations 2012-09-23 00:20:19 +08:00
Chen-Pang He
7e64f78f65 Avoid inefficient 2x2 LU 2012-09-22 22:06:22 +08:00
Chen-Pang He
d7b1049cab Fix my typo in MatrixPowerBase.h, no effect on the flow. 2012-09-22 19:13:02 +08:00
Chen-Pang He
dd8034bd1c Fix cost evaluation. (chain product for integral power) 2012-09-22 17:37:14 +08:00
Gael Guennebaud
7740127e3d Make Ref<> suitable for both Matrix and Array kinds. Note that Matrix kind objects can be implicitely converted to an Array kind Ref<> and vice versa 2012-09-22 11:11:26 +02:00
Chen-Pang He
446d14f6ad Implement matrix power-matrix product again 2012-09-22 03:26:00 +08:00
Chen-Pang He
87afd99433 Enable saving intermidiate (Schur decomposition) but disable unstable specialization for matrix power-matrix product. 2012-09-21 23:24:28 +08:00
Desire NUENTSA
7e0dd17312 Improve BiCGSTAB : With exact preconditioner, the solution should be found in one iteration 2012-09-19 18:32:02 +02:00
Chen-Pang He
d5d99dd1f0 Optimize matrix functions: m_fT is triangular and trmm is faster than gemm 2012-09-16 14:42:42 +08:00
Gael Guennebaud
48c4d48aec workaround weird compilation error with MSVC 2012-09-14 09:54:56 +02:00
Gael Guennebaud
0c584dcf4d fix compilation with m.array().min/max(scalar) 2012-09-12 17:50:07 +02:00
Gael Guennebaud
28528519e9 Merged in jdh8/eigen (pull request PR-17) 2012-09-11 21:36:05 +02:00
Gael Guennebaud
9e80822fc9 fix compilation on freebsd 2012-09-11 13:32:56 +02:00
Desire NUENTSA
45672e724e Incomplete Cholesky preconditioner... not yet stable 2012-09-11 12:12:19 +02:00
Benoit Jacob
504edbddb1 Replace COPYING.LGPL by a copy of the LGPL 2.1 (instead of LGPL 3).
Indeed, all the LGPL code we use, is licensed under LGPL 2.1 (with some files being "2.1 or later").
2012-09-10 13:27:44 -04:00
Desire NUENTSA
2d49d049d1 Clean the Colamd routine and add test for sparselu code 2012-09-10 14:41:17 +02:00
Desire NUENTSA
761fe49f37 Clean the Colamd routine 2012-09-10 14:28:28 +02:00
Desire NUENTSA
2c99d84133 add SparseLU in sparse bench 2012-09-10 12:41:26 +02:00
Chen-Pang He
04f315d692 Fix rank-1 update for self-adjoint packed matrices. 2012-09-10 18:25:30 +08:00
Chen-Pang He
65caa40a3d Implement packed triangular solver. 2012-09-10 06:29:02 +08:00
Chen-Pang He
3642ca4d46 Implement packed triangular matrix-vector product. 2012-09-09 23:34:45 +08:00
Chen-Pang He
2828c995c5 Use conj_expr_if to clarify what it's doing. 2012-09-09 21:35:28 +08:00
Chen-Pang He
669db3d776 Extend rank-1 updates for different storage orders. 2012-09-09 02:55:10 +08:00
Chen-Pang He
1b8f416408 Implement rank-1 update for self-adjoint packed matrices. 2012-09-08 22:51:40 +08:00
Chen-Pang He
dac5a8a37d Simplify Rank2Update.h 2012-09-08 22:20:05 +08:00
Chen-Pang He
17c746523e Comment FIXMEs on Rank2Update.h and remove unused files. 2012-09-08 21:25:09 +08:00
Gael Guennebaud
24f371bdb4 Merged in jdh8/eigen (pull request PR-16) 2012-09-08 12:16:49 +02:00
Gael Guennebaud
721671cc4e fix bug #501: remove aggressive mat/scalar optimization (was replaced by mat*(1/scalar) for non integer types) 2012-09-08 11:52:03 +02:00
Chen-Pang He
e4e7585a24 Implement rank-2 update for packed matrices. 2012-09-08 17:29:44 +08:00
Chen-Pang He
b5f9bec8ac Perform direct calls in xHEMV and xSYMV. 2012-09-08 15:47:33 +08:00
Gael Guennebaud
06d2fe453d remove stupid assert in blue norm. 2012-09-07 23:19:24 +02:00
Chen-Pang He
1b61aadcbe Implement SDSDOT with DSDOT and avoid allocating buffers in DSDOT. 2012-09-08 02:06:45 +08:00
Chen-Pang He
b0b9b4d6b2 Implement functors for rank-1 and rank-2 update. 2012-09-08 01:39:16 +08:00
Desire NUENTSA
5433986f5a multiple warnings for unused variable 2012-09-07 14:01:51 +02:00
Desire NUENTSA
fdd0f0c5fc merge Sparse LU branch 2012-09-07 13:18:16 +02:00
Desire NUENTSA
063705b5be Add tutorial for sparse solvers 2012-09-07 13:14:57 +02:00
Chen-Pang He
145f89cd5f Fix memory leak in DSDOT. 2012-09-07 15:21:57 +08:00
Chen-Pang He
c86d047c2f BLAS: implement DSDOT and SDSDOT; update test for them. 2012-09-05 18:59:32 +08:00
Desire NUENTSA
2280f2490e Init perf values 2012-09-04 12:21:07 +02:00
Desire NUENTSA
2e38666d01 correct bug in Blas 3 2D block update 2012-09-04 11:36:57 +02:00
Desire NUENTSA
3a22c47fb5 Bug in blas 3 2D block update 2012-09-03 14:49:03 +02:00
Desire NUENTSA
288e6aab14 Insert XSL styles into output XML files 2012-09-03 10:33:39 +02:00
Chen-Pang He
c4051d3d02 Fix a typo in blas/common.h 2012-09-03 15:31:19 +08:00
giacomo po
751501eade added preconditioner with preconditioned-Lanczos iteration 2012-09-01 21:59:06 +02:00
Chen-Pang He
d4144583bb Write dox for assertions 2012-08-31 21:53:02 +08:00
Chen-Pang He
d23134e4a7 Avoid inefficient 2x2 LU. Move atanh to internal for maintainability. 2012-08-30 23:40:30 +08:00
Gael Guennebaud
9da41cc527 forward resize() function from Array/Matrix-Wrapper to the nested expression such that mat.array().resize(a,b) is now allowed. 2012-08-30 16:28:53 +02:00
giacomo po
5f3880c5a8 some optimization in MINRES, not sure about convergence criterion 2012-08-30 13:10:08 +02:00
Gael Guennebaud
c5031edb92 Fix out-of-range memory access in GEMV (the memory was not used for the computation, only to assemble unaligned packets from aligned packet loads)
(transplanted from 221f54698c
)
2012-08-30 10:52:15 +02:00
giacomo po
064f3eff95 first working version. Still no preconditioning 2012-08-30 10:01:34 +02:00
Chen-Pang He
d0ee31aea6 Fix dox and tabbing 2012-08-29 01:56:23 +08:00
Chen-Pang He
ba4e886376 Tidy up and write dox. 2012-08-28 01:55:13 +08:00
Chen-Pang He
5252d823c9 Optimize matrix power 2012-08-26 02:15:41 +08:00
Chen-Pang He
1cd4279b03 Fix a lot in MatrixPower.h 2012-08-25 01:09:20 +08:00
Jitse Niesen
edc7a09ee7 merge 2012-08-27 22:57:39 +01:00
Chen-Pang He
bfaa7f4ffe Add test for matrix power.
Use Christoph Hertzberg's suggestion to use exponent laws.
2012-08-27 22:48:37 +01:00
Desire NUENTSA W.
fe9956defe Read real and complex bench matrices from a unique folder
Output and display bench results using XML and XSLT
2012-08-27 22:52:43 +02:00
Chen-Pang He
b55d260ada Replace atanh with atanh2 2012-08-27 21:43:41 +01:00
Gael Guennebaud
ebe511334f workaround clang bug (see http://forum.kde.org/viewtopic.php?f=74&t=102653) 2012-08-27 14:50:45 +02:00
Gael Guennebaud
576d62db64 fix a typo in commit 324ecf153b
(regarding MKL on windows)
2012-08-27 13:17:45 +02:00
Gael Guennebaud
75435079ca fix bug #499: the image was missing because of a dependency issue when building/executing the "special" examples 2012-08-27 11:11:25 +02:00
Gael Guennebaud
aa1aa36d6d simplify eigen-doc.tgz file generation, and make it more future proof 2012-08-27 10:56:44 +02:00
Gael Guennebaud
904c2e6cfb remove EXTRACT_ALL 2012-08-27 10:30:10 +02:00
Thomas Capricelli
edc496f087 add piwik code to documentation (web stats engine) 2012-08-21 22:36:29 +02:00
jdh8
1b4aed7255 Fix toc in dox and claim copyright 2012-08-20 03:04:28 +08:00
jdh8
573d88f81c Dox in MatrixFunctions 2012-08-19 18:12:04 +08:00
jdh8
15dabd4db7 Bugfix in MatrixLogarithm.h 2012-08-18 21:28:05 +08:00
Hauke Heibel
42c1b9a8dd Ensured that all branches of MatrixLogarithmAtomic::getPadeDegree return values. 2012-08-18 10:18:31 +02:00
jdh8
f047030104 Add specialization for float and long double 2012-08-18 02:27:47 +08:00
Jitse Niesen
dee866a99a Undo incorrect fix in previous commit, and fix real mistake instead. 2012-08-17 15:36:37 +01:00
Jitse Niesen
5eefca637e Documentation fixes. Thanks to Rodney Sparapani for reporting these. 2012-08-17 14:49:18 +01:00
jdh8
2337ea430b Remove useless code (abort specialization for complex exponent temporarily) 2012-08-15 20:56:03 +08:00
jdh8
4be172d84f matrix power: MatrixBase::pow(RealScalar) and MatrixBase::pow(T) where T is integral type 2012-08-15 00:34:20 +08:00
jdh8
c5800a2452 using std::frexp instead of frexp 2012-08-08 17:50:56 +08:00
jdh8
8cddcaf839 Optimize getting exponent from IEEE floating points. 2012-08-08 01:27:11 +08:00
Desire NUENTSA
63d2dcfb70 Clean the supernodal matrix class 2012-08-07 17:10:42 +02:00
Desire NUENTSA
43f74cb5b1 Bug in 2D block update, disable it for now 2012-08-07 13:55:50 +02:00
Desire NUENTSA
4d3b7e2a13 Add support for Metis fill-reducing ordering ; it is generally more efficient than COLAMD ordering 2012-08-06 14:55:02 +02:00
Gael Guennebaud
a1b405c92e Add missing const in some casts 2012-08-05 10:40:46 +02:00
Gael Guennebaud
af824091be Fix precision regression when attempting to fix underflow issues. 2012-08-05 09:57:31 +02:00
jdh8
93967b0dd6 Fix some typos in MatrixLogarithm to improve accuracy. 2012-08-03 23:54:42 +08:00
Desire NUENTSA
a51806993b Prefix with glu, the global structure 2012-08-03 16:43:12 +02:00
Desire NUENTSA
70db61c269 Prefix with glu, the global structure 2012-08-03 16:36:00 +02:00
Gael Guennebaud
03509d1387 SparseLU: add leverage level3 ops 2012-08-03 15:37:44 +02:00
Gael Guennebaud
48dc95f1da factorize column_dfs and panel_dfs 2012-08-02 18:28:16 +02:00
Desire NUENTSA
7dc39b7037 Add unit tests 2012-08-03 13:05:45 +02:00
Desire NUENTSA
6e8aa96e0f correct bug when solving with multiple Rhs 2012-08-03 13:05:27 +02:00
Gael Guennebaud
c73c3ec2f8 fix bug #495: remove too aggressive EIGEN_FLATTEN_ATTRIB attribute
(after some benchmarking, it was not useful anymore)
2012-08-02 12:22:22 +02:00
Desire NUENTSA
e3ac608e41 bug #493 : multiple calls to FindUmfPack
(transplanted from 1914024965
)
2012-08-02 10:00:23 +02:00
Desire NUENTSA
3a0f5a2a7f Update copyrights sections 2012-08-01 11:40:56 +02:00
Desire NUENTSA
02935b4249 switch to MPL license 2012-08-01 11:38:32 +02:00
Desire NUENTSA
390d6599ba Add missing .noalias() 2012-08-01 11:35:23 +02:00
Gael Guennebaud
7d98c864ff fix warning 2012-08-01 10:44:59 +02:00
Gael Guennebaud
22e0ebbc2c fix lower acceptable bound of SSE pexp for double 2012-07-31 23:11:04 +02:00
Gael Guennebaud
e88817cc51 add another missing .noalias() 2012-07-30 19:28:31 +02:00
Gael Guennebaud
8f6d5eacb4 optimize LU_kernel_bmod for small cases, and add an important .noalias() 2012-07-29 22:26:00 +02:00
Jitse Niesen
696b2f999f Eigenvalues module: Implement setMaxIterations() methods. 2012-07-28 21:30:09 +01:00
Gael Guennebaud
6f54269829 add an example for GeneralizedEigenSolver 2012-07-28 18:00:54 +02:00
Gael Guennebaud
8ab0e16e27 fix various regressions with MKL support 2012-07-28 16:32:43 +02:00
Alexey Korepanov
d937e67b48 RealQZ: added example and some code comments 2012-07-28 08:24:44 -05:00
Hauke Heibel
52a0e0d65e Added a default constructor for Splines which creates zero (constant) splines. 2012-07-28 13:37:29 +02:00
Gael Guennebaud
f23dd7c6f1 Fix typo in the doc: s/Succeeded/Success 2012-07-28 13:07:45 +02:00
Gael Guennebaud
e8aa1f00c5 add SSE pexp function for double, make use of _mm_floor_p* for pexp with SSE4.1 2012-07-27 23:40:04 +02:00
Desire NUENTSA W.
ce30d50e3e Improve the permutation 2012-07-27 16:38:20 +02:00
Gael Guennebaud
6eee2918d9 extend quotient functor to allow for mixed types (complex-real) 2012-07-27 11:56:20 +02:00
Desire NUENTSA W.
c0fa5811ec Refactoring codes for numeric updates 2012-07-27 11:36:58 +02:00
Gael Guennebaud
9e8d2dea80 Add a preliminary GeneralizedEigenSolver computing the eigenvalues of Av=lBv with A and B general real matrices.
Currently only the eigenvalues are reported.
2012-07-26 20:15:17 +02:00
Gael Guennebaud
cfb76b242f RealSchur: improve speed of computeNormOfT 2012-07-26 18:04:58 +02:00
Gael Guennebaud
4e60e2cdf6 RealQZ: improve computeNorms speed, improve shift accuracy (better to do a/b than a*(1/b)),
update API to set the maximum number of iterations
2012-07-26 18:03:10 +02:00
Gael Guennebaud
7518201de8 SparseMatrix: add missing ctor for ReturnByValue 2012-07-25 23:03:10 +02:00
Alexey Korepanov
ea310249f3 RealQZ: bug in pushDownZero fixed too 2012-07-25 12:49:18 -05:00
Alexey Korepanov
a3a9773ab6 RealQZ: bug in splitOffTwoRows fixed 2012-07-25 12:17:00 -05:00
Desire NUENTSA
925ace196c correct bug in the complex version 2012-07-19 18:15:23 +02:00
Desire NUENTSA
59642da88b Add exception handler to memory allocation 2012-07-19 18:03:44 +02:00
Desire NUENTSA
b0cba2d988 Add a draft (not clean ) version of the COLAMD ordering implementation 2012-07-18 16:59:00 +02:00
Jitse Niesen
bf7d986af6 Add static assert that objects on stacks are not too big (bug #491). 2012-07-17 22:15:42 +01:00
Gael Guennebaud
e75b1eb883 Fix aliasing issue in sparse matrix assignment.
(m=-m; or m=m.transpose(); with m sparse work again)
2012-07-25 09:33:50 +02:00
Gael Guennebaud
7b34b5f6f9 do not apply plane rotation when it is exactly the identity 2012-07-24 18:19:56 +02:00
Gael Guennebaud
e7c07de549 RealQZ: optimize general hessenberg to not apply rotations to zero entries. 2012-07-24 18:16:22 +02:00
Gael Guennebaud
c1cab7b8ed real QZ: update license 2012-07-24 18:11:41 +02:00
Desire NUENTSA
773804691a working version of sparse LU with unsymmetric supernodes and fill-reducing permutation 2012-07-13 17:32:25 +02:00
Alexey Korepanov
65db91ac2b Add a RealQZ class: a generalized Schur decomposition for real matrices 2012-07-11 16:38:03 -05:00
Jitse Niesen
ba5eecae53 Allow user to specify max number of iterations (bug #479). 2012-07-24 15:17:59 +01:00
Jitse Niesen
b7ac053b9c Use EISPACK's strategy re max number of iters in Schur decomposition (bug #479). 2012-07-22 22:02:50 +01:00
Jitse Niesen
fd5749f51c LDLT: Report sign consistent with D for indefinite matrices.
See http://forum.kde.org/viewtopic.php?f=74&t=106942
2012-07-22 21:39:38 +01:00
Jitse Niesen
907f4562ac Fix some illegal memory access in sparse conservativeResize() 2012-07-20 22:51:51 +01:00
Benjamin Piwowarski
6bf49ceac2 bug #449: add SparseMatrix::conservativeResize feature 2012-07-19 00:07:06 +02:00
Benoit Jacob
3f08a6a126 add COPYING.MINPACK 2012-07-15 11:46:22 -04:00
Benoit Jacob
df06e5662d MINPACK license is OK for MPL2 after all 2012-07-15 10:30:57 -04:00
Benoit Jacob
f28e95500b add COPYING.README 2012-07-15 10:29:09 -04:00
Benoit Jacob
9bf3ec134e add COPYING.MPL2 2012-07-15 10:20:59 -04:00
Benoit Jacob
b596f6c10c remove outdated "Eigen itself is part of the KDE project" outside of eigen2 files 2012-07-15 10:33:40 -04:00
Jitse Niesen
d3998de472 Silence clang warning about && inside || 2012-07-14 15:50:56 +01:00
Jitse Niesen
4ae3e0a9b8 Evaluators: Fixed bug caused by Diagonal dynamic index change.
This caused the evaluators unit test to fail.
2012-07-14 14:55:04 +01:00
Gael Guennebaud
79214745c7 clean Eigen2Support wrt KDE mentions 2012-07-14 10:15:45 +02:00
Gael Guennebaud
e59f95a9a0 clean old KDE mention and related 2012-07-14 10:04:26 +02:00
Gael Guennebaud
54559094ec document EIGEN_MPL2_ONLY 2012-07-14 09:56:03 +02:00
Gael Guennebaud
46b1c7a0ce fix bug #485: conflict between a typedef and template type parameter 2012-07-13 20:54:38 +02:00
Benoit Jacob
269be00925 Add a EIGEN_MPL2_ONLY build option to generate compiler errors when including non-MPL2 modules 2012-07-13 14:42:47 -04:00
Benoit Jacob
0733e622a3 Manual MPL2 relicensing fixes 2012-07-13 14:42:47 -04:00
Benoit Jacob
69124cfca2 Automatic relicensing to MPL2 using Keirs script. Manual fixup follows. 2012-07-13 14:42:47 -04:00
Keir Mierle
d4ca0963bc Add preliminary script to relicense Eigen to MPL2. 2012-07-11 11:29:52 -07:00
Gael Guennebaud
904ecdf9d8 Add a DynamicIndex constant for signed quantities and use it to fix the conflict
between Diagonal<S,-1> (the first sub diagonal) and a runtime super/sub diagonal which is now:
Diagonal<S,DynamicIndex>
2012-07-10 23:04:17 +02:00
Gael Guennebaud
3e6329a0d9 fix computation of fixed size sub/super diagonal size 2012-07-10 22:39:05 +02:00
Desire NUENTSA
e529bc9cc1 correct bug when applying column permutation 2012-07-10 19:18:50 +02:00
Desire NUENTSA
de2544cc9b working version of sparse LU without fill-reducing permutation 2012-07-10 19:16:57 +02:00
Gael Guennebaud
a2c3003be2 Fix possible underflow issues in SelfAdjointEigenSolver 2012-07-10 09:51:26 +02:00
Desire NUENTSA
3095e4a5f9 Correct bug for triangular solve within supernodes 2012-07-09 19:09:48 +02:00
Gael Guennebaud
ff12a6cd43 fix include path 2012-07-08 16:18:51 +02:00
Desire NUENTSA
b5a83867ca Update Ordering interface 2012-07-06 20:18:16 +02:00
Jitse Niesen
b1b6864c88 Evaluators: Remove member variables if known at compile-time.
Also, use composition instead of inheritance in EvalToTemp evaluator.
2012-07-06 14:50:03 +01:00
Desire NUENTSA
203a0343fd Update Ordering interface 2012-07-06 13:34:06 +02:00
Gael Guennebaud
7bfd8eabff fix compilation with MSVC 2012-07-05 21:58:01 +02:00
Gael Guennebaud
5dbdde0420 Fix bug #480: workaround the Android NDK defining isfinite as a macro 2012-07-05 17:22:25 +02:00
Gael Guennebaud
23df2eed46 bug #481 step 1: add a new Ref<> class for non templated function arguments 2012-07-05 17:00:28 +02:00
Jitse Niesen
60edf02f6f doc: Typo in CustomizingEigen, introduced in previous commit.
Thanks to Christoph Hertzberg for noting this.
2012-07-05 13:56:28 +01:00
Jitse Niesen
37d5825c5a merge 2012-07-05 13:39:06 +01:00
Jitse Niesen
b582b2ebdc doc: Add constructor to example for inheritance.
See "Error in Inheriting Eigen::Vector3d" on forum.
2012-07-05 13:36:02 +01:00
Gael Guennebaud
0a7ce6ad69 fix bug #486: template speacialization of member functions must be declared inline to avoid duplicate references 2012-07-05 13:32:23 +02:00
Jitse Niesen
cb9f3685d3 Move implementation of coeff() &c to Matrix/Array evaluator. 2012-07-05 11:09:42 +01:00
Christoph Hertzberg
03fe095622 bug #488: Add setShift method (and functionality) to Cholmod classes
Also check for Success of numerical factorization
2012-07-04 18:46:14 +02:00
Gael Guennebaud
54d55aeaf6 fix bug #487: isometry * scaling was not compiling 2012-07-04 18:25:07 +02:00
Konstantinos Margaritis
d878cf2227 fix typo 2012-07-04 11:28:59 +03:00
Konstantinos Margaritis
f737536744 fix NEON port, use vget_lane_*() instead of temporary variables (saves extra
load/store), following advice by  Josh Bleecher Snyder <josharian@gmail.com>.
Also implement pmadd() using vmla instead of nested padd/pmul.
2012-07-04 11:12:02 +03:00
Gael Guennebaud
9a97dac4d9 Doc: add an example for HouseholderQR::householderQ() 2012-07-02 16:33:32 +02:00
Gael Guennebaud
eee34f2da4 workaround compilation issue with MSVC 2005 2012-07-02 10:20:44 +02:00
Desire NUENTSA
15f1563533 Before moving to the new building 2012-06-29 17:45:10 +02:00
Jitse Niesen
746378868a Implement A.noalias() = B * C without temporaries
* Wrap expression inside EvalToTemp in copy_using_evaluators() if we
  assume aliasing for that expression (that is, for products)
* Remove temporary kludge of evaluating expression to temporary in
  AllAtOnce traversal
* Implement EvalToTemp expression object
2012-06-29 13:54:09 +01:00
Jitse Niesen
d0b873822f Make product eval-at-once.
* Make product EvalAtOnce in cases OuterProduct, GemmProduct and
  GemvProduct
* Ensure that product evaluators are nested inside EvalToTemp
  evaluator
* As temporary kludge, evaluate expression to temporary in AllAtOnce
  traversal and pass expression operator to evalTo()
2012-06-29 13:49:25 +01:00
Jitse Niesen
2393ceb380 Implement eval-at-once in evaluator.
- Add evaluator_traits with HasEvalTo flag, which is true if evaluator
  has evalTo() function.
- Add AllAtOnce traversal, which calls evalTo() in evaluator.
- If source evaluator in copy_using_evaluator has HasEvalTo set, then
  use AllAtOnce traversal.
2012-06-29 13:32:12 +01:00
Jitse Niesen
c1eb820e50 Implement interface for NoAlias assignments.
* Rename the old copy_using_evaluators to noalias_copy_using_evaluators.
* Write a new copy_using_evaluators which strips NoAlias expression, if present,
  and calls noalias_copy_using_evaluators; in future, it will also take care of
  aliasing in products.
* Add expression() getter to NoAlias.
2012-06-29 13:24:04 +01:00
Jitse Niesen
069fd0e4be Move (part of) evaluation of products to evaluator objects.
* Copy implementation from CoeffBasedProduct.
* Copy implementation from GeneralProduct in InnerProduct case.
* For GeneralProduct in other cases, call the evalTo() member function with
  expression objects in constructor of evaluator.
2012-06-29 13:07:21 +01:00
Gael Guennebaud
9629ba361a bug #482: pass scalar by const ref - pass on the sparse module
(also fix a compilation issue due to previous pass)
2012-06-28 21:01:02 +02:00
Jitse Niesen
23184527fa Resize lhs automatically in copy_using_evaluator(). 2012-06-28 15:25:25 +01:00
Gael Guennebaud
139c91bf30 fix implicit scalar conversion 2012-06-28 13:12:49 +02:00
Gael Guennebaud
a2ace4b79a bug #482: pass scalar arguments by const references. This changeset only concerns the Core and Geometry modules 2012-06-28 02:08:59 +02:00
Gael Guennebaud
cc964b6caf fix performance regression due to check_rows_cols_for_overflow and add appropriate assertions in the PlainObjectBase::resize() functions.
The fix consists in disabling this useless test for statically allocated objects.
2012-06-26 22:16:07 +02:00
Gael Guennebaud
57b5804974 remove dynamic allocation for fixed size object and triangular matrix-matrix products 2012-06-26 17:45:01 +02:00
Jitse Niesen
8994f9962a Fix bug in {Matrix,Array}Wrapper evaluator 2012-06-24 17:35:27 +01:00
Jitse Niesen
d0d077b212 Fix bug in evaluators with sliced vectorization. 2012-06-24 17:33:21 +01:00
Jitse Niesen
172af9facc Fix an evaluator test which was wrong and failed in debug builds. 2012-06-24 17:31:19 +01:00
Gael Guennebaud
7c32904766 typo 2012-06-24 10:13:28 +02:00
Gael Guennebaud
e46fc8c05c fix GMRES 2012-06-23 19:29:21 +02:00
Gael Guennebaud
cc6dd55028 put the resurected files into the Eigen namespace 2012-06-22 16:35:20 +02:00
Gael Guennebaud
62c504e7bf fix most of the shadow warnings in Core/*.h 2012-06-22 16:32:45 +02:00
Gael Guennebaud
5fae6c7848 resurrect expression evaluators 2012-06-22 09:39:35 +02:00
Gael Guennebaud
81e39e1bc6 bump default branch to 3.1.90 2012-06-22 09:27:24 +02:00
Gael Guennebaud
b737850d38 Added tag 3.1.0-rc2 for changeset dd86165c13 2012-06-21 22:00:32 +02:00
Gael Guennebaud
dd86165c13 bump to 3.1.0-rc2 2012-06-21 22:00:13 +02:00
Gael Guennebaud
110cf8bbf5 fix compilation issue with MKL's backend 2012-06-21 17:03:15 +02:00
Gael Guennebaud
d428b620aa add the multithreading topic in the topic list 2012-06-21 10:54:16 +02:00
Gael Guennebaud
eb626877d7 fix sparse benchmark help 2012-06-21 10:53:36 +02:00
Gael Guennebaud
6f3057f624 extend documentation of *Support modules 2012-06-21 10:51:22 +02:00
Gael Guennebaud
5b5f3ecafa MPreal: extended unit test, remove useless internal overloads, add support for internal::cast (needed for printing) 2012-06-21 10:02:32 +02:00
Gael Guennebaud
7380592bc2 patch mpfr c++ copy to fix warnings and min/max issues 2012-06-21 09:59:44 +02:00
Gael Guennebaud
b5093e2585 update internal mpfr C++ copy 2012-06-21 09:56:54 +02:00
Jitse Niesen
8c71d7314b Fix some typos in sparse tutorial. 2012-06-20 09:52:45 +01:00
Gael Guennebaud
b96b429aa2 fix bug #478: RealSchur failed on a zero matrix. 2012-06-20 10:08:32 +02:00
Gael Guennebaud
c8346abcdd fix bug #477: warning with gcc 4.7 2012-06-20 09:54:52 +02:00
Gael Guennebaud
52dce0c126 significantly extend the tutorial of sparse matrices 2012-06-20 09:28:32 +02:00
Gael Guennebaud
882912b85f comment two tests in nomalloc (there is no regression here, it's just I've been too optimistic when adding them recently) 2012-06-20 08:58:26 +02:00
Gael Guennebaud
1727373706 fix geometry tutorial about scalings. 2012-06-18 22:07:13 +02:00
Gael Guennebaud
47a77d3e38 update custom scalar type doc 2012-06-18 21:49:55 +02:00
Gael Guennebaud
791e28f25d update adolc support wrt "new" NumTraits mechanism 2012-06-18 21:32:56 +02:00
Jitse Niesen
148587e229 Update custom scalar example, based on unstable/Eigen/AdolcForward . 2012-06-16 20:35:59 +01:00
Gael Guennebaud
3c9289129b prevent the allocation of the two preconditioner, only one is needed 2012-06-15 23:22:34 +02:00
Desire NUENTSA
f0c34c6822 Build finished... start debugging 2012-06-15 17:23:54 +02:00
Gael Guennebaud
aa3daad883 fix a warning and formatting 2012-06-15 09:16:10 +02:00
Gael Guennebaud
3fd2beebc8 Matrix-Market: fix perf issue and infinite loop 2012-06-15 09:07:13 +02:00
Gael Guennebaud
c858fb353f fix a few warnings 2012-06-15 09:06:32 +02:00
Gael Guennebaud
37d367a231 fix typo in unsupported/NumericalDiff 2012-06-15 07:56:55 +02:00
Gael Guennebaud
12e9f3b0fc Added tag 3.1.0-rc1 for changeset 4ca5735de4 2012-06-14 21:26:11 +02:00
Gael Guennebaud
4ca5735de4 bump to 3.1.0-rc1 2012-06-14 21:25:50 +02:00
Desire NUENTSA
0c9b08e46e build complete... almost 2012-06-14 18:45:04 +02:00
Gael Guennebaud
b9f25ee656 bug #466: better fix for the race condition: this new patch add an initParallel()
function which must be called at the initialization time of any multi-threaded
application calling Eigen from multiple threads.
2012-06-14 14:24:15 +02:00
Gael Guennebaud
a3e700db72 fix bug #475: .exp() now returns +inf when overflow occurs (SSE) 2012-06-14 10:38:39 +02:00
Gael Guennebaud
324ecf153b disable the MKL's vm*powx functions on windows 2012-06-14 09:49:57 +02:00
Desire NUENTSA
f8a0745cb0 Build process... 2012-06-13 18:26:05 +02:00
Desire NUENTSA
c0ad109499 Checking Data structures and function prototypes 2012-06-12 18:19:59 +02:00
Gael Guennebaud
9c7b62415a simplify and clean a bit the Pastix support module 2012-06-12 16:47:14 +02:00
Gael Guennebaud
4e8523b835 update blas interface for trsm 2012-06-12 14:33:03 +02:00
Gael Guennebaud
88e051019b extend nomalloc unit test to test the solve calls 2012-06-12 13:12:47 +02:00
Gael Guennebaud
cd48254a87 fix inclusion order 2012-06-12 11:40:33 +02:00
Gael Guennebaud
924c7a9300 avoid dynamic allocation for fixed size triangular solving 2012-06-12 11:33:50 +02:00
Desire NUENTSA
bccf64d342 Checking Syntax... 2012-06-11 18:52:26 +02:00
Gael Guennebaud
bc580bbffb fix typo 2012-06-11 18:49:30 +02:00
Desire NUENTSA W.
0591011d5c Sparse LU - End Triangular solve... start debugging 2012-06-10 23:36:38 +02:00
Gael Guennebaud
f2849fac20 Fix bug #466: race condition destected by helgrind in manage_caching_sizes.
After all, the solution based on threadprivate is not that costly.
2012-06-08 17:29:02 +02:00
Desire NUENTSA
7bdaa60f6c triangular solve... almost finished 2012-06-08 17:23:38 +02:00
Gael Guennebaud
28d0a8580e workaround ICC 11.1 compilation issue 2012-06-08 14:13:28 +02:00
Gael Guennebaud
7e36d32b32 fix ambiguous calls in the functors by prefixing function calls with internal:: 2012-06-08 09:53:50 +02:00
Desire NUENTSA
f091879d77 Memory management 2012-06-07 19:06:22 +02:00
Gael Guennebaud
5cec86cb1e BTL: add missing TRMM plots, update Eigen's interface 2012-06-07 18:35:38 +02:00
Gael Guennebaud
512e0b151b clean the support for testing existing sparse problems 2012-06-07 18:31:09 +02:00
Gael Guennebaud
83c932ed15 fix a warning 2012-06-07 18:22:13 +02:00
Gael Guennebaud
1e5e66b642 For consistency, Simplicial* now factorizes P A P^-1 (instead of P^-1 A P).
Document how is applied the permutation in Simplicial* .
2012-06-07 16:24:46 +02:00
Gael Guennebaud
63c6ab3e42 fix documentaion of twistedBy 2012-06-07 16:18:00 +02:00
Gael Guennebaud
c1edb7fd95 Added tag 3.1.0-beta1 for changeset b7a7285909 2012-06-06 22:34:08 +02:00
Gael Guennebaud
b7a7285909 bump to beta1 2012-06-06 22:33:39 +02:00
Gael Guennebaud
5a697e495c fix installation path 2012-06-06 22:32:44 +02:00
Desire NUENTSA
268ba3b521 Memory expansion and few bugs 2012-06-06 18:23:39 +02:00
Gael Guennebaud
05af70a958 make sure we do not solve with a null right hand side 2012-06-06 17:11:50 +02:00
Gael Guennebaud
fd32697074 Fix stopping criteria of CG 2012-06-06 17:11:16 +02:00
Gael Guennebaud
b9f0eabd93 discourage users to user developer preprocessor directives 2012-06-06 15:36:08 +02:00
Gael Guennebaud
84d20720b2 fix umfpack for row-major 2012-06-06 09:44:53 +02:00
Gael Guennebaud
9d2b6dd71a test block objects for sparse solving 2012-06-06 09:40:01 +02:00
Gael Guennebaud
c58b759865 Fix bug #454: allow Block/Map objects for solving with SuperLU 2012-06-06 09:37:59 +02:00
williami
fc5f21903b Fixed RVCT 3.1 compiler errors. 2012-06-04 10:21:16 -05:00
Gael Guennebaud
cb64e587c5 Fix kdBVH unit test 2012-06-04 22:01:06 +02:00
Gael Guennebaud
945179b26c CholmodDecomposition now has explicit variants. These variants will allow to provide access to the underlying factors. 2012-06-04 13:24:41 +02:00
Gael Guennebaud
5f5a4d4546 make Simplicial* non-copyable, and fix return type of Simplicial*::compute() 2012-06-04 13:22:44 +02:00
Gael Guennebaud
a2ae063491 add a noncopyable base class for decompositions 2012-06-04 13:21:15 +02:00
Gael Guennebaud
1b20e16546 extend umfpack support 2012-06-04 10:39:57 +02:00
Desire NUENTSA
4e5655cc03 Supernodal Matrix 2012-06-01 18:44:51 +02:00
Gael Guennebaud
b509cf0742 Fix bug #468: generalize UmfPack support to accept any input at the cost of an implicit copy. 2012-06-01 16:31:36 +02:00
Gael Guennebaud
7f63169f09 SimplicialCholesky: avoid multiple twisting of the same matrix when calling compute() 2012-06-01 15:51:03 +02:00
Desire NUENTSA
b26d6b02de Eliminate and prune columns in a panel 2012-05-31 17:10:29 +02:00
Desire NUENTSA
8608d08d65 Symbolic and numeric updates within the panel 2012-05-30 18:09:26 +02:00
Desire NUENTSA
8ab820b5b8 Symbolic and numeric update on a whole panel 2012-05-29 17:55:38 +02:00
kmargar
97cdf6ce9e ARM NEON supports multiply-accumulate instruction vmla, use that in pmadd(). 2012-05-28 14:55:23 +03:00
Desire NUENTSA
b6267507ea Add preliminary files for SparseLU 2012-05-25 18:17:57 +02:00
Desire NUENTSA
b202c5ed2f The sparse quick reference guide is not ready 2012-05-25 18:02:38 +02:00
Desire NUENTSA
1b9097644d Add common options to the benchmark interface 2012-05-25 17:58:43 +02:00
Desire NUENTSA
5cbe6a5fbf Read header of Hermitian matrices 2012-05-25 17:53:37 +02:00
Desire NUENTSA
2fecd818c4 Add a preliminary reference guide on sparse interface 2012-05-25 17:52:11 +02:00
Gael Guennebaud
695a7ab9d7 protect min/max with parenthesis 2012-05-15 08:18:39 +02:00
Jitse Niesen
b5f70814c1 Warn users against dangerous macros.
Also, mark EIGEN_DEFAULT_TO_ROW_MAJOR as internal (see also bug #422).
2012-05-13 21:42:45 +01:00
Gael Guennebaud
ce2e2fe336 bug #455: add support for c++11 in aligned_allocator 2012-05-03 11:55:30 +02:00
Jitse Niesen
823c44e4e5 merge 2012-05-02 17:21:29 +01:00
Philip Avery
cb3b1bb73e AutoDiffScalar: fix bug with operator/, add missing functions 2012-05-02 17:17:12 +02:00
clusty
d062a8bd31 Got rid of a warning message by doing an explicit cast 2012-05-02 10:50:44 -04:00
Gael Guennebaud
8f47246475 fix lmdif1 with Scalar!=double 2012-05-01 14:46:02 +02:00
Jitse Niesen
65fb0d43ff Define NoChange as enum constant (bug #450).
This gets rid of some warnings on Intel Composer XE, apparently.
2012-04-29 15:37:44 +01:00
Gael Guennebaud
1741dbce1a fix more warnings in MKL support 2012-04-18 18:36:25 +02:00
Jitse Niesen
57b5767fe2 Fix infinite recursion in ProductBase::coeff() (bug #447)
Triggered by product of dynamic-size 1 x n and n x 1 matrices.
Also, add regression test.
2012-04-18 15:23:28 +01:00
Gael Guennebaud
5cab18976b cleaning pass: rm unused variables in MKL stuff, fix a few namespace issues, MarketIO needs iostream 2012-04-18 10:09:46 +02:00
Gael Guennebaud
1198ca0284 remove debug output 2012-04-17 08:38:42 +02:00
Jitse Niesen
5d56f9f763 Remove unused file EigenvaluesCommon.h 2012-04-16 13:47:48 +01:00
Jitse Niesen
3c412183b2 Get rid of include directives inside namespace blocks (bug #339). 2012-04-15 11:06:28 +01:00
Hauke Heibel
84c93b048e Added spline interpolation with pre-defined knot parameters. 2012-04-13 12:50:05 +02:00
Gael Guennebaud
f6a5508392 remove an extra ';' and suppress a 'variable used before its value is set' warning 2012-04-11 09:49:52 +02:00
Gael Guennebaud
a3ddb14426 remove use of GSL in polynomialsolver unit test 2012-04-11 09:48:01 +02:00
Gael Guennebaud
51410975ac suppress extra ',' and ';' 2012-04-10 17:32:21 +02:00
Gael Guennebaud
b0cf95619e fix compilation of "somedensematrix.llt().matrixL().transpose()" (missing constness on the return types) 2012-04-10 15:40:36 +02:00
Gael Guennebaud
311c5b87a3 Replicate now makes use of the cost model to evaluate its nested expression 2012-04-06 00:22:13 +02:00
Thomas Capricelli
3018e80c59 uniformize eigen_gen_docs between branches / cleaning 2012-04-03 14:24:20 +02:00
Gael Guennebaud
a060e0b486 does not include MatrixMaketIterator on win32,
no "using whatever" in global scope in a header file
2012-03-31 18:01:43 +02:00
Gael Guennebaud
daaeddd581 rm unused gsl_helper file 2012-03-31 17:37:46 +02:00
Gael Guennebaud
48f0bbb586 fix bug #362 and add missing specialization for affine-compact * projective 2012-03-30 23:22:29 +02:00
Gael Guennebaud
63ea667ed7 fix compilation with ICC 2012-03-30 11:22:23 +02:00
Desire NUENTSA
5dbb646190 Add private copy constructors to sparse solvers backends 2012-03-29 19:19:12 +02:00
Desire NUENTSA
2d35f88bcf Cholmod does not compute a determinant 2012-03-29 19:07:13 +02:00
Desire NUENTSA
22cd65ee33 Adding a householder-GMRES implementation from Kolja Brix 2012-03-29 15:00:55 +02:00
Desire NUENTSA
f776c061a1 Correct a small bug in sparse_solver 2012-03-29 14:53:42 +02:00
Desire NUENTSA
f804a319c8 modify the unit tests of sparse linear solvers to enable tests on real matrices, from MatrixMarket for instance 2012-03-29 14:32:54 +02:00
Desire NUENTSA
ada9e79145 add a benchmark routine for all sparse linear solvers in Eigen 2012-03-29 14:29:55 +02:00
Gael Guennebaud
caecaf9c9e add missing forward declaration 2012-03-29 13:45:01 +02:00
Gael Guennebaud
c172abdcc7 add sparse * permutation products with assiciated unit tests 2012-03-29 11:29:43 +02:00
Gael Guennebaud
8ff882aa4c add sparse-selfadjoint to sparse-selfadjoint assignment operators
(no need to use .twistedBy(I) anymore)
2012-03-29 11:28:43 +02:00
Gael Guennebaud
fd2f399c18 fix bug #439: add Quaternion::FromTwoVectors() static constructor 2012-03-26 18:30:04 +02:00
Gael Guennebaud
6c3b8b2ebc we have a new server for hosting CDash reports. 2012-03-22 19:15:47 +01:00
Desire NUENTSA
afeddd80ab Algorithm to equilibrate rows and columns of a square matrix 2012-03-22 16:18:34 +01:00
Desire NUENTSA
0d52b965c8 Add simple API to set Pastix parameters 2012-03-22 15:54:52 +01:00
Desire NUENTSA
f6cd3389a2 compress loaded market matrix 2012-03-22 15:53:25 +01:00
Gael Guennebaud
daad446d5d workaround stupid gcc 4.7 warning 2012-03-22 00:01:03 +01:00
Gael Guennebaud
f0a1652113 s/__SSE3__/EIGEN_VECTORIZE_SSE3 2012-03-21 23:50:43 +01:00
Gael Guennebaud
b0fd94aa85 improve FindFFTW cmake module 2012-03-15 15:18:22 +01:00
Kolja Brix
30dee7d235 Add some documentation to existing methods in the Householder module. 2012-03-08 12:42:10 +01:00
Gael Guennebaud
77b05d5b7d remove parenthesis suggestion warning 2012-03-14 17:38:21 +01:00
Gael Guennebaud
60daf70a20 add 2 missing ReverseInnerIterators 2012-03-14 17:37:28 +01:00
Hauke Heibel
dd9365e089 Fixed division by zero corner case in array unit test. 2012-03-09 14:04:13 +01:00
Gael Guennebaud
d7da6f63a8 declare Block::m_outerStride as Index (instead of int) 2012-03-09 13:54:22 +01:00
Gael Guennebaud
728ca6ad9c export IsRowMajor in MappedSparseMatrix 2012-03-09 13:52:35 +01:00
Gael Guennebaud
9b1ad5e5bd rm cC++11 features 2012-03-09 12:08:06 +01:00
Gael Guennebaud
fe9b7c2564 typo in variable name not revealed by ICC 2012-03-08 21:45:00 +01:00
Gael Guennebaud
48a3e0ed55 fix conversion warning 2012-03-08 21:31:49 +01:00
Desire NUENTSA
0d8466d317 Adding an interface to PaStiX, the multithreaded and distributed linear solver 2012-03-08 18:59:08 +01:00
Desire NUENTSA
37d2efd4f6 Adding support to read and write complex matrices in Matrix Market format 2012-03-08 18:45:47 +01:00
Hauke Heibel
c08521ea6b Improved the unit tests for setLinSpaced.
Provide a default constructed step size as opposed to an int when the size is 1.
2012-03-07 16:18:35 +01:00
Hauke Heibel
ef022da28e Fixed setLinSpaced for size==1. 2012-03-07 15:34:39 +01:00
Hauke Heibel
81c1336ab8 Added support for component-wise pow (equivalent to Matlab's operator .^). 2012-03-07 08:58:42 +01:00
Hauke Heibel
aee0db2e2c Moved the operator/(Scalar,ArrayBase) into the Eigen namespace. 2012-03-02 16:58:12 +01:00
Hauke Heibel
8cb3e36e14 Added support for scalar / array division. 2012-03-02 16:27:27 +01:00
Hauke Heibel
8a7d16d523 Replicate ctor now uses Index instead of int. 2012-03-02 16:27:08 +01:00
Gael Guennebaud
553a0ae924 simplify and speedup sparse * dense matrix products 2012-03-01 10:13:13 +01:00
Desire NUENTSA
85b358097d allow null elements in sparse assignments 2012-02-29 15:51:23 +01:00
Gael Guennebaud
fc85f91df0 fix MKL interface with LLT::rankUpdate 2012-02-28 16:19:40 +01:00
Gael Guennebaud
309b27b545 update unit test for Simplicial-Cholesky 2012-02-28 14:21:54 +01:00
Gael Guennebaud
0d3d46573e fix assertion condition 2012-02-27 19:04:34 +01:00
Gael Guennebaud
5effdba2c6 SimplicialCholesky*: s/LLt/LLT and s/LDLt/LDLT for consistency with dense names 2012-02-27 14:28:07 +01:00
Gael Guennebaud
ece30e9e6f fix a couple of warnings 2012-02-27 14:27:12 +01:00
Gael Guennebaud
eb168ef8ed add analyzePattern/factorize API to iterative solvers and basic preconditioners 2012-02-27 14:10:26 +01:00
Gael Guennebaud
122f28626c fix and clean Pardiso solver and s/PARDISOSupport/PardisoSupport 2012-02-27 13:23:21 +01:00
Gael Guennebaud
b240a3fad9 add unit tests for analyzePatter/factorize API 2012-02-27 13:22:38 +01:00
Gael Guennebaud
bc8188f6a1 fix symmetric permuatation for mixed storage orders 2012-02-27 13:21:41 +01:00
Gael Guennebaud
128ff9cf07 declare a ReverseInnerIterator in sparse CwiseBinaryOp. These ReverseInnerIterator should probably be removed anyway since we currently don't have real use cases for them. The only one in TriangularSolver could be advantageously replaced by a binary search. 2012-02-23 11:38:18 +01:00
Christoph Hertzberg
1edfa64f44 bug #419: Add spaces between adjacent > in template arguments 2012-02-15 14:14:29 +01:00
Gael Guennebaud
eff167d2c8 SSOR is not there yet 2012-02-19 16:01:13 +01:00
Gael Guennebaud
4cc6d7aa62 clean a bit the ILUT code 2012-02-14 22:07:19 +01:00
Rhys Ulerich
ef448da57b add Eigen::Array support to GDB pretty printers 2012-02-11 20:50:21 -06:00
Gael Guennebaud
7de3478027 <complex> must be included first 2012-02-10 22:49:09 +01:00
Gael Guennebaud
ef7f1371b2 some cleaning and add copyrights 2012-02-10 19:38:31 +01:00
Desire NUENTSA
16da7299dd Add test in BiCGSTAB for ILUT 2012-02-10 18:57:38 +01:00
Desire NUENTSA
edbebb14de Split the computation of the ILUT into two steps 2012-02-10 18:57:01 +01:00
Desire NUENTSA
a815d962da Add the implementation of the Incomplete LU preconditioner with dual threshold (ILUT)
Modify the BiCGSTAB function to check the residual norm of the initial guess
2012-02-10 10:59:39 +01:00
Desire NUENTSA
9ed6a267a3 Modify the LinSpaced function to take only the two bounds 2012-02-10 10:21:11 +01:00
Desire NUENTSA
2ea98594c4 Modify the symmetric permutation to deal with nonsymmetric matrices 2012-02-10 10:18:38 +01:00
Gael Guennebaud
70284b7eff suppress generation of TEMPLATE_RELATIONS: they are useful but take much too much space 2012-02-09 21:42:58 +01:00
Gael Guennebaud
8dd3ae282d fix bug #417: Map should be nested by value, not by reference 2012-02-09 15:25:42 +01:00
Tim Holy
44b19b432c Add a tutorial page on the Map class, and add a section to FunctionsTakingEigenTypes about multiple-argument functions and the pitfalls when using Map/Expression types. 2012-02-08 22:11:12 +01:00
Gael Guennebaud
5bb34fd14c fix bug #415: wrong return in Rotation2D::operator*= 2012-02-08 21:50:51 +01:00
Desire NUENTSA
a1c7b5aa48 Adding support for twistedby on SparseMatrixBase 2012-02-08 18:22:48 +01:00
Gael Guennebaud
3836402631 Improve performance of some Transform<> operations by better preserving the alignment status.
There probably many other places in Transform.h where such optimizations could be done.
2012-02-07 17:12:15 +01:00
Gael Guennebaud
ff67676c0b Added tag 3.1.0-alpha2 for changeset fe0350cf1b 2012-02-06 16:39:51 +01:00
Gael Guennebaud
fe0350cf1b bump 2012-02-06 16:39:26 +01:00
Gael Guennebaud
99c694623a fix a dozen of warnings with MSVC, and get rid of some useless throw() 2012-02-06 15:57:51 +01:00
Gael Guennebaud
6ad48c5d92 fix conjugation in packet_lhs 2012-02-05 18:18:38 +01:00
Gael Guennebaud
4ed87c59c7 Update the PARDISO interface to match other sparse solvers.
- Add support for Upper or Lower inputs.
- Add supports for sparse RHS
- Remove transposed cases, remove ordering method interface
- Add full access to PARDISO parameters
2012-02-04 14:20:56 +01:00
Gael Guennebaud
1763f86364 add the recent setFromTriplets() feature in the manual 2012-02-04 10:44:07 +01:00
Gael Guennebaud
fe85b7ebc6 fix several const qualifier issues: double ones, meaningless ones, some missing ones, etc.
(note that const qualifiers are set by internall::nested)
2012-02-03 23:18:26 +01:00
Gael Guennebaud
bc7b251cd9 fix compilation errors with ICC 2012-02-03 23:16:52 +01:00
Gael Guennebaud
a594d7ffd7 stop disabling this legitimate warning, recall that in the following the const on FooRef is really meaningless:
typedef Foo& FooRef;
const FooRef foo;
2012-02-03 23:16:11 +01:00
Gael Guennebaud
ad4aa7873f remove unused variables 2012-02-03 13:30:48 +01:00
Gael Guennebaud
fd4aefadcd fix ctest -D Foo with MSVC 2008 2012-02-03 10:50:49 +01:00
Zuiquan
a64407f086 Enable Eigen to compile on 'pure C/C++' Gcc environment (with no inline assembly or asm directive). Required if we want to use Eigen with Adobe Alchemy. 2012-02-02 12:05:02 +01:00
Gael Guennebaud
13abb37721 shutup floating point underflow warning for this specific unit test 2012-01-31 23:18:17 +01:00
Gael Guennebaud
7002639844 the default ctor had no sense because of the const reference member 2012-01-31 23:12:04 +01:00
Gael Guennebaud
13e46ad847 add missing return *this 2012-01-31 23:11:13 +01:00
Gael Guennebaud
9a954d29ec rm non standard and useless overloads of is_arithmetic for long long 2012-01-31 21:45:03 +01:00
Gael Guennebaud
634fedaf68 proper C++ casting 2012-01-31 18:56:25 +01:00
Gael Guennebaud
10cd52350f fix a few warnings: change of sign and missing return statement 2012-01-31 13:05:44 +01:00
Gael Guennebaud
9c86ee2695 fix static inline versus inline static issues (the former is the correct order) 2012-01-31 12:58:52 +01:00
Gael Guennebaud
8d6e394b06 workaround "empty macro argument" warning 2012-01-31 12:46:14 +01:00
Gael Guennebaud
670e3af5a8 add .data() member to Diagonal<> 2012-01-31 12:44:59 +01:00
Gael Guennebaud
18e3ac0f0d fix some compilation errors with ICC and -strict-ansi 2012-01-31 09:14:01 +01:00
Gael Guennebaud
87138075da add the possibility to assemble a SparseMatrix object from a random list of triplets that may contain duplicated elements. It works in linear time, with O(1) re-allocations. 2012-01-28 11:13:59 +01:00
Gael Guennebaud
fc2d85d139 fix memory leak in SuperLUSupport 2012-01-27 10:07:09 +01:00
Gael Guennebaud
27d222d23e honor nested types in dense * sparse 2012-01-27 09:39:36 +01:00
Jitse Niesen
ed244e9c1a Document that JacobiSVD also handles complex matrices.
Thanks to 'Jazzdude' for noting this on IRC.
2012-01-26 13:16:50 +00:00
Gael Guennebaud
0251bb6c1d add missing inline keyword (linking issue) 2012-01-26 10:53:42 +01:00
Gael Guennebaud
65d5311c68 SimplicialCholesky: the shift offset must be real, and fix a comparison issue for complexes 2012-01-26 10:34:45 +01:00
Gael Guennebaud
d9f5840f7a simple compilation fix 2012-01-26 08:52:20 +01:00
Gael Guennebaud
a108216af1 fix bug #410: fix a possible out of range access in EigenSolver 2012-01-25 19:02:31 +01:00
Christoph Hertzberg
362fcabc44 Check for positive definiteness in SimplicialLLT 2012-01-14 22:34:18 +01:00
Gael Guennebaud
5e4dfa4a09 fix a nesting type issue in Sparse/TriangularView 2012-01-25 18:16:48 +01:00
Gael Guennebaud
606e204f6d fix bug #406: Using OpenMP and Eigen causes infinite loop/deadlock
(transplanted from fd52daae87
)
2012-01-25 17:42:22 +01:00
Gael Guennebaud
c68616b3b5 fix warning with gcc 4.6 2012-01-25 15:48:50 +01:00
Gael Guennebaud
87f2af5930 workaround ICC compilation error with -strict-ansi 2012-01-25 15:45:01 +01:00
Gael Guennebaud
d615d39af0 determine windows version from major.minor only, the patch number is irrelevant. 2012-01-23 21:56:46 +01:00
Gael Guennebaud
0d03492e1e std::isfinite is non standard 2012-01-23 21:49:00 +01:00
Gael Guennebaud
ee9f3e34b0 LLT: improve rankUpdate to support downdates,
LDLT: add the missing info() function,
improve unit testing of rankUpdate()
2012-01-23 17:28:23 +01:00
Abraham Bachrach
039408cd66 added functions to allow for cwise min/max operations with scalar argument (bug #400).
added function for array.min(), array.max(), matrix.cwiseMin(), matrix.cwiseMax().

The matrix.cwiseMin/Max functions required the definition of the ConstantReturnType typedef.
However, it wasn't defined until after MatrixCwiseBinaryOps was included in Eigen/src/SparseCore/SparseMatrixBase.h,
so I moved those includes after the definition of the typedefs.

tests for both the regular and scalar min/max functions were added as well
2012-01-11 11:00:30 -05:00
Gael Guennebaud
238999045c optimize the packing of lhs blocks for matrix-matrix products => significant speedup for small products 2012-01-21 19:34:28 +01:00
Jitse Niesen
0e1e0a2a58 Make sure that now-fixed assert is not triggered. 2012-01-19 14:30:44 +00:00
Keir Mierle
274f8a0947 Fix broken asserts releaved by Clang. 2012-01-18 15:03:27 -08:00
Gael Guennebaud
589cc627f8 fixe one more VC10 ICE 2012-01-18 17:45:22 +01:00
Gael Guennebaud
db8f528737 fix VC10 ICE 2012-01-18 17:42:13 +01:00
Jitse Niesen
d6bf9f848a Correct description of rankUpdate() in quick reference guide.
Thanks to Sameer Agarwal for pointing out this mistake.
(transplanted from bc0fc5d21e
)
2012-01-09 12:57:11 +00:00
Keir Mierle
2d4fee0b40 Fix out-of-range int constant in 4x4 inverse.
(transplanted from 45bcad41b4
)
2012-01-05 23:15:09 -08:00
Gael Guennebaud
e7ef367db1 suppress unused variable warnings 2012-01-06 09:02:06 +01:00
Gael Guennebaud
bdee0c9baa set the default number of iteration to the size of the problem 2011-12-27 16:38:05 +01:00
Gael Guennebaud
15ea999f84 pushed too fast the previous one 2011-12-23 23:22:31 +01:00
Gael Guennebaud
901bcdd2a8 the previous test works for Dynamic sizes only 2011-12-23 23:16:43 +01:00
Gael Guennebaud
96a18ef230 add a reconstruction test 2011-12-23 23:15:08 +01:00
Gael Guennebaud
8171adb7ff fix bug #398, the quaternion returned by slerp was not always normalized,
add a proper unit test for slerp
2011-12-23 22:39:32 +01:00
Gael Guennebaud
67ae94f3a2 fix compilation of sparse_basic unit test for complexes 2011-12-23 09:41:14 +01:00
Gael Guennebaud
e3e39ea26d suppress an 'unused variable' warning 2011-12-22 14:06:16 +01:00
Gael Guennebaud
2c03e6fccc evaluate 1D sparse expressions into SparseVector and make the sparse operator<< and dot honor nested types 2011-12-22 14:01:06 +01:00
Gael Guennebaud
7f04845023 fix assignment of a row-major sparse vector to a column major sparse one 2011-12-22 11:53:47 +01:00
Gael Guennebaud
e4cea957df fix bug #391: prune was for compressed format only, now it also turns the matrix into compressed form 2011-12-20 18:37:24 +01:00
Gael Guennebaud
7e866c447f fix bug #391: improper stream output for uncompressed mode, also avoid double debugging outputs for column major matrices 2011-12-20 18:31:00 +01:00
Gael Guennebaud
6f92b75874 add aliasing test for sparse*sparse product 2011-12-20 18:10:22 +01:00
Gael Guennebaud
50d756b9ea fix bug #394: innerVector::nonZeros() was broken for uncompressed mode 2011-12-20 18:10:02 +01:00
Gael Guennebaud
15d781b64c we need to define EXTRACT_ALL to YES to get doxygen see the whole hierarchy. Exclude internal::* from the doc. 2011-12-20 10:25:54 +01:00
Gael Guennebaud
fcc966b40d workaround doxygen limitation to follow the base class of PlainObjectBase 2011-12-19 22:13:11 +01:00
Gael Guennebaud
33e52a3943 rm local fill-in ratio estimation (was broken sometimes) 2011-12-16 16:29:46 +01:00
Gael Guennebaud
732a50d043 implement a more optimistic heuristic to predict the nnz of a saprse*sparse product 2011-12-16 15:59:44 +01:00
Gael Guennebaud
40c0f3af57 fig bug #396: add a static assertion on the storage order of a sparse-sparse coeff-wise binary op 2011-12-15 19:23:20 +01:00
Jitse Niesen
3db6455896 Remove evaluators for 2.1 release.
We plan to re-instate them when we branch 2.2 (see bug #388).
2011-12-14 21:23:43 +00:00
Gael Guennebaud
0308c11849 remove a file that was not intended to be committed 2011-12-13 08:42:48 +01:00
Jitse Niesen
1e7712771e Remove asserts that eigenvalue computation has converged (bug #354). 2011-12-12 17:17:38 +00:00
Gael Guennebaud
1aa6c7f122 fix sparse insertion example 2011-12-11 17:18:14 +01:00
Gael Guennebaud
d738bedc5b remove redundant declaration (fix compilation with clang 3.0) 2011-12-11 11:45:03 +01:00
Gael Guennebaud
f60e6f5ee8 s/compressed()/isCompressed() 2011-12-10 23:08:10 +01:00
Gael Guennebaud
594fd2d11d Cholmod: add support for uncompressed SparseMatrix objects 2011-12-10 22:53:31 +01:00
Gael Guennebaud
9d7d634897 add cholmod_support unit tests 2011-12-10 19:32:17 +01:00
Gael Guennebaud
f35708d2e0 enforce weak linking of xerbla 2011-12-10 19:30:36 +01:00
Gael Guennebaud
105e170d8b trivial compilation fix 2011-12-10 16:17:12 +01:00
Gael Guennebaud
2600ba1731 feature 297: s/intersectionPoint/pointAt, fix documentation, add a unit test 2011-12-10 12:17:42 +01:00
Andy Somerville
c06ae325a4 feature 297: add ParametrizedLine::intersectionPoint() and intersectionParam()
-> intersection() is deprecated
2011-12-10 11:58:38 +01:00
Igor Krivenko
36457178f9 bug #352:properly cast constants 2011-12-09 23:38:41 +01:00
Gael Guennebaud
d400a6245e fix compilation with EIGEN_NO_DEBUG 2011-12-09 23:42:39 +01:00
Gael Guennebaud
38277e8a9b feature 319: fix LDLT::rankUpdate for complex/upper, simply the algortihm, update copyrights 2011-12-09 23:08:38 +01:00
Tim Holy
2d7c3eea53 feature 319: Add update and downdate functionality to LDLT 2011-12-09 21:04:44 +01:00
Gael Guennebaud
37f304a2e6 add a "using MKL" documentation page, add a minimal documentation of PARDISO wrapper classes, refine a bit the EIGEN_USE_* logic 2011-12-09 16:52:37 +01:00
Sebastian Lipponer
fff25a4b46 Fix MSVC integer overflow warning 2011-12-09 10:39:10 +00:00
Gael Guennebaud
57c6bfba08 add missing CMakeLists.txt 2011-12-09 10:53:12 +01:00
Gael Guennebaud
081abb701d add user defined CXX and LINKER flag cmake variables for the unit tests 2011-12-09 10:50:13 +01:00
Gael Guennebaud
10447a7b57 mv blas.h to src/misc such that it would be possible to use any blas libraries,
however, this requires some more works:
 - add const qualifiers in the declarations of blas.h
 - add the possibility to add a suffix to blas function names
2011-12-09 10:40:35 +01:00
Gael Guennebaud
43cdd242d0 - split and rename defined tokens to enable the use of BLAS/Lapack/VML/etc
- include MKL headers outside the Eigen namespace.
2011-12-09 10:06:49 +01:00
karturov
015c331252 Intel(R) MKL support added.
* * *
License disclaimer changed to BSD license for MKL_support.h
* * *
Pardiso support fixed, test added.
blas/lapack tests fixed: Scalar parameter was added in Cholesky, product_matrix_vector_triangular remaned to triangular_matrix_vector_product.
* * *
PARDISO test was added physically.
2011-12-05 14:52:21 +07:00
Gael Guennebaud
e270a5656a fix min/max clash with clang's header by including fstream beforehand 2011-12-08 23:27:10 +01:00
Gael Guennebaud
86bb20c431 remove dead code 2011-12-08 23:22:28 +01:00
Gael Guennebaud
e36a4c880a suppress deprecated warning when compiling legacy tests 2011-12-08 23:15:07 +01:00
Gael Guennebaud
06450882ab add missing CMakeLists.txt in Splines 2011-12-08 23:12:39 +01:00
Jitse Niesen
dd232e30b0 Document QuaternionBase, minor doc improvements.
* Document class QuaternionBase so that docs for members are displayed.
* Remove obsolete \redstar refering to Array module
* Fix typo in Constants.h
* Document EIGEN_NO_AUTOMATIC_RESIZING
2011-12-08 14:22:06 +00:00
Gael Guennebaud
a1fa05f14e improve compiler name and version detection 2011-12-07 13:20:52 +01:00
Gael Guennebaud
a0da96e2f4 fix detection of ICC version 2011-12-06 22:07:20 +01:00
Gael Guennebaud
80f8ed9f9c improve compiler and architecture detection 2011-12-06 19:54:34 +01:00
Thomas Capricelli
c3ad1f9382 eigen_gen_docs: dont try to update permissions on server 2011-12-06 15:55:20 +01:00
Gael Guennebaud
6ec0af6dc7 Added tag 3.1.0-alpha1 for changeset e017f798eb 2011-12-06 15:53:46 +01:00
Gael Guennebaud
e017f798eb bump 2011-12-06 15:53:17 +01:00
Hauke Heibel
accae638b2 Fixed a typo. 2011-12-06 15:42:05 +01:00
Gael Guennebaud
84cf1b5b1d fix QuaternionBase::cast.
It did not work with clang, and I'm unsure how it worked for gcc/msvc since QuaternionBase was introduced
2011-12-05 14:13:59 +01:00
Gael Guennebaud
9ca673daed fix compilation with clang 2011-12-05 12:50:43 +01:00
Gael Guennebaud
dd504d6aae fix bug #223: SparseMatrix::Flags no longer encode triangularness information 2011-12-05 10:17:09 +01:00
Gael Guennebaud
59576014a9 fig bug #373: compilation error with clang 2.9 when exceptions are disabled (cannot reproduce with clang 3.0 or 3.1) 2011-12-05 09:44:25 +01:00
Gael Guennebaud
b60624dc2a fix bug #384: add a static assertion on the Index type which has to be signed 2011-12-04 22:14:53 +01:00
Gael Guennebaud
82f9aa194d fix bug #294: add a diagonal() method to SparseMatrix (const) 2011-12-04 21:49:21 +01:00
Gael Guennebaud
69966e90e1 fix bug #221: remove the dense to SparseVector conversion ctor. 2011-12-04 21:15:46 +01:00
Gael Guennebaud
5dc9650f11 fix bug #281: replace csparse macros by template functions 2011-12-04 19:15:23 +01:00
Hauke Heibel
a8a2bf3b5a Added docs to the spline module. 2011-12-04 18:44:01 +01:00
Gael Guennebaud
9bd902ed9c fix bug #341: trisove on MappedSparseMatrix 2011-12-04 14:57:43 +01:00
Gael Guennebaud
9353bbac4a fix bug #356: fix TriangularView::InnerIterator for unit diagonals 2011-12-04 14:39:24 +01:00
Gael Guennebaud
32917515df make the accessors to internal sparse storage part of the public API and remove their "_" prefix. 2011-12-04 12:19:26 +01:00
Gael Guennebaud
1cdbae62db add SparseVector::ReverseInnerIterator 2011-12-04 09:56:40 +01:00
Gael Guennebaud
91e392a042 add ReverseInnerIterators to loop over the elements in reverse order,
and partly fix bug #356 (issue in trisolve for upper-column major))
2011-12-03 23:49:37 +01:00
Gael Guennebaud
a09cc5d4c0 fix bug #282: add the possibiliry to shift the diagonal coefficients via a linear function. 2011-12-03 18:26:08 +01:00
Gael Guennebaud
c861e05181 fix matrix names in the insertion example 2011-12-03 18:14:51 +01:00
Gael Guennebaud
9ae606866c Eigen2sSupport: import some fixes from the 3.0 branch (MSVC fix) 2011-12-03 17:45:07 +01:00
Gael Guennebaud
950eeab4d7 RandomSetter: turns the matrix into compressed form before the filling 2011-12-03 17:35:21 +01:00
Gael Guennebaud
c0e36516f3 add a command to fix the permission of the uploaded documentation 2011-12-03 11:18:20 +01:00
Gael Guennebaud
3f56de2628 improve sparse manual 2011-12-03 10:26:00 +01:00
Gael Guennebaud
e759086dcd improve documentation of some sparse related classes 2011-12-02 19:02:49 +01:00
Gael Guennebaud
4ca89f32ed Sparse matrix insertion:
- automatically turn a SparseMatrix to uncompressed mode when calling insert(i,j).
 - now coeffRef insert a new element when it does not already exist
2011-12-02 19:00:16 +01:00
Gael Guennebaud
f10bae74e8 - move CompressedStorage and AmbiVector into internal namespace
- remove innerVectorNonZeros(j) => use innerVector(j).nonZeros()
2011-12-02 10:00:24 +01:00
Jitse Niesen
a0bcaa88af Extend tutorial page on broadcasting to reflect recent changes. 2011-12-01 21:16:07 +00:00
Gael Guennebaud
b85bcd91bf remove GSL dependency in the unit tests 2011-12-01 18:17:19 +01:00
Gael Guennebaud
7aaae9d6df remove useless blas reference code 2011-12-01 18:10:12 +01:00
Gael Guennebaud
3a4c78b588 add code for band triangular problems:
- currently available from the BLAS interface only
 - and for vectors only
2011-12-01 18:06:28 +01:00
Gael Guennebaud
9fdb6a2ead output error messages in blas unit tests 2011-12-01 18:04:01 +01:00
Hauke Heibel
b00a33bc70 Integrated spline class and simple spline fitting 2011-11-25 14:53:40 +01:00
Gael Guennebaud
49d652c600 fix assigment from uncompressed 2011-11-30 21:55:54 +01:00
Gael Guennebaud
6b8d6887ac bug fix in SparseSelfAdjointTimeDenseProduct for empty rows or columns 2011-11-30 19:39:20 +01:00
Gael Guennebaud
00d4a360ba bug fix in SparseView::incrementToNonZero 2011-11-30 19:31:11 +01:00
Gael Guennebaud
d1b54ecfa3 add more support for uncompressed mode 2011-11-30 19:24:43 +01:00
Gael Guennebaud
cda397b117 cleanning pass on the sparse modules:
- remove outdated/deprecated code
 - improve a bit the documentation
2011-11-28 16:36:37 +01:00
Gael Guennebaud
2d621d235d fix alignment computation in Block and MapBase such that aligned means aligned on 16 bytes and nothing else
(transplanted from dcb36e3d49
)
2011-11-28 13:43:10 +01:00
Marc Glisse
a2810aa32f bug #383 - another c++11-user-defined-literal fix 2011-11-27 15:27:25 -05:00
Marc Glisse
8107b3da75 bug #383 - EIGEN_ASM_COMMENT broken in C++11
this is due to the new user-defined literals syntax.
2011-11-26 17:55:18 -05:00
Gael Guennebaud
f56316f7ed add two alternative solutions to the problem of fixed size members 2011-11-25 13:46:48 +01:00
Gael Guennebaud
70206ab1e1 draft of the new sparse manual reflecting the new sparse module 2011-11-24 17:32:30 +01:00
Gael Guennebaud
57d1ccb2dc fix compilation of doc (broken by changeset bc6d78982f
- General tightening/testing of vectorwise ops)
2011-11-24 17:30:55 +01:00
Gael Guennebaud
2d4fe54b73 fix CG example 2011-11-24 08:19:13 +01:00
Gael Guennebaud
01b4b6e456 improve accuracy of 3x3 direct eigenvector extraction 2011-11-23 22:43:40 +01:00
Gael Guennebaud
be9b87377f typo 2011-11-23 08:30:10 +01:00
Jitse Niesen
63dcdb65fd Install eigen3.pc in default directory if pkgconfig not found (bug #358). 2011-11-22 17:30:35 +00:00
Benoit Jacob
ffe6d1f901 Alignment fixes:
* Fix AlignedBit computation for Plain Objects
 * use it for the conditional alignment of operator new
 * only overload new in PlainObjectBase, don't overload again in Matrix and Array
2011-11-22 09:04:31 -05:00
Gael Guennebaud
f278a3eaba stop fill pivoting LU only if the pivot is exactly 0 2011-11-22 09:18:54 +01:00
Benoit Jacob
bc6d78982f Bugs 157 and 377 - General tightening/testing of vectorwise ops:
* add lots of static assertions making it very explicit when all these ops
are supposed to work:
** all ops require the rhs vector to go in the right direction
** all ops already require that the lhs and rhs are of the same kind
(matrix vs vector) otherwise we'd have to do complex work
** multiplicative ops (introduced Kibeom's patch) are restricted to arrays, if only because for matrices they could be ambiguous.

* add a new test, vectorwiseop.cpp.

* these compound-assign operators used to be implemented with for loops:

   for(Index j=0; j<subVectors(); ++j)
     subVector(j).array() += other.derived().array();

This didn't seem to be needed; replaced by using expressions like operator+ and operator- did.
2011-11-18 11:10:27 -05:00
Kibeom Kim
de22ad117c bug #157 - Implemented *= /= * / operations for VectorwiseOp (e.g. mat.colwise()) 2011-11-17 17:57:45 -05:00
Jitse Niesen
08c0edae86 Move EIGEN_USING_MATRIX_TYPEDEFS macros to Eigen2Support. 2011-11-16 14:32:50 +00:00
Dennis Schridde
db36e4204f [Geometry/AlignedBox] New typedefs, like for Core/Matrix
Includes 1-4 and dynamic sized boxes for int, float and double type.
Also changes the tests to use these typedefs.
2011-11-09 22:12:28 +01:00
Gael Guennebaud
8fbbbe7521 fix some include paths 2011-11-16 09:27:38 +01:00
Gael Guennebaud
cb2f1944e2 add the new module headers 2011-11-12 15:22:35 +01:00
Gael Guennebaud
53fa851724 move sparse solvers from unsupported/ to main Eigen/ and remove the "not stable yet" warning 2011-11-12 14:11:27 +01:00
Gael Guennebaud
dcb66d6b40 fix ei_add_property 2011-11-12 10:54:16 +01:00
Gael Guennebaud
3e4a68cc60 optimize vectorized reductions by peeling the loop:
- x2 for squaredNorm() on double
 - peeling the loop with a peeling factor of 4 leads to even better perf
   for large vectors (e.g., >64) but it makes more difficult to keep good performance on smaller ones.
2011-11-12 09:19:48 +01:00
Gael Guennebaud
c110abb7d2 fix performance issue with SPMV 2011-11-11 06:04:31 +01:00
Gael Guennebaud
9d82a7e204 merge with hauke/eigen-cdash-improvements branch 2011-11-09 21:19:05 +01:00
Dennis Schridde
3a82aa1133 [Core/Matrix] Fix: Clear the right typedef macro 2011-11-09 12:25:55 +00:00
Gael Guennebaud
fb3aa7220f reimplement abs2 not to use std::norm which is incredibly slow. 2011-11-08 22:42:51 +01:00
Jitse Niesen
45a6bb34c3 Add simple example on how to compute Cholesky decomposition. 2011-11-07 17:14:06 +00:00
Marton Danoczy
f422668d39 Patches to support ARM NEON with Clang 3.0 and LLVM-GCC 2011-11-04 16:37:10 +01:00
Benoit Jacob
1b98b73472 Refactor force-inlining macros and use EIGEN_ALWAYS_INLINE to force inlining of the integer overflow helpers, whose non-inlining caused major performance problems, see the mailing list thread 'Significant perf regression probably due to bug #363 patches' 2011-11-06 16:27:41 -05:00
Benoit Jacob
aa3e420df5 Add test for Matrix(x, y) ctor static assert added in previous changeset 2011-11-06 00:44:04 -04:00
Benoit Jacob
ab3f138b23 In the Matrix constructor taking (rows, cols), statically assert that the types are integer.
The 2D vector ctor taking (x, y) is not concerned.
2011-11-05 23:56:48 -04:00
Gael Guennebaud
478de03bd8 fix a couple of warnings in the unit tests 2011-11-05 23:30:49 +01:00
Gael Guennebaud
cdd3e85060 Automatically produce a tgz archive of the documentation. 2011-11-05 21:59:36 +01:00
Gael Guennebaud
b4d1d4a2e0 completely remove EIGEN_BUILD_BLAS_LAPACK option 2011-11-05 13:26:53 +01:00
Gael Guennebaud
c5ddaf0c87 fix compilation 2011-11-05 10:54:05 +01:00
Gael Guennebaud
1de769d122 remove deprecated assert 2011-11-04 14:42:54 +01:00
Gael Guennebaud
05de3dddca use runtest.sh script iif /bin/bash does exist 2011-11-03 17:37:25 +01:00
Gael Guennebaud
94d87abbdb fix fftw cmake stuff 2011-11-03 15:33:42 +01:00
Jitse Niesen
a594ac3966 Allow for more iterations in SelfAdjointEigenSolver (bug #354).
Add an assert to guard against using eigenvalues that have not converged.
Add call to info() in tutorial example to cover non-convergence.
2011-11-02 14:18:20 +00:00
Gael Guennebaud
57207239f3 Mention that the axis in AngleAxis have to be normalized. 2011-11-01 09:40:51 +01:00
Jan Oberländer
fa7c08a831 bug #365 - Rename B0 in GeneralBlockPanelKernel.h to avoid name clash
with termios.h on POSIX systems.
2011-10-31 10:44:09 -04:00
Benoit Jacob
0cf2a05f3e bug #365 - Add test for non-usage of B0 2011-10-31 10:44:06 -04:00
Benoit Jacob
9df2f5c923 bug #369 - Quaternion alignment is broken
The problem was two-fold:
 * missing aligned operator new
 * Flags were mis-computed, the Aligned constant was misused
2011-10-31 09:23:41 -04:00
Benoit Jacob
0609dbeec6 fix more variable-set-but-not-used warnings on gcc 4.6 2011-10-31 00:51:36 -04:00
Benoit Jacob
6a1caf0351 Fix some unused-variable warnings with GCC 4.6 2011-10-30 23:55:20 -04:00
Adolfo Rodriguez Tsourouksdissian
4477843bdd bug #206 - part 4: Removes heap allocations from JacobiSVD and its preconditioners 2011-10-30 23:55:20 -04:00
Adolfo Rodriguez Tsourouksdissian
5e431779f3 bug #206 - part 3: Reimplement FullPivHouseholderQR<T>::matrixQ() using ReturnByValue 2011-03-08 19:04:31 +01:00
Adolfo Rodriguez Tsourouksdissian
7bf0e8cd82 bug #206 - part 2: For HouseholderSequence objects, added non-allocating versions of evalTo() and applyThisOnTheRight/Left that take additional working vector parameters. 2011-10-30 23:55:16 -04:00
Benoit Jacob
bca18a13ea The most important inline keyword ever? Without it, gcc failed to inline this function, which is called by all matrix constructors... 2011-10-25 20:45:26 -04:00
Gael Guennebaud
d7e70edfb3 remove the MSVC specific blas/lapack option 2011-10-24 13:40:01 +02:00
Gael Guennebaud
e44c19d1cc hopefully this workaround of cmake bug #9220 works for MSVC too 2011-10-24 13:36:49 +02:00
Gael Guennebaud
1ddf88060b update sparse*sparse product: the default is now a conservative algorithm preserving symbolic non zeros. The previous with auto pruning of the small value is avaible doing: (A*B).pruned() or (A*B).pruned(ref) or (A*B).pruned(ref,eps) 2011-10-24 11:44:53 +02:00
Gael Guennebaud
a997dacc67 mark deprecated sparse solvers as so. 2011-10-24 09:51:02 +02:00
Gael Guennebaud
39d4585bff add the possiibility to disable deprectated warnings (useful for deprecated unit tests!) 2011-10-24 09:40:37 +02:00
Gael Guennebaud
5d43b4049d factorize solving with guess 2011-10-24 09:33:24 +02:00
Gael Guennebaud
70df09b76d move DynamicSparseMatrix to SparseExtra 2011-10-24 09:31:33 +02:00
Gael Guennebaud
a2d414f568 move the blas.h header to blas/ and remove declaration of function returning a complex 2011-10-19 16:29:43 +02:00
Benoit Jacob
de69129f56 forgot inline keyword 2011-10-17 08:49:59 -04:00
Benoit Jacob
16b638c159 Throw std::bad_alloc even when exceptions are disabled, by doing new int[size_t(-1)].
Don't throw exceptions on aligned_malloc(0) (just because malloc's retval is null doesn't mean error, if size==0).
Remove EIGEN_NO_EXCEPTIONS option, use only compiler standard defines. Either exceptions are enabled or they aren't.
2011-10-17 08:44:44 -04:00
Benoit Jacob
dcbc985a28 bug #363 - add test for integer overflow in size computations 2011-10-16 16:12:19 -04:00
Benoit Jacob
739559b08a bug #363 - check for integer overflow in size=rows*cols computations 2011-10-16 16:12:19 -04:00
Benoit Jacob
0c6055c285 bug #363 - check for integer overflow in byte-size computations 2011-10-16 16:12:19 -04:00
Gael Guennebaud
c1170d2e93 update the decomposition catalogue 2011-10-14 21:21:38 +02:00
Gael Guennebaud
3fce43a704 add a basic ILU preconditioner 2011-10-11 20:41:43 +02:00
Gael Guennebaud
a5761d6dd7 fix sparse tri-solve for full matrices 2011-10-11 20:35:52 +02:00
Gael Guennebaud
15cb4f5b09 extend BiCGSTAB to arbitrary rhs 2011-10-11 19:53:18 +02:00
Gael Guennebaud
21d27c6f71 add proper bicgstab unit test 2011-10-11 19:38:36 +02:00
Gael Guennebaud
cd3c2451b6 add a unit test for permutation applied to sparse objects 2011-10-11 13:45:27 +02:00
Gael Guennebaud
3172749f32 refactor sparse solving unit tests 2011-10-11 11:32:26 +02:00
Gael Guennebaud
4f237f035c extend SimplicialCholesky for sparse rhs, and add determinant 2011-10-11 11:31:12 +02:00
Gael Guennebaud
5dc8458293 extend CG for multiple right hand sides 2011-10-11 11:29:50 +02:00
Gael Guennebaud
b94c00226f make it compatible with Diagonal<> 2011-10-11 11:28:13 +02:00
Gael Guennebaud
ae9c96a32d fix assignment to a set of sparse inner vectors 2011-10-10 16:16:37 +02:00
Gael Guennebaud
4e7f38ffc7 fix nesting 2011-10-09 22:19:01 +02:00
Gael Guennebaud
e97879857b DiagonalPrecond: fix potential segfault in case the diagonal contains explciit zeros 2011-10-09 22:17:37 +02:00
Gael Guennebaud
1beb8a6564 add a generic unit test for sparse SPD problems 2011-10-09 21:50:02 +02:00
Gael Guennebaud
2fc1b58cd2 split SimplicialCholesky into SimplicialLLt and SimplicialLDLt classes and add specific factor access functions 2011-10-09 21:45:55 +02:00
Hauke Heibel
e1dec359ba Configured unsupported/test/mpreal/*.* as CRLF files. 2011-10-04 11:57:49 +02:00
Hauke Heibel
b96d0bd240 Added a flag to build blas/lapack. 2011-10-04 11:23:55 +02:00
Gael Guennebaud
683ea3c93f fix superLU when the salver is called multiple times 2011-09-27 18:30:53 +02:00
Jitse Niesen
ac3ad9c1e7 Convert tabs to spaces. 2011-09-27 15:47:04 +01:00
Jitse Niesen
17c321617d Fix bug #286: Infinite loop in JacobiSVD with denormals 2011-09-27 14:25:02 +01:00
Bram de Jong
961a825b97 Add method which returns worst time (and make some methods const). 2011-09-26 14:39:23 +01:00
Gael Guennebaud
9bba0e7ba1 clean sparse LU tests 2011-09-24 17:15:37 +02:00
Gael Guennebaud
b2988375e8 fix a couple of issues in SuperLU support (memory and determinant) 2011-09-24 14:20:31 +02:00
Gael Guennebaud
6799fabba9 port umfpack support to new API 2011-09-24 14:19:39 +02:00
Gael Guennebaud
d8ae978b65 fix some compilation issues 2011-09-23 16:28:26 +02:00
Gael Guennebaud
823b2105b6 fix atan2 when tmp4==0 2011-09-22 17:34:25 +02:00
Gael Guennebaud
b0adbfbae7 BiCGSTAB does not like starting from 0... 2011-09-21 18:08:08 +02:00
Gael Guennebaud
c331c092d5 no comment 2011-09-21 14:20:41 +02:00
Gael Guennebaud
7301f4345c quick workaround of MSVC9' ICE in pset1 2011-09-21 14:18:41 +02:00
Gael Guennebaud
83563dee3c find macport' umfpack/cholmod 2011-09-21 10:28:09 +02:00
Gael Guennebaud
ebfed5a512 Enable incomplete BLAS/Lapack builds when no fortran compiler has been found.
Works here with gcc. Hopefully this will work for MSVC too.
2011-09-21 10:27:38 +02:00
Gael Guennebaud
1d796acb05 fix status after initialization 2011-09-20 18:45:50 +02:00
Gael Guennebaud
5d1836b182 accept both STL and Eigen's containers for reserve() 2011-09-20 02:04:03 +02:00
Jitse Niesen
e0a6ce50dd Typo in geometry tutorial. 2011-09-19 21:57:26 +01:00
Jitse Niesen
2092b45d0d Bug fix for matrix1 * matrix2 * scalar1 * scalar2.
See report on http://forum.kde.org/viewtopic.php?f=74&t=96947 .
2011-09-19 15:07:19 +01:00
Chen-Pang He
16b13596a6 mainly enhance MatrixLogarithm's performance for RealScalar != double 2011-09-17 21:00:55 +08:00
Gael Guennebaud
edf4c4b217 add support for macosx 2011-09-17 10:57:27 +02:00
Gael Guennebaud
9053729d68 add a bi conjugate gradient stabilized solver 2011-09-17 10:54:14 +02:00
Gael Guennebaud
f4122e9f94 add tan, acos, asin 2011-09-14 08:35:54 +02:00
Jitse Niesen
6b006772f1 Fix LDLT::solve() if matrix singular but solution exists (bug #241).
Clarify this in docs and add regression test.
2011-09-11 06:30:53 +01:00
Jitse Niesen
59b83c14fd Write page on template and typename keywords in C++.
After yet another question on the forum, I decided to write something on this
common issue. Now we just need to link to this and get people to read it.
Thanks to mattb on the forum for some links. Caveat: IANALL (I am not a
language lawyer).
2011-09-10 09:18:18 +01:00
Gael Guennebaud
3e7aaadb1d fix bench_gemm 2011-09-09 10:36:20 +02:00
Gael Guennebaud
d52d8e4a53 reactivate the sorting in the experimental sparse-sparse product 2011-09-08 13:43:32 +02:00
Gael Guennebaud
7706bafcfd add the possibility to reserve room for inner vector in SparseMatrix 2011-09-08 13:42:54 +02:00
Jitse Niesen
7898281b2b Put docs for unsupported modules in right place.
Doxygen was confused by the unsupported modules being partly in the doc/
directly, instead of completely in unsupported/doc/ . Thus, the link to
the unsupported modules on the server did not work (I think this manifested
itself after doxygen was upgraded on the server).
2011-09-07 04:19:12 +01:00
Jitse Niesen
b38d3b360e Define log2() on FreeBSD (fixes bug #343). 2011-09-06 06:52:04 +01:00
Gael Guennebaud
f1d98aad1b add atan2 support in AutoDiff and remove superfluous std:: specializations 2011-09-05 17:47:58 +02:00
Gael Guennebaud
063042bca3 Merged in trevorw/eigen (pull request PR-7) 2011-09-05 10:55:49 +02:00
Jitse Niesen
477d3e5726 Update docs of PlainObjectBase::Map(); fixes bug #335.
Also fix some typos.
2011-09-03 15:18:21 +01:00
Jitse Niesen
a2feb6f3c7 Add defensive assert to MatrixExponential, 2011-09-03 04:58:06 +01:00
Chen-Pang He
dd598ef8ce enhance efficacy via avoiding exception handling 2011-09-02 00:15:02 +08:00
Trevor Wennblom
6b31aa4bd1 resolve pkgconfig destination - #338 2011-08-30 19:15:16 -05:00
Jitse Niesen
7ee084f82f Leverage triangular square root in matrix log. 2011-08-25 07:42:32 +01:00
Jitse Niesen
c01ed935dd Split code for (quasi)triangular matrices from MatrixSquareRoot.
This way, (quasi)triangular matrices can avoid the costly Schur decomposition.
2011-08-25 07:42:21 +01:00
Chen-Pang He
8ddd1e390b fix: <ctime> is necessary for srand(time(NULL)) 2011-08-24 18:26:38 +08:00
Gael Guennebaud
8414be739b fix bug #330: Index to int conversion warning 2011-08-23 11:02:10 +02:00
Gael Guennebaud
b3f5fbbd9a oops EIGEN_DEFINE_STL_VECTOR_SPECIALIZATION now perfroms full specialization,
no need for the typename keywords
2011-08-22 10:48:04 +02:00
Gael Guennebaud
b85c89c313 fix bug #262: Compilation error of stdvector_overload test with GCC 4.6
Now our aligned allocator is automatically activatived only when the user
did not specified an allocator (or specified the default std::allocator).
2011-08-22 10:12:10 +02:00
Jitse Niesen
9bf4d709e4 Fix failures in redux test caused by underflow in .prod() test. 2011-08-21 00:51:15 +01:00
Jitse Niesen
9e667e28f5 Add coverage for long double to matrix_exponential test. 2011-08-21 00:20:29 +01:00
Chen-Pang He
6d7a32231d add compatibility with long double 2011-08-20 12:33:51 +08:00
Gael Guennebaud
ea4a1960f0 mv the mpreal copy in its own folder 2011-08-19 15:08:29 +02:00
Gael Guennebaud
79ad55a901 update to latest mpreal and fix a min/max issue in mprel.h 2011-08-19 15:03:45 +02:00
Gael Guennebaud
42e2578ef9 the min/max macros to detect unprotected min/max were undefined by some std header,
so let's declare them after and do the respective fixes ;)
2011-08-19 14:18:05 +02:00
Gael Guennebaud
5734ee6df4 add the possibility to specialize assign_impl and still call the default implementations.
(yes I know this change will be deprecated as soon as the evaluators will be in shape but I need this now)
2011-08-18 10:19:25 +02:00
Gael Guennebaud
ca7d3dca79 fix linking issue 2011-08-12 22:38:53 +02:00
Gael Guennebaud
f162f7c323 fix a numerical issue in the direct 3x3 eigenvector extraction 2011-08-08 10:46:26 +02:00
Thomas Capricelli
a660e6425c fix a bug where some rotations were not initialized
They actually were in the original minpack code, this is a bug introduced
by our migration.
Reported on #322 and
http://forum.kde.org/viewtopic.php?f=74&t=96197#p201158
2011-08-04 05:02:04 +02:00
Thomas Capricelli
5748d3c96f wa2 was computed twice because of a confustion between changesets
746c787a76
 and ee0e39284c
.
Reported on forum:
http://forum.kde.org/viewtopic.php?f=74&t=96197#p201158
2011-08-04 03:27:01 +02:00
Jitse Niesen
b12522f696 Remove unnecessary template keywords (breaks compilation under MSVC).
Thanks to Hauke for finding this.
2011-07-28 13:55:56 +01:00
Hauke Heibel
3431c052c6 Improved compilation errors for Transform initialization/assignment with different numeric types. 2011-07-28 09:35:17 +02:00
Gael Guennebaud
3a2cabc275 compilation fix with conjugate_gradient_solve_retval_with_guess 2011-07-26 14:43:20 +02:00
Gael Guennebaud
51f706b916 add the possibility to configure the preconditioner 2011-07-26 09:22:18 +02:00
Gael Guennebaud
66fa6f39a2 add a naive IdentityPreconditioner 2011-07-26 09:17:18 +02:00
Gael Guennebaud
80b1d1371d add a conjugate gradient solver 2011-07-26 09:04:10 +02:00
Gael Guennebaud
8fa7e92e77 fix sparse selfadjoint time dense such that the other triangular part is not used at all 2011-07-26 09:02:41 +02:00
Gael Guennebaud
97ac0fd192 fix eigen2 support min/max garbage 2011-07-22 11:37:41 +02:00
Gael Guennebaud
e8313364c1 simplify a bit the 2x2 direct eigenvalue solver 2011-07-22 11:21:43 +02:00
Gael Guennebaud
47a2bca89f integrate Hauke's 2x2 direct symmetric eigenvalues solver 2011-07-22 09:43:14 +02:00
Gael Guennebaud
26d7dad138 add a computeDirect method to SelfAdjointEigenSolver for fast eigen decomposition 2011-07-21 19:07:52 +02:00
Gael Guennebaud
22bff949c8 protect calls to min and max with parentheses to make Eigen compatible with default windows.h
(transplanted from 49b6e9143e
)
2011-07-21 11:19:36 +02:00
Gael Guennebaud
d4bd8bddb5 fix bug #320 (pretty gdb printer on mingw) 2011-07-20 11:15:42 +02:00
Hauke Heibel
705023fd85 Translation * RotationBase now returns an isometric transformation. 2011-07-19 11:13:40 +02:00
Gael Guennebaud
3fb65734ab fix triangular unit test: it only accepts small matrices 2011-07-19 10:45:42 +02:00
Gael Guennebaud
22cc2b727b fix trmv unit test 2011-07-19 10:44:44 +02:00
Gael Guennebaud
38a4e3053d fix LLT rank one update for "upper" hermitian matrices 2011-07-19 10:09:43 +02:00
Gael Guennebaud
0d02182ae8 add an "InvalidInput" enum, used by the SuperLU interface 2011-07-18 13:37:41 +02:00
Gael Guennebaud
a8f66fec65 add the possibility to configure the maximal matrix size in the unit tests 2011-07-12 14:41:00 +02:00
Gael Guennebaud
bdb545ce3b enable instalation of blas and lapack libs 2011-07-11 17:02:09 +02:00
Gael Guennebaud
5fdebc2fa5 fix bug #316 - SelfAdjointEigenSolver::compute does not handle matrices of size (1,1) correctly 2011-07-09 07:15:14 +02:00
Thomas Capricelli
08074843ac fix few warnings reported by clang 2011-07-07 22:20:04 +02:00
Gael Guennebaud
c52268c649 suppress polluting EMPTY macro defined by SuperLU 2011-07-07 16:42:51 +02:00
Gael Guennebaud
2489c81562 add new interface to SuperLU 2011-07-07 14:19:42 +02:00
Gael Guennebaud
c98cd5e564 fix constness of intersection methods (bug #309) 2011-06-27 13:15:01 +02:00
Jitse Niesen
0b308e79c4 Add DenseStorage specializations for dynamic size with MaxSize = 0 (bug #288).
This is necessary for instantiations like Matrix<float,Dynamic,Dynamic,0,0,0>.
2011-06-24 13:47:11 +01:00
Jitse Niesen
16db255333 Fix compilation of cholesky rank update test. 2011-06-24 13:41:23 +01:00
Thomas Capricelli
9b52fe0432 fix typo in doc for ParametrizedLine 2011-06-23 00:36:24 +02:00
Gael Guennebaud
3ecf7e8f6e add a KroneckerProduct module (unsupported) from Kolja Brix and Andreas Platen materials. 2011-06-22 14:39:11 +02:00
Gael Guennebaud
7aabce7c76 rm confusing sentence 2011-06-17 09:46:05 +02:00
Tim Holy
16a2d896bc Relatively straightforward changes to wording of documentation, focusing particularly on the sparse and (to a lesser extent) geometry pages. 2011-06-20 22:47:58 -05:00
Tim Holy
4a95badf74 A first tiny test commit: fix a spelling error in the documentation. 2011-06-19 14:39:19 -05:00
Gael Guennebaud
2f32e48517 New feature: add rank one update in Cholesky decomposition 2011-06-20 15:05:50 +02:00
Gael Guennebaud
a55c27a15f fix documentation of norm 2011-06-18 08:30:34 +02:00
Zach Ploskey
642d452921 Suggest placing Eigen directory in system include path. 2011-06-17 15:46:50 -07:00
Zach Ploskey
e3491beb48 Fixed a few typos and cleaned up some language. 2011-06-17 15:42:15 -07:00
Benoit Jacob
a871f3cdb8 adapt test to the change reverting normalize() to returning void 2011-06-15 10:00:43 -04:00
Benoit Jacob
aedccbf52f back out 842881cfb1 2011-06-15 09:59:10 -04:00
Benoit Jacob
d2673d89bd add test for normalize() and normalized() 2011-06-15 00:30:46 -04:00
Andy Somerville
842881cfb1 bug #298 - let normalize() return a reference to *this 2011-06-15 00:30:11 -04:00
Gael Guennebaud
40287d2fd9 remove the use of non standard long long 2011-06-14 10:56:47 +02:00
Gael Guennebaud
f82b3ea241 fix aligned_allocator::allocate interface 2011-06-14 08:50:25 +02:00
Thomas Capricelli
cf04a7c682 fix typo in constant name 2011-06-12 23:54:28 +02:00
Gael Guennebaud
6d3dee1b66 introduce a smart_copy internal function and fix sparse matrices with non POD scalar type 2011-06-09 19:04:06 +02:00
Jitse Niesen
8c8ab9ae10 Implement matrix logarithm + test + docs.
Currently, test matrix_function_1 fails due to bug #288.
2011-06-07 14:44:43 +01:00
Jitse Niesen
a6d42e28fe Decouple MatrixFunction and MatrixFunctionAtomic
in preparation for implementation of matrix log.
2011-06-07 14:40:27 +01:00
Jitse Niesen
86ca35ccff Fix and test MatrixSquareRoot for 1-by-1 matrices. 2011-06-07 14:32:16 +01:00
Gael Guennebaud
91fe1507d1 Sparse: more fixes regarding long int as index type 2011-06-07 11:28:16 +02:00
Gael Guennebaud
421ece38e1 Sparse: fix long int as index type in simplicial cholesky and other decompositions 2011-06-06 10:17:28 +02:00
Jitse Niesen
7a61a564ef Fix snippets for operator|| and && by adding pair of parens. 2011-06-03 11:17:08 +01:00
Gael Guennebaud
5bc4abc45e fix compilation with MinGW 2011-06-01 12:16:21 +02:00
Gael Guennebaud
562d3ea91d forgot to include this file in previous commit 2011-06-01 10:49:36 +02:00
Gael Guennebaud
35c1158ee3 add boolean || and && operators 2011-05-31 22:17:34 +02:00
Gael Guennebaud
b495203310 update URL 2011-05-31 19:07:15 +02:00
Gael Guennebaud
5830f90983 add read/write routines for sparse matrices in the Market format 2011-05-31 18:58:04 +02:00
Jitse Niesen
9d6fdbced7 Fix truncated instructions for printers.py
... as noted by kp0987 on forum
2011-05-30 16:15:11 +01:00
Gael Guennebaud
5b71d44e18 fix bug #278: geometry tutorial
(transplanted from 3cd1641dac
)
2011-05-28 22:12:15 +02:00
Gael Guennebaud
9464745385 do not directly call std::ceil 2011-05-28 16:46:38 +02:00
Gael Guennebaud
7b46d7ed0f finish to fix bug #270: we have to use EIGEN_ALIGN_STATICALLY and not EIGEN_DONT_ALIGN_STATICALLY... 2011-05-28 11:38:53 +02:00
Jitse Niesen
d23845c4cc Fix typo ('using namespace' instead of 'using'). 2011-05-26 09:52:36 +01:00
Gael Guennebaud
87ac09daa8 Simplify the use of custom scalar types, the rule is to never directly call a standard math function using std:: but rather put a using std::foo before and simply call foo:
using std::max;
max(a,b);
2011-05-25 08:41:45 +02:00
Gael Guennebaud
5541bcb769 bug #225: add a unit test for memory leak 2011-05-23 14:20:49 +02:00
Gael Guennebaud
117d17ee58 bug #271: fix copy/paste mistakes in doc
(transplanted from 145b9cad63101ee46924d446fa8b08ffb48c7f3a)
2011-05-23 13:39:26 +02:00
Gael Guennebaud
46bee5682f clean a bit previous patch (ctor vs static_cast and a few bits) 2011-05-23 13:34:04 +02:00
David H. Bailey
074b067624 fix implicit scalar conversions (needed to support fancy scalar types, see bug #276) 2011-05-23 11:20:13 +02:00
Gael Guennebaud
7209d6a126 fix gemv_static_vector_if on architectures that cannot aligned on the stack (e.g., ARM NEON) 2011-05-21 22:15:11 +02:00
Gael Guennebaud
96464f8563 clean several other assertion checking tests 2011-05-20 09:59:15 +02:00
Gael Guennebaud
501bc602ec fix vectorization_logic when EIGEN_GCC_AND_ARCH_DOESNT_WANT_STACK_ALIGNMENT 2011-05-19 21:52:40 +02:00
Gael Guennebaud
f2837aebc4 NEON: fix plset 2011-05-18 21:12:08 +02:00
Gael Guennebaud
8170ef0b2d add unit test for plset 2011-05-18 21:11:03 +02:00
Gael Guennebaud
7f2a88c91f NEON: disable unaligned assertion checking for non vectorized types 2011-05-18 14:11:40 +02:00
Gael Guennebaud
85c137ccd4 NEON: fix ploaddup 2011-05-18 08:15:47 +02:00
Gael Guennebaud
179d42bb2b fix bug #267: alloca is not aligned on arm 2011-05-17 21:30:12 +02:00
Gael Guennebaud
d4fd298fbb Autodiff: fix scalr - active_scalar 2011-05-14 22:38:41 +02:00
Jitse Niesen
9a06055870 Store light-weight objects in evaluators by value.
This resolves failure in unit test caused by dying temporaries.
2011-05-13 14:05:59 +01:00
Gael Guennebaud
a34a216e82 AutoDiff: add one missing operator- version 2011-05-12 23:40:19 +02:00
Gael Guennebaud
3de2f4b75a AutoDiff: fix most of bug #234 (missing operators, used old internal math function interface, etc) 2011-05-12 23:36:33 +02:00
Gael Guennebaud
ae3b6cc324 AutoDiff: fix unary operator- 2011-05-12 22:27:51 +02:00
Jitse Niesen
e22a523021 Remove Eigen::internal::sqrt(), see bug #264. 2011-05-12 16:52:56 +01:00
John Tytgat
0aa7425f15 fix bug #260: broken Qt support for Transform
(transplanted from 84c8b6d5c5
)
2011-05-11 22:31:36 +02:00
Jitse Niesen
0c463a21c4 Forgot to 'hg add' example file in last commit. 2011-05-10 09:59:58 +01:00
Jitse Niesen
d7e3c949be Implement and document MatrixBase::sqrt(). 2011-05-09 22:20:20 +01:00
Jitse Niesen
dac4bb640a Fix compilation error under GCC 4.5.
That version is stricter in forcing function prototype and definition
to match.
2011-05-09 13:57:06 +01:00
Jitse Niesen
837db08cbd Add test for sqrt() on complex Arrays.
From Gael's dashboard output of matrix_square_root test, I suspect the
test committed here may fail on old gcc.
2011-05-09 10:17:41 +01:00
Jitse Niesen
6e1573f66a Implement square root for real matrices via Schur. 2011-05-08 22:18:37 +01:00
Jitse Niesen
6b4e215710 Implement matrix square root for complex matrices.
I hope to implement the real case soon, but it's a bit more
complicated due to the 2-by-2 blocks in the real Schur decomposition.
2011-05-07 22:57:46 +01:00
Jitse Niesen
0896c6d97d Get rid of wrong "subscript above bounds" warning (bug #149). 2011-05-07 18:44:11 +01:00
Gael Guennebaud
4e7e5d09e1 s/n=n/EIGEN_UNUSED_VARIABLE(n) 2011-05-06 21:29:19 +02:00
Gael Guennebaud
fb76452cbc add missing .data() members to MatrixWrapper and ArrayWrapper 2011-05-06 21:15:05 +02:00
Gael Guennebaud
97b6d26f5b fix compilation on ARM NEON (missing AlignedOnScalar) 2011-05-06 09:03:48 +02:00
Thomas Capricelli
883219041f better fix for gcc 4.6.0 / ptrdiff_t, as suggested by Benoit 2011-05-05 18:48:18 +02:00
Thomas Capricelli
a18a1be42d Fix compilation with gcc-4.6.0, patch provided by Anton Gladky <gladky.anton@gmail.com>,
working on debian packaging.
2011-05-05 00:44:24 +02:00
Jitse Niesen
012419166e Bail out if preprocessor symbol Success is defined (bug #253). 2011-05-04 14:28:45 +01:00
Jitse Niesen
781e75cbd7 Document some more preprocessor symbols:
EIGEN_NO_MALLOC, EIGEN_RUNTIME_NO_MALLOC, eigen_assert.
2011-05-04 14:13:20 +01:00
Jitse Niesen
cc23b0a3d9 Remove unused enums in Constants.h . 2011-05-03 17:20:54 +01:00
Jitse Niesen
a96c849c20 Document enums in Constants.h (bug #248).
To get the links to work, I also had to document the Eigen namespace.
Unfortunately, this means that the word Eigen is linked whenever it appears
in the docs.
2011-05-03 17:08:14 +01:00
Gael Guennebaud
1947da39ab fix bug #258: asin/acos copy paste mistake 2011-05-02 13:26:44 +02:00
Hauke Heibel
10426b7647 Final working fix for the EOL extension.
MSVC debugger tools are now forced to CRLF.
2011-04-30 18:10:17 +02:00
Hauke Heibel
0358a8247c This should fix the eol extension. 2011-04-30 17:46:40 +02:00
Hauke Heibel
9e0c8549ce Fixed Unix script line ending conversions. 2011-04-30 17:35:51 +02:00
Jitse Niesen
06fb7cf470 Implement compound assignments using evaluator of SelfCwiseBinaryOp. 2011-04-28 16:57:35 +01:00
Jitse Niesen
3b60d2dbc4 Implement swap using evaluators. 2011-04-28 15:52:15 +01:00
Jitse Niesen
2d11041e24 Use copyCoeff/copyPacket in copy_using_evaluator. 2011-04-22 22:36:45 +01:00
Jitse Niesen
3457965bf5 Implement evaluator for Diagonal. 2011-04-22 22:36:45 +01:00
Jitse Niesen
f924722f3b Implement evaluators for Reverse. 2011-04-22 22:36:45 +01:00
Jitse Niesen
bb2d70d211 Implement evaluators for ArrayWrapper and MatrixWrapper. 2011-04-22 22:36:45 +01:00
Gael Guennebaud
6441e8727b fix aligned_stack_memory_handler for null pointers 2011-04-21 09:00:55 +02:00
Mathieu Gautier
392eb9fee8 Quaternion : add Flags on Quaternion's traits with the LvalueBit set if needed
Quaternion : change PacketAccess to IsAligned to mimic other traits
test : add a test and 4 failtest on Map<const Quaternion> based on Eigen::Map ones
2011-04-12 14:49:50 +02:00
Gael Guennebaud
f85db18c1c I doubt this change was intented to be committed
ss: Enter commit message.  Lines beginning with 'HG:' are removed.
2011-04-20 08:15:09 +02:00
Thomas Capricelli
50c00d14c8 be nice with the server : dont use -j3 2011-04-19 17:41:59 +02:00
Gael Guennebaud
e87f653924 fix bug #250: compilation error with gcc 4.6 (STL header files no longer include cstddef) 2011-04-19 16:34:25 +02:00
Gael Guennebaud
67d50f539b fix bug #242: vectorization was wrongly enabled on MSVC 2005 2011-04-19 15:25:00 +02:00
Eamon Nerbonne
e48bc0dfe3 WIN32 isn't defined ?? but _WIN32 is. 2011-04-19 14:37:04 +02:00
Jitse Niesen
0b40b36d10 Make MapBase(PointerType) constructor explicit (fixes bug #251) 2011-04-19 12:13:04 +01:00
Benoit Jacob
820545cddb fix unaligned-array-assert link 2011-04-18 06:35:54 -04:00
Jitse Niesen
c9b5531d6c Normalize eigenvectors returned by EigenSolver (fixes bug #249)
because the documentation says that we do this.
Also, add a unit test to cover this.
2011-04-15 17:39:59 +01:00
Jitse Niesen
e654405900 Implement unrolling in copy_using_evaluator() . 2011-04-13 11:49:48 +01:00
Jitse Niesen
7e86324898 Implement evaluator for PartialReduxExpr as a dumb wrapper. 2011-04-13 09:49:10 +01:00
Jitse Niesen
11164830f5 Implement evaluator for Replicate. 2011-04-12 22:54:31 +01:00
Jitse Niesen
12a30a982f Implement evaluator for Select. 2011-04-12 22:34:16 +01:00
Jitse Niesen
88b3116b99 Decouple AssignEvaluator.h from assign_traits from Assign.h 2011-04-12 13:35:08 +01:00
Gael Guennebaud
0c146bee1b enforce no inlining of the GEBP product kernel: this is a big
function that makes no sense to inline, though GCC was thinking
the opposite. This even slighlty improve the perf. And as a side
effect this workaround a weird GCC-4.4 linking bug (see
"Problem with g++-4.4 -O2 and Eigen3" in the ML)
2011-04-07 18:49:45 +02:00
Jitse Niesen
eae5a6bb09 Decouple Cwise*Op evaluators from expression objects 2011-04-05 18:30:51 +01:00
Jitse Niesen
11ea81858a Implement evaluator for CwiseUnaryView 2011-04-05 18:20:43 +01:00
Jitse Niesen
cca7b146a2 Implement evaluator for Map 2011-04-05 18:15:59 +01:00
Gael Guennebaud
a6b5314c20 Performance tunning for TRMM products 2011-04-05 11:20:50 +02:00
Jitse Niesen
ae06b8af5c Make evaluators for Matrix and Array inherit from common base class.
This gets rid of some code duplication.
2011-04-04 15:35:14 +01:00
Jitse Niesen
afdd26f229 Do some of the actual work in evaluator for Block.
Also, add simple accessor methods to Block expression class.
2011-04-04 13:44:50 +01:00
Gael Guennebaud
0d58c36ffd std::min/max are not implemented and they cannot be implemented easily 2011-04-04 16:26:43 +02:00
Jitse Niesen
70d5837e00 Correct typo in QuickReference doc, plus typographical improvements. 2011-04-01 16:58:51 +01:00
Gael Guennebaud
77a1373c3a fix trmm unit test 2011-03-31 15:32:21 +02:00
Jitse Niesen
d90a8ee8bd Evaluators: add Block evaluator as dumb wrapper, add slice vectorization. 2011-03-31 13:50:52 +01:00
Gael Guennebaud
b471161f28 fix typo and remove unused declaration. 2011-03-31 10:02:02 +02:00
Adam Szalkowski
969e92261d fix bug #239: the essential part was left uninitialized in some cases 2011-03-31 09:54:52 +02:00
Jitse Niesen
10dae8dd4d Add directory containing split_test_helper.h to include path. 2011-03-29 14:17:49 +01:00
Jitse Niesen
8175fe43e0 Evaluators: Make inner vectorization more similar to default traversal. 2011-03-28 21:29:47 +01:00
Gael Guennebaud
00991b5b64 extend trmm/trmv unit test to thoroughly check all configurations 2011-03-28 17:45:16 +02:00
Gael Guennebaud
4f1419e9c3 add the possibility to specify a list of sub-test suffixes in a compact way 2011-03-28 17:43:59 +02:00
Gael Guennebaud
6feb1d3c0b fix trmv for Strictly* triangular matrices and trapezoidal matrices 2011-03-28 17:42:26 +02:00
Gael Guennebaud
568478ffe5 fix trmm for some unusual trapezoidal cases (a dense set of columns or rows is zero) 2011-03-28 17:41:46 +02:00
Gael Guennebaud
f4ac7d2b43 automatically generate the CALL_SUBTEST_* macros 2011-03-28 17:39:05 +02:00
Jitse Niesen
b175bc464f Evaluators: Implement linear traversal, better testing. 2011-03-27 22:08:48 +01:00
Jitse Niesen
1b17a674dd Evaluators: Implement inner vectorization.
The implementation is minimal (I only wrote the functions called by
the unit test) and ugly (lots of copy and pasting).
2011-03-27 13:49:15 +01:00
Jitse Niesen
5c204d1ff7 Evaluators: Implement LinearVectorizedTraversal, packet ops in evaluators. 2011-03-25 16:30:41 +00:00
Gael Guennebaud
e6fa4a267a improve computation of the sub panel width 2011-03-24 23:42:25 +01:00
Gael Guennebaud
931814d7c0 improve performance of trsm 2011-03-24 23:19:53 +01:00
Jitse Niesen
c6ad2deead Bug fix in linspace_op::packetOp(row,col). Fixes bug #232.
Also, add regression test.
2011-03-24 10:42:11 +00:00
Gael Guennebaud
42bc1f77be impl basic product evaluator on top of previous one 2011-03-24 09:33:36 +01:00
Gael Guennebaud
abc8c0821c makes evaluator test use VERIFY_IS_APPROX 2011-03-23 17:23:56 +01:00
Gael Guennebaud
4ada45bc76 BTL: add eigen2 backend 2011-03-23 16:59:12 +01:00
Gael Guennebaud
7d24cf283a do not confuse Eigen3 or beta versions of Eigen3 with Eigen2 2011-03-23 16:58:45 +01:00
Gael Guennebaud
7bb4f6ae2f BTL: do not enable GOTO1 if GOTO2 was found 2011-03-23 16:28:43 +01:00
Gael Guennebaud
3ef0da6efb fix tridiagonalization action 2011-03-23 16:28:09 +01:00
Gael Guennebaud
816541d82c add a stupid Product<A,B> expression produced by prod(a,b), and implement a first version of its evaluator 2011-03-23 16:12:21 +01:00
Gael Guennebaud
cfd5c2d74e import evaluator works 2011-03-23 11:54:00 +01:00
Gael Guennebaud
611fc17894 add support for ublas 2011-03-23 11:39:35 +01:00
Gael Guennebaud
ec32d2c807 BTL: by default use current Eigen headers, and disable the novec version 2011-03-23 11:08:10 +01:00
Gael Guennebaud
b3e43246bc BTL: add a Eigen-blas backend 2011-03-23 11:00:31 +01:00
Gael Guennebaud
f9da1ccc3b BTL: clean the BLAS implementation 2011-03-23 10:35:54 +01:00
Gael Guennebaud
e35b1ef3f3 BTL: rm stupid backends 2011-03-23 10:07:24 +01:00
Gael Guennebaud
fe595e91ae update plot settings 2011-03-23 10:03:01 +01:00
Gael Guennebaud
9cca79f5ca update aat action to do a syrk operation, and remove (comment) ata action 2011-03-23 10:02:00 +01:00
Gael Guennebaud
da3f3586e0 BTl: GMM++ LU is not a full pivoting LU 2011-03-22 15:39:23 +01:00
Gael Guennebaud
22c7609d72 extend sparse product unit tests 2011-03-22 11:58:22 +01:00
Gael Guennebaud
5fda8cdfb3 fix 228 (ei_aligned_stack_delete does not exist anymore) 2011-03-21 21:59:42 +01:00
Benoit Jacob
eb9c6b6cfd merge 2011-03-21 06:46:27 -04:00
Benoit Jacob
bb8a25e94b fix typos 2011-03-21 06:45:57 -04:00
Gael Guennebaud
535a61ede8 port sparse LLT/LDLT to new stack allocation API 2011-03-20 17:10:43 +01:00
Benoit Jacob
eba023d082 make compile_snippet use Eigen/Dense 2011-03-20 11:48:53 -04:00
Gael Guennebaud
b8ecda5c66 clean a bit the stack allocation mechanism 2011-03-19 10:27:47 +01:00
Gael Guennebaud
bbb4b35dfc test the new stack allocation mechanism 2011-03-19 08:51:38 +01:00
Gael Guennebaud
290205dfc0 fix memory leak when a custom scalar throw an exception 2011-03-19 01:06:50 +01:00
Benoit Jacob
5991d247f9 bump 2011-03-18 05:27:58 -04:00
Gael Guennebaud
37c5341d64 fix compilation for old but not so old versions of glew 2011-03-18 10:26:21 +01:00
Gael Guennebaud
2359486129 disable testing of aligned members when aligned static allocation is not enabled (e.g., for gcc 3.4) 2011-03-15 09:53:23 +01:00
Gael Guennebaud
dd2e4be741 fix array_for_matrix unit test 2011-03-15 09:42:22 +01:00
Benoit Jacob
c5ef8f9027 Added tag 3.0-rc1 for changeset 4931a719f4 2011-03-14 14:10:12 -04:00
Benoit Jacob
4931a719f4 bump 2011-03-14 14:10:05 -04:00
Jitse Niesen
27f34269d5 Document EIGEN_DEFAULT_DENSE_INDEX_TYPE.
Also, expand description of EIGEN_DONT_ALIGN.
2011-03-11 11:15:44 +00:00
Jitse Niesen
e7d2376688 Change int to Index in equalsIdentity().
This fixes compilation errors in nullary test on 64-bits machines.
2011-03-11 11:06:13 +00:00
Benoit Jacob
dc36efbb8f fix bug #219: Map Flags AlignedBit was miscomputed, didn't account for EIGEN_ALIGN 2011-03-10 10:17:17 -05:00
Benoit Jacob
9a47fb289b add test for EIGEN_DONT_ALIGN and EIGEN_DONT_ALIGN_STATICALLY, cf recent bugs (214 etc) and changeset 56818d907e 2011-03-10 09:44:59 -05:00
Jitse Niesen
151e3294cf Fix equalsIdentity() for rectangular matrices. 2011-03-10 13:49:06 +00:00
Oliver Ruepp
5d1263e7c5 bug #37: fix resizing when the destination sparse matrix is row major 2011-03-08 16:37:59 +01:00
Gael Guennebaud
c6c6c34909 repeat nullary tests, and fix some tests 2011-03-07 16:41:59 +01:00
Jitse Niesen
931edea57d Tweak geo_quaternion test to squash intermittent failures. 2011-03-07 11:42:55 +00:00
Benoit Jacob
bfcad536e8 * bug #206: correctly forward computationOptions and work towards avoiding mallocs after preallocation, with unit test.
* added EIGEN_RUNTIME_NO_MALLOC and new set_is_malloc_allowed() function to implement that test
2011-03-06 20:59:25 -05:00
Benoit Jacob
b464fc19bc try to fix a ICC 11.1 compiler error (bug #217) 2011-03-06 19:27:31 -05:00
Benoit Jacob
c541d0a62e disable ICC 12 warning 279 - controlling expression is constant 2011-03-06 19:06:44 -05:00
Benoit Jacob
b43d92a5a2 The Eigen2 intrusive std::vector hack really can't be supported in eigen3 (bug #215) 2011-03-04 10:24:41 -05:00
Benoit Jacob
56818d907e Make EIGEN_ALIGN16 always align to fix crashes with EIGEN_DONT_ALIGN_STATICALLY. New macro EIGEN_USER_ALIGN16 had the old behavior i.e. honors user preference. 2011-03-04 09:57:49 -05:00
Sameer Sheorey
e9868f438b Changed debug/gdb/printers.py to correctly display variable sized matrices.
There is no python error now.
2011-03-02 10:47:54 -06:00
Gael Guennebaud
4f0909b5f0 fix bug #212 (installation of Eigen2Support/Geometry) 2011-03-04 14:16:58 +01:00
Jitse Niesen
6cac61ca3e Copy fix of unit test when GSL is enabled to eigen2 test suite. 2011-03-04 11:04:07 +00:00
Jitse Niesen
1180ede36d Escape hash character in docs as required by doxygen. 2011-03-03 15:19:11 +00:00
Jitse Niesen
99fa279ed1 Use copy_bool() workaround in Eigen2 test suite.
See bug #89 and changeset 59596efdf7
.
2011-03-03 14:17:23 +00:00
Jitse Niesen
dbab12d6b0 Fix bug #205: eigen2_adjoint_5 test fails. 2011-03-02 22:00:48 +00:00
Gael Guennebaud
dc727d86f1 extend unit tests of Transform * MatrixBase and Transform * Homogeneous 2011-03-02 19:34:39 +01:00
Gael Guennebaud
5cec29162b fix compilation in the case of 1D Transform 2011-03-02 19:29:55 +01:00
Gael Guennebaud
703c8a0cc6 fix compilation when mixing CompactAffine with Homogeneous objects 2011-03-02 19:27:13 +01:00
Gael Guennebaud
d30f0c0953 fix transform * matrix products: in particular it now truely considers the rhs as a set of (homogeneous) points and do not neglect the homogeneous coordinates in the case of affine transform 2011-03-02 19:26:38 +01:00
Gael Guennebaud
adacacb285 fix bug #204: limit integer values to numbers which are representable using float 2011-03-02 14:24:26 +01:00
Gael Guennebaud
c8e1b679fa re-enable fast pset1-pstore by introducing a new higher level pstore1 function 2011-03-02 10:55:44 +01:00
Gael Guennebaud
951e238430 now fixing "unsupported" "legacy" code... 2011-03-01 16:45:46 +01:00
Benoit Jacob
9c5c8d8916 Added tag 3.0-beta4 for changeset 77fc6a9914 2011-02-28 00:55:59 -05:00
Benoit Jacob
77fc6a9914 bump 2011-02-28 00:55:52 -05:00
Benoit Jacob
eef03525b8 fix bug #203: revert to using _mm_set1_p[sd] 2011-02-28 00:04:05 -05:00
Benoit Jacob
31621ff0ef relax condition in matrix_exponential test for clang 2011-02-27 23:25:14 -05:00
Benoit Jacob
0b44893b4e fix umeyama test 2011-02-27 23:20:45 -05:00
Benoit Jacob
8cad73072e fix stable_norm test: the |small| value was 0 on clang with complex<float>. 2011-02-27 22:35:49 -05:00
Benoit Jacob
9be2712bf7 remove now-useless comments 2011-02-27 22:35:17 -05:00
Benoit Jacob
0612768c1c fix bug #201: Clang too has intrinsics bugs preventing us to use custom unaligned loads 2011-02-27 21:59:07 -05:00
Benoit Jacob
32025a2510 disable BVH test on Clang++. Looks like there's a good reason why BVH is unsupported. It seems to have a very weird usage pattern, relying on an externally defined bounding_box function in a naive way. 2011-02-27 21:37:34 -05:00
Benoit Jacob
771e64200f fix compilation of unit tests with clang 2011-02-27 20:33:58 -05:00
Benoit Jacob
4846c76d9d shut up a stupid clang 2.8 warning 2011-02-27 20:18:03 -05:00
Benoit Jacob
afc9efca15 fix compilation with clang 2.8 2011-02-27 20:17:47 -05:00
Benoit Jacob
ea7d872181 documentation fixes 2011-02-27 17:43:10 -05:00
Benoit Jacob
b6299c974f add option to build in 32bit mode 2011-02-27 17:27:23 -05:00
Benoit Jacob
b3544ce2ae bug #195 - fix this once and for all: just never use _mm_load_sd on gcc/i386, it generates redundant x87 ops 2011-02-27 17:26:59 -05:00
Jitse Niesen
a8f5ef9388 Document (non)sorting of eigenvalues.
Also, update docs for (Generalized)SelfAdjointEigenSolver to reflect that these
two classes were split apart.
2011-02-27 14:06:55 +00:00
Jitse Niesen
58abf0eb98 Use absolute error to test sum in which cancellation may occur. 2011-02-25 08:56:37 +00:00
Gael Guennebaud
ef73265987 to ease debugging let's catch invalid template options in Transform 2011-02-25 09:03:24 +01:00
Gael Guennebaud
4fbd78d993 fix compilation with gcc 3.4 2011-02-25 09:02:15 +01:00
Benoit Jacob
5dfae4524b fix bug #195: fast unaligned load for integer using _mm_load_sd failed when the value interpreted as a NaN 2011-02-24 10:31:57 -05:00
Hauke Heibel
2064c59878 Improved docs of PlainObjectBase::conservativeResize methods. 2011-02-24 15:48:41 +01:00
Gael Guennebaud
bb9a465c5a fix AltiVec ploaddup 2011-02-24 00:23:50 +03:00
Gael Guennebaud
28d17c5390 bounds the range of random integers for AltiVec 2011-02-24 00:22:53 +03:00
Gael Guennebaud
4bfe38eda2 extend testing of ploaddup 2011-02-24 00:22:10 +03:00
Gael Guennebaud
23aae0d63e fix pset1 for complex 2011-02-23 21:24:47 +03:00
Gael Guennebaud
0dfea7fce4 improve packetmath unit test 2011-02-23 21:24:26 +03:00
Gael Guennebaud
c121e6f390 implement ploaddup for complex and SSE/NEON even though they are not used in practice 2011-02-23 16:31:42 +01:00
Gael Guennebaud
955c099eb5 implement ploaddup for altivec and add respective unit test 2011-02-23 18:20:55 +03:00
Gael Guennebaud
a00aaf7f7e fix overflow in packetmath unit test 2011-02-23 17:57:18 +03:00
Gael Guennebaud
6e01780541 fix a couple of issues with pcplxflip 2011-02-23 17:51:40 +03:00
Gael Guennebaud
939f0327b6 mention reverse and replicate in the quick ref 2011-02-23 15:31:16 +01:00
Gael Guennebaud
78e1a62c54 implement pcplxflip for altivec 2011-02-23 14:20:58 +01:00
Gael Guennebaud
59eeb67187 add unit test for pcplxflip 2011-02-23 14:20:33 +01:00
Gael Guennebaud
b8374aec00 implement workarounds for MSVC IDEs and the Experimental target 2011-02-23 11:53:20 +01:00
Gael Guennebaud
7dc18b20bb same for neon 2011-02-23 09:41:55 +01:00
Gael Guennebaud
32e7dae776 Altivec: fix infinite loop (ei_ -> internal:: change) 2011-02-23 09:41:02 +01:00
Gael Guennebaud
9ab503903e suppress unused warning 2011-02-23 09:32:55 +01:00
Gael Guennebaud
14b164b00e do not try to use Eigen's blas/lapack if they cannot be compiled 2011-02-23 09:25:32 +01:00
Gael Guennebaud
c78b5fd9aa fix no newline warning 2011-02-23 09:23:11 +01:00
Gael Guennebaud
2fb5567e08 add missing AlignedOnScalar 2011-02-22 21:25:47 +01:00
Benoit Jacob
3df134dec2 fix icc warning #68 2011-02-22 10:11:03 -05:00
Benoit Jacob
c58a2ff03a add EIGEN_PERMANENTLY_DISABLE_STUPID_WARNINGS non-default option. Use it in our own CMakeLists. also add a include-guard-like mechanism to prevent doing unmatched #pragma warning push/pop. 2011-02-22 10:05:41 -05:00
Benoit Jacob
9e1127619c merge 2011-02-22 09:33:01 -05:00
Benoit Jacob
720767ae40 ICC 12 / linux only defined __INTEL_COMPILER, not __intel_compiler 2011-02-22 09:32:39 -05:00
Benoit Jacob
d8e97aee89 shut up stupid ICC warnings 2011-02-22 09:31:22 -05:00
Benoit Jacob
625814464e fix legitimate ICC 12 warning 2011-02-22 09:30:54 -05:00
Gael Guennebaud
39b27fb656 altivec compilation fix 2011-02-22 15:26:28 +01:00
Benoit Jacob
25579df2d4 'fix' a couple of clang -Wconstant-logical-operand warnings (still not convinced about the pertinence of that warning) 2011-02-22 08:54:55 -05:00
Benoit Jacob
3884308da7 __attribute__((flatten)) seems to be recognized by neither clang nor icc despite these compilers defining __GNUC__. 2011-02-22 08:40:37 -05:00
Gael Guennebaud
68631e28d4 also test non_projective_only with row major transformations 2011-02-22 14:26:32 +01:00
Benoit Jacob
39d3bc2394 fix bug #190: directly pass Transform Options to Matrix, allowing to use RowMajor. Fix issues in Transform with non-default Options. 2011-02-22 08:14:38 -05:00
Gael Guennebaud
659c97ee49 gcc 4.4 also defines float32_t as a special type 2011-02-22 10:04:09 +01:00
Gael Guennebaud
769eeac35e disable output compression since this feature seems to be broken 2011-02-21 21:19:38 +01:00
Gael Guennebaud
51da67f211 more compilation fixes for altivec 2011-02-21 20:36:20 +01:00
Gael Guennebaud
05545d0197 fix compilation 2011-02-21 17:47:31 +01:00
Gael Guennebaud
8bee573a78 workaround ICC aggressive optimization 2011-02-21 16:17:58 +01:00
Gael Guennebaud
fb1a29fed5 fix ICE and warning with gcc 4.2.4 2011-02-21 16:11:18 +01:00
Gael Guennebaud
e129e985c3 link to blas/lapack only when needed, and use the static versions to hopefully workaround weird linking issues to gfortranbegin (see jitse dashboard) 2011-02-21 15:48:37 +01:00
Gael Guennebaud
2d5ea82807 fix bug #176 (workaround a too aggressive optimization made by ICC) 2011-02-21 11:00:07 +01:00
Hauke Heibel
50a3cd678a Improved site and buildname generation. 2011-02-20 11:54:07 +01:00
Gael Guennebaud
3c00e3da03 enable some tests that have been commented out 2011-02-18 18:08:58 +01:00
Gael Guennebaud
434817164e fix umfpack with complexes 2011-02-18 18:07:59 +01:00
Gael Guennebaud
2c1ac23c62 remove unused code 2011-02-18 17:54:48 +01:00
Gael Guennebaud
a0e5b00280 forgot that one, again 2011-02-18 17:50:36 +01:00
Gael Guennebaud
6456b74a89 merge 2011-02-18 17:40:31 +01:00
Gael Guennebaud
86ca05b324 remove largeEps in adjoint unit test and use a more accurate test_isApproxWithRef test. 2011-02-18 17:39:04 +01:00
Gael Guennebaud
8f8c67b8bd fix bug #186 (in 32 bits mode, gcc 4.3 messed up with pfirst for complex<float>) 2011-02-18 15:47:17 +01:00
Benoit Jacob
aa966ca319 fix bug #187: stable norm test was quite broken 2011-02-18 09:46:49 -05:00
Gael Guennebaud
f7cd63b964 fix bug #189 (issue with fortran concentions to return COMPLEX values) 2011-02-18 15:11:31 +01:00
Gael Guennebaud
69cecc45e5 extend mapstride unit test to test unaligned configurations 2011-02-18 14:41:40 +01:00
Gael Guennebaud
abce49ea21 fix a segfault in "slice vectorization" when the destination might not be aligned on a scalar (complex<double>) 2011-02-18 14:20:36 +01:00
Gael Guennebaud
d271ad38ce back to brute force linking to sparse libraries (fix cmake when these libs are not found) 2011-02-18 11:35:45 +01:00
Gael Guennebaud
3e2314dd67 forgot to include this file in previous commit (needed for lapack) 2011-02-18 11:32:39 +01:00
Gael Guennebaud
444c1bc55b now cholmod, umfpack, and superlu uses our own BLAS and LAPACK libs 2011-02-18 11:26:31 +01:00
Gael Guennebaud
390724b4b6 add lapack interface to real symmetric eigenvalue dec and enable building of the lapack shared library 2011-02-18 11:25:04 +01:00
Gael Guennebaud
d8ca948148 it is now up to user of these Find* module to find and link to BLAS and/or LAPACK 2011-02-18 11:23:27 +01:00
Gael Guennebaud
3345ea0ddd clean a bit SuperLU declarations 2011-02-18 10:23:32 +01:00
Gael Guennebaud
9195a224f3 fix division by zero if the matrix is exactly zero 2011-02-17 19:39:57 +01:00
Gael Guennebaud
b8ef48c46d for consistency forward declare tan, asin, acos functors 2011-02-17 18:23:04 +01:00
Gael Guennebaud
a53a7d6e6a use C linkage for umfpack (might fix some linking issues) 2011-02-17 18:19:28 +01:00
Gael Guennebaud
eda59ffc1b mention std::ptr_fun in the quickref guide 2011-02-17 18:07:21 +01:00
Gael Guennebaud
6f86c12339 typo 2011-02-17 17:48:16 +01:00
Gael Guennebaud
aea630a98a factorize implementation of standard real unary math functions, and add acos, asin 2011-02-17 17:37:11 +01:00
Gael Guennebaud
2ba55e90db make check no test everything - also rm the EigenTesting cmake sub-project 2011-02-17 16:58:18 +01:00
Benoit Jacob
d0b8ce8f2a fix unused var warning 2011-02-17 09:41:17 -05:00
Gael Guennebaud
1c4e85ac7e forgot to include this file in one pretty old commit (missing EXCLUDE_FROM_ALL) 2011-02-17 15:33:35 +01:00
Jitse Niesen
78fa34e8ff Add blas tests for buildtests target. 2011-02-17 13:53:20 +00:00
Benoit Jacob
8fb27fad36 remove #include <iostream> at the wrong place 2011-02-17 07:47:05 -05:00
Jitse Niesen
be224d93f4 Include necessary header files when working around bug #89.
Fixes bug #188.
2011-02-17 11:51:48 +00:00
Benoit Jacob
11402edfd3 with old gcc (bug #89), only include iostream in debug mode 2011-02-16 12:01:47 -05:00
Gael Guennebaud
fe8a710a21 properly report OpenGL as a disabled backend 2011-02-16 18:01:06 +01:00
Gael Guennebaud
03d86ea736 fix intallation of unsupported modules 2011-02-16 17:59:35 +01:00
Benoit Jacob
13a5582835 undo debugging change 2011-02-16 09:18:48 -05:00
Benoit Jacob
59596efdf7 Fix bug #89: on GCC <= 4.3, use a custom assert implementation to work around a compiler bug 2011-02-16 08:50:19 -05:00
Jitse Niesen
6db8fa7d04 Replace unset() by set() with no value specified; this does the same.
unset() was introduced in CMake 2.6.3 but we require only 2.6.2.
2011-02-16 10:16:47 +00:00
Gael Guennebaud
2f15f74218 CTEST_CUSTOM_* parameter have to be put in a CTestCustum.cmake file which itself has to be in the build directory 2011-02-15 12:39:45 +01:00
Gael Guennebaud
578d6f7ced now ctest does compile the test even though they are not in the "all" target 2011-02-15 11:40:43 +01:00
Gael Guennebaud
a1d7e9051e fix bug #184 (warning) 2011-02-14 15:41:00 +01:00
Gael Guennebaud
8e0a42350d fix stupid warning (bug #185) 2011-02-14 15:33:26 +01:00
Hauke Heibel
ac465a0891 Improve the Transform interface in order to prevent T.rotation() = R from compiling. 2011-02-14 12:00:47 +01:00
Jitse Niesen
211e1f8044 Improve documentation of plugins. 2011-02-13 22:50:57 +00:00
Benoit Jacob
d09b94e2ad Added tag 3.0-beta3 for changeset 58986ac832 2011-02-12 18:57:10 -05:00
Benoit Jacob
58986ac832 bump 2011-02-12 18:57:04 -05:00
Jitse Niesen
8bca23bbec Mention comma initializer can be used to concatenate vectors
(inspired by a question on IRC)
2011-02-12 23:17:31 +00:00
Hauke Heibel
1a6597b8e4 MSVC does not like using uninitialized SSE variables, so we have to pass all zeros. 2011-02-12 21:29:16 +01:00
Hauke Heibel
509ca63543 Merge 2011-02-12 18:50:53 +01:00
Hauke Heibel
beb03032b7 Disabled warning regarding the use of uninitialized variables on MSVC. 2011-02-12 18:48:57 +01:00
Jitse Niesen
9ac68e40a0 Write topic page for storage orders. 2011-02-12 17:43:29 +00:00
Hauke Heibel
7015aa00a9 Added configuration file for the 'eol' extension. 2011-02-12 18:38:56 +01:00
Gael Guennebaud
9d2bf35a05 implement optimized ploadu for MSVC10: this also fix bad code generation in gebp_kernel :) 2011-02-12 16:40:09 +01:00
Gael Guennebaud
ec7409b16e since gebp_kernel handled the scaling by alpha it used too many packets, this patch fix that. 2011-02-12 14:17:52 +01:00
Benoit Jacob
f7e4602a40 doc fixes 2011-02-11 09:55:54 -05:00
Hauke Heibel
bf79a3199c Reduced error traces when mixing matrices with different scalar types. 2011-02-11 09:41:48 +01:00
Gael Guennebaud
fe70113fab fix Transform documention regarding Mode 2011-02-10 18:58:37 +01:00
Benoit Jacob
f3b81302cd fix typo 2011-02-10 11:06:01 -05:00
Benoit Jacob
57b22204db document the eigen2 support stages 2011-02-10 10:55:22 -05:00
Benoit Jacob
6a5a13e394 The pfirst hack is needed also on msvc 2010 as it gets completely nuts, even though it doesnt segfault as msvc 2008 did 2011-02-09 15:13:23 -05:00
Benoit Jacob
63626bb966 remove debug #error 2011-02-09 14:37:52 -05:00
Benoit Jacob
85f9fab003 back out changeset efdf2e4056
. It turns out that the SSE3 header is always included, even without any SSE enabled, so it was making us wrongly use SSE3 paths. Backing this out fixes msvc related crashes, at least bug #165.
2011-02-09 14:01:26 -05:00
Gael Guennebaud
d6c4ca4845 fix redundancy 2011-02-09 13:44:05 +01:00
Gael Guennebaud
c0d5131435 workaround gcc 4.2.1 ICE (fix bug #145) 2011-02-09 13:04:35 +01:00
Gael Guennebaud
40526e24b4 fix memory leak (when conservatively resizing vectors of dynamically allocated scalar types such as bugnums) 2011-02-07 19:52:16 +01:00
Benoit Jacob
ba9f6a2c3b now random<integer types> spans over 0..RAND_MAX, or -RAND_MAX/2..RAND_MAX/2 for signed types, or the most significant bits for smaller integer types. 2011-02-07 10:55:41 -05:00
Benoit Jacob
3386a946f8 fix unit tests for integer types in preparation for next changeset making random<int> span over a much bigger range 2011-02-07 10:54:50 -05:00
Benoit Jacob
68a2e04a96 fix fuzzy compares for integer types, using a selector 2011-02-07 10:53:17 -05:00
Gael Guennebaud
c5c8efa575 workaround gcc 4.2 and 4.3 compilation issue with NEON 2011-02-07 16:41:21 +01:00
Benoit Jacob
9105e62d0a introduce EIGEN_MAKING_DOCS to tell whether we're compiling the docs examples 2011-02-06 12:51:42 -05:00
Benoit Jacob
02ee26a3a5 fix build of class Block examples 2011-02-06 12:43:01 -05:00
Benoit Jacob
182ed9ba6c merge 2011-02-06 11:57:31 -05:00
Benoit Jacob
bc6625ab87 fix const correctness in Diagonal::coeffRef (fix found by failtests) 2011-02-06 11:57:04 -05:00
Benoit Jacob
dab4e583cb fix EIGEN_STATIC_ASSERT_LVALUE (fix found by failtests) 2011-02-06 11:56:33 -05:00
Benoit Jacob
80500b693c add more failtests 2011-02-06 11:55:51 -05:00
Hauke Heibel
d975b82105 Removed internal::as_argument. This fixes the alignment issues of bug #165. 2011-02-06 17:33:04 +01:00
Hauke Heibel
7ea6ac79a3 Exposed failtetst publicly. 2011-02-06 13:43:08 +01:00
Gael Guennebaud
ea99880760 fix under- and overflow 2011-02-06 08:23:10 +01:00
Benoit Jacob
9ce08b352f add more failtests 2011-02-06 01:44:51 -05:00
Benoit Jacob
9b13e9aece failtest: a new cmake-based test suite for testing stuff that should fail to build. This first batch imports some const correctness checks from bug #54. 2011-02-05 18:57:29 -05:00
Hauke Heibel
8aee724274 Made MatrixBase::BasisReturnType const. 2011-02-05 15:53:17 +01:00
Hauke Heibel
6c3dc0d243 Fix Diagonal related const correctness issues. 2011-02-05 14:19:53 +01:00
Hauke Heibel
e20f1a44bb Fixed hidden const correctness issue. 2011-02-05 13:52:18 +01:00
Jitse Niesen
e2d46eac42 Remove all references to EIGEN_TUNE_CPU_CACHE_SIZE.
This macro is no longer used as of revision 0212eec23f
.
2011-02-04 22:33:53 +01:00
Thomas Capricelli
0b555a4a3d fix misc warnings 2011-02-04 13:55:12 +01:00
Thomas Capricelli
0ed604583f turnaround for a compiler bug in gcc 3.4.6 2011-02-04 12:09:30 +01:00
Gael Guennebaud
aee4e950d3 extend ctest script for SSSE3 and above 2011-02-03 18:04:43 +01:00
Gael Guennebaud
5887a086cf fix SSE3 issue (infinite loop after the ei_ => internal change) - this fix bug #174 2011-02-03 17:55:24 +01:00
Gael Guennebaud
1526de96a0 fix compilation with MSVC 2011-02-03 17:23:33 +01:00
Benoit Jacob
4489c56c9e add Map static methods taking Strides, add test checking for compilation errors 2011-02-03 10:05:45 -05:00
Gael Guennebaud
2e2614b0fd fix MSVC8 compilation 2011-02-03 15:40:48 +01:00
Gael Guennebaud
2f71277105 add global tan function 2011-02-03 14:45:21 +01:00
Jason Newton
d028262e06 add tan function in Array world 2011-02-03 14:34:40 +01:00
Gael Guennebaud
1eae6d0fb9 an even more stable procedure 2011-02-03 11:25:34 +01:00
Gael Guennebaud
5beb2f4f0d slightly more stable eigen vector computation 2011-02-03 10:31:45 +01:00
Gael Guennebaud
a617d7f2ad fix compilation with MSVC2005 (strange, stupid fixes for MSVC9 confuse MSVC8....) 2011-02-02 17:47:48 +01:00
Gael Guennebaud
52e0a44034 implement GBMV 2011-02-02 11:39:13 +01:00
Gael Guennebaud
d5f6819761 split BandMatrix to a base and a wrapper class 2011-02-02 11:38:08 +01:00
Gael Guennebaud
8915d5bd22 fix 168 : now TriangularView::solve returns by value making TriangularView::solveInPlace less important.
Also fix the very outdated documentation of this function.
2011-02-01 17:21:20 +01:00
Gael Guennebaud
59af20b390 extend nomalloc test 2011-02-01 16:46:35 +01:00
Gael Guennebaud
ffc8386fdb mark the packet access methods as internal 2011-02-01 16:14:53 +01:00
Gael Guennebaud
a486d5590a implement optimized path for selfadjoint rank 1 update (safe regarding dynamic alloc) 2011-02-01 15:49:10 +01:00
Benoit Jacob
3eb74cf9fc forgot hg add 2011-02-01 07:51:55 -05:00
Gael Guennebaud
fa32ce0fc5 fix alignment issue 2011-02-01 13:51:56 +01:00
Benoit Jacob
2d09b11a97 relax Matrix/Array(Index) ctors to allow size 0, add test. 2011-02-01 07:46:02 -05:00
Gael Guennebaud
faa1284c12 fix compilation of snippets 2011-02-01 13:28:14 +01:00
Gael Guennebaud
4cb9d0f943 notify the creation of manual temporaries 2011-02-01 11:41:52 +01:00
Gael Guennebaud
c60818fca8 fix trmv regarding strided vectors and static allocation of temporaries 2011-02-01 11:38:46 +01:00
Gael Guennebaud
0fdd01fe24 operator(int) and the likes are not only fine for linear storage 2011-02-01 11:09:02 +01:00
Gael Guennebaud
f4a7679904 fix packing criterion 2011-02-01 10:41:12 +01:00
Gael Guennebaud
f46ace61d3 fix dynamic allocation for fixed size objects in matrix-vector product 2011-01-31 21:30:27 +01:00
Benoit Jacob
5ca407de54 update .hgignore 2011-01-31 09:21:31 -05:00
Benoit Jacob
dc22ae101f kill stage 15, it's useless 2011-01-31 09:18:49 -05:00
Benoit Jacob
df06f0be31 eigen2 support: pass remaining 2 tests 2011-01-31 08:55:38 -05:00
Benoit Jacob
7032ec80ae eigen2support: disable sparse tests, and do not require to define YES_I_KNOW_NOT_STABLE 2011-01-31 08:44:49 -05:00
Benoit Jacob
374deaed5f make eigen2 eigensolver test pass 2011-01-31 08:36:14 -05:00
Gael Guennebaud
e2642ed620 clean the script to generate the plots 2011-01-31 12:45:18 +01:00
Gael Guennebaud
3874e6a72b include cblas.h header file to ease configuration 2011-01-31 11:02:59 +01:00
Gael Guennebaud
476cb4c65c fix name collision 2011-01-31 10:54:21 +01:00
Gael Guennebaud
9a73bfeb85 add GOTO2 and clean a bit the cmake macros 2011-01-31 10:45:03 +01:00
Gael Guennebaud
6e67d15795 now gemv supports strides 2011-01-30 08:17:46 +01:00
Hauke Heibel
157a5040d5 Added the /bigobj flag in order to enable compilation with MSVC when EIGEN_SPLIT_LARGE_TESTS is not set. 2011-01-29 14:35:24 +01:00
Benoit Jacob
a1f5ea8954 make eigen2 cholesky test pass 2011-01-28 13:04:23 -05:00
Benoit Jacob
e001db2a15 fix bug in triangular matrix-vector produce found by eigen2 tests! 2011-01-28 13:04:11 -05:00
Gael Guennebaud
852077fbc9 still test fftw even if the binary for long double is not available 2011-01-28 16:54:01 +01:00
Gael Guennebaud
c478e0039e disable broken determinant for complexes and SuperLU 2011-01-28 16:30:21 +01:00
Benoit Jacob
6f2ba1f52b typo reported by Don Lorenzo 2011-01-28 10:00:34 -05:00
Gael Guennebaud
817d86cbaf really fix permute_symm_to_symm for sparse complex matrix 2011-01-28 15:51:55 +01:00
Gael Guennebaud
6ec660ca7e fix crash in autodiff 2011-01-28 15:30:33 +01:00
Gael Guennebaud
af712e80e6 fix bug #73: weird compilation error in HouseholderSequence where double and float were mixed. Hopefuly this also solve bug #91... 2011-01-28 12:35:32 +01:00
Gael Guennebaud
d76ed18a9f rm useless ctor 2011-01-28 11:25:11 +01:00
Gael Guennebaud
1731a432e7 fix BTL cholesky action and output errors if the factorization failed 2011-01-28 11:24:18 +01:00
Gael Guennebaud
837f1ae59c fix compilation with old gcc 2011-01-28 11:23:02 +01:00
Gael Guennebaud
ddfd288dc9 start nighlty builds at 00:00:00 UTC 2011-01-28 10:33:02 +01:00
Gael Guennebaud
42d512d33c fix compilation with gcc 4.2 and older 2011-01-28 10:26:05 +01:00
Gael Guennebaud
97801e5e0e Eigen/Eigen should not include Sparse until it is API stable 2011-01-28 10:04:02 +01:00
Gael Guennebaud
736d00ab87 typo 2011-01-28 09:57:35 +01:00
Gael Guennebaud
162d29e696 fix compilation of sparse module with ICC 2011-01-28 09:55:32 +01:00
Thomas Capricelli
22db1a6e82 fix fftw test 2011-01-27 18:25:41 +01:00
Benoit Jacob
b2b8c6a89c dot() now always uses eigen3 convention, even in eigen2 support mode, even stage 10. Didn't have a choice as lots of eigen code is using it. 2011-01-27 12:04:26 -05:00
Gael Guennebaud
e761ba68f7 merge 2011-01-27 18:03:13 +01:00
Gael Guennebaud
3d8e179aa2 fix MaxCols in ComplexEigenSolver which was causing memory allocation instead of static allocation in the nomalloc test. Uncomment commenetd parts of the nomalloc test since now matrix-matrix products are safe. 2011-01-27 18:02:49 +01:00
Gael Guennebaud
32124bc64a EIGEN_YES_I_KNOW_SPARSE_MODULE_IS_NOT_STABLE_YET must be defined to use Eigen/Sparse 2011-01-27 17:36:58 +01:00
Benoit Jacob
52fed69baa add test for geometry with eigen2_ prefixes. fix that stuff. 2011-01-27 11:21:38 -05:00
Gael Guennebaud
955e096277 add an Options template parameter to Hyperplane and ParametrizedLine 2011-01-27 17:17:06 +01:00
Hauke Heibel
d5e81d866a Added regression tests for bug #148. 2011-01-27 16:37:06 +01:00
Benoit Jacob
fd400ffffb reverse order of testing for eigen2 support stages. Higher stages now have priority. So if your whole project builds with say stage 10, you can manually enable stage 20 for selected files. 2011-01-27 10:34:44 -05:00
Benoit Jacob
b69b6a9db2 add Threshold API to FullPivHouseholderQR 2011-01-27 10:17:52 -05:00
Gael Guennebaud
a954a0fbd5 Add an Options template paramter to Transform to enable/disable alignment 2011-01-27 16:07:33 +01:00
Jakob Schwendner
e3306953ef test case for unaligned quaternion 2011-01-27 09:14:30 -05:00
Christoph Hertzberg
0aa752fc4f add quaternion Options, add unaligned possibility 2011-01-27 09:14:22 -05:00
Gael Guennebaud
9ccd16609c fix twisted selfadjoint to selfadjoint (conjugation issue) 2011-01-27 14:39:01 +01:00
Gael Guennebaud
f5d0f115b4 EigenSolver is now in the Eigenvalues modules, not QR !
: Enter commit message.  Lines beginning with 'HG:' are removed.
2011-01-27 13:56:03 +01:00
Gael Guennebaud
255f2a1379 fix various compilations issues 2011-01-27 13:51:39 +01:00
Gael Guennebaud
999678c3f0 fix mixingtypes unit test 2011-01-27 13:51:17 +01:00
Eamon Nerbonne
40998f5e86 fix const-related compiler error on MSC. 2011-01-27 07:43:07 -05:00
Gael Guennebaud
5f03cbd44f fix many missing const in return types 2011-01-27 12:12:24 +01:00
Gael Guennebaud
e8d6a5ca87 fix cross product for complexes and add support for mixed real-complex cross products 2011-01-27 11:33:37 +01:00
Gael Guennebaud
0bfb78c824 allow mixed complex-real and real-complex dot products 2011-01-27 09:59:19 +01:00
Benoit Jacob
fe3bb545e0 allow matrix[index] in EIGEN2_SUPPORT 2011-01-26 20:22:33 -05:00
Gael Guennebaud
c90d0c363b improve automatic handling of gotoblas and atlas 2011-01-26 19:39:10 +01:00
Gael Guennebaud
0e8a532f87 always link to gfortran for gotoblas, it seems to be harmless for 1.x but needed for 2.x 2011-01-26 19:16:06 +01:00
Gael Guennebaud
240bfdd142 finish the move to Eigen3 in BTL, and let's use our own FindEigen3.cmake script 2011-01-26 19:12:35 +01:00
Gael Guennebaud
86acb46518 pass to eigen3 ;) 2011-01-26 18:41:06 +01:00
Gael Guennebaud
faeae169dd fix compilation 2011-01-26 17:58:17 +01:00
Gael Guennebaud
210a280daf update FindMKL to match the default installation behavior of MKL 11 2011-01-26 17:58:01 +01:00
Gael Guennebaud
1eb85b4cf1 allow the possibility to automatically call or not the ctors on a per scalar type basis, and disable automatic initialization of std::complex<> 2011-01-26 17:56:49 +01:00
Gael Guennebaud
4783748953 do not include reference lapack files if they are not there 2011-01-26 17:10:05 +01:00
Benoit Jacob
162cb8ff42 import back LeastSquares into eigen2support. Pass most of eigen2's 'regression' test, except for regression_4 which is about complex numbers. 2011-01-26 11:05:41 -05:00
Gael Guennebaud
98285ba81c merge 2011-01-26 16:36:07 +01:00
Gael Guennebaud
7ef9d82b39 add a minimalistict lapack wrapper 2011-01-26 16:34:45 +01:00
Gael Guennebaud
15ef62ca43 extend PermutationMatrix and Transpositions to support arbitrary interger types and to support the Map/Wrapper model via base and derived classes 2011-01-26 16:33:23 +01:00
Benoit Jacob
76c630d185 eigen2 support: import SVD back, pass SVD tests 2011-01-26 10:33:03 -05:00
Benoit Jacob
313eea8f10 fix the remainder of bug #159 2011-01-26 10:01:18 -05:00
Benoit Jacob
f88ca0ac79 fix the eigen3 part of bug #159 - build issue with selfadjointview 2011-01-26 09:49:06 -05:00
Benoit Jacob
9a5ded3e1d fix bug #160 - forgot hg add 2011-01-25 21:31:27 -05:00
Benoit Jacob
c350f6f12c fix bug #161 2011-01-25 21:28:20 -05:00
Benoit Jacob
39536d44da fix build 2011-01-25 21:24:31 -05:00
Benoit Jacob
1d98cc5e5d eigen2 support: implement part<SelfAdjoint>, mimic eigen2 behavior braindeadness-for-braindeadness 2011-01-25 21:22:04 -05:00
Benoit Jacob
4fbadfd230 merge 2011-01-25 11:19:54 -05:00
Benoit Jacob
07e3ef4f38 eigen2: pass QR decomposition and hyperplane tests 2011-01-25 11:19:26 -05:00
Gael Guennebaud
6896cab5b9 one more const missing 2011-01-25 16:52:40 +01:00
Gael Guennebaud
28d6e84150 fix compilation after recent const change in return types 2011-01-25 16:33:02 +01:00
Benoit Jacob
b1d6a9945c eigen2: pass the inverse test 2011-01-25 10:05:29 -05:00
Benoit Jacob
09d1923f61 eigen2: pass lu test 2011-01-25 10:02:36 -05:00
Benoit Jacob
3e2469f951 eigen2: split tests 2011-01-25 09:02:59 -05:00
Benoit Jacob
b04591fbb4 disable eigen2_first_aligned test, it's completely internal stuff 2011-01-25 08:38:22 -05:00
Benoit Jacob
acd2c82655 fix eigen2_bug_132 test 2011-01-25 08:37:32 -05:00
Benoit Jacob
8acd43bbdb let eigen2 tests use the same ei_add_test macro, which required to prefix them with eigen2_ ; rename buildtests_eigen2 to eigen2_buildtests, etc. 2011-01-25 08:37:18 -05:00
Benoit Jacob
dcfb58f529 eigen2: fix USING_PART_OF_NAMESPACE_EIGEN 2011-01-25 08:03:12 -05:00
Gael Guennebaud
84448b058c fix USING_PART_OF_NAMESPACE_EIGEN to export ei_ prefixed math functions 2011-01-25 09:35:49 +01:00
Gael Guennebaud
7dd4aaba9f fix missing const qualifier in cwiseEqual 2011-01-24 18:49:18 +01:00
Benoit Jacob
bd12ac4ffc import eigen2 Geometry module into Eigen2Support.
fix build of geometry tests
2011-01-24 11:21:58 -05:00
Benoit Jacob
5bfde30e48 fix compilation of array tests 2011-01-24 09:38:50 -05:00
Benoit Jacob
9089488210 fix compilation of Eigen/Geometry with EIGEN2_SUPPORT: was including non-existent header 2011-01-24 08:59:47 -05:00
Benoit Jacob
c3a4f6b5c5 const-qualify template parameters representing const arguments to expressions.
needed to fix docs compile issue.
2011-01-24 08:27:06 -05:00
Benoit Jacob
5331fa3033 fix compilation of LU class example 2011-01-24 07:41:47 -05:00
Benoit Jacob
1dabd133cc pass eigen2's triangular test 2011-01-23 21:53:28 -05:00
Benoit Jacob
5c82fd7f40 Move part() to EIGEN2_SUPPORT (had been deprecated for a long time) 2011-01-23 18:49:36 -05:00
Benoit Jacob
1cf4996d3c make eigen2 visitor test pass 2011-01-23 18:34:30 -05:00
Benoit Jacob
8df5bca979 rename build stages to multiples of 10; old stage 2 becomes stage 15, while stage 20 generates errors (instead of warnings) on conflicting API. 2011-01-23 18:22:18 -05:00
Benoit Jacob
cc1f70abc3 make eigen2 dynalloc test pass (add to eigen2 support some internal stuff that some users may have been relying on) 2011-01-21 10:47:31 -05:00
Benoit Jacob
30de1651d3 relax Map const correctness in eigen2 support stages <= 3
introduce new 'strict' stage 4
2011-01-21 10:42:19 -05:00
Benoit Jacob
54dfcdf86e remove eigen2 vectorization_logic test, it's not an API test 2011-01-21 10:29:43 -05:00
Benoit Jacob
5be269db88 make eigen2 submatrices test pass 2011-01-21 10:24:59 -05:00
Benoit Jacob
cc2b7a5397 introduce the 3 stages of eigen2 support, writing to the mailing list about that in Eigen2 to Eigen3 Migration Path thread 2011-01-21 09:51:03 -05:00
Benoit Jacob
34d93686db lots more EIGEN2_SUPPORT fixes. Now several of the most important core tests build and succeed. 2011-01-20 10:36:32 -05:00
Benoit Jacob
66a2ffa9bd Completely disable Eigen/Array in Eigen3; completely enable in EIGEN2_SUPPORT. 2011-01-20 08:12:24 -05:00
Benoit Jacob
96f08213f7 big eigen2support fix, aimed at users who relied on internal eigen2 stuff: now we dont need customizations in test/eigen2/main.h anymore.
These tests already build:
eigen2_basicstuff
eigen2_adjoint
eigen2_linearstructure
eigen2_prec_inverse_4x4
2011-01-19 11:01:07 -05:00
Benoit Jacob
bf0cffa897 restore the behavior of defaulting to Release build type 2011-01-19 10:15:36 -05:00
Benoit Jacob
1f6bd2915d import eigen2 test suite. enable by defining EIGEN_TEST_EIGEN2
only test_prec_inverse4x4 is fixed at the moment. now need to go over all those tests.
2011-01-19 10:10:54 -05:00
Benoit Jacob
604afc9aca fix bug #155, const-related compilation error 2011-01-18 09:14:14 -05:00
Hauke Heibel
9b2546fea8 Added remaining const coeffRef accessors to Array- and MatrixWrapper. 2011-01-18 13:19:13 +01:00
Benoit Jacob
c7eaca50a0 __cpuidex is not (always) present in VS 2008 + SP1, it seems 2011-01-17 11:17:45 -05:00
hamelin.philippe
5e28f34005 Replace CMAKE_SOURCE_DIR with PROJECT_SOURCE_DIR to allow the cmake project to be included by a root project. 2011-01-17 09:59:40 -05:00
Gael Guennebaud
5010033d88 do not stop the factorization if one pivot is exactly 0, and return the
index of the first zero pivot if any
2011-01-17 11:11:22 +01:00
Gael Guennebaud
ef3e690a0c return the index of the first non positive diagonal entry (more useful than simply true or false) 2011-01-17 11:09:03 +01:00
Gael Guennebaud
8b6c1caa3e fix compilation of rowmajor sparse time diagonal 2011-01-14 20:29:55 +01:00
Thomas Capricelli
dcbf091e60 fix EIGEN_TEST_NOQT (reported by Philippe Hamelin) 2011-01-14 14:30:06 +01:00
Jose Luis Blanco
cbfab7204f Update of CPUID macros to fix segfaults in amd64 code. 2011-01-05 02:43:43 +01:00
Benoit Jacob
98f0274305 third pass of const-correctness fixes (bug #54), hopefully the last one... 2011-01-07 05:16:01 -05:00
Gael Guennebaud
c7baf07a3e add plugin mechanism to sparse objects 2011-01-07 15:53:02 +01:00
Jitse Niesen
9111d73017 Fix compilation error in HouseholderSequence introduced in my previous commit. 2011-01-07 13:46:23 +00:00
Romain Bossart
4abb772b52 Fix bug #38
* address of temporaries were passed to umfpack_zi_* functions. It is ok with g++-4.4 or 4.5, but not with the -std=c++0x in both versions. This patch makes it work for c++98 and c++0x versions
2011-01-07 10:27:22 +01:00
Jitse Niesen
2cc75f4922 Make HouseholderSequence::setTrans() protected (cf. bug #50).
Users can call .transpose() instead.
2011-01-06 11:30:19 +00:00
Manuel Yguel
934720c4ba Decrease the degree of the polynomials being tested to reduce time spent during the tests. 2011-01-05 19:49:13 +01:00
Hauke Heibel
4ba0ec5e0e Fixed #148 where a const-accessor for coefficients was missing in the MatrixWrapper. 2011-01-04 15:35:50 +01:00
Gael Guennebaud
d7e1eeaece fix compilation when defaulting to row major 2011-01-04 14:40:06 +01:00
Gael Guennebaud
3a4d56171d fix openglsupport unit test when defaulting to row major 2011-01-04 14:34:17 +01:00
Gael Guennebaud
64356a622d fix vectorization_logic unit test when defaulting to row major 2011-01-04 14:18:07 +01:00
Jitse Niesen
004488a31d Fix bug in symmetric rank-2 update for row-major matrices (bug #144). 2011-01-04 10:35:39 +00:00
Jitse Niesen
fb023b871f Const-correctness fix for gemv_selector<OnTheRight,ColMajor,true> (bug #144). 2011-01-04 10:35:10 +00:00
Benoit Jacob
fd4e366d7e fix severe perf bug: coeff-based matrix products were not considered aligned, typically preventing vectorization.
added unit test.
2011-01-02 12:07:39 -05:00
Jitse Niesen
47a9d2ed54 Document HouseholderSequence.
Incomplete: I did not explain the difference between OnTheLeft and OnTheRight,
and there is only one example.
2011-01-02 16:59:44 +00:00
Gael Guennebaud
583f963517 make the table fit within 80 characters 2011-01-01 12:02:55 +01:00
Gael Guennebaud
e7318148b5 an attempt to fix a compilation issue with -std=c++0x 2011-01-01 11:40:30 +01:00
Jose Luis Blanco
7feb644620 Switched "MESSAGE(" -> "MESSAGE(STATUS " in CMake script, since otherwise they may look like errors to the user. 2010-12-29 22:02:01 +01:00
Gael Guennebaud
902af035d3 merge 2010-12-31 17:26:48 +01:00
Gael Guennebaud
25efcdd042 fix sparse time dense product with a rowmajor lhs 2010-12-31 17:11:17 +01:00
David J. Luitz
11e253bc10 [Sparse] Added regression tests for the two bugfixes, the code passes all sparse_product tests 2010-12-30 15:16:23 +01:00
Benoit Jacob
13867c15cc fix compilation of code using e.g. Transpose<const Foo>::data() non-const-qualified. Same problem existed for coeffRef() and also in MapBase.h. 2010-12-30 07:47:51 -05:00
Benoit Jacob
26c2afd55a fix compile errors in Tridiagonalization and in doc examples 2010-12-30 04:52:20 -05:00
Benoit Jacob
dbd9c5fd50 fix HouseholderSequence API, bug #50:
* remove ctors taking more than 2 ints
 * rename actualVectors to length
 * add length/shift/trans accessors/mutators
2010-12-30 04:18:40 -05:00
Trevor Irons
e112ad8124 In QuickRefPage LinSpaced is improperly documented. 2010-12-29 10:08:41 -07:00
Jitse Niesen
d6a5ba5a08 Rename EIGEN_DENSESTORAGEBASE_PLUGIN to EIGEN_PLAINOBJECTBASE_PLUGIN. 2010-12-29 19:12:39 +00:00
Jose Luis Blanco
3ca31a8b74 fixed msvc9 build errors. 2010-12-29 19:42:01 +01:00
Jitse Niesen
d84b135ed3 Enable GSL tests (reverts part of changeset 6628534eb5
).
2010-12-29 17:45:18 +00:00
Jose Luis Blanco
97c54ad220 fix MSVC warnings, bug #143 2010-12-29 06:15:41 -05:00
Thomas Capricelli
7a29ae0b5c fix preprocessor checks for availability of cpuid 2010-12-28 13:46:39 +01:00
Jitse Niesen
657013c974 Mention ptr_fun in docs for .unaryExpr() 2010-12-27 16:35:25 +00:00
Jitse Niesen
265e1ef4ef Extend doc page on preprocessor directives. 2010-12-27 16:34:58 +00:00
Jitse Niesen
8db9acbc16 Move doxygen comments for EIGEN_NO_DEBUG from source to I14.
This reverts changeset 76fbe94279
. Benoit and I agree that my
approach there (to use doxygen comments) pollutes the code too much.
2010-12-27 15:07:11 +00:00
Jitse Niesen
840c4e1ab5 Move section on preprocessor directives from I00 to its own page. 2010-12-27 15:07:07 +00:00
Jitse Niesen
42a050dc68 Finish doc page on aliasing. 2010-12-27 15:06:55 +00:00
Benoit Jacob
dc3618a557 move BandMatrix and TridiagonalMatrix to the internal:: namespace 2010-12-25 17:17:10 -05:00
Benoit Jacob
8d2a10c5c1 more renaming to make this file matrix-or-array-agnostic 2010-12-25 17:04:36 -05:00
Benoit Jacob
e8768251db rename macro 2010-12-25 17:01:01 -05:00
Benoit Jacob
86d3711fb7 remove EIGEN_REF_TO_TEMPORARY, clarify docs 2010-12-25 16:45:25 -05:00
Benoit Jacob
75b7d98665 bug #54 - really fix const correctness except in Sparse 2010-12-22 17:45:37 -05:00
Hauke Heibel
3b6d97b51a Re-enabled the BLAS compilation on non-MSVC systems. 2010-12-17 10:52:57 +01:00
Hauke Heibel
5e46f7a499 Switched back to the old behaviour where EIGEN_SPLIT_LARGE_TESTS was ON per default on MSVC systems.
Without splitting these tests, some do not compile
2010-12-17 09:42:17 +01:00
Gael Guennebaud
a21d56b766 disable blas if C++ compiler is MSVC 2010-12-16 20:51:44 +01:00
Hauke Heibel
efdf2e4056 Added automatic SSE3/4.1/4.2 support for MSVC. 2010-12-16 20:08:22 +01:00
Hauke Heibel
b31e1246e1 Re-enabled the missing tests, again... 2010-12-16 19:07:23 +01:00
Hauke Heibel
83e3c4582f Improved the array unit test - internal::isApprox needs to use the same precision as VERIFY_IS_NOT_APPROX.
Removed debug code from test_isApprox.
2010-12-16 18:53:02 +01:00
Hauke Heibel
2d0dfe5d60 Uups - re-enabled subtests 1 to 5. 2010-12-16 17:36:10 +01:00
Hauke Heibel
f578dc7aff Fixed compound subtraction in ArrayBase where the assignment needs to be carried out on the derived type.
Added unit tests for map based component wise arithmetic.
2010-12-16 17:34:13 +01:00
Hauke Heibel
dbfb53e8ef Added unit test for matrix creation from const raw data. 2010-12-15 15:28:43 +01:00
Hauke Heibel
6f5c45ceff Fixed ctor from const raw data for Matrices and added the missing implementation for Arrays.
Fixed a warning regarding the conversion from int to bool in MapBase.
2010-12-15 15:19:51 +01:00
Gael Guennebaud
6a9a6bbc78 fix warning 2010-12-13 10:18:33 +01:00
Gael Guennebaud
68fe80861c Fix bug #133: remove the EIGEN_RESTRICT which was useless here anyway 2010-12-13 09:56:13 +01:00
Jitse Niesen
f2c18f2e37 merge 2010-12-12 21:24:24 +00:00
Jitse Niesen
4a5ebcd1ce Fix compilation of Tridiagonalization_diagonal example.
After changeset 0d63212257
, matrixT() is a real matrix even if the matrix
which is decomposed is complex.
2010-12-12 13:53:42 +00:00
Gael Guennebaud
c7f01157dd enforce compilation of blas unit tests when running ctest 2010-12-12 13:10:00 +01:00
Jitse Niesen
9cd4f67e7f Specify root namespace for fftw_plan from FFTW3 library.
After changeset 4716040703
 (the ei_ --> internal:: change), there are two symbols
called fftw_plan, one from the FFTW3 library and one from Eigen.
2010-12-12 11:44:30 +00:00
Konstantinos Margaritis
e05c79cbd8 Fixed NEON compilation errors, changed float-abi back to softfp (which is the most used right now).
Some complex tests appear to segfault, needs a more careful look.
2010-12-10 20:27:46 +02:00
Benoit Jacob
b11343e15c fix intermittend failure of schur_real test: there only is an iterative process if size>2 2010-12-10 02:10:03 -05:00
Benoit Jacob
74cc42b22f bug #54 - The big Map const-correctness changes 2010-12-10 02:09:58 -05:00
Gael Guennebaud
e736df3edd suppress stupid warning 2010-12-10 15:53:13 +01:00
Gael Guennebaud
79cc86f701 fix compilation 2010-12-10 13:52:47 +01:00
Gael Guennebaud
67c28570e3 fix compilation with ICC (template keyword on a non template method) 2010-12-10 10:05:52 +01:00
Gael Guennebaud
5bc21c25c5 fix ICE with gcc 3.4 and 4.0.1 2010-12-10 09:59:44 +01:00
Gael Guennebaud
bacd531862 fix bug #128 : tridiagonalization failed for 1x1 matrices 2010-12-09 19:56:20 +01:00
Gael Guennebaud
17de59278b simplification 2010-12-09 19:47:02 +01:00
Gael Guennebaud
147a63c4b5 compilation fix 2010-12-09 19:46:26 +01:00
Gael Guennebaud
0b32c5bdda fix compilation of sparse_basic for DynamicSparseMatrix 2010-12-09 19:39:15 +01:00
Benoit Jacob
aec0782719 fix the build of eigensolver_complex test.
it was calling the .value() method on an inner product, and that was blocked in bad zero-sized case.

fixed by adding the .value() method to DenseBase for all 1x1 expressions, and allowing coeff accessors in ProductBase for 1x1 expressions.
2010-12-09 03:47:35 -05:00
Benoit Jacob
1be6449f2e fix bug #127. our product selection logic was flawed in that it used the Max-sized to determine whether the size is 1.
+ test.
2010-12-09 02:38:07 -05:00
Benoit Jacob
819bcbed19 fix comment 2010-12-07 02:17:15 -05:00
Eamon Nerbonne
7a7ca99a31 [mq]: Mingw32 fix
intrin.h is not required nor supported by mingw32.  It is present (and supported) on mingw-w64 builds, even those for 32-bit systems, but here too it's not required on 32-bit systems.  So if we're on mingw, and it's 64-bit, then and only then is the intrin.h inclusion necessary.
2010-12-03 23:24:06 +01:00
Gael Guennebaud
c49c013c47 add main ei_* functions into Eigen2Support 2010-12-03 11:22:35 +01:00
Gael Guennebaud
14208eb478 add a word about the ei_ prefix change in Eigen2 -> Eigen3 doc page. 2010-12-03 10:54:16 +01:00
Hauke Heibel
a289065c73 Applied a fix to our std::vector specialization which prevents the usage of workaround_msvc_stl_support when T is not a class. 2010-12-02 12:33:15 +01:00
Benoit Jacob
59b944cb50 add is_const 2010-12-01 09:22:54 -05:00
Benoit Jacob
46387cc180 remove makeconst_return_type 2010-12-01 09:22:50 -05:00
Hauke Heibel
f0ba513f41 Fixed compilation of tridiagonalization related unit tests. 2010-11-27 15:41:46 +01:00
Hauke Heibel
3899857e08 Removed remove_const_on_value_type since the meaning is unclear and it is in fact unused.
Extened the meta unit tests.
2010-11-26 18:06:08 +01:00
Hauke Heibel
60a544c879 Added STL like (add|remove)_const. Fixed add_const_on_value_type for "const T* const". 2010-11-26 16:56:03 +01:00
Hauke Heibel
bf9d25ce58 Postfixed add_const and remove_const by _on_value_type to express the differences to the STL. 2010-11-26 16:30:45 +01:00
Benoit Jacob
139392488d dos2unix 2010-11-26 10:10:26 -05:00
Jitse Niesen
e868b6736a Merge. 2010-11-26 14:37:58 +00:00
Gael Guennebaud
d551e99644 make HessenbergDecompositionMatrixHReturnType internal 2010-11-26 15:39:01 +01:00
Gael Guennebaud
e06c6553e0 make TridiagonalizationMatrixTReturnType internal and only export a public MatrixTReturnType typedef 2010-11-26 15:36:29 +01:00
Gael Guennebaud
0d63212257 add a TridiagonalizationMatrixTReturnType class to make Tridiagonalization::matrixT() more efficient and future proof. 2010-11-26 15:31:47 +01:00
Jitse Niesen
9bad7c7edb Compilation fix in case EIGEN_DEBUG_ASSERTS is defined. 2010-11-26 14:21:57 +00:00
Gael Guennebaud
421b2b5ff7 fix a couple of issues with TridiagonalMatrix 2010-11-26 13:04:20 +01:00
Gael Guennebaud
d8b26cfeec s/id/p to avoid name clash 2010-11-26 08:36:16 +01:00
Gael Guennebaud
156a31b0e9 fully implement scalar_fuzzy_impl<bool> as, e.g., the missing isMuchSmallerThan is convenient to filter out false values. 2010-11-25 18:00:30 +01:00
Jitse Niesen
010ed9510b Remove parentheses for compatibility with cmake 2.6.2 2010-11-24 22:26:13 +00:00
Benoit Jacob
cd1225ef14 make example compile 2010-11-24 09:18:49 -05:00
Benoit Jacob
f84cbba52a minor fixes 2010-11-24 09:16:30 -05:00
Benoit Jacob
07f2406dc1 some dox tweaks 2010-11-24 08:23:17 -05:00
Gael Guennebaud
f1690fb9fa fix bug #122 : rank 2 update test and scalar multiple extraction were both wrong 2010-11-23 19:19:04 +01:00
Benoit Jacob
0ab9a0a2f7 make UpperBidiagonalization internal: don't want to support it, it's not used.
Keeping it because it tests BandMatrix.
2010-11-23 11:12:42 -05:00
Benoit Jacob
ee38dbf1e6 Rework nested<> to be cleaner, see bug #76. 2010-11-23 11:11:40 -05:00
Frederic Gosselin
4c5932f8f5 Improves the filter for hidden files in "Eigen" and "Eigen/src".
This generic solution prevent cmake from having an error .svn folders when the source folder is under subversion.
2010-11-22 10:47:07 -05:00
Gael Guennebaud
5a65d7970a now the full blas folder requires a fortran compiler 2010-11-22 19:07:29 +01:00
Gael Guennebaud
3976a66889 fix bug #120 : compilation issue of trsolve unit test 2010-11-22 18:59:56 +01:00
Gael Guennebaud
f5f288b741 split level 1 and 2 implementation files into smaller ones and fix a couple of numerical and tricky issues discovered by the lapack test suite 2010-11-22 18:49:12 +01:00
Gael Guennebaud
a6f483e86b import reference BLAS routines which are not already implemented in Eigen : modified givens rotations, and packed and banded storages 2010-11-22 18:05:09 +01:00
Gael Guennebaud
7213dd1e6b this product still badly read the imaginary part on the diagonal 2010-11-22 18:00:47 +01:00
Benoit Jacob
a3f214ade9 holy crap, i had disabled all static asserts in 71f023de3e 2010-11-22 08:21:30 -05:00
Gael Guennebaud
d8396a8da0 fix compilation of product_mmtr 2010-11-21 10:23:06 +01:00
Gael Guennebaud
fb6d9ca951 add missing non const data() method to MapBase 2010-11-21 10:17:25 +01:00
Gael Guennebaud
0020ea544a implement HEMV level2 blas routine 2010-11-21 10:09:33 +01:00
Gael Guennebaud
12bfe5e718 make sure our internal selfadjoint*vector product does not use the imaginary part of the diagonal entries 2010-11-21 10:08:48 +01:00
Gael Guennebaud
e88901daf4 implement SYMV level2 blas routines 2010-11-21 09:34:41 +01:00
Gael Guennebaud
1ac9124fac implements TRMV level 2 blas routine 2010-11-20 23:29:20 +01:00
Gael Guennebaud
d72a8f1e50 make trmv uses direct access 2010-11-20 22:42:24 +01:00
Gael Guennebaud
437dff80ee fix issue 114: workaround cmake enable_language bug 2010-11-20 12:01:17 +01:00
Gael Guennebaud
86474115f5 IBM XL C compiler supports __attribute__((aligned(n))) syntax 2010-11-19 17:33:51 +01:00
Gael Guennebaud
8ad1f64e0a some cleaning in blas level 2 2010-11-19 17:22:43 +01:00
Thomas Capricelli
94f59a92cb fix typo 2010-11-19 17:16:28 +01:00
Gael Guennebaud
ed1ecb24d2 implement GERC and GERU blas routines 2010-11-19 17:05:24 +01:00
Gael Guennebaud
458637f097 implement GER blas routine 2010-11-19 17:02:24 +01:00
Gael Guennebaud
68f8519327 implement HER and HER2 blas routines 2010-11-19 16:51:52 +01:00
Gael Guennebaud
5ce199b1dd update rank 2 update doc 2010-11-19 16:50:49 +01:00
Gael Guennebaud
f369b5a711 makes rank 2 update function conformant to BLAS HER2 2010-11-19 16:50:15 +01:00
Gael Guennebaud
e14f14642d implement SYR and SYR2 2010-11-19 16:09:25 +01:00
Gael Guennebaud
661ef6c127 add regression unit test 2010-11-19 15:38:37 +01:00
Gael Guennebaud
3f24dbf6f5 fix compilation of transform * scaling 2010-11-19 14:45:45 +01:00
Gael Guennebaud
3e99356b59 clean a bit AMD and SimplicialCholesky and add support for partly stored selfadjoint matrices 2010-11-18 10:30:52 +01:00
Gael Guennebaud
1618df55df Add support for sparse symmetric permutations 2010-11-18 10:28:39 +01:00
Gael Guennebaud
fb71b737e4 update blas lib wrt recent change of general_matrix_matrix_triangular_product 2010-11-16 19:19:33 +01:00
Jitse Niesen
e54c8d20cb Docs: aliasing and component-wise operations. 2010-11-16 17:28:59 +00:00
Gael Guennebaud
da05b6af0e fix some remainign issue with ei_ -> internal change 2010-11-16 15:54:48 +01:00
Gael Guennebaud
9a3ec637ff new feature: copy from a sparse selfadjoint view to a full sparse matrix 2010-11-15 14:14:05 +01:00
Gael Guennebaud
5a3a229550 fix return type of rightHouseholderSequence() 2010-11-15 11:11:22 +01:00
Jitse Niesen
cad73d9cdc Correct std::map fix (two commits ago); copy fix to aligned_allocator doc. 2010-11-12 12:06:24 +00:00
Thomas Capricelli
d64e68c8bc fix doc compilation 2010-11-12 11:33:09 +01:00
Jose Luis Blanco
9ba15cd63c Docs: correct declaration of aligned std::map in TopicStlContainers. 2010-11-12 10:05:41 +00:00
Gael Guennebaud
b4fa8261b1 properly use nested types 2010-11-10 19:06:20 +01:00
Gael Guennebaud
05ed9be639 prevent warning 2010-11-10 18:59:16 +01:00
Gael Guennebaud
2577ef90c0 generalize our internal rank K update routine to support more general A*B product while evaluating only one triangular part and make it available via, e.g.:
R.triangularView<Lower>() += s * A * B;
2010-11-10 18:58:55 +01:00
Gael Guennebaud
c810d14d4d add missing specialization 2010-11-09 12:03:20 +01:00
Gael Guennebaud
39477e697a extend unit test to cover previous bug 2010-11-05 14:37:42 +01:00
Gael Guennebaud
572b5585e3 fix Eigen's trsv for complexes 2010-11-05 14:36:34 +01:00
Gael Guennebaud
0e30c4ae3f blas level2: gemv and trsv are green 2010-11-05 14:14:50 +01:00
Gael Guennebaud
3fdea699b8 trsv: simplifications/cleaning 2010-11-05 12:54:32 +01:00
Gael Guennebaud
0e6c1170ab trsv: add support for inner-stride!=1, reduce code instanciation, move implementation to a new products/XX.h file 2010-11-05 12:43:14 +01:00
Gael Guennebaud
fe1353080e fix error handling of level 1 routines 2010-11-04 22:25:59 +01:00
Gael Guennebaud
15e8ad686c add a minimum degree ordering routine based on CSparse (LGPL) and a new built-in sparse cholesky decomposition 2010-11-04 09:58:22 +01:00
Gael Guennebaud
5a4f77716d fix bug #107: SelfAdjointEigenSolver and RowMajor (and add unit test) 2010-11-04 09:33:05 +01:00
Gael Guennebaud
20fcef9656 fixes related to ei_ -> internal change 2010-11-04 08:38:16 +01:00
Gael Guennebaud
62a51184d7 merge 2010-11-04 08:32:52 +01:00
Gael Guennebaud
fd88d721d2 implement proper error handling in level 3 routines 2010-11-03 22:03:12 +01:00
Gael Guennebaud
a8fb6b0ad3 improve detection of erros 2010-11-03 22:02:44 +01:00
Gael Guennebaud
1eea88bff7 fix matrix product bug with OpenMP 2010-11-03 16:12:37 +01:00
Gael Guennebaud
8d27f55eb3 rm auto normalization in favor of clamping 2010-11-03 15:32:40 +01:00
Hauke Heibel
d204ec491d Additional fix to enforce the compiler to use the correct prunning method. 2010-11-02 14:33:33 +01:00
Hauke Heibel
3a3f163e31 Fix bug #65.
In order to prevent compilation errors, the default functor "struct func" must not be defined inside the function scope. I just moved it into a private section of SparseMatrix.
2010-11-02 14:32:41 +01:00
Hauke Heibel
b3007db131 Added a comment on why is_arithmetic is used in DenseCoeffsBase. 2010-11-02 10:11:22 +01:00
Hauke Heibel
96e4a4b59c Fixed compilation due to lacking Transform definitions. 2010-11-01 16:53:39 +01:00
Gael Guennebaud
d2e257cb5d oops (rm commented code) 2010-11-01 09:40:33 +01:00
Gael Guennebaud
c7eda0d866 Let's be safe: enable auto normalization is quaternion to angle-axis code since a slight numerical issue may trigger NaN. The overhead is small and I doubt the perf of this function could be critival for any application ! 2010-10-31 23:26:01 +01:00
Benoit Jacob
006c9a5105 implement VERIFY in a function so it doesn't get compiled thousands of times. 2010-10-29 10:27:20 -04:00
Benoit Jacob
7d441260db on test failure, abort instead of exit, so we can get a stack trace 2010-10-29 10:07:30 -04:00
Benoit Jacob
99ccb26cfe add eigen2support Transform typedefs, add Eigen2To3 section on Transform 2010-10-29 09:00:35 -04:00
Benoit Jacob
bd249d1121 fix bug #92 - we were doing stupid things when passing the list of libraries to link to. 2010-10-28 10:44:20 -04:00
Benoit Jacob
868f753d10 document LvalueBit better 2010-10-28 09:40:20 -04:00
Gael Guennebaud
1d4e80f09d generalize the prune function 2010-10-28 11:39:31 +02:00
Gael Guennebaud
02c8b6af82 fix sparse rankUpdate and triangularView iterator 2010-10-27 15:13:03 +02:00
Gael Guennebaud
241e5ee3e7 add the possibility to solve for sparse rhs with Cholmod 2010-10-27 14:31:23 +02:00
Hauke Heibel
5d4ff3f99c Fixed bug #95 by changing _M_IX64 to _M_X64 as proposed by Jan Schlicht. 2010-10-27 11:07:38 +02:00
Hauke Heibel
3efff8c69e Merge 2010-10-26 16:48:12 +02:00
Gael Guennebaud
f4a6a8e295 rm the useless SparseSolverBase class and provide more compile time traits 2010-10-26 16:47:47 +02:00
Hauke Heibel
c738cd56eb Renamed cleantype to remove_all since it is close to remove_{const|pointer|reference}. 2010-10-26 16:47:01 +02:00
Gael Guennebaud
2fbb9932b0 fix compilation (bad internal:: stuff) 2010-10-26 16:38:51 +02:00
Gael Guennebaud
5e95ee6662 fix compilation and unit test of adolc 2010-10-26 16:26:20 +02:00
Gael Guennebaud
92044fcc2b fix bug #94: add #include src/misc/Solve.h in SparseExtra 2010-10-26 15:51:06 +02:00
Gael Guennebaud
666c16cf63 add new API for Cholmod preserving the legacy one for now 2010-10-26 15:48:33 +02:00
Hauke Heibel
7bc8e3ac09 Initial fixes for bug #85.
Renamed meta_{true|false} to {true|false}_type, meta_if to conditional, is_same_type to is_same, un{ref|pointer|const} to remove_{reference|pointer|const} and makeconst to add_const.
Changed boolean type 'ret' member to 'value'.
Changed 'ret' members refering to types to 'type'.
Adapted all code occurences.
2010-10-25 22:13:49 +02:00
Hauke Heibel
597b2745e1 Allow unset ${CMAKE_BUILD_TYPE} which is required for some targets and corresponding to using default values. 2010-10-25 18:49:39 +02:00
Benoit Jacob
724af13540 make polynomialsolver test compile faster 2010-10-25 10:15:22 -04:00
Benoit Jacob
a94f216487 error out on bad build type 2010-10-25 10:15:22 -04:00
Benoit Jacob
fdaa3f311a adapt mpreal to eigen3 mathfunctions system 2010-10-25 10:15:22 -04:00
Benoit Jacob
4716040703 bug #86 : use internal:: namespace instead of ei_ prefix 2010-10-25 10:15:22 -04:00
Benoit Jacob
ca85a1f6c5 remove build type tweaking 2010-10-23 10:00:43 -04:00
Jitse Niesen
dbdf7ee942 Use 'Release' as default when build type is not specified.
Otherwise, "cmake /path/to/eigen/" in an empty build directory, as specified
on the CMake page on the wiki, yields a fatal error.
2010-10-22 12:23:35 +01:00
Benoit Jacob
bfd46eacad don't change the build type, fatal error if bad build type 2010-10-21 08:55:48 -04:00
Hauke Heibel
969518f99d Improved I13_FunctionsTakingEigenTypes.dox.
Removed the r-value reference part and focused on EIGEN_REF_TO_TEMPORARY only.
2010-10-21 10:14:23 +02:00
Hauke Heibel
ba86d3ef65 Fixed bug #84. 2010-10-21 10:13:17 +02:00
Hauke Heibel
9bbaff6b41 Fixed the unit test splitting for MSVC. 2010-10-21 07:39:06 +02:00
Benoit Jacob
ee60fc2062 fix typo and rephrase sentence 2010-10-20 09:43:16 -04:00
Benoit Jacob
8c17fab8f5 renaming: ei_matrix_storage -> DenseStorage
DenseStorageBase  -> PlainObjectBase
2010-10-20 09:34:13 -04:00
Hauke Heibel
9cf748757e Improved the fixed size array display. 2010-10-20 11:56:29 +02:00
Benoit Jacob
e259f71477 rename PlanarRotation -> JacobiRotation 2010-10-19 21:56:26 -04:00
Benoit Jacob
9044c98cff work around stupid msvc error when constructing at compile time an expression
that involves a division by zero, even if the numeric type has floating point
2010-10-19 21:56:11 -04:00
Gael Guennebaud
e5073746f3 allows blocks of code to be larger than the page body (like tables) 2010-10-19 16:55:49 +02:00
Gael Guennebaud
e19c6b89f5 update the position of the owl 2010-10-19 16:07:04 +02:00
Gael Guennebaud
54814eb05b factorize CSS code, make use of the "manual" class when appropriate, clean the style of the big linear algebra table 2010-10-19 15:25:00 +02:00
Benoit Jacob
70f95ef80d increase css max-width 2010-10-19 09:40:23 -04:00
Benoit Jacob
b1604ea553 merge 2010-10-19 09:32:19 -04:00
Benoit Jacob
b8dfc62f3c specify max-width in em not px 2010-10-19 09:31:22 -04:00
Gael Guennebaud
6d8e7d68e4 factorize CSS code, make use of the "manual" class when appropriate, clean the style of the big linear algebra table 2010-10-19 15:25:00 +02:00
Benoit Jacob
9e3005d552 css update: max-width and margins 2010-10-19 09:18:06 -04:00
Benoit Jacob
9fa54d4cc9 move tables from class "tutorial_code" to "example"
also remove a align="center" in the Aliasing page -- it doesn't make sense to have 1 centered table page when all others are left aligned.
2010-10-19 08:42:49 -04:00
Gael Guennebaud
ca4bd5851c update style of the quick ref guide 2010-10-19 11:59:11 +02:00
Gael Guennebaud
f66fe2663f update CSS to doxygen 1.7.2, new CSS and cleaning of the tutorial 2010-10-19 11:40:49 +02:00
Hauke Heibel
9f8b6ad43e Fixed bug #79. 2010-10-19 09:43:54 +02:00
Benoit Jacob
3481f10e7a re-fix the broken msvc warning in JacobiSVD 2010-10-18 09:46:22 -04:00
Benoit Jacob
3404d5fb14 improvements in pages 5 and 7 of the tutorial. 2010-10-18 09:09:30 -04:00
Benoit Jacob
1c15a6d96f improvements in tutorial page 4 : block operations 2010-10-18 08:44:27 -04:00
Benoit Jacob
4b0fb968ea fixed table html 2010-10-18 07:23:48 -04:00
Benoit Jacob
597bb61c23 fix stupid msvc warning in jacobisvd 2010-10-18 06:54:11 -04:00
Benoit Jacob
6628534eb5 fix bug i just introduced in ei_add_test_internal 2010-10-17 11:47:59 -04:00
Benoit Jacob
19ae4362bd ah ok, we want to build this even without GSL.
so the bug is in FindGSL.cmake.
2010-10-17 11:31:58 -04:00
Benoit Jacob
4e3feb023d more unsupported/ CMake fixes 2010-10-17 11:21:10 -04:00
Benoit Jacob
1e3a035275 Fix general linking issue for tests linking to multiple libs, and explicitly link mpfr_real test to GMP. 2010-10-17 11:04:43 -04:00
Benoit Jacob
8356bc8d06 add jacobiSvd() method, update test & docs 2010-10-17 09:40:52 -04:00
Hauke Heibel
cd3a9d1ccb Fixed bug #74. 2010-10-17 12:33:47 +02:00
Hauke Heibel
c19b965730 Added stddeque unit test dervied from the stdlist test. 2010-10-16 10:45:30 +02:00
Benoit Jacob
6f6400e488 Added tag 3.0-beta2 for changeset 3f79884f03 2010-10-15 09:46:45 -04:00
Benoit Jacob
3f79884f03 bump to 2.92.0 2010-10-15 09:46:20 -04:00
Benoit Jacob
26129229ec doc updates/improvements 2010-10-15 09:44:43 -04:00
Benoit Jacob
fcee1903be update the porting guide 2010-10-15 08:48:44 -04:00
Benoit Jacob
6dc478fd77 doc typo 2010-10-14 10:19:46 -04:00
Benoit Jacob
65c01e2bf7 JacobiSVD doc fix 2010-10-14 10:17:40 -04:00
Benoit Jacob
8f0e80fe30 JacobiSVD:
* fix preallocating constructors, allocate U and V of the right size for computation options
  * complete documentation and internal comments
  * improve unit test, test inf/nan values
2010-10-14 10:14:43 -04:00
Gael Guennebaud
e85a3857f0 import BLAS test suite 2010-10-14 13:46:01 +02:00
Gael Guennebaud
47197065da compilation fix 2010-10-14 10:19:55 +02:00
Benoit Jacob
bcb9068268 fix bug #44: use VERIFY_IS_APPROX instead of exact comparison to please x87 extended precision 2010-10-13 09:40:57 -04:00
Benoit Jacob
c8ecc897c0 add EIGEN_TEST_X87 option 2010-10-13 09:04:59 -04:00
Gael Guennebaud
3a2bb7f782 fix compilation and warnings with fcc 4.0.1 2010-10-13 10:21:28 +02:00
Gael Guennebaud
bf402dd9b8 add the possibility to disable OpenGL testing 2010-10-12 20:23:52 +02:00
Benoit Jacob
8eb0fc1e72 remove SVD class (was bad code taked from elsewhere)
Use JacobiSVD for now.
We do plan to reintroduce a bidiagonalizing SVD asap.
2010-10-12 10:19:59 -04:00
Benoit Jacob
dbedc70012 Jacobi improvements:
* add fixed-size vectorized path
  * add missing restrict keywords
  * use innerStride()
  * allow vectorization even if innerStride()>1, if PacketSize==1
    (think of the case of rows of std::complex<double>)
2010-10-12 09:58:53 -04:00
Benoit Jacob
12a152031d fix the Jacobi bug, expand unit test 2010-10-12 09:43:40 -04:00
Benoit Jacob
75e60121f4 add Jacobi unit test. jacobi_5 fails, exposing bug #39. 2010-10-12 09:12:36 -04:00
Gael Guennebaud
0308f64515 add support for uniform of double 2010-10-12 11:04:19 +02:00
Gael Guennebaud
fb30bb9e59 uncomment commented line for debug 2010-10-12 10:40:42 +02:00
Gael Guennebaud
20be8ad91e add support for uniforms 2010-10-12 10:39:28 +02:00
Benoit Jacob
b8bb804007 set ColPivHouseholderQR as default preconditioner for JacobiSVD 2010-10-11 21:00:42 -04:00
Benoit Jacob
5c3d21693b implement JacobiSVD::solve() and expand the unit test 2010-10-11 15:36:04 -04:00
Gael Guennebaud
0cae73d1eb add the prototype of all level2 functions 2010-10-08 23:31:57 +02:00
Gael Guennebaud
eb105cace8 compilation fix 2010-10-08 22:51:10 +02:00
Benoit Jacob
d229f99ba2 adapt Quaternion to JacobiSVD API changes. 2010-10-08 10:42:41 -04:00
Benoit Jacob
8ba8d90063 add option to compute thin U/V.
By default nothing is computed. You have to ask explicitly for thin/full U/V if you want them.
2010-10-08 10:42:40 -04:00
Benoit Jacob
6fad2eb97b Rework JacobiSVD api / template parameters.
There is now an integer QRPreconditioner template parameter, defaulting to full-piv QR.
Since we have to special-case each QR dec anyway, a template template parameter didn't add much value here.
There is an option NoQRPreconditioner if you know your matrices are already square (auto-detected for fixed-size matrices).
2010-10-08 10:42:32 -04:00
Benoit Jacob
58e0cce0f7 merge backout 2010-10-08 10:42:25 -04:00
Benoit Jacob
4a98cada26 Backed out changeset 2334291157
Sorry Thomas, these doc fixes are no longer relevant with the JacobiSVD API changes, and they are preventing me from applying my patches cleanly.
2010-10-08 10:42:06 -04:00
Gael Guennebaud
a76ce042e6 MSVC for windows mobile does not have the errno.h file 2010-10-07 18:09:15 +02:00
Gael Guennebaud
af22364988 an attempt to fix compilation on windows mobile 2010-10-07 17:54:46 +02:00
Gael Guennebaud
d9c131de5b remove the Taucs backend : Taucs is not maintained anymore and the backend was crap anyway 2010-10-06 17:42:17 +02:00
Gael Guennebaud
423f88aa1e improve FindCholmod 2010-10-06 17:38:02 +02:00
Romain Bossart
c6503e03eb Updates to the Sparse unsupported solvers module.
* change Sparse* specialization's signatures from <..., int Backend> to <..., typename Backend>. Update SparseExtra accordingly to use structs instead of the SparseBackend enum.
* add SparseLDLT Cholmod specialization
* for Cholmod and UmfPack, SparseLU, SparseLLT and SparseLDLT now use ei_solve_retval and have the new solve() method (to be closer to the 3.0 API).

* fix doc
2010-10-04 20:56:54 +02:00
Gael Guennebaud
e3d01f85b2 extend OpenGL support module with true unit tests and support for Transform, Translation, etc. 2010-10-06 13:28:13 +02:00
Gael Guennebaud
b5f32830fd fix geometry tutorial regarding the need to specify the "mode" 2010-10-06 13:27:14 +02:00
Gael Guennebaud
01fad14d78 mark LLT/LDLT solveInPlace func internal and rm their boolean returned value 2010-10-05 15:56:50 +02:00
Thomas Capricelli
2334291157 fix doc 2010-10-04 04:08:32 +02:00
Benoit Jacob
71f023de3e fix compilation on ubuntu 9.04's version of gcc 4.3 (yes, wtf) 2010-09-27 09:57:57 -04:00
Radu Bogdan Rusu
94ea1eed9a fix warning 2010-09-27 09:56:54 -04:00
Hauke Heibel
327ed3d1d3 Added a note to the Gram Schmidt code and improved some formatting. 2010-09-25 14:15:35 +02:00
Hauke Heibel
72d4d45133 Merge. 2010-09-24 17:34:49 +02:00
Hauke Heibel
316dadc8e4 Fixed some SVD issues.
Make the SVD's output unitary.
Improved unit tests.
Added an assert to the SVD ctor to check whether rows>=cols.
2010-09-24 17:32:44 +02:00
Hauke Heibel
053261de88 Make the SVD's output unitary and improved unit tests. 2010-09-24 16:28:20 +02:00
Benoit Jacob
1c54514bfc merge 2010-09-23 09:53:21 -04:00
Benoit Jacob
c253cc3d53 SVD:
* fix unit test for rectangular matrices.
 * enforce that rows >= cols since various places in the code assume that.
2010-09-23 09:51:08 -04:00
Hauke Heibel
947f84633b Fixed bad memory access in the SVD. 2010-09-23 11:15:36 +02:00
Hauke Heibel
62bf04b339 Fixed bad memory access in the SVD. 2010-09-23 11:15:36 +02:00
Gael Guennebaud
82e4a16759 remove superfluous #ifdef 2010-09-15 15:24:21 +02:00
Benoit Jacob
77c943670e add cmakelists for 2 subdirs and make sure all subdirs are installed (GLOB) 2010-09-14 04:11:15 -04:00
Gael Guennebaud
91e9344be9 fix vectorization logic and code of cross3 which was never enabled.. 2010-09-08 14:10:01 +02:00
Gael Guennebaud
f9123df772 fix unitialized quaternion 2010-09-08 12:57:33 +02:00
Gael Guennebaud
d591b0466d add a bench to compare various transformation methods 2010-09-07 18:21:36 +02:00
Gael Guennebaud
9bb75937cc fix += return by value like operations 2010-09-06 11:51:42 +02:00
Gael Guennebaud
62eb4dc99b noalias was wrongly skipping automatic transposition 2010-09-02 19:18:34 +02:00
Gael Guennebaud
4824db6444 add the possibility to extend QuaternionBase 2010-09-02 17:28:07 +02:00
Eamon Nerbonne
d17bb02ccd Fixes mingw32 compile issues 2010-09-02 10:38:23 +02:00
Gael Guennebaud
e0ea25fc21 add missing copyrights 2010-09-01 12:59:38 +02:00
Gael Guennebaud
b49dde01dc fix bad mat * mat * scalar when the implicit conversion operator to a Matrix is used 2010-08-31 09:54:38 +02:00
Hauke Heibel
dd94f10442 Docs: Improved the docs for writing functions taking Eigen types.
- Removed the wrong statement about the MSVC compiler.
- Reformulated "simple functions" usage.
- Reformulated the summary paragraph about writable parameters.
2010-08-27 08:19:09 +02:00
Gael Guennebaud
dcff9ba785 fix bad "using typename" 2010-08-25 13:34:35 +02:00
Gael Guennebaud
cb7a72d5b0 Fix Sun CC parsing of Eigen/Core. In particular,
I moved all the block related methods to a plugin file. This also
significantly reduce code verbosity.
2010-08-25 13:09:56 +02:00
Benoit Jacob
e17d17cea3 didn't want to commit that bench change. 2010-08-24 10:57:22 -04:00
Benoit Jacob
bd8d06033d make a couple of typedefs public so stuff compiles 2010-08-24 10:53:33 -04:00
Gael Guennebaud
a47bbf664c fix 4x4 SSE inversion when storage orders don't match 2010-08-24 13:00:59 +02:00
Gael Guennebaud
548ecc2fe5 update inverse unit test to highlight another bug in SSE 4x4 inversion code 2010-08-24 12:38:20 +02:00
Gael Guennebaud
ad9a7c69bc fix inversion of 4x4 unaligned matrices 2010-08-24 12:28:42 +02:00
Benoit Jacob
6924d4eec5 update this test to build against current eigen.
remove the 'normal' path as it was not compiling anymore and I couldn't see the point of it (?)
2010-08-23 23:21:25 -04:00
Gael Guennebaud
6261f4629f add TriangularMatrix::conjugate to be consistent since we have adjoint 2010-08-23 23:38:35 +02:00
Jitse Niesen
474c2996bd Docs: add section on resolving the aliasing issue. 2010-08-23 17:23:30 +01:00
Jitse Niesen
d1111d625c Docs: Typos in ArrayBase doxygen comments 2010-08-23 11:44:51 +01:00
Jitse Niesen
103b9351fd Docs: Add references to TopicClassHierarchy 2010-08-22 18:28:19 +01:00
Jitse Niesen
a6da803873 Document DenseCoeffsBase 2010-08-22 17:30:31 +01:00
Hauke Heibel
60aad09878 Fixed DiagonalMatrix assignment. 2010-08-21 16:34:46 +02:00
Hauke Heibel
92b1674c79 Fixed typos. 2010-08-19 20:11:06 +02:00
Hauke Heibel
610d79e686 Simplified to product templates to a minimum of template parameters.
Removed the ei_is_any_projective helper and added ei_transform_traits.
2010-08-19 20:02:46 +02:00
Hauke Heibel
a64aabf73c Removed unused code. 2010-08-19 19:33:13 +02:00
Hauke Heibel
55c7848877 Matrix product refactoring (rhs products only).
Added strong inlines required for MSVC for proper inlining.
Added specializations for DiagonalMatrix products to RotationBase.
Added left- and righ-hand-side products with DiagonalMatrix to Transform.
RHS Transform products now return Matrix objects only.
Split the geo_transformations unit test. Some tests were not made for projectivities.
Removed unused variables from main.h that caused warnings.
2010-08-19 19:25:35 +02:00
Gael Guennebaud
d4b664c4cd fix ugly conversion from double[2] to complex 2010-08-19 14:47:58 +02:00
Gael Guennebaud
5354ffbb4f add missing specialization for vector * selfadjoint 2010-08-19 14:05:21 +02:00
Gael Guennebaud
6264755dd3 merge 2010-08-18 15:34:55 +02:00
Gael Guennebaud
ab41c18d60 quickly mention how to solve a sparse problem 2010-08-18 15:33:58 +02:00
Benoit Jacob
216c9125e9 disable NonLinearOptimization test until it's fixed 2010-08-18 09:11:01 -04:00
Gael Guennebaud
ddbbd7065d * disable unalignment detection when vectorization is not enabled
* revert MapBase unalignment detection
2010-08-18 09:35:55 +02:00
Hauke Heibel
85fdcdf055 Fixed Geometry module failures.
Removed default parameter from Transform.
Removed the TransformXX typedefs.
Removed references to TransformXX from unit tests and docs.
Assigning Transforms to a sub-group is now forbidden at compile time.
Products should now properly support the Isometry flag.
Fixed alignment checks in MapBase.
2010-08-17 20:03:50 +02:00
Benoit Jacob
87aafc9169 fix Transform() constructor taking a Transform with other mode.
Not really tested as the geometry tests are currently busted.
2010-08-16 12:30:33 -04:00
Benoit Jacob
19d9c835e0 fix warnings 2010-08-16 11:11:43 -04:00
Gael Guennebaud
b37551f62a further improve compilation error message for array+=matrix 2010-08-16 11:13:02 +02:00
Gael Guennebaud
c625a6a85b improve compilation error message for array+=matrix and the likes 2010-08-16 11:07:17 +02:00
Gael Guennebaud
453d54325e fix declaration of AffineTransformType in Translation 2010-08-16 10:44:27 +02:00
Gael Guennebaud
ba212aeaa9 fix missdetection of GLUT 2010-08-16 09:50:24 +02:00
Gael Guennebaud
aa2b46aa91 allow vectorization of mat44.col() by adding a InnerPanel boolean
template parameter to Block
2010-07-23 16:29:29 +02:00
Gael Guennebaud
853c0e15df slightly generalize the alignment assert in MapBase 2010-08-16 09:41:07 +02:00
Gael Guennebaud
8566ef805b remove the aligned bit flag for non vectorizable types 2010-08-16 09:38:49 +02:00
Benoit Jacob
3a30a2bc3e forgot to remove a #endif 2010-08-13 14:03:38 -04:00
Benoit Jacob
b80d9dd42e fix determination of number of registers on sse:
__i386__ was not defined by MSVC 2010.
fixed as (2*sizeof(void*)).
also move that to SSE/ and let the default for unknown arch's be just 8.
2010-08-13 13:55:28 -04:00
Benoit Jacob
8bbe556e35 merge the backout 2010-08-11 00:06:31 -04:00
Benoit Jacob
97ced33b33 Backed out changeset 40f6e26a24
See thread on mailing list: "InnerPanel change mis-detects alignment?"
2010-08-11 00:04:06 -04:00
Jitse Niesen
76fbe94279 Document EIGEN_NO_DEBUG macro.
I needed some doxygen tricks to get this to work, so it may not be worth it.
2010-08-10 11:37:23 +01:00
Jitse Niesen
530b328769 Aliasing doc: explain that some cases are detected, reverse order examples. 2010-08-08 21:20:14 +01:00
Hauke Heibel
3dd8225862 Added more detailed docs to the QR decompositions classes. 2010-08-05 08:56:19 +02:00
Benoit Jacob
976d7c19e8 some small improvements to the page on functions taking eigen objects.
- make the beginning more precise
 - make the first example be a full selfcontained compiled example, no need for all the others, but having the first one doesn't hurt.
2010-08-04 21:42:32 -04:00
Hauke Heibel
5c7cb3c05c Added more examples to the function writing tutorial including EigenBase, DenseBase, etc. 2010-08-04 17:50:46 +02:00
Hauke Heibel
d558e84f0b Fixed some typos and reformulated a few sentences. 2010-08-04 16:40:33 +02:00
Hauke Heibel
224dd66e10 Added a tutorial on writing functions taking Eigen types. 2010-08-04 12:01:19 +02:00
Benoit Jacob
d90d7a006f fix warnings. The one in Reverse was potentially serious: coeff() methods should return CoeffReturnType, not "Scalar", if the expression is potentially a Lvalue. 2010-08-03 10:38:48 -04:00
Hauke Heibel
cc25edd5de Fixed Affine transform typedef. 2010-08-02 21:33:48 +02:00
Jitse Niesen
508b51cb62 Add page giving an overview of the class hierarchy.
This is mostly copied from the wiki, which in turn copies Benoit's email at
http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2010/06/msg00576.html
I used ASCII art for the inheritance diagrams for now, but I don't mind
moving to GraphViz/dot as discussed earlier.
2010-08-02 11:36:44 +01:00
Jitse Niesen
a9fe75efc4 Documentation: Start special topic page on aliasing. 2010-07-31 21:37:29 +01:00
Hauke Heibel
7cefa75901 Added static method Identity() to the Translation class. 2010-07-29 17:30:37 +02:00
Hauke Heibel
e92993d7b9 Safeguarded some Transform functions with compile time asserts.
Added missing static Identity() to Rotation2D, AngleAxis.
2010-07-29 16:17:42 +02:00
Hauke Heibel
6b89ee0095 Transform is now per default Projective.
Improved invert() in the Transform class.
RotationBase offers matrix() to be conform with Transform's naming scheme.
Added Translation::translation() to be conform with Transform's naming scheme.
2010-07-29 15:54:32 +02:00
Hauke Heibel
2f0e8904f1 Removed debug outputs. 2010-07-28 10:47:58 +02:00
Kenneth Riddile
b038a4bb71 * added EIGEN_ALIGNED_ALLOCATOR macro to allow specifying a different aligned allocator
* attempted to add support for std::deque by copying and modifying the std::vector implementation...MSVC still fails to compile with the std::deque::resize() "will not be aligned" error...probably missing something simple but I'm not sure how to make it work
2010-07-26 19:06:47 -04:00
Jitse Niesen
1420f8b3a1 Several changes in comments to keep Doxygen happy. 2010-07-25 20:29:07 +01:00
Jitse Niesen
3d9764ee24 Add some more examples for the API documentation.
The only missing examples now are for homogeneous() and hnormalized();
I don't know what they're used for ...
2010-07-24 16:43:07 +01:00
Jitse Niesen
425444428c Add examples for API documentation of block methods in DenseBase. 2010-07-23 22:20:00 +01:00
Jitse Niesen
2b5a0060b4 Add examples for API documentation of MatrixBase::cwiseXxx() methods. 2010-07-23 20:32:33 +01:00
Jitse Niesen
072ee3c07d Set Doxygen config variable INCLUDE_PATH to plugins directory.
This is necessary to get functions like MatrixBase::cwiseAbs() documented;
otherwise doxygen can't find the include file in which they are defined.
2010-07-23 19:57:21 +01:00
Jitse Niesen
ae8425c74c Tutorial page 7: more typical example for .all(), minor copy-editing. 2010-07-23 19:20:10 +01:00
User Martin Senst
145830e067 Add newline at the end of Dense. 2010-07-23 19:00:02 +02:00
Gael Guennebaud
40f6e26a24 allow vectorization of mat44.col() by adding a InnerPanel boolean
template parameter to Block
2010-07-23 16:29:29 +02:00
Jitse Niesen
d0f6b1c21f Tutorial page 6: Fix typo, add table of contents. 2010-07-22 21:52:04 +01:00
Gael Guennebaud
9daa66f262 fix merge conflicts 2010-07-22 17:23:11 +02:00
Gael Guennebaud
5d98fa235d merge with complex branch 2010-07-22 16:57:14 +02:00
Jitse Niesen
403e672587 Extend tutorial page 5: Advanced initialization. 2010-07-22 15:53:21 +01:00
Gael Guennebaud
7020f30da3 sync with default branch 2010-07-22 16:29:35 +02:00
Gael Guennebaud
b9edd6fb85 oops 2010-07-22 16:24:01 +02:00
Gael Guennebaud
96ba7cd655 add an OpenGL module simplifying the way you can pass Eigen's objects to GL 2010-07-22 16:08:58 +02:00
Gael Guennebaud
fa6d36e0f7 fix SparseView: clean the nested matrix type 2010-07-22 15:57:01 +02:00
Hauke Heibel
734469e43f Unified LinSpaced in order to be conform with other setter methods as e.g. Constant. 2010-07-22 14:04:00 +02:00
Gael Guennebaud
c7f40e522e merge 2010-07-22 13:21:06 +02:00
Gael Guennebaud
06250a154c add matlab-like mixed product 2010-07-22 13:19:09 +02:00
Gael Guennebaud
bec3f9bfe4 rename indices to a common scheme 2010-07-22 13:17:39 +02:00
Gael Guennebaud
0916d69ca5 fix inner vectorization logic 2010-07-22 13:17:12 +02:00
Gael Guennebaud
0dfc5b296b fix strict aliasing issue 2010-07-22 13:16:53 +02:00
Gael Guennebaud
8a96b0080d now that we properly support mixing real-complex: clean mixingtypes test 2010-07-22 13:15:49 +02:00
Thomas Capricelli
8e21cef80a fix typo 2010-07-22 13:15:15 +02:00
Gael Guennebaud
4393f20fea fix compilation of quaternion demo 2010-07-21 17:34:32 +02:00
Gael Guennebaud
f1104a3b0f fix mandelbrot compilation, and make it use Array instead of Matrix 2010-07-21 17:13:02 +02:00
Gael Guennebaud
35f0bc70d8 fix a strict aliasing issue with gcc 4.3 2010-07-20 22:43:55 +02:00
Gael Guennebaud
b5f2b7d087 fix storage order request 2010-07-20 22:08:48 +02:00
Gael Guennebaud
7dbbc6ffd1 fix static allocation of workspace 2010-07-20 17:06:14 +02:00
Gael Guennebaud
ced1a45f82 add NEON ploaddup and pcplxflip functions 2010-07-20 14:24:01 +02:00
Gael Guennebaud
193eedbfe2 one more fix for openmp 2010-07-20 14:19:00 +02:00
Gael Guennebaud
d7fa09bf05 improve block-size heuristic 2010-07-20 13:23:50 +02:00
Gael Guennebaud
4824ac1363 fix openmp version 2010-07-20 13:23:19 +02:00
Gael Guennebaud
b551a2d77a fix declaration of pack_lhs in trsm 2010-07-20 12:58:22 +02:00
Gael Guennebaud
10a7668035 uncomment commented code for debug 2010-07-20 12:57:46 +02:00
Gael Guennebaud
7b23fad4c9 report a true assert when not checking for an assertion 2010-07-20 12:54:53 +02:00
Gael Guennebaud
44cb1e4802 it appears only the "on the left" case was tested 2010-07-20 10:32:56 +02:00
Gael Guennebaud
872523844a fix trmm and symm wrt lhs packing 2010-07-20 10:06:41 +02:00
Gael Guennebaud
76eb9c9fd9 fix compilation by including file in correct order 2010-07-19 23:32:13 +02:00
Gael Guennebaud
70b1ce11c6 * fix SelfCwiseBinaryOp traits and handling of mixed types
* improve compilation error in case of type mismatch
2010-07-19 23:31:08 +02:00
Gael Guennebaud
8b0b121c9e explicitely disable vectorization for mixed coeff based products 2010-07-19 23:28:57 +02:00
Gael Guennebaud
08c841eb87 fix lhs packing in the case of real * complex products 2010-07-19 23:16:03 +02:00
Gael Guennebaud
1ed4233fd2 port Jacobi to new ei_pset1/ei_pload API 2010-07-19 16:51:38 +02:00
Gael Guennebaud
c2ee454df4 * fix compilation of mixed scalar product
* optimize mixed scalar products
2010-07-19 16:49:09 +02:00
Gael Guennebaud
6e157dd7c6 * fix a couple of remaining issues with previous commit,
* merge ei_product_blocking_traits into ei_gepb_traits
2010-07-19 15:45:13 +02:00
Gael Guennebaud
f8aae7a908 * _mm_loaddup_pd is slow
* optimize SSE ei_ploaddup<Packet4f>
2010-07-19 15:43:27 +02:00
Gael Guennebaud
cd0e5dca9b wip: extend the gebp kernel to optimize complex and mixed products 2010-07-19 08:50:59 +02:00
Gael Guennebaud
45362f4eae update mixing type test 2010-07-15 08:40:09 +02:00
Gael Guennebaud
3f532edc6d update unit test for new API 2010-07-15 08:38:31 +02:00
Gael Guennebaud
1dc9aaaf36 add support for mixing type in trsv 2010-07-13 16:03:49 +02:00
Gael Guennebaud
36d9b51a44 optimize non fused MADD, and add a flatten attribute macro to enforce
inlining within a function
2010-07-13 15:16:34 +02:00
Gael Guennebaud
b72b7ab76f matrix product: move the alpha factor to gebp instead of the packing,
clean some temporaries, etc.
2010-07-12 16:31:46 +02:00
Gael Guennebaud
f8678272a4 mixing types step 3:
- improve support of colmajor by vector and matrix - matrix
- now all configurations are well handled, but the perf are not always very good
2010-07-11 23:57:23 +02:00
Gael Guennebaud
8e3c4283f5 make colmaj * vector uses pointers only 2010-07-11 16:01:48 +02:00
Gael Guennebaud
ff96c94043 mixing types in product step 2:
* pload* and pset1 are now templated on the packet type
* gemv routines are now embeded into a structure with
  a consistent API with respect to gemm
* some configurations of vector * matrix and matrix * matrix works fine,
  some need more work...
2010-07-11 15:48:30 +02:00
Gael Guennebaud
4161b8be67 sync 2010-07-10 22:58:51 +02:00
Gael Guennebaud
e5bc9526f1 * generalize rowmajor by vector
* fix weird compilation error when constructing a matrix with a row by matrix product
2010-07-10 22:53:27 +02:00
Gael Guennebaud
c4ef69b5bd fix compilation: make the check_coordinates* functions const 2010-07-10 22:37:16 +02:00
Benoit Jacob
6dcd373b9d let ei_pset1 use _mm_loaddup_pd. Not a significant speed improvement, but also not a speed regression, and replaces 3 instructions by 1 single instruction. 2010-07-09 18:51:17 -04:00
Konstantinos Margaritis
6ad3f1ab1f Added NEON/Complex.h, ~3.5x faster than scalar std::complex<float>
minor fix in AltiVec Complex.h
2010-07-10 00:09:29 +03:00
Gael Guennebaud
96f9015807 disable MSVC optimization when the underlying compiler is ICC 2010-07-09 19:33:43 +02:00
Gael Guennebaud
b2effa2b2c move ei_conj_if to a more appropriate file 2010-07-09 18:05:57 +02:00
Konstantinos Margaritis
642cc27eb1 forgot to commit ei_p4f_FORWARD; 2010-07-09 18:08:18 +03:00
Konstantinos Margaritis
f6bd508351 forgot to add the Complex.h include for AltiVec. 2010-07-09 17:56:53 +03:00
Konstantinos Margaritis
d9e134c73c Altivec port of Complex.h.
Note: For some reason g++ 4.4 is >200% slower than g++ 4.3 on altivec code.
The same benchmark (bench_gemm) was tested, on the same hardware/OS (G4/Debian testing),
with same CFLAGS. With some code reorganizing I managed to get some minor gain
on 4.4, but I just could not reach 4.3 speed. This is most likely a bug, but I'm waiting
to see if it's fixed on 4.5. I'll look into this a bit more.
2010-07-09 17:54:41 +03:00
Jitse Niesen
26cfe5a958 Be consistent in how the tutorial pages link together. 2010-07-09 11:59:29 +01:00
Jitse Niesen
2c03ca3325 Small changes to tutorial page 2 (matrix arithmetic):
* slightly more extensive discussion of aliasing
* layout: put example code and output side-by-side
* add some links, etc
2010-07-09 11:46:07 +01:00
Gael Guennebaud
b1a17dbfe4 fix a few weird issues with gcc 4.3 32bits and complex<float> 2010-07-09 08:27:58 +02:00
Thomas Capricelli
551cb9b7b4 bench: use of Eigen/Array is deprecated + fix includes for iostream 2010-07-09 03:59:36 +02:00
Gael Guennebaud
504d3a3586 fix SliceVectorizedTraversal for packetsize==1 2010-07-08 23:31:14 +02:00
Gael Guennebaud
51ec188da0 extend vectorization_logic 2010-07-08 23:30:16 +02:00
Carlos Becker
951da96f14 Added more redux types/examples in tutorial and fixed some display issues 2010-07-08 18:16:39 +01:00
Carlos Becker
cb3aad1d91 Reductions/Broadcasting/Visitor Tutorial added to index 2010-07-08 17:45:25 +01:00
Carlos Becker
9852e7b9cb Reductions/Broadcasting/Visitor Tutorial added 2010-07-08 17:42:23 +01:00
Gael Guennebaud
300a226ffa scalars fitting in a single packet requires more work, step 1
* add a, Alignable trait
* update LinearVectorization assignment
2010-07-08 14:27:47 +02:00
Gael Guennebaud
2a1500915a compilation fix 2010-07-08 14:26:00 +02:00
Gael Guennebaud
2066ed91de enabling aligned loads/store for complex<double> is much more tricky,
so the temporary fix is to always perform unaligned load/store
2010-07-07 22:50:19 +02:00
Gael Guennebaud
d89925e6de an attempt to fix wrong unaligned store 2010-07-07 22:35:06 +02:00
Gael Guennebaud
02fd3acd81 update to support mixin types 2010-07-07 19:49:48 +02:00
Gael Guennebaud
31a36aa9c4 support for real * complex matrix product - step 1 (works for some special cases) 2010-07-07 19:49:09 +02:00
Gael Guennebaud
fc3fd8ab57 mention that array = matrix is fine too 2010-07-07 18:10:11 +02:00
Gael Guennebaud
861962c55f sync 2010-07-07 16:44:05 +02:00
Gael Guennebaud
0f2d480af0 add support for complex 2010-07-07 16:41:29 +02:00
Gael Guennebaud
a2415388ef optimized conjugate products for SSE3 2010-07-07 16:37:20 +02:00
Gael Guennebaud
65257f6b29 optimize for SSE3 => significant speed up !! 2010-07-07 15:34:46 +02:00
Gael Guennebaud
dd18b22f0b optimize pmul for complex<double> 2010-07-07 15:29:04 +02:00
Gael Guennebaud
845994f18f optimize gemv for complex<double> and fix gcc alignment issue in 32bits 2010-07-07 15:28:41 +02:00
Gael Guennebaud
e07c0f6bb5 cleanning 2010-07-07 11:41:29 +02:00
Gael Guennebaud
3a7f16a655 typo 2010-07-07 11:13:30 +02:00
Gael Guennebaud
b0896382a3 s/IsVectorized/Vectorizable 2010-07-07 11:10:46 +02:00
Gael Guennebaud
74cf12cbe0 add a compile time error if someone call packet on Diagonal (instead of infinite runtime loop) 2010-07-07 11:07:12 +02:00
Gael Guennebaud
d5e0efaf69 fix vectorization rule of diagonal-product 2010-07-07 11:06:31 +02:00
Gael Guennebaud
c851044eae fix row cwise-prod column in coeff based products...
I really don't know why this worked so far...
2010-07-07 10:52:59 +02:00
Gael Guennebaud
55495dcbae extend product unit tests 2010-07-07 10:50:40 +02:00
Gael Guennebaud
e38fc9692d add a conj_product functor and optimize dot products 2010-07-07 10:00:08 +02:00
Gael Guennebaud
f8d3b4c060 fix mixing types in DiagonalProduct 2010-07-07 09:43:29 +02:00
Gael Guennebaud
bfa606d16f * add a IsVectorized mechanism (instead of packet-size>1...)
* vectorize complex<double>
2010-07-06 23:36:00 +02:00
Gael Guennebaud
38d0a0d5d6 add a unit test for previous bug 2010-07-06 20:54:35 +02:00
Gael Guennebaud
2dba4b7ce7 add a unit test for conj_helper and ei_pconj 2010-07-06 20:54:14 +02:00
Gael Guennebaud
bc57c68cf5 bug fix forgot to conjugate the scalar factor when needed 2010-07-06 20:53:48 +02:00
Gael Guennebaud
e04c3f2cc0 reduce code generation and minor speed up 2010-07-06 19:15:02 +02:00
Gael Guennebaud
d6454788d9 add support for vectorized conjugated products 2010-07-06 19:10:24 +02:00
Gael Guennebaud
291fef5760 fix range 2010-07-06 19:09:31 +02:00
Jitse Niesen
49747fa4a9 Various documentation improvements.
* Add short documentation for Array class
* Put all classes explicitly in Core module (where applicable)
* Section on Modules in Quick Reference Guide
* Put Page 7 after Page 6 in Contents :)
2010-07-06 13:10:08 +01:00
Jitse Niesen
3428d80d20 Small changes to tutorial page 1. 2010-07-06 10:48:25 +01:00
Jens Mueller
d849bc4401 Avoid calling resizeLike, if EIGEN_NO_AUTOMATIC_RESIZING is defined 2010-07-06 10:11:18 +02:00
Jens Mueller
5322b670c8 Add all unsupported modules and fix header file paths 2010-07-06 10:25:52 +02:00
Gael Guennebaud
7d23e7f9f1 indentation 2010-07-06 11:02:01 +02:00
Benoit Jacob
d1243b393e Added tag 3.0-beta1 for changeset 8cfbf33f60 2010-07-06 00:50:30 -04:00
Benoit Jacob
8cfbf33f60 fix the overview page and add mention that the wrong stack alignment issue may have been solved by gcc 4.5 2010-07-06 00:50:16 -04:00
Gael Guennebaud
c69a226192 * extend the Has* packet traits and makes all functor use it
* extend the packing routines to support conjugation
2010-07-05 23:27:54 +02:00
Gael Guennebaud
8db60afb47 oops I did not see that 2010-07-05 21:27:15 +02:00
Gael Guennebaud
e1eccfad3f add intitial support for the vectorization of complex<float> 2010-07-05 16:18:09 +02:00
Konstantinos Margaritis
1505221263 add check for non x86 platforms, we get a compile error on arm/powerpc without the check
(there is no known -yet- method to get cpuid, without resolving to kernel /sys interface)
2010-07-05 16:44:41 +03:00
Konstantinos Margaritis
1daf9b11ba check for !x86 platforms, otherwise the BTL benchmark doesn't compile on arm/powerpc 2010-07-05 16:42:11 +03:00
Jitse Niesen
9fa4e9a098 Improve documentation, mostly by adding links to Quick Start Guide. 2010-07-05 10:59:29 +01:00
Gael Guennebaud
efb79600b9 fix warning "type qualifiers ignored on function return type" for long long scalar types 2010-07-05 11:23:05 +02:00
Gael Guennebaud
15a421ef63 char is not necessarily signed.... 2010-07-05 11:15:08 +02:00
Gael Guennebaud
6249d60715 improve packetmath unit test for sum reductions 2010-07-05 10:54:24 +02:00
Gael Guennebaud
fffaa58ac2 fix unaligned workspace in sybb 2010-07-05 10:12:30 +02:00
Gael Guennebaud
8a38047ec5 fix nomalloc_2 issues with ICC and gcc 4.0.1 (and speed up compilation ;) ) 2010-07-04 15:35:21 +02:00
Gael Guennebaud
c201aabf3e comment the workaround of the EIGEN_EMPTY_STRUCT_CTOR workaround for gcc 4.3 2010-07-04 15:26:58 +02:00
Hauke Heibel
282b5614ed Relaxed precision test. 2010-07-04 14:04:07 +02:00
Gael Guennebaud
0c25f868c7 update topic page on products 2010-07-04 10:37:32 +02:00
Gael Guennebaud
41ea92d355 * update the general TOC
* integrate the old geometry/sparse tutorial into the new one (they are better than nothing)
* remove the old tutorial on the core module
2010-07-04 10:14:47 +02:00
Gael Guennebaud
11329f49f4 suppress warning and add a fixme about this transpose argument 2010-07-03 19:39:29 +02:00
Gael Guennebaud
be1fdbf3af fix openmp for row major destination 2010-07-03 12:52:39 +02:00
Hauke Heibel
0d9dc578dd Adapted the MSVC visualizer to the new Dynamic value. 2010-07-03 12:26:53 +02:00
Gael Guennebaud
b4ef323e90 fix bug with openmp 2010-07-03 12:20:13 +02:00
Hauke Heibel
d6791e8f3d Fixed annoying CMake - Qt warning. 2010-07-03 11:52:46 +02:00
Benoit Jacob
5a52f2833f simplify and polish a bit the page 4 / block ops 2010-07-01 20:52:40 -04:00
Benoit Jacob
08c17c412e polish the Array tutorial page 2010-07-01 20:29:13 -04:00
Benoit Jacob
ba7e9a760d actually remove 3.0-beta1 tag 2010-07-01 10:49:25 -04:00
Benoit Jacob
51fae7e57d Removed tag 3.0-beta1 2010-07-01 10:47:03 -04:00
Hauke Heibel
ce1e5e52dd Enable OpenMP testing for MSVC.
Added CMake comments.
2010-07-01 07:28:16 +02:00
Thomas Capricelli
b212227418 shut one more warning 2010-07-01 04:27:45 +02:00
Thomas Capricelli
1399fd9cbd fix compilation issue with clang 2010-07-01 04:26:07 +02:00
Thomas Capricelli
d414ab51f0 oops... fix it better 2010-07-01 03:39:19 +02:00
Thomas Capricelli
2874101b62 fix compilation with icc. Anyway, the use of an enum instead of a
'const bool' is more consistent with the code around.
2010-07-01 03:23:47 +02:00
Thomas Capricelli
04648becf7 fix warnings with old gcc 2010-07-01 03:22:09 +02:00
Jitse Niesen
7d72d4f3c7 Bug fix for NumTraits<T>::lowest() .
std::numeric_limits<T>::min() is the lowest *positive* normalized number
for floating point types.
This fixes the test failure for geo_alignedbox8 for me.
2010-07-01 01:42:31 +01:00
Benoit Jacob
6326f4623a slightly raise the threshold used in this test to accomodate results obtained with gcc 4.1 2010-06-30 19:55:43 -04:00
Benoit Jacob
532aeba308 s/struct/class/g ; bug reported by Thomas 2010-06-30 19:47:26 -04:00
Benoit Jacob
962b30d75e fix linalg tut; remove the old one 2010-06-30 19:27:30 -04:00
Carlos Becker
97f3c7f282 Fixed nullary test not passing on Core Duo 2010-06-30 19:17:29 +01:00
Hauke Heibel
34d79b6a63 Firefox specific style fix. 2010-06-30 18:31:31 +02:00
Hauke Heibel
7b74d376d3 More style fixes. 2010-06-30 18:27:27 +02:00
Hauke Heibel
e1348b9cc9 Slight pimping of the "Basic matrix manipulation" table.
More CSS simplifications.
2010-06-30 18:04:36 +02:00
Manoj Rajagopalan
c64c0f382f Examples for DenseBase::middle{Rows,Cols}() 2010-06-30 11:26:31 -04:00
Benoit Jacob
63287ff08f Added tag 3.0-beta1 for changeset 73db507d15 2010-06-30 10:16:55 -04:00
Benoit Jacob
73db507d15 merge.
first time i see this: someone pushed *between*
my hg pull -u and my hg push ! I guess that means we have very high activity these days. good!
2010-06-30 10:14:09 -04:00
Benoit Jacob
4d4a23cd3e nearly complete page 6 / linear algebra + examples
fix the previous/next links
2010-06-30 10:11:55 -04:00
Carlos Becker
b83225edfb Fixed small typo in arithmetic tutorial 2010-06-30 15:08:25 +01:00
Hauke Heibel
b1741c1dc6 Fixed some doc appearance issue.
Started cleaning up the CSS.
2010-06-30 15:52:00 +02:00
Hauke Heibel
56fe64c15d Fix hover background color for H2.
Align tutorial tables at the top.
2010-06-30 14:05:37 +02:00
Gael Guennebaud
21d940fbe4 fix unsupported module doc 2010-06-30 13:19:54 +02:00
Hauke Heibel
1688b86823 Followed Benoit's comment and removed the Mainpage.dox too. 2010-06-30 13:04:50 +02:00
Gael Guennebaud
020bf9e922 clean the class hierarchy 2010-06-30 13:05:07 +02:00
Hauke Heibel
af9f452299 Removed old doxygen config file. 2010-06-30 12:57:33 +02:00
Hauke Heibel
c66c4b3293 Added exclusion filters. 2010-06-30 12:57:00 +02:00
Jitse Niesen
096c13ea6d Fill in open entries in decompositions table. 2010-06-30 10:41:23 +01:00
Gael Guennebaud
1b8277fc2a update the big linear algebra table (fixes, add notes and definitions) 2010-06-30 10:37:23 +02:00
Gael Guennebaud
a06cd0fb13 it remains only to set the status of RealSchur and EigenSolver 2010-06-29 20:59:23 +02:00
Gael Guennebaud
1f4927a28c update the big table and add an Optimization column 2010-06-29 20:44:51 +02:00
Manoj Rajagopalan
5c58582a08 Renamed DenseBase::{row,col}Range() to DenseBase::middle{Rows,Cols}() 2010-06-29 14:31:39 -04:00
Manoj Rajagopalan
6e5bed69dc Included tests for middleRows() and middleCols() 2010-06-29 12:39:58 -04:00
Gael Guennebaud
82c4a755af disable empty struct trick for buggy gcc 4.3 2010-06-29 18:17:17 +02:00
Benoit Jacob
e5de9e5226 Remove \nonstable yet. The stability rules for Eigen3 are much simpler:
- all what's not in unsupported/ is considered stable API
    (except internal stuff e.g. expression templates).
2010-06-29 10:10:47 -04:00
Benoit Jacob
76152e9844 start linear algebra tutorial 2010-06-29 10:02:33 -04:00
Benoit Jacob
3bd421e073 fix potential warning 2010-06-29 10:02:10 -04:00
Carlos Becker
deba829911 Added Block tutorial to docs index 2010-06-29 14:23:49 +01:00
Jitse Niesen
3070164525 Fix name clash in "m.block(i,j,m,n)" where m has two meanings.
Fix simple typos in tutorial.
2010-06-29 11:42:51 +01:00
Konstantinos Margaritis
cf3616b2c0 AltiVec signed integer pmadd removed, proved to be 2x slower than the scalar trait(!). 2010-06-28 21:24:55 +03:00
Carlos Becker
97889a7f46 Added Block Operations tutorial and code examples 2010-06-28 18:42:59 +01:00
Carlos Becker
82e2e8b13a Modified Array Class tutorial, added examples 2010-06-28 18:42:09 +01:00
Carlos Becker
bdef7eb656 Added doxygen info for .matrix() and .array() 2010-06-28 18:38:28 +01:00
Benoit Jacob
086ad93295 start a topic page on decompositions, with a big table.
still have to write the _tutorial_ on decompositions.
2010-06-28 10:43:11 -04:00
Gael Guennebaud
dbefd7aafb * update redux section
* fix output precision to 3 for the snippets
2010-06-28 13:30:10 +02:00
Gael Guennebaud
768bdd08c8 fix bad tests 2010-06-28 01:01:29 +02:00
Gael Guennebaud
75da254fc3 * use transpose() instead of row vectors (more common use case)
* add a word about noalias and performance for BLAS users
2010-06-28 00:42:57 +02:00
Gael Guennebaud
aae5994b9e mv comma initializer to page 1 2010-06-28 00:22:47 +02:00
Gael Guennebaud
de1220aa62 add a Transposition section in page 2 2010-06-28 00:05:11 +02:00
Gael Guennebaud
ca29620e25 fix filename 2010-06-27 23:45:37 +02:00
Gael Guennebaud
f98c758f61 fix link 2010-06-27 20:21:12 +02:00
Gael Guennebaud
b5659dc9cf show a more fancy example for the getting started tut 2010-06-27 19:37:16 +02:00
Gael Guennebaud
189d4b51c2 fix unused warning when EIGEN_HAS_FUSE_CJMADD 2010-06-27 17:42:03 +02:00
Benoit Jacob
25f44266a2 fix #146 2010-06-27 08:44:21 -04:00
Gael Guennebaud
f096452dfd Fix cache computation on old Intel CPUs which do not
support the cpuid function 0x4
2010-06-27 00:17:38 +02:00
Gael Guennebaud
5e7bd967cc add the manual Intel's way to query cache info 2010-06-26 23:37:42 +02:00
Manoj Rajagopalan
464fc297cf Included definitions for rowRange() and colRange() member functions of DenseBase 2010-06-26 17:37:17 -04:00
Martin Senst
4b474fdb34 Relax assertion to allow for matrices with cols() == 0 and/or rows() == 0. 2010-07-20 21:25:43 +02:00
Gael Guennebaud
95f2e7f3f5 introduce a new LvalueBit flag and split DenseCoeffBase into three level of accessors 2010-07-21 10:57:01 +02:00
Jitse Niesen
3abbdfd621 Add (set)LinSpaced to quick reference guide. 2010-07-20 21:55:22 +01:00
Jitse Niesen
abd5faf784 Require at least MPFR version 2.3.0, because we use mpfr_signbit.
Code in FindMPFR.cmake is taken from FindEigen2.cmake .
2010-07-19 12:26:52 +01:00
Gael Guennebaud
cac147ba10 add support for determinant on empty matrix 2010-07-19 10:45:06 +02:00
Gael Guennebaud
78d3c54631 add a small bench demoing the possibilities of a direct 3x3 eigen decomposition 2010-07-18 17:26:06 +02:00
Gael Guennebaud
ea27678153 fix compilation of ei_tridiagonalization_inplace_selector for 1x1 matrix 2010-07-18 17:10:11 +02:00
Gael Guennebaud
2a820d41df finish/fix level1 blas, all test pass 2010-07-17 13:49:43 +02:00
Gael Guennebaud
dd27e10360 fix level3 blas: it now passes all computational tests 2010-07-17 11:59:09 +02:00
Gael Guennebaud
2d78023815 fix hemm to not use the imaginary part of the diagonal entries 2010-07-17 11:57:54 +02:00
Gael Guennebaud
cbd6fe323c fix a couple a issue with blas (new TRMM api, and enforece column major) 2010-07-16 23:30:06 +02:00
Gael Guennebaud
f59226e901 fix compilation of blas lib 2010-07-16 22:27:24 +02:00
Gael Guennebaud
4c19024fbf re-enable writing to reversed objects 2010-07-16 22:26:07 +02:00
Gael Guennebaud
fb041c260c fix for empty matrices 2010-07-16 22:25:35 +02:00
Gael Guennebaud
883a8cbb2c disable the optimized 3x3 path for complexes which was not working at all 2010-07-16 18:22:00 +02:00
Gael Guennebaud
6ab9e8632f fix bad fuzzy comparison in 3x3 tridiagonalization 2010-07-16 16:38:58 +02:00
Gael Guennebaud
044424b0e2 fix sum()/prod() on empty matrix making sure this does not affect fixed sized object, extend related unit tests including partial reduction 2010-07-16 14:02:20 +02:00
Gael Guennebaud
6a370f50c7 MPRealSupport was missing 2010-07-15 20:45:45 +02:00
Gael Guennebaud
b08c26aefa merge 2010-07-15 20:41:33 +02:00
Gael Guennebaud
84fdbded4d add support for strictly triangular matrix in trmm though it is not really useful 2010-07-15 20:39:20 +02:00
Gael Guennebaud
87e89fea4e add a support module for MPFR C++ with basic unit testing 2010-07-15 16:29:04 +02:00
Gael Guennebaud
bfbe61454e merge 2010-07-15 09:54:31 +02:00
Gael Guennebaud
cf9edd9958 fix compilation for non trivial types 2010-07-14 23:31:38 +02:00
Gael Guennebaud
b6fac91998 merge 2010-07-14 22:51:53 +02:00
Gael Guennebaud
d4d4382b18 use dummy_precision by default instead of 0 2010-07-14 22:50:03 +02:00
Gael Guennebaud
90d6fc0e28 fix ei_aligned_delete for null pointers and non trivial dtors 2010-07-14 22:49:34 +02:00
Jitse Niesen
b0bd1cfa05 Tutorial page 4: add some text, diversify examples.
Use \verbinclude for output text to disable syntax highlighting.
Give tables consistent look.
2010-07-14 10:16:12 +01:00
Gael Guennebaud
e4f3759c4d add a bench for quaternion multiplication 2010-07-13 13:29:35 +02:00
Jitse Niesen
c36316f284 Change EXPAND_AS_DEFINED doxygen configuration option.
Add macros so that MatrixBase::cwiseProduct() and ArrayBase::min() are
documented, and remove one macro which is no longer used.
2010-07-13 10:14:58 +01:00
Jitse Niesen
140ad0908d Tutorial page 3: add more cwise operations, condense rest. 2010-07-12 22:45:57 +01:00
Christoph Hertzberg
6ba5d2c90c Implemented SSE optimized double-precision Quaternion multiplication 2010-07-12 23:30:47 +02:00
Jitse Niesen
8e776c94c1 Tutorial page 1: Put code and output side-by-side. 2010-07-12 12:02:31 +01:00
Gael Guennebaud
19a70ae939 fix doc compilation on non 32bits systems 2010-07-11 11:01:17 +02:00
Gael Guennebaud
850c6d8a2b fix unused warning 2010-07-11 10:58:58 +02:00
Gael Guennebaud
931027f31b add a utilility to debug cpuid, and makes sure we get 0 if we query an unsupported cpuid function 2010-06-26 23:15:06 +02:00
Gael Guennebaud
d8b1ce664b update the main page and add a TOC 2010-06-26 22:42:14 +02:00
Gael Guennebaud
f3c64c7cce improve ref tables 2010-06-26 22:19:03 +02:00
Benoit Jacob
e078bb2637 big improvements to tutorial, especially page 2 (matrix arithmetic).
add placeholders for some 'special topic' pages.
2010-06-26 14:00:00 -04:00
Gael Guennebaud
1c783e252f extend the quick ref table page 2010-06-26 18:49:50 +02:00
Gael Guennebaud
5c866f2d8c started the quick reference tables 2010-06-26 16:59:18 +02:00
Benoit Jacob
85c2c468df rename file 2010-06-25 20:19:17 -04:00
Carlos Becker
9d44005916 add initial versions of pages 2 and 3 of the tutorial: matrix arithmetic and the array class 2010-06-25 20:16:12 -04:00
Benoit Jacob
4338834e33 add tutorial page 1 - the Matrix class
+ 3 examples
2010-06-25 10:04:35 -04:00
Benoit Jacob
a90575514a int main() is a standard main() prototype, and makes for cleaner examples 2010-06-25 10:04:10 -04:00
Benoit Jacob
67d79c6751 adapt to change: lu() now gives partial piv LU, here we want fullPivLu() 2010-06-25 10:02:39 -04:00
Gael Guennebaud
eb4095d41a extend the eigen 2 to 3 guide 2010-06-25 15:32:01 +02:00
Gael Guennebaud
ec07c4109d add default parameters for InnerStride/OuterStride to be
able to simply write OuterStride<> instead of OuterStride<Dynamic>
2010-06-25 14:48:16 +02:00
Gael Guennebaud
4056db01e7 use const Scalar& instead of Scalar for function arguments 2010-06-25 13:52:23 +02:00
Gael Guennebaud
686689e9cf comment all disabled MSVC warnings 2010-06-25 13:31:07 +02:00
Gael Guennebaud
75b6d2b2f8 fix very annoying warning (gcc 4.3): type qualifiers ignored on function return type 2010-06-25 13:20:34 +02:00
Gael Guennebaud
01553c419e fox blcok size computation for fixed size objects 2010-06-25 11:44:55 +02:00
Gael Guennebaud
e313826890 add mixed sparse-dense outer product 2010-06-25 11:36:38 +02:00
Gael Guennebaud
1927b4dff5 Fix use of nesting types in SparseTranspose and split the big SparseProduct.h file 2010-06-25 10:26:24 +02:00
Gael Guennebaud
28e64b0da3 email change 2010-06-24 23:21:58 +02:00
Gael Guennebaud
002f7114d1 add support for oski 2010-06-24 23:21:45 +02:00
Gael Guennebaud
88e7a572fd makes sure to test small sizes 2010-06-24 23:06:21 +02:00
Gael Guennebaud
99e4afd43e makes SparseView a true sparse expression and fix use of nesting types 2010-06-24 22:48:48 +02:00
Gael Guennebaud
f3b875e434 fix infinite loop 2010-06-24 22:18:09 +02:00
Gael Guennebaud
566867428c - add a low level mechanism to provide preallocated memory to gemm
- ensure static allocation for the product of "large" fixed size matrix
2010-06-24 21:44:24 +02:00
Gael Guennebaud
e039edcb42 fix temporary creation rule 2010-06-24 19:21:25 +02:00
Gael Guennebaud
b22fc6cdc3 bug fix in gemv:
solution always use a temporary in dst.innerStride != 1
even though this is not needed when packet_size == 1....
2010-06-24 17:51:25 +02:00
Gael Guennebaud
7e836ccb4c unit test fix for default to row major 2010-06-24 17:49:51 +02:00
Gael Guennebaud
6be06745df block householder : minor optimization 2010-06-24 17:48:38 +02:00
Gael Guennebaud
905beb0953 fix symm 2010-06-24 16:42:43 +02:00
Gael Guennebaud
af38bccd3d fix syrk 2010-06-24 16:26:27 +02:00
Gael Guennebaud
e499646c74 fix vectorization logic test 2010-06-24 15:38:14 +02:00
Gael Guennebaud
19f2f53e2c fix compilation when default to row major 2010-06-24 15:13:41 +02:00
Gael Guennebaud
d44fce501b fix computation of blocking sizes for small triangular matrices 2010-06-24 11:50:28 +02:00
Hauke Heibel
0068d3ccf6 Added version testing for MSVC. 2010-06-24 10:05:24 +02:00
Hauke Heibel
22a6cb2dc0 Fix compilation when the memory layout is RowMajor. 2010-06-24 09:56:59 +02:00
Hauke Heibel
83f1b747e7 Fixed MSVC cpuid. 2010-06-24 09:55:53 +02:00
Gael Guennebaud
0a42f8c876 fix compilation when EIGEN_CPUD is not defined 2010-06-24 09:45:17 +02:00
Gael Guennebaud
8beb60bf63 fix EIGEN_CPUID for i386 & PIC, still remains to fix the MSVC version 2010-06-24 09:29:43 +02:00
Gael Guennebaud
98fec45d3c btl: add a trmm action and update eigen interface 2010-06-23 22:10:49 +02:00
Gael Guennebaud
546587c7d3 default to Intel's API by default 2010-06-23 17:14:06 +02:00
Gael Guennebaud
e1a6bad087 fix cache queries for non core2 CPU ;) 2010-06-23 16:34:51 +02:00
Gael Guennebaud
37dcdb1ed6 add missing typename 2010-06-22 23:43:12 +02:00
Gael Guennebaud
b284bb8bba add a spmv mini becnhmark for Eigen, GMM++, ublas, mtl4, and oski 2010-06-22 21:39:55 +02:00
Gael Guennebaud
b4fe53f561 * makes all product use the new API to set the blocking sizes
* fix an issue preventing multithreading (now Dynamic = -1 ...)
2010-06-22 16:08:35 +02:00
Gael Guennebaud
fd9a9fa0ae slightly optimize computeProductBlockingSizes by explicitely precomputing what is known at compile time 2010-06-22 11:10:38 +02:00
Hauke Heibel
3ae0eee0b8 merge 2010-06-22 09:31:26 +02:00
Hauke Heibel
d132b5b061 Correct the options computation for RowMajor matrices. 2010-06-22 09:30:08 +02:00
Gael Guennebaud
6ff28eb3cf forgot to include this file in my previous commit 2010-06-22 08:59:02 +02:00
Gael Guennebaud
98686ab86c fix in case we don't know how to query the L1/L2 cache sizes 2010-06-21 23:44:20 +02:00
Gael Guennebaud
0212eec23f simplify and optimize block sizes computation for matrix products. They
are now automatically computed from the L1 and L2 cache sizes which are
themselves automatically determined at runtime.
2010-06-21 23:28:50 +02:00
Hauke Heibel
4bac6fbe1e The intrin.h header needs to be included after cmath in order to prevent warnigns.
Fixed (hopefully) final Index realted warnings.
2010-06-21 18:39:24 +02:00
Hauke Heibel
80b6e5f278 Added include reuqired for __cpuid. 2010-06-21 16:43:31 +02:00
Gael Guennebaud
4cd38b333c make bench_gemm print out the queried cache sizes 2010-06-21 12:07:05 +02:00
Gael Guennebaud
e54635da11 add functions to query the L1 and L2/L3 cache sizes 2010-06-21 11:59:37 +02:00
Hauke Heibel
7196777b74 Added missing typename. 2010-06-21 11:39:11 +02:00
Hauke Heibel
dc6ad5e25b More Index related stuff. 2010-06-21 11:36:00 +02:00
Hauke Heibel
5f65a89f49 Compilation fix. 2010-06-21 08:58:23 +02:00
Hauke Heibel
a1af6e1151 This does hopefully really raise the warning/error limit. 2010-06-21 07:41:40 +02:00
Hauke Heibel
8239c3b85b Fix compilation. 2010-06-21 07:37:59 +02:00
Hauke Heibel
bb46a45340 Finally fixed the matrix function/exponential warning.
Index fixes.
2010-06-20 23:13:24 +02:00
Hauke Heibel
69b50047d6 Raise the error/warning limit. 2010-06-20 22:38:36 +02:00
Gael Guennebaud
52165ba55a compilation fix 2010-06-20 22:27:35 +02:00
Hauke Heibel
f34eaa2628 Next try - more Index fixes. 2010-06-20 21:44:25 +02:00
Hauke Heibel
546b802b77 Still fixing warnings. 2010-06-20 20:16:45 +02:00
Hauke Heibel
cb11f2f8a6 Fix compilation of some tests as well as more warnings. 2010-06-20 18:59:15 +02:00
Hauke Heibel
f1679c7185 Utilize Index in all unit tests. 2010-06-20 17:37:56 +02:00
Hauke Heibel
e402d34407 More Index realted warnings. 2010-06-20 15:52:34 +02:00
Hauke Heibel
7548708848 Silence index warnings in triangular unit test.
Silence index warnings in FFT module.
2010-06-20 14:00:14 +02:00
Hauke Heibel
9a6967d9ba Attempt to fix MatrixExponential/Function related warnings. 2010-06-20 13:17:57 +02:00
Hauke Heibel
aeb12b417d Silence indexing warning. 2010-06-20 13:17:37 +02:00
Gael Guennebaud
e3853353fb fix array_comp *= array_real 2010-06-20 00:35:33 +02:00
Gael Guennebaud
7fd8418b19 finish to merge Array into Core:
- mv Array/* into Core/
- merge Functors.h files, and move Norms.h into Dot.h
2010-06-19 23:36:38 +02:00
Gael Guennebaud
575ac5409c add missing support for std::pow(array,scalar) 2010-06-19 23:17:07 +02:00
Gael Guennebaud
eba418a458 remove reference to the dead Array module 2010-06-19 23:00:22 +02:00
Gael Guennebaud
17af8c763d fix compilation of sparse tests 2010-06-19 15:24:39 +02:00
Hauke Heibel
b1103c5767 Fixed spare unit test. 2010-06-19 13:58:07 +02:00
Gael Guennebaud
6db6e358f5 add the possibility to set the cache size at runtime 2010-06-18 23:25:57 +02:00
Gael Guennebaud
f85a1cf5df optimize SparseMatrix iterator 2010-06-18 16:47:41 +02:00
Benoit Jacob
f0a6d56f07 fix linking errors with multiply defined functions 2010-06-18 09:01:34 -04:00
Jitse Niesen
9d4b16c1d1 QuickStart examples: shorten var names, remove superfluous 'using'. 2010-06-18 10:43:22 +01:00
Gael Guennebaud
729960e465 add missing files 2010-06-18 11:36:30 +02:00
Gael Guennebaud
ece48a6450 split the Sparse module into multiple ones, and move non stable parts to unsupported/
(see the ML for details)
2010-06-18 11:28:30 +02:00
Gael Guennebaud
22d07ec2e3 Add blocking inside HouseholderQR based on code from Vincent Lejeune.
This is all internal stuff for now.
2010-06-17 18:30:47 +02:00
Gael Guennebaud
bc99c82d17 add an inplace householder QR dec function in preparation for a block version 2010-06-17 17:27:52 +02:00
Gael Guennebaud
3acd007f9d more compilation fixes for ICC 2010-06-17 17:25:18 +02:00
Gael Guennebaud
9637574e2b compilation fix for ICC 2010-06-17 16:56:42 +02:00
Gael Guennebaud
ab6a044d0d eigenvalues: documentation fixes 2010-06-17 14:34:10 +02:00
Jitse Niesen
9196b6b659 Add second example to QuickStart guide.
Also, some changes suggested by Keir and Benoit on mailing list.
2010-06-17 12:12:40 +01:00
Gael Guennebaud
7fdf218951 makes trmv works with the triangular matrix on the right 2010-06-17 10:17:22 +02:00
Gael Guennebaud
6bff339cc5 add unit tests for other generalized variants 2010-06-17 10:16:15 +02:00
Gael Guennebaud
43086d12d2 implement other variants 2010-06-17 10:11:38 +02:00
Gael Guennebaud
db160f2e0b warn users other variants are not implemented yet... (will do it very soon) 2010-06-16 23:55:08 +02:00
Gael Guennebaud
74006a9fe9 * decouple the generalized selfadjoint eigenvalue problem to the standard one
* uses named values instead of bools
2010-06-16 23:48:16 +02:00
Gael Guennebaud
197ce96c00 typo 2010-06-16 17:23:39 +02:00
Benoit Jacob
42c62c8876 fix #126, part 2/2: the checkTransposeAliasing() assertion was always compiled, for all expressions,
even for expressions that are known at compile time to not need it because they don't involve any transposing.
This gave 'controlling condition is constant' warnings on ICC, and potentially worse, this could generate a lot
of useless code per-expression if the compiler failed to realize that the condition was constant.
2010-06-16 09:23:32 -04:00
Benoit Jacob
2d1ae6fa08 fix #126, part 1/2: fix the return type of coeff() on direcaccess xprs: was amounting to
const (const Scalar&)

which really doesn't make sense.
2010-06-16 09:21:14 -04:00
Benoit Jacob
d0d62e4437 fix #139, exactly the same issue as #138, this time in Assign.h: const Index is not a compile-time constant, must use enum. 2010-06-16 07:37:52 -04:00
Benoit Jacob
404aa963d9 fix #138: const bool is (rightly) not considered a compile-time constant by ICC, use enum. 2010-06-16 07:32:44 -04:00
Jitse Niesen
8438719111 Compilation fix for matrix_exponential test: add 'typename'. 2010-06-16 11:07:40 +01:00
Gael Guennebaud
9726824f7c improve trmm unit test and fix several bugs in trmm 2010-06-15 23:38:21 +02:00
Gael Guennebaud
2e792d1f42 * make the triangular matrix * matrix product works with trapezoidal matrices
* extend the trmm unit test for unit diagonal
2010-06-15 22:00:34 +02:00
Benoit Jacob
134ca4acb3 packet math functions:
- take const Packet& args like the other packet funcs
 - SSE specializations: make them be actual template specializations
2010-06-15 08:29:21 -04:00
Hauke Heibel
7958797648 Ups, fixed a little ugly bug. 2010-06-15 12:37:54 +02:00
Hauke Heibel
99d952466f This scalar needs to be passed by ref to preserve its alignment. 2010-06-15 10:26:12 +02:00
Hauke Heibel
e5aa6a466b Fixed 64bit/Index related warnings in the matrix functions module. 2010-06-15 09:57:41 +02:00
Hauke Heibel
0afb1e80c7 Really fix #123. 2010-06-14 23:02:49 +02:00
Gael Guennebaud
3cabd0c417 fix issue 135 (SparseBlock::operator= for SparseMatrix) 2010-06-14 16:26:33 +02:00
Benoit Jacob
2d65f5d3cd remove extra semicolon; 2010-06-14 09:06:27 -04:00
Benoit Jacob
d788627b54 rename:
EIGEN_SIZE_MIN    ---> EIGEN_SIZE_MIN_PREFER_DYNAMIC
  EIGEN_MAXSIZE_MIN ---> EIGEN_SIZE_MIN_PREFER_FIXED
and make sure to use the latter in products xprs to determine the inner size.
2010-06-14 09:05:08 -04:00
Hauke Heibel
a54772250f Fixes bug #123. 2010-06-14 14:33:10 +02:00
Daniel Lowengrub
8673f68fd8 merged 2010-06-14 14:17:19 +03:00
Jitse Niesen
c2f6cbab8d Fix compilation of docs after changes in Eigenvalues module.
Clean-up after revision 469382407c
.
2010-06-14 10:16:01 +01:00
Daniel Lowengrub
af5117dbd5 fixed a bug in the DenseBase InnerIterator ctor. 2010-06-14 02:18:36 +03:00
Daniel Lowengrub
dcd39a96e1 added the SparseView class. 2010-06-14 02:16:46 +03:00
Jitse Niesen
9e00697ccc First draft for the 5-minute quick start guide to kick off discussions. 2010-06-13 22:39:27 +01:00
Gael Guennebaud
9ffc0f6975 an attempt to fix 133 2010-06-13 23:29:44 +02:00
Gael Guennebaud
f159613210 compilation fix 2010-06-13 22:53:53 +02:00
Hauke Heibel
eb95c57a9f Fixed another enum related warning. 2010-06-12 19:42:16 +02:00
Hauke Heibel
058f7d3486 Fixed another enum related warning. 2010-06-12 15:21:11 +02:00
Hauke Heibel
91c7af28a8 Remove printouts. 2010-06-12 15:16:39 +02:00
Hauke Heibel
340ac9ea9d Fixed warnings regarding enums. 2010-06-12 13:24:02 +02:00
Gael Guennebaud
03331552a9 add a info() function in LLT to report on succes/faillure 2010-06-12 10:12:22 +02:00
Gael Guennebaud
a25749ade9 add missing overload of operator= in SparseVector 2010-06-12 01:01:12 +02:00
Benoit Jacob
f5b1b6b351 undo 314bfa1375
, the right fix was made as part of the Dynamic -> -1 change, the bug was that in Map, the InnerStrideAtCompileTime could be 0, which doesn't make sense. The 0 value in Stride should not have been forwarded as-is.
2010-06-11 08:38:30 -04:00
Benoit Jacob
d72d538747 merge my Dynamic -> -1 change 2010-06-11 08:04:06 -04:00
Benoit Jacob
bdd7c6c88a change the value of Dynamic to -1, since the index type is now configurable.
remove EIGEN_ENUM_MIN/MAX, implement new macros instead
2010-06-11 07:56:50 -04:00
Benoit Jacob
52e8c42a00 unsplit this test: was compiling 16 times the same thing! Need to find another way. Suggestion: should compile the stuff once into a binary library, or properly split into different tests with different code. 2010-06-11 07:15:34 -04:00
Hauke Heibel
00267e3a47 Added some verbosity on failures in order to get an idea of what is goind wrong on GCC 4.3. 2010-06-11 12:13:29 +02:00
Hauke Heibel
f48af91c7e For 1x1 matrices we really need to check the abs diff of the determinants. 2010-06-11 08:08:12 +02:00
Hauke Heibel
cedea2aba4 Fixed warnings regarding missing assignment operator. 2010-06-11 07:59:59 +02:00
Gael Guennebaud
988aaed964 merge 2010-06-10 23:30:43 +02:00
Gael Guennebaud
5b192930b6 add runtime API to control multithreading 2010-06-10 23:30:15 +02:00
Jitse Niesen
4d597e4654 Merge. 2010-06-10 21:28:10 +01:00
Jitse Niesen
fcab4c951d Add line to prevent compiler warning on unused variables. 2010-06-10 21:26:23 +01:00
Gael Guennebaud
842b54fe80 make the cache size mechanism future proof by adding level 2 parameters 2010-06-10 22:11:31 +02:00
Jitse Niesen
54235879a9 Make test slightly fuzzy to account for effect of extended precision. 2010-06-10 19:18:19 +01:00
Gael Guennebaud
986f65c402 merge 2010-06-10 16:44:24 +02:00
Gael Guennebaud
469382407c * Make HouseholderSequence::evalTo works in place
* Clean a bit the Triadiagonalization making sure it the inplace
  function really works inplace ;), and that only the lower
   triangular part of the matrix is referenced.
* Remove the Tridiagonalization member object of SelfAdjointEigenSolver
  exploiting the in place capability of HouseholdeSequence.
* Update unit test to check SelfAdjointEigenSolver only consider
  the lower triangular part.
2010-06-10 16:39:46 +02:00
Gael Guennebaud
d2d7465bcf merge 2010-06-10 11:00:47 +02:00
Gael Guennebaud
d2779a1a8e fix warning with gcc 4.3 2010-06-10 10:59:06 +02:00
Gael Guennebaud
dad19c4173 compilation fix for gcc 4.2 2010-06-10 10:55:49 +02:00
Hauke Heibel
941ca80b80 Adapted the determinant test for rank 1 matrices with zero determinant. 2010-06-10 10:06:14 +02:00
Gael Guennebaud
f8683c409f generalized eigendecomposition doc 2010-06-10 09:44:52 +02:00
Gael Guennebaud
41e5625f96 clean general symm eigensolver 2010-06-10 09:34:49 +02:00
Hauke Heibel
3f388282ae Fixes geo_transformations_3 unit test. 2010-06-10 00:23:11 +02:00
Gael Guennebaud
8855c4e264 fix unit test when GSL is enabled 2010-06-10 00:19:45 +02:00
Gael Guennebaud
8692ccc5fb Fix generalized symm eigensolver (I don't know why the eigenvectors were normalized) 2010-06-10 00:04:33 +02:00
Hauke Heibel
bcf738811e Added missing return statement. 2010-06-10 00:02:10 +02:00
Hauke Heibel
56e585efcc Fixed language issue. 2010-06-09 17:20:31 +02:00
Hauke Heibel
2b7b549e9e Fix #131. 2010-06-09 17:16:05 +02:00
Gael Guennebaud
e242ac9345 fix LDLT, now it really only uses a given triangular part! 2010-06-09 14:01:06 +02:00
Gael Guennebaud
201bd253ad fix ldlt unit test 2010-06-09 13:18:10 +02:00
Hauke Heibel
8cc02169fd Fix devision by zero warning. 2010-06-09 09:48:06 +02:00
Hauke Heibel
45d3b405eb Fixed many MSVC warnings. 2010-06-09 09:30:22 +02:00
Gael Guennebaud
50e43bc75a * add Transpositions to PermutationMatrix conversion
* make PartialPivLu uses the  Transpositions class
2010-06-08 22:23:11 +02:00
Trevor Irons
684656d41c added inline to setL1Cache functions to avoid shared object compile error 2010-06-08 10:56:50 -06:00
Hauke Heibel
fb3fcd0919 Disabled warning caused by declspec(align()). 2010-06-08 20:21:55 +02:00
Hauke Heibel
8b0da2de64 Fix stable_norm compilation. 2010-06-08 20:09:39 +02:00
Hauke Heibel
1c9b7a8d9f Fighting for a green dashboard! Next warning's gone. 2010-06-08 16:02:22 +02:00
Hauke Heibel
4c5778d29d Made the supression of unused variables portable.
EIGEN_UNUSED is not supported on non GCC systems.
2010-06-08 15:52:00 +02:00
Hauke Heibel
2a64fa4947 Eigen types must always be passed by reference in order to retain memory alignment. 2010-06-08 13:49:40 +02:00
Gael Guennebaud
4b5d359c3a improve/fix stable_norm unit test 2010-06-08 13:29:27 +02:00
Hauke Heibel
626afe8b62 Fixed integer type warnings. 2010-06-08 10:06:14 +02:00
Hauke Heibel
bda40da002 Fixed GCC compilation. 2010-06-08 09:37:13 +02:00
Hauke Heibel
f6d26bf6dc Fixed more warnings. 2010-06-08 00:41:33 +02:00
Hauke Heibel
04274f6793 Fixed eigensolver warning. 2010-06-08 00:05:20 +02:00
Gael Guennebaud
f3a568c81d remove ei_ prefix of public global functions, and s/cpu/l1 2010-06-07 19:05:30 +02:00
Gael Guennebaud
727376b5f4 compilation fix (std::sqrt(int) does not exist) 2010-06-07 18:55:39 +02:00
Gael Guennebaud
88cd6885be Add a proof concept API to configure the blocking parameters at runtime.
After validation of the final API I'll update the other products to use it.
2010-06-07 16:35:25 +02:00
Gael Guennebaud
7726cc8a29 clean old stuff used to support precompilation inside a binary lib 2010-06-07 14:47:20 +02:00
Gael Guennebaud
bfeba41174 Add a Transpositions class to ease the representation and
manipulation of permutations as a sequence of transpositions.
Make LDLT use it.
2010-06-04 23:17:57 +02:00
Jitse Niesen
1ff1bd69ac Schur decomposition of 1-by-1 always converges. 2010-06-04 09:40:35 +01:00
Jitse Niesen
9178e2bd54 Add info() method which can be queried to check whether iteration converged. 2010-06-03 22:59:57 +01:00
Jitse Niesen
ed73a195e0 Refactor compute() by splitting off two smaller private methods. 2010-06-03 22:29:11 +01:00
Gael Guennebaud
e64460d5d0 LDLT: make it honors the Lower/Upper directive and make it works inplace 2010-06-03 22:22:14 +02:00
Gael Guennebaud
4159db979d make LDLT uses only the lower triangular part 2010-06-03 21:33:47 +02:00
Gael Guennebaud
d92de9574a fix sparse LDLT with complexes 2010-06-03 11:56:08 +02:00
Gael Guennebaud
8350ab9fb8 * remove ei_index, and let ei_traits propagate the index types
* add an Index type template parapeter to sparse objects
2010-06-03 08:41:11 +02:00
Hauke Heibel
e40852d282 Fixes #104. 2010-06-02 19:17:41 +02:00
Jitse Niesen
38d8352b7b Add field m_maxIterations; break loop when this limit is exceeded. 2010-06-02 17:31:02 +01:00
Gael Guennebaud
9ff0d67156 fix typos (oops) 2010-06-02 17:56:35 +02:00
Gael Guennebaud
8710bd23e7 clean the ambiguity with insertBack and add a insertBackByOuterInner function 2010-06-02 13:32:13 +02:00
Gael Guennebaud
143e6ab9d0 improve aliasing detection for inverse and add unit test 2010-06-02 10:12:13 +02:00
Gael Guennebaud
4ebb80490a implicit conversion to scalar for inner product 2010-06-02 09:45:57 +02:00
Gael Guennebaud
314bfa1375 fix issue #128 : inner stride can also be 0 in which case it means 1... 2010-06-01 22:51:47 +02:00
Jitse Niesen
ab2b33e802 Add cast to aliasing check.
Otherwise, one of the geo tests fails to compile. Now there are some compiler
warnings about aliasing and type-punned pointers that I don't understand.
2010-06-01 17:45:58 +01:00
Jitse Niesen
e3e2380548 Make all compute() methods return a reference to *this. 2010-06-01 17:40:51 +01:00
Trevor Irons
4c6d182c42 Addressess small compile error with OpenMP 2010-06-01 07:09:40 -06:00
Benoit Jacob
e54faba198 merge the backing-out of the stupid RetByVal change, and implement a simple
aliasing check in inverse, that catches simple cases like x = x.inverse()
2010-06-01 09:17:50 -04:00
Benoit Jacob
3e95609cd4 Backed out changeset 641d968a9a 2010-06-01 09:01:39 -04:00
Benoit Jacob
6ce22a61b3 oops, remove extra 'typename' 2010-05-30 16:41:16 -04:00
Benoit Jacob
aaaade4b3d the Index types change.
As discussed on the list (too long to explain here).
2010-05-30 16:00:58 -04:00
Benoit Jacob
faa3ff3be6 tests:
* change test_is_equal so it just checks for equality, doesn't try anymore to allow to ignoring the difference between col and row vectors.
   (needed for the upcoming Index types change)
 * in ei_assert, don't report on cerr if we're inside of VERIFY_RAISES_ASSERT
2010-05-30 16:00:09 -04:00
Benoit Jacob
42f1b7d470 finish the change of adding .sh extensions to scripts 2010-05-30 15:58:11 -04:00
Benoit Jacob
641d968a9a * Make ReturnByValue have the EvalBeforeAssigningBit and explicitly
enforce this mechanism (otherwise ReturnByValue bypasses it).
 (use .noalias() to get the old behavior.)
* Remove a hack in Inverse, futile optimization for 2x2 expressions.
2010-05-30 13:43:08 -04:00
Anton Gladky
09a1b7f7e1 Fixes the problem, described here:
http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2010/05/msg00154.html
2010-05-28 10:18:37 +02:00
Gael Guennebaud
39c568445c simplify a using statement 2010-06-01 13:59:21 +02:00
Gael Guennebaud
0116261407 make BenchTimer compatible with 2.0 branch 2010-06-01 13:57:38 +02:00
Gael Guennebaud
e2097c55f8 fix issue #125 - *norm() return RealScalar and not Scalar 2010-05-31 22:46:18 +02:00
Jitse Niesen
8dc947821b Allow user to compute only the eigenvalues and not the eigenvectors. 2010-05-31 18:17:47 +01:00
Jitse Niesen
609941380a Change skipU argument to computeU - this reverses the meaning.
See "skipXxx / computeXxx parameters in Eigenvalues module" on mailing list.
2010-05-31 16:48:41 +01:00
Jitse Niesen
c21390a611 Define non-const operator() in Reverse; enable test for this.
Introduction of DenseCoeffBase (revision bfdc1c4973
) meant that non-const
operator() is only defined if DirectAccess is set. This caused the line
"m.reverse()(1,0) = 4;" in MatrixBase_reverse.cpp to fail at compile-time.
Not sure this is correct solution; perhaps we should disallow this? Or make
Reverse DirectAccess with a negative stride - would that break something?
2010-05-31 14:42:04 +01:00
Jitse Niesen
07a65dd02b Fix stupid compilation error in test. 2010-05-31 11:48:47 +01:00
Jitse Niesen
db8631b66a Guard with assert against using decomposition objects uninitialized. 2010-05-30 21:49:35 +01:00
Jens Mueller
48b8ace517 Fix SparseMatrix/SparseVector::sum()
SparseMatrix/SparseVector::sum() uses Map to compute the sum. But Map expects a
pointer.
2010-05-27 17:02:24 +02:00
Manoj Rajagopalan
a240f83216 Fix to ProductBase::evalTo() in order to get matrix multiplication working for numeric
types that don't have implicit conversion from int
2010-05-26 13:00:55 -04:00
Thomas Capricelli
4a2d6ece2e fix readcost for complex types 2010-05-26 02:33:28 +02:00
Jitse Niesen
e7dc772554 Respect MaxRowsAtCompileTime in HouseholderSequence::evalTo().
This fixes the failing test nomalloc_4.
Also remove a print inserted for debugging in schur_real test.
2010-05-25 12:40:42 +01:00
Jitse Niesen
e7d809d434 Update eigenvalues() and operatorNorm() methods in MatrixBase.
* use SelfAdjointView instead of Eigen2's SelfAdjoint flag.
* add tests and documentation.
* allow eigenvalues() for non-selfadjoint matrices.
* they no longer depend only on SelfAdjointEigenSolver, so move them to
  a separate file
2010-05-24 17:43:50 +01:00
Jitse Niesen
8a3f552e39 Return matrices by constant reference where possible.
This changes the return type of:
* eigenvectors() and eigenvalues() in ComplexEigenSolver
* eigenvalues() in EigenSolver
* eigenvectors() and eigenvalues() in SelfAdjointEigenSolver
2010-05-24 17:43:27 +01:00
Jitse Niesen
7a43a4408b Replace local variables by member variables in compute() methods.
This is to avoid dynamic memory allocations in the compute() methods of
ComplexEigenSolver, EigenSolver, and SelfAdjointEigenSolver where possible.
As a result, Tridiagonalization::decomposeInPlace() is no longer used.
Biggest remaining issue is the allocation in HouseholderSequence::evalTo().
2010-05-24 17:43:06 +01:00
Jitse Niesen
68820fd4e8 Use ReturnByValue mechanism for HessenbergDecomposition::matrixH(). 2010-05-24 17:36:13 +01:00
Jitse Niesen
eb3ca68684 Change return type of matrixH() method to HouseholderSequence.
This method is a member of Tridiagonalization and HessenbergDecomposition.
2010-05-24 17:35:54 +01:00
Thomas Capricelli
76dd0e5314 fix some warnings 2010-05-22 19:17:04 +02:00
Thomas Capricelli
79e9e3f24c erm.. use EIGEN_ONLY_USED_FOR_DEBUG() as it already exists. 2010-05-21 18:13:59 +02:00
Thomas Capricelli
618640bf2b fix another warning 2010-05-21 02:14:46 +02:00
Thomas Capricelli
00a3d07795 display actual/expected warning if ei_isApprox() fails 2010-05-21 02:13:50 +02:00
Thomas Capricelli
aadea5ae56 fix a warning 2010-05-21 02:13:16 +02:00
Thomas Capricelli
c1d005e976 introduce a new macro EIGEN_ARG_UNUSED(arg) and use it in some places to
silent some warnings (from clang)
2010-05-21 02:05:25 +02:00
Thomas Capricelli
742bbdfa57 clang/llvm is now good enough. I can compile a project using those (one of
the binary segfaults though, and i think it's related..)
2010-05-21 02:03:43 +02:00
Thomas Capricelli
4daba0750e fix some warnings with clang 2010-05-21 01:46:17 +02:00
Thomas Capricelli
2ab446695b fix some compilation issues with clang (and hopefully bring eigen more
close to std ?)
2010-05-21 01:39:33 +02:00
Thomas Capricelli
17d080edf2 clang shocks on this.
According to people on #llvm, this is indeed not allowed by c++ standard:

[01:33] <coppro> what good would mutable do on a reference?
[01:33] <dgregor> orzel: gcc is wrong to allow "mutable" on references
[01:33] <coppro> just remove mutable; it won't damage the code at all
[01:34] <dgregor> "The mutable specifier can be applied only to names of
class data members (9.2) and cannot be applied to
[01:34] <dgregor> names declared const or static, and cannot be applied to
reference members."
[01:34] <coppro> constness is not passed from an object to the referents of
its members anyways
2010-05-21 01:37:48 +02:00
Thomas Capricelli
b9bcd93ddc fix a compilation pb with clang (it's actually surprising gcc did not complain) 2010-05-20 03:53:09 +02:00
Hauke Heibel
cec297f77b Disabled to __forceinline warning - it creates too many spurious errors and we disabled it only for the unit test (see also the code comment). 2010-05-19 19:35:42 +02:00
Hauke Heibel
39edf8e2bf I was not really aware of the implications on fixed size types when the strong inlining is not present. We really need it on MSVC! 2010-05-19 18:57:38 +02:00
Gael Guennebaud
05910b7996 merge 2010-05-19 16:51:07 +02:00
Gael Guennebaud
188171b991 merge 2010-05-19 16:49:56 +02:00
Gael Guennebaud
64cbd45266 minor chnages in Taucs and Cholmod backends 2010-05-19 16:49:05 +02:00
Hauke Heibel
6c18246a80 DenseBase is implemented as a class, not a struct. 2010-05-19 16:44:28 +02:00
Gael Guennebaud
2b6153d3ed simplify inner product 2010-05-19 16:37:17 +02:00
Gael Guennebaud
bf09fe55ed fix selfadjoint to dense 2010-05-19 16:35:34 +02:00
Hauke Heibel
f0283c13e8 Applied tiny Qt related fixes. 2010-05-19 16:32:47 +02:00
Benoit Jacob
08fbfa93e0 make the cmake options EIGEN_DEFAULT_TO_ROW_MAJOR and disabling EIGEN_SPLIT_LARGE_TESTS work also for unsupported tests 2010-05-18 08:59:39 -04:00
Benoit Jacob
1c04484a01 bug fix, since the last storage order changes, this InnerSize calculation was wrong 2010-05-18 08:24:06 -04:00
Benoit Jacob
5250c4395c compilation fix: const T ---> typename ei_makeconst<T>::type
(error was appearing when building tests with alignmnet disabled)

What is this stuff still doing in MatrixBase.h? Shouldn't it move to DenseBase.h? How are Array blocks compiling?
2010-05-18 08:19:23 -04:00
Gael Guennebaud
cf6d3162cc cache outer size in Block => x1.5 speed up for a.block() = b.block() 2010-05-17 16:54:17 +02:00
Gael Guennebaud
0f3bcf853f fix mixing types in inner product 2010-05-14 08:55:56 +02:00
Gael Guennebaud
6d08301dcc add regression test for previous fix 2010-05-13 23:34:04 +02:00
Gael Guennebaud
42a1c983c1 fix bug in sliced redux 2010-05-13 23:22:18 +02:00
Gael Guennebaud
c55761e015 make inner product return a 1x1 matrix 2010-05-12 18:11:05 +02:00
Benoit Jacob
82d898083f fix compilation error thanks to test case by Trevor Irons, and expand unit test 2010-05-09 13:20:46 -04:00
Benoit Jacob
6624b93d67 add important comment and move stride helpers to DenseCoeffsBase.h 2010-05-09 09:41:54 -04:00
Cyrille Berger
8f076f6817 fix installation of global headers in case the src path contains 'src' 2010-05-08 17:55:55 -04:00
Benoit Jacob
0928c40f68 rename Coeffs.h -> DenseCoeffsBase.h 2010-05-08 16:02:13 -04:00
Benoit Jacob
7cbb84b046 move the strides API to DenseCoeffsBase,
and various fixes to make stuff compile after my big changes
2010-05-08 16:00:05 -04:00
Benoit Jacob
0e2a480466 use CoeffReturnType 2010-05-08 15:58:27 -04:00
Benoit Jacob
eda2795f51 use modern ei_first_aligned function, dont try compiling coeffRef() on rvalue expressions. 2010-05-08 15:57:56 -04:00
Benoit Jacob
5deda97413 remove bogus test that was failing 2010-05-08 15:56:52 -04:00
Benoit Jacob
65bd1652b1 let Diagonal have the DirectAccessBit, using an inner stride 2010-05-08 14:19:04 -04:00
Benoit Jacob
bfdc1c4973 introduce DenseCoeffsBase: this is where the coeff / coeffRef / etc... methods go.
Rationale: coeffRef() methods should only exist when we have DirectAccess. So a natural thing to do would have been to use enable_if, but since there are many methods it made more sense to do the "enable_if" for the whole group by introducing a new class. And that also that the benefit of not changing method prototypes.
2010-05-08 13:45:31 -04:00
Benoit Jacob
d03779f75f fix CwiseUnaryView: it shouldn't have the AlignedBit, but it should have the DirectAccessBit and corresponding strides API. 2010-05-08 13:42:41 -04:00
Jitse Niesen
2d74f1ac92 Document SelfAdjointEigenSolver and add examples. 2010-05-04 17:11:32 +01:00
Hauke Heibel
6ea6276f20 Quiet MSVC. 2010-05-04 14:06:41 +02:00
Jitse Niesen
38021b08c1 Merge. 2010-05-02 21:51:27 +01:00
Jitse Niesen
afed0ef90d Document Tridiagonalization class, remove unused types. 2010-05-01 17:52:16 +01:00
Benoit Jacob
8f249e8b54 fix compilation: const (T&) != const T& , use ei_makeconst 2010-04-30 14:28:35 -04:00
Benoit Jacob
cf4f90ccea fix #116 and remove debug cout's 2010-04-30 11:58:17 -04:00
Benoit Jacob
38facbd55b kill the LeastSquares module.
I didn't even put it in Eigen2Support because it requires several other modules. But if you want we can always create a new module, Eigen2Support_LeastSquares...
2010-04-29 10:40:52 -04:00
Benoit Jacob
664f2d4508 dont try passing --version to qcc 2010-04-29 08:04:42 -04:00
Benoit Jacob
d3f97f7582 forgot hg add 2010-04-29 07:30:00 -04:00
Benoit Jacob
5d63d2cc4b * kill the retval typedefs, instead introduce ei_xxx_retval which does the job automatically in 99% cases and can be specialized
* add real/imag/abs2 global functions for Array
* document ei_global_math_functions_filtering_base
* improve unit tests
2010-04-28 22:42:34 -04:00
Benoit Jacob
e277586958 Complete rework of global math functions and NumTraits.
* Now completely generic so all standard integer types (like char...) are supported.
** add unit test for that (integer_types).
* NumTraits does no longer inherit numeric_limits
* All math functions are now templated
* Better guard (static asserts) against using certain math functions on integer types.
2010-04-28 18:51:38 -04:00
Jitse Niesen
d9c1224133 Simplify computation of eigenvectors in EigenSolver.
* reduce scope of declarations
* use that low = 0 and high = size-1
* rename some variables
* rename hqr2_step2() to computeEigenvectors()
* exploit that ei_isMuchSmallerThan takes absolute value of arguments
2010-04-26 17:43:16 +01:00
Jitse Niesen
024995dbca Use topRows() and rightCols() in ComplexSchur and RealSchur. 2010-04-26 16:59:18 +01:00
Jitse Niesen
4f83d6ad19 Remove doc/snippets/MatrixBase_minor.cpp because minor() was removed. 2010-04-26 16:35:38 +01:00
Carlos Becker
34b3cdb82c Added EIGEN_DONT_PARALLELIZE preprocessor directive 2010-04-26 17:08:54 +02:00
Hauke Heibel
bf71b466e9 Removed ambiguity between Map and Matrix Options template parameter. 2010-04-26 16:59:04 +02:00
Hauke Heibel
17dbe6c743 Added file extensions to our unit test scripts to prevent MSVC from failing to build because of name clashes. 2010-04-26 13:31:58 +02:00
Hauke Heibel
18c70b12d7 Fixed a warning. 2010-04-26 13:31:16 +02:00
Carlos Becker
c4dda79904 Fixed stablenorm test, condition was not met when running tests 2010-04-26 00:20:20 +02:00
Benoit Jacob
ce09b4ddfc compile 2010-04-25 17:10:28 -04:00
Konstantinos Margaritis
9337f371d2 (proper commit this time)
replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function.
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
2010-04-24 00:58:44 +03:00
Konstantinos Margaritis
5acf46bd12 Backed out changeset 6972c140f7 2010-04-24 00:57:10 +03:00
oem
6972c140f7 replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function.
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
2010-04-24 00:44:14 +03:00
Benoit Jacob
e3e34b5920 remove MakeBase, use ei_dense_xpr_base instead
(yes, it was only used in dense xprs anyway)
2010-04-23 12:16:30 -04:00
Benoit Jacob
a16ba80bfa * remove ei_block_direct_access_status
* remove HasDirectAccess / NoDirectAccess constants
2010-04-23 11:36:22 -04:00
Benoit Jacob
58e7297859 * remove class DenseDirectAccessBase
* remove member XprBase typedefs, use ei_dense_xpr_base
* remove member _HasDirectAccess typedefs, use ei_has_direct_access
2010-04-23 10:27:15 -04:00
Benoit Jacob
1dd27ce280 merge 2010-04-23 09:06:28 -04:00
Benoit Jacob
f22ade8ee4 restrict operator[] to vectors, not matrices. 2010-04-23 09:05:46 -04:00
Benoit Jacob
c29b431ad9 remove eigen_gen_credits script 2010-04-22 20:59:19 -04:00
Benoit Jacob
4502afeedf remove disabled/ directory. It's useless. It remains in the hg history anyways. 2010-04-22 20:56:33 -04:00
Benoit Jacob
a4f9ca44ab add minor to Eigen2Support 2010-04-22 20:49:01 -04:00
Benoit Jacob
2362eadcdd remove Minor
adapt 3x3 and 4x4 (non-SSE) inverse paths
2010-04-22 20:40:31 -04:00
Benoit Jacob
abbe260905 remove USING_PART_OF_NAMESPACE_EIGEN, leaving it in Eigen2Support.
improve porting-Eigen2-to-3 docs
2010-04-22 18:27:13 -04:00
Benoit Jacob
ef789fe0d2 forgot to hg add... 2010-04-22 14:32:28 -04:00
Benoit Jacob
bc22f4da9d * fix Eigen2Support, was not including VectorBlock.h
* move the corners support stuff to a new Block.h there
* expand the unit test
2010-04-22 14:31:39 -04:00
Benoit Jacob
00c716d20e merge 2010-04-22 14:11:44 -04:00
Benoit Jacob
9962c59b56 * implement the corner() API change: new methods topLeftCorner() etc
* get rid of BlockReturnType: it was not needed, and code was not always using it consistently anyway
* add topRows(), leftCols(), bottomRows(), rightCols()
* add corners unit-test covering all of that
* adapt docs, expand "porting from eigen 2 to 3"
* adapt Eigen2Support
2010-04-22 14:11:18 -04:00
Hauke Heibel
27a4a748cb MSVC runs into problems when a forward declaration is using a different template type name than the actual declaration.
This fixes the recent issues we observed on MSVC systems.
2010-04-22 13:56:06 +02:00
Adolfo Rodriguez Tsouroukdissian
28dde19e40 - Added problem size constructor to decompositions that did not have one. It preallocates member data structures.
- Updated unit tests to check above constructor.
- In the compute() method of decompositions: Made temporary matrices/vectors class members to avoid heap allocations during compute() (when dynamic matrices are used, of course).

These  changes can speed up decomposition computation time when a solver instance is used to solve multiple same-sized problems. An added benefit is that the compute() method can now be invoked in contexts were heap allocations are forbidden, such as in real-time control loops.

CAVEAT: Not all of the decompositions in the Eigenvalues module have a heap-allocation-free compute() method. A future patch may address this issue, but some required API changes need to be incorporated first.
2010-04-21 17:15:57 +02:00
Benoit Jacob
faf8f7732d merge 2010-04-21 12:31:04 -04:00
Benoit Jacob
e33953b888 fix compilation in Sparse (error introduced yesterday) 2010-04-21 12:28:51 -04:00
Hauke Heibel
2db5387488 Fixed two bad errors on std::vector.
First, MSVC 2010 does not ship a 'fixed'/'adapted' STL.
Second, only under very rare cases we do not even need the aligned_allocator.
2010-04-21 18:21:46 +02:00
Daniel Lowengrub
028bb7ea48 Changed the gdb display format of vectors and added support for quaternions. 2010-04-21 00:10:03 +03:00
Benoit Jacob
4ba032c9ab fix grave bug introduced by me: the low-level matrix-vector product functions can't be fed strided vectors, only strided matrices. 2010-04-20 15:59:17 -04:00
Benoit Jacob
941a7e4ebd oos, remove eval() used for debugging 2010-04-19 13:31:29 -04:00
Benoit Jacob
84d1b2ae3a add platform check for how to link to the standard math library.
This allows to support QNX.
2010-04-19 11:19:22 -04:00
Benoit Jacob
40b2aaa8b1 merge 2010-04-18 22:15:20 -04:00
Benoit Jacob
504a31f643 renaming (MatrixType ---> whatever appropriate)
and documentation improvements
2010-04-18 22:14:55 -04:00
Benoit Jacob
34b14c48f3 shut up stupid gcc 4.5.0 warning 2010-04-18 22:14:03 -04:00
Thomas Capricelli
29a3aec483 erf() is really non standard, better dont pollute eigen with it. 2010-04-19 02:06:11 +02:00
Hauke Heibel
086d5f1ac6 Fixed indentation and removed debug code. 2010-04-18 20:16:39 +02:00
Hauke Heibel
214d5a892d Added support for STL lists with aligned Eigen types. 2010-04-18 20:10:43 +02:00
Hauke Heibel
031932b4ec Disabled erf also for Cygwin where it is not supported and causes errors. 2010-04-18 20:08:56 +02:00
Hauke Heibel
7db0f242ef Disabled unsupported erf on MSVC machines. 2010-04-18 14:37:44 +02:00
Thomas Capricelli
63eaa8948e tests for nonlinear module : use different slots + misc cleaning 2010-04-18 03:47:23 +02:00
Thomas Capricelli
f1deab0e5a introduce ei_erf() for various scalar type 2010-04-18 03:39:22 +02:00
Benoit Jacob
ce32f90fdd clarify help message about make install 2010-04-17 12:10:53 -04:00
Benoit Jacob
ba5b5f6a4b fix compilation 2010-04-17 11:29:25 -04:00
Hauke Heibel
cc33a56140 Added MSVC stack allocation support. 2010-04-17 16:08:17 +02:00
Gael Guennebaud
0326a51f89 fix use of uninitialzed calues 2010-04-17 05:23:23 +02:00
Benoit Jacob
797f63044a oops, forgot to add DenseDirectAccessBase 2010-04-16 14:46:55 -04:00
Benoit Jacob
d9ee28851e fix ei_blas_traits directaccess check: in the case of vectors, having a nontrivial inner stride is OK. 2010-04-16 14:12:53 -04:00
Benoit Jacob
04c663840b * add some 1x1 tests
* temporarily disable tests that strangely fail, with a big FIXME
2010-04-16 12:32:33 -04:00
Benoit Jacob
0ab431d7b8 * merge with mainline
* adapt Eigenvalues module to the new rule that the RowMajorBit must have the proper value for vectors
* Fix RowMajorBit in ei_traits<ProductBase>
* Fix vectorizability logic in CoeffBasedProduct
2010-04-16 11:25:50 -04:00
Benoit Jacob
ff6a46105d * Refactoring of the class hierarchy: introduction of DenseDirectAccessBase, removal of extra _Base/_Options template parameters.
* Introduction of strides-at-compile-time so for example the optimized code really knows when it needs to evaluate to a temporary
* StorageKind / XprKind
* Quaternion::setFromTwoVectors: use JacobiSVD instead of SVD
* ComplexSchur: support the 1x1 case
2010-04-16 10:13:32 -04:00
Gael Guennebaud
ea1a2df370 taucs: make SupernodalMultifrontal the default mode 2010-04-15 12:42:17 +02:00
Gael Guennebaud
0b6b316f18 an attempt to fix compilation with MSVC 2010-04-15 12:36:28 +02:00
Gael Guennebaud
a2324d6265 fix sparse squared norm 2010-04-13 10:40:55 +02:00
Jitse Niesen
614fbe497d Merge. 2010-04-12 18:55:27 +01:00
Jitse Niesen
574ad9efbd Move computation of eigenvalues from RealSchur to EigenSolver. 2010-04-12 18:54:15 +01:00
Jitse Niesen
73d3a27667 RealSchur: Make sure zeros are really zero (cont'd); add default ctor, docs. 2010-04-12 18:14:32 +01:00
Gael Guennebaud
0b0366a53d cholmod: assume selfadjoint matrix by default since selfadjoint flag has been removed 2010-04-09 13:37:05 +02:00
Jitse Niesen
872df22ca4 RealSchur: Make sure zeros are really zero; simplify initFrancisQRStep(). 2010-04-09 11:23:17 +01:00
Gael Guennebaud
1ad6c79467 merge 2010-04-08 13:29:31 +02:00
Gael Guennebaud
7a59f9ae01 make sure that changing CMAKE_INSTALL_PREFIX is properly taken into account 2010-04-08 13:28:21 +02:00
Jitse Niesen
7dea3a33a5 RealSchur: change parameter lists; minor rewrite of computeShift(). 2010-04-07 17:29:12 +01:00
Jitse Niesen
b6829e1d5b RealSchur: use makeHouseholder() to construct the transformation. 2010-04-07 10:27:27 +01:00
Jitse Niesen
cc57df9bea RealSchur: Rename l and n to il and iu. 2010-04-06 18:26:30 +01:00
Jitse Niesen
9fad1e392b RealSchur: split computation in smaller functions. 2010-04-06 17:45:46 +01:00
Jitse Niesen
7dc56b3dfb RealSchur: Use Householder module in Francis QR step. 2010-04-06 16:43:07 +01:00
Jitse Niesen
86df74c765 RealSchur: reduce scope of temporary variables in hqr2(). 2010-04-06 15:26:09 +01:00
Jitse Niesen
dad50338b8 RealSchur: Use PlanarRotation in "found two real eigenvalues" branch. 2010-04-06 15:12:21 +01:00
Jitse Niesen
d88d1cfa62 Merge. 2010-04-02 21:33:34 +01:00
Jitse Niesen
79e1ce6093 RealSchur and EigenSolver: some straightforward renames. 2010-04-02 21:05:32 +01:00
Jitse Niesen
a16a36ecf2 Add tests for real and complex Schur; extend test for Hessenberg.
Make a minor correction to the ComplexSchur class.
2010-04-02 14:32:20 +01:00
Hauke Heibel
9d6afdeb22 ei_psqrt fix for zero input 2010-04-01 15:10:52 +02:00
Jitse Niesen
3a14a13533 Split computation of real Schur form in EigenSolver to its own class.
This is done with the minimal amount of work, so the result is very rough.
2010-04-01 12:32:56 +01:00
Jitse Niesen
8cfa672fe0 Use HessenbergDecomposition in EigenSolver. 2010-03-31 21:50:18 +01:00
Jitse Niesen
1b3f7f2fee Extend documentation and add examples for EigenSolver class. 2010-03-31 11:59:11 +01:00
Benoit Jacob
338ec0390f let the cast functor use the new ei_cast() 2010-03-30 14:51:47 -04:00
Benoit Jacob
16e416b8d7 generalize the idea of the previous commit to all kinds of casts, see this forum thread:
http://forum.kde.org/viewtopic.php?f=74&t=86914
this  is important to allow users to support custom types that don't have the needed conversion operators.
2010-03-30 14:47:45 -04:00
Benoit Jacob
9e0d8697c7 add ei_cast_to_int, we are indeed somethings (e.g. in IO.h) casting scalars to int and the only way to allow users to extend that to their own scalar types that don't have int cast operators, was to allow them specialize ei_cast_to_int_impl. 2010-03-30 14:16:54 -04:00
Benoit Jacob
8f99ae5ea4 move the computation of the number of significant digits to a templated helper struct, that can be specialized to custom types if needed. Should address this request:
http://forum.kde.org/viewtopic.php?f=74&t=86914
2010-03-30 11:38:09 -04:00
Jitse Niesen
e6300efb5c Extend documentation for HessenbergDecomposition. 2010-03-28 17:33:56 +01:00
Thomas Capricelli
0a5c2d8a54 fix misc warnings, more importantly when NDEBUG is defined, assert() is a
nop.
2010-03-27 18:58:29 +01:00
Jitse Niesen
af08770890 Center version number on main page of API documentation. 2010-03-26 12:36:42 +00:00
Manuel Yguel
eb0c921a58 Fix some doc typos. 2010-03-25 16:54:00 +01:00
Thomas Capricelli
df1cbe8c3d fix display of modules list in documentation 2010-03-25 08:50:23 +01:00
Manuel Yguel
671cfb7ad0 Fix wrong header and warnings in polynomialutils.cpp 2010-03-25 03:31:33 +01:00
Manuel Yguel
89f2d5667f Add the possibility to use the polynomial solver of the gsl. 2010-03-25 03:25:47 +01:00
Manuel Yguel
5ef83d6a6c Add missing test files for Polynomials module. 2010-03-25 03:21:52 +01:00
Manuel Yguel
9a4a08da46 Creation of the Polynomials module with the following features:
* convenient functions:
  - Horner and stabilized Horner evaluation
  - polynomial coefficients from a set of given roots
  - Cauchy bounds
* a QR based polynomial solver
2010-03-25 03:15:05 +01:00
Gael Guennebaud
4e871c6c80 blas: fix compilation and build both a shared and static lib 2010-03-24 19:34:18 +01:00
Jitse Niesen
c68098b9be Clean up ComplexSchur::compute() . 2010-03-24 14:10:33 +00:00
Jitse Niesen
13a1b0ba27 Add snippets file which should have been added in the previous commit. 2010-03-24 13:04:03 +00:00
Jitse Niesen
37e17938e9 Extend documentation of ComplexSchur and add examples. 2010-03-23 12:49:09 +00:00
Jitse Niesen
307c428253 Move documentation of MatrixBase methods in MatrixFunctions to module page.
I think that because MatrixFunctions is in unsupported/ and MatrixBase is
not, doxygen does not include the MatrixBase methods defined and documented
in the MatrixFunctions module with the other MatrixBase methods. This is a
kludge, but at least the documentation is not lost.
2010-03-22 13:58:19 +00:00
Jitse Niesen
525d6b655f Merge. 2010-03-21 21:59:00 +00:00
Jitse Niesen
8e5d2b6fc4 Rename Complex in ComplexSchur and ComplexEigenSolver to ComplexScalar
for consistency with the RealScalar type; correct ComplexEigenSolver
docs to take non-diagonalizable matrices into account; refactor
ComplexEigenSolver::compute().
2010-03-21 21:57:34 +00:00
Benoit Jacob
1803db6e84 merge 2010-03-21 11:28:31 -04:00
Benoit Jacob
92da574ec2 * allow matrix dimensions to be 0 (also at compile time) and provide a specialization
of ei_matrix_array for size 0
* adapt many xprs to have the right storage order, now that it matters
* add static assert on expressions to check that vector xprs
  have the righ storage order
* adapt ei_plain_matrix_type_(column|row)_major
* implement assignment of selfadjointview to matrix
  (was before failing to compile) and add nestedExpression() methods
* expand product_symm test
* in ei_gemv_selector, use the PlainObject type instead of a custom Matrix<...> type
* fix VectorBlock and Block mistakes
2010-03-21 11:28:03 -04:00
Gael Guennebaud
8b093dd2df oops, fix symv (innerStride instead of outerStride) 2010-03-20 20:51:44 +01:00
Jitse Niesen
d91ffffc37 Allow ComplexEigenSolver and ComplexSchur to work with real matrices.
Add a test which covers this case.
2010-03-20 17:04:40 +00:00
Hauke Heibel
fbdf16d425 Added x()/y() and z() access functions to translations. 2010-03-19 20:11:40 +01:00
Jitse Niesen
d3e271c47e Extend documentation and add examples for ComplexEigenSolver. 2010-03-19 18:23:36 +00:00
Benoit Jacob
547269da35 fix the flags and matrix options, to always have the right RowMajor bit in the vector case 2010-03-19 02:12:23 -04:00
Benoit Jacob
9dba86df0b merge 2010-03-18 20:11:38 -04:00
Benoit Jacob
089bd89512 compile with gcc 4.5 2010-03-18 20:10:24 -04:00
Jitse Niesen
0ee10f7da4 Document member functions and types of ComplexEigenSolver. 2010-03-18 13:42:17 +00:00
Jitse Niesen
04a4e22c58 API change: ei_matrix_exponential(A) --> A.exp(), etc
As discussed on mailing list; see
http://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2010/02/msg00190.html
2010-03-16 17:26:55 +00:00
Gael Guennebaud
d536fef1bb fix and extend replicate optimization, and add the packet method though it is currently disabled 2010-03-15 10:39:00 +01:00
Hauke Heibel
d68b85744f Replaced strong with weak inlines in CwiseUnaryOp. 2010-03-14 13:01:10 +01:00
Hauke Heibel
6d08f71a2d Removed strong inlines which cannot always be inlined. 2010-03-14 12:09:29 +01:00
Hauke Heibel
0e5a232dae Ups - again a missing typename. 2010-03-14 11:58:44 +01:00
Hauke Heibel
fc20e6fd55 Try to avoid modulo operations in Replicate if possible. 2010-03-13 14:33:39 +01:00
Hauke Heibel
b9644f3323 Propagate fixed size dimensions if available (on MSVC it leads >2.5x speedup for some reductions). 2010-03-13 13:15:27 +01:00
Benoit Jacob
3e08c22028 attempt to fix #101 2010-03-11 12:41:46 -05:00
Hauke Heibel
2a82b162d7 Nest expression within MatrixWrapper by value. 2010-03-10 17:13:07 +01:00
Hauke Heibel
b20b393a4e Enable resizing of Arrays. 2010-03-10 17:12:45 +01:00
Hauke Heibel
dd9ff1b9a6 Fix MSVC warnings. 2010-03-09 09:04:21 +01:00
Benoit Jacob
74a7c5caee implement the idea that row-vectors have the RowMajorBit and col-vectors don't. 2010-03-09 00:16:07 -05:00
Benoit Jacob
b81351cb07 nomalloc: minor cleanup 2010-03-08 21:30:06 -05:00
Adolfo Rodriguez Tsouroukdissian
5a36f4a8d1 Propagate all five matrix template parameters to members and temporaries of decomposition classes. One particular advantage of this is that decomposing matrices with max sizes known at compile time will not allocate.
NOTE: The ComplexEigenSolver class currently _does_ allocate (line 135 of Eigenvalues/ComplexEigenSolver.h), but the reason appears to be in the implementation of matrix-matrix products, and not in the decomposition itself.
The nomalloc unit test has been extended to verify that decompositions do not allocate when max sizes are specified. There are currently two workarounds to prevent the test from failing (see comments in test/nomalloc.cpp), both of which are related to matrix products that allocate on the stack.
2010-03-08 19:31:27 +01:00
Gael Guennebaud
898762529e update the product selection logic to use the Max* sizes 2010-03-08 22:55:58 +01:00
Gael Guennebaud
2af9468dd1 update the product selection logic to use the Max* sizes 2010-03-08 22:54:31 +01:00
Gael Guennebaud
9a3b00c040 add missing cmake directives for arch/Default 2010-03-08 22:15:35 +01:00
Thomas Capricelli
b2e7329356 tests : fix compilation issues, adding <iostream> and removing
<Eigen/Array>
2010-03-08 20:34:24 +01:00
Benoit Jacob
89343a38af * Fix #97 : Householder operations on 1x1 matrices
* Fix VectorBlock on 1x1 "vectors"
* remove useless makeTrivialHouseholder function
2010-03-08 12:37:04 -05:00
Benoit Jacob
4293a4d1af * let a = foo() work when a is a row-vector xpr and foo() returns a ReturnByValue col-vector
* remove a few useless resize() in evalTo() implementations
2010-03-08 10:34:59 -05:00
Mark Borgerding
3565e89be2 minor edit 2010-03-07 23:57:02 -05:00
Mark Borgerding
101cc03176 merge 2010-03-07 23:46:26 -05:00
Mark Borgerding
e31929337e needed different proxy return types for fwd,inv to work around static asserts 2010-03-07 22:05:48 -05:00
Mark Borgerding
5b2c8b77df created FFT::fwd and FFT::inv with ReturnByValue 2010-03-07 21:31:06 -05:00
Hauke Heibel
9fe040ad29 Reintroduced the if-clause for MSVC ei_ploadu via _loadu_. 2010-03-07 14:05:26 +01:00
Gael Guennebaud
aeea00a9cf fix compilation 2010-03-07 12:32:24 +01:00
Gael Guennebaud
3130b7a722 bugcount--, this time trmm 2010-03-06 22:57:50 +01:00
Gael Guennebaud
1958b7eccc stride() => inner/outerStride() 2010-03-06 22:39:15 +01:00
Gael Guennebaud
4402034678 pff I introduced much too many bugs latey, count-- 2010-03-06 22:24:50 +01:00
Gael Guennebaud
61ce1de048 fix symm 2010-03-06 22:15:59 +01:00
Gael Guennebaud
a7d199bf9a fix trsolve 2010-03-06 21:37:14 +01:00
Gael Guennebaud
6f0b96dcf4 fix issue #100 (fix syrk) 2010-03-06 21:16:43 +01:00
Gael Guennebaud
271fc84e47 bugfix in gebp for 32bits x86 2010-03-06 20:52:20 +01:00
Benoit Jacob
c4f8afdf49 #undef minor at the right place 2010-03-06 14:44:57 -05:00
Benoit Jacob
7e2afe7e95 remove the __ARM_NEON__ check there since Konstantinos said he removed it but apparently didn't commit :) 2010-03-06 12:11:08 -05:00
Benoit Jacob
bf0a21a695 * disable static alignment on QCC
* remove obsolete #error
2010-03-06 09:28:58 -05:00
Benoit Jacob
2bd31d3fbc * include Macros.h much earlier: since it takes care of the alignment platform detection, it is needed before we do the vectorization stuff in Eigen/Core !!
* kill EIGEN_DONT_ALIGN_HEAP option (one should use EIGEN_DONT_ALIGN)
* rename EIGEN_DONT_ALIGN_STACK to EIGEN_DONT_ALIGN_STATICALLY. hope it's a better name.
2010-03-06 09:05:15 -05:00
Hauke Heibel
61a14539c7 merge 2010-03-06 11:48:19 +01:00
Benoit Jacob
f03d95348d introduce EIGEN_DONT_ALIGN_STACK (disables alignment attributes) and EIGEN_DONT_ALIGN_HEAP (disables aligned malloc)...
you can still use EIGEN_DONT_ALIGN to do both at once.
2010-03-06 02:17:37 -05:00
Gael Guennebaud
afd7ee759b fix copy pasted comment 2010-03-05 21:35:11 +01:00
Konstantinos Margaritis
273b236f72 Altivec brought up to date. Most tests pass and performance is better than before too! 2010-03-05 22:28:49 +02:00
Hauke Heibel
51b0159c96 Fixed line endings. 2010-03-05 18:11:54 +01:00
Gael Guennebaud
f2a246c225 add a small program to bench all combinations of small products 2010-03-05 17:16:19 +01:00
Gael Guennebaud
c442208358 clean a bit the bench_gemm files 2010-03-05 11:35:43 +01:00
Gael Guennebaud
5f172cd01f add a FIXME 2010-03-05 10:45:29 +01:00
Gael Guennebaud
48d0595c29 * dynamically adjust the number of threads
* disbale parallelisation if we already are in a parallel session
2010-03-05 10:44:31 +01:00
Gael Guennebaud
dd961f8c60 add an option to test ompenmp 2010-03-05 10:22:27 +01:00
Gael Guennebaud
62ac021606 fix openmp version for scalar types different than float 2010-03-05 10:16:25 +01:00
Gael Guennebaud
d13b877014 remove the 1D and 2D parallelizer, keep only the GEMM specialized one 2010-03-05 10:04:17 +01:00
Gael Guennebaud
24ef5fedcd minor cleaning 2010-03-05 09:57:04 +01:00
Gael Guennebaud
279ad44509 merge 2010-03-05 09:46:58 +01:00
Gael Guennebaud
620bd28480 enable posix_memalign for QNX 2010-03-05 09:44:21 +01:00
Gael Guennebaud
7e2683dc39 merge 2010-03-04 18:59:56 +01:00
Gael Guennebaud
0964810fba merge 2010-03-04 18:59:03 +01:00
Gael Guennebaud
ea8cad5151 make the number of registers easier to configure per architectures 2010-03-04 18:58:12 +01:00
Gael Guennebaud
cefd9b8888 merge with default branch 2010-03-04 18:47:52 +01:00
Hauke Heibel
1723068694 Moved x()/y()/z() and w() access functions to DenseBase; they are now available for Arrays as well. 2010-03-04 18:33:51 +01:00
Gael Guennebaud
8ed1ef4469 add a minor FIXME 2010-03-04 18:30:28 +01:00
Benoit Jacob
68d94d914e integer division is vectorizable on no SIMD platform, not just SSE. 2010-03-04 09:03:06 -05:00
Konstantinos Margaritis
710bc073a7 arm_neon.h is a standard header file, fixed 2010-03-03 12:15:34 -06:00
Benoit Jacob
6c89fd4df0 minor cleanup 2010-03-03 13:16:21 -05:00
Gael Guennebaud
7dd81aad74 factorize default performance related settings to a single file
included after the architecture specific files such that they
can be adapted by each platform.
2010-03-03 18:47:58 +01:00
Konstantinos Margaritis
112c550b4a Added initial NEON support, most tests pass however we had to use some hackish workarounds
as gcc on ARM (both CodeSourcery 4.4.1 used and experimental 4.5) fail to
ensure proper alignment with __attribute__((aligned(16))). This has to be
fixed upstream to remove the workarounds.
2010-03-03 11:25:41 -06:00
Benoit Jacob
45d19afb18 cleanup/simplification in computation of matrix flags 2010-03-03 09:58:43 -05:00
Benoit Jacob
7dbe806711 merge 2010-03-03 09:55:46 -05:00
Benoit Jacob
6a92168915 Backed out changeset 2f3d685e0c
This was implementing deep changes that after discussion on the mailing list seem to need further discussion/thinking.
2010-03-03 09:54:50 -05:00
Hauke Heibel
aa6570c3a3 Added a missing inline hints.
Removed a useless Nested temporary.
2010-03-03 15:24:58 +01:00
Gael Guennebaud
b0ffd9bf04 clean #defined tokens, and use clock_gettime for the real time 2010-03-03 09:41:29 +01:00
Gael Guennebaud
2f3d685e0c a matrix (or array) does not always have the LinearAccessBit!
=> fixes in outerStride and matrix flags
2010-03-02 15:31:39 +01:00
Gael Guennebaud
0ed5edd24d blas: add a default implementation of xerbla 2010-03-02 14:50:41 +01:00
Gael Guennebaud
a76c296e7f blas: fix most of level1 functions 2010-03-02 14:45:43 +01:00
Benoit Jacob
bca04bd983 fix compilation 2010-03-02 08:41:35 -05:00
Gael Guennebaud
a2d7c239f5 blas: fix HEMM and HERK 2010-03-02 12:44:40 +01:00
Gael Guennebaud
7fd6458fec selfadjoint: do not reference the imaginary part of the diagonal 2010-03-02 12:43:55 +01:00
Gael Guennebaud
abfed301cb blas: fix SYRK 2010-03-02 09:37:10 +01:00
Eamon Nerbonne
ff6b94d6d0 BenchTimer: avoid warning about symbol redefinition on win32, and include <Eigen/Core> (required to compile) 2010-03-02 08:46:11 +01:00
Gael Guennebaud
f1d3101956 blas: add warnings for non implemented functions 2010-03-03 09:32:10 +01:00
Hauke Heibel
32823caa62 Adapted the comment and removed it from the public dox. 2010-03-03 07:52:19 +01:00
Gael Guennebaud
3295c1c3e6 product selector: the symmetric case 2010-03-02 23:18:13 +01:00
Hauke Heibel
afad108b5f Added a comment to prevent placing an EIGEN_STRONG_INLINE where it makes no sense. 2010-03-02 19:36:21 +01:00
Eamon Nerbonne
3efb3cc828 Changed product type selector to fix perf regression. 2010-03-02 12:08:49 +01:00
Gael Guennebaud
c7828ac45c add missing implementation of uniform scaling products 2010-03-02 17:38:40 +01:00
Hauke Heibel
3cc9e3f5bb Fixes a compilation issue for MSVC. 2010-03-01 19:56:24 +01:00
Gael Guennebaud
a7b9250ad0 blas interface: fix compilation, fix GEMM, SYMM, TRMM, and TRSM,
i,e., they all pass the blas test suite. More to come
2010-03-01 19:06:07 +01:00
Jitse Niesen
a1ac56a7c7 Add (failing) test for computing HouseholderQR of a 1x1 matrix. 2010-03-01 13:46:41 +00:00
Gael Guennebaud
65eba35f98 rm useless omp shared directive 2010-03-01 13:34:44 +01:00
Gael Guennebaud
1710c07f63 remove Qt's atomic dependency, I don't know what I was doing wrong... 2010-03-01 13:09:47 +01:00
Jitse Niesen
2d7bd1ec91 Make MatrixFunctions tests more robust.
* Use absolute error instead of relative error.
* Test on well-conditioned matrices.
* Do not repeat the same test g_repeat times (bug fix).
* Correct diagnostic output in matrix_exponential.cpp .
2010-03-01 12:05:57 +00:00
Gael Guennebaud
31aa17e4ef GEMM: move the first packing of A' before the packing of B' 2010-03-01 11:10:30 +01:00
Gael Guennebaud
aeff3ff391 make Aron's idea work using Qt's atomic implementation for the synchronisation 2010-03-01 10:57:32 +01:00
Benoit Jacob
f1f3c30ddc remove the hack to make the static assertion on types actually show up.
indeed, now that we use the meta selector for transposing as needed, the static asserts work very well.
2010-02-28 11:10:13 -05:00
Benoit Jacob
07023b94d8 forgot defined(...) 2010-02-28 10:11:28 -05:00
Benoit Jacob
9334ed4444 on 64-bit systems, glibc's malloc returns 16-byte aligned ptrs, and we now take advantage of that. 2010-02-28 10:10:53 -05:00
Benoit Jacob
a480e7e60f * fix ei_handmade_aligned_realloc (was calling realloc on wrong ptr)
* add missing std::  (at least for QNX compatibility)
* add big comments to "structure" the file
2010-02-28 09:10:41 -05:00
Hauke Heibel
ff8c2149c1 Added a generic reallocation implementation based on ei_aligned_malloc/_free.
Rewrote ei_handmade_aligned_realloc such that it is now using std::realloc.
Reorganized functions in Memory.h for better readability.
Add missing <cerrno> include to Core (it's now required in Memory.h).
2010-02-28 14:32:57 +01:00
Hauke Heibel
40bd69fbaa Hide some internal stuff from the docs. 2010-02-28 12:56:37 +01:00
Benoit Jacob
1d9c18a8f3 comment out cerr's 2010-02-28 00:53:06 -05:00
Thomas Capricelli
eb3a3351cc misc cleaning 2010-02-28 02:51:35 +01:00
Benoit Jacob
27f5250258 Only include <iosfwd> unless either EIGEN_DEBUG_ASSIGN is defined or we're in eigen2 support mode 2010-02-27 19:04:22 -05:00
Benoit Jacob
e84f7e07e9 add ei_posix_memalign_realloc 2010-02-27 18:57:07 -05:00
Benoit Jacob
22fabb8940 add missing inline keyword, thanks to Eamon. 2010-02-27 17:51:48 -05:00
Thomas Capricelli
e0830cb1b7 Use a specialization of test_is_equal() instead of defining COMPARE() 2010-02-27 17:56:22 +01:00
Hauke Heibel
6c9eb36222 Added support for realloc based conservative resizing. 2010-02-27 17:25:07 +01:00
Hauke Heibel
78b2c7e16e Fixed a typo. 2010-02-27 17:24:42 +01:00
Benoit Jacob
3f393490ad dot: handle the rowvector.dot(colvector) case where one needs to transpose. 2010-02-27 11:19:14 -05:00
Thomas Capricelli
15a33622ac * define COMPARE(,), which prints expected/actual results in case of failure
* use it in test/NonLinearOptimization.cpp
2010-02-27 16:30:15 +01:00
Benoit Jacob
d9f6380499 Remove the dot product's separate implementation and use cwiseProduct.sum instead.
Also take special care to get nicely working static  assertions.
2010-02-27 10:03:27 -05:00
Benoit Jacob
b5c79e7291 add examples 2010-02-26 22:26:21 -05:00
Benoit Jacob
2c9a91812e merge 2010-02-26 21:47:54 -05:00
Benoit Jacob
814e40c72a let redux use the new ByOuterInner accessors 2010-02-26 21:46:43 -05:00
Benoit Jacob
4927841cba Document Map and Stride, add examples. 2010-02-26 21:29:04 -05:00
Benoit Jacob
b1f666d007 Fix Map-with-Stride and cover it by new unit tests. 2010-02-26 20:12:51 -05:00
Gael Guennebaud
6924bf2e99 implement Aron's idea of interleaving the packing with the first computations 2010-02-26 15:58:22 +01:00
nerbonne
c72a5074e6 Fixed perf problems for vector subtraction: inlining wasn't always happening when necessary. 2010-02-26 15:46:43 +01:00
Benoit Jacob
32115bff1e * add VERIFY_IS_EQUAL, should compile faster and it's natural when no arithmetic is involved.
* rename 'submatrices' test to 'block'
* add block-inside-of-block tests
* remove old cruft
* split diagonal() tests into separate file
2010-02-26 09:03:13 -05:00
Gael Guennebaud
ac425090f3 BTL: allow to bench real time 2010-02-26 14:57:49 +01:00
Gael Guennebaud
8d4a0e6753 fix compilation without openmp 2010-02-26 14:57:22 +01:00
Gael Guennebaud
c05047d28e fix some BTL issues 2010-02-26 12:51:20 +01:00
Gael Guennebaud
3ac2b96a2f implement a smarter parallelization strategy for gemm avoiding multiple
paking of the same data
2010-02-26 12:32:00 +01:00
Jitse Niesen
d86f5339b2 ComplexSchur: fix bug introduced in my previous commit.
The value of c is actually used a few lines later.
2010-02-26 09:47:17 +00:00
Benoit Jacob
f56ac04c34 DenseBase::IsRowMajor now takes the special case of vectors into account. 2010-02-25 21:24:42 -05:00
Benoit Jacob
b1c6c215a4 merge 2010-02-25 21:07:30 -05:00
Benoit Jacob
769641bc58 * Implement the ByOuterInner accessors
* use them (big simplification in Assign.h)
* axe (Inner|Outer)StrideAtCompileTime that were just introduced
* ei_int_if_dynamic now asserts that the size is the expected one: adapt to that in Block.h
* add rowStride() / colStride() in DenseBase
* implement innerStride() / outerStride() everywhere needed
2010-02-25 21:01:52 -05:00
Jitse Niesen
90e4a605ef ComplexSchur: compute shift more stably, introduce exceptional shifts.
Both the new computation of  the eigenvalues of a 2x2 block and the
exceptional shifts are taken from EISPACK routine COMQR.
2010-02-25 22:33:38 +00:00
Gael Guennebaud
53bae6b3f8 update matrix product selection rules for 1xSmallxLarge and the transposed case 2010-02-25 21:59:25 +01:00
Gael Guennebaud
959a1b5d63 detect and implement inplace permutations 2010-02-25 16:30:58 +01:00
Gael Guennebaud
d9ca0c0d36 optimize inverse permutations 2010-02-25 15:31:15 +01:00
Benoit Jacob
77c922bf05 * move the 's': InstructionsSet ---> InstructionSets
* proper capitalization: SSE, AltiVec
2010-02-25 06:43:45 -05:00
Thomas Capricelli
50a5ac3c4b oops, fix typo 2010-02-25 05:31:22 +01:00
Thomas Capricelli
00bc535b66 provide a static method to describe which SIMD instructions are used 2010-02-24 21:52:08 +01:00
Thomas Capricelli
0f3d69b65e Provide "eigen" defines to decide which instruction set is used
(sse3, ssse3 and sse4), independantly from the compiler.
Only those defines should be used in other places, and the user can
rely on those to know which sets are used.
2010-02-24 21:43:30 +01:00
Gael Guennebaud
7c98c04412 add reconstructedMatrix() to LLT, and LUs
=> they show that some improvements have still to be done
   for permutations, tr*tr, trapezoidal matrices
2010-02-24 19:16:10 +01:00
Gael Guennebaud
a7e4c0f825 make testsuite aware of EIGEN_CTEST_ARGS 2010-02-24 11:28:38 +01:00
Gael Guennebaud
f7aa9873ca * fix LDLT's default ctor use
* add a reconstructedMatrix() function to LDLT for debug purpose
2010-02-24 10:40:16 +01:00
Benoit Jacob
60325b8330 actually, this is not even meant to be a termination criterion. so the proper fix is this. 2010-02-23 16:10:26 -05:00
Benoit Jacob
3d066f4bc7 LDLT:
* fix bug thanks to Ben Goodrich: we were terminating at the wrong place, leaving some matrix coefficients with wrong values.
* don't use Higham's formula here: we're not trying to be rank-revealing.
2010-02-23 16:05:37 -05:00
Benoit Jacob
d92df336ad Further LU test improvements. I'm not aware of any test failures anymore, not even with huge numbers of repetitions.
Finally the createRandomMatrixOfRank() function is renamed to createRandomPIMatrixOfRank, where PI stands for 'partial isometry', that is, a matrix whose singular values are 0 or 1.
2010-02-23 15:40:24 -05:00
Gael Guennebaud
a1e1103328 add a 2D parallelizer 2010-02-23 21:40:15 +01:00
Gael Guennebaud
022e2f5ef4 fix typo 2010-02-23 18:24:15 +01:00
Gael Guennebaud
68eaefa5d4 update BTL (better timer, eigen2 => eigen3, etc) 2010-02-23 18:23:12 +01:00
Benoit Jacob
7dc75380c1 * FullPivLU: replace "remaining==0" termination condition (from Golub) by a fuzzy compare
(fixes lu test failures when testing solve())
* LU test: set appropriate threshold and limit the number of times that a specially tricky test
  is run. (fixes lu test failures when testing rank()).
* Tests: rename createRandomMatrixOfRank to createRandomProjectionOfRank
2010-02-23 09:04:59 -05:00
Gael Guennebaud
4a0d41c5fb merge 2010-02-23 14:34:55 +01:00
Hauke Heibel
1fd8d7b96a Attempt to fix PGI compilation issue. 2010-02-23 11:35:51 +01:00
Mark Borgerding
b528b747c1 merge 2010-02-22 21:44:30 -05:00
Mark Borgerding
5d530e0373 enable caller to supply FFT length for Eigen Matrix interface functions to effect zero pad or source shrink at Nyquist bin 2010-02-22 21:43:30 -05:00
Gael Guennebaud
3beedba244 merge 2010-02-22 21:32:29 +01:00
Thomas Capricelli
d3b314569b provide default values for CXX, remove duplicate define 2010-02-22 15:39:17 +01:00
Jitse Niesen
e6f0cd7121 Merge. 2010-02-22 16:17:43 +00:00
Gael Guennebaud
d2b0eadf52 fully adapt the gebp kernel and optimize it for CPU with only 8 registers
(transplanted from 2ed88ebbf1995be90b8d0c25ff10248c8f56d023)
2010-02-22 16:35:05 +01:00
Gael Guennebaud
51a4b929a1 implement an even lower level version of the gebp kernel for MSVC (it seems to be faster with gcc as well)
(transplanted from 9a5643551fe068497f84a81cd8986febf1918382)
2010-02-22 15:18:29 +01:00
Hauke Heibel
3e6ab8f93b ups 2010-02-22 11:34:25 +01:00
Hauke Heibel
d5af5ab92b Added getRealTime() for windows. 2010-02-22 11:23:27 +01:00
Gael Guennebaud
f797ba0abe extend the bench timer to allow benchmarking of parallel code,
improvements are welcome
2010-02-22 11:04:35 +01:00
Gael Guennebaud
801440c519 fix BTL's eigen interface
(transplanted from 437f40acc1
)
2010-02-22 09:32:16 +01:00
Gael Guennebaud
eb905500b6 significant speedup in the matrix-matrix products 2010-02-23 13:06:49 +01:00
Gael Guennebaud
d579d4cc37 oops 2010-02-22 17:57:15 +01:00
Gael Guennebaud
fc4a85ecd5 fully adapt the gebp kernel and optimize it for CPU with only 8 registers 2010-02-22 16:35:05 +01:00
Gael Guennebaud
dd569c7d0f merge 2010-02-22 15:19:19 +01:00
Gael Guennebaud
e00f1fd125 implement an even lower level version of the gebp kernel for MSVC (it seems to be faster with gcc as well) 2010-02-22 15:18:29 +01:00
Hauke Heibel
6730fd9f3f Port BenchTimer fix. 2010-02-22 11:42:58 +01:00
Gael Guennebaud
4ba25a8d5c merge 2010-02-22 11:30:36 +01:00
Gael Guennebaud
aaaf855a88 add a small benchmark to quickly bench/compare SMP support 2010-02-22 11:09:57 +01:00
Gael Guennebaud
1c6bdb4060 merge 2010-02-22 11:09:05 +01:00
Gael Guennebaud
3e62fafce8 clean a bit the parallelizer 2010-02-22 11:08:37 +01:00
Gael Guennebaud
b20935be9b add initial openmp support for matrix-matrix products
=> x1.9 speedup on my core2 duo
2010-02-22 09:40:34 +01:00
Gael Guennebaud
437f40acc1 fix BTL's eigen interface 2010-02-22 09:32:16 +01:00
Thomas Capricelli
1a70f3b48d fix compilation 2010-02-21 19:30:11 +01:00
Hauke Heibel
a901bed33a Added IsRowMajor enum to DenseBase. 2010-02-21 18:26:14 +01:00
Hauke Heibel
e2a059863e Added missing precision/eps functions to AutoDiffScalar. 2010-02-21 15:44:34 +01:00
Hauke Heibel
f079f52b58 merge 2010-02-21 15:25:28 +01:00
Hauke Heibel
f75e6773b0 Added ei_traits<Quaternion>::PlainObject. 2010-02-21 15:24:10 +01:00
Hauke Heibel
ac8ff44278 Tried to get rid of MSVC warning D9025. 2010-02-21 15:23:51 +01:00
Thomas Capricelli
a7d085eb4e NonLinearOptimization : clean 'mode' handling from the old minpack code :
* this is actually a boolean, not an int
* use a better name
* can be set at initialization time instead of bloating all methods signatures
2010-02-21 12:41:37 +01:00
Gael Guennebaud
608959aa6f compilation fix in ldlt() for non matrix types 2010-02-21 10:29:19 +01:00
Gael Guennebaud
71fecd2371 add missing return value 2010-02-20 18:19:34 +01:00
Hauke Heibel
6b4cecc1c6 CMake cleanup. 2010-02-20 17:39:58 +01:00
Jitse Niesen
4e389b195d Change MatrixFunction::separation() parameter from 0.01 to 0.1 .
The latter is actually the value used in the literature.
2010-02-20 16:27:04 +00:00
Hauke Heibel
abc8c01080 Renamed PlainMatrixType to PlainObject (Array != Matrix).
Renamed ReturnByValue::ReturnMatrixType ReturnByValue::ReturnType (again, Array != Matrix).
2010-02-20 15:53:57 +01:00
Jitse Niesen
67ce07ea83 matrix_function test: replace expm(A).inverse() by expm(-A)
The latter is more stable. This fixes one of the issues with the test.
Also, make typedef's in MatrixFunctionReturnValue public; this is
necessary to get the test to compile.
2010-02-20 14:45:50 +00:00
Hauke Heibel
f0c8dcf1e2 Renamed AnyMatrixBase to EigenBase. 2010-02-20 15:26:02 +01:00
Gael Guennebaud
4f8773c23a fix stupid enum values 2010-02-19 17:46:36 +01:00
Benoit Jacob
5491531a81 add Stride.h 2010-02-18 20:44:17 -05:00
Benoit Jacob
b73e22905d miserable half-working state, commiting to a fork just in case, just to perfect
my day, my hard disk would die.
Will write a more detailed commit message once it's working.
2010-02-18 20:42:38 -05:00
Jitse Niesen
39d9f0275b Update matrix_exponential test after API change in ei_matrix_function
Apologies for forgetting this yesterday and not testing properly.
2010-02-17 09:50:11 +00:00
Mark Borgerding
2cd1ad2ded typo in merge 2010-02-16 22:06:23 -05:00
Mark Borgerding
f200c84d9f merge 2010-02-16 21:41:04 -05:00
Mark Borgerding
8f51a4ac90 found out about little-documented FFTW_PRESERVE_INPUT which has effect on c2r transforms 2010-02-16 20:44:48 -05:00
Jitse Niesen
319bf3130b Use ReturnByValue to return result of ei_matrix_function(), ... 2010-02-16 16:43:11 +00:00
Jitse Niesen
25019f0836 Use ReturnByValue to return result of ei_matrix_exponential() . 2010-02-15 19:17:25 +00:00
Gael Guennebaud
a9096b626b not all versions of gcc support -Wno-variadic-macros 2010-02-15 11:39:57 +01:00
Gael Guennebaud
016943f870 avoid 2 redundant calls to resize 2010-02-15 11:31:36 +01:00
Gael Guennebaud
dcb395c6f5 explicitly disable the use of evalTo for dense object 2010-02-15 11:09:33 +01:00
Gael Guennebaud
21d0eb3f11 the default implementation should really call evalTo 2010-02-15 11:01:55 +01:00
Gael Guennebaud
d00bff91ad workaround weird gcc 4.0.1 compilation error 2010-02-15 11:00:30 +01:00
Hauke Heibel
8519558d11 Workaround for compounds affected by #94. 2010-02-15 10:11:10 +01:00
Jitse Niesen
b18f737aa1 Test matrix functions with matrices with clustered imaginary eivals.
The idea is that these test MatrixFunction::swapEntriesInSchur(),
which is not covered by existing tests. This did not work out as
expected, but nevertheless it is a good test so I left it in.
2010-02-13 22:49:01 +00:00
Jitse Niesen
a4a2671fd0 Refactor matrix_function test in preparation of next commit. 2010-02-13 22:38:27 +00:00
Benoit Jacob
9251cfed9b this had to be done here, not at the end. 2010-02-12 09:03:16 -05:00
Benoit Jacob
37ca4200b2 Piotr's patch was missing many occurences of size_t. So,
using std::size_t;
This is the only way that we can ensure QCC support in the long term without having to think about it everytime.
2010-02-12 08:58:29 -05:00
Gael Guennebaud
a76950bdab fix a couple of ICE with gcc 4.0.1 2010-02-12 09:41:56 +01:00
Piotr Trojanek
1701a5d1f8 std:: namespace fixup for more restricive compilers such as QNX's QCC 2010-02-10 13:24:47 +01:00
Hauke Heibel
ae0a17d30b Here is the proper fix. 2010-02-11 11:39:02 +01:00
Hauke Heibel
13ce92f5ba Fixed notemporary unit test. 2010-02-11 11:31:36 +01:00
Hauke Heibel
93e86b0884 Fixed typos.
Replace NumTraits<bool>::dummy_precision() (three locations) by false in order to suppress warnings.
2010-02-11 11:31:22 +01:00
Gael Guennebaud
48f5980669 fix compilation (cwise and epsilon) 2010-02-11 09:24:29 +01:00
Thomas Capricelli
602778ea26 also fix tests for NumTraits<double>::epsilon() 2010-02-11 01:49:27 +01:00
Thomas Capricelli
fd19dd57d8 fix usage of epsilon wrt to latest API change 2010-02-11 01:43:36 +01:00
Thomas Capricelli
ccc58e935e fix usage of epsilon wrt to latest API change 2010-02-11 01:43:03 +01:00
Jitse Niesen
fc5fa77743 unsupported/Eigen/AlignedVector3: dummy_precision is now in NumTraits 2010-02-10 17:59:27 +00:00
Hauke Heibel
d0e8342a04 Fixed warning. 2010-02-10 18:00:36 +01:00
Gael Guennebaud
0ca67afe6a finally here is a simple solution making (a*b).diagonal() even faster than a.lazyProduct(b).diagonal() !! 2010-02-10 14:08:47 +01:00
Mark Borgerding
1d342e135c changed destination argument to reference 2010-01-22 22:52:13 -05:00
Mark Borgerding
141c746fc7 if the src.stride() != 1, then the layout is not continuous -- need to copy to temporary 2010-01-22 01:17:36 -05:00
Mark Borgerding
cd7912313d changed FFT function vector and Matrix args to pointer as Benoit suggested
implemented 2D Complex FFT for FFTW impl
2010-01-22 00:35:03 -05:00
Mark Borgerding
a30d42354f updated comments and played around with Map 2010-01-21 21:10:16 -05:00
Mark Borgerding
7a6cb2a39c added benchmark for unscaled and half-spectrum FFTs 2010-01-21 21:09:26 -05:00
Hauke Heibel
f1a025185a Added array() to ArrayBase and matrix() to MatrixBase(). 2010-01-21 17:55:09 +01:00
Jitse Niesen
dbf3af866e Remove some Array #includes. 2010-01-21 12:31:03 +00:00
Hauke Heibel
7bf5930496 Adapted Geometry includes.
Adapted the decomposition documentation regarding the solve signature.
2010-01-21 09:43:30 +01:00
Hauke Heibel
ecc71abdda Added the Array include's warning for GCC. 2010-01-20 21:36:50 +01:00
Hauke Heibel
26df0609e2 Corrected the Array include's deprecation warning for MSVC. 2010-01-20 20:56:52 +01:00
Hauke Heibel
85d80d0fcd merge 2010-01-20 20:51:42 +01:00
Hauke Heibel
5d48cc1f5b Moved the Array module to Core. 2010-01-20 20:51:01 +01:00
Manuel Yguel
71b64d3498 Updated tests for enhanced AlignedBox with volume, diagonal and better handling of integral types. 2010-01-20 14:55:10 +01:00
Gael Guennebaud
8918d18e21 Improved patch from Manuel Yguel:
Enhance AlignedBox to accept integral types and add some usefull methods: diagonal, volume, sample.
2010-02-10 11:40:55 +01:00
Gael Guennebaud
bb290977b8 add highest and lowest functions to NumTraits 2010-02-10 11:11:21 +01:00
Gael Guennebaud
fe0827495a * move dummy_precision and epsilon to NumTraits
* make NumTraits inherits std::numeric_limits
2010-02-10 10:52:28 +01:00
Hauke Heibel
c11df02f0d Deactivated test which requires variadic macros. 2010-02-09 20:52:15 +01:00
Hauke Heibel
5da3049b80 Regression tests for number of nested temporaries.
Moved EIGEN_DEBUG_MATRIX_CTOR to ei_matrix_storage to capture resize related allocations.
2010-02-09 20:32:48 +01:00
Gael Guennebaud
f96e6a6944 merge 2010-02-09 16:41:48 +01:00
Gael Guennebaud
71e580c4aa fix nesting in Arraywrapper and nesting_ops 2010-02-09 16:38:36 +01:00
Hauke Heibel
04333c6ace Now variadic macro related warnings should be supressed as well under Linux. 2010-02-09 16:37:24 +01:00
Gael Guennebaud
905050b239 extend sparse product benchmark with ublas 2010-02-09 15:55:36 +01:00
Gael Guennebaud
285bc336d5 document lazyProduct 2010-02-09 14:45:17 +01:00
Gael Guennebaud
840977529f * as promised, remove the "optimization" for Product::diagonal()
* add MatrixBase::lazyProduct
2010-02-09 14:34:31 +01:00
Gael Guennebaud
9ce1212d7c For the record, here is a solution for (a*b).diagonal, at the cost of extra copies
if a and/or b as to be evaluated. So in the next commit I'll remove it. A nice
solution would be to evaluate the lhs/rhs into member of the initial product,
but that would be overkill.
2010-02-09 14:28:22 +01:00
Hauke Heibel
0a680a9857 Added a non-diagonal product nesting test. 2010-02-09 14:24:17 +01:00
Hauke Heibel
dd283b3f82 Fixed paste&copy error. 2010-02-09 14:12:41 +01:00
Hauke Heibel
928ae382b4 Added debug only unit test for nesting ops - just run ./check nesting. 2010-02-09 14:07:37 +01:00
Gael Guennebaud
8185a3c6cf fix one useless temp & copy 2010-02-09 13:16:29 +01:00
Gael Guennebaud
1cb59e4781 fix nesting lazy prod by ref 2010-02-09 11:27:30 +01:00
Gael Guennebaud
d104d2cd29 add accessors to coeff based product 2010-02-09 11:13:22 +01:00
Gael Guennebaud
5686eca7b1 * fix multiple temporary copies for coeff based products
* introduce a lazy product version of the coefficient based implementation
  => flagged is not used anymore
  => small outer product are now lazy by default (aliasing is really unlikely for outer products)
2010-02-09 11:05:39 +01:00
Gael Guennebaud
0398e21198 s/UnrolledProduct/CoeffBasedProduct 2010-02-09 10:02:26 +01:00
Gael Guennebaud
c076fec734 fix the multiple temporary issue for nested products 2010-02-09 09:58:34 +01:00
Gael Guennebaud
8b016e717f get rid of NestParentByRefBit 2010-02-08 16:51:41 +01:00
Hauke Heibel
871698d3aa Introduced NestParentByRefBit and NestByRefBit - this should fix temporaries related to nested products.
Fixed a few typos and a few warnings.
2010-02-06 17:43:32 +01:00
Gael Guennebaud
6f3f857897 make noalias works for coefficient based products 2010-02-05 23:44:24 +01:00
Gael Guennebaud
52167be4c8 make sure the correct diagoanl() function is called in trace() 2010-02-04 18:51:29 +01:00
Gael Guennebaud
73eb0e633c * resurected Flagged from Eigen2Support
* reimplement .diagonal() for ProductBase to make (A*B).diagonal() more efficient!
2010-02-04 18:28:09 +01:00
Gael Guennebaud
b44240180f optiization: make hybrid small/large outer products use the unrolled path 2010-02-04 17:17:57 +01:00
Hauke Heibel
d0b4ef81f0 Prevent temporaries for reductions. 2010-02-04 14:26:03 +01:00
Hauke Heibel
1a77334d54 Silenced type conversion warnings. 2010-02-03 19:20:25 +01:00
Hauke Heibel
05837be8fb Fixed a warning.
Transform::Identity() is now returning a Transform.
2010-02-03 09:46:27 +01:00
Hauke Heibel
8861dce7ee Fixed 32bit builds. 2010-02-03 09:07:17 +01:00
Hauke Heibel
7b2dd988fa Fixes #89.
Added regression test.
2010-02-02 09:27:41 +01:00
Gael Guennebaud
7c41fb66f8 fix compilation on 32bits systems 2010-02-01 11:45:08 +01:00
Gael Guennebaud
08f154b93a remove some trailing nestbyvalue 2010-02-01 11:44:44 +01:00
Gael Guennebaud
dd817361f5 use unrolled product path for small outer product 2010-01-31 12:28:16 +01:00
Hauke Heibel
15c0a78c6e One warning less... 2010-01-30 17:24:18 +01:00
Gael Guennebaud
43f0c0cbb3 fix triangular view assignment 2010-01-30 09:02:18 +01:00
Gael Guennebaud
b6521b799f add specialization of ei_ref_selector for Array (fix a big perf issue \!) 2010-01-29 21:28:23 +01:00
Hauke Heibel
6dee5440e4 Adapted mean to work with complex numbers.
Added regression test.
2010-01-29 12:12:02 +01:00
Hauke Heibel
42b88983ff Fixed mean reduction leading to unresolved symbol. 2010-01-29 11:48:16 +01:00
Hauke Heibel
ae06365bbd Disable variadic macro warning when compiling at full warning level.
I was not able to get a macro version running and thus I opted for a cmake patch.
2010-01-29 09:53:19 +01:00
Thomas Capricelli
2b2fcc9460 erm.... using nxn is the actual purpose of this variant, fix this. 2010-01-29 08:59:25 +01:00
Jitse Niesen
375b5faa8a Fix copy-paste error in first_aligned test. 2010-01-28 17:06:20 +00:00
Benoit Jacob
13b078efc9 remove reference to dead option 2010-01-28 08:46:01 -05:00
Hauke Heibel
33abe75afa Fixed Quaternion operator*= added regression test. 2010-01-28 10:32:44 +01:00
Thomas Capricelli
98a584ceb8 Put the Status outside of the class, it really does not depend on the
FunctorType or Scalar template parameters.
2010-01-28 10:29:26 +01:00
Thomas Capricelli
27cf1b3a97 eigenization of ei_r1updt() 2010-01-28 04:28:52 +01:00
Thomas Capricelli
40eac2d8a0 misc cleaning / eigenization 2010-01-28 04:19:39 +01:00
Thomas Capricelli
fcd074c928 silent warning of icc 2010-01-27 23:43:32 +01:00
Thomas Capricelli
bd732986bc fix compilation 2010-01-27 23:43:15 +01:00
Gael Guennebaud
0ce5bc0d14 add support for global math function for array 2010-01-27 23:23:59 +01:00
Hauke Heibel
7d92b7ad5f Modified license header. 2010-01-27 20:47:37 +01:00
Hauke Heibel
3bfba8c9a9 Added the missing unit test file. 2010-01-27 20:34:42 +01:00
Hauke Heibel
5b9cc65418 Added EIGEN_DEFINE_STL_VECTOR_SPECIALIZATION macro including unit tests and documentation. 2010-01-27 20:34:05 +01:00
Benoit Jacob
828d058b4b EIGEN_ENUM_MIN ---> EIGEN_SIZE_MIN 2010-01-27 11:42:04 -05:00
Benoit Jacob
dcbf104bcc add EIGEN_DEFAULT_TO_ROW_MAJOR cmake option for the tests. 2010-01-27 07:48:48 -05:00
Benoit Jacob
35bacf7cb8 *forward port fix in MapBase::coeff(int) and coeffRef(int)
*forward port expanded map.cpp unit test
*fix unused variable warnings
2010-01-27 07:11:49 -05:00
Thomas Capricelli
7ba9dc07ed port ei_rwupdt to c++, and misc cleaning 2010-01-27 09:01:13 +01:00
Thomas Capricelli
e97529c2e3 doc : update code, mention examples 2010-01-27 08:14:50 +01:00
Hauke Heibel
4365a48748 Added an ei_linspaced_op to create linearly spaced vectors.
Added setLinSpaced/LinSpaced functionality to DenseBase.
Improved vectorized assignment - overcomes MSVC optimization issues.
CwiseNullaryOp is now requiring functors to offer 1D and 2D operators.
Adapted existing functors to the new CwiseNullaryOp requirements.
Added ei_plset to create packages as [a, a+1, ..., a+size].
Added more nullaray unit tests.
2010-01-26 19:42:17 +01:00
Thomas Capricelli
afb9bf6281 use PlanarRotation<> instead of handmade givens rotation in cminpack code
+ cleaning.
This results in some more memory being used, but not much.
2010-01-26 17:40:55 +01:00
Thomas Capricelli
c04a93df31 clean r1mpyq: remove fortran leftovers 2010-01-26 14:08:52 +01:00
Thomas Capricelli
55f328b264 misc cleaning 2010-01-26 13:20:24 +01:00
Thomas Capricelli
71f513e250 forgot to commit this: qform.h is not used anymore 2010-01-26 13:08:29 +01:00
Thomas Capricelli
69f11c08a1 more eigenization, dropped 'ipvt' in lm 2010-01-26 12:09:52 +01:00
Thomas Capricelli
8a690299c6 fix possible segfault 2010-01-26 11:48:32 +01:00
Thomas Capricelli
d791b51112 remove spaces 2010-01-26 10:50:43 +01:00
Thomas Capricelli
113995355b get rid of ei_qform + lot of other cleaning, now that we do not depend on
minpack qr factorization.
2010-01-26 08:42:48 +01:00
Thomas Capricelli
ba2a9cce03 some more eigenization 2010-01-26 07:36:15 +01:00
Thomas Capricelli
a3034ee079 cleaning 2010-01-26 06:05:01 +01:00
Thomas Capricelli
91561cded4 use a plain matrix to store the upper triangular matrix 'R', instead
of the "compact inside a vector" scheme used by fortran/minpack.
The most difficult part is to fix all related code. Tests pass.
2010-01-26 05:59:58 +01:00
Thomas Capricelli
4b859c8554 cleaning 2010-01-26 01:59:32 +01:00
Thomas Capricelli
c759814f11 some more (thoroughly checked) eigenization 2010-01-26 01:43:58 +01:00
Jitse Niesen
bdb0e9fcd0 Clean up one compilation error and two warnings. 2010-01-26 16:02:19 +00:00
Thomas Capricelli
1403cea087 fix a bug introduced between the cminpack version of Manolis Lourakis and
the one from Frédéric Devernay.
Here, we want to compute the max over the column, the -1 is not needed in
fortran because indices start at 1, contrary to c/c++.
2010-01-26 04:55:36 +01:00
Thomas Capricelli
9651e0c503 Use eigen methods for solving triangular systems. We loose again very
slightly on both speed and precision on some tests.
2010-01-25 11:34:52 +01:00
Thomas Capricelli
92be7f461b define ei_lmpar2() that takes a ColPivHouseholderQR as argument. We still
need to keep the old one around, though.
2010-01-25 07:23:38 +01:00
Thomas Capricelli
ee0e39284c Replace the qr factorization from (c)minpack (qrfac) by Eigen's own stuff.
Results as checked by unit tests are very slightly worse, but not much.
2010-01-25 07:22:28 +01:00
Thomas Capricelli
1cb0be05b0 useful cleaning 2010-01-25 07:08:08 +01:00
Thomas Capricelli
cbf6022e5a useless cleaning 2010-01-25 07:07:31 +01:00
Benoit Jacob
6ae7d842a3 generate a compilation error when using ReturnByValue::coeff() or coeffRef(),
instead of doing an infinite recursion
2010-01-24 21:44:18 -05:00
Jitse Niesen
858539a6af Use matrices with clustered eigenvalues in matrix function test.
This is in order to get better code coverage.
Test matrix_function_3 now fails regularly because ComplexSchur
reaches the max number of iterations; further study needed.
2010-01-24 22:56:28 +00:00
Thomas Capricelli
d08035f3e1 fix the script again (definitely?) + cleaning 2010-01-22 19:26:29 +01:00
Gael Guennebaud
be11a254ac rm ExpressionMaker stuff (weird as I was pretty sure that I had already removed them) 2010-01-22 10:17:43 +01:00
Gael Guennebaud
d40c110053 lot of cleaning:
- clean the *_PUBLIC_INTERFACE_*
- update Diagonal, ReturnByValue, ForceAlignedAccess, UnaryView, etc. to support array
- many other small stuff
2010-01-22 10:15:41 +01:00
Jitse Niesen
e78e3cd41b Fix bug in MatrixBase::setIdentity(int, int). 2010-01-20 12:07:46 +00:00
Jitse Niesen
e0c2c6385f Add small test for Matrix::setIdentity()
This is to exhibit the bug that makes the jacobisvd_7 test fail.
2010-01-20 10:51:59 +00:00
Hauke Heibel
89ee9f092f Fixed compilation of MatrixFunctions module. 2010-01-20 11:32:13 +01:00
Gael Guennebaud
d5d5417062 add SSE code (from Intel) for the fast inversion of 4x4 matrices of double 2010-01-19 16:04:04 +01:00
Gael Guennebaud
60b0ddc3e1 update the fast 4x4 SSE inversion code from more recent Intel's code 2010-01-19 15:33:45 +01:00
Jitse Niesen
a13ffbd836 Get rid off GCC warning on comparing enums from different types. 2010-01-19 11:05:52 +00:00
Mark Borgerding
dacfa97e82 merge 2010-01-18 19:53:44 -05:00
Mark Borgerding
adb2170eb8 removed Eigen::Complex class since it offered insufficient advantage over std::complex when sane real,imag structure packing is assumed.
for more info see:
http://www.cpptalk.net/portable-complex-numbers-between-c-c--vt46432.html
2010-01-18 19:39:22 -05:00
Thomas Capricelli
c8b9097740 erm.. forgot to test after previous commit. Now it's ok (tm). 2010-01-19 01:00:59 +01:00
Gael Guennebaud
9f899808d7 fix scalar - matrix 2010-01-18 22:56:47 +01:00
Gael Guennebaud
0158d78906 extend CwiseNullaryOp to support Array 2010-01-18 22:56:25 +01:00
Gael Guennebaud
c70d54257b add unit tests for true array objects 2010-01-18 22:54:20 +01:00
Thomas Capricelli
c436abd0ac fix both compilation and previous fix : now 'basicstuff' passes again.
(Gael: i dont think you meant removing this setIdentity(), did you?)
2010-01-18 10:29:11 +01:00
Gael Guennebaud
6b380992b5 fix != 2010-01-18 08:18:48 +01:00
Thomas Capricelli
0c89475317 unit tests for == / != operators 2010-01-17 23:57:59 +01:00
Gael Guennebaud
71630b2160 fix MatrixBaseAddons example 2010-01-17 20:04:49 +01:00
Benoit Jacob
b5a6f382ca work around warning about /* inside of a comment (gcc 4.4) 2010-01-16 11:50:09 -05:00
Hauke Heibel
37d4505228 More documentation improvements. 2010-01-16 15:43:11 +01:00
Hauke Heibel
90d5a7c0dd Adapted doxygen's new style sheet.
Added documentation to some of the typedefs.
2010-01-15 15:45:07 +01:00
Gael Guennebaud
dd4b2f044d forgot to include this file in previous commit 2010-01-15 13:36:09 +01:00
Gael Guennebaud
d62ee0668f remove useless using comp. assignment operators 2010-01-15 13:34:28 +01:00
Gael Guennebaud
76a355817b fix a warning 2010-01-15 13:26:39 +01:00
Benoit Jacob
bfe6fdde24 allow to multiply a householder sequence and a matrix when one is real and one is complex.
This is especially important as in bidiagonalization, the band matrix is real.
2010-01-15 00:35:26 -05:00
Benoit Jacob
ddc32adb0e New UpperBidiagonalization class 2010-01-14 22:30:58 -05:00
Benoit Jacob
f1d1756cdd Introduce third template parameter to HouseholderSequence: int Side.
When it's OnTheRight, we read householder vectors as rows above the diagonal.
With unit test. The use case will be bidiagonalization.
2010-01-14 19:16:49 -05:00
Gael Guennebaud
5d796e363c compilation fix for UmfPack 2010-01-14 22:31:06 +01:00
Hauke Heibel
814659c201 Changed parts of the documentation.
The param keyword is now tparam (in Matrix).
Made PlainMatrixType non-internal (currently MatrixBase only); I think this is an important typedef in particular when writing your own template methods.
2010-01-14 10:10:07 +01:00
Hauke Heibel
624902f559 Enabled class diagrams (requires dot, i.e. graphviz).
Adapted the style sheet in order to center class diagrams.
2010-01-13 17:59:42 +01:00
Hauke Heibel
c0b2aa0ace Added some minor comments.
Adapted some of the doc/snippets.
2010-01-13 17:51:09 +01:00
Thomas Capricelli
a33b2dfeb3 introduce new state, "Not started" 2010-01-13 05:17:58 +01:00
Hauke Heibel
a87c0a5ed8 Fixes #83. 2010-01-12 17:13:46 +01:00
Hauke Heibel
553fb31f7e Using operator*= is not required in MapBase. Since no other operator*= is present, none of the base class operator*='s may be hidden and all of them should be visible. As far as I was able to verify, this is not affecting GCC. This fixes #84. 2010-01-12 16:30:03 +01:00
Hauke Heibel
ffaea1d995 merge 2010-01-12 13:38:46 +01:00
Hauke Heibel
e48c3faf25 Fixed the ProductReturnType (at least for UnrolledProducts).
Fixed operator= (MSVC specific) in Array.
2010-01-12 13:38:04 +01:00
Hauke Heibel
a8ea2c8cef Fixes #81. 2010-01-12 10:07:06 +01:00
Hauke Heibel
caa9ced853 Add a real plain matrix type to the ei_nested declaration used in product return type. 2010-01-12 09:41:06 +01:00
Jitse Niesen
65cd1c7639 Add support for matrix sine, cosine, sinh and cosh. 2010-01-11 18:05:30 +00:00
Hauke Heibel
a05d42616b Fixed DenseStorageBase typedef (MSVC specific).
Unified the ei_plain_matrix_type.
2010-01-11 18:16:59 +01:00
Benoit Jacob
24a09ceae8 * Fix a bug in HouseholderQR with mixed fixed/dynamic size: must use EIGEN_SIZE_MIN instead of EIGEN_ENUM_MIN, and there are many other occurences throughout Eigen!
* HouseholderSequence:
  - add shift parameter
  - add essentialVector() method to start abstracting the direction
  - add unit test in householder.cpp
2010-01-11 08:48:39 -05:00
Hauke Heibel
325da2ea3c Fixed conservativeResize.
Fixed multiple overloads for operator=.
Removed debug output.
2010-01-11 13:57:50 +01:00
Jitse Niesen
376341de4a Eigen/src/Core/DenseStorageBase.h: add 'typename' 2010-01-11 10:04:39 +00:00
Hauke Heibel
83d21d5ff6 Fixes unit test swap_3. Friends are not inherited. 2010-01-10 23:15:36 +01:00
Hauke Heibel
3d5912d458 Backed out the removal of the actual resize like implementation. Now, resizing by dimension is optional. 2010-01-10 23:11:05 +01:00
Hauke Heibel
350c7beb92 Fixed swapping and corresponding MSVC compilation. 2010-01-10 20:02:26 +01:00
Hauke Heibel
e0f5b4add3 Fixed MSVC compilation. 2010-01-10 15:36:28 +01:00
Jitse Niesen
3eb80eecde doc/A05_PortingFrom2To3.dox: fix typos 2010-01-09 18:54:31 +00:00
Jitse Niesen
3407e8a67e triangularView<UpperTriangular> --> triangularView<Upper>
Necessary after c5d7c9f0de
 (big delete of "triangular").
2010-01-08 12:49:46 +00:00
Jitse Niesen
ef0ed5b271 test/triangular.cpp: isUpper() --> isUpperTriangular()
Necessary to get the test to compile after c5d7c9f0de
.
I'm assuming that isUpperTriangular() is the name we want; the alternative
is to change Eigen/src/Core/{MatrixBase,TriangularMatrix}.h
2010-01-08 12:46:24 +00:00
Benoit Jacob
0fb0307377 implement BandMatrix::evalTo (thus avoid infinite recursion when assigning a BandMatrix to a Matrix) 2010-01-07 22:08:18 -05:00
Benoit Jacob
44ed79fc3c finally, undo this 2010-01-07 22:03:58 -05:00
Benoit Jacob
b05f59ee07 Backed out changeset 58fb27cd56 2010-01-07 22:00:45 -05:00
Benoit Jacob
58fb27cd56 undo 2010-01-07 21:53:52 -05:00
Benoit Jacob
7befc8d6f3 undo my last commit 2010-01-07 21:51:40 -05:00
Trevor Irons
5f0cf1d7f6 Added std::sqrt(std::complex<float>) and std::sqrt(std::complex<double>) support to MathFunctions.h 2010-01-07 15:03:51 -07:00
Gael Guennebaud
c5d7c9f0de remove the Triangular suffix to Upper, Lower, UnitLower, etc,
and remove the respective bit flags
2010-01-07 21:15:32 +01:00
Benoit Jacob
82ec250a0f make applyHouseholderOnTheRight take a row vector, not a column vector:
this is how it's used in practice.
2010-01-07 12:50:02 -05:00
Gael Guennebaud
c24de5b413 typo 2010-01-06 17:43:11 +01:00
Gael Guennebaud
2fbe8da7a1 improved a bit the list of API changes 2010-01-06 17:35:22 +01:00
Gael Guennebaud
7d3fe69eff Various documentation updates:
- update the tutorial
- update doc of deprecated cwise function
- update cwise doc snippets
2010-01-06 17:18:38 +01:00
Gael Guennebaud
c11300dbd5 a couple of fixes 2010-01-06 17:16:30 +01:00
Thomas Capricelli
a0efdd843c actually stop on failure 2010-01-06 17:14:31 +01:00
Benoit Jacob
e77748ef96 merge 2010-01-06 08:24:42 -05:00
Benoit Jacob
f287ad651a fix comments 2010-01-06 08:23:19 -05:00
Gael Guennebaud
95c00ca8ff started a page on the porting from Eigen2 to 3, updated a bit the tutorial 2010-01-06 13:56:04 +01:00
Gael Guennebaud
023e0dfb4e improve the new experimental sparse product 2010-01-05 19:56:59 +01:00
Gael Guennebaud
eda4e98c61 merge 2010-01-05 16:04:36 +01:00
Gael Guennebaud
801e601ff1 a couple of improvements in the Autodiff module 2010-01-05 16:04:05 +01:00
Gael Guennebaud
c3823dce72 extend benchmark for sparse products 2010-01-05 16:03:35 +01:00
Gael Guennebaud
d8534be728 add a novel, experimental sparse product 2010-01-05 15:57:16 +01:00
Gael Guennebaud
1837b65b28 add operators *= and /= (bis) 2010-01-05 15:41:01 +01:00
Gael Guennebaud
39209edd71 port unsupported modules to new API 2010-01-05 15:38:20 +01:00
Gael Guennebaud
cab85218db add missing operators: /, /=, *= 2010-01-05 15:36:32 +01:00
Gael Guennebaud
3e22e3907f fix types of scalar constants 2010-01-05 15:35:44 +01:00
Gael Guennebaud
97f0f02e10 forgot to hg add this file 2010-01-05 14:36:55 +01:00
Gael Guennebaud
9d9e00b608 merge and add start/end to Eigen2Support 2010-01-05 13:07:32 +01:00
Gael Guennebaud
90d2ae7fec fix aliasing detection 2010-01-05 12:46:07 +01:00
Gael Guennebaud
37851cfe11 fix a coupe of warnings 2010-01-05 10:15:29 +01:00
Benoit Jacob
51b8f014f3 merge 2010-01-04 21:26:15 -05:00
Benoit Jacob
39ac57fa6d Big renaming:
start ---> head
  end   ---> tail
Much frustration with sed syntax. Need to learn perl some day.
2010-01-04 21:24:43 -05:00
Benoit Jacob
78ba523d30 Make snippet run successfully again:
the snippet for 'eval' was taking m=m.transpose() as an example of code
 that needs an explicit call to eval(), but that doesn't work anymore now
 that we have the clever assert detecting aliasing issues.
2010-01-04 21:23:37 -05:00
Thomas Capricelli
e9724c8ea2 accessor for the levenberg-marquardt parameter 2010-01-05 01:22:53 +01:00
Jitse Niesen
708e6629e2 Further refactoring of MatrixFunction<MatrixType, 1>
* move some data to member variables
* split and/or rename member functions
* document all members
2010-01-04 23:13:15 +00:00
Thomas Capricelli
fd19e0a9ea unimportant small fix 2010-01-04 23:25:00 +01:00
Thomas Capricelli
57275b2b8c make some changes to please clang, fix some warnings too. 2010-01-04 23:21:04 +01:00
Thomas Capricelli
95d9cb77f8 fix compilation in some cases 2010-01-04 22:45:33 +01:00
Gael Guennebaud
71a171c267 s/asMatrix()/matrix() 2010-01-04 19:13:08 +01:00
Gael Guennebaud
b6898996d4 fix dirty triangular unit test 2010-01-04 19:02:43 +01:00
Gael Guennebaud
8eab9fb87e port VectorwiseOp and Swap to the novel mechanisms, and various cleanning 2010-01-04 19:00:16 +01:00
Gael Guennebaud
826bff58c6 Fix #69 for the second time, and add the respective regression test 2010-01-04 17:36:26 +01:00
Benoit Jacob
80d1f9e966 remove debug output. sorry! 2010-01-02 18:01:39 -05:00
Benoit Jacob
ff94eaa4ae clarify docs as requested on forum 2010-01-02 12:55:32 -05:00
Benoit Jacob
25f8adfa6c * Fix bug #79: ei_alignmentOffset was assuming that ptr is multiple of
sizeof(Scalar), and that assumption breaks with double on linux x86-32.
* Rename ei_alignmentOffset to ei_first_aligned
* Rewrite its documentation and part of its body
* The variant taking a MatrixBase doesn't need a separate size argument.
2010-01-02 12:38:16 -05:00
Benoit Jacob
ed5c972801 put the assign assert and debug info before the assignment itself 2010-01-01 21:47:53 -05:00
Benoit Jacob
c436d94f18 Instead of EIGEN_JOBS, introduce EIGEN_MAKE_ARGS and EIGEN_CTEST_ARGS. This is much more powerful!
Notice that CTest 2.8 supports the -j argument, although that is undocumented.
Last commit of 2009! Happy new year to all!
2009-12-31 16:13:56 -05:00
Benoit Jacob
2038f9cc8d * The 'check' and 'buildtests' scripts no longer take a jobs parameter.
Instead, they honor the EIGEN_JOBS environment variable if defined.
* The syntax ${BLABLA} in bash scripts was conflicting with CMake configured
  file syntax. Resolved by using the @ONLY option and the @BLABLA@ syntax.
2009-12-30 15:07:05 -05:00
Jitse Niesen
233540e58a Refactoring of MatrixFunction: Simplify handling of fixed-size case. 2009-12-30 17:34:48 +00:00
Jitse Niesen
fcf821b77d Rename test per naming convention. 2009-12-28 22:30:08 +00:00
Jitse Niesen
d35cc381fe Refactor MatrixFunction class: Split new class MatrixFunctionAtomic off. 2009-12-27 20:44:19 +00:00
Jitse Niesen
a25c9b1e46 Simplify and document Sylvester equation solver in MatrixFunction. 2009-12-27 18:09:50 +00:00
Gael Guennebaud
72b6c05bf0 sorry for committing this mess 2009-12-23 13:30:25 +01:00
Gael Guennebaud
24302ad5cd fix #76, MayLinearVectorize depends on MaxSizeAtCompileTime and not on MaxInnerSize 2009-12-23 13:19:28 +01:00
Gael Guennebaud
0b5853d917 fix the xpr analyzer for Transpose 2009-12-23 13:17:46 +01:00
Gael Guennebaud
c90ccd089a clean a bit Hessenberg and make sure the subdiagonal coeff is real even for 2x2 matrices 2009-12-23 12:30:05 +01:00
Gael Guennebaud
791bac25f2 fix #75, and add a basic unit test for Hessenberg
(it was indirectly tested by the eigenvalue decomposition)
2009-12-23 12:18:54 +01:00
Gael Guennebaud
d65c8cb60a fix #69 and extend unit tests or triangular solvers 2009-12-23 11:48:53 +01:00
Gael Guennebaud
5adfe934c8 add checks for on the right triangular solving with matrices 2009-12-23 10:24:13 +01:00
Gael Guennebaud
fcc3be5dce a couple of fixes after thye merge 2009-12-23 09:07:01 +01:00
Gael Guennebaud
eaaba30cac merge with default branch 2009-12-22 22:51:08 +01:00
Gael Guennebaud
e182e9c616 extend the DenseStorageMatrix mechanism to all expressions 2009-12-22 17:37:11 +01:00
Jitse Niesen
eeec7d842e merge 2009-12-21 19:13:31 +00:00
Jitse Niesen
f54a2a0484 Add support for general matrix functions.
This does the job but it is only a first version. Further plans:
improved docs, more tests, improve code by refactoring, add convenience
functions for sine, cosine, sinh, cosh, and (eventually) add the matrix
logarithm.
2009-12-21 18:53:00 +00:00
Benoit Jacob
4ec99bbc0c support gcc 4.5 2009-12-21 13:49:43 -05:00
Jitse Niesen
9f1fa6ea5e Fix compilation error in doc/examples/class_CwiseBinaryOp.cpp .
This is a follow-up of 9d54783036
 (better work around for empty structs).
2009-12-21 18:16:01 +00:00
Jitse Niesen
32f6242b60 Set default for second argument for check script.
Otherwise, buildtests will be run with a second argument of "" if check
is called with only one argument, which leads to infinitely many jobs.
2009-12-21 11:35:08 +00:00
Gael Guennebaud
02beaea2f8 add missing inclusion of LU/arch, thanks to J.B. Rouault 2009-12-19 13:49:00 +01:00
Gael Guennebaud
f8cb9a6230 oops, remove duplicated ctor 2009-12-19 13:45:14 +01:00
Gael Guennebaud
9d54783036 much better workaround for empty struct (the previous one caused GCC 4.3 to generate wrong code leading to segfaults) 2009-12-18 17:08:59 +01:00
Gael Guennebaud
af4d8c5cec a couple of fixes, now Array passes the linearstructure test 2009-12-17 19:28:54 +01:00
Gael Guennebaud
4b70b47998 clean a bit Matrix and fix static Map functions 2009-12-17 14:48:26 +01:00
Gael Guennebaud
5ca90e1b0c some cleaning in DenseStorageBase 2009-12-17 13:56:33 +01:00
Gael Guennebaud
ebb2878829 finally add a Array class with storage via the introduction of a DenseStorageBase
base class shared by both Matrix and Array
2009-12-17 13:37:00 +01:00
Gael Guennebaud
4e9c227bd5 partial LU optimization: noalias 2009-12-17 11:43:42 +01:00
Gael Guennebaud
30d47860dd more fixes 2009-12-17 10:43:46 +01:00
Gael Guennebaud
34c95029ca a couple of fixes and cleaning 2009-12-17 10:00:35 +01:00
Jitse Niesen
945cbc3bc6 Add test for issue #75 (Hessenberg of 1x1 matrix).
Also remove an superfluous #include in matrixExponential test.
2009-12-16 22:24:24 +00:00
Gael Guennebaud
2033903376 a trivial compilation fix 2009-12-16 19:21:10 +01:00
Gael Guennebaud
9f79558839 a lot of cleaning and fixes 2009-12-16 19:18:40 +01:00
Gael Guennebaud
22a6ab1f4b add an eigen2support test and a few fixes 2009-12-16 17:37:21 +01:00
Benoit Jacob
5cb779e5e1 * introduce ei_alignmentOffset(MatrixBase&,Integer)
couldnt put it in Memory.h as it needs the definition of MatrixBase
* make Redux use it
2009-12-16 08:53:14 -05:00
Gael Guennebaud
e0aa29121f this really fix the previous warning 2009-12-16 13:06:47 +01:00
Gael Guennebaud
c35fcf3bbd fix warning by making ei_empty_struct::_ei_dummy_ private 2009-12-16 12:53:55 +01:00
Gael Guennebaud
bb59c22dc9 fix compilation when mixing types 2009-12-16 12:48:15 +01:00
Gael Guennebaud
6db6774c46 * fix aliasing checks when the lhs is also transposed. At the same time,
significantly simplify the code of these checks while extending them
  to catch much more expressions!
* move the enabling/disabling of vectorized sin/cos to the architecture traits
2009-12-16 11:41:16 +01:00
Gael Guennebaud
7a9988ebb6 fix spasre triangular solve for row major lower matrices 2009-12-14 10:25:21 +01:00
Thomas Capricelli
0d8ffe5240 ignore more build*/ directories 2009-12-11 02:39:49 +01:00
Gael Guennebaud
a314814a6e suppress unused variable warnings 2009-12-15 13:59:02 +01:00
Benoit Jacob
949f14e1e4 no, this wasn't equivalent to ei_pload at all, after all! 2009-12-15 07:43:05 -05:00
Benoit Jacob
805eb9cc8b Gael, who is a man of few words^Winstructions, is right, as usual. 2009-12-15 06:50:40 -05:00
Hauke Heibel
3ea1f97f69 Suppressed the warning for missing assignment generators (forgot that in the last submission).
Commented Quake3's fast inverser sqrt in SSE's MathFunction header.
2009-12-15 08:09:14 +01:00
Benoit Jacob
46a9cac7fb silence 'statement has no effect' warning with gcc 4.4 2009-12-14 23:31:00 -05:00
Benoit Jacob
4948448939 *use scalar instructions, packet not needed here
*remove unused var warning
2009-12-14 23:13:54 -05:00
Benoit Jacob
39095c8faa only include SSE path if SSE enabled 2009-12-14 22:52:11 -05:00
Benoit Jacob
d02eccf584 add SSE path for Matrix4f inverse, taken from Intel except that we do a kosher
division instead of RCPPS-followed-by-Newton-Raphson. The rationale for that is
that elsewhere in Eigen we dont allow ourselves this approximation (which throws
2 bits of mantissa), so there's no reason we should allow it here.
2009-12-14 22:47:14 -05:00
Benoit Jacob
d5f3b2dc94 change the Arch constants: const int ---> enum, more explicit names, and use
of a namespace instead of Prefix_Name.
2009-12-14 17:26:24 -05:00
Hauke Heibel
832045d363 Warning 4512 (assignment operators could not be generated) is now simply disabled.
All unimplemented assignment operators have been removed.
2009-12-14 10:32:43 +01:00
Hauke Heibel
4498864fc8 Fixed a bad type conversion. 2009-12-13 09:26:57 +01:00
Jitse Niesen
63957ad5d6 Remove //@{ ... //@} for same reason as in changeset 2026ea7ff2
.
2009-12-13 00:29:31 +00:00
Jitse Niesen
929521678d Correct type of ei_solve_retval<LLT,...>::operator= 2009-12-12 23:54:44 +00:00
Hauke Heibel
3dce51bd8e Removed more warnings. 2009-12-12 14:49:43 +01:00
Hauke Heibel
d088ee35f6 Added to possibility to compile unit tests at maximum warning level.
Silenced (amongst others) many conversion related warnings.
2009-12-12 11:39:07 +01:00
Hauke Heibel
494a88685e Fixed the bad fix - now the unsupported examples and snippets work on windows. 2009-12-11 19:39:01 +01:00
Hauke Heibel
9a8c16810b Reverted Jitse's change - the targets unsupported_examples and unsupported_snippets are unknown over here. 2009-12-11 19:11:01 +01:00
Jitse Niesen
2026ea7ff2 Coax doxygen in producing better docs for MatrixFunctions.
The //@{ ... //@} in unsupported/Eigen/MatrixFunctions for some reason
caused doxygen to list the constructor of the MatrixExponential class
as a separate function in the MatrixFunctions module without any reference
to the class; very confusing.
2009-12-11 15:54:21 +00:00
Thomas Capricelli
7d444375e7 fix compilation (for some reason the pb happens only under windows) 2009-12-11 02:38:28 +01:00
Gael Guennebaud
9facdaf7b9 some compilation fixes 2009-12-10 22:15:22 +01:00
Gael Guennebaud
7caf751fdd adapt select, replicate and reverse 2009-12-10 22:00:35 +01:00
Benoit Jacob
d2e44f2636 * 4x4 inverse: revert to cofactors method
* inverse tests: use createRandomMatrixOfRank, use more strict precision
* tests: createRandomMatrixOfRank: support 1x1 matrices
* determinant: nest the xpr
* Minor: add comment
2009-12-09 12:43:25 -05:00
Benjamin Schindler
f0315295e9 Handle row and col major matrices in the gdb pretty printer 2009-12-08 19:12:26 +01:00
Benjamin Schindler
4da991eaa8 Display the data ptr as part of the header 2009-12-08 18:40:36 +01:00
Benjamin Schindler
a4c162dbdc Correct license header of the gdb pretty printer 2009-12-08 18:18:05 +01:00
Benjamin Schindler
f2304f3b88 Adding __init__.py so the printers can be used directly from the checkout 2009-12-08 18:17:46 +01:00
Hauke Heibel
64beff03b6 Added a 'pretty printer' for the msvc debugger (tested on Visual Studio 2008). 2009-12-08 18:09:00 +01:00
Benjamin Schindler
097cab26fb Added a pretty printer python script for Eigen::Matrix (gdb-7.0 and later). 2009-12-08 18:06:03 +01:00
Jitse Niesen
8bfa354ee3 Documentation clean up.
* remove non-existant reference to CwiseAll
* define \householder_module (used in HouseholderSequence.h)
* update I01_TopicLazyEvaluation.dox - Product is now called GeneralProduct
* remove reference to list of examples which was deleted ages ago
* rename PartialLU_solve.cpp snippet to PartialPivLU_solve.cpp
2009-12-08 15:12:27 +00:00
Jitse Niesen
a682a0eeb1 C05_TutorialLinearAlgebra.dox: Correct file name 2009-12-08 11:08:04 +00:00
Jitse Niesen
39ceba409b Various improvements to the docs for unsupported.
* Enable compilation of examples for unsupported.
* Fix use of std::vector in BVH example.
* Add an example for the matrix exponential.
* Bug fixes in unsupported/doc/{examples,snippets}/CMakeLists.txt .
2009-12-07 19:10:11 +00:00
Gael Guennebaud
8e05f9cfa1 add a DenseBase class for MAtrixBase and ArrayBase and more code factorisation 2009-12-04 23:17:14 +01:00
Gael Guennebaud
36969cc2a5 add a slerp benchmark (for accuracy and speed)) 2009-12-04 15:02:38 +01:00
Gael Guennebaud
c68c695b87 Fix poor Quaternion::slerp snapping 2009-12-04 15:01:17 +01:00
Gael Guennebaud
ea684af6b4 merge 2009-12-04 12:40:29 +01:00
Gael Guennebaud
7aad434160 fix compilation and clean a bit Map<Quaternion> 2009-12-04 12:26:56 +01:00
Marton Danoczy
ffaea19a70 Fixed compilation warnings in MSVC with Scalar==float 2009-12-03 14:01:34 +01:00
bjornpiltz
af17770680 Fix compilation for MSVC. 2009-12-03 09:27:15 +01:00
Mark Borgerding
012cd62c81 explicitly cast to use the acos(long double) 2009-12-02 22:58:34 -05:00
Mark Borgerding
e5c91b4e95 removed troublesome M_PI and M_PIl constants 2009-12-02 17:23:09 -05:00
Gael Guennebaud
e12f5adbde fix MSVC10 compilation 2009-12-02 19:32:54 +01:00
Benoit Jacob
de25059502 * Remove test_ prefix in tests
* tests now honor EIGEN_REPEAT and EIGEN_SEED if no arguments were passed
2009-12-02 12:07:47 -05:00
Benoit Jacob
b0100a336a merge 2009-12-02 11:12:02 -05:00
Benoit Jacob
3e73f6036c * HouseholderSequence:
* be aware of number of actual householder vectors
    (optimization in non-full-rank case, no behavior change)
  * fix applyThisOnTheRight, it was using k instead of actual_k
* QR: rename matrixQ() to householderQ() where applicable
2009-12-02 11:11:09 -05:00
Hauke Heibel
e3b5a90611 Removed unused 'benchmark'. 2009-12-02 16:51:40 +01:00
Benoit Jacob
3279e39340 * fix include for case-sensitive filesystem
* on GNU, clock_gettime requires linking -lrt
2009-12-02 10:03:16 -05:00
Mathieu Gautier
26ea7c9801 * remove empty destructors in Matrix.h and MatrixStorage.h 2009-12-02 11:09:56 +01:00
Hauke Heibel
84551d067e merge 2009-12-02 11:08:44 +01:00
Hauke Heibel
95861fbd6c Added NestByValue and .nestByValue() again for the sake of backwards compatibility. 2009-12-02 10:53:30 +01:00
Hauke Heibel
e3612bc0b8 Removed unnecessary code. 2009-12-02 09:11:24 +01:00
Mark Borgerding
c05ae35441 merge with tip 2009-12-01 18:03:15 -05:00
Mark Borgerding
ff1e9542f6 added comments to help vim understand the header files are c++. 2009-12-01 18:00:29 -05:00
Mark Borgerding
d1e06721a2 instruction to remove CMakeCache.txt for out-of-source build 2009-12-01 17:39:29 -05:00
Mark Borgerding
0e8cd8de4d quieted signed/unsigned comparison warning 2009-12-01 17:38:23 -05:00
Mark Borgerding
8fce0b5459 added newline at the end of file to quiet gcc warning 2009-12-01 17:37:33 -05:00
Thomas Capricelli
291fd4f8da merge 2009-12-01 21:07:20 +01:00
Benoit Jacob
68117c267c ColPivQR: now the unit tests even succeeds:
* with random matrices multiplied by 1e+8 (i.e. fixed wrong absolute fuzzy compare)
* with 10,000 repetitions (i.e. the fuzzy compare is really clever)
and when it occasionnally fails, less than once in 10,000 repeats, it is only on the exact rank computation.
2009-12-01 13:51:35 -05:00
Benoit Jacob
95d88e1327 Big reworking of ColPivQR and its unit test, which now passes even with thousands of repetitions and correctly tests matrices of all sizes. Several surprises along the way: for example, a major cause of trouble was the optimized "table of column squared norms" where the accumulation of imprecision was a serious issue; another surprise is that tests like "x!=0" before dividing by x really benefit from being replaced by fuzzy tests, as i hit real cases where i got wrong results in 1/epsilon. 2009-12-01 13:26:29 -05:00
Benoit Jacob
49c0986d86 minor cleanup 2009-12-01 13:22:14 -05:00
Hauke Heibel
d3250cb38f That's it NestByValue and .nestByValue() are both gone! 2009-12-01 13:29:08 +01:00
Hauke Heibel
b08d5b2d2c Even more NestByValue cleanup... 2009-12-01 13:16:51 +01:00
Hauke Heibel
2bf354da80 Much more NestByValue cleanup. 2009-12-01 11:51:22 +01:00
Hauke Heibel
3091be5134 Removed NestByValue dependency from Cholesky, Eigenvalues, LU and QR. 2009-12-01 10:22:54 +01:00
Hauke Heibel
7b3e205ebd Removed NestByValue dependency from VectorwiseOp. 2009-12-01 09:56:40 +01:00
Hauke Heibel
88be826791 Removed NestByValue dependency from MatrixBase::select(). 2009-12-01 09:49:15 +01:00
Hauke Heibel
82971a5dcd Adapted the number of test runs. 2009-12-01 09:29:56 +01:00
Hauke Heibel
1fc5fdea25 Added missing typedef (will I ever learn it!?)
Removed unsupported directories that do not provide CMakeList.txt (CMake 2.8 warning).
The BenchTimer is now also working on Cygwin.
2009-12-01 09:20:05 +01:00
Hauke Heibel
052742e6f9 merge 2009-12-01 07:55:07 +01:00
Hauke Heibel
7af1ee0b6c Added the profiling test to unsupported. 2009-12-01 07:53:09 +01:00
Gael Guennebaud
b2a5fb874f add specialization ei_ref_selector for sparse matrix types 2009-12-01 06:21:29 +01:00
Thomas Capricelli
120882c4f1 fix another 'duplicated content in doxygen pages' bug : exclude *.orig
files
2009-11-30 19:42:00 +01:00
Hauke Heibel
e6c560aedf Removed wrong typename. 2009-11-30 19:06:07 +01:00
Hauke Heibel
1c2e476fa7 Initial commit for a modified ei_nested logic. 2009-11-30 18:56:56 +01:00
Hauke Heibel
66534b782c Some of our unit tests require mathematical constants and thus we rely on non-ansi code.
It seems as if the new standard removed pow(T,int).
M_PIL is only defined when _GNU_SOURCE is defined.
2009-11-30 16:54:04 +01:00
Thomas Capricelli
21ff296652 Adapted a mail from Mark about some design and add it as documentation for
the FFT module.
2009-11-30 16:21:21 +01:00
Thomas Capricelli
a255336e4f fix doc 2009-11-28 02:42:05 +01:00
Benoit Jacob
4b1aca2288 precision ---> dummy_precision 2009-11-26 22:05:02 -05:00
Thomas Capricelli
c245054aa8 cleaning 2009-11-26 08:30:38 +01:00
Thomas Capricelli
f948df5a72 cleaning 2009-11-26 07:37:37 +01:00
Thomas Capricelli
db39f892a3 kill yet another un-needed parameter 2009-11-26 05:48:38 +01:00
Thomas Capricelli
905aecf803 make qrsolv use eigen types 2009-11-26 04:29:51 +01:00
Thomas Capricelli
e18caf7a01 clean qrsolv 2009-11-26 04:06:40 +01:00
Thomas Capricelli
e95f736402 reduce ei_qrsolv() signature, wa is actually a 'working array' 2009-11-26 03:26:41 +01:00
Benoit Jacob
5923bcb1b9 improve the scripts for building unit tests:
* support unsupported/
* use egrep instead of grep, properly escape special chars.
2009-11-25 21:26:37 -05:00
Thomas Capricelli
746c787a76 computes column norms outside of ei_qrfac() 2009-11-26 02:53:58 +01:00
Thomas Capricelli
9cbfdbad22 cleaning 2009-11-26 02:42:27 +01:00
Thomas Capricelli
f795681da0 export stableNorm(), blueNorm() and hypotNorm() to colwise() and rowwise()
+ rudimentary test
2009-11-26 02:28:13 +01:00
Thomas Capricelli
dca80b5f22 use typedef 2009-11-25 23:08:46 +01:00
Thomas Capricelli
a6380ae272 silent warnings 2009-11-25 23:08:39 +01:00
Benoit Jacob
40ce7589de Added tag actual-start-from-scratch for changeset 1dabb45d94 2009-11-25 10:57:32 -05:00
Benoit Jacob
1e2bcba5e4 forward port slight changes in the 4x4 inverse test 2009-11-25 09:08:03 -05:00
Benoit Jacob
684d76eba3 add SSE4 support, start with integer multiplication 2009-11-24 15:12:43 -05:00
Benoit Jacob
abdb2a2bd5 fix assert and handle Unit shapes 2009-11-24 12:14:40 -05:00
Benoit Jacob
e6ea9e401c improve precision test 2009-11-23 11:24:06 -05:00
Benoit Jacob
44d0d667cd 4x4 inverse:
* change block selection threshold from 1e-2 to 1e-1
* add rigorous precision test
2009-11-23 10:13:21 -05:00
Thomas Capricelli
06f11f3379 fix important typo 2009-11-21 02:00:47 +01:00
Gael Guennebaud
80ebeae48d Add the concept of base class plugins, and started to write the ArrayBase class.
Sorry for this messy commit but I have to commit it...
2009-11-20 18:20:55 +01:00
Gael Guennebaud
4af1753b6f * remove EnforceAlignedAccess option to Block, VectorBlock, Map and MapBase
because thanks to the previous commit this is not needed anymore
* add a more general ForceAlignedAccess expression which can be used for any expression.
  It is already used by StableNorm.h.
2009-11-20 16:30:14 +01:00
Benoit Jacob
3c5e32f0da improvements in FindEigen*.cmake, ported from changes in CMakeLists.txt:
- better regular expression
 - grep the whole file, not expensive anyway, more robust
2009-11-20 10:17:59 -05:00
Gael Guennebaud
eb8f450071 Hey, finally the copyCoeff stuff is not only used to implement swap anymore :)
Add an internal pseudo expression allowing to optimize operators like +=, *= using
the copyCoeff stuff.
This allows to easily enforce aligned load for the destination matrix everywhere.
2009-11-20 15:39:38 +01:00
Jitse Niesen
34be9d4537 Replace toDense() by toDenseMatrix() in tests. 2009-11-20 12:22:46 +00:00
Jitse Niesen
b0baf43114 Eigen/CMakeLists.txt: remove parens from if.
Only CMake 2.6.3 and later recognize this syntax, and at the moment we
require 2.6.2. CMake uses the right precendence, per its man page, so the
parentheses are not necessary.
2009-11-20 11:26:26 +00:00
Benoit Jacob
6cbf662f14 * don't laugh, but these bugs took me forever to fix.
* expand unit tests to make sure to catch them: they nearly escaped the existing tests as these memory violations were highly dependent on the numbers of rows and cols.
2009-11-19 22:01:13 -05:00
Benoit Jacob
eac3232095 minor improvements in triangular stuff 2009-11-19 20:50:50 -05:00
Benoit Jacob
88b551e89b * fix compilation of unit-tests (sorry, had tested only 1 channel)
* remove buggy (superfluous?) specialization in the meta-unroller
2009-11-19 19:20:19 -05:00
Benoit Jacob
a20a744adc TriangularMatrix: extend to rectangular matrices 2009-11-19 17:07:55 -05:00
Benoit Jacob
2275f98d7b move signature file to root directory, where it belongs 2009-11-19 12:41:28 -05:00
Benoit Jacob
bbf0eb35a7 * in Eigen/CMakeLists.txt, finally do a globbing to we no longer will have problems with "oops forgot to install new module".
* add a file Eigen/signature_of_eigen3_matrix_library, use it to make FindEigen3.cmake more solid: able to find Eigen in either eigen3/ or eigen/ and not mix it up with Eigen2.
2009-11-19 12:31:11 -05:00
Benoit Jacob
b5f4636d42 * eigen2->eigen3
* bump version to 2.91.0
* add FindEigen3.cmake
2009-11-19 12:09:04 -05:00
Benoit Jacob
abb2a1bb15 simplification 2009-11-18 17:44:20 -05:00
Benoit Jacob
126a8e6a69 fix remaining bug in ColPivHouseholderQR, so now all tests pass again 2009-11-18 17:40:45 -05:00
Benoit Jacob
40865fa28c fix bugs, old and new:
* old bug: in CwiseBinaryOp: only set the LinearAccessBit if both sides have the same storage order.
* new bug: in Assign.h, only consider linear traversal if both sides have the same storage order.
2009-11-18 17:20:39 -05:00
Benoit Jacob
11fa2ae2c6 temporarily disable linear traversal.
Actually I don't think it's buggy. But it probably triggers existing bugs, I suspect that
some xprs have LinearAccessBit and shouldn't have it.
Also this fixes the "bugs" with JacobiSVD ---> now it works again
2009-11-18 16:31:14 -05:00
Benoit Jacob
8860203e6a fix stuff after the PermutationMatrix changes.
I still have JacobiSVD errors when cols>rows
2009-11-18 12:41:24 -05:00
Gael Guennebaud
e3d890bc5a Another big refactoring change:
* add a new Eigen2Support module including Cwise, Flagged, and some other deprecated stuff
* add a few cwiseXxx functions
* adapt a few modules to use cwiseXxx instead of the .cwise() prefix
2009-11-18 18:15:19 +01:00
Benoit Jacob
94c706d04f Assign.h: add LinearTraversal (non-vectorized index-based traversal)
Rename some constants to make names match more closely what they mean.
2009-11-18 11:57:07 -05:00
Gael Guennebaud
0529ecfe1b Big refactoring/cleaning in the spasre module with
in particular the addition of a selfadjointView, and the
extension of triangularView. The rest is cleaning and does not
change/extend the API.
2009-11-18 14:52:52 +01:00
Gael Guennebaud
1e62e0b0d8 more ET refactoring:
* extend Cwise for multiple storage base class
* a lot of cleaning in the Sparse module
2009-11-17 16:04:19 +01:00
Benoit Jacob
9f21e2aab7 port the QR module to PermutationMatrix 2009-11-17 08:14:54 -05:00
Gael Guennebaud
63bcc1c0fb adapt CwiseBinaryOp and the Sparse counter part 2009-11-17 10:11:27 +01:00
Benoit Jacob
30b610a10f vade retro 2009-11-16 21:45:01 -05:00
Benoit Jacob
ac00902f84 for consistency: PlainMatrixType ---> DenseMatrixType 2009-11-16 21:43:15 -05:00
Benoit Jacob
984c000778 addToDense ---> addTo
subToDense ---> subTo
2009-11-16 21:33:41 -05:00
Benoit Jacob
07412b2075 PermutationMatrix: add setIdentity and transpositions methods
LU: make use of that
2009-11-16 21:28:26 -05:00
Benoit Jacob
b90744dc05 Port FullPivLU to PermutationMatrix 2009-11-16 17:05:12 -05:00
Benoit Jacob
76c614f9bd PartialPivLU: port to PermutationMatrix
PermutationMatrix: add resize()
2009-11-16 15:36:07 -05:00
Benoit Jacob
eb6df28c6c DiagonalMatrix: release-quality documentation
BandMatrix: rename toDense() ---> toDenseMatrix() for consistency
2009-11-16 15:25:58 -05:00
Benoit Jacob
e8d0dbf82e PermutationMatrix:
* make multiplication order not be reversed
 * release-quality documentation
2009-11-16 15:07:33 -05:00
Benoit Jacob
8a1bada43d initialize-by-zero: remember that when the newsize==oldsize, resize() must remain a NOP 2009-11-16 13:45:06 -05:00
Gael Guennebaud
1c9a2d246f adapt CwiseUnaryOp and CwiseUnaryView 2009-11-16 19:39:29 +01:00
Gael Guennebaud
2a3a6fe45e Experiment the ET refactoring on Transpose for Dense and Sparse storages.
All tests work fine.
2009-11-16 18:19:08 +01:00
Benoit Jacob
307898e84b merge 2009-11-16 09:39:59 -05:00
Benoit Jacob
b25eb5fdaa PermutationMatrix: add inverse() and product of permutations 2009-11-16 09:39:07 -05:00
Benoit Jacob
e09768e3bc handle make errors ---> exit, don't run ctest 2009-11-16 09:38:44 -05:00
Thomas Capricelli
a89b22f352 rename test for coherency with previous renaming of the module 2009-11-16 04:34:04 +01:00
Thomas Capricelli
f7e73f1bf9 don't be shy and test them all 2009-11-16 04:20:13 +01:00
Benoit Jacob
955cd7f884 * add PermutationMatrix
* DiagonalMatrix:
   - add MaxSizeAtCompileTime parameter
   - DiagonalOnTheLeft ---> OnTheLeft
   - fix bug in DiagonalMatrix::setIdentity()
2009-11-15 21:12:15 -05:00
Benoit Jacob
1aaa9059c0 maketests -> buildtests 2009-11-15 19:26:28 -05:00
Benoit Jacob
9aa37f3108 prevent in-source builds. hope it's ok with you... it's still possible, of course, to have the build dir as a subdir of the source dir. 2009-11-15 00:11:33 -05:00
Benoit Jacob
3f04a14d7c there's no reason why we should follow the FSF's stupid recommendation for the naming of these files, right? This could give the wrong impression that Eigen is only GPL-licensed. 2009-11-14 23:26:07 -05:00
Benoit Jacob
c4dacabcc8 add workaround for Guillaume 2009-11-14 22:04:55 -05:00
Mathieu Gautier
6680fa42ee * add Map<Quaternion> test based on Map from test/map.cpp
* replace implicit constructor AngleAxis(QuaternionBase&) by an explicit one, it seems ambiguous for the compiler
* remove explicit constructor with conversion type quaternion(Quaternion&): conflict between constructor.
* modify EIGEN_INHERIT_ASSIGNEMENT_OPERATORS to suit Quaternion class
2009-11-13 16:41:51 +01:00
Jitse Niesen
d07c05b3a5 Build tests for unsupported modules if EIGEN_LEAVE_TEST_IN_ALL_TARGET 2009-11-13 13:05:57 +00:00
Benoit Jacob
7e3c4096d8 xargs ---> xargs echo
(xargs alone doesnt seem to be documented in the man page, while xargs echo is documented)
2009-11-12 15:37:50 -05:00
Benoit Jacob
9b7708f660 introduce check target, and some renaming 2009-11-12 15:02:52 -05:00
Benoit Jacob
8b563d7163 ouch, avoid infinite loop!
optimization is not so important here, so a for loop will do.
2009-11-12 14:08:47 -05:00
Benoit Jacob
5266a78aca also optionnally initialize by zero after resizing 2009-11-12 12:49:00 -05:00
Benoit Jacob
8132ee3908 * add non-default option to initialize matrices by 0
(useful for porting)
* maketests really has to be in test/
2009-11-12 12:39:22 -05:00
Benoit Jacob
358452bbe6 * add ./debug and ./release scripts
* update the messages
* rename EIGEN_CMAKE_RUN_FROM_CTEST to something saner
2009-11-12 12:07:18 -05:00
Benoit Jacob
fc492b6264 add mctestr script. In your build directory, just do:
./mctestr ^qr 5
and it will build all tests matching ^qr with 5 jobs and then do `ctest -R ^qr`
2009-11-12 09:35:10 -05:00
Benoit Jacob
bc18ba7e49 * add maketests script. It is like make but takes a regexp allowing to build selected targets. Next step will be a "mctestr" script doing that and then calling ctest -R.
* in runtest.sh, don't override the default number of repeats. If one thinks the default should be changed, let's change it at the source.
2009-11-11 21:25:21 -05:00
Benoit Jacob
ff7fbc9431 * use standard CMAKE_BUILD_TYPE
* remove debug_xxx targets
* runtest.sh: don't run make
2009-11-11 16:11:33 -05:00
Jitse Niesen
343eec7ca8 Compilation fix: makeHousholderInPlace now uses references. 2009-11-11 16:23:09 +00:00
Benoit Jacob
bf691cc3f1 fix PowerPC platform detection 2009-11-11 10:52:00 -05:00
Thomas Capricelli
b53c2fcc99 fix for *.pc install dir (suggested by Ingmar Vanhassel on IRC) 2009-11-11 15:28:39 +01:00
Benoit Jacob
a440385b41 *adapt Householder to the convention that we now favor refs over ptrs for output. Keep "workspace" as pointer because it is an array (which is now more obvious).
*rename makeHouseholderSequence to householderSequence, because that's what it returns.
2009-11-10 21:22:20 -05:00
Thomas Capricelli
e06aa749a4 doxygen : exclude diff files 2009-11-10 22:17:23 +01:00
Thomas Capricelli
2612bbdf8b make the complex module appear in doxygen + small documentation 2009-11-10 22:02:10 +01:00
Thomas Capricelli
d47a723a6e make FFT appear in doxygen doc, and provide a mininum of documentation 2009-11-10 21:58:17 +01:00
Thomas Capricelli
2c9f46d151 fix Skyline module doxygen stuff 2009-11-10 21:47:57 +01:00
Thomas Capricelli
df117a64c7 merge with main repository 2009-11-10 21:34:43 +01:00
Thomas Capricelli
ae76c97704 documentation fixes 2009-11-10 21:33:36 +01:00
Gael Guennebaud
f647fb8dd4 fix compilation and removed some unused stuff in skyline 2009-11-10 21:22:55 +01:00
Thomas Capricelli
6a56262bf4 merge with main repository 2009-11-10 20:31:07 +01:00
Gael Guennebaud
58632c1652 fix #68 I did not see the skyline matrix patch contained that 2009-11-10 14:19:03 +01:00
Gael Guennebaud
1879403562 mv the Skyline module to unsupported/ 2009-11-10 12:47:42 +01:00
Thomas Capricelli
42b92c2022 merge with main repository 2009-11-09 19:02:52 +01:00
Thomas Capricelli
77fd44a246 revert previous commit on the matter : once doxygen cache is flushed
this gives very bad results
2009-11-09 19:00:48 +01:00
Benoit Jacob
92749eed11 * merge
* remove a ctor in QuaternionBase as it gives a strange error with GCC 4.4.2.
2009-11-09 09:08:03 -05:00
Benoit Jacob
4b366b07be add missing includes 2009-11-09 08:04:20 -05:00
Benoit Jacob
9a0900e33e last round of changes, mainly to return derived types instead of base types, and fix various compilation issues 2009-11-09 07:51:31 -05:00
Gael Guennebaud
670651e2e0 Quaternion: fix compilation, cleaning 2009-11-09 10:48:18 +01:00
Thomas Capricelli
087df89e20 few doc fixes 2009-11-09 06:45:27 +01:00
Thomas Capricelli
d1cc2745f7 Reduce general diff with eigen-tip. Moreover, this test now seems to pass
again.
2009-11-09 05:12:50 +01:00
Thomas Capricelli
17f3e8571c more documentatin 2009-11-09 04:52:47 +01:00
Thomas Capricelli
3e17046668 only define groups once in unsupported, in order to prevent ambiguity for
the group names.
2009-11-09 04:36:01 +01:00
Thomas Capricelli
de195e0e78 some more documentation 2009-11-09 04:21:45 +01:00
Thomas Capricelli
ac8f7d8c9c various fixes in headers 2009-11-09 03:32:40 +01:00
Thomas Capricelli
71a3e96b49 rename NonLinear to NonLinearOptimization 2009-11-09 03:27:15 +01:00
Thomas Capricelli
09cb27c587 documentation + move "namespace eigen" to the main file, as others do. 2009-11-09 03:25:21 +01:00
Thomas Capricelli
cddc83752c starting documentation 2009-11-09 03:07:36 +01:00
Thomas Capricelli
ecbcdafb0f include NonLinearOptimization_Module and NumericalDiff_Module
+ cleaning
2009-11-09 03:06:23 +01:00
Benoit Jacob
e4e58e8337 simplifications in the ei_solve_impl system, factor out some boilerplate code 2009-11-08 16:51:41 -05:00
Thomas Capricelli
751a333491 merge with main repository 2009-11-08 22:27:32 +01:00
Benoit Jacob
ba7bfe110c port the qr module to ei_solve_xxx. 2009-11-08 10:21:26 -05:00
Gael Guennebaud
aa0974286f fix compilation adding a makeconst helper struct 2009-11-07 09:07:23 +01:00
Gael Guennebaud
5dc02fe5e9 improve a bit AutoDiffVector, but it still not working 2009-11-06 11:34:58 +01:00
Gael Guennebaud
6647a58847 update product bench 2009-11-06 11:33:18 +01:00
Gael Guennebaud
771c0507fb back out previous back out, and this time don't forget
to include the NumTraits.h file in the commit ;)
2009-11-06 11:23:18 +01:00
Jitse Niesen
1470afda5b Backed out previous changeset: Does not compile.
There is no member Nested in NumTraits.
2009-11-06 09:16:25 +00:00
Gael Guennebaud
fe81b3f651 Add the possibility to control the storage mode of scalar value (by value or reference)
in order to avoid unecessary copies when using complex scalar types (e.g., a AutoDiffScalar)
2009-11-05 18:06:33 +01:00
Jitse Niesen
daa4574a43 Add regression test for issue #66 (ComplexSchur of zero matrix). 2009-11-05 03:46:18 +00:00
Benoit Jacob
68210b03c1 port svd to the ei_xxx_return_value thing
this commit made in caltrain from Palo Alto to SF
2009-11-04 21:00:12 -05:00
Benoit Jacob
4c456d4211 fix bug in svd solve reported on forum, was apparently assuming square matrix, not sure how the unit test could work. 2009-11-04 11:46:17 -05:00
Hauke Heibel
3979f6d8aa Let's try to stick to the original code, thus activate the fix of #62 only for 64 bit builds. 2009-11-04 15:49:22 +01:00
Hauke Heibel
e2170b9f7e Direct access of the packet structs fixes bug #62 and doe not seem to
influence compiler optimization.
2009-11-04 15:38:11 +01:00
kayhman
1333fe651d Added basic SkylineMatrix. 2009-11-04 15:18:12 +01:00
Mark Borgerding
103f741619 initialize Eigen::Complex arrays 2009-11-09 14:08:44 -05:00
Gael Guennebaud
a7bebe0aeb an attempt to fix a compilation issue with MSVC 2009-11-04 09:04:50 +01:00
Benoit Jacob
0182695204 move cholesky to ei_xxx_return_value 2009-11-03 11:34:45 -05:00
Benoit Jacob
a77872dd6c move partial-pivoting lu to ei_solve_impl 2009-11-03 03:06:34 -05:00
Benoit Jacob
da363d997f introduce ei_xxx_return_value and ei_xxx_impl for xxx in solve,kernel,impl
put them in a new internal 'misc' directory
2009-11-03 02:18:10 -05:00
Gael Guennebaud
979431b987 fix #66 : upper triangular checks in ComplexSchur 2009-11-02 10:46:40 +01:00
Benoit Jacob
5ba19a53a6 rephrase tutorial on Map 2009-10-31 14:37:11 -04:00
Benoit Jacob
3ae4e3880f fix compilation 2009-10-31 14:36:31 -04:00
Benoit Jacob
48261fc773 * default MatrixBase ctor: make it protected, make it a static assert, only do the check when debugging eigen to avoid slowing down compilation for everybody (this check is paranoiac, it's very seldom useful)
* add private MatrixBase ctors to catch cases when the user tries to construct MatrixBase objects directly
2009-10-31 11:50:15 -04:00
Mark Borgerding
ec70f8006b added inlines to a bunch of functions 2009-10-31 00:13:22 -04:00
Mark Borgerding
4c3345364e moved half-spectrum logic to Eigen::FFT 2009-10-30 23:38:13 -04:00
Mark Borgerding
d659fd9b14 moved real-half-spectrum reflection into Eigen::FFT 2009-10-30 20:26:30 -04:00
Mark Borgerding
a26b729cc9 moved scaling to Eigen::FFT 2009-10-30 19:50:11 -04:00
Mark Borgerding
0fa68b9e50 switched to BenchUtil.h 2009-10-30 19:46:45 -04:00
Benoit Jacob
f975b9bd3e SVD::solve() : port to new API and improvements 2009-10-30 08:51:33 -04:00
Benoit Jacob
6b48e932e9 *port the Cholesky module to the new solve() API
*improve documentation
2009-10-29 21:11:05 -04:00
Benoit Jacob
a2268ca6b3 properly implement BenchTimer on POSIX
(may require a platform check for the clock name on non-linux platforms)
2009-10-29 15:47:56 -04:00
Hauke Heibel
d0562bd473 corrected the computation cost of mean 2009-10-29 19:58:54 +01:00
Hauke Heibel
c70a603e34 added mean() reduction 2009-10-29 19:56:58 +01:00
Gael Guennebaud
e513cc75c4 oops I forgot to include that file in the previous commit (fixing #65) 2009-10-29 14:24:09 +01:00
Gael Guennebaud
541eac0828 fix #65: MatrixBase::nonZero() 2009-10-29 14:22:02 +01:00
Mark Borgerding
7911df4b6e improved selftest for Eigen::Complex -- mainly a documentation of what does not work 2009-10-28 23:22:10 -04:00
Mark Borgerding
288ba155f1 forgot Complex test file 2009-10-28 23:05:18 -04:00
Mark Borgerding
938f03e5cf added many inlines and attempted to fix castable pointer so it works with std::vector 2009-10-28 23:02:18 -04:00
Benoit Jacob
e8dd552257 sync with mainline 2009-10-28 19:06:45 -04:00
Benoit Jacob
2840ac7e94 big huge changes, so i dont remember everything.
* renaming, e.g. LU ---> FullPivLU
* split tests framework: more robust, e.g. dont generate empty tests if a number is skipped
* make all remaining tests use that splitting, as needed.
* Fix 4x4 inversion (see stable branch)
* Transform::inverse() and geo_transform test : adapt to new inverse() API, it was also trying to instantiate inverse() for 3x4 matrices.
* CMakeLists: more robust regexp to parse the version number
* misc fixes in unit tests
2009-10-28 18:19:29 -04:00
Benoit Jacob
6219f9acfa * rename new Quat class to Quaternion, remove existing Quaternion
* add Copyright line for Mathieu
* cast() was broken (compile errors) and needed anyway to be in QuaternionBase
* it's VectorBlock<T,3>, don't pass additional parameter 1, it has different meaning!!
* make it compile with GCC (put 'typename' at the right location)
2009-10-27 15:45:24 -04:00
Mathieu Gautier
611d2b0b1d Quaternion could now map an array of 4 scalars :
new classes :
* QuaternionBase
* Map<Quaternion>
2009-10-27 13:19:16 +00:00
Hauke Heibel
427f8a87d1 Added dox for the new typedefs. 2009-10-27 16:02:36 +01:00
Hauke Heibel
dbaba9019b Added more common typedefs. 2009-10-27 15:57:21 +01:00
Hauke Heibel
7cc9fb5d0a Umeyama is now working with fixed size src and dst points. 2009-10-27 15:29:12 +01:00
Benoit Jacob
1f1c04cac1 sync the documentation examples 2009-10-26 14:37:43 -04:00
Benoit Jacob
44cdbaba4d * make inverse() do a ReturnByValue
* add computeInverseWithCheck
* doc improvements
* update test
2009-10-26 14:16:50 -04:00
Benoit Jacob
07d1bcffda remove 1 useless layer of functions 2009-10-26 12:30:29 -04:00
Benoit Jacob
ec02388a5d big rewrite in Inverse.h
in particular, the API is essentially finalized and the 4x4 case is fixed to be numerically stable.
2009-10-26 11:18:23 -04:00
Hauke Heibel
66fe5a5f36 It is just not that easy and requires more work to get it done right. 2009-10-24 14:48:34 +02:00
Gael Guennebaud
a382963b04 * extend Map to allow the user to specify whether the mapped data
is aligned or not. This is done using the Aligned constant:
  Map<MatrixType,Aligned>::Map(data);
* rename ForceAligned to EnforceAlignedAccess, and update its doc,
  and emphasize this is mainly an internal stuff.
2009-10-23 14:26:14 +02:00
Mark Borgerding
0167f5ef43 added inline to many functions 2009-10-22 23:06:19 -04:00
Benoit Jacob
83a7b7c44c support gcc 3.3 2009-10-22 15:56:10 -04:00
Hauke Heibel
99bb29abcf demeaning with colwise expression 2009-10-22 10:11:26 +02:00
Mark Borgerding
e3d08443dc inlining,all namespace declaration moved to FFT, removed preprocessor definitions, 2009-10-21 20:53:05 -04:00
Benoit Jacob
68d48511b2 move PartialLU to the new API 2009-10-21 17:06:42 -04:00
Mark Borgerding
78a53574b7 merge branches 2009-10-21 10:28:56 -04:00
Mark Borgerding
85f8d1f0c6 renamed 'Traits' to 'Impl', added vim modelines for syntax highlighting 2009-10-21 10:27:17 -04:00
Benoit Jacob
c3180b7ffb MatrixBase:
* support resize() to same size (nop). The case of FFT was another case where that make one's life far easier.
   hope that's ok with you Gael. but indeed, i don't use it in the ReturnByValue stuff.

FFT:
 * Support MatrixBase (well, in the case with direct memory access such as Map)
 * adapt unit test
2009-10-20 23:25:49 -04:00
Benoit Jacob
471b4d5092 handle mark's first commits before he configured his id 2009-10-20 21:53:54 -04:00
Mark Borgerding
902b6dcd6c added Eigen::FFT and
Eigen::Complex
2009-10-20 21:33:48 -04:00
Hauke Heibel
5e3e6ff71a Added Windows support to the BenchTimer. 2009-10-20 22:08:13 +02:00
Mark Borgerding
d9b418bf12 merged eigen2_for_fft into eigen2 mainline 2009-10-20 15:18:01 -04:00
Benoit Jacob
13f31b8daf * make PartialLU avoid to generate inf/nan when given a singular matrix
(result undefined, but at least it won't take forever on intel 387)
* add lots of comments, especially to LU.h
* fix stuff I had broken in Inverse.h
* split inverse test
2009-10-20 00:36:07 -04:00
Benoit Jacob
d1db1352f5 update doc snippets 2009-10-19 17:22:04 -04:00
Benoit Jacob
890bff977e * proper check for Make
* fix documentation of ei_add_test
2009-10-19 15:56:03 -04:00
Benoit Jacob
6c1b91678b kill ei_add_test_multi. Now the macro ei_add_test does all that automatically, by parsing the source file. No risk anymore to specify the wrong number of tests! Also, introduce CALL_SUBTESTX for X=1..10 that allows to port existing code much quicker. And port already the product* and eigensolver* files. 2009-10-19 14:40:35 -04:00
Benoit Jacob
580672ea43 Add new default option EIGEN_SPLIT_LARGE_TESTS and cmake macro ei_add_test_multi.
When enabled, large tests are split into smaller executables.
This needs minimal changes in the unit tests.
Updated the LU test to use it.
2009-10-19 13:29:00 -04:00
Benoit Jacob
9a700c2974 * LU unit test: finally test fixed sizes
* ReturnByValue: after all don't eval to temporary for generic MatrixBase impl
2009-10-19 10:56:37 -04:00
Benoit Jacob
47eeb40380 remove the m_originalMatrix member. Instead, image() now takes the original matrix as parameter. It was the only method to use it anyway. Introduce m_isInitialized. 2009-10-18 15:21:19 -04:00
Benoit Jacob
d71c7f42d3 * useThreshold -> setThreshold
* remove defaultThreshold()
2009-10-18 14:20:14 -04:00
Benoit Jacob
0255f28279 oops, didn't want to commit that 2009-10-18 01:35:07 -04:00
Benoit Jacob
8332c232db big huge changes in LU!
* continue the decomposition until a pivot is exactly zero;
  don't try to compute the rank in the decomposition itself.
* Instead, methods such as rank() use a new internal parameter
  called 'threshold' to determine which pivots are to be
  considered nonzero.
* The threshold is by default determined by defaultThreshold()
  but the user can override that by calling useThreshold(value).
* In solve/kernel/image, don't assume that the diagonal of U
  is sorted in decreasing order, because that's only approximately
  true. Additional work was needed to extract the right pivots.
2009-10-18 00:47:40 -04:00
Gael Guennebaud
7b0c4102fa * add a Make* expression type builder to allow the
construction of generic expressions working
  for both dense and sparse matrix. A nicer solution
  would be to use CwiseBinaryOp for any kind of matrix.
  To this end we either need to change the overall design
  so that the base class(es) depends on the kind of matrix,
  or we could add a template parameter to each expression
  type (e.g., int Kind = ei_traits<MatrixType>::Kind)
  allowing to specialize each expression for each kind of matrix.
* Extend AutoDiffScalar to work with sparse vector expression
  for the derivatives.
2009-10-16 13:22:38 +02:00
Gael Guennebaud
44ba4b1d6d add operator+ scalar to AutoDiffScalar 2009-10-16 11:27:04 +02:00
Benoit Jacob
3c4a025a54 merge 2009-10-15 16:09:43 -04:00
Benoit Jacob
41e942d3fb don't try to finish early 2009-10-15 16:09:17 -04:00
Hauke Heibel
d177c1f3ac Inlining fixes + fixed typo.
Removed ei_assert in presence of static assert.
2009-10-15 21:07:14 +02:00
Gael Guennebaud
1503043981 autodiff:
* fix namespace issue
* simplify Jacobian code
* fix issue with "Dynamic derivatives"
2009-10-15 18:43:15 +02:00
Hauke Heibel
0927ba1fd3 More warning fixes. 2009-10-14 19:55:23 +02:00
Hauke Heibel
c37cfc32b3 Fixed more W4 warnings. 2009-10-14 11:08:00 +02:00
Hauke Heibel
f4661e696e Resize is only defined in Matrix and not in MatrixBase.
I am not sure whether the better fix is to move the resize functions to MatrixBase.
2009-10-14 11:07:11 +02:00
Hauke Heibel
949582c809 Added prod() reduction to the AsciiQuickReference. 2009-10-14 09:40:22 +02:00
Gael Guennebaud
8c37b1b5b7 add missing PartialReduxExpr::coeff(index) function 2009-10-13 14:41:57 +02:00
Gael Guennebaud
1443094072 compilation fix: make the generic template ctor explicit 2009-10-13 09:23:09 +02:00
Gael Guennebaud
2049f742e4 trivial compilation fix 2009-10-13 08:53:01 +02:00
Benoit Jacob
c4ab6a2032 also test that the matrix Q is unitary 2009-10-12 22:33:51 -04:00
Thomas Capricelli
456f7d094d merge with eigen-tip 2009-10-13 01:14:19 +02:00
Hauke Heibel
e5bf72679c Fixed nmake parameter.
Disabled debug_* targets for MSVC_IDE (they already exist).
Removed the make usage message for MSVC_IDE.
2009-10-09 14:09:25 +02:00
Gael Guennebaud
81a70cef5c merge 2009-10-07 14:26:42 +02:00
Gael Guennebaud
8f3e33581e extend the sparse matrix assembly benchmark 2009-10-07 14:25:53 +02:00
Gael Guennebaud
af31345df3 really fix stable norm compilation for older gcc 2009-10-07 14:25:12 +02:00
Gael Guennebaud
075830ddb0 - remove the debug_test_* targets from "all"
(otherwise they are compiled when you simply run
   make in test/ or when enforcing "test" to be part of "all")
- add linking libraries to the debug_test_* targets
2009-10-07 14:22:44 +02:00
Benoit Jacob
9115d33fae the answer to the ultimate question in life is: 664 2009-10-06 11:44:57 -04:00
Benoit Jacob
24e1d3266a merge 2009-10-06 09:27:01 -04:00
Benoit Jacob
80ede36b24 allow arbitrary resulttype, fixes Xuewen's issue, and this stuff is going to get deeply refactored soon anyway. 2009-10-06 09:26:28 -04:00
Gael Guennebaud
4cf7366027 fix compilation in stable norm, move a platform check to the unit tests 2009-10-06 10:24:41 +02:00
Gael Guennebaud
904f35d194 discard vectorization in matrix-vector product when data is not even
aligned on the scalar type size (e.g., for double on 32 bits system without -malign-double)
2009-10-05 17:22:16 +02:00
Benoit Jacob
bb1cc0d092 after all we're not aligning to 8byte boundary
keep most of the changes though as they make the code more extensible
2009-10-05 10:55:42 -04:00
Benoit Jacob
f01a8112d6 merge 2009-10-05 10:11:50 -04:00
Benoit Jacob
d41577819b we were already aligning to 16 byte boundary fixed-size objects that are multiple of 16 bytes;
now we also align to 8byte boundary fixed-size objects that are multiple of 8 bytes.
That's only useful for now for double, not e.g. for Vector2f, but that didn't seem to hurt. Am I missing something? Do you prefer that we don't align Vector2f at all?
Also, improvements in test_unalignedassert.
2009-10-05 10:11:11 -04:00
Hauke Heibel
a7d51435e1 added cygwin specific stuff 2009-10-05 14:00:32 +02:00
Benoit Jacob
a9a9ba8453 remove unneeded stuff 2009-10-05 07:58:53 -04:00
Benoit Jacob
fa9fc1397b next attempt ... introduce EIGEN_CMAKE_RUN_FROM_CTEST, in that case don't EXCLUDE_FROM_ALL 2009-10-05 07:42:31 -04:00
Benoit Jacob
b78b2ede5f finally, the right fix: set CTEST_BUILD_TARGET.
So this is the come-back of btest target, and the default target is empty again.
2009-10-04 20:27:44 -04:00
Benoit Jacob
7956fc49a0 dd first-time tip, to warn against doing "make".
now i think we're reasonably safe.
2009-10-03 22:11:30 -04:00
Thomas Capricelli
5f32088443 use provided $USER if available, let the caller do the update (safer) 2009-10-04 03:35:02 +02:00
Benoit Jacob
d040c9fc9e add INSTALL and message about no need to do "make" 2009-10-03 17:19:14 -04:00
Benoit Jacob
a1d9b76dd5 add debug targets like debug_qr to build a specific test with debug info
remove the btest target, instead just do "make" since anyway we have to let "make" build the tests
2009-10-03 16:50:50 -04:00
Benoit Jacob
3e4cb08054 fix #59, can't EXCLUDE_FROM_ALL the test directory 2009-10-03 15:51:42 -04:00
Hauke Heibel
7d2ca0e05e Added cmake project definitions. 2009-10-02 18:45:24 +02:00
Benoit Jacob
71f19d90d0 forgot to hg add this file 2009-10-02 12:40:19 -04:00
Benoit Jacob
57e1c5b0aa fix permissions 666->660 2009-10-02 11:32:36 -04:00
Benoit Jacob
2b810ea818 finally, actually purge the Main Page 2009-10-02 11:22:47 -04:00
Gael Guennebaud
bcdeb68b63 merge 2009-10-01 13:28:22 +02:00
Gael Guennebaud
9a3cae4655 better fix for (v * v') * v, we still have to find a way to reorder it 2009-10-01 13:27:03 +02:00
Benoit Jacob
3529179376 merge 2009-10-01 07:26:47 -04:00
Hauke Heibel
5409ce1625 Fixed wrong line endings. 2009-10-01 07:20:09 +02:00
Benoit Jacob
3a315fdc9a make Replicate ctor require the exact expected type 2009-09-30 15:47:11 -04:00
Gael Guennebaud
d7a2a37a4c bugfix in the eigenvalue solvers (forgot to resize the eigen vectors) 2009-09-30 16:48:02 +02:00
Benoit Jacob
4b04a9bcfa *add test to prevent future regression 2009-09-29 21:00:13 -04:00
Benoit Jacob
fa65b09661 add outerproduct coeff(int,int) method.
This is needed to make this expression work:
(vec1*vec2.transpose())*vec3
Gael, no objection? Seems to make sense as that's fast.
2009-09-29 20:30:28 -04:00
Thomas Capricelli
977ed615a6 be sure that there's no name clash between NumericalDiff::df and the
original functor df()
2009-09-28 17:45:45 +02:00
Benoit Jacob
457b7cba96 more message improvements, tell the user about building a specific test 2009-09-28 11:08:31 -04:00
Benoit Jacob
eeabd18afc Fix compilation of HouseholderQR and ColPivotingHouseholderQR for non-square fixed-size matrices.
For Colpiv that was just changing MatrixQType to MatrixType in the instantiation of HouseholderSequence.
For HouseholderQR I also re-ported the solve method from Colpiv as there were multiple issues.
2009-09-28 10:49:55 -04:00
Benoit Jacob
67bf7c90c5 * update test to expose bug #57
* update createRandomMatrixOfRank to support fixed size
2009-09-28 09:40:18 -04:00
Thomas Capricelli
7968737247 fix tests : we perform slightly worse because we do use one more function
evaluation in our numericaldiff than what (c)minpack did
2009-09-28 04:13:57 +02:00
Thomas Capricelli
d912034565 fdjac2 is not needed anymore 2009-09-28 03:49:23 +02:00
Thomas Capricelli
d3850641a1 remove some duplicated code LevenbergMarquardt::minimizeNumericalDiff*() by
using the generic Eigen NumericalDiff recently introduced.

LevenbergMarquardt::lmdif1(), which is provided as a convenience method for
people porting code from (c)minpack, is now a static function
2009-09-28 03:26:42 +02:00
Thomas Capricelli
87be19de4a central sheme for numerical diff 2009-09-28 02:55:30 +02:00
Thomas Capricelli
206b5e3972 starting work on a Numerical differenciation module 2009-09-28 02:43:07 +02:00
Thomas Capricelli
a453298322 cleaning doc 2009-09-28 02:42:19 +02:00
Thomas Capricelli
52026eb800 cleaning 2009-09-28 02:35:42 +02:00
Thomas Capricelli
de942a44c2 default argument for _jac in functor operator() : this way, we can use
AutoDiffJacobian::operator()(x,value) exactly as the original functor
2009-09-28 01:28:48 +02:00
Thomas Capricelli
8e8997d403 use dynamic type in functor, as NonLinear only knows about this currently 2009-09-28 01:19:29 +02:00
Thomas Capricelli
bee14ee8e6 use operator() so that to be coherent with eigen AutoDiff functor 2009-09-28 00:32:31 +02:00
Thomas Capricelli
956d65ea63 define a generic functor and makes other ones inherit it 2009-09-28 00:18:14 +02:00
Benoit Jacob
765600458b * bump to 2.90.0 now that it's agreed that we're doing eigen3
---> question: do we change the prefix eigen2/ to eigen3/ now?
       no, better wait until we've also changed the repository name
* more message improvements: "Install Eigen" was unclear as it left
  out other things like the BLAS library
2009-09-27 18:05:54 -04:00
Benoit Jacob
2e4be5a75c improve message 2009-09-27 17:55:58 -04:00
Thomas Capricelli
251c0f45ac remove references to adolc and split tests functions for clarity 2009-09-27 23:54:06 +02:00
Benoit Jacob
92480ffd26 * Introduce make targets btest (build tests), blas (build blas lib), demos (build demos).
* remove EIGEN_BUILD_TESTS and siblings
* add summary at the end of cmake run, hopefully not too verbose
* fix build of quaternion demo
* kill remnants of old binary library option
2009-09-27 17:48:53 -04:00
Benoit Jacob
6b23cb2101 remove the hack we made to allow api.kde.org to generate the dox. Update the error help page. 2009-09-27 11:39:51 -04:00
Hauke Heibel
e115fa3cea Ok, too many class bodies - it was only required for ei_svd_precondition_2x2_block_to_be_real. 2009-09-27 17:18:19 +02:00
Hauke Heibel
3c74d6b7d4 Added private, non-implemented assignment operators to functions that don't need them (fixes VC warning on /W4). 2009-09-27 17:03:02 +02:00
Hauke Heibel
13545eab9b Fixed VC compilation error on the JacobiSVD module. 2009-09-27 17:00:10 +02:00
Benoit Jacob
924b55e9a9 when copying a ReturnByValue into a MatrixBase, first eval it to a PlainMatrixType.
This allows to limit the number of instantiations of the big template method evalTo.
Also allows to get rid of the dummy MatrixBase::resize().
See "TODO" comment.
2009-09-26 22:48:16 -04:00
Thomas Capricelli
b1637df4f4 update URL for adol-c 2009-09-27 01:56:50 +02:00
Benoit Jacob
e82ab8a5dd move also inverse() to ReturnByValue, by doing a solve on NestByValue<Identity>.
also: adding resize() to MatrixBase was really needed ;)
2009-09-26 11:40:29 -04:00
Gael Guennebaud
c532f42a0e update the sparse tutorial wrt not so recent changes about the filling API 2009-09-25 16:30:44 +02:00
Hauke Heibel
104f9954e1 Removed implicit type conversion (VC warning fix). 2009-09-25 14:58:20 +02:00
Hauke Heibel
21d2533723 Matrix::conservativeResize, resize only when necessary. 2009-09-25 14:44:48 +02:00
Gael Guennebaud
04dc63776a add a wip blas library built on top of Eigen. TODO:
- write extentive unit tests (maybe this already exist in other projects)
- the level2 functions still have to be implemented
2009-09-25 13:08:39 +02:00
Gael Guennebaud
bdf603caec remove some dirty lines 2009-09-25 12:58:41 +02:00
Gael Guennebaud
e12bd2e8d2 extend the support for bool 2009-09-25 12:58:04 +02:00
Hauke Heibel
3d6e4ab879 Uuups that was not yet intended for a commit. 2009-09-25 10:14:16 +02:00
Hauke Heibel
2fbf5ce7df Fixed issue #57. 2009-09-25 10:12:09 +02:00
Benoit Jacob
e58e9c842e remove (changesets), it's enough numbers like this 2009-09-24 15:14:45 -04:00
Benoit Jacob
ddfd23a13e * sort by last name alphabetically
* replace (no own changesets) by (no information)
2009-09-24 13:56:33 -04:00
Thomas Capricelli
68988e4ad0 use -f so that the script is happy even if the log file is not there 2009-09-24 13:26:23 +02:00
Benoit Jacob
905b3b9379 oops, don't append, overwrite 2009-09-24 07:18:31 -04:00
Benoit Jacob
64648b4b35 improvements, especially: automatically flush the server side cache 2009-09-24 07:04:55 -04:00
Benoit Jacob
a279a277e3 fix typo 2009-09-23 21:01:14 -04:00
Benoit Jacob
51db7ab0df add eigen_gen_credits script 2009-09-23 20:58:20 -04:00
Hauke Heibel
b347075936 Removed unnecessary MSVC check. 2009-09-22 10:03:40 +02:00
Benoit Jacob
176c26feb5 allow to do xpr = solve(b) etc... just by adding a dummy MatrixBase::resize() 2009-09-22 01:41:09 -04:00
Benoit Jacob
4f9e270343 * make LU::kernel() and LU::image() also use ReturnByValue
* make them return zero vector in the degenerate case, instead of asserting
  (let's stick to the principle that we only assert on memory errors)
2009-09-22 00:16:51 -04:00
Benoit Jacob
0ad3494bd3 fix docs 2009-09-22 21:51:23 -04:00
Benoit Jacob
ab5cc8284a convert LU::solve() to the new API 2009-09-22 20:58:29 -04:00
Benoit Jacob
c1c780a94f * ReturnByValue:
-- simpplify by removing the 2nd template parameter
  -- rename Functor to Derived, as now it's a usual CRTP
* Homogeneous:
  -- in products, honor the Max sizes etc.
2009-09-22 12:20:45 -04:00
Hauke Heibel
c6822d6723 Added EIGEN_REF_TO_TEMPORARY define for rvalue support.
Allowed VC10 to make use of static_assert.
2009-09-21 19:59:58 +02:00
Benoit Jacob
1df54e3ac2 fix bug #42, add missing Transform::Identity() 2009-09-19 19:59:49 -04:00
Benoit Jacob
828a79ac78 allow to override EIGEN_RESTRICT, to satisfy a smart ass blogger who claims
that eigen2 owes all its performance to nonstandard restrict keyword.
well, this can also improve portability in case some compiler doesn't have __restrict.
2009-09-19 19:45:58 -04:00
Benoit Jacob
3c780fd472 add legalese 2009-09-20 06:00:39 -04:00
Benoit Jacob
bd491a89c5 add demo of how to mix eigen with C code 2009-09-20 05:49:17 -04:00
Gael Guennebaud
0b60027f3c implement __gnuc_forget_about_setZero_its_over_now 2009-09-18 15:36:05 +02:00
Benoit Jacob
6b5f96cb03 undef B0 2009-09-19 19:14:28 -04:00
Gael Guennebaud
3b5a9acba8 fix stable_norm unit test 2009-09-18 11:41:38 +02:00
Benoit Jacob
0b426ea00d update page to explain how to get rid of it 2009-09-18 22:01:49 -04:00
Gael Guennebaud
add5381be7 finish my evalToDense => evalTo change 2009-09-17 23:51:16 +02:00
Gael Guennebaud
5ba7fe3bee clean the commented asm instructions because now I'm sure
the previous fix is ok
2009-09-17 23:34:00 +02:00
Gael Guennebaud
f2737148b0 merge 2009-09-17 23:21:48 +02:00
Benoit Jacob
760636a237 fix bug #52: Transform::inverse() should return a Transform 2009-09-18 18:45:45 -04:00
Gael Guennebaud
9395326e44 fix #53: performance regression, hopefully I did not resurected another
perf. issue...
2009-09-17 23:18:21 +02:00
Gael Guennebaud
fcae32cc3f compilation fixes 2009-09-17 15:11:13 +02:00
Gael Guennebaud
24950bdfcb make ColPivotingQR use HouseholderSequence 2009-09-16 15:56:20 +02:00
Gael Guennebaud
49dd5d7847 * add a HouseholderSequence class (not good enough yet for Triadiagonalization and HessenbergDecomposition)
* rework a bit AnyMatrixBase, and mobe it to a separate file
2009-09-16 14:35:42 +02:00
Gael Guennebaud
77f858f6ab improve ComplexShur api and doc 2009-09-16 14:34:08 +02:00
Benoit Jacob
a4fd0aa25b * fix bug in col-pivoting qr, forgot to swap the colNorms when swapping cols
* add Gael a copyright line
2009-09-16 14:19:59 -04:00
Benoit Jacob
46be9c9ac1 * fix super nasty bug: vector.maxCoeff(&index) didn't work when 'vector'
was a row-vector. Fixed by splitting the vector version from the matrix version.
* add unit test, the visitors weren't covered by any test!!
2009-09-16 14:18:30 -04:00
Gael Guennebaud
4a6e5694d6 disable warning 279: controlling expression is constant for ICC 2009-09-15 13:03:24 +02:00
Gael Guennebaud
9e9abab2b9 bugfixes for ICC (compilation and runtime) 2009-09-15 11:53:24 +02:00
Gael Guennebaud
432fcefcb1 fix warning with gcc 4.2 2009-09-15 11:27:31 +02:00
Gael Guennebaud
d5319f4ba8 fix warning in stable norm 2009-09-15 11:16:58 +02:00
Thomas Capricelli
7a8ec4ba26 cleaning 2009-09-15 04:17:11 +02:00
Thomas Capricelli
6d8baa757e fix indentation (and only that) 2009-09-14 23:47:44 +02:00
Mark Borgerding
a39de276a9 added the test case for FFTW 2009-09-14 01:52:26 -04:00
Thomas Capricelli
ab88ba6f7f provide some default values for important results. So that we can read them
even before *Init() and do no get random values.
2009-09-13 23:55:08 +02:00
Thomas Capricelli
8c3f7d8e94 cleaning 2009-09-13 01:44:34 +02:00
Thomas Capricelli
8b84c3733a functors need not be const 2009-09-11 20:50:01 +02:00
Jitse Niesen
9f14d72927 Remove no-op statement in AlignedVector3. 2009-09-10 09:22:06 +01:00
Thomas Capricelli
72746838ad merge with tip 2009-09-10 02:37:08 +02:00
Jitse Niesen
2a6db40f10 Re-factor matrix exponential.
Put all routines in a class. I think this is a cleaner design.
2009-09-08 14:51:34 +01:00
Jitse Niesen
220ff54131 Fix LaTeX error in doxygen comment. 2009-09-08 14:41:54 +01:00
Hauke Heibel
26ed19e4cf Fixed if clause. 2009-09-08 10:20:26 +02:00
Hauke Heibel
3a2499fb11 Fixed conservative_resize compilation errors. 2009-09-08 10:02:19 +02:00
Hauke Heibel
e6cac85333 Added missing casts. 2009-09-08 08:27:18 +02:00
Hauke Heibel
437a79e1ab Fixed unit test and improved code reusage for resizing. 2009-09-07 17:48:42 +02:00
Hauke Heibel
e49236bac6 Ups - that was not intended to be part of the commit. 2009-09-07 17:23:29 +02:00
Hauke Heibel
64095b6610 Changed the AnyMatrixBase / ei_special_scalar_op inheritance order as proposed by Gael.
Added conservativeResizeLike as discussed on the mailing list.
2009-09-07 17:22:01 +02:00
Gael Guennebaud
8f4bf4ed7f add optional compile flags to enable coverage testing 2009-09-07 14:05:27 +02:00
Gael Guennebaud
ae1d1c8f6c improve coverage of matrix-vector product 2009-09-07 14:04:56 +02:00
Gael Guennebaud
fb5f546161 improve coverage of unitOrthogonal 2009-09-07 12:53:25 +02:00
Gael Guennebaud
b56bb441dd add a stable_norm unit test 2009-09-07 12:46:16 +02:00
Gael Guennebaud
bdcc0bc157 fix compilation of swap for ICC 2009-09-07 11:37:41 +02:00
Gael Guennebaud
a921292381 uncomment stuff commented for debugging (sorry for the noise) 2009-09-07 11:26:20 +02:00
Gael Guennebaud
225ec02b06 fix another .stride() issue in Cholmod support 2009-09-07 11:15:38 +02:00
Gael Guennebaud
61fe2b6a56 bug fix in SuperLU support: the meaning of Matrix::stride() changed for vectors 2009-09-07 10:55:33 +02:00
Jitse Niesen
5eea8f1824 Typos in tutorial 1. 2009-09-05 19:46:33 +01:00
Gael Guennebaud
e4f94b8c58 enable ILU in super LU only if the super version supports it 2009-09-04 18:19:34 +02:00
Thomas Capricelli
e3db42611b fix warning about unused variable 2009-08-29 02:48:44 +02:00
Thomas Capricelli
982a146a67 merge with tip 2009-08-29 02:47:12 +02:00
Thomas Capricelli
c990938415 eigenization of fcn_chkder + bugfix 2009-08-29 02:46:19 +02:00
Thomas Capricelli
c1265ebbfe fix bounds using c standard instead of fortran's 2009-08-29 02:36:13 +02:00
Thomas Capricelli
4f7daf942c fix indentation for fcn_chkder 2009-08-29 02:30:18 +02:00
John Smith
aacada1662 Force release builds on Windows machines in the test suite.
Added an IGNORE_CVS flag to the test suite (allows submitting local and modified repositories).
Fixed the EI_OFLAG for MSVC.
2009-08-28 20:14:18 +02:00
John Smith
227f6cbce0 Fixed SSE related build warning on 64-bit windows systems. 2009-08-28 00:05:44 +02:00
Jitse Niesen
76fa46c6db Typos in tutorial 2009-08-26 18:53:56 +01:00
Thomas Capricelli
16d08b2b0f check number of evaluation even in the case of *1(), now we have it.. 2009-08-26 14:47:10 +02:00
Thomas Capricelli
458947af5e move Parameters as a class member, simplify calling sequence. Convenience
methods from minpack ( "*1()" ) get their original name back : they are
only useful when porting, anyway. Still, i prefer to keep them.
2009-08-26 14:23:05 +02:00
Thomas Capricelli
c1be96967e remove printfs, they are of no use and may prevent compilation 2009-08-26 01:09:23 +02:00
Gael Guennebaud
3705498721 add coeff(int,int), coeff(int) and value() functions to the inner product specialization 2009-08-26 00:24:22 +02:00
Thomas Capricelli
a1e9e8d082 merge with tip 2009-08-25 23:49:48 +02:00
Thomas Capricelli
6de3f5f0e7 cleaning 2009-08-25 23:47:22 +02:00
Thomas Capricelli
eac9293449 split every algorithm in *Init() + while(running) { *OneStep() } 2009-08-25 23:43:33 +02:00
Thomas Capricelli
bbd44ef0ad move more stuff into Parameters 2009-08-25 23:37:27 +02:00
Thomas Capricelli
a2abb4afb6 cleaning 2009-08-25 23:26:36 +02:00
Thomas Capricelli
baec4f39ab reduce local variables so that we can split algorithms 2009-08-25 22:49:05 +02:00
Thomas Capricelli
be368c33bb cleaning 2009-08-25 22:15:09 +02:00
Thomas Capricelli
470ea55834 put nfev/njev as internal variables as well 2009-08-25 22:13:08 +02:00
Thomas Capricelli
41b6ea81db oops... fixing return values, some copy/paste was done far too quickly 2009-08-25 22:06:58 +02:00
Thomas Capricelli
3bca4bba87 if mode==2, the user is supposed to supply diag: do some basic check. 2009-08-25 22:02:19 +02:00
Thomas Capricelli
fa0183e7c7 make diag be an internal variable too 2009-08-25 21:59:10 +02:00
Thomas Capricelli
e465ea82e1 define and use struct Parameters 2009-08-25 21:50:01 +02:00
Thomas Capricelli
d13bcdc891 those are actually bools 2009-08-25 20:01:30 +02:00
Thomas Capricelli
84f2c451e5 cleaning 2009-08-25 19:57:42 +02:00
Thomas Capricelli
d38d4709bc use an enum for status reporting 2009-08-25 19:48:53 +02:00
Thomas Capricelli
d0a5da95b1 fix installation for recently added files 2009-08-25 18:57:59 +02:00
Thomas Capricelli
d59cc0ad82 merge files 2009-08-25 17:25:56 +02:00
Thomas Capricelli
493c72ac38 rename files 2009-08-25 17:21:16 +02:00
Thomas Capricelli
858acfcc64 remove the boring, old-school nprint option, we'll have a dedicated
method for 'one iteration' anyway.
2009-08-25 17:11:14 +02:00
Thomas Capricelli
613a464320 cleaning 2009-08-25 16:48:09 +02:00
Thomas Capricelli
6c1a9703b1 move most of results vectors/matrices inside respective classes. 2009-08-25 16:08:09 +02:00
Thomas Capricelli
38fc6c8553 cleaning 2009-08-25 14:28:19 +02:00
Thomas Capricelli
201f58e528 merge both c methods lmstr/lmstr1 into one class
LevenbergMarquardtOptimumStorage with two methods.
2009-08-25 14:18:38 +02:00
Thomas Capricelli
3f1b81e129 merge both c methods lmdif/lmdif1 into one class
LevenbergMarquardtNumericalDiff with two methods.
2009-08-25 14:09:19 +02:00
Thomas Capricelli
a736378331 cleaning 2009-08-25 14:03:30 +02:00
Thomas Capricelli
d880e6f774 merge both c methods hybrj1/hybrj into one class HybridNonLinearSolver with
two methods. hybrd stuff renamed to HybridNonLinearSolverNumericalDiff.
2009-08-25 13:56:25 +02:00
Thomas Capricelli
a043708e87 merge both c methods hybrd/hybrd1 into one class HybridNonLinearSolver with
two methods.
2009-08-25 13:48:25 +02:00
Thomas Capricelli
602b13815f merge both c methods lmder/lmder1 into one class LevenbergMarquardt with
two methods.
2009-08-25 13:40:45 +02:00
Thomas Capricelli
86cb9364c9 clean fortran stuff in fdjac* 2009-08-24 21:53:08 +02:00
Thomas Capricelli
45442b8d41 some more work on Functors 2009-08-24 21:48:22 +02:00
Benoit Jacob
191d5275a7 modernize HouseholderQR too, uniformize all that stuff, update tests 2009-08-24 13:46:14 -04:00
Thomas Capricelli
15d2c3af90 playing with functors 2009-08-24 19:45:35 +02:00
Thomas Capricelli
6f567f10be cleaning 2009-08-24 19:19:30 +02:00
Jitse Niesen
7e4bd70157 Fix comment which may cause latex to hang when generating docs 2009-08-24 18:01:18 +01:00
Gael Guennebaud
078a870a87 fix issue #43 2009-08-24 18:56:27 +02:00
Thomas Capricelli
4e62e29869 cleaning covar 2009-08-24 17:49:37 +02:00
Thomas Capricelli
17905c7399 eigenization of ei_covar() 2009-08-24 17:47:35 +02:00
Benoit Jacob
0eb142f559 bring the modern comfort also to ColPivotingHouseholderQR
+ some fixes in FullPivotingHouseholderQR
2009-08-24 11:11:41 -04:00
Benoit Jacob
3288e5157a finally, the correct way of dealing with zero matrices in solve() 2009-08-24 10:51:07 -04:00
Thomas Capricelli
f69869c42a covar : cleaning, removing goto's 2009-08-24 16:49:38 +02:00
Thomas Capricelli
312ab1abb3 further cleaning/ goto removing 2009-08-24 16:39:49 +02:00
Thomas Capricelli
92a5bb4539 clean debug stuff 2009-08-24 16:14:42 +02:00
Thomas Capricelli
c6d7da6edc cleaning some more 2009-08-24 16:08:13 +02:00
Thomas Capricelli
63071ac968 cleaning, removing goto's, uniformization (try to reduce diff between
hybr[dj].h  or lm[der,dif,str].h as much as possible), for future merging.
2009-08-24 16:05:57 +02:00
Benoit Jacob
b8106e97b4 add logAbsDeterminant()
move log and exp functors from Array to Core
update documentation
2009-08-24 09:46:17 -04:00
Thomas Capricelli
91a2145cb3 clean, remove goto's 2009-08-24 15:32:06 +02:00
Thomas Capricelli
d4968cd059 cleaning, fixing most goto's 2009-08-24 15:13:12 +02:00
Thomas Capricelli
e65a7c7c70 misc cleaning 2009-08-24 09:28:29 +02:00
Thomas Capricelli
6e41f15fea use a local variable for qrfac 2009-08-24 09:13:06 +02:00
Thomas Capricelli
dff5135026 merge with head 2009-08-24 08:55:27 +02:00
Thomas Capricelli
88f5d06b08 move ipvt/fortran fixing deeper 2009-08-24 08:45:06 +02:00
Thomas Capricelli
950eb4a254 various cleaning and homogeneization 2009-08-24 08:28:31 +02:00
Benoit Jacob
f31b5a7114 add test for absDeterminant() 2009-08-24 00:35:42 -04:00
Benoit Jacob
c9a307f330 give FullPivotingHouseholderQR all the modern comfort 2009-08-24 00:23:35 -04:00
Benoit Jacob
154bdac9f4 small improvements 2009-08-24 00:09:01 -04:00
Benoit Jacob
b37ab9b324 clarifications in LU::solve() and in LU documentation 2009-08-24 00:02:49 -04:00
Benoit Jacob
0926549659 fix bug: with complex matrices, the condition (ei_imag(c0)==RealScalar(0)) being wrong could bypass the other condition in the &&.
at least that's my explanation why the test_lu was often failing on complex matrices (it uses that via createRandomMatrixOfRank)
and why that's fixed by this diff.
also gcc 4.4 gave a warning about tailSqNorm potentially uninitialized
2009-08-24 00:02:22 -04:00
Benoit Jacob
d38624b1ad merge 2009-08-23 18:05:33 -04:00
Benoit Jacob
97bc1af1f1 add ColPivotingHouseholderQR
rename RRQR to fullPivotingHouseholderQR
2009-08-23 18:04:33 -04:00
Gael Guennebaud
47fda1f3b2 hm, forgot to conjugate the arguments in applyJacobiOnTheLeft 2009-08-24 00:01:02 +02:00
Gael Guennebaud
e86dbd5255 fix apply Jacobi for complexes and add documentation for some *Jacobi* functions 2009-08-23 23:49:44 +02:00
Benoit Jacob
a848ed02ad let createRandomMatrixOfRank support fixed-size! 2009-08-23 17:33:31 -04:00
Thomas Capricelli
930651ff9a dogleg, lmpar : use more eigen features 2009-08-23 21:52:39 +02:00
Thomas Capricelli
4958c53bfb trivial fixes 2009-08-23 21:47:55 +02:00
Thomas Capricelli
5e8dee7a19 eigenize dogleg() 2009-08-23 21:39:47 +02:00
Thomas Capricelli
f793dbe45c only indentation fixes (this eases porting) 2009-08-23 21:06:57 +02:00
Thomas Capricelli
feb5af3ede porting lmpar() to eigen : both api and some of the code 2009-08-23 21:04:55 +02:00
Thomas Capricelli
9a8c5cbd2c misc cleaning 2009-08-23 06:16:05 +02:00
Thomas Capricelli
264e61932c cleaning fdjac*() 2009-08-23 06:04:06 +02:00
Thomas Capricelli
f01332043b only indentation 2009-08-23 05:56:12 +02:00
Thomas Capricelli
8b9b671e83 some eigenization in main algorithms 2009-08-23 05:55:43 +02:00
Thomas Capricelli
134dea76d3 beautify functors for lmdif, lmstr, hybrj, hybrd 2009-08-23 04:57:48 +02:00
Thomas Capricelli
acd757737a beautify Functor for lmder : we now have f,df,debug methods 2009-08-23 04:39:22 +02:00
Thomas Capricelli
878f15b8a5 * use eigen object for callbacks for hybrd and lmdif
* use Functor instead of argument for ei_fdjac*()
2009-08-23 04:06:16 +02:00
Thomas Capricelli
f2fcbb0207 use eigen objects for ei_fdjac*(), this is a prerequisite before porting
hybrd/lmdif..
2009-08-23 03:54:40 +02:00
Thomas Capricelli
8a27e774f8 use eigen objects for hybrj and lmstr 2009-08-23 03:14:42 +02:00
Thomas Capricelli
3251e12258 use eigen objects for the lmder callback 2009-08-23 03:02:03 +02:00
Thomas Capricelli
2727099906 remove redundant code, fix bounds in those loops that still come from
fortran
2009-08-23 02:32:08 +02:00
Thomas Capricelli
1225704753 we do not need/use the 'void *p' parameter 2009-08-23 01:59:20 +02:00
Jitse Niesen
90735b6a9c Rewrite tutorial section on solving linear systems 2009-08-22 20:12:47 +01:00
Benoit Jacob
37dede6077 fix typo 2009-08-22 10:40:39 -04:00
Thomas Capricelli
a3e8a14e3a forgot to clean this one 2009-08-22 07:40:43 +02:00
Thomas Capricelli
c5218c7d38 ei_lmpar : use a reference for the parameter 2009-08-22 07:37:23 +02:00
Thomas Capricelli
b3f8d02df4 use const for machine constants 2009-08-22 07:31:14 +02:00
Thomas Capricelli
bb6ffafdb9 keep on cleaning f2c mess 2009-08-22 07:27:17 +02:00
Thomas Capricelli
a35586504e cleaning f2c mess / trivial stuff 2009-08-22 07:14:17 +02:00
Benoit Jacob
7bedf5e9cb add initial, rough, full-pivoting RRQR decomposition
lots of room for improvement!
and add Gael a (c) line in Householder.h
2009-08-22 01:13:21 -04:00
Thomas Capricelli
93fabbff5e use blueNorm() instead of norm() 2009-08-22 07:05:10 +02:00
Thomas Capricelli
aa3a7c3303 raw import of covar() : this is the last one, and we now do not depend on
the cminpack library anymore.
2009-08-22 06:44:41 +02:00
Thomas Capricelli
16061a46db Now that the main algorithms are imported into eigen, we import subroutines
used by those algorithms (aka "second level").

This is a row import : we copy/paste the files from cminpack and make
very few changes :
* template<Scalar> them (replace double)
* dpmpar() replaced by c++ standard equivalent
* abs/fabs/sqrt/min/max replaced by ei_* or std::*
* use eigen norms instead of enorm()

Important Notes:
* The use of stableNorm() was not enough in some cases, but using
  blueNorm() instead fixed the problems (some tests gave bad results,
  either in number of iterations or precision of the results)
* As a whole, the only test that changed is testNistMGH17() : it now takes
  some few steps less to get the same result. So this is a small improvement.

After this commit, the only remaining dependency from the cminpack
static library is 'covar', only used from the tests.
2009-08-22 06:40:22 +02:00
Thomas Capricelli
783f355904 cleaning defines from f2c (use std::min and such instead of own ones) 2009-08-22 05:32:17 +02:00
Thomas Capricelli
11c3762068 cleaning : removing #define, use std:min() and such 2009-08-22 05:29:33 +02:00
Marcus D. Hanwell
ef582933c1 Proper fix for linking to the Qt libraries (and others)
My initial fix was incorrect, the libraries must be quoted when being
passed to the add test macro, but must be unquoted when passed to the
target_link_libraries function.
2009-08-21 14:04:17 -04:00
Benoit Jacob
2f0b4e1abc fix compilation with gcc 4.1. Indeed the path for recent gcc doesn't work with gcc 4.1, and looking at the implementation of vector in g++ 4.1, it was exactly our fallback case, so use that. 2009-08-21 12:16:37 -04:00
Peter Román
80179e9549 Added support for SuperLU's ILU factorization 2009-08-21 11:14:45 +02:00
Gael Guennebaud
b0aa2520f1 * add real scalar * complex matrix, real matrix * complex scalar,
and complex scalar * real matrix overloads
* allows the inner and outer product specialisations to mix real and complex
2009-09-04 11:22:32 +02:00
Gael Guennebaud
6902ef0824 extend mixingtype test to check diagonal products and fix the later for real*complex products 2009-09-04 10:17:28 +02:00
Gael Guennebaud
a7ed998d52 bug fix in novel makeGivens for real 2009-09-04 10:05:22 +02:00
Gael Guennebaud
3fbf71d6b9 compilation fix for conservativeResize 2009-09-04 09:26:00 +02:00
Gael Guennebaud
68b28f7bfb rename the EigenSolver module to Eigenvalues 2009-09-04 09:23:38 +02:00
Hauke Heibel
7f5256f628 Added conservativeResize + unit test. 2009-09-03 17:27:51 +02:00
Gael Guennebaud
82ad37c730 implement the continuous generation algorithm of Givens rotations by Anderson (2000) 2009-09-03 17:08:38 +02:00
Hauke Heibel
41aea9508e This seems to be important for MSVC to optimize the size of empty base classes. 2009-09-03 13:46:44 +02:00
Gael Guennebaud
3eb37fe1fb update mixingtype unit test to reflect current status, but it is still clear
we should allow matrix products between complex and real ?
2009-09-03 13:03:26 +02:00
Gael Guennebaud
00f4b46908 typo in sqrt(complex) 2009-09-03 11:50:06 +02:00
Gael Guennebaud
a54b99fa72 move eigen values related stuff of the QR module to a new EigenSolver module.
- perhaps we can find a better name ?
- note that the QR module still includes the EigenSolver module for compatibility
2009-09-03 11:39:44 +02:00
Gael Guennebaud
9515b00876 remove the \addexample tags 2009-09-03 11:22:42 +02:00
Gael Guennebaud
16c7b1daab add examples for makeJacobi and makeGivens 2009-09-03 11:17:16 +02:00
Gael Guennebaud
c893917d65 Fix serious bug discovered with gcc 4.2 2009-09-03 10:45:32 +02:00
Hauke Heibel
8d449bd80e Removed debug cout.
Disabled MSVC inconsistent DLL linkage.
2009-09-02 21:23:09 +02:00
Hauke Heibel
e6c9d6c528 Remove last lazyness warnings. 2009-09-02 20:59:57 +02:00
Hauke Heibel
2abd5eeffd Added support to overwrite the generator type.
Eigen'fied the new variables.
2009-09-02 20:57:41 +02:00
Benoit Jacob
7aa6fd3625 big reorganization in JacobiSVD:
- R-SVD preconditioning now done with meta selectors to avoid compiling useless code
- SVD options now honored, with options to hint "at least as many rows as cols" etc...
- fix compilation in bad cases (rectangular and fixed-size)
- the check for termination is now done on the fly, no more goto (should have done that earlier!)
2009-09-03 02:53:51 -04:00
Benoit Jacob
89557ac41d introduce EIGEN_SIZE_MIN
now we should check if some EIGEN_ENUM_MIN usage needs to be replaced by that... potential bug when using mixed-size matrice
2009-09-03 02:50:42 -04:00
Benoit Jacob
7d18c30641 finally the first version was the good one... 2009-09-03 01:25:40 -04:00
Gael Guennebaud
7586f7f706 fix #51 (bad use of std::complex::real) 2009-09-02 15:18:11 +02:00
Gael Guennebaud
b83654b5d0 * rename JacobiRotation => PlanarRotation
* move the makeJacobi and make_givens_* to PlanarRotation
* rename applyJacobi* => apply*
2009-09-02 15:04:10 +02:00
Gael Guennebaud
496ea63972 fix wrong assert 2009-09-02 14:08:33 +02:00
Gael Guennebaud
4a8258369a much simpler fix for Matrix::swap 2009-09-02 13:37:15 +02:00
Benoit Jacob
ec20d58317 * add serious unit test for swap
* fix my stupidity in Matrix::swap()
2009-09-02 16:56:48 -04:00
Benoit Jacob
cc375e2f79 merge 2009-09-02 06:37:41 -04:00
Benoit Jacob
e6b77bcc6b JacobiSVD: implement general R-SVD using full-pivoting QR, so we now support any rectangular matrix size by reducing to the smaller of the two dimensions (which is also an optimization) 2009-09-02 06:36:55 -04:00
Benoit Jacob
c16d65f015 fix compilation errors in swap (could not swap with anything else than the exact same Matrix type) 2009-09-02 06:35:01 -04:00
Hauke Heibel
59f5bce41c fix issue #49 2009-09-01 23:15:30 +02:00
Hauke Heibel
05ddd32849 added missing JacobiRotation's ... 2009-09-01 23:12:40 +02:00
Gael Guennebaud
5b8ffa4d46 clean a bit the previous commit which came from a patch queue,
and since it was my first try of the patch queue feature I did not
managed to apply it with a good commit message, so here you go:
* Add a ComplexSchur decomposition class built on top of HessenbergDecomposition
* Add a ComplexEigenSolver built on top of ComplexSchur
There are still a couple of FIXME but at least they work for any reasonable matrices,
still have to extend the unit tests to stress them with nasty matrices...
2009-09-01 16:35:23 +02:00
Gael Guennebaud
4d91229bdc [mq]: eigensolver 2009-09-01 16:20:56 +02:00
Gael Guennebaud
67ccc6b851 I've been too fast (again) 2009-09-01 13:44:21 +02:00
Gael Guennebaud
1e7a9ea70a fix issue #47: now m.noalias() = XXX properly resize m if needed 2009-09-01 13:35:44 +02:00
Gael Guennebaud
8392373d96 add a JacobiRotation class wrapping the cosine-sine pair with
some convenient features (transpose, adjoint, product)
2009-09-01 13:18:03 +02:00
Jitse Niesen
32f95ec267 Bug fix in MatrixExponential.h
Initialize matrices for intermediate results to correct dimension
2009-09-01 10:50:54 +01:00
Benoit Jacob
6e4e94ff32 * JacobiSVD:
- support complex numbers
 - big rewrite of the 2x2 kernel, much more robust
* Jacobi:
 - fix weirdness in initial design, e.g. applyJacobiOnTheRight actually did the inverse transformation
 - fully support complex numbers
 - fix logic to decide whether to vectorize
 - remove several clumsy methods

fix for complex numbers
2009-08-31 22:26:15 -04:00
Benoit Jacob
29c6b2452d simplifications 2009-08-31 22:09:44 -04:00
Benoit Jacob
5339db6164 add VERIFY_IS_UNITARY 2009-08-31 22:08:43 -04:00
Gael Guennebaud
a16599751f fix Matrix::stride for vectors, add a unit test for Block::stride
and make use of it where it was relevant
2009-08-31 17:39:56 +02:00
Hauke Heibel
ab6eb6a1a4 Adaptions from .lazy() towards .noalias().
Added missing casts.
2009-08-31 17:29:37 +02:00
Hauke Heibel
bc7aec0ef5 ifdef removed from MapBase and warning disabled 2009-08-31 17:24:38 +02:00
Gael Guennebaud
095809edda fix issue #45 and document the .data() and .stride() functions 2009-08-31 17:07:54 +02:00
Gael Guennebaud
27c9ecc50f fix copy/paste issue 2009-08-31 16:41:13 +02:00
Hauke Heibel
0a0a805569 Fixed a cast warning in scaleAndAddTo.
Fixed lazyness in umeyama.
Added a few missing casts.
2009-08-31 15:34:57 +02:00
Hauke Heibel
32a9aee286 Added MSVC guards to assignment operators. 2009-08-31 14:40:53 +02:00
Hauke Heibel
99bfab6dcf Removed redundant assignment operators. 2009-08-31 13:47:32 +02:00
Gael Guennebaud
9005eb0788 compilation fix in AmbiVector<int> 2009-08-31 09:32:46 +02:00
Thomas Capricelli
20480a5438 merging ei_lmdif() and lmdif_template() 2009-08-21 04:24:59 +02:00
Thomas Capricelli
2e3d17c3ce be (hopefully) smarter with indices convention : we keep the c convention
(0->n-1) as much as possible, and only convert at borders with
fortran-expecting methods, that will eventually dissapear.
2009-08-21 04:16:37 +02:00
Thomas Capricelli
524e112ee5 merging ei_lmstr() and lmstr_template() 2009-08-21 03:41:19 +02:00
Thomas Capricelli
e48b6ad905 merging ei_hybrj() and hybrj_template() 2009-08-21 03:26:28 +02:00
Thomas Capricelli
f2ff0d3903 merging ei_hybrd() and hybrd_template() 2009-08-21 03:13:42 +02:00
Thomas Capricelli
1715e2cb3b merging ei_lmder and lmder_template into ei_lmder() which takes eigen
argument, but still uses f2c code inside.
2009-08-21 02:34:40 +02:00
Thomas Capricelli
6a8b52b3aa simplifying 2009-08-21 01:24:04 +02:00
Thomas Capricelli
0abb148b7d use ei_sqrt instead of sqrt 2009-08-21 00:27:11 +02:00
Thomas Capricelli
1ad042c981 rename i__ to i. i really wonder how f2c can produce such things 2009-08-21 00:26:37 +02:00
Thomas Capricelli
9294d33a11 use references intead of pointers for njev/nfev 2009-08-21 00:23:26 +02:00
Thomas Capricelli
054652b789 use math function adapted to the Scalar type instead of hardcoding float or
double
2009-08-21 00:04:41 +02:00
Thomas Capricelli
d05af200a5 some more trivial fixes to f2c generated code 2009-08-20 23:56:13 +02:00
Thomas Capricelli
9e71c2827a nothing more than indentation fixes (using vim '=' command) 2009-08-20 23:36:03 +02:00
Thomas Capricelli
b1e0662785 cleaning f2c mess 2009-08-20 23:33:45 +02:00
Thomas Capricelli
275a658ec5 porting chkder to eigen 2009-08-20 23:26:40 +02:00
Thomas Capricelli
2e3fa34b9f cleaning a little bit the f2c mess for chkder 2009-08-20 23:07:16 +02:00
Thomas Capricelli
b09ebe01da * porting lmdif1 to eigen
* qtf was missing in lmdif signature (this is an output of the method)
2009-08-20 22:59:09 +02:00
Thomas Capricelli
8d2f6ad7e1 iwa is not really an argument, but just an old fashioned 'work array' :
remove it from the eigen API
2009-08-20 22:46:38 +02:00
Thomas Capricelli
b423e640a6 porting hybrj1 to eigen 2009-08-20 22:41:56 +02:00
Thomas Capricelli
6027c4bedf porting hybrd1 to eigen 2009-08-20 22:36:24 +02:00
Thomas Capricelli
de7d14b2b3 porting lmstr1 to eigen 2009-08-20 22:16:30 +02:00
Thomas Capricelli
980c40f72c porting lmder1 to eigen (no more wrapper) 2009-08-20 22:09:05 +02:00
Thomas Capricelli
a84dc9a5c1 coherency for scalar typename : use "Scalar" everywhere 2009-08-20 21:10:28 +02:00
Thomas Capricelli
df98e66019 oops fix hardcoded typename, which is actually provided as template
parameter
2009-08-20 21:06:26 +02:00
Thomas Capricelli
9a876806e1 use eigen stableNorm() instead of cminpack 'enorm'. The results are mostly
slightly better in tests (one test needs 15 iterations intead of 16, for
the same result). Some numerical results have improved slightly, too.

If one uses blueNorm() instead, an assert for 'overflow' is raised from
blueNorm()
2009-08-20 21:04:38 +02:00
Benoit Jacob
72b002eab9 work around internal compiler error with gcc 4.1 and 4.2, reported on the forum 2009-08-20 12:19:15 -04:00
Benoit Jacob
c7ae261ac0 adapt to API changes 2009-08-20 01:29:38 -04:00
Thomas Capricelli
6953cad81d remove unneeded "Eigen::", we already 'use' Eigen namespace 2009-08-19 20:06:34 +02:00
Thomas Capricelli
369693aa1c oops, forgot those ones 2009-08-19 20:02:49 +02:00
Thomas Capricelli
01622e9855 use machine precision from eigen instead of the one from cminpack. The test
pass though they would not if using a value of 2.220e-16 (the real value
for 'double' is 2.22044604926e-16). How sensitive those tests are :)
2009-08-19 19:56:51 +02:00
Thomas Capricelli
3093e92da5 machine_epsilon is now called epsilon in latest eigen 2009-08-19 19:53:08 +02:00
Thomas Capricelli
47ac354190 merge with the main dev branch 2009-08-19 19:46:37 +02:00
Thomas Capricelli
fb54cfb013 import main files from cminpack as *.h files:
* function names are changed by appending _template
* it uses basic templating : template<typename T>
* wrappers now use those versions instead of the ones from cminpack
* lot of external methods from cminpack are still used
* tests pass though they are unchanged (they use wrappers)
2009-08-19 18:38:45 +02:00
Thomas Capricelli
703198a1a6 wrapper for chkder() : this was the last wrapper missing 2009-08-19 18:32:37 +02:00
Jitse Niesen
59a0c4a0d2 Add new unsupported modules to doc/unsupported_modules.dox 2009-08-18 15:30:38 +01:00
Jitse Niesen
7262716b78 Correct syntax error in doxygen comment. 2009-08-18 11:09:20 +01:00
Gael Guennebaud
d56be9c128 * make HessenbergDecomposition uses the Householder module
* bugfix in ei_blas_traits for .conjugate().conjugate()
2009-08-17 17:41:01 +02:00
Gael Guennebaud
ff0f005d4c change the make householder algorithm so that the remaining coefficient
is real, and make Tridiagonalization use it
2009-08-17 17:04:32 +02:00
Gael Guennebaud
e125c199bb add EIGEN_TRANSFORM_PLUGIN 2009-08-17 09:16:41 +02:00
Gael Guennebaud
737bed19c1 make HouseholderQR uses the Householder module 2009-08-16 19:22:15 +02:00
Gael Guennebaud
ef13c3e754 add normalize and normalized overloads in AlignedVector3 2009-08-16 11:51:46 +02:00
Gael Guennebaud
5274c5c326 quick update of TopicLazyEvaluation 2009-08-16 11:01:32 +02:00
Gael Guennebaud
fc9480cbb3 bugfix in compute_matrix_flags, optimization in LU,
improve doc, and workaround aliasing detection in MatrixBase_eval snippet
(not very nice but I don't know how to do it in a better way)
2009-08-16 10:55:10 +02:00
Benoit Jacob
ee982709d3 in all decs, make the compute() methods return *this
(implements feature request #18)
2009-08-15 23:12:39 -04:00
Gael Guennebaud
65fe5f76fd rename back MayAliasBit to EvalBeforeAssigningBit 2009-08-16 00:14:05 +02:00
Gael Guennebaud
f5f2b222a3 make SVD reuses applyJacobi 2009-08-16 00:02:36 +02:00
Gael Guennebaud
044dd0c1dd revert previous change in Quaternion::setFromTwoVectors 2009-08-15 23:37:20 +02:00
Benoit Jacob
03c1e79f35 svd: sort in decreasing order, remove unused code 2009-08-15 19:20:48 -04:00
Gael Guennebaud
239ada95b7 add overloads of lazyAssign to detect common aliasing issue with
transpose and adjoint
2009-08-15 22:19:29 +02:00
Benoit Jacob
a3e6047c25 fix and improve docs 2009-08-15 15:29:44 -04:00
Gael Guennebaud
50c703f0c7 As proposed on the list:
- rename EvalBeforeAssignBit to MayAliasBit
- make .lazy() remove the MayAliasBit only, and mark it as deprecated
- add a NoAlias pseudo expression, and MatrixBase::noalias() function
Todo:
- we have to decide whether += and -= assume no aliasing by default ?
- once we agree on the API: update the Sparse module and the unit tests respectively.
2009-08-15 18:35:51 +02:00
Gael Guennebaud
13a8956188 bugfix in inner-product specialization,
compilation fix in stable norm,
optimize apply householder
2009-08-15 13:12:50 +02:00
Gael Guennebaud
7b60713e87 my previous fix was not very good 2009-08-15 11:52:50 +02:00
Gael Guennebaud
0da31a6e1d bugfix and compilation fix in ProductBase 2009-08-15 10:55:11 +02:00
Gael Guennebaud
bff4238d15 fix setFromTwoVectors because of the change in sorting of the the singular values 2009-08-15 10:24:27 +02:00
Gael Guennebaud
109a4f650b fix a couple of warnings 2009-08-15 10:20:01 +02:00
Gael Guennebaud
d2becb9612 add a "rot" benchmark in BTL 2009-08-15 10:19:16 +02:00
Gael Guennebaud
846e8b49ba fix compilation of unit tests 2009-08-15 10:18:05 +02:00
Benoit Jacob
2f027b0d2c only append the changeset to the version if we're in the default branch 2009-08-14 21:58:41 -04:00
Thomas Capricelli
5fe0c30811 new script that update from mercurial, make the doc, and upload the result
to tuxfamily.org
2009-08-15 03:36:37 +02:00
Benoit Jacob
62748a0963 new script to generate and upload the docs for a given branch
needs cleanup by a better shell scripter!!
2009-08-14 19:07:01 -04:00
Benoit Jacob
8372bd12bd update snippet 2009-08-14 20:23:13 -04:00
Benoit Jacob
fe4a86443f fix warning 2009-08-14 20:16:04 -04:00
Benoit Jacob
a5f820b873 forgot to update this 2009-08-14 20:03:14 -04:00
Benoit Jacob
2f74801ca4 as discussed on list: default to align cols, reorganize parameters accordingly so that the default corresponds to 0 flag,
and implement FullPrecision output (non-default).
2009-08-14 16:31:42 -04:00
Benoit Jacob
16abc0ba7f try to support 16 bit platforms... optimistic, but can't hurt 2009-08-14 15:49:14 -04:00
Benoit Jacob
22ae236d4e machine_epsilon -> epsilon as wrapper around numeric_traits 2009-08-14 15:12:32 -04:00
Thomas Capricelli
3f63d6f97f fix BoxBOD in the first case : now all tests pass 2009-08-14 19:04:09 +02:00
Thomas Capricelli
3856c2d84d oops, i missed one : real last difficult nist test : Eckerle4 2009-08-14 18:54:28 +02:00
Thomas Capricelli
93627fefcf last 'hard' test from Nist : ratkowsky3 2009-08-14 18:31:04 +02:00
Thomas Capricelli
91f61f7679 fix bad urls 2009-08-14 17:48:04 +02:00
Thomas Capricelli
d8c671f475 yet another (difficult) Nist test : Thurber 2009-08-14 17:46:28 +02:00
Thomas Capricelli
e057d1ef47 tweak precision for Chwirut2 test 2009-08-14 17:25:39 +02:00
Thomas Capricelli
56127cfb1a add yet another easy Nist test : Chwirut2 2009-08-14 17:21:31 +02:00
Gael Guennebaud
6373c3cd00 oops bis, I forgot that SelfAdjointEigneSolver directly called the selector... 2009-08-14 13:49:29 +02:00
Gael Guennebaud
8abec72259 oops forgot to remove the #include in Core 2009-08-14 09:49:33 +02:00
Gael Guennebaud
13e95f7f68 optimize "apply Jacobi" for small sizes, and move it to Jacobi.h 2009-08-14 00:17:14 +02:00
Benoit Jacob
f2536416da * remove EIGEN_DONT_INLINE that harm performance for small sizes
* normalize left Jacobi rotations to avoid having to swap rows
* set precision to 2*machine_epsilon instead of machine_epsilon, we lose 1 bit of precision
  but gain between 10% and 100% speed, plus reduce the risk that some day we hit a bad matrix
  where it's impossible to approach machine precision
2009-08-13 14:56:39 -04:00
Thomas Capricelli
f7cd4c8923 cleaning : wa1 used in 'covar' needs not be the same as in lmder* and all.
it's just an old-fashioned way to re-use memory without allocation...
2009-08-13 16:29:17 +02:00
Benoit Jacob
76a3089a43 oops, don't set the precision to -1 !! 2009-08-13 09:56:53 -04:00
Benoit Jacob
13b191d94b apply Koldo's workaround for MSVC bug 2009-08-13 09:53:47 -04:00
Gael Guennebaud
1b257a7620 add an optimized "apply in place a rotation in the plane",
and make Jacobi and SelfAdjointEigenSolver use it
=> ~ x1.75 speedup for JacobiSVD and x2 for SelfAdjointEigenSolver
2009-08-13 11:42:02 +02:00
Benoit Jacob
1d80f561ad apply change discussed on the list :
* new default precision "-1" means use the current stream precision
* otherwise, save and restore the stream precision
2009-08-13 22:50:55 -04:00
Benoit Jacob
99802094e4 do without an empirical homemade formula that i wasn't comfortable about...
turns out it's not needed anymore and removing it seems to only increase the precision
2009-08-12 18:30:37 -04:00
Benoit Jacob
6bb6810c5e merge 2009-08-12 18:25:13 -04:00
Benoit Jacob
2b618a2c16 make jacobi SVD more robust after experimenting with very nasty matrices...
it turns out to be better to repeat the jacobi steps on a given (p,q) pair until it
is diagonal to machine precision, before going to the next (p,q) pair. it's also
an optimization as experiments show that in a majority of cases this allows to find out
that the (p,q) pair is already diagonal to machine precision.
2009-08-12 18:23:39 -04:00
Jitse Niesen
f71f878bab Add support for matrix exponential of floats and complex numbers. 2009-08-12 15:44:22 +01:00
Benoit Jacob
309d540d4a add parentheses; hopefully this solves Koldos MSVC compilation issue... 2009-08-12 10:14:15 -04:00
Benoit Jacob
22d65d47d0 finally, the good approach was two-sided Jacobi. Indeed, it allows
to guarantee the precision of the output, which is very valuable.
Here, we guarantee that the diagonal matrix returned by the SVD is
actually diagonal, to machine precision.

Performance isn't bad at all at 50% of the current householder SVD
performance for a 200x200 matrix (no vectorization) and we have
lots of room for improvement.
2009-08-12 02:35:07 -04:00
Thomas Capricelli
7b922eb634 BoxBOD : oops.. shame on me, i did a mistake in the derivative.... now we need 16
iterations instead of 7693 ;-)
the first test still fails though.
2009-08-12 02:34:22 +02:00
Thomas Capricelli
fd307b8f3f fix a bug in BoxBOD Nist test : we now get the actual value for 'start 2'
'start 1' still fails though :/
2009-08-12 02:27:44 +02:00
Thomas Capricelli
3c675609bf add another Nist test of 'hard' difficutly : Bennett5 2009-08-12 02:13:04 +02:00
Thomas Capricelli
54d09a8122 add another Nist test of 'hard' difficutly : MGH09 2009-08-12 01:50:56 +02:00
Benoit Jacob
ce033ebdfe add EIGEN_DEBUG_VAR 2009-08-11 16:12:34 -04:00
Thomas Capricelli
5ac17b4680 add another Nist test of medium difficutly : MGH17 2009-08-11 20:24:02 +02:00
Gael Guennebaud
afbd73b5cd overload operartor* with a ProductBase such that "scalar * (mat * mat)" is optimized
as one could naturally expect
2009-08-11 15:15:06 +02:00
Gael Guennebaud
a4f6642518 fix issue #36 (missing return *this in Rotation2D 2009-08-11 15:11:47 +02:00
Gael Guennebaud
ea884e6f48 remove #include Bidiagonalization, and add missing ";" 2009-08-11 15:08:03 +02:00
Thomas Capricelli
d1bc9144cb wrapper for lmstr1 and lmstr + eigenization of calling tests 2009-08-10 17:37:27 +02:00
Thomas Capricelli
bb1204a145 wrapper for lmdif1 + eigenization of calling test 2009-08-10 17:16:43 +02:00
Thomas Capricelli
80372c18ee wrapper for lmdif (+test call eigenization) 2009-08-10 16:54:53 +02:00
Thomas Capricelli
4a26baa718 wrapper for hybrj1 2009-08-10 16:32:45 +02:00
Thomas Capricelli
1d53ce8d48 wrapper for hybrj 2009-08-10 16:21:22 +02:00
Thomas Capricelli
120235deef add another (actuallY) difficult NIST test : BoxBOD.
The first try fails, the second one passes, but with a very bad accuracy
(~4 digits only).
anyway, my aim is to check we do not change cminpack while portint, so i
keep this test.
2009-08-10 14:11:55 +02:00
Thomas Capricelli
bcfe874968 add another 'difficult'-rated NIST test, which passes 2009-08-10 13:47:18 +02:00
Thomas Capricelli
7d65bd42eb fix testNistHahn1 : i had swapped x[] and y[].... :/ 2009-08-10 13:46:43 +02:00
Thomas Capricelli
b71aa34946 a Nist test rated 'difficult', which passes. 2009-08-10 13:05:30 +02:00
Thomas Capricelli
9b1130b82a another nist test with difficulty 'leverage', it passes. 2009-08-10 12:49:44 +02:00
Thomas Capricelli
7ecbbc9aa4 another nist test with difficulty 'leverage', this one passes 2009-08-10 12:34:51 +02:00
Thomas Capricelli
1045bc17f5 another nist test ('average' difficulty), which fails. It is disabled until
further notice.
2009-08-10 12:08:31 +02:00
Thomas Capricelli
c7a72958ba add an easy test from the NIST set :
http://www.itl.nist.gov/div898/strd/nls/data/misra1a.shtml
2009-08-10 10:47:55 +02:00
Thomas Capricelli
ec2b9f90a3 hybrd : wrapper + eigenize test 2009-08-10 03:39:50 +02:00
Gael Guennebaud
35b4077a5d merge 2009-08-09 23:11:25 +02:00
Gael Guennebaud
ef55e7f4ce make custom asm directive volatile 2009-08-09 23:09:46 +02:00
Benoit Jacob
216ee335ac LinearVectorization: If the destination isn't aligned,
we have to do runtime checks and we don't unroll, so it's only good for large enough sizes
2009-08-09 22:19:12 +02:00
Benoit Jacob
1f1705868b now you can #define EIGEN_DEBUG_ASSIGN, and all the values in ei_assign_traits are printed 2009-08-09 21:35:13 +02:00
Benoit Jacob
527557672a disable the assembly for fast unaligned stores. indeed, there is a strange bug that is triggered by this code:
#include<Eigen/Core>

int main()
{
  Eigen::Matrix4f m;
  m <<1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16;
  m.col(0).swap(m.col(1));
  std::cout << m << std::endl;
}

when the fast unaligned stores are used, the column is copied instead of being swapped.
2009-08-09 20:49:55 +02:00
Benoit Jacob
8e08680119 don't depend on uninitialized value 2009-08-09 17:18:42 +02:00
Benoit Jacob
3ed83fa681 * add Jacobi transformations
* add Jacobi (Hestenes) SVD decomposition for square matrices
* add function for trivial Householder
2009-08-09 16:58:13 +02:00
Thomas Capricelli
953c37f8e5 i wonder how useful this really is, but others do this way. Probably
related to doxygen.
2009-08-09 05:14:45 +02:00
Thomas Capricelli
50c192961c eigenize lmder + some other small fixes 2009-08-09 05:07:59 +02:00
Thomas Capricelli
a6625f22d4 eigenize the test for lmder1, clean functor stuff.
(and check the tests still pass, of course, that's the whole point..)
2009-08-09 03:54:36 +02:00
Thomas Capricelli
5e4cf6cae1 oops.. use the template paramater instead of hard coding 'double' 2009-08-09 03:34:32 +02:00
Thomas Capricelli
ceeb023ff2 use template parameter Scalar instead of VectorType, fix a segfault. 2009-08-09 03:33:04 +02:00
Thomas Capricelli
7db4052749 eigenize the test a little more 2009-08-09 03:16:24 +02:00
Thomas Capricelli
f19eda7cf6 first test for a basic wrapper (and only wrapper!) for cminpack functions 2009-08-09 03:07:34 +02:00
Thomas Capricelli
2b9f110639 actually use eigen include file 2009-08-09 01:12:14 +02:00
Gael Guennebaud
fe813911f2 make LU::solve() not to crash when rank=0 2009-08-09 00:06:53 +02:00
Benoit Jacob
5f8d58f36a fix bug in sorting of singular values 2009-08-09 00:05:38 +02:00
Thomas Capricelli
b695113a81 Add all other file from cminpack/examples as tests.
Important : one test was failing because cminpack-1.0.2 does x[3]=1. on x
which is of size 3. Probably because fortran indices are shifted wrt to C
indices and someone forgot to fix this one.

This is correct in this commit and this is the only change I've done on files
from cminpack examples.

(i've also reported the bug to cminpack author)
2009-08-08 23:41:54 +02:00
Thomas Capricelli
d646d99366 Start of module "NonLinear". We start out of cminpack-1.0.2
(http://devernay.free.fr/hacks/cminpack.html)
The first test is adapted from the example/ directory.
Some stuff is hard coded for our initial tests.
2009-08-08 22:18:48 +02:00
Thomas Capricelli
b10637be50 add basic .hgignore file for most common generated/temporary files 2009-08-08 21:42:14 +02:00
Gael Guennebaud
f5e1c896c7 replace custom rank one update in LU by an expression 2009-08-08 00:01:43 +02:00
Gael Guennebaud
d1dc088ef0 * implement a second level of micro blocking (faster for small sizes)
* workaround GCC bad implementation of _mm_set1_p*
2009-08-07 11:09:34 +02:00
Gael Guennebaud
543a785756 Fix compilation in sparse module 2009-08-06 17:28:49 +02:00
Gael Guennebaud
2707a6b87c fix determinant in PartialLU 2009-08-06 17:28:31 +02:00
Gael Guennebaud
1d1e4884da oops, one more bug fix in homogeneous 2009-08-06 16:56:10 +02:00
Gael Guennebaud
9822493aaf fixes in determinant and homogeneous 2009-08-06 16:54:55 +02:00
Gael Guennebaud
3ac01b1400 compilation fix in EigenSolver,
bugfix in PartialLU
2009-08-06 16:41:54 +02:00
Gael Guennebaud
e82e30862a typo 2009-08-06 15:04:42 +02:00
Gael Guennebaud
03febf00a0 fix VS compilation issue in MapBase::operator+= and -= 2009-08-06 15:03:37 +02:00
Gael Guennebaud
2e46e9f2b4 shame on me 2009-08-06 14:57:38 +02:00
Gael Guennebaud
d34c5ef509 fix my bad fix of Hauke's fix ;) 2009-08-06 14:54:25 +02:00
Gael Guennebaud
1d4fea48b5 fix a couple of compilations issues 2009-08-06 14:10:02 +02:00
Hauke Heibel
c2861dd41a fixed inversion for AffineCompact matrices 2009-08-06 12:25:18 +02:00
Gael Guennebaud
56d00779db more product refactoring 2009-08-06 12:20:02 +02:00
Hauke Heibel
6b2ab13ac5 fix vs.net compilation issue 2009-08-06 11:40:25 +02:00
Hauke Heibel
8163757cf0 fix vs.net compilation issue 2009-08-06 11:27:25 +02:00
Gael Guennebaud
fa55cf5ce7 fix compilation and segfault issues 2009-08-06 11:19:36 +02:00
Gael Guennebaud
b9b17ea5a5 add the missing Affine Transform * set of column vectors products... 2009-08-06 11:02:03 +02:00
Benoit Jacob
0744638b6f remove remnant of MultiplierBase 2009-08-06 10:35:13 +02:00
Gael Guennebaud
84a7659bef implement the missing outer product,
and attempt to workaround a gcc internal error
2009-08-05 17:39:11 +02:00
Gael Guennebaud
88147e0a91 big refactoring in Product.h:
- all specialized products now inherits ProductBase
- the default product evaluated by Assign is still here,
  but it is currently enabled for small fixed sizes only
- => this significantly speed up compilation for large matrices
- I left the OuterProduct specialization empty as an exercise...
2009-08-05 15:23:35 +02:00
Benoit Jacob
014c581a5b fix assertions, improve docs.
we never assert on conditions that depend on the result of a computation!!
also the assertion that rank>0 amounts to matrix!=0 which we have to leave under the responsibility of the user.
2009-08-05 10:15:28 +02:00
Benoit Jacob
b183a4f879 merge 2009-08-04 17:47:27 +02:00
Gael Guennebaud
ab6302376c merge 2009-08-04 17:12:00 +02:00
Benoit Jacob
c20236ea75 remove the FORCE 2009-08-04 17:06:54 +02:00
Gael Guennebaud
7d607048a9 implement a ProductBase class and, as a proof of concept, update TriangularProduct
and SelfAdjointMatrixProduct to take advantage of it => fewer LOC
2009-08-04 16:54:17 +02:00
Gael Guennebaud
f3a6bc48c4 fix a couple of compilation issue due to the removal of MultiplierBase 2009-08-04 13:16:40 +02:00
Gael Guennebaud
2089a263f8 merge 2009-08-04 11:31:25 +02:00
Gael Guennebaud
c2a92e92a6 add ger and lu with partial pivoting in BTL 2009-08-04 11:30:33 +02:00
Gael Guennebaud
4bec101470 implement two levels of blocking in PartialLU => high speedup 2009-08-04 11:28:02 +02:00
Benoit Jacob
4436a4d68c use explicit Block/VectorBlock xprs to make sure that compile-time known sizes are used 2009-08-04 00:27:58 +02:00
Benoit Jacob
523cdedf58 make the dot product linear in the second variable, not the first variable 2009-08-03 17:20:45 +02:00
Gael Guennebaud
912da9fade merge with special_matrix branch 2009-08-03 16:17:32 +02:00
Gael Guennebaud
a8f943127c merge 2009-08-03 16:11:30 +02:00
Benoit Jacob
d10c710b15 add new Householder module 2009-08-03 16:06:57 +02:00
Gael Guennebaud
3cf5bb31f6 * Bye bye MultiplierBase, extend a bit AnyMatrixBase to allow =, +=, and -=
* This probably makes ReturnByValue needless
2009-08-03 16:05:15 +02:00
Benoit Jacob
66ee2044ce small fixes 2009-08-03 16:05:07 +02:00
Benoit Jacob
3cde9c0e35 apply Gael's idea for auto transpose in mixed fixed/dynamic case 2009-08-03 16:04:15 +02:00
Gael Guennebaud
ce1dc1ab16 implements a blocked version of PartialLU 2009-08-03 12:11:18 +02:00
Gael Guennebaud
0103de8512 bugfix in trsm 2009-08-02 15:32:43 +02:00
Benoit Jacob
cd49780143 apply patch from Marcus Hanwell: Improved quoting of tests when added to the build 2009-08-02 15:09:34 +02:00
Gael Guennebaud
48fc64458c add blocked LLT, and bugfix in trsm asserts 2009-08-01 23:42:51 +02:00
Gael Guennebaud
18429156a1 add selfadjointView from a trinagularView 2009-07-31 17:35:55 +02:00
Gael Guennebaud
2796bcabb1 some cleaning 2009-07-31 17:35:20 +02:00
Gael Guennebaud
a156f5a869 faster trsm kernel and fix a couple of issues 2009-07-31 13:18:19 +02:00
Gael Guennebaud
21f686846b s/std::atan2/ei_atan2 2009-07-31 10:08:23 +02:00
Manuel Yguel
ae5e26a363 add missing ei_atan2 without painfull warnings 2009-07-31 09:21:31 +02:00
Gael Guennebaud
2e9f7f80bf compilation fixes for sun CC 2009-07-31 10:04:34 +02:00
Benoit Jacob
d8bfd151d1 forward-port Anthony Truchet's changeset 8eab0bccbf 2009-07-30 16:05:38 +02:00
Gael Guennebaud
ff20a2ba94 add explicit "on the right" triangular solving,
=> no temporary when the rhs/unknows is row major
2009-07-30 16:03:06 +02:00
Gael Guennebaud
62d9b9b7b5 fix typo 2009-07-29 09:26:20 +02:00
Gael Guennebaud
864171df5c fix a couple of issues related to recent products 2009-07-28 18:11:30 +02:00
Gael Guennebaud
1ba35248e9 synch with main branch 2009-07-28 17:37:22 +02:00
Gael Guennebaud
54804eb626 synch with main branch 2009-07-28 17:35:07 +02:00
Gael Guennebaud
264fe82c65 add a debug mechanism to compute the number of intermediate evaluations (only for dynamic size) 2009-07-28 17:13:13 +02:00
Gael Guennebaud
508f06ac0f update doc 2009-07-28 17:11:15 +02:00
Gael Guennebaud
de8b795895 compilation fixes in BTL 2009-07-28 17:10:34 +02:00
Gael Guennebaud
7ed7ec64b5 improve the expression analyzer to bypass Transpose expression 2009-07-28 14:02:12 +02:00
Gael Guennebaud
6713c75fac update doc 2009-07-28 12:08:26 +02:00
Gael Guennebaud
7579360672 fix compilation of the doc and started a page dedicated to high performance and or BLAS users 2009-07-27 18:50:39 +02:00
Gael Guennebaud
5f3606bce9 bug fix in inverse for 1x1 matrix,
some compilation fixes in sparse_solvers
2009-07-27 18:09:56 +02:00
Gael Guennebaud
94cc30180e compilation fixes 2009-07-27 13:50:23 +02:00
Gael Guennebaud
0590c18555 various compilation and bug fixes in selfadjoint stuff 2009-07-27 13:17:39 +02:00
Gael Guennebaud
b5e4064289 cleaning and fix a perf issue 2009-07-27 12:13:53 +02:00
Gael Guennebaud
f95b77be62 trmm is now fully working and available via TriangularView::operator* 2009-07-27 11:42:54 +02:00
Gael Guennebaud
6aba84719d trmm is now working in all storage order configurations 2009-07-27 10:27:01 +02:00
Gael Guennebaud
1d4d9a37fd some cleaning 2009-07-26 13:53:24 +02:00
Gael Guennebaud
f3fde74695 finalize trsm: works in all situations, and it is now used by solve() and solveInPlace() 2009-07-26 13:01:37 +02:00
Gael Guennebaud
282e18da49 ok, now trsm works very well for upper triangular matrices
TODO: link it with the meta triangular_solve_selector and handle
the case where the rhs is row major by copying it to a col-major
temporary + handle right solving: X = B * M^-1
2009-07-26 00:49:17 +02:00
Gael Guennebaud
f4112dcff3 The new trsm is working very very well (read very fast) for
lower triangular matrix and row or col major lhs.
TODO: handle upper triangular and row major rhs cases
2009-07-25 21:41:01 +02:00
Gael Guennebaud
35927e78c2 add WIP trsm 2009-07-24 16:21:52 +02:00
Gael Guennebaud
c6d06c22ac some cleaning 2009-07-24 10:53:31 +02:00
Gael Guennebaud
6076173f0b add a simplified version of the sybb kernel built on top of gebp 2009-07-24 10:08:21 +02:00
Gael Guennebaud
82c5438c95 split and add unit tests for symm and syrk,
the .rank*update() functions now returns a reference to *this
2009-07-23 21:22:51 +02:00
Gael Guennebaud
b67abe22b3 oops,, update SYRK so that the rhs can be non-square² 2009-07-23 20:56:04 +02:00
Gael Guennebaud
a81388fae9 Implement efficient sefladjoint product (aka SYRK) : C += alpha * U U^T
It is currently available via SelfAdjointView::rankKupdate.
TODO: allows to write SelfAdjointView += u * u.adjoint()
2009-07-23 19:01:20 +02:00
Gael Guennebaud
713c92140c improve SYMV it is now faster and ready for use 2009-07-23 14:20:45 +02:00
Gael Guennebaud
eee14846e3 formating 2009-07-23 10:19:58 +02:00
Gael Guennebaud
ddb3ac98a2 addd matrix * self adjoint high level API 2009-07-23 10:05:38 +02:00
Hauke Heibel
8d2cd744b0 Added an explicit cast from int to bool to suppress MSVC warnings. 2009-07-23 00:11:25 +02:00
Gael Guennebaud
f696efc00e bugfix in SYMM 2009-07-22 23:48:42 +02:00
Gael Guennebaud
0cb4f32e12 implement high level API for SYMM and fix a couple of bugs related to complex 2009-07-22 23:12:22 +02:00
Gael Guennebaud
e7f8e939e2 * GEMM enhencement: no need to pre-transpose the rhs
=> faster a * b.transpose() product
  => this also fix a bug in a so far untested situation
* SYMM is now ready for use => still have to write the high level
  stuff to convert natural expressions into a call to SYMM
2009-07-22 18:04:16 +02:00
Gael Guennebaud
d6475ea390 more refactoring in the level3 products 2009-07-22 11:54:58 +02:00
Gael Guennebaud
d6627d540e * refactoring of the matrix product into multiple small kernels
* started an efficient selfadjoint matrix * general matrix product
  based on the generic kernels ( => need a very little LOC)
2009-07-21 16:58:35 +02:00
Gael Guennebaud
afa8f2ca95 * various fixes related to sub diagonals and band matrix
* allows 0 sized objects in Block/Map
2009-07-21 11:19:52 +02:00
Gael Guennebaud
562864bcfb enable our own ctest dashboard 2009-07-20 23:06:04 +02:00
Gael Guennebaud
a012aecbc4 bugfix in SVD 2009-07-20 13:44:52 +02:00
Gael Guennebaud
4375c043ac minor compilation fixes for Sun CC and ICC 2009-07-20 13:27:41 +02:00
Gael Guennebaud
4c85fa8c73 compilation fix (sun CC) 2009-07-20 10:57:31 +02:00
Gael Guennebaud
c10b919edb compilation fix 2009-07-20 10:56:03 +02:00
Gael Guennebaud
b3ad796d40 bugfix in operator*= (matrix product) 2009-07-20 10:44:07 +02:00
Gael Guennebaud
a551107cce bugfix for a = a * b; when a has to be resized 2009-07-20 10:35:47 +02:00
Gael Guennebaud
32b08ac971 re-implement stableNorm using a homemade blocky and
vectorization friendly algorithm (slow if no vectorization)
2009-07-17 16:22:39 +02:00
Gael Guennebaud
15ed32dd6e add other stable norm impl. in the benchmark 2009-07-16 16:21:26 +02:00
Gael Guennebaud
525da6a464 bugfix in blueNorm 2009-07-16 14:20:36 +02:00
Gael Guennebaud
65fc70b750 add a benchmark for the different norms 2009-07-16 11:33:56 +02:00
Gael Guennebaud
34490f1493 * bugfixes in Product, and test/product_selfadjoint
* speed up in the extraction of the matrix Q in Tridiagonalization
2009-07-16 00:03:17 +02:00
Gael Guennebaud
97c9445c60 synch with main devel branch 2009-07-15 19:54:31 +02:00
Gael Guennebaud
079fa81d84 add a TridiagonalMatrix wrapper arround BandMatrix, and extend this latter 2009-07-15 19:53:08 +02:00
Gael Guennebaud
4f792583c7 add BandMatrix::col() 2009-07-15 18:00:11 +02:00
Gael Guennebaud
df6561a73f change the implementation of BandMatrix to follow the BLAS/LAPACK storage scheme 2009-07-15 17:00:49 +02:00
Gael Guennebaud
1578421ed1 fix issue #25 : the problem was that we assumed Dynamic was a multiple of a packet size
(also disable the test of blueNorm)
2009-07-15 14:20:45 +02:00
Gael Guennebaud
587029a612 started an implementation of BandMatrix: at least the read/write access
to the main/sub/super diagonals seems to work well.
2009-07-14 23:27:37 +02:00
Gael Guennebaud
8120a5cecd synch with main devel branch 2009-07-14 23:06:25 +02:00
Gael Guennebaud
7a9519a9be fix typo in blue norm 2009-07-14 23:00:53 +02:00
Gael Guennebaud
279cedc1ce some cleaning/renaming is Triangular/SelfadjointView 2009-07-14 22:38:21 +02:00
Gael Guennebaud
f5d2317b12 add a blueNorm() function implementing the Blues's stable norm
algorithm. it is currently provided for experimentation
purpose only.
2009-07-13 21:14:47 +02:00
Gael Guennebaud
ddbaaebf9e one more fix of the previous commit (forgot to update ei_must_nest_by_value) 2009-07-13 15:27:01 +02:00
Gael Guennebaud
bd506d837c fix typo in previous commit 2009-07-13 15:21:32 +02:00
Gael Guennebaud
1e7b1a8a85 add a SparseNestByValue expression and fix issue in sparse adjoint evaluation 2009-07-13 14:55:03 +02:00
Gael Guennebaud
a2cf7ba955 add triangular * vector product 2009-07-13 13:17:55 +02:00
Gael Guennebaud
a2087cd7a3 Add an efficient rank2 update function (like the level2 blas xSYR2 routine).
Note that it is already used in Tridiagonalization.
2009-07-11 21:14:59 +02:00
Gael Guennebaud
ab17f92728 more sun studio fixes 2009-07-10 16:27:01 +02:00
Gael Guennebaud
ec5c608aa3 Set of fixes and workaround to make sun studio more happy.
Still remains the problem of alignment and vectorization.
2009-07-10 16:10:03 +02:00
Gael Guennebaud
b47dea8b7a add a meta unroller for the triangular solver (only for vectors as rhs) 2009-07-10 11:30:46 +02:00
Gael Guennebaud
1a1b2e9f27 finally directly calling the low-level products is faster 2009-07-10 10:41:26 +02:00
Gael Guennebaud
1c52985aa7 merge 2009-07-10 08:21:57 +02:00
Gael Guennebaud
629e083d81 slight change in the comparison to -1 2009-07-10 08:21:20 +02:00
Gael Guennebaud
8885d56928 commit woking versions of triangular solvers naturally
handling conjuagted expression. still have to bench whether it
is faster (runtime and compile time) to directly call the
cache friendly functions, whence all the commented piece of code...
2009-07-09 23:59:18 +02:00
Gael Guennebaud
fa60c72398 started to simplify the triangular solvers 2009-07-09 17:11:03 +02:00
Gael Guennebaud
96e7d9f896 ok now all the complex mat-mat and mat-vec products involving conjugate,
adjoint, -, and scalar multiple seems to be well handled. It only remains
the simpler case: C = alpha*(A*B) ... for the next commit
2009-07-08 18:24:37 +02:00
Gael Guennebaud
13b2dafb50 conjugate expressions are now properly caught by Product
=> significant speedup in expr. like a.adjoint() * b,
   for complex scalar type (~ x3)
2009-07-07 21:30:20 +02:00
Gael Guennebaud
5ed6ce90d3 started to catch scalar multiple and conjugate xpr in Product 2009-07-07 16:55:51 +02:00
Gael Guennebaud
ea23f36c78 * change the nesting order of adjoint_return_type to
1 - make it easier to catch conjugate expressions
 2 - make sure there is no unecessary copy (we had NestByValue<Derived> which seems to be very bad)
* update eigensolver wrt recent changes
2009-07-07 15:56:13 +02:00
Gael Guennebaud
79877a9917 * take advantage of new possibilies in LLT (mat -= product)
* fix Block::operator+= product which was not optimized
* fix some compilation issues
2009-07-07 15:32:21 +02:00
Gael Guennebaud
92a35c93b2 * extended the cache friendly products to support C = alpha * A * M and C += alpha * A * B
* this allows to optimize xpr like C -= lazy_product, still have to catch "scalar_product_of_lazy_product"
* started to support conjugate in cache friendly products (very useful to evaluate A * B.adjoint() without
  evaluating B.adjoint() into a temporary
* compilation fix
2009-07-07 11:39:19 +02:00
Gael Guennebaud
544888e342 add a generic mechanism to copy a special matrix to a dense matrix so that
we don't need to add other specialization of MatrixBase::operator=, Matrix::=,
and Matrix::Matrix(...)
2009-07-07 09:05:20 +02:00
Gael Guennebaud
1aea45335f * bybye Part, welcome TriangularView and SelfAdjointView.
* move solveTriangular*() to TriangularView::solve*()
* move .llt() to SelfAdjointView
* add a high level wrapper to the efficient selfadjoint * vector product
* improve LLT so that we can specify which triangular part is meaningless
=> there are still many things to do (doc, cleaning, improve the matrix products, etc.)
2009-07-06 23:43:20 +02:00
Benoit Jacob
889726bf7c add matrixQR() method exposing the storage. that's where the householder thing impacts the API. 2009-07-06 17:24:11 +02:00
Benoit Jacob
e057beee4e fix some search-and-replace damage 2009-07-06 17:20:07 +02:00
Benoit Jacob
e093b43b2c * rename QR to HouseholderQR because really that impacts the API, not just the impl.
* rename qr() to householderQr(), for same reason.
* clarify that it's non-pivoting, non-rank-revealing, so remove all the rank API, make solve() be void instead of bool, update the docs/test, etc.
* fix warning in SVD
2009-07-06 17:12:10 +02:00
Gael Guennebaud
f84bd3e7b1 include the fixes of the third edition 2009-07-06 15:01:30 +02:00
Gael Guennebaud
0c2232e5d9 quick reimplementation of SVD from the numeral recipes book:
this is still not Eigen style code but at least it works for
n>m and it is more accurate than the JAMA based version. (I needed
it now, this is why I did that)
2009-07-06 13:47:41 +02:00
Gael Guennebaud
0cd158820c switch from eigensolver to SVD which seems to be more accurate with float 2009-07-06 11:15:38 +02:00
Gael Guennebaud
90f1e24579 significantly improve the accuracy of setFromTwoVectors (fixes #21) 2009-07-06 10:35:20 +02:00
Gael Guennebaud
ecc4f07af5 fix doc of Quaternion::setFromTwoVectors 2009-07-06 09:17:25 +02:00
Gael Guennebaud
c6f610093b add a VectorBlock expr as a specialization of Block 2009-07-05 11:33:55 +02:00
Gael Guennebaud
eec334c604 fixes a segfault 2009-07-05 10:48:57 +02:00
Benoit Jacob
60467e54a5 some docs improvements 2009-07-05 01:52:42 +02:00
Manuel Yguel
c398d0edcf another test in the non invertible case 2009-07-04 14:55:25 +02:00
Gael Guennebaud
08e419dcb1 * update sparse module wrt new diagonal matrix impl
* fix a bug is SparseMatrix
2009-07-04 11:16:27 +02:00
Gael Guennebaud
d457ec5810 fix #20: SVD::solve() now resize the result 2009-07-04 09:28:11 +02:00
Benoit Jacob
7b750182f2 * polish computeInverseWithCheck to share more code, fix documentation, fix coding style
* add snippet for computeInverseWithCheck documentation
* expand unit-tests to cover computeInverseWithCheck
2009-06-29 22:07:37 +02:00
Manuel Yguel
126a031a39 computeInverseWithCheck method added to matrix base (specialization for 1D to 4D) 2009-06-29 20:47:37 +02:00
Benoit Jacob
632cb7a4a1 patch by Myguel from the forum: fix documentation 2009-06-29 19:26:20 +02:00
Benoit Jacob
2de9b7f537 fully vectorize DiagonalProduct
(it used to be partially vectorized and that had been lost in the big changes from the previous commit)
2009-06-29 04:01:31 +02:00
Benoit Jacob
bf91003d37 FreeBSD: determine precisely when malloc is 16-byte aligned 2009-06-29 00:08:34 +02:00
Benoit Jacob
5eabf2b75d double precision() : change to 1e-12 instead of 1e-11 (as discussed several times on the list) 2009-06-29 00:00:29 +02:00
Benoit Jacob
6809f7b1cd new implementation of diagonal matrices and diagonal matrix expressions 2009-06-28 21:27:37 +02:00
Benoit Jacob
fc9000f23e only disable the inline ASM if we're NEITHER gcc nor icc. right ?? 2009-06-26 05:32:21 +02:00
Benoit Jacob
6ccb97620a patch by Patrick Mihelich: use empty struct + anonymous namespace for NoChange 2009-06-25 03:33:47 +02:00
Benoit Jacob
60e6ac3178 use system variable instead of custom one 2009-06-25 01:06:53 +02:00
Benoit Jacob
903acf0d5c add missing code snippets for newer Matrix methods and PartialLU::solve() 2009-06-25 00:57:51 +02:00
Benoit Jacob
03ad303d14 * add resize(int, NoChange) and resize(NoChange, int)
* add missing assert in resize(int)
* add examples for all resize variants
* expand docs (part of which is from Tim Hutt's e-mail)
2009-06-24 22:07:03 +02:00
Benoit Jacob
96dca681b0 use <...> for system headers 2009-06-24 16:35:02 +02:00
Gael Guennebaud
a44f7cf440 re-enable the fast unaligned loads for gcc and icc using inline assembly
(this allows to avoid incompatible pointer casts and to specify the dependency to the data explicitely)
2009-06-24 10:48:36 +02:00
Gael Guennebaud
aa17b5b514 use the slower unaligned load intrinsics in ei_ploadu because GCC mess up with my tricks 2009-06-23 23:28:34 +02:00
Benoit Jacob
3934a8b5e8 check version number using newer cmake functionality, instead of kde macro 2009-06-23 15:39:46 +02:00
Gael Guennebaud
ec8596f863 update the stack alignment doc 2009-06-22 10:44:09 +02:00
Benoit Jacob
ac9680fb20 document the "wrong stack alignment" issue. 2009-06-21 17:25:57 +02:00
Benoit Jacob
7d48ed4be3 refine the check to disable alignment. now it's disabled on gcc3 (where we don't vectorize anyway) 2009-06-21 04:56:01 +02:00
Benoit Jacob
8be088bfb0 add Eigen/Eigen 2009-06-19 20:46:55 +02:00
Benoit Jacob
fe8ab0147b add "Dense" header 2009-06-19 19:09:35 +02:00
Benoit Jacob
032594cee2 forward port fix to #12 2009-06-19 18:51:15 +02:00
Benoit Jacob
a57325e971 fix #14: make llt::solve() and also ldlt::solve() work with uninitialized result 2009-06-19 17:01:32 +02:00
Moritz Lenz
c6e81869d0 fixed typo in SuperLUSupport.h 2009-06-17 11:55:57 +02:00
Mark Borgerding
fb9a15e451 added copyright notice 2009-06-17 00:09:18 -04:00
Mark Borgerding
e577c70e49 candidate header for Eigen/Complex 2009-06-16 23:54:58 -04:00
Mark Borgerding
218711e18b example file 2009-06-10 23:25:27 -04:00
Mark Borgerding
4d6b962ba4 added FindFFTW, but I don't think it's right yet 2009-06-10 22:16:32 -04:00
Gael Guennebaud
627595ad19 * rename PartialRedux to VectorwiseOp
* add VectorwiseOp's +, -, +=, -= operators
2009-06-10 11:20:30 +02:00
Gael Guennebaud
f3fd7fd22b fix #11: now the default Transform ctor set the last row in Affine mode. 2009-06-10 09:35:04 +02:00
Gael Guennebaud
d97d307fcf SparseMatrix::resize() always resets the matrix to an empty one 2009-06-08 14:12:11 +02:00
Gael Guennebaud
55de162cf6 fix #10: the reallocateSparse function was half coded 2009-06-08 14:05:23 +02:00
Gael Guennebaud
c49d1fd2b5 add a partial LU bench in BTL 2009-06-04 18:16:54 +02:00
Hauke Heibel
f26c691678 Renamed internal helper functions from the Memory header. 2009-06-04 17:25:15 +02:00
Hauke Heibel
5f04f8eb6b Fixes #9. Thanks to the (unknown) bug contributor. 2009-06-04 09:11:35 +02:00
Hauke Heibel
6530c8f5b4 A much simplified version of the earlier commit introducing way fewer changes compared to changeset f292d2352e
.
The reason of the previous commit was incorrect. The smart pointers issues were actually a result of issue 9.
2009-06-03 22:22:15 +02:00
Hauke Heibel
71e5cbcbc4 Added specializations for DontAlign when using Dynamic matrices.
This allows users to store Matrices in smart pointers without the
need for a specialized allocator/de-allocator.
2009-06-03 16:47:38 +02:00
Mark Borgerding
1c54340174 more work on ei_fftw_impl 2009-05-31 15:44:57 -04:00
Mark Borgerding
1fd6dfe428 added ei_fftw_impl 2009-05-30 17:55:47 -04:00
Hauke Heibel
f292d2352e Relaxed checks againts _MaxRows and _MaxCols in Matrix::_check_template_params(). 2009-05-29 09:09:48 +02:00
Mark Borgerding
f13e000b45 various comment changes 2009-05-27 21:32:42 -04:00
Benoit Jacob
ee92009fd8 make Umeyama, and its unit-test, work for me on gcc 4.3 2009-05-27 23:10:24 +02:00
Benoit Jacob
86be59124d fix the static assert checking the size template parameters. 2009-05-27 23:07:29 +02:00
Hauke Heibel
4d1e492c00 * Umeyama has now similar performance for RowMajor and ColMajor layouts.
* Fixed a bug in umeyama for fixed size matrices.
* Fixed the umeyama unit test for fixed size matrices.
* Added XprHelper::ei_plain_matrix_type_row_major.
2009-05-27 19:24:05 +02:00
Ingmar Vanhassel
7a7a3f3570 Fix out of source build 2009-05-27 17:36:46 +02:00
Hauke Heibel
db5647abae Added Umeyama implementation. 2009-05-26 19:22:25 +02:00
Mark Borgerding
09b4733255 added real-optimized inverse FFT (NFFT must be multiple of 4) 2009-05-25 23:52:21 -04:00
Mark Borgerding
03ed6f9bfb refactored ei_kissfft_impl to maintain a cache of cpx fft plans 2009-05-25 23:06:49 -04:00
Mark Borgerding
210092d16c changed name from simple_fft_traits to ei_kissfft_impl 2009-05-25 20:35:24 -04:00
Mark Borgerding
326ea77390 added FFT inverse complex-to-scalar interface (not yet optimized) 2009-05-23 22:50:07 -04:00
Mark Borgerding
3047988172 scalar forward FFT optimized for even size, converts to cpx for odd 2009-05-23 12:50:07 -04:00
Mark Borgerding
9c0fcd0f62 started real optimization, added benchmark for FFT 2009-05-23 10:09:48 -04:00
Gael Guennebaud
9d5728c511 fix #4
and also improve performance of Tridiag::diag/subDiag at the same time
2009-05-23 13:31:20 +00:00
Mark Borgerding
8b4afe3deb added non-optimized real forward fft (no inverse yet) 2009-05-22 22:37:59 -04:00
Benoit Jacob
42848498aa fixes #5 : freebsd really has aligned malloc 2009-05-22 23:54:52 +02:00
Benoit Jacob
7667a93cbe merge 2009-05-22 20:31:26 +02:00
Benoit Jacob
6347b1db5b remove sentence "Eigen itself is part of the KDE project."
it never made very precise sense. but now does it still make any?
2009-05-22 20:25:33 +02:00
Thomas Capricelli
0ed7c2f6d7 fix typo 2009-05-22 19:46:29 +02:00
Hauke Heibel
c7303a876f Oops, here the actual LLT and LDLT patch. 2009-05-22 15:58:20 +02:00
Hauke Heibel
0523b64fe9 Eigensolver decomposition interface unification.
Added default ctor and public compute method as
well as safe-guards against uninitialized usage.
Added unit tests for the safe-guards.
2009-05-22 14:27:58 +02:00
Hauke Heibel
7435d5c079 LLT and LDLT decomposition interface unification.
Added default ctor and public compute method as
well as safe-guards against uninitialized usage.
Added unit tests for the safe-guards.
2009-05-22 14:27:58 +02:00
Hauke Heibel
2c247fc8a8 LU and PartialLU decomposition interface unification.
Added default ctor and public compute method as well
as safe-guards against uninitialized usage.
Added unit tests for the safe-guards.
2009-05-22 14:27:58 +02:00
Hauke Heibel
5c5789cf0f QR and SVD decomposition interface unification.
Added default ctor and public compute method as
well as safe-guards against uninitialized usage.
Added unit tests for the safe-guards.
2009-05-22 14:27:58 +02:00
Benoit Jacob
c7baddb132 add internal comment (mostly a pretext to test the eigen-commits list) 2009-05-20 16:54:40 +02:00
Gael Guennebaud
dd45c4805c * add a writable generic coeff wise expression (CwiseUnaryView)
* add writable .real() and .imag() functions
2009-05-20 15:41:23 +02:00
Gael Guennebaud
6ecd02d7ec merge 2009-05-20 00:05:28 +02:00
Gael Guennebaud
56aad5aafb * add a FindEigen2.cmake file for reference
* parse the version number from the Macro.h header file
2009-05-20 00:04:08 +02:00
Benoit Jacob
a697cb4094 fix comments (old comments that were copied from LU) 2009-05-19 21:54:29 +02:00
Rhys Ulerich
066acca179 Added pkgconfig support 2009-05-19 11:48:50 -05:00
Gael Guennebaud
65181df1f9 update CTestConfig file for my.cdash.org 2009-05-19 10:21:24 +02:00
Gael Guennebaud
93123bc021 merge 2009-05-19 09:41:27 +02:00
Gael Guennebaud
510029f2bc * optimize sum() for sparse matrices and vectors
* fix the row()/col() functions of some InnerVector
2009-05-19 09:40:00 +02:00
Gael Guennebaud
f47c4b5da8 update cdash testsuite file to use mercurial 2009-05-19 09:24:22 +02:00
Mark Borgerding
68cad98bc9 indent level change 2009-05-19 00:34:38 -04:00
Mark Borgerding
be1b2ad4ec removed unused code 2009-05-19 00:26:19 -04:00
Mark Borgerding
92ca9fc032 initial pass of FFT module -- includes complex 1-d case only 2009-05-19 00:21:04 -04:00
Jitse Niesen
72f66d717d Evaluate argument of matrix exponential only once. 2009-05-18 23:14:53 +01:00
Gael Guennebaud
b83d9b48fa fix compilation with ICC 2009-05-18 18:26:45 +02:00
Gael Guennebaud
c8629e12f4 fix #3, remove inline keywords in QtAlignedMalloc (MSVC fix) 2009-05-18 18:09:21 +02:00
Gael Guennebaud
e186728867 fix #1 : need to nest by value the affine part in homogeneous product 2009-05-18 17:55:50 +02:00
Gael Guennebaud
e0832d5d93 fix bug reported by Moritz Lenz about random setter 2009-05-18 17:26:01 +02:00
Jitse Niesen
e3d64cb418 Fix compilation error in createRandomMatrixOfRank() 2009-05-17 21:17:45 +01:00
Hauke Heibel
6358c12998 * introduced method createRandomMatrixOfRank (R = U*D*V where U,V unitary, D r-by-c diag. with rank non-zero values)
* switched lu/qr tests to be using createRandomMatrixOfRank
* removed unused methods doSomeRankPreservingOperations
* removed NOTE about doSomeRankPreservingOperations
2009-05-17 16:07:12 +02:00
Benoit Jacob
934d6b4749 fix #2, bug in Diagonal::MaxRowsAtCompileTime when Index==Dynamic 2009-05-17 16:00:56 +02:00
Hauke Heibel
6ea7cbdc87 * added missing project definition (see doc of add_subdirectory and EXCLUDE_FROM_ALL) to fix a win build issue
* commented out non-existing unsupported-snippets and -examples
2009-05-17 12:12:39 +02:00
Thomas Capricelli
36c9c2597f Added tag after-hg-migration for changeset cd6c258f44 2009-05-16 17:23:50 +02:00
Thomas Capricelli
cd6c258f44 Fix a previous mistake in moving the 2.0-beta5 in svn for some last-minute
fixes.
2009-05-16 17:23:42 +02:00
Thomas Capricelli
bd9c32e2f1 Remove this old file. It was stalling in history because of a bug in svn,
which did not prevent the commit (svn r722564) to 'svn copy' a directory
called 'Core/' on top of an existing file 'Core'

see http://websvn.kde.org/?view=rev&revision=722564
2009-05-16 17:20:44 +02:00
convert-repo
e0554752d8 update tags 2009-05-16 14:05:54 +00:00
Benoit Jacob
dd390470e1 simplification (no reason anymore to write that in that convoluted way) 2009-05-15 16:05:45 +00:00
Benoit Jacob
5ee9f1a705 argh, forgot to re-add the throw() 2009-05-15 15:54:52 +00:00
Benoit Jacob
126284d08b * fix bugs in EigenTesting.cmake: it didn't work with -DEIGEN_NO_ASSERTION_CHECKING=ON
* only try...catch if exceptions are enabled
2009-05-15 15:53:26 +00:00
Benoit Jacob
7c14e1eac4 add partial-pivoting LU decomposition
the name 'PartialLU' is not meant to be definitive!
make inverse() and determinant() use it, so it's *almost* considered
well tested.
2009-05-13 02:02:22 +00:00
Gael Guennebaud
877c3c00a2 enable testing of complex numbers for taucs 2009-05-12 13:43:40 +00:00
Gael Guennebaud
159c99a288 fix adolc unit test for dynamic sizes 2009-05-12 07:38:46 +00:00
Gael Guennebaud
f5b5571a5a compilation fixes 2009-05-12 07:32:34 +00:00
Gael Guennebaud
9b256d997e various minor updates of some unit tests 2009-05-11 11:09:41 +00:00
Gael Guennebaud
6a4e94f349 bugfix from Jens Mueller (s/RowMajor/IsRowMajor) 2009-05-11 10:59:27 +00:00
Benoit Jacob
9afd1324fd constant Diagonal ---> DiagonalBits
introduce ei_is_diagonal to check for it
DiagonalCoeffs ---> Diagonal and allow Index to by Dynamic
-> add MatrixBase::diagonal(int) with unittest and doc
2009-05-10 16:24:39 +00:00
Benoit Jacob
eac79b6d2e CREDIT Hauke Heibel, fix MSVC warnings 2009-05-09 03:41:17 +00:00
Benoit Jacob
3b79d99f71 *make coeff() return a const reference (required with the recent change with CoeffReturnType)
*fix a unused variable warning
2009-05-09 03:39:31 +00:00
Benoit Jacob
c500415e18 result of our experiments with LU tuning: implement very simple formula, that
turns out to be similar to Higham's formula already in use in LDLt
2009-05-07 20:35:26 +00:00
Gael Guennebaud
b6c76c30cd apply patch from Hauke Heibel cleaning overloaded operator new/detete 2009-05-07 20:33:48 +00:00
Benoit Jacob
47fc3858b9 oops, didn't want to commit that 2009-05-07 18:40:06 +00:00
Benoit Jacob
c8a22dbc08 CREDIT Hauke Heibel, more std::vector::insert fixes 2009-05-07 18:38:07 +00:00
Benoit Jacob
159ab4a043 CREDIT Hauke Heibel, windows compatibility fixes in MatrixExponential 2009-05-07 18:11:49 +00:00
Gael Guennebaud
7080282a6d forgot a svn add CMakeLists.txt 2009-05-07 14:49:56 +00:00
Gael Guennebaud
6dffdca123 fix realloc when initial size was 0 (bug reported by Jens Mueller) 2009-05-07 13:13:42 +00:00
Benoit Jacob
4f0af00e51 *add missing overloads of setZero, etc... that were mentioned in the tutorial
--->they go into Matrix as they resize.
*add isConstant() alias to isApproxToConstant()
*extend unit-test
*change an assert into a static assert
2009-05-06 21:40:24 +00:00
Benoit Jacob
0c0d38272e add copyright on two public headers that are not so trivial... 2009-05-06 15:48:28 +00:00
Benoit Jacob
e1abffa513 add missing function, thanks to Hauke Heibel 2009-05-06 13:33:35 +00:00
Gael Guennebaud
3361ad4589 one more gcc 3.3 fix 2009-05-06 10:51:46 +00:00
Gael Guennebaud
db079f3a8c fix internal error with gcc 3.3 (all tests are now ok with 3.3 !) 2009-05-06 09:37:30 +00:00
Gael Guennebaud
1e286464ab * compilation fixes for gcc 3.3
* test Part::swap
2009-05-06 08:43:38 +00:00
Gael Guennebaud
23f073625d add -Wextra only if compiler supports it 2009-05-06 08:15:54 +00:00
Benoit Jacob
834eb5bfc8 new unsupported module by Jitse Niesen: matrix exponential 2009-05-05 20:46:55 +00:00
Benoit Jacob
66f059b99d remove unused member remean that was just initialized 2009-05-05 17:29:44 +00:00
Benoit Jacob
46e1162a39 reimplement linearRegression on top of fitHyperplane which is much better 2009-05-05 17:15:35 +00:00
Benoit Jacob
2b2f0c0220 fix linearRegression, fix doc, add unit test (it was untested since the change
making fitHyperplane no longer use it)
2009-05-05 16:50:58 +00:00
Gael Guennebaud
5b364a68cb oops 2009-05-04 14:53:21 +00:00
Gael Guennebaud
2829314284 new simplified API to fill sparse matrices (the old functions are
deprecated). Basically there are now only 2 functions to set a
coefficient:
1) mat.coeffRef(row,col) = value;
2) mat.insert(row,col) = value;
coeffRef has no limitation, insert assumes the coeff has not already
been set, and raises an assert otherwise.
In addition I added a much lower level, but more efficient filling
mechanism for
internal use only.
2009-05-04 14:25:12 +00:00
Thomas Capricelli
ddb6e96d48 fix warnings with recent gcc 2009-05-04 13:55:21 +00:00
Benoit Jacob
dec09a618d fix warning, unused variable dummy 2009-05-04 13:01:23 +00:00
Benoit Jacob
b60571a193 fix warnings with unused static functions 2009-05-04 12:49:56 +00:00
Benoit Jacob
8aa8854bbf fix SSE2 detection on win64, reported by 'kajala' 2009-05-04 12:13:37 +00:00
Gael Guennebaud
c25fc50d54 add an AlignedVector3 module suitable for vectorization of 3D vectors 2009-05-03 17:19:19 +00:00
Gael Guennebaud
469b8aa380 "forgot to commit the required changes in stdvector unit test" 2009-05-03 15:39:37 +00:00
Benoit Jacob
95bda5e6ab let the user disable alignment altogether by #defining EIGEN_DONT_ALIGN.
Until now, the user had to edit the source code to do that.
Internally, add EIGEN_ALIGN that takes into account both EIGEN_DONT_ALIGN.and
EIGEN_ARCH_WANTS_ALIGNMENT. From now on, only EIGEN_ALIGN should be used to
test whether we want to align.
2009-05-03 13:50:56 +00:00
Gael Guennebaud
facee57b8d add auto transposition for vectors 2009-04-30 12:09:45 +00:00
Gael Guennebaud
7029ed6b88 ok, this time cast should really work ; sorry for the noise 2009-04-29 17:39:39 +00:00
Gael Guennebaud
428a12902a fixed my mistake in cast 2009-04-29 17:05:19 +00:00
Benoit Jacob
746079f75d gni, forgot to call the new subtest 2009-04-29 17:01:10 +00:00
Gael Guennebaud
9e9e99a42e casting to the same type no longer generates a CwiseUnaryOp 2009-04-29 14:57:18 +00:00
Benoit Jacob
8b1e7c2792 add cast<>() tests. including a vectorization_logic test that currently fails (casting to same type should not prevent vectorization) 2009-04-29 14:51:19 +00:00
Gael Guennebaud
e722f55382 fix the Matrix(int,int)/vector 2D ctors issue so that we really
have a Matrix(Scalar,Scalar) ctor. (useful for std::complex, user
defined types, etc.
2009-04-24 14:38:40 +00:00
Benoit Jacob
32e9801890 CREDIT Ross Smith: fix posix_memalign detection 2009-04-24 13:26:36 +00:00
Gael Guennebaud
e201091d89 remove unsupported/doc from make's "all" target 2009-04-23 12:22:10 +00:00
Gael Guennebaud
acb32c69d4 * update BVH to explicitely use aligned_allocator
* fix warning in StdVector
2009-04-23 11:33:36 +00:00
Gael Guennebaud
c7bb7436f9 make the ei_p* math functions overloads instead of template
specializations
2009-04-22 21:35:50 +00:00
Gael Guennebaud
e0904a70ce update STL vector doc 2009-04-22 21:29:42 +00:00
Gael Guennebaud
6f6b5ad016 add a generic version of std::vector::resize for other stl
implementations
2009-04-21 21:40:33 +00:00
Gael Guennebaud
2697877fac std::vector now explicitely requires the use of Eigen::aligned_allocator
CREDIT Hauke Heibel
2009-04-21 21:30:38 +00:00
Gael Guennebaud
c5c384cf06 add missing vector ctor reported by Markus Moll on the ML 2009-04-21 11:52:25 +00:00
Gael Guennebaud
5e5bac52d7 this should fix the linking issue with StdVector without any user
changes... I cross my fingers...
2009-04-21 11:44:58 +00:00
Gael Guennebaud
4efcc14b91 fix ctor issues with ei_workaround_msvc_std_vector 2009-04-21 08:12:07 +00:00
Christian Ehrlicher
b14d1139df not compilable with msvc :( 2009-04-15 17:13:23 +00:00
Gael Guennebaud
c744960984 patch from Ilya Baran: This small patch fixes a potential initialization
bug in BVAlgorithms and slightly corrects the BVH doc.
2009-04-15 05:54:07 +00:00
Benoit Jacob
6ee4eb94fb fix the 4x4 inverse -- unit test passes again 2009-04-14 13:42:23 +00:00
Benoit Jacob
2bb1c9e8dc finally commit Rohit's work as the start of a new (currently
unsupported) module, MoreVectorization.

CCMAIL:rpg.314@gmail.com
2009-04-14 13:15:53 +00:00
Gael Guennebaud
804a239d30 patch from Moritz Lenz to allow solving transposed problem with superlu 2009-04-10 19:54:43 +00:00
Gael Guennebaud
fb3078fb62 fix memory leak in superlu backend 2009-04-10 18:49:38 +00:00
Gael Guennebaud
7254201632 bugfix when the diagonal is not stored and assumed to be 1 2009-04-10 18:13:37 +00:00
Gael Guennebaud
22edf77470 add a 4x4 inverse case which is not handled by the current
ei_compute_inverse_in_size4_case (reported by mikola on IRC)
2009-04-09 21:55:29 +00:00
Gael Guennebaud
ea41a18b80 clean asm comments 2009-04-09 21:29:31 +00:00
Gael Guennebaud
d78eb02627 add aligned_allocator operator == and != as suggested by Hauke Heibel 2009-04-09 21:22:02 +00:00
Benoit Jacob
0c99de5a17 more patches from Hauke Heibel: compilation/warning fixes from VC++ 2009-04-09 17:19:17 +00:00
Benoit Jacob
e6332cba4b forward-port r951449: patch by Hauke Heibel: compile fix with VS 9 2009-04-09 12:06:13 +00:00
Gael Guennebaud
e8329f9f45 relicence Julien Pommier's SSE code to Eigen's licenses 2009-04-09 06:03:51 +00:00
Gael Guennebaud
aabca5a44f stupid typo 2009-04-06 13:35:48 +00:00
Benoit Jacob
502bf4a81d * fix the binary bloat issue, Rohit's idea was the good one
* a few dox fixes (alloc routines do return 0 on error) and forgot to update version number in CMakeLists
2009-04-06 13:33:42 +00:00
Gael Guennebaud
38f501a596 fix computation of aligned_bit (has been broken by the change of
AutoAlign/DontAlign)
2009-04-05 17:34:59 +00:00
Gael Guennebaud
5b1d0cebc5 sparse module: new API proposal for triangular solves and experimental
solver support with a sparse matrix as the rhs.
2009-04-05 16:30:10 +00:00
Jos van den Oever
5a628567b0 Compile fix. 2009-04-05 14:26:47 +00:00
Kenneth Frank Riddile
9a8096dd1d * fixed truncation warnings caused by MatrixBase::CoeffReturnType under MSVC 2009-04-03 20:19:07 +00:00
Gael Guennebaud
ff3a3209ca add an assertion in sparse LLT for invalid input matrix 2009-04-03 08:30:15 +00:00
Gael Guennebaud
adf5104bae oops, bad copy paste in ei_psqrt, thanks to Jitse Niesen 2009-04-02 06:29:05 +00:00
Benoit Jacob
40432c84b9 CwiseUnaryOp -> SparseCwiseUnaryOp 2009-04-01 14:47:56 +00:00
Gael Guennebaud
0170eb0dbe add an auto-diff module in unsupported. it is similar to adolc's forward
mode but the advantage of using Eigen's expression template to compute
the derivatives (unless you nest an AutoDiffScalar into an Eigen's
matrix).
2009-04-01 14:43:37 +00:00
Benoit Jacob
0f8e692b3f * Find SuperLU also when it is installed without a superlu/ prefix
* Some more CoeffReturnType changes
2009-04-01 14:07:38 +00:00
Benoit Jacob
113fc3a260 now if Derived has the DirectAccess flag, then MatrixBase<Derived>::coeff() const returns a const Scalar& and not a Scalar as before.
useful for people doing direct access. + 1 bugfix thanks to Patrick Mihelich, forgot a & in MapBase::coeff(int).
2009-04-01 00:39:56 +00:00
Benoit Jacob
2f45eeb0c6 More Cholesky fixes.
* Cholesky decs are NOT rank revealing so remove all the rank/isPositiveDefinite etc stuff.
* fix bug in LLT: s/return/continue/
* introduce machine_epsilon constants, they are actually needed for Higman's formula determining
  the cutoff in Cholesky. Btw fix the page reference to his book (chat with Keir).
* solve methods always return true, since this isn't a rank revealing dec. Actually... they already did always return true!! Now it's explicit.
* updated dox and unit-test
2009-04-01 00:21:16 +00:00
Benoit Jacob
8e2b191acf fix typo found by noir.sender from KDE forums 2009-03-31 17:00:30 +00:00
Benoit Jacob
1e6097a810 fix mistake in static assertion, patch by Markus Moll. 2009-03-31 16:07:12 +00:00
Benoit Jacob
bf596d0b3a add adjointInPlace() and add documentation warnings on adjoint() and transpose() about aliasing effects. 2009-03-31 13:55:40 +00:00
Benoit Jacob
a1ba995f05 Many improvements in LLT and LDLT:
* in LDLT, support the negative semidefinite case
* fix bad floating-point comparisons, improves greatly the accuracy of methods like
  isPositiveDefinite() and rank()
* simplifications
* identify (but not resolve) bug: claim that only triangular part is used, is inaccurate
* expanded unit-tests
2009-03-30 21:45:45 +00:00
Gael Guennebaud
df9dfa1455 fix superLU backend: missing operator= 2009-03-27 19:25:22 +00:00
Gael Guennebaud
e9f6167485 make special ei_p functions static to avoid linking issues (they are too
complex to be inlined)
2009-03-27 14:55:46 +00:00
Gael Guennebaud
49fc1e3e84 add vectorization of sqrt for float 2009-03-27 14:41:46 +00:00
Gael Guennebaud
3499f6eccd fix Taucs support (it appears Taucs does not return sorted matrices) 2009-03-26 17:11:43 +00:00
Benoit Jacob
1b7b538e05 The ABI break:
* set AutoAlign=0, DontAlign!=0
* set Dynamic=33331
* add check on fixed sizes
* bump version to 2.0.52
2009-03-26 16:30:54 +00:00
Gael Guennebaud
ce5669dbf9 * enable vectorization of sin, cos, etc. by default with an option to
disable them (-DEIGEN_FAST_MATH=0)
* add a specialization of MatrixBase::operator*(RealScalar) for fast
  "matrix of complex" times scalar products (even more useful for
  autodiff scalar types)
2009-03-26 12:50:24 +00:00
Gael Guennebaud
62de40f8bb oops forgot to include a file in previous commit (I had other local
changes I did not want to commit yet...)
2009-03-26 07:10:59 +00:00
Gael Guennebaud
a22ef7e1f3 for some reason passing the argument by const reference killed the perf
(in the packet version of sin, cos, exp, lop), so let's pass them by
value. Also, improve the perf of ei_plog by reducing dependencies.
2009-03-25 18:33:36 +00:00
Gael Guennebaud
17860e578c add SSE2 versions of sin, cos, log, exp using code from Julien
Pommier. They are for float only, and they return exactly the same
result as the standard versions in about 90% of the cases. Otherwise the max error
is below 1e-7. However, for very large values (>1e3) the accuracy of sin and cos
slighlty decrease. They are about 3 or 4 times faster than 4 calls to their respective
standard versions. So, is it ok to enable them by default in their respective functors ?
2009-03-25 12:26:13 +00:00
Andrew Coles
64fbd93cd9 As values may be used uninitialised, they have now been given
sensible defaults; or, in other words, if worse comes to worst,
we'll get a guaranteed segfault rather than a heisenburg.
2009-03-23 21:15:33 +00:00
Gael Guennebaud
70c0174bf9 * allows fixed size matrix with size==0 (via a specialization of
MatrixStorage returning a null pointer). For instance this is very
  useful to make Tridiagonalization compile for 1x1 matrices
* fix LLT and eigensolver for 1x1 matrix
2009-03-23 14:44:44 +00:00
Gael Guennebaud
f4cf5e9b26 split and extend eigen-solver tests 2009-03-23 14:38:59 +00:00
Konstantinos A. Margaritis
fe00e864a1 ei_pnegate implemented for AltiVec 2009-03-20 17:26:50 +00:00
Gael Guennebaud
fbf415c547 add vectorization of unary operator-() (the AltiVec version is probably
broken)
2009-03-20 10:03:24 +00:00
Gael Guennebaud
4bb5221d22 Add BVH module in unsupported (patch from Ilya Baran)
(I thought I committed it a week ago but it seems the command failed)
2009-03-18 20:06:06 +00:00
Gael Guennebaud
3d385ae968 angle-axis doc: make it more clear the axis must be normalized 2009-03-18 19:38:55 +00:00
Gael Guennebaud
dcf49e5a28 more MSVC fixes: restrict keywords (sorry for all these commits) 2009-03-17 13:32:26 +00:00
Gael Guennebaud
718af05517 more MSVC fixes (asm comments...) 2009-03-17 13:25:26 +00:00
Gael Guennebaud
497d77b08c fix compilation MSVC 2009-03-17 13:18:53 +00:00
Gael Guennebaud
312994fa98 opengl demo: add aligned operator new where appropriate and remove my
mess...
2009-03-13 13:17:19 +00:00
Gael Guennebaud
c6264d9b7e add affine * homogeneous vector for backward compatibility (e.g.,
kgllib)
2009-03-11 14:32:03 +00:00
Gael Guennebaud
02e97ac0ae minor changes in AlignedBox needed for BVH module 2009-03-11 14:21:55 +00:00
Gael Guennebaud
b8f46090ff add optimized cross3 function (code from Rohit Garg) 2009-03-11 14:20:36 +00:00
Gael Guennebaud
3298320007 fix doxygen generation of unsupported modules 2009-03-11 13:11:38 +00:00
Gael Guennebaud
f697ea6d30 fix a few compilation errors and warnings (ICC) 2009-03-11 08:45:53 +00:00
Gael Guennebaud
14691d6836 fix compilation with old, and future gcc 2009-03-10 11:55:50 +00:00
Gael Guennebaud
3e4307d8a8 compilation fix in MapBase 2009-03-10 09:06:38 +00:00
Gael Guennebaud
caa1ef7515 various BTL updates (disable Cholesky for MTL, add new plot settings,
etc)
2009-03-09 23:28:46 +00:00
Gael Guennebaud
8aa5aa269a add "slice vectorization" of redux (eg. m.block().minCoeff() is now
vectorized)
2009-03-09 23:16:39 +00:00
Gael Guennebaud
c087373968 because of a missing specialization, operator/(scalar) was not vectorized 2009-03-09 23:14:53 +00:00
Gael Guennebaud
db6c3d0197 fix MapBase's ForceAligned concept which was not working at all.... 2009-03-09 19:23:31 +00:00
Gael Guennebaud
3f80c68be5 add the vectorization of abs 2009-03-09 18:40:09 +00:00
Gael Guennebaud
bd8107c90c forgot to include a file in previous commit 2009-03-09 14:18:29 +00:00
Gael Guennebaud
8a424efb11 add an option to bench eigen without GCC's auto vec (might conflict with
Eigen's auto vec)
2009-03-09 14:16:05 +00:00
Gael Guennebaud
314aff875e fix warning in Tridiag 2009-03-09 10:41:37 +00:00
Gael Guennebaud
44f218988c add a small note in Transform doc. 2009-03-08 11:38:45 +00:00
Gael Guennebaud
3ac42fed94 big rework of the Transform class:
* add Projective and AffineCompact modes as an optional third template
  argument
* extend Transform::operator* to support more use cases
2009-03-08 11:35:30 +00:00
Gael Guennebaud
7718a8ed83 slight optimization of SSE base integer mul (thanks to Rohit Garg) 2009-03-08 10:14:07 +00:00
Jure Repinc
f9790a649c Install the newly added eigen2/Eigen/src/Geometry/arch/Geometry_SSE.h 2009-03-08 06:21:21 +00:00
Gael Guennebaud
e4f64ce098 add optimized quaternion * quaternion product specialization for
float/SSE using code from Rohit Garg
2009-03-07 13:52:44 +00:00
Gael Guennebaud
6f95270ede significantly reduce the default stack allocation limit which was much
too high
2009-03-06 16:52:43 +00:00
Gael Guennebaud
2c78ca62a6 patch from Stjepan Rajko (MSVC fix in RotationBase) 2009-03-06 11:45:39 +00:00
Andrew Coles
093e3cd5b5 Commented out duplicate definition of TransformTraits - was causing
compile-time errors.
2009-03-05 22:35:06 +00:00
Gael Guennebaud
fa9f7708d4 add efficient matrix product specializations for Homogeneous 2009-03-05 16:40:56 +00:00
Gael Guennebaud
31332fca0b remove bad #include of SelfadjointRank2Update.h 2009-03-05 10:29:20 +00:00
Gael Guennebaud
0be89a4796 big addons:
* add Homogeneous expression for vector and set of vectors (aka matrix)
  => the next step will be to overload operator*
* add homogeneous normalization (again for vector and set of vectors)
* add a Replicate expression (with uni-directional replication
  facilities)
=> for all of them I'll add examples once we agree on the API
* fix gcc-4.4 warnings
* rename reverse.cpp array_reverse.cpp
2009-03-05 10:25:22 +00:00
Gael Guennebaud
d710ccd41e BTL: add syr2 action 2009-03-05 08:15:32 +00:00
Gael Guennebaud
a72ff5abc1 BTL: - patch from Victor (add ACML support)
- fix overflow issues
2009-03-05 08:11:47 +00:00
Gael Guennebaud
6a26506341 add ReturnByValue pseudo expression for in-place evaluation with a
return-by-value API style (will soon use it for the transform products)
2009-03-04 13:00:00 +00:00
Gael Guennebaud
45136ac3b6 various update of of BTL 2009-03-04 07:21:17 +00:00
Gael Guennebaud
3288e9e168 add much faster versions of unaligned stores (and slightly faster
unaligned loads)
2009-03-03 14:01:30 +00:00
Gael Guennebaud
c4601314be cleaned Tridiagonalization impl, with improved performance 2009-03-02 15:07:39 +00:00
Gael Guennebaud
ed134a0ce5 performance improvement: rewrite of the matrix-matrix product following
Goto's paper => x1.4 speedup with more consistent perf results
2009-03-02 13:15:15 +00:00
Gael Guennebaud
8ed186b9ab improve WIP new matrix product 2009-02-27 17:18:52 +00:00
Gael Guennebaud
40774c625e add a proof of concept autodiff jacobian helper class based on adolc
with unit test and FindAdolc cmake module
2009-02-27 16:19:13 +00:00
Gael Guennebaud
170128770a compilation fix + test orho methods for complex 2009-02-26 10:09:23 +00:00
Gael Guennebaud
91af389a9a fix unitOrthogonal() for size > 3 2009-02-25 09:23:09 +00:00
Laurent Montel
2d6d14a3d3 Add COMPONENT Devel 2009-02-23 07:50:56 +00:00
Laurent Montel
8e7c4df0db Fix install header 2009-02-22 10:28:31 +00:00
Gael Guennebaud
de014efdaf * split CacheFriendlyProduct into multiple smaller files
* add an efficient selfadjoint * vector implementation (= blas symv)
  perf are inbetween MKL and GOTO
  => the interface is still missing (have to be rethougth)
2009-02-21 20:20:38 +00:00
Gael Guennebaud
3d86dcf473 oops, got confused by the preprocessor directives around
posix_memalign...
2009-02-21 16:35:57 +00:00
Gael Guennebaud
7c4f9ecf0c fix posix_memalign return value warning 2009-02-21 16:23:18 +00:00
Benoit Jacob
177500f37e fix typo found by markusfroeb (forum) 2009-02-21 14:30:49 +00:00
Daniel Gomez Ferro
032880074e Added new product implementation.
Just works for square, power of 2 matrices of floats.
2009-02-20 22:23:25 +00:00
Gael Guennebaud
7485aa6d57 add symv bench 2009-02-20 21:05:19 +00:00
Keir Mierle
22b9de1849 Fix some formatting in quick ref. Add references to headers. 2009-02-19 16:46:35 +00:00
Gael Guennebaud
e2ee7a6a58 increase version number for step 2009-02-19 15:46:10 +00:00
Gael Guennebaud
752d95c20e eventually c++ does not provide any optimized pow(int,int) function,
so here you go :) (should also fix Timothy's troubles)
2009-02-18 18:24:31 +00:00
Gael Guennebaud
747d83a05a add size, rows, cols, (), (,) functions in ASCII ref 2009-02-18 16:35:16 +00:00
Gael Guennebaud
eb2cc8f502 patch from Jitse Niesen: fix ascii quick ref matlab operator ' 2009-02-18 16:24:15 +00:00
Gael Guennebaud
8f83b37b2a fix install of IterativeSolvers module 2009-02-18 13:11:29 +00:00
Gael Guennebaud
23f543ee5e add the ASCII quick reference made by Kier 2009-02-18 10:27:18 +00:00
Gael Guennebaud
21161d8bcf fix typo in FindTaucs.cmake 2009-02-17 10:34:17 +00:00
Gael Guennebaud
ee04c5c874 add tests showing bug in unitOrthogonal such that we don't forget it! 2009-02-17 09:57:32 +00:00
Gael Guennebaud
e6f1104b57 * fix Quaternion::setFromTwoVectors (thanks to "benv" from the forum)
* extend PartialRedux::cross() to any matrix sizes with automatic
  vectorization when possible
* unit tests: add "geo_" prefix to all unit tests related to the
  geometry module and start splitting the big "geometry.cpp" tests to
  multiple smaller ones (also include new tests)
2009-02-17 09:53:05 +00:00
Gael Guennebaud
67b4fab4e3 fix assertion issue in slice vectorization 2009-02-16 10:17:21 +00:00
Gael Guennebaud
c5245a3388 update vectorization_logic unit test wrt previous sum/redux change 2009-02-13 08:45:19 +00:00
Konstantinos A. Margaritis
349557db9a no reason for 3 vec_mins, 2 are enough apparently in ei_predux_min 2009-02-12 22:03:30 +00:00
Konstantinos A. Margaritis
ad2bf14dbb modified ei_predux_min/max to actually use altivec instructions 2009-02-12 21:58:44 +00:00
Gael Guennebaud
3b12f48aa3 compilation fix for SuperLU 3.1 2009-02-12 17:50:51 +00:00
Gael Guennebaud
20a8bb96eb fix m = m*m with m sparse (gug found by Frederik Heinz) 2009-02-12 15:57:13 +00:00
Gael Guennebaud
59a1ed0932 fix bug in MapBase found by myguel 2009-02-12 15:29:20 +00:00
Gael Guennebaud
51c991af45 * exit Sum.h, exit Prod.h, welcome vectorization of redux() !
* add vectorization for minCoeff and maxCoeff
2009-02-12 15:18:59 +00:00
Gael Guennebaud
dc97d483fd update of the Array module doc 2009-02-12 07:55:35 +00:00
Gael Guennebaud
deb6254702 some ICC fixes 2009-02-12 07:55:03 +00:00
Gael Guennebaud
7954f7709a add ei_predux_mul for AltiVec 2009-02-10 18:26:59 +00:00
Gael Guennebaud
cbbc6d940b * add ei_predux_mul internal function
* apply Ricard Marxer's prod() patch with fixes for the vectorized path
2009-02-10 18:06:05 +00:00
Gael Guennebaud
a0cc5fba0a fix ICC internal compilation error 2009-02-10 14:15:32 +00:00
Gael Guennebaud
40ad661183 added an experimental IterativeSolvers module (currently in unsupported)
with a constrained conjugate gradient algorithm adapted from GMM++/ITL.
This algorithm is needed for Step.
2009-02-10 10:02:41 +00:00
Gael Guennebaud
e75bef9523 various minor fixes in Sparse module 2009-02-10 10:00:00 +00:00
Gael Guennebaud
169696a078 fix doxygen \ingroup for the array module 2009-02-09 10:13:06 +00:00
Gael Guennebaud
9c14d20656 add select snippet 2009-02-09 10:01:42 +00:00
Gael Guennebaud
a9688f0b71 - add diagonal * sparse product as an expression
- split sparse_basic unit test
- various fixes in sparse module
2009-02-09 09:59:30 +00:00
Gael Guennebaud
e0be020622 add DiagonalMatrix setZero and resize functions 2009-02-09 09:55:54 +00:00
Gael Guennebaud
666ade0c93 add "remap" snippet using placement new 2009-02-09 09:54:48 +00:00
Thomas Capricelli
5b4c3b21f3 documentation about inheriting from Eigen::Matrix 2009-02-08 22:36:28 +00:00
Konstantinos A. Margaritis
15e40b1099 fixed preserve_mask definition for AltiVec (needed __vector keyword) 2009-02-08 18:43:57 +00:00
Konstantinos A. Margaritis
505bdbb9ef should be __powerpc__ instead of __ppc__ 2009-02-08 18:22:34 +00:00
Gael Guennebaud
d8b7283a98 remove remaining debug stuff in Reverse.h 2009-02-08 12:56:47 +00:00
Vincenzo Di Massa
a385de080d fix build 2009-02-07 16:14:46 +00:00
Gael Guennebaud
6c4b6c2da8 forgot to commit the deletion of StdVector directory 2009-02-07 15:47:13 +00:00
Gael Guennebaud
17fd619430 more fixes in StdVector, sorry for the noise 2009-02-07 12:51:58 +00:00
Gael Guennebaud
365ec0744c disable vector::resize() workaround for gcc < 4.1 (they already use a const
reference)
2009-02-07 12:46:04 +00:00
Gael Guennebaud
671398372b arf... s/_MSVC_VER/_MSC_VER 2009-02-07 11:49:33 +00:00
Gael Guennebaud
4dd2efa113 little fix in new StdVector 2009-02-07 11:30:19 +00:00
Gael Guennebaud
3009d79a1f * allow Matrix to be resized to 0 (solve a lot of troubles with
some containers)
* new workaround for std::vector which is supposed to work for any
  classes having EIGEN_MAKE_ALIGNED_OPERATOR_NEW as discussed on ML
2009-02-07 11:16:15 +00:00
Gael Guennebaud
24808dc090 force the use of debug version of QtCore unless it is not available 2009-02-06 21:50:18 +00:00
Gael Guennebaud
1c24f5bbc5 eventually MSVC does not like my /O2 flags (incompatibility with other option set by default) 2009-02-06 15:29:40 +00:00
Gael Guennebaud
19b035ee11 s/cholesky/llt in precompiled lib and BTL 2009-02-06 14:01:01 +00:00
Gael Guennebaud
cc90495e30 add bench_reverse, draft of a reverse vectorization for AltiVec, make
global Scaling function static
2009-02-06 13:28:55 +00:00
Gael Guennebaud
f5d96df800 Add vectorization of Reverse (was more tricky than I thought) and
simplify the index based functions
2009-02-06 12:40:38 +00:00
Gael Guennebaud
4dc4ab3abb Reverse::coeff*(int) functions are for vector only 2009-02-06 09:13:04 +00:00
Gael Guennebaud
6fbca94803 apply Ricard patch for Reverse with minor modifications 2009-02-06 09:01:50 +00:00
Gael Guennebaud
dc97079905 add optimization flag for MSVC and heavy tests
remove unsupported namespace
2009-02-05 22:13:50 +00:00
Gael Guennebaud
a4487ef198 add snippet for sub/super diagonal
fix a few doc issues
2009-02-05 21:19:40 +00:00
Gael Guennebaud
e80f0b6c4e update doc of DiagonalCoeffs 2009-02-05 18:39:23 +00:00
Gael Guennebaud
910b387438 Add sub/super-diagonal expression (read/write) as a trivial extension of
DiagonalCoeffs. The current API is simply:
  m.diagonal<1>() => 1st super diagonal
  m.diagonal<-2>() => the 2nd sub diagonal
I'll add a code snippet once we agree on this API.
2009-02-05 18:37:21 +00:00
Gael Guennebaud
9637af5ecf undo an unecessary change in cache-friendly product made for MSVC 2009-02-05 11:37:57 +00:00
Gael Guennebaud
7c374394ee remove explicit fortran dependency in FindCholmod 2009-02-05 10:34:41 +00:00
Gael Guennebaud
8a6f38c638 forgot to commit changes in main CMakeLists.txt
(add unsupported folder)
2009-02-05 09:37:53 +00:00
Gael Guennebaud
da45184635 add custom FindBLAS FindLAPACK working for c++ compiler
fix issues in Cholmod/Taucs supports
2009-02-05 09:36:52 +00:00
Gael Guennebaud
1119f846cf fix various Taucs and Cholmod issues (they have not been tested for a while) 2009-02-04 22:14:09 +00:00
Gael Guennebaud
41f80b26cb bugfix in LDLt for size==1 2009-02-04 20:20:34 +00:00
Gael Guennebaud
5e6707d7f7 forgot to add EigenTesting.cmake file 2009-02-04 17:11:04 +00:00
Benoit Jacob
5d69c5102b oops, #ifdef instead of #if ---> bug 2009-02-04 16:57:28 +00:00
Benoit Jacob
f81479d392 forgot to update this unit test... 2009-02-04 16:55:38 +00:00
Benoit Jacob
93a089adc8 disable alignment altogether outside of the platforms which potentially have SSE or AltiVec
This should remove most portability issues to other platforms where data alignment issues (including
overloading operator new and new[]) can be tricky, and where data alignment is not needed in the first place.
2009-02-04 16:53:03 +00:00
Gael Guennebaud
c26dc9ab1b nothing interesting to see 2009-02-04 16:22:56 +00:00
Gael Guennebaud
82e0cbbac5 extend unsupported/README.txt file 2009-02-04 15:44:59 +00:00
Gael Guennebaud
95db32fcdc setup the unsupported directory structure.
The unsupported module documentation is automatically generated in:
  build/doc/unsupported/
with bidirectional cross references.
I leave a class Foo in AdolcForward module to illustrate the
cross-reference behavior. I will remove it in the next commit.
2009-02-04 15:37:00 +00:00
Gael Guennebaud
44a527dfa5 * classify and sort the doxygen "related pages"
by tweaking the filename and adding 2 categories:
   Troubleshooting and Advanced
* use the EXCLUDE_SYMBOLS to clean the class list
  (insteaded of a homemade bash script)
* remove the broken "exemple list"
* re-structure the unsupported directory as mentionned in the ML and
  integrate the doc as follow:
  - snippets of the unsupported directory are directly imported from the
    main snippets/CMakefile.txt (no need to duplicate code)
  - add a top level "Unsupported modules" group
  - unsupported modules have to defined their own sub group and nest it
    using \ingroup Unsupported_modules
  - then a pair of //@{ //@} will put everything in the submodule
  - this is just a proposal !
2009-02-04 09:44:44 +00:00
Gael Guennebaud
b0dd22cc72 update cholesky benchmark 2009-02-03 19:05:10 +00:00
Keir Mierle
b9a82be727 Add full pivoting to LDLT decomposition. 2009-02-03 17:50:35 +00:00
Gael Guennebaud
6c5868cc99 add a short unsupported/readme file 2009-02-03 09:02:55 +00:00
Gael Guennebaud
42fa03948c add unsupported/ directory with a first contribution from myself:
a header file providing support for adolc's adouble type in forward
mode. (adolc is an automatic differentiation library)
2009-02-03 08:34:09 +00:00
Keir Mierle
b4777379d4 Add Matrix::resizeLike(other) convenience function and test. 2009-02-03 01:43:59 +00:00
Benoit Jacob
bc9a276b78 call it "Eigen 2.0.50-unstable" to make things clear, and update EIGEN_MINOR_VERSION to 50 2009-02-02 17:15:01 +00:00
Benoit Jacob
61e45ed500 * label Cholesky and solveTriangular.* as experimental
* improve Experimental.dox
* update urls from /api/ to /dox/
2009-02-02 14:21:35 +00:00
Benoit Jacob
f3d5ba0c1f the BSD's don't have aligned malloc after all 2009-02-02 13:22:19 +00:00
Gael Guennebaud
42c4bc0ecf fix tutorial abs/abs2 (thanks to Keir) 2009-02-01 22:38:51 +00:00
Gael Guennebaud
4111316104 forgot to commit QR_solve snippet 2009-02-01 20:47:19 +00:00
Benoit Jacob
37cceeeaca add missing inline keywords 2009-01-30 23:08:47 +00:00
Gael Guennebaud
82e70fbcae fix duplicated geometry module in the doc 2009-01-29 23:10:16 +00:00
Gael Guennebaud
13d0a310fd fix MSVC internal compilation error 2009-01-29 22:49:24 +00:00
Gael Guennebaud
8e0ec3c62b reduce epsilon in QR 2009-01-29 16:11:46 +00:00
Gael Guennebaud
cf0857c44d fix MSVC stupid warnings 2009-01-29 09:45:25 +00:00
Gael Guennebaud
1752a3a677 more MSVC fixes, and more code factorization in Geometry module 2009-01-29 09:36:48 +00:00
Gael Guennebaud
36c8a64923 add MatrixBase::stableNorm() avoiding over/under-flow
using it in QR reduced the error of Keir test from 1e-12 to 1e-24 but
that's much more expensive !
2009-01-28 22:11:56 +00:00
Gael Guennebaud
42b237b83a * make sum and redux honor EvalBeforeNestingBit too
* fix MSVC issues (hopefully)
2009-01-28 21:09:08 +00:00
Gael Guennebaud
cc6159743d make dot() honor EvalBeforeNestingBit 2009-01-28 20:53:27 +00:00
Gael Guennebaud
dde729379a various updates in the (still messy) sparse benchmarks 2009-01-28 20:32:28 +00:00
Gael Guennebaud
9b594ab0fb fix overflow in sparse product 2009-01-28 18:21:38 +00:00
Benoit Jacob
9bc44094c5 add EIGEN_NO_AUTOMATIC_RESIZING
if defined, already initialized matrices won't be automatically resized in assignments
uninitialized matrices may still be initialized
2009-01-28 16:44:03 +00:00
Gael Guennebaud
1b194193ef Big change in DiagonalMatrix and Geometry/Scaling:
* previous DiagonalMatrix expression is now DiagonalMatrixWrapper
* DiagonalMatrix class is now for storage
* add the DiagonalMatrixBase class to factorize code of the
  two previous classes
* remove Scaling class (it is now a global function)
* add UniformScaling helper class
  (don't use it directly, use the Scaling function)
* add the Scaling global function to simplify the creation
  of scaling objects
There is still a lot to do, in particular about DiagonalProduct for which
the goal is to get rid of the "if()" in the coeff() function. At least
it is not worse than before ! Also need to uptade the tutorial and add more doc.
2009-01-28 16:26:06 +00:00
Gael Guennebaud
da555585e2 Patch from Frank fixing stupid MSVC internal crash 2009-01-28 12:52:26 +00:00
Gael Guennebaud
22bfc77124 move EIGEN_DEPRECATED to the begining of the function (pb with MSVC) 2009-01-28 12:48:19 +00:00
Gael Guennebaud
0f15a8d829 QR: add isInjective(), isSurjective(),
mark isFullRank() deprecated,
    add solve() (mix of Keir's patch and LU::solve())
=> there is big problem with complex which are not working
2009-01-28 09:45:53 +00:00
Gael Guennebaud
cf89d9371a LLT: makes the non positive definite test less strict, but we still need
something better.
2009-01-27 23:01:53 +00:00
Gael Guennebaud
8ce4503494 add support for read/write sub sets of inner vectors (sparse module) 2009-01-27 22:48:17 +00:00
Benoit Jacob
d384671793 now these tests succeed with 10,000 repeats 2009-01-27 20:47:12 +00:00
Gael Guennebaud
4ac8cabf8a fix my previous commit with EIGEN_EMPTY macro bug 2009-01-27 19:08:20 +00:00
Benoit Jacob
9b06e072a5 fix type mismatch caught by new static assert 2009-01-27 17:42:04 +00:00
Gael Guennebaud
d700343632 fix "empty macro arguments are undefined in ISO C90 and ISO C++98"
warning found by gcc-svn
2009-01-27 17:40:14 +00:00
Gael Guennebaud
bea1737a5a FindUmfPack: add AMD and COLAMD libraries only if they are found 2009-01-27 16:22:08 +00:00
Gael Guennebaud
4b09865b8f check GSL version in cmake files 2009-01-27 16:04:16 +00:00
Benoit Jacob
f6aa60bcf3 centralize those static asserts more upstream, reduces duplication and ensures they can't be bypassed (e.g.
until now it was possible to bypass the static assert on sizes)
2009-01-27 15:40:05 +00:00
Benoit Jacob
8d236e74a1 add a missing static assertion on the scalar types 2009-01-27 15:29:07 +00:00
Benoit Jacob
046a84c0ef fix doc for norm() and squaredNorm(): these are not only for vectors 2009-01-27 14:05:20 +00:00
Gael Guennebaud
d7f60257dd small fix related to GSL cmake stuff 2009-01-26 20:00:41 +00:00
Benoit Jacob
874ff5a0b4 fix msvc warnings (useful ones again) reported by gael on CDash 2009-01-26 17:56:04 +00:00
Benoit Jacob
062d6bd47a documentation update/improvement 2009-01-26 16:22:40 +00:00
Benoit Jacob
b05c5dd1be compute the rank on the fly rather than at the end, and stop early
in the case of non-full rank (so big optimization in case the rank is low)
2009-01-26 16:14:20 +00:00
Gael Guennebaud
9ea1050281 fix MatrixNase::fillrand bug 2009-01-26 14:01:16 +00:00
Benoit Jacob
2776a3197b make these functions inline, thanks to Mek 2009-01-26 13:59:52 +00:00
Benoit Jacob
670de11dca forgot to backport an update to Mainpage.dox 2009-01-26 13:55:58 +00:00
Benoit Jacob
a79deafc6d * mark Geometry as experimental
* install QtAlignedMalloc
* finish the renaming Regression->LeastSquares
* install LeastSquares directory (!!!)
* misc dox fixes
2009-01-26 13:53:43 +00:00
Gael Guennebaud
65e4ae4ff4 disable unordered_map for ICC 2009-01-26 12:47:58 +00:00
Benoit Jacob
00d7f8e567 * solveTriangularInPlace(): take a const ref and const_cast it, to allow passing temporary xprs.
* improvements, simplifications in LU::solve()
* remove remnant of old norm2()
2009-01-25 23:46:51 +00:00
Benoit Jacob
414ee1db4b Optimization in LU::solve: when rows<=cols, no need to compute the L matrix
Remove matrixL() and matrixU() methods: they were tricky, returning a Part,
and matrixL() was useless for non-square LU. Also they were untested. This is
the occasion to simplify the docs (class_LU.cpp) removing the most confusing part.
I think that it's better to let the user do his own cooking with Part's.
2009-01-25 16:33:06 +00:00
Gael Guennebaud
56c7e164f0 add partial count redux (adapted patch from Ricard Marxer) 2009-01-24 15:22:44 +00:00
Gael Guennebaud
81b0ab53cf add fill() function as an alias for setConstant 2009-01-24 10:52:18 +00:00
Gael Guennebaud
9849931500 eventually it turns out that our current
EIGEN_WORK_AROUND_QT_BUG_CALLING_WRONG_OPERATOR_NEW_FIXED_IN_QT_4_5
is the right way to go for allowing placement new on a class having
overloaded operator new. Qt 4.5 won't add the :: prefix. (I still do not
understand how you can distinghish a placement new from an overloaded
operator new taking an allocator as argument...)
2009-01-23 16:31:03 +00:00
Gael Guennebaud
e7c48fac9b sparse module: makes -= and += operator working
Question 1: why are *=scalar and /=scalar working right away ?
Same weirdness in DynamicSparseMatrix where operators += and -= work wihout
  having to redefine them ???
2009-01-23 13:59:32 +00:00
Gael Guennebaud
899e2ada15 very minor update in test/CMakeLists.txt 2009-01-23 13:17:57 +00:00
Gael Guennebaud
6a722602e6 fix a few remaining warnings
and fix commainitializer unit test with MSVC
2009-01-23 12:26:32 +00:00
Gael Guennebaud
d3dcb04f2d * fix compilation with gcc 3.4
* add an option to disable Qt testing
2009-01-23 09:50:16 +00:00
Benoit Jacob
291ee89684 add computeRotationScaling and computeScalingRotation in SVD
add convenience functions in Transform
reimplement Transform::rotation() to use that
add unit-test
2009-01-22 16:39:08 +00:00
Benoit Jacob
876b1fb842 add polar decomposition on both sides, in SVD, with test 2009-01-22 15:00:47 +00:00
Gael Guennebaud
32754d806d I hope this one fix the issue with MSVC and sparse module 2009-01-22 13:59:28 +00:00
Benoit Jacob
9e3c73110a fix a bunch of warnings (actual issues) reported by Frank 2009-01-22 00:09:34 +00:00
Gael Guennebaud
4225b7bf57 perhaps fix a compilation issue in the sparse module with MSVC 2009-01-21 21:02:55 +00:00
Benoit Jacob
7ccea9222c fix warnings 2009-01-21 20:06:16 +00:00
Gael Guennebaud
52cf07d266 sparse module:
* add row(i), col(i) functions
* add prune() function to remove small coefficients
2009-01-21 18:46:04 +00:00
Gael Guennebaud
25f1658fce update sparse matrix tutorial (far to be satisfactory yet) 2009-01-21 17:44:58 +00:00
Benoit Jacob
5f43a42ee7 * remove set(), revert to old behavior where = resizes
* try to be clever in matrix ctors and operator=: be lazy when we can, always allow
  to copy rowvector into columnvector, check the template parameters,
  try to factor the code better
* add missing copy ctor in UnalignedType
* fix bug in the traits of DiagonalProduct
* renaming: EIGEN_TUNE_FOR_CPU_CACHE_SIZE
* update the dox a little
2009-01-21 17:10:23 +00:00
Marijn Kruisselbrink
a5fbf27843 don't claim googlehash is there when only qt4 has been found :) 2009-01-20 22:15:51 +00:00
Benoit Jacob
294682a25a lu test:don't fail 2009-01-20 19:06:39 +00:00
Gael Guennebaud
f645d1f911 * complete the support of QVector via a QtAlignedMalloc header
* add a unit test for QVector which shows the issue with QVector::fill
2009-01-20 16:50:47 +00:00
Gael Guennebaud
8973d12cda * QR: add a rank() method and improve the accuracy of the rank
* computation
* Array: add a count() method and rename AllAndAny.h file to
  BooleanRedux.h
2009-01-20 16:37:52 +00:00
Benoit Jacob
81cb887baf fix bug in the computation of rank
very difficult to catch in unit-tests because this is very noisy
2009-01-20 16:21:56 +00:00
Gael Guennebaud
3134d5290b forgot to include the update of the qr test 2009-01-20 10:38:56 +00:00
Gael Guennebaud
3e5b3a33fa quick bugfix in QR::isFullRank() (not 100% sure about the reference value
for the comparison to 0)
2009-01-20 10:37:39 +00:00
Marijn Kruisselbrink
8690ba923c I assume these files where supposed to be installed 2009-01-19 22:58:15 +00:00
Gael Guennebaud
36c478cd6e optimize A * v product for A sparse and row major 2009-01-19 22:29:28 +00:00
Gael Guennebaud
178858f1bd add a flexible sparse matrix class designed for fast matrix assembly 2009-01-19 15:20:45 +00:00
Benoit Jacob
385fd3d918 * clarify the situation with experimental parts
* remove all what was marked deprecated
2009-01-19 15:02:24 +00:00
Benoit Jacob
dcaa58744e #error if min or max is defined 2009-01-19 13:23:41 +00:00
Gael Guennebaud
9ae03f4589 add a compilation test in FindGoogleHash.cmake to catch configuration
issues when multiple compilers are used on the same system.
2009-01-19 08:51:06 +00:00
Gael Guennebaud
708fa36750 revert bad commit 2009-01-19 08:21:32 +00:00
Gael Guennebaud
d58bb54e7f add a smart realloc algorithm when filling a sparse matrix 2009-01-18 18:00:19 +00:00
Gael Guennebaud
0c7974dd4d bugfix in Map by Keir Mierle 2009-01-18 09:53:06 +00:00
Gael Guennebaud
22792c696f add ublas vector of vector in sparse setter bench 2009-01-17 16:24:49 +00:00
Benoit Jacob
61b85b1436 update documentation (thanks Kenneth) 2009-01-17 14:24:09 +00:00
Gael Guennebaud
cc6c4d807b add a sparse setter bench 2009-01-17 14:05:01 +00:00
Gael Guennebaud
c5020c6e8e patch from Ricard Marxer: add doc example for select() 2009-01-17 09:59:32 +00:00
Gael Guennebaud
1eec38dc36 Rewrite the vectorized meta unroller of sum to reduce instruction
dependency => significant speed up
2009-01-17 09:48:58 +00:00
Benoit Jacob
e556e647f4 typo found by Ben Axelrod
CCMAIL:baxelrod@coroware.com
2009-01-17 00:53:38 +00:00
Gael Guennebaud
86a192681e * Started a tutorial page for sparse matrix.
* Updated Mainpage.dox's directives used by kde's scripts
2009-01-16 17:20:12 +00:00
Gael Guennebaud
48df9ed715 Add a data() function in Map and Block 2009-01-16 10:25:53 +00:00
Gael Guennebaud
ccdcebcf03 Sparse module: add support for sparse selfadjoint * dense 2009-01-15 18:52:14 +00:00
Gael Guennebaud
9a4b7998cf Sparse module: add row/col methods to the iterators 2009-01-15 14:16:41 +00:00
Gael Guennebaud
87241089e1 Sparse module: bugfix in SparseMatrix::resize(), now the indices are
correctly initialized to 0.
2009-01-15 13:30:50 +00:00
Gael Guennebaud
96e1e582ff Sparse module:
* add a MappedSparseMatrix class (like Eigen::Map but for sparse
  matrices)
* rename SparseArray to CompressedStorage
2009-01-15 12:52:59 +00:00
Gael Guennebaud
4f33fbfc07 two compilation fixes 2009-01-15 08:26:40 +00:00
Gael Guennebaud
2d53466fa9 Sparse module:
* improved performance of mat*=scalar
* bug fix in cwise*
2009-01-14 21:27:54 +00:00
Gael Guennebaud
f5741d4277 add a sparse * dense_vector bench 2009-01-14 18:27:17 +00:00
Gael Guennebaud
0b606dcccd Add support for sparse * dense and dense * sparse matrix/vector products 2009-01-14 17:41:55 +00:00
Gael Guennebaud
c4c70669d1 Big rewrite in the Sparse module: SparseMatrixBase no longer inherits MatrixBase.
That means a lot of features which were available for sparse matrices
via the dense (and super slow) implemention are no longer available.
All features which make sense for sparse matrices (aka can be implemented efficiently) will be
implemented soon, but don't expect to see an API as rich as for the dense path.
Other changes:
* no block(), row(), col() anymore.
* instead use .innerVector() to get a col or row vector of a matrix.
* .segment(), start(), end() will be back soon, not sure for block()
* faster cwise product
2009-01-14 14:24:10 +00:00
Gael Guennebaud
ee87f5ee49 s/EIGEN_WORK_DIRECTORY/EIGEN_WORK_DIR in testsuite script 2009-01-14 10:43:36 +00:00
Benoit Jacob
a5632fe1db add EIGEN_FUNCTORS_PLUGIN 2009-01-13 16:27:26 +00:00
Benoit Jacob
e8e1084267 add EIGEN_CWISE_PLUGIN support for extending class Cwise 2009-01-13 15:31:06 +00:00
Gael Guennebaud
ec0b2f900c fix a couple of doxygen issues 2009-01-13 08:30:17 +00:00
Benoit Jacob
b179f8e1a4 add NetBSD to the list of OSes on which malloc is guaranteed to be 16
byte aligned, after discussion with Mark Davies.

CCMAIL: mark@ecs.vuw.ac.nz
2009-01-12 22:39:26 +00:00
Gael Guennebaud
62d01d3cf4 hm it seems cmake does not understand CYGWIN (=> UNIX) 2009-01-12 19:39:50 +00:00
Gael Guennebaud
534dc60672 improved a bit the testsuite script 2009-01-12 18:10:41 +00:00
Benoit Jacob
38d916d50f just a stub for the householder stuff we did with keir yesterday... 2009-01-12 16:56:11 +00:00
Benoit Jacob
24fd14dab6 * minor dox tweaks
* pretext to bump to beta6
2009-01-12 16:14:13 +00:00
Benoit Jacob
4d44ca226e * make std::vector specializations also for Transform and for Quaternion
* update test_stdvector
* Quaternion() does nothing (instead of bug)
* update test_geometry
* some renaming
2009-01-12 16:06:04 +00:00
Gael Guennebaud
b26e12abcf make ei_traist<Select> honors nested types 2009-01-12 15:55:56 +00:00
Gael Guennebaud
a7554023a4 bugfix in ei_unaligned_type copy ctor 2009-01-12 15:51:49 +00:00
Gael Guennebaud
9d97c06d9d fix a warning in test/alignedbox 2009-01-12 15:29:51 +00:00
Gael Guennebaud
1fc17c4792 remove stupid output in test/meta 2009-01-12 15:27:25 +00:00
Gael Guennebaud
6efaece8ce disable/enable msvc headers are allowed to be included multiple times 2009-01-12 14:46:11 +00:00
Benoit Jacob
294f5f16dd muuuch improved documentation for the unaligned array assert.
also split into several pages for better reusability (more generally useful than
just for this assert)
2009-01-12 14:41:12 +00:00
Gael Guennebaud
f86818b5a6 bug fix in ei_stack_free 2009-01-12 14:20:21 +00:00
Benoit Jacob
336ad58213 * move cwise *= and /= to Core (like * and /)
* tidy the StdVector module
* fix warnings (especially a | instead of ||) in stdvector test
2009-01-12 13:41:40 +00:00
Gael Guennebaud
af5034b3c0 add a clean configuration summary for the unit tests (useful to analyze the dashboards) 2009-01-12 13:40:02 +00:00
Gael Guennebaud
7f881e814f another warning fix 2009-01-12 13:01:31 +00:00
Gael Guennebaud
a4252584ed bugfix in ei_handmade_aligned_free for null pointers 2009-01-12 12:54:32 +00:00
Gael Guennebaud
b365f443a2 make the testsuite works on windows with nmake 2009-01-12 12:53:41 +00:00
Gael Guennebaud
484ed3bbe2 fix a warning in test/sparse.h 2009-01-12 12:52:36 +00:00
Gael Guennebaud
92959aa5f3 add the possiblity to disable Fortran (workaround cmake's bug 2009-01-12 12:51:40 +00:00
Gael Guennebaud
2db5888253 update testsuite script 2009-01-12 11:55:52 +00:00
Gael Guennebaud
f268e79709 simplify some ei_traits<> using inheritance
(less loc and slight compilation speed up)
2009-01-11 22:03:40 +00:00
Gael Guennebaud
9e8f437a6f extend stdvector test with more push_back... 2009-01-11 21:59:04 +00:00
Benoit Jacob
824b75f182 forgot to commit updated test 2009-01-11 21:04:23 +00:00
Benoit Jacob
a30b498ab4 add cwise operator *= and /=.
Keir says hi!!
2009-01-11 20:48:56 +00:00
James Richard Tyrer
28e15574df updating FindEigen2.cmake for proper search order 2009-01-11 16:18:59 +00:00
Armin Berres
6c84b03d77 add missing newline at EOF 2009-01-11 13:32:55 +00:00
Benoit Jacob
4361cb8913 EIGEN_NO_MALLOC must also block traditional unaligned malloc 2009-01-10 14:24:55 +00:00
Benoit Jacob
50ad8b9010 fix potential compilation issue on MSVC + no vectorization 2009-01-10 14:10:40 +00:00
Benoit Jacob
0c1ef2f4c6 make the std::vector fix work also with dynamic size Eigen objects, e.g.
std::vector<VectorXd>
update unit test
2009-01-10 13:10:23 +00:00
Benoit Jacob
3efe6e4176 remove ei_new_allocator
remove corresponding part of test_dynalloc
2009-01-10 02:50:09 +00:00
Benoit Jacob
335d3bcf05 Based on code + help from Alex Stapleton:
*Add Eigen/StdVector header.
Including it #includes<vector> and "Core" and generates a partial
specialization of std::vector<T> for T=Eigen::Matrix<...>
that will work even with vectorizable fixed-size Eigen types
(working around a design issue in the c++ STL)
*Add unit-test

CCMAIL: alex.stapleton@gmail.com
2009-01-09 23:26:45 +00:00
Benoit Jacob
5052d3659b oops, fix compilation (sorry for all that noise!) 2009-01-09 21:43:46 +00:00
Benoit Jacob
5bef59e371 (previous commit was bad) 2009-01-09 21:32:44 +00:00
Benoit Jacob
265ab86005 overloaded operator delete should call ei_conditinal_aligned_free, not
ei_aligned_free
2009-01-09 21:28:53 +00:00
Benoit Jacob
b3d580dec7 ei_aligned_delete was running through the various paths in the wrong
order
2009-01-09 20:57:06 +00:00
Benoit Jacob
fd831d5a12 * implement handmade aligned malloc, fast but always wastes 16 bytes of memory.
only used as fallback for now, needs benchmarking.
  also notice that some malloc() impls do waste memory to keep track of alignment
  and other stuff (check msdn's page on malloc).
* expand test_dynalloc to cover low level aligned alloc funcs. Remove the old
  #ifdef EIGEN_VECTORIZE...
* rewrite the logic choosing an aligned alloc, some new stuff:
  * malloc() already aligned on freebsd and windows x64 (plus apple already)
  * _mm_malloc() used only if EIGEN_VECTORIZE
  * posix_memalign: correct detection according to man page (not necessarily
    linux specific), don't attempt to declare it if the platform didn't declare it
    (there had to be a reason why it didn't declare it, right?)
2009-01-09 14:56:44 +00:00
Kenneth Frank Riddile
f52a9e5315 * Added aligned_allocator for using 16-byte aligned types with STL containers. There is still a compile-time problem with STL containers that have a standard-conformant resize() method, but this should resolve the original user issue which was storing aligned objects in a std::map. 2009-01-09 00:55:53 +00:00
Benoit Jacob
003d0ce03e add missing inline keywords (compilation error) spotted by timvdm 2009-01-08 20:20:11 +00:00
Gael Guennebaud
208b273106 change the day switch time to 5:00 UTC 2009-01-08 18:45:26 +00:00
Gael Guennebaud
b315f4c359 add a ctest testsuite.cmake script to simplify the generation of nightly builds 2009-01-08 18:38:58 +00:00
Benoit Jacob
51aa62a5d4 add EIGEN_WORK_AROUND_QT_BUG_CALLING_WRONG_OPERATOR_NEW_FIXED_IN_QT_4_5
an ugly hack that allows people to do QVector<Vector4f> with Qt <= 4.4
the Qt bug this is working around is fixed by Gael in Qt 4.5
2009-01-08 16:03:24 +00:00
Benoit Jacob
eb7dcbbfce EIGEN_MAKE_ALIGNED_OPERATOR_NEW didn't actually need to get the class
name as parameter
2009-01-08 15:37:13 +00:00
Benoit Jacob
1d52bd4cad the big memory changes. the most important changes are:
ei_aligned_malloc now really behaves like a malloc
 (untyped, doesn't call ctor)
ei_aligned_new is the typed variant calling ctor
EIGEN_MAKE_ALIGNED_OPERATOR_NEW now takes the class name as parameter
2009-01-08 15:20:21 +00:00
Benoit Jacob
e2d2a7d222 fix a bug -- bad read found by valgrind 2009-01-08 14:23:30 +00:00
Gael Guennebaud
af27fb7590 Add cdash.org support:
* the dashboard is there: http://my.cdash.org/index.php?project=Eigen
* now you can run the tests from the top build dir
  and submit report like that (from the top build dir):
  ctest -D Experimental
* todo:
 - add some nighlty builds (I'll add a few on my computer)
 - add valgrind memory checks, performances tests, compilation time tests, etc.
2009-01-08 11:53:21 +00:00
Gael Guennebaud
8d3469ca44 add BLAS dependency in FindSuperLU.cmake 2009-01-08 11:27:02 +00:00
Gael Guennebaud
4432cf8ca3 bugfix in sum 2009-01-08 10:36:54 +00:00
Gael Guennebaud
cb71dc4bbf Add a vectorization_logic unit test for assign and sum.
Need to add dot and more tests, but it seems I've already
found some room for improvement in sum.
2009-01-07 22:20:03 +00:00
Gael Guennebaud
709e903335 Sparse module:
* extend unit tests
* add support for generic sum reduction and dot product
* optimize the cwise()* : this is a special case of CwiseBinaryOp where
  we only have to process the coeffs which are not null for *both* matrices.
  Perhaps there exist some other binary operations like that ?
2009-01-07 17:01:57 +00:00
Kenneth Frank Riddile
7078cfaeaa * re-enabled deprecation warning on msvc now that eigen isn't using functions microsoft deems deprecated 2009-01-07 16:59:43 +00:00
Gael Guennebaud
336f0a8486 fine tuning in dot() and sum(), and prepare for the sparse versions... 2009-01-07 16:58:17 +00:00
Gael Guennebaud
6b9d647fc2 add custom implementation of hypot 2009-01-07 16:53:09 +00:00
Kenneth Frank Riddile
3e90940189 * disabled deprecation warning under msvc that was being triggered due to microsoft deprecating portable functions in favor of non-portable ones ( ex: hypot() deprecated in favor of _hypot() ) 2009-01-07 16:50:35 +00:00
Gael Guennebaud
8cea5f6b31 remove non standard hypotf function call 2009-01-07 16:24:27 +00:00
Gael Guennebaud
94efa16187 more Scalar conversions fixes 2009-01-07 10:22:46 +00:00
Gael Guennebaud
ac9d805c2f two scalar conversion fixes 2009-01-07 09:47:16 +00:00
Benoit Jacob
2db434265b remove the Matrix_ prefix 2009-01-06 18:07:16 +00:00
Armin Berres
8e888a45d0 call methods from the eigen namespace with prepended Eigen:: in define 2009-01-06 11:20:36 +00:00
Kenneth Frank Riddile
6736e52d25 * suppressed some minor warnings 2009-01-06 04:38:00 +00:00
Benoit Jacob
1c29d70312 * introduce macros to replace inheritance for operator new overloading
(former solution still available and tested)
  This plays much better with classes that already have base classes --
  don't force the user to mess with multiple inheritance, which gave
  much trouble with MSVC.
* Expand the unaligned assert dox page
* Minor fixes in the lazy evaluation dox page
2009-01-06 03:16:50 +00:00
Benoit Jacob
e71de20f16 ...so the placement new is really trivial:) 2009-01-05 20:06:52 +00:00
Benoit Jacob
1eec8488a2 oops, placement new should take a void* 2009-01-05 19:56:32 +00:00
Benoit Jacob
9dcde48980 we also had to overload the placement new... issue uncovered by Tim when
doing QVector<Vector3d>...
2009-01-05 19:49:07 +00:00
Benoit Jacob
215ce5bdc1 forgot to install the LeastSquares header 2009-01-05 19:12:35 +00:00
Benoit Jacob
10f9478f35 release beta5, fix a doc typo 2009-01-05 19:00:10 +00:00
Benoit Jacob
9c8a653c0b gaaaaaaaaaaaaaaaaaah
can't believe that 3 people wasted so much time because of a missing &
!!!!!
and this time MSVC didn't catch it so it was really copying the vector
on the stack at an unaligned location!
2009-01-05 18:33:39 +00:00
Benoit Jacob
8551505436 problem solved, we really want public inheritance and it is only
automatic when the _child_ class is a struct.
2009-01-05 18:21:44 +00:00
Benoit Jacob
1b88042880 the empty base class optimization is not standard. Most compilers implement a basic form of it; however MSVC won't implement it if there is more than one empty base class.
For that reason, we shouldn't give Matrix two empty base classes, since sizeof(Matrix) must be optimal.
So we overload operator new and delete manually rather than inheriting an empty struct for doing that.
2009-01-05 16:46:19 +00:00
Benoit Jacob
7db3f2f72a *fix compilation with MSVC 2005 in the Transform::construct_from_matrix
*fix warnings with MSVC 2005: converting M_PI to float gives loss-of-precision warnings
2009-01-05 14:47:38 +00:00
Armin Berres
dae7f065d4 didn't meant to commit the fortran check. but anyway, unfortunately it doesn't work the way iot should. the test fails and cmake will terminate if it doesn't find a fortran compiler 2009-01-05 14:40:27 +00:00
Armin Berres
ab660a4d29 inherit from ei_with_aligned_operator_new even with disabled vectorization 2009-01-05 14:35:07 +00:00
Benoit Jacob
986f301233 *add PartialRedux::cross() with unit test
*add transform-from-matrices test
*undo an unwanted change in Matrix
2009-01-05 14:26:34 +00:00
Benoit Jacob
d316d4f393 fix compilation on apple: _mm_malloc was undefined. the fix is to just use malloc since on apple it already returns aligned ptrs 2009-01-05 13:15:27 +00:00
Benoit Jacob
e1ee876daa fix segfault due to non-aligned packets 2009-01-04 23:23:32 +00:00
Benoit Jacob
3de311f497 oops forgot important parentheses 2009-01-04 21:23:02 +00:00
Benoit Jacob
06fd84cdb1 Fix bug in Matrix, in determining whether to overload operator new with an aligned one, introduced in r905543 2009-01-04 21:20:42 +00:00
Benoit Jacob
0e5c640563 * fix a unused variable warning
* if svnversion returns prose such as "exported" then discard that
2009-01-04 17:32:20 +00:00
Benoit Jacob
be64619ab6 * require CMake 2.6.2 everywhere, Alexander Neundorf says it'd make it
easier to have a uniform requirement in kdesupport for when he makes
fixes.
* add eigen versioning macros
2009-01-04 16:19:12 +00:00
Benoit Jacob
269bf67796 rename Regression --> LeastSquares 2009-01-04 15:55:54 +00:00
Benoit Jacob
15ca6659ac * the 4th template param of Matrix is now Options. One bit for storage
order, one bit for enabling/disabling auto-alignment. If you want to
disable, do:
Matrix<float,4,1,Matrix_DontAlign>
The Matrix_ prefix is the only way I can see to avoid
ambiguity/pollution. The old RowMajor, ColMajor constants are
deprecated, remain for now.
* this prompted several improvements in matrix_storage. ei_aligned_array
renamed to ei_matrix_array and moved there. The %16==0 tests are now
much more centralized in 1 place there.
* unalignedassert test: updated
* update FindEigen2.cmake from KDElibs
* determinant test: use VERIFY_IS_APPROX to fix false positives; add
testing of 1 big matrix
2009-01-04 15:26:32 +00:00
Benoit Jacob
d9e5fd393a * In LU solvers: no need anymore to use row-major matrices
* Matrix: always inherit WithAlignedOperatorNew, regardless of
vectorization or not
* rename ei_alloc_stack to ei_aligned_stack_alloc
* mixingtypes test: disable vectorization as SSE intrinsics don't allow
mixing types and we just get compile errors there.
2009-01-03 22:33:08 +00:00
Gael Guennebaud
fd7eba3394 Added a SparseVector class (not tested yet) 2009-01-02 08:47:09 +00:00
Benoit Jacob
d671205755 fix the nomalloc test: it didn't catch the ei_stack_alloc in
cachefriendlyproduct, that should be banned as well as depending on the
platform they can give a malloc, and they could happen even with (large
enough) fixed size matrices. Corresponding fix in Product.h:
cachefriendly is now only used for dynamic matrices -- fixedsize, no
matter how large, doesn't use the cachefriendly product. We don't need
to care (in my opinion) about performance for large fixed size, as large
fixed size is a bad idea in the first place and it is more important to
be able to guarantee clearly that fixed size never causes a malloc.
2008-12-31 00:25:31 +00:00
Benoit Jacob
3958e7f751 add unit-test checking the assertion on unaligned arrays -- checking
that it's triggered when and only when it should.
2008-12-31 00:05:22 +00:00
Benoit Jacob
164f410bb5 * make WithAlignedOperatorNew always align even when vectorization is disabled
* make ei_aligned_malloc and ei_aligned_free honor custom operator new and delete
2008-12-30 14:11:35 +00:00
Gael Guennebaud
a2782964c1 Sparse module, add an assertion and make the use of CHOLMOD's solve method the default. 2008-12-27 19:51:54 +00:00
Gael Guennebaud
ce3984844d Sparse module:
* enable complex support for the CHOLMOD LLT backend
   using CHOLMOD's triangular solver
 * quick fix for complex support in SparseLLT::solve
2008-12-27 18:13:29 +00:00
Benoit Jacob
361225068d fix bug discovered by Frank:
ei_alloc_stack must really return aligned pointers
2008-12-24 13:59:24 +00:00
Benoit Jacob
6f158fb7fc one last compilation fix ... 2008-12-22 21:04:13 +00:00
Benoit Jacob
789ea9d676 unless i find more failures in the tests, this will be beta3...
* fixes for mistakes (especially in the cast() methods in Geometry) revealed by the new "mixing types" test
* dox love, including a section on coeff access in core and an overview in geometry
2008-12-22 20:50:47 +00:00
Benoit Jacob
4336cf3833 * add unit-tests to check allowed and forbiddent mixing of different scalar types
* fix issues in Product revealed by this test
* in Dot.h forbid mixing of different types (at least for now, might allow real.dot(complex) in the future).
2008-12-22 19:17:44 +00:00
Benoit Jacob
f5a05e7ed1 unfuck v.cwise()*w where v is real and w is complex 2008-12-21 20:55:46 +00:00
Benoit Jacob
9e00d94543 * the Upper->UpperTriangular change
* finally get ei_add_test right
2008-12-20 13:36:12 +00:00
Benoit Jacob
21ab65e4b3 fix nasty little bug in ei_add_test 2008-12-20 01:13:38 +00:00
Gael Guennebaud
ebf906a23a add matrix * transform product 2008-12-19 16:31:22 +00:00
Benoit Jacob
df4bd5e46f * fix a test giving some false positives
* add coverage for various operator*=
2008-12-19 16:16:39 +00:00
Benoit Jacob
a3fad2e3c3 Transform*Transform should return Transform
unit test compiles again
2008-12-19 15:55:35 +00:00
Gael Guennebaud
5f6fbaa0e7 * fix a vectorization issue in Product
* use _mm_malloc/_mm_free on other platforms than linux of MSVC (eg., cygwin, OSX)
* replace a lot of inline keywords by EIGEN_STRONG_INLINE to compensate for
  poor MSVC inlining
2008-12-19 15:38:39 +00:00
Gael Guennebaud
8679d895d3 various MSVC fixes in BTL 2008-12-19 15:31:47 +00:00
Benoit Jacob
22875683b9 * extractRotation ---> rotation
* expand the geometry/Transform tests, after Mek's reports
* fix my own stupidity in eigensolver test
2008-12-19 14:52:03 +00:00
Benoit Jacob
e4980616fd SelfAdjointEigenSolver: add operatorSqrt() and operatorInverseSqrt() 2008-12-19 14:15:32 +00:00
Benoit Jacob
84bb868f07 * more MSVC warning fixes from Kenneth Riddile
* actually GCC 4.3.0 has a bug, "deprecated" placed at the end
  of a function prototype doesn't have any effect, moving them to
  the start of the function prototype makes it actually work!
* finish porting the cholesky unit-test to the new LLT/LDLT,
  after the above fix revealed a deprecated warning
2008-12-19 02:59:04 +00:00
Benoit Jacob
f34a4fa335 * forgot to svn add 2 files
* idea of Keir Mierle: make the static assert error msgs UPPERCASE
2008-12-18 21:04:06 +00:00
Benoit Jacob
8106d35408 Patch by Kenneth Riddile: disable MSVC warnings, reenable them outside
of Eigen, and add a MSVC-friendly path in StaticAssert.
2008-12-18 20:48:02 +00:00
Benoit Jacob
fabaa6915b * fix in IO.h, a useless copy was made because of assignment from
Derived to MatrixBase.
* the optimization of eval() for Matrix now consists in a partial
  specialization of ei_eval, which returns a reference type for Matrix.
  No overriding of eval() in Matrix anymore. Consequence: careful,
  ei_eval is no longer guaranteed to give a plain matrix type!
  For that, use ei_plain_matrix_type, or the PlainMatrixType typedef.
* so lots of changes to adapt to that everywhere. Hope this doesn't
  break (too much) MSVC compilation.
* add code examples for the new image() stuff.
* lower a bit the precision for floats in the unit tests as
  we were already doing some workarounds in inverse.cpp and we got some
  failed tests.
2008-12-18 20:36:25 +00:00
Benoit Jacob
b27a3644a2 Macros: add MSVC paths, add an important comment on EIGEN_ALIGN_128 2008-12-18 13:36:31 +00:00
Gael Guennebaud
93f8d56789 improved MSVC support in cmake files (SSE) 2008-12-18 09:07:36 +00:00
Benoit Jacob
15d72d3f9a somehow we had forgotten this very important static assertion... 2008-12-18 03:47:49 +00:00
Gael Guennebaud
9084e2a216 oops I commited bad stuff in the previous commit 2008-12-17 18:54:28 +00:00
Gael Guennebaud
6e138d0069 more MSVC cmake fixes 2008-12-17 18:37:04 +00:00
Gael Guennebaud
4cb63808d4 some cmake fixes for windows and GSL 2008-12-17 17:50:43 +00:00
Gael Guennebaud
c98fe0185e fix non standard non-const std::complex::real/imag issue 2008-12-17 17:31:23 +00:00
Benoit Jacob
5f582aa4e8 fix bad typos, thanks to Kenneth Riddile 2008-12-17 17:14:37 +00:00
Benoit Jacob
c22d10f94c LU class:
* add image() and computeImage() methods, with unit test
* fix a mistake in the definition of KernelResultType
* fix and improve comments
2008-12-17 16:47:55 +00:00
Gael Guennebaud
e3a8431a4a found one bug in the previous ++ changes 2008-12-17 16:04:21 +00:00
Benoit Jacob
89f468671d * replace postfix ++ by prefix ++ wherever that makes sense in Eigen/
* fix some "unused variable" warnings in the tests; there remains a libstdc++ "deprecated"
warning which I haven't looked much into
2008-12-17 14:30:01 +00:00
Benoit Jacob
2110cca431 actually honor EIGEN_STACK_ALLOCATION.
Can set it to 0 to disable stack alloc which guarantees that bad alloc throws exceptions if they are enabled.
2008-12-16 15:44:48 +00:00
Benoit Jacob
38b83b4157 * throw bad_alloc if exceptions are enabled, after patch by Kenneth Riddile
* disable vectorization on MSVC 2005, as it doesn't have all the required intrinsics. require 2008.
2008-12-16 15:17:29 +00:00
Benoit Jacob
50105c3ed6 Hopefully fix compilation of SSE Packetmath with MSVC.
The reason why we didn't realize until now that it didn't compile at all
with MSVC is that before today with MSVC the SSE2 detection didn't work.
2008-12-16 03:48:49 +00:00
Benoit Jacob
0a220721d1 Finally work around enough of MSVC preprocessor dumbness so that it actually detects SSE2 2008-12-15 21:20:40 +00:00
Benoit Jacob
1ad751b991 only enable the "unaligned array" assert if vectorization is enabled.
if vectorization is disabled, WithAlignedOperatorNew is empty!
2008-12-15 20:49:03 +00:00
Benoit Jacob
55b603e457 * fix a bug I introduced in WithAlignedOperatorNew
* add an important comment
2008-12-15 20:14:59 +00:00
Benoit Jacob
763f0a2434 use ei_aligned_malloc and ei_aligned_free also in WithAlignedOperatorNew, so this too should now work with MSVC. 2008-12-15 17:55:11 +00:00
Benoit Jacob
9b1a3d6e19 small optimization (for MSVC) and simplification of ei_alligned_malloc 2008-12-15 17:33:19 +00:00
Benoit Jacob
dd139b92b4 work around the braindead msvc preprocessor 2008-12-15 17:16:22 +00:00
Benoit Jacob
11c8a6bf63 Fix detection of SSE2 with MSVC. 2008-12-15 16:14:54 +00:00
Benoit Jacob
703951d5cd Fix memory alignment (hence vectorization) on MSVC thanks to help from Armin Berres. 2008-12-15 15:54:33 +00:00
Gael Guennebaud
d11c8704e0 oops, forget to remove a line 2008-12-15 12:35:46 +00:00
Gael Guennebaud
a164646c77 more warning fixes by Armin Berres 2008-12-15 12:30:04 +00:00
Benoit Jacob
936eaf600a compilation fix thanks to Dennis Schridde 2008-12-13 21:09:19 +00:00
Gael Guennebaud
ef0cd3a289 one more warning fix, thanks to Armin Berres 2008-12-12 13:50:01 +00:00
Gael Guennebaud
1ed17b037d * fix a couple of warnings (patch from Armin Berres)
* allow Map to map null data
2008-12-12 12:54:45 +00:00
Gael Guennebaud
5015e48361 Sparse module: add a more flexible SparseMatrix::fillrand() function
which allows to fill a matrix with random inner coordinates (makes sense
only when a very few coeffs are inserted per col/row)
2008-12-11 18:26:24 +00:00
Gael Guennebaud
beabf008b0 bugfix in DiagonalProduct: a "DiagonalProduct<SomeXpr>" expression
is now evaluated as a "DiagonalProduct<Matrix<SomeXpr::Eval> >".
Note that currently this only happens in DiagonalProduct.
2008-12-10 19:02:13 +00:00
Gael Guennebaud
ba9a53f9c6 fix compilation issue for 64bit systems (pointer <=> size_t) 2008-12-08 15:07:42 +00:00
Gael Guennebaud
52a30c1d54 fix minor mistake in the inside eigen example 2008-12-08 10:07:09 +00:00
2027 changed files with 407019 additions and 42580 deletions

19
.clang-format Normal file
View File

@@ -0,0 +1,19 @@
---
BasedOnStyle: Google
ColumnLimit: 120
---
Language: Cpp
BasedOnStyle: Google
ColumnLimit: 120
StatementMacros:
- EIGEN_STATIC_ASSERT
- EIGEN_INITIALIZE_COEFFS_IF_THAT_OPTION_IS_ENABLED
- EIGEN_INTERNAL_DENSE_STORAGE_CTOR_PLUGIN
SortIncludes: false
AttributeMacros:
- EIGEN_STRONG_INLINE
- EIGEN_ALWAYS_INLINE
- EIGEN_DEVICE_FUNC
- EIGEN_DONT_INLINE
- EIGEN_DEPRECATED
- EIGEN_UNUSED

4
.git-blame-ignore-revs Normal file
View File

@@ -0,0 +1,4 @@
# First major clang-format MR (https://gitlab.com/libeigen/eigen/-/merge_requests/1429).
f38e16c193d489c278c189bc06b448a94adb45fb
# Formatting of tests, examples, benchmarks, et cetera (https://gitlab.com/libeigen/eigen/-/merge_requests/1432).
46e9cdb7fea25d7f7aef4332b9c3ead3857e213d

3
.gitattributes vendored Normal file
View File

@@ -0,0 +1,3 @@
*.sh eol=lf
debug/msvc/*.dat eol=crlf
debug/msvc/*.natvis eol=crlf

41
.gitignore vendored Normal file
View File

@@ -0,0 +1,41 @@
qrc_*cxx
*.orig
*.pyc
*.diff
diff
*.save
save
*.old
*.gmo
*.qm
core
core.*
*.bak
*~
*.build*
*.moc.*
*.moc
ui_*
CMakeCache.txt
tags
.*.swp
activity.png
*.out
*.php*
*.log
*.orig
*.rej
log
patch
*.patch
a
a.*
lapack/testing
lapack/reference
.*project
.settings
Makefile
!ci/build.gitlab-ci.yml
!scripts/buildtests.in
!Eigen/Core
!Eigen/src/Core

38
.gitlab-ci.yml Normal file
View File

@@ -0,0 +1,38 @@
# This file is part of Eigen, a lightweight C++ template library
# for linear algebra.
#
# Copyright (C) 2023, The Eigen Authors
#
# This Source Code Form is subject to the terms of the Mozilla
# Public License v. 2.0. If a copy of the MPL was not distributed
# with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
default:
# automatically cancels a job when a new pipeline for the same branch is triggered
interruptible: true
stages:
- checkformat
- build
- test
- deploy
variables:
# CMake build directory.
EIGEN_CI_BUILDDIR: .build
# Specify the CMake build target.
EIGEN_CI_BUILD_TARGET: ""
# If a test regex is specified, that will be selected.
# Otherwise, we will try a label if specified.
EIGEN_CI_CTEST_REGEX: ""
EIGEN_CI_CTEST_LABEL: ""
EIGEN_CI_CTEST_ARGS: ""
include:
- "/ci/checkformat.gitlab-ci.yml"
- "/ci/common.gitlab-ci.yml"
- "/ci/build.linux.gitlab-ci.yml"
- "/ci/build.windows.gitlab-ci.yml"
- "/ci/test.linux.gitlab-ci.yml"
- "/ci/test.windows.gitlab-ci.yml"
- "/ci/deploy.gitlab-ci.yml"

View File

@@ -0,0 +1,69 @@
<!--
Please read this!
Before opening a new issue, make sure to search for keywords in the issues
filtered by "bug::confirmed" or "bug::unconfirmed" and "bugzilla" label:
- https://gitlab.com/libeigen/eigen/-/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=bug%3A%3Aconfirmed
- https://gitlab.com/libeigen/eigen/-/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=bug%3A%3Aunconfirmed
- https://gitlab.com/libeigen/eigen/-/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name[]=bugzilla
and verify the issue you're about to submit isn't a duplicate. -->
### Summary
<!-- Summarize the bug encountered concisely. -->
### Environment
<!-- Please provide your development environment here -->
- **Operating System** : Windows/Linux
- **Architecture** : x64/Arm64/PowerPC ...
- **Eigen Version** : 3.3.9
- **Compiler Version** : Gcc7.0
- **Compile Flags** : -O3 -march=native
- **Vector Extension** : SSE/AVX/NEON ...
### Minimal Example
<!-- If possible, please create a minimal example here that exhibits the problematic behavior.
You can also link to [godbolt](https://godbolt.org). But please note that you need to click
the "Share" button in the top right-hand corner of the godbolt page where you reproduce the sample
code to get the share link instead of in your browser address bar.
You can read [the guidelines on stackoverflow](https://stackoverflow.com/help/minimal-reproducible-example)
on how to create a good minimal example. -->
```cpp
//show your code here
```
### Steps to reproduce
<!-- Describe how one can reproduce the issue - this is very important. Please use an ordered list. -->
1. first step
2. second step
3. ...
### What is the current *bug* behavior?
<!-- Describe what actually happens. -->
### What is the expected *correct* behavior?
<!-- Describe what you should see instead. -->
### Relevant logs
<!-- Add relevant code snippets or program output within blocks marked by " ``` " -->
<!-- OPTIONAL: remove this section if you are not reporting a compilation warning issue.-->
### Warning Messages
<!-- Show us the warning messages you got! -->
<!-- OPTIONAL: remove this section if you are not reporting a performance issue. -->
### Benchmark scripts and results
<!-- Please share any benchmark scripts - either standalone, or using [Google Benchmark](https://github.com/google/benchmark). -->
### Anything else that might help
<!-- It will be better to provide us more information to help narrow down the cause.
Including but not limited to the following:
- lines of code that might help us diagnose the problem.
- potential ways to address the issue.
- last known working/first broken version (release number or commit hash). -->
- [ ] Have a plan to fix this issue.

View File

@@ -0,0 +1,7 @@
### Describe the feature you would like to be implemented.
### Would such a feature be useful for other users? Why?
### Any hints on how to implement the requested feature?
### Additional resources

View File

@@ -0,0 +1,26 @@
<!--
Thanks for contributing a merge request! Please name and fully describe your MR as you would for a commit message.
If the MR fixes an issue, please include "Fixes #issue" in the commit message and the MR description.
In addition, we recommend that first-time contributors read our [contribution guidelines](https://eigen.tuxfamily.org/index.php?title=Contributing_to_Eigen) and [git page](https://eigen.tuxfamily.org/index.php?title=Git), which will help you submit a more standardized MR.
Before submitting the MR, you also need to complete the following checks:
- Make one PR per feature/bugfix (don't mix multiple changes into one PR). Avoid committing unrelated changes.
- Rebase before committing
- For code changes, run the test suite (at least the tests that are likely affected by the change).
See our [test guidelines](https://eigen.tuxfamily.org/index.php?title=Tests).
- If possible, add a test (both for bug-fixes as well as new features)
- Make sure new features are documented
Note that we are a team of volunteers; we appreciate your patience during the review process.
Again, thanks for contributing! -->
### Reference issue
<!-- You can link to a specific issue using the gitlab syntax #<issue number> -->
### What does this implement/fix?
<!--Please explain your changes.-->
### Additional information
<!--Any additional information you think is important.-->

3
.krazy
View File

@@ -1,3 +0,0 @@
SKIP /disabled/
SKIP /bench/
SKIP /build/

1916
CHANGELOG.md Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,73 +1,801 @@
project(Eigen)
set(EIGEN_VERSION_NUMBER "2.0-beta2")
cmake_minimum_required(VERSION 3.10.0)
#if the svnversion program is absent, this will leave the SVN_REVISION string empty,
#but won't stop CMake.
execute_process(COMMAND svnversion -n ${CMAKE_SOURCE_DIR}
OUTPUT_VARIABLE EIGEN_SVN_REVISION)
#==============================================================================
# CMake Policy issues.
#==============================================================================
# Allow overriding options in a parent project via `set` before including Eigen.
if (POLICY CMP0077)
cmake_policy (SET CMP0077 NEW)
endif (POLICY CMP0077)
if(EIGEN_SVN_REVISION)
set(EIGEN_VERSION "${EIGEN_VERSION_NUMBER} (SVN revision ${EIGEN_SVN_REVISION})")
else(EIGEN_SVN_REVISION)
set(EIGEN_VERSION "${EIGEN_VERSION_NUMBER}")
endif(EIGEN_SVN_REVISION)
# NOTE Remove setting the policy once the minimum required CMake version is
# increased to at least 3.15. Retain enabling the export to package registry.
if (POLICY CMP0090)
# The export command does not populate package registry by default
cmake_policy (SET CMP0090 NEW)
# Unless otherwise specified, always export to package registry to ensure
# backwards compatibility.
if (NOT DEFINED CMAKE_EXPORT_PACKAGE_REGISTRY)
set (CMAKE_EXPORT_PACKAGE_REGISTRY ON)
endif (NOT DEFINED CMAKE_EXPORT_PACKAGE_REGISTRY)
endif (POLICY CMP0090)
cmake_minimum_required(VERSION 2.4)
# Disable warning about find_package(CUDA).
# CUDA language support is lacking for clang as the CUDA compiler
# until at least cmake version 3.18. Even then, there seems to be
# issues on Windows+Ninja in passing build flags. Continue using
# the "old" way for now.
if (POLICY CMP0146)
cmake_policy(SET CMP0146 OLD)
endif ()
set(CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake)
# Normalize DESTINATION paths
if (POLICY CMP0177)
cmake_policy(SET CMP0177 NEW)
endif ()
#==============================================================================
# CMake Project.
#==============================================================================
project(Eigen3)
# Remove this block after bumping CMake to v3.21.0
# PROJECT_IS_TOP_LEVEL is defined then by default
if(CMAKE_VERSION VERSION_LESS 3.21.0)
if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
set(PROJECT_IS_TOP_LEVEL ON)
else()
set(PROJECT_IS_TOP_LEVEL OFF)
endif()
endif()
#==============================================================================
# Build ON/OFF Settings.
#==============================================================================
# Determine if we should build tests.
include(CMakeDependentOption)
cmake_dependent_option(BUILD_TESTING "Enable creation of tests." ON "PROJECT_IS_TOP_LEVEL" OFF)
option(EIGEN_BUILD_TESTING "Enable creation of Eigen tests." ${BUILD_TESTING})
option(EIGEN_LEAVE_TEST_IN_ALL_TARGET "Leaves tests in the all target, needed by ctest for automatic building." OFF)
# Determine if we should build BLAS/LAPACK implementations.
option(EIGEN_BUILD_BLAS "Toggles the building of the Eigen Blas library" ${PROJECT_IS_TOP_LEVEL})
option(EIGEN_BUILD_LAPACK "Toggles the building of the included Eigen LAPACK library" ${PROJECT_IS_TOP_LEVEL})
if (EIGEN_BUILD_BLAS OR EIGEN_BUILD_LAPACK)
# Determine if we should build shared libraries for BLAS/LAPACK on this platform.
if (NOT EIGEN_BUILD_SHARED_LIBS)
get_cmake_property(EIGEN_BUILD_SHARED_LIBS TARGET_SUPPORTS_SHARED_LIBS)
endif()
endif()
option(EIGEN_BUILD_TESTS "Build tests" OFF)
option(EIGEN_BUILD_DEMOS "Build demos" OFF)
if(NOT WIN32)
option(EIGEN_BUILD_LIB "Build the binary shared library" OFF)
endif(NOT WIN32)
option(EIGEN_BUILD_BTL "Build benchmark suite" OFF)
option(EIGEN_BUILD_SPBENCH "Build sparse benchmark suite" OFF)
# Avoid building docs if included from another project.
# Building documentation requires creating and running executables on the host
# platform. We shouldn't do this if cross-compiling.
if (PROJECT_IS_TOP_LEVEL AND NOT CMAKE_CROSSCOMPILING)
set(EIGEN_BUILD_DOC_DEFAULT ON)
endif()
option(EIGEN_BUILD_DOC "Enable creation of Eigen documentation" ${EIGEN_BUILD_DOC_DEFAULT})
if(EIGEN_BUILD_LIB)
option(EIGEN_TEST_LIB "Build the unit tests using the library (disable -pedantic)" OFF)
endif(EIGEN_BUILD_LIB)
option(EIGEN_BUILD_DEMOS "Toggles the building of the Eigen demos" ${PROJECT_IS_TOP_LEVEL})
set(CMAKE_INCLUDE_CURRENT_DIR ON)
# Disable pkgconfig only for native Windows builds
if(NOT WIN32 OR NOT CMAKE_HOST_SYSTEM_NAME MATCHES Windows)
option(EIGEN_BUILD_PKGCONFIG "Build pkg-config .pc file for Eigen" ${PROJECT_IS_TOP_LEVEL})
endif()
option(EIGEN_BUILD_CMAKE_PACKAGE "Enables the creation of EigenConfig.cmake and related files" ${PROJECT_IS_TOP_LEVEL})
if(CMAKE_COMPILER_IS_GNUCXX)
if(CMAKE_SYSTEM_NAME MATCHES Linux)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wnon-virtual-dtor -Wno-long-long -ansi -Wundef -Wcast-align -Wchar-subscripts -Wall -W -Wpointer-arith -Wwrite-strings -Wformat-security -fno-exceptions -fno-check-new -fno-common -fstrict-aliasing")
if(NOT EIGEN_TEST_LIB)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -pedantic")
endif(NOT EIGEN_TEST_LIB)
if (EIGEN_BUILD_TESTING OR EIGEN_BUILD_BLAS OR EIGEN_BUILD_LAPACK OR EIGEN_BUILT_BTL OR EIGEN_BUILD_BTL OR EIGEN_BUILD_SPBENCH OR EIGEN_BUILD_DOC OR EIGEN_BUILD_DEMOS)
set(EIGEN_IS_BUILDING_ ON)
endif()
#==============================================================================
# Version Info.
#==============================================================================
# If version information is not provided, automatically parse the version number
# from header files.
file(READ "${PROJECT_SOURCE_DIR}/Eigen/Version" _eigen_version_header)
if (NOT DEFINED EIGEN_WORLD_VERSION)
string(REGEX MATCH "define[ \t]+EIGEN_WORLD_VERSION[ \t]+([0-9]+)" _eigen_world_version_match "${_eigen_version_header}")
set(EIGEN_WORLD_VERSION "${CMAKE_MATCH_1}")
endif()
if (NOT DEFINED EIGEN_MAJOR_VERSION)
string(REGEX MATCH "define[ \t]+EIGEN_MAJOR_VERSION[ \t]+([0-9]+)" _eigen_major_version_match "${_eigen_version_header}")
set(EIGEN_MAJOR_VERSION "${CMAKE_MATCH_1}")
endif()
if (NOT DEFINED EIGEN_MINOR_VERSION)
string(REGEX MATCH "define[ \t]+EIGEN_MINOR_VERSION[ \t]+([0-9]+)" _eigen_minor_version_match "${_eigen_version_header}")
set(EIGEN_MINOR_VERSION "${CMAKE_MATCH_1}")
endif()
if (NOT DEFINED EIGEN_PATCH_VERSION)
string(REGEX MATCH "define[ \t]+EIGEN_PATCH_VERSION[ \t]+([0-9]+)" _eigen_patch_version_match "${_eigen_version_header}")
set(EIGEN_PATCH_VERSION "${CMAKE_MATCH_1}")
endif()
if (NOT DEFINED EIGEN_PRERELEASE_VERSION)
set(EIGEN_PRERELEASE_VERSION "dev")
endif()
# If we are in a git repo, extract a changeset.
if(IS_DIRECTORY ${CMAKE_SOURCE_DIR}/.git)
# if the git program is absent or this will leave the EIGEN_GIT_REVNUM string empty,
# but won't stop CMake.
execute_process(COMMAND git ls-remote -q ${CMAKE_SOURCE_DIR} HEAD OUTPUT_VARIABLE EIGEN_GIT_OUTPUT)
endif()
# extract the git rev number from the git output...
if(EIGEN_GIT_OUTPUT)
string(REGEX MATCH "^([0-9;a-f]+).*" EIGEN_GIT_CHANGESET_MATCH "${EIGEN_GIT_OUTPUT}")
set(EIGEN_GIT_REVNUM "${CMAKE_MATCH_1}")
endif()
if (NOT DEFINED EIGEN_BUILD_VERSION AND DEFINED EIGEN_GIT_REVNUM)
string(SUBSTRING "${EIGEN_GIT_REVNUM}" 0 8 EIGEN_BUILD_VERSION)
else()
set(EIGEN_BUILD_VERSION "")
endif()
# The EIGEN_VERSION_NUMBER must be of the form <major.minor.patch>.
# The EIGEN_VERSION_STRING can contain the preprelease/build strings.
set(EIGEN_VERSION_NUMBER "${EIGEN_MAJOR_VERSION}.${EIGEN_MINOR_VERSION}.${EIGEN_PATCH_VERSION}")
set(EIGEN_VERSION_STRING "${EIGEN_VERSION_NUMBER}")
if (NOT "x${EIGEN_PRERELEASE_VERSION}" STREQUAL "x")
set(EIGEN_VERSION_STRING "${EIGEN_VERSION_STRING}-${EIGEN_PRERELEASE_VERSION}")
endif()
if (NOT "x${EIGEN_BUILD_VERSION}" STREQUAL "x")
set(EIGEN_VERSION_STRING "${EIGEN_VERSION_STRING}+${EIGEN_BUILD_VERSION}")
endif()
# Generate version file.
configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/Version.in"
"${CMAKE_CURRENT_BINARY_DIR}/include/Eigen/Version")
#==============================================================================
# Install Path Configuration.
#==============================================================================
# Unconditionally allow install of targets to support nested dependency
# installations.
#
# Note: projects that depend on Eigen should _probably_ exclude installing
# Eigen by default (e.g. by using EXCLUDE_FROM_ALL when using
# FetchContent_Declare or add_subdirectory) to avoid overwriting a previous
# installation.
include(GNUInstallDirs)
# Backward compatibility support for EIGEN_INCLUDE_INSTALL_DIR
if(EIGEN_INCLUDE_INSTALL_DIR)
message(WARNING "EIGEN_INCLUDE_INSTALL_DIR is deprecated. Use INCLUDE_INSTALL_DIR instead.")
endif()
if(EIGEN_INCLUDE_INSTALL_DIR AND NOT INCLUDE_INSTALL_DIR)
set(INCLUDE_INSTALL_DIR ${EIGEN_INCLUDE_INSTALL_DIR}
CACHE PATH "The directory relative to CMAKE_INSTALL_PREFIX where Eigen header files are installed")
else()
set(INCLUDE_INSTALL_DIR
"${CMAKE_INSTALL_INCLUDEDIR}/eigen3"
CACHE PATH "The directory relative to CMAKE_INSTALL_PREFIX where Eigen header files are installed"
)
endif()
set(CMAKEPACKAGE_INSTALL_DIR
"${CMAKE_INSTALL_DATADIR}/eigen3/cmake"
CACHE PATH "The directory relative to CMAKE_INSTALL_PREFIX where Eigen3Config.cmake is installed"
)
set(PKGCONFIG_INSTALL_DIR
"${CMAKE_INSTALL_DATADIR}/pkgconfig"
CACHE PATH "The directory relative to CMAKE_INSTALL_PREFIX where eigen3.pc is installed"
)
foreach(var INCLUDE_INSTALL_DIR CMAKEPACKAGE_INSTALL_DIR PKGCONFIG_INSTALL_DIR)
# If an absolute path is specified, make it relative to "{CMAKE_INSTALL_PREFIX}".
if(IS_ABSOLUTE "${${var}}")
file(RELATIVE_PATH "${var}" "${CMAKE_INSTALL_PREFIX}" "${${var}}")
endif()
endforeach()
#==============================================================================
# Eigen Library.
#==============================================================================
# Alias Eigen_*_DIR to Eigen3_*_DIR:
set(Eigen_SOURCE_DIR ${Eigen3_SOURCE_DIR})
set(Eigen_BINARY_DIR ${Eigen3_BINARY_DIR})
# Imported target support
add_library (eigen INTERFACE)
add_library (Eigen3::Eigen ALIAS eigen)
target_include_directories (eigen INTERFACE
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}>
$<INSTALL_INTERFACE:${INCLUDE_INSTALL_DIR}>
)
# Eigen requires at least C++14
target_compile_features (eigen INTERFACE cxx_std_14)
# Export as title case Eigen
set_target_properties (eigen PROPERTIES EXPORT_NAME Eigen)
#==============================================================================
# Install Rule Configuration.
#==============================================================================
install(FILES
signature_of_eigen3_matrix_library
DESTINATION ${INCLUDE_INSTALL_DIR} COMPONENT Devel
)
if(EIGEN_BUILD_PKGCONFIG)
configure_file(eigen3.pc.in eigen3.pc @ONLY)
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/eigen3.pc
DESTINATION ${PKGCONFIG_INSTALL_DIR})
endif()
install(DIRECTORY Eigen DESTINATION ${INCLUDE_INSTALL_DIR} COMPONENT Devel)
# Replace the "Version" header file with the generated one.
install(FILES ${CMAKE_CURRENT_BINARY_DIR}/include/Eigen/Version
DESTINATION ${INCLUDE_INSTALL_DIR}/Eigen/ COMPONENT Devel)
install(TARGETS eigen EXPORT Eigen3Targets)
if(EIGEN_BUILD_CMAKE_PACKAGE)
include (CMakePackageConfigHelpers)
configure_package_config_file (
${CMAKE_CURRENT_SOURCE_DIR}/cmake/Eigen3Config.cmake.in
${CMAKE_CURRENT_BINARY_DIR}/Eigen3Config.cmake
INSTALL_DESTINATION ${CMAKEPACKAGE_INSTALL_DIR}
NO_SET_AND_CHECK_MACRO # Eigen does not provide legacy style defines
NO_CHECK_REQUIRED_COMPONENTS_MACRO # Eigen does not provide components
)
set(CVF_VERSION "${EIGEN_VERSION_NUMBER}")
configure_file("${CMAKE_CURRENT_SOURCE_DIR}/cmake/Eigen3ConfigVersion.cmake.in"
"Eigen3ConfigVersion.cmake"
@ONLY)
# The Eigen target will be located in the Eigen3 namespace. Other CMake
# targets can refer to it using Eigen3::Eigen.
export (TARGETS eigen NAMESPACE Eigen3:: FILE Eigen3Targets.cmake)
# Export Eigen3 package to CMake registry such that it can be easily found by
# CMake even if it has not been installed to a standard directory.
export (PACKAGE Eigen3)
install (EXPORT Eigen3Targets NAMESPACE Eigen3:: DESTINATION ${CMAKEPACKAGE_INSTALL_DIR})
install (FILES ${CMAKE_CURRENT_BINARY_DIR}/Eigen3Config.cmake
${CMAKE_CURRENT_BINARY_DIR}/Eigen3ConfigVersion.cmake
DESTINATION ${CMAKEPACKAGE_INSTALL_DIR})
# Add uninstall target
if(NOT TARGET uninstall AND PROJECT_IS_TOP_LEVEL)
add_custom_target ( uninstall
COMMAND ${CMAKE_COMMAND} -P ${CMAKE_CURRENT_SOURCE_DIR}/cmake/EigenUninstall.cmake)
endif()
endif()
#==============================================================================
# General Build Configuration.
#==============================================================================
# Avoid setting the standard in a parent if unset.
if(PROJECT_IS_TOP_LEVEL)
set(CMAKE_CXX_STANDARD 14 CACHE STRING "Default C++ standard")
set(CMAKE_CXX_STANDARD_REQUIRED ON CACHE BOOL "Require C++ standard")
set(CMAKE_CXX_EXTENSIONS OFF CACHE BOOL "Allow C++ extensions")
endif()
# Guard against in-source builds
if(${CMAKE_SOURCE_DIR} STREQUAL ${CMAKE_BINARY_DIR})
message(FATAL_ERROR "In-source builds not allowed. Please make a new directory (called a build directory) and run CMake from there. You may need to remove CMakeCache.txt. ")
endif()
# Guard against bad build-type strings
if (PROJECT_IS_TOP_LEVEL AND NOT CMAKE_BUILD_TYPE)
set(CMAKE_BUILD_TYPE "Release")
endif()
# Only try to figure out how to link the math library if we are building something.
# Otherwise, let the parent project deal with dependencies.
if (EIGEN_IS_BUILDING_)
# Use Eigen's cmake files.
set(CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake)
set(CMAKE_INCLUDE_CURRENT_DIR OFF)
find_package(StandardMathLibrary)
set(EIGEN_STANDARD_LIBRARIES_TO_LINK_TO "")
if(NOT STANDARD_MATH_LIBRARY_FOUND)
message(FATAL_ERROR
"Can't link to the standard math library. Please report to the Eigen developers, telling them about your platform.")
else()
if(EIGEN_STANDARD_LIBRARIES_TO_LINK_TO)
set(EIGEN_STANDARD_LIBRARIES_TO_LINK_TO "${EIGEN_STANDARD_LIBRARIES_TO_LINK_TO} ${STANDARD_MATH_LIBRARY}")
else()
set(EIGEN_STANDARD_LIBRARIES_TO_LINK_TO "${STANDARD_MATH_LIBRARY}")
endif()
endif()
if(EIGEN_STANDARD_LIBRARIES_TO_LINK_TO)
message(STATUS "Standard libraries to link to explicitly: ${EIGEN_STANDARD_LIBRARIES_TO_LINK_TO}")
else()
message(STATUS "Standard libraries to link to explicitly: none")
endif()
# Default tests/examples/libraries to row-major.
option(EIGEN_DEFAULT_TO_ROW_MAJOR "Use row-major as default matrix storage order" OFF)
if(EIGEN_DEFAULT_TO_ROW_MAJOR)
add_definitions("-DEIGEN_DEFAULT_TO_ROW_MAJOR")
endif()
endif()
#==============================================================================
# Test Configuration.
#==============================================================================
if (EIGEN_BUILD_TESTING)
function(ei_maybe_separate_arguments variable mode args)
# Use separate_arguments if the input is a single string containing a space.
# Otherwise, if it is already a list or doesn't have a space, just propagate
# the original value. This is to better support multi-argument lists.
list(LENGTH args list_length)
if (${list_length} EQUAL 1)
string(FIND "${args}" " " has_space)
if (${has_space} GREATER -1)
separate_arguments(args ${mode} "${args}")
endif()
endif()
set(${variable} ${args} PARENT_SCOPE)
endfunction(ei_maybe_separate_arguments)
include(CheckCXXCompilerFlag)
macro(ei_add_cxx_compiler_flag FLAG)
string(REGEX REPLACE "-" "" SFLAG1 ${FLAG})
string(REGEX REPLACE "\\+" "p" SFLAG ${SFLAG1})
check_cxx_compiler_flag(${FLAG} COMPILER_SUPPORT_${SFLAG})
if(COMPILER_SUPPORT_${SFLAG})
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${FLAG}")
endif()
endmacro()
set(EIGEN_TEST_CUSTOM_LINKER_FLAGS "" CACHE STRING "Additional linker flags when linking unit tests.")
set(EIGEN_TEST_CUSTOM_CXX_FLAGS "" CACHE STRING "Additional compiler flags when compiling unit tests.")
# Convert space-separated arguments into CMake lists for downstream consumption.
ei_maybe_separate_arguments(EIGEN_TEST_CUSTOM_LINKER_FLAGS NATIVE_COMMAND "${EIGEN_TEST_CUSTOM_LINKER_FLAGS}")
ei_maybe_separate_arguments(EIGEN_TEST_CUSTOM_CXX_FLAGS NATIVE_COMMAND "${EIGEN_TEST_CUSTOM_CXX_FLAGS}")
option(EIGEN_SPLIT_LARGE_TESTS "Split large tests into smaller executables" ON)
set(EIGEN_TEST_MAX_SIZE "320" CACHE STRING "Maximal matrix/vector size, default is 320")
# Flags for tests.
if(NOT MSVC)
# We assume that other compilers are partly compatible with GNUCC
# clang outputs some warnings for unknown flags that are not caught by check_cxx_compiler_flag
# adding -Werror turns such warnings into errors
check_cxx_compiler_flag("-Werror" COMPILER_SUPPORT_WERROR)
if(COMPILER_SUPPORT_WERROR)
set(CMAKE_REQUIRED_FLAGS "-Werror")
endif()
ei_add_cxx_compiler_flag("-pedantic")
ei_add_cxx_compiler_flag("-Wall")
ei_add_cxx_compiler_flag("-Wextra")
# ei_add_cxx_compiler_flag("-Weverything") # clang
ei_add_cxx_compiler_flag("-Wundef")
ei_add_cxx_compiler_flag("-Wcast-align")
ei_add_cxx_compiler_flag("-Wchar-subscripts")
ei_add_cxx_compiler_flag("-Wnon-virtual-dtor")
ei_add_cxx_compiler_flag("-Wunused-local-typedefs")
ei_add_cxx_compiler_flag("-Wpointer-arith")
ei_add_cxx_compiler_flag("-Wwrite-strings")
ei_add_cxx_compiler_flag("-Wformat-security")
ei_add_cxx_compiler_flag("-Wshorten-64-to-32")
ei_add_cxx_compiler_flag("-Wlogical-op")
ei_add_cxx_compiler_flag("-Wenum-conversion")
ei_add_cxx_compiler_flag("-Wc++11-extensions")
ei_add_cxx_compiler_flag("-Wdouble-promotion")
# ei_add_cxx_compiler_flag("-Wconversion")
ei_add_cxx_compiler_flag("-Wshadow")
ei_add_cxx_compiler_flag("-Wno-psabi")
ei_add_cxx_compiler_flag("-Wno-variadic-macros")
ei_add_cxx_compiler_flag("-Wno-long-long")
ei_add_cxx_compiler_flag("-fno-common")
ei_add_cxx_compiler_flag("-fstrict-aliasing")
ei_add_cxx_compiler_flag("-wd981") # disable ICC's "operands are evaluated in unspecified order" remark
ei_add_cxx_compiler_flag("-wd2304") # disable ICC's "warning #2304: non-explicit constructor with single argument may cause implicit type conversion" produced by -Wnon-virtual-dtor
# Clang emits warnings about unused flag.
if (NOT CMAKE_CXX_COMPILER_ID MATCHES "Clang")
ei_add_cxx_compiler_flag("-fno-check-new")
endif()
if(ANDROID_NDK)
ei_add_cxx_compiler_flag("-pie")
ei_add_cxx_compiler_flag("-fPIE")
endif()
set(CMAKE_REQUIRED_FLAGS "")
option(EIGEN_TEST_SSE2 "Enable/Disable SSE2 in tests/examples" OFF)
if(EIGEN_TEST_SSE2)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse2")
message("Enabling SSE2 in tests/examples")
endif(EIGEN_TEST_SSE2)
message(STATUS "Enabling SSE2 in tests/examples")
endif()
option(EIGEN_TEST_SSE3 "Enable/Disable SSE3 in tests/examples" OFF)
if(EIGEN_TEST_SSE3)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse3")
message("Enabling SSE3 in tests/examples")
endif(EIGEN_TEST_SSE3)
message(STATUS "Enabling SSE3 in tests/examples")
endif()
option(EIGEN_TEST_SSSE3 "Enable/Disable SSSE3 in tests/examples" OFF)
if(EIGEN_TEST_SSSE3)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mssse3")
message("Enabling SSSE3 in tests/examples")
endif(EIGEN_TEST_SSSE3)
message(STATUS "Enabling SSSE3 in tests/examples")
endif()
option(EIGEN_TEST_SSE4_1 "Enable/Disable SSE4.1 in tests/examples" OFF)
if(EIGEN_TEST_SSE4_1)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse4.1")
message(STATUS "Enabling SSE4.1 in tests/examples")
endif()
option(EIGEN_TEST_SSE4_2 "Enable/Disable SSE4.2 in tests/examples" OFF)
if(EIGEN_TEST_SSE4_2)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -msse4.2")
message(STATUS "Enabling SSE4.2 in tests/examples")
endif()
option(EIGEN_TEST_AVX "Enable/Disable AVX in tests/examples" OFF)
if(EIGEN_TEST_AVX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mavx")
message(STATUS "Enabling AVX in tests/examples")
endif()
option(EIGEN_TEST_FMA "Enable/Disable FMA in tests/examples" OFF)
if(EIGEN_TEST_FMA AND NOT EIGEN_TEST_NEON)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfma")
message(STATUS "Enabling FMA in tests/examples")
endif()
option(EIGEN_TEST_AVX2 "Enable/Disable AVX2 in tests/examples" OFF)
if(EIGEN_TEST_AVX2)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mavx2 -mfma")
message(STATUS "Enabling AVX2 in tests/examples")
endif()
option(EIGEN_TEST_AVX512 "Enable/Disable AVX512 in tests/examples" OFF)
if(EIGEN_TEST_AVX512)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mavx512f -mfma")
message(STATUS "Enabling AVX512 in tests/examples")
endif()
option(EIGEN_TEST_AVX512DQ "Enable/Disable AVX512DQ in tests/examples" OFF)
if(EIGEN_TEST_AVX512DQ)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mavx512dq -mfma")
message(STATUS "Enabling AVX512DQ in tests/examples")
endif()
option(EIGEN_TEST_AVX512FP16 "Enable/Disable AVX512-FP16 in tests/examples" OFF)
if(EIGEN_TEST_AVX512FP16)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mavx512f -mfma -mavx512vl -mavx512fp16")
message(STATUS "Enabling AVX512-FP16 in tests/examples")
endif()
option(EIGEN_TEST_F16C "Enable/Disable F16C in tests/examples" OFF)
if(EIGEN_TEST_F16C)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mf16c")
message(STATUS "Enabling F16C in tests/examples")
endif()
option(EIGEN_TEST_ALTIVEC "Enable/Disable AltiVec in tests/examples" OFF)
if(EIGEN_TEST_ALTIVEC)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -maltivec -mabi=altivec")
message("Enabling AltiVec in tests/examples")
endif(EIGEN_TEST_ALTIVEC)
endif(CMAKE_SYSTEM_NAME MATCHES Linux)
endif(CMAKE_COMPILER_IS_GNUCXX)
message(STATUS "Enabling AltiVec in tests/examples")
endif()
include_directories(${CMAKE_CURRENT_SOURCE_DIR} ${CMAKE_CURRENT_BINARY_DIR})
option(EIGEN_TEST_VSX "Enable/Disable VSX in tests/examples" OFF)
if(EIGEN_TEST_VSX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -m64 -mvsx")
message(STATUS "Enabling VSX in tests/examples")
endif()
add_subdirectory(Eigen)
option(EIGEN_TEST_MSA "Enable/Disable MSA in tests/examples" OFF)
if(EIGEN_TEST_MSA)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mmsa")
message(STATUS "Enabling MSA in tests/examples")
endif()
if(EIGEN_BUILD_TESTS)
add_subdirectory(test)
endif(EIGEN_BUILD_TESTS)
option(EIGEN_TEST_LSX "Enable/Disable LSX in tests/examples" OFF)
if(EIGEN_TEST_LSX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mlsx")
message(STATUS "Enabling LSX in tests/examples")
endif()
add_subdirectory(doc)
option(EIGEN_TEST_NEON "Enable/Disable Neon in tests/examples" OFF)
if(EIGEN_TEST_NEON)
if(EIGEN_TEST_FMA)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfpu=neon-vfpv4")
else()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfpu=neon")
endif()
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfloat-abi=hard")
message(STATUS "Enabling NEON in tests/examples")
endif()
if(EIGEN_BUILD_DEMOS)
add_subdirectory(demos)
endif(EIGEN_BUILD_DEMOS)
option(EIGEN_TEST_NEON64 "Enable/Disable Neon in tests/examples" OFF)
if(EIGEN_TEST_NEON64)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
message(STATUS "Enabling NEON in tests/examples")
endif()
option(EIGEN_TEST_Z13 "Enable/Disable S390X(zEC13) ZVECTOR in tests/examples" OFF)
if(EIGEN_TEST_Z13)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=z13 -mzvector")
message(STATUS "Enabling S390X(zEC13) ZVECTOR in tests/examples")
endif()
option(EIGEN_TEST_Z14 "Enable/Disable S390X(zEC14) ZVECTOR in tests/examples" OFF)
if(EIGEN_TEST_Z14)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=z14 -mzvector")
message(STATUS "Enabling S390X(zEC13) ZVECTOR in tests/examples")
endif()
check_cxx_compiler_flag("-fopenmp" COMPILER_SUPPORT_OPENMP)
if(COMPILER_SUPPORT_OPENMP)
option(EIGEN_TEST_OPENMP "Enable/Disable OpenMP in tests/examples" OFF)
if(EIGEN_TEST_OPENMP)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fopenmp")
message(STATUS "Enabling OpenMP in tests/examples")
endif()
endif()
else()
# C4127 - conditional expression is constant
# C4714 - marked as __forceinline not inlined (I failed to deactivate it selectively)
# We can disable this warning in the unit tests since it is clear that it occurs
# because we are oftentimes returning objects that have a destructor or may
# throw exceptions - in particular in the unit tests we are throwing extra many
# exceptions to cover indexing errors.
# C4505 - unreferenced local function has been removed (impossible to deactivate selectively)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /EHsc /wd4127 /wd4505 /wd4714")
# replace all /Wx by /W4
string(REGEX REPLACE "/W[0-9]" "/W4" CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
check_cxx_compiler_flag("/openmp" COMPILER_SUPPORT_OPENMP)
if(COMPILER_SUPPORT_OPENMP)
option(EIGEN_TEST_OPENMP "Enable/Disable OpenMP in tests/examples" OFF)
if(EIGEN_TEST_OPENMP)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /openmp")
message(STATUS "Enabling OpenMP in tests/examples")
endif()
endif()
option(EIGEN_TEST_SSE2 "Enable/Disable SSE2 in tests/examples" OFF)
if(EIGEN_TEST_SSE2)
if(NOT CMAKE_CL_64)
# arch is not supported on 64 bit systems, SSE is enabled automatically.
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:SSE2")
endif()
message(STATUS "Enabling SSE2 in tests/examples")
endif()
option(EIGEN_TEST_AVX "Enable/Disable AVX in tests/examples" OFF)
if(EIGEN_TEST_AVX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX")
message(STATUS "Enabling AVX in tests/examples")
endif()
option(EIGEN_TEST_FMA "Enable/Disable FMA/AVX2 in tests/examples" OFF)
option(EIGEN_TEST_AVX2 "Enable/Disable FMA/AVX2 in tests/examples" OFF)
if((EIGEN_TEST_FMA AND NOT EIGEN_TEST_NEON) OR EIGEN_TEST_AVX2)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX2")
message(STATUS "Enabling FMA/AVX2 in tests/examples")
endif()
option(EIGEN_TEST_AVX512 "Enable/Disable AVX512 in tests/examples" OFF)
option(EIGEN_TEST_AVX512DQ "Enable/Disable AVX512DQ in tests/examples" OFF)
if(EIGEN_TEST_AVX512 OR EIGEN_TEST_AVX512DQ)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX512")
message(STATUS "Enabling AVX512 in tests/examples")
endif()
endif(NOT MSVC)
option(EIGEN_TEST_NO_EXPLICIT_VECTORIZATION "Disable explicit vectorization in tests/examples" OFF)
option(EIGEN_TEST_X87 "Force using X87 instructions. Implies no vectorization." OFF)
option(EIGEN_TEST_32BIT "Force generating 32bit code." OFF)
if(EIGEN_TEST_X87)
set(EIGEN_TEST_NO_EXPLICIT_VECTORIZATION ON)
if(CMAKE_COMPILER_IS_GNUCXX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfpmath=387")
message(STATUS "Forcing use of x87 instructions in tests/examples")
else()
message(STATUS "EIGEN_TEST_X87 ignored on your compiler")
endif()
endif()
if(EIGEN_TEST_32BIT)
if(CMAKE_COMPILER_IS_GNUCXX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -m32")
message(STATUS "Forcing generation of 32-bit code in tests/examples")
else()
message(STATUS "EIGEN_TEST_32BIT ignored on your compiler")
endif()
endif()
if(EIGEN_TEST_NO_EXPLICIT_VECTORIZATION)
add_definitions(-DEIGEN_DONT_VECTORIZE=1)
message(STATUS "Disabling vectorization in tests/examples")
endif()
option(EIGEN_TEST_NO_EXPLICIT_ALIGNMENT "Disable explicit alignment (hence vectorization) in tests/examples" OFF)
if(EIGEN_TEST_NO_EXPLICIT_ALIGNMENT)
add_definitions(-DEIGEN_DONT_ALIGN=1)
message(STATUS "Disabling alignment in tests/examples")
endif()
option(EIGEN_TEST_NO_EXCEPTIONS "Disables C++ exceptions" OFF)
if(EIGEN_TEST_NO_EXCEPTIONS)
ei_add_cxx_compiler_flag("-fno-exceptions")
message(STATUS "Disabling exceptions in tests/examples")
endif()
set(EIGEN_CUDA_CXX_FLAGS "" CACHE STRING "Additional flags to pass to the cuda compiler.")
set(EIGEN_CUDA_COMPUTE_ARCH 30 CACHE STRING "The CUDA compute architecture(s) to target when compiling CUDA code")
option(EIGEN_TEST_SYCL "Add Sycl support." OFF)
if(EIGEN_TEST_SYCL)
option(EIGEN_SYCL_DPCPP "Use the DPCPP Sycl implementation (DPCPP is default SYCL-Compiler)." ON)
option(EIGEN_SYCL_TRISYCL "Use the triSYCL Sycl implementation." OFF)
option(EIGEN_SYCL_ComputeCpp "Use the ComputeCPP Sycl implementation." OFF)
# Building options
# https://developer.codeplay.com/products/computecpp/ce/2.11.0/guides/eigen-overview/options-for-building-eigen-sycl
option(EIGEN_SYCL_USE_DEFAULT_SELECTOR "Use sycl default selector to select the preferred device." OFF)
option(EIGEN_SYCL_NO_LOCAL_MEM "Build for devices without dedicated shared memory." OFF)
option(EIGEN_SYCL_LOCAL_MEM "Allow the use of local memory (enabled by default)." ON)
option(EIGEN_SYCL_LOCAL_THREAD_DIM0 "Set work group size for dimension 0." 16)
option(EIGEN_SYCL_LOCAL_THREAD_DIM1 "Set work group size for dimension 1." 16)
option(EIGEN_SYCL_ASYNC_EXECUTION "Allow asynchronous execution (enabled by default)." ON)
option(EIGEN_SYCL_DISABLE_SKINNY "Disable optimization for tall/skinny matrices." OFF)
option(EIGEN_SYCL_DISABLE_DOUBLE_BUFFER "Disable double buffer." OFF)
option(EIGEN_SYCL_DISABLE_SCALAR "Disable scalar contraction." OFF)
option(EIGEN_SYCL_DISABLE_GEMV "Disable GEMV and create a single kernel to calculate contraction instead." OFF)
set(EIGEN_SYCL ON)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-deprecated-declarations -Wno-shorten-64-to-32 -Wno-cast-align")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-deprecated-copy-with-user-provided-copy -Wno-unused-variable")
set (CMAKE_MODULE_PATH "${CMAKE_ROOT}/Modules" "cmake/Modules/" "${CMAKE_MODULE_PATH}")
find_package(Threads REQUIRED)
if(EIGEN_SYCL_TRISYCL)
message(STATUS "Using triSYCL")
include(FindTriSYCL)
elseif(EIGEN_SYCL_ComputeCpp)
message(STATUS "Using ComputeCPP SYCL")
include(FindComputeCpp)
set(COMPUTECPP_DRIVER_DEFAULT_VALUE OFF)
if (NOT MSVC)
set(COMPUTECPP_DRIVER_DEFAULT_VALUE ON)
endif()
option(COMPUTECPP_USE_COMPILER_DRIVER
"Use ComputeCpp driver instead of a 2 steps compilation"
${COMPUTECPP_DRIVER_DEFAULT_VALUE}
)
else() #Default SYCL compiler is DPCPP (EIGEN_SYCL_DPCPP)
set(DPCPP_SYCL_TARGET "spir64" CACHE STRING "Default target for Intel CPU/GPU")
message(STATUS "Using DPCPP")
find_package(DPCPP)
add_definitions(-DSYCL_COMPILER_IS_DPCPP)
endif(EIGEN_SYCL_TRISYCL)
if(EIGEN_DONT_VECTORIZE_SYCL)
message(STATUS "Disabling SYCL vectorization in tests/examples")
# When disabling SYCL vectorization, also disable Eigen default vectorization
add_definitions(-DEIGEN_DONT_VECTORIZE=1)
add_definitions(-DEIGEN_DONT_VECTORIZE_SYCL=1)
endif()
endif()
include(EigenConfigureTesting)
if(EIGEN_LEAVE_TEST_IN_ALL_TARGET)
# CTest automatic test building relies on the "all" target.
add_subdirectory(test)
add_subdirectory(failtest)
else()
add_subdirectory(test EXCLUDE_FROM_ALL)
add_subdirectory(failtest EXCLUDE_FROM_ALL)
endif()
ei_testing_print_summary()
if (EIGEN_SPLIT_TESTSUITE)
ei_split_testsuite("${EIGEN_SPLIT_TESTSUITE}")
endif()
endif(EIGEN_BUILD_TESTING)
#==============================================================================
# Other Build Configurations.
#==============================================================================
add_subdirectory(unsupported)
if(EIGEN_BUILD_BLAS)
add_subdirectory(blas)
endif()
if (EIGEN_BUILD_LAPACK)
add_subdirectory(lapack)
endif()
if(EIGEN_BUILD_DOC)
add_subdirectory(doc EXCLUDE_FROM_ALL)
endif()
# TODO: consider also replacing EIGEN_BUILD_BTL by a custom target "make btl"?
if(EIGEN_BUILD_BTL)
add_subdirectory(bench/btl)
endif(EIGEN_BUILD_BTL)
add_subdirectory(bench/btl EXCLUDE_FROM_ALL)
endif()
if(NOT WIN32 AND EIGEN_BUILD_SPBENCH)
add_subdirectory(bench/spbench EXCLUDE_FROM_ALL)
endif()
if (EIGEN_BUILD_DEMOS)
add_subdirectory(demos EXCLUDE_FROM_ALL)
endif()
if (PROJECT_IS_TOP_LEVEL)
# must be after test and unsupported, for configuring buildtests.in
add_subdirectory(scripts EXCLUDE_FROM_ALL)
configure_file(scripts/cdashtesting.cmake.in cdashtesting.cmake @ONLY)
endif()
#==============================================================================
# Summary.
#==============================================================================
if(PROJECT_IS_TOP_LEVEL)
string(TOLOWER "${CMAKE_GENERATOR}" cmake_generator_tolower)
if(cmake_generator_tolower MATCHES "makefile")
message(STATUS "Available targets (use: make TARGET):")
else()
message(STATUS "Available targets (use: cmake --build . --target TARGET):")
endif()
message(STATUS "------------+--------------------------------------------------------------")
message(STATUS "Target | Description")
message(STATUS "------------+--------------------------------------------------------------")
message(STATUS "install | Install Eigen. Headers will be installed to:")
message(STATUS " | <CMAKE_INSTALL_PREFIX>/<INCLUDE_INSTALL_DIR>")
message(STATUS " | Using the following values:")
message(STATUS " | CMAKE_INSTALL_PREFIX: ${CMAKE_INSTALL_PREFIX}")
message(STATUS " | INCLUDE_INSTALL_DIR: ${INCLUDE_INSTALL_DIR}")
message(STATUS " | Change the install location of Eigen headers using:")
message(STATUS " | cmake . -DCMAKE_INSTALL_PREFIX=yourprefix")
message(STATUS " | Or:")
message(STATUS " | cmake . -DINCLUDE_INSTALL_DIR=yourdir")
message(STATUS "uninstall | Remove files installed by the install target")
if (EIGEN_BUILD_DOC)
message(STATUS "doc | Generate the API documentation, requires Doxygen & LaTeX")
message(STATUS "install-doc | Install the API documentation")
endif()
if(EIGEN_BUILD_TESTING)
message(STATUS "check | Build and run the unit-tests. Read this page:")
message(STATUS " | http://eigen.tuxfamily.org/index.php?title=Tests")
endif()
if (EIGEN_BUILD_BLAS)
message(STATUS "blas | Build BLAS library (not the same thing as Eigen)")
endif()
if (EIGEN_BUILD_LAPACK)
message(STATUS "lapack | Build LAPACK subset library (not the same thing as Eigen)")
endif()
message(STATUS "------------+--------------------------------------------------------------")
message(STATUS "")
endif()
message(STATUS "")
message(STATUS "Configured Eigen ${EIGEN_VERSION_STRING}")
message(STATUS "")

674
COPYING
View File

@@ -1,674 +0,0 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.

203
COPYING.APACHE Normal file
View File

@@ -0,0 +1,203 @@
/*
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

26
COPYING.BSD Normal file
View File

@@ -0,0 +1,26 @@
/*
Copyright (c) 2011, Intel Corporation. All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of Intel Corporation nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

View File

@@ -1,165 +0,0 @@
GNU LESSER GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
This version of the GNU Lesser General Public License incorporates
the terms and conditions of version 3 of the GNU General Public
License, supplemented by the additional permissions listed below.
0. Additional Definitions.
As used herein, "this License" refers to version 3 of the GNU Lesser
General Public License, and the "GNU GPL" refers to version 3 of the GNU
General Public License.
"The Library" refers to a covered work governed by this License,
other than an Application or a Combined Work as defined below.
An "Application" is any work that makes use of an interface provided
by the Library, but which is not otherwise based on the Library.
Defining a subclass of a class defined by the Library is deemed a mode
of using an interface provided by the Library.
A "Combined Work" is a work produced by combining or linking an
Application with the Library. The particular version of the Library
with which the Combined Work was made is also called the "Linked
Version".
The "Minimal Corresponding Source" for a Combined Work means the
Corresponding Source for the Combined Work, excluding any source code
for portions of the Combined Work that, considered in isolation, are
based on the Application, and not on the Linked Version.
The "Corresponding Application Code" for a Combined Work means the
object code and/or source code for the Application, including any data
and utility programs needed for reproducing the Combined Work from the
Application, but excluding the System Libraries of the Combined Work.
1. Exception to Section 3 of the GNU GPL.
You may convey a covered work under sections 3 and 4 of this License
without being bound by section 3 of the GNU GPL.
2. Conveying Modified Versions.
If you modify a copy of the Library, and, in your modifications, a
facility refers to a function or data to be supplied by an Application
that uses the facility (other than as an argument passed when the
facility is invoked), then you may convey a copy of the modified
version:
a) under this License, provided that you make a good faith effort to
ensure that, in the event an Application does not supply the
function or data, the facility still operates, and performs
whatever part of its purpose remains meaningful, or
b) under the GNU GPL, with none of the additional permissions of
this License applicable to that copy.
3. Object Code Incorporating Material from Library Header Files.
The object code form of an Application may incorporate material from
a header file that is part of the Library. You may convey such object
code under terms of your choice, provided that, if the incorporated
material is not limited to numerical parameters, data structure
layouts and accessors, or small macros, inline functions and templates
(ten or fewer lines in length), you do both of the following:
a) Give prominent notice with each copy of the object code that the
Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the object code with a copy of the GNU GPL and this license
document.
4. Combined Works.
You may convey a Combined Work under terms of your choice that,
taken together, effectively do not restrict modification of the
portions of the Library contained in the Combined Work and reverse
engineering for debugging such modifications, if you also do each of
the following:
a) Give prominent notice with each copy of the Combined Work that
the Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the Combined Work with a copy of the GNU GPL and this license
document.
c) For a Combined Work that displays copyright notices during
execution, include the copyright notice for the Library among
these notices, as well as a reference directing the user to the
copies of the GNU GPL and this license document.
d) Do one of the following:
0) Convey the Minimal Corresponding Source under the terms of this
License, and the Corresponding Application Code in a form
suitable for, and under terms that permit, the user to
recombine or relink the Application with a modified version of
the Linked Version to produce a modified Combined Work, in the
manner specified by section 6 of the GNU GPL for conveying
Corresponding Source.
1) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (a) uses at run time
a copy of the Library already present on the user's computer
system, and (b) will operate properly with a modified version
of the Library that is interface-compatible with the Linked
Version.
e) Provide Installation Information, but only if you would otherwise
be required to provide such information under section 6 of the
GNU GPL, and only to the extent that such information is
necessary to install and execute a modified version of the
Combined Work produced by recombining or relinking the
Application with a modified version of the Linked Version. (If
you use option 4d0, the Installation Information must accompany
the Minimal Corresponding Source and Corresponding Application
Code. If you use option 4d1, you must provide the Installation
Information in the manner specified by section 6 of the GNU GPL
for conveying Corresponding Source.)
5. Combined Libraries.
You may place library facilities that are a work based on the
Library side by side in a single library together with other library
facilities that are not Applications and are not covered by this
License, and convey such a combined library under terms of your
choice, if you do both of the following:
a) Accompany the combined library with a copy of the same work based
on the Library, uncombined with any other library facilities,
conveyed under the terms of this License.
b) Give prominent notice with the combined library that part of it
is a work based on the Library, and explaining where to find the
accompanying uncombined form of the same work.
6. Revised Versions of the GNU Lesser General Public License.
The Free Software Foundation may publish revised and/or new versions
of the GNU Lesser General Public License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the
Library as you received it specifies that a certain numbered version
of the GNU Lesser General Public License "or any later version"
applies to it, you have the option of following the terms and
conditions either of that published version or of any later version
published by the Free Software Foundation. If the Library as you
received it does not specify a version number of the GNU Lesser
General Public License, you may choose any version of the GNU Lesser
General Public License ever published by the Free Software Foundation.
If the Library as you received it specifies that a proxy can decide
whether future versions of the GNU Lesser General Public License shall
apply, that proxy's public statement of acceptance of any version is
permanent authorization for you to choose that version for the
Library.

51
COPYING.MINPACK Normal file
View File

@@ -0,0 +1,51 @@
Minpack Copyright Notice (1999) University of Chicago. All rights reserved
Redistribution and use in source and binary forms, with or
without modification, are permitted provided that the
following conditions are met:
1. Redistributions of source code must retain the above
copyright notice, this list of conditions and the following
disclaimer.
2. Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials
provided with the distribution.
3. The end-user documentation included with the
redistribution, if any, must include the following
acknowledgment:
"This product includes software developed by the
University of Chicago, as Operator of Argonne National
Laboratory.
Alternately, this acknowledgment may appear in the software
itself, if and wherever such third-party acknowledgments
normally appear.
4. WARRANTY DISCLAIMER. THE SOFTWARE IS SUPPLIED "AS IS"
WITHOUT WARRANTY OF ANY KIND. THE COPYRIGHT HOLDER, THE
UNITED STATES, THE UNITED STATES DEPARTMENT OF ENERGY, AND
THEIR EMPLOYEES: (1) DISCLAIM ANY WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY IMPLIED WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE
OR NON-INFRINGEMENT, (2) DO NOT ASSUME ANY LEGAL LIABILITY
OR RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR
USEFULNESS OF THE SOFTWARE, (3) DO NOT REPRESENT THAT USE OF
THE SOFTWARE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS, (4)
DO NOT WARRANT THAT THE SOFTWARE WILL FUNCTION
UNINTERRUPTED, THAT IT IS ERROR-FREE OR THAT ANY ERRORS WILL
BE CORRECTED.
5. LIMITATION OF LIABILITY. IN NO EVENT WILL THE COPYRIGHT
HOLDER, THE UNITED STATES, THE UNITED STATES DEPARTMENT OF
ENERGY, OR THEIR EMPLOYEES: BE LIABLE FOR ANY INDIRECT,
INCIDENTAL, CONSEQUENTIAL, SPECIAL OR PUNITIVE DAMAGES OF
ANY KIND OR NATURE, INCLUDING BUT NOT LIMITED TO LOSS OF
PROFITS OR LOSS OF DATA, FOR ANY REASON WHATSOEVER, WHETHER
SUCH LIABILITY IS ASSERTED ON THE BASIS OF CONTRACT, TORT
(INCLUDING NEGLIGENCE OR STRICT LIABILITY), OR OTHERWISE,
EVEN IF ANY OF SAID PARTIES HAS BEEN WARNED OF THE
POSSIBILITY OF SUCH LOSS OR DAMAGES.

373
COPYING.MPL2 Normal file
View File

@@ -0,0 +1,373 @@
Mozilla Public License Version 2.0
==================================
1. Definitions
--------------
1.1. "Contributor"
means each individual or legal entity that creates, contributes to
the creation of, or owns Covered Software.
1.2. "Contributor Version"
means the combination of the Contributions of others (if any) used
by a Contributor and that particular Contributor's Contribution.
1.3. "Contribution"
means Covered Software of a particular Contributor.
1.4. "Covered Software"
means Source Code Form to which the initial Contributor has attached
the notice in Exhibit A, the Executable Form of such Source Code
Form, and Modifications of such Source Code Form, in each case
including portions thereof.
1.5. "Incompatible With Secondary Licenses"
means
(a) that the initial Contributor has attached the notice described
in Exhibit B to the Covered Software; or
(b) that the Covered Software was made available under the terms of
version 1.1 or earlier of the License, but not also under the
terms of a Secondary License.
1.6. "Executable Form"
means any form of the work other than Source Code Form.
1.7. "Larger Work"
means a work that combines Covered Software with other material, in
a separate file or files, that is not Covered Software.
1.8. "License"
means this document.
1.9. "Licensable"
means having the right to grant, to the maximum extent possible,
whether at the time of the initial grant or subsequently, any and
all of the rights conveyed by this License.
1.10. "Modifications"
means any of the following:
(a) any file in Source Code Form that results from an addition to,
deletion from, or modification of the contents of Covered
Software; or
(b) any new file in Source Code Form that contains any Covered
Software.
1.11. "Patent Claims" of a Contributor
means any patent claim(s), including without limitation, method,
process, and apparatus claims, in any patent Licensable by such
Contributor that would be infringed, but for the grant of the
License, by the making, using, selling, offering for sale, having
made, import, or transfer of either its Contributions or its
Contributor Version.
1.12. "Secondary License"
means either the GNU General Public License, Version 2.0, the GNU
Lesser General Public License, Version 2.1, the GNU Affero General
Public License, Version 3.0, or any later versions of those
licenses.
1.13. "Source Code Form"
means the form of the work preferred for making modifications.
1.14. "You" (or "Your")
means an individual or a legal entity exercising rights under this
License. For legal entities, "You" includes any entity that
controls, is controlled by, or is under common control with You. For
purposes of this definition, "control" means (a) the power, direct
or indirect, to cause the direction or management of such entity,
whether by contract or otherwise, or (b) ownership of more than
fifty percent (50%) of the outstanding shares or beneficial
ownership of such entity.
2. License Grants and Conditions
--------------------------------
2.1. Grants
Each Contributor hereby grants You a world-wide, royalty-free,
non-exclusive license:
(a) under intellectual property rights (other than patent or trademark)
Licensable by such Contributor to use, reproduce, make available,
modify, display, perform, distribute, and otherwise exploit its
Contributions, either on an unmodified basis, with Modifications, or
as part of a Larger Work; and
(b) under Patent Claims of such Contributor to make, use, sell, offer
for sale, have made, import, and otherwise transfer either its
Contributions or its Contributor Version.
2.2. Effective Date
The licenses granted in Section 2.1 with respect to any Contribution
become effective for each Contribution on the date the Contributor first
distributes such Contribution.
2.3. Limitations on Grant Scope
The licenses granted in this Section 2 are the only rights granted under
this License. No additional rights or licenses will be implied from the
distribution or licensing of Covered Software under this License.
Notwithstanding Section 2.1(b) above, no patent license is granted by a
Contributor:
(a) for any code that a Contributor has removed from Covered Software;
or
(b) for infringements caused by: (i) Your and any other third party's
modifications of Covered Software, or (ii) the combination of its
Contributions with other software (except as part of its Contributor
Version); or
(c) under Patent Claims infringed by Covered Software in the absence of
its Contributions.
This License does not grant any rights in the trademarks, service marks,
or logos of any Contributor (except as may be necessary to comply with
the notice requirements in Section 3.4).
2.4. Subsequent Licenses
No Contributor makes additional grants as a result of Your choice to
distribute the Covered Software under a subsequent version of this
License (see Section 10.2) or under the terms of a Secondary License (if
permitted under the terms of Section 3.3).
2.5. Representation
Each Contributor represents that the Contributor believes its
Contributions are its original creation(s) or it has sufficient rights
to grant the rights to its Contributions conveyed by this License.
2.6. Fair Use
This License is not intended to limit any rights You have under
applicable copyright doctrines of fair use, fair dealing, or other
equivalents.
2.7. Conditions
Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted
in Section 2.1.
3. Responsibilities
-------------------
3.1. Distribution of Source Form
All distribution of Covered Software in Source Code Form, including any
Modifications that You create or to which You contribute, must be under
the terms of this License. You must inform recipients that the Source
Code Form of the Covered Software is governed by the terms of this
License, and how they can obtain a copy of this License. You may not
attempt to alter or restrict the recipients' rights in the Source Code
Form.
3.2. Distribution of Executable Form
If You distribute Covered Software in Executable Form then:
(a) such Covered Software must also be made available in Source Code
Form, as described in Section 3.1, and You must inform recipients of
the Executable Form how they can obtain a copy of such Source Code
Form by reasonable means in a timely manner, at a charge no more
than the cost of distribution to the recipient; and
(b) You may distribute such Executable Form under the terms of this
License, or sublicense it under different terms, provided that the
license for the Executable Form does not attempt to limit or alter
the recipients' rights in the Source Code Form under this License.
3.3. Distribution of a Larger Work
You may create and distribute a Larger Work under terms of Your choice,
provided that You also comply with the requirements of this License for
the Covered Software. If the Larger Work is a combination of Covered
Software with a work governed by one or more Secondary Licenses, and the
Covered Software is not Incompatible With Secondary Licenses, this
License permits You to additionally distribute such Covered Software
under the terms of such Secondary License(s), so that the recipient of
the Larger Work may, at their option, further distribute the Covered
Software under the terms of either this License or such Secondary
License(s).
3.4. Notices
You may not remove or alter the substance of any license notices
(including copyright notices, patent notices, disclaimers of warranty,
or limitations of liability) contained within the Source Code Form of
the Covered Software, except that You may alter any license notices to
the extent required to remedy known factual inaccuracies.
3.5. Application of Additional Terms
You may choose to offer, and to charge a fee for, warranty, support,
indemnity or liability obligations to one or more recipients of Covered
Software. However, You may do so only on Your own behalf, and not on
behalf of any Contributor. You must make it absolutely clear that any
such warranty, support, indemnity, or liability obligation is offered by
You alone, and You hereby agree to indemnify every Contributor for any
liability incurred by such Contributor as a result of warranty, support,
indemnity or liability terms You offer. You may include additional
disclaimers of warranty and limitations of liability specific to any
jurisdiction.
4. Inability to Comply Due to Statute or Regulation
---------------------------------------------------
If it is impossible for You to comply with any of the terms of this
License with respect to some or all of the Covered Software due to
statute, judicial order, or regulation then You must: (a) comply with
the terms of this License to the maximum extent possible; and (b)
describe the limitations and the code they affect. Such description must
be placed in a text file included with all distributions of the Covered
Software under this License. Except to the extent prohibited by statute
or regulation, such description must be sufficiently detailed for a
recipient of ordinary skill to be able to understand it.
5. Termination
--------------
5.1. The rights granted under this License will terminate automatically
if You fail to comply with any of its terms. However, if You become
compliant, then the rights granted under this License from a particular
Contributor are reinstated (a) provisionally, unless and until such
Contributor explicitly and finally terminates Your grants, and (b) on an
ongoing basis, if such Contributor fails to notify You of the
non-compliance by some reasonable means prior to 60 days after You have
come back into compliance. Moreover, Your grants from a particular
Contributor are reinstated on an ongoing basis if such Contributor
notifies You of the non-compliance by some reasonable means, this is the
first time You have received notice of non-compliance with this License
from such Contributor, and You become compliant prior to 30 days after
Your receipt of the notice.
5.2. If You initiate litigation against any entity by asserting a patent
infringement claim (excluding declaratory judgment actions,
counter-claims, and cross-claims) alleging that a Contributor Version
directly or indirectly infringes any patent, then the rights granted to
You by any and all Contributors for the Covered Software under Section
2.1 of this License shall terminate.
5.3. In the event of termination under Sections 5.1 or 5.2 above, all
end user license agreements (excluding distributors and resellers) which
have been validly granted by You or Your distributors under this License
prior to termination shall survive termination.
************************************************************************
* *
* 6. Disclaimer of Warranty *
* ------------------------- *
* *
* Covered Software is provided under this License on an "as is" *
* basis, without warranty of any kind, either expressed, implied, or *
* statutory, including, without limitation, warranties that the *
* Covered Software is free of defects, merchantable, fit for a *
* particular purpose or non-infringing. The entire risk as to the *
* quality and performance of the Covered Software is with You. *
* Should any Covered Software prove defective in any respect, You *
* (not any Contributor) assume the cost of any necessary servicing, *
* repair, or correction. This disclaimer of warranty constitutes an *
* essential part of this License. No use of any Covered Software is *
* authorized under this License except under this disclaimer. *
* *
************************************************************************
************************************************************************
* *
* 7. Limitation of Liability *
* -------------------------- *
* *
* Under no circumstances and under no legal theory, whether tort *
* (including negligence), contract, or otherwise, shall any *
* Contributor, or anyone who distributes Covered Software as *
* permitted above, be liable to You for any direct, indirect, *
* special, incidental, or consequential damages of any character *
* including, without limitation, damages for lost profits, loss of *
* goodwill, work stoppage, computer failure or malfunction, or any *
* and all other commercial damages or losses, even if such party *
* shall have been informed of the possibility of such damages. This *
* limitation of liability shall not apply to liability for death or *
* personal injury resulting from such party's negligence to the *
* extent applicable law prohibits such limitation. Some *
* jurisdictions do not allow the exclusion or limitation of *
* incidental or consequential damages, so this exclusion and *
* limitation may not apply to You. *
* *
************************************************************************
8. Litigation
-------------
Any litigation relating to this License may be brought only in the
courts of a jurisdiction where the defendant maintains its principal
place of business and such litigation shall be governed by laws of that
jurisdiction, without reference to its conflict-of-law provisions.
Nothing in this Section shall prevent a party's ability to bring
cross-claims or counter-claims.
9. Miscellaneous
----------------
This License represents the complete agreement concerning the subject
matter hereof. If any provision of this License is held to be
unenforceable, such provision shall be reformed only to the extent
necessary to make it enforceable. Any law or regulation which provides
that the language of a contract shall be construed against the drafter
shall not be used to construe this License against a Contributor.
10. Versions of the License
---------------------------
10.1. New Versions
Mozilla Foundation is the license steward. Except as provided in Section
10.3, no one other than the license steward has the right to modify or
publish new versions of this License. Each version will be given a
distinguishing version number.
10.2. Effect of New Versions
You may distribute the Covered Software under the terms of the version
of the License under which You originally received the Covered Software,
or under the terms of any subsequent version published by the license
steward.
10.3. Modified Versions
If you create software not governed by this License, and you want to
create a new license for such software, you may create and use a
modified version of this License if you rename the license and remove
any references to the name of the license steward (except to note that
such modified license differs from this License).
10.4. Distributing Source Code Form that is Incompatible With Secondary
Licenses
If You choose to distribute Source Code Form that is Incompatible With
Secondary Licenses under the terms of this version of the License, the
notice described in Exhibit B of this License must be attached.
Exhibit A - Source Code Form License Notice
-------------------------------------------
This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, You can obtain one at https://mozilla.org/MPL/2.0/.
If it is not possible or desirable to put the notice in a particular
file, then You may include the notice in a location (such as a LICENSE
file in a relevant directory) where a recipient would be likely to look
for such a notice.
You may add additional accurate notices of copyright ownership.
Exhibit B - "Incompatible With Secondary Licenses" Notice
---------------------------------------------------------
This Source Code Form is "Incompatible With Secondary Licenses", as
defined by the Mozilla Public License, v. 2.0.

6
COPYING.README Normal file
View File

@@ -0,0 +1,6 @@
Eigen is primarily MPL2 licensed. See COPYING.MPL2 and these links:
http://www.mozilla.org/MPL/2.0/
http://www.mozilla.org/MPL/2.0/FAQ.html
Some files contain third-party code under BSD or other MPL2-compatible licenses,
whence the other COPYING.* files here.

17
CTestConfig.cmake Normal file
View File

@@ -0,0 +1,17 @@
## This file should be placed in the root directory of your project.
## Then modify the CMakeLists.txt file in the root directory of your
## project to incorporate the testing dashboard.
## # The following are required to uses Dart and the Cdash dashboard
## enable_testing()
## include(CTest)
set(CTEST_PROJECT_NAME "Eigen")
set(CTEST_NIGHTLY_START_TIME "00:00:00 UTC")
set(CTEST_DROP_METHOD "http")
set(CTEST_DROP_SITE "my.cdash.org")
set(CTEST_DROP_LOCATION "/submit.php?project=Eigen")
set(CTEST_DROP_SITE_CDASH TRUE)
#set(CTEST_PROJECT_SUBPROJECTS
#Official
#Unsupported
#)

4
CTestCustom.cmake.in Normal file
View File

@@ -0,0 +1,4 @@
set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_WARNINGS "2000")
set(CTEST_CUSTOM_MAXIMUM_NUMBER_OF_ERRORS "2000")
list(APPEND CTEST_CUSTOM_ERROR_EXCEPTION @EIGEN_CTEST_ERROR_EXCEPTION@)

281
Doxyfile
View File

@@ -1,281 +0,0 @@
# Doxyfile 1.5.3
#---------------------------------------------------------------------------
# Project related configuration options
#---------------------------------------------------------------------------
DOXYFILE_ENCODING = UTF-8
PROJECT_NAME = Eigen
PROJECT_NUMBER = 2.0
OUTPUT_DIRECTORY = ./
CREATE_SUBDIRS = NO
OUTPUT_LANGUAGE = English
BRIEF_MEMBER_DESC = YES
REPEAT_BRIEF = YES
ABBREVIATE_BRIEF = "The $name class" \
"The $name widget" \
"The $name file" \
is \
provides \
specifies \
contains \
represents \
a \
an \
the
ALWAYS_DETAILED_SEC = NO
INLINE_INHERITED_MEMB = NO
FULL_PATH_NAMES = NO
STRIP_FROM_PATH =
STRIP_FROM_INC_PATH =
SHORT_NAMES = NO
JAVADOC_AUTOBRIEF = NO
QT_AUTOBRIEF = NO
MULTILINE_CPP_IS_BRIEF = NO
DETAILS_AT_TOP = NO
INHERIT_DOCS = YES
SEPARATE_MEMBER_PAGES = NO
TAB_SIZE = 8
ALIASES =
OPTIMIZE_OUTPUT_FOR_C = NO
OPTIMIZE_OUTPUT_JAVA = NO
BUILTIN_STL_SUPPORT = NO
CPP_CLI_SUPPORT = NO
DISTRIBUTE_GROUP_DOC = NO
SUBGROUPING = YES
#---------------------------------------------------------------------------
# Build related configuration options
#---------------------------------------------------------------------------
EXTRACT_ALL = NO
EXTRACT_PRIVATE = NO
EXTRACT_STATIC = NO
EXTRACT_LOCAL_CLASSES = NO
EXTRACT_LOCAL_METHODS = NO
EXTRACT_ANON_NSPACES = NO
HIDE_UNDOC_MEMBERS = YES
HIDE_UNDOC_CLASSES = YES
HIDE_FRIEND_COMPOUNDS = YES
HIDE_IN_BODY_DOCS = NO
INTERNAL_DOCS = NO
CASE_SENSE_NAMES = YES
HIDE_SCOPE_NAMES = YES
SHOW_INCLUDE_FILES = YES
INLINE_INFO = YES
SORT_MEMBER_DOCS = YES
SORT_BRIEF_DOCS = NO
SORT_BY_SCOPE_NAME = NO
GENERATE_TODOLIST = YES
GENERATE_TESTLIST = YES
GENERATE_BUGLIST = YES
GENERATE_DEPRECATEDLIST= YES
ENABLED_SECTIONS =
MAX_INITIALIZER_LINES = 30
SHOW_USED_FILES = YES
SHOW_DIRECTORIES = NO
FILE_VERSION_FILTER =
#---------------------------------------------------------------------------
# configuration options related to warning and progress messages
#---------------------------------------------------------------------------
QUIET = NO
WARNINGS = YES
WARN_IF_UNDOCUMENTED = YES
WARN_IF_DOC_ERROR = YES
WARN_NO_PARAMDOC = NO
WARN_FORMAT = "$file:$line: $text"
WARN_LOGFILE =
#---------------------------------------------------------------------------
# configuration options related to the input files
#---------------------------------------------------------------------------
INPUT = ./
INPUT_ENCODING = UTF-8
FILE_PATTERNS = *.c \
*.cc \
*.cxx \
*.cpp \
*.c++ \
*.d \
*.java \
*.ii \
*.ixx \
*.ipp \
*.i++ \
*.inl \
*.h \
*.hh \
*.hxx \
*.hpp \
*.h++ \
*.idl \
*.odl \
*.cs \
*.php \
*.php3 \
*.inc \
*.m \
*.mm \
*.dox \
*.py \
*.C \
*.CC \
*.C++ \
*.II \
*.I++ \
*.H \
*.HH \
*.H++ \
*.CS \
*.PHP \
*.PHP3 \
*.M \
*.MM \
*.PY
RECURSIVE = NO
EXCLUDE =
EXCLUDE_SYMLINKS = NO
EXCLUDE_PATTERNS =
EXCLUDE_SYMBOLS =
EXAMPLE_PATH = doc/examples/
EXAMPLE_PATTERNS = *
EXAMPLE_RECURSIVE = NO
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
#---------------------------------------------------------------------------
# configuration options related to source browsing
#---------------------------------------------------------------------------
SOURCE_BROWSER = NO
INLINE_SOURCES = NO
STRIP_CODE_COMMENTS = YES
REFERENCED_BY_RELATION = YES
REFERENCES_RELATION = YES
REFERENCES_LINK_SOURCE = YES
USE_HTAGS = NO
VERBATIM_HEADERS = YES
#---------------------------------------------------------------------------
# configuration options related to the alphabetical class index
#---------------------------------------------------------------------------
ALPHABETICAL_INDEX = NO
COLS_IN_ALPHA_INDEX = 5
IGNORE_PREFIX =
#---------------------------------------------------------------------------
# configuration options related to the HTML output
#---------------------------------------------------------------------------
GENERATE_HTML = YES
HTML_OUTPUT = html
HTML_FILE_EXTENSION = .html
HTML_HEADER =
HTML_FOOTER =
HTML_STYLESHEET =
HTML_ALIGN_MEMBERS = YES
GENERATE_HTMLHELP = NO
HTML_DYNAMIC_SECTIONS = NO
CHM_FILE =
HHC_LOCATION =
GENERATE_CHI = NO
BINARY_TOC = NO
TOC_EXPAND = NO
DISABLE_INDEX = NO
ENUM_VALUES_PER_LINE = 4
GENERATE_TREEVIEW = NO
TREEVIEW_WIDTH = 250
#---------------------------------------------------------------------------
# configuration options related to the LaTeX output
#---------------------------------------------------------------------------
GENERATE_LATEX = YES
LATEX_OUTPUT = latex
LATEX_CMD_NAME = latex
MAKEINDEX_CMD_NAME = makeindex
COMPACT_LATEX = NO
PAPER_TYPE = a4wide
EXTRA_PACKAGES =
LATEX_HEADER =
PDF_HYPERLINKS = NO
USE_PDFLATEX = NO
LATEX_BATCHMODE = NO
LATEX_HIDE_INDICES = NO
#---------------------------------------------------------------------------
# configuration options related to the RTF output
#---------------------------------------------------------------------------
GENERATE_RTF = NO
RTF_OUTPUT = rtf
COMPACT_RTF = NO
RTF_HYPERLINKS = NO
RTF_STYLESHEET_FILE =
RTF_EXTENSIONS_FILE =
#---------------------------------------------------------------------------
# configuration options related to the man page output
#---------------------------------------------------------------------------
GENERATE_MAN = NO
MAN_OUTPUT = man
MAN_EXTENSION = .3
MAN_LINKS = NO
#---------------------------------------------------------------------------
# configuration options related to the XML output
#---------------------------------------------------------------------------
GENERATE_XML = NO
XML_OUTPUT = xml
XML_SCHEMA =
XML_DTD =
XML_PROGRAMLISTING = YES
#---------------------------------------------------------------------------
# configuration options for the AutoGen Definitions output
#---------------------------------------------------------------------------
GENERATE_AUTOGEN_DEF = NO
#---------------------------------------------------------------------------
# configuration options related to the Perl module output
#---------------------------------------------------------------------------
GENERATE_PERLMOD = NO
PERLMOD_LATEX = NO
PERLMOD_PRETTY = YES
PERLMOD_MAKEVAR_PREFIX =
#---------------------------------------------------------------------------
# Configuration options related to the preprocessor
#---------------------------------------------------------------------------
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = NO
EXPAND_ONLY_PREDEF = NO
SEARCH_INCLUDES = YES
INCLUDE_PATH =
INCLUDE_FILE_PATTERNS =
PREDEFINED =
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = YES
#---------------------------------------------------------------------------
# Configuration::additions related to external references
#---------------------------------------------------------------------------
TAGFILES =
GENERATE_TAGFILE =
ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
PERL_PATH = /usr/bin/perl
#---------------------------------------------------------------------------
# Configuration options related to the dot tool
#---------------------------------------------------------------------------
CLASS_DIAGRAMS = YES
MSCGEN_PATH =
HIDE_UNDOC_RELATIONS = YES
HAVE_DOT = YES
CLASS_GRAPH = YES
COLLABORATION_GRAPH = YES
GROUP_GRAPHS = YES
UML_LOOK = NO
TEMPLATE_RELATIONS = NO
INCLUDE_GRAPH = YES
INCLUDED_BY_GRAPH = YES
CALL_GRAPH = NO
CALLER_GRAPH = NO
GRAPHICAL_HIERARCHY = YES
DIRECTORY_GRAPH = YES
DOT_IMAGE_FORMAT = png
DOT_PATH =
DOTFILE_DIRS =
DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 1000
DOT_TRANSPARENT = NO
DOT_MULTI_TARGETS = NO
GENERATE_LEGEND = YES
DOT_CLEANUP = YES
#---------------------------------------------------------------------------
# Configuration::additions related to the search engine
#---------------------------------------------------------------------------
SEARCHENGINE = NO

52
Eigen/AccelerateSupport Normal file
View File

@@ -0,0 +1,52 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ACCELERATESUPPORT_MODULE_H
#define EIGEN_ACCELERATESUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
/** \ingroup Support_modules
* \defgroup AccelerateSupport_Module AccelerateSupport module
*
* This module provides an interface to the Apple Accelerate library.
* It provides the seven following main factorization classes:
* - class AccelerateLLT: a Cholesky (LL^T) factorization.
* - class AccelerateLDLT: the default LDL^T factorization.
* - class AccelerateLDLTUnpivoted: a Cholesky-like LDL^T factorization with only 1x1 pivots and no pivoting
* - class AccelerateLDLTSBK: an LDL^T factorization with Supernode Bunch-Kaufman and static pivoting
* - class AccelerateLDLTTPP: an LDL^T factorization with full threshold partial pivoting
* - class AccelerateQR: a QR factorization
* - class AccelerateCholeskyAtA: a QR factorization without storing Q (equivalent to A^TA = R^T R)
*
* \code
* #include <Eigen/AccelerateSupport>
* \endcode
*
* In order to use this module, the Accelerate headers must be accessible from
* the include paths, and your binary must be linked to the Accelerate framework.
* The Accelerate library is only available on Apple hardware.
*
* Note that many of the algorithms can be influenced by the UpLo template
* argument. All matrices are assumed to be symmetric. For example, the following
* creates an LDLT factorization where your matrix is symmetric (implicit) and
* uses the lower triangle:
*
* \code
* AccelerateLDLT<SparseMatrix<float>, Lower> ldlt;
* \endcode
*/
// IWYU pragma: begin_exports
#include "src/AccelerateSupport/AccelerateSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_ACCELERATESUPPORT_MODULE_H

View File

@@ -1,35 +0,0 @@
#ifndef EIGEN_ARRAY_MODULE_H
#define EIGEN_ARRAY_MODULE_H
#include "Core"
namespace Eigen {
/** \defgroup Array Array module
* This module provides several handy features to manipulate matrices as simple array of values.
* In addition to listed classes, it defines various methods of the Cwise interface
* (accessible from MatrixBase::cwise()), including:
* - matrix-scalar sum,
* - coeff-wise comparison operators,
* - sin, cos, sqrt, pow, exp, log, square, cube, inverse (reciprocal).
*
* This module also provides various MatrixBase methods, including:
* - \ref MatrixBase::all() "all", \ref MatrixBase::any() "any",
* - \ref MatrixBase::Random() "random matrix initialization"
*
* \code
* #include <Eigen/Array>
* \endcode
*/
#include "src/Array/CwiseOperators.h"
#include "src/Array/Functors.h"
#include "src/Array/AllAndAny.h"
#include "src/Array/Select.h"
#include "src/Array/PartialRedux.h"
#include "src/Array/Random.h"
#include "src/Array/Norms.h"
} // namespace Eigen
#endif // EIGEN_ARRAY_MODULE_H

View File

@@ -1,34 +0,0 @@
set(Eigen_HEADERS Core LU Cholesky QR Geometry Sparse Array SVD Regression)
if(EIGEN_BUILD_LIB)
set(Eigen_SRCS
src/Core/CoreInstantiations.cpp
src/Cholesky/CholeskyInstantiations.cpp
src/QR/QrInstantiations.cpp
)
add_library(Eigen2 SHARED ${Eigen_SRCS})
install(TARGETS Eigen2
RUNTIME DESTINATION bin
LIBRARY DESTINATION lib
ARCHIVE DESTINATION lib)
endif(EIGEN_BUILD_LIB)
if(CMAKE_COMPILER_IS_GNUCXX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g1 -O2")
set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -g1 -O2")
endif(CMAKE_COMPILER_IS_GNUCXX)
set(INCLUDE_INSTALL_DIR
"${CMAKE_INSTALL_PREFIX}/include/eigen2"
CACHE PATH
"The directory where we install the header files"
FORCE)
install(FILES
${Eigen_HEADERS}
DESTINATION ${INCLUDE_INSTALL_DIR}/Eigen
)
add_subdirectory(src)

View File

@@ -1,60 +1,43 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_CHOLESKY_MODULE_H
#define EIGEN_CHOLESKY_MODULE_H
#include "Core"
#include "Jacobi"
// Note that EIGEN_HIDE_HEAVY_CODE has to be defined per module
#if (defined EIGEN_EXTERN_INSTANTIATIONS) && (EIGEN_EXTERN_INSTANTIATIONS>=2)
#ifndef EIGEN_HIDE_HEAVY_CODE
#define EIGEN_HIDE_HEAVY_CODE
#endif
#elif defined EIGEN_HIDE_HEAVY_CODE
#undef EIGEN_HIDE_HEAVY_CODE
#endif
namespace Eigen {
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup Cholesky_Module Cholesky module
* This module provides two variants of the Cholesky decomposition for selfadjoint (hermitian) matrices.
* Those decompositions are accessible via the following MatrixBase methods:
* - MatrixBase::llt(),
* - MatrixBase::ldlt()
*
* \code
* #include <Eigen/Cholesky>
* \endcode
*/
*
*
*
* This module provides two variants of the Cholesky decomposition for selfadjoint (hermitian) matrices.
* Those decompositions are also accessible via the following methods:
* - MatrixBase::llt()
* - MatrixBase::ldlt()
* - SelfAdjointView::llt()
* - SelfAdjointView::ldlt()
*
* \code
* #include <Eigen/Cholesky>
* \endcode
*/
#include "src/Array/CwiseOperators.h"
#include "src/Array/Functors.h"
// IWYU pragma: begin_exports
#include "src/Cholesky/LLT.h"
#include "src/Cholesky/LDLT.h"
#include "src/Cholesky/Cholesky.h"
#include "src/Cholesky/CholeskyWithoutSquareRoot.h"
} // namespace Eigen
#define EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(MATRIXTYPE,PREFIX) \
PREFIX template class Cholesky<MATRIXTYPE>; \
PREFIX template class CholeskyWithoutSquareRoot<MATRIXTYPE>
#define EIGEN_CHOLESKY_MODULE_INSTANTIATE(PREFIX) \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(Matrix2f,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(Matrix2d,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(Matrix3f,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(Matrix3d,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(Matrix4f,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(Matrix4d,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(MatrixXf,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(MatrixXd,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(MatrixXcf,PREFIX); \
EIGEN_CHOLESKY_MODULE_INSTANTIATE_TYPE(MatrixXcd,PREFIX)
#ifdef EIGEN_EXTERN_INSTANTIATIONS
namespace Eigen {
EIGEN_CHOLESKY_MODULE_INSTANTIATE(extern);
} // namespace Eigen
#ifdef EIGEN_USE_LAPACKE
#include "src/misc/lapacke_helpers.h"
#include "src/Cholesky/LLT_LAPACKE.h"
#endif
// IWYU pragma: end_exports
#endif // EIGEN_CHOLESKY_MODULE_H
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_CHOLESKY_MODULE_H

48
Eigen/CholmodSupport Normal file
View File

@@ -0,0 +1,48 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_CHOLMODSUPPORT_MODULE_H
#define EIGEN_CHOLMODSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
#include <cholmod.h>
/** \ingroup Support_modules
* \defgroup CholmodSupport_Module CholmodSupport module
*
* This module provides an interface to the Cholmod library which is part of the <a
* href="http://www.suitesparse.com">suitesparse</a> package. It provides the two following main factorization classes:
* - class CholmodSupernodalLLT: a supernodal LLT Cholesky factorization.
* - class CholmodDecomposition: a general L(D)LT Cholesky factorization with automatic or explicit runtime selection of
* the underlying factorization method (supernodal or simplicial).
*
* For the sake of completeness, this module also propose the two following classes:
* - class CholmodSimplicialLLT
* - class CholmodSimplicialLDLT
* Note that these classes does not bring any particular advantage compared to the built-in
* SimplicialLLT and SimplicialLDLT factorization classes.
*
* \code
* #include <Eigen/CholmodSupport>
* \endcode
*
* In order to use this module, the cholmod headers must be accessible from the include paths, and your binary must be
* linked to the cholmod library and its dependencies. The dependencies depend on how cholmod has been compiled. For a
* cmake based project, you can use our FindCholmod.cmake module to help you in this task.
*
*/
// IWYU pragma: begin_exports
#include "src/CholmodSupport/CholmodSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_CHOLMODSUPPORT_MODULE_H

View File

@@ -1,119 +1,451 @@
#ifndef EIGEN_CORE_H
#define EIGEN_CORE_H
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2008 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2007-2011 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifdef _MSC_VER
#pragma warning( disable : 4181 4244 )
#ifndef EIGEN_CORE_MODULE_H
#define EIGEN_CORE_MODULE_H
// Eigen version information.
#include "Version"
// first thing Eigen does: stop the compiler from reporting useless warnings.
#include "src/Core/util/DisableStupidWarnings.h"
// then include this file where all our macros are defined. It's really important to do it first because
// it's where we do all the compiler/OS/arch detections and define most defaults.
#include "src/Core/util/Macros.h"
// This detects SSE/AVX/NEON/etc. and configure alignment settings
#include "src/Core/util/ConfigureVectorization.h"
// We need cuda_runtime.h/hip_runtime.h to ensure that
// the EIGEN_USING_STD macro works properly on the device side
#if defined(EIGEN_CUDACC)
#include <cuda_runtime.h>
#elif defined(EIGEN_HIPCC)
#include <hip/hip_runtime.h>
#endif
#ifdef __GNUC__
#define EIGEN_GNUC_AT_LEAST(x,y) ((__GNUC__>=x && __GNUC_MINOR__>=y) || __GNUC__>x)
#else
#define EIGEN_GNUC_AT_LEAST(x,y) 0
#ifdef EIGEN_EXCEPTIONS
#include <new>
#endif
#ifndef EIGEN_DONT_VECTORIZE
#if (defined __SSE2__) && ( (!defined __GNUC__) || EIGEN_GNUC_AT_LEAST(4,2) )
#define EIGEN_VECTORIZE
#define EIGEN_VECTORIZE_SSE
#include <emmintrin.h>
#include <xmmintrin.h>
#ifdef __SSE3__
#include <pmmintrin.h>
#endif
#ifdef __SSSE3__
#include <tmmintrin.h>
#endif
#elif (defined __ALTIVEC__)
#define EIGEN_VECTORIZE
#define EIGEN_VECTORIZE_ALTIVEC
#include <altivec.h>
// We _need_ to #undef all these ugly tokens defined in <altivec.h>
// => use __vector instead of vector
#undef bool
#undef vector
#undef pixel
#endif
// Disable the ipa-cp-clone optimization flag with MinGW 6.x or older (enabled by default with -O3)
// See http://eigen.tuxfamily.org/bz/show_bug.cgi?id=556 for details.
#if EIGEN_COMP_MINGW && EIGEN_GNUC_STRICT_LESS_THAN(6, 0, 0)
#pragma GCC optimize("-fno-ipa-cp-clone")
#endif
// Prevent ICC from specializing std::complex operators that silently fail
// on device. This allows us to use our own device-compatible specializations
// instead.
#if EIGEN_COMP_ICC && defined(EIGEN_GPU_COMPILE_PHASE) && !defined(_OVERRIDE_COMPLEX_SPECIALIZATION_)
#define _OVERRIDE_COMPLEX_SPECIALIZATION_ 1
#endif
#include <complex>
// this include file manages BLAS and MKL related macros
// and inclusion of their respective header files
#include "src/Core/util/MKL_support.h"
#if defined(EIGEN_HAS_CUDA_FP16) || defined(EIGEN_HAS_HIP_FP16)
#define EIGEN_HAS_GPU_FP16
#endif
#if defined(EIGEN_HAS_CUDA_BF16) || defined(EIGEN_HAS_HIP_BF16)
#define EIGEN_HAS_GPU_BF16
#endif
#if (defined _OPENMP) && (!defined EIGEN_DONT_PARALLELIZE)
#define EIGEN_HAS_OPENMP
#endif
#ifdef EIGEN_HAS_OPENMP
#include <atomic>
#include <omp.h>
#endif
// MSVC for windows mobile does not have the errno.h file
#if !(EIGEN_COMP_MSVC && EIGEN_OS_WINCE) && !EIGEN_COMP_ARM
#define EIGEN_HAS_ERRNO
#endif
#ifdef EIGEN_HAS_ERRNO
#include <cerrno>
#endif
#include <cstddef>
#include <cstdlib>
#include <cmath>
#include <complex>
#include <cassert>
#include <functional>
#include <iostream>
#ifndef EIGEN_NO_IO
#include <sstream>
#include <iosfwd>
#endif
#include <cstring>
#include <string>
#include <limits>
#include <climits> // for CHAR_BIT
// for min/max:
#include <algorithm>
#include <array>
#include <memory>
#include <vector>
// for std::is_nothrow_move_assignable
#include <type_traits>
// for std::this_thread::yield().
#if !defined(EIGEN_USE_BLAS) && (defined(EIGEN_HAS_OPENMP) || defined(EIGEN_GEMM_THREADPOOL))
#include <thread>
#endif
// for __cpp_lib feature test macros
#if defined(__has_include) && __has_include(<version>)
#include <version>
#endif
// for std::bit_cast()
#if defined(__cpp_lib_bit_cast) && __cpp_lib_bit_cast >= 201806L
#include <bit>
#endif
// for outputting debug info
#ifdef EIGEN_DEBUG_ASSIGN
#include <iostream>
#endif
// required for __cpuid, needs to be included after cmath
// also required for _BitScanReverse on Windows on ARM
#if EIGEN_COMP_MSVC && (EIGEN_ARCH_i386_OR_x86_64 || EIGEN_ARCH_ARM64) && !EIGEN_OS_WINCE
#include <intrin.h>
#endif
#if defined(EIGEN_USE_SYCL)
#undef min
#undef max
#undef isnan
#undef isinf
#undef isfinite
#include <CL/sycl.hpp>
#include <map>
#include <thread>
#include <utility>
#ifndef EIGEN_SYCL_LOCAL_THREAD_DIM0
#define EIGEN_SYCL_LOCAL_THREAD_DIM0 16
#endif
#ifndef EIGEN_SYCL_LOCAL_THREAD_DIM1
#define EIGEN_SYCL_LOCAL_THREAD_DIM1 16
#endif
#endif
#if defined EIGEN2_SUPPORT_STAGE40_FULL_EIGEN3_STRICTNESS || defined EIGEN2_SUPPORT_STAGE30_FULL_EIGEN3_API || \
defined EIGEN2_SUPPORT_STAGE20_RESOLVE_API_CONFLICTS || defined EIGEN2_SUPPORT_STAGE10_FULL_EIGEN2_API || \
defined EIGEN2_SUPPORT
// This will generate an error message:
#error Eigen2-support is only available up to version 3.2. Please go to "http://eigen.tuxfamily.org/index.php?title=Eigen2" for further information
#endif
namespace Eigen {
/** \defgroup Core_Module Core module
* This is the main module of Eigen providing dense matrix and vector support
* (both fixed and dynamic size) with all the features corresponding to a BLAS library
* and much more...
*
* \code
* #include <Eigen/Core>
* \endcode
*/
// we use size_t frequently and we'll never remember to prepend it with std:: every time just to
// ensure QNX/QCC support
using std::size_t;
// gcc 4.6.0 wants std:: for ptrdiff_t
using std::ptrdiff_t;
#include "src/Core/util/Macros.h"
} // namespace Eigen
/** \defgroup Core_Module Core module
* This is the main module of Eigen providing dense matrix and vector support
* (both fixed and dynamic size) with all the features corresponding to a BLAS library
* and much more...
*
* \code
* #include <Eigen/Core>
* \endcode
*/
#ifdef EIGEN_USE_LAPACKE
#ifdef EIGEN_USE_MKL
#include "mkl_lapacke.h"
#else
#include "src/misc/lapacke.h"
#endif
#endif
// IWYU pragma: begin_exports
#include "src/Core/util/Constants.h"
#include "src/Core/util/ForwardDeclarations.h"
#include "src/Core/util/Meta.h"
#include "src/Core/util/XprHelper.h"
#include "src/Core/util/Assert.h"
#include "src/Core/util/ForwardDeclarations.h"
#include "src/Core/util/StaticAssert.h"
#include "src/Core/util/XprHelper.h"
#include "src/Core/util/Memory.h"
#include "src/Core/util/IntegralConstant.h"
#include "src/Core/util/Serializer.h"
#include "src/Core/util/SymbolicIndex.h"
#include "src/Core/util/EmulateArray.h"
#include "src/Core/util/MoreMeta.h"
#include "src/Core/NumTraits.h"
#include "src/Core/MathFunctions.h"
#include "src/Core/RandomImpl.h"
#include "src/Core/GenericPacketMath.h"
#include "src/Core/MathFunctionsImpl.h"
#include "src/Core/arch/Default/ConjHelper.h"
// Generic half float support
#include "src/Core/arch/Default/Half.h"
#include "src/Core/arch/Default/BFloat16.h"
#include "src/Core/arch/Default/GenericPacketMathFunctionsFwd.h"
#if defined EIGEN_VECTORIZE_SSE
#if defined EIGEN_VECTORIZE_AVX512
#include "src/Core/arch/SSE/PacketMath.h"
#elif defined EIGEN_VECTORIZE_ALTIVEC
#include "src/Core/arch/SSE/Reductions.h"
#include "src/Core/arch/AVX/PacketMath.h"
#include "src/Core/arch/AVX/Reductions.h"
#include "src/Core/arch/AVX512/PacketMath.h"
#include "src/Core/arch/AVX512/Reductions.h"
#if defined EIGEN_VECTORIZE_AVX512FP16
#include "src/Core/arch/AVX512/PacketMathFP16.h"
#endif
#include "src/Core/arch/SSE/TypeCasting.h"
#include "src/Core/arch/AVX/TypeCasting.h"
#include "src/Core/arch/AVX512/TypeCasting.h"
#if defined EIGEN_VECTORIZE_AVX512FP16
#include "src/Core/arch/AVX512/TypeCastingFP16.h"
#endif
#include "src/Core/arch/SSE/Complex.h"
#include "src/Core/arch/AVX/Complex.h"
#include "src/Core/arch/AVX512/Complex.h"
#include "src/Core/arch/SSE/MathFunctions.h"
#include "src/Core/arch/AVX/MathFunctions.h"
#include "src/Core/arch/AVX512/MathFunctions.h"
#if defined EIGEN_VECTORIZE_AVX512FP16
#include "src/Core/arch/AVX512/MathFunctionsFP16.h"
#endif
#include "src/Core/arch/AVX512/TrsmKernel.h"
#elif defined EIGEN_VECTORIZE_AVX
// Use AVX for floats and doubles, SSE for integers
#include "src/Core/arch/SSE/PacketMath.h"
#include "src/Core/arch/SSE/Reductions.h"
#include "src/Core/arch/SSE/TypeCasting.h"
#include "src/Core/arch/SSE/Complex.h"
#include "src/Core/arch/AVX/PacketMath.h"
#include "src/Core/arch/AVX/Reductions.h"
#include "src/Core/arch/AVX/TypeCasting.h"
#include "src/Core/arch/AVX/Complex.h"
#include "src/Core/arch/SSE/MathFunctions.h"
#include "src/Core/arch/AVX/MathFunctions.h"
#elif defined EIGEN_VECTORIZE_SSE
#include "src/Core/arch/SSE/PacketMath.h"
#include "src/Core/arch/SSE/Reductions.h"
#include "src/Core/arch/SSE/TypeCasting.h"
#include "src/Core/arch/SSE/MathFunctions.h"
#include "src/Core/arch/SSE/Complex.h"
#endif
#if defined(EIGEN_VECTORIZE_ALTIVEC) || defined(EIGEN_VECTORIZE_VSX)
#include "src/Core/arch/AltiVec/PacketMath.h"
#include "src/Core/arch/AltiVec/TypeCasting.h"
#include "src/Core/arch/AltiVec/MathFunctions.h"
#include "src/Core/arch/AltiVec/Complex.h"
#elif defined EIGEN_VECTORIZE_NEON
#include "src/Core/arch/NEON/PacketMath.h"
#include "src/Core/arch/NEON/TypeCasting.h"
#include "src/Core/arch/NEON/MathFunctions.h"
#include "src/Core/arch/NEON/Complex.h"
#elif defined EIGEN_VECTORIZE_LSX
#include "src/Core/arch/LSX/PacketMath.h"
#include "src/Core/arch/LSX/TypeCasting.h"
#include "src/Core/arch/LSX/MathFunctions.h"
#include "src/Core/arch/LSX/Complex.h"
#elif defined EIGEN_VECTORIZE_SVE
#include "src/Core/arch/SVE/PacketMath.h"
#include "src/Core/arch/SVE/TypeCasting.h"
#include "src/Core/arch/SVE/MathFunctions.h"
#elif defined EIGEN_VECTORIZE_ZVECTOR
#include "src/Core/arch/ZVector/PacketMath.h"
#include "src/Core/arch/ZVector/MathFunctions.h"
#include "src/Core/arch/ZVector/Complex.h"
#elif defined EIGEN_VECTORIZE_MSA
#include "src/Core/arch/MSA/PacketMath.h"
#include "src/Core/arch/MSA/MathFunctions.h"
#include "src/Core/arch/MSA/Complex.h"
#elif defined EIGEN_VECTORIZE_HVX
#include "src/Core/arch/HVX/PacketMath.h"
#endif
#ifndef EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD
#define EIGEN_CACHEFRIENDLY_PRODUCT_THRESHOLD 16
#if defined EIGEN_VECTORIZE_GPU
#include "src/Core/arch/GPU/PacketMath.h"
#include "src/Core/arch/GPU/MathFunctions.h"
#include "src/Core/arch/GPU/TypeCasting.h"
#endif
#include "src/Core/Functors.h"
#if defined(EIGEN_USE_SYCL)
#include "src/Core/arch/SYCL/InteropHeaders.h"
#if !defined(EIGEN_DONT_VECTORIZE_SYCL)
#include "src/Core/arch/SYCL/PacketMath.h"
#include "src/Core/arch/SYCL/MathFunctions.h"
#include "src/Core/arch/SYCL/TypeCasting.h"
#endif
#endif
#include "src/Core/arch/Default/Settings.h"
// This file provides generic implementations valid for scalar as well
#include "src/Core/arch/Default/GenericPacketMathFunctions.h"
#include "src/Core/functors/TernaryFunctors.h"
#include "src/Core/functors/BinaryFunctors.h"
#include "src/Core/functors/UnaryFunctors.h"
#include "src/Core/functors/NullaryFunctors.h"
#include "src/Core/functors/StlFunctors.h"
#include "src/Core/functors/AssignmentFunctors.h"
// Specialized functors for GPU.
#ifdef EIGEN_GPUCC
#include "src/Core/arch/GPU/Complex.h"
#endif
// Specializations of vectorized activation functions for NEON.
#ifdef EIGEN_VECTORIZE_NEON
#include "src/Core/arch/NEON/UnaryFunctors.h"
#endif
#include "src/Core/util/IndexedViewHelper.h"
#include "src/Core/util/ReshapedHelper.h"
#include "src/Core/ArithmeticSequence.h"
#ifndef EIGEN_NO_IO
#include "src/Core/IO.h"
#endif
#include "src/Core/DenseCoeffsBase.h"
#include "src/Core/DenseBase.h"
#include "src/Core/MatrixBase.h"
#include "src/Core/Coeffs.h"
#ifndef EIGEN_PARSED_BY_DOXYGEN // work around Doxygen bug triggered by Assign.h r814874
// at least confirmed with Doxygen 1.5.5 and 1.5.6
#include "src/Core/EigenBase.h"
#include "src/Core/Product.h"
#include "src/Core/CoreEvaluators.h"
#include "src/Core/AssignEvaluator.h"
#include "src/Core/RealView.h"
#include "src/Core/Assign.h"
#endif
#include "src/Core/MatrixStorage.h"
#include "src/Core/ArrayBase.h"
#include "src/Core/util/BlasUtil.h"
#include "src/Core/DenseStorage.h"
#include "src/Core/NestByValue.h"
#include "src/Core/Flagged.h"
// #include "src/Core/ForceAlignedAccess.h"
#include "src/Core/ReturnByValue.h"
#include "src/Core/NoAlias.h"
#include "src/Core/PlainObjectBase.h"
#include "src/Core/Matrix.h"
#include "src/Core/Cwise.h"
#include "src/Core/Array.h"
#include "src/Core/Fill.h"
#include "src/Core/CwiseTernaryOp.h"
#include "src/Core/CwiseBinaryOp.h"
#include "src/Core/CwiseUnaryOp.h"
#include "src/Core/CwiseNullaryOp.h"
#include "src/Core/CwiseUnaryView.h"
#include "src/Core/SelfCwiseBinaryOp.h"
#include "src/Core/InnerProduct.h"
#include "src/Core/Dot.h"
#include "src/Core/Product.h"
#include "src/Core/DiagonalProduct.h"
#include "src/Core/SolveTriangular.h"
#include "src/Core/StableNorm.h"
#include "src/Core/Stride.h"
#include "src/Core/MapBase.h"
#include "src/Core/Map.h"
#include "src/Core/Ref.h"
#include "src/Core/Block.h"
#include "src/Core/Minor.h"
#include "src/Core/VectorBlock.h"
#include "src/Core/IndexedView.h"
#include "src/Core/Reshaped.h"
#include "src/Core/Transpose.h"
#include "src/Core/DiagonalMatrix.h"
#include "src/Core/DiagonalCoeffs.h"
#include "src/Core/Sum.h"
#include "src/Core/Diagonal.h"
#include "src/Core/DiagonalProduct.h"
#include "src/Core/SkewSymmetricMatrix3.h"
#include "src/Core/Redux.h"
#include "src/Core/Visitor.h"
#include "src/Core/FindCoeff.h"
#include "src/Core/Fuzzy.h"
#include "src/Core/IO.h"
#include "src/Core/Swap.h"
#include "src/Core/CommaInitializer.h"
#include "src/Core/Part.h"
#include "src/Core/CacheFriendlyProduct.h"
#include "src/Core/GeneralProduct.h"
#include "src/Core/Solve.h"
#include "src/Core/Inverse.h"
#include "src/Core/SolverBase.h"
#include "src/Core/PermutationMatrix.h"
#include "src/Core/Transpositions.h"
#include "src/Core/TriangularMatrix.h"
#include "src/Core/SelfAdjointView.h"
#include "src/Core/products/GeneralBlockPanelKernel.h"
#include "src/Core/DeviceWrapper.h"
#ifdef EIGEN_GEMM_THREADPOOL
#include "ThreadPool"
#endif
#include "src/Core/products/Parallelizer.h"
#include "src/Core/ProductEvaluators.h"
#include "src/Core/products/GeneralMatrixVector.h"
#include "src/Core/products/GeneralMatrixMatrix.h"
#include "src/Core/SolveTriangular.h"
#include "src/Core/products/GeneralMatrixMatrixTriangular.h"
#include "src/Core/products/SelfadjointMatrixVector.h"
#include "src/Core/products/SelfadjointMatrixMatrix.h"
#include "src/Core/products/SelfadjointProduct.h"
#include "src/Core/products/SelfadjointRank2Update.h"
#include "src/Core/products/TriangularMatrixVector.h"
#include "src/Core/products/TriangularMatrixMatrix.h"
#include "src/Core/products/TriangularSolverMatrix.h"
#include "src/Core/products/TriangularSolverVector.h"
#include "src/Core/BandMatrix.h"
#include "src/Core/CoreIterators.h"
#include "src/Core/ConditionEstimator.h"
} // namespace Eigen
#if defined(EIGEN_VECTORIZE_VSX)
#include "src/Core/arch/AltiVec/MatrixProduct.h"
#elif defined EIGEN_VECTORIZE_NEON
#include "src/Core/arch/NEON/GeneralBlockPanelKernel.h"
#elif defined EIGEN_VECTORIZE_LSX
#include "src/Core/arch/LSX/GeneralBlockPanelKernel.h"
#endif
#endif // EIGEN_CORE_H
#if defined(EIGEN_VECTORIZE_AVX512)
#include "src/Core/arch/AVX512/GemmKernel.h"
#endif
#include "src/Core/Select.h"
#include "src/Core/VectorwiseOp.h"
#include "src/Core/PartialReduxEvaluator.h"
#include "src/Core/Random.h"
#include "src/Core/Replicate.h"
#include "src/Core/Reverse.h"
#include "src/Core/ArrayWrapper.h"
#include "src/Core/StlIterators.h"
#ifdef EIGEN_USE_BLAS
#include "src/Core/products/GeneralMatrixMatrix_BLAS.h"
#include "src/Core/products/GeneralMatrixVector_BLAS.h"
#include "src/Core/products/GeneralMatrixMatrixTriangular_BLAS.h"
#include "src/Core/products/SelfadjointMatrixMatrix_BLAS.h"
#include "src/Core/products/SelfadjointMatrixVector_BLAS.h"
#include "src/Core/products/TriangularMatrixMatrix_BLAS.h"
#include "src/Core/products/TriangularMatrixVector_BLAS.h"
#include "src/Core/products/TriangularSolverMatrix_BLAS.h"
#endif // EIGEN_USE_BLAS
#ifdef EIGEN_USE_MKL_VML
#include "src/Core/Assign_MKL.h"
#endif
#include "src/Core/GlobalFunctions.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_CORE_MODULE_H

7
Eigen/Dense Normal file
View File

@@ -0,0 +1,7 @@
#include "Core"
#include "LU"
#include "Cholesky"
#include "QR"
#include "SVD"
#include "Geometry"
#include "Eigenvalues"

2
Eigen/Eigen Normal file
View File

@@ -0,0 +1,2 @@
#include "Dense"
#include "Sparse"

63
Eigen/Eigenvalues Normal file
View File

@@ -0,0 +1,63 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_EIGENVALUES_MODULE_H
#define EIGEN_EIGENVALUES_MODULE_H
#include "Core"
#include "Cholesky"
#include "Jacobi"
#include "Householder"
#include "LU"
#include "Geometry"
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup Eigenvalues_Module Eigenvalues module
*
*
*
* This module mainly provides various eigenvalue solvers.
* This module also provides some MatrixBase methods, including:
* - MatrixBase::eigenvalues(),
* - MatrixBase::operatorNorm()
*
* \code
* #include <Eigen/Eigenvalues>
* \endcode
*/
#include "src/misc/RealSvd2x2.h"
// IWYU pragma: begin_exports
#include "src/Eigenvalues/Tridiagonalization.h"
#include "src/Eigenvalues/RealSchur.h"
#include "src/Eigenvalues/EigenSolver.h"
#include "src/Eigenvalues/SelfAdjointEigenSolver.h"
#include "src/Eigenvalues/GeneralizedSelfAdjointEigenSolver.h"
#include "src/Eigenvalues/HessenbergDecomposition.h"
#include "src/Eigenvalues/ComplexSchur.h"
#include "src/Eigenvalues/ComplexEigenSolver.h"
#include "src/Eigenvalues/RealQZ.h"
#include "src/Eigenvalues/GeneralizedEigenSolver.h"
#include "src/Eigenvalues/MatrixBaseEigenvalues.h"
#ifdef EIGEN_USE_LAPACKE
#ifdef EIGEN_USE_MKL
#include "mkl_lapacke.h"
#else
#include "src/misc/lapacke.h"
#endif
#include "src/Eigenvalues/RealSchur_LAPACKE.h"
#include "src/Eigenvalues/ComplexSchur_LAPACKE.h"
#include "src/Eigenvalues/SelfAdjointEigenSolver_LAPACKE.h"
#endif
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_EIGENVALUES_MODULE_H

View File

@@ -1,42 +1,59 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_GEOMETRY_MODULE_H
#define EIGEN_GEOMETRY_MODULE_H
#include "Array"
#include "Core"
#include "SVD"
#include "LU"
#include <limits>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
#include "src/Core/util/DisableStupidWarnings.h"
namespace Eigen {
/** \defgroup GeometryModule Geometry module
* This module provides support for:
* - fixed-size homogeneous transformations
* - translation, scaling, 2D and 3D rotations
* - quaternions
* - \ref MatrixBase::cross() "cross product"
* - \ref MatrixBase::unitOrthogonal() "orthognal vector generation"
* - some linear components: parametrized-lines and hyperplanes
*
* \code
* #include <Eigen/Geometry>
* \endcode
*/
/** \defgroup Geometry_Module Geometry module
*
* This module provides support for:
* - fixed-size homogeneous transformations
* - translation, scaling, 2D and 3D rotations
* - \link Quaternion quaternions \endlink
* - cross products (\ref MatrixBase::cross(), \ref MatrixBase::cross3())
* - orthogonal vector generation (MatrixBase::unitOrthogonal)
* - some linear components: \link ParametrizedLine parametrized-lines \endlink and \link Hyperplane hyperplanes \endlink
* - \link AlignedBox axis aligned bounding boxes \endlink
* - \link umeyama() least-square transformation fitting \endlink
* \code
* #include <Eigen/Geometry>
* \endcode
*/
// IWYU pragma: begin_exports
#include "src/Geometry/OrthoMethods.h"
#include "src/Geometry/EulerAngles.h"
#include "src/Geometry/Homogeneous.h"
#include "src/Geometry/RotationBase.h"
#include "src/Geometry/Rotation2D.h"
#include "src/Geometry/Quaternion.h"
#include "src/Geometry/AngleAxis.h"
#include "src/Geometry/EulerAngles.h"
#include "src/Geometry/Transform.h"
#include "src/Geometry/Translation.h"
#include "src/Geometry/Scaling.h"
#include "src/Geometry/Hyperplane.h"
#include "src/Geometry/ParametrizedLine.h"
#include "src/Geometry/AlignedBox.h"
#include "src/Geometry/Umeyama.h"
} // namespace Eigen
// Use the SSE optimized version whenever possible.
#if (defined EIGEN_VECTORIZE_SSE) || (defined EIGEN_VECTORIZE_NEON)
#include "src/Geometry/arch/Geometry_SIMD.h"
#endif
// IWYU pragma: end_exports
#endif // EIGEN_GEOMETRY_MODULE_H
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_GEOMETRY_MODULE_H

31
Eigen/Householder Normal file
View File

@@ -0,0 +1,31 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_HOUSEHOLDER_MODULE_H
#define EIGEN_HOUSEHOLDER_MODULE_H
#include "Core"
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup Householder_Module Householder module
* This module provides Householder transformations.
*
* \code
* #include <Eigen/Householder>
* \endcode
*/
// IWYU pragma: begin_exports
#include "src/Householder/Householder.h"
#include "src/Householder/HouseholderSequence.h"
#include "src/Householder/BlockHouseholder.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_HOUSEHOLDER_MODULE_H

View File

@@ -0,0 +1,52 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ITERATIVELINEARSOLVERS_MODULE_H
#define EIGEN_ITERATIVELINEARSOLVERS_MODULE_H
#include "SparseCore"
#include "OrderingMethods"
#include "src/Core/util/DisableStupidWarnings.h"
/**
* \defgroup IterativeLinearSolvers_Module IterativeLinearSolvers module
*
* This module currently provides iterative methods to solve problems of the form \c A \c x = \c b, where \c A is a
squared matrix, usually very large and sparse.
* Those solvers are accessible via the following classes:
* - ConjugateGradient for selfadjoint (hermitian) matrices,
* - LeastSquaresConjugateGradient for rectangular least-square problems,
* - BiCGSTAB for general square matrices.
*
* These iterative solvers are associated with some preconditioners:
* - IdentityPreconditioner - not really useful
* - DiagonalPreconditioner - also called Jacobi preconditioner, work very well on diagonal dominant matrices.
* - IncompleteLUT - incomplete LU factorization with dual thresholding
*
* Such problems can also be solved using the direct sparse decomposition modules: SparseCholesky, CholmodSupport,
UmfPackSupport, SuperLUSupport, AccelerateSupport.
*
\code
#include <Eigen/IterativeLinearSolvers>
\endcode
*/
// IWYU pragma: begin_exports
#include "src/IterativeLinearSolvers/SolveWithGuess.h"
#include "src/IterativeLinearSolvers/IterativeSolverBase.h"
#include "src/IterativeLinearSolvers/BasicPreconditioners.h"
#include "src/IterativeLinearSolvers/ConjugateGradient.h"
#include "src/IterativeLinearSolvers/LeastSquareConjugateGradient.h"
#include "src/IterativeLinearSolvers/BiCGSTAB.h"
#include "src/IterativeLinearSolvers/IncompleteLUT.h"
#include "src/IterativeLinearSolvers/IncompleteCholesky.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_ITERATIVELINEARSOLVERS_MODULE_H

33
Eigen/Jacobi Normal file
View File

@@ -0,0 +1,33 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_JACOBI_MODULE_H
#define EIGEN_JACOBI_MODULE_H
#include "Core"
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup Jacobi_Module Jacobi module
* This module provides Jacobi and Givens rotations.
*
* \code
* #include <Eigen/Jacobi>
* \endcode
*
* In addition to listed classes, it defines the two following MatrixBase methods to apply a Jacobi or Givens rotation:
* - MatrixBase::applyOnTheLeft()
* - MatrixBase::applyOnTheRight().
*/
// IWYU pragma: begin_exports
#include "src/Jacobi/Jacobi.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_JACOBI_MODULE_H

43
Eigen/KLUSupport Normal file
View File

@@ -0,0 +1,43 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_KLUSUPPORT_MODULE_H
#define EIGEN_KLUSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
extern "C" {
#include <btf.h>
#include <klu.h>
}
/** \ingroup Support_modules
* \defgroup KLUSupport_Module KLUSupport module
*
* This module provides an interface to the KLU library which is part of the <a
* href="http://www.suitesparse.com">suitesparse</a> package. It provides the following factorization class:
* - class KLU: a sparse LU factorization, well-suited for circuit simulation.
*
* \code
* #include <Eigen/KLUSupport>
* \endcode
*
* In order to use this module, the klu and btf headers must be accessible from the include paths, and your binary must
* be linked to the klu library and its dependencies. The dependencies depend on how umfpack has been compiled. For a
* cmake based project, you can use our FindKLU.cmake module to help you in this task.
*
*/
// IWYU pragma: begin_exports
#include "src/KLUSupport/KLUSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_KLUSUPPORT_MODULE_H

View File

@@ -1,25 +1,46 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_LU_MODULE_H
#define EIGEN_LU_MODULE_H
#include "Core"
namespace Eigen {
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup LU_Module LU module
* This module includes %LU decomposition and related notions such as matrix inversion and determinant.
* This module defines the following MatrixBase methods:
* - MatrixBase::inverse()
* - MatrixBase::determinant()
*
* \code
* #include <Eigen/LU>
* \endcode
*/
* This module includes %LU decomposition and related notions such as matrix inversion and determinant.
* This module defines the following MatrixBase methods:
* - MatrixBase::inverse()
* - MatrixBase::determinant()
*
* \code
* #include <Eigen/LU>
* \endcode
*/
#include "src/LU/LU.h"
#include "src/misc/Kernel.h"
#include "src/misc/Image.h"
// IWYU pragma: begin_exports
#include "src/LU/FullPivLU.h"
#include "src/LU/PartialPivLU.h"
#ifdef EIGEN_USE_LAPACKE
#include "src/misc/lapacke_helpers.h"
#include "src/LU/PartialPivLU_LAPACKE.h"
#endif
#include "src/LU/Determinant.h"
#include "src/LU/Inverse.h"
#include "src/LU/InverseImpl.h"
} // namespace Eigen
#if defined EIGEN_VECTORIZE_SSE || defined EIGEN_VECTORIZE_NEON
#include "src/LU/arch/InverseSize4.h"
#endif
// IWYU pragma: end_exports
#endif // EIGEN_LU_MODULE_H
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_LU_MODULE_H

35
Eigen/MetisSupport Normal file
View File

@@ -0,0 +1,35 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_METISSUPPORT_MODULE_H
#define EIGEN_METISSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
extern "C" {
#include <metis.h>
}
/** \ingroup Support_modules
* \defgroup MetisSupport_Module MetisSupport module
*
* \code
* #include <Eigen/MetisSupport>
* \endcode
* This module defines an interface to the METIS reordering package (http://glaros.dtc.umn.edu/gkhome/views/metis).
* It can be used just as any other built-in method as explained in \link OrderingMethods_Module here. \endlink
*/
// IWYU pragma: begin_exports
#include "src/MetisSupport/MetisSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_METISSUPPORT_MODULE_H

73
Eigen/OrderingMethods Normal file
View File

@@ -0,0 +1,73 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ORDERINGMETHODS_MODULE_H
#define EIGEN_ORDERINGMETHODS_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
/**
* \defgroup OrderingMethods_Module OrderingMethods module
*
* This module is currently for internal use only
*
* It defines various built-in and external ordering methods for sparse matrices.
* They are typically used to reduce the number of elements during
* the sparse matrix decomposition (LLT, LU, QR).
* Precisely, in a preprocessing step, a permutation matrix P is computed using
* those ordering methods and applied to the columns of the matrix.
* Using for instance the sparse Cholesky decomposition, it is expected that
* the nonzeros elements in LLT(A*P) will be much smaller than that in LLT(A).
*
*
* Usage :
* \code
* #include <Eigen/OrderingMethods>
* \endcode
*
* A simple usage is as a template parameter in the sparse decomposition classes :
*
* \code
* SparseLU<MatrixType, COLAMDOrdering<int> > solver;
* \endcode
*
* \code
* SparseQR<MatrixType, COLAMDOrdering<int> > solver;
* \endcode
*
* It is possible as well to call directly a particular ordering method for your own purpose,
* \code
* AMDOrdering<int> ordering;
* PermutationMatrix<Dynamic, Dynamic, int> perm;
* SparseMatrix<double> A;
* //Fill the matrix ...
*
* ordering(A, perm); // Call AMD
* \endcode
*
* \note Some of these methods (like AMD or METIS), need the sparsity pattern
* of the input matrix to be symmetric. When the matrix is structurally unsymmetric,
* Eigen computes internally the pattern of \f$A^T*A\f$ before calling the method.
* If your matrix is already symmetric (at least in structure), you can avoid that
* by calling the method with a SelfAdjointView type.
*
* \code
* // Call the ordering on the pattern of the lower triangular matrix A
* ordering(A.selfadjointView<Lower>(), perm);
* \endcode
*/
// IWYU pragma: begin_exports
#include "src/OrderingMethods/Amd.h"
#include "src/OrderingMethods/Ordering.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_ORDERINGMETHODS_MODULE_H

51
Eigen/PaStiXSupport Normal file
View File

@@ -0,0 +1,51 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_PASTIXSUPPORT_MODULE_H
#define EIGEN_PASTIXSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
extern "C" {
#include <pastix_nompi.h>
#include <pastix.h>
}
#ifdef complex
#undef complex
#endif
/** \ingroup Support_modules
* \defgroup PaStiXSupport_Module PaStiXSupport module
*
* This module provides an interface to the <a href="http://pastix.gforge.inria.fr/">PaSTiX</a> library.
* PaSTiX is a general \b supernodal, \b parallel and \b opensource sparse solver.
* It provides the two following main factorization classes:
* - class PastixLLT : a supernodal, parallel LLt Cholesky factorization.
* - class PastixLDLT: a supernodal, parallel LDLt Cholesky factorization.
* - class PastixLU : a supernodal, parallel LU factorization (optimized for a symmetric pattern).
*
* \code
* #include <Eigen/PaStiXSupport>
* \endcode
*
* In order to use this module, the PaSTiX headers must be accessible from the include paths, and your binary must be
* linked to the PaSTiX library and its dependencies. This wrapper resuires PaStiX version 5.x compiled without MPI
* support. The dependencies depend on how PaSTiX has been compiled. For a cmake based project, you can use our
* FindPaSTiX.cmake module to help you in this task.
*
*/
// IWYU pragma: begin_exports
#include "src/PaStiXSupport/PaStiXSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_PASTIXSUPPORT_MODULE_H

38
Eigen/PardisoSupport Normal file
View File

@@ -0,0 +1,38 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_PARDISOSUPPORT_MODULE_H
#define EIGEN_PARDISOSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
#include <mkl_pardiso.h>
/** \ingroup Support_modules
* \defgroup PardisoSupport_Module PardisoSupport module
*
* This module brings support for the Intel(R) MKL PARDISO direct sparse solvers.
*
* \code
* #include <Eigen/PardisoSupport>
* \endcode
*
* In order to use this module, the MKL headers must be accessible from the include paths, and your binary must be
* linked to the MKL library and its dependencies. See this \ref TopicUsingIntelMKL "page" for more information on
* MKL-Eigen integration.
*
*/
// IWYU pragma: begin_exports
#include "src/PardisoSupport/PardisoSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_PARDISOSUPPORT_MODULE_H

View File

@@ -1,65 +1,48 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_QR_MODULE_H
#define EIGEN_QR_MODULE_H
#include "Core"
#include "Cholesky"
#include "Jacobi"
#include "Householder"
// Note that EIGEN_HIDE_HEAVY_CODE has to be defined per module
#if (defined EIGEN_EXTERN_INSTANTIATIONS) && (EIGEN_EXTERN_INSTANTIATIONS>=2)
#ifndef EIGEN_HIDE_HEAVY_CODE
#define EIGEN_HIDE_HEAVY_CODE
#endif
#elif defined EIGEN_HIDE_HEAVY_CODE
#undef EIGEN_HIDE_HEAVY_CODE
#endif
namespace Eigen {
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup QR_Module QR module
* This module mainly provides QR decomposition and an eigen value solver.
* This module also provides some MatrixBase methods, including:
* - MatrixBase::qr(),
* - MatrixBase::eigenvalues(),
* - MatrixBase::operatorNorm()
*
* \code
* #include <Eigen/QR>
* \endcode
*/
*
*
*
* This module provides various QR decompositions
* This module also provides some MatrixBase methods, including:
* - MatrixBase::householderQr()
* - MatrixBase::colPivHouseholderQr()
* - MatrixBase::fullPivHouseholderQr()
*
* \code
* #include <Eigen/QR>
* \endcode
*/
#include "src/QR/QR.h"
#include "src/QR/Tridiagonalization.h"
#include "src/QR/EigenSolver.h"
#include "src/QR/SelfAdjointEigenSolver.h"
#include "src/QR/HessenbergDecomposition.h"
// IWYU pragma: begin_exports
#include "src/QR/HouseholderQR.h"
#include "src/QR/FullPivHouseholderQR.h"
#include "src/QR/ColPivHouseholderQR.h"
#include "src/QR/CompleteOrthogonalDecomposition.h"
#ifdef EIGEN_USE_LAPACKE
#include "src/misc/lapacke_helpers.h"
#include "src/QR/HouseholderQR_LAPACKE.h"
#include "src/QR/ColPivHouseholderQR_LAPACKE.h"
#endif
// IWYU pragma: end_exports
// declare all classes for a given matrix type
#define EIGEN_QR_MODULE_INSTANTIATE_TYPE(MATRIXTYPE,PREFIX) \
PREFIX template class QR<MATRIXTYPE>; \
PREFIX template class Tridiagonalization<MATRIXTYPE>; \
PREFIX template class HessenbergDecomposition<MATRIXTYPE>; \
PREFIX template class SelfAdjointEigenSolver<MATRIXTYPE>
#include "src/Core/util/ReenableStupidWarnings.h"
// removed because it does not support complex yet
// PREFIX template class EigenSolver<MATRIXTYPE>
// declare all class for all types
#define EIGEN_QR_MODULE_INSTANTIATE(PREFIX) \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(Matrix2f,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(Matrix2d,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(Matrix3f,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(Matrix3d,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(Matrix4f,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(Matrix4d,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(MatrixXf,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(MatrixXd,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(MatrixXcf,PREFIX); \
EIGEN_QR_MODULE_INSTANTIATE_TYPE(MatrixXcd,PREFIX)
#ifdef EIGEN_EXTERN_INSTANTIATIONS
EIGEN_QR_MODULE_INSTANTIATE(extern);
#endif // EIGEN_EXTERN_INSTANTIATIONS
} // namespace Eigen
#endif // EIGEN_QR_MODULE_H
#endif // EIGEN_QR_MODULE_H

32
Eigen/QtAlignedMalloc Normal file
View File

@@ -0,0 +1,32 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_QTMALLOC_MODULE_H
#define EIGEN_QTMALLOC_MODULE_H
#include "Core"
#if (!EIGEN_MALLOC_ALREADY_ALIGNED)
#include "src/Core/util/DisableStupidWarnings.h"
void *qMalloc(std::size_t size) { return Eigen::internal::aligned_malloc(size); }
void qFree(void *ptr) { Eigen::internal::aligned_free(ptr); }
void *qRealloc(void *ptr, std::size_t size) {
void *newPtr = Eigen::internal::aligned_malloc(size);
std::memcpy(newPtr, ptr, size);
Eigen::internal::aligned_free(ptr);
return newPtr;
}
#include "src/Core/util/ReenableStupidWarnings.h"
#endif
#endif // EIGEN_QTMALLOC_MODULE_H

View File

@@ -1,22 +0,0 @@
#ifndef EIGEN_REGRESSION_MODULE_H
#define EIGEN_REGRESSION_MODULE_H
#include "LU"
#include "QR"
#include "Geometry"
namespace Eigen {
/** \defgroup Regression_Module Regression module
* This module provides linear regression and related features.
*
* \code
* #include <Eigen/Regression>
* \endcode
*/
#include "src/Regression/Regression.h"
} // namespace Eigen
#endif // EIGEN_REGRESSION_MODULE_H

41
Eigen/SPQRSupport Normal file
View File

@@ -0,0 +1,41 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SPQRSUPPORT_MODULE_H
#define EIGEN_SPQRSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
#include "SuiteSparseQR.hpp"
/** \ingroup Support_modules
* \defgroup SPQRSupport_Module SuiteSparseQR module
*
* This module provides an interface to the SPQR library, which is part of the <a
* href="http://www.suitesparse.com">suitesparse</a> package.
*
* \code
* #include <Eigen/SPQRSupport>
* \endcode
*
* In order to use this module, the SPQR headers must be accessible from the include paths, and your binary must be
* linked to the SPQR library and its dependencies (Cholmod, AMD, COLAMD,...). For a cmake based project, you can use
* our FindSPQR.cmake and FindCholmod.Cmake modules
*
*/
#include "CholmodSupport"
// IWYU pragma: begin_exports
#include "src/SPQRSupport/SuiteSparseQRSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif

View File

@@ -1,22 +1,56 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SVD_MODULE_H
#define EIGEN_SVD_MODULE_H
#include "Core"
#include "QR"
#include "Householder"
#include "Jacobi"
namespace Eigen {
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup SVD_Module SVD module
* This module provides SVD decomposition for (currently) real matrices.
* This decomposition is accessible via the following MatrixBase method:
* - MatrixBase::svd()
*
* \code
* #include <Eigen/SVD>
* \endcode
*/
*
*
*
* This module provides SVD decomposition for matrices (both real and complex).
* Two decomposition algorithms are provided:
* - JacobiSVD implementing two-sided Jacobi iterations is numerically very accurate, fast for small matrices, but very
* slow for larger ones.
* - BDCSVD implementing a recursive divide & conquer strategy on top of an upper-bidiagonalization which remains fast
* for large problems. These decompositions are accessible via the respective classes and following MatrixBase methods:
* - MatrixBase::jacobiSvd()
* - MatrixBase::bdcSvd()
*
* \code
* #include <Eigen/SVD>
* \endcode
*/
#include "src/SVD/SVD.h"
// IWYU pragma: begin_exports
#include "src/misc/RealSvd2x2.h"
#include "src/SVD/UpperBidiagonalization.h"
#include "src/SVD/SVDBase.h"
#include "src/SVD/JacobiSVD.h"
#include "src/SVD/BDCSVD.h"
#ifdef EIGEN_USE_LAPACKE
#ifdef EIGEN_USE_MKL
#include "mkl_lapacke.h"
#else
#include "src/misc/lapacke.h"
#endif
#ifndef EIGEN_USE_LAPACKE_STRICT
#include "src/SVD/JacobiSVD_LAPACKE.h"
#endif
#include "src/SVD/BDCSVD_LAPACKE.h"
#endif
// IWYU pragma: end_exports
} // namespace Eigen
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_SVD_MODULE_H
#endif // EIGEN_SVD_MODULE_H

View File

@@ -1,105 +1,33 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SPARSE_MODULE_H
#define EIGEN_SPARSE_MODULE_H
#include "Core"
#include <vector>
#include <map>
#include <cstdlib>
#include <cstring>
#include <algorithm>
/** \defgroup Sparse_Module Sparse meta-module
*
* Meta-module including all related modules:
* - \ref SparseCore_Module
* - \ref OrderingMethods_Module
* - \ref SparseCholesky_Module
* - \ref SparseLU_Module
* - \ref SparseQR_Module
* - \ref IterativeLinearSolvers_Module
*
\code
#include <Eigen/Sparse>
\endcode
*/
#ifdef EIGEN_GOOGLEHASH_SUPPORT
#include <google/dense_hash_map>
#endif
#include "SparseCore"
#include "OrderingMethods"
#include "SparseCholesky"
#include "SparseLU"
#include "SparseQR"
#include "IterativeLinearSolvers"
#ifdef EIGEN_CHOLMOD_SUPPORT
extern "C" {
#include "cholmod.h"
}
#endif
#ifdef EIGEN_TAUCS_SUPPORT
// taucs.h declares a lot of mess
#define isnan
#define finite
#define isinf
extern "C" {
#include "taucs.h"
}
#undef isnan
#undef finite
#undef isinf
#ifdef min
#undef min
#endif
#ifdef max
#undef max
#endif
#endif
#ifdef EIGEN_SUPERLU_SUPPORT
typedef int int_t;
#include "superlu/slu_Cnames.h"
#include "superlu/supermatrix.h"
#include "superlu/slu_util.h"
namespace SuperLU_S {
#include "superlu/slu_sdefs.h"
}
namespace SuperLU_D {
#include "superlu/slu_ddefs.h"
}
namespace SuperLU_C {
#include "superlu/slu_cdefs.h"
}
namespace SuperLU_Z {
#include "superlu/slu_zdefs.h"
}
namespace Eigen { struct SluMatrix; }
#endif
#ifdef EIGEN_UMFPACK_SUPPORT
#include "umfpack.h"
#endif
namespace Eigen {
#include "src/Sparse/SparseUtil.h"
#include "src/Sparse/SparseMatrixBase.h"
#include "src/Sparse/SparseArray.h"
#include "src/Sparse/AmbiVector.h"
#include "src/Sparse/RandomSetter.h"
#include "src/Sparse/SparseBlock.h"
#include "src/Sparse/SparseMatrix.h"
//#include "src/Sparse/HashMatrix.h"
//#include "src/Sparse/LinkedVectorMatrix.h"
#include "src/Sparse/CoreIterators.h"
//#include "src/Sparse/SparseSetter.h"
#include "src/Sparse/SparseProduct.h"
#include "src/Sparse/TriangularSolver.h"
#include "src/Sparse/SparseLLT.h"
#include "src/Sparse/SparseLDLT.h"
#include "src/Sparse/SparseLU.h"
#ifdef EIGEN_CHOLMOD_SUPPORT
# include "src/Sparse/CholmodSupport.h"
#endif
#ifdef EIGEN_TAUCS_SUPPORT
# include "src/Sparse/TaucsSupport.h"
#endif
#ifdef EIGEN_SUPERLU_SUPPORT
# include "src/Sparse/SuperLUSupport.h"
#endif
#ifdef EIGEN_UMFPACK_SUPPORT
# include "src/Sparse/UmfPackSupport.h"
#endif
} // namespace Eigen
#endif // EIGEN_SPARSE_MODULE_H
#endif // EIGEN_SPARSE_MODULE_H

40
Eigen/SparseCholesky Normal file
View File

@@ -0,0 +1,40 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2008-2013 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SPARSECHOLESKY_MODULE_H
#define EIGEN_SPARSECHOLESKY_MODULE_H
#include "SparseCore"
#include "OrderingMethods"
#include "src/Core/util/DisableStupidWarnings.h"
/**
* \defgroup SparseCholesky_Module SparseCholesky module
*
* This module currently provides two variants of the direct sparse Cholesky decomposition for selfadjoint (hermitian)
* matrices. Those decompositions are accessible via the following classes:
* - SimplicialLLt,
* - SimplicialLDLt
*
* Such problems can also be solved using the ConjugateGradient solver from the IterativeLinearSolvers module.
*
* \code
* #include <Eigen/SparseCholesky>
* \endcode
*/
// IWYU pragma: begin_exports
#include "src/SparseCholesky/SimplicialCholesky.h"
#include "src/SparseCholesky/SimplicialCholesky_impl.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_SPARSECHOLESKY_MODULE_H

70
Eigen/SparseCore Normal file
View File

@@ -0,0 +1,70 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SPARSECORE_MODULE_H
#define EIGEN_SPARSECORE_MODULE_H
#include "Core"
#include "src/Core/util/DisableStupidWarnings.h"
#include <vector>
#include <map>
#include <cstdlib>
#include <cstring>
#include <algorithm>
#include <numeric>
/**
* \defgroup SparseCore_Module SparseCore module
*
* This module provides a sparse matrix representation, and basic associated matrix manipulations
* and operations.
*
* See the \ref TutorialSparse "Sparse tutorial"
*
* \code
* #include <Eigen/SparseCore>
* \endcode
*
* This module depends on: Core.
*/
// IWYU pragma: begin_exports
#include "src/SparseCore/SparseUtil.h"
#include "src/SparseCore/SparseMatrixBase.h"
#include "src/SparseCore/SparseAssign.h"
#include "src/SparseCore/CompressedStorage.h"
#include "src/SparseCore/AmbiVector.h"
#include "src/SparseCore/SparseCompressedBase.h"
#include "src/SparseCore/SparseMatrix.h"
#include "src/SparseCore/SparseMap.h"
#include "src/SparseCore/SparseVector.h"
#include "src/SparseCore/SparseRef.h"
#include "src/SparseCore/SparseCwiseUnaryOp.h"
#include "src/SparseCore/SparseCwiseBinaryOp.h"
#include "src/SparseCore/SparseTranspose.h"
#include "src/SparseCore/SparseBlock.h"
#include "src/SparseCore/SparseDot.h"
#include "src/SparseCore/SparseRedux.h"
#include "src/SparseCore/SparseView.h"
#include "src/SparseCore/SparseDiagonalProduct.h"
#include "src/SparseCore/ConservativeSparseSparseProduct.h"
#include "src/SparseCore/SparseSparseProductWithPruning.h"
#include "src/SparseCore/SparseProduct.h"
#include "src/SparseCore/SparseDenseProduct.h"
#include "src/SparseCore/SparseSelfAdjointView.h"
#include "src/SparseCore/SparseTriangularView.h"
#include "src/SparseCore/TriangularSolver.h"
#include "src/SparseCore/SparsePermutation.h"
#include "src/SparseCore/SparseFuzzy.h"
#include "src/SparseCore/SparseSolverBase.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_SPARSECORE_MODULE_H

50
Eigen/SparseLU Normal file
View File

@@ -0,0 +1,50 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2012 Désiré Nuentsa-Wakam <desire.nuentsa_wakam@inria.fr>
// Copyright (C) 2012 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SPARSELU_MODULE_H
#define EIGEN_SPARSELU_MODULE_H
#include "SparseCore"
/**
* \defgroup SparseLU_Module SparseLU module
* This module defines a supernodal factorization of general sparse matrices.
* The code is fully optimized for supernode-panel updates with specialized kernels.
* Please, see the documentation of the SparseLU class for more details.
*/
// Ordering interface
#include "OrderingMethods"
#include "src/Core/util/DisableStupidWarnings.h"
// IWYU pragma: begin_exports
#include "src/SparseLU/SparseLU_Structs.h"
#include "src/SparseLU/SparseLU_SupernodalMatrix.h"
#include "src/SparseLU/SparseLUImpl.h"
#include "src/SparseCore/SparseColEtree.h"
#include "src/SparseLU/SparseLU_Memory.h"
#include "src/SparseLU/SparseLU_heap_relax_snode.h"
#include "src/SparseLU/SparseLU_relax_snode.h"
#include "src/SparseLU/SparseLU_pivotL.h"
#include "src/SparseLU/SparseLU_panel_dfs.h"
#include "src/SparseLU/SparseLU_kernel_bmod.h"
#include "src/SparseLU/SparseLU_panel_bmod.h"
#include "src/SparseLU/SparseLU_column_dfs.h"
#include "src/SparseLU/SparseLU_column_bmod.h"
#include "src/SparseLU/SparseLU_copy_to_ucol.h"
#include "src/SparseLU/SparseLU_pruneL.h"
#include "src/SparseLU/SparseLU_Utils.h"
#include "src/SparseLU/SparseLU.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_SPARSELU_MODULE_H

38
Eigen/SparseQR Normal file
View File

@@ -0,0 +1,38 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SPARSEQR_MODULE_H
#define EIGEN_SPARSEQR_MODULE_H
#include "SparseCore"
#include "OrderingMethods"
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup SparseQR_Module SparseQR module
* \brief Provides QR decomposition for sparse matrices
*
* This module provides a simplicial version of the left-looking Sparse QR decomposition.
* The columns of the input matrix should be reordered to limit the fill-in during the
* decomposition. Built-in methods (COLAMD, AMD) or external methods (METIS) can be used to this end.
* See the \link OrderingMethods_Module OrderingMethods\endlink module for the list
* of built-in and external ordering methods.
*
* \code
* #include <Eigen/SparseQR>
* \endcode
*
*
*/
// IWYU pragma: begin_exports
#include "src/SparseCore/SparseColEtree.h"
#include "src/SparseQR/SparseQR.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif

30
Eigen/StdDeque Normal file
View File

@@ -0,0 +1,30 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2009 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2009 Hauke Heibel <hauke.heibel@googlemail.com>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_STDDEQUE_MODULE_H
#define EIGEN_STDDEQUE_MODULE_H
#include "Core"
#include <deque>
#if EIGEN_COMP_MSVC && EIGEN_OS_WIN64 && \
(EIGEN_MAX_STATIC_ALIGN_BYTES <= 16) /* MSVC auto aligns up to 16 bytes in 64 bit builds */
#define EIGEN_DEFINE_STL_DEQUE_SPECIALIZATION(...)
#else
// IWYU pragma: begin_exports
#include "src/StlSupport/StdDeque.h"
// IWYU pragma: end_exports
#endif
#endif // EIGEN_STDDEQUE_MODULE_H

29
Eigen/StdList Normal file
View File

@@ -0,0 +1,29 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2009 Hauke Heibel <hauke.heibel@googlemail.com>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_STDLIST_MODULE_H
#define EIGEN_STDLIST_MODULE_H
#include "Core"
#include <list>
#if EIGEN_COMP_MSVC && EIGEN_OS_WIN64 && \
(EIGEN_MAX_STATIC_ALIGN_BYTES <= 16) /* MSVC auto aligns up to 16 bytes in 64 bit builds */
#define EIGEN_DEFINE_STL_LIST_SPECIALIZATION(...)
#else
// IWYU pragma: begin_exports
#include "src/StlSupport/StdList.h"
// IWYU pragma: end_exports
#endif
#endif // EIGEN_STDLIST_MODULE_H

30
Eigen/StdVector Normal file
View File

@@ -0,0 +1,30 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2009 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2009 Hauke Heibel <hauke.heibel@googlemail.com>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_STDVECTOR_MODULE_H
#define EIGEN_STDVECTOR_MODULE_H
#include "Core"
#include <vector>
#if EIGEN_COMP_MSVC && EIGEN_OS_WIN64 && \
(EIGEN_MAX_STATIC_ALIGN_BYTES <= 16) /* MSVC auto aligns up to 16 bytes in 64 bit builds */
#define EIGEN_DEFINE_STL_VECTOR_SPECIALIZATION(...)
#else
// IWYU pragma: begin_exports
#include "src/StlSupport/StdVector.h"
// IWYU pragma: end_exports
#endif
#endif // EIGEN_STDVECTOR_MODULE_H

70
Eigen/SuperLUSupport Normal file
View File

@@ -0,0 +1,70 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_SUPERLUSUPPORT_MODULE_H
#define EIGEN_SUPERLUSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
#ifdef EMPTY
#define EIGEN_EMPTY_WAS_ALREADY_DEFINED
#endif
typedef int int_t;
#include <slu_Cnames.h>
#include <supermatrix.h>
#include <slu_util.h>
// slu_util.h defines a preprocessor token named EMPTY which is really polluting,
// so we remove it in favor of a SUPERLU_EMPTY token.
// If EMPTY was already defined then we don't undef it.
#if defined(EIGEN_EMPTY_WAS_ALREADY_DEFINED)
#undef EIGEN_EMPTY_WAS_ALREADY_DEFINED
#elif defined(EMPTY)
#undef EMPTY
#endif
#define SUPERLU_EMPTY (-1)
namespace Eigen {
struct SluMatrix;
}
/** \ingroup Support_modules
* \defgroup SuperLUSupport_Module SuperLUSupport module
*
* This module provides an interface to the <a href="http://crd-legacy.lbl.gov/~xiaoye/SuperLU/">SuperLU</a> library.
* It provides the following factorization class:
* - class SuperLU: a supernodal sequential LU factorization.
* - class SuperILU: a supernodal sequential incomplete LU factorization (to be used as a preconditioner for iterative
* methods).
*
* \warning This wrapper requires at least versions 4.0 of SuperLU. The 3.x versions are not supported.
*
* \warning When including this module, you have to use SUPERLU_EMPTY instead of EMPTY which is no longer defined
* because it is too polluting.
*
* \code
* #include <Eigen/SuperLUSupport>
* \endcode
*
* In order to use this module, the superlu headers must be accessible from the include paths, and your binary must be
* linked to the superlu library and its dependencies. The dependencies depend on how superlu has been compiled. For a
* cmake based project, you can use our FindSuperLU.cmake module to help you in this task.
*
*/
// IWYU pragma: begin_exports
#include "src/SuperLUSupport/SuperLUSupport.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_SUPERLUSUPPORT_MODULE_H

80
Eigen/ThreadPool Normal file
View File

@@ -0,0 +1,80 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2016 Benoit Steiner <benoit.steiner.goog@gmail.com>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_THREADPOOL_MODULE_H
#define EIGEN_THREADPOOL_MODULE_H
#include "Core"
#include "src/Core/util/DisableStupidWarnings.h"
/** \defgroup ThreadPool_Module ThreadPool Module
*
* This module provides 2 threadpool implementations
* - a simple reference implementation
* - a faster non blocking implementation
*
* \code
* #include <Eigen/ThreadPool>
* \endcode
*/
#include <cstddef>
#include <cstring>
#include <time.h>
#include <vector>
#include <atomic>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <functional>
#include <memory>
#include <utility>
// There are non-parenthesized calls to "max" in the <unordered_map> header,
// which trigger a check in test/main.h causing compilation to fail.
// We work around the check here by removing the check for max in
// the case where we have to emulate thread_local.
#ifdef max
#undef max
#endif
#include <unordered_map>
#include "src/Core/util/Meta.h"
#include "src/Core/util/MaxSizeVector.h"
#ifndef EIGEN_MUTEX
#define EIGEN_MUTEX std::mutex
#endif
#ifndef EIGEN_MUTEX_LOCK
#define EIGEN_MUTEX_LOCK std::unique_lock<std::mutex>
#endif
#ifndef EIGEN_CONDVAR
#define EIGEN_CONDVAR std::condition_variable
#endif
// IWYU pragma: begin_exports
#include "src/ThreadPool/ThreadLocal.h"
#include "src/ThreadPool/ThreadYield.h"
#include "src/ThreadPool/ThreadCancel.h"
#include "src/ThreadPool/EventCount.h"
#include "src/ThreadPool/RunQueue.h"
#include "src/ThreadPool/ThreadPoolInterface.h"
#include "src/ThreadPool/ThreadEnvironment.h"
#include "src/ThreadPool/Barrier.h"
#include "src/ThreadPool/NonBlockingThreadPool.h"
#include "src/ThreadPool/CoreThreadPoolDevice.h"
#include "src/ThreadPool/ForkJoin.h"
// IWYU pragma: end_exports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_CXX11_THREADPOOL_MODULE_H

42
Eigen/UmfPackSupport Normal file
View File

@@ -0,0 +1,42 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_UMFPACKSUPPORT_MODULE_H
#define EIGEN_UMFPACKSUPPORT_MODULE_H
#include "SparseCore"
#include "src/Core/util/DisableStupidWarnings.h"
extern "C" {
#include <umfpack.h>
}
/** \ingroup Support_modules
* \defgroup UmfPackSupport_Module UmfPackSupport module
*
* This module provides an interface to the UmfPack library which is part of the <a
* href="http://www.suitesparse.com">suitesparse</a> package. It provides the following factorization class:
* - class UmfPackLU: a multifrontal sequential LU factorization.
*
* \code
* #include <Eigen/UmfPackSupport>
* \endcode
*
* In order to use this module, the umfpack headers must be accessible from the include paths, and your binary must be
* linked to the umfpack library and its dependencies. The dependencies depend on how umfpack has been compiled. For a
* cmake based project, you can use our FindUmfPack.cmake module to help you in this task.
*
*/
// IWYU pragma: begin_exports
#include "src/UmfPackSupport/UmfPackSupport.h"
// IWYU pragma: endexports
#include "src/Core/util/ReenableStupidWarnings.h"
#endif // EIGEN_UMFPACKSUPPORT_MODULE_H

14
Eigen/Version Normal file
View File

@@ -0,0 +1,14 @@
#ifndef EIGEN_VERSION_H
#define EIGEN_VERSION_H
// The "WORLD" version will forever remain "3" for the "Eigen3" library.
#define EIGEN_WORLD_VERSION 3
// As of Eigen3 5.0.0, we have moved to Semantic Versioning (semver.org).
#define EIGEN_MAJOR_VERSION 5
#define EIGEN_MINOR_VERSION 0
#define EIGEN_PATCH_VERSION 1
#define EIGEN_PRERELEASE_VERSION ""
#define EIGEN_BUILD_VERSION ""
#define EIGEN_VERSION_STRING "5.0.1"
#endif // EIGEN_VERSION_H

View File

@@ -0,0 +1,423 @@
#ifndef EIGEN_ACCELERATESUPPORT_H
#define EIGEN_ACCELERATESUPPORT_H
#include <Accelerate/Accelerate.h>
#include <Eigen/Sparse>
namespace Eigen {
template <typename MatrixType_, int UpLo_, SparseFactorization_t Solver_, bool EnforceSquare_>
class AccelerateImpl;
/** \ingroup AccelerateSupport_Module
* \typedef AccelerateLLT
* \brief A direct Cholesky (LLT) factorization and solver based on Accelerate
*
* \warning Only single and double precision real scalar types are supported by Accelerate
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ additional information about the matrix structure. Default is Lower.
*
* \sa \ref TutorialSparseSolverConcept, class AccelerateLLT
*/
template <typename MatrixType, int UpLo = Lower>
using AccelerateLLT = AccelerateImpl<MatrixType, UpLo | Symmetric, SparseFactorizationCholesky, true>;
/** \ingroup AccelerateSupport_Module
* \typedef AccelerateLDLT
* \brief The default Cholesky (LDLT) factorization and solver based on Accelerate
*
* \warning Only single and double precision real scalar types are supported by Accelerate
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ additional information about the matrix structure. Default is Lower.
*
* \sa \ref TutorialSparseSolverConcept, class AccelerateLDLT
*/
template <typename MatrixType, int UpLo = Lower>
using AccelerateLDLT = AccelerateImpl<MatrixType, UpLo | Symmetric, SparseFactorizationLDLT, true>;
/** \ingroup AccelerateSupport_Module
* \typedef AccelerateLDLTUnpivoted
* \brief A direct Cholesky-like LDL^T factorization and solver based on Accelerate with only 1x1 pivots and no pivoting
*
* \warning Only single and double precision real scalar types are supported by Accelerate
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ additional information about the matrix structure. Default is Lower.
*
* \sa \ref TutorialSparseSolverConcept, class AccelerateLDLTUnpivoted
*/
template <typename MatrixType, int UpLo = Lower>
using AccelerateLDLTUnpivoted = AccelerateImpl<MatrixType, UpLo | Symmetric, SparseFactorizationLDLTUnpivoted, true>;
/** \ingroup AccelerateSupport_Module
* \typedef AccelerateLDLTSBK
* \brief A direct Cholesky (LDLT) factorization and solver based on Accelerate with Supernode Bunch-Kaufman and static
* pivoting
*
* \warning Only single and double precision real scalar types are supported by Accelerate
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ additional information about the matrix structure. Default is Lower.
*
* \sa \ref TutorialSparseSolverConcept, class AccelerateLDLTSBK
*/
template <typename MatrixType, int UpLo = Lower>
using AccelerateLDLTSBK = AccelerateImpl<MatrixType, UpLo | Symmetric, SparseFactorizationLDLTSBK, true>;
/** \ingroup AccelerateSupport_Module
* \typedef AccelerateLDLTTPP
* \brief A direct Cholesky (LDLT) factorization and solver based on Accelerate with full threshold partial pivoting
*
* \warning Only single and double precision real scalar types are supported by Accelerate
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ additional information about the matrix structure. Default is Lower.
*
* \sa \ref TutorialSparseSolverConcept, class AccelerateLDLTTPP
*/
template <typename MatrixType, int UpLo = Lower>
using AccelerateLDLTTPP = AccelerateImpl<MatrixType, UpLo | Symmetric, SparseFactorizationLDLTTPP, true>;
/** \ingroup AccelerateSupport_Module
* \typedef AccelerateQR
* \brief A QR factorization and solver based on Accelerate
*
* \warning Only single and double precision real scalar types are supported by Accelerate
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
*
* \sa \ref TutorialSparseSolverConcept, class AccelerateQR
*/
template <typename MatrixType>
using AccelerateQR = AccelerateImpl<MatrixType, 0, SparseFactorizationQR, false>;
/** \ingroup AccelerateSupport_Module
* \typedef AccelerateCholeskyAtA
* \brief A QR factorization and solver based on Accelerate without storing Q (equivalent to A^TA = R^T R)
*
* \warning Only single and double precision real scalar types are supported by Accelerate
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
*
* \sa \ref TutorialSparseSolverConcept, class AccelerateCholeskyAtA
*/
template <typename MatrixType>
using AccelerateCholeskyAtA = AccelerateImpl<MatrixType, 0, SparseFactorizationCholeskyAtA, false>;
namespace internal {
template <typename T>
struct AccelFactorizationDeleter {
void operator()(T* sym) {
if (sym) {
SparseCleanup(*sym);
delete sym;
sym = nullptr;
}
}
};
template <typename DenseVecT, typename DenseMatT, typename SparseMatT, typename NumFactT>
struct SparseTypesTraitBase {
typedef DenseVecT AccelDenseVector;
typedef DenseMatT AccelDenseMatrix;
typedef SparseMatT AccelSparseMatrix;
typedef SparseOpaqueSymbolicFactorization SymbolicFactorization;
typedef NumFactT NumericFactorization;
typedef AccelFactorizationDeleter<SymbolicFactorization> SymbolicFactorizationDeleter;
typedef AccelFactorizationDeleter<NumericFactorization> NumericFactorizationDeleter;
};
template <typename Scalar>
struct SparseTypesTrait {};
template <>
struct SparseTypesTrait<double> : SparseTypesTraitBase<DenseVector_Double, DenseMatrix_Double, SparseMatrix_Double,
SparseOpaqueFactorization_Double> {};
template <>
struct SparseTypesTrait<float>
: SparseTypesTraitBase<DenseVector_Float, DenseMatrix_Float, SparseMatrix_Float, SparseOpaqueFactorization_Float> {
};
} // end namespace internal
template <typename MatrixType_, int UpLo_, SparseFactorization_t Solver_, bool EnforceSquare_>
class AccelerateImpl : public SparseSolverBase<AccelerateImpl<MatrixType_, UpLo_, Solver_, EnforceSquare_> > {
protected:
using Base = SparseSolverBase<AccelerateImpl>;
using Base::derived;
using Base::m_isInitialized;
public:
using Base::_solve_impl;
typedef MatrixType_ MatrixType;
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::StorageIndex StorageIndex;
enum { ColsAtCompileTime = Dynamic, MaxColsAtCompileTime = Dynamic };
enum { UpLo = UpLo_ };
using AccelDenseVector = typename internal::SparseTypesTrait<Scalar>::AccelDenseVector;
using AccelDenseMatrix = typename internal::SparseTypesTrait<Scalar>::AccelDenseMatrix;
using AccelSparseMatrix = typename internal::SparseTypesTrait<Scalar>::AccelSparseMatrix;
using SymbolicFactorization = typename internal::SparseTypesTrait<Scalar>::SymbolicFactorization;
using NumericFactorization = typename internal::SparseTypesTrait<Scalar>::NumericFactorization;
using SymbolicFactorizationDeleter = typename internal::SparseTypesTrait<Scalar>::SymbolicFactorizationDeleter;
using NumericFactorizationDeleter = typename internal::SparseTypesTrait<Scalar>::NumericFactorizationDeleter;
AccelerateImpl() {
m_isInitialized = false;
auto check_flag_set = [](int value, int flag) { return ((value & flag) == flag); };
if (check_flag_set(UpLo_, Symmetric)) {
m_sparseKind = SparseSymmetric;
m_triType = (UpLo_ & Lower) ? SparseLowerTriangle : SparseUpperTriangle;
} else if (check_flag_set(UpLo_, UnitLower)) {
m_sparseKind = SparseUnitTriangular;
m_triType = SparseLowerTriangle;
} else if (check_flag_set(UpLo_, UnitUpper)) {
m_sparseKind = SparseUnitTriangular;
m_triType = SparseUpperTriangle;
} else if (check_flag_set(UpLo_, StrictlyLower)) {
m_sparseKind = SparseTriangular;
m_triType = SparseLowerTriangle;
} else if (check_flag_set(UpLo_, StrictlyUpper)) {
m_sparseKind = SparseTriangular;
m_triType = SparseUpperTriangle;
} else if (check_flag_set(UpLo_, Lower)) {
m_sparseKind = SparseTriangular;
m_triType = SparseLowerTriangle;
} else if (check_flag_set(UpLo_, Upper)) {
m_sparseKind = SparseTriangular;
m_triType = SparseUpperTriangle;
} else {
m_sparseKind = SparseOrdinary;
m_triType = (UpLo_ & Lower) ? SparseLowerTriangle : SparseUpperTriangle;
}
m_order = SparseOrderDefault;
}
explicit AccelerateImpl(const MatrixType& matrix) : AccelerateImpl() { compute(matrix); }
~AccelerateImpl() {}
inline Index cols() const { return m_nCols; }
inline Index rows() const { return m_nRows; }
ComputationInfo info() const {
eigen_assert(m_isInitialized && "Decomposition is not initialized.");
return m_info;
}
void analyzePattern(const MatrixType& matrix);
void factorize(const MatrixType& matrix);
void compute(const MatrixType& matrix);
template <typename Rhs, typename Dest>
void _solve_impl(const MatrixBase<Rhs>& b, MatrixBase<Dest>& dest) const;
/** Sets the ordering algorithm to use. */
void setOrder(SparseOrder_t order) { m_order = order; }
private:
template <typename T>
void buildAccelSparseMatrix(const SparseMatrix<T>& a, AccelSparseMatrix& A, std::vector<long>& columnStarts) {
const Index nColumnsStarts = a.cols() + 1;
columnStarts.resize(nColumnsStarts);
for (Index i = 0; i < nColumnsStarts; i++) columnStarts[i] = a.outerIndexPtr()[i];
SparseAttributes_t attributes{};
attributes.transpose = false;
attributes.triangle = m_triType;
attributes.kind = m_sparseKind;
SparseMatrixStructure structure{};
structure.attributes = attributes;
structure.rowCount = static_cast<int>(a.rows());
structure.columnCount = static_cast<int>(a.cols());
structure.blockSize = 1;
structure.columnStarts = columnStarts.data();
structure.rowIndices = const_cast<int*>(a.innerIndexPtr());
A.structure = structure;
A.data = const_cast<T*>(a.valuePtr());
}
void doAnalysis(AccelSparseMatrix& A) {
m_numericFactorization.reset(nullptr);
SparseSymbolicFactorOptions opts{};
opts.control = SparseDefaultControl;
opts.orderMethod = m_order;
opts.order = nullptr;
opts.ignoreRowsAndColumns = nullptr;
opts.malloc = malloc;
opts.free = free;
opts.reportError = nullptr;
m_symbolicFactorization.reset(new SymbolicFactorization(SparseFactor(Solver_, A.structure, opts)));
SparseStatus_t status = m_symbolicFactorization->status;
updateInfoStatus(status);
if (status != SparseStatusOK) m_symbolicFactorization.reset(nullptr);
}
void doFactorization(AccelSparseMatrix& A) {
SparseStatus_t status = SparseStatusReleased;
if (m_symbolicFactorization) {
m_numericFactorization.reset(new NumericFactorization(SparseFactor(*m_symbolicFactorization, A)));
status = m_numericFactorization->status;
if (status != SparseStatusOK) m_numericFactorization.reset(nullptr);
}
updateInfoStatus(status);
}
protected:
void updateInfoStatus(SparseStatus_t status) const {
switch (status) {
case SparseStatusOK:
m_info = Success;
break;
case SparseFactorizationFailed:
case SparseMatrixIsSingular:
m_info = NumericalIssue;
break;
case SparseInternalError:
case SparseParameterError:
case SparseStatusReleased:
default:
m_info = InvalidInput;
break;
}
}
mutable ComputationInfo m_info;
Index m_nRows, m_nCols;
std::unique_ptr<SymbolicFactorization, SymbolicFactorizationDeleter> m_symbolicFactorization;
std::unique_ptr<NumericFactorization, NumericFactorizationDeleter> m_numericFactorization;
SparseKind_t m_sparseKind;
SparseTriangle_t m_triType;
SparseOrder_t m_order;
};
/** Computes the symbolic and numeric decomposition of matrix \a a */
template <typename MatrixType_, int UpLo_, SparseFactorization_t Solver_, bool EnforceSquare_>
void AccelerateImpl<MatrixType_, UpLo_, Solver_, EnforceSquare_>::compute(const MatrixType& a) {
if (EnforceSquare_) eigen_assert(a.rows() == a.cols());
m_nRows = a.rows();
m_nCols = a.cols();
AccelSparseMatrix A{};
std::vector<long> columnStarts;
buildAccelSparseMatrix(a, A, columnStarts);
doAnalysis(A);
if (m_symbolicFactorization) doFactorization(A);
m_isInitialized = true;
}
/** Performs a symbolic decomposition on the sparsity pattern of matrix \a a.
*
* This function is particularly useful when solving for several problems having the same structure.
*
* \sa factorize()
*/
template <typename MatrixType_, int UpLo_, SparseFactorization_t Solver_, bool EnforceSquare_>
void AccelerateImpl<MatrixType_, UpLo_, Solver_, EnforceSquare_>::analyzePattern(const MatrixType& a) {
if (EnforceSquare_) eigen_assert(a.rows() == a.cols());
m_nRows = a.rows();
m_nCols = a.cols();
AccelSparseMatrix A{};
std::vector<long> columnStarts;
buildAccelSparseMatrix(a, A, columnStarts);
doAnalysis(A);
m_isInitialized = true;
}
/** Performs a numeric decomposition of matrix \a a.
*
* The given matrix must have the same sparsity pattern as the matrix on which the symbolic decomposition has been
* performed.
*
* \sa analyzePattern()
*/
template <typename MatrixType_, int UpLo_, SparseFactorization_t Solver_, bool EnforceSquare_>
void AccelerateImpl<MatrixType_, UpLo_, Solver_, EnforceSquare_>::factorize(const MatrixType& a) {
eigen_assert(m_symbolicFactorization && "You must first call analyzePattern()");
eigen_assert(m_nRows == a.rows() && m_nCols == a.cols());
if (EnforceSquare_) eigen_assert(a.rows() == a.cols());
AccelSparseMatrix A{};
std::vector<long> columnStarts;
buildAccelSparseMatrix(a, A, columnStarts);
doFactorization(A);
}
template <typename MatrixType_, int UpLo_, SparseFactorization_t Solver_, bool EnforceSquare_>
template <typename Rhs, typename Dest>
void AccelerateImpl<MatrixType_, UpLo_, Solver_, EnforceSquare_>::_solve_impl(const MatrixBase<Rhs>& b,
MatrixBase<Dest>& x) const {
if (!m_numericFactorization) {
m_info = InvalidInput;
return;
}
eigen_assert(m_nRows == b.rows());
eigen_assert(((b.cols() == 1) || b.outerStride() == b.rows()));
SparseStatus_t status = SparseStatusOK;
Scalar* b_ptr = const_cast<Scalar*>(b.derived().data());
Scalar* x_ptr = const_cast<Scalar*>(x.derived().data());
AccelDenseMatrix xmat{};
xmat.attributes = SparseAttributes_t();
xmat.columnCount = static_cast<int>(x.cols());
xmat.rowCount = static_cast<int>(x.rows());
xmat.columnStride = xmat.rowCount;
xmat.data = x_ptr;
AccelDenseMatrix bmat{};
bmat.attributes = SparseAttributes_t();
bmat.columnCount = static_cast<int>(b.cols());
bmat.rowCount = static_cast<int>(b.rows());
bmat.columnStride = bmat.rowCount;
bmat.data = b_ptr;
SparseSolve(*m_numericFactorization, bmat, xmat);
updateInfoStatus(status);
}
} // end namespace Eigen
#endif // EIGEN_ACCELERATESUPPORT_H

View File

@@ -0,0 +1,3 @@
#ifndef EIGEN_ACCELERATESUPPORT_MODULE_H
#error "Please include Eigen/AccelerateSupport instead of including headers inside the src directory directly."
#endif

View File

@@ -1,133 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_ALLANDANY_H
#define EIGEN_ALLANDANY_H
template<typename Derived, int UnrollCount>
struct ei_all_unroller
{
enum {
col = (UnrollCount-1) / Derived::RowsAtCompileTime,
row = (UnrollCount-1) % Derived::RowsAtCompileTime
};
inline static bool run(const Derived &mat)
{
return ei_all_unroller<Derived, UnrollCount-1>::run(mat) && mat.coeff(row, col);
}
};
template<typename Derived>
struct ei_all_unroller<Derived, 1>
{
inline static bool run(const Derived &mat) { return mat.coeff(0, 0); }
};
template<typename Derived>
struct ei_all_unroller<Derived, Dynamic>
{
inline static bool run(const Derived &) { return false; }
};
template<typename Derived, int UnrollCount>
struct ei_any_unroller
{
enum {
col = (UnrollCount-1) / Derived::RowsAtCompileTime,
row = (UnrollCount-1) % Derived::RowsAtCompileTime
};
inline static bool run(const Derived &mat)
{
return ei_any_unroller<Derived, UnrollCount-1>::run(mat) || mat.coeff(row, col);
}
};
template<typename Derived>
struct ei_any_unroller<Derived, 1>
{
inline static bool run(const Derived &mat) { return mat.coeff(0, 0); }
};
template<typename Derived>
struct ei_any_unroller<Derived, Dynamic>
{
inline static bool run(const Derived &) { return false; }
};
/** \array_module
*
* \returns true if all coefficients are true
*
* \addexample CwiseAll \label How to check whether a point is inside a box (using operator< and all())
*
* Example: \include MatrixBase_all.cpp
* Output: \verbinclude MatrixBase_all.out
*
* \sa MatrixBase::any(), Cwise::operator<()
*/
template<typename Derived>
inline bool MatrixBase<Derived>::all(void) const
{
const bool unroll = SizeAtCompileTime * (CoeffReadCost + NumTraits<Scalar>::AddCost)
<= EIGEN_UNROLLING_LIMIT;
if(unroll)
return ei_all_unroller<Derived,
unroll ? int(SizeAtCompileTime) : Dynamic
>::run(derived());
else
{
for(int j = 0; j < cols(); j++)
for(int i = 0; i < rows(); i++)
if (!coeff(i, j)) return false;
return true;
}
}
/** \array_module
*
* \returns true if at least one coefficient is true
*
* \sa MatrixBase::all()
*/
template<typename Derived>
inline bool MatrixBase<Derived>::any(void) const
{
const bool unroll = SizeAtCompileTime * (CoeffReadCost + NumTraits<Scalar>::AddCost)
<= EIGEN_UNROLLING_LIMIT;
if(unroll)
return ei_any_unroller<Derived,
unroll ? int(SizeAtCompileTime) : Dynamic
>::run(derived());
else
{
for(int j = 0; j < cols(); j++)
for(int i = 0; i < rows(); i++)
if (coeff(i, j)) return true;
return false;
}
}
#endif // EIGEN_ALLANDANY_H

View File

@@ -1,6 +0,0 @@
FILE(GLOB Eigen_Array_SRCS "*.h")
INSTALL(FILES
${Eigen_Array_SRCS}
DESTINATION ${INCLUDE_INSTALL_DIR}/Eigen/src/Array
)

View File

@@ -1,453 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_ARRAY_CWISE_OPERATORS_H
#define EIGEN_ARRAY_CWISE_OPERATORS_H
// -- unary operators --
/** \array_module
*
* \returns an expression of the coefficient-wise square root of *this.
*
* Example: \include Cwise_sqrt.cpp
* Output: \verbinclude Cwise_sqrt.out
*
* \sa pow(), square()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_sqrt_op)
Cwise<ExpressionType>::sqrt() const
{
return _expression();
}
/** \array_module
*
* \returns an expression of the coefficient-wise exponential of *this.
*
* Example: \include Cwise_exp.cpp
* Output: \verbinclude Cwise_exp.out
*
* \sa pow(), log(), sin(), cos()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_exp_op)
Cwise<ExpressionType>::exp() const
{
return _expression();
}
/** \array_module
*
* \returns an expression of the coefficient-wise logarithm of *this.
*
* Example: \include Cwise_log.cpp
* Output: \verbinclude Cwise_log.out
*
* \sa exp()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_log_op)
Cwise<ExpressionType>::log() const
{
return _expression();
}
/** \array_module
*
* \returns an expression of the coefficient-wise cosine of *this.
*
* Example: \include Cwise_cos.cpp
* Output: \verbinclude Cwise_cos.out
*
* \sa sin(), exp()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_cos_op)
Cwise<ExpressionType>::cos() const
{
return _expression();
}
/** \array_module
*
* \returns an expression of the coefficient-wise sine of *this.
*
* Example: \include Cwise_sin.cpp
* Output: \verbinclude Cwise_sin.out
*
* \sa cos(), exp()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_sin_op)
Cwise<ExpressionType>::sin() const
{
return _expression();
}
/** \array_module
*
* \returns an expression of the coefficient-wise power of *this to the given exponent.
*
* Example: \include Cwise_pow.cpp
* Output: \verbinclude Cwise_pow.out
*
* \sa exp(), log()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_pow_op)
Cwise<ExpressionType>::pow(const Scalar& exponent) const
{
return EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_pow_op)(_expression(), ei_scalar_pow_op<Scalar>(exponent));
}
/** \array_module
*
* \returns an expression of the coefficient-wise inverse of *this.
*
* Example: \include Cwise_inverse.cpp
* Output: \verbinclude Cwise_inverse.out
*
* \sa operator/(), operator*()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_inverse_op)
Cwise<ExpressionType>::inverse() const
{
return _expression();
}
/** \array_module
*
* \returns an expression of the coefficient-wise square of *this.
*
* Example: \include Cwise_square.cpp
* Output: \verbinclude Cwise_square.out
*
* \sa operator/(), operator*(), abs2()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_square_op)
Cwise<ExpressionType>::square() const
{
return _expression();
}
/** \array_module
*
* \returns an expression of the coefficient-wise cube of *this.
*
* Example: \include Cwise_cube.cpp
* Output: \verbinclude Cwise_cube.out
*
* \sa square(), pow()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_cube_op)
Cwise<ExpressionType>::cube() const
{
return _expression();
}
// -- binary operators --
/** \array_module
*
* \returns an expression of the coefficient-wise \< operator of *this and \a other
*
* Example: \include Cwise_less.cpp
* Output: \verbinclude Cwise_less.out
*
* \sa MatrixBase::all(), MatrixBase::any(), operator>(), operator<=()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(std::less)
Cwise<ExpressionType>::operator<(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(std::less)(_expression(), other.derived());
}
/** \array_module
*
* \returns an expression of the coefficient-wise \<= operator of *this and \a other
*
* Example: \include Cwise_less_equal.cpp
* Output: \verbinclude Cwise_less_equal.out
*
* \sa MatrixBase::all(), MatrixBase::any(), operator>=(), operator<()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(std::less_equal)
Cwise<ExpressionType>::operator<=(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(std::less_equal)(_expression(), other.derived());
}
/** \array_module
*
* \returns an expression of the coefficient-wise \> operator of *this and \a other
*
* Example: \include Cwise_greater.cpp
* Output: \verbinclude Cwise_greater.out
*
* \sa MatrixBase::all(), MatrixBase::any(), operator>=(), operator<()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(std::greater)
Cwise<ExpressionType>::operator>(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(std::greater)(_expression(), other.derived());
}
/** \array_module
*
* \returns an expression of the coefficient-wise \>= operator of *this and \a other
*
* Example: \include Cwise_greater_equal.cpp
* Output: \verbinclude Cwise_greater_equal.out
*
* \sa MatrixBase::all(), MatrixBase::any(), operator>(), operator<=()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(std::greater_equal)
Cwise<ExpressionType>::operator>=(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(std::greater_equal)(_expression(), other.derived());
}
/** \array_module
*
* \returns an expression of the coefficient-wise == operator of *this and \a other
*
* \warning this performs an exact comparison, which is generally a bad idea with floating-point types.
* In order to check for equality between two vectors or matrices with floating-point coefficients, it is
* generally a far better idea to use a fuzzy comparison as provided by MatrixBase::isApprox() and
* MatrixBase::isMuchSmallerThan().
*
* Example: \include Cwise_equal_equal.cpp
* Output: \verbinclude Cwise_equal_equal.out
*
* \sa MatrixBase::all(), MatrixBase::any(), MatrixBase::isApprox(), MatrixBase::isMuchSmallerThan()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(std::equal_to)
Cwise<ExpressionType>::operator==(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(std::equal_to)(_expression(), other.derived());
}
/** \array_module
*
* \returns an expression of the coefficient-wise != operator of *this and \a other
*
* \warning this performs an exact comparison, which is generally a bad idea with floating-point types.
* In order to check for equality between two vectors or matrices with floating-point coefficients, it is
* generally a far better idea to use a fuzzy comparison as provided by MatrixBase::isApprox() and
* MatrixBase::isMuchSmallerThan().
*
* Example: \include Cwise_not_equal.cpp
* Output: \verbinclude Cwise_not_equal.out
*
* \sa MatrixBase::all(), MatrixBase::any(), MatrixBase::isApprox(), MatrixBase::isMuchSmallerThan()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(std::not_equal_to)
Cwise<ExpressionType>::operator!=(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(std::not_equal_to)(_expression(), other.derived());
}
// comparisons to scalar value
/** \array_module
*
* \returns an expression of the coefficient-wise \< operator of *this and a scalar \a s
*
* \sa operator<(const MatrixBase<OtherDerived> &) const
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::less)
Cwise<ExpressionType>::operator<(Scalar s) const
{
return EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::less)(_expression(),
typename ExpressionType::ConstantReturnType(_expression().rows(), _expression().cols(), s));
}
/** \array_module
*
* \returns an expression of the coefficient-wise \<= operator of *this and a scalar \a s
*
* \sa operator<=(const MatrixBase<OtherDerived> &) const
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::less_equal)
Cwise<ExpressionType>::operator<=(Scalar s) const
{
return EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::less_equal)(_expression(),
typename ExpressionType::ConstantReturnType(_expression().rows(), _expression().cols(), s));
}
/** \array_module
*
* \returns an expression of the coefficient-wise \> operator of *this and a scalar \a s
*
* \sa operator>(const MatrixBase<OtherDerived> &) const
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::greater)
Cwise<ExpressionType>::operator>(Scalar s) const
{
return EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::greater)(_expression(),
typename ExpressionType::ConstantReturnType(_expression().rows(), _expression().cols(), s));
}
/** \array_module
*
* \returns an expression of the coefficient-wise \>= operator of *this and a scalar \a s
*
* \sa operator>=(const MatrixBase<OtherDerived> &) const
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::greater_equal)
Cwise<ExpressionType>::operator>=(Scalar s) const
{
return EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::greater_equal)(_expression(),
typename ExpressionType::ConstantReturnType(_expression().rows(), _expression().cols(), s));
}
/** \array_module
*
* \returns an expression of the coefficient-wise == operator of *this and a scalar \a s
*
* \warning this performs an exact comparison, which is generally a bad idea with floating-point types.
* In order to check for equality between two vectors or matrices with floating-point coefficients, it is
* generally a far better idea to use a fuzzy comparison as provided by MatrixBase::isApprox() and
* MatrixBase::isMuchSmallerThan().
*
* \sa operator==(const MatrixBase<OtherDerived> &) const
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::equal_to)
Cwise<ExpressionType>::operator==(Scalar s) const
{
return EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::equal_to)(_expression(),
typename ExpressionType::ConstantReturnType(_expression().rows(), _expression().cols(), s));
}
/** \array_module
*
* \returns an expression of the coefficient-wise != operator of *this and a scalar \a s
*
* \warning this performs an exact comparison, which is generally a bad idea with floating-point types.
* In order to check for equality between two vectors or matrices with floating-point coefficients, it is
* generally a far better idea to use a fuzzy comparison as provided by MatrixBase::isApprox() and
* MatrixBase::isMuchSmallerThan().
*
* \sa operator!=(const MatrixBase<OtherDerived> &) const
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::not_equal_to)
Cwise<ExpressionType>::operator!=(Scalar s) const
{
return EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::not_equal_to)(_expression(),
typename ExpressionType::ConstantReturnType(_expression().rows(), _expression().cols(), s));
}
// scalar addition
/** \array_module
*
* \returns an expression of \c *this with each coeff incremented by the constant \a scalar
*
* Example: \include Cwise_plus.cpp
* Output: \verbinclude Cwise_plus.out
*
* \sa operator+=(), operator-()
*/
template<typename ExpressionType>
inline const typename Cwise<ExpressionType>::ScalarAddReturnType
Cwise<ExpressionType>::operator+(const Scalar& scalar) const
{
return typename Cwise<ExpressionType>::ScalarAddReturnType(m_matrix, ei_scalar_add_op<Scalar>(scalar));
}
/** \array_module
*
* Adds the given \a scalar to each coeff of this expression.
*
* Example: \include Cwise_plus_equal.cpp
* Output: \verbinclude Cwise_plus_equal.out
*
* \sa operator+(), operator-=()
*/
template<typename ExpressionType>
inline ExpressionType& Cwise<ExpressionType>::operator+=(const Scalar& scalar)
{
return m_matrix.const_cast_derived() = *this + scalar;
}
/** \array_module
*
* \returns an expression of \c *this with each coeff decremented by the constant \a scalar
*
* Example: \include Cwise_minus.cpp
* Output: \verbinclude Cwise_minus.out
*
* \sa operator+(), operator-=()
*/
template<typename ExpressionType>
inline const typename Cwise<ExpressionType>::ScalarAddReturnType
Cwise<ExpressionType>::operator-(const Scalar& scalar) const
{
return *this + (-scalar);
}
/** \array_module
*
* Substracts the given \a scalar from each coeff of this expression.
*
* Example: \include Cwise_minus_equal.cpp
* Output: \verbinclude Cwise_minus_equal.out
*
* \sa operator+=(), operator-()
*/
template<typename ExpressionType>
inline ExpressionType& Cwise<ExpressionType>::operator-=(const Scalar& scalar)
{
return m_matrix.const_cast_derived() = *this - scalar;
}
#endif // EIGEN_ARRAY_CWISE_OPERATORS_H

View File

@@ -1,306 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_ARRAY_FUNCTORS_H
#define EIGEN_ARRAY_FUNCTORS_H
/** \internal
* \array_module
*
* \brief Template functor to add a scalar to a fixed other one
*
* \sa class CwiseUnaryOp, Array::operator+
*/
/* If you wonder why doing the ei_pset1() in packetOp() is an optimization check ei_scalar_multiple_op */
template<typename Scalar>
struct ei_scalar_add_op {
typedef typename ei_packet_traits<Scalar>::type PacketScalar;
// FIXME default copy constructors seems bugged with std::complex<>
inline ei_scalar_add_op(const ei_scalar_add_op& other) : m_other(other.m_other) { }
inline ei_scalar_add_op(const Scalar& other) : m_other(other) { }
inline Scalar operator() (const Scalar& a) const { return a + m_other; }
inline const PacketScalar packetOp(const PacketScalar& a) const
{ return ei_padd(a, ei_pset1(m_other)); }
const Scalar m_other;
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_add_op<Scalar> >
{ enum { Cost = NumTraits<Scalar>::AddCost, PacketAccess = ei_packet_traits<Scalar>::size>1 }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the square root of a scalar
*
* \sa class CwiseUnaryOp, Cwise::sqrt()
*/
template<typename Scalar> struct ei_scalar_sqrt_op EIGEN_EMPTY_STRUCT {
inline const Scalar operator() (const Scalar& a) const { return ei_sqrt(a); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_sqrt_op<Scalar> >
{ enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the exponential of a scalar
*
* \sa class CwiseUnaryOp, Cwise::exp()
*/
template<typename Scalar> struct ei_scalar_exp_op EIGEN_EMPTY_STRUCT {
inline const Scalar operator() (const Scalar& a) const { return ei_exp(a); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_exp_op<Scalar> >
{ enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the logarithm of a scalar
*
* \sa class CwiseUnaryOp, Cwise::log()
*/
template<typename Scalar> struct ei_scalar_log_op EIGEN_EMPTY_STRUCT {
inline const Scalar operator() (const Scalar& a) const { return ei_log(a); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_log_op<Scalar> >
{ enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the cosine of a scalar
*
* \sa class CwiseUnaryOp, Cwise::cos()
*/
template<typename Scalar> struct ei_scalar_cos_op EIGEN_EMPTY_STRUCT {
inline const Scalar operator() (const Scalar& a) const { return ei_cos(a); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_cos_op<Scalar> >
{ enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the sine of a scalar
*
* \sa class CwiseUnaryOp, Cwise::sin()
*/
template<typename Scalar> struct ei_scalar_sin_op EIGEN_EMPTY_STRUCT {
inline const Scalar operator() (const Scalar& a) const { return ei_sin(a); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_sin_op<Scalar> >
{ enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false }; };
/** \internal
*
* \array_module
*
* \brief Template functor to raise a scalar to a power
*
* \sa class CwiseUnaryOp, Cwise::pow
*/
template<typename Scalar>
struct ei_scalar_pow_op {
// FIXME default copy constructors seems bugged with std::complex<>
inline ei_scalar_pow_op(const ei_scalar_pow_op& other) : m_exponent(other.m_exponent) { }
inline ei_scalar_pow_op(const Scalar& exponent) : m_exponent(exponent) {}
inline Scalar operator() (const Scalar& a) const { return ei_pow(a, m_exponent); }
const Scalar m_exponent;
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_pow_op<Scalar> >
{ enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the inverse of a scalar
*
* \sa class CwiseUnaryOp, Cwise::inverse()
*/
template<typename Scalar>
struct ei_scalar_inverse_op {
inline Scalar operator() (const Scalar& a) const { return Scalar(1)/a; }
template<typename PacketScalar>
inline const PacketScalar packetOp(const PacketScalar& a) const
{ return ei_pdiv(ei_pset1(Scalar(1)),a); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_inverse_op<Scalar> >
{ enum { Cost = NumTraits<Scalar>::MulCost, PacketAccess = int(ei_packet_traits<Scalar>::size)>1 }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the square of a scalar
*
* \sa class CwiseUnaryOp, Cwise::square()
*/
template<typename Scalar>
struct ei_scalar_square_op {
inline Scalar operator() (const Scalar& a) const { return a*a; }
template<typename PacketScalar>
inline const PacketScalar packetOp(const PacketScalar& a) const
{ return ei_pmul(a,a); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_square_op<Scalar> >
{ enum { Cost = NumTraits<Scalar>::MulCost, PacketAccess = int(ei_packet_traits<Scalar>::size)>1 }; };
/** \internal
*
* \array_module
*
* \brief Template functor to compute the cube of a scalar
*
* \sa class CwiseUnaryOp, Cwise::cube()
*/
template<typename Scalar>
struct ei_scalar_cube_op {
inline Scalar operator() (const Scalar& a) const { return a*a*a; }
template<typename PacketScalar>
inline const PacketScalar packetOp(const PacketScalar& a) const
{ return ei_pmul(a,ei_pmul(a,a)); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_cube_op<Scalar> >
{ enum { Cost = 2*NumTraits<Scalar>::MulCost, PacketAccess = int(ei_packet_traits<Scalar>::size)>1 }; };
// default ei_functor_traits for STL functors:
template<typename T>
struct ei_functor_traits<std::multiplies<T> >
{ enum { Cost = NumTraits<T>::MulCost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::divides<T> >
{ enum { Cost = NumTraits<T>::MulCost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::plus<T> >
{ enum { Cost = NumTraits<T>::AddCost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::minus<T> >
{ enum { Cost = NumTraits<T>::AddCost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::negate<T> >
{ enum { Cost = NumTraits<T>::AddCost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::logical_or<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::logical_and<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::logical_not<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::greater<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::less<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::greater_equal<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::less_equal<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::equal_to<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::not_equal_to<T> >
{ enum { Cost = 1, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::binder2nd<T> >
{ enum { Cost = ei_functor_traits<T>::Cost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::binder1st<T> >
{ enum { Cost = ei_functor_traits<T>::Cost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::unary_negate<T> >
{ enum { Cost = 1 + ei_functor_traits<T>::Cost, PacketAccess = false }; };
template<typename T>
struct ei_functor_traits<std::binary_negate<T> >
{ enum { Cost = 1 + ei_functor_traits<T>::Cost, PacketAccess = false }; };
#ifdef EIGEN_STDEXT_SUPPORT
template<typename T0,typename T1>
struct ei_functor_traits<std::project1st<T0,T1> >
{ enum { Cost = 0, PacketAccess = false }; };
template<typename T0,typename T1>
struct ei_functor_traits<std::project2nd<T0,T1> >
{ enum { Cost = 0, PacketAccess = false }; };
template<typename T0,typename T1>
struct ei_functor_traits<std::select2nd<std::pair<T0,T1> > >
{ enum { Cost = 0, PacketAccess = false }; };
template<typename T0,typename T1>
struct ei_functor_traits<std::select1st<std::pair<T0,T1> > >
{ enum { Cost = 0, PacketAccess = false }; };
template<typename T0,typename T1>
struct ei_functor_traits<std::unary_compose<T0,T1> >
{ enum { Cost = ei_functor_traits<T0>::Cost + ei_functor_traits<T1>::Cost, PacketAccess = false }; };
template<typename T0,typename T1,typename T2>
struct ei_functor_traits<std::binary_compose<T0,T1,T2> >
{ enum { Cost = ei_functor_traits<T0>::Cost + ei_functor_traits<T1>::Cost + ei_functor_traits<T2>::Cost, PacketAccess = false }; };
#endif // EIGEN_STDEXT_SUPPORT
#endif // EIGEN_ARRAY_FUNCTORS_H

View File

@@ -1,80 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_ARRAY_NORMS_H
#define EIGEN_ARRAY_NORMS_H
template<typename Derived, int p>
struct ei_lpNorm_selector
{
typedef typename NumTraits<typename ei_traits<Derived>::Scalar>::Real RealScalar;
inline static RealScalar run(const MatrixBase<Derived>& m)
{
return ei_pow(m.cwise().abs().cwise().pow(p).sum(), RealScalar(1)/p);
}
};
template<typename Derived>
struct ei_lpNorm_selector<Derived, 1>
{
inline static typename NumTraits<typename ei_traits<Derived>::Scalar>::Real run(const MatrixBase<Derived>& m)
{
return m.cwise().abs().sum();
}
};
template<typename Derived>
struct ei_lpNorm_selector<Derived, 2>
{
inline static typename NumTraits<typename ei_traits<Derived>::Scalar>::Real run(const MatrixBase<Derived>& m)
{
return m.norm();
}
};
template<typename Derived>
struct ei_lpNorm_selector<Derived, Infinity>
{
inline static typename NumTraits<typename ei_traits<Derived>::Scalar>::Real run(const MatrixBase<Derived>& m)
{
return m.cwise().abs().maxCoeff();
}
};
/** \array_module
*
* \returns the \f$ \ell^p \f$ norm of *this, that is, returns the p-th root of the sum of the p-th powers of the absolute values
* of the coefficients of *this. If \a p is the special value \a Eigen::Infinity, this function returns the \f$ \ell^p\infty \f$
* norm, that is the maximum of the absolute values of the coefficients of *this.
*
* \sa norm()
*/
template<typename Derived>
template<int p>
inline typename NumTraits<typename ei_traits<Derived>::Scalar>::Real MatrixBase<Derived>::lpNorm() const
{
return ei_lpNorm_selector<Derived, p>::run(*this);
}
#endif // EIGEN_ARRAY_NORMS_H

View File

@@ -1,299 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2006-2008 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_PARTIAL_REDUX_H
#define EIGEN_PARTIAL_REDUX_H
/** \array_module \ingroup Array
*
* \class PartialReduxExpr
*
* \brief Generic expression of a partially reduxed matrix
*
* \param MatrixType the type of the matrix we are applying the redux operation
* \param MemberOp type of the member functor
* \param Direction indicates the direction of the redux (Vertical or Horizontal)
*
* This class represents an expression of a partial redux operator of a matrix.
* It is the return type of PartialRedux functions,
* and most of the time this is the only way it is used.
*
* \sa class PartialRedux
*/
template< typename MatrixType, typename MemberOp, int Direction>
class PartialReduxExpr;
template<typename MatrixType, typename MemberOp, int Direction>
struct ei_traits<PartialReduxExpr<MatrixType, MemberOp, Direction> >
{
typedef typename MemberOp::result_type Scalar;
typedef typename MatrixType::Scalar InputScalar;
typedef typename ei_nested<MatrixType>::type MatrixTypeNested;
typedef typename ei_cleantype<MatrixTypeNested>::type _MatrixTypeNested;
enum {
RowsAtCompileTime = Direction==Vertical ? 1 : MatrixType::RowsAtCompileTime,
ColsAtCompileTime = Direction==Horizontal ? 1 : MatrixType::ColsAtCompileTime,
MaxRowsAtCompileTime = Direction==Vertical ? 1 : MatrixType::MaxRowsAtCompileTime,
MaxColsAtCompileTime = Direction==Horizontal ? 1 : MatrixType::MaxColsAtCompileTime,
Flags = (unsigned int)_MatrixTypeNested::Flags & HereditaryBits,
TraversalSize = Direction==Vertical ? RowsAtCompileTime : ColsAtCompileTime
};
typedef typename MemberOp::template Cost<InputScalar,int(TraversalSize)> CostOpType;
enum {
CoeffReadCost = TraversalSize * ei_traits<_MatrixTypeNested>::CoeffReadCost + int(CostOpType::value)
};
};
template< typename MatrixType, typename MemberOp, int Direction>
class PartialReduxExpr : ei_no_assignment_operator,
public MatrixBase<PartialReduxExpr<MatrixType, MemberOp, Direction> >
{
public:
EIGEN_GENERIC_PUBLIC_INTERFACE(PartialReduxExpr)
typedef typename ei_traits<PartialReduxExpr>::MatrixTypeNested MatrixTypeNested;
typedef typename ei_traits<PartialReduxExpr>::_MatrixTypeNested _MatrixTypeNested;
PartialReduxExpr(const MatrixType& mat, const MemberOp& func = MemberOp())
: m_matrix(mat), m_functor(func) {}
int rows() const { return (Direction==Vertical ? 1 : m_matrix.rows()); }
int cols() const { return (Direction==Horizontal ? 1 : m_matrix.cols()); }
const Scalar coeff(int i, int j) const
{
if (Direction==Vertical)
return m_functor(m_matrix.col(j));
else
return m_functor(m_matrix.row(i));
}
protected:
const MatrixTypeNested m_matrix;
const MemberOp m_functor;
};
#define EIGEN_MEMBER_FUNCTOR(MEMBER,COST) \
template <typename ResultType> \
struct ei_member_##MEMBER EIGEN_EMPTY_STRUCT { \
typedef ResultType result_type; \
template<typename Scalar, int Size> struct Cost \
{ enum { value = COST }; }; \
template<typename Derived> \
inline ResultType operator()(const MatrixBase<Derived>& mat) const \
{ return mat.MEMBER(); } \
}
EIGEN_MEMBER_FUNCTOR(squaredNorm, Size * NumTraits<Scalar>::MulCost + (Size-1)*NumTraits<Scalar>::AddCost);
EIGEN_MEMBER_FUNCTOR(norm, (Size+5) * NumTraits<Scalar>::MulCost + (Size-1)*NumTraits<Scalar>::AddCost);
EIGEN_MEMBER_FUNCTOR(sum, (Size-1)*NumTraits<Scalar>::AddCost);
EIGEN_MEMBER_FUNCTOR(minCoeff, (Size-1)*NumTraits<Scalar>::AddCost);
EIGEN_MEMBER_FUNCTOR(maxCoeff, (Size-1)*NumTraits<Scalar>::AddCost);
EIGEN_MEMBER_FUNCTOR(all, (Size-1)*NumTraits<Scalar>::AddCost);
EIGEN_MEMBER_FUNCTOR(any, (Size-1)*NumTraits<Scalar>::AddCost);
/** \internal */
template <typename BinaryOp, typename Scalar>
struct ei_member_redux {
typedef typename ei_result_of<
BinaryOp(Scalar)
>::type result_type;
template<typename _Scalar, int Size> struct Cost
{ enum { value = (Size-1) * ei_functor_traits<BinaryOp>::Cost }; };
ei_member_redux(const BinaryOp func) : m_functor(func) {}
template<typename Derived>
inline result_type operator()(const MatrixBase<Derived>& mat) const
{ return mat.redux(m_functor); }
const BinaryOp m_functor;
};
/** \array_module \ingroup Array
*
* \class PartialRedux
*
* \brief Pseudo expression providing partial reduction operations
*
* \param ExpressionType the type of the object on which to do partial reductions
* \param Direction indicates the direction of the redux (Vertical or Horizontal)
*
* This class represents a pseudo expression with partial reduction features.
* It is the return type of MatrixBase::colwise() and MatrixBase::rowwise()
* and most of the time this is the only way it is used.
*
* Example: \include MatrixBase_colwise.cpp
* Output: \verbinclude MatrixBase_colwise.out
*
* \sa MatrixBase::colwise(), MatrixBase::rowwise(), class PartialReduxExpr
*/
template<typename ExpressionType, int Direction> class PartialRedux
{
public:
typedef typename ei_traits<ExpressionType>::Scalar Scalar;
typedef typename ei_meta_if<ei_must_nest_by_value<ExpressionType>::ret,
ExpressionType, const ExpressionType&>::ret ExpressionTypeNested;
template<template<typename _Scalar> class Functor> struct ReturnType
{
typedef PartialReduxExpr<ExpressionType,
Functor<typename ei_traits<ExpressionType>::Scalar>,
Direction
> Type;
};
template<typename BinaryOp> struct ReduxReturnType
{
typedef PartialReduxExpr<ExpressionType,
ei_member_redux<BinaryOp,typename ei_traits<ExpressionType>::Scalar>,
Direction
> Type;
};
inline PartialRedux(const ExpressionType& matrix) : m_matrix(matrix) {}
/** \internal */
inline const ExpressionType& _expression() const { return m_matrix; }
template<typename BinaryOp>
const typename ReduxReturnType<BinaryOp>::Type
redux(const BinaryOp& func = BinaryOp()) const;
/** \returns a row (or column) vector expression of the smallest coefficient
* of each column (or row) of the referenced expression.
*
* Example: \include PartialRedux_minCoeff.cpp
* Output: \verbinclude PartialRedux_minCoeff.out
*
* \sa MatrixBase::minCoeff() */
const typename ReturnType<ei_member_minCoeff>::Type minCoeff() const
{ return _expression(); }
/** \returns a row (or column) vector expression of the largest coefficient
* of each column (or row) of the referenced expression.
*
* Example: \include PartialRedux_maxCoeff.cpp
* Output: \verbinclude PartialRedux_maxCoeff.out
*
* \sa MatrixBase::maxCoeff() */
const typename ReturnType<ei_member_maxCoeff>::Type maxCoeff() const
{ return _expression(); }
/** \returns a row (or column) vector expression of the squared norm
* of each column (or row) of the referenced expression.
*
* Example: \include PartialRedux_squaredNorm.cpp
* Output: \verbinclude PartialRedux_squaredNorm.out
*
* \sa MatrixBase::squaredNorm() */
const typename ReturnType<ei_member_squaredNorm>::Type squaredNorm() const
{ return _expression(); }
/** \returns a row (or column) vector expression of the norm
* of each column (or row) of the referenced expression.
*
* Example: \include PartialRedux_norm.cpp
* Output: \verbinclude PartialRedux_norm.out
*
* \sa MatrixBase::norm() */
const typename ReturnType<ei_member_norm>::Type norm() const
{ return _expression(); }
/** \returns a row (or column) vector expression of the sum
* of each column (or row) of the referenced expression.
*
* Example: \include PartialRedux_sum.cpp
* Output: \verbinclude PartialRedux_sum.out
*
* \sa MatrixBase::sum() */
const typename ReturnType<ei_member_sum>::Type sum() const
{ return _expression(); }
/** \returns a row (or column) vector expression representing
* whether \b all coefficients of each respective column (or row) are \c true.
*
* \sa MatrixBase::all() */
const typename ReturnType<ei_member_all>::Type all() const
{ return _expression(); }
/** \returns a row (or column) vector expression representing
* whether \b at \b least one coefficient of each respective column (or row) is \c true.
*
* \sa MatrixBase::any() */
const typename ReturnType<ei_member_any>::Type any() const
{ return _expression(); }
protected:
ExpressionTypeNested m_matrix;
};
/** \array_module
*
* \returns a PartialRedux wrapper of *this providing additional partial reduction operations
*
* Example: \include MatrixBase_colwise.cpp
* Output: \verbinclude MatrixBase_colwise.out
*
* \sa rowwise(), class PartialRedux
*/
template<typename Derived>
inline const PartialRedux<Derived,Vertical>
MatrixBase<Derived>::colwise() const
{
return derived();
}
/** \array_module
*
* \returns a PartialRedux wrapper of *this providing additional partial reduction operations
*
* Example: \include MatrixBase_rowwise.cpp
* Output: \verbinclude MatrixBase_rowwise.out
*
* \sa colwise(), class PartialRedux
*/
template<typename Derived>
inline const PartialRedux<Derived,Horizontal>
MatrixBase<Derived>::rowwise() const
{
return derived();
}
/** \returns a row or column vector expression of \c *this reduxed by \a func
*
* The template parameter \a BinaryOp is the type of the functor
* of the custom redux operator. Note that func must be an associative operator.
*
* \sa class PartialRedux, MatrixBase::colwise(), MatrixBase::rowwise()
*/
template<typename ExpressionType, int Direction>
template<typename BinaryOp>
const typename PartialRedux<ExpressionType,Direction>::template ReduxReturnType<BinaryOp>::Type
PartialRedux<ExpressionType,Direction>::redux(const BinaryOp& func) const
{
return typename ReduxReturnType<BinaryOp>::Type(_expression(), func);
}
#endif // EIGEN_PARTIAL_REDUX_H

View File

@@ -1,121 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_RANDOM_H
#define EIGEN_RANDOM_H
template<typename Scalar> struct ei_scalar_random_op EIGEN_EMPTY_STRUCT {
inline ei_scalar_random_op(void) {}
inline const Scalar operator() (int, int) const { return ei_random<Scalar>(); }
};
template<typename Scalar>
struct ei_functor_traits<ei_scalar_random_op<Scalar> >
{ enum { Cost = 5 * NumTraits<Scalar>::MulCost, PacketAccess = false, IsRepeatable = false }; };
/** \array_module
*
* \returns a random matrix (not an expression, the matrix is immediately evaluated).
*
* The parameters \a rows and \a cols are the number of rows and of columns of
* the returned matrix. Must be compatible with this MatrixBase type.
*
* This variant is meant to be used for dynamic-size matrix types. For fixed-size types,
* it is redundant to pass \a rows and \a cols as arguments, so ei_random() should be used
* instead.
*
* \addexample RandomExample \label How to create a matrix with random coefficients
*
* Example: \include MatrixBase_random_int_int.cpp
* Output: \verbinclude MatrixBase_random_int_int.out
*
* \sa MatrixBase::setRandom(), MatrixBase::Random(int), MatrixBase::Random()
*/
template<typename Derived>
inline const CwiseNullaryOp<ei_scalar_random_op<typename ei_traits<Derived>::Scalar>, Derived>
MatrixBase<Derived>::Random(int rows, int cols)
{
return NullaryExpr(rows, cols, ei_scalar_random_op<Scalar>());
}
/** \array_module
*
* \returns a random vector (not an expression, the vector is immediately evaluated).
*
* The parameter \a size is the size of the returned vector.
* Must be compatible with this MatrixBase type.
*
* \only_for_vectors
*
* This variant is meant to be used for dynamic-size vector types. For fixed-size types,
* it is redundant to pass \a size as argument, so ei_random() should be used
* instead.
*
* Example: \include MatrixBase_random_int.cpp
* Output: \verbinclude MatrixBase_random_int.out
*
* \sa MatrixBase::setRandom(), MatrixBase::Random(int,int), MatrixBase::Random()
*/
template<typename Derived>
inline const CwiseNullaryOp<ei_scalar_random_op<typename ei_traits<Derived>::Scalar>, Derived>
MatrixBase<Derived>::Random(int size)
{
return NullaryExpr(size, ei_scalar_random_op<Scalar>());
}
/** \array_module
*
* \returns a fixed-size random matrix or vector
* (not an expression, the matrix is immediately evaluated).
*
* This variant is only for fixed-size MatrixBase types. For dynamic-size types, you
* need to use the variants taking size arguments.
*
* Example: \include MatrixBase_random.cpp
* Output: \verbinclude MatrixBase_random.out
*
* \sa MatrixBase::setRandom(), MatrixBase::Random(int,int), MatrixBase::Random(int)
*/
template<typename Derived>
inline const CwiseNullaryOp<ei_scalar_random_op<typename ei_traits<Derived>::Scalar>, Derived>
MatrixBase<Derived>::Random()
{
return NullaryExpr(RowsAtCompileTime, ColsAtCompileTime, ei_scalar_random_op<Scalar>());
}
/** \array_module
*
* Sets all coefficients in this expression to random values.
*
* Example: \include MatrixBase_setRandom.cpp
* Output: \verbinclude MatrixBase_setRandom.out
*
* \sa class CwiseNullaryOp, MatrixBase::setRandom(int,int)
*/
template<typename Derived>
inline Derived& MatrixBase<Derived>::setRandom()
{
return *this = Random(rows(), cols());
}
#endif // EIGEN_RANDOM_H

View File

@@ -1,153 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_SELECT_H
#define EIGEN_SELECT_H
/** \array_module \ingroup Array
*
* \class Select
*
* \brief Expression of a coefficient wise version of the C++ ternary operator ?:
*
* \param ConditionMatrixType the type of the \em condition expression which must be a boolean matrix
* \param ThenMatrixType the type of the \em then expression
* \param ElseMatrixType the type of the \em else expression
*
* This class represents an expression of a coefficient wise version of the C++ ternary operator ?:.
* It is the return type of MatrixBase::select() and most of the time this is the only way it is used.
*
* \sa MatrixBase::select(const MatrixBase<ThenDerived>&, const MatrixBase<ElseDerived>&) const
*/
template<typename ConditionMatrixType, typename ThenMatrixType, typename ElseMatrixType>
struct ei_traits<Select<ConditionMatrixType, ThenMatrixType, ElseMatrixType> >
{
typedef typename ei_traits<ThenMatrixType>::Scalar Scalar;
enum {
RowsAtCompileTime = ConditionMatrixType::RowsAtCompileTime,
ColsAtCompileTime = ConditionMatrixType::ColsAtCompileTime,
MaxRowsAtCompileTime = ConditionMatrixType::MaxRowsAtCompileTime,
MaxColsAtCompileTime = ConditionMatrixType::MaxColsAtCompileTime,
Flags = (unsigned int)ThenMatrixType::Flags & ElseMatrixType::Flags & HereditaryBits,
CoeffReadCost = ei_traits<ConditionMatrixType>::CoeffReadCost
+ EIGEN_ENUM_MAX(ei_traits<ThenMatrixType>::CoeffReadCost,
ei_traits<ElseMatrixType>::CoeffReadCost)
};
};
template<typename ConditionMatrixType, typename ThenMatrixType, typename ElseMatrixType>
class Select : ei_no_assignment_operator,
public MatrixBase<Select<ConditionMatrixType, ThenMatrixType, ElseMatrixType> >
{
public:
EIGEN_GENERIC_PUBLIC_INTERFACE(Select)
Select(const ConditionMatrixType& conditionMatrix,
const ThenMatrixType& thenMatrix,
const ElseMatrixType& elseMatrix)
: m_condition(conditionMatrix), m_then(thenMatrix), m_else(elseMatrix)
{
ei_assert(m_condition.rows() == m_then.rows() && m_condition.rows() == m_else.rows());
ei_assert(m_condition.cols() == m_then.cols() && m_condition.cols() == m_else.cols());
}
int rows() const { return m_condition.rows(); }
int cols() const { return m_condition.cols(); }
const Scalar coeff(int i, int j) const
{
if (m_condition.coeff(i,j))
return m_then.coeff(i,j);
else
return m_else.coeff(i,j);
}
const Scalar coeff(int i) const
{
if (m_condition.coeff(i))
return m_then.coeff(i);
else
return m_else.coeff(i);
}
protected:
const typename ConditionMatrixType::Nested m_condition;
const typename ThenMatrixType::Nested m_then;
const typename ElseMatrixType::Nested m_else;
};
/** \array_module
*
* \returns a matrix where each coefficient (i,j) is equal to \a thenMatrix(i,j)
* if \c *this(i,j), and \a elseMatrix(i,j) otherwise.
*
* \sa class Select
*/
template<typename Derived>
template<typename ThenDerived,typename ElseDerived>
inline const Select<Derived,ThenDerived,ElseDerived>
MatrixBase<Derived>::select(const MatrixBase<ThenDerived>& thenMatrix,
const MatrixBase<ElseDerived>& elseMatrix) const
{
return Select<Derived,ThenDerived,ElseDerived>(derived(), thenMatrix.derived(), elseMatrix.derived());
}
/** \array_module
*
* Version of MatrixBase::select(const MatrixBase&, const MatrixBase&) with
* the \em else expression being a scalar value.
*
* \sa MatrixBase::select(const MatrixBase<ThenDerived>&, const MatrixBase<ElseDerived>&) const, class Select
*/
template<typename Derived>
template<typename ThenDerived>
inline const Select<Derived,ThenDerived, NestByValue<typename ThenDerived::ConstantReturnType> >
MatrixBase<Derived>::select(const MatrixBase<ThenDerived>& thenMatrix,
typename ThenDerived::Scalar elseScalar) const
{
return Select<Derived,ThenDerived,NestByValue<typename ThenDerived::ConstantReturnType> >(
derived(), thenMatrix.derived(), ThenDerived::Constant(rows(),cols(),elseScalar));
}
/** \array_module
*
* Version of MatrixBase::select(const MatrixBase&, const MatrixBase&) with
* the \em then expression being a scalar value.
*
* \sa MatrixBase::select(const MatrixBase<ThenDerived>&, const MatrixBase<ElseDerived>&) const, class Select
*/
template<typename Derived>
template<typename ElseDerived>
inline const Select<Derived, NestByValue<typename ElseDerived::ConstantReturnType>, ElseDerived >
MatrixBase<Derived>::select(typename ElseDerived::Scalar thenScalar,
const MatrixBase<ElseDerived>& elseMatrix) const
{
return Select<Derived,NestByValue<typename ElseDerived::ConstantReturnType>,ElseDerived>(
derived(), ElseDerived::Constant(rows(),cols(),thenScalar), elseMatrix.derived());
}
#endif // EIGEN_SELECT_H

View File

@@ -1,9 +0,0 @@
ADD_SUBDIRECTORY(Core)
ADD_SUBDIRECTORY(LU)
ADD_SUBDIRECTORY(QR)
ADD_SUBDIRECTORY(SVD)
ADD_SUBDIRECTORY(Cholesky)
ADD_SUBDIRECTORY(Array)
ADD_SUBDIRECTORY(Geometry)
ADD_SUBDIRECTORY(Regression)
ADD_SUBDIRECTORY(Sparse)

View File

@@ -1,6 +0,0 @@
FILE(GLOB Eigen_Cholesky_SRCS "*.h")
INSTALL(FILES
${Eigen_Cholesky_SRCS}
DESTINATION ${INCLUDE_INSTALL_DIR}/Eigen/src/Cholesky
)

View File

@@ -1,165 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_CHOLESKY_H
#define EIGEN_CHOLESKY_H
/** \ingroup Cholesky_Module
*
* \class Cholesky
*
* \deprecated this class has been renamed LLT
*/
template<typename MatrixType> class Cholesky
{
private:
typedef typename MatrixType::Scalar Scalar;
typedef typename NumTraits<typename MatrixType::Scalar>::Real RealScalar;
typedef Matrix<Scalar, MatrixType::ColsAtCompileTime, 1> VectorType;
enum {
PacketSize = ei_packet_traits<Scalar>::size,
AlignmentMask = int(PacketSize)-1
};
public:
Cholesky(const MatrixType& matrix)
: m_matrix(matrix.rows(), matrix.cols())
{
compute(matrix);
}
/** \deprecated */
inline Part<MatrixType, Lower> matrixL(void) const { return m_matrix; }
/** \deprecated */
inline bool isPositiveDefinite(void) const { return m_isPositiveDefinite; }
template<typename Derived>
typename Derived::Eval solve(const MatrixBase<Derived> &b) const EIGEN_DEPRECATED;
template<typename RhsDerived, typename ResDerived>
bool solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const;
template<typename Derived>
bool solveInPlace(MatrixBase<Derived> &bAndX) const;
void compute(const MatrixType& matrix);
protected:
/** \internal
* Used to compute and store L
* The strict upper part is not used and even not initialized.
*/
MatrixType m_matrix;
bool m_isPositiveDefinite;
};
/** \deprecated */
template<typename MatrixType>
void Cholesky<MatrixType>::compute(const MatrixType& a)
{
assert(a.rows()==a.cols());
const int size = a.rows();
m_matrix.resize(size, size);
const RealScalar eps = ei_sqrt(precision<Scalar>());
RealScalar x;
x = ei_real(a.coeff(0,0));
m_isPositiveDefinite = x > eps && ei_isMuchSmallerThan(ei_imag(a.coeff(0,0)), RealScalar(1));
m_matrix.coeffRef(0,0) = ei_sqrt(x);
m_matrix.col(0).end(size-1) = a.row(0).end(size-1).adjoint() / ei_real(m_matrix.coeff(0,0));
for (int j = 1; j < size; ++j)
{
Scalar tmp = ei_real(a.coeff(j,j)) - m_matrix.row(j).start(j).squaredNorm();
x = ei_real(tmp);
if (x < eps || (!ei_isMuchSmallerThan(ei_imag(tmp), RealScalar(1))))
{
m_isPositiveDefinite = false;
return;
}
m_matrix.coeffRef(j,j) = x = ei_sqrt(x);
int endSize = size-j-1;
if (endSize>0) {
// Note that when all matrix columns have good alignment, then the following
// product is guaranteed to be optimal with respect to alignment.
m_matrix.col(j).end(endSize) =
(m_matrix.block(j+1, 0, endSize, j) * m_matrix.row(j).start(j).adjoint()).lazy();
// FIXME could use a.col instead of a.row
m_matrix.col(j).end(endSize) = (a.row(j).end(endSize).adjoint()
- m_matrix.col(j).end(endSize) ) / x;
}
}
}
/** \deprecated */
template<typename MatrixType>
template<typename Derived>
typename Derived::Eval Cholesky<MatrixType>::solve(const MatrixBase<Derived> &b) const
{
const int size = m_matrix.rows();
ei_assert(size==b.rows());
typename ei_eval_to_column_major<Derived>::type x(b);
solveInPlace(x);
return x;
}
/** \deprecated */
template<typename MatrixType>
template<typename RhsDerived, typename ResDerived>
bool Cholesky<MatrixType>::solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const
{
const int size = m_matrix.rows();
ei_assert(size==b.rows() && "Cholesky::solve(): invalid number of rows of the right hand side matrix b");
return solveInPlace((*result) = b);
}
/** \deprecated */
template<typename MatrixType>
template<typename Derived>
bool Cholesky<MatrixType>::solveInPlace(MatrixBase<Derived> &bAndX) const
{
const int size = m_matrix.rows();
ei_assert(size==bAndX.rows());
if (!m_isPositiveDefinite)
return false;
matrixL().solveTriangularInPlace(bAndX);
m_matrix.adjoint().template part<Upper>().solveTriangularInPlace(bAndX);
return true;
}
/** \cholesky_module
* \deprecated has been renamed llt()
*/
template<typename Derived>
inline const Cholesky<typename MatrixBase<Derived>::EvalType>
MatrixBase<Derived>::cholesky() const
{
return Cholesky<typename ei_eval<Derived>::type>(derived());
}
#endif // EIGEN_CHOLESKY_H

View File

@@ -1,35 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_EXTERN_INSTANTIATIONS
#define EIGEN_EXTERN_INSTANTIATIONS
#endif
#include "../../Core"
#undef EIGEN_EXTERN_INSTANTIATIONS
#include "../../Cholesky"
namespace Eigen {
EIGEN_CHOLESKY_MODULE_INSTANTIATE();
}

View File

@@ -1,184 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_CHOLESKY_WITHOUT_SQUARE_ROOT_H
#define EIGEN_CHOLESKY_WITHOUT_SQUARE_ROOT_H
/** \deprecated \ingroup Cholesky_Module
*
* \class CholeskyWithoutSquareRoot
*
* \deprecated this class has been renamed LDLT
*/
template<typename MatrixType> class CholeskyWithoutSquareRoot
{
public:
typedef typename MatrixType::Scalar Scalar;
typedef typename NumTraits<typename MatrixType::Scalar>::Real RealScalar;
typedef Matrix<Scalar, MatrixType::ColsAtCompileTime, 1> VectorType;
CholeskyWithoutSquareRoot(const MatrixType& matrix)
: m_matrix(matrix.rows(), matrix.cols())
{
compute(matrix);
}
/** \returns the lower triangular matrix L */
inline Part<MatrixType, UnitLower> matrixL(void) const { return m_matrix; }
/** \returns the coefficients of the diagonal matrix D */
inline DiagonalCoeffs<MatrixType> vectorD(void) const { return m_matrix.diagonal(); }
/** \returns true if the matrix is positive definite */
inline bool isPositiveDefinite(void) const { return m_isPositiveDefinite; }
template<typename Derived>
typename Derived::Eval solve(const MatrixBase<Derived> &b) const EIGEN_DEPRECATED;
template<typename RhsDerived, typename ResDerived>
bool solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const;
template<typename Derived>
bool solveInPlace(MatrixBase<Derived> &bAndX) const;
void compute(const MatrixType& matrix);
protected:
/** \internal
* Used to compute and store the cholesky decomposition A = L D L^* = U^* D U.
* The strict upper part is used during the decomposition, the strict lower
* part correspond to the coefficients of L (its diagonal is equal to 1 and
* is not stored), and the diagonal entries correspond to D.
*/
MatrixType m_matrix;
bool m_isPositiveDefinite;
};
/** \deprecated */
template<typename MatrixType>
void CholeskyWithoutSquareRoot<MatrixType>::compute(const MatrixType& a)
{
assert(a.rows()==a.cols());
const int size = a.rows();
m_matrix.resize(size, size);
m_isPositiveDefinite = true;
const RealScalar eps = ei_sqrt(precision<Scalar>());
if (size<=1)
{
m_matrix = a;
return;
}
// Let's preallocate a temporay vector to evaluate the matrix-vector product into it.
// Unlike the standard Cholesky decomposition, here we cannot evaluate it to the destination
// matrix because it a sub-row which is not compatible suitable for efficient packet evaluation.
// (at least if we assume the matrix is col-major)
Matrix<Scalar,MatrixType::RowsAtCompileTime,1> _temporary(size);
// Note that, in this algorithm the rows of the strict upper part of m_matrix is used to store
// column vector, thus the strange .conjugate() and .transpose()...
m_matrix.row(0) = a.row(0).conjugate();
m_matrix.col(0).end(size-1) = m_matrix.row(0).end(size-1) / m_matrix.coeff(0,0);
for (int j = 1; j < size; ++j)
{
RealScalar tmp = ei_real(a.coeff(j,j) - (m_matrix.row(j).start(j) * m_matrix.col(j).start(j).conjugate()).coeff(0,0));
m_matrix.coeffRef(j,j) = tmp;
if (tmp < eps)
{
m_isPositiveDefinite = false;
return;
}
int endSize = size-j-1;
if (endSize>0)
{
_temporary.end(endSize) = ( m_matrix.block(j+1,0, endSize, j)
* m_matrix.col(j).start(j).conjugate() ).lazy();
m_matrix.row(j).end(endSize) = a.row(j).end(endSize).conjugate()
- _temporary.end(endSize).transpose();
m_matrix.col(j).end(endSize) = m_matrix.row(j).end(endSize) / tmp;
}
}
}
/** \deprecated */
template<typename MatrixType>
template<typename Derived>
typename Derived::Eval CholeskyWithoutSquareRoot<MatrixType>::solve(const MatrixBase<Derived> &b) const
{
const int size = m_matrix.rows();
ei_assert(size==b.rows());
return m_matrix.adjoint().template part<UnitUpper>()
.solveTriangular(
( m_matrix.cwise().inverse().template part<Diagonal>()
* matrixL().solveTriangular(b))
);
}
/** \deprecated */
template<typename MatrixType>
template<typename RhsDerived, typename ResDerived>
bool CholeskyWithoutSquareRoot<MatrixType>
::solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const
{
const int size = m_matrix.rows();
ei_assert(size==b.rows() && "Cholesky::solve(): invalid number of rows of the right hand side matrix b");
*result = b;
return solveInPlace(*result);
}
/** \deprecated */
template<typename MatrixType>
template<typename Derived>
bool CholeskyWithoutSquareRoot<MatrixType>::solveInPlace(MatrixBase<Derived> &bAndX) const
{
const int size = m_matrix.rows();
ei_assert(size==bAndX.rows());
if (!m_isPositiveDefinite)
return false;
matrixL().solveTriangularInPlace(bAndX);
bAndX = (m_matrix.cwise().inverse().template part<Diagonal>() * bAndX).lazy();
m_matrix.adjoint().template part<UnitUpper>().solveTriangularInPlace(bAndX);
return true;
}
/** \cholesky_module
* \deprecated has been renamed ldlt()
*/
template<typename Derived>
inline const CholeskyWithoutSquareRoot<typename MatrixBase<Derived>::EvalType>
MatrixBase<Derived>::choleskyNoSqrt() const
{
return derived();
}
#endif // EIGEN_CHOLESKY_WITHOUT_SQUARE_ROOT_H

View File

@@ -0,0 +1,3 @@
#ifndef EIGEN_CHOLESKY_MODULE_H
#error "Please include Eigen/Cholesky instead of including headers inside the src directory directly."
#endif

View File

@@ -1,198 +1,649 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
// for linear algebra.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2008-2011 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2009 Keir Mierle <mierle@gmail.com>
// Copyright (C) 2009 Benoit Jacob <jacob.benoit.1@gmail.com>
// Copyright (C) 2011 Timothy E. Holy <tim.holy@gmail.com >
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_LDLT_H
#define EIGEN_LDLT_H
/** \ingroup cholesky_Module
*
* \class LDLT
*
* \brief Robust Cholesky decomposition of a matrix and associated features
*
* \param MatrixType the type of the matrix of which we are computing the LDL^T Cholesky decomposition
*
* This class performs a Cholesky decomposition without square root of a symmetric, positive definite
* matrix A such that A = L D L^* = U^* D U, where L is lower triangular with a unit diagonal
* and D is a diagonal matrix.
*
* Compared to a standard Cholesky decomposition, avoiding the square roots allows for faster and more
* stable computation.
*
* Note that during the decomposition, only the upper triangular part of A is considered. Therefore,
* the strict lower part does not have to store correct values.
*
* \sa MatrixBase::ldlt(), class LLT
*/
template<typename MatrixType> class LDLT
{
public:
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
typedef typename MatrixType::Scalar Scalar;
typedef typename NumTraits<typename MatrixType::Scalar>::Real RealScalar;
typedef Matrix<Scalar, MatrixType::ColsAtCompileTime, 1> VectorType;
namespace Eigen {
LDLT(const MatrixType& matrix)
: m_matrix(matrix.rows(), matrix.cols())
{
compute(matrix);
}
/** \returns the lower triangular matrix L */
inline Part<MatrixType, UnitLower> matrixL(void) const { return m_matrix; }
/** \returns the coefficients of the diagonal matrix D */
inline DiagonalCoeffs<MatrixType> vectorD(void) const { return m_matrix.diagonal(); }
/** \returns true if the matrix is positive definite */
inline bool isPositiveDefinite(void) const { return m_isPositiveDefinite; }
template<typename RhsDerived, typename ResDerived>
bool solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const;
template<typename Derived>
bool solveInPlace(MatrixBase<Derived> &bAndX) const;
void compute(const MatrixType& matrix);
protected:
/** \internal
* Used to compute and store the cholesky decomposition A = L D L^* = U^* D U.
* The strict upper part is used during the decomposition, the strict lower
* part correspond to the coefficients of L (its diagonal is equal to 1 and
* is not stored), and the diagonal entries correspond to D.
*/
MatrixType m_matrix;
bool m_isPositiveDefinite;
namespace internal {
template <typename MatrixType_, int UpLo_>
struct traits<LDLT<MatrixType_, UpLo_> > : traits<MatrixType_> {
typedef MatrixXpr XprKind;
typedef SolverStorage StorageKind;
typedef int StorageIndex;
enum { Flags = 0 };
};
/** Compute / recompute the LLT decomposition A = L D L^* = U^* D U of \a matrix
*/
template<typename MatrixType>
void LDLT<MatrixType>::compute(const MatrixType& a)
{
assert(a.rows()==a.cols());
const int size = a.rows();
m_matrix.resize(size, size);
m_isPositiveDefinite = true;
const RealScalar eps = ei_sqrt(precision<Scalar>());
template <typename MatrixType, int UpLo>
struct LDLT_Traits;
if (size<=1)
{
m_matrix = a;
return;
// PositiveSemiDef means positive semi-definite and non-zero; same for NegativeSemiDef
enum SignMatrix { PositiveSemiDef, NegativeSemiDef, ZeroSign, Indefinite };
} // namespace internal
/** \ingroup Cholesky_Module
*
* \class LDLT
*
* \brief Robust Cholesky decomposition of a matrix with pivoting
*
* \tparam MatrixType_ the type of the matrix of which to compute the LDL^T Cholesky decomposition
* \tparam UpLo_ the triangular part that will be used for the decomposition: Lower (default) or Upper.
* The other triangular part won't be read.
*
* Perform a robust Cholesky decomposition of a positive semidefinite or negative semidefinite
* matrix \f$ A \f$ such that \f$ A = P^TLDL^*P \f$, where P is a permutation matrix, L
* is lower triangular with a unit diagonal and D is a diagonal matrix.
*
* The decomposition uses pivoting to ensure stability, so that D will have
* zeros in the bottom right rank(A) - n submatrix. Avoiding the square root
* on D also stabilizes the computation.
*
* Remember that Cholesky decompositions are not rank-revealing. Also, do not use a Cholesky
* decomposition to determine whether a system of equations has a solution.
*
* This class supports the \link InplaceDecomposition inplace decomposition \endlink mechanism.
*
* \sa MatrixBase::ldlt(), SelfAdjointView::ldlt(), class LLT
*/
template <typename MatrixType_, int UpLo_>
class LDLT : public SolverBase<LDLT<MatrixType_, UpLo_> > {
public:
typedef MatrixType_ MatrixType;
typedef SolverBase<LDLT> Base;
friend class SolverBase<LDLT>;
EIGEN_GENERIC_PUBLIC_INTERFACE(LDLT)
enum {
MaxRowsAtCompileTime = MatrixType::MaxRowsAtCompileTime,
MaxColsAtCompileTime = MatrixType::MaxColsAtCompileTime,
UpLo = UpLo_
};
typedef Matrix<Scalar, RowsAtCompileTime, 1, 0, MaxRowsAtCompileTime, 1> TmpMatrixType;
typedef Transpositions<RowsAtCompileTime, MaxRowsAtCompileTime> TranspositionType;
typedef PermutationMatrix<RowsAtCompileTime, MaxRowsAtCompileTime> PermutationType;
typedef internal::LDLT_Traits<MatrixType, UpLo> Traits;
/** \brief Default Constructor.
*
* The default constructor is useful in cases in which the user intends to
* perform decompositions via LDLT::compute(const MatrixType&).
*/
LDLT() : m_matrix(), m_transpositions(), m_sign(internal::ZeroSign), m_isInitialized(false) {}
/** \brief Default Constructor with memory preallocation
*
* Like the default constructor but with preallocation of the internal data
* according to the specified problem \a size.
* \sa LDLT()
*/
explicit LDLT(Index size)
: m_matrix(size, size),
m_transpositions(size),
m_temporary(size),
m_sign(internal::ZeroSign),
m_isInitialized(false) {}
/** \brief Constructor with decomposition
*
* This calculates the decomposition for the input \a matrix.
*
* \sa LDLT(Index size)
*/
template <typename InputType>
explicit LDLT(const EigenBase<InputType>& matrix)
: m_matrix(matrix.rows(), matrix.cols()),
m_transpositions(matrix.rows()),
m_temporary(matrix.rows()),
m_sign(internal::ZeroSign),
m_isInitialized(false) {
compute(matrix.derived());
}
// Let's preallocate a temporay vector to evaluate the matrix-vector product into it.
// Unlike the standard LLT decomposition, here we cannot evaluate it to the destination
// matrix because it a sub-row which is not compatible suitable for efficient packet evaluation.
// (at least if we assume the matrix is col-major)
Matrix<Scalar,MatrixType::RowsAtCompileTime,1> _temporary(size);
// Note that, in this algorithm the rows of the strict upper part of m_matrix is used to store
// column vector, thus the strange .conjugate() and .transpose()...
m_matrix.row(0) = a.row(0).conjugate();
m_matrix.col(0).end(size-1) = m_matrix.row(0).end(size-1) / m_matrix.coeff(0,0);
for (int j = 1; j < size; ++j)
{
RealScalar tmp = ei_real(a.coeff(j,j) - (m_matrix.row(j).start(j) * m_matrix.col(j).start(j).conjugate()).coeff(0,0));
m_matrix.coeffRef(j,j) = tmp;
if (tmp < eps)
{
m_isPositiveDefinite = false;
return;
}
int endSize = size-j-1;
if (endSize>0)
{
_temporary.end(endSize) = ( m_matrix.block(j+1,0, endSize, j)
* m_matrix.col(j).start(j).conjugate() ).lazy();
m_matrix.row(j).end(endSize) = a.row(j).end(endSize).conjugate()
- _temporary.end(endSize).transpose();
m_matrix.col(j).end(endSize) = m_matrix.row(j).end(endSize) / tmp;
}
/** \brief Constructs a LDLT factorization from a given matrix
*
* This overloaded constructor is provided for \link InplaceDecomposition inplace decomposition \endlink when \c
* MatrixType is a Eigen::Ref.
*
* \sa LDLT(const EigenBase&)
*/
template <typename InputType>
explicit LDLT(EigenBase<InputType>& matrix)
: m_matrix(matrix.derived()),
m_transpositions(matrix.rows()),
m_temporary(matrix.rows()),
m_sign(internal::ZeroSign),
m_isInitialized(false) {
compute(matrix.derived());
}
/** Clear any existing decomposition
* \sa rankUpdate(w,sigma)
*/
void setZero() { m_isInitialized = false; }
/** \returns a view of the upper triangular matrix U */
inline typename Traits::MatrixU matrixU() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return Traits::getU(m_matrix);
}
/** \returns a view of the lower triangular matrix L */
inline typename Traits::MatrixL matrixL() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return Traits::getL(m_matrix);
}
/** \returns the permutation matrix P as a transposition sequence.
*/
inline const TranspositionType& transpositionsP() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return m_transpositions;
}
/** \returns the coefficients of the diagonal matrix D */
inline Diagonal<const MatrixType> vectorD() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return m_matrix.diagonal();
}
/** \returns true if the matrix is positive (semidefinite) */
inline bool isPositive() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return m_sign == internal::PositiveSemiDef || m_sign == internal::ZeroSign;
}
/** \returns true if the matrix is negative (semidefinite) */
inline bool isNegative(void) const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return m_sign == internal::NegativeSemiDef || m_sign == internal::ZeroSign;
}
#ifdef EIGEN_PARSED_BY_DOXYGEN
/** \returns a solution x of \f$ A x = b \f$ using the current decomposition of A.
*
* This function also supports in-place solves using the syntax <tt>x = decompositionObject.solve(x)</tt> .
*
* \note_about_checking_solutions
*
* More precisely, this method solves \f$ A x = b \f$ using the decomposition \f$ A = P^T L D L^* P \f$
* by solving the systems \f$ P^T y_1 = b \f$, \f$ L y_2 = y_1 \f$, \f$ D y_3 = y_2 \f$,
* \f$ L^* y_4 = y_3 \f$ and \f$ P x = y_4 \f$ in succession. If the matrix \f$ A \f$ is singular, then
* \f$ D \f$ will also be singular (all the other matrices are invertible). In that case, the
* least-square solution of \f$ D y_3 = y_2 \f$ is computed. This does not mean that this function
* computes the least-square solution of \f$ A x = b \f$ if \f$ A \f$ is singular.
*
* \sa MatrixBase::ldlt(), SelfAdjointView::ldlt()
*/
template <typename Rhs>
inline const Solve<LDLT, Rhs> solve(const MatrixBase<Rhs>& b) const;
#endif
template <typename Derived>
bool solveInPlace(MatrixBase<Derived>& bAndX) const;
template <typename InputType>
LDLT& compute(const EigenBase<InputType>& matrix);
/** \returns an estimate of the reciprocal condition number of the matrix of
* which \c *this is the LDLT decomposition.
*/
RealScalar rcond() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return internal::rcond_estimate_helper(m_l1_norm, *this);
}
template <typename Derived>
LDLT& rankUpdate(const MatrixBase<Derived>& w, const RealScalar& alpha = 1);
/** \returns the internal LDLT decomposition matrix
*
* TODO: document the storage layout
*/
inline const MatrixType& matrixLDLT() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return m_matrix;
}
MatrixType reconstructedMatrix() const;
/** \returns the adjoint of \c *this, that is, a const reference to the decomposition itself as the underlying matrix
* is self-adjoint.
*
* This method is provided for compatibility with other matrix decompositions, thus enabling generic code such as:
* \code x = decomposition.adjoint().solve(b) \endcode
*/
const LDLT& adjoint() const { return *this; }
EIGEN_DEVICE_FUNC constexpr Index rows() const noexcept { return m_matrix.rows(); }
EIGEN_DEVICE_FUNC constexpr Index cols() const noexcept { return m_matrix.cols(); }
/** \brief Reports whether previous computation was successful.
*
* \returns \c Success if computation was successful,
* \c NumericalIssue if the factorization failed because of a zero pivot.
*/
ComputationInfo info() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
return m_info;
}
#ifndef EIGEN_PARSED_BY_DOXYGEN
template <typename RhsType, typename DstType>
void _solve_impl(const RhsType& rhs, DstType& dst) const;
template <bool Conjugate, typename RhsType, typename DstType>
void _solve_impl_transposed(const RhsType& rhs, DstType& dst) const;
#endif
protected:
EIGEN_STATIC_ASSERT_NON_INTEGER(Scalar)
/** \internal
* Used to compute and store the Cholesky decomposition A = L D L^* = U^* D U.
* The strict upper part is used during the decomposition, the strict lower
* part correspond to the coefficients of L (its diagonal is equal to 1 and
* is not stored), and the diagonal entries correspond to D.
*/
MatrixType m_matrix;
RealScalar m_l1_norm;
TranspositionType m_transpositions;
TmpMatrixType m_temporary;
internal::SignMatrix m_sign;
bool m_isInitialized;
ComputationInfo m_info;
};
namespace internal {
template <int UpLo>
struct ldlt_inplace;
template <>
struct ldlt_inplace<Lower> {
template <typename MatrixType, typename TranspositionType, typename Workspace>
static bool unblocked(MatrixType& mat, TranspositionType& transpositions, Workspace& temp, SignMatrix& sign) {
using std::abs;
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::RealScalar RealScalar;
typedef typename TranspositionType::StorageIndex IndexType;
eigen_assert(mat.rows() == mat.cols());
const Index size = mat.rows();
bool found_zero_pivot = false;
bool ret = true;
if (size <= 1) {
transpositions.setIdentity();
if (size == 0)
sign = ZeroSign;
else if (numext::real(mat.coeff(0, 0)) > static_cast<RealScalar>(0))
sign = PositiveSemiDef;
else if (numext::real(mat.coeff(0, 0)) < static_cast<RealScalar>(0))
sign = NegativeSemiDef;
else
sign = ZeroSign;
return true;
}
for (Index k = 0; k < size; ++k) {
// Find largest diagonal element
Index index_of_biggest_in_corner;
mat.diagonal().tail(size - k).cwiseAbs().maxCoeff(&index_of_biggest_in_corner);
index_of_biggest_in_corner += k;
transpositions.coeffRef(k) = IndexType(index_of_biggest_in_corner);
if (k != index_of_biggest_in_corner) {
// apply the transposition while taking care to consider only
// the lower triangular part
Index s = size - index_of_biggest_in_corner - 1; // trailing size after the biggest element
mat.row(k).head(k).swap(mat.row(index_of_biggest_in_corner).head(k));
mat.col(k).tail(s).swap(mat.col(index_of_biggest_in_corner).tail(s));
std::swap(mat.coeffRef(k, k), mat.coeffRef(index_of_biggest_in_corner, index_of_biggest_in_corner));
for (Index i = k + 1; i < index_of_biggest_in_corner; ++i) {
Scalar tmp = mat.coeffRef(i, k);
mat.coeffRef(i, k) = numext::conj(mat.coeffRef(index_of_biggest_in_corner, i));
mat.coeffRef(index_of_biggest_in_corner, i) = numext::conj(tmp);
}
if (NumTraits<Scalar>::IsComplex)
mat.coeffRef(index_of_biggest_in_corner, k) = numext::conj(mat.coeff(index_of_biggest_in_corner, k));
}
// partition the matrix:
// A00 | - | -
// lu = A10 | A11 | -
// A20 | A21 | A22
Index rs = size - k - 1;
Block<MatrixType, Dynamic, 1> A21(mat, k + 1, k, rs, 1);
Block<MatrixType, 1, Dynamic> A10(mat, k, 0, 1, k);
Block<MatrixType, Dynamic, Dynamic> A20(mat, k + 1, 0, rs, k);
if (k > 0) {
temp.head(k) = mat.diagonal().real().head(k).asDiagonal() * A10.adjoint();
mat.coeffRef(k, k) -= (A10 * temp.head(k)).value();
if (rs > 0) A21.noalias() -= A20 * temp.head(k);
}
// In some previous versions of Eigen (e.g., 3.2.1), the scaling was omitted if the pivot
// was smaller than the cutoff value. However, since LDLT is not rank-revealing
// we should only make sure that we do not introduce INF or NaN values.
// Remark that LAPACK also uses 0 as the cutoff value.
RealScalar realAkk = numext::real(mat.coeffRef(k, k));
bool pivot_is_valid = (abs(realAkk) > RealScalar(0));
if (k == 0 && !pivot_is_valid) {
// The entire diagonal is zero, there is nothing more to do
// except filling the transpositions, and checking whether the matrix is zero.
sign = ZeroSign;
for (Index j = 0; j < size; ++j) {
transpositions.coeffRef(j) = IndexType(j);
ret = ret && (mat.col(j).tail(size - j - 1).array() == Scalar(0)).all();
}
return ret;
}
if ((rs > 0) && pivot_is_valid)
A21 /= realAkk;
else if (rs > 0)
ret = ret && (A21.array() == Scalar(0)).all();
if (found_zero_pivot && pivot_is_valid)
ret = false; // factorization failed
else if (!pivot_is_valid)
found_zero_pivot = true;
if (sign == PositiveSemiDef) {
if (realAkk < static_cast<RealScalar>(0)) sign = Indefinite;
} else if (sign == NegativeSemiDef) {
if (realAkk > static_cast<RealScalar>(0)) sign = Indefinite;
} else if (sign == ZeroSign) {
if (realAkk > static_cast<RealScalar>(0))
sign = PositiveSemiDef;
else if (realAkk < static_cast<RealScalar>(0))
sign = NegativeSemiDef;
}
}
return ret;
}
// Reference for the algorithm: Davis and Hager, "Multiple Rank
// Modifications of a Sparse Cholesky Factorization" (Algorithm 1)
// Trivial rearrangements of their computations (Timothy E. Holy)
// allow their algorithm to work for rank-1 updates even if the
// original matrix is not of full rank.
// Here only rank-1 updates are implemented, to reduce the
// requirement for intermediate storage and improve accuracy
template <typename MatrixType, typename WDerived>
static bool updateInPlace(MatrixType& mat, MatrixBase<WDerived>& w,
const typename MatrixType::RealScalar& sigma = 1) {
using numext::isfinite;
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::RealScalar RealScalar;
const Index size = mat.rows();
eigen_assert(mat.cols() == size && w.size() == size);
RealScalar alpha = 1;
// Apply the update
for (Index j = 0; j < size; j++) {
// Check for termination due to an original decomposition of low-rank
if (!(isfinite)(alpha)) break;
// Update the diagonal terms
RealScalar dj = numext::real(mat.coeff(j, j));
Scalar wj = w.coeff(j);
RealScalar swj2 = sigma * numext::abs2(wj);
RealScalar gamma = dj * alpha + swj2;
mat.coeffRef(j, j) += swj2 / alpha;
alpha += swj2 / dj;
// Update the terms of L
Index rs = size - j - 1;
w.tail(rs) -= wj * mat.col(j).tail(rs);
if (!numext::is_exactly_zero(gamma)) mat.col(j).tail(rs) += (sigma * numext::conj(wj) / gamma) * w.tail(rs);
}
return true;
}
template <typename MatrixType, typename TranspositionType, typename Workspace, typename WType>
static bool update(MatrixType& mat, const TranspositionType& transpositions, Workspace& tmp, const WType& w,
const typename MatrixType::RealScalar& sigma = 1) {
// Apply the permutation to the input w
tmp = transpositions * w;
return ldlt_inplace<Lower>::updateInPlace(mat, tmp, sigma);
}
};
template <>
struct ldlt_inplace<Upper> {
template <typename MatrixType, typename TranspositionType, typename Workspace>
static EIGEN_STRONG_INLINE bool unblocked(MatrixType& mat, TranspositionType& transpositions, Workspace& temp,
SignMatrix& sign) {
Transpose<MatrixType> matt(mat);
return ldlt_inplace<Lower>::unblocked(matt, transpositions, temp, sign);
}
template <typename MatrixType, typename TranspositionType, typename Workspace, typename WType>
static EIGEN_STRONG_INLINE bool update(MatrixType& mat, TranspositionType& transpositions, Workspace& tmp, WType& w,
const typename MatrixType::RealScalar& sigma = 1) {
Transpose<MatrixType> matt(mat);
return ldlt_inplace<Lower>::update(matt, transpositions, tmp, w.conjugate(), sigma);
}
};
template <typename MatrixType>
struct LDLT_Traits<MatrixType, Lower> {
typedef const TriangularView<const MatrixType, UnitLower> MatrixL;
typedef const TriangularView<const typename MatrixType::AdjointReturnType, UnitUpper> MatrixU;
static inline MatrixL getL(const MatrixType& m) { return MatrixL(m); }
static inline MatrixU getU(const MatrixType& m) { return MatrixU(m.adjoint()); }
};
template <typename MatrixType>
struct LDLT_Traits<MatrixType, Upper> {
typedef const TriangularView<const typename MatrixType::AdjointReturnType, UnitLower> MatrixL;
typedef const TriangularView<const MatrixType, UnitUpper> MatrixU;
static inline MatrixL getL(const MatrixType& m) { return MatrixL(m.adjoint()); }
static inline MatrixU getU(const MatrixType& m) { return MatrixU(m); }
};
} // end namespace internal
/** Compute / recompute the LDLT decomposition A = L D L^* = U^* D U of \a matrix
*/
template <typename MatrixType, int UpLo_>
template <typename InputType>
LDLT<MatrixType, UpLo_>& LDLT<MatrixType, UpLo_>::compute(const EigenBase<InputType>& a) {
eigen_assert(a.rows() == a.cols());
const Index size = a.rows();
m_matrix = a.derived();
// Compute matrix L1 norm = max abs column sum.
m_l1_norm = RealScalar(0);
// TODO move this code to SelfAdjointView
for (Index col = 0; col < size; ++col) {
RealScalar abs_col_sum;
if (UpLo_ == Lower)
abs_col_sum =
m_matrix.col(col).tail(size - col).template lpNorm<1>() + m_matrix.row(col).head(col).template lpNorm<1>();
else
abs_col_sum =
m_matrix.col(col).head(col).template lpNorm<1>() + m_matrix.row(col).tail(size - col).template lpNorm<1>();
if (abs_col_sum > m_l1_norm) m_l1_norm = abs_col_sum;
}
m_transpositions.resize(size);
m_isInitialized = false;
m_temporary.resize(size);
m_sign = internal::ZeroSign;
m_info = internal::ldlt_inplace<UpLo>::unblocked(m_matrix, m_transpositions, m_temporary, m_sign) ? Success
: NumericalIssue;
m_isInitialized = true;
return *this;
}
/** Computes the solution x of \f$ A x = b \f$ using the current decomposition of A.
* The result is stored in \a result
*
* \returns true in case of success, false otherwise.
*
* In other words, it computes \f$ b = A^{-1} b \f$ with
* \f$ {L^{*}}^{-1} D^{-1} L^{-1} b \f$ from right to left.
*
* \sa LDLT::solveInPlace(), MatrixBase::ldlt()
*/
template<typename MatrixType>
template<typename RhsDerived, typename ResDerived>
bool LDLT<MatrixType>
::solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const
{
const int size = m_matrix.rows();
ei_assert(size==b.rows() && "LLT::solve(): invalid number of rows of the right hand side matrix b");
*result = b;
return solveInPlace(*result);
/** Update the LDLT decomposition: given A = L D L^T, efficiently compute the decomposition of A + sigma w w^T.
* \param w a vector to be incorporated into the decomposition.
* \param sigma a scalar, +1 for updates and -1 for "downdates," which correspond to removing previously-added column
* vectors. Optional; default value is +1. \sa setZero()
*/
template <typename MatrixType, int UpLo_>
template <typename Derived>
LDLT<MatrixType, UpLo_>& LDLT<MatrixType, UpLo_>::rankUpdate(
const MatrixBase<Derived>& w, const typename LDLT<MatrixType, UpLo_>::RealScalar& sigma) {
typedef typename TranspositionType::StorageIndex IndexType;
const Index size = w.rows();
if (m_isInitialized) {
eigen_assert(m_matrix.rows() == size);
} else {
m_matrix.resize(size, size);
m_matrix.setZero();
m_transpositions.resize(size);
for (Index i = 0; i < size; i++) m_transpositions.coeffRef(i) = IndexType(i);
m_temporary.resize(size);
m_sign = sigma >= 0 ? internal::PositiveSemiDef : internal::NegativeSemiDef;
m_isInitialized = true;
}
internal::ldlt_inplace<UpLo>::update(m_matrix, m_transpositions, m_temporary, w, sigma);
return *this;
}
/** This is the \em in-place version of solve().
*
* \param bAndX represents both the right-hand side matrix b and result x.
*
* This version avoids a copy when the right hand side matrix b is not
* needed anymore.
*
* \sa LDLT::solve(), MatrixBase::ldlt()
*/
template<typename MatrixType>
template<typename Derived>
bool LDLT<MatrixType>::solveInPlace(MatrixBase<Derived> &bAndX) const
{
const int size = m_matrix.rows();
ei_assert(size==bAndX.rows());
if (!m_isPositiveDefinite)
return false;
matrixL().solveTriangularInPlace(bAndX);
bAndX = (m_matrix.cwise().inverse().template part<Diagonal>() * bAndX).lazy();
m_matrix.adjoint().template part<UnitUpper>().solveTriangularInPlace(bAndX);
#ifndef EIGEN_PARSED_BY_DOXYGEN
template <typename MatrixType_, int UpLo_>
template <typename RhsType, typename DstType>
void LDLT<MatrixType_, UpLo_>::_solve_impl(const RhsType& rhs, DstType& dst) const {
_solve_impl_transposed<true>(rhs, dst);
}
template <typename MatrixType_, int UpLo_>
template <bool Conjugate, typename RhsType, typename DstType>
void LDLT<MatrixType_, UpLo_>::_solve_impl_transposed(const RhsType& rhs, DstType& dst) const {
// dst = P b
dst = m_transpositions * rhs;
// dst = L^-1 (P b)
// dst = L^-*T (P b)
matrixL().template conjugateIf<!Conjugate>().solveInPlace(dst);
// dst = D^-* (L^-1 P b)
// dst = D^-1 (L^-*T P b)
// more precisely, use pseudo-inverse of D (see bug 241)
using std::abs;
const typename Diagonal<const MatrixType>::RealReturnType vecD(vectorD());
// In some previous versions, tolerance was set to the max of 1/highest (or rather numeric_limits::min())
// and the maximal diagonal entry * epsilon as motivated by LAPACK's xGELSS:
// RealScalar tolerance = numext::maxi(vecD.array().abs().maxCoeff() * NumTraits<RealScalar>::epsilon(),RealScalar(1)
// / NumTraits<RealScalar>::highest()); However, LDLT is not rank revealing, and so adjusting the tolerance wrt to the
// highest diagonal element is not well justified and leads to numerical issues in some cases. Moreover, Lapack's
// xSYTRS routines use 0 for the tolerance. Using numeric_limits::min() gives us more robustness to denormals.
RealScalar tolerance = (std::numeric_limits<RealScalar>::min)();
for (Index i = 0; i < vecD.size(); ++i) {
if (abs(vecD(i)) > tolerance)
dst.row(i) /= vecD(i);
else
dst.row(i).setZero();
}
// dst = L^-* (D^-* L^-1 P b)
// dst = L^-T (D^-1 L^-*T P b)
matrixL().transpose().template conjugateIf<Conjugate>().solveInPlace(dst);
// dst = P^T (L^-* D^-* L^-1 P b) = A^-1 b
// dst = P^-T (L^-T D^-1 L^-*T P b) = A^-1 b
dst = m_transpositions.transpose() * dst;
}
#endif
/** \internal use x = ldlt_object.solve(x);
*
* This is the \em in-place version of solve().
*
* \param bAndX represents both the right-hand side matrix b and result x.
*
* \returns true always! If you need to check for existence of solutions, use another decomposition like LU, QR, or SVD.
*
* This version avoids a copy when the right hand side matrix b is not
* needed anymore.
*
* \sa LDLT::solve(), MatrixBase::ldlt()
*/
template <typename MatrixType, int UpLo_>
template <typename Derived>
bool LDLT<MatrixType, UpLo_>::solveInPlace(MatrixBase<Derived>& bAndX) const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
eigen_assert(m_matrix.rows() == bAndX.rows());
bAndX = this->solve(bAndX);
return true;
}
/** \cholesky_module
* \returns the Cholesky decomposition without square root of \c *this
*/
template<typename Derived>
inline const LDLT<typename MatrixBase<Derived>::EvalType>
MatrixBase<Derived>::ldlt() const
{
return derived();
/** \returns the matrix represented by the decomposition,
* i.e., it returns the product: P^T L D L^* P.
* This function is provided for debug purpose. */
template <typename MatrixType, int UpLo_>
MatrixType LDLT<MatrixType, UpLo_>::reconstructedMatrix() const {
eigen_assert(m_isInitialized && "LDLT is not initialized.");
const Index size = m_matrix.rows();
MatrixType res(size, size);
// P
res.setIdentity();
res = transpositionsP() * res;
// L^* P
res = matrixU() * res;
// D(L^*P)
res = vectorD().real().asDiagonal() * res;
// L(DL^*P)
res = matrixL() * res;
// P^T (LDL^*P)
res = transpositionsP().transpose() * res;
return res;
}
#endif // EIGEN_LDLT_H
/** \cholesky_module
* \returns the Cholesky decomposition with full pivoting without square root of \c *this
* \sa MatrixBase::ldlt()
*/
template <typename MatrixType, unsigned int UpLo>
inline const LDLT<typename SelfAdjointView<MatrixType, UpLo>::PlainObject, UpLo>
SelfAdjointView<MatrixType, UpLo>::ldlt() const {
return LDLT<PlainObject, UpLo>(m_matrix);
}
/** \cholesky_module
* \returns the Cholesky decomposition with full pivoting without square root of \c *this
* \sa SelfAdjointView::ldlt()
*/
template <typename Derived>
inline const LDLT<typename MatrixBase<Derived>::PlainObject> MatrixBase<Derived>::ldlt() const {
return LDLT<PlainObject>(derived());
}
} // end namespace Eigen
#endif // EIGEN_LDLT_H

View File

@@ -1,186 +1,514 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
// for linear algebra.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2008 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_LLT_H
#define EIGEN_LLT_H
/** \ingroup cholesky_Module
*
* \class LLT
*
* \brief Standard Cholesky decomposition (LL^T) of a matrix and associated features
*
* \param MatrixType the type of the matrix of which we are computing the LL^T Cholesky decomposition
*
* This class performs a LL^T Cholesky decomposition of a symmetric, positive definite
* matrix A such that A = LL^* = U^*U, where L is lower triangular.
*
* While the Cholesky decomposition is particularly useful to solve selfadjoint problems like D^*D x = b,
* for that purpose, we recommend the Cholesky decomposition without square root which is more stable
* and even faster. Nevertheless, this standard Cholesky decomposition remains useful in many other
* situations like generalised eigen problems with hermitian matrices.
*
* Note that during the decomposition, only the upper triangular part of A is considered. Therefore,
* the strict lower part does not have to store correct values.
*
* \sa MatrixBase::llt(), class LDLT
*/
template<typename MatrixType> class LLT
{
private:
typedef typename MatrixType::Scalar Scalar;
typedef typename NumTraits<typename MatrixType::Scalar>::Real RealScalar;
typedef Matrix<Scalar, MatrixType::ColsAtCompileTime, 1> VectorType;
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
enum {
PacketSize = ei_packet_traits<Scalar>::size,
AlignmentMask = int(PacketSize)-1
};
namespace Eigen {
public:
namespace internal {
LLT(const MatrixType& matrix)
: m_matrix(matrix.rows(), matrix.cols())
{
compute(matrix);
}
/** \returns the lower triangular matrix L */
inline Part<MatrixType, Lower> matrixL(void) const { return m_matrix; }
/** \returns true if the matrix is positive definite */
inline bool isPositiveDefinite(void) const { return m_isPositiveDefinite; }
template<typename RhsDerived, typename ResDerived>
bool solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const;
template<typename Derived>
bool solveInPlace(MatrixBase<Derived> &bAndX) const;
void compute(const MatrixType& matrix);
protected:
/** \internal
* Used to compute and store L
* The strict upper part is not used and even not initialized.
*/
MatrixType m_matrix;
bool m_isPositiveDefinite;
template <typename MatrixType_, int UpLo_>
struct traits<LLT<MatrixType_, UpLo_> > : traits<MatrixType_> {
typedef MatrixXpr XprKind;
typedef SolverStorage StorageKind;
typedef int StorageIndex;
enum { Flags = 0 };
};
/** Computes / recomputes the Cholesky decomposition A = LL^* = U^*U of \a matrix
*/
template<typename MatrixType>
void LLT<MatrixType>::compute(const MatrixType& a)
{
assert(a.rows()==a.cols());
const int size = a.rows();
m_matrix.resize(size, size);
const RealScalar eps = ei_sqrt(precision<Scalar>());
template <typename MatrixType, int UpLo>
struct LLT_Traits;
} // namespace internal
RealScalar x;
x = ei_real(a.coeff(0,0));
m_isPositiveDefinite = x > eps && ei_isMuchSmallerThan(ei_imag(a.coeff(0,0)), RealScalar(1));
m_matrix.coeffRef(0,0) = ei_sqrt(x);
m_matrix.col(0).end(size-1) = a.row(0).end(size-1).adjoint() / ei_real(m_matrix.coeff(0,0));
for (int j = 1; j < size; ++j)
{
Scalar tmp = ei_real(a.coeff(j,j)) - m_matrix.row(j).start(j).squaredNorm();
x = ei_real(tmp);
if (x < eps || (!ei_isMuchSmallerThan(ei_imag(tmp), RealScalar(1))))
{
m_isPositiveDefinite = false;
return;
/** \ingroup Cholesky_Module
*
* \class LLT
*
* \brief Standard Cholesky decomposition (LL^T) of a matrix and associated features
*
* \tparam MatrixType_ the type of the matrix of which we are computing the LL^T Cholesky decomposition
* \tparam UpLo_ the triangular part that will be used for the decomposition: Lower (default) or Upper.
* The other triangular part won't be read.
*
* This class performs a LL^T Cholesky decomposition of a symmetric, positive definite
* matrix A such that A = LL^* = U^*U, where L is lower triangular.
*
* While the Cholesky decomposition is particularly useful to solve selfadjoint problems like D^*D x = b,
* for that purpose, we recommend the Cholesky decomposition without square root which is more stable
* and even faster. Nevertheless, this standard Cholesky decomposition remains useful in many other
* situations like generalised eigen problems with hermitian matrices.
*
* Remember that Cholesky decompositions are not rank-revealing. This LLT decomposition is only stable on positive
* definite matrices, use LDLT instead for the semidefinite case. Also, do not use a Cholesky decomposition to determine
* whether a system of equations has a solution.
*
* Example: \include LLT_example.cpp
* Output: \verbinclude LLT_example.out
*
* \b Performance: for best performance, it is recommended to use a column-major storage format
* with the Lower triangular part (the default), or, equivalently, a row-major storage format
* with the Upper triangular part. Otherwise, you might get a 20% slowdown for the full factorization
* step, and rank-updates can be up to 3 times slower.
*
* This class supports the \link InplaceDecomposition inplace decomposition \endlink mechanism.
*
* Note that during the decomposition, only the lower (or upper, as defined by UpLo_) triangular part of A is
* considered. Therefore, the strict lower part does not have to store correct values.
*
* \sa MatrixBase::llt(), SelfAdjointView::llt(), class LDLT
*/
template <typename MatrixType_, int UpLo_>
class LLT : public SolverBase<LLT<MatrixType_, UpLo_> > {
public:
typedef MatrixType_ MatrixType;
typedef SolverBase<LLT> Base;
friend class SolverBase<LLT>;
EIGEN_GENERIC_PUBLIC_INTERFACE(LLT)
enum { MaxColsAtCompileTime = MatrixType::MaxColsAtCompileTime };
enum { PacketSize = internal::packet_traits<Scalar>::size, AlignmentMask = int(PacketSize) - 1, UpLo = UpLo_ };
typedef internal::LLT_Traits<MatrixType, UpLo> Traits;
/**
* \brief Default Constructor.
*
* The default constructor is useful in cases in which the user intends to
* perform decompositions via LLT::compute(const MatrixType&).
*/
LLT() : m_matrix(), m_isInitialized(false) {}
/** \brief Default Constructor with memory preallocation
*
* Like the default constructor but with preallocation of the internal data
* according to the specified problem \a size.
* \sa LLT()
*/
explicit LLT(Index size) : m_matrix(size, size), m_isInitialized(false) {}
template <typename InputType>
explicit LLT(const EigenBase<InputType>& matrix) : m_matrix(matrix.rows(), matrix.cols()), m_isInitialized(false) {
compute(matrix.derived());
}
/** \brief Constructs a LLT factorization from a given matrix
*
* This overloaded constructor is provided for \link InplaceDecomposition inplace decomposition \endlink when
* \c MatrixType is a Eigen::Ref.
*
* \sa LLT(const EigenBase&)
*/
template <typename InputType>
explicit LLT(EigenBase<InputType>& matrix) : m_matrix(matrix.derived()), m_isInitialized(false) {
compute(matrix.derived());
}
/** \returns a view of the upper triangular matrix U */
inline typename Traits::MatrixU matrixU() const {
eigen_assert(m_isInitialized && "LLT is not initialized.");
return Traits::getU(m_matrix);
}
/** \returns a view of the lower triangular matrix L */
inline typename Traits::MatrixL matrixL() const {
eigen_assert(m_isInitialized && "LLT is not initialized.");
return Traits::getL(m_matrix);
}
#ifdef EIGEN_PARSED_BY_DOXYGEN
/** \returns the solution x of \f$ A x = b \f$ using the current decomposition of A.
*
* Since this LLT class assumes anyway that the matrix A is invertible, the solution
* theoretically exists and is unique regardless of b.
*
* Example: \include LLT_solve.cpp
* Output: \verbinclude LLT_solve.out
*
* \sa solveInPlace(), MatrixBase::llt(), SelfAdjointView::llt()
*/
template <typename Rhs>
inline const Solve<LLT, Rhs> solve(const MatrixBase<Rhs>& b) const;
#endif
template <typename Derived>
void solveInPlace(const MatrixBase<Derived>& bAndX) const;
template <typename InputType>
LLT& compute(const EigenBase<InputType>& matrix);
/** \returns an estimate of the reciprocal condition number of the matrix of
* which \c *this is the Cholesky decomposition.
*/
RealScalar rcond() const {
eigen_assert(m_isInitialized && "LLT is not initialized.");
eigen_assert(m_info == Success && "LLT failed because matrix appears to be negative");
return internal::rcond_estimate_helper(m_l1_norm, *this);
}
/** \returns the LLT decomposition matrix
*
* TODO: document the storage layout
*/
inline const MatrixType& matrixLLT() const {
eigen_assert(m_isInitialized && "LLT is not initialized.");
return m_matrix;
}
MatrixType reconstructedMatrix() const;
/** \brief Reports whether previous computation was successful.
*
* \returns \c Success if computation was successful,
* \c NumericalIssue if the matrix.appears not to be positive definite.
*/
ComputationInfo info() const {
eigen_assert(m_isInitialized && "LLT is not initialized.");
return m_info;
}
/** \returns the adjoint of \c *this, that is, a const reference to the decomposition itself as the underlying matrix
* is self-adjoint.
*
* This method is provided for compatibility with other matrix decompositions, thus enabling generic code such as:
* \code x = decomposition.adjoint().solve(b) \endcode
*/
const LLT& adjoint() const noexcept { return *this; }
constexpr Index rows() const noexcept { return m_matrix.rows(); }
constexpr Index cols() const noexcept { return m_matrix.cols(); }
template <typename VectorType>
LLT& rankUpdate(const VectorType& vec, const RealScalar& sigma = 1);
#ifndef EIGEN_PARSED_BY_DOXYGEN
template <typename RhsType, typename DstType>
void _solve_impl(const RhsType& rhs, DstType& dst) const;
template <bool Conjugate, typename RhsType, typename DstType>
void _solve_impl_transposed(const RhsType& rhs, DstType& dst) const;
#endif
protected:
EIGEN_STATIC_ASSERT_NON_INTEGER(Scalar)
/** \internal
* Used to compute and store L
* The strict upper part is not used and even not initialized.
*/
MatrixType m_matrix;
RealScalar m_l1_norm;
bool m_isInitialized;
ComputationInfo m_info;
};
namespace internal {
template <typename Scalar, int UpLo>
struct llt_inplace;
template <typename MatrixType, typename VectorType>
static Index llt_rank_update_lower(MatrixType& mat, const VectorType& vec,
const typename MatrixType::RealScalar& sigma) {
using std::sqrt;
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::RealScalar RealScalar;
typedef typename MatrixType::ColXpr ColXpr;
typedef internal::remove_all_t<ColXpr> ColXprCleaned;
typedef typename ColXprCleaned::SegmentReturnType ColXprSegment;
typedef Matrix<Scalar, Dynamic, 1> TempVectorType;
typedef typename TempVectorType::SegmentReturnType TempVecSegment;
Index n = mat.cols();
eigen_assert(mat.rows() == n && vec.size() == n);
TempVectorType temp;
if (sigma > 0) {
// This version is based on Givens rotations.
// It is faster than the other one below, but only works for updates,
// i.e., for sigma > 0
temp = sqrt(sigma) * vec;
for (Index i = 0; i < n; ++i) {
JacobiRotation<Scalar> g;
g.makeGivens(mat(i, i), -temp(i), &mat(i, i));
Index rs = n - i - 1;
if (rs > 0) {
ColXprSegment x(mat.col(i).tail(rs));
TempVecSegment y(temp.tail(rs));
apply_rotation_in_the_plane(x, y, g);
}
}
m_matrix.coeffRef(j,j) = x = ei_sqrt(x);
} else {
temp = vec;
RealScalar beta = 1;
for (Index j = 0; j < n; ++j) {
RealScalar Ljj = numext::real(mat.coeff(j, j));
RealScalar dj = numext::abs2(Ljj);
Scalar wj = temp.coeff(j);
RealScalar swj2 = sigma * numext::abs2(wj);
RealScalar gamma = dj * beta + swj2;
int endSize = size-j-1;
if (endSize>0) {
// Note that when all matrix columns have good alignment, then the following
// product is guaranteed to be optimal with respect to alignment.
m_matrix.col(j).end(endSize) =
(m_matrix.block(j+1, 0, endSize, j) * m_matrix.row(j).start(j).adjoint()).lazy();
RealScalar x = dj + swj2 / beta;
if (x <= RealScalar(0)) return j;
RealScalar nLjj = sqrt(x);
mat.coeffRef(j, j) = nLjj;
beta += swj2 / dj;
// FIXME could use a.col instead of a.row
m_matrix.col(j).end(endSize) = (a.row(j).end(endSize).adjoint()
- m_matrix.col(j).end(endSize) ) / x;
// Update the terms of L
Index rs = n - j - 1;
if (rs) {
temp.tail(rs) -= (wj / Ljj) * mat.col(j).tail(rs);
if (!numext::is_exactly_zero(gamma))
mat.col(j).tail(rs) =
(nLjj / Ljj) * mat.col(j).tail(rs) + (nLjj * sigma * numext::conj(wj) / gamma) * temp.tail(rs);
}
}
}
return -1;
}
/** Computes the solution x of \f$ A x = b \f$ using the current decomposition of A.
* The result is stored in \a result
*
* \returns true in case of success, false otherwise.
*
* In other words, it computes \f$ b = A^{-1} b \f$ with
* \f$ {L^{*}}^{-1} L^{-1} b \f$ from right to left.
*
* Example: \include LLT_solve.cpp
* Output: \verbinclude LLT_solve.out
*
* \sa LLT::solveInPlace(), MatrixBase::llt()
*/
template<typename MatrixType>
template<typename RhsDerived, typename ResDerived>
bool LLT<MatrixType>::solve(const MatrixBase<RhsDerived> &b, MatrixBase<ResDerived> *result) const
{
const int size = m_matrix.rows();
ei_assert(size==b.rows() && "LLT::solve(): invalid number of rows of the right hand side matrix b");
return solveInPlace((*result) = b);
template <typename Scalar>
struct llt_inplace<Scalar, Lower> {
typedef typename NumTraits<Scalar>::Real RealScalar;
template <typename MatrixType>
static Index unblocked(MatrixType& mat) {
using std::sqrt;
eigen_assert(mat.rows() == mat.cols());
const Index size = mat.rows();
for (Index k = 0; k < size; ++k) {
Index rs = size - k - 1; // remaining size
Block<MatrixType, Dynamic, 1> A21(mat, k + 1, k, rs, 1);
Block<MatrixType, 1, Dynamic> A10(mat, k, 0, 1, k);
Block<MatrixType, Dynamic, Dynamic> A20(mat, k + 1, 0, rs, k);
RealScalar x = numext::real(mat.coeff(k, k));
if (k > 0) x -= A10.squaredNorm();
if (x <= RealScalar(0)) return k;
mat.coeffRef(k, k) = x = sqrt(x);
if (k > 0 && rs > 0) A21.noalias() -= A20 * A10.adjoint();
if (rs > 0) A21 /= x;
}
return -1;
}
template <typename MatrixType>
static Index blocked(MatrixType& m) {
eigen_assert(m.rows() == m.cols());
Index size = m.rows();
if (size < 32) return unblocked(m);
Index blockSize = size / 8;
blockSize = (blockSize / 16) * 16;
blockSize = (std::min)((std::max)(blockSize, Index(8)), Index(128));
for (Index k = 0; k < size; k += blockSize) {
// partition the matrix:
// A00 | - | -
// lu = A10 | A11 | -
// A20 | A21 | A22
Index bs = (std::min)(blockSize, size - k);
Index rs = size - k - bs;
Block<MatrixType, Dynamic, Dynamic> A11(m, k, k, bs, bs);
Block<MatrixType, Dynamic, Dynamic> A21(m, k + bs, k, rs, bs);
Block<MatrixType, Dynamic, Dynamic> A22(m, k + bs, k + bs, rs, rs);
Index ret;
if ((ret = unblocked(A11)) >= 0) return k + ret;
if (rs > 0) A11.adjoint().template triangularView<Upper>().template solveInPlace<OnTheRight>(A21);
if (rs > 0)
A22.template selfadjointView<Lower>().rankUpdate(A21,
typename NumTraits<RealScalar>::Literal(-1)); // bottleneck
}
return -1;
}
template <typename MatrixType, typename VectorType>
static Index rankUpdate(MatrixType& mat, const VectorType& vec, const RealScalar& sigma) {
return Eigen::internal::llt_rank_update_lower(mat, vec, sigma);
}
};
template <typename Scalar>
struct llt_inplace<Scalar, Upper> {
typedef typename NumTraits<Scalar>::Real RealScalar;
template <typename MatrixType>
static EIGEN_STRONG_INLINE Index unblocked(MatrixType& mat) {
Transpose<MatrixType> matt(mat);
return llt_inplace<Scalar, Lower>::unblocked(matt);
}
template <typename MatrixType>
static EIGEN_STRONG_INLINE Index blocked(MatrixType& mat) {
Transpose<MatrixType> matt(mat);
return llt_inplace<Scalar, Lower>::blocked(matt);
}
template <typename MatrixType, typename VectorType>
static Index rankUpdate(MatrixType& mat, const VectorType& vec, const RealScalar& sigma) {
Transpose<MatrixType> matt(mat);
return llt_inplace<Scalar, Lower>::rankUpdate(matt, vec.conjugate(), sigma);
}
};
template <typename MatrixType>
struct LLT_Traits<MatrixType, Lower> {
typedef const TriangularView<const MatrixType, Lower> MatrixL;
typedef const TriangularView<const typename MatrixType::AdjointReturnType, Upper> MatrixU;
static inline MatrixL getL(const MatrixType& m) { return MatrixL(m); }
static inline MatrixU getU(const MatrixType& m) { return MatrixU(m.adjoint()); }
static bool inplace_decomposition(MatrixType& m) {
return llt_inplace<typename MatrixType::Scalar, Lower>::blocked(m) == -1;
}
};
template <typename MatrixType>
struct LLT_Traits<MatrixType, Upper> {
typedef const TriangularView<const typename MatrixType::AdjointReturnType, Lower> MatrixL;
typedef const TriangularView<const MatrixType, Upper> MatrixU;
static inline MatrixL getL(const MatrixType& m) { return MatrixL(m.adjoint()); }
static inline MatrixU getU(const MatrixType& m) { return MatrixU(m); }
static bool inplace_decomposition(MatrixType& m) {
return llt_inplace<typename MatrixType::Scalar, Upper>::blocked(m) == -1;
}
};
} // end namespace internal
/** Computes / recomputes the Cholesky decomposition A = LL^* = U^*U of \a matrix
*
* \returns a reference to *this
*
* Example: \include TutorialLinAlgComputeTwice.cpp
* Output: \verbinclude TutorialLinAlgComputeTwice.out
*/
template <typename MatrixType, int UpLo_>
template <typename InputType>
LLT<MatrixType, UpLo_>& LLT<MatrixType, UpLo_>::compute(const EigenBase<InputType>& a) {
eigen_assert(a.rows() == a.cols());
const Index size = a.rows();
m_matrix.resize(size, size);
if (!internal::is_same_dense(m_matrix, a.derived())) m_matrix = a.derived();
// Compute matrix L1 norm = max abs column sum.
m_l1_norm = RealScalar(0);
// TODO move this code to SelfAdjointView
for (Index col = 0; col < size; ++col) {
RealScalar abs_col_sum;
if (UpLo_ == Lower)
abs_col_sum =
m_matrix.col(col).tail(size - col).template lpNorm<1>() + m_matrix.row(col).head(col).template lpNorm<1>();
else
abs_col_sum =
m_matrix.col(col).head(col).template lpNorm<1>() + m_matrix.row(col).tail(size - col).template lpNorm<1>();
if (abs_col_sum > m_l1_norm) m_l1_norm = abs_col_sum;
}
m_isInitialized = true;
bool ok = Traits::inplace_decomposition(m_matrix);
m_info = ok ? Success : NumericalIssue;
return *this;
}
/** This is the \em in-place version of solve().
*
* \param bAndX represents both the right-hand side matrix b and result x.
*
* This version avoids a copy when the right hand side matrix b is not
* needed anymore.
*
* \sa LLT::solve(), MatrixBase::llt()
*/
template<typename MatrixType>
template<typename Derived>
bool LLT<MatrixType>::solveInPlace(MatrixBase<Derived> &bAndX) const
{
const int size = m_matrix.rows();
ei_assert(size==bAndX.rows());
if (!m_isPositiveDefinite)
return false;
matrixL().solveTriangularInPlace(bAndX);
m_matrix.adjoint().template part<Upper>().solveTriangularInPlace(bAndX);
return true;
/** Performs a rank one update (or dowdate) of the current decomposition.
* If A = LL^* before the rank one update,
* then after it we have LL^* = A + sigma * v v^* where \a v must be a vector
* of same dimension.
*/
template <typename MatrixType_, int UpLo_>
template <typename VectorType>
LLT<MatrixType_, UpLo_>& LLT<MatrixType_, UpLo_>::rankUpdate(const VectorType& v, const RealScalar& sigma) {
EIGEN_STATIC_ASSERT_VECTOR_ONLY(VectorType);
eigen_assert(v.size() == m_matrix.cols());
eigen_assert(m_isInitialized);
if (internal::llt_inplace<typename MatrixType::Scalar, UpLo>::rankUpdate(m_matrix, v, sigma) >= 0)
m_info = NumericalIssue;
else
m_info = Success;
return *this;
}
#ifndef EIGEN_PARSED_BY_DOXYGEN
template <typename MatrixType_, int UpLo_>
template <typename RhsType, typename DstType>
void LLT<MatrixType_, UpLo_>::_solve_impl(const RhsType& rhs, DstType& dst) const {
_solve_impl_transposed<true>(rhs, dst);
}
template <typename MatrixType_, int UpLo_>
template <bool Conjugate, typename RhsType, typename DstType>
void LLT<MatrixType_, UpLo_>::_solve_impl_transposed(const RhsType& rhs, DstType& dst) const {
dst = rhs;
matrixL().template conjugateIf<!Conjugate>().solveInPlace(dst);
matrixU().template conjugateIf<!Conjugate>().solveInPlace(dst);
}
#endif
/** \internal use x = llt_object.solve(x);
*
* This is the \em in-place version of solve().
*
* \param bAndX represents both the right-hand side matrix b and result x.
*
* This version avoids a copy when the right hand side matrix b is not needed anymore.
*
* \warning The parameter is only marked 'const' to make the C++ compiler accept a temporary expression here.
* This function will const_cast it, so constness isn't honored here.
*
* \sa LLT::solve(), MatrixBase::llt()
*/
template <typename MatrixType, int UpLo_>
template <typename Derived>
void LLT<MatrixType, UpLo_>::solveInPlace(const MatrixBase<Derived>& bAndX) const {
eigen_assert(m_isInitialized && "LLT is not initialized.");
eigen_assert(m_matrix.rows() == bAndX.rows());
matrixL().solveInPlace(bAndX);
matrixU().solveInPlace(bAndX);
}
/** \returns the matrix represented by the decomposition,
* i.e., it returns the product: L L^*.
* This function is provided for debug purpose. */
template <typename MatrixType, int UpLo_>
MatrixType LLT<MatrixType, UpLo_>::reconstructedMatrix() const {
eigen_assert(m_isInitialized && "LLT is not initialized.");
return matrixL() * matrixL().adjoint().toDenseMatrix();
}
/** \cholesky_module
* \returns the LLT decomposition of \c *this
*/
template<typename Derived>
inline const LLT<typename MatrixBase<Derived>::EvalType>
MatrixBase<Derived>::llt() const
{
return LLT<typename ei_eval<Derived>::type>(derived());
* \returns the LLT decomposition of \c *this
* \sa SelfAdjointView::llt()
*/
template <typename Derived>
inline const LLT<typename MatrixBase<Derived>::PlainObject> MatrixBase<Derived>::llt() const {
return LLT<PlainObject>(derived());
}
#endif // EIGEN_LLT_H
/** \cholesky_module
* \returns the LLT decomposition of \c *this
* \sa SelfAdjointView::llt()
*/
template <typename MatrixType, unsigned int UpLo>
inline const LLT<typename SelfAdjointView<MatrixType, UpLo>::PlainObject, UpLo> SelfAdjointView<MatrixType, UpLo>::llt()
const {
return LLT<PlainObject, UpLo>(m_matrix);
}
} // end namespace Eigen
#endif // EIGEN_LLT_H

View File

@@ -0,0 +1,124 @@
/*
Copyright (c) 2011, Intel Corporation. All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of Intel Corporation nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
********************************************************************************
* Content : Eigen bindings to LAPACKe
* LLt decomposition based on LAPACKE_?potrf function.
********************************************************************************
*/
#ifndef EIGEN_LLT_LAPACKE_H
#define EIGEN_LLT_LAPACKE_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
namespace lapacke_helpers {
// -------------------------------------------------------------------------------------------------------------------
// Dispatch for rank update handling upper and lower parts
// -------------------------------------------------------------------------------------------------------------------
template <UpLoType Mode>
struct rank_update {};
template <>
struct rank_update<Lower> {
template <typename MatrixType, typename VectorType>
static Index run(MatrixType &mat, const VectorType &vec, const typename MatrixType::RealScalar &sigma) {
return Eigen::internal::llt_rank_update_lower(mat, vec, sigma);
}
};
template <>
struct rank_update<Upper> {
template <typename MatrixType, typename VectorType>
static Index run(MatrixType &mat, const VectorType &vec, const typename MatrixType::RealScalar &sigma) {
Transpose<MatrixType> matt(mat);
return Eigen::internal::llt_rank_update_lower(matt, vec.conjugate(), sigma);
}
};
// -------------------------------------------------------------------------------------------------------------------
// Generic lapacke llt implementation that hands of to the dispatches
// -------------------------------------------------------------------------------------------------------------------
template <typename Scalar, UpLoType Mode>
struct lapacke_llt {
EIGEN_STATIC_ASSERT(((Mode == Lower) || (Mode == Upper)), MODE_MUST_BE_UPPER_OR_LOWER)
template <typename MatrixType>
static Index blocked(MatrixType &m) {
eigen_assert(m.rows() == m.cols());
if (m.rows() == 0) {
return -1;
}
/* Set up parameters for ?potrf */
lapack_int size = to_lapack(m.rows());
lapack_int matrix_order = lapack_storage_of(m);
constexpr char uplo = Mode == Upper ? 'U' : 'L';
Scalar *a = &(m.coeffRef(0, 0));
lapack_int lda = to_lapack(m.outerStride());
lapack_int info = potrf(matrix_order, uplo, size, to_lapack(a), lda);
info = (info == 0) ? -1 : info > 0 ? info - 1 : size;
return info;
}
template <typename MatrixType, typename VectorType>
static Index rankUpdate(MatrixType &mat, const VectorType &vec, const typename MatrixType::RealScalar &sigma) {
return rank_update<Mode>::run(mat, vec, sigma);
}
};
} // namespace lapacke_helpers
// end namespace lapacke_helpers
/*
* Here, we just put the generic implementation from lapacke_llt into a full specialization of the llt_inplace
* type. By being a full specialization, the versions defined here thus get precedence over the generic implementation
* in LLT.h for double, float and complex double, complex float types.
*/
#define EIGEN_LAPACKE_LLT(EIGTYPE) \
template <> \
struct llt_inplace<EIGTYPE, Lower> : public lapacke_helpers::lapacke_llt<EIGTYPE, Lower> {}; \
template <> \
struct llt_inplace<EIGTYPE, Upper> : public lapacke_helpers::lapacke_llt<EIGTYPE, Upper> {};
EIGEN_LAPACKE_LLT(double)
EIGEN_LAPACKE_LLT(float)
EIGEN_LAPACKE_LLT(std::complex<double>)
EIGEN_LAPACKE_LLT(std::complex<float>)
#undef EIGEN_LAPACKE_LLT
} // end namespace internal
} // end namespace Eigen
#endif // EIGEN_LLT_LAPACKE_H

View File

@@ -0,0 +1,738 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2008-2010 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_CHOLMODSUPPORT_H
#define EIGEN_CHOLMODSUPPORT_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename Scalar>
struct cholmod_configure_matrix;
template <>
struct cholmod_configure_matrix<double> {
template <typename CholmodType>
static void run(CholmodType& mat) {
mat.xtype = CHOLMOD_REAL;
mat.dtype = CHOLMOD_DOUBLE;
}
};
template <>
struct cholmod_configure_matrix<std::complex<double> > {
template <typename CholmodType>
static void run(CholmodType& mat) {
mat.xtype = CHOLMOD_COMPLEX;
mat.dtype = CHOLMOD_DOUBLE;
}
};
// Other scalar types are not yet supported by Cholmod
// template<> struct cholmod_configure_matrix<float> {
// template<typename CholmodType>
// static void run(CholmodType& mat) {
// mat.xtype = CHOLMOD_REAL;
// mat.dtype = CHOLMOD_SINGLE;
// }
// };
//
// template<> struct cholmod_configure_matrix<std::complex<float> > {
// template<typename CholmodType>
// static void run(CholmodType& mat) {
// mat.xtype = CHOLMOD_COMPLEX;
// mat.dtype = CHOLMOD_SINGLE;
// }
// };
} // namespace internal
/** Wraps the Eigen sparse matrix \a mat into a Cholmod sparse matrix object.
* Note that the data are shared.
*/
template <typename Scalar_, int Options_, typename StorageIndex_>
cholmod_sparse viewAsCholmod(Ref<SparseMatrix<Scalar_, Options_, StorageIndex_> > mat) {
cholmod_sparse res;
res.nzmax = mat.nonZeros();
res.nrow = mat.rows();
res.ncol = mat.cols();
res.p = mat.outerIndexPtr();
res.i = mat.innerIndexPtr();
res.x = mat.valuePtr();
res.z = 0;
res.sorted = 1;
if (mat.isCompressed()) {
res.packed = 1;
res.nz = 0;
} else {
res.packed = 0;
res.nz = mat.innerNonZeroPtr();
}
res.dtype = 0;
res.stype = -1;
if (internal::is_same<StorageIndex_, int>::value) {
res.itype = CHOLMOD_INT;
} else if (internal::is_same<StorageIndex_, SuiteSparse_long>::value) {
res.itype = CHOLMOD_LONG;
} else {
eigen_assert(false && "Index type not supported yet");
}
// setup res.xtype
internal::cholmod_configure_matrix<Scalar_>::run(res);
res.stype = 0;
return res;
}
template <typename Scalar_, int Options_, typename Index_>
const cholmod_sparse viewAsCholmod(const SparseMatrix<Scalar_, Options_, Index_>& mat) {
cholmod_sparse res = viewAsCholmod(Ref<SparseMatrix<Scalar_, Options_, Index_> >(mat.const_cast_derived()));
return res;
}
template <typename Scalar_, int Options_, typename Index_>
const cholmod_sparse viewAsCholmod(const SparseVector<Scalar_, Options_, Index_>& mat) {
cholmod_sparse res = viewAsCholmod(Ref<SparseMatrix<Scalar_, Options_, Index_> >(mat.const_cast_derived()));
return res;
}
/** Returns a view of the Eigen sparse matrix \a mat as Cholmod sparse matrix.
* The data are not copied but shared. */
template <typename Scalar_, int Options_, typename Index_, unsigned int UpLo>
cholmod_sparse viewAsCholmod(const SparseSelfAdjointView<const SparseMatrix<Scalar_, Options_, Index_>, UpLo>& mat) {
cholmod_sparse res = viewAsCholmod(Ref<SparseMatrix<Scalar_, Options_, Index_> >(mat.matrix().const_cast_derived()));
if (UpLo == Upper) res.stype = 1;
if (UpLo == Lower) res.stype = -1;
// swap stype for rowmajor matrices (only works for real matrices)
EIGEN_STATIC_ASSERT((Options_ & RowMajorBit) == 0 || NumTraits<Scalar_>::IsComplex == 0,
THIS_METHOD_IS_ONLY_FOR_COLUMN_MAJOR_MATRICES);
if (Options_ & RowMajorBit) res.stype *= -1;
return res;
}
/** Returns a view of the Eigen \b dense matrix \a mat as Cholmod dense matrix.
* The data are not copied but shared. */
template <typename Derived>
cholmod_dense viewAsCholmod(MatrixBase<Derived>& mat) {
EIGEN_STATIC_ASSERT((internal::traits<Derived>::Flags & RowMajorBit) == 0,
THIS_METHOD_IS_ONLY_FOR_COLUMN_MAJOR_MATRICES);
typedef typename Derived::Scalar Scalar;
cholmod_dense res;
res.nrow = mat.rows();
res.ncol = mat.cols();
res.nzmax = res.nrow * res.ncol;
res.d = Derived::IsVectorAtCompileTime ? mat.derived().size() : mat.derived().outerStride();
res.x = (void*)(mat.derived().data());
res.z = 0;
internal::cholmod_configure_matrix<Scalar>::run(res);
return res;
}
/** Returns a view of the Cholmod sparse matrix \a cm as an Eigen sparse matrix.
* The data are not copied but shared. */
template <typename Scalar, typename StorageIndex>
Map<const SparseMatrix<Scalar, ColMajor, StorageIndex> > viewAsEigen(cholmod_sparse& cm) {
return Map<const SparseMatrix<Scalar, ColMajor, StorageIndex> >(
cm.nrow, cm.ncol, static_cast<StorageIndex*>(cm.p)[cm.ncol], static_cast<StorageIndex*>(cm.p),
static_cast<StorageIndex*>(cm.i), static_cast<Scalar*>(cm.x));
}
/** Returns a view of the Cholmod sparse matrix factor \a cm as an Eigen sparse matrix.
* The data are not copied but shared. */
template <typename Scalar, typename StorageIndex>
Map<const SparseMatrix<Scalar, ColMajor, StorageIndex> > viewAsEigen(cholmod_factor& cm) {
return Map<const SparseMatrix<Scalar, ColMajor, StorageIndex> >(
cm.n, cm.n, static_cast<StorageIndex*>(cm.p)[cm.n], static_cast<StorageIndex*>(cm.p),
static_cast<StorageIndex*>(cm.i), static_cast<Scalar*>(cm.x));
}
namespace internal {
// template specializations for int and long that call the correct cholmod method
#define EIGEN_CHOLMOD_SPECIALIZE0(ret, name) \
template <typename StorageIndex_> \
inline ret cm_##name(cholmod_common& Common) { \
return cholmod_##name(&Common); \
} \
template <> \
inline ret cm_##name<SuiteSparse_long>(cholmod_common & Common) { \
return cholmod_l_##name(&Common); \
}
#define EIGEN_CHOLMOD_SPECIALIZE1(ret, name, t1, a1) \
template <typename StorageIndex_> \
inline ret cm_##name(t1& a1, cholmod_common& Common) { \
return cholmod_##name(&a1, &Common); \
} \
template <> \
inline ret cm_##name<SuiteSparse_long>(t1 & a1, cholmod_common & Common) { \
return cholmod_l_##name(&a1, &Common); \
}
EIGEN_CHOLMOD_SPECIALIZE0(int, start)
EIGEN_CHOLMOD_SPECIALIZE0(int, finish)
EIGEN_CHOLMOD_SPECIALIZE1(int, free_factor, cholmod_factor*, L)
EIGEN_CHOLMOD_SPECIALIZE1(int, free_dense, cholmod_dense*, X)
EIGEN_CHOLMOD_SPECIALIZE1(int, free_sparse, cholmod_sparse*, A)
EIGEN_CHOLMOD_SPECIALIZE1(cholmod_factor*, analyze, cholmod_sparse, A)
EIGEN_CHOLMOD_SPECIALIZE1(cholmod_sparse*, factor_to_sparse, cholmod_factor, L)
template <typename StorageIndex_>
inline cholmod_dense* cm_solve(int sys, cholmod_factor& L, cholmod_dense& B, cholmod_common& Common) {
return cholmod_solve(sys, &L, &B, &Common);
}
template <>
inline cholmod_dense* cm_solve<SuiteSparse_long>(int sys, cholmod_factor& L, cholmod_dense& B, cholmod_common& Common) {
return cholmod_l_solve(sys, &L, &B, &Common);
}
template <typename StorageIndex_>
inline cholmod_sparse* cm_spsolve(int sys, cholmod_factor& L, cholmod_sparse& B, cholmod_common& Common) {
return cholmod_spsolve(sys, &L, &B, &Common);
}
template <>
inline cholmod_sparse* cm_spsolve<SuiteSparse_long>(int sys, cholmod_factor& L, cholmod_sparse& B,
cholmod_common& Common) {
return cholmod_l_spsolve(sys, &L, &B, &Common);
}
template <typename StorageIndex_>
inline int cm_factorize_p(cholmod_sparse* A, double beta[2], StorageIndex_* fset, std::size_t fsize, cholmod_factor* L,
cholmod_common& Common) {
return cholmod_factorize_p(A, beta, fset, fsize, L, &Common);
}
template <>
inline int cm_factorize_p<SuiteSparse_long>(cholmod_sparse* A, double beta[2], SuiteSparse_long* fset,
std::size_t fsize, cholmod_factor* L, cholmod_common& Common) {
return cholmod_l_factorize_p(A, beta, fset, fsize, L, &Common);
}
#undef EIGEN_CHOLMOD_SPECIALIZE0
#undef EIGEN_CHOLMOD_SPECIALIZE1
} // namespace internal
enum CholmodMode { CholmodAuto, CholmodSimplicialLLt, CholmodSupernodalLLt, CholmodLDLt };
/** \ingroup CholmodSupport_Module
* \class CholmodBase
* \brief The base class for the direct Cholesky factorization of Cholmod
* \sa class CholmodSupernodalLLT, class CholmodSimplicialLDLT, class CholmodSimplicialLLT
*/
template <typename MatrixType_, int UpLo_, typename Derived>
class CholmodBase : public SparseSolverBase<Derived> {
protected:
typedef SparseSolverBase<Derived> Base;
using Base::derived;
using Base::m_isInitialized;
public:
typedef MatrixType_ MatrixType;
enum { UpLo = UpLo_ };
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::RealScalar RealScalar;
typedef MatrixType CholMatrixType;
typedef typename MatrixType::StorageIndex StorageIndex;
enum { ColsAtCompileTime = MatrixType::ColsAtCompileTime, MaxColsAtCompileTime = MatrixType::MaxColsAtCompileTime };
public:
CholmodBase() : m_cholmodFactor(0), m_info(Success), m_factorizationIsOk(false), m_analysisIsOk(false) {
EIGEN_STATIC_ASSERT((internal::is_same<double, RealScalar>::value), CHOLMOD_SUPPORTS_DOUBLE_PRECISION_ONLY);
m_shiftOffset[0] = m_shiftOffset[1] = 0.0;
internal::cm_start<StorageIndex>(m_cholmod);
}
explicit CholmodBase(const MatrixType& matrix)
: m_cholmodFactor(0), m_info(Success), m_factorizationIsOk(false), m_analysisIsOk(false) {
EIGEN_STATIC_ASSERT((internal::is_same<double, RealScalar>::value), CHOLMOD_SUPPORTS_DOUBLE_PRECISION_ONLY);
m_shiftOffset[0] = m_shiftOffset[1] = 0.0;
internal::cm_start<StorageIndex>(m_cholmod);
compute(matrix);
}
~CholmodBase() {
if (m_cholmodFactor) internal::cm_free_factor<StorageIndex>(m_cholmodFactor, m_cholmod);
internal::cm_finish<StorageIndex>(m_cholmod);
}
inline StorageIndex cols() const { return internal::convert_index<StorageIndex, Index>(m_cholmodFactor->n); }
inline StorageIndex rows() const { return internal::convert_index<StorageIndex, Index>(m_cholmodFactor->n); }
/** \brief Reports whether previous computation was successful.
*
* \returns \c Success if computation was successful,
* \c NumericalIssue if the matrix.appears to be negative.
*/
ComputationInfo info() const {
eigen_assert(m_isInitialized && "Decomposition is not initialized.");
return m_info;
}
/** Computes the sparse Cholesky decomposition of \a matrix */
Derived& compute(const MatrixType& matrix) {
analyzePattern(matrix);
factorize(matrix);
return derived();
}
/** Performs a symbolic decomposition on the sparsity pattern of \a matrix.
*
* This function is particularly useful when solving for several problems having the same structure.
*
* \sa factorize()
*/
void analyzePattern(const MatrixType& matrix) {
if (m_cholmodFactor) {
internal::cm_free_factor<StorageIndex>(m_cholmodFactor, m_cholmod);
m_cholmodFactor = 0;
}
cholmod_sparse A = viewAsCholmod(matrix.template selfadjointView<UpLo>());
m_cholmodFactor = internal::cm_analyze<StorageIndex>(A, m_cholmod);
this->m_isInitialized = true;
this->m_info = Success;
m_analysisIsOk = true;
m_factorizationIsOk = false;
}
/** Performs a numeric decomposition of \a matrix
*
* The given matrix must have the same sparsity pattern as the matrix on which the symbolic decomposition has been
* performed.
*
* \sa analyzePattern()
*/
void factorize(const MatrixType& matrix) {
eigen_assert(m_analysisIsOk && "You must first call analyzePattern()");
cholmod_sparse A = viewAsCholmod(matrix.template selfadjointView<UpLo>());
internal::cm_factorize_p<StorageIndex>(&A, m_shiftOffset, 0, 0, m_cholmodFactor, m_cholmod);
// If the factorization failed, either the input matrix was zero (so m_cholmodFactor == nullptr), or minor is the
// column at which it failed. On success minor == n.
this->m_info =
(m_cholmodFactor != nullptr && m_cholmodFactor->minor == m_cholmodFactor->n ? Success : NumericalIssue);
m_factorizationIsOk = true;
}
/** Returns a reference to the Cholmod's configuration structure to get a full control over the performed operations.
* See the Cholmod user guide for details. */
cholmod_common& cholmod() { return m_cholmod; }
#ifndef EIGEN_PARSED_BY_DOXYGEN
/** \internal */
template <typename Rhs, typename Dest>
void _solve_impl(const MatrixBase<Rhs>& b, MatrixBase<Dest>& dest) const {
eigen_assert(m_factorizationIsOk &&
"The decomposition is not in a valid state for solving, you must first call either compute() or "
"symbolic()/numeric()");
const Index size = m_cholmodFactor->n;
EIGEN_UNUSED_VARIABLE(size);
eigen_assert(size == b.rows());
// Cholmod needs column-major storage without inner-stride, which corresponds to the default behavior of Ref.
Ref<const Matrix<typename Rhs::Scalar, Dynamic, Dynamic, ColMajor> > b_ref(b.derived());
cholmod_dense b_cd = viewAsCholmod(b_ref);
cholmod_dense* x_cd = internal::cm_solve<StorageIndex>(CHOLMOD_A, *m_cholmodFactor, b_cd, m_cholmod);
if (!x_cd) {
this->m_info = NumericalIssue;
return;
}
// TODO optimize this copy by swapping when possible (be careful with alignment, etc.)
// NOTE Actually, the copy can be avoided by calling cholmod_solve2 instead of cholmod_solve
dest = Matrix<Scalar, Dest::RowsAtCompileTime, Dest::ColsAtCompileTime>::Map(reinterpret_cast<Scalar*>(x_cd->x),
b.rows(), b.cols());
internal::cm_free_dense<StorageIndex>(x_cd, m_cholmod);
}
/** \internal */
template <typename RhsDerived, typename DestDerived>
void _solve_impl(const SparseMatrixBase<RhsDerived>& b, SparseMatrixBase<DestDerived>& dest) const {
eigen_assert(m_factorizationIsOk &&
"The decomposition is not in a valid state for solving, you must first call either compute() or "
"symbolic()/numeric()");
const Index size = m_cholmodFactor->n;
EIGEN_UNUSED_VARIABLE(size);
eigen_assert(size == b.rows());
// note: cs stands for Cholmod Sparse
Ref<SparseMatrix<typename RhsDerived::Scalar, ColMajor, typename RhsDerived::StorageIndex> > b_ref(
b.const_cast_derived());
cholmod_sparse b_cs = viewAsCholmod(b_ref);
cholmod_sparse* x_cs = internal::cm_spsolve<StorageIndex>(CHOLMOD_A, *m_cholmodFactor, b_cs, m_cholmod);
if (!x_cs) {
this->m_info = NumericalIssue;
return;
}
// TODO optimize this copy by swapping when possible (be careful with alignment, etc.)
// NOTE cholmod_spsolve in fact just calls the dense solver for blocks of 4 columns at a time (similar to Eigen's
// sparse solver)
dest.derived() = viewAsEigen<typename DestDerived::Scalar, typename DestDerived::StorageIndex>(*x_cs);
internal::cm_free_sparse<StorageIndex>(x_cs, m_cholmod);
}
#endif // EIGEN_PARSED_BY_DOXYGEN
/** Sets the shift parameter that will be used to adjust the diagonal coefficients during the numerical factorization.
*
* During the numerical factorization, an offset term is added to the diagonal coefficients:\n
* \c d_ii = \a offset + \c d_ii
*
* The default is \a offset=0.
*
* \returns a reference to \c *this.
*/
Derived& setShift(const RealScalar& offset) {
m_shiftOffset[0] = double(offset);
return derived();
}
/** \returns the determinant of the underlying matrix from the current factorization */
Scalar determinant() const {
using std::exp;
return exp(logDeterminant());
}
/** \returns the log determinant of the underlying matrix from the current factorization */
Scalar logDeterminant() const {
using numext::real;
using std::log;
eigen_assert(m_factorizationIsOk &&
"The decomposition is not in a valid state for solving, you must first call either compute() or "
"symbolic()/numeric()");
RealScalar logDet = 0;
Scalar* x = static_cast<Scalar*>(m_cholmodFactor->x);
if (m_cholmodFactor->is_super) {
// Supernodal factorization stored as a packed list of dense column-major blocks,
// as described by the following structure:
// super[k] == index of the first column of the j-th super node
StorageIndex* super = static_cast<StorageIndex*>(m_cholmodFactor->super);
// pi[k] == offset to the description of row indices
StorageIndex* pi = static_cast<StorageIndex*>(m_cholmodFactor->pi);
// px[k] == offset to the respective dense block
StorageIndex* px = static_cast<StorageIndex*>(m_cholmodFactor->px);
Index nb_super_nodes = m_cholmodFactor->nsuper;
for (Index k = 0; k < nb_super_nodes; ++k) {
StorageIndex ncols = super[k + 1] - super[k];
StorageIndex nrows = pi[k + 1] - pi[k];
Map<const Array<Scalar, 1, Dynamic>, 0, InnerStride<> > sk(x + px[k], ncols, InnerStride<>(nrows + 1));
logDet += sk.real().log().sum();
}
} else {
// Simplicial factorization stored as standard CSC matrix.
StorageIndex* p = static_cast<StorageIndex*>(m_cholmodFactor->p);
Index size = m_cholmodFactor->n;
for (Index k = 0; k < size; ++k) logDet += log(real(x[p[k]]));
}
if (m_cholmodFactor->is_ll) logDet *= 2.0;
return logDet;
}
template <typename Stream>
void dumpMemory(Stream& /*s*/) {}
protected:
mutable cholmod_common m_cholmod;
cholmod_factor* m_cholmodFactor;
double m_shiftOffset[2];
mutable ComputationInfo m_info;
int m_factorizationIsOk;
int m_analysisIsOk;
};
/** \ingroup CholmodSupport_Module
* \class CholmodSimplicialLLT
* \brief A simplicial direct Cholesky (LLT) factorization and solver based on Cholmod
*
* This class allows to solve for A.X = B sparse linear problems via a simplicial LL^T Cholesky factorization
* using the Cholmod library.
* This simplicial variant is equivalent to Eigen's built-in SimplicialLLT class. Therefore, it has little practical
* interest. The sparse matrix A must be selfadjoint and positive definite. The vectors or matrices X and B can be
* either dense or sparse.
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ the triangular part that will be used for the computations. It can be Lower
* or Upper. Default is Lower.
*
* \implsparsesolverconcept
*
* This class supports all kind of SparseMatrix<>: row or column major; upper, lower, or both; compressed or non
* compressed.
*
* \warning Only double precision real and complex scalar types are supported by Cholmod.
*
* \sa \ref TutorialSparseSolverConcept, class CholmodSupernodalLLT, class SimplicialLLT
*/
template <typename MatrixType_, int UpLo_ = Lower>
class CholmodSimplicialLLT : public CholmodBase<MatrixType_, UpLo_, CholmodSimplicialLLT<MatrixType_, UpLo_> > {
typedef CholmodBase<MatrixType_, UpLo_, CholmodSimplicialLLT> Base;
using Base::m_cholmod;
public:
typedef MatrixType_ MatrixType;
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::RealScalar RealScalar;
typedef typename MatrixType::StorageIndex StorageIndex;
typedef TriangularView<const MatrixType, Eigen::Lower> MatrixL;
typedef TriangularView<const typename MatrixType::AdjointReturnType, Eigen::Upper> MatrixU;
CholmodSimplicialLLT() : Base() { init(); }
CholmodSimplicialLLT(const MatrixType& matrix) : Base() {
init();
this->compute(matrix);
}
~CholmodSimplicialLLT() {}
/** \returns an expression of the factor L */
inline MatrixL matrixL() const { return viewAsEigen<Scalar, StorageIndex>(*Base::m_cholmodFactor); }
/** \returns an expression of the factor U (= L^*) */
inline MatrixU matrixU() const { return matrixL().adjoint(); }
protected:
void init() {
m_cholmod.final_asis = 0;
m_cholmod.supernodal = CHOLMOD_SIMPLICIAL;
m_cholmod.final_ll = 1;
}
};
/** \ingroup CholmodSupport_Module
* \class CholmodSimplicialLDLT
* \brief A simplicial direct Cholesky (LDLT) factorization and solver based on Cholmod
*
* This class allows to solve for A.X = B sparse linear problems via a simplicial LDL^T Cholesky factorization
* using the Cholmod library.
* This simplicial variant is equivalent to Eigen's built-in SimplicialLDLT class. Therefore, it has little practical
* interest. The sparse matrix A must be selfadjoint and positive definite. The vectors or matrices X and B can be
* either dense or sparse.
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ the triangular part that will be used for the computations. It can be Lower
* or Upper. Default is Lower.
*
* \implsparsesolverconcept
*
* This class supports all kind of SparseMatrix<>: row or column major; upper, lower, or both; compressed or non
* compressed.
*
* \warning Only double precision real and complex scalar types are supported by Cholmod.
*
* \sa \ref TutorialSparseSolverConcept, class CholmodSupernodalLLT, class SimplicialLDLT
*/
template <typename MatrixType_, int UpLo_ = Lower>
class CholmodSimplicialLDLT : public CholmodBase<MatrixType_, UpLo_, CholmodSimplicialLDLT<MatrixType_, UpLo_> > {
typedef CholmodBase<MatrixType_, UpLo_, CholmodSimplicialLDLT> Base;
using Base::m_cholmod;
public:
typedef MatrixType_ MatrixType;
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::RealScalar RealScalar;
typedef typename MatrixType::StorageIndex StorageIndex;
typedef Matrix<Scalar, Dynamic, 1> VectorType;
typedef TriangularView<const MatrixType, Eigen::UnitLower> MatrixL;
typedef TriangularView<const typename MatrixType::AdjointReturnType, Eigen::UnitUpper> MatrixU;
CholmodSimplicialLDLT() : Base() { init(); }
CholmodSimplicialLDLT(const MatrixType& matrix) : Base() {
init();
this->compute(matrix);
}
~CholmodSimplicialLDLT() {}
/** \returns a vector expression of the diagonal D */
inline VectorType vectorD() const {
auto cholmodL = viewAsEigen<Scalar, StorageIndex>(*Base::m_cholmodFactor);
VectorType D{cholmodL.rows()};
for (Index k = 0; k < cholmodL.outerSize(); ++k) {
typename decltype(cholmodL)::InnerIterator it{cholmodL, k};
D(k) = it.value();
}
return D;
}
/** \returns an expression of the factor L */
inline MatrixL matrixL() const { return viewAsEigen<Scalar, StorageIndex>(*Base::m_cholmodFactor); }
/** \returns an expression of the factor U (= L^*) */
inline MatrixU matrixU() const { return matrixL().adjoint(); }
protected:
void init() {
m_cholmod.final_asis = 1;
m_cholmod.supernodal = CHOLMOD_SIMPLICIAL;
}
};
/** \ingroup CholmodSupport_Module
* \class CholmodSupernodalLLT
* \brief A supernodal Cholesky (LLT) factorization and solver based on Cholmod
*
* This class allows to solve for A.X = B sparse linear problems via a supernodal LL^T Cholesky factorization
* using the Cholmod library.
* This supernodal variant performs best on dense enough problems, e.g., 3D FEM, or very high order 2D FEM.
* The sparse matrix A must be selfadjoint and positive definite. The vectors or matrices
* X and B can be either dense or sparse.
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ the triangular part that will be used for the computations. It can be Lower
* or Upper. Default is Lower.
*
* \implsparsesolverconcept
*
* This class supports all kind of SparseMatrix<>: row or column major; upper, lower, or both; compressed or non
* compressed.
*
* \warning Only double precision real and complex scalar types are supported by Cholmod.
*
* \sa \ref TutorialSparseSolverConcept
*/
template <typename MatrixType_, int UpLo_ = Lower>
class CholmodSupernodalLLT : public CholmodBase<MatrixType_, UpLo_, CholmodSupernodalLLT<MatrixType_, UpLo_> > {
typedef CholmodBase<MatrixType_, UpLo_, CholmodSupernodalLLT> Base;
using Base::m_cholmod;
public:
typedef MatrixType_ MatrixType;
typedef typename MatrixType::Scalar Scalar;
typedef typename MatrixType::RealScalar RealScalar;
typedef typename MatrixType::StorageIndex StorageIndex;
CholmodSupernodalLLT() : Base() { init(); }
CholmodSupernodalLLT(const MatrixType& matrix) : Base() {
init();
this->compute(matrix);
}
~CholmodSupernodalLLT() {}
/** \returns an expression of the factor L */
inline MatrixType matrixL() const {
// Convert Cholmod factor's supernodal storage format to Eigen's CSC storage format
cholmod_sparse* cholmodL = internal::cm_factor_to_sparse(*Base::m_cholmodFactor, m_cholmod);
MatrixType L = viewAsEigen<Scalar, StorageIndex>(*cholmodL);
internal::cm_free_sparse<StorageIndex>(cholmodL, m_cholmod);
return L;
}
/** \returns an expression of the factor U (= L^*) */
inline MatrixType matrixU() const { return matrixL().adjoint(); }
protected:
void init() {
m_cholmod.final_asis = 1;
m_cholmod.supernodal = CHOLMOD_SUPERNODAL;
}
};
/** \ingroup CholmodSupport_Module
* \class CholmodDecomposition
* \brief A general Cholesky factorization and solver based on Cholmod
*
* This class allows to solve for A.X = B sparse linear problems via a LL^T or LDL^T Cholesky factorization
* using the Cholmod library. The sparse matrix A must be selfadjoint and positive definite. The vectors or matrices
* X and B can be either dense or sparse.
*
* This variant permits to change the underlying Cholesky method at runtime.
* On the other hand, it does not provide access to the result of the factorization.
* The default is to let Cholmod automatically choose between a simplicial and supernodal factorization.
*
* \tparam MatrixType_ the type of the sparse matrix A, it must be a SparseMatrix<>
* \tparam UpLo_ the triangular part that will be used for the computations. It can be Lower
* or Upper. Default is Lower.
*
* \implsparsesolverconcept
*
* This class supports all kind of SparseMatrix<>: row or column major; upper, lower, or both; compressed or non
* compressed.
*
* \warning Only double precision real and complex scalar types are supported by Cholmod.
*
* \sa \ref TutorialSparseSolverConcept
*/
template <typename MatrixType_, int UpLo_ = Lower>
class CholmodDecomposition : public CholmodBase<MatrixType_, UpLo_, CholmodDecomposition<MatrixType_, UpLo_> > {
typedef CholmodBase<MatrixType_, UpLo_, CholmodDecomposition> Base;
using Base::m_cholmod;
public:
typedef MatrixType_ MatrixType;
CholmodDecomposition() : Base() { init(); }
CholmodDecomposition(const MatrixType& matrix) : Base() {
init();
this->compute(matrix);
}
~CholmodDecomposition() {}
void setMode(CholmodMode mode) {
switch (mode) {
case CholmodAuto:
m_cholmod.final_asis = 1;
m_cholmod.supernodal = CHOLMOD_AUTO;
break;
case CholmodSimplicialLLt:
m_cholmod.final_asis = 0;
m_cholmod.supernodal = CHOLMOD_SIMPLICIAL;
m_cholmod.final_ll = 1;
break;
case CholmodSupernodalLLt:
m_cholmod.final_asis = 1;
m_cholmod.supernodal = CHOLMOD_SUPERNODAL;
break;
case CholmodLDLt:
m_cholmod.final_asis = 1;
m_cholmod.supernodal = CHOLMOD_SIMPLICIAL;
break;
default:
break;
}
}
protected:
void init() {
m_cholmod.final_asis = 1;
m_cholmod.supernodal = CHOLMOD_AUTO;
}
};
} // end namespace Eigen
#endif // EIGEN_CHOLMODSUPPORT_H

View File

@@ -0,0 +1,3 @@
#ifndef EIGEN_CHOLMODSUPPORT_MODULE_H
#error "Please include Eigen/CholmodSupport instead of including headers inside the src directory directly."
#endif

View File

@@ -0,0 +1,239 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2017 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ARITHMETIC_SEQUENCE_H
#define EIGEN_ARITHMETIC_SEQUENCE_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
// Helper to cleanup the type of the increment:
template <typename T>
struct cleanup_seq_incr {
typedef typename cleanup_index_type<T, DynamicIndex>::type type;
};
} // namespace internal
//--------------------------------------------------------------------------------
// seq(first,last,incr) and seqN(first,size,incr)
//--------------------------------------------------------------------------------
template <typename FirstType = Index, typename SizeType = Index, typename IncrType = internal::FixedInt<1> >
class ArithmeticSequence;
template <typename FirstType, typename SizeType, typename IncrType>
ArithmeticSequence<typename internal::cleanup_index_type<FirstType>::type,
typename internal::cleanup_index_type<SizeType>::type,
typename internal::cleanup_seq_incr<IncrType>::type>
seqN(FirstType first, SizeType size, IncrType incr);
/** \class ArithmeticSequence
* \ingroup Core_Module
*
* This class represents an arithmetic progression \f$ a_0, a_1, a_2, ..., a_{n-1}\f$ defined by
* its \em first value \f$ a_0 \f$, its \em size (aka length) \em n, and the \em increment (aka stride)
* that is equal to \f$ a_{i+1}-a_{i}\f$ for any \em i.
*
* It is internally used as the return type of the Eigen::seq and Eigen::seqN functions, and as the input arguments
* of DenseBase::operator()(const RowIndices&, const ColIndices&), and most of the time this is the
* only way it is used.
*
* \tparam FirstType type of the first element, usually an Index,
* but internally it can be a symbolic expression
* \tparam SizeType type representing the size of the sequence, usually an Index
* or a compile time integral constant. Internally, it can also be a symbolic expression
* \tparam IncrType type of the increment, can be a runtime Index, or a compile time integral constant (default is
* compile-time 1)
*
* \sa Eigen::seq, Eigen::seqN, DenseBase::operator()(const RowIndices&, const ColIndices&), class IndexedView
*/
template <typename FirstType, typename SizeType, typename IncrType>
class ArithmeticSequence {
public:
constexpr ArithmeticSequence() = default;
constexpr ArithmeticSequence(FirstType first, SizeType size) : m_first(first), m_size(size) {}
constexpr ArithmeticSequence(FirstType first, SizeType size, IncrType incr)
: m_first(first), m_size(size), m_incr(incr) {}
enum {
// SizeAtCompileTime = internal::get_fixed_value<SizeType>::value,
IncrAtCompileTime = internal::get_fixed_value<IncrType, DynamicIndex>::value
};
/** \returns the size, i.e., number of elements, of the sequence */
constexpr Index size() const { return m_size; }
/** \returns the first element \f$ a_0 \f$ in the sequence */
constexpr Index first() const { return m_first; }
/** \returns the value \f$ a_i \f$ at index \a i in the sequence. */
constexpr Index operator[](Index i) const { return m_first + i * m_incr; }
constexpr const FirstType& firstObject() const { return m_first; }
constexpr const SizeType& sizeObject() const { return m_size; }
constexpr const IncrType& incrObject() const { return m_incr; }
protected:
FirstType m_first;
SizeType m_size;
IncrType m_incr;
public:
constexpr auto reverse() const -> decltype(Eigen::seqN(m_first + (m_size + fix<-1>()) * m_incr, m_size, -m_incr)) {
return seqN(m_first + (m_size + fix<-1>()) * m_incr, m_size, -m_incr);
}
};
/** \returns an ArithmeticSequence starting at \a first, of length \a size, and increment \a incr
*
* \sa seqN(FirstType,SizeType), seq(FirstType,LastType,IncrType) */
template <typename FirstType, typename SizeType, typename IncrType>
ArithmeticSequence<typename internal::cleanup_index_type<FirstType>::type,
typename internal::cleanup_index_type<SizeType>::type,
typename internal::cleanup_seq_incr<IncrType>::type>
seqN(FirstType first, SizeType size, IncrType incr) {
return ArithmeticSequence<typename internal::cleanup_index_type<FirstType>::type,
typename internal::cleanup_index_type<SizeType>::type,
typename internal::cleanup_seq_incr<IncrType>::type>(first, size, incr);
}
/** \returns an ArithmeticSequence starting at \a first, of length \a size, and unit increment
*
* \sa seqN(FirstType,SizeType,IncrType), seq(FirstType,LastType) */
template <typename FirstType, typename SizeType>
ArithmeticSequence<typename internal::cleanup_index_type<FirstType>::type,
typename internal::cleanup_index_type<SizeType>::type>
seqN(FirstType first, SizeType size) {
return ArithmeticSequence<typename internal::cleanup_index_type<FirstType>::type,
typename internal::cleanup_index_type<SizeType>::type>(first, size);
}
#ifdef EIGEN_PARSED_BY_DOXYGEN
/** \returns an ArithmeticSequence starting at \a f, up (or down) to \a l, and with positive (or negative) increment \a
* incr
*
* It is essentially an alias to:
* \code
* seqN(f, (l-f+incr)/incr, incr);
* \endcode
*
* \sa seqN(FirstType,SizeType,IncrType), seq(FirstType,LastType)
*/
template <typename FirstType, typename LastType, typename IncrType>
auto seq(FirstType f, LastType l, IncrType incr);
/** \returns an ArithmeticSequence starting at \a f, up (or down) to \a l, and unit increment
*
* It is essentially an alias to:
* \code
* seqN(f,l-f+1);
* \endcode
*
* \sa seqN(FirstType,SizeType), seq(FirstType,LastType,IncrType)
*/
template <typename FirstType, typename LastType>
auto seq(FirstType f, LastType l);
#else // EIGEN_PARSED_BY_DOXYGEN
template <typename FirstType, typename LastType>
auto seq(FirstType f, LastType l)
-> decltype(seqN(typename internal::cleanup_index_type<FirstType>::type(f),
(typename internal::cleanup_index_type<LastType>::type(l) -
typename internal::cleanup_index_type<FirstType>::type(f) + fix<1>()))) {
return seqN(typename internal::cleanup_index_type<FirstType>::type(f),
(typename internal::cleanup_index_type<LastType>::type(l) -
typename internal::cleanup_index_type<FirstType>::type(f) + fix<1>()));
}
template <typename FirstType, typename LastType, typename IncrType>
auto seq(FirstType f, LastType l, IncrType incr)
-> decltype(seqN(typename internal::cleanup_index_type<FirstType>::type(f),
(typename internal::cleanup_index_type<LastType>::type(l) -
typename internal::cleanup_index_type<FirstType>::type(f) +
typename internal::cleanup_seq_incr<IncrType>::type(incr)) /
typename internal::cleanup_seq_incr<IncrType>::type(incr),
typename internal::cleanup_seq_incr<IncrType>::type(incr))) {
typedef typename internal::cleanup_seq_incr<IncrType>::type CleanedIncrType;
return seqN(typename internal::cleanup_index_type<FirstType>::type(f),
(typename internal::cleanup_index_type<LastType>::type(l) -
typename internal::cleanup_index_type<FirstType>::type(f) + CleanedIncrType(incr)) /
CleanedIncrType(incr),
CleanedIncrType(incr));
}
#endif // EIGEN_PARSED_BY_DOXYGEN
namespace placeholders {
/** \cpp11
* \returns a symbolic ArithmeticSequence representing the last \a size elements with increment \a incr.
*
* It is a shortcut for: \code seqN(last-(size-fix<1>)*incr, size, incr) \endcode
*
* \sa lastN(SizeType), seqN(FirstType,SizeType), seq(FirstType,LastType,IncrType) */
template <typename SizeType, typename IncrType>
auto lastN(SizeType size, IncrType incr)
-> decltype(seqN(Eigen::placeholders::last - (size - fix<1>()) * incr, size, incr)) {
return seqN(Eigen::placeholders::last - (size - fix<1>()) * incr, size, incr);
}
/** \cpp11
* \returns a symbolic ArithmeticSequence representing the last \a size elements with a unit increment.
*
* It is a shortcut for: \code seq(last+fix<1>-size, last) \endcode
*
* \sa lastN(SizeType,IncrType, seqN(FirstType,SizeType), seq(FirstType,LastType) */
template <typename SizeType>
auto lastN(SizeType size) -> decltype(seqN(Eigen::placeholders::last + fix<1>() - size, size)) {
return seqN(Eigen::placeholders::last + fix<1>() - size, size);
}
} // namespace placeholders
/** \namespace Eigen::indexing
* \ingroup Core_Module
*
* The sole purpose of this namespace is to be able to import all functions
* and symbols that are expected to be used within operator() for indexing
* and slicing. If you already imported the whole Eigen namespace:
* \code using namespace Eigen; \endcode
* then you are already all set. Otherwise, if you don't want/cannot import
* the whole Eigen namespace, the following line:
* \code using namespace Eigen::indexing; \endcode
* is equivalent to:
* \code
using Eigen::fix;
using Eigen::seq;
using Eigen::seqN;
using Eigen::placeholders::all;
using Eigen::placeholders::last;
using Eigen::placeholders::lastN; // c++11 only
using Eigen::placeholders::lastp1;
\endcode
*/
namespace indexing {
using Eigen::fix;
using Eigen::seq;
using Eigen::seqN;
using Eigen::placeholders::all;
using Eigen::placeholders::last;
using Eigen::placeholders::lastN;
using Eigen::placeholders::lastp1;
} // namespace indexing
} // end namespace Eigen
#endif // EIGEN_ARITHMETIC_SEQUENCE_H

376
Eigen/src/Core/Array.h Normal file
View File

@@ -0,0 +1,376 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2009 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ARRAY_H
#define EIGEN_ARRAY_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename Scalar_, int Rows_, int Cols_, int Options_, int MaxRows_, int MaxCols_>
struct traits<Array<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_>>
: traits<Matrix<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_>> {
typedef ArrayXpr XprKind;
typedef ArrayBase<Array<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_>> XprBase;
};
} // namespace internal
/** \class Array
* \ingroup Core_Module
*
* \brief General-purpose arrays with easy API for coefficient-wise operations
*
* The %Array class is very similar to the Matrix class. It provides
* general-purpose one- and two-dimensional arrays. The difference between the
* %Array and the %Matrix class is primarily in the API: the API for the
* %Array class provides easy access to coefficient-wise operations, while the
* API for the %Matrix class provides easy access to linear-algebra
* operations.
*
* See documentation of class Matrix for detailed information on the template parameters
* storage layout.
*
* This class can be extended with the help of the plugin mechanism described on the page
* \ref TopicCustomizing_Plugins by defining the preprocessor symbol \c EIGEN_ARRAY_PLUGIN.
*
* \sa \blank \ref TutorialArrayClass, \ref TopicClassHierarchy
*/
template <typename Scalar_, int Rows_, int Cols_, int Options_, int MaxRows_, int MaxCols_>
class Array : public PlainObjectBase<Array<Scalar_, Rows_, Cols_, Options_, MaxRows_, MaxCols_>> {
public:
typedef PlainObjectBase<Array> Base;
EIGEN_DENSE_PUBLIC_INTERFACE(Array)
enum { Options = Options_ };
typedef typename Base::PlainObject PlainObject;
protected:
template <typename Derived, typename OtherDerived, bool IsVector>
friend struct internal::conservative_resize_like_impl;
using Base::m_storage;
public:
using Base::base;
using Base::coeff;
using Base::coeffRef;
/**
* The usage of
* using Base::operator=;
* fails on MSVC. Since the code below is working with GCC and MSVC, we skipped
* the usage of 'using'. This should be done only for operator=.
*/
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array& operator=(const EigenBase<OtherDerived>& other) {
return Base::operator=(other);
}
/** Set all the entries to \a value.
* \sa DenseBase::setConstant(), DenseBase::fill()
*/
/* This overload is needed because the usage of
* using Base::operator=;
* fails on MSVC. Since the code below is working with GCC and MSVC, we skipped
* the usage of 'using'. This should be done only for operator=.
*/
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array& operator=(const Scalar& value) {
Base::setConstant(value);
return *this;
}
/** Copies the value of the expression \a other into \c *this with automatic resizing.
*
* *this might be resized to match the dimensions of \a other. If *this was a null matrix (not already initialized),
* it will be initialized.
*
* Note that copying a row-vector into a vector (and conversely) is allowed.
* The resizing, if any, is then done in the appropriate way so that row-vectors
* remain row-vectors and vectors remain vectors.
*/
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array& operator=(const DenseBase<OtherDerived>& other) {
return Base::_set(other);
}
/**
* \brief Assigns arrays to each other.
*
* \note This is a special case of the templated operator=. Its purpose is
* to prevent a default operator= from hiding the templated operator=.
*
* \callgraph
*/
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array& operator=(const Array& other) { return Base::_set(other); }
/** Default constructor.
*
* For fixed-size matrices, does nothing.
*
* For dynamic-size matrices, creates an empty matrix of size 0. Does not allocate any array. Such a matrix
* is called a null matrix. This constructor is the unique way to create null matrices: resizing
* a matrix to 0 is not supported.
*
* \sa resize(Index,Index)
*/
#ifdef EIGEN_INITIALIZE_COEFFS
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Array() : Base() { EIGEN_INITIALIZE_COEFFS_IF_THAT_OPTION_IS_ENABLED }
#else
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Array() = default;
#endif
/** \brief Move constructor */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Array(Array&&) = default;
EIGEN_DEVICE_FUNC Array& operator=(Array&& other) noexcept(std::is_nothrow_move_assignable<Scalar>::value) {
Base::operator=(std::move(other));
return *this;
}
/** \brief Construct a row of column vector with fixed size from an arbitrary number of coefficients.
*
* \only_for_vectors
*
* This constructor is for 1D array or vectors with more than 4 coefficients.
*
* \warning To construct a column (resp. row) vector of fixed length, the number of values passed to this
* constructor must match the the fixed number of rows (resp. columns) of \c *this.
*
*
* Example: \include Array_variadic_ctor_cxx11.cpp
* Output: \verbinclude Array_variadic_ctor_cxx11.out
*
* \sa Array(const std::initializer_list<std::initializer_list<Scalar>>&)
* \sa Array(const Scalar&), Array(const Scalar&,const Scalar&)
*/
template <typename... ArgTypes>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array(const Scalar& a0, const Scalar& a1, const Scalar& a2, const Scalar& a3,
const ArgTypes&... args)
: Base(a0, a1, a2, a3, args...) {}
/** \brief Constructs an array and initializes it from the coefficients given as initializer-lists grouped by row.
* \cpp11
*
* In the general case, the constructor takes a list of rows, each row being represented as a list of coefficients:
*
* Example: \include Array_initializer_list_23_cxx11.cpp
* Output: \verbinclude Array_initializer_list_23_cxx11.out
*
* Each of the inner initializer lists must contain the exact same number of elements, otherwise an assertion is
* triggered.
*
* In the case of a compile-time column 1D array, implicit transposition from a single row is allowed.
* Therefore <code> Array<int,Dynamic,1>{{1,2,3,4,5}}</code> is legal and the more verbose syntax
* <code>Array<int,Dynamic,1>{{1},{2},{3},{4},{5}}</code> can be avoided:
*
* Example: \include Array_initializer_list_vector_cxx11.cpp
* Output: \verbinclude Array_initializer_list_vector_cxx11.out
*
* In the case of fixed-sized arrays, the initializer list sizes must exactly match the array sizes,
* and implicit transposition is allowed for compile-time 1D arrays only.
*
* \sa Array(const Scalar& a0, const Scalar& a1, const Scalar& a2, const Scalar& a3, const ArgTypes&... args)
*/
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Array(
const std::initializer_list<std::initializer_list<Scalar>>& list)
: Base(list) {}
#ifndef EIGEN_PARSED_BY_DOXYGEN
template <typename T>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE explicit Array(const T& x) {
Base::template _init1<T>(x);
}
template <typename T0, typename T1>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array(const T0& val0, const T1& val1) {
this->template _init2<T0, T1>(val0, val1);
}
#else
/** \brief Constructs a fixed-sized array initialized with coefficients starting at \a data */
EIGEN_DEVICE_FUNC explicit Array(const Scalar* data);
/** Constructs a vector or row-vector with given dimension. \only_for_vectors
*
* Note that this is only useful for dynamic-size vectors. For fixed-size vectors,
* it is redundant to pass the dimension here, so it makes more sense to use the default
* constructor Array() instead.
*/
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE explicit Array(Index dim);
/** constructs an initialized 1x1 Array with the given coefficient
* \sa const Scalar& a0, const Scalar& a1, const Scalar& a2, const Scalar& a3, const ArgTypes&... args */
Array(const Scalar& value);
/** constructs an uninitialized array with \a rows rows and \a cols columns.
*
* This is useful for dynamic-size arrays. For fixed-size arrays,
* it is redundant to pass these parameters, so one should use the default constructor
* Array() instead. */
Array(Index rows, Index cols);
/** constructs an initialized 2D vector with given coefficients
* \sa Array(const Scalar& a0, const Scalar& a1, const Scalar& a2, const Scalar& a3, const ArgTypes&... args) */
Array(const Scalar& val0, const Scalar& val1);
#endif // end EIGEN_PARSED_BY_DOXYGEN
/** constructs an initialized 3D vector with given coefficients
* \sa Array(const Scalar& a0, const Scalar& a1, const Scalar& a2, const Scalar& a3, const ArgTypes&... args)
*/
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array(const Scalar& val0, const Scalar& val1, const Scalar& val2) {
EIGEN_STATIC_ASSERT_VECTOR_SPECIFIC_SIZE(Array, 3)
m_storage.data()[0] = val0;
m_storage.data()[1] = val1;
m_storage.data()[2] = val2;
}
/** constructs an initialized 4D vector with given coefficients
* \sa Array(const Scalar& a0, const Scalar& a1, const Scalar& a2, const Scalar& a3, const ArgTypes&... args)
*/
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array(const Scalar& val0, const Scalar& val1, const Scalar& val2,
const Scalar& val3) {
EIGEN_STATIC_ASSERT_VECTOR_SPECIFIC_SIZE(Array, 4)
m_storage.data()[0] = val0;
m_storage.data()[1] = val1;
m_storage.data()[2] = val2;
m_storage.data()[3] = val3;
}
/** Copy constructor */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Array(const Array&) = default;
private:
struct PrivateType {};
public:
/** \sa MatrixBase::operator=(const EigenBase<OtherDerived>&) */
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Array(
const EigenBase<OtherDerived>& other,
std::enable_if_t<internal::is_convertible<typename OtherDerived::Scalar, Scalar>::value, PrivateType> =
PrivateType())
: Base(other.derived()) {}
EIGEN_DEVICE_FUNC constexpr Index innerStride() const noexcept { return 1; }
EIGEN_DEVICE_FUNC constexpr Index outerStride() const noexcept { return this->innerSize(); }
#ifdef EIGEN_ARRAY_PLUGIN
#include EIGEN_ARRAY_PLUGIN
#endif
private:
template <typename MatrixType, typename OtherDerived, bool SwapPointers>
friend struct internal::matrix_swap_impl;
};
/** \defgroup arraytypedefs Global array typedefs
* \ingroup Core_Module
*
* %Eigen defines several typedef shortcuts for most common 1D and 2D array types.
*
* The general patterns are the following:
*
* \c ArrayRowsColsType where \c Rows and \c Cols can be \c 2,\c 3,\c 4 for fixed size square matrices or \c X for
* dynamic size, and where \c Type can be \c i for integer, \c f for float, \c d for double, \c cf for complex float, \c
* cd for complex double.
*
* For example, \c Array33d is a fixed-size 3x3 array type of doubles, and \c ArrayXXf is a dynamic-size matrix of
* floats.
*
* There are also \c ArraySizeType which are self-explanatory. For example, \c Array4cf is
* a fixed-size 1D array of 4 complex floats.
*
* With \cpp11, template alias are also defined for common sizes.
* They follow the same pattern as above except that the scalar type suffix is replaced by a
* template parameter, i.e.:
* - `ArrayRowsCols<Type>` where `Rows` and `Cols` can be \c 2,\c 3,\c 4, or \c X for fixed or dynamic size.
* - `ArraySize<Type>` where `Size` can be \c 2,\c 3,\c 4 or \c X for fixed or dynamic size 1D arrays.
*
* \sa class Array
*/
#define EIGEN_MAKE_ARRAY_TYPEDEFS(Type, TypeSuffix, Size, SizeSuffix) \
/** \ingroup arraytypedefs */ \
typedef Array<Type, Size, Size> Array##SizeSuffix##SizeSuffix##TypeSuffix; \
/** \ingroup arraytypedefs */ \
typedef Array<Type, Size, 1> Array##SizeSuffix##TypeSuffix;
#define EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(Type, TypeSuffix, Size) \
/** \ingroup arraytypedefs */ \
typedef Array<Type, Size, Dynamic> Array##Size##X##TypeSuffix; \
/** \ingroup arraytypedefs */ \
typedef Array<Type, Dynamic, Size> Array##X##Size##TypeSuffix;
#define EIGEN_MAKE_ARRAY_TYPEDEFS_ALL_SIZES(Type, TypeSuffix) \
EIGEN_MAKE_ARRAY_TYPEDEFS(Type, TypeSuffix, 2, 2) \
EIGEN_MAKE_ARRAY_TYPEDEFS(Type, TypeSuffix, 3, 3) \
EIGEN_MAKE_ARRAY_TYPEDEFS(Type, TypeSuffix, 4, 4) \
EIGEN_MAKE_ARRAY_TYPEDEFS(Type, TypeSuffix, Dynamic, X) \
EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(Type, TypeSuffix, 2) \
EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(Type, TypeSuffix, 3) \
EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(Type, TypeSuffix, 4)
EIGEN_MAKE_ARRAY_TYPEDEFS_ALL_SIZES(int, i)
EIGEN_MAKE_ARRAY_TYPEDEFS_ALL_SIZES(float, f)
EIGEN_MAKE_ARRAY_TYPEDEFS_ALL_SIZES(double, d)
EIGEN_MAKE_ARRAY_TYPEDEFS_ALL_SIZES(std::complex<float>, cf)
EIGEN_MAKE_ARRAY_TYPEDEFS_ALL_SIZES(std::complex<double>, cd)
#undef EIGEN_MAKE_ARRAY_TYPEDEFS_ALL_SIZES
#undef EIGEN_MAKE_ARRAY_TYPEDEFS
#undef EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS
#define EIGEN_MAKE_ARRAY_TYPEDEFS(Size, SizeSuffix) \
/** \ingroup arraytypedefs */ \
/** \brief \cpp11 */ \
template <typename Type> \
using Array##SizeSuffix##SizeSuffix = Array<Type, Size, Size>; \
/** \ingroup arraytypedefs */ \
/** \brief \cpp11 */ \
template <typename Type> \
using Array##SizeSuffix = Array<Type, Size, 1>;
#define EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(Size) \
/** \ingroup arraytypedefs */ \
/** \brief \cpp11 */ \
template <typename Type> \
using Array##Size##X = Array<Type, Size, Dynamic>; \
/** \ingroup arraytypedefs */ \
/** \brief \cpp11 */ \
template <typename Type> \
using Array##X##Size = Array<Type, Dynamic, Size>;
EIGEN_MAKE_ARRAY_TYPEDEFS(2, 2)
EIGEN_MAKE_ARRAY_TYPEDEFS(3, 3)
EIGEN_MAKE_ARRAY_TYPEDEFS(4, 4)
EIGEN_MAKE_ARRAY_TYPEDEFS(Dynamic, X)
EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(2)
EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(3)
EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS(4)
#undef EIGEN_MAKE_ARRAY_TYPEDEFS
#undef EIGEN_MAKE_ARRAY_FIXED_TYPEDEFS
#define EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE_AND_SIZE(TypeSuffix, SizeSuffix) \
using Eigen::Matrix##SizeSuffix##TypeSuffix; \
using Eigen::Vector##SizeSuffix##TypeSuffix; \
using Eigen::RowVector##SizeSuffix##TypeSuffix;
#define EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE(TypeSuffix) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE_AND_SIZE(TypeSuffix, 2) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE_AND_SIZE(TypeSuffix, 3) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE_AND_SIZE(TypeSuffix, 4) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE_AND_SIZE(TypeSuffix, X)
#define EIGEN_USING_ARRAY_TYPEDEFS \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE(i) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE(f) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE(d) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE(cf) \
EIGEN_USING_ARRAY_TYPEDEFS_FOR_TYPE(cd)
} // end namespace Eigen
#endif // EIGEN_ARRAY_H

213
Eigen/src/Core/ArrayBase.h Normal file
View File

@@ -0,0 +1,213 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2009 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ARRAYBASE_H
#define EIGEN_ARRAYBASE_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
template <typename ExpressionType>
class MatrixWrapper;
/** \class ArrayBase
* \ingroup Core_Module
*
* \brief Base class for all 1D and 2D array, and related expressions
*
* An array is similar to a dense vector or matrix. While matrices are mathematical
* objects with well defined linear algebra operators, an array is just a collection
* of scalar values arranged in a one or two dimensional fashion. As the main consequence,
* all operations applied to an array are performed coefficient wise. Furthermore,
* arrays support scalar math functions of the c++ standard library (e.g., std::sin(x)), and convenient
* constructors allowing to easily write generic code working for both scalar values
* and arrays.
*
* This class is the base that is inherited by all array expression types.
*
* \tparam Derived is the derived type, e.g., an array or an expression type.
*
* This class can be extended with the help of the plugin mechanism described on the page
* \ref TopicCustomizing_Plugins by defining the preprocessor symbol \c EIGEN_ARRAYBASE_PLUGIN.
*
* \sa class MatrixBase, \ref TopicClassHierarchy
*/
template <typename Derived>
class ArrayBase : public DenseBase<Derived> {
public:
#ifndef EIGEN_PARSED_BY_DOXYGEN
/** The base class for a given storage type. */
typedef ArrayBase StorageBaseType;
typedef ArrayBase Eigen_BaseClassForSpecializationOfGlobalMathFuncImpl;
typedef typename internal::traits<Derived>::StorageKind StorageKind;
typedef typename internal::traits<Derived>::Scalar Scalar;
typedef typename internal::packet_traits<Scalar>::type PacketScalar;
typedef typename NumTraits<Scalar>::Real RealScalar;
typedef DenseBase<Derived> Base;
using Base::ColsAtCompileTime;
using Base::Flags;
using Base::IsVectorAtCompileTime;
using Base::MaxColsAtCompileTime;
using Base::MaxRowsAtCompileTime;
using Base::MaxSizeAtCompileTime;
using Base::RowsAtCompileTime;
using Base::SizeAtCompileTime;
using Base::coeff;
using Base::coeffRef;
using Base::cols;
using Base::const_cast_derived;
using Base::derived;
using Base::lazyAssign;
using Base::rows;
using Base::size;
using Base::operator-;
using Base::operator=;
using Base::operator+=;
using Base::operator-=;
using Base::operator*=;
using Base::operator/=;
typedef typename Base::CoeffReturnType CoeffReturnType;
typedef typename Base::PlainObject PlainObject;
/** \internal Represents a matrix with all coefficients equal to one another*/
typedef CwiseNullaryOp<internal::scalar_constant_op<Scalar>, PlainObject> ConstantReturnType;
#endif // not EIGEN_PARSED_BY_DOXYGEN
#define EIGEN_CURRENT_STORAGE_BASE_CLASS Eigen::ArrayBase
#define EIGEN_DOC_UNARY_ADDONS(X, Y)
#include "../plugins/MatrixCwiseUnaryOps.inc"
#include "../plugins/ArrayCwiseUnaryOps.inc"
#include "../plugins/CommonCwiseBinaryOps.inc"
#include "../plugins/MatrixCwiseBinaryOps.inc"
#include "../plugins/ArrayCwiseBinaryOps.inc"
#ifdef EIGEN_ARRAYBASE_PLUGIN
#include EIGEN_ARRAYBASE_PLUGIN
#endif
#undef EIGEN_CURRENT_STORAGE_BASE_CLASS
#undef EIGEN_DOC_UNARY_ADDONS
/** Special case of the template operator=, in order to prevent the compiler
* from generating a default operator= (issue hit with g++ 4.1)
*/
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator=(const ArrayBase& other) {
internal::call_assignment(derived(), other.derived());
return derived();
}
/** Set all the entries to \a value.
* \sa DenseBase::setConstant(), DenseBase::fill() */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator=(const Scalar& value) {
Base::setConstant(value);
return derived();
}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator+=(const Scalar& other) {
internal::call_assignment(this->derived(), PlainObject::Constant(rows(), cols(), other),
internal::add_assign_op<Scalar, Scalar>());
return derived();
}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator-=(const Scalar& other) {
internal::call_assignment(this->derived(), PlainObject::Constant(rows(), cols(), other),
internal::sub_assign_op<Scalar, Scalar>());
return derived();
}
/** replaces \c *this by \c *this + \a other.
*
* \returns a reference to \c *this
*/
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator+=(const ArrayBase<OtherDerived>& other) {
call_assignment(derived(), other.derived(), internal::add_assign_op<Scalar, typename OtherDerived::Scalar>());
return derived();
}
/** replaces \c *this by \c *this - \a other.
*
* \returns a reference to \c *this
*/
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator-=(const ArrayBase<OtherDerived>& other) {
call_assignment(derived(), other.derived(), internal::sub_assign_op<Scalar, typename OtherDerived::Scalar>());
return derived();
}
/** replaces \c *this by \c *this * \a other coefficient wise.
*
* \returns a reference to \c *this
*/
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator*=(const ArrayBase<OtherDerived>& other) {
call_assignment(derived(), other.derived(), internal::mul_assign_op<Scalar, typename OtherDerived::Scalar>());
return derived();
}
/** replaces \c *this by \c *this / \a other coefficient wise.
*
* \returns a reference to \c *this
*/
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& operator/=(const ArrayBase<OtherDerived>& other) {
call_assignment(derived(), other.derived(), internal::div_assign_op<Scalar, typename OtherDerived::Scalar>());
return derived();
}
public:
EIGEN_DEVICE_FUNC ArrayBase<Derived>& array() { return *this; }
EIGEN_DEVICE_FUNC const ArrayBase<Derived>& array() const { return *this; }
/** \returns an \link Eigen::MatrixBase Matrix \endlink expression of this array
* \sa MatrixBase::array() */
EIGEN_DEVICE_FUNC MatrixWrapper<Derived> matrix() { return MatrixWrapper<Derived>(derived()); }
EIGEN_DEVICE_FUNC const MatrixWrapper<const Derived> matrix() const {
return MatrixWrapper<const Derived>(derived());
}
// template<typename Dest>
// inline void evalTo(Dest& dst) const { dst = matrix(); }
protected:
EIGEN_DEFAULT_COPY_CONSTRUCTOR(ArrayBase)
EIGEN_DEFAULT_EMPTY_CONSTRUCTOR_AND_DESTRUCTOR(ArrayBase)
private:
explicit ArrayBase(Index);
ArrayBase(Index, Index);
template <typename OtherDerived>
explicit ArrayBase(const ArrayBase<OtherDerived>&);
protected:
// mixing arrays and matrices is not legal
template <typename OtherDerived>
Derived& operator+=(const MatrixBase<OtherDerived>&) {
EIGEN_STATIC_ASSERT(std::ptrdiff_t(sizeof(typename OtherDerived::Scalar)) == -1,
YOU_CANNOT_MIX_ARRAYS_AND_MATRICES);
return *this;
}
// mixing arrays and matrices is not legal
template <typename OtherDerived>
Derived& operator-=(const MatrixBase<OtherDerived>&) {
EIGEN_STATIC_ASSERT(std::ptrdiff_t(sizeof(typename OtherDerived::Scalar)) == -1,
YOU_CANNOT_MIX_ARRAYS_AND_MATRICES);
return *this;
}
};
} // end namespace Eigen
#endif // EIGEN_ARRAYBASE_H

View File

@@ -0,0 +1,165 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2009-2010 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ARRAYWRAPPER_H
#define EIGEN_ARRAYWRAPPER_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
/** \class ArrayWrapper
* \ingroup Core_Module
*
* \brief Expression of a mathematical vector or matrix as an array object
*
* This class is the return type of MatrixBase::array(), and most of the time
* this is the only way it is use.
*
* \sa MatrixBase::array(), class MatrixWrapper
*/
namespace internal {
template <typename ExpressionType>
struct traits<ArrayWrapper<ExpressionType> > : public traits<remove_all_t<typename ExpressionType::Nested> > {
typedef ArrayXpr XprKind;
// Let's remove NestByRefBit
enum {
Flags0 = traits<remove_all_t<typename ExpressionType::Nested> >::Flags,
LvalueBitFlag = is_lvalue<ExpressionType>::value ? LvalueBit : 0,
Flags = (Flags0 & ~(NestByRefBit | LvalueBit)) | LvalueBitFlag
};
};
} // namespace internal
template <typename ExpressionType>
class ArrayWrapper : public ArrayBase<ArrayWrapper<ExpressionType> > {
public:
typedef ArrayBase<ArrayWrapper> Base;
EIGEN_DENSE_PUBLIC_INTERFACE(ArrayWrapper)
EIGEN_INHERIT_ASSIGNMENT_OPERATORS(ArrayWrapper)
typedef internal::remove_all_t<ExpressionType> NestedExpression;
typedef std::conditional_t<internal::is_lvalue<ExpressionType>::value, Scalar, const Scalar>
ScalarWithConstIfNotLvalue;
typedef typename internal::ref_selector<ExpressionType>::non_const_type NestedExpressionType;
using Base::coeffRef;
EIGEN_DEVICE_FUNC explicit EIGEN_STRONG_INLINE ArrayWrapper(ExpressionType& matrix) : m_expression(matrix) {}
EIGEN_DEVICE_FUNC constexpr Index rows() const noexcept { return m_expression.rows(); }
EIGEN_DEVICE_FUNC constexpr Index cols() const noexcept { return m_expression.cols(); }
EIGEN_DEVICE_FUNC constexpr Index outerStride() const noexcept { return m_expression.outerStride(); }
EIGEN_DEVICE_FUNC constexpr Index innerStride() const noexcept { return m_expression.innerStride(); }
EIGEN_DEVICE_FUNC constexpr ScalarWithConstIfNotLvalue* data() { return m_expression.data(); }
EIGEN_DEVICE_FUNC constexpr const Scalar* data() const { return m_expression.data(); }
EIGEN_DEVICE_FUNC inline const Scalar& coeffRef(Index rowId, Index colId) const {
return m_expression.coeffRef(rowId, colId);
}
EIGEN_DEVICE_FUNC inline const Scalar& coeffRef(Index index) const { return m_expression.coeffRef(index); }
template <typename Dest>
EIGEN_DEVICE_FUNC inline void evalTo(Dest& dst) const {
dst = m_expression;
}
EIGEN_DEVICE_FUNC const internal::remove_all_t<NestedExpressionType>& nestedExpression() const {
return m_expression;
}
/** Forwards the resizing request to the nested expression
* \sa DenseBase::resize(Index) */
EIGEN_DEVICE_FUNC void resize(Index newSize) { m_expression.resize(newSize); }
/** Forwards the resizing request to the nested expression
* \sa DenseBase::resize(Index,Index)*/
EIGEN_DEVICE_FUNC void resize(Index rows, Index cols) { m_expression.resize(rows, cols); }
protected:
NestedExpressionType m_expression;
};
/** \class MatrixWrapper
* \ingroup Core_Module
*
* \brief Expression of an array as a mathematical vector or matrix
*
* This class is the return type of ArrayBase::matrix(), and most of the time
* this is the only way it is use.
*
* \sa MatrixBase::matrix(), class ArrayWrapper
*/
namespace internal {
template <typename ExpressionType>
struct traits<MatrixWrapper<ExpressionType> > : public traits<remove_all_t<typename ExpressionType::Nested> > {
typedef MatrixXpr XprKind;
// Let's remove NestByRefBit
enum {
Flags0 = traits<remove_all_t<typename ExpressionType::Nested> >::Flags,
LvalueBitFlag = is_lvalue<ExpressionType>::value ? LvalueBit : 0,
Flags = (Flags0 & ~(NestByRefBit | LvalueBit)) | LvalueBitFlag
};
};
} // namespace internal
template <typename ExpressionType>
class MatrixWrapper : public MatrixBase<MatrixWrapper<ExpressionType> > {
public:
typedef MatrixBase<MatrixWrapper<ExpressionType> > Base;
EIGEN_DENSE_PUBLIC_INTERFACE(MatrixWrapper)
EIGEN_INHERIT_ASSIGNMENT_OPERATORS(MatrixWrapper)
typedef internal::remove_all_t<ExpressionType> NestedExpression;
typedef std::conditional_t<internal::is_lvalue<ExpressionType>::value, Scalar, const Scalar>
ScalarWithConstIfNotLvalue;
typedef typename internal::ref_selector<ExpressionType>::non_const_type NestedExpressionType;
using Base::coeffRef;
EIGEN_DEVICE_FUNC explicit inline MatrixWrapper(ExpressionType& matrix) : m_expression(matrix) {}
EIGEN_DEVICE_FUNC constexpr Index rows() const noexcept { return m_expression.rows(); }
EIGEN_DEVICE_FUNC constexpr Index cols() const noexcept { return m_expression.cols(); }
EIGEN_DEVICE_FUNC constexpr Index outerStride() const noexcept { return m_expression.outerStride(); }
EIGEN_DEVICE_FUNC constexpr Index innerStride() const noexcept { return m_expression.innerStride(); }
EIGEN_DEVICE_FUNC constexpr ScalarWithConstIfNotLvalue* data() { return m_expression.data(); }
EIGEN_DEVICE_FUNC constexpr const Scalar* data() const { return m_expression.data(); }
EIGEN_DEVICE_FUNC inline const Scalar& coeffRef(Index rowId, Index colId) const {
return m_expression.derived().coeffRef(rowId, colId);
}
EIGEN_DEVICE_FUNC inline const Scalar& coeffRef(Index index) const { return m_expression.coeffRef(index); }
EIGEN_DEVICE_FUNC const internal::remove_all_t<NestedExpressionType>& nestedExpression() const {
return m_expression;
}
/** Forwards the resizing request to the nested expression
* \sa DenseBase::resize(Index) */
EIGEN_DEVICE_FUNC void resize(Index newSize) { m_expression.resize(newSize); }
/** Forwards the resizing request to the nested expression
* \sa DenseBase::resize(Index,Index)*/
EIGEN_DEVICE_FUNC void resize(Index rows, Index cols) { m_expression.resize(rows, cols); }
protected:
NestedExpressionType m_expression;
};
} // end namespace Eigen
#endif // EIGEN_ARRAYWRAPPER_H

View File

@@ -1,443 +1,80 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
// for linear algebra.
//
// Copyright (C) 2007 Michael Olbrich <michael.olbrich@gmx.net>
// Copyright (C) 2006-2008 Benoit Jacob <jacob.benoit.1@gmail.com>
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2006-2010 Benoit Jacob <jacob.benoit.1@gmail.com>
// Copyright (C) 2008 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_ASSIGN_H
#define EIGEN_ASSIGN_H
/***************************************************************************
* Part 1 : the logic deciding a strategy for vectorization and unrolling
***************************************************************************/
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
template <typename Derived, typename OtherDerived>
struct ei_assign_traits
{
public:
enum {
DstIsAligned = Derived::Flags & AlignedBit,
SrcIsAligned = OtherDerived::Flags & AlignedBit,
SrcAlignment = DstIsAligned && SrcIsAligned ? Aligned : Unaligned
};
namespace Eigen {
private:
enum {
InnerSize = int(Derived::Flags)&RowMajorBit
? Derived::ColsAtCompileTime
: Derived::RowsAtCompileTime,
InnerMaxSize = int(Derived::Flags)&RowMajorBit
? Derived::MaxColsAtCompileTime
: Derived::MaxRowsAtCompileTime,
PacketSize = ei_packet_traits<typename Derived::Scalar>::size
};
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& DenseBase<Derived>::lazyAssign(const DenseBase<OtherDerived>& other) {
enum { SameType = internal::is_same<typename Derived::Scalar, typename OtherDerived::Scalar>::value };
enum {
MightVectorize = (int(Derived::Flags) & int(OtherDerived::Flags) & ActualPacketAccessBit)
&& ((int(Derived::Flags)&RowMajorBit)==(int(OtherDerived::Flags)&RowMajorBit)),
MayInnerVectorize = MightVectorize && int(InnerSize)!=Dynamic && int(InnerSize)%int(PacketSize)==0
&& int(DstIsAligned) && int(SrcIsAligned),
MayLinearVectorize = MightVectorize && (int(Derived::Flags) & int(OtherDerived::Flags) & LinearAccessBit),
MaySliceVectorize = MightVectorize && int(InnerMaxSize)>=3*PacketSize /* slice vectorization can be slow, so we only
want it if the slices are big, which is indicated by InnerMaxSize rather than InnerSize, think of the case
of a dynamic block in a fixed-size matrix */
};
EIGEN_STATIC_ASSERT_LVALUE(Derived)
EIGEN_STATIC_ASSERT_SAME_MATRIX_SIZE(Derived, OtherDerived)
EIGEN_STATIC_ASSERT(
SameType,
YOU_MIXED_DIFFERENT_NUMERIC_TYPES__YOU_NEED_TO_USE_THE_CAST_METHOD_OF_MATRIXBASE_TO_CAST_NUMERIC_TYPES_EXPLICITLY)
public:
enum {
Vectorization = int(MayInnerVectorize) ? int(InnerVectorization)
: int(MayLinearVectorize) ? int(LinearVectorization)
: int(MaySliceVectorize) ? int(SliceVectorization)
: int(NoVectorization)
};
eigen_assert(rows() == other.rows() && cols() == other.cols());
internal::call_assignment_no_alias(derived(), other.derived());
private:
enum {
UnrollingLimit = EIGEN_UNROLLING_LIMIT * (int(Vectorization) == int(NoVectorization) ? 1 : int(PacketSize)),
MayUnrollCompletely = int(Derived::SizeAtCompileTime) * int(OtherDerived::CoeffReadCost) <= int(UnrollingLimit),
MayUnrollInner = int(InnerSize * OtherDerived::CoeffReadCost) <= int(UnrollingLimit)
};
public:
enum {
Unrolling = (int(Vectorization) == int(InnerVectorization) || int(Vectorization) == int(NoVectorization))
? (
int(MayUnrollCompletely) ? int(CompleteUnrolling)
: int(MayUnrollInner) ? int(InnerUnrolling)
: int(NoUnrolling)
)
: int(Vectorization) == int(LinearVectorization)
? ( int(MayUnrollCompletely) && int(DstIsAligned) ? int(CompleteUnrolling) : int(NoUnrolling) )
: int(NoUnrolling)
};
};
/***************************************************************************
* Part 2 : meta-unrollers
***************************************************************************/
/***********************
*** No vectorization ***
***********************/
template<typename Derived1, typename Derived2, int Index, int Stop>
struct ei_assign_novec_CompleteUnrolling
{
enum {
row = int(Derived1::Flags)&RowMajorBit
? Index / int(Derived1::ColsAtCompileTime)
: Index % Derived1::RowsAtCompileTime,
col = int(Derived1::Flags)&RowMajorBit
? Index % int(Derived1::ColsAtCompileTime)
: Index / Derived1::RowsAtCompileTime
};
inline static void run(Derived1 &dst, const Derived2 &src)
{
dst.copyCoeff(row, col, src);
ei_assign_novec_CompleteUnrolling<Derived1, Derived2, Index+1, Stop>::run(dst, src);
}
};
template<typename Derived1, typename Derived2, int Stop>
struct ei_assign_novec_CompleteUnrolling<Derived1, Derived2, Stop, Stop>
{
inline static void run(Derived1 &, const Derived2 &) {}
};
template<typename Derived1, typename Derived2, int Index, int Stop>
struct ei_assign_novec_InnerUnrolling
{
inline static void run(Derived1 &dst, const Derived2 &src, int row_or_col)
{
const bool rowMajor = int(Derived1::Flags)&RowMajorBit;
const int row = rowMajor ? row_or_col : Index;
const int col = rowMajor ? Index : row_or_col;
dst.copyCoeff(row, col, src);
ei_assign_novec_InnerUnrolling<Derived1, Derived2, Index+1, Stop>::run(dst, src, row_or_col);
}
};
template<typename Derived1, typename Derived2, int Stop>
struct ei_assign_novec_InnerUnrolling<Derived1, Derived2, Stop, Stop>
{
inline static void run(Derived1 &, const Derived2 &, int) {}
};
/**************************
*** Inner vectorization ***
**************************/
template<typename Derived1, typename Derived2, int Index, int Stop>
struct ei_assign_innervec_CompleteUnrolling
{
enum {
row = int(Derived1::Flags)&RowMajorBit
? Index / int(Derived1::ColsAtCompileTime)
: Index % Derived1::RowsAtCompileTime,
col = int(Derived1::Flags)&RowMajorBit
? Index % int(Derived1::ColsAtCompileTime)
: Index / Derived1::RowsAtCompileTime,
SrcAlignment = ei_assign_traits<Derived1,Derived2>::SrcAlignment
};
inline static void run(Derived1 &dst, const Derived2 &src)
{
dst.template copyPacket<Derived2, Aligned, SrcAlignment>(row, col, src);
ei_assign_innervec_CompleteUnrolling<Derived1, Derived2,
Index+ei_packet_traits<typename Derived1::Scalar>::size, Stop>::run(dst, src);
}
};
template<typename Derived1, typename Derived2, int Stop>
struct ei_assign_innervec_CompleteUnrolling<Derived1, Derived2, Stop, Stop>
{
inline static void run(Derived1 &, const Derived2 &) {}
};
template<typename Derived1, typename Derived2, int Index, int Stop>
struct ei_assign_innervec_InnerUnrolling
{
inline static void run(Derived1 &dst, const Derived2 &src, int row_or_col)
{
const int row = int(Derived1::Flags)&RowMajorBit ? row_or_col : Index;
const int col = int(Derived1::Flags)&RowMajorBit ? Index : row_or_col;
dst.template copyPacket<Derived2, Aligned, Aligned>(row, col, src);
ei_assign_innervec_InnerUnrolling<Derived1, Derived2,
Index+ei_packet_traits<typename Derived1::Scalar>::size, Stop>::run(dst, src, row_or_col);
}
};
template<typename Derived1, typename Derived2, int Stop>
struct ei_assign_innervec_InnerUnrolling<Derived1, Derived2, Stop, Stop>
{
inline static void run(Derived1 &, const Derived2 &, int) {}
};
/***************************************************************************
* Part 3 : implementation of all cases
***************************************************************************/
template<typename Derived1, typename Derived2,
int Vectorization = ei_assign_traits<Derived1, Derived2>::Vectorization,
int Unrolling = ei_assign_traits<Derived1, Derived2>::Unrolling>
struct ei_assign_impl;
/***********************
*** No vectorization ***
***********************/
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, NoVectorization, NoUnrolling>
{
static void run(Derived1 &dst, const Derived2 &src)
{
const int innerSize = dst.innerSize();
const int outerSize = dst.outerSize();
for(int j = 0; j < outerSize; j++)
for(int i = 0; i < innerSize; i++)
{
if(int(Derived1::Flags)&RowMajorBit)
dst.copyCoeff(j, i, src);
else
dst.copyCoeff(i, j, src);
}
}
};
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, NoVectorization, CompleteUnrolling>
{
inline static void run(Derived1 &dst, const Derived2 &src)
{
ei_assign_novec_CompleteUnrolling<Derived1, Derived2, 0, Derived1::SizeAtCompileTime>
::run(dst, src);
}
};
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, NoVectorization, InnerUnrolling>
{
static void run(Derived1 &dst, const Derived2 &src)
{
const bool rowMajor = int(Derived1::Flags)&RowMajorBit;
const int innerSize = rowMajor ? Derived1::ColsAtCompileTime : Derived1::RowsAtCompileTime;
const int outerSize = dst.outerSize();
for(int j = 0; j < outerSize; j++)
ei_assign_novec_InnerUnrolling<Derived1, Derived2, 0, innerSize>
::run(dst, src, j);
}
};
/**************************
*** Inner vectorization ***
**************************/
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, InnerVectorization, NoUnrolling>
{
static void run(Derived1 &dst, const Derived2 &src)
{
const int innerSize = dst.innerSize();
const int outerSize = dst.outerSize();
const int packetSize = ei_packet_traits<typename Derived1::Scalar>::size;
for(int j = 0; j < outerSize; j++)
for(int i = 0; i < innerSize; i+=packetSize)
{
if(int(Derived1::Flags)&RowMajorBit)
dst.template copyPacket<Derived2, Aligned, Aligned>(j, i, src);
else
dst.template copyPacket<Derived2, Aligned, Aligned>(i, j, src);
}
}
};
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, InnerVectorization, CompleteUnrolling>
{
inline static void run(Derived1 &dst, const Derived2 &src)
{
ei_assign_innervec_CompleteUnrolling<Derived1, Derived2, 0, Derived1::SizeAtCompileTime>
::run(dst, src);
}
};
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, InnerVectorization, InnerUnrolling>
{
static void run(Derived1 &dst, const Derived2 &src)
{
const bool rowMajor = int(Derived1::Flags)&RowMajorBit;
const int innerSize = rowMajor ? Derived1::ColsAtCompileTime : Derived1::RowsAtCompileTime;
const int outerSize = dst.outerSize();
for(int j = 0; j < outerSize; j++)
ei_assign_innervec_InnerUnrolling<Derived1, Derived2, 0, innerSize>
::run(dst, src, j);
}
};
/***************************
*** Linear vectorization ***
***************************/
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, LinearVectorization, NoUnrolling>
{
static void run(Derived1 &dst, const Derived2 &src)
{
const int size = dst.size();
const int packetSize = ei_packet_traits<typename Derived1::Scalar>::size;
const int alignedStart = ei_assign_traits<Derived1,Derived2>::DstIsAligned ? 0
: ei_alignmentOffset(&dst.coeffRef(0), size);
const int alignedEnd = alignedStart + ((size-alignedStart)/packetSize)*packetSize;
for(int index = 0; index < alignedStart; index++)
dst.copyCoeff(index, src);
for(int index = alignedStart; index < alignedEnd; index += packetSize)
{
dst.template copyPacket<Derived2, Aligned, ei_assign_traits<Derived1,Derived2>::SrcAlignment>(index, src);
}
for(int index = alignedEnd; index < size; index++)
dst.copyCoeff(index, src);
}
};
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, LinearVectorization, CompleteUnrolling>
{
static void run(Derived1 &dst, const Derived2 &src)
{
const int size = Derived1::SizeAtCompileTime;
const int packetSize = ei_packet_traits<typename Derived1::Scalar>::size;
const int alignedSize = (size/packetSize)*packetSize;
ei_assign_innervec_CompleteUnrolling<Derived1, Derived2, 0, alignedSize>::run(dst, src);
ei_assign_novec_CompleteUnrolling<Derived1, Derived2, alignedSize, size>::run(dst, src);
}
};
/**************************
*** Slice vectorization ***
***************************/
template<typename Derived1, typename Derived2>
struct ei_assign_impl<Derived1, Derived2, SliceVectorization, NoUnrolling>
{
static void run(Derived1 &dst, const Derived2 &src)
{
const int packetSize = ei_packet_traits<typename Derived1::Scalar>::size;
const int packetAlignedMask = packetSize - 1;
const int innerSize = dst.innerSize();
const int outerSize = dst.outerSize();
const int alignedStep = (packetSize - dst.stride() % packetSize) & packetAlignedMask;
int alignedStart = ei_assign_traits<Derived1,Derived2>::DstIsAligned ? 0
: ei_alignmentOffset(&dst.coeffRef(0), innerSize);
for(int i = 0; i < outerSize; i++)
{
const int alignedEnd = alignedStart + ((innerSize-alignedStart) & ~packetAlignedMask);
// do the non-vectorizable part of the assignment
for (int index = 0; index<alignedStart ; index++)
{
if(Derived1::Flags&RowMajorBit)
dst.copyCoeff(i, index, src);
else
dst.copyCoeff(index, i, src);
}
// do the vectorizable part of the assignment
for (int index = alignedStart; index<alignedEnd; index+=packetSize)
{
if(Derived1::Flags&RowMajorBit)
dst.template copyPacket<Derived2, Aligned, Unaligned>(i, index, src);
else
dst.template copyPacket<Derived2, Aligned, Unaligned>(index, i, src);
}
// do the non-vectorizable part of the assignment
for (int index = alignedEnd; index<innerSize ; index++)
{
if(Derived1::Flags&RowMajorBit)
dst.copyCoeff(i, index, src);
else
dst.copyCoeff(index, i, src);
}
alignedStart = std::min<int>((alignedStart+alignedStep)%packetSize, innerSize);
}
}
};
/***************************************************************************
* Part 4 : implementation of MatrixBase methods
***************************************************************************/
template<typename Derived>
template<typename OtherDerived>
inline Derived& MatrixBase<Derived>
::lazyAssign(const MatrixBase<OtherDerived>& other)
{
EIGEN_STATIC_ASSERT_SAME_MATRIX_SIZE(Derived,OtherDerived)
ei_assert(rows() == other.rows() && cols() == other.cols());
ei_assign_impl<Derived, OtherDerived>::run(derived(),other.derived());
return derived();
}
template<typename Derived, typename OtherDerived,
bool EvalBeforeAssigning = int(OtherDerived::Flags) & EvalBeforeAssigningBit,
bool NeedToTranspose = Derived::IsVectorAtCompileTime
&& OtherDerived::IsVectorAtCompileTime
&& int(Derived::RowsAtCompileTime) == int(OtherDerived::ColsAtCompileTime)
&& int(Derived::ColsAtCompileTime) == int(OtherDerived::RowsAtCompileTime)
&& int(Derived::SizeAtCompileTime) != 1>
struct ei_assign_selector;
template<typename Derived, typename OtherDerived>
struct ei_assign_selector<Derived,OtherDerived,false,false> {
static Derived& run(Derived& dst, const OtherDerived& other) { return dst.lazyAssign(other.derived()); }
};
template<typename Derived, typename OtherDerived>
struct ei_assign_selector<Derived,OtherDerived,true,false> {
static Derived& run(Derived& dst, const OtherDerived& other) { return dst.lazyAssign(other.eval()); }
};
template<typename Derived, typename OtherDerived>
struct ei_assign_selector<Derived,OtherDerived,false,true> {
static Derived& run(Derived& dst, const OtherDerived& other) { return dst.lazyAssign(other.transpose()); }
};
template<typename Derived, typename OtherDerived>
struct ei_assign_selector<Derived,OtherDerived,true,true> {
static Derived& run(Derived& dst, const OtherDerived& other) { return dst.lazyAssign(other.transpose().eval()); }
};
template<typename Derived>
template<typename OtherDerived>
inline Derived& MatrixBase<Derived>
::operator=(const MatrixBase<OtherDerived>& other)
{
return ei_assign_selector<Derived,OtherDerived>::run(derived(), other.derived());
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& DenseBase<Derived>::operator=(const DenseBase<OtherDerived>& other) {
internal::call_assignment(derived(), other.derived());
return derived();
}
#endif // EIGEN_ASSIGN_H
template <typename Derived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& DenseBase<Derived>::operator=(const DenseBase& other) {
internal::call_assignment(derived(), other.derived());
return derived();
}
template <typename Derived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& MatrixBase<Derived>::operator=(const MatrixBase& other) {
internal::call_assignment(derived(), other.derived());
return derived();
}
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& MatrixBase<Derived>::operator=(const DenseBase<OtherDerived>& other) {
internal::call_assignment(derived(), other.derived());
return derived();
}
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& MatrixBase<Derived>::operator=(const EigenBase<OtherDerived>& other) {
internal::call_assignment(derived(), other.derived());
return derived();
}
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& MatrixBase<Derived>::operator=(
const ReturnByValue<OtherDerived>& other) {
other.derived().evalTo(derived());
return derived();
}
} // end namespace Eigen
#endif // EIGEN_ASSIGN_H

File diff suppressed because it is too large Load Diff

183
Eigen/src/Core/Assign_MKL.h Normal file
View File

@@ -0,0 +1,183 @@
/*
Copyright (c) 2011, Intel Corporation. All rights reserved.
Copyright (C) 2015 Gael Guennebaud <gael.guennebaud@inria.fr>
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of Intel Corporation nor the names of its contributors may
be used to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
********************************************************************************
* Content : Eigen bindings to Intel(R) MKL
* MKL VML support for coefficient-wise unary Eigen expressions like a=b.sin()
********************************************************************************
*/
#ifndef EIGEN_ASSIGN_VML_H
#define EIGEN_ASSIGN_VML_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename Dst, typename Src>
class vml_assign_traits {
private:
enum {
DstHasDirectAccess = Dst::Flags & DirectAccessBit,
SrcHasDirectAccess = Src::Flags & DirectAccessBit,
StorageOrdersAgree = (int(Dst::IsRowMajor) == int(Src::IsRowMajor)),
InnerSize = int(Dst::IsVectorAtCompileTime) ? int(Dst::SizeAtCompileTime)
: int(Dst::Flags) & RowMajorBit ? int(Dst::ColsAtCompileTime)
: int(Dst::RowsAtCompileTime),
InnerMaxSize = int(Dst::IsVectorAtCompileTime) ? int(Dst::MaxSizeAtCompileTime)
: int(Dst::Flags) & RowMajorBit ? int(Dst::MaxColsAtCompileTime)
: int(Dst::MaxRowsAtCompileTime),
MaxSizeAtCompileTime = Dst::SizeAtCompileTime,
MightEnableVml = bool(StorageOrdersAgree) && bool(DstHasDirectAccess) && bool(SrcHasDirectAccess) &&
Src::InnerStrideAtCompileTime == 1 && Dst::InnerStrideAtCompileTime == 1,
MightLinearize = bool(MightEnableVml) && (int(Dst::Flags) & int(Src::Flags) & LinearAccessBit),
VmlSize = bool(MightLinearize) ? MaxSizeAtCompileTime : InnerMaxSize,
LargeEnough = (VmlSize == Dynamic) || VmlSize >= EIGEN_MKL_VML_THRESHOLD
};
public:
enum { EnableVml = MightEnableVml && LargeEnough, Traversal = MightLinearize ? LinearTraversal : DefaultTraversal };
};
#define EIGEN_PP_EXPAND(ARG) ARG
#if !defined(EIGEN_FAST_MATH) || (EIGEN_FAST_MATH != 1)
#define EIGEN_VMLMODE_EXPAND_xLA , VML_HA
#else
#define EIGEN_VMLMODE_EXPAND_xLA , VML_LA
#endif
#define EIGEN_VMLMODE_EXPAND_x_
#define EIGEN_VMLMODE_PREFIX_xLA vm
#define EIGEN_VMLMODE_PREFIX_x_ v
#define EIGEN_VMLMODE_PREFIX(VMLMODE) EIGEN_CAT(EIGEN_VMLMODE_PREFIX_x, VMLMODE)
#define EIGEN_MKL_VML_DECLARE_UNARY_CALL(EIGENOP, VMLOP, EIGENTYPE, VMLTYPE, VMLMODE) \
template <typename DstXprType, typename SrcXprNested> \
struct Assignment<DstXprType, CwiseUnaryOp<scalar_##EIGENOP##_op<EIGENTYPE>, SrcXprNested>, \
assign_op<EIGENTYPE, EIGENTYPE>, Dense2Dense, \
std::enable_if_t<vml_assign_traits<DstXprType, SrcXprNested>::EnableVml>> { \
typedef CwiseUnaryOp<scalar_##EIGENOP##_op<EIGENTYPE>, SrcXprNested> SrcXprType; \
static void run(DstXprType &dst, const SrcXprType &src, const assign_op<EIGENTYPE, EIGENTYPE> &func) { \
resize_if_allowed(dst, src, func); \
eigen_assert(dst.rows() == src.rows() && dst.cols() == src.cols()); \
if (vml_assign_traits<DstXprType, SrcXprNested>::Traversal == (int)LinearTraversal) { \
VMLOP(dst.size(), (const VMLTYPE *)src.nestedExpression().data(), \
(VMLTYPE *)dst.data() EIGEN_PP_EXPAND(EIGEN_VMLMODE_EXPAND_x##VMLMODE)); \
} else { \
const Index outerSize = dst.outerSize(); \
for (Index outer = 0; outer < outerSize; ++outer) { \
const EIGENTYPE *src_ptr = src.IsRowMajor ? &(src.nestedExpression().coeffRef(outer, 0)) \
: &(src.nestedExpression().coeffRef(0, outer)); \
EIGENTYPE *dst_ptr = dst.IsRowMajor ? &(dst.coeffRef(outer, 0)) : &(dst.coeffRef(0, outer)); \
VMLOP(dst.innerSize(), (const VMLTYPE *)src_ptr, \
(VMLTYPE *)dst_ptr EIGEN_PP_EXPAND(EIGEN_VMLMODE_EXPAND_x##VMLMODE)); \
} \
} \
} \
};
#define EIGEN_MKL_VML_DECLARE_UNARY_CALLS_REAL(EIGENOP, VMLOP, VMLMODE) \
EIGEN_MKL_VML_DECLARE_UNARY_CALL(EIGENOP, EIGEN_CAT(EIGEN_VMLMODE_PREFIX(VMLMODE), s##VMLOP), float, float, VMLMODE) \
EIGEN_MKL_VML_DECLARE_UNARY_CALL(EIGENOP, EIGEN_CAT(EIGEN_VMLMODE_PREFIX(VMLMODE), d##VMLOP), double, double, VMLMODE)
#define EIGEN_MKL_VML_DECLARE_UNARY_CALLS_CPLX(EIGENOP, VMLOP, VMLMODE) \
EIGEN_MKL_VML_DECLARE_UNARY_CALL(EIGENOP, EIGEN_CAT(EIGEN_VMLMODE_PREFIX(VMLMODE), c##VMLOP), scomplex, \
MKL_Complex8, VMLMODE) \
EIGEN_MKL_VML_DECLARE_UNARY_CALL(EIGENOP, EIGEN_CAT(EIGEN_VMLMODE_PREFIX(VMLMODE), z##VMLOP), dcomplex, \
MKL_Complex16, VMLMODE)
#define EIGEN_MKL_VML_DECLARE_UNARY_CALLS(EIGENOP, VMLOP, VMLMODE) \
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_REAL(EIGENOP, VMLOP, VMLMODE) \
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_CPLX(EIGENOP, VMLOP, VMLMODE)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(sin, Sin, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(asin, Asin, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(sinh, Sinh, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(cos, Cos, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(acos, Acos, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(cosh, Cosh, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(tan, Tan, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(atan, Atan, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(tanh, Tanh, LA)
// EIGEN_MKL_VML_DECLARE_UNARY_CALLS(abs, Abs, _)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(exp, Exp, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(log, Ln, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(log10, Log10, LA)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS(sqrt, Sqrt, _)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_REAL(square, Sqr, _)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_CPLX(arg, Arg, _)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_REAL(round, Round, _)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_REAL(floor, Floor, _)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_REAL(ceil, Ceil, _)
EIGEN_MKL_VML_DECLARE_UNARY_CALLS_REAL(cbrt, Cbrt, _)
#define EIGEN_MKL_VML_DECLARE_POW_CALL(EIGENOP, VMLOP, EIGENTYPE, VMLTYPE, VMLMODE) \
template <typename DstXprType, typename SrcXprNested, typename Plain> \
struct Assignment<DstXprType, \
CwiseBinaryOp<scalar_##EIGENOP##_op<EIGENTYPE, EIGENTYPE>, SrcXprNested, \
const CwiseNullaryOp<internal::scalar_constant_op<EIGENTYPE>, Plain>>, \
assign_op<EIGENTYPE, EIGENTYPE>, Dense2Dense, \
std::enable_if_t<vml_assign_traits<DstXprType, SrcXprNested>::EnableVml>> { \
typedef CwiseBinaryOp<scalar_##EIGENOP##_op<EIGENTYPE, EIGENTYPE>, SrcXprNested, \
const CwiseNullaryOp<internal::scalar_constant_op<EIGENTYPE>, Plain>> \
SrcXprType; \
static void run(DstXprType &dst, const SrcXprType &src, const assign_op<EIGENTYPE, EIGENTYPE> &func) { \
resize_if_allowed(dst, src, func); \
eigen_assert(dst.rows() == src.rows() && dst.cols() == src.cols()); \
VMLTYPE exponent = reinterpret_cast<const VMLTYPE &>(src.rhs().functor().m_other); \
if (vml_assign_traits<DstXprType, SrcXprNested>::Traversal == LinearTraversal) { \
VMLOP(dst.size(), (const VMLTYPE *)src.lhs().data(), exponent, \
(VMLTYPE *)dst.data() EIGEN_PP_EXPAND(EIGEN_VMLMODE_EXPAND_x##VMLMODE)); \
} else { \
const Index outerSize = dst.outerSize(); \
for (Index outer = 0; outer < outerSize; ++outer) { \
const EIGENTYPE *src_ptr = \
src.IsRowMajor ? &(src.lhs().coeffRef(outer, 0)) : &(src.lhs().coeffRef(0, outer)); \
EIGENTYPE *dst_ptr = dst.IsRowMajor ? &(dst.coeffRef(outer, 0)) : &(dst.coeffRef(0, outer)); \
VMLOP(dst.innerSize(), (const VMLTYPE *)src_ptr, exponent, \
(VMLTYPE *)dst_ptr EIGEN_PP_EXPAND(EIGEN_VMLMODE_EXPAND_x##VMLMODE)); \
} \
} \
} \
};
EIGEN_MKL_VML_DECLARE_POW_CALL(pow, vmsPowx, float, float, LA)
EIGEN_MKL_VML_DECLARE_POW_CALL(pow, vmdPowx, double, double, LA)
EIGEN_MKL_VML_DECLARE_POW_CALL(pow, vmcPowx, scomplex, MKL_Complex8, LA)
EIGEN_MKL_VML_DECLARE_POW_CALL(pow, vmzPowx, dcomplex, MKL_Complex16, LA)
} // end namespace internal
} // end namespace Eigen
#endif // EIGEN_ASSIGN_VML_H

338
Eigen/src/Core/BandMatrix.h Normal file
View File

@@ -0,0 +1,338 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2009 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_BANDMATRIX_H
#define EIGEN_BANDMATRIX_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename Derived>
class BandMatrixBase : public EigenBase<Derived> {
public:
enum {
Flags = internal::traits<Derived>::Flags,
CoeffReadCost = internal::traits<Derived>::CoeffReadCost,
RowsAtCompileTime = internal::traits<Derived>::RowsAtCompileTime,
ColsAtCompileTime = internal::traits<Derived>::ColsAtCompileTime,
MaxRowsAtCompileTime = internal::traits<Derived>::MaxRowsAtCompileTime,
MaxColsAtCompileTime = internal::traits<Derived>::MaxColsAtCompileTime,
Supers = internal::traits<Derived>::Supers,
Subs = internal::traits<Derived>::Subs,
Options = internal::traits<Derived>::Options
};
typedef typename internal::traits<Derived>::Scalar Scalar;
typedef Matrix<Scalar, RowsAtCompileTime, ColsAtCompileTime> DenseMatrixType;
typedef typename DenseMatrixType::StorageIndex StorageIndex;
typedef typename internal::traits<Derived>::CoefficientsType CoefficientsType;
typedef EigenBase<Derived> Base;
protected:
enum {
DataRowsAtCompileTime = ((Supers != Dynamic) && (Subs != Dynamic)) ? 1 + Supers + Subs : Dynamic,
SizeAtCompileTime = min_size_prefer_dynamic(RowsAtCompileTime, ColsAtCompileTime)
};
public:
using Base::cols;
using Base::derived;
using Base::rows;
/** \returns the number of super diagonals */
inline Index supers() const { return derived().supers(); }
/** \returns the number of sub diagonals */
inline Index subs() const { return derived().subs(); }
/** \returns an expression of the underlying coefficient matrix */
inline const CoefficientsType& coeffs() const { return derived().coeffs(); }
/** \returns an expression of the underlying coefficient matrix */
inline CoefficientsType& coeffs() { return derived().coeffs(); }
/** \returns a vector expression of the \a i -th column,
* only the meaningful part is returned.
* \warning the internal storage must be column major. */
inline Block<CoefficientsType, Dynamic, 1> col(Index i) {
EIGEN_STATIC_ASSERT((int(Options) & int(RowMajor)) == 0, THIS_METHOD_IS_ONLY_FOR_COLUMN_MAJOR_MATRICES);
Index start = 0;
Index len = coeffs().rows();
if (i <= supers()) {
start = supers() - i;
len = (std::min)(rows(), std::max<Index>(0, coeffs().rows() - (supers() - i)));
} else if (i >= rows() - subs())
len = std::max<Index>(0, coeffs().rows() - (i + 1 - rows() + subs()));
return Block<CoefficientsType, Dynamic, 1>(coeffs(), start, i, len, 1);
}
/** \returns a vector expression of the main diagonal */
inline Block<CoefficientsType, 1, SizeAtCompileTime> diagonal() {
return Block<CoefficientsType, 1, SizeAtCompileTime>(coeffs(), supers(), 0, 1, (std::min)(rows(), cols()));
}
/** \returns a vector expression of the main diagonal (const version) */
inline const Block<const CoefficientsType, 1, SizeAtCompileTime> diagonal() const {
return Block<const CoefficientsType, 1, SizeAtCompileTime>(coeffs(), supers(), 0, 1, (std::min)(rows(), cols()));
}
template <int Index>
struct DiagonalIntReturnType {
enum {
ReturnOpposite =
(int(Options) & int(SelfAdjoint)) && (((Index) > 0 && Supers == 0) || ((Index) < 0 && Subs == 0)),
Conjugate = ReturnOpposite && NumTraits<Scalar>::IsComplex,
ActualIndex = ReturnOpposite ? -Index : Index,
DiagonalSize =
(RowsAtCompileTime == Dynamic || ColsAtCompileTime == Dynamic)
? Dynamic
: (ActualIndex < 0 ? min_size_prefer_dynamic(ColsAtCompileTime, RowsAtCompileTime + ActualIndex)
: min_size_prefer_dynamic(RowsAtCompileTime, ColsAtCompileTime - ActualIndex))
};
typedef Block<CoefficientsType, 1, DiagonalSize> BuildType;
typedef std::conditional_t<Conjugate, CwiseUnaryOp<internal::scalar_conjugate_op<Scalar>, BuildType>, BuildType>
Type;
};
/** \returns a vector expression of the \a N -th sub or super diagonal */
template <int N>
inline typename DiagonalIntReturnType<N>::Type diagonal() {
return typename DiagonalIntReturnType<N>::BuildType(coeffs(), supers() - N, (std::max)(0, N), 1, diagonalLength(N));
}
/** \returns a vector expression of the \a N -th sub or super diagonal */
template <int N>
inline const typename DiagonalIntReturnType<N>::Type diagonal() const {
return typename DiagonalIntReturnType<N>::BuildType(coeffs(), supers() - N, (std::max)(0, N), 1, diagonalLength(N));
}
/** \returns a vector expression of the \a i -th sub or super diagonal */
inline Block<CoefficientsType, 1, Dynamic> diagonal(Index i) {
eigen_assert((i < 0 && -i <= subs()) || (i >= 0 && i <= supers()));
return Block<CoefficientsType, 1, Dynamic>(coeffs(), supers() - i, std::max<Index>(0, i), 1, diagonalLength(i));
}
/** \returns a vector expression of the \a i -th sub or super diagonal */
inline const Block<const CoefficientsType, 1, Dynamic> diagonal(Index i) const {
eigen_assert((i < 0 && -i <= subs()) || (i >= 0 && i <= supers()));
return Block<const CoefficientsType, 1, Dynamic>(coeffs(), supers() - i, std::max<Index>(0, i), 1,
diagonalLength(i));
}
template <typename Dest>
inline void evalTo(Dest& dst) const {
dst.resize(rows(), cols());
dst.setZero();
dst.diagonal() = diagonal();
for (Index i = 1; i <= supers(); ++i) dst.diagonal(i) = diagonal(i);
for (Index i = 1; i <= subs(); ++i) dst.diagonal(-i) = diagonal(-i);
}
DenseMatrixType toDenseMatrix() const {
DenseMatrixType res(rows(), cols());
evalTo(res);
return res;
}
protected:
inline Index diagonalLength(Index i) const {
return i < 0 ? (std::min)(cols(), rows() + i) : (std::min)(rows(), cols() - i);
}
};
/**
* \class BandMatrix
* \ingroup Core_Module
*
* \brief Represents a rectangular matrix with a banded storage
*
* \tparam Scalar_ Numeric type, i.e. float, double, int
* \tparam Rows_ Number of rows, or \b Dynamic
* \tparam Cols_ Number of columns, or \b Dynamic
* \tparam Supers_ Number of super diagonal
* \tparam Subs_ Number of sub diagonal
* \tparam Options_ A combination of either \b #RowMajor or \b #ColMajor, and of \b #SelfAdjoint
* The former controls \ref TopicStorageOrders "storage order", and defaults to
* column-major. The latter controls whether the matrix represents a selfadjoint
* matrix in which case either Supers of Subs have to be null.
*
* \sa class TridiagonalMatrix
*/
template <typename Scalar_, int Rows_, int Cols_, int Supers_, int Subs_, int Options_>
struct traits<BandMatrix<Scalar_, Rows_, Cols_, Supers_, Subs_, Options_> > {
typedef Scalar_ Scalar;
typedef Dense StorageKind;
typedef Eigen::Index StorageIndex;
enum {
CoeffReadCost = NumTraits<Scalar>::ReadCost,
RowsAtCompileTime = Rows_,
ColsAtCompileTime = Cols_,
MaxRowsAtCompileTime = Rows_,
MaxColsAtCompileTime = Cols_,
Flags = LvalueBit,
Supers = Supers_,
Subs = Subs_,
Options = Options_,
DataRowsAtCompileTime = ((Supers != Dynamic) && (Subs != Dynamic)) ? 1 + Supers + Subs : Dynamic
};
typedef Matrix<Scalar, DataRowsAtCompileTime, ColsAtCompileTime, int(Options) & int(RowMajor) ? RowMajor : ColMajor>
CoefficientsType;
};
template <typename Scalar_, int Rows, int Cols, int Supers, int Subs, int Options>
class BandMatrix : public BandMatrixBase<BandMatrix<Scalar_, Rows, Cols, Supers, Subs, Options> > {
public:
typedef typename internal::traits<BandMatrix>::Scalar Scalar;
typedef typename internal::traits<BandMatrix>::StorageIndex StorageIndex;
typedef typename internal::traits<BandMatrix>::CoefficientsType CoefficientsType;
explicit inline BandMatrix(Index rows = Rows, Index cols = Cols, Index supers = Supers, Index subs = Subs)
: m_coeffs(1 + supers + subs, cols), m_rows(rows), m_supers(supers), m_subs(subs) {}
/** \returns the number of columns */
constexpr Index rows() const { return m_rows.value(); }
/** \returns the number of rows */
constexpr Index cols() const { return m_coeffs.cols(); }
/** \returns the number of super diagonals */
constexpr Index supers() const { return m_supers.value(); }
/** \returns the number of sub diagonals */
constexpr Index subs() const { return m_subs.value(); }
inline const CoefficientsType& coeffs() const { return m_coeffs; }
inline CoefficientsType& coeffs() { return m_coeffs; }
protected:
CoefficientsType m_coeffs;
internal::variable_if_dynamic<Index, Rows> m_rows;
internal::variable_if_dynamic<Index, Supers> m_supers;
internal::variable_if_dynamic<Index, Subs> m_subs;
};
template <typename CoefficientsType_, int Rows_, int Cols_, int Supers_, int Subs_, int Options_>
class BandMatrixWrapper;
template <typename CoefficientsType_, int Rows_, int Cols_, int Supers_, int Subs_, int Options_>
struct traits<BandMatrixWrapper<CoefficientsType_, Rows_, Cols_, Supers_, Subs_, Options_> > {
typedef typename CoefficientsType_::Scalar Scalar;
typedef typename CoefficientsType_::StorageKind StorageKind;
typedef typename CoefficientsType_::StorageIndex StorageIndex;
enum {
CoeffReadCost = internal::traits<CoefficientsType_>::CoeffReadCost,
RowsAtCompileTime = Rows_,
ColsAtCompileTime = Cols_,
MaxRowsAtCompileTime = Rows_,
MaxColsAtCompileTime = Cols_,
Flags = LvalueBit,
Supers = Supers_,
Subs = Subs_,
Options = Options_,
DataRowsAtCompileTime = ((Supers != Dynamic) && (Subs != Dynamic)) ? 1 + Supers + Subs : Dynamic
};
typedef CoefficientsType_ CoefficientsType;
};
template <typename CoefficientsType_, int Rows_, int Cols_, int Supers_, int Subs_, int Options_>
class BandMatrixWrapper
: public BandMatrixBase<BandMatrixWrapper<CoefficientsType_, Rows_, Cols_, Supers_, Subs_, Options_> > {
public:
typedef typename internal::traits<BandMatrixWrapper>::Scalar Scalar;
typedef typename internal::traits<BandMatrixWrapper>::CoefficientsType CoefficientsType;
typedef typename internal::traits<BandMatrixWrapper>::StorageIndex StorageIndex;
explicit inline BandMatrixWrapper(const CoefficientsType& coeffs, Index rows = Rows_, Index cols = Cols_,
Index supers = Supers_, Index subs = Subs_)
: m_coeffs(coeffs), m_rows(rows), m_supers(supers), m_subs(subs) {
EIGEN_UNUSED_VARIABLE(cols);
// eigen_assert(coeffs.cols()==cols() && (supers()+subs()+1)==coeffs.rows());
}
/** \returns the number of columns */
constexpr Index rows() const { return m_rows.value(); }
/** \returns the number of rows */
constexpr Index cols() const { return m_coeffs.cols(); }
/** \returns the number of super diagonals */
constexpr Index supers() const { return m_supers.value(); }
/** \returns the number of sub diagonals */
constexpr Index subs() const { return m_subs.value(); }
inline const CoefficientsType& coeffs() const { return m_coeffs; }
protected:
const CoefficientsType& m_coeffs;
internal::variable_if_dynamic<Index, Rows_> m_rows;
internal::variable_if_dynamic<Index, Supers_> m_supers;
internal::variable_if_dynamic<Index, Subs_> m_subs;
};
/**
* \class TridiagonalMatrix
* \ingroup Core_Module
*
* \brief Represents a tridiagonal matrix with a compact banded storage
*
* \tparam Scalar Numeric type, i.e. float, double, int
* \tparam Size Number of rows and cols, or \b Dynamic
* \tparam Options Can be 0 or \b SelfAdjoint
*
* \sa class BandMatrix
*/
template <typename Scalar, int Size, int Options>
class TridiagonalMatrix : public BandMatrix<Scalar, Size, Size, Options & SelfAdjoint ? 0 : 1, 1, Options | RowMajor> {
typedef BandMatrix<Scalar, Size, Size, Options & SelfAdjoint ? 0 : 1, 1, Options | RowMajor> Base;
typedef typename Base::StorageIndex StorageIndex;
public:
explicit TridiagonalMatrix(Index size = Size) : Base(size, size, Options & SelfAdjoint ? 0 : 1, 1) {}
inline typename Base::template DiagonalIntReturnType<1>::Type super() { return Base::template diagonal<1>(); }
inline const typename Base::template DiagonalIntReturnType<1>::Type super() const {
return Base::template diagonal<1>();
}
inline typename Base::template DiagonalIntReturnType<-1>::Type sub() { return Base::template diagonal<-1>(); }
inline const typename Base::template DiagonalIntReturnType<-1>::Type sub() const {
return Base::template diagonal<-1>();
}
protected:
};
struct BandShape {};
template <typename Scalar_, int Rows_, int Cols_, int Supers_, int Subs_, int Options_>
struct evaluator_traits<BandMatrix<Scalar_, Rows_, Cols_, Supers_, Subs_, Options_> >
: public evaluator_traits_base<BandMatrix<Scalar_, Rows_, Cols_, Supers_, Subs_, Options_> > {
typedef BandShape Shape;
};
template <typename CoefficientsType_, int Rows_, int Cols_, int Supers_, int Subs_, int Options_>
struct evaluator_traits<BandMatrixWrapper<CoefficientsType_, Rows_, Cols_, Supers_, Subs_, Options_> >
: public evaluator_traits_base<BandMatrixWrapper<CoefficientsType_, Rows_, Cols_, Supers_, Subs_, Options_> > {
typedef BandShape Shape;
};
template <>
struct AssignmentKind<DenseShape, BandShape> {
typedef EigenBase2EigenBase Kind;
};
} // end namespace internal
} // end namespace Eigen
#endif // EIGEN_BANDMATRIX_H

File diff suppressed because it is too large Load Diff

View File

@@ -1,9 +0,0 @@
FILE(GLOB Eigen_Core_SRCS "*.h")
INSTALL(FILES
${Eigen_Core_SRCS}
DESTINATION ${INCLUDE_INSTALL_DIR}/Eigen/src/Core
)
ADD_SUBDIRECTORY(util)
ADD_SUBDIRECTORY(arch)

View File

@@ -1,752 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_CACHE_FRIENDLY_PRODUCT_H
#define EIGEN_CACHE_FRIENDLY_PRODUCT_H
template <int L2MemorySize,typename Scalar>
struct ei_L2_block_traits {
enum {width = 8 * ei_meta_sqrt<L2MemorySize/(64*sizeof(Scalar))>::ret };
};
#ifndef EIGEN_EXTERN_INSTANTIATIONS
template<typename Scalar>
static void ei_cache_friendly_product(
int _rows, int _cols, int depth,
bool _lhsRowMajor, const Scalar* _lhs, int _lhsStride,
bool _rhsRowMajor, const Scalar* _rhs, int _rhsStride,
bool resRowMajor, Scalar* res, int resStride)
{
const Scalar* EIGEN_RESTRICT lhs;
const Scalar* EIGEN_RESTRICT rhs;
int lhsStride, rhsStride, rows, cols;
bool lhsRowMajor;
if (resRowMajor)
{
lhs = _rhs;
rhs = _lhs;
lhsStride = _rhsStride;
rhsStride = _lhsStride;
cols = _rows;
rows = _cols;
lhsRowMajor = !_rhsRowMajor;
ei_assert(_lhsRowMajor);
}
else
{
lhs = _lhs;
rhs = _rhs;
lhsStride = _lhsStride;
rhsStride = _rhsStride;
rows = _rows;
cols = _cols;
lhsRowMajor = _lhsRowMajor;
ei_assert(!_rhsRowMajor);
}
typedef typename ei_packet_traits<Scalar>::type PacketType;
enum {
PacketSize = sizeof(PacketType)/sizeof(Scalar),
#if (defined __i386__)
// i386 architecture provides only 8 xmm registers,
// so let's reduce the max number of rows processed at once.
MaxBlockRows = 4,
MaxBlockRows_ClampingMask = 0xFFFFFC,
#else
MaxBlockRows = 8,
MaxBlockRows_ClampingMask = 0xFFFFF8,
#endif
// maximal size of the blocks fitted in L2 cache
MaxL2BlockSize = ei_L2_block_traits<EIGEN_TUNE_FOR_L2_CACHE_SIZE,Scalar>::width
};
const bool resIsAligned = (PacketSize==1) || (((resStride%PacketSize) == 0) && (size_t(res)%16==0));
const int remainingSize = depth % PacketSize;
const int size = depth - remainingSize; // third dimension of the product clamped to packet boundaries
const int l2BlockRows = MaxL2BlockSize > rows ? rows : MaxL2BlockSize;
const int l2BlockCols = MaxL2BlockSize > cols ? cols : MaxL2BlockSize;
const int l2BlockSize = MaxL2BlockSize > size ? size : MaxL2BlockSize;
const int l2BlockSizeAligned = (1 + std::max(l2BlockSize,l2BlockCols)/PacketSize)*PacketSize;
const bool needRhsCopy = (PacketSize>1) && ((rhsStride%PacketSize!=0) || (size_t(rhs)%16!=0));
Scalar* EIGEN_RESTRICT block = 0;
const int allocBlockSize = l2BlockRows*size;
block = ei_alloc_stack(Scalar, allocBlockSize);
Scalar* EIGEN_RESTRICT rhsCopy
= ei_alloc_stack(Scalar, l2BlockSizeAligned*l2BlockSizeAligned);
// loops on each L2 cache friendly blocks of the result
for(int l2i=0; l2i<rows; l2i+=l2BlockRows)
{
const int l2blockRowEnd = std::min(l2i+l2BlockRows, rows);
const int l2blockRowEndBW = l2blockRowEnd & MaxBlockRows_ClampingMask; // end of the rows aligned to bw
const int l2blockRemainingRows = l2blockRowEnd - l2blockRowEndBW; // number of remaining rows
//const int l2blockRowEndBWPlusOne = l2blockRowEndBW + (l2blockRemainingRows?0:MaxBlockRows);
// build a cache friendly blocky matrix
int count = 0;
// copy l2blocksize rows of m_lhs to blocks of ps x bw
for(int l2k=0; l2k<size; l2k+=l2BlockSize)
{
const int l2blockSizeEnd = std::min(l2k+l2BlockSize, size);
for (int i = l2i; i<l2blockRowEndBW/*PlusOne*/; i+=MaxBlockRows)
{
// TODO merge the "if l2blockRemainingRows" using something like:
// const int blockRows = std::min(i+MaxBlockRows, rows) - i;
for (int k=l2k; k<l2blockSizeEnd; k+=PacketSize)
{
// TODO write these loops using meta unrolling
// negligible for large matrices but useful for small ones
if (lhsRowMajor)
{
for (int w=0; w<MaxBlockRows; ++w)
for (int s=0; s<PacketSize; ++s)
block[count++] = lhs[(i+w)*lhsStride + (k+s)];
}
else
{
for (int w=0; w<MaxBlockRows; ++w)
for (int s=0; s<PacketSize; ++s)
block[count++] = lhs[(i+w) + (k+s)*lhsStride];
}
}
}
if (l2blockRemainingRows>0)
{
for (int k=l2k; k<l2blockSizeEnd; k+=PacketSize)
{
if (lhsRowMajor)
{
for (int w=0; w<l2blockRemainingRows; ++w)
for (int s=0; s<PacketSize; ++s)
block[count++] = lhs[(l2blockRowEndBW+w)*lhsStride + (k+s)];
}
else
{
for (int w=0; w<l2blockRemainingRows; ++w)
for (int s=0; s<PacketSize; ++s)
block[count++] = lhs[(l2blockRowEndBW+w) + (k+s)*lhsStride];
}
}
}
}
for(int l2j=0; l2j<cols; l2j+=l2BlockCols)
{
int l2blockColEnd = std::min(l2j+l2BlockCols, cols);
for(int l2k=0; l2k<size; l2k+=l2BlockSize)
{
// acumulate bw rows of lhs time a single column of rhs to a bw x 1 block of res
int l2blockSizeEnd = std::min(l2k+l2BlockSize, size);
// if not aligned, copy the rhs block
if (needRhsCopy)
for(int l1j=l2j; l1j<l2blockColEnd; l1j+=1)
{
ei_internal_assert(l2BlockSizeAligned*(l1j-l2j)+(l2blockSizeEnd-l2k) < l2BlockSizeAligned*l2BlockSizeAligned);
memcpy(rhsCopy+l2BlockSizeAligned*(l1j-l2j),&(rhs[l1j*rhsStride+l2k]),(l2blockSizeEnd-l2k)*sizeof(Scalar));
}
// for each bw x 1 result's block
for(int l1i=l2i; l1i<l2blockRowEndBW; l1i+=MaxBlockRows)
{
int offsetblock = l2k * (l2blockRowEnd-l2i) + (l1i-l2i)*(l2blockSizeEnd-l2k) - l2k*MaxBlockRows;
const Scalar* EIGEN_RESTRICT localB = &block[offsetblock];
for(int l1j=l2j; l1j<l2blockColEnd; l1j+=1)
{
const Scalar* EIGEN_RESTRICT rhsColumn;
if (needRhsCopy)
rhsColumn = &(rhsCopy[l2BlockSizeAligned*(l1j-l2j)-l2k]);
else
rhsColumn = &(rhs[l1j*rhsStride]);
PacketType dst[MaxBlockRows];
dst[3] = dst[2] = dst[1] = dst[0] = ei_pset1(Scalar(0.));
if (MaxBlockRows==8)
dst[7] = dst[6] = dst[5] = dst[4] = dst[0];
PacketType tmp;
for(int k=l2k; k<l2blockSizeEnd; k+=PacketSize)
{
tmp = ei_ploadu(&rhsColumn[k]);
PacketType A0, A1, A2, A3, A4, A5;
A0 = ei_pload(localB + k*MaxBlockRows);
A1 = ei_pload(localB + k*MaxBlockRows+1*PacketSize);
A2 = ei_pload(localB + k*MaxBlockRows+2*PacketSize);
A3 = ei_pload(localB + k*MaxBlockRows+3*PacketSize);
if (MaxBlockRows==8) A4 = ei_pload(localB + k*MaxBlockRows+4*PacketSize);
if (MaxBlockRows==8) A5 = ei_pload(localB + k*MaxBlockRows+5*PacketSize);
dst[0] = ei_pmadd(tmp, A0, dst[0]);
if (MaxBlockRows==8) A0 = ei_pload(localB + k*MaxBlockRows+6*PacketSize);
dst[1] = ei_pmadd(tmp, A1, dst[1]);
if (MaxBlockRows==8) A1 = ei_pload(localB + k*MaxBlockRows+7*PacketSize);
dst[2] = ei_pmadd(tmp, A2, dst[2]);
dst[3] = ei_pmadd(tmp, A3, dst[3]);
if (MaxBlockRows==8)
{
dst[4] = ei_pmadd(tmp, A4, dst[4]);
dst[5] = ei_pmadd(tmp, A5, dst[5]);
dst[6] = ei_pmadd(tmp, A0, dst[6]);
dst[7] = ei_pmadd(tmp, A1, dst[7]);
}
}
Scalar* EIGEN_RESTRICT localRes = &(res[l1i + l1j*resStride]);
if (PacketSize>1 && resIsAligned)
{
// the result is aligned: let's do packet reduction
ei_pstore(&(localRes[0]), ei_padd(ei_pload(&(localRes[0])), ei_preduxp(&dst[0])));
if (PacketSize==2)
ei_pstore(&(localRes[2]), ei_padd(ei_pload(&(localRes[2])), ei_preduxp(&(dst[2]))));
if (MaxBlockRows==8)
{
ei_pstore(&(localRes[4]), ei_padd(ei_pload(&(localRes[4])), ei_preduxp(&(dst[4]))));
if (PacketSize==2)
ei_pstore(&(localRes[6]), ei_padd(ei_pload(&(localRes[6])), ei_preduxp(&(dst[6]))));
}
}
else
{
// not aligned => per coeff packet reduction
localRes[0] += ei_predux(dst[0]);
localRes[1] += ei_predux(dst[1]);
localRes[2] += ei_predux(dst[2]);
localRes[3] += ei_predux(dst[3]);
if (MaxBlockRows==8)
{
localRes[4] += ei_predux(dst[4]);
localRes[5] += ei_predux(dst[5]);
localRes[6] += ei_predux(dst[6]);
localRes[7] += ei_predux(dst[7]);
}
}
}
}
if (l2blockRemainingRows>0)
{
int offsetblock = l2k * (l2blockRowEnd-l2i) + (l2blockRowEndBW-l2i)*(l2blockSizeEnd-l2k) - l2k*l2blockRemainingRows;
const Scalar* localB = &block[offsetblock];
for(int l1j=l2j; l1j<l2blockColEnd; l1j+=1)
{
const Scalar* EIGEN_RESTRICT rhsColumn;
if (needRhsCopy)
rhsColumn = &(rhsCopy[l2BlockSizeAligned*(l1j-l2j)-l2k]);
else
rhsColumn = &(rhs[l1j*rhsStride]);
PacketType dst[MaxBlockRows];
dst[3] = dst[2] = dst[1] = dst[0] = ei_pset1(Scalar(0.));
if (MaxBlockRows==8)
dst[7] = dst[6] = dst[5] = dst[4] = dst[0];
// let's declare a few other temporary registers
PacketType tmp;
for(int k=l2k; k<l2blockSizeEnd; k+=PacketSize)
{
tmp = ei_pload(&rhsColumn[k]);
dst[0] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows ])), dst[0]);
if (l2blockRemainingRows>=2) dst[1] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows+ PacketSize])), dst[1]);
if (l2blockRemainingRows>=3) dst[2] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows+2*PacketSize])), dst[2]);
if (l2blockRemainingRows>=4) dst[3] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows+3*PacketSize])), dst[3]);
if (MaxBlockRows==8)
{
if (l2blockRemainingRows>=5) dst[4] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows+4*PacketSize])), dst[4]);
if (l2blockRemainingRows>=6) dst[5] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows+5*PacketSize])), dst[5]);
if (l2blockRemainingRows>=7) dst[6] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows+6*PacketSize])), dst[6]);
if (l2blockRemainingRows>=8) dst[7] = ei_pmadd(tmp, ei_pload(&(localB[k*l2blockRemainingRows+7*PacketSize])), dst[7]);
}
}
Scalar* EIGEN_RESTRICT localRes = &(res[l2blockRowEndBW + l1j*resStride]);
// process the remaining rows once at a time
localRes[0] += ei_predux(dst[0]);
if (l2blockRemainingRows>=2) localRes[1] += ei_predux(dst[1]);
if (l2blockRemainingRows>=3) localRes[2] += ei_predux(dst[2]);
if (l2blockRemainingRows>=4) localRes[3] += ei_predux(dst[3]);
if (MaxBlockRows==8)
{
if (l2blockRemainingRows>=5) localRes[4] += ei_predux(dst[4]);
if (l2blockRemainingRows>=6) localRes[5] += ei_predux(dst[5]);
if (l2blockRemainingRows>=7) localRes[6] += ei_predux(dst[6]);
if (l2blockRemainingRows>=8) localRes[7] += ei_predux(dst[7]);
}
}
}
}
}
}
if (PacketSize>1 && remainingSize)
{
if (lhsRowMajor)
{
for (int j=0; j<cols; ++j)
for (int i=0; i<rows; ++i)
{
Scalar tmp = lhs[i*lhsStride+size] * rhs[j*rhsStride+size];
// FIXME this loop get vectorized by the compiler !
for (int k=1; k<remainingSize; ++k)
tmp += lhs[i*lhsStride+size+k] * rhs[j*rhsStride+size+k];
res[i+j*resStride] += tmp;
}
}
else
{
for (int j=0; j<cols; ++j)
for (int i=0; i<rows; ++i)
{
Scalar tmp = lhs[i+size*lhsStride] * rhs[j*rhsStride+size];
for (int k=1; k<remainingSize; ++k)
tmp += lhs[i+(size+k)*lhsStride] * rhs[j*rhsStride+size+k];
res[i+j*resStride] += tmp;
}
}
}
ei_free_stack(block, Scalar, allocBlockSize);
ei_free_stack(rhsCopy, Scalar, l2BlockSizeAligned*l2BlockSizeAligned);
}
#endif // EIGEN_EXTERN_INSTANTIATIONS
/* Optimized col-major matrix * vector product:
* This algorithm processes 4 columns at onces that allows to both reduce
* the number of load/stores of the result by a factor 4 and to reduce
* the instruction dependency. Moreover, we know that all bands have the
* same alignment pattern.
* TODO: since rhs gets evaluated only once, no need to evaluate it
*/
template<typename Scalar, typename RhsType>
static EIGEN_DONT_INLINE void ei_cache_friendly_product_colmajor_times_vector(
int size,
const Scalar* lhs, int lhsStride,
const RhsType& rhs,
Scalar* res)
{
#ifdef _EIGEN_ACCUMULATE_PACKETS
#error _EIGEN_ACCUMULATE_PACKETS has already been defined
#endif
#define _EIGEN_ACCUMULATE_PACKETS(A0,A13,A2,OFFSET) \
ei_pstore(&res[j OFFSET], \
ei_padd(ei_pload(&res[j OFFSET]), \
ei_padd( \
ei_padd(ei_pmul(ptmp0,ei_pload ## A0(&lhs0[j OFFSET])),ei_pmul(ptmp1,ei_pload ## A13(&lhs1[j OFFSET]))), \
ei_padd(ei_pmul(ptmp2,ei_pload ## A2(&lhs2[j OFFSET])),ei_pmul(ptmp3,ei_pload ## A13(&lhs3[j OFFSET]))) )))
typedef typename ei_packet_traits<Scalar>::type Packet;
const int PacketSize = sizeof(Packet)/sizeof(Scalar);
enum { AllAligned = 0, EvenAligned, FirstAligned, NoneAligned };
const int columnsAtOnce = 4;
const int peels = 2;
const int PacketAlignedMask = PacketSize-1;
const int PeelAlignedMask = PacketSize*peels-1;
// How many coeffs of the result do we have to skip to be aligned.
// Here we assume data are at least aligned on the base scalar type that is mandatory anyway.
const int alignedStart = ei_alignmentOffset(res,size);
const int alignedSize = PacketSize>1 ? alignedStart + ((size-alignedStart) & ~PacketAlignedMask) : 0;
const int peeledSize = peels>1 ? alignedStart + ((alignedSize-alignedStart) & ~PeelAlignedMask) : alignedStart;
const int alignmentStep = PacketSize>1 ? (PacketSize - lhsStride % PacketSize) & PacketAlignedMask : 0;
int alignmentPattern = alignmentStep==0 ? AllAligned
: alignmentStep==(PacketSize/2) ? EvenAligned
: FirstAligned;
// we cannot assume the first element is aligned because of sub-matrices
const int lhsAlignmentOffset = ei_alignmentOffset(lhs,size);
// find how many columns do we have to skip to be aligned with the result (if possible)
int skipColumns = 0;
if (PacketSize>1)
{
ei_internal_assert(size_t(lhs+lhsAlignmentOffset)%sizeof(Packet)==0 || size<PacketSize);
while (skipColumns<PacketSize &&
alignedStart != ((lhsAlignmentOffset + alignmentStep*skipColumns)%PacketSize))
++skipColumns;
if (skipColumns==PacketSize)
{
// nothing can be aligned, no need to skip any column
alignmentPattern = NoneAligned;
skipColumns = 0;
}
else
{
skipColumns = std::min(skipColumns,rhs.size());
// note that the skiped columns are processed later.
}
ei_internal_assert((alignmentPattern==NoneAligned) || (size_t(lhs+alignedStart+lhsStride*skipColumns)%sizeof(Packet))==0);
}
int offset1 = (FirstAligned && alignmentStep==1?3:1);
int offset3 = (FirstAligned && alignmentStep==1?1:3);
int columnBound = ((rhs.size()-skipColumns)/columnsAtOnce)*columnsAtOnce + skipColumns;
for (int i=skipColumns; i<columnBound; i+=columnsAtOnce)
{
Packet ptmp0 = ei_pset1(rhs[i]), ptmp1 = ei_pset1(rhs[i+offset1]),
ptmp2 = ei_pset1(rhs[i+2]), ptmp3 = ei_pset1(rhs[i+offset3]);
// this helps a lot generating better binary code
const Scalar *lhs0 = lhs + i*lhsStride, *lhs1 = lhs + (i+offset1)*lhsStride,
*lhs2 = lhs + (i+2)*lhsStride, *lhs3 = lhs + (i+offset3)*lhsStride;
if (PacketSize>1)
{
/* explicit vectorization */
// process initial unaligned coeffs
for (int j=0; j<alignedStart; j++)
res[j] += ei_pfirst(ptmp0)*lhs0[j] + ei_pfirst(ptmp1)*lhs1[j] + ei_pfirst(ptmp2)*lhs2[j] + ei_pfirst(ptmp3)*lhs3[j];
if (alignedSize>alignedStart)
{
switch(alignmentPattern)
{
case AllAligned:
for (int j = alignedStart; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(,,,);
break;
case EvenAligned:
for (int j = alignedStart; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(,u,,);
break;
case FirstAligned:
if(peels>1)
{
Packet A00, A01, A02, A03, A10, A11, A12, A13;
A01 = ei_pload(&lhs1[alignedStart-1]);
A02 = ei_pload(&lhs2[alignedStart-2]);
A03 = ei_pload(&lhs3[alignedStart-3]);
for (int j = alignedStart; j<peeledSize; j+=peels*PacketSize)
{
A11 = ei_pload(&lhs1[j-1+PacketSize]); ei_palign<1>(A01,A11);
A12 = ei_pload(&lhs2[j-2+PacketSize]); ei_palign<2>(A02,A12);
A13 = ei_pload(&lhs3[j-3+PacketSize]); ei_palign<3>(A03,A13);
A00 = ei_pload (&lhs0[j]);
A10 = ei_pload (&lhs0[j+PacketSize]);
A00 = ei_pmadd(ptmp0, A00, ei_pload(&res[j]));
A10 = ei_pmadd(ptmp0, A10, ei_pload(&res[j+PacketSize]));
A00 = ei_pmadd(ptmp1, A01, A00);
A01 = ei_pload(&lhs1[j-1+2*PacketSize]); ei_palign<1>(A11,A01);
A00 = ei_pmadd(ptmp2, A02, A00);
A02 = ei_pload(&lhs2[j-2+2*PacketSize]); ei_palign<2>(A12,A02);
A00 = ei_pmadd(ptmp3, A03, A00);
ei_pstore(&res[j],A00);
A03 = ei_pload(&lhs3[j-3+2*PacketSize]); ei_palign<3>(A13,A03);
A10 = ei_pmadd(ptmp1, A11, A10);
A10 = ei_pmadd(ptmp2, A12, A10);
A10 = ei_pmadd(ptmp3, A13, A10);
ei_pstore(&res[j+PacketSize],A10);
}
}
for (int j = peeledSize; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(,u,u,);
break;
default:
for (int j = alignedStart; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(u,u,u,);
break;
}
}
} // end explicit vectorization
/* process remaining coeffs (or all if there is no explicit vectorization) */
for (int j=alignedSize; j<size; j++)
res[j] += ei_pfirst(ptmp0)*lhs0[j] + ei_pfirst(ptmp1)*lhs1[j] + ei_pfirst(ptmp2)*lhs2[j] + ei_pfirst(ptmp3)*lhs3[j];
}
// process remaining first and last columns (at most columnsAtOnce-1)
int end = rhs.size();
int start = columnBound;
do
{
for (int i=start; i<end; i++)
{
Packet ptmp0 = ei_pset1(rhs[i]);
const Scalar* lhs0 = lhs + i*lhsStride;
if (PacketSize>1)
{
/* explicit vectorization */
// process first unaligned result's coeffs
for (int j=0; j<alignedStart; j++)
res[j] += ei_pfirst(ptmp0) * lhs0[j];
// process aligned result's coeffs
if ((size_t(lhs0+alignedStart)%sizeof(Packet))==0)
for (int j = alignedStart;j<alignedSize;j+=PacketSize)
ei_pstore(&res[j], ei_pmadd(ptmp0,ei_pload(&lhs0[j]),ei_pload(&res[j])));
else
for (int j = alignedStart;j<alignedSize;j+=PacketSize)
ei_pstore(&res[j], ei_pmadd(ptmp0,ei_ploadu(&lhs0[j]),ei_pload(&res[j])));
}
// process remaining scalars (or all if no explicit vectorization)
for (int j=alignedSize; j<size; j++)
res[j] += ei_pfirst(ptmp0) * lhs0[j];
}
if (skipColumns)
{
start = 0;
end = skipColumns;
skipColumns = 0;
}
else
break;
} while(PacketSize>1);
#undef _EIGEN_ACCUMULATE_PACKETS
}
// TODO add peeling to mask unaligned load/stores
template<typename Scalar, typename ResType>
static EIGEN_DONT_INLINE void ei_cache_friendly_product_rowmajor_times_vector(
const Scalar* lhs, int lhsStride,
const Scalar* rhs, int rhsSize,
ResType& res)
{
#ifdef _EIGEN_ACCUMULATE_PACKETS
#error _EIGEN_ACCUMULATE_PACKETS has already been defined
#endif
#define _EIGEN_ACCUMULATE_PACKETS(A0,A13,A2,OFFSET) {\
Packet b = ei_pload(&rhs[j]); \
ptmp0 = ei_pmadd(b, ei_pload##A0 (&lhs0[j]), ptmp0); \
ptmp1 = ei_pmadd(b, ei_pload##A13(&lhs1[j]), ptmp1); \
ptmp2 = ei_pmadd(b, ei_pload##A2 (&lhs2[j]), ptmp2); \
ptmp3 = ei_pmadd(b, ei_pload##A13(&lhs3[j]), ptmp3); }
typedef typename ei_packet_traits<Scalar>::type Packet;
const int PacketSize = sizeof(Packet)/sizeof(Scalar);
enum { AllAligned=0, EvenAligned=1, FirstAligned=2, NoneAligned=3 };
const int rowsAtOnce = 4;
const int peels = 2;
const int PacketAlignedMask = PacketSize-1;
const int PeelAlignedMask = PacketSize*peels-1;
const int size = rhsSize;
// How many coeffs of the result do we have to skip to be aligned.
// Here we assume data are at least aligned on the base scalar type that is mandatory anyway.
const int alignedStart = ei_alignmentOffset(rhs, size);
const int alignedSize = PacketSize>1 ? alignedStart + ((size-alignedStart) & ~PacketAlignedMask) : 0;
const int peeledSize = peels>1 ? alignedStart + ((alignedSize-alignedStart) & ~PeelAlignedMask) : alignedStart;
const int alignmentStep = PacketSize>1 ? (PacketSize - lhsStride % PacketSize) & PacketAlignedMask : 0;
int alignmentPattern = alignmentStep==0 ? AllAligned
: alignmentStep==(PacketSize/2) ? EvenAligned
: FirstAligned;
// we cannot assume the first element is aligned because of sub-matrices
const int lhsAlignmentOffset = ei_alignmentOffset(lhs,size);
// find how many rows do we have to skip to be aligned with rhs (if possible)
int skipRows = 0;
if (PacketSize>1)
{
ei_internal_assert(size_t(lhs+lhsAlignmentOffset)%sizeof(Packet)==0 || size<PacketSize);
while (skipRows<PacketSize &&
alignedStart != ((lhsAlignmentOffset + alignmentStep*skipRows)%PacketSize))
++skipRows;
if (skipRows==PacketSize)
{
// nothing can be aligned, no need to skip any column
alignmentPattern = NoneAligned;
skipRows = 0;
}
else
{
skipRows = std::min(skipRows,res.size());
// note that the skiped columns are processed later.
}
ei_internal_assert((alignmentPattern==NoneAligned) || PacketSize==1
|| (size_t(lhs+alignedStart+lhsStride*skipRows)%sizeof(Packet))==0);
}
int offset1 = (FirstAligned && alignmentStep==1?3:1);
int offset3 = (FirstAligned && alignmentStep==1?1:3);
int rowBound = ((res.size()-skipRows)/rowsAtOnce)*rowsAtOnce + skipRows;
for (int i=skipRows; i<rowBound; i+=rowsAtOnce)
{
Scalar tmp0 = Scalar(0), tmp1 = Scalar(0), tmp2 = Scalar(0), tmp3 = Scalar(0);
// this helps the compiler generating good binary code
const Scalar *lhs0 = lhs + i*lhsStride, *lhs1 = lhs + (i+offset1)*lhsStride,
*lhs2 = lhs + (i+2)*lhsStride, *lhs3 = lhs + (i+offset3)*lhsStride;
if (PacketSize>1)
{
/* explicit vectorization */
Packet ptmp0 = ei_pset1(Scalar(0)), ptmp1 = ei_pset1(Scalar(0)), ptmp2 = ei_pset1(Scalar(0)), ptmp3 = ei_pset1(Scalar(0));
// process initial unaligned coeffs
// FIXME this loop get vectorized by the compiler !
for (int j=0; j<alignedStart; j++)
{
Scalar b = rhs[j];
tmp0 += b*lhs0[j]; tmp1 += b*lhs1[j]; tmp2 += b*lhs2[j]; tmp3 += b*lhs3[j];
}
if (alignedSize>alignedStart)
{
switch(alignmentPattern)
{
case AllAligned:
for (int j = alignedStart; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(,,,);
break;
case EvenAligned:
for (int j = alignedStart; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(,u,,);
break;
case FirstAligned:
if (peels>1)
{
/* Here we proccess 4 rows with with two peeled iterations to hide
* tghe overhead of unaligned loads. Moreover unaligned loads are handled
* using special shift/move operations between the two aligned packets
* overlaping the desired unaligned packet. This is *much* more efficient
* than basic unaligned loads.
*/
Packet A01, A02, A03, b, A11, A12, A13;
A01 = ei_pload(&lhs1[alignedStart-1]);
A02 = ei_pload(&lhs2[alignedStart-2]);
A03 = ei_pload(&lhs3[alignedStart-3]);
for (int j = alignedStart; j<peeledSize; j+=peels*PacketSize)
{
b = ei_pload(&rhs[j]);
A11 = ei_pload(&lhs1[j-1+PacketSize]); ei_palign<1>(A01,A11);
A12 = ei_pload(&lhs2[j-2+PacketSize]); ei_palign<2>(A02,A12);
A13 = ei_pload(&lhs3[j-3+PacketSize]); ei_palign<3>(A03,A13);
ptmp0 = ei_pmadd(b, ei_pload (&lhs0[j]), ptmp0);
ptmp1 = ei_pmadd(b, A01, ptmp1);
A01 = ei_pload(&lhs1[j-1+2*PacketSize]); ei_palign<1>(A11,A01);
ptmp2 = ei_pmadd(b, A02, ptmp2);
A02 = ei_pload(&lhs2[j-2+2*PacketSize]); ei_palign<2>(A12,A02);
ptmp3 = ei_pmadd(b, A03, ptmp3);
A03 = ei_pload(&lhs3[j-3+2*PacketSize]); ei_palign<3>(A13,A03);
b = ei_pload(&rhs[j+PacketSize]);
ptmp0 = ei_pmadd(b, ei_pload (&lhs0[j+PacketSize]), ptmp0);
ptmp1 = ei_pmadd(b, A11, ptmp1);
ptmp2 = ei_pmadd(b, A12, ptmp2);
ptmp3 = ei_pmadd(b, A13, ptmp3);
}
}
for (int j = peeledSize; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(,u,u,);
break;
default:
for (int j = alignedStart; j<alignedSize; j+=PacketSize)
_EIGEN_ACCUMULATE_PACKETS(u,u,u,);
break;
}
tmp0 += ei_predux(ptmp0);
tmp1 += ei_predux(ptmp1);
tmp2 += ei_predux(ptmp2);
tmp3 += ei_predux(ptmp3);
}
} // end explicit vectorization
// process remaining coeffs (or all if no explicit vectorization)
// FIXME this loop get vectorized by the compiler !
for (int j=alignedSize; j<size; j++)
{
Scalar b = rhs[j];
tmp0 += b*lhs0[j]; tmp1 += b*lhs1[j]; tmp2 += b*lhs2[j]; tmp3 += b*lhs3[j];
}
res[i] += tmp0; res[i+offset1] += tmp1; res[i+2] += tmp2; res[i+offset3] += tmp3;
}
// process remaining first and last rows (at most columnsAtOnce-1)
int end = res.size();
int start = rowBound;
do
{
for (int i=start; i<end; i++)
{
Scalar tmp0 = Scalar(0);
Packet ptmp0 = ei_pset1(tmp0);
const Scalar* lhs0 = lhs + i*lhsStride;
// process first unaligned result's coeffs
// FIXME this loop get vectorized by the compiler !
for (int j=0; j<alignedStart; j++)
tmp0 += rhs[j] * lhs0[j];
if (alignedSize>alignedStart)
{
// process aligned rhs coeffs
if ((size_t(lhs0+alignedStart)%sizeof(Packet))==0)
for (int j = alignedStart;j<alignedSize;j+=PacketSize)
ptmp0 = ei_pmadd(ei_pload(&rhs[j]), ei_pload(&lhs0[j]), ptmp0);
else
for (int j = alignedStart;j<alignedSize;j+=PacketSize)
ptmp0 = ei_pmadd(ei_pload(&rhs[j]), ei_ploadu(&lhs0[j]), ptmp0);
tmp0 += ei_predux(ptmp0);
}
// process remaining scalars
// FIXME this loop get vectorized by the compiler !
for (int j=alignedSize; j<size; j++)
tmp0 += rhs[j] * lhs0[j];
res[i] += tmp0;
}
if (skipRows)
{
start = 0;
end = skipRows;
skipRows = 0;
}
else
break;
} while(PacketSize>1);
#undef _EIGEN_ACCUMULATE_PACKETS
}
#endif // EIGEN_CACHE_FRIENDLY_PRODUCT_H

View File

@@ -1,380 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2006-2008 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_COEFFS_H
#define EIGEN_COEFFS_H
/** Short version: don't use this function, use
* \link operator()(int,int) const \endlink instead.
*
* Long version: this function is similar to
* \link operator()(int,int) const \endlink, but without the assertion.
* Use this for limiting the performance cost of debugging code when doing
* repeated coefficient access. Only use this when it is guaranteed that the
* parameters \a row and \a col are in range.
*
* If EIGEN_INTERNAL_DEBUGGING is defined, an assertion will be made, making this
* function equivalent to \link operator()(int,int) const \endlink.
*
* \sa operator()(int,int) const, coeffRef(int,int), coeff(int) const
*/
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::coeff(int row, int col) const
{
ei_internal_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
return derived().coeff(row, col);
}
/** \returns the coefficient at given the given row and column.
*
* \sa operator()(int,int), operator[](int) const
*/
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::operator()(int row, int col) const
{
ei_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
return derived().coeff(row, col);
}
/** Short version: don't use this function, use
* \link operator()(int,int) \endlink instead.
*
* Long version: this function is similar to
* \link operator()(int,int) \endlink, but without the assertion.
* Use this for limiting the performance cost of debugging code when doing
* repeated coefficient access. Only use this when it is guaranteed that the
* parameters \a row and \a col are in range.
*
* If EIGEN_INTERNAL_DEBUGGING is defined, an assertion will be made, making this
* function equivalent to \link operator()(int,int) \endlink.
*
* \sa operator()(int,int), coeff(int, int) const, coeffRef(int)
*/
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::coeffRef(int row, int col)
{
ei_internal_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
return derived().coeffRef(row, col);
}
/** \returns a reference to the coefficient at given the given row and column.
*
* \sa operator()(int,int) const, operator[](int)
*/
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::operator()(int row, int col)
{
ei_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
return derived().coeffRef(row, col);
}
/** Short version: don't use this function, use
* \link operator[](int) const \endlink instead.
*
* Long version: this function is similar to
* \link operator[](int) const \endlink, but without the assertion.
* Use this for limiting the performance cost of debugging code when doing
* repeated coefficient access. Only use this when it is guaranteed that the
* parameter \a index is in range.
*
* If EIGEN_INTERNAL_DEBUGGING is defined, an assertion will be made, making this
* function equivalent to \link operator[](int) const \endlink.
*
* \sa operator[](int) const, coeffRef(int), coeff(int,int) const
*/
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::coeff(int index) const
{
ei_internal_assert(index >= 0 && index < size());
return derived().coeff(index);
}
/** \returns the coefficient at given index.
*
* This method is allowed only for vector expressions, and for matrix expressions having the LinearAccessBit.
*
* \sa operator[](int), operator()(int,int) const, x() const, y() const,
* z() const, w() const
*/
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::operator[](int index) const
{
ei_assert(index >= 0 && index < size());
return derived().coeff(index);
}
/** \returns the coefficient at given index.
*
* This is synonymous to operator[](int) const.
*
* This method is allowed only for vector expressions, and for matrix expressions having the LinearAccessBit.
*
* \sa operator[](int), operator()(int,int) const, x() const, y() const,
* z() const, w() const
*/
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::operator()(int index) const
{
ei_assert(index >= 0 && index < size());
return derived().coeff(index);
}
/** Short version: don't use this function, use
* \link operator[](int) \endlink instead.
*
* Long version: this function is similar to
* \link operator[](int) \endlink, but without the assertion.
* Use this for limiting the performance cost of debugging code when doing
* repeated coefficient access. Only use this when it is guaranteed that the
* parameters \a row and \a col are in range.
*
* If EIGEN_INTERNAL_DEBUGGING is defined, an assertion will be made, making this
* function equivalent to \link operator[](int) \endlink.
*
* \sa operator[](int), coeff(int) const, coeffRef(int,int)
*/
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::coeffRef(int index)
{
ei_internal_assert(index >= 0 && index < size());
return derived().coeffRef(index);
}
/** \returns a reference to the coefficient at given index.
*
* This method is allowed only for vector expressions, and for matrix expressions having the LinearAccessBit.
*
* \sa operator[](int) const, operator()(int,int), x(), y(), z(), w()
*/
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::operator[](int index)
{
ei_assert(index >= 0 && index < size());
return derived().coeffRef(index);
}
/** \returns a reference to the coefficient at given index.
*
* This is synonymous to operator[](int).
*
* This method is allowed only for vector expressions, and for matrix expressions having the LinearAccessBit.
*
* \sa operator[](int) const, operator()(int,int), x(), y(), z(), w()
*/
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::operator()(int index)
{
ei_assert(index >= 0 && index < size());
return derived().coeffRef(index);
}
/** equivalent to operator[](0). */
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::x() const { return (*this)[0]; }
/** equivalent to operator[](1). */
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::y() const { return (*this)[1]; }
/** equivalent to operator[](2). */
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::z() const { return (*this)[2]; }
/** equivalent to operator[](3). */
template<typename Derived>
inline const typename ei_traits<Derived>::Scalar MatrixBase<Derived>
::w() const { return (*this)[3]; }
/** equivalent to operator[](0). */
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::x() { return (*this)[0]; }
/** equivalent to operator[](1). */
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::y() { return (*this)[1]; }
/** equivalent to operator[](2). */
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::z() { return (*this)[2]; }
/** equivalent to operator[](3). */
template<typename Derived>
inline typename ei_traits<Derived>::Scalar& MatrixBase<Derived>
::w() { return (*this)[3]; }
/** \returns the packet of coefficients starting at the given row and column. It is your responsibility
* to ensure that a packet really starts there. This method is only available on expressions having the
* PacketAccessBit.
*
* The \a LoadMode parameter may have the value \a Aligned or \a Unaligned. Its effect is to select
* the appropriate vectorization instruction. Aligned access is faster, but is only possible for packets
* starting at an address which is a multiple of the packet size.
*/
template<typename Derived>
template<int LoadMode>
inline typename ei_packet_traits<typename ei_traits<Derived>::Scalar>::type
MatrixBase<Derived>::packet(int row, int col) const
{
ei_internal_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
return derived().template packet<LoadMode>(row,col);
}
/** Stores the given packet of coefficients, at the given row and column of this expression. It is your responsibility
* to ensure that a packet really starts there. This method is only available on expressions having the
* PacketAccessBit.
*
* The \a LoadMode parameter may have the value \a Aligned or \a Unaligned. Its effect is to select
* the appropriate vectorization instruction. Aligned access is faster, but is only possible for packets
* starting at an address which is a multiple of the packet size.
*/
template<typename Derived>
template<int StoreMode>
inline void MatrixBase<Derived>::writePacket
(int row, int col, const typename ei_packet_traits<typename ei_traits<Derived>::Scalar>::type& x)
{
ei_internal_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
derived().template writePacket<StoreMode>(row,col,x);
}
/** \returns the packet of coefficients starting at the given index. It is your responsibility
* to ensure that a packet really starts there. This method is only available on expressions having the
* PacketAccessBit and the LinearAccessBit.
*
* The \a LoadMode parameter may have the value \a Aligned or \a Unaligned. Its effect is to select
* the appropriate vectorization instruction. Aligned access is faster, but is only possible for packets
* starting at an address which is a multiple of the packet size.
*/
template<typename Derived>
template<int LoadMode>
inline typename ei_packet_traits<typename ei_traits<Derived>::Scalar>::type
MatrixBase<Derived>::packet(int index) const
{
ei_internal_assert(index >= 0 && index < size());
return derived().template packet<LoadMode>(index);
}
/** Stores the given packet of coefficients, at the given index in this expression. It is your responsibility
* to ensure that a packet really starts there. This method is only available on expressions having the
* PacketAccessBit and the LinearAccessBit.
*
* The \a LoadMode parameter may have the value \a Aligned or \a Unaligned. Its effect is to select
* the appropriate vectorization instruction. Aligned access is faster, but is only possible for packets
* starting at an address which is a multiple of the packet size.
*/
template<typename Derived>
template<int StoreMode>
inline void MatrixBase<Derived>::writePacket
(int index, const typename ei_packet_traits<typename ei_traits<Derived>::Scalar>::type& x)
{
ei_internal_assert(index >= 0 && index < size());
derived().template writePacket<StoreMode>(index,x);
}
/** \internal Copies the coefficient at position (row,col) of other into *this.
*
* This method is overridden in SwapWrapper, allowing swap() assignments to share 99% of their code
* with usual assignments.
*
* Outside of this internal usage, this method has probably no usefulness. It is hidden in the public API dox.
*/
template<typename Derived>
template<typename OtherDerived>
inline void MatrixBase<Derived>::copyCoeff(int row, int col, const MatrixBase<OtherDerived>& other)
{
ei_internal_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
derived().coeffRef(row, col) = other.derived().coeff(row, col);
}
/** \internal Copies the coefficient at the given index of other into *this.
*
* This method is overridden in SwapWrapper, allowing swap() assignments to share 99% of their code
* with usual assignments.
*
* Outside of this internal usage, this method has probably no usefulness. It is hidden in the public API dox.
*/
template<typename Derived>
template<typename OtherDerived>
inline void MatrixBase<Derived>::copyCoeff(int index, const MatrixBase<OtherDerived>& other)
{
ei_internal_assert(index >= 0 && index < size());
derived().coeffRef(index) = other.derived().coeff(index);
}
/** \internal Copies the packet at position (row,col) of other into *this.
*
* This method is overridden in SwapWrapper, allowing swap() assignments to share 99% of their code
* with usual assignments.
*
* Outside of this internal usage, this method has probably no usefulness. It is hidden in the public API dox.
*/
template<typename Derived>
template<typename OtherDerived, int StoreMode, int LoadMode>
inline void MatrixBase<Derived>::copyPacket(int row, int col, const MatrixBase<OtherDerived>& other)
{
ei_internal_assert(row >= 0 && row < rows()
&& col >= 0 && col < cols());
derived().template writePacket<StoreMode>(row, col,
other.derived().template packet<LoadMode>(row, col));
}
/** \internal Copies the packet at the given index of other into *this.
*
* This method is overridden in SwapWrapper, allowing swap() assignments to share 99% of their code
* with usual assignments.
*
* Outside of this internal usage, this method has probably no usefulness. It is hidden in the public API dox.
*/
template<typename Derived>
template<typename OtherDerived, int StoreMode, int LoadMode>
inline void MatrixBase<Derived>::copyPacket(int index, const MatrixBase<OtherDerived>& other)
{
ei_internal_assert(index >= 0 && index < size());
derived().template writePacket<StoreMode>(index,
other.derived().template packet<LoadMode>(index));
}
#endif // EIGEN_COEFFS_H

View File

@@ -1,149 +1,149 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
// for linear algebra.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2008 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2006-2008 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_COMMAINITIALIZER_H
#define EIGEN_COMMAINITIALIZER_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
/** \class CommaInitializer
*
* \brief Helper class used by the comma initializer operator
*
* This class is internally used to implement the comma initializer feature. It is
* the return type of MatrixBase::operator<<, and most of the time this is the only
* way it is used.
*
* \sa \ref MatrixBaseCommaInitRef "MatrixBase::operator<<", CommaInitializer::finished()
*/
template<typename MatrixType>
struct CommaInitializer
{
typedef typename ei_traits<MatrixType>::Scalar Scalar;
inline CommaInitializer(MatrixType& mat, const Scalar& s)
: m_matrix(mat), m_row(0), m_col(1), m_currentBlockRows(1)
{
m_matrix.coeffRef(0,0) = s;
* \ingroup Core_Module
*
* \brief Helper class used by the comma initializer operator
*
* This class is internally used to implement the comma initializer feature. It is
* the return type of MatrixBase::operator<<, and most of the time this is the only
* way it is used.
*
* \sa \blank \ref MatrixBaseCommaInitRef "MatrixBase::operator<<", CommaInitializer::finished()
*/
template <typename XprType>
struct CommaInitializer {
typedef typename XprType::Scalar Scalar;
EIGEN_DEVICE_FUNC inline CommaInitializer(XprType& xpr, const Scalar& s)
: m_xpr(xpr), m_row(0), m_col(1), m_currentBlockRows(1) {
eigen_assert(m_xpr.rows() > 0 && m_xpr.cols() > 0 && "Cannot comma-initialize a 0x0 matrix (operator<<)");
m_xpr.coeffRef(0, 0) = s;
}
template<typename OtherDerived>
inline CommaInitializer(MatrixType& mat, const MatrixBase<OtherDerived>& other)
: m_matrix(mat), m_row(0), m_col(other.cols()), m_currentBlockRows(other.rows())
{
m_matrix.block(0, 0, other.rows(), other.cols()) = other;
template <typename OtherDerived>
EIGEN_DEVICE_FUNC inline CommaInitializer(XprType& xpr, const DenseBase<OtherDerived>& other)
: m_xpr(xpr), m_row(0), m_col(other.cols()), m_currentBlockRows(other.rows()) {
eigen_assert(m_xpr.rows() >= other.rows() && m_xpr.cols() >= other.cols() &&
"Cannot comma-initialize a 0x0 matrix (operator<<)");
m_xpr.template block<OtherDerived::RowsAtCompileTime, OtherDerived::ColsAtCompileTime>(0, 0, other.rows(),
other.cols()) = other;
}
/* Copy/Move constructor which transfers ownership. This is crucial in
* absence of return value optimization to avoid assertions during destruction. */
// FIXME in C++11 mode this could be replaced by a proper RValue constructor
EIGEN_DEVICE_FUNC inline CommaInitializer(const CommaInitializer& o)
: m_xpr(o.m_xpr), m_row(o.m_row), m_col(o.m_col), m_currentBlockRows(o.m_currentBlockRows) {
// Mark original object as finished. In absence of R-value references we need to const_cast:
const_cast<CommaInitializer&>(o).m_row = m_xpr.rows();
const_cast<CommaInitializer&>(o).m_col = m_xpr.cols();
const_cast<CommaInitializer&>(o).m_currentBlockRows = 0;
}
/* inserts a scalar value in the target matrix */
CommaInitializer& operator,(const Scalar& s)
{
if (m_col==m_matrix.cols())
{
m_row+=m_currentBlockRows;
EIGEN_DEVICE_FUNC CommaInitializer &operator,(const Scalar& s) {
if (m_col == m_xpr.cols()) {
m_row += m_currentBlockRows;
m_col = 0;
m_currentBlockRows = 1;
ei_assert(m_row<m_matrix.rows()
&& "Too many rows passed to comma initializer (operator<<)");
eigen_assert(m_row < m_xpr.rows() && "Too many rows passed to comma initializer (operator<<)");
}
ei_assert(m_col<m_matrix.cols()
&& "Too many coefficients passed to comma initializer (operator<<)");
ei_assert(m_currentBlockRows==1);
m_matrix.coeffRef(m_row, m_col++) = s;
eigen_assert(m_col < m_xpr.cols() && "Too many coefficients passed to comma initializer (operator<<)");
eigen_assert(m_currentBlockRows == 1);
m_xpr.coeffRef(m_row, m_col++) = s;
return *this;
}
/* inserts a matrix expression in the target matrix */
template<typename OtherDerived>
CommaInitializer& operator,(const MatrixBase<OtherDerived>& other)
{
if (m_col==m_matrix.cols())
{
m_row+=m_currentBlockRows;
template <typename OtherDerived>
EIGEN_DEVICE_FUNC CommaInitializer &operator,(const DenseBase<OtherDerived>& other) {
if (m_col == m_xpr.cols() && (other.cols() != 0 || other.rows() != m_currentBlockRows)) {
m_row += m_currentBlockRows;
m_col = 0;
m_currentBlockRows = other.rows();
ei_assert(m_row+m_currentBlockRows<=m_matrix.rows()
&& "Too many rows passed to comma initializer (operator<<)");
eigen_assert(m_row + m_currentBlockRows <= m_xpr.rows() &&
"Too many rows passed to comma initializer (operator<<)");
}
ei_assert(m_col<m_matrix.cols()
&& "Too many coefficients passed to comma initializer (operator<<)");
ei_assert(m_currentBlockRows==other.rows());
if (OtherDerived::SizeAtCompileTime != Dynamic)
m_matrix.template block<OtherDerived::RowsAtCompileTime != Dynamic ? OtherDerived::RowsAtCompileTime : 1,
OtherDerived::ColsAtCompileTime != Dynamic ? OtherDerived::ColsAtCompileTime : 1>
(m_row, m_col) = other;
else
m_matrix.block(m_row, m_col, other.rows(), other.cols()) = other;
eigen_assert((m_col + other.cols() <= m_xpr.cols()) &&
"Too many coefficients passed to comma initializer (operator<<)");
eigen_assert(m_currentBlockRows == other.rows());
m_xpr.template block<OtherDerived::RowsAtCompileTime, OtherDerived::ColsAtCompileTime>(m_row, m_col, other.rows(),
other.cols()) = other;
m_col += other.cols();
return *this;
}
inline ~CommaInitializer()
EIGEN_DEVICE_FUNC inline ~CommaInitializer()
#if defined VERIFY_RAISES_ASSERT && (!defined EIGEN_NO_ASSERTION_CHECKING) && defined EIGEN_EXCEPTIONS
noexcept(false) // Eigen::eigen_assert_exception
#endif
{
ei_assert((m_row+m_currentBlockRows) == m_matrix.rows()
&& m_col == m_matrix.cols()
&& "Too few coefficients passed to comma initializer (operator<<)");
finished();
}
/** \returns the built matrix once all its coefficients have been set.
* Calling finished is 100% optional. Its purpose is to write expressions
* like this:
* \code
* quaternion.fromRotationMatrix((Matrix3f() << axis0, axis1, axis2).finished());
* \endcode
*/
inline MatrixType& finished() { return m_matrix; }
* Calling finished is 100% optional. Its purpose is to write expressions
* like this:
* \code
* quaternion.fromRotationMatrix((Matrix3f() << axis0, axis1, axis2).finished());
* \endcode
*/
EIGEN_DEVICE_FUNC inline XprType& finished() {
eigen_assert(((m_row + m_currentBlockRows) == m_xpr.rows() || m_xpr.cols() == 0) && m_col == m_xpr.cols() &&
"Too few coefficients passed to comma initializer (operator<<)");
return m_xpr;
}
MatrixType& m_matrix; // target matrix
int m_row; // current row id
int m_col; // current col id
int m_currentBlockRows; // current block height
XprType& m_xpr; // target expression
Index m_row; // current row id
Index m_col; // current col id
Index m_currentBlockRows; // current block height
};
/** \anchor MatrixBaseCommaInitRef
* Convenient operator to set the coefficients of a matrix.
*
* The coefficients must be provided in a row major order and exactly match
* the size of the matrix. Otherwise an assertion is raised.
*
* \addexample CommaInit \label How to easily set all the coefficients of a matrix
*
* Example: \include MatrixBase_set.cpp
* Output: \verbinclude MatrixBase_set.out
*
* \sa CommaInitializer::finished(), class CommaInitializer
*/
template<typename Derived>
inline CommaInitializer<Derived> MatrixBase<Derived>::operator<< (const Scalar& s)
{
* Convenient operator to set the coefficients of a matrix.
*
* The coefficients must be provided in a row major order and exactly match
* the size of the matrix. Otherwise an assertion is raised.
*
* Example: \include MatrixBase_set.cpp
* Output: \verbinclude MatrixBase_set.out
*
* \note According the c++ standard, the argument expressions of this comma initializer are evaluated in arbitrary
* order.
*
* \sa CommaInitializer::finished(), class CommaInitializer
*/
template <typename Derived>
EIGEN_DEVICE_FUNC inline CommaInitializer<Derived> DenseBase<Derived>::operator<<(const Scalar& s) {
return CommaInitializer<Derived>(*static_cast<Derived*>(this), s);
}
/** \sa operator<<(const Scalar&) */
template<typename Derived>
template<typename OtherDerived>
inline CommaInitializer<Derived>
MatrixBase<Derived>::operator<<(const MatrixBase<OtherDerived>& other)
{
return CommaInitializer<Derived>(*static_cast<Derived *>(this), other);
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC inline CommaInitializer<Derived> DenseBase<Derived>::operator<<(
const DenseBase<OtherDerived>& other) {
return CommaInitializer<Derived>(*static_cast<Derived*>(this), other);
}
#endif // EIGEN_COMMAINITIALIZER_H
} // end namespace Eigen
#endif // EIGEN_COMMAINITIALIZER_H

View File

@@ -0,0 +1,173 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2016 Rasmus Munk Larsen (rmlarsen@google.com)
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_CONDITIONESTIMATOR_H
#define EIGEN_CONDITIONESTIMATOR_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename Vector, typename RealVector, bool IsComplex>
struct rcond_compute_sign {
static inline Vector run(const Vector& v) {
const RealVector v_abs = v.cwiseAbs();
return (v_abs.array() == static_cast<typename Vector::RealScalar>(0))
.select(Vector::Ones(v.size()), v.cwiseQuotient(v_abs));
}
};
// Partial specialization to avoid elementwise division for real vectors.
template <typename Vector>
struct rcond_compute_sign<Vector, Vector, false> {
static inline Vector run(const Vector& v) {
return (v.array() < static_cast<typename Vector::RealScalar>(0))
.select(-Vector::Ones(v.size()), Vector::Ones(v.size()));
}
};
/**
* \returns an estimate of ||inv(matrix)||_1 given a decomposition of
* \a matrix that implements .solve() and .adjoint().solve() methods.
*
* This function implements Algorithms 4.1 and 5.1 from
* http://www.maths.manchester.ac.uk/~higham/narep/narep135.pdf
* which also forms the basis for the condition number estimators in
* LAPACK. Since at most 10 calls to the solve method of dec are
* performed, the total cost is O(dims^2), as opposed to O(dims^3)
* needed to compute the inverse matrix explicitly.
*
* The most common usage is in estimating the condition number
* ||matrix||_1 * ||inv(matrix)||_1. The first term ||matrix||_1 can be
* computed directly in O(n^2) operations.
*
* Supports the following decompositions: FullPivLU, PartialPivLU, LDLT, and
* LLT.
*
* \sa FullPivLU, PartialPivLU, LDLT, LLT.
*/
template <typename Decomposition>
typename Decomposition::RealScalar rcond_invmatrix_L1_norm_estimate(const Decomposition& dec) {
typedef typename Decomposition::MatrixType MatrixType;
typedef typename Decomposition::Scalar Scalar;
typedef typename Decomposition::RealScalar RealScalar;
typedef typename internal::plain_col_type<MatrixType>::type Vector;
typedef typename internal::plain_col_type<MatrixType, RealScalar>::type RealVector;
const bool is_complex = (NumTraits<Scalar>::IsComplex != 0);
eigen_assert(dec.rows() == dec.cols());
const Index n = dec.rows();
if (n == 0) return 0;
// Disable Index to float conversion warning
#ifdef __INTEL_COMPILER
#pragma warning push
#pragma warning(disable : 2259)
#endif
Vector v = dec.solve(Vector::Ones(n) / Scalar(n));
#ifdef __INTEL_COMPILER
#pragma warning pop
#endif
// lower_bound is a lower bound on
// ||inv(matrix)||_1 = sup_v ||inv(matrix) v||_1 / ||v||_1
// and is the objective maximized by the ("super-") gradient ascent
// algorithm below.
RealScalar lower_bound = v.template lpNorm<1>();
if (n == 1) return lower_bound;
// Gradient ascent algorithm follows: We know that the optimum is achieved at
// one of the simplices v = e_i, so in each iteration we follow a
// super-gradient to move towards the optimal one.
RealScalar old_lower_bound = lower_bound;
Vector sign_vector(n);
Vector old_sign_vector;
Index v_max_abs_index = -1;
Index old_v_max_abs_index = v_max_abs_index;
for (int k = 0; k < 4; ++k) {
sign_vector = internal::rcond_compute_sign<Vector, RealVector, is_complex>::run(v);
if (k > 0 && !is_complex && sign_vector == old_sign_vector) {
// Break if the solution stagnated.
break;
}
// v_max_abs_index = argmax |real( inv(matrix)^T * sign_vector )|
v = dec.adjoint().solve(sign_vector);
v.real().cwiseAbs().maxCoeff(&v_max_abs_index);
if (v_max_abs_index == old_v_max_abs_index) {
// Break if the solution stagnated.
break;
}
// Move to the new simplex e_j, where j = v_max_abs_index.
v = dec.solve(Vector::Unit(n, v_max_abs_index)); // v = inv(matrix) * e_j.
lower_bound = v.template lpNorm<1>();
if (lower_bound <= old_lower_bound) {
// Break if the gradient step did not increase the lower_bound.
break;
}
if (!is_complex) {
old_sign_vector = sign_vector;
}
old_v_max_abs_index = v_max_abs_index;
old_lower_bound = lower_bound;
}
// The following calculates an independent estimate of ||matrix||_1 by
// multiplying matrix by a vector with entries of slowly increasing
// magnitude and alternating sign:
// v_i = (-1)^{i} (1 + (i / (dim-1))), i = 0,...,dim-1.
// This improvement to Hager's algorithm above is due to Higham. It was
// added to make the algorithm more robust in certain corner cases where
// large elements in the matrix might otherwise escape detection due to
// exact cancellation (especially when op and op_adjoint correspond to a
// sequence of backsubstitutions and permutations), which could cause
// Hager's algorithm to vastly underestimate ||matrix||_1.
Scalar alternating_sign(RealScalar(1));
for (Index i = 0; i < n; ++i) {
// The static_cast is needed when Scalar is a complex and RealScalar implements expression templates
v[i] = alternating_sign * static_cast<RealScalar>(RealScalar(1) + (RealScalar(i) / (RealScalar(n - 1))));
alternating_sign = -alternating_sign;
}
v = dec.solve(v);
const RealScalar alternate_lower_bound = (2 * v.template lpNorm<1>()) / (3 * RealScalar(n));
return numext::maxi(lower_bound, alternate_lower_bound);
}
/** \brief Reciprocal condition number estimator.
*
* Computing a decomposition of a dense matrix takes O(n^3) operations, while
* this method estimates the condition number quickly and reliably in O(n^2)
* operations.
*
* \returns an estimate of the reciprocal condition number
* (1 / (||matrix||_1 * ||inv(matrix)||_1)) of matrix, given ||matrix||_1 and
* its decomposition. Supports the following decompositions: FullPivLU,
* PartialPivLU, LDLT, and LLT.
*
* \sa FullPivLU, PartialPivLU, LDLT, LLT.
*/
template <typename Decomposition>
typename Decomposition::RealScalar rcond_estimate_helper(typename Decomposition::RealScalar matrix_norm,
const Decomposition& dec) {
typedef typename Decomposition::RealScalar RealScalar;
eigen_assert(dec.rows() == dec.cols());
if (dec.rows() == 0) return NumTraits<RealScalar>::infinity();
if (numext::is_exactly_zero(matrix_norm)) return RealScalar(0);
if (dec.rows() == 1) return RealScalar(1);
const RealScalar inverse_matrix_norm = rcond_invmatrix_L1_norm_estimate(dec);
return (numext::is_exactly_zero(inverse_matrix_norm) ? RealScalar(0)
: (RealScalar(1) / inverse_matrix_norm) / matrix_norm);
}
} // namespace internal
} // namespace Eigen
#endif

File diff suppressed because it is too large Load Diff

View File

@@ -1,47 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifdef EIGEN_EXTERN_INSTANTIATIONS
#undef EIGEN_EXTERN_INSTANTIATIONS
#endif
#include "../../Core"
namespace Eigen
{
#define EIGEN_INSTANTIATE_PRODUCT(TYPE) \
template static void ei_cache_friendly_product<TYPE>( \
int _rows, int _cols, int depth, \
bool _lhsRowMajor, const TYPE* _lhs, int _lhsStride, \
bool _rhsRowMajor, const TYPE* _rhs, int _rhsStride, \
bool resRowMajor, TYPE* res, int resStride)
EIGEN_INSTANTIATE_PRODUCT(float);
EIGEN_INSTANTIATE_PRODUCT(double);
EIGEN_INSTANTIATE_PRODUCT(int);
EIGEN_INSTANTIATE_PRODUCT(std::complex<float>);
EIGEN_INSTANTIATE_PRODUCT(std::complex<double>);
}

View File

@@ -0,0 +1,141 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2008-2014 Gael Guennebaud <gael.guennebaud@inria.fr>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_COREITERATORS_H
#define EIGEN_COREITERATORS_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
/* This file contains the respective InnerIterator definition of the expressions defined in Eigen/Core
*/
namespace internal {
template <typename XprType, typename EvaluatorKind>
class inner_iterator_selector;
}
/** \class InnerIterator
* \brief An InnerIterator allows to loop over the element of any matrix expression.
*
* \warning To be used with care because an evaluator is constructed every time an InnerIterator iterator is
* constructed.
*
* TODO: add a usage example
*/
template <typename XprType>
class InnerIterator {
protected:
typedef internal::inner_iterator_selector<XprType, typename internal::evaluator_traits<XprType>::Kind> IteratorType;
typedef internal::evaluator<XprType> EvaluatorType;
typedef typename internal::traits<XprType>::Scalar Scalar;
public:
/** Construct an iterator over the \a outerId -th row or column of \a xpr */
InnerIterator(const XprType &xpr, const Index &outerId) : m_eval(xpr), m_iter(m_eval, outerId, xpr.innerSize()) {}
/// \returns the value of the current coefficient.
EIGEN_STRONG_INLINE Scalar value() const { return m_iter.value(); }
/** Increment the iterator \c *this to the next non-zero coefficient.
* Explicit zeros are not skipped over. To skip explicit zeros, see class SparseView
*/
EIGEN_STRONG_INLINE InnerIterator &operator++() {
m_iter.operator++();
return *this;
}
EIGEN_STRONG_INLINE InnerIterator &operator+=(Index i) {
m_iter.operator+=(i);
return *this;
}
EIGEN_STRONG_INLINE InnerIterator operator+(Index i) {
InnerIterator result(*this);
result += i;
return result;
}
/// \returns the column or row index of the current coefficient.
EIGEN_STRONG_INLINE Index index() const { return m_iter.index(); }
/// \returns the row index of the current coefficient.
EIGEN_STRONG_INLINE Index row() const { return m_iter.row(); }
/// \returns the column index of the current coefficient.
EIGEN_STRONG_INLINE Index col() const { return m_iter.col(); }
/// \returns \c true if the iterator \c *this still references a valid coefficient.
EIGEN_STRONG_INLINE operator bool() const { return m_iter; }
protected:
EvaluatorType m_eval;
IteratorType m_iter;
private:
// If you get here, then you're not using the right InnerIterator type, e.g.:
// SparseMatrix<double,RowMajor> A;
// SparseMatrix<double>::InnerIterator it(A,0);
template <typename T>
InnerIterator(const EigenBase<T> &, Index outer);
};
namespace internal {
// Generic inner iterator implementation for dense objects
template <typename XprType>
class inner_iterator_selector<XprType, IndexBased> {
protected:
typedef evaluator<XprType> EvaluatorType;
typedef typename traits<XprType>::Scalar Scalar;
enum { IsRowMajor = (XprType::Flags & RowMajorBit) == RowMajorBit };
public:
EIGEN_STRONG_INLINE inner_iterator_selector(const EvaluatorType &eval, const Index &outerId, const Index &innerSize)
: m_eval(eval), m_inner(0), m_outer(outerId), m_end(innerSize) {}
EIGEN_STRONG_INLINE Scalar value() const {
return (IsRowMajor) ? m_eval.coeff(m_outer, m_inner) : m_eval.coeff(m_inner, m_outer);
}
EIGEN_STRONG_INLINE inner_iterator_selector &operator++() {
m_inner++;
return *this;
}
EIGEN_STRONG_INLINE Index index() const { return m_inner; }
inline Index row() const { return IsRowMajor ? m_outer : index(); }
inline Index col() const { return IsRowMajor ? index() : m_outer; }
EIGEN_STRONG_INLINE operator bool() const { return m_inner < m_end && m_inner >= 0; }
protected:
const EvaluatorType &m_eval;
Index m_inner;
const Index m_outer;
const Index m_end;
};
// For iterator-based evaluator, inner-iterator is already implemented as
// evaluator<>::InnerIterator
template <typename XprType>
class inner_iterator_selector<XprType, IteratorBased> : public evaluator<XprType>::InnerIterator {
protected:
typedef typename evaluator<XprType>::InnerIterator Base;
typedef evaluator<XprType> EvaluatorType;
public:
EIGEN_STRONG_INLINE inner_iterator_selector(const EvaluatorType &eval, const Index &outerId,
const Index & /*innerSize*/)
: Base(eval, outerId) {}
};
} // end namespace internal
} // end namespace Eigen
#endif // EIGEN_COREITERATORS_H

View File

@@ -1,188 +0,0 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2008 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
#ifndef EIGEN_CWISE_H
#define EIGEN_CWISE_H
/** \internal
* convenient macro to defined the return type of a cwise binary operation */
#define EIGEN_CWISE_BINOP_RETURN_TYPE(OP) \
CwiseBinaryOp<OP<typename ei_traits<ExpressionType>::Scalar>, ExpressionType, OtherDerived>
/** \internal
* convenient macro to defined the return type of a cwise unary operation */
#define EIGEN_CWISE_UNOP_RETURN_TYPE(OP) \
CwiseUnaryOp<OP<typename ei_traits<ExpressionType>::Scalar>, ExpressionType>
/** \internal
* convenient macro to defined the return type of a cwise comparison to a scalar */
#define EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(OP) \
CwiseBinaryOp<OP<typename ei_traits<ExpressionType>::Scalar>, ExpressionType, \
NestByValue<typename ExpressionType::ConstantReturnType> >
/** \class Cwise
*
* \brief Pseudo expression providing additional coefficient-wise operations
*
* \param ExpressionType the type of the object on which to do coefficient-wise operations
*
* This class represents an expression with additional coefficient-wise features.
* It is the return type of MatrixBase::cwise()
* and most of the time this is the only way it is used.
*
* Note that some methods are defined in the \ref Array module.
*
* Example: \include MatrixBase_cwise_const.cpp
* Output: \verbinclude MatrixBase_cwise_const.out
*
* \sa MatrixBase::cwise() const, MatrixBase::cwise()
*/
template<typename ExpressionType> class Cwise
{
public:
typedef typename ei_traits<ExpressionType>::Scalar Scalar;
typedef typename ei_meta_if<ei_must_nest_by_value<ExpressionType>::ret,
ExpressionType, const ExpressionType&>::ret ExpressionTypeNested;
typedef CwiseUnaryOp<ei_scalar_add_op<Scalar>, ExpressionType> ScalarAddReturnType;
inline Cwise(const ExpressionType& matrix) : m_matrix(matrix) {}
/** \internal */
inline const ExpressionType& _expression() const { return m_matrix; }
template<typename OtherDerived>
const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_product_op)
operator*(const MatrixBase<OtherDerived> &other) const;
template<typename OtherDerived>
const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_quotient_op)
operator/(const MatrixBase<OtherDerived> &other) const;
template<typename OtherDerived>
const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_min_op)
min(const MatrixBase<OtherDerived> &other) const;
template<typename OtherDerived>
const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_max_op)
max(const MatrixBase<OtherDerived> &other) const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_abs_op) abs() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_abs2_op) abs2() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_square_op) square() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_cube_op) cube() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_inverse_op) inverse() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_sqrt_op) sqrt() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_exp_op) exp() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_log_op) log() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_cos_op) cos() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_sin_op) sin() const;
const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_pow_op) pow(const Scalar& exponent) const;
const ScalarAddReturnType
operator+(const Scalar& scalar) const;
/** \relates Cwise */
friend const ScalarAddReturnType
operator+(const Scalar& scalar, const Cwise& mat)
{ return mat + scalar; }
ExpressionType& operator+=(const Scalar& scalar);
const ScalarAddReturnType
operator-(const Scalar& scalar) const;
ExpressionType& operator-=(const Scalar& scalar);
template<typename OtherDerived> const EIGEN_CWISE_BINOP_RETURN_TYPE(std::less)
operator<(const MatrixBase<OtherDerived>& other) const;
template<typename OtherDerived> const EIGEN_CWISE_BINOP_RETURN_TYPE(std::less_equal)
operator<=(const MatrixBase<OtherDerived>& other) const;
template<typename OtherDerived> const EIGEN_CWISE_BINOP_RETURN_TYPE(std::greater)
operator>(const MatrixBase<OtherDerived>& other) const;
template<typename OtherDerived> const EIGEN_CWISE_BINOP_RETURN_TYPE(std::greater_equal)
operator>=(const MatrixBase<OtherDerived>& other) const;
template<typename OtherDerived> const EIGEN_CWISE_BINOP_RETURN_TYPE(std::equal_to)
operator==(const MatrixBase<OtherDerived>& other) const;
template<typename OtherDerived> const EIGEN_CWISE_BINOP_RETURN_TYPE(std::not_equal_to)
operator!=(const MatrixBase<OtherDerived>& other) const;
// comparisons to a scalar value
const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::less)
operator<(Scalar s) const;
const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::less_equal)
operator<=(Scalar s) const;
const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::greater)
operator>(Scalar s) const;
const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::greater_equal)
operator>=(Scalar s) const;
const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::equal_to)
operator==(Scalar s) const;
const EIGEN_CWISE_COMP_TO_SCALAR_RETURN_TYPE(std::not_equal_to)
operator!=(Scalar s) const;
protected:
ExpressionTypeNested m_matrix;
};
/** \returns a Cwise wrapper of *this providing additional coefficient-wise operations
*
* Example: \include MatrixBase_cwise_const.cpp
* Output: \verbinclude MatrixBase_cwise_const.out
*
* \sa class Cwise, cwise()
*/
template<typename Derived>
inline const Cwise<Derived>
MatrixBase<Derived>::cwise() const
{
return derived();
}
/** \returns a Cwise wrapper of *this providing additional coefficient-wise operations
*
* Example: \include MatrixBase_cwise.cpp
* Output: \verbinclude MatrixBase_cwise.out
*
* \sa class Cwise, cwise() const
*/
template<typename Derived>
inline Cwise<Derived>
MatrixBase<Derived>::cwise()
{
return derived();
}
#endif // EIGEN_CWISE_H

View File

@@ -1,276 +1,166 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
// for linear algebra.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2008-2014 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2006-2008 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_CWISE_BINARY_OP_H
#define EIGEN_CWISE_BINARY_OP_H
/** \class CwiseBinaryOp
*
* \brief Generic expression of a coefficient-wise operator between two matrices or vectors
*
* \param BinaryOp template functor implementing the operator
* \param Lhs the type of the left-hand side
* \param Rhs the type of the right-hand side
*
* This class represents an expression of a generic binary operator of two matrices or vectors.
* It is the return type of the operator+, operator-, and the Cwise methods, and most
* of the time this is the only way it is used.
*
* However, if you want to write a function returning such an expression, you
* will need to use this class.
*
* \sa MatrixBase::binaryExpr(const MatrixBase<OtherDerived> &,const CustomBinaryOp &) const, class CwiseUnaryOp, class CwiseNullaryOp
*/
template<typename BinaryOp, typename Lhs, typename Rhs>
struct ei_traits<CwiseBinaryOp<BinaryOp, Lhs, Rhs> >
{
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename BinaryOp, typename Lhs, typename Rhs>
struct traits<CwiseBinaryOp<BinaryOp, Lhs, Rhs>> {
// we must not inherit from traits<Lhs> since it has
// the potential to cause problems with MSVC
typedef remove_all_t<Lhs> Ancestor;
typedef typename traits<Ancestor>::XprKind XprKind;
enum {
RowsAtCompileTime = traits<Ancestor>::RowsAtCompileTime,
ColsAtCompileTime = traits<Ancestor>::ColsAtCompileTime,
MaxRowsAtCompileTime = traits<Ancestor>::MaxRowsAtCompileTime,
MaxColsAtCompileTime = traits<Ancestor>::MaxColsAtCompileTime
};
// even though we require Lhs and Rhs to have the same scalar type (see CwiseBinaryOp constructor),
// we still want to handle the case when the result type is different.
typedef typename ei_result_of<
BinaryOp(
typename Lhs::Scalar,
typename Rhs::Scalar
)
>::type Scalar;
typedef typename result_of<BinaryOp(const typename Lhs::Scalar&, const typename Rhs::Scalar&)>::type Scalar;
typedef typename cwise_promote_storage_type<typename traits<Lhs>::StorageKind, typename traits<Rhs>::StorageKind,
BinaryOp>::ret StorageKind;
typedef typename promote_index_type<typename traits<Lhs>::StorageIndex, typename traits<Rhs>::StorageIndex>::type
StorageIndex;
typedef typename Lhs::Nested LhsNested;
typedef typename Rhs::Nested RhsNested;
typedef typename ei_unref<LhsNested>::type _LhsNested;
typedef typename ei_unref<RhsNested>::type _RhsNested;
typedef std::remove_reference_t<LhsNested> LhsNested_;
typedef std::remove_reference_t<RhsNested> RhsNested_;
enum {
LhsCoeffReadCost = _LhsNested::CoeffReadCost,
RhsCoeffReadCost = _RhsNested::CoeffReadCost,
LhsFlags = _LhsNested::Flags,
RhsFlags = _RhsNested::Flags,
RowsAtCompileTime = Lhs::RowsAtCompileTime,
ColsAtCompileTime = Lhs::ColsAtCompileTime,
MaxRowsAtCompileTime = Lhs::MaxRowsAtCompileTime,
MaxColsAtCompileTime = Lhs::MaxColsAtCompileTime,
Flags = (int(LhsFlags) | int(RhsFlags)) & (
HereditaryBits
| (int(LhsFlags) & int(RhsFlags) & (LinearAccessBit | AlignedBit))
| (ei_functor_traits<BinaryOp>::PacketAccess && ((int(LhsFlags) & RowMajorBit)==(int(RhsFlags) & RowMajorBit))
? (int(LhsFlags) & int(RhsFlags) & PacketAccessBit) : 0)),
CoeffReadCost = LhsCoeffReadCost + RhsCoeffReadCost + ei_functor_traits<BinaryOp>::Cost
Flags = cwise_promote_storage_order<typename traits<Lhs>::StorageKind, typename traits<Rhs>::StorageKind,
LhsNested_::Flags & RowMajorBit, RhsNested_::Flags & RowMajorBit>::value
};
};
} // end namespace internal
template<typename BinaryOp, typename Lhs, typename Rhs>
class CwiseBinaryOp : ei_no_assignment_operator,
public MatrixBase<CwiseBinaryOp<BinaryOp, Lhs, Rhs> >
{
public:
template <typename BinaryOp, typename Lhs, typename Rhs, typename StorageKind>
class CwiseBinaryOpImpl;
EIGEN_GENERIC_PUBLIC_INTERFACE(CwiseBinaryOp)
typedef typename ei_traits<CwiseBinaryOp>::LhsNested LhsNested;
typedef typename ei_traits<CwiseBinaryOp>::RhsNested RhsNested;
/** \class CwiseBinaryOp
* \ingroup Core_Module
*
* \brief Generic expression where a coefficient-wise binary operator is applied to two expressions
*
* \tparam BinaryOp template functor implementing the operator
* \tparam LhsType the type of the left-hand side
* \tparam RhsType the type of the right-hand side
*
* This class represents an expression where a coefficient-wise binary operator is applied to two expressions.
* It is the return type of binary operators, by which we mean only those binary operators where
* both the left-hand side and the right-hand side are Eigen expressions.
* For example, the return type of matrix1+matrix2 is a CwiseBinaryOp.
*
* Most of the time, this is the only way that it is used, so you typically don't have to name
* CwiseBinaryOp types explicitly.
*
* \sa MatrixBase::binaryExpr(const MatrixBase<OtherDerived> &,const CustomBinaryOp &) const, class CwiseUnaryOp, class
* CwiseNullaryOp
*/
template <typename BinaryOp, typename LhsType, typename RhsType>
class CwiseBinaryOp : public CwiseBinaryOpImpl<BinaryOp, LhsType, RhsType,
typename internal::cwise_promote_storage_type<
typename internal::traits<LhsType>::StorageKind,
typename internal::traits<RhsType>::StorageKind, BinaryOp>::ret>,
internal::no_assignment_operator {
public:
typedef internal::remove_all_t<BinaryOp> Functor;
typedef internal::remove_all_t<LhsType> Lhs;
typedef internal::remove_all_t<RhsType> Rhs;
class InnerIterator;
typedef typename CwiseBinaryOpImpl<
BinaryOp, LhsType, RhsType,
typename internal::cwise_promote_storage_type<typename internal::traits<LhsType>::StorageKind,
typename internal::traits<Rhs>::StorageKind, BinaryOp>::ret>::Base
Base;
EIGEN_GENERIC_PUBLIC_INTERFACE(CwiseBinaryOp)
inline CwiseBinaryOp(const Lhs& lhs, const Rhs& rhs, const BinaryOp& func = BinaryOp())
: m_lhs(lhs), m_rhs(rhs), m_functor(func)
{
// we require Lhs and Rhs to have the same scalar type. Currently there is no example of a binary functor
// that would take two operands of different types. If there were such an example, then this check should be
// moved to the BinaryOp functors, on a per-case basis. This would however require a change in the BinaryOp functors, as
// currently they take only one typename Scalar template parameter.
// It is tempting to always allow mixing different types but remember that this is often impossible in the vectorized paths.
// So allowing mixing different types gives very unexpected errors when enabling vectorization, when the user tries to
// add together a float matrix and a double matrix.
EIGEN_STATIC_ASSERT((ei_is_same_type<typename Lhs::Scalar, typename Rhs::Scalar>::ret),
you_mixed_different_numeric_types__you_need_to_use_the_cast_method_of_MatrixBase_to_cast_numeric_types_explicitly)
// require the sizes to match
ei_assert(lhs.rows() == rhs.rows() && lhs.cols() == rhs.cols());
}
EIGEN_CHECK_BINARY_COMPATIBILIY(BinaryOp, typename Lhs::Scalar, typename Rhs::Scalar)
EIGEN_STATIC_ASSERT_SAME_MATRIX_SIZE(Lhs, Rhs)
inline int rows() const { return m_lhs.rows(); }
inline int cols() const { return m_lhs.cols(); }
typedef typename internal::ref_selector<LhsType>::type LhsNested;
typedef typename internal::ref_selector<RhsType>::type RhsNested;
typedef std::remove_reference_t<LhsNested> LhsNested_;
typedef std::remove_reference_t<RhsNested> RhsNested_;
inline const Scalar coeff(int row, int col) const
{
return m_functor(m_lhs.coeff(row, col), m_rhs.coeff(row, col));
}
#if EIGEN_COMP_MSVC
// Required for Visual Studio or the Copy constructor will probably not get inlined!
EIGEN_STRONG_INLINE CwiseBinaryOp(const CwiseBinaryOp<BinaryOp, LhsType, RhsType>&) = default;
#endif
template<int LoadMode>
inline PacketScalar packet(int row, int col) const
{
return m_functor.packetOp(m_lhs.template packet<LoadMode>(row, col), m_rhs.template packet<LoadMode>(row, col));
}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE CwiseBinaryOp(const Lhs& aLhs, const Rhs& aRhs,
const BinaryOp& func = BinaryOp())
: m_lhs(aLhs), m_rhs(aRhs), m_functor(func) {
eigen_assert(aLhs.rows() == aRhs.rows() && aLhs.cols() == aRhs.cols());
}
inline const Scalar coeff(int index) const
{
return m_functor(m_lhs.coeff(index), m_rhs.coeff(index));
}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Index rows() const noexcept {
// return the fixed size type if available to enable compile time optimizations
return internal::traits<internal::remove_all_t<LhsNested>>::RowsAtCompileTime == Dynamic ? m_rhs.rows()
: m_lhs.rows();
}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Index cols() const noexcept {
// return the fixed size type if available to enable compile time optimizations
return internal::traits<internal::remove_all_t<LhsNested>>::ColsAtCompileTime == Dynamic ? m_rhs.cols()
: m_lhs.cols();
}
template<int LoadMode>
inline PacketScalar packet(int index) const
{
return m_functor.packetOp(m_lhs.template packet<LoadMode>(index), m_rhs.template packet<LoadMode>(index));
}
/** \returns the left hand side nested expression */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const LhsNested_& lhs() const { return m_lhs; }
/** \returns the right hand side nested expression */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const RhsNested_& rhs() const { return m_rhs; }
/** \returns the functor representing the binary operation */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const BinaryOp& functor() const { return m_functor; }
protected:
const LhsNested m_lhs;
const RhsNested m_rhs;
const BinaryOp m_functor;
protected:
LhsNested m_lhs;
RhsNested m_rhs;
const BinaryOp m_functor;
};
/**\returns an expression of the difference of \c *this and \a other
*
* \note If you want to substract a given scalar from all coefficients, see Cwise::operator-().
*
* \sa class CwiseBinaryOp, MatrixBase::operator-=(), Cwise::operator-()
*/
template<typename Derived>
template<typename OtherDerived>
inline const CwiseBinaryOp<ei_scalar_difference_op<typename ei_traits<Derived>::Scalar>,
Derived, OtherDerived>
MatrixBase<Derived>::operator-(const MatrixBase<OtherDerived> &other) const
{
return CwiseBinaryOp<ei_scalar_difference_op<Scalar>,
Derived, OtherDerived>(derived(), other.derived());
}
// Generic API dispatcher
template <typename BinaryOp, typename Lhs, typename Rhs, typename StorageKind>
class CwiseBinaryOpImpl : public internal::generic_xpr_base<CwiseBinaryOp<BinaryOp, Lhs, Rhs>>::type {
public:
typedef typename internal::generic_xpr_base<CwiseBinaryOp<BinaryOp, Lhs, Rhs>>::type Base;
};
/** replaces \c *this by \c *this - \a other.
*
* \returns a reference to \c *this
*/
template<typename Derived>
template<typename OtherDerived>
inline Derived &
MatrixBase<Derived>::operator-=(const MatrixBase<OtherDerived> &other)
{
return *this = *this - other;
}
/** \relates MatrixBase
*
* \returns an expression of the sum of \c *this and \a other
*
* \note If you want to add a given scalar to all coefficients, see Cwise::operator+().
*
* \sa class CwiseBinaryOp, MatrixBase::operator+=(), Cwise::operator+()
*/
template<typename Derived>
template<typename OtherDerived>
inline const CwiseBinaryOp<ei_scalar_sum_op<typename ei_traits<Derived>::Scalar>, Derived, OtherDerived>
MatrixBase<Derived>::operator+(const MatrixBase<OtherDerived> &other) const
{
return CwiseBinaryOp<ei_scalar_sum_op<Scalar>, Derived, OtherDerived>(derived(), other.derived());
*
* \returns a reference to \c *this
*/
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& MatrixBase<Derived>::operator-=(const MatrixBase<OtherDerived>& other) {
call_assignment(derived(), other.derived(), internal::sub_assign_op<Scalar, typename OtherDerived::Scalar>());
return derived();
}
/** replaces \c *this by \c *this + \a other.
*
* \returns a reference to \c *this
*/
template<typename Derived>
template<typename OtherDerived>
inline Derived &
MatrixBase<Derived>::operator+=(const MatrixBase<OtherDerived>& other)
{
return *this = *this + other;
*
* \returns a reference to \c *this
*/
template <typename Derived>
template <typename OtherDerived>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Derived& MatrixBase<Derived>::operator+=(const MatrixBase<OtherDerived>& other) {
call_assignment(derived(), other.derived(), internal::add_assign_op<Scalar, typename OtherDerived::Scalar>());
return derived();
}
/** \returns an expression of the Schur product (coefficient wise product) of *this and \a other
*
* Example: \include Cwise_product.cpp
* Output: \verbinclude Cwise_product.out
*
* \sa class CwiseBinaryOp, operator/(), square()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_product_op)
Cwise<ExpressionType>::operator*(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_product_op)(_expression(), other.derived());
}
} // end namespace Eigen
/** \returns an expression of the coefficient-wise quotient of *this and \a other
*
* Example: \include Cwise_quotient.cpp
* Output: \verbinclude Cwise_quotient.out
*
* \sa class CwiseBinaryOp, operator*(), inverse()
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_quotient_op)
Cwise<ExpressionType>::operator/(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_quotient_op)(_expression(), other.derived());
}
/** \returns an expression of the coefficient-wise min of *this and \a other
*
* Example: \include Cwise_min.cpp
* Output: \verbinclude Cwise_min.out
*
* \sa class CwiseBinaryOp
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_min_op)
Cwise<ExpressionType>::min(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_min_op)(_expression(), other.derived());
}
/** \returns an expression of the coefficient-wise max of *this and \a other
*
* Example: \include Cwise_max.cpp
* Output: \verbinclude Cwise_max.out
*
* \sa class CwiseBinaryOp
*/
template<typename ExpressionType>
template<typename OtherDerived>
inline const EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_max_op)
Cwise<ExpressionType>::max(const MatrixBase<OtherDerived> &other) const
{
return EIGEN_CWISE_BINOP_RETURN_TYPE(ei_scalar_max_op)(_expression(), other.derived());
}
/** \returns an expression of a custom coefficient-wise operator \a func of *this and \a other
*
* The template parameter \a CustomBinaryOp is the type of the functor
* of the custom operator (see class CwiseBinaryOp for an example)
*
* \addexample CustomCwiseBinaryFunctors \label How to use custom coeff wise binary functors
*
* Here is an example illustrating the use of custom functors:
* \include class_CwiseBinaryOp.cpp
* Output: \verbinclude class_CwiseBinaryOp.out
*
* \sa class CwiseBinaryOp, MatrixBase::operator+, MatrixBase::operator-, Cwise::operator*, Cwise::operator/
*/
template<typename Derived>
template<typename CustomBinaryOp, typename OtherDerived>
inline const CwiseBinaryOp<CustomBinaryOp, Derived, OtherDerived>
MatrixBase<Derived>::binaryExpr(const MatrixBase<OtherDerived> &other, const CustomBinaryOp& func) const
{
return CwiseBinaryOp<CustomBinaryOp, Derived, OtherDerived>(derived(), other.derived(), func);
}
#endif // EIGEN_CWISE_BINARY_OP_H
#endif // EIGEN_CWISE_BINARY_OP_H

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,171 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra.
//
// Copyright (C) 2008-2014 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2006-2008 Benoit Jacob <jacob.benoit.1@gmail.com>
// Copyright (C) 2016 Eugene Brevdo <ebrevdo@gmail.com>
//
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_CWISE_TERNARY_OP_H
#define EIGEN_CWISE_TERNARY_OP_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename TernaryOp, typename Arg1, typename Arg2, typename Arg3>
struct traits<CwiseTernaryOp<TernaryOp, Arg1, Arg2, Arg3>> {
// we must not inherit from traits<Arg1> since it has
// the potential to cause problems with MSVC
typedef remove_all_t<Arg1> Ancestor;
typedef typename traits<Ancestor>::XprKind XprKind;
enum {
RowsAtCompileTime = traits<Ancestor>::RowsAtCompileTime,
ColsAtCompileTime = traits<Ancestor>::ColsAtCompileTime,
MaxRowsAtCompileTime = traits<Ancestor>::MaxRowsAtCompileTime,
MaxColsAtCompileTime = traits<Ancestor>::MaxColsAtCompileTime
};
// even though we require Arg1, Arg2, and Arg3 to have the same scalar type
// (see CwiseTernaryOp constructor),
// we still want to handle the case when the result type is different.
typedef typename result_of<TernaryOp(const typename Arg1::Scalar&, const typename Arg2::Scalar&,
const typename Arg3::Scalar&)>::type Scalar;
typedef typename internal::traits<Arg1>::StorageKind StorageKind;
typedef typename internal::traits<Arg1>::StorageIndex StorageIndex;
typedef typename Arg1::Nested Arg1Nested;
typedef typename Arg2::Nested Arg2Nested;
typedef typename Arg3::Nested Arg3Nested;
typedef std::remove_reference_t<Arg1Nested> Arg1Nested_;
typedef std::remove_reference_t<Arg2Nested> Arg2Nested_;
typedef std::remove_reference_t<Arg3Nested> Arg3Nested_;
enum { Flags = Arg1Nested_::Flags & RowMajorBit };
};
} // end namespace internal
template <typename TernaryOp, typename Arg1, typename Arg2, typename Arg3, typename StorageKind>
class CwiseTernaryOpImpl;
/** \class CwiseTernaryOp
* \ingroup Core_Module
*
* \brief Generic expression where a coefficient-wise ternary operator is
* applied to two expressions
*
* \tparam TernaryOp template functor implementing the operator
* \tparam Arg1Type the type of the first argument
* \tparam Arg2Type the type of the second argument
* \tparam Arg3Type the type of the third argument
*
* This class represents an expression where a coefficient-wise ternary
* operator is applied to three expressions.
* It is the return type of ternary operators, by which we mean only those
* ternary operators where
* all three arguments are Eigen expressions.
* For example, the return type of betainc(matrix1, matrix2, matrix3) is a
* CwiseTernaryOp.
*
* Most of the time, this is the only way that it is used, so you typically
* don't have to name
* CwiseTernaryOp types explicitly.
*
* \sa MatrixBase::ternaryExpr(const MatrixBase<Argument2> &, const
* MatrixBase<Argument3> &, const CustomTernaryOp &) const, class CwiseBinaryOp,
* class CwiseUnaryOp, class CwiseNullaryOp
*/
template <typename TernaryOp, typename Arg1Type, typename Arg2Type, typename Arg3Type>
class CwiseTernaryOp : public CwiseTernaryOpImpl<TernaryOp, Arg1Type, Arg2Type, Arg3Type,
typename internal::traits<Arg1Type>::StorageKind>,
internal::no_assignment_operator {
public:
typedef internal::remove_all_t<Arg1Type> Arg1;
typedef internal::remove_all_t<Arg2Type> Arg2;
typedef internal::remove_all_t<Arg3Type> Arg3;
// require the sizes to match
EIGEN_STATIC_ASSERT_SAME_MATRIX_SIZE(Arg1, Arg2)
EIGEN_STATIC_ASSERT_SAME_MATRIX_SIZE(Arg1, Arg3)
// The index types should match
EIGEN_STATIC_ASSERT((internal::is_same<typename internal::traits<Arg1Type>::StorageKind,
typename internal::traits<Arg2Type>::StorageKind>::value),
STORAGE_KIND_MUST_MATCH)
EIGEN_STATIC_ASSERT((internal::is_same<typename internal::traits<Arg1Type>::StorageKind,
typename internal::traits<Arg3Type>::StorageKind>::value),
STORAGE_KIND_MUST_MATCH)
typedef typename CwiseTernaryOpImpl<TernaryOp, Arg1Type, Arg2Type, Arg3Type,
typename internal::traits<Arg1Type>::StorageKind>::Base Base;
EIGEN_GENERIC_PUBLIC_INTERFACE(CwiseTernaryOp)
typedef typename internal::ref_selector<Arg1Type>::type Arg1Nested;
typedef typename internal::ref_selector<Arg2Type>::type Arg2Nested;
typedef typename internal::ref_selector<Arg3Type>::type Arg3Nested;
typedef std::remove_reference_t<Arg1Nested> Arg1Nested_;
typedef std::remove_reference_t<Arg2Nested> Arg2Nested_;
typedef std::remove_reference_t<Arg3Nested> Arg3Nested_;
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE CwiseTernaryOp(const Arg1& a1, const Arg2& a2, const Arg3& a3,
const TernaryOp& func = TernaryOp())
: m_arg1(a1), m_arg2(a2), m_arg3(a3), m_functor(func) {
eigen_assert(a1.rows() == a2.rows() && a1.cols() == a2.cols() && a1.rows() == a3.rows() && a1.cols() == a3.cols());
}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Index rows() const {
// return the fixed size type if available to enable compile time
// optimizations
if (internal::traits<internal::remove_all_t<Arg1Nested>>::RowsAtCompileTime == Dynamic &&
internal::traits<internal::remove_all_t<Arg2Nested>>::RowsAtCompileTime == Dynamic)
return m_arg3.rows();
else if (internal::traits<internal::remove_all_t<Arg1Nested>>::RowsAtCompileTime == Dynamic &&
internal::traits<internal::remove_all_t<Arg3Nested>>::RowsAtCompileTime == Dynamic)
return m_arg2.rows();
else
return m_arg1.rows();
}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE Index cols() const {
// return the fixed size type if available to enable compile time
// optimizations
if (internal::traits<internal::remove_all_t<Arg1Nested>>::ColsAtCompileTime == Dynamic &&
internal::traits<internal::remove_all_t<Arg2Nested>>::ColsAtCompileTime == Dynamic)
return m_arg3.cols();
else if (internal::traits<internal::remove_all_t<Arg1Nested>>::ColsAtCompileTime == Dynamic &&
internal::traits<internal::remove_all_t<Arg3Nested>>::ColsAtCompileTime == Dynamic)
return m_arg2.cols();
else
return m_arg1.cols();
}
/** \returns the first argument nested expression */
EIGEN_DEVICE_FUNC const Arg1Nested_& arg1() const { return m_arg1; }
/** \returns the first argument nested expression */
EIGEN_DEVICE_FUNC const Arg2Nested_& arg2() const { return m_arg2; }
/** \returns the third argument nested expression */
EIGEN_DEVICE_FUNC const Arg3Nested_& arg3() const { return m_arg3; }
/** \returns the functor representing the ternary operation */
EIGEN_DEVICE_FUNC const TernaryOp& functor() const { return m_functor; }
protected:
Arg1Nested m_arg1;
Arg2Nested m_arg2;
Arg3Nested m_arg3;
const TernaryOp m_functor;
};
// Generic API dispatcher
template <typename TernaryOp, typename Arg1, typename Arg2, typename Arg3, typename StorageKind>
class CwiseTernaryOpImpl : public internal::generic_xpr_base<CwiseTernaryOp<TernaryOp, Arg1, Arg2, Arg3>>::type {
public:
typedef typename internal::generic_xpr_base<CwiseTernaryOp<TernaryOp, Arg1, Arg2, Arg3>>::type Base;
};
} // end namespace Eigen
#endif // EIGEN_CWISE_TERNARY_OP_H

View File

@@ -1,236 +1,91 @@
// This file is part of Eigen, a lightweight C++ template library
// for linear algebra. Eigen itself is part of the KDE project.
// for linear algebra.
//
// Copyright (C) 2008 Gael Guennebaud <g.gael@free.fr>
// Copyright (C) 2008-2014 Gael Guennebaud <gael.guennebaud@inria.fr>
// Copyright (C) 2006-2008 Benoit Jacob <jacob.benoit.1@gmail.com>
//
// Eigen is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 3 of the License, or (at your option) any later version.
//
// Alternatively, you can redistribute it and/or
// modify it under the terms of the GNU General Public License as
// published by the Free Software Foundation; either version 2 of
// the License, or (at your option) any later version.
//
// Eigen is distributed in the hope that it will be useful, but WITHOUT ANY
// WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
// FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License or the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License and a copy of the GNU General Public License along with
// Eigen. If not, see <http://www.gnu.org/licenses/>.
// This Source Code Form is subject to the terms of the Mozilla
// Public License v. 2.0. If a copy of the MPL was not distributed
// with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
#ifndef EIGEN_CWISE_UNARY_OP_H
#define EIGEN_CWISE_UNARY_OP_H
// IWYU pragma: private
#include "./InternalHeaderCheck.h"
namespace Eigen {
namespace internal {
template <typename UnaryOp, typename XprType>
struct traits<CwiseUnaryOp<UnaryOp, XprType> > : traits<XprType> {
typedef typename result_of<UnaryOp(const typename XprType::Scalar&)>::type Scalar;
typedef typename XprType::Nested XprTypeNested;
typedef std::remove_reference_t<XprTypeNested> XprTypeNested_;
enum { Flags = XprTypeNested_::Flags & RowMajorBit };
};
} // namespace internal
template <typename UnaryOp, typename XprType, typename StorageKind>
class CwiseUnaryOpImpl;
/** \class CwiseUnaryOp
*
* \brief Generic expression of a coefficient-wise unary operator of a matrix or a vector
*
* \param UnaryOp template functor implementing the operator
* \param MatrixType the type of the matrix we are applying the unary operator
*
* This class represents an expression of a generic unary operator of a matrix or a vector.
* It is the return type of the unary operator-, of a matrix or a vector, and most
* of the time this is the only way it is used.
*
* \sa MatrixBase::unaryExpr(const CustomUnaryOp &) const, class CwiseBinaryOp, class CwiseNullaryOp
*/
template<typename UnaryOp, typename MatrixType>
struct ei_traits<CwiseUnaryOp<UnaryOp, MatrixType> >
{
typedef typename ei_result_of<
UnaryOp(typename MatrixType::Scalar)
>::type Scalar;
typedef typename MatrixType::Nested MatrixTypeNested;
typedef typename ei_unref<MatrixTypeNested>::type _MatrixTypeNested;
enum {
MatrixTypeCoeffReadCost = _MatrixTypeNested::CoeffReadCost,
MatrixTypeFlags = _MatrixTypeNested::Flags,
RowsAtCompileTime = MatrixType::RowsAtCompileTime,
ColsAtCompileTime = MatrixType::ColsAtCompileTime,
MaxRowsAtCompileTime = MatrixType::MaxRowsAtCompileTime,
MaxColsAtCompileTime = MatrixType::MaxColsAtCompileTime,
Flags = (MatrixTypeFlags & (
HereditaryBits | LinearAccessBit | AlignedBit
| (ei_functor_traits<UnaryOp>::PacketAccess ? PacketAccessBit : 0))),
CoeffReadCost = MatrixTypeCoeffReadCost + ei_functor_traits<UnaryOp>::Cost
};
* \ingroup Core_Module
*
* \brief Generic expression where a coefficient-wise unary operator is applied to an expression
*
* \tparam UnaryOp template functor implementing the operator
* \tparam XprType the type of the expression to which we are applying the unary operator
*
* This class represents an expression where a unary operator is applied to an expression.
* It is the return type of all operations taking exactly 1 input expression, regardless of the
* presence of other inputs such as scalars. For example, the operator* in the expression 3*matrix
* is considered unary, because only the right-hand side is an expression, and its
* return type is a specialization of CwiseUnaryOp.
*
* Most of the time, this is the only way that it is used, so you typically don't have to name
* CwiseUnaryOp types explicitly.
*
* \sa MatrixBase::unaryExpr(const CustomUnaryOp &) const, class CwiseBinaryOp, class CwiseNullaryOp
*/
template <typename UnaryOp, typename XprType>
class CwiseUnaryOp : public CwiseUnaryOpImpl<UnaryOp, XprType, typename internal::traits<XprType>::StorageKind>,
internal::no_assignment_operator {
public:
typedef typename CwiseUnaryOpImpl<UnaryOp, XprType, typename internal::traits<XprType>::StorageKind>::Base Base;
EIGEN_GENERIC_PUBLIC_INTERFACE(CwiseUnaryOp)
typedef typename internal::ref_selector<XprType>::type XprTypeNested;
typedef internal::remove_all_t<XprType> NestedExpression;
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE explicit CwiseUnaryOp(const XprType& xpr, const UnaryOp& func = UnaryOp())
: m_xpr(xpr), m_functor(func) {}
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Index rows() const noexcept { return m_xpr.rows(); }
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE constexpr Index cols() const noexcept { return m_xpr.cols(); }
/** \returns the functor representing the unary operation */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const UnaryOp& functor() const { return m_functor; }
/** \returns the nested expression */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE const internal::remove_all_t<XprTypeNested>& nestedExpression() const {
return m_xpr;
}
/** \returns the nested expression */
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE internal::remove_all_t<XprTypeNested>& nestedExpression() { return m_xpr; }
protected:
XprTypeNested m_xpr;
const UnaryOp m_functor;
};
template<typename UnaryOp, typename MatrixType>
class CwiseUnaryOp : ei_no_assignment_operator,
public MatrixBase<CwiseUnaryOp<UnaryOp, MatrixType> >
{
public:
EIGEN_GENERIC_PUBLIC_INTERFACE(CwiseUnaryOp)
class InnerIterator;
inline CwiseUnaryOp(const MatrixType& mat, const UnaryOp& func = UnaryOp())
: m_matrix(mat), m_functor(func) {}
inline int rows() const { return m_matrix.rows(); }
inline int cols() const { return m_matrix.cols(); }
inline const Scalar coeff(int row, int col) const
{
return m_functor(m_matrix.coeff(row, col));
}
template<int LoadMode>
inline PacketScalar packet(int row, int col) const
{
return m_functor.packetOp(m_matrix.template packet<LoadMode>(row, col));
}
inline const Scalar coeff(int index) const
{
return m_functor(m_matrix.coeff(index));
}
template<int LoadMode>
inline PacketScalar packet(int index) const
{
return m_functor.packetOp(m_matrix.template packet<LoadMode>(index));
}
protected:
const typename MatrixType::Nested m_matrix;
const UnaryOp m_functor;
// Generic API dispatcher
template <typename UnaryOp, typename XprType, typename StorageKind>
class CwiseUnaryOpImpl : public internal::generic_xpr_base<CwiseUnaryOp<UnaryOp, XprType> >::type {
public:
typedef typename internal::generic_xpr_base<CwiseUnaryOp<UnaryOp, XprType> >::type Base;
};
/** \returns an expression of a custom coefficient-wise unary operator \a func of *this
*
* The template parameter \a CustomUnaryOp is the type of the functor
* of the custom unary operator.
*
* \addexample CustomCwiseUnaryFunctors \label How to use custom coeff wise unary functors
*
* Example:
* \include class_CwiseUnaryOp.cpp
* Output: \verbinclude class_CwiseUnaryOp.out
*
* \sa class CwiseUnaryOp, class CwiseBinarOp, MatrixBase::operator-, Cwise::abs
*/
template<typename Derived>
template<typename CustomUnaryOp>
inline const CwiseUnaryOp<CustomUnaryOp, Derived>
MatrixBase<Derived>::unaryExpr(const CustomUnaryOp& func) const
{
return CwiseUnaryOp<CustomUnaryOp, Derived>(derived(), func);
}
} // end namespace Eigen
/** \returns an expression of the opposite of \c *this
*/
template<typename Derived>
inline const CwiseUnaryOp<ei_scalar_opposite_op<typename ei_traits<Derived>::Scalar>,Derived>
MatrixBase<Derived>::operator-() const
{
return derived();
}
/** \returns an expression of the coefficient-wise absolute value of \c *this
*
* Example: \include Cwise_abs.cpp
* Output: \verbinclude Cwise_abs.out
*
* \sa abs2()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_abs_op)
Cwise<ExpressionType>::abs() const
{
return _expression();
}
/** \returns an expression of the coefficient-wise squared absolute value of \c *this
*
* Example: \include Cwise_abs2.cpp
* Output: \verbinclude Cwise_abs2.out
*
* \sa abs(), square()
*/
template<typename ExpressionType>
inline const EIGEN_CWISE_UNOP_RETURN_TYPE(ei_scalar_abs2_op)
Cwise<ExpressionType>::abs2() const
{
return _expression();
}
/** \returns an expression of the complex conjugate of \c *this.
*
* \sa adjoint() */
template<typename Derived>
inline typename MatrixBase<Derived>::ConjugateReturnType
MatrixBase<Derived>::conjugate() const
{
return ConjugateReturnType(derived());
}
/** \returns an expression of the real part of \c *this.
*
* \sa imag() */
template<typename Derived>
inline const typename MatrixBase<Derived>::RealReturnType
MatrixBase<Derived>::real() const { return derived(); }
/** \returns an expression of the imaginary part of \c *this.
*
* \sa real() */
template<typename Derived>
inline const typename MatrixBase<Derived>::ImagReturnType
MatrixBase<Derived>::imag() const { return derived(); }
/** \returns an expression of *this with the \a Scalar type casted to
* \a NewScalar.
*
* The template parameter \a NewScalar is the type we are casting the scalars to.
*
* \sa class CwiseUnaryOp
*/
template<typename Derived>
template<typename NewType>
inline const CwiseUnaryOp<ei_scalar_cast_op<typename ei_traits<Derived>::Scalar, NewType>, Derived>
MatrixBase<Derived>::cast() const
{
return derived();
}
/** \relates MatrixBase */
template<typename Derived>
inline const typename MatrixBase<Derived>::ScalarMultipleReturnType
MatrixBase<Derived>::operator*(const Scalar& scalar) const
{
return CwiseUnaryOp<ei_scalar_multiple_op<Scalar>, Derived>
(derived(), ei_scalar_multiple_op<Scalar>(scalar));
}
/** \relates MatrixBase */
template<typename Derived>
inline const CwiseUnaryOp<ei_scalar_quotient1_op<typename ei_traits<Derived>::Scalar>, Derived>
MatrixBase<Derived>::operator/(const Scalar& scalar) const
{
return CwiseUnaryOp<ei_scalar_quotient1_op<Scalar>, Derived>
(derived(), ei_scalar_quotient1_op<Scalar>(scalar));
}
template<typename Derived>
inline Derived&
MatrixBase<Derived>::operator*=(const Scalar& other)
{
return *this = *this * other;
}
template<typename Derived>
inline Derived&
MatrixBase<Derived>::operator/=(const Scalar& other)
{
return *this = *this / other;
}
#endif // EIGEN_CWISE_UNARY_OP_H
#endif // EIGEN_CWISE_UNARY_OP_H

Some files were not shown because too many files have changed in this diff Show More