diff --git a/doc/DenseDecompositionBenchmark.dox b/doc/DenseDecompositionBenchmark.dox
index f5bfc9c9c..fcbc74b24 100644
--- a/doc/DenseDecompositionBenchmark.dox
+++ b/doc/DenseDecompositionBenchmark.dox
@@ -30,10 +30,11 @@ Timings are in \b milliseconds, and factors are relative to the LLT decompositio
\b *: This decomposition do not support direct least-square solving for over-constrained problems, and the reported timing include the cost to form the symmetric covariance matrix \f$ A^T A \f$.
\b Observations:
- + LLT is always the fastest solvers.
+ + LLT is always the fastest solver.
+ For largely over-constrained problems, the cost of Cholesky/LU decompositions is dominated by the computation of the symmetric covariance matrix.
- + For large problem sizes, only the decomposition implementing a cache-friendly blocking strategy scale well. Those include LLT, PartialPivLU, HouseholderQR, and BDCSVD. This explain why for a 4k x 4k matrix, HouseholderQR is faster than LDLT. In the future, LDLT and ColPivHouseholderQR will also implement blocking strategies.
+ + For large problem sizes, only the decompositions implementing a cache-friendly blocking strategy scale well. These include LLT, PartialPivLU, HouseholderQR, and BDCSVD. This explains why, for a 4k x 4k matrix, HouseholderQR is faster than LDLT.
+ CompleteOrthogonalDecomposition is based on ColPivHouseholderQR and they thus achieve the same level of performance.
+ + FullPivLU and FullPivHouseholderQR are dramatically slower for large matrices due to the lack of blocking, and are not shown for the 4k x 4k case.
The above table was originally generated by a benchmark tool. Feel free to write your own benchmark to generate a table matching your hardware, compiler, and favorite problem sizes.
diff --git a/doc/LeastSquares.dox b/doc/LeastSquares.dox
index ddbf38dec..606f3583c 100644
--- a/doc/LeastSquares.dox
+++ b/doc/LeastSquares.dox
@@ -7,13 +7,33 @@ of equations, say \a Ax = \a b, has no solutions. In this case, it makes sense t
vector \a x which is closest to being a solution, in the sense that the difference \a Ax - \a b is
as small as possible. This \a x is called the least square solution (if the Euclidean norm is used).
-The three methods discussed on this page are the SVD decomposition, the QR decomposition and normal
-equations. Of these, the SVD decomposition is generally the most accurate but the slowest, normal
-equations is the fastest but least accurate, and the QR decomposition is in between.
+The methods discussed on this page are the complete orthogonal decomposition (COD), the SVD
+decomposition, other QR decompositions, and normal equations. For most problems, we recommend
+CompleteOrthogonalDecomposition: it robustly computes the minimum-norm least squares solution
+(like the SVD) for both over- and under-determined systems, including rank-deficient ones, but at
+QR-like speed. The SVD is the most robust but also the slowest; use it when you also need singular
+values or vectors. Normal equations are the fastest but least robust.
\eigenAutoToc
+\section LeastSquaresCOD Using the complete orthogonal decomposition (recommended)
+
+CompleteOrthogonalDecomposition is the recommended method for least squares problems. It handles the
+widest class of problems (overdetermined, underdetermined, and rank-deficient systems) and computes
+the minimum-norm solution when the system is rank-deficient or underdetermined, just like the SVD.
+It is based on a rank-revealing QR factorization (ColPivHouseholderQR) followed by a post-processing
+step, so it is significantly faster than the SVD while providing comparable robustness.
+
+
+<table class="example">
+<tr><th>Example:</th><th>Output:</th></tr>
+<tr>
+  <td>\include LeastSquaresCOD.cpp </td>
+  <td>\verbinclude LeastSquaresCOD.out </td>
+</tr>
+</table>
+
\section LeastSquaresSVD Using the SVD decomposition
The \link BDCSVD::solve() solve() \endlink method in the BDCSVD class can be directly used to
@@ -30,16 +50,19 @@ computing least squares solutions:
This is example from the page \link TutorialLinearAlgebra Linear algebra and decompositions \endlink.
-If you just need to solve the least squares problem, but are not interested in the SVD per se, a
-faster alternative method is CompleteOrthogonalDecomposition.
+The SVD gives you singular values and vectors in addition to the least squares solution, but if you
+only need the solution, CompleteOrthogonalDecomposition (above) is faster.
-\section LeastSquaresQR Using the QR decomposition
+\section LeastSquaresQR Using other QR decompositions
-The solve() method in QR decomposition classes also computes the least squares solution. There are
-three QR decomposition classes: HouseholderQR (no pivoting, fast but unstable if your matrix is
-not rull rank), ColPivHouseholderQR (column pivoting, thus a bit slower but more stable) and
-FullPivHouseholderQR (full pivoting, so slowest and slightly more stable than ColPivHouseholderQR).
+The solve() method in QR decomposition classes also computes the least squares solution. Besides
+CompleteOrthogonalDecomposition (above), there are three other QR decomposition classes:
+HouseholderQR (no pivoting, so fast but unreliable if your matrix is not full rank),
+ColPivHouseholderQR (column pivoting, a bit slower but rank-revealing), and FullPivHouseholderQR
+(full pivoting, significantly slower and rarely needed in practice).
+Note that only CompleteOrthogonalDecomposition and the SVD-based solvers compute minimum-norm
+solutions for rank-deficient or underdetermined problems; the other QR variants do not.
Here is an example with column pivoting:
diff --git a/doc/TopicLinearAlgebraDecompositions.dox b/doc/TopicLinearAlgebraDecompositions.dox
index 8598ce65b..190f09d14 100644
--- a/doc/TopicLinearAlgebraDecompositions.dox
+++ b/doc/TopicLinearAlgebraDecompositions.dox
@@ -42,10 +42,10 @@ To get an overview of the true relative speed of the different decompositions, c
| FullPivLU |
- |
- Slow |
+ Slow (no blocking) |
Proven |
Yes |
- - |
+ Rank, kernel, image |
Yes |
Excellent |
- |
@@ -78,7 +78,7 @@ To get an overview of the true relative speed of the different decompositions, c
| FullPivHouseholderQR |
- |
- Slow |
+ Slow (no blocking) |
Proven |
Yes |
Orthogonalization |
@@ -120,7 +120,7 @@ To get an overview of the true relative speed of the different decompositions, c
- |
Yes |
Excellent |
- Soon: blocking |
+ - |
| \n Singular values and eigenvalues decompositions |
@@ -232,7 +232,7 @@ To get an overview of the true relative speed of the different decompositions, c
- |
- |
Good |
- Soon: blocking |
+ - |
@@ -244,7 +244,7 @@ To get an overview of the true relative speed of the different decompositions, c
| - |
- |
Good |
- Soon: blocking |
+ - |
@@ -253,9 +253,32 @@ To get an overview of the true relative speed of the different decompositions, c
- \b 1: There exist two variants of the LDLT algorithm. Eigen's one produces a pure diagonal D matrix, and therefore it cannot handle indefinite matrices, unlike Lapack's one which produces a block diagonal D matrix.
- \b 2: Eigenvalues, SVD and Schur decompositions rely on iterative algorithms. Their convergence speed depends on how well the eigenvalues are separated.
- \b 3: Our JacobiSVD is two-sided, making for proven and optimal precision for square matrices. For non-square matrices, we have to use a QR preconditioner first. The default choice, ColPivHouseholderQR, is already very reliable, but if you want it to be proven, use FullPivHouseholderQR instead.
+
+\section TopicLinAlgPracticalGuidance Practical guidance
+
+The following recommendations apply to the most common use cases:
+
+\li Symmetric positive definite systems: Use \b LLT. It is the fastest solver and has excellent
+ numerical properties for this class of problems. For semidefinite or nearly singular symmetric systems,
+ use \b LDLT.
+\li General invertible systems: Use \b PartialPivLU. It uses cache-friendly blocking and implicit
+ multi-threading, making it the fastest general-purpose solver. Partial pivoting is sufficient for
+ virtually all practical problems.
+\li Least squares (over- or under-determined systems): Use \b CompleteOrthogonalDecomposition as
+ the default. Like the SVD, it robustly computes the minimum-norm solution for rank-deficient and
+ under-determined problems, but at QR-like speed. Use \b BDCSVD when you also need singular values
+ or vectors, not just the least squares solution.
+\li Full-rank least squares (overdetermined systems): When the matrix is known to be full rank,
+ \b HouseholderQR is the fastest option. For very tall and skinny well-conditioned matrices,
+ solving via the normal equations with \b LLT can be faster still.
+\li FullPivLU and FullPivHouseholderQR use complete pivoting, which prevents the use of
+ cache-friendly blocking algorithms and makes them significantly slower than their partial/column
+ pivoting counterparts. In practice, complete pivoting rarely provides meaningful accuracy benefits.
+ These decompositions are primarily useful for debugging, pedagogy, or the very rare case
+ where column pivoting is insufficient.
+
\section TopicLinAlgTerminology Terminology
diff --git a/doc/TutorialLinearAlgebra.dox b/doc/TutorialLinearAlgebra.dox
index 8042fcad3..f2dd13e40 100644
--- a/doc/TutorialLinearAlgebra.dox
+++ b/doc/TutorialLinearAlgebra.dox
@@ -43,7 +43,23 @@ depending on your matrix, the problem you are trying to solve, and the trade-off
Requirements on the matrix |
Speed (small-to-medium) |
Speed (large) |
- Accuracy |
+ Robustness* |
+
+  <tr>
+    <td>LLT</td>
+    <td>llt()</td>
+    <td>Positive definite</td>
+    <td>+++</td>
+    <td>+++</td>
+    <td>+</td>
+  </tr>
+
+  <tr>
+    <td>LDLT</td>
+    <td>ldlt()</td>
+    <td>Positive or negative semidefinite</td>
+    <td>+++</td>
+    <td>+</td>
+    <td>++</td>
+  </tr>
| PartialPivLU |
@@ -54,14 +70,6 @@ depending on your matrix, the problem you are trying to solve, and the trade-off
+ |
- | FullPivLU |
- fullPivLu() |
- None |
- - |
- - - |
- +++ |
-
-
| HouseholderQR |
householderQr() |
None |
@@ -69,7 +77,7 @@ depending on your matrix, the problem you are trying to solve, and the trade-off
++ |
+ |
-
+
| ColPivHouseholderQR |
colPivHouseholderQr() |
None |
@@ -77,14 +85,6 @@ depending on your matrix, the problem you are trying to solve, and the trade-off
- |
+++ |
-
- | FullPivHouseholderQR |
- fullPivHouseholderQr() |
- None |
- - |
- - - |
- +++ |
-
| CompleteOrthogonalDecomposition |
completeOrthogonalDecomposition() |
@@ -93,23 +93,7 @@ depending on your matrix, the problem you are trying to solve, and the trade-off
- |
+++ |
-
- | LLT |
- llt() |
- Positive definite |
- +++ |
- +++ |
- + |
-
- | LDLT |
- ldlt() |
- Positive or negative semidefinite |
- +++ |
- + |
- ++ |
-
-
| BDCSVD |
bdcSvd() |
None |
@@ -126,15 +110,36 @@ depending on your matrix, the problem you are trying to solve, and the trade-off
+++ |
+
+* The Robustness column indicates how well the decomposition handles
+ill-conditioned or rank-deficient matrices. All decompositions give excellent accuracy when their
+requirements on the matrix are met and the problem is well-conditioned.
+
To get an overview of the true relative speed of the different decompositions, check this \link DenseDecompositionBenchmark benchmark \endlink.
-All of these decompositions offer a solve() method that works as in the above example.
+All of these decompositions offer a solve() method that works as in the above example.
-If you know more about the properties of your matrix, you can use the above table to select the best method.
-For example, a good choice for solving linear systems with a non-symmetric matrix of full rank is PartialPivLU.
-If you know that your matrix is also symmetric and positive definite, the above table says that
-a very good choice is the LLT or LDLT decomposition. Here's an example, also demonstrating that using a general
-matrix (not a vector) as right hand side is possible:
+\b Practical \b recommendations:
+\li If your matrix is symmetric positive definite, use \b LLT. It is the fastest and is perfectly accurate
+ for this class of problems. If your matrix is only positive or negative semidefinite, use \b LDLT.
+\li For a general invertible matrix, \b PartialPivLU is the best choice. It is fast (uses cache-friendly
+ blocking) and reliable for the vast majority of problems.
+\li For least squares problems (over- or under-determined systems), \b CompleteOrthogonalDecomposition
+ is the recommended default. Like the SVD, it robustly computes the minimum-norm solution for
+ rank-deficient and under-determined problems, but at the cost of a QR decomposition rather than
+ an SVD. Use \b ColPivHouseholderQR if you only need least squares for full-rank overdetermined
+ systems and don't need the minimum-norm property.
+\li \b SVD decompositions (BDCSVD, JacobiSVD) are the most robust but also the slowest. Use these when
+ you need singular values/vectors, not just the solution.
+\li \b HouseholderQR is the fastest option for full-rank least squares problems, but it does not
+ reveal rank and cannot compute minimum-norm solutions for rank-deficient problems.
+\li FullPivLU and FullPivHouseholderQR use complete pivoting, which is significantly slower due to
+ lack of blocking. In practice, they rarely provide meaningful benefits over PartialPivLU and
+ ColPivHouseholderQR, respectively, and are not recommended for general use. They are primarily useful
+ for debugging or for pedagogical purposes.
+
+Here's an example showing the use of LLT for a symmetric positive definite system, also demonstrating
+that using a general matrix (not a vector) as right hand side is possible:
| Example: | Output: |
@@ -151,14 +156,15 @@ supports many other decompositions), see our special page on
\section TutorialLinAlgLeastsquares Least squares solving
-The most general and accurate method to solve under- or over-determined linear systems
-in the least squares sense, is the SVD decomposition. Eigen provides two implementations.
-The recommended one is the BDCSVD class, which scales well for large problems
-and automatically falls back to the JacobiSVD class for smaller problems.
-For both classes, their solve() method solved the linear system in the least-squares
-sense.
+The recommended method to solve under- or over-determined linear systems in the least squares sense is
+\b CompleteOrthogonalDecomposition. Like the SVD, it robustly computes the minimum-norm least squares
+solution, correctly handling rank-deficient and under-determined problems, but it is significantly faster
+since it is based on a rank-revealing QR decomposition rather than a full SVD.
-Here is an example:
+If you also need the singular values or vectors themselves (not just the least squares solution), use
+\b BDCSVD, which scales well for large problems and automatically falls back to JacobiSVD for smaller ones.
+
+Here is an example using the SVD:
| Example: | Output: |
@@ -167,11 +173,9 @@ Here is an example:
-An alternative to the SVD, which is usually faster and about as accurate, is CompleteOrthogonalDecomposition.
-
-Again, if you know more about the problem, the table above contains methods that are potentially faster.
-If your matrix is full rank, HouseHolderQR is the method of choice. If your matrix is full rank and well conditioned,
-using the Cholesky decomposition (LLT) on the matrix of the normal equations can be faster still.
+If you know more about the problem, faster methods are available.
+If your matrix is full rank, HouseholderQR is the fastest method. If your matrix is full rank and
+well conditioned, using the Cholesky decomposition (LLT) on the normal equations can be faster still.
Our page on \link LeastSquares least squares solving \endlink has more details.
@@ -267,8 +271,9 @@ singular matrix). On \ref TopicLinearAlgebraDecompositions "this table" you can
whether they are rank-revealing or not.
Rank-revealing decompositions offer at least a rank() method. They can also offer convenience methods such as isInvertible(),
-and some are also providing methods to compute the kernel (null-space) and image (column-space) of the matrix, as is the
-case with FullPivLU:
+and some also provide methods to compute the kernel (null-space) and image (column-space) of the matrix.
+ColPivHouseholderQR, CompleteOrthogonalDecomposition, and FullPivLU all provide these methods. Here is an example using
+FullPivLU:
| Example: | Output: |
diff --git a/doc/snippets/LeastSquaresCOD.cpp b/doc/snippets/LeastSquaresCOD.cpp
new file mode 100644
index 000000000..b7325bc29
--- /dev/null
+++ b/doc/snippets/LeastSquaresCOD.cpp
@@ -0,0 +1,3 @@
+MatrixXf A = MatrixXf::Random(3, 2);
+VectorXf b = VectorXf::Random(3);
+cout << "The solution using the COD is:\n" << A.completeOrthogonalDecomposition().solve(b) << endl;