88bb2087c1
New implementation of Swap as discussed, reusing Assign. Makes LU run 10% faster overall.
Benoit Jacob
2008-08-05 21:55:57 +00:00
c94be35bc8
introduce copyCoeff and copyPacket methods in MatrixBase, used by Assign, in preparation for new Swap impl reusing Assign code. remove last remnant of old Inverse class in Transform.
Benoit Jacob
2008-08-05 18:00:23 +00:00
09ef7db9d9
Add partial pivoting runtime option to LU.
Benoit Jacob
2008-08-05 15:43:11 +00:00
e741b7beca
further big perf improvement in Inverse
Benoit Jacob
2008-08-04 23:47:09 +00:00
79a0feee68
big performance improvement in inverse and LU
Benoit Jacob
2008-08-04 23:34:21 +00:00
a7a05382d1
Add a LU decomposition action in BTL and various cleaning in BTL. For instance all per plot settings have been moved to a single file, go_mean now takes an optional second argument "tiny" to generate plots for tiny matrices, and outputs comparison information wrt previous benchmarks (if any).
Gael Guennebaud
2008-08-04 23:12:48 +00:00
c2f8ecf466
* LU decomposition, supporting all rectangular matrices, with full pivoting for better numerical stability. For now the only application is determinant. * New determinant unit-test. * Disable most of Swap.h for now as it makes LU fail (mysterious). Anyway Swap needs a big overhaul as proposed on IRC. * Remnants of old class Inverse removed. * Some warnings fixed.
Benoit Jacob
2008-08-04 04:45:59 +00:00
f81dfcf00b
fix two perf issues in product. fix positive definite test in Cholesky. remove #include <cstring> in CoreDeclaration.
Gael Guennebaud
2008-08-03 20:23:06 +00:00
49ae3fca89
fix compile errors with gcc 4.3: unresolved func call to ei_cache_friendly_product, and undeclared memcpy
Benoit Jacob
2008-08-03 15:44:06 +00:00
6d11a07e5e
Added an ei_palign function to align a packet from two others. This allows much faster code dealing with unaligned data, as in the updated matrix-vector product functions.
Gael Guennebaud
2008-08-03 15:15:46 +00:00
55aeb1f83a
Optimizations: * faster matrix-matrix and matrix-vector products (especially for not aligned cases) * faster tridiagonalization (make it use our matrix-vector impl.) Others: * fix Flags of Map * split the test_product into two smaller ones
Gael Guennebaud
2008-08-01 23:44:59 +00:00
b32b186c14
removed the packet specializations of some functors (GCC generates better code without those "optimizations")
Gael Guennebaud
2008-07-31 21:03:11 +00:00
842c4f8bfa
Several compilation fixes for MSVC and NVCC, basically: - added explicit enum to int conversion where needed - if a function is not defined as declared and the return type is "tricky" then the type must be typedefined somewhere. A "tricky return type" can be: * a template class with a default parameter which depends on another template parameter * a nested template class, or type of a nested template class
Gael Guennebaud
2008-07-29 16:33:07 +00:00
e0215ee510
BTL: - added tridiagonalization and hessenberg decomposition bench - added GOTO library
Gael Guennebaud
2008-07-28 20:48:21 +00:00
44d95e0540
fix some internal asserts in CacheFriendlyProduct
Gael Guennebaud
2008-07-27 22:14:08 +00:00
02a7efa910
forgot to include this file in previous commit
Gael Guennebaud
2008-07-27 14:24:32 +00:00
93115619c2
* updated benchmark files according to recent renamings * various improvements in BTL including trisolver and cholesky bench
Gael Guennebaud
2008-07-27 11:39:47 +00:00
e9e5261664
Fix a couple issues introduced in the previous commit: * removed DirectAccessBit from Part * use a template specialization in inverseProduct() to transform a Part xpr to a Flagged xpr
Gael Guennebaud
2008-07-26 23:05:44 +00:00
e77ccf2928
* Rewrite the triangular solver so that we can take advantage of our efficient matrix-vector products: => up to 6 times faster ! * Added DirectAccessBit to Part * Added an example of a cwise operator * Renamed perpendicular() => someOrthogonal() (geometry module) * Fix a weird bug in ei_constant_functor: the default copy constructor did not copy the imaginary part when the single member of the class is a complex...
Gael Guennebaud
2008-07-26 20:40:29 +00:00
2940617e6f
bugfix in some internal asserts of CacheFriendlyProduct
Gael Guennebaud
2008-07-26 12:26:27 +00:00
f997a3e902
update the inverse test a little make use of static asserts in Map fix 2 warnings in CacheFriendlyProduct: unused var 'Vectorized'
Benoit Jacob
2008-07-26 12:08:28 +00:00
b466c266a0
* Fix some complex alignment issues in the cache friendly matrix-vector products. * Minor update of the cores of the Cholesky algorithms to make them more friendly wrt matrix-vector products => speedup x5 !
Gael Guennebaud
2008-07-23 17:30:00 +00:00
172000aaeb
Add .perpendicular() function in Geometry module (adapted from Eigen1) Documentation: * add an overview for each module. * add an example for .all() and Cwise::operator<
Gael Guennebaud
2008-07-22 10:54:42 +00:00
516db2c3b9
Fix compilation issues with icc and g++ < 4.1. Those include: - conflicts with operator * overloads - discard the use of ei_pdiv for integer (g++ handles operators on __m128* types, this is why it worked) - weird behavior of icc in fixed size Block() constructor complaining the initializer of m_blockRows and m_blockCols were missing while we are in fixed size (maybe this hides a deeper problem since this is a recent one, but icc gives only little feedback)
Gael Guennebaud
2008-07-21 12:40:56 +00:00
c10f069b6b
* Merge Extract and Part to the Part expression. Renamed "MatrixBase::extract() const" to "MatrixBase::part() const" * Renamed static functions identity, zero, ones, random with an upper case first letter: Identity, Zero, Ones and Random.
Gael Guennebaud
2008-07-21 00:34:46 +00:00
ce425d92f1
Various documentation improvements, in particular in Cholesky and Geometry module. Added doxygen groups for Matrix typedefs and the Geometry module
Gael Guennebaud
2008-07-20 15:18:54 +00:00
269f683902
Add cholesky's members to MatrixBase Various documentation improvements including new snippets (AngleAxis and Cholesky)
Gael Guennebaud
2008-07-19 22:59:05 +00:00
6e2c53e056
Added an automatically generated list of selected examples in the documentation. Added the custom geometry_module tag, and use it.
Gael Guennebaud
2008-07-19 20:36:41 +00:00
05ad083467
Added MatrixBase::Unit*() static function to easily create unit/basis vectors. Removed EulerAngles, added typedefs for Quaternion and AngleAxis, and added automatic conversions from Quaternion/AngleAxis to Matrix3 such that: Matrix3f m = AngleAxisf(0.2,Vector3f::UnitX) * AngleAxisf(0.2,Vector3f::UnitY); just works.
Gael Guennebaud
2008-07-19 13:03:23 +00:00
7245c63067
Complete rewrite of partial reduction according to mailing list discussions.
Gael Guennebaud
2008-07-19 11:36:32 +00:00
8b4945a5a2
add some static asserts, use them, fix gcc 4.3 warning in Product.h.
Benoit Jacob
2008-07-19 00:25:41 +00:00
22a816ade8
* Fix a couple of issues related to the recent cache friendly products * Improve the efficiency of matrix*vector in unaligned cases * Trivial fixes in the destructors of MatrixStorage * Removed the matrixNorm in test/product.cpp (twice faster and that assumed the matrix product was ok while checking that !!)
Gael Guennebaud
2008-07-19 00:09:01 +00:00
62ec1dd616
* big rework of Inverse.h: - remove all invertibility checking, will be redundant with LU - general case: adapt to matrix storage order for better perf - size 4 case: handle corner cases without falling back to gen case. - rationalize with selectors instead of compile time if - add C-style computeInverse() * update inverse test. * in snippets, default cout precision to 3 decimal places * add some cmake module from kdelibs to support btl with cmake 2.4
Benoit Jacob
2008-07-15 23:56:17 +00:00
b970a9c8aa
trivial fix in EulerAngles constructor
Gael Guennebaud
2008-07-15 22:42:55 +00:00
c8cbc1665e
enhancements of the plot generator: - removed the ugly X11 and PNG gnuplots terminals - use enhanced postscript terminal - use imagemagick to generate the png files (with compression) - disable the fortran impl by default since it is as meaningless as a "C impl" - update line settings
Gael Guennebaud
2008-07-13 11:46:36 +00:00
99a625243f
Optimization: added super efficient rowmajor * vector product (and vector * colmajor). It basically performs 4 dot products at once reducing loads of the vector and improving instructions scheduling. With 3 cache friendly algorithms, we now handle all product configurations with outstanding perf for large matrices.
Gael Guennebaud
2008-07-13 01:22:54 +00:00
51e6ee39f0
SVN_SILENT trivial fix
Benoit Jacob
2008-07-12 23:42:19 +00:00
bd0183f850
fix a cmake issue in FindTvmet and FindMKL
Gael Guennebaud
2008-07-12 23:34:42 +00:00
e979e6485f
another occurrence of that little cmake fix
Benoit Jacob
2008-07-12 23:27:41 +00:00
861d18d553
* Optimization: added a specialization of Block for xpr with DirectAccessBit * some simplifications and fixes in cache friendly products
Gael Guennebaud
2008-07-12 22:59:34 +00:00
1bbaea9885
little cmake fix
Benoit Jacob
2008-07-12 22:13:03 +00:00
10c4e36b39
disable MKL check and fortran for cmake <2.6
Gael Guennebaud
2008-07-12 21:54:02 +00:00
ed6e07b2f6
various improvements of the plot generator in BTL
Gael Guennebaud
2008-07-12 21:41:32 +00:00
8233de8b69
various minor updates in the benchmark suite like non inlining of some functions as well as the experimental C code used to design efficient eigen's matrix vector products.
Gael Guennebaud
2008-07-12 12:14:08 +00:00
b7bd1b3446
Add a *very efficient* evaluation path for both col-major matrix * vector and vector * row-major products. Currently, it is enabled only if the matrix has DirectAccessBit flag and the product is "large enough". Added the respective unit tests in test/product.cpp.
Gael Guennebaud
2008-07-12 12:12:02 +00:00
6f71ef8277
resurrected tvmet, added MTL4, intel's MKL and handcoded vectorized backends in the benchmark suite
Gael Guennebaud
2008-07-10 18:28:50 +00:00
2b53fd4d53
some performance fixes in Assign.h reported by Gael. Some doc update in Cwise.
Benoit Jacob
2008-07-10 16:15:55 +00:00
7b4c6b8862
in BTL: a specific bench/action can be selected at runtime, e.g.: BTL_CONFIG="-a ata" ctest -V -R eigen runs all the benchmarks having "ata" in their name for all libraries matching the regexp "eigen"
Gael Guennebaud
2008-07-09 22:35:11 +00:00
c9b046d5d5
* added optimized paths for matrix-vector and vector-matrix products (using either a cache friendly strategy or re-using dot-product vectorized implementation) * add LinearAccessBit to Transpose
Gael Guennebaud
2008-07-09 22:30:18 +00:00
25904802bc
raah, results were corrupted by overflow. Now slice vectorization is about a +25% speedup which is still nice as i expected zero or even negative benefit.
Benoit Jacob
2008-07-09 16:46:26 +00:00
8f21a5e862
add benchmark for slice vectorization... expected it to be little or zero benefit... turns out to be 20x speedup. Something is wrong.
Benoit Jacob
2008-07-09 16:43:11 +00:00
28539e7597
imported a reworked version of BTL (Benchmark for Templated Libraries). the modifications to initial code follow: * changed build system from plain makefiles to cmake * added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces * added "transposed matrix * vector" product action * updated blitz interface to use condensed products instead of hand coded loops * removed some deprecated interfaces * changed default storage order to column major for all libraries * new generic bench timer strategy which is supposed to be more accurate * various code clean-up
Gael Guennebaud
2008-07-09 14:04:48 +00:00
5f55ab524c
* added a lazyAssign overload skipping .lazy(), such that for c = (<xpr>).lazy() the lazyAssign overloads of <xpr> are automatically called (this also reduces assign instantiations)
Gael Guennebaud
2008-07-09 13:54:21 +00:00
783eb6da9b
I forgot that the previous commit needed minor changes outside the bench folder
Gael Guennebaud
2008-07-08 17:25:58 +00:00
77a622f2bb
add Cholesky and eigensolver benchmark
Gael Guennebaud
2008-07-08 17:20:17 +00:00
6f09d3a67d
- many updates after Cwise change - fix compilation in product.cpp with std::complex - fix bug in MatrixBase::operator!=
Benoit Jacob
2008-07-08 07:56:01 +00:00
f5791eeb70
the big Array/Cwise rework as discussed on the mailing list. The new API can be seen in Eigen/src/Core/Cwise.h.
Benoit Jacob
2008-07-08 00:49:10 +00:00
c910c517b3
fix issues in previously added additional product tests
Gael Guennebaud
2008-07-06 19:02:03 +00:00
a9d319d44f
* do the ActualPacketAccessBit change as discussed on list * add comment in Product.h about CanVectorizeInner * fix typo in test/product.cpp
Benoit Jacob
2008-07-04 12:43:55 +00:00
8463b7d3f4
* fix compilation issue in Product * added some tests for product and swap * overload .swap() for dynamic-sized matrix of same size
Gael Guennebaud
2008-07-02 16:05:33 +00:00
9433df83a7
* resurrected Flagged::_expression used to optimize m+=(a*b).lazy() (equivalent to the GEMM blas routine) * added a GEMM benchmark
Gael Guennebaud
2008-07-01 16:20:06 +00:00
95549007b3
* fix error in divergence test, now it is even faster * add comments in render() in case anyone ever reads that :P
Benoit Jacob
2008-07-01 14:23:01 +00:00
a356ebd47d
interleaved rendering balances the load better
Benoit Jacob
2008-07-01 14:12:32 +00:00
56d03f181e
* multi-threaded rendering * increased number of iterations, with more iterations done before testing divergence. results in x2 speedup from vectorization.
Benoit Jacob
2008-07-01 12:01:58 +00:00
cacf986a7f
- use double precision to store the position / zoom / other stuff - some temporary fix to get a +50% improvement from vectorization until we have vectorisation for comparisons and redux
Benoit Jacob
2008-06-30 07:33:08 +00:00
37a50fa526
* added an in-place version of inverseProduct which might be twice faster for small fixed-size matrices * added a sparse triangular solver (sparse version of inverseProduct) * various other improvements in the Sparse module
Gael Guennebaud
2008-06-29 21:29:12 +00:00
fbdecf09e1
fix little bug in computation of max_iter
Benoit Jacob
2008-06-29 12:20:07 +00:00
97a1038653
improve greatly mandelbrot demo: - much better coloring - determine max number of iterations and choice between float and double at runtime based on zoom level - do draft renderings with increasing resolution before final rendering
Benoit Jacob
2008-06-29 12:04:00 +00:00
027818d739
* added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations !
Gael Guennebaud
2008-06-28 23:07:14 +00:00
6917be9113
add mandelbrot demo
Benoit Jacob
2008-06-28 20:33:47 +00:00
55e08f7102
fix breakage from my last commit
Benoit Jacob
2008-06-28 17:15:16 +00:00
844f69e4a9
* update CMakeLists, only build instantiations if TEST_LIB is defined * allow default Matrix constructor in dynamic size, defaulting to (1, 1), this is convenient in mandelbrot example.
Benoit Jacob
2008-06-27 10:53:30 +00:00
6de4871c8c
fix a couple of issues in the new Map.h
Benoit Jacob
2008-06-27 01:42:44 +00:00
e27b2b95cf
* rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead.
Benoit Jacob
2008-06-27 01:22:35 +00:00
e5d301dc96
various work on the Sparse module: * added some glue to Eigen/Core (SparseBit, ei_eval, Matrix) * add two new sparse matrix types: HashMatrix: based on std::map (for random writes) LinkedVectorMatrix: array of linked vectors (for outer coherent writes, e.g. to transpose a matrix) * add a SparseSetter class to easily set/update any kind of matrices, e.g.: { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix); for (...) wrapper->coeffRef(rand(),rand()) = rand(); } * automatic shallow copy for RValue * and a lot of mess ! plus: * remove the remaining ArrayBit related stuff * don't use alloca in product for very large memory allocation
Gael Guennebaud
2008-06-26 23:22:26 +00:00
c5bd1703cb
change derived classes methods from "private:_method()" to "public:method()" i.e. reimplementing the generic method() from MatrixBase. improves compilation speed by 7%, reduces almost by half the call depth of trivial functions, making gcc errors and application backtraces nicer...
Benoit Jacob
2008-06-26 20:08:16 +00:00
25ba9f377c
* add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions' flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyn*dyn size * fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too.
Benoit Jacob
2008-06-26 16:06:41 +00:00
5b0da4b778
make use of ei_pmadd in dot-product: will further improve performance on architectures having a packed-mul-add assembly instruction.
Benoit Jacob
2008-06-24 18:08:35 +00:00
3b94436d2f
* vectorize dot product, copying code from sum. * make the conj functor vectorizable: it is just identity in real case, and complex doesn't use the vectorized path anyway. * fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size) should not be vectorizable, since in fixed-size we are assuming the size to be a multiple of packet size. (Or would you prefer Vector3d to be flagged "packetaccess" even though no packet access is possible on vectors of that type?) * rename: isOrtho for vectors ---> isOrthogonal isOrtho for matrices ---> isUnitary * add normalize() * reimplement normalized with quotient1 functor
Benoit Jacob
2008-06-24 15:13:00 +00:00
c9560df4a0
* add ei_pdiv intrinsic, make quotient functor vectorizable * add vdw benchmark from Tim's real-world use case
Benoit Jacob
2008-06-23 22:00:18 +00:00
ac9aa47bbc
optimize linear vectorization both in Assign and Sum (optimal amortized perf)
Gael Guennebaud
2008-06-23 15:50:28 +00:00
ea1990ef3d
add experimental code for sparse matrix: - uses the common "Compressed Column Storage" scheme - supports every unary and binary operators with xpr template assuming binaryOp(0,0) == 0 and unaryOp(0) == 0 (otherwise a sparse matrix does not make sense) - this is the first commit, so of course, there are still several shortcomings !
Gael Guennebaud
2008-06-23 13:25:22 +00:00
03d19f3bae
quick temporary fix for a perf issue we just identified with vectorization.... now the sum benchmark runs 3x faster with vectorization than without.
Benoit Jacob
2008-06-23 11:23:05 +00:00
32596c5e9e
add benchmark for sum
Benoit Jacob
2008-06-23 11:03:27 +00:00
dc9206cec5
split sum away from redux and vectorize it. (could come back to redux after it has been vectorized, and could serve as a starting point for that) also make the abs2 functor vectorizable (for real types).
Benoit Jacob
2008-06-23 10:32:48 +00:00
8a967fb17c
* implement slice vectorization. Because it uses unaligned packet access, it is not certain that it will bring a performance improvement: benchmarking needed. * improve logic choosing slice vectorization. * fix typo in SSE packet math, causing crash in unaligned case. * fix bug in Product, causing crash in unaligned case. * add TEST_SSE3 CMake option.
Benoit Jacob
2008-06-22 15:02:05 +00:00
8cef541b5a
forgot to add the unit test array.cpp
Gael Guennebaud
2008-06-21 17:28:07 +00:00
32c5ea388e
work on rotations in the Geometry module: - conversions are done through constructors and operator= - added a EulerAngles class
Gael Guennebaud
2008-06-21 15:01:49 +00:00
574416b842
Override MatrixBase::eval() since matrices don't need to be evaluated, it is enough to just read them.
Benoit Jacob
2008-06-20 15:26:39 +00:00
54238961d6
* added a pseudo expression Array giving access to: - matrix-scalar addition/subtraction operators, e.g.: m.array() += 0.5; - matrix/matrix comparison operators, e.g.: if (m1.array() < m2.array()) {} * fix compilation issues with Transform and gcc < 4.1
Gael Guennebaud
2008-06-20 12:38:03 +00:00
e735692e37
move "enum" back to "const int" in ei_assign_impl: in fact, casting enums to int is enough to get compile time constants with ICC.
Gael Guennebaud
2008-06-20 07:10:50 +00:00
fb4a151982
* more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform to compile with gcc 3.3/3.4, see the FIXME)
Gael Guennebaud
2008-06-19 23:00:51 +00:00
82c3cea1d5
* refactoring of Product: * use ProductReturnType<>::Type to get the correct Product xpr type * Product is no longer instantiated for xpr types which are evaluated * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix) * some cleaning * removed ArrayBase
Gael Guennebaud
2008-06-19 17:33:57 +00:00
5dbfed1902
fix two bugs discovered by the previous commit.
Gael Guennebaud
2008-06-16 16:39:58 +00:00
bb1f4e44f1
* Block: row and column expressions in the inner direction now have the Like1D flag.
Benoit Jacob
2008-06-16 14:54:31 +00:00
9857764ae7
aaargh.
Benoit Jacob
2008-06-16 11:20:29 +00:00
478bfaf228
fix bug in computation of unrolling limit: div instead of mul
Benoit Jacob
2008-06-16 11:18:59 +00:00
c905b31b42
* Big rework of Assign.h: ** Much better organization ** Fix a few bugs ** Add the ability to unroll only the inner loop ** Add an unrolled path to the Like1D vectorization. Not well tested. ** Add placeholder for sliced vectorization. Unimplemented.
Benoit Jacob
2008-06-16 10:49:44 +00:00
bc0c7c57ed
Added an extensible mechanism to support any kind of rotation representation in Transform via the template static class ToRotationMatrix. Added a lightweight AngleAxis class (similar to Rotation2D).
Gael Guennebaud
2008-06-15 17:22:41 +00:00
0ee6b08128
* split Product to a DiagonalProduct template specialization to optimize matrix-diag and diag-matrix products without making Product overcomplicated. * compilation fixes in Tridiagonalization and HessenbergDecomposition in the case of 2x2 matrices. * added a small Orientation2D class with an interface similar to Quaternion's (used by Transform to handle 2D and 3D orientations seamlessly) * added a couple of features in Transform.
Gael Guennebaud
2008-06-15 11:54:18 +00:00