Commit Graph

  • 88bb2087c1 New implementation of Swap as discussed, reusing Assign. Makes LU run 10% faster overall. Benoit Jacob 2008-08-05 21:55:57 +00:00
  • c94be35bc8 introduce copyCoeff and copyPacket methods in MatrixBase, used by Assign, in preparation for new Swap impl reusing Assign code. remove last remnant of old Inverse class in Transform. Benoit Jacob 2008-08-05 18:00:23 +00:00
  • 09ef7db9d9 Add partial pivoting runtime option to LU. Benoit Jacob 2008-08-05 15:43:11 +00:00
  • e741b7beca further big perf improvement in Inverse Benoit Jacob 2008-08-04 23:47:09 +00:00
  • 79a0feee68 big performance improvement in inverse and LU Benoit Jacob 2008-08-04 23:34:21 +00:00
  • a7a05382d1 Add a LU decomposition action in BTL and various cleaning in BTL. For instance all per plot settings have been moved to a single file, go_mean now takes an optional second argument "tiny" to generate plots for tiny matrices, and output of comparison information wrt to previous benchs (if any). Gael Guennebaud 2008-08-04 23:12:48 +00:00
  • c2f8ecf466 * LU decomposition, supporting all rectangular matrices, with full pivoting for better numerical stability. For now the only application is determinant. * New determinant unit-test. * Disable most of Swap.h for now as it makes LU fail (mysterious). Anyway Swap needs a big overhaul as proposed on IRC. * Remnants of old class Inverse removed. * Some warnings fixed. Benoit Jacob 2008-08-04 04:45:59 +00:00
  • f81dfcf00b fix two perf issues in product. fix positive definite test in Cholesky. remove #include <cstring> in CoreDeclaration. Gael Guennebaud 2008-08-03 20:23:06 +00:00
  • 49ae3fca89 fix compile errors with gcc 4.3: unresolved func call to ei_cache_friendly_product, and undeclared memcpy Benoit Jacob 2008-08-03 15:44:06 +00:00
  • 6d11a07e5e Added an ei_palign function to align a packet from two others. This allows much faster code when dealing with unaligned data, as in the updated matrix-vector product functions. Gael Guennebaud 2008-08-03 15:15:46 +00:00
  • 55aeb1f83a Optimizations: * faster matrix-matrix and matrix-vector products (especially for not aligned cases) * faster tridiagonalization (make it using our matrix-vector impl.) Others: * fix Flags of Map * split the test_product to two smaller ones Gael Guennebaud 2008-08-01 23:44:59 +00:00
  • b32b186c14 removed the packet specializations of some functors (GCC generates better code without those "optimizations") Gael Guennebaud 2008-07-31 21:03:11 +00:00
  • 842c4f8bfa Several compilation fixes for MSVC and NVCC, basically: - added explicit enum to int conversion where needed - if a function is not defined as declared and the return type is "tricky" then the type must be typedefined somewhere. A "tricky return type" can be: * a template class with a default parameter which depends on another template parameter * a nested template class, or type of a nested template class Gael Guennebaud 2008-07-29 16:33:07 +00:00
  • e0215ee510 BTL: - added tridiagonalization and hessenberg decomposition bench - added GOTO library Gael Guennebaud 2008-07-28 20:48:21 +00:00
  • 44d95e0540 fix some internal asserts in CacheFriendlyProduct Gael Guennebaud 2008-07-27 22:14:08 +00:00
  • 02a7efa910 forgot to include this file in previous commit Gael Guennebaud 2008-07-27 14:24:32 +00:00
  • 93115619c2 * updated benchmark files according to recent renamings * various improvements in BTL including trisolver and cholesky bench Gael Guennebaud 2008-07-27 11:39:47 +00:00
  • e9e5261664 Fix a couple issues introduced in the previous commit: * removed DirectAccessBit from Part * use a template specialization in inverseProduct() to transform a Part xpr to a Flagged xpr Gael Guennebaud 2008-07-26 23:05:44 +00:00
  • e77ccf2928 * Rewrite the triangular solver so that we can take advantage of our efficient matrix-vector products: => up to 6 times faster ! * Added DirectAccessBit to Part * Added an example of a cwise operator * Renamed perpendicular() => someOrthogonal() (geometry module) * Fix a weird bug in ei_constant_functor: the default copy constructor did not copy the imaginary part when the single member of the class is a complex... Gael Guennebaud 2008-07-26 20:40:29 +00:00
  • 2940617e6f bugfix in some internal asserts of CacheFriendlyProduct Gael Guennebaud 2008-07-26 12:26:27 +00:00
  • f997a3e902 update the inverse test a little make use of static asserts in Map fix 2 warnings in CacheFriendlyProduct: unused var 'Vectorized' Benoit Jacob 2008-07-26 12:08:28 +00:00
  • b466c266a0 * Fix some complex alignment issues in the cache friendly matrix-vector products. * Minor update of the cores of the Cholesky algorithms to make them more friendly wrt to matrix-vector products => speedup x5 ! Gael Guennebaud 2008-07-23 17:30:00 +00:00
  • 172000aaeb Add .perpendicular() function in Geometry module (adapted from Eigen1) Documentation: * add an overview for each module. * add an example for .all() and Cwise::operator< Gael Guennebaud 2008-07-22 10:54:42 +00:00
  • 516db2c3b9 Fix compilation issues with icc and g++ < 4.1. Those include: - conflicts with operator * overloads - discard the use of ei_pdiv for integer (g++ handles operators on __m128* types, this is why it worked) - weird behavior of icc in fixed size Block() constructor complaining the initializer of m_blockRows and m_blockCols were missing while we are in fixed size (maybe this hides a deeper problem since this is a recent one, but icc gives only little feedback) Gael Guennebaud 2008-07-21 12:40:56 +00:00
  • c10f069b6b * Merge Extract and Part to the Part expression. Renamed "MatrixBase::extract() const" to "MatrixBase::part() const" * Renamed static functions identity, zero, ones, random with an upper case first letter: Identity, Zero, Ones and Random. Gael Guennebaud 2008-07-21 00:34:46 +00:00
  • ce425d92f1 Various documentation improvements, in particular in Cholesky and Geometry module. Added doxygen groups for Matrix typedefs and the Geometry module Gael Guennebaud 2008-07-20 15:18:54 +00:00
  • 269f683902 Add cholesky's members to MatrixBase Various documentation improvements including new snippets (AngleAxis and Cholesky) Gael Guennebaud 2008-07-19 22:59:05 +00:00
  • 6e2c53e056 Added an automatically generated list of selected examples in the documentation. Added the custom gemetry_module tag, and use it. Gael Guennebaud 2008-07-19 20:36:41 +00:00
  • 05ad083467 Added MatrixBase::Unit*() static function to easily create unit/basis vectors. Removed EulerAngles, added typedefs for Quaternion and AngleAxis, and added automatic conversions from Quaternion/AngleAxis to Matrix3 such that: Matrix3f m = AngleAxisf(0.2,Vector3f::UnitX) * AngleAxisf(0.2,Vector3f::UnitY); just works. Gael Guennebaud 2008-07-19 13:03:23 +00:00
  • 7245c63067 Complete rewrite of partial reduction according to mailing list discussions. Gael Guennebaud 2008-07-19 11:36:32 +00:00
  • 8b4945a5a2 add some static asserts, use them, fix gcc 4.3 warning in Product.h. Benoit Jacob 2008-07-19 00:25:41 +00:00
  • 22a816ade8 * Fix a couple of issues related to the recent cache friendly products * Improve the efficiency of matrix*vector in unaligned cases * Trivial fixes in the destructors of MatrixStorage * Removed the matrixNorm in test/product.cpp (twice faster and that assumed the matrix product was ok while checking that !!) Gael Guennebaud 2008-07-19 00:09:01 +00:00
  • 62ec1dd616 * big rework of Inverse.h: - remove all invertibility checking, will be redundant with LU - general case: adapt to matrix storage order for better perf - size 4 case: handle corner cases without falling back to gen case. - rationalize with selectors instead of compile time if - add C-style computeInverse() * update inverse test. * in snippets, default cout precision to 3 decimal places * add some cmake module from kdelibs to support btl with cmake 2.4 Benoit Jacob 2008-07-15 23:56:17 +00:00
  • b970a9c8aa trivial fix in EulerAngles constructor Gael Guennebaud 2008-07-15 22:42:55 +00:00
  • c8cbc1665e enhancements of the plot generator: - removed the ugly X11 and PNG gnuplots terminals - use enhanced postscript terminal - use imagemagick to generate the png files (with compression) - disable the fortran impl by default since it is as meaningless as a "C impl" - update line settings Gael Guennebaud 2008-07-13 11:46:36 +00:00
  • 99a625243f Optimization: added super efficient rowmajor * vector product (and vector * colmajor). It basically performs 4 dot products at once reducing loads of the vector and improving instructions scheduling. With 3 cache friendly algorithms, we now handle all product configurations with outstanding perf for large matrices. Gael Guennebaud 2008-07-13 01:22:54 +00:00
  • 51e6ee39f0 SVN_SILENT trivial fix Benoit Jacob 2008-07-12 23:42:19 +00:00
  • bd0183f850 fix a cmake issue in FindTvmet and FindMKL Gael Guennebaud 2008-07-12 23:34:42 +00:00
  • e979e6485f another occurrence of that little cmake fix Benoit Jacob 2008-07-12 23:27:41 +00:00
  • 861d18d553 * Optimization: added a specialization of Block for xpr with DirectAccessBit * some simplifications and fixes in cache friendly products Gael Guennebaud 2008-07-12 22:59:34 +00:00
  • 1bbaea9885 little cmake fix Benoit Jacob 2008-07-12 22:13:03 +00:00
  • 10c4e36b39 disable MKL check and fortran for cmake <2.6 Gael Guennebaud 2008-07-12 21:54:02 +00:00
  • ed6e07b2f6 various improvements of the plot generator in BTL Gael Guennebaud 2008-07-12 21:41:32 +00:00
  • 8233de8b69 various minor updates in the benchmark suite like non inlining of some functions as well as the experimental C code used to design efficient eigen's matrix vector products. Gael Guennebaud 2008-07-12 12:14:08 +00:00
  • b7bd1b3446 Add a *very efficient* evaluation path for both col-major matrix * vector and vector * row-major products. Currently, it is enabled only if the matrix has DirectAccessBit flag and the product is "large enough". Added the respective unit tests in test/product.cpp. Gael Guennebaud 2008-07-12 12:12:02 +00:00
  • 6f71ef8277 resurrected tvmet, added mt4, intel's MKL and handcoded vectorized backends in the benchmark suite Gael Guennebaud 2008-07-10 18:28:50 +00:00
  • 2b53fd4d53 some performance fixes in Assign.h reported by Gael. Some doc update in Cwise. Benoit Jacob 2008-07-10 16:15:55 +00:00
  • 7b4c6b8862 in BTL: a specific bench/action can be selected at runtime, e.g.: BTL_CONFIG="-a ata" ctest -V -R eigen run the all benchmarks having "ata" in their name for all libraries matching the regexp "eigen" Gael Guennebaud 2008-07-09 22:35:11 +00:00
  • c9b046d5d5 * added optimized paths for matrix-vector and vector-matrix products (using either a cache friendly strategy or re-using dot-product vectorized implementation) * add LinearAccessBit to Transpose Gael Guennebaud 2008-07-09 22:30:18 +00:00
  • 25904802bc raah, results were corrupted by overflow. Now slice vectorization is about a +25% speedup which is still nice as i expected zero or even negative benefit. Benoit Jacob 2008-07-09 16:46:26 +00:00
  • 8f21a5e862 add benchmark for slice vectorization... expected it to be little or zero benefit... turns out to be 20x speedup. Something is wrong. Benoit Jacob 2008-07-09 16:43:11 +00:00
  • 28539e7597 imported a reworked version of BTL (Benchmark for Templated Libraries). the modifications to initial code follow: * changed build system from plain makefiles to cmake * added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces * added "transposed matrix * vector" product action * updated blitz interface to use condensed products instead of hand coded loops * removed some deprecated interfaces * changed default storage order to column major for all libraries * new generic bench timer strategy which is supposed to be more accurate * various code clean-up Gael Guennebaud 2008-07-09 14:04:48 +00:00
  • 5f55ab524c * added a lazyAssign overload skipping .lazy() such that c = (<xpr>).lazy() such that lazyAssign overloads of <xpr> are automatically called (this also reduces assign instantiations) Gael Guennebaud 2008-07-09 13:54:21 +00:00
  • 783eb6da9b I forgot that the previous commit needed minor changes outside the bench folder Gael Guennebaud 2008-07-08 17:25:58 +00:00
  • 77a622f2bb add Cholesky and eigensolver benchmark Gael Guennebaud 2008-07-08 17:20:17 +00:00
  • 6f09d3a67d - many updates after Cwise change - fix compilation in product.cpp with std::complex - fix bug in MatrixBase::operator!= Benoit Jacob 2008-07-08 07:56:01 +00:00
  • f5791eeb70 the big Array/Cwise rework as discussed on the mailing list. The new API can be seen in Eigen/src/Core/Cwise.h. Benoit Jacob 2008-07-08 00:49:10 +00:00
  • c910c517b3 fix issues in previously added additional product tests Gael Guennebaud 2008-07-06 19:02:03 +00:00
  • a9d319d44f * do the ActualPacketAccesBit change as discussed on list * add comment in Product.h about CanVectorizeInner * fix typo in test/product.cpp Benoit Jacob 2008-07-04 12:43:55 +00:00
  • 8463b7d3f4 * fix compilation issue in Product * added some tests for product and swap * overload .swap() for dynamic-sized matrix of same size Gael Guennebaud 2008-07-02 16:05:33 +00:00
  • 9433df83a7 * resurrected Flagged::_expression used to optimize m+=(a*b).lazy() (equivalent to the GEMM blas routine) * added a GEMM benchmark Gael Guennebaud 2008-07-01 16:20:06 +00:00
  • 95549007b3 * fix error in divergence test, now it is even faster * add comments in render() in case anyone ever reads that :P Benoit Jacob 2008-07-01 14:23:01 +00:00
  • a356ebd47d interleaved rendering balances the load better Benoit Jacob 2008-07-01 14:12:32 +00:00
  • 56d03f181e * multi-threaded rendering * increased number of iterations, with more iterations done before testing divergence. results in x2 speedup from vectorization. Benoit Jacob 2008-07-01 12:01:58 +00:00
  • cacf986a7f - use double precision to store the position / zoom / other stuff - some temporary fix to get a +50% improvement from vectorization until we have vectorisation for comparisons and redux Benoit Jacob 2008-06-30 07:33:08 +00:00
  • 37a50fa526 * added an in-place version of inverseProduct which might be twice as fast for small fixed size matrix * added a sparse triangular solver (sparse version of inverseProduct) * various other improvements in the Sparse module Gael Guennebaud 2008-06-29 21:29:12 +00:00
  • fbdecf09e1 fix little bug in computation of max_iter Benoit Jacob 2008-06-29 12:20:07 +00:00
  • 97a1038653 improve greatly mandelbrot demo: - much better coloring - determine max number of iterations and choice between float and double at runtime based on zoom level - do draft renderings with increasing resolution before final rendering Benoit Jacob 2008-06-29 12:04:00 +00:00
  • 027818d739 * added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations ! Gael Guennebaud 2008-06-28 23:07:14 +00:00
  • 6917be9113 add mandelbrot demo Benoit Jacob 2008-06-28 20:33:47 +00:00
  • 55e08f7102 fix breakage from my last commit Benoit Jacob 2008-06-28 17:15:16 +00:00
  • 844f69e4a9 * update CMakeLists, only build instantiations if TEST_LIB is defined * allow default Matrix constructor in dynamic size, defaulting to (1, 1), this is convenient in mandelbrot example. Benoit Jacob 2008-06-27 10:53:30 +00:00
  • 6de4871c8c fix a couple of issues in the new Map.h Benoit Jacob 2008-06-27 01:42:44 +00:00
  • e27b2b95cf * rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead. Benoit Jacob 2008-06-27 01:22:35 +00:00
  • e5d301dc96 various work on the Sparse module: * added some glue to Eigen/Core (SparseBit, ei_eval, Matrix) * add two new sparse matrix types: HashMatrix: based on std::map (for random writes) LinkedVectorMatrix: array of linked vectors (for outer coherent writes, e.g. to transpose a matrix) * add a SparseSetter class to easily set/update any kind of matrices, e.g.: { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix); for (...) wrapper->coeffRef(rand(),rand()) = rand(); } * automatic shallow copy for RValue * and a lot of mess ! plus: * remove the remaining ArrayBit related stuff * don't use alloca in product for very large memory allocation Gael Guennebaud 2008-06-26 23:22:26 +00:00
  • c5bd1703cb change derived classes methods from "private:_method()" to "public:method()" i.e. reimplementing the generic method() from MatrixBase. improves compilation speed by 7%, reduces almost by half the call depth of trivial functions, making gcc errors and application backtraces nicer... Benoit Jacob 2008-06-26 20:08:16 +00:00
  • 25ba9f377c * add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions's flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyn*dyn size * fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too. Benoit Jacob 2008-06-26 16:06:41 +00:00
  • 5b0da4b778 make use of ei_pmadd in dot-product: will further improve performance on architectures having a packed-mul-add assembly instruction. Benoit Jacob 2008-06-24 18:08:35 +00:00
  • 3b94436d2f * vectorize dot product, copying code from sum. * make the conj functor vectorizable: it is just identity in real case, and complex doesn't use the vectorized path anyway. * fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size) should not be vectorizable, since in fixed-size we are assuming the size to be a multiple of packet size. (Or would you prefer Vector3d to be flagged "packetaccess" even though no packet access is possible on vectors of that type?) * rename: isOrtho for vectors ---> isOrthogonal isOrtho for matrices ---> isUnitary * add normalize() * reimplement normalized with quotient1 functor Benoit Jacob 2008-06-24 15:13:00 +00:00
  • c9560df4a0 * add ei_pdiv intrinsic, make quotient functor vectorizable * add vdw benchmark from Tim's real-world use case Benoit Jacob 2008-06-23 22:00:18 +00:00
  • ac9aa47bbc optimize linear vectorization both in Assign and Sum (optimal amortized perf) Gael Guennebaud 2008-06-23 15:50:28 +00:00
  • ea1990ef3d add experimental code for sparse matrix: - uses the common "Compressed Column Storage" scheme - supports every unary and binary operators with xpr template assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse matrix does not make sense) - this is the first commit, so of course, there are still several shortcomings ! Gael Guennebaud 2008-06-23 13:25:22 +00:00
  • 03d19f3bae quick temporary fix for a perf issue we just identified with vectorization.... now the sum benchmark runs 3x faster with vectorization than without. Benoit Jacob 2008-06-23 11:23:05 +00:00
  • 32596c5e9e add benchmark for sum Benoit Jacob 2008-06-23 11:03:27 +00:00
  • dc9206cec5 split sum away from redux and vectorize it. (could come back to redux after it has been vectorized, and could serve as a starting point for that) also make the abs2 functor vectorizable (for real types). Benoit Jacob 2008-06-23 10:32:48 +00:00
  • 8a967fb17c * implement slice vectorization. Because it uses unaligned packet access, it is not certain that it will bring a performance improvement: benchmarking needed. * improve logic choosing slice vectorization. * fix typo in SSE packet math, causing crash in unaligned case. * fix bug in Product, causing crash in unaligned case. * add TEST_SSE3 CMake option. Benoit Jacob 2008-06-22 15:02:05 +00:00
  • 8cef541b5a forgot to add the unit test array.cpp Gael Guennebaud 2008-06-21 17:28:07 +00:00
  • 32c5ea388e work on rotations in the Geometry module: - conversions are done through constructors and operator= - added an EulerAngles class Gael Guennebaud 2008-06-21 15:01:49 +00:00
  • 574416b842 Override MatrixBase::eval() since matrices don't need to be evaluated, it is enough to just read them. Benoit Jacob 2008-06-20 15:26:39 +00:00
  • 54238961d6 * added a pseudo expression Array giving access to: - matrix-scalar addition/subtraction operators, e.g.: m.array() += 0.5; - matrix/matrix comparison operators, e.g.: if (m1.array() < m2.array()) {} * fix compilation issues with Transform and gcc < 4.1 Gael Guennebaud 2008-06-20 12:38:03 +00:00
  • e735692e37 move "enum" back to "const int" in ei_assign_impl: in fact, casting enums to int is enough to get compile time constants with ICC. Gael Guennebaud 2008-06-20 07:10:50 +00:00
  • fb4a151982 * more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform to compile with gcc 3.3/3.4, see the FIXME) Gael Guennebaud 2008-06-19 23:00:51 +00:00
  • 82c3cea1d5 * refactoring of Product: * use ProductReturnType<>::Type to get the correct Product xpr type * Product is no longer instantiated for xpr types which are evaluated * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix) * some cleaning * removed ArrayBase Gael Guennebaud 2008-06-19 17:33:57 +00:00
  • 5dbfed1902 fix two bugs discovered by the previous commit. Gael Guennebaud 2008-06-16 16:39:58 +00:00
  • bb1f4e44f1 * Block: row and column expressions in the inner direction now have the Like1D flag. Benoit Jacob 2008-06-16 14:54:31 +00:00
  • 9857764ae7 aaargh. Benoit Jacob 2008-06-16 11:20:29 +00:00
  • 478bfaf228 fix bug in computation of unrolling limit: div instead of mul Benoit Jacob 2008-06-16 11:18:59 +00:00
  • c905b31b42 * Big rework of Assign.h: ** Much better organization ** Fix a few bugs ** Add the ability to unroll only the inner loop ** Add an unrolled path to the Like1D vectorization. Not well tested. ** Add placeholder for sliced vectorization. Unimplemented. Benoit Jacob 2008-06-16 10:49:44 +00:00
  • bc0c7c57ed Added an extensible mechanism to support any kind of rotation representation in Transform via the template static class ToRotationMatrix. Added a lightweight AngleAxis class (similar to Rotation2D). Gael Guennebaud 2008-06-15 17:22:41 +00:00
  • 0ee6b08128 * split Product to a DiagonalProduct template specialization to optimize matrix-diag and diag-matrix products without making Product over complicated. * compilation fixes in Tridiagonalization and HessenbergDecomposition in the case of 2x2 matrices. * added an Orientation2D small class with similar interface than Quaternion (used by Transform to handle 2D and 3D orientations seamlessly) * added a couple of features in Transform. Gael Guennebaud 2008-06-15 11:54:18 +00:00