Gael Guennebaud
|
ee06f78679
|
Introduce unified macros to identify compiler, OS, and architecture. They are all defined in util/Macros.h and prefixed with EIGEN_COMP_, EIGEN_OS_, and EIGEN_ARCH_ respectively.
|
2014-11-04 21:58:52 +01:00 |
|
Konstantinos Margaritis
|
fcb3573d17
|
Merged eigen/eigen into default
|
2014-10-22 10:42:18 +03:00 |
|
Konstantinos Margaritis
|
fae4fd7a26
|
Added ARMv8 support
|
2014-10-22 07:39:49 +00:00 |
|
Konstantinos Margaritis
|
b508619392
|
working 64-bit support in PacketMath.h, Complex.h needed
|
2014-10-21 18:10:33 +00:00 |
|
Christoph Hertzberg
|
84aaa03182
|
Addendum to bug #859: pexp(NaN) for double did not return NaN, also, plog(NaN) did not return NaN.
psqrt(NaN) and psqrt(-1) shall return NaN if EIGEN_FAST_MATH==0
|
2014-10-20 13:13:43 +02:00 |
|
Gael Guennebaud
|
aa5f79206f
|
Fix bug #859: pexp(NaN) returned Inf instead of NaN
|
2014-10-20 11:38:51 +02:00 |
|
Konstantinos Margaritis
|
9d3c69952b
|
fixed to make big-endian VSX work as well
|
2014-10-01 09:43:56 +00:00 |
|
Konstantinos Margaritis
|
de38ff2499
|
prefetches are no-ops on VSX, so actually disable the prefetch trait
|
2014-09-21 11:56:07 +00:00 |
|
Konstantinos Margaritis
|
60e093a9dc
|
Merged eigen/eigen into default
|
2014-09-21 14:02:51 +03:00 |
|
Konstantinos Margaritis
|
56408504e4
|
fix compile error on big endian altivec
|
2014-09-21 13:59:30 +03:00 |
|
Konstantinos Margaritis
|
974fe38ca3
|
prefetch are noops on VSX
|
2014-09-21 11:24:30 +00:00 |
|
Konstantinos Margaritis
|
c0205ca4af
|
VSX supports vec_div, implement where appropriate (float/doubles)
|
2014-09-21 08:12:22 +00:00 |
|
Konstantinos Margaritis
|
10f8aabb61
|
VSX port passes packetmath_[1-5] tests!
|
2014-09-20 22:31:31 +00:00 |
|
Konstantinos Margaritis
|
60663a510a
|
32-bit floats/ints, 64-bit doubles pass packetmath tests, complex 32/64-bit remaining
|
2014-09-19 21:05:01 +00:00 |
|
Konstantinos Margaritis
|
470aa15c35
|
First time it compiles, but fails to pass the tests.
|
2014-09-09 16:58:48 +00:00 |
|
Konstantinos Margaritis
|
7ff266e3ce
|
Initial VSX commit
|
2014-08-29 20:03:49 +00:00 |
|
Jitse Niesen
|
25bceefb4e
|
Replace asm by __asm__ (bug #873)
|
2014-09-06 11:47:24 +01:00 |
|
Gael Guennebaud
|
0369db12af
|
bug #871: fix compilation on ARM/Neon regarding __has_builtin usage
|
2014-09-01 10:52:58 +02:00 |
|
Konstantinos Margaritis
|
2c625ec9ba
|
Simplification of some Altivec constants: reuse existing constants and avoid loading from RAM, especially in the case of p16uc_COMPLEX_TRANSPOSE*
|
2014-07-22 20:46:03 +00:00 |
|
Konstantinos Margaritis
|
0a945687b7
|
Added HasDiv=1 to Altivec PacketMath.h, now vectorization_logic test passes.
Added comments to the constants, indicative of the actual values
|
2014-07-15 11:02:51 +00:00 |
|
Christoph Hertzberg
|
d1460d9278
|
stride must be DenseIndex not int
|
2014-07-10 16:23:20 +02:00 |
|
Gael Guennebaud
|
b47ef1431f
|
Fix many long to int implicit conversions
|
2014-07-08 16:47:11 +02:00 |
|
Gael Guennebaud
|
d67aa1549b
|
Add missing add_subdirectory directive
|
2014-05-03 10:46:11 +02:00 |
|
Gael Guennebaud
|
450d0c3de0
|
Make sure that calls to broadcast4 are 16 bytes aligned
|
2014-04-25 22:25:48 +02:00 |
|
Gael Guennebaud
|
2dbfd83424
|
Implement pbroadcast4 on altivec
|
2014-04-25 02:46:57 -07:00 |
|
Gael Guennebaud
|
4def7b1fa5
|
Fix ptranspose overload prototypes for NEON
|
2014-04-25 11:15:13 +02:00 |
|
Gael Guennebaud
|
3d8d0f6269
|
Enable vectorization of pack_rhs with a column-major RHS.
Rename and generalize Kernel<*> to PacketBlock<*,N>.
|
2014-04-25 10:56:18 +02:00 |
|
Gael Guennebaud
|
b0e19db1cf
|
Enable fused madd for Altivec
|
2014-04-24 23:17:18 +02:00 |
|
Gael Guennebaud
|
8d85ce88e1
|
Implement ptranspose on altivec and fix pgather/pscatter
|
2014-04-24 05:47:53 -07:00 |
|
Benoit Steiner
|
4eb92e5647
|
Fixed the NEON implementation of predux_max<Packet4i>.
|
2014-04-23 18:23:07 -07:00 |
|
Benoit Steiner
|
ccb4dec719
|
Created a NEON version of the ptranspose packet primitives
|
2014-04-23 18:22:10 -07:00 |
|
Gael Guennebaud
|
82b09fcb91
|
Add Altivec implementation of pgather/pscatter (not tested)
|
2014-04-23 13:09:26 +02:00 |
|
Gael Guennebaud
|
934ce93886
|
merge with default branch
|
2014-04-22 17:00:38 +02:00 |
|
Gael Guennebaud
|
5c5231ab71
|
Work around gcc's default ABI not being able to distinguish between vector types of different sizes.
|
2014-04-22 16:03:19 +02:00 |
|
Gael Guennebaud
|
1388f4f9fd
|
Fix typo (was working with clang!)
|
2014-04-18 11:43:13 +02:00 |
|
Gael Guennebaud
|
2c3c95990d
|
merge
|
2014-04-17 22:50:49 +02:00 |
|
Benoit Steiner
|
6d6df90c9a
|
Implemented the pgather/pscatter packet primitives for the arm/NEON architecture
|
2014-04-17 12:28:01 -07:00 |
|
Gael Guennebaud
|
9746396d1b
|
Optimize AVX pset1 for complexes and ploaddup
|
2014-04-17 20:51:04 +02:00 |
|
Gael Guennebaud
|
0fa8290366
|
Optimize ploaddup for AVX
|
2014-04-17 16:02:27 +02:00 |
|
Gael Guennebaud
|
d5a795f673
|
New gebp kernel handling up to 3 packets x 4 register-level blocks. Huge speedup on Haswell.
This changeset also introduces new vector functions: ploadquad and predux4.
|
2014-04-16 17:05:11 +02:00 |
|
Benoit Steiner
|
feaf7c7e6d
|
Optimized SSE unaligned loads and stores when compiling a 64-bit target with a recent version of gcc (i.e. gcc 4.8).
|
2014-04-14 10:44:17 -07:00 |
|
Benoit Steiner
|
8044b00a7f
|
bug #782: Workaround for gcc <= 4.4 compilation error on the NEON PacketMath code.
|
2014-04-03 23:41:47 +02:00 |
|
Gael Guennebaud
|
1c0728043a
|
Workaround alignment warnings
|
2014-03-30 22:43:47 +02:00 |
|
Gael Guennebaud
|
10aa14592a
|
Add a mechanism to recursively access half-size packet types
|
2014-03-28 10:18:04 +01:00 |
|
Benoit Steiner
|
51e85c936d
|
Merged latest changes from parent.
|
2014-03-27 18:32:15 -07:00 |
|
Benoit Steiner
|
8a94cb3edd
|
Implemented the SSE version of the gather and scatter packet primitives.
|
2014-03-27 18:29:01 -07:00 |
|
Benoit Steiner
|
7f3162f707
|
Implemented the AVX version of the gather and scatter packet primitives.
|
2014-03-27 17:42:25 -07:00 |
|
Gael Guennebaud
|
58fe2fc2b2
|
enforce the use of vfmadd231ps for pmadd (gcc and clang stupidly generate the other fmadd variants plus some register moves...)
|
2014-03-27 23:38:50 +01:00 |
|
Benoit Steiner
|
c4902a3d01
|
Implemented the AVX version of the ptranspose packet primitive.
|
2014-03-27 09:34:51 -07:00 |
|
Gael Guennebaud
|
052aedd394
|
Implement pcplxflip, palign, predux and the like for AVX/complexes
|
2014-03-27 14:47:00 +01:00 |
|