Gael Guennebaud
cca6c207f4
AVX512: implement faster ploadquad<Packet16f> thus speeding up GEMM
2019-02-21 17:18:28 +01:00
Gael Guennebaud
d85ae650bf
bug #1678 : workaround MSVC compilation issues with AVX512
2019-02-15 10:24:17 +01:00
Gael Guennebaud
eb4c6bb22d
Fix conflicts and merge
2019-01-30 15:57:08 +01:00
Christoph Hertzberg
5a52e35f9a
Renaming some more I identifiers
2019-01-26 13:18:21 +01:00
Gael Guennebaud
61b6eb05fe
AVX512 (r)sqrt(double) was mistakenly disabled with clang and others
2019-01-14 17:28:47 +01:00
Rasmus Munk Larsen
fcfced13ed
Rename pones -> ptrue. Use _CMP_TRUE_UQ where appropriate.
2019-01-09 17:20:33 -08:00
Rasmus Munk Larsen
8f04442526
Collapsed revision
...
* Collapsed revision
* Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
* Collapsed revision
* Simplify a bit.
* Undo useless diffs.
* Fix typo.
2019-01-09 16:34:23 -08:00
Rasmus Munk Larsen
cb955df9a6
Add packet up "pones". Write pnot(a) as pxor(pones(a), a).
2019-01-09 16:17:08 -08:00
Rasmus Larsen
cb3c059fa4
Merged eigen/eigen into default
2019-01-09 15:04:17 -08:00
Gael Guennebaud
47810cf5b7
Add dedicated implementations of predux_any for AVX512, NEON, and Altivec/VSE
2019-01-09 16:40:42 +01:00
Gael Guennebaud
aeec68f77b
Add missing pcmp_lt and others for AVX512
2019-01-09 15:36:41 +01:00
Rasmus Munk Larsen
055f0b73db
Add support for pcmp_eq and pnot, including for complex types.
2019-01-07 16:53:36 -08:00
Mark D Ryan
bc5dd4cafd
PR560: Fix the AVX512f only builds
...
Commit c53eececb0
introduced AVX512 support for complex numbers but required
avx512dq to build. Commit 1d683ae2f5
fixed some but not, it would seem all,
of the hard avx512dq dependencies. Build failures are still evident on
Eigen and TensorFlow when compiling with just avx512f and no avx512dq
using gcc 7.3. Looking at the code there does indeed seem to be a problem.
Commit c53eececb0
calls avx512dq intrinsics directly, e.g, _mm512_extractf32x8_ps
and _mm512_and_ps. This commit fixes the issue by replacing the direct
intrinsic calls with the various wrapper functions that are safe to use on
avx512f only builds.
2019-01-03 14:33:04 +01:00
Gael Guennebaud
60d3fe9a89
One more stupid AVX 512 fix (I don't have direct access to AVX512 machines)
2018-12-24 13:05:03 +01:00
Gael Guennebaud
4aa667b510
Add EIGEN_STRONG_INLINE where required
2018-12-24 10:45:01 +01:00
Gael Guennebaud
961ff567e8
Add missing pcmp_lt_or_nan for AVX512
2018-12-23 22:13:29 +01:00
Gustavo Lima Chaves
e763fcd09e
Introducing "vectorized" byte on unpacket_traits structs
...
This is a preparation to a change on gebp_traits, where a new template
argument will be introduced to dictate the packet size, so it won't be
bound to the current/max packet size only anymore.
By having packet types defined early on gebp_traits, one has now to
act on packet types, not scalars anymore, for the enum values defined
on that class. One approach for reaching the vectorizable/size
properties one needs there could be getting the packet's scalar again
with unpacket_traits<>, then the size/Vectorizable enum entries from
packet_traits<>. It turns out guards like "#ifndef
EIGEN_VECTORIZE_AVX512" at AVX/PacketMath.h will hide smaller packet
variations of packet_traits<> for some types (and it makes sense to
keep that). In other words, one can't go back to the scalar and create
a new PacketType, as this will always lead to the maximum packet type
for the architecture.
The less costly/invasive solution for that, thus, is to add the
vectorizable info on every unpacket_traits struct as well.
2018-12-19 14:24:44 -08:00
Gael Guennebaud
0a7e7af6fd
Properly set the number of registers for AVX512
2018-12-11 15:33:17 +01:00
Gael Guennebaud
cbf2f4b7a0
AVX512f includes FMA but GCC does not define __FMA__ with -mavx512f only
2018-12-06 18:21:56 +01:00
Gael Guennebaud
c53eececb0
Implement AVX512 vectorization of std::complex<float/double>
2018-12-06 15:58:06 +01:00
Gael Guennebaud
69ace742be
Several improvements regarding packet-bitwise operations:
...
- add unit tests
- optimize their AVX512f implementation
- add missing implementations (half, Packet4f, ...)
2018-11-30 15:56:08 +01:00
Gael Guennebaud
fa87f9d876
Add psin/pcos on AVX512 -> almost for free, at last!
2018-11-30 14:33:13 +01:00
Gael Guennebaud
f91500d303
Fix pandnot order in AVX512
2018-11-30 14:32:06 +01:00
Gael Guennebaud
43633fbaba
Fix warning with AVX512f
2018-10-11 10:13:48 +02:00
Gael Guennebaud
b3f66d29a5
Enable avx512 plog with clang
2018-10-11 10:12:21 +02:00
Gael Guennebaud
626942d9dd
fix alignment issue in ploaddup for AVX512
2018-09-28 16:57:32 +02:00
Gael Guennebaud
5a30eed17e
Fix warnings in AVX512
2018-09-20 16:58:51 +02:00
Christoph Hertzberg
ad4a08fb68
Use Intel cast intrinsics, since MSVC does not allow direct casting.
...
Reported by David Winkler.
2018-08-24 19:04:33 +02:00
Gael Guennebaud
7134fa7a2e
Fix compilation with MSVC by reverting to char* for _mm_prefetch except for PGI (the later being the one that has the wrong prototype).
2018-06-07 09:33:10 +02:00
Gael Guennebaud
584951ca4d
Rename predux_downto4 to be more accurate on its semantic.
2018-04-03 14:28:38 +02:00
Gael Guennebaud
6719409cd9
AVX512: add missing pinsertfirst and pinsertlast, implement pblend for Packet8d, fix compilation without AVX512DQ
2018-04-03 14:11:56 +02:00
Rasmus Munk Larsen
7b6aaa3440
Fix NaN propagation for AVX512.
2017-01-24 13:37:08 -08:00
Benoit Steiner
354baa0fb1
Avoid using horizontal adds since they're not very efficient.
2016-12-21 20:55:07 -08:00
Benoit Steiner
d7825b6707
Use native AVX512 types instead of Eigen Packets whenever possible.
2016-12-21 20:06:18 -08:00
Benoit Steiner
923acadfac
Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics
2016-12-19 13:02:27 -08:00
Benoit Steiner
507b661106
Renamed predux_half into predux_downto4
2016-10-06 17:57:04 -07:00
Benoit Steiner
a498ff7df6
Fixed incorrect comment
2016-10-06 15:27:27 -07:00
Benoit Steiner
a7473d6d5a
Fixed compilation error with gcc >= 5.3
2016-10-06 14:33:22 -07:00
Benoit Steiner
5e64cea896
Silenced a compilation warning
2016-10-06 14:24:17 -07:00
Benoit Steiner
cb5cd69872
Silenced a compilation warning.
2016-10-05 18:50:53 -07:00
Benoit Steiner
9c2b6c049b
Silenced a few compilation warnings
2016-10-05 18:37:31 -07:00
Benoit Steiner
fa5a8f055a
Implemented palign_impl for AVX512
2016-04-29 13:30:13 -07:00
Benoit Steiner
ef3ac9d05a
Fixed the AVX512 packet traits
2016-04-29 13:28:36 -07:00
Benoit Steiner
d7b75e8d86
Added pdiv packet primitives for avx512
2016-04-29 13:26:47 -07:00
Benoit Steiner
5e89ded685
Implemented preduxp for AVX512
2016-04-29 13:00:33 -07:00
Benoit Steiner
5f85662ad8
Implemented the pabs and preverse primitives for avx512.
2016-04-29 12:53:34 -07:00
Benoit Steiner
d37ee89ca8
Disabled some of the AVX512 primitives on compilers that don't support them
2016-04-29 12:50:29 -07:00
Benoit Steiner
8bfe739cd2
Updated the AVX512 PacketMath to properly leverage the AVX512DQ instructions
2016-04-11 18:40:16 -07:00
Benoit Steiner
3ca1ae2bb7
Commented out the version of pexp<Packet8d> since it fails to compile with gcc 5.3
2016-02-04 13:49:06 -08:00
Benoit Steiner
6c9cf117c1
Fixed indentation
2016-02-04 10:34:10 -08:00