Commit Graph

500 Commits

Author SHA1 Message Date
Gael Guennebaud
b3151bca40 Implement pmadd for float and double to make it consistent with the vectorized path when FMA is available. 2016-08-23 14:24:08 +02:00
Gael Guennebaud
a4c266f827 Factorize the 4 copies of tanh implementations, make numext::tanh consistent with array::tanh, enable fast tanh in fast-math mode only. 2016-08-23 14:23:08 +02:00
Gael Guennebaud
d937a420a2 Fix compilation with MSVC by using our portable numext::log1p implementation. 2016-08-22 15:44:21 +02:00
Gael Guennebaud
2d5731e40a bug #1270: bypass custom asm for pmadd and recent clang version 2016-08-22 15:38:03 +02:00
Igor Babuschkin
59bacfe520 Fix compilation on CUDA 8 by removing call to h2log1p 2016-08-15 23:38:05 +01:00
Igor Babuschkin
aee693ac52 Add log1p support for CUDA and half floats 2016-08-08 20:24:59 +01:00
Benoit Steiner
fe778427f2 Fixed the constructors of the new half_base class. 2016-08-04 18:32:26 -07:00
Benoit Steiner
9506343349 Fixed the isnan, isfinite and isinf operations on GPU 2016-08-04 17:25:53 -07:00
Gael Guennebaud
17b9a55d98 Move Eigen::half_impl::half to Eigen::half while keeping the free functions in the Eigen::half_impl namespace, where they remain reachable through ADL 2016-08-04 00:00:43 +02:00
Benoit Steiner
02fe89f5ef half implementation has been moved to half_impl namespace 2016-07-29 15:09:34 -07:00
Christoph Hertzberg
c5b893f434 bug #1266: half implementation has been moved to half_impl namespace 2016-07-29 18:36:08 +02:00
Gael Guennebaud
395c835f4b Fix CUDA compilation 2016-07-22 15:30:24 +02:00
Gael Guennebaud
47afc9a365 More cleaning in half:
- put its definition and functions in its own half_impl namespace so that the free functions do not pollute the Eigen namespace while still being visible for half through ADL.
 - expose Eigen::half through a using statement
 - move operator<< from std to half_float namespace
2016-07-22 14:33:28 +02:00
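The namespace layout described in the commit above can be illustrated with a minimal sketch. It is not Eigen's code: the `demo` namespace, the float-backed `half` struct, and the `describe` helper are hypothetical stand-ins, chosen only to show how free functions kept in an inner namespace stay reachable through argument-dependent lookup (ADL) while only the type is exported with a using-declaration.

```cpp
#include <sstream>
#include <string>

namespace demo {          // hypothetical stand-in for the library namespace
namespace half_impl {     // inner namespace holding the type and its helpers

// Simplified placeholder type: stores a float instead of 16 bits.
struct half {
  float value;
};

// Free functions are declared next to the type, so ADL can find them.
inline bool isnan(const half& h) { return h.value != h.value; }

inline half operator+(const half& a, const half& b) {
  return half{a.value + b.value};
}

inline std::ostream& operator<<(std::ostream& os, const half& h) {
  return os << h.value;
}

}  // namespace half_impl

// Only the type is exported; demo::isnan does not exist.
using half_impl::half;

}  // namespace demo

// Code outside both namespaces: the unqualified calls to operator<< and
// isnan resolve through ADL, because the namespace associated with
// demo::half is still demo::half_impl.
inline std::string describe(const demo::half& h) {
  std::ostringstream os;
  os << h;
  if (isnan(h)) os << " (nan)";
  return os.str();
}
```

With this layout, `describe(demo::half{1.5f} + demo::half{2.0f})` compiles without any using-directive at the call site, and the outer namespace gains one name (`half`) rather than every helper function.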
Gael Guennebaud
0f350a8b7e Fix CUDA compilation 2016-07-21 18:47:07 +02:00
Gael Guennebaud
87fbda812f Add missing log10 and random generator for half. 2016-07-21 15:46:45 +02:00
Gael Guennebaud
01d12d3e82 Some cleanup in Half: standard functions should be defined in the namespace of the class half to make ADL work, and thus the global is* functions can be removed. 2016-07-21 15:10:48 +02:00
Gael Guennebaud
a96a7ce3f7 Move CUDA's special functions to SpecialFunctions module. 2016-07-11 18:39:11 +02:00
Gael Guennebaud
fd60966310 merge 2016-07-11 18:11:47 +02:00
Konstantinos Margaritis
ef05463fcf Merged kmargar/eigen/tip into default, Altivec/VSX port should be working ok now. 2016-07-10 16:11:46 +03:00
Konstantinos Margaritis
9f7caa7e7d minor fixes for big endian altivec/vsx 2016-07-10 07:05:10 -03:00
Gael Guennebaud
2f7e2614e7 bug #1232: refactor special functions as a new SpecialFunctions module, currently in unsupported/. 2016-07-08 11:13:55 +02:00
Benoit Jacob
328c5d876a Undo changes in AltiVec --- I don't have any way to test there. 2016-06-28 11:15:25 -04:00
Benoit Jacob
38fb606052 Avoid global variables with static constructors in NEON/Complex.h 2016-06-28 11:12:49 -04:00
Konstantinos Margaritis
be107e387b fix compilation with clang 3.9, fix performance with pset1, use vector operators instead of intrinsics in some cases 2016-06-23 10:19:05 -03:00
Konstantinos Margaritis
8c34b5a0e3 mostly cleanups and modernizing code 2016-06-19 16:13:17 -03:00
Konstantinos Margaritis
b410d46482 mostly cleanups and modernizing code 2016-06-19 16:12:52 -03:00
Konstantinos Margaritis
b80379bda0 fixed pexp<Packet2d>, was failing tests 2016-06-19 16:11:58 -03:00
Gael Guennebaud
0028049380 bug #1240: Remove any assumption on NEON vector types. 2016-06-09 23:08:11 +02:00
Sean Templeton
bd21243821 Fix compile errors initializing packets on ARM DS-5 5.20
The ARM DS-5 5.20 compiler fails compiling with the following errors:

"src/Core/arch/NEON/PacketMath.h", line 113: Error:  #146: too many initializer values
    Packet4f countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
                         ^
"src/Core/arch/NEON/PacketMath.h", line 118: Error:  #146: too many initializer values
    Packet4i countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
                         ^
"src/Core/arch/NEON/Complex.h", line 30: Error:  #146: too many initializer values
  static uint32x4_t p4ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET4(0x00000000, 0x80000000, 0x00000000, 0x80000000);
                                    ^
"src/Core/arch/NEON/Complex.h", line 31: Error:  #146: too many initializer values
  static uint32x2_t p2ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET2(0x00000000, 0x80000000);
                                    ^

The vectors are implemented as two doubles, hence the too many initializer values error.
Changed the code to use intrinsic load functions which all compilers
implementing NEON should have.
2016-06-03 10:51:35 -05:00
Benoit Steiner
8fd57a97f2 Enable the vectorization of adds and mults of fp16 2016-06-07 18:22:18 -07:00
Eugene Brevdo
39baff850c Add TernaryFunctors and the betainc SpecialFunction.
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.

Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.

Added unit tests in array.cpp and cxx11_tensor_cuda.cu


Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc().  Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe.  and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Benoit Steiner
b6e306f189 Improved support for CUDA 8.0 2016-05-31 09:47:59 -07:00
Benoit Steiner
3a5d6a3c38 Disable the use of MMX instructions since the code is broken on many platforms 2016-05-27 09:13:26 -07:00
Benoit Steiner
094f4a56c8 Deleted extra namespace 2016-05-26 14:49:51 -07:00
Gael Guennebaud
7ff5fadcc0 Disable usage of MMX with msvc. 2016-05-26 17:58:46 +02:00
Gael Guennebaud
cc1ab64f29 Add missing inclusion of mmintrin.h 2016-05-26 09:51:50 +02:00
Benoit Steiner
3585ff585e Silenced a compilation warning 2016-05-25 22:09:19 -07:00
Benoit Steiner
efeb89dcdb Specify the rounding mode in the correct location 2016-05-25 17:53:24 -07:00
Benoit Steiner
0322c66a3f Explicitly specify the rounding mode when converting floats to fp16 2016-05-25 15:56:15 -07:00
Benoit Steiner
ed783872ab Disable the use of MMX instructions on x86_64 since too many compilers only support them in 32bit mode 2016-05-25 08:27:26 -07:00
Gael Guennebaud
bbf9109e25 Fix compilation with ICC. 2016-05-25 10:00:55 +02:00
Benoit Steiner
d041a528da Cleaned up the fp16 code a little more 2016-05-24 22:43:26 -07:00
Benoit Steiner
ff4a289572 Cleaned up the fp16 code 2016-05-24 18:50:09 -07:00
Benoit Jacob
40a16282c7 Remove now-unused protate PacketMath func 2016-05-24 11:01:18 -04:00
Benoit Steiner
e617711306 Don't attempt to use MMX instructions with Visual Studio since they're only partially supported. 2016-05-24 06:43:58 -07:00
Benoit Steiner
334e76537f Worked around missing clang intrinsic 2016-05-24 00:29:28 -07:00
Benoit Steiner
b517ab349b Use the generic ploadquad intrinsics since it does the job 2016-05-24 00:11:17 -07:00
Benoit Steiner
646872cb3b Worked around missing clang intrinsics 2016-05-24 00:07:08 -07:00
Benoit Steiner
3dfc391a61 Added missing EIGEN_DEVICE_FUNC qualifier 2016-05-23 20:56:59 -07:00
Benoit Steiner
33a94f5dc7 Use the Index type instead of integers to specify the strides in pgather/pscatter 2016-05-23 20:37:30 -07:00