eigen/Eigen/src/Core
Antonio Sanchez 45e67a6fda Use reinterpret_cast on GPU for bit_cast.
This seems to be the recommended approach for doing type punning in
CUDA. See for example
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/
(the latter puns a double to an `int2`).
The issue is that for CUDA, the `memcpy` is not elided, and ends up
being an expensive operation. We already have similar `reinterpret_cast`s across
the Eigen codebase for GPU (as does TensorFlow).
2021-10-20 21:34:40 +00:00
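A minimal sketch of the idea described in the commit message, not Eigen's actual code: the name `bit_cast_sketch` and the macro `SKETCH_DEVICE_FUNC` are hypothetical, introduced only for illustration. It assumes the split is made on `__CUDA_ARCH__`, which nvcc defines only while compiling device code, so the host keeps the portable `memcpy` and the device reinterprets the bits in place.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical macro: mark the function callable from host and device
// when compiled by nvcc, and expand to nothing for a plain C++ compiler.
#if defined(__CUDACC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Hypothetical helper: reinterpret the bits of `src` as a value of type Tgt.
template <typename Tgt, typename Src>
SKETCH_DEVICE_FUNC inline Tgt bit_cast_sketch(const Src& src) {
  static_assert(sizeof(Tgt) == sizeof(Src), "Tgt and Src must have the same size");
  static_assert(std::is_trivially_copyable<Src>::value, "Src must be trivially copyable");
  static_assert(std::is_trivially_copyable<Tgt>::value, "Tgt must be trivially copyable");
#if defined(__CUDA_ARCH__)
  // Device pass: per the commit message, memcpy is not elided here,
  // so the bits are read directly through a reinterpret_cast.
  return *reinterpret_cast<const Tgt*>(&src);
#else
  // Host pass: memcpy is the portable way to pun types, and host
  // compilers typically reduce it to a register move.
  Tgt tgt;
  std::memcpy(&tgt, &src, sizeof(Tgt));
  return tgt;
#endif
}
```

For example, `bit_cast_sketch<std::uint32_t>(1.0f)` returns the bit pattern of the float, and casting that value back with `bit_cast_sketch<float>` recovers `1.0f`. The `reinterpret_cast` branch is kept out of the host path because it is not sanctioned by the C++ aliasing rules; the sketch relies on it only where the commit message says the CUDA toolchain handles it as intended.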