Charles Schlosser
fdfdd4c96b
test suite: emit the function name when an IEEE test fails
libeigen/eigen!2114
2026-01-22 02:32:38 +00:00
Rasmus Munk Larsen
a7674b70d3
Improve packet op test coverage for IEEE special values.
libeigen/eigen!2075
Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
2025-11-12 22:19:50 +00:00
Rasmus Munk Larsen
ec93a6d098
Add a generic Eigen backend based on clang vector extensions
The goal of this MR is to implement a generic SIMD backend (packet ops) for Eigen that uses clang vector extensions instead of platform-dependent intrinsics. Ideally, this should make it possible to build Eigen and achieve reasonable speed on any platform that has a recent clang compiler, without having to write any inline assembly or intrinsics.
Caveats:
* The current implementation is a proof of concept and supports vectorization for float, double, int32_t, and int64_t using fixed-size 512-bit vectors (a somewhat arbitrary choice). I have not done much to tune this for speed yet.
* For now, there is no way to enable this other than setting -DEIGEN_VECTORIZE_GENERIC on the command line.
* This only compiles with newer versions of clang. I have tested that it compiles and all tests pass with clang 19.1.7.
https://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors
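A minimal sketch of the idea, not code from this MR: it declares a fixed-size 512-bit float vector with the `vector_size` attribute described in the linked clang documentation and builds a few packet-style helpers on top of it without any intrinsics. The names `Vec16f`, `load_packet`, `store_packet`, `madd_packet`, and `axpy` are hypothetical and chosen only for illustration; the actual backend's types and function names may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical packet type: 16 floats * 4 bytes = 64 bytes = 512 bits.
// The compiler lowers operations on it to whatever SIMD the target offers.
typedef float Vec16f __attribute__((vector_size(64)));
constexpr std::size_t kPacketSize = sizeof(Vec16f) / sizeof(float);

// Unaligned load: memcpy avoids alignment assumptions about src.
inline Vec16f load_packet(const float* src) {
  assert(src != nullptr);
  Vec16f p;
  std::memcpy(&p, src, sizeof(p));
  return p;
}

// Unaligned store.
inline void store_packet(float* dst, Vec16f p) {
  assert(dst != nullptr);
  std::memcpy(dst, &p, sizeof(p));
}

// Element-wise a * b + c, written with ordinary operators.
inline Vec16f madd_packet(Vec16f a, Vec16f b, Vec16f c) { return a * b + c; }

// y[i] = alpha * x[i] + y[i]: full packets first, then a scalar tail.
inline void axpy(float alpha, const float* x, float* y, std::size_t n) {
  float splat[kPacketSize];
  for (std::size_t k = 0; k < kPacketSize; ++k) splat[k] = alpha;
  const Vec16f valpha = load_packet(splat);

  std::size_t i = 0;
  for (; i + kPacketSize <= n; i += kPacketSize) {
    store_packet(y + i, madd_packet(valpha, load_packet(x + i), load_packet(y + i)));
  }
  for (; i < n; ++i) y[i] = alpha * x[i] + y[i];  // remaining elements
}
```

Calling `axpy(2.0f, x, y, n)` on two float arrays of length `n` updates `y` in place. The source contains no platform-specific intrinsics, which is the portability property the description above is aiming for.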
Closes #2998 and #2997
See merge request libeigen/eigen!2051
Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
Co-authored-by: Antonio Sánchez <cantonios@google.com>
2025-11-06 21:52:19 +00:00
Rasmus Munk Larsen
b6fcddccfc
Get rid of pblend packet op.
There was only a single code path left in TensorEvaluator using pblend. We can replace that with a call to the more general TernarySelectOp and get rid of pblend entirely from Core.
Closes #2998
See merge request libeigen/eigen!2056
Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
2025-11-03 23:27:50 +00:00
Antonio Sánchez
2e8cc042a1
Replace calls to numext::fma with numext::madd.
2025-08-28 21:40:19 +00:00
Antonio Sánchez
db8bd5b825
Modify pselect and various masks to use Scalar(1) for true.
2025-06-20 22:40:46 +00:00
Antonio Sánchez
c458d68fae
Fix compile warning about * with bool.
2025-06-05 22:48:57 +00:00
Rasmus Munk Larsen
33f5f59614
Vectorize cbrt for float and double.
2025-04-17 23:31:20 +00:00
Antonio Sánchez
b860042263
Add postream for ostream-ing packets more reliably.
2025-04-01 22:12:00 +00:00
Antonio Sanchez
8e32cbf7da
Reduce flakiness of test for Eigen::half.
2025-03-23 22:31:25 -07:00
Antonio Sánchez
d935916ac6
Add numext::fma and missing pmadd implementations.
2025-03-23 01:05:53 +00:00
Antonio Sánchez
70f2aead9a
Use native _Float16 for AVX512FP16 and update vectorization.
2025-03-19 19:55:26 +00:00
Charles Schlosser
10e62ccd22
Fix x86 complex vectorized fma
2025-03-12 17:06:32 +00:00
Antonio Sánchez
d79bac0d3c
Fix boolean scatter and random generation for tensors.
2025-02-25 21:37:09 +00:00
Rasmus Munk Larsen
5064cb7d5e
Add test for using pcast on scalars.
2024-11-25 22:27:26 -08:00
Rasmus Munk Larsen
3f067c4850
Add exp2() as a packet op and array method.
2024-10-22 22:09:34 +00:00
Charles Schlosser
9d3d37c5b7
Complex NumTraits::HasSign and nmsub test
2024-08-28 03:02:47 +00:00
Charles Schlosser
fb95e90f7f
Add truncation op
2024-04-29 23:45:49 +00:00
Antonio Sánchez
a5e147305b
Fix undefined behavior for generating inputs to the predux_mul test.
2024-04-29 20:32:09 +00:00
Antonio Sánchez
dcceb9afec
Unbork avx512 predux_mul on MSVC.
2024-04-26 15:28:03 +00:00
Charles Schlosser
122befe54c
Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings
2024-04-12 19:35:04 +00:00
Antonio Sánchez
17f3bf8985
Fix pexp test for ARM.
2024-03-07 00:19:57 +00:00
Antonio Sánchez
3e8e63eb46
Fix packetmath plog test on Windows.
2024-03-06 23:51:47 +00:00
Antonio Sánchez
38fcedaf8e
Fix pexp complex test edge-cases.
2024-03-04 17:44:38 +00:00
Charles Schlosser
8a4118746e
fix exp complex test: use int instead of index
2024-02-17 03:55:32 +00:00
Charles Schlosser
18a161bf17
fix pexp_complex_test
2024-02-17 03:08:23 +00:00
Damiano Franzò
be06c9ad51
Implement float pexp_complex
2024-02-17 00:26:57 +00:00
Antonio Sánchez
f40ad38fda
Fix failure on ARM with latest compilers.
2024-02-14 23:00:56 +00:00
Antonio Sánchez
6ea33f95df
Eliminate warning about writing bytes directly to non-trivial type.
2024-02-12 23:27:48 +00:00
Antonio Sánchez
7b87b21910
Fix UB in bool packetmath test.
2024-02-09 19:46:45 +00:00
Antonio Sánchez
a9ddab3e06
Fix a bunch of ODR violations.
2024-01-30 22:38:43 +00:00
Damiano Franzò
7fd7a3f946
Implement plog_complex
2024-01-30 19:06:05 +00:00
Antonio Sánchez
46e9cdb7fe
Clang-format tests, examples, libraries, benchmarks, etc.
2023-12-05 21:22:55 +00:00
Charles Schlosser
81b48065ea
Fix arm32 float division and related bugs
2023-08-29 00:36:07 +00:00
Pedro Gonnet
17b5b4de58
Add `Packet4ui`, `Packet8ui`, and `Packet4ul` to the `SSE`/`AVX` `PacketMath.h` headers
2023-04-17 23:33:59 +00:00
Antonio Sánchez
394aabb0a3
Fix failing MSVC tests due to compiler bugs.
2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen
ce62177b5b
Vectorize atanh & add a missing definition and unit test for atan.
2023-02-21 03:14:05 +00:00
Antonio Sánchez
8588d8c74b
Correct pnegate for floating-point zero.
2022-11-15 18:07:23 +00:00
Rasmus Munk Larsen
97e0784dc6
Vectorize the sign operator in Eigen.
2022-08-09 19:54:57 +00:00
Antonio Sánchez
39d22ef46b
Fix flaky packetmath_1 test.
2022-08-02 17:42:45 +00:00
Chip Kerchner
84cf3ff18d
Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial.
2022-06-27 19:18:00 +00:00
Erik Schultheis
421cbf0866
Replace Eigen type metaprogramming with corresponding std types and make use of alias templates
2022-03-16 16:43:40 +00:00
Antonio Sánchez
711803c427
Skip denormal test if `Cond` is false.
2022-03-03 04:32:13 +00:00
Antonio Sánchez
9c07e201ff
Modified sqrt/rsqrt for denormal handling.
2022-03-02 17:20:47 +00:00
Antonio Sánchez
2ed4bee78f
Fix frexp packetmath tests for MSVC.
2022-02-24 22:16:37 +00:00
Antonio Sánchez
3d7e2d0e3e
Fix packetmath compilation error.
2022-02-23 23:27:08 +00:00
Antonio Sánchez
8970719771
Fix gcc-5 packetmath_12 bug.
2022-02-23 21:56:25 +00:00
Rasmus Munk Larsen
8b875dbef1
Changes to fast SQRT/RSQRT
2022-02-23 17:32:21 +00:00
Antonio Sánchez
28e008b99a
Fix sqrt/rsqrt for NEON.
2022-02-15 21:31:51 +00:00
Rasmus Munk Larsen
979fdd58a4
Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments.
2022-02-05 00:20:13 +00:00