Commit Graph

272 Commits

Author SHA1 Message Date
Charles Schlosser fdfdd4c96b test suite: emit the function name when an ieee test fails
libeigen/eigen!2114
2026-01-22 02:32:38 +00:00
Rasmus Munk Larsen a7674b70d3 Improve packet op test coverage for IEEE special values.
libeigen/eigen!2075

Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
2025-11-12 22:19:50 +00:00
Rasmus Munk Larsen ec93a6d098 Add a generic Eigen backend based on clang vector extensions
The goal of this MR is to implement a generic SIMD backend (packet ops) for Eigen that uses clang vector extensions instead of platform-dependent intrinsics. Ideally, this should make it possible to build Eigen and achieve reasonable speed on any platform that has a recent clang compiler, without having to write any inline assembly or intrinsics.

Caveats:

* The current implementation is a proof of concept and supports vectorization for float, double, int32_t, and int64_t using fixed-size 512-bit vectors (a somewhat arbitrary choice). I have not done much to tune this for speed yet.
* For now, there is no way to enable this other than setting -DEIGEN_VECTORIZE_GENERIC on the command line.
* This only compiles with newer versions of clang. I have tested that it compiles and all tests pass with clang 19.1.7.

https://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors
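
As a rough illustration of the idea described above, here is a minimal sketch of a 512-bit float packet built on compiler vector extensions instead of intrinsics. It is not code from the merge request: the names `GenericPacket16f`, `load_unaligned`, `store_unaligned`, `set1`, and `multiply_add` are hypothetical, and the sketch uses the `vector_size` attribute (accepted by both clang and GCC) rather than whichever extension the actual backend uses.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical packet type: 16 floats in one 512-bit (64-byte) vector,
// declared with the vector_size attribute supported by clang and GCC.
typedef float GenericPacket16f __attribute__((vector_size(64)));

// Number of float lanes in the packet.
constexpr std::size_t kLanes = sizeof(GenericPacket16f) / sizeof(float);
static_assert(kLanes == 16, "expected 16 float lanes in a 512-bit vector");

// Load kLanes floats from a possibly unaligned address.
static inline GenericPacket16f load_unaligned(const float* src) {
  GenericPacket16f p;
  std::memcpy(&p, src, sizeof(p));
  return p;
}

// Store kLanes floats to a possibly unaligned address.
static inline void store_unaligned(float* dst, GenericPacket16f p) {
  std::memcpy(dst, &p, sizeof(p));
}

// Broadcast one scalar to every lane.
static inline GenericPacket16f set1(float v) {
  GenericPacket16f p;
  for (std::size_t i = 0; i < kLanes; ++i) p[i] = v;
  return p;
}

// Lane-wise a * b + c, written with ordinary operators. The compiler
// lowers this to whatever SIMD instructions the target provides.
static inline GenericPacket16f multiply_add(GenericPacket16f a,
                                            GenericPacket16f b,
                                            GenericPacket16f c) {
  return a * b + c;
}
```

Because the arithmetic is expressed with plain operators on the vector type, the same source can be compiled for any target the compiler supports, which is the portability argument made in the description.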

Closes #2998 and #2997

See merge request libeigen/eigen!2051

Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
Co-authored-by: Antonio Sánchez <cantonios@google.com>
2025-11-06 21:52:19 +00:00
Rasmus Munk Larsen b6fcddccfc Get rid of pblend packet op.
There was only a single code path left in TensorEvaluator using pblend. We can replace that with a call to the more general TernarySelectOp and get rid of pblend entirely from Core.

Closes #2998

See merge request libeigen/eigen!2056

Co-authored-by: Rasmus Munk Larsen <rmlarsen@google.com>
2025-11-03 23:27:50 +00:00
Antonio Sánchez 2e8cc042a1 Replace calls to numext::fma with numext::madd. 2025-08-28 21:40:19 +00:00
Antonio Sánchez db8bd5b825 Modify pselect and various masks to use Scalar(1) for true. 2025-06-20 22:40:46 +00:00
Antonio Sánchez c458d68fae Fix compile warning about * with bool. 2025-06-05 22:48:57 +00:00
Rasmus Munk Larsen 33f5f59614 Vectorize cbrt for float and double. 2025-04-17 23:31:20 +00:00
Antonio Sánchez b860042263 Add postream for ostream-ing packets more reliably. 2025-04-01 22:12:00 +00:00
Antonio Sanchez 8e32cbf7da Reduce flakiness of test for Eigen::half. 2025-03-23 22:31:25 -07:00
Antonio Sánchez d935916ac6 Add numext::fma and missing pmadd implementations. 2025-03-23 01:05:53 +00:00
Antonio Sánchez 70f2aead9a Use native _Float16 for AVX512FP16 and update vectorization. 2025-03-19 19:55:26 +00:00
Charles Schlosser 10e62ccd22 Fix x86 complex vectorized fma 2025-03-12 17:06:32 +00:00
Antonio Sánchez d79bac0d3c Fix boolean scatter and random generation for tensors. 2025-02-25 21:37:09 +00:00
Rasmus Munk Larsen 5064cb7d5e Add test for using pcast on scalars. 2024-11-25 22:27:26 -08:00
Rasmus Munk Larsen 3f067c4850 Add exp2() as a packet op and array method. 2024-10-22 22:09:34 +00:00
Charles Schlosser 9d3d37c5b7 Complex Numtraits::HasSign and nmsub test 2024-08-28 03:02:47 +00:00
Charles Schlosser fb95e90f7f Add truncation op 2024-04-29 23:45:49 +00:00
Antonio Sánchez a5e147305b Fix undefined behavior for generating inputs to the predux_mul test. 2024-04-29 20:32:09 +00:00
Antonio Sánchez dcceb9afec Unbork avx512 preduce_mul on MSVC. 2024-04-26 15:28:03 +00:00
Charles Schlosser 122befe54c Fix "unary minus operator applied to unsigned type, result still unsigned" on MSVC and other stupid warnings 2024-04-12 19:35:04 +00:00
Antonio Sánchez 17f3bf8985 Fix pexp test for ARM. 2024-03-07 00:19:57 +00:00
Antonio Sánchez 3e8e63eb46 Fix packetmath plog test on Windows. 2024-03-06 23:51:47 +00:00
Antonio Sánchez 38fcedaf8e Fix pexp complex test edge-cases. 2024-03-04 17:44:38 +00:00
Charles Schlosser 8a4118746e fix exp complex test: use int instead of index 2024-02-17 03:55:32 +00:00
Charles Schlosser 18a161bf17 fix pexp_complex_test 2024-02-17 03:08:23 +00:00
Damiano Franzò be06c9ad51 Implement float pexp_complex 2024-02-17 00:26:57 +00:00
Antonio Sánchez f40ad38fda Fix failure on ARM with latest compilers. 2024-02-14 23:00:56 +00:00
Antonio Sánchez 6ea33f95df Eliminate warning about writing bytes directly to non-trivial type. 2024-02-12 23:27:48 +00:00
Antonio Sánchez 7b87b21910 Fix UB in bool packetmath test. 2024-02-09 19:46:45 +00:00
Antonio Sánchez a9ddab3e06 Fix a bunch of ODR violations. 2024-01-30 22:38:43 +00:00
Damiano Franzò 7fd7a3f946 Implement plog_complex 2024-01-30 19:06:05 +00:00
Antonio Sánchez 46e9cdb7fe Clang-format tests, examples, libraries, benchmarks, etc. 2023-12-05 21:22:55 +00:00
Charles Schlosser 81b48065ea Fix arm32 float division and related bugs 2023-08-29 00:36:07 +00:00
Pedro Gonnet 17b5b4de58 Add `Packet4ui`, `Packet8ui`, and `Packet4ul` to the `SSE`/`AVX` `PacketMath.h` headers 2023-04-17 23:33:59 +00:00
Antonio Sánchez 394aabb0a3 Fix failing MSVC tests due to compiler bugs. 2023-03-10 22:36:57 +00:00
Rasmus Munk Larsen ce62177b5b Vectorize atanh & add a missing definition and unit test for atan. 2023-02-21 03:14:05 +00:00
Antonio Sánchez 8588d8c74b Correct pnegate for floating-point zero. 2022-11-15 18:07:23 +00:00
Rasmus Munk Larsen 97e0784dc6 Vectorize the sign operator in Eigen. 2022-08-09 19:54:57 +00:00
Antonio Sánchez 39d22ef46b Fix flaky packetmath_1 test. 2022-08-02 17:42:45 +00:00
Chip Kerchner 84cf3ff18d Add pload_partial, pstore_partial (and unaligned versions), pgather_partial, pscatter_partial, loadPacketPartial and storePacketPartial. 2022-06-27 19:18:00 +00:00
Erik Schultheis 421cbf0866 Replace Eigen type metaprogramming with corresponding std types and make use of alias templates 2022-03-16 16:43:40 +00:00
Antonio Sánchez 711803c427 Skip denormal test if `Cond` is false. 2022-03-03 04:32:13 +00:00
Antonio Sánchez 9c07e201ff Modified sqrt/rsqrt for denormal handling. 2022-03-02 17:20:47 +00:00
Antonio Sánchez 2ed4bee78f Fix frexp packetmath tests for MSVC. 2022-02-24 22:16:37 +00:00
Antonio Sánchez 3d7e2d0e3e Fix packetmath compilation error. 2022-02-23 23:27:08 +00:00
Antonio Sánchez 8970719771 Fix gcc-5 packetmath_12 bug. 2022-02-23 21:56:25 +00:00
Rasmus Munk Larsen 8b875dbef1 Changes to fast SQRT/RSQRT 2022-02-23 17:32:21 +00:00
Antonio Sánchez 28e008b99a Fix sqrt/rsqrt for NEON. 2022-02-15 21:31:51 +00:00
Rasmus Munk Larsen 979fdd58a4 Add generic fast psqrt and prsqrt impls and make them correct for 0, +Inf, NaN, and negative arguments. 2022-02-05 00:20:13 +00:00