AVX 10.2 and APX support #592
Pull request overview
This PR extends Embree’s ISA detection, build configuration, and runtime dispatch to add a new APX (and related AVX10.2) codepath, including updates to CPU feature bitmask width and AVX10.2-specific SIMD helper intrinsics.
Changes:
- Widen CPU feature/ISA bitmasks and plumbing from `int` to `int64_t` across state/config/verify paths.
- Add APX target support to CMake (flags, target library, config export) and extend runtime symbol-selection macros to include APX.
- Extend CPU feature detection + SIMD helpers to recognize and use AVX10.2/APX-related capabilities.
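To illustrate why the mask type has to be widened, here is a minimal sketch. It assumes the new feature bits land above bit 31; the `DEMO_*` constants, their bit positions, and both helper functions are hypothetical and are not Embree's actual definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits for illustration only (not Embree's real values).
constexpr int64_t DEMO_FEATURE_SSE2    = int64_t(1) << 0;
constexpr int64_t DEMO_FEATURE_AVX10_2 = int64_t(1) << 33;  // assumed to sit above bit 31
constexpr int64_t DEMO_FEATURE_APX     = int64_t(1) << 34;  // assumed to sit above bit 31

// True when every bit of `required` is also set in `available`.
constexpr bool hasAllFeatures(int64_t available, int64_t required) {
  return (available & required) == required;
}

// Simulates a mask passing through 32-bit storage: only the low 32 bits survive.
constexpr int64_t throughNarrowStorage(int64_t mask) {
  return static_cast<int64_t>(static_cast<uint32_t>(mask));
}

// Small runtime check showing the effect of the truncation.
inline void demoWideMask() {
  const int64_t cpu = DEMO_FEATURE_SSE2 | DEMO_FEATURE_AVX10_2 | DEMO_FEATURE_APX;
  assert(hasAllFeatures(cpu, DEMO_FEATURE_APX));                         // 64-bit path keeps the bit
  assert(!hasAllFeatures(throughNarrowStorage(cpu), DEMO_FEATURE_APX));  // 32-bit path drops it
}
```

Any single `int` parameter or field left on the path between detection and dispatch would silently drop the high bits in this way, which is presumably why the change touches state, config, and verify code together.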
Reviewed changes
Copilot reviewed 32 out of 32 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tutorials/verify/verify.h | Switch ISA storage/params to int64_t for wider feature masks. |
| tutorials/verify/verify.cpp | Use int64_t ISA masks and add APX to verify ISA list. |
| tutorials/common/tutorial/application.cpp | Document apx in CLI help output. |
| kernels/common/state.h | Update ISA-related APIs/fields to int64_t. |
| kernels/common/state.cpp | Update ISA handling and add APX debug verification + AVX10 string parsing. |
| kernels/common/rtcore.cpp | Extend geometry factory dispatch to include APX-capable symbol selection. |
| kernels/common/isa.h | Add APX namespace and new ISA selection macro combinations (AVX512+APX, etc.). |
| kernels/common/device.cpp | Print CPU features with int64_t and include APX in supported target list string. |
| kernels/CMakeLists.txt | Add embree_apx static library target and integrate APX into file selection. |
| kernels/bvh/bvh8_factory.h | Widen builder/intersector feature masks to int64_t. |
| kernels/bvh/bvh8_factory.cpp | Use APX-aware selection macros for BVH8 intersectors. |
| kernels/bvh/bvh4_factory.h | Widen builder/intersector feature masks to int64_t. |
| kernels/bvh/bvh4_factory.cpp | Use APX-aware selection macros for BVH4 intersectors. |
| kernels/bvh/bvh_intersector1_bvh4.cpp | Return int64_t from getISA() for wider ISA values. |
| common/sys/sysinfo.h | Add APX/AVX10.* feature bits and widen ISA constants/APIs to int64_t. |
| common/sys/sysinfo.cpp | Implement AVX10/APX CPUID + XCR0 detection and widen feature handling. |
| common/simd/vuint16_avx512.h | Use AVX10.2 reductions when available; adjust multiply intrinsic. |
| common/simd/vllong8_avx512.h | Use AVX10.2 reduction intrinsics when available. |
| common/simd/vfloat16_avx512.h | Use AVX10.2 abs/minmax/reduce helpers when available. |
| common/simd/vdouble8_avx512.h | Use AVX10.2 abs/minmax/reduce helpers when available. |
| common/simd/vboolf8_avx512.h | Use AVX10.2 mask intrinsics where applicable. |
| common/simd/vboolf4_avx512.h | Use AVX10.2 mask intrinsics where applicable. |
| common/simd/vboolf16_avx512.h | Use AVX10.2 mask intrinsics where applicable. |
| common/simd/vboold8_avx512.h | Use AVX10.2 mask intrinsics where applicable. |
| common/simd/vboold4_avx512.h | Use AVX10.2 mask intrinsics where applicable. |
| common/cmake/msvc.cmake | Add MSVC flags for APX target builds. |
| common/cmake/gnu.cmake | Add GNU flags for APX target builds. |
| common/cmake/embree-config.cmake | Export APX target setting and include APX targets for static builds. |
| common/cmake/dpcpp.cmake | Add DPCPP flags for APX target builds. |
| common/cmake/clang.cmake | Add Clang flags for APX target builds. |
| common/cmake/check_isa.cpp | Extend ISA probe to detect/report APX. |
| CMakeLists.txt | Add APX to max-ISA selection and enable APX build option/defines. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Require CPUID leaf 0x24 512-bit AVX10 vector-length support and ZMM OS state before enabling AVX10.1/10.2 CPU feature bits.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
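The gating described in this commit message could look roughly like the sketch below. It is not the PR's code: the struct and function names are hypothetical, and the bit positions (AVX10 enumeration in CPUID leaf 7 subleaf 1 EDX bit 19, version in leaf 0x24 EBX bits 7:0, 512-bit support in EBX bit 18, and the XCR0 state bits) reflect my reading of Intel's documentation and should be checked against the current specification. The function takes register values as inputs so it can be exercised without executing CPUID.

```cpp
#include <cassert>
#include <cstdint>

// Raw register values returned by one CPUID query (hypothetical helper type).
struct CpuidRegs { uint32_t eax, ebx, ecx, edx; };

// Result flags of this sketch.
struct Avx10Info { bool avx10_1; bool avx10_2; };

// XCR0 state components the OS must enable before ZMM registers are usable.
constexpr uint64_t XCR0_SSE       = 1ull << 1;
constexpr uint64_t XCR0_AVX       = 1ull << 2;
constexpr uint64_t XCR0_OPMASK    = 1ull << 5;
constexpr uint64_t XCR0_ZMM_HI256 = 1ull << 6;
constexpr uint64_t XCR0_HI16_ZMM  = 1ull << 7;
constexpr uint64_t XCR0_ZMM_STATE =
    XCR0_SSE | XCR0_AVX | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM;

// leaf7sub1:  CPUID(EAX=0x07, ECX=1) output
// leaf24sub0: CPUID(EAX=0x24, ECX=0) output
// xcr0:       value of XGETBV(0)
inline Avx10Info detectAvx10(const CpuidRegs& leaf7sub1,
                             const CpuidRegs& leaf24sub0,
                             uint64_t xcr0) {
  const bool enumerated = (leaf7sub1.edx >> 19) & 1u;   // AVX10 enumeration bit (assumed position)
  const uint32_t version = leaf24sub0.ebx & 0xFFu;      // converged ISA version number
  const bool vec512 = (leaf24sub0.ebx >> 18) & 1u;      // 512-bit vector length supported
  const bool osZmm = (xcr0 & XCR0_ZMM_STATE) == XCR0_ZMM_STATE;

  const bool usable = enumerated && vec512 && osZmm;
  return Avx10Info{usable && version >= 1, usable && version >= 2};
}
```

Requiring the OS state check alongside the CPUID bits matters because a CPU can advertise 512-bit support while the operating system has not enabled saving and restoring ZMM registers.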
SET(FLAGS_AVX "${FLAGS_SSE42} /arch:AVX")
SET(FLAGS_AVX2 "${FLAGS_SSE42} /arch:AVX2")
SET(FLAGS_AVX512 "${FLAGS_AVX2} /arch:AVX512")
SET(FLAGS_APX "${FLAGS_AVX512} /arch:AVX10.2")
This is not sufficient; APX also needs to be enabled separately. We also want 512-bit AVX10.2. I am not sure what the default actually does, but /arch:AVX10.2/512 should be used here, together with /arch:APX.
SET(FLAGS_AVX "${FLAGS_SSE42} /arch:AVX")
SET(FLAGS_AVX2 "${FLAGS_SSE42} /arch:AVX2")
SET(FLAGS_AVX512 "${FLAGS_AVX2} /arch:AVX512")
SET(FLAGS_APX "${FLAGS_AVX512} /arch:AVX10.2 /vlen=512 /feature:APX /D__AVX10_VER__=102")
Why not just consistently define AVX10_2 if you define a macro anyway? Also, the version is set to 102 here, but the code compares against >= 2.
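The mismatch raised here can be sketched as follows. The sketch assumes that 102 is meant to encode version 10.2, so that 101 would encode 10.1; `DEMO_AVX10_VER` and both check functions are hypothetical stand-ins, not names from the PR.

```cpp
#include <cassert>

// Hypothetical stand-in for the macro set on the command line in the diff above.
#define DEMO_AVX10_VER 102

// Check in the style the comment describes: compares against the bare minor number.
constexpr bool naiveIsAvx10_2(int ver) { return ver >= 2; }

// Check that matches the assumed major*100+minor style encoding of the define.
constexpr bool encodedIsAvx10_2(int ver) { return ver >= 102; }

// With the 102-style encoding, the naive check cannot tell 10.1 from 10.2.
static_assert(naiveIsAvx10_2(101), "naive check also accepts an encoded 10.1");
static_assert(!encodedIsAvx10_2(101), "encoded check rejects 10.1");
static_assert(encodedIsAvx10_2(DEMO_AVX10_VER), "encoded check accepts 10.2");

// Runtime variant of the same comparison, callable from a test.
inline void demoVersionCheck() {
  assert(naiveIsAvx10_2(DEMO_AVX10_VER) == encodedIsAvx10_2(DEMO_AVX10_VER));
  assert(naiveIsAvx10_2(101) != encodedIsAvx10_2(101));
}
```

Under that assumption, the `>= 2` comparison happens to pass for 102 but would pass for any positive encoded value, so either the define or the comparison would need to change for the check to be meaningful.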
Adding support for the AVX 10.2 and APX instruction sets