OpenCL 3.1 is here (www.khronos.org)
from JRepin@lemmy.ml to programming@programming.dev on 05 May 11:51
https://lemmy.ml/post/46884844

cross-posted from: lemmy.ml/post/46884793

The Khronos OpenCL Working Group has released OpenCL 3.1, which brings widely deployed, field-proven capabilities, including SPIR-V ingestion, into the core specification so that developers can rely on them across conformant implementations.

The features now mandated by OpenCL 3.1 have already been deployed as extensions or optional capabilities. This is by design: the OpenCL Working Group evolves the specification by proving features in the field as extensions first, watching how they get used across multiple implementations, refining them based on developer feedback, and only then graduating them into the core specification.

Every conformant OpenCL 3.1 implementation will be required to consume SPIR-V kernels, one of the features most requested by developers. OpenCL 3.1 additionally requires support for the SPIR-V query extension, which lets applications enumerate the SPIR-V capabilities, extensions, and versions that a device supports, simplifying the adoption of new SPIR-V features as they become available.
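
For readers who have not handled SPIR-V binaries before: a module is a stream of 32-bit words whose five-word header starts with the magic number 0x07230203, followed by a version word that carries the major and minor numbers in its two middle bytes. Below is a minimal C sketch of a host-side sanity check one might run before handing a binary to an OpenCL runtime. The helper name spirv_version is hypothetical, the sketch assumes the words are already in host byte order, and it does not touch the OpenCL API or the query extension itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First word of every SPIR-V module. */
#define SPIRV_MAGIC 0x07230203u

/* Hypothetical helper (not part of any OpenCL or SPIR-V library).
 * Returns 1 and fills *major / *minor if `words` starts with a
 * plausible SPIR-V header in host byte order; returns 0 otherwise. */
static int spirv_version(const uint32_t *words, size_t count,
                         unsigned *major, unsigned *minor)
{
    assert(major != NULL && minor != NULL);

    /* The header is five words: magic, version, generator, bound, schema. */
    if (words == NULL || count < 5 || words[0] != SPIRV_MAGIC)
        return 0;

    /* Version word layout, high byte to low byte: 0 | major | minor | 0. */
    *major = (words[1] >> 16) & 0xffu;
    *minor = (words[1] >> 8) & 0xffu;
    return 1;
}
```

An application could compare the version extracted this way against the versions a device reports before trying to build the module.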

Several features essential to HPC and AI kernels are also now mandatory in the core OpenCL 3.1 specification:

  • Subgroups, including shuffles, rotations, and an expanded set of supported data types. These are a fundamental building block for tuned reductions, scans, and matrix kernels.
  • Integer dot products, including saturating and accumulating variants, together with extended bit operations. Both map directly to dedicated hardware instructions on a wide range of modern silicon, and both are common building blocks for matrix multiplications and the low-precision arithmetic central to inference workloads.
  • A new query for the suggested local work-group size. This gives applications and profilers a runtime hint for the optimal work-group size for a given kernel and device, eliminating the need for manual tuning or repeated size calculations across multiple enqueues and improving performance predictability on diverse hardware.
  • A standard device UUID query, matching Vulkan’s VkPhysicalDeviceIDProperties::deviceUUID. This allows applications to correlate the same physical device across APIs, which is essential for multi-device systems and for external memory-sharing scenarios that span OpenCL and Vulkan.
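
To make the integer dot product item concrete, here is a plain C reference for what a four-element signed 8-bit dot product with saturating accumulation computes. This is a host-side sketch, not the OpenCL C built-in: the function names dot4_s8 and dot4_s8_acc_sat are hypothetical, and a real kernel would call the built-ins that the specification defines, which typically compile to a single hardware instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: dot product of two 4-element signed 8-bit
 * vectors, widened to 32 bits. The largest magnitude is 4 * 128 * 128,
 * so the sum cannot overflow an int32_t. */
static int32_t dot4_s8(const int8_t a[4], const int8_t b[4])
{
    assert(a != NULL && b != NULL);

    int32_t sum = 0;
    for (int i = 0; i < 4; ++i)
        sum += (int32_t)a[i] * (int32_t)b[i];
    return sum;
}

/* Hypothetical reference: the same dot product added to an accumulator,
 * clamped to the int32_t range instead of wrapping around. */
static int32_t dot4_s8_acc_sat(const int8_t a[4], const int8_t b[4],
                               int32_t acc)
{
    /* Do the addition in 64 bits so the overflow can be detected. */
    int64_t wide = (int64_t)dot4_s8(a, b) + (int64_t)acc;

    if (wide > INT32_MAX)
        return INT32_MAX;
    if (wide < INT32_MIN)
        return INT32_MIN;
    return (int32_t)wide;
}
```

The saturating form matters for quantized inference, where an accumulator that silently wraps around would turn a large positive sum into a negative one.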

#programming
