Dependencies
This feature is written against the OpenCL API Specification Version 3.0.6.
This extension requires OpenCL 1.0.
Overview
This extension can be used to query additional information about Intel OpenCL devices. The additional information may be useful to tailor a specific workload to the properties of the device.
New API Enums
Possible values accepted as the param_name parameter of clGetDeviceInfo, depending on the device type and the extension version. Additional queries may be added in subsequent versions of the extension:
/* For GPU devices, version 1.0.0: */
#define CL_DEVICE_IP_VERSION_INTEL 0x4250
#define CL_DEVICE_ID_INTEL 0x4251
#define CL_DEVICE_NUM_SLICES_INTEL 0x4252
#define CL_DEVICE_NUM_SUB_SLICES_PER_SLICE_INTEL 0x4253
#define CL_DEVICE_NUM_EUS_PER_SUB_SLICE_INTEL 0x4254
#define CL_DEVICE_NUM_THREADS_PER_EU_INTEL 0x4255
#define CL_DEVICE_FEATURE_CAPABILITIES_INTEL 0x4256
Bitfield type describing the feature capabilities of a device. Additional feature flags may be added in subsequent versions of the extension.
typedef cl_bitfield cl_device_feature_capabilities_intel;
/* For GPU devices, version 1.0.0: */
#define CL_DEVICE_FEATURE_FLAG_DP4A_INTEL (1 << 0)
#define CL_DEVICE_FEATURE_FLAG_DPAS_INTEL (1 << 1)
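As a minimal sketch of how an application might test these flags, the helper below checks individual bits in a capabilities value. It assumes the value was already obtained by querying CL_DEVICE_FEATURE_CAPABILITIES_INTEL. The `cl_bitfield` typedef is a local stand-in so the snippet compiles without the OpenCL headers, and `device_has_feature` is a hypothetical helper name, not part of the extension.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the OpenCL cl_bitfield type (a 64-bit unsigned integer
   in the Khronos headers). Remove it when including the real headers. */
typedef uint64_t cl_bitfield;
typedef cl_bitfield cl_device_feature_capabilities_intel;

/* Flag values as listed in this extension. */
#define CL_DEVICE_FEATURE_FLAG_DP4A_INTEL (1 << 0)
#define CL_DEVICE_FEATURE_FLAG_DPAS_INTEL (1 << 1)

/* Hypothetical helper: returns true when every bit of `flag` is set in `caps`. */
bool device_has_feature(cl_device_feature_capabilities_intel caps,
                        cl_device_feature_capabilities_intel flag)
{
    return (caps & flag) == flag;
}
```

Because additional flags may be added in later versions of the extension, testing individual bits in this way is safer than comparing the whole bitfield against a fixed value.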
Modifications to the OpenCL API Specification
- (Add to the list preceding Table 5 in Section 4.2 - Querying Devices)
  The device queries described in the Device Queries table should return the same information for a root-level device, i.e. a device returned by clGetDeviceIDs, and any sub-devices created from this device, except for the following queries:
  - …
  - CL_DEVICE_NUM_SLICES_INTEL
- (Add to Table 5 - OpenCL Device Queries in Section 4.2 - Querying Devices)
Table 5. List of supported param_names by clGetDeviceInfo

| cl_device_info | Return Type | Description |
|---|---|---|
| CL_DEVICE_IP_VERSION_INTEL | cl_version | The IP version for the device. The meaning of this version is implementation-defined, but newer devices should have a higher version than older devices. |
| CL_DEVICE_ID_INTEL | cl_uint | A unique device identifier for the OpenCL device. If the implementation is driven primarily by a PCI device with a PCI device ID, the low 16 bits of the device identifier must contain that PCI device ID, and the remaining bits must be set to zero. Otherwise, the choice of what to return may be dictated by operating system or platform policies - but should uniquely identify both the device version and any major configuration options (for example, core count in the case of multi-core devices). |
| CL_DEVICE_NUM_SLICES_INTEL | cl_uint | The number of slices in the device. A slice is a collection of sub-slices. |
| CL_DEVICE_NUM_SUB_SLICES_PER_SLICE_INTEL | cl_uint | The maximum number of sub-slices in one slice. A sub-slice is a collection of execution units (EUs), fixed-function units, and caches. |
| CL_DEVICE_NUM_EUS_PER_SUB_SLICE_INTEL | cl_uint | The maximum number of execution units (EUs) in one sub-slice. An execution unit is a multi-threaded processor that runs OpenCL kernels. |
| CL_DEVICE_NUM_THREADS_PER_EU_INTEL | cl_uint | The maximum number of threads that may be simultaneously resident on one execution unit (EU). |
| CL_DEVICE_FEATURE_CAPABILITIES_INTEL | cl_device_feature_capabilities_intel | Feature flags describing capabilities supported by the device. |
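The following sketch shows two values an application might derive from these query results. It assumes the results have already been read into plain integers. The `cl_uint` typedef is a local stand-in for the OpenCL type, and the struct and function names are hypothetical. Because the sub-slice and EU counts are documented as maximums, the products computed here are upper bounds, not exact counts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the OpenCL cl_uint type (32-bit unsigned). */
typedef uint32_t cl_uint;

/* Hypothetical container for the four topology query results. */
typedef struct {
    cl_uint num_slices;               /* CL_DEVICE_NUM_SLICES_INTEL */
    cl_uint num_sub_slices_per_slice; /* CL_DEVICE_NUM_SUB_SLICES_PER_SLICE_INTEL */
    cl_uint num_eus_per_sub_slice;    /* CL_DEVICE_NUM_EUS_PER_SUB_SLICE_INTEL */
    cl_uint num_threads_per_eu;       /* CL_DEVICE_NUM_THREADS_PER_EU_INTEL */
} intel_gpu_topology;

/* Upper bound on the number of EUs; widened to 64 bits to avoid overflow. */
uint64_t max_execution_units(const intel_gpu_topology *t)
{
    return (uint64_t)t->num_slices * t->num_sub_slices_per_slice *
           t->num_eus_per_sub_slice;
}

/* Upper bound on the number of simultaneously resident hardware threads. */
uint64_t max_resident_threads(const intel_gpu_topology *t)
{
    return max_execution_units(t) * t->num_threads_per_eu;
}

/* Extracts a PCI device ID from a CL_DEVICE_ID_INTEL value. Returns false when
   the upper 16 bits are non-zero, which means the value cannot follow the PCI
   layout. A true result only means the value is consistent with that layout;
   a non-PCI implementation could also return a small value. */
bool pci_device_id_from_device_id(cl_uint device_id, uint16_t *pci_id)
{
    if ((device_id & 0xFFFF0000u) != 0) {
        return false;
    }
    *pci_id = (uint16_t)(device_id & 0xFFFFu);
    return true;
}
```

For the hypothetical values 1 slice, 6 sub-slices per slice, 8 EUs per sub-slice, and 7 threads per EU, the bounds are 48 EUs and 336 resident threads.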
Issues
- How to express cache sizes? Is it for the entire device?
  RESOLVED: The cache size query was removed in rev B, for now. If we decide to add it back, note that some caches are local to the device, while others are local to a sub-slice.
- Do we need a query for a "full device ID"?
  RESOLVED: This is useful but it will be added by a different extension.
- What should the return type be for the CL_DEVICE_IP_VERSION_INTEL query?
  RESOLVED: Using the OpenCL 3.0 cl_version type for now, but it could just as easily be interpreted as a plain cl_uint.
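To illustrate the two interpretations mentioned in the last issue, the sketch below treats an IP version value both as a plain integer and as a cl_version. It assumes the generic OpenCL 3.0 cl_version layout of 10 major bits, 10 minor bits, and 12 patch bits. Since the meaning of the IP version is implementation-defined, the decoded fields may not correspond to anything meaningful on a given device. The `cl_version` typedef is a local stand-in and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the OpenCL 3.0 cl_version type (32-bit unsigned). */
typedef uint32_t cl_version;

/* Bit widths of the generic cl_version layout: major.minor.patch = 10.10.12. */
enum { IP_VERSION_MINOR_BITS = 10, IP_VERSION_PATCH_BITS = 12 };

/* Major field: the top 10 bits. */
uint32_t ip_version_major(cl_version v)
{
    return v >> (IP_VERSION_MINOR_BITS + IP_VERSION_PATCH_BITS);
}

/* Minor field: the middle 10 bits. */
uint32_t ip_version_minor(cl_version v)
{
    return (v >> IP_VERSION_PATCH_BITS) & ((1u << IP_VERSION_MINOR_BITS) - 1u);
}

/* Patch field: the low 12 bits. */
uint32_t ip_version_patch(cl_version v)
{
    return v & ((1u << IP_VERSION_PATCH_BITS) - 1u);
}

/* Plain-integer interpretation: a higher value is taken to mean a newer device. */
bool ip_version_is_newer(cl_version a, cl_version b)
{
    return a > b;
}
```

The plain-integer comparison relies only on the statement that newer devices should have a higher version, so it does not depend on how the fields are laid out.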