Name Strings
cl_intel_required_subgroup_size
Contact
Ben Ashbaugh, Intel (ben 'dot' ashbaugh 'at' intel 'dot' com)
Contributors
Ben Ashbaugh, Intel
Notice
Copyright (c) 2018-2019 Intel Corporation. All rights reserved.
Status
Final Draft
Version
Built On: 2019-10-23
Revision: 3
Dependencies
Support for OpenCL 2.1, cl_khr_subgroups, or cl_intel_subgroups is required.
This extension is written against revision 23 of the OpenCL 2.1 API specification, revision 30 of the OpenCL 2.0 OpenCL C specification, version 31 of the OpenCL 2.0 Extensions specification, and version 3 of the cl_intel_subgroups specification.
Overview
The goal of this extension is to allow programmers to optionally specify the required subgroup size for a kernel function. This information is important for the correctness of many subgroup algorithms, and in some cases the compiler may use it to generate more efficient code.
New API Functions
None.
New API Enums
Accepted as the param_name parameter of clGetDeviceInfo:
CL_DEVICE_SUB_GROUP_SIZES_INTEL 0x4108
Accepted as the param_name parameter of clGetKernelWorkGroupInfo:
CL_KERNEL_SPILL_MEM_SIZE_INTEL 0x4109
Accepted as the param_name parameter of clGetKernelSubGroupInfo and/or clGetKernelSubGroupInfoKHR:
CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL 0x410A
New OpenCL C Optional Attribute Qualifiers
Optional __kernel qualifier:
__attribute__((intel_reqd_sub_group_size(<int>)))
Modifications to the OpenCL API Specification
Additions to Table 4.3 - "OpenCL Device Queries"
cl_device_info | Return Type | Description |
---|---|---|
CL_DEVICE_SUB_GROUP_SIZES_INTEL | size_t[] | Returns the set of subgroup sizes supported by the device. |
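As an illustration of how an application might use this query, the following C sketch checks whether a desired subgroup size appears in the reported set. The helper name `sub_group_size_supported` is hypothetical and not part of this extension. The array is assumed to have been filled by clGetDeviceInfo with CL_DEVICE_SUB_GROUP_SIZES_INTEL, using the usual two-call pattern (first query the size in bytes, then the data).

```c
#include <stddef.h>

/* Hypothetical helper, not part of the extension.
 * `sizes` is assumed to hold the array of size_t values returned for
 * CL_DEVICE_SUB_GROUP_SIZES_INTEL, and `count` its number of elements
 * (the returned size in bytes divided by sizeof(size_t)).
 * Returns 1 if `wanted` is one of the reported sizes, 0 otherwise. */
int sub_group_size_supported(const size_t *sizes, size_t count, size_t wanted)
{
    for (size_t i = 0; i < count; ++i) {
        if (sizes[i] == wanted) {
            return 1;
        }
    }
    return 0;
}
```

For example, with a reported set of {8, 16, 32} (illustrative values, not taken from any particular device), a request for 16 would be accepted and a request for 4 would be rejected before any kernel is compiled.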
Additions to Table 5.21 - "clGetKernelWorkGroupInfo parameter queries":
cl_kernel_work_group_info | Return Type | Info. returned in param_value |
---|---|---|
CL_KERNEL_SPILL_MEM_SIZE_INTEL | cl_ulong | Returns the amount of spill memory used by a kernel. The meaning of this value varies between implementations, but a return value of 0 always indicates that the compiler was able to compile the kernel to fit into the device's register file without spilling registers to memory. |
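Because only a value of 0 has a portable meaning, one way an application might use this query is to compare zero against nonzero results across builds. The C sketch below shows one possible policy: among several candidate required subgroup sizes, pick the largest whose build reported no spill memory. The function name and the policy itself are assumptions made for illustration and are not defined by this extension. The spill values are assumed to be 64-bit unsigned integers, represented here as `uint64_t`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy, not part of the extension: among candidate subgroup
 * sizes, return the largest one whose kernel build reported zero spill
 * memory, or 0 if every candidate spilled.
 * spill_bytes[i] is assumed to be the CL_KERNEL_SPILL_MEM_SIZE_INTEL value
 * obtained for the kernel built with required subgroup size candidates[i].
 * Only "zero or nonzero" is inspected, since the magnitude is
 * implementation-defined. */
size_t largest_size_without_spill(const size_t *candidates,
                                  const uint64_t *spill_bytes,
                                  size_t count)
{
    size_t best = 0;
    for (size_t i = 0; i < count; ++i) {
        if (spill_bytes[i] == 0 && candidates[i] > best) {
            best = candidates[i];
        }
    }
    return best;
}
```

Whether the largest non-spilling size is actually the fastest depends on the kernel and the device, so a result like this is better treated as a starting point for measurement than as a final choice.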
Additions to "clGetKernelSubGroupInfo parameter queries":
This is Table 5.22 - "clGetKernelSubGroupInfo parameter queries" in the OpenCL 2.1 API spec, in Section 9.17.2.1 for clGetKernelSubGroupInfoKHR in the OpenCL 2.0 Extensions spec, and in the section describing the changes to Section 5.9.3 for clGetKernelSubGroupInfoKHR in the cl_intel_subgroups spec:
cl_kernel_sub_group_info | Input Type | Return Type | Info. returned in param_value |
---|---|---|---|
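CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL | ignored | size_t | Returns the subgroup size specified by the __attribute__((intel_reqd_sub_group_size(<int>))) qualifier. If the subgroup size is not specified using the above attribute qualifier then 0 is returned. |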
Modifications to the OpenCL C Specification
Additions to Section 6.7.2 - "Optional Attribute Qualifiers"
The optional __attribute__((intel_reqd_sub_group_size(<int>))) qualifier can be used to indicate that the kernel must be compiled and executed with the specified subgroup size.
When this attribute is present, get_max_sub_group_size() is guaranteed to return the specified integer value.
This is important for the correctness of many subgroup algorithms, and in some cases the compiler may use it to generate more efficient code.
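A minimal sketch of a kernel that uses the attribute is shown below. The OpenCL C source is held in a C string, as it might be when passed to clCreateProgramWithSource. The kernel name `record_max_sub_group_size`, the helper `kernel_source_length`, and the choice of 16 are illustrative assumptions; the chosen size must be one that the target device actually supports.

```c
#include <string.h>

/* OpenCL C source held as a C string. The kernel name and the size 16
 * are illustrative; 16 must be a subgroup size the device supports. */
static const char *const kernel_source =
    "__kernel __attribute__((intel_reqd_sub_group_size(16)))\n"
    "void record_max_sub_group_size(__global uint *out)\n"
    "{\n"
    "    /* Guaranteed to be 16 because of the attribute above. */\n"
    "    out[get_global_id(0)] = get_max_sub_group_size();\n"
    "}\n";

/* Hypothetical helper: length of the source in characters, excluding the
 * terminating null, suitable for the `lengths` argument of
 * clCreateProgramWithSource. */
size_t kernel_source_length(void)
{
    return strlen(kernel_source);
}
```

If the program builds successfully, every element written by this kernel would hold the required size, regardless of how the work-group happens to be divided.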
Note that there is no guarantee for the value of get_sub_group_size() even when this attribute is present, particularly when the work-group size is not evenly divisible by the required subgroup size.
Note as well that some devices may support a limited number of subgroup sizes, and that some devices may not support all language constructs with all subgroup sizes. This means that some kernels may fail compilation with one required subgroup size and succeed with another required subgroup size, even if both subgroup sizes are supported by the device.
Finally, note that requiring one subgroup size (particularly, a larger subgroup size) may require more spill memory than another subgroup size, and may negatively impact application performance.
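To make the earlier note about get_sub_group_size() concrete, the C sketch below works through the arithmetic for a work-group whose size is not a multiple of the required subgroup size. It assumes that the implementation fills subgroups in order and leaves the remainder in the last one; this layout is an assumption for illustration and is not mandated by this extension. Both function names are hypothetical.

```c
#include <stddef.h>

/* Number of subgroups needed to cover a work-group: ceil(wg / sg).
 * sub_group_size must be nonzero. */
size_t sub_group_count(size_t work_group_size, size_t sub_group_size)
{
    return (work_group_size + sub_group_size - 1) / sub_group_size;
}

/* Size of the last subgroup, ASSUMING subgroups are filled in order and
 * any remainder lands in the final one. This layout is an assumption for
 * illustration; an implementation may divide the work-group differently. */
size_t last_sub_group_size(size_t work_group_size, size_t sub_group_size)
{
    size_t remainder = work_group_size % sub_group_size;
    return remainder == 0 ? sub_group_size : remainder;
}
```

Under that assumption, a work-group of 40 work-items with a required subgroup size of 16 is covered by 3 subgroups, the last of which holds only 8 work-items. get_max_sub_group_size() would still return 16 in all three, which is why algorithms that depend on a full subgroup should either use work-group sizes that are multiples of the required size or handle the partial subgroup explicitly.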
Issues
None.
Revision History
Rev | Date | Author | Changes |
---|---|---|---|
1 | 2016-07-14 | Ben Ashbaugh | First public revision. |
2 | 2018-11-15 | Ben Ashbaugh | Conversion to asciidoc. |
3 | 2019-09-17 | Ben Ashbaugh | Minor formatting fixes for asciidoctor. |