Name Strings



Ben Ashbaugh, Intel (ben 'dot' ashbaugh 'at' intel 'dot' com)


Ben Ashbaugh, Intel


Copyright (c) 2018-2019 Intel Corporation. All rights reserved.


Final Draft


Built On: 2019-10-23
Revision: 3


Support for OpenCL 2.1, cl_khr_subgroups, or cl_intel_subgroups is required. This extension is written against revision 23 of the OpenCL 2.1 API specification, against revision 30 of the OpenCL 2.0 OpenCL C specification, against version 31 of the OpenCL 2.0 Extensions specification, and against version 3 of the cl_intel_subgroups specification.


The goal of this extension is to allow programmers to optionally specify the required subgroup size for a kernel function. This information is important for the correctness of many subgroup algorithms, and in some cases may be used by the compiler to generate more optimal code.

New API Functions


New API Enums

Accepted as the param_name parameter of clGetDeviceInfo:

CL_DEVICE_SUB_GROUP_SIZES_INTEL                 0x4108

Accepted as the param_name parameter of clGetKernelWorkGroupInfo:

CL_KERNEL_SPILL_MEM_SIZE_INTEL                  0x4109

Accepted as the param_name parameter of clGetKernelSubGroupInfo and/or clGetKernelSubGroupInfoKHR:


New OpenCL C Optional Attribute Qualifiers

Optional __kernel qualifier:


Modifications to the OpenCL API Specification

Additions to Table 4.3 - "OpenCL Device Queries"

cl_device_info Return Type Description



Returns the set of subgroup sizes supported by the device.

Additions to Table 5.21 - "clGetKernelWorkGroupInfo parameter queries":

cl_kernel_work_group_info Return Type Info. returned in param_value



Returns the amount of spill memory used by a kernel. The meaning of this value will vary from implementation-to-implementation, however a return value of 0 will always indicate that compiler was able to compile the kernel to fit into the device’s register file without spilling registers to memory.

Additions to "clGetKernelSubGroupInfo parameter queries":

This is Table 5.22 - "clGetKernelSubGroupInfo parameter queries" in the OpenCL 2.1 API spec, in Section for clGetKernelSubGroupInfoKHR in the OpenCL 2.0 Extensions spec, and in the section describing the changes to Section 5.9.3 for clGetKernelSubGroupInfoKHR in the cl_intel_subgroups spec:

cl_kernel_sub_group_info Input Type Return Type Info. returned in param_value




Returns the subgroup size specified by the __attribute__((intel_reqd_sub_group_size(<int>))) qualifier. Refer to section 6.7.2.

If the subgroup size is not specified using the above attribute qualifier then 0 is returned.

Modifications to the OpenCL C Specification

Additions to Section 6.7.2 - "Optional Attribute Qualifiers"

The optional __attribute__((intel_reqd_sub_group_size(<int>))) can be used to indicate that the kernel must be compiled and executed with the specified subgroup size. When this attribute is present, get_max_sub_group_size() is guaranteed to return the specified integer value. This is important for the correctness of many subgroup algorithms, and in some cases may be used by the compiler to generate more optimal code.

Note that there is no guarantee for the value of get_sub_group_size() even when this attribute is present, particularly when the work-group size is not evenly divisible by the required subgroup size.

Note as well that some devices may support a limited number of subgroup sizes, and that some devices may not support all language constructs with all subgroup sizes. This means that some kernels may fail compilation with one required subgroup size and succeed with another required subgroup size, even if both subgroup sizes are supported by the device.

Finally, note that requiring one subgroup size (particularly, a larger subgroup size) may require more spill memory than another subgroup size, and may negatively impact application performance."



Revision History

Rev Date Author Changes



Ben Ashbaugh

First public revision.



Ben Ashbaugh

Conversion to asciidoc.



Ben Ashbaugh

Minor formatting fixes for asciidoctor.