Name Strings

cl_intel_required_subgroup_size

Contact

Ben Ashbaugh, Intel (ben 'dot' ashbaugh 'at' intel 'dot' com)

Contributors

Ben Ashbaugh, Intel

Notice

Copyright (c) 2018-2023 Intel Corporation. All rights reserved.

Status

Final Draft

Version

Built On: 2023-06-12
Revision: 3

Dependencies

Support for OpenCL 2.1, cl_khr_subgroups, or cl_intel_subgroups is required. This extension is written against revision 23 of the OpenCL 2.1 API specification, revision 30 of the OpenCL 2.0 OpenCL C specification, version 31 of the OpenCL 2.0 Extensions specification, and version 3 of the cl_intel_subgroups specification.

Overview

This extension allows programmers to optionally specify the required sub-group size for a kernel function. This information is important for the correctness of many sub-group algorithms, and in some cases the compiler may use it to generate more efficient code.

New API Functions

None.

New API Enums

Accepted as the param_name parameter of clGetDeviceInfo:

CL_DEVICE_SUB_GROUP_SIZES_INTEL                 0x4108

Accepted as the param_name parameter of clGetKernelWorkGroupInfo:

CL_KERNEL_SPILL_MEM_SIZE_INTEL                  0x4109

Accepted as the param_name parameter of clGetKernelSubGroupInfo and/or clGetKernelSubGroupInfoKHR:

CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL          0x410A

New OpenCL C Optional Attribute Qualifiers

Optional __kernel qualifier:

__attribute__((intel_reqd_sub_group_size(<int>)))

Modifications to the OpenCL API Specification

Additions to Table 4.3 - "OpenCL Device Queries"

cl_device_info | Return Type | Description
CL_DEVICE_SUB_GROUP_SIZES_INTEL | size_t[] | Returns the set of sub-group sizes supported by the device.
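As an illustration of how an application might consume this query, the sketch below picks a sub-group size from the returned array. It assumes the array has already been fetched with the usual two-call clGetDeviceInfo pattern (first with a param_value_size of 0 to learn the byte count, then again with a buffer of that size). The helper name pick_sub_group_size and its "largest size not exceeding a preference" policy are hypothetical and not part of this extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the largest entry of `sizes` that does not
 * exceed `preferred`, or 0 if no entry qualifies. `sizes` stands in for the
 * size_t[] array returned by the CL_DEVICE_SUB_GROUP_SIZES_INTEL query; the
 * array is not assumed to be sorted. */
size_t pick_sub_group_size(const size_t *sizes, size_t count, size_t preferred)
{
    assert(sizes != NULL || count == 0);
    size_t best = 0;
    for (size_t i = 0; i < count; ++i) {
        if (sizes[i] <= preferred && sizes[i] > best) {
            best = sizes[i];
        }
    }
    return best;
}
```

For example, with a made-up query result of {8, 16, 32} and a preference of 24, the helper returns 16. A return value of 0 means none of the reported sizes fits the preference.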

Additions to Table 5.21 - "clGetKernelWorkGroupInfo parameter queries":

cl_kernel_work_group_info | Return Type | Info. returned in param_value
CL_KERNEL_SPILL_MEM_SIZE_INTEL | cl_ulong | Returns the amount of spill memory used by a kernel. The meaning of this value varies from implementation to implementation; however, a return value of 0 always indicates that the compiler was able to compile the kernel to fit into the device's register file without spilling registers to memory.

Additions to "clGetKernelSubGroupInfo parameter queries":

This is Table 5.22 - "clGetKernelSubGroupInfo parameter queries" in the OpenCL 2.1 API spec, in Section 9.17.2.1 for clGetKernelSubGroupInfoKHR in the OpenCL 2.0 Extensions spec, and in the section describing the changes to Section 5.9.3 for clGetKernelSubGroupInfoKHR in the cl_intel_subgroups spec:

cl_kernel_sub_group_info | Input Type | Return Type | Info. returned in param_value
CL_KERNEL_COMPILE_SUB_GROUP_SIZE_INTEL | ignored | size_t | Returns the sub-group size specified by the __attribute__((intel_reqd_sub_group_size(<int>))) qualifier. Refer to section 6.7.2. If the sub-group size is not specified using the above attribute qualifier then 0 is returned.

Modifications to the OpenCL C Specification

Additions to Section 6.7.2 - "Optional Attribute Qualifiers"

The optional __attribute__((intel_reqd_sub_group_size(<int>))) can be used to indicate that the kernel must be compiled and executed with the specified sub-group size. When this attribute is present, get_max_sub_group_size() is guaranteed to return the specified integer value. This is important for the correctness of many sub-group algorithms, and in some cases the compiler may use it to generate more efficient code.
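A minimal sketch of how the attribute might be applied in practice is shown here. The kernel source is held in a host-side C string, and the required size is supplied through a preprocessor macro so that it can be chosen at run time. The macro name SG_SIZE, the kernel name, and the helper format_sub_group_option are all hypothetical. The resulting string is intended to be passed as the options argument of clBuildProgram. Depending on which sub-group extension is in use, the kernel source may also need an enabling pragma, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* OpenCL C source for a hypothetical kernel. SG_SIZE is an illustrative
 * macro name that must be defined when the program is built. */
const char *const kernel_source =
    "__attribute__((intel_reqd_sub_group_size(SG_SIZE)))\n"
    "__kernel void write_max_sub_group_size(__global uint *out)\n"
    "{\n"
    "    /* With the attribute present, this equals SG_SIZE. */\n"
    "    out[get_global_id(0)] = get_max_sub_group_size();\n"
    "}\n";

/* Hypothetical helper: writes "-DSG_SIZE=<size>" into buf.
 * Returns 0 on success, -1 if the buffer is too small or formatting fails. */
int format_sub_group_option(char *buf, size_t buf_size, size_t sub_group_size)
{
    assert(buf != NULL && buf_size > 0);
    int n = snprintf(buf, buf_size, "-DSG_SIZE=%zu", sub_group_size);
    return (n < 0 || (size_t)n >= buf_size) ? -1 : 0;
}
```

Passing the size as a build option keeps a single kernel source usable across devices that report different supported sub-group sizes.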

Note that there is no guarantee about the value returned by get_sub_group_size() even when this attribute is present, particularly when the work-group size is not evenly divisible by the required sub-group size.
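The arithmetic behind this caveat can be sketched as follows. The example assumes that an implementation packs work-items into full sub-groups and leaves only the final sub-group partial; this packing is an assumption made for illustration, not something this extension guarantees. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of sub-groups needed to cover a work-group (ceiling division). */
size_t sub_group_count(size_t work_group_size, size_t required_size)
{
    assert(required_size > 0);
    return (work_group_size + required_size - 1) / required_size;
}

/* Size of the final sub-group, assuming only the last one may be partial. */
size_t last_sub_group_size(size_t work_group_size, size_t required_size)
{
    assert(required_size > 0);
    size_t rem = work_group_size % required_size;
    return (rem == 0 && work_group_size > 0) ? required_size : rem;
}
```

Under that assumption, a work-group of 40 work-items with a required sub-group size of 16 yields three sub-groups, the last of which holds only 8 work-items. A kernel that relies on get_sub_group_size() being 16 would therefore misbehave in that final sub-group.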

Note as well that some devices may support a limited number of sub-group sizes, and that some devices may not support all language constructs with all sub-group sizes. This means that some kernels may fail compilation with one required sub-group size and succeed with another required sub-group size, even if both sub-group sizes are supported by the device.

Finally, note that requiring one sub-group size (particularly, a larger sub-group size) may require more spill memory than another sub-group size, and may negatively impact application performance.
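One possible way to act on this trade-off is sketched below. It assumes the application has built the same kernel once per candidate sub-group size and recorded the CL_KERNEL_SPILL_MEM_SIZE_INTEL value reported by clGetKernelWorkGroupInfo for each build. The struct sg_candidate, the function name, and the selection policy are hypothetical. The sketch relies only on the one meaning the query guarantees, namely that 0 indicates no spilling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one build of the kernel. */
typedef struct {
    size_t   sub_group_size; /* value given to intel_reqd_sub_group_size */
    uint64_t spill_mem;      /* CL_KERNEL_SPILL_MEM_SIZE_INTEL result */
} sg_candidate;

/* Returns the largest candidate sub-group size whose build reported no
 * spill memory, or 0 if every candidate spills. */
size_t largest_non_spilling_size(const sg_candidate *c, size_t count)
{
    assert(c != NULL || count == 0);
    size_t best = 0;
    for (size_t i = 0; i < count; ++i) {
        if (c[i].spill_mem == 0 && c[i].sub_group_size > best) {
            best = c[i].sub_group_size;
        }
    }
    return best;
}
```

Whether the largest non-spilling size is actually the fastest depends on the kernel and the device, so a choice made this way is best confirmed by measurement.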

Issues

None.

Revision History

Rev | Date | Author | Changes
1 | 2016-07-14 | Ben Ashbaugh | First public revision.
2 | 2018-11-15 | Ben Ashbaugh | Conversion to asciidoc.
3 | 2019-09-17 | Ben Ashbaugh | Minor formatting fixes for asciidoctor.