Description
The functionality described in this section requires
support for
the cl_khr_ extension macro; or for
OpenCL C 3.0 or newer and the __opencl_c_ feature.
The following table describes OpenCL C
programming language built-in functions that operate on a sub-group level.
These built-in functions must be encountered by all work-items in the
sub-group executing the kernel.
For the functions below, the generic type name gentype may be one of the
supported built-in scalar data types int, uint, long [1], ulong, half [2],
float, and double [3].
If the cl_khr_ extension is supported, the
generic type name gentype may additionally be char, uchar, short, and
ushort.
For the sub_group_broadcast function, gentype may additionally be one of
the supported built-in vector data types charn, ucharn, shortn, ushortn,
intn, uintn, longn, ulongn, floatn, halfn [4], or doublen [5].
| Function | Description |
|---|---|
| int sub_group_all (int predicate) | Evaluates predicate for all work-items in the sub-group and returns a non-zero value if predicate evaluates to non-zero for all work-items in the sub-group. |
| int sub_group_any (int predicate) | Evaluates predicate for all work-items in the sub-group and returns a non-zero value if predicate evaluates to non-zero for any work-item in the sub-group. |
| gentype sub_group_broadcast ( | Broadcasts the value of x from the work-item identified by sub_group_local_id (the value returned by get_sub_group_local_id) to all work-items in the sub-group. Behavior is undefined when the value of sub_group_local_id is not equivalent for all work-items in the sub-group. Behavior is undefined when sub_group_local_id is greater than or equal to the sub-group size. |
| gentype sub_group_reduce_<op> ( | Returns the result of the reduction operation specified by <op> for all values of x specified by work-items in a sub-group. |
| gentype sub_group_scan_exclusive_<op> ( | Performs an exclusive scan operation specified by <op> of all values specified by work-items in a sub-group. The scan results are returned for each work-item. The scan order is defined by increasing sub-group local ID within the sub-group. |
| gentype sub_group_scan_inclusive_<op> ( | Performs an inclusive scan operation specified by <op> of all values specified by work-items in a sub-group. The scan results are returned for each work-item. The scan order is defined by increasing sub-group local ID within the sub-group. |
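
The vote and broadcast rows above can be illustrated with a small host-side model in plain C. This is only a sketch of the described semantics, not the OpenCL C built-ins themselves: the `model_*` names are hypothetical, each array element stands for the argument supplied by one work-item, and the model returns exactly 1 where the built-ins are only required to return some non-zero value.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model (hypothetical names): element i of each array is the
 * value supplied by the work-item whose sub-group local ID is i. */

/* Models sub_group_all: non-zero only if every predicate is non-zero. */
int model_sub_group_all(const int *predicate, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (predicate[i] == 0) return 0;
    }
    return 1;
}

/* Models sub_group_any: non-zero if at least one predicate is non-zero. */
int model_sub_group_any(const int *predicate, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (predicate[i] != 0) return 1;
    }
    return 0;
}

/* Models sub_group_broadcast for int: every work-item receives x[id]. */
void model_sub_group_broadcast(const int *x, size_t n, size_t id, int *out) {
    /* The real built-in is undefined when id >= sub-group size;
     * this model simply refuses such input. */
    assert(id < n);
    for (size_t i = 0; i < n; ++i) {
        out[i] = x[id];
    }
}
```

For a modelled sub-group of four work-items with predicates {1, 0, 5, 1}, the "all" model yields 0 and the "any" model yields 1; broadcasting from local ID 2 gives every work-item the third input value.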
The <op> in sub_group_reduce_<op>, sub_group_scan_inclusive_<op> and sub_group_scan_exclusive_<op> defines the operator and can be add, min or max.
The exclusive scan operation takes a binary operator op with an identity I and n (where n is the size of the sub-group) elements [a_0, a_1, …, a_{n-1}] and returns [I, a_0, (a_0 op a_1), …, (a_0 op a_1 op … op a_{n-2})].
The inclusive scan operation takes a binary operator op with an identity I and n (where n is the size of the sub-group) elements [a_0, a_1, …, a_{n-1}] and returns [a_0, (a_0 op a_1), …, (a_0 op a_1 op … op a_{n-1})].
If op = add, the identity I is 0.
If op = min, the identity I is INT_MAX, UINT_MAX, LONG_MAX, and ULONG_MAX for the int, uint, long, and ulong types respectively, and +INF for floating-point types.
Similarly, if op = max, the identity I is INT_MIN, 0, LONG_MIN, and 0 for the int, uint, long, and ulong types respectively, and -INF for floating-point types.
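
The reduction and scan definitions above can be written out as a sequential reference model in plain C. This is a sketch for the int type only, with hypothetical `model_*` names; a real implementation evaluates the operation across the work-items of a sub-group rather than in a loop, but for integer add, min and max the values should match this model.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical operator tag standing in for the <op> suffix. */
typedef enum { MODEL_OP_ADD, MODEL_OP_MIN, MODEL_OP_MAX } model_op;

/* Applies the binary operator selected by op. */
int model_apply(model_op op, int a, int b) {
    switch (op) {
    case MODEL_OP_ADD: return a + b;
    case MODEL_OP_MIN: return a < b ? a : b;
    default:           return a > b ? a : b;
    }
}

/* Identity I for the int type: 0 for add, INT_MAX for min, INT_MIN for max. */
int model_identity(model_op op) {
    switch (op) {
    case MODEL_OP_ADD: return 0;
    case MODEL_OP_MIN: return INT_MAX;
    default:           return INT_MIN;
    }
}

/* Models sub_group_reduce_<op>: a_0 op a_1 op ... op a_{n-1}. */
int model_reduce(model_op op, const int *a, size_t n) {
    int acc = model_identity(op);
    for (size_t i = 0; i < n; ++i) acc = model_apply(op, acc, a[i]);
    return acc;
}

/* Models sub_group_scan_exclusive_<op>: out[i] combines a_0 .. a_{i-1},
 * so out[0] is the identity. */
void model_scan_exclusive(model_op op, const int *a, size_t n, int *out) {
    assert(a != NULL && out != NULL);
    int acc = model_identity(op);
    for (size_t i = 0; i < n; ++i) {
        out[i] = acc;                      /* result excludes a[i] */
        acc = model_apply(op, acc, a[i]);
    }
}

/* Models sub_group_scan_inclusive_<op>: out[i] combines a_0 .. a_i. */
void model_scan_inclusive(model_op op, const int *a, size_t n, int *out) {
    assert(a != NULL && out != NULL);
    int acc = model_identity(op);
    for (size_t i = 0; i < n; ++i) {
        acc = model_apply(op, acc, a[i]);  /* result includes a[i] */
        out[i] = acc;
    }
}
```

With inputs {3, 1, 4, 1} in increasing sub-group local ID order, the add model gives an exclusive scan of {0, 3, 4, 8}, an inclusive scan of {3, 4, 8, 9}, and a reduction of 9.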
The order of floating-point operations is not guaranteed for the sub_group_reduce_<op>, sub_group_scan_inclusive_<op> and sub_group_scan_exclusive_<op> built-in functions that operate on floating-point data types.
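
The ordering caveat matters because floating-point addition is not associative. The plain C sketch below sums the same four float values in two different association orders and gets two different answers. The function names are hypothetical, the code assumes IEEE 754 binary32 floats with round-to-nearest, and neither order is claimed to be the one any particular OpenCL implementation uses.

```c
#include <assert.h>
#include <stddef.h>

/* Sums strictly left to right: ((a_0 + a_1) + a_2) + ...
 * volatile forces each partial sum to be rounded to float. */
float sum_sequential(const float *a, size_t n) {
    assert(a != NULL || n == 0);
    volatile float acc = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        acc = acc + a[i];
    }
    return acc;
}

/* Sums as a balanced tree: (first half) + (second half), recursively. */
float sum_tree(const float *a, size_t n) {
    if (n == 0) return 0.0f;
    if (n == 1) return a[0];
    size_t half = n / 2;
    volatile float left = sum_tree(a, half);
    volatile float right = sum_tree(a + half, n - half);
    volatile float total = left + right;
    return total;
}
```

For the inputs {1e8f, 1.0f, -1e8f, 1.0f} the exact sum is 2, yet the left-to-right order produces 1.0f and the tree order produces 0.0f, because adding 1.0f to a value of magnitude 1e8 is lost to rounding. Code that needs reproducible floating-point results should therefore not rely on a particular reduction or scan order.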
The functionality described in the following table requires support for
the cl_khr_ extension macro; or for
OpenCL C 3.0 or newer and the __opencl_c_ and __opencl_c_
features.
The following table describes built-in pipe
functions that operate at a sub-group level.
These built-in functions must be encountered by all work-items in a sub-group
executing the kernel with the same argument values, otherwise the behavior
is undefined.
We use the generic type name gentype to indicate that the built-in OpenCL C
scalar or vector integer or floating-point data types, or any user-defined
type built from these scalar and vector data types, can be used as the type
for the arguments to the pipe functions listed in table 6.29.
| Function | Description |
|---|---|
| reserve_id_t sub_group_reserve_read_pipe ( reserve_id_t sub_group_reserve_write_pipe ( | Reserve num_packets entries for reading from or writing to pipe. Returns a valid non-zero reservation ID if the reservation is successful and 0 otherwise. The reserved pipe entries are referred to by indices that go from 0 … num_packets - 1. |
| void sub_group_commit_read_pipe ( void sub_group_commit_write_pipe ( | Indicates that all reads and writes to num_packets associated with reservation reserve_id are completed. |
Note: Reservations made by a sub-group are ordered in the pipe as they are ordered in the program. Reservations made by different sub-groups that belong to the same work-group can be ordered using sub-group synchronization. The order of sub-group based reservations that belong to different work-groups is implementation-defined.
The functionality described in the following table requires support for
the cl_khr_ extension macro; or for
OpenCL C 3.0 or newer and the __opencl_c_ and
__opencl_c_ features.
The following table describes built-in functions to query sub-group information for a block to be enqueued.
| Built-in Function | Description |
|---|---|
| uint get_kernel_sub_group_count_for_ndrange ( uint get_kernel_sub_group_count_for_ndrange ( | Returns the number of sub-groups in each work-group of the dispatch (except for the last in cases where the global size does not divide cleanly into work-groups) given the combination of the passed ndrange and block. block specifies the block to be enqueued. |
| uint get_kernel_max_sub_group_size_for_ndrange ( uint get_kernel_max_sub_group_size_for_ndrange ( | Returns the maximum sub-group size for a block. |
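
The "except for the last" wording can be made concrete with a little arithmetic in plain C. This is a sketch under a loudly stated assumption: it supposes that a work-group is split into sub-groups of the maximum size, with only the final sub-group possibly smaller. The actual mapping is chosen by the implementation, so real code should use the values returned by the query functions; the helper names here are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: number of sub-groups in one work-group, ASSUMING
 * every sub-group except possibly the last holds max_sub_group_size
 * work-items. Real implementations may choose differently. */
unsigned model_sub_group_count(unsigned work_group_size,
                               unsigned max_sub_group_size) {
    assert(max_sub_group_size > 0);
    /* Integer ceiling division. */
    return (work_group_size + max_sub_group_size - 1) / max_sub_group_size;
}

/* Hypothetical helper: size of the last work-group in a 1-D dispatch
 * whose global size need not be a multiple of the local size. */
unsigned model_last_work_group_size(unsigned global_size,
                                    unsigned local_size) {
    assert(global_size > 0 && local_size > 0);
    unsigned remainder = global_size % local_size;
    return remainder == 0 ? local_size : remainder;
}
```

Under that assumption, a dispatch with global size 900, local size 256, and a maximum sub-group size of 32 has three full work-groups of 8 sub-groups each, while the last work-group holds the remaining 132 work-items and so has only 5 sub-groups.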
Document Notes
For more information, see the OpenCL C Specification.
This page is extracted from the OpenCL C Specification. Fixes and changes should be made to the Specification, not directly.