Description

The functionality described in this section requires support for either OpenCL C 2.0, or OpenCL C 3.0 or newer together with the __opencl_c_work_group_collective_functions feature.
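The requirement above can be tested at compile time. The following is a minimal sketch using only the preprocessor, so it is valid both in OpenCL C and in host C; the macro name WORK_GROUP_COLLECTIVES_AVAILABLE is a hypothetical name chosen for illustration and is not defined by OpenCL.

```c
/* Sketch: WORK_GROUP_COLLECTIVES_AVAILABLE is a hypothetical, user-defined name.
 * __OPENCL_C_VERSION__ is 200 for OpenCL C 2.0 and 300 for OpenCL C 3.0. */
#if defined(__OPENCL_C_VERSION__) && \
    (__OPENCL_C_VERSION__ == 200 || \
     (__OPENCL_C_VERSION__ >= 300 && \
      defined(__opencl_c_work_group_collective_functions)))
#define WORK_GROUP_COLLECTIVES_AVAILABLE 1
#else
#define WORK_GROUP_COLLECTIVES_AVAILABLE 0
#endif
```

When this fragment is compiled by a host C compiler, __OPENCL_C_VERSION__ is not defined and the macro evaluates to 0.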

This section describes built-in functions that perform collective operations across a work-group. These built-in functions must be encountered by all work-items in a work-group executing the kernel. We use the generic type name gentype to indicate the built-in data types half [1], int, uint, long [2], ulong, float or double [3] as the type for the arguments.

Table 1. Built-in Work-group Collective Functions

Function

Description

int work_group_all(int predicate)

Evaluates predicate for all work-items in the work-group and returns a non-zero value if predicate evaluates to non-zero for all work-items in the work-group.

int work_group_any(int predicate)

Evaluates predicate for all work-items in the work-group and returns a non-zero value if predicate evaluates to non-zero for any work-item in the work-group.

gentype work_group_broadcast(gentype a, size_t local_id)
gentype work_group_broadcast(gentype a, size_t local_id_x, size_t local_id_y)
gentype work_group_broadcast(gentype a, size_t local_id_x, size_t local_id_y, size_t local_id_z)

Broadcasts the value of a from the work-item identified by local_id to all work-items in the work-group.

Behavior is undefined when the value of local_id is not equivalent for all work-items in the work-group.

Behavior is undefined when local_id is greater than or equal to the work-group size in the corresponding dimension.

gentype work_group_reduce_<op>(gentype x)

Returns the result of the reduction operation specified by <op> over the values of x supplied by all work-items in the work-group.

gentype work_group_scan_exclusive_<op>(gentype x)

Performs the exclusive scan operation specified by <op> over the values of x supplied by the work-items in the work-group. Each work-item receives its own scan result.

The scan order is defined by increasing 1D linear global ID within the work-group.

gentype work_group_scan_inclusive_<op>(gentype x)

Performs the inclusive scan operation specified by <op> over the values of x supplied by the work-items in the work-group. Each work-item receives its own scan result.

The scan order is defined by increasing 1D linear global ID within the work-group.

The <op> in work_group_reduce_<op>, work_group_scan_exclusive_<op> and work_group_scan_inclusive_<op> defines the operator and can be add, min or max.
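The predicate, broadcast and reduction built-ins listed in the table can be modeled sequentially on the host. The sketch below is an illustration only, assuming int data and a 1D work-group of n work-items; every ref_* name is hypothetical, and element i of each array stands for the argument supplied by work-item i.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side reference models (names are not part of OpenCL).
 * Element i of each array is the argument supplied by work-item i. */

/* Non-zero only if every work-item passed a non-zero predicate. */
int ref_work_group_all(const int *predicate, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (predicate[i] == 0)
            return 0;
    return 1;
}

/* Non-zero if at least one work-item passed a non-zero predicate. */
int ref_work_group_any(const int *predicate, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (predicate[i] != 0)
            return 1;
    return 0;
}

/* 1D broadcast: every work-item receives the value held by work-item local_id. */
int ref_work_group_broadcast(const int *a, size_t n, size_t local_id)
{
    assert(local_id < n); /* an out-of-range ID is undefined in OpenCL C */
    return a[local_id];
}

/* Reduction with <op> = add: the sum of all contributed values. */
int ref_work_group_reduce_add(const int *x, size_t n)
{
    int acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i];
    return acc;
}

/* Reduction with <op> = min: the smallest contributed value. */
int ref_work_group_reduce_min(const int *x, size_t n)
{
    assert(n > 0);
    int acc = x[0];
    for (size_t i = 1; i < n; ++i)
        if (x[i] < acc)
            acc = x[i];
    return acc;
}

/* Reduction with <op> = max: the largest contributed value. */
int ref_work_group_reduce_max(const int *x, size_t n)
{
    assert(n > 0);
    int acc = x[0];
    for (size_t i = 1; i < n; ++i)
        if (x[i] > acc)
            acc = x[i];
    return acc;
}
```

In the real built-ins every work-item in the work-group receives the same return value from the predicate, broadcast and reduction functions; the model returns that single shared value.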

The inclusive scan operation takes a binary operator op with n (where n is the size of the work-group) elements [a_0, a_1, … a_(n-1)] and returns [a_0, (a_0 op a_1), … (a_0 op a_1 op … op a_(n-1))].

Consider the following example:

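void foo(int *p)
{
    // ...
    // Each work-item contributes the element of p at its own local ID
    // and receives the running sum up to and including that element.
    int prefix_sum_val = work_group_scan_inclusive_add(
                            p[get_local_id(0)]);
}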

For the example above, assume that the work-group size is 8 and that p points to the elements [3 1 7 0 4 1 6 3]. Work-item 0 calls work_group_scan_inclusive_add with 3 and receives 3. Work-item 1 calls work_group_scan_inclusive_add with 1 and receives 4. The full set of values returned by work_group_scan_inclusive_add for work-items 0 … 7 is [3 4 11 11 15 16 22 25].
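The worked example can be checked with a sequential model of the inclusive add scan. This is a minimal host-side sketch, not how an implementation computes the scan; the name ref_scan_inclusive_add is hypothetical, and in[i] stands for the value supplied by work-item i.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model: out[i] = in[0] + in[1] + ... + in[i],
 * i.e. the value work-item i would receive from the inclusive add scan. */
void ref_scan_inclusive_add(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    int running = 0;
    for (size_t i = 0; i < n; ++i) {
        running += in[i];   /* include the current element first */
        out[i] = running;   /* then publish the running total */
    }
}
```

Calling ref_scan_inclusive_add on [3 1 7 0 4 1 6 3] fills out with [3 4 11 11 15 16 22 25], matching the values listed above.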

The exclusive scan operation takes a binary associative operator op with an identity I and n (where n is the size of the work-group) elements [a_0, a_1, … a_(n-1)] and returns [I, a_0, (a_0 op a_1), … (a_0 op a_1 op … op a_(n-2))].

If op = add, the identity I is 0. If op = min, the identity I is INT_MAX, UINT_MAX, LONG_MAX and ULONG_MAX for the int, uint, long and ulong types respectively, and +INF for floating-point types. Similarly, if op = max, the identity I is INT_MIN, 0, LONG_MIN and 0 for the int, uint, long and ulong types respectively, and -INF for floating-point types.

For the example above, the exclusive scan add operation on the ordered set [3 1 7 0 4 1 6 3] would return [0 3 4 11 11 15 16 22].
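The role of the identity in the exclusive scan can be illustrated with a sequential host-side model. This is a sketch under the assumption of int data only; the ref_op enum and the ref_* function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical selector for the <op> part of the built-in name. */
typedef enum { REF_OP_ADD, REF_OP_MIN, REF_OP_MAX } ref_op;

/* Identity I for int: 0 for add, INT_MAX for min, INT_MIN for max. */
int ref_identity(ref_op op)
{
    if (op == REF_OP_MIN)
        return INT_MAX;
    if (op == REF_OP_MAX)
        return INT_MIN;
    return 0;
}

/* Apply the binary operator selected by op. */
int ref_apply(ref_op op, int a, int b)
{
    if (op == REF_OP_MIN)
        return a < b ? a : b;
    if (op == REF_OP_MAX)
        return a > b ? a : b;
    return a + b;
}

/* out[0] = I and out[i] = in[0] op ... op in[i-1] for i > 0. */
void ref_scan_exclusive(ref_op op, const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);
    int running = ref_identity(op);
    for (size_t i = 0; i < n; ++i) {
        out[i] = running;                     /* excludes the current element */
        running = ref_apply(op, running, in[i]);
    }
}
```

For the input [3 1 7 0 4 1 6 3], the add variant produces [0 3 4 11 11 15 16 22], while the min and max variants start with INT_MAX and INT_MIN respectively because work-item 0 has no preceding elements.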

The order of floating-point operations is not guaranteed for the work_group_reduce_<op>, work_group_scan_inclusive_<op> and work_group_scan_exclusive_<op> built-in functions that operate on half, float and double data types. The order of these floating-point operations is also non-deterministic for a given work-group.
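The practical consequence of an unspecified operation order is that floating-point results may differ from a sequential sum, because floating-point addition is not associative. The host-side sketch below illustrates this by adding the same values in two orders; it assumes IEEE 754 single-precision float with round-to-nearest, and the ref_sum_* names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helpers that sum the same values in two different orders.
 * The volatile accumulator forces each partial sum to be rounded to float. */
float ref_sum_forward(const float *x, size_t n)
{
    volatile float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc = acc + x[i];
    return acc;
}

float ref_sum_backward(const float *x, size_t n)
{
    volatile float acc = 0.0f;
    for (size_t i = n; i-- > 0; )
        acc = acc + x[i];
    return acc;
}
```

With the values {1.0e8f, -1.0e8f, 1.0f}, the forward sum is 1.0f while the backward sum is 0.0f, because 1.0f is absorbed when it is first added to -1.0e8f. Code that relies on the floating-point reduction and scan built-ins should therefore tolerate small differences between runs and between implementations.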

Document Notes

For more information, see the OpenCL C Specification

This page is extracted from the OpenCL C Specification. Fixes and changes should be made to the Specification, not directly.

Copyright 2014-2023 The Khronos Group Inc.

SPDX-License-Identifier: CC-BY-4.0


1. Only if the cl_khr_fp16 extension is supported and has been enabled.
2. Only if 64-bit integers are supported. In OpenCL C 3.0 this will be indicated by the presence of the __opencl_c_int64 feature macro.
3. Only if double precision is supported. In OpenCL C 3.0 this will be indicated by the presence of the __opencl_c_fp64 feature macro.