Extension and Version Dependencies
-
This is a experimental extension and must be used with caution. See the description of experimental header files for enablement and stability details.
Other Extension Metadata
- Last Modified Date
-
2024-09-05
- IP Status
-
No known IP claims.
- Contributors
-
-
Ewan Crawford, Codeplay Software Ltd.
-
Gordon Brown, Codeplay Software Ltd.
-
Kenneth Benzie, Codeplay Software Ltd.
-
Alastair Murray, Codeplay Software Ltd.
-
Jack Frankland, Codeplay Software Ltd.
-
Balaji Calidas, Qualcomm Technologies Inc.
-
Joshua Kelly, Qualcomm Technologies, Inc.
-
Kevin Petit, Arm Ltd.
-
Aharon Abramson, Intel.
-
Ben Ashbaugh, Intel.
-
Boaz Ouriel, Intel.
-
Pekka Jääskeläinen, Tampere University
-
Jan Solanti, Tampere University
-
Nikhil Joshi, NVIDIA
-
James Price, Google
-
Description
The cl_khr_ extension separates command construction
from enqueue by providing a mechanism to record a set of commands which can
then be repeatedly enqueued.
However, the commands recorded to the command-buffer are immutable between
enqueues.
cl_khr_ removes this restriction.
In particular, this extension allows the configuration of a kernel execution
command in a command-buffer, called a mutable-dispatch, to be modified.
This allows inputs and outputs to the kernel, as well as work-item sizes and
offsets, to change without having to re-record the entire command sequence
in a new command-buffer.
Interactions With Other Extensions
The clUpdateMutableCommandsKHR entry-point has been designed for the purpose
of allowing expansion of mutable functionality in future extensions layered on
top of cl_khr_command_buffer_mutable_dispatch.
A new extension can define its own structure type to specify the update
configuration it requires, with a matching
cl_command_ value. This new structure type can
then be passed to clUpdateMutableCommandsKHR where it is reinterpreted from a
void pointer using cl_command_.
New Types
-
cl_mutable_dispatch_ fields_ khr -
cl_mutable_command_ info_ khr -
cl_command_buffer_ update_ type_ khr -
cl_mutable_dispatch_ asserts_ khr -
cl_mutable_dispatch_ config_ khr -
cl_mutable_dispatch_ exec_ info_ khr -
cl_mutable_dispatch_ arg_ khr
New Enums
-
cl_device_info -
CL_DEVICE_MUTABLE_ DISPATCH_ CAPABILITIES_ KHR
-
-
cl_command_properties_ khr -
CL_MUTABLE_DISPATCH_ ASSERTS_ KHR -
CL_MUTABLE_DISPATCH_ UPDATABLE_ FIELDS_ KHR
-
-
cl_mutable_dispatch_ asserts_ khr -
CL_MUTABLE_DISPATCH_ ASSERT_ NO_ ADDITIONAL_ WORK_ GROUPS_ KHR
-
-
cl_mutable_dispatch_ fields_ khr -
CL_MUTABLE_DISPATCH_ GLOBAL_ OFFSET_ KHR -
CL_MUTABLE_DISPATCH_ GLOBAL_ SIZE_ KHR -
CL_MUTABLE_DISPATCH_ LOCAL_ SIZE_ KHR -
CL_MUTABLE_DISPATCH_ ARGUMENTS_ KHR -
CL_MUTABLE_DISPATCH_ EXEC_ INFO_ KHR
-
-
cl_mutable_command_ info_ khr -
CL_MUTABLE_COMMAND_ COMMAND_ QUEUE_ KHR -
CL_MUTABLE_COMMAND_ COMMAND_ BUFFER_ KHR -
CL_MUTABLE_COMMAND_ PROPERTIES_ ARRAY_ KHR -
CL_MUTABLE_DISPATCH_ KERNEL_ KHR -
CL_MUTABLE_DISPATCH_ DIMENSIONS_ KHR -
CL_MUTABLE_DISPATCH_ GLOBAL_ WORK_ OFFSET_ KHR -
CL_MUTABLE_DISPATCH_ GLOBAL_ WORK_ SIZE_ KHR -
CL_MUTABLE_DISPATCH_ LOCAL_ WORK_ SIZE_ KHR -
CL_MUTABLE_COMMAND_ COMMAND_ TYPE_ KHR
-
-
cl_command_buffer_ flags_ khr -
CL_COMMAND_BUFFER_ MUTABLE_ KHR
-
-
cl_command_buffer_ properties_ khr -
CL_COMMAND_BUFFER_ MUTABLE_ DISPATCH_ ASSERTS_ KHR
-
-
cl_command_buffer_ update_ type_ khr -
CL_STRUCTURE_TYPE_ MUTABLE_ DISPATCH_ CONFIG_ KHR
-
-
New Error Codes
-
CL_INVALID_MUTABLE_ COMMAND_ KHR
-
Sample Code
Sample Application Updating the Arguments to a Mutable-dispatch Between Command-buffer Submissions
#define CL_CHECK(ERROR) \
if (ERROR) { \
std::cerr << "OpenCL error: " << ERROR << "\n"; \
return ERROR; \
}
int main() {
cl_platform_id platform;
CL_CHECK(clGetPlatformIDs(1, &platform, nullptr));
cl_device_id device;
CL_CHECK(clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 1, &device, nullptr));
cl_mutable_dispatch_fields_khr mutable_capabilities;
CL_CHECK(clGetDeviceInfo(device, CL_DEVICE_MUTABLE_DISPATCH_CAPABILITIES_KHR,
sizeof(mutable_capabilities), &mutable_capabilities,
nullptr));
if (!(mutable_capabilities & CL_MUTABLE_DISPATCH_ARGUMENTS_KHR)) {
std::cerr
<< "Device does not support update arguments to a mutable-dispatch, "
"skipping example.\n";
return 0;
}
cl_int error;
cl_context context =
clCreateContext(nullptr, 1, &device, nullptr, nullptr, &error);
CL_CHECK(error);
const char* code = R"OpenCLC(
kernel void vector_addition(global int* tile1, global int* tile2,
global int* res) {
size_t index = get_global_id(0);
res[index] = tile1[index] + tile2[index];
}
)OpenCLC";
const size_t length = std::strlen(code);
cl_program program =
clCreateProgramWithSource(context, 1, &code, &length, &error);
CL_CHECK(error);
CL_CHECK(clBuildProgram(program, 1, &device, nullptr, nullptr, nullptr));
cl_kernel kernel = clCreateKernel(program, "vector_addition", &error);
CL_CHECK(error);
// Set the parameters of the frames
constexpr size_t iterations = 60;
constexpr size_t elem_size = sizeof(cl_int);
constexpr size_t frame_width = 32;
constexpr size_t frame_count = frame_width * frame_width;
constexpr size_t frame_size = frame_count * elem_size;
cl_mem input_A_buffers[2] = {nullptr, nullptr};
cl_mem input_B_buffers[2] = {nullptr, nullptr};
cl_mem output_buffers[2] = {nullptr, nullptr};
// Create the buffer to swap between even and odd kernel iterations
for (size_t i = 0; i < 2; i++) {
input_A_buffers[i] =
clCreateBuffer(context, CL_MEM_READ_ONLY, frame_size, nullptr, &error);
CL_CHECK(error);
input_B_buffers[i] =
clCreateBuffer(context, CL_MEM_READ_ONLY, frame_size, nullptr, &error);
CL_CHECK(error);
output_buffers[i] =
clCreateBuffer(context, CL_MEM_WRITE_ONLY, frame_size, nullptr, &error);
CL_CHECK(error);
}
cl_command_queue command_queue =
clCreateCommandQueue(context, device, 0, &error);
CL_CHECK(error);
// Create command-buffer with mutable flag so we can update it
cl_command_buffer_properties_khr properties[3] = {
CL_COMMAND_BUFFER_FLAGS_KHR, CL_COMMAND_BUFFER_MUTABLE_KHR, 0};
cl_command_buffer_khr command_buffer =
clCreateCommandBufferKHR(1, &command_queue, properties, &error);
CL_CHECK(error);
CL_CHECK(clSetKernelArg(kernel, 0, sizeof(cl_mem), &input_A_buffers[0]));
CL_CHECK(clSetKernelArg(kernel, 1, sizeof(cl_mem), &input_B_buffers[0]));
CL_CHECK(clSetKernelArg(kernel, 2, sizeof(cl_mem), &output_buffers[0]));
// Instruct the nd-range command to allow for mutable kernel arguments
cl_command_properties_khr mutable_properties[] = {
CL_MUTABLE_DISPATCH_UPDATABLE_FIELDS_KHR,
CL_MUTABLE_DISPATCH_ARGUMENTS_KHR, 0};
// Create command handle for mutating nd-range command
cl_mutable_command_khr command_handle = nullptr;
// Add the nd-range kernel command
error = clCommandNDRangeKernelKHR(
command_buffer, command_queue, mutable_properties, kernel, 1, nullptr,
&frame_count, nullptr, 0, nullptr, nullptr, &command_handle);
CL_CHECK(error);
CL_CHECK(clFinalizeCommandBufferKHR(command_buffer));
// Prepare for random input generation
std::random_device random_device;
std::mt19937 random_engine{random_device()};
std::uniform_int_distribution<cl_int> random_distribution{
std::numeric_limits<cl_int>::min() / 2,
std::numeric_limits<cl_int>::max() / 2};
// Iterate over each frame
for (size_t i = 0; i < iterations; i++) {
// Set the buffers for the current frame
cl_mem input_A_buffer = input_A_buffers[i % 2];
cl_mem input_B_buffer = input_B_buffers[i % 2];
cl_mem output_buffer = output_buffers[i % 2];
// Generate input A data
std::vector<cl_int> input_a(frame_count);
std::generate(std::begin(input_a), std::end(input_a),
[&]() { return random_distribution(random_engine); });
// Write the generated data to the input A buffer
error =
clEnqueueWriteBuffer(command_queue, input_A_buffer, CL_FALSE, 0,
frame_size, input_a.data(), 0, nullptr, nullptr);
CL_CHECK(error);
// Generate input B data
std::vector<cl_int> input_b(frame_count);
std::generate(std::begin(input_b), std::end(input_b),
[&]() { return random_distribution(random_engine); });
// Write the generated data to the input B buffer
error =
clEnqueueWriteBuffer(command_queue, input_B_buffer, CL_FALSE, 0,
frame_size, input_b.data(), 0, nullptr, nullptr);
CL_CHECK(error);
// If not executing the first frame
if (i != 0) {
// Configure the mutable configuration to update the kernel arguments
cl_mutable_dispatch_arg_khr arg_0{0, sizeof(cl_mem), &input_A_buffer};
cl_mutable_dispatch_arg_khr arg_1{1, sizeof(cl_mem), &input_B_buffer};
cl_mutable_dispatch_arg_khr arg_2{2, sizeof(cl_mem), &output_buffer};
cl_mutable_dispatch_arg_khr args[] = {arg_0, arg_1, arg_2};
cl_mutable_dispatch_config_khr dispatch_config{
command_handle,
3 /* num_args */,
0 /* num_svm_arg */,
0 /* num_exec_infos */,
0 /* work_dim - 0 means no change to dimensions */,
args /* arg_list */,
nullptr /* arg_svm_list - nullptr means no change*/,
nullptr /* exec_info_list */,
nullptr /* global_work_offset */,
nullptr /* global_work_size */,
nullptr /* local_work_size */};
// Update the command buffer with the mutable configuration
cl_uint num_configs = 1;
cl_command_buffer_update_type_khr config_types[1] = {
CL_STRUCTURE_TYPE_MUTABLE_DISPATCH_CONFIG_KHR
};
const void* configs[1] = {&dispatch_config};
error = clUpdateMutableCommandsKHR(command_buffer, num_configs,
config_types, configs);
CL_CHECK(error);
}
// Enqueue the command buffer
error = clEnqueueCommandBufferKHR(0, nullptr, command_buffer, 0, nullptr,
nullptr);
CL_CHECK(error);
// Allocate memory for the output data
std::vector<cl_int> output(frame_count);
// Read the output data from the output buffer
error = clEnqueueReadBuffer(command_queue, output_buffer, CL_TRUE, 0,
frame_size, output.data(), 0, nullptr, nullptr);
CL_CHECK(error);
// Flush and execute the read buffer
error = clFinish(command_queue);
CL_CHECK(error);
// Verify the results of the frame
for (size_t i = 0; i < frame_count; ++i) {
const cl_int result = input_a[i] + input_b[i];
if (output[i] != result) {
std::cerr << "Error: Incorrect result at index " << i << " - Expected "
<< output[i] << " was " << result << std::endl;
std::exit(1);
}
}
}
std::cout << "Result verified\n";
CL_CHECK(clReleaseCommandBufferKHR(command_buffer));
for (size_t i = 0; i < 2; i++) {
CL_CHECK(clReleaseMemObject(input_A_buffers[i]));
CL_CHECK(clReleaseMemObject(input_B_buffers[i]));
CL_CHECK(clReleaseMemObject(output_buffers[i]));
}
CL_CHECK(clReleaseCommandQueue(command_queue));
CL_CHECK(clReleaseKernel(kernel));
CL_CHECK(clReleaseProgram(program));
CL_CHECK(clReleaseContext(context));
CL_CHECK(clReleaseDevice(device));
return 0;
}
Issues
-
Include simpler, more user friendly, entry-points for updating kernel arguments?
RESOLVED: Can be implemented in the ecosystem as a layer on top, if that layer proves popular then can be introduced, possibly as another extension on top.
-
Add a command-buffer clone entry-point for deep copying a command-buffer? Arguments could then be updated and both command-buffers used. Useful for techniques like double buffering.
RESOLVED: In the use-case we’re targeting a user would only have a handle to the original command-buffer, but not the clone, which may limit the usefulness of this capability. Additionally, an implementation could be complicated by non-trivial deep copying of the underlying objects contained in the command-buffer. As a result of this new entry-point being an additive change to the specification it is omitted, and if its functionality has demand later, it may be a introduced as a stand alone extension.
Version History
-
Revision 0.9.0, 2022-08-31
-
First assigned version (experimental).
-
-
Revision 0.9.1, 2023-11-07
-
Add type
cl_mutable_and its possible values (experimental).dispatch_ asserts_ khr
-
-
Revision 0.9.2, 2024-06-19
-
Change clUpdateMutableCommandsKHR API to pass configs as an array rather than linked list (experimental).
-
-
Revision 0.9.3, 2024-09-05
-
Rename
CL_MUTABLE_DISPATCH_PROPERTIES_ARRAY_KHRtoCL_MUTABLE_COMMAND_PROPERTIES_ARRAY_KHR(experimental).
-
Document Notes
For more information, see the OpenCL Specification
This page is a generated document. Fixes and changes should be made to the generator scripts, not directly.