Contributors
Ben Ashbaugh, Intel
Maciej Dziuban, Intel
Filip Hazubski, Intel
Dariusz Mroz, Intel
Michal Mrozek, Intel
Dependencies
This extension is written against the OpenCL API Specification Version 3.0.5.
Because this extension adds to the command-queue properties accepted by the clCreateCommandQueueWithProperties API, this extension requires support for either OpenCL 2.0 or the cl_khr_create_command_queue extension.
Overview
Some OpenCL devices may support different sets of command-queues with different capabilities or execution properties. These sets are described in this extension as command-queue families. Applications may be able to improve performance or predictability by creating command-queues from a specific command-queue family.
This extension adds the ability to:
- Query the command-queue families supported by an OpenCL device and their capabilities.
- Create an OpenCL command-queue from a specific command-queue family.
- Query the command-queue family and command-queue index associated with an OpenCL command-queue.
New API Enums
Accepted value for the param_name parameter to clGetDeviceInfo to query the number of command-queue families and command-queue family properties supported by an OpenCL device:
#define CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL 0x418B
Accepted as a property name for the properties parameter to clCreateCommandQueueWithProperties to specify the command-queue family and command-queue index that this command-queue should submit work to; and for the param_name parameter to clGetCommandQueueInfo to query the command-queue family or command-queue index associated with a command-queue:
#define CL_QUEUE_FAMILY_INTEL 0x418C
#define CL_QUEUE_INDEX_INTEL 0x418D
New API Types
Returned by clGetDeviceInfo as the query result value for CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL:
#define CL_QUEUE_FAMILY_MAX_NAME_SIZE_INTEL 64
typedef struct _cl_queue_family_properties_intel {
    cl_command_queue_properties properties;
    cl_command_queue_capabilities_intel capabilities;
    cl_uint count;
    char name[CL_QUEUE_FAMILY_MAX_NAME_SIZE_INTEL];
} cl_queue_family_properties_intel;
Bitfield type describing the capabilities of the queues in a command-queue family. Subsequent versions of this extension may add additional queue capabilities:
typedef cl_bitfield cl_command_queue_capabilities_intel;
#define CL_QUEUE_DEFAULT_CAPABILITIES_INTEL 0
/* Synchronization Capabilities: */
#define CL_QUEUE_CAPABILITY_CREATE_SINGLE_QUEUE_EVENTS_INTEL (1 << 0)
#define CL_QUEUE_CAPABILITY_CREATE_CROSS_QUEUE_EVENTS_INTEL (1 << 1)
#define CL_QUEUE_CAPABILITY_SINGLE_QUEUE_EVENT_WAIT_LIST_INTEL (1 << 2)
#define CL_QUEUE_CAPABILITY_CROSS_QUEUE_EVENT_WAIT_LIST_INTEL (1 << 3)
/* bit 4 - bit 7 are currently unused */
/* Transfer Capabilities: */
#define CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_INTEL (1 << 8)
#define CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_RECT_INTEL (1 << 9)
#define CL_QUEUE_CAPABILITY_MAP_BUFFER_INTEL (1 << 10)
#define CL_QUEUE_CAPABILITY_FILL_BUFFER_INTEL (1 << 11)
#define CL_QUEUE_CAPABILITY_TRANSFER_IMAGE_INTEL (1 << 12)
#define CL_QUEUE_CAPABILITY_MAP_IMAGE_INTEL (1 << 13)
#define CL_QUEUE_CAPABILITY_FILL_IMAGE_INTEL (1 << 14)
#define CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_IMAGE_INTEL (1 << 15)
#define CL_QUEUE_CAPABILITY_TRANSFER_IMAGE_BUFFER_INTEL (1 << 16)
/* bit 17 - bit 23 are currently unused */
/* Execution Capabilities */
#define CL_QUEUE_CAPABILITY_MARKER_INTEL (1 << 24)
#define CL_QUEUE_CAPABILITY_BARRIER_INTEL (1 << 25)
#define CL_QUEUE_CAPABILITY_KERNEL_INTEL (1 << 26)
/* bit 27 and beyond are currently unused */
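As a debugging aid, the capability bitfield can be decoded into readable names. The following is a minimal sketch and not part of the extension: the helper describe_capabilities and the shortened display names are hypothetical, and the uint64_t typedef is a stand-in for the real cl_bitfield-based type so that the sketch compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the cl_bitfield-based typedef (assumption, avoids CL headers). */
typedef uint64_t cl_command_queue_capabilities_intel;

/* Bit positions copied from the definitions above; names shortened for display. */
static const struct { unsigned bit; const char *name; } k_capability_names[] = {
    { 0, "CREATE_SINGLE_QUEUE_EVENTS" },
    { 1, "CREATE_CROSS_QUEUE_EVENTS" },
    { 2, "SINGLE_QUEUE_EVENT_WAIT_LIST" },
    { 3, "CROSS_QUEUE_EVENT_WAIT_LIST" },
    { 8, "TRANSFER_BUFFER" },
    { 9, "TRANSFER_BUFFER_RECT" },
    { 10, "MAP_BUFFER" },
    { 11, "FILL_BUFFER" },
    { 12, "TRANSFER_IMAGE" },
    { 13, "MAP_IMAGE" },
    { 14, "FILL_IMAGE" },
    { 15, "TRANSFER_BUFFER_IMAGE" },
    { 16, "TRANSFER_IMAGE_BUFFER" },
    { 24, "MARKER" },
    { 25, "BARRIER" },
    { 26, "KERNEL" },
};

/* Hypothetical helper: writes a space-separated list of capability names into
 * buf and returns how many names were written. The value zero is reported as
 * "DEFAULT" with a return value of 0. */
static int describe_capabilities(cl_command_queue_capabilities_intel caps,
                                 char *buf, size_t buf_size)
{
    assert(buf != NULL && buf_size > 0);
    buf[0] = '\0';
    if (caps == 0) {
        snprintf(buf, buf_size, "DEFAULT");
        return 0;
    }
    int found = 0;
    size_t used = 0;
    size_t n = sizeof(k_capability_names) / sizeof(k_capability_names[0]);
    for (size_t i = 0; i < n; ++i) {
        cl_command_queue_capabilities_intel mask =
            (cl_command_queue_capabilities_intel)1 << k_capability_names[i].bit;
        if (!(caps & mask))
            continue;
        int w = snprintf(buf + used, buf_size - used, "%s%s",
                         found ? " " : "", k_capability_names[i].name);
        if (w < 0 || (size_t)w >= buf_size - used)
            break; /* output truncated; stop appending */
        used += (size_t)w;
        ++found;
    }
    return found;
}
```

Bits that are currently unused are ignored by this sketch, so capabilities added by later versions of the extension would not appear in the output.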
Modifications to the OpenCL API Specification
- (Add to the list preceding Table 5 in Section 4.2 - Querying Devices)
The device queries described in the Device Queries table should return the same information for a root-level device, i.e. a device returned by clGetDeviceIDs, and any sub-devices created from this device, except for the following queries:
- …
- CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL
- (Add to Table 5 - OpenCL Device Queries in Section 4.2 - Querying Devices)
Table 5. List of supported param_names by clGetDeviceInfo

| Device Info | Return Type | Description |
|---|---|---|
| CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL | cl_queue_family_properties_intel[] | Returns an array of cl_queue_family_properties_intel structures describing command-queue families supported by the device. Each structure consists of: properties: Describes the host command-queue properties supported by this command-queue family. The supported property values are the same as those returned by the query for CL_DEVICE_QUEUE_ON_HOST_PROPERTIES. capabilities: Describes the command-queue capabilities supported by this command-queue family. This is a bitfield value that may either be CL_QUEUE_DEFAULT_CAPABILITIES_INTEL or a set of queue capabilities from the Queue Capabilities Table. count: Describes the number of command-queues in this command-queue family. Command-queues created with unique command-queue indices may execute more efficiently than command-queues created with equal indices. name: An array of CL_QUEUE_FAMILY_MAX_NAME_SIZE_INTEL bytes used as storage for a null-terminated string. The string is a descriptive name for this command-queue family. The descriptive name is purely informative and has no semantic meaning. At least one entry in the array must return the same properties returned by CL_DEVICE_QUEUE_ON_HOST_PROPERTIES and must have capabilities equal to CL_QUEUE_DEFAULT_CAPABILITIES_INTEL. Please note that a sub-device may support different command-queue families than its root-level device. |
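The array returned for this query can be processed as sketched below. This is an illustration under stated assumptions, not a reference implementation: the fixed-width typedefs are stand-ins for the OpenCL types so the code compiles without the OpenCL headers, and queue_family_count and find_default_family are hypothetical helper names. In a real application the byte size and the array contents would come from clGetDeviceInfo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the OpenCL types (assumption, avoids CL headers). */
typedef uint64_t cl_command_queue_properties;
typedef uint64_t cl_command_queue_capabilities_intel;
typedef uint32_t cl_uint;

#define CL_QUEUE_DEFAULT_CAPABILITIES_INTEL 0
#define CL_QUEUE_FAMILY_MAX_NAME_SIZE_INTEL 64

typedef struct _cl_queue_family_properties_intel {
    cl_command_queue_properties properties;
    cl_command_queue_capabilities_intel capabilities;
    cl_uint count;
    char name[CL_QUEUE_FAMILY_MAX_NAME_SIZE_INTEL];
} cl_queue_family_properties_intel;

/* Number of families, derived from the byte size reported for
 * CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL. */
static size_t queue_family_count(size_t size_in_bytes)
{
    return size_in_bytes / sizeof(cl_queue_family_properties_intel);
}

/* Hypothetical helper: returns the position of the first family that has the
 * device's host queue properties and default capabilities, or -1 if none. */
static int find_default_family(const cl_queue_family_properties_intel *families,
                               size_t num_families,
                               cl_command_queue_properties host_properties)
{
    assert(families != NULL || num_families == 0);
    for (size_t i = 0; i < num_families; ++i) {
        if (families[i].properties == host_properties &&
            families[i].capabilities == CL_QUEUE_DEFAULT_CAPABILITIES_INTEL)
            return (int)i;
    }
    return -1;
}
```

On a conforming device find_default_family should not return -1, because at least one entry must match the host queue properties and have default capabilities.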
- (Add to Table 9 - List of supported queue creation properties by clCreateCommandQueueWithProperties)
Table 9. List of supported queue creation properties by clCreateCommandQueueWithProperties

| Queue Property | Property Value | Description |
|---|---|---|
| CL_QUEUE_FAMILY_INTEL | cl_uint | Specifies the command-queue family that this command-queue will submit work to. The specified command-queue family must be between zero and the total number of command-queue families supported by the device. If a command-queue family is specified then a command-queue index must also be specified. |
| CL_QUEUE_INDEX_INTEL | cl_uint | Specifies the command-queue index within the command-queue family that this command-queue will submit work to. The specified command-queue index must be between zero and the total number of command-queues in the command-queue family for this command-queue for the device. If a command-queue index is specified then a command-queue family must also be specified. |
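Both properties are passed as name and value pairs in the zero-terminated properties list. The sketch below builds such a list; the helper name make_queue_family_properties is hypothetical, and the uint64_t typedef is a stand-in for cl_queue_properties so that the code compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the OpenCL types (assumption, avoids CL headers). */
typedef uint64_t cl_queue_properties;
typedef uint32_t cl_uint;

#define CL_QUEUE_FAMILY_INTEL 0x418C
#define CL_QUEUE_INDEX_INTEL  0x418D

/* Hypothetical helper: fills out[0..4] with a property list that requests
 * the given family and index, ending with the required 0 terminator. */
static void make_queue_family_properties(cl_queue_properties out[5],
                                         cl_uint family, cl_uint queue_index)
{
    assert(out != NULL);
    out[0] = CL_QUEUE_FAMILY_INTEL;
    out[1] = family;
    out[2] = CL_QUEUE_INDEX_INTEL;
    out[3] = queue_index;
    out[4] = 0; /* list terminator */
}
```

The resulting array would be passed as the properties argument of clCreateCommandQueueWithProperties. Other properties, such as CL_QUEUE_PROPERTIES, could be added to the same list before the terminator.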
- (Add to the list of error conditions for clCreateCommandQueueWithProperties)
clCreateCommandQueueWithProperties returns a valid non-zero command-queue and errcode_ret is set to CL_SUCCESS if the command-queue is created successfully. Otherwise, it returns a NULL value with one of the following error values returned in errcode_ret:
- …
- CL_INVALID_VALUE if the property value for CL_QUEUE_FAMILY_INTEL specifies a command-queue family greater than or equal to the number of command-queue families supported by the device.
- CL_INVALID_VALUE if the property value for CL_QUEUE_INDEX_INTEL specifies a command-queue index greater than or equal to the number of queues for the command-queue family for the device.
- CL_INVALID_VALUE if the property CL_QUEUE_FAMILY_INTEL is specified and the property CL_QUEUE_INDEX_INTEL is not specified, or if the property CL_QUEUE_INDEX_INTEL is specified and the property CL_QUEUE_FAMILY_INTEL is not specified.
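An application can mirror these three conditions before creating a queue. The following is a sketch of such a pre-check, not how any implementation validates its input: check_queue_family_request is a hypothetical name, the family_counts array is assumed to hold the count field of each queried family, and the typedefs are stand-ins so that the code compiles without the OpenCL headers. The numeric values of CL_SUCCESS and CL_INVALID_VALUE match the standard OpenCL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the OpenCL types (assumption, avoids CL headers). */
typedef int32_t  cl_int;
typedef uint32_t cl_uint;

#define CL_SUCCESS        0
#define CL_INVALID_VALUE  -30

/* Hypothetical pre-check. family_counts[i] is the count field of family i. */
static cl_int check_queue_family_request(const cl_uint *family_counts,
                                         size_t num_families,
                                         bool has_family, cl_uint family,
                                         bool has_index, cl_uint queue_index)
{
    assert(family_counts != NULL || num_families == 0);
    /* Family and index must be specified together or not at all. */
    if (has_family != has_index)
        return CL_INVALID_VALUE;
    /* Neither specified: the implementation chooses. */
    if (!has_family)
        return CL_SUCCESS;
    if (family >= num_families)
        return CL_INVALID_VALUE;
    if (queue_index >= family_counts[family])
        return CL_INVALID_VALUE;
    return CL_SUCCESS;
}
```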
- (Add to Table 9 - List of Supported param_names by clGetCommandQueueInfo)
Table 9. List of supported param_names by clGetCommandQueueInfo

| Queue Properties | Property Value | Description |
|---|---|---|
| CL_QUEUE_FAMILY_INTEL | cl_uint | Returns the command-queue family that this command-queue will submit work to. If no command-queue family was specified when this command-queue was created then the value returned for this query is implementation-defined, but must be a command-queue family with the same properties returned by CL_DEVICE_QUEUE_ON_HOST_PROPERTIES for the device and capabilities equal to CL_QUEUE_DEFAULT_CAPABILITIES_INTEL. |
| CL_QUEUE_INDEX_INTEL | cl_uint | Returns the command-queue index within the command-queue family that this command-queue will submit work to. If no command-queue index was specified when this command-queue was created then the value returned for this query is implementation-defined, but must be between zero and the total number of queues supported by the device for the command-queue family that this command-queue will submit work to. |
- (Add a new Section 5.1.X Command-Queue Families)
Some OpenCL devices may support different sets of command-queues with different capabilities or execution properties. The sets of command-queues with different capabilities or execution properties are known as command-queue families. Each command-queue family may contain multiple queues with similar characteristics.
Using multiple unique queues from a command-queue family or queues from different command-queue families may improve performance, such as by allowing commands to execute concurrently or using dedicated hardware resources.
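One simple way to favor unique queues is to spread the command-queues an application creates across the indices of a family in round-robin order. The sketch below shows only the index arithmetic. The helper name pick_queue_index is hypothetical and the policy is an assumption for illustration, since the extension does not prescribe how applications choose indices.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cl_uint; /* stand-in for the OpenCL type (assumption) */

/* Hypothetical policy: the n-th queue created by the application uses index
 * n modulo the family's count, so indices repeat only once every queue in
 * the family is already in use. */
static cl_uint pick_queue_index(cl_uint queue_number, cl_uint family_count)
{
    assert(family_count > 0);
    return queue_number % family_count;
}
```

With a family whose count is 4, the first four queues receive indices 0 through 3 and the fifth wraps around to index 0.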
Every OpenCL device must support at least one command-queue family with "default" command-queue capabilities. These command-queue families are identified with the special command-queue capability value CL_QUEUE_DEFAULT_CAPABILITIES_INTEL. Command-queues created from a command-queue family with default command-queue capabilities have no additional restrictions and support all commands and command-queue features described by standard OpenCL device queries.
When a command-queue family does not have the default command-queue capabilities, the command-queue family capability value is a bitfield describing the commands and command-queue features that are supported for queues created from the command-queue family. Enqueueing an unsupported command or using an unsupported command-queue feature will fail and generate an OpenCL error.
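Because the default value is zero rather than all-bits-set, a capability test needs a special case for it before checking individual bits. The sketch below shows that check; family_supports is a hypothetical helper name, and the uint64_t typedef is a stand-in for the real bitfield type so that the code compiles without the OpenCL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the cl_bitfield-based typedef (assumption, avoids CL headers). */
typedef uint64_t cl_command_queue_capabilities_intel;

#define CL_QUEUE_DEFAULT_CAPABILITIES_INTEL 0
#define CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_INTEL (1 << 8)
#define CL_QUEUE_CAPABILITY_KERNEL_INTEL (1 << 26)

/* Hypothetical helper: true if a family with capabilities caps supports
 * every capability bit set in required. */
static bool family_supports(cl_command_queue_capabilities_intel caps,
                            cl_command_queue_capabilities_intel required)
{
    assert(required != 0); /* the caller must ask for at least one capability */
    /* Default capabilities mean "no additional restrictions". */
    if (caps == CL_QUEUE_DEFAULT_CAPABILITIES_INTEL)
        return true;
    return (caps & required) == required;
}
```

A plain bitwise test such as (caps & required) would wrongly report that a default family supports nothing, which is why the zero value is handled first.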
The following table describes the supported command-queue capabilities and the OpenCL commands they enable.
Table X. List of supported command-queue capabilities

| Queue Capability | Description |
|---|---|
| CL_QUEUE_DEFAULT_CAPABILITIES_INTEL | A special capability value to indicate that queues in this command-queue family have no additional restrictions. At least one command-queue family must support this capability. |
| CL_QUEUE_CAPABILITY_CREATE_SINGLE_QUEUE_EVENTS_INTEL | Indicates that queues in this command-queue family support creating event objects identifying commands for event profiling, waiting on the host, or in the event wait list for another command in the same queue. |
| CL_QUEUE_CAPABILITY_CREATE_CROSS_QUEUE_EVENTS_INTEL | Indicates that queues in this command-queue family support creating event objects identifying commands for event profiling, waiting on the host, or in the event wait list for another command in another queue. When creating cross-queue events is supported, creating single-queue events must also be supported. |
| CL_QUEUE_CAPABILITY_SINGLE_QUEUE_EVENT_WAIT_LIST_INTEL | Indicates that queues in this command-queue family support commands that wait on events that were created in the same queue. |
| CL_QUEUE_CAPABILITY_CROSS_QUEUE_EVENT_WAIT_LIST_INTEL | Indicates that queues in this command-queue family support commands that wait on events that were created in another queue. When waiting on cross-queue events is supported, waiting on single-queue events must also be supported. |
| CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueReadBuffer, clEnqueueWriteBuffer, and clEnqueueCopyBuffer. |
| CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_RECT_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueReadBufferRect, clEnqueueWriteBufferRect, and clEnqueueCopyBufferRect. |
| CL_QUEUE_CAPABILITY_MAP_BUFFER_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueMapBuffer and clEnqueueUnmapMemObject. |
| CL_QUEUE_CAPABILITY_FILL_BUFFER_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueFillBuffer. |
| CL_QUEUE_CAPABILITY_TRANSFER_IMAGE_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueReadImage, clEnqueueWriteImage, and clEnqueueCopyImage. |
| CL_QUEUE_CAPABILITY_MAP_IMAGE_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueMapImage and clEnqueueUnmapMemObject. |
| CL_QUEUE_CAPABILITY_FILL_IMAGE_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueFillImage. |
| CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_IMAGE_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueCopyBufferToImage. |
| CL_QUEUE_CAPABILITY_TRANSFER_IMAGE_BUFFER_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueCopyImageToBuffer. |
| CL_QUEUE_CAPABILITY_MARKER_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueMarker and clEnqueueMarkerWithWaitList. |
| CL_QUEUE_CAPABILITY_BARRIER_INTEL | Indicates that queues in this command-queue family support calls to clEnqueueBarrier and clEnqueueBarrierWithWaitList. |

- (Add to the list of error conditions for all enqueue APIs)
- CL_INVALID_EVENT_WAIT_LIST if the queue capabilities for command_queue are not CL_QUEUE_DEFAULT_CAPABILITIES_INTEL and do not include CL_QUEUE_CAPABILITY_SINGLE_QUEUE_EVENT_WAIT_LIST_INTEL, and num_events_in_wait_list is not 0 or event_wait_list is not NULL.
- CL_INVALID_EVENT_WAIT_LIST if the queue capabilities for the command-queue associated with an event in the event wait list are not CL_QUEUE_DEFAULT_CAPABILITIES_INTEL and do not include CL_QUEUE_CAPABILITY_CREATE_CROSS_QUEUE_EVENTS_INTEL, and command_queue is not equal to the command-queue associated with the event.
- CL_INVALID_EVENT_WAIT_LIST if the queue capabilities for command_queue are not CL_QUEUE_DEFAULT_CAPABILITIES_INTEL and command_queue is not equal to the command-queue associated with an event.
- CL_INVALID_EVENT if the queue capabilities for command_queue are not CL_QUEUE_DEFAULT_CAPABILITIES_INTEL and include neither CL_QUEUE_CAPABILITY_CREATE_SINGLE_QUEUE_EVENTS_INTEL nor CL_QUEUE_CAPABILITY_CREATE_CROSS_QUEUE_EVENTS_INTEL, and event is not NULL.
For all enqueue APIs described in the Queue Capabilities Table:
- CL_INVALID_OPERATION if the queue capabilities for command_queue are not CL_QUEUE_DEFAULT_CAPABILITIES_INTEL and do not include the required queue capability.
For all other enqueue APIs not described in the Queue Capabilities Table:
- CL_INVALID_OPERATION if the queue capabilities for command_queue are not CL_QUEUE_DEFAULT_CAPABILITIES_INTEL.
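The conditions that depend only on the capabilities of the enqueuing command-queue can be sketched as a single check. This is an illustration under several assumptions, not the behavior of any implementation: check_enqueue is a hypothetical name, the phrase "is not default or does not include" is read as "is neither default nor includes", the order in which errors are tested is arbitrary, and the cross-queue conditions (which need to know the queue that created each event) are left out. The typedefs are stand-ins so that the code compiles without the OpenCL headers; the error values match the standard OpenCL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the OpenCL types (assumption, avoids CL headers). */
typedef int32_t  cl_int;
typedef uint64_t cl_command_queue_capabilities_intel;

#define CL_SUCCESS                  0
#define CL_INVALID_EVENT_WAIT_LIST  -57
#define CL_INVALID_EVENT            -58
#define CL_INVALID_OPERATION        -59

#define CL_QUEUE_DEFAULT_CAPABILITIES_INTEL 0
#define CL_QUEUE_CAPABILITY_CREATE_SINGLE_QUEUE_EVENTS_INTEL (1 << 0)
#define CL_QUEUE_CAPABILITY_CREATE_CROSS_QUEUE_EVENTS_INTEL (1 << 1)
#define CL_QUEUE_CAPABILITY_SINGLE_QUEUE_EVENT_WAIT_LIST_INTEL (1 << 2)
#define CL_QUEUE_CAPABILITY_TRANSFER_BUFFER_INTEL (1 << 8)
#define CL_QUEUE_CAPABILITY_KERNEL_INTEL (1 << 26)

/* Hypothetical check for an enqueue API listed in the Queue Capabilities
 * Table. required is the capability bit that API needs. */
static cl_int check_enqueue(cl_command_queue_capabilities_intel caps,
                            cl_command_queue_capabilities_intel required,
                            bool has_wait_list, bool wants_event)
{
    assert(required != 0);
    /* Default capabilities: no additional restrictions. */
    if (caps == CL_QUEUE_DEFAULT_CAPABILITIES_INTEL)
        return CL_SUCCESS;
    if (has_wait_list &&
        !(caps & CL_QUEUE_CAPABILITY_SINGLE_QUEUE_EVENT_WAIT_LIST_INTEL))
        return CL_INVALID_EVENT_WAIT_LIST;
    if (wants_event &&
        !(caps & (CL_QUEUE_CAPABILITY_CREATE_SINGLE_QUEUE_EVENTS_INTEL |
                  CL_QUEUE_CAPABILITY_CREATE_CROSS_QUEUE_EVENTS_INTEL)))
        return CL_INVALID_EVENT;
    if ((caps & required) != required)
        return CL_INVALID_OPERATION;
    return CL_SUCCESS;
}
```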
Issues
- What should this extension be called?
RESOLVED: The name of this extension is cl_intel_command_queue_families.
- What does this extension offer compared to "device partitioning"?
RESOLVED: This extension describes command-queue families and their properties to control how work can be executed on a device or sub-device. It is complementary to device partitioning.
- What are the memory model implications for command-queue families?
UNRESOLVED
- Is there a better way to describe CL_QUEUE_CAPABILITY_ALL_INTEL?
RESOLVED: This special capability was switched to CL_QUEUE_DEFAULT_CAPABILITIES_INTEL, and the value of the capability was changed to zero from all-bits-set. This should allow for special queue capabilities that go beyond default command-queue capabilities, if desired.
- Do we need a query for the number of command-queue families for a device?
RESOLVED: No, this is not needed. The number of command-queue families can be derived from the size returned by CL_DEVICE_QUEUE_FAMILY_PROPERTIES_INTEL.
- Should there be a default command-queue family or command-queue index for a command-queue?
RESOLVED: No, it is preferable to allow an implementation to vary the command-queue family and command-queue index per command-queue. This enables an implementation to implement a policy to choose among different command-queue families or command-queue indices for each command-queue rather than a single default if it leads to improved performance.
Note that specifying only a command-queue family or a command-queue index is an error, and an application must either specify no command-queue family or command-queue index, or both a command-queue family and command-queue index.
- Do we need a capability for cross-queue event wait lists?
RESOLVED: CL_QUEUE_CAPABILITY_CROSS_QUEUE_EVENT_WAIT_LIST_INTEL was added.