Contributors
Dan Petre, Intel
Krzysztof Laskowski, Intel
Bartosz Sochacki, Intel
Ben Ashbaugh, Intel
Biju George, Intel
Dependencies
OpenCL 1.2 is required.
This extension is written against the OpenCL Specification Version 3.0.7.
Overview
The purpose of this extension is to provide OpenCL support for the Planar YUV (YCbCr) image formats. The NV12 format must be supported; support for other Planar YUV formats that may be defined in this extension is optional.
The extension introduces two new cl_mem_flags:
-
CL_MEM_NO_ACCESS_INTELmay be used together with image formats for which device does not support reading from or writing to at the OpenCL kernel level, but are still useful in other use-cases. -
CL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTELmay be used to relax the memory access rights specified incl_mem_flagsat memory object creation time and allow to access and modify the contents of the underlying data storage in an unrestricted way e.g. by creating another memory object from that memory object or using dedicated device mechanisms.
New API Enums
Accepted as cl_mem_flags:
#define CL_MEM_NO_ACCESS_INTEL (1 << 24)
#define CL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTEL (1 << 25)
Accepted as the image_channel_order of cl_image_format:
#define CL_NV12_INTEL 0x410E
Accepted value for the param_name parameter to clGetDeviceInfo:
#define CL_DEVICE_PLANAR_YUV_MAX_WIDTH_INTEL 0x417E
#define CL_DEVICE_PLANAR_YUV_MAX_HEIGHT_INTEL 0x417F
Modifications to the OpenCL API Specification
- (Add to Table 5 - OpenCL Device Queries in Section 4.2 - Querying Devices)
-
Table 5. List of supported param_names by clGetDeviceInfo Device Info Return Type Description CL_DEVICE_PLANAR_YUV_MAX_WIDTH_INTELsize_tMax width of a Planar YUV image in pixels.
CL_DEVICE_PLANAR_YUV_MAX_HEIGHT_INTELsize_tMax height of a Planar YUV image in pixels.
- (Add to Table 12 - List of supported memory flag values in Section 5.2.1 - Creating Buffer Objects)
-
Table 12. List of supported memory flag values Memory Flags Description CL_MEM_NO_ACCESS_INTELThis flag specifies that the device will not read or write to the memory object.
CL_MEM_NO_ACCESS_INTELandCL_MEM_READ_WRITE,CL_MEM_WRITE_ONLY, orCL_MEM_READ_ONLYare mutually exclusiveCL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTELThis flag indicates that the host and device access flags used together with this flag do not strictly prohibit reading or modifying the contents of this memory object. Memory objects created from this memory object may re-specify the host and device access capabilities of the created memory object with new access capabilities, and any mechanisms provided by the implementation which explicitly support certain operations on memory objects of this type are allowed to access this memory object without any restrictions.
- (Extend the argument description for clCreateImage in Section 5.3.1 - Creating Image Objects)
-
-
flags is a bit-field that is used to specify allocation and usage information about the image memory object being created and is described in the supported memory flag values table. […] If the
image_channel_orderof image_format is a Planar YUV format then flags must includeCL_MEM_HOST_NO_ACCESS. IfCL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTELwas specified in the memory flags associated with mem_object then flags can specify any host and device access capabilities regardless of the memory flags associated with mem_object.
-
- (Modify the error code descriptions for clCreateImage in Section 5.3.1 - Creating Image Objects)
-
clCreateImage returns a valid non-zero image object created and errcode_ret is set to
CL_SUCCESSif the image object is created successfully. Otherwise, it returns aNULLvalue with one of the following error values returned in errcode_ret:-
…
-
CL_INVALID_VALUEif theimage_channel_orderof image_format is a Planar YUV format and flags does not includeCL_MEM_HOST_NO_ACCESS. -
CL_INVALID_VALUEif an image is being created from another memory object (buffer or image) under one of the following circumstances: 1) mem_object was created withCL_MEM_WRITE_ONLYand flags specifiesCL_MEM_READ_WRITEorCL_MEM_READ_ONLY, 2) mem_object was created withCL_MEM_READ_ONLYand flags specifiesCL_MEM_READ_WRITEorCL_MEM_WRITE_ONLY, 3) mem_object was created withCL_MEM_NO_ACCESS_INTELand flags specifiesCL_MEM_READ_ONLY,CL_MEM_WRITE_ONLYorCL_MEM_READ_WRITE, 4) flags specifiesCL_MEM_USE_HOST_PTRorCL_MEM_ALLOC_HOST_PTRorCL_MEM_COPY_HOST_PTR. However, restrictions 1), 2) and 3) described above do not apply if mem_object was created withCL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTEL. -
CL_INVALID_VALUEif an image is being created from another memory object (buffer or image) and mem_object was created withCL_MEM_HOST_WRITE_ONLYand flags specifiesCL_MEM_HOST_READ_ONLY, or if mem_object was created withCL_MEM_HOST_READ_ONLYand flags specifiesCL_MEM_HOST_WRITE_ONLY, or if mem_object was created withCL_MEM_HOST_NO_ACCESSand flags specifiesCL_MEM_HOST_READ_ONLYorCL_MEM_HOST_WRITE_ONLY. However, these restrictions do not apply if mem_object was created withCL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTEL.
-
- (Add to Table 15 - Required host_ptr buffer sizes for images in Section 5.3.1 - Creating Image Objects)
-
Table 15. Required host_ptr buffer sizes for images Image Type Size of buffer that host_ptr points to CL_MEM_OBJECT_IMAGE2D>= image_row_pitch * image_height + image_row_pitch * image_height / 2, for images with
image_channel_orderequal toCL_NV12_INTEL. - (Add to the description of creation of an image object from another image object in Section 5.3.1.2 - Image Descriptor)
-
Creating a 2D image from a Planar YUV image object allows creation of a new image object that shares the Planar YUV image object’s data store but represents only the specified plane. Restrictions are:
-
All the values specified in image_desc except for mem_object must match the image descriptor information associated with mem_object, with exception where mem_object is a Planar YUV image object then image_width and image_height are ignored and derived from the mem_object and image_depth specifies the index of the target plane the image will be created against and must be one of the following:
image_channel_order of mem_object Plane image_depth specified in image_desc CL_NV12_INTELY
0
CL_NV12_INTELUV
1
The derived values of image_width and image_height can be later queried using clGetImageInfo.
-
The channel data type specified in image_format must match the channel data type associated with mem_object with exception to the following list of supported combinations:
image_channel_order of mem_object image_channel_data_type specified in image_format CL_NV12_INTELCL_UNORM_INT8CL_NV12_INTELCL_UNSIGNED_INT8 -
If mem_object is a Planar YUV image object the channel order specified in image format must be one of the following:
image_channel_order specified in image_format image_channel_order of mem_object Plane Channel Mappings CL_RCL_NV12_INTELY
R = Y
CL_RGCL_NV12_INTELUV
R = U, G = V
NoteConcurrent reading from or writing to both a Planar YUV image object and an image object created from the Planar YUV image object is undefined.
Reading from or writing to an image created from a Planar YUV image and then reading from or writing to the Planar YUV image in a kernel even if appropriate synchronization operations (such as a barrier) are performed between the reads or writes is undefined. Similarly, reading from and writing to the Planar YUV image and then reading from or writing to the image created from the Planar YUV image with appropriate synchronization between the reads or writes is undefined.
-
- (Add to Table 16 - List of supported Image Channel Order Values in Section 5.3.1 - Creating Image Objects)
-
Table 16. List of supported Image Channel Order Values Image Channel Order Description CL_NV12_INTELA Planar YUV image format with two planes. There are three channels in a
CL_NV12_INTELimage. For aCL_NV12_INTELimage, the image element size refers to an image element in the Y plane. - (Extend the descriptions in Section 5.3.1.2 - Image Descriptor)
-
-
image_widthis the width of the image in pixels. […] For aCL_NV12_INTELimage, the image width must be a multiple of 4 and less than or equal toCL_DEVICE_PLANAR_YUV_MAX_WIDTH_INTEL. -
image_heightis the height of the image in pixels. […] For aCL_NV12_INTELimage, the image height must be a multiple of 4 and less than or equal toCL_DEVICE_PLANAR_YUV_MAX_HEIGHT_INTEL. -
image_depthis the depth of the image in pixels. […] For aCL_NV12_INTELimage, the image depth must be 1.
-
- (Add Section 5.3.1.X - Memory Layout for Planar YUV Images)
-
In Planar YUV formats the Y, U and V components can all be stored as separate planes or the U and V components can be stored combined as one plane. There are various flavors of Planar YUV formats, differing in the number of planes, order, layout and the sub-sampling methods used for the U and V components.
The
CL_NV12_INTELimage format consists of two planes, Y (luma) plane and an interleaved UV (chroma) plane:<---- WIDTH ----> +------------------------+ ^ |YYYYYYYYYYYYYYYYYYYY^^^^| | |YYYYYYYYYYYYYYYYYYYY^^^^| H |YYYYYYYYYYYYYYYYYYYY^^^^| E |YYYYYYYYYYYYYYYYYYYY^^^^| I Luma plane (Y) |YYYYYYYYYYYYYYYYYYYY^^^^| G |YYYYYYYYYYYYYYYYYYYY^^^^| H |YYYYYYYYYYYYYYYYYYYY^^^^| T |YYYYYYYYYYYYYYYYYYYY^^^^| | +------------------------+ v |UVUVUVUVUVUVUVUVUVUV^^^^| |UVUVUVUVUVUVUVUVUVUV^^^^| Chroma plane (UV) |UVUVUVUVUVUVUVUVUVUV^^^^| |UVUVUVUVUVUVUVUVUVUV^^^^| +------------------------+ <---- ROW PITCH --->The luma plane contains 8 bit Y samples in case of the
CL_NV12_INTELformat, one for each image element:+-----+-----+-----+-----+-- | Y00 | Y01 | Y02 | Y03 | ... +-----+-----+-----+-----+-- | Y10 | Y11 | Y12 | Y13 | ... +-----+-----+-----+-----+-- | Y20 | Y21 | Y22 | Y23 | ... +-----+-----+-----+-----+-- | ... | ... | ... | ... | Sample -> 0 1 2 3 OffsetThe chroma plane contains interleaved 8 bit UV 2x2 samples in case of the
CL_NV12_INTELformat. The chroma components are sampled only once for every other image element and for every other row of image elements:+-----+-----+-----+-----+-- | U00 | V00 | U02 | V02 | ... +-----+-----+-----+-----+-- | U20 | V20 | U22 | V22 | ... +-----+-----+-----+-----+-- | ... | ... | ... | ... | Sample -> 0 1 2 3 OffsetUsing the above notation we can represent image elements like this:
+-----+-----+-----+-----+-- | P00 | P01 | P02 | P03 | ... +-----+-----+-----+-----+-- | P10 | P11 | P12 | P13 | ... +-----+-----+-----+-----+-- | P20 | P21 | P22 | P23 | ... +-----+-----+-----+-----+-- | ... | ... | ... | ... |where:
P00 = Y00U00V00 P01 = Y01U00V00 P10 = Y10U00V00 P11 = Y11U00V00 ... P20 = Y20U20V20 P21 = Y21U20V20 ... P30 = Y30U20V20 P31 = Y31U20V20etc.
The Y (luma) plane is followed immediately by the UV (chroma) plane. Both the Y and the UV planes have the same image_row_pitch. The Y plane height is image_height. The UV plane height is (image_height / 2). The Y plane width is image_width. The UV plane width is (image_width / 2).
- (Extend the description of flags in Section 5.3.2 - Querying List of Supported Image Formats)
-
flags is a bit-field that is used to specify information about the image formats being queried […]. To get a list of images that cannot be read from nor written to by a kernel, flags must be set to
CL_MEM_NO_ACCESS_INTEL. - (Add a table to Section 5.3.2.1 - Minimum List of Supported Image Formats)
-
For 2D image objects, the mandated minimum list of image formats that are not required to be read from nor written to by a kernel and that must be supported by all devices that support the
c_intel_planar_yuvextension is:num_channels channel_order channel_data_type 3
CL_NV12_INTELCL_UNORM_INT8
Modifications to the OpenCL C Specification
- (Add Planar YUV formats to Section 6.15.15.1.1 - Determining the border color or value)
-
-
If image channel order is
CL_NV12_INTELthe border color is value is undefined.
-
- (Add to the un-numbered table in Section 6.15.15.7 - Mapping image channels to color values returned by read_image and color values passed to write_image to image channels)
-
Channel Order
float4,int4oruint4components of channel dataCL_NV12_INTEL(V, Y, U, 1.0)
- (Add to the beginning of Section 6.15.15.2 Built-in Image Read Functions)
-
Note that reading from a
CL_NV12_INTELimage object is only supported by read_imagef functions that take integer coordinates.
Sample Code
Sample Host Code
cl_image_format image_format;
image_format.image_channel_order = CL_NV12_INTEL;
image_format.image_channel_data_type = CL_UNORM_INT8;
cl_image_desc image_desc;
image_desc.image_type = CL_MEM_OBJECT_IMAGE2D;
image_desc.image_width = width;
image_desc.image_height = height;
image_desc.image_array_size = 0;
image_desc.image_row_pitch = 0;
image_desc.image_slice_pitch = 0;
image_desc.num_mip_levels = 0;
image_desc.num_samples = 0;
image_desc.mem_object = NULL;
// create a CL_NV12_IMAGE
cl_mem nv12Img = clCreateImage(context,
CL_MEM_READ_ONLY | CL_MEM_HOST_NO_ACCESS |
CL_MEM_ACCESS_FLAGS_UNRESTRICTED_INTEL,
image_format, image_desc,
host_ptr, errcode_ret);
// image_width & image_height are ignored for plane extraction
image_desc.image_width = 0;
image_desc.image_height = 0;
// set mem_object to the full NV12 image
image_desc.mem_object = nv12Img;
// get access to the Y plane (CL_R)
image_desc.image_depth = 0;
// set proper image_format for the Y plane
image_format.image_channel_order = CL_R;
image_format.image_channel_data_type = CL_UNORM_INT8;
cl_mem nv12YplaneImg = clCreateImage(context, CL_MEM_READ_WRITE,
image_format, image_desc,
NULL, errcode_ret);
// get access to the UV plane (CL_RG)
image_desc.image_depth = 1;
// set proper image_format for the UV plane
image_format.image_channel_order = CL_RG;
image_format.image_channel_data_type = CL_UNORM_INT8;
cl_mem nv12UVplaneImg = clCreateImage(context, CL_MEM_READ_WRITE,
image_format, image_desc,
NULL, errcode_ret);
// NOT SUPPORTED: transfer the whole NV12 image to the device
// status = clEnqueueWriteImage(queue, nv12Img, true, origin, region,
// row_pitch, slice_pitch,
// ptr, 0, NULL, NULL);
// write Y plane of NV12 image
status = clEnqueueWriteImage(queue, nv12YplaneImg, true,
origin, region, row_pitch, slice_pitch,
ptr, 0, NULL, NULL);
// write UV plane of NV12 image
status = clEnqueueWriteImage(queue, nv12YplaneImg, true,
origin, region, row_pitch, slice_pitch,
ptr + uvPlaneOffset, 0, NULL, NULL);
// NOT SUPPORTED: read the whole NV12 image back
// status = clEnqueueReadImage(queue, nv12Img, true,
// origin, region, row_pitch, slice_pitch,
// ptr, 0, NULL, NULL);
// read Y plane of NV12 image
status = clEnqueueReadImage(queue, nv12UVplaneImg, true,
origin, region, row_pitch, slice_pitch,
ptr, 0, NULL, NULL);
// read UV plane of NV12 image
status = clEnqueueReadImage(queue, nv12UVplaneImg, true,
origin, region, row_pitch, slice_pitch,
ptr + uvPlaneOffset, 0, NULL, NULL);
Sample Kernel Code
// do something with a whole NV12 image
kernel void DoSomethingWithNV12
(
...
read_write image2d_t nv12Img,
...
)
{
...
// sample the CL_NV12_INTEL image - supported if CL_NV12_INTEL format is
// available with CL_MEM_READ_ONLY or CL_MEM_READ_WRITE access flags
// based on clGetSupportedImageFormats query.
float4 p = read_imagef(nv12Img, sampler, coord);
...
// write to the CL_NV12_INTEL image - supported if CL_NV12_INTEL format is
// available with CL_MEM_WRITE_ONLY or CL_MEM_READ_WRITE access flags
// based on clGetSupportedImageFormats query.
write_imagef(nv12Img, coord, p);
...
}
// do something with planes of an NV12 image
kernel void DoSomethingWithNV12Planes
(
...
read_write image2d_t nv12ImgYPlane,
read_write image2d_t nv12ImgUVPlane,
...
)
{
...
// sample the Y & UV planes
float4 py = read_imagef(nv12ImgYPlane, sampler, coord);
float4 puv = read_imagef(nv12ImgUVPlane, sampler, coord);
...
// write to Y & UV planes
write_imagef(nv12ImgYPlane, coord, py);
write_imagef(nv12ImgUVPlane, coord, puv);
...
}