Name Strings cl_qcom_ion_host_ptr Contributors Balaji Calidas, Qualcomm Technologies, Inc. David Garcia, Qualcomm Technologies, Inc. Sushmita Susheelendra, Qualcomm Innovation Center, Inc. Contact bcalidas at qti dot qualcomm dot com Version Version 7, 2018/01/19 Number OpenCL Extension #22 Status Shipping Extension Type OpenCL device extension Dependencies OpenCL 1.1 is required. cl_qcom_ext_host_ptr is required. This extension is written against the OpenCL 1.1 specification If present, cl_qcom_ext_host_ptr_iocoherent extends the functionality of this extension. Overview This extension extends the functionality provided by clCreateBuffer, clCreateImage2D, clCreateImage3D. It allows applications to pass an ION memory allocation to these functions so that it can be mapped to the device's address space and thus avoid having to copy data back and forth between the host and the device. Header File cl_ext.h New Tokens Accepted by the argument of clCreateBuffer, clCreateImage2D and clCreateImage3D: typedef struct _cl_mem_ion_host_ptr { // Type of external memory allocation. // Must be CL_MEM_ION_HOST_PTR_QCOM for ION allocations. cl_mem_ext_host_ptr ext_host_ptr; // ION file descriptor int ion_filedesc; // Host pointer to the ION allocated memory void* ion_hostptr; } cl_mem_ion_host_ptr; Used together with CL_MEM_EXT_HOST_PTR_QCOM: CL_MEM_ION_HOST_PTR_QCOM 0x40A8 Additions to Chapter 5.2.1 of the OpenCL 1.1 Specification (Creating Buffer Objects) When CL_MEM_EXT_HOST_PTR_QCOM is enabled in the argument, then is interpreted as a pointer to cl_mem_ext_host_ptr. When ->allocation_type is equal to CL_MEM_ION_HOST_PTR_QCOM then can also be interpreted as a pointer to cl_mem_ion_host_ptr. In addition to that, the application must also initialize the following struct fields: * ->host_cache_policy should be set as follows - If the ion allocation was made with the flag ION_FLAG_CACHED enabled and cl_mem_ext_host_ptr_iocoherent is present, ->host_cache_policy can be set to either CL_MEM_HOST_WRITEBACK_QCOM or CL_MEM_HOST_IOCOHERENT_QCOM. If the ION allocation was made with the flag ION_FLAG_CACHED enabled and cl_mem_ext_host_ptr_iocoherent is not present, ->host_cache_policy should be set to CL_MEM_HOST_WRITEBACK_QCOM. It must be equal to CL_MEM_HOST_UNCACHED_QCOM otherwise. * ->ion_filedesc must be the file descriptor of the ION memory allocation that the application wants to use as storage bits for the memory object. * ->ion_hostptr must be the host virtual pointer associated with the same ION memory allocation. If the application does not need to map the newly created cl memory object for host access, it can set ->ion_hostptr to NULL. If this happens, then calls to host access functions such as clEnqueueMapBuffer will fail out and return an errorcode of CL_INVALID_OPERATION. Setting ->ion_hostptr to NULL avoids the need for the application to make an extra map call for acquiring the host virtual pointer. Memory specified this way must be aligned to the device's page size. The application can query the device's page size by using clGetDeviceInfo(..., CL_DEVICE_PAGE_SIZE_QCOM, ...). Once the memory object is created, the application must call clEnqueueMapBuffer/clEnqueueMapImage with appropriate flags before reading or writing to it on the host. The host unmaps the region when accesses (reads and/or writes) to this mapped region by the host are complete. As per the OpenCL 1.2 specification, clEnqueueMapBuffer and clEnqueueMapImage act as synchronization points for the region of the buffer object being mapped. Sample Code 1) Using the extension for CL buffer objects cl_mem buffer_object = NULL; size_t buffer_size_in_bytes = 0; size_t buffer_size_with_padding = 0; cl_mem_ion_host_ptr myionmem = {0}; size_t ext_mem_padding_in_bytes = 0; size_t device_page_size = 0; // Query the device's page size and the amount of padding necessary at // the end of the buffer. clGetDeviceInfo(device, CL_DEVICE_PAGE_SIZE_QCOM, sizeof(device_page_size), &device_page_size, NULL); clGetDeviceInfo(device, CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM, sizeof(ext_mem_padding_in_bytes), &ext_mem_padding_in_bytes, NULL); // Compute the desired size for the data in the buffer. buffer_size_in_bytes = foobar(); // Compute amount of memory that needs to be allocated for the buffer // including padding. buffer_size_with_padding = buffer_size_in_bytes + ext_mem_padding_in_bytes; // Make an ION memory allocation of size buffer_size_with_padding here. // Note that allocating buffer_size_in_bytes instead would be a mistake. // It's important to allocate the extra padding. Let's say the // parameters of the allocation are stored in a struct named ion_info // that we will use below. // Create an OpenCL buffer object that uses ion_info as its data store. // Notice how the buffer is created with size buffer_size_in_bytes, not // buffer_size_with_padding. myionmem.ext_host_ptr.allocation_type = CL_MEM_ION_HOST_PTR_QCOM; myionmem.ext_host_ptr.host_cache_policy = CL_MEM_HOST_UNCACHED_QCOM; // file descriptor for ION myionmem.ion_filedesc = ion_info_fd.file_descriptor; // hostptr returned by ION which is device page size aligned myionmem.ion_hostptr = ion_info.host_virtual_address; if(myionmem.ion_hostptr % device_page_size) { error("Host pointer must be aligned to device_page_size!"); } buffer_object = clCreateBuffer(context, CL_MEM_USE_HOST_PTR | CL_MEM_EXT_HOST_PTR_QCOM, buffer_size_in_bytes, &myionmem, &errcode); 2) Using the extension for CL image objects cl_mem image_object = NULL; cl_mem_ion_host_ptr myionmem = {0}; size_t ext_mem_padding_in_bytes = 0; size_t device_page_size = 0; size_t row_pitch = 0; // Query the device's page size and the amount of padding necessary at // the end of the buffer. clGetDeviceInfo(device, CL_DEVICE_PAGE_SIZE_QCOM, sizeof(device_page_size), &device_page_size, NULL); clGetDeviceInfo(device, CL_DEVICE_EXT_MEM_PADDING_IN_BYTES_QCOM, sizeof(ext_mem_padding_in_bytes), &ext_mem_padding_in_bytes, NULL); // Query the device supported row and slice pitch using // clGetDeviceImageInfoQCOM // imgw - image width // imgh - image height // img_fmt - image format clGetDeviceImageInfoQCOM(device, imgw, imgh, &img_fmt, CL_IMAGE_ROW_PITCH, sizeof(image_row_pitch), &image_row_pitch, NULL); // Use the image height, row pitch obtained above and element size to // compute the size of the buffer buffer_size_in_bytes = imgh * image_row_pitch; // Compute amount of memory that needs to be allocated for the buffer // including padding. buffer_size_with_padding = buffer_size_in_bytes + ext_mem_padding_in_bytes; // Make an ION memory allocation of size buffer_size_with_padding here. // Note that allocating buffer_size_in_bytes instead would be a mistake. // It's important to allocate the extra padding. Let's say the // parameters of the allocation are stored in a struct named ion_info // that we will use below. // Create an OpenCL image object that uses ion_info as its data store. myionmem.ext_host_ptr.allocation_type = CL_MEM_ION_HOST_PTR_QCOM; myionmem.ext_host_ptr.host_cache_policy = CL_MEM_HOST_UNCACHED_QCOM; // file descriptor for ION myionmem.ion_filedesc = ion_info_fd.file_descriptor; // hostptr returned by ION which is device page size aligned myionmem.ion_hostptr = ion_info.host_virtual_address; if(myionmem.ion_hostptr % device_page_size) { error("Host pointer must be aligned to device_page_size!"); } // Note that the image_row_pitch obtained by calling // clGetDeviceImageInfoQCOM should be passed to clCreateImage2D image_object = clCreateImage2D(context, CL_MEM_USE_HOST_PTR | CL_MEM_EXT_HOST_PTR_QCOM, &image_fmt, imgw, imgh, image_row_pitch, &myionmem, &errcode); // Call clEnqueueMapImage before filling input image data pinput = clEnqueueMapImage(command_queue, image_object, CL_TRUE, CL_MAP_WRITE, origin, region, &row_pitch, NULL, 0, NULL, NULL, &errcode); // Fill the input image data using the hostptr and row_pitch returned by // clEnqueueMapImage cl_uchar* inp = pinput; memset(inp, 0x0, (row_pitch * imgh)); for(i = 0; i < (row_pitch * imgh); i+=row_pitch) { memset(inp+i, 0xff, imgw * element_size); } errcode = clEnqueueUnmapMemObject(command_queue, image_object, pinput, 0, NULL, NULL); Revision History Revision 1, 2012/10/18: Initial version. Revision 2, 2012/11/01: Improved sample code. Revision 3, 2013/05/17: Generalized. Cleaned-up for Khronos. Added final token values. Revision 4, 2017/06/16: Clean up. No functional changes. Revision 5, 2017/11/13: Clean up. No functional changes. Revision 6, 2018/01/03: Added reference to cl_qcom_ext_host_ptr_iocoherent. Revision 7, 2018/01/19: Formatting and misc changes. No functional changes.