Name

    ARM_import_memory

Name Strings

    cl_arm_import_memory
    cl_arm_import_memory_host
    cl_arm_import_memory_dma_buf
    cl_arm_import_memory_protected
    cl_arm_import_memory_android_hardware_buffer

    cl_arm_import_memory will be reported if at least one of the other
    extension strings is also reported.

Contributors

    Robert Elliott, Arm Ltd.
    Vatsalya Prasad, Arm Ltd.
    Kévin Petit, Arm Ltd.
    Sam Laynton, Arm Ltd.
    James Morrissey, Arm Ltd.
    Einar Hov, Arm Ltd.

Contact

    Kévin Petit, ARM (kevin.petit 'at' ARM.com)

IP Status

    No claims or disclosures are known to exist.

Version

    Revision: #12, Sep 29th, 2022

Number

    OpenCL Extension #38

Status

    Complete.

Extension Type

    OpenCL device extension

Dependencies

    Requires OpenCL version 1.0 or later.

    Requires OpenCL 1.2 for host access cl_mem_flags use when importing,
    as these were introduced in OpenCL 1.2.

Overview

    This extension adds a new function that allows for direct memory
    import into OpenCL via the clImportMemoryARM function.

    Memory imported through this interface will be mapped into the
    device's page tables directly, providing zero copy access. It will
    never fall back to copy operations and aliased buffers, instead
    producing an error when mapping is not possible.

    Types of memory supported for import are specified as additional
    extension strings.

Header File

    cl_ext.h

Glossary

    Protected memory: a zone of memory that is protected using TrustZone
    Media Protection.

New Types

    None

New Procedures and Functions

    The new function

        cl_mem clImportMemoryARM( cl_context context,
                                  cl_mem_flags flags,
                                  const cl_import_properties_arm *properties,
                                  void *memory,
                                  size_t size,
                                  cl_int *errorcode_ret );

    Description

    Given a suitable pointer to an external memory allocation this
    function will map the memory into the device page tables.

    The flags argument provides standard OpenCL memory object flags.
    Valid flags are:

    * CL_MEM_READ_WRITE, CL_MEM_WRITE_ONLY, CL_MEM_READ_ONLY
    * CL_MEM_HOST_WRITE_ONLY, CL_MEM_HOST_READ_ONLY, CL_MEM_HOST_NO_ACCESS
      - where the host flags are only hints and only apply from
        OpenCL 1.2 onwards.
    * CL_MEM_USE_HOST_PTR - this flag has no effect, it is ignored.

    The properties argument provides a list of key value pairs, with a
    zero terminator. properties can be NULL or point to a single zero
    value if the default behaviour is desired.

    Valid properties are:

    * CL_IMPORT_TYPE_ARM
    * CL_IMPORT_TYPE_PROTECTED_ARM
    * CL_IMPORT_DMA_BUF_DATA_CONSISTENCY_WITH_HOST_ARM
    * CL_IMPORT_ANDROID_HARDWARE_BUFFER_PLANE_INDEX_ARM
    * CL_IMPORT_ANDROID_HARDWARE_BUFFER_LAYER_INDEX_ARM

    Valid values for CL_IMPORT_TYPE_ARM are:

    * CL_IMPORT_TYPE_HOST_ARM - this is the default
    * CL_IMPORT_TYPE_DMA_BUF_ARM
    * CL_IMPORT_TYPE_ANDROID_HARDWARE_BUFFER_ARM

    Valid values for CL_IMPORT_TYPE_PROTECTED_ARM are:

    * CL_FALSE - the memory is imported with no special behaviour, this
      is the default.
    * CL_TRUE - the memory is imported as protected memory.

    When CL_IMPORT_TYPE_PROTECTED_ARM is set to CL_TRUE, the following
    restrictions apply:

    * CL_MEM_HOST_NO_ACCESS is the only accepted host access flag, all
      others are rejected. If host access flags are omitted, then this
      is assumed.
    * CL_IMPORT_TYPE_DMA_BUF_ARM is the only valid value for
      CL_IMPORT_TYPE_ARM.

    Valid values for CL_IMPORT_DMA_BUF_DATA_CONSISTENCY_WITH_HOST_ARM are:

    * CL_FALSE - maintaining data consistency between memory and the
      host's view of it is the application's responsibility (default).
    * CL_TRUE - data consistency between memory and the host's view is
      managed by the OpenCL runtime.

    Valid values for CL_IMPORT_ANDROID_HARDWARE_BUFFER_PLANE_INDEX_ARM and
    CL_IMPORT_ANDROID_HARDWARE_BUFFER_LAYER_INDEX_ARM are integers in the
    range [0; num_{planes,layers}).

    If properties is NULL, default values are taken.

    What constitutes a valid memory pointer depends on the import type
    passed in properties.

    The size argument must be greater than 0 and represents the size of
    the memory object to be created.
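The property list rules above (zero-terminated key/value pairs, defaults when the list is NULL or empty, and the protected-memory restriction on the import type) can be sketched in plain C. This is only an illustration of the rules as written, not the driver's implementation: the enum names, their numeric values, the import_config struct and parse_import_properties are all hypothetical stand-ins, and the real tokens come from cl_ext.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; these are NOT the token
 * values defined in cl_ext.h. */
enum { KEY_TYPE = 1, KEY_PROTECTED = 2 };
enum { TYPE_HOST = 10, TYPE_DMA_BUF = 11, TYPE_AHB = 12 };

typedef struct {
    intptr_t type;      /* defaults to TYPE_HOST */
    bool is_protected;  /* defaults to false */
} import_config;

/* Walks a zero-terminated key/value list and fills in `out`.
 * Returns false for an unknown key, an invalid value, or a combination
 * that the restrictions above disallow. */
static bool parse_import_properties(const intptr_t *props, import_config *out)
{
    assert(out != NULL);

    /* Defaults apply when props is NULL or points at a single zero. */
    out->type = TYPE_HOST;
    out->is_protected = false;

    if (props != NULL) {
        /* The terminator is only looked for at key positions, so a value
         * of 0 (false) does not end the list early. */
        for (size_t i = 0; props[i] != 0; i += 2) {
            const intptr_t key = props[i];
            const intptr_t value = props[i + 1];

            switch (key) {
            case KEY_TYPE:
                if (value != TYPE_HOST && value != TYPE_DMA_BUF &&
                    value != TYPE_AHB)
                    return false;
                out->type = value;
                break;
            case KEY_PROTECTED:
                if (value != 0 && value != 1)
                    return false;
                out->is_protected = (value == 1);
                break;
            default:
                return false;
            }
        }
    }

    /* Protected imports are only valid for dma_buf memory. */
    if (out->is_protected && out->type != TYPE_DMA_BUF)
        return false;

    return true;
}
```

Under these assumptions, a list that requests protected memory without also selecting the dma_buf type is rejected, because the type defaults to host memory.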
    Errors

    CL_INVALID_CONTEXT on invalid context.

    CL_INVALID_VALUE on invalid flag input.

    CL_INVALID_BUFFER_SIZE when size is 0.

    CL_INVALID_BUFFER_SIZE when size is greater than the size of the
    allocation being imported.

    CL_INVALID_PROPERTY when invalid properties or combination of
    properties are passed.

    CL_INVALID_VALUE if memory is NULL.

    CL_INVALID_VALUE if an out-of-range plane or layer index is passed via
    CL_IMPORT_ANDROID_HARDWARE_BUFFER_PLANE_INDEX_ARM or
    CL_IMPORT_ANDROID_HARDWARE_BUFFER_LAYER_INDEX_ARM.

    CL_INVALID_OPERATION when host virtual pages in the range of memory to
    memory + size are not mapped in the userspace address space. This does
    _not_ include cases where physical pages are not allocated. For
    specific behaviour see documentation for those memory types.

    Prior to version 1.1.0, CL_INVALID_OPERATION when an imported memory
    object has been passed to one of the following API functions (they
    can't be used with imported memory objects):

    * clEnqueueMapBuffer
    * clEnqueueMapImage
    * clEnqueueUnmapMemObject
    * clEnqueueReadImage
    * clEnqueueWriteImage
    * clEnqueueReadBuffer
    * clEnqueueReadBufferRect
    * clEnqueueWriteBuffer
    * clEnqueueWriteBufferRect
    * clEnqueueCopyBuffer
    * clEnqueueCopyBufferRect
    * clEnqueueCopyBufferToImage
    * clEnqueueCopyImageToBuffer
    * clEnqueueCopyImage
    * clEnqueueFillBuffer
    * clEnqueueFillImage

    CL_INVALID_OPERATION if a host memory flag other than
    CL_MEM_HOST_NO_ACCESS is used when CL_IMPORT_TYPE_PROTECTED_ARM is set
    to CL_TRUE.

    CL_INVALID_OPERATION if a command that makes use of a protected memory
    object is enqueued to a command queue that has
    CL_QUEUE_PROFILING_ENABLE set.

    Further error information may be reported via the cl_context callback
    function.

New memory import types

    Linux dma_buf memory type - CL_IMPORT_TYPE_DMA_BUF_ARM

    If the extension string cl_arm_import_memory_dma_buf is exposed then
    importing from dma_buf file handles is supported.
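Because support for each memory type is advertised through its own extension string, an application would normally check for that string before importing. The sketch below assumes the space-separated extension list has already been obtained, for instance from clGetDeviceInfo with CL_DEVICE_EXTENSIONS; has_extension is a hypothetical helper, not part of this extension. It matches whole tokens, which matters here because cl_arm_import_memory is a prefix of every other extension string this document defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` appears as a whole space-separated token in
 * `extensions`. A plain strstr() is not enough: it would report
 * "cl_arm_import_memory" as present whenever
 * "cl_arm_import_memory_host" is listed. */
static bool has_extension(const char *extensions, const char *name)
{
    assert(extensions != NULL && name != NULL);

    const size_t name_len = strlen(name);
    if (name_len == 0)
        return false;

    const char *cursor = extensions;
    while ((cursor = strstr(cursor, name)) != NULL) {
        /* The match must start at the beginning or right after a space. */
        const bool starts_token =
            (cursor == extensions) || (cursor[-1] == ' ');
        /* The match must be followed by a space or the end of the list. */
        const char after = cursor[name_len];
        const bool ends_token = (after == ' ') || (after == '\0');

        if (starts_token && ends_token)
            return true;

        cursor += 1; /* keep scanning after a partial match */
    }
    return false;
}
```

A caller would then attempt a dma_buf import only when has_extension(extensions, "cl_arm_import_memory_dma_buf") returns true.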
    When CL_IMPORT_DMA_BUF_DATA_CONSISTENCY_WITH_HOST_ARM is set to
    CL_TRUE, the OpenCL runtime manages data consistency between the host
    CPU and the memory backing the dma_buf allocation. The application is
    still responsible for ensuring that changes to the memory done outside
    of the coherency domain in which the host CPU and the OpenCL device
    reside are made visible to that coherency domain before using that
    memory from an OpenCL command. This can be achieved either by not
    enqueueing the workload until the data is visible, or by using a user
    event to prevent the command from being executed until the expected
    data has reached memory.

    Prior to the introduction of
    CL_IMPORT_DMA_BUF_DATA_CONSISTENCY_WITH_HOST_ARM, drivers always
    managed data consistency between the host CPU and memory.

    Flags attached to a dma_buf file handle take precedence over memory
    flags supplied to clImportMemoryARM. For example, if a dma_buf
    allocation originally created with a read-only flag is passed to
    clImportMemoryARM with the READ_WRITE flag, the more restrictive
    READ_ONLY will take precedence.

    dma_buf allocations are page-aligned and their size is a whole number
    of pages.

    size must be less than or equal to the size of the imported dma_buf
    allocation.

    See also the code example below.

    Host process memory type - CL_IMPORT_TYPE_HOST_ARM

    If the extension string cl_arm_import_memory_host is exposed then
    importing from normal userspace allocations (such as those created via
    malloc) is supported.

    If the host OS is Linux and overcommit of VA is allowed, then this
    function will commit and pin physical pages for the VA range. This may
    cause larger physical allocations than the application typically
    provokes if memory is sparsely used. In this case sub-ranges of the
    host allocation should be passed to the import function individually.

    It is the application's responsibility to align for the datatype being
    accessed.
    Though the application is free to provide allocations without any
    specific alignment on coherent systems, there is a requirement to
    provide pointers aligned to a cache line on systems where there is no
    HW-managed coherency between CPU and GPU.

    When alignment is less than a page size then whole pages touched by
    addresses in the range of memory to memory + size will be mapped into
    the device. If the page is already mapped by another unaligned import,
    an error will occur.

    Cache coherency will be HW-managed on systems where it is supported.
    Otherwise, cache maintenance operations will be added by the CL
    runtime where needed.

    Importing host memory that is otherwise being used by a device outside
    of the CPU/GPU coherency domain isn't guaranteed to work and the GPU
    caches may contain stale data.

    Importing dma_buf pages through a CPU mapping is undefined.

    Importing two allocations that aren't page-aligned and that request
    different memory flags is unsupported; an error will be returned.

    This method is recommended when interoperating with an existing host
    library which performs its own allocation and cannot be passed handles
    to mapped OpenCL buffers.

    See also the code example below.

    Protected memory

    If the extension string cl_arm_import_memory_protected is exposed then
    using CL_IMPORT_TYPE_PROTECTED_ARM in the list of properties is
    allowed. If the property's value is set to CL_TRUE, then the memory is
    imported as protected memory.

    Android hardware buffer

    This allows an AHardwareBuffer to be imported directly. The import
    acquires a reference on the AHardwareBuffer object that will not be
    released until the cl_mem is destroyed.

    In version 1.0.0:

    Multiplanar buffers are only supported when backed by a single
    dma_buf.

    In version 1.1.0:

    It is possible to select the specific plane or layer to import via
    CL_IMPORT_ANDROID_HARDWARE_BUFFER_PLANE_INDEX_ARM and/or
    CL_IMPORT_ANDROID_HARDWARE_BUFFER_LAYER_INDEX_ARM. The imported memory
    will start at the first pixel for the imported plane/layer.
    size may be set to CL_IMPORT_MEMORY_WHOLE_ALLOCATION_ARM to require
    that the memory object have the same size as the imported
    AHardwareBuffer.

New Tokens

    None

Interactions with other extensions

    This extension produces cl_mem memory objects which are compatible
    with all other uses of cl_mem in the standard API, including creating
    images from the resulting cl_mem, subject to the restrictions listed
    in this document.

    In order to guarantee data consistency, applications must ensure that
    neither the host nor any device attempts to perform simultaneous read
    and write operations on any part of the memory backing an imported
    buffer or sub-buffers created therefrom, even if these accesses do not
    overlap.

    For example, this implies that it is not possible to write part of the
    memory backing an imported buffer on the host while reading a
    sub-buffer created from that buffer on a device, even if the memory
    written by the host is not visible through the sub-buffer.

    This extension also provides an alternative to image import via EGL.
Sample Code

    CL_IMPORT_TYPE_DMA_BUF_ARM

        #define WIDTH 1024
        #define HEIGHT 512

        // Create buffer to be used as a hardware texture with graphics APIs
        // (can also include video/camera use flags here)
        int dma_buf_handle = get_dma_buf_handle_from_exporter_kernel_module( ..., WIDTH * HEIGHT * 2 );

        // Zero-terminated list of key/value pairs selecting the dma_buf type
        const cl_import_properties_arm properties[] = { CL_IMPORT_TYPE_ARM, CL_IMPORT_TYPE_DMA_BUF_ARM, 0 };

        cl_int error = CL_SUCCESS;
        cl_mem buffer = clImportMemoryARM( ctx,
                                           CL_MEM_READ_WRITE,
                                           properties,
                                           &dma_buf_handle,
                                           WIDTH * HEIGHT * 2,
                                           &error );

        if( error == CL_SUCCESS )
        {
            // Use as you would any other cl_mem buffer
        }

    CL_IMPORT_TYPE_HOST_ARM

        #define WIDTH 1024
        #define HEIGHT 512

        // tightly packed buffer we will treat as RGB565
        char *allocptr = malloc( WIDTH * HEIGHT * 2 );

        // The type CL_IMPORT_TYPE_HOST_ARM can be omitted as it is the default
        cl_int error = CL_SUCCESS;
        cl_mem buffer = clImportMemoryARM( ctx,
                                           CL_MEM_READ_WRITE,
                                           NULL,
                                           allocptr,
                                           WIDTH * HEIGHT * 2,
                                           &error );

        if( error == CL_SUCCESS )
        {
            // Use as you would any other cl_mem buffer
        }

    CL_IMPORT_TYPE_PROTECTED_ARM

        #define BUFFER_SIZE 4096

        // Platform specific method of allocating protected memory and
        // obtaining a handle. For example, from a protected dma_buf or ION
        // region.
        int protected_fd = protected_alloc( BUFFER_SIZE );

        cl_int error = CL_SUCCESS;

        // Import memory of type dma_buf and identify it as protected
        const cl_import_properties_arm properties[] = { CL_IMPORT_TYPE_ARM, CL_IMPORT_TYPE_DMA_BUF_ARM,
                                                        CL_IMPORT_TYPE_PROTECTED_ARM, CL_TRUE,
                                                        0 };

        cl_mem buffer = clImportMemoryARM( context,
                                           CL_MEM_READ_WRITE,
                                           properties,
                                           &protected_fd,
                                           BUFFER_SIZE,
                                           &error );

        if( error == CL_SUCCESS )
        {
            // Use as cl_mem buffer, observing the usage restrictions above.
        }

    CL_IMPORT_TYPE_ANDROID_HARDWARE_BUFFER_ARM

        AHardwareBuffer *ahb = ...

        cl_int error = CL_SUCCESS;

        const cl_import_properties_arm properties[] = { CL_IMPORT_TYPE_ARM, CL_IMPORT_TYPE_ANDROID_HARDWARE_BUFFER_ARM,
                                                        0 };

        cl_mem buffer = clImportMemoryARM( context,
                                           CL_MEM_READ_WRITE,
                                           properties,
                                           ahb,
                                           CL_IMPORT_MEMORY_WHOLE_ALLOCATION_ARM,
                                           &error );

        if( error == CL_SUCCESS )
        {
            // Use as cl_mem buffer, observing the usage restrictions above.
        }

Conformance Tests

    None

Revision History

    Revision: #1, Apr 27th, 2015 - Initial revision
    Revision: #2, Apr 28th, 2015 - Added properties field to avoid type
                                   inference. Added Issues section.
    Revision: #3, May 5th, 2015 - Added image support info in Issues.
    Revision: #4, Aug 4th, 2015 - Revised based on implementation and
                                  design changes made during review.
    Revision: #5, May 3rd, 2017 - Additional restrictions on host
                                  operations and general cleanup /
                                  clarification.
    Revision: #6, Jan 5th, 2018 - Support creating a sub-buffer from an
                                  imported buffer.
    Revision: #7, Jun 19th, 2018 - Add support for protected memory
                                   imports.
    Revision: #8, Aug 21st, 2019 - Clarified requirements on the size
                                   parameter to clImportMemoryARM.
    Revision: #9, Jan 10th, 2020 - Add support for Android hardware
                                   buffers.
    Revision: #10, Feb 13th, 2020 - Make dma_buf <-> host data consistency
                                    optional.
    Revision: #11, Feb 15th, 2021 - Add support for Android hardware
                                    buffer plane/layer selection.
    Revision: #12, Sep 29th, 2022 - Version 1.1.0 of cl_arm_import_memory
                                    adds support for host access to
                                    imported memory for all memory types.