Name

    ARM_thread_limit_hint

Name Strings

    cl_arm_thread_limit_hint

Contributors

    Robert Elliott, ARM Ltd.
    Kévin Petit, ARM Ltd.

Contact

    Kévin Petit, ARM Ltd. (kevin.petit 'at' ARM.com)

IP Status

    No claims or disclosures are known to exist.

Version

    Revision: #3, Sept 28th, 2017

Number

    OpenCL Extension #41

Status

    Complete.

Extension Type

    OpenCL device extension

Dependencies

    Requires OpenCL version 1.0 or later.

Overview

    This extension enables an application to provide a hint for the maximum
    number of threads allowed to run concurrently on a compute unit. This
    results in a limit in the threads used by a kernel instance on devices
    that support it, lowering pressure on caches.

Header File

    No host changes needed.

Glossary

    No new terminology is introduced by this extension.

New Types

    None

New Procedures and Functions

    The new kernel qualifier

        __attribute__((arm_thread_limit_hint(N)))

Description

    The attribute can be specified as part of the declaration of a kernel and
    provides a hint to the implementation that using fewer threads is desired.

    The implementation will accept any number between 0 and
    CL_DEVICE_MAX_WORK_GROUP_SIZE and choose the closest number that can be
    used.

    If the hint is larger than the maximum workgroup size supported by the
    kernel for that device, it is not honored.

    If the hint is smaller than the requested workgroup size for the
    kernel-instance, it is not honored.

    If the hint is not honored, a warning will be produced on context_notify.

    The hint will be honored on devices which support this feature.

New Tokens

    OpenCL kernel code now has access to:

        #pragma OPENCL EXTENSION cl_arm_thread_limit_hint : enable

    The define cl_arm_thread_limit_hint is also present.

Interactions with other extensions

    None

Issues

    None

Sample Code

    The following is a basic example of use, nothing else is required for the
    extension to function:

        // Check for extension and define a throttle value if it is present.
        // This is portable to drivers or devices without support for the
        // extension.
        #ifdef cl_arm_thread_limit_hint
        #pragma OPENCL EXTENSION cl_arm_thread_limit_hint : enable
        #define THROTTLE_ATTRIBUTE __attribute__((arm_thread_limit_hint(64)))
        #else
        #define THROTTLE_ATTRIBUTE
        #endif

        kernel THROTTLE_ATTRIBUTE
        void throttled_kernel( global int* in, global int *out )
        {
            // Kernel body
            ...
        }

Conformance Tests

    None

Revision History

    Revision: #1, Feb 2nd, 2015
        - Initial revision
    Revision: #2, Feb 23rd, 2015
        - Tidied up some of the language, added _hint to the extension to be
          more consistent with other extensions.
    Revision: #3, Sept 28th, 2017
        - Relaxed the constraints on the number of threads accepted.
          Clarified the wording.
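
Host-Side Detection (Informative Sketch)

    The Sample Code section detects the extension inside the kernel source
    through the cl_arm_thread_limit_hint define. An application may also want
    to know on the host whether a device reports the extension, for example
    to decide whether tuning the hint value is worthwhile. The following is a
    minimal sketch, not part of the extension itself. It assumes the
    space-separated extension string has already been obtained, typically by
    querying CL_DEVICE_EXTENSIONS with clGetDeviceInfo. The helper name
    has_extension is hypothetical and introduced only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not defined by the extension or by OpenCL).
 * Returns true if `name` appears as a complete space-separated token in
 * `extensions`. A plain strstr() is not enough on its own, because it
 * would also match a longer extension name that merely contains `name`. */
static bool has_extension(const char *extensions, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = extensions;

    if (name_len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must begin at the start of the string or after a space. */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        /* The match must end at a space or at the end of the string. */
        bool ends_token = (p[name_len] == ' ') || (p[name_len] == '\0');

        if (starts_token && ends_token)
            return true;

        /* Skip past this partial match and keep searching. */
        p += name_len;
    }
    return false;
}
```

    A host application would pass the string returned for
    CL_DEVICE_EXTENSIONS as the first argument and
    "cl_arm_thread_limit_hint" as the second. The kernel-side #ifdef shown in
    Sample Code remains sufficient for portability; this check is only
    useful when the host needs to know in advance whether the hint can take
    effect.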