Name

    NV_device_attribute_query

Name Strings

    cl_nv_device_attribute_query

Number

    OpenCL Extension #18

Contributors

    Cyril Zeller, NVIDIA Corporation
    Yogesh Kini, NVIDIA Corporation
    Kedar Patil, NVIDIA Corporation

Notice

    Copyright NVIDIA Corporation, 2009.

IP Status

    NVIDIA Proprietary.

Version

    October 5, 2009 (version 1.0)

Dependencies

    OpenCL 1.0 is required.

Overview

    This extension provides a mechanism to query device attributes
    specific to NVIDIA hardware. This enables the programmer to optimize
    OpenCL kernels based on the specifics of the hardware.

Details

    The OpenCL 1.0 specification allows the programmer to query various
    device attributes. The complete list of these attributes is given in
    table 4.3. However, there is no way to query vendor-specific
    information. This extension extends the table to include
    NVIDIA-specific device attribute queries.

    This extension extends table 4.3 of the OpenCL 1.0 specification to
    include the following:

    |------------------------------------------------------------------------------------------------------------------|
    | cl_device_info                        | Return Type | Description                                                |
    |------------------------------------------------------------------------------------------------------------------|
    | CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV | cl_uint     | Returns the major revision number that defines the CUDA    |
    |                                       |             | compute capability of the device.                          |
    |------------------------------------------------------------------------------------------------------------------|
    | CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV | cl_uint     | Returns the minor revision number that defines the CUDA    |
    |                                       |             | compute capability of the device.                          |
    |------------------------------------------------------------------------------------------------------------------|
    | CL_DEVICE_REGISTERS_PER_BLOCK_NV      | cl_uint     | Maximum number of 32-bit registers available to a          |
    |                                       |             | work-group; this number is shared by all work-groups       |
    |                                       |             | simultaneously resident on a multiprocessor.               |
    |------------------------------------------------------------------------------------------------------------------|
    | CL_DEVICE_WARP_SIZE_NV                | cl_uint     | Warp size in work-items.                                   |
    |------------------------------------------------------------------------------------------------------------------|
    | CL_DEVICE_GPU_OVERLAP_NV              | cl_bool     | Returns CL_TRUE if the device can concurrently copy memory |
    |                                       |             | between host and device while executing a kernel, or       |
    |                                       |             | CL_FALSE if not.                                           |
    |------------------------------------------------------------------------------------------------------------------|
    | CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV      | cl_bool     | CL_TRUE if there is a run time limit for kernels executed  |
    |                                       |             | on the device, or CL_FALSE if not.                         |
    |------------------------------------------------------------------------------------------------------------------|
    | CL_DEVICE_INTEGRATED_MEMORY_NV        | cl_bool     | CL_TRUE if the device is integrated with the memory        |
    |                                       |             | subsystem, or CL_FALSE if not.                             |
    |------------------------------------------------------------------------------------------------------------------|

    The function clGetDeviceInfo can be called with the constants above
    in order to query the device attributes.