C Specification
To get the CUDA module cache, call:
// Provided by VK_NV_cuda_kernel_launch
VkResult vkGetCudaModuleCacheNV(
    VkDevice                                    device,
    VkCudaModuleNV                              module,
    size_t*                                     pCacheSize,
    void*                                       pCacheData);
Parameters
- device is the logical device that owns the CUDA module.
- module is the CUDA module.
- pCacheSize is a pointer to the size, in bytes, of the data copied into pCacheData, as described below.
- pCacheData is either NULL or a pointer to a buffer in which to copy the binary cache.
Description
If pCacheData is NULL, then the size of the binary cache, in bytes, is returned in pCacheSize. Otherwise, pCacheSize must point to a variable set by the user to the size of the buffer, in bytes, pointed to by pCacheData, and on return the variable is overwritten with the amount of data actually written to pCacheData. If the value pointed to by pCacheSize is less than the size of the binary cache, nothing is written to pCacheData, and VK_INCOMPLETE will be returned instead of VK_SUCCESS.
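The size query followed by a data query described above is the usual two-call idiom. The following is a minimal sketch of that idiom, not code from the Specification. To keep it compilable without a Vulkan SDK or driver, it uses hypothetical stand-ins: get_cache_fn has the same pCacheSize/pCacheData shape as vkGetCudaModuleCacheNV, sketch_get_cache imitates the documented behaviour on a fixed blob, and SKETCH_SUCCESS/SKETCH_INCOMPLETE stand in for VK_SUCCESS/VK_INCOMPLETE. All of these names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for VK_SUCCESS and VK_INCOMPLETE. */
enum { SKETCH_SUCCESS = 0, SKETCH_INCOMPLETE = 1 };

/* Hypothetical callback with the same size/data contract as
   vkGetCudaModuleCacheNV. A real callback would forward to that
   function, taking the device and module from ctx. */
typedef int (*get_cache_fn)(void *ctx, size_t *pCacheSize, void *pCacheData);

/* Stand-in data so the sketch runs without a driver. */
static const unsigned char sketch_blob[] = { 0xCA, 0xFE, 0xBA, 0xBE, 0x01, 0x02 };

/* Stand-in that imitates the behaviour described in the text. */
static int sketch_get_cache(void *ctx, size_t *pCacheSize, void *pCacheData)
{
    (void)ctx;
    if (pCacheData == NULL) {              /* size query */
        *pCacheSize = sizeof sketch_blob;
        return SKETCH_SUCCESS;
    }
    if (*pCacheSize < sizeof sketch_blob)  /* too small: write nothing */
        return SKETCH_INCOMPLETE;
    memcpy(pCacheData, sketch_blob, sizeof sketch_blob);
    *pCacheSize = sizeof sketch_blob;      /* bytes actually written */
    return SKETCH_SUCCESS;
}

/* Two-call idiom: query the size, allocate, then fetch the data.
   Returns a malloc'd buffer that the caller frees, or NULL on failure. */
static void *fetch_cache(get_cache_fn get, void *ctx, size_t *out_size)
{
    size_t size = 0;
    if (get(ctx, &size, NULL) != SKETCH_SUCCESS || size == 0)
        return NULL;

    void *data = malloc(size);
    if (data == NULL)
        return NULL;

    /* Anything other than success (including "incomplete") is treated
       as a failure here; a caller could also retry with a new size. */
    if (get(ctx, &size, data) != SKETCH_SUCCESS) {
        free(data);
        return NULL;
    }
    *out_size = size;
    return data;
}
```

In an application, fetch_cache would be called with a callback wrapping vkGetCudaModuleCacheNV, and the returned bytes would typically be written to disk for a later run.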
The returned cache may then be used later to initialize the CUDA module, by passing this cache instead of the PTX code to vkCreateCudaModuleNV.
Note
Using the binary cache instead of the original PTX code should significantly speed up initialization of the CUDA module, because the full compilation and validation are no longer necessary. As with VkPipelineCache, the binary cache depends on the specific implementation. The application must assume that the cache upload can fail in many circumstances, and should be prepared to fall back to the original PTX code if necessary. The cache upload will most often succeed when the same device driver and architecture are used for generating the cache from PTX and for consuming it. With a new driver version or a different device architecture, the cache may become invalid.
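The fallback recommended in the note can be sketched as "try the cache first, then the PTX". This is an illustrative sketch under stated assumptions, not the Specification's method: create_module_fn is a hypothetical callback that would wrap vkCreateCudaModuleNV and return 0 on success, and sketch_create is a stand-in whose acceptance rule (a one-byte driver tag, and "//" marking PTX text) is invented purely so the example runs without a driver.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns 0 if a module could be created from
   the given bytes. A real one would wrap vkCreateCudaModuleNV. */
typedef int (*create_module_fn)(void *ctx, const void *data, size_t size);

/* Which input the module was finally created from. */
enum module_source { SOURCE_NONE, SOURCE_CACHE, SOURCE_PTX };

/* Try the binary cache first; if it is missing or rejected,
   fall back to the original PTX code. */
static enum module_source create_with_fallback(
    create_module_fn create, void *ctx,
    const void *cache, size_t cache_size,
    const void *ptx, size_t ptx_size)
{
    if (cache != NULL && cache_size > 0 &&
        create(ctx, cache, cache_size) == 0)
        return SOURCE_CACHE;

    if (ptx != NULL && ptx_size > 0 &&
        create(ctx, ptx, ptx_size) == 0)
        return SOURCE_PTX;

    return SOURCE_NONE;
}

/* Stand-in for demonstration only. It accepts anything starting with
   "//" as PTX text, and accepts a cache only if its first byte equals
   the driver tag pointed to by ctx. Real drivers decide differently. */
static int sketch_create(void *ctx, const void *data, size_t size)
{
    const unsigned char *bytes = data;
    unsigned char driver_tag = *(const unsigned char *)ctx;

    if (size >= 2 && memcmp(bytes, "//", 2) == 0)
        return 0;
    return (size >= 1 && bytes[0] == driver_tag) ? 0 : -1;
}
```

Reporting which source was used lets the caller regenerate and store a fresh cache whenever the PTX path had to be taken.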
Document Notes
For more information, see the Vulkan Specification
This page is extracted from the Vulkan Specification. Fixes and changes should be made to the Specification, not directly.