25. Mesh Shading

Task and mesh shaders operate in workgroups to produce a collection of primitives that will be processed by subsequent stages of the graphics pipeline.

Work on the mesh pipeline is initiated by the application drawing a set of mesh tasks organized in global workgroups. If the optional task shader is active, each workgroup triggers the execution of task shader invocations that will create a new set of mesh workgroups upon completion. Each of these created workgroups, or each of the original workgroups if no task shader is present, triggers the execution of mesh shader invocations.

Each mesh shader workgroup emits zero or more output primitives along with the group of vertices and their associated data required for each output primitive.

25.1. Task Shader Input

For every workgroup issued via the drawing commands, a group of task shader invocations is executed. There are no inputs other than the built-in workgroup identifiers.

25.2. Task Shader Output

The task shader can emit zero or more mesh workgroups to be generated. Shaders using the TaskNV Execution Model can do so using the built-in variable TaskCountNV. This value must be less than or equal to VkPhysicalDeviceMeshShaderPropertiesNV::maxTaskOutputCount. Shaders using the TaskEXT Execution Model can do so using the OpEmitMeshTasksEXT instruction. The groupCountX, groupCountY and groupCountZ arguments passed to this instruction must be less than or equal to the respective dimension within VkPhysicalDeviceMeshShaderPropertiesEXT::maxMeshWorkGroupCount. The product of these arguments must be less than or equal to VkPhysicalDeviceMeshShaderPropertiesEXT::maxMeshWorkGroupTotalCount.
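The TaskEXT limits described here can be checked on the host before choosing group counts. The following C sketch is illustrative only: `MeshTaskLimits` is a hypothetical stand-in for the two relevant members of VkPhysicalDeviceMeshShaderPropertiesEXT (so the example compiles without the Vulkan headers), and `emit_counts_within_limits` is an assumed helper name, not part of any API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two limits read from
 * VkPhysicalDeviceMeshShaderPropertiesEXT; field names mirror the spec. */
typedef struct MeshTaskLimits {
    uint32_t maxMeshWorkGroupCount[3];
    uint32_t maxMeshWorkGroupTotalCount;
} MeshTaskLimits;

/* Returns true if (x, y, z) respects both the per-dimension limits and
 * the limit on the product of the three group counts. */
static bool emit_counts_within_limits(const MeshTaskLimits *limits,
                                      uint32_t x, uint32_t y, uint32_t z)
{
    if (x > limits->maxMeshWorkGroupCount[0] ||
        y > limits->maxMeshWorkGroupCount[1] ||
        z > limits->maxMeshWorkGroupCount[2]) {
        return false;
    }

    /* x * y always fits in 64 bits; a zero factor means nothing is emitted. */
    uint64_t xy = (uint64_t)x * y;
    if (xy == 0 || z == 0) {
        return true;
    }

    /* xy * z <= total  is equivalent to  xy <= total / z  for integers,
     * and the division form cannot overflow. */
    return xy <= (uint64_t)limits->maxMeshWorkGroupTotalCount / z;
}
```

With assumed limits of 65535 per dimension and a total of 4194304, the counts (256, 256, 64) pass because their product equals the total limit exactly, while (256, 256, 65) fail.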

The task shader can also pass user-defined data to all mesh shader invocations that it creates. Shaders using the TaskNV Execution Model can do so by writing to output variables that are decorated with PerTaskNV. They are available as inputs in mesh shaders. Shaders using the TaskEXT Execution Model can do so by writing to a payload variable with TaskPayloadWorkgroupEXT storage class that is passed to the OpEmitMeshTasksEXT instruction.

25.3. Mesh Generation

If a task shader exists, the mesh assembler creates a variable number of mesh workgroups depending on each task’s output. If there is no task shader, the drawing commands emit the mesh shader invocations directly.

25.4. Mesh Shader Input

The only inputs available to the mesh shader are variables identifying the specific workgroup and invocation and, if applicable, any outputs written as PerTaskNV or the payload variable passed to the OpEmitMeshTasksEXT instruction by the task shader that spawned the mesh shader’s workgroup. The mesh shader can operate without a task shader as well.

25.5. Mesh Shader Output

A mesh shader generates primitives in one of three output modes: points, lines, or triangles. For shaders using the MeshNV Execution Model the primitive mode is specified in the shader using an OpExecutionMode instruction with the OutputPoints, OutputLinesNV, or OutputTrianglesNV modes, respectively. For shaders using the MeshEXT Execution Model the primitive mode is specified in the shader using an OpExecutionMode instruction with the OutputPoints, OutputLinesEXT, or OutputTrianglesEXT modes, respectively. Each mesh shader must include exactly one output primitive mode.

For shaders using the MeshNV Execution Model the maximum output vertex count is specified as a literal in the shader using an OpExecutionMode instruction with the mode set to OutputVertices and must be less than or equal to VkPhysicalDeviceMeshShaderPropertiesNV::maxMeshOutputVertices. For shaders using the MeshEXT Execution Model the maximum output vertex count is specified as a literal in the shader using an OpExecutionMode instruction with the mode set to OutputVertices and must be less than or equal to VkPhysicalDeviceMeshShaderPropertiesEXT::maxMeshOutputVertices.

For shaders using the MeshNV Execution Model the maximum output primitive count is specified as a literal in the shader using an OpExecutionMode instruction with the mode set to OutputPrimitivesNV and must be less than or equal to VkPhysicalDeviceMeshShaderPropertiesNV::maxMeshOutputPrimitives. For shaders using the MeshEXT Execution Model the maximum output primitive count is specified as a literal in the shader using an OpExecutionMode instruction with the mode set to OutputPrimitivesEXT, and must be less than or equal to VkPhysicalDeviceMeshShaderPropertiesEXT::maxMeshOutputPrimitives.

For shaders using the MeshNV Execution Model the number of primitives output by the mesh shader is provided via writing to the built-in variable PrimitiveCountNV and must be less than or equal to the maximum output primitive count specified in the shader. A variable decorated with PrimitiveIndicesNV is an output array of local index values into the vertex output arrays from which primitives are assembled according to the output primitive type. For shaders using the MeshEXT Execution Model the number of vertices and primitives output by the mesh shader is provided via calling the OpSetMeshOutputsEXT instruction. The vertexCount argument must be less than or equal to the maximum output vertex count specified in the shader. The primitiveCount argument must be less than or equal to the maximum output primitive count specified in the shader.
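For the MeshEXT case, the two layers of limits (declared maxima against device limits, then runtime counts against declared maxima) can be illustrated with a short C sketch. The `MeshShaderDecl` struct and both function names are hypothetical; the device limits are passed as plain integers that an application would read from VkPhysicalDeviceMeshShaderPropertiesEXT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the literals a MeshEXT shader declares. */
typedef struct MeshShaderDecl {
    uint32_t outputVertices;   /* literal of the OutputVertices mode */
    uint32_t outputPrimitives; /* literal of the OutputPrimitivesEXT mode */
} MeshShaderDecl;

/* Declared maxima must not exceed the device limits. */
static bool decl_within_device_limits(const MeshShaderDecl *decl,
                                      uint32_t maxMeshOutputVertices,
                                      uint32_t maxMeshOutputPrimitives)
{
    return decl->outputVertices <= maxMeshOutputVertices &&
           decl->outputPrimitives <= maxMeshOutputPrimitives;
}

/* Counts passed to OpSetMeshOutputsEXT must not exceed the declared maxima. */
static bool set_mesh_outputs_valid(const MeshShaderDecl *decl,
                                   uint32_t vertexCount,
                                   uint32_t primitiveCount)
{
    return vertexCount <= decl->outputVertices &&
           primitiveCount <= decl->outputPrimitives;
}
```

A shader declaring 64 vertices and 126 primitives would, under this sketch, accept any runtime pair up to (64, 126), including (0, 0) for a workgroup that emits nothing.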

Depending on the output primitive mode an appropriately-decorated variable is the output array of local index values into the vertex output arrays from which primitives are assembled according to the output primitive type:

  • OutputPoints uses the PrimitivePointIndicesEXT decoration.

  • OutputLinesEXT uses the PrimitiveLineIndicesEXT decoration.

  • OutputTrianglesEXT uses the PrimitiveTriangleIndicesEXT decoration.

These resulting primitives are then further processed as described in Rasterization.
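To make the role of the index arrays concrete, the C sketch below models how local vertex indices are grouped per primitive in each output mode: one index per point, two per line, and three per triangle. The enum, the function names, and the flat `uint32_t` layout are assumptions made for illustration; in a shader the indices live in a decorated output array rather than in host memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum naming the three MeshEXT output primitive modes. */
typedef enum MeshOutputMode {
    MESH_OUTPUT_POINTS = 1,
    MESH_OUTPUT_LINES = 2,
    MESH_OUTPUT_TRIANGLES = 3
} MeshOutputMode;

/* Number of local vertex indices that make up one primitive. */
static uint32_t indices_per_primitive(MeshOutputMode mode)
{
    switch (mode) {
    case MESH_OUTPUT_POINTS:    return 1;
    case MESH_OUTPUT_LINES:     return 2;
    case MESH_OUTPUT_TRIANGLES: return 3;
    }
    return 0; /* unknown mode */
}

/* Copies the indices of primitive `prim` out of a flat index array
 * (an assumed host-side layout) and returns how many were copied. */
static uint32_t primitive_indices(MeshOutputMode mode, const uint32_t *flat,
                                  uint32_t prim, uint32_t out[3])
{
    uint32_t n = indices_per_primitive(mode);
    for (uint32_t i = 0; i < n; ++i) {
        out[i] = flat[(size_t)prim * n + i];
    }
    return n;
}
```

For a flat triangle index array `{0, 1, 2, 2, 1, 3}`, primitive 1 is assembled from local vertices 2, 1, and 3.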

With the exception of primitive indices, all output built-ins and custom attributes count towards the total storage size occupied by output variables in mesh shaders. This size can be calculated as follows, taking into account the fact that the number of effective scalar attributes is 4 times the number of effective locations used according to the location assignment rules. Let $v$ be the number of views, $n_v$ be the number of effective scalar per-vertex attributes not dependent on ViewIndex, $n_{vpv}$ be the number of effective scalar per-vertex attributes dependent on ViewIndex, $m_v$ be the maximum number of vertices specified by the OutputVertices Execution Mode, $g_v$ be meshOutputPerVertexGranularity, $n_p$ be the number of effective scalar per-primitive attributes not dependent on ViewIndex, $n_{ppv}$ be the number of effective scalar per-primitive attributes dependent on ViewIndex, $m_p$ be the maximum number of primitives specified by the OutputPrimitivesEXT Execution Mode and $g_p$ be meshOutputPerPrimitiveGranularity:

$$(n_v + (n_{vpv} \times v)) \times 4 \times \mathrm{align}(m_v, g_v) + (n_p + (n_{ppv} \times v)) \times 4 \times \mathrm{align}(m_p, g_p)$$
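As a worked illustration, the storage calculation can be sketched in C. The sketch assumes that each effective scalar occupies 4 bytes, that aligning means rounding up to the next multiple of the granularity, and that view-dependent attributes are counted once per view. The struct, its field names, and both functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds value up to the next multiple of granularity (assumed meaning of
 * "align" for this calculation). */
static uint64_t align_up(uint64_t value, uint64_t granularity)
{
    assert(granularity != 0);
    return (value + granularity - 1) / granularity * granularity;
}

/* Hypothetical bundle of the quantities that feed the size calculation. */
typedef struct MeshOutputLayout {
    uint32_t views;                   /* number of views */
    uint32_t vertexScalars;           /* per-vertex scalars, view-independent */
    uint32_t vertexScalarsPerView;    /* per-vertex scalars, view-dependent */
    uint32_t maxVertices;             /* OutputVertices literal */
    uint32_t vertexGranularity;       /* meshOutputPerVertexGranularity */
    uint32_t primitiveScalars;        /* per-primitive scalars, view-independent */
    uint32_t primitiveScalarsPerView; /* per-primitive scalars, view-dependent */
    uint32_t maxPrimitives;           /* OutputPrimitivesEXT literal */
    uint32_t primitiveGranularity;    /* meshOutputPerPrimitiveGranularity */
} MeshOutputLayout;

/* Total output storage in bytes: scalars per element, times 4 bytes,
 * times the aligned element count, summed over vertices and primitives. */
static uint64_t mesh_output_storage_bytes(const MeshOutputLayout *l)
{
    uint64_t perVertex = (uint64_t)l->vertexScalars +
                         (uint64_t)l->vertexScalarsPerView * l->views;
    uint64_t perPrimitive = (uint64_t)l->primitiveScalars +
                            (uint64_t)l->primitiveScalarsPerView * l->views;

    return perVertex * 4 * align_up(l->maxVertices, l->vertexGranularity) +
           perPrimitive * 4 * align_up(l->maxPrimitives, l->primitiveGranularity);
}
```

For example, with one view, 8 per-vertex scalars (two locations), 64 vertices, 4 per-primitive scalars (one location), 126 primitives, and both granularities equal to 32, the primitive count aligns to 128 and the total is 8 × 4 × 64 + 4 × 4 × 128 = 4096 bytes.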

25.6. Mesh Shader Per-View Outputs

The mesh shader outputs decorated with the PositionPerViewNV, ClipDistancePerViewNV, CullDistancePerViewNV, LayerPerViewNV, and ViewportMaskPerViewNV built-in decorations are the per-view versions of the single-view variables with equivalent names (that is, Position, ClipDistance, CullDistance, Layer, and ViewportMaskNV, respectively). If a shader statically assigns a value to any element of a per-view array, it must not statically assign a value to the equivalent single-view variable.

Each of these outputs is considered arrayed, with separate values for each view. The view number is used to index the first dimension of these arrays.

The second dimension of the ClipDistancePerViewNV and CullDistancePerViewNV arrays has the same requirements as the ClipDistance and CullDistance arrays.

If a mesh shader output is per-view, the corresponding fragment shader input is taken from the element of the per-view output array that corresponds to the view that is currently being processed by the fragment shader.

These per-view outputs are available only in shaders using the MeshNV Execution Model. They are not available in shaders using the MeshEXT Execution Model.

25.7. Mesh Shader Primitive Ordering

The following guarantees are provided for the relative ordering of primitives produced by a mesh shader, as they pertain to primitive order.

  • When a task shader is used, mesh workgroups spawned from lower tasks will be ordered prior to those workgroups from subsequent tasks.

  • All output primitives generated from a given mesh workgroup are passed to subsequent pipeline stages before any output primitives generated from subsequent input workgroups.

  • All output primitives within a mesh workgroup will be generated in the ordering provided by the built-in primitive index buffer (from low address to high address).
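One way to read the three guarantees together is as a lexicographic ordering on (task, mesh workgroup, position in the index buffer). The C sketch below expresses that reading as a comparator. The `PrimitiveKey` struct and function names are hypothetical, and ordering mesh workgroups within one task by their workgroup index is an assumption of this sketch rather than something stated in the list above.

```c
#include <stdint.h>

/* Hypothetical key identifying where a primitive came from. */
typedef struct PrimitiveKey {
    uint32_t task;      /* task workgroup index (0 when no task shader is used) */
    uint32_t meshGroup; /* mesh workgroup index (assumed ordering within a task) */
    uint32_t primitive; /* position in the primitive index buffer */
} PrimitiveKey;

/* Three-way comparison of two unsigned values without overflow. */
static int cmp_u32(uint32_t a, uint32_t b)
{
    return (a > b) - (a < b);
}

/* Negative if a precedes b, positive if b precedes a, zero if equal. */
static int primitive_order_compare(const PrimitiveKey *a, const PrimitiveKey *b)
{
    int c = cmp_u32(a->task, b->task);
    if (c != 0) {
        return c; /* lower tasks come first */
    }
    c = cmp_u32(a->meshGroup, b->meshGroup);
    if (c != 0) {
        return c; /* a workgroup's primitives precede later workgroups */
    }
    return cmp_u32(a->primitive, b->primitive); /* low to high index position */
}
```

Under this model, primitive 5 of mesh workgroup 1 in task 0 precedes every primitive spawned from task 1, and also precedes primitive 6 of its own workgroup.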