© Copyright 2014-2017 The Khronos Group Inc. All Rights Reserved.
This specification is protected by copyright laws and contains material proprietary to the Khronos Group, Inc. It or any components may not be reproduced, republished, distributed, transmitted, displayed, broadcast, or otherwise exploited in any manner without the express prior written permission of Khronos Group. You may use this specification for implementing the functionality therein, without altering or removing any trademark, copyright or other notice from the specification, but the receipt or possession of this specification does not convey any rights to reproduce, disclose, or distribute its contents, or to manufacture, use, or sell anything that it may describe, in whole or in part.
Khronos Group grants express permission to any current Promoter, Contributor or Adopter member of Khronos to copy and redistribute UNMODIFIED versions of this specification in any fashion, provided that NO CHARGE is made for the specification and the latest available update of the specification for any version of the API is used whenever possible. Such distributed specification may be reformatted AS LONG AS the contents of the specification are not changed in any way. The specification may be incorporated into a product that is sold as long as such product includes significant independent work developed by the seller. A link to the current version of this specification on the Khronos Group website should be included whenever possible with specification distributions.
Khronos Group makes no, and expressly disclaims any, representations or warranties, express or implied, regarding this specification, including, without limitation, any implied warranties of merchantability or fitness for a particular purpose or noninfringement of any intellectual property. Khronos Group makes no, and expressly disclaims any, warranties, express or implied, regarding the correctness, accuracy, completeness, timeliness, and reliability of the specification. Under no circumstances will the Khronos Group, or any of its Promoters, Contributors or Members or their respective partners, officers, directors, employees, agents, or representatives be liable for any damages, whether direct, indirect, special or consequential damages for lost revenues, lost profits, or otherwise, arising from or in connection with these materials.
Khronos, SYCL, SPIR, WebGL, EGL, COLLADA, StreamInput, OpenVX, OpenKCam, glTF, OpenKODE, OpenVG, OpenWF, OpenSL ES, OpenMAX, OpenMAX AL, OpenMAX IL and OpenMAX DL are trademarks and WebCL is a certification mark of the Khronos Group Inc. OpenCL is a trademark of Apple Inc. and OpenGL and OpenML are registered trademarks and the OpenGL ES and OpenGL SC logos are trademarks of Silicon Graphics International used under license by Khronos. All other product names, trademarks, and/or company names are used solely for identification and belong to their respective owners.
Contributors and Acknowledgments
Connor Abbott, Intel
Alexey Bader, Intel
Dan Baker, Oxide Games
Kenneth Benzie, Codeplay
Gordon Brown, Codeplay
Pat Brown, NVIDIA
Diana Po-Yu Chen, MediaTek
Stephen Clarke, Imagination
Patrick Doane, Blizzard Entertainment
Stefanus Du Toit, Google
Tim Foley, Intel
Ben Gaster, Qualcomm
Alexander Galazin, ARM
Christopher Gautier, ARM
Neil Henning, Codeplay
Kerch Holt, NVIDIA
Lee Howes, Qualcomm
Roy Ju, MediaTek
Daniel Koch, NVIDIA
Ashwin Kolhe, NVIDIA
Raun Krisch, Intel
Graeme Leese, Broadcom
Yuan Lin, NVIDIA
Yaxun Liu, AMD
Timothy Lottes, Epic Games
John McDonald, Valve
David Neto, Google
Christophe Riccio, Unity
Andrew Richards, Codeplay
Ian Romanick, Intel
Graham Sellers, AMD
Robert Simpson, Qualcomm
Brian Sumner, AMD
Andrew Woloszyn, Google
Weifeng Zhang, Qualcomm
Note: Up-to-date HTML and PDF versions of this specification may be found at the Khronos SPIR-V Registry (https://www.khronos.org/registry/spir-v/).
1. Introduction
Abstract
SPIR-V is a simple binary intermediate language for graphical shaders and compute kernels. A SPIR-V module contains multiple entry points with potentially shared functions in the entry point’s call trees. Each function contains a control-flow graph (CFG) of basic blocks, with optional instructions to express structured control flow. Load/store instructions are used to access declared variables, which includes all input/output (IO). Intermediate results bypassing load/store use static single-assignment (SSA) representation. Data objects are represented logically, with hierarchical type information: There is no flattening of aggregates or assignment to physical register banks, etc. Selectable addressing models establish whether general pointer operations may be used, or if memory access is purely logical.
This document fully defines SPIR-V, a Khronos-standard binary intermediate language for representing graphical-shader stages and compute kernels for multiple Khronos APIs.
1.1. Goals
SPIR-V has the following goals:
-
Provide a simple binary intermediate language for all functionality appearing in Khronos shaders/kernels.
-
Have a concise, transparent, self-contained specification (sections Specification and Binary Form).
-
Map easily to other intermediate languages.
-
Be the form passed by an API into a driver to set shaders/kernels.
-
Be targetable by new front ends for novel high-level languages.
-
Allow the first steps of compilation and reflection to be done offline.
-
Be low-level enough to require a reverse-engineering step to reconstruct source code.
-
Improve portability by enabling shared tools to generate or operate on it.
-
Allow separation of core specification from source-language-specific sets of built-in functions.
-
Reduce compile time during application run time. (Eliminating most of the compile time during application run time is not a goal of this intermediate language. Target-specific register allocation and scheduling are still expected to take significant time.)
-
Allow some optimizations to be done offline.
1.2. About this document
This document aims to:
-
Include everything needed to fully understand, create, and consume SPIR-V. However:
-
Imported sets of instructions (which implement source-specific built-in functions) will need their own specification.
-
Many validation rules are client-API specific, and hence are documented with the client API and not in this specification.
-
Separate expository and specification language. The specification-proper is in Specification and Binary Form.
1.3. Extendability
SPIR-V can be extended by multiple vendors or parties simultaneously:
-
Using the OpExtension instruction to require new semantics that must be supported. Such new semantics would come from an extension document.
-
Reserving (registering) ranges of the token values, as described further below.
-
Aided by instruction skipping, also further described below.
Enumeration Token Values. It is easy to extend all the types, storage classes, opcodes, decorations, etc. by adding to the token values.
Registration. Ranges of token values in the Binary Form section can be pre-allocated to numerous vendors/parties. This allows combining multiple independent extensions without conflict. To register ranges, see https://www.khronos.org/registry/spir-v/api/spir-v.xml.
Extended Instructions. Sets of extended instructions can be provided and specified in separate specifications. These help personalize SPIR-V for different source languages or execution environments (client APIs). Multiple sets of extended instructions can be imported without conflict, as the extended instructions are selected by {set id, instruction number} pairs.
Instruction Skipping. Tools are encouraged to skip opcodes for features they are not required to process. This is trivially enabled by the word count in an instruction, which makes it easier to add new instructions without breaking existing tools.
1.4. Debuggability
SPIR-V can decorate, with a text string, virtually anything created in the shader: types, variables, functions, etc. This is required for externally visible symbols, and also allowed for naming the result of any instruction. This can be used to aid in understandability when disassembling or debugging lowered versions of SPIR-V.
Location information (file names, lines, and columns) can be interleaved with the instruction stream to track the origin of each instruction.
1.5. Design Principles
Regularity. All instructions start with a word count. This allows walking a SPIR-V module without decoding each opcode. All instructions have an opcode that dictates for all operands what kind of operand they are. For instructions with a variable number of operands, the number of variable operands is known by subtracting the number of non-variable words from the instruction’s word count.
Non Combinatorial. There is no combinatorial type explosion or need for large encode/decode tables for types. Rather, types are parameterized. Image types declare their dimensionality, arrayness, etc. all orthogonally, which greatly simplifies code. This is done similarly for other types. It also applies to opcodes. Operations are orthogonal to scalar/vector size, but not to integer vs. floating-point differences.
Modeless. After a given execution model (e.g., pipeline stage) is specified, internal operation is essentially modeless: Generally, it will follow the rule: "same spelling, same semantics", and does not have mode bits that modify semantics. If a change to SPIR-V modifies semantics, it should use a different spelling. This makes consumers of SPIR-V much more robust. There are execution modes declared, but these are generally to affect the way the module interacts with the environment around it, not the internal semantics. Capabilities are also declared, but this is to declare the subset of functionality that is used, not to change any semantics of what is used.
Declarative. SPIR-V declares externally-visible modes like "writes depth", rather than having rules that require deduction from full shader inspection. It also explicitly declares what addressing modes, execution model, extended instruction sets, etc. will be used. See Language Capabilities for more information.
SSA. All results of intermediate operations are strictly SSA. However, declared variables reside in memory and use load/store for access, and such variables can be stored to multiple times.
IO. Some storage classes are for input/output (IO) and, fundamentally, IO will be done through load/store of variables declared in these storage classes.
1.6. Static Single Assignment (SSA)
SPIR-V includes a phi instruction to allow the merging together of intermediate results from split control flow. This allows split control flow without load/store to memory. SPIR-V is flexible in the degree to which load/store is used; it is possible to use control flow with no phi-instructions, while still staying in SSA form, by using memory load/store.
Some storage classes are for IO and, fundamentally, IO will be done through load/store, and initial load and final store can never be eliminated. Other storage classes are shader local and can have their load/store eliminated. It can be considered an optimization to largely eliminate such loads/stores by moving them into intermediate results in SSA form.
1.7. Built-In Variables
SPIR-V identifies built-in variables from a high-level language with an enumerant decoration. This assigns any unusual semantics to the variable. Built-in variables must otherwise be declared with their correct SPIR-V type and treated the same as any other variable.
1.8. Specialization
Specialization enables creating a portable SPIR-V module outside the target execution environment, based on constant values that won’t be known until inside the execution environment. For example, it can be used to size a fixed array with a constant that is not known when the module is created, but is known when the module is lowered to the target architecture.
See Specialization in the next section for more details.
1.9. Example
The SPIR-V form is binary, not human readable, and fully described in Binary Form. This is an example disassembly to give a basic idea of what SPIR-V looks like:
GLSL fragment shader:
#version 450

in vec4 color1;
in vec4 multiplier;
noperspective in vec4 color2;
out vec4 color;

struct S {
    bool b;
    vec4 v[5];
    int i;
};

uniform blockName {
    S s;
    bool cond;
};

void main()
{
    vec4 scale = vec4(1.0, 1.0, 2.0, 1.0);

    if (cond)
        color = color1 + s.v[2];
    else
        color = sqrt(color2) * scale;

    for (int i = 0; i < 4; ++i)
        color *= multiplier;
}
Corresponding SPIR-V:
; Magic:     0x07230203 (SPIR-V)
; Version:   0x00010000 (Version: 1.0.0)
; Generator: 0x00080001 (Khronos Glslang Reference Front End; 1)
; Bound:     63
; Schema:    0

       OpCapability Shader
  %1 = OpExtInstImport "GLSL.std.450"
       OpMemoryModel Logical GLSL450
       OpEntryPoint Fragment %4 "main" %31 %33 %42 %57
       OpExecutionMode %4 OriginLowerLeft

; Debug information
       OpSource GLSL 450
       OpName %4 "main"
       OpName %9 "scale"
       OpName %17 "S"
       OpMemberName %17 0 "b"
       OpMemberName %17 1 "v"
       OpMemberName %17 2 "i"
       OpName %18 "blockName"
       OpMemberName %18 0 "s"
       OpMemberName %18 1 "cond"
       OpName %20 ""
       OpName %31 "color"
       OpName %33 "color1"
       OpName %42 "color2"
       OpName %48 "i"
       OpName %57 "multiplier"

; Annotations (non-debug)
       OpDecorate %15 ArrayStride 16
       OpMemberDecorate %17 0 Offset 0
       OpMemberDecorate %17 1 Offset 16
       OpMemberDecorate %17 2 Offset 96
       OpMemberDecorate %18 0 Offset 0
       OpMemberDecorate %18 1 Offset 112
       OpDecorate %18 Block
       OpDecorate %20 DescriptorSet 0
       OpDecorate %42 NoPerspective

; All types, variables, and constants
  %2 = OpTypeVoid
  %3 = OpTypeFunction %2                      ; void ()
  %6 = OpTypeFloat 32                         ; 32-bit float
  %7 = OpTypeVector %6 4                      ; vec4
  %8 = OpTypePointer Function %7              ; function-local vec4*
 %10 = OpConstant %6 1
 %11 = OpConstant %6 2
 %12 = OpConstantComposite %7 %10 %10 %11 %10 ; vec4(1.0, 1.0, 2.0, 1.0)
 %13 = OpTypeInt 32 0                         ; 32-bit int, sign-less
 %14 = OpConstant %13 5
 %15 = OpTypeArray %7 %14
 %16 = OpTypeInt 32 1
 %17 = OpTypeStruct %13 %15 %16
 %18 = OpTypeStruct %17 %13
 %19 = OpTypePointer Uniform %18
 %20 = OpVariable %19 Uniform
 %21 = OpConstant %16 1
 %22 = OpTypePointer Uniform %13
 %25 = OpTypeBool
 %26 = OpConstant %13 0
 %30 = OpTypePointer Output %7
 %31 = OpVariable %30 Output
 %32 = OpTypePointer Input %7
 %33 = OpVariable %32 Input
 %35 = OpConstant %16 0
 %36 = OpConstant %16 2
 %37 = OpTypePointer Uniform %7
 %42 = OpVariable %32 Input
 %47 = OpTypePointer Function %16
 %55 = OpConstant %16 4
 %57 = OpVariable %32 Input

; All functions
  %4 = OpFunction %2 None %3                  ; main()
  %5 = OpLabel
  %9 = OpVariable %8 Function
 %48 = OpVariable %47 Function
       OpStore %9 %12
 %23 = OpAccessChain %22 %20 %21              ; location of cond
 %24 = OpLoad %13 %23                         ; load 32-bit int from cond
 %27 = OpINotEqual %25 %24 %26                ; convert to bool
       OpSelectionMerge %29 None              ; structured if
       OpBranchConditional %27 %28 %41        ; if cond
 %28 = OpLabel                                ; then
 %34 = OpLoad %7 %33
 %38 = OpAccessChain %37 %20 %35 %21 %36      ; s.v[2]
 %39 = OpLoad %7 %38
 %40 = OpFAdd %7 %34 %39
       OpStore %31 %40
       OpBranch %29
 %41 = OpLabel                                ; else
 %43 = OpLoad %7 %42
 %44 = OpExtInst %7 %1 Sqrt %43               ; extended instruction sqrt
 %45 = OpLoad %7 %9
 %46 = OpFMul %7 %44 %45
       OpStore %31 %46
       OpBranch %29
 %29 = OpLabel                                ; endif
       OpStore %48 %35
       OpBranch %49
 %49 = OpLabel
       OpLoopMerge %51 %52 None               ; structured loop
       OpBranch %53
 %53 = OpLabel
 %54 = OpLoad %16 %48
 %56 = OpSLessThan %25 %54 %55                ; i < 4 ?
       OpBranchConditional %56 %50 %51        ; body or break
 %50 = OpLabel                                ; body
 %58 = OpLoad %7 %57
 %59 = OpLoad %7 %31
 %60 = OpFMul %7 %59 %58
       OpStore %31 %60
       OpBranch %52
 %52 = OpLabel                                ; continue target
 %61 = OpLoad %16 %48
 %62 = OpIAdd %16 %61 %21                     ; ++i
       OpStore %48 %62
       OpBranch %49                           ; loop back
 %51 = OpLabel                                ; loop merge point
       OpReturn
       OpFunctionEnd
2. Specification
2.1. Language Capabilities
A SPIR-V module is consumed by an execution environment, specified by a client API, that needs to support the features used by that SPIR-V module. Features are classified through capabilities. Capabilities used by a particular SPIR-V module must be declared early in that module with the OpCapability instruction. Then:
-
A validator can validate that the module uses only its declared capabilities.
-
An execution environment is allowed to reject modules declaring capabilities it does not support. (See client API specifications for environment-specific rules.)
All available capabilities and their dependencies form a capability hierarchy, fully listed in the capability section. Only top-level capabilities need to be explicitly declared; their dependencies are implicitly declared.
When an instruction, enumerant, or other feature specifies multiple enabling capabilities, only one such capability needs to be declared to use the feature. This declaration does not itself imply anything about the presence of the other enabling capabilities: The execution environment needs to support only the declared capability.
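The two rules above (implicit declaration of dependencies, and "any one enabling capability suffices") can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `IMPLIES` table is a deliberately tiny, hand-written slice of the capability hierarchy, and the helper names `declared_closure`, `feature_enabled`, and `environment_accepts` are hypothetical, not part of SPIR-V or any tool.

```python
# Assumption: a tiny, incomplete slice of the capability hierarchy, written by
# hand for illustration. The capability section holds the full list.
IMPLIES = {
    "Matrix": [],
    "Shader": ["Matrix"],
    "Geometry": ["Shader"],
    "Tessellation": ["Shader"],
}

def declared_closure(declared):
    """Declared capabilities plus everything they implicitly declare."""
    seen = set()
    stack = list(declared)
    while stack:
        cap = stack.pop()
        if cap not in seen:
            seen.add(cap)
            stack.extend(IMPLIES.get(cap, []))
    return seen

def feature_enabled(enabling_caps, declared):
    """A feature is usable if at least one enabling capability is declared."""
    return bool(set(enabling_caps) & declared_closure(declared))

def environment_accepts(declared, supported):
    """The environment must support only what the module (implicitly) declares."""
    return declared_closure(declared) <= set(supported)
```

For example, a module declaring only `Geometry` may use a feature enabled by either `Geometry` or `Tessellation`, and the environment does not need to support `Tessellation` to accept it.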
This (SPIR-V) specification provides capability-specific validation rules, in the validation section. To ensure portability, each client API needs to include the following:
-
Which capabilities in the capability section it requires environments to support, and hence allows in SPIR-V modules.
-
Required limits, if they are beyond the Universal Limits.
-
Any validation requirements specific to the environment that are not tied to specific capabilities, and hence not covered in the SPIR-V specification.
2.2. Terms
2.2.1. Instructions
<id>: A numerical name; the name used to refer to an object, a type, a function, a label, etc. An <id> always consumes one word. The <id>s defined by a module obey SSA.
Result <id>: Most instructions define a result, named by an <id> explicitly provided in the instruction. The Result <id> is used as an operand in other instructions to refer to the instruction that defined it.
Literal String: A nul-terminated stream of characters consuming an integral number of words. The character set is Unicode in the UTF-8 encoding scheme. The UTF-8 octets (8-bit bytes) are packed four per word, following the little-endian convention (i.e., the first octet is in the lowest-order 8 bits of the word). The final word contains the string’s nul-termination character (0), and all contents past the end of the string in the final word are padded with 0.
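As a rough sketch of the packing rule just described, the following Python helpers convert between a text string and its word sequence. The function names `encode_literal_string` and `decode_literal_string` are hypothetical and chosen only for this illustration.

```python
import struct

def encode_literal_string(text):
    """Pack a string into 32-bit words: UTF-8, nul-terminated, zero-padded."""
    data = text.encode("utf-8") + b"\x00"   # at least one nul terminator
    data += b"\x00" * (-len(data) % 4)       # zero-pad the final word
    # "<I" unpacking puts the first octet in the lowest-order 8 bits of a word.
    return list(struct.unpack("<%dI" % (len(data) // 4), data))

def decode_literal_string(words):
    """Recover the string from its words, stopping at the first nul octet."""
    data = struct.pack("<%dI" % len(words), *words)
    return data.split(b"\x00", 1)[0].decode("utf-8")
```

Note that a string whose UTF-8 length is already a multiple of four, such as "main", still needs one extra word to hold the nul terminator.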
Literal Number: A numeric value consuming one or more words. An instruction will determine what type a literal will be interpreted as. When the type’s bit width is larger than one word, the literal’s low-order words appear first. When the type’s bit width is less than 32 bits, the literal’s value appears in the low-order bits of the word, and the high-order bits must be 0 for a floating-point type, or 0 for an integer type with Signedness of 0, or sign extended when Signedness is 1. (Similarly for the remaining bits of widths larger than 32 bits but not a multiple of 32 bits.)
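For integer types, the word layout described above might be sketched as follows. The helper `encode_int_literal` is a hypothetical name used only here, and floating-point literals are not covered by this sketch.

```python
def encode_int_literal(value, width, signed):
    """Encode an integer literal as 32-bit words, low-order word first."""
    nwords = (width + 31) // 32
    lo = -(1 << (width - 1)) if signed else 0
    hi = (1 << (width - 1)) - 1 if signed else (1 << width) - 1
    if not lo <= value <= hi:
        raise ValueError("value does not fit the declared type")
    # Python's & on a negative int acts like infinite two's complement, so
    # masking to the full word span sign-extends signed values; non-negative
    # values get zeros in the unused high-order bits.
    bits = value & ((1 << (32 * nwords)) - 1)
    return [(bits >> (32 * i)) & 0xFFFFFFFF for i in range(nwords)]
```

A 16-bit signed -1 therefore occupies one word with all bits set, while a 16-bit unsigned 0xFFFF leaves the upper 16 bits of the word zero.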
Operand: A one-word argument to an instruction. E.g., it could be an <id>, or a (part of a) literal. Which form it holds is always explicitly known from the opcode.
Immediate: Operand(s) directly holding a literal value rather than an <id>. Immediate values larger than one word will consume multiple operands, one per word. That is, operand counting is always done per word, not per immediate.
WordCount: The complete number of words taken by an instruction, including the word holding the word count and opcode, and any optional operands. An instruction’s word count is the total space taken by the instruction.
Instruction: After a header, a module is simply a linear list of instructions. An instruction contains a word count, an opcode, an optional Result <id>, an optional <id> of the instruction’s type, and a variable list of operands. All instruction opcodes and semantics are listed in Instructions.
Decoration: Auxiliary information such as built-in variable, stream numbers, invariance, interpolation type, relaxed precision, etc., added to <id>s or structure-type members through Decorations. Decorations are enumerated in Decoration in the Binary Form section.
Object: An instantiation of a non-void type, either as the Result <id> of an operation, or created through OpVariable.
Memory Object: An object created through OpVariable. Such an object can die on function exit, if it was a function variable, or exist for the duration of an entry point.
Intermediate Object or Intermediate Value or Intermediate Result: An object created by an operation (not memory allocated by OpVariable) and dying on its last consumption.
2.2.2. Types
Boolean type: The type returned by OpTypeBool.
Integer type: Any width signed or unsigned type from OpTypeInt. By convention, the lowest-order bit will be referred to as bit-number 0, and the highest-order bit as bit-number Width - 1.
Floating-point type: Any width type from OpTypeFloat.
Numerical type: An integer type or a floating-point type.
Scalar: A single instance of a numerical type or Boolean type. Scalars will also be called components when being discussed either by themselves or in the context of the contents of a vector.
Vector: An ordered homogeneous collection of two or more scalars. Vector sizes are quite restrictive and dependent on the execution model.
Matrix: An ordered homogeneous collection of vectors. When vectors are part of a matrix, they will also be called columns. Matrix sizes are quite restrictive and dependent on the execution model.
Array: An ordered homogeneous collection of any non-void-type objects. When an object is part of an array, it will also be called an element. Array sizes are generally not restricted.
Structure: An ordered heterogeneous collection of any non-void types. When an object is part of a structure, it will also be called a member.
Image: A traditional texture or image; SPIR-V has this single name for these. An image type is declared with OpTypeImage. An image does not include any information about how to access, filter, or sample it.
Sampler: Settings that describe how to access, filter, or sample an image. Can come either from literal declarations of settings or be an opaque reference to externally bound settings. A sampler does not include an image.
Sampled Image: An image combined with a sampler, enabling filtered accesses of the image’s contents.
Concrete Type: A numerical scalar, vector, or matrix type, or OpTypePointer when using a Physical addressing model, or any aggregate containing only these types.
Abstract Type: An OpTypeVoid or OpTypeBool, or OpTypePointer when using the Logical addressing model, or any aggregate type containing any of these.
2.2.3. Module
Module: A single unit of SPIR-V. It can contain multiple entry points, but only one set of capabilities.
Entry Point: A function in a module where execution begins. A single entry point is limited to a single execution model. An entry point is declared using OpEntryPoint.
Execution Model: A graphical-pipeline stage or OpenCL kernel. These are enumerated in Execution Model.
Execution Mode: Modes of operation relating to the interface or execution environment of the module. These are enumerated in Execution Mode. Generally, modes do not change the semantics of instructions within a SPIR-V module.
2.2.4. Control Flow
Block: A contiguous sequence of instructions starting with an OpLabel, ending with a termination instruction. A block has no additional label or termination instructions.
Branch Instruction: One of the following, used as a termination instruction:
-
OpBranch
-
OpBranchConditional
-
OpSwitch
-
OpReturn
-
OpReturnValue
Dominate: A block A dominates a block B, where A and B are in the same function, if every path from the function’s entry point to block B includes block A. A strictly dominates B only if A dominates B and A and B are different blocks.
Post Dominate: A block B post dominates a block A, where A and B are in the same function, if every path from A to a function-return instruction goes through block B.
Control-Flow Graph: The graph formed by a function’s blocks and branches. The blocks are the graph’s nodes, and the branches the graph’s edges.
CFG: Control-flow graph.
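The dominance relation defined above can be computed with a standard iterative data-flow pass. The following is a minimal sketch, not a method prescribed by this specification: the CFG is assumed to be given as a Python dict from a block's label to its successor labels, and `dominators` is a hypothetical helper name. The sample graph mirrors the labels of the fragment-shader disassembly in the Example section.

```python
def dominators(cfg, entry):
    """Map each reachable block to the set of blocks that dominate it."""
    # Only blocks reachable from the function's first block are considered.
    reachable, stack = set(), [entry]
    while stack:
        block = stack.pop()
        if block not in reachable:
            reachable.add(block)
            stack.extend(cfg.get(block, []))
    preds = {b: set() for b in reachable}
    for b in reachable:
        for succ in cfg.get(b, []):
            preds[succ].add(b)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {b: set(reachable) for b in reachable}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in reachable - {entry}:
            new = {b} | set.intersection(*(dom[p] for p in preds[b]))
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom

# Blocks and branches of main() in the example disassembly.
EXAMPLE_CFG = {5: [28, 41], 28: [29], 41: [29], 29: [49], 49: [53],
               53: [50, 51], 50: [52], 52: [49], 51: []}
```

Block A strictly dominates block B when `A in dom[B] and A != B`. In the sample graph, neither the "then" block (28) nor the "else" block (41) dominates the merge block (29), because a path exists through the other one.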
Back Edge: If a depth-first traversal is done on a function’s CFG, starting from the first block of the function, a back edge is a branch to a previously visited block. A back-edge block is the block containing such a branch.
Merge Instruction: One of the following, used before a branch instruction to declare structured control flow:
-
OpSelectionMerge
-
OpLoopMerge
Header Block: A block containing a merge instruction.
Loop Header: A header block whose merge instruction is an OpLoopMerge.
Merge Block: A block declared by the Merge Block operand of a merge instruction.
Break Block: A block containing a branch to the Merge Block of a loop header’s merge instruction.
Continue Block: A block containing a branch to an OpLoopMerge instruction’s Continue Target.
Return Block: A block containing an OpReturn or OpReturnValue branch.
Invocation: A single execution of an entry point in a SPIR-V module, operating only on the amount of data explicitly exposed by the semantics of the instructions. (Any implicit operation on additional instances of data would comprise additional invocations.) For example, in compute execution models, a single invocation operates only on a single work item, or, in a vertex execution model, a single invocation operates only on a single vertex.
Subgroup: The set of invocations exposed as running concurrently with the current invocation. In compute models, the current workgroup is a superset of the subgroup.
Invocation Group: The complete set of invocations collectively processing a particular compute workgroup or graphical operation, where the scope of a "graphical operation" is implementation dependent, but at least as large as a single point, line, triangle, or patch, and at most as large as a single rendering command, as defined by the client API.
Derivative Group: Defined only for the Fragment Execution Model: The set of invocations collectively processing a single point, line, or triangle, including any helper invocations.
Dynamic Instance: Within a single invocation, a single static instruction can be executed multiple times, giving multiple dynamic instances of that instruction. This can happen when the instruction is executed in a loop, or in a function called from multiple call sites, or combinations of multiple of these. Different loop iterations and different dynamic function-call-site chains yield different dynamic instances of such an instruction. Dynamic instances are distinguished by the control-flow path within an invocation, not by which invocation executed it. That is, different invocations of an entry point execute the same dynamic instances of an instruction when they follow the same control-flow path, starting from that entry point.
Dynamically Uniform: An <id> is dynamically uniform for a dynamic instance consuming it when its value is the same for all invocations (in the invocation group) that execute that dynamic instance.
Uniform Control Flow: Uniform control flow (or converged control flow) occurs when all invocations in the invocation group or derivative group execute the same control-flow path (and hence the same sequence of dynamic instances of instructions). Uniform control flow is the initial state at the entry point, and lasts until a conditional branch takes different control paths for different invocations (non-uniform or divergent control flow). Such divergence can reconverge, with all the invocations once again executing the same control-flow path, and this re-establishes the existence of uniform control flow. If control flow is uniform upon entry into a header block, and all invocations leave that dynamic instance of the header block’s control-flow construct via the header block’s declared merge block, then control flow reconverges to be uniform at that merge block.
2.3. Physical Layout of a SPIR-V Module and Instruction
A SPIR-V module is a single linear stream of words. The first words are shown in the following table:
Word Number | Contents |
---|---|
0 | Magic Number. |
1 | Version number. The bytes are, high-order to low-order: 0, Major Number, Minor Number, 0. Hence, version 1.00 is the value 0x00010000. |
2 | Generator’s magic number. It is associated with the tool that generated the module. Its value does not affect any semantics, and is allowed to be 0. Using a non-0 value is encouraged, and can be registered with Khronos at https://www.khronos.org/registry/spir-v/api/spir-v.xml. |
3 | Bound; where all <id>s in this module are guaranteed to satisfy 0 < id < Bound. Bound should be small, smaller is better, with all <id>s in a module being densely packed and near 0. |
4 | 0 (Reserved for instruction schema, if needed.) |
5 | First word of instruction stream, see below. |
All remaining words are a linear sequence of instructions.
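A minimal sketch of reading the five header words follows. It assumes the module is available as a bytes object whose words were stored little-endian, which is an assumption of this example rather than something stated in this section; `parse_header` is a hypothetical helper name. The magic value is the one shown in the example disassembly.

```python
import struct

SPIRV_MAGIC = 0x07230203  # value shown in the example disassembly

def parse_header(blob):
    """Decode the five header words of a module (little-endian assumed)."""
    if len(blob) < 20:
        raise ValueError("a module starts with five 32-bit header words")
    magic, version, generator, bound, schema = struct.unpack_from("<5I", blob, 0)
    if magic != SPIRV_MAGIC:
        raise ValueError("unexpected magic number 0x%08X" % magic)
    return {
        # Bytes, high-order to low-order: 0 | major | minor | 0
        "version": ((version >> 16) & 0xFF, (version >> 8) & 0xFF),
        "generator": generator,
        "bound": bound,    # every <id> satisfies 0 < id < bound
        "schema": schema,  # reserved, expected to be 0
    }
```

Applied to the header of the example module, this yields version (1, 0), generator 0x00080001, and bound 63.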
Each instruction is a stream of words:
Instruction Word Number | Contents |
---|---|
0 | Opcode: The 16 high-order bits are the WordCount of the instruction. The 16 low-order bits are the opcode enumerant. |
1 | Optional instruction type <id> (presence determined by opcode). |
. | Optional instruction Result <id> (presence determined by opcode). |
. | Operand 1 (if needed) |
. | Operand 2 (if needed) |
… | … |
WordCount - 1 | Operand N (N is determined by WordCount minus the 1 to 3 words used for the opcode, instruction type <id>, and instruction Result <id>). |
Instructions are variable length due both to having optional instruction type <id> and Result <id> words as well as a variable number of operands. The details for each specific instruction are given in the Binary Form section.
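Because every instruction leads with its word count, a tool can step through the stream without understanding each opcode. The sketch below assumes the module has already been decoded into a Python list of 32-bit integers; `iter_instructions` is a hypothetical helper name.

```python
def iter_instructions(words, start=5):
    """Yield (opcode, operand_words) for each instruction after the header."""
    i = start
    while i < len(words):
        word_count = words[i] >> 16    # 16 high-order bits
        opcode = words[i] & 0xFFFF     # 16 low-order bits
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("malformed instruction at word %d" % i)
        # Everything after word 0 is returned undecoded; the caller decides
        # which of these words are type <id>, Result <id>, or operands.
        yield opcode, words[i + 1 : i + word_count]
        i += word_count
```

A consumer that does not recognize an opcode can simply ignore the yielded item and continue with the next instruction.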
2.4. Logical Layout of a Module
The instructions of a SPIR-V module must be in the following order. For sections earlier than function definitions, it is invalid to use instructions other than those indicated.
-
All OpCapability instructions.
-
Optional OpExtension instructions (extensions to SPIR-V).
-
Optional OpExtInstImport instructions.
-
The single required OpMemoryModel instruction.
-
All entry point declarations, using OpEntryPoint.
-
All execution mode declarations, using OpExecutionMode.
-
These debug instructions, which must be in the following order:
-
all OpString, OpSourceExtension, OpSource, and OpSourceContinued, without forward references.
-
all OpName and all OpMemberName
-
All annotation instructions:
-
all decoration instructions (OpDecorate, OpMemberDecorate, OpGroupDecorate, OpGroupMemberDecorate, and OpDecorationGroup).
-
All type declarations (OpTypeXXX instructions), all constant instructions, and all global variable declarations (all OpVariable instructions whose Storage Class is not Function). This is the preferred location for OpUndef instructions, though they can also appear in function bodies. All operands in all these instructions must be declared before being used. Otherwise, they can be in any order. This section is the first section to allow use of OpLine debug information.
-
All function declarations ("declarations" are functions without a body; there is no forward declaration to a function with a body). A function declaration is as follows.
-
Function declaration, using OpFunction.
-
Function parameter declarations, using OpFunctionParameter.
-
Function end, using OpFunctionEnd.
-
All function definitions (functions with a body). A function definition is as follows.
-
Function definition, using OpFunction.
-
Function parameter declarations, using OpFunctionParameter.
-
Block
-
Block
-
…
-
Function end, using OpFunctionEnd.
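The ordering of the sections that precede function declarations can be checked with a single pass over the opcode names. The following is a simplified sketch under stated assumptions: it works on opcode names rather than binary opcodes, it treats any name starting with OpType, OpConstant, or OpSpecConstant as belonging to the types/constants/globals section, and it does not check other rules such as OpMemoryModel appearing exactly once. The names `SECTION_OF`, `section_index`, and `check_preamble_order` are hypothetical.

```python
# Section numbers follow the order of the list above.
SECTION_OF = {
    "OpCapability": 0, "OpExtension": 1, "OpExtInstImport": 2,
    "OpMemoryModel": 3, "OpEntryPoint": 4, "OpExecutionMode": 5,
    "OpString": 6, "OpSourceExtension": 6, "OpSource": 6,
    "OpSourceContinued": 6,
    "OpName": 7, "OpMemberName": 7,
    "OpDecorate": 8, "OpMemberDecorate": 8, "OpGroupDecorate": 8,
    "OpGroupMemberDecorate": 8, "OpDecorationGroup": 8,
}
TYPES_AND_GLOBALS = 9

def section_index(name):
    """Return the section number for an opcode name, or None if not allowed."""
    if name in SECTION_OF:
        return SECTION_OF[name]
    if name.startswith(("OpType", "OpConstant", "OpSpecConstant")):
        return TYPES_AND_GLOBALS
    if name in ("OpVariable", "OpUndef", "OpLine"):
        return TYPES_AND_GLOBALS
    return None

def check_preamble_order(opcode_names):
    """Index of the first misplaced instruction before OpFunction, else None."""
    current = 0
    for pos, name in enumerate(opcode_names):
        if name == "OpFunction":
            return None  # functions begin; the preamble was in order
        section = section_index(name)
        if section is None or section < current:
            return pos
        current = section
    return None
```

The check relies only on section numbers never decreasing, so optional sections can be absent without special handling.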
Within a function definition:
-
A block always starts with an OpLabel instruction. This may be immediately preceded by an OpLine instruction, but the OpLabel is considered as the beginning of the block.
-
A block always ends with a termination instruction (see validation rules for more detail).
-
All OpVariable instructions in a function must have a Storage Class of Function.
-
All OpVariable instructions in a function must be in the first block in the function. These instructions, together with any immediately preceding OpLine instructions, must be the first instructions in that block. (Note the validation rules prevent OpPhi instructions in the first block of a function.)
-
A function definition (starts with OpFunction) can be immediately preceded by an OpLine instruction.
Forward references (an operand <id> that appears before the Result <id> defining it) are allowed for:
-
Operands that are an OpFunction. This allows for recursion and early declaration of entry points.
-
Annotation-instruction operands. This is required to fully know everything about a type or variable once it is declared.
-
Labels.
-
Loops can have forward references to a phi function.
-
An OpTypeForwardPointer has a forward reference to an OpTypePointer.
-
An OpTypeStruct operand that’s a forward reference to the Pointer Type operand to an OpTypeForwardPointer.
-
The list of <id> provided in the OpEntryPoint instruction.
In all cases, there is enough type information to enable a single simple pass through a module to transform it. For example, function calls have all the type information in the call, phi-functions don’t change type, and labels don’t have type. The pointer forward reference allows structures to contain pointers to themselves or to be mutually recursive (through pointers), without needing additional type information.
The Validation Rules section lists additional rules that must be satisfied.
2.5. Instructions
Most instructions create a Result <id>, as provided in the Result <id> field of the instruction. These Result <id>s are then referred to by other instructions through their <id> operands. All instruction operands are specified in the Binary Form section.
Instructions are explicit about whether they require immediates, rather than an <id> referring to some other result. This is strictly known just from the opcode.
-
An immediate 32-bit (or smaller) integer is always one operand directly holding a 32-bit two’s-complement value.
-
An immediate 32-bit float is always one operand, directly holding a 32-bit IEEE 754 floating-point representation.
-
An immediate 64-bit float is always two operands, directly holding a 64-bit IEEE 754 representation. The low-order 32 bits appear in the first operand.
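The immediate encodings above can be illustrated with a minimal Python sketch. The helper names are hypothetical, and the sketch only models how a value maps onto 32-bit operand words; it does not build complete instructions.

```python
import struct

def encode_int32(value):
    """Return the single 32-bit word holding a two's-complement integer."""
    return [value & 0xFFFFFFFF]

def encode_float32(value):
    """Return the single 32-bit word holding an IEEE 754 binary32 value."""
    (word,) = struct.unpack("<I", struct.pack("<f", value))
    return [word]

def encode_float64(value):
    """Return two 32-bit words for an IEEE 754 binary64 value, low-order word first."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    return [bits & 0xFFFFFFFF, bits >> 32]

print([hex(w) for w in encode_int32(-1)])     # ['0xffffffff']
print([hex(w) for w in encode_float32(1.0)])  # ['0x3f800000']
print([hex(w) for w in encode_float64(1.0)])  # ['0x0', '0x3ff00000']
```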
2.5.1. SSA Form
A module is always in static single assignment (SSA) form. That is, there is always exactly one instruction resulting in any particular Result <id>. Storing into variables declared in memory is not subject to this; such stores do not create Result <id>s. Accessing declared variables is done through:
-
OpVariable to allocate an object in memory and create a Result <id> that is the name of a pointer to it.
-
OpAccessChain or OpInBoundsAccessChain to create a pointer to a subpart of a composite object in memory.
-
OpLoad through a pointer, giving the loaded object a Result <id> that can then be used as an operand in other instructions.
-
OpStore through a pointer, to write a value. There is no Result <id> for an OpStore.
OpLoad and OpStore instructions can often be eliminated, using intermediate results instead. When this happens in multiple control-flow paths, these values need to be merged again at the path’s merge point. Use OpPhi to merge such values together.
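As an illustration of the SSA property, the following Python sketch checks that no Result <id> is defined twice, while repeated stores through the same pointer remain acceptable. The tuple layout and the <id> values are assumptions made for this example and are not the binary form.

```python
from collections import Counter

# Toy instruction records: (result_id_or_None, opcode, operand_ids).
# The ids and the record layout are hypothetical, not the binary form.
function_body = [
    (10, "OpVariable", []),        # pointer to a function-local object
    (None, "OpStore", [10, 5]),    # stores create no Result <id>
    (11, "OpLoad", [10]),
    (None, "OpStore", [10, 11]),   # storing again does not violate SSA
    (12, "OpLoad", [10]),
]

def duplicate_result_ids(instructions):
    """Return the sorted Result <id>s that are defined more than once."""
    counts = Counter(rid for rid, _, _ in instructions if rid is not None)
    return sorted(rid for rid, n in counts.items() if n > 1)

broken_body = function_body + [(11, "OpIAdd", [11, 12])]  # redefines <id> 11

print(duplicate_result_ids(function_body))  # []
print(duplicate_result_ids(broken_body))    # [11]
```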
2.6. Entry Point and Execution Model
The OpEntryPoint instruction identifies an entry point with two key things: an execution model and a function definition. Execution models include Vertex, GLCompute, etc. (one for each graphical stage), as well as Kernel for OpenCL kernels. For the complete list, see Execution Model. An OpEntryPoint also supplies a name that can be used externally to identify the entry point, and a declaration of all the Input and Output variables that form its input/output interface.
The static function call graphs rooted at two entry points are allowed to overlap, so that function definitions and global variable definitions can be shared. The execution model and any execution modes associated with an entry point apply to the entire static function call graph rooted at that entry point. This rule implies that a function appearing in both call graphs of two distinct entry points may behave differently in each case. Similarly, variables whose semantics depend on properties of an entry point, e.g. those using the Input Storage Class, may behave differently when used in call graphs rooted in two different entry points.
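The overlap between call graphs can be sketched in Python as a reachability computation. The function names and the call table below are hypothetical; a real tool would derive them from OpEntryPoint and OpFunctionCall instructions.

```python
# Hypothetical static call graph: function name -> functions it calls.
calls = {
    "vs_main": ["transform", "shared_math"],
    "fs_main": ["shade", "shared_math"],
    "transform": [],
    "shade": ["shared_math"],
    "shared_math": [],
}

def call_graph(root, calls):
    """Return the set of functions statically reachable from root (inclusive)."""
    seen, stack = set(), [root]
    while stack:
        fn = stack.pop()
        if fn not in seen:
            seen.add(fn)
            stack.extend(calls[fn])
    return seen

vertex_graph = call_graph("vs_main", calls)
fragment_graph = call_graph("fs_main", calls)

# Functions in both graphs run under either entry point's execution model.
shared = vertex_graph & fragment_graph
print(sorted(shared))  # ['shared_math']
```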
2.7. Execution Modes
Information such as the following is declared with OpExecutionMode instructions:
-
number of invocations (Invocations)
-
vertex-order CCW (VertexOrderCcw)
-
triangle strip generation (OutputTriangleStrip)
-
number of output vertices (OutputVertices)
-
etc.
For a complete list, see Execution Mode.
2.8. Types and Variables
Types are built up hierarchically, using OpTypeXXX instructions. The Result <id> of an OpTypeXXX instruction becomes a type <id> for future use where type <id>s are needed (therefore, OpTypeXXX instructions do not have a type <id>, like most other instructions do).
The "leaves" to start building with are types like OpTypeFloat, OpTypeInt, OpTypeImage, OpTypeEvent, etc. Other types are built up from the Result <id> of these. The numerical types are parameterized to specify bit width and signed vs. unsigned.
Higher-level types are then constructed using opcodes like OpTypeVector, OpTypeMatrix, OpTypeImage, OpTypeArray, OpTypeRuntimeArray, OpTypeStruct, and OpTypePointer. These are parameterized by number of components, array size, member lists, etc. The image types are parameterized by the return type, dimensionality, arrayness, etc. To do sampling or filtering operations, a type from OpTypeSampledImage is used that contains both an image and a sampler. Such a sampled image can be set directly by the API, or combined in a SPIR-V module from an independent image and an independent sampler.
Types are built bottom up: A parameterizing operand in a type must be defined before being used.
Some additional information about the type of an <id> can be provided using the decoration instructions (OpDecorate, OpMemberDecorate, OpGroupDecorate, OpGroupMemberDecorate, and OpDecorationGroup). These can add, for example, Invariant to an <id> created by another instruction. See the full list of Decorations in the Binary Form section.
Two different type <id>s form, by definition, two different types. It is valid to declare multiple aggregate type <id>s having the same opcode and operands. This is to allow multiple instances of aggregate types with the same structure to be decorated differently. (Different decorations are not required; two different aggregate type <id>s are allowed to have identical declarations and decorations, and will still be two different types.) Non-aggregate types are different: It is invalid to declare multiple type <id>s for the same scalar, vector, or matrix type. That is, non-aggregate type declarations must all have different opcodes or operands. (Note that non-aggregate types cannot be decorated in ways that affect their type.)
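A minimal Python sketch of the uniqueness rule follows. It assumes a toy list of (Result <id>, opcode, operands) records and checks only the scalar, vector, and matrix opcodes named above; the <id> values are hypothetical.

```python
# Opcodes whose declarations must not repeat with identical operands.
UNIQUE_OPCODES = {"OpTypeBool", "OpTypeInt", "OpTypeFloat",
                  "OpTypeVector", "OpTypeMatrix"}

def duplicate_non_aggregate_types(type_decls):
    """Return (first_id, repeated_id) pairs for repeated scalar/vector/matrix types."""
    seen = {}
    duplicates = []
    for result_id, opcode, operands in type_decls:
        if opcode not in UNIQUE_OPCODES:
            continue  # e.g. OpTypeStruct may be declared repeatedly
        key = (opcode, tuple(operands))
        if key in seen:
            duplicates.append((seen[key], result_id))
        else:
            seen[key] = result_id
    return duplicates

type_decls = [
    (1, "OpTypeFloat", [32]),
    (2, "OpTypeVector", [1, 4]),
    (3, "OpTypeStruct", [2, 2]),
    (4, "OpTypeStruct", [2, 2]),  # valid: a second, distinct structure type
    (5, "OpTypeVector", [1, 4]),  # invalid: repeats the declaration of type 2
]
print(duplicate_non_aggregate_types(type_decls))  # [(2, 5)]
```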
Variables are declared to be of an already built type, and placed in a Storage Class. Storage classes include UniformConstant, Input, Workgroup, etc. and are fully specified in Storage Class. Variables declared with the Function Storage Class can have their lifetimes specified within their function using the OpLifetimeStart and OpLifetimeStop instructions.
Intermediate results are typed by the instruction’s type <id>, which must validate with respect to the operation being done.
Built-in variables needing special driver handling (having unique semantics) are declared using OpDecorate or OpMemberDecorate with the BuiltIn Decoration, followed by a BuiltIn enumerant. This decoration is applied to a variable or a structure-type member.
2.9. Function Calling
To call a function defined in the current module or a function declared to be imported from another module, use OpFunctionCall with an operand that is the <id> of the OpFunction to call, and the <id>s of the arguments to pass. All arguments are passed by value into the called function. This includes pointers, through which a callee object could be modified.
2.10. Extended Instruction Sets
Many operations and/or built-in function calls from high-level languages are represented through extended instruction sets. Extended instruction sets will include things like
-
trigonometric functions: sin(), cos(), …
-
exponentiation functions: exp(), pow(), …
-
geometry functions: reflect(), smoothstep(), …
-
functions having rich performance/accuracy trade-offs
-
etc.
Non-extended instructions, those that are core SPIR-V instructions, are listed in the Binary Form section. Native operations include:
-
Basic arithmetic: +, -, *, min(), scalar * vector, etc.
-
Texturing, to help with back-end decoding and support special code-motion rules.
-
Derivatives, due to special code-motion rules.
Extended instruction sets are specified in independent specifications. They can be referenced (but not specified) in this specification. The separate extended instruction set specification will specify instruction opcodes, semantics, and instruction names.
To use an extended instruction set, first import it by name string using OpExtInstImport and giving it a Result <id>:
<extinst-id> OpExtInstImport "name-of-extended-instruction-set"
The "name-of-extended-instruction-set" is a literal string. The standard convention for this string is
"<source language name>.<package name>.<version>"
For example, "GLSL.std.450" could be the name of the core built-in functions for GLSL versions 450 and earlier.
Note: There is nothing precluding two "mirror" sets of instructions with different names but the same opcode values, which could, for example, allow changing a performance/accuracy trade-off by modifying just the import statement.
Then, to call a specific extended instruction, use OpExtInst:
OpExtInst <extinst-id> instruction-number operand0, operand1, ...
Extended instruction-set specifications will provide semantics for each "instruction-number". It is up to the specific specification what the overloading rules are on operand type. The specification must be clear on its semantics, and producers/consumers of it must follow those semantics.
By convention, it is recommended that all external specifications include an enum {…} listing all the "instruction-numbers", and a mapping between these numbers and a string representing the instruction name. However, there are no requirements that instruction name strings are provided or mangled.
Note: Producing and consuming extended instructions can be done entirely through numbers (no string parsing). An extended instruction set specification provides opcode enumerant values for the instructions, and these will be produced by the front end and consumed by the back end.
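The recommended enum-plus-name-mapping convention might look like the following Python sketch. The instruction set, its numbers, and the helper are all hypothetical; they do not correspond to any published extended instruction set.

```python
from enum import IntEnum

class ExampleExtInst(IntEnum):
    """Instruction numbers of a hypothetical extended instruction set."""
    SIN = 1
    COS = 2
    POW = 3

# Mapping from instruction number to a display name, needed only for tooling
# such as disassemblers; producers and consumers can work with numbers alone.
NAMES = {inst.value: inst.name.lower() for inst in ExampleExtInst}

def describe_ext_inst(set_id, instruction_number, operand_ids):
    """Format an OpExtInst-like record; unknown numbers are printed as-is."""
    number = int(instruction_number)
    name = NAMES.get(number, str(number))
    operands = ", ".join(f"%{i}" for i in operand_ids)
    return f"OpExtInst %{set_id} {name} {operands}"

print(describe_ext_inst(1, ExampleExtInst.POW, [7, 8]))  # OpExtInst %1 pow %7, %8
```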
2.11. Structured Control Flow
SPIR-V can explicitly declare structured control-flow constructs using merge instructions. These explicitly declare a header block before the control flow diverges and a merge block where control flow subsequently converges. These blocks delimit constructs that must nest, and can only be entered and exited in structured ways, as per the following.
Structured control-flow declarations must satisfy the following rules:
-
the merge block declared by a header block cannot be a merge block declared by any other header block
-
each header block must strictly dominate its merge block, unless the merge block is unreachable in the CFG
-
all CFG back edges must branch to a loop header, with each loop header having exactly one back edge branching to it
-
for a given loop header, its OpLoopMerge Continue Target, and corresponding back-edge block:
-
the loop header must dominate the Continue Target, unless the Continue Target is unreachable in the CFG
-
the Continue Target must dominate the back-edge block
-
the back-edge block must post dominate the Continue Target
-
A structured control-flow construct is then defined as one of:
-
a selection construct: the set of blocks dominated by a selection header, minus the set of blocks dominated by the header’s merge block
-
a continue construct: the set of blocks dominated by an OpLoopMerge’s Continue Target and post dominated by the corresponding back-edge block
-
a loop construct: the set of blocks dominated by a loop header, minus the set of blocks dominated by the loop’s merge block, minus the loop’s corresponding continue construct
-
a case construct: the set of blocks dominated by an OpSwitch Target or Default, minus the set of blocks dominated by the OpSwitch’s merge block (this construct is only defined for those OpSwitch Target or Default that are not equal to the OpSwitch’s corresponding merge block)
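The selection-construct definition can be tried out on a small control-flow graph. The Python sketch below computes dominator sets with a simple iterative algorithm and then forms "blocks dominated by the header, minus blocks dominated by the merge block". The block names and the CFG are hypothetical, and the sketch is not a validator.

```python
def dominators(cfg, entry):
    """Return {block: set of blocks that dominate it} via iterative data flow."""
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = {b} | (set.intersection(*incoming) if incoming else set())
            if new != dom[b]:
                dom[b], changed = new, True
    return dom

def dominated_by(dom, block):
    """Return the set of blocks that `block` dominates (including itself)."""
    return {b for b, ds in dom.items() if block in ds}

def selection_construct(cfg, entry, header, merge):
    """Blocks dominated by the header, minus blocks dominated by its merge block."""
    dom = dominators(cfg, entry)
    return dominated_by(dom, header) - dominated_by(dom, merge)

# Hypothetical if/else: header branches to then/else, both branch to merge.
cfg = {
    "entry": ["header"],
    "header": ["then", "else"],
    "then": ["merge"],
    "else": ["merge"],
    "merge": ["exit"],
    "exit": [],
}
print(sorted(selection_construct(cfg, "entry", "header", "merge")))
# ['else', 'header', 'then']
```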
The above structured control-flow constructs must satisfy the following rules:
-
if a construct contains another header block, then it also contains that header’s corresponding merge block
-
the only blocks in a construct that can branch outside the construct are
-
a block branching to the construct’s merge block
-
a block branching from one case construct to another, for the same OpSwitch
-
a continue block for the innermost loop it is nested inside of
-
a break block for the innermost loop it is nested inside of
-
-
additionally for switches:
-
an OpSwitch block dominates all its defined case constructs
-
each case construct has at most one branch to another case construct
-
each case construct is branched to by at most one other case construct
-
if Target T1 branches to Target T2, or if Target T1 branches to the Default and the Default branches to Target T2, then T1 must immediately precede T2 in the list of the OpSwitch Target operands
-
2.12. Specialization
Specialization is intended for constant objects that will not have known constant values until after initial generation of a SPIR-V module. Such objects are called specialization constants.
A SPIR-V module containing specialization constants can consume one or more externally provided specializations: A set of final constant values for some subset of the module’s specialization constants. Applying these final constant values yields a new module having fewer remaining specialization constants. A module also contains default values for any specialization constants that never get externally specialized.
Note: No optimizing transforms are required to make a specialized module functionally correct. The specializing transform is straightforward and explicitly defined below.
Note: Ad hoc specializing should not be done through constants (OpConstant or OpConstantComposite) that get overwritten: A SPIR-V → SPIR-V transform might want to do something irreversible with the value of such a constant, unconstrained from the possibility that its value could be later changed.
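Within a module, a Specialization Constant is declared with one of these instructions:
-
OpSpecConstantTrue
-
OpSpecConstantFalse
-
OpSpecConstant
-
OpSpecConstantComposite
-
OpSpecConstantOp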
The literal operands to OpSpecConstant are the default numerical specialization constants. Similarly, the "True" and "False" parts of OpSpecConstantTrue and OpSpecConstantFalse provide the default Boolean specialization constants. These default values make an external specialization optional. However, such a default constant is applied only after all external specializations are complete, and none contained a specialization for it.
An external specialization is provided as a logical list of pairs. Each pair is a SpecId Decoration of a scalar specialization instruction along with its specialization constant. The numeric values are exactly what the operands would be to a corresponding OpConstant instruction. Boolean values are true if non-zero and false if zero.
Specializing a module is straightforward. The following specialization-constant instructions can be updated with specialization constants, and replaced in place, leaving everything else in the module exactly the same:
OpSpecConstantTrue -> OpConstantTrue or OpConstantFalse
OpSpecConstantFalse -> OpConstantTrue or OpConstantFalse
OpSpecConstant -> OpConstant
OpSpecConstantComposite -> OpConstantComposite
The OpSpecConstantOp instruction is specialized by executing the operation and replacing the instruction with the result. The result can be expressed in terms of a constant instruction that is not a specialization-constant instruction. (Note, however, this resulting instruction might not have the same size as the original instruction, so is not a "replaced in place" operation.)
When applying an external specialization, the following (and only the following) must be modified to be non-specialization-constant instructions:
-
specialization-constant instructions with values provided by the specialization
-
specialization-constant instructions that consume nothing but non-specialization constant instructions (including those that the partial specialization transformed from specialization-constant instructions; these are in order, so it is a single pass to do so)
A full specialization can also be done, when requested or required, in which all specialization-constant instructions will be modified to non-specialization-constant instructions, using the default values where required.
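The specializing transform described above can be sketched in Python over a toy module. The dictionary layout, the <id> values, and the SpecId values are assumptions for illustration only; OpSpecConstantOp is omitted for brevity.

```python
def specialize(module, external, full=False):
    """Return a new instruction list with spec constants replaced where allowed.

    module:   list of dicts in module order (a toy form, not the binary encoding)
    external: {SpecId: value} pairs supplied from outside the module
    full:     when True, remaining spec constants fall back to their defaults
    """
    still_spec = set()  # Result <id>s that remain specialization constants
    result = []
    for inst in module:
        inst = dict(inst)
        op = inst["op"]
        provided = inst.get("spec_id") in external
        if op in ("OpSpecConstantTrue", "OpSpecConstantFalse"):
            default = op == "OpSpecConstantTrue"
            if provided or full:
                # Boolean values are true if non-zero and false if zero.
                value = bool(external.get(inst.get("spec_id"), default))
                inst["op"] = "OpConstantTrue" if value else "OpConstantFalse"
            else:
                still_spec.add(inst["id"])
        elif op == "OpSpecConstant":
            if provided or full:
                inst["value"] = external.get(inst.get("spec_id"), inst["value"])
                inst["op"] = "OpConstant"
            else:
                still_spec.add(inst["id"])
        elif op == "OpSpecConstantComposite":
            # Converted only when it consumes no remaining spec constants.
            if any(c in still_spec for c in inst["constituents"]):
                still_spec.add(inst["id"])
            else:
                inst["op"] = "OpConstantComposite"
        result.append(inst)
    return result

module = [
    {"op": "OpSpecConstant", "id": 10, "spec_id": 0, "value": 4},
    {"op": "OpSpecConstant", "id": 11, "spec_id": 1, "value": 8},
    {"op": "OpSpecConstantTrue", "id": 12, "spec_id": 2},
    {"op": "OpSpecConstantComposite", "id": 13, "constituents": [10, 11]},
]
partial = specialize(module, {0: 16, 2: 0})
complete = specialize(module, {0: 16}, full=True)
print([i["op"] for i in partial])
# ['OpConstant', 'OpSpecConstant', 'OpConstantFalse', 'OpSpecConstantComposite']
print([i["op"] for i in complete])
# ['OpConstant', 'OpConstant', 'OpConstantTrue', 'OpConstantComposite']
```

Because instructions are visited in module order, a composite is seen only after its constituents, which is what makes the single pass sufficient in this sketch.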
2.13. Linkage
The ability to have partially linked modules and libraries is provided as part of the Linkage capability.
By default, functions and global variables are private to a module and cannot be accessed by other modules. However, a module may be written to export or import functions and global (module scope) variables. Imported functions and global variable definitions are resolved at linkage time. A module is considered to be partially linked if it depends on imported values.
Within a module, imported or exported values are decorated using the Linkage Attributes Decoration. This decoration assigns the following linkage attributes to decorated values:
-
A Linkage Type.
-
A name, which is a Literal String, and is used to uniquely identify exported values.
Note: When resolving imported functions, the Function Control and all Function Parameter Attributes are taken from the function definition, and not from the function declaration.
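Name-based resolution of imports against exports can be sketched as follows in Python. The module dictionaries, names, and <id> values are hypothetical; real linkage works on values decorated with Linkage Attributes.

```python
# Toy modules: names and ids are hypothetical, not taken from a real module.
library = {"exports": {"lighting": 42}, "imports": []}
shader = {"exports": {}, "imports": ["lighting", "fog"]}

def resolve_imports(importer, exporter):
    """Return (resolved, missing): resolved maps imported names to exported ids."""
    resolved, missing = {}, []
    for name in importer["imports"]:
        if name in exporter["exports"]:
            resolved[name] = exporter["exports"][name]
        else:
            missing.append(name)
    return resolved, missing

resolved, missing = resolve_imports(shader, library)
print(resolved)       # {'lighting': 42}
print(bool(missing))  # True: the result still depends on an imported value
```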
2.14. Relaxed Precision
The RelaxedPrecision Decoration allows 32-bit integer and 32-bit floating-point operations to execute with a relaxed precision of somewhere between 16 and 32 bits.
For a floating-point operation, operating at relaxed precision means that the minimum requirements for range and precision are as follows:
-
the floating point range may be as small as $(-2^{14}, 2^{14})$
-
the floating point magnitude range may be as small as $(2^{-14}, 2^{14})$
-
the relative floating point precision may be as small as $2^{-10}$
Relative floating-point precision is defined as the worst case (i.e. largest) ratio of the smallest step in relation to the value for all non-zero values:
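$\mathrm{Precision}_{\mathrm{relative}} = \left( \dfrac{\operatorname{abs}(v_1 - v_2)_{\min}}{\operatorname{abs}(v_1)} \right)_{\max}$ for $v_1 \neq 0$, $v_2 \neq 0$, $v_1 \neq v_2$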
For integer operations, operating at relaxed precision means that the operation will be evaluated by an operation in which, for some N, 16 ≤ N ≤ 32:
-
the operation is executed as though its type were N bits in size, and
-
the result is zero or sign extended to 32 bits as determined by the signedness of the result type of the operation.
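The integer rule can be modeled with a short Python sketch for wrapping operations such as addition: keep the low N bits of the exact result, then zero or sign extend to 32 bits. The function name is hypothetical, and the choice of N is left to the caller here, whereas in practice it is up to the implementation.

```python
def relaxed_int_result(exact_result, n_bits, signed):
    """Model an integer result evaluated at N bits, then extended to 32 bits.

    Returns the 32-bit pattern as a non-negative Python int.
    """
    if not 16 <= n_bits <= 32:
        raise ValueError("N must satisfy 16 <= N <= 32")
    truncated = exact_result & ((1 << n_bits) - 1)  # keep the low N bits
    if signed and truncated & (1 << (n_bits - 1)):  # sign bit of the N-bit value
        truncated -= 1 << n_bits                    # sign extend
    return truncated & 0xFFFFFFFF                   # 32-bit pattern

# 40000 + 40000 = 80000 = 0x13880, which does not fit in 16 bits.
print(hex(relaxed_int_result(40000 + 40000, 32, signed=False)))  # 0x13880
print(hex(relaxed_int_result(40000 + 40000, 16, signed=False)))  # 0x3880
# 20000 + 20000 = 40000 = 0x9C40, which sets the 16-bit sign bit.
print(hex(relaxed_int_result(20000 + 20000, 16, signed=True)))   # 0xffff9c40
```

The differing outputs show why a module must not rely on a particular N when its results are decorated with RelaxedPrecision.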
The RelaxedPrecision Decoration can be applied to:
-
The <id> of a variable, where the variable’s type is a scalar, vector, or matrix, or an array of scalar, vector, or matrix. In all cases, the components in the type must be a 32-bit numerical type.
-
The Result <id> of an instruction that operates on numerical types, meaning the instruction is to operate at relaxed precision.
-
The Result <id> of an instruction that reads or filters from an image (e.g., OpImageSampleExplicitLod), meaning the instruction is to operate at relaxed precision.
-
The Result <id> of an OpFunction, meaning the function’s returned result is at relaxed precision. It cannot be applied to OpTypeFunction or to an OpFunction whose return type is OpTypeVoid.
-
A structure-type member (through OpMemberDecorate).
When applied to a variable or structure member, all loads and stores from the decorated object may be treated as though they were decorated with RelaxedPrecision. Loads may also be decorated with RelaxedPrecision, in which case they are treated as operating at relaxed precision.
All loads and stores involving relaxed precision still read and write 32 bits of data, respectively. Floating-point data read or written in such a manner is written in full 32-bit floating-point format. However, a load or store might reduce the precision (as allowed by RelaxedPrecision) of the destination value.
For debugging portability of floating-point operations, OpQuantizeToF16 may be used to explicitly reduce the precision of a relaxed-precision result to 16-bit precision. (Integer-result precision can be reduced, for example, using left- and right-shift opcodes.)
For image-sampling operations, decorations can appear on both the sampling instruction and the image variable being sampled. If either is decorated, they both should be decorated, and when both are decorated their decorations must match. If only one is decorated, the sampling instruction can behave either as if both were decorated or neither were decorated.
2.15. Debug Information
Debug information is supplied with:
-
Source-code text through OpString, OpSource, and OpSourceContinued.
-
Object names through OpName and OpMemberName.
-
Line numbers through OpLine.
A module will not lose any semantics when all such instructions are removed.
2.15.1. Function-Name Mangling
There is no functional dependency on how functions are named. Signature-typing information is explicitly provided, without any need for name "unmangling". (Valid modules can be created without inclusion of mangled names.)
By convention, for debugging purposes, modules with OpSource Source Language of OpenCL use the Itanium name-mangling standard.
2.16. Validation Rules
2.16.1. Universal Validation Rules
All modules must obey the following, or it is an invalid module:
-
The stream of instructions must be ordered as described in the Logical Layout section.
-
Any use of a feature described by a capability in the capability section requires that capability to be declared, either directly, or as a "depends on" capability on a capability that is declared.
-
Non-structure types (scalars, vectors, arrays, etc.) with the same operand parameterization cannot be type aliases. For non-structures, two type <id>s match if-and-only-if the types match.
-
If the Logical addressing model is selected:
-
OpVariable cannot allocate an object whose type is a pointer type (that is, it cannot create an object in memory that is itself a pointer and whose result would thus be a pointer to a pointer)
-
A pointer can only be an operand to the following instructions:
-
all OpAtomic instructions
-
extended instruction-set instructions that are explicitly identified as taking pointer operands
-
A pointer can be the Result <id> of only the following instructions:
-
All indexes in OpAccessChain and OpInBoundsAccessChain that are OpConstant with type of OpTypeInt with a signedness of 1 must not have their sign bit set.
-
-
SSA
-
Each <id> must appear exactly once as the Result <id> of an instruction.
-
The definition of an SSA <id> should dominate all uses of it, with the following exceptions:
-
Function calls may call functions not yet defined. However, note that the function’s argument and return types will already be known at the call site.
-
Uses in a phi-function in a loop may consume definitions in the loop that don’t dominate the use.
-
-
-
Entry point and execution model
-
There is at least one OpEntryPoint instruction, unless the Linkage capability is being used.
-
No function can be targeted by both an OpEntryPoint instruction and an OpFunctionCall instruction.
-
-
Functions
-
A function declaration (an OpFunction with no basic blocks) must have a Linkage Attributes Decoration with the Import Linkage Type.
-
A function definition (an OpFunction with basic blocks) cannot be decorated with the Import Linkage Type.
-
A function cannot have both a declaration and a definition (no forward declarations).
-
-
Global (Module Scope) Variables
-
It is illegal to initialize an imported variable. This means that a module-scope OpVariable with initialization value cannot be marked with the Import Linkage Type.
-
-
Control-Flow Graph (CFG)
-
Blocks exist only within a function.
-
The first block in a function definition is the entry point of that function and cannot be the target of any branch. (Note this means it will have no OpPhi instructions.)
-
The order of blocks in a function must satisfy the rule that blocks appear before all blocks they dominate.
-
Each block starts with a label.
-
A label is made by OpLabel.
-
This includes the first block of a function (OpFunction is not a label).
-
Labels are used only to form blocks.
-
-
The last instruction of each block is a termination instruction.
-
Termination instructions can only appear as the last instruction in a block.
-
OpLabel instructions can only appear within a function.
-
All branches within a function must be to labels in that function.
-
-
All OpFunctionCall Function operands are an <id> of an OpFunction in the same module.
-
Data rules
-
Scalar floating-point types can be parameterized only as 32 bit, plus any additional sizes enabled by capabilities.
-
Scalar integer types can be parameterized only as 32 bit, plus any additional sizes enabled by capabilities.
-
Vector types can only be parameterized with numerical types or the OpTypeBool type.
-
Vector types can only be parameterized as having 2, 3, or 4 components, plus any additional sizes enabled by capabilities.
-
Matrix types can only be parameterized with floating-point types.
-
Matrix types can only be parameterized as having 2, 3, or 4 columns.
-
Specialization constants (see Specialization) are limited to integers, Booleans, floating-point numbers, and vectors of these.
-
Forward reference operands in an OpTypeStruct
-
must be later declared with OpTypePointer
-
the type pointed to must be an OpTypeStruct
-
had an earlier OpTypeForwardPointer forward reference to the same <id>
-
-
All OpSampledImage instructions must be in the same block in which their Result <id> are consumed. Result <id> from OpSampledImage instructions must not appear as operands to OpPhi instructions or OpSelect instructions, or any instructions other than the image lookup and query instructions specified to take an operand whose type is OpTypeSampledImage.
-
Instructions for extracting a scalar image or scalar sampler out of a composite must only use dynamically-uniform indexes. They must be in the same block in which their Result <id> are consumed. Such Result <id> must not appear as operands to OpPhi instructions or OpSelect instructions, or any instructions other than the image instructions specified to operate on them.
-
-
Decoration rules
-
The Aliased Decoration can only be applied to intermediate objects that are pointers to non-void types.
-
The Linkage Attributes Decoration cannot be applied to functions targeted by an OpEntryPoint instruction.
-
A BuiltIn Decoration can only be applied as follows:
-
When applied to a structure-type member, all members of that structure type must also be decorated with BuiltIn. (Mixing built-in variables and non-built-in variables within a single structure is not allowed.)
-
When applied to a structure-type member, that structure type cannot be contained as a member of another structure type.
-
There is at most one object per Storage Class that can contain a structure type containing members decorated with BuiltIn, consumed per entry-point.
-
-
-
OpLoad and OpStore can only consume objects whose type is a pointer.
-
A Result <id> resulting from an instruction within a function can only be used in that function.
-
A function call must have the same number of arguments as the function definition (or declaration) has parameters, and their respective types must match.
-
An instruction requiring a specific number of operands must have that many operands. The word count must agree.
-
Each opcode specifies its own requirements for number and type of operands, and these must be followed.
-
Atomic access rules
-
The pointers taken by atomic operation instructions must be a pointer into one of the following Storage Classes:
-
Uniform when used with the BufferBlock Decoration
-
Workgroup
-
CrossWorkgroup
-
Generic
-
AtomicCounter
-
Image
-
-
All pointers used in atomic operation instructions must be pointers to one of the following:
-
32-bit scalar integer
-
64-bit scalar integer
-
-
2.16.2. Validation Rules for Shader Capabilities
-
CFG:
-
Loops must be structured, having an OpLoopMerge instruction in their header.
-
Selections must be structured, having an OpSelectionMerge instruction in their header.
-
-
Entry point and execution model
-
Each entry point in a module, along with its corresponding static call tree within that module, forms a complete pipeline stage.
-
Each OpEntryPoint with the Fragment Execution Model must have an OpExecutionMode for either the OriginLowerLeft or the OriginUpperLeft Execution Mode. (Exactly one of these is required.)
-
An OpEntryPoint with the Fragment Execution Model can set at most one of the DepthGreater, DepthLess, or DepthUnchanged Execution Modes.
-
An OpEntryPoint with one of the Tessellation Execution Models can set at most one of the SpacingEqual, FractionalEven, or FractionalOdd Execution Modes.
-
An OpEntryPoint with one of the Tessellation Execution Models can set at most one of the Triangles, Quads, or Isolines Execution Modes.
-
An OpEntryPoint with one of the Tessellation Execution Models can set at most one of the VertexOrderCw or VertexOrderCcw Execution Modes.
-
An OpEntryPoint with the Geometry Execution Model must set exactly one of the InputPoints, InputLines, InputLinesAdjacency, Triangles, or TrianglesAdjacency Execution Modes.
-
An OpEntryPoint with the Geometry Execution Model must set exactly one of the OutputPoints, OutputLineStrip, or OutputTriangleStrip Execution Modes.
-
-
Composite objects in the UniformConstant, Uniform, and PushConstant Storage Classes must be explicitly laid out. The following apply to all the aggregate and matrix types describing such an object, recursively through their nested types:
-
Each structure-type member must have an Offset Decoration.
-
Each array type must have an ArrayStride Decoration.
-
Each structure-type member that is a matrix or array-of-matrices must be decorated with
-
a MatrixStride Decoration, and
-
one of the RowMajor or ColMajor Decorations.
-
-
The ArrayStride, MatrixStride, and Offset Decorations must be large enough to hold the size of the objects they affect (that is, specifying overlap is invalid). Each ArrayStride and MatrixStride must be greater than zero, and no two members of a given structure can be assigned to the same Offset.
-
-
For structure objects in the Input and Output Storage Classes, the following apply:
-
When applied to structure-type members, the Decorations Noperspective, Flat, Patch, Centroid, and Sample can only be applied to the top-level members of the structure type. (Nested objects' types cannot be structures whose members are decorated with these decorations.)
-
-
Decorations
-
At most one of Noperspective or Flat Decorations can be applied to the same object or member.
-
At most one of Patch, Centroid, or Sample Decorations can be applied to the same object or member.
-
At most one of RowMajor and ColMajor Decorations can be applied to a structure type.
-
At most one of Block and BufferBlock Decorations can be applied to a structure type.
-
-
All <id> used for Scope and Memory Semantics must be of an OpConstant.
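The explicit-layout rules above (no overlapping members, no shared Offset, positive strides large enough for the object) can be sketched as simple checks in Python. Member names, offsets, and sizes are hypothetical, and sizes are supplied by the caller rather than derived from types.

```python
def check_struct_layout(members):
    """Return a list of problems for (name, offset, size) members of one structure.

    Sizes are supplied by the caller; computing them from types is out of scope.
    """
    problems = []
    ordered = sorted(members, key=lambda m: m[1])  # sort by Offset
    for (name_a, off_a, size_a), (name_b, off_b, _) in zip(ordered, ordered[1:]):
        if off_a == off_b:
            problems.append(f"{name_a} and {name_b} share Offset {off_a}")
        elif off_a + size_a > off_b:
            problems.append(f"{name_a} overlaps {name_b}")
    return problems

def check_array_stride(stride, element_size):
    """Return True when an ArrayStride is positive and covers one element."""
    return stride > 0 and stride >= element_size

good = [("position", 0, 16), ("normal", 16, 12), ("scale", 28, 4)]
bad = [("position", 0, 16), ("normal", 16, 12), ("scale", 24, 4)]
print(check_struct_layout(good))  # []
print(check_struct_layout(bad))   # ['normal overlaps scale']
print(check_array_stride(16, 12), check_array_stride(8, 12))  # True False
```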
2.16.3. Validation Rules for Kernel Capabilities
-
The Signedness in OpTypeInt must always be 0.
2.17. Universal Limits
These quantities are minimum limits for all implementations and validators. Implementations are allowed to support larger quantities. Specific APIs may impose larger minimums. See Language Capabilities.
Validators must either
-
inform when these limits are crossed, or
-
be explicitly parameterized with larger limits.
Limited Entity | Minimum Limit (Decimal) | Minimum Limit (Hexadecimal) |
---|---|---|
Characters in a literal string | 65,535 | FFFF |
Instruction word count | 65,535 | FFFF |
Result <id> bound | 4,194,303 | 3FFFFF |
Control-flow nesting depth | 1,023 | 3FF |
Global variables (Storage Class other than Function) | 65,535 | FFFF |
Local variables (Function Storage Class) | 524,287 | 7FFFF |
Decorations per target <id> | Number of entries in the Decoration table. | |
Execution modes per entry point | 255 | FF |
Indexes for OpAccessChain, OpInBoundsAccessChain, OpPtrAccessChain, OpInBoundsPtrAccessChain, OpCompositeExtract, and OpCompositeInsert | 255 | FF |
Number of function parameters, per function declaration | 255 | FF |
OpFunctionCall actual arguments | 255 | FF |
OpExtInst actual arguments | 255 | FF |
OpSwitch (literal, label) pairs | 16,383 | 3FFF |
OpTypeStruct members | 16,383 | 3FFF |
Structure nesting depth | 255 | FF |
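The two validator behaviours described above (inform when a limit is crossed, or be parameterized with larger limits) might look like the following sketch. The dictionary keys and the function name are hypothetical labels chosen for this example; only the numeric values come from the table.

```python
# Numeric limits from the Universal Limits table; the key names are made up for this sketch.
UNIVERSAL_LIMITS = {
    "literal_string_chars": 0xFFFF,
    "instruction_word_count": 0xFFFF,
    "result_id_bound": 0x3FFFFF,
    "control_flow_nesting_depth": 0x3FF,
    "global_variables": 0xFFFF,
    "local_variables": 0x7FFFF,
    "execution_modes_per_entry_point": 0xFF,
    "access_chain_indexes": 0xFF,
    "function_parameters": 0xFF,
    "function_call_arguments": 0xFF,
    "ext_inst_arguments": 0xFF,
    "switch_pairs": 0x3FFF,
    "struct_members": 0x3FFF,
    "struct_nesting_depth": 0xFF,
}


def limits_crossed(measured, limits=UNIVERSAL_LIMITS):
    """Return the sorted names of measured quantities that exceed the given limits."""
    return sorted(
        name for name, value in measured.items()
        if name in limits and value > limits[name]
    )
```

Passing a modified copy of `UNIVERSAL_LIMITS` as `limits` corresponds to the "explicitly parameterized with larger limits" option.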
2.18. Memory Model
A memory model is chosen using a single OpMemoryModel instruction near the beginning of the module. This selects both an addressing model and a memory model.
The Logical addressing model means pointers are abstract, having no physical size or numeric value. In this mode, pointers can only be created from existing objects, and they cannot be stored into an object.
The non-Logical addressing models allow physical pointers to be formed. OpVariable can be used to create objects that hold pointers. These are declared for a specific Storage Class. Pointers for one Storage Class cannot be used to access objects in another Storage Class. However, they can be converted with conversion opcodes. Any particular addressing model must describe the bit width of pointers for each of the storage classes.
2.18.1. Memory Layout
When memory is shared between a SPIR-V module and an API, its contents are transparent, and must be agreed on. For example, the Offset, MatrixStride, and ArrayStride Decorations applied to members of a struct object can partially define how the memory is laid out. In addition, the following are always true, applied recursively as needed, of the offsets within the memory buffer:
-
a vector consumes contiguous memory with lower-numbered components appearing at smaller offsets than higher-numbered components, and with component 0 starting at the vector’s Offset Decoration, if present
-
in an array, lower-numbered elements appear at smaller offsets than higher-numbered elements, with element 0 starting at the Offset Decoration for the array, if present
-
a structure has lower-numbered members appearing at smaller offsets than higher-numbered members, with member 0 starting at the Offset Decoration for the structure, if present
-
in a matrix, lower-numbered columns appear at smaller offsets than higher-numbered columns, and lower-numbered components within the matrix’s vectors appear at smaller offsets than higher-numbered components, with component 0 of column 0 starting at the Offset Decoration, if present (the RowMajor and ColMajor Decorations dictate what is contiguous)
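As a rough illustration of the rules above, the byte offset of an array element or matrix component can be computed from the decorations. This sketch assumes that MatrixStride is the byte distance between consecutive columns of a ColMajor matrix, or between consecutive rows of a RowMajor matrix; the function and parameter names are hypothetical.

```python
def array_element_offset(base: int, array_stride: int, index: int) -> int:
    """Offset of element `index`, with element 0 at `base` (the array's Offset)."""
    return base + index * array_stride


def matrix_component_offset(base: int, matrix_stride: int, component_size: int,
                            column: int, row: int, row_major: bool = False) -> int:
    """Offset of the component at (column, row), with column 0 / row 0 at `base`.

    Assumption: ColMajor keeps each column contiguous, RowMajor keeps each row contiguous.
    """
    if row_major:
        return base + row * matrix_stride + column * component_size
    return base + column * matrix_stride + row * component_size
```

For example, with a MatrixStride of 16 and 4-byte components, component 1 of column 2 in a ColMajor matrix sits 36 bytes past the member's Offset under these assumptions.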
2.18.2. Aliasing
Here, aliasing means one of:
-
Two or more pointers that point into overlapping parts of the same underlying object. That is, two intermediates, both of which are typed pointers, that can be dereferenced (in bounds) such that both dereferences access the same memory.
-
Images, buffers, or other externally allocated objects where a function might access the same underlying memory via accesses to two different objects.
How aliasing is managed depends on the Memory Model:
-
The simple and GLSL memory models can assume that aliasing is generally not present. Specifically, the compiler is free to compile as if aliasing is not present, unless a pointer is explicitly indicated to be an alias. This is indicated by applying the Aliased Decoration to an intermediate object’s <id>. Applying Restrict is allowed, but has no effect.
-
The OpenCL memory models must assume that aliasing is generally present. Specifically, the compiler must compile as if aliasing is present, unless a pointer is explicitly indicated to not alias. This is done by applying the Restrict Decoration to an intermediate object’s <id>. Applying Aliased is allowed, but has no effect.
It is invalid to apply both Restrict and Aliased to the same <id>.
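The aliasing defaults above can be summarized as a small decision function. It is only a sketch of the rules as stated in this section: the function name is hypothetical, and decorations are passed as plain strings for illustration.

```python
def may_assume_no_alias(memory_model: str, decorations) -> bool:
    """Return True if a compiler may treat the decorated <id> as not aliasing."""
    decorations = set(decorations)
    if {"Restrict", "Aliased"} <= decorations:
        raise ValueError("Restrict and Aliased cannot both be applied to the same <id>")
    if memory_model in ("Simple", "GLSL450"):
        # Aliasing is assumed absent unless Aliased is present; Restrict has no effect.
        return "Aliased" not in decorations
    if memory_model == "OpenCL":
        # Aliasing is assumed present unless Restrict is present; Aliased has no effect.
        return "Restrict" in decorations
    raise ValueError(f"unknown memory model: {memory_model}")
```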
2.19. Derivatives
Derivatives appear only in the Fragment Execution Model. They can be implicit or explicit. Some image instructions consume implicit derivatives, while the derivative instructions compute explicit derivatives. In all cases, derivatives are well defined only if the derivative group has uniform control flow.
2.20. Code Motion
Texturing instructions in the Fragment Execution Model that rely on an implicit derivative cannot be moved into control flow that is not known to be uniform control flow within each derivative group.
3. Binary Form
This section contains the exact form for all instructions, starting with the numerical values for all fields. See Physical Layout for the order words appear in.
3.1. Magic Number
Magic number for a SPIR-V module.
Tip: Endianness: A module is defined as a stream of words, not a stream of bytes. However, if stored as a stream of bytes (e.g., in a file), the magic number can be used to deduce what endianness to apply to convert the byte stream back to a word stream.
Magic Number |
---|
0x07230203 |
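The endianness deduction described in the tip could be done as follows. This is a minimal sketch; the function names are hypothetical, and it assumes the byte stream begins with the magic number.

```python
import struct

SPIRV_MAGIC = 0x07230203


def detect_byte_order(data: bytes) -> str:
    """Return '<' (little-endian) or '>' (big-endian) based on the first word."""
    if len(data) < 4:
        raise ValueError("need at least one 32-bit word")
    for order in ("<", ">"):
        if struct.unpack(order + "I", data[:4])[0] == SPIRV_MAGIC:
            return order
    raise ValueError("magic number not found")


def bytes_to_words(data: bytes) -> list:
    """Convert a byte stream back to a list of 32-bit words."""
    if len(data) % 4 != 0:
        raise ValueError("byte stream is not a whole number of words")
    order = detect_byte_order(data)
    return list(struct.unpack(f"{order}{len(data) // 4}I", data))
```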
3.2. Source Language
The source language is for debug purposes only, with no semantics that affect the meaning of other parts of the module. Used by OpSource.
Value | Source Language |
---|---|
0 | Unknown |
1 | ESSL |
2 | GLSL |
3 | OpenCL_C |
4 | OpenCL_CPP |
5 | HLSL |
3.3. Execution Model
Used by OpEntryPoint.
Value | Execution Model | Enabling Capabilities |
---|---|---|
0 | Vertex | Shader |
1 | TessellationControl | Tessellation |
2 | TessellationEvaluation | Tessellation |
3 | Geometry | Geometry |
4 | Fragment | Shader |
5 | GLCompute | Shader |
6 | Kernel | Kernel |
3.4. Addressing Model
Used by OpMemoryModel.
Value | Addressing Model | Enabling Capabilities |
---|---|---|
0 | Logical | |
1 | Physical32 | Addresses |
2 | Physical64 | Addresses |
3.5. Memory Model
Used by OpMemoryModel.
Value | Memory Model | Enabling Capabilities |
---|---|---|
0 | Simple | Shader |
1 | GLSL450 | Shader |
2 | OpenCL | Kernel |
3.6. Execution Mode
Declare the modes an entry point will execute in. Used by OpExecutionMode.
Execution Mode | Enabling Capabilities | Extra Operands | |||
---|---|---|---|---|---|
0 |
Invocations |
Geometry |
Literal Number |
||
1 |
SpacingEqual |
Tessellation |
|||
2 |
SpacingFractionalEven |
Tessellation |
|||
3 |
SpacingFractionalOdd |
Tessellation |
|||
4 |
VertexOrderCw |
Tessellation |
|||
5 |
VertexOrderCcw |
Tessellation |
|||
6 |
PixelCenterInteger |
Shader |
|||
7 |
OriginUpperLeft |
Shader |
|||
8 |
OriginLowerLeft |
Shader |
|||
9 |
EarlyFragmentTests |
Shader |
|||
10 |
PointMode |
Tessellation |
|||
11 |
Xfb |
TransformFeedback |
|||
12 |
DepthReplacing |
Shader |
|||
14 |
DepthGreater |
Shader |
|||
15 |
DepthLess |
Shader |
|||
16 |
DepthUnchanged |
Shader |
|||
17 |
LocalSize |
Literal Number |
Literal Number |
Literal Number |
|
18 |
LocalSizeHint |
Kernel |
Literal Number |
Literal Number |
Literal Number |
19 |
InputPoints |
Geometry |
|||
20 |
InputLines |
Geometry |
|||
21 |
InputLinesAdjacency |
Geometry |
|||
22 |
Triangles |
Geometry, Tessellation |
|||
23 |
InputTrianglesAdjacency |
Geometry |
|||
24 |
Quads |
Tessellation |
|||
25 |
Isolines |
Tessellation |
|||
26 |
OutputVertices |
Geometry, Tessellation |
Literal Number |
||
27 |
OutputPoints |
Geometry |
|||
28 |
OutputLineStrip |
Geometry |
|||
29 |
OutputTriangleStrip |
Geometry |
|||
30 |
VecTypeHint |
Kernel |
Literal Number |
||
31 |
ContractionOff |
Kernel |
|||
4446 |
PostDepthCoverage |
SampleMaskPostDepthCoverage |
|||
5027 |
StencilRefReplacingEXT |
StencilExportEXT |
3.7. Storage Class
Class of storage for declared variables (does not include intermediate values). Used by:
Value | Storage Class | Enabling Capabilities | Enabled by Extension |
---|---|---|---|
0 | UniformConstant | | |
1 | Input | | |
2 | Uniform | Shader | |
3 | Output | Shader | |
4 | Workgroup | | |
5 | CrossWorkgroup | | |
6 | Private | Shader | |
7 | Function | | |
8 | Generic | GenericPointer | |
9 | PushConstant | Shader | |
10 | AtomicCounter | AtomicStorage | |
11 | Image | | |
12 | StorageBuffer | Shader | SPV_KHR_storage_buffer_storage_class, SPV_KHR_variable_pointers |
3.8. Dim
Dimensionality of an image. Used by OpTypeImage.
Value | Dim | Enabling Capabilities |
---|---|---|
0 | 1D | Sampled1D |
1 | 2D | |
2 | 3D | |
3 | Cube | Shader |
4 | Rect | SampledRect |
5 | Buffer | SampledBuffer |
6 | SubpassData | InputAttachment |
3.9. Sampler Addressing Mode
Addressing mode for creating constant samplers. Used by OpConstantSampler.
Value | Sampler Addressing Mode | Enabling Capabilities |
---|---|---|
0 | None | Kernel |
1 | ClampToEdge | Kernel |
2 | Clamp | Kernel |
3 | Repeat | Kernel |
4 | RepeatMirrored | Kernel |
3.10. Sampler Filter Mode
Filter mode for creating constant samplers. Used by OpConstantSampler.
Value | Sampler Filter Mode | Enabling Capabilities |
---|---|---|
0 | Nearest | Kernel |
1 | Linear | Kernel |
3.11. Image Format
Declarative image format. Used by OpTypeImage.
Value | Image Format | Enabling Capabilities |
---|---|---|
0 | Unknown | |
1 | Rgba32f | Shader |
2 | Rgba16f | Shader |
3 | R32f | Shader |
4 | Rgba8 | Shader |
5 | Rgba8Snorm | Shader |
6 | Rg32f | StorageImageExtendedFormats |
7 | Rg16f | StorageImageExtendedFormats |
8 | R11fG11fB10f | StorageImageExtendedFormats |
9 | R16f | StorageImageExtendedFormats |
10 | Rgba16 | StorageImageExtendedFormats |
11 | Rgb10A2 | StorageImageExtendedFormats |
12 | Rg16 | StorageImageExtendedFormats |
13 | Rg8 | StorageImageExtendedFormats |
14 | R16 | StorageImageExtendedFormats |
15 | R8 | StorageImageExtendedFormats |
16 | Rgba16Snorm | StorageImageExtendedFormats |
17 | Rg16Snorm | StorageImageExtendedFormats |
18 | Rg8Snorm | StorageImageExtendedFormats |
19 | R16Snorm | StorageImageExtendedFormats |
20 | R8Snorm | StorageImageExtendedFormats |
21 | Rgba32i | Shader |
22 | Rgba16i | Shader |
23 | Rgba8i | Shader |
24 | R32i | Shader |
25 | Rg32i | StorageImageExtendedFormats |
26 | Rg16i | StorageImageExtendedFormats |
27 | Rg8i | StorageImageExtendedFormats |
28 | R16i | StorageImageExtendedFormats |
29 | R8i | StorageImageExtendedFormats |
30 | Rgba32ui | Shader |
31 | Rgba16ui | Shader |
32 | Rgba8ui | Shader |
33 | R32ui | Shader |
34 | Rgb10a2ui | StorageImageExtendedFormats |
35 | Rg32ui | StorageImageExtendedFormats |
36 | Rg16ui | StorageImageExtendedFormats |
37 | Rg8ui | StorageImageExtendedFormats |
38 | R16ui | StorageImageExtendedFormats |
39 | R8ui | StorageImageExtendedFormats |
3.12. Image Channel Order
Image channel order returned by OpImageQueryOrder.
Value | Image Channel Order | Enabling Capabilities |
---|---|---|
0 | R | Kernel |
1 | A | Kernel |
2 | RG | Kernel |
3 | RA | Kernel |
4 | RGB | Kernel |
5 | RGBA | Kernel |
6 | BGRA | Kernel |
7 | ARGB | Kernel |
8 | Intensity | Kernel |
9 | Luminance | Kernel |
10 | Rx | Kernel |
11 | RGx | Kernel |
12 | RGBx | Kernel |
13 | Depth | Kernel |
14 | DepthStencil | Kernel |
15 | sRGB | Kernel |
16 | sRGBx | Kernel |
17 | sRGBA | Kernel |
18 | sBGRA | Kernel |
19 | ABGR | Kernel |
3.13. Image Channel Data Type
Image channel data type returned by OpImageQueryFormat.
Value | Image Channel Data Type | Enabling Capabilities |
---|---|---|
0 | SnormInt8 | Kernel |
1 | SnormInt16 | Kernel |
2 | UnormInt8 | Kernel |
3 | UnormInt16 | Kernel |
4 | UnormShort565 | Kernel |
5 | UnormShort555 | Kernel |
6 | UnormInt101010 | Kernel |
7 | SignedInt8 | Kernel |
8 | SignedInt16 | Kernel |
9 | SignedInt32 | Kernel |
10 | UnsignedInt8 | Kernel |
11 | UnsignedInt16 | Kernel |
12 | UnsignedInt32 | Kernel |
13 | HalfFloat | Kernel |
14 | Float | Kernel |
15 | UnormInt24 | Kernel |
16 | UnormInt101010_2 | Kernel |
3.14. Image Operands
Additional operands to sampling, or getting texels from, an image. Bits that are set can indicate that another operand follows. If there are multiple following operands indicated, they are ordered: Those indicated by smaller-numbered bits appear first. At least one bit must be set (None is invalid).
This value is a literal mask; it can be formed by combining the bits from multiple rows in the table below.
Used by:
Value | Image Operands | Enabling Capabilities |
---|---|---|
0x0 | None | |
0x1 | Bias | Shader |
0x2 | Lod | |
0x4 | Grad | |
0x8 | ConstOffset | |
0x10 | Offset | ImageGatherExtended |
0x20 | ConstOffsets | |
0x40 | Sample | |
0x80 | MinLod | MinLod |
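The ordering rule for Image Operands (operands indicated by smaller-numbered bits appear first) can be sketched as below. The function name is hypothetical, and the sketch only reports which kinds of operands follow and in what order; it does not model how many <id>s each kind consumes.

```python
# (bit, name) pairs from the Image Operands table, in increasing bit order.
IMAGE_OPERAND_BITS = [
    (0x1, "Bias"),
    (0x2, "Lod"),
    (0x4, "Grad"),
    (0x8, "ConstOffset"),
    (0x10, "Offset"),
    (0x20, "ConstOffsets"),
    (0x40, "Sample"),
    (0x80, "MinLod"),
]


def image_operand_order(mask: int) -> list:
    """Return the names of the set bits, in the order their operands would follow."""
    if mask == 0:
        raise ValueError("at least one bit must be set (None is invalid)")
    known = sum(bit for bit, _ in IMAGE_OPERAND_BITS)
    if mask & ~known:
        raise ValueError("mask contains bits not listed in the table")
    return [name for bit, name in IMAGE_OPERAND_BITS if mask & bit]
```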
3.15. FP Fast Math Mode
Enables fast math operations which are otherwise unsafe.
This value is a literal mask; it can be formed by combining the bits from multiple rows in the table below.
Value | FP Fast Math Mode | Enabling Capabilities |
---|---|---|
0x0 | None | |
0x1 | NotNaN | Kernel |
0x2 | NotInf | Kernel |
0x4 | NSZ | Kernel |
0x8 | AllowRecip | Kernel |
0x10 | Fast | Kernel |
3.16. FP Rounding Mode
Associate a rounding mode to a floating-point conversion instruction.
Value | FP Rounding Mode | Enabling Capabilities |
---|---|---|
0 | RTE | Kernel, StorageUniformBufferBlock16, StorageUniform16, StoragePushConstant16, StorageInputOutput16 |
1 | RTZ | Kernel, StorageUniformBufferBlock16, StorageUniform16, StoragePushConstant16, StorageInputOutput16 |
2 | RTP | Kernel, StorageUniformBufferBlock16, StorageUniform16, StoragePushConstant16, StorageInputOutput16 |
3 | RTN | Kernel, StorageUniformBufferBlock16, StorageUniform16, StoragePushConstant16, StorageInputOutput16 |
3.17. Linkage Type
Associate a linkage type to functions or global variables. See linkage.
Value | Linkage Type | Enabling Capabilities |
---|---|---|
0 | Export | Linkage |
1 | Import | Linkage |
3.18. Access Qualifier
Defines the access permissions.
Used by OpTypeImage and OpTypePipe.
Value | Access Qualifier | Enabling Capabilities |
---|---|---|
0 | ReadOnly | Kernel |
1 | WriteOnly | Kernel |
2 | ReadWrite | Kernel |
3.19. Function Parameter Attribute
Adds additional information to the return type and to each parameter of a function.
Value | Function Parameter Attribute | Enabling Capabilities |
---|---|---|
0 | Zext | Kernel |
1 | Sext | Kernel |
2 | ByVal | Kernel |
3 | Sret | Kernel |
4 | NoAlias | Kernel |
5 | NoCapture | Kernel |
6 | NoWrite | Kernel |
7 | NoReadWrite | Kernel |
3.20. Decoration
Used by OpDecorate and OpMemberDecorate.
Decoration | Enabling Capabilities | Extra Operands | ||
---|---|---|---|---|
0 |
RelaxedPrecision |
Shader |
||
1 |
SpecId |
Shader |
Literal Number |
|
2 |
Block |
Shader |
||
3 |
BufferBlock |
Shader |
||
4 |
RowMajor |
Matrix |
||
5 |
ColMajor |
Matrix |
||
6 |
ArrayStride |
Shader |
Literal Number |
|
7 |
MatrixStride |
Matrix |
Literal Number |
|
8 |
GLSLShared |
Shader |
||
9 |
GLSLPacked |
Shader |
||
10 |
CPacked |
Kernel |
||
11 |
BuiltIn |
|||
13 |
NoPerspective |
Shader |
||
14 |
Flat |
Shader |
||
15 |
Patch |
Tessellation |
||
16 |
Centroid |
Shader |
||
17 |
Sample |
SampleRateShading |
||
18 |
Invariant |
Shader |
||
19 |
Restrict |
|||
20 |
Aliased |
|||
21 |
Volatile |
|||
22 |
Constant |
Kernel |
||
23 |
Coherent |
|||
24 |
NonWritable |
|||
25 |
NonReadable |
|||
26 |
Uniform |
Shader |
||
28 |
SaturatedConversion |
Kernel |
||
29 |
Stream |
GeometryStreams |
Literal Number |
|
30 |
Location |
Shader |
Literal Number |
|
31 |
Component |
Shader |
Literal Number |
|
32 |
Index |
Shader |
Literal Number |
|
33 |
Binding |
Shader |
Literal Number |
|
34 |
DescriptorSet |
Shader |
Literal Number |
|
35 |
Offset |
Shader |
Literal Number |
|
36 |
XfbBuffer |
TransformFeedback |
Literal Number |
|
37 |
XfbStride |
TransformFeedback |
Literal Number |
|
38 |
FuncParamAttr |
Kernel |
Function Parameter Attribute |
|
39 |
FPRoundingMode |
Kernel, StorageUniformBufferBlock16, StorageUniform16, StoragePushConstant16, StorageInputOutput16 |
FP Rounding Mode |
|
40 |
FPFastMathMode |
Kernel |
FP Fast Math Mode |
|
41 |
LinkageAttributes |
Linkage |
Literal String |
Linkage Type |
42 |
NoContraction |
Shader |
||
43 |
InputAttachmentIndex |
InputAttachment |
Literal Number |
|
44 |
Alignment |
Kernel |
Literal Number |
|
4999 |
ExplicitInterpAMD |
|||
5248 |
OverrideCoverageNV |
SampleMaskOverrideCoverageNV |
||
5250 |
PassthroughNV |
GeometryShaderPassthroughNV |
||
5252 |
ViewportRelativeNV |
ShaderViewportMaskNV |
||
5256 |
SecondaryViewportRelativeNV |
ShaderStereoViewNV |
Literal Number |
3.21. BuiltIn
Used when Decoration is BuiltIn. Apply to either
-
the result <id> of the variable declaration of the built-in variable, or
-
a structure-type member, if the built-in is a member of a structure.
As stated per entry below, these have additional semantics and constraints described by the client API.
Value | BuiltIn | Enabling Capabilities |
---|---|---|
0 | Position | Shader |
1 | PointSize | Shader |
3 | ClipDistance | ClipDistance |
4 | CullDistance | CullDistance |
5 | VertexId | Shader |
6 | InstanceId | Shader |
7 | PrimitiveId | Geometry, Tessellation |
8 | InvocationId | Geometry, Tessellation |
9 | Layer | Geometry |
10 | ViewportIndex | MultiViewport |
11 | TessLevelOuter | Tessellation |
12 | TessLevelInner | Tessellation |
13 | TessCoord | Tessellation |
14 | PatchVertices | Tessellation |
15 | FragCoord | Shader |
16 | PointCoord | Shader |
17 | FrontFacing | Shader |
18 | SampleId | SampleRateShading |
19 | SamplePosition | SampleRateShading |
20 | SampleMask | Shader |
22 | FragDepth | Shader |
23 | HelperInvocation | Shader |
24 | NumWorkgroups | |
25 | WorkgroupSize | |
26 | WorkgroupId | |
27 | LocalInvocationId | |
28 | GlobalInvocationId | |
29 | LocalInvocationIndex | |
30 | WorkDim | Kernel |
31 | GlobalSize | Kernel |
32 | EnqueuedWorkgroupSize | Kernel |
33 | GlobalOffset | Kernel |
34 | GlobalLinearId | Kernel |
36 | SubgroupSize | Kernel |
37 | SubgroupMaxSize | Kernel |
38 | NumSubgroups | Kernel |
39 | NumEnqueuedSubgroups | Kernel |
40 | SubgroupId | Kernel |
41 | SubgroupLocalInvocationId | Kernel |
42 | VertexIndex | Shader |
43 | InstanceIndex | Shader |
4416 | SubgroupEqMaskKHR | SubgroupBallotKHR |
4417 | SubgroupGeMaskKHR | SubgroupBallotKHR |
4418 | SubgroupGtMaskKHR | SubgroupBallotKHR |
4419 | SubgroupLeMaskKHR | SubgroupBallotKHR |
4420 | SubgroupLtMaskKHR | SubgroupBallotKHR |
4424 | BaseVertex | DrawParameters |
4425 | BaseInstance | DrawParameters |
4426 | DrawIndex | DrawParameters |
4438 | DeviceIndex | DeviceGroup |
4440 | ViewIndex | MultiView |
4992 | BaryCoordNoPerspAMD | |
4993 | BaryCoordNoPerspCentroidAMD | |
4994 | BaryCoordNoPerspSampleAMD | |
4995 | BaryCoordSmoothAMD | |
4996 | BaryCoordSmoothCentroidAMD | |
4997 | BaryCoordSmoothSampleAMD | |
4998 | BaryCoordPullModelAMD | |
5014 | FragStencilRefEXT | StencilExportEXT |
5253 | ViewportMaskNV | ShaderViewportMaskNV |
5257 | SecondaryPositionNV | ShaderStereoViewNV |
5258 | SecondaryViewportMaskNV | ShaderStereoViewNV |
5261 | PositionPerViewNV | PerViewAttributesNV |
5262 | ViewportMaskPerViewNV | PerViewAttributesNV |
5264 | FullyCoveredEXT | FragmentFullyCoveredEXT |
3.22. Selection Control
This value is a literal mask; it can be formed by combining the bits from multiple rows in the table below.
Used by OpSelectionMerge.
Value | Selection Control |
---|---|
0x0 | None |
0x1 | Flatten |
0x2 | DontFlatten |
3.23. Loop Control
This value is a literal mask; it can be formed by combining the bits from multiple rows in the table below.
Used by OpLoopMerge.
Value | Loop Control |
---|---|
0x0 | None |
0x1 | Unroll |
0x2 | DontUnroll |
3.24. Function Control
This value is a literal mask; it can be formed by combining the bits from multiple rows in the table below.
Used by OpFunction.
Value | Function Control |
---|---|
0x0 | None |
0x1 | Inline |
0x2 | DontInline |
0x4 | Pure |
0x8 | Const |
3.25. Memory Semantics <id>
Must be an <id> of a 32-bit integer scalar that contains a mask. The rest of this description is about that mask.
Memory semantics define memory-order constraints and the storage classes to which those constraints apply. The memory order constrains the allowed orders in which memory operations in this invocation can be made visible to another invocation. The storage classes specify to which subsets of memory these constraints are to be applied. Storage classes not selected are not being constrained.
Despite being a mask and allowing multiple bits to be combined, at most one of the first four (low-order) bits can be set. Requesting both Acquire and Release semantics is done by setting the AcquireRelease bit, not by setting two bits.
This value is a mask; it can be formed by combining the bits from multiple rows in the table below.
Used by:
Value | Memory Semantics | Enabling Capabilities |
---|---|---|
0x0 | None (Relaxed) | |
0x2 | Acquire | |
0x4 | Release | |
0x8 | AcquireRelease | |
0x10 | SequentiallyConsistent | |
0x40 | UniformMemory | Shader |
0x80 | SubgroupMemory | |
0x100 | WorkgroupMemory | |
0x200 | CrossWorkgroupMemory | |
0x400 | AtomicCounterMemory | AtomicStorage |
0x800 | ImageMemory | |
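The rule that at most one memory-order bit may be set, while storage-class bits combine freely, might be checked like this. It is a sketch only: the function name and the returned tuple shape are choices made for this example.

```python
# Memory-order bits: at most one of these may be set in a mask.
ORDER_BITS = {
    0x2: "Acquire",
    0x4: "Release",
    0x8: "AcquireRelease",
    0x10: "SequentiallyConsistent",
}

# Storage-class bits: any combination may be set.
STORAGE_BITS = {
    0x40: "UniformMemory",
    0x80: "SubgroupMemory",
    0x100: "WorkgroupMemory",
    0x200: "CrossWorkgroupMemory",
    0x400: "AtomicCounterMemory",
    0x800: "ImageMemory",
}


def decode_memory_semantics(mask: int):
    """Return (memory order name, list of storage-class names) for a mask."""
    order = [name for bit, name in ORDER_BITS.items() if mask & bit]
    if len(order) > 1:
        raise ValueError("at most one memory-order bit can be set: " + ", ".join(order))
    storage = [name for bit, name in STORAGE_BITS.items() if mask & bit]
    return (order[0] if order else "None (Relaxed)", storage)
```

Note that a mask of `0x2 | 0x4` is rejected; acquire-and-release is requested with the single AcquireRelease bit `0x8`.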
3.26. Memory Access
Memory access semantics.
This value is a literal mask; it can be formed by combining the bits from multiple rows in the table below.
Used by:
Value | Memory Access |
---|---|
0x0 | None |
0x1 | Volatile |
0x2 | Aligned |
0x4 | Nontemporal |
3.27. Scope <id>
Must be an <id> of a 32-bit integer scalar whose value is one of the entries in the table below.
The execution scope or memory scope of an operation. When used as a memory scope, it specifies the distance of synchronization from the current invocation. When used as an execution scope, it specifies the set of executing invocations taking part in the operation. Used by:
Value | Scope |
---|---|
0 | CrossDevice |
1 | Device |
2 | Workgroup |
3 | Subgroup |
4 | Invocation |
3.28. Group Operation
Defines the class of workgroup or subgroup operation. Used by:
Value | Group Operation | Enabling Capabilities |
---|---|---|
0 | Reduce | Kernel |
1 | InclusiveScan | Kernel |
2 | ExclusiveScan | Kernel |
3.29. Kernel Enqueue Flags
Specify when the child kernel begins execution.
Note: Implementations are not required to honor this flag. Implementations may not schedule kernel launch earlier than the point specified by this flag, however. Used by OpEnqueueKernel.
Value | Kernel Enqueue Flags | Enabling Capabilities |
---|---|---|
0 | NoWait | Kernel |
1 | WaitKernel | Kernel |
2 | WaitWorkGroup | Kernel |
3.30. Kernel Profiling Info
Specify the profiling information to be queried. Used by OpCaptureEventProfilingInfo.
This value is a mask; it can be formed by combining the bits from multiple rows in the table below.
Value | Kernel Profiling Info | Enabling Capabilities |
---|---|---|
0x0 | None | |
0x1 | CmdExecTime | Kernel |
3.31. Capability
Capabilities a module can declare it uses. All used capabilities must be declared, either directly or through a dependency: all capabilities that a declared capability depends on are automatically implied.
The Depends On column lists the dependencies for each capability. These are the ones implicitly declared. It is not necessary (but allowed) to declare a dependency for a declared capability.
See the capabilities section for more detail.
Used by OpCapability.
Capability | Depends On | Enabled by Extension | |
---|---|---|---|
0 |
Matrix |
||
1 |
Shader |
Matrix |
|
2 |
Geometry |
Shader |
|
3 |
Tessellation |
Shader |
|
4 |
Addresses |
||
5 |
Linkage |
||
6 |
Kernel |
||
7 |
Vector16 |
Kernel |
|
8 |
Float16Buffer |
Kernel |
|
9 |
Float16 |
||
10 |
Float64 |
||
11 |
Int64 |
||
12 |
Int64Atomics |
Int64 |
|
13 |
ImageBasic |
Kernel |
|
14 |
ImageReadWrite |
ImageBasic |
|
15 |
ImageMipmap |
ImageBasic |
|
17 |
Pipes |
Kernel |
|
18 |
Groups |
||
19 |
DeviceEnqueue |
Kernel |
|
20 |
LiteralSampler |
Kernel |
|
21 |
AtomicStorage |
Shader |
|
22 |
Int16 |
||
23 |
TessellationPointSize |
Tessellation |
|
24 |
GeometryPointSize |
Geometry |
|
25 |
ImageGatherExtended |
Shader |
|
27 |
StorageImageMultisample |
Shader |
|
28 |
UniformBufferArrayDynamicIndexing |
Shader |
|
29 |
SampledImageArrayDynamicIndexing |
Shader |
|
30 |
StorageBufferArrayDynamicIndexing |
Shader |
|
31 |
StorageImageArrayDynamicIndexing |
Shader |
|
32 |
ClipDistance |
Shader |
|
33 |
CullDistance |
Shader |
|
34 |
ImageCubeArray |
SampledCubeArray |
|
35 |
SampleRateShading |
Shader |
|
36 |
SampledRect |
||
37 |
Shader |
||
38 |
GenericPointer |
Addresses |
|
39 |
Int8 |
Kernel |
|
40 |
InputAttachment |
Shader |
|
41 |
SparseResidency |
Shader |
|
42 |
MinLod |
Shader |
|
43 |
|||
44 |
Sampled1D |
||
45 |
SampledCubeArray |
Shader |
|
46 |
|||
47 |
SampledBuffer |
||
48 |
ImageMSArray |
Shader |
|
49 |
StorageImageExtendedFormats |
Shader |
|
50 |
ImageQuery |
Shader |
|
51 |
DerivativeControl |
Shader |
|
52 |
InterpolationFunction |
Shader |
|
53 |
TransformFeedback |
Shader |
|
54 |
GeometryStreams |
Geometry |
|
55 |
StorageImageReadWithoutFormat |
Shader |
|
56 |
StorageImageWriteWithoutFormat |
Shader |
|
57 |
MultiViewport |
Geometry |
|
4423 |
SubgroupBallotKHR |
SPV_KHR_shader_ballot |
|
4427 |
DrawParameters |
SPV_KHR_shader_draw_parameters |
|
4431 |
SubgroupVoteKHR |
SPV_KHR_subgroup_vote |
|
4433 |
StorageBuffer16BitAccess |
SPV_KHR_16bit_storage |
|
4433 |
StorageUniformBufferBlock16 |
SPV_KHR_16bit_storage |
|
4434 |
UniformAndStorageBuffer16BitAccess |
StorageBuffer16BitAccess, StorageUniformBufferBlock16 |
SPV_KHR_16bit_storage |
4434 |
StorageUniform16 |
StorageBuffer16BitAccess, StorageUniformBufferBlock16 |
SPV_KHR_16bit_storage |
4435 |
StoragePushConstant16 |
SPV_KHR_16bit_storage |
|
4436 |
StorageInputOutput16 |
SPV_KHR_16bit_storage |
|
4437 |
DeviceGroup |
SPV_KHR_device_group |
|
4439 |
MultiView |
Shader |
SPV_KHR_multiview |
4441 |
VariablePointersStorageBuffer |
Shader |
SPV_KHR_variable_pointers |
4442 |
VariablePointers |
VariablePointersStorageBuffer |
SPV_KHR_variable_pointers |
4445 |
AtomicStorageOps |
SPV_KHR_shader_atomic_counter_ops |
|
4447 |
SampleMaskPostDepthCoverage |
SPV_KHR_post_depth_coverage |
|
5009 |
ImageGatherBiasLodAMD |
Shader |
SPV_AMD_texture_gather_bias_lod |
5010 |
FragmentMaskAMD |
Shader |
SPV_AMD_shader_fragment_mask |
5013 |
StencilExportEXT |
Shader |
SPV_EXT_shader_stencil_export |
5015 |
ImageReadWriteLodAMD |
Shader |
SPV_AMD_shader_image_load_store_lod |
5249 |
SampleMaskOverrideCoverageNV |
SampleRateShading |
SPV_NV_sample_mask_override_coverage |
5251 |
GeometryShaderPassthroughNV |
Geometry |
SPV_NV_geometry_shader_passthrough |
5254 |
ShaderViewportIndexLayerEXT |
MultiViewport |
SPV_EXT_shader_viewport_index_layer |
5254 |
ShaderViewportIndexLayerNV |
MultiViewport |
SPV_NV_viewport_array2 |
5255 |
ShaderViewportMaskNV |
ShaderViewportIndexLayerNV |
SPV_NV_viewport_array2 |
5259 |
ShaderStereoViewNV |
ShaderViewportMaskNV |
SPV_NV_stereo_view_rendering |
5260 |
PerViewAttributesNV |
MultiView |
SPV_NVX_multiview_per_view_attributes |
5265 |
FragmentFullyCoveredEXT |
Shader |
SPV_EXT_fragment_fully_covered |
5568 |
SubgroupShuffleINTEL |
SPV_INTEL_subgroups |
|
5569 |
SubgroupBufferBlockIOINTEL |
SPV_INTEL_subgroups |
|
5570 |
SubgroupImageBlockIOINTEL |
SPV_INTEL_subgroups |
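Because declaring a capability implicitly declares everything it depends on, the full set of capabilities in effect is the transitive closure over the Depends On column. The sketch below shows this for a small excerpt of the table; the `DEPENDS_ON` dictionary is deliberately incomplete and the function name is hypothetical.

```python
# Excerpt of the Depends On column; capabilities not listed here are treated as having no dependencies.
DEPENDS_ON = {
    "Matrix": [],
    "Shader": ["Matrix"],
    "Geometry": ["Shader"],
    "Tessellation": ["Shader"],
    "Addresses": [],
    "GenericPointer": ["Addresses"],
    "GeometryStreams": ["Geometry"],
    "MultiViewport": ["Geometry"],
    "ShaderViewportIndexLayerNV": ["MultiViewport"],
    "ShaderViewportMaskNV": ["ShaderViewportIndexLayerNV"],
    "ShaderStereoViewNV": ["ShaderViewportMaskNV"],
}


def implied_capabilities(declared) -> set:
    """Return the declared capabilities plus everything they depend on, transitively."""
    seen = set()
    stack = list(declared)
    while stack:
        cap = stack.pop()
        if cap in seen:
            continue
        seen.add(cap)
        stack.extend(DEPENDS_ON.get(cap, []))
    return seen
```

Under this excerpt, declaring only GeometryStreams is enough to use features enabled by Geometry, Shader, and Matrix.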
3.32. Instructions
Form for each instruction:
Opcode Name |
Capability Enabling Capabilities |
||
Opcode |
Results |
Operands |
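As a rough illustration of how an entry in the following subsections maps to words, the sketch below encodes two fixed-size instructions. It reads the first number of each entry as the word count and the second as the opcode, and it assumes the first word packs the word count into the high-order 16 bits and the opcode into the low-order 16 bits, as laid out in the Physical Layout section (not reproduced here). The function name is hypothetical.

```python
def encode_instruction(opcode: int, operands) -> list:
    """Return the 32-bit words of one instruction: a header word followed by its operands."""
    operands = list(operands)
    word_count = 1 + len(operands)  # the header word counts toward the total
    if word_count > 0xFFFF or not 0 <= opcode <= 0xFFFF:
        raise ValueError("word count and opcode must each fit in 16 bits")
    return [(word_count << 16) | opcode] + operands


# OpCapability (word count 2, opcode 17) declaring Shader (Capability value 1).
op_capability_shader = encode_instruction(17, [1])

# OpMemoryModel (word count 3, opcode 14) with Logical (0) and GLSL450 (1).
op_memory_model = encode_instruction(14, [0, 1])
```

Instructions whose word count is listed as "N + variable" (for example, those carrying a Literal String) need the variable part encoded as additional operand words before calling such a helper.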
3.32.1. Miscellaneous Instructions
OpNop |
|
1 |
0 |
OpUndef |
|||
3 |
1 |
<id> |
3.32.2. Debug Instructions
OpSourceContinued |
||
2 + variable |
2 |
Literal String |
OpSource |
|||||
3 + variable |
3 |
Literal Number |
Optional |
Optional |
OpSourceExtension |
||
2 + variable |
4 |
Literal String |
OpName |
|||
3 + variable |
5 |
<id> |
Literal String |
OpMemberName |
||||
4 + variable |
6 |
<id> |
Literal Number |
Literal String |
OpString |
|||
3 + variable |
7 |
Literal String |
OpLine |
||||
4 |
8 |
<id> |
Literal Number |
Literal Number |
OpNoLine |
|
1 |
317 |
3.32.3. Annotation Instructions
OpDecorate |
||||
3 + variable |
71 |
<id> |
Literal, Literal, … |
OpMemberDecorate |
|||||
4 + variable |
72 |
<id> |
Literal Number |
Literal, Literal, … |
OpDecorationGroup |
||
2 |
73 |
OpGroupDecorate |
|||
2 + variable |
74 |
<id> |
<id>, <id>, … |
OpGroupMemberDecorate |
|||
2 + variable |
75 |
<id> |
<id>, literal, |
3.32.4. Extension Instructions
OpExtension |
||
2 + variable |
10 |
Literal String |
OpExtInstImport |
|||
3 + variable |
11 |
Literal String |
OpExtInst |
||||||
5 + variable |
12 |
<id> |
<id> |
Literal Number |
<id>, <id>, … |
3.32.5. Mode-Setting Instructions
OpMemoryModel |
|||
3 |
14 |
OpEntryPoint |
|||||
4 + variable |
15 |
<id> |
Literal String |
<id>, <id>, … |
OpExecutionMode |
||||
3 + variable |
16 |
<id> |
Execution Mode |
Optional literal(s) |
OpCapability |
||
2 |
17 |
Capability |
3.32.6. Type-Declaration Instructions
2 |
19 |
OpTypeBool |
||
2 |
20 |
OpTypeInt |
||||
4 |
21 |
Literal Number |
Literal Number |
OpTypeFloat |
|||
3 |
22 |
Literal Number |
OpTypeVector |
||||
4 |
23 |
<id> |
Literal Number |
OpTypeMatrix |
Capability: |
|||
4 |
24 |
<id> |
Literal Number |
OpTypeImage |
||||||||||
9 + variable |
25 |
<id> |
Literal Number |
Literal Number |
Literal Number |
Optional |
OpTypeSampler |
||
2 |
26 |
OpTypeSampledImage |
|||
3 |
27 |
<id> |
OpTypeArray |
||||
4 |
28 |
<id> |
<id> |
OpTypeRuntimeArray |
Capability: |
||
3 |
29 |
<id> |
OpTypeStruct |
|||
2 + variable |
30 |
<id>, <id>, … |
OpTypeOpaque |
Capability: |
||
3 + variable |
31 |
Literal String |
OpTypePointer |
||||
4 |
32 |
<id> |
OpTypeFunction |
||||
3 + variable |
33 |
<id> |
<id>, <id>, … |
Capability: |
||
2 |
34 |
Capability: |
||
2 |
35 |
Capability: |
||
2 |
36 |
Capability: |
||
2 |
37 |
OpTypePipe |
Capability: |
||
3 |
38 |
Access Qualifier |
OpTypeForwardPointer |
Capability: |
||
3 |
39 |
<id> |
3.32.7. Constant-Creation Instructions
OpConstantTrue |
|||
3 |
41 |
<id> |
OpConstantFalse |
|||
3 |
42 |
<id> |
OpConstant |
||||
3 + variable |
43 |
<id> |
Literal, Literal, … |
OpConstantComposite |
||||
3 + variable |
44 |
<id> |
<id>, <id>, … |
OpConstantSampler |
Capability: |
|||||
6 |
45 |
<id> |
Literal Number |
OpConstantNull |
|||
3 |
46 |
<id> |
OpSpecConstantTrue |
|||
3 |
48 |
<id> |
OpSpecConstantFalse |
|||
3 |
49 |
<id> |
OpSpecConstant |
||||
3 + variable |
50 |
<id> |
Literal, Literal, … |
OpSpecConstantComposite |
||||
3 + variable |
51 |
<id> |
<id>, <id>, … |
OpSpecConstantOp |
|||||
4 + variable |
52 |
<id> |
Literal Number |
<id>, <id>, … |
3.32.8. Memory Instructions
OpVariable |
|||||
4 + variable |
59 |
<id> |
Optional |
OpImageTexelPointer |
||||||
6 |
60 |
<id> |
<id> |
<id> |
<id> |
OpLoad |
|||||
4 + variable |
61 |
<id> |
<id> |
Optional |
OpStore |
||||
3 + variable |
62 |
<id> |
<id> |
Optional |
OpCopyMemory |
||||
3 + variable |
63 |
<id> |
<id> |
Optional |
OpCopyMemorySized |
Capability: |
||||
4 + variable |
64 |
<id> |
<id> |
<id> |
Optional |
OpAccessChain |
|||||
4 + variable |
65 |
<id> |
<id> |
<id>, <id>, … |
OpInBoundsAccessChain |
|||||
4 + variable |
66 |
<id> |
<id> |
<id>, <id>, … |
OpPtrAccessChain |
Capability: |
|||||
5 + variable |
67 |
<id> |
<id> |
<id> |
<id>, <id>, … |
OpArrayLength |
Capability: |
||||
5 |
68 |
<id> |
<id> |
Literal Number |
OpGenericPtrMemSemantics |
Capability: |
|||
4 |
69 |
<id> |
<id> |
OpInBoundsPtrAccessChain |
Capability: |
|||||
5 + variable |
70 |
<id> |
<id> |
<id> |
<id>, <id>, … |
3.32.9. Function Instructions
OpFunction |
|||||
5 |
54 |
<id> |
<id> |
OpFunctionParameter |
|||
3 |
55 |
<id> |
1 |
56 |
OpFunctionCall |
|||||
4 + variable |
57 |
<id> |
<id> |
<id>, <id>, … |
3.32.10. Image Instructions
OpSampledImage |
|||||
5 |
86 |
<id> |
<id> |
<id> |
OpImageSampleImplicitLod |
Capability: |
||||||
5 + variable |
87 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSampleExplicitLod |
||||||||
7 + variable |
88 |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageSampleDrefImplicitLod |
Capability: |
|||||||
6 + variable |
89 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSampleDrefExplicitLod |
Capability: |
||||||||
8 + variable |
90 |
<id> |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageSampleProjImplicitLod |
Capability: |
||||||
5 + variable |
91 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSampleProjExplicitLod |
Capability: |
|||||||
7 + variable |
92 |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageSampleProjDrefImplicitLod |
Capability: |
|||||||
6 + variable |
93 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSampleProjDrefExplicitLod |
Capability: |
||||||||
8 + variable |
94 |
<id> |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageFetch |
|||||||
5 + variable |
95 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageGather |
Capability: |
|||||||
6 + variable |
96 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageDrefGather |
Capability: |
|||||||
6 + variable |
97 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageRead |
|||||||
5 + variable |
98 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageWrite |
||||||
4 + variable |
99 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImage |
||||
4 |
100 |
<id> |
<id> |
OpImageQueryFormat |
Capability: |
|||
4 |
101 |
<id> |
<id> |
OpImageQueryOrder |
Capability: |
|||
4 |
102 |
<id> |
<id> |
OpImageQuerySizeLod |
Capability: |
||||
5 |
103 |
<id> |
<id> |
<id> |
OpImageQuerySize |
Capability: |
|||
4 |
104 |
<id> |
<id> |
OpImageQueryLod |
Capability: |
||||
5 |
105 |
<id> |
<id> |
<id> |
OpImageQueryLevels |
Capability: |
|||
4 |
106 |
<id> |
<id> |
OpImageQuerySamples |
Capability: |
|||
4 |
107 |
<id> |
<id> |
OpImageSparseSampleImplicitLod |
Capability: |
||||||
5 + variable |
305 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSparseSampleExplicitLod |
Capability: |
|||||||
7 + variable |
306 |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageSparseSampleDrefImplicitLod |
Capability: |
|||||||
6 + variable |
307 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSparseSampleDrefExplicitLod |
Capability: |
||||||||
8 + variable |
308 |
<id> |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageSparseSampleProjImplicitLod |
Capability: |
||||||
5 + variable |
309 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSparseSampleProjExplicitLod |
Capability: |
|||||||
7 + variable |
310 |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageSparseSampleProjDrefImplicitLod |
Capability: |
|||||||
6 + variable |
311 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSparseSampleProjDrefExplicitLod |
Capability: |
||||||||
8 + variable |
312 |
<id> |
<id> |
<id> |
<id> |
<id> |
Optional |
OpImageSparseFetch |
Capability: |
||||||
5 + variable |
313 |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSparseGather |
Capability: |
|||||||
6 + variable |
314 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSparseDrefGather |
Capability: |
|||||||
6 + variable |
315 |
<id> |
<id> |
<id> |
<id> |
Optional |
Optional |
OpImageSparseTexelsResident |
Capability: |
|||
4 |
316 |
<id> |
<id> |
OpImageSparseRead |
Capability: |
||||||
5 + variable |
320 |
<id> |
<id> |
<id> |
Optional |
Optional |
3.32.11. Conversion Instructions
OpConvertFToU |
||||
4 |
109 |
<id> |
<id> |
OpConvertFToS |
||||
4 |
110 |
<id> |
<id> |
OpConvertSToF |
||||
4 |
111 |
<id> |
<id> |
OpConvertUToF |
||||
4 |
112 |
<id> |
<id> |
OpUConvert |
||||
4 |
113 |
<id> |
<id> |
OpSConvert |
||||
4 |
114 |
<id> |
<id> |
OpFConvert |
||||
4 |
115 |
<id> |
<id> |
OpQuantizeToF16 |
Capability: |
|||
4 |
116 |
<id> |
<id> |
OpConvertPtrToU |
Capability: |
|||
4 |
117 |
<id> |
<id> |
OpSatConvertSToU |
Capability: |
|||
4 |
118 |
<id> |
<id> |
OpSatConvertUToS |
Capability: |
|||
4 |
119 |
<id> |
<id> |
OpConvertUToPtr |
Capability: |
|||
4 |
120 |
<id> |
<id> |
OpPtrCastToGeneric |
Capability: |
|||
4 |
121 |
<id> |
<id> |
OpGenericCastToPtr |
Capability: |
|||
4 |
122 |
<id> |
<id> |
OpGenericCastToPtrExplicit |
Capability: |
||||
5 |
123 |
<id> |
<id> |
Storage Class |
OpBitcast |
||||
4 |
124 |
<id> |
<id> |
3.32.12. Composite Instructions
OpVectorExtractDynamic |
|||||
5 |
77 |
<id> |
<id> |
<id> |
OpVectorInsertDynamic |
||||||
6 |
78 |
<id> |
<id> |
<id> |
<id> |
OpVectorShuffle |
||||||
5 + variable |
79 |
<id> |
<id> |
<id> |
Literal, Literal, … |
OpCompositeConstruct |
||||
3 + variable |
80 |
<id> |
<id>, <id>, … |
OpCompositeExtract |
|||||
4 + variable |
81 |
<id> |
<id> |
Literal, Literal, … |
OpCompositeInsert |
||||||
5 + variable |
82 |
<id> |
<id> |
<id> |
Literal, Literal, … |
OpCopyObject |
||||
4 |
83 |
<id> |
<id> |
OpTranspose |
Capability: |
|||
4 |
84 |
<id> |
<id> |
3.32.13. Arithmetic Instructions
OpSNegate |
||||
4 |
126 |
<id> |
<id> |
OpFNegate |
||||
4 |
127 |
<id> |
<id> |
OpIAdd |
|||||
5 |
128 |
<id> |
<id> |
<id> |
OpFAdd |
|||||
5 |
129 |
<id> |
<id> |
<id> |
OpISub |
|||||
5 |
130 |
<id> |
<id> |
<id> |
OpFSub |
|||||
5 |
131 |
<id> |
<id> |
<id> |
OpIMul |
|||||
5 |
132 |
<id> |
<id> |
<id> |
OpFMul |
|||||
5 |
133 |
<id> |
<id> |
<id> |
OpUDiv |
|||||
5 |
134 |
<id> |
<id> |
<id> |
OpSDiv |
|||||
5 |
135 |
<id> |
<id> |
<id> |
OpFDiv |
|||||
5 |
136 |
<id> |
<id> |
<id> |
OpUMod |
|||||
5 |
137 |
<id> |
<id> |
<id> |
OpSRem |
|||||
5 |
138 |
<id> |
<id> |
<id> |
OpSMod |
|||||
5 |
139 |
<id> |
<id> |
<id> |
OpFRem |
|||||
5 |
140 |
<id> |
<id> |
<id> |
OpFMod |
|||||
5 |
141 |
<id> |
<id> |
<id> |
OpVectorTimesScalar |
|||||
5 |
142 |
<id> |
<id> |
<id> |
OpMatrixTimesScalar |
Capability: |
||||
5 |
143 |
<id> |
<id> |
<id> |
OpVectorTimesMatrix |
Capability: |
||||
5 |
144 |
<id> |
<id> |
<id> |
OpMatrixTimesVector |
Capability: |
||||
5 |
145 |
<id> |
<id> |
<id> |
OpMatrixTimesMatrix |
Capability: |
||||
5 |
146 |
<id> |
<id> |
<id> |
OpOuterProduct |
Capability: |
||||
5 |
147 |
<id> |
<id> |
<id> |
OpDot |
|||||
5 |
148 |
<id> |
<id> |
<id> |
OpIAddCarry |
|||||
5 |
149 |
<id> |
<id> |
<id> |
OpISubBorrow |
|||||
5 |
150 |
<id> |
<id> |
<id> |
OpUMulExtended |
|||||
5 |
151 |
<id> |
<id> |
<id> |
OpSMulExtended |
|||||
5 |
152 |
<id> |
<id> |
<id> |
3.32.14. Bit Instructions
OpShiftRightLogical |
|||||
5 |
194 |
<id> |
<id> |
<id> |
OpShiftRightArithmetic |
|||||
5 |
195 |
<id> |
<id> |
<id> |
OpShiftLeftLogical |
|||||
5 |
196 |
<id> |
<id> |
<id> |
OpBitwiseOr |
|||||
5 |
197 |
<id> |
<id> |
<id> |
OpBitwiseXor |
|||||
5 |
198 |
<id> |
<id> |
<id> |
OpBitwiseAnd |
|||||
5 |
199 |
<id> |
<id> |
<id> |
OpNot |
||||
4 |
200 |
<id> |
<id> |
OpBitFieldInsert |
Capability: |
||||||
7 |
201 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpBitFieldSExtract |
Capability: |
|||||
6 |
202 |
<id> |
<id> |
<id> |
<id> |
OpBitFieldUExtract |
Capability: |
|||||
6 |
203 |
<id> |
<id> |
<id> |
<id> |
OpBitReverse |
Capability: |
|||
4 |
204 |
<id> |
<id> |
OpBitCount |
||||
4 |
205 |
<id> |
<id> |
3.32.15. Relational and Logical Instructions
OpAny |
||||
4 |
154 |
<id> |
<id> |
OpAll |
||||
4 |
155 |
<id> |
<id> |
OpIsNan |
||||
4 |
156 |
<id> |
<id> |
OpIsInf |
||||
4 |
157 |
<id> |
<id> |
OpIsFinite |
Capability: |
|||
4 |
158 |
<id> |
<id> |
OpIsNormal |
Capability: |
|||
4 |
159 |
<id> |
<id> |
OpSignBitSet |
Capability: |
|||
4 |
160 |
<id> |
<id> |
OpLessOrGreater |
Capability: |
||||
5 |
161 |
<id> |
<id> |
<id> |
OpOrdered |
Capability: |
||||
5 |
162 |
<id> |
<id> |
<id> |
OpUnordered |
Capability: |
||||
5 |
163 |
<id> |
<id> |
<id> |
OpLogicalEqual |
|||||
5 |
164 |
<id> |
<id> |
<id> |
OpLogicalNotEqual |
|||||
5 |
165 |
<id> |
<id> |
<id> |
OpLogicalOr |
|||||
5 |
166 |
<id> |
<id> |
<id> |
OpLogicalAnd |
|||||
5 |
167 |
<id> |
<id> |
<id> |
OpLogicalNot |
||||
4 |
168 |
<id> |
<id> |
OpSelect |
||||||
6 |
169 |
<id> |
<id> |
<id> |
<id> |
OpIEqual |
|||||
5 |
170 |
<id> |
<id> |
<id> |
OpINotEqual |
|||||
5 |
171 |
<id> |
<id> |
<id> |
OpUGreaterThan |
|||||
5 |
172 |
<id> |
<id> |
<id> |
OpSGreaterThan |
|||||
5 |
173 |
<id> |
<id> |
<id> |
OpUGreaterThanEqual |
|||||
5 |
174 |
<id> |
<id> |
<id> |
OpSGreaterThanEqual |
|||||
5 |
175 |
<id> |
<id> |
<id> |
OpULessThan |
|||||
5 |
176 |
<id> |
<id> |
<id> |
OpSLessThan |
|||||
5 |
177 |
<id> |
<id> |
<id> |
OpULessThanEqual |
|||||
5 |
178 |
<id> |
<id> |
<id> |
OpSLessThanEqual |
|||||
5 |
179 |
<id> |
<id> |
<id> |
OpFOrdEqual |
|||||
5 |
180 |
<id> |
<id> |
<id> |
OpFUnordEqual |
|||||
5 |
181 |
<id> |
<id> |
<id> |
OpFOrdNotEqual |
|||||
5 |
182 |
<id> |
<id> |
<id> |
OpFUnordNotEqual |
|||||
5 |
183 |
<id> |
<id> |
<id> |
OpFOrdLessThan |
|||||
5 |
184 |
<id> |
<id> |
<id> |
OpFUnordLessThan |
|||||
5 |
185 |
<id> |
<id> |
<id> |
OpFOrdGreaterThan |
|||||
5 |
186 |
<id> |
<id> |
<id> |
OpFUnordGreaterThan |
|||||
5 |
187 |
<id> |
<id> |
<id> |
OpFOrdLessThanEqual |
|||||
5 |
188 |
<id> |
<id> |
<id> |
OpFUnordLessThanEqual |
|||||
5 |
189 |
<id> |
<id> |
<id> |
OpFOrdGreaterThanEqual |
|||||
5 |
190 |
<id> |
<id> |
<id> |
OpFUnordGreaterThanEqual |
|||||
5 |
191 |
<id> |
<id> |
<id> |
3.32.16. Derivative Instructions
OpDPdx |
Capability: |
|||
4 |
207 |
<id> |
<id> |
OpDPdy |
Capability: |
|||
4 |
208 |
<id> |
<id> |
OpFwidth |
Capability: |
|||
4 |
209 |
<id> |
<id> |
OpDPdxFine |
Capability: |
|||
4 |
210 |
<id> |
<id> |
OpDPdyFine |
Capability: |
|||
4 |
211 |
<id> |
<id> |
OpFwidthFine |
Capability: |
|||
4 |
212 |
<id> |
<id> |
OpDPdxCoarse |
Capability: |
|||
4 |
213 |
<id> |
<id> |
OpDPdyCoarse |
Capability: |
|||
4 |
214 |
<id> |
<id> |
OpFwidthCoarse |
Capability: |
|||
4 |
215 |
<id> |
<id> |
3.32.17. Control-Flow Instructions
OpPhi |
||||
3 + variable |
245 |
<id> |
<id>, <id>, … |
OpLoopMerge |
||||
4 |
246 |
<id> |
<id> |
OpSelectionMerge |
|||
3 |
247 |
<id> |
OpBranch |
||
2 |
249 |
<id> |
OpBranchConditional |
|||||
4 + variable |
250 |
<id> |
<id> |
<id> |
Literal, Literal, … |
OpSwitch |
||||
3 + variable |
251 |
<id> |
<id> |
literal, label <id>, |
OpKill |
Capability: |
1 |
252 |
OpReturn |
|
1 |
253 |
OpReturnValue |
||
2 |
254 |
<id> |
OpUnreachable |
|
1 |
255 |
OpLifetimeStart |
Capability: |
||
3 |
256 |
<id> |
Literal Number |
OpLifetimeStop |
Capability: |
||
3 |
257 |
<id> |
Literal Number |
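The Word Count and Opcode columns in these tables combine into the first word of each encoded instruction: the high-order 16 bits hold the word count and the low-order 16 bits hold the opcode. The following is a minimal sketch of that packing, using the OpPhi row above (word count "3 + variable", opcode 245) as the example. The helper names `first_word` and `phi_word_count` are hypothetical and are not part of any SPIR-V header.

```python
def first_word(word_count: int, opcode: int) -> int:
    """Pack an instruction's word count and opcode into its first word."""
    if not (0 < word_count <= 0xFFFF and 0 <= opcode <= 0xFFFF):
        raise ValueError("word count and opcode must each fit in 16 bits")
    return (word_count << 16) | opcode


def phi_word_count(num_incoming: int) -> int:
    """Word count of OpPhi: 3 fixed words plus one (value, parent) pair per incoming edge."""
    return 3 + 2 * num_incoming


OP_PHI = 245  # opcode taken from the table above

# An OpPhi merging values from two parent blocks occupies 7 words.
word = first_word(phi_word_count(2), OP_PHI)
print(hex(word))
```

Fixed-size instructions use the tabulated word count directly, for example `first_word(1, 253)` for OpReturn.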
3.32.18. Atomic Instructions
OpAtomicLoad |
||||||
6 |
227 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
OpAtomicStore |
|||||
5 |
228 |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicExchange |
|||||||
7 |
229 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicCompareExchange |
|||||||||
9 |
230 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
Memory Semantics <id> |
<id> |
<id> |
OpAtomicCompareExchangeWeak |
Capability: |
||||||||
9 |
231 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
Memory Semantics <id> |
<id> |
<id> |
OpAtomicIIncrement |
||||||
6 |
232 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
OpAtomicIDecrement |
||||||
6 |
233 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
OpAtomicIAdd |
|||||||
7 |
234 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicISub |
|||||||
7 |
235 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicSMin |
|||||||
7 |
236 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicUMin |
|||||||
7 |
237 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicSMax |
|||||||
7 |
238 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicUMax |
|||||||
7 |
239 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicAnd |
|||||||
7 |
240 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicOr |
|||||||
7 |
241 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicXor |
|||||||
7 |
242 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
<id> |
OpAtomicFlagTestAndSet |
Capability: |
|||||
6 |
318 |
<id> |
<id> |
Scope <id> |
Memory Semantics <id> |
OpAtomicFlagClear |
Capability: |
|||
4 |
319 |
<id> |
Scope <id> |
Memory Semantics <id> |
3.32.19. Primitive Instructions
OpEmitVertex |
Capability: |
1 |
218 |
OpEndPrimitive |
Capability: |
1 |
219 |
OpEmitStreamVertex |
Capability: |
|
2 |
220 |
<id> |
OpEndStreamPrimitive |
Capability: |
|
2 |
221 |
<id> |
3.32.20. Barrier Instructions
OpControlBarrier |
||||
4 |
224 |
Scope <id> |
Scope <id> |
Memory Semantics <id> |
OpMemoryBarrier |
|||
3 |
225 |
Scope <id> |
Memory Semantics <id> |
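Because every instruction starts with its own word count, a consumer can walk an instruction stream without knowing each opcode's operand layout. The sketch below assumes the stream has already been read into a list of 32-bit integers and that the five-word module header has been skipped. The function name `split_instructions` and the operand ids are illustrative only; the word counts and opcodes come from the two barrier rows above.

```python
from typing import Iterator, List, Tuple


def split_instructions(words: List[int]) -> Iterator[Tuple[int, List[int]]]:
    """Yield (opcode, operand_words) for each instruction in a word stream."""
    i = 0
    while i < len(words):
        word_count = words[i] >> 16      # high-order 16 bits
        opcode = words[i] & 0xFFFF       # low-order 16 bits
        if word_count == 0 or i + word_count > len(words):
            raise ValueError(f"malformed instruction at word {i}")
        yield opcode, words[i + 1 : i + word_count]
        i += word_count


stream = [
    0x000400E0, 10, 11, 12,  # OpControlBarrier: word count 4, opcode 224
    0x000300E1, 11, 12,      # OpMemoryBarrier: word count 3, opcode 225
]

for opcode, operands in split_instructions(stream):
    print(opcode, operands)
```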
3.32.21. Group Instructions
OpGroupAsyncCopy |
Capability: |
||||||||
9 |
259 |
<id> |
Scope <id> |
<id> |
<id> |
<id> |
<id> |
<id> |
OpGroupWaitEvents |
Capability: |
|||
4 |
260 |
Scope <id> |
<id> |
<id> |
OpGroupAll |
Capability: |
||||
5 |
261 |
<id> |
Scope <id> |
<id> |
OpGroupAny |
Capability: |
||||
5 |
262 |
<id> |
Scope <id> |
<id> |
OpGroupBroadcast |
Capability: |
|||||
6 |
263 |
<id> |
Scope <id> |
<id> |
<id> |
OpGroupIAdd |
Capability: |
|||||
6 |
264 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupFAdd |
Capability: |
|||||
6 |
265 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupFMin |
Capability: |
|||||
6 |
266 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupUMin |
Capability: |
|||||
6 |
267 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupSMin |
Capability: |
|||||
6 |
268 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupFMax |
Capability: |
|||||
6 |
269 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupUMax |
Capability: |
|||||
6 |
270 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupSMax |
Capability: |
|||||
6 |
271 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpSubgroupBallotKHR |
Capability: |
||||
4 |
4421 |
<id> |
<id> |
OpSubgroupFirstInvocationKHR |
Capability: |
|||
4 |
4422 |
<id> |
<id> |
OpSubgroupReadInvocationKHR |
Capability: |
||||
5 |
4432 |
<id> |
<id> |
<id> |
OpGroupIAddNonUniformAMD |
Capability: |
||||||
6 |
5000 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupFAddNonUniformAMD |
Capability: |
||||||
6 |
5001 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupFMinNonUniformAMD |
Capability: |
||||||
6 |
5002 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupUMinNonUniformAMD |
Capability: |
||||||
6 |
5003 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupSMinNonUniformAMD |
Capability: |
||||||
6 |
5004 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupFMaxNonUniformAMD |
Capability: |
||||||
6 |
5005 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupUMaxNonUniformAMD |
Capability: |
||||||
6 |
5006 |
<id> |
Scope <id> |
Group Operation |
<id> |
OpGroupSMaxNonUniformAMD |
Capability: |
||||||
6 |
5007 |
<id> |
Scope <id> |
Group Operation |
<id> |
3.32.22. Device-Side Enqueue Instructions
OpEnqueueMarker |
Capability: |
||||||
7 |
291 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpEnqueueKernel |
Capability: |
|||||||||||||
13 + variable |
292 |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id>, <id>, … |
OpGetKernelNDrangeSubGroupCount |
Capability: |
|||||||
8 |
293 |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
OpGetKernelNDrangeMaxSubGroupSize |
Capability: |
|||||||
8 |
294 |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
OpGetKernelWorkGroupSize |
Capability: |
||||||
7 |
295 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpGetKernelPreferredWorkGroupSizeMultiple |
Capability: |
||||||
7 |
296 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpRetainEvent |
Capability: |
|
2 |
297 |
<id> |
OpReleaseEvent |
Capability: |
|
2 |
298 |
<id> |
OpCreateUserEvent |
Capability: |
||
3 |
299 |
<id> |
OpIsValidEvent |
Capability: |
|||
4 |
300 |
<id> |
<id> |
OpSetUserEventStatus |
Capability: |
||
3 |
301 |
<id> |
<id> |
OpCaptureEventProfilingInfo |
Capability: |
|||
4 |
302 |
<id> |
<id> |
<id> |
OpGetDefaultQueue |
Capability: |
||
3 |
303 |
<id> |
OpBuildNDRange |
Capability: |
|||||
6 |
304 |
<id> |
<id> |
<id> |
<id> |
3.32.23. Pipe Instructions
OpReadPipe |
Capability: |
||||||
7 |
274 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpWritePipe |
Capability: |
||||||
7 |
275 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpReservedReadPipe |
Capability: |
||||||||
9 |
276 |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
OpReservedWritePipe |
Capability: |
||||||||
9 |
277 |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
<id> |
OpReserveReadPipePackets |
Capability: |
||||||
7 |
278 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpReserveWritePipePackets |
Capability: |
||||||
7 |
279 |
<id> |
<id> |
<id> |
<id> |
<id> |
OpCommitReadPipe |
Capability: |
||||
5 |
280 |
<id> |
<id> |
<id> |
<id> |
OpCommitWritePipe |
Capability: |
||||
5 |
281 |
<id> |
<id> |
<id> |
<id> |
OpIsValidReserveId |
Capability: |
|||
4 |
282 |
<id> |
<id> |
OpGetNumPipePackets |
Capability: |
|||||
6 |
283 |
<id> |
<id> |
<id> |
<id> |
OpGetMaxPipePackets |
Capability: |
|||||
6 |
284 |
<id> |
<id> |
<id> |
<id> |
OpGroupReserveReadPipePackets |
Capability: |
|||||||
8 |
285 |
<id> |
Scope <id> |
<id> |
<id> |
<id> |
<id> |
OpGroupReserveWritePipePackets |
Capability: |
|||||||
8 |
286 |
<id> |
Scope <id> |
<id> |
<id> |
<id> |
<id> |
OpGroupCommitReadPipe |
Capability: |
|||||
6 |
287 |
Scope <id> |
<id> |
<id> |
<id> |
<id> |
OpGroupCommitWritePipe |
Capability: |
|||||
6 |
288 |
Scope <id> |
<id> |
<id> |
<id> |
<id> |
4. Appendix A: Changes
4.1. Changes from Version 0.99, Revision 31
- Added the PushConstant Storage Class.
- Added OpIAddCarry, OpISubBorrow, OpUMulExtended, and OpSMulExtended.
- Added OpInBoundsPtrAccessChain.
- Added the Decoration NoContraction to prevent combining multiple operations into a single operation (bug 14396).
- Added sparse texturing (14486):
  - Added OpImageSparse… for accessing images that might not be resident.
  - Added MinLod functionality for accessing images with a minimum level of detail.
- Added back the Alignment Decoration, for the Kernel capability (14505).
- Added a NonTemporal Memory Access (14566).
- Structured control flow changes:
  - Changed structured loops to have a structured Continue Target in OpLoopMerge (14422).
  - Added rules for how "fall through" works with OpSwitch (13579).
  - Added definitions for what is "inside" a structured control-flow construct (14422).
- Added SubpassData Dim to support input targets written by a previous subpass as an output target (14304). This is also a Decoration and a Capability, and can be used by some image ops to read the input target.
- Added OpTypeForwardPointer to establish the Storage Class of a forward reference to a pointer type (13822).
- Improved Debuggability
  - Changed OpLine to not have a target <id>, but instead be placed immediately preceding the instruction(s) it is annotating (13905).
  - Added OpNoLine to terminate the effect of OpLine (13905).
  - Changed OpSource to include the source code:
    - Allow multiple occurrences.
    - Be mixed in with the OpString instructions.
    - Optionally consume an OpString result to say which file it is annotating.
    - Optionally include the source text corresponding to that OpString.
    - Included adding OpSourceContinued for source text that is too long for a single instruction.
- Added a large number of Capabilities for subsetting functionality (14520, 14453), including 8-bit integer support for OpenCL kernels.
- Added VertexIndex and InstanceIndex BuiltIn Decorations (14255).
- Added GenericPointer capability that allows the ability to use the Generic Storage Class (14287).
- Added IndependentForwardProgress Execution Mode (14271).
- Added OpAtomicFlagClear and OpAtomicFlagTestAndSet instructions (14315).
- Changed OpEntryPoint to take a list of Input and Output <id> for declaring the entry point’s interface.
- Fixed internal bugs
  - 14411 Added missing documentation for mad_sat OpenCL extended instructions (enums existed, just the documentation was missing)
  - 14241 Removed shader capability requirement from OpImageQueryLevels and OpImageQuerySamples.
  - 14241 Removed unneeded OpImageQueryDim instruction.
  - 14241 Filled in TBD section for OpAtomicCompareExchangeWeak
  - 14366 All OpSampledImage must appear before uses of sampled images (and still in the first block of the entry point).
  - 14450 DeviceEnqueue capability is required for OpTypeQueue and OpTypeDeviceEvent
  - 14363 OpTypePipe is opaque - moved packet size and alignment to opcodes
  - 14367 Float16Buffer capability clarified
  - 14241 Clarified how OpSampledImage can be used
  - 14402 Clarified OpTypeImage encodings for OpenCL extended instructions
  - 14569 Removed mention of non-existent OpFunctionDecl
  - 14372 Clarified usage of OpGenericPtrMemSemantics
  - 13801 Clarified the SpecId Decoration is just for constants
  - 14447 Changed literal values of Memory Semantic enums to match OpenCL/C++11 atomics, and made the Memory Semantic None and Relaxed be aliases
  - 14637 Removed subgroup scope from OpGroupAsyncCopy and OpGroupWaitEvents
4.2. Changes from Version 0.99, Revision 32
- Added UnormInt101010_2 to the Image Channel Data Type table.
- Added placeholder for C++11 atomic Consume Memory Semantics along with an explicit AcquireRelease memory semantic.
- Fixed internal bugs:
  - 14690 OpSwitch literal width (and hence number of operands) is determined by the type of Selector, and be rigorous about how sub-32-bit literals are stored.
  - 14485 The client API owns the semantics of built-ins that only have "pass through" semantics WRT SPIR-V.
- Fixed public bugs:
  - 1387 Don’t describe result type of OpImageWrite.
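Item 14690 above ties the number of OpSwitch operand words to the bit width of the Selector's type. As a rough sketch, assuming each case literal occupies one word for widths up to 32 bits and two words for 64-bit selectors, and that each case also carries one label <id>, the total word count can be computed as follows. The function name `switch_word_count` is hypothetical.

```python
def switch_word_count(selector_bits: int, num_cases: int) -> int:
    """Total words in an OpSwitch with the given selector width and case count.

    3 fixed words: opcode/word-count word, Selector <id>, Default <id>.
    Each case adds its literal (1 or 2 words) plus one label <id>.
    """
    if selector_bits not in (8, 16, 32, 64):
        raise ValueError("unexpected selector width")
    literal_words = (selector_bits + 31) // 32  # sub-32-bit literals still take one word
    return 3 + num_cases * (literal_words + 1)


print(switch_word_count(32, 2))
print(switch_word_count(64, 2))
```

This matches the "3 + variable" word count listed for OpSwitch: with no cases, only the three fixed words remain.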
4.3. Changes from Version 1.00, Revision 1
- Adjusted Capabilities:
  - Split geometry-stream functionality into its own GeometryStreams capability (14873).
  - Have InputAttachmentIndex depend on InputAttachment instead of Shader (14797).
  - Merge AdvancedFormats and StorageImageExtendedFormats into just StorageImageExtendedFormats (14824).
  - Require StorageImageReadWithoutFormat and StorageImageWriteWithoutFormat to read and write storage images with an Unknown Image Format.
  - Removed the ImageSRGBWrite capability.
- Clarifications
  - RelaxedPrecision Decoration can be applied to OpFunction (14662).
- Fixed internal bugs:
  - 14797 The literal argument was missing for the InputAttachmentIndex Decoration.
  - 14547 Remove the FragColor BuiltIn, so that no implicit broadcast is implied.
  - 13292 Make statements about "Volatile" be more consistent with the memory model specification (non-functional change).
  - 14948 Remove image-"Query" overloading on image/sampled-image type and "fetch" on non-sampled images, by adding the OpImage instruction to get the image from a sampled image.
  - 14949 Make consistent placement between OpSource and OpSourceExtension in the logical layout of a module.
  - 14865 Merge WorkgroupLinearId with LocalInvocationId BuiltIn Decorations.
  - 14806 Include 3D images for OpImageQuerySize.
  - 14325 Removed the Smooth Decoration.
  - 12771 Make the version word formatted as: "0 | Major Number | Minor Number | 0" in the physical layout.
  - 15035 Allow OpTypeImage to use a Depth operand of 2 for not indicating a depth or non-depth image.
  - 15009 Split the OpenCL Source Language into two: OpenCL_C and OpenCL_CPP.
  - 14683 OpSampledImage instructions can only be in the consuming block, for scalars, and directly consumed by an image lookup or query instruction.
  - 14325 Mutual exclusion validation rules of Execution Modes and Decorations.
  - 15112 Add definitions for invocation, dynamically uniform, and uniform control flow.
- Renames
  - InputTargetIndex Decoration → InputAttachmentIndex
  - InputTarget Capability → InputAttachment
  - InputTarget Dim → SubpassData
  - WorkgroupLocal Storage Class → Workgroup
  - WorkgroupGlobal Storage Class → CrossWorkgroup
  - PrivateGlobal Storage Class → Private
  - OpAsyncGroupCopy → OpGroupAsyncCopy
  - OpWaitGroupEvents → OpGroupWaitEvents
  - InputTriangles Execution Mode → Triangles
  - InputQuads Execution Mode → Quads
  - InputIsolines Execution Mode → Isolines
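Item 12771 above fixes the layout of the version word as four bytes, from high-order to low-order: 0, major number, minor number, 0. A minimal sketch of packing and unpacking that word follows; the function names are illustrative and do not come from any SPIR-V header.

```python
from typing import Tuple


def pack_version(major: int, minor: int) -> int:
    """Build the version word laid out as 0 | major | minor | 0."""
    if not (0 <= major <= 0xFF and 0 <= minor <= 0xFF):
        raise ValueError("major and minor must each fit in one byte")
    return (major << 16) | (minor << 8)


def unpack_version(word: int) -> Tuple[int, int]:
    """Return (major, minor), rejecting words whose outer bytes are not zero."""
    if word & 0xFF0000FF:
        raise ValueError("high-order and low-order bytes must be zero")
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF


print(hex(pack_version(1, 0)))
print(unpack_version(0x00010000))
```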
4.4. Changes from Version 1.00, Revision 2
- Updated example at the end of Section 1 to conform to the KHR_vulkan_glsl extension and treat OpTypeBool as an abstract type.
- Adjusted Capabilities:
  - MatrixStride depends on Matrix (15234).
  - Sample, SampleId, SamplePosition, and SampleMask depend on SampleRateShading (15234).
  - ClipDistance and CullDistance BuiltIns depend on, respectively, ClipDistance and CullDistance (1407, 15234).
  - ViewportIndex depends on MultiViewport (15234).
  - AtomicCounterMemory should be the AtomicStorage (15234).
  - Float16 has no dependencies (15234).
  - Offset Decoration should only be for Shader (15268).
  - Generic Storage Class is supposed to need the GenericPointer Capability (14287).
  - Remove capability restriction on the BuiltIn Decoration (15248).
- Fixed internal bugs:
  - 15203 Updated description of SampleMask BuiltIn to include "Input or output…", not just "Input…"
  - 15225 Include no re-association as a constraint required by the NoContraction Decoration.
  - 15210 Clarify OpPhi semantics that operand values only come from parent blocks.
  - 15239 Add OpImageSparseRead, which was missing (supposed to be 12 sparse-image instructions, but only 11 got incorporated, this adds the 12th).
  - 15299 Move OpUndef back to the Miscellaneous section.
  - 15321 OpTypeImage does not have a Depth restriction when used with SubpassData.
  - 14948 Fix the Lod Image Operands to allow both integer and floating-point values.
  - 15275 Clarify specific storage classes allowed for atomic operations under universal validation rules "Atomic access rules".
  - 15501 Restrict Patch Decoration to one of the tessellation execution models.
  - 15472 Reserved use of OpImageSparseSampleProjImplicitLod, OpImageSparseSampleProjExplicitLod, OpImageSparseSampleProjDrefImplicitLod, and OpImageSparseSampleProjDrefExplicitLod.
  - 15459 Clarify what makes different aggregate types in "Types and Variables".
  - 15426 Don’t require OpQuantizeToF16 to preserve NaN patterns.
  - 15418 Don’t set both Acquire and Release bits in Memory Semantics.
  - 15404 OpFunction Result <id> can only be used by OpFunctionCall, OpEntryPoint, and decoration instructions.
  - 15437 Restrict element type for OpTypeRuntimeArray by adding a definition of concrete types.
  - 15403 Clarify OpTypeFunction can only be consumed by OpFunction and functions can only return concrete and abstract types.
- Improved accuracy of the opcode word count in each instruction regarding which operands are optional. For sampling operations with explicit LOD, this included not marking the required LOD operands as optional.
- Clarified that when NonWritable, NonReadable, Volatile, and Coherent Decorations are applied to the Uniform storage class, the BufferBlock decoration must be present.
- Fixed external bugs:
  - 1413 (see internal 15275)
  - 1417 Added definitions for block, dominate, post dominate, CFG, and back edge. Removed use of "dominator tree".
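Item 15418 above says the Acquire and Release bits must not both be set in a Memory Semantics operand; the separate AcquireRelease bit expresses that combination. The sketch below shows one way a validator might check this. The mask values 0x2, 0x4, and 0x8 are the Memory Semantics enumerants for Acquire, Release, and AcquireRelease; the function name is hypothetical, and the check covers only this one rule.

```python
ACQUIRE = 0x2
RELEASE = 0x4
ACQUIRE_RELEASE = 0x8


def acquire_release_ok(semantics: int) -> bool:
    """Return False when both the Acquire and Release bits are set."""
    return not ((semantics & ACQUIRE) and (semantics & RELEASE))


print(acquire_release_ok(ACQUIRE))
print(acquire_release_ok(ACQUIRE | RELEASE))
print(acquire_release_ok(ACQUIRE_RELEASE))
```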
4.5. Changes from Version 1.00, Revision 3
- Added definition of derivative group, and use it to say when derivatives are well defined.
4.6. Changes from Version 1.00, Revision 4
- Expanded the list of instructions that may use or return a pointer in the Logical addressing model.
- Added missing ABGR Image Channel Order.
4.7. Changes from Version 1.00, Revision 5
- Khronos SPIR-V issue #27: Removed Shader dependency from SampledBuffer and Sampled1D Capabilities.
- Khronos SPIR-V issue #56: Clarify that the meaning of "read-only" in the Storage Classes includes not allowing initializers.
- Khronos SPIR-V issue #57: Clarify "modulo" means "remainder" in OpFMod's description.
- Khronos SPIR-V issue #60: OpControlBarrier synchronizes Output variables when used in tessellation-control shader.
- Public SPIRV-Headers issue #1: Remove the Shader capability requirement from the Input Storage Class.
- Public SPIRV-Headers issue #10: Don’t say the (u [, v] [, w], q) has four components, as it can be closed up when the optional ones are missing. Seen in the projective image instructions.
- Public SPIRV-Headers issues #12 and #13 and Khronos SPIR-V issue #65: Allow OpVariable as an initializer for another OpVariable instruction or the Base of an OpSpecConstantOp with an AccessChain opcode.
- Public SPIRV-Headers issue #14: Add Max enumerants of 0x7FFFFFFF to each of the non-mask enums in the C-based header files.
4.8. Changes from Version 1.00, Revision 6
- Khronos SPIR-V issue #63: Be clear that OpUndef can be used in sequence 9 (and is preferred to be) of the Logical Layout and can be part of partially-defined OpConstantComposite.
- Khronos SPIR-V issue #70: Don’t explicitly require operand truncation for integer operations when operating at RelaxedPrecision.
- Khronos SPIR-V issue #76: Include OpINotEqual in the list of allowed instructions for OpSpecConstantOp.
- Khronos SPIR-V issue #79: Remove implication that OpImageQueryLod should have a component for the array index.
- Public SPIRV-Headers issue #17: Decorations NoPerspective, Flat, Patch, Centroid, and Sample can apply to a top-level member that is itself a structure, so don’t disallow it through restrictions to numeric types.
4.9. Changes from Version 1.00, Revision 7
- Khronos SPIR-V issue #69: OpImageSparseFetch editorial change in summary: include that it is a sampled image.
- Khronos SPIR-V issue #74: OpImageQueryLod requires a sampler.
- Khronos SPIR-V issue #82: Clarification to the Float16Buffer Capability.
- Khronos SPIR-V issue #89: Editorial improvements to OpMemberDecorate and OpDecorationGroup.
4.10. Changes from Version 1.00, Revision 8
- Add SPV_KHR_subgroup_vote tokens.
- Typo: Change "without a sampler" to "with a sampler" for the description of the SampledBuffer Capability.
- Khronos SPIR-V issue #61: Clarification of packet size and alignment on all instructions that use the Pipes Capability.
- Khronos SPIR-V issue #99: Use "invalid" language to replace any "compile-time error" language.
- Khronos SPIR-V issue #55: Distinguish between branch instructions and termination instructions.
- Khronos SPIR-V issue #94: Add missing OpSubgroupReadInvocationKHR enumerant.
- Khronos SPIR-V issue #114: Header blocks strictly dominate their merge blocks.
- Khronos SPIR-V issue #119: OpSpecConstantOp allows OpUndef where allowed by its opcode.
4.11. Changes from Version 1.00, Revision 9
- Khronos Vulkan issue #652: Remove statements about matrix offsets and padding. These are described correctly in the Vulkan API specifications.
- Khronos SPIR-V issue #113: Remove the "By Default" statements in FP Rounding Mode. These should be properly documented in client API execution environment specifications.
- Add extension enumerants for
  - SPV_KHR_16bit_storage
  - SPV_KHR_device_group
  - SPV_KHR_multiview
  - SPV_NV_sample_mask_override_coverage
  - SPV_NV_geometry_shader_passthrough
  - SPV_NV_viewport_array2
  - SPV_NV_stereo_view_rendering
  - SPV_NVX_multiview_per_view_attributes
4.12. Changes from Version 1.00, Revision 10
- Add HLSL source language.
- Add StorageBuffer storage class.
- Add StorageBuffer16BitAccess, UniformAndStorageBuffer16BitAccess, VariablePointersStorageBuffer, and VariablePointers capabilities.
- Khronos SPIR-V issue #163: Be more clear that OpTypeStruct allows zero members. Also affects ArrayStride and Offset decoration validation rules.
- Khronos SPIR-V issue #159: List allowed AtomicCounter instructions with the AtomicStorage capability rather than the validation rules.
- Khronos SPIR-V issue #36: Describe more clearly the type of ND Range in OpGetKernelNDrangeSubGroupCount, OpGetKernelNDrangeMaxSubGroupSize, and OpEnqueueKernel.
- Khronos SPIR-V issue #128: Be clear that OpDot operates only on vectors.
- Khronos SPIR-V issue #80: Loop headers must dominate their continue target. See Structured Control Flow.
- Khronos SPIR-V issue #150: Allow UniformConstant storage-class variables to have initializers, depending on the client API.
4.13. Changes from Version 1.00, Revision 11
- Public issue #2: Disallow the Cube dimension from use with the Offset, ConstOffset, and ConstOffsets image operands.
- Public issue #48: OpConvertPtrToU only returns a scalar, not a vector.
- Khronos SPIR-V issue #130: Be more clear which masks are literal and which are not.
- Khronos SPIR-V issue #154: Clarify only one of the listed Capabilities needs to be declared to use a feature that lists multiple capabilities. The non-declared capabilities need not be supported by the underlying implementation.
- Khronos SPIR-V issue #174: OpImageDrefGather and OpImageSparseDrefGather return vectors, not scalars.
- Khronos SPIR-V issue #182: The SampleMask built in does not depend on SampleRateShading, only Shader.
- Khronos SPIR-V issue #183: OpQuantizeToF16 with too-small magnitude can result in either +0 or -0.
- Khronos SPIR-V issue #203: OpImageTexelPointer has 3 components for cube arrays, not 4.
- Khronos SPIR-V issue #217: Clearer language for OpArrayLength.
- Khronos SPIR-V issue #213: Image Operand LoD is not used by query operations.
- Khronos SPIR-V issue #223: OpPhi has exactly one parent operand per parent block.
- Khronos SPIR-V issue #212: In the Validation Rules, make clear a pointer can be an operand in an extended instruction set.
- Add extension enumerants for
  - SPV_AMD_shader_ballot
  - SPV_KHR_post_depth_coverage
  - SPV_AMD_shader_explicit_vertex_parameter
  - SPV_EXT_shader_stencil_export
  - SPV_INTEL_subgroups