Vulkan® 1.3.283 - A Specification

16. Image Operations

16.1. Image Operations Overview

Vulkan Image Operations are operations performed by those SPIR-V Image Instructions which take an OpTypeImage (representing a VkImageView) or OpTypeSampledImage (representing a (VkImageView, VkSampler) pair). Read, write, and atomic operations also take texel coordinates as operands, and return a value based on a neighborhood of texture elements (texels) within the image. Query operations return properties of the bound image or of the lookup itself. The “Depth” operand of OpTypeImage is ignored.

Note

Texel is a term which is a combination of the words texture and element. Early interactive computer graphics supported texture operations on textures, a small subset of the image operations on images described here. The discrete samples remain essentially equivalent, however, so we retain the historical term texel to refer to them.

Image Operations include the functionality of the following SPIR-V Image Instructions:

OpImageSample* and OpImageSparseSample* read one or more neighboring texels of the image, and filter the texel values based on the state of the sampler.
- Instructions with ImplicitLod in the name determine the LOD used in the sampling operation based on the coordinates used in neighboring fragments.
- Instructions with ExplicitLod in the name determine the LOD used in the sampling operation based on additional coordinates.
- Instructions with Proj in the name apply homogeneous projection to the coordinates.
OpImageFetch and OpImageSparseFetch return a single texel of the image. No sampler is used.
OpImage*Gather and OpImageSparse*Gather read neighboring texels and return a single component of each.
OpImageRead (and OpImageSparseRead) and OpImageWrite read and write, respectively, a texel in the image. No sampler is used.
OpImage*Dref* instructions apply depth comparison on the texel values.
OpImageSparse* instructions additionally return a sparse residency code.
OpImageQuerySize, OpImageQuerySizeLod, OpImageQueryLevels, and OpImageQuerySamples return properties of the image descriptor that would be accessed. The image itself is not accessed.
OpImageQueryLod returns the LOD parameters that would be used in a sample operation. The actual operation is not performed.

16.1.1. Texel Coordinate Systems

Images are addressed by texel coordinates. There are three texel coordinate systems:

normalized texel coordinates [0.0, 1.0]
unnormalized texel coordinates [0.0, width / height / depth)
integer texel coordinates [0, width / height / depth)

SPIR-V OpImageFetch, OpImageSparseFetch, OpImageRead, OpImageSparseRead, and OpImageWrite instructions use integer texel coordinates.

Other image instructions can use either normalized or unnormalized texel coordinates (selected by the unnormalizedCoordinates state of the sampler used in the instruction), but there are limitations on which operations, image state, and sampler state are supported. Normalized coordinates are logically converted to unnormalized as part of image operations, and certain steps are only performed on normalized coordinates. The array layer coordinate is always treated as unnormalized even when other coordinates are normalized.

Normalized texel coordinates are referred to as (s,t,r,q,a), with the coordinates having the following meanings:

s: Coordinate in the first dimension of an image.
t: Coordinate in the second dimension of an image.
r: Coordinate in the third dimension of an image.
- (s,t,r) are interpreted as a direction vector for Cube images.
q: Fourth coordinate, for homogeneous (projective) coordinates.
a: Coordinate for array layer.

The coordinates are extracted from the SPIR-V operand based on the dimensionality of the image variable and type of instruction. For Proj instructions, the components are in order (s, [t,] [r,] q), with t and r being conditionally present based on the Dim of the image. For non-Proj instructions, the coordinates are (s [,t] [,r] [,a]), with t and r being conditionally present based on the Dim of the image and a being conditionally present based on the Arrayed property of the image. Projective image instructions are not supported on Arrayed images.
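The ordering rules above can be illustrated with a short sketch. The function name `normalized_coordinate_order` and the string spellings of the Dim values are hypothetical and chosen only for this example. Restrictions on which Dim values a Proj instruction accepts are not modeled.

```python
# Spatial components present for each image Dim (illustrative subset of Dim values).
SPATIAL = {
    "1D": ["s"],
    "2D": ["s", "t"],
    "3D": ["s", "t", "r"],
    "Cube": ["s", "t", "r"],  # (s,t,r) is a direction vector for Cube images
}

def normalized_coordinate_order(dim, arrayed=False, proj=False):
    """Return the coordinate names in the order they appear in the operand."""
    if proj and arrayed:
        raise ValueError("projective image instructions are not supported on Arrayed images")
    names = list(SPATIAL[dim])
    if proj:
        names.append("q")   # Proj: (s, [t,] [r,] q)
    elif arrayed:
        names.append("a")   # non-Proj: (s [,t] [,r] [,a])
    return names

print(normalized_coordinate_order("2D", arrayed=True))  # ['s', 't', 'a']
print(normalized_coordinate_order("1D", proj=True))     # ['s', 'q']
```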

Unnormalized texel coordinates are referred to as (u,v,w,a), with the coordinates having the following meanings:

u: Coordinate in the first dimension of an image.
v: Coordinate in the second dimension of an image.
w: Coordinate in the third dimension of an image.
a: Coordinate for array layer.

Only the u and v coordinates are directly extracted from the SPIR-V operand, because only 1D and 2D (non-Arrayed) dimensionalities support unnormalized coordinates. The components are in order (u [,v]), with v being conditionally present when the dimensionality is 2D. When normalized coordinates are converted to unnormalized coordinates, all four coordinates are used.

Integer texel coordinates are referred to as (i,j,k,l,n), with the coordinates having the following meanings:

i: Coordinate in the first dimension of an image.
j: Coordinate in the second dimension of an image.
k: Coordinate in the third dimension of an image.
l: Coordinate for array layer.
n: Index of the sample within the texel.

They are extracted from the SPIR-V operand in order (i [,j] [,k] [,l] [,n]), with j and k conditionally present based on the Dim of the image, and l conditionally present based on the Arrayed property of the image. n is conditionally present and is taken from the Sample image operand.

For all coordinate types, unused coordinates are assigned a value of zero.

Figure 3. Texel Coordinate Systems, Linear Filtering

The texel coordinate systems, shown for an example two-dimensional image of 8×4 texels.

Normalized texel coordinates:
- The s coordinate goes from 0.0 to 1.0.
- The t coordinate goes from 0.0 to 1.0.
Unnormalized texel coordinates:
- The u coordinate within the range 0.0 to 8.0 is within the image, otherwise it is outside the image.
- The v coordinate within the range 0.0 to 4.0 is within the image, otherwise it is outside the image.
Integer texel coordinates:
- The i coordinate within the range 0 to 7 addresses texels within the image, otherwise it is outside the image.
- The j coordinate within the range 0 to 3 addresses texels within the image, otherwise it is outside the image.
Also shown for linear filtering:
- Given the unnormalized coordinates (u,v), the four texels selected are i₀j₀, i₁j₀, i₀j₁, and i₁j₁.
- The fractions α and β.
- Given the offset Δ_i and Δ_j, the four texels selected by the offset are i₀j'₀, i₁j'₀, i₀j'₁, and i₁j'₁.

Note

For formats with reduced-resolution components, Δ_i and Δ_j are relative to the resolution of the highest-resolution component, and therefore may be divided by two relative to the unnormalized coordinate space of the lower-resolution components.

Figure 4. Texel Coordinate Systems, Nearest Filtering

The texel coordinate systems, shown for an example two-dimensional image of 8×4 texels.

Texel coordinates as above. Also shown for nearest filtering:
- Given the unnormalized coordinates (u,v), the texel selected is ij.
- Given the offset Δ_i and Δ_j, the texel selected by the offset is ij'.

16.2. Conversion Formulas

16.2.1. RGB to Shared Exponent Conversion

An RGB color (red, green, blue) is transformed to a shared exponent color (red_shared, green_shared, blue_shared, exp_shared) as follows:

First, the components (red, green, blue) are clamped to (red_clamped, green_clamped, blue_clamped) as:

: red_clamped = max(0, min(sharedexp_max, red))
: green_clamped = max(0, min(sharedexp_max, green))
: blue_clamped = max(0, min(sharedexp_max, blue))

where:

: N = 9 (number of mantissa bits per component)
: B = 15 (exponent bias)
: E_max = 31 (maximum possible biased exponent value)
: $sharedexp_{max} = \frac{(2^{N} - 1)}{2^{N}} \times 2^{(E_{max} - B)}$

Note

NaN, if supported, is handled as in IEEE 754-2008 minNum() and maxNum(). This results in any NaN being mapped to zero.

The largest clamped component, max_clamped, is determined:

: max_clamped = max(red_clamped, green_clamped, blue_clamped)

A preliminary shared exponent exp' is computed:

$exp' = \begin{cases}
\lfloor \log_{2}(max_{clamped}) \rfloor + (B + 1) & \text{for } max_{clamped} > 2^{-(B + 1)} \\
0 & \text{for } max_{clamped} \leq 2^{-(B + 1)}
\end{cases}$

The shared exponent exp_shared is computed:

$max_{shared} = \left\lfloor \frac{max_{clamped}}{2^{(exp' - B - N)}} + \frac{1}{2} \right\rfloor$

$exp_{shared} = \begin{cases}
exp' & \text{for } 0 \leq max_{shared} < 2^{N} \\
exp' + 1 & \text{for } max_{shared} = 2^{N}
\end{cases}$

Finally, three integer values in the range 0 to 2^N are computed:

: $red_{shared} = \left\lfloor \frac{red_{clamped}}{2^{(exp_{shared} - B - N)}} + \frac{1}{2} \right\rfloor$
: $green_{shared} = \left\lfloor \frac{green_{clamped}}{2^{(exp_{shared} - B - N)}} + \frac{1}{2} \right\rfloor$
: $blue_{shared} = \left\lfloor \frac{blue_{clamped}}{2^{(exp_{shared} - B - N)}} + \frac{1}{2} \right\rfloor$
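As an illustration only, the encoding steps of this section can be followed in a few lines of Python. The function names are hypothetical. The sketch uses `math.frexp` to obtain the floor of the base-2 logarithm exactly, which is an implementation choice of this example and not something the conversion requires.

```python
import math

N = 9        # number of mantissa bits per component
B = 15       # exponent bias
E_MAX = 31   # maximum possible biased exponent value
SHAREDEXP_MAX = (2**N - 1) / 2**N * 2**(E_MAX - B)

def clamp_component(c):
    # NaN is mapped to zero, as described for minNum()/maxNum() handling.
    if math.isnan(c):
        return 0.0
    return max(0.0, min(SHAREDEXP_MAX, c))

def rgb_to_shared_exponent(red, green, blue):
    """Return (red_shared, green_shared, blue_shared, exp_shared)."""
    clamped = [clamp_component(c) for c in (red, green, blue)]
    max_clamped = max(clamped)

    # Preliminary shared exponent exp'.
    if max_clamped > 2.0 ** -(B + 1):
        # frexp gives max_clamped = m * 2**e with 0.5 <= m < 1, so floor(log2) = e - 1.
        exp_prelim = (math.frexp(max_clamped)[1] - 1) + (B + 1)
    else:
        exp_prelim = 0

    # Rounding the largest component may overflow the mantissa; bump the exponent if so.
    max_shared = math.floor(max_clamped / 2.0 ** (exp_prelim - B - N) + 0.5)
    exp_shared = exp_prelim + 1 if max_shared == 2**N else exp_prelim

    scale = 2.0 ** (exp_shared - B - N)
    mantissas = tuple(math.floor(c / scale + 0.5) for c in clamped)
    return mantissas + (exp_shared,)

print(rgb_to_shared_exponent(1.0, 0.5, 0.0))  # (256, 128, 0, 16)
```

The second call below exercises the rounding case in which max_shared reaches 2^N and the exponent is incremented: `rgb_to_shared_exponent(1.999, 0.0, 0.0)` yields an exponent of 17 and not 16.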

16.2.2. Shared Exponent to RGB

A shared exponent color (red_shared, green_shared, blue_shared, exp_shared) is transformed to an RGB color (red, green, blue) as follows:

: $red = red_{shared} \times 2^{(exp_{shared} - B - N)}$
: $green = green_{shared} \times 2^{(exp_{shared} - B - N)}$
: $blue = blue_{shared} \times 2^{(exp_{shared} - B - N)}$

where:

: N = 9 (number of mantissa bits per component)
: B = 15 (exponent bias)
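The decode direction is a single scale per component. The following minimal sketch (the function name is hypothetical) applies the three equations above to already-unpacked integer fields. Extracting those fields from a packed texel is outside the scope of this example.

```python
N = 9    # number of mantissa bits per component
B = 15   # exponent bias

def shared_exponent_to_rgb(red_shared, green_shared, blue_shared, exp_shared):
    """Convert unpacked shared exponent fields to an (red, green, blue) tuple of floats."""
    scale = 2.0 ** (exp_shared - B - N)
    return (red_shared * scale, green_shared * scale, blue_shared * scale)

print(shared_exponent_to_rgb(256, 128, 0, 16))  # (1.0, 0.5, 0.0)
```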

16.3. Texel Input Operations

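Texel input instructions are SPIR-V image instructions that read from an image. Texel input operations are a set of steps that are performed on state, coordinates, and texel values while processing a texel input instruction, and which are common to some or all texel input instructions. They include the following steps, which are performed in the listed order:

Validation operations
- Instruction/Sampler/Image View validation
- Coordinate validation
- Sparse validation
- Layout validation
Format conversion
Texel replacement
Depth comparison
Conversion to RGBA
Component swizzle
Chroma reconstruction
Y′C_BC_R conversion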

For texel input instructions involving multiple texels (for sampling or gathering), these steps are applied for each texel that is used in the instruction. Depending on the type of image instruction, other steps are conditionally performed between these steps or involving multiple coordinate or texel values.

If Chroma Reconstruction is implicit, Texel Filtering instead takes place during chroma reconstruction, before sampler Y′C_BC_R conversion occurs.

16.3.1. Texel Input Validation Operations

Texel input validation operations inspect instruction/image/sampler state or coordinates, and in certain circumstances cause the texel value to be replaced or become undefined. There are a series of validations that the texel undergoes.

Instruction/Sampler/Image View Validation

There are a number of cases where a SPIR-V instruction can mismatch with the sampler, the image view, or both, and a number of further cases where the sampler can mismatch with the image view. In such cases the value of the texel returned is undefined.

These cases include:

The sampler borderColor is an integer type and the image view format is not one of the VkFormat integer types or a stencil component of a depth/stencil format.
The sampler borderColor is a float type and the image view format is not one of the VkFormat float types or a depth component of a depth/stencil format.
The sampler borderColor is one of the opaque black colors (VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK or VK_BORDER_COLOR_INT_OPAQUE_BLACK) and the image view VkComponentSwizzle for any of the VkComponentMapping components is not the identity swizzle.
The VkImageLayout of any subresource in the image view does not match the VkDescriptorImageInfo::imageLayout used to write the image descriptor.
The SPIR-V Image Format is not compatible with the image view’s format.
The sampler unnormalizedCoordinates is VK_TRUE and any of the limitations of unnormalized coordinates are violated.
The SPIR-V instruction is one of the OpImage*Dref* instructions and the sampler compareEnable is VK_FALSE.
The SPIR-V instruction is not one of the OpImage*Dref* instructions and the sampler compareEnable is VK_TRUE.
The SPIR-V instruction is one of the OpImage*Dref* instructions, the image view format is one of the depth/stencil formats, and the image view aspect is not VK_IMAGE_ASPECT_DEPTH_BIT.
The SPIR-V instruction’s image variable’s properties are not compatible with the image view:
- Rules for viewType:
  - VK_IMAGE_VIEW_TYPE_1D must have Dim = 1D, Arrayed = 0, MS = 0.
  - VK_IMAGE_VIEW_TYPE_2D must have Dim = 2D, Arrayed = 0.
  - VK_IMAGE_VIEW_TYPE_3D must have Dim = 3D, Arrayed = 0, MS = 0.
  - VK_IMAGE_VIEW_TYPE_CUBE must have Dim = Cube, Arrayed = 0, MS = 0.
  - VK_IMAGE_VIEW_TYPE_1D_ARRAY must have Dim = 1D, Arrayed = 1, MS = 0.
  - VK_IMAGE_VIEW_TYPE_2D_ARRAY must have Dim = 2D, Arrayed = 1.
  - VK_IMAGE_VIEW_TYPE_CUBE_ARRAY must have Dim = Cube, Arrayed = 1, MS = 0.
- If the image was created with VkImageCreateInfo::samples equal to VK_SAMPLE_COUNT_1_BIT, the instruction must have MS = 0.
- If the image was created with VkImageCreateInfo::samples not equal to VK_SAMPLE_COUNT_1_BIT, the instruction must have MS = 1.
- The Sampled Type of the OpTypeImage must match the SPIR-V Type.
- The signedness of any read or sample operation must match the signedness of the image’s format.

Only OpImageSample* and OpImageSparseSample* can be used with a sampler or image view that enables sampler Y′C_BC_R conversion.

OpImageFetch, OpImageSparseFetch, OpImage*Gather, and OpImageSparse*Gather must not be used with a sampler or image view that enables sampler Y′C_BC_R conversion.

The ConstOffset and Offset operands must not be used with a sampler or image view that enables sampler Y′C_BC_R conversion.

If the underlying VkImage format has an X component in its format description, undefined values are read from those bits.

Note

If the VkImage format and VkImageView format are the same, these bits will be unused by format conversion and this will have no effect. However, if the VkImageView format is different, then some bits of the result may be undefined. For example, when a VK_FORMAT_R10X6_UNORM_PACK16 VkImage is sampled via a VK_FORMAT_R16_UNORM VkImageView, the low 6 bits of the value before format conversion are undefined and format conversion may return a range of different values.
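The R10X6 example in the note above can be made concrete. This sketch assumes that the 10-bit R component occupies the top 10 bits of the 16-bit word and that the 6 unused bits occupy the bottom, and it uses the usual UNORM rule of dividing by 2^16 - 1. The function name is hypothetical.

```python
def r16_unorm_range_from_r10x6(r10):
    """Return the (lowest, highest) float an R16_UNORM view could produce
    for a stored 10-bit value whose low 6 bits are undefined."""
    if not 0 <= r10 < (1 << 10):
        raise ValueError("r10 must fit in 10 bits")
    lowest = r10 << 6              # undefined bits all read back as 0
    highest = (r10 << 6) | 0x3F    # undefined bits all read back as 1
    return lowest / 65535.0, highest / 65535.0

lo, hi = r16_unorm_range_from_r10x6(512)
print(lo, hi)  # two slightly different values near 0.5
```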

Note

Some implementations will return undefined values in the case where a sampler uses a VkSamplerAddressMode of VK_SAMPLER_ADDRESS_MODE_MIRRORED_REPEAT, the sampler is used with operands Offset, ConstOffset, or ConstOffsets, and the value of the offset is larger than or equal to the corresponding width, height, or depth of any accessed image level.

This behavior was not tested prior to Vulkan conformance test suite version 1.3.8.0. Affected implementations will have a conformance test waiver for this issue.

Integer Texel Coordinate Validation

Integer texel coordinates are validated against the size of the image level, and the number of layers and number of samples in the image. For SPIR-V instructions that use integer texel coordinates, this is performed directly on the integer coordinates. For instructions that use normalized or unnormalized texel coordinates, this is performed on the coordinates that result after conversion to integer texel coordinates.

If the integer texel coordinates do not satisfy all of the conditions

: 0 ≤ i < w_s
: 0 ≤ j < h_s
: 0 ≤ k < d_s
: 0 ≤ l < layers
: 0 ≤ n < samples

where:

: w_s = width of the image level
: h_s = height of the image level
: d_s = depth of the image level
: layers = number of layers in the image
: samples = number of samples per texel in the image

then the texel fails integer texel coordinate validation.

There are four cases to consider:

Valid Texel Coordinates
- If the texel coordinates pass validation (that is, the coordinates lie within the image),
then the texel value comes from the value in image memory.
Border Texel
- If the texel coordinates fail validation, and
- If the read is the result of an image sample instruction or image gather instruction, and
- If the image is not a cube image,
then the texel is a border texel and texel replacement is performed.
Invalid Texel
- If the texel coordinates fail validation, and
- If the read is the result of an image fetch instruction, image read instruction, or atomic instruction,
then the texel is an invalid texel and texel replacement is performed.
Cube Map Edge or Corner
- Otherwise the texel coordinates lie beyond the edges or corners of the selected cube map face, and cube map edge handling is performed.
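The four cases can be summarized as a small decision function. This is a sketch only: the function name, the string labels for instruction kinds, and the tuple layout are hypothetical and exist just to make the case analysis executable.

```python
def classify_texel(coord, extent, instruction, is_cube=False):
    """Classify a texel access.

    coord  = (i, j, k, l, n) integer texel coordinates
    extent = (w_s, h_s, d_s, layers, samples) of the accessed image level
    instruction is one of "sample", "gather", "fetch", "read", "atomic"
    """
    passes = all(0 <= c < size for c, size in zip(coord, extent))
    if passes:
        return "valid"                      # value comes from image memory
    if instruction in ("sample", "gather") and not is_cube:
        return "border"                     # texel replacement is performed
    if instruction in ("fetch", "read", "atomic"):
        return "invalid"                    # texel replacement is performed
    return "cube map edge or corner"        # cube map edge handling is performed

# An 8x4 single-layer, single-sample image level:
extent = (8, 4, 1, 1, 1)
print(classify_texel((7, 3, 0, 0, 0), extent, "fetch"))   # valid
print(classify_texel((8, 0, 0, 0, 0), extent, "sample"))  # border
```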

Cube Map Edge Handling

If the texel coordinates lie beyond the edges or corners of the selected cube map face (as described in the prior section), the following steps are performed. Note that this does not occur when using VK_FILTER_NEAREST filtering within a mip level, since VK_FILTER_NEAREST is treated as using VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE.

Cube Map Edge Texel
- If the texel lies beyond the selected cube map face in either only i or only j, then the coordinates (i,j) and the array layer l are transformed to select the adjacent texel from the appropriate neighboring face.
Cube Map Corner Texel
- If the texel lies beyond the selected cube map face in both i and j, then there is no unique neighboring face from which to read that texel. The texel should be replaced by the average of the three values of the adjacent texels in each incident face. However, implementations may replace the cube map corner texel by other methods. The methods are subject to the constraint that if the three available texels have the same value, the resulting filtered texel must have that value.

Sparse Validation

If the texel reads from an unbound region of a sparse image, the texel is a sparse unbound texel, and processing continues with texel replacement.

Layout Validation

If the planes of a disjoint multi-planar image are not all in the same image layout, the image must not be sampled with sampler Y′C_BC_R conversion enabled.

16.3.2. Format Conversion

Texels undergo a format conversion from the VkFormat of the image view to a vector of either floating point or signed or unsigned integer components, with the number of components based on the number of components present in the format.

Color formats have one, two, three, or four components, according to the format.
Depth/stencil formats are one component. The depth or stencil component is selected by the aspectMask of the image view.

Each component is converted based on its type and size (as defined in the Format Definition section for each VkFormat), using the appropriate equations in 16-Bit Floating-Point Numbers, Unsigned 11-Bit Floating-Point Numbers, Unsigned 10-Bit Floating-Point Numbers, Fixed-Point Data Conversion, and Shared Exponent to RGB. Signed integer components smaller than 32 bits are sign-extended.

If the image view format is sRGB, the color components are first converted as if they are UNORM, and then sRGB to linear conversion is applied to the R, G, and B components as described in the “sRGB EOTF” section of the Khronos Data Format Specification. The A component, if present, is unchanged.
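For orientation, the sRGB path for an 8-bit four-component texel might look like the sketch below. The piecewise function is the commonly published sRGB EOTF and the Khronos Data Format Specification remains the authoritative definition. The function names are hypothetical.

```python
def srgb_to_linear(c):
    """sRGB EOTF applied to one component already decoded as UNORM in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def decode_r8g8b8a8_srgb(r, g, b, a):
    """Decode four 8-bit values: UNORM first, then the EOTF on R, G, and B only."""
    unorm = [v / 255.0 for v in (r, g, b, a)]
    linear_rgb = tuple(srgb_to_linear(v) for v in unorm[:3])
    return linear_rgb + (unorm[3],)   # the A component is left unchanged

print(decode_r8g8b8a8_srgb(0, 0, 0, 255))  # (0.0, 0.0, 0.0, 1.0)
```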

If the image view format is block-compressed, then the texel value is first decoded, then converted based on the type and number of components defined by the compressed format.

16.3.3. Texel Replacement

A texel is replaced if it is one (and only one) of:

a border texel,
an invalid texel, or
a sparse unbound texel.

Border texels are replaced with a value based on the image format and the borderColor of the sampler. The border color is:

Table 14. Border Color B
Sampler `borderColor`	Corresponding Border Color
`VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK`	[B_r, B_g, B_b, B_a] = [0.0, 0.0, 0.0, 0.0]
`VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK`	[B_r, B_g, B_b, B_a] = [0.0, 0.0, 0.0, 1.0]
`VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE`	[B_r, B_g, B_b, B_a] = [1.0, 1.0, 1.0, 1.0]
`VK_BORDER_COLOR_INT_TRANSPARENT_BLACK`	[B_r, B_g, B_b, B_a] = [0, 0, 0, 0]
`VK_BORDER_COLOR_INT_OPAQUE_BLACK`	[B_r, B_g, B_b, B_a] = [0, 0, 0, 1]
`VK_BORDER_COLOR_INT_OPAQUE_WHITE`	[B_r, B_g, B_b, B_a] = [1, 1, 1, 1]

Note

The names VK_BORDER_COLOR_*_TRANSPARENT_BLACK, VK_BORDER_COLOR_*_OPAQUE_BLACK, and VK_BORDER_COLOR_*_OPAQUE_WHITE are meant to describe which components are zeros and ones in the vocabulary of compositing, and are not meant to imply that the numerical value of VK_BORDER_COLOR_INT_OPAQUE_WHITE is a saturating value for integers.

The border color is substituted for the texel value, replacing as many components as are present in the image format:

Table 15. Border Texel Components After Replacement
Texel Aspect or Format	Component Assignment
Depth aspect	D = B_r
Stencil aspect	S = B_r
One component color format	Color_r = B_r
Two component color format	[Color_r,Color_g] = [B_r,B_g]
Three component color format	[Color_r,Color_g,Color_b] = [B_r,B_g,B_b]
Four component color format	[Color_r,Color_g,Color_b,Color_a] = [B_r,B_g,B_b,B_a]
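Taken together, Tables 14 and 15 describe a lookup followed by a truncation to the component count of the format. A minimal sketch, with a hypothetical function name and with enum names used as plain strings:

```python
BORDER_COLORS = {
    "VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK": (0.0, 0.0, 0.0, 0.0),
    "VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK": (0.0, 0.0, 0.0, 1.0),
    "VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE": (1.0, 1.0, 1.0, 1.0),
    "VK_BORDER_COLOR_INT_TRANSPARENT_BLACK": (0, 0, 0, 0),
    "VK_BORDER_COLOR_INT_OPAQUE_BLACK": (0, 0, 0, 1),
    "VK_BORDER_COLOR_INT_OPAQUE_WHITE": (1, 1, 1, 1),
}

def replace_border_texel(border_color, component_count):
    """Return the replacement texel for a format with the given number of components.

    A depth or stencil aspect counts as one component and receives B_r.
    """
    if not 1 <= component_count <= 4:
        raise ValueError("component_count must be between 1 and 4")
    return BORDER_COLORS[border_color][:component_count]

print(replace_border_texel("VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK", 2))  # (0.0, 0.0)
```

Note that a two-component format sampled with an opaque black border never sees B_a at this stage. Any alpha value visible to the shader comes from the later conversion to RGBA.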

The value returned by a read of an invalid texel is undefined, unless that read operation is from a buffer resource and the robustBufferAccess feature is enabled. In that case, an invalid texel is replaced as described by the robustBufferAccess feature. If the access is to an image resource and the x, y, z, or layer coordinate validation fails and the robustImageAccess feature is enabled, then zero must be returned for the R, G, and B components, if present. Either zero or one must be returned for the A component, if present. If only the sample index was invalid, the values returned are undefined.

Additionally, if the robustImageAccess feature is enabled, any invalid texels may be expanded to four components prior to texel replacement. This means that components not present in the image format may be replaced with 0 or may undergo conversion to RGBA as normal.

If the VkPhysicalDeviceSparseProperties::residencyNonResidentStrict property is VK_TRUE, a sparse unbound texel is replaced with 0 or 0.0 values for integer and floating-point components of the image format, respectively.

If residencyNonResidentStrict is VK_FALSE, the value of the sparse unbound texel is undefined.

16.3.4. Depth Compare Operation

If the image view has a depth/stencil format, the depth component is selected by the aspectMask, and the operation is an OpImage*Dref* instruction, a depth comparison is performed. The result is 1.0 if the comparison evaluates to true, and 0.0 otherwise. This value replaces the depth component D.

The compare operation is selected by the VkCompareOp value set by VkSamplerCreateInfo::compareOp. The reference value from the SPIR-V operand D_ref and the texel depth value D_tex are used as the reference and test values, respectively, in that operation.

If the image being sampled has an unsigned normalized fixed-point format, then D_ref is clamped to [0,1] before the compare operation.
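The comparison can be sketched as follows. The sketch assumes the usual VkCompareOp convention in which each operator evaluates "reference OP test", so that VK_COMPARE_OP_LESS means D_ref < D_tex. The function name and the `unorm_format` flag are hypothetical.

```python
import operator

COMPARE_OPS = {
    "VK_COMPARE_OP_NEVER": lambda ref, test: False,
    "VK_COMPARE_OP_LESS": operator.lt,
    "VK_COMPARE_OP_EQUAL": operator.eq,
    "VK_COMPARE_OP_LESS_OR_EQUAL": operator.le,
    "VK_COMPARE_OP_GREATER": operator.gt,
    "VK_COMPARE_OP_NOT_EQUAL": operator.ne,
    "VK_COMPARE_OP_GREATER_OR_EQUAL": operator.ge,
    "VK_COMPARE_OP_ALWAYS": lambda ref, test: True,
}

def depth_compare(compare_op, d_ref, d_tex, unorm_format=False):
    """Return 1.0 if the comparison is true and 0.0 otherwise."""
    if unorm_format:
        # D_ref is clamped to [0, 1] for unsigned normalized fixed-point formats.
        d_ref = min(max(d_ref, 0.0), 1.0)
    return 1.0 if COMPARE_OPS[compare_op](d_ref, d_tex) else 0.0

print(depth_compare("VK_COMPARE_OP_LESS", 0.3, 0.5))  # 1.0
```

The clamp matters at the extremes: with a reference of 1.5 against a stored depth of 1.0, VK_COMPARE_OP_LESS_OR_EQUAL passes on an unsigned normalized format (after clamping to 1.0) but fails when no clamp is applied.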

16.3.5. Conversion to RGBA

The texel is expanded from one, two, or three components to four components based on the image base color:

Table 16. Texel Color After Conversion To RGBA
Texel Aspect or Format	RGBA Color
Depth aspect	[Color_r,Color_g,Color_b, Color_a] = [D,0,0,one]
Stencil aspect	[Color_r,Color_g,Color_b, Color_a] = [S,0,0,one]
One component color format	[Color_r,Color_g,Color_b, Color_a] = [Color_r,0,0,one]
Two component color format	[Color_r,Color_g,Color_b, Color_a] = [Color_r,Color_g,0,one]
Three component color format	[Color_r,Color_g,Color_b, Color_a] = [Color_r,Color_g,Color_b,one]
Four component color format	[Color_r,Color_g,Color_b, Color_a] = [Color_r,Color_g,Color_b,Color_a]

where one = 1.0f for floating-point formats and depth aspects, and one = 1 for integer formats and stencil aspects.
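Table 16 amounts to padding the texel with zeros and placing one in the alpha position. A short sketch, with a hypothetical function name and a `floating_point` flag standing in for the format or aspect classification:

```python
def convert_to_rgba(components, floating_point):
    """Expand a 1- to 4-component texel to four components.

    floating_point is True for floating-point formats and depth aspects,
    False for integer formats and stencil aspects.
    """
    if not 1 <= len(components) <= 4:
        raise ValueError("a texel has between one and four components")
    one = 1.0 if floating_point else 1
    zero = 0.0 if floating_point else 0
    rgba = [zero, zero, zero, one]
    rgba[:len(components)] = components   # present components overwrite the defaults
    return tuple(rgba)

print(convert_to_rgba((0.25,), True))   # (0.25, 0.0, 0.0, 1.0)
print(convert_to_rgba((7, 9), False))   # (7, 9, 0, 1)
```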

16.3.6. Component Swizzle

All texel input instructions apply a swizzle based on:

the VkComponentSwizzle enums in the components member of the VkImageViewCreateInfo structure for the image being read if sampler Y′C_BC_R conversion is not enabled, and
the VkComponentSwizzle enums in the components member of the VkSamplerYcbcrConversionCreateInfo structure for the sampler Y′C_BC_R conversion if sampler Y′C_BC_R conversion is enabled.

The swizzle can rearrange the components of the texel, or substitute zero or one for any components. It is defined as follows for each color component:

$Color'_{component} = \begin{cases}
Color_{r} & \text{for RED swizzle} \\
Color_{g} & \text{for GREEN swizzle} \\
Color_{b} & \text{for BLUE swizzle} \\
Color_{a} & \text{for ALPHA swizzle} \\
0 & \text{for ZERO swizzle} \\
one & \text{for ONE swizzle} \\
identity & \text{for IDENTITY swizzle}
\end{cases}$

where:

$one = \begin{cases}
1.0\text{f} & \text{for floating point components} \\
1 & \text{for integer components}
\end{cases}$

$identity = \begin{cases}
Color_{r} & \text{for } component = r \\
Color_{g} & \text{for } component = g \\
Color_{b} & \text{for } component = b \\
Color_{a} & \text{for } component = a
\end{cases}$

If the border color is one of the VK_BORDER_COLOR_*_OPAQUE_BLACK enums and the VkComponentSwizzle is not the identity swizzle for all components, the value of the texel after swizzle is undefined.

If the image view has a depth/stencil format and the VkComponentSwizzle is VK_COMPONENT_SWIZZLE_ONE, the value of the texel after swizzle is undefined.
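The swizzle definition can be exercised with the sketch below. Swizzle names are abbreviated to the suffix of the corresponding VK_COMPONENT_SWIZZLE_* enum, and the function name is hypothetical. The undefined cases described above (opaque black border colors with a non-identity swizzle, and ONE on depth/stencil formats) are not modeled.

```python
NAMED_SOURCE = {"RED": 0, "GREEN": 1, "BLUE": 2, "ALPHA": 3}

def apply_swizzle(rgba, swizzles, floating_point):
    """Apply one swizzle per output component (r, g, b, a) to an RGBA texel."""
    one = 1.0 if floating_point else 1
    zero = 0.0 if floating_point else 0
    result = []
    for position, swizzle in enumerate(swizzles):
        if swizzle == "IDENTITY":
            result.append(rgba[position])       # component maps to itself
        elif swizzle == "ZERO":
            result.append(zero)
        elif swizzle == "ONE":
            result.append(one)
        else:
            result.append(rgba[NAMED_SOURCE[swizzle]])
    return tuple(result)

# Swap red and blue, force alpha to one:
print(apply_swizzle((0.1, 0.2, 0.3, 0.4), ("BLUE", "IDENTITY", "RED", "ONE"), True))
# (0.3, 0.2, 0.1, 1.0)
```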

16.3.7. Sparse Residency

OpImageSparse* instructions return a structure which includes a residency code indicating whether any texels accessed by the instruction are sparse unbound texels. This code can be interpreted by the OpImageSparseTexelsResident instruction which converts the residency code to a boolean value.

16.3.8. Chroma Reconstruction

In some color models, the color representation is defined in terms of monochromatic light intensity (often called “luma”) and color differences relative to this intensity, often called “chroma”. It is common for color models other than RGB to represent the chroma components at lower spatial resolution than the luma component. This approach is used to take advantage of the eye’s lower spatial sensitivity to color compared with its sensitivity to brightness. Less commonly, the same approach is used with additive color, since the green component dominates the eye’s sensitivity to light intensity and the spatial sensitivity to color introduced by red and blue is lower.

Lower-resolution components are “downsampled” by resizing them to a lower spatial resolution than the component representing luminance. This process is also commonly known as “chroma subsampling”. There is one luminance sample in each texture texel, but each chrominance sample may be shared among several texels in one or both texture dimensions.

“_444” formats do not spatially downsample chroma values compared with luma: there are unique chroma samples for each texel.
“_422” formats have downsampling in the x dimension (corresponding to u or s coordinates): they are sampled at half the resolution of luma in that dimension.
“_420” formats have downsampling in the x dimension (corresponding to u or s coordinates) and the y dimension (corresponding to v or t coordinates): they are sampled at half the resolution of luma in both dimensions.

The process of reconstructing a full color value for texture access involves accessing both chroma and luma values at the same location. To generate the color accurately, the values of the lower-resolution components at the location of the luma samples must be reconstructed from the lower-resolution sample locations, an operation known here as “chroma reconstruction” irrespective of the actual color model.

The location of the chroma samples relative to the luma coordinates is determined by the xChromaOffset and yChromaOffset members of the VkSamplerYcbcrConversionCreateInfo structure used to create the sampler Y′C_BC_R conversion.

The following diagrams show the relationship between unnormalized (u,v) coordinates and (i,j) integer texel positions in the luma component (shown in black, with circles showing integer sample positions) and the texel coordinates of reduced-resolution chroma components, shown as crosses in red.

Note

If the chroma values are reconstructed at the locations of the luma samples by means of interpolation, chroma samples from outside the image bounds are needed; these are determined according to Wrapping Operation. These diagrams represent this by showing the bounds of the “chroma texel” extending beyond the image bounds, and including additional chroma sample positions where required for interpolation. The limits of a sample for NEAREST sampling are shown as a grid.

Figure 5. 422 downsampling, xChromaOffset=COSITED_EVEN

Figure 6. 422 downsampling, xChromaOffset=MIDPOINT

Figure 7. 420 downsampling, xChromaOffset=COSITED_EVEN, yChromaOffset=COSITED_EVEN

Figure 8. 420 downsampling, xChromaOffset=MIDPOINT, yChromaOffset=COSITED_EVEN

Figure 9. 420 downsampling, xChromaOffset=COSITED_EVEN, yChromaOffset=MIDPOINT

Figure 10. 420 downsampling, xChromaOffset=MIDPOINT, yChromaOffset=MIDPOINT

Reconstruction is implemented in one of two ways:

If the format of the image that is to be sampled sets VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT, or the VkSamplerYcbcrConversionCreateInfo’s forceExplicitReconstruction is set to VK_TRUE, reconstruction is performed as an explicit step independent of filtering, described in the Explicit Reconstruction section.

If the format of the image that is to be sampled does not set VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_CHROMA_RECONSTRUCTION_EXPLICIT_BIT and if the VkSamplerYcbcrConversionCreateInfo’s forceExplicitReconstruction is set to VK_FALSE, reconstruction is performed as an implicit part of filtering prior to color model conversion, with no separate post-conversion texel filtering step, as described in the Implicit Reconstruction section.

Explicit Reconstruction

If the chromaFilter member of the VkSamplerYcbcrConversionCreateInfo structure is VK_FILTER_NEAREST:
- If the format’s R and B components are reduced in resolution in just width by a factor of two relative to the G component (i.e. this is a “_422” format), the $\tau_{ijk}[level]$ values accessed by texel filtering are reconstructed as follows:
  
  $\tau_{R}'(i, j) = \tau_{R}(\lfloor i \times 0.5 \rfloor, j)[level]$
  
  $\tau_{B}'(i, j) = \tau_{B}(\lfloor i \times 0.5 \rfloor, j)[level]$
- If the format’s R and B components are reduced in resolution in width and height by a factor of two relative to the G component (i.e. this is a “_420” format), the $\tau_{ijk}[level]$ values accessed by texel filtering are reconstructed as follows:
  
  $\tau_{R}'(i, j) = \tau_{R}(\lfloor i \times 0.5 \rfloor, \lfloor j \times 0.5 \rfloor)[level]$
  
  $\tau_{B}'(i, j) = \tau_{B}(\lfloor i \times 0.5 \rfloor, \lfloor j \times 0.5 \rfloor)[level]$
  
  Note
  
  xChromaOffset and yChromaOffset have no effect if chromaFilter is VK_FILTER_NEAREST for explicit reconstruction.
If the chromaFilter member of the VkSamplerYcbcrConversionCreateInfo structure is VK_FILTER_LINEAR:
- If the format’s R and B components are reduced in resolution in just width by a factor of two relative to the G component (i.e. this is a “_422” format):
  - If xChromaOffset is VK_CHROMA_LOCATION_COSITED_EVEN:
    
    $\tau_{RB}'(i, j) = \begin{cases}
    \tau_{RB}(\lfloor i \times 0.5 \rfloor, j)[level], & 0.5 \times i = \lfloor 0.5 \times i \rfloor \\
    0.5 \times \tau_{RB}(\lfloor i \times 0.5 \rfloor, j)[level] + 0.5 \times \tau_{RB}(\lfloor i \times 0.5 \rfloor + 1, j)[level], & 0.5 \times i \neq \lfloor 0.5 \times i \rfloor
    \end{cases}$
  - If xChromaOffset is VK_CHROMA_LOCATION_MIDPOINT:
    
    $\tau_{RB}'(i, j) = \begin{cases}
    0.25 \times \tau_{RB}(\lfloor i \times 0.5 \rfloor - 1, j)[level] + 0.75 \times \tau_{RB}(\lfloor i \times 0.5 \rfloor, j)[level], & 0.5 \times i = \lfloor 0.5 \times i \rfloor \\
    0.75 \times \tau_{RB}(\lfloor i \times 0.5 \rfloor, j)[level] + 0.25 \times \tau_{RB}(\lfloor i \times 0.5 \rfloor + 1, j)[level], & 0.5 \times i \neq \lfloor 0.5 \times i \rfloor
    \end{cases}$
- If the format’s R and B components are reduced in resolution in width and height by a factor of two relative to the G component (i.e. this is a “_420” format), a similar relationship applies. Due to the number of options, these formulae are expressed more concisely as follows:
  
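  $\begin{aligned}
  i_{RB} &= \begin{cases} 0.5 \times (i), & \mathit{xChromaOffset} = \text{COSITED\_EVEN} \\ 0.5 \times (i - 0.5), & \mathit{xChromaOffset} = \text{MIDPOINT} \end{cases} \\
  j_{RB} &= \begin{cases} 0.5 \times (j), & \mathit{yChromaOffset} = \text{COSITED\_EVEN} \\ 0.5 \times (j - 0.5), & \mathit{yChromaOffset} = \text{MIDPOINT} \end{cases} \\
  i_{floor} &= \lfloor i_{RB} \rfloor \\
  j_{floor} &= \lfloor j_{RB} \rfloor \\
  i_{frac} &= i_{RB} - i_{floor} \\
  j_{frac} &= j_{RB} - j_{floor}
  \end{aligned}$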
  
  $\begin{aligned}
  \tau'_{RB}(i, j) ={} & \tau_{RB}(i_{floor}, j_{floor})[\mathit{level}] \times (1 - i_{frac}) \times (1 - j_{frac}) \\
  {}+{} & \tau_{RB}(1 + i_{floor}, j_{floor})[\mathit{level}] \times (i_{frac}) \times (1 - j_{frac}) \\
  {}+{} & \tau_{RB}(i_{floor}, 1 + j_{floor})[\mathit{level}] \times (1 - i_{frac}) \times (j_{frac}) \\
  {}+{} & \tau_{RB}(1 + i_{floor}, 1 + j_{floor})[\mathit{level}] \times (i_{frac}) \times (j_{frac})
  \end{aligned}$
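
As an illustration only, the “_420” linear reconstruction formulae above can be sketched in Python. The accessor `chroma(ci, cj)` and the string values `"COSITED_EVEN"` and `"MIDPOINT"` are hypothetical stand-ins for the stored half-resolution samples at a fixed level and for the VkChromaLocation values; edge handling is assumed to be done by the accessor.

```python
import math

def reconstruct_chroma_420(chroma, i, j,
                           x_offset="COSITED_EVEN", y_offset="COSITED_EVEN"):
    """Bilinearly reconstruct one chroma value at luma position (i, j).

    chroma(ci, cj) is a hypothetical accessor returning the stored
    half-resolution chroma sample; it is not part of any Vulkan API.
    """
    # Position of the luma texel in chroma-sample space.
    i_rb = 0.5 * i if x_offset == "COSITED_EVEN" else 0.5 * (i - 0.5)
    j_rb = 0.5 * j if y_offset == "COSITED_EVEN" else 0.5 * (j - 0.5)
    i_floor, j_floor = math.floor(i_rb), math.floor(j_rb)
    i_frac, j_frac = i_rb - i_floor, j_rb - j_floor
    # Weighted sum of the four neighboring chroma samples.
    return (chroma(i_floor, j_floor) * (1 - i_frac) * (1 - j_frac)
            + chroma(i_floor + 1, j_floor) * i_frac * (1 - j_frac)
            + chroma(i_floor, j_floor + 1) * (1 - i_frac) * j_frac
            + chroma(i_floor + 1, j_floor + 1) * i_frac * j_frac)
```

With a chroma plane that ramps linearly in the horizontal direction, an odd luma column under COSITED_EVEN lands halfway between two chroma samples, while an even column under MIDPOINT takes weights 0.25 and 0.75, matching the “_422” cases listed earlier.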

Note

In the case where the texture itself is bilinearly interpolated as described in Texel Filtering, thus requiring four full-color samples for the filtering operation, and where the reconstruction of these samples uses bilinear interpolation in the chroma components due to chromaFilter=VK_FILTER_LINEAR, up to nine chroma samples may be required, depending on the sample location.

Implicit Reconstruction

Implicit reconstruction takes place by the samples being interpolated, as required by the filter settings of the sampler, except that chromaFilter takes precedence for the chroma samples.

If chromaFilter is VK_FILTER_NEAREST, an implementation may behave as if xChromaOffset and yChromaOffset were both VK_CHROMA_LOCATION_MIDPOINT, irrespective of the values set.

Note

This will not have any visible effect if the locations of the luma samples coincide with the location of the samples used for rasterization.

The sample coordinates are adjusted by the downsample factor of the component (such that, for example, the sample coordinates are divided by two if the component has a downsample factor of two relative to the luma component):

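$\begin{aligned}
u'_{RB}\ (422/420) &= \begin{cases} 0.5 \times (u + 0.5), & \mathit{xChromaOffset} = \text{COSITED\_EVEN} \\ 0.5 \times u, & \mathit{xChromaOffset} = \text{MIDPOINT} \end{cases} \\
v'_{RB}\ (420) &= \begin{cases} 0.5 \times (v + 0.5), & \mathit{yChromaOffset} = \text{COSITED\_EVEN} \\ 0.5 \times v, & \mathit{yChromaOffset} = \text{MIDPOINT} \end{cases}
\end{aligned}$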

16.3.9. Sampler Y′C_BC_R Conversion

Sampler Y′C_BC_R conversion performs the following operations, which an implementation may combine into a single mathematical operation:

Sampler Y′C_BC_R Range Expansion

Sampler Y′C_BC_R range expansion is applied to color component values after all texel input operations which are not specific to sampler Y′C_BC_R conversion. For example, the input values to this stage have been converted using the normal format conversion rules.

Sampler Y′C_BC_R range expansion is not applied if ycbcrModel is VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY. That is, the shader receives the vector C'_rgba as output by the Component Swizzle stage without further modification.

For other values of ycbcrModel, range expansion is applied to the texel component values output by the Component Swizzle defined by the components member of VkSamplerYcbcrConversionCreateInfo. Range expansion applies independently to each component of the image. For the purposes of range expansion and Y′C_BC_R model conversion, the R and B components contain color difference (chroma) values and the G component contains luma. The A component is not modified by sampler Y′C_BC_R range expansion.

The range expansion to be applied is defined by the ycbcrRange member of the VkSamplerYcbcrConversionCreateInfo structure:

If ycbcrRange is VK_SAMPLER_YCBCR_RANGE_ITU_FULL, the following transformations are applied:

$\begin{aligned}
Y' &= C'_{rgba}[G] \\
C_B &= C'_{rgba}[B] - \frac{2^{(n-1)}}{(2^n) - 1} \\
C_R &= C'_{rgba}[R] - \frac{2^{(n-1)}}{(2^n) - 1}
\end{aligned}$

Note

These formulae correspond to the “full range” encoding in the “Quantization schemes” chapter of the Khronos Data Format Specification.

Should any future amendments be made to the ITU specifications from which these equations are derived, the formulae used by Vulkan may also be updated to maintain parity.

If ycbcrRange is VK_SAMPLER_YCBCR_RANGE_ITU_NARROW, the following transformations are applied:

$\begin{aligned}
Y' &= \frac{C'_{rgba}[G] \times (2^n - 1) - 16 \times 2^{n-8}}{219 \times 2^{n-8}} \\
C_B &= \frac{C'_{rgba}[B] \times (2^n - 1) - 128 \times 2^{n-8}}{224 \times 2^{n-8}} \\
C_R &= \frac{C'_{rgba}[R] \times (2^n - 1) - 128 \times 2^{n-8}}{224 \times 2^{n-8}}
\end{aligned}$

Note

These formulae correspond to the “narrow range” encoding in the “Quantization schemes” chapter of the Khronos Data Format Specification.

In both sets of formulae, n is the bit-depth of the components in the format.

The precision of the operations performed during range expansion must be at least that of the source format.

An implementation may clamp the results of these range expansion operations such that Y′ falls in the range [0,1], and/or such that C_B and C_R fall in the range [-0.5,0.5].
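
The two range expansions above can be sketched as follows. This is a minimal illustration, not an implementation requirement: the function name, the string values for `ycbcr_range`, and the assumption that inputs are already normalized to [0, 1] are all hypothetical, and the optional clamping is left out.

```python
def expand_range(g, b, r, n, ycbcr_range="ITU_NARROW"):
    """Return (Y', C_B, C_R) from normalized G, B, R component values.

    n is the bit depth of the components. The strings "ITU_FULL" and
    "ITU_NARROW" are illustrative stand-ins for the VkSamplerYcbcrRange values.
    """
    if ycbcr_range == "ITU_FULL":
        # Luma passes through; chroma is re-centered around zero.
        offset = 2 ** (n - 1) / (2 ** n - 1)
        return g, b - offset, r - offset
    # Narrow range: 16..235 for luma and 16..240 for chroma at 8 bits,
    # scaled by 2^(n-8) for other bit depths.
    scale = 2 ** (n - 8)
    max_code = 2 ** n - 1
    y = (g * max_code - 16 * scale) / (219 * scale)
    cb = (b * max_code - 128 * scale) / (224 * scale)
    cr = (r * max_code - 128 * scale) / (224 * scale)
    return y, cb, cr
```

For 8-bit narrow-range data, the code values 16, 128 and 240 expand to approximately Y′ = 0, C_B = 0 and C_R = 0.5 respectively.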

Sampler Y′C_BC_R Model Conversion

The range-expanded values are converted between color models, according to the color model conversion specified in the ycbcrModel member:

VK_SAMPLER_YCBCR_MODEL_CONVERSION_RGB_IDENTITY: The color components are not modified by the color model conversion since they are assumed already to represent the desired color model in which the shader is operating; Y′C_BC_R range expansion is also ignored.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_IDENTITY: The color components are not modified by the color model conversion and are assumed to be treated as though in Y′C_BC_R form both in memory and in the shader; Y′C_BC_R range expansion is applied to the components as for other Y′C_BC_R models, with the vector (C_R,Y′,C_B,A) provided to the shader.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_709: The color components are transformed from a Y′C_BC_R representation to an R′G′B′ representation as described in the “BT.709 Y′C_BC_R conversion” section of the Khronos Data Format Specification.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_601: The color components are transformed from a Y′C_BC_R representation to an R′G′B′ representation as described in the “BT.601 Y′C_BC_R conversion” section of the Khronos Data Format Specification.
VK_SAMPLER_YCBCR_MODEL_CONVERSION_YCBCR_2020: The color components are transformed from a Y′C_BC_R representation to an R′G′B′ representation as described in the “BT.2020 Y′C_BC_R conversion” section of the Khronos Data Format Specification.

In this operation, each output component is dependent on each input component.

An implementation may clamp the R′G′B′ results of these conversions to the range [0,1].

The precision of the operations performed during model conversion must be at least that of the source format.

The alpha component is not modified by these model conversions.
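
For orientation, a Y′C_BC_R to R′G′B′ conversion of the kind referenced above can be sketched from the luma weights K_R and K_B. The defaults below are the BT.709 weights (0.2126 and 0.0722); they are taken from that recommendation rather than from this chapter, and the function itself is a hypothetical helper. The Khronos Data Format Specification remains the normative source for the exact matrices.

```python
def ycbcr_to_rgb(y, cb, cr, k_r=0.2126, k_b=0.0722):
    """Convert range-expanded (Y', C_B, C_R) to non-linear (R', G', B').

    k_r and k_b default to the BT.709 luma weights (an assumption of this
    sketch); other models use different weights.
    """
    k_g = 1.0 - k_r - k_b
    r = y + 2.0 * (1.0 - k_r) * cr
    b = y + 2.0 * (1.0 - k_b) * cb
    # Green follows from Y' = k_r*R' + k_g*G' + k_b*B'.
    g = (y
         - (2.0 * k_b * (1.0 - k_b) / k_g) * cb
         - (2.0 * k_r * (1.0 - k_r) / k_g) * cr)
    return r, g, b
```

Each output depends on several inputs, which is why this stage cannot be applied per component in the way range expansion can. Zero chroma leaves a gray value unchanged.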

Note

Sampling operations in a non-linear color space can introduce color and intensity shifts at sharp transition boundaries. To avoid this issue, the technically precise color correction sequence described in the “Introduction to Color Conversions” chapter of the Khronos Data Format Specification may be performed as follows:

Calculate the unnormalized texel coordinates corresponding to the desired sample position.
For a minFilter or magFilter of VK_FILTER_NEAREST:
1. Calculate (i,j) for the sample location as described under the “nearest filtering” formulae in (u,v,w,a) to (i,j,k,l,n) Transformation and Array Layer Selection
2. Calculate the normalized texel coordinates corresponding to these integer coordinates.
3. Sample using sampler Y′C_BC_R conversion at this location.
For a minFilter or magFilter of VK_FILTER_LINEAR:
1. Calculate (i_[0,1],j_[0,1]) for the sample location as described under the “linear filtering” formulae in (u,v,w,a) to (i,j,k,l,n) Transformation and Array Layer Selection
2. Calculate the normalized texel coordinates corresponding to these integer coordinates.
3. Sample using sampler Y′C_BC_R conversion at each of these locations.
4. Convert the non-linear A′R′G′B′ outputs of the Y′C_BC_R conversions to linear ARGB values as described in the “Transfer Functions” chapter of the Khronos Data Format Specification.
5. Interpolate the linear ARGB values using the α and β values described in the “linear filtering” section of (u,v,w,a) to (i,j,k,l,n) Transformation and Array Layer Selection and the equations in Texel Filtering.

The additional calculations and, especially, additional number of sampling operations in the VK_FILTER_LINEAR case can be expected to have a performance impact compared with using the outputs directly. Since the variations from “correct” results are subtle for most content, the application author should determine whether a more costly implementation is strictly necessary.

If chromaFilter, and minFilter or magFilter are both VK_FILTER_NEAREST, these operations are redundant and sampling using sampler Y′C_BC_R conversion at the desired sample coordinates will produce the “correct” results without further processing.

16.4. Texel Output Operations

Texel output instructions are SPIR-V image instructions that write to an image. Texel output operations are a set of steps that are performed on state, coordinates, and texel values while processing a texel output instruction, and which are common to some or all texel output instructions. They include the following steps, which are performed in the listed order:

16.4.1. Texel Output Validation Operations

Texel output validation operations inspect instruction/image state or coordinates, and in certain circumstances cause the write to have no effect. The texel undergoes a series of validations.

Texel Format Validation

If the image format of the OpTypeImage is not compatible with the VkImageView’s format, the write causes the contents of the image’s memory to become undefined.

Texel Type Validation

If the Sampled Type of the OpTypeImage does not match the SPIR-V Type, the write causes the value of the texel to become undefined. For integer types, if the signedness of the access does not match the signedness of the accessed resource, the write causes the value of the texel to become undefined.

16.4.2. Integer Texel Coordinate Validation

The integer texel coordinates are validated according to the same rules as for texel input coordinate validation.

If the texel fails integer texel coordinate validation, then the write has no effect.

16.4.3. Sparse Texel Operation

If the texel attempts to write to an unbound region of a sparse image, the texel is a sparse unbound texel. In such a case, if the VkPhysicalDeviceSparseProperties::residencyNonResidentStrict property is VK_TRUE, the sparse unbound texel write has no effect. If residencyNonResidentStrict is VK_FALSE, the write may have a side effect that becomes visible to other accesses to unbound texels in any resource, but will not be visible to any device memory allocated by the application.

16.4.4. Texel Output Format Conversion

If the image format is sRGB, a linear to sRGB conversion is applied to the R, G, and B components as described in the “sRGB EOTF” section of the Khronos Data Format Specification. The A component, if present, is unchanged.

Texels then undergo a format conversion from the floating point, signed, or unsigned integer type of the texel data to the VkFormat of the image view. If the number of components in the texel data is larger than the number of components in the format, additional components are discarded.

Each component is converted based on its type and size (as defined in the Format Definition section for each VkFormat). Floating-point outputs are converted as described in Floating-Point Format Conversions and Fixed-Point Data Conversion. Integer outputs are converted such that their value is preserved. The converted value of any integer that cannot be represented in the target format is undefined.

If the VkImageView format has an X component in its format description, undefined values are written to those bits.

If the underlying VkImage format has an X component in its format description, undefined values are also written to those bits, even if result format conversion produces a valid value for those bits because the VkImageView format is different.

16.5. Normalized Texel Coordinate Operations

If the image sampler instruction provides normalized texel coordinates, some of the following operations are performed.

16.5.1. Projection Operation

For Proj image operations, the normalized texel coordinates (s,t,r,q,a) and (if present) the D_ref coordinate are transformed as follows:

$\begin{aligned}
s &= \frac{s}{q}, & \text{for 1D, 2D, or 3D image} \\
t &= \frac{t}{q}, & \text{for 2D or 3D image} \\
r &= \frac{r}{q}, & \text{for 3D image} \\
D_{ref} &= \frac{D_{ref}}{q}, & \text{if provided}
\end{aligned}$

16.5.2. Derivative Image Operations

Derivatives are used for LOD selection. These derivatives are either implicit (in an ImplicitLod image instruction in a fragment shader) or explicit (provided explicitly by shader to the image instruction in any shader).

For implicit derivatives image instructions, the derivatives of texel coordinates are calculated in the same manner as derivative operations. That is:

$\begin{aligned}
\partial s / \partial x &= dPdx(s), & \partial s / \partial y &= dPdy(s), & \text{for 1D, 2D, Cube, or 3D image} \\
\partial t / \partial x &= dPdx(t), & \partial t / \partial y &= dPdy(t), & \text{for 2D, Cube, or 3D image} \\
\partial r / \partial x &= dPdx(r), & \partial r / \partial y &= dPdy(r), & \text{for Cube or 3D image}
\end{aligned}$

Partial derivatives not defined above for certain image dimensionalities are set to zero.

For explicit LOD image instructions, if the optional SPIR-V operand Grad is provided, then the operand values are used for the derivatives. The number of components present in each derivative for a given image dimensionality matches the number of partial derivatives computed above.

If the optional SPIR-V operand Lod is provided, then derivatives are set to zero, the cube map derivative transformation is skipped, and the scale factor operation is skipped. Instead, the floating point scalar coordinate is directly assigned to λ_base as described in LOD Operation.

If the image or sampler object used by an implicit derivative image instruction is not uniform across the quad and quadDivergentImplicitLod is not supported, then the derivative and LOD values are undefined. Implicit derivatives are well-defined when the image and sampler and control flow are uniform across the quad, even if they diverge between different quads.

If quadDivergentImplicitLod is supported, then derivatives and implicit LOD values are well-defined even if the image or sampler object is not uniform within a quad. The derivatives are computed as specified above, and the implicit LOD calculation proceeds for each shader invocation using its respective image and sampler object.

16.5.3. Cube Map Face Selection and Transformations

For cube map image instructions, the (s,t,r) coordinates are treated as a direction vector (r_x,r_y,r_z). The direction vector is used to select a cube map face. The direction vector is transformed to a per-face texel coordinate system (s_face,t_face). The direction vector is also used to transform the derivatives to per-face derivatives.

16.5.4. Cube Map Face Selection

The direction vector selects one of the cube map’s faces based on the largest magnitude coordinate direction (the major axis direction). Since two or more coordinates can have identical magnitude, the implementation must have rules to disambiguate this situation.

The rules should have as the first rule that r_z wins over r_y and r_x, and the second rule that r_y wins over r_x. An implementation may choose other rules, but the rules must be deterministic and depend only on (r_x,r_y,r_z).

The layer number (corresponding to a cube map face), the coordinate selections for s_c, t_c, r_c, and the selection of derivatives, are determined by the major axis direction as specified in the following two tables.

Table 17. Cube map face and coordinate selection
Major Axis Direction	Layer Number	Cube Map Face	s_c	t_c	r_c
+r_x	0	Positive X	-r_z	-r_y	r_x
-r_x	1	Negative X	+r_z	-r_y	r_x
+r_y	2	Positive Y	+r_x	+r_z	r_y
-r_y	3	Negative Y	+r_x	-r_z	r_y
+r_z	4	Positive Z	+r_x	-r_y	r_z
-r_z	5	Negative Z	-r_x	-r_y	r_z

Table 18. Cube map derivative selection
Major Axis Direction	∂s_c / ∂x	∂s_c / ∂y	∂t_c / ∂x	∂t_c / ∂y	∂r_c / ∂x	∂r_c / ∂y
+r_x	-∂r_z / ∂x	-∂r_z / ∂y	-∂r_y / ∂x	-∂r_y / ∂y	+∂r_x / ∂x	+∂r_x / ∂y
-r_x	+∂r_z / ∂x	+∂r_z / ∂y	-∂r_y / ∂x	-∂r_y / ∂y	-∂r_x / ∂x	-∂r_x / ∂y
+r_y	+∂r_x / ∂x	+∂r_x / ∂y	+∂r_z / ∂x	+∂r_z / ∂y	+∂r_y / ∂x	+∂r_y / ∂y
-r_y	+∂r_x / ∂x	+∂r_x / ∂y	-∂r_z / ∂x	-∂r_z / ∂y	-∂r_y / ∂x	-∂r_y / ∂y
+r_z	+∂r_x / ∂x	+∂r_x / ∂y	-∂r_y / ∂x	-∂r_y / ∂y	+∂r_z / ∂x	+∂r_z / ∂y
-r_z	-∂r_x / ∂x	-∂r_x / ∂y	-∂r_y / ∂x	-∂r_y / ∂y	-∂r_z / ∂x	-∂r_z / ∂y

16.5.5. Cube Map Coordinate Transformation

$\begin{aligned}
s_{face} &= \frac{1}{2} \times \frac{s_c}{|r_c|} + \frac{1}{2} \\
t_{face} &= \frac{1}{2} \times \frac{t_c}{|r_c|} + \frac{1}{2}
\end{aligned}$
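
Face selection and the coordinate transformation can be combined in a short sketch. It assumes a non-zero direction vector and uses the recommended tie-breaking order (r_z first, then r_y); the function name is hypothetical, and an implementation may use other deterministic rules.

```python
def cube_face(rx, ry, rz):
    """Return (layer, s_face, t_face) for a non-zero direction (rx, ry, rz)."""
    ax, ay, az = abs(rx), abs(ry), abs(rz)
    # Tie-break: r_z wins over r_y and r_x, then r_y wins over r_x.
    if az >= ax and az >= ay:
        layer, s_c, t_c, r_c = (4, rx, -ry, rz) if rz >= 0 else (5, -rx, -ry, rz)
    elif ay >= ax:
        layer, s_c, t_c, r_c = (2, rx, rz, ry) if ry >= 0 else (3, rx, -rz, ry)
    else:
        layer, s_c, t_c, r_c = (0, -rz, -ry, rx) if rx >= 0 else (1, rz, -ry, rx)
    # Map the selected coordinates from [-|r_c|, |r_c|] to [0, 1].
    s_face = 0.5 * s_c / abs(r_c) + 0.5
    t_face = 0.5 * t_c / abs(r_c) + 0.5
    return layer, s_face, t_face
```

A direction along an axis maps to the center of the corresponding face, and the direction (1, 1, 1) resolves to the Positive Z face under the tie-breaking rule used here.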

16.5.6. Cube Map Derivative Transformation

$\begin{aligned}
\frac{\partial s_{face}}{\partial x} &= \frac{\partial}{\partial x} \left( \frac{1}{2} \times \frac{s_c}{|r_c|} + \frac{1}{2} \right) \\
&= \frac{1}{2} \times \frac{\partial}{\partial x} \left( \frac{s_c}{|r_c|} \right) \\
&= \frac{1}{2} \times \left( \frac{|r_c| \times \partial s_c / \partial x - s_c \times \partial r_c / \partial x}{(r_c)^2} \right)
\end{aligned}$

$\begin{aligned}
\frac{\partial s_{face}}{\partial y} &= \frac{1}{2} \times \left( \frac{|r_c| \times \partial s_c / \partial y - s_c \times \partial r_c / \partial y}{(r_c)^2} \right) \\
\frac{\partial t_{face}}{\partial x} &= \frac{1}{2} \times \left( \frac{|r_c| \times \partial t_c / \partial x - t_c \times \partial r_c / \partial x}{(r_c)^2} \right) \\
\frac{\partial t_{face}}{\partial y} &= \frac{1}{2} \times \left( \frac{|r_c| \times \partial t_c / \partial y - t_c \times \partial r_c / \partial y}{(r_c)^2} \right)
\end{aligned}$

16.5.7. Scale Factor Operation, LOD Operation and Image Level(s) Selection

LOD selection can be either explicit (provided explicitly by the image instruction) or implicit (determined from a scale factor calculated from the derivatives). The LOD must be computed with mipmapPrecisionBits of accuracy.

Scale Factor Operation

The magnitudes of the derivatives are calculated by:

: m_ux = |∂s/∂x| × w_base
: m_vx = |∂t/∂x| × h_base
: m_wx = |∂r/∂x| × d_base
: m_uy = |∂s/∂y| × w_base
: m_vy = |∂t/∂y| × h_base
: m_wy = |∂r/∂y| × d_base

where:

: ∂t/∂x = ∂t/∂y = 0 (for 1D images)
: ∂r/∂x = ∂r/∂y = 0 (for 1D, 2D or Cube images)

and:

: w_base = image.w
: h_base = image.h
: d_base = image.d

(for the baseMipLevel, from the image descriptor).

A point sampled in screen space has an elliptical footprint in texture space. The minimum and maximum scale factors (ρ_min, ρ_max) should be the minor and major axes of this ellipse.

The scale factors ρ_x and ρ_y, calculated from the magnitude of the derivatives in x and y, are used to compute the minimum and maximum scale factors.

ρ_x and ρ_y may be approximated with functions f_x and f_y, subject to the following constraints:

$f_x$ is continuous and monotonically increasing in each of $m_{ux}$, $m_{vx}$, and $m_{wx}$

$f_y$ is continuous and monotonically increasing in each of $m_{uy}$, $m_{vy}$, and $m_{wy}$

$\max(|m_{ux}|, |m_{vx}|, |m_{wx}|) \leq f_x \leq 2 (|m_{ux}| + |m_{vx}| + |m_{wx}|)$

$\max(|m_{uy}|, |m_{vy}|, |m_{wy}|) \leq f_y \leq 2 (|m_{uy}| + |m_{vy}| + |m_{wy}|)$

The minimum and maximum scale factors (ρ_min,ρ_max) are determined by:

: ρ_max = max(ρ_x, ρ_y)
: ρ_min = min(ρ_x, ρ_y)

The ratio of anisotropy is determined by:

: η = min(ρ_max/ρ_min, max_Aniso)

where:

: sampler.max_Aniso = maxAnisotropy (from sampler descriptor)
: limits.max_Aniso = maxSamplerAnisotropy (from physical device limits)
: max_Aniso = min(sampler.max_Aniso, limits.max_Aniso)

If ρ_max = ρ_min = 0, then all the partial derivatives are zero, the fragment’s footprint in texel space is a point, and η should be treated as 1. If ρ_max ≠ 0 and ρ_min = 0 then all partial derivatives along one axis are zero, the fragment’s footprint in texel space is a line segment, and η should be treated as max_Aniso. However, any time the footprint is small in texel space the implementation may use a smaller value of η, even when ρ_min is zero or close to zero. If either VkPhysicalDeviceFeatures::samplerAnisotropy or VkSamplerCreateInfo::anisotropyEnable is VK_FALSE, max_Aniso is set to 1.

If η = 1, sampling is isotropic. If η > 1, sampling is anisotropic.

The sampling rate (N) is derived as:

: N = ⌈η⌉

An implementation may round N up to the nearest supported sampling rate. An implementation may use the value of N as an approximation of η.
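
The scale factor and anisotropy steps can be sketched as below. This sketch assumes the Euclidean length of the derivative magnitudes for ρ_x and ρ_y, which is one choice that satisfies the constraints on f_x and f_y; the function and parameter names are hypothetical.

```python
import math

def anisotropy(m_x, m_y, sampler_max_aniso, limit_max_aniso,
               anisotropy_enabled=True):
    """Return (rho_max, eta, N).

    m_x = (m_ux, m_vx, m_wx) and m_y = (m_uy, m_vy, m_wy) are the derivative
    magnitudes. rho is taken as the Euclidean length, an assumption of this
    sketch that lies between the max and the scaled sum of the magnitudes.
    """
    rho_x = math.sqrt(sum(m * m for m in m_x))
    rho_y = math.sqrt(sum(m * m for m in m_y))
    rho_max, rho_min = max(rho_x, rho_y), min(rho_x, rho_y)
    max_aniso = (min(sampler_max_aniso, limit_max_aniso)
                 if anisotropy_enabled else 1.0)
    if rho_max == 0.0:
        eta = 1.0            # footprint is a point
    elif rho_min == 0.0:
        eta = max_aniso      # footprint is a line segment
    else:
        eta = min(rho_max / rho_min, max_aniso)
    return rho_max, eta, math.ceil(eta)
```

A footprint four times longer in x than in y gives η = 4 and N = 4 when the limits allow it, and η = 1 when anisotropy is disabled.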

LOD Operation

The LOD parameter λ is computed as follows:

$\begin{aligned}
\lambda_{base}(x, y) &= \begin{cases} \mathit{shaderOp.Lod} & \text{(from optional SPIR-V operand)} \\ \log_2 \left( \frac{\rho_{max}}{\eta} \right) & \text{otherwise} \end{cases} \\
\lambda'(x, y) &= \lambda_{base} + \mathrm{clamp}(\mathit{sampler.bias} + \mathit{shaderOp.bias}, -\mathit{maxSamplerLodBias}, \mathit{maxSamplerLodBias}) \\
\lambda &= \begin{cases} \mathit{lod}_{max}, & \lambda' > \mathit{lod}_{max} \\ \lambda', & \mathit{lod}_{min} \leq \lambda' \leq \mathit{lod}_{max} \\ \mathit{lod}_{min}, & \lambda' < \mathit{lod}_{min} \\ \text{undefined}, & \mathit{lod}_{min} > \mathit{lod}_{max} \end{cases}
\end{aligned}$

where:

$\begin{aligned}
\mathit{sampler.bias} &= \mathit{mipLodBias} & \text{(from sampler descriptor)} \\
\mathit{shaderOp.bias} &= \begin{cases} \mathit{Bias} & \text{(from optional SPIR-V operand)} \\ 0 & \text{otherwise} \end{cases} \\
\mathit{sampler.lod}_{min} &= \mathit{minLod} & \text{(from sampler descriptor)} \\
\mathit{shaderOp.lod}_{min} &= \begin{cases} \mathit{MinLod} & \text{(from optional SPIR-V operand)} \\ 0 & \text{otherwise} \end{cases} \\
\mathit{lod}_{min} &= \max(\mathit{sampler.lod}_{min}, \mathit{shaderOp.lod}_{min}) \\
\mathit{lod}_{max} &= \mathit{maxLod} & \text{(from sampler descriptor)}
\end{aligned}$

and maxSamplerLodBias is the value of the VkPhysicalDeviceLimits::maxSamplerLodBias limit.
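
A minimal sketch of the LOD computation follows. The function and parameter names are hypothetical, `None` stands in for the “undefined” case, and ρ_max is assumed to be greater than zero when no explicit Lod operand is given.

```python
import math

def compute_lod(rho_max, eta, sampler_bias, min_lod, max_lod,
                max_sampler_lod_bias,
                shader_lod=None, shader_bias=0.0, shader_min_lod=0.0):
    """Return lambda, or None where the result is undefined."""
    # lambda_base: explicit Lod operand if present, else from the scale factor.
    if shader_lod is not None:
        lambda_base = shader_lod
    else:
        lambda_base = math.log2(rho_max / eta)
    # The combined bias is clamped to the device limit before being added.
    bias = sampler_bias + shader_bias
    bias = min(max(bias, -max_sampler_lod_bias), max_sampler_lod_bias)
    lambda_prime = lambda_base + bias
    lod_min = max(min_lod, shader_min_lod)
    if lod_min > max_lod:
        return None  # undefined
    return min(max(lambda_prime, lod_min), max_lod)
```

Note that the clamp applies to the sum of the sampler and shader biases, not to each bias separately.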

Image Level(s) Selection

The image level(s) d, d_hi, and d_lo which texels are read from are determined by an image-level parameter d_l, which is computed based on the LOD parameter, as follows:

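$d_l = \begin{cases} \mathrm{nearest}(d'), & \text{mipmapMode is VK\_SAMPLER\_MIPMAP\_MODE\_NEAREST} \\ d', & \text{otherwise} \end{cases}$

where:

$d' = \mathit{level}_{base} + \mathrm{clamp}(\lambda, 0, q)$

$\mathrm{nearest}(d') = \begin{cases} \lceil d' + 0.5 \rceil - 1, & \text{preferred} \\ \lfloor d' + 0.5 \rfloor, & \text{alternative} \end{cases}$

and:

$\begin{aligned}
\mathit{level}_{base} &= \mathit{baseMipLevel} \\
q &= \mathit{levelCount} - 1
\end{aligned}$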

baseMipLevel and levelCount are taken from the subresourceRange of the image view.

If the sampler’s mipmapMode is VK_SAMPLER_MIPMAP_MODE_NEAREST, then the level selected is d = d_l.

If the sampler’s mipmapMode is VK_SAMPLER_MIPMAP_MODE_LINEAR, two neighboring levels are selected:

$\begin{aligned}
d_{hi} &= \lfloor d_l \rfloor \\
d_{lo} &= \min(d_{hi} + 1, \mathit{level}_{base} + q) \\
\delta &= d_l - d_{hi}
\end{aligned}$

δ is the fractional value, quantized to the number of mipmap precision bits, used for linear filtering between levels.
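
The level selection can be sketched as two small helpers, one per mipmap mode. The names are hypothetical, the nearest case uses the “preferred” rounding, and the quantization of δ to the mipmap precision bits is not modeled.

```python
import math

def nearest_level(lam, base_mip_level, level_count):
    """Level d for VK_SAMPLER_MIPMAP_MODE_NEAREST (preferred rounding)."""
    q = level_count - 1
    d_prime = base_mip_level + min(max(lam, 0.0), q)
    return math.ceil(d_prime + 0.5) - 1

def linear_levels(lam, base_mip_level, level_count):
    """Return (d_hi, d_lo, delta) for VK_SAMPLER_MIPMAP_MODE_LINEAR."""
    q = level_count - 1
    d_l = base_mip_level + min(max(lam, 0.0), q)
    d_hi = math.floor(d_l)
    d_lo = min(d_hi + 1, base_mip_level + q)
    return d_hi, d_lo, d_l - d_hi
```

With the preferred rounding, a value exactly halfway between two levels selects the more detailed (lower-numbered) one.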

16.5.8. (s,t,r,q,a) to (u,v,w,a) Transformation

The normalized texel coordinates are scaled by the image level dimensions and the array layer is selected.

This transformation is performed once for each level used in filtering (either d, or d_hi and d_lo).

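$\begin{aligned}
u(x, y) &= s(x, y) \times \mathit{width}_{scale} + \Delta_i \\
v(x, y) &= \begin{cases} 0 & \text{for 1D images} \\ t(x, y) \times \mathit{height}_{scale} + \Delta_j & \text{otherwise} \end{cases} \\
w(x, y) &= \begin{cases} 0 & \text{for 2D or Cube images} \\ r(x, y) \times \mathit{depth}_{scale} + \Delta_k & \text{otherwise} \end{cases} \\
a(x, y) &= \begin{cases} a(x, y) & \text{for array images} \\ 0 & \text{otherwise} \end{cases}
\end{aligned}$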

where:

: width_scale = width_level
: height_scale = height_level
: depth_scale = depth_level

and where (Δ_i, Δ_j, Δ_k) are taken from the image instruction if it includes a ConstOffset or Offset operand, otherwise they are taken to be zero.

Operations then proceed to Unnormalized Texel Coordinate Operations.

16.6. Unnormalized Texel Coordinate Operations

16.6.1. (u,v,w,a) to (i,j,k,l,n) Transformation and Array Layer Selection

The unnormalized texel coordinates are transformed to integer texel coordinates relative to the selected mipmap level.

The layer index l is computed as:

: l = clamp(RNE(a), 0, layerCount - 1) + baseArrayLayer

where layerCount is the number of layers in the image subresource range of the image view, baseArrayLayer is the first layer from the subresource range, and where:

$\mathrm{RNE}(a) = \begin{cases} \mathrm{roundTiesToEven}(a) & \text{preferred, from IEEE Std 754-2008 Floating-Point Arithmetic} \\ \lfloor a + 0.5 \rfloor & \text{alternative} \end{cases}$

The sample index n is assigned the value 0.

Nearest filtering (VK_FILTER_NEAREST) computes the integer texel coordinates that the unnormalized coordinates lie within:

$\begin{aligned}
i &= \lfloor u + \mathit{shift} \rfloor \\
j &= \lfloor v + \mathit{shift} \rfloor \\
k &= \lfloor w + \mathit{shift} \rfloor
\end{aligned}$

where:

: shift = 0.0

Linear filtering (VK_FILTER_LINEAR) computes a set of neighboring coordinates which bound the unnormalized coordinates. The integer texel coordinates are combinations of i₀ or i₁, j₀ or j₁, k₀ or k₁, as well as weights α, β, and γ.

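$\begin{aligned}
i_0 &= \lfloor u - \mathit{shift} \rfloor \\
i_1 &= i_0 + 1 \\
j_0 &= \lfloor v - \mathit{shift} \rfloor \\
j_1 &= j_0 + 1 \\
k_0 &= \lfloor w - \mathit{shift} \rfloor \\
k_1 &= k_0 + 1
\end{aligned}$

$\begin{aligned}
\alpha &= \mathrm{frac}(u - \mathit{shift}) \\
\beta &= \mathrm{frac}(v - \mathit{shift}) \\
\gamma &= \mathrm{frac}(w - \mathit{shift})
\end{aligned}$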

where:

: shift = 0.5

and where:

$\mathrm{frac}(x) = x - \lfloor x \rfloor$

where the number of fraction bits retained is specified by VkPhysicalDeviceLimits::subTexelPrecisionBits.
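
The coordinate transformations in this section can be sketched per axis as follows. The helper names are hypothetical, the fraction is kept at full floating-point precision rather than being limited to subTexelPrecisionBits, and the layer index uses the “preferred” round-ties-to-even behavior, which Python's built-in `round` provides.

```python
import math

def nearest_coord(u):
    """Integer texel coordinate for VK_FILTER_NEAREST (shift = 0.0)."""
    return math.floor(u + 0.0)

def linear_coords(u):
    """Return (i0, i1, alpha) for VK_FILTER_LINEAR (shift = 0.5)."""
    shifted = u - 0.5
    i0 = math.floor(shifted)
    return i0, i0 + 1, shifted - i0

def layer_index(a, layer_count, base_array_layer):
    """Layer l = clamp(RNE(a), 0, layerCount - 1) + baseArrayLayer."""
    rne = round(a)  # Python's round() rounds ties to even
    return min(max(rne, 0), layer_count - 1) + base_array_layer
```

A coordinate at a texel center, such as u = 2.5, gives α = 0 so that only texel 2 contributes. Near the image edge, i₀ can be −1, which is then handled by the wrapping operation.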

16.7. Integer Texel Coordinate Operations

The OpImageFetch and OpImageSparseFetch SPIR-V instructions may supply a LOD from which texels are to be fetched using the optional SPIR-V operand Lod. Other integer-coordinate operations must not. If the Lod is provided then it must be an integer.

The image level selected is:

$d = \mathit{level}_{base} + \begin{cases} \mathit{Lod} & \text{(from optional SPIR-V operand)} \\ 0 & \text{otherwise} \end{cases}$

If d does not lie in the range [baseMipLevel, baseMipLevel + levelCount) then any values fetched are undefined, and any writes (if supported) are discarded.

16.8. Image Sample Operations

16.8.1. Wrapping Operation

Cube images ignore the wrap modes specified in the sampler. Instead, if VK_FILTER_NEAREST is used within a mip level then VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE is used, and if VK_FILTER_LINEAR is used within a mip level then sampling at the edges is performed as described earlier in the Cube map edge handling section.

The first integer texel coordinate i is transformed based on the addressModeU parameter of the sampler.

$i = \begin{cases}
i \bmod \mathit{size} & \text{for repeat} \\
(\mathit{size} - 1) - \mathrm{mirror}((i \bmod (2 \times \mathit{size})) - \mathit{size}) & \text{for mirrored repeat} \\
\mathrm{clamp}(i, 0, \mathit{size} - 1) & \text{for clamp to edge} \\
\mathrm{clamp}(i, -1, \mathit{size}) & \text{for clamp to border} \\
\mathrm{clamp}(\mathrm{mirror}(i), 0, \mathit{size} - 1) & \text{for mirror clamp to edge}
\end{cases}$

where:

$\mathrm{mirror}(n) = \begin{cases} n & \text{for } n \geq 0 \\ -(1 + n) & \text{otherwise} \end{cases}$

j (for 2D and Cube image) and k (for 3D image) are similarly transformed based on the addressModeV and addressModeW parameters of the sampler, respectively.
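
The wrapping rules can be sketched for a single coordinate as below. The mode strings are hypothetical stand-ins for the VkSamplerAddressMode values, and the sketch assumes that mod is the floored modulo (non-negative for a positive size), which is what Python's `%` operator computes.

```python
def mirror(n):
    """mirror(n) = n for n >= 0, otherwise -(1 + n)."""
    return n if n >= 0 else -(1 + n)

def wrap(i, size, mode):
    """Apply one address mode to integer texel coordinate i."""
    if mode == "REPEAT":
        return i % size
    if mode == "MIRRORED_REPEAT":
        return (size - 1) - mirror((i % (2 * size)) - size)
    if mode == "CLAMP_TO_EDGE":
        return min(max(i, 0), size - 1)
    if mode == "CLAMP_TO_BORDER":
        # -1 and size are kept so that border texels can be detected later.
        return min(max(i, -1), size)
    if mode == "MIRROR_CLAMP_TO_EDGE":
        return min(max(mirror(i), 0), size - 1)
    raise ValueError(f"unknown address mode: {mode}")
```

For a size of 4, mirrored repeat maps the coordinates −2 to 8 onto 1, 0, 0, 1, 2, 3, 3, 2, 1, 0, 0, so edge texels are repeated once at each reflection.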

16.8.2. Texel Gathering

SPIR-V instructions with Gather in the name return a vector derived from 4 texels in the base level of the image view. The rules for the VK_FILTER_LINEAR minification filter are applied to identify the four selected texels. Each texel is then converted to an RGBA value according to conversion to RGBA and then swizzled. A four-component vector is then assembled by taking the component indicated by the Component value in the instruction from the swizzled color value of the four texels. If the operation does not use the ConstOffsets image operand then the four texels form the 2 × 2 rectangle used for texture filtering:

$\begin{aligned}
\tau[R] &= \tau_{i_0 j_1}[\mathit{level}_{base}][\mathit{comp}] \\
\tau[G] &= \tau_{i_1 j_1}[\mathit{level}_{base}][\mathit{comp}] \\
\tau[B] &= \tau_{i_1 j_0}[\mathit{level}_{base}][\mathit{comp}] \\
\tau[A] &= \tau_{i_0 j_0}[\mathit{level}_{base}][\mathit{comp}]
\end{aligned}$

If the operation does use the ConstOffsets image operand then the offsets allow a custom filter to be defined:

$\begin{aligned}
\tau[R] &= \tau_{i_0 j_0 + \Delta_0}[\mathit{level}_{base}][\mathit{comp}] \\
\tau[G] &= \tau_{i_0 j_0 + \Delta_1}[\mathit{level}_{base}][\mathit{comp}] \\
\tau[B] &= \tau_{i_0 j_0 + \Delta_2}[\mathit{level}_{base}][\mathit{comp}] \\
\tau[A] &= \tau_{i_0 j_0 + \Delta_3}[\mathit{level}_{base}][\mathit{comp}]
\end{aligned}$

where:

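$\begin{aligned}
\tau[\mathit{level}_{base}][\mathit{comp}] &= \begin{cases}
\tau[\mathit{level}_{base}][R], & \text{for } \mathit{comp} = 0 \\
\tau[\mathit{level}_{base}][G], & \text{for } \mathit{comp} = 1 \\
\tau[\mathit{level}_{base}][B], & \text{for } \mathit{comp} = 2 \\
\tau[\mathit{level}_{base}][A], & \text{for } \mathit{comp} = 3
\end{cases} \\
\mathit{comp} & \quad \text{from SPIR-V operand Component}
\end{aligned}$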

OpImage*Gather must not be used on a sampled image with sampler Y′C_BC_R conversion enabled.
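
The ordering of the gathered components is easy to get wrong, so the case without ConstOffsets is sketched below. The accessor `texel(i, j)` is a hypothetical callable returning the swizzled (R, G, B, A) value in the base level; wrapping and border handling are assumed to happen inside it.

```python
import math

def gather(texel, u, v, comp):
    """Return the gathered 4-vector for unnormalized coordinates (u, v).

    texel(i, j) is a hypothetical accessor returning a swizzled
    (R, G, B, A) tuple; comp is the Component operand (0..3).
    """
    # Same 2x2 footprint as VK_FILTER_LINEAR (shift = 0.5).
    i0 = math.floor(u - 0.5)
    j0 = math.floor(v - 0.5)
    i1, j1 = i0 + 1, j0 + 1
    # Result order R, G, B, A = (i0,j1), (i1,j1), (i1,j0), (i0,j0).
    order = [(i0, j1), (i1, j1), (i1, j0), (i0, j0)]
    return tuple(texel(i, j)[comp] for i, j in order)
```

The four texels are visited counter-clockwise starting from (i₀, j₁), so the texel at (i₀, j₀) ends up in the A component of the result.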

16.8.3. Texel Filtering

Texel filtering is first performed for each level (either d or d_hi and d_lo).

If λ is less than or equal to zero, the texture is said to be magnified, and the filter mode within a mip level is selected by the magFilter in the sampler. If λ is greater than zero, the texture is said to be minified, and the filter mode within a mip level is selected by the minFilter in the sampler.

Texel Nearest Filtering

Within a mip level, VK_FILTER_NEAREST filtering selects a single value using the (i, j, k) texel coordinates, with all texels taken from layer l.

$\tau[\mathit{level}] = \begin{cases}
\tau_{ijk}[\mathit{level}], & \text{for 3D image} \\
\tau_{ij}[\mathit{level}], & \text{for 2D or Cube image} \\
\tau_{i}[\mathit{level}], & \text{for 1D image}
\end{cases}$

Texel Linear Filtering

Within a mip level, VK_FILTER_LINEAR filtering combines 8 (for 3D), 4 (for 2D or Cube), or 2 (for 1D) texel values, together with their linear weights. The linear weights are derived from the fractions computed earlier:

$\begin{aligned}
w_{i_0} &= (1 - \alpha) \\
w_{i_1} &= (\alpha) \\
w_{j_0} &= (1 - \beta) \\
w_{j_1} &= (\beta) \\
w_{k_0} &= (1 - \gamma) \\
w_{k_1} &= (\gamma)
\end{aligned}$

The values of multiple texels, together with their weights, are combined to produce a filtered value.

The VkSamplerReductionModeCreateInfo::reductionMode can control the process by which multiple texels, together with their weights, are combined to produce a filtered texture value.

When the reductionMode is set (explicitly or implicitly) to VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE, a weighted average is computed:

$\begin{aligned}
\tau_{3D} &= \sum_{k=k_0}^{k_1} \sum_{j=j_0}^{j_1} \sum_{i=i_0}^{i_1} (w_i)(w_j)(w_k) \tau_{ijk} \\
\tau_{2D} &= \sum_{j=j_0}^{j_1} \sum_{i=i_0}^{i_1} (w_i)(w_j) \tau_{ij} \\
\tau_{1D} &= \sum_{i=i_0}^{i_1} (w_i) \tau_{i}
\end{aligned}$

However, if the reduction mode is VK_SAMPLER_REDUCTION_MODE_MIN or VK_SAMPLER_REDUCTION_MODE_MAX, the process operates on the above set of multiple texels, together with their weights, computing a component-wise minimum or maximum, respectively, of the components of the set of texels with non-zero weights.
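
For the 2D case, the three reduction modes can be sketched on scalar texel values as follows. The accessor `texel(i, j)` and the mode strings are hypothetical; real texels are vectors, for which the minimum and maximum are taken per component.

```python
def filter_2d(texel, i0, j0, alpha, beta, reduction="WEIGHTED_AVERAGE"):
    """Combine the 2x2 footprint starting at (i0, j0).

    texel(i, j) is a hypothetical accessor returning a scalar value.
    """
    # Pair every texel value with its weight w_i * w_j.
    taps = [(texel(i0 + di, j0 + dj), wi * wj)
            for dj, wj in ((0, 1 - beta), (1, beta))
            for di, wi in ((0, 1 - alpha), (1, alpha))]
    if reduction == "WEIGHTED_AVERAGE":
        return sum(value * weight for value, weight in taps)
    # MIN and MAX ignore texels whose weight is zero.
    contributing = [value for value, weight in taps if weight != 0]
    return min(contributing) if reduction == "MIN" else max(contributing)
```

The weights matter for MIN and MAX only as a mask: a texel with zero weight is excluded, so sampling exactly on a row of texel centers reduces over two texels rather than four.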

Texel Mipmap Filtering

VK_SAMPLER_MIPMAP_MODE_NEAREST filtering returns the value of a single mipmap level,

τ = τ[d].

VK_SAMPLER_MIPMAP_MODE_LINEAR filtering combines the values of multiple mipmap levels (τ[hi] and τ[lo]), together with their linear weights.

The linear weights are derived from the fraction computed earlier:

$\begin{aligned}
w_{hi} &= (1 - \delta) \\
w_{lo} &= (\delta)
\end{aligned}$

The values of multiple mipmap levels, together with their weights, are combined to produce a final filtered value.

The VkSamplerReductionModeCreateInfo::reductionMode can control the process by which multiple texels, together with their weights, are combined to produce a filtered texture value.

When the reductionMode is set (explicitly or implicitly) to VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE, a weighted average is computed:

$\tau = (w_{hi}) \tau[hi] + (w_{lo}) \tau[lo]$

However, if the reduction mode is VK_SAMPLER_REDUCTION_MODE_MIN or VK_SAMPLER_REDUCTION_MODE_MAX, the process operates on the above values, together with their weights, computing a component-wise minimum or maximum, respectively, of the components of the values with non-zero weights.

Texel Anisotropic Filtering

Anisotropic filtering is enabled by the anisotropyEnable in the sampler. When enabled, the image filtering scheme accounts for a degree of anisotropy.

The particular scheme for anisotropic texture filtering is implementation-dependent. Implementations should consider the magFilter, minFilter and mipmapMode of the sampler to control the specifics of the anisotropic filtering scheme used. In addition, implementations should consider minLod and maxLod of the sampler.

Note

For historical reasons, vendor implementations of anisotropic filtering interpret these sampler parameters in different ways, particularly in corner cases such as a magFilter or minFilter of NEAREST, or a maxAnisotropy equal to 1.0. Applications should not expect consistent behavior in such cases, and should use anisotropic filtering only with parameters which are expected to give a quality improvement relative to LINEAR filtering.

The following describes one particular approach to implementing anisotropic filtering for the 2D Image case; implementations may choose other methods:

Given a magFilter and minFilter of VK_FILTER_LINEAR and a mipmapMode of VK_SAMPLER_MIPMAP_MODE_NEAREST:

Instead of a single isotropic sample, N isotropic samples are sampled within the image footprint of the image level d to approximate an anisotropic filter. The sum τ_2Daniso is defined using the single isotropic τ_2D(u,v) at level d.

τ_{2Daniso} = \frac{1}{N} \sum_{i=1}^{N} τ_{2D}(u(x - \frac{1}{2} + \frac{i}{N+1}, y), v(x - \frac{1}{2} + \frac{i}{N+1}, y)), when ρ_x > ρ_y
τ_{2Daniso} = \frac{1}{N} \sum_{i=1}^{N} τ_{2D}(u(x, y - \frac{1}{2} + \frac{i}{N+1}), v(x, y - \frac{1}{2} + \frac{i}{N+1})), when ρ_y \geq ρ_x

When VkSamplerReductionModeCreateInfo::reductionMode is set to VK_SAMPLER_REDUCTION_MODE_WEIGHTED_AVERAGE, the above summation is used. However, if the reduction mode is VK_SAMPLER_REDUCTION_MODE_MIN or VK_SAMPLER_REDUCTION_MODE_MAX, the process operates on the above values, together with their weights, computing a component-wise minimum or maximum, respectively, of the components of the values with non-zero weights.
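The weighted-average case of the example approach above can be sketched in C as follows. This is only one possible reading of that example, not a description of any implementation: aniso_2d, CoordFn and SampleFn are hypothetical names, and demo_u, demo_v and demo_tau are made-up stand-ins for the mappings u(x,y), v(x,y) and the isotropic sample τ_2D(u,v) at level d.

```c
#include <assert.h>

typedef float (*CoordFn)(float x, float y);   /* u(x,y) or v(x,y) */
typedef float (*SampleFn)(float u, float v);  /* isotropic tau_2D at level d */

/* Average N isotropic samples spread along the axis of larger rho. */
float aniso_2d(SampleFn tau_2d, CoordFn u, CoordFn v,
               float x, float y, int n, float rho_x, float rho_y)
{
    assert(n > 0);
    float sum = 0.0f;
    for (int i = 1; i <= n; ++i) {
        /* Offset -1/2 + i/(N+1) places the samples inside the footprint. */
        const float offset = -0.5f + (float)i / (float)(n + 1);
        const float sx = (rho_x > rho_y) ? x + offset : x;
        const float sy = (rho_x > rho_y) ? y : y + offset;
        sum += tau_2d(u(sx, sy), v(sx, sy));
    }
    return sum / (float)n;
}

/* Made-up demo mappings: u = 2x, v = 2y, and an image whose value is u*u. */
float demo_u(float x, float y) { (void)y; return 2.0f * x; }
float demo_v(float x, float y) { (void)x; return 2.0f * y; }
float demo_tau(float u, float v) { (void)v; return u * u; }
```

With N = 1 the offset is zero, so the sketch reduces to a single isotropic sample at (x, y).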

16.9. Image Operation Steps

Each step described in this chapter is performed by a subset of the image instructions:

Texel Input Validation Operations, Format Conversion, Texel Replacement, Conversion to RGBA, and Component Swizzle: Performed by all instructions except OpImageWrite.
Depth Comparison: Performed by OpImage*Dref instructions.
All Texel output operations: Performed by OpImageWrite.
Projection: Performed by all OpImage*Proj instructions.
Derivative Image Operations, Cube Map Operations, Scale Factor Operation, LOD Operation and Image Level(s) Selection, and Texel Anisotropic Filtering: Performed by all OpImageSample* and OpImageSparseSample* instructions.
(s,t,r,q,a) to (u,v,w,a) Transformation, Wrapping, and (u,v,w,a) to (i,j,k,l,n) Transformation and Array Layer Selection: Performed by all OpImageSample*, OpImageSparseSample*, and OpImage*Gather instructions.
Texel Gathering: Performed by OpImage*Gather instructions.
Texel Filtering: Performed by all OpImageSample* and OpImageSparseSample* instructions.
Sparse Residency: Performed by all OpImageSparse* instructions.

16.10. Image Query Instructions

16.10.1. Image Property Queries

OpImageQuerySize, OpImageQuerySizeLod, OpImageQueryLevels, and OpImageQuerySamples query properties of the image descriptor that would be accessed by a shader image operation.

OpImageQuerySizeLod returns the size of the image level identified by the Level of Detail operand. If that level does not exist in the image, then the value returned is undefined.
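As an illustration, the size of an existing level of a 2D image can be modeled in C as follows. This sketch assumes the usual mip chain rule, in which each dimension of level n is max(1, base >> n); query_size_lod and its parameters are hypothetical names. Because the shader-visible result for a nonexistent level is undefined, the sketch simply reports failure in that case instead of producing a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the size of mip level `level` of a 2D image.
 * Returns false when the level does not exist; the outputs are then
 * left untouched, mirroring the "undefined" result of the query. */
bool query_size_lod(uint32_t base_width, uint32_t base_height,
                    uint32_t level_count, uint32_t level,
                    uint32_t *width, uint32_t *height)
{
    assert(base_width > 0 && base_height > 0);
    assert(level_count > 0 && level_count <= 32);
    if (level >= level_count)
        return false;

    const uint32_t w = base_width >> level;
    const uint32_t h = base_height >> level;
    *width = (w > 0) ? w : 1;   /* each dimension is clamped to at least 1 */
    *height = (h > 0) ? h : 1;
    return true;
}
```

For a 256x64 image with 9 levels, level 3 is 32x8 and level 8 is 1x1 under this rule.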

16.10.2. Lod Query

OpImageQueryLod returns the Lod parameters that would be used in an image operation with the given image and coordinates. The steps described in this chapter are performed as if for OpImageSampleImplicitLod, up to Scale Factor Operation, LOD Operation and Image Level(s) Selection. The return value is the vector (λ', d_l). These values may be subject to implementation-specific maxima and minima for very large, out-of-range values.
