Dependencies
This extension is written against the OpenCL 3.0 C Language specification and the OpenCL SPIR-V Environment specification, V3.0.8.
This extension requires OpenCL 1.0.
Overview
This extension adds built-in functions to convert between single-precision 32-bit floating-point values and 16-bit bfloat16 values.
The 16-bit bfloat16 format has a dynamic range similar to that of the 32-bit float format, albeit with lower precision than the 16-bit half format.
Please note that this extension currently does not introduce a bfloat16 type to OpenCL C; instead, the built-in functions convert to or from a ushort 16-bit unsigned integer type with a bit pattern that represents a bfloat16 value.
New OpenCL C Functions
ushort intel_convert_bfloat16_as_ushort(float source);
ushort2 intel_convert_bfloat162_as_ushort2(float2 source);
ushort3 intel_convert_bfloat163_as_ushort3(float3 source);
ushort4 intel_convert_bfloat164_as_ushort4(float4 source);
ushort8 intel_convert_bfloat168_as_ushort8(float8 source);
ushort16 intel_convert_bfloat1616_as_ushort16(float16 source);
float intel_convert_as_bfloat16_float(ushort source);
float2 intel_convert_as_bfloat162_float2(ushort2 source);
float3 intel_convert_as_bfloat163_float3(ushort3 source);
float4 intel_convert_as_bfloat164_float4(ushort4 source);
float8 intel_convert_as_bfloat168_float8(ushort8 source);
float16 intel_convert_as_bfloat1616_float16(ushort16 source);
Modifications to the OpenCL C Specification
Add a new Section 6.3.1.X - The bfloat16 Format
The bfloat16 format is a floating-point format occupying 16 bits.
It is a truncated version of the 32-bit IEEE 754 single-precision floating-point format.
The bfloat16 format includes one sign bit, eight exponent bits (same as the 32-bit single-precision floating-point format), and seven mantissa bits (fewer than the 16-bit IEEE 754-2008 half-precision floating-point format).
This means that a bfloat16 number may represent numeric values with a dynamic range similar to that of a 32-bit float number, albeit with lower precision than a 16-bit half number.
The cl_intel_bfloat16_conversions extension does not add bfloat16 as a supported data type for OpenCL kernels; however, the built-in functions added by the extension are able to use and return bfloat16 data.
For these built-in functions, the bfloat16 data is passed to the function or returned from the function by encoding it into a ushort 16-bit unsigned integer data type.
If a future extension adds bfloat16 as a supported data type for OpenCL kernels, the bfloat16 data may be reinterpreted and passed to the built-in functions added by cl_intel_bfloat16_conversions using the as_type() operator.
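The bit layout described above can be illustrated with a small host-side sketch in C. It is not part of the extension: the type bf16_fields and the helpers bf16_unpack and bf16_pack are hypothetical names chosen for this example, and the code only shows how a 16-bit pattern, as it would be carried in a ushort, splits into the sign, exponent, and mantissa fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type and functions for illustration only;
   none of these names are defined by the extension. */
typedef struct {
    unsigned sign;     /* 1 bit */
    unsigned exponent; /* 8 bits, biased by 127 as in the 32-bit float format */
    unsigned mantissa; /* 7 explicit fraction bits */
} bf16_fields;

/* Split a 16-bit pattern into its three fields. */
static bf16_fields bf16_unpack(uint16_t bits) {
    bf16_fields f;
    f.sign     = (bits >> 15) & 0x1u;
    f.exponent = (bits >> 7) & 0xFFu;
    f.mantissa = bits & 0x7Fu;
    return f;
}

/* Reassemble the fields into a 16-bit pattern. */
static uint16_t bf16_pack(bf16_fields f) {
    return (uint16_t)(((f.sign & 0x1u) << 15) |
                      ((f.exponent & 0xFFu) << 7) |
                      (f.mantissa & 0x7Fu));
}

/* 0xBFC0 is the upper half of 0xBFC00000, the float encoding of -1.5. */
static void bf16_fields_example(void) {
    bf16_fields f = bf16_unpack(0xBFC0u);
    assert(f.sign == 1u);
    assert(f.exponent == 127u);  /* unbiased exponent 0 */
    assert(f.mantissa == 0x40u); /* fraction 0.5, so the value is -1.5 */
    assert(bf16_pack(f) == 0xBFC0u);
}
```

Because the sign and exponent fields sit in the same positions as the upper 16 bits of a float, the pattern 0xBFC0 here is simply the top half of the float encoding of -1.5.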
Add a new Section 6.4.X - bfloat16 Conversions
The bfloat16 format can be used in explicit conversions using the following suite of functions:
// conversions to bfloat16:
destType intel_convert_bfloat16_as_destType(sourceType)
destTypen intel_convert_bfloat16n_as_destTypen(sourceTypen)
// conversions from bfloat16:
destType intel_convert_as_bfloat16_destType(sourceType)
destTypen intel_convert_as_bfloat16n_destTypen(sourceTypen)
The number of elements in the source and destination vectors must match.
The only supported rounding mode is implicitly round-to-nearest-even. No explicit rounding modes are supported.
Supported scalar and vector data types:
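destType | sourceType |
---|---|
ushort, ushort2, ushort3, ushort4, ushort8, ushort16 | float, float2, float3, float4, float8, float16 |
float, float2, float3, float4, float8, float16 | ushort, ushort2, ushort3, ushort4, ushort8, ushort16 |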
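As a rough illustration of the round-to-nearest-even behavior stated above, the following host-side C sketch emulates the scalar conversions on raw bit patterns. It is an assumption-laden reference model, not the extension's implementation: the names ref_float_to_bf16, ref_bf16_to_float, and ref_bf16_example are hypothetical, and the NaN handling in particular is a choice made for this sketch, since the text above does not specify NaN payloads.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side emulation; these names are not defined by the extension. */
static uint16_t ref_float_to_bf16(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    if ((u & 0x7FFFFFFFu) > 0x7F800000u) {
        /* NaN input: force a mantissa bit so the result stays a NaN.
           The resulting payload is an assumption of this sketch. */
        return (uint16_t)((u >> 16) | 0x0040u);
    }
    /* Round to nearest, ties to even: add 0x7FFF plus the lowest kept bit,
       then drop the low 16 bits. */
    uint32_t lsb = (u >> 16) & 1u;
    u += 0x7FFFu + lsb;
    return (uint16_t)(u >> 16);
}

/* Widening places the 16 bits in the upper half of a float; no rounding occurs. */
static float ref_bf16_to_float(uint16_t b) {
    uint32_t u = (uint32_t)b << 16;
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}

static void ref_bf16_example(void) {
    assert(ref_float_to_bf16(1.0f) == 0x3F80u);
    /* 1 + 2^-8 lies exactly halfway between 0x3F80 and 0x3F81: tie goes to even. */
    assert(ref_float_to_bf16(1.00390625f) == 0x3F80u);
    /* 1 + 2^-7 + 2^-8 lies halfway between 0x3F81 and 0x3F82: tie goes to even. */
    assert(ref_float_to_bf16(1.01171875f) == 0x3F82u);
    /* The largest finite float rounds up past the largest finite bfloat16 to +infinity. */
    assert(ref_float_to_bf16(FLT_MAX) == 0x7F80u);
    assert(ref_bf16_to_float(0xC000u) == -2.0f);
}
```

In this model a vector conversion would apply the same scalar routine to each component, which matches the requirement that source and destination vectors have the same number of elements.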
Modifications to the OpenCL SPIR-V Environment Specification
Add a new section 5.2.X - cl_intel_bfloat16_conversions
If the OpenCL environment supports the extension cl_intel_bfloat16_conversions, then the environment must accept modules that declare use of the extension SPV_INTEL_bfloat16_conversion and that declare the SPIR-V capability Bfloat16ConversionINTEL.
For the instructions OpConvertFToBF16INTEL and OpConvertBF16ToFINTEL added by the extension:
- Valid types for Result Type, Float Value, and Bfloat16 Value are Scalars and OpTypeVectors with 2, 3, 4, 8, or 16 Component Count components.
Issues
- Should these functions have a special prefix (such as __) or suffix (such as _as_ushort) since they do not truly operate on a bfloat16 type?
RESOLVED: Yes, we will use the _as_ushort nomenclature.
The function name to convert to a ushort representing a bfloat16 value is intel_convert_bfloat16_as_ushort.
The function name to convert from a ushort representing a bfloat16 value is intel_convert_as_bfloat16_float.
- Should we define a type alias for our bfloat16 type or use ushort (or short) directly?
RESOLVED: No, we will not define a type alias.
- Should the integer bfloat16 representation be signed or unsigned?
RESOLVED: We will use an unsigned type.
- Should we support vector conversion built-in functions?
RESOLVED: Yes, we will support the vector conversion built-in functions for consistency.
- Should we support built-in functions with explicit rounding modes?
RESOLVED: No, we will not support the built-in functions with explicit rounding modes for the initial version of this extension.
The only supported rounding mode for the conversion from float to bfloat16 will be the implicit round-to-nearest-even rounding mode.
The conversions from bfloat16 to float are lossless.
- Do we need to support packed conversions?
RESOLVED: No, we will not support packed conversions for the initial version of this extension. If we decide to add packed conversions, we will also need to add them to the SPIR-V extension.
- Do we need to say anything about out-of-range conversions?
RESOLVED: No, out-of-range behavior is covered by existing rounding rules.
- How should we name the vector conversion functions?
RESOLVED: The name of the vector conversion functions will be intel_convert_bfloat16n_as_ushortn and intel_convert_as_bfloat16n_floatn. This is consistent with the naming of the existing conversion functions.
Because bfloat16 ends with a number, this does lead to awkward function names like intel_convert_bfloat1616_as_ushort16, but the awkwardness is preferable to the ambiguity without the vector size suffix.
If we decide to add a true bfloat16 type, we should consider other names that do not end in a number (bfloat16_t?).
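The statement in the rounding-mode issue above that conversions from bfloat16 to float are lossless can be checked with a short host-side C sketch. The helper names widen_bf16_bits, upper_half_bits, and count_roundtrip_mismatches are hypothetical and exist only for this example. The sketch assumes that widening places the 16-bit pattern in the upper half of the float encoding, and it skips NaN patterns because NaN payload behavior is not described in this document.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not defined by the extension. */

/* Widen a bfloat16 bit pattern to a float by filling the low 16 bits with zeros. */
static float widen_bf16_bits(uint16_t b) {
    uint32_t u = (uint32_t)b << 16;
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Return the upper 16 bits of a float's encoding. */
static uint16_t upper_half_bits(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return (uint16_t)(u >> 16);
}

/* Count the non-NaN bfloat16 patterns that do not survive widening to float.
   A lossless widening yields zero. */
static unsigned count_roundtrip_mismatches(void) {
    unsigned mismatches = 0;
    for (uint32_t p = 0; p <= 0xFFFFu; ++p) {
        /* Skip NaN patterns: exponent all ones with a nonzero mantissa. */
        if ((p & 0x7F80u) == 0x7F80u && (p & 0x007Fu) != 0u) {
            continue;
        }
        float f = widen_bf16_bits((uint16_t)p);
        if (upper_half_bits(f) != (uint16_t)p) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Every finite and infinite bfloat16 pattern has an exact float counterpart, so no rounding mode is needed in this direction.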