Name

    ARB_gpu_shader_fp64

Name Strings

    GL_ARB_gpu_shader_fp64

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Contributors

    Barthold Lichtenbelt, NVIDIA
    Bill Licea-Kane, AMD
    Bruce Merry, ARM
    Chris Dodd, NVIDIA
    Eric Werness, NVIDIA
    Graham Sellers, AMD
    Greg Roth, NVIDIA
    Jeff Bolz, NVIDIA
    Nick Haemel, AMD
    Pierre Boudier, AMD
    Piers Daniell, NVIDIA

Notice

    Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to issues and bugs prioritized by the Khronos OpenGL Working Group. For extensions which have been promoted to a core Specification, fixes will first appear in the latest version of that core Specification, and will eventually be backported to the extension document. This policy is described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete. Approved by the ARB at the 2010/01/22 F2F meeting.
    Approved by the Khronos Board of Promoters on March 10, 2010.

Version

    Last Modified Date:         August 27, 2012
    NVIDIA Revision:            11

Number

    ARB Extension #89

Dependencies

    This extension is written against the OpenGL 3.2 (Compatibility Profile) Specification.

    This extension is written against version 1.50 (revision 09) of the OpenGL Shading Language Specification.

    OpenGL 3.2 and GLSL 1.50 are required.

    This extension interacts with EXT_direct_state_access.

    This extension interacts with NV_shader_buffer_load.

Overview

    This extension allows GLSL shaders to use double-precision floating-point data types, including vectors and matrices of doubles. Doubles may be used as inputs, outputs, and uniforms.

    The shading language supports various arithmetic and comparison operators on double-precision scalar, vector, and matrix types, and provides a set of built-in functions including:

      * square roots and inverse square roots;
      * fused floating-point multiply-add operations;
      * splitting a floating-point number into a significand and exponent (frexp), or building a floating-point number from a significand and exponent (ldexp);
      * absolute value, sign tests, various functions to round to an integer value, modulus, minimum, maximum, clamping, blending two values, step functions, and testing for infinity and NaN values;
      * packing and unpacking doubles into a pair of 32-bit unsigned integers;
      * matrix component-wise multiplication, and computation of outer products, transposes, determinants, and inverses; and
      * vector relational functions.

    Double-precision versions of angle, trigonometry, and exponential functions are not supported.

    Implicit conversions are supported from integer and single-precision floating-point values to doubles, and this extension uses the relaxed function overloading rules specified by the ARB_gpu_shader5 extension to resolve ambiguities.

    This extension provides API functions for specifying double-precision uniforms in the default uniform block, including functions similar to the uniform functions added by EXT_direct_state_access (if supported).

    This extension provides an "LF" suffix for specifying double-precision constants. Floating-point constants without a suffix in GLSL are treated as single-precision values for backward compatibility with versions not supporting doubles; similar constants are treated as double-precision values in the "C" programming language.

    This extension does not support interpolation of double-precision values; doubles used as fragment shader inputs must be qualified as "flat". Additionally, this extension does not allow vertex attributes with 64-bit components. That support is added separately by EXT_vertex_attrib_64bit.
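
    The packing built-ins listed in the overview can be mirrored on the host, for example to prepare or inspect the two 32-bit words a shader would see. The sketch below is illustrative only and is not part of this extension: the helper names pack_double_2x32 and unpack_double_2x32 are hypothetical, and the code assumes the host uses 64-bit IEEE doubles. It follows the word order this extension specifies for packDouble2x32() and unpackDouble2x32(): the first component holds the 32 least significant bits, the second the 32 most significant bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side mirror of packDouble2x32(): v[0] supplies the
 * 32 least significant bits, v[1] the 32 most significant bits. */
static double pack_double_2x32(const uint32_t v[2])
{
    uint64_t bits;
    double d;

    /* Assumption: the host "double" is a 64-bit IEEE value. */
    assert(sizeof(double) == sizeof(uint64_t));

    bits = ((uint64_t)v[1] << 32) | (uint64_t)v[0];
    memcpy(&d, &bits, sizeof d);   /* reinterpret the bits, no conversion */
    return d;
}

/* Hypothetical host-side mirror of unpackDouble2x32(). */
static void unpack_double_2x32(double d, uint32_t v[2])
{
    uint64_t bits;

    assert(sizeof(double) == sizeof(uint64_t));

    memcpy(&bits, &d, sizeof bits);
    v[0] = (uint32_t)(bits & 0xFFFFFFFFu);  /* least significant half */
    v[1] = (uint32_t)(bits >> 32);          /* most significant half  */
}
```

    With these helpers, unpacking 1.0 yields the words 0x00000000 and 0x3FF00000, and packing those words returns 1.0 again. memcpy is used instead of a pointer cast to avoid aliasing problems in C.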

IP Status

    No known IP claims.

New Procedures and Functions

    void Uniform1d(int location, double x);
    void Uniform2d(int location, double x, double y);
    void Uniform3d(int location, double x, double y, double z);
    void Uniform4d(int location, double x, double y, double z, double w);
    void Uniform1dv(int location, sizei count, const double *value);
    void Uniform2dv(int location, sizei count, const double *value);
    void Uniform3dv(int location, sizei count, const double *value);
    void Uniform4dv(int location, sizei count, const double *value);

    void UniformMatrix2dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix3dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix4dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix2x3dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix2x4dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix3x2dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix3x4dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix4x2dv(int location, sizei count, boolean transpose, const double *value);
    void UniformMatrix4x3dv(int location, sizei count, boolean transpose, const double *value);

    void GetUniformdv(uint program, int location, double *params);

    (All of the following ProgramUniform* functions are supported if and only if EXT_direct_state_access is supported.)

    void ProgramUniform1dEXT(uint program, int location, double x);
    void ProgramUniform2dEXT(uint program, int location, double x, double y);
    void ProgramUniform3dEXT(uint program, int location, double x, double y, double z);
    void ProgramUniform4dEXT(uint program, int location, double x, double y, double z, double w);
    void ProgramUniform1dvEXT(uint program, int location, sizei count, const double *value);
    void ProgramUniform2dvEXT(uint program, int location, sizei count, const double *value);
    void ProgramUniform3dvEXT(uint program, int location, sizei count, const double *value);
    void ProgramUniform4dvEXT(uint program, int location, sizei count, const double *value);

    void ProgramUniformMatrix2dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix3dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix4dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix2x3dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix2x4dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix3x2dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix3x4dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix4x2dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
    void ProgramUniformMatrix4x3dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);

New Tokens

    Returned in the <type> parameter of GetActiveUniform and GetTransformFeedbackVarying:

        DOUBLE
        DOUBLE_VEC2         0x8FFC
        DOUBLE_VEC3         0x8FFD
        DOUBLE_VEC4         0x8FFE
        DOUBLE_MAT2         0x8F46
        DOUBLE_MAT3         0x8F47
        DOUBLE_MAT4         0x8F48
        DOUBLE_MAT2x3       0x8F49
        DOUBLE_MAT2x4       0x8F4A
        DOUBLE_MAT3x2       0x8F4B
        DOUBLE_MAT3x4
                            0x8F4C
        DOUBLE_MAT4x2       0x8F4D
        DOUBLE_MAT4x3       0x8F4E

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification (OpenGL Operation)

    Modify Section 2.14.4, Uniform Variables, p. 89

    (modify third paragraph, p. 90) ... uniform variable storage for a vertex shader. A uniform matrix with single- or double-precision components will consume no more than 4 * min(r,c) or 8 * min(r,c) uniform components, respectively. A scalar or vector uniform with double-precision components will consume no more than 2 * <n> components, where <n> is 1 for scalars, and the component count for vectors. A link error is generated ...

    (add to Table 2.13, p. 96)

      Type Name Token       Keyword
      --------------------  ----------------
      DOUBLE                double
      DOUBLE_VEC2           dvec2
      DOUBLE_VEC3           dvec3
      DOUBLE_VEC4           dvec4
      DOUBLE_MAT2           dmat2
      DOUBLE_MAT3           dmat3
      DOUBLE_MAT4           dmat4
      DOUBLE_MAT2x3         dmat2x3
      DOUBLE_MAT2x4         dmat2x4
      DOUBLE_MAT3x2         dmat3x2
      DOUBLE_MAT3x4         dmat3x4
      DOUBLE_MAT4x2         dmat4x2
      DOUBLE_MAT4x3         dmat4x3

    (modify list of commands at the bottom of p. 99)

      void Uniform{1,2,3,4}d(int location, T value);
      void Uniform{1,2,3,4}dv(int location, sizei count, const T *value);
      void UniformMatrix{2,3,4}dv(int location, sizei count, boolean transpose, const double *value);
      void UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv(int location, sizei count, boolean transpose, const double *value);

    (insert after fourth paragraph, p. 100) The Uniform*d{v} commands will load sets of one to four double-precision floating-point values into a uniform location defined as a double, a double vector, or an array of double scalars or vectors.

    (modify fifth paragraph, p. 100) The UniformMatrix{2,3,4}fv and UniformMatrix{2,3,4}dv commands will load 2x2, 3x3, or 4x4 matrices (corresponding to 2, 3, or 4 in the command name) of single- or double-precision floating-point values, respectively, into ...

    (replace second bullet on the middle of p.
    101, regarding INVALID_OPERATION errors in Uniform* commands)

      * if the type of the uniform declared in the shader does not match the component type and count indicated in the Uniform* command name (where a boolean uniform component type is considered to match any of the Uniform*i{v}, Uniform*ui{v}, or Uniform*f{v} commands),

    (modify sixth paragraph, p. 100) The UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}fv and UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv commands will load 2x3, 3x2, 2x4, 4x2, 3x4, or 4x3 matrices (corresponding to the numbers in the command name) of single- or double-precision floating-point values, respectively, into ...

    (modify "Uniform Buffer Object Storage", p. 102, adding a bullet after the last "Members of type", and modifying the subsequent bullet)

      * Members of type double are extracted from a buffer object by reading a single double-typed value at the specified offset.

      * Vectors with N elements with basic data types of bool, int, uint, float, or double are extracted as N values in consecutive memory locations beginning at the specified offset, with components stored in order with the first (X) component at the lowest offset. The GL data type used for component extraction is derived according to the rules for scalar members above.

    Modify Section 2.14.6, Varying Variables, p. 106

    (modify third paragraph, p. 107) ... For the purposes of counting input and output components consumed by a shader, variables declared as vectors, matrices, and arrays will all consume multiple components. Each component of variables declared as double-precision floating-point scalars, vectors, or matrices may be counted as consuming two components.

    (add after the bulleted list, p. 108) For the purposes of counting the total number of components to capture, each component of outputs declared as double-precision floating-point scalars, vectors, or matrices may be counted as consuming two components.

    Modify Section 2.19, Transform Feedback, p.
    130

    (add to end of first paragraph, p. 132) ... The results of appending a varying variable to a transform feedback buffer are undefined if any component of that variable would be written at an offset not aligned to the size of the component.

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification (Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification (Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification (Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification (State and State Requests)

    Modify Section 6.1.15, Shader and Program Queries, p. 332

    (add to the first list of commands, p. 337)

      void GetUniformdv(uint program, int location, double *params);

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile) Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Modifications to The OpenGL Shading Language Specification, Version 1.50 (Revision 09)

    Including the following line in a shader can be used to control the language features described in this extension:

      #extension GL_ARB_gpu_shader_fp64 : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_ARB_gpu_shader_fp64 1

    Modify Section 3.6, Keywords, p. 14

    (add the following to the list of keywords, p. 14)

      double dvec2 dvec3 dvec4
      dmat2 dmat3 dmat4
      dmat2x2 dmat2x3 dmat2x4
      dmat3x2 dmat3x3 dmat3x4
      dmat4x2 dmat4x3 dmat4x4

    (remove "double", "dvec2", "dvec3", and "dvec4" from the list of keywords reserved for future use, p. 15)

    Modify Section 4.1, Basic Types, p. 17

    (add to the basic "Transparent Types" table, pp.
    17-18)

      Types     Meaning
      --------  ----------------------------------------------------------
      double    a single double-precision floating-point scalar
      dvec2     a two-component double-precision floating-point vector
      dvec3     a three-component double-precision floating-point vector
      dvec4     a four-component double-precision floating-point vector
      dmat2     a 2x2 double-precision floating-point matrix
      dmat3     a 3x3 double-precision floating-point matrix
      dmat4     a 4x4 double-precision floating-point matrix
      dmat2x2   same as dmat2
      dmat2x3   a double-precision matrix with 2 columns and 3 rows
      dmat2x4   a double-precision matrix with 2 columns and 4 rows
      dmat3x2   a double-precision matrix with 3 columns and 2 rows
      dmat3x3   same as dmat3
      dmat3x4   a double-precision matrix with 3 columns and 4 rows
      dmat4x2   a double-precision matrix with 4 columns and 2 rows
      dmat4x3   a double-precision matrix with 4 columns and 3 rows
      dmat4x4   same as dmat4

    Modify Section 4.1.4, Floats, p. 22

    (modify two paragraphs of the section, adding support for doubles) Single- and double-precision floating-point values are available for use in a variety of scalar calculations. Floating-point variables are defined as in the following example:

      float a, b = 1.5;
      double c, d = 2.0LF;

    As an input value to one of the processing units, a single- or double-precision floating-point variable is expected to match the IEEE floating-point definition for precision and dynamic range of the corresponding type. It is not required that the precision of internal processing for operands of type "float" match the IEEE floating-point specification for floating-point operations, but the minimum guidelines for precision established by the OpenGL specification must be met. Treatment of conditions such as divide by 0 may lead to an unspecified result, but in no case should such a condition lead to the interruption or termination of processing.

    (modify the grammar, p. 22, adding the "lf" and "LF" suffixes)

      floating-suffix: one of
        f F lf LF

    (modify last paragraph, p. 22) ...
    including before a suffix. When the suffix "lf" or "LF" is present, the literal has type <double>. Otherwise, the literal has type <float>. A leading unary ...

    Modify Section 4.1.6, Matrices, p. 23

    (modify the first paragraph of the section) The OpenGL Shading Language has built-in types for 2×2, 2×3, 2×4, 3×2, 3×3, 3×4, 4×2, 4×3, and 4×4 matrices of single- and double-precision floating-point numbers. Matrix types beginning with "mat" have single-precision components; matrix types beginning with "dmat" have double-precision components. The first number in the type is the number of columns, the second is the number of rows. Example matrix declarations:

      mat2 mat2D;
      mat3 optMatrix;
      mat4 view, projection;
      mat4x4 view;   // an alternate way of declaring a mat4
      mat3x2 m;      // a matrix with 3 columns and 2 rows
      dmat4 highPrecisionMVP;
      dmat2x4 skinnyAndTallWithBigComponents;
      ...

    Modify Section 4.1.10, Implicit Conversions, p. 27

    (modify table of implicit conversions)

                                Can be implicitly
      Type of expression        converted to
      ---------------------     -------------------
      int                       uint(*), float, double
      ivec2                     uvec2(*), vec2, dvec2
      ivec3                     uvec3(*), vec3, dvec3
      ivec4                     uvec4(*), vec4, dvec4

      uint                      float, double
      uvec2                     vec2, dvec2
      uvec3                     vec3, dvec3
      uvec4                     vec4, dvec4

      float                     double
      vec2                      dvec2
      vec3                      dvec3
      vec4                      dvec4

      mat2                      dmat2
      mat3                      dmat3
      mat4                      dmat4
      mat2x3                    dmat2x3
      mat2x4                    dmat2x4
      mat3x2                    dmat3x2
      mat3x4                    dmat3x4
      mat4x2                    dmat4x2
      mat4x3                    dmat4x3

      (*) if ARB_gpu_shader5 or NV_gpu_shader5 is supported

    (modify second paragraph of the section) No implicit conversions are provided to convert from unsigned to signed integer types, from floating-point to integer types, or from higher-precision to lower-precision types. There are no implicit array or structure conversions.

    (add before the final paragraph of the section, p. 27) When performing implicit conversion for binary operators, there may be multiple data types to which the two operands can be converted.
    For example, when adding an int value to a uint value, both values can be implicitly converted to uint, float, and double. In such cases, a floating-point type is chosen if either operand has a floating-point type. Otherwise, an unsigned integer type is chosen if either operand has an unsigned integer type. Otherwise, a signed integer type is chosen. If operands can be implicitly converted to multiple data types deriving from the same base data type, the type with the smallest component size is used.

    Modify Section 4.3.4, Inputs, p. 31

    (modify third paragraph of the section, p. 31) ... Vertex shader inputs can only be single-precision floating-point scalars, vectors, or matrices, or signed and unsigned integers and integer vectors. Vertex shader inputs can also form arrays of these types, but not structures.

    (modify third paragraph, p. 32, allowing doubles as inputs and disallowing as non-flat fragment inputs) ... Fragment inputs can only be signed and unsigned integers and integer vectors, float, floating-point vectors, double, double-precision vectors, single- or double-precision matrices, or arrays or structures of these. Fragment shader inputs that are signed or unsigned integers, integer vectors, doubles, double-precision vectors, or double-precision matrices must be qualified with the interpolation qualifier flat.

    Modify Section 4.3.6, Outputs, p. 33

    (modify third paragraph of the section, p. 33) They can only be float, double, single- or double-precision floating-point vectors or matrices, signed or unsigned integers or integer vectors, or arrays or structures of any of these.

    (modify last paragraph, p. 33) ... Fragment outputs can only be float, single-precision floating-point vectors, signed or unsigned integers or integer vectors, or arrays of these. ...

    Modify Section 5.4.1, Conversion and Scalar Constructors, p.
    49

    (add double to the first list of constructor examples) Converting between scalar types is done as the following prototypes indicate:

      int(uint)      // converts an unsigned integer value to a signed integer
      int(float)     // converts a float value to a signed integer
      int(double)    // converts a double value to a signed integer
      int(bool)      // converts a Boolean value to a signed integer
      uint(int)      // converts a signed integer value to an unsigned integer
      uint(float)    // converts a float value to an unsigned integer
      uint(double)   // converts a double value to an unsigned integer
      uint(bool)     // converts a Boolean value to an unsigned integer
      float(int)     // converts a signed integer value to a float
      float(uint)    // converts an unsigned integer value to a float
      float(double)  // converts a double value to a float
      float(bool)    // converts a Boolean value to a float
      double(int)    // converts a signed integer value to a double
      double(uint)   // converts an unsigned integer value to a double
      double(float)  // converts a float value to a double
      double(bool)   // converts a Boolean value to a double
      bool(int)      // converts a signed integer value to a Boolean
      bool(uint)     // converts an unsigned integer value to a Boolean
      bool(float)    // converts a float value to a Boolean
      bool(double)   // converts a double value to a Boolean

    (modify second paragraph of the section, p. 49) When constructors are used to convert any floating-point type to an integer, the fractional part of the floating-point value is dropped. ...

    (modify third paragraph of the section, p. 49) When a constructor is used to convert any integer or floating-point type to bool, 0 and 0.0 are converted to false, and non-zero values are converted to true. When a constructor is used to convert a bool to any integer or floating-point type, false is converted to 0 or 0.0, and true is converted to 1 or 1.0.

    Modify Section 5.4.2, Vector and Matrix Constructors, p. 50

    (modify the last paragraph, p.
    50) If the basic type (bool, int, uint, float, or double) of a parameter to a constructor does not match the basic type of the object being constructed, the scalar construction rules (above) are used to convert the parameters.

    (add to the first group of examples, p. 52)

      dmat2(dvec2, dvec2)
      dmat3(dvec3, dvec3, dvec3)
      dmat4(dvec4, dvec4, dvec4, dvec4)
      dmat2x4(dvec3, double,   // first column
              double, dvec3)   // second column

    Modify Section 5.9, Expressions, p. 57

    (modify bulleted list as follows, adding support for double-precision floating-point types)

    Expressions in the shading language are built from the following:

      * Constants of type bool, int, uint, float, double, all vector types and all matrix types.

      ...

      * The arithmetic binary operators add (+), subtract (-), multiply (*), and divide (/) operate on integer, single-precision floating-point, and double-precision floating-point scalars, vectors, and matrices. If the fundamental type (integer, single-precision floating-point, double-precision floating-point) of the operands does not match, the conversions from Section 4.1.10 "Implicit Conversions" are applied to produce matching types. ...

      * The arithmetic unary operators negate (-), post- and pre-increment and decrement (-- and ++) operate on integer, single-precision floating-point, or double-precision floating-point values (including vectors and matrices). ...

      * The relational operators greater than (>), less than (<), greater than or equal (>=), and less than or equal (<=) operate only on scalar integer, single-precision floating-point, or double-precision floating-point expressions. The result is scalar Boolean. The fundamental type of the two operands must match, either as specified, or after one of the implicit type conversions specified in Section 4.1.10. ...

      ...

    Modify Chapter 8, Built-in Functions, p. 81

    (add to description of generic types, last paragraph of p. 81) ... Where the input arguments (and corresponding output) can be double, dvec2, dvec3, or dvec4, <genDType> is used as the argument.
    ... Similarly, <mat> is used for any matrix basic type with single-precision components and <dmat> is used for any matrix basic type with double-precision components.

    Modify Section 8.2, Exponential Functions, p. 83

    (add overloads for double-precision square roots)

      genDType sqrt(genDType x);
      genDType inversesqrt(genDType x);

    Modify Section 8.3, Common Functions, p. 84

    (add support for double-precision floating-point multiply-add)

    Syntax:

      genDType fma(genDType a, genDType b, genDType c);

    The function fma() performs a fused double-precision floating-point multiply-add to compute the value a*b+c. The results of fma() may not be identical to evaluating the expression (a*b)+c, because the computation may be performed in a single operation with intermediate precision different from that used to compute a non-fma() expression.

    The results of fma() are guaranteed to be invariant given fixed inputs <a>, <b>, and <c>, as though the result were taken from a variable declared as "precise".

    (add support for double-precision frexp and ldexp functions)

    Syntax:

      genDType frexp(genDType x, out genIType exp);
      genDType ldexp(genDType x, in genIType exp);

    The function frexp() splits each double-precision floating-point number in <x> into its binary significand, a floating-point number in the range [0.5, 1.0), and an integral exponent of two, such that:

      x = significand * 2 ^ exponent

    The significand is returned by the function; the exponent is returned in the parameter <exp>. For a floating-point value of zero, the significand and exponent are both zero. For a floating-point value that is an infinity or is not a number, the results of frexp() are undefined. If the input <x> is a vector, this operation is performed in a component-wise manner; the value returned by the function and the value written to <exp> are vectors with the same number of components as <x>.
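
    The scalar behavior described for frexp() matches the convention of the C library function of the same name, so the identity x = significand * 2 ^ exponent can be tried out on the host. The sketch below is illustrative only: split_double and roundtrips are hypothetical helper names, and the C functions frexp() and ldexp() stand in for the GLSL built-ins. In C the sign of <x> is carried by the significand, so the [0.5, 1.0) range applies to its magnitude.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: returns the significand of x and stores the
 * power-of-two exponent in *exponent, using C's frexp(). For x == 0
 * both the significand and the exponent are zero. */
static double split_double(double x, int *exponent)
{
    assert(exponent != NULL);
    return frexp(x, exponent);
}

/* Hypothetical helper: returns 1 if rebuilding x from its significand
 * and exponent with ldexp() reproduces x exactly, 0 otherwise. */
static int roundtrips(double x)
{
    int e;
    double s = split_double(x, &e);
    return ldexp(s, e) == x;
}
```

    For example, splitting 8.0 gives a significand of 0.5 and an exponent of 4, since 0.5 * 2 ^ 4 = 8. Infinities and NaNs are deliberately not exercised here, because this extension leaves the results of frexp() undefined for them.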

    The function ldexp() builds a double-precision floating-point number from each significand component in <x> and the corresponding integral exponent of two in <exp>, returning:

      significand * 2 ^ exponent

    If this product is too large to be represented as a double-precision floating-point value, the result is considered undefined. If the input <x> is a vector, this operation is performed in a component-wise manner; the value passed in <exp> and returned by the function are vectors with the same number of components as <x>.

    (add overloads for double-precision functions)

      genDType abs(genDType x);
      genDType sign(genDType x);
      genDType floor(genDType x);
      genDType trunc(genDType x);
      genDType round(genDType x);
      genDType roundEven(genDType x);
      genDType ceil(genDType x);
      genDType fract(genDType x);
      genDType mod(genDType x, double y);
      genDType mod(genDType x, genDType y);
      genDType modf(genDType x, out genDType i);
      genDType min(genDType x, genDType y);
      genDType min(genDType x, double y);
      genDType max(genDType x, genDType y);
      genDType max(genDType x, double y);
      genDType clamp(genDType x, genDType minVal, genDType maxVal);
      genDType clamp(genDType x, double minVal, double maxVal);
      genDType mix(genDType x, genDType y, genDType a);
      genDType mix(genDType x, genDType y, double a);
      genDType mix(genDType x, genDType y, genBType a);
      genDType step(genDType edge, genDType x);
      genDType step(double edge, genDType x);
      genDType smoothstep(genDType edge0, genDType edge1, genDType x);
      genDType smoothstep(double edge0, double edge1, genDType x);
      genBType isnan(genDType x);
      genBType isinf(genDType x);

    (add support for 64-bit floating-point packing and unpacking functions)

    Syntax:

      double packDouble2x32(uvec2 v);
      uvec2 unpackDouble2x32(double v);

    The function packDouble2x32() returns a double obtained by packing the components of a two-component unsigned integer vector into a 64-bit value and interpreting its bits according to the IEEE double-precision floating-point representation.
    The first vector component specifies the 32 least significant bits; the second component specifies the 32 most significant bits.

    The function unpackDouble2x32() returns a two-component unsigned integer vector obtained by interpreting a double using the 64-bit IEEE double-precision floating-point representation and unpacking into two 32-bit halves. The first component of the vector contains the 32 least significant bits of the double; the second component contains the 32 most significant bits.

    Modify Section 8.4, Geometric Functions, p. 87

    (add double-precision equivalents for existing geometric functions)

      double length(genDType x);
      double distance(genDType p0, genDType p1);
      double dot(genDType x, genDType y);
      dvec3 cross(dvec3 x, dvec3 y);
      genDType normalize(genDType x);
      genDType faceforward(genDType N, genDType I, genDType Nref);
      genDType reflect(genDType I, genDType N);
      genDType refract(genDType I, genDType N, double eta);

    Modify Section 8.5, Matrix Functions, p. 89

    (add double-precision equivalents for existing matrix functions)

      dmat matrixCompMult(dmat x, dmat y);
      dmat2 outerProduct(dvec2 c, dvec2 r);
      dmat3 outerProduct(dvec3 c, dvec3 r);
      dmat4 outerProduct(dvec4 c, dvec4 r);
      dmat2x3 outerProduct(dvec3 c, dvec2 r);
      dmat3x2 outerProduct(dvec2 c, dvec3 r);
      dmat2x4 outerProduct(dvec4 c, dvec2 r);
      dmat4x2 outerProduct(dvec2 c, dvec4 r);
      dmat3x4 outerProduct(dvec4 c, dvec3 r);
      dmat4x3 outerProduct(dvec3 c, dvec4 r);
      dmat2 transpose(dmat2 m);
      dmat3 transpose(dmat3 m);
      dmat4 transpose(dmat4 m);
      dmat2x3 transpose(dmat3x2 m);
      dmat3x2 transpose(dmat2x3 m);
      dmat2x4 transpose(dmat4x2 m);
      dmat4x2 transpose(dmat2x4 m);
      dmat3x4 transpose(dmat4x3 m);
      dmat4x3 transpose(dmat3x4 m);
      double determinant(dmat2 m);
      double determinant(dmat3 m);
      double determinant(dmat4 m);
      dmat2 inverse(dmat2 m);
      dmat3 inverse(dmat3 m);
      dmat4 inverse(dmat4 m);

    Modify Section 8.6, Vector Relational Functions, p. 90

    (modify the first paragraph, p.
    90, adding support for relational functions operating on double-precision types)

    Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or reserved) to operate on scalars and produce scalar Boolean results. For vector results, use the following built-in functions. In the definitions below, the following terms are used as placeholders for all vector types for a given fundamental data type. In all cases, the sizes of the input and return vectors for any particular call must match.

      placeholder   fundamental types
      -----------   ------------------------------------------------
      bvec          bvec2, bvec3, bvec4
      ivec          ivec2, ivec3, ivec4
      uvec          uvec2, uvec3, uvec4
      vec           vec2, vec3, vec4, dvec2, dvec3, dvec4

    Modify Section 9, Shading Language Grammar, p. 92

    !!! TBD !!!

GLX Protocol

    !!! TBD

Dependencies on ARB_gpu_shader5

    If ARB_gpu_shader5 is not supported, the changes to the function overloading rules in the OpenGL Shading Language Specification provided there should be included in this extension.

Dependencies on NV_gpu_shader5

    This extension and NV_gpu_shader5 both provide support for shading language variables with 64-bit components. If both extensions are supported, the various edits describing this new support should be combined.

Dependencies on EXT_direct_state_access

    If EXT_direct_state_access is not supported, references to the ProgramUniform*d*EXT functions should be removed.

    If EXT_direct_state_access is supported, that specification should be edited as follows:

    (modify the ProgramUniform* language) The following commands:

      ....
      void ProgramUniform{1,2,3,4}dEXT(uint program, int location, T value);
      void ProgramUniform{1,2,3,4}dvEXT(uint program, int location, sizei count, const T *value);
      void ProgramUniformMatrix{2,3,4}dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);
      void ProgramUniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dvEXT(uint program, int location, sizei count, boolean transpose, const double *value);

    operate identically to the corresponding command where "Program" is deleted from the name (and extension suffixes are dropped or updated appropriately) except, rather than updating the currently active program object, these "Program" commands update the program object named by the <program> parameter. ...

Dependencies on NV_shader_buffer_load

    If NV_shader_buffer_load is supported, that specification should be edited as follows:

    Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load.

    (add rules for loads of variables having the new data types from this extension to the list of bullets following "When a shader dereferences a pointer variable")

      - Data of type "double" are read from or written to memory as one double-typed value at the specified GPU address.

Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) How do double-precision types interact with the rules for storing uniforms in a buffer object?

      RESOLVED: The rules were already written with data types larger and smaller than those in the original GLSL in mind. Single-precision floats typically take four bytes; doubles take eight bytes. The larger storage requirement for doubles means a larger alignment requirement; doubles still need to be size-aligned.

    (2) Should double-precision vertex shader inputs be supported?

      RESOLVED: Not in this extension. Such support will be added by the EXT_vertex_attrib_64bit extension.

    (3) Should double-precision fragment shader outputs be supported?

      RESOLVED: Not in this extension.
      Note that we don't have double-precision framebuffer formats to accept such values.

    (4) Should transform feedback be able to capture double-precision components?

      RESOLVED: Yes. However, undefined behavior will occur unless all components are captured to size-aligned offsets. If any variable captured in transform feedback has double-precision components, the practical requirements for defined behavior are:

        (a) the offset of the base of a buffer object must be a multiple of eight bytes;

        (b) the amount of data captured per vertex must be a multiple of eight bytes; and

        (c) each double-precision variable captured must be aligned to a multiple of eight bytes relative to the beginning of a vertex.

      If capturing a mix of single- and double-precision components, it might be necessary to use the "gl_SkipComponents1" variable from ARB_transform_feedback3 to force proper alignment.

      We considered the possibility of adding error checks to throw errors in cases where undefined behavior might occur, but chose not to include such errors. For OpenGL 3.0-style transform feedback, cases (b) and (c) are solely a function of the variables captured and could be detected when a program object is linked. (Such an error would be more problematic for transform feedback via NV_transform_feedback, where the set of variables captured can be updated without relinking.) For case (a), the requirement of OpenGL 3.0 is that transform feedback buffer offsets must be a multiple of 4 bytes; enforcing a stricter 8-byte alignment would require either a backward-incompatible change or a Begin-time error that checks the offset of transform feedback buffers against the current program.

    (5) Should we have double-precision matrix types? We didn't add integer matrices, but integer matrix math is fairly uncommon.

      RESOLVED: Yes, we will support all matrix sizes in double-precision. We will also provide double-precision equivalents for all matrix operators and built-in matrix functions.
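
    Since this extension adds no error checks for the alignment conditions (a) through (c) of issue (4), an application may want to validate a capture layout itself. The sketch below is illustrative only and is not part of this extension: struct captured_var and capture_layout_is_aligned are hypothetical names, the layout is assumed to be interleaved into a single buffer, and single-precision and integer components are assumed to occupy four bytes each. A "gl_SkipComponents1" entry can be modeled as a one-component single-precision variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one captured variable. */
struct captured_var {
    size_t components;  /* number of scalar components captured    */
    int    is_double;   /* nonzero: 8-byte components, else 4-byte */
};

/* Returns 1 if an interleaved capture layout satisfies conditions
 * (a), (b), and (c) of issue (4), and 0 otherwise. Layouts without
 * any double-precision variable are accepted unconditionally; the
 * 4-byte offset rule of OpenGL 3.0 is not checked here. */
static int capture_layout_is_aligned(size_t base_offset,
                                     const struct captured_var *vars,
                                     size_t count)
{
    size_t offset = 0;   /* byte offset from the start of a vertex */
    int has_double = 0;
    size_t i;

    assert(vars != NULL || count == 0);

    for (i = 0; i < count; ++i) {
        if (vars[i].is_double) {
            has_double = 1;
            if (offset % 8 != 0)
                return 0;                 /* violates (c) */
        }
        offset += (vars[i].is_double ? 8u : 4u) * vars[i].components;
    }

    if (!has_double)
        return 1;
    if (base_offset % 8 != 0)
        return 0;                         /* violates (a) */
    if (offset % 8 != 0)
        return 0;                         /* violates (b) */
    return 1;
}
```

    Under these assumptions, a vec3 followed directly by a dvec2 fails condition (c) because the dvec2 starts at byte 12, while inserting one skipped single-precision component between them moves it to byte 16 and the layout passes.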
    (6) What should be done to distinguish between single- and
        double-precision floating-point constants?

      RESOLVED: We will use "LF" to identify double-precision floating-point
      constants.

      Here, we depart from the C standard. In C, floating-point constants
      without a suffix are implicitly double-precision and require an "F"
      suffix to specify a single-precision constant. However, GLSL has
      historically provided no support for double precision. Changing to C
      rules would materially affect the behavior of pre-existing shaders that
      add an #extension line for this extension, since constants with no
      suffix have meant "float" up to now. Additionally, such a change would
      likely have required that we introduce implicit conversions from double
      to float; otherwise, assigning a constant with no suffix to a float
      would result in a compile-time error.

    (7) Should we require IEEE 754-compliant behavior for NaNs and
        infinities? Denorms?

      RESOLVED: Following historical precedent in the GLSL and OpenGL APIs
      not defining special-case floating-point behavior, we chose not to do
      so in this extension.

    (8) Should we provide double-precision versions of all the built-ins that
        take a <genType>, which are currently defined to be floats and
        floating-point vectors?

      RESOLVED: We provide double-precision versions of most of the built-in
      functions supported by GLSL. We opted not to provide double-precision
      versions of the angle, trigonometry, exponential, derivative, and noise
      functions.

    (9) Are double-precision "varyings" (values passed between shader stages)
        supported by this extension? If so, is double-precision interpolation
        supported?

      RESOLVED: Double-precision shader inputs and outputs are supported,
      except for vertex shader inputs and fragment shader outputs.
      Additionally, double-precision vertex shader inputs are provided by the
      separate extension EXT_vertex_attrib_64bit.
      No known extension provides double-precision fragment outputs, but that
      doesn't seem important since OpenGL provides no pixel/texture formats
      with double-precision components that could reasonably receive such
      outputs.

      Interpolation is not supported in this extension for double-precision
      floating-point components. As with integer types in OpenGL 3.0,
      double-precision floating-point fragment shader inputs must be
      qualified as "flat".

      Note that this extension reformulates the spec language requiring
      "flat" qualifiers, in addition to adding doubles to the list of "flat"
      types. In GLSL 1.30, the spec applies these requirements to vertex
      shader outputs but imposes no requirement on fragment inputs. We move
      this requirement to fragment inputs, since vertex shader outputs may be
      passed to tessellation or geometry shaders without interpolation, and
      thus without the need for qualification by "flat".

    (15) Can the 64-bit uniform APIs be used to load values for uniforms of
         type "bool", "bvec2", "bvec3", or "bvec4"?

      RESOLVED: No. OpenGL 2.0 and beyond did allow "bool" variables to be
      set with Uniform*i* and Uniform*f APIs, and OpenGL 3.0 extended that
      support to Uniform*ui* for orthogonality. But it seems pointless to
      extend this capability forward to 64-bit Uniform APIs as well.

    (19) Should we support any implicit conversion of matrix types, now that
         we have both "mat4" and "dmat4"?

      RESOLVED: No. It doesn't seem worth the trouble.

Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  -----------------------------------------
     11   08/27/12  pbrown    Clarify that Uniform*d can not be used to load
                              uniforms with boolean types (bug 9345); import
                              issue (15) on the topic from NV_gpu_shader5.

     10   03/23/10  pbrown    Update issues section to include fp64 issues
                              that were left behind in NV_gpu_shader5 when
                              the specs were refactored.

      9   02/02/10  pbrown    Specify that capturing any component at an
                              offset that is not size-aligned results in
                              undefined behavior (bug 5863).
      8   01/29/10  pbrown    Remove shading language and API support for
                              double-precision vertex attributes; moved to
                              the EXT_vertex_attrib_64bit specification
                              (bug 5953). Added clarification disallowing
                              double-precision fragment shader outputs.

      7   01/29/10  pbrown    Delete accidental modifications to the language
                              for equal and not equal operators (bug 5904),
                              which already supported all types.

      6   01/15/10  pbrown    Modify the spec rules for counting attributes,
                              input and output components, and components to
                              capture in transform feedback to permit, but
                              not require, double-precision values to require
                              twice as many resources as single-precision
                              equivalents (bug 5855).

      5   01/14/10  pbrown    Minor updates from spec reviews.

      4   12/10/09  pbrown    Functionality updates from spec review: Allow
                              implicit conversion from mat*->dmat*. Rename
                              fmad and [un]packFloat2x32 to fma and
                              [un]packDouble2x32. Add overlooked fp64
                              versions of geometric functions.

      3   12/10/09  pbrown    Convert from EXT to ARB.

      2   12/08/09  pbrown    Miscellaneous fixes from spec review: Clarified
                              input/output component counting rules, where
                              each fp64 value counts double. General typo
                              fixes and language clarifications.

      1             pbrown    Internal revisions.