Name ARM_integer_dot_product Name Strings cl_arm_integer_dot_product_int8 cl_arm_integer_dot_product_accumulate_int8 cl_arm_integer_dot_product_accumulate_int16 cl_arm_integer_dot_product_accumulate_saturate_int8 Contributors Kevin Petit, ARM Ltd. Abel Bernabeu, ARM Ltd. Giridhar Tammana, ARM Ltd. Contact Kevin Petit, ARM Ltd. (kevin.petit 'at' ARM.com) IP Status No claims or disclosures are known to exist. Version Revision: #3, November 17th, 2017 Number OpenCL Extension #52 Status Complete. Extension Type OpenCL device extension Dependencies Requires OpenCL version 1.2 or later. Overview This extension adds built-in functions giving direct access to specialised integer dot product instructions that are supported on some devices. An application wishing to support a fallback path for devices where a specific function is not available may use the following pattern: #ifdef cl_arm_integer_dot_product_XXX #pragma OPENCL EXTENSION cl_arm_integer_dot_product_XXX : enable code using arm_dot_xxx() #else alternative implementation #endif Header File cl_ext.h New Procedures and Functions Built-in Functions int arm_dot(char4 a, char4 b); uint arm_dot(uchar4 a, uchar4 b); Description These functions are available when cl_arm_integer_dot_product_int8 is reported. The cl_arm_integer_dot_product_int8 preprocessor macro is then defined. The value returned is: (a.x * b.x) + (a.y * b.y) + (a.z * b.z) + (a.w * b.w). Operands are zero- or sign-extended to 32-bit before the multiplications. Built-in Functions int arm_dot_acc(char4 a, char4 b, int acc); uint arm_dot_acc(uchar4 a, uchar4 b, uint acc); Description These functions are available when cl_arm_integer_dot_product_accumulate_int8 is reported. The cl_arm_integer_dot_product_accumulate_int8 preprocessor macro is then defined. The value returned is: acc + [ (a.x * b.x) + (a.y * b.y) + (a.z * b.z) + (a.w * b.w) ]. Operands are zero- or sign-extended to 32-bit before the multiplications. Built-in Functions int arm_dot_acc(short2 a, short2 b, int acc); uint arm_dot_acc(ushort2 a, ushort2 b, uint acc); Description These functions are available when cl_arm_integer_dot_product_accumulate_int16 is reported. The cl_arm_integer_dot_product_accumulate_int16 preprocessor macro is then defined. The value returned is: acc + [ (a.x * b.x) + (a.y * b.y) ]. Operands are zero- or sign-extended to 32-bit before the multiplications. Built-in Functions int arm_dot_acc_sat(char4 a, char4 b, int acc); uint arm_dot_acc_sat(uchar4 a, uchar4 b, uint acc); Description These functions are available when cl_arm_integer_dot_product_accumulate_saturate_int8 is reported. The cl_arm_integer_dot_product_accumulate_saturate_int8 preprocessor macro is then defined. The value returned is: acc + [ (a.x * b.x) + (a.y * b.y) + (a.z * b.z) + (a.w * b.w) ]. Operands are zero- or sign-extended to 32-bit before the multiplications. The final accumulation is saturating.