Name String cl_intel_device_side_avc_motion_estimation Contributors Biju George, Intel Raghukrishnan Embar, Intel Ryan Lei, Intel Ben Ashbaugh, Intel Bartosz Sochacki, Intel Krishna Madaparambil, Intel Mateusz Tabaka, Intel Contact Biju George, Intel (biju.george 'at' intel.com) Version Version 5, November 9, 2018 Number OpenCL Extension #50 Status First Draft Dependencies OpenCL 1.2 is required. The OpenCL Intel vendor extension cl_intel_subgroups is required. The VME built-in functions are an extension of the subgroup functions defined in cl_intel_subgroups. The built-in functions which perform intra estimation depend on the OpenCL Intel vendor extension cl_intel_media_block_io in order to read-in the neighboring macroblock edge pixels. The built-in functions which perform chroma based intra estimation operations depend on the OpenCL Intel vendor extension cl_intel_planar_yuv in order to create NV12 source images. This extension is written against revision 29 of the OpenCL 2.0 API specification, against revision 33 of the OpenCL 2.0 OpenCL C specification, and against revision 32 of the OpenCL 2.0 extension specification. Overview Video motion estimation (VME) is defined as of set motion estimation operations that are used to determine the motion vectors, intra estimation angles and macroblock partitioning combination that best describe the transformation to the source macroblock, from blocks in one or more previous reference pictures (inter-prediction), or from other blocks in the same source picture (intra-prediction). It does this by searching for spatial and temporal patterns on the current and various forward and backward reference pictures. The goal of this extension is to provide programmers with a fine- grained interface to the AVC VME media sampler in Intel graphics processors. It describes the specification of low-level built-in functions, callable from OpenCL kernels, that facilitate the programming of the VME media sampler to evaluate specific AVC motion estimation operations. If only a coarser-level interface at the level of built-in kernels suffices, then the Intel vendor extensions cl_intel_motion_estimation and cl_advanced_motion_estimation may be considered. Built-in functions are defined for all the major operations of the VME media sampler. The major operations of the AVC VME media sampler in Intel Graphics Processors can be described as follows: 1. Integer motion estimation (IME) Perform motion estimation on a given source macroblock in a source image over a single or dual reference window in a reference image, at full-pixel resolution, to determine the best integer motion vectors and their associated distortions, and the best macroblock shape partitioning combination. 2. Motion estimation refinement (REF) Perform refinement operations on the results of IME. The two sub- operations are: Fractional motion estimation (FME) Perform sub-pixel refinement on the results of an IME operation. half-pixel (HPEL) or quarter-pixel (QPEL) refinements are performed to determine the best sub-pixel motion vectors and their associated distortions. Bidirectional motion estimation (BME) Perform bidirectional refinement on the results of an IME operation using two reference images to check if the bidirectional mode using two references yields lesser distortions. An FME can optionally be performed implicitly as part of a bidirectional refinement. 3. Skip and Intra check (SIC) Performs the following two sub-operations: Skip check (SKC) Compute the pixel distortion of a user-specified shape and motion vector combination. The VME media sampler fetches necessary pixels, performs fractional and bidirectional filtering (as necessary), and then computes the distortion between the derived reference and source. The skip decision can optionally be enhanced to include a 4x4 forward transform, the results of which are compared against a user specified threshold to emulate the effects of the forward quantization zeroing effect. Intra prediction estimation (IPE) Perform intra prediction on a given source macroblock to determine the best intra prediction modes and the best shape partitioning combination. New API Enums Accepted as arguments to clGetDeviceInfo CL_DEVICE_AVC_ME_VERSION_INTEL 0x410B CL_DEVICE_AVC_ME_SUPPORTS_TEXTURE_SAMPLER_USE_INTEL 0x410C CL_DEVICE_AVC_ME_SUPPORTS_PREEMPTION_INTEL 0x410D Additional valid constant values returned by clGetDeviceInfo: CL_AVC_ME_VERSION_1_INTEL 0x1 Additions to Chapter 4 of the OpenCL Specification: Modify the description of function clGetDeviceInfo(..) Table 4.3 is extended to include the following enumeration constants. +--------------------------+-----------+----------------------------+ |cl_device_info |Return Type|Description | +--------------------------+-----------+----------------------------+ |CL_DEVICE_AVC_ME_VERSION_ |cl_uint |The AVC device-side motion | |INTEL | |estimation API version | | | |number supported by the | | | |device and driver. | +--------------------------+-----------+----------------------------+ |CL_DEVICE_AVC_ME_SUPPORTS_|cl_bool |Is CL_TRUE if built-in | |TEXTURE_SAMPLER_USE_INTEL | |functions using the texture | | | |sampler may be used along | | | |with AVC VME built-in | | | |functions using the media | | | |sampler on the device, and | | | |CL_FALSE otherwise. | +--------------------------+-----------+----------------------------+ |CL_DEVICE_AVC_ME_SUPPORTS_|cl_bool |Is CL_TRUE if an enqueue of | |PREEMPTION_INTEL | |an OpenCL kernel that uses | | | |AVC VME built-in functions | | | |supports preemption on the | | | |device, and CL_FALSE | | | |otherwise. (This may be | | | |useful on platforms that | | | |have strict latency | | | |requirements on the GPU). | +--------------------------+-----------+----------------------------+ This extension requires a value of CL_AVC_ME_VERSION_1_INTEL returned by clGetDeviceInfo for CL_DEVICE_AVC_ME_VERSION_INTEL. Terms, Acronyms and Definitions Adds a new section 6.13.14.X "VME built-in functions" to the OpenCL 2.0 C specification. " The following terms, acronyms and definitions are used in and provide context for the VME built-in functions. Macro-block (MB): ----------------- An image is partitioned into macro-blocks of size 16x16 pixels. It is the basic unit of processing for AVC video motion estimation operations. Shape: ------ A MB may be partitioned into sub-blocks of one of the major shapes. A sub-block with an 8x8 major shape may be further independently partitioned into sub-blocks of one of the minor shapes. It is represented by pre-defined shape enumeration values. Major Shapes: ------------- Shapes of 16x16, 16x8, 8x16, or 8x8 partitions of a MB. A 16x16 major shape merely indicates that the MB was not further partitioned. Minor Shapes: ------------- Shapes of 8x8, 8x4, 4x8, or 4x4 sub-partitions of an 8x8 partition. A 8x8 minor shape merely indicates that the 8x8 major partition was not further sub-partitioned. Block: ------ A sub-block of a MB with one of the major or minor shapes. Reference Image: ---------------- An image (typically from the previously decoded buffer in an encoder pipeline) from which motion estimation predictions are made. Source Image: ------------- The current image for which motion estimation predictions are made. Source Macro-block Offset: -------------------------- The 2D offset of the top left corner of the source MB in pixel units. It is represented by a pair of unsigned 16-bit integers. Reference Window Offset: ------------------------ The 2D offset of the top left corner reference search window w.r.t to the top left corner of the source MB in pixel units. It is represented by a pair of signed 16-bit integers in the range [-2048, 2047]. Reference identifier: --------------------- Reference identifiers are associated to pairs of forward(L0)/ backward(L1) reference image parameters. Up to 16 pairs of reference pairs of reference image parameters are permitted, with the permitted values of reference identifiers ranging from 0 to 15. The reference identifers are assigned in increasing order in which the reference image parameter pairs are declared in the OpenCL kernel parameter list. See new section 6.13.14.X "VME built-in functions" described below for more details. Motion Vector (MV): ------------------- A 2D vector used for inter motion estimation that provides an offset from the top left corner of a block in the source image to the top left corner an identically sized block in the reference image. Generally it is used to represent the best match of a block in the reference image to a block in the source image. The best match is determined as the block minimizing the distortion. MVs are specified in QPEL resolution with the 2 LSB representing the fractional part of the offset. It is represented by a pair of signed 16-bit signed integers. Packed Motion Vector: -------------------------- A motion vector represented as a packed 32-bit unsigned integer. The lower 16 bits contains the X coordinate and the upper 16 bits contains the Y coordinate. Bidirectional Motion Vector (BMV): ---------------------------------- A pair of MVs for the forward(L0) and backward(L1) images. Depending on how the VME operation is configured only the forward or the backward MV or both may be valid. Packed Bidirectional Motion Vector: ----------------------------------- A bidirectional MV represented as a packed 64-bit unsigned integer. The lower 32-bits contain the forward packed MV, and the upper 32-bits contain the backward packed MV. Sum Of Absolute Difference (SAD): --------------------------------- The sum of absolute differences of every full/sub-pixel location in the source block w.r.t every corresponding full/sub pixel in the reference block as specified by a given MV. The sum of absolute differences may be optionally Haar transform adjusted. It is represented by an unsigned 16-bit integer value. Haar Transform (HAAR): ---------------------- A simple wavelet transform that is used to refine the distortion measure of SAD. The per pixel difference goes through a 4x4 Haar transform. Then the SAD is replaced by the sum of the absolute values of the transform domain coefficients in the distortion. Haar transform is used as a coarse estimation of the integer transform. Motion Vector Cost Center (CC): ------------------------------- A MV has an associated cost w.r.t a cost center coordinate. The further away from the cost center, the larger will be the cost associated with the MV. Cost centers are specified in QPEL resolution with the 2 LSB representing the fractional part of the offset. Motion Vector Cost Center Delta (CCD): -------------------------------------- The 2D offset of the cost center relative to the top left corner of the source MB. Cost center deltas are specified in QPEL resolution with the 2 LSB representing the fractional part of the offset. It is represented by a pair of signed 16-bit integers. Packed Motion Vector Cost Center Delta: --------------------------------------- A motion vector cost center delta represented as a packed 32-bit unsigned integer. The lower 16 bits contains the X coordinate and the upper 16 bits contains the Y coordinate. Bidirectional Motion Vector Cost Center Delta: ---------------------------------------------- A pair of cost center deltas for the forward and backward images. Packed Bidirectional Motion Vector Cost Center Delta: ----------------------------------------------------- A packed bidirectional motion vector cost center delta represented as a 64-bit unsigned integer. The lower 32-bits contain the forward packed CCD, and the upper 32-bits contain the backward packed CCD. Motion Vector Cost: ------------------- The MV cost is determined using a cost function described by a cost table that is indexed based on power-of-two distances from the user specified cost center, with a user specified precision (or unit) of the distances from the cost center. U4U4 Byte Format: ----------------- Represents a value of (B< 0x | | | 1 to 2 => 1x | | | 3 to 6 => 2x | | | 7 to 15 => 3x | | | | | |The value of | | |reference_base_penalty is | | |specified in U4U4 format and | | |the integer value must fit | | |within 12 bits. | +-------------------------------------+-----------------------------+ Inter shape and direction cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ These built-in functions enable shape costing for inter estimation. They allow for the configuration of payloads for the biasing of certain shapes over others based on the configured parameters. +------------------------------------+-------------------------------+ |ulong intel_sub_group_avc_mce_get_ |Get the default packed shape | |default_inter_shape_penalty( |cost for inter estimation in | | uchar slice_type, |U4U4 format. | | uchar qp ) | | | |The value of slice_type must be| | |a valid slice type enumeration | | |value. | | | | | |The value of qp must be a valid| | |quantization parameter value | | |between 0 and 51. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_mce_payload_t |Set the shape penalty for inter| |intel_sub_group_avc_mce_set_inter_ |motion estimation. | |shape_penalty( | | | ulong packed_shape_penalty, |The value of packed_shape_cost | | intel_sub_group_avc_mce_payload_t|is an unsigned long integer | | payload ) |value with the following bits | | |specifying the shape penalty in| | |U4U4 format as follows: | | | | | | 7:0 => 16x8 and 8x16 | | | 15:8 => 8x8 | | | 23:16 => 8x4 and 4x8 | | | 31:24 => 4x4 | | | 39:32 => 16x16 | | | 63:40 => must be zero | | | | | |The U4U4 decoded integer values| | |for byte 0 and byte 4 must bit | | |fit in 12 bits, while the U4U4 | | |decoded integer values for the | | |other bytes must fit within 10 | | |bits. | +------------------------------------+-------------------------------+ |uchar intel_sub_group_avc_mce_get_ |Get the default direction | |default_inter_direction_penalty( |penalty for inter estimation in| | uchar slice_type, |U4U4 format. | | uchar qp ) | | | |The value of slice_type must be| | |a valid slice type enumeration | | |value. | | | | | |The value of qp must be a valid| | |quantization parameter value | | |between 0 and 51. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_mce_payload_t |Set the direction penalty for | |intel_sub_group_avc_mce_set_ |backward images used in inter | |inter_direction_penalty( |motion estimation. | | uchar direction_cost, | | | intel_sub_group_avc_mce_payload_t|The value of direction_cost | | payload ) |must be in U4U4 format and its | | |decoded integer value must bit | | |fit in 12 bits. | +------------------------------------+-------------------------------+ Intra shape cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++ These built-in functions enable shape costing for intra estimation. They allow for the configuration of payloads for biasing of certain shapes over others based on the configured parameters. Only the built- in function providing the default shape penalty is specified as an MCE function. The function which actually configures the payload for the intra estimation operation is specified as a SIC built-in function. +----------------------------------+--------------------------------+ |uint intel_sub_group_avc_mce_get_ |Get the default packed luma | |default_intra_luma_shape_penalty( |intra shape penalty in U4U4 | | uchar slice_type, |format. | | uchar qp ) | | +----------------------------------+--------------------------------+ Inter motion vector cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++ These built-in functions enable motion vector costing for inter estimation. The distortion measure is augmented to favor motion vectors closer to the cost-center considered in conjunction with the primary objective of minimizing the SAD between the source and reference blocks. +------------------------------------+-------------------------------+ |uint2 intel_sub_group_avc_mce_get_ |Get the default inter motion | |default_inter_motion_vector_ |vector cost table for the | |cost_table( |pre-defined control points in | | uchar slice_type, |U4U4 format for the input Qp | | uchar qp ) |and slice type. | | | | | |The value of slice_type must be| | |a valid slice type enumeration | | |value. | | | | | |The value of qp must be a valid| | |quantization parameter value | | |between 0 and 51. | +------------------------------------+-------------------------------+ |uint2 intel_sub_group_avc_mce_get_ |Get the default predefined | |default_high_penalty_cost_table( |packed U4U4 format high cost | | void) |table for high Qp. This may be | | |more appropriate for frame | | |sequences with high motion. | +------------------------------------+-------------------------------+ |uint2 intel_sub_group_avc_mce_get_ |Get the default predefined | |default_medium_penalty_cost_table( |packed U4U4 format medium cost | | void) |table for medium Qp. This may | | |be more appropriate for frame | | |sequences with normal motion. | +------------------------------------+-------------------------------+ |uint2 intel_sub_group_avc_mce_get_ |Get the default predefined | |default_low_penalty_cost_table( |packed U4U4 format low cost | | void) |table for low Qp. This may be | | |more appropriate for frame | | |sequences with low motion. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_mce_payload_t |Update the input payload to set| |intel_sub_group_avc_mce_set_ |the cost precision along with | |motion_vector_cost_function( |the cost center and cost table | | ulong packed_cost_center_delta, |and return it. | | uint2 packed_cost_table, | | | uchar cost_precision, |The value of | | intel_sub_group_avc_mce_payload_t|packed_cost_center_delta is the| | payload ) |packed bidirectional cost | | |center delta value relative to | | |the source macroblock, which | | |specifies the 4 bidirectional | | |cost centers of each of the 8x8| | |partitions of the reference | | |image. If only unidirectional | | |search is performed then the | | |values of the backward | | |reference cost centers must be | | |zero. Work-item 'n' provides | | |the value of cost center | | |'n'. It is specified in QPEL | | |units. For 16x16 partitions | | |work-item '0' provides the cost| | |center. For 8x16 partitions | | |work-items '0' and '1' provide | | |the cost centers. For 16x8 | | |partitions work-items '0' and | | |'2' provide the cost | | |centers. The X and Y | | |coordinates of each cost center| | |delta must be in the range | | |[-2048, 2047] and [-512.00 to | | |511.75] respectively, otherwise| | |the results are undefined. | | | | | |The packed cost table specifies| | |the cost penalties for | | |pre-defined control points in | | |U4U4 format in the cost | | |function curve. The first 7 | | |bytes specify 7 control points | | |representing consecutive | | |powers-of-two delta units (2^0 | | |to 2^6). Each delta unit, dx, | | |is the distance of a motion | | |vector, mv, from the specified | | |cost center, cc | | |(dx=|mv-cc|). The cost penalty | | |values at in-between control | | |points are linearly | | |interpolated. The range of the | | |cost function is defined to be | | |from 2^0 to 2^6 delta | | |units. The 8th byte of the | | |packed cost table specifies the| | |penalty base factor (over_cost)| | |for dx distances that are | | |out-of-range. The penalty of | | |out-of-range cost dx distances | | |is computed as min(over_cost + | | |int(dx) - 64, 255). | | | | | |The value of the cost precision| | |parameter must be a valid cost | | |precision enumeration value, | | |and specifies the precision of | | |the delta units from the cost | | |center, dx. This effectively | | |can be used to control the | | |range of the cost function as | | |follows: | | | | | |PRECISION DELTAS PIXEL RANGE | | |--------- ------ ----------- | | | | | |PEL pixel 0-64 | | |DPEL dual 0-127 | | | pixel | | |HALF half 0-31 | | | pixel | | |QUARTER quarter 0-15 | | | pixel | | | | | |The inter distortion for a | | |block can be described by the | | |following formula: | | | | | |Distortion = | | | SAD (or HAAR) + | | | MV_Cost_Penalty + | | | Shape_Penalty + | | | Direction_Cost + | | | Multi_Reference_Penalty | +------------------------------------+-------------------------------+ Intra mode cost configuration phase functions +++++++++++++++++++++++++++++++++++++++++++++ These built-in functions enable mode costing for intra estimation. They allow for the configuration of payloads to bias the computed intra modes to be closer to their configured neighbor modes. This form of costing is similar to the inter motion vector costing. Only the built-in functions providing the defaults mode costs are specified as MCE functions. The remaining functions which actually configure the payload for the intra estimation operation is specified as SIC built- in functions. +----------------------------------+---------------------------------+ |uchar intel_sub_group_avc_mce_get_|Get the default intra mode cost | |default_intra_luma_mode_penalty( |penalty for in U4U4 format when | | uchar slice_type, |the estimated mode differs from | | uchar qp ) |its predicted mode from its | | |neighbors. | | | | | |The value of slice_type must be a| | |valid slice type enumeration | | |value. | | | | | |The value of qp must be a valid | | |quantization parameter value | | |between 0 and 51. | +----------------------------------+---------------------------------+ |uint intel_sub_group_avc_mce_get_ |Get the default intra non-dc cost| |default_non_dc_luma_intra_penalty(|penalty for intra luma estimation| | void ) |in packed 32-bit integer format. | +----------------------------------+---------------------------------+ |uchar intel_sub_group_avc_mce_get_|Get the default chroma mode base | |default_intra_chroma_mode_base_ |penalty in U4U4 format. | |penalty( | | | void ) | | +----------------------------------+---------------------------------+ Miscellaneous property configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++ These built-in functions enable miscellaneous MCE properties settings. +--------------------------------------+-----------------------------+ |intel_sub_group_avc_mce_payload_t |Update the input payload to | |intel_sub_group_avc_mce_set_ |enable an AC only HAAR SAD | |ac_only_haar( |mode and return it. It | | intel_sub_group_avc_mce_payload_t |overrides any previous | | payload ) |setting for sad adjustment. | | | | | |This feature is mainly | | |intended for improved block | | |matching in frame-rate | | |conversion (FRC) kernels. | +--------------------------------------+-----------------------------+ |intel_sub_group_avc_mce_payload_t |Update the input payload to | |intel_sub_group_avc_mce_set_source_ |specify the field polarities | |interlaced_field_polarity( |for interlaced source images | | uchar src_field_polarity, |used for inter or intra | | intel_sub_group_avc_mce_payload_t |operations. | | payload ) | | | |The value of | | |src_field_polarity must be a | | |valid field polarity | | |enumeration value indicating | | |the field polarity for the | | |source image. | +--------------------------------------+-----------------------------+ |intel_sub_group_avc_mce_payload_t |Update the input payload to | |intel_sub_group_avc_mce_set_single_ |specify the field polarities | |reference_interlaced_field_polarity( |for interlaced reference | | uchar ref_field_polarity, |images used for single | | intel_sub_group_avc_mce_payload_t |reference inter search or | | payload ) |check operation. | | | | | |The value of | | |ref_field_polarity must be a | | |valid field polarity | | |enumeration value indicating | | |the field polarity for the | | |reference image. | +--------------------------------------+-----------------------------+ |intel_sub_group_avc_mce_payload_t |Update the input payload to | |intel_sub_group_avc_mce_set_dual_ |specify the field polarities | |reference_interlaced_field_polarities(|for interlaced reference | | uchar fwd_ref_field_polarity, |images used for dual | | uchar bwd_ref_field_polarity, |reference inter search or | | intel_sub_group_avc_mce_payload_t |check operation. | | payload ) | | | |The value of | | |fwd_ref_field_polarity must | | |be a valid field polarity | | |enumeration value indicating | | |the field polarity for the | | |forward image. | | | | | |The value of | | |bwd_ref_field_polarity must | | |be a valid field polarity | | |enumeration value indicating | | |the field polarity for the | | |backward image. | +--------------------------------------+-----------------------------+ Result processing phase functions +++++++++++++++++++++++++++++++++ These built-in functions facilitate the extraction of components of the result from VME unit. +-----------------------------------+--------------------------------+ |ulong intel_sub_group_avc_mce_get_ |Get the MCE packed BMVs result. | |motion_vectors( | | | intel_sub_group_avc_mce_result_t|Up to 16 packed BMVs are | | result ) |returned, one per work-item. If | | |the MCE search operation's | | |payload was setup for | | |unidirectional search then only | | |the forward packed MV will be | | |valid in each BMV, otherwise | | |both packed MVs will be | | |valid. The BMVs have to be | | |selected by their respective | | |work-items based on the result | | |block major and minor shapes. | | | | | |If the major shape is: | | | - 16x16, then one BMV is | | | returned by work-item 0 | | | - 16x8, or 8x16, then two | | | BMVs are returned by work- | | | items 0 and 8 | | | - 8x8, then four sets of BMVs| | | corresponding to the four | | | partitions in traditional | | | Z-order are returned by | | | work-items in the ranges | | | [0, 3], [4, 7], [8, 11], | | | and [12, 15]; the minor | | | shape will determine | | | exactly which work-items | | | in the reserved inclusive | | | range for the partition | | | returns the BMVs for that | | | partition | | | | | |If the range of work-items for | | |the 8x8 major partition is [n, | | |n+3] and the minor shape is: | | | - 8x8, then work-item 'n' | | | returns the BMV for each | | | minor partition | | | - 8x4 or 4x8, then work-items| | | 'n' and 'n+2' returns the | | | BMVs for each minor | | | partition | | | - 4x4, then all work-items in| | | [n, n+3] return the BMVs | | | for each minor partition | | | in traditional Z-order | | | | | |NOTES: | | | | | |(1) All sub-block BMVs get | | |replicated for each partition. | | |For example, for a 16x16 | | |partition, all smaller sub-block| | |BMVs are replicated to the same | | |BMV, and for 8x8 partition, each| | |8x8 must have its respective | | |sub-block BMVs replicated. This | | |is not important to extract the | | |component BMVs itself, but is | | |needed if the result of this | | |function is used to initialize | | |the input motion vectors of a | | |REF initialization function. | | | | | |(2) With interlaced images, the | | |MBs for the top field MBs are | | |considered as logically | | |overlapping with the bottom MBs.| +-----------------------------------+--------------------------------+ |ushort intel_sub_group_avc_mce_get_|Get the MCE inter distortions | |inter_distortions( |result corresponding to the BMVs| | intel_sub_group_avc_mce_result_t|returned by intel_sub_group_avc_| | result ) |mce_get_motion_vectors(..). The | | |MCE inter directions result | | |returned by intel_sub_group_avc_| | |mce_get_inter_directions(..) | | |will specify if the distortion | | |corresponds to the forward MV, | | |backward MV, or the | | |bidirectional MV in the BMV. Up | | |to 16 distortions are returned, | | |one per work-item. | | | | | |The distortions have to be | | |selected by their respective | | |work-items based on the result | | |block major and minor shapes | | |just as for the result MVs as | | |described above. | +-----------------------------------+--------------------------------+ |ushort intel_sub_group_avc_mce_get_|Get the best inter distortion | |best_inter_distortion( |for the whole MB. | | intel_sub_group_avc_mce_result_t| | | result ) | | +-----------------------------------+--------------------------------+ |uchar intel_sub_group_avc_mce_get_ |Get the MCE inter MB major | |inter_major_shape( |partition shape. | | intel_sub_group_avc_mce_result_t| | | result ) |The returned values are as per | | |the inter-MB major shapes | | |enumeration values. | | | | | |This can only be called as part | | |of an IME or REF operation | | |evaluation. | +-----------------------------------+--------------------------------+ |uchar intel_sub_group_avc_mce_get_ |Get the MCE inter MB minor | |inter_minor_shapes( |partition shapes. | | intel_sub_group_avc_mce_result_t| | | result ) |It returns a bit field with the | | |minor shapes for the 4 8x8 | | |sub-partitions in traditional Z | | |order. Two bits are reserved for| | |each of the four sub-partitions | | |in row-major order. The | | |returned 2-bit values are as per| | |the inter-MB minor shapes | | |enumeration values. | | | | | |This function returns valid | | |results only if the major shape | | |is 8x8, otherwise the results | | |are undefined. | | | | | |This can only be called as part | | |of an IME or REF operation | | |evaluation. | +-----------------------------------+--------------------------------+ |uchar intel_sub_group_avc_mce_get_ |Get the MCE inter MB major | |inter_directions( |partition directions. | | intel_sub_group_avc_mce_result_t| | | result ) |It returns a bit field with the | | |direction for up to 4 major | | |sub-partitions in traditional Z | | |order. Two bits are reserved for| | |each of the four | | |sub-partitions. The returned | | |2-bit values are as per the | | |inter-MB major shape direction | | |enumeration values. | | | | | |If the major partition is: | | | | | |- 16x16, then bits in the range | | | [0, 1] contains the direction | | |- 16x8 or 8x16, then bits in the| | | ranges [0, 1] and [2, 3] | | | contains the two partitions | | | directions | | |- 8x8, then bits in the ranges | | | [0, 1], [2, 3], [4, 5], and | | | [6, 7] contains the four | | | partitions directions | | | | | |The returned values are as per | | |the inter direction enumeration | | |values. | | | | | |This can only be called as part | | |of an IME or REF operation | | |evaluation. | +-----------------------------------+--------------------------------+ |uchar intel_sub_group_avc_mce_get_ |Get the count of motion vectors | |inter_motion_vector_count( |(based on the partitioning | | intel_sub_group_avc_mce_result_t|decision) returned by the search| | result ) |operation. | | | | | |This can only be called as part | | |of an IME or REF operation | | |evaluation. | +-----------------------------------+--------------------------------+ |uint intel_sub_group_avc_mce_get_ |Get the MCE inter MB reference | |inter_reference_ids( |identifiers in a packed integer | | intel_sub_group_avc_mce_result_t|format, with the following bits | | result ) |specifying the reference | | |identifiers for the major | | |partitions. | | | | | |3:0 => Fwd reference block 0 | | |7:4 => Bwd reference block 0 | | |11:8 => Fwd reference block 1 | | |15:12 => Bwd reference block 1 | | |19:16 => Fwd reference block 2 | | |23:20 => Bwd reference block 2 | | |27:24 => Fwd reference block 3 | | |31:28 => Bwd reference block 3 | | | | | |The values of each individual | | |4-bit reference identifier range| | |from 0 to 15, with each value | | |identifying the distance of | | |ordered pair of forward/backward| | |reference images as declared in | | |the VME kernel parameter | | |interface list. | | | | | |If the dual-reference evaluation| | |functions are not used, then the| | |values of the backward reference| | |identifiers are undefined. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference identifier | | |pairs are replicated. For | | |example, for a 16x16 block all | | |four pairs of reference | | |identifiers are replicated to | | |the value of the first pair for | | |block 0. | | | | | |NOTE: Unless HW assisted | | |multi-reference search was | | |performed using the IME | | |streamin/streamout evaluation | | |functions, the individual 4-bit | | |reference identifier pair values| | |will all be the same (pointing | | |to the same pair for | | |forward/backward reference | | |images). | +-----------------------------------+--------------------------------+ |uchar |Get the MCE inter MB reference | |intel_sub_group_avc_mce_get_inter_ |field polarities for the | |reference_interlaced_field_ |corresponding reference | |polarities( |identifiers returned by | | uint packed_reference_ids, |intel_sub_group_avc_ | | uint |mce_get_inter_reference_ids(..) | | packed_reference_parameter_ |in a packed integer format, with| | field_polarities, |the following bits specifying | | intel_sub_group_avc_mce_result_t|the reference field polarities | | result ) |for the major partitions. | | | | | | 0 : Fwd reference block 0 | | | 1 : Fwd reference block 1 | | | 2 : Fwd reference block 2 | | | 3 : Fwd reference block 3 | | | 4 : Bwd reference block 0 | | | 5 : Bwd reference block 1 | | | 6 : Bwd reference block 2 | | | 7 : Bwd reference block 3 | | | | | |If the dual-reference evaluation| | |functions are not used, then the| | |values of the backward reference| | |field polarities are undefined. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference field | | |polarities are replicated. For | | |example, for a 16x16 block all | | |four pairs of reference field | | |polarities are replicated to the| | |value of the first pair for | | |block 0. | | | | | |The value of | | |packed_reference_ids is as | | |defined by the return value of | | |intel_sub_group_avc_mce_get_ | | |inter_reference_ids(..). | | | | | |The value of | | |packed_reference_parameter_ | | |field_polarities specifies the | | |packed bit field of field | | |polarities for each of the (up | | |to 16) forward/backward | | |interleaved pairs of reference | | |images in the same order as | | |specified in the kernel | | |parameter list, as used for the | | |inter search operation. If less | | |than 16 pairs are used then the | | |corresponding bit field values | | |are ignored. | | | | | |NOTE: An important restriction | | |is that when multiple IME | | |operations are performed for a | | |HW multi-assisted | | |multi-reference search operation| | |using the streamin/streamout | | |capabilities, the same reference| | |image parameter cannot be used | | |with different polarities in the| | |sequence of IME operations used | | |for a HW-assisted search | | |operation. In other words, the | | |field polarities for reference | | |image parameters must be used | | |consistently across IME | | |operations used in a HW assisted| | |multi-reference search | | |operation. | +-----------------------------------+--------------------------------+ ______________________________________________________________________ IME built-in functions ---------------------- A set of ordered phases of functions are required to be called to evaluate an integer motion estimation result. Initialization phase functions ++++++++++++++++++++++++++++++ These built-in functions create a properly initialized payload that can be used for further configured for evaluating IME operations. This is a required initial phase. +---------------------------------+----------------------------------+ |intel_sub_group_avc_ime_payload_t|Return an initialized payload for | |intel_sub_group_avc_ime_ |a VME integer search (IME) | |initialize( |operation. | | ushort2 src_coord, | | | uchar partition_mask, |The payload is initialized for | | uchar sad_adjustment ) |progressive frame operations, and | | |the cost configuration values and | | |the miscellaneous property values | | |are all initialized to zero. The | | |cost configuration and the | | |miscellaneous property | | |configuration phase functions must| | |be used to override the initial | | |configurations in the payload. | | | | | |The src_coord value represents the| | |2D offset of the top left corner | | |of the source MB in pixel units in| | |the source image. Source MBs at | | |the image borders are allowed to | | |be partial, but the top-left | | |corner must be within the image. | | | | | |If the source image is an | | |interlaced scan image, then the | | |bottom field lines are considered | | |as logically overlapping with the | | |top field lines (i.e. the top | | |field MBs are considered as | | |logically overlapping with the | | |bottom MBs) for the purposes for | | |specifying the src_coord value. | | | | | |The legal values for | | |partition_mask can be composed by | | |setting the appropriate bit fields| | |specified by partition mask | | |enumeration values using the '&' | | |operator. | | | | | |The legal values for | | |sad_adjustment is specified by its| | |respective enumeration values. | | | | | |If the sad_adjustment is set to | | |CLK_AVC_ME_SAD_ADJUST_MODE_HAAR_ | | |INTEL, a simple wavelet transform,| | |Haar transform, is used to refine | | |the distortion measure of | | |SAD. Haar transform here is used | | |as a coarse estimation of the | | |integer transform. | +---------------------------------+----------------------------------+ Configuration phase functions +++++++++++++++++++++++++++++ These built-in functions allow for configuration of the search window. A call to either intel_sub_group_avc_ime_set_single_reference(..) or intel_sub_group_avc_ime_set_dual_reference(..) is required. This is a required phase immediately following the initialization phase +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |Update the input payload for a | |intel_sub_group_avc_ime_set_ |VME single-reference search | |single_reference( |with the configuration for the | | short2 ref_offset, |reference window search region,| | uchar search_window_config, |and return it. | | intel_sub_group_avc_ime_payload_t| | | payload ) |The 2D ref_offset specifies the| | |reference window offset. The X | | |and Y coordinates must be in | | |the range [-2048, 2047], | | |otherwise the results are | | |undefined. The reference window| | |is allowed to be partially | | |outside the image. Pixel | | |replication is applied to | | |generate out-of-bound reference| | |pixels. It is specified in PEL | | |units. Results are undefined in| | |the reference region is | | |completely outside the image. | | | | | |If the reference image is an | | |interlaced scan image, then the| | |top field lines are considered | | |as logically overlapping with | | |the bottom field lines | | |(i.e. the top field MBs are | | |considered as logically | | |overlapping with the bottom | | |MBs) for the purposes for | | |specifying the ref_offset | | |value. | | | | | |The parameter | | |search_window_config must be a | | |compile-time constant. | | | | | |The value of | | |search_window_config must be | | |one of the unreserved search | | |window configuration | | |enumeration values. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |Update the input payload for a | |intel_sub_group_avc_ime_ |VME dual-reference search with | |set_dual_reference( |the configurations for the | | short2 fwd_ref_offset, |reference window search | | short2 bwd_ref_offset, |regions, and return it. | | uchar search_window_config, | | | intel_sub_group_avc_ime_payload_t|The 2D | | payload ) |fwd_ref_offset/bwd_ref_offset | | |specifies the forward/backward | | |reference window offsets. The X| | |and Y coordinates must be in | | |the range [-2048, 2047], | | |otherwise the results are | | |undefined. The reference | | |windows are allowed to be | | |partially outside the | | |image. Pixel replication is | | |applied to generate | | |out-of-bound reference | | |pixels. It is specified in PEL | | |units. | | | | | |If a reference image is an | | |interlaced scan image, then the| | |top field lines are considered | | |as logically overlapping with | | |the bottom field lines | | |(i.e. the top field MBs are | | |considered as logically | | |overlapping with the bottom | | |MBs) for the purposes for | | |specifying the corresponding | | |fwd_ref_offset and/or | | |bwd_ref_offset values. | | | | | |The parameter | | |search_window_config must be a | | |compile-time constant and must | | |be one of the unreserved search| | |window configuration | | |enumeration values. | +------------------------------------+-------------------------------+ |ushort2 intel_sub_group_avc_ime_ |Get the 2D size of the | |ref_window_size( |reference window in pixel | | uchar search_window_config, |units. | | char dual_ref ) | | | |The value of | |// deprecated |search_window_config must be | |ushort2 intel_sub_group_ime_ |one of the unreserved search | |ref_window_size( |window configuration | | uchar search_window_config, |enumeration values. | | char dual_ref ) | | | |The value of dual_ref must be | | |set to zero for a single | | |reference search window and one| | |for a dual-reference search | | |window. | +------------------------------------+-------------------------------+ |short2 |If the input 2D reference | |intel_sub_group_avc_ime_ |window offset, ref_offset, | |adjust_ref_offset( |causes the reference window to | | short2 ref_offset, |be fully out-of-bound of the | | ushort2 src_coord, |reference image, adjust it such| | ushort2 ref_window_size, |that the reference window is | | ushort2 image_size ) |within bounds of the reference | | |image. | | | | | |The 2D ref_offset specifies the| | |reference window offset. The X | | |and Y coordinates must be in | | |the range [-2048, 2047], | | |otherwise the results are | | |undefined. | | | | | |The src_coord value represents | | |the 2D offset of the top left | | |corner of the source MB in | | |pixel units in the source | | |image. Source MBs at the image | | |borders are allowed to be | | |partial. Results are undefined | | |if the source MB is completely | | |outside the image. | | | | | |The ref_window_size specifies | | |the 2D size of the reference | | |window in pixel units. | | | | | |The image_size specifies the 2D| | |size of the progressive scan, | | |or top or bottom fields, of the| | |interlaced scan image in pixel | | |units. | | | | | |If the reference image is an | | |interlaced scan image, then the| | |bottom field lines are | | |considered as logically | | |overlapping with the top field | | |lines the purposes for | | |specifying the image_size | | |value. Since, the actual layout| | |of the top and bottom fields in| | |the reference image is in an | | |interleaved fashion, the height| | |of the top or bottom fields | | |should be exactly half of the | | |actual reference image height. | | | | | |A call to | | |intel_sub_group_avc_ime_ | | |adjust_ref_coord(..) is | | |optional. It is required only | | |if the reference window offsets| | |inputs to | | |intel_sub_group_avc_ime_set_ | | |single_reference(..) or | | |intel_sub_group_avc_ime_set_ | | |dual_reference (..) is | | |potentially out-of-bounds and | | |need to be adjusted. | +------------------------------------+-------------------------------+ Payload type conversion functions +++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the search configuration phase to convert IME payload to MCE payloads and vice-versa. +------------------------------------+-------------------------------+ |intel_sub_group_avc_mce_payload_t |Convert the IME payload to a | |intel_sub_group_avc_ime_convert_ |generic MCE payload. | |to_mce_payload( | | | intel_sub_group_avc_ime_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |Convert the generic MCE payload| |intel_sub_group_avc_ |to a IME payload. | |mce_convert_to_ime_payload( | | | intel_sub_group_avc_mce_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ Multi-reference cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the search configuration phase to enable multi-reference image costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_ | |inter_base_multi_reference_penalty( |set_inter_base_multi_reference_| | uchar reference_base_penalty, |penalty(..) the payload | | intel_sub_group_avc_ime_payload_t|conversions with to/from MCE | | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ Inter shape and direction cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the search configuration phase to enable shape costing. +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_set_ | |inter_shape_penalty( |inter_shape_penalty(..) | | ulong packed_shape_cost, |with the payload conversions | | intel_sub_group_avc_ime_payload_t |to/from MCE types. See MCE | | payload ) |version for description. | | | | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_set_ | |inter_direction_penalty( |inter_direction_penalty(..) | | uchar direction_cost, |with the payload conversions | | intel_sub_group_avc_ime_payload_t|to/from MCE types. See MCE | | payload ) |version for description. | +-------------------------------------+------------------------------+ Inter motion vector cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the search configuration phase to enable motion vector costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_set_ | |motion_vector_cost_function( |motion_vector_cost_function(..)| | ulong packed_cost_center_delta, |with the payload conversions | | uint2 packed_cost_table, |to/from MCE types. See MCE | | uchar cost_precision, |version for description. | | intel_sub_group_avc_ime_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ Miscellaneous property configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the search configuration phase to enable miscellaneous properties setting in the payload. +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_set_ | |source_interlaced_field_polarity( |source_interlaced_field_ | | uchar src_field_polarity, |polarity(..) with the result | | intel_sub_group_avc_ime_payload_t|conversions to/from MCE | | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_ | |single_reference_interlaced_field_ |set_single_reference_ | |polarity( |interlaced_field_polarity(..) | | uchar ref_field_polarity, |with the result conversions | | intel_sub_group_avc_ime_payload_t|to/from MCE types. See MCE | | payload ) |version for description. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_set_ | |dual_reference_interlaced_field_ |dual_reference_interlaced_ | |polarities( |field_polarities(..) with the | | uchar fwd_ref_field_polarity, |result conversions to/from MCE | | uchar bwd_ref_field_polarity, |types. See MCE version for | | intel_sub_group_avc_ime_payload_t|description. | | payload ) | | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |Specify the maximum number of | |intel_sub_group_avc_ime_set_ |motion vectors allowed for the | |max_motion_vector_count( |current MB. The default setting| | uchar max_motion_vector_count, |is 32. Any other value may | | intel_sub_group_avc_ime_payload_t|alter the MB partitioning | | payload ) |decision. The IME operation | | |will compute the best allowed | | |partitioning such that the | | |number of sub-block motion | | |vectors will not exceed | | |max_motion_vector_count. | | | | | |The value of max_motion_vector_| | |count(..) specifies the maximum| | |number of motion vectors | | |allowed for the current MB. It | | |must be in the range [1, 32], | | |otherwise the results are | | |undefined. | | | | | |NOTE: This can be used to | | |handle the restriction for | | |certain profiles for AVC in | | |that the maximum number of | | |motion vectors allowed for two | | |consecutive MBs can only be 16.| +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |Update the input payload to | |intel_sub_group_avc_ime_set_ |disable a mix of forward and | |unidirectional_mix_disable( |backward MVs in the result. | | intel_sub_group_avc_ime_payload_t| | | payload ) |Default is to enable it. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |Specifies the threshold value | |intel_sub_group_avc_ime_set_early_ |of a distortion compute of a | |search_termination_threshold( |16x16 partition of a MB for a | | uchar threshold, |single-reference search, below | | intel_sub_group_avc_ime_payload_t|which no more searching is | | payload ) |performed for the MB. | | | | | |The value of threshold is | | |specified in U4U4 format and | | |the integer value must fit | | |within 14 bits. | | | | | |The input payload must have | | |been configured for a single- | | |reference search with the 16x16| | |partition enabled for this | | |threshold to be set, or else | | |the results are undefined. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |Set the (16) SAD weights for | |intel_sub_group_avc_ime_set_ |each 4x4 sub-block. | |weighted_sad( | | | uint packed_sad_weights, |These values are used to | | intel_sub_group_avc_ime_payload_t|decrease the SAD magnitude of | | payload ) |each 4x4 sub-block by dividing | | |the SAD of 4x4 sub-block of the| | |source MB by its mapped weight.| | | | | |It requires a partition_mask of| | |16x16 and forward search window| | |configuration. | | | | | |The weighting pattern used is | | |the traditional Z order for | | |each 4x4 block. Weighted-SAD | | |Control Mapping: | | | | | | 0 1 4 5 | | | 2 3 6 7 | | | 8 9 C D | | | A B E F | | | | | |Each weight is of 2 bits | | |represented in a packed format | | |in packed_sad_weights for each | | |of the 4x4 blocks in Z order. | | | | | |A prior call to | | |intel_sub_group_avc_ime_ | | |set_single_reference(..) set up| | |for the forward reference image| | |is required. | | | | | |This feature is mainly intended| | |for improved block matching in | | |image-rate conversion (FRC) | | |kernels. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ime_payload_t |This is a wrapper for | |intel_sub_group_avc_ime_set_ |intel_sub_group_avc_mce_set_ac_| |ac_only_haar( |only_haar(..) with the payload | | intel_sub_group_avc_ime_payload_t|conversions to/from MCE | | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ Evaluation phase functions ++++++++++++++++++++++++++ These built-in functions perform the evaluation of the IME operation configured in the payload with a VME media sampler and return the results. +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_t |Evaluate the basic IME | |intel_sub_group_avc_ime_evaluate_ |operation with a single | |with_single_reference( |reference and return its | | read_only image2d_t src_image, |results. The IME payload must | | read_only image2d_t ref_image, |have been configured with | | sampler_t vme_media_sampler, |intel_sub_group_avc_ime_set_ | | intel_sub_group_avc_ime_payload_t |single_reference(..). | | payload ) | | | |The parameter ref_image must | | |be a valid forward image | | |kernel parameter per the | | |ordering conventions for the | | |kernel parameter list. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_t |Evaluate the basic IME | |intel_sub_group_avc_ime_evaluate_ |operation with dual reference | |with_dual_reference( |and return its results. The | | read_only image2d_t src_image, |IME payload must have been | | read_only image2d_t fwd_ref_image,|configured with | | read_only image2d_t bwd_ref_image,|intel_sub_group_avc_ime_set_ | | sampler_t vme_media_sampler, |dual_reference(..). | | intel_sub_group_avc_ime_payload_t | | | payload ) |The parameter fwd_ref_image[ | | |bwd_ref_image] must be a valid| | |forward[backward] image kernel| | |parameter per the ordering | | |conventions for the kernel | | |parameter list. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_ |Evaluate the single reference | |single_reference_streamout_t |IME operation with streamout | |intel_sub_group_avc_ime_evaluate_ |and return its results. The | |with_single_reference_streamout( |IME payload must have been | | read_only image2d_t src_image, |configured with | | read_only image2d_t ref_image, |intel_sub_group_avc_ime_set_ | | sampler_t vme_media_sampler, |single_reference(..). | | intel_sub_group_avc_ime_payload_t | | | payload ) |The parameter ref_image must | | |be a valid forward image | | |kernel parameter per the | | |ordering conventions for the | | |kernel parameter list. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_ |Evaluate the dual reference | |dual_reference_streamout_t |IME operation with streamout | |intel_sub_group_avc_ime_evaluate_ |and return its results. The | |with_dual_reference_streamout( |IME payload must have been | | read_only image2d_t src_image, |configured with | | read_only image2d_t fwd_ref_image,|intel_sub_group_avc_ime_set_ | | read_only image2d_t bwd_ref_image,|dual_reference(..). | | sampler_t vme_media_sampler, | | | intel_sub_group_avc_ime_payload_t |The parameter fwd_ref_image[ | | payload ) |bwd_ref_image] must be a valid| | |forward[backward] image kernel| | |parameter per the ordering | | |conventions for the kernel | | |parameter list. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_t |Evaluate the single reference | |intel_sub_group_avc_ime_evaluate_ |IME operation with streamin | |with_single_reference_streamin( |and return its results. The | | read_only image2d_t src_image, |IME payload must have been | | read_only image2d_t ref_image, |configured with | | sampler_t vme_media_sampler, |intel_sub_group_avc_ime_set_ | | intel_sub_group_avc_ime_payload_t|single_reference(..). | | payload, | | | intel_sub_group_avc_ime_single_ |The parameter ref_image must | | reference_streamin_t |be a valid forward image | | streamin_components ) |kernel parameter per the | | |ordering conventions for the | | |kernel parameter list. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_t |Evaluate the dual reference | |intel_sub_group_avc_ime_evaluate_ |IME operation with streamin | |with_dual_reference_streamin( |and return its results. The | | read_only image2d_t src_image, |IME payload must have been | | read_only image2d_t fwd_ref_image,|configured with | | read_only image2d_t bwd_ref_image,|intel_sub_group_avc_ime_set_ | | sampler_t vme_media_sampler, |dual_reference(..). | | intel_sub_group_avc_ime_payload_t | | | payload, |The parameter fwd_ref_image[ | | intel_sub_group_avc_ime_dual_ |bwd_ref_image] must be a valid| | reference_streamin_t |forward[backward] image kernel| | streamin_components |parameter per the ordering | | ) |conventions for the kernel | | |parameter list. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_ |Evaluate the single reference | |single_reference_streamout_t |IME operation with streamin | |intel_sub_group_avc_ime_evaluate_ |and streamout and return its | |with_single_reference_streaminout( |results. The IME payload must | | read_only image2d_t src_image, |have been configured with | | read_only image2d_t ref_image, |intel_sub_group_avc_ime_set_ | | sampler_t vme_media_sampler, |single_reference(..). | | intel_sub_group_avc_ime_payload_t | | | payload, |The parameter ref_image must | | intel_sub_group_avc_ime_single_ |be a valid forward image | | reference_streamin_t |kernel parameter per the | | streamin_components |ordering conventions for the | | ) |kernel parameter list. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_ |Evaluate the dual reference | |dual_reference_streamout_t |IME operation with streamin | |intel_sub_group_avc_ime_evaluate_ |and streamout and return its | |with_dual_reference_streaminout( |results. The IME payload must | | read_only image2d_t src_image, |have been configured with | | read_only image2d_t fwd_ref_image,|intel_sub_ | | read_only image2d_t bwd_ref_image,|group_avc_ime_set_dual_ | | sampler_t vme_media_sampler, |reference(..). | | intel_sub_group_avc_ime_payload_t | | | payload, |The parameter fwd_ref_image[ | | intel_sub_group_avc_ime_dual_ |bwd_ref_image] must be a valid| | reference_streamin_t |forward[backward] image kernel| | streamin_components |parameter per the ordering | | ) |conventions for the kernel | | |parameter list. | +-------------------------------------+------------------------------+ Result type conversion functions +++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the evaluation phase to convert IME results to MCE results and vice-versa. +-----------------------------------+--------------------------------+ |intel_sub_group_avc_mce_result_t |Convert the IME result into a | |intel_sub_group_avc_ime_ |MCE result. | |convert_to_mce_result( | | | intel_sub_group_avc_ime_result_t| | | result ) | | +-----------------------------------+--------------------------------+ |intel_sub_group_avc_ime_result_t |Convert the MCE result into an | |intel_sub_group_avc_mce_ |IME result. | |convert_to_ime_result( | | | intel_sub_group_avc_mce_result_t| | | result ) | | +-----------------------------------+--------------------------------+ Result processing phase functions +++++++++++++++++++++++++++++++++ These built-in functions are called following the evaluation phase to extract the various result components from an IME evaluation result. +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_ |Return the streamed out BMVs | |single_reference_streamin_t |and distortions from the input| |intel_sub_group_avc_ime_ |result from a single reference| |get_single_reference_streamin( |streamout IME operation that | | intel_sub_group_avc_ime_result_ |can be used as streamin input | | single_reference_streamout_t |for a subsequent IME | | result ) |operation. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_ |Return the streamed out BMVs | |dual_reference_streamin_t |and distortions from the input| |intel_sub_group_avc_ime_get_ |result from a dual reference | |dual_reference_streamin( |streamout IME operation that | | intel_sub_group_avc_ime_result_ |can be used as streamin input | | dual_reference_streamout_t |for a subsequent IME | | result ) |operation. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_t |Strip out the single reference| |intel_sub_group_avc_ime_strip_ |streamout BMVs and distortions| |single_reference_streamout( |from the streamout results and| | intel_sub_group_avc_ime_result_ |return the rest. | | single_reference_streamout_t | | | result ) | | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ime_result_t |Strip out the dual reference | |intel_sub_group_avc_ime_strip_ |streamout MVs and distortions | |dual_reference_streamout( |from the streamout results and| | intel_sub_group_avc_ime_result_ |return the rest. | | dual_reference_streamout_t | | | result ) | | +-------------------------------------+------------------------------+ |uint intel_sub_group_avc_ime_get_ |Get the packed motion vectors | |streamout_major_shape_motion_vectors(|for the input major shape from| | intel_sub_group_avc_ime_result_ |the IME single reference | | single_reference_streamout_t |streamout results. | | result, | | | uchar major_shape ) |The parameter major_shape must| | |be a valid inter macro-block | | |major shape enumeration value | | |and a compile-time constant. | | | | | |Up to 4 packed MVs are | | |returned, one per | | |work-item. If the major shape | | |is: | | |- 16x6, then one packed MV | | | is returned by work-item 0 | | |- 16x8, or 8x16, then two | | | packed MVs are returned by | | | work-items 0 and 1 | | |- 8x8, then four packed MVs | | | are returned by work-items | | | 0 to 3. | +-------------------------------------+------------------------------+ |ushort intel_sub_group_avc_ime_get_ |Get the distortions for the | |streamout_major_shape_distortions( |input major shape from the IME| | intel_sub_group_avc_ime_result_ |single reference streamout | | single_reference_streamout_t |results. | | result, | | | uchar major_shape ) |The parameter major_shape must| | |be a valid inter macro-block | | |major shape enumeration value | | |and a compile-time constant. | | | | | |Up to 4 distortions are | | |returned, one per work-item in| | |the same format as for motion | | |vectors as described above. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |Get the reference identifiers | |streamout_major_shape_reference_ids( |for the input major shape and | | intel_sub_group_avc_ime_result_ |direction from the IME dual | | single_reference_streamout_t |reference streamout results. | | result, | | | uchar major_shape ) |The parameter major_shape must| | |be a valid inter macro-block | | |major shape enumeration value | | |and a compile-time constant. | | | | | |Up to 4 reference identifiers | | |are returned, one per | | |work-item in the same format | | |as for motion vectors as | | |described above. | +-------------------------------------+------------------------------+ |uint intel_sub_group_avc_ime_get_ |Get the packed motion vectors | |streamout_major_shape_motion_vectors(|for the input major shape and | | intel_sub_group_avc_ime_result_ |direction from the IME dual | | dual_reference_streamout_t result,|reference streamout results. | | uchar major_shape, | | | uchar direction ) |The parameter major_shape must| | |be a valid inter macro-block | | |major shape enumeration value | | |and a compile-time constant. | | | | | |The parameter direction must | | |be a valid unidirectional | | |inter macro-block major | | |direction value and a | | |compile-time constant. | | | | | |Up to 4 packed MVs are | | |returned, one per work-item in| | |the same format as for motion | | |vectors for single reference | | |streamout as described above. | +-------------------------------------+------------------------------+ |ushort intel_sub_group_avc_ime_get_ |Get the distortions for the | |streamout_major_shape_distortions( |input major shape and | | intel_sub_group_avc_ime_result_ |direction from the IME dual | | dual_reference_streamout_t result,|reference streamout results. | | uchar major_shape, | | | uchar direction ) |The parameter major_shape must| | |be a valid inter macro-block | | |major shape enumeration value | | |and a compile-time constant. | | | | | |The parameter direction must | | |be a valid unidirectional | | |inter macro-block major | | |direction value and a | | |compile-time constant. | | | | | |Up to 4 distortions are | | |returned, one per work-item in| | |the same format as for motion | | |vectors as described above. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |Get the reference identifiers | |streamout_major_shape_reference_ids( |for the input major shape and | | intel_sub_group_avc_ime_result_ |direction from the IME dual | | dual_reference_streamout_t result,|reference streamout results. | | uchar major_shape, | | | uchar direction ) | | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |Get the bitmask indicating | |border_reached( |whether any border of | | uchar image_select, |forward/backward reference | | intel_sub_group_avc_ime_result_t |image is reached by one or | | result ) |more MVs in the winning inter | | |shape. The bitmask values are | | |as per the inter border | | |reached enumeration values. | | | | | |The search window must have | | |been configured for a forward | | |reference if image_select is | | |set as | | |CLK_AVC_ME_FRAME_FORWARD_INTEL| | |and with a backward reference | | |if image_select is set as | | |CLK_AVC_ME_FRAME_BACKWARD_ | | |INTEL. | | | | | |The value of image_select is | | |either | | |CLK_AVC_ME_FRAME_FORWARD_INTEL| | |or CLK_AVC_ME_FRAME_BACKWARD_ | | |INTEL and must be a | | |compile-time constant. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |Get the indication that the | |truncated_search_indication( |search operation was prevented| | intel_sub_group_avc_ime_result_t |from providing the lowest | | result ) |distortion solution due to the| | |tighter constraints on the | | |maximum number of MB motion | | |vectors configured for the | | |search operation. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |Get the indication that | |unidirectional_early_search_ |unidirectional search | |termination( |operation terminated early | | intel_sub_group_avc_ime_result_t |because the configured | | result ) |distortion threshold was met. | +-------------------------------------+------------------------------+ |uint intel_sub_group_avc_ime_get_ |Get the 16x16 motion vector | |weighting_pattern_minimum_ |corresponding to the minimum | |motion_vector( |16x16 distortion when applying| | intel_sub_group_avc_ime_result_t |the traditional Z-order SAD | | result ) |weighting pattern. | | | | | |This can only be called if a | | |SAD weighting pattern was set | | |prior to evaluation using | | |intel_sub_group_avc_ime_set_ | | |weighted_sad(..). | | | | | |The argument for parameter | | |"pattern_id" should be a | | |compile-time constant. | +-------------------------------------+------------------------------+ |ushort intel_sub_group_avc_ime_get_ |Get the minimum 16x16 | |weighting_pattern_minimum_distortion(|distortion when applying the | | intel_sub_group_avc_ime_result_t |traditional Z-order SAD | | result ) |weighting pattern. | +-------------------------------------+------------------------------+ |ulong intel_sub_group_avc_ime_get_ |This is a wrapper for | |motion_vectors( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |motion_vectors(..) with the | | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |ushort intel_sub_group_avc_ime_get_ |This is a wrapper for | |inter_distortions( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |inter_distortions(..) with the| | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |ushort intel_sub_group_avc_ime_get_ |This is a wrapper for | |best_inter_distortion( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |best_inter_distortion(..)with | | result ) |the result conversions to/from| | |MCE types. See MCE version for| | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |This is a wrapper for | |inter_major_shape( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |inter_major_shape(..) with the| | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |This is a wrapper for | |inter_minor_shapes( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |inter_minor_shapes(..) with | | result ) |the result conversions to/from| | |MCE types. See MCE version for| | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |This is a wrapper for | |inter_directions( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |inter_directions(..) with the| | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |This is a wrapper for | |inter_motion_vector_count( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |inter_motion_vector_count(..) | | result ) |with the result conversions | | |to/from MCE types. See MCE | | |version for description. | +-------------------------------------+------------------------------+ |uint intel_sub_group_avc_ime_get_ |This is a wrapper for | |inter_reference_ids( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ime_result_t |inter_reference_ids(..) with | | result ) |the result conversions to/from| | |MCE types. See MCE version for| | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ime_get_ |This is a wrapper for | |inter_reference_interlaced_ |intel_sub_group_avc_mce_get_ | |field_polarities( |inter_reference_interlaced_ | | uint packed_reference_ids, |field_polarities(..) with the| | uint packed_reference_parameter_ |result conversions to/from MCE| | field_polarities, |types. See MCE version for | | intel_sub_group_avc_ime_result_t |description. | | result ) | | +-------------------------------------+------------------------------+ ______________________________________________________________________ REF built-in functions ---------------------- A set of ordered phases of functions are required to be called to evaluate a refinement operation result. Initialization phase functions ++++++++++++++++++++++++++++++ These built-in functions create a properly initialized payload that can be used for further configured for evaluating REF operations. This is a required initial phase. A call to either intel_sub_group_avc_fme_initialize or intel_sub_group_avc_bme_initialize is required. +-----------------------------------+--------------------------------+ |intel_sub_group_avc_ref_payload_t |Return an initialized payload | |intel_sub_group_avc_fme_initialize(|for a VME fractional motion | | ushort2 src_coord, |estimation operation (FME). | | ulong motion_vectors, | | | uchar major_shapes, |The payload is initialized for | | uchar minor_shapes, |progressive frame operations, | | uchar directions, |and the cost configuration | | uchar pixel_resolution, |values and the miscellaneous | | uchar sad_adjustment ) |property values are all | | |initialized to zero. The cost | | |configuration and the | | |miscellaneous property | | |configuration phase functions | | |must be used to override the | | |initial configurations in the | | |payload. | | | | | |The src_coord value represents | | |the 2D offset of the top left | | |corner of the source MB in pixel| | |units in the source image. | | | | | |If the source image is an | | |interlaced scan image, then the | | |bottom field lines are | | |considered as logically | | |overlapping with the top field | | |lines (i.e. the top field MBs | | |are considered as logically | | |overlapping with the bottom MBs)| | |for the purposes for specifying | | |the src_coord value. | | | | | |The parameter motion_vectors | | |contains the BMVs returned by an| | |IME in the same format as | | |returned by | | |intel_sub_group_avc_mce_get_ | | |motion_vectors(..). The MVs are | | |in QPEL units. The X and Y | | |coordinates of each MV must be | | |in the range [-2048.00, | | |2047.75), otherwise the results | | |are undefined. | | | | | |(If this argument value is | | |manually composed, all sub-block| | |MVs must be replicated per its | | |format for each partition. For | | |example for 16x16 partition, all| | |sub-block MVs must be replicated| | |to the same MV, and for 8x8 | | |partition, each 8x8 must have | | |its respective sub-block MVs | | |replicated.) | | | | | |Legal values and format for | | |major_shapes are as per the | | |return value of | | |intel_sub_group_avc_mce_get_ | | |inter_major_shape(..). | | | | | |Legal values and format for | | |minor_shapes are as per the | | |return value of | | |intel_sub_group_avc_mce_get_ | | |inter_minor_shapes(..). | | | | | |Legal values and format for | | |directions are as per the return| | |value of | | |intel_sub_group_avc_mce_get_ | | |inter_directions(..). | | | | | |Legal values for | | |pixel_resolution is either | | |CLK_AVC_ME_SUBPIXEL_MODE_HPEL_ | | |INTEL or | | |CLK_AVC_ME_SUBPIXEL_MODE_QPEL_ | | |INTEL. | | | | | |The legal values for | | |sad_adjustment is specified by | | |its respective enumeration | | |values. | +-----------------------------------+--------------------------------+ |intel_sub_group_avc_ref_payload_t |Return an initialized payload | |intel_sub_group_avc_bme_initialize(|for a VME bidirectional motion | | ushort2 src_coord, |estimation (BME) operation. | | ulong motion_vectors, | | | uchar major_shapes, |The payload is initialized for | | uchar minor_shapes, |progressive frame operations, | | uchar directions, |and the cost configuration | | uchar pixel_resolution, |values and the miscellaneous | | uchar bidirectional_weight, |property values are all | | uchar sad_adjustment ) |initialized to zero. The cost | | |configuration and the | | |miscellaneous property | | |configuration phase functions | | |must be used to override the | | |initial configurations in the | | |payload. | | | | | |If the specified | | |pixel_resolution is a sub-pixel | | |resolution then an implicit FME | | |operation is performed with the | | |BME operation. | | | | | |The src_coord value represents | | |the 2D offset of the top left | | |corner of the source MB in pixel| | |units in the source image. | | | | | |If the source image is an | | |interlaced scan image, then the | | |bottom field lines are | | |considered as logically | | |overlapping with the top field | | |lines (i.e. the top field MBs | | |are considered as logically | | |overlapping with the bottom MBs)| | |for the purposes for specifying | | |the src_coord value. | | | | | |The parameter motion_vectors | | |contains the BMVs returned by an| | |IME in the same format as | | |returned by | | |intel_sub_group_avc_mce_get_ | | |motion_vectors(..). | | | | | |(If this argument value is | | |manually composed, all sub-block| | |MVs must be replicated per its | | |format for each partition. For | | |example for 16x16 partition, all| | |sub-block MVs must be replicated| | |to the same MV, and for 8x8 | | |partition, each 8x8 must have | | |its respective sub-block MVs | | |replicated.) | | | | | |Legal values and format for | | |major_shapes is as per the | | |return value of | | |intel_sub_group_avc_mce_get_ | | |inter_major_shape(..). | | | | | |Legal values and format for | | |minor_shapes is as per the | | |return value of | | |intel_sub_group_avc_mce_get_ | | |inter_minor_shapes(..). | | | | | |Legal values and format for | | |directions are as per the return| | |value of | | |intel_sub_group_avc_mce_get_ | | |inter_directions(..). | | | | | |Legal values for | | |pixel_resolution is either | | |CLK_AVC_ME_SUBPIXEL_MODE_ | | |INTEGER_INTEL, | | |CLK_AVC_ME_SUBPIXEL_MODE_ | | |HPEL_INTEL or | | |CLK_AVC_ME_SUBPIXEL_MODE_ | | |QPEL_INTEL. | | | | | |Legal values for | | |bidirectional_weight and sad | | |adjustment is as per their | | |respective enumeration values. | | | | | |The legal values for | | |sad_adjustment is specified by | | |its respective enumeration | | |values. | +-----------------------------------+--------------------------------+ Payload type conversion functions +++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the initialization phase to convert REF payload to MCE payloads and vice-versa. +-------------------------------------+------------------------------+ |intel_sub_group_avc_mce_payload_t |Convert the REF payload to a | |intel_sub_group_avc_ref_ |generic ME payload. | |convert_to_mce_payload( | | | intel_sub_group_avc_ref_payload_t | | | payload ) | | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ref_payload_t |Convert the generic ME payload| |intel_sub_group_avc_mce_ |to an FBR payload. | |convert_to_ref_payload( | | | intel_sub_group_avc_mce_payload_t | | | payload ) | | +-------------------------------------+------------------------------+ Multi-reference cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the initialization phase to enable multi-reference image costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_ | |inter_base_multi_reference_penalty( |set_inter_base_multi_reference_| | uchar reference_base_penalty, |penalty(..) the payload | | intel_sub_group_avc_ref_payload_t|conversions with to/from MCE | | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ Inter shape and direction cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the initialization phase to enable shape costing. +-------------------------------------+------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_set_ | |inter_shape_penalty( |inter_shape_penalty(..) with | | ulong packed_shape_cost, |the payload conversions | | intel_sub_group_avc_ref_payload_t |to/from MCE types. See MCE | | payload ) |version for description. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_set_ | |inter_direction_penalty( |inter_direction_penalty(..) | | uchar direction_cost, |with the payload conversions | | intel_sub_group_avc_ref_payload_t |to/from MCE types. See MCE | | payload ) |version for description. | +-------------------------------------+------------------------------+ Inter motion vector cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the initialization phase to enable motion vector costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_set_ | |motion_vector_cost_function( |motion_vector_cost_function | | ulong packed_cost_center_delta, |with the payload conversions | | uint2 packed_cost_table, |to/from MCE types. See MCE | | uchar cost_precision, |version for description. | | intel_sub_group_avc_ref_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ Miscellaneous property configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the initialization phase to enable miscellaneous properties setting in the payload. +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_set_ | |source_interlaced_field_polarity( |source_interlaced_field_ | | uchar src_field_polarity, |polarity(..) with the result | | intel_sub_group_avc_ref_payload_t|conversions to/from MCE | | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_set_ | |single_reference_interlaced_ |single_reference_ | |field_polarity( |interlaced_field_polarity(..) | | uchar ref_field_polarity, |with the result conversions | | intel_sub_group_avc_ref_payload_t|to/from MCE types. See MCE | | payload ) |version for description. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_set_ | |dual_reference_interlaced_ |dual_reference_interlaced_ | |field_polarities( |field_polarities(..) with the | | uchar fwd_ref_field_polarity, |result conversions to/from MCE | | uchar bwd_ref_field_polarity, |types. See MCE version for | | intel_sub_group_avc_ref_payload_t|description. | | payload ) | | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |Update the input payload to | |intel_sub_group_avc_ref_set_ |disable a mix of bidirectional | |bidirectional_mix_disable( |and unidirectional MVs in the | | intel_sub_group_avc_ref_payload_t|result. | | payload ) | | | |Default is to enable it. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |Update the input payload to do | |intel_sub_group_avc_ref_set_ |enable bilinear filter | |bilinear_filter_enable( |interpolation instead of 4-tap | | intel_sub_group_avc_ref_payload_t|filter interpolation. Default | | payload ) |is 4-tap filter interpolation. | | | | | |This should not be called if | | |the payload was initialized | | |with integer pixel resolution. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_payload_t |This is a wrapper for | |intel_sub_group_avc_ref_set_ |intel_sub_group_avc_mce_set_ | |ac_only_haar( |ac_only_haar(..) with the | | intel_sub_group_avc_ref_payload_t|payload conversions to/from MCE| | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ Evaluation phase functions ++++++++++++++++++++++++++ These built-in functions perform the evaluation of the REF operation configured in the payload with a VME media sampler and return the results. +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_result_t |Evaluate the basic REF | |intel_sub_group_avc_ref_ |operation with single reference| |evaluate_with_single_reference( |and return its results. | | read_only image2d_t src_image, | | | read_only image2d_t ref_image, |The parameter ref_image must be| | sampler_t vme_media_sampler, |a valid forward image kernel | | intel_sub_group_avc_ref_payload_t|parameter per the ordering | | payload ) |conventions for the kernel | | |parameter list. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_result_t |Evaluate the basic REF | |intel_sub_group_avc_ref_ |operation with dual reference | |evaluate_with_dual_reference( |and return its results. | | read_only image2d_t src_image, | | | read_only image2d_t fwd_ref_image,|The parameter | | read_only image2d_t bwd_ref_image,|fwd_ref_image[bwd_ref_image] | | sampler_t vme_media_sampler, |must be a valid | | intel_sub_group_avc_ref_payload_t |forward[backward] image kernel | | payload ) |parameter per the ordering | | |conventions for the kernel | | |parameter list. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_result_t |Evaluate the basic REF | |intel_sub_group_avc_ref_ |operation with multi references| |evaluate_with_multi_reference( |and return its results. | | read_only image2d_t src_image, | | | uint packed_reference_ids, |A unique pair of reference | | sampler_t vme_media_sampler, |identifiers (indicating unique | | intel_sub_group_avc_ref_payload_t|forward/backward reference | | payload ) |images) may be specified for | | |each of the allowed major | | |partitions using | | |packed_dual_ref_ids. | | | | | |The value of | | |packed_reference_ids is a | | |integer with the following bits| | |specifying the values for the | | |pair of reference images for | | |each major partition. | | | | | |3:0 => Fwd reference block 0 | | |7:4 => Bwd reference block 0 | | |11:8 => Fwd reference block 1 | | |15:12 => Bwd reference block 1 | | |19:16 => Fwd reference block 2 | | |23:20 => Bwd reference block 2 | | |27:24 => Fwd reference block 3 | | |31:28 => Bwd reference block 3 | | | | | |A forward[backward] reference | | |idenitifer value of 'n' | | |indicates the forward[backward]| | |image from the 'n'th pair of | | |forward/backward reference | | |images, with the value of 'n' | | |ranging from 0 to 15. | | | | | |If the REF operation is | | |configured with only forward | | |reference images then, the | | |values of the backward | | |reference identifiers are not | | |used. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference identifier | | |pairs must be replicated. For | | |example, for a 16x16 block, all| | |four pair of reference | | |identifiers must be replicated | | |to the value of the first pair | | |for block 0. | | | | | |The value for the | | |packed_reference_ids argument | | |is obtained by calling | | |intel_sub_group_avc_ime_get_ | | |inter_reference_ids(..) for | | |the preceding IME operation's | | |result. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_ref_result_t |Evaluate the basic REF | |intel_sub_group_avc_ref_ |operation with multi references| |evaluate_with_multi_reference( |and return its results. This | | read_only image2d_t src_image, |is used for interlaced source | | uint packed_reference_ids, |and reference images. | | uchar packed_reference_ | | | field_polarities, |A unique pair of reference | | sampler_t vme_media_sampler, |identifiers (indicating unique | | intel_sub_group_avc_ref_payload_t|forward/backward reference | | payload ) |images) may be specified for | | |each of the allowed major | | |partitions using | | |packed_reference_ids. | | | | | |The value of | | |packed_reference_ids is an | | |integer with the following bits| | |specifying the values for the | | |pair of reference images for | | |each major partition. | | | | | |3:0 => Fwd reference block 0 | | |7:4 => Bwd reference block 0 | | |11:8 => Fwd reference block 1 | | |15:12 => Bwd reference block 1 | | |19:16 => Fwd reference block 2 | | |23:20 => Bwd reference block 2 | | |27:24 => Fwd reference block 3 | | |31:28 => Bwd reference block 3 | | | | | |A forward[backward] reference | | |idenitifer value of 'n' | | |indicates the forward[backward]| | |image from the 'n'th pair of | | |forward/backward reference | | |images, with the value of 'n' | | |ranging from 0 to 15. | | | | | |If the REF operation is | | |configured with only forward | | |reference images then, the | | |values of the backward | | |reference identifiers are not | | |used. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference identifier | | |pairs must be replicated. For | | |example, for a 16x16 block, all| | |four pair of reference | | |identifiers must be replicated | | |to the value of the first pair | | |for block 0. | | | | | |The value for the | | |packed_reference_ids argument | | |is obtained by calling | | |intel_sub_group_avc_ime_get_ | | |inter_reference_ids(..) for | | |the preceding IME operation's | | |result. | | | | | |Reference field polarities for | | |forward and backward reference | | |images are specified for each | | |of the allowed major partitions| | |using packed_reference_field_ | | |polarities. | | | | | |The value of | | |packed_reference_field_ | | |polarities is an integer with | | |the following bits specifying | | |the reference field polarities | | |for the major partitions. | | | | | |0 : Fwd reference block 0 | | |1 : Fwd reference block 1 | | |2 : Fwd reference block 2 | | |3 : Fwd reference block 3 | | |4 : Bwd reference block 0 | | |5 : Bwd reference block 1 | | |6 : Bwd reference block 2 | | |7 : Bwd reference block 3 | | | | | |If the dual-reference | | |evaluation functions are not | | |used, then the values of the | | |backward reference field | | |polarities are not used. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference field | | |polarities are replicated. For | | |example, for a 16x16 block all | | |four pairs of reference field | | |polarities are replicated to | | |the value of the first pair for| | |block 0. | +------------------------------------+-------------------------------+ Result type conversion functions +++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the evaluation phase to convert REF results to MCE results and vice-versa. +-----------------------------------+--------------------------------+ |intel_sub_group_avc_mce_result_t |Convert the REF result into a | |intel_sub_group_avc_ref_ |MCE result. | |convert_to_mce_result( | | | intel_sub_group_avc_ref_result_t| | | result ) | | +-----------------------------------+--------------------------------+ |intel_sub_group_avc_ref_result_t |Convert the MCE result into an | |intel_sub_group_avc_mce_ |REF result. | |convert_to_ref_result( | | | intel_sub_group_avc_mce_result_t| | | result ) | | +-----------------------------------+--------------------------------+ Result processing phase functions +++++++++++++++++++++++++++++++++ These built-in functions are called following the evaluation phase to extract the various result components from an REF evaluation result. +-------------------------------------+------------------------------+ |ulong intel_sub_group_avc_ref_get_ |This is a wrapper for | |motion_vectors( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |motion_vectors(..) with the | | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |ushort intel_sub_group_avc_ref_get_ |This is a wrapper for | |inter_distortions( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |inter_distortions(..) with the| | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |ushort intel_sub_group_avc_ref_get_ |This is a wrapper for | |best_inter_distortion( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |best_inter_distortion(..)with | | result ) |the result conversions to/from| | |MCE types. See MCE version for| | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ref_get_ |This is a wrapper for | |inter_major_shape( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |inter_major_shape(..) with the| | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ref_get_ |This is a wrapper for | |inter_minor_shapes( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |inter_minor_shapes(..) with | | result ) |the result conversions to/from| | |MCE types. See MCE version for| | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ref_get_ |This is a wrapper for | |inter_directions( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |inter_directions(..) with the| | result ) |result conversions to/from MCE| | |types. See MCE version for | | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ref_get_ |This is a wrapper for | |inter_motion_vector_count( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |inter_motion_vector_count(..) | | result ) |with the result conversions | | |to/from MCE types. See MCE | | |version for description. | +-------------------------------------+------------------------------+ |uint intel_sub_group_avc_ref_get_ |This is a wrapper for | |inter_reference_ids( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_ref_result_t |inter_reference_ids(..) with | | result ) |the result conversions to/from| | |MCE types. See MCE version for| | |description. | +-------------------------------------+------------------------------+ |uchar intel_sub_group_avc_ref_get_ |This is a wrapper for | |inter_reference_interlaced_ |intel_sub_group_avc_mce_get_ | |field_polarities( |inter_reference_interlaced_ | | uint packed_reference_ids, |field_polarities(..) with the| | uint packed_reference_parameter_ |result conversions to/from MCE| | field_polarities, |types. See MCE version for | | intel_sub_group_avc_ref_result_t |description. | | result ) | | +-------------------------------------+------------------------------+ ______________________________________________________________________ SIC built-in functions ---------------------- A set of ordered phases of functions are required to be called to evaluate a skip check or intra estimation operation result. Initialization phase functions ++++++++++++++++++++++++++++++ These built-in functions create a properly initialized payload that can be used for further configured for evaluating SIC operations. This is a required initial phase. +-----------------------------------+--------------------------------+ |intel_sub_group_avc_sic_payload_t |Return an initialized payload | |intel_sub_group_avc_sic_initialize(|for a VME SIC operation. | | ushort2 src_coord ) | | | | | | |If the source image is an | | |interlaced scan image, then the | | |bottom field lines are | | |considered as logically | | |overlapping with the top field | | |lines (i.e. the top field MBs | | |are considered as logically | | |overlapping with the bottom MBs)| | |for the purposes for specifying | | |the src_coord value. | | | | | |If the SIC operation is being | | |configured for chroma based | | |intra estimation, then the x and| | |y coordinates of src_coord must | | |be multiples of 2. | +-----------------------------------+--------------------------------+ Configuration phase functions +++++++++++++++++++++++++++++ These built-in functions allow for configuration of a skip check and/or intra estimation operation. This is a required phase following the initialization phase to configure the skip check or intra estimation operation. A call to intel_sub_group_avc_sic_configure_ skc(..) or intel_sub_group_avc_sic_configure_ipe(..) is required. Both intel_sub_group_avc_sic_configure_skc(..) and intel_sub_group_avc_ sic_configure_ipe(..) may be called to initialize the payload to perform both skip checks and intra estimation as part for the same SIC evaluation, but if called together intel_sub_group_avc_sic_configure_ skc(..) must be called first in the call sequence. +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Configure the SIC payload for | |intel_sub_group_avc_sic_ |(uni or bi-directional) skip | |configure_skc( |checks. | | uint skip_block_partition_type, | | | uint skip_motion_vector_mask, |The legal values for | | ulong motion_vectors, |skip_block_partition_type must | | uchar bidirectional_weight, |be one of the specified | | uchar skip_sad_adjustment, |partition mask enumeration | | intel_sub_group_avc_sic_payload_t|values. | | payload ) | | | |Legal values for | | |skip_motion_vector_mask can be | | |composed using the '|' operator| | |from the enumeration values | | |defined for it; both | | |unidirectional and | | |bidirectional skip vectors can | | |be specified uniquely for each | | |major partition (16x16 or 8x8) | | |by an appropriate selection of | | |the skip motion vector mask | | |enumeration values. If the | | |16x16 skip_block_partition_type| | |is specified, then only the | | |16x16 enumeration values may be| | |used, else only the 8x8 | | |enumeration values may be used.| | | | | |For convenience, one of the | | |following two mechanisms can be| | |used to obtain the value for | | |the skip_motion_vector_mask | | |argument. | | | | | |If the shape and directions are| | |all compile-time constants, | | |then either one of the two | | |following macros may be used to| | |set this argument's value for | | |16x16 or 8x8 major partitions | | |respectively: | | |- CLK_AVC_ME_SKIP_BLOCK_16x16_ | | | INTEL(DIRECTION) | | |- CLK_AVC_ME_SKIP_BLOCK_8x8_ | | | INTEL(DIRECTION0, DIRECTION1,| | | DIRECTION2, DIRECTION3) | | | | | |The DIRECTION{0-3} argument | | |values must be one of the inter| | |macro-block major direction | | |values. The directions are | | |specified for each major | | |partition in traditional | | |Z-order. | | | | | |If the shape or directions are | | |not compile-time constants, | | |then the helper function | | |intel_sub_group_avc_sic_get_ | | |motion_vector_mask(..) may be | | |used to set this argument's | | |value. | | | | | |The parameter motion_vectors | | |specifies the input packed | | |BMVs. Either the forward or | | |backward is ignored if the | | |setting in | | |skip_motion_vector_mask is | | |backward or forward | | |respectively. If the setting is| | |bidirectional, then both the | | |forward and backward motion | | |vectors will be used. If the | | |skip_block_partition_type is | | |16x16, work-item 0 in the | | |subgroup provides the BMV, and | | |if the | | |skip_block_partition_type is | | |8x8, work-items 0 to 4 in the | | |subgroup provide the four | | |BMVs. The MVs are in QPEL | | |units. The X and Y coordinates | | |of each MV must be in the range| | |[-2048.00, 2047.75] and | | |[-512.00 to 511.75] | | |respectively, otherwise the | | |results are undefined. | | | | | |Legal values for | | |bidirectional_weight and | | |skip_sad_adjustment are as per | | |their respective enumeration | | |values. If the setting is | | |unidirectional, then the | | |bidirectional_weight parameter | | |value is ignored and can be set| | |to the value '0 '. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Return an initialized payload | |intel_sub_group_avc_sic_ |for a VME luma intra prediction| |configure_ipe( |estimation (IPE) operation. | | uchar luma_intra_partition_mask, | | | uchar |The luma_intra_partition_mask | | intra_neighbour_availabilty, |can be composed from their | | uchar left_edge_luma_pixels, |respective enumeration values | | uchar |using the '&' operator. | | upper_left_corner_luma_pixel, | | | uchar upper_edge_luma_pixels, |Legal values for | | uchar |intra_neighbour_availabilty and| | upper_right_edge_luma_pixels, |intra_sad_adjustment are as per| | uchar intra_sad_adjustment , |their respective enumeration | | intel_sub_group_avc_sic_payload_t|values, based on the intra | | payload ) |neighboring macroblock's to | | |consider in intra mode | | |estimation. | | | | | |The other parameters specify | | |the neighbor edge pixels for | | |the left, top-left corner, top | | |and top right edges with each | | |work-item providing each pixel | | |value. These pixels values are | | |used to perform the intra mode | | |estimation. | | | | | |For the left and top edge | | |pixels, successive subgroup | | |work-items 0 to 15 provide the | | |successive edge pixels. For the| | |top-right edge, successive | | |work-items 0 to 7 provide the | | |successive edge pixels; the | | |pixel values in work-items 8 to| | |15 are ignored. The top-left | | |corner pixel is a uniform pixel| | |value with each work-item | | |providing the same corner | | |pixel. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Return an initialized payload | |intel_sub_group_avc_sic_ |for a VME luma and chroma intra| |configure_ipe( |prediction estimation (IPE) | | uchar luma_intra_partition_mask, |operation. | | uchar intra_neighbour_ | | | availabilty, |The luma_intra_partition_mask | | uchar left_edge_luma_pixels, |can be composed from their | | uchar upper_left_corner_luma_ |respective enumeration values | | pixel, |using the '&' operator. Chroma | | uchar upper_edge_luma_ |intra prediction estimation is | | pixels, |performed on the 8x8 CbCr | | uchar upper_right_edge_luma_ |macroblock. | | pixels, | | | ushort left_edge_chroma_pixels, |Legal values for | | ushort upper_left_corner_chroma_ |intra_neighbour_availabilty and| | pixel, |intra_sad_adjustment are as per| | ushort upper_edge_chroma_pixels, |their respective enumeration | | uchar intra_sad_adjustment, |values, based on the intra | | intel_sub_group_avc_sic_payload_t|neighboring macroblock's to | | payload ) |consider in intra mode | | |estimation. | | | | | |The other parameters specify | | |the neighbor luma and chroma | | |edge pixels for the left, | | |top-left corner, top and | | |top-right (luma only) edges | | |with each work-item providing | | |each pixel value. These pixels | | |values are used to perform the | | |intra mode estimation. | | | | | |For the left and top luma edge | | |pixels successive, subgroup | | |work-items 0 to 15 provide the | | |successive edge pixels. For the| | |top-right luma edge, successive| | |work-items 0 to 7 provide the | | |successive edge pixels; the | | |pixel values in work-items 8 to| | |15 are ignored. The top-left | | |corner pixel is a uniform pixel| | |value with each work-item | | |providing the same corner | | |pixel. | | | | | |For the left and top chroma | | |CbCr pixels, successive | | |subgroup work-items 0 to 7 | | |provide the successive CbCr | | |pixels; the pixel values in | | |work-items 8 to 15 are | | |ignored. The top-left corner | | |pixel is a uniform CbCr pixel | | |value with each work-item | | |providing the same corner CbCr | | |pixel. | +------------------------------------+-------------------------------+ |uint intel_sub_group_avc_sic_get_ |Compose the value for the input| |motion_vector_mask( |argument | | uint skip_block_partition_type, |skip_motion_vector_mask for the| | uchar direction ) |input skip_block_partition_type| | |and direction. | | | | | |The legal values for | | |skip_block_partition_type must | | |be one of the specified | | |partition mask enumeration | | |values (16x16 or 8x8). | | | | | |The direction parameter is a | | |bit field with the directions | | |for the 4 8x8 sub-partitions in| | |traditional Z order, or for | | |only the 16x16 partition. Two | | |bits are reserved for each of | | |the four sub-partitions in | | |row-major order. The 2-bit | | |values are as per the inter | | |macro-block major direction | | |values. If the | | |skip_block_partition_type | | |indicates a 16x16 shape, then | | |only the 1st 2 bits contains | | |the direction, and other bits | | |must be zeroed. | +------------------------------------+-------------------------------+ Payload type conversion functions +++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the configuration phase to convert IME payload to MCE payloads and vice- versa. +------------------------------------+-------------------------------+ |intel_sub_group_avc_mce_payload_t |Convert the SIC payload to a | |intel_sub_group_avc_sic_convert_ |generic MCE payload. | |to_mce_payload( | | | intel_sub_group_avc_sic_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Convert the generic MCE payload| |intel_sub_group_avc_ |to a SIC payload. | |mce_convert_to_sic_payload( | | | intel_sub_group_avc_mce_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ Multi-reference cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the configuration phase to enable multi-reference image costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_ | |inter_base_multi_reference_penalty( |set_inter_base_multi_reference_| | uchar reference_base_penalty, |penalty(..) the payload | | intel_sub_group_avc_sic_payload_t|conversions with to/from MCE | | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ Inter shape and direction cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the configuration phase to enable shape costing. +-------------------------------------+------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_set_ | |inter_shape_penalty( |inter_shape_penalty(..) with | | ulong packed_shape_cost, |the payload conversions | | intel_sub_group_avc_sic_payload_t |to/from MCE types. See MCE | | payload ) |version for description. | +-------------------------------------+------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_set_ | |inter_direction_penalty( |inter_direction_penalty(..) | | uchar direction_cost, |with the payload conversions | | intel_sub_group_avc_sic_payload_t|to/from MCE types. See MCE | | payload ) |version for description. | +-------------------------------------+------------------------------+ Inter motion vector cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the configuration phase to enable motion vector costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_set_ | |motion_vector_cost_function( |motion_vector_cost_function(..)| | ulong packed_cost_center_delta, |with the payload conversions | | uint2 packed_cost_table, |to/from MCE types. See MCE | | uchar cost_precision, |version for description. | | intel_sub_group_avc_sic_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ Intra shape cost configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the configuration phase to enable intra shape costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Set the luma shape cost penalty| |intel_sub_group_avc_sic_set_ |for intra motion estimation. | |intra_luma_shape_penalty( | | | uint packed_shape_cost, |The value of packed_shape_cost | | intel_sub_group_avc_sic_payload_t|is an integer value with the | | payload ) |following bits specifying the | | |shape cost in U4U4 format as | | |follows: | | | | | |7:0 : must be zero | | |15:8 : 16x16 cost | | |23:16 : 8x8 cost | | |31:24 : 4x4 cost | | | | | |The U4U4 decoded integer values| | |must bit fit within 12 bits. | +------------------------------------+-------------------------------+ Intra mode cost configuration phase functions +++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the configuration phase to enable intra mode costing. +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |The value of luma_mode_penalty | |intel_sub_group_avc_sic_set_ |specifies the penalty to be | |intra_luma_mode_cost_function( |applied to the estimated luma | | uchar luma_mode_penalty, |mode if it differs from its | | uint luma_packed_neighbor_modes, |predicted luma mode (based on | | uint luma_packed_non_dc_penalty, |its neighbor intra modes). It | | intel_sub_group_avc_sic_payload_t|is specified in U4U4 format and| | payload ) |must bit in 10 bits. | | | | | |The value of | | |luma_packed_neighbor_modes | | |specifies the values of the | | |already computed top and left | | |neighbor modes for the | | |bordering 4x4 blocks, with the | | |4x4 blocks numbered in the | | |traditional Z-order as shown | | |below. | | | | | | 0 1 4 5 | | | 2 3 6 7 | | | 8 9 C D | | | A B E F | | | | | |the following bits specify the | | |neighbor modes. | | | | | |3:0 => Left neighbor block 5 | | |7:4 => Left neighbor block 7 | | |11:8 => Left neighbor block D | | |15:12 => Left neighbor block F | | |19:16 => Top neighbor block A | | |23:20 => Top neighbor block B | | |27:24 => Top neighbor block E | | |31:28 => Top neighbor block F | | | | | |The value of | | |luma_packed_non_dc_penalty | | |specifies the penalty to be | | |applied for any computed non-DC| | |luma mode for each of the | | |16x16, 8x8, and 4x4 shapes, | | |with the following bits | | |specifying the penalties. | | | | | |7:0 => Intra16x16 non-dc | | | penalty | | |15:8 => Intra8x8 non-dc | | | penalty | | |23:16 => Intra4x4 non-dc | | | penalty | | |31:24 => Must be zero | | | | | |The component values are | | |specified in 8-bit integer | | |format. | | | | | |The intra distortion for each | | |intra luma block can be | | |described by the following | | |formulas: | | | | | |Intra_4x4 SAD (or Haar) + | | |luma_shape_penalty_4x4 + | | |luma_non_dc_4x4_penalty (if not| | |DC) + | | |luma_mode_penalty (if computed | | |mode is not the same predicted | | |mode from neighbor modes) | | | | | |Intra_8x8 SAD (or Haar) + | | |luma_shape_penalty_8x8 + | | |luma_non_dc_8x8_penalty (if not| | |DC) + | | |luma_mode_penalty (if computed | | |mode is not the same predicted | | |mode from neighbor modes ) | | | | | |Intra_16x16 SAD (or Haar) + | | |luma_shape_penalty_16x16 + | | |luma_non_dc_4x4_penalty (if not| | |DC) | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Set the intra chroma mode cost | |intel_sub_group_avc_sic_set_ |function by specifying the | |intra_chroma_mode_cost_function( |penalty to be applied to the | | uchar chroma_mode_penalty, |computed chroma mode. | | intel_sub_group_avc_sic_payload_t| | | payload ) |The chroma_mode_base_penalty is| | |the base penalty to be applied | | |to the computed intra chroma | | |modes. This penalty is in U4U4 | | |format. | | | | | |The U4U4 decoded integer value | | |must fit in 12 bits. | | | | | |The base penalty is scaled | | |based on the computed mode as | | |defined below. | | | | | | DC : 0x | | | HORZ : 1x | | | VERT : 1x | | | PLANE: 2x | | | | | |The intra distortion for each | | |intra 8x8 chroma block can be | | |described by the following | | |formulas: | | | | | |distortion = | | |SAD (or Haar) + | | |chroma_mode_penalty (scaled | | |based on computed mode) | +------------------------------------+-------------------------------+ Miscellaneous property configuration phase functions ++++++++++++++++++++++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the configuration phase to enable miscellaneous properties setting in the payload. +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_set_ | |source_interlaced_field_polarity( |source_interlaced_field_ | | uchar src_field_polarity, |polarity(..) with the result | | intel_sub_group_avc_sic_payload_t|conversions to/from MCE | | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_set_ | |single_reference_interlaced_ |single_reference_ | |field_polarity( |interlaced_field_polarity(..) | | uchar ref_field_polarity, |with the result conversions | | intel_sub_group_avc_sic_payload_t|to/from MCE types. See MCE | | payload ) |version for description. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_set_ | |dual_reference_interlaced_ |dual_reference_interlaced_ | |field_polarities( |field_polarities(..) with the | | uchar fwd_ref_field_polarity, |result conversions to/from MCE | | uchar bwd_ref_field_polarity, |types. See MCE version for | | intel_sub_group_avc_sic_payload_t|description. | | payload ) | | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Update the input payload to do | |intel_sub_group_avc_sic_set_ |enable bilinear filter | |skc_bilinear_filter_enable( |interpolation instead of 4-tap | | intel_sub_group_avc_sic_payload_t|filter interpolation. Default | | payload ) |is 4-tap filter interpolation. | | | | | |This should not be called if | | |the payload was initialized | | |with integer pixel resolution. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |Enable skip check forward | |intel_sub_group_avc_sic_set_ |transform with the specified | |skc_forward_transform_enable( |SAD coefficients thresholds in | | ulong packed_sad_coefficients, |the frequency domain to | | intel_sub_group_avc_sic_payload_t|approximate the effects of | | payload ) |forward quantization. | | | | | |The skip decision will be | | |enhanced to include an accurate| | |AVC forward transform for skip | | |estimation. This feature is in | | |addition to the previous SAD or| | |HAAR skip estimation. The | | |results of the forward | | |transform are compared one | | |coefficient at a time against a| | |user-specified threshold, in | | |the input argument | | |packed_sad_coefficients, to | | |emulate quantization's zeroing | | |effect. The user is returned | | |the count of coefficients that | | |exceeded their threshold along | | |with a sum of the amount | | |exceeded, both grouped at the | | |8x8 block level (i.e. for each | | |8x8 block). | | | | | |The SAD coefficient threshold | | |matrix for a 4x4 transform as | | |shown in the table below is | | |packed into a 64-bit | | |integer. The low 16 bits | | |contains the larger DC | | |threshold. The coefficient | | |thresholds for the remaining 6 | | |AC thresholds in the order of | | |increasing frequency are | | |provided by the successive | | |8-bit bit ranges. | | | | | |0 (DC) 1 (AC) 2 (AC) 3 (AC) | | |1 (AC) 2 (AC) 3 (AC) 4 (AC) | | |2 (AC) 3 (AC) 4 (AC) 5 (AC) | | |3 (AC) 4 (AC) 5 (AC) 6 (AC) | | | | | |This is valid only for SKC | | |operations. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |The raw skip SAD computed | |intel_sub_group_avc_sic_set_ |during the evaluation phase | |block_based_raw_skip_sad( |will be the maximal SAD of | | uchar block_based_skip_type, |individual 4x4 (or 8x8) blocks,| | intel_sub_group_avc_sic_payload_t|instead of the sum of the | | payload ) |entire individual 4x4 block | | |SADs of the MB. | | | | | |The legal values for | | |block_based_skip_type must be | | |one of the specified block | | |based skip type enumeration | | |values. | | | | | |It is valid to call this | | |function only if the payload is| | |configured for a skip check | | |operation by a prior call to | | |intel_sub_group_avc_sic_ | | |configure_skc(..). | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_payload_t |This is a wrapper for | |intel_sub_group_avc_sic_set_ |intel_sub_group_avc_mce_set_ | |ac_only_haar( |ac_only_haar(..) with the | | intel_sub_group_avc_sic_payload_t|payload conversions to/from MCE| | payload ) |types. See MCE version for | | |description. | +------------------------------------+-------------------------------+ Evaluation phase functions ++++++++++++++++++++++++++ These built-in functions perform the evaluation of the SIC operation configured in the payload with a VME media sampler and return the results. +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_result_t |Evaluate the SIC IPE operation | |intel_sub_group_avc_sic_evaluate_ |and return its results. | |ipe( | | | read_only image2d_t src_image, | | | sampler_t vme_media_sampler, | | | intel_sub_group_avc_sic_payload_t| | | payload ) | | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_result_t |Evaluate the SIC operation with| |intel_sub_group_avc_sic_evaluate_ |single reference and return its| |with_single_reference( |results. | | read_only image2d_t src_image, | | | read_only image2d_t ref_image, |The parameter ref_image must be| | sampler_t vme_media_sampler, |a valid forward image kernel | | intel_sub_group_avc_sic_payload_t|parameter per the ordering | | payload ) |conventions for the kernel | | |parameter list. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_result_t |Evaluate the SIC operation with| |intel_sub_group_avc_sic_evaluate_ |dual and return its results. | |with_dual_reference ( | | | read_only image2d_t src_image, |The parameter | | read_only image2d_t fwd_ref_image,|fwd_ref_image[bwd_ref_image] | | read_only image2d_t bwd_ref_image,|must be a valid | | sampler_t vme_media_sampler, |forward[backward] image kernel | | intel_sub_group_avc_sic_payload_t |parameter per the ordering | | payload ) |conventions for the kernel | | |parameter list. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_result_t |Evaluate the SIC operation with| |intel_sub_group_avc_sic_evaluate_ |multi references and return its| |with_multi_reference( |results. | | read_only image2d_t src_image, | | | uint packed_reference_ids, |A pair of unique reference | | sampler_t vme_media_sampler, |identifier (indicating unique | | intel_sub_group_avc_sic_payload_t|forward/backward reference | | payload ) |images) may be specified for | | |each of the allowed major | | |partitions (one 16x16 or four | | |8x8) using packed_dual_ref_ids.| | | | | |The value of | | |packed_reference_ids is a | | |integer with the following bits| | |specifying the values for the | | |pair of reference images for | | |each major partition. | | | | | |3:0 => Fwd reference block 0 | | |7:4 => Bwd reference block 0 | | |11:8 => Fwd reference block 1 | | |15:12 => Bwd reference block 1 | | |19:16 => Fwd reference block 2 | | |23:20 => Bwd reference block 2 | | |27:24 => Fwd reference block 3 | | |31:28 => Bwd reference block 3 | | | | | |A forward[backward] reference | | |idenitifer value of 'n' | | |indicates the forward[backward]| | |image from the 'n'th pair of | | |forward/backward reference | | |images, with the value of 'n' | | |ranging from 0 to 15. | | | | | |If the REF operation is | | |configured with only forward | | |reference images then, the | | |values of the backward | | |reference identifiers are not | | |used. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference identifier | | |pairs must be replicated. For | | |example, for a 16x16 block, all| | |four pair of reference | | |identifiers must be replicated | | |to the value of the first pair | | |for block 0. | +------------------------------------+-------------------------------+ |intel_sub_group_avc_sic_result_t |Evaluate the SIC operation with| |intel_sub_group_avc_sic_evaluate_ |multi references and return its| |with_multi_reference( |results. This is used for | | read_only image2d_t src_image, |interlaced source and reference| | uint packed_reference_ids, |images. | | uchar packed_reference_field_ | | | polarities, |A pair of unique reference | | sampler_t vme_media_sampler, |identifier (indicating unique | | intel_sub_group_avc_sic_payload_t|forward/backward reference | | payload ) |images) may be specified for | | |each of the allowed major | | |partitions (one 16x16 or four | | |8x8) using packed_dual_ref_ids.| | | | | |The value of | | |packed_reference_ids is a | | |integer with the following bits| | |specifying the values for the | | |pair of reference images for | | |each major partition. | | | | | |3:0 => Fwd reference block 0 | | |7:4 => Bwd reference block 0 | | |11:8 => Fwd reference block 1 | | |15:12 => Bwd reference block 1 | | |19:16 => Fwd reference block 2 | | |23:20 => Bwd reference block 2 | | |27:24 => Fwd reference block 3 | | |31:28 => Bwd reference block 3 | | | | | |A forward[backward] reference | | |idenitifer value of 'n' | | |indicates the forward[backward]| | |image from the ' nth' pair of | | |forward/backward reference | | |images, with the value of ' n' | | |ranging from 0 to 15. | | | | | |If the REF operation is | | |configured with only forward | | |reference images then, the | | |values of the backward | | |reference identifiers are not | | |used. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference identifier | | |pairs must be replicated. For | | |example, for a 16x16 block, all| | |four pair of reference | | |identifiers must be replicated | | |to the value of the first pair | | |for block 0. | | | | | |Reference field polarities for | | |forward and backward reference | | |images are specified for each | | |of the allowed major partitions| | |using packed_reference_field_ | | |polarities. | | | | | |The value of packed_reference_ | | |field_polarities is an integer | | |with the following bits | | |specifying the reference field | | |polarities for the major | | |partitions. | | | | | |0 : Fwd reference block 0 | | |1 : Fwd reference block 1 | | |2 : Fwd reference block 2 | | |3 : Fwd reference block 3 | | |4 : Bwd reference block 0 | | |5 : Bwd reference block 1 | | |6 : Bwd reference block 2 | | |7 : Bwd reference block 3 | | | | | |If the dual-reference | | |evaluation functions are not | | |used, then the values of the | | |backward reference field | | |polarities are not used. | | | | | |The blocks are numbered using | | |the traditional Z order. For | | |larger block sizes, the | | |sub-block reference field | | |polarities are replicated. For | | |example, for a 16x16 block all | | |four pairs of reference field | | |polarities are replicated to | | |the value of the first pair for| | |block 0. | | | | | |The value for the packed_ | | |interlaced_image_reference_ | | |field_polarities argument is | | |obtained by calling | | |intel_sub_group_avc_ime_get_ | | |inter_reference_interlaced_ | | |field_polarities(..) for the | | |preceding IME operation's | | |result. | +------------------------------------+-------------------------------+ Result type conversion functions +++++++++++++++++++++++++++++++++ These are optional built-in functions that may be called following the evaluation phase to convert REF results to MCE results and vice-versa. +-----------------------------------+--------------------------------+ |intel_sub_group_avc_mce_result_t |Convert the SIC result into a | |intel_sub_group_avc_sic_ |MCE result. | |convert_to_mce_result( | | | intel_sub_group_avc_sic_result_t| | | result ) | | +-----------------------------------+--------------------------------+ |intel_sub_group_avc_sic_result_t |Convert the MCE result into an | |intel_sub_group_avc_mce_ |SIC result. | |convert_to_sic_result( | | | intel_sub_group_avc_mce_result_t| | | result ) | | +-----------------------------------+--------------------------------+ Result processing phase functions +++++++++++++++++++++++++++++++++ These built-in functions are called following the evaluation phase to extract the various result components from an SIC evaluation result. +-----------------------------------+--------------------------------+ |uchar intel_sub_group_avc_sic_get_ |Get the best intra shape from | |ipe_luma_shape( |the SIC result. | | intel_sub_group_avc_sic_result_t| | | result) |The returned values are as per | | |the intra-MB shapes enumeration | | |values. | +-----------------------------------+--------------------------------+ |ushort intel_sub_group_avc_sic_get_|Get the best intra luma | |best_ipe_luma_distortion( |distortion from the SIC result | | intel_sub_group_avc_sic_result_t|for the shape returned by | | result) |intel_sub_group_avc_sic_get_ | | |ipe_luma_shape(..). | +-----------------------------------+--------------------------------+ |ushort intel_sub_group_avc_sic_get_|Get the best intra chroma | |best_ipe_chroma_distortion( |distortion for the 8x8 shape | | intel_sub_group_avc_sic_result_t|from the SIC result. | | result) | | +-----------------------------------+--------------------------------+ |ulong intel_sub_group_avc_sic_get_ |Get the packed intra luma modes | |packed_ipe_luma_modes( |for all blocks from the SIC | | intel_sub_group_avc_sic_result_t|result. There are four bits per | | result) |luma mode for a block and legal | | |values for luma modes are as per| | |its defined enumeration | | |values. The number of blocks is | | |based on the result of | | |intel_sub_group_avc_sic_get_ | | |ipe_luma_shape(..). | | | | | |If the luma shape is: | | |- 16x16, then one mode is | | | returned in bits [0, 3] | | |- 8x8, then four modes | | | corresponding to the four | | | partitions are returned by | | | bits in the ranges [0, 3], | | | [16, 19], [32, 35], and | | | [48, 51]; the order of the | | | four partitions are in the | | | traditional Z-order | | |- 4x4, then 16 modes (4 bits per| | | mode) are returned of all 16 | | | partitions by all the bits; | | | the order of the 16 partitions| | | are in the traditional Z-order| | | as shown below: | | | | | | 0 1 4 5 | | | 2 3 6 7 | | | 8 9 C D | | | A B E F | +-----------------------------------+--------------------------------+ |uchar intel_sub_group_avc_sic_get_ |Get the intra chroma mode for | |ipe_chroma_mode( |the 8x8 block from the SIC | | intel_sub_group_avc_sic_result_t|result. The legal values for | | result) |chroma modes are as per its | | |defined enumeration values. | +-----------------------------------+--------------------------------+ |uint intel_sub_group_avc_sic_get_ |Get the packed count of luma | |packed_skc_luma_count_threshold( |coefficient components that | | intel_sub_group_avc_sic_result_t|exceeded their transform | | result) |thresholds from the SIC result | | |for each 8x8 partition in | | |traditional Z-order. | | | | | |The format of the results is as | | |follows: | | | | | |- count of coefficients that | | | exceeded their respective | | | threshold for block 8x8_0 is | | | returned in bit [0, 7] | | |- count of coefficients that | | | exceeded their respective | | | threshold for block 8x8_1 is | | | returned in bit [8, 15] | | |- count of coefficients that | | | exceeded their respective | | | threshold for block 8x8_2 is | | | returned in bit [16, 23] | | |- count of coefficients that | | | exceeded their respective | | | threshold for block 8x8_0 is | | | returned in bit [24, 31] | | | | | |The results are only valid if | | |the SIC operation was configured| | |with frequency domain SAD | | |transform coefficients using | | |intel_sub_group_avc_sic_set_skc_| | |forward_transform_enable(..). | +-----------------------------------+--------------------------------+ |ulong intel_sub_group_avc_sic_get_ |Get the packed sum of luma | |packed_skc_luma_sum_threshold( |coefficient components that | | intel_sub_group_avc_sic_result_t|exceeded their transform | | result) |thresholds from the SIC result | | |for each 8x8 partition in | | |traditional Z-order. | | | | | |The format of the results is as | | |follows: | | |- sum of coefficients that | | | exceeded their respective | | | threshold for block 8x8_0 is | | | returned in bit [0, 15] | | |- sum of coefficients that | | | exceeded their respective | | | threshold for block 8x8_1 is | | | returned in bit [16,31] | | |- sum of coefficients that | | | exceeded their respective | | | threshold for block 8x8_2 is | | | returned in bit [32,47] | | |- sum of coefficients that | | | exceeded their respective | | | threshold for block 8x8_0 is | | | returned in bit [48, 63] | | | | | |The results are only valid if | | |the SIC operation was configured| | |with frequency domain SAD | | |transform coefficients using | | |intel_sub_group_avc_sic_set_skc_| | |forward_transform_enable(..). | +-----------------------------------+--------------------------------+ |ushort intel_sub_group_avc_sic_get_|Get the skip check raw SAD | |inter_raw_sads( |(i.e. without any mode or shape | | intel_sub_group_avc_sic_result_t|costs included) for the entire | | result ) |MB if the input payload was note| | |configured for block based skip | | |checks, otherwise return the | | |maximal SAD of individual 4x4 | | |(or 8x8, if the block size for | | |block based skip checking was | | |configured as 8x8) blocks with | | |the MB. | +-----------------------------------+--------------------------------+ |ushort intel_sub_group_avc_sic_get_|This is a wrapper for | |inter_distortions( |intel_sub_group_avc_mce_get_ | | intel_sub_group_avc_sic_result_t|inter_distortions(..) with the | | result ) |result conversions to/from MCE | | |types. See MCE version for | | |description. | +-----------------------------------+--------------------------------+ ______________________________________________________________________ A Complete Example ------------------ +--------------------------------------------------------------------+ |__kernel __attribute__((reqd_work_group_size(16,1,1))) void | |block_motion_estimate_intel( | | __read_only image2d_t src_img, | | __read_only image2d_t ref_img, | | __global short2* prediction_motion_vector_buffer, | | __global short2* motion_vector_buffer, | | __global ushort* residuals_buffer, | | __global uchar2* shapes_buffer, | | int iterations ) | |{ | | // Each 16x1 workgroup processes a column of MBs in a image. | | // The number of workgroups is equal to the number of MB in | | // a image row. All column MBs are processed in parallel without | | // handling any MB dependencies. | | int gid_0 = get_group_id(0); | | int gid_1 = 0; | | | | // Initialize the inline VME sampler. | | const sampler_t vme_sampler = CLK_AVC_ME_INITIALIZE_INTEL; | | | | // Process all columns MBs in a loop. | | for( int i = 0; i < iterations ; i++, gid_1++ ) { | | ushort2 srcCoord = 0; | | short2 refCoord = 0; | | short2 predMV = 0; | | | | // Compute the source MB coordinates. | | | | srcCoord.x = gid_0 * 16; | | srcCoord.y = gid_1 * 16; | | | | // Obtain the predictor of the source MB from input predictor | | // buffer. | | if( prediction_motion_vector_buffer != NULL ) { | | predMV = | | prediction_motion_vector_buffer[ | | gid_0 + gid_1 * get_num_groups(0) ]; | | refCoord.x = predMV.x / 4; | | refCoord.y = predMV.y / 4; | | refCoord.y = refCoord.y & 0xFFFE; | | } | | | | // Enable the partition mask to allow all shapes and let the | | // VME unit decide the partitioning for the source MB. | | uchar partition_mask = CLK_AVC_ME_PARTITION_MASK_ALL_INTEL; | | uchar sad_adjustment = CLK_AVC_ME_SAD_ADJUST_MODE_NONE_INTEL; | | uchar pixel_mode = CLK_AVC_ME_SUBPIXEL_MODE_QPEL_INTEL; | | | | // Create and initialize the IME payload. | | intel_sub_group_avc_ime_payload_t payload = | | intel_sub_group_avc_ime_initialize( | | srcCoord, partition_mask, sad_adjustment ); | | | | // Setup the IME for a single reference search. | | // The search window is 48x40 with exhaustive search, with it | | // location specified by the input predictor (refCooord). | | payload = | | intel_sub_group_avc_ime_set_single_reference( | | refCoord, CLK_AVC_ME_SEARCH_WINDOW_EXHAUSTIVE_INTEL, | | payload ); | | | | // Set a cost function for the IME operation. | | ulong cost_center = 0; | | uint2 packed_cost_table = | | intel_sub_group_avc_mce_get_default_medium_penalty_cost_\ | | table(); | | uchar search_cost_precision = | | CLK_AVC_ME_COST_PRECISION_HPEL_INTEL; | | payload = | | intel_sub_group_avc_ime_set_motion_vector_cost_function( | | cost_center, packed_cost_table, search_cost_precision, | | payload ); | | | | // Evaluate the IME operation with its configured payload. | | intel_sub_group_avc_ime_result_t result = | | intel_sub_group_avc_ime_evaluate_with_single_reference( | | src_img, ref_img, vme_sampler, payload ); | | | | // Extract IME results. | | long mvs = intel_sub_group_avc_ime_get_motion_vectors( result );| | ushort sads = | | intel_sub_group_avc_ime_get_inter_distortions( result ); | | uchar major_shape = | | intel_sub_group_avc_ime_get_inter_major_shape( result ); | | uchar minor_shapes = | | intel_sub_group_avc_ime_get_inter_minor_shapes( result ); | | uchar2 shapes = { major_shape, minor_shapes }; | | uchar directions = | | intel_sub_group_avc_ime_get_inter_directions( result ); | | | | // Perform FME if sub-pixel mode is specified. | | // FME is performed on the results of the IME operation. | | if( pixel_mode != CLK_AVC_ME_SUBPIXEL_MODE_INTEGER_INTEL ) { | | intel_sub_group_avc_ref_payload_t payload = | | intel_sub_group_avc_fme_initialize( | | srcCoord, mvs, major_shape, minor_shapes, directions, | | pixel_mode, sad_adjustment); | | intel_sub_group_avc_ref_result_t result = | | intel_sub_group_avc_ref_evaluate_with_single_reference( | | src_img, ref_img, vme_sampler, payload ); | | payload = | | intel_sub_group_avc_ref_set_motion_vector_cost_function( | | cost_center, packed_cost_table, search_cost_precision, | | payload ); | | mvs = intel_sub_group_avc_ref_get_motion_vectors( result ); | | sads = | | intel_sub_group_avc_ref_get_inter_distortions( result ); | | } | | | | // Write out the results. | | int index = | | ( gid_0 * 16 + get_local_id(0) ) + | | ( gid_1 * 16 * get_num_groups(0) ); | | int2 bi_mvs = as_int2( mvs ); | | motion_vector_buffer [ index ] = as_short2( bi_mvs.s0 ); | | if( residuals_buffer != NULL ) { | | residuals_buffer [ index ] = sads; | | } | | shapes_buffer [gid_0 + gid_1 * get_num_groups(0)] = shapes; | | } | |} | +--------------------------------------------------------------------+ _____________________________________________________________________ " Revision History Version 1 (12/02/2016): First public revision. Version 2 (09/18/2016): Fixed typos. Version 3 (10/02/2018): Modified definitions of default initialization literals to be pre-defined enumeration literals. Fixed typo in intra cost configuration function. Version 4 (10/31/2018): Minor typo fixes. Version 5 (11/09/2018): Marked intel_sub_group_ime_ref_window_size function as deprecated.