Name Loop unroll pragma extension Name Strings cl_nv_pragma_unroll Number OpenCL Extension #19 Dependencies OpenCL 1.0 is required Contributors Jian-Zhong Wang Bastiaan Aarts Vinod Grover Overview This extension extends the OpenCL C language with a hint that allows loops to be unrolled. This pragma must be used for a loop and can be used to specify full unrolling or partial unrolling by a certain amount. This is a hint and the compiler may ignore this pragma for any reason. Goals The principal goal of the pragma unroll is to improve the performance of loops via unrolling. Typically this enables other optimizations or improves instruction level parallelism of a thread. Details A user may specify that a loop in the source program be unrolled. This is done via a pragma. The syntax of this pragma is as follows #pragma unroll [unroll-factor] The pragma unroll may optionally specify an unroll factor. The pragma must be placed immediately before the loop and only applies to that loop. If unroll factor is not specified then the compiler will try to do complete or full unrolling of the loop. If a loop unroll factor is specified the compiler will perform partial loop unrolling. The loop factor, if specified, must be a compile time non negative integer constant. A loop unroll factor of 1 means that the compiler should not unroll the loop. A complete unroll specification has no effect if the trip count of the loop is not compile-time computable. Examples This sections lists a few examples illustrating valid and invalid uses. - Complete unrolling example #pragma unroll for (int i = 0; i < 32; i++) { ... } This example full unrolling is requested from the compiler. Note that, since the trip count is known to be 32 the compiler will most likely honor this request. In the following example, the trip count is not known so unrolling pragma will be ignored. #pragma unroll for (int i = 0; i < n; i++) { ... } - no unrolling example #pragma unroll 1 for (int i = 0; i < 64; i++) { ... } - partial unrolling example #pragma unroll 4 for (int i = 0; i < n; i++) { ... } Note that, in this example the trip count is not knownt at compile time, but a partial unroll factor of 4 is valid. - invalid unroll pragma usage The following examples describe some invalid uses of loop unrolling pragmas. #pragma unroll -1 for (...) { ... } This is invalid because the loop unroll factor is negative. #pragma unroll if (...) { ... } This is invalid because the pragma is used on a non loop construct #pragma unroll x+1 for (...) { ... } This is invalid since the loop unroll factor is not a compile-time known value.