i32 @__nv_byte_perm(i32 %x, i32 %y, i32 %z)
__nv_byte_perm(x,y,s) returns a 32-bit integer made up of four bytes chosen from the eight input bytes supplied in the two input integers x and y, as specified by a selector, s (the third argument, named %z in the prototype).
The input bytes are indexed as follows:
input[0] = x<7:0>
input[1] = x<15:8>
input[2] = x<23:16>
input[3] = x<31:24>
input[4] = y<7:0>
input[5] = y<15:8>
input[6] = y<23:16>
input[7] = y<31:24>
The selector indices are as follows (the upper 16 bits of the selector are not used):
selector[0] = s<2:0>
selector[1] = s<6:4>
selector[2] = s<10:8>
selector[3] = s<14:12>
The returned value r is computed as: result[n] := input[selector[n]], where result[n] is the nth byte of r and n = 0 is the least significant byte.
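
The selection rule above can be sketched as a small host-side reference function in C. This is only an illustration of the documented semantics, not the libdevice implementation: the name byte_perm_ref is hypothetical, and the sketch assumes that the selector bits not listed above (bits 3, 7, 11, 15 and the upper 16 bits) are simply ignored, which this text does not spell out for the actual hardware.

```c
#include <stdint.h>

/* Hypothetical reference emulation of result[n] := input[selector[n]].
 * Assumption: selector bits outside s<2:0>, s<6:4>, s<10:8>, s<14:12>
 * are ignored. */
uint32_t byte_perm_ref(uint32_t x, uint32_t y, uint32_t s)
{
    /* Pack the eight input bytes: input[0..3] come from x, input[4..7] from y. */
    uint64_t input = ((uint64_t)y << 32) | (uint64_t)x;
    uint32_t r = 0;

    for (int n = 0; n < 4; ++n) {
        /* selector[n] is the 3-bit field starting at bit 4*n of s. */
        uint32_t sel = (s >> (4 * n)) & 0x7u;
        /* Fetch input[sel] and place it in byte n of the result. */
        uint32_t byte = (uint32_t)(input >> (8 * sel)) & 0xFFu;
        r |= byte << (8 * n);
    }
    return r;
}
```

Under these assumptions, with x = 0x33221100 and y = 0x77665544, a selector of 0x3210 returns x unchanged, 0x7654 returns y, 0x0123 reverses the bytes of x to give 0x00112233, and 0x5410 combines the low halves of both inputs into 0x55441100.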
Compute 2.0: Yes
Compute 3.0: Yes
Compute 3.5: Yes