## 1.9. SIMD Intrinsics

This section describes SIMD intrinsic functions that are only supported in device code.

### Functions

__device__ ​ unsigned int __vabs2 ( unsigned int  a )
Computes per-halfword absolute value.
__device__ ​ unsigned int __vabs4 ( unsigned int  a )
Computes per-byte absolute value.
__device__ ​ unsigned int __vabsdiffs2 ( unsigned int  a, unsigned int  b )
Computes per-halfword absolute difference of signed integers.
__device__ ​ unsigned int __vabsdiffs4 ( unsigned int  a, unsigned int  b )
Computes per-byte absolute difference of signed integer.
__device__ ​ unsigned int __vabsdiffu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword absolute difference of unsigned integer computation: |a - b|.
__device__ ​ unsigned int __vabsdiffu4 ( unsigned int  a, unsigned int  b )
Computes per-byte absolute difference of unsigned integer.
__device__ ​ unsigned int __vabsss2 ( unsigned int  a )
Computes per-halfword absolute value with signed saturation.
__device__ ​ unsigned int __vabsss4 ( unsigned int  a )
Computes per-byte absolute value with signed saturation.
__device__ ​ unsigned int __vadd2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed addition, with wrap-around: a + b.
__device__ ​ unsigned int __vadd4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed addition.
__device__ ​ unsigned int __vaddss2 ( unsigned int  a, unsigned int  b )
Performs per-halfword addition with signed saturation.
__device__ ​ unsigned int __vaddss4 ( unsigned int  a, unsigned int  b )
Performs per-byte addition with signed saturation.
__device__ ​ unsigned int __vaddus2 ( unsigned int  a, unsigned int  b )
Performs per-halfword addition with unsigned saturation.
__device__ ​ unsigned int __vaddus4 ( unsigned int  a, unsigned int  b )
Performs per-byte addition with unsigned saturation.
__device__ ​ unsigned int __vavgs2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed rounded average computation.
__device__ ​ unsigned int __vavgs4 ( unsigned int  a, unsigned int  b )
Computes per-byte signed rounded average.
__device__ ​ unsigned int __vavgu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned rounded average computation.
__device__ ​ unsigned int __vavgu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned rounded average.
__device__ ​ unsigned int __vcmpeq2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison.
__device__ ​ unsigned int __vcmpeq4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
__device__ ​ unsigned int __vcmpges2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a >= b ? 0xffff : 0.
__device__ ​ unsigned int __vcmpges4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vcmpgeu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a >= b ? 0xffff : 0.
__device__ ​ unsigned int __vcmpgeu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vcmpgts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a > b ? 0xffff : 0.
__device__ ​ unsigned int __vcmpgts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vcmpgtu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a > b ? 0xffff : 0.
__device__ ​ unsigned int __vcmpgtu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vcmples2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a <= b ? 0xffff : 0.
__device__ ​ unsigned int __vcmples4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vcmpleu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a <= b ? 0xffff : 0.
__device__ ​ unsigned int __vcmpleu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vcmplts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a < b ? 0xffff : 0.
__device__ ​ unsigned int __vcmplts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vcmpltu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a < b ? 0xffff : 0.
__device__ ​ unsigned int __vcmpltu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vcmpne2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison: a != b ? 0xffff : 0.
__device__ ​ unsigned int __vcmpne4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
__device__ ​ unsigned int __vhaddu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned average computation.
__device__ ​ unsigned int __vhaddu4 ( unsigned int  a, unsigned int  b )
Computes per-byte unsigned average.
__device__ ​ unsigned int __vmaxs2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed maximum computation.
__device__ ​ unsigned int __vmaxs4 ( unsigned int  a, unsigned int  b )
Computes per-byte signed maximum.
__device__ ​ unsigned int __vmaxu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned maximum computation.
__device__ ​ unsigned int __vmaxu4 ( unsigned int  a, unsigned int  b )
Computes per-byte unsigned maximum.
__device__ ​ unsigned int __vmins2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed minimum computation.
__device__ ​ unsigned int __vmins4 ( unsigned int  a, unsigned int  b )
Computes per-byte signed minimum.
__device__ ​ unsigned int __vminu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned minimum computation.
__device__ ​ unsigned int __vminu4 ( unsigned int  a, unsigned int  b )
Computes per-byte unsigned minimum.
__device__ ​ unsigned int __vneg2 ( unsigned int  a )
Computes per-halfword negation.
__device__ ​ unsigned int __vneg4 ( unsigned int  a )
Performs per-byte negation.
__device__ ​ unsigned int __vnegss2 ( unsigned int  a )
Computes per-halfword negation with signed saturation.
__device__ ​ unsigned int __vnegss4 ( unsigned int  a )
Performs per-byte negation with signed saturation.
__device__ ​ unsigned int __vsads2 ( unsigned int  a, unsigned int  b )
Computes per-halfword sum of absolute differences of signed integers.
__device__ ​ unsigned int __vsads4 ( unsigned int  a, unsigned int  b )
Computes per-byte sum of absolute differences of signed integers.
__device__ ​ unsigned int __vsadu2 ( unsigned int  a, unsigned int  b )
Computes per-halfword sum of absolute differences of unsigned integers.
__device__ ​ unsigned int __vsadu4 ( unsigned int  a, unsigned int  b )
Computes per-byte sum of absolute differences of unsigned integers.
__device__ ​ unsigned int __vseteq2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison.
__device__ ​ unsigned int __vseteq4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
__device__ ​ unsigned int __vsetges2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
__device__ ​ unsigned int __vsetges4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vsetgeu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
__device__ ​ unsigned int __vsetgeu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vsetgts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
__device__ ​ unsigned int __vsetgts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vsetgtu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
__device__ ​ unsigned int __vsetgtu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vsetles2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
__device__ ​ unsigned int __vsetles4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vsetleu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
__device__ ​ unsigned int __vsetleu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vsetlts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
__device__ ​ unsigned int __vsetlts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
__device__ ​ unsigned int __vsetltu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
__device__ ​ unsigned int __vsetltu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
__device__ ​ unsigned int __vsetne2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison.
__device__ ​ unsigned int __vsetne4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
__device__ ​ unsigned int __vsub2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed subtraction, with wrap-around.
__device__ ​ unsigned int __vsub4 ( unsigned int  a, unsigned int  b )
Performs per-byte subtraction.
__device__ ​ unsigned int __vsubss2 ( unsigned int  a, unsigned int  b )
Performs per-halfword subtraction with signed saturation.
__device__ ​ unsigned int __vsubss4 ( unsigned int  a, unsigned int  b )
Performs per-byte subtraction with signed saturation.
__device__ ​ unsigned int __vsubus2 ( unsigned int  a, unsigned int  b )
Performs per-halfword subtraction with unsigned saturation.
__device__ ​ unsigned int __vsubus4 ( unsigned int  a, unsigned int  b )
Performs per-byte subtraction with unsigned saturation.

### Functions

__device__ ​ unsigned int __vabs2 ( unsigned int  a )
Computes per-halfword absolute value.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of argument into 2 parts, each consisting of 2 bytes, then computes absolute value for each part. Result is stored as unsigned int and returned.
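
As a rough illustration of the lane split, the following host-side C++ sketch models the operation with ordinary integer arithmetic. `ref_vabs2` is a hypothetical helper name, not part of the CUDA API, and the sketch assumes that the non-saturating form wraps, so a halfword of 0x8000 stays 0x8000.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vabs2 (illustration only).
inline uint32_t ref_vabs2(uint32_t a) {
    uint32_t r = 0;
    for (int lane = 0; lane < 2; ++lane) {
        // Extract one 16-bit lane and reinterpret it as a signed value.
        int32_t x = static_cast<int16_t>(static_cast<uint16_t>(a >> (16 * lane)));
        int32_t y = x < 0 ? -x : x;
        // Keep the low 16 bits only: |-32768| does not fit and wraps to 0x8000.
        r |= (static_cast<uint32_t>(y) & 0xffffu) << (16 * lane);
    }
    return r;
}
```

Under these assumptions, `ref_vabs2(0xffff0001)` yields 0x00010001, since the upper halfword holds -1 and the lower halfword holds 1.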

__device__ ​ unsigned int __vabs4 ( unsigned int  a )
Computes per-byte absolute value.
###### Returns

Returns computed value.

###### Description

Splits argument by bytes. Computes absolute value of each byte. Result is stored as unsigned int.

__device__ ​ unsigned int __vabsdiffs2 ( unsigned int  a, unsigned int  b )
Computes per-halfword absolute difference of signed integers.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes absolute difference. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vabsdiffs4 ( unsigned int  a, unsigned int  b )
Computes per-byte absolute difference of signed integer.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes absolute difference. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vabsdiffu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword absolute difference of unsigned integer computation: |a - b|.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes absolute difference. Result is stored as unsigned int and returned.
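
The per-halfword |a - b| computation can be sketched on the host as below. This is a plain C++ model for reasoning about results, not the device implementation, and `ref_vabsdiffu2` is a hypothetical name chosen for this example.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vabsdiffu2 (illustration only).
inline uint32_t ref_vabsdiffu2(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 2; ++lane) {
        // Treat each 16-bit lane as an unsigned value in [0, 65535].
        uint32_t x = (a >> (16 * lane)) & 0xffffu;
        uint32_t y = (b >> (16 * lane)) & 0xffffu;
        // Subtract the smaller from the larger so the result never underflows.
        uint32_t d = x > y ? x - y : y - x;
        r |= d << (16 * lane);
    }
    return r;
}
```

For instance, `ref_vabsdiffu2(0x00050003, 0x00020008)` gives 0x00030005: |5 - 2| = 3 in the upper halfword and |3 - 8| = 5 in the lower one.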

__device__ ​ unsigned int __vabsdiffu4 ( unsigned int  a, unsigned int  b )
Computes per-byte absolute difference of unsigned integer.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes absolute difference. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vabsss2 ( unsigned int  a )
Computes per-halfword absolute value with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of argument into 2 parts, each consisting of 2 bytes, then computes absolute value with signed saturation for each part. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vabsss4 ( unsigned int  a )
Computes per-byte absolute value with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of argument into 4 parts, each consisting of 1 byte, then computes absolute value with signed saturation for each part. Result is stored as unsigned int and returned.
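
Signed saturation matters only for the most negative input, whose absolute value does not fit in the lane. The host-side C++ sketch below models that, assuming saturation clamps to the largest signed byte, 0x7f. `ref_vabsss4` is a hypothetical helper, not a CUDA function.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vabsss4 (illustration only).
inline uint32_t ref_vabsss4(uint32_t a) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        // Extract one byte lane and reinterpret it as a signed value.
        int32_t x = static_cast<int8_t>(static_cast<uint8_t>(a >> (8 * lane)));
        int32_t y = x < 0 ? -x : x;
        // Saturate: |-128| = 128 is out of range, so clamp it to 127 (0x7f).
        if (y > 127) y = 127;
        r |= static_cast<uint32_t>(y) << (8 * lane);
    }
    return r;
}
```

With this model, `ref_vabsss4(0x80ff7f01)` evaluates to 0x7f017f01: the 0x80 byte clamps to 0x7f and the 0xff byte (-1) becomes 1.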

__device__ ​ unsigned int __vadd2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed addition, with wrap-around: a + b.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes, then performs unsigned addition on corresponding parts. Result is stored as unsigned int and returned.
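
The difference from an ordinary 32-bit addition is that a carry out of the lower halfword is discarded instead of propagating into the upper one. A minimal host-side sketch in C++, with the hypothetical name `ref_vadd2`, makes this visible:

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vadd2 (illustration only).
inline uint32_t ref_vadd2(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 2; ++lane) {
        uint32_t x = (a >> (16 * lane)) & 0xffffu;
        uint32_t y = (b >> (16 * lane)) & 0xffffu;
        // Wrap-around: keep the low 16 bits and drop the carry.
        r |= ((x + y) & 0xffffu) << (16 * lane);
    }
    return r;
}
```

For the inputs 0x0001ffff and 0x00010001 this model returns 0x00020000, whereas a plain 32-bit `a + b` would give 0x00030000 because the carry from the lower halfword leaks into the upper one.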

__device__ ​ unsigned int __vadd4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed addition.
###### Returns

Returns computed value.

###### Description

Splits 'a' into 4 bytes, then performs unsigned addition on each of these bytes with the corresponding byte from 'b', ignoring overflow. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vaddss2 ( unsigned int  a, unsigned int  b )
Performs per-halfword addition with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes, then performs addition with signed saturation on corresponding parts. Result is stored as unsigned int and returned.
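
Signed saturation can be pictured as clamping each halfword sum to the range [-32768, 32767]. The following C++ sketch is a host-side model under that assumption; `ref_vaddss2` is a hypothetical name used only here.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vaddss2 (illustration only).
inline uint32_t ref_vaddss2(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 2; ++lane) {
        // Reinterpret each 16-bit lane as a signed value.
        int32_t x = static_cast<int16_t>(static_cast<uint16_t>(a >> (16 * lane)));
        int32_t y = static_cast<int16_t>(static_cast<uint16_t>(b >> (16 * lane)));
        int32_t s = x + y;
        // Clamp to the signed 16-bit range instead of wrapping.
        if (s > 32767) s = 32767;
        if (s < -32768) s = -32768;
        r |= (static_cast<uint32_t>(s) & 0xffffu) << (16 * lane);
    }
    return r;
}
```

For example, `ref_vaddss2(0x7fff8000, 0x0001ffff)` returns 0x7fff8000: 32767 + 1 clamps to 0x7fff in the upper halfword and -32768 + -1 clamps to 0x8000 in the lower one.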

__device__ ​ unsigned int __vaddss4 ( unsigned int  a, unsigned int  b )
Performs per-byte addition with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte, then performs addition with signed saturation on corresponding parts. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vaddus2 ( unsigned int  a, unsigned int  b )
Performs per-halfword addition with unsigned saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes, then performs addition with unsigned saturation on corresponding parts.

__device__ ​ unsigned int __vaddus4 ( unsigned int  a, unsigned int  b )
Performs per-byte addition with unsigned saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte, then performs addition with unsigned saturation on corresponding parts.
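
Unsigned saturation means a byte sum above 255 is clamped to 0xff rather than wrapped. The sketch below is a host-side C++ model of that behavior; the name `ref_vaddus4` is hypothetical and the code is not the device implementation.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vaddus4 (illustration only).
inline uint32_t ref_vaddus4(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t x = (a >> (8 * lane)) & 0xffu;
        uint32_t y = (b >> (8 * lane)) & 0xffu;
        uint32_t s = x + y;
        // Clamp to the largest unsigned byte instead of wrapping.
        if (s > 0xffu) s = 0xffu;
        r |= s << (8 * lane);
    }
    return r;
}
```

With this model, `ref_vaddus4(0xff80017f, 0x01800180)` produces 0xffff02ff: the two upper bytes overflow and clamp to 0xff, while 0x01 + 0x01 and 0x7f + 0x80 fit in a byte.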

__device__ ​ unsigned int __vavgs2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed rounded average computation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes, then computes signed rounded average of corresponding parts. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vavgs4 ( unsigned int  a, unsigned int  b )
Computes per-byte signed rounded average.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte, then computes signed rounded average of corresponding parts. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vavgu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned rounded average computation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes, then computes unsigned rounded average of corresponding parts. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vavgu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned rounded average.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte, then computes unsigned rounded average of corresponding parts. Result is stored as unsigned int and returned.
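
A host-side C++ sketch of the rounded average follows. It assumes that "rounded" means halves round up, that is (x + y + 1) >> 1 per byte. `ref_vavgu4` is a hypothetical helper name.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vavgu4 (illustration only).
// Assumption: the rounded average is (x + y + 1) >> 1 for each byte lane.
inline uint32_t ref_vavgu4(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t x = (a >> (8 * lane)) & 0xffu;
        uint32_t y = (b >> (8 * lane)) & 0xffu;
        // The intermediate sum is at most 511, so it cannot overflow here.
        r |= ((x + y + 1u) >> 1) << (8 * lane);
    }
    return r;
}
```

Under this assumption, `ref_vavgu4(0x01020304, 0x02020507)` gives 0x02020406. The average of 1 and 2 rounds up to 2, and the average of 4 and 7 rounds up to 6.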

__device__ ​ unsigned int __vcmpeq2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison.
###### Returns

Returns 0xffff if a = b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if they are equal, and 0000 otherwise. For example __vcmpeq2(0x1234aba5, 0x1234aba6) returns 0xffff0000.

__device__ ​ unsigned int __vcmpeq4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
###### Returns

Returns 0xff if a = b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if they are equal, and 00 otherwise. For example __vcmpeq4(0x1234aba5, 0x1234aba6) returns 0xffffff00.
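
Because each byte of the result is either 0xff or 0x00, the return value can be used directly as a bit mask. The host-side C++ sketch below models the comparison and shows one common use, a branch-free per-byte select. Both `ref_vcmpeq4` and `select_bytes` are hypothetical helpers introduced for this example.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vcmpeq4 (illustration only).
inline uint32_t ref_vcmpeq4(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t x = (a >> (8 * lane)) & 0xffu;
        uint32_t y = (b >> (8 * lane)) & 0xffu;
        // Equal lanes produce an all-ones byte, unequal lanes produce zero.
        if (x == y) r |= 0xffu << (8 * lane);
    }
    return r;
}

// Hypothetical helper: take byte lanes from a where the mask byte is 0xff,
// and from b where the mask byte is 0x00.
inline uint32_t select_bytes(uint32_t mask, uint32_t a, uint32_t b) {
    return (a & mask) | (b & ~mask);
}
```

The model reproduces the example above, `ref_vcmpeq4(0x1234aba5, 0x1234aba6) == 0xffffff00`, and `select_bytes(0xffffff00, 0x11223344, 0xaabbccdd)` yields 0x112233dd.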

__device__ ​ unsigned int __vcmpges2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a >= b ? 0xffff : 0.
###### Returns

Returns 0xffff if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part >= 'b' part, and 0000 otherwise. For example __vcmpges2(0x1234aba5, 0x1234aba6) returns 0xffff0000.

__device__ ​ unsigned int __vcmpges4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

Returns 0xff if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part >= 'b' part, and 00 otherwise. For example __vcmpges4(0x1234aba5, 0x1234aba6) returns 0xffffff00.

__device__ ​ unsigned int __vcmpgeu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a >= b ? 0xffff : 0.
###### Returns

Returns 0xffff if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part >= 'b' part, and 0000 otherwise. For example __vcmpgeu2(0x1234aba5, 0x1234aba6) returns 0xffff0000.

__device__ ​ unsigned int __vcmpgeu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

Returns 0xff if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part >= 'b' part, and 00 otherwise. For example __vcmpgeu4(0x1234aba5, 0x1234aba6) returns 0xffffff00.
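
The signed and unsigned forms of this comparison agree on the example above but diverge once a byte has its top bit set. The host-side C++ sketch below models both with one hypothetical helper, `ref_vcmpge4`, whose `is_signed` flag is an invention of this example and selects how each byte is interpreted.

```cpp
#include <cstdint>

// Hypothetical host-side model covering __vcmpges4 (is_signed = true)
// and __vcmpgeu4 (is_signed = false). Illustration only.
inline uint32_t ref_vcmpge4(uint32_t a, uint32_t b, bool is_signed) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint8_t xb = static_cast<uint8_t>(a >> (8 * lane));
        uint8_t yb = static_cast<uint8_t>(b >> (8 * lane));
        // Interpret the same bits either as [-128, 127] or as [0, 255].
        int32_t x = is_signed ? static_cast<int8_t>(xb) : xb;
        int32_t y = is_signed ? static_cast<int8_t>(yb) : yb;
        if (x >= y) r |= 0xffu << (8 * lane);
    }
    return r;
}
```

For a = 0x807f0001 and b = 0x7f800001, the unsigned interpretation gives 0xff00ffff (0x80 is 128, which is at least 127), while the signed interpretation gives 0x00ffffff (0x80 is -128, which is less than 127).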

__device__ ​ unsigned int __vcmpgts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a > b ? 0xffff : 0.
###### Returns

Returns 0xffff if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part > 'b' part, and 0000 otherwise. For example __vcmpgts2(0x1234aba5, 0x1234aba6) returns 0x00000000.

__device__ ​ unsigned int __vcmpgts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

Returns 0xff if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part > 'b' part, and 00 otherwise. For example __vcmpgts4(0x1234aba5, 0x1234aba6) returns 0x00000000.

__device__ ​ unsigned int __vcmpgtu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a > b ? 0xffff : 0.
###### Returns

Returns 0xffff if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part > 'b' part, and 0000 otherwise. For example __vcmpgtu2(0x1234aba5, 0x1234aba6) returns 0x00000000.

__device__ ​ unsigned int __vcmpgtu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

Returns 0xff if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part > 'b' part, and 00 otherwise. For example __vcmpgtu4(0x1234aba5, 0x1234aba6) returns 0x00000000.

__device__ ​ unsigned int __vcmples2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a <= b ? 0xffff : 0.
###### Returns

Returns 0xffff if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part <= 'b' part, and 0000 otherwise. For example __vcmples2(0x1234aba5, 0x1234aba6) returns 0xffffffff.

__device__ ​ unsigned int __vcmples4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

Returns 0xff if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part <= 'b' part, and 00 otherwise. For example __vcmples4(0x1234aba5, 0x1234aba6) returns 0xffffffff.

__device__ ​ unsigned int __vcmpleu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a <= b ? 0xffff : 0.
###### Returns

Returns 0xffff if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part <= 'b' part, and 0000 otherwise. For example __vcmpleu2(0x1234aba5, 0x1234aba6) returns 0xffffffff.

__device__ ​ unsigned int __vcmpleu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

Returns 0xff if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part <= 'b' part, and 00 otherwise. For example __vcmpleu4(0x1234aba5, 0x1234aba6) returns 0xffffffff.

__device__ ​ unsigned int __vcmplts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison: a < b ? 0xffff : 0.
###### Returns

Returns 0xffff if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part < 'b' part, and 0000 otherwise. For example __vcmplts2(0x1234aba5, 0x1234aba6) returns 0x0000ffff.

__device__ ​ unsigned int __vcmplts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

Returns 0xff if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part < 'b' part, and 00 otherwise. For example __vcmplts4(0x1234aba5, 0x1234aba6) returns 0x000000ff.

__device__ ​ unsigned int __vcmpltu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison: a < b ? 0xffff : 0.
###### Returns

Returns 0xffff if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part < 'b' part, and 0000 otherwise. For example __vcmpltu2(0x1234aba5, 0x1234aba6) returns 0x0000ffff.

__device__ ​ unsigned int __vcmpltu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

Returns 0xff if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part < 'b' part, and 00 otherwise. For example __vcmpltu4(0x1234aba5, 0x1234aba6) returns 0x000000ff.

__device__ ​ unsigned int __vcmpne2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison: a != b ? 0xffff : 0.
###### Returns

Returns 0xffff if a != b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is ffff if 'a' part != 'b' part, and 0000 otherwise. For example __vcmpne2(0x1234aba5, 0x1234aba6) returns 0x0000ffff.

__device__ ​ unsigned int __vcmpne4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
###### Returns

Returns 0xff if a != b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is ff if 'a' part != 'b' part, and 00 otherwise. For example __vcmpne4(0x1234aba5, 0x1234aba6) returns 0x000000ff.

__device__ ​ unsigned int __vhaddu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned average computation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes, then computes unsigned average of corresponding parts. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vhaddu4 ( unsigned int  a, unsigned int  b )
Computes per-byte unsigned average.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte, then computes unsigned average of corresponding parts. Result is stored as unsigned int and returned.
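
The sketch below models this average on the host in C++, assuming the unrounded form truncates, that is (x + y) >> 1 per byte. It also shows a well-known whole-word identity that computes the same result without splitting lanes. Both function names are hypothetical, and neither is claimed to be how the hardware does it.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vhaddu4 (illustration only).
// Assumption: the average truncates, i.e. (x + y) >> 1 for each byte lane.
inline uint32_t ref_vhaddu4(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t x = (a >> (8 * lane)) & 0xffu;
        uint32_t y = (b >> (8 * lane)) & 0xffu;
        r |= ((x + y) >> 1) << (8 * lane);
    }
    return r;
}

// Same result on the whole word, using x + y = 2 * (x & y) + (x ^ y).
// The 0x7f7f7f7f mask removes bits shifted in from the neighboring lane.
inline uint32_t swar_vhaddu4(uint32_t a, uint32_t b) {
    return (a & b) + (((a ^ b) >> 1) & 0x7f7f7f7fu);
}
```

Both versions return 0x01020405 for the inputs 0x01020304 and 0x02020507. Note that the average of 1 and 2 truncates to 1 here.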

__device__ ​ unsigned int __vmaxs2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed maximum computation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes signed maximum. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vmaxs4 ( unsigned int  a, unsigned int  b )
Computes per-byte signed maximum.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes signed maximum. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vmaxu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned maximum computation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes unsigned maximum. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vmaxu4 ( unsigned int  a, unsigned int  b )
Computes per-byte unsigned maximum.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes unsigned maximum. Result is stored as unsigned int and returned.
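
Whether a byte such as 0x80 counts as large or small depends on the signedness of the maximum. The host-side C++ sketch below models both the signed and the unsigned per-byte maximum with one hypothetical helper, `ref_vmax4`. The `is_signed` flag is an invention of this example and not a CUDA parameter.

```cpp
#include <cstdint>

// Hypothetical host-side model covering __vmaxs4 (is_signed = true)
// and __vmaxu4 (is_signed = false). Illustration only.
inline uint32_t ref_vmax4(uint32_t a, uint32_t b, bool is_signed) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint8_t xb = static_cast<uint8_t>(a >> (8 * lane));
        uint8_t yb = static_cast<uint8_t>(b >> (8 * lane));
        // Compare as signed or unsigned, but store the original byte pattern.
        int32_t x = is_signed ? static_cast<int8_t>(xb) : xb;
        int32_t y = is_signed ? static_cast<int8_t>(yb) : yb;
        uint32_t m = (x >= y) ? xb : yb;
        r |= m << (8 * lane);
    }
    return r;
}
```

For a = 0x80017f05 and b = 0x7f02ff04, the unsigned maximum is 0x8002ff05 and the signed maximum is 0x7f027f05, because 0x80 and 0xff are negative when read as signed bytes.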

__device__ ​ unsigned int __vmins2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed minimum computation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes signed minimum. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vmins4 ( unsigned int  a, unsigned int  b )
Computes per-byte signed minimum.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes signed minimum. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vminu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned minimum computation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes unsigned minimum. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vminu4 ( unsigned int  a, unsigned int  b )
Computes per-byte unsigned minimum.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes unsigned minimum. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vneg2 ( unsigned int  a )
Computes per-halfword negation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of argument into 2 parts, each consisting of 2 bytes. For each part function computes negation. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vneg4 ( unsigned int  a )
Performs per-byte negation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of argument into 4 parts, each consisting of 1 byte. For each part function computes negation. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vnegss2 ( unsigned int  a )
Computes per-halfword negation with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of argument into 2 parts, each consisting of 2 bytes. For each part function computes negation with signed saturation. Result is stored as unsigned int and returned.

__device__ ​ unsigned int __vnegss4 ( unsigned int  a )
Performs per-byte negation with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of argument into 4 parts, each consisting of 1 byte. For each part function computes negation with signed saturation. Result is stored as unsigned int and returned.
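
Saturation changes the result only for the byte 0x80 (-128), whose negation does not fit in a signed byte. The host-side C++ sketch below models per-byte negation with and without saturation. `ref_vneg4` and its `saturate` flag are hypothetical, and the wrapping behavior of the plain form is an assumption of this model.

```cpp
#include <cstdint>

// Hypothetical host-side model covering __vneg4 (saturate = false)
// and __vnegss4 (saturate = true). Illustration only.
inline uint32_t ref_vneg4(uint32_t a, bool saturate) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        int32_t x = static_cast<int8_t>(static_cast<uint8_t>(a >> (8 * lane)));
        int32_t y = -x;
        // Only -(-128) = 128 is out of range for a signed byte.
        if (saturate && y > 127) y = 127;
        // Without saturation the low 8 bits are kept, so 128 wraps to 0x80.
        r |= (static_cast<uint32_t>(y) & 0xffu) << (8 * lane);
    }
    return r;
}
```

For the input 0x80ff0001 the wrapping model returns 0x800100ff, and the saturating model returns 0x7f0100ff.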

__device__ ​ unsigned int __vsads2 ( unsigned int  a, unsigned int  b )
Computes per-halfword sum of absolute differences of signed integers.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes absolute differences, and returns sum of those differences.

__device__ ​ unsigned int __vsads4 ( unsigned int  a, unsigned int  b )
Computes per-byte sum of absolute differences of signed integers.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes absolute differences, and returns sum of those differences.

__device__ ​ unsigned int __vsadu2 ( unsigned int  a, unsigned int  b )
Computes per-halfword sum of absolute differences of unsigned integers.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts function computes absolute differences, and returns sum of those differences.

__device__ ​ unsigned int __vsadu4 ( unsigned int  a, unsigned int  b )
Computes per-byte sum of absolute differences of unsigned integers.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts function computes absolute differences, and returns sum of those differences.
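
Unlike the per-lane functions, the sum of absolute differences reduces the four byte lanes to a single integer. A host-side C++ sketch, with the hypothetical name `ref_vsadu4`, could look like this:

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vsadu4 (illustration only).
inline uint32_t ref_vsadu4(uint32_t a, uint32_t b) {
    uint32_t sum = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t x = (a >> (8 * lane)) & 0xffu;
        uint32_t y = (b >> (8 * lane)) & 0xffu;
        // Accumulate |x - y| across all four byte lanes.
        sum += x > y ? x - y : y - x;
    }
    return sum;
}
```

For example, `ref_vsadu4(0x01020304, 0x04030201)` returns 8 (3 + 1 + 1 + 3). The largest possible result in this model is 4 * 255 = 1020.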

__device__ ​ unsigned int __vseteq2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison.
###### Returns

Returns 1 if a = b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts result is 1 if they are equal, and 0 otherwise. For example __vseteq2(0x1234aba5, 0x1234aba6) returns 0x00010000.

__device__ ​ unsigned int __vseteq4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
###### Returns

Returns 1 if a = b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts result is 1 if they are equal, and 0 otherwise. For example __vseteq4(0x1234aba5, 0x1234aba6) returns 0x01010100.
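
The host-side C++ sketch below models the comparison under the assumption that every byte lane produces its own 0 or 1 flag in the same lane position as the inputs. `ref_vseteq4` is a hypothetical helper, not a CUDA function.

```cpp
#include <cstdint>

// Hypothetical host-side reference model of __vseteq4 (illustration only).
// Assumption: each byte lane of the result holds 1 if the lanes are equal,
// and 0 otherwise.
inline uint32_t ref_vseteq4(uint32_t a, uint32_t b) {
    uint32_t r = 0;
    for (int lane = 0; lane < 4; ++lane) {
        uint32_t x = (a >> (8 * lane)) & 0xffu;
        uint32_t y = (b >> (8 * lane)) & 0xffu;
        if (x == y) r |= 1u << (8 * lane);
    }
    return r;
}
```

Under this assumption, `ref_vseteq4(0x1234aba5, 0x1234aba6)` is 0x01010100. Multiplying that value by 0xff widens each flag to a full byte and gives 0xffffff00, the mask layout used by the `__vcmp*4` functions.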

__device__ ​ unsigned int __vsetges2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
###### Returns

For each halfword, returns 1 if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the signed comparison 'a' part >= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetges4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

For each byte, returns 1 if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the signed comparison 'a' part >= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.
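For the signed variant, the interpretation of each byte matters: 0x80 is -128, not 128. A minimal host-side sketch of that behavior follows; `ref_vsetges4` is a hypothetical helper written from the description above, not NVIDIA's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model of __vsetges4 (not NVIDIA's code).
// Compares bytes as signed 8-bit values; each result byte is 1 or 0.
inline unsigned int ref_vsetges4(unsigned int a, unsigned int b) {
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        int ai = static_cast<std::int8_t>((a >> shift) & 0xFFu);
        int bi = static_cast<std::int8_t>((b >> shift) & 0xFFu);
        result |= (ai >= bi ? 1u : 0u) << shift;
    }
    return result;
}

inline void demo_vsetges4() {
    // Byte 3: -128 >= 127 is false. Byte 2: 1 >= 1. Byte 1: 127 >= 127.
    // Byte 0: 0 >= 1 is false.
    assert(ref_vsetges4(0x80017F00u, 0x7F017F01u) == 0x00010100u);
}
```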

__device__ ​ unsigned int __vsetgeu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
###### Returns

For each halfword, returns 1 if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the unsigned comparison 'a' part >= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetgeu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

For each byte, returns 1 if a >= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the unsigned comparison 'a' part >= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.
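The unsigned comparison treats 0x80 as 128, so a byte that would compare as the smallest value under a signed reading compares as a large one here. The sketch below models both unsigned variants on the host; `ref_vsetgeu` and `lane_bits` are hypothetical (16 models `__vsetgeu2`, 8 models `__vsetgeu4`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model for __vsetgeu2 / __vsetgeu4 (not NVIDIA's code).
inline unsigned int ref_vsetgeu(unsigned int a, unsigned int b, int lane_bits) {
    const std::uint32_t mask = (1u << lane_bits) - 1u;
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += lane_bits) {
        bool ge = ((a >> shift) & mask) >= ((b >> shift) & mask);
        result |= (ge ? 1u : 0u) << shift;
    }
    return result;
}

inline void demo_vsetgeu() {
    const unsigned int a = 0x80017F00u, b = 0x7F017F01u;
    // Bytes: 0x80 >= 0x7F, 0x01 >= 0x01, 0x7F >= 0x7F hold; 0x00 >= 0x01 does not.
    assert(ref_vsetgeu(a, b, 8) == 0x01010100u);
    // Halfwords: 0x8001 >= 0x7F01 holds; 0x7F00 >= 0x7F01 does not.
    assert(ref_vsetgeu(a, b, 16) == 0x00010000u);
}
```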

__device__ ​ unsigned int __vsetgts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
###### Returns

For each halfword, returns 1 if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the signed comparison 'a' part > 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetgts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

For each byte, returns 1 if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the signed comparison 'a' part > 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetgtu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
###### Returns

For each halfword, returns 1 if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the unsigned comparison 'a' part > 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetgtu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

For each byte, returns 1 if a > b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the unsigned comparison 'a' part > 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.
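One possible use of a per-byte 0/1 result is counting how many packed values exceed a threshold. The sketch below illustrates the idea on the host, assuming the per-byte semantics described above; `ref_vsetgtu4` and `count_above` are hypothetical helpers, not CUDA API functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model of __vsetgtu4 (not NVIDIA's code).
inline unsigned int ref_vsetgtu4(unsigned int a, unsigned int b) {
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        bool gt = ((a >> shift) & 0xFFu) > ((b >> shift) & 0xFFu);
        result |= (gt ? 1u : 0u) << shift;
    }
    return result;
}

// Counts how many of the four packed bytes in `pixels` exceed `threshold`.
inline unsigned int count_above(unsigned int pixels, std::uint8_t threshold) {
    // Replicate the threshold into all four bytes, then compare lane by lane.
    unsigned int flags = ref_vsetgtu4(pixels, threshold * 0x01010101u);
    // Each lane is 0 or 1, so adding the four bytes counts the set lanes.
    return (flags & 0xFFu) + ((flags >> 8) & 0xFFu) +
           ((flags >> 16) & 0xFFu) + (flags >> 24);
}

inline void demo_count_above() {
    // 0xF0 and 0x81 exceed 0x80; 0x10 does not, and 0x80 is not strictly greater.
    assert(ref_vsetgtu4(0x10F08081u, 0x80808080u) == 0x00010001u);
    assert(count_above(0x10F08081u, 0x80) == 2u);
}
```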

__device__ ​ unsigned int __vsetles2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
###### Returns

For each halfword, returns 1 if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the signed comparison 'a' part <= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetles4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

For each byte, returns 1 if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the signed comparison 'a' part <= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.
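The signed less-than-or-equal comparison can be modeled for both lane widths by sign-extending each part before comparing. This is a host-side sketch only; `sign_extend_lane`, `ref_vsetles`, and `lane_bits` are hypothetical names (16 models `__vsetles2`, 8 models `__vsetles4`).

```cpp
#include <cassert>
#include <cstdint>

// Interprets the low lane_bits of `lane` as a two's-complement signed value.
inline int sign_extend_lane(std::uint32_t lane, int lane_bits) {
    const int sign = 1 << (lane_bits - 1);
    return (static_cast<int>(lane) ^ sign) - sign;
}

// Hypothetical reference model for __vsetles2 / __vsetles4 (not NVIDIA's code).
inline unsigned int ref_vsetles(unsigned int a, unsigned int b, int lane_bits) {
    const std::uint32_t mask = (1u << lane_bits) - 1u;
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += lane_bits) {
        int ai = sign_extend_lane((a >> shift) & mask, lane_bits);
        int bi = sign_extend_lane((b >> shift) & mask, lane_bits);
        result |= (ai <= bi ? 1u : 0u) << shift;
    }
    return result;
}

inline void demo_vsetles() {
    const unsigned int a = 0x7FFF8000u, b = 0x80007FFFu;
    // Halfwords: 32767 <= -32768 is false; -32768 <= 32767 is true.
    assert(ref_vsetles(a, b, 16) == 0x00000001u);
    // Bytes: 127 <= -128 (0), -1 <= 0 (1), -128 <= 127 (1), 0 <= -1 (0).
    assert(ref_vsetles(a, b, 8) == 0x00010100u);
}
```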

__device__ ​ unsigned int __vsetleu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
###### Returns

For each halfword, returns 1 if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the unsigned comparison 'a' part <= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetleu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

For each byte, returns 1 if a <= b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the unsigned comparison 'a' part <= 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetlts2 ( unsigned int  a, unsigned int  b )
Performs per-halfword signed comparison.
###### Returns

For each halfword, returns 1 if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the signed comparison 'a' part < 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetlts4 ( unsigned int  a, unsigned int  b )
Performs per-byte signed comparison.
###### Returns

For each byte, returns 1 if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the signed comparison 'a' part < 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetltu2 ( unsigned int  a, unsigned int  b )
Performs per-halfword unsigned comparison.
###### Returns

For each halfword, returns 1 if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the unsigned comparison 'a' part < 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetltu4 ( unsigned int  a, unsigned int  b )
Performs per-byte unsigned comparison.
###### Returns

For each byte, returns 1 if a < b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the unsigned comparison 'a' part < 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.
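The less-than comparison is strict, so parts that are equal produce 0, whereas a less-than-or-equal comparison would produce 1 for them. The host-side sketch below contrasts the two for unsigned bytes; `ref_vsetltu4` and `ref_vsetleu4` are hypothetical models written from the stated return conditions, not NVIDIA's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model of __vsetltu4: strict unsigned '<' per byte.
inline unsigned int ref_vsetltu4(unsigned int a, unsigned int b) {
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        bool lt = ((a >> shift) & 0xFFu) < ((b >> shift) & 0xFFu);
        result |= (lt ? 1u : 0u) << shift;
    }
    return result;
}

// Hypothetical reference model of __vsetleu4: unsigned '<=' per byte.
inline unsigned int ref_vsetleu4(unsigned int a, unsigned int b) {
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        bool le = ((a >> shift) & 0xFFu) <= ((b >> shift) & 0xFFu);
        result |= (le ? 1u : 0u) << shift;
    }
    return result;
}

inline void demo_lt_vs_le() {
    const unsigned int a = 0x01020304u, b = 0x01030204u;
    // Only 0x02 < 0x03 holds strictly; the equal bytes 0x01 and 0x04 give 0.
    assert(ref_vsetltu4(a, b) == 0x00010000u);
    // With '<=', the equal bytes also give 1; 0x03 <= 0x02 still fails.
    assert(ref_vsetleu4(a, b) == 0x01010001u);
}
```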

__device__ ​ unsigned int __vsetne2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed comparison.
###### Returns

For each halfword, returns 1 if a != b, else returns 0.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For corresponding parts, the function performs the comparison 'a' part != 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.

__device__ ​ unsigned int __vsetne4 ( unsigned int  a, unsigned int  b )
Performs per-byte (un)signed comparison.
###### Returns

For each byte, returns 1 if a != b, else returns 0.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For corresponding parts, the function performs the comparison 'a' part != 'b' part. Each part of the result is 1 if the comparison is satisfied, and 0 otherwise.
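Where the 1 lands in the result depends on the lane width, which a single differing byte makes visible. The following host-side sketch assumes the per-part semantics described above; `ref_vsetne` and `lane_bits` are hypothetical (16 models `__vsetne2`, 8 models `__vsetne4`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model for __vsetne2 / __vsetne4 (not NVIDIA's code).
inline unsigned int ref_vsetne(unsigned int a, unsigned int b, int lane_bits) {
    const std::uint32_t mask = (1u << lane_bits) - 1u;
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += lane_bits) {
        bool differ = ((a >> shift) & mask) != ((b >> shift) & mask);
        result |= (differ ? 1u : 0u) << shift;
    }
    return result;
}

inline void demo_vsetne() {
    // Only the most significant byte differs (0x12 vs 0xAB).
    const unsigned int a = 0x12345678u, b = 0xAB345678u;
    assert(ref_vsetne(a, b, 8) == 0x01000000u);   // flag sits in byte 3
    assert(ref_vsetne(a, b, 16) == 0x00010000u);  // flag sits in the upper halfword
}
```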

__device__ ​ unsigned int __vsub2 ( unsigned int  a, unsigned int  b )
Performs per-halfword (un)signed subtraction, with wrap-around.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For each pair of corresponding parts, the function performs subtraction. The result is stored as an unsigned int and returned.

__device__ ​ unsigned int __vsub4 ( unsigned int  a, unsigned int  b )
Performs per-byte subtraction.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For each pair of corresponding parts, the function performs subtraction. The result is stored as an unsigned int and returned.
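Per-part subtraction differs from ordinary 32-bit subtraction because a borrow never crosses a part boundary. The sketch below illustrates this on the host, assuming each part wraps around within its own width; `ref_vsub` and `lane_bits` are hypothetical (16 models `__vsub2`, 8 models `__vsub4`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model for __vsub2 / __vsub4 (not NVIDIA's code).
// Each lane is subtracted independently and wraps within lane_bits.
inline unsigned int ref_vsub(unsigned int a, unsigned int b, int lane_bits) {
    const std::uint32_t mask = (1u << lane_bits) - 1u;
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += lane_bits) {
        std::uint32_t diff = ((a >> shift) & mask) - ((b >> shift) & mask);
        result |= (diff & mask) << shift;  // keep only the lane's bits: wrap-around
    }
    return result;
}

inline void demo_vsub() {
    const unsigned int a = 0x01000100u, b = 0x00010001u;
    // Bytes: 0x01-0x00, 0x00-0x01 (wraps to 0xFF), 0x01-0x00, 0x00-0x01.
    assert(ref_vsub(a, b, 8) == 0x01FF01FFu);
    // Halfwords: 0x0100 - 0x0001 = 0x00FF in both lanes.
    assert(ref_vsub(a, b, 16) == 0x00FF00FFu);
    // A plain 32-bit subtraction lets borrows cross byte boundaries.
    assert(a - b == 0x00FF00FFu);
}
```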

__device__ ​ unsigned int __vsubss2 ( unsigned int  a, unsigned int  b )
Performs per-halfword subtraction with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For each pair of corresponding parts, the function performs subtraction with signed saturation. The result is stored as an unsigned int and returned.

__device__ ​ unsigned int __vsubss4 ( unsigned int  a, unsigned int  b )
Performs per-byte subtraction with signed saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For each pair of corresponding parts, the function performs subtraction with signed saturation. The result is stored as an unsigned int and returned.
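Signed saturation means a per-byte difference outside the range -128 to 127 is clamped to the nearest bound instead of wrapping. The sketch below models that on the host under this assumption; `ref_vsubss4` is a hypothetical helper, not NVIDIA's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical reference model of __vsubss4 (not NVIDIA's code).
// Subtracts signed bytes and clamps each difference to [-128, 127].
inline unsigned int ref_vsubss4(unsigned int a, unsigned int b) {
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        int ai = static_cast<std::int8_t>((a >> shift) & 0xFFu);
        int bi = static_cast<std::int8_t>((b >> shift) & 0xFFu);
        int diff = std::min(127, std::max(-128, ai - bi));
        result |= (static_cast<unsigned int>(diff) & 0xFFu) << shift;
    }
    return result;
}

inline void demo_vsubss4() {
    // Byte 3: -128 - 1 = -129, clamped to -128 (0x80).
    // Byte 2: 127 - (-1) = 128, clamped to 127 (0x7F).
    // Byte 1: 5 - 3 = 2. Byte 0: 0 - 1 = -1 (0xFF).
    assert(ref_vsubss4(0x807F0500u, 0x01FF0301u) == 0x807F02FFu);
}
```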

__device__ ​ unsigned int __vsubus2 ( unsigned int  a, unsigned int  b )
Performs per-halfword subtraction with unsigned saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 2 parts, each consisting of 2 bytes. For each pair of corresponding parts, the function performs subtraction with unsigned saturation. The result is stored as an unsigned int and returned.

__device__ ​ unsigned int __vsubus4 ( unsigned int  a, unsigned int  b )
Performs per-byte subtraction with unsigned saturation.
###### Returns

Returns computed value.

###### Description

Splits 4 bytes of each argument into 4 parts, each consisting of 1 byte. For each pair of corresponding parts, the function performs subtraction with unsigned saturation. The result is stored as an unsigned int and returned.
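With unsigned saturation, a part of `a` that is smaller than the matching part of `b` yields 0 rather than wrapping to a large value. The host-side sketch below assumes that reading; `ref_vsubus` and `lane_bits` are hypothetical (16 models `__vsubus2`, 8 models `__vsubus4`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference model for __vsubus2 / __vsubus4 (not NVIDIA's code).
// Each lane computes max(a_i - b_i, 0) on unsigned values.
inline unsigned int ref_vsubus(unsigned int a, unsigned int b, int lane_bits) {
    const std::uint32_t mask = (1u << lane_bits) - 1u;
    unsigned int result = 0;
    for (int shift = 0; shift < 32; shift += lane_bits) {
        std::uint32_t ai = (a >> shift) & mask;
        std::uint32_t bi = (b >> shift) & mask;
        std::uint32_t diff = (ai > bi) ? (ai - bi) : 0u;  // clamp at zero
        result |= diff << shift;
    }
    return result;
}

inline void demo_vsubus() {
    const unsigned int a = 0x10FF0005u, b = 0x20010103u;
    // Bytes: 0x10-0x20 -> 0, 0xFF-0x01 = 0xFE, 0x00-0x01 -> 0, 0x05-0x03 = 2.
    assert(ref_vsubus(a, b, 8) == 0x00FE0002u);
    // Halfwords: 0x10FF < 0x2001 and 0x0005 < 0x0103, so both lanes clamp to 0.
    assert(ref_vsubus(a, b, 16) == 0x00000000u);
}
```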