DALI binary arithmetic operators - type promotions
In this example, we describe the type promotion rules for binary arithmetic operators in DALI. Details on using arithmetic operators in DALI can be found in the “DALI expressions and arithmetic operators” notebook.
Prepare the test pipeline
First, we will prepare the helper code, so we can easily manipulate the types and values that will appear as tensors in the DALI pipeline.
We will use numpy as the source of the custom data, and we also need to import several things from DALI to create the Pipeline and use the ExternalSource operator.
[1]:
import numpy as np
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types
from nvidia.dali.types import Constant
batch_size = 1
Defining the data
As we are dealing with binary operators, we need two inputs. We will create a simple helper function that returns two numpy arrays of the given numpy types with arbitrarily selected values. This makes it easy to manipulate the types. In an actual scenario, the data processed by DALI arithmetic operators would be tensors produced by other operators, containing images, video sequences or other data.
Keep in mind that the shapes of both inputs need to match, as these are element-wise operations.
[2]:
left_magic_values = [42, 8]
right_magic_values = [9, 2]
def get_data(left_type, right_type):
    return ([left_type(left_magic_values)], [right_type(right_magic_values)])
batch_size = 1
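The shape requirement mentioned above can be illustrated without DALI. The following is a minimal plain-Python sketch, assuming flat lists stand in for tensors; `elementwise` is a hypothetical helper introduced only for this illustration and is not part of the DALI API.

```python
def elementwise(op, left, right):
    """Apply a binary operation to matching positions of two flat lists.

    Hypothetical helper: models the element-wise behaviour described in
    the text, where both inputs must have the same shape.
    """
    if len(left) != len(right):
        raise ValueError(
            "shape mismatch: {} vs {} elements".format(len(left), len(right))
        )
    return [op(a, b) for a, b in zip(left, right)]


# Same values as left_magic_values and right_magic_values in the helper code.
print(elementwise(lambda a, b: a + b, [42, 8], [9, 2]))
```

Each output element depends only on the elements at the same position in both inputs, which is why a length mismatch is rejected in this sketch.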
Defining the pipeline
The next step is to define the Pipeline. We override Pipeline.iter_setup, a method called by the pipeline before every Pipeline.run. It is meant to feed the data into the ExternalSource() operators indicated by self.left and self.right. The data will be obtained from the get_data function, to which we pass the left and right types.
Note that we do not need to instantiate any additional operators; we can use regular Python arithmetic expressions on the results of other operators in the define_graph step.
For convenience, we’ll wrap the usage of arithmetic operations in a lambda called operation, specified when creating the pipeline.
define_graph will return both our data inputs and the result of applying operation to them.
[3]:
class ArithmeticPipeline(Pipeline):
    def __init__(self, operation, left_type, right_type, batch_size, num_threads, device_id):
        super(ArithmeticPipeline, self).__init__(batch_size, num_threads, device_id, seed=12)
        self.left_source = ops.ExternalSource()
        self.right_source = ops.ExternalSource()
        self.operation = operation
        self.left_type = left_type
        self.right_type = right_type

    def define_graph(self):
        self.left = self.left_source()
        self.right = self.right_source()
        return self.left, self.right, self.operation(self.left, self.right)

    def iter_setup(self):
        (l, r) = get_data(self.left_type, self.right_type)
        self.feed_input(self.left, l)
        self.feed_input(self.right, r)
Type promotion rules
Type promotions for binary operators are described below. The type promotion rules are commutative. They apply to +, -, *, and //. The / operator always returns a float32 for integer inputs, and applies the rules below when at least one of the inputs is a floating point number.
Operand Type | Operand Type | Result Type | Additional Conditions |
---|---|---|---|
T | T | T | |
floatX | T | floatX | where T is not a float |
floatX | floatY | float(max(X, Y)) | |
intX | intY | int(max(X, Y)) | |
uintX | uintY | uint(max(X, Y)) | |
intX | uintY | int2Y | if X <= Y |
intX | uintY | intX | if X > Y |
The bool type is considered the smallest unsigned integer type and is treated as uint1 with respect to the table above.
The bitwise binary |, &, and ^ operations abide by the same type promotion rules as arithmetic binary operations, but their inputs are restricted to integral types (bool included).
Only multiplication * and bitwise operations |, &, ^ can accept two bool inputs.
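The rules above can be written down as a small lookup function. The following is a minimal plain-Python sketch that works on type names only; `promote` and `true_division_type` are hypothetical helpers introduced for illustration and are not part of the DALI API. The table does not say what happens when int2Y would exceed 64 bits, so the sketch raises an error in that case rather than guessing.

```python
import re


def _parse(type_name):
    """Split a name such as 'uint8' into (kind, bit width)."""
    if type_name == "bool":
        return "uint", 1  # bool is treated as uint1 in the table
    match = re.fullmatch(r"(uint|int|float)(\d+)", type_name)
    if match is None:
        raise ValueError("unsupported type name: {}".format(type_name))
    return match.group(1), int(match.group(2))


def promote(left, right):
    """Result type name for +, -, * and // according to the table."""
    if left == right:
        return left  # T, T -> T
    (l_kind, l_bits), (r_kind, r_bits) = _parse(left), _parse(right)
    if l_kind == "float" or r_kind == "float":
        if l_kind == r_kind:
            return "float{}".format(max(l_bits, r_bits))
        return left if l_kind == "float" else right  # floatX, T -> floatX
    if l_kind == r_kind:
        return "{}{}".format(l_kind, max(l_bits, r_bits))
    # Mixed signed and unsigned integers: intX with uintY.
    int_bits, uint_bits = (l_bits, r_bits) if l_kind == "int" else (r_bits, l_bits)
    if int_bits > uint_bits:
        return "int{}".format(int_bits)
    if 2 * uint_bits > 64:
        # Not covered by the table; this sketch does not guess the outcome.
        raise ValueError("no rule in the table for {} and {}".format(left, right))
    return "int{}".format(2 * uint_bits)


def true_division_type(left, right):
    """Result type name for /: float32 for integer inputs, else the table."""
    if _parse(left)[0] != "float" and _parse(right)[0] != "float":
        return "float32"
    return promote(left, right)


print(promote("int8", "uint8"))
print(true_division_type("uint8", "int32"))
```

Because the function handles each pair symmetrically, swapping the arguments gives the same result, which matches the statement that the rules are commutative.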
Using the Pipeline
Let’s create a Pipeline that adds two tensors of type uint8, run it and see the results.
[4]:
def build_and_run(pipe, op_name):
    pipe.build()
    pipe_out = pipe.run()
    l = pipe_out[0].as_array()
    r = pipe_out[1].as_array()
    out = pipe_out[2].as_array()
    print("{} {} {} = {}; \n\twith types {} {} {} -> {}\n".format(l, op_name, r, out, l.dtype, op_name, r.dtype, out.dtype))
pipe = ArithmeticPipeline((lambda x, y: x + y), np.uint8, np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run(pipe, "+")
[[42 8]] + [[9 2]] = [[51 10]];
with types uint8 + uint8 -> uint8
Let’s see how all of the operators behave with different type combinations by generalizing the example above. You can use np_types, or an integer-only subset of it such as np_int_types, in the loops to see all possible type combinations. To reduce the output, we limit ourselves to only a few of them. We also set some additional printing options for numpy to make the output more aligned.
[5]:
np.set_printoptions(precision=2)
[6]:
arithmetic_operations = [((lambda x, y: x + y) , "+"), ((lambda x, y: x - y) , "-"),
                         ((lambda x, y: x * y) , "*"), ((lambda x, y: x / y) , "/"),
                         ((lambda x, y: x // y) , "//")]

bitwise_operations = [((lambda x, y: x | y) , "|"), ((lambda x, y: x & y) , "&"),
                      ((lambda x, y: x ^ y) , "^")]

np_types = [np.int8, np.int16, np.int32, np.int64,
            np.uint8, np.uint16, np.uint32, np.uint64,
            np.float32, np.float64]

for (op, op_name) in arithmetic_operations:
    for left_type in [np.uint8]:
        for right_type in [np.uint8, np.int32, np.float32]:
            pipe = ArithmeticPipeline(op, left_type, right_type, batch_size=batch_size, num_threads=2, device_id = 0)
            build_and_run(pipe, op_name)

for (op, op_name) in bitwise_operations:
    for left_type in [np.uint8]:
        for right_type in [np.uint8, np.int32]:
            pipe = ArithmeticPipeline(op, left_type, right_type, batch_size=batch_size, num_threads=2, device_id = 0)
            build_and_run(pipe, op_name)
[[42 8]] + [[9 2]] = [[51 10]];
with types uint8 + uint8 -> uint8
[[42 8]] + [[9 2]] = [[51 10]];
with types uint8 + int32 -> int32
[[42 8]] + [[9. 2.]] = [[51. 10.]];
with types uint8 + float32 -> float32
[[42 8]] - [[9 2]] = [[33 6]];
with types uint8 - uint8 -> uint8
[[42 8]] - [[9 2]] = [[33 6]];
with types uint8 - int32 -> int32
[[42 8]] - [[9. 2.]] = [[33. 6.]];
with types uint8 - float32 -> float32
[[42 8]] * [[9 2]] = [[122 16]];
with types uint8 * uint8 -> uint8
[[42 8]] * [[9 2]] = [[378 16]];
with types uint8 * int32 -> int32
[[42 8]] * [[9. 2.]] = [[378. 16.]];
with types uint8 * float32 -> float32
[[42 8]] / [[9 2]] = [[4.67 4. ]];
with types uint8 / uint8 -> float32
[[42 8]] / [[9 2]] = [[4.67 4. ]];
with types uint8 / int32 -> float32
[[42 8]] / [[9. 2.]] = [[4.67 4. ]];
with types uint8 / float32 -> float32
[[42 8]] // [[9 2]] = [[4 4]];
with types uint8 // uint8 -> uint8
[[42 8]] // [[9 2]] = [[4 4]];
with types uint8 // int32 -> int32
[[42 8]] // [[9. 2.]] = [[4.67 4. ]];
with types uint8 // float32 -> float32
[[42 8]] | [[9 2]] = [[43 10]];
with types uint8 | uint8 -> uint8
[[42 8]] | [[9 2]] = [[43 10]];
with types uint8 | int32 -> int32
[[42 8]] & [[9 2]] = [[8 0]];
with types uint8 & uint8 -> uint8
[[42 8]] & [[9 2]] = [[8 0]];
with types uint8 & int32 -> int32
[[42 8]] ^ [[9 2]] = [[35 10]];
with types uint8 ^ uint8 -> uint8
[[42 8]] ^ [[9 2]] = [[35 10]];
with types uint8 ^ int32 -> int32
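In the output above, uint8 * uint8 reports 122 where the exact product 42 * 9 is 378. This is consistent with the result being stored in an 8-bit unsigned type, which keeps only the value modulo 256. The following is a minimal plain-Python sketch of that arithmetic; `wrap_unsigned` is a hypothetical helper for illustration, not a DALI function.

```python
def wrap_unsigned(value, bits):
    """Reduce an integer modulo 2**bits, as a fixed-width unsigned type would."""
    return value % (1 << bits)


left = [42, 8]
right = [9, 2]

# Exact products, as a wide enough type such as int32 would hold them.
exact = [a * b for a, b in zip(left, right)]

# The same products reduced to 8 bits, matching the uint8 * uint8 row.
as_uint8 = [wrap_unsigned(value, 8) for value in exact]

print(exact)     # [378, 16]
print(as_uint8)  # [122, 16]
```

When the uint8 operand is combined with an int32 operand instead, the promoted result type is int32 and the product 378 fits without wrapping, as the uint8 * int32 row shows.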
Using Constants
Instead of operating only on tensor data, DALI expressions can also work with constants. Those can be either values of the Python int and float types used directly, or those values wrapped in nvidia.dali.types.Constant. An operation between a tensor and a constant results in the constant being broadcast to all elements of the tensor. The same constant is used with all samples in the batch.
Note: Currently all values of integral constants are passed to DALI as int32 and all values of floating point constants are passed to DALI as float32.
The Python int values will be treated as int32 and the float values as float32 with regard to type promotions.
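The two points above, broadcasting and the default type of plain Python constants, can be sketched without DALI. This is a minimal plain-Python illustration, assuming nested lists stand in for a batch of tensors; `constant_type_name` and `add_constant` are hypothetical helpers, not DALI functions. Python bool constants are not discussed in the text, so the sketch rejects them rather than guessing.

```python
def constant_type_name(value):
    """Type name a plain Python constant is treated as, per the note above."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; the text does not cover it.
        raise TypeError("bool constants are not covered by this sketch")
    if isinstance(value, int):
        return "int32"
    if isinstance(value, float):
        return "float32"
    raise TypeError("unsupported constant: {!r}".format(value))


def add_constant(batch, constant):
    """Add the same constant to every element of every sample in the batch."""
    return [[element + constant for element in sample] for sample in batch]


batch = [[42, 8]]  # one sample, matching batch_size = 1
print(add_constant(batch, 10), constant_type_name(10))
print(constant_type_name(42.3))
```

The constant appears once in the expression but takes part in every element-wise addition, for every sample in the batch.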
The DALI Constant can be used to indicate other types. It accepts DALIDataType enum values as the second argument and has convenience member functions like .uint8() or .float32() that can be used for conversions.
As our expressions will consist of a tensor and a constant, we will adjust our previous pipeline and the helper functions - they only need to generate one tensor.
[7]:
class ArithmeticConstantsPipeline(Pipeline):
    def __init__(self, operation, tensor_data_type, batch_size, num_threads, device_id):
        super(ArithmeticConstantsPipeline, self).__init__(batch_size, num_threads, device_id, seed=12)
        self.tensor_source = ops.ExternalSource()
        self.operation = operation
        self.tensor_data_type = tensor_data_type

    def define_graph(self):
        self.tensor = self.tensor_source()
        return self.tensor, self.operation(self.tensor)

    def iter_setup(self):
        (t, _) = get_data(self.tensor_data_type, self.tensor_data_type)
        self.feed_input(self.tensor, t)

def build_and_run_with_const(pipe, op_name, constant, is_const_left = False):
    pipe.build()
    pipe_out = pipe.run()
    t_in = pipe_out[0].as_array()
    t_out = pipe_out[1].as_array()
    if is_const_left:
        print("{} {} {} = \n{}; \n\twith types {} {} {} -> {}\n".format(constant, op_name, t_in, t_out, type(constant), op_name, t_in.dtype, t_out.dtype))
    else:
        print("{} {} {} = \n{}; \n\twith types {} {} {} -> {}\n".format(t_in, op_name, constant, t_out, t_in.dtype, op_name, type(constant), t_out.dtype))
Now, the ArithmeticConstantsPipeline can be parametrized with a function that takes the single tensor and returns the result of an arithmetic operation between that tensor and a constant.
We also adjusted our print message.
Now we will check all the examples we mentioned at the beginning: int and float constants, and nvidia.dali.types.Constant.
[8]:
constant = 10
pipe = ArithmeticConstantsPipeline((lambda x: x + constant), np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "+", constant)
constant = 10
pipe = ArithmeticConstantsPipeline((lambda x: x + constant), np.float32, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "+", constant)
constant = 42.3
pipe = ArithmeticConstantsPipeline((lambda x: x + constant), np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "+", constant)
constant = 42.3
pipe = ArithmeticConstantsPipeline((lambda x: x + constant), np.float32, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "+", constant)
[[42 8]] + 10 =
[[52 18]];
with types uint8 + <class 'int'> -> int32
[[42. 8.]] + 10 =
[[52. 18.]];
with types float32 + <class 'int'> -> float32
[[42 8]] + 42.3 =
[[84.3 50.3]];
with types uint8 + <class 'float'> -> float32
[[42. 8.]] + 42.3 =
[[84.3 50.3]];
with types float32 + <class 'float'> -> float32
As we can see, the value of the constant is applied to all the elements of the tensor to which it is added.
Now let’s check how to use the DALI Constant wrapper.
Passing an int or float to DALI Constant marks it as int32 or float32, respectively.
[9]:
constant = Constant(10)
pipe = ArithmeticConstantsPipeline((lambda x: x * constant), np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "*", constant)
constant = Constant(10.0)
pipe = ArithmeticConstantsPipeline((lambda x: constant * x), np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "*", constant, True)
[[42 8]] * 10:DALIDataType.INT32 =
[[420 80]];
with types uint8 * <class 'nvidia.dali.types.Constant'> -> int32
10.0:DALIDataType.FLOAT * [[42 8]] =
[[420. 80.]];
with types <class 'nvidia.dali.types.Constant'> * uint8 -> float32
We can either explicitly specify the type as a second argument, or use convenience conversion member functions.
[10]:
constant = Constant(10, types.DALIDataType.UINT8)
pipe = ArithmeticConstantsPipeline((lambda x: x * constant), np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "*", constant)
constant = Constant(10.0, types.DALIDataType.UINT8)
pipe = ArithmeticConstantsPipeline((lambda x: constant * x), np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "*", constant, True)
constant = Constant(10).uint8()
pipe = ArithmeticConstantsPipeline((lambda x: constant * x), np.uint8, batch_size = batch_size, num_threads = 2, device_id = 0)
build_and_run_with_const(pipe, "*", constant, True)
[[42 8]] * 10:DALIDataType.UINT8 =
[[164 80]];
with types uint8 * <class 'nvidia.dali.types.Constant'> -> uint8
10:DALIDataType.UINT8 * [[42 8]] =
[[164 80]];
with types <class 'nvidia.dali.types.Constant'> * uint8 -> uint8
10:DALIDataType.UINT8 * [[42 8]] =
[[164 80]];
with types <class 'nvidia.dali.types.Constant'> * uint8 -> uint8
Treating tensors as scalars
If one of the tensors is considered a scalar input, the same rules apply.