NVIDIA Clara Train 3.1

ai4med.operations package

compute_accuracy(predictions, labels, use_sigmoid=False)

Compare prediction results with labels and compute accuracy.

Parameters
  • predictions (Tensor) – output of model layers

  • labels (Tensor) – ground truth of test data

  • use_sigmoid (bool) – whether to apply a sigmoid layer to the predictions (default: False)

Returns

dict containing the accuracy tag and value

Return type

output_dict

PRelu(x, name='prelu_layer')

Parametric ReLU (the same alpha is shared by all channels).

For more details, check https://arxiv.org/abs/1502.01852

Parameters
  • x (Tensor) – input tensor

  • name (str) – (optional) name of scope

Returns

tensor result, same size as input

Return type

x
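As a rough illustration of the activation itself, the sketch below applies a parametric ReLU to plain Python floats. It is not the ai4med implementation: `prelu_scalar` is a hypothetical helper and `alpha` is a fixed constant here, whereas in the real layer alpha would be a trainable variable.

```python
def prelu_scalar(x, alpha):
    """PReLU for a single value: x if x > 0, otherwise alpha * x."""
    return max(0.0, x) + alpha * min(0.0, x)

# One alpha shared by every channel, as in the description above.
# The value 0.25 is only an illustrative constant.
alpha = 0.25
activations = [prelu_scalar(v, alpha) for v in [-2.0, -0.5, 0.0, 1.5]]
print(activations)  # [-0.5, -0.125, 0.0, 1.5]
```

Positive inputs pass through unchanged, while negative inputs are scaled by alpha instead of being clipped to zero.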

bias_add(x, data_format)

Create and add a bias (trainable) variable for each input channel.

Equivalent to tf.nn.bias_add in the 2D case. In the 3D case, the input is reshaped to the 2D case, the bias is added, and the result is reshaped back, which is faster in TF.

Parameters
  • x (Tensor) – input tensor (4D for 2D spatial case) or (5D for 3D spatial case)

  • data_format (str) – channels_first or channels_last

Returns

tensor result, same size as input

Return type

x

process_multiclass_preds(predictions, label_format, use_sigmoid_for_binary=True, use_softmax_for_multiclass=True)

Process multiple-class predictions into a 1-class output format.

Parameters
  • predictions (Tensor) – output of model layers

  • label_format – format of the labels

  • use_sigmoid_for_binary (bool) – use sigmoid for binary classification or not

  • use_softmax_for_multiclass (bool) – use softmax for multiple-class classification or not

Returns

predictions with expected format

Return type

preds

process_multiclass_preds_to_label_format(predictions, label_format, use_sigmoid_for_binary=True, use_softmax_for_multiclass=True)

Process multiple-class predictions into the same format as the label.

Parameters
  • predictions (Tensor) – output of model layers

  • label_format – format of the labels

  • use_sigmoid_for_binary (bool) – use sigmoid for binary classification or not

  • use_softmax_for_multiclass (bool) – use softmax for multiple-class classification or not

Returns

predictions with expected format

Return type

preds

pull_binary_values(tensor, label_format: ai4med.common.label_format.LabelFormatInfo)

conv(x, filters, kernel_size, data_format, strides=1, dilation_rate=1, use_bias=True, use_wscale=False, padding='SAME', use_fixed_padding=False)

3D/2D convolution.

Similar to the standard tf.layers.conv3d and tf.layers.conv2d, but allows additional options to be specified, such as: a) fixed_padding, b) use_wscale (from Progressive GANs).

Parameters
  • x (Tensor) – input tensor (4D for 2D spatial case) or (5D for 3D spatial case)

  • filters (int) – number of convolution filters

  • kernel_size (tuple) – size of convolution filter kernel

  • data_format (str) – channels_first or channels_last

  • strides (int) – stride of the convolution

  • dilation_rate (int) – dilation convolution rate

  • use_bias (bool) – add bias to filters or not

  • use_wscale (bool) – variable is initialized to 0 mean 1 std or not

  • padding (str) – padding mode (default: 'SAME')

  • use_fixed_padding (bool) – pad the input with zeros along the spatial dimensions or not

Returns

tensor result of convolution computation

Return type

x

conv_transpose(x, filters, kernel_size, data_format, strides=1, use_bias=False, use_wscale=False, padding='SAME')

3D/2D transpose convolution.

Parameters
  • x (Tensor) – input tensor (4D for 2D spatial case) or (5D for 3D spatial case)

  • filters (int) – number of convolution filters

  • kernel_size (tuple) – size of convolution filter kernel

  • data_format (str) – channels_first or channels_last

  • strides (int) – stride of the transpose convolution

  • use_bias (bool) – add bias to filters or not

  • use_wscale (bool) – variable is initialized to 0 mean 1 std or not

  • padding (str) – padding mode (default: 'SAME')

Returns

tensor result of transpose convolution computation

Return type

x

fixed_padding(inputs, kernel_size, data_format, dilation_rate=1, padding_mode='CONSTANT')

Explicitly pads the input with zeros along the spatial dimensions, independently of image size, for strides > 1.

This makes it possible to use a VALID convolution afterwards, instead of a SAME convolution with implicit padding. With SAME convolutions, TF pads the input tensors automatically, but the padding depends on whether the dimensions are even or odd, and on the strides and dilation rate, somewhat inconsistently. See https://github.com/tensorflow/tensorflow/issues/18213, https://stackoverflow.com/questions/47745397/why-use-fixed-padding-when-building-resnet-model-in-tensorflow and https://www.tensorflow.org/api_guides/python/nn#Convolution

For that reason, many of TF's own examples use explicit fixed padding, e.g. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/slim/python/slim/nets/resnet_utils.py

Parameters
  • inputs (Tensor) – input tensor of size [batch, channels, height_in, width_in] or [batch, height_in, width_in, channels] depending on data_format

  • kernel_size (tuple) – the kernel size to be used in the conv2d or max_pool2d operation; should be a positive integer.

  • data_format (str) – the input format (‘channels_last’ or ‘channels_first’)

  • dilation_rate (int) – dilation convolution rate

  • padding_mode (str) – “CONSTANT”, “REFLECT”, or “SYMMETRIC” (case-insensitive)

Returns

tensor result with added padding

Return type

padded_inputs
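The difference between implicit SAME padding and explicit fixed padding can be seen from the padding arithmetic alone. The sketch below is an illustration under assumptions, not the ai4med code: `same_padding` follows the SAME padding formula documented for TensorFlow convolutions, and `fixed_padding_amounts` follows the convention used in TF's ResNet examples (pad by the effective kernel size minus one, split as evenly as possible). Both function names are hypothetical.

```python
import math

def same_padding(in_size, kernel_size, stride):
    """(pad_before, pad_after) of implicit SAME padding along one axis."""
    out_size = math.ceil(in_size / stride)
    pad_total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    pad_before = pad_total // 2
    return pad_before, pad_total - pad_before

def fixed_padding_amounts(kernel_size, dilation_rate=1):
    """(pad_before, pad_after) that depends only on kernel size and dilation."""
    effective_kernel = kernel_size + (kernel_size - 1) * (dilation_rate - 1)
    pad_total = effective_kernel - 1
    pad_before = pad_total // 2
    return pad_before, pad_total - pad_before

# SAME padding changes with the input size when stride > 1 ...
print(same_padding(224, 3, 2))  # (0, 1)
print(same_padding(225, 3, 2))  # (1, 1)
# ... while fixed padding is the same for every input size.
print(fixed_padding_amounts(3))                   # (1, 1)
print(fixed_padding_amounts(3, dilation_rate=2))  # (2, 2)
```

With a 3-wide kernel and stride 2, an even-sized input is padded asymmetrically by SAME while an odd-sized one is padded symmetrically; padding explicitly and then running a VALID convolution removes that dependence.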

dice_metric(predictions, targets, data_format='channels_first', skip_background=False, is_onehot_targets=False, is_independent_predictions=False, jaccard=False, threshold=0.5)

Compute average Dice between two 5D tensors (for 3D images) or 4D tensors (for 2D images).

Parameters
  • predictions (Tensor) – predicted segmentation output. Several layouts are allowed:
      1) Nx1xHxWxD: the output segmentation has only 1 channel (e.g. only foreground prediction).
      2) NxCxHxWxD: the output segmentation has several channels.
         If is_independent_predictions==False (default), the channels are assumed to be competing probabilities (e.g. after softmax, or logits), and argmax over the channel dimension is evaluated first. This is the case for non-overlapping classes, e.g. liver, kidneys, prostate.
         If is_independent_predictions==True, the channels are assumed to be independent segmentation probabilities (e.g. several channels after sigmoid), and the average Dice for each channel is computed, with the foreground thresholded using the threshold parameter (default: 0.5). This is the case when segmenting several nested classes, e.g. 3 overlapping tumor subregions as 3 output channels. To enable it, use the “metric_dice”: {“is_independent_predictions”: true} JSON field in your net config.

  • targets (Tensor) – true segmentation values. Usually has 1 channel dimension (e.g. Nx1xHxWxD), where each element is an index indicating class label. Alternatively it can be a one-hot-encoded tensor of the shape NxCxHxWxD, where each channel is binary (or float in interval 0..1) indicating the probability of the corresponding class label (in this case you must set is_onehot_targets = True)

  • data_format (str) – the input format (‘channels_last’ or ‘channels_first’)

  • skip_background (bool) – skip Dice computation on the first channel of the predicted output (basically skipping Dice on the background class) or not

  • is_onehot_targets (bool) – targets in One-Hot format or not

  • is_independent_predictions (bool) – prediction results are independent or not

  • jaccard (bool) – compute the Jaccard index (a.k.a. IoU) instead of Dice or not

  • threshold (float) – threshold to convert independent_predictions probabilities into binary

Returns

N float values (for each class separately) and a tensor of N bools (with True for each valid Dice calculation, where the true class was present)

Return type

output_dict
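For a single channel, the Dice and Jaccard scores reduce to a few sums over binarized voxels. The sketch below is a minimal pure-Python illustration, not the ai4med implementation. It assumes flattened inputs, a strict `>` comparison against the threshold, and a NaN result when the class is absent from the targets; `dice_binary` is a hypothetical name.

```python
import math

def dice_binary(pred_probs, target, threshold=0.5, jaccard=False):
    """Dice (or Jaccard) score for one class over flattened voxels."""
    pred = [1 if p > threshold else 0 for p in pred_probs]
    intersection = sum(p * t for p, t in zip(pred, target))
    pred_sum, target_sum = sum(pred), sum(target)
    if target_sum == 0:
        # Assumption: the score is undefined when the true class is absent.
        return math.nan
    if jaccard:
        return intersection / (pred_sum + target_sum - intersection)
    return 2.0 * intersection / (pred_sum + target_sum)

probs = [0.9, 0.8, 0.2, 0.6, 0.1, 0.4]  # e.g. one sigmoid output channel
truth = [1, 1, 1, 0, 0, 0]
print(dice_binary(probs, truth))                # 2 * 2 / (3 + 3)
print(dice_binary(probs, truth, jaccard=True))  # 2 / (3 + 3 - 2) = 0.5
```

Here two of the three predicted foreground voxels overlap the three true foreground voxels, so Dice is 4/6 and Jaccard is 2/4.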

reduce_mean_masked(tensor, tensor_mask, axis=None)

Compute mean value on masked ground truth.

Parameters
  • tensor (Tensor) – input data

  • tensor_mask (Tensor) – mask for expected ground truth

  • axis (int) – axis along which the sum and mean are computed

Returns

reduced results
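One plausible reading of a masked mean is "average only the entries that the mask marks as valid". The sketch below shows that idea on flat Python lists. It is an assumption about the behavior, not the ai4med code; `masked_mean` is a hypothetical helper, and returning NaN for an all-false mask is a choice made here.

```python
def masked_mean(values, mask):
    """Mean of `values` over the positions where `mask` is truthy."""
    kept = [v for v, m in zip(values, mask) if m]
    if not kept:
        # Assumption: nothing to average, so the result is undefined.
        return float("nan")
    return sum(kept) / len(kept)

per_class_dice = [0.8, 0.0, 0.6]
valid = [True, False, True]  # the second class had no ground truth
print(masked_mean(per_class_dice, valid))  # averages 0.8 and 0.6 only
```

Masking this way keeps classes that are absent from the ground truth from dragging the average toward zero.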

dice_metric_masked_output(predictions, targets, tags, data_format='channels_first', skip_background=False, is_onehot_targets=False, is_independent_predictions=False, jaccard=False, threshold=0.5)

Outputs a dict of Dice values. The first element in the output dict is the average value across all classes and, if there is more than one class, the remaining elements are per-class values. If there are no ground-truth labels for a given class, its value is nan.

Parameters
  • predictions (Tensor) – output of model layers

  • targets (Tensor) – ground truth of test data

  • tags (list, str) – key names of generated outputs

  • data_format (str) – the input format (‘channels_last’ or ‘channels_first’)

  • skip_background (bool) – remove background class or not

  • is_onehot_targets (bool) – targets in One-Hot format or not

  • is_independent_predictions (bool) – prediction results are independent or not

  • jaccard (bool) – compute the Jaccard index (a.k.a. IoU) instead of Dice or not

  • threshold (float) – threshold to convert independent_predictions probabilities into binary

Returns

dict containing the value for each class

Return type

output_dict
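The layout of the returned dict (average first, then per-class values, NaN for missing classes) can be sketched as follows. This is an illustration under assumptions, not the library code: `dice_output_dict` and the tag names are hypothetical, and the average is assumed to skip NaN entries.

```python
import math

def dice_output_dict(per_class, tags):
    """Pair tags with [average, class_0, class_1, ...]."""
    present = [v for v in per_class if not math.isnan(v)]
    average = sum(present) / len(present) if present else math.nan
    # Per-class entries are only added when there is more than one class.
    values = [average] + (list(per_class) if len(per_class) > 1 else [])
    return dict(zip(tags, values))

# The second class has no ground-truth labels, so its Dice is NaN.
out = dice_output_dict(
    [0.5, math.nan, 1.0],
    ["mean_dice", "dice_0", "dice_1", "dice_2"],  # hypothetical tag names
)
print(out)  # {'mean_dice': 0.75, 'dice_0': 0.5, 'dice_1': nan, 'dice_2': 1.0}
```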

ChannelDropout(x, rate, data_format, seed=None, training=False, name=None)

Similar to conventional dropout, but drops/zeros some voxels across all channels.

Parameters
  • x (Tensor) – input data (N-D, any dimensions, e.g. 5D for 3D images)

  • rate (float) – probability of dropping each channel [0..1]

  • data_format (str) – channels_first or channels_last

  • seed (float) – optional random seed

  • training (bool) – whether the model is in training mode (no dropout is applied otherwise)

  • name (str) – optional operation name

Returns

same size as input, with some voxels zeroed out across all channels; the remaining voxels are rescaled

Return type

x (Tensor)

SpatialDropout(x, rate, data_format, seed=None, training=False, name=None)

Similar to conventional dropout, but drops/zeros entire channels.

Parameters
  • x (Tensor) – input data (N-D, any dimensions, e.g. 5D for 3D images)

  • rate (float) – probability of dropping each channel [0..1]

  • data_format (str) – channels_first or channels_last

  • seed (float) – optional random seed

  • training (bool) – whether the model is in training mode (no dropout is applied otherwise)

  • name (str) – optional operation name

Returns

same size as input, with some channels zeroed out, the remaining channels are rescaled

Return type

x (Tensor)
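The difference between ChannelDropout and SpatialDropout comes down to the shape of the random keep mask. The sketch below illustrates this on a single sample stored as a `[channels][voxels]` nested list. It is not the ai4med implementation: both function names are hypothetical, `rate` must be below 1, and the kept values are rescaled by 1 / (1 - rate) as in conventional dropout.

```python
import random

def spatial_dropout_sketch(x, rate, rng):
    """Zero whole channels; rescale the channels that are kept."""
    keep_scale = 1.0 / (1.0 - rate)
    out = []
    for channel in x:
        keep = rng.random() >= rate  # one random draw per channel
        out.append([v * keep_scale if keep else 0.0 for v in channel])
    return out

def channel_dropout_sketch(x, rate, rng):
    """Zero a voxel in every channel at once; rescale the voxels that are kept."""
    keep_scale = 1.0 / (1.0 - rate)
    n_voxels = len(x[0])
    keep = [rng.random() >= rate for _ in range(n_voxels)]  # one draw per voxel
    return [[v * keep_scale if k else 0.0 for v, k in zip(channel, keep)]
            for channel in x]

rng = random.Random(0)
x = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]  # 2 channels, 4 voxels
print(spatial_dropout_sketch(x, 0.5, rng))
print(channel_dropout_sketch(x, 0.5, rng))
```

In the first output each row is either all zeros or fully kept; in the second, zeros line up in the same columns of every row.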

channel_dropout(x, rate, data_format, seed=None, training=False, name=None)

Similar to conventional dropout, but drops/zeros some voxels across all channels.

Parameters
  • x (Tensor) – input data (N-D, any dimensions, e.g. 5D for 3D images)

  • rate (float) – probability of dropping each channel [0..1]

  • data_format (str) – channels_first or channels_last

  • seed (float) – optional random seed

  • training (bool) – whether the model is in training mode (no dropout is applied otherwise)

  • name (str) – optional operation name

Returns

same size as input, with some voxels zeroed out across all channels; the remaining voxels are rescaled

Return type

x (Tensor)

spatial_dropout(x, rate, data_format, seed=None, training=False, name=None)

Similar to conventional dropout, but drops/zeros entire channels.

Parameters
  • x (Tensor) – input data (N-D, any dimensions, e.g. 5D for 3D images)

  • rate (float) – probability of dropping each channel [0..1]

  • data_format (str) – channels_first or channels_last

  • seed (float) – optional random seed

  • training (bool) – whether the model is in training mode (no dropout is applied otherwise)

  • name (str) – optional operation name

Returns

same size as input, with some channels zeroed out, the remaining channels are rescaled

Return type

x (Tensor)

he_normal(seed=None, dtype=tf.float32)

He normal initializer.

It draws samples from a truncated normal distribution centered on 0 with ‘stddev = sqrt(2 / fan_in)’ where ‘fan_in’ is the number of input units in the weight tensor. For more details, check He et al., http://arxiv.org/abs/1502.01852

Parameters
  • seed (float) – optional random seed, used to seed the random generator

  • dtype (tf.dtype) – indicate input data type

Returns

initializer for weights
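The standard deviation formula can be checked with a few lines of plain Python. The sketch below is only an illustration of the sampling scheme described above, not the initializer itself: the helper names are hypothetical, and truncation is assumed to mean redrawing any sample that falls more than two standard deviations from zero.

```python
import math
import random

def he_stddev(fan_in):
    """stddev = sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)

def truncated_normal(stddev, rng):
    """Redraw until the sample lies within two standard deviations of 0."""
    while True:
        sample = rng.gauss(0.0, stddev)
        if abs(sample) <= 2.0 * stddev:
            return sample

# A 3x3x3 kernel over 16 input channels: fan_in = 3 * 3 * 3 * 16 = 432.
fan_in = 3 * 3 * 3 * 16
stddev = he_stddev(fan_in)
rng = random.Random(0)
weights = [truncated_normal(stddev, rng) for _ in range(5)]
print(round(stddev, 4))  # 0.068
```

Larger kernels or more input channels increase fan_in and therefore shrink the initial weights.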

group_norm(x, data_format, G=8, gamma_init=1.0, scope='group_norm_scope')

Group Normalization.

For more details, check https://arxiv.org/abs/1803.08494

Parameters
  • x (Tensor) – input tensor

  • data_format (str) – channels_first or channels_last

  • G (int) – number of groups

  • gamma_init (float) – value to initialize gamma factor

  • scope (str) – name of group normalization scope

Returns

tensor result of group normalization computation
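To make the grouping concrete, the sketch below normalizes one sample stored as a `[channels][voxels]` nested list, computing a mean and variance per group of channels. It is a simplified illustration, not the ai4med code: `group_norm_sketch`, the scalar `gamma` and `beta`, and the `eps` value are assumptions made for the example.

```python
import math

def group_norm_sketch(x, num_groups, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each group of channels to zero mean and unit variance."""
    num_channels = len(x)
    assert num_channels % num_groups == 0, "channels must divide evenly into groups"
    per_group = num_channels // num_groups
    out = []
    for g in range(num_groups):
        group = x[g * per_group:(g + 1) * per_group]
        flat = [v for channel in group for v in channel]
        mean = sum(flat) / len(flat)
        var = sum((v - mean) ** 2 for v in flat) / len(flat)
        inv_std = 1.0 / math.sqrt(var + eps)
        for channel in group:
            out.append([gamma * (v - mean) * inv_std + beta for v in channel])
    return out

# 4 channels split into 2 groups; the second group is 10x the first.
x = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
y = group_norm_sketch(x, num_groups=2)
print(y)
```

Because the statistics are computed per sample and per group, both groups end up with nearly identical normalized values despite the 10x difference in scale, and the result does not depend on batch size.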

UpsampleLinear(x, data_format, upsample_factor=2, trainable=False)

Upsample/Upsize tensor several times using linear interpolation.

This is not strictly linear interpolation, but a smooth upsampling similar to linear, which is easier to implement in TF. Supports 2D and 3D spatial dimensions only (that is, 4D or 5D input tensors only).

Parameters
  • x (Tensor) – input data, NxCxDxHxW or NxDxHxWxC for 3D, or NxCxHxW or NxHxWxC for 2D

  • data_format (str) – channels_first or channels_last

  • upsample_factor (int) – upsample factor for spatial dimensions, must be >=1

  • trainable (bool) – whether the factors are trainable

Returns

tensor result with upsampled spatial dimensions

UpsampleRepeat(x, data_format, upsample_factor=2)

Upsample/Upsize tensor several times using nearest neighbors interpolation (element repeat).

Supports N-D tensors (2D, 3D or any other dimensions)

Parameters
  • x (Tensor) – input data, NxCxDxHxW or NxDxHxWxC for 3D, or NxCxHxW or NxHxWxC for 2D

  • data_format (str) – channels_first or channels_last

  • upsample_factor (int) – upsample factor for spatial dimensions, must be >=1

Returns

tensor result of upsampling computation

tf_repeat(tensor, repeats, allowFirstDimBatchNone=False)

Repeat tensor elements several times along each axis.

Adapted from https://github.com/tensorflow/tensorflow/issues/8246.

Parameters
  • tensor (Tensor) – input data, N-D (any dimensional)

  • repeats (list) – number of repeats for each dimension; its length must equal the number of dimensions of the input

  • allowFirstDimBatchNone (bool) – whether to allow the first (batch) dimension to be None

Returns

tensor result with same type as input with shape of tensor.shape * repeats
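The element-wise repeat semantics can be illustrated on nested Python lists. The sketch below is a hypothetical stand-in (`repeat_nested`), not the TF-based function documented above, but it follows the same contract: one repeat count per dimension, and an output shape of tensor.shape * repeats.

```python
def repeat_nested(data, repeats):
    """Repeat every element of a nested list repeats[d] times along axis d."""
    if not repeats:
        return data
    first, rest = repeats[0], repeats[1:]
    out = []
    for item in data:
        repeated_item = repeat_nested(item, rest)
        # Note: repeated rows share the same list object; copy before mutating.
        out.extend([repeated_item] * first)
    return out

grid = [[1, 2], [3, 4]]
print(repeat_nested(grid, [2, 2]))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Each element becomes a 2x2 block, which is exactly nearest-neighbor upsampling by a factor of 2.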

upsample_nearest(x, data_format, upsample_factor=2)

Upsample/Upsize tensor several times using nearest neighbors interpolation (element repeat).

Supports N-D tensors (2D, 3D or any other dimensions)

Parameters
  • x (Tensor) – input data, NxCxDxHxW or NxDxHxWxC for 3D, or NxCxHxW or NxHxWxC for 2D

  • data_format (str) – channels_first or channels_last

  • upsample_factor (int) – upsample factor for spatial dimensions, must be >=1

Returns

tensor result of upsampling computation
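Nearest-neighbor upsampling by element repeat only needs a per-axis repeat count: 1 for the batch and channel axes and upsample_factor for every spatial axis. The sketch below shows how such a list could be derived from data_format. It is an assumption about the approach, and both helper names are hypothetical.

```python
def upsample_repeats(ndim, data_format, upsample_factor=2):
    """Per-axis repeat counts for a tensor of rank `ndim`."""
    repeats = [upsample_factor] * ndim
    repeats[0] = 1  # batch axis is never repeated
    channel_axis = 1 if data_format == "channels_first" else ndim - 1
    repeats[channel_axis] = 1  # channel axis is never repeated
    return repeats

def upsampled_shape(shape, data_format, upsample_factor=2):
    """Output shape after repeating each axis by its repeat count."""
    repeats = upsample_repeats(len(shape), data_format, upsample_factor)
    return [size * r for size, r in zip(shape, repeats)]

print(upsample_repeats(5, "channels_first"))             # [1, 1, 2, 2, 2]
print(upsampled_shape([4, 32, 32, 8], "channels_last"))  # [4, 64, 64, 8]
```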

upsample_semilinear(x, data_format, upsample_factor=2, trainable=False)

Upsample/Upsize tensor several times using linear interpolation.

This is not strictly linear interpolation, but a smooth upsampling similar to linear, which is easier to implement in TF. Supports 2D and 3D spatial dimensions only (that is, 4D or 5D input tensors only).

Parameters
  • x (Tensor) – input data, NxCxDxHxW or NxDxHxWxC for 3D, or NxCxHxW or NxHxWxC for 2D

  • data_format (str) – channels_first or channels_last

  • upsample_factor (int) – upsample factor for spatial dimensions, must be >=1

  • trainable (bool) – whether the factors are trainable

Returns

tensor result with upsampled spatial dimensions

squeeze_excitation(x, data_format, ratio=16)

Squeeze and add excitation layers

Parameters
  • x (Tensor) – input data, support both 3D and 2D

  • data_format (str) – channels_first or channels_last

  • ratio (int) – ratio to squeeze the channels

Returns

tensor result of squeezed data with excitation
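The usual squeeze-and-excitation pattern is: average each channel to one number (squeeze), pass those numbers through a small two-layer bottleneck ending in a sigmoid (excitation), and multiply each channel by its gate. The sketch below follows that general pattern on a `[channels][voxels]` nested list. It is an assumption about the structure, not the ai4med layer: the function name and explicit weight matrices are hypothetical, biases are omitted, and `ratio` is taken to set the bottleneck width C // ratio.

```python
import math

def squeeze_excitation_sketch(x, w_reduce, w_expand):
    """x: [C][voxels]; w_reduce: [C // ratio][C]; w_expand: [C][C // ratio]."""
    # Squeeze: global average pooling gives one value per channel.
    squeezed = [sum(channel) / len(channel) for channel in x]
    # Excitation: reduce with ReLU, then expand with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale every channel by its gate.
    return [[v * g for v in channel] for channel, g in zip(x, gates)]

# 4 channels with ratio 2, so the bottleneck has 2 units.
x = [[1.0, 3.0], [2.0, 2.0], [0.0, 4.0], [5.0, 5.0]]
w_reduce = [[0.0] * 4 for _ in range(2)]
w_expand = [[0.0] * 2 for _ in range(4)]
# With all-zero weights every gate is sigmoid(0) = 0.5.
print(squeeze_excitation_sketch(x, w_reduce, w_expand))
# [[0.5, 1.5], [1.0, 1.0], [0.0, 2.0], [2.5, 2.5]]
```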

channels_axis(data_format)
get_len(x)
get_number_of_channels(x, data_format)
get_shape(x)
is_3d(x)

Parameters
  • x – input tensor

Returns

True if the input tensor is a 3D volume (w/ batch and feature)

The input tensor is expected to have shape [Batch_size, Image/Volume_Dim, Feature_Dim], where Image/Volume_Dim has 2 dimensions (Height, Width) for an image or 3 for a volume. Therefore, when the tensor rank is 1+2+1, x is an image tensor.
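The rank rule above, together with the data_format convention used throughout this package, can be sketched with plain shape lists. These helpers are hypothetical stand-ins operating on Python lists, not the documented functions, and the returned channel axis values are an assumption.

```python
def channels_axis_sketch(data_format):
    """Axis holding the channels: 1 for channels_first, last axis otherwise."""
    return 1 if data_format == "channels_first" else -1

def is_3d_sketch(shape):
    """True for a rank-5 shape: batch + three spatial dimensions + features."""
    return len(shape) == 5

print(is_3d_sketch([2, 64, 64, 64, 1]))  # True  (1 + 3 + 1: volume)
print(is_3d_sketch([2, 64, 64, 1]))      # False (1 + 2 + 1: image)
print([2, 64, 64, 1][channels_axis_sketch("channels_last")])  # 1
```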

is_channels_first(data_format)
© Copyright 2020, NVIDIA. Last updated on Feb 2, 2023.