ai4med.utils package


Processes label_format, which specifies the number of classes in each label as an int array. Returns the number of binary and multi-class labels, the starting index of each label in the network output, and the number of classes for each multi-class label.

Note: If we have 5 binary labels plus 4 multi-class labels of 6 classes each, the total number of outputs the network must produce is 5 + 4 * 6 = 29, so we must record where in the network output each label begins.
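The bookkeeping described in the note could be sketched as follows. This is a hypothetical helper, not the SDK's actual implementation; label_format is assumed to use 1 for a binary label and n > 1 for an n-class label:

```python
def process_label_format(label_format):
    """Sketch: split a label_format int array into binary and multi-class
    labels, recording each label's starting index in the network output."""
    binary_starts, multi_starts, multi_num_classes = [], [], []
    offset = 0
    for num_classes in label_format:
        if num_classes == 1:
            # Binary label: contributes one output unit.
            binary_starts.append(offset)
            offset += 1
        else:
            # Multi-class label: contributes num_classes output units.
            multi_starts.append(offset)
            multi_num_classes.append(num_classes)
            offset += num_classes
    return {
        "num_binary": len(binary_starts),
        "num_multi": len(multi_starts),
        "binary_start_indices": binary_starts,
        "multi_start_indices": multi_starts,
        "multi_num_classes": multi_num_classes,
        "total_outputs": offset,
    }
```

For the example in the note, `process_label_format([1] * 5 + [6] * 4)` yields 5 binary labels, 4 multi-class labels, and 29 total outputs.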

customized_rounding(val, first_decimal_digit=None, rounding_precision=3)

Customized rounding operation

  • val – a positive floating-point number

  • first_decimal_digit – position of the first non-zero decimal digit to consider; if val is greater than or equal to one, first_decimal_digit is zero (the first digit above the decimal point)

  • rounding_precision – number of digits to keep after first_decimal_digit (including first_decimal_digit)


Returns the floating-point number after customized rounding
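One plausible reading of this behavior, sketched as a standalone function (an assumption about the semantics, not the SDK's exact implementation):

```python
import math

def customized_rounding(val, first_decimal_digit=None, rounding_precision=3):
    """Sketch: keep rounding_precision digits starting at the first
    significant decimal digit (assumed semantics, not the SDK's code)."""
    if val == 0:
        return 0.0
    if first_decimal_digit is None:
        # 0 for val >= 1, 1 for 0.1 <= val < 1, 2 for 0.01 <= val < 0.1, ...
        first_decimal_digit = max(0, -int(math.floor(math.log10(val))))
    return round(val, first_decimal_digit + rounding_precision - 1)
```

Under this reading, `customized_rounding(0.004567)` keeps three significant digits starting at the third decimal place and returns 0.00457.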

create_inference_file_list(data_file, task='segmentation', allowed_formats=('.nii', '.nii.gz', '.png', '.npy'), base_dir=None, input_key='image', data_list_key='validation')

Returns a list of files for inference, based on the given path
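A simplified sketch of what such a helper might do for the directory case (hypothetical; the real function also handles task-specific data lists via input_key and data_list_key):

```python
import os

def create_inference_file_list(data_file,
                               allowed_formats=('.nii', '.nii.gz', '.png', '.npy')):
    """Sketch: collect inference files from a single file path or a
    directory, filtered by the allowed extensions."""
    if os.path.isfile(data_file):
        return [data_file]
    # Directory case: keep only files whose extension is allowed.
    return sorted(
        os.path.join(data_file, name)
        for name in os.listdir(data_file)
        if name.endswith(tuple(allowed_formats))
    )
```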

load_segmentation_dataset_paths(json_path, base_dir=None)

Load image and label paths from json file

parse_data_list_file(task, data_list_key, data_list_file_path, base_dir=None)

Load image properties and get image/label paths from json file

The JSON file has the same structure as standard dataset.json files

  • task – Task to perform

  • data_list_key – The key whose value is the list of data dictionaries to use

  • data_list_file_path – The path to the json file

  • base_dir – The base directory of the dataset

Returns a list of data items, each of which is a dict keyed by element names
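A minimal sketch of the parsing described above, assuming the conventional dataset.json layout (a hypothetical simplification; the actual function also handles the task argument and image-property loading):

```python
import json
import os

def parse_data_list(data_list_key, data_list_file_path, base_dir=None):
    """Sketch: read the JSON file, pick the requested list of data dicts,
    and resolve relative paths against base_dir."""
    with open(data_list_file_path) as f:
        data = json.load(f)
    items = data[data_list_key]
    if base_dir is not None:
        for item in items:
            for key, value in item.items():
                # Only string-valued entries are treated as paths here.
                if isinstance(value, str):
                    item[key] = os.path.join(base_dir, value)
    return items
```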

resolve_data_list_file_paths(task, data_list_key, data_list_file_path, base_dir=None)

Helper function to create a standard config for tf.Session

get_tensor_by_node_name(graph, name)
get_tensor_by_node_name_or_none(graph, name)

Given a frozen_graph path, returns a tf.Graph

Returns: a graph.


Loads a GraphDef from a path

mask_by_brats_class(img_data: numpy.ndarray, class_name, dtype)

Get a channel mask for a BraTS class.

  • img_data – Input data to generate the mask from.

  • class_name – One of the BraTS class names.

  • dtype – Type of the required output.


Channel mask converted from the data using the BraTS class.

mask_by_class_index(img_data: numpy.ndarray, idx_number, dtype)

Get channel mask using class index.

  • img_data – Input image data to generate the mask from.

  • idx_number – Class index number.

  • dtype – Required output type.


Channel mask converted using provided class index.
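Both mask helpers can be sketched in a few lines of NumPy. These are hypothetical implementations; the BraTS label composition below (whole tumor = {1, 2, 4}, tumor core = {1, 4}, enhancing tumor = {4}) is the common convention and an assumption about the SDK's class names:

```python
import numpy as np

# Assumed BraTS class-name-to-label mapping (conventional, not verified
# against the SDK's exact names).
BRATS_CLASSES = {
    "whole_tumor": (1, 2, 4),
    "tumor_core": (1, 4),
    "enhancing_tumor": (4,),
}

def mask_by_class_index(img_data, idx_number, dtype):
    # 1 where the label equals the class index, 0 elsewhere.
    return (img_data == idx_number).astype(dtype)

def mask_by_brats_class(img_data, class_name, dtype):
    # 1 where the label belongs to the BraTS class, 0 elsewhere.
    return np.isin(img_data, BRATS_CLASSES[class_name]).astype(dtype)
```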

class MetricAccumulator

Bases: object

Packages metric updates, initializers, and computation routines into a dictionary that the fitter function can use to make summary operations for TensorBoard.



metrics values

get(name, default=None)

Get a metric value from the dict with key name

  • name (str) – key name

  • default (str) – default value if the key is not found


Value of the specified key


Reset metrics values to None





Update metrics values from dict


metrics_dict (dict) – new metrics values


metrics values

Helper functions to execute multi-GPU training. Adapted from Sebastian Schoner's blog post and from Vahid Kazemi's Effective TensorFlow GitHub repository; portions therein were adapted from other contributors, who are also attributed.

UPDATE: since we now use hvd (Horovod), all non-hvd components should be staged for deletion

assign_to_device(device, ps_device)

Returns a function to place variables on the ps_device.

  • device – Device for everything but variables

  • ps_device – Device to put the variables on. Example values are /GPU:0 and /CPU:0. If ps_device is not set then the variables will be placed on the default device. The best device for shared variables depends on the platform as well as the model. Start with CPU:0 and then test GPU:0 to see if there is an improvement.

bcast(var, root=0)

Broadcasts the root process's variable to all process variables. Just a wrapper over standard MPI calls


Returns a list of the identifiers of all visible GPUs.


Returns True if multi_gpu else False.


Utility function to turn a tf optimizer into a multi_gpu one. In this case, just a wrapper over hvd to hide it behind an abstraction

make_parallel(fn, devices, model_dict, controller='/cpu:0')

Takes a function, returning loss and acc, and parallelizes it

Note: the function will take the input tensors and split them across devices, so if you are using iterators, you only need to call get_next() once, regardless of the number of devices. As you increase the number of devices you can increase the batch size accordingly. Any non-tensor inputs in the model_dict will not be split; instead they will be passed as-is to the model function. Any non-tensor outputs from the model function will not be concatenated; only one copy of them will be stored in the output dictionary. A common reason for non-tensor outputs is when the model function returns a list of update operations or variables.

  • fn (function) – a function accepting an input dictionary of tensors and returning an output dictionary of tensors

  • devices (str list) – list of device identifiers to parallelize the function over

  • controller (str) – device identifier acting as the controller


An output dictionary of parallelized tensor computations.


Returns process rank


Utility function to init multi-GPU functionality. In this case, just a wrapper over hvd to hide it behind an abstraction


Utility function to broadcast variables from one process to model clones. In this case, just a wrapper over hvd to hide it behind an abstraction

reduce_dict(input_dict, root=0)

Reduces all values in the input dictionary across processes using the mean

reduce_list(scattered_list, root=0)
reduce_val(var, reduction_fn= , root=0)

Reduces all values across MPI ranks


Sets up tf.Session config for multi_gpu operation.


Shards the data for multi_gpu training/validation


data_dict: dict, where each value is a list that will be sharded


A python dict that only contains 1/hvd.size() of the instances, with some logic for when hvd.size() is not a perfect divisor of the list lengths; in that case, the short-changed shards get an additional data instance, to make all lengths equal
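The padding behavior described above could look roughly like this (a pure-Python sketch; rank and world_size stand in for hvd.rank() and hvd.size()):

```python
def shard_data_dict(data_dict, rank, world_size):
    """Sketch: each worker keeps every world_size-th item, starting at its
    rank. Short shards repeat one item so all shards end up equal length."""
    sharded = {}
    for key, items in data_dict.items():
        shard = items[rank::world_size]
        max_len = -(-len(items) // world_size)  # ceiling division
        if len(shard) < max_len:
            # Pad the short-changed shard with one repeated instance.
            shard = shard + [items[rank % len(items)]]
        sharded[key] = shard
    return sharded
```

With 10 items and 3 workers, ranks 1 and 2 each repeat one item so all three shards have length 4.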

Utilities to remove unneeded nodes from a GraphDef.

strip_unused(input_graph_def, input_node_names, output_node_names, placeholder_type_enum)

Removes unused nodes from a GraphDef.

  • input_graph_def – A graph with nodes we want to prune.

  • input_node_names – A list of the nodes we use as inputs.

  • output_node_names – A list of the output nodes.

  • placeholder_type_enum – The AttrValue enum for the placeholder data type, or a list that specifies one value per input node name.


A GraphDef with all unnecessary ops removed.

  • ValueError – If any element in input_node_names refers to a tensor instead of an operation.

  • KeyError – If any element in input_node_names is not found in the graph.

strip_unused_from_files(input_graph, input_binary, output_graph, output_binary, input_node_names, output_node_names, placeholder_type_enum)

Removes unused nodes from a graph file.

© Copyright 2020, NVIDIA. Last updated on Feb 2, 2023.