EfficientDet with TensorFlow and DALI
This is a modified version of the original EfficientDet implementation (https://github.com/google/automl/tree/master/efficientdet). It has been changed to support DALI data preprocessing.
To use a DALI pipeline for data loading and preprocessing, pass --pipeline dali_gpu or --pipeline dali_cpu. To use the original pipeline, pass --pipeline tensorflow.
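The examples in this document pass --pipeline, while the usage text lists the option as --pipeline_type. The help output looks like argparse output, and argparse accepts an unambiguous prefix of a long option by default, which would explain why both spellings work. The following is a minimal sketch of that behavior; the parser is hypothetical and mirrors only this one option, so the real train.py and eval.py may be defined differently.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the pipeline option from the usage text.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument(
        "--pipeline_type",
        required=True,
        choices=["synthetic", "tensorflow", "dali_cpu", "dali_gpu"],
        help="Pipeline type used while loading and preprocessing data.",
    )
    return parser


# argparse resolves an unambiguous prefix of a long option (allow_abbrev
# defaults to True), so both spellings fill the same attribute.
short = build_parser().parse_args(["--pipeline", "dali_gpu"])
full = build_parser().parse_args(["--pipeline_type", "dali_gpu"])
```

If a second option starting with --pipeline were ever added, the short form would become ambiguous and argparse would reject it, so the full name is the safer choice in scripts.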
Preparing data files from COCO dataset
To create TFRecord files:
python3 ./dataset/create_coco_tfrecord.py \
--image_dir ./coco/train2017 \
--object_annotations_file ./coco/annotations/instances_train2017.json \
--output_file_prefix ./tfrecords/train
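To give a feel for the input the conversion step works from, here is a small sketch that groups COCO box annotations by image. It relies only on the public COCO annotation layout (images and annotations lists, with bbox stored as [x, y, width, height] in pixels). The function name, the normalized (ymin, xmin, ymax, xmax) output order, and the sample data are assumptions for illustration and are not taken from create_coco_tfrecord.py.

```python
from collections import defaultdict


def group_boxes_by_image(coco):
    """Return {file_name: [(ymin, xmin, ymax, xmax, category_id), ...]}.

    Coordinates are normalized to [0, 1] by the image size.
    """
    images = {img["id"]: img for img in coco["images"]}
    boxes = defaultdict(list)
    for ann in coco["annotations"]:
        img = images[ann["image_id"]]
        x, y, w, h = ann["bbox"]  # COCO stores [x, y, width, height] in pixels
        boxes[img["file_name"]].append((
            y / img["height"],
            x / img["width"],
            (y + h) / img["height"],
            (x + w) / img["width"],
            ann["category_id"],
        ))
    return dict(boxes)


# Made-up sample in the COCO layout; a real file would be read with json.load.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 200, "height": 100}],
    "annotations": [{"image_id": 1, "bbox": [20, 10, 100, 50], "category_id": 3}],
}
grouped = group_boxes_by_image(sample)
```

Images without any annotation do not appear in the result of this sketch; whether the actual script keeps or drops them is not stated here.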
To create TFRecord index files (needed only for the DALI pipelines):
python3 ./dataset/create_tfrecord_indexes.py \
--tfrecord_file_pattern './tfrecords/*.tfrecord' \
--tfrecord2idx_script ../../../../../tools/tfrecord2idx
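An index lets a reader jump straight to individual records instead of scanning the file from the start. The sketch below illustrates the idea on the TFRecord framing (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC) and collects one (offset, size) pair per record. It is an assumption that the tfrecord2idx tool stores comparable offset and size pairs; the helper names are hypothetical, and the CRC fields are zero-filled placeholders, so the demo file is not a valid TFRecord for TensorFlow.

```python
import os
import struct
import tempfile


def write_framed_records(path, payloads):
    """Write payloads with TFRecord-style framing (placeholder CRCs)."""
    with open(path, "wb") as f:
        for data in payloads:
            f.write(struct.pack("<Q", len(data)))  # 8-byte little-endian length
            f.write(b"\x00" * 4)                   # placeholder for the length CRC
            f.write(data)                          # record payload
            f.write(b"\x00" * 4)                   # placeholder for the payload CRC


def build_index(path):
    """Return [(offset, size), ...]; size covers the whole frame of a record."""
    entries = []
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            header = f.read(8)
            if len(header) < 8:
                break  # end of file
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC, the payload, and the payload CRC.
            f.seek(4 + length + 4, os.SEEK_CUR)
            entries.append((offset, f.tell() - offset))
    return entries


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.tfrecord")
    write_framed_records(demo_path, [b"abc", b"hello world"])
    index = build_index(demo_path)
```

The sketch does not verify checksums or detect a truncated final record; it only shows why an index has to be regenerated whenever the TFRecord files change.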
Training in Keras Fit/Compile mode
For full training on all available GPUs with the DALI GPU pipeline:
python3 train.py \
--multi_gpu \
--pipeline dali_gpu \
--epochs 50 \
--input_type tfrecord \
--train_file_pattern './tfrecords/train*.tfrecord' \
--batch_size 16 \
--train_steps 2000 \
--output_filename final_weights.h5
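Because the number of steps per epoch is set explicitly, an epoch here does not have to correspond to one pass over the dataset. The arithmetic can be sketched as follows. Whether the batch size is applied per GPU or across all GPUs is not stated in this document, so the replicas parameter is an assumption, as are the helper names; 118287 is the image count of COCO train2017.

```python
import math


def examples_seen(epochs, train_steps, batch_size, replicas=1):
    """Total examples consumed over a run with fixed steps per epoch."""
    return epochs * train_steps * batch_size * replicas


def steps_for_full_pass(num_examples, batch_size, replicas=1):
    """Steps needed for one complete pass over the dataset."""
    return math.ceil(num_examples / (batch_size * replicas))


COCO_TRAIN2017_IMAGES = 118287

# Values from the training command above, with replicas left at 1.
total = examples_seen(epochs=50, train_steps=2000, batch_size=16)
full_pass = steps_for_full_pass(COCO_TRAIN2017_IMAGES, batch_size=16)
```

Under these assumptions the run consumes 1,600,000 examples, while a single full pass over train2017 at batch size 16 would take 7393 steps.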
Evaluation in Keras Fit/Compile mode
For evaluation with the DALI GPU pipeline:
python3 eval.py \
--pipeline dali_gpu \
--input_type tfrecord \
--eval_file_pattern './tfrecords/eval*.tfrecord' \
--eval_steps 5000 \
--weights final_weights.h5
Usage
usage: train.py [-h] [--initial_epoch INITIAL_EPOCH] [--epochs EPOCHS] --input_type {tfrecord,coco} [--images_path IMAGES_PATH] [--annotations_path ANNOTATIONS_PATH]
[--train_file_pattern TRAIN_FILE_PATTERN] [--batch_size BATCH_SIZE] [--train_steps TRAIN_STEPS] [--eval_file_pattern EVAL_FILE_PATTERN]
[--eval_steps EVAL_STEPS] [--eval_freq EVAL_FREQ] [--eval_during_training] [--eval_after_training] --pipeline_type {synthetic,tensorflow,dali_cpu,dali_gpu}
[--multi_gpu [MULTI_GPU [MULTI_GPU ...]]] [--seed SEED] [--hparams HPARAMS] [--model_name MODEL_NAME] [--output_filename OUTPUT_FILENAME]
[--start_weights START_WEIGHTS] [--log_dir LOG_DIR] [--ckpt_dir CKPT_DIR]
optional arguments:
-h, --help show this help message and exit
--initial_epoch INITIAL_EPOCH
Epoch from which to start training.
--epochs EPOCHS Epoch on which training should finish.
--input_type {tfrecord,coco}
Input type.
--images_path IMAGES_PATH
Path to COCO images.
--annotations_path ANNOTATIONS_PATH
Path to COCO annotations.
--train_file_pattern TRAIN_FILE_PATTERN
TFrecord files glob pattern for files with training data.
--batch_size BATCH_SIZE
--train_steps TRAIN_STEPS
Number of steps (iterations) in each epoch.
--eval_file_pattern EVAL_FILE_PATTERN
TFrecord files glob pattern for files with evaluation data, defaults to `train_file_pattern` if not given.
--eval_steps EVAL_STEPS
Number of examples to evaluate during each evaluation.
--eval_freq EVAL_FREQ
Evaluation frequency during training, in epochs.
--eval_during_training
Whether to run evaluation every `eval_freq` epochs.
--eval_after_training
Whether to run evaluation after training finishes.
--pipeline_type {synthetic,tensorflow,dali_cpu,dali_gpu}
Pipeline type used while loading and preprocessing data. One of:
tensorflow – pipeline used in the original EfficientDet implementation at https://github.com/google/automl/tree/master/efficientdet
synthetic – like the `tensorflow` pipeline type but repeats one batch endlessly
dali_gpu – pipeline which uses the NVIDIA Data Loading Library (DALI) to run part of data preprocessing on GPUs to improve efficiency
dali_cpu – like the `dali_gpu` pipeline type but restricted to run only on CPU
--multi_gpu [MULTI_GPU [MULTI_GPU ...]]
List of GPUs to use, if empty defaults to all visible GPUs.
--seed SEED
--hparams HPARAMS String or filename with parameters.
--model_name MODEL_NAME
--output_filename OUTPUT_FILENAME
Filename for final weights to save.
--start_weights START_WEIGHTS
--log_dir LOG_DIR Directory for tensorboard logs.
--ckpt_dir CKPT_DIR Directory for saving weights each step.
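The notation [--multi_gpu [MULTI_GPU [MULTI_GPU ...]]] is how argparse renders an option that takes zero or more values. A minimal sketch of the three resulting cases follows; the int type and the None default are assumptions, since the usage text does not show them.

```python
import argparse

# Hypothetical parser mirroring only the --multi_gpu option.
parser = argparse.ArgumentParser(prog="train.py")
parser.add_argument("--multi_gpu", nargs="*", type=int, default=None)

# Flag not given at all: the default is used.
absent = parser.parse_args([]).multi_gpu

# Flag given without values: an empty list, which the help text describes
# as "all visible GPUs".
empty = parser.parse_args(["--multi_gpu"]).multi_gpu

# Flag given with explicit GPU ids.
explicit = parser.parse_args(["--multi_gpu", "0", "1"]).multi_gpu
```

This is why the training command can pass --multi_gpu on a line of its own: the next token starts with --, so the list of GPU ids stays empty.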
usage: eval.py [-h] --input_type {tfrecord,coco} [--images_path IMAGES_PATH] [--annotations_path ANNOTATIONS_PATH] [--eval_file_pattern EVAL_FILE_PATTERN]
[--eval_steps EVAL_STEPS] --pipeline_type {synthetic,tensorflow,dali_cpu,dali_gpu} [--weights WEIGHTS] [--model_name MODEL_NAME] [--hparams HPARAMS]
optional arguments:
-h, --help show this help message and exit
--input_type {tfrecord,coco}
Input type.
--images_path IMAGES_PATH
Path to COCO images.
--annotations_path ANNOTATIONS_PATH
Path to COCO annotations.
--eval_file_pattern EVAL_FILE_PATTERN
TFrecord files glob pattern for files with evaluation data.
--eval_steps EVAL_STEPS
Number of examples to evaluate.
--pipeline_type {synthetic,tensorflow,dali_cpu,dali_gpu}
Pipeline type used while loading and preprocessing data. One of:
tensorflow – pipeline used in the original EfficientDet implementation at https://github.com/google/automl/tree/master/efficientdet
synthetic – like the `tensorflow` pipeline type but repeats one batch endlessly
dali_gpu – pipeline which uses the NVIDIA Data Loading Library (DALI) to run part of data preprocessing on GPUs to improve efficiency
dali_cpu – like the `dali_gpu` pipeline type but restricted to run only on CPU
--weights WEIGHTS Name of the file with model weights.
--model_name MODEL_NAME
--hparams HPARAMS String or filename with parameters.
Requirements
pip install -r requirements.txt