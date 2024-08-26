When running tao model detectnet_v2 train ..., if you encounter an error similar to the one shown below, which reports variables missing from the checkpoint, delete the latest .ckzip file and restart the training with the same command.

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
  (0) Not found: Key cost_sums/cyclist-bbox not found in checkpoint
     [[{{node save/RestoreV2}}]]
  (1) Not found: Key cost_sums/cyclist-bbox not found in checkpoint
     [[{{node save/RestoreV2}}]]
     [[save/RestoreV2/_877]]
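To confirm that a failed run matches this pattern, it can help to pull the missing variable names out of the training log. The following is a minimal sketch, assuming the log contains lines in the same "Key ... not found in checkpoint" form as the traceback above; the function name `missing_checkpoint_keys` is hypothetical and not part of TAO Toolkit or TensorFlow.

```python
import re

# Matches lines such as:
#   (0) Not found: Key cost_sums/cyclist-bbox not found in checkpoint
# The pattern is based only on the message format in the traceback above.
MISSING_KEY_RE = re.compile(r"Not found: Key (\S+) not found in checkpoint")


def missing_checkpoint_keys(log_text):
    """Return the unique variable names reported missing, in first-seen order."""
    seen = []
    for key in MISSING_KEY_RE.findall(log_text):
        if key not in seen:
            seen.append(key)
    return seen
```

If the function returns one or more keys, the failure is a missing-variable restore error of the kind described here rather than some other training failure.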

This error can be raised for the following reasons:

The checkpoint wasn’t saved properly.

The backend framework version used to generate the checkpoint does not match the version used to load it.

The experiment configuration changed between the run that saved the checkpoint and the training graph that was initialized on resume. For example, the checkpoint was generated in TAO Toolkit 2.0 but was resumed in 3.0.
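The workaround described at the top of this section (delete the latest .ckzip file, then rerun the same command) can be sketched as a small helper. This is an illustration under stated assumptions, not a TAO Toolkit utility: it assumes "latest" means the most recently modified .ckzip file under your results directory, and the function names are hypothetical. It defaults to a dry run so you can check which file would be removed.

```python
from pathlib import Path


def find_latest_ckzip(results_dir):
    """Return the most recently modified .ckzip file under results_dir, or None.

    Assumption: modification time identifies the latest checkpoint.
    """
    candidates = list(Path(results_dir).rglob("*.ckzip"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def remove_latest_ckzip(results_dir, dry_run=True):
    """Delete the latest .ckzip file and return its path.

    With dry_run=True (the default) nothing is deleted; the path that
    would be removed is returned so it can be inspected first.
    """
    latest = find_latest_ckzip(results_dir)
    if latest is None:
        return None
    if not dry_run:
        latest.unlink()
    return latest
```

Call `remove_latest_ckzip(results_dir)` once to see which file is selected, then again with `dry_run=False` to delete it before restarting the training with the same command.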