Model Management

Triton operates in one of three model control modes: NONE, POLL, or EXPLICIT.

Model Control Mode NONE

Triton attempts to load all models in the model repository at startup. Models that Triton is not able to load will be marked as UNAVAILABLE and will not be available for inferencing.

Changes to the model repository while the server is running will be ignored. Model control requests using the model control endpoint will have no effect and will receive an error response.

This model control mode is selected by specifying --model-control-mode=none when starting Triton. This is the default model control mode.

Model Control Mode EXPLICIT

At startup, Triton loads only those models specified explicitly with the --load-model command-line option. If --load-model is not specified, no models are loaded at startup. After startup, all model load and unload actions must be initiated explicitly by using the Model Control API. The response status of the model control request indicates success or failure of the load or unload action.

This model control mode is enabled by specifying --model-control-mode=explicit.
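
As an illustration of explicit load and unload requests, the following is a minimal sketch that builds Model Control API calls over HTTP. It assumes the server's HTTP endpoint is reachable at localhost:8000 and exposes the v2 model repository paths /v2/repository/models/<name>/load and /v2/repository/models/<name>/unload; the helper names and the model name "resnet50" are hypothetical and chosen only for this example.

```python
import urllib.parse
import urllib.request

# Assumption: Triton's HTTP endpoint is reachable here (adjust host and port as needed).
BASE_URL = "http://localhost:8000"


def model_control_request(model_name, action, base_url=BASE_URL):
    """Build (but do not send) a POST request that loads or unloads a model."""
    if action not in ("load", "unload"):
        raise ValueError("action must be 'load' or 'unload'")
    # Percent-encode the model name so it is safe to place in a URL path.
    name = urllib.parse.quote(model_name, safe="")
    url = f"{base_url}/v2/repository/models/{name}/{action}"
    return urllib.request.Request(
        url,
        data=b"{}",  # empty JSON body; no optional parameters are sent
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send the request and return the HTTP status.

    urlopen raises urllib.error.HTTPError on an error status, which is how a
    failed load or unload would surface in this sketch.
    """
    with urllib.request.urlopen(request) as response:
        return response.status
```

With a server running in EXPLICIT mode, send(model_control_request("resnet50", "load")) would request a load, and the same call with "unload" would request an unload. Building the request separately from sending it keeps the URL construction easy to inspect without a live server.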

Model Control Mode POLL

Triton attempts to load all models in the model repository at startup. Models that Triton is not able to load will be marked as UNAVAILABLE and will not be available for inferencing.

Changes to the model repository will be detected and Triton will attempt to load and unload models as necessary based on those changes. Changes to the model repository may not be detected immediately because Triton polls the repository periodically. You can control the polling interval with the --repository-poll-secs option. The console log or the Status API can be used to determine when model repository changes have taken effect.
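
Because changes are picked up only on the next poll, a client may need to wait before using a newly added model. The following is a rough sketch of such a wait loop. It assumes the server exposes the v2 HTTP readiness endpoint /v2/models/<name>/ready on localhost:8000; the function names, timeout, and interval are hypothetical choices for this example.

```python
import time
import urllib.error
import urllib.request


def model_is_ready(model_name, base_url="http://localhost:8000"):
    """Return True if the readiness endpoint answers 200 for the model.

    Assumption: the server exposes GET /v2/models/<name>/ready at base_url.
    """
    url = f"{base_url}/v2/models/{model_name}/ready"
    try:
        with urllib.request.urlopen(url) as response:
            return response.status == 200
    except urllib.error.URLError:
        # Covers connection failures and HTTP error statuses alike.
        return False


def wait_until(check, timeout_secs=60.0, interval_secs=1.0):
    """Call check() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_secs
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_secs)
```

For example, wait_until(lambda: model_is_ready("resnet50")) would block until the hypothetical model "resnet50" reports ready or the timeout passes. A timeout somewhat longer than the configured polling interval is a reasonable starting point.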

Model control requests using the model control endpoint will have no effect and will receive an error response.

This model control mode is enabled by specifying --model-control-mode=poll and by setting --repository-poll-secs to a non-zero value when starting Triton.
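
To summarize how the command-line options for the three modes fit together, here is a small sketch that assembles a tritonserver argument list. The function name, the repository path, and the validation rules are assumptions made for this example; in particular, rejecting --load-model outside EXPLICIT mode is this sketch's choice, not documented server behavior.

```python
def triton_args(repository, mode="none", load_models=(), poll_secs=None):
    """Assemble a tritonserver command line for the given model control mode."""
    if mode not in ("none", "poll", "explicit"):
        raise ValueError(f"unknown model control mode: {mode}")
    args = [
        "tritonserver",
        f"--model-repository={repository}",
        f"--model-control-mode={mode}",
    ]
    if mode == "explicit":
        # One --load-model option per model to load at startup.
        args += [f"--load-model={name}" for name in load_models]
    elif load_models:
        # Sketch-level check: startup models are described only for EXPLICIT mode.
        raise ValueError("--load-model is described only for explicit mode")
    if mode == "poll":
        # POLL mode needs a non-zero polling interval.
        if not poll_secs or poll_secs <= 0:
            raise ValueError("poll mode requires a non-zero --repository-poll-secs")
        args.append(f"--repository-poll-secs={poll_secs}")
    return args
```

The returned list can be passed to subprocess.run or joined with spaces for display. For instance, triton_args("/models", "poll", poll_secs=30) yields the options for POLL mode with a 30-second interval.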

In POLL mode Triton responds to the following model repository changes:

  • Versions may be added to and removed from models by adding and removing the corresponding version subdirectory. Triton will allow in-flight requests to complete even if they are using a removed version of the model. New requests for a removed model version will fail. Depending on the model’s version policy, changes to the available versions may change which model version is served by default.

  • Existing models can be removed from the repository by removing the corresponding model directory. Triton will allow in-flight requests to any version of the removed model to complete. New requests for a removed model will fail.

  • New models can be added to the repository by adding a new model directory.

  • The model configuration (config.pbtxt) can be changed, and Triton will unload and reload the model to pick up the new model configuration.

  • Label files providing labels for outputs that represent classifications can be added, removed, or modified, and Triton will unload and reload the model to pick up the new labels. If a label file is added or removed, the label_filename property of the corresponding output in the model configuration must be updated at the same time.
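
The directory-level changes listed above can be illustrated with a small sketch that compares two snapshots of a repository and reports which models and versions were added or removed. This is only an illustration of the kinds of changes involved, not Triton's actual polling implementation; the function names are hypothetical, the layout is assumed to be one directory per model with numeric version subdirectories, and changes to config.pbtxt or label files are not covered.

```python
from pathlib import Path


def snapshot(repository):
    """Map each model directory name to the set of its numeric version subdirectories."""
    result = {}
    for model_dir in Path(repository).iterdir():
        if model_dir.is_dir():
            result[model_dir.name] = {
                entry.name
                for entry in model_dir.iterdir()
                if entry.is_dir() and entry.name.isdigit()
            }
    return result


def diff_snapshots(old, new):
    """List (change, target) pairs describing how `new` differs from `old`."""
    changes = []
    for model in sorted(new.keys() - old.keys()):
        changes.append(("model added", model))
    for model in sorted(old.keys() - new.keys()):
        changes.append(("model removed", model))
    # For models present in both snapshots, compare their version sets.
    for model in sorted(old.keys() & new.keys()):
        for version in sorted(new[model] - old[model], key=int):
            changes.append(("version added", f"{model}/{version}"))
        for version in sorted(old[model] - new[model], key=int):
            changes.append(("version removed", f"{model}/{version}"))
    return changes
```

Taking a snapshot before and after editing the repository and passing both to diff_snapshots gives a quick summary of what a poll would be expected to notice at the directory level.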