Navigator Package#
A model graph and/or checkpoint alone is not enough to perform a successful deployment of the model. When deploying a model for inference, you need to know the model's input and output definitions, the maximal batch size that can be used for inference, and other details.
For that purpose, we have created the Navigator Package - an artifact containing the serialized model, model metadata, and
optimization details.
The Navigator Package is the recommended way of sharing an optimized model for deployment
on PyTriton or Triton Inference Server,
or for re-running the optimize method on different hardware.
Save#
The package created during model optimization can be saved as a ZIP file using the API method:
```python
import model_navigator as nav

nav.package.save(
    package=package,
    path="/path/to/package.nav",
)
```
The save method collects the generated models from the workspace, selecting:

- base formats - the first available serialization formats exported from the source model
- max throughput format - the model that achieved the highest throughput during profiling
- min latency format - the model that achieved the lowest latency during profiling
Additionally, the package contains:

- a status file with optimization details
- logs from the optimize execution
- a reproduction script for each model format
- input and output data samples in the form of NumPy files
Read more in the save method API specification.
Load#
A package saved to a file can be loaded for further processing:
```python
import model_navigator as nav

package = nav.package.load(
    path="/path/to/package.nav",
)
```
Once the package is loaded, you can inspect its contents or use it to re-run optimization or profiling. Read
more in the load method API specification.
Optimize#
The loaded package object can be used to re-run the optimize process. In comparison to the framework-dedicated API, the package optimize process starts from the serialized models inside the package and reproduces the available optimization paths. This lets you reproduce the process on different hardware without access to the model sources.
The optimization from the package can be run using:
```python
import model_navigator as nav

optimized_package = nav.package.optimize(
    package=package,
)
```
At the end of the process, the new optimized models are generated. Please be aware that the workspace is overwritten in this step. Read more in the optimize method API specification.
Profile#
The optimize process uses a single sample from the dataloader for profiling. The process focuses on selecting the best model format, and this requires a consistent sample for performance comparison.
In some cases, you may want to profile the models on a different dataset. For that purpose, the Triton Model Navigator exposes an API for profiling every sample in the dataset on each model:
```python
import torch
import model_navigator as nav

profiling_results = nav.package.profile(
    package=package,
    dataloader=[torch.randn(1, 3, 256, 256), torch.randn(1, 3, 512, 512)],
)
```
The results contain profiling information for each model and sample. You can use them to perform any analysis you need. Read more in the profile method API specification.
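For example, an analysis might aggregate per-sample measurements into a per-model summary. The sketch below uses a simplified, hypothetical record structure (a list of per-model, per-sample latency measurements), not the library's actual profiling results type; adapt the field access to the real objects returned by `nav.package.profile`.

```python
from statistics import mean

# Hypothetical, simplified profiling records; the real objects returned by
# nav.package.profile() have their own structure - adapt field access accordingly.
results = [
    {"model": "onnx", "sample_id": 0, "latency_ms": 4.1},
    {"model": "onnx", "sample_id": 1, "latency_ms": 9.8},
    {"model": "torch-trt", "sample_id": 0, "latency_ms": 2.3},
    {"model": "torch-trt", "sample_id": 1, "latency_ms": 5.6},
]

def mean_latency_per_model(records):
    """Average latency across all profiled samples, grouped by model format."""
    per_model = {}
    for record in records:
        per_model.setdefault(record["model"], []).append(record["latency_ms"])
    return {model: mean(latencies) for model, latencies in per_model.items()}

print(mean_latency_per_model(results))
```

Aggregating across samples of different shapes (as in the dataloader above) shows how each format behaves across the input size range, rather than on the single sample used during optimization.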