Contents and Descriptions#
The Financial Fraud Training generates the following model and configuration files, organized in the directory structure shown below:
Structure and Folders#
python_backend_model_repository
└── prediction_and_shapley
├── 1
│ ├── embedding_based_xgboost.json
│ ├── model.py
│ └── state_dict_gnn_model.pth
└── config.pbtxt
Note: The python_backend_model_repository directory will be created under the path specified by "output_dir" in the training configuration file.
python_backend_model_repository#
Root folder for your NVIDIA Dynamo-Triton Python backend models.
prediction_and_shapley#
A model repository folder inside python_backend_model_repository that contains the model code and artifacts.
Version Subdirectory (1)#
This subdirectory represents version 1 of the model. NVIDIA Dynamo-Triton requires each version of a model to reside in its own folder named with the version number.
embedding_based_xgboost.json#
Serialized XGBoost model file containing the configuration and trained parameters for embedding-based predictions.
model.py#
Core Python script containing the model's prediction logic and Shapley value calculations.
state_dict_gnn_model.pth#
PyTorch model file (using the .pth extension) containing the trained weights for the GNN (Graph Neural Network) component. A short sketch for inspecting both serialized artifacts follows this list.
config.pbtxt#
The configuration file required by the Triton Inference Server. It specifies details such as input/output tensor shapes, data types, and other model-related parameters.
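
For orientation, a Python-backend config.pbtxt typically looks like the sketch below. This is illustrative only: the tensor names, shapes, data types, and batch size are assumptions, so rely on the generated file for the real values.

name: "prediction_and_shapley"
backend: "python"
max_batch_size: 64
input [
  {
    name: "INPUT_FEATURES"    # hypothetical tensor name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]
output [
  {
    name: "PREDICTION"        # hypothetical tensor name
    data_type: TYPE_FP32
    dims: [ 1 ]
  },
  {
    name: "SHAPLEY_VALUES"    # hypothetical tensor name
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]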
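
If you want to inspect the two serialized artifacts outside the server, a minimal sketch such as the following works, assuming the xgboost and torch packages are installed and the files sit in the current directory:

import torch
import xgboost as xgb

# Load the serialized XGBoost model (configuration plus trained parameters).
booster = xgb.Booster()
booster.load_model("embedding_based_xgboost.json")
print("XGBoost features:", booster.num_features())

# A state dict can be inspected without instantiating the GNN class itself.
state_dict = torch.load("state_dict_gnn_model.pth", map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))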
How to Use#
Deploy on NVIDIA Dynamo-Triton
Mount the python_backend_model_repository folder into your NVIDIA Dynamo-Triton server and make sure the folder structure matches the layout below:
python_backend_model_repository
└── prediction_and_shapley
├── 1
│ ├── ...
└── config.pbtxt
Check config.pbtxt
Verify that config.pbtxt accurately reflects your model's I/O specifications. Adjust any shapes, batch sizes, or other parameters as required.
Start the NVIDIA Dynamo-Triton Server
Launch the NVIDIA Dynamo-Triton Inference Server (example command shown below; adapt paths and arguments as needed):
docker run --gpus all -d \
  -p {HTTP_PORT}:{HTTP_PORT} -p {GRPC_PORT}:{GRPC_PORT} \
  -v /path/to/python_backend_model_repository:/models \
  --name tritonserver {TRITON_IMAGE} \
  tritonserver --model-repository=/models
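
Once the container is running, you can confirm that the server and the model are ready before sending traffic. A minimal check using the tritonclient Python package (pip install tritonclient[http]), assuming {HTTP_PORT} is 8000, Triton's default HTTP port:

import tritonclient.http as httpclient

# Connect to the server's HTTP endpoint; adjust host and port to your deployment.
client = httpclient.InferenceServerClient(url="localhost:8000")

print(client.is_server_ready())                         # True once startup completes
print(client.is_model_ready("prediction_and_shapley"))  # True once the model is loaded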
Send Inference Requests
Use an HTTP or gRPC client to send inference requests to the NVIDIA Dynamo-Triton server. The model will automatically load and serve both predictions and Shapley values. A hedged client sketch follows, and the next section details how to pass data to models deployed on the NVIDIA Dynamo-Triton Inference Server.
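
As an illustration of such a client, the Python sketch below sends one request over HTTP. The tensor names (INPUT_FEATURES, PREDICTION, SHAPLEY_VALUES), the input shape, and the data type are assumptions for this example; substitute the actual values from config.pbtxt.

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a dummy batch of one record; the name, shape, and dtype are
# assumptions -- take the real values from config.pbtxt.
features = np.random.rand(1, 32).astype(np.float32)
infer_input = httpclient.InferInput("INPUT_FEATURES", list(features.shape), "FP32")
infer_input.set_data_from_numpy(features)

# Request both outputs the model serves.
outputs = [
    httpclient.InferRequestedOutput("PREDICTION"),
    httpclient.InferRequestedOutput("SHAPLEY_VALUES"),
]

result = client.infer("prediction_and_shapley", inputs=[infer_input], outputs=outputs)
print(result.as_numpy("PREDICTION"))
print(result.as_numpy("SHAPLEY_VALUES"))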