Contents and Descriptions#

The Financial Fraud Training generates the following model and configuration files under the directory structure shown below:

Structure and Folders#

python_backend_model_repository
└── prediction_and_shapley
    ├── 1
    │   ├── embedding_based_xgboost.json
    │   ├── model.py
    │   └── state_dict_gnn_model.pth
    └── config.pbtxt

Note: The python_backend_model_repository directory will be created under the path specified by "output_dir" in the training configuration file.

python_backend_model_repository#

Root folder for your NVIDIA Dynamo-Triton Python backend models.


prediction_and_shapley#

The model directory inside python_backend_model_repository that contains the model's code and artifacts.

  • Version Subdirectory (1)
    This subdirectory represents version 1 of the model. NVIDIA Dynamo-Triton requires each version of the model to reside in its own folder named with the version number.

    • embedding_based_xgboost.json
      Serialized XGBoost model file containing configuration and trained parameters for embedding-based predictions.

    • model.py
      Core Python script containing the model’s prediction logic and Shapley value calculations (a minimal interface sketch appears after this list).

    • state_dict_gnn_model.pth
      PyTorch model file (using .pth extension) containing the trained weights for the GNN (Graph Neural Network) component.

  • config.pbtxt
    The configuration file required by the Triton Inference Server. It specifies details such as input/output tensor shapes, data types, and other model-related parameters.
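
For reference, model.py implements the standard Triton Python backend interface: a TritonPythonModel class with initialize and execute methods. The following is an illustrative sketch only; the tensor names ("INPUT", "PREDICTION"), the GNNModel class, and the pre/post-processing steps are placeholders, not the logic actually shipped in model.py.

# Illustrative sketch of the Triton Python backend interface used by model.py.
# Tensor names and the GNNModel class are placeholders, not the shipped code.
import numpy as np
import torch
import xgboost as xgb
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Triton passes the model repository path and version in `args`.
        version_dir = f'{args["model_repository"]}/{args["model_version"]}'

        # Load the serialized XGBoost model.
        self.booster = xgb.Booster()
        self.booster.load_model(f"{version_dir}/embedding_based_xgboost.json")

        # Load the GNN weights; GNNModel stands in for the architecture
        # defined in the shipped model.py.
        self.gnn = GNNModel()  # hypothetical placeholder class
        self.gnn.load_state_dict(
            torch.load(f"{version_dir}/state_dict_gnn_model.pth", map_location="cpu")
        )
        self.gnn.eval()

    def execute(self, requests):
        responses = []
        for request in requests:
            # "INPUT" is a placeholder; the real name is declared in config.pbtxt.
            features = pb_utils.get_input_tensor_by_name(request, "INPUT").as_numpy()

            # Placeholder pipeline: GNN embeddings feed the XGBoost model.
            with torch.no_grad():
                embeddings = self.gnn(torch.from_numpy(features)).numpy()
            scores = self.booster.predict(xgb.DMatrix(embeddings))

            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=[
                        pb_utils.Tensor("PREDICTION", scores.astype(np.float32)),
                        # Shapley values would be returned as an additional tensor.
                    ]
                )
            )
        return responses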


How to Use#

  1. Deploy on NVIDIA Dynamo-Triton: Mount the python_backend_model_repository folder into your NVIDIA Dynamo-Triton server and confirm that the folder structure matches the layout below:

python_backend_model_repository
└── prediction_and_shapley
    ├── 1
    │   └── ...
    └── config.pbtxt
  2. Check config.pbtxt: Verify that config.pbtxt accurately reflects your model’s I/O specifications. Adjust any shapes, batch sizes, or other parameters as required.

  3. Start NVIDIA Dynamo-Triton Server: Launch the NVIDIA Dynamo-Triton Inference Server (example command shown below; adapt paths and arguments as needed). A readiness-check sketch follows the command.

docker run --gpus all -d \
  -p {HTTP_PORT}:{HTTP_PORT} -p {GRPC_PORT}:{GRPC_PORT} \
  -v /path/to/python_backend_model_repository:/models \
  --name tritonserver \
  {TRITON_IMAGE} \
  tritonserver --model-repository=/models
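
After the container starts, you can confirm that the server and the model are ready before sending traffic. A minimal sketch using the tritonclient Python package (substitute the HTTP port mapped above):

import tritonclient.http as httpclient

# Replace {HTTP_PORT} with the HTTP port mapped in the docker command above.
client = httpclient.InferenceServerClient(url="localhost:{HTTP_PORT}")

print("server ready:", client.is_server_ready())
print("model ready:", client.is_model_ready("prediction_and_shapley"))
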
  4. Send Inference Requests: Use an HTTP or gRPC client to send inference requests to the NVIDIA Dynamo-Triton server. The model loads automatically and serves both predictions and Shapley values; a minimal client sketch is shown below. The following section details how to pass data to models deployed on the NVIDIA Dynamo-Triton Inference Server.
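
As a rough illustration, the sketch below sends a request with the tritonclient HTTP client. It assumes a single FP32 input tensor named "INPUT" and outputs named "PREDICTION" and "SHAPLEY_VALUES"; the actual names, shapes, and data types are defined in config.pbtxt.

import numpy as np
import tritonclient.http as httpclient

# Replace {HTTP_PORT} with the HTTP port mapped earlier.
client = httpclient.InferenceServerClient(url="localhost:{HTTP_PORT}")

# Placeholder payload; match the shape and dtype declared in config.pbtxt.
features = np.random.rand(1, 32).astype(np.float32)

# "INPUT", "PREDICTION", and "SHAPLEY_VALUES" are assumed names, not the
# actual tensor names from config.pbtxt.
inputs = [httpclient.InferInput("INPUT", list(features.shape), "FP32")]
inputs[0].set_data_from_numpy(features)

outputs = [
    httpclient.InferRequestedOutput("PREDICTION"),
    httpclient.InferRequestedOutput("SHAPLEY_VALUES"),
]

result = client.infer("prediction_and_shapley", inputs, outputs=outputs)
print(result.as_numpy("PREDICTION"))
print(result.as_numpy("SHAPLEY_VALUES"))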