Run Financial Fraud Training Container

This document provides step-by-step instructions on how to log in, pull the Financial Fraud Training Docker image, run the Financial Fraud Training container, and train models using it.

1. Authenticate with NVIDIA GPU Cloud (NGC) and Pull the Docker Image

This section explains how to authenticate with NGC and pull the Docker image from the NVIDIA container registry (nvcr.io).

Prerequisites

NGC Account & API Key: Ensure you have an NGC account and a valid API key.
Docker: Make sure Docker is installed and running on your machine.

Authenticate with NGC

Log in to the NVIDIA container registry using your NGC API key:

docker login nvcr.io --username '$oauthtoken' --password $NGC_API_KEY

Note: Replace $NGC_API_KEY with your actual API key, or set the NGC_API_KEY environment variable before running the command.
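
Passing the key with --password can leave it in shell history and process listings. The following is a minimal sketch of an alternative, assuming NGC_API_KEY is already exported in your shell. The helper name ngc_login is hypothetical and not part of Docker or NGC; --password-stdin is a standard docker login flag.

```shell
# Hypothetical helper (not part of Docker or NGC): log in to nvcr.io
# without placing the API key on the command line.
ngc_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # '$oauthtoken' is a literal username, so it stays in single quotes.
  printf '%s' "$NGC_API_KEY" \
    | docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

After exporting the key, call ngc_login once per machine; Docker stores the resulting credentials for later pulls.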

Pull the Docker Image

After authenticating, pull the desired Docker image from the registry:

docker pull nvcr.io/nvidia/cugraph/financial-fraud-training:1.0.0

2. Run the Training Container API Server

The container documentation explains how to organize data for training and provides preprocessing examples. A complete example that covers how to obtain, organize, and preprocess the TabFormer data for use in the container is located in the Financial Fraud Training Blueprint.

Run the Container

Run the training container with the following command:

docker run -it --rm --name=financial-fraud-training --gpus "device=0" \
    -p 8000:8000 -e NIM_HTTP_API_PORT=8000 \
    -p 50051:50051 -e NIM_GRPC_API_PORT=50051 \
    -e NIM_DISABLE_MODEL_DOWNLOAD=True \
    -e NGC_API_KEY=$NGC_API_KEY \
    -v <DATA_DIR>:/data \
    -v <DIR_TO_SAVE_TRAINED_MODELS>:/trained_models \
    nvcr.io/nvidia/cugraph/financial-fraud-training:1.0.0

Note: Replace <DATA_DIR> and <DIR_TO_SAVE_TRAINED_MODELS> with the actual paths on your system where your training data is located and trained models should reside.
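
A mistyped or relative host path is a common cause of empty mounts, because Docker treats a non-absolute -v source as a named volume rather than a directory. The sketch below wraps the run command in a hypothetical function, run_training_container, that checks both directories first. The function name and its two positional arguments are assumptions for illustration; the docker options are the ones documented in this section, with every option placed before the image name so that Docker parses it as an option.

```shell
# Hypothetical wrapper: validate the host directories, then start the container.
# Usage: run_training_container <DATA_DIR> <DIR_TO_SAVE_TRAINED_MODELS>
run_training_container() {
  data_dir=$1
  models_dir=$2
  for dir in "$data_dir" "$models_dir"; do
    # Bind mounts need absolute host paths.
    case "$dir" in
      /*) ;;
      *) echo "Path must be absolute: $dir" >&2; return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
      echo "Directory not found: $dir" >&2
      return 1
    fi
  done
  docker run -it --rm --name=financial-fraud-training --gpus "device=0" \
    -p 8000:8000 -e NIM_HTTP_API_PORT=8000 \
    -p 50051:50051 -e NIM_GRPC_API_PORT=50051 \
    -e NIM_DISABLE_MODEL_DOWNLOAD=True \
    -e NGC_API_KEY="$NGC_API_KEY" \
    -v "$data_dir":/data \
    -v "$models_dir":/trained_models \
    nvcr.io/nvidia/cugraph/financial-fraud-training:1.0.0
}
```

The function returns before calling docker if either path is relative or missing, so a typo surfaces as a clear message instead of an empty /data directory inside the container.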

Command Breakdown

Docker Run Options:

-it: Run the container interactively.

--rm: Automatically removes the container when it exits.

--gpus "device=0": Exposes GPU 0 to the container. Use a different device index, or --gpus all, to change which NVIDIA GPUs the container can use.

--name=financial-fraud-training: Names the container.

-p: Maps the container ports (8000 for HTTP API and 50051 for gRPC API) to the host.

-e: Sets environment variables required for the training (e.g., port numbers, disabling model download).

Volume Mounts:

-v <DATA_DIR>:/data: Mounts your GNN data directory into the container.

-v <DIR_TO_SAVE_TRAINED_MODELS>:/trained_models: Mounts a directory for saving trained models.

Image and Authentication:

The image nvcr.io/nvidia/cugraph/financial-fraud-training:1.0.0 is used to run the container.

-e NGC_API_KEY=$NGC_API_KEY passes your API key into the container for authentication.

Run Training

Important: Configure the training before you run it. Sample configuration JSON and explanations of the parameters are provided for training with and without GNN embeddings.

The API expects a JSON configuration payload that specifies the training configuration.
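
Because the payload must be valid JSON, it can help to check the file for syntax errors before sending it. The following is a minimal sketch, assuming python3 is available on the host. The helper name validate_config is hypothetical, and the check covers JSON syntax only, not the training parameters the container expects.

```shell
# Hypothetical helper: succeed only if the file exists and parses as JSON.
validate_config() {
  config_file=$1
  if [ ! -f "$config_file" ]; then
    echo "Config file not found: $config_file" >&2
    return 1
  fi
  # python3 -m json.tool exits non-zero when the file is not valid JSON.
  python3 -m json.tool "$config_file" > /dev/null
}
```

For example, validate_config training_config.json returns a non-zero status and prints the parser's error message if the file has a missing brace or trailing comma.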

Use the following command to initiate the training process:

curl -X POST "http://0.0.0.0:8000/train" \
  -H "Content-Type: application/json" \
  -d @training_config.json

Command Breakdown

curl: A command-line tool for transferring data with URLs.

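-X POST: Specifies that the request method is POST.

"http://0.0.0.0:8000/train": The API endpoint that triggers the training. The container publishes port 8000 on the host, so on Linux the address 0.0.0.0 typically reaches the server when curl runs on the same machine as the container. From another machine, use the host's IP address or hostname instead. /train is the designated endpoint for training jobs.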

-H "Content-Type: application/json": Sets the request header to indicate that the payload is in JSON format.

-d @training_config.json: Sends the contents of the file training_config.json as the request body. This file should contain your training configuration in JSON format.

Before running the command, make sure the training server is up and reachable at http://0.0.0.0:8000, and that training_config.json is in your current directory and contains the correct configuration for your training job.
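
The two checks above can be scripted. The following is a minimal sketch with two hypothetical helpers, wait_for_server and submit_training; neither name comes from the container documentation. Since no health endpoint is documented here, the readiness check only tests whether anything answers HTTP at the base URL, which is an assumption about what "up and running" means.

```shell
# Hypothetical helper: poll until something answers HTTP at the given URL.
# Usage: wait_for_server <url> [attempts] [delay_seconds]
wait_for_server() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Without --fail, curl exits 0 on any HTTP response, including 404.
    if curl --silent --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "No response from $url after $attempts attempts" >&2
  return 1
}

# Hypothetical helper: POST a configuration file to the /train endpoint.
# Usage: submit_training [config_file] [base_url]
submit_training() {
  config_file=${1:-training_config.json}
  base_url=${2:-http://0.0.0.0:8000}
  if [ ! -f "$config_file" ]; then
    echo "Config file not found: $config_file" >&2
    return 1
  fi
  # --fail turns HTTP error statuses into a non-zero exit code.
  curl --fail --silent --show-error -X POST "$base_url/train" \
    -H "Content-Type: application/json" \
    -d @"$config_file"
}
```

A typical sequence would be wait_for_server http://0.0.0.0:8000 followed by submit_training training_config.json, so the request is sent only after the port starts answering.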