Benchmarks

With simulation-generated data, it is possible to train machine learning models efficiently without labor-intensive manual labeling. Use the following steps to measure how efficiently a specific hardware setup can generate training data.

  1. Make sure nvidia-docker and the Isaac Sim docker image are available.

  2. Run the following command to generate the necessary configuration files:


    python3 ./packages/nvidia_qa/benchmark/isaac_sim/py_generate_apps.py --gpu_count <NUMBER_OF_GPU> --sim_per_gpu <NUMBER_OF_SIM> --sim_host_ip <SIM_HOST_IP> --mosaic_host_ip <MOSAIC_HOST_IP> --sim_image_name <ISAACSIM_IMAGE> --sim_config_folder /home/<SIM_USERNAME>/deploy/<DEV_USERNAME>/isaac_sim_runner-pkg/packages/nvidia_qa/benchmark/isaac_sim/bridge_config

    Where <NUMBER_OF_GPU> is the number of GPUs in the system (default: 1). To fully take advantage of the GPU hardware, experiment with running more than one simulation container per GPU; specify the number of containers per GPU in <NUMBER_OF_SIM> (default: 1). <SIM_USERNAME> is the username used on the simulation device, and <DEV_USERNAME> is the current username on the development setup. <SIM_HOST_IP> is the IP address of the simulation device, and <MOSAIC_HOST_IP> is the IP address of the device running the monitoring app. <ISAACSIM_IMAGE> is the name of the Isaac Sim docker image as shown in the output of docker image ls.

    If you run both apps on the same device, run ifconfig to determine the Docker bridge host IP address under the docker0 adapter and provide that address as <MOSAIC_HOST_IP>. This allows the containers to communicate with the app running locally.
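
    One way to read the bridge address directly (the grep filter is only a convenience, not a requirement):

    ifconfig docker0 | grep "inet "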

  3. With the configurations generated, deploy both packages to the simulation and monitoring setups with the following commands (from the sdk/ subdirectory):


    ./../engine/engine/build/deploy.sh -h <SIM_HOST_IP> -d x86_64 -p //packages/nvidia_qa/benchmark/isaac_sim:isaac_sim_runner-pkg
    ./../engine/engine/build/deploy.sh -h <MOSAIC_HOST_IP> -d x86_64 -p //packages/nvidia_qa/benchmark/isaac_sim:mosaic-pkg

  4. On the simulation setup, execute the following commands:


    cd ~/deploy/<DEV_USERNAME>/isaac_sim_runner-pkg
    ./packages/nvidia_qa/benchmark/isaac_sim/isaac_sim_runner

    This application starts all required containers and shuts them all down on exit.

  5. On the monitoring setup, start the corresponding Mosaic viewer application with the following command:
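
    A minimal sketch, assuming the mosaic-pkg deployment mirrors the runner package layout and ships a mosaic binary in the same folder (both the path and the binary name are assumptions):

    # Path and binary name are assumed from the mosaic-pkg deploy target above.
    cd ~/deploy/<DEV_USERNAME>/mosaic-pkg
    ./packages/nvidia_qa/benchmark/isaac_sim/mosaic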

  6. Open Sight by loading http://<MOSAIC_HOST_IP>:3000 in a web browser. The application listens to all running containers and displays the generated color camera footage in the Mosaic viewer for validation. It can also measure the performance of data generation.

    By default, the generated application only produces the color camera image. Modify packages/nvidia_qa/benchmark/isaac_sim/isaacsim.subgraph.json to match your needs.

    The default Mosaic application moves the color camera and a white ball around the scene to diversify data generation. To change this behavior, provide different values for the --listen_to_host and --publish_to_containers parameters in step 2 above.

For the full list of parameters to experiment with, run the following command:


python3 ./packages/nvidia_qa/benchmark/isaac_sim/py_generate_apps.py -h

Training with Multiple Simulators

This section describes how to launch a single training application, such as the object pose estimation autoencoder training, with multiple Isaac Sim Unity3D simulators providing training data in parallel. Before proceeding, follow Setup XServer with Virtual Display on DGX and Install docker and Isaac SDK image below.

The training app and multiple simulators run inside a single docker container. Each simulator publishes training data over a different TCP port, and the training application listens on all the ports and pipes the data into a single SampleAccumulator for training.

Figure: multisim.png (one training application receiving data from multiple simulators over TCP)
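
For illustration only, here is a minimal sketch of what the duplicated receiving side of such a training graph could look like, with two simulator receivers on consecutive TCP ports feeding one accumulator. The node names, channel names, module list, and port numbers are assumptions, not the exact output of the conversion scripts described below:

{
  "modules": ["ml"],
  "graph": {
    "nodes": [
      {
        "name": "simulation_0",
        "components": [
          { "name": "MessageLedger", "type": "isaac::alice::MessageLedger" },
          { "name": "TcpSubscriber", "type": "isaac::alice::TcpSubscriber" }
        ]
      },
      {
        "name": "simulation_1",
        "components": [
          { "name": "MessageLedger", "type": "isaac::alice::MessageLedger" },
          { "name": "TcpSubscriber", "type": "isaac::alice::TcpSubscriber" }
        ]
      },
      {
        "name": "accumulator",
        "components": [
          { "name": "MessageLedger", "type": "isaac::alice::MessageLedger" },
          { "name": "SampleAccumulator", "type": "isaac::ml::SampleAccumulator" }
        ]
      }
    ],
    "edges": [
      { "source": "simulation_0/TcpSubscriber/samples", "target": "accumulator/SampleAccumulator/samples" },
      { "source": "simulation_1/TcpSubscriber/samples", "target": "accumulator/SampleAccumulator/samples" }
    ]
  },
  "config": {
    "simulation_0": { "TcpSubscriber": { "host": "localhost", "port": 55000 } },
    "simulation_1": { "TcpSubscriber": { "host": "localhost", "port": 55001 } }
  }
}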

Scripts in packages/nvidia_qa/benchmark/isaac_sim_unity3d convert Isaac applications intended for training with a single simulator into applications that can be used for multi-simulator training. The entry point is the deploy_single_docker.sh script, which does the following:

  1. Calls create_single_docker.py to convert the training and simulation apps.
  2. Deploys the training app to the remote machine.
  3. Copies the sim executable to the remote machine.
  4. Generates the docker_run.sh script to launch all simulators and the training app inside docker, and copies this script to the remote machine.
  5. Generates the docker run command. The command is logged to the local console as Info: docker cmd. You can copy the command and run it on the server from your deploy path. Alternatively, add the -r or --run argument to ssh into the server and run the docker container at the end of the script.

Check the arguments in this script, then run the following with the appropriate arguments:
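
A minimal sketch, assuming the script takes the target host via -h as deploy.sh does (an assumption; check the script's own argument list first):

# -h <DGX_IP> is an assumption borrowed from the deploy.sh convention above.
./packages/nvidia_qa/benchmark/isaac_sim_unity3d/deploy_single_docker.sh -h <DGX_IP>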

Without additional arguments, the script runs the object pose estimation training with the simulation build at ~/isaac_sim_unity3d/projects/sample/Builds/pose_estimation_training.x86_64, using 9 simulators (3 GPUs, 3 simulators per GPU).

As an example, to deploy and run the freespace dnn training with the sidewalk scene and 60 simulators, execute the following:
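
A rough sketch, assuming 3 GPUs with 20 simulators each (3 × 20 = 60) and using placeholders for the sidewalk build, the freespace training app JSON, and the simulation node name (the -h flag is the same assumption as above):

# 3 GPUs x 20 simulators per GPU = 60 simulators; all <...> values are placeholders.
./packages/nvidia_qa/benchmark/isaac_sim_unity3d/deploy_single_docker.sh -h <DGX_IP> \
    --gpu 3 --sim_per_gpu 20 \
    --sim_build <PATH_TO_SIDEWALK_BUILD_FOLDER> --sim_bin <SIDEWALK_BINARY_NAME> \
    --training_app <FREESPACE_TRAINING_APP_JSON_WITHOUT_EXTENSION> \
    --training_nodes <SIMULATION_NODE_NAME>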

--gpu and --sim_per_gpu specify the number of GPUs used for simulation and the number of simulators launched on each GPU; the total number of simulator instances is the product of the two. Note that the training application always runs on GPU 0, and the simulators are launched on GPU 1 and above. For DGX stations with 4 GPUs, --gpu must therefore not exceed 3. Check whether your training speed is limited by the sample generation rate or by the training itself to determine how many simulators to launch.

--sim_build and --sim_bin specify the path and the filename (without extension) of the simulator build on the local filesystem. For example, if the pose estimation training scene executable is at $HOME/isaac_sim_unity3d/projects/sample/Builds/pose_estimation_training.x86_64, the command-line arguments are --sim_build ~/isaac_sim_unity3d/projects/sample/Builds and --sim_bin pose_estimation_training.

--training_app gives the name of the training app JSON, without an extension, relative to the Isaac root path. --training_nodes gives a comma-separated list of node names to duplicate; in most cases this is just the simulation subgraph node name. To correctly set the TCP ports for the training app, the TCP configs must be present in the training app JSON itself, not in a subgraph JSON.

The Python scripts contain more options that are not exposed in the bash script, so if you have a different use case for multiple simulators, you can use the convert_simulator_app.py and convert_training_app.py scripts directly. Check the argument parser in multi_simulator_utils.py for all options. In particular, for convert_training_app.py, --indexed_edges allows the duplicated edges to be indexed, which is useful for MosaicViewer, and --ignore_edges prevents particular edges from being duplicated, which is useful for a single CameraViewer.
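
Purely as a hypothetical sketch (only --indexed_edges and --ignore_edges are documented above; the edge format and any other required arguments should be taken from the argument parser in multi_simulator_utils.py):

# Hypothetical invocation; confirm the actual arguments against multi_simulator_utils.py.
python3 packages/nvidia_qa/benchmark/isaac_sim_unity3d/convert_training_app.py \
    --indexed_edges <EDGE_TO_INDEX_FOR_MOSAIC_VIEWER> \
    --ignore_edges <EDGE_TO_KEEP_SINGLE_FOR_CAMERA_VIEWER> \
    <OTHER_OPTIONS>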

Setup XServer with Virtual Display on DGX

Unity does not support native headless rendering, and running an X server inside a docker container is not supported. To run the Unity simulator on DGX, you need to start an XServer with virtual displays on all GPUs and pass the X11 socket into the docker container. The following steps set up the XServer with virtual displays; they require sudo access to the server machine.

  1. Generate the /etc/X11/xorg.conf file with this command (only once after installing/flashing the system):
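
    A plausible sketch using standard nvidia-xconfig options to create a virtual screen on every GPU (the specific options and resolution are assumptions; adjust to your setup):

    # Options and resolution are assumptions; adjust to your hardware.
    sudo nvidia-xconfig --enable-all-gpus --allow-empty-initial-configuration \
        --use-display-device=None --virtual=1920x1080
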
Note

You only need to perform the above step once, but you need to perform the steps below after every reboot.

  2. Physically disconnect the monitor from the DGX station.
  3. Stop the Display Manager (see the command sketch after this list).
  4. Kill all Xorg instances that show up in nvidia-smi, then start a new Xorg instance.
  5. Assign access to the X server.
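
A plausible per-reboot sequence, assuming gdm as the display manager (substitute lightdm or another display manager as appropriate; the exact commands are assumptions):

# Step 3: stop the display manager so it releases the GPUs (gdm assumed here).
sudo systemctl stop gdm

# Step 4: kill any Xorg processes still listed by nvidia-smi, then start a fresh X server;
# each GPU then appears as a screen, e.g. DISPLAY=:0.<gpu_index>.
sudo pkill Xorg
sudo Xorg :0 &

# Step 5: allow local clients (including containers sharing the X11 socket) to connect.
sudo DISPLAY=:0 xhost +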

After these steps, if you run nvidia-docker with the -v /tmp/.X11-unix:/tmp/.X11-unix:rw and -e NVIDIA_VISIBLE_DEVICES=all options, you can run Unity using GPU x for graphics by setting DISPLAY=:0.x inside docker.
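
For example, a docker invocation along these lines (the image name and the chosen GPU index are placeholders) would render on GPU 1:

# <ISAAC_IMAGE> is a placeholder; DISPLAY=:0.1 selects GPU 1 for rendering.
sudo nvidia-docker run -it \
    -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -e DISPLAY=:0.1 \
    <ISAAC_IMAGE> /bin/bash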

Install docker and Isaac SDK image

Isaac SDK contains ./engine/engine/build/docker/install_docker.sh. Copy the commands from this file and run them on the server to install docker.

There are two ways to obtain the Isaac SDK image:

  • Option one: Pull from nvcr.io. This requires an NGC account to obtain your API key.
  • Option two: Build locally. Follow Using Docker to create the isaacbuild:latest image locally, then copy it to the server. First, save the docker image to a .tar file locally:
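
One way to do this with standard docker commands (the output filename is arbitrary):

docker save -o isaacbuild.tar isaacbuild:latest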

Copy this image to the server, then load the image into docker on the server:
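
For example, using scp and docker load (hostnames and paths are placeholders):

scp isaacbuild.tar <username>@<server_ip>:~/
ssh <username>@<server_ip> "docker load -i ~/isaacbuild.tar"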
