With simulation-generated data, it is possible to train machine learning models efficiently without labor-intensive supervision. Use the following steps to measure how efficiently a specific hardware setup can generate training data.
Make sure that nvidia-docker and the Isaac Sim docker image are available.
Run the following command to generate the necessary configuration files:
python3 ./packages/nvidia_qa/benchmark/isaac_sim/py_generate_apps.py --gpu_count <NUMBER_OF_GPU> --sim_per_gpu <NUMBER_OF_SIM> --sim_host_ip <SIM_HOST_IP> --mosaic_host_ip <MOSAIC_HOST_IP> --sim_image_name <ISAACSIM_IMAGE> --sim_config_folder /home/<SIM_USERNAME>/deploy/<DEV_USERNAME>/isaac_sim_runner-pkg/packages/nvidia_qa/benchmark/isaac_sim/bridge_config
Where:
- <NUMBER_OF_GPU> is the number of GPUs in the system. The default value is 1.
- <NUMBER_OF_SIM> is the number of simulation containers per GPU. The default value is 1. To fully take advantage of the GPU hardware, experiment with running more than one simulation container per GPU.
- <SIM_USERNAME> is the username used on the simulation device.
- <DEV_USERNAME> is the current username on the development setup.
- <SIM_HOST_IP> is the IP address of the simulation device.
- <MOSAIC_HOST_IP> is the IP address of the device running the monitoring app.
- <ISAACSIM_IMAGE> is the name of the Isaac Sim docker image as shown in the output of docker image ls.
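How the placeholders combine can be sketched in shell; all values below are illustrative assumptions for a hypothetical host, not defaults of the tool:

```shell
#!/bin/sh
# Sketch: filling the py_generate_apps.py placeholders from shell variables.
# Every value here is an example assumption, not a default of the generator.
NUMBER_OF_GPU=1
NUMBER_OF_SIM=2
SIM_HOST_IP=192.168.0.10
MOSAIC_HOST_IP=192.168.0.11
ISAACSIM_IMAGE=isaacsim:latest

# Assemble the command line; each <PLACEHOLDER> maps to one flag.
CMD="python3 ./packages/nvidia_qa/benchmark/isaac_sim/py_generate_apps.py \
 --gpu_count ${NUMBER_OF_GPU} --sim_per_gpu ${NUMBER_OF_SIM} \
 --sim_host_ip ${SIM_HOST_IP} --mosaic_host_ip ${MOSAIC_HOST_IP} \
 --sim_image_name ${ISAACSIM_IMAGE}"
echo "${CMD}"
```

With these example values, the generator would configure 1 GPU with 2 simulation containers on it.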
If you run both apps on the same device, run ifconfig to determine the Docker bridge host IP address under the docker0 adapter. Provide that address as <MOSAIC_HOST_IP>. This allows the containers to communicate with the app running locally.
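A minimal sketch of extracting that bridge address; the sample output line below is an assumption standing in for real interface output (on a real host, pipe the output of ip -4 -o addr show docker0, or of ifconfig docker0, into the same filter):

```shell
#!/bin/sh
# Sketch: pull the IPv4 address of the docker0 bridge out of interface output.
# SAMPLE is a fabricated example line so the parsing step is visible here.
SAMPLE='5: docker0    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0'
MOSAIC_HOST_IP=$(printf '%s\n' "$SAMPLE" | sed -n 's/.*inet \([0-9.]*\)\/.*/\1/p')
echo "$MOSAIC_HOST_IP"
```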
With the configurations generated, deploy both packages to the simulation and monitoring setups with the following commands:
./../engine/engine/build/deploy.sh -h <SIM_HOST_IP> -d x86_64 -p //packages/nvidia_qa/benchmark/isaac_sim:isaac_sim_runner-pkg
./../engine/engine/build/deploy.sh -h <MOSAIC_HOST_IP> -d x86_64 -p //packages/nvidia_qa/benchmark/isaac_sim:mosaic-pkg
On the simulation setup, execute the following commands:
cd ~/deploy/<DEV_USERNAME>/isaac_sim_runner-pkg
./packages/nvidia_qa/benchmark/isaac_sim/isaac_sim_runner
This application starts all required containers and shuts them down on exit.
On the monitoring setup, start the corresponding Mosaic viewer application, then open the Sight application by loading http://<MOSAIC_HOST_IP>:3000 in a web browser. The application listens to all running containers and displays the generated color camera footage in the Mosaic viewer for validation. It can also measure the performance of data generation.
By default, the generated application produces only the color camera image. Modify packages/nvidia_qa/benchmark/isaac_sim/isaacsim.subgraph.json to match your needs.
The default Mosaic application moves the color camera and a white ball around the scene to diversify data generation. Provide a different value for the --publish_to_containers parameter in step 2 above to change this behavior.
For the full list of parameters to experiment with, run the following command:
python3 ./packages/nvidia_qa/benchmark/isaac_sim/py_generate_apps.py -h
This section describes how to launch a single training application, such as the object pose estimation autoencoder training, with multiple Isaac Sim Unity3D simulators providing training data in parallel. Before proceeding, follow Setup XServer with Virtual Display on DGX and Install docker and Isaac SDK image.
The training app and multiple simulators run inside a single docker container. Each simulator publishes training data over a different TCP port, and the training application listens on all the ports and pipes the data into a single SampleAccumulator for training.
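The per-simulator port layout can be sketched as follows; the base port and the one-port-per-simulator scheme are assumptions for illustration, since the actual ports come from the generated application files:

```shell
#!/bin/sh
# Sketch: each simulator publishes on its own TCP port; the training app
# subscribes to all of them and feeds one SampleAccumulator. Base port 55000
# is an assumption, not a value taken from the generated configs.
BASE_PORT=55000
NUM_SIMS=3
i=0
PORTS=""
while [ "$i" -lt "$NUM_SIMS" ]; do
  PORTS="${PORTS}$((BASE_PORT + i)) "
  i=$((i + 1))
done
echo "training app listens on: ${PORTS}"
```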
The scripts in packages/nvidia_qa/benchmark/isaac_sim_unity3d convert Isaac applications intended for training with a single simulator into applications for multi-simulator training. The entry point is the deploy_single_docker.sh script, which does the following:
- Runs create_single_docker.py to convert the training and simulation apps.
- Deploys the training app to remote.
- Copies the sim executable to remote.
- Generates the docker_run.sh script to launch all simulators and the training app inside docker, and copies this script to the remote.
- Generates the docker run command. The command is logged to the local console as Info: docker cmd. You can copy the command and run it on the server from your deploy path. Alternatively, add the -r or --run argument to ssh into the server and run the docker container at the end of the script.
Check the arguments in this script, then run the following with the appropriate arguments:
Without additional arguments, the script runs the object pose estimation training with the default simulation build, using 9 simulators (3 GPUs and 3 simulators per GPU).
As an example, to deploy and run the freespace dnn training with the sidewalk scene and 60 simulators, execute the following:
- --gpu and --sim_per_gpu specify the number of GPUs to use for simulation and the number of simulators to launch on each GPU. The total number of simulator instances is the product of the two. Note that the training application always runs on GPU 0, and the simulators are launched on GPU 1 and above, so on DGX stations with 4 GPUs, --gpu must not exceed 3. Check whether your training speed is limited by the sample generation rate or by training itself to determine how many simulators to launch.
- --sim_build and --sim_bin specify the path and the filename (without extension) of the simulator build on the local filesystem. For example, if the pose estimation training scene executable is at $HOME/isaac_sim_unity3d/projects/sample/Builds/pose_estimation_training.x86_64, the command line arguments are --sim_build ~/isaac_sim_unity3d/projects/sample/Builds --sim_bin pose_estimation_training.
- --training_app gives the name of the training app JSON, without an extension, relative to the
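How these arguments fit together can be sketched in shell, reusing the example path above; the variable names are illustrative:

```shell
#!/bin/sh
# Sketch: derive --sim_build / --sim_bin from the example executable path,
# and compute the simulator count implied by --gpu and --sim_per_gpu.
SIM_EXE="$HOME/isaac_sim_unity3d/projects/sample/Builds/pose_estimation_training.x86_64"
SIM_BUILD=$(dirname "$SIM_EXE")              # value for --sim_build
SIM_BIN=$(basename "$SIM_EXE" .x86_64)       # value for --sim_bin (no extension)

GPU=3          # GPUs used for simulation; training always runs on GPU 0
SIM_PER_GPU=3
TOTAL=$((GPU * SIM_PER_GPU))                 # total simulator instances
echo "--sim_build ${SIM_BUILD} --sim_bin ${SIM_BIN} -> ${TOTAL} simulators"
```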
The Python scripts contain more options that are not exposed in the bash script, so if you have a different use case for multiple simulators, you can use the convert_training_app.py script directly. Check its argument parser for all options. In particular, for convert_training_app.py, --indexed_edges allows the duplicated edges to be indexed, which is useful for MosaicViewer, and --ignore_edges prevents duplication of particular edges, which is useful for a single CameraViewer.
Setup XServer with Virtual Display on DGX
Unity does not support native headless rendering, and running an X server inside a docker container is not supported. To run the Unity simulator on DGX, you need to start an XServer with virtual displays on all GPUs and pass the X11 socket into the docker container. The following steps set up the XServer with virtual displays; they require sudo access to the server machine.
- Generate the /etc/X11/xorg.conf file with this command (only once after installing/flashing the system):
You only need to perform the above step once, but you need to perform the steps below after every reboot.
- Physically disconnect the monitor from the DGX station.
- Stop the Display Manager with this command:
- Kill all xorg instances that show up in nvidia-smi, then start a new Xorg instance:
- Assign access with the following command:
After these steps, if you run nvidia-docker with the -v /tmp/.X11-unix:/tmp/.X11-unix:rw and -e NVIDIA_VISIBLE_DEVICES=all options, you can run Unity using GPU x for graphics with DISPLAY=:0.x inside docker.
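For example, selecting the X screen for a given GPU can be sketched as below; this assumes the XServer was started as above with one virtual screen per GPU:

```shell
#!/bin/sh
# Sketch: screen x on display :0 corresponds to GPU x, so a simulator pinned
# to GPU 2 renders with DISPLAY=:0.2 inside the container.
GPU_INDEX=2
DISPLAY=":0.${GPU_INDEX}"
export DISPLAY
echo "$DISPLAY"
```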
Install docker and Isaac SDK image
Isaac SDK contains the ./engine/engine/build/docker/install_docker.sh script. Copy the commands from this file and run them on the server to install docker.
There are two ways to obtain the Isaac SDK image:
- Option one: Pull from nvcr.io. This requires an NGC account to obtain your API key.
- Option two: Build locally. Follow Using Docker to create the isaacbuild:latest image locally, then copy it to the server. Save the docker image to a .tar locally:
Copy this image to the server, then load the image into docker on the server:
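A sketch of that transfer, shown as a dry run; the image name follows the section above, while <SERVER> is a placeholder hostname:

```shell
#!/bin/sh
# Sketch: move a locally built image to the server. Printed as a dry run;
# drop the echo prefixes to execute for real. <SERVER> is a placeholder.
IMAGE="isaacbuild:latest"
TARBALL="${IMAGE%%:*}.tar"                 # -> isaacbuild.tar

echo "docker save ${IMAGE} -o ${TARBALL}"  # run locally
echo "scp ${TARBALL} <SERVER>:~/"          # copy the tarball to the server
echo "docker load -i ${TARBALL}"           # run on the server
```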