Troubleshooting

NVIDIA AI Workbench is still in beta, so you may encounter edge cases or bugs, or you may have missed a setup step. This section lists common issues along with checklists you can work through to resolve them on your own.

If these checklists don’t help, check the AI Workbench DevZone Forum for the most recent guidance. If that doesn’t solve your issue, post a question on the forum.

  • Check that you downloaded the correct installer for your architecture. The x86 installer won’t work on arm64 and vice versa.

  • Check that you made the installer file executable.

  • Check that fuse and libfuse2 are installed as required to support AppImage.

  • Run the installer again.
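
The first three checks above can be automated. The following is a minimal sketch, not part of AI Workbench: the function name `installer_problems` and its parameters are hypothetical, and it assumes that the `fusermount` binary is a reasonable indicator that fuse is installed, which may not hold on every distribution.

```python
import os
import platform
import shutil


def installer_problems(installer_path, installer_arch):
    """Return a list of problems found; an empty list means every check passed.

    installer_arch is the architecture the downloaded installer targets,
    for example "x86_64" or "arm64" (hypothetical parameter for this sketch).
    """
    problems = []

    # Treat common aliases as the same architecture.
    aliases = {"amd64": "x86_64", "arm64": "aarch64"}
    machine = platform.machine().lower()
    machine = aliases.get(machine, machine)
    wanted = aliases.get(installer_arch.lower(), installer_arch.lower())
    if machine != wanted:
        problems.append(f"installer targets {wanted} but this machine is {machine}")

    # The installer file must exist and be executable by the current user.
    if not os.path.isfile(installer_path):
        problems.append(f"{installer_path} does not exist")
    elif not os.access(installer_path, os.X_OK):
        problems.append(f"{installer_path} is not executable (try chmod +x)")

    # Assumption: fusermount is provided by the fuse package.
    if shutil.which("fusermount") is None:
        problems.append("fusermount not found; fuse may not be installed")

    return problems
```

Calling `installer_problems("/path/to/installer", "x86_64")` returns a list of messages to work through. The sketch does not check for libfuse2 directly, so confirm that package with your distribution's package manager.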

If none of those work, please post an issue in the AI Workbench DevZone Forum.

  • Check that your Windows version supports WSL2. It must be Windows 11, or Windows 10 build 19041 or higher. To check your Windows version, run winver.

  • There is a known issue, currently being tracked, that causes indeterminate behavior when using Docker Desktop on Windows. Podman is recommended on Windows while this issue is being investigated.
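
The version requirement above comes down to a build-number comparison, because every Windows 11 build number is higher than 19041. The following is a minimal sketch with a hypothetical helper name, `build_supports_wsl2`. It assumes a version string of the form `major.minor.build`, which is what Python's `platform.version()` typically reports on Windows.

```python
def build_supports_wsl2(version):
    """Return True if a 'major.minor.build' version string meets the WSL2 minimum.

    Example input: "10.0.19045". Strings that cannot be parsed return False.
    """
    parts = version.split(".")
    try:
        major = int(parts[0])
        build = int(parts[2])
    except (IndexError, ValueError):
        return False
    # Windows 10 and Windows 11 both report major version 10.
    return major >= 10 and build >= 19041
```

On a Windows machine this could be called as `build_supports_wsl2(platform.version())` after `import platform`.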

  • Check that you correctly set the absolute path on your local machine to the SSH key. It should be something like /home/<local_user_name>/.ssh/<ssh_key>.

  • Check that you correctly set the absolute path on the remote to the Workbench binaries. It should be something like /home/<remote_user_name>/.nvwb/bin/

  • Check the permissions on the SSH key are correct. They should be 600.

  • Check the permissions on the folder the key is in. They should be 700.

  • Check that the remote system is on and that you have the correct IP address and username.
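
The path and permission checks on the SSH key can be scripted on the local machine. The following is a minimal sketch, assuming a Linux or macOS host. The function name `ssh_key_problems` is hypothetical and is not part of AI Workbench.

```python
import os
import stat


def ssh_key_problems(key_path):
    """Return a list of problems with the SSH key path and its permissions."""
    problems = []

    if not os.path.isabs(key_path):
        problems.append(f"{key_path} is not an absolute path")
    if not os.path.isfile(key_path):
        problems.append(f"{key_path} does not exist")
        return problems

    # The key itself should be readable and writable by the owner only (600).
    key_mode = stat.S_IMODE(os.stat(key_path).st_mode)
    if key_mode != 0o600:
        problems.append(f"{key_path} has mode {key_mode:o}, expected 600")

    # The containing folder should be accessible by the owner only (700).
    key_dir = os.path.dirname(os.path.abspath(key_path))
    dir_mode = stat.S_IMODE(os.stat(key_dir).st_mode)
    if dir_mode != 0o700:
        problems.append(f"{key_dir} has mode {dir_mode:o}, expected 700")

    return problems
```

An empty list means the path is absolute and both modes match the values in the checklist. Any reported mode can be corrected with `chmod 600` on the key or `chmod 700` on the folder.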

If none of those work, please post an issue in the AI Workbench DevZone Forum.

Here are some common issues you may face when bringing your own container to the project.

  1. If the container build is failing at the container pull step, check the following.

    • Ensure the container image name and tag are both specified and spelled correctly.

    • You may not have connected an API key integration needed to pull the container of interest. Add the API key to AI Workbench if this is the case.

    • You may be missing access permissions to the container. Ensure those permissions are granted and try again.

  2. If you are running into permissions issues with the pip install step, you may need to clear the package_manager_environment section by setting its fields to empty strings:


    package_manager_environment:
      name: ""
      target: ""


  3. If the container builds and runs properly but you have trouble accessing certain locations in the project environment, you may see an error such as:


mv: cannot create regular file at '/path/to/a/location/my-file.txt': permission denied

AI Workbench builds the project environment as root. However, if the user information is unspecified in the project spec file, AI Workbench creates a new user and runs the container as this non-root user. This can lead to permissions issues when accessing locations in the container that the root user created during the container build.

It is best practice to decouple the project from the underlying container, but if that is not possible, the easiest workaround is to tell the Workbench Service to run the container as the root user.

Change the project spec file to the following:


    user_info:
      uid: "0"
      gid: "0"
      username: "root"
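
To confirm that a failure really is the root versus non-root mismatch described above, it can help to see which user the container is running as and whether that user can create files in the target location. The following is a minimal sketch to run inside the container. The function name `creation_report` is hypothetical, and it assumes a Linux container where `os.getuid()` is available.

```python
import os


def creation_report(target_path):
    """Report the current user IDs and whether a file could be created at target_path."""
    parent = os.path.dirname(os.path.abspath(target_path))
    parent_exists = os.path.isdir(parent)
    return {
        "uid": os.getuid(),  # 0 means the process is running as root
        "gid": os.getgid(),
        "parent": parent,
        "parent_exists": parent_exists,
        # Creating a file needs write and search permission on the parent folder.
        "can_create": parent_exists and os.access(parent, os.W_OK | os.X_OK),
    }
```

If `uid` is not 0 and `can_create` is False for a folder that was created during the build, the mismatch described above is the likely cause.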


If none of those work, please post an issue in the AI Workbench DevZone Forum.

© Copyright 2023-2024, NVIDIA. Last updated on Jan 21, 2024.