The Project Specification#

Overview of the Project Specification#

A Workbench project is just a git repository with some metadata files that provide information to drive automation and the overall user experience.

The central file for the metadata is the specification file, i.e. /project/.project/spec.yaml.

You do not need to know anything about the specification file to use Workbench, but you must be very careful if you decide to edit it.

  • Workbench reads and writes to this file

  • It is a yaml file, and is thus very particular about formatting

  • It has a schema that you must follow

  • It has required fields and optional fields

  • You can edit the file but you cannot delete it

  • If you edit the file and things break, they aren’t broken permanently. Just fix the file.
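If you do hand-edit the file, a quick syntax check before Workbench reads it again can save time. The following one-liner is a minimal sketch that assumes Python and PyYAML are available on your system; it catches YAML formatting mistakes but not schema violations.

python3 -c "import yaml; yaml.safe_load(open('/project/.project/spec.yaml'))"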

Project Specification Concerns#

The specification has four main concerns broken into sections.

You can see and edit the sections in the /project/.project/spec.yaml file.

  • meta: Basic metadata about the project like name and description

  • layout: Directory structure and how versioning backends are applied

  • environment: Information about the container image and environment, including configured applications

  • execution: Runtime information for the project container

There are some things that aren’t concerns of the specification.

  • Git history or the git remote being used

  • The individual files in the project, except potentially the compose.yaml file
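Before reading the full example in the next section, the following skeleton sketches how the four concerns map to top-level keys in spec.yaml. The comments are placeholders, not real values; see the field definitions later on this page for details.

specVersion: v2
specMinorVersion: 1
meta:
  # name, description, createdOn, defaultBranch, ...
layout:
  # one entry per tracked project directory
environment:
  base:
    # container image, applications, and package metadata
execution:
  # resources, secrets, and mounts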

Example AI Workbench Project Spec File#

The following is an example of an AI Workbench project spec.yaml file.

specVersion: v2
specMinorVersion: 1
meta:
  name: example-project
  image: project-example-project
  description: An example project using PyTorch
  labels: []
  createdOn: "2024-01-04T23:32:17Z"
  defaultBranch: main
layout:
- path: code/
  type: code
  storage: git
- path: models/
  type: models
  storage: gitlfs
- path: data/
  type: data
  storage: gitlfs
- path: data/scratch/
  type: data
  storage: gitignore
environment:
  base:
    registry: nvcr.io
    image: nvidia/ai-workbench/pytorch:1.0.2
    build_timestamp: "20231212000523"
    name: PyTorch
    supported_architectures: []
    cuda_version: "12.2"
    description: A Pytorch 2.1 environment with CUDA 12.2
    entrypoint_script: ""
    labels:
    - cuda12.2
    - pytorch2.1
    apps:
    - name: jupyterlab
      type: jupyterlab
      class: webapp
      start_command: jupyter lab --allow-root --port 8888 --ip 0.0.0.0 --no-browser
        --NotebookApp.base_url=\$PROXY_PREFIX --NotebookApp.default_url=/lab --NotebookApp.allow_origin='*'
      health_check_command: '[ \$(echo url=\$(jupyter lab list | head -n 2 | tail
        -n 1 | cut -f1 -d'' '' | grep -v ''Currently'' | sed "s@/?@/lab?@g") | curl
        -o /dev/null -s -w ''%{http_code}'' --config -) == ''200'' ]'
      timeout_seconds: 90
      stop_command: jupyter lab stop 8888
      user_msg: ""
      icon_url: ""
      webapp_options:
        autolaunch: true
        port: "8888"
        proxy:
          trim_prefix: false
        url_command: jupyter lab list | head -n 2 | tail -n 1 | cut -f1 -d' ' | grep
          -v 'Currently'
    - name: tensorboard
      type: tensorboard
      class: webapp
      start_command: tensorboard --logdir \$TENSORBOARD_LOGS_DIRECTORY --path_prefix=\$PROXY_PREFIX
        --bind_all
      health_check_command: '[ \$(curl -o /dev/null -s -w ''%{http_code}'' http://localhost:\$TENSORBOARD_PORT\$PROXY_PREFIX/)
        == ''200'' ]'
      timeout_seconds: 90
      stop_command: ""
      user_msg: ""
      icon_url: ""
      webapp_options:
        autolaunch: true
        port: "6006"
        proxy:
          trim_prefix: false
        url: http://localhost:6006
    programming_languages:
    - python3
    icon_url: ""
    image_version: 1.0.3
    os: linux
    os_distro: ubuntu
    os_distro_release: "22.04"
    schema_version: v2
    user_info:
      uid: "1001"
      gid: "1001"
      username: "appuser"
    package_managers:
    - name: apt
      binary_path: /usr/bin/apt
      installed_packages:
      - curl
      - git
      - git-lfs
      - vim
    - name: pip
      binary_path: /usr/local/bin/pip
      installed_packages:
      - jupyterlab==4.0.7
    package_manager_environment:
      name: ""
      target: ""
execution:
  apps: []
  resources:
    gpu:
      requested: 1
      sharedMemoryMB: 1024
  secrets: []
  mounts:
  - type: project
    target: /project/
    description: project directory
    options: rw
  - type: volume
    target: /data/tensorboard/logs/
    description: Tensorboard Log Files
    options: volumeName=tensorboard-logs-volume

AI Workbench Project Spec Definition#

Project Metadata#

The spec file includes project metadata, such as version, name, description, and directory structure. The following are the project metadata fields.

Field

Description

Example Usage

specVersion

The schema version number of the current project spec.

specVersion: v2

specMinorVersion

The schema minor version number of the current project spec.

specMinorVersion: 1

meta

Metadata for the project to appear correctly in the AI Workbench application.

meta.name

The name of the project.

name: hello-world

meta.image

The name of the project container image. This image name is local to the computer and isn’t pushed to a container registry.

image: project-hello-world

meta.description

The description of the project.

description: An example project using PyTorch

meta.labels

A list of labels for the project.

meta.createdOn

When the project was created, formatted as an RFC 3339 string.

createdOn: "2024-01-04T23:32:17Z"

meta.defaultBranch

The default Git branch for the project.

defaultBranch: main

layout

A list of information about the project directories. AI Workbench uses this information to reconcile configuration, such as by adding a path to the .gitignore or .gitattributes files.

  • path is a project directory relative to the root of the repository.

  • type is the type of content in the directory. Valid values are code, data, and models.

  • storage is how the data in the directory should be stored. Valid values are git, gitlfs, gitignore.

layout:
  - path: data/scratch/
    type: data
    storage: gitignore
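AI Workbench translates these entries into standard Git configuration. As a rough illustration only (the exact patterns Workbench writes may differ), the layout in the example spec earlier on this page implies entries along these lines:

# .gitignore: the storage: gitignore path is excluded from Git
data/scratch/

# .gitattributes: storage: gitlfs paths are tracked with Git LFS
models/** filter=lfs diff=lfs merge=lfs -text
data/** filter=lfs diff=lfs merge=lfs -text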

Project Environment Information#

The spec file includes environment information, such as the container image for the project, and additional packages that are installed in it. You can manually configure this section of the spec, but typically it is populated automatically from the labels of the environment image that you select when you create a project.

Warning

Any manual modifications to the data in the environment.base section of the spec.yaml file are overridden when the environment version is updated.

In the environment section of the spec file, all fields are children of the base field.

environment:
  base:
    ... all fields

The following are the environment.base fields.

Field

Description

Example Usage

registry

The container registry that has the image.

registry: nvcr.io

image

The container image on top of which the project container is built.

image: nvidia/pytorch:23.12-py3

build_timestamp

The timestamp of the last image build, specified as year, month, day, hour, minute, second (YYYYMMDDHHMMSS).

build_timestamp: "20231212000523"

name

The name of the container.

name: PyTorch

supported_architectures

A list of architectures that the image supports.

supported_architectures:
  - "amd64"
  - "arm64"

cuda_version

The version of CUDA installed in the environment, if applicable. This field tells AI Workbench what version of CUDA the host driver must support. If this value is not set correctly, AI Workbench cannot warn you about an incompatible host driver, and you may experience runtime errors when starting the container.

cuda_version: "12.2"

description

A description of the container.

description: A Pytorch 2.1 environment with CUDA 12.2

entrypoint_script

The path to a script that runs when the project container starts.

entrypoint_script: /path/to/script.sh

labels

A list of labels for the container, such as search term keywords or descriptors.

labels:
  - cuda12.2
  - pytorch2.1
  - python3
  - jupyterlab

apps

A list of applications installed in the container. For the definition of the fields in apps, see Project App Information.

apps:
- name: jupyterlab
  ... more fields
- name: tensorboard
  ... more fields

programming_languages

A list of programming languages installed in the container.

programming_languages:
  - python3

icon_url

A link to the icon or image for the container.

icon_url: https://my-website.com/my-image.png

image_version

The version number of the container image, if any.

image_version: 1.0.3

os

The name of the container operating system.

os: linux

os_distro

The name of the container operating system distribution.

os_distro: ubuntu

os_distro_release

The release version of the container operating system distribution.

os_distro_release: "22.04"

schema_version

Metadata for the version of the container label schema currently read by AI Workbench.

schema_version: v2

user_info

Information about the user that the container processes should run as.

user_info:
  uid: "1001"
  gid: "1001"
  username: "appuser"

package_managers

A list of package managers installed in the container, and for each one an optional list of installed packages.

package_managers:
- name: conda
  binary_path: /opt/conda/bin/conda
  installed_packages:
  - python=3.9.18
  - pip
- name: apt
  binary_path: /usr/bin/apt
  installed_packages:
  - ca-certificates
  - curl
- name: pip
  binary_path: /opt/conda/bin/pip
  installed_packages: []

package_manager_environment

A package manager environment to activate before installing packages and starting applications. This is useful if your container image was built with a virtual environment or a conda environment.

package_manager_environment:
  name: ""
  target: ""

Project App Information#

The spec file includes information about applications installed in the environment and custom applications, such as commands and parameters for running them. The following are the application information fields.

Field

Description

Example Usage

name

The name of the application. This name appears in the user interface.

name: jupyterlab

type

The type of application, used to determine what application-specific automation is run.

type: jupyterlab

class

The class of application, used to determine what optional configuration options are available. Valid values are webapp, process, and native.

class: webapp

start_command

The shell command used to start the application. Must not be a blocking command.

start_command: jupyter lab...

health_check_command

The shell command used to check the health or status of the application. A return value of zero means the application is running and healthy. A non-zero return value means the application is not running or is unhealthy. See the example after this table.

health_check_command: '<code>'

timeout_seconds

The number of seconds that AI Workbench waits for the health_check_command to complete. Valid values are greater than 0 and less than 3600. The default value is 60.

timeout_seconds: 90

stop_command

The shell command used to stop the application.

stop_command: jupyter lab stop 8888

user_msg

An optional message that appears to the user when the application is running. If class is webapp, you can use the placeholder string {{URL}} in the message, and it is populated after the app starts.

user_msg: ""

icon_url

An optional link to the icon or image used for the application.

icon_url: ""

webapp_options

If class is webapp, the following options are available.

  • autolaunch - True if AI Workbench should automatically open the application URL for the user; otherwise, false.

  • port - The port that the application runs on.

  • proxy - Reverse proxy settings. If specified, include trim_prefix - True if the AI Workbench reverse proxy should remove the application-specific URL prefix before forwarding the request to the application; otherwise, false.

  • url - The static URL used to access the application.

  • url_command - Alternatively, a shell command used to get the URL for the application. The output from this command is used as the URL.

If class is process, the following options are available.

  • wait_until_finished - True if the AI Workbench desktop application should wait for the application to finish; false if the desktop app should let it run in the background. If true the desktop app notifies you when the process completes. The CLI always waits.

webapp_options:
  autolaunch: true
  port: "8888"
  proxy:
    trim_prefix: false
  url_command: <your command>

webapp_options:
  autolaunch: true
  port: "6006"
  proxy:
    trim_prefix: false
  url: http://localhost:6006

process_options:
  wait_until_finished: true
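As a concrete illustration of health_check_command, the tensorboard app in the example spec earlier on this page polls its own URL and checks for an HTTP 200 response. A similar check for a hypothetical web app listening on port 8080 might look like the following; the port is an illustrative assumption.

health_check_command: '[ \$(curl -o /dev/null -s -w ''%{http_code}'' http://localhost:8080\$PROXY_PREFIX/) == ''200'' ]'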

Project Runtime Information#

The spec file includes runtime information, such as environment variables, the number of GPUs to be mounted in the container, and custom application information. The following are the runtime information fields.

Field

Description

Example Usage

execution

Information about how to run the project.

execution.apps

A list of custom applications installed in the project that are not part of the container environment. For the definition of the fields in apps, see Project App Information.

apps:
- name: jupyterlab
  ... more fields
- name: mychat
  ... more fields

execution.resources

Host resources that are requested or required to run the project.

  • gpu is the number of GPUs requested. If no GPUs are available, the project can still be started without any GPUs.

  • sharedMemoryMB is the amount of shared memory (in MB) to allocate to the project container.

For more information, see GPU Configuration.

resources:
  gpu:
    requested: 0
  sharedMemoryMB: 0

execution.secrets

A list of sensitive environment variables to set before the project starts. Specify the name and description of each variable only. The value of each variable is not part of the spec file and is configured at runtime. For more information, see Environment Variables.

secrets:
- variable: secret1
  description: Secret 1

execution.mounts

A list of external folders and files that are used by the project, and where they should be located in the project container. Required values to configure the mount (like the source directory) are not part of the spec file, and are configured at runtime.

  • For type, valid values are project, host, volume, or tmp. There is exactly one project mount (or the container fails to start).

  • target is the target location within the project container. Target locations are absolute paths and include the trailing slash.

For more information, see AI Workbench Mounts.

mounts:
  - type: project
    target: /project/
    description: project directory
    options: rw
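The example spec earlier on this page also shows a volume mount for TensorBoard logs. A host mount follows the same pattern; the target path, description, and options below are illustrative assumptions, and the host source directory is configured at runtime rather than in the spec file.

mounts:
  - type: host
    target: /mnt/data/
    description: host data directory
    options: rw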

Customize Your Container#

If you want to change the behavior of the container environment for a single project, you can manually edit the metadata contained in the spec.yaml file. First modify your spec.yaml file as described in this section, and then rebuild your project environment.

Note

If you want to use one of the pre-built containers and make simple customizations, such as adding packages, see Walkthrough: Customize Your Environment and Development Environments instead.

If you want to create a fully-custom container that you can use for your own projects, or that you can publish and share with other AI Workbench users, see Use Your Own Container instead. This is an advanced scenario.

Use the following list to determine what changes to make to your spec file.

  • Change Required — Changes that must occur to customize the project container. Without these changes, your project does not build correctly.

  • Change Recommended — Best-practice changes that you should make to the project container. These are fields that are used by the NVIDIA AI Workbench client software. Your project might still build, and components might still function; however, your experience working on the project inside the AI Workbench UI might be negatively impacted.

  • Change Optional — Change these fields only if they are relevant to your project.

To customize your container, navigate to the .project/spec.yaml file of your project, and scroll to the environment section. Use the information in the following table to edit your spec file. The full description and example usage for each field is in AI Workbench Project Spec Definition.

Field

Change?

Recommended Action

registry

Required

Specify the container registry for your container of interest; if you have the Dockerfile, you may need to first build the container and push it to a registry.

image

Required

Specify the container image (and tag, if any) for your container of interest; if you have the Dockerfile, you may need to first build the container and push it to a registry. Also include the namespace if needed, but do not include the registry.

build_timestamp

Optional

No manual updates are necessary. AI Workbench updates this field when you build your environment.

name

Required

Specify a name for the container. AI Workbench displays the old container information in the UI if left unchanged.

supported_architectures

Recommended

Specify a list of supported architectures for your container, or leave as a blank list if not applicable.

cuda_version

Required

Specify the version of CUDA installed in this container, or leave blank if not applicable. If this field is not updated, AI Workbench cannot correctly match the host driver to the container.

description

Required

Specify a brief and informative description of the container. AI Workbench displays the old container information in the UI if left unchanged.

entrypoint_script

Optional

If your project requires any custom action when the container starts, specify the path to an entrypoint script here.

labels

Recommended

Specify a list of labels to attach to your container. Consider these as search term keywords or descriptors for your container.

apps

Recommended

Specify the applications installed in the container. For details, see Project App Information.

programming_languages

Recommended

Specify the programming languages installed in this container, or leave blank if not applicable.

icon_url

Optional

If you would like AI Workbench to display an icon for this container, specify the URL of the icon image here, or leave it blank to use the default.

image_version

Recommended

Specify the version number of the container image, if any.

os

Recommended

Specify the name of the container operating system.

os_distro

Recommended

Specify the name of the container operating system distribution.

os_distro_release

Recommended

Specify the release version of the container operating system distribution.

schema_version

Optional

There is no need to update this field; however, if it is incorrect, the project breaks. The current version is v2.

user_info

Recommended

AI Workbench automatically provisions a user for you when you run the container, but this field overrides that user.

package_managers

Recommended

For each package manager in the container, specify the name of the package manager, the complete path to the manager binary, and a list of packages installed by that manager. If this remains unchanged, the package manager widget will likely not work, especially if you are working with a venv or conda environment.

package_manager_environment

Recommended

Specify the package manager environment that should be activated before installing packages and starting applications. This is useful if your container image was built with a virtual environment or a conda environment.
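Putting the required changes together, a customized environment.base section might begin like the following sketch. The registry, image, name, and description values are hypothetical placeholders for your own container, not real values.

environment:
  base:
    registry: my-registry.io
    image: my-namespace/my-custom-image:1.0
    name: My Custom Container
    cuda_version: "12.2"
    description: A custom container for my project
    ... remaining fields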