Work with Any Git Repository
Overview
- You can clone any Git repository into AI Workbench and it will set up a project for you.
If the repository does not have a .project/spec.yaml file, AI Workbench detects this during the clone and prompts you to select a base image. It then scaffolds a default specification file so you can start working immediately.
- Cloning a non-project is the recommended way to bring an existing repository into AI Workbench.
You do not need to manually create configuration files before cloning. AI Workbench handles the initial setup and you can customize the specification afterward.
- You can also create a specification file manually before cloning.
This gives you full control over the initial configuration, including layout, packages, and resource settings. This approach is useful if you need a specific setup from the first build or if you are preparing a repository for others to clone.
Key Concepts
- Specification File:
The .project/spec.yaml file that defines how AI Workbench builds and configures the project container.
- Non-Project Clone:
Cloning a repository that does not have a .project/spec.yaml file. AI Workbench detects this, prompts you to select a base image, and scaffolds the specification.
- Base Container Image:
A pre-built Docker image that provides the foundation for the project environment (OS, tools, frameworks).
Clone a Non-project
- Step One: Get the repository URL.
If the repository is private, it must come from GitHub, GitLab or a self-hosted GitLab instance that you have access to. This requires you to have the relevant integration configured.
If it is public, it can come from any Git platform.
- Step Two: Open a location and clone using the URL.
Select Location Manager > Location Card
Select Clone Project
In Clone Project > Git Repository URL, enter the repository URL
(Optional) Change the default path on disk in Clone Project > Path
Select Clone Project > Clone
- Step Three: Select Clone or Clone and Detach.
Clone keeps the remote origin. Use this if you want to pull upstream changes.
Clone and Detach removes the remote origin and history. Use this if you want to push changes to your own account.
- Step Four: Select a base image.
AI Workbench detects that the repository does not have a .project/spec.yaml file and prompts you to pick a base image. Select one of the base images and proceed.
Success: The project tab opens and the container starts building with the scaffolded specification.
Existing build files in the repository may affect the container build.
AI Workbench looks for configuration files at the top level of the repository during the build.
If the repository has requirements.txt, apt.txt, preBuild.bash, or postBuild.bash files, they are used during the build and may require you to handle conflicts.
If these files are not at the top level, AI Workbench does not pick them up and they do not affect the build.
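To see in advance which of these files a repository has at its top level, a small check like the following may help. It is a minimal sketch: the file names come from the note above, while the function name find_build_files is hypothetical and not part of AI Workbench.

```shell
#!/usr/bin/env bash
# Print the build helper files that exist at the top level of a repository.
# File names are taken from the note above; find_build_files is a
# hypothetical helper, not an AI Workbench command.
find_build_files() {
  local repo_dir="${1:-.}"
  local name
  for name in requirements.txt apt.txt preBuild.bash postBuild.bash; do
    # Only the top level is checked; nested copies are ignored on purpose.
    if [ -f "$repo_dir/$name" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running find_build_files /path/to/repo before cloning lists the files to review. No output means none of the four files is present at the top level.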
After cloning, you can customize the scaffolded specification.
The scaffolded spec.yaml provides a minimal working configuration.
You can edit it to add packages, GPU resources, secrets, mounts, and other features.
See the Add Optional Features to Your Project section below.
Manually Create a Project Specification
- Creating a specification manually gives you full control over the project configuration before cloning.
This is useful when you need a specific setup from the first build or when preparing a repository for others to clone. Only a few fields are required: version information, base container image details, and a project mount.
- Step One: Create the directory structure.
Navigate to your repository root directory
Create the recommended directory structure:
mkdir -p .project code models data data/scratch
- Step Two: Create the minimal spec.yaml file.
Create a minimal spec.yaml file with this template:
cat > .project/spec.yaml << 'EOF'
specVersion: v2
specMinorVersion: 2
meta:
  name: my-project-name
  image: project-my-project-name
  description: My AI Workbench Project Description
  createdOn: "2025-11-15T12:00:00Z"
  defaultBranch: main
  labels: []
layout:
- path: code/
  type: code
  storage: git
- path: models/
  type: models
  storage: gitlfs
- path: data/
  type: data
  storage: gitlfs
- path: data/scratch/
  type: data
  storage: gitignore
environment:
  base:
    registry: nvcr.io
    image: nvidia/ai-workbench/python-basic:1.0.8
    name: Python Basic
    description: A Python environment with JupyterLab
    os: linux
    os_distro: ubuntu
    os_distro_release: "22.04"
    schema_version: v2
execution:
  mounts:
  - type: project
    target: /project/
    description: Project directory
    options: rw
EOF
Edit the file to customize these required fields:
meta.name - Lowercase alphanumeric with hyphens (e.g., my-ml-project)
meta.image - Should match pattern project-<name> (e.g., project-my-ml-project)
meta.description - Brief description of your project
meta.createdOn - ISO 8601 timestamp with current date and time
environment.base.registry - Container registry (e.g., nvcr.io, docker.io)
environment.base.image - Base container image with tag (choose from NGC catalog)
environment.base.name - Human-readable container name
environment.base.description - Brief description of the container
environment.base.os_distro - Must match the actual OS distribution in your base container (e.g., ubuntu, centos)
environment.base.os_distro_release - Must match the actual OS release in your base container (e.g., "22.04", "20.04")
- Step Three: Commit and push the .project directory.
Stage, commit, and push the new directories:
git add .project/ code/ models/ data/
git commit -m "Add AI Workbench project specification"
git push origin main
Ensure the repository is accessible (public or with appropriate credentials)
- Step Four: Clone the repository in AI Workbench.
Select Location Manager > Location Card
Select Clone Project
In Clone Project > Git Repository URL, enter your repository URL
Select Clone Project > Clone
AI Workbench reads the spec.yaml file and builds the container
Success: The project appears in the Location Manager and the container builds successfully.
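The naming rules from Step Two can be checked before committing. The sketch below is illustrative only: the regular expression is one reading of "lowercase alphanumeric with hyphens" and may not match the exact rule AI Workbench applies. The function names check_project_names and iso_timestamp are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch only: the regular expression is an assumed interpretation of
# "lowercase alphanumeric with hyphens"; AI Workbench may apply a stricter
# or looser rule. Both function names are hypothetical helpers.

# Succeeds when NAME looks valid and IMAGE equals project-NAME.
check_project_names() {
  local name="$1" image="$2"
  if ! printf '%s\n' "$name" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*$'; then
    echo "meta.name should be lowercase alphanumeric with hyphens: $name" >&2
    return 1
  fi
  if [ "$image" != "project-$name" ]; then
    echo "meta.image should be project-$name, got: $image" >&2
    return 1
  fi
  return 0
}

# Print the current UTC time in the same ISO 8601 form as the template,
# suitable for meta.createdOn.
iso_timestamp() {
  date -u +"%Y-%m-%dT%H:%M:%SZ"
}
```

For example, check_project_names my-ml-project project-my-ml-project succeeds, while a name containing uppercase letters or underscores fails with a message on standard error.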
Use NVIDIA base images for simpler specifications.
NVIDIA-provided base images from nvcr.io have proper labels that minimize configuration effort. You still need to specify the OS distribution and release to match the actual container. Custom containers need additional Docker labels for full AI Workbench integration. See Use a Custom Container Image for custom container requirements.
The validation command has limitations.
The nvwb validate project-spec command checks basic syntax and structure, but it does not catch all errors.
The most reliable test is to push your repository and clone it with AI Workbench.
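Before pushing, a quick local check can at least confirm that the sections used by the minimal template are present. This is a rough sketch, not a substitute for the clone test. It searches the file as plain text rather than parsing YAML, the list of patterns is an assumption based on the minimal template above, and check_spec_basics is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Rough text-level check of a spec file. It does not parse YAML and does not
# validate values. The pattern list is an assumption drawn from the minimal
# template, not an official list of required fields.
check_spec_basics() {
  local spec="${1:-.project/spec.yaml}"
  local pattern
  local status=0
  if [ ! -f "$spec" ]; then
    echo "spec file not found: $spec" >&2
    return 1
  fi
  for pattern in '^specVersion:' '^meta:' '^environment:' '^execution:' \
                 'type:[[:space:]]*project[[:space:]]*$'; do
    if ! grep -Eq "$pattern" "$spec"; then
      echo "expected pattern not found in $spec: $pattern" >&2
      status=1
    fi
  done
  return "$status"
}
```

Called with no argument from the repository root, it reads .project/spec.yaml and returns a non-zero status if the file or any expected section is missing.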
Add Optional Features to Your Project
- The minimal specification is functional, but you can add optional features as needed.
Start simple and add complexity only when required. Each addition should serve a specific purpose for your project.
- Customize project layout for different storage needs.
The minimal specification includes a standard layout with code, models, data, and scratch directories. Modify the layout section to change storage backends for different directories. Storage options: git (regular), gitlfs (large files), gitignore (excluded). Each layout entry requires type and storage fields.
- Add package dependencies with helper files.
Create these files in the .project/ directory:
requirements.txt - Python packages (one per line)
apt.txt - System packages (one per line)
preBuild.bash - Runs before container build
postBuild.bash - Runs after container build
See Manage Packages for details.
- Configure GPU and memory resources.
Add a resources section to execution:
execution:
  resources:
    gpu:
      requested: 1
    sharedMemoryMB: 2048
  mounts:
  - type: project
    target: /project/
    description: Project directory
    options: rw
Both gpu.requested and sharedMemoryMB are optional. See Configure GPU Settings for Project Container for hardware configuration.
- Add environment secrets for API keys.
Add a secrets section to execution:
execution:
  secrets:
  - variable: NVIDIA_API_KEY
    description: NVIDIA API key for accessing models
  - variable: OPENAI_API_KEY
    description: OpenAI API key
  mounts:
  - type: project
    target: /project/
    description: Project directory
    options: rw
Only variable is required; description is optional but recommended.
- Add volume mounts for shared storage.
Add more mounts to execution.mounts:
execution:
  mounts:
  - type: project
    target: /project/
    description: Project directory
    options: rw
  - type: volume
    target: /data/
    description: Shared dataset storage
    options: volumeName=my-data-volume
Each mount requires type and target (absolute path with trailing slash). The project mount is always required.
- For complete field documentation, see:
AI Workbench Project Specification - Full specification reference
Manage Packages - Package management
Manage Runtime Settings - Mounts and variables