# Google Vertex AI

> Configure OpenShell to route inference traffic through Google Vertex AI, including Anthropic Claude and Gemini models.

Google Vertex AI is a managed machine learning platform that hosts Anthropic Claude, Gemini, and third-party models through Google Cloud. OpenShell can route `inference.local` traffic to Vertex AI using gateway-managed credential refresh, so sandbox agents do not handle GCP credentials directly.

## Prerequisites

Before creating a Vertex AI provider, ensure you have:

* A GCP project with the [Vertex AI API](https://console.cloud.google.com/apis/library/aiplatform.googleapis.com) enabled.
* One of the following:
  * A GCP service account with the **Vertex AI User** role and a downloaded JSON key file, for production use.
  * The `gcloud` CLI with Application Default Credentials configured, for local development.

## Authentication

The `google-vertex-ai` provider supports two credential sources.

### Service Account Key

Supply the JSON key file content as the `GOOGLE_SERVICE_ACCOUNT_KEY` credential. OpenShell persists that value only as gateway-side refresh bootstrap material until you update or delete it. The raw service-account JSON and private key are not sandbox runtime credentials and are not exposed to sandboxes. Runtime inference requests use short-lived access tokens minted by the gateway and stored under a separate credential key.

```shell
openshell provider create \
  --name vertex-prod \
  --type google-vertex-ai \
  --credential GOOGLE_SERVICE_ACCOUNT_KEY="$(cat /path/to/key.json)" \
  --config VERTEX_AI_PROJECT_ID=my-gcp-project \
  --config VERTEX_AI_REGION=us-central1
```

Then configure gateway-managed refresh so the gateway uses the private key as refresh bootstrap material and rotates access tokens:

```shell
openshell provider refresh configure vertex-prod \
  --credential-key GOOGLE_VERTEX_AI_SERVICE_ACCOUNT_TOKEN \
  --strategy google-service-account-jwt \
  --material client_email="sa@my-gcp-project.iam.gserviceaccount.com" \
  --material private_key="$(jq -r .private_key /path/to/key.json)" \
  --secret-material-key private_key
```

### gcloud Application Default Credentials

For local development, configure ADC first, then pass `--from-gcloud-adc`:

```shell
gcloud auth application-default login
```

```shell
openshell provider create \
  --name vertex-local \
  --type google-vertex-ai \
  --from-gcloud-adc \
  --config VERTEX_AI_PROJECT_ID=my-gcp-project \
  --config VERTEX_AI_REGION=us-central1
```

`--from-gcloud-adc` looks for the ADC file in this order: the path in `GOOGLE_APPLICATION_CREDENTIALS`; then `$CLOUDSDK_CONFIG/application_default_credentials.json`, if `CLOUDSDK_CONFIG` is set; then `~/.config/gcloud/application_default_credentials.json`. It configures an OAuth2 refresh token flow on the gateway and mints the first access token before the command returns, so a provider that is created successfully is ready for inference right away. The flag only works with user credentials generated by `gcloud auth application-default login`. If your ADC file is a service account key, the CLI returns an error and directs you to use the service account key method above.
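To see which file the flag is likely to pick up on your machine, the lookup order can be sketched as a small shell function. This is an illustration, not OpenShell's implementation: `adc_candidate_path` is a hypothetical helper, it assumes a step is skipped only when its variable is unset, and it does not check that the file exists.

```shell
# Hypothetical helper: prints the ADC file path that the documented
# lookup order would select. It does not check that the file exists.
adc_candidate_path() {
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    printf '%s\n' "$GOOGLE_APPLICATION_CREDENTIALS"
  elif [ -n "${CLOUDSDK_CONFIG:-}" ]; then
    printf '%s\n' "$CLOUDSDK_CONFIG/application_default_credentials.json"
  else
    printf '%s\n' "$HOME/.config/gcloud/application_default_credentials.json"
  fi
}

# Print the candidate path for the current shell environment.
adc_candidate_path
```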

ADC-backed providers mint and rotate access tokens into `GOOGLE_VERTEX_AI_TOKEN`.

`--from-gcloud-adc` is only valid for `google-vertex-ai` providers.
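Because `--from-gcloud-adc` accepts only user credentials, it can help to check the credential file's `type` field before creating the provider. Google credential files mark user credentials as `authorized_user` and service account keys as `service_account`. The sketch below uses a hypothetical helper, `adc_credential_type`, and a throwaway sample file; it is a convenience check, not part of the OpenShell CLI.

```shell
# Hypothetical pre-flight helper: reports the "type" field of a credentials
# JSON file using grep, so it has no dependency on jq.
adc_credential_type() {
  if grep -q '"type"[[:space:]]*:[[:space:]]*"authorized_user"' "$1"; then
    echo authorized_user
  elif grep -q '"type"[[:space:]]*:[[:space:]]*"service_account"' "$1"; then
    echo service_account
  else
    echo unknown
  fi
}

# Example with a throwaway file; point the helper at your real ADC file instead.
sample=$(mktemp)
printf '{\n  "type": "authorized_user",\n  "client_id": "example"\n}\n' > "$sample"
adc_credential_type "$sample"   # prints: authorized_user
rm -f "$sample"
```

A result of `service_account` means the file belongs with the service account key method instead.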

## Configuration Keys

Pass these as `--config KEY=VALUE` when creating the provider, or set them as environment variables and use `--from-existing`.

| Key                         | Required                                                                | Default                  | Description                                                                                                                                              |
| --------------------------- | ----------------------------------------------------------------------- | ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `VERTEX_AI_PROJECT_ID`      | Yes (unless `GOOGLE_VERTEX_AI_BASE_URL` or `VERTEX_AI_BASE_URL` is set) | —                        | GCP project ID.                                                                                                                                          |
| `VERTEX_AI_REGION`          | No                                                                      | `us-central1`            | Vertex location selector. Use a regional location such as `us-central1`, or `global`, `us`, or `eu` for the supported global and multi-region endpoints. |
| `GOOGLE_VERTEX_AI_BASE_URL` | No                                                                      | —                        | Full base URL override for non-Anthropic routes. Must be an official Vertex AI HTTPS endpoint root.                                                      |
| `VERTEX_AI_BASE_URL`        | No                                                                      | —                        | Backward-compatible alias for `GOOGLE_VERTEX_AI_BASE_URL`.                                                                                               |
| `VERTEX_AI_PUBLISHER`       | No                                                                      | Inferred from model name | Set to `anthropic` to force Anthropic Messages API routing, or any other value for OpenAI-compatible routing.                                            |

When `VERTEX_AI_PROJECT_ID` is set and no base URL override is present, the gateway maps `VERTEX_AI_REGION` to the Vertex host automatically:

* Regional locations such as `us-central1` use `https://<region>-aiplatform.googleapis.com`.
* `global` uses `https://aiplatform.googleapis.com`.
* `us` and `eu` use `https://aiplatform.<region>.rep.googleapis.com`.
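The mapping above can be sketched as a shell function. `vertex_host` is a hypothetical helper that mirrors the three documented cases; it is not the gateway's code and does not validate that a regional name actually exists.

```shell
# Hypothetical helper mirroring the documented region-to-host mapping.
vertex_host() {
  region=${1:-us-central1}   # documented default for VERTEX_AI_REGION
  case "$region" in
    global) echo "https://aiplatform.googleapis.com" ;;
    us|eu)  echo "https://aiplatform.${region}.rep.googleapis.com" ;;
    *)      echo "https://${region}-aiplatform.googleapis.com" ;;
  esac
}

vertex_host us-central1   # https://us-central1-aiplatform.googleapis.com
vertex_host global        # https://aiplatform.googleapis.com
vertex_host eu            # https://aiplatform.eu.rep.googleapis.com
```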

For Anthropic models, OpenShell builds the publisher-model Vertex path automatically and injects `anthropic_version` into the request body. Vertex rawPredict does not receive `anthropic-version` as a header, and OpenShell strips `anthropic-beta` for Vertex Claude routes. For non-Anthropic models, OpenShell uses Vertex's OpenAI-compatible Chat Completions route under `.../endpoints/openapi/chat/completions`.
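As a rough picture of the two upstream shapes, the sketch below prints the URL each kind of model would resolve to. `vertex_upstream_url` is a hypothetical helper, and the `/v1/projects/.../locations/...` layout is an assumption taken from Google's public Vertex AI documentation; this page does not state the exact API version prefix OpenShell uses.

```shell
# Hypothetical helper: prints the upstream URL for a model, given a host,
# project, and location. The /v1 prefix is an assumption from Google's docs.
vertex_upstream_url() {
  host=$1 project=$2 location=$3 model=$4
  base="${host}/v1/projects/${project}/locations/${location}"
  case "$model" in
    # Anthropic models use the publisher-model path with rawPredict.
    claude-*) echo "${base}/publishers/anthropic/models/${model}:rawPredict" ;;
    # Everything else uses the OpenAI-compatible Chat Completions route.
    *)        echo "${base}/endpoints/openapi/chat/completions" ;;
  esac
}

vertex_upstream_url "https://us-central1-aiplatform.googleapis.com" \
  my-gcp-project us-central1 claude-sonnet-4-6
```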

Use `GOOGLE_VERTEX_AI_BASE_URL` or `VERTEX_AI_BASE_URL` only for non-Anthropic Vertex routes. OpenShell rejects Anthropic models when a base URL override is set because Anthropic routes require model-path shaping and `anthropic_version` body injection. Overrides must use `https://` and an official Vertex AI hostname such as `aiplatform.googleapis.com`, `aiplatform.us.rep.googleapis.com`, `aiplatform.eu.rep.googleapis.com`, or `<region>-aiplatform.googleapis.com`.
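Before setting an override, you can check a candidate URL against the rules above. The sketch below is a local approximation, not OpenShell's validator: `is_vertex_override_ok` is a hypothetical helper, it accepts only the hostnames named in this section (the real allowlist may be broader), and it assumes `<region>` is a single label with no dots.

```shell
# Hypothetical helper: returns 0 when the URL uses https:// and one of the
# hostnames named in this section, non-zero otherwise.
is_vertex_override_ok() {
  url=$1
  case "$url" in
    https://*) ;;
    *) return 1 ;;
  esac
  host=${url#https://}   # strip the scheme
  host=${host%%/*}       # drop any path
  host=${host%%:*}       # drop any port
  case "$host" in
    aiplatform.googleapis.com|aiplatform.us.rep.googleapis.com|aiplatform.eu.rep.googleapis.com)
      return 0 ;;
    *.*-aiplatform.googleapis.com)
      return 1 ;;        # region part must not contain dots
    ?*-aiplatform.googleapis.com)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

for candidate in \
  "https://europe-west4-aiplatform.googleapis.com" \
  "http://aiplatform.googleapis.com" \
  "https://example.com"; do
  if is_vertex_override_ok "$candidate"; then
    echo "ok:       $candidate"
  else
    echo "rejected: $candidate"
  fi
done
```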

## Supported Models

Vertex AI hosts Anthropic Claude models (`claude-3-5-sonnet`, `claude-3-opus`, and others) through a native Messages API integration, and Gemini and other third-party models through Vertex's OpenAI-compatible Chat Completions endpoint. OpenShell infers the routing path from the model name. For the full list of available models and regions, refer to the [Google Cloud Model Garden documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/overview).

Model names that match the `claude-*` pattern route through the Anthropic Messages API on Vertex. All other model names route through Vertex Chat Completions. Set `VERTEX_AI_PUBLISHER=anthropic` to force Anthropic routing when the model name does not follow the standard pattern.
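The routing rule can be sketched as a shell function. `vertex_route` is a hypothetical helper, and the route labels it prints are illustrative names, not OpenShell identifiers. It treats a set `VERTEX_AI_PUBLISHER` as taking priority over the model name, as the configuration keys table describes.

```shell
# Hypothetical helper: prints which Vertex route a model would take.
# $1 is the model name; $2 is the VERTEX_AI_PUBLISHER value (empty if unset).
vertex_route() {
  model=$1
  publisher=${2:-}
  if [ -n "$publisher" ]; then
    # An explicit publisher overrides inference from the model name.
    if [ "$publisher" = anthropic ]; then
      echo anthropic-messages
    else
      echo chat-completions
    fi
    return 0
  fi
  case "$model" in
    claude-*) echo anthropic-messages ;;
    *)        echo chat-completions ;;
  esac
}

vertex_route claude-sonnet-4-6            # anthropic-messages
vertex_route gemini-2.0-flash             # chat-completions
vertex_route my-custom-alias anthropic    # anthropic-messages
```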

OpenShell exposes Anthropic Vertex routes for inference only. It does not advertise OpenAI-style model discovery for those routes, so use the Google Cloud docs or Model Garden to discover supported Anthropic model IDs.

## Configure Inference Routing

Before configuring inference routing, enable provider endpoint injection so the Vertex AI network endpoints are automatically included in sandbox policies:

```shell
openshell settings set --global --key providers_v2_enabled --value true --yes
```

Then point `inference.local` at the provider:

```shell
openshell inference set \
  --provider vertex-prod \
  --model claude-sonnet-4-6
```

Use `--no-verify` if endpoint verification fails. This is common with the `global` region, where the validation probe may not match the actual rawPredict path:

```shell
openshell inference set \
  --provider vertex-prod \
  --model claude-sonnet-4-6 \
  --no-verify
```

Sandboxes on that gateway reach the model at `https://inference.local`. For full details on inference routing, refer to [Inference Routing](/sandboxes/inference-routing).

## Use from a Sandbox

Agents inside sandboxes should reach Vertex AI through `inference.local`, not by connecting to Vertex AI directly. The gateway manages GCP credential refresh and request translation; the agent only needs to point its SDK at the local endpoint.

The complete setup from scratch:

```shell
# 1. Enable provider endpoint injection
openshell settings set --global --key providers_v2_enabled --value true --yes

# 2. Create the provider
openshell provider create \
  --name vertex-local \
  --type google-vertex-ai \
  --from-gcloud-adc \
  --config VERTEX_AI_PROJECT_ID=my-gcp-project \
  --config VERTEX_AI_REGION=us-central1

# 3. Configure inference routing
openshell inference set --provider vertex-local --model claude-sonnet-4-6 --no-verify

# 4. Create a sandbox with the provider attached
openshell sandbox create --name my-sandbox --provider vertex-local
```

Then, inside the sandbox, launch the agent. For Claude Code:

```shell
ANTHROPIC_BASE_URL="https://inference.local" ANTHROPIC_API_KEY=unused claude --bare
```

`--bare` skips the OAuth login flow and uses `ANTHROPIC_API_KEY` directly for authentication. The key value does not reach Vertex AI — `inference.local` strips it and injects the real GCP access token before forwarding.

Do not set `CLAUDE_CODE_USE_VERTEX=1` inside the sandbox. That flag makes Claude Code connect directly to Vertex AI and attempt GCP credential discovery (ADC file, metadata service), which fails because the sandbox does not expose GCP credentials. Use `inference.local` instead.

For OpenCode, append `/v1` to the base URL:

```shell
ANTHROPIC_BASE_URL="https://inference.local/v1" ANTHROPIC_API_KEY=unused opencode
```

OpenCode requires `/v1` in the base URL. Without it, OpenCode sends `POST /messages` instead of `POST /v1/messages`, which does not match the inference pattern and is denied.
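To make the path difference concrete, the sketch below prints the request path that results when a client appends `/messages` to its configured base URL, which is the behavior described for OpenCode. `request_path` is a hypothetical helper for illustration only.

```shell
# Hypothetical helper: prints the request path produced when a client
# appends "/messages" to the configured base URL.
request_path() {
  base=${1%/}             # drop one trailing slash, if any
  rest=${base#https://}   # strip the scheme
  case "$rest" in
    */*) echo "/${rest#*/}/messages" ;;   # base URL already carries a path
    *)   echo "/messages" ;;              # base URL is host only
  esac
}

request_path "https://inference.local"      # /messages
request_path "https://inference.local/v1"   # /v1/messages
```

Only the second form matches the `POST /v1/messages` inference pattern.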

### Policy Proposals

After running an agent, the TUI (`openshell term`) may show policy proposals for denied endpoints. Common ones for Vertex AI sandboxes:

| Endpoint                      | Action             | Reason                                                                                                                                                                           |
| ----------------------------- | ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `metadata.google.internal:80` | **Reject**         | Resolves to `169.254.169.254` (GCE metadata service). Always blocked regardless of policy — the proxy blocks the resolved IP unconditionally to prevent credential exfiltration. |
| `downloads.claude.ai:443`     | Approve if desired | Claude Code update checking and asset loading. Not required for inference.                                                                                                       |
| `storage.googleapis.com:443`  | Approve if desired | Google Cloud Storage. Used by some Claude Code features. Not required for inference.                                                                                             |

## From Existing Environment

If one of these token env vars is already set in your shell, create the provider with `--from-existing`:

* `GOOGLE_VERTEX_AI_TOKEN` or `VERTEX_AI_TOKEN`
* `GOOGLE_VERTEX_AI_SERVICE_ACCOUNT_TOKEN` or `VERTEX_AI_SERVICE_ACCOUNT_TOKEN`

OpenShell also reads these config env vars during `--from-existing`:

* `VERTEX_AI_PROJECT_ID`
* `VERTEX_AI_REGION`
* `GOOGLE_VERTEX_AI_BASE_URL` or `VERTEX_AI_BASE_URL`
* `VERTEX_AI_PUBLISHER`

Then create the provider:

```shell
openshell provider create \
  --name vertex-env \
  --type google-vertex-ai \
  --from-existing
```

This reads the token from the credential environment variables and the config from the config environment variables listed above.
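Before running `--from-existing`, you can confirm that a token variable is actually set in the current shell. The sketch below uses a hypothetical helper, `first_vertex_token_var`. The order in which it checks the variables is arbitrary, since this page does not state which one takes precedence, and it prints only the variable name, never the token value.

```shell
# Hypothetical pre-flight helper: prints the name of the first documented
# token variable that is set and non-empty, or returns non-zero when none is.
first_vertex_token_var() {
  for name in GOOGLE_VERTEX_AI_TOKEN VERTEX_AI_TOKEN \
              GOOGLE_VERTEX_AI_SERVICE_ACCOUNT_TOKEN VERTEX_AI_SERVICE_ACCOUNT_TOKEN; do
    eval "value=\${$name:-}"   # indirect lookup that works in POSIX sh
    if [ -n "$value" ]; then
      echo "$name"
      return 0
    fi
  done
  return 1
}

if found=$(first_vertex_token_var); then
  echo "token found in $found"
else
  echo "no Vertex token variable is set"
fi
```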

## Next Steps

* To configure `inference.local` routing, refer to [Inference Routing](/sandboxes/inference-routing).
* To manage provider credentials and refresh, refer to [Providers](/sandboxes/manage-providers).
* To apply network policies to sandboxes using this provider, refer to [Policies](/sandboxes/policies).