Google Vertex AI
Google Vertex AI is a managed machine learning platform that hosts Anthropic Claude, Gemini, and third-party models through Google Cloud. OpenShell can route inference.local traffic to Vertex AI using gateway-managed credential refresh, so sandbox agents do not handle GCP credentials directly.
Prerequisites
Before creating a Vertex AI provider, ensure you have:
- A GCP project with the Vertex AI API enabled.
- One of the following:
- A GCP service account with the Vertex AI User role and a downloaded JSON key file, for production use.
- The
gcloudCLI with Application Default Credentials configured, for local development.
Authentication
The google-vertex-ai provider supports two credential sources.
Service Account Key
Supply the JSON key file content as the GOOGLE_SERVICE_ACCOUNT_KEY credential. OpenShell persists that value only as gateway-side refresh bootstrap material until you update or delete it. The raw service-account JSON and private key are not sandbox runtime credentials and are not exposed to sandboxes. Runtime inference requests use short-lived access tokens minted by the gateway and stored under a separate credential key.
Then configure gateway-managed refresh so the gateway uses the private key as refresh bootstrap material and rotates access tokens:
gcloud Application Default Credentials
For local development, configure ADC first, then pass --from-gcloud-adc:
--from-gcloud-adc reads GOOGLE_APPLICATION_CREDENTIALS first, then falls back to $CLOUDSDK_CONFIG/application_default_credentials.json when that environment variable is set, then to ~/.config/gcloud/application_default_credentials.json. It configures an OAuth2 refresh token flow on the gateway and immediately mints the first access token before the command returns. If the command succeeds, the provider is ready for inference right away. It only works with user credentials generated by gcloud auth application-default login. If your ADC file is a service account key, the CLI returns an error and directs you to use the service account key method above.
ADC-backed providers mint and rotate access tokens into GOOGLE_VERTEX_AI_TOKEN.
--from-gcloud-adc is only valid for google-vertex-ai providers.
Configuration Keys
Pass these as --config KEY=VALUE when creating the provider, or set them as environment variables and use --from-existing.
When VERTEX_AI_PROJECT_ID is set and no base URL override is present, the gateway maps VERTEX_AI_REGION to the Vertex host automatically:
- Regional locations such as
us-central1usehttps://<region>-aiplatform.googleapis.com. globaluseshttps://aiplatform.googleapis.com.usandeuusehttps://aiplatform.<region>.rep.googleapis.com.
For Anthropic models, OpenShell builds the publisher-model Vertex path automatically and injects anthropic_version into the request body. Vertex rawPredict does not receive anthropic-version as a header, and OpenShell strips anthropic-beta for Vertex Claude routes. For non-Anthropic models, OpenShell uses Vertex’s OpenAI-compatible Chat Completions route under .../endpoints/openapi/chat/completions.
Use GOOGLE_VERTEX_AI_BASE_URL or VERTEX_AI_BASE_URL only for non-Anthropic Vertex routes. OpenShell rejects Anthropic models when a base URL override is set because Anthropic routes require model-path shaping and anthropic_version body injection. Overrides must use https:// and an official Vertex AI hostname such as aiplatform.googleapis.com, aiplatform.us.rep.googleapis.com, aiplatform.eu.rep.googleapis.com, or <region>-aiplatform.googleapis.com.
Supported Models
Vertex AI hosts Anthropic Claude models (claude-3-5-sonnet, claude-3-opus, and others) through a native Messages API integration, and Gemini and other third-party models through Vertex’s OpenAI-compatible Chat Completions endpoint. OpenShell infers the routing path from the model name. For the full list of available models and regions, refer to the Google Cloud model garden documentation.
Model names that match the claude-* pattern route through the Anthropic Messages API on Vertex. All other model names route through Vertex Chat Completions. Set VERTEX_AI_PUBLISHER=anthropic to force Anthropic routing when the model name does not follow the standard pattern.
OpenShell exposes Anthropic Vertex routes for inference only. It does not advertise OpenAI-style model discovery for those routes, so use the Google Cloud docs or Model Garden to discover supported Anthropic model IDs.
Configure Inference Routing
Before configuring inference routing, enable provider endpoint injection so the Vertex AI network endpoints are automatically included in sandbox policies:
Then point inference.local at the provider:
Use --no-verify if the endpoint verification fails. This is common with the global region, where the validation probe may not match the actual rawPredict path:
Sandboxes on that gateway reach the model at https://inference.local. For full details on inference routing, refer to Inference Routing.
Use from a Sandbox
Agents inside sandboxes should reach Vertex AI through inference.local, not by connecting to Vertex AI directly. The gateway manages GCP credential refresh and request translation; the agent only needs to point its SDK at the local endpoint.
The complete setup from scratch:
Then inside the sandbox, launch the agent as shown below.
Claude Code
OpenCode
--bare skips the OAuth login flow and uses ANTHROPIC_API_KEY directly for authentication. The key value does not reach Vertex AI — inference.local strips it and injects the real GCP access token before forwarding.
Do not set CLAUDE_CODE_USE_VERTEX=1 inside the sandbox. That flag makes Claude Code connect directly to Vertex AI and attempt GCP credential discovery (ADC file, metadata service), which fails because the sandbox does not expose GCP credentials. Use inference.local instead.
Policy Proposals
After running an agent, the TUI (openshell term) may show policy proposals for denied endpoints. Common ones for Vertex AI sandboxes:
From Existing Environment
If one of these token env vars is already set in your shell, create the provider with --from-existing:
GOOGLE_VERTEX_AI_TOKENorVERTEX_AI_TOKENGOOGLE_VERTEX_AI_SERVICE_ACCOUNT_TOKENorVERTEX_AI_SERVICE_ACCOUNT_TOKEN
OpenShell also reads these config env vars during --from-existing:
VERTEX_AI_PROJECT_IDVERTEX_AI_REGIONGOOGLE_VERTEX_AI_BASE_URLorVERTEX_AI_BASE_URLVERTEX_AI_PUBLISHER
Then create the provider:
This reads credentials and config from the environment variables listed in the configuration keys table above.
Next Steps
- To configure
inference.localrouting, refer to Inference Routing. - To manage provider credentials and refresh, refer to Providers.
- To apply network policies to sandboxes using this provider, refer to Policies.