Day 0 Machine Identity Configuration

This guide is the Day 0 reference for enabling machine identity (SPIFFE JWT-SVID issuance) at the site level. It covers the secrets, site config, and DPU agent settings that must be in place before any tenant can configure per-org identity (a Day 1 activity).

Machine identity lets tenant workloads on provisioned instances obtain short-lived JWT tokens that assert a SPIFFE ID. NICo signs those tokens when a DPU calls the Core gRPC API or when a workload reads the instance metadata service (IMDS). Per-org issuer, audiences, and TTL are configured later — see Machine Identity (Day 1).

The design and API details live in the SPIFFE JWT-SVID SDD. This page focuses on what an operator configures once during initial site bring-up.


Prerequisites

Before starting:

  • A running NICo deployment with healthy nico-api (Core) and nico-rest-api (REST).
  • Site config (siteConfig in helm-prereqs/values/ncx-core.yaml or equivalent) is already wired for your site — see Quick Start Guide, Step 3.

This page assumes you have completed Day 0 IP and Network Configuration or equivalent network bring-up.


What Day 0 Enables

Day 0 configuration turns on the site-wide machinery:

| Layer | What you configure | Effect |
| --- | --- | --- |
| Site secrets | machine_identity.encryption_keys | AES keys used to encrypt per-org signing private keys at rest |
| Site config | [machine_identity] in site_config.toml | Global enable switch, algorithm, current encryption key id, optional TTL bounds and egress controls |
| DPU agent | Optional [machine-identity] | Rate limits and timeouts for IMDS GET …/meta-data/identity (not stored in site config) |

Per-org settings (issuer, audiences, TTL, token delegation) are not Day 0 — they are created with PUT …/tenant-identity/config after Day 0 is complete.


1. Generate Master Encryption Keys

Per-org JWT signing private keys are encrypted at rest with a site master encryption key (KEK). Generate one or more 256-bit keys and store them in site credentials.

$ openssl rand -base64 32

Record each key under a stable id (for example kv1, kv2). The id must match current_encryption_key_id in site config.
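Before storing a key, it can help to confirm that it really decodes to 32 bytes. The following is a minimal sketch using only the Python standard library; the function names are hypothetical helpers and are not part of NICo.

```python
import base64
import binascii
import secrets

KEY_BYTES = 32  # 256-bit key, the same size `openssl rand -base64 32` produces


def generate_key_b64() -> str:
    """Return a fresh random 256-bit key, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(KEY_BYTES)).decode("ascii")


def is_valid_key_b64(value: str) -> bool:
    """True if value is strict base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(raw) == KEY_BYTES
```

A key that fails this check (wrong length, stray whitespace, truncated copy-paste) is worth regenerating before it reaches site credentials.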

File-backed credentials

When using a local credential snapshot (development or file-based deployments), add a machine_identity block:

{
  "machine_identity": {
    "encryption_keys": {
      "kv1": "<base64-encoded-32-byte-key>"
    }
  }
}

Vault-backed credentials

In Vault deployments, store each key at a path that resolves to machine_identity/encryption_keys/<key-id> in the credential loader (for example …/machine_identity/encryption_keys/kv1). Follow the same secret-management process you use for other NICo site credentials.

Important: Keep all key ids that appear in stored ciphertext until you complete a KEK re-wrap. Decrypt always uses the key_id embedded in each encrypted blob, not the site’s current key id alone. See Master Encryption Key Rotation (KEK).
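The retention rule can be illustrated with a small check: every key id referenced by stored ciphertext must still be present in the site keyring. This is only a sketch; the blob shape (a dict with a "key_id" field) and the function name are assumptions for illustration, since the real on-disk format is not described on this page.

```python
def missing_key_ids(blobs, keyring):
    """Return sorted key ids referenced by stored blobs but absent from the keyring.

    blobs:   iterable of dicts carrying a "key_id" field (hypothetical shape)
    keyring: mapping of key id -> base64 key, as in machine_identity.encryption_keys
    """
    referenced = {blob["key_id"] for blob in blobs}
    return sorted(referenced - set(keyring))


# Example: kv1 was dropped from secrets while a blob still depends on it.
stored = [{"key_id": "kv1"}, {"key_id": "kv2"}, {"key_id": "kv2"}]
print(missing_key_ids(stored, {"kv2": "<base64-key>"}))
```

A non-empty result means some ciphertext could no longer be decrypted, regardless of which id current_encryption_key_id points at.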


2. Configure Site [machine_identity]

Add or update the [machine_identity] section in the NICo API site config. In Helm deployments this is typically helm/charts/nico-api/files/carbide-api-config.toml (rendered into the nico-api ConfigMap) or the siteConfig overlay you maintain for your environment.

[machine_identity]
enabled = true
current_encryption_key_id = "kv1" # must match a key in site secrets
algorithm = "ES256" # only ES256 is supported today

# Optional bounds enforced on per-org tokenTtlSeconds (defaults are documented in the SDD)
# token_ttl_min_sec = 60
# token_ttl_max_sec = 86400

# Optional: max signing_key_overlap_seconds on per-org JWT signing-key rotation
# signing_key_overlap_max_sec = 604800

# Optional: egress proxy for token delegation (RFC 8693); see Day 1 guide
# token_endpoint_http_proxy = "https://nico-egress-proxy.example.com"

# Optional hostname allowlists (empty = no extra restriction beyond API validation)
# trust_domain_allowlist = ["*.example.com"]
# token_endpoint_domain_allowlist = ["sts.example.com", "*.tenant.example.com"]

Field reference (Day 0)

| Field | Required when enabled = true | Notes |
| --- | --- | --- |
| enabled | | false disables the feature; per-org PUT returns 503 |
| algorithm | Yes | Must be ES256 |
| current_encryption_key_id | Yes | Must exist in machine_identity.encryption_keys secrets |
| token_ttl_min_sec / token_ttl_max_sec | No | Bounds for per-org tokenTtlSeconds |
| signing_key_overlap_max_sec | No | Upper bound for per-org signing-key rotation overlap |
| token_endpoint_http_proxy | No | Recommended when using token delegation to external HTTPS STS endpoints |
| trust_domain_allowlist | No | Restricts per-org JWT issuer trust domains |
| token_endpoint_domain_allowlist | No | Restricts token delegation token_endpoint hosts |

Startup behavior

| Scenario | Behavior |
| --- | --- |
| [machine_identity] section missing | Feature disabled; API starts normally |
| Section present, enabled = false | Feature disabled; per-org APIs return 503 |
| Section present, enabled = true, invalid or incomplete | API fails to start; fix config before rollout |
| Section present, valid, enabled = true | Feature operational |
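The startup scenarios above can be read as a small decision function, which is handy as a pre-rollout lint. The sketch below is an illustration only: it takes the already-parsed [machine_identity] section as a dict, checks just the two required fields from the field reference, and assumes that a section without an enabled key behaves like enabled = false. The function name and return strings are hypothetical.

```python
from typing import Optional

SUPPORTED_ALGORITHMS = {"ES256"}  # the only algorithm the field reference allows


def startup_outcome(section: Optional[dict], secret_key_ids: set) -> str:
    """Predict startup behavior for a parsed [machine_identity] section.

    section:        the section as a dict, or None if it is missing
    secret_key_ids: key ids present in machine_identity.encryption_keys
    """
    if section is None:
        return "disabled"
    # Assumption: an absent `enabled` key is treated like enabled = false.
    if section.get("enabled") is not True:
        return "disabled"
    if section.get("algorithm") not in SUPPORTED_ALGORITHMS:
        return "startup-failure"
    if section.get("current_encryption_key_id") not in secret_key_ids:
        return "startup-failure"
    return "operational"
```

Running a check like this against the rendered config and the deployed secrets before restarting nico-api can surface a mismatched key id without waiting for a crash loop.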

After editing site config or secrets, restart nico-api (site config for [machine_identity] is not hot-reloaded).


3. Configure DPU Agent [machine-identity] (Optional)

IMDS identity requests are served by the DPU agent (and standalone FMDS when used). Limits and an optional HTTP sign-proxy are configured on the agent, not in the API site config.

Defaults apply when the section is omitted:

| Setting | Default | Description |
| --- | --- | --- |
| requests-per-second | 3 | Sustained admission rate for IMDS identity GETs (GCRA refill rate). Limits how many signing requests the agent accepts per second over time. |
| burst | 8 | Maximum burst above the sustained rate before new requests must wait or are rejected. Allows short spikes without immediately hitting the limit. |
| wait-timeout-secs | 2 | How long a request may block waiting for a rate-limit permit. If no capacity becomes available within this window, the agent fails the request (avoids indefinite queueing). |
| sign-timeout-secs | 5 | Wall-clock timeout for the signing step: either Forge gRPC SignMachineIdentity or the optional HTTP sign-proxy call. |
| sign-proxy-url | (unset) | Optional base URL for HTTP pass-through signing. When set, the agent forwards GET {url}/latest/meta-data/identity with the same query string instead of calling Forge gRPC. Scheme must be http or https. |
| sign-proxy-tls-root-ca | (unset) | Optional path to a PEM file of trusted CA roots for an https sign-proxy URL (for example a private CA). Ignored for http: URLs. Requires sign-proxy-url. |

Example override in the agent config file (see crates/agent/example_agent_config.toml):

[machine-identity]
requests-per-second = 3
burst = 8
wait-timeout-secs = 2
sign-timeout-secs = 5
# sign-proxy-url = "https://sign-proxy.example.com/prefix"
# sign-proxy-tls-root-ca = "/etc/forge/sign_proxy_root.pem"
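To build intuition for how requests-per-second, burst, and wait-timeout-secs interact, here is a minimal sketch of a textbook GCRA limiter with an explicit clock. It is not the agent's implementation: the class and method names are hypothetical, and the exact burst semantics (whether burst counts the first request or is added on top of it) are an assumption.

```python
from fractions import Fraction


class GcraLimiter:
    """Textbook GCRA with a bounded wait; exact arithmetic via Fraction."""

    def __init__(self, requests_per_second=3, burst=8, wait_timeout_secs=2):
        self.interval = Fraction(1, requests_per_second)  # spacing at the sustained rate
        self.tolerance = burst * self.interval            # how far ahead a burst may run
        self.wait_timeout = Fraction(wait_timeout_secs)
        self.tat = Fraction(0)                            # theoretical arrival time

    def admit(self, now):
        """Return the seconds a request arriving at `now` must wait, or None if rejected."""
        now = Fraction(now)
        new_tat = max(self.tat, now) + self.interval
        wait = max(new_tat - self.tolerance - now, Fraction(0))
        if wait > self.wait_timeout:
            return None  # no permit within the wait window: fail the request
        self.tat = new_tat
        return wait


limiter = GcraLimiter()
for i in range(1, 16):
    print(i, limiter.admit(0))
```

Under these assumptions and the default settings, a spike of simultaneous requests sees the first 8 admitted immediately, the next few admitted after progressively longer waits, and later ones rejected once the required wait exceeds 2 seconds.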

When sign-proxy-url is set, the agent forwards signing to an HTTP proxy instead of calling Forge gRPC directly. Use this only when your architecture requires an out-of-band signing path.
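The forwarding rule described for sign-proxy-url (append /latest/meta-data/identity and keep the query string, with an http or https scheme) can be sketched as below. The function name is hypothetical, the handling of a trailing slash on the base URL is an assumption, and the query string in the example is a placeholder rather than a documented IMDS parameter.

```python
from urllib.parse import urlsplit

IDENTITY_PATH = "/latest/meta-data/identity"


def sign_proxy_request_url(sign_proxy_url: str, query_string: str = "") -> str:
    """Build the URL a request would be forwarded to for a given sign-proxy base URL."""
    scheme = urlsplit(sign_proxy_url).scheme
    if scheme not in ("http", "https"):
        raise ValueError(f"sign-proxy-url scheme must be http or https, got {scheme!r}")
    # Assumption: a trailing slash on the base URL is dropped before appending the path.
    url = sign_proxy_url.rstrip("/") + IDENTITY_PATH
    return f"{url}?{query_string}" if query_string else url


# Placeholder query string "k=v"; real parameters are defined elsewhere.
print(sign_proxy_request_url("https://sign-proxy.example.com/prefix", "k=v"))
```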

Restart or redeploy DPU agents after changing this section.


4. Apply and Verify Site-Level Enablement

4.1 Confirm API startup

After rollout, verify nico-api pods are running and logs show no machine-identity config errors:

$ kubectl logs -n <nico-namespace> deploy/nico-api --tail=100

If [machine_identity] is enabled but secrets or required fields are wrong, the pod will crash-loop until fixed.

4.2 Confirm global gate (expected before Day 1)

While machine identity is disabled at the site level (for example, enabled = false), per-org REST calls return 503 Service Unavailable. Check the status code returned by the per-org config endpoint:

$ curl -s -o /dev/null -w '%{http_code}\n' \
> -H "Authorization: Bearer $TOKEN" \
> "https://<nico-rest>/v2/org/<org>/nico/site/<site-id>/tenant-identity/config"

Once Day 0 is complete and enabled = true, this endpoint returns 404 (no config yet) or 200 (after Day 1), not 503.
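For scripted checks, the status codes described in this section can be mapped to a short diagnosis. This is a sketch only; the function name and messages are hypothetical, and any code other than 200, 404, or 503 is simply reported as unexpected.

```python
def interpret_gate_status(status_code: int) -> str:
    """Map the per-org config endpoint's HTTP status to a Day 0 / Day 1 diagnosis."""
    if status_code == 503:
        return "site-level machine identity is disabled: Day 0 is not complete"
    if status_code == 404:
        return "site-level gate is open: no per-org config yet (Day 1 pending)"
    if status_code == 200:
        return "site-level gate is open: per-org config exists"
    return f"unexpected status {status_code}: check authentication and the URL"


# Feed in the code printed by the curl command above.
print(interpret_gate_status(404))
```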

After Day 1 config and a READY instance, run the Machine Identity Verification runbook for gRPC and IMDS smoke tests.


Troubleshooting

Day 0 verification uses API startup checks (§4.1) and the REST global gate (§4.2). Signing-path errors (SignMachineIdentity, IMDS) surface only after Day 1 and are covered in Machine Identity Verification.

| Symptom | Likely cause | Action |
| --- | --- | --- |
| nico-api crash on startup | Missing/invalid [machine_identity] or unknown current_encryption_key_id | Fix TOML; ensure secrets contain the referenced key id |
| Per-org GET/PUT returns 503 | Global enabled = false or invalid global config | Set enabled = true with valid required fields; restart API |

Next Steps