Once you have NVIDIA Infra Controller (NICo) up and running, you can begin ingesting machines.
Ensure you have the following prerequisites met before ingesting machines:
The nico-admin-cli command is available. You can compile it from source, use the pre-compiled binary, or use a containerized version. You can also download it from the cluster; see the next section for details.
You can access the NICo site using the nico-admin-cli.
The NICo API service is running at IP address NICo_API_EXTERNAL. It is recommended that you add this IP address to your trusted list.
DHCP requests from all managed host IPMI networks have been forwarded to the NICo service running at IP address NICo_DHCP_EXTERNAL.
You have the following information for all hosts that need to be ingested:
These can be generated from site vault. Follow these steps to generate them.
NICO_LB_IP
additional_issuer_cns (one-time per cluster)
Expected: additional_issuer_cns = ["site-root"]
If it’s empty, edit the configmap and set it, then restart:
Replace <FQDN for nico-api-endpoint> with the FQDN of the NICo API endpoint, which is usually api-<ENVIRONMENT_NAME>.<SITE_DOMAIN_NAME>.
You can run admin CLI commands as follows:
Alternatively, to shorten the command line, create a file named carbide_api_cli.json in the $HOME/.config folder with the following content:
/etc/hosts entry
If you have trouble resolving api-<ENVIRONMENT_NAME>.<SITE_DOMAIN_NAME>, map it to the LoadBalancer IP:
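As a minimal sketch, the entry pairs the LoadBalancer IP with the API hostname. The helper names and the sample values below are hypothetical and are not part of NICo; substitute your own environment name, site domain, and LoadBalancer IP.

```python
import ipaddress


def api_fqdn(environment_name: str, site_domain_name: str) -> str:
    """Build the usual API hostname: api-<ENVIRONMENT_NAME>.<SITE_DOMAIN_NAME>."""
    return f"api-{environment_name}.{site_domain_name}"


def hosts_entry(lb_ip: str, environment_name: str, site_domain_name: str) -> str:
    """Return one /etc/hosts line mapping the LoadBalancer IP to the API hostname."""
    ipaddress.ip_address(lb_ip)  # raises ValueError if lb_ip is not a valid IP
    return f"{lb_ip} {api_fqdn(environment_name, site_domain_name)}"


# Placeholder values only (192.0.2.0/24 is a documentation address range).
print(hosts_entry("192.0.2.10", "dev", "example.com"))
```

The resulting line can be appended to /etc/hosts on the machine where you run nico-admin-cli.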
NICo requires knowledge of the current and desired BMC and UEFI credentials for hosts and DPUs. When ingesting a host, NICo resets the current credentials on the BMC and UEFI to the desired credentials. You can use these credentials when accessing the host or DPU BMC yourself, and NICo will use them for its automated processes.
The required credentials include the following:
Note: The following commands use the <api-url> placeholder, which is typically the following:
Run this command to update the desired Host and DPU BMC password:
Run this command to generate the desired host UEFI password:
Run this command to update the host UEFI password:
Run this command to update the DPU UEFI password:
NICo needs to know the factory default credentials for each BMC, which are expressed as a JSON table of “Expected Machines”. The serial number is used to verify that the BMC MAC matches the actual serial number of the chassis.
Prepare an expected_machines.json file as follows:
Only servers listed in this table will be ingested, so you must include all servers in this file.
Each entry supports additional optional fields:
host_lifecycle_profile (object): Per-host profile for settings that affect state-machine progression. Future per-host knobs should be added here.
  disable_lockdown (bool, default false): When true, the state machine does not lock down the host during lifecycle management. This is useful for automation workflows that need lockdown persistently disabled.
dpf_enabled (bool): Enable/disable DPF for this host.
dpu_mode ("dpu_mode" | "nic_mode" | "no_dpu"): Per-host DPU operating mode.
bmc_retain_credentials (bool): Skip BMC password rotation.
default_pause_ingestion_and_poweron (bool): Pause ingestion and power-on for this host.
bmc_ip_address (string): Static BMC IP (pre-allocates a machine interface).
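Before uploading, it can help to sanity-check the optional fields of each entry. The following is a minimal sketch, assuming each entry has already been loaded into a Python dict; the function name is hypothetical, the type rules are taken only from the field descriptions above, and the required fields and overall file layout are deliberately not checked here.

```python
import ipaddress

# Optional fields and their JSON types, as described in the list above.
OPTIONAL_FIELD_TYPES = {
    "host_lifecycle_profile": dict,
    "dpf_enabled": bool,
    "dpu_mode": str,
    "bmc_retain_credentials": bool,
    "default_pause_ingestion_and_poweron": bool,
    "bmc_ip_address": str,
}
DPU_MODES = ("dpu_mode", "nic_mode", "no_dpu")


def check_optional_fields(entry: dict) -> list:
    """Return a list of problems found in the optional fields of one entry."""
    problems = []
    for name, expected_type in OPTIONAL_FIELD_TYPES.items():
        if name in entry and not isinstance(entry[name], expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")

    mode = entry.get("dpu_mode")
    if isinstance(mode, str) and mode not in DPU_MODES:
        problems.append(f"dpu_mode: unknown value {mode!r}")

    profile = entry.get("host_lifecycle_profile")
    if isinstance(profile, dict):
        # disable_lockdown defaults to false when it is omitted.
        if not isinstance(profile.get("disable_lockdown", False), bool):
            problems.append("host_lifecycle_profile.disable_lockdown: expected bool")

    ip = entry.get("bmc_ip_address")
    if isinstance(ip, str):
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            problems.append(f"bmc_ip_address: not a valid IP address: {ip!r}")
    return problems
```

An empty list means no problems were found in the optional fields; it does not imply that NICo will accept the entry.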
When the file is ready, upload it to the site with the following command:
NICo uses Measured Boot with the on-host Trusted Platform Module (TPM) v2.0 to enforce the cryptographic identity of the host hardware and firmware. The following command configures NICo to approve all pending machines based on PCRs 0, 3, 5, and 6.
Once machines are approved, NICo’s Site Explorer begins automatically ingesting them. No further operator action is required under normal circumstances.
The high-level flow is:
ManagedHost object is created and the state machine starts.
DpuDiscoveringState / DPUInit: NICo configures Secure Boot on the DPU, installs the DPU OS (BFB image), and power-cycles the host to apply the new DPU configuration.
HostInit: NICo configures BIOS, sets the host boot order, optionally collects TPM attestation measurements, waits for hardware discovery via the scout agent, and applies UEFI lockdown. When the scout agent reports back, NICo replaces the temporary predicted host ID (prefix fm100p) with a stable host ID (prefix fm100h) derived from the host’s own DMI serial data or TPM certificate.
BomValidating / Validation: NICo validates the discovered hardware against the expected SKU. If hardware validation is enabled, the host is rebooted and tested before proceeding.
Ready: the host transitions through HostInit/Discovered and enters the available pool, ready for an instance to be assigned to it.

For the full DPU lifecycle (OS installation, firmware upgrades, health monitoring, and reprovisioning), see DPU Lifecycle Management. For the complete state transitions, including substates, retry logic, and reprovision paths, see the Managed Host State Diagrams.
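The high-level flow can be modeled as an ordered list of stages, which is handy when scripting a rough progress indicator. This is a simplified sketch of the flow described above and not NICo's actual state machine: the stage labels, the grouping of state names per step, and the function name are assumptions made for illustration, and substates, retries, and reprovision paths are ignored.

```python
# Stages in the order given by the high-level flow.
INGESTION_STAGES = ["DPUInit", "HostInit", "Validation", "Ready"]

# State names that the flow description groups under the same step.
SAME_STEP = {
    "DpuDiscoveringState": "DPUInit",
    "BomValidating": "Validation",
}


def stages_remaining(state: str) -> list:
    """Return the stages that still follow the given top-level state."""
    stage = SAME_STEP.get(state, state)
    index = INGESTION_STAGES.index(stage)  # raises ValueError for unknown states
    return INGESTION_STAGES[index + 1:]


print(stages_remaining("HostInit"))
```

A host in any state with a non-empty result is still pre-Ready.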
When a machine is not being created or is stuck in a pre-Ready state, nico-api logs are the primary investigation tool. Filtering logs by the host BMC IP or DPU BMC IP is often the fastest way to understand where ingestion or pairing is failing.
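If the logs have been saved to a file or piped into a script, filtering by BMC IP can be sketched as below. The function name is hypothetical and the sample lines are made up for illustration; they are not actual nico-api log output. The pattern matches IPv4 addresses only and avoids partial matches, so searching for 10.0.0.1 does not also return lines for 10.0.0.10.

```python
import re


def lines_mentioning_ip(lines, ip):
    """Return the log lines that mention exactly the given IPv4 address."""
    # Reject matches that are part of a longer address, such as 10.0.0.10 or 110.0.0.1.
    pattern = re.compile(rf"(?<![\d.]){re.escape(ip)}(?!\d)")
    return [line for line in lines if pattern.search(line)]


# Made-up sample lines, for illustration only.
sample = [
    "explore 10.0.0.1 failed",
    "explore 10.0.0.10 ok",
    "bmc=110.0.0.1 unreachable",
]
print(lines_mentioning_ip(sample, "10.0.0.1"))
```

Run it once with the host BMC IP and once with the DPU BMC IP to see both sides of a pairing problem.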
You can check the current detailed state of any managed host using:
For a full guide on diagnosing stuck objects, including how to use the NICo Grafana dashboard and how to read state handler error logs, see Stuck Objects Runbook.
Before pairing can occur, Site Explorer must successfully explore each BMC endpoint. Exploration failures are logged and surfaced in nico-api logs and the NICo Grafana dashboard. Common error types:
For a complete reference of all Redfish endpoints and required response fields, see Redfish Endpoints Reference.
Under the following conditions, Site Explorer cannot complete pairing and logs a host_dpu_pairing_blockers_count metric. Each requires operator investigation.
For DPU pairing failures, including dpu_pf0_mac_missing and cases where the DPU is in an unknown or corrupt state, a common fix is to install a vanilla pre-ingestion BFB image via rshim to return the DPU to a clean state. This runs as part of the preingestion state machine:
This command copies the NICo BFB image directly to the DPU via rshim (SSH to the DPU BMC) and triggers a DPU reboot to complete the installation. After the BFB is installed, NICo power-cycles the host automatically to apply the new DPU image.
Note: The --host-bmc-ip flag is required. NICo uses it to power-cycle the host after the BFB copy completes. Use --pre-copy-powercycle if the host needs to release rshim control to the DPU BMC before the copy can start.
For additional DPU-specific troubleshooting including Secure Boot configuration, BMC password resets, and firmware version checks, see Adding New Machines to an Existing Site.
The expected machines table in the nico-api database holds the following fields per host:
Use nico-admin-cli to operate on individual entries:
Replace all entries from a JSON file:
Erase all entries:
Export the current table as JSON: