Collecting Machine Diagnostic Information using nico-admin-cli
This guide describes how to use the nico-admin-cli debug bundle command to collect diagnostic information for troubleshooting machines managed by NVIDIA Infra Controller (NICo). The command creates a ZIP file containing logs, health data, and machine state information.
The debug bundle command collects data from two sources:
Grafana (Loki) (optional): Fetches logs using Grafana’s Loki datasource
Log collection is skipped if --grafana-url is not provided
NICo API: Fetches machine information
The generated ZIP file contains:
Before running the debug bundle command, ensure you have:
nico-admin-cli
You need nico-admin-cli installed with valid client certificates to connect to the NICo API. Refer to your NICo installation documentation for setup instructions.
Note: This is only required if you want to collect logs. If --grafana-url is not provided, log collection is skipped.
Set the GRAFANA_AUTH_TOKEN environment variable:
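For example, in a POSIX-style shell (the token value shown is a placeholder, not a real credential):

```shell
# Placeholder value: substitute a token issued by your Grafana instance.
export GRAFANA_AUTH_TOKEN="<your-grafana-token>"
```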
This token is used to authenticate with Grafana and fetch logs from the Loki datasource.
If you are running from an environment that requires a SOCKS proxy, set the proxy:
Note: When running from inside the cluster (nico-api pod), the proxy is not required.
Required:
-c <API_URL>: NICo API endpoint, for example https://<your-nico-api-url>/ or https://127.0.0.1:1079
<MACHINE_ID>: The machine ID to collect debug information for
--start-time <TIME>: Start time in format HH:MM:SS or YYYY-MM-DD HH:MM:SS
Optional:
--grafana-url <URL>: Grafana base URL (e.g., https://grafana.example.com). If not provided, log collection is skipped.
--end-time <TIME>: End time in format HH:MM:SS or YYYY-MM-DD HH:MM:SS (default: current time)
--output-path <PATH>: Directory where the ZIP file will be saved (default: /tmp)
--batch-size <SIZE>: Batch size for log collection (default: 5000, max: 5000)
--utc: Interpret start-time and end-time as UTC instead of local timezone
With Grafana configured (collect logs):
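A minimal sketch of a log-collecting run. The API URL, machine ID, and start time are placeholders, and placing -c before the debug bundle subcommand is an assumption about the CLI's argument order that this guide does not confirm. GRAFANA_AUTH_TOKEN is expected to be set in the environment already.

```shell
# Placeholders: replace with your own values.
API_URL="https://<your-nico-api-url>/"
MACHINE_ID="<MACHINE_ID>"

# Build the command as positional parameters so it can be reviewed first.
# Assumption: -c is given before the "debug bundle" subcommand.
set -- nico-admin-cli -c "$API_URL" debug bundle "$MACHINE_ID" \
  --start-time "06:00:00" \
  --grafana-url "https://grafana.example.com"

# Show the command, then run it only if the CLI is installed.
echo "$*"
if command -v nico-admin-cli >/dev/null 2>&1; then
  "$@"
fi
```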
With all options specified:
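The sketch below combines every option listed above in one invocation. All values are illustrative, and the position of -c before the subcommand is again an assumption rather than confirmed syntax.

```shell
# Placeholders: replace with your own values.
API_URL="https://<your-nico-api-url>/"
MACHINE_ID="<MACHINE_ID>"

# Assumption: -c is given before the "debug bundle" subcommand.
# --batch-size must not exceed 5000; --utc makes both times UTC.
set -- nico-admin-cli -c "$API_URL" debug bundle "$MACHINE_ID" \
  --start-time "2025-01-15 06:00:00" \
  --end-time "2025-01-15 18:00:00" \
  --grafana-url "https://grafana.example.com" \
  --output-path "/var/tmp" \
  --batch-size 1000 \
  --utc

# Show the command, then run it only if the CLI is installed.
echo "$*"
if command -v nico-admin-cli >/dev/null 2>&1; then
  "$@"
fi
```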
Without Grafana (metadata only):
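A sketch of the same call with --grafana-url left out, using placeholder values and the same assumed placement of -c. No Grafana token is needed in this case.

```shell
# Placeholders: replace with your own values.
API_URL="https://<your-nico-api-url>/"
MACHINE_ID="<MACHINE_ID>"

# Assumption: -c is given before the "debug bundle" subcommand.
set -- nico-admin-cli -c "$API_URL" debug bundle "$MACHINE_ID" \
  --start-time "06:00:00"

# Show the command, then run it only if the CLI is installed.
echo "$*"
if command -v nico-admin-cli >/dev/null 2>&1; then
  "$@"
fi
```

Because --grafana-url is omitted, log collection is skipped and the bundle is built only from data fetched through the NICo API.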
When you run the debug bundle command, it shows progress through multiple steps: