DOCA Documentation v2.10.0

DOCA OS Inspector Service Guide

Warning

Not part of DOCA release

This guide provides instructions on how to use the DOCA OS Inspector on top of NVIDIA® BlueField® DPU.

The DOCA OS Inspector service monitors various aspects of a target VM or bare-metal host by inspecting the memory of the target operating system and exporting the collected data for use by other services, such as security, big-data, and AI-based services.

The DOCA OS Inspector service is linked to DOCA Telemetry Service (DTS). It uses the DOCA App Shield library to collect OS data from the target system without hindering it, parses the collected data, and forwards it to DTS, which manages the rest of the telemetry aspects.

DOCA OS Inspector runs inside its own Kubernetes pod on BlueField. The parsed data is sent to the telemetry collector in a predefined struct.

  1. Follow the steps required to work with the DOCA App Shield library, as explained in the library's documentation.

  2. Copy the JSON files generated by doca_apsh_config from the host/VM to the DPU, under /opt/mellanox/doca/services/os_inspector/.

    For example:

    dpu> scp root@192.168.100.1:~/*.json /opt/mellanox/doca/services/os_inspector/

  3. Place your service configuration JSON files os_inspector_params.json and os_inspector_cfg.json at the path /opt/mellanox/doca/services/os_inspector/.

  4. Create a VF to be used by the service according to the DOCA Virtual Functions User Guide and expose it to the target system.
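Before deploying the service, it can help to confirm that the configuration files from step 3 are actually in place. The following is a minimal sketch, not part of DOCA: the helper name missing_service_files is hypothetical, and only the directory and the two file names come from the steps above.

```python
from pathlib import Path

# Directory and file names are taken from the preparation steps above.
SERVICE_DIR = Path("/opt/mellanox/doca/services/os_inspector")
REQUIRED_FILES = ("os_inspector_params.json", "os_inspector_cfg.json")


def missing_service_files(service_dir=SERVICE_DIR, required=REQUIRED_FILES):
    """Return the required file names that are not present in service_dir."""
    base = Path(service_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_service_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All service configuration files are in place.")
```

The check only looks at the two service configuration files; the doca_apsh_config output files are not checked because their names depend on the local setup.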

FW Version

The firmware version must be 24.32.1010 or higher.

BlueField OS (BFB) Version

Supported BlueField OS versions are 3.9.3 and higher.

For information about the deployment of DOCA containers on top of the BlueField DPU, refer to NVIDIA DOCA Container Deployment Guide.

Service-specific configuration steps and deployment instructions can be found under the service's container page.

JSON Input

This file configures which data objects ("events") the service exports. For the service parameters JSON, see the Service Parameters JSON section.

The DOCA OS Inspector configuration file must be placed at /opt/mellanox/doca/services/os_inspector/<json_file_name>.json and use the following format:

/*
 * key: lib APSH struct
 * value: true/false - collect & export or not
 */
{
    "processes_info": [true/false],
    "threads_info": [true/false],
    "libs_info": [true/false],
    "vads_info": [true/false],
    "system_modules_info": [true/false],
    "privileges_info": [true/false],
    "processes_envars_info": [true/false]
}

Allowed data objects for export:

  • "processes_info" – Information about each process running on the target system. See doca_apsh_process_attr in the DOCA App Shield API documentation.

  • "threads_info" – Information about all threads running on the target system. See doca_apsh_thread_attr in the DOCA App Shield API documentation.

  • "libs_info" – Information about all the libraries of each process running on the target system. See doca_apsh_lib_attr in the DOCA App Shield API documentation.

  • "vads_info" – Information about the VADs/VMAs (Windows/Linux) of each process running on the target system. See doca_apsh_vad_attr in the DOCA App Shield API documentation.

  • "system_modules_info" – Information about each kernel module active in the target system. See doca_apsh_module_attr in the DOCA App Shield API documentation.

  • "privileges_info" – Information about the privileges of each process running on the target system. See doca_apsh_privilege_attr in the DOCA App Shield API documentation.

  • "processes_envars_info" – Information about the environment variables of each process running on the target system. See doca_apsh_envar_attr in the DOCA App Shield API documentation.

All exported data objects that relate to a process contain two fields that can be used to identify the related process: PID (process ID) and COMM (process executable name).

Events Configuration JSON Example:

{
    "processes_info": true,
    "threads_info": true,
    "libs_info": false,
    "vads_info": true,
    "system_modules_info": true,
    "privileges_info": false,
    "processes_envars_info": false
}
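Since the events configuration is a flat JSON object of boolean switches, it is easy to sanity-check before deploying. The sketch below is an illustration only: validate_events_config is a hypothetical helper, and rejecting unknown keys is an assumption of this sketch (the guide does not say how the service treats them). The allowed key names are the ones listed above.

```python
import json

# Data objects listed in this guide as allowed for export.
ALLOWED_EVENTS = frozenset({
    "processes_info", "threads_info", "libs_info", "vads_info",
    "system_modules_info", "privileges_info", "processes_envars_info",
})


def validate_events_config(text):
    """Parse an events configuration and return a list of problems (empty if none)."""
    config = json.loads(text)
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for key, value in config.items():
        if key not in ALLOWED_EVENTS:
            # Assumption of this sketch: unknown keys are treated as mistakes.
            problems.append(f"unknown data object: {key}")
        elif not isinstance(value, bool):
            problems.append(f"{key} must be true or false")
    return problems
```

For example, validate_events_config('{"processes_info": true}') returns an empty list, while a misspelled key or a value such as "yes" is reported.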

Note

The precise fields of each data object may depend on the target OS type, and some data objects might be available only for certain OS types.

Note

Changes to the JSON file are not picked up by the service during runtime.

Note

Currently, string values are assumed to be up to 999 bytes long. If a string is longer than 999 bytes, the service exports 998 bytes with the last byte set to "+" to indicate that the value is truncated.
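A consumer of the exported data may want to flag values that were possibly truncated. The following is a rough sketch under a stated assumption: a truncated value is at least 998 bytes long and ends with "+". The helper name looks_truncated is hypothetical, and a legitimate long value that happens to end with "+" would also be flagged.

```python
# Assumption: a truncated value is at least 998 bytes long and ends with "+".
TRUNCATION_MIN_LEN = 998
TRUNCATION_MARKER = b"+"


def looks_truncated(value):
    """Return True if an exported string value may have been truncated."""
    data = value.encode("utf-8") if isinstance(value, str) else value
    return len(data) >= TRUNCATION_MIN_LEN and data.endswith(TRUNCATION_MARKER)
```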


Service Parameters JSON

This file configures the general behavior of the service and tells it where to find the resources it needs.

The DOCA OS Inspector parameters file must be placed at /opt/mellanox/doca/services/os_inspector/os_inspector_params.json and use the following format:

{
    "doca_general_flags": {
        // -l - Sets the log level for the service (DEBUG=60, CRITICAL=20)
        "log-level": 60
    },
    "doca_program_flags": {
        // -p - Sets the path to the events configuration file in a JSON format.
        "policy": "/os_inspector/os_inspector_cfg.json",
        "memr": "/os_inspector/mem_regions.json",
        "vuid": "MT2140X05931MLNXS0D0F0",
        "dma": "mlx5_0",
        "osym": "/os_inspector/symbols.json",
        "osty": "linux",
        "time": 20
    }
}

Each JSON key is defined as follows:

  • doca_general_flags holds the DOCA general runtime arguments received by the service:

    • "log-level": <value> – Sets the log level <CRITICAL=20, ERROR=30, WARNING=40, INFO=50, DEBUG=60>

  • doca_program_flags is a JSON object with the service-specific runtime arguments, which are as follows:

    • "policy": <path> – Path to the JSON file with the export (events) configuration

    • "memr": <path> – Path to the system memory regions map

    • "vuid": <string> – VUID of the system device

    • "dma": <string> – DMA device name

    • "osym": <path> – Path to the system OS symbol map

    • "osty": <windows|linux> – System OS type

    • "time": <seconds> – Scan time interval in seconds
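The key definitions above can be turned into a quick pre-deployment check. This is a sketch under assumptions, not the service's own validation: validate_params is a hypothetical helper, it treats every doca_program_flags key as required and "time" as a positive number (neither is stated in this guide), and it takes an already-parsed dictionary because the commented template above is not strict JSON.

```python
# Log levels and OS types as listed in the key definitions above.
LOG_LEVELS = {"CRITICAL": 20, "ERROR": 30, "WARNING": 40, "INFO": 50, "DEBUG": 60}
PROGRAM_KEYS = ("policy", "memr", "vuid", "dma", "osym", "osty", "time")
OS_TYPES = ("windows", "linux")


def validate_params(params):
    """Return a list of problems found in a parsed parameters dictionary."""
    problems = []
    general = params.get("doca_general_flags", {})
    program = params.get("doca_program_flags", {})

    level = general.get("log-level")
    if level is not None and level not in LOG_LEVELS.values():
        problems.append(f"log-level {level!r} is not a known level")

    # Assumption of this sketch: all program flags are required.
    for key in PROGRAM_KEYS:
        if key not in program:
            problems.append(f"missing doca_program_flags key: {key}")

    if "osty" in program and program["osty"] not in OS_TYPES:
        problems.append("osty must be 'windows' or 'linux'")

    if "time" in program:
        interval = program["time"]
        is_number = isinstance(interval, (int, float)) and not isinstance(interval, bool)
        if not is_number or interval <= 0:
            problems.append("time must be a positive number of seconds")
    return problems
```

Passing the example values shown above (log level 60, OS type "linux", scan interval 20) yields an empty list.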

YAML File

The .yaml file downloaded from NGC can be easily edited according to your needs.

env:
  # Set according to the local setup
  - name: SERVICE_ARGS
    value: /os_inspector/os_inspector_params.json

  • SERVICE_ARGS is the path to the JSON file with the runtime arguments received by the service, as defined in the Service Parameters JSON section.

Verifying Output

Enabling DTS to write data locally makes it possible to check the validity of the DOCA OS Inspector output.

To allow DTS to write locally, uncomment the following line in dts_config.ini:

#output=/data

Note

Any changes in dts_config.ini necessitate restarting the pod for the new settings to apply.
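Uncommenting the output line can also be scripted. The sketch below is illustrative only: enable_local_output is a hypothetical helper that works on the text of dts_config.ini, and reading and writing the file (whose location depends on the DTS deployment) is left to the caller.

```python
import re


def enable_local_output(ini_text):
    """Uncomment the '#output=/data' line, leaving every other line untouched."""
    pattern = r"(?m)^[ \t]*#[ \t]*(output=/data)[ \t]*$"
    return re.sub(pattern, r"\1", ini_text)
```

Text in which the line is already uncommented is returned unchanged, so the helper can be applied repeatedly.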

The schema folder contains JSON-formatted metadata files which allow reading the binary files containing the actual data. The binary files are written according to the naming convention shown in the following example (the tree utility can be installed with apt install tree):

$ tree /opt/mellanox/doca/services/telemetry/data/
/opt/mellanox/doca/services/telemetry/data/
├── {year}
│   └── {mmdd}
│       └── {hash}
│           ├── {source_id}
│           │   └── {source_tag}{timestamp}.bin
│           └── {another_source_id}
│               └── {another_source_tag}{timestamp}.bin
└── schema
    └── schema_{MD5_digest}.json
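The naming convention above can be walked programmatically to find the binary files and schemas. This is a minimal sketch that assumes the exact directory depth shown in the tree; the function names are hypothetical.

```python
from pathlib import Path

# Data directory from the tree example above.
DATA_ROOT = Path("/opt/mellanox/doca/services/telemetry/data")


def list_binary_files(data_root=DATA_ROOT):
    """Return (source_id, path) pairs for every .bin file under data_root."""
    root = Path(data_root)
    pairs = []
    # Layout from the tree above: {year}/{mmdd}/{hash}/{source_id}/{file}.bin
    for path in sorted(root.glob("*/*/*/*/*.bin")):
        pairs.append((path.parent.name, path))
    return pairs


def list_schema_files(data_root=DATA_ROOT):
    """Return the schema JSON files stored under data_root/schema."""
    return sorted((Path(data_root) / "schema").glob("schema_*.json"))
```

Both functions return empty lists when the data directory does not exist yet, which matches the case where DTS has not written anything locally.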

New binary files appear when:

  • The service starts

  • The binary file's maximum age or size limit is reached

  • The JSON file is changed and new telemetry schemas are created

  • An hour passes

If no schema or no data folders are present, refer to the Troubleshooting section in DOCA Telemetry Service Guide.

Note

source_id is usually set to the machine hostname. source_tag is a line describing the collected counters, and it is often set as the provider's name or name of user-counters.

Reading the binary data can be done from within the DTS container using the following command:

crictl exec -it <Container ID> /opt/mellanox/collectx/bin/clx_read -s /data/schema /data/path/to/datafile.bin

The data written locally should be shown in a JSON format. You should expect a large output.
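When the read command is issued from a script, building it as an argument list avoids quoting mistakes. The sketch below only assembles the command shown above; build_clx_read_command is a hypothetical helper, and the container ID and data file path remain placeholders to be filled in for the local setup.

```python
import shlex

# Path of the reader binary inside the DTS container, as given above.
CLX_READ = "/opt/mellanox/collectx/bin/clx_read"


def build_clx_read_command(container_id, datafile, schema_dir="/data/schema"):
    """Return the argv list for running clx_read inside the DTS container."""
    return ["crictl", "exec", "-it", container_id, CLX_READ, "-s", schema_dir, datafile]


if __name__ == "__main__":
    # Placeholder container ID and data file path, for illustration only.
    argv = build_clx_read_command("<Container ID>", "/data/path/to/datafile.bin")
    print(shlex.join(argv))
```

The resulting list can be handed to a process launcher on a machine where crictl is available; that step is not shown here.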

© Copyright 2025, NVIDIA. Last updated on Jul 10, 2025.