The NVIDIA Config Manager Network Template Render Service is an event-driven microservice that automatically generates and versions network device configurations. The service monitors Nautobot (the network’s source of truth) for changes, and renders updated configurations using Jinja2 templates. The rendered configurations are stored in the Config Store.
The service consists of three main components: the API, the event consumers, and the event dispatcher.
You can use the render service’s API endpoints to trigger the rendering of network device configurations.
POST /v1/render/{device_uuid}/render - Render the configuration for a single device
POST /v1/render/all - Queue renders for all devices that are enabled for rendering
POST /v1/render/batch - Queue renders for a list of devices
Three specialized pull-based consumers process NATS JetStream events:
Nautobot event consumer responds to Nautobot model changes (device, interface, cable, IP address, and so on). The consumer dispatches events to model-specific handlers, and queues device renders.
Device change consumer responds to queued device render requests from event handlers. The consumer executes renders with distributed locking, and updates the config store.
Template change consumer responds to template version updates. The consumer re-renders devices with stale template versions. If the consumer's running version is older than the version requested in the message, the consumer negatively acknowledges (NAKs) the message, and the message is retried after 30 seconds.
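The per-device locking used by the device change consumer can be illustrated with a small sketch. This is not the service's implementation: `InMemoryLockStore` is a hypothetical in-process stand-in for Redis, and `device_lock`, the `render-lock:` key prefix, and the TTL value are assumptions chosen for the example. The pattern it shows is set-if-absent with an owner token and an expiry, followed by a release that only succeeds for the owner. In Redis, the acquire step corresponds to `SET key token NX PX ttl`, which Redis executes atomically.

```python
import time
import uuid
from contextlib import contextmanager
from typing import Dict, Iterator, Tuple


class InMemoryLockStore:
    """Hypothetical stand-in for Redis: set-if-absent with expiry, delete-if-owner."""

    def __init__(self) -> None:
        # key -> (owner token, expiry timestamp)
        self._locks: Dict[str, Tuple[str, float]] = {}

    def set_if_absent(self, key: str, token: str, ttl: float) -> bool:
        now = time.monotonic()
        current = self._locks.get(key)
        if current is not None and current[1] > now:
            return False  # held by another owner and not yet expired
        self._locks[key] = (token, now + ttl)
        return True

    def delete_if_owner(self, key: str, token: str) -> bool:
        current = self._locks.get(key)
        if current is not None and current[0] == token:
            del self._locks[key]
            return True
        return False


@contextmanager
def device_lock(store: InMemoryLockStore, device_id: str, ttl: float = 60.0) -> Iterator[bool]:
    """Try to take the render lock for one device; yields True if acquired."""
    key = f"render-lock:{device_id}"  # assumed key naming
    token = uuid.uuid4().hex          # unique token identifies this holder
    acquired = store.set_if_absent(key, token, ttl)
    try:
        yield acquired
    finally:
        if acquired:
            store.delete_if_owner(key, token)


store = InMemoryLockStore()
with device_lock(store, "leaf-01") as first:
    # A second consumer instance asking for the same device is turned away.
    with device_lock(store, "leaf-01") as second:
        same_device = (first, second)
    # A different device is unaffected.
    with device_lock(store, "leaf-02") as other:
        other_device = other
# After release, the lock can be taken again.
with device_lock(store, "leaf-01") as third:
    reacquired = third
```

The expiry matters in a distributed setting: if a consumer crashes mid-render, the lock is freed once the TTL elapses rather than being held forever.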
The event dispatcher is a dynamic event routing system. It maintains a dispatch table that maps Nautobot model events to handler functions, and exposes Prometheus metrics for event processing.
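A dispatch table of this kind can be sketched in a few lines of Python. The example is illustrative only: the `handle_<model>_<action>` naming convention, the `build_dispatch_table` and `dispatch` helpers, and the shape of the event dictionary are assumptions, not the service's actual code. It discovers handler functions by introspection and routes an event by its model and action.

```python
import inspect
from types import SimpleNamespace
from typing import Callable, Dict, Optional, Tuple

Handler = Callable[[dict], str]
PREFIX = "handle_"  # hypothetical naming convention for handler functions


def handle_device_update(event: dict) -> str:
    return f"queue render for device {event['id']}"


def handle_interface_update(event: dict) -> str:
    # An interface change affects the device that owns the interface.
    return f"queue render for device {event['device_id']}"


def build_dispatch_table(namespace: object) -> Dict[Tuple[str, str], Handler]:
    """Map (model, action) to every function named handle_<model>_<action>."""
    table: Dict[Tuple[str, str], Handler] = {}
    for name, func in inspect.getmembers(namespace, inspect.isfunction):
        if not name.startswith(PREFIX):
            continue
        # Split on the last underscore so models like "ip_address" work.
        model, _, action = name[len(PREFIX):].rpartition("_")
        if model and action:
            table[(model, action)] = func
    return table


def dispatch(table: Dict[Tuple[str, str], Handler], event: dict) -> Optional[str]:
    """Route one event; return None when no handler is registered."""
    handler = table.get((event["model"], event["action"]))
    if handler is None:
        return None  # a real service would count this as a skipped event
    return handler(event)


# A namespace stands in for the module that holds the handlers.
handlers = SimpleNamespace(
    handle_device_update=handle_device_update,
    handle_interface_update=handle_interface_update,
)
TABLE = build_dispatch_table(handlers)
routed = dispatch(TABLE, {"model": "device", "action": "update", "id": "abc"})
unrouted = dispatch(TABLE, {"model": "cable", "action": "delete"})
```

Because the table is built from function names, adding a handler for a new model only requires defining a function that follows the convention.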
The rendering process is as follows:
nv_config_manager_templates.Renderer.Producer: Runs as a Kubernetes job on service deployment. The producer queries Nautobot for devices with stale template_version, and publishes template-change events for outdated devices.
Version tracking: The producer records the nv_config_manager_templates version in Nautobot, and the template consumer refuses to process events for versions newer than its own (it NAKs them with a 30-second delay). This enables zero-downtime rolling deployments: old pods terminate, and new pods process the backlog.
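The version check described above amounts to a simple comparison. The following sketch assumes that versions are dotted integers such as `1.4.0`; the `Decision` type and the `parse_version` and `decide` helpers are hypothetical names introduced for illustration. Only the 30-second delay comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

NAK_DELAY_SECONDS = 30  # redelivery delay stated in the text


@dataclass(frozen=True)
class Decision:
    action: str                  # "process" or "nak"
    delay: Optional[int] = None  # seconds before redelivery, for "nak" only


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "1.10.0" into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def decide(running: str, requested: str) -> Decision:
    """Decide what a consumer running `running` does with a message for `requested`."""
    if parse_version(running) < parse_version(requested):
        # This pod is too old for the message: leave it for a newer pod.
        return Decision("nak", NAK_DELAY_SECONDS)
    return Decision("process")


old_pod = decide(running="1.4.0", requested="1.5.0")
new_pod = decide(running="1.5.0", requested="1.5.0")
```

During a rolling deployment, an old pod keeps returning the message to the stream until a pod running the requested version picks it up, so no event is rendered with outdated templates.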
The service is deployed as a Kubernetes deployment, with three consumer deployments (nautobot, device, template), a producer job (runs on helm upgrade), and Redis for distributed locking. The service exposes Prometheus metrics on port 8000.
The service is configured using a configuration file (config.py). The configuration file contains the Nautobot URL and token, NATS connection details (TLS, credentials), the Redis connection for locking, Config Store client settings, and environment-specific aggregate management flags.
Prometheus metrics:
Event processing:
nv_config_manager_events_received - Events received through NATS (by model, instance, namespace).
nv_config_manager_events_processed - Events successfully processed.
nv_config_manager_events_skipped - Events skipped (no handler, device not enabled).
nv_config_manager_events_failed - Events that failed processing (by exception type).
nv_config_manager_event_processing_time - Event processing duration histogram.
Nautobot changes:
nv_config_manager_nautobot_change_messages_received
nv_config_manager_nautobot_change_messages_processed
nv_config_manager_nautobot_change_messages_failed
nv_config_manager_nautobot_change_message_processing_time - Render duration
nv_config_manager_nautobot_change_message_end_to_end_time - Nautobot publish to Config Store persist
Template changes:
nv_config_manager_template_change_messages_received (by template_version)
nv_config_manager_template_change_messages_processed
nv_config_manager_template_change_messages_failed
nv_config_manager_template_change_message_processing_time
Exception types:
NautobotException - Nautobot API errors, retry on transient failures
RenderException - Template rendering failures, ACK (do not retry)
DeviceNotEnabledError - Device not enabled for rendering, ACK
EventParseError - Malformed event data, fail counter incremented
ConfigStoreException - Config store persistence errors
Consumer behavior:
Dynamic handler discovery: The event dispatcher builds its routing table by introspecting the functions in the events/ module, eliminating manual registration.
Pull-based consumption: Consumers fetch messages on demand rather than relying on push-based subscriptions, enabling better flow control and horizontal scaling.
Distributed locking: Redis-backed locks prevent concurrent renders for the same device across multiple consumer instances.
Version-aware processing: The template consumer compares its running version to the message version and refuses to process newer versions, which enables safe rolling deployments.
Async blocking operations: Long-running synchronous operations (Nautobot API calls, template rendering) run in thread pools using asyncio.to_thread() to avoid blocking the event loop.
Connection sharing: NATSConnectionManager and NautobotConnectionManager share connections across components within a process.
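The thread-offloading approach from the list above can be shown with a minimal example. `asyncio.to_thread()` is part of the Python standard library (3.9 and later); the `render_blocking` function is a hypothetical stand-in for a synchronous Nautobot call followed by a template render, and the device names are made up.

```python
import asyncio
import threading
import time
from typing import List


def render_blocking(device_id: str) -> str:
    """Hypothetical synchronous work: API call plus template render."""
    time.sleep(0.05)  # simulates blocking I/O or CPU-bound rendering
    return f"{device_id} rendered on {threading.current_thread().name}"


async def render_device(device_id: str) -> str:
    # Run the blocking function in a worker thread so the event loop
    # stays free to fetch and acknowledge other messages.
    return await asyncio.to_thread(render_blocking, device_id)


async def main() -> List[str]:
    devices = ("leaf-01", "leaf-02", "spine-01")
    # gather() returns results in the same order as the inputs.
    return list(await asyncio.gather(*(render_device(d) for d in devices)))


results = asyncio.run(main())
```

Each call runs on a thread from the event loop's default executor, so the three renders overlap instead of running back to back on the main thread.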