Copyright © 2026, NVIDIA Corporation.


Config Store Architecture


This document describes the system architecture of the NVIDIA Config Manager Config Store service.

System Architecture

Components

API Service (FastAPI)

The API service is a FastAPI application that exposes a REST API for configuration management. It offers:

  • Versioned configuration storage with 1-year retention
  • Gzip compression (level 6) for storage efficiency
  • PostgreSQL advisory locks for fine-grained concurrency control
  • RESTful API with OpenAPI documentation
  • Diff generation between versions
  • Bulk operations and batch endpoints

Web UI

  • Next.js web interface for browsing device configurations
  • See Web UI for features and access instructions

PostgreSQL (CNPG)

  • Primary data store for versioned configs
  • Advisory locks for concurrent writes
  • Automatic versioning per device/filename/file_type
  • Compressed content storage

Redis

  • Cache for Nautobot device metadata
  • Reduces load on Nautobot API

Nautobot

  • Source of truth for device metadata (site, platform, role, rack)
  • Accessed through the GraphQL API
  • Metadata cached in Redis for performance

High Availability Architecture

In this high-availability architecture:

  • Any API replica can handle any request
  • If one replica crashes, others continue serving
  • Fine-grained locking allows concurrent writes from all replicas
  • No single point of failure (SPOF)

Data Flows

Write Operation Flow

  1. Client sends HTTP POST request to API endpoint
  2. FastAPI receives request and validates input
  3. Storage layer acquires PostgreSQL advisory lock (device+filename+file_type)
  4. Content is compressed using gzip (level 6)
  5. Content hash (SHA256 of the uncompressed content) is calculated for deduplication
  6. New version is inserted into config_files table
  7. Lock is automatically released on transaction commit
  8. Response returned with version number

Read Operation Flow

  1. Client sends HTTP GET request to API endpoint
  2. FastAPI queries PostgreSQL for latest version
  3. Content is decompressed from storage
  4. Device metadata is enriched from Redis cache (or Nautobot if cache miss)
  5. Response returned with config content and metadata
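
A minimal sketch of the read path, assuming the compressed row has already been fetched from PostgreSQL. The names `read_config` and `fetch_from_nautobot` are hypothetical, a plain dict stands in for Redis, and writing the fetched metadata back to the cache on a miss is an assumption rather than documented behavior.

```python
import gzip
from typing import Callable, Dict


def read_config(
    compressed: bytes,
    device_uuid: str,
    cache: Dict[str, dict],
    fetch_from_nautobot: Callable[[str], dict],
) -> dict:
    """Decompress stored content and attach device metadata."""
    text = gzip.decompress(compressed).decode("utf-8")   # step 3
    metadata = cache.get(device_uuid)                    # step 4: try the cache first
    if metadata is None:
        metadata = fetch_from_nautobot(device_uuid)      # cache miss: ask Nautobot
        cache[device_uuid] = metadata                    # assumed: populate cache on miss
    return {"content": text, "metadata": metadata}       # step 5


# Usage with a fake Nautobot lookup that counts how often it is called.
calls = []


def fake_fetch(device_uuid: str) -> dict:
    calls.append(device_uuid)
    return {"site": "lab", "role": "leaf"}


cache: Dict[str, dict] = {}
blob = gzip.compress(b"hostname leaf01\n", compresslevel=6)
r1 = read_config(blob, "dev-1", cache, fake_fetch)
r2 = read_config(blob, "dev-1", cache, fake_fetch)   # served from the cache
```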

Concurrent Write Handling

  • PostgreSQL advisory locks provide fine-grained locking at device+filename+file_type level
  • Different devices can write simultaneously without blocking
  • Intended and backup configs have independent locks
  • Locks are automatically released on transaction commit/rollback
  • Failed transactions release locks automatically
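
PostgreSQL advisory locks are keyed by integers, so a lock scoped to device+filename+file_type needs a way to map that triple to a key. The sketch below shows one possible mapping to a signed 64-bit value; the hashing scheme and the name `advisory_lock_key` are assumptions, not the service's documented method. `pg_advisory_xact_lock` is the PostgreSQL function that holds a lock until the surrounding transaction ends.

```python
import hashlib
import struct


def advisory_lock_key(device_uuid: str, filename: str, file_type: str) -> int:
    """Map a (device, filename, file_type) triple to a signed 64-bit lock key."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
    material = "\x00".join((device_uuid, filename, file_type)).encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # PostgreSQL advisory lock keys are bigint, so interpret 8 bytes as signed.
    (key,) = struct.unpack(">q", digest[:8])
    return key


# Statement to run inside the write transaction with the key as its parameter.
# The lock is released automatically on commit or rollback.
LOCK_SQL = "SELECT pg_advisory_xact_lock(%s)"

intended_key = advisory_lock_key("dev-1", "startup.cfg", "intended")
backup_key = advisory_lock_key("dev-1", "startup.cfg", "backup")
print(intended_key, backup_key)
```

Because the key includes file_type, intended and backup writes for the same device and filename take different locks and do not block each other.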

Storage Architecture

Database Schema

The config_files table stores all versioned configuration content:

config_files:
- id (UUID, primary key)
- device_uuid (UUID, indexed)
- filename (text)
- file_type (enum: intended|backup, indexed)
- version (integer)
- content (bytea, compressed)
- content_hash (text, SHA256 of uncompressed)
- author (text, indexed)
- commit_message (text)
- created_at (timestamp with timezone, indexed)

Unique constraint: (device_uuid, filename, file_type, version)
Indexes: device+filename, device+filename+file_type, created_at, author
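
To make the unique constraint concrete, the following sketch recreates the table in an in-memory SQLite database. SQLite is only a stand-in: the real store is PostgreSQL, where the columns would use UUID, bytea, an enum, and timestamptz. The index names, NOT NULL choices, and the helper `insert_version` are assumptions.

```python
import sqlite3
import uuid

DDL = """
CREATE TABLE config_files (
    id             TEXT PRIMARY KEY,
    device_uuid    TEXT NOT NULL,
    filename       TEXT NOT NULL,
    file_type      TEXT NOT NULL CHECK (file_type IN ('intended', 'backup')),
    version        INTEGER NOT NULL,
    content        BLOB NOT NULL,
    content_hash   TEXT NOT NULL,
    author         TEXT,
    commit_message TEXT,
    created_at     TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (device_uuid, filename, file_type, version)
);
CREATE INDEX idx_device_filename ON config_files (device_uuid, filename);
CREATE INDEX idx_device_filename_type ON config_files (device_uuid, filename, file_type);
CREATE INDEX idx_created_at ON config_files (created_at);
CREATE INDEX idx_author ON config_files (author);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)


def insert_version(conn, device_uuid, filename, file_type, version):
    """Insert one row with placeholder content (hypothetical helper)."""
    conn.execute(
        "INSERT INTO config_files "
        "(id, device_uuid, filename, file_type, version, content, content_hash) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (str(uuid.uuid4()), device_uuid, filename, file_type, version, b"", ""),
    )


device = str(uuid.uuid4())
insert_version(conn, device, "startup.cfg", "intended", 1)
insert_version(conn, device, "startup.cfg", "backup", 1)  # allowed: different file_type

try:
    insert_version(conn, device, "startup.cfg", "intended", 1)  # same key and version
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM config_files").fetchone()[0]
```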

Compression

  • Content is compressed using gzip level 6 before storage
  • Typical size reduction: roughly 90-93% (for example, 50KB → ~3.5-5KB)
  • Decompression happens on read operations
  • Content hash is calculated on uncompressed content for deduplication
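
A short sketch of gzip level 6 compression and of how a size reduction percentage is computed. The helper names and the sample config text are made up for illustration; actual ratios depend on how repetitive the configuration is.

```python
import gzip


def compress_config(text: str) -> bytes:
    """Compress config text with gzip level 6."""
    return gzip.compress(text.encode("utf-8"), compresslevel=6)


def reduction_percent(original_size: int, compressed_size: int) -> float:
    """Size reduction as a percentage of the original size."""
    return 100.0 * (1 - compressed_size / original_size)


# A synthetic, highly repetitive config (hypothetical content).
sample = "".join(
    f"interface swp{i}\n mtu 9216\n no shutdown\n" for i in range(1, 513)
)
blob = compress_config(sample)
restored = gzip.decompress(blob).decode("utf-8")

print(len(sample), len(blob), round(reduction_percent(len(sample), len(blob)), 1))
```

As a worked figure, shrinking 50,000 bytes to 5,000 bytes is a 90% reduction, while a 93% reduction of 50,000 bytes leaves 3,500 bytes.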

Versioning

  • Automatic version increment per device/filename/file_type combination
  • Versions start at 1 and increment sequentially
  • Each version is immutable (no updates, only new versions)
  • Full audit trail with author, commit message, and timestamp

Nautobot Integration

Device metadata is fetched from Nautobot through GraphQL and cached in Redis:

  • Site information
  • Platform details
  • Device role
  • Rack location
  • Other device attributes

This metadata enriches API responses and enables device-centric views in the UI.

Caching Strategy:

  • Metadata cached in Redis with TTL
  • Cache misses trigger GraphQL queries to Nautobot
  • Cache refresh service periodically updates stale entries
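
The caching strategy can be sketched as a small TTL cache with a refresh pass. This is an in-process stand-in written under assumptions: the real cache is Redis, the class `TTLCache` and its methods are hypothetical, and the actual TTL value and refresh schedule are not stated on this page.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """In-process stand-in for the Redis metadata cache."""

    def __init__(
        self,
        ttl_seconds: float,
        fetch: Callable[[str], dict],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._fetch = fetch      # e.g. a GraphQL query against Nautobot
        self._clock = clock      # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, device_uuid: str) -> dict:
        """Return cached metadata, fetching it on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(device_uuid)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = self._fetch(device_uuid)
        self._entries[device_uuid] = (now + self._ttl, value)
        return value

    def refresh_stale(self) -> int:
        """Re-fetch every expired entry; return how many were refreshed."""
        now = self._clock()
        stale = [key for key, (expires, _) in self._entries.items() if expires <= now]
        for key in stale:
            self._entries[key] = (now + self._ttl, self._fetch(key))
        return len(stale)
```

Calling `refresh_stale()` on a timer approximates the periodic refresh service, so most reads hit fresh entries instead of waiting on Nautobot.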

Deployment Architecture

Kubernetes Deployment

The service is deployed as a Kubernetes application with:

  • API Service: 3-5 replicas for high availability
  • PostgreSQL: CNPG cluster (primary + 2 replicas)
  • Redis: Shared service for Nautobot metadata caching
  • Web UI: Optional Next.js frontend
  • Gateway: For external access

Infrastructure Requirements

  • PostgreSQL: CNPG cluster (primary + 2 replicas)
    • Memory: 16GB per instance
    • CPU: 4-8 cores per instance
    • Storage: 200GB SSD
  • Redis: Shared service for Nautobot metadata caching
  • API Replicas: 3-5 replicas for high availability
    • Memory: 1GB per replica
    • CPU: 500m per replica
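
The per-instance figures above add up as follows. This is plain arithmetic on the listed numbers, assuming three PostgreSQL instances (primary + 2 replicas); Redis and storage are left out because no per-instance sizing is given for them.

```python
# PostgreSQL: primary + 2 replicas.
PG_INSTANCES = 3
pg_memory_gb = PG_INSTANCES * 16                       # 16GB per instance
pg_cpu_cores = (PG_INSTANCES * 4, PG_INSTANCES * 8)    # 4-8 cores per instance

# API service: 3-5 replicas.
API_REPLICAS = (3, 5)
api_memory_gb = tuple(n * 1 for n in API_REPLICAS)          # 1GB per replica
api_cpu_millicores = tuple(n * 500 for n in API_REPLICAS)   # 500m per replica

print(f"PostgreSQL memory: {pg_memory_gb} GB")
print(f"PostgreSQL CPU: {pg_cpu_cores[0]}-{pg_cpu_cores[1]} cores")
print(f"API memory: {api_memory_gb[0]}-{api_memory_gb[1]} GB")
print(f"API CPU: {api_cpu_millicores[0]}m-{api_cpu_millicores[1]}m")
```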

Monitoring and Observability

Prometheus Metrics

Prometheus metrics are exposed at the /metrics operational endpoint. Config Store exports the default metric set described in the Instrumentator documentation:

  • http_requests_total - Total number of requests
  • http_request_size_bytes - Sum of the content lengths of all incoming requests
  • http_response_size_bytes - Sum of the content lengths of all outgoing responses
  • http_request_duration_seconds - Total duration of requests, limited to only a few buckets
  • http_request_duration_highr_seconds - Higher resolution duration of requests, with a large number of buckets

Health Checks

  • Health Check Route: GET /healthcheck
  • Readiness Probe: Database connectivity check
  • Liveness Probe: Application responsiveness check
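
The difference between the two probes can be sketched without any web framework. The function names, status codes, and response bodies below are hypothetical; this page only states that readiness checks database connectivity and liveness checks application responsiveness.

```python
from typing import Callable, Tuple


def liveness() -> Tuple[int, dict]:
    """Liveness: answering at all shows the process is responsive."""
    return 200, {"status": "alive"}


def readiness(check_database: Callable[[], None]) -> Tuple[int, dict]:
    """Readiness: only report ready if the database check succeeds."""
    try:
        check_database()   # e.g. run "SELECT 1" against PostgreSQL
    except Exception as exc:
        return 503, {"status": "not ready", "detail": str(exc)}
    return 200, {"status": "ready"}


def db_ok() -> None:
    """Fake check that succeeds."""


def db_down() -> None:
    """Fake check that fails the way a lost connection might."""
    raise ConnectionError("database unreachable")


print(liveness())
print(readiness(db_ok))
print(readiness(db_down))
```

Keeping the database check out of the liveness probe avoids restarting healthy API replicas during a database outage; they are only taken out of rotation until the database returns.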

Logging

  • Structured logging with request IDs
  • Audit logging for all configuration changes
  • Error logging with stack traces
  • Performance logging for slow operations
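
A minimal sketch of structured logging with request IDs and stack traces, using only the Python standard library. The JSON field names, the logger name, and the use of a context variable for the request ID are assumptions; the service's actual log schema is not documented here.

```python
import contextvars
import io
import json
import logging
import uuid

# Holds the ID of the request currently being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        }
        if record.exc_info:
            payload["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(payload)


stream = io.StringIO()   # stands in for stdout so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("config_store_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# Typically set once per request, for example in middleware.
request_id_var.set(str(uuid.uuid4()))
logger.info("config version written")
try:
    raise ValueError("invalid file_type")
except ValueError:
    logger.exception("write failed")   # logs at ERROR with the stack trace

records = [json.loads(line) for line in stream.getvalue().splitlines()]
```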