Overview
NVSentinel supports PostgreSQL as an alternative datastore to MongoDB. The PostgreSQL provider offers similar functionality with a relational database backend, providing flexibility in deployment options.
Features
Full feature parity with MongoDB provider
TLS/SSL support with client certificate authentication
Change stream emulation using triggers and polling
JSONB storage for flexible document-like data structures
Automatic schema management with tables, indexes, and triggers
Production-ready with connection pooling and error handling
When to Use PostgreSQL
Choose PostgreSQL if you:
Already have PostgreSQL infrastructure in your organization
Prefer relational databases for operational reasons
Need to leverage existing PostgreSQL expertise
Want to use PostgreSQL-specific features or tools
Have compliance requirements that favor PostgreSQL
Architecture
Database Schema
The PostgreSQL provider uses the following tables:
health_events - Stores GPU and system health events
maintenance_events - Stores CSP maintenance events
datastore_changelog - Tracks changes for change stream emulation
resume_tokens - Stores resume tokens for change stream clients
Each table uses JSONB columns to store document data while extracting key fields for indexing.
Change Stream Emulation
PostgreSQL doesn’t have native change streams like MongoDB. The provider emulates this using:
Triggers on tables to capture INSERT/UPDATE/DELETE operations
Changelog table to store change events
Polling mechanism to detect new changes
Resume tokens to track client progress and enable resumption
Configuration
Helm Chart Deployment
Using values-postgresql.yaml
Deploy NVSentinel with PostgreSQL using the provided values file:
Custom Configuration
Create your own values file with PostgreSQL settings:
TLS/SSL Configuration
PostgreSQL requires TLS for secure connections. The Helm chart automatically configures:
cert-manager resources for certificate generation
Client certificates for application authentication
Server certificates for PostgreSQL TLS
InitContainers to fix certificate permissions
SSL Modes
disable - No SSL (not recommended for production)
require - SSL required but no certificate verification
verify-ca - Verify server certificate against CA
verify-full - Full verification including hostname (recommended)
Environment Variables
The provider recognizes these environment variables:
DATASTORE_PROVIDER - Set to “postgresql”
DATASTORE_HOST - PostgreSQL hostname
DATASTORE_PORT - PostgreSQL port (default: 5432)
DATASTORE_DATABASE - Database name
DATASTORE_USERNAME - Database username
DATASTORE_SSLMODE - SSL mode
DATASTORE_SSLCERT - Client certificate path
DATASTORE_SSLKEY - Client key path
DATASTORE_SSLROOTCERT - CA certificate path
POSTGRESQL_CLIENT_CERT_MOUNT_PATH - Certificate directory path
Alternatively, use MONGODB_CLIENT_CERT_MOUNT_PATH for backward compatibility - the SDK handles both.
Development with Tilt
Using PostgreSQL in Tilt
Set the environment variable before running Tilt:
This will:
Load values-tilt-postgresql.yaml automatically
Deploy PostgreSQL instead of MongoDB
Generate PostgreSQL certificates via cert-manager
Configure all services to use PostgreSQL
Switching Back to MongoDB
Simply unset the variable and restart:
Development Values
The values-tilt-postgresql.yaml file includes:
Single-replica PostgreSQL for faster startup
Development password (nvsentinel-dev)
Auto-initialized schema with tables and triggers
Control plane node selector for PostgreSQL pod
Migration Guide
From MongoDB to PostgreSQL
Export data from MongoDB (if preserving data):
Deploy PostgreSQL :
Import data to PostgreSQL (if needed):
Testing the Migration
Verify PostgreSQL deployment:
Schema Management
Automatic Initialization
The PostgreSQL subchart automatically runs initialization scripts that:
Create tables if they don’t exist
Create indexes for query performance
Set up triggers for change tracking
Create helper functions
Manual Schema Updates
If you need to modify the schema:
Connect to PostgreSQL:
Run your schema updates:
Schema Reference
See postgresql-schema.sql for the complete schema definition.
Connection Pooling
The provider uses Go’s database/sql package with connection pooling:
Indexing Strategy
Key indexes for performance:
node_name - Queries by node
status - Status filtering
created_at / event_timestamp - Time-based queries
agent / csp - Event source filtering
is_fatal - Fatal event queries
Query Optimization
The provider extracts frequently-queried fields from JSONB:
Reduces JSONB query overhead
Enables efficient B-tree index usage
Maintains flexibility for additional JSONB fields
Troubleshooting
Connection Issues
Problem : “connection refused” or “no route to host”
Solutions :
Verify PostgreSQL service is running: kubectl get svc nvsentinel-postgresql -n nvsentinel
Check network policies aren’t blocking access
Verify DNS resolution: nslookup nvsentinel-postgresql.nvsentinel.svc.cluster.local
Certificate Issues
Problem : “certificate verify failed” or permission denied
Solutions :
Check cert-manager created certificates: kubectl get certificate -n nvsentinel
Verify certificate secret exists: kubectl get secret postgresql-client-cert -n nvsentinel
Check initContainer logs: kubectl logs -n nvsentinel deployment/fault-quarantine -c fix-cert-permissions
Ensure certificate permissions are correct (key: 600, cert/ca: 644)
SSL Mode Issues
Problem : “SSL required” or “SSL not supported”
Solutions :
Verify sslmode is set correctly in values
Check PostgreSQL server has TLS enabled
Verify server certificate is valid
Change Stream Not Working
Problem : Services not receiving updates
Solutions :
Check triggers are created: Connect to DB and run \d health_events to see triggers
Verify changelog table has entries: SELECT COUNT(*) FROM datastore_changelog;
Check resume tokens are being updated: SELECT * FROM resume_tokens;
Review component logs for polling errors
Problem : Slow queries or high CPU
Solutions :
Check indexes exist: \d health_events in psql
Analyze slow queries: Enable PostgreSQL slow query log
Increase connection pool size if needed
Consider vacuuming: VACUUM ANALYZE health_events;
Monitoring
Metrics
Monitor these PostgreSQL metrics:
Connection pool utilization
Query latency (P50, P95, P99)
Transaction rate
Table sizes and growth
Index hit ratio
Replication lag (if using replicas)
Health Checks
Verify datastore health:
Security Considerations
Authentication
Client certificates - Mutual TLS authentication required
Password authentication - Not recommended for production
SCRAM-SHA-256 - Supported for password-based auth
Authorization
Configure PostgreSQL user permissions:
Network Security
Use NetworkPolicies to restrict access
Enable TLS for all connections
Use private networks when possible
Rotate certificates regularly
Backup and Recovery
Backup Strategy
The PostgreSQL subchart supports automated backups:
Manual Backup
Recovery
Comparison: PostgreSQL vs MongoDB
References
Support
For issues or questions:
Check this documentation and troubleshooting section
Review logs: kubectl logs -n nvsentinel <pod-name>
File an issue: https://github.com/nvidia/nvsentinel/issues