Guardrails#

Guardrails help validate inputs and outputs of physics ML models to build confidence in predictions. PhysicsNeMo provides tools for both pre-inference (input validation) and post-inference (output quality assessment) workflows.

Note

Guardrails are an experimental feature. APIs and functionality may change in future releases without backward compatibility guarantees. Contributions to advance the guardrails are welcome.

Overview#

  • Geometry Guardrail (pre-inference): OOD detection for 3D geometries before inference.

  • PDE Residuals (post-inference): physics consistency via continuity and momentum residuals.

  • Model Ensemble Variance (post-inference): error quantification from multiple predictions.

Pre-inference: Geometry Guardrail#

The geometry guardrail detects out-of-distribution (OOD) 3D shapes before running inference. Models trained on a specific geometry distribution (e.g., automotive body shapes) may perform poorly on geometries that differ in position, orientation, scale, or shape. The guardrail learns the distribution of geometries from training data and flags unusual shapes at query time, so you can reject or investigate them before inference.

Location: physicsnemo.experimental.guardrails

Feature extraction

Each mesh is reduced to a 22-dimensional feature vector. Features are intentionally non-invariant to translation, rotation, and scale, so the guardrail can detect geometries that differ in absolute position, orientation, or size from the training distribution. The feature set includes centroid position, principal component axes and eigenvalues, bounding box extents, second moments of inertia, and total and projected surface areas.
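As a rough illustration of why such features are sensitive to position, orientation, and scale, the sketch below computes a small subset of comparable quantities with NumPy. The function name and feature subset are illustrative only, not the guardrail's internal implementation.

```python
import numpy as np

def sketch_geometry_features(vertices: np.ndarray) -> np.ndarray:
    """Illustrative subset of non-invariant geometry features.

    `vertices` is an (N, 3) array of mesh vertex positions. The real
    guardrail computes 22 features from the full mesh (including surface
    areas and inertia moments); this sketch shows only the centroid,
    bounding-box extents, and principal-component eigenvalues.
    """
    centroid = vertices.mean(axis=0)                       # 3 features: absolute position
    extents = vertices.max(axis=0) - vertices.min(axis=0)  # 3 features: absolute scale
    # Eigenvalues of the covariance capture the spread along principal
    # axes; translating, rotating, or rescaling the mesh changes the
    # feature vector, which is exactly what the guardrail relies on.
    cov = np.cov(vertices - centroid, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]       # 3 features
    return np.concatenate([centroid, extents, eigvals])

# Example: the 8 corners of a unit cube
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
feats = sketch_geometry_features(cube)
```

A translated or scaled copy of the same cube would produce a different feature vector, so it can be distinguished from the training distribution.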

Density modeling

The guardrail fits a probabilistic density model on the training feature set. Two methods are available:

  • GMM (default): Gaussian Mixture Model.

  • PCE: Polynomial Chaos Expansion with Hermite polynomials.
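The GMM approach can be sketched with scikit-learn: fit a mixture to the training feature vectors and use the negative log-likelihood as an anomaly score. This is a minimal stand-in for the idea, not PhysicsNeMo's implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in for per-geometry feature vectors from the training set.
train_features = rng.normal(0.0, 1.0, size=(500, 4))

# Single-component GMM fit on the training features.
gmm = GaussianMixture(n_components=1, random_state=0).fit(train_features)

# Anomaly score: negative log-likelihood (higher = more anomalous).
train_scores = -gmm.score_samples(train_features)
query = np.array([[0.0, 0.0, 0.0, 0.0],   # typical point
                  [8.0, 8.0, 8.0, 8.0]])  # far outside the training cloud
query_scores = -gmm.score_samples(query)
```

The far-out query receives a much higher score than any training sample, which is what the percentile-based classification below exploits.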

Classification

Anomaly scores are converted to empirical percentiles relative to the training distribution. Configurable thresholds (warn_pct, reject_pct) define:

  • OK: percentile < warn_pct. Typical geometry, safe for inference.

  • WARN: warn_pct ≤ percentile < reject_pct. Unusual geometry; investigate before trusting predictions.

  • REJECT: percentile ≥ reject_pct. Highly anomalous, likely out-of-distribution.
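The percentile conversion and thresholding can be sketched as follows; `classify` is a hypothetical helper for illustration, not the guardrail's API.

```python
import numpy as np

def classify(score: float, train_scores: np.ndarray,
             warn_pct: float = 99.0, reject_pct: float = 99.9) -> tuple[str, float]:
    """Convert an anomaly score to an empirical percentile and a status.

    The percentile is the fraction of training scores at or below the
    query score, so a score larger than almost all training scores
    lands near 100%.
    """
    percentile = 100.0 * np.mean(train_scores <= score)
    if percentile >= reject_pct:
        return "REJECT", percentile
    if percentile >= warn_pct:
        return "WARN", percentile
    return "OK", percentile

# Toy score distribution: 1000 training scores 0, 1, ..., 999.
train_scores = np.arange(1000, dtype=float)
```

With the defaults, a score at the 99.6th empirical percentile returns WARN, while one at or above the 99.9th returns REJECT.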

Usage

Fit from a directory of STL files (parallel processing), or from a list of mesh objects. Query returns status and percentile per geometry. Save and load fitted guardrails via .npz; schema compatibility is checked on load.

from pathlib import Path
from physicsnemo.experimental.guardrails import GeometryGuardrail

guardrail = GeometryGuardrail(
    method="gmm",        # or "pce"
    gmm_components=1,
    warn_pct=99.0,
    reject_pct=99.9,
    device="cuda",
)
guardrail.fit_from_dir(Path("data/train_stl"), n_workers=8)
results = guardrail.query_from_dir(Path("data/test_stl"))

for r in results:
    print(f"{r['name']}: {r['status']} (p={r['percentile']:.1f}%)")

# Save for reuse
guardrail.save(Path("guardrail.npz"))

Example: See examples/minimal/guardrails/ for a full workflow using the DrivAerML and AhmedML datasets.

Post-inference: PDE Residuals#

PDE residuals measure how well model predictions satisfy the governing equations (e.g., the continuity and momentum equations of the Navier-Stokes system). Predictions that violate these equations are less trustworthy; high residuals often indicate model uncertainty.

PhysicsNeMo-CFD offers a sample workflow for quantifying PDE residuals for continuity and momentum, which can serve as a template for other use cases or PDEs. In that workflow, the functions compute_continuity_residuals and compute_momentum_residuals in physicsnemo.cfd.bench.metrics.physics compute the mass-conservation residual (divergence of velocity) and the RANS momentum balance, respectively. The typical steps are: run inference, interpolate predictions onto a volume mesh, then compute residuals. High residuals in wake or high-shear regions are a common sign of model uncertainty. See the volume_benchmarking notebook in the PhysicsNeMo-CFD benchmarking workflow.
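The continuity check reduces to the divergence of the predicted velocity field. The sketch below evaluates it on a uniform grid with NumPy finite differences; the actual workflow interpolates predictions onto a volume mesh and uses the functions named above, so treat this only as an illustration of the quantity being computed.

```python
import numpy as np

# Uniform grid standing in for interpolated volume predictions.
n = 32
x = np.linspace(0.0, 1.0, n)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")

# Analytically divergence-free test field:
#   u = sin(x) cos(y), v = -cos(x) sin(y), w = 0
# so du/dx + dv/dy + dw/dz = 0 exactly.
u = np.sin(X) * np.cos(Y)
v = -np.cos(X) * np.sin(Y)
w = np.zeros_like(u)

# Continuity residual: divergence of velocity via finite differences.
dx = x[1] - x[0]
div = (np.gradient(u, dx, axis=0)
       + np.gradient(v, dx, axis=1)
       + np.gradient(w, dx, axis=2))

# A scalar summary; in practice the residual field itself is
# visualized to locate untrustworthy regions (wakes, high shear).
rms_residual = np.sqrt(np.mean(div**2))
```

For this smooth, exactly divergence-free field the residual is limited only by finite-difference truncation error; a real model prediction would typically show much larger, spatially structured residuals.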

Post-inference: Model Ensemble Variance#

PhysicsNeMo-CFD offers a sample workflow for quantifying prediction uncertainty via ensemble variance. This sample can serve as a template for other use cases. In that workflow, prediction variance across multiple realizations provides an uncertainty proxy: run several inferences and use the standard deviation of predictions as an error indicator. Two common ensemble variants are demonstrated: input (mesh) sensitivity—create variants of the same geometry via decimation, subdivision, or remeshing and run inference on each; and checkpoint sensitivity—use multiple model checkpoints and run inference with each. For each input, collect predictions from N realizations, compute mean and std at each point, and visualize std to identify regions of high uncertainty (e.g., front, wheels, mirrors in automotive aero). See the benchmarking_in_absence_of_gt notebook in the PhysicsNeMo-CFD benchmarking workflow.