Guardrails#
Guardrails help validate inputs and outputs of physics ML models to build confidence in predictions. PhysicsNeMo provides tools for both pre-inference (input validation) and post-inference (output quality assessment) workflows.
Note
Guardrails are an experimental feature. APIs and functionality may change in future releases without backward-compatibility guarantees. Contributions to advance the guardrails are welcome.
Overview#
| Type | Tool | Purpose |
|---|---|---|
| Pre-inference | Geometry Guardrail | OOD detection for 3D geometries before inference |
| Post-inference | PDE Residuals | Physics consistency via continuity and momentum residuals |
| Post-inference | Model Ensemble Variance | Error quantification from multiple predictions |
Pre-inference: Geometry Guardrail#
The geometry guardrail detects out-of-distribution (OOD) 3D shapes before running inference. Models trained on a specific geometry distribution (e.g., automotive body shapes) may perform poorly on geometries that differ in position, orientation, scale, or shape. The guardrail learns the distribution of geometries from training data and flags unusual shapes at query time, so you can reject or investigate them before inference.
Location: physicsnemo.experimental.guardrails
Feature extraction
Each mesh is reduced to a 22-dimensional feature vector. Features are intentionally non-invariant to translation, rotation, and scale, so the guardrail can detect geometries that differ in absolute position, orientation, or size from the training distribution. The feature set includes centroid position, principal component axes and eigenvalues, bounding box extents, second moments of inertia, and total and projected surface areas.
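To make the idea concrete, here is a minimal sketch of a few features of this kind, assuming the mesh is given as an `(N, 3)` vertex array. The helper `geometry_features` is hypothetical (the real guardrail extracts 22 features from the mesh surface); it only illustrates why non-invariant features let the guardrail catch translated or rescaled geometries.

```python
import numpy as np

def geometry_features(vertices: np.ndarray) -> np.ndarray:
    """Illustrative non-invariant features from an (N, 3) vertex array:
    centroid (absolute position), PCA eigenvalues (principal axis scales),
    and bounding-box extents. Not the PhysicsNeMo feature set."""
    centroid = vertices.mean(axis=0)                       # 3 values
    centered = vertices - centroid
    cov = centered.T @ centered / len(vertices)            # second moments
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]       # 3 values
    extents = vertices.max(axis=0) - vertices.min(axis=0)  # 3 values
    return np.concatenate([centroid, eigvals, extents])

# A translated copy of the same shape produces different features by design,
# so shifted geometries are detectable as out-of-distribution.
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 3))
f1 = geometry_features(pts)
f2 = geometry_features(pts + 10.0)  # centroid entries differ; the rest match
```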
Density modeling
The guardrail fits a probabilistic density model on the training feature set. Two methods are available:
GMM (default): Gaussian Mixture Model.
PCE: Polynomial Chaos Expansion with Hermite polynomials.
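The density-modeling step can be sketched with a single-component Gaussian, which is what `gmm_components=1` (the default) reduces to. The class below is a hypothetical stand-in, not the PhysicsNeMo API; it fits a mean and covariance on the training features and scores new features by negative log-likelihood, so higher scores mean less typical geometries.

```python
import numpy as np

class GaussianDensity:
    """Single-component Gaussian density model (illustrative sketch)."""

    def fit(self, X: np.ndarray) -> "GaussianDensity":
        self.mean = X.mean(axis=0)
        # Regularize the covariance so it stays invertible for small samples.
        self.cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        return self

    def anomaly_score(self, X: np.ndarray) -> np.ndarray:
        """Negative log-likelihood under the fitted Gaussian."""
        d = X - self.mean
        inv = np.linalg.inv(self.cov)
        _, logdet = np.linalg.slogdet(self.cov)
        quad = np.einsum("ij,jk,ik->i", d, inv, d)
        k = X.shape[1]
        return 0.5 * (quad + logdet + k * np.log(2 * np.pi))

rng = np.random.default_rng(1)
train = rng.normal(size=(500, 4))          # stand-in feature vectors
model = GaussianDensity().fit(train)
typical = model.anomaly_score(train[:1])   # low score: in-distribution
outlier = model.anomaly_score(np.full((1, 4), 8.0))  # high score: far from training
```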
Classification
Anomaly scores are converted to empirical percentiles relative to the training distribution. Configurable thresholds (warn_pct, reject_pct) define three statuses:
OK: percentile < warn_pct — typical geometry, safe for inference
WARN: warn_pct ≤ percentile < reject_pct — unusual, investigate
REJECT: percentile ≥ reject_pct — highly anomalous, likely OOD
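The percentile-and-threshold logic above can be sketched in a few lines. The `classify` helper is hypothetical (not the PhysicsNeMo API): it computes the empirical percentile of a score against the stored training scores and applies the two thresholds.

```python
import numpy as np

def classify(score, train_scores, warn_pct=99.0, reject_pct=99.9):
    """Map an anomaly score to OK/WARN/REJECT via its empirical percentile
    relative to the training-score distribution (illustrative sketch)."""
    pct = 100.0 * np.mean(train_scores < score)  # empirical percentile
    if pct >= reject_pct:
        return "REJECT", pct
    if pct >= warn_pct:
        return "WARN", pct
    return "OK", pct

train_scores = np.arange(1000)        # stand-in training anomaly scores
print(classify(500.5, train_scores))  # mid-distribution -> OK
print(classify(995.5, train_scores))  # 99.6th percentile -> WARN
print(classify(1e9, train_scores))    # beyond all training scores -> REJECT
```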
Usage
Fit from a directory of STL files (parallel processing), or from a list of mesh
objects. Query returns status and percentile per geometry. Save and load fitted
guardrails via .npz; schema compatibility is checked on load.
```python
from pathlib import Path

from physicsnemo.experimental.guardrails import GeometryGuardrail

guardrail = GeometryGuardrail(
    method="gmm",  # or "pce"
    gmm_components=1,
    warn_pct=99.0,
    reject_pct=99.9,
    device="cuda",
)

# Fit on training geometries, then query test geometries
guardrail.fit_from_dir(Path("data/train_stl"), n_workers=8)
results = guardrail.query_from_dir(Path("data/test_stl"))
for r in results:
    print(f"{r['name']}: {r['status']} (p={r['percentile']:.1f}%)")

# Save for reuse
guardrail.save(Path("guardrail.npz"))
```
Example: See examples/minimal/guardrails/ for a full workflow using
DrivAerML and AhmedML datasets.
Post-inference: PDE Residuals#
PDE residuals measure how well model predictions satisfy the governing equations (e.g., the continuity and momentum equations of the Navier-Stokes system). Predictions that violate these equations are less trustworthy; high residuals often indicate model uncertainty.
PhysicsNeMo-CFD
offers a sample workflow for quantifying PDE residuals for
continuity and momentum. This sample can serve as a template for other use cases
or PDEs. In that workflow, the functions compute_continuity_residuals and
compute_momentum_residuals in physicsnemo.cfd.bench.metrics.physics
compute mass conservation (divergence of velocity) and RANS momentum balance,
respectively. The typical steps are to run inference, interpolate predictions
onto a volume mesh, then compute residuals. High residuals in wake or high-shear
regions often indicate model uncertainty. See the volume_benchmarking
notebook in the PhysicsNeMo-CFD benchmarking workflow.
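The continuity residual can be illustrated on a uniform grid with finite differences. This is only a sketch of the underlying idea (mass conservation means the velocity divergence should vanish); the actual workflow uses the PhysicsNeMo-CFD functions above on an interpolated volume mesh, not this grid-based helper.

```python
import numpy as np

def continuity_residual(u, v, h):
    """Mass-conservation residual du/dx + dv/dy on a uniform 2D grid,
    via finite differences (illustrative; not the PhysicsNeMo-CFD API)."""
    dudx = np.gradient(u, h, axis=0)
    dvdy = np.gradient(v, h, axis=1)
    return dudx + dvdy

# Analytically divergence-free velocity field on [0, 2*pi]^2:
#   u = sin(x) cos(y),  v = -cos(x) sin(y)  =>  du/dx + dv/dy = 0
n = 64
h = 2 * np.pi / (n - 1)
x, y = np.meshgrid(
    np.linspace(0, 2 * np.pi, n),
    np.linspace(0, 2 * np.pi, n),
    indexing="ij",
)
u = np.sin(x) * np.cos(y)
v = -np.cos(x) * np.sin(y)

res = continuity_residual(u, v, h)  # near zero: the field conserves mass
```

A prediction that conserves mass well yields a residual near zero everywhere; concentrated high-residual patches (e.g., in wake regions) are the signal to distrust.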
Post-inference: Model Ensemble Variance#
PhysicsNeMo-CFD offers a sample workflow for quantifying prediction uncertainty via ensemble variance; it can serve as a template for other use cases. In that workflow, prediction variance across multiple realizations provides an uncertainty proxy: run several inferences and use the standard deviation of the predictions as an error indicator. Two common ensemble variants are demonstrated:
Input (mesh) sensitivity: create variants of the same geometry via decimation, subdivision, or remeshing and run inference on each.
Checkpoint sensitivity: use multiple model checkpoints and run inference with each.
For each input, collect predictions from N realizations, compute the mean and standard deviation at each point, and visualize the standard deviation to identify regions of high uncertainty (e.g., the front, wheels, and mirrors in automotive aerodynamics). See the benchmarking_in_absence_of_gt notebook in the PhysicsNeMo-CFD benchmarking workflow.
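The collect-mean-std recipe can be sketched with plain NumPy. The arrays below are synthetic stand-ins (shapes and names are illustrative, not from the PhysicsNeMo-CFD workflow): N prediction realizations are stacked, and the pointwise standard deviation serves as the error indicator.

```python
import numpy as np

rng = np.random.default_rng(2)
n_realizations, n_points = 8, 1000

# Synthetic stand-in for a predicted surface field, plus per-realization
# perturbations; the second half of the domain is made deliberately noisier
# to mimic a harder region (e.g., a wake).
base = np.sin(np.linspace(0, np.pi, n_points))
noise = rng.normal(scale=0.05, size=(n_realizations, n_points))
noise[:, 500:] *= 5.0
preds = base + noise  # shape (N, n_points): one row per realization

mean = preds.mean(axis=0)  # ensemble mean prediction
std = preds.std(axis=0)    # pointwise uncertainty proxy

# High-std regions flag where the ensemble disagrees with itself;
# these are the regions to visualize and distrust.
```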