nat.profiler.calc.calc_runner
Attributes
Classes
| LinearFitAnalyzer | Handles linear regression analysis for concurrency vs. time metrics. |
| CalcRunner | Calculator for GPU sizing based on concurrency vs. time metrics. |
Module Contents
- logger
- class LinearFitAnalyzer(fit_config: nat.profiler.calc.data_models.FitConfig)
Handles linear regression analysis for concurrency vs time metrics.
- fit_config#
- analyze_metrics(
    sizing_metrics_per_concurrency: dict[int, nat.profiler.calc.data_models.SizingMetrics],
  )
Analyze metrics and return alerts, including outlier information.
- Returns:
dict[int, CalcAlerts]: Alerts per concurrency, including outlier flags.
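The kind of analysis described here, a linear fit of a time metric against concurrency followed by outlier flagging, can be sketched in plain Python. This is a minimal illustration and not the `LinearFitAnalyzer` implementation: the function names `fit_line` and `flag_outliers`, the residual z-score rule, and the default threshold of 2.0 are all assumptions made for the example.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def flag_outliers(time_by_concurrency, z_threshold=2.0):
    """Return {concurrency: bool}, True where the point lies far from the fit.

    Hypothetical rule: a point is an outlier when its residual, divided by
    the population standard deviation of all residuals, exceeds z_threshold.
    """
    concurrencies = sorted(time_by_concurrency)
    times = [time_by_concurrency[c] for c in concurrencies]
    slope, intercept = fit_line(concurrencies, times)
    residuals = [t - (slope * c + intercept) for c, t in zip(concurrencies, times)]
    spread = pstdev(residuals)
    if spread < 1e-12:  # points lie on the line, so nothing to flag
        return {c: False for c in concurrencies}
    return {c: abs(r) / spread > z_threshold for c, r in zip(concurrencies, residuals)}
```

Calling `flag_outliers({1: 1.0, 2: 2.1, 3: 2.9, ...})` returns one boolean per concurrency, which mirrors the per-concurrency shape of the alerts dictionary described above.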
- class CalcRunner(config: nat.profiler.calc.data_models.CalcRunnerConfig)
Calculator for GPU sizing based on concurrency vs. time metrics.
Initialize CalcRunner with a config file and a list of concurrencies.
- config
- metrics_per_concurrency: dict[int, nat.profiler.calc.data_models.SizingMetrics]
- gpu_estimates_per_concurrency: dict[int, nat.profiler.calc.data_models.GPUEstimates]
- alerts_per_concurrency: dict[int, nat.profiler.calc.data_models.CalcAlerts]
- linear_analyzer
- validate_config() -> None
Validate the configuration parameters. Raises ValueError if the configuration is invalid.
- property output_dir: pathlib.Path
- _calc_gpu_estimates_based_on_slope(
    sizing_metrics_per_concurrency: dict[int, nat.profiler.calc.data_models.SizingMetrics],
    use_latency: bool,
    use_runtime: bool,
  )
Calculate GPU estimates based on the linear fit results.
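One way a fitted line could be turned into a GPU estimate is sketched below. The formula is an assumption for illustration only, since this page does not state how the slope is used: it solves the fitted line for the concurrency at which the time metric reaches its target, then scales the number of GPUs used in the test run by the ratio of target users to that concurrency. The function name `estimate_gpus_from_fit` and all of its parameters are hypothetical.

```python
def estimate_gpus_from_fit(slope, intercept, target_time, target_users, test_gpu_count):
    """Estimate the GPU count needed to serve target_users within target_time.

    Assumed model: time = slope * concurrency + intercept, measured on a
    deployment that used test_gpu_count GPUs.
    """
    if slope <= 0:
        raise ValueError("slope must be positive for the estimate to be meaningful")
    if target_time <= intercept:
        raise ValueError("target time is not reachable even at zero load")
    # Concurrency the tested deployment can sustain while meeting the target.
    supported_concurrency = (target_time - intercept) / slope
    # Scale the tested GPU count linearly with the required user load.
    return target_users / supported_concurrency * test_gpu_count
```

For example, with a slope of 0.5 s per concurrent user, an intercept of 1.0 s, and a 6.0 s target, the tested deployment supports a concurrency of 10, so 100 target users on a 2-GPU test setup would give an estimate of 20 GPUs. In practice the result would likely be rounded up, for instance with `math.ceil`.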
- _calc_gpu_estimates_per_concurrency(
    sizing_metrics_per_concurrency: dict[int, nat.profiler.calc.data_models.SizingMetrics],
  )
Calculate per-concurrency GPU estimates and existing alerts.
- _validate_metrics_data(sizing_metrics_per_concurrency: dict) -> dict
Validate and filter metrics data.
- _calc_fit_and_gpu_estimate(
    sizing_metrics_per_concurrency: dict[int, nat.profiler.calc.data_models.SizingMetrics],
  )
Estimate GPU count to meet target latency and/or workflow runtime SLA for a given target user load.
- Returns:
  - GPU estimates based on the slope of the time vs. concurrency fit
  - GPU estimates per concurrency (rough estimates)
  - Alerts per concurrency (outliers, etc.)
- generate_calc_runner_output() -> nat.profiler.calc.data_models.CalcRunnerOutput
Build CalcRunnerOutput from sizing metrics per concurrency.
- plot_concurrency_vs_time_metrics(output_dir: pathlib.Path)
Plot concurrency vs. time metrics using pre-computed fits.
- write_output(
    output_dir: pathlib.Path,
    calc_runner_output: nat.profiler.calc.data_models.CalcRunnerOutput,
  )
Write the output to the output directory.
- run_offline() -> nat.profiler.calc.data_models.CalcRunnerOutput
Run in offline mode.
1. Read previous jobs run in online mode and create sizing metrics per concurrency.
2. Calculate GPU estimates.
3. Write the output to the offline subdirectory.
- async run_online() -> nat.profiler.calc.data_models.CalcRunnerOutput
Create a MultiEvaluationRunner with concurrency overrides and run in online mode.
1. Run the workflow.
2. Create sizing metrics per concurrency from the profiler results and usage stats.
3. Calculate GPU estimates.
4. Write the output to the online subdirectory.
- async run() -> nat.profiler.calc.data_models.CalcRunnerOutput
Online mode:
1. Run the workflow.
2. Collect profiler results and usage stats.
3. Calculate GPU estimates.
4. Write the output to the online subdirectory.
Offline mode:
1. Read previous jobs run in online mode and append only unique concurrency values to metrics_per_concurrency.
2. Calculate GPU estimates.
3. Write the output to the offline subdirectory.
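The offline-mode step of appending only unique concurrency values can be sketched as a first-wins merge keyed on concurrency. This is an illustration under assumptions, not the `CalcRunner` code: the helper name `merge_unique_concurrencies` is hypothetical, and each previous job is assumed to be available as a plain dictionary mapping concurrency to its metrics.

```python
def merge_unique_concurrencies(metrics_per_concurrency, previous_jobs):
    """Add metrics from previous jobs without overwriting known concurrencies.

    metrics_per_concurrency: dict mapping concurrency -> metrics (updated in place).
    previous_jobs: iterable of dicts with the same shape, one per earlier job.
    """
    for job_metrics in previous_jobs:
        for concurrency, metrics in job_metrics.items():
            # setdefault keeps the first value seen for each concurrency,
            # so later jobs never replace an existing entry.
            metrics_per_concurrency.setdefault(concurrency, metrics)
    return metrics_per_concurrency
```

With this rule, the order in which previous jobs are read decides which job supplies the metrics for a concurrency that appears more than once.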