aiq.profiler.profile_runner#
Attributes#
Classes#
- ProfilerRunner — A utility to run a series of prompts through an AgentIQ workflow for profiling.
Module Contents#
- logger#
- class SimpleMetricsHolder(/, **data: Any)#
Bases:
pydantic.BaseModel
- workflow_run_time_confidence_intervals: Any#
- llm_latency_confidence_intervals: Any#
- throughput_estimate_confidence_interval: Any#
- class InferenceOptimizationHolder(/, **data: Any)#
Bases:
pydantic.BaseModel
- confidence_intervals: SimpleMetricsHolder#
- common_prefixes: Any#
- token_uniqueness: Any#
- workflow_runtimes: Any#
- class ProfilerRunner(profiler_config: aiq.data_models.evaluate.ProfilerConfig, output_dir: pathlib.Path)#
A utility to run a series of prompts through an AgentIQ workflow for profiling. It can:
- load prompts from a file or generate them via an LLM
- collect usage stats for each run
- store them in a configured directory
Updated version with additional metrics: for each request, we collect a list of UsageStatistic objects, store them individually, and also keep a final combined JSON of all requests.
- We then compute:
- 90, 95, and 99% confidence intervals for the mean total workflow run time
- 90, 95, and 99% confidence intervals for the mean LLM latency
- 90, 95, and 99% estimates of throughput
All computed metrics are saved to a metrics JSON file at the end.
- profile_config#
- output_dir#
- _converter#
- all_steps = []#
- async run(all_steps: list[list[aiq.data_models.intermediate_step.IntermediateStep]])#
Main entrypoint: works on the input DataFrame generated from evaluation, writes out a combined requests JSON, computes and saves the additional metrics, and optionally fits a forecasting model.
- _compute_workflow_run_time_confidence_intervals() → aiq.profiler.inference_metrics_model.InferenceMetricsModel#
Computes 90, 95, and 99% confidence intervals for the mean total workflow run time (in seconds). The total workflow run time for each request is the difference between the last and first event timestamps in usage_stats.
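As an illustration of that definition, the per-request run time reduces to the span between the earliest and latest event. This is a sketch only: requests are represented here as plain lists of timestamps, not the actual IntermediateStep objects the runner consumes.

```python
# Sketch only: each request is a list of event timestamps (seconds);
# the real code reads timestamps from IntermediateStep events instead.
def workflow_run_times(requests: list[list[float]]) -> list[float]:
    """Total run time per request: last event timestamp minus first."""
    return [max(events) - min(events) for events in requests if events]
```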
- _compute_llm_latency_confidence_intervals() → aiq.profiler.inference_metrics_model.InferenceMetricsModel#
Computes 90, 95, 99% confidence intervals for the mean LLM latency. LLM latency is defined as the difference between an LLM_END event_timestamp and the immediately preceding LLM_START event_timestamp, across all usage_stats.
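The pairing rule above can be sketched as follows. The (event_type, timestamp) tuple representation is an assumption for illustration; the real code reads these fields from usage_stats entries.

```python
# Sketch only: events are (event_type, timestamp) pairs in stream order.
def llm_latencies(events: list[tuple[str, float]]) -> list[float]:
    """Pair each LLM_END with the immediately preceding LLM_START."""
    latencies: list[float] = []
    start = None
    for event_type, ts in events:
        if event_type == "LLM_START":
            start = ts
        elif event_type == "LLM_END" and start is not None:
            latencies.append(ts - start)
            start = None  # consume the start so it is not paired twice
    return latencies
```

Intervening non-LLM events (tool calls, for example) are simply skipped, so only adjacent LLM_START/LLM_END pairs contribute a latency sample.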
- _compute_throughput_estimates() → aiq.profiler.inference_metrics_model.InferenceMetricsModel#
Computes 90, 95, and 99% confidence intervals for throughput, defined as:
throughput = (total number of requests) / (total time window)
where the total time window runs from the earliest usage_stats event across all requests to the latest usage_stats event. Note: this is a simple approximate measure of overall throughput for the entire run.
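A minimal sketch of that estimate, again assuming a plain timestamps-per-request representation rather than the actual usage_stats objects:

```python
# Sketch only: requests are lists of event timestamps (seconds).
def throughput_estimate(requests: list[list[float]]) -> float:
    """Requests per second over the earliest-to-latest event window."""
    all_ts = [ts for events in requests for ts in events]
    window = max(all_ts) - min(all_ts)
    return len(requests) / window if window > 0 else float("nan")
```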
- _compute_confidence_intervals() → aiq.profiler.inference_metrics_model.InferenceMetricsModel#
Helper to compute 90, 95, 99% confidence intervals for the mean of a dataset. Uses a z-score from the normal approximation for large samples.
Returns a dict like:
{
    'ninetieth_interval': (lower, upper),
    'ninety_fifth_interval': (lower, upper),
    'ninety_ninth_interval': (lower, upper),
}
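A hedged sketch of such a helper, using the standard large-sample normal approximation (z * s / sqrt(n)); the function name and exact z-scores are assumptions, not the library's actual code.

```python
import math

# Two-sided z-scores for the 90/95/99% confidence levels (assumed values).
Z_SCORES = {
    "ninetieth_interval": 1.645,
    "ninety_fifth_interval": 1.960,
    "ninety_ninth_interval": 2.576,
}

def compute_confidence_intervals(data: list[float]) -> dict[str, tuple[float, float]]:
    """Return 90/95/99% CIs for the mean via the normal approximation."""
    n = len(data)
    mean = sum(data) / n
    # Sample standard deviation (ddof=1); degenerate to 0 for a single sample.
    var = sum((x - mean) ** 2 for x in data) / (n - 1) if n > 1 else 0.0
    se = math.sqrt(var) / math.sqrt(n)  # standard error of the mean
    return {name: (mean - z * se, mean + z * se) for name, z in Z_SCORES.items()}
```

Note that wider confidence levels produce wider intervals: the 99% interval always contains the 90% one.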