Generated File Structures#
Overview#
This document describes the structure and contents of the files generated by GenAI-Perf.
Directory Structure#
After running GenAI-Perf, your file tree should contain the following:
genai-perf/
├── artifacts/
│ ├── data/
│ └── images/
File Types#
Within the artifacts directory, several file types are generated: .gzip, .csv, .json, .html, and .jpeg. Each file and its purpose is described below.
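As a quick check after a run, the generated files can be tallied by extension. The following is a minimal sketch using only the Python standard library; the helper name `count_file_types` and the relative path `artifacts` are assumptions for illustration, not part of GenAI-Perf.

```python
from collections import Counter
from pathlib import Path


def count_file_types(artifacts_dir):
    """Count the files under artifacts_dir by extension (for example '.json')."""
    root = Path(artifacts_dir)
    # rglob walks every subdirectory, so data/ and images/ are both covered.
    return Counter(p.suffix for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    # "artifacts" is an assumed relative path; adjust it to match your run.
    for suffix, count in sorted(count_file_types("artifacts").items()):
        print(f"{suffix or '(no extension)'}: {count}")
```

The counts give a quick way to confirm that each expected file type was produced.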
Artifacts Directory#
Data Subdirectory#
The data subdirectory contains the raw and processed performance data files.
GZIP Files#
all_data.gzip: Aggregated performance data from all collected metrics.
input_sequence_lengths_vs_output_sequence_lengths.gzip: This contains data on the input sequence lengths versus the output sequence lengths for each request.
request_latency.gzip: This contains the latency for each request.
time_to_first_token.gzip: This contains the time to first token for each request.
token_to_token_vs_output_position.gzip: This contains the time from one token generation to the next versus the position of the output token for each token.
ttft_vs_input_sequence_lengths.gzip: This contains the time to first token versus the input sequence length for each request.
JSON Files#
inputs.json: This contains the input prompts provided to the LLM during testing.
profile_export.json: This is provided by Perf Analyzer and contains the timestamps for each event in the lifecycle of each request. GenAI-Perf uses this low-level data to calculate its metrics.
profile_export_genai_perf.json: A JSON file containing the GenAI-Perf output along with additional details, including the submitted command-line arguments.
CSV File#
profile_export_genai_perf.csv: A CSV file containing the tables printed in the GenAI-Perf output. It may contain more detail than the printed tables.
Images Subdirectory#
The images subdirectory contains visual representations of the performance data. Each plot is saved in both HTML and JPEG formats.
HTML and JPEG Files#
input_sequence_lengths_vs_output_sequence_lengths: A heat map showing the relationship between input sequence lengths and output sequence lengths.
request_latency: A box plot showing request latency.
time_to_first_token: A box plot showing time to first token.
token_to_token_vs_output_position: A scatterplot showing token-to-token time versus output token position.
ttft_vs_input_sequence_lengths: A scatterplot showing time to first token versus the input sequence lengths.
Usage Instructions#
To use the generated files, navigate to the artifacts/data directory for the data files or the artifacts/images directory for the plots. The next steps depend on the file format you wish to work with.
GZIP Files#
The GZIP files contain Parquet files with calculated data, which can be read with Pandas in Python. For example, you can create a dataframe with these files:
import pandas
df = pandas.read_parquet(path_to_file)
You can then use Pandas to work with the data.
print(df.head())  # See the first few rows of the data.
print(df.describe())  # Get summary statistics for the data.
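To inspect every GZIP file in one pass, the same read can be applied across the data directory. This is a minimal sketch: the helper names `find_gzip_files` and `load_all` and the path `artifacts/data` are assumptions for illustration, and `pandas.read_parquet` needs a Parquet engine such as pyarrow installed.

```python
from pathlib import Path


def find_gzip_files(data_dir):
    """Return the .gzip files in data_dir, sorted by name."""
    return sorted(Path(data_dir).glob("*.gzip"))


def load_all(data_dir):
    """Read every .gzip (Parquet) file into a dict keyed by file stem."""
    import pandas  # imported here so that listing files works without pandas

    return {p.stem: pandas.read_parquet(p) for p in find_gzip_files(data_dir)}


if __name__ == "__main__":
    # "artifacts/data" is an assumed path; adjust it to match your run.
    for name, df in load_all("artifacts/data").items():
        print(name, df.shape)
```

Keying the dictionary by file stem lets you refer to each dataframe by the same name used in the file listing, such as `request_latency`.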
CSV and JSON Files#
Open .csv and .json files with spreadsheet or JSON parsing tools for structured data analysis. These can also be read via a text editor, like Vim.
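The CSV and JSON files can also be loaded programmatically. The sketch below uses only the Python standard library and makes no assumption about the schema of the files: it prints the top-level JSON keys and the first few CSV rows so you can inspect the layout yourself. The helper names and the paths under `artifacts/data` are assumptions for illustration.

```python
import csv
import json


def load_json(path):
    """Parse a JSON file and return the resulting Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_csv_rows(path):
    """Return the non-empty rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row for row in csv.reader(f) if row]


if __name__ == "__main__":
    # Both paths are assumed; adjust them to match your run.
    data = load_json("artifacts/data/profile_export_genai_perf.json")
    if isinstance(data, dict):
        print(sorted(data))  # top-level keys only
    for row in load_csv_rows("artifacts/data/profile_export_genai_perf.csv")[:5]:
        print(row)
```

Reading the CSV row by row, rather than as a single table, still works if the file holds more than one table separated by blank lines.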
HTML Files#
View .html visualizations in a web browser for interactive data exploration.
JPEG Files#
Use an image viewer to open .jpeg images for static visual representations.