Sizing suggestion: The tool recommends the smallest vGPU profile that fits the estimated workload. If the workload is too large for any available vGPU profile, the tool instead suggests using multiple GPUs via passthrough. The tool also creates a JSON-formatted log that records the workload, the hardware, and the suggested configuration.
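
The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `suggest_configuration`, the profile names and sizes, the log field names, and the rule for counting passthrough GPUs (dividing the required memory by the memory of one GPU and rounding up) are all assumptions made for the example.

```python
import json
import math


def suggest_configuration(required_gb, profiles_gb, gpu_memory_gb):
    """Suggest a configuration for a workload needing `required_gb` of GPU memory.

    `profiles_gb` maps hypothetical profile names to their memory size in GB.
    """
    # Keep only the profiles large enough for the workload, smallest first.
    fitting = sorted(
        (size, name) for name, size in profiles_gb.items() if size >= required_gb
    )
    if fitting:
        size, name = fitting[0]
        return {"mode": "vgpu", "profile": name, "profile_gb": size}
    # Assumption: when no profile fits, estimate the number of passthrough
    # GPUs from the memory requirement alone.
    gpu_count = math.ceil(required_gb / gpu_memory_gb)
    return {"mode": "passthrough", "gpu_count": gpu_count}


# Hypothetical profiles and workloads, for illustration only.
profiles = {"profile-12gb": 12, "profile-24gb": 24, "profile-48gb": 48}
small = suggest_configuration(18, profiles, gpu_memory_gb=48)
large = suggest_configuration(100, profiles, gpu_memory_gb=48)

# A JSON-formatted record of workload, hardware, and suggestion.
log = json.dumps(
    {
        "workload": {"required_gb": 18},
        "hardware": {"gpu_memory_gb": 48},
        "suggestion": small,
    },
    indent=2,
)
print(log)
```

With these example values, the 18 GB workload maps to the 24 GB profile because the 12 GB profile is too small, and the 100 GB workload falls back to passthrough because it exceeds the largest profile.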

Local deployment: After vLLM runs locally, the tool creates a log of the deployment results and analysis, which verifies whether the suggested vGPU profile is sufficient for your AI workload. It also creates a runtime log that records the steps of the actual vLLM deployment, which can be useful for debugging deployment issues or for reproducibility. Both logs can be exported together into a single .txt file.
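
As a rough sketch of what a combined export could look like, the Python below joins a deployment log and a runtime log into one text file with a labeled section for each. The function names, the section headers, and the placeholder entries are hypothetical; the tool's real export layout may differ.

```python
from pathlib import Path


def format_export(deployment_log, runtime_log):
    """Combine both logs into one text document with a header per section."""
    sections = [("DEPLOYMENT LOG", deployment_log), ("RUNTIME LOG", runtime_log)]
    parts = []
    for title, lines in sections:
        parts.append(f"=== {title} ===")
        parts.extend(lines)
        parts.append("")  # blank line after each section
    return "\n".join(parts)


def export_logs(deployment_log, runtime_log, path):
    """Write the combined logs to a .txt file at `path`."""
    Path(path).write_text(format_export(deployment_log, runtime_log), encoding="utf-8")


# Placeholder entries, not real tool output.
print(format_export(["<deployment entry>"], ["<runtime entry>"]))
```

Keeping both logs in one file means a single attachment carries the sizing verification and the deployment steps, which is convenient when sharing a report for debugging.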