API Documentation#
The following APIs are supported by VSS.
- REST APIs for File Management
- REST API for Live-stream Management
- REST API for Summarization, VLM Captions and Q&A
- REST API for Alerts
- REST API for Models
- REST API for Health Check
- REST API for Alert Review
Detailed VSS API documentation is available at VSS API Glossary. It is also available at http://<VSS_API_ENDPOINT>/docs after VSS is deployed. Refer to Launch UI for Helm or Launch UI for Docker Compose for instructions on finding the VSS_API_ENDPOINT port information. The `--print-curl-command` option of the Python CLI Client can also be used as a reference for the API request format.
VSS API Glossary#
This glossary provides detailed documentation for the VSS API endpoints. It is separated into the major sections below, and a full list of the APIs and their details is provided at the bottom.
Alerts API#
Alerts-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/alerts` [GET] | List all live stream alerts added to the VIA Server. |
| `/alerts` [POST] | Add an alert for a live stream. |
| `/alerts/{alert_id}` [DELETE] | Delete a live stream alert added to the VIA Server. |
| `/alerts/recent` [GET] | Get recently generated alerts. Optionally filter by live stream ID. |
Chat API#
Chat-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/chat/completions` [POST] | Run video interactive question and answer. |
Files API#
Files-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/files` [POST] | Upload a media file. |
| `/files` [GET] | Returns a list of files. |
| `/files/{file_id}` [DELETE] | Delete a file. |
| `/files/{file_id}` [GET] | Returns information about a specific file. |
| `/files/{file_id}/content` [GET] | Returns the contents of the specified file. |
Health API#
Health-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/health/ready` [GET] | Get VIA readiness status. |
| `/health/live` [GET] | Get VIA liveness status. |
Live-stream API#
Live-stream-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/live-stream` [GET] | List all live streams. |
| `/live-stream` [POST] | Add a live / camera stream. |
| `/live-stream/{stream_id}` [DELETE] | Remove the live / camera stream matching stream_id. |
Metrics API#
Metrics-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/metrics` [GET] | Get VIA metrics in Prometheus format. |
Models API#
Models-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/models` [GET] | Lists the currently available models, and provides basic information about each one such as the owner and availability. |
Recommended_config API#
Recommended_config-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/recommended_config` [POST] | Recommend config for a video. |
Summarize API#
Summarize-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/summarize` [POST] | Run video summarization query. |
VLM Captions API#
VLM Captions-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/generate_vlm_captions` [POST] | Generate VLM captions for video files or live streams. |
Alert Review API#
Review Alert-related APIs exposed by the service.
| API Endpoint | Description |
|---|---|
| `/reviewAlert` [POST] | Review an external alert using VLM analysis. |
List of APIs#
/alerts [GET]#
Summary: List all live stream alerts
Description: List all live stream alerts added to the VIA Server.
No parameters defined.
/alerts [POST]#
Summary: Add an alert
Description: Add an alert for a live stream.
| Parameter | Description |
|---|---|
| name | Name of the alert |
| liveStreamId | ID of the live stream to configure the alert for |
| events | List of events to generate alerts for |
| callback | URL to call when events are detected |
| callbackJsonTemplate | JSON template for the callback body with supported placeholders |
| callbackToken | Bearer token to use when calling the callback URL |
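Taken together, the fields above can be assembled into a request body like the following sketch. The field names come from the table; the helper function, the example values, and the use of the `requests` library in the commented-out call are illustrative assumptions, not part of the API.

```python
import json

def build_alert_payload(name, live_stream_id, events, callback=None):
    """Assemble a POST /alerts body from the documented fields."""
    payload = {
        "name": name,
        "liveStreamId": live_stream_id,
        "events": events,
    }
    if callback is not None:
        payload["callback"] = callback  # optional callback URL
    return payload

# Hypothetical values for illustration only.
payload = build_alert_payload(
    name="intrusion-alert",
    live_stream_id="stream-1234",
    events=["person detected", "fire"],
)
print(json.dumps(payload, indent=2))

# To actually send it (VSS_API_ENDPOINT must point at a running deployment):
# import requests
# requests.post("http://<VSS_API_ENDPOINT>/alerts", json=payload)
```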
/alerts/{alert_id} [DELETE]#
Summary: Delete a live stream alert
Description: Delete a live stream alert added to the VIA Server.
| Parameter | Description |
|---|---|
| alert_id | Unique ID of the alert to be deleted. |
/alerts/recent [GET]#
Summary: Get recent alerts
Description: Get recently generated alerts. Optionally filter by live stream ID.
| Parameter | Description |
|---|---|
| live_stream_id | Optional live stream ID to filter alerts. |
/chat/completions [POST]#
Summary: VIA Chat or Q&A
Description: Run video interactive question and answer.
| Parameter | Description |
|---|---|
| id | Unique ID or list of IDs of the files/live-streams to query (max 50 items). Note: a list of IDs works only for image files. |
| messages | List of chat messages containing the conversation history |
| model | Model to use for this query (for example, “vila-1.5”) |
| api_type | API used to access the model (for example, “internal”) |
| max_tokens | Maximum number of tokens to generate (1-1024) |
| temperature | Sampling temperature for text generation (0-1) |
| top_p | Top-p sampling mass for text generation (0-1) |
| top_k | Number of highest probability tokens to keep (1-1000) |
| seed | Random seed for generation |
| highlight | If true, generate a highlight for the video |
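A minimal Q&A request built from the parameters above might look like the sketch below. The parameter names and ranges come from the table; the message object shape (`role`/`content`), the helper name, and the example values are assumptions for illustration.

```python
def build_chat_request(media_id, question, model="vila-1.5"):
    """Assemble a POST /chat/completions body from the documented fields."""
    return {
        "id": media_id,  # single ID; a list of IDs is allowed for image files only
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": 512,    # within the documented 1-1024 range
        "temperature": 0.4,   # within the documented 0-1 range
    }

chat_body = build_chat_request("file-abc123", "What vehicles appear in the video?")
```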
/files [POST]#
Summary: API for uploading a media file
Description: Files are used to upload media files.
| Parameter | Description |
|---|---|
| purpose | The intended purpose of the uploaded file (must be “vision” for the VIA use case) |
| media_type | Media type (“image” or “video”) |
| file | File object to be uploaded |
| filename | Filename along with path to be used (alternative to file upload) |
| camera_id | Camera ID to be used for the file (optional) |
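The form fields above can be sketched as follows. The field names and the required “vision” purpose come from the table; the helper, the example path, and the multipart upload via the `requests` library in the commented-out call are illustrative assumptions.

```python
def build_upload_form(media_type, filename=None):
    """Form fields for POST /files; `purpose` must be "vision" per the table above."""
    fields = {"purpose": "vision", "media_type": media_type}
    if filename is not None:
        fields["filename"] = filename  # server-side path alternative to uploading bytes
    return fields

form = build_upload_form("video", filename="/media/warehouse.mp4")  # hypothetical path

# Uploading a local file instead (multipart form-data; requests library assumed):
# import requests
# with open("warehouse.mp4", "rb") as f:
#     requests.post("http://<VSS_API_ENDPOINT>/files",
#                   data={"purpose": "vision", "media_type": "video"},
#                   files={"file": f})
```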
/files [GET]#
Summary: Returns list of files
Description: Returns a list of files.
| Parameter | Description |
|---|---|
| purpose | Only return files with the given purpose. |
/files/{file_id} [DELETE]#
Summary: Delete a file
Description: Delete the file with the given ID.
| Parameter | Description |
|---|---|
| file_id | The ID of the file to be deleted. |
/files/{file_id} [GET]#
Summary: Returns information about a specific file
Description: Returns information about a specific file.
| Parameter | Description |
|---|---|
| file_id | The ID of the file to use for this request. |
/files/{file_id}/content [GET]#
Summary: Returns the contents of the specified file
Description: Returns the contents of the specified file.
| Parameter | Description |
|---|---|
| file_id | The ID of the file to use for this request. |
/health/ready [GET]#
Summary: Get VIA readiness status
Description: Get VIA readiness status.
No parameters defined.
/health/live [GET]#
Summary: Get VIA liveness status
Description: Get VIA liveness status.
No parameters defined.
/live-stream [GET]#
Summary: List all live streams
Description: List all live streams.
No parameters defined.
/live-stream [POST]#
Summary: Add a live stream
Description: API for adding a live / camera stream.
| Parameter | Description |
|---|---|
| liveStreamUrl | URL of the RTSP stream |
| description | Description of the live stream |
| username | Username for RTSP authentication (if required) |
| password | Password for RTSP authentication (if required) |
| camera_id | Camera ID to be used for the live stream (optional) |
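The fields above can be assembled into a request body like the sketch below. The field names come from the table; the helper, the example RTSP URL, and the omission of credentials (only needed when the RTSP server requires authentication) are illustrative assumptions.

```python
def build_live_stream_payload(url, description, username=None, password=None):
    """Assemble a POST /live-stream body from the documented fields."""
    payload = {"liveStreamUrl": url, "description": description}
    if username is not None and password is not None:
        payload["username"] = username  # only when the RTSP server requires auth
        payload["password"] = password
    return payload

stream_body = build_live_stream_payload(
    "rtsp://203.0.113.5:8554/cam1",  # hypothetical RTSP URL
    "loading dock camera",
)
```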
/live-stream/{stream_id} [DELETE]#
Summary: Remove a live stream
Description: API for removing the live / camera stream matching stream_id.
| Parameter | Description |
|---|---|
| stream_id | Unique identifier of the live stream to be deleted. |
/metrics [GET]#
Summary: Get VIA metrics
Description: Get VIA metrics in Prometheus format.
No parameters defined.
/models [GET]#
Summary: Lists the currently available models, and provides basic information about each one such as the owner and availability
Description: Lists the currently available models, and provides basic information about each one such as the owner and availability.
No parameters defined.
/recommended_config [POST]#
Summary: Recommend config for a video
Description: Recommend config for a video.
No parameters defined.
/summarize [POST]#
Summary: Summarize a video
Description: Run video summarization query.
| Parameter | Description |
|---|---|
| id | Unique ID or list of IDs of the files/live-streams to summarize (max 50 items). A list of IDs works only for image files. |
| prompt | Prompt for summarization |
| system_prompt | System prompt for the VLM. To enable reasoning with Cosmos Reason1, add `<think></think>` and `<answer></answer>` tags to the system prompt. |
| model | Model to use for this query (for example, “vila-1.5”) |
| api_type | Specifies the type of API |
| response_format | Specifies the format of the response (0 for JSON and 1 for text) |
| stream | If true, partial message deltas are sent as server-sent events |
| max_tokens | Maximum number of tokens to generate (1-1024) |
| temperature | Sampling temperature for text generation (0-1) |
| top_p | Top-p sampling mass for text generation (0-1) |
| top_k | Number of highest probability tokens to keep (1-1000) |
| seed | Random seed for generation (1-4294967295) |
| chunk_duration | Split videos into chunks of the specified duration in seconds (default: 0 for no chunking) |
| chunk_overlap_duration | Chunk overlap duration in seconds (default: 0 for no overlap) |
| summary_duration | Summarize every specified duration of video (live streams only; -1 to summarize until EOS) (-1 to 3600) |
| media_info | media_info object containing start and end time offsets for processing part of a video file. Not applicable to live streams |
| user | A unique identifier for the user |
| caption_summarization_prompt | Prompt for caption summarization |
| summary_aggregation_prompt | Prompt for summary aggregation |
| tools | Configuration of the tool to be used as part of the request. Currently, only alert is supported. |
| enable_chat | Enable chat Q&A on the input media |
| enable_chat_history | Enable chat history during Q&A |
| enable_cv_metadata | Enable CV metadata |
| cv_pipeline_prompt | Prompt for CV pipeline |
| num_frames_per_chunk | Number of frames per chunk for the VLM (0-256) |
| vlm_input_width | VLM input width (0-4096) |
| vlm_input_height | VLM input height (0-4096) |
| enable_audio | Enable transcription of the audio stream |
| summarize_batch_size | Summarization batch size (1-1024) |
| rag_top_k | RAG top-k results (1-1024) |
| rag_batch_size | RAG batch size (1-1024) |
| summarize_max_tokens | Maximum number of tokens for summarization (1-40) |
| summarize_temperature | Sampling temperature for summarization (0-1) |
| summarize_top_p | Top-p sampling mass for summarization (0-1) |
| chat_max_tokens | Maximum number of tokens for chat responses (1-40) |
| chat_temperature | Sampling temperature for chat responses (0-1) |
| chat_top_p | Top-p sampling mass for chat responses (0-1) |
| notification_max_tokens | Maximum number of tokens for notifications (1-40) |
| notification_temperature | Sampling temperature for notifications (0-1) |
| notification_top_p | Top-p sampling mass for notifications (0-1) |
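Most of the parameters above are optional; a minimal summarization request built from the documented fields might look like the sketch below. The parameter names and ranges come from the table, while the helper, the example values, and the specific chunking choices are illustrative assumptions.

```python
def build_summarize_request(media_id, prompt):
    """Assemble a minimal POST /summarize body from the documented fields."""
    return {
        "id": media_id,
        "prompt": prompt,
        "model": "vila-1.5",
        "chunk_duration": 60,          # split the video into 60 s chunks (0 = no chunking)
        "chunk_overlap_duration": 10,  # 10 s overlap between consecutive chunks
        "max_tokens": 512,             # within the documented 1-1024 range
        "temperature": 0.4,
        "top_p": 1.0,
        "enable_chat": True,           # allow follow-up Q&A on the same media
    }

summarize_body = build_summarize_request("file-abc123",
                                         "Summarize the warehouse activity.")
```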
/reviewAlert [POST]#
Summary: Review an external alert
Description: Review an external alert using VLM analysis to determine if the alert is valid based on video content.
| Parameter | Description |
|---|---|
| version | Alert review API version (default: “1.0”) |
| id | Unique request ID |
| @timestamp | NTP timestamp when the alert was generated (auto-generated if not provided) |
| sensor_id | Sensor identifier (for example, “camera-001”, “sensor-west-entrance”) |
| video_path | Path to the video file, relative to the VSS base media path |
| cv_metadata_path | Path to the CV metadata file, relative to the VSS base media path (optional) |
| confidence | Confidence score (0.0-1.0, default: 1.0) |
| alert | Alert information object containing severity, status, type, and description |
| event | Event information object containing type and description |
| vss_params | VSS parameters object containing chunk settings, VLM parameters, and debug options |
| meta_labels | List of metadata labels in key-value format (optional) |
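An alert review request built from the fields above might look like the sketch below. The top-level field names come from the table; the exact keys inside the `alert` and `event` objects (the table names severity, status, type, and description but does not show the schema), the helper, and the example values are assumptions for illustration.

```python
def build_review_alert_request(request_id, sensor_id, video_path,
                               alert, event, confidence=1.0):
    """Assemble a POST /reviewAlert body from the documented fields."""
    return {
        "version": "1.0",
        "id": request_id,
        "sensor_id": sensor_id,
        "video_path": video_path,  # relative to the VSS base media path
        "confidence": confidence,  # documented range 0.0-1.0
        "alert": alert,
        "event": event,
    }

review_body = build_review_alert_request(
    request_id="req-0001",
    sensor_id="camera-001",
    video_path="clips/dock_incident.mp4",  # hypothetical relative path
    alert={"severity": "high", "status": "new", "type": "intrusion",
           "description": "Person detected in restricted area"},
    event={"type": "intrusion", "description": "Motion detected after hours"},
)
```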
/generate_vlm_captions [POST]#
Summary: Generate VLM captions
Description: Generate VLM captions for video files or live streams using Vision Language Models. For live streams, captions are generated in real-time and can be streamed using server-sent events.
| Parameter | Description |
|---|---|
| id | Unique ID of the file/live-stream to generate VLM captions for. |
| model | Model to use for this query (for example, “vila-1.5”) |
| prompt | Prompt for VLM caption generation (optional) |
| system_prompt | System prompt for the VLM. To enable reasoning with Cosmos Reason1, add `<think></think>` and `<answer></answer>` tags to the system prompt. |
| api_type | API used to access the model (default: “internal”) |
| response_format | An object specifying the format that the model must output (default: text) |
| stream | If true, partial message deltas are sent as server-sent events (default: false). For live streams, streaming is recommended for real-time caption generation. |
| stream_options | Options for the streaming response (optional) |
| max_tokens | Maximum number of tokens to generate (1-1024) |
| temperature | Sampling temperature for text generation (0-1) |
| top_p | Top-p sampling mass for text generation (0-1) |
| top_k | Number of highest probability tokens to keep (1-1000) |
| seed | Random seed for generation (1-4294967295) |
| chunk_duration | Split videos into chunks of the specified duration in seconds (default: 0 for no chunking) |
| chunk_overlap_duration | Chunk overlap duration in seconds (default: 0 for no overlap) |
| media_info | Start and end time offsets for processing part of a video file. Not applicable to live streams |
| user | A unique identifier for the user |
| tools | List of tools for the current VLM captions request (max 100) |
| enable_cv_metadata | Enable CV metadata (default: false) |
| cv_pipeline_prompt | Prompt for CV pipeline (for example, “person . car . bicycle;0.5”) |
| num_frames_per_chunk | Number of frames per chunk to use for the VLM (0-256) |
| vlm_input_width | VLM input width (0-4096) |
| vlm_input_height | VLM input height (0-4096) |
| enable_reasoning | Enable reasoning for VLM caption generation (default: false) |
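A caption-generation request built from the fields above might look like the sketch below. The parameter names and ranges come from the table; the helper, the example values, and the chunking choices are illustrative assumptions.

```python
def build_caption_request(media_id, prompt=None, stream=False):
    """Assemble a POST /generate_vlm_captions body from the documented fields."""
    body = {
        "id": media_id,
        "model": "vila-1.5",
        "chunk_duration": 30,       # caption the video in 30 s chunks
        "num_frames_per_chunk": 8,  # within the documented 0-256 range
        "stream": stream,           # True is recommended for live streams
    }
    if prompt is not None:
        body["prompt"] = prompt
    return body

caption_body = build_caption_request("file-abc123",
                                     prompt="Describe each scene briefly.")
```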