Known Issues
Multi-session Q&A is not currently supported. Use chat on only a single file or live stream at a time; chatting on multiple files and/or live streams simultaneously may lead to incorrect replies. This does not affect summarization.
Models are trained on specific data and use cases, so testing them on other inputs may produce incorrect results.
VLM model accuracy: Timestamps returned by the VLM are sometimes inaccurate, and the model can hallucinate answers to certain questions. Prompt tuning might be required.
Summarization accuracy: Summarization accuracy depends heavily on VLM accuracy. Also, the default configs have been tuned for the warehouse use case. Users can supply custom VLM and summarization prompts to the /summarize API.
The following harmless warnings might be seen during VSS application execution. These can be safely ignored:
GLib (gthread-posix.c): Unexpected error from C library during 'pthread_setspecific': Invalid argument. Aborting
Due to a browser limitation, loading multiple Gradio sessions in the same browser may cause the sessions to get stuck or appear slow.
Guardrails might not reject some prompts that are expected to be rejected. This can happen because the prompt is also relevant in other contexts, or because the topics in the prompt are not configured to be rejected. Try tuning the guardrails configuration if required.
OpenAI connection errors or 429 (Too Many Requests) errors might sometimes be seen if too many requests are sent to the GPT-4V or GPT-4o VLMs. This can be due to low TPM/RPM limits associated with the OpenAI account.
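If your client code is hitting these limits, retrying with exponential backoff usually smooths out transient 429 responses. A minimal sketch follows; the helper name and delay values are illustrative, not part of VSS or the OpenAI SDK:

```python
import random
import time


def call_with_backoff(request_fn, max_retries=5, base_delay=1.0):
    """Retry request_fn with exponential backoff plus jitter.

    request_fn is any zero-argument callable that raises an exception
    (e.g. on an HTTP 429 response) when the rate limit is hit.
    """
    for attempt in range(max_retries):
        try:
            return request_fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the original error
            # Sleep 1s, 2s, 4s, ... plus jitter so concurrent clients
            # do not all retry at the same moment.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

Lowering request concurrency, or raising the account's TPM/RPM tier with OpenAI, addresses the root cause rather than the symptom.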
CA-RAG summarization might return a truncated summary. This is due to the max_tokens limit; try increasing the number in the CA-RAG config file.
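For example, the relevant fragment of the CA-RAG config file might look like the following. The key layout shown here is an illustrative sketch and may differ between VSS versions, so locate the max_tokens entry in your own file and only adjust its value:

```yaml
summarization:
  llm:
    model: "<your-llm-model>"   # whatever model you have configured
    max_tokens: 2048            # increase this if summaries come back truncated
```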
Helm deployment: VSS deployment pod fails with Error: (LLM call Exception: llm-nim-svc)
In spite of an init container that waits for the LLM pod to come up, the VSS deployment can, for an unknown reason, fail with an error like the following:
2024-11-27 17:51:44,763 ERROR Failed to load VIA stream handler - LLM Call Exception: HTTPConnectionPool(host='llm-nim-svc', port=8000): Max retries exceeded with url: /v1/chat/completions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2c9d0ad6c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
If this happens, wait an additional few minutes; a pod restart fixes the issue.
Users can monitor the pods using sudo watch microk8s kubectl get pod.
The Gradio UI might be slow to load thumbnails and video previews for longer videos. This becomes especially noticeable over slower network connections.