Generate clip-level embeddings for search, question answering, filtering, and duplicate removal.
Use the pipeline stages or the example script flags to generate clip-level embeddings.
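As an illustration of why clip-level vectors are useful for search and duplicate removal, the sketch below compares embeddings with cosine similarity. It is a minimal example using NumPy only and does not call the pipeline. The function names, the toy 4-dimensional vectors, and the 0.95 threshold are all assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return pairwise cosine similarities for row-wise embeddings."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero vectors.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def find_near_duplicates(embeddings: np.ndarray, threshold: float = 0.95) -> list[tuple[int, int]]:
    """List index pairs (i, j) with i < j whose similarity meets the threshold."""
    sims = cosine_similarity_matrix(embeddings)
    # Only look at the upper triangle so each pair is reported once.
    i_idx, j_idx = np.triu_indices(len(embeddings), k=1)
    mask = sims[i_idx, j_idx] >= threshold
    return [(int(i), int(j)) for i, j in zip(i_idx[mask], j_idx[mask])]


# Toy vectors standing in for clip-level embeddings (hypothetical values).
clip_embeddings = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.99, 0.05, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
])
duplicate_pairs = find_near_duplicates(clip_embeddings)
print(duplicate_pairs)
```

The same similarity matrix can rank clips against a query vector for search. A suitable threshold depends on the model and the data, so treat 0.95 as a starting point to tune.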
Add CosmosEmbed1FrameCreationStage to transform extracted frames into model-ready tensors.
Add CosmosEmbed1EmbeddingStage to generate clip.cosmos_embed1_embedding and optional clip.cosmos_embed1_text_match.
Cosmos-Embed1 frame creation parameters
clip.cosmos_embed1_frames → temporary tensors used by the embedding stage
clip.cosmos_embed1_embedding → final clip-level vector (NumPy array)
clip.cosmos_embed1_text_match
Add InternVideo2FrameCreationStage to transform extracted frames into model-ready tensors.
Add InternVideo2EmbeddingStage to generate clip.intern_video_2_embedding and optional clip.intern_video_2_text_match.
InternVideo2 frame creation parameters
clip.intern_video_2_frames → temporary tensors used by the embedding stage
clip.intern_video_2_embedding → final clip-level vector (NumPy array)
clip.intern_video_2_text_match
Adjust target_fps during frame extraction, or change the clip length, so that the model receives the required number of frames.
Adjust gpu_memory_gb, reduce batch size if exposed, or use a smaller resolution variant.
Check model_dir and network access. The stages download weights if missing.
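To reason about the frame-count issue before running a pipeline, a rough estimate can help. The sketch below assumes frames are sampled uniformly, so the count is approximately clip duration times target_fps. The actual extraction stage may sample differently. REQUIRED_FRAMES is a hypothetical placeholder and not a documented value for either model, and the helper names are illustrative.

```python
import math

# Hypothetical requirement, for illustration only; check the model's actual needs.
REQUIRED_FRAMES = 8


def sampled_frame_count(clip_duration_s: float, target_fps: float) -> int:
    """Approximate number of frames sampled uniformly from a clip."""
    return math.floor(clip_duration_s * target_fps)


def min_fps_for(required_frames: int, clip_duration_s: float) -> float:
    """Smallest sampling rate that yields required_frames for this clip length."""
    if clip_duration_s <= 0:
        raise ValueError("clip_duration_s must be positive")
    return required_frames / clip_duration_s


# A 2-second clip sampled at 2 fps yields too few frames under this assumption.
frames = sampled_frame_count(clip_duration_s=2.0, target_fps=2.0)
enough = frames >= REQUIRED_FRAMES

# Rate that would satisfy the hypothetical requirement for the same clip.
suggested_fps = min_fps_for(REQUIRED_FRAMES, clip_duration_s=2.0)
print(frames, enough, suggested_fps)
```

If raising the sampling rate is not practical, lengthening the clips has the same effect on the frame count in this estimate.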