nemo_curator.models.cosmos_embed1
Module Contents
Classes
Data
COSMOS_EMBED1_MODEL_REVISION_INFO
API
Bases: ModelInterface
Cosmos-Embed1 embedding model.
Get the model ID names.
Download the processor config for the Cosmos-Embed1 model on the node.
Download the weights for the Cosmos-Embed1 model on the node.
Encode video frames for the model.
Parameters:
The input video frames.
Returns: torch.Tensor
The encoded video frames.
Evaluate the model by scoring a video embedding against a set of text embeddings.
Parameters:
The video embedding.
The text embeddings.
Returns: tuple[list[float], list[int]]
The predicted probabilities and indices.
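The return shape above (a list of probabilities paired with a list of indices) is typical of similarity ranking. The following is a minimal sketch of that idea, not the library's implementation: it assumes the score is a softmax over cosine similarities, uses plain Python lists instead of `torch.Tensor`, and the helper name `rank_texts` and the `temperature` parameter are hypothetical.

```python
import math


def rank_texts(video_embedding, text_embeddings, temperature=1.0):
    """Hypothetical helper: rank texts against one video embedding.

    Assumption: scores are a softmax over cosine similarities. The real
    method may scale or truncate results differently.
    """

    def normalize(vec):
        # Scale a vector to unit length so the dot product is a cosine.
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    video = normalize(video_embedding)
    sims = [
        sum(a * b for a, b in zip(video, normalize(text))) / temperature
        for text in text_embeddings
    ]

    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Sort indices by probability, best match first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [probs[i] for i in order], order


probs, indices = rank_texts([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]])
```

Here `indices` lists the positions of the text embeddings from best to worst match, and `probs` holds the matching probabilities in the same order.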
Formulate input frames for the model.
Parameters:
List of video frames.
Returns: npt.NDArray[np.float32] | None
The formulated input frames.
Get the target number of frames for the model.
Returns: int
The target number of frames.
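The two entries above suggest that a clip is reduced to a fixed number of frames before encoding, and that frame formulation can yield `None`. As a hedged illustration only, the sketch below picks evenly spaced frame indices and returns `None` when the clip is too short. Both the even spacing and the too-short condition are assumptions, and `sample_frame_indices` is a hypothetical name that is not part of this module.

```python
def sample_frame_indices(num_frames, target_num_frames):
    """Hypothetical helper: choose evenly spaced frame indices.

    Assumption: a clip with fewer frames than the target cannot be used,
    so None is returned, mirroring the optional return type above.
    """
    if target_num_frames <= 0 or num_frames < target_num_frames:
        return None
    if target_num_frames == 1:
        return [0]

    # Spread the picks across the whole clip, first and last frame included.
    step = (num_frames - 1) / (target_num_frames - 1)
    return [round(i * step) for i in range(target_num_frames)]


indices = sample_frame_indices(15, 8)    # every second frame of a 15-frame clip
too_short = sample_frame_indices(4, 8)   # None: not enough frames
```

The selected frames would then be stacked and converted to a `float32` array before being passed to the encoder.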
Get the text embedding for the given text.
Parameters:
The input text.
Returns: torch.Tensor
The text embedding.
Set up the Cosmos-Embed1 model.
This method initializes the model and its configuration for processing video and text data.