# nemo_curator.models.transnetv2

Model for fast shot transition detection.

Reference: Souček, Tomáš and Lokoč, Jakub. "TransNet V2: An effective deep network architecture for fast shot transition detection." arXiv preprint arXiv:2008.04838, 2020.

## Module Contents

### Classes

| Name                                                                       | Description                                                                      |
| -------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| [`ColorHistograms`](#nemo_curator-models-transnetv2-ColorHistograms)       | Model for computing and comparing color histograms across video frames.          |
| [`Conv3DConfigurable`](#nemo_curator-models-transnetv2-Conv3DConfigurable) | Configurable 3D convolution layer with support for separable convolutions.       |
| [`DilatedDCNNV2`](#nemo_curator-models-transnetv2-DilatedDCNNV2)           | Dilated dense convolutional model with multiple dilation rates.                  |
| [`FrameSimilarity`](#nemo_curator-models-transnetv2-FrameSimilarity)       | Model for computing frame similarity features in video sequences.                |
| [`StackedDDCNNV2`](#nemo_curator-models-transnetv2-StackedDDCNNV2)         | Stacked dilated dense convolutional neural network for video feature extraction. |
| [`TransNetV2`](#nemo_curator-models-transnetv2-TransNetV2)                 | Interface for TransNetV2 shot transition detection model.                        |
| [`_TransNetV2`](#nemo_curator-models-transnetv2-_TransNetV2)               | Internal module implementing the TransNetV2 network.                             |

### Data

[`_TRANSNETV2_MODEL_ID`](#nemo_curator-models-transnetv2-_TRANSNETV2_MODEL_ID)

[`_TRANSNETV2_MODEL_REVISION`](#nemo_curator-models-transnetv2-_TRANSNETV2_MODEL_REVISION)

[`_TRANSNETV2_MODEL_WEIGHTS`](#nemo_curator-models-transnetv2-_TRANSNETV2_MODEL_WEIGHTS)

### API

```python
class nemo_curator.models.transnetv2.ColorHistograms(
    lookup_window: int = 101,
    output_dim: int | None = None
)
```

**Bases:** `Module`

Model for computing and comparing color histograms across video frames.

```python
nemo_curator.models.transnetv2.ColorHistograms.compute_color_histograms(
    frames: torch.Tensor
) -> torch.Tensor
```

staticmethod

Compute color histograms for video frames.

**Parameters:**

`frames`: Input tensor of video frames.

**Returns:** `torch.Tensor`

Color histogram tensor.
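
The sketch below illustrates one way per-frame color histograms can be computed in PyTorch. It is not the library's implementation: the function name `color_histograms_sketch`, the `[batch, frames, height, width, 3]` input layout, the 512-bin quantization (3 bits per channel), and the L2 normalization are all assumptions chosen for illustration.

```python
import torch


def color_histograms_sketch(frames: torch.Tensor) -> torch.Tensor:
    """Return L2-normalized 512-bin color histograms, one per frame.

    Assumes `frames` holds 8-bit RGB values in a tensor of shape
    [batch, frames, height, width, 3] (hypothetical layout).
    """
    frames = frames.to(torch.int64)
    batch, num_frames, height, width, channels = frames.shape
    if channels != 3:
        raise ValueError("expected RGB frames with 3 channels in the last dimension")

    # Keep the top 3 bits of each channel (8 levels per channel), then pack
    # the three values into a single bin index in the range [0, 511].
    flat = frames.reshape(batch * num_frames, height * width, 3)
    red, green, blue = flat[..., 0] >> 5, flat[..., 1] >> 5, flat[..., 2] >> 5
    bins = (red << 6) + (green << 3) + blue

    # Count how many pixels of each frame fall into each bin.
    hist = torch.zeros(batch * num_frames, 512, dtype=torch.float32, device=frames.device)
    hist.scatter_add_(1, bins, torch.ones_like(bins, dtype=torch.float32))

    # L2-normalize so that histograms can be compared with a dot product.
    hist = torch.nn.functional.normalize(hist, p=2, dim=1)
    return hist.reshape(batch, num_frames, 512)
```

With normalized histograms, the dot product of two frames' histograms gives a similarity score between 0 and 1, which is the kind of comparison this class is described as performing.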

```python
nemo_curator.models.transnetv2.ColorHistograms.forward(
    inputs: torch.Tensor
) -> torch.Tensor
```

Process input frames through the model.

**Parameters:**

`inputs`: Input tensor of video frames.

**Returns:** `torch.Tensor`

Color histogram similarity features.

```python
class nemo_curator.models.transnetv2.Conv3DConfigurable(
    in_filters: int,
    filters: int,
    dilation_rate: int,
    separable: bool = True,
    use_bias: bool = True
)
```

**Bases:** `Module`

Configurable 3D convolution layer with support for separable convolutions.

```python
nemo_curator.models.transnetv2.Conv3DConfigurable.forward(
    inputs: torch.Tensor
) -> torch.Tensor
```

Process input through the 3D convolutional layers.

**Parameters:**

`inputs`: Input tensor.

**Returns:** `torch.Tensor`

Processed tensor.
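
To make the idea of a separable 3D convolution concrete, the following sketch factors a 3x3x3 convolution into a spatial 1x3x3 convolution followed by a temporal 3x1x1 convolution that carries the dilation. This is an illustration only: the class name `SeparableConv3DSketch`, the `[batch, channels, frames, height, width]` layout, and the `2 * filters` intermediate width are assumptions, not details confirmed by this page.

```python
import torch
from torch import nn


class SeparableConv3DSketch(nn.Module):
    """Spatial 1x3x3 convolution followed by a dilated temporal 3x1x1 convolution."""

    def __init__(self, in_filters: int, filters: int, dilation_rate: int, use_bias: bool = True) -> None:
        super().__init__()
        # Spatial step: mixes pixels within each frame, never across frames.
        # The 2 * filters intermediate width is an assumption.
        self.spatial = nn.Conv3d(
            in_filters, 2 * filters, kernel_size=(1, 3, 3), padding=(0, 1, 1), bias=False
        )
        # Temporal step: mixes the same pixel location across frames.
        # Padding equal to the dilation rate keeps the number of frames unchanged.
        self.temporal = nn.Conv3d(
            2 * filters,
            filters,
            kernel_size=(3, 1, 1),
            dilation=(dilation_rate, 1, 1),
            padding=(dilation_rate, 0, 0),
            bias=use_bias,
        )

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # inputs: [batch, channels, frames, height, width]
        return self.temporal(self.spatial(inputs))
```

Because both steps pad to preserve size, the output keeps the input's frame count and spatial resolution; only the channel dimension changes to `filters`.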

```python
class nemo_curator.models.transnetv2.DilatedDCNNV2(
    in_filters: int,
    filters: int,
    batch_norm: bool = True,
    activation: collections.abc.Callable[[torch.Tensor], torch.Tensor] | None = None
)
```

**Bases:** `Module`

Dilated dense convolutional model with multiple dilation rates.

```python
nemo_curator.models.transnetv2.DilatedDCNNV2.forward(
    inputs: torch.Tensor
) -> torch.Tensor
```

Process input through the dilated dense convolutional network.

**Parameters:**

`inputs`: Input tensor.

**Returns:** `torch.Tensor`

Processed tensor.
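
As a rough illustration of "multiple dilation rates", the sketch below runs several 3D convolutions in parallel, each with a different temporal dilation, and concatenates their outputs along the channel axis. The class name `DilatedBranchesSketch`, the rates `(1, 2, 4, 8)`, and the plain (non-separable) convolutions are assumptions for the example; batch normalization and the activation are omitted.

```python
import torch
from torch import nn


class DilatedBranchesSketch(nn.Module):
    """Parallel temporally dilated 3D convolutions, concatenated on the channel axis."""

    def __init__(
        self,
        in_filters: int,
        filters: int,
        dilation_rates: tuple[int, ...] = (1, 2, 4, 8),  # assumed rates
    ) -> None:
        super().__init__()
        # One branch per dilation rate. Temporal padding equals the rate so
        # every branch returns the same number of frames as the input.
        self.branches = nn.ModuleList(
            [
                nn.Conv3d(
                    in_filters,
                    filters,
                    kernel_size=3,
                    dilation=(rate, 1, 1),
                    padding=(rate, 1, 1),
                )
                for rate in dilation_rates
            ]
        )

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        # inputs: [batch, channels, frames, height, width]
        outputs = [branch(inputs) for branch in self.branches]
        return torch.cat(outputs, dim=1)
```

Each branch sees a different temporal span, so the concatenated output has `len(dilation_rates) * filters` channels covering short and long frame ranges at once.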

```python
class nemo_curator.models.transnetv2.FrameSimilarity(
    in_filters: int,
    similarity_dim: int = 128,
    lookup_window: int = 101,
    output_dim: int = 128,
    use_bias: bool = False
)
```

**Bases:** `Module`

Model for computing frame similarity features in video sequences.

```python
nemo_curator.models.transnetv2.FrameSimilarity.forward(
    inputs: torch.Tensor
) -> torch.Tensor
```

Process input frames through the model.

**Parameters:**

`inputs`: Input tensor of video frames.

**Returns:** `torch.Tensor`

Frame similarity features.
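
The role of `lookup_window` can be sketched as follows: for each frame, compute the cosine similarity to the frames within a fixed window centered on it, padding with zeros at the sequence boundaries. This is a minimal sketch under stated assumptions, not the class's actual code; the function name `local_similarities` and the `[batch, frames, dim]` feature layout are hypothetical.

```python
import torch
import torch.nn.functional as F


def local_similarities(features: torch.Tensor, lookup_window: int = 101) -> torch.Tensor:
    """Cosine similarity of each frame to its neighbors inside `lookup_window`.

    features: [batch, frames, dim] per-frame feature vectors (assumed layout).
    returns:  [batch, frames, lookup_window]; the center column is the frame itself.
    """
    if lookup_window % 2 != 1:
        raise ValueError("lookup_window must be odd so the window has a center")

    batch, num_frames, _ = features.shape
    unit = F.normalize(features, p=2, dim=2)

    # All-pairs cosine similarity between frames of the same video: [batch, frames, frames].
    sims = torch.bmm(unit, unit.transpose(1, 2))

    # Zero-pad the last axis so windows near the start and end stay in bounds.
    half = (lookup_window - 1) // 2
    padded = F.pad(sims, (half, half))

    # For frame t, padded columns t .. t + lookup_window - 1 correspond to
    # original frames t - half .. t + half.
    offsets = torch.arange(lookup_window, device=features.device)
    frame_idx = torch.arange(num_frames, device=features.device)
    cols = (frame_idx[:, None] + offsets[None, :]).unsqueeze(0).expand(batch, -1, -1)
    return torch.gather(padded, 2, cols)
```

A sharp drop in similarity between a frame and its neighbors on one side of the window is the kind of signal that suggests a shot boundary.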

```python
class nemo_curator.models.transnetv2.StackedDDCNNV2(
    in_filters: int,
    n_blocks: int,
    filters: int,
    shortcut: bool = True,
    pool_type: str = 'avg',
    stochastic_depth_drop_prob: float = 0.0
)
```

**Bases:** `Module`

Stacked dilated dense convolutional neural network for video feature extraction.

```python
nemo_curator.models.transnetv2.StackedDDCNNV2.forward(
    inputs: torch.Tensor
) -> torch.Tensor
```

Process input through the stacked dilated dense convolutional network.

**Parameters:**

`inputs`: Input tensor.

**Returns:** `torch.Tensor`

Processed tensor.

```python
class nemo_curator.models.transnetv2.TransNetV2(
    model_dir: str | None = None
)
```

**Bases:** [ModelInterface](/nemo-curator/nemo_curator/models/base#nemo_curator-models-base-ModelInterface)

Interface for TransNetV2 shot transition detection model.

The interface also exposes the model ID names.

```python
nemo_curator.models.transnetv2.TransNetV2.__call__(
    inputs: torch.Tensor
) -> torch.Tensor
```

TransNetV2 model call.

**Parameters:**

`inputs`: Tensor of shape `[batch, frames, height, width, RGB]`.

**Returns:** `torch.Tensor`

Tensor of shape `[batch, frames, 1]` containing, for each frame, the probability that it is a shot transition.
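
Per-frame probabilities usually need a post-processing step before they are useful as shot boundaries. The sketch below shows one simple option: threshold the probabilities and report each maximal run of non-transition frames as a shot. The helper name `transitions_to_shots` and the default threshold of 0.5 are assumptions for illustration and are not part of this module.

```python
import torch


def transitions_to_shots(probs: torch.Tensor, threshold: float = 0.5) -> list[tuple[int, int]]:
    """Convert per-frame transition probabilities into inclusive (start, end) shot ranges.

    probs: 1-D tensor of length `frames` for a single video, for example
    `output[0, :, 0]` taken from a `[batch, frames, 1]` prediction tensor.
    """
    is_transition = (probs > threshold).tolist()

    shots: list[tuple[int, int]] = []
    start: int | None = None
    for index, flagged in enumerate(is_transition):
        if not flagged and start is None:
            # First non-transition frame after a boundary opens a new shot.
            start = index
        elif flagged and start is not None:
            # A transition frame closes the shot at the previous frame.
            shots.append((start, index - 1))
            start = None

    # Close the final shot if the video does not end on a transition.
    if start is not None:
        shots.append((start, len(is_transition) - 1))
    return shots
```

Consecutive transition frames (as in a gradual fade) are treated as a single boundary and belong to no shot; a different convention, or a different threshold, may suit other pipelines better.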

```python
nemo_curator.models.transnetv2.TransNetV2.download_weights_on_node(
    model_dir: str
) -> None
```

classmethod

Download TransNetV2 weights on the node.

**Parameters:**

`model_dir`: Directory to save the model weights. If None, uses `self.model_dir`.

```python
nemo_curator.models.transnetv2.TransNetV2.setup() -> None
```

Set up the TransNetV2 model interface.

```python
class nemo_curator.models.transnetv2._TransNetV2(
    rf: int = 16,
    rl: int = 3,
    rs: int = 2,
    rd: int = 1024,
    use_many_hot_targets: bool = True,
    use_frame_similarity: bool = True,
    use_color_histograms: bool = True,
    use_mean_pooling: bool = False,
    dropout_rate: float = 0.5
)
```

**Bases:** `Module`

```python
nemo_curator.models.transnetv2._TransNetV2.forward(
    inputs: torch.Tensor
) -> torch.Tensor
```

Process input through the TransNetV2 model.

**Parameters:**

`inputs`: Input tensor of video frames.

**Returns:** `torch.Tensor`

Model predictions for shot transitions.

```python
nemo_curator.models.transnetv2._TRANSNETV2_MODEL_ID: Final = 'Sn4kehead/TransNetV2'
```

```python
nemo_curator.models.transnetv2._TRANSNETV2_MODEL_REVISION: Final = 'db6ceab'
```

```python
nemo_curator.models.transnetv2._TRANSNETV2_MODEL_WEIGHTS: Final = 'transnetv2-pytorch-weights.pth'
```