nat.llm.huggingface_llm#
HuggingFace Transformers LLM Provider - Local in-process model execution.
Attributes#
Classes#
| ModelCache | Singleton cache for loaded HuggingFace models. |
| HuggingFaceConfig | Configuration for HuggingFace LLM - loads model directly for local execution. |
Functions#
| get_cached_model | Return cached model data (model, tokenizer, torch) or None if not loaded. |
| _cleanup_model | Clean up a loaded model and free GPU memory. |
| huggingface_provider | HuggingFace model provider - loads models locally for in-process execution. |
Module Contents#
- logger#
- class ModelCache#
Singleton cache for loaded HuggingFace models.
Models remain cached for the provider’s lifetime (not per query) to enable fast reuse:
  - During nat serve: cached while the server runs, cleaned up on shutdown.
  - During nat red-team: cached across all evaluation queries, cleaned up when complete.
  - During nat run: cached for a single workflow execution, cleaned up when done.
- _instance: ModelCache | None = None#
- _cache: dict[str, ModelCacheEntry]#
- get(model_name: str) ModelCacheEntry | None#
Return cached model data or None if not loaded.
- set(model_name: str, data: ModelCacheEntry) None#
Cache model data.
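To make the singleton behaviour concrete, here is a minimal sketch of a cache with the same `get`/`set` surface. It is not the NAT implementation: the `ModelCacheEntry` fields are an assumption based on the "(model, tokenizer, torch)" wording on this page, and the use of `__new__` to enforce a single instance is one common way to build a singleton, not necessarily the one used here.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ModelCacheEntry:
    """Hypothetical entry shape; the real fields are not documented on this page."""
    model: Any
    tokenizer: Any
    torch: Any


class ModelCache:
    """Illustrative singleton: every ModelCache() call returns the same object."""

    _instance: Optional["ModelCache"] = None
    _cache: dict[str, ModelCacheEntry]

    def __new__(cls) -> "ModelCache":
        # Create the shared instance (and its dict) only on first use.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._cache = {}
        return cls._instance

    def get(self, model_name: str) -> Optional[ModelCacheEntry]:
        """Return cached model data or None if not loaded."""
        return self._cache.get(model_name)

    def set(self, model_name: str, data: ModelCacheEntry) -> None:
        """Cache model data under its model name."""
        self._cache[model_name] = data
```

Because both handles refer to one object, an entry stored through one `ModelCache()` call is visible through any other, which is what lets repeated queries reuse an already loaded model.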
- class HuggingFaceConfig(/, **data: Any)#
Bases: nat.data_models.llm.LLMBaseConfig
Configuration for HuggingFace LLM - loads model directly for local execution.
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- get_cached_model(model_name: str) ModelCacheEntry | None#
Return cached model data (model, tokenizer, torch) or None if not loaded.
- async _cleanup_model(model_name: str) None#
Clean up a loaded model and free GPU memory.
- Args:
model_name: Name of the model to clean up.
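The page does not show how the cleanup is carried out, so the following is only a sketch of one common pattern for releasing a locally loaded model: drop the cached references, run the garbage collector, then ask PyTorch to release cached CUDA memory if a GPU is present. The helper name `release_model` and the plain `dict` cache are hypothetical, and the sketch is synchronous whereas `_cleanup_model` is declared `async`.

```python
import gc
from typing import Any


def release_model(cache: dict[str, Any], model_name: str) -> bool:
    """Hypothetical helper: remove a cached model and reclaim its memory.

    Returns True if an entry was removed, False if nothing was cached.
    """
    entry = cache.pop(model_name, None)
    if entry is None:
        return False

    # Drop the local reference so the model objects become collectable.
    del entry
    gc.collect()

    # torch is optional in this sketch; skip the GPU step if it is missing.
    try:
        import torch
    except ImportError:
        return True

    if torch.cuda.is_available():
        # Hand cached, unused GPU blocks back to the driver.
        torch.cuda.empty_cache()
    return True
```

Note that `torch.cuda.empty_cache()` only frees memory that is no longer referenced, so the references must be dropped first for the GPU step to have any effect.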
- async huggingface_provider(config: HuggingFaceConfig, builder: nat.builder.builder.Builder)#
HuggingFace model provider - loads models locally for in-process execution.
- Args:
config: Configuration for the HuggingFace model.
builder: The NAT builder instance.
- Yields:
LLMProviderInfo: Provider information for the loaded model.