> For clean Markdown of any page, append .md to the page URL.
> For a complete documentation index, see https://docs.nvidia.com/nemo/curator/llms.txt.
> For full documentation content, see https://docs.nvidia.com/nemo/curator/llms-full.txt.

# nemo_curator.core.serve.server

## Module Contents

### Classes

| Name                                                                 | Description                                             |
| -------------------------------------------------------------------- | ------------------------------------------------------- |
| [`InferenceServer`](#nemo_curator-core-serve-server-InferenceServer) | Serve one or more models behind a typed backend config. |

### Functions

| Name                                                                                       | Description                                                              |
| ------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------ |
| [`is_inference_server_active`](#nemo_curator-core-serve-server-is_inference_server_active) | Check whether any inference server is currently running in this process. |

### Data

[`_active_servers`](#nemo_curator-core-serve-server-_active_servers)

### API

<Anchor id="nemo_curator-core-serve-server-InferenceServer">
  <CodeBlock links={{"nemo_curator.core.serve.base.BaseModelConfig":"/nemo-curator/nemo_curator/core/serve/base#nemo_curator-core-serve-base-BaseModelConfig","nemo_curator.core.serve.base.BaseServerConfig":"/nemo-curator/nemo_curator/core/serve/base#nemo_curator-core-serve-base-BaseServerConfig"}} showLineNumbers={false} wordWrap={true}>
    ```python
    class nemo_curator.core.serve.server.InferenceServer(
        models: list[nemo_curator.core.serve.base.BaseModelConfig],
        backend: nemo_curator.core.serve.base.BaseServerConfig = RayServeServerConfig(),
        name: str = 'default',
        port: int = DEFAULT_SERVE_PORT,
        health_check_timeout_s: int = DEFAULT_SERVE_HEALTH_TIMEOUT_S,
        verbose: bool = False
    )
    ```
  </CodeBlock>
</Anchor>

<Indent>
  <Badge>
    Dataclass
  </Badge>

  Serve one or more models behind a typed backend config.

  <ParamField path="_backend_impl" type="InferenceBackend | None = field(init=False, default=None, repr=False)" />

  <ParamField path="_host" type="str = field(init=False, default='localhost', repr=False)" />

  <ParamField path="_started" type="bool = field(init=False, default=False, repr=False)" />

  <ParamField path="backend" type="BaseServerConfig = field(default_factory=RayServeServerConfig)" />

  <ParamField path="endpoint" type="str">
    OpenAI-compatible base URL for the served models.
  </ParamField>

  <ParamField path="health_check_timeout_s" type="int = DEFAULT_SERVE_HEALTH_TIMEOUT_S" />

  <ParamField path="models" type="list[BaseModelConfig]" />

  <ParamField path="name" type="str = 'default'" />

  <ParamField path="port" type="int = DEFAULT_SERVE_PORT" />

  <ParamField path="verbose" type="bool = False" />

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-__enter__">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer.__enter__()
      ```
    </CodeBlock>
  </Anchor>

  <Indent />

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-__exit__">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer.__exit__(
          exc = ()
      )
      ```
    </CodeBlock>
  </Anchor>

  <Indent />
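  The dunder pair above indicates that `InferenceServer` is usable as a context manager, presumably calling `start()` on entry and `stop()` on exit. A minimal sketch of that pattern, using a toy stand-in class rather than the real implementation:

  ```python
  class ToyServer:
      """Hypothetical stand-in illustrating the start/stop context-manager pattern."""

      def __init__(self) -> None:
          self.started = False
          self.calls: list[str] = []

      def start(self) -> None:
          self.calls.append("start")
          self.started = True

      def stop(self) -> None:
          self.calls.append("stop")
          self.started = False

      def __enter__(self) -> "ToyServer":
          self.start()
          return self

      def __exit__(self, *exc) -> bool:
          self.stop()
          return False  # do not suppress exceptions raised in the body

  with ToyServer() as server:
      assert server.started
  # after the block, stop() has run, even if the body had raised
  ```

  Assuming `__exit__` delegates to `stop()`, using the real class the same way (`with InferenceServer(models=[...]) as server: ...`) would guarantee the backend is shut down even when the body raises.
  
  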

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-__post_init__">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer.__post_init__() -> None
      ```
    </CodeBlock>
  </Anchor>

  <Indent />

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-_create_backend">
    <CodeBlock links={{"nemo_curator.core.serve.base.InferenceBackend":"/nemo-curator/nemo_curator/core/serve/base#nemo_curator-core-serve-base-InferenceBackend"}} showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer._create_backend() -> nemo_curator.core.serve.base.InferenceBackend
      ```
    </CodeBlock>
  </Anchor>

  <Indent />

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-_validate_model_configs">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer._validate_model_configs() -> None
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Check that the backend accepts every model and that all models share a single concrete config type.
  </Indent>
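  The "one concrete type" half of that check can be sketched as follows; `VllmConfig` and `TrtConfig` are hypothetical stand-ins for concrete `BaseModelConfig` subclasses, and `validate_shared_type` is an illustrative name, not the actual method:

  ```python
  from dataclasses import dataclass


  @dataclass
  class VllmConfig:  # hypothetical concrete model config type
      model: str


  @dataclass
  class TrtConfig:  # hypothetical second config type
      model: str


  def validate_shared_type(models) -> None:
      """Sketch: reject empty lists and lists mixing concrete config classes."""
      if not models:
          raise ValueError("at least one model config is required")
      first = type(models[0])
      for m in models:
          if type(m) is not first:
              raise TypeError(
                  f"mixed model config types: {first.__name__} and {type(m).__name__}"
              )


  validate_shared_type([VllmConfig("a"), VllmConfig("b")])  # ok: one concrete type
  ```

  Comparing `type(m)` rather than using `isinstance` is what makes the check *concrete*: a subclass instance mixed in with base-class instances would still be rejected.
  
  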

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-_wait_for_healthy">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer._wait_for_healthy() -> None
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Poll `/v1/models` until all expected models appear in the response.
  </Indent>
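  A hedged sketch of this polling loop, written against a generic OpenAI-compatible `/v1/models` endpoint. The actual retry cadence and error handling in `_wait_for_healthy` may differ; `wait_for_models` and its injectable `fetch` parameter are illustrative names, not part of the API:

  ```python
  import json
  import time
  import urllib.request


  def wait_for_models(base_url, expected, timeout_s=120.0, poll_s=2.0, fetch=None):
      """Sketch: poll /v1/models until every expected model id is listed,
      or raise TimeoutError. `fetch` is injectable for testing; by default
      it performs a real HTTP GET against the server."""

      def _http_fetch():
          with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
              return json.load(resp)

      fetch = fetch or _http_fetch
      deadline = time.monotonic() + timeout_s
      served: set = set()
      while time.monotonic() < deadline:
          try:
              # OpenAI-style response: {"data": [{"id": "..."}, ...]}
              served = {m["id"] for m in fetch().get("data", [])}
          except OSError:  # server not accepting connections yet
              served = set()
          if set(expected) <= served:
              return
          time.sleep(poll_s)
      missing = sorted(set(expected) - served)
      raise TimeoutError(f"models not healthy after {timeout_s}s: {missing}")
  ```

  Treating connection errors as "not ready yet" rather than failing fast is the key design choice: during startup the port is often not bound at all, which surfaces as `OSError` long before the model list is populated.
  
  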

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-start">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer.start() -> None
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Deploy all models and wait for them to become healthy.
  </Indent>

  <Anchor id="nemo_curator-core-serve-server-InferenceServer-stop">
    <CodeBlock showLineNumbers={false} wordWrap={true}>
      ```python
      nemo_curator.core.serve.server.InferenceServer.stop() -> None
      ```
    </CodeBlock>
  </Anchor>

  <Indent>
    Shut down the active inference backend and release resources.
  </Indent>
</Indent>

<Anchor id="nemo_curator-core-serve-server-is_inference_server_active">
  <CodeBlock showLineNumbers={false} wordWrap={true}>
    ```python
    nemo_curator.core.serve.server.is_inference_server_active() -> bool
    ```
  </CodeBlock>
</Anchor>

<Indent>
  Check whether any inference server is currently running in this process.
</Indent>

<Anchor id="nemo_curator-core-serve-server-_active_servers">
  <CodeBlock showLineNumbers={false} wordWrap={true}>
    ```python
    nemo_curator.core.serve.server._active_servers: set[str] = set()
    ```
  </CodeBlock>
</Anchor>
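  A sketch of how a module-level `_active_servers` set plausibly backs `is_inference_server_active`: servers would add their name on start and discard it on stop, so the check reduces to "is the set non-empty". The registration points shown in the comments are assumptions, not confirmed behavior:

  ```python
  # Process-local registry of running server names, mirroring the documented
  # module attribute `_active_servers: set[str] = set()`.
  _active_servers: set = set()


  def is_inference_server_active() -> bool:
      """Return True if any inference server is currently running in this process."""
      return bool(_active_servers)


  _active_servers.add("default")      # what start() would presumably do
  assert is_inference_server_active()
  _active_servers.discard("default")  # what stop() would presumably do
  assert not is_inference_server_active()
  ```

  Keying the set by server `name` would also explain why `InferenceServer` carries a `name: str = 'default'` field: it gives each server a stable identity to register and deregister.
  
  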