nemo_eval.adapters.server#

Main server holding the entry point to all the interceptors.

The architecture is as follows:

     ┌───────────────────────┐
     │                       │
     │ NVIDIA Eval Factory   │
     │                       │
     └───▲──────┬────────────┘
         │      │
 returns │      │
         │      │ calls
         │      │
         │      │
     ┌───┼──────┼──────────────────────────────────────────────────┐
     │   │      ▼                                                  │
     │ AdapterServer (@ localhost:<free port>)                     │
     │                                                             │
     │   ▲      │       chain of RequestInterceptors:              │
     │   │      │       flask.Request                              │
     │   │      │       is passed on the way up                    │
     │   │      │                                                  │   ┌──────────────────────┐
     │   │ ┌────▼───────────────────────────────────────────────┐  │   │                      │
     │   │ │intcptr_1─────►intcptr_2───►...───►intcptr_N────────┼──┼───►                      │
     │   │ │                     │                              │  │   │                      │
     │   │ └─────────────────────┼──────────────────────────────┘  │   │                      │
     │   │                       │(e.g. for caching interceptors,  │   │  upstream endpoint   │
     │   │                       │ this "shortcut" will happen)    │   │   with actual model  │
     │   │                       │                                 │   │                      │
     │   │                       └─────────────┐                   │   │                      │
     │   │                                     │                   │   │                      │
     │ ┌─┼─────────────────────────────────────▼────┐              │   │                      │
     │ │intcptr'_M◄──intcptr'_2◄──...◄───intcptr'_1 ◄──────────────┼───┤                      │
     │ └────────────────────────────────────────────┘              │   └──────────────────────┘
     │                                                             │
     │              Chain of ResponseInterceptors:                 │
     │              requests.Response is passed on the way down    │
     │                                                             │
     │                                                             │
     └─────────────────────────────────────────────────────────────┘

In other words, interceptors are independent pieces of logic that should be relatively easy to add separately.
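The flow in the diagram can be sketched in a few lines. This is a minimal illustration only, not the module's implementation: `SketchRequest`, `SketchResponse`, `handle`, and the example interceptors are hypothetical stand-ins for `flask.Request`, `requests.Response`, and the real interceptor classes. It assumes that a request interceptor may return a response to take the caching "shortcut", and that such a response still passes through the response chain, as the diagram suggests.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class SketchRequest:
    """Hypothetical stand-in for flask.Request."""
    prompt: str


@dataclass
class SketchResponse:
    """Hypothetical stand-in for requests.Response."""
    text: str


# Assumption: a request interceptor returns either a (possibly modified)
# request, or a response when it wants to skip the upstream call.
RequestInterceptor = Callable[[SketchRequest], Union[SketchRequest, SketchResponse]]
ResponseInterceptor = Callable[[SketchResponse], SketchResponse]


def handle(request, request_interceptors, response_interceptors, call_upstream):
    """Run the request chain, call upstream unless short-circuited, then run the response chain."""
    current = request
    for intercept in request_interceptors:
        current = intercept(current)
        if isinstance(current, SketchResponse):
            break  # the "shortcut": the remaining request interceptors are skipped
    if isinstance(current, SketchResponse):
        response = current
    else:
        response = call_upstream(current)
    for intercept in response_interceptors:
        response = intercept(response)
    return response


# Example interceptors (all hypothetical).
cache = {"hello": "cached hello"}
upstream_calls = []


def cache_lookup(request):
    # Return a response directly on a cache hit, otherwise pass the request on.
    if request.prompt in cache:
        return SketchResponse(text=cache[request.prompt])
    return request


def add_system_prompt(request):
    return SketchRequest(prompt=f"[system] {request.prompt}")


def tag_response(response):
    return SketchResponse(text=response.text + " [logged]")


def fake_upstream(request):
    # Records what reached the "model" so the shortcut is observable.
    upstream_calls.append(request.prompt)
    return SketchResponse(text=f"echo: {request.prompt}")


hit = handle(SketchRequest("hello"), [cache_lookup, add_system_prompt], [tag_response], fake_upstream)
miss = handle(SketchRequest("bye"), [cache_lookup, add_system_prompt], [tag_response], fake_upstream)
```

In this sketch the cached request never reaches `fake_upstream`, while the uncached one passes through both chains.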

Module Contents#

Classes#

_AdapterServer

Main server which serves on a local port and holds a chain of interceptors

Functions#

create_server_process

Create and start a server process, returning the process and the config.

API#

nemo_eval.adapters.server.create_server_process(
adapter_config: nemo_eval.api.AdapterConfig,
) → Tuple[multiprocessing.Process, nemo_eval.api.AdapterConfig][source]#

Create and start a server process, returning the process and the config.

This ensures that the factory function does not need any complex serialization for multiprocessing.

Parameters:
  • api_url – The API URL the adapter will call

  • adapter_config – Configuration for the adapter server

Returns:

Tuple of (process, adapter_config) where process is the running server process, and adapter_config is the configuration with the port filled in.
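The pattern described here (pick a port, start a process from a plain config object, return both) might look roughly like the following. This is a sketch under stated assumptions rather than the actual implementation: `SketchConfig`, its `local_port` field, `find_free_port`, `_serve`, and `start_sketch_server` are all hypothetical names, and the real `AdapterConfig` fields may differ. Using a module-level target function and a simple dataclass keeps everything passed to `multiprocessing.Process` trivially picklable.

```python
import dataclasses
import multiprocessing
import socket
from typing import Optional, Tuple


@dataclasses.dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for nemo_eval.api.AdapterConfig."""
    api_url: str
    local_port: Optional[int] = None  # assumed field name, filled in before start


def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("localhost", 0))
        return sock.getsockname()[1]


def _serve(config: SketchConfig) -> None:
    """Module-level process target; a real server would block here and serve requests."""
    # Placeholder body: the sketch does not start an actual HTTP server.
    return None


def start_sketch_server(config: SketchConfig) -> Tuple[multiprocessing.Process, SketchConfig]:
    """Fill in the port, start the child process, and return both."""
    port = config.local_port or find_free_port()
    filled = dataclasses.replace(config, local_port=port)
    process = multiprocessing.Process(target=_serve, args=(filled,), daemon=True)
    process.start()
    return process, filled
```

A caller would keep the returned process handle so it can terminate the server once the evaluation finishes, and read the chosen port from the returned config.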

class nemo_eval.adapters.server._AdapterServer(adapter_config: nemo_eval.api.AdapterConfig)[source]#

Main server which serves on a local port and holds a chain of interceptors

Initialization

Initializes the app, creates the server, and adds the interceptors.

Parameters:

adapter_config – should be obtained from the evaluation script, see api.py

adapter_host: str#

None

adapter_port: int#

None

request_interceptors: list[nemo_eval.adapters.interceptors.RequestInterceptor]#

None

response_interceptors: list[nemo_eval.adapters.interceptors.ResponseInterceptor]#

None

app: flask.Flask#

None

api_url: str#

None

_build_interceptor_chains(
use_reasoning: bool,
end_reasoning_token: str,
custom_system_prompt: Optional[str],
max_logged_requests: Optional[int],
max_logged_responses: Optional[int],
)[source]#

run() → None[source]#

Start the Flask server.

_EXCLUDED_HEADERS#

['content-encoding', 'content-length', 'transfer-encoding', 'connection']

classmethod _process_response_headers(
response: requests.Response,
) → list[tuple[str, str]][source]#

Process response headers, removing excluded ones.
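As a rough illustration of this filtering step, the sketch below drops the excluded headers from a plain mapping. It is not the method's actual body: `process_response_headers` here is a hypothetical free function that takes a mapping instead of a `requests.Response`, and the case-insensitive comparison is an assumption.

```python
from typing import Mapping

# Same values as _EXCLUDED_HEADERS above.
EXCLUDED_HEADERS = ["content-encoding", "content-length", "transfer-encoding", "connection"]


def process_response_headers(headers: Mapping[str, str]) -> list[tuple[str, str]]:
    """Return (name, value) pairs, skipping headers listed in EXCLUDED_HEADERS."""
    return [
        (name, value)
        for name, value in headers.items()
        # Assumption: names are compared case-insensitively.
        if name.lower() not in EXCLUDED_HEADERS
    ]


upstream_headers = {
    "Content-Type": "application/json",
    "Content-Length": "123",
    "Transfer-Encoding": "chunked",
    "X-Request-Id": "abc",
}
filtered = process_response_headers(upstream_headers)
```

Headers of this kind describe the upstream connection and body encoding, so a proxy that re-serializes the body generally should not forward them unchanged.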

_handler(path: str) → flask.Response[source]#

_log_response_interceptor_error(
interceptor: nemo_eval.adapters.interceptors.ResponseInterceptor,
adapter_response: nemo_eval.adapters.interceptors.AdapterResponse,
error: Exception,
) → None[source]#