Use this guide when subscribers, exporters, or diagnostics need a provider-neutral view of raw LLM responses.
You will attach a response codec to a managed LLM wrapper so NeMo Relay can decode provider responses into AnnotatedLLMResponse data for LLM end events.
Response codecs are observability-only:
You need:
annotated_response from LLM end events.
Response codecs normalize provider output into fields that subscribers can inspect consistently:
Use these annotations for observability, export, and debugging. Keep business logic that changes the caller-visible response in the framework or provider adapter, not in the response codec.
The built-in provider codecs also implement response decoding:
OpenAIChatCodec
OpenAIResponsesCodec
AnthropicMessagesCodec
Choose the codec that matches the actual provider response shape. For example, do not use OpenAIChatCodec for an OpenAI Responses API payload only because both came from an OpenAI-compatible provider.
The examples below attach built-in response codecs for supported provider response shapes.
Subscribers can inspect annotated_response on LLM end events. The exact event category fields are binding-provided, so defensive checks should confirm the annotation exists before reading it.
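The defensive check described above can be sketched as follows. This is a minimal illustration, not the NeMo Relay API: the LLMEndEvent dataclass and the on_llm_end handler are hypothetical stand-ins, and only the annotated_response field name comes from this guide.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LLMEndEvent:
    """Hypothetical stand-in for a binding-provided LLM end event."""

    name: str
    annotated_response: Optional[dict] = None


def on_llm_end(event: Any) -> Optional[dict]:
    """Return the annotation when it exists, otherwise None."""
    # getattr with a default tolerates events that lack the field entirely.
    annotation = getattr(event, "annotated_response", None)
    if annotation is None:
        # No codec attached, or decoding did not succeed: skip quietly.
        return None
    return annotation
```

Reading the field through getattr keeps the subscriber working for events that carry no annotation at all, which matters because the exact event fields depend on the binding.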
Use a custom response codec when the provider or framework response does not match a built-in shape.
In Python, a custom response codec can route to built-in codecs and return their native AnnotatedLLMResponse values:
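The routing idea can be sketched as below, under stated assumptions. StubCodec and RoutingResponseCodec are hypothetical names; a real implementation would pass in the built-in codecs and return their AnnotatedLLMResponse values rather than the placeholder dictionaries used here. The shape checks rely on the top-level fields of the public provider payloads (choices for OpenAI Chat Completions, object set to "response" for the OpenAI Responses API, type set to "message" for Anthropic Messages). Raising ValueError on an unknown shape is also an assumption about how a failed decode might be signalled.

```python
from typing import Any, Mapping


class StubCodec:
    """Hypothetical stand-in for a built-in codec."""

    def __init__(self, label: str) -> None:
        self.label = label

    def decode_response(self, raw: Mapping[str, Any]) -> dict:
        # Placeholder result so the routing decision is visible.
        return {"decoded_by": self.label}


class RoutingResponseCodec:
    """Picks a delegate codec from the shape of the raw payload."""

    def __init__(self, chat: Any, responses: Any, messages: Any) -> None:
        self._chat = chat
        self._responses = responses
        self._messages = messages

    def decode_response(self, raw: Mapping[str, Any]) -> Any:
        # OpenAI Chat Completions payloads carry a top-level "choices" list.
        if "choices" in raw:
            return self._chat.decode_response(raw)
        # OpenAI Responses API payloads use object == "response" with "output".
        if raw.get("object") == "response" and "output" in raw:
            return self._responses.decode_response(raw)
        # Anthropic Messages payloads use type == "message" with "content".
        if raw.get("type") == "message" and "content" in raw:
            return self._messages.decode_response(raw)
        # Assumption: raising signals that decoding failed for this payload.
        raise ValueError("unrecognized provider response shape")
```

Routing on the payload shape rather than on the provider name follows the earlier advice to match the codec to the actual response shape.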
In Node.js, implement decodeResponse and return the normalized response JSON shape:
In Rust, implement LlmResponseCodec directly:
Streaming LLM wrappers decode the aggregated response produced by the stream finalizer. The response codec does not see each token or chunk. Use stream collectors for chunk-level behavior, and use response codecs for the final normalized end-event annotation.
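The division of labor between stream collectors and response codecs can be illustrated with a small sketch. Every name here (finalize_stream, CountingCodec, run_stream) is hypothetical and the chunk format is simplified to plain strings; the point is only that chunk-level work happens per chunk while the codec runs once on the aggregated result.

```python
from typing import Callable, Iterable


def finalize_stream(chunks: Iterable[str]) -> dict:
    """Hypothetical finalizer: join text chunks into one aggregated response."""
    return {"text": "".join(chunks)}


class CountingCodec:
    """Hypothetical codec that records how often it is asked to decode."""

    def __init__(self) -> None:
        self.calls = 0

    def decode_response(self, raw: dict) -> dict:
        self.calls += 1
        return {"text": raw["text"], "length": len(raw["text"])}


def run_stream(chunks: Iterable[str], on_chunk: Callable[[str], None], codec: CountingCodec) -> dict:
    seen = []
    for chunk in chunks:
        # Chunk-level behavior belongs to the collector callback.
        on_chunk(chunk)
        seen.append(chunk)
    # The codec sees only the aggregated response, exactly once.
    return codec.decode_response(finalize_stream(seen))
```

For example, calling run_stream(["Hel", "lo"], collected.append, codec) with an empty list named collected invokes the collector twice and the codec once.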
Use this checklist to confirm the implementation preserves the expected runtime contract.
decode_response returns a normalized response with safe, JSON-compatible fields.
annotated_response only on LLM end events where decode succeeds.
Check these symptoms first when the workflow does not behave as expected.
tool_calls.api_specific or extra.
Use these links to continue from this workflow into the next related task.