nemoguardrails.integrations.langchain.providers.trtllm.llm

A LangChain LLM component for connecting to a Triton + TensorRT-LLM backend.

Module Contents

Classes

Name    Description
TRTLLM  A custom LangChain LLM class that integrates with TensorRT-LLM Triton models.

Data

BAD_WORDS

RANDOM_SEED

STOP_WORDS

API

class nemoguardrails.integrations.langchain.providers.trtllm.llm.TRTLLM()

Bases: BaseLLM

A custom LangChain LLM class that integrates with TensorRT-LLM Triton models.

Arguments:
server_url: (str) The URL of the Triton inference server to use.
model_name: (str) The name of the Triton TRT model to use.
temperature: (float) The temperature to use for sampling.
top_p: (float) The top-p value to use for sampling.
top_k: (int) The top-k value to use for sampling.
beam_width: (int) The beam width to use for beam search.
repetition_penalty: (float) The penalty to apply to repeated tokens.
length_penalty: (float) The penalty to apply to the output length.
tokens: (int) The maximum number of tokens to generate.
client: The client object used to communicate with the inference server.

_get_model_default_parameters
Dict[str, Any]
_identifying_params
Dict[str, Any]

Get all the identifying parameters.

_invocation_params
Dict[str, Any]
_llm_type
str
beam_width
Optional[int] = 1
length_penalty
Optional[float] = 1.0
model_name
str = 'ensemble'
repetition_penalty
Optional[float] = 1.0
server_url
str = Field(None, alias='server_url')
streaming
Optional[bool] = True
temperature
Optional[float] = 1.0
tokens
Optional[int] = 100
top_k
Optional[int] = 1
top_p
Optional[float] = 0
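
The defaults in the attribute listing above can be gathered into keyword arguments for the constructor. The following is a minimal sketch rather than the library's own code: build_trtllm_kwargs is a hypothetical helper, and the server address localhost:8001 is an assumption (8001 is Triton's default gRPC port), so adjust both for your deployment.

```python
from typing import Any, Dict

# Defaults copied from the attribute listing above.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "model_name": "ensemble",
    "temperature": 1.0,
    "top_p": 0,
    "top_k": 1,
    "beam_width": 1,
    "repetition_penalty": 1.0,
    "length_penalty": 1.0,
    "tokens": 100,
    "streaming": True,
}


def build_trtllm_kwargs(server_url: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge caller overrides into the documented defaults."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown TRTLLM parameters: {sorted(unknown)}")
    return {"server_url": server_url, **DOCUMENTED_DEFAULTS, **overrides}


# Assumed address; replace with the URL of your own Triton server.
kwargs = build_trtllm_kwargs("localhost:8001", temperature=0.2, tokens=256)

# With nemoguardrails installed and a Triton server reachable, the class
# could then be constructed from these arguments:
# from nemoguardrails.integrations.langchain.providers.trtllm.llm import TRTLLM
# llm = TRTLLM(**kwargs)
```

Rejecting unknown keys early is a design choice of this sketch; it surfaces typos such as max_tokens (the documented field is tokens) before any request is sent.
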
nemoguardrails.integrations.langchain.providers.trtllm.llm.TRTLLM._acall(
*args,
**kwargs
)
async

Asynchronous version of _call.
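
One common way to expose a blocking call to asynchronous code is to run it in a worker thread. The sketch below illustrates that pattern with asyncio.to_thread (Python 3.9+); it is an assumption about how such a wrapper can be written, not a statement of how this method is implemented. blocking_call and async_call are hypothetical stand-ins.

```python
import asyncio


def blocking_call(prompt: str) -> str:
    """Stand-in for a synchronous inference request (hypothetical)."""
    return f"echo: {prompt}"


async def async_call(prompt: str) -> str:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(blocking_call, prompt)


result = asyncio.run(async_call("hello"))
```
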

nemoguardrails.integrations.langchain.providers.trtllm.llm.TRTLLM._call(
prompt: str,
stop: typing.Optional[typing.List[str]] = None,
run_manager: typing.Optional[langchain_core.callbacks.manager.CallbackManagerForLLMRun] = None,
**kwargs: typing.Any
) -> str

Execute an inference request.

Parameters:

prompt
str

The prompt to pass into the model.

stop
Optional[List[str]], defaults to None

A list of strings that stop generation when encountered.

Returns: str

The string generated by the model.
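
To make the effect of the stop parameter concrete, here is a minimal sketch that cuts a generated string at the earliest stop string. It is illustrative only: truncate_at_stop is a hypothetical helper, the fallback to STOP_WORDS when stop is None is an assumption, and the actual backend may apply stop words on the server side instead.

```python
from typing import List, Optional

# Mirrors the module-level STOP_WORDS value documented on this page.
STOP_WORDS = ["</s>"]


def truncate_at_stop(text: str, stop: Optional[List[str]] = None) -> str:
    """Cut text at the earliest occurrence of any stop string (illustrative only)."""
    # Assumption: fall back to the module default when no stop list is given.
    stop_list = stop if stop is not None else STOP_WORDS
    cut = len(text)
    for word in stop_list:
        if not word:
            continue  # skip empty strings, which would match at index 0
        index = text.find(word)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]
```
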

nemoguardrails.integrations.langchain.providers.trtllm.llm.TRTLLM.validate_environment(
values: typing.Dict[str, typing.Any]
) -> typing.Dict[str, typing.Any]
classmethod

Validate that the required Python package exists in the environment.
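
A check of this kind can be sketched with the standard library alone. The example below is an assumption about the general technique, not the method's actual body: require_package is a hypothetical helper, and the package name is passed in because this page does not state which package is checked.

```python
import importlib.util


def require_package(package_name: str) -> None:
    """Raise ImportError with an actionable message if a top-level package is missing."""
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(package_name) is None:
        raise ImportError(
            f"Could not import '{package_name}'. "
            "Install it before using this integration."
        )


require_package("json")  # standard-library module, so this passes silently
```
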

nemoguardrails.integrations.langchain.providers.trtllm.llm.BAD_WORDS = ['']
nemoguardrails.integrations.langchain.providers.trtllm.llm.RANDOM_SEED = 0
nemoguardrails.integrations.langchain.providers.trtllm.llm.STOP_WORDS = ['</s>']