Completion Pipeline#
Supported Environments#
All environments require additional Conda packages which can be installed with either the conda/environments/all_cuda-128_arch-$(arch).yaml or conda/environments/examples_cuda-128_arch-$(arch).yaml environment files. Refer to the Install Dependencies section for more information.
| Environment | Supported | Notes |
| --- | --- | --- |
| Conda | ✔ | |
| Morpheus Docker Container | ✔ | |
| Morpheus Release Container | ✔ | |
| Dev Container | ✔ | |
Background Information#
Purpose#
The primary goal of this example is to showcase the creation of a pipeline that integrates an LLM service with Morpheus. Although this example features a single implementation, the pipeline and its components are versatile and can be adapted to various scenarios with unique requirements. The following highlights different customization points within the pipeline and the specific choices made for this example:
LLM Service#
The pipeline is designed to support any LLM service that adheres to our LLMService interface. Compatible services include OpenAI, as well as fully local execution using llama-cpp-python.
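As a concrete illustration, the sketch below builds an OpenAI-backed client that the pipeline's generation node can use. The morpheus_llm module path, class name, and get_client() signature are assumptions based on recent Morpheus releases and may differ in your installed version:

```python
# Sketch: selecting an LLM service for the pipeline. The module path and
# signatures are assumptions; verify against your installed Morpheus version.
from morpheus_llm.llm.services.openai_chat_service import OpenAIChatService

# Any implementation of the LLMService interface can be substituted here,
# e.g. a llama-cpp-python backed service for fully local execution.
llm_service = OpenAIChatService()  # relies on OPENAI_API_KEY being set

# The client issues the actual completion requests on behalf of the engine.
llm_client = llm_service.get_client(model_name="gpt-3.5-turbo")
```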
Downstream Tasks#
Post LLM execution, the model’s output can be leveraged for various tasks, including model training, analysis, or simulating an attack. In this particular example, we have simplified the implementation and focused solely on the LLMEngine.
Pipeline Implementation#
This example Morpheus pipeline is built using the following components:
- InMemorySourceStage: Manages LLM queries in a DataFrame.
- DeserializationStage: Converts MessageMeta objects into ControlMessages required by the LLMEngine.
- LLMEngineStage: Encompasses the core LLMEngine functionality (the sketch following this list shows how its nodes fit together):
  - An ExtracterNode extracts the questions from the DataFrame.
  - A PromptTemplateNode combines the extracted data with a template to produce the final inputs for the LLM.
  - An LLMGenerateNode runs the LLM queries.
  - Finally, a SimpleTaskHandler incorporates the responses back into the ControlMessage.
- InMemorySinkStage: Stores the results.
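The sketch below shows one way the node graph described above could be assembled. It is illustrative rather than authoritative: the morpheus_llm module paths, the node constructor arguments, and the example template string are assumptions based on recent Morpheus releases, and llm_client comes from the earlier LLM Service sketch. Consult the source under examples/llm/ for the definitive wiring.

```python
# Illustrative assembly of the LLMEngine node graph described above
# (assumed module paths and signatures; see examples/llm/ for the
# authoritative version).
from morpheus_llm.llm import LLMEngine
from morpheus_llm.llm.nodes.extracter_node import ExtracterNode
from morpheus_llm.llm.nodes.llm_generate_node import LLMGenerateNode
from morpheus_llm.llm.nodes.prompt_template_node import PromptTemplateNode
from morpheus_llm.llm.task_handlers.simple_task_handler import SimpleTaskHandler

engine = LLMEngine()

# Pull the question data out of the incoming DataFrame.
engine.add_node("extracter", node=ExtracterNode())

# Render each row into the final prompt sent to the LLM ("country" is an
# illustrative column name, not one mandated by the example).
engine.add_node("prompts",
                inputs=["/extracter"],
                node=PromptTemplateNode(template="What is the capital of {{ country }}?",
                                        template_format="jinja"))

# Issue the completion requests through the configured LLM client
# (the llm_client from the LLM Service sketch above).
engine.add_node("completion",
                inputs=["/prompts"],
                node=LLMGenerateNode(llm_client=llm_client))

# Write the responses back onto the ControlMessage for downstream stages.
engine.add_task_handler(inputs=["/completion"], handler=SimpleTaskHandler())
```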
Getting Started#
Prerequisites#
Before running the pipeline, ensure that the OPENAI_API_KEY
environment variable is set.
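For example, in a Bash-compatible shell (substituting your own key):

```bash
export OPENAI_API_KEY="<your-api-key>"
```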
Install Dependencies#
Install the required dependencies.
```bash
conda env update --solver=libmamba \
  -n ${CONDA_DEFAULT_ENV} \
  --file ./conda/environments/examples_cuda-128_arch-$(arch).yaml
```
Running the Morpheus Pipeline#
The top-level entry point for each of the LLM example pipelines is examples/llm/main.py. This script accepts a set of options and a pipeline to run. The baseline options are listed below; for the purposes of this document we'll assume a pipeline option of completion:
Run example:#
```bash
python examples/llm/main.py completion [OPTIONS] COMMAND [ARGS]...
```
Commands:#
pipeline
Options:#
| Option | Description | Default |
| --- | --- | --- |
| --use_cpu_only | Run in CPU only mode. | False |
| --num_threads INTEGER RANGE | Number of internal pipeline threads to use. | 12 |
| --pipeline_batch_size INTEGER RANGE | Internal batch size for the pipeline. Can be much larger than the model batch size. Also used for Kafka consumers. | 1024 |
| --model_max_batch_size INTEGER RANGE | Max batch size to use for the model. | 64 |
| --repeat_count INTEGER RANGE | Number of times to repeat the input query. Useful for testing performance. | 64 |
| --llm_service [OpenAI] | LLM service to issue requests to. | OpenAI |
| --help | Show the help message with option and command details. | |
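The same information is available at the command line via the --help flag listed above:

```bash
python examples/llm/main.py completion --help
```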
Running Morpheus Pipeline with OpenAI LLM service#
```bash
python examples/llm/main.py completion pipeline
```
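Options can be appended as needed. For example, assuming (as in the repository's example code) that the options above are accepted by the pipeline command, the following uses an illustrative value to increase the repeat count for a quick throughput test:

```bash
python examples/llm/main.py completion pipeline --repeat_count 100
```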