Example cyBERT Morpheus Pipeline for Apache Log Parsing#

Example Morpheus pipeline using Triton Inference server and Morpheus.

Supported Environments#

Environment	Supported	Notes
Conda	✔
Morpheus Docker Container	✔	Requires launching Triton on the host
Morpheus Release Container	✔	Requires launching Triton on the host
Dev Container	✔	Requires using the `dev-triton-start` script. If using the `run.py` script this requires adding the `--server_url=triton:8001` flag. If using the CLI example this requires replacing `--server_url=localhost:8001` with `--server_url=triton:8001`

Set up Triton Inference Server#

Pull Triton Inference Server Docker Image#

Pull the Morpheus Triton models Docker image from NGC.

Example:

docker pull nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:25.06

Start Triton Inference Server Container#

From the Morpheus repo root directory, run the following to launch Triton and load the log-parsing-onnx model:

docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:25.06 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model log-parsing-onnx

Verify Model Deployment#

Once Triton server finishes starting up, it will display the status of all loaded models. Successful deployment of the model will show the following:

+------------------+---------+--------+
| Model            | Version | Status |
+------------------+---------+--------+
| log-parsing-onnx | 1       | READY  |
+------------------+---------+--------+

Note: If this is not present in the output, check the Triton log for any error messages related to loading the model.

Run Log Parsing Pipeline#

Run the following from the root of the Morpheus repo to start the log parsing pipeline:

python examples/log_parsing/run.py \
    --input_file=./examples/data/log-parsing-validation-data-input.csv \
    --model_vocab_hash_file=data/bert-base-cased-hash.txt \
    --model_vocab_file=data/bert-base-cased-vocab.txt \
    --model_name log-parsing-onnx \
    --model_config_file=./examples/data/log-parsing-config-20220418.json

Use --help to display information about the command line options:

python run.py --help

Options:
  --num_threads INTEGER RANGE     Number of internal pipeline threads to use
                                  [x>=1]
  --pipeline_batch_size INTEGER RANGE
                                  Internal batch size for the pipeline. Can be
                                  much larger than the model batch size. Also
                                  used for Kafka consumers  [x>=1]
  --model_max_batch_size INTEGER RANGE
                                  Max batch size to use for the model  [x>=1]
  --input_file PATH               Input filepath  [required]
  --output_file TEXT              The path to the file where the inference
                                  output will be saved.
  --model_vocab_hash_file FILE    Model vocab hash file to use for pre-
                                  processing  [required]
  --model_vocab_file FILE         Model vocab file to use for post-processing
                                  [required]
  --model_seq_length INTEGER RANGE
                                  Sequence length to use for the model  [x>=1]
  --model_name TEXT               The name of the model that is deployed on
                                  Triton server  [required]
  --model_config_file TEXT        Model config file  [required]
  --server_url TEXT               Tritonserver url  [required]
  --help                          Show this message and exit.

CLI Example#

The above example is illustrative of using the Python API to build a custom Morpheus pipeline. Alternately, the Morpheus command line could have been used to accomplish the same goal. To do this we must ensure the examples/log_parsing directory is available in the PYTHONPATH and each of the custom stages are registered as plugins.

From the root of the Morpheus repo, run:

PYTHONPATH="examples/log_parsing" \
morpheus --log_level INFO \
	--plugin "inference" \
	--plugin "postprocessing" \
	run --pipeline_batch_size 1024 --model_max_batch_size 32  \
	pipeline-nlp \
	from-file --filename ./examples/data/log-parsing-validation-data-input.csv \
	deserialize \
	preprocess --vocab_hash_file=data/bert-base-cased-hash.txt --stride 64 --column=raw \
	monitor --description "Preprocessing rate" \
	inf-logparsing --model_name log-parsing-onnx --server_url localhost:8001 --force_convert_inputs=True \
	monitor --description "Inference rate" --unit inf \
	log-postprocess --vocab_path=data/bert-base-cased-vocab.txt \
		--model_config_path=./examples/data/log-parsing-config-20220418.json \
	to-file --filename .tmp/output/log-parsing-cli-output.jsonlines --overwrite  \
	monitor --description "Postprocessing rate"