Large Language Models (Latest)

Release Notes

Summary

This is the latest release of NIM.

Language Models

  • Llama 3.1 8B Base

  • Llama 3.1 8B Instruct

  • Llama 3.1 70B Instruct

New Features

Known Issues

  • vLLM is not currently supported on Llama 3.1 models.

  • NIM does not support Multi-Instance GPU (MIG) mode.

Summary

This is the first general release of NIM.

Language Models

  • Llama 3 8B Instruct

  • Llama 3 70B Instruct

  • Mistral-7B-Instruct-v0.3

  • Mixtral-8x7B-v0.1

  • Mixtral-8x22B-v0.1

Known Issues

  • P-Tuning is not supported.

  • Empty metrics values on multi-GPU TensorRT-LLM models: the metrics items gpu_cache_usage_perc, num_requests_running, and num_requests_waiting are not reported for multi-GPU TensorRT-LLM models, because TensorRT-LLM currently does not expose iteration statistics in orchestrator mode.

  • "No tokenizer found" error when running PEFT: this warning can be safely ignored.
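
For the empty-metrics issue above, one way to confirm which of the three items are absent is to scan a captured metrics dump for them. The following is a minimal sketch, assuming the metrics are available as Prometheus-style text (one "name{labels} value" sample per line) and that names may carry a "prefix:" in front of them. The function name missing_metrics and the sample text are hypothetical and are not part of NIM.

```python
# Metric names taken from the known issue above.
EXPECTED_METRICS = (
    "gpu_cache_usage_perc",
    "num_requests_running",
    "num_requests_waiting",
)


def missing_metrics(exposition_text, expected=EXPECTED_METRICS):
    """Return the expected metric names that have no sample line in the text.

    Hypothetical helper: assumes Prometheus-style text, one sample per line.
    """
    seen = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The sample name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        for metric in expected:
            # Assumption: the server may expose the name with a "prefix:".
            if name == metric or name.endswith(":" + metric):
                seen.add(metric)
    return [m for m in expected if m not in seen]


if __name__ == "__main__":
    # Made-up sample text for illustration only; not real NIM output.
    sample = 'num_requests_running{model_name="example"} 2.0\n'
    print(missing_metrics(sample))
```

On a multi-GPU TensorRT-LLM deployment affected by this issue, all three names would be expected to appear in the returned list.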

© Copyright 2024, NVIDIA Corporation. Last updated on Jul 26, 2024.