Models#

The Models page in NeMo Studio provides a playground interface for testing and experimenting with models. You can try out different models, configure system prompts, and provide input-output examples for prompt tuning before committing to full model fine-tuning or deployment.


Backend Microservices#

On the backend, the UI communicates with the NeMo Deployment Management and NIM Proxy microservices to manage model deployments and run inference against the deployed models.
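
As a rough illustration of the inference path, the sketch below sends a chat request to an OpenAI-compatible endpoint of the kind NIM Proxy exposes. The base URL, model name, and authentication are placeholders for illustration, not exact NeMo Studio internals.

```python
# Minimal sketch: send a prompt to a deployed model through an
# OpenAI-compatible chat completions endpoint (base URL and model
# name are placeholders for your own deployment).
import requests

NIM_PROXY_BASE_URL = "http://nim-proxy.example.com"  # placeholder

response = requests.post(
    f"{NIM_PROXY_BASE_URL}/v1/chat/completions",
    # Add an Authorization header here if your deployment requires one.
    json={
        "model": "meta/llama-3.1-8b-instruct",  # placeholder model name
        "messages": [
            {"role": "user", "content": "Summarize NeMo Studio in one sentence."}
        ],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```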


Models Page UI Overview#

The following are the main components and features of the Models page.

Model Listing#

The Models page displays your saved models in a table format with the following columns:

  • Model Name: The unique identifier for your model.

  • Base Model: The base model that the saved model builds on.

  • Description: Optional text describing the model’s purpose or experiment focus.

  • Created: Timestamp showing when the model was created.

Model Management#

You can perform the following actions on each model by clicking the three-dot menu icon:

  • Open: Open the model to view and interact with it in the playground.

  • Clone and Edit: Duplicate the model and edit the copy’s configuration to save it as a new model.

  • Delete: Delete the model.

Model Testing#

The following are the common tasks you can perform on the Models page.

  • Model Selection: Choose from available base models or your custom fine-tuned models to test.

  • Real-Time Interaction: Send prompts and receive model responses immediately.

  • Multiple Sessions: Create and manage multiple playground sessions for different experiments.


Model Configuration#

When you create a new model, you can configure the following settings.

System Prompts#

System prompts set the model’s behavior and role:

  • Custom Instructions: Define how the model should behave, its persona, or task-specific guidelines.

  • Prompt Templates: Save and reuse effective system prompts across sessions.

  • Context Setting: Establish the model’s role, tone, and response style.
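
As a rough illustration of how a configured system prompt shapes a request, the sketch below pairs a system message with a user message using the OpenAI-compatible message format assumed earlier; the prompt text is only an example.

```python
# Sketch: a system prompt becomes a "system" message that precedes the
# user's prompt in the request payload (values are illustrative).
messages = [
    {
        "role": "system",
        "content": (
            "You are a concise support assistant for an e-commerce site. "
            "Answer in a friendly tone and keep responses under three sentences."
        ),
    },
    {"role": "user", "content": "How do I return an item?"},
]
```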

Learning Examples#

Enhance model performance through in-context learning:

  • Input-Output Pairs: Provide example input-output pairs to guide model behavior.

  • Few-Shot Learning: Add multiple examples to help the model understand desired response patterns.

  • Example Management: Add, edit, or remove examples to refine model responses.

  • Dynamic Testing: See how different examples affect model outputs in real time.
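
As an illustration of few-shot in-context learning, the sketch below turns input-output examples into user/assistant message pairs that precede the real prompt, again assuming the OpenAI-compatible message format; the reviews and labels are made up.

```python
# Sketch: each input-output example becomes a user/assistant pair that
# precedes the actual prompt (example data is illustrative only).
few_shot_examples = [
    ("Great product, arrived early!", "positive"),
    ("The item broke after two days.", "negative"),
]

messages = [
    {
        "role": "system",
        "content": "Classify the sentiment of each review as positive or negative.",
    }
]
for review, label in few_shot_examples:
    messages.append({"role": "user", "content": review})
    messages.append({"role": "assistant", "content": label})

# The real input to classify comes last.
messages.append(
    {"role": "user", "content": "Shipping was slow, but the quality is excellent."}
)
```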

Tools#

Expand the capabilities of large language models by enabling access to external data and functionality:

  • Tool Integration: Add tools that allow models to access external APIs, databases, or services.

  • Enhanced Functionality: Enable models to perform actions beyond text generation, such as retrieving real-time data or executing functions.

  • Custom Tools: Configure custom tools specific to your use case or domain.

  • Tool Management: Add, configure, or remove tools to optimize model capabilities.
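
As an illustration of what a tool definition can look like, the sketch below uses the widely adopted OpenAI-compatible function-calling schema; the tool name, parameters, and backing service are hypothetical and do not reflect a specific NeMo Studio tool format.

```python
# Sketch: a hypothetical tool described in the OpenAI-compatible
# function-calling schema. The application executes the tool call the
# model requests and returns the result in a follow-up message.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_order_status",  # hypothetical tool name
            "description": "Look up the shipping status of an order by its ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        "type": "string",
                        "description": "The customer's order ID.",
                    }
                },
                "required": ["order_id"],
            },
        },
    }
]
```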

Hyperparameters#

Tune model behavior by adjusting parameters such as temperature, maximum tokens, and timeout:

  • Temperature: Controls the creativity and randomness of model outputs. Higher values (closer to 2) enable more creative and varied outputs, suitable for tasks such as creative writing. Lower values (closer to 0) produce more deterministic and focused responses. A value between 0.5 and 0.8 is a good starting point for experimentation.

  • Maximum Tokens: Sets the maximum number of tokens the model can generate in a response. Tokens can be entire words or parts of words. For English, 100 tokens form approximately 75 words on average. Range: 1 to 4096 tokens.

  • Real-Time Adjustment: Modify hyperparameters and immediately see their impact on model responses.
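
As a rough guide to how these settings map onto an inference request, the sketch below attaches temperature and maximum tokens to an assumed OpenAI-compatible payload; the model name and values are illustrative starting points, not recommendations for every task.

```python
# Sketch: hyperparameters attached to an OpenAI-compatible request payload
# (model name and values are placeholders).
payload = {
    "model": "meta/llama-3.1-8b-instruct",  # placeholder
    "messages": [
        {"role": "user", "content": "Write a tagline for a hiking app."}
    ],
    "temperature": 0.7,  # closer to 0 = focused; closer to 2 = more varied
    "max_tokens": 256,   # cap on generated tokens (~75 English words per 100 tokens)
}
```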


Use Cases#

The following are possible use cases for the Models page:

  • Prompt Engineering: Develop and refine prompts before deploying models in production. Test different phrasings, instructions, and formats to achieve optimal results.

  • Pre-Fine-Tuning Validation: Validate that prompt tuning with examples can achieve your desired outcomes before investing time and resources in full model fine-tuning.

  • Rapid Prototyping: Quickly prototype AI-powered features or applications by testing model behavior with different configurations and use cases.