Custom Model Settings

While Data Designer ships with pre-configured model providers and configurations, you can create custom configurations to use different models, adjust inference parameters, or connect to custom API endpoints.

When to Use Custom Settings

Use custom model settings when you need to:

  • Use models not included in the defaults
  • Adjust inference parameters (temperature, top_p, max_tokens) for specific use cases
  • Add distribution-based inference parameters for variability
  • Connect to self-hosted or custom model endpoints
  • Create multiple variants of the same model with different settings
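
To make "distribution-based inference parameters" more concrete: instead of one fixed temperature, each request draws its value from a distribution, so outputs vary from record to record. The sketch below illustrates only that idea, using the standard library. `sample_temperature` and its arguments are hypothetical names, not part of Data Designer's API; a real configuration would use the library's own distribution settings.

```python
import random


def sample_temperature(low: float, high: float, rng: random.Random) -> float:
    """Draw one temperature uniformly from [low, high] (hypothetical helper)."""
    return rng.uniform(low, high)


# A seeded generator keeps the draws reproducible across runs.
rng = random.Random(42)

# One draw per request: each generated record would see a different temperature.
temperatures = [sample_temperature(0.5, 1.0, rng) for _ in range(5)]
print([round(t, 2) for t in temperatures])
```

A narrow range keeps outputs close in style, while a wide range trades consistency for variety.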

Creating and Using Custom Settings

Custom Models with Default Providers

Create custom model configurations that use the default providers (no need to define providers yourself):

```python
import data_designer.config as dd
from data_designer.interface import DataDesigner

# Create custom models using default providers
custom_models = [
    # High-temperature for more variability
    dd.ModelConfig(
        alias="creative-writer",
        model="nvidia/nemotron-3-nano-30b-a3b",
        provider="nvidia",  # Uses default NVIDIA provider
        inference_parameters=dd.ChatCompletionInferenceParams(
            temperature=1.2,
            top_p=0.98,
            max_tokens=4096,
        ),
    ),
    # Low-temperature for less variability
    dd.ModelConfig(
        alias="fact-checker",
        model="nvidia/nemotron-3-nano-30b-a3b",
        provider="nvidia",  # Uses default NVIDIA provider
        inference_parameters=dd.ChatCompletionInferenceParams(
            temperature=0.1,
            top_p=0.9,
            max_tokens=2048,
        ),
    ),
]

# Create DataDesigner (uses default providers)
data_designer = DataDesigner()

# Pass custom models to config builder
config_builder = dd.DataDesignerConfigBuilder(model_configs=custom_models)

# Add a topic column using a categorical sampler
config_builder.add_column(
    dd.SamplerColumnConfig(
        name="topic",
        sampler_type=dd.SamplerType.CATEGORY,
        params=dd.CategorySamplerParams(
            values=["Artificial Intelligence", "Space Exploration", "Ancient History", "Climate Science"],
        ),
    )
)

# Use your custom models
config_builder.add_column(
    dd.LLMTextColumnConfig(
        name="creative_story",
        model_alias="creative-writer",
        prompt="Write a creative short story about {{topic}}.",
    )
)

config_builder.add_column(
    dd.LLMTextColumnConfig(
        name="facts",
        model_alias="fact-checker",
        prompt="List 3 facts about {{topic}}.",
    )
)

# Preview your dataset
preview_result = data_designer.preview(config_builder=config_builder)
preview_result.display_sample_record()
```

Default Providers Always Available

When you only specify model_configs, the default model providers (NVIDIA, OpenAI, and OpenRouter) are still available. You only need to create custom providers if you want to connect to different endpoints or modify provider settings.

Always specify provider= on ModelConfig

Leaving provider unset (or passing provider=None) on ModelConfig is deprecated. The legacy “implicit default provider” routing, which is used when provider is omitted, emits a DeprecationWarning and will be removed in a future release. Always reference the intended provider by name, as the examples on this page do. See issue #589.
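
Because the implicit routing only emits a `DeprecationWarning`, it is easy to overlook. One way to catch it early, sketched here with Python's standard `warnings` module, is to promote deprecation warnings to errors in tests or CI. The `build_model_config` function is a hypothetical stand-in that mimics the behavior described above; it is not Data Designer code.

```python
import warnings
from typing import Optional


def build_model_config(alias: str, provider: Optional[str] = None) -> dict:
    """Hypothetical stand-in that warns when provider is omitted."""
    if provider is None:
        warnings.warn(
            "provider is unset; implicit default routing is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
    return {"alias": alias, "provider": provider}


def raises_on_missing_provider() -> bool:
    """Return True if an omitted provider fails loudly under the 'error' filter."""
    with warnings.catch_warnings():
        # Promote DeprecationWarning to an exception inside this block only.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            build_model_config(alias="fact-checker")
        except DeprecationWarning:
            return True
    return False


print(raises_on_missing_provider())
```

The same effect is available for a whole run with the interpreter flag `python -W error::DeprecationWarning`, although that also escalates deprecation warnings raised by other packages.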

Mixing Custom and Default Models

When you provide custom model_configs to DataDesignerConfigBuilder, they replace the defaults entirely. To use custom model configs in addition to the default configs, use the add_model_config method:

```python
import data_designer.config as dd

# Load defaults first
config_builder = dd.DataDesignerConfigBuilder()

# Add custom model to defaults
config_builder.add_model_config(
    dd.ModelConfig(
        alias="my-custom-model",
        model="nvidia/llama-3.3-nemotron-super-49b-v1.5",
        provider="nvidia",  # Uses default provider
        inference_parameters=dd.ChatCompletionInferenceParams(
            temperature=0.6,
            max_tokens=8192,
        ),
    )
)

# Now you can use both default and custom models
# Default: nvidia-text, nvidia-reasoning, nvidia-vision, etc.
# Custom: my-custom-model
```
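
The difference between replacing and extending the defaults can be pictured with a toy model of the builder's alias bookkeeping. This is a sketch only: `ToyConfigBuilder` is a hypothetical class written for illustration, and `DEFAULT_ALIASES` lists just the default aliases named on this page, not the full set.

```python
# Only the default aliases mentioned on this page; the real list is longer.
DEFAULT_ALIASES = ["nvidia-text", "nvidia-reasoning", "nvidia-vision"]


class ToyConfigBuilder:
    """Toy stand-in for alias bookkeeping (not the real builder class)."""

    def __init__(self, model_configs=None):
        # Passing model_configs replaces the defaults; omitting it loads them.
        if model_configs is None:
            self.aliases = list(DEFAULT_ALIASES)
        else:
            self.aliases = list(model_configs)

    def add_model_config(self, alias):
        # Adding appends to whatever is already registered.
        self.aliases.append(alias)


# Constructor argument: the defaults are gone.
replaced = ToyConfigBuilder(model_configs=["my-custom-model"])

# add_model_config: the defaults stay and the custom alias joins them.
extended = ToyConfigBuilder()
extended.add_model_config("my-custom-model")

print(replaced.aliases)
print(extended.aliases)
```

In practice this means a column that references a default alias such as `nvidia-text` only resolves when the builder was created without `model_configs`.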

Custom Providers with Custom Models

Define both custom providers and custom model configurations when you need to connect to services not included in the defaults:

Network Accessibility

The custom provider endpoints must be reachable from where Data Designer runs. Ensure network connectivity, firewall rules, and any VPN requirements are properly configured.
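
A quick connectivity check from the machine that will run Data Designer can rule out firewall or VPN problems before a preview is attempted. The following is a minimal sketch using only the standard library; `endpoint_host_port` and `is_reachable` are hypothetical helper names, and the URL is the placeholder endpoint used on this page.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def endpoint_host_port(endpoint: str) -> Tuple[str, int]:
    """Extract host and port from an endpoint URL, defaulting by scheme."""
    parsed = urlparse(endpoint)
    if parsed.hostname is None:
        raise ValueError(f"No host found in endpoint: {endpoint!r}")
    default_port = 443 if parsed.scheme == "https" else 80
    return parsed.hostname, parsed.port or default_port


def is_reachable(endpoint: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port opens."""
    host, port = endpoint_host_port(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Parsing is offline; call is_reachable(...) on the target machine to probe.
print(endpoint_host_port("https://api.my-llm-service.com/v1"))
```

A successful TCP connection only shows that the host and port are open from this machine. It does not confirm that the API key is valid or that the path serves an OpenAI-compatible API.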

```python
import data_designer.config as dd
from data_designer.interface import DataDesigner

# Step 1: Define custom providers
custom_providers = [
    dd.ModelProvider(
        name="my-custom-provider",
        endpoint="https://api.my-llm-service.com/v1",
        provider_type="openai",  # OpenAI-compatible API
        api_key="MY_SERVICE_API_KEY",  # Environment variable name
    ),
    dd.ModelProvider(
        name="my-self-hosted-provider",
        endpoint="https://my-org.internal.com/llm/v1",
        provider_type="openai",
        api_key="SELF_HOSTED_API_KEY",
    ),
]

# Step 2: Define custom models
custom_models = [
    dd.ModelConfig(
        alias="my-text-model",
        model="openai/some-model-id",
        provider="my-custom-provider",  # References provider by name
        inference_parameters=dd.ChatCompletionInferenceParams(
            temperature=0.85,
            top_p=0.95,
            max_tokens=2048,
        ),
    ),
    dd.ModelConfig(
        alias="my-self-hosted-text-model",
        model="openai/some-hosted-model-id",
        provider="my-self-hosted-provider",
        inference_parameters=dd.ChatCompletionInferenceParams(
            temperature=0.7,
            top_p=0.9,
            max_tokens=1024,
        ),
    ),
]

# Step 3: Create DataDesigner with custom providers
data_designer = DataDesigner(model_providers=custom_providers)

# Step 4: Create config builder with custom models
config_builder = dd.DataDesignerConfigBuilder(model_configs=custom_models)

# Step 5: Add a topic column using a categorical sampler
config_builder.add_column(
    dd.SamplerColumnConfig(
        name="topic",
        sampler_type=dd.SamplerType.CATEGORY,
        params=dd.CategorySamplerParams(
            values=["Technology", "Healthcare", "Finance", "Education"],
        ),
    )
)

# Step 6: Use your custom model by referencing its alias
config_builder.add_column(
    dd.LLMTextColumnConfig(
        name="short_news_article",
        model_alias="my-text-model",  # Reference custom alias
        prompt="Write a short news article about the '{{topic}}' topic in 10 sentences.",
    )
)

config_builder.add_column(
    dd.LLMTextColumnConfig(
        name="long_news_article",
        model_alias="my-self-hosted-text-model",  # Reference custom alias
        prompt="Write a detailed news article about the '{{topic}}' topic.",
    )
)

# Step 7: Preview your dataset
preview_result = data_designer.preview(config_builder=config_builder)
preview_result.display_sample_record()
```
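
Two mistakes are easy to make in a setup like the one above: a model that names a provider that was never defined, and an `api_key` environment variable that is not set. The sketch below checks both before any request is sent. It works on plain dictionaries that mirror the `name`, `api_key`, `alias`, and `provider` fields from the example; the `preflight` helper is hypothetical and not part of Data Designer.

```python
import os


def preflight(providers, models, env=os.environ):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    provider_names = {p["name"] for p in providers}

    # Each provider's api_key holds the NAME of an environment variable.
    for p in providers:
        if not env.get(p["api_key"]):
            problems.append(
                f"provider {p['name']!r}: environment variable {p['api_key']!r} is not set"
            )

    # Each model must reference a provider that is actually defined.
    for m in models:
        if m["provider"] not in provider_names:
            problems.append(
                f"model {m['alias']!r}: unknown provider {m['provider']!r}"
            )
    return problems


providers = [
    {"name": "my-custom-provider", "api_key": "MY_SERVICE_API_KEY"},
    {"name": "my-self-hosted-provider", "api_key": "SELF_HOSTED_API_KEY"},
]
models = [
    {"alias": "my-text-model", "provider": "my-custom-provider"},
    {"alias": "my-self-hosted-text-model", "provider": "my-self-hosted-provider"},
]

# An explicit env mapping keeps the demo deterministic; omit it to use os.environ.
for problem in preflight(providers, models, env={"MY_SERVICE_API_KEY": "example"}):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets every misconfiguration be reported in a single pass.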

See Also