Managing and Customizing TuneD Profiles#
TuneD is a system tuning service that provides profiles for optimizing system performance for various use cases.
The NVIDIA BaseOS software includes the nvidia-tuned-profiles package, which provides
pre-configured TuneD profiles optimized for different NVIDIA DGX platforms and use cases.
About NVIDIA TuneD Profiles#
The nvidia-tuned-profiles package installs profiles for various DGX systems to
/usr/lib/tuned/profiles/. These profiles are categorized as follows:
Platform-Specific Performance Profiles
dgx-a100-performance, dgx-a800-performance - Optimized for DGX A100/A800 systems
dgx-h100-performance, dgx-h200-performance, dgx-h800-performance - Optimized for DGX H100/H200/H800 systems
dgx-b200-performance - Optimized for DGX B200 systems
Crashdump Profiles
dgx-a100-crashdump, dgx-a800-crashdump - Crashdump configuration for A100/A800 systems
dgx-h100-crashdump, dgx-h200-crashdump - Crashdump configuration for H100/H200 systems
dgx-b200-crashdump - Crashdump configuration for B200 systems
Base and Common Profiles
dgx-base - Base profile with common DGX settings, including cachefilesd overrides
nvidia-base - Base profile with NVIDIA-specific settings and service overrides
nvidia-x86-64-performance - Performance profile for x86_64 architectures
nvidia-crashdump-core - Core crashdump configuration
nvidia-no-mitigations - Disables CPU mitigations for better performance
Understanding Profile Inheritance#
Most NVIDIA profiles use the include directive to inherit settings from base profiles. This creates
a hierarchy where platform-specific profiles build upon common base configurations:
nvidia-base - Provides core NVIDIA settings including:
CPU governor set to performance
Network ARP tuning for better networking
Service management for docker and nvidia-persistenced
Kernel parameter init_on_alloc=0 for performance
nvidia-peermem module loading configuration
dgx-base - Provides DGX-specific settings including:
Configuration for cachefilesd service (requires /raid to be mounted)
Service overrides to ensure proper startup dependencies
Platform profiles (for example, dgx-h100-performance) - Include both nvidia-base and dgx-base, then add:
Platform-specific bootloader parameters
Hardware-specific module parameters
Console and IOMMU settings
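The inheritance hierarchy described above can be inspected directly from the profile files. The following is a minimal sketch, assuming each profile declares its parents on a single comma-separated include= line and that a profile in /etc/tuned takes precedence over a packaged one of the same name. The function names and the PROFILE_DIRS variable are hypothetical helpers, not part of TuneD.

```shell
# Directories searched for profiles, in priority order (assumption: custom first).
PROFILE_DIRS="${PROFILE_DIRS:-/etc/tuned /usr/lib/tuned/profiles}"

# Print the path of a profile's tuned.conf, or fail if it is not found.
find_profile_conf() {
    local name="$1" dir
    for dir in $PROFILE_DIRS; do
        if [ -f "$dir/$name/tuned.conf" ]; then
            printf '%s\n' "$dir/$name/tuned.conf"
            return 0
        fi
    done
    return 1
}

# Print a profile and, indented beneath it, every profile it includes.
# Note: there is no guard against circular includes in this sketch.
show_includes() {
    local name="$1" indent="${2:-}" conf includes inc
    if ! conf="$(find_profile_conf "$name")"; then
        printf '%s%s (not found)\n' "$indent" "$name"
        return 0
    fi
    printf '%s%s\n' "$indent" "$name"
    # Take the first include= line and strip the key.
    includes="$(sed -n 's/^[[:space:]]*include[[:space:]]*=[[:space:]]*//p' "$conf" | head -n 1)"
    # Split on commas; read trims the whitespace around each name.
    printf '%s\n' "$includes" | tr ',' '\n' | while read -r inc; do
        if [ -n "$inc" ]; then
            show_includes "$inc" "$indent  "
        fi
    done
}

# Usage on a DGX system:
#   show_includes dgx-h100-performance
```

Running show_includes against a platform profile prints one line per profile in the chain, which makes it easier to see where a given setting might originate before you override it.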
Listing Available TuneD Profiles#
To view all available TuneD profiles on your system:
sudo tuned-adm list
To view the currently active profile:
sudo tuned-adm active
To verify the current profile is properly applied:
sudo tuned-adm verify
To check the status of the TuneD service:
sudo systemctl status tuned
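For scripted checks (for example, in a provisioning pipeline), the output of tuned-adm active can be parsed rather than read by eye. This is a small sketch, assuming the output line begins with "Current active profile:" as in the examples in this guide. The function names are hypothetical.

```shell
# Read `tuned-adm active` output on stdin and print only the profile names.
parse_active_profiles() {
    sed -n 's/^Current active profile:[[:space:]]*//p'
}

# Succeed if the expected profile is among the active profiles read from stdin.
profile_is_active() {
    local expected="$1" names name
    names="$(parse_active_profiles)"
    for name in $names; do
        if [ "$name" = "$expected" ]; then
            return 0
        fi
    done
    return 1
}

# Usage on a live system:
#   tuned-adm active | profile_is_active dgx-h100-performance && echo "expected profile is active"
```

Matching whole names rather than substrings avoids a false positive when one profile name is a prefix of another.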
Cloning and Modifying an Existing Profile#
You can clone an existing NVIDIA profile and customize it for your specific needs. Custom profiles should
be created in /etc/tuned/ to avoid conflicts with package updates.
Identify the profile to clone by listing available profiles:
sudo tuned-adm list
Create a new directory for your custom profile:
sudo mkdir -p /etc/tuned/my-custom-dgx-profile
Copy the configuration from an existing profile. For example, to clone the dgx-h100-performance profile:
sudo cp /usr/lib/tuned/profiles/dgx-h100-performance/tuned.conf /etc/tuned/my-custom-dgx-profile/
The original dgx-h100-performance profile contains:
[main]
include=nvidia-base,dgx-base
summary=TuneD Profile for DGX H100
[bootloader]
cmdline_iommu=iommu=pt
cmdline_console=console=tty0 console=ttyS0,115200n8
cmdline_pci=pci=realloc=off
Edit the custom profile configuration:
sudo vi /etc/tuned/my-custom-dgx-profile/tuned.conf
Modify settings as needed. For example, you might want to add custom network tuning or adjust kernel parameters:
[main]
include=nvidia-base,dgx-base
summary=Custom DGX H100 profile with network tuning
[bootloader]
cmdline_iommu=iommu=pt
cmdline_console=console=tty0 console=ttyS0,115200n8
cmdline_pci=pci=realloc=off
# Add custom boot parameters
cmdline_hugepages=hugepagesz=2M hugepages=8192
[sysctl]
# Add custom network tuning
net.core.rmem_max=268435456
net.core.wmem_max=268435456
net.core.rmem_default=67108864
net.core.wmem_default=67108864
net.ipv4.tcp_rmem=4096 87380 134217728
net.ipv4.tcp_wmem=4096 65536 134217728
# Reduce swappiness for workloads with large memory requirements
vm.swappiness=10
Save the file and activate your custom profile:
sudo tuned-adm profile my-custom-dgx-profile
Verify the profile is active:
sudo tuned-adm active
sudo tuned-adm verify
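The directory creation and copy steps above can be wrapped in a small helper that refuses to overwrite an existing custom profile. This is only a sketch. clone_profile, SRC_ROOT, and DST_ROOT are hypothetical names, and the default paths are the ones used in this guide.

```shell
# Copy a packaged profile's tuned.conf into a new custom profile directory.
# Run as root when writing to the default /etc/tuned location.
clone_profile() {
    local src_name="$1" dst_name="$2"
    local src_root="${SRC_ROOT:-/usr/lib/tuned/profiles}"
    local dst_root="${DST_ROOT:-/etc/tuned}"
    local src="$src_root/$src_name/tuned.conf"
    local dst_dir="$dst_root/$dst_name"

    if [ ! -f "$src" ]; then
        echo "source profile not found: $src" >&2
        return 1
    fi
    if [ -e "$dst_dir" ]; then
        echo "refusing to overwrite existing profile: $dst_dir" >&2
        return 1
    fi
    mkdir -p "$dst_dir" && cp "$src" "$dst_dir/tuned.conf"
}

# Usage:
#   clone_profile dgx-h100-performance my-custom-dgx-profile
```

After cloning, edit the new tuned.conf and activate the profile with tuned-adm profile as described in the steps above.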
Creating a Custom Profile from Scratch#
If you need to create a completely custom profile rather than modifying an existing one:
Create a directory for your new profile:
sudo mkdir -p /etc/tuned/my-dgx-custom
Create a new tuned.conf file:
sudo vi /etc/tuned/my-dgx-custom/tuned.conf
Add your profile configuration. Here’s an example that builds on NVIDIA base profiles:
[main]
include=nvidia-base
summary=Custom DGX performance profile for AI workloads
description=Optimized profile for training large language models
[bootloader]
# Enable IOMMU in passthrough mode for better performance
cmdline_iommu=iommu=pt
# Allocate hugepages for better memory performance
cmdline_hugepages=hugepagesz=2M hugepages=16384
# Disable CPU mitigations for maximum performance
cmdline_mitigations=mitigations=off
[sysctl]
# Network tuning for distributed training
net.core.rmem_max=268435456
net.core.wmem_max=268435456
net.core.rmem_default=67108864
net.core.wmem_default=67108864
net.ipv4.tcp_rmem=4096 87380 134217728
net.ipv4.tcp_wmem=4096 65536 134217728
# Memory management
vm.swappiness=10
vm.dirty_ratio=40
vm.dirty_background_ratio=10
# Disable NUMA balancing for better performance in GPU workloads
kernel.numa_balancing=0
[modules]
# Load nvidia driver with relaxed ordering enabled
nvidia=NVreg_EnablePCIERelaxedOrderingMode=1
Common profile sections and options:
[main] - Profile metadata (summary, description)
[cpu] - CPU-related settings (governor, energy policy)
[sysctl] - Kernel parameters
[bootloader] - Kernel boot parameters
[disk] - Storage settings (readahead, scheduler)
[vm] - Virtual memory settings
[service] - Service enable/disable directives
[script] - Custom scripts to run
Activate the new profile:
sudo tuned-adm profile my-dgx-custom
Verify the profile is active and properly applied:
sudo tuned-adm active
sudo tuned-adm verify
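The hugepages values in the example profiles reserve memory at boot, so it helps to work out how much memory a setting claims before applying it. The sketch below does that arithmetic. hugepages_gib is a hypothetical helper that takes the page count and the page size in MiB.

```shell
# Memory reserved by a hugepages setting, in GiB (integer division).
hugepages_gib() {
    local count="$1" size_mib="$2"
    echo $(( count * size_mib / 1024 ))
}

hugepages_gib 16384 2   # hugepagesz=2M hugepages=16384 -> prints 32
hugepages_gib 8192 2    # hugepagesz=2M hugepages=8192  -> prints 16
```

Memory reserved this way is not available to ordinary allocations, so compare the result with the system's total memory (for example, the MemTotal line in /proc/meminfo) when choosing a value.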
Using TuneD Profile Merging#
TuneD supports merging multiple profiles to combine their settings. This is useful when you want to combine the optimizations from multiple profiles without creating a completely new profile.
Applying Multiple Profiles#
You can apply multiple profiles at once, and TuneD will merge their configurations. Profiles are applied in order, with later profiles overriding settings from earlier ones.
Note
TuneD merges profiles automatically without validating the logical consistency of the combined settings. Carefully review the profiles you’re merging to avoid conflicting configurations. For example, combining a profile optimized for high throughput with one optimized for power saving could result in counterproductive settings.
To apply multiple profiles using the merge functionality:
sudo tuned-adm profile <profile1> <profile2> <profile3>
Example - combining NVIDIA base settings with architecture-specific optimizations:
sudo tuned-adm profile nvidia-base nvidia-x86-64-performance
Verify the merged profile is active:
sudo tuned-adm active
The output will show all active profiles separated by spaces.
Creating a Profile that Includes Other Profiles#
You can also create a custom profile that explicitly includes other profiles using the include directive:
Create a new custom profile directory:
sudo mkdir -p /etc/tuned/my-merged-profile
Create a tuned.conf file that includes other profiles:
sudo vi /etc/tuned/my-merged-profile/tuned.conf
Use the include directive to merge base profiles and add customizations. Separate the included profile names with commas, as the NVIDIA profiles do:
[main]
summary=Custom merged DGX profile
include=dgx-base,nvidia-x86-64-performance
[sysctl]
# Additional custom settings that extend the included profiles
vm.swappiness=5
net.core.netdev_max_backlog=5000
[cpu]
# Override CPU settings from included profiles
governor=performance
In this example:
The profile includes settings from both dgx-base and nvidia-x86-64-performance
Additional custom settings are layered on top
Settings defined in this profile will override those from included profiles
Activate the merged profile:
sudo tuned-adm profile my-merged-profile
Common Profile Merge Examples#
Example 1: Performance with security mitigations disabled
sudo tuned-adm profile dgx-h100-performance nvidia-no-mitigations
This combines the DGX H100 performance optimizations with CPU security mitigations disabled for maximum performance. Only use this in isolated, trusted environments.
Example 2: Base configuration with architecture-specific tuning
sudo tuned-adm profile nvidia-base nvidia-x86-64-performance
This creates a generic high-performance NVIDIA profile by combining:
nvidia-base: Core NVIDIA settings (CPU governor, network tuning, service management)
nvidia-x86-64-performance: Architecture-specific optimizations for x86_64 systems
After applying, verify with:
sudo tuned-adm active
# Output: Current active profile: nvidia-base nvidia-x86-64-performance
Example 3: Crashdump-enabled profile
sudo tuned-adm profile dgx-h100-crashdump
The crashdump profiles (like dgx-h100-crashdump) automatically include the base performance profile
and add crashdump configuration:
[main]
include=dgx-h100-performance,nvidia-crashdump-core
summary=TuneD Profile for DGX H100 with Crashdump Enabled
[bootloader]
cmdline_crashkernel=crashkernel=1G-:2048M
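The crashkernel=1G-:2048M setting reserves 2 GB (2048M) of memory for crashdump capture on systems with at least 1 GB of RAM, and the included nvidia-crashdump-core profile configures kernel panic behavior.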
Important
When merging profiles, settings from profiles listed later in the command override settings from earlier profiles. Plan your profile order accordingly.
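To reason about which value wins before merging, you can model the "later overrides earlier" rule on the profile files themselves. The following is a simplified sketch, not TuneD's actual merge logic: it does not follow include directives, and it only reports the last definition of a key across the files given, in order. effective_value is a hypothetical helper.

```shell
# Print the value a key would end up with if the given tuned.conf files
# were applied in order and the last definition won.
# Arguments: section name, key, then one or more tuned.conf paths.
effective_value() {
    local section="$1" key="$2"
    shift 2
    awk -v section="[$section]" -v key="$key" '
        FNR == 1 { current = "" }                       # reset at the start of each file
        /^[ \t]*\[/ { current = $0; gsub(/[ \t]/, "", current); next }
        /^[ \t]*#/ { next }                             # skip comment lines
        current == section {
            eq = index($0, "=")
            if (eq == 0) next
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { value = v; found = 1 }      # later definitions replace earlier ones
        }
        END { if (found) print value }
    ' "$@"
}

# Usage (paths are illustrative):
#   effective_value sysctl vm.swappiness \
#       /etc/tuned/my-custom-dgx-profile/tuned.conf /etc/tuned/my-merged-profile/tuned.conf
```

Listing the files in the same order you pass profile names to tuned-adm profile gives a quick preview of the likely outcome. Confirm the real result with tuned-adm verify after applying.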
Viewing Profile Contents#
To examine the contents of a profile before using it:
View system-provided profiles:
cat /usr/lib/tuned/profiles/<profile-name>/tuned.conf
For example, to view the dgx-h100-performance profile:
cat /usr/lib/tuned/profiles/dgx-h100-performance/tuned.conf
This will show:
[main]
include=nvidia-base,dgx-base
summary=TuneD Profile for DGX H100
[bootloader]
cmdline_iommu=iommu=pt
cmdline_console=console=tty0 console=ttyS0,115200n8
cmdline_pci=pci=realloc=off
View custom profiles:
cat /etc/tuned/<profile-name>/tuned.conf
Example: Examining Key Profiles#
nvidia-base profile - The foundation for most NVIDIA profiles:
[main]
summary=Base NVIDIA tuning configuration
[service]
service.docker=start,enable,file:/usr/lib/tuned/profiles/nvidia-base/docker-override.conf
service.nvidia-persistenced=start,enable,file:/usr/lib/tuned/profiles/nvidia-base/nvidia-persistenced-override.conf
[cpu]
governor=performance
[bootloader]
cmdline_init_on_alloc=init_on_alloc=0
[modules]
nvidia-peermem=+r opt1=noop
[sysctl]
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.default.arp_ignore = 1
dgx-a100-performance profile - Platform-specific configuration:
[main]
include=nvidia-base,dgx-base
summary=TuneD Profile for DGX A100
[bootloader]
cmdline_iommu=iommu=pt
cmdline_console=console=tty0 console=ttyS1,115200n8
[modules]
nvidia=NVreg_EnablePCIERelaxedOrderingMode=1
nvidia-crashdump-core profile - Crashdump configuration:
[sysctl]
kernel.panic_on_unrecovered_nmi = 1
kernel.unknown_nmi_panic = 1
kernel.hardlockup_panic = 1
kernel.panic_on_io_nmi = 1
kernel.softlockup_panic = 1
kernel.panic_on_oops = 1
kernel.hung_task_panic = 1
kernel.panic_on_rcu_stall = 1
kernel.panic = 30
Disabling TuneD#
If you need to disable TuneD and revert all tuning changes:
sudo tuned-adm off
sudo systemctl stop tuned
sudo systemctl disable tuned
To re-enable TuneD:
sudo systemctl enable tuned
sudo systemctl start tuned
sudo tuned-adm profile <profile-name>
Disabling Security Mitigations for Maximum Performance#
The nvidia-no-mitigations profile disables CPU security mitigations (such as Spectre and Meltdown
protections) to achieve maximum system performance. This section provides detailed instructions on when
and how to disable these mitigations safely.
Understanding Security Mitigations#
Many modern CPUs have hardware vulnerabilities (such as Spectre, Meltdown, L1TF, MDS, and others) that can enable side-channel attacks. The Linux kernel implements mitigations for these vulnerabilities, but these protections can reduce system performance by roughly 3-30%, depending on the workload.
Performance Impact:
Memory-intensive workloads: 5-10% overhead
System call-heavy workloads: 15-30% overhead
GPU compute workloads: 3-8% overhead (varies by operation)
Security Considerations:
Disabling mitigations should only be done in environments where:
Systems are physically isolated or on trusted networks
No untrusted code or containers are executed
Multi-tenant workloads are not running
Maximum performance is critical and security risks are understood and accepted
Danger
Disabling CPU security mitigations removes protections against known CPU vulnerabilities including Spectre, Meltdown, L1TF, MDS, TAA, and others. Only disable mitigations in trusted, isolated environments where you control all code execution.
Using the nvidia-no-mitigations Profile#
The nvidia-no-mitigations profile contains a simple configuration:
[main]
summary=NVIDIA no mitigations settings
[bootloader]
cmdline_mitigations=mitigations=off
This adds the mitigations=off kernel parameter, which disables all CPU vulnerability mitigations.
Method 1: Applying nvidia-no-mitigations with Your Platform Profile#
To combine your platform-specific profile with the no-mitigations setting:
Check your current active profile:
sudo tuned-adm active
Example output:
Current active profile: dgx-h100-performance
Apply your platform profile merged with nvidia-no-mitigations:
sudo tuned-adm profile dgx-h100-performance nvidia-no-mitigations
Replace dgx-h100-performance with your actual platform profile (for example, dgx-a100-performance or dgx-b200-performance).
Verify the profile is active:
sudo tuned-adm active
Output should show:
Current active profile: dgx-h100-performance nvidia-no-mitigations
Reboot the system for the kernel command line changes to take effect:
sudo reboot
After reboot, verify the mitigations are disabled:
cat /proc/cmdline | grep mitigations
You should see mitigations=off in the output.
Check the current mitigation status:
grep . /sys/devices/system/cpu/vulnerabilities/*
Entries for vulnerabilities that affect your CPU should report a Vulnerable status rather than a Mitigation status, indicating that mitigations are not active.
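A plain grep for "mitigations" also matches values such as mitigations=auto, so a scripted check is safer when it compares whole parameters. This is a minimal sketch. cmdline_has is a hypothetical helper that takes the command line as a string, so it can be tested without rebooting.

```shell
# Succeed if a kernel command line string contains the exact parameter given.
cmdline_has() {
    # Pad with spaces so the parameter must match as a whole word.
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage on a live system:
#   cmdline_has "$(cat /proc/cmdline)" mitigations=off && echo "mitigations=off is set"
```

The same helper works for any other parameter added through a [bootloader] section, such as iommu=pt.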
Method 2: Creating a Custom Profile with Mitigations Disabled#
If you want to create a permanent custom profile that includes your platform settings and disabled mitigations:
Create a custom profile directory:
sudo mkdir -p /etc/tuned/dgx-h100-no-mitigations
Create the profile configuration:
sudo vi /etc/tuned/dgx-h100-no-mitigations/tuned.conf
Add the following configuration:
[main]
include=dgx-h100-performance
summary=DGX H100 Performance with Security Mitigations Disabled
[bootloader]
cmdline_mitigations=mitigations=off
This profile inherits all settings from dgx-h100-performance and adds the mitigations=off parameter.
Activate the custom profile:
sudo tuned-adm profile dgx-h100-no-mitigations
Verify the profile is active:
sudo tuned-adm active
sudo tuned-adm verify
Reboot the system:
sudo reboot
After reboot, verify mitigations are disabled as shown in Method 1.
Method 3: Selective Mitigation Disabling#
If you want more granular control, you can disable specific mitigations instead of all of them:
Create a custom profile:
sudo mkdir -p /etc/tuned/dgx-h100-selective-mitigations
sudo vi /etc/tuned/dgx-h100-selective-mitigations/tuned.conf
Configure selective mitigations:
[main]
include=dgx-h100-performance
summary=DGX H100 with Selective Mitigations
[bootloader]
# Disable specific mitigations individually
cmdline_spectre_v2=spectre_v2=off
cmdline_spec_store_bypass=spec_store_bypass_disable=off
cmdline_l1tf=l1tf=off
cmdline_mds=mds=off
cmdline_tsx_async_abort=tsx_async_abort=off
cmdline_kpti=nopti
Available options for selective disablement:
spectre_v2=off - Disable Spectre Variant 2 mitigations
spec_store_bypass_disable=off - Disable Spectre Variant 4 (Speculative Store Bypass) mitigations
l1tf=off - Disable L1 Terminal Fault mitigations
mds=off - Disable Microarchitectural Data Sampling mitigations
tsx_async_abort=off - Disable TSX Asynchronous Abort (TAA) mitigations
nopti - Disable Page Table Isolation (Meltdown mitigation)
Activate the profile and reboot:
sudo tuned-adm profile dgx-h100-selective-mitigations
sudo reboot
Verifying Mitigation Status#
After disabling mitigations and rebooting, verify the configuration:
Check kernel command line:
cat /proc/cmdline
Look for mitigations=off or your specific mitigation parameters.
Check CPU vulnerability status:
grep . /sys/devices/system/cpu/vulnerabilities/*
Example output with mitigations disabled:
/sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected
/sys/devices/system/cpu/vulnerabilities/l1tf:Vulnerable
/sys/devices/system/cpu/vulnerabilities/mds:Vulnerable; SMT vulnerable
/sys/devices/system/cpu/vulnerabilities/meltdown:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected
The Vulnerable status indicates mitigations are disabled.
Compare with enabled mitigations (for reference):
Example output with mitigations enabled:
/sys/devices/system/cpu/vulnerabilities/l1tf:Mitigation: PTE Inversion
/sys/devices/system/cpu/vulnerabilities/mds:Mitigation: Clear buffers
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Mitigation: SSB disabled
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: usercopy/swapgs barriers
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Enhanced IBRS
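When checking many nodes, a one-line summary is easier to compare than the full listing. The sketch below counts entries by the status prefixes shown in the example output above (Vulnerable, Mitigation, Not affected). summarize_vulnerabilities is a hypothetical helper, and the directory argument exists only so the function can be pointed at test data.

```shell
# Count vulnerability entries by status in the given directory.
summarize_vulnerabilities() {
    local dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    local file status
    local vulnerable=0 mitigated=0 unaffected=0
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        status="$(cat "$file")"
        case "$status" in
            Vulnerable*)     vulnerable=$((vulnerable + 1)) ;;
            Mitigation*)     mitigated=$((mitigated + 1)) ;;
            "Not affected"*) unaffected=$((unaffected + 1)) ;;
        esac
    done
    echo "vulnerable=$vulnerable mitigated=$mitigated not_affected=$unaffected"
}

# Usage on a live system:
#   summarize_vulnerabilities
```

With mitigations=off in effect you would expect the mitigated count to be zero or close to it. A nonzero vulnerable count on a system that should be protected is a signal to recheck the active profile and kernel command line.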
Re-enabling Security Mitigations#
If you need to re-enable security mitigations:
Method 1: Switch back to standard profile#
sudo tuned-adm profile dgx-h100-performance
sudo reboot
Method 2: Remove nvidia-no-mitigations from merged profile#
If you were using a merged profile:
# Instead of: dgx-h100-performance nvidia-no-mitigations
# Use:
sudo tuned-adm profile dgx-h100-performance
sudo reboot
Method 3: Delete custom profile#
If you created a custom profile:
sudo tuned-adm profile dgx-h100-performance
sudo rm -rf /etc/tuned/dgx-h100-no-mitigations
sudo reboot
After rebooting, verify mitigations are enabled:
grep . /sys/devices/system/cpu/vulnerabilities/*
You should see Mitigation entries instead of Vulnerable status.
Performance Testing Recommendations#
When disabling mitigations, measure the actual performance impact for your specific workload:
Benchmark with mitigations enabled (baseline):
sudo tuned-adm profile dgx-h100-performance
sudo reboot
# Run your performance benchmarks and record results
Benchmark with mitigations disabled:
sudo tuned-adm profile dgx-h100-performance nvidia-no-mitigations
sudo reboot
# Run the same benchmarks and compare results
Calculate the performance improvement:
If improvement is < 5%, consider keeping mitigations enabled for security
If improvement is > 10%, the trade-off may be worthwhile in trusted environments
Document your findings and revisit the decision periodically
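The comparison in the last step can be scripted so the same rule is applied to every benchmark. This is a sketch that assumes a throughput-style metric where higher is better. Both function names are hypothetical, and the wording for results between 5% and 10% is an assumption, since the guidance above does not cover that range.

```shell
# Percent improvement of the new result over the baseline (higher is better).
improvement_pct() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# Apply the thresholds from the guidance above to a percentage.
recommend() {
    awk -v pct="$1" 'BEGIN {
        if (pct < 5)       print "keep mitigations enabled"
        else if (pct > 10) print "may be worthwhile in trusted environments"
        else               print "borderline: weigh against your security requirements"
    }'
}

# Example with made-up benchmark numbers (samples per second):
improvement_pct 1000 1120               # prints 12.0
recommend "$(improvement_pct 1000 1120)"
```

For latency-style metrics where lower is better, swap the two arguments or invert the formula before applying the thresholds.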
Practical Use Case Scenarios#
Here are some common scenarios and recommended profile configurations.
Scenario 1: Standard Production DGX H100 System#
For most production workloads, use the default platform profile:
sudo tuned-adm profile dgx-h100-performance
This profile includes all necessary optimizations from nvidia-base and dgx-base.
Scenario 2: High-Performance GPU Training#
For GPU training workloads requiring maximum performance, use the default platform profile:
sudo tuned-adm profile dgx-h100-performance
This profile provides platform-specific optimizations including IOMMU settings, console configuration, and PCI settings optimized for DGX H100 systems.
To verify the profile is active:
sudo tuned-adm active
Output: Current active profile: dgx-h100-performance
Scenario 3: High-Performance Inference Server#
Create a custom profile optimized for inference workloads with low latency requirements:
sudo mkdir -p /etc/tuned/dgx-inference
sudo vi /etc/tuned/dgx-inference/tuned.conf
[main]
include=dgx-h100-performance
summary=Optimized for inference workloads
[bootloader]
# Isolate CPUs for inference processes (adjust based on your CPU count)
cmdline_isolcpus=isolcpus=8-63
# Allocate hugepages for better memory access
cmdline_hugepages=hugepagesz=2M hugepages=8192
[sysctl]
# Minimize latency
vm.swappiness=1
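# Optimize for response time
# (on kernel 5.13 and later these scheduler tunables moved from sysctl to debugfs;
# confirm they exist under /proc/sys/kernel/ on your system before relying on them)
kernel.sched_latency_ns=1000000
kernel.sched_min_granularity_ns=100000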
Then activate:
sudo tuned-adm profile dgx-inference
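Because the isolcpus range must be adjusted to your CPU count, it is worth checking how many CPUs a given list isolates and how many remain for the operating system. The sketch below counts the CPUs in a list made of single IDs and ranges. count_cpus is a hypothetical helper and does not handle the more advanced list syntax the kernel accepts.

```shell
# Count the CPUs named by a list such as "8-63" or "2,4-7".
count_cpus() {
    local total=0 part start end
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        case "$part" in
            *-*)
                start="${part%-*}"
                end="${part#*-}"
                total=$(( total + end - start + 1 ))
                ;;
            *)
                total=$(( total + 1 ))
                ;;
        esac
    done
    echo "$total"
}

count_cpus 8-63    # prints 56: CPUs 0-7 remain for general scheduling
```

If the result is close to or exceeds the number of CPUs in the system, widen the set left to the operating system before applying the profile.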
Scenario 4: System with Crashdump Debugging Required#
When you need to capture crash information for debugging:
sudo tuned-adm profile dgx-h100-crashdump
This automatically configures crashkernel memory reservation and panic behavior.
Scenario 5: Multi-Node Training Cluster with Network Optimization#
For distributed training across multiple nodes requiring network tuning:
sudo mkdir -p /etc/tuned/dgx-distributed-training
sudo vi /etc/tuned/dgx-distributed-training/tuned.conf
[main]
include=dgx-h100-performance
summary=Optimized for multi-node distributed training
[sysctl]
# Network buffer tuning for high-throughput connections
net.core.rmem_max=536870912
net.core.wmem_max=536870912
net.core.rmem_default=134217728
net.core.wmem_default=134217728
net.ipv4.tcp_rmem=4096 87380 268435456
net.ipv4.tcp_wmem=4096 65536 268435456
# Increase connection backlog
net.core.netdev_max_backlog=10000
net.ipv4.tcp_max_syn_backlog=8192
# TCP tuning for high-speed networks
net.ipv4.tcp_congestion_control=bbr
net.core.default_qdisc=fq
Then activate:
sudo tuned-adm profile dgx-distributed-training
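The buffer sizes in this profile are given in bytes, which makes them hard to compare at a glance. A quick conversion, shown here as a sketch with a hypothetical bytes_to_mib helper, puts them in more familiar units.

```shell
# Convert a byte count to MiB (integer division).
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

bytes_to_mib 536870912   # rmem_max / wmem_max          -> prints 512
bytes_to_mib 134217728   # rmem_default / wmem_default  -> prints 128
bytes_to_mib 268435456   # tcp_rmem / tcp_wmem maximum  -> prints 256
```

These are upper bounds and defaults per socket, so the memory actually consumed depends on how many connections are open and how much data is in flight.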
Best Practices#
Always create custom profiles in /etc/tuned/ to avoid conflicts with package updates
Test custom profiles on non-production systems before deploying to production
Use tuned-adm verify to ensure profiles are correctly applied
Document any custom settings and the reasons for the changes
For platform-specific systems, start with the appropriate NVIDIA profile and customize as needed
Use profile merging when you want to combine features from multiple profiles
Monitor system performance after applying or changing profiles to ensure desired results
Remember that bootloader changes (in the [bootloader] section) require a system reboot to take effect
Use tuned-adm profile_info <profile-name> to see detailed information about what a profile does
Note
After installing NVIDIA BaseOS, the appropriate dgx-<platform>-performance profile
should be automatically activated. Verify this with sudo tuned-adm active during initial setup.
Additional Resources#
For more information about TuneD configuration options and advanced usage, refer to:
TuneD documentation: man tuned and man tuned-adm
TuneD configuration guide: man tuned.conf
/usr/share/doc/tuned/ directory for additional documentation