The following reference deployment guide (RDG) demonstrates a complete deployment of the Red Hat OpenStack Platform 13 for media streaming applications with NVIDIA SmartNIC hardware offload capabilities.
We'll explain the setup components, scale considerations and other aspects such as the hardware BoM (Bill of Materials) and time synchronization, as well as streaming application compliance testing in the cloud.
Before we start, it is highly recommended to become familiar with the key technology features of this deployment guide:
- Rivermax video streaming library
- ASAP² - Accelerated Switching and Packet Processing
Visit the product pages in the links below to learn more about these features and their capabilities:
NVIDIA Rivermax Video Streaming Library
NVIDIA Accelerated Switching and Packet Processing
Created on Sep 12, 2019
Introduction
More and more media and entertainment (M&E) solution providers are moving their proprietary legacy video solutions to next-generation IP-based infrastructures to meet the increasing global demand for ultra-high-definition video content. Broadcast providers are looking into cloud-based solutions to offer better scalability and flexibility, which introduces new challenges such as multi-tenant, high-quality streaming at scale and time synchronization in the cloud.
Red Hat OpenStack Platform (OSP) is a cloud computing platform that enables the creation, deployment, scale, and management of a secure and reliable public or private OpenStack-based cloud. This production-ready platform offers a tight integration with NVIDIA products and technologies and is used in this guide to demonstrate a full deployment of the "NVIDIA Media Cloud".
NVIDIA Media Cloud is a solution that includes the NVIDIA Rivermax library for Packet Pacing, Kernel Bypass and Packet Aggregation, along with cloud time synchronization and OVS HW offload to the NVIDIA SmartNIC using the Accelerated Switching And Packet Processing (ASAP²) framework.
By the end of this guide you will be able to run offloaded HD media streams between VMs in different racks and validate that they comply with the SMPTE 2110 standards, while using commodity switches, servers and NICs.
Info: All configuration files, as well as the QCOW VM image, are located here:
References
- NVIDIA Rivermax product page
- NVIDIA VMA
- Using Composable Networks - Red Hat Customer Portal
- Performance evaluation of OVS offload using NVIDIA Accelerated Switching And Packet Processing (ASAP2) technology in a RedHat OSP 13 OpenStack environment.
- Rivermax Linux Performance Tuning Guide
- Setting PTP on Onyx Switch
- BBC R&D Blogs:
NVIDIA Components
- NVIDIA Rivermax implements an optimized software library API for media streaming applications. It runs on NVIDIA ConnectX®-5 network adapters or later, enabling the use of commercial off-the-shelf (COTS) servers for HD to Ultra HD flows. The combination of Rivermax and ConnectX-5 adapter cards complies with the SMPTE 2110-21 standard, reduces CPU utilization for video data streaming, and removes bottlenecks for the highest throughput.
- NVIDIA Accelerated Switching and Packet Processing (ASAP²) is a framework for offloading network data planes into SmartNIC hardware, such as Open vSwitch (OVS) offload, which enables a performance boost of up to 10x with significant CPU load reduction. ASAP² is available on ConnectX-4 Lx and later adapters.
- NVIDIA Spectrum Switch family provides the most efficient network solutions for the ever-increasing performance demands of data center applications.
- NVIDIA ConnectX Network Adapter family delivers industry-leading connectivity for performance-driven server and storage applications. ConnectX adapter cards provide high bandwidth coupled with ultra-low latency for diverse applications and systems, resulting in faster access and real-time responses.
- NVIDIA LinkX Cables and Transceivers family provides the industry's most complete line of 10, 25, 40, 50, 100, 200, and 400Gb/s interconnect products for Cloud, Web 2.0, Enterprise, telco, and storage data center applications. They are often used to link top-of-rack switches down to servers, storage and appliances, and upwards in switch-to-switch applications.
Solution Setup Overview
Below is a list of all the different components in this solution and how they are utilized:
Cloud Platform
RH-OSP 13 will be deployed at scale and utilized as the cloud platform.
Compute Nodes
The compute nodes will be configured and deployed as “Media Compute Nodes”, adjusted for low latency virtual media applications. Each Compute/Controller node is equipped with a dual-port 100GbE NIC, of which one port is dedicated to the VXLAN tenant network and the other to the VLAN multicast tenant network, Storage, Control, and PTP time synchronization.
Packet Pacing is enabled on the NIC ports specifically to allow MC (MultiCast) Pacing on the VLAN Tenant Network.
Network
The different network components used in this user guide are configured in the following way:
- Multiple racks interconnected via Spine/Leaf network architecture
- Composable routed provider networks are used per rack
- Compute nodes on different provider network segments will host local DHCP agent instances per OpenStack subnet segment.
- L3 OSPF underlay is used to route between the provider routed networks (another fabric-wide IGP can be used as desired)
- Multicast
- VLAN tenant network is used for Multicast media traffic and will utilize SR-IOV on the VM
- IP-PIM (Sparse Mode) is used for routing the tenant Multicast streams between the racks which are located in routed provider networks
- IGMP snooping is used to manage the tenant multicast groups in the same L2 racks domain
- Unicast
- ASAP²-enabled Compute nodes are located in different racks and maintain VXLAN tunnels as overlay for tenant VM traffic
- The VXLAN tenant network is used for Unicast media traffic and will utilize ASAP² to offload the CPU-intensive VXLAN traffic, in order to avoid the encapsulation/decapsulation performance penalty and achieve optimum throughput (a verification sketch is provided after this list)
- OpenStack Neutron is used as the SDN controller. All network configuration for every OpenStack node will be done via OpenStack orchestration
- RHOSP inbox drivers are used on all infrastructure components except for VM guests
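For reference, the hardware offload path described above can be spot-checked on a deployed compute node. The commands below are a minimal verification sketch and are not part of the formal deployment flow; the PCI address and interface name follow the example values used later in this guide.

#!/bin/bash
# Sketch: confirm OVS hardware offload (ASAP2) is active on a compute node.

# OVS should report hw-offload=true (set by OvsHwOffload: True during deployment)
ovs-vsctl get Open_vSwitch . other_config:hw-offload

# The PF used for the VXLAN tenant network should be in switchdev mode
# (0000:08:00.0 corresponds to ens1f0 on this guide's compute servers)
devlink dev eswitch show pci/0000:08:00.0

# With traffic running between VMs, flows offloaded to the NIC can be listed
ovs-appctl dpctl/dump-flows type=offloaded | head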
Time Synchronization
Time Synchronization will be configured in the following way:
- linuxptp tools are used on compute nodes and application VMs
- PTP traffic is untagged on the compute nodes
- Onyx Switches propagate the time between the compute nodes and act as PTP Boundary Clock devices
- One of the switches is used as PTP master clock (in real-life deployments a dedicated grand master should be used)
- KVM virtual PTP driver is used by the VMs to pull the PTP time from their hosting hypervisor which is synced to the PTP clock source
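To illustrate the last point, the snippet below shows how a guest could consume the KVM virtual PTP clock. It is a sketch only: it assumes the guest has the linuxptp package and the ptp_kvm kernel module available, and that the virtual PHC is exposed as /dev/ptp0.

#!/bin/bash
# Sketch: sync a guest's system clock from the KVM virtual PTP device exposed by the host.

modprobe ptp_kvm                      # expose the host-backed PHC inside the guest
cat /sys/class/ptp/ptp0/clock_name    # should report "KVM virtual PTP"

# Discipline the guest system clock directly from the virtual PHC
phc2sys -s /dev/ptp0 -O 0 -m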
Media Application
NVIDIA provides a Rivermax VM cloud image which includes all Rivermax tools and applications. The Rivermax VM provides a demonstration of the media test application and allows the user to validate compliance with the relevant media standards, i.e., SMPTE 2110 (an evaluation license is required).
Solution Components
Solution General Design
Solution Multicast Design
Cloud Media Application Design
VXLAN HW Offload Overview
Large Scale Overview
HW Configuration
Bill of Materials (BoM)
Solution Example
We chose the key features below as a baseline to demonstrate the solution used in this RDG.
Info: The solution example below does not contain redundancy configuration.
Solution Scale
- 2 x racks with a dedicated provider network set per rack
- 1 x SN2700 switch as Spine switch
- 2 x SN2100 switches as Leaf switches, 1 per rack
- 5 nodes in rack 1 (3 x Controller, 2 x Compute)
- 2 nodes in rack 2 (2 x Compute)
- All nodes are connected to the Leaf switches using 2 x 100GbE ports per node
- Leaf switches are connected to each Spine switch using a single 100GbE port
Physical Rack Diagram
In this RDG we placed all the equipment into the same rack, but the wiring and configuration simulate a two-rack network setup.
PTP Diagram
Info: One of the Onyx Leaf switches is used as the PTP clock source (Grandmaster) instead of a dedicated device.
Solution Networking
Network Diagram
Info: Compute nodes access the External Network/Internet through the undercloud node, which functions as a router.
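For completeness, the sketch below shows one way such routing could be enabled on the undercloud host; it is an assumption-based example (the interface names eth0 and br-ctlplane are placeholders) and not part of the official director installation.

#!/bin/bash
# Sketch: let overcloud nodes reach the Internet through the undercloud host.
sysctl -w net.ipv4.ip_forward=1                       # enable routing on the undercloud
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE  # NAT traffic leaving via the external NIC
iptables -A FORWARD -i br-ctlplane -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o br-ctlplane -m state --state RELATED,ESTABLISHED -j ACCEPT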
Network Physical Configuration
Warning: The configuration steps below refer to a solution example based on 2 racks.
Network Configuration
Below is a detailed step-by-step description of the network configuration:
Physical Configuration
- Connect the switches to the switch management network
- Interconnect the switches using 100Gb/s cables
- Connect the Controller/Compute servers to the relevant networks according to the following diagrams:
- Connect the Undercloud Director server to the IPMI, PXE and External networks.
Switch Profile Configuration
MC Max Profile must be set on all switches. This will remove existing configurations and will require a reboot.
Warning: Back up your switch configuration if you plan to use it later.
Run the command on all switches:
system profile eth-ipv4-mc-max
show system profile
Switch Interface Configuration
Set the VLANs and VLAN interfaces on the Leaf switches according to the following:
Network Name | Network Set | Leaf Switch Location | Network Details | Switch Interface IP | VLAN ID | Switch Physical Port | Switchport Mode | Note |
---|---|---|---|---|---|---|---|---|
Storage | 1 | Rack 1 | 172.16.0.0 / 24 | 172.16.0.1 | 11 | A | hybrid | |
Storage_Mgmt | 172.17.0.0 / 24 | 172.17.0.1 | 21 | A | hybrid | |||
Internal API | 172.18.0.0 / 24 | 172.18.0.1 | 31 | A | hybrid | |||
PTP | 172.20.0.0 /24 | 172.20.0.1 | 51 | A | hybrid | access vlan | ||
MC_Tenant_VLAN | 11.11.11.0/24 | 11.11.11.1 | 101 | A | hybrid | |||
Tenant_VXLAN | 172.19.0.0 / 24 | 172.19.0.1 | 41 | B | access | |||
Storage_2 | 2 | Rack 2 | 172.16.2.0 / 24 | 172.16.2.1 | 12 | A | hybrid | |
Storage_Mgmt_2 | 172.17.2.0 / 24 | 172.17.2.1 | 22 | A | hybrid | |||
Internal API _2 | 172.18.2.0 /24 | 172.18.2.1 | 32 | A | hybrid | |||
PTP_2 | 172.20.2.0/24 | 172.20.2.1 | 52 | A | hybrid | access vlan | ||
MC_Tenant_VLAN | 22.22.22.0/24 | 22.22.22.1 | 101 | A | hybrid | |||
Tenant_VXLAN_2 | 172.19.2.0 /24 | 172.19.2.1 | 42 | B | access |
Rack 1 Leaf switch VLAN Diagram | Rack 2 Leaf switch VLAN Diagram
Switch Full Configuration
Spine (SW-09) configuration:
##
## STP configuration
##
no spanning-tree

##
## L3 configuration
##
ip routing
interface ethernet 1/1-1/2 no switchport force
interface ethernet 1/1 ip address 192.168.119.9/24 primary
interface ethernet 1/2 ip address 192.168.109.9/24 primary
interface loopback 0 ip address 1.1.1.9/32 primary

##
## LLDP configuration
##
lldp

##
## OSPF configuration
##
protocol ospf
router ospf router-id 1.1.1.9
interface ethernet 1/1 ip ospf area 0.0.0.0
interface ethernet 1/2 ip ospf area 0.0.0.0
interface ethernet 1/1 ip ospf network broadcast
interface ethernet 1/2 ip ospf network broadcast
router ospf redistribute direct

##
## IP Multicast router configuration
##
ip multicast-routing

##
## PIM configuration
##
protocol pim
interface ethernet 1/1 ip pim sparse-mode
interface ethernet 1/2 ip pim sparse-mode
ip pim multipath next-hop s-g-hash
ip pim rp-address 1.1.1.9

##
## IGMP configuration
##
interface ethernet 1/1 ip igmp immediate-leave
interface ethernet 1/2 ip igmp immediate-leave

##
## Network management configuration
##
ntp disable

##
## PTP protocol
##
protocol ptp
ptp vrf default enable
interface ethernet 1/1 ptp enable
interface ethernet 1/2 ptp enable
Leaf Rack 1 (SW-11) configuration:
##
## STP configuration
##
no spanning-tree

##
## LLDP configuration
##
lldp

##
## VLAN configuration
##
vlan 11
name storage
exit
vlan 21
name storage_mgmt
exit
vlan 31
name internal_api
exit
vlan 41
name tenant_vxlan
exit
vlan 51
name ptp
exit
vlan 101
name tenant_vlan_mc
exit
interface ethernet 1/1-1/5 switchport access vlan 41
interface ethernet 1/11-1/15 switchport mode hybrid
interface ethernet 1/11-1/15 switchport access vlan 51
interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 11
interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 21
interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 31
interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 101

##
## IGMP Snooping configuration
##
ip igmp snooping unregistered multicast forward-to-mrouter-ports
ip igmp snooping
vlan 51 ip igmp snooping
vlan 101 ip igmp snooping
vlan 51 ip igmp snooping querier
vlan 101 ip igmp snooping querier
interface ethernet 1/11-1/15 ip igmp snooping fast-leave

##
## L3 configuration
##
ip routing
interface ethernet 1/9 no switchport force
interface ethernet 1/9 ip address 192.168.119.11/24 primary
interface vlan 11 ip address 172.16.0.1 255.255.255.0
interface vlan 21 ip address 172.17.0.1 255.255.255.0
interface vlan 31 ip address 172.18.0.1 255.255.255.0
interface vlan 41 ip address 172.19.0.1 255.255.255.0
interface vlan 51 ip address 172.20.0.1 255.255.255.0
interface vlan 101 ip address 11.11.11.1 255.255.255.0

##
## OSPF configuration
##
protocol ospf
router ospf router-id 1.1.1.11
interface ethernet 1/9 ip ospf area 0.0.0.0
interface ethernet 1/9 ip ospf network broadcast
router ospf redistribute direct

##
## IP Multicast router configuration
##
ip multicast-routing

##
## PIM configuration
##
protocol pim
interface ethernet 1/9 ip pim sparse-mode
ip pim multipath next-hop s-g-hash
interface vlan 101 ip pim sparse-mode
ip pim rp-address 1.1.1.9

##
## IGMP configuration
##
interface ethernet 1/9 ip igmp immediate-leave
interface vlan 101 ip igmp immediate-leave

##
## Network management configuration
##
ntp disable

##
## PTP protocol
##
protocol ptp
ptp vrf default enable
ptp priority1 1
interface ethernet 1/9 ptp enable
interface ethernet 1/11-1/15 ptp enable
interface vlan 51 ptp enable
Leaf Rack 2 (SW-10) configuration:
##
## STP configuration
##
no spanning-tree

##
## LLDP configuration
##
lldp

##
## VLAN configuration
##
vlan 12
name storage
exit
vlan 22
name storage_mgmt
exit
vlan 32
name internal_api
exit
vlan 42
name tenant_vxlan
exit
vlan 52
name ptp
exit
vlan 101
name tenant_vlan_mc
exit
interface ethernet 1/1-1/2 switchport access vlan 42
interface ethernet 1/11-1/12 switchport mode hybrid
interface ethernet 1/11-1/12 switchport access vlan 52
interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 12
interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 22
interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 32
interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 101

##
## IGMP Snooping configuration
##
ip igmp snooping unregistered multicast forward-to-mrouter-ports
ip igmp snooping
vlan 52 ip igmp snooping
vlan 52 ip igmp snooping querier
vlan 101 ip igmp snooping
vlan 101 ip igmp snooping querier
interface ethernet 1/11-1/12 ip igmp snooping fast-leave

##
## L3 configuration
##
ip routing
interface ethernet 1/9 no switchport force
interface ethernet 1/9 ip address 192.168.109.10/24 primary
interface vlan 12 ip address 172.16.2.1 255.255.255.0
interface vlan 22 ip address 172.17.2.1 255.255.255.0
interface vlan 32 ip address 172.18.2.1 255.255.255.0
interface vlan 42 ip address 172.19.2.1 255.255.255.0
interface vlan 52 ip address 172.20.2.1 255.255.255.0
interface vlan 101 ip address 22.22.22.1 255.255.255.0

##
## OSPF configuration
##
protocol ospf
router ospf router-id 2.2.2.10
interface ethernet 1/9 ip ospf area 0.0.0.0
interface ethernet 1/9 ip ospf network broadcast
router ospf redistribute direct

##
## IP Multicast router configuration
##
ip multicast-routing

##
## PIM configuration
##
protocol pim
interface ethernet 1/9 ip pim sparse-mode
ip pim multipath next-hop s-g-hash
interface vlan 101 ip pim sparse-mode
ip pim rp-address 1.1.1.9

##
## IGMP configuration
##
interface ethernet 1/9 ip igmp immediate-leave
interface vlan 101 ip igmp immediate-leave

##
## Network management configuration
##
ntp disable

##
## PTP protocol
##
protocol ptp
ptp vrf default enable
interface ethernet 1/9 ptp enable
interface ethernet 1/11-1/12 ptp enable
interface vlan 52 ptp enable
Solution Configuration and Deployment
The following information will take you through the configuration and deployment steps of the solution.
Prerequisites
Make sure that the hardware specifications are identical for servers with the same role (Compute/Controller/etc.)
Server Preparation - BIOS
Make sure that for all servers:
- The network boot is set on the interface connected to the PXE network.
- Virtualization and SRIOV are enabled.
Make sure that for Compute servers:
- The network boot is set on the interface connected to the PXE network
- Virtualization and SRIOV are enabled
- Power Profile is at "Maximum Performance"
- HyperThreading is disabled
- C-state is disabled
- Turbo Mode is disabled
- Collaborative Power Control is disabled
- Processor Power and Utilization Monitoring (Ctrl+A) are disabled
NIC Preparation
SRIOV configuration is disabled by default on ConnectX-5 NICs and must be enabled for every NIC used by a Compute node.
To enable and configure SRIOV, insert the Compute NIC into a test server with an OS installed, and follow the below steps:
Run the following to verify that the firmware version is 16.21.2030 or later:
[root@host ~]# ethtool -i ens2f0
driver: mlx5_core
version: 5.0-0
firmware-version: 16.22.1002 (MT_0000000009)
expansion-rom-version:
bus-info: 0000:07:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes
If the firmware version is older, download and burn the new firmware as described in How to Install Mellanox OFED on Linux (Rev 4.4-2.0.7.0).
Install the mstflint package:
[root@host ~]# yum install mstflint
Identify the PCI ID of the first 100G port and enable SRIOV:
[root@host ~]# lspci | grep -i mel
07:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
07:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
[root@host ~]#
[root@host ~]# mstconfig -d 0000:07:00.0 query | grep -i sriov
        SRIOV_EN                    False(0)
        SRIOV_IB_ROUTING_MODE_P1    GID(0)
        SRIOV_IB_ROUTING_MODE_P2    GID(0)
[root@host ~]# mstconfig -d 0000:07:00.0 set SRIOV_EN=1

Device #1:
----------
Device type:    ConnectX5
PCI device:     0000:07:00.0

Configurations:             Next Boot    New
        SRIOV_EN            False(0)     True(1)

 Apply new Configuration? ? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
Set the number of VFs to a high value, such as 64, and reboot the server to apply the new configuration:
[root@host ~]# mstconfig -d 0000:07:00.0 query | grep -i vfs
        NUM_OF_VFS          0
[root@host ~]# mstconfig -d 0000:07:00.0 set NUM_OF_VFS=64

Device #1:
----------
Device type:    ConnectX5
PCI device:     0000:07:00.0

Configurations:             Next Boot    New
        NUM_OF_VFS          0            64

 Apply new Configuration? ? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
[root@host ~]# reboot
- Confirm the new settings were applied using the mstconfig query commands shown above (a convenience loop is sketched after this list).
- Insert the NIC back to the Compute node.
- Repeat the procedure above for every Compute node NIC used in our setup.
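The loop below is a convenience sketch for confirming both settings on every Mellanox device found on the host; it simply reuses the lspci and mstconfig commands shown above.

#!/bin/bash
# Sketch: verify SRIOV_EN and NUM_OF_VFS on all Mellanox PCI functions in one pass.
for dev in $(lspci -D | grep -i mellanox | awk '{print $1}' | sort -u); do
    echo "== $dev =="
    mstconfig -d "$dev" query | grep -E 'SRIOV_EN|NUM_OF_VFS'
done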
Accelerated RH-OSP Installation and Deployment
The following steps will take you through the accelerated RH-OSP installation and deployment procedure:
- Install the Red Hat 7.6 OS on the Undercloud server and set an IP address on the interface connected to the external network; make sure it has Internet connectivity.
- Install the Undercloud and the director as instructed in section 4 of the Red Hat OSP Director Installation and Usage - Red Hat Customer Portal. Our undercloud.conf file is attached as a reference here: Configuration Files
Configure a container image source as instructed in section 5 of the above guide. Our solution uses the undercloud as a local registry.
Note: The following overcloud image versions are used in our deployment:
rhosp-director-images-13.0-20190418.1.el7ost.noarch
rhosp-director-images-ipa-13.0-20190418.1.el7ost.noarch
rhosp-director-images-ipa-x86_64-13.0-20190418.1.el7ost.noarch
rhosp-director-images-x86_64-13.0-20190418.1.el7ost.noarch
The overcloud image is RH 7.6 with kernel 3.10.0-957.10.1.el7.x86_64.
Register the nodes of the overcloud as instructed in section 6.1. Our instackenv.json file is attached as a reference.
Inspect the hardware of the nodes as instructed in section 6.2.
Once introspection is completed, it is recommended to confirm for each node that the desired root disk was detected, since cloud deployment can fail later because of insufficient disk space. Use the following command to check the free space on the detected disk selected as root:
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node show 92c4c1cb-ce7d-48d4-a2d9-75b2651db097 | grep properties
| properties | {u'memory_mb': u'131072', u'cpu_arch': u'x86_64', u'local_gb': u'418', u'cpus': u'24', u'capabilities': u'boot_option:local'}
The “local_gb” value represents the disk size. If the disk size is lower than expected, use the procedure described in section 6.6 for defining the root disk for the node. Note that an additional introspection cycle is required for this node after the root disk is changed (a sketch is provided below).
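As a rough illustration of that procedure, a root device hint can be set on the node and introspection re-run. The commands below are a sketch: the node name and the serial number are placeholders, and hints by size or WWN are equally valid per the Red Hat documentation referenced above.

# Sketch: point the node at the intended root disk, then re-run introspection for it.
openstack baremetal node set compute-1 \
    --property root_device='{"serial": "<disk-serial-number>"}'
openstack baremetal node manage compute-1
openstack overcloud node introspect compute-1 --provide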
Verify that all nodes were registered properly and changed their state to “available” before proceeding to the next step:
Code Block language text theme RDark +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+ | UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance | +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+ | d1fca940-e341-491b-8afd-0cf6d748aa29 | controller-1 | None | power off | available | False | | 6b24d02c-3fd2-4e55-a730-c45008f01723 | controller-2 | None | power off | available | False | | 098c3e2d-1c70-41d2-983b-6c266387de0b | controller-3 | None | power off | available | False | | 91492c2a-b26c-49ef-9d4e-e492a1578076 | compute-1 | None | power off | available | False | | cdf9e0ec-e3cb-4005-86f6-d40e684a9b19 | compute-2 | None | power off | available | False | | 92c4c1cb-ce7d-48d4-a2d9-75b2651db097 | compute-3 | None | power off | available | False | | bb5e829a-834b-4eb1-b733-0012ce9d5f00 | compute-4 | None | power off | available | False | +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
The next step is to tag the nodes into profiles.
Tag the controller nodes into the default “control” profile:
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-1
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-2
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-3
Create two new compute flavors, one per rack (compute-r1, compute-r2), and attach the flavors to profiles with a correlated name:
(undercloud) [stack@rhosp-director ~]$ openstack flavor create --id auto --ram 4096 --disk 40 --vcpus 1 compute-r1
(undercloud) [stack@rhosp-director ~]$ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute-r1" --property "resources:CUSTOM_BAREMETAL"="1" --property "resources:DISK_GB"="0" --property "resources:MEMORY_MB"="0" --property "resources:VCPU"="0" compute-r1
(undercloud) [stack@rhosp-director ~]$ openstack flavor create --id auto --ram 4096 --disk 40 --vcpus 1 compute-r2
(undercloud) [stack@rhosp-director ~]$ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute-r2" --property "resources:CUSTOM_BAREMETAL"="1" --property "resources:DISK_GB"="0" --property "resources:MEMORY_MB"="0" --property "resources:VCPU"="0" compute-r2
Tag compute nodes 1,3 into “compute-r1” profile to associate it with Rack 1, and compute nodes 2,4 into “compute-r2” profile to associate it with Rack 2:
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r1,boot_option:local' compute-1
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r1,boot_option:local' compute-3
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r2,boot_option:local' compute-2
(undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r2,boot_option:local' compute-4
Verify profile tagging per node using the command below:
Code Block language text theme RDark (undercloud) [stack@rhosp-director ~]$ openstack overcloud profiles list +--------------------------------------+--------------+-----------------+-----------------+-------------------+ | Node UUID | Node Name | Provision State | Current Profile | Possible Profiles | +--------------------------------------+--------------+-----------------+-----------------+-------------------+ | d1fca940-e341-491b-8afd-0cf6d748aa29 | controller-1 | available | control | | | 6b24d02c-3fd2-4e55-a730-c45008f01723 | controller-2 | available | control | | | 098c3e2d-1c70-41d2-983b-6c266387de0b | controller-3 | available | control | | | 91492c2a-b26c-49ef-9d4e-e492a1578076 | compute-1 | available | compute-r1 | | | cdf9e0ec-e3cb-4005-86f6-d40e684a9b19 | compute-2 | available | compute-r2 | | | 92c4c1cb-ce7d-48d4-a2d9-75b2651db097 | compute-3 | available | compute-r1 | | | bb5e829a-834b-4eb1-b733-0012ce9d5f00 | compute-4 | available | compute-r2 | | +--------------------------------------+--------------+-----------------+-----------------+-------------------+
Info: It is possible to tag the nodes into profiles in the instackenv.json file during node registration (section 6.1) instead of running the tag command per node; however, flavors and profiles must be created in any case.
NVIDIA NICs Listing
Run the following command to go over all registered nodes and identify the interface names of the dual-port NVIDIA 100GbE NIC. The interface names are used later in the configuration files.
(undercloud) [stack@rhosp-director templates]$ for node in $(openstack baremetal node list --fields uuid -f value) ; do openstack baremetal introspection interface list $node ; done . . +-----------+-------------------+----------------------+-------------------+----------------+ | Interface | MAC Address | Switch Port VLAN IDs | Switch Chassis ID | Switch Port ID | +-----------+-------------------+----------------------+-------------------+----------------+ | eno1 | ec:b1:d7:83:11:b8 | [] | 94:57:a5:25:fa:80 | 29 | | eno2 | ec:b1:d7:83:11:b9 | [] | None | None | | eno3 | ec:b1:d7:83:11:ba | [] | None | None | | eno4 | ec:b1:d7:83:11:bb | [] | None | None | | ens1f1 | ec:0d:9a:7d:81:b3 | [] | 24:8a:07:7f:ef:00 | Eth1/14 | | ens1f0 | ec:0d:9a:7d:81:b2 | [] | 24:8a:07:7f:ef:00 | Eth1/1 | +-----------+-------------------+----------------------+-------------------+----------------+ |
Note: Interface names must be identical for all nodes, or at least for all nodes sharing the same role. In our case, it is ens2f0/ens2f1 on the Controller nodes and ens1f0/ens1f1 on the Compute nodes.
Note: The configuration file examples in the following sections are partial and are meant to highlight specific sections. The full configuration files are available for download at the following link:
Deployment configuration and environment files:
Role definitions file:
The provided /home/stack/templates/roles_data_rivermax.yaml file includes a standard Controller role and two types of Compute roles, one per associated network rack.
- The NeutronDhcpAgent service is added to the Compute roles
Below is a partial listing of the configuration file:
###############################################################################
# Role: ComputeSriov1 #
###############################################################################
- name: ComputeSriov1
  description: |
    Compute SR-IOV Role R1
  CountDefault: 1
  networks:
    - InternalApi
    - Tenant
    - Storage
    - Ptp
  HostnameFormatDefault: '%stackname%-computesriov1-%index%'
  disable_upgrade_deployment: True
  ServicesDefault:
###############################################################################
# Role: ComputeSriov2 #
###############################################################################
- name: ComputeSriov2
  description: |
    Compute SR-IOV Role R2
  CountDefault: 1
  networks:
    - InternalApi_2
    - Tenant_2
    - Storage_2
    - Ptp_2
  HostnameFormatDefault: '%stackname%-computesriov2-%index%'
  disable_upgrade_deployment: True
  ServicesDefault:
The full configuration file is attached to this document for your convenience.
Node Counts and Flavors file:
The provided /home/stack/templates/node-info.yaml file specifies the node count and the correlated flavor per role.
The full configuration file:
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeSriov1Flavor: compute-r1
  OvercloudComputeSriov2Flavor: compute-r2
  ControllerCount: 3
  ComputeSriov1Count: 2
Rivermax Environment Configuration file:
The provided /home/stack/templates/rivermax-env.yaml file is used to configure the Compute nodes for low latency applications with HW offload:
- ens1f0 is used for the accelerated VXLAN data plane (Nova physical_network: null is required for VXLAN offload)
- CPU isolation: cores 2-5,12-17 are isolated from the hypervisor, and cores 2-5,12-15 will be used by Rivermax VMs. Cores 16,17 are excluded from Nova and will be used exclusively for running linuxptp tasks on the compute node
- ens1f1 is used for VLAN traffic
- Each compute node role is associated with a dedicated physical network to be used later for the multi-segment network; note that the Nova PCI whitelist physical network remains the same
- VF #1 is excluded from the Nova PCI whitelist (it will be used as a hypervisor-owned VF for PTP traffic)
- userdata_disable_service.yaml is called to disable the chrony (NTP) service on the overcloud compute nodes during deployment; this is required for a stable PTP setup
- ExtraConfig is used for mapping role configuration parameters to the correct network set, and for setting firewall rules allowing PTP traffic to the compute nodes
The full configuration file is attached to this document.
Note: The following configuration file is correlated to the specific compute server HW, OS and drivers used here: the NVIDIA ConnectX adapter interface names are ens1f0 and ens1f1, and the PCI IDs of the SR-IOV VFs allocated for Nova usage are specified explicitly per compute role. On a different system the names and PCI addresses might differ, and this information is required before cloud deployment in order to adjust the configuration files.
# A Heat environment file for adjusting the compute nodes to low latency media applications with HW Offload resource_registry: OS::TripleO::Services::NeutronSriovHostConfig: /usr/share/openstack-tripleo-heat-templates/puppet/services/neutron-sriov-host-config.yaml OS::TripleO::NodeUserData: /home/stack/templates/userdata_disable_service.yaml OS::TripleO::Services::Ntp: OS::Heat::None OS::TripleO::Services::NeutronOvsAgent: /usr/share/openstack-tripleo-heat-templates/puppet/services/neutron-ovs-agent.yaml parameter_defaults: DisableService: "chronyd" NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','RamFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter'] NovaSchedulerAvailableFilters: ["nova.scheduler.filters.all_filters","nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter"] # ComputeSriov1 Role params: 1 vxlan offload interface, 1 legacy sriov interface, isolated cores, cores 16-17 are isolated and excluded from nova for ptp usage. ComputeSriov1Parameters: KernelArgs: "default_hugepagesz=2MB hugepagesz=2MB hugepages=8192 intel_iommu=on iommu=pt processor.max_cstate=0 intel_idle.max_cstate=0 nosoftlockup isolcpus=2-5,12-17 nohz_full=2-5,12-17 rcu_nocbs=2-5,12-17" NovaVcpuPinSet: "2-5,12-15" OvsHwOffload: True NovaReservedHostMemory: 4096 NovaPCIPassthrough: - devname: "ens1f0" physical_network: null - address: {"domain": ".*", "bus": "08", "slot": "08", "function": "[4-7]"} physical_network: "tenantvlan1" NeutronPhysicalDevMappings: "tenantvlan1:ens1f1" NeutronBridgeMappings: ["tenantvlan1:br-stor"] # Extra config for mapping config params to rack 1 networks and for setting PTP Firewall rule ComputeSriov1ExtraConfig: neutron::agents::ml2::ovs::local_ip: "%{hiera('tenant')}" nova::vncproxy::host: "%{hiera('internal_api')}" nova::compute::vncserver_proxyclient_address: "%{hiera('internal_api')}" nova::compute::libvirt::vncserver_listen: "%{hiera('internal_api')}" nova::my_ip: "%{hiera('internal_api')}" nova::migration::libvirt::live_migration_inbound_addr: "%{hiera('internal_api')}" cold_migration_ssh_inbound_addr: "%{hiera('internal_api')}" live_migration_ssh_inbound_addr: "%{hiera('internal_api')}" tripleo::profile::base::database::mysql::client::mysql_client_bind_address: "%{hiera('internal_api')}" tripleo::firewall::firewall_rules: '199 allow PTP traffic over dedicated interface': dport: [319,320] proto: udp action: accept # ComputeSriov2 Role params: 1 vxlan offload interface, 1 legacy sriov interface, isolated cores, cores 16-17 are isolated and excluded from nova for ptp usage. 
ComputeSriov2Parameters: KernelArgs: "default_hugepagesz=2MB hugepagesz=2MB hugepages=8192 intel_iommu=on iommu=pt processor.max_cstate=0 intel_idle.max_cstate=0 nosoftlockup isolcpus=2-5,12-17 nohz_full=2-5,12-17 rcu_nocbs=2-5,12-17" NovaVcpuPinSet: "2-5,12-15" OvsHwOffload: True NovaReservedHostMemory: 4096 NeutronSriovNumVFs: NovaPCIPassthrough: - devname: "ens1f0" physical_network: null - address: {"domain": ".*", "bus": "08", "slot": "02", "function": "[4-7]"} physical_network: "tenantvlan1" NeutronPhysicalDevMappings: "tenantvlan2:ens1f1" NeutronBridgeMappings: ["tenantvlan2:br-stor"] # Extra config for mapping config params to rack 2 networks and for setting PTP Firewall rule ComputeSriov2ExtraConfig: neutron::agents::ml2::ovs::local_ip: "%{hiera('tenant_2')}" nova::vncproxy::host: "%{hiera('internal_api_2')}" nova::compute::vncserver_proxyclient_address: "%{hiera('internal_api_2')}" nova::compute::libvirt::vncserver_listen: "%{hiera('internal_api_2')}" nova::my_ip: "%{hiera('internal_api_2')}" nova::migration::libvirt::live_migration_inbound_addr: "%{hiera('internal_api_2')}" cold_migration_ssh_inbound_addr: "%{hiera('internal_api_2')}" live_migration_ssh_inbound_addr: "%{hiera('internal_api_2')}" tripleo::profile::base::database::mysql::client::mysql_client_bind_address: "%{hiera('internal_api_2')}" tripleo::firewall::firewall_rules: '199 allow PTP traffic over dedicated interface': dport: [319,320] proto: udp action: accept |
Disable_Service Configuration file:
The provided /home/stack/templates/userdata_disable_service.yaml is used to disable services on overcloud nodes during deployment.
It is used in rivermax-env.yaml to disable the chrony (NTP) service:
heat_template_version: queens description: > Uses cloud-init to enable root logins and set the root password. Note this is less secure than the default configuration and may not be appropriate for production environments, it's intended for illustration and development/debugging only. parameters: DisableService: description: Disable a service hidden: true type: string resources: userdata: type: OS::Heat::MultipartMime properties: parts: - config: {get_resource: disable_service} disable_service: type: OS::Heat::SoftwareConfig properties: config: str_replace: template: | #!/bin/bash set -x sudo systemctl disable $service sudo systemctl stop $service params: $service: {get_param: DisableService} outputs: OS::stack_id: value: {get_resource: userdata} |
Network configuration Files:
The provided network_data_rivermax.yaml file is used to configure the cloud networks according to the following guidelines:
- The Rack 1 network set parameters match the subnets/VLANs configured on the Rack 1 Leaf switch. The network names used are specified in roles_data.yaml for the Controller/ComputeSriov1 role networks.
- The Rack 2 networks match the subnets/VLANs configured on the Rack 2 Leaf switch. The network names are specified in roles_data.yaml for the ComputeSriov2 role networks.
- The “management” network is not used in our example
- The PTP network is shared between both racks in our example
The configuration is based on the following matrix to match the Leaf switch configuration as executed in Network Configuration section above:
Network Name | Network Set | Network Location | Network Details | VLAN | Network Allocation Pool |
Storage | 1 | Rack 1 | 172.16.0.0/24 | 11 | 172.16.0.100-250 |
Storage_Mgmt | 172.17.0.0/24 | 21 | 172.17.0.100-250 | ||
Internal API | 172.18.0.0/24 | 31 | 172.18.0.100-250 | ||
Tenant | 172.19.0.0/24 | 41 | 172.19.0.100-250 | ||
PTP | 172.20.0.0/24 | untagged | 172.20.0.100-250 | ||
Storage_2 | 2 | Rack 2 | 172.16.2.0/24 | 12 | 172.16.2.100-250 |
Storage_Mgmt_2 | 172.17.2.0/24 | 22 | 172.17.2.100-250 | ||
Internal API _2 | 172.18.2.0/24 | 32 | 172.18.2.100-250 | ||
Tenant_2 | 172.19.2.0/24 | 42 | 172.19.2.100-250 | |
PTP_2 | 172.20.2.0/24 | untagged | 172.20.2.100-250 | ||
External | - | Public Switch | 10.7.208.0/24 | - | 10.7.208.10-21 |
The full configuration file is attached to this document.
Below is a partial example showing the Storage (both network sets), External, and PTP network configurations:
- name: Storage vip: true vlan: 11 name_lower: storage ip_subnet: '172.16.0.0/24' allocation_pools: [{'start': '172.16.0.100', 'end': '172.16.0.250'}] ipv6_subnet: 'fd00:fd00:fd00:1100::/64' ipv6_allocation_pools: [{'start': 'fd00:fd00:fd00:1100::10', 'end': 'fd00:fd00:fd00:1100:ffff:ffff:ffff:fffe'}] . . - name: Storage_2 vip: true vlan: 12 name_lower: storage_2 ip_subnet: '172.16.2.0/24' allocation_pools: [{'start': '172.16.2.100', 'end': '172.16.2.250'}] ipv6_subnet: 'fd00:fd00:fd00:1200::/64' ipv6_allocation_pools: [{'start': 'fd00:fd00:fd00:1200::10', 'end': 'fd00:fd00:fd00:1200:ffff:ffff:ffff:fffe'}] . . - name: External vip: true name_lower: external vlan: 10 ip_subnet: '10.7.208.0/24' allocation_pools: [{'start': '10.7.208.10', 'end': '10.7.208.21'}] gateway_ip: '10.7.208.1' ipv6_subnet: '2001:db8:fd00:1000::/64' ipv6_allocation_pools: [{'start': '2001:db8:fd00:1000::10', 'end': '2001:db8:fd00:1000:ffff:ffff:ffff:fffe'}] gateway_ipv6: '2001:db8:fd00:1000::1' . . - name: Ptp name_lower: ptp ip_subnet: '172.20.1.0/24' allocation_pools: [{'start': '172.20.1.100', 'end': '172.20.1.250'}] - name: Ptp_2 name_lower: ptp_2 ip_subnet: '172.20.2.0/24' allocation_pools: [{'start': '172.20.2.100', 'end': '172.20.2.250'}] |
The provided network-environment-rivermax.yaml file is used to configure the Nova/Neutron network parameters according to the cloud networks:
- VXLAN tunnels
- Tenant VLAN ranges to be used for SR-IOV ports are 100-200
The full configuration file is attached to this document.
. . .
NeutronNetworkType: 'vlan,vxlan,flat'
NeutronTunnelTypes: 'vxlan'
NeutronNetworkVLANRanges: 'tenantvlan1:100:200,tenantvlan2:100:200'
NeutronFlatNetworks: 'datacentre'
NeutronBridgeMappings: 'datacentre:br-ex,tenantvlan1:br-stor'
Role type configuration files:
/home/stack/templates/controller.yaml
- Make sure the location of run-os-net-config.sh script in the configuration file is pointing to the correct script location.
- Supernet and GW per network allow routing between network sets located in different racks. The GW is the IP interface configured on the Leaf switch facing this network. The supernet and gateway for the two tenant networks can be seen below (a post-deployment route check is sketched after the example).
- The Controller node network settings we used:
- Dedicated 1G interface (type “interface”) for provisioning (PXE) network.
- Dedicated 1G interface (type “ovs_bridge”) for External network. This network has a default GW configured.
- Dedicated 100G interface (type “interface” without vlans) for data plane (Tenant) network in Rack 1. The network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks.
- Dedicated 100G interface (type “ovs_bridge”) with vlans for Storage/StorageMgmt/InternalApi networks in Rack 1. Each network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks.
See example below. Full configuration file is attached to this document.
Code Block language text theme RDark TenantSupernet: default: '172.19.0.0/16' description: Supernet that contains Tenant subnets for all roles. type: string TenantGateway: default: '172.19.0.1' description: Router gateway on tenant network type: string Tenant_2Gateway: default: '172.19.2.1' description: Router gateway on tenant_2 network type: string . . resources: OsNetConfigImpl: type: OS::Heat::SoftwareConfig properties: group: script config: str_replace: template: get_file: /usr/share/openstack-tripleo-heat-templates/network/scripts/run-os-net-config.sh params: $network_config: network_config: . . # NIC 3 - Data Plane (Tenant net) - type: ovs_bridge name: br-sriov use_dhcp: false members: - type: interface name: ens2f0 addresses: - ip_netmask: get_param: TenantIpSubnet routes: - ip_netmask: get_param: TenantSupernet next_hop: get_param: TenantGateway
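Once the overcloud is deployed, the supernet routes rendered from this template can be spot-checked on a controller. The output below is only an approximation based on this example's Tenant supernet and gateway values.

# Sketch: confirm the Tenant supernet route on a deployed controller (Rack 1 values)
ip route show | grep 172.19
# expected, approximately:
#   172.19.0.0/16 via 172.19.0.1 dev br-sriov
#   172.19.0.0/24 dev br-sriov proto kernel scope link src <controller tenant IP>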
/home/stack/templates/computesriov1.yaml:
- Make sure the location of run-os-net-config.sh script in the configuration file is pointing to the correct script location.
- Supernet and GW per network allow routing between network sets located in different racks. The GW would be the IP interface which was configured on the Leaf switch interface facing this network. - not mentioned in the example below, see example above or full configuration file.
- Networks and routes used by Compute nodes in Rack 1 with ComputeSriov1 role:
- Dedicated 1G interface for provisioning (PXE) network
- Dedicated 100G interface for offloaded vxlan data plane network in Rack 1. The network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks
- Dedicated 100G interface with host VF for PTP and with OVS vlans for Storage/InternalApi networks in Rack 1. Each network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks - not mentioned in the example below, see full configuration file.
See example below. Full configuration file is attached to this document.
Code Block language text theme RDark network_config: # NIC 1 - Provisioning net - type: interface name: eno1 use_dhcp: false dns_servers: get_param: DnsServers addresses: - ip_netmask: list_join: - / - - get_param: ControlPlaneIp - get_param: ControlPlaneSubnetCidr routes: - ip_netmask: 169.254.169.254/32 next_hop: get_param: EC2MetadataIp - default: true next_hop: get_param: ControlPlaneDefaultRoute # NIC 2 - ASAP2 VXLAN Data Plane (Tenant net) - type: sriov_pf name: ens1f0 numvfs: 8 link_mode: switchdev - type: interface name: ens1f0 use_dhcp: false addresses: - ip_netmask: get_param: TenantIpSubnet routes: - ip_netmask: get_param: TenantSupernet next_hop: get_param: TenantGateway # NIC 3 - Storage and Control over OVS, legacy SRIOV for Data Plane, NIC Partitioning for PTP VF owned by Host - type: ovs_bridge name: br-stor use_dhcp: false members: - type: sriov_pf name: ens1f1 numvfs: 8 # force the MAC address of the bridge to this interface primary: true - type: vlan vlan_id: get_param: StorageNetworkVlanID addresses: - ip_netmask: get_param: StorageIpSubnet routes: - ip_netmask: get_param: StorageSupernet next_hop: get_param: StorageGateway - type: vlan vlan_id: get_param: InternalApiNetworkVlanID addresses: - ip_netmask: get_param: InternalApiIpSubnet routes: - ip_netmask: get_param: InternalApiSupernet next_hop: get_param: InternalApiGateway - type: sriov_vf device: ens1f1 vfid: 1 addresses: - ip_netmask: get_param: PtpIpSubnet
/home/stack/templates/computesriov2.yaml:
- Make sure the location of run-os-net-config.sh script in the configuration file is pointing to the correct script location.
- Supernet and GW per network allow routing between network sets located in different racks. The GW would be the IP interface which was configured on the Leaf switch interface facing this network. - not mentioned in the example below, see example above or full configuration file.
- Networks and routes used by Compute nodes in Rack 2 with ComputeSriov2 role:
- Dedicated 1G interface for provisioning (PXE) network - not mentioned in the example below, see example above or full configuration file.
- Dedicated 100G interface for offloaded vxlan data plane network in Rack 2. The network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks
- Dedicated 100G interface with host VF for PTP and with OVS vlans for Storage/InternalApi networks in Rack 2. Each network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks - not mentioned in the example below, see full configuration file.
See example below. Full configuration file is attached to this document.
Code Block language text theme RDark network_config: # NIC 1 - Provisioning net - type: interface name: eno1 use_dhcp: false dns_servers: get_param: DnsServers addresses: - ip_netmask: list_join: - / - - get_param: ControlPlaneIp - get_param: ControlPlaneSubnetCidr routes: - ip_netmask: 169.254.169.254/32 next_hop: get_param: EC2MetadataIp - default: true next_hop: get_param: ControlPlaneDefaultRoute # NIC 2 - ASAP2 VXLAN Data Plane (Tenant net) - type: sriov_pf name: ens1f0 numvfs: 8 link_mode: switchdev - type: interface name: ens1f0 use_dhcp: false addresses: - ip_netmask: get_param: Tenant_2IpSubnet routes: - ip_netmask: get_param: TenantSupernet next_hop: get_param: Tenant_2Gateway # NIC 3 - Storage and Control over OVS, legacy SRIOV for Data Plane, NIC Partitioning for PTP VF owned by Host - type: ovs_bridge name: br-stor use_dhcp: false members: - type: sriov_pf name: ens1f1 numvfs: 8 # force the MAC address of the bridge to this interface primary: true - type: vlan vlan_id: get_param: Storage_2NetworkVlanID addresses: - ip_netmask: get_param: Storage_2IpSubnet routes: - ip_netmask: get_param: StorageSupernet next_hop: get_param: Storage_2Gateway - type: vlan vlan_id: get_param: InternalApi_2NetworkVlanID addresses: - ip_netmask: get_param: InternalApi_2IpSubnet routes: - ip_netmask: get_param: InternalApiSupernet next_hop: get_param: InternalApi_2Gateway - type: sriov_vf device: ens1f1 vfid: 1 addresses: - ip_netmask: get_param: Ptp_2IpSubnet
Deploying the Overcloud
Using the provided configuration and environment files, the cloud will be deployed utilizing:
- 3 controllers associated with Rack 1 networks
- 2 Compute nodes associated with Rack 1 (provider network 1)
- 2 Compute nodes associated with Rack 2 (provider network 2)
- Routes to allow connectivity between racks/networks
- VXLAN overlay tunnels between all the nodes
Before starting the deployment, verify connectivity over the OSPF underlay fabric between the Leaf switch VLAN interfaces facing the nodes in each rack. Without inter-rack connectivity for all networks, the overcloud deployment will fail.
To start the overcloud deployment, issue the command below:
(undercloud) [stack@rhosp-director templates]$ openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates \
--libvirt-type kvm \
-n /home/stack/templates/network_data_rivermax.yaml \
-r /home/stack/templates/roles_data_rivermax.yaml \
--timeout 90 \
--validation-warnings-fatal \
--ntp-server 0.asia.pool.ntp.org \
-e /home/stack/templates/node-info.yaml \
-e /home/stack/templates/overcloud_images.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-sriov.yaml \
-e /home/stack/templates/network-environment-rivermax.yaml \
-e /home/stack/templates/rivermax-env.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml
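The deployment can take a long time. The commands below are a generic sanity-check sketch (not part of the official flow) for following progress from the undercloud and confirming the result:

# Sketch: follow the deployment and confirm all overcloud nodes are up.
source ~/stackrc
openstack stack list            # overcloud stack status (CREATE_IN_PROGRESS / CREATE_COMPLETE)
openstack server list           # one line per overcloud node; expect Status=ACTIVE
openstack baremetal node list   # provisioning state should be "active" for deployed nodes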
Post Deployment Steps
Media Compute Node configuration:
Verify that the system booted with the required low latency adjustments:
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.10.1.el7.x86_64 root=UUID=334f450f-1946-4577-a4eb-822bd33b8db2 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet default_hugepagesz=2MB hugepagesz=2MB hugepages=8192 intel_iommu=on iommu=pt processor.max_cstate=0 intel_idle.max_cstate=0 nosoftlockup isolcpus=2-5,12-17 nohz_full=2-5,12-17 rcu_nocbs=2-5,12-17
# cat /sys/module/intel_idle/parameters/max_cstate
0
# cat /sys/devices/system/cpu/cpuidle/current_driver
none
Upload the MFT package to the compute node and install it:
Note: NVIDIA Mellanox Firmware Tools (MFT) can be obtained at http://www.mellanox.com/page/management_tools.
The GCC and kernel-devel packages are required for the MFT installation.
# yum install gcc kernel-devel-3.10.0-957.10.1.el7.x86_64 -y
# tar -xzvf mft-4.12.0-105-x86_64-rpm.tgz
# cd mft-4.12.0-105-x86_64-rpm
# ./install.sh
# mst start
Verify the NIC firmware and upgrade it to the latest version if required:
# mlxfwmanager --query
Querying Mellanox devices firmware ...

Device #1:
----------
  Device Type:      ConnectX5
  Part Number:      MCX556A-EDA_Ax
  Description:      ConnectX-5 Ex VPI adapter card; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe4.0 x16; tall bracket; ROHS R6
  PSID:             MT_0000000009
  PCI Device Name:  /dev/mst/mt4121_pciconf0
  Base MAC:         ec0d9a7d81b2
  Versions:         Current        Available
     FW             16.25.1020     N/A
     PXE            3.5.0701       N/A
     UEFI           14.18.0019     N/A
  Status:           No matching image found
Enable packet pacing and HW time stamping on the port used for PTP:
Note: The rivermax_config script is available for download here.
The relevant interface in our case is ens1f1.
REBOOT is required between the steps.
The "mcra" setting will not survive a reboot.
This is expected to be persistent and enabled by default in future FW releases.
# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
[warn] mst_pciconf is already loaded, skipping
Create devices
-W- Missing "lsusb" command, skipping MTUSB devices detection
Unloading MST PCI module (unused) - Success

# mst status -v
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE         MST                           PCI       RDMA      NET           NUMA
ConnectX5(rev:0)    /dev/mst/mt4121_pciconf0.1    08:00.1   mlx5_1    net-ens1f1    0
ConnectX5(rev:0)    /dev/mst/mt4121_pciconf0      08:00.0   mlx5_0    net-ens1f0    0

# chmod 777 rivermax_config
# ./rivermax_config ens1f1
running this can take few minutes...
enabling
Done!
# reboot
# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
[warn] mst_pciconf is already loaded, skipping
Create devices
-W- Missing "lsusb" command, skipping MTUSB devices detection
Unloading MST PCI module (unused) - Success

# mcra /dev/mst/mt4121_pciconf0.1 0xd8068 3
# mcra /dev/mst/mt4121_pciconf0.1 0xd8068
0x00000003
- Sync the compute node clock
Install linuxptp:
# yum install -y linuxptp
Use one of the following methods to identify the host VF interface name used for PTP (look for an IP address from the PTP network, or for "virtfn1", which correlates to vfid 1 used in the deployment configuration files):
[root@overcloud-computesriov1-0 ~]# ip addr show | grep "172.20"
    inet 172.20.0.102/24 brd 172.20.0.255 scope global enp8s8f3
[root@overcloud-computesriov1-0 ~]# ls /sys/class/net/ens1f1/device/virtfn1/net/
enp8s8f3
Verify connectivity to the clock master (Onyx Leaf switch SW-11 over VLAN 51 for Rack 1, Onyx Leaf switch SW-10 over VLAN 52 for Rack 2):
Code Block language text theme RDark [root@overcloud-computesriov1-0 ~]# ping 172.20.0.1 PING 172.20.0.1 (172.20.0.1) 56(84) bytes of data. 64 bytes from 172.20.0.1: icmp_seq=1 ttl=64 time=0.158 ms [root@overcloud-computesriov2-0 ~]# ping 172.20.2.1 PING 172.20.2.1 (172.20.2.1) 56(84) bytes of data. 64 bytes from 172.20.2.1: icmp_seq=1 ttl=64 time=0.110 ms
Edit /etc/ptp4l.conf to include the following global parameters and the PTP interface parameters
Code Block language text theme RDark [global] domainNumber 127 priority1 128 priority2 127 use_syslog 1 logging_level 6 tx_timestamp_timeout 30 hybrid_e2e 1 dscp_event 46 dscp_general 46 [enp8s8f3] logAnnounceInterval -2 announceReceiptTimeout 3 logSyncInterval -3 logMinDelayReqInterval -3 delay_mechanism E2E network_transport UDPv4
Start ptp4l on the PTP VF interface
Info title Note The command below is used to run ptp4l in slave mode on a dedicated host CPU, which is isolated and excluded from Nova per our deployment configuration files (core 16 in our case).
The second command is used to verify that the PTP clock is locked to the master clock source; rms values should be low.
Code Block language text theme RDark # taskset -c 16 ptp4l -s -f /etc/ptp4l.conf & # tail -f /var/log/messages | grep rms ptp4l: [2560.009] rms 12 max 22 freq -12197 +/- 16 ptp4l: [2561.010] rms 10 max 18 freq -12200 +/- 13 delay 63 +/- 0 ptp4l: [2562.010] rms 10 max 21 freq -12212 +/- 10 delay 63 +/- 0 ptp4l: [2563.011] rms 10 max 21 freq -12208 +/- 14 delay 63 +/- 0 ptp4l: [2564.012] rms 9 max 14 freq -12220 +/- 8
Start phc2sys on the same interface to sync the host system clock time
Info title Note The command below is used to run phc2sys on a dedicated host CPU, which is isolated and excluded from Nova per our deployment configuration files (core 17 in our case).
The second command is used to verify that the system clock is synced to PTP; offset values should be low and comparable to the ptp4l rms values.
Code Block language text theme RDark # taskset -c 17 phc2sys -s enp8s8f3 -w -m -n 127 >> /var/log/messages & # tail -f /var/log/messages | grep offset phc2sys[2797.730] phc offset 0 s2 freq +14570 delay 959 phc2sys[2798.730]: phc offset -43 s2 freq +14527 delay 957 phc2sys[2799.730]: phc offset 10 s2 freq +14567 delay 951
Application VMs and Use Cases
In the section below we will cover two main use cases:
- IP Multicast stream between media VMs located in different L3 routed provider networks
Image Added
- HW-Offloaded Unicast stream over VXLAN tunnel between media VMs located in different L3 routed provider networks
Image Added
Media Instances Creation
Each media VM is connected to both the SR-IOV-based VLAN network and the ASAP²-based VXLAN network. The same VMs can be used to test all of the use cases.
Contact NVIDIA Networking Support to get the Rivermax VM cloud image file (RivermaxCloud_v3.qcow2)
Info title Note The login credentials to VMs that are using this image are: root/3tango
Upload Rivermax cloud image to overcloud image repository
Code Block language text theme RDark source overcloudrc openstack image create --file RivermaxCloud_v3.qcow2 --disk-format qcow2 --container-format bare rivermax
Create a flavor with dedicated cpu policy to ensure VM vCPUs are pinned to the isolated host CPUs
Code Block language text theme RDark openstack flavor create m1.rivermax --id auto --ram 4096 --disk 20 --vcpus 4 openstack flavor set m1.rivermax --property hw:mem_page_size=large openstack flavor set m1.rivermax --property hw:cpu_policy=dedicated
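To confirm the flavor properties were applied as intended, the flavor can be inspected before creating instances (exact output column names may vary between OpenStack client versions):
Code Block language text theme RDark
openstack flavor show m1.rivermax -c name -c ram -c vcpus -c properties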
Create a multi-segment network for the tenant vlan Multicast traffic
Info title Notes Each network segment contains an SR-IOV direct port with an IP address from a different subnet.
Each subnet is associated with a different physical network, correlated with a different routed provider rack.
Routes to the subnets are propagated between racks via the provider L3 infrastructure (OSPF in our case).
Each subnet's gateway is the leaf ToR switch of its rack.
Both segments under this multi-segment network carry the same VLAN segment ID.
Code Block language text theme RDark openstack network create mc_vlan_net --provider-physical-network tenantvlan1 --provider-network-type vlan --provider-segment 101 --share openstack network segment list --network mc_vlan_net +--------------------------------------+------+--------------------------------------+--------------+---------+ | ID | Name | Network | Network Type | Segment | +--------------------------------------+------+--------------------------------------+--------------+---------+ | 309dd695-b45d-455e-b171-5739cc309dcf | None | 00665b03-eeae-4b5d-af65-063f8e989c24 | vlan | 101 | +--------------------------------------+------+--------------------------------------+--------------+---------+ openstack network segment set --name segment1 309dd695-b45d-455e-b171-5739cc309dcf openstack network segment create --physical-network tenantvlan2 --network-type vlan --segment 101 --network mc_vlan_net segment2 (overcloud) [stack@rhosp-director ~]$ openstack network segment list +--------------------------------------+----------+--------------------------------------+--------------+---------+ | ID | Name | Network | Network Type | Segment | +--------------------------------------+----------+--------------------------------------+--------------+---------+ | 309dd695-b45d-455e-b171-5739cc309dcf | segment1 | 00665b03-eeae-4b5d-af65-063f8e989c24 | vlan | 101 | | cac89791-2d7f-45e7-8c85-cc0a65060e81 | segment2 | 00665b03-eeae-4b5d-af65-063f8e989c24 | vlan | 101 | +--------------------------------------+----------+--------------------------------------+--------------+---------+ openstack subnet create mc_vlan_subnet --dhcp --network mc_vlan_net --network-segment segment1 --subnet-range 11.11.11.0/24 --gateway 11.11.11.1 openstack subnet create mc_vlan_subnet_2 --dhcp --network mc_vlan_net --network-segment segment2 --subnet-range 22.22.22.0/24 --gateway 22.22.22.1 openstack port create mc_direct1 --vnic-type=direct --network mc_vlan_net openstack port create mc_direct2 --vnic-type=direct --network mc_vlan_net
Create vxlan tenant network for Unicast traffic with 2 x ASAP² offload ports
Code Block language text theme RDark openstack network create tenant_vxlan_net --provider-network-type vxlan --share openstack subnet create tenant_vxlan_subnet --dhcp --network tenant_vxlan_net --subnet-range 33.33.33.0/24 --gateway none openstack port create offload1 --vnic-type=direct --network tenant_vxlan_net --binding-profile '{"capabilities":["switchdev"]}' openstack port create offload2 --vnic-type=direct --network tenant_vxlan_net --binding-profile '{"capabilities":["switchdev"]}'
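As an optional check, verify that the offload ports carry the switchdev capability in their binding profile (field names may differ slightly between client versions):
Code Block language text theme RDark
openstack port show offload1 -c binding_vnic_type -c binding_profile
openstack port show offload2 -c binding_vnic_type -c binding_profile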
Create a Rivermax instance on the media compute node located in Rack 1 (provider network segment 1), with one direct SR-IOV port on the VLAN network and one ASAP² offload port on the VXLAN network
Code Block language text theme RDark openstack server create --flavor m1.rivermax --image rivermax --nic port-id=mc_direct1 --nic port-id=offload1 vm1 --availability-zone nova:overcloud-computesriov1-0.localdomain
Create a second Rivermax instance on the media compute node located in Rack 2 (provider network segment 2), with one direct SR-IOV port on the VLAN network and one ASAP² offload port on the VXLAN network
Code Block language text theme RDark openstack server create --flavor m1.rivermax --image rivermax --nic port-id=mc_direct2 --nic port-id=offload2 vm2 --availability-zone nova:overcloud-computesriov2-0.localdomain
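Before connecting to the compute nodes, it can be useful to confirm that both instances are ACTIVE and landed on the intended hosts (the OS-EXT-SRV-ATTR:host field is visible with the admin credentials provided by overcloudrc):
Code Block language text theme RDark
openstack server show vm1 -c name -c status -c OS-EXT-SRV-ATTR:host
openstack server show vm2 -c name -c status -c OS-EXT-SRV-ATTR:host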
Connect to the compute nodes and verify the VMs are pinned to the isolated CPUs
Code Block language text theme RDark [root@overcloud-computesriov1-0 ~]# virsh list Id Name State ---------------------------------------------------- 1 instance-0000002b running [root@overcloud-computesriov1-0 ~]# virsh vcpupin 1 VCPU: CPU Affinity ---------------------------------- 0: 15 1: 2 2: 3 3: 4
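Optionally, you can also check where the QEMU emulator thread of the same domain is allowed to run; this is an extra sanity check, not a required deployment step:
Code Block language text theme RDark
[root@overcloud-computesriov1-0 ~]# virsh emulatorpin 1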
Rivermax Application Testing - Use Case 1
In the following section we use Rivermax application VMs created on 2 media compute nodes located in different network racks.
First, we will lock onto the PTP clock generated by the Onyx switches and propagated into the VMs via the KVM vPTP driver.
Next, we will generate a media-standards-compliant stream on VM1 and validate compliance using the NVIDIA Rivermax AnalyzeX tool on VM2. The Multicast stream generated by VM1 will traverse the network using PIM-SM and will be received by VM2, which joined the group. Please note that each packet in this stream carries an RTP header (including one SRD) and complies with the known media RFCs; however, the RTP payload is zeroed, so the stream cannot be visually displayed.
In the last step, we will decode and stream a real video file from VM1 and play it on the receiver VM2 through a graphical interface, using the NVIDIA Rivermax Simple Viewer tool.
- Upload the rivermax and analyzex license files to the Rivermax VMs and place them under the /opt/mellanox/rivermx directory.
On both VMs run the following command to sync the system time from PTP:
Code Block language text theme RDark taskset -c 1 phc2sys -s /dev/ptp2 -O 0 -m >> /var/log/messages & # tail -f /var/log/messages | grep offset phc2sys[2797.730] phc offset 0 s2 freq +14570 delay 959 phc2sys[2798.730]: phc offset -43 s2 freq +14527 delay 957 phc2sys[2799.730]: phc offset 10 s2 freq +14567 delay 951
Notice that phc2sys runs on a dedicated VM core 1 (which is isolated from the hypervisor) and is applied to the ptp2 device. In some cases the PTP device names in the VM will be different.
Ignore the "clock is not adjustable" message when applying the command.
Low and stable offset values will indicate a lock.
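If you are unsure which /dev/ptp device corresponds to the KVM virtual PTP clock inside the VM, a quick way to list the candidates (assuming standard sysfs paths) is:
Code Block language text theme RDark
# ls /sys/class/ptp/
# cat /sys/class/ptp/ptp*/clock_name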
Info title Note Important: Verify that the freq values in the output are close to the values seen at the compute node level (see above, where we ran the command on the host).
If not, re-run the phc2sys command with a different /dev/ptp device available in the VM (the sketch above shows how to list the available PTP devices).
On both VMs, run the SDP file modification script to adjust the media configuration file (sdp_hd_video_audio) as desired:
Code Block language text theme RDark #cd /home/Rivermax #./sdp_modify.sh === SDP File Modification Script === Default source IP Address is 11.11.11.10 would you like to change it (Y\N)?y Please select source IP Address in format X.X.X.X :11.11.11.25 Default Video stream multicast IP Address is 224.1.1.20 would you like to change it (Y\N)?y Please select Video stream multicast IP Address:224.1.1.110 Default Video stream multicast Port is 5000 would you like to change it (Y\N)?n Default Audio stream multicast IP Address is 224.1.1.30 would you like to change it (Y\N)?y Please select Audio stream multicast IP Address:224.1.1.110 Default Audio stream multicast Port: is 5010 would you like to change it (Y\N)?n Your SDP file is ready with the following parameters: IP_ADDR 11.11.11.25 MC_VIDEO_IP 224.1.1.110 MC_VIDEO_PORT 5000 MC_AUDIO_IP 224.1.1.110 MC_AUDIO_PORT 5010 # cat sdp_hd_video_audio v=0 o=- 1443716955 1443716955 IN IP4 11.11.11.25 s=st2110 stream t=0 0 m=video 5000 RTP/AVP 96 c=IN IP4 224.1.1.110/64 a=source-filter:incl IN IP4 224.1.1.110 11.11.11.25 a=rtpmap:96 raw/90000 a=fmtp:96 sampling=YCbCr-4:2:2; width=1920; height=1080; exactframerate=50; depth=10; TCS=SDR; colorimetry=BT709; PM=2110GPM; SSN=ST2110-20:2017; TP=2110TPN; a=mediaclk:direct=0 a=ts-refclk:localmac=40-a3-6b-a0-2b-d2 m=audio 5010 RTP/AVP 97 c=IN IP4 224.1.1.110/64 a=source-filter:incl IN IP4 224.1.1.110 11.11.11.25 a=rtpmap:97 L24/48000/2 a=mediaclk:direct=0 rate=48000 a=ptime:1 a=ts-refclk:localmac=40-a3-6b-a0-2b-d2
On both VMs issue the following command to define the VMA memory buffers:
Code Block language text theme RDark export VMA_RX_BUFS=2048 export VMA_TX_BUFS=2048 export VMA_RX_WRE=1024 export VMA_TX_WRE=1024
On the first VM (the "transmitter VM"), generate the media stream using the Rivermax media_sender application. The command below runs the Rivermax media_sender application on dedicated VM vCPUs 2 and 3 (which are isolated from the hypervisor).
The media_sender application uses the system time to operate.
Code Block language text theme RDark # ./media_sender -c 2 -a 3 -s sdp_hd_video_audio -m
On the second VM (the "receiver VM"), run the AnalyzeX tool to verify compliance. The command below runs the Rivermax AnalyzeX compliance tool on dedicated VM vCPUs 1-3 (which are isolated from the hypervisor).
Code Block language text theme RDark # VMA_HW_TS_CONVERSION=2 ANALYZEX_STACK_JITTER=2 LD_PRELOAD=libvma.so taskset -c 1-3 ./analyzex -i ens4 -s sdp_hd_video_audio -p
The following AnalyzeX results indicate full compliance with the ST2110 media standards:
Image Added
- Stop Rivermax media_sender application on VM1 and AnalyzeX tool on VM2.
Log in to VM1 and extract the video file under the /home/Rivermax directory
Code Block language text theme RDark # gunzip mellanoxTV_1080p50.ycbcr.gz
Re-run the Rivermax media_sender application on VM1 - this time specify the video file. A lower frame rate is used to allow the graphical interface to cope with the video playback task:
Code Block language text theme RDark # ./media_sender -c 2 -a 3 -s sdp_hd_video_audio -m -f mellanoxTV_1080p50.ycbcr --fps 25
Open a graphical remote session to VM2. In our case, we allocated a public floating IP to VM2 and used the X2Go client to open a remote session:
Image Added
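For reference, one way to allocate and attach such a floating IP from the overcloud is shown below; the external network name "public" is an assumption and should be replaced with the external network defined in your environment:
Code Block language text theme RDark
source overcloudrc
openstack floating ip create public
openstack server add floating ip vm2 <allocated_floating_ip>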
Open the terminal and run the Rivermax rx_hello_world_viewer application under the /home/Rivermax directory. Specify the local VLAN IP address of VM2 and the Multicast address of the stream. Once the command is issued, the video will start playing on screen.
Code Block language text theme RDark #cd /home/Rivermax # ./rx_hello_world_viewer -i 22.22.22.4 -m 224.1.1.110 -p 5000
Image Added
Image Added
Image Added
The following video demonstrates the procedure:
View file name Simple_player.mp4 height 250
Rivermax Application Testing - Use Case 2
In the following section we use the same Rivermax application VMs that were created on the 2 remote media compute nodes to generate a Unicast stream between the VMs over a VXLAN overlay network.
After validating that the PTP clock is locked, we will start the stream and monitor it with the same tools.
The Unicast stream generated by VM1 will create a VXLAN OVS flow that will be offloaded to the NIC HW.
- Make sure rivermax and analyzex license files are placed on the Rivermax VMs as instructed in Use Case 1.
Make sure the system time on both VMs is updated from PTP as instructed in Use Case 1.
Code Block language text theme RDark # tail -f /var/log/messages | grep offset phc2sys[2797.730] phc offset 0 s2 freq +14570 delay 959 phc2sys[2798.730]: phc offset -43 s2 freq +14527 delay 957 phc2sys[2799.730]: phc offset 10 s2 freq +14567 delay 951
On the transmitter VM1, run the SDP file modification script to create a Unicast configuration file - specify the VM1 VXLAN IP address as the source IP and the VM2 VXLAN IP address as the stream destination
Code Block language text theme RDark # ./sdp_modify.sh === SDP File Modification Script === Default source IP Address is 11.11.11.10 would you like to change it (Y\N)?y Please select source IP Address in format X.X.X.X :33.33.33.12 Default Video stream multicast IP Address is 224.1.1.20 would you like to change it (Y\N)?y Please select Video stream multicast IP Address:33.33.33.16 Default Video stream multicast Port is 5000 would you like to change it (Y\N)?n Default Audio stream multicast IP Address is 224.1.1.30 would you like to change it (Y\N)?y Please select Audio stream multicast IP Address:33.33.33.16 Default Audio stream multicast Port: is 5010 would you like to change it (Y\N)?n Your SDP file is ready with the following parameters: IP_ADDR 33.33.33.12 MC_VIDEO_IP 33.33.33.16 MC_VIDEO_PORT 5000 MC_AUDIO_IP 33.33.33.16 MC_AUDIO_PORT 5010
On VM1, generate the media stream using the Rivermax media_sender application - use the unicast SDP file created in the previous step
Code Block language text theme RDark # ./media_sender -c 2 -a 3 -s sdp_hd_video_audio -m
On the receiver VM2, run the Rivermax rx_hello_world application with the local VXLAN interface IP
Info title Note Make sure you use rx_hello_world tool and not rx_hello_world_viewer.
Code Block language text theme RDark # ./rx_hello_world -i 33.33.33.16 -m 33.33.33.16 -p 5000
On the Compute nodes verify the flows are offloaded to the HW
On compute node 1, which hosts the transmitter VM1, the offloaded flow matches the traffic coming from the VM over the representor interface and going into the VXLAN tunnel:
Code Block language text theme RDark [root@overcloud-computesriov1-0 heat-admin]# ovs-dpctl dump-flows type=offloaded --name in_port(eth4),eth(src=fa:16:3e:94:a4:5d,dst=fa:16:3e:fc:59:f3),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:54527279, bytes:71539619808, used:0.330s, actions:set(tunnel(tun_id=0x8,src=172.19.0.100,dst=172.19.2.105,tp_dst=4789,flags(key))),vxlan_sys_4789
On compute node 2, which hosts the receiver VM2, the offloaded flow matches the traffic arriving over the VXLAN tunnel and going into the VM over the representor interface:
Code Block language text theme RDark [root@overcloud-computesriov2-0 ~]# ovs-dpctl dump-flows type=offloaded --name tunnel(tun_id=0x8,src=172.19.0.100,dst=172.19.2.105,tp_dst=4789,flags(+key)),in_port(vxlan_sys_4789),eth(src=fa:16:3e:94:a4:5d,dst=fa:16:3e:fc:59:f3),eth_type(0x0800),ipv4(frag=no), packets:75722169, bytes:95561342656, used:0.420s, actions:eth5
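As an additional, optional indication of hardware offload, the TC flower rules that OVS programs on the representor interface can be listed directly; with recent iproute2 versions, rules installed in hardware are marked with the in_hw flag (the interface name follows the dump-flows output above):
Code Block language text theme RDark
[root@overcloud-computesriov1-0 ~]# tc -s filter show dev eth4 ingress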
Authors
Include Page
Related Documents
Content by Label showLabels false showSpace false sort creation cql label in ("openstack","rhosp","asap²","media_and_entertainment","virtual_machine","virtualization") and space = currentSpace() and state = "Approved"