References

  • Mellanox Rivermax product page
  • Mellanox ASAP2 product page
  • Mellanox VMA
  • Director Installation and Usage - Red Hat Customer Portal
  • Using Composable Networks - Red Hat Customer Portal
  • Performance evaluation of OVS offload using Mellanox Accelerated Switching And Packet Processing (ASAP2) technology in a Red Hat OSP 13 OpenStack environment
  • Rivermax Linux Performance Tuning Guide
  • Setting PTP on Onyx Switch
  • BBC R&D Blogs:
    • Computing and Networks at Scale
    • Cloud-Fit Production Architecture
    • IP Production Facilities


    Created on Sep 12, 2019 

    Introduction

    More and more media and entertainment (M&E) solution providers are moving their proprietary legacy video solutions to next-generation IP-based infrastructure to meet the increasing global demand for ultra-high-definition video content. Broadcast providers are looking into cloud-based solutions that offer better scalability and flexibility, which introduces new challenges such as multi-tenant, high-quality streaming at scale and time synchronization in the cloud.

    Red Hat OpenStack Platform (OSP) is a cloud computing platform that enables the creation, deployment, scaling, and management of a secure and reliable public or private OpenStack-based cloud. This production-ready platform offers tight integration with NVIDIA products and technologies and is used in this guide to demonstrate a full deployment of the "NVIDIA Media Cloud".

    The NVIDIA Media Cloud is a solution that includes the NVIDIA Rivermax library for packet pacing, kernel bypass and packet aggregation, along with cloud time synchronization and OVS HW offload to the NVIDIA SmartNIC using the Accelerated Switching And Packet Processing (ASAP²) framework.
    By the end of this guide you will be able to run offloaded HD media streams between VMs in different racks and validate that they comply with the SMPTE 2110 standards, while using commodity switches, servers and NICs.

    Video: http://youtube.com/watch?v=cuvNKwhbFVI

    The following reference deployment guide (RDG) demonstrates a complete deployment of the Red Hat OpenStack Platform 13 for media streaming applications with NVIDIA SmartNIC hardware offload capabilities.

    We'll explain the setup components, scale considerations and other aspects such as the hardware BoM (Bill of Materials) and time synchronization, as well as streaming application compliance testing in the cloud.

    Before we start, it is highly recommended to become familiar with the key technology features of this deployment guide:

    • Rivermax video streaming library
    • ASAP² - Accelerated Switching and Packet Processing

    Visit the product pages in the links below to learn more about these features and their capabilities:

    NVIDIA Rivermax Video Streaming Library 

    NVIDIA Accelerated Switching and Packet Processing 


    Info
    titleDownloadable Content

    All configuration files as well as the QCOW VM image are located here:



    NVIDIA Components

    • NVIDIA Rivermax implements an optimized software library API for media streaming applications. It runs on NVIDIA ConnectX®-5 network adapters or higher, enabling the use of commercial off-the-shelf (COTS) servers for HD to Ultra HD flows. The combination of Rivermax and ConnectX®-5 adapter cards complies with the SMPTE 2110-21 standard, reduces CPU utilization for video data streaming, and removes bottlenecks for the highest throughput.
    • NVIDIA Accelerated Switching and Packet Processing (ASAP²) is a framework for offloading network data planes into SmartNIC hardware. For example, Open vSwitch (OVS) offload enables a performance boost of up to 10x while significantly reducing CPU load. ASAP² is available on ConnectX-4 Lx and later adapters.
    • The NVIDIA Spectrum switch family provides the most efficient network solutions for the ever-increasing performance demands of data center applications.
    • The NVIDIA ConnectX network adapter family delivers industry-leading connectivity for performance-driven server and storage applications. ConnectX adapter cards provide high bandwidth coupled with ultra-low latency for diverse applications and systems, resulting in faster access and real-time responses.
    • The NVIDIA LinkX cables and transceivers family provides the industry's most complete line of 10, 25, 40, 50, 100, 200, and 400Gb/s interconnect products for cloud, Web 2.0, enterprise, telco, and storage data center applications. They are often used to link top-of-rack switches down to servers, storage and appliances, and up in switch-to-switch applications.

    Solution Setup Overview 

    Below is a list of all the different components in this solution and how they are utilized:

    Cloud Platform

    RH-OSP 13 will be deployed at large scale and utilized as the cloud platform.

    Compute Nodes 

    The compute nodes will be configured and deployed as "Media Compute Nodes", adjusted for low latency virtual media applications. Each Compute/Controller node is equipped with a dual-port 100GbE NIC, of which one port is dedicated to the VXLAN tenant network and the other to the VLAN multicast tenant network, Storage, Control, and PTP time synchronization.

    Packet Pacing is enabled on the NIC ports specifically to allow MC (MultiCast) Pacing on the VLAN Tenant Network.

    Network

    The different network components used in this user guide are configured in the following way:

    • Multiple racks are interconnected via a Spine/Leaf network architecture
    • Composable routed provider networks are used per rack
    • Compute nodes on different provider network segments host local DHCP agent instances per OpenStack subnet segment
    • An L3 OSPF underlay is used to route between the routed provider networks (another fabric-wide IGP can be used as desired)
    • Multicast
      • The VLAN tenant network is used for multicast media traffic and utilizes SR-IOV on the VM
      • IP-PIM (Sparse Mode) is used for routing the tenant multicast streams between racks located in different routed provider networks
      • IGMP snooping is used to manage the tenant multicast groups within the same L2 rack domain
    • Unicast
      • ASAP²-enabled Compute nodes are located in different racks and maintain VXLAN tunnels as an overlay for tenant VM traffic
      • The VXLAN tenant network is used for unicast media traffic and utilizes ASAP² to offload the CPU-intensive VXLAN traffic, avoiding the encapsulation/decapsulation performance penalty and achieving optimal throughput (a post-deployment verification sketch follows this list)
    • OpenStack Neutron is used as the SDN controller; all network configuration for every OpenStack node is done via OpenStack orchestration
    • RHOSP inbox drivers are used on all infrastructure components except for the VM guests
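
    Once the cloud is deployed, OVS hardware offload can be confirmed directly on a Compute node. The sketch below is not part of the original deployment flow; the interface name ens1f0 matches the examples used later in this guide and should be adjusted to your system.

    Code Block
    languagetext
    themeRDark
    # Assumed post-deployment checks on an ASAP2-enabled Compute node
    ovs-vsctl get Open_vSwitch . other_config:hw-offload     # expect "true" when OVS HW offload is enabled
    ethtool -k ens1f0 | grep hw-tc-offload                   # expect "hw-tc-offload: on" on the offloaded port
    ovs-appctl dpctl/dump-flows type=offloaded               # lists the flows currently offloaded to the NIC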

    Time Synchronization

    Time Synchronization will be configured in the following way:

    • linuxptp tools are used on the compute nodes and application VMs
    • PTP traffic is untagged on the compute nodes
    • Onyx switches propagate the time between the compute nodes and act as PTP Boundary Clock devices
    • One of the switches is used as the PTP Grandmaster clock (in real-life deployments a dedicated grandmaster should be used)
    • The KVM virtual PTP driver is used by the VMs to pull the PTP time from their hosting hypervisor, which is synced to the PTP clock source (a minimal linuxptp sketch follows this list)
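
    The following minimal linuxptp sketch illustrates this scheme; it is not taken verbatim from the deployment, and the PTP VF name is an assumption that should match the hypervisor-owned VF configured on your compute nodes:

    Code Block
    languagetext
    themeRDark
    # On the compute node: discipline the system clock from the Boundary Clock switch
    ptp4l -i ens1f1v1 -s -m &        # slave-only PTP on the assumed hypervisor-owned PTP VF
    phc2sys -s ens1f1v1 -w -m &      # sync the host CLOCK_REALTIME to that port's hardware clock

    # Inside the VM: pull the hypervisor time through the KVM virtual PTP device
    modprobe ptp_kvm
    phc2sys -s /dev/ptp0 -O 0 -m &   # sync the guest clock to the host clock exposed as /dev/ptp0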

    Media Application

    NVIDIA provides a Rivermax VM cloud image which includes all Rivermax tools and applications. The Rivermax VM provides a demonstration of the media test application and allows the user to validate compliance with the relevant media standards, i.e. SMPTE 2110 (an evaluation license is required).



    Solution Components   






    Solution General Design


    Solution Multicast Design


    Cloud Media Application Design 


    VXLAN HW Offload Overview



    Large Scale Overview



    HW Configuration

    Bill of Materials (BoM)


    Info
    titleNote
    • The BoM above refers to the maximal configuration of a large-scale solution, with a blocking ratio of 3:1
    • It is possible to change the blocking ratio to obtain a different capacity
    • The SN2100 and SN2700 switches share the same feature set and can be used in this solution according to the compute and/or network capacity required
    • The 2-rack BoM will be used in the solution example described below


    Solution Example

    We chose the below key features as a baseline to demonstrate the solution used in this RDG.

    Info
    titleNote

    The solution example below does not contain redundancy configuration

    Solution Scale

    • 2 x racks with a dedicated provider network set per rack
    • 1 x SN2700 switch as Spine switch
    • 2 x SN2100 switches as Leaf switches, 1 per rack
    • 5 nodes in rack 1 (3 x Controller, 2 x Compute)
    • 2 nodes in rack 2 (2 x Compute)
    • All nodes are connected to the Leaf switches using 2 x 100GbE ports per node
    • Leaf switches are connected to each Spine switch using a single 100GbE port

    Physical Rack Diagram

    In this RDG we placed all the equipment in the same rack, but the wiring and configuration simulate a two-rack network setup.



    PTP Diagram



    Info
    titleNote

    One of the Onyx Leaf switches is used as the PTP Grandmaster clock source instead of a dedicated device.

    Solution Networking

    Network Diagram



    Info
    titleNote

    Compute nodes access the External network/Internet through the Undercloud node, which functions as a router (an illustrative NAT sketch follows this note).
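
    The director normally sets up this masquerading based on the undercloud.conf settings; the sketch below only illustrates the mechanism and is not an additional step to run (the external interface name is an assumption):

    Code Block
    languagetext
    themeRDark
    # Illustration only: forwarding and NAT of overcloud traffic towards the external network
    sysctl -w net.ipv4.ip_forward=1
    iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE     # eth0 is an assumed external interface name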


    Network Physical Configuration

    Warning
    titleImportant !

    The configuration steps below refer to a solution example based on 2 racks.

    Network Configuration

    Below is a detailed step-by-step description of the network configuration:

    Physical Configuration


    1. Connect the switches to the switch management network.
    2. Interconnect the switches using 100Gb/s cables.
    3. Connect the Controller/Compute servers to the relevant networks according to the following diagrams:
    4. Connect the Undercloud Director server to the IPMI, PXE and External networks.

    Switch Profile Configuration

    The MC Max profile must be set on all switches. Applying it removes the existing configuration and requires a reboot.

    Warning
    Back up your switch configuration if you plan to use it later.


    Run the following commands on all switches:

    Code Block
    languagetext
    themeRDark
    system profile eth-ipv4-mc-max 
    show system profile

    Switch Interface Configuration

    Set the VLANs and VLAN interfaces on the Leaf switches according to the following:

    | Network Name    | Network Set | Leaf Switch Location | Network Details | Switch Interface IP | VLAN ID | Switch Physical Port | Switchport Mode | Note        |
    |-----------------|-------------|----------------------|-----------------|---------------------|---------|----------------------|-----------------|-------------|
    | Storage         | 1           | Rack 1               | 172.16.0.0/24   | 172.16.0.1          | 11      | A                    | hybrid          |             |
    | Storage_Mgmt    | 1           | Rack 1               | 172.17.0.0/24   | 172.17.0.1          | 21      | A                    | hybrid          |             |
    | Internal API    | 1           | Rack 1               | 172.18.0.0/24   | 172.18.0.1          | 31      | A                    | hybrid          |             |
    | PTP             | 1           | Rack 1               | 172.20.0.0/24   | 172.20.0.1          | 51      | A                    | hybrid          | access vlan |
    | MC_Tenant_VLAN  | 1           | Rack 1               | 11.11.11.0/24   | 11.11.11.1          | 101     | A                    | hybrid          |             |
    | Tenant_VXLAN    | 1           | Rack 1               | 172.19.0.0/24   | 172.19.0.1          | 41      | B                    | access          |             |
    | Storage_2       | 2           | Rack 2               | 172.16.2.0/24   | 172.16.2.1          | 12      | A                    | hybrid          |             |
    | Storage_Mgmt_2  | 2           | Rack 2               | 172.17.2.0/24   | 172.17.2.1          | 22      | A                    | hybrid          |             |
    | Internal API_2  | 2           | Rack 2               | 172.18.2.0/24   | 172.18.2.1          | 32      | A                    | hybrid          |             |
    | PTP_2           | 2           | Rack 2               | 172.20.2.0/24   | 172.20.2.1          | 52      | A                    | hybrid          | access vlan |
    | MC_Tenant_VLAN  | 2           | Rack 2               | 22.22.22.0/24   | 22.22.22.1          | 101     | A                    | hybrid          |             |
    | Tenant_VXLAN_2  | 2           | Rack 2               | 172.19.2.0/24   | 172.19.2.1          | 42      | B                    | access          |             |




    Rack 1 Leaf Switch VLAN Diagram

    Rack 2 Leaf Switch VLAN Diagram

    Switch Full Configuration

    Info
    titleNote
    • Onyx version 3.8.1204 and up is required
    • Switch SW-09 is used as the Spine switch
    • SW-10 and SW-11 are used as Leaf switches
    • Leaf SW-11 is configured with the PTP grandmaster role
    • Port 1/9 on all Leaf switches should face the Spine switch; the rest of the ports should face the Compute/Controller nodes
    • The IGMP immediate-leave/fast-leave switch configurations should be removed if multiple virtual receivers are used on a Compute node

    Spine (SW-09) configuration: 

    Code Block
    languagetext
    themeRDark
    ##
    ## STP configuration
    ##
    no spanning-tree
    
    ##   
    ## L3 configuration
    ##
    ip routing
    interface ethernet 1/1-1/2 no switchport force
    interface ethernet 1/1 ip address 192.168.119.9/24 primary
    interface ethernet 1/2 ip address 192.168.109.9/24 primary
    interface loopback 0 ip address 1.1.1.9/32 primary
     
    ##
    ## LLDP configuration
    ##
       lldp
       
    ##
    ## OSPF configuration
    ##
    protocol ospf
    router ospf router-id 1.1.1.9
       interface ethernet 1/1 ip ospf area 0.0.0.0
       interface ethernet 1/2 ip ospf area 0.0.0.0
       interface ethernet 1/1 ip ospf network broadcast
       interface ethernet 1/2 ip ospf network broadcast
       router ospf redistribute direct   
      
    ##
    ## IP Multicast router configuration
    ##
       ip multicast-routing 
       
    ##
    ## PIM configuration
    ##
       protocol pim
       interface ethernet 1/1 ip pim sparse-mode
       interface ethernet 1/2 ip pim sparse-mode
       ip pim multipath next-hop s-g-hash
       ip pim rp-address 1.1.1.9
    
    ##
    ## IGMP configuration
    ##
       interface ethernet 1/1 ip igmp immediate-leave
       interface ethernet 1/2 ip igmp immediate-leave
     
    ##
    ## Network management configuration
    ##
       ntp disable
       
    ##
    ## PTP protocol
    ##
       protocol ptp
       ptp vrf default enable
       interface ethernet 1/1 ptp enable
       interface ethernet 1/2 ptp enable

    Leaf Rack 1 (SW-11) configuration:

    Code Block
    languagetext
    themeRDark
    ##
    ## STP configuration
    ##
    no spanning-tree
    
    
    ##
    ## LLDP configuration
    ##
       lldp
        
    ##
    ## VLAN configuration
    ##
    
    vlan 11 
    name storage
    exit
    vlan 21
    name storage_mgmt
    exit
    vlan 31 
    name internal_api
    exit
    vlan 41 
    name tenant_vxlan
    exit
    vlan 51 
    name ptp
    exit
    vlan 101 
    name tenant_vlan_mc
    exit
    
    interface ethernet 1/1-1/5 switchport access vlan 41
    interface ethernet 1/11-1/15 switchport mode hybrid 
    interface ethernet 1/11-1/15 switchport access vlan 51
    interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 11
    interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 21
    interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 31
    interface ethernet 1/11-1/15 switchport hybrid allowed-vlan add 101
    
    ##
    ## IGMP Snooping configuration
    ##
       ip igmp snooping unregistered multicast forward-to-mrouter-ports
       ip igmp snooping
       vlan 51 ip igmp snooping
      vlan 101 ip igmp snooping
       vlan 51 ip igmp snooping querier
       vlan 101 ip igmp snooping querier
       interface ethernet 1/11-1/15 ip igmp snooping fast-leave
    
       
    ##   
    ## L3 configuration
    ##
    ip routing
    interface ethernet 1/9 no switchport force
    interface ethernet 1/9 ip address 192.168.119.11/24 primary
    
    interface vlan 11 ip address 172.16.0.1 255.255.255.0
    interface vlan 21 ip address 172.17.0.1 255.255.255.0
    interface vlan 31 ip address 172.18.0.1 255.255.255.0
    interface vlan 41 ip address 172.19.0.1 255.255.255.0
    interface vlan 51 ip address 172.20.0.1 255.255.255.0
    interface vlan 101 ip address 11.11.11.1 255.255.255.0
      
      
    ##
    ## OSPF configuration
    ##
    protocol ospf
    router ospf router-id 1.1.1.11
    interface ethernet 1/9 ip ospf area 0.0.0.0
    interface ethernet 1/9 ip ospf network broadcast
    router ospf redistribute direct  
      
       
    ##
    ## IP Multicast router configuration
    ##
       ip multicast-routing 
       
    ##
    ## PIM configuration
    ##
       protocol pim
       interface ethernet 1/9 ip pim sparse-mode
       ip pim multipath next-hop s-g-hash
       interface vlan 101 ip pim sparse-mode
       ip pim rp-address 1.1.1.9
    
    ##
    ## IGMP configuration
    ##
       interface ethernet 1/9 ip igmp immediate-leave
       interface vlan 101 ip igmp immediate-leave
       
    ##
    ## Network management configuration
    ##
       ntp disable
       
    ##
    ## PTP protocol
    ##
       protocol ptp
       ptp vrf default enable
       ptp priority1 1
       interface ethernet 1/9 ptp enable
       interface ethernet 1/11-1/15 ptp enable
       interface vlan 51 ptp enable 
    

    Leaf Rack 2 (SW-10) configuration:

    Code Block
    languagetext
    themeRDark
    
    ##
    ## STP configuration
    ##
    no spanning-tree
    
    
    ##
    ## LLDP configuration
    ##
       lldp
        
    ##
    ## VLAN configuration
    ##
    
    vlan 12
    name storage
    exit
    vlan 22
    name storage_mgmt
    exit
    vlan 32 
    name internal_api
    exit
    vlan 42 
    name tenant_vxlan
    exit
    vlan 52 
    name ptp
    exit
    vlan 101 
    name tenant_vlan_mc
    exit
    
    interface ethernet 1/1-1/2 switchport access vlan 42
    interface ethernet 1/11-1/12 switchport mode hybrid 
    interface ethernet 1/11-1/12 switchport access vlan 52
    interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 12
    interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 22
    interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 32
    interface ethernet 1/11-1/12 switchport hybrid allowed-vlan add 101
    
    ##
    ## IGMP Snooping configuration
    ##
       ip igmp snooping unregistered multicast forward-to-mrouter-ports
       ip igmp snooping
       vlan 52 ip igmp snooping
       vlan 52 ip igmp snooping querier
       vlan 101 ip igmp snooping
       vlan 101 ip igmp snooping querier
       interface ethernet 1/11-1/12 ip igmp snooping fast-leave
    
       
    ##   
    ## L3 configuration
    ##
    ip routing
    interface ethernet 1/9 no switchport force
    interface ethernet 1/9 ip address 192.168.109.10/24 primary
    
    interface vlan 12 ip address 172.16.2.1 255.255.255.0
    interface vlan 22 ip address 172.17.2.1 255.255.255.0
    interface vlan 32 ip address 172.18.2.1 255.255.255.0
    interface vlan 42 ip address 172.19.2.1 255.255.255.0
    interface vlan 52 ip address 172.20.2.1 255.255.255.0
    interface vlan 101 ip address 22.22.22.1 255.255.255.0
      
      
    ##
    ## OSPF configuration
    ##
    protocol ospf
    router ospf router-id 2.2.2.10
    interface ethernet 1/9 ip ospf area 0.0.0.0
    interface ethernet 1/9 ip ospf network broadcast
    router ospf redistribute direct  
      
       
    ##
    ## IP Multicast router configuration
    ##
       ip multicast-routing 
       
    ##
    ## PIM configuration
    ##
       protocol pim
       interface ethernet 1/9 ip pim sparse-mode
       ip pim multipath next-hop s-g-hash
       interface vlan 101 ip pim sparse-mode
       ip pim rp-address 1.1.1.9
    
    ##
    ## IGMP configuration
    ##
       interface ethernet 1/9 ip igmp immediate-leave
       interface vlan 101 ip igmp immediate-leave
       
    ##
    ## Network management configuration
    ##
       ntp disable
       
    ##
    ## PTP protocol
    ##
       protocol ptp
       ptp vrf default enable
       interface ethernet 1/9 ptp enable
       interface ethernet 1/11-1/12 ptp enable
       interface vlan 52 ptp enable 
    



    Solution Configuration and Deployment

    The following information will take you through the configuration and deployment steps of the solution.

    Prerequisites

    Make sure that the hardware specifications are identical for servers with the same role (Compute/Controller/etc.)

    Server Preparation - BIOS

    Make sure that for all servers:

    1. The network boot is set on the interface connected to PXE network.
    2. Virtualization and SRIOV are enabled.


    Make sure that for Compute servers (a quick OS-level verification sketch follows this list):

    1. The network boot is set on the interface connected to the PXE network
    2. Virtualization and SRIOV are enabled
    3. Power Profile is at "Maximum Performance"
    4. HyperThreading is disabled
    5. C-state is disabled
    6. Turbo Mode is disabled
    7. Collaborative Power Control is disabled
    8. Processor Power and Utilization Monitoring (Ctrl+A) are disabled
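
    A quick, hedged way to sanity-check some of these settings from a booted OS (results depend on the platform and do not replace checking the BIOS itself):

    Code Block
    languagetext
    themeRDark
    lscpu | grep "Thread(s) per core"      # expect "1" when HyperThreading is disabled
    ls -l /dev/kvm                         # present when virtualization is enabled and the KVM module is loaded
    lspci | grep -i "Virtual Function"     # VFs appear only after SR-IOV is enabled and VFs are created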

    NIC Preparation 

    SR-IOV is disabled by default on ConnectX-5 NICs and must be enabled for every NIC used by a Compute node.

    To enable and configure SR-IOV, insert the Compute NIC into a test server with an OS installed and follow the steps below:

    1. Run the following to verify that the firmware version is 16.21.2030 or later:

      Code Block
      languagetext
      themeRDark
      [root@host ~]# ethtool -i ens2f0
      driver: mlx5_core
      version: 5.0-0
      firmware-version: 16.22.1002 (MT_0000000009)
      expansion-rom-version:
      bus-info: 0000:07:00.0
      supports-statistics: yes
      supports-test: yes
      supports-eeprom-access: no
      supports-register-dump: no
      supports-priv-flags: yes


      If the firmware version is older, download and burn a newer firmware as described here.


    2. Install the mstflint package:

      Code Block
      languagetext
      themeRDark
      [root@host ~]# yum install mstflint
      
    3. Identify the PCI ID of the first 100G port and enable SRIOV:

      Code Block
      languagetext
      themeRDark
      [root@host ~]# lspci | grep -i mel
      07:00.0 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
      07:00.1 Ethernet controller: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
      [root@host ~]#
      [root@host ~]# mstconfig -d 0000:07:00.0 query | grep -i sriov
      SRIOV_EN False(0)
      SRIOV_IB_ROUTING_MODE_P1 GID(0)
      SRIOV_IB_ROUTING_MODE_P2 GID(0)
      [root@host ~]# mstconfig -d 0000:07:00.0 set SRIOV_EN=1
      Device #1:
      ----------
      
      Device type: ConnectX5
      PCI device: 0000:07:00.0
      
      Configurations: Next Boot New
      SRIOV_EN False(0) True(1)
      
      Apply new Configuration? ? (y/n) [n] : y
      Applying... Done!
      -I- Please reboot machine to load new configurations.
    4. Set the number of VFs to a high value, such as 64, and reboot the server to apply the new configuration:

      Code Block
      languagetext
      themeRDark
      [root@host ~]# mstconfig -d 0000:07:00.0 query | grep -i vfs
      NUM_OF_VFS 0
      [root@host ~]# mstconfig -d 0000:07:00.0 set NUM_OF_VFS=64
      
      Device #1:
      ----------
      
      Device type: ConnectX5
      PCI device: 0000:07:00.0
      
      Configurations: Next Boot New
      NUM_OF_VFS 0 64
      
      Apply new Configuration? ? (y/n) [n] : y
      Applying... Done!
      -I- Please reboot machine to load new configurations.
      [root@host ~]# reboot
    5. Confirm the new settings were applied using the mstconfig query commands shown above (see the verification sketch after this list).
    6. Insert the NIC back to the Compute node.
    7. Repeat the procedure above for every Compute node NIC used in our setup.
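
      For reference, a minimal verification sketch using the same PCI address as in the examples above (adjust it to your system):

      Code Block
      languagetext
      themeRDark
      # Expect SRIOV_EN True(1) and NUM_OF_VFS 64 after the reboot
      mstconfig -d 0000:07:00.0 query | grep -E "SRIOV_EN|NUM_OF_VFS"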


    Info
    titleNote
    • In our solution, the first of the two 100G ports on every NIC is used for the ASAP² accelerated data plane. This is why we enable SR-IOV only on the first ConnectX NIC PCI device (07:00.0 in the example above).
    • There are future plans to support an automated procedure to update and configure the NICs on the Compute nodes from the Undercloud.

    Accelerated RH-OSP Installation and Deployment

    The following steps will take you through the accelerated RH-OSP installation and deployment procedure:

    1. Install the Red Hat Enterprise Linux 7.6 OS on the Undercloud server, set an IP address on the interface connected to the external network, and make sure it has internet connectivity.
    2. Install the Undercloud and the director as instructed in section 4 of the Red Hat OSP Director Installation and Usage - Red Hat Customer Portal. Our undercloud.conf file is attached as a reference here: Configuration Files
    3. Configure a container image source as instructed in section 5 of the above guide. Our solution uses the undercloud as a local registry.

      Note
      titleNote

      The following overcloud image versions are used in our deployment:

      rhosp-director-images-13.0-20190418.1.el7ost.noarch
      rhosp-director-images-ipa-13.0-20190418.1.el7ost.noarch
      rhosp-director-images-ipa-x86_64-13.0-20190418.1.el7ost.noarch
      rhosp-director-images-x86_64-13.0-20190418.1.el7ost.noarch

      The overcloud image is RHEL 7.6 with kernel 3.10.0-957.10.1.el7.x86_64 (a minimal image installation sketch follows this note).
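
      For reference, a minimal sketch of installing and uploading these overcloud images on the undercloud; the package names and tar paths below are assumptions based on the standard OSP 13 procedure and should be verified against the Red Hat guide:

      Code Block
      languagetext
      themeRDark
      sudo yum install -y rhosp-director-images rhosp-director-images-ipa
      mkdir -p ~/images && cd ~/images
      for i in /usr/share/rhosp-director-images/overcloud-full-latest-13.0.tar \
               /usr/share/rhosp-director-images/ironic-python-agent-latest-13.0.tar; do tar -xvf $i; done
      openstack overcloud image upload --image-path /home/stack/images/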

    4. Register the nodes of the overcloud as instructed in section 6.1. Our instackenv.json file is attached as a reference.

    5. Inspect the hardware of the nodes as instructed in section 6.2.
      Once introspection is completed, it is recommended to confirm for each node that the desired root disk was detected, since cloud deployment can fail later because of insufficient disk space. Use the following command to check the size of the disk selected as the root disk:

      Code Block
      languagetext
      themeRDark
       (undercloud) [stack@rhosp-director ~]$ openstack baremetal node show 92c4c1cb-ce7d-48d4-a2d9-75b2651db097 | grep properties
       | properties | {u'memory_mb': u'131072', u'cpu_arch': u'x86_64', u'local_gb': u'418', u'cpus': u'24', u'capabilities': u'boot_option:local'}

      The “local_gb” value represents the disk size. If the disk size is low and not as expected, use the procedure described in section 6.6 to define the root disk for the node (a hedged example follows). Note that an additional introspection cycle is required for this node after the root disk is changed.
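
      For example, a hedged root-disk hint; the serial value and node UUID below are placeholders, not values from this setup:

      Code Block
      languagetext
      themeRDark
      # Pin the root disk by serial number, then re-run introspection for this node
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property root_device='{"serial": "<disk-serial>"}' <node-uuid>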

    6. Verify that all nodes were registered properly and changed their state to “available” before proceeding to the next step:

      Code Block
      languagetext
      themeRDark
      +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
      | UUID                                 | Name         | Instance UUID | Power State | Provisioning State | Maintenance |
      +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+
      | d1fca940-e341-491b-8afd-0cf6d748aa29 | controller-1 | None          | power off   | available          | False       |
      | 6b24d02c-3fd2-4e55-a730-c45008f01723 | controller-2 | None          | power off   | available          | False       |
      | 098c3e2d-1c70-41d2-983b-6c266387de0b | controller-3 | None          | power off   | available          | False       |
      | 91492c2a-b26c-49ef-9d4e-e492a1578076 | compute-1    | None          | power off   | available          | False       |
      | cdf9e0ec-e3cb-4005-86f6-d40e684a9b19 | compute-2    | None          | power off   | available          | False       |
      | 92c4c1cb-ce7d-48d4-a2d9-75b2651db097 | compute-3    | None          | power off   | available          | False       |
      | bb5e829a-834b-4eb1-b733-0012ce9d5f00 | compute-4    | None          | power off   | available          | False       |
      +--------------------------------------+--------------+---------------+-------------+--------------------+-------------+



    The next step is to tag the nodes into profiles:


    1. Tag the controller nodes into the default “control” profile:

      Code Block
      languagetext
      themeRDark
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-1
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-2
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:control,boot_option:local' controller-3


    2. Create two new compute flavors -- one per rack (compute-r1, compute-r2) -- and attach the flavors to profiles with a correlated name:

      Code Block
      languagetext
      themeRDark
      (undercloud) [stack@rhosp-director ~]$ openstack flavor create --id auto --ram 4096 --disk 40 --vcpus 1 compute-r1
      (undercloud) [stack@rhosp-director ~]$ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute-r1" --property "resources:CUSTOM_BAREMETAL"="1" --property "resources:DISK_GB"="0" --property "resources:MEMORY_MB"="0" --property "resources:VCPU"="0" compute-r1
      
      (undercloud) [stack@rhosp-director ~]$ openstack flavor create --id auto --ram 4096 --disk 40 --vcpus 1 compute-r2
      (undercloud) [stack@rhosp-director ~]$ openstack flavor set --property "capabilities:boot_option"="local" --property "capabilities:profile"="compute-r2" --property "resources:CUSTOM_BAREMETAL"="1" --property "resources:DISK_GB"="0" --property "resources:MEMORY_MB"="0" --property "resources:VCPU"="0" compute-r2
    3. Tag compute nodes 1 and 3 into the “compute-r1” profile to associate them with Rack 1, and compute nodes 2 and 4 into the “compute-r2” profile to associate them with Rack 2:

      Code Block
      languagetext
      themeRDark
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r1,boot_option:local' compute-1
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r1,boot_option:local' compute-3
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r2,boot_option:local' compute-2
      (undercloud) [stack@rhosp-director ~]$ openstack baremetal node set --property capabilities='profile:compute-r2,boot_option:local' compute-4
    4. Verify profile tagging per node using the command below:

      Code Block
      languagetext
      themeRDark
      (undercloud) [stack@rhosp-director ~]$ openstack overcloud profiles list
      +--------------------------------------+--------------+-----------------+-----------------+-------------------+
      | Node UUID                            | Node Name    | Provision State | Current Profile | Possible Profiles |
      +--------------------------------------+--------------+-----------------+-----------------+-------------------+
      | d1fca940-e341-491b-8afd-0cf6d748aa29 | controller-1 | available       | control | |
      | 6b24d02c-3fd2-4e55-a730-c45008f01723 | controller-2 | available       | control | |
      | 098c3e2d-1c70-41d2-983b-6c266387de0b | controller-3 | available       | control | |
      | 91492c2a-b26c-49ef-9d4e-e492a1578076 | compute-1    | available       | compute-r1 | |
      | cdf9e0ec-e3cb-4005-86f6-d40e684a9b19 | compute-2    | available       | compute-r2 | |
      | 92c4c1cb-ce7d-48d4-a2d9-75b2651db097 | compute-3    | available       | compute-r1 | |
      | bb5e829a-834b-4eb1-b733-0012ce9d5f00 | compute-4    | available       | compute-r2 | |
      +--------------------------------------+--------------+-----------------+-----------------+-------------------+
      Info

      It is possible to tag the nodes into profiles in the instackenv.json file during node registration (section 6.1) instead of running the tag command per node; however, the flavors and profiles must be created in any case.

    NVIDIA NICs Listing

    Run the following command to go over all registered nodes and identify the interface names of the dual-port NVIDIA 100GbE NIC. The interface names are used later in the configuration files.

    Code Block
    languagetext
    themeRDark
    (undercloud) [stack@rhosp-director templates]$ for node in $(openstack baremetal node list --fields uuid -f value) ; do openstack baremetal introspection interface list $node ; done
    .
    .
    +-----------+-------------------+----------------------+-------------------+----------------+
    | Interface | MAC Address       | Switch Port VLAN IDs | Switch Chassis ID | Switch Port ID |
    +-----------+-------------------+----------------------+-------------------+----------------+
    | eno1      | ec:b1:d7:83:11:b8 | []                   | 94:57:a5:25:fa:80 | 29 |
    | eno2      | ec:b1:d7:83:11:b9 | []                   | None              | None |
    | eno3      | ec:b1:d7:83:11:ba | []                   | None              | None |
    | eno4      | ec:b1:d7:83:11:bb | []                   | None              | None |
    | ens1f1    | ec:0d:9a:7d:81:b3 | []                   | 24:8a:07:7f:ef:00 | Eth1/14 |
    | ens1f0    | ec:0d:9a:7d:81:b2 | []                   | 24:8a:07:7f:ef:00 | Eth1/1 |
    +-----------+-------------------+----------------------+-------------------+----------------+
    Note
    titleNote

    Names must be identical for all nodes, or at least for all nodes sharing the same role. In our case, it is ens2f0/ens2f1 on the Controller nodes and ens1f0/ens1f1 on the Compute nodes (a quick per-node check sketch follows this note).
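
    A hedged helper for checking this directly on a node once an OS is available; it simply maps Mellanox interfaces (PCI vendor ID 0x15b3) to their PCI addresses:

    Code Block
    languagetext
    themeRDark
    # List Mellanox network interfaces and their PCI addresses on the local node
    for dev in /sys/class/net/*; do
      [ -f "$dev/device/vendor" ] || continue
      [ "$(cat $dev/device/vendor)" = "0x15b3" ] && echo "$(basename $dev) -> $(basename $(readlink -f $dev/device))"
    done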



    Note
    titleNote

    The configuration file examples in the following sections are partial and are intended to highlight specific sections. The full configuration files are available for download at the following link:

    Configuration Files

    Deployment configuration and environment files:

    Role definitions file:

    • The provided /home/stack/templates/roles_data_rivermax.yaml file includes a standard Controller role and two types of Compute roles, one per associated rack network set

    • The NeutronDhcpAgent service is added to the Compute roles

     Below is a partial excerpt of the configuration file:

    Code Block
    languagetext
    themeRDark
    ###############################################################################
    # Role: ComputeSriov1 #
    ###############################################################################
     - name: ComputeSriov1
       description: |
         Compute SR-IOV Role R1
       CountDefault: 1
       networks:
         - InternalApi
         - Tenant
         - Storage
         - Ptp
       HostnameFormatDefault: '%stackname%-computesriov1-%index%'
       disable_upgrade_deployment: True
       ServicesDefault:
    Code Block
    languagetext
    themeRDark
    ###############################################################################
    # Role: ComputeSriov2 #
    ###############################################################################
     - name: ComputeSriov2
       description: |
         Compute SR-IOV Role R2
       CountDefault: 1
       networks:
         - InternalApi_2
         - Tenant_2
         - Storage_2
         - Ptp_2
       HostnameFormatDefault: '%stackname%-computesriov2-%index%'
       disable_upgrade_deployment: True
       ServicesDefault:


    The full configuration file is attached to this document for your convenience.


    Node Counts and Flavors file:

    The provided /home/stack/templates/node-info.yaml file specifies the node count and the correlated flavor per role.


    Full configuration file:

    Code Block
    languagetext
    themeRDark
     parameter_defaults:
       OvercloudControllerFlavor: control
       OvercloudComputeSriov1Flavor: compute-r1
       OvercloudComputeSriov2Flavor: compute-r2
       ControllerCount: 3
       ComputeSriov1Count: 2



    Rivermax Environment Configuration file:


    The provided /home/stack/templates/rivermax-env.yaml file is used to configure the Compute nodes for low latency applications with HW offload:


    • ens1f0 is used for the accelerated VXLAN data plane (Nova physical_network: null is required for VXLAN offload)

    • CPU isolation: cores 2-5,12-17 are isolated from the hypervisor; cores 2-5,12-15 will be used by the Rivermax VMs, while cores 16-17 are excluded from Nova and used exclusively for running linuxptp tasks on the compute node (a post-deployment verification sketch follows the configuration excerpt below)

    • ens1f1 is used for VLAN traffic:

      • Each compute node role is associated with a dedicated physical network to be used later for the multi-segment network; note that the Nova PCI whitelist physical network remains the same.

      • VF function #1 is excluded from the Nova PCI whitelist (it will be used as a hypervisor-owned VF for PTP traffic).

    • userdata_disable_service.yaml is called to disable the chrony (NTP) service on the overcloud nodes during deployment; this is required for a stable PTP setup.

    • ExtraConfig is used for mapping role configuration parameters to the correct network set, and for setting firewall rules allowing PTP traffic to the compute nodes.


    Full configuration file is attached to this document

    Note
    titleNote

    The following configuration file is correlated to specific compute server HW, OS and drivers in which:

    NVIDIA's ConnectX adapter interface names are ens1f0 and ens1f1.

    The PCI IDs used for the SR-IOV VFs allocated for Nova usage are specified explicitly per compute role.

    On a different system, the names and PCI addresses might differ.

    It is required to have this information before cloud deployment in order to adjust the configuration files.

    Code Block
    languagetext
    themeRDark
    # A Heat environment file for adjusting the compute nodes to low latency media applications with HW Offload
    
    resource_registry:
      OS::TripleO::Services::NeutronSriovHostConfig: /usr/share/openstack-tripleo-heat-templates/puppet/services/neutron-sriov-host-config.yaml
      OS::TripleO::NodeUserData: /home/stack/templates/userdata_disable_service.yaml
      OS::TripleO::Services::Ntp: OS::Heat::None
      OS::TripleO::Services::NeutronOvsAgent: /usr/share/openstack-tripleo-heat-templates/puppet/services/neutron-ovs-agent.yaml
      
    parameter_defaults:
    
      DisableService: "chronyd"  
      NovaSchedulerDefaultFilters: ['RetryFilter','AvailabilityZoneFilter','RamFilter','ComputeFilter','ComputeCapabilitiesFilter','ImagePropertiesFilter','ServerGroupAntiAffinityFilter','ServerGroupAffinityFilter','PciPassthroughFilter']
      NovaSchedulerAvailableFilters: ["nova.scheduler.filters.all_filters","nova.scheduler.filters.pci_passthrough_filter.PciPassthroughFilter"]  
    
      # ComputeSriov1 Role params: 1 vxlan offload interface, 1 legacy sriov interface, isolated cores, cores 16-17 are isolated and excluded from nova for ptp usage. 
      ComputeSriov1Parameters:
        KernelArgs: "default_hugepagesz=2MB hugepagesz=2MB hugepages=8192 intel_iommu=on iommu=pt processor.max_cstate=0 intel_idle.max_cstate=0 nosoftlockup isolcpus=2-5,12-17 nohz_full=2-5,12-17 rcu_nocbs=2-5,12-17"
        NovaVcpuPinSet: "2-5,12-15" 
        OvsHwOffload: True
        NovaReservedHostMemory: 4096
        NovaPCIPassthrough:
          - devname: "ens1f0"
            physical_network: null
          - address: {"domain": ".*", "bus": "08", "slot": "08", "function": "[4-7]"}
            physical_network: "tenantvlan1"      
        NeutronPhysicalDevMappings: "tenantvlan1:ens1f1"
        NeutronBridgeMappings: ["tenantvlan1:br-stor"]
        
      # Extra config for mapping config params to rack 1 networks and for setting PTP Firewall rule
      ComputeSriov1ExtraConfig:
        neutron::agents::ml2::ovs::local_ip: "%{hiera('tenant')}"
        nova::vncproxy::host: "%{hiera('internal_api')}"
        nova::compute::vncserver_proxyclient_address: "%{hiera('internal_api')}"
        nova::compute::libvirt::vncserver_listen: "%{hiera('internal_api')}"    
        nova::my_ip: "%{hiera('internal_api')}"
        nova::migration::libvirt::live_migration_inbound_addr: "%{hiera('internal_api')}"
        cold_migration_ssh_inbound_addr: "%{hiera('internal_api')}"
        live_migration_ssh_inbound_addr: "%{hiera('internal_api')}" 
        tripleo::profile::base::database::mysql::client::mysql_client_bind_address: "%{hiera('internal_api')}"
        tripleo::firewall::firewall_rules:
          '199 allow PTP traffic over dedicated interface':
            dport: [319,320]
            proto: udp
            action: accept
      
      # ComputeSriov2 Role params: 1 vxlan offload interface, 1 legacy sriov interface, isolated cores, cores 16-17 are isolated and excluded from nova for ptp usage. 
      ComputeSriov2Parameters:
        KernelArgs: "default_hugepagesz=2MB hugepagesz=2MB hugepages=8192 intel_iommu=on iommu=pt processor.max_cstate=0 intel_idle.max_cstate=0 nosoftlockup isolcpus=2-5,12-17 nohz_full=2-5,12-17 rcu_nocbs=2-5,12-17"
        NovaVcpuPinSet: "2-5,12-15" 
        OvsHwOffload: True
        NovaReservedHostMemory: 4096
        NeutronSriovNumVFs: 
        NovaPCIPassthrough:
          - devname: "ens1f0"
            physical_network: null
          - address: {"domain": ".*", "bus": "08", "slot": "02", "function": "[4-7]"}
            physical_network: "tenantvlan1"     
        NeutronPhysicalDevMappings: "tenantvlan2:ens1f1"
        NeutronBridgeMappings: ["tenantvlan2:br-stor"]
    
        # Extra config for mapping config params to rack 2 networks and for setting PTP Firewall rule
      ComputeSriov2ExtraConfig:
        neutron::agents::ml2::ovs::local_ip: "%{hiera('tenant_2')}"
        nova::vncproxy::host: "%{hiera('internal_api_2')}"
        nova::compute::vncserver_proxyclient_address: "%{hiera('internal_api_2')}"
        nova::compute::libvirt::vncserver_listen: "%{hiera('internal_api_2')}"      
        nova::my_ip: "%{hiera('internal_api_2')}"
        nova::migration::libvirt::live_migration_inbound_addr: "%{hiera('internal_api_2')}"
        cold_migration_ssh_inbound_addr: "%{hiera('internal_api_2')}"
        live_migration_ssh_inbound_addr: "%{hiera('internal_api_2')}" 
        tripleo::profile::base::database::mysql::client::mysql_client_bind_address: "%{hiera('internal_api_2')}"
        tripleo::firewall::firewall_rules:
          '199 allow PTP traffic over dedicated interface':
            dport: [319,320]
            proto: udp
            action: accept
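
      After the overcloud deployment completes, a minimal sanity check of these parameters on a ComputeSriov node could look like this (expected values follow the settings above):

      Code Block
      languagetext
      themeRDark
      cat /proc/cmdline | tr ' ' '\n' | grep -E "isolcpus|hugepages|iommu"   # KernelArgs applied at boot
      grep HugePages_Total /proc/meminfo                                     # expect 8192 (2MB hugepages)
      ovs-vsctl get Open_vSwitch . other_config:hw-offload                   # expect "true" when OvsHwOffload is applied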



    Disable_Service Configuration file:

    The provided /home/stack/templates/userdata_disable_service.yaml is used to disable services on overcloud nodes during deployment.

    It is used in rivermax-env.yaml to disable the chrony (NTP) service:

    Code Block
    languagetext
    themeRDark
    heat_template_version: queens                                                          
    
     description: >
       Uses cloud-init to disable a given service on the overcloud nodes
       during deployment (in this solution, the chrony NTP service).
    
    parameters:
      DisableService:
        description: Disable a service
        hidden: true
        type: string
    
    resources:
      userdata:
        type: OS::Heat::MultipartMime
        properties:
          parts:
          - config: {get_resource: disable_service}
    
      disable_service:
       type: OS::Heat::SoftwareConfig
       properties:
          config:
            str_replace:
              template: |
               #!/bin/bash
               set -x
               sudo systemctl disable $service
               sudo systemctl stop $service
              params:
               $service: {get_param: DisableService}
    
    outputs:
      OS::stack_id:
        value: {get_resource: userdata}
    


    Network configuration Files:

    The provided network_data_rivermax.yaml file is used to configure the cloud networks according to the following guidelines:

    • The Rack 1 network set parameters match the subnets/VLANs configured on the Rack 1 Leaf switch. The network names used are specified in roles_data.yaml for the Controller/ComputeSriov1 role networks.
    • The Rack 2 network set parameters match the subnets/VLANs configured on the Rack 2 Leaf switch. The network names are specified in roles_data.yaml for the ComputeSriov2 role networks.
    • The “management” network is not used in our example
    • The PTP network is shared between both racks in our example


    The configuration is based on the following matrix to match the Leaf switch configuration as executed in Network Configuration section above:

    | Network Name    | Network Set | Network Location | Network Details | VLAN     | Network Allocation Pool |
    |-----------------|-------------|------------------|-----------------|----------|-------------------------|
    | Storage         | 1           | Rack 1           | 172.16.0.0/24   | 11       | 172.16.0.100-250        |
    | Storage_Mgmt    | 1           | Rack 1           | 172.17.0.0/24   | 21       | 172.17.0.100-250        |
    | Internal API    | 1           | Rack 1           | 172.18.0.0/24   | 31       | 172.18.0.100-250        |
    | Tenant          | 1           | Rack 1           | 172.19.0.0/24   | 41       | 172.19.0.100-250        |
    | PTP             | 1           | Rack 1           | 172.20.0.0/24   | untagged | 172.20.0.100-250        |
    | Storage_2       | 2           | Rack 2           | 172.16.2.0/24   | 12       | 172.16.2.100-250        |
    | Storage_Mgmt_2  | 2           | Rack 2           | 172.17.2.0/24   | 22       | 172.17.2.100-250        |
    | Internal API_2  | 2           | Rack 2           | 172.18.2.0/24   | 32       | 172.18.2.100-250        |
    | Tenant_2        | 2           | Rack 2           | 172.19.2.0/24   | 42       | 172.19.2.100-250        |
    | PTP_2           | 2           | Rack 2           | 172.20.2.0/24   | untagged | 172.20.2.100-250        |
    | External        | -           | Public Switch    | 10.7.208.0/24   | -        | 10.7.208.10-21          |

    Full configuration file is attached to this document



    Below is a partial example showing the configuration of the Storage (both network sets), External, and PTP networks:

    Code Block
    languagetext
    themeRDark
     - name: Storage
       vip: true
       vlan: 11
       name_lower: storage
       ip_subnet: '172.16.0.0/24'
       allocation_pools: [{'start': '172.16.0.100', 'end': '172.16.0.250'}]
       ipv6_subnet: 'fd00:fd00:fd00:1100::/64'
       ipv6_allocation_pools: [{'start': 'fd00:fd00:fd00:1100::10', 'end': 'fd00:fd00:fd00:1100:ffff:ffff:ffff:fffe'}]
     .
     .
     - name: Storage_2
       vip: true
       vlan: 12
       name_lower: storage_2
       ip_subnet: '172.16.2.0/24'
       allocation_pools: [{'start': '172.16.2.100', 'end': '172.16.2.250'}]
       ipv6_subnet: 'fd00:fd00:fd00:1200::/64'
       ipv6_allocation_pools: [{'start': 'fd00:fd00:fd00:1200::10', 'end': 'fd00:fd00:fd00:1200:ffff:ffff:ffff:fffe'}]
     .
     .
     - name: External
       vip: true
       name_lower: external
       vlan: 10
       ip_subnet: '10.7.208.0/24'
       allocation_pools: [{'start': '10.7.208.10', 'end': '10.7.208.21'}]
       gateway_ip: '10.7.208.1'
       ipv6_subnet: '2001:db8:fd00:1000::/64'
       ipv6_allocation_pools: [{'start': '2001:db8:fd00:1000::10', 'end': '2001:db8:fd00:1000:ffff:ffff:ffff:fffe'}]
       gateway_ipv6: '2001:db8:fd00:1000::1'
     .
     .
     - name: Ptp
       name_lower: ptp
       ip_subnet: '172.20.1.0/24'
       allocation_pools: [{'start': '172.20.1.100', 'end': '172.20.1.250'}]
    
    - name: Ptp_2
      name_lower: ptp_2
      ip_subnet: '172.20.2.0/24'
      allocation_pools: [{'start': '172.20.2.100', 'end': '172.20.2.250'}] 



    The provided network-environment-rivermax.yaml file is used to configure the Nova/Neutron network parameters according to the cloud networks:

    • VXLAN tunnels
    • The tenant VLAN ranges to be used for SR-IOV ports are 100-200

    Full configuration file is attached to this document

    Code Block
    languagetext
    themeRDark
    .
    .
    .
      NeutronNetworkType: 'vlan,vxlan,flat'
      NeutronTunnelTypes: 'vxlan'
      NeutronNetworkVLANRanges: 'tenantvlan1:100:200,tenantvlan2:100:200'
      NeutronFlatNetworks: 'datacentre'
      NeutronBridgeMappings: 'datacentre:br-ex,tenantvlan1:br-stor'


    Role type configuration files: 

    /home/stack/templates/controller.yaml 

    • Make sure the location of the run-os-net-config.sh script in the configuration file points to the correct script location.
    • A supernet and gateway (GW) per network allow routing between network sets located in different racks. The GW is the IP interface configured on the Leaf switch interface facing this network. The supernet and gateway for the two tenant networks can be seen below (a route verification sketch follows the example).
    • Controller node network settings used:
      • Dedicated 1G interface (type “interface”) for provisioning (PXE) network.
      • Dedicated 1G interface (type “ovs_bridge”) for External network. This network has a default GW configured.
      • Dedicated 100G interface (type “interface” without vlans) for data plane (Tenant) network in Rack 1. The network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks.
      • Dedicated 100G interface (type “ovs_bridge”) with vlans for Storage/StorageMgmt/InternalApi networks in Rack 1. Each network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks.
    • See example below. Full configuration file is attached to this document.

      Code Block
      languagetext
      themeRDark
      TenantSupernet:
        default: '172.19.0.0/16'
        description: Supernet that contains Tenant subnets for all roles.
        type: string
      TenantGateway:
        default: '172.19.0.1'
        description: Router gateway on tenant network
        type: string
      Tenant_2Gateway:
        default: '172.19.2.1'
        description: Router gateway on tenant_2 network
        type: string
      .
      .
      resources:
        OsNetConfigImpl:
          type: OS::Heat::SoftwareConfig
          properties:
            group: script
            config:
              str_replace:
                template:
                  get_file: /usr/share/openstack-tripleo-heat-templates/network/scripts/run-os-net-config.sh
                params:
                  $network_config:
                    network_config:
                    .
                    .
                    # NIC 3 - Data Plane (Tenant net)
                    - type: ovs_bridge
                      name: br-sriov
                      use_dhcp: false
                      members:
                      - type: interface
                        name: ens2f0
                        addresses:
                        - ip_netmask:
                            get_param: TenantIpSubnet
                        routes:
                        - ip_netmask:
                            get_param: TenantSupernet
                          next_hop:
                            get_param: TenantGateway
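
      Once a Controller node is deployed, the supernet route can be verified with a quick check like the one below (the exact output device depends on how os-net-config applied the configuration):

      Code Block
      languagetext
      themeRDark
      ip route | grep 172.19.0.0/16      # expect a route towards the Tenant supernet via the Leaf switch gateway 172.19.0.1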

    /home/stack/templates/computesriov1.yaml:

    • Make sure the location of the run-os-net-config.sh script in the configuration file points to the correct script location.
    • A supernet and gateway (GW) per network allow routing between network sets located in different racks. The GW is the IP interface configured on the Leaf switch interface facing this network (not shown in the example below; see the example above or the full configuration file).
    • Networks and routes used by Compute nodes in Rack 1 with ComputeSriov1 role:
      • Dedicated 1G interface for provisioning (PXE) network 
      • Dedicated 100G interface for offloaded vxlan data plane network in Rack 1. The network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks
      • Dedicated 100G interface with host VF for PTP and with OVS vlans for Storage/InternalApi networks in Rack 1. Each network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks - not mentioned in the example below, see full configuration file.
    • See example below. Full configuration file is attached to this document.

      Code Block
      languagetext
      themeRDark
       network_config:
                       # NIC 1 - Provisioning net
                    - type: interface                                                                                                
                      name: eno1                                                                                               
                      use_dhcp: false                                                                                                 
                      dns_servers:                                                                                                    
                        get_param: DnsServers                                                                                         
                      addresses:                                                                                                      
                      - ip_netmask:
                          list_join:
                          - /
                          - - get_param: ControlPlaneIp
                            - get_param: ControlPlaneSubnetCidr
                      routes:
                      - ip_netmask: 169.254.169.254/32
                        next_hop:
                          get_param: EC2MetadataIp
                      - default: true
                        next_hop:
                          get_param: ControlPlaneDefaultRoute
      
                         
                      # NIC 2 - ASAP2 VXLAN Data Plane (Tenant net)
                    - type: sriov_pf
                      name: ens1f0
                      numvfs: 8
                      link_mode: switchdev
                    - type: interface
                      name: ens1f0
                      use_dhcp: false
                      addresses:
                        - ip_netmask:
                            get_param: TenantIpSubnet 
                      routes:
                        - ip_netmask:
                            get_param: TenantSupernet
                          next_hop:
                            get_param: TenantGateway
                    
                    
                      # NIC 3 - Storage and Control over OVS, legacy SRIOV for Data Plane, NIC Partitioning for PTP VF owned by Host
                    - type: ovs_bridge
                      name: br-stor
                      use_dhcp: false
                      members:
                      - type: sriov_pf
                        name: ens1f1
                        numvfs: 8
                        # force the MAC address of the bridge to this interface
                        primary: true
                      - type: vlan
                        vlan_id:
                          get_param: StorageNetworkVlanID
                        addresses:
                        - ip_netmask:
                            get_param: StorageIpSubnet
                        routes:
                        - ip_netmask:
                            get_param: StorageSupernet
                          next_hop:
                            get_param: StorageGateway
                      - type: vlan
                        vlan_id:
                          get_param: InternalApiNetworkVlanID
                        addresses:
                        - ip_netmask:
                            get_param: InternalApiIpSubnet
                        routes:
                        - ip_netmask:
                            get_param: InternalApiSupernet
                          next_hop:
                            get_param: InternalApiGateway
                    - type: sriov_vf
                      device: ens1f1
                      vfid: 1
                      addresses:
                      - ip_netmask:
                          get_param: PtpIpSubnet
      
      

    /home/stack/templates/computesriov2.yaml:

    • Make sure the run-os-net-config.sh script path referenced in the configuration file points to the correct script location.
    • A supernet and GW per network allow routing between network sets located in different racks. The GW is the IP interface configured on the Leaf switch interface facing this network - not shown in the example below, see the example above or the full configuration file.
    • Networks and routes used by Compute nodes in Rack 2 with the ComputeSriov2 role:
      • Dedicated 1G interface for the provisioning (PXE) network - not shown in the example below, see the example above or the full configuration file.
      • Dedicated 100G interface for the offloaded VXLAN data plane network in Rack 2. The network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks.
      • Dedicated 100G interface with a host VF for PTP and with OVS VLANs for the Storage/InternalApi networks in Rack 2. Each network is associated with a supernet and has a route allowing it to reach other networks in the same supernet located in different racks - not shown in the example below, see the full configuration file.

    • See the example below. The full configuration file is attached to this document.

      Code Block
      languagetext
      themeRDark
      network_config:
                      # NIC 1 - Provisioning net
                    - type: interface                                                                                                
                      name: eno1                                                                                               
                      use_dhcp: false                                                                                                 
                      dns_servers:                                                                                                    
                        get_param: DnsServers                                                                                         
                      addresses:                                                                                                      
                      - ip_netmask:
                          list_join:
                          - /
                          - - get_param: ControlPlaneIp
                            - get_param: ControlPlaneSubnetCidr
                      routes:
                      - ip_netmask: 169.254.169.254/32
                        next_hop:
                          get_param: EC2MetadataIp
                      - default: true
                        next_hop:
                          get_param: ControlPlaneDefaultRoute
      
                         
                      # NIC 2 - ASAP2 VXLAN Data Plane (Tenant net)
                    - type: sriov_pf
                      name: ens1f0
                      numvfs: 8
                      link_mode: switchdev
                    - type: interface
                      name: ens1f0
                      use_dhcp: false
                      addresses:
                        - ip_netmask:
                            get_param: Tenant_2IpSubnet 
                      routes:
                        - ip_netmask:
                            get_param: TenantSupernet
                          next_hop:
                            get_param: Tenant_2Gateway
                    
                    
                      # NIC 3 - Storage and Control over OVS, legacy SRIOV for Data Plane, NIC Partitioning for PTP VF owned by Host
                    - type: ovs_bridge
                      name: br-stor
                      use_dhcp: false
                      members:
                      - type: sriov_pf
                        name: ens1f1
                        numvfs: 8
                        # force the MAC address of the bridge to this interface
                        primary: true
                      - type: vlan
                        vlan_id:
                          get_param: Storage_2NetworkVlanID
                        addresses:
                        - ip_netmask:
                            get_param: Storage_2IpSubnet
                        routes:
                        - ip_netmask:
                            get_param: StorageSupernet
                          next_hop:
                            get_param: Storage_2Gateway
                      - type: vlan
                        vlan_id:
                          get_param: InternalApi_2NetworkVlanID
                        addresses:
                        - ip_netmask:
                            get_param: InternalApi_2IpSubnet
                        routes:
                        - ip_netmask:
                            get_param: InternalApiSupernet
                          next_hop:
                            get_param: InternalApi_2Gateway
                    - type: sriov_vf
                      device: ens1f1
                      vfid: 1
                      addresses:
                      - ip_netmask:
                          get_param: Ptp_2IpSubnet

    Deploying the Overcloud

    Using the provided configuration and environment files, the overcloud will be deployed with:

    • 3 controllers associated with Rack 1 networks
    • 2 Compute nodes associated with Rack 1 (provider network 1)
    • 2 Compute nodes associated with Rack 2 (provider network 2)
    • Routes to allow connectivity between racks/networks
    • VXLAN overlay tunnels between all the nodes

    Before starting the deployment, verify connectivity between the Leaf switches' VLAN interfaces facing the nodes in each rack over the OSPF underlay fabric. Without inter-rack connectivity for all networks, the overcloud deployment will fail.

    Info
    titleNote
    • Do not change the order of the environment files in the deploy command.
    • Make sure that the NTP server specified in the deploy command is accessible and can provide time to the undercloud node.
    • The overcloud_images.yaml file used in the deploy command is created during undercloud installation; verify it exists in the specified location.
    • The network-isolation.yaml and neutron-sriov.yaml files specified in the deploy command are generated automatically during deployment from the j2.yaml template files.

    To start the overcloud deployment, issue the command below: 

    Code Block
    languagetext
    themeRDark
    (undercloud) [stack@rhosp-director templates]$ openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates \
    --libvirt-type kvm \
    -n /home/stack/templates/network_data_rivermax.yaml \
    -r /home/stack/templates/roles_data_rivermax.yaml \
    --timeout 90 \
    --validation-warnings-fatal \
    --ntp-server 0.asia.pool.ntp.org \
    -e /home/stack/templates/node-info.yaml \
    -e /home/stack/templates/overcloud_images.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/neutron-sriov.yaml \
    -e /home/stack/templates/network-environment-rivermax.yaml \
    -e /home/stack/templates/rivermax-env.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/host-config-and-reboot.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/disable-telemetry.yaml
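
    While the deployment is running, you can optionally monitor its progress from a second shell on the undercloud. This is a suggested sanity check and not part of the original flow; it assumes the default stack name "overcloud":

    Code Block
    languagetext
    themeRDark
    (undercloud) [stack@rhosp-director ~]$ source ~/stackrc
    (undercloud) [stack@rhosp-director ~]$ openstack stack list
    (undercloud) [stack@rhosp-director ~]$ openstack server list
    (undercloud) [stack@rhosp-director ~]$ openstack stack resource list -n 5 overcloud | grep -i -v complete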
    
    


    Post Deployment Steps

    Media Compute Node configuration:

    1. Verify the system booted with the required low latency adjustments

      Code Block
      languagetext
      themeRDark
      # cat /proc/cmdline
      BOOT_IMAGE=/boot/vmlinuz-3.10.0-957.10.1.el7.x86_64 root=UUID=334f450f-1946-4577-a4eb-822bd33b8db2 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto rhgb quiet default_hugepagesz=2MB hugepagesz=2MB hugepages=8192 intel_iommu=on iommu=pt processor.max_cstate=0 intel_idle.max_cstate=0 nosoftlockup isolcpus=2-5,12-17 nohz_full=2-5,12-17 rcu_nocbs=2-5,12-17
      
      # cat /sys/module/intel_idle/parameters/max_cstate
      0
      
      # cat /sys/devices/system/cpu/cpuidle/current_driver
      none
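
      Optionally, you can also confirm that the hugepages requested on the kernel command line were actually allocated at runtime. This is a suggested extra check, not part of the original procedure; the expected count matches the hugepages=8192 boot parameter above:

      Code Block
      languagetext
      themeRDark
      # grep HugePages_Total /proc/meminfo
      HugePages_Total:    8192
      # cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
      8192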
    2. Upload the MFT package to the compute node and install it

      Info
      titleNote

      NVIDIA Firmware Tools (MFT) can be obtained here: http://www.mellanox.com/page/management_tools

      The gcc and kernel-devel packages are required for the MFT installation.

      Code Block
      languagetext
      themeRDark
      #yum install gcc kernel-devel-3.10.0-957.10.1.el7.x86_64 -y
      #tar -xzvf mft-4.12.0-105-x86_64-rpm.tgz
      #cd mft-4.12.0-105-x86_64-rpm
      #./install.sh
      mst start
    3. Verify the NIC firmware version and upgrade it to the latest if required 

      Code Block
      languagetext
      themeRDark
      # mlxfwmanager --query
      Querying Mellanox devices firmware ...
      
      Device #1:
      ----------
      
        Device Type:      ConnectX5
        Part Number:      MCX556A-EDA_Ax
        Description:      ConnectX-5 Ex VPI adapter card; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe4.0 x16; tall bracket; ROHS R6
        PSID:             MT_0000000009
        PCI Device Name:  /dev/mst/mt4121_pciconf0
        Base MAC:         ec0d9a7d81b2
        Versions:         Current        Available
           FW             16.25.1020     N/A
           PXE            3.5.0701       N/A
           UEFI           14.18.0019     N/A
      
        Status:           No matching image found
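
      If the query reports an older firmware version than the latest one published for your card, it can be upgraded with mlxfwmanager. The command below is a sketch only - the firmware image file name is a placeholder and depends on the bundle you download for your exact PSID (MT_0000000009 in this setup). A reboot is required after burning new firmware:

      Code Block
      languagetext
      themeRDark
      # mlxfwmanager -d /dev/mst/mt4121_pciconf0 -u -i <fw-ConnectX5-xxx-MCX556A-EDA_Ax.bin>
      # reboot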
    4. Enable packet pacing and HW timestamping on the port used for PTP

      Info
      titleNote

      The rivermax_config script is available for download here

      The relevant interface in our case is ens1f1.

      REBOOT is required between the steps.

      The "mcra" setting will not survive a reboot.

      This is expected to be persistent and enabled by default in future FW releases.

      Code Block
      languagetext
      themeRDark
      #mst start
      Starting MST (Mellanox Software Tools) driver set
      Loading MST PCI module - Success
      [warn] mst_pciconf is already loaded, skipping
      Create devices
      -W- Missing "lsusb" command, skipping MTUSB devices detection
      Unloading MST PCI module (unused) - Success
      
      #mst status -v
      MST modules:
      ------------
          MST PCI module is not loaded
          MST PCI configuration module loaded
      PCI devices:
      ------------
      DEVICE_TYPE             MST                           PCI       RDMA            NET                       NUMA
      ConnectX5(rev:0)        /dev/mst/mt4121_pciconf0.1    08:00.1   mlx5_1          net-ens1f1                0
      
      ConnectX5(rev:0)        /dev/mst/mt4121_pciconf0      08:00.0   mlx5_0          net-ens1f0                0
      
       
      #chmod 777 rivermax_config
      #./rivermax_config ens1f1
      running this can take few minutes...
      enabling
      Done!
      Code Block
      languagetext
      themeRDark
      # reboot
      Code Block
      languagetext
      themeRDark
      #mst start
      Starting MST (Mellanox Software Tools) driver set
      Loading MST PCI module - Success
      [warn] mst_pciconf is already loaded, skipping
      Create devices
      -W- Missing "lsusb" command, skipping MTUSB devices detection
      Unloading MST PCI module (unused) - Success
       
      #mcra /dev/mst/mt4121_pciconf0.1 0xd8068 3
      
      #mcra /dev/mst/mt4121_pciconf0.1 0xd8068
      0x00000003
    5. Sync the compute node clock

    6. Install linuxptp

      Code Block
      languagetext
      themeRDark
      # yum install -y linuxptp
    7. Use one of the following methods to identify the host VF interface name used for PTP (look for an IP address from the PTP network, or for "virtfn1", which correlates to vfid 1 used in the deployment configuration files)

      Code Block
      languagetext
      themeRDark
      [root@overcloud-computesriov1-0 ~]# ip addr show | grep "172.20"
          inet 172.20.0.102/24 brd 172.20.0.255 scope global enp8s8f3
      
      
       
      [root@overcloud-computesriov1-0 ~]# ls /sys/class/net/ens1f1/device/virtfn1/net/
      enp8s8f3
    8. Verify connectivity to the clock master (Onyx leaf switch sw11 over VLAN 51 for Rack 1, Onyx leaf switch sw10 over VLAN 52 for Rack 2)

      Code Block
      languagetext
      themeRDark
      [root@overcloud-computesriov1-0 ~]# ping 172.20.0.1
      PING 172.20.0.1 (172.20.0.1) 56(84) bytes of data.
      64 bytes from 172.20.0.1: icmp_seq=1 ttl=64 time=0.158 ms
      
      
       
      [root@overcloud-computesriov2-0 ~]# ping 172.20.2.1
      PING 172.20.2.1 (172.20.2.1) 56(84) bytes of data.
      64 bytes from 172.20.2.1: icmp_seq=1 ttl=64 time=0.110 ms
    9. Edit /etc/ptp4l.conf to include the following global parameters and the PTP interface parameters

      Code Block
      languagetext
      themeRDark
      [global]
      domainNumber 127
      priority1 128
      priority2 127
      use_syslog 1
      logging_level 6
      tx_timestamp_timeout 30
      hybrid_e2e 1
      dscp_event 46
      dscp_general 46
       
      [enp8s8f3]
      logAnnounceInterval -2
      announceReceiptTimeout 3
      logSyncInterval -3
      logMinDelayReqInterval -3
      delay_mechanism E2E
      network_transport UDPv4
       
    10. Start ptp4l on the PTP VF interface

      Info
      titleNote

      The command below runs ptp4l in slave mode on a dedicated host CPU that is isolated and excluded from Nova per our deployment configuration files (core 16 in our case).

      The second command verifies that the PTP clock is locked to the master clock source; rms values should be low.

      Code Block
      languagetext
      themeRDark
      # taskset -c 16 ptp4l -s -f /etc/ptp4l.conf &
      # tail -f /var/log/messages | grep rms
      ptp4l: [2560.009] rms   12 max   22 freq -12197 +/-  16
      ptp4l: [2561.010] rms   10 max   18 freq -12200 +/-  13 delay    63 +/-   0
      ptp4l: [2562.010] rms   10 max   21 freq -12212 +/-  10 delay    63 +/-   0
      ptp4l: [2563.011] rms   10 max   21 freq -12208 +/-  14 delay    63 +/-   0
      ptp4l: [2564.012] rms    9 max   14 freq -12220 +/-   8
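
      The ptp4l process started above will not survive a reboot. If you want it to persist, one option is to use the ptp4l systemd service shipped with the linuxptp package together with a CPUAffinity drop-in. This is a sketch only, assuming your PTP VF interface is enp8s8f3 and that core 16 remains the dedicated core:

      Code Block
      languagetext
      themeRDark
      # echo 'OPTIONS="-f /etc/ptp4l.conf -s -i enp8s8f3"' > /etc/sysconfig/ptp4l
      # mkdir -p /etc/systemd/system/ptp4l.service.d
      # printf '[Service]\nCPUAffinity=16\n' > /etc/systemd/system/ptp4l.service.d/cpuaffinity.conf
      # systemctl daemon-reload
      # systemctl enable ptp4l && systemctl restart ptp4l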
    11. Start phc2sys on the same interface to sync the host system clock

      Info
      titleNote

      The command below runs phc2sys on a dedicated host CPU that is isolated and excluded from Nova per our deployment configuration files (core 17 in our case).

      The second command verifies that the system clock is synced to PTP; offset values should be low and match the ptp4l rms values.

      Code Block
      languagetext
      themeRDark
      # taskset -c 17  phc2sys -s enp8s8f3 -w -m -n 127 >> /var/log/messages &
      # tail -f /var/log/messages | grep offset
      phc2sys[2797.730] phc offset         0 s2 freq  +14570 delay    959
      phc2sys[2798.730]: phc offset       -43 s2 freq  +14527 delay    957
      phc2sys[2799.730]: phc offset        10 s2 freq  +14567 delay    951
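
      Since phc2sys is now steering the host system clock, it is worth verifying that no other time service (chronyd or ntpd, which may have been configured during the overcloud deployment) is adjusting the clock at the same time. Whether to stop such a service is an operational decision for your environment; this check is a suggestion and not part of the original procedure:

      Code Block
      languagetext
      themeRDark
      # systemctl is-active chronyd ntpd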

    Application VMs and Use Cases

    In the section below we will cover two main use cases:

    1. IP Multicast stream between media VMs located in different L3 routed provider networks 
      (Diagram: Use Case 1 topology)



    2. HW-Offloaded Unicast stream over VXLAN tunnel between media VMs located in different L3 routed provider networks 
      (Diagram: Use Case 2 topology)



    Media Instances Creation

    Each media VM will own both an SR-IOV-based VLAN network port and an ASAP²-based VXLAN network port. The same VMs can be used to test all of the use cases.

    1. Contact NVIDIA Networking Support to get the Rivermax VM cloud image file (RivermaxCloud_v3.qcow2)

      Info
      titleNote

      The login credentials to VMs that are using this image are: root/3tango

    2. Upload the Rivermax cloud image to the overcloud image repository

      Code Block
      languagetext
      themeRDark
      source overcloudrc
      openstack image create --file RivermaxCloud_v3.qcow2 --disk-format qcow2 --container-format bare rivermax
    3. Create a flavor with a dedicated CPU policy to ensure the VM vCPUs are pinned to the isolated host CPUs

      Code Block
      languagetext
      themeRDark
      openstack flavor create m1.rivermax --id auto --ram 4096 --disk 20 --vcpus 4
      openstack flavor set m1.rivermax --property hw:mem_page_size=large
      openstack flavor set m1.rivermax --property hw:cpu_policy=dedicated
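
      Optionally, verify that the flavor properties were applied before booting any instances:

      Code Block
      languagetext
      themeRDark
      openstack flavor show m1.rivermax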
    4. Create a multi-segment network for tenant VLAN multicast traffic

      Info
      titleNotes

      Each network segment contains an SR-IOV direct port with an IP address from a different subnet.

      Each subnet is associated with a different physical network, correlated with a different routed provider rack.

      Routes to the subnets are propagated between racks via the provider L3 infrastructure (OSPF in our case).

      The subnet GWs are the Leaf ToR switch of each rack.

      Both segments under this multi-segment network carry the same VLAN segment ID.

      Code Block
      languagetext
      themeRDark
      openstack network create mc_vlan_net --provider-physical-network tenantvlan1 --provider-network-type vlan --provider-segment 101 --share
      openstack network segment list --network mc_vlan_net
      +--------------------------------------+------+--------------------------------------+--------------+---------+
      | ID                                   | Name | Network                              | Network Type | Segment |
      +--------------------------------------+------+--------------------------------------+--------------+---------+
      | 309dd695-b45d-455e-b171-5739cc309dcf | None | 00665b03-eeae-4b5d-af65-063f8e989c24 | vlan         |     101 |
      +--------------------------------------+------+--------------------------------------+--------------+---------+
      
      
      openstack network segment set --name segment1 309dd695-b45d-455e-b171-5739cc309dcf
      openstack network segment create --physical-network tenantvlan2 --network-type vlan --segment 101 --network mc_vlan_net segment2
       
      (overcloud) [stack@rhosp-director ~]$ openstack network segment list 
      +--------------------------------------+----------+--------------------------------------+--------------+---------+
      | ID                                   | Name     | Network                              | Network Type | Segment |
      +--------------------------------------+----------+--------------------------------------+--------------+---------+
      | 309dd695-b45d-455e-b171-5739cc309dcf | segment1 | 00665b03-eeae-4b5d-af65-063f8e989c24 | vlan         |     101 |
      | cac89791-2d7f-45e7-8c85-cc0a65060e81 | segment2 | 00665b03-eeae-4b5d-af65-063f8e989c24 | vlan         |     101 |
      +--------------------------------------+----------+--------------------------------------+--------------+---------+
      
      openstack subnet create mc_vlan_subnet --dhcp --network mc_vlan_net --network-segment segment1 --subnet-range 11.11.11.0/24 --gateway 11.11.11.1
      openstack subnet create mc_vlan_subnet_2 --dhcp --network mc_vlan_net --network-segment segment2 --subnet-range 22.22.22.0/24 --gateway 22.22.22.1
      
      openstack port create mc_direct1 --vnic-type=direct --network mc_vlan_net 
      openstack port create mc_direct2 --vnic-type=direct --network mc_vlan_net 
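
      Optionally, list the ports on the multi-segment network to confirm both direct ports were created on it:

      Code Block
      languagetext
      themeRDark
      openstack port list --network mc_vlan_net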
    5. Create a VXLAN tenant network for unicast traffic with 2 x ASAP² offload ports 

      Code Block
      languagetext
      themeRDark
      openstack network create tenant_vxlan_net --provider-network-type vxlan --share
      openstack subnet create tenant_vxlan_subnet --dhcp --network tenant_vxlan_net --subnet-range 33.33.33.0/24 --gateway none
      openstack port create offload1 --vnic-type=direct --network tenant_vxlan_net --binding-profile '{"capabilities":["switchdev"]}'
      openstack port create offload2 --vnic-type=direct --network tenant_vxlan_net --binding-profile '{"capabilities":["switchdev"]}'
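
      Optionally, confirm that the switchdev capability was stored in the offload ports' binding profile (the field may be displayed as binding_profile or binding:profile depending on the client version):

      Code Block
      languagetext
      themeRDark
      openstack port show offload1 | grep -i profile
      openstack port show offload2 | grep -i profile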
    6. Create a Rivermax instance on the media compute node located in Rack 1 (provider network segment 1), with one direct SR-IOV port on the VLAN network and one ASAP² offload port on the VXLAN network

      Code Block
      languagetext
      themeRDark
      openstack server create --flavor m1.rivermax --image rivermax --nic port-id=mc_direct1 --nic port-id=offload1 vm1 --availability-zone nova:overcloud-computesriov1-0.localdomain
    7. Create a second Rivermax instance on the media compute node located in Rack 2 (provider network segment 2), with one direct SR-IOV port on the VLAN network and one ASAP² offload port on the VXLAN network

      Code Block
      languagetext
      themeRDark
      openstack server create --flavor m1.rivermax --image rivermax --nic port-id=mc_direct2 --nic port-id=offload2 vm2 --availability-zone nova:overcloud-computesriov2-0.localdomain
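
      Optionally, confirm both instances are ACTIVE and were scheduled to the intended compute nodes (the host column is shown to admin users with --long):

      Code Block
      languagetext
      themeRDark
      openstack server list --long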
    8. Connect to the compute nodes and verify the VMs are pinned to the isolated CPUs

      Code Block
      languagetext
      themeRDark
      [root@overcloud-computesriov1-0 ~]# virsh list
       Id    Name                           State
      ----------------------------------------------------
       1     instance-0000002b              running
      
      
      [root@overcloud-computesriov1-0 ~]# virsh vcpupin 1
      VCPU: CPU Affinity
      ----------------------------------
         0: 15
         1: 2
         2: 3
         3: 4
      



    Rivermax Application Testing - Use Case 1:

    In the following section we use Rivermax application VMs created on 2 media compute nodes located in different network racks.

    First, we will lock onto the PTP clock generated by the Onyx switches and propagated into the VMs via the KVM vPTP driver.

    Next, we will generate a media-standards-compliant stream on VM1 and validate compliance using the NVIDIA Rivermax AnalyzeX tool on VM2. The multicast stream generated by VM1 will traverse the network using PIM-SM and will be received by VM2, which joined the group. Please note that this stream contains an RTP header (including 1 SRD) per packet and complies with the known media RFCs; however, the RTP payload is 0, so it cannot be visually displayed.

    In the last step, we will decode and stream a real video file on VM1 and play it on the receiver VM2 through a graphical interface, using the NVIDIA Rivermax simple viewer tool.


    1. Upload the rivermax and analyzex license files to the Rivermax VMs and place them under the /opt/mellanox/rivermx directory.
    2. On both VMs run the following command to sync the system time from PTP:

      Code Block
      languagetext
      themeRDark
      taskset -c 1 phc2sys -s /dev/ptp2 -O 0 -m >> /var/log/messages &
      
      # tail -f /var/log/messages | grep offset
      phc2sys[2797.730] phc offset         0 s2 freq  +14570 delay    959
      phc2sys[2798.730]: phc offset       -43 s2 freq  +14527 delay    957
      phc2sys[2799.730]: phc offset        10 s2 freq  +14567 delay    951

      Notice that phc2sys runs on dedicated VM core 1 (which is isolated from the hypervisor) and is applied to the ptp2 device. In some cases the PTP device names in the VM will be different.

      Ignore the "clock is not adjustable" message when applying the command.

      Low and stable offset values will indicate a lock.

      Info
      titleNote

      Important: Verify that the freq values in the output are close to the values seen at the compute node level (see above, where we ran the command on the host).
      If not, use a different /dev/ptp device available in the VM in the phc2sys command - see the sketch below for identifying the available PTP devices.
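
      If /dev/ptp2 does not exist in your guest, you can identify the KVM virtual PTP clock by listing the PTP devices and their clock names. This is only a suggested way to locate the right device; numbering varies per VM, and the clock_name attribute is assumed to be exposed by the guest kernel:

      Code Block
      languagetext
      themeRDark
      # ls /dev/ptp*
      # for d in /sys/class/ptp/ptp*; do echo "$d: $(cat $d/clock_name)"; done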

    3. On both VMs, run the SDP file modification script to adjust the media configuration file (sdp_hd_video_audio) as desired:

      Code Block
      languagetext
      themeRDark
      #cd /home/Rivermax
      #./sdp_modify.sh
      === SDP File Modification Script ===
      Default source IP Address is 11.11.11.10 would you like to change it (Y\N)?y
      Please select source IP Address in format X.X.X.X :11.11.11.25
      Default Video stream multicast IP Address is 224.1.1.20 would you like to change it (Y\N)?y
      Please select Video stream multicast IP Address:224.1.1.110
      Default  Video stream multicast Port is 5000 would you like to change it (Y\N)?n
      Default Audio stream multicast IP Address is 224.1.1.30 would you like to change it (Y\N)?y
      Please select Audio stream multicast IP Address:224.1.1.110
      Default Audio stream multicast Port: is 5010 would you like to change it (Y\N)?n
      Your SDP file is ready with the following parameters:
      IP_ADDR 11.11.11.25
      MC_VIDEO_IP 224.1.1.110
      MC_VIDEO_PORT 5000
      MC_AUDIO_IP 224.1.1.110
      MC_AUDIO_PORT 5010
      
      
      # cat sdp_hd_video_audio
      v=0
      o=- 1443716955 1443716955 IN IP4 11.11.11.25
      s=st2110 stream
      t=0 0
      m=video 5000 RTP/AVP 96
      c=IN IP4 224.1.1.110/64
      a=source-filter:incl IN IP4 224.1.1.110 11.11.11.25
      a=rtpmap:96 raw/90000
      a=fmtp:96 sampling=YCbCr-4:2:2; width=1920; height=1080; exactframerate=50; depth=10; TCS=SDR; colorimetry=BT709; PM=2110GPM; SSN=ST2110-20:2017; TP=2110TPN;
      a=mediaclk:direct=0
      a=ts-refclk:localmac=40-a3-6b-a0-2b-d2
      m=audio 5010 RTP/AVP 97
      c=IN IP4 224.1.1.110/64
      a=source-filter:incl IN IP4 224.1.1.110 11.11.11.25
      a=rtpmap:97 L24/48000/2
      a=mediaclk:direct=0 rate=48000
      a=ptime:1
      a=ts-refclk:localmac=40-a3-6b-a0-2b-d2
    4. On both VMs, issue the following commands to define the VMA memory buffers:

      Code Block
      languagetext
      themeRDark
      export VMA_RX_BUFS=2048
      export VMA_TX_BUFS=2048
      export VMA_RX_WRE=1024
      export VMA_TX_WRE=1024
    5. On the first VM ("transmitter VM"), generate the media stream using the Rivermax media_sender application. The command below runs the Rivermax media_sender application on dedicated VM vCPUs 2 and 3 (which are isolated from the hypervisor).

      The media_sender application uses the system time to operate.

      Code Block
      languagetext
      themeRDark
      # ./media_sender -c 2 -a 3 -s sdp_hd_video_audio -m
    6. On the second VM ("receiver VM"), run the AnalyzeX tool to verify compliance. The command below runs the Rivermax AnalyzeX compliance tool on dedicated VM vCPUs 1-3 (which are isolated from the hypervisor).

      Code Block
      languagetext
      themeRDark
      # VMA_HW_TS_CONVERSION=2 ANALYZEX_STACK_JITTER=2 LD_PRELOAD=libvma.so taskset -c 1-3 ./analyzex -i ens4 -s sdp_hd_video_audio -p
    7. The following AnalyzeX results indicate full compliance with the ST 2110 media standards:

      (Screenshot: AnalyzeX compliance report)

    8. Stop Rivermax media_sender application on VM1 and AnalyzeX tool on VM2.

    9. Log in to VM1 and extract the video file under the /home/Rivermax directory

      Code Block
      languagetext
      themeRDark
      # gunzip mellanoxTV_1080p50.ycbcr.gz
    10. Re-run the Rivermax media_sender application on VM1, this time specifying the video file. A lower rate is used to allow the graphical interface to cope with the video playback task:

      Code Block
      languagetext
      themeRDark
      # ./media_sender -c 2 -a 3 -s sdp_hd_video_audio -m -f mellanoxTV_1080p50.ycbcr --fps 25
    11. Open a graphical remote session to VM2. In our case, we allocated a public floating IP to VM2 and used the X2Go client to open a remote session:

      (Screenshot: X2Go remote session to VM2)

    12. Open the terminal and run the Rivermax rx_hello_world_viewer application under the /home/Rivermax directory. Specify the local VLAN IP address of VM2 and the multicast address of the stream. Once the command is issued, the video will start playing on screen.

      Code Block
      languagetext
      themeRDark
      #cd /home/Rivermax
      # ./rx_hello_world_viewer -i 22.22.22.4 -m 224.1.1.110 -p 5000

      (Screenshots: the video stream playing in the Rivermax viewer)

      The following video demonstrates the procedure:

      (Attached video: Simple_player.mp4)



    Rivermax Application Testing - Use Case 2

    In the following section we use the same Rivermax application VMs, created on two remote media compute nodes, to generate a unicast stream between the VMs over the VXLAN overlay network.

    After validating that the PTP clock is locked, we will start the stream and monitor it with the same tools.

    The unicast stream generated by VM1 will create a VXLAN OVS flow that is offloaded to the NIC hardware.


    1. Make sure rivermax and analyzex license files are placed on the Rivermax VMs as instructed in Use Case 1.
    2. Make sure the system time on both VMs is updated from PTP as instructed in Use Case 1.

      Code Block
      languagetext
      themeRDark
      # tail -f /var/log/messages | grep offset
      phc2sys[2797.730] phc offset         0 s2 freq  +14570 delay    959
      phc2sys[2798.730]: phc offset       -43 s2 freq  +14527 delay    957
      phc2sys[2799.730]: phc offset        10 s2 freq  +14567 delay    951
    3. On transmitter VM1, run the SDP file modification script to create a unicast configuration file - specify the VM1 VXLAN IP address as the source IP and the VM2 VXLAN IP address as the stream destination

      Code Block
      languagetext
      themeRDark
      # ./sdp_modify.sh
      === SDP File Modification Script ===
      Default source IP Address is 11.11.11.10 would you like to change it (Y\N)?y
      Please select source IP Address in format X.X.X.X :33.33.33.12
      Default Video stream multicast IP Address is 224.1.1.20 would you like to change it (Y\N)?y
      Please select Video stream multicast IP Address:33.33.33.16
      Default  Video stream multicast Port is 5000 would you like to change it (Y\N)?n
      Default Audio stream multicast IP Address is 224.1.1.30 would you like to change it (Y\N)?y
      Please select Audio stream multicast IP Address:33.33.33.16
      Default Audio stream multicast Port: is 5010 would you like to change it (Y\N)?n
      Your SDP file is ready with the following parameters:
      IP_ADDR 33.33.33.12
      MC_VIDEO_IP 33.33.33.16
      MC_VIDEO_PORT 5000
      MC_AUDIO_IP 33.33.33.16
      MC_AUDIO_PORT 5010
    4. On VM1, generate the media stream using the Rivermax media_sender application - use the unicast SDP file you created in the previous step

      Code Block
      languagetext
      themeRDark
      # ./media_sender -c 2 -a 3 -s sdp_hd_video_audio -m
    5. On receiver VM2, run the Rivermax rx_hello_world application with the local VXLAN interface IP

      Info
      titleNote

      Make sure you use rx_hello_world tool and not rx_hello_world_viewer.

      Code Block
      languagetext
      themeRDark
      # ./rx_hello_world -i 33.33.33.16 -m 33.33.33.16 -p 5000
    6. On the Compute nodes verify the flows are offloaded to the HW

      1. On compute node 1, which hosts transmitter VM1, the offloaded flow includes the traffic coming from the VM over the representor interface and going into the VXLAN tunnel:

        Code Block
        languagetext
        themeRDark
        [root@overcloud-computesriov1-0 heat-admin]# ovs-dpctl dump-flows type=offloaded --name
        
         
        in_port(eth4),eth(src=fa:16:3e:94:a4:5d,dst=fa:16:3e:fc:59:f3),eth_type(0x0800),ipv4(tos=0/0x3,frag=no), packets:54527279, bytes:71539619808, used:0.330s, actions:set(tunnel(tun_id=0x8,src=172.19.0.100,dst=172.19.2.105,tp_dst=4789,flags(key))),vxlan_sys_4789
      2. On compute node 2, which hosts receiver VM2, the offloaded flow includes the traffic arriving over the VXLAN tunnel and going into the VM over the representor interface:

        Code Block
        languagetext
        themeRDark
        [root@overcloud-computesriov2-0 ~]# ovs-dpctl dump-flows type=offloaded --name
         
        tunnel(tun_id=0x8,src=172.19.0.100,dst=172.19.2.105,tp_dst=4789,flags(+key)),in_port(vxlan_sys_4789),eth(src=fa:16:3e:94:a4:5d,dst=fa:16:3e:fc:59:f3),eth_type(0x0800),ipv4(frag=no), packets:75722169, bytes:95561342656, used:0.420s, actions:eth5
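
        As an additional optional check, you can inspect the tc flower rules programmed on the VM representor ports; hardware-offloaded rules are flagged with in_hw and their statistics should keep increasing while the stream is running. The representor names (eth4/eth5 in the flow dumps above) depend on your environment:

        Code Block
        languagetext
        themeRDark
        [root@overcloud-computesriov1-0 ~]# tc -s filter show dev eth4 ingress
        [root@overcloud-computesriov2-0 ~]# tc -s filter show dev eth5 ingress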
    QED!

    Authors

    Itai Levy
