Getting Started#

To leverage RDMA with ATS for high-performance compute, this guide walks through the following steps:

  • Configure the NVIDIA ConnectX-6 Dx NIC for RoCE

  • Enable ATS on VMware ESXi and Virtual Machines

  • Enable ATS on the NVIDIA ConnectX-6 Dx NIC

  • Configure NUMA Affinity

  • Create a Dockerfile for Multi-Node Training

  • Set Up Keyless Entry Between VMs on the Multi-Node Cluster

  • Run a Sample ResNet-50 Multi-Node Training

Configure NVIDIA ConnectX-6 Dx NIC and Spectrum switch for RoCE#

To leverage RoCE, the NVIDIA ConnectX-6 Dx NIC must run RoCE over a lossless network in DSCP-based QoS mode. The following Knowledge Article is a helpful resource for applying this configuration: https://community.mellanox.com/s/article/lossless-roce-configuration-for-mlnx-os-switches-in-dscp-based-qos-mode

For this guide, we will reference configuration steps within the Knowledge Article for version 3.8.2008 and above.

  1. Run the following commands on the NVIDIA switch:

    switch (config) # roce
    

    Note

    The RoCE feature has been automated, so all that is needed to run RoCE on a lossless fabric is running the roce command.

  2. Create an isolated VLAN and place the NVIDIA ConnectX NICs into the created VLAN as access ports. In this example, the four servers are connected to switch ports 1/1 - 1/4.

    switch (config) # interface vlan 111
    switch (config vlan 111) # exit
    switch (config) # interface ethernet 1/1-1/4 switchport access vlan 111
    
  3. Set the MTU to 9216 on the interfaces (on versions below 3.9.2110, the switch’s default MTU is 1500).

    switch (config) # interface ethernet 1/1-1/4 shutdown
    switch (config) # interface ethernet 1/1-1/4 mtu 9216
    switch (config) # interface ethernet 1/1-1/4 no shutdown
    
  4. Optional: if you are running Cumulus Linux, follow these instructions to enable RoCE: https://docs.cumulusnetworks.com/cumulus-linux-42/Network-Solutions/RDMA-over-Converged-Ethernet-RoCE/.
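
As an optional sanity check, you can review the RoCE settings that the automated command applied. This is a sketch assuming MLNX-OS 3.8.2008 or later; consult the Knowledge Article above for the exact output on your release:

    switch (config) # show roce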

Enable ATS on VMware ESXi and VMs#

To enable Peer-to-Peer (P2P) with high performance, we will enable ATS by updating the VMkernel and then the VM configuration.

  1. Update the VMkernel for Peer-to-Peer (P2P).

    • To enable the ATS boot option, invoke the following command and reboot ESXi:

      esxcli system settings kernel set -s atsSupport -v TRUE
      
    • To verify the value after the reboot, invoke:

      esxcli system settings kernel list -o atsSupport
      
    • The output should resemble the following:

      Name          Type     Configured  Runtime   Default  Description
      ------------  -------  ----------  -------   -------  -----------
      atsSupport    Bool     TRUE        TRUE      FALSE    Enable Support for PCIe ATS
      
  2. Update the VM configuration for P2P.

  3. Edit the VM configuration settings and add the following parameters:

    pciPassthru.allowP2P=true               # enable P2P
    pciPassthru.RelaxACSforP2P=true         # update ACS capabilities in switch
    

    Note

    When relaxing ACS for P2P is enabled, VMware will locate an ATS capable passthrough device, find its parent switch, and enable the ACS Direct Translated bit. The previous restriction that all functions of peer networking devices must be given to a single VM has been removed. Each function of a peer device can be given to a separate VM.

  4. If there are multiple GPU physical devices, the VM can specify a particular device for P2P with the existing config option:

    pciPassthru0.cfg.gpu-pci-id = "ssss:bb:dd.f"
    

    Note

    The gpu-pci-id is in hex SBDF format. If the GPU is in SR-IOV mode, you should specify a VF address.
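
Taken together, the P2P additions to a VM's configuration might look like the following sketch. The device address shown is a hypothetical example; substitute the SBDF (or VF) address of your own GPU:

pciPassthru.allowP2P=true
pciPassthru.RelaxACSforP2P=true
pciPassthru0.cfg.gpu-pci-id = "0000:3b:00.0"    # hypothetical address; use your GPU's SBDF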

Enable ATS on the NVIDIA ConnectX-6 Dx NIC#

  1. Install Python 2.7 with the command below:

    sudo apt-get install python
    
  2. Download and install MLNX OFED 5.0: https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/.

    • Select the OS/version/architecture and download the tar file, for example: (Ubuntu/20.04/x86_64).

    • Copy the package to the VMs, then run the following commands to extract and install it:

      tar xvf MLNX_OFED_LINUX-5.2-2.2.4.0-ubuntu20.04-x86_64.tgz
      cd MLNX_OFED_LINUX-5.2-2.2.4.0-ubuntu20.04-x86_64
      sudo ./mlnxofedinstall
      

      Note

      The above step will also update the firmware for all ConnectX-5 or ConnectX-6 cards.

    • Run the following command after the install is complete:

      sudo /etc/init.d/openibd restart
      

      Note

      During the install process, the ConnectX-6 NICs are detected, and OFED should update the firmware. If this fails, download the latest firmware and update it manually, then repeat the OFED install.

  3. Check the OFED and firmware versions using the following commands:

    dpkg -l | grep mlnx-ofed
    cat /sys/class/infiniband/mlx5*/fw_ver
    
  4. Start Mellanox software tools:

    sudo mst start
    
  5. Check the status of the ATS_ENABLED configuration for the ConnectX-6 NIC using the command below. You should see output similar to the following:

    sudo mlxconfig -d /dev/mst/mt4123_pciconf0 query | grep -i ATS
    ATS_ENABLED                         False(0)
    
  6. If the ATS_ENABLED entry is not present, the firmware does not support ATS; update to a firmware version that does. If it is set to False, use the following command to enable ATS:

     sudo mlxconfig -d /dev/mst/mt4123_pciconf0 set ATS_ENABLED=true
     Device #1:
     ----------
     Device type:    ConnectX6
     Name:           MCX653105A-HDA_Ax
     Description:    ConnectX-6 VPI adapter card; HDR IB (200Gb/s) and 200GbE; single-port QSFP56; PCIe4.0 x16; tall bracket; ROHS R6
     Device:         /dev/mst/mt4123_pciconf0

     Configurations:           Next Boot     New
     ATS_ENABLED               False(0)      True(1)
     Apply new Configuration? (y/n) [n] : y
     Applying... Done!
     -I- Please reboot machine to load new configurations.
    
  7. Once you have enabled ATS on the ConnectX-6 NIC in both VMs, put the host in maintenance mode and reboot the ESXi host.

    Note

    If you have vMotion configured between two hosts, then VMs on a host can move to another running host while the necessary reboots occur to enable ATS.

    Note

    Remember to re-submit the command to enable the ACS Direct Translated bit on the PCIe switch.

  8. After the ESXi host reboot is complete, power back on vCenter and the VMs.

  9. Next, verify that ATS is enabled on the VMs by running the following commands:

    sudo mst start
    sudo mlxconfig -d /dev/mst/mt4123_pciconf0 query | grep -i ATS
    sudo lspci -vvv
    
  10. Search for the Mellanox ConnectX-6 device and verify that the output contains the ATS capability, configured as shown below:

    Capabilities: [480 v1] Address Translation Service (ATS)
        ATSCap: Invalidate Queue Depth: 00
        ATSCtl: Enable+, Smallest Translation Unit: 00
    

    Note

    Enable+ indicates that ATS has been successfully enabled.
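
Because the full lspci -vvv output is long, a filtered query can save time. The following one-liner is a sketch that assumes the standard Mellanox PCI vendor ID (15b3):

sudo lspci -d 15b3: -vvv | grep -A 2 "Address Translation Service"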

Configure NUMA Affinity for the VMs#

  1. To check which NUMA node your NICs and GPUs are attached to, run the following commands on the ESXi host:

    esxcli hardware pci list | grep -A 30 -B 10 NVIDIA
    esxcli hardware pci list | grep -A 30 -B 10 Mellanox
    
  2. The following output shows the device's NUMA node:

     0000:3b:02.3
         Address: 0000:3b:02.3
         Segment: 0x0000
         Bus: 0x3b
         Slot: 0x02
         Function: 0x3
         VMkernel Name: PF_0.59.0_VF_15
         Vendor Name: NVIDIA Corporation
         Device Name: NVIDIAA100-PCIE-40GB
         Configured Owner: VMkernel
         Current Owner: VMkernel
         Vendor ID: 0x10de
         Device ID: 0x20f1
         SubVendor ID: 0x10de
         SubDevice ID: 0x0000
         Device Class: 0x0302
         Device Class Name: 3D controller
         Programming Interface: 0x00
         Revision ID: 0xa1
         Interrupt Line: 0xff
         IRQ: 255
         Interrupt Vector: 0x00
         PCI Pin: 0xff
         Spawned Bus: 0x00
         Flags: 0x0001
         Module ID: 54
         Module Name: nvidia
         Chassis: 0
         Physical Slot: -1
         Slot Description:
         Device Layer Bus Address: s00000001.00.vf15
         Passthru Capable: true
         Parent Device: PCI 0:58:0:0
         Dependent Device: PCI 0:59:2:3
         Reset Method: Function reset
         FPT Sharable: true
         NUMA Node: 0
         Extended Device ID: 65535
         Extended Device Name:
    
  3. Make sure the NIC and the GPU are on the same NUMA node.

  4. Within the VM configuration, add a new key-value:

    numa.nodeAffinity = <numa node value>
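
Because the full esxcli listing is verbose, the device-to-NUMA-node mapping can also be pulled out with a filter. A minimal sketch against the output fields shown above:

esxcli hardware pci list | grep -E "Address:|Device Name:|NUMA Node:"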
    

Creating a Dockerfile For Multi-Node Training#

  1. Build a Docker image from the Dockerfile below:

     FROM nvcr.io/nvaie/tensorflow:21.07-tf1-py3

     ARG DEBIAN_FRONTEND=noninteractive

     # Set MOFED version, OS version and platform
     ENV MOFED_VERSION 5.2-2.2.4.0

     # http://content.mellanox.com/ofed/MLNX_OFED-5.2-2.2.4.0/MLNX_OFED_LINUX-5.2-2.2.4.0-ubuntu20.04-x86_64.tgz
     ENV OS_VERSION ubuntu20.04

     ENV PLATFORM x86_64

     RUN pip3 install --user --upgrade pip && \
         pip3 install --no-cache-dir absl-py

     RUN apt-get update && \
         apt-get install -y --allow-downgrades --allow-change-held-packages --no-install-recommends \
             apt-utils build-essential cmake tcsh tcl tk \
             make git curl vim wget ca-certificates \
             iputils-ping net-tools ethtool \
             perl lsb-release python-libxml2 \
             iproute2 pciutils libnl-route-3-200 \
             kmod libnuma1 lsof openssh-server \
             swig libelf1 automake libglib2.0-0 \
             autoconf graphviz chrpath flex libnl-3-200 m4 \
             debhelper autotools-dev gfortran libltdl-dev \
             dmidecode zip hwloc numactl \
             dpatch bison pkg-config dkms udev libnl-route-3-dev libnl-3-dev \
             libmnl0 libmnl-dev expect-dev ncat \
             usbutils iperf3 bc tree \
             quilt \
             landscape-common libpci-dev && \
         rm -rf /var/lib/apt/lists/*
     # hugepages libgfortran3 netcat
     # linux-headers-$(uname -r)

     # Download and install the MOFED user-space libraries (no firmware update inside the container)
     WORKDIR /workspace
     RUN wget http://content.mellanox.com/ofed/MLNX_OFED-${MOFED_VERSION}/MLNX_OFED_LINUX-$MOFED_VERSION-$OS_VERSION-$PLATFORM.tgz && \
         tar -xvf MLNX_OFED_LINUX-${MOFED_VERSION}-${OS_VERSION}-${PLATFORM}.tgz && \
         MLNX_OFED_LINUX-${MOFED_VERSION}-${OS_VERSION}-${PLATFORM}/mlnxofedinstall --user-space-only --without-fw-update --force && \
         tree /workspace/MLNX_OFED_LINUX-${MOFED_VERSION}-${OS_VERSION}-${PLATFORM}/

     # Clone the TensorFlow CNN benchmarks (provides tf_cnn_benchmarks.py used later)
     WORKDIR /workspace
     RUN git clone -b cnn_tf_v1.15_compatible https://github.com/tensorflow/benchmarks.git

     # Build the NCCL tests with MPI support
     WORKDIR /workspace
     RUN git clone https://github.com/NVIDIA/nccl-tests && \
         cd nccl-tests && \
         make MPI=1 MPI_HOME=/usr/local/mpi

     # Build perftest with CUDA support for RDMA bandwidth/latency testing
     WORKDIR /workspace
     RUN git clone https://github.com/linux-rdma/perftest && \
         cd perftest && \
         ./autogen.sh && \
         CUDA_H_PATH=/usr/local/cuda/include/cuda.h ./configure && \
         make install

     WORKDIR /test

     RUN rm -f ${_CUDA_COMPAT_PATH}/.*.checked
    
  2. Run the following command in the same folder as the Dockerfile to build the multi-node container image:

    sudo docker build -t multinode:latest .
    
  3. Tag and upload the image to your NVIDIA AI Enterprise private registry:

    sudo docker tag multinode <NVIDIA_AI_Enterprise_private_registry_username>/multinode
    sudo docker push <NVIDIA_AI_Enterprise_private_registry_username>/multinode
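
As a quick smoke test, you can confirm that the image sees the GPUs before using it for training. This is a sketch assuming the local image tag built above:

sudo docker run --rm --gpus=all multinode:latest nvidia-smi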
    

Set Up Keyless Entry Between VMs on the Multi-Node Cluster#

On a clean install of a system, the ~/.ssh directory is typically empty. However, the following files will be generated/added using the steps within this guide:

id_rsa and id_rsa.pub

SSH keys used for keyless entry between nodes.

authorized_keys

A list of RSA public keys from other nodes/systems recognized by a server for ssh access.

config

A file that configures SSH host key checking behavior when accessing other nodes.

mpicont.sh

A script that we will create to allow MPI to communicate between containers on separate nodes.

ssh_container/

A directory that contains the files above but for internode container communication.

known_hosts

This file is auto-generated by SSH and lists the keys of all hosts the user has ever connected to.

Generating SSH Keys#

On the master node, we will create a pair of SSH keys to share between the nodes. Then another pair will be generated for use between containers running on the nodes. We will name each set of keys accordingly for this guide, but the default key names id_rsa and id_rsa.pub are also fine.

Host/Worker SSH Keys#

  1. Within the command-line terminal, create a new SSH key:

    ssh-keygen -t rsa
    
  2. When prompted to enter the file in which to save the key (/home/nvidia/.ssh/id_rsa), enter:

    id_rsa_host
    

This will generate the following files:

  • id_rsa_host

  • id_rsa_host.pub

Container SSH Keys#

  1. Make a directory named ssh_container. This directory can be created anywhere; for this example, we will put it in our ~/.ssh directory:

    mkdir ssh_container
    cd ssh_container
    ssh-keygen -t rsa
    
  2. When prompted to enter the file in which to save the key (/home/nvidia/.ssh/id_rsa), enter:

    <path/to>/ssh_container/id_rsa_cont
    

Within the ssh_container directory, this will generate:

  • id_rsa_cont

  • id_rsa_cont.pub
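
Equivalently, the key pair can be generated non-interactively. A sketch assuming an empty passphrase, as used throughout this guide:

ssh-keygen -t rsa -N "" -f ~/.ssh/ssh_container/id_rsa_cont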

Creating Config Files for Keyless Entry#

In our lab environment, the username is nvidia for our Ubuntu VMs. Please substitute the username in the following steps to reflect the user in your environment. On the master node, create a file called config (~/.ssh/config) and put in the following contents:

Host *
    User nvidia
    IdentityFile ~/.ssh/id_rsa_host
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null

Within the ssh_container directory (~/.ssh/ssh_container/config), create another config file for the keyless entry between containers:

Host *
    User nvidia
    IdentityFile /root/.ssh/id_rsa_cont
    StrictHostKeyChecking no
    UserKnownHostsFile=/dev/null
    LogLevel=Error
    ServerAliveInterval=30
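
To confirm which settings SSH will actually apply for a given host, OpenSSH (6.8 and later) can print the resolved configuration without connecting. A sketch using the host config created above:

ssh -F ~/.ssh/config -G <worker_node_IP> | grep -E "^user |^identityfile "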

Add Public SSH Keys To authorized_keys#

For keyless entry to work on the worker nodes, the contents of the public SSH keys need to be copied into the authorized_keys file for both internode communication and communication between the containers on separate nodes.

In the ~/.ssh folder:

cat id_rsa_host.pub > authorized_keys

In the ~/.ssh/ssh_container folder:

cat id_rsa_cont.pub > authorized_keys

Create mpicont.sh script#

  1. Within the ~/.ssh directory, create a script called mpicont.sh with the following contents:

    docker exec mpicont /bin/bash -c "$SSH_ORIGINAL_COMMAND"
    
  2. Then make the script executable:

    chmod +x mpicont.sh
    
Add Container SSH Key to the Master’s authorized_keys File#

Add the following line to the master's authorized_keys file:

command="bash /home/nvidia/.ssh/mpicont.sh",no-port-forwarding,no-agent-forwarding,no-X11-forwarding <add contents of id_rsa_cont.pub>
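
With this forced command in place, any SSH connection made to the master using the container key is redirected into the mpicont container instead of a login shell. Once the containers are running (see the docker run commands below), you can verify the behavior from inside a worker's container; a sketch assuming the container key and config are mounted at /root/.ssh:

ssh nvidia@<master_IP> hostname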
Copy ~/.ssh to Worker Nodes and Confirm Keyless Entry#

Now we can copy all the files from the master node's ~/.ssh directory to each of the worker nodes in our node list:

for worker in <worker_node_IPs>; do scp -r ~/.ssh nvidia@${worker}:/home/nvidia/; done

Change Permissions in the ssh_container on all Nodes#

On all the nodes, change the ownership of the ssh_container/config file so that the owner is root:

sudo chown root:root config

Then change the permissions to 600 for all files in the ssh_container folder.

sudo chmod 600 *

Below is a list of all the files that were copied over to the workers, along with their proper permissions:

~/.ssh$ ll *
-rw------- 1 nvidia nvidia  894 Jan 24 17:46 authorized_keys
-rw-r--r-- 1 nvidia nvidia  125 Jan 24 14:21 config
-rw------- 1 nvidia nvidia 1675 Jan 24 14:19 id_rsa_host
-rw-r--r-- 1 nvidia nvidia  396 Jan 24 14:19 id_rsa_host.pub
-rwxrwxr-x 1 nvidia nvidia   57 Jan 24 15:55 mpicont.sh*

ssh_container:

total 24
drwxrwxr-x 2 nvidia nvidia 4096 Feb  6 16:50 ./
drwxrwxr-x 4 nvidia nvidia 4096 Feb  7 11:29 ../
-rw------- 1 nvidia nvidia  396 Jan 24 15:58 authorized_keys
-rw------- 1 root   root    161 Jan 24 17:54 config
-rw------- 1 nvidia nvidia 1675 Jan 24 15:58 id_rsa_cont
-rw------- 1 nvidia nvidia  396 Jan 24 15:58 id_rsa_cont.pub

Now start a Docker container on each of the worker nodes using the following command:

sudo docker run -it --gpus=all --net=host --uts=host --ipc=host --ulimit stack=67108864 --ulimit memlock=-1 --shm-size=1g --name=mpicont --device=/dev/infiniband -v /home/nvidia/.ssh/ssh_container:/root/.ssh <NVIDIA_AI_Enterprise_private_registry_username>/multinode:latest sleep infinity

On the master node, run:

sudo docker run -it --gpus=all --net=host --uts=host --ipc=host --ulimit stack=67108864 --ulimit memlock=-1 --shm-size=1g --name=mpicont --device=/dev/infiniband -v /home/nvidia/.ssh/ssh_container:/root/.ssh <NVIDIA_AI_Enterprise_private_registry_username>/multinode:latest /bin/bash

To test that keyless SSH works for MPI, run the following command, adjusted for the number of workers you have:

mpirun --allow-run-as-root -H <master_IP>,<worker1_IP>,<worker2_IP>,<worker3_IP> -np "4" hostname

To verify the available GPUs on all worker nodes, run the following command:

mpirun --allow-run-as-root -H <worker1_IP>,<worker2_IP>,<worker3_IP> -np "3" nvidia-smi

Note

In our lab environment, the np (number of processes, or in other words, the number of GPUs) parameter is 4. Please modify the np parameter to reflect your environment.

The output should reflect the hostnames for all four nodes.

Install nv_peer_memory#

On each of the nodes, install the nv_peer_mem module:

git clone https://github.com/Mellanox/nv_peer_memory.git
cd nv_peer_memory
./build_module.sh
cd /tmp
tar xzf /tmp/nvidia-peer-memory_1.0.orig.tar.gz
cd nvidia-peer-memory-1.0
dpkg-buildpackage -us -uc
sudo dpkg -i <path to generated deb files>
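
After installing, you can confirm that the kernel module is loaded on each node with a minimal check:

lsmod | grep nv_peer_mem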

Run Sample ResNet-50 Multi-Node Training#

Note

Ensure that keyless SSH MPI is working with the command below:

mpirun --allow-run-as-root -H <master_IP>,<worker1_IP>,<worker2_IP>,<worker3_IP> -np "4" hostname
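
Optionally, you can also validate NCCL connectivity and bandwidth over RoCE before launching training, using the nccl-tests binaries built into the container image. This is a sketch; the path assumes the Dockerfile above, and -np matches four total GPUs:

mpirun --allow-run-as-root -H <master_IP>,<worker1_IP>,<worker2_IP>,<worker3_IP> -np "4" -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO /workspace/nccl-tests/build/all_reduce_perf -b 8 -e 128M -f 2 -g 1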

Run the following command to run the example ResNet-50 multi-node benchmark, adjusted for your worker node count:

mpirun --allow-run-as-root -H <master_IP>,<worker1_IP>,<worker2_IP>,<worker3_IP> -np "4" -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO python3 /workspace/benchmarks/scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model resnet50 --batch_size 512 --use_fp16 --variable_update=horovod --xla=True

Interpreting the Results#

This benchmark reports images-per-second training throughput at each reporting iteration. Use the last few values reported to represent training performance.

Done warm up
Step        Img/sec total_loss
Done warm up
Step        Img/sec total_loss
Done warm up
Step        Img/sec total_loss
Done warm up
Step        Img/sec total_loss
1   images/sec: 2100.6 +/- 0.0 (jitter = 0.0)       7.738
1   images/sec: 2100.8 +/- 0.0 (jitter = 0.0)       7.742
1   images/sec: 2100.2 +/- 0.0 (jitter = 0.0)       7.734
1   images/sec: 2100.8 +/- 0.0 (jitter = 0.0)       7.770
10  images/sec: 2100.0 +/- 61.9 (jitter = 6.6)      7.607
10  images/sec: 2100.4 +/- 60.4 (jitter = 189.7)    7.656
10  images/sec: 2100.9 +/- 59.2 (jitter = 88.7)     7.611
10  images/sec: 2100.9 +/- 59.0 (jitter = 175.8)    7.647
20  images/sec: 2100.2 +/- 39.4 (jitter = 92.3)     7.527
20  images/sec: 2100.2 +/- 43.8 (jitter = 198.3)    7.515
20  images/sec: 2100.1 +/- 41.1 (jitter = 181.8)    7.512
20  images/sec: 2100.1 +/- 43.0 (jitter = 14.7)     7.501
30  images/sec: 2100.9 +/- 34.9 (jitter = 198.3)    7.490
30  images/sec: 2100.4 +/- 35.3 (jitter = 11.1)     7.474
30  images/sec: 2100.7 +/- 33.3 (jitter = 92.9)     7.483
30  images/sec: 2100.3 +/- 34.9 (jitter = 157.3)    7.493
40  images/sec: 2100.5 +/- 28.3 (jitter = 76.4)     7.476
40  images/sec: 2100.9 +/- 31.2 (jitter = 193.8)    7.476
40  images/sec: 2100.5 +/- 31.2 (jitter = 186.9)    7.483
40  images/sec: 2100.2 +/- 31.5 (jitter = 18.9)     7.474
50  images/sec: 2100.8 +/- 28.1 (jitter = 15.0)     7.480
50  images/sec: 2100.3 +/- 28.3 (jitter = 168.8)    7.468
50  images/sec: 2100.7 +/- 25.7 (jitter = 76.4)     7.485
50  images/sec: 2100.2 +/- 27.4 (jitter = 218.1)    7.485
60  images/sec: 2100.2 +/- 25.6 (jitter = 173.0)    7.485
60  images/sec: 2100.3 +/- 23.3 (jitter = 66.1)     7.501
60  images/sec: 2100.4 +/- 24.8 (jitter = 190.7)    7.480
60  images/sec: 2100.2 +/- 26.4 (jitter = 20.6)     7.493
70  images/sec: 2100.4 +/- 24.3 (jitter = 16.4)     7.495
70  images/sec: 2100.4 +/- 23.9 (jitter = 157.3)    7.498
70  images/sec: 2100.0 +/- 22.1 (jitter = 52.3)     7.503
70  images/sec: 2100.5 +/- 23.4 (jitter = 218.3)    7.509
80  images/sec: 2100.3 +/- 22.4 (jitter = 157.3)    7.490
80  images/sec: 2100.2 +/- 20.6 (jitter = 50.7)     7.510
80  images/sec: 2100.6 +/- 21.7 (jitter = 195.2)    7.520
80  images/sec: 2100.2 +/- 22.4 (jitter = 30.3)     7.508
90  images/sec: 2100.8 +/- 21.2 (jitter = 22.3)     7.481
90  images/sec: 2100.1 +/- 20.8 (jitter = 157.3)    7.489
90  images/sec: 2100.7 +/- 19.7 (jitter = 35.1)     7.496
90  images/sec: 2100.7 +/- 20.7 (jitter = 218.1)    7.471
100 images/sec: 2100.2 +/- 20.2 (jitter = 30.3)     7.501
----------------------------------------------------------------
total images/sec: 8400.46
----------------------------------------------------------------
100 images/sec: 1520.1 +/- 19.9 (jitter = 166.6)    7.522
----------------------------------------------------------------
total images/sec: 8400.99
----------------------------------------------------------------
100 images/sec: 1517.6 +/- 18.6 (jitter = 52.3)     7.507
----------------------------------------------------------------
total images/sec: 8400.84
----------------------------------------------------------------
100 images/sec: 1517.9 +/- 19.6 (jitter = 219.0)    7.500
----------------------------------------------------------------
total images/sec: 8400.58
----------------------------------------------------------------