NVIDIA CloudXR Client SDK Files

The {sdk-root-folder}\Client folder contains the header files and library files that comprise the NVIDIA CloudXR SDK. You must include the header files in your application code and link to the appropriate library file when you compile your application.

File	File Name Including Path
Header files	{sdk-root-folder}\Client\Include\CloudXRClient.h, {sdk-root-folder}\Client\Include\CloudXRCommon.h, {sdk-root-folder}\Client\Include\CloudXRInputEvents.h
Windows OS library file	{sdk-root-folder}\Client\Lib\Windows\CloudXRClient.lib
Android OS library file	{sdk-root-folder}\Client\Lib\Android\CloudXR.aar
iOS library file	{sdk-root-folder}\Client\Lib\iOS\libCloudXR.a
iOS StreamSDK framework	{sdk-root-folder}\Client\Lib\iOS\StreamSdk.framework

To create a new application on most platforms, copy or reference the libraries/headers in the application project or makefile.

For Android, copy the CloudXR.aar file into your project, typically into the {project-root}\app\libs directory, so that the build.gradle script can unpack it. Unpacking generates the required headers and libraries, so you do not need to copy or reference the base headers. For more information, see the existing CloudXR Android samples.

Sample Client Application Source Code

The {sdk-root-folder}\Sample folder contains source code and the associated third-party code to build sample applications for supported client hardware on Windows, Android, and iOS operating systems. The sample applications include:

Developing a CloudXR Client

Overview

At a high level, the following phases exist in a CloudXR client application:

Client Setup: Creates the client Receiver object and initiates the server connection.
Client Main Loop: Latches and renders frames, releases the frames, and handles state changes, such as connecting and disconnecting.
Client Cleanup: Cleans up threads and all CloudXR resources, which releases the Receiver object.

A simplistic main() function in pseudo-code might look like the following:

int main() {
   MySetupDeviceDesc(&ddesc);
   MySetupCallbacks(&callbacks);
   MySetupReceiverDesc(&rdesc, ddesc, callbacks);

   cxrCreateReceiver();
   cxrConnect();

   while (!exiting) {
      MyPlatformEventHandling();
      if (client_state < connected) {
         // Still connecting: show progress instead of streamed frames.
         MyRenderConnectionProgress();
      }
      else if (client_state == connected) {
         UpdateTrackingState();
         cxrLatchFrame();
         MyRenderFrames(framesLatched);
         // NOTE: on Android you must call cxrBlitFrame()
         cxrReleaseFrame();
         // NOTE: every n frames you may call cxrGetConnectionStats()
      }
      else {
         // Any state past connected is treated as a disconnect.
         MyHandleDisconnect();
         exiting = true;
      }
   }

   cxrDestroyReceiver();
}

Note that UpdateTrackingState is a placeholder in this pseudo-code for the tracking state callback, which runs on another thread if asynchronous, or could run inline here if synchronous. That function is now where you handle adding controllers and binding their inputs, removing controllers, and collecting and sending input events.

Client Setup

To set up the client:

Allocate a device descriptor structure (see cxrDeviceDesc). This structure contains details about the hardware device properties and other runtime settings: the number of video streams; the format, resolution, FPS, and bitrate of each stream (see cxrClientVideoStreamDesc); audio support; the input polling frequency; and flags to enable or disable certain CloudXR server features such as Pose Prediction, Virtual VSync, and Foveation.

Note

Many client structures and setup have changed, including the new streaming config, new controller setup/handling, a few removed options, etc. You will want to review the structure documentation and the sample application code in order to better understand what updates are needed for a custom client to migrate to CloudXR 4.0.

As an example, the device descriptor no longer has a cxrDeliveryType; instead, width, height, and format are now set on a per-stream basis. So streaming AR to a tablet device would be set up as 1 RGBA stream, while streaming VR to an HMD would be set up as 2 RGB streams with isStereo set to true.

The following is a portion of the Windows sample client that has a member struct for the descriptor, and fills in fields with information like:

float fps = 0.0f;
uint32_t width = 0;
uint32_t height = 0;
int32_t x, y;

d->GetWindowBounds(&x, &y, &width, &height);
fps = m_hmd->GetFloatTrackedDeviceProperty(k_unTrackedDeviceIndex_Hmd, Prop_DisplayFrequency_Float);

m_deviceDesc.numVideoStreamDescs = CXR_NUM_VIDEO_STREAMS_XR;
for (uint32_t i = 0; i < m_deviceDesc.numVideoStreamDescs; i++)
{
   m_deviceDesc.videoStreamDescs[i].format = cxrClientSurfaceFormat_RGB;
   m_deviceDesc.videoStreamDescs[i].width = width / 2;
   m_deviceDesc.videoStreamDescs[i].height = height;
   m_deviceDesc.videoStreamDescs[i].fps = std::min(CXR_MAX_VIDEO_STREAM_FPS, fps);
   m_deviceDesc.videoStreamDescs[i].maxBitrate = options.mMaxVideoBitrate;
}
m_deviceDesc.stereoDisplay = (2==m_deviceDesc.numVideoStreamDescs);

m_deviceDesc.maxResFactor = options.mMaxResFactor;

m_deviceDesc.ipd = m_hmd->GetFloatTrackedDeviceProperty(k_unTrackedDeviceIndex_Hmd, Prop_UserIpdMeters_Float);

m_deviceDesc.predOffset = 0;

for (int i = 0; i < 2; i++)
{
   m_hmd->GetProjectionRaw((vr::EVREye)i, &m_deviceDesc.proj[i][0], &m_deviceDesc.proj[i][1], &m_deviceDesc.proj[i][2], &m_deviceDesc.proj[i][3]);
}

m_deviceDesc.receiveAudio = m_clientOptions.mReceiveAudio;
m_deviceDesc.sendAudio = m_clientOptions.mSendAudio;
m_deviceDesc.embedInfoInVideo = false;

m_deviceDesc.foveatedScaleFactor = m_clientOptions.mFoveation;
m_deviceDesc.foveationModeCaps = 0;

m_deviceDesc.posePollFreq = 0;
m_deviceDesc.disablePosePrediction = false;
m_deviceDesc.angularVelocityInDeviceSpace = false;

GetChaperone(&m_deviceDesc.chaperone);

Set up a client callback struct (see cxrClientCallbacks), which holds pointers to callback functions that your application supports.

The GetTrackingState callback must be implemented if the application wants the server to sync with latest view and input changes.

If the client application wants to support playing back audio from the server, implement the RenderAudio callback and pass along audio buffers to some audio playback system – see the samples for a few approaches across platforms.

All clients will typically want to implement the UpdateClientState callback to be notified of connection state changes, important especially for asynchronous connect, and to be notified of expected and unexpected disconnects during streaming.

New for the 4.0 release, the CloudXR runtime no longer ‘owns’ logging of messages. Instead, if you want to capture log messages from the CloudXR runtime, the client application needs to implement the LogMessage callback, and hand that information over to some client-side logging system. If you do not already have a log solution, you can look at CloudXRFileLogger, included in the shared directory of the SDK, and used by all the sample clients to approximate the internal log file output CloudXR used to do – the exception is the iOS client, which has a basic implementation of a logging class in Swift.

In the non-iOS samples you will see logging macros being used, and they all route to a static ‘dispatch’ function. Inside of the client, if you use those logging macros, they will redirect right to your own dispatch function. Inside of the CloudXR library, those same macros go to a library-internal dispatch function, and if you have registered a LogMessage callback it will pass the message along to the client app to handle. If the client does not have a message callback, the CloudXR internal fallback just logs to the appropriate standard/debug output for each platform. You don’t have to use this approach in your client, but it is a convenient pattern.

The purpose here, like many other recent changes, is to provide the client app with much more control. If you already use a third-party logging library, you can simply format fields as needed and hand them over to the logger, and CloudXR messages will immediately be integrated directly into your application logs.
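The dispatch pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoLogLevel, DemoLogCallback, demoSetLogCallback, demoDispatchLog, and the DEMO_LOG macros are hypothetical names invented for this sketch, and the callback signature shown is not the actual LogMessage signature (see cxrClientCallbacks for that).

```cpp
#include <cstdarg>
#include <cstdio>

// Hypothetical stand-ins; NOT the CloudXR types or the real LogMessage signature.
enum class DemoLogLevel { Debug, Info, Warning, Error };
using DemoLogCallback = void (*)(void* context, DemoLogLevel level, const char* message);

namespace {
DemoLogCallback g_logCallback = nullptr;  // registered by the client; may stay null
void* g_logContext = nullptr;
}

// Register (or clear, by passing nullptr) the client-side log handler.
inline void demoSetLogCallback(DemoLogCallback cb, void* context) {
    g_logCallback = cb;
    g_logContext = context;
}

// Single dispatch point that every logging macro routes to.
// Returns true if the message was handed to the registered callback,
// false if it fell back to standard error.
inline bool demoDispatchLog(DemoLogLevel level, const char* format, ...) {
    char buffer[512];
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof(buffer), format, args);  // truncates long messages
    va_end(args);

    if (g_logCallback != nullptr) {
        g_logCallback(g_logContext, level, buffer);
        return true;
    }
    std::fprintf(stderr, "[%d] %s\n", static_cast<int>(level), buffer);
    return false;
}

#define DEMO_LOGI(...) demoDispatchLog(DemoLogLevel::Info, __VA_ARGS__)
#define DEMO_LOGE(...) demoDispatchLog(DemoLogLevel::Error, __VA_ARGS__)
```

A client that already uses a third-party logger would register a callback that forwards the level and message to it; without a registered callback, messages simply go to standard error.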

Prepare a receiver descriptor struct (see cxrReceiverDesc).

This step copies in the device descriptor and the client callbacks, sets the client context passed to all callbacks (which might be a singleton object pointer or another global struct), sets the streaming requirements, and sets various debug/logging options, including an app-specific output data path. For more specific details on setting up output paths on the different platforms, see CloudXR File Storage.

Filling out the receiver descriptor might look like the following for a Windows client:

cxrReceiverDesc desc = {};
cxrClientCallbacks callbacks = {};

// Fill out callbacks with your supported callbacks, including the new LogMessage
// callback and the important UpdateClientState callback.
// See the samples for how each app sets up these callbacks.
callbacks.GetTrackingState = ...
callbacks.TriggerHaptic = ...
callbacks.RenderAudio = ...
callbacks.ReceiveUserData = ...
callbacks.UpdateClientState = ...
callbacks.LogMessage = ...
// Note the callbacks struct now holds the *context* for the client instead of the receiver descriptor.
callbacks.clientContext = this;

// Start filling in receiver descriptor fields.
// This is where you set your app-specific output path
strncpy(desc.appOutputPath, outputPath.c_str(), CXR_MAX_PATH - 1);
desc.appOutputPath[CXR_MAX_PATH - 1] = 0;

desc.requestedVersion = CLOUDXR_VERSION_DWORD;

desc.deviceDesc = deviceDesc;
desc.clientCallbacks = callbacks;
desc.shareContext = nullptr;

desc.debugFlags = m_clientOptions.mDebugFlags;
desc.logMaxSizeKB = m_clientOptions.mLogMaxSizeKB;

Note

For devices that don’t support end-to-end sRGB, you will want to flag that so CloudXR will shift into linear mode. This is as simple as:

m_receiverDesc.debugFlags |= cxrDebugFlags_OutputLinearRGBColor;

With all of the structures and fields prepared, the client can now call cxrCreateReceiver(), passing in the receiver descriptor, and a pointer to a cxrReceiverHandle to hold the returned Receiver handle needed for all further interactions with the CloudXR SDK. If it fails, report the error to the user (and log it), and exit cleanly.

If the creation of the Receiver succeeded, initiate a connection to the server by using cxrConnect(). Again, from the Windows client sample, this might look something like:
```
m_connectionDesc.async = true;
m_connectionDesc.useL4S = m_clientOptions.mUseL4S;
m_connectionDesc.clientNetwork = m_clientOptions.mClientNetwork;
m_connectionDesc.topology = m_clientOptions.mTopology;
cxrError err = cxrConnect(m_receiver, m_clientOptions.mServerIP.c_str(), &m_connectionDesc);
```
The parameters are the Receiver object, the server IP, and connection descriptor.

The IP is presumed to be a dotted numeric IPv4 address, which might have been manually entered, translated using something like DNS or custom matchmaking, or gathered from something like mDNS/Bonjour.
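Because the address is expected in dotted numeric IPv4 form, a client may want to validate user input before attempting to connect. The following is a minimal sketch of such a check; isDottedIPv4 is a hypothetical helper written for this example and is not part of the CloudXR SDK.

```cpp
#include <cstddef>
#include <string>

// Returns true if `text` is a dotted numeric IPv4 address such as "192.168.1.10":
// exactly four decimal octets in the range 0-255, separated by single dots.
// Leading zeros (e.g. "010.0.0.1") are accepted by this sketch.
inline bool isDottedIPv4(const std::string& text) {
    int octets = 0;  // completed octets so far
    int value = 0;   // value of the octet currently being parsed
    int digits = 0;  // digits seen in the current octet

    for (std::size_t i = 0; i <= text.size(); ++i) {
        const bool atEnd = (i == text.size());
        const char c = atEnd ? '.' : text[i];  // treat end-of-string as a final separator
        if (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            if (++digits > 3 || value > 255) return false;
        } else if (c == '.') {
            if (digits == 0) return false;  // empty octet, e.g. "1..2.3" or trailing dot
            ++octets;
            value = 0;
            digits = 0;
        } else {
            return false;  // hostnames, spaces, ports, and so on
        }
    }
    return octets == 4;
}
```

A hostname that fails this check would first need to be resolved (for example via DNS) before the resulting address is handed to cxrConnect().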

The fields of the connection descriptor (cxrConnectionDesc) help establish the connection and tell the server what the client knows about that connection. In general, passing along values from launch options is a good approach because it keeps the client configurable; the exception is a bespoke app built for one very specific device or configuration.
```
connectionDesc.async = cxrTrue;
connectionDesc.maxVideoBitrateKbps = launch_options_.mMaxVideoBitrate;
connectionDesc.clientNetwork = launch_options_.mClientNetwork;
connectionDesc.topology = launch_options_.mTopology;
connectionDesc.useL4S = launch_options_.mUseL4S;
```
An important field is async, which if true instructs the library to use a background thread to initiate connection to the server, and if false runs immediately on the current thread. If you decide to use the asynchronous mode (which is recommended), ensure that you implement the handling of connection state changes in an UpdateClientState callback.
If the connection fails, notify the user about the error and either exit or return to your connection UI to allow the user to try again.
If the connection succeeds, set a state in the application to indicate that streaming is ready, and in the main loop, handle the streaming status change and begin rendering frames.
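One way to hand the result of an asynchronous connect from the callback thread to the main loop is an atomic state variable. The sketch below assumes hypothetical names throughout (DemoConnState, DemoConnectionTracker, and the simplified onStateChanged parameters); the real UpdateClientState callback receives cxrClientState values, which a client would translate into its own states in a similar way.

```cpp
#include <atomic>

// Hypothetical application-side states; NOT the cxrClientState enum.
enum class DemoConnState { Idle, Connecting, Streaming, Failed };

class DemoConnectionTracker {
public:
    // Call right before starting an asynchronous connect.
    void beginConnect() { state_.store(DemoConnState::Connecting); }

    // Call from the state-change callback, which may run on a library thread.
    // An error takes priority over a connected notification.
    void onStateChanged(bool connected, bool error) {
        if (error) {
            state_.store(DemoConnState::Failed);
        } else if (connected) {
            state_.store(DemoConnState::Streaming);
        }
    }

    // Call from the main loop on each iteration.
    DemoConnState current() const { return state_.load(); }
    bool readyToRender() const { return current() == DemoConnState::Streaming; }

private:
    std::atomic<DemoConnState> state_{DemoConnState::Idle};
};
```

The main loop polls readyToRender() each iteration and starts latching frames once it returns true; on Failed it can return to the connection UI.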

Client Main Loop

Client States

The main loop of the application might need to deal with different states and determine what to do and render in each state.

Note

Many of the states map directly to values in cxrClientState.

Before receiver creation

If the main loop includes more than just CloudXR streaming, and does not instantiate the CloudXR client until some state is achieved, the application might render a UI to interact with the user, or just show a loading indicator.

Before the connection

The application might show a loading indicator, or in the case of connecting to the server asynchronously, it might show a Connecting to server indicator.

After a successful connection

When cxrConnect() is called with the asynchronous flag, or called in synchronous mode but in a background thread, the main loop needs to recognize the transition into successful streaming with a flag or a state variable. An application can then begin a fade-in transition or display a Connection Established message.

Render streamed frames

After the application has established a connection to the server, it is ready to start receiving video frames from the network. Each time through the main loop while connected, the code needs to determine whether there are frames available. If so, the latest frames are retrieved, rendered out, and released back to the system. This process is covered in the sections below.

After the disconnect

After a disconnect state is detected, the application might need to set flags or make calls to indicate to other systems that streaming has finished, and that anything related to the CloudXR session needs to be cleaned up and shut down. If an unexpected disconnect occurred, the application might display an error message and exit or return to the initial connection interface.

CloudXR Input System

With CloudXR 4.0, the system for sending input events from the client to the server has been rewritten from scratch, designed with an eye towards industry standards. In the code reference, see cxrAddController(), cxrFireControllerEvents(), and cxrRemoveController().

Note

All devices/clients must use the new input system; the older code and structures no longer exist in the CloudXR 4.0 release.

We have of course updated the SteamVR driver to the new system and tested it thoroughly with top applications to ensure SteamVR mapping/profiles are wired properly, and applications with complex profiles like Half-Life: Alyx are completely functional compared to CloudXR 3.x.

In addition, the new experimental Server Sample was designed around the new system, having a set of actions it can bind, the master list of all possible inputs it supports, and a profile system that manages mapping client inputs to server actions. If this all sounds familiar, it should: as noted above, we designed it with future clients and servers in mind. While developed in concert with the Quest client revision, it should work for all clients, though its functionality may be limited depending on the client.

In the new system, you first register a ‘new’ controller (that you haven’t ‘seen’ yet, and haven’t registered yet) with the server via cxrAddController(). You pass in a cxrControllerDesc that describes the controller for the server:

A numeric identifier, currently must be 0 for left controller and 1 for right controller. Can be anything for other input devices.
A string defining its ‘role’. For default controllers, we chose a custom URI of “cxr://input/hand/left” and “cxr://input/hand/right”, as our input paths don’t currently require hand naming in the path. For other input devices, it can be whatever identifies its use.
The product name of the controller, used to identify a visual model and profile bindings.
A count of inputs, a table of input paths, and a table of data types per input, that overall defines what inputs the controller can and will produce.

The master list of input paths supported by the provided SteamVR server driver and the experimental Sample Server is:

static const char* inputPathsGeneric[] =
{
   "/input/system/click",
   "/input/application_menu/click",
   "/input/trigger/click",
   "/input/trigger/touch",
   "/input/trigger/value",
   "/input/trackpad/click", // valve and htc have trackpads on PC HMDs.
   "/input/trackpad/touch",
   "/input/trackpad/x",
   "/input/trackpad/y",
   "/input/joystick/click", // oculus steam driver historically uses 'joystick' term
   "/input/joystick/touch",
   "/input/joystick/x",
   "/input/joystick/y",
   "/input/x/click",
   "/input/y/click",
   "/input/a/click",
   "/input/b/click",
   "/input/x/touch",
   "/input/y/touch",
   "/input/a/touch",
   "/input/b/touch",
   "/input/thumb_rest/touch",
   "/input/grip/click",
   "/input/grip/touch",
   "/input/grip/value",
   "/input/grip/force",
   "/input/thumbstick/click", // valve steam driver historically uses 'thumbstick' term
   "/input/thumbstick/touch",
   "/input/thumbstick/x",
   "/input/thumbstick/y",
};

Note

For a full example of the new system in code, look at the Oculus client sample, as it shows a reasonable approach to implementing support for the new controller/input system. It registers controllers on the fly when first detected as active, providing the appropriate array of input paths. And when polling controllers for input status, it generates an array of cxrControllerEvent, which takes the input path index and a union of different format data (boolean, integer, float), and when finished it calls cxrFireControllerEvents() to send the input event list to the server.
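The idea of events that carry an input path index plus a typed value can be illustrated with the following sketch. All types and helpers here (DemoInputType, DemoInputEvent, kDemoPaths, findPathIndex, pushBool, pushFloat) are hypothetical stand-ins invented for this example; they do not reproduce the layout of cxrControllerEvent, and the path table is just a small subset chosen from the list above.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins; NOT cxrControllerEvent or its real layout.
enum class DemoInputType { Boolean, Integer, Float };

struct DemoInputEvent {
    std::uint16_t pathIndex;  // index into the path table registered for the controller
    DemoInputType type;
    union {
        bool asBool;
        std::int32_t asInt;
        float asFloat;
    } value;
};

// Paths this hypothetical controller registered, in registration order.
static const char* const kDemoPaths[] = {
    "/input/trigger/click",
    "/input/trigger/value",
    "/input/joystick/x",
    "/input/joystick/y",
};
constexpr int kDemoPathCount = sizeof(kDemoPaths) / sizeof(kDemoPaths[0]);

// Returns the index of `path` in the registered table, or -1 if it is not registered.
inline int findPathIndex(const char* path) {
    for (int i = 0; i < kDemoPathCount; ++i) {
        if (std::strcmp(kDemoPaths[i], path) == 0) return i;
    }
    return -1;
}

// Appends a boolean event; returns false if the path was never registered.
inline bool pushBool(std::vector<DemoInputEvent>& events, const char* path, bool v) {
    const int index = findPathIndex(path);
    if (index < 0) return false;
    DemoInputEvent e{};
    e.pathIndex = static_cast<std::uint16_t>(index);
    e.type = DemoInputType::Boolean;
    e.value.asBool = v;
    events.push_back(e);
    return true;
}

// Appends a float event; returns false if the path was never registered.
inline bool pushFloat(std::vector<DemoInputEvent>& events, const char* path, float v) {
    const int index = findPathIndex(path);
    if (index < 0) return false;
    DemoInputEvent e{};
    e.pathIndex = static_cast<std::uint16_t>(index);
    e.type = DemoInputType::Float;
    e.value.asFloat = v;
    events.push_back(e);
    return true;
}
```

A real client would fill an array of cxrControllerEvent in the same spirit while polling its controllers, then send the accumulated list with cxrFireControllerEvents().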

Rendering Streaming Video

Frame Acquire

To determine whether there is an available video frame, call cxrLatchFrame() to attempt to acquire the next frame(s) in order. The first parameter of interest is the cxrFramesLatched structure, which needs to have a scope so that it will exist until the rendering is complete. After being returned, it is populated with information about the frame(s) that have been acquired.

The next parameter is a bitmask for the frames/streams from which to acquire frames. Most applications can just pass cxrFrameMask_All to tell the system to acquire frames from all streams in lockstep. This is typically the case for XR/AR/VR but will work in most situations for Generic mode connections.

However, some Generic mode applications might want one stream at a time, so they will loop over the total video streams by index, and can latch one at a time passing as the mask 1<<index each time. They can also get a specific ‘subset’ of streams if the particular indices are well known, just by OR’ing the stream bitmasks together. For example, to grab streams 0 and 3 you need to pass 1<<0 | 1<<3.
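The mask arithmetic described above can be written as small helpers. This is a sketch only: streamBit and streamMask are hypothetical names, and the code assumes stream indices below 32 so that each fits in a 32-bit mask.

```cpp
#include <cstdint>
#include <initializer_list>

// Bit for a single stream index, i.e. 1 << index (assumes index < 32).
constexpr std::uint32_t streamBit(unsigned index) {
    return std::uint32_t{1} << index;
}

// ORs together the bits for a known set of stream indices.
inline std::uint32_t streamMask(std::initializer_list<unsigned> indices) {
    std::uint32_t mask = 0;
    for (unsigned index : indices) {
        mask |= streamBit(index);
    }
    return mask;
}
```

For instance, streamMask({0, 3}) yields the same value as 1<<0 | 1<<3, while a per-stream loop would pass streamBit(i) for each index i in turn.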

The last parameter to cxrLatchFrame() is a timeout value in milliseconds. If frames are not ready, the call will sleep briefly, check again, and repeat this process until the timeout has been exceeded. In general, the timeout will be a factor of the length of a frame, and a reasonable starting value is half of one display refresh period (or 500/displayHz). This value allows the call to return if too much time has passed without the frame(s) being available. This way, if there is a delay in frame delivery, the application can give cycles to other systems in the main loop. Short timeout values also provide the opportunity to render some cached visual to the screen and/or an indication of a streaming delay. If the application prefers to manually manage sleeps, a timeout of zero will result in a check and quick return without any sleep.
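As a worked example of the starting value, one display refresh lasts 1000/displayHz milliseconds, so half a refresh is 500/displayHz milliseconds. The helper below is a hypothetical sketch (defaultLatchTimeoutMs is not an SDK function); rounding to the nearest millisecond and returning zero for an unknown refresh rate are choices made for this example.

```cpp
#include <cmath>
#include <cstdint>

// Half of one display refresh period, rounded to whole milliseconds.
// One refresh lasts 1000 / displayHz ms, so half of it is 500 / displayHz ms.
inline std::uint32_t defaultLatchTimeoutMs(float displayHz) {
    if (!(displayHz > 0.0f)) {
        return 0;  // unknown or invalid rate: check once and return without sleeping
    }
    return static_cast<std::uint32_t>(std::lround(500.0f / displayHz));
}
```

With this rounding, a 60 Hz display gives 8 ms, 90 Hz gives 6 ms, and 120 Hz gives 4 ms; applications can tune from there.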

Frame Render

If the Latch fails, the application can either skip rendering or render something cached, and then continue on through the main loop. This gives other systems a chance to run/update in the event of a frame delay, and ensures any per-frame state-checking logic (such as handling input, or client state changes like disconnects) occurs in a reasonable period of time.

If the Latch succeeds, then the returned cxrFramesLatched structure holds the frame data needed to render out the frame(s). For Android, there is an API call cxrBlitFrame() that should be called after setting up the render target and viewport, and it will use the shared OpenGL|ES context to properly blit out latched frames (including handling things like alpha-blending for AR streams, or de-foveation for VR streams). For other platforms, portions of post-processing like de-foveation are handled during the decoding step, and it is then up to the application to know what data format the decoded frame data is in and how to render (blit/submit) as appropriate for the given graphics API.

Frame Release

When rendering is completed, cxrReleaseFrame() must be called to tell CloudXR that you have consumed the latched frame(s), and that CloudXR can internally release and recycle them.
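Since every successful latch must be paired with a release, one option in C++ is a small scope guard that performs the release when rendering code leaves the scope, including on early returns. ScopedRelease below is a hypothetical helper for this sketch, not an SDK type; the callable passed in would wrap the application's cxrReleaseFrame() call.

```cpp
#include <functional>
#include <utility>

// Runs the supplied release action exactly once, when the guard goes out of scope.
class ScopedRelease {
public:
    explicit ScopedRelease(std::function<void()> release)
        : release_(std::move(release)) {}

    ~ScopedRelease() {
        if (release_) release_();
    }

    // Non-copyable so the release action cannot run twice.
    ScopedRelease(const ScopedRelease&) = delete;
    ScopedRelease& operator=(const ScopedRelease&) = delete;

private:
    std::function<void()> release_;
};
```

The guard would be created only after a latch succeeds, so a failed latch never triggers a release.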

Updating Headset Properties

To update the headset projection parameters, refresh rate, or IPD alongside a pose tracking update, the HasProjection, HasRefresh, or HasIPD flags must be set in the tracking struct, and the new projection parameters, refresh rate, or IPD value set in cxrHmdTrackingState’s proj, displayRefresh, or ipd fields.

Note

Certain servers and/or clients may not properly respond to live changes in refresh or IPD, as they were not designed for dynamic adjustments to those values.
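The flag-plus-value pattern described above can be illustrated as follows. The types here are hypothetical stand-ins: DemoTrackingFlags and DemoHmdState only mimic the HasProjection/HasRefresh/HasIPD flags and the proj, displayRefresh, and ipd fields named in the text, and the bit values are invented for this sketch. Consult cxrHmdTrackingState for the actual definitions.

```cpp
#include <cstdint>

// Hypothetical stand-ins; bit values are invented for illustration only.
enum DemoTrackingFlags : std::uint32_t {
    Demo_HasProjection = 1u << 0,
    Demo_HasRefresh    = 1u << 1,
    Demo_HasIPD        = 1u << 2,
};

struct DemoHmdState {
    std::uint32_t flags = 0;
    float proj[2][4] = {};        // per-eye projection parameters
    float displayRefresh = 0.0f;  // Hz
    float ipd = 0.0f;             // meters
};

// Each setter writes the new value AND raises the matching flag,
// so the value is never sent without being marked as present.
inline void setIpd(DemoHmdState& state, float meters) {
    state.ipd = meters;
    state.flags |= Demo_HasIPD;
}

inline void setRefresh(DemoHmdState& state, float hz) {
    state.displayRefresh = hz;
    state.flags |= Demo_HasRefresh;
}

// Clear the update flags once the tracking state has been handed off.
inline void clearUpdateFlags(DemoHmdState& state) {
    state.flags &= ~(Demo_HasProjection | Demo_HasRefresh | Demo_HasIPD);
}
```

Pairing each value with its flag in one setter avoids the easy mistake of updating a field but forgetting to mark it as changed.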

Connection Stats

Periodically, cxrGetConnectionStats() may be called to monitor the health of the connection. We recommend waiting for fps * 3 frames to be latched (~3 sec) between calls. Examples of how to do this and how to interpret the stats have been implemented in the sample clients.
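The recommended cadence can be tracked with a simple frame counter. StatsPollGate below is a hypothetical helper written for this sketch (the sample clients implement their own variants); it only decides when to poll and leaves the actual cxrGetConnectionStats() call to the caller.

```cpp
#include <cstdint>

// Signals once every fps * 3 latched frames (roughly every 3 seconds).
class StatsPollGate {
public:
    explicit StatsPollGate(std::uint32_t fps) : interval_(fps * 3u) {}

    // Call once per successfully latched frame.
    // Returns true when enough frames have passed and stats should be polled.
    bool onFrameLatched() {
        if (interval_ == 0) return false;  // fps unknown: never poll
        if (++count_ < interval_) return false;
        count_ = 0;
        return true;
    }

private:
    std::uint32_t interval_;
    std::uint32_t count_ = 0;
};
```

In the main loop, the client would call onFrameLatched() after each successful latch and query the connection stats only when it returns true.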

Client Cleanup

Before exiting your application, free resources that are connected to CloudXR. At a minimum, you must call cxrDestroyReceiver(), which will flush internal buffers and shared handles.